Brighter Future 1


Diapers and... Beer?

What is
STATISTICS?
Adrian Kang

Diapers and beer. Do you see the connection? Most
people will claim that the two items are unrelated.
However, mathematical computation and sound logic
(statistics, in other words) do not lie. Big Data,
meaning large data sets that are analyzed to reveal
patterns and trends in human behaviour, helped
supermarket brands find out that when diapers
and beer are placed next to each other, people tend
to buy more of both. These days, interest in
analyzing big data sets is spiking, since the
results can be applied to many different fields such as
sports, biochemistry, and economics to obtain useful
information. As a result, more organizations hire data
scientists, and more college
students enroll in statistics courses. An interview
with Mr. Kang, a statistics professor at Yonsei
University, covers general information about
statistics (the major concepts, applications, limits,
and future outlook) to answer some frequently
asked questions and resolve misunderstandings
about the field.

Adrian: Good afternoon, Professor Kang.

Mr. Kang: Hello.

Adrian: I wonder if you have heard about the statistics behind the
diapers-and-beer arrangement in supermarkets.

Mr. Kang: Yes. Haha. What about it?

Adrian: I was amazed by how statistical analysis could help
people connect seemingly unrelated ideas and items. I want to show the
public what statistics is as an academic field. May I ask a few questions?

Mr. Kang: Sure.

Adrian: If you were to explain what statistics is to the general public,
what would you tell them?

Mr. Kang: In simple terms, statistics is a field that extracts useful information
from data.

Adrian: Useful information? Can you tell me more about
what statistics is?

Mr. Kang: It all started when people wanted to describe the overall picture of a
population. Let's suppose that I want to find out, in general, how tall
Americans are. I could use the average to do that, right?

Adrian: Yes, that's what we were taught as kids back in
elementary school.

Mr. Kang: It's an easy method, indeed. To calculate the true average, I would first
have to measure the height of all 300 million people. However, that approach is not
cost-effective and takes a lot of time. Therefore, I would select a small group of U.S.
citizens and use it to estimate how tall the entire American population is.
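
The idea Mr. Kang describes can be tried out in a few lines of Python. The sketch below is only an illustration under stated assumptions: the "population" is 100,000 simulated heights with a made-up average of 170 cm, not real measurements of Americans, and the helper name `estimate_mean_height` is hypothetical.

```python
import random
import statistics


def estimate_mean_height(population, sample_size, seed=0):
    """Estimate the population mean from a simple random sample."""
    rng = random.Random(seed)
    # Draw people at random, without replacement, so everyone is equally likely.
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample)


# Hypothetical population: simulated heights in cm (assumed mean 170, spread 10).
rng = random.Random(42)
population = [rng.gauss(170, 10) for _ in range(100_000)]

true_mean = statistics.mean(population)
estimate = estimate_mean_height(population, sample_size=1_000)

print(f"population mean: {true_mean:.1f} cm")
print(f"sample estimate: {estimate:.1f} cm (from 1,000 of 100,000 people)")
```

Measuring only 1% of the simulated population gives an estimate that lands close to the true average, which is the saving in time and cost that sampling offers.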

Mr. Kang: In short, statistics is a field that studies this process of extracting
information from a small representative group (in other words, a sample) of
the entire population. Using statistics and a small representative sample of
supermarkets, people found out that placing diapers and beer side
by side increases the sales of both items.

Adrian: That’s a very smart and efficient way to figure out the general trend of a
population. Could you explain more about what samples are?

Mr. Kang: I like to call a sample “the mini-population.” Samples are small
representative groups that reflect the entire population.

Anything could be a sample, after all. For instance, a lot of people watch
YouTube. Every time you click a YouTube video to watch, YouTube records what
kind of video you chose. YouTube is collecting samples of what videos you like!
After a few days, your newsfeed will be filled with videos similar to the ones
you already watched.

Adrian: I like listening to Spanish songs on YouTube, and that's why YouTube
showed me more and more Spanish songs on the newsfeed. That's intriguing. Then,
what are some advantages of using statistics? What are the limits?

Mr. Kang: One of the significant advantages of using statistics is
that we can extract very useful information from a limited
amount of data and resources. Without spending much
time or money on investigating the entire population, we can
obtain accurate and meaningful information by analyzing a
sample that represents the whole population.

Mr. Kang: A huge disadvantage of using statistics is that it is highly
dependent on the data set. If the data set is biased (regardless of how big
it is), we would extract biased information, which is misleading and useless.

Adrian: Could you give an example of a biased sample that produced
useless information?

Mr. Kang: One famous example is the U.S. presidential election in 1936. The two
major candidates were Landon and Roosevelt.

[Photographs: Landon and Roosevelt]

One magazine, the Literary Digest, conducted a major survey of U.S. citizens
to predict the results of the 1936 election: it estimated that Landon would win
the election with 57% of the votes. However, Roosevelt won the 1936 election
with 62% of the votes. The Literary Digest failed to predict the election correctly
because it mailed its questionnaires to people drawn largely from telephone
directories and automobile registration lists.

In 1936, people who had phones were generally wealthy, so the sample mostly
consisted of the rich, who favored the Republican Party and Landon.
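
A small simulation shows why a large sample does not rescue a biased one. This is a hedged sketch, not a reconstruction of the 1936 data: the phone-ownership rate (30%) and the voting preferences of each group are invented for illustration, and `simulate_poll` is a hypothetical helper.

```python
import random


def simulate_poll(n_voters=100_000, sample_size=2_000, seed=1):
    """Compare a phone-owners-only poll with a simple random poll."""
    rng = random.Random(seed)
    voters = []
    for _ in range(n_voters):
        has_phone = rng.random() < 0.30  # assumed: 30% of voters own a phone
        # Assumed preferences: phone owners lean toward candidate A.
        p_votes_a = 0.60 if has_phone else 0.30
        votes_a = rng.random() < p_votes_a
        voters.append((has_phone, votes_a))

    true_share = sum(votes_a for _, votes_a in voters) / n_voters

    # Biased poll: only people who own a phone can be reached.
    phone_owners = [votes_a for has_phone, votes_a in voters if has_phone]
    biased_share = sum(rng.sample(phone_owners, sample_size)) / sample_size

    # Fair poll: every voter is equally likely to be chosen.
    everyone = [votes_a for _, votes_a in voters]
    fair_share = sum(rng.sample(everyone, sample_size)) / sample_size

    return true_share, biased_share, fair_share


true_share, biased_share, fair_share = simulate_poll()
print(f"true share for A:       {true_share:.1%}")
print(f"phone-only poll says:   {biased_share:.1%}")
print(f"simple random poll says: {fair_share:.1%}")
```

Both polls question the same number of people, yet only the random one lands near the true share; the phone-only poll overstates support for candidate A because of who could be reached.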
Adrian: I never knew that people misused statistics back in 1936.
Can statistical analysis produce inaccurate results even with a proper, unbiased sample?

Mr. Kang: A very good question. Yes, it may still produce inaccurate results, because we
can never be 100% sure that an extracted sample is an ideal representation of the
whole data set. Thus, we always evaluate the accuracy of the statistical inference. We
calculate a “confidence interval,” a range of plausible values for the quantity we are
estimating, which shows how much uncertainty remains in the analysis results.
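
For readers curious how such an interval is obtained, here is a minimal sketch for the average of a sample, using the common normal approximation (sample mean plus or minus a multiple of the standard error). The data are simulated heights with assumed parameters, and `mean_confidence_interval` is a hypothetical helper, not a method attributed to Mr. Kang.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for a mean (normal approximation)."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error: how much the sample mean varies from sample to sample.
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # z is about 1.96 for a 95% interval.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err


# Hypothetical sample: 400 simulated heights in cm (assumed mean 170, spread 10).
rng = random.Random(7)
sample = [rng.gauss(170, 10) for _ in range(400)]

low, high = mean_confidence_interval(sample)
print(f"sample mean: {statistics.mean(sample):.1f} cm")
print(f"95% confidence interval: {low:.1f} cm to {high:.1f} cm")
```

A larger sample shrinks the standard error and therefore narrows the interval, which is one way to see how sample size affects the precision of an estimate.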

Adrian: I think it is very important to avoid or reduce the bias in statistics so that we can
produce accurate results. Next question. In what ways can statistics be useful in life? In
what fields is statistics frequently applied? How is statistics a useful tool in those particular
fields?

Mr. Kang: Statistics is used very frequently in the world. Most of the figures
that we see, such as the inflation rate, viewer ratings, and rice production,
are obtained from representative samples rather than from the population itself.

Mr. Kang: Statistics is used in most academic fields: economics, business,
psychology, education, medicine, engineering, agriculture, and even sports. The movie
“Moneyball” was based on the true story of a baseball team that used statistics to
analyze the data of batters in Major League Baseball, which ultimately led the team to
victory.

Adrian: It feels like statistics must be very important these days. What do you think about
the future outlook for data scientists and the field of “Big Data”?

Mr. Kang: We are living in the world of Big Data. AI will absorb the concept of
Big Data and open a new era by analyzing a colossal amount of data that
humans cannot handle. It will be able to analyze people one by one, and we
can already recognize those changes in the examples I provided. YouTube's
algorithm analyzes the types of videos you watched and suggests new videos
that you might like in your newsfeed. This is just a single example, and there is
a lot more to come in the future.

[Image: Artificial Intelligence]

Adrian: The future of statistics looks bright indeed. I guess you are also part of
that future as a statistics professor, Mr. Kang.

Mr. Kang: Haha. Maybe.

Adrian: Anyway, I’ll wrap up this interview now. Thank you for your time, Mr.
Kang.

Mr. Kang: No problem.

As technology develops rapidly, there have never
been so many data sets available for people to
use. After reading this interview, you will have
realized that statistics and Big Data have become
increasingly important, considering the fact that
statistics allows you to analyze large data sets
and extract significant information about the
population with less cost and time. Statistics has
become a necessity for modern society, and it is
imperative that we be aware of the right way
to use statistics and be skeptical of misused
forms of statistics. The various applications of
statistics support its bright future outlook.
