
BAM 1024

Introduction to Statistical Analysis


Weekly Course Objectives
● What exactly are statistics?
● Descriptive vs. Inferential Statistics
● What are probabilities?
● Go over sampling terms like population, sample, representative
samples, etc.
● Review set notation, unions, and intersects.
Statistics are everywhere!
● Believe it or not, you collect statistics every day!
● Statistics is the collection and analysis of information,
especially for the purpose of making inferences from a
sample.
● For example:
○ Trying a pizza from a new restaurant and saying it’s
“bad” or “the restaurant is bad” is a statistic.
○ Finding the average height of people in a grocery store in a
brand new town and then claiming the whole town is full of
tall people is a statistic.
○ Attending 2 classes of a statistics course and saying the
class is hard is also a statistic.
Descriptive Statistics
● We have been using descriptive statistics so far in the course.
● Descriptive Statistics are those which involve summarizing
or organizing data.
● For example:
○ The mean, median and mode are descriptive statistics
because they organize / summarize data in meaningful
ways.
○ We can visualize our summaries of data using boxplots,
scatter plots and histograms, so these plots are also
descriptive tools.
● When building an analysis, descriptive statistics are vital for
understanding not only your data, but also the claims you make
about it!
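As a minimal sketch of the summaries named above, the snippet below computes a mean, median and mode with Python's standard `statistics` module. The `heights` values are made up for illustration and are not course data.

```python
import statistics

# Hypothetical heights in cm; these values are invented for illustration.
heights = [150, 162, 162, 171, 180]

mean_height = statistics.mean(heights)      # arithmetic average
median_height = statistics.median(heights)  # middle value once sorted
mode_height = statistics.mode(heights)      # most frequent value

print(mean_height, median_height, mode_height)  # 165 162 162
```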
Inferential Statistics
● Later in this course, we will explore inferential statistics.
● Inferential statistics are those which involve making
inferences from our data using some probabilistic approach.
● For example:
○ Confidence intervals are ranges computed from a sample such
that, if we collected a sample over and over again, a chosen
share of those intervals (say 95%) would capture the true
population parameter.
○ Hypothesis tests assess a claim made against the
data (average height is greater than 150 cm, average grades
are less than 70%, etc.).

● Descriptive statistics describe what you see; inferential
statistics attempt to infer based on your data.
Descriptive vs. Inferential

● Descriptive statistics attempt to summarize your sample.

● Inferential statistics attempt to make inferences about the whole population.
Populations vs samples - why do we even?
● A population is every single observation that falls within the
scope of the experiment you want to run.
● A sample is a subset of observations from your population.
● For example, the heights of first year statistics students
taken from 1 class would be a sample. The population
would be every first year statistics student (within some
demographic).
● It’s very expensive to collect information - some companies
even sell information from surveys (look at SurveyMonkey)!
● Getting adequate data for your study is critical, but sometimes
data isn’t easy to come by (for example, weather behavior during
a solar eclipse).
● We ideally want a representative sample, which is a subset of
the population that shares the population’s characteristics.
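To make the population/sample distinction concrete, here is a small sketch that draws a simple random sample from a simulated population and compares the two means. The population (1,000 heights with mean 170 cm and standard deviation 10 cm) and the sample size of 50 are assumptions chosen only for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (cm) of 1,000 first year students.
population = [random.gauss(170, 10) for _ in range(1000)]

# A simple random sample of 50 students, drawn without replacement.
sample = random.sample(population, 50)

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

# With a random sample, the two means tend to be close but rarely identical.
print(round(population_mean, 1), round(sample_mean, 1))
```

Random selection is one common way to aim for a representative sample; it does not guarantee one.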
Populations and samples - a visual
Sets
● People are often interested in collections and assortments
of objects, and their relative proportions.
● A set is a collection of objects, or “stuff”. These objects could be
countries, cities, years, numbers, words, letters, hats, people,
whatever your heart desires.
● Naturally, we can define elements as these individual objects
that make up the set. So it could be a country, city, year,
number, word, letter, hat, person, etc.
● For example:
○ {a,t,i,n,d,e,r}
○ {0,1,2,3,5,8,13,21}
○ {every student that goes to Queen’s College}
○ { }, also written ∅
● The last example is called the null set, also known as the
empty set. What does it mean?...
Null Sets
● A shopping bag is an object to carry things; an empty bag is a
bag with nothing inside it.
● To elaborate on this idea, look at the set A = {5}.
○ Using basic English, set A is a set containing the element 5.
○ Hence, 5 is not just a number, but it is an element of the
set.
● In a similar way, an empty set is not nothing; it’s just a set with
no elements.
● The moral of this tricky idea is that a set is like a container, and
the elements are the objects or stuff we put in it.
● When you start thinking of sets as “containers”, you can easily
see why the null set is a subset of every set, since every set has a
“container”.
● A more depressing way of thinking is that “there is nothingness
in everything”.
Universal & complementary sets
● Universal sets are sets that contain every possible object
for the scenario or experiment you are looking at. This
is typically denoted with U.
● A sample set is a smaller part of the universal set: the
observations your experiment actually works with, denoted
by S. We typically work with samples!
● For example:
○ If we spoke about numbers, the universal set could be
U = {Every possible number}, S = {all even numbers}
○ If we spoke about provinces in Canada, the universal set
could be U ={All Provinces}, S = {Western Provinces}
● The complement of a set is everything in the larger set that
the particular set doesn’t contain. It is typically written with
a prime, e.g. A’.
● For example:
○ If S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} and A = {2,4,6,3} then A’ =
{1,5,7,8,9,10}.
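The complement example above can be reproduced with Python's built-in `set` type, where set difference (`-`) plays the role of the complement relative to S. The variable names are our own.

```python
# S and A are taken from the example above.
S = set(range(1, 11))   # {1, 2, ..., 10}
A = {2, 4, 6, 3}

# A' is everything in S that is not in A.
A_complement = S - A

print(sorted(A_complement))  # [1, 5, 7, 8, 9, 10]
```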
Subsets
● A subset is a group of observations contained within some other
group that is at least as large.
● In mathematical words, set B is a subset of set A when
every element of B is also an element of A. This is typically
denoted as B ⊆ A.
● For example, say set A = {1,2,4}. List all possible subsets.
● {1}, {2}, {4}, {1,2}, {1,4}, {2,4}, {1,2,4}, ∅.
● Note that order doesn’t matter: the sets {1,2} and {2,1}
are exactly the same.
● What about this: A = {math, english, science}?
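One way to list every subset of a small set is to take all combinations of each size, from 0 up to the size of the set. The helper `all_subsets` below is a hypothetical name for this sketch and relies on `itertools.combinations` from the standard library.

```python
from itertools import combinations

def all_subsets(items):
    """Return every subset of items, from the empty set up to the full set."""
    items = list(items)
    return [set(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]

subsets = all_subsets([1, 2, 4])
for subset in subsets:
    print(subset)

print(len(subsets))  # 8
```

A set with 3 elements always has 2 × 2 × 2 = 8 subsets, since each element is either in or out, so the same count applies to any other 3-element set.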
Unions and Intersects
● The union of two (or more) sets contains every element that
appears in at least one of them, listed once. It is denoted by ∪.
● For example, if A = {1,3,7,9} and B = {2,3,5,7,8} then
A ∪ B = {1,2,3,5,7,8,9}
● The intersect of two sets contains the elements that occur
in both sets. It is denoted by ∩.
● For example, if A = {1,3,7,9} and B = {2,3,5,7,8} then
A ∩ B = {3,7}.
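In Python, the `|` and `&` operators on sets correspond to ∪ and ∩. A short sketch using the same A and B as above:

```python
A = {1, 3, 7, 9}
B = {2, 3, 5, 7, 8}

union = A | B          # elements in A, in B, or in both
intersection = A & B   # elements in both A and B

print(sorted(union))         # [1, 2, 3, 5, 7, 8, 9]
print(sorted(intersection))  # [3, 7]
```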
Number of objects!
● The last bit of information we need to get closer to
probabilities is the number of elements in a set,
which is denoted by n().
● For example, if we had set A = {1,4,7}, then n(A) = 3.
● If we know every possible outcome of our experiment, say S,
and each outcome is equally likely, then we can find the
probability of observing an event, denoted by P().
● For example, if A = {1,4,7} and S = {1,2,3,4,5,6,7,8,9,10},
then P(A) = n(A) / n(S) = 3/10 = 0.3.
● Remember, probabilities always lie between 0 and 1 (inclusive).
● We will talk more about probabilities next class!
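The counting formula P(A) = n(A) / n(S) can be sketched in a few lines. The function name `probability` is our own, and the sketch assumes every outcome in S is equally likely; `Fraction` keeps the result exact.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(event) = n(event) / n(sample_space), assuming equally likely outcomes."""
    # Only outcomes that actually belong to the sample space are counted.
    return Fraction(len(event & sample_space), len(sample_space))

S = set(range(1, 11))
A = {1, 4, 7}

p_a = probability(A, S)
print(p_a, float(p_a))  # 3/10 0.3
```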
Inclusion Exclusion rule!
● In general, there is a very interesting property in statistics
called the inclusion-exclusion rule. It is given as follows:
n(A ∪ B) = n(A)+n(B)−n(A ∩ B)
● For example, if A = {1,3,7,9} and B = {2,3,5,7,8}, then:
○ n(A) = 4
○ n(B) = 5
○ n(A ∩ B) = n({3,7}) = 2
○ n(A ∪ B) = n({1,2,3,5,7,8,9}) = 7 = 4 + 5 - 2
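The same check can be run in code, with `len()` standing in for n(). This is only a sketch of the worked example above, not a proof of the rule.

```python
A = {1, 3, 7, 9}
B = {2, 3, 5, 7, 8}

n_union = len(A | B)                           # counted directly
n_by_rule = len(A) + len(B) - len(A & B)       # inclusion-exclusion

print(n_union, n_by_rule)  # 7 7
```

Subtracting n(A ∩ B) removes the elements 3 and 7 that would otherwise be counted twice.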
DeMorgan’s Law
● DeMorgan’s Law states: the complement of the union of
two sets A and B equals the intersection of A’ and B’, and the
complement of their intersection equals the union of A’ and B’.
● In symbols, (A ∪ B)’ = A’ ∩ B’ and (A ∩ B)’ = A’ ∪ B’.
● Visually:
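Both identities can also be checked on a concrete case. The sketch below assumes a universal set U = {1, ..., 10} and reuses A and B from the earlier union example; these choices are ours, made only for illustration.

```python
U = set(range(1, 11))   # assumed universal set
A = {1, 3, 7, 9}
B = {2, 3, 5, 7, 8}

def complement(X):
    """Everything in U that is not in X."""
    return U - X

# (A ∪ B)' = A' ∩ B'
union_law_holds = complement(A | B) == complement(A) & complement(B)
# (A ∩ B)' = A' ∪ B'
intersection_law_holds = complement(A & B) == complement(A) | complement(B)

print(union_law_holds, intersection_law_holds)  # True True
```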
Independent Events
● Two events are independent if the probability of their
intersection is the product of their individual probabilities.
● In other words P(A ∩ B) = P(A) × P(B)
● For example, if P(A) = ⅓, P(B) = 2/7, and P(A ∩ B) = 2/21, then
A and B are independent events!
● Intuitively, knowledge that one occurred does not affect the
chance the other occurs - i.e., these events don’t depend on
each other.
● When events don’t impact one another, the probability that both
occur is the product of the individual probabilities.
● For example, the events “it rains in Hamilton” and “the
Toronto Raptors win the 2023 championship” can safely be
treated as independent.
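The product rule can be tested directly with exact fractions. The helper `are_independent` is a hypothetical name for this sketch, and the probabilities are the ones from the numeric example above.

```python
from fractions import Fraction

def are_independent(p_a, p_b, p_a_and_b):
    """Events are independent when P(A ∩ B) equals P(A) × P(B)."""
    return p_a_and_b == p_a * p_b

p_a = Fraction(1, 3)
p_b = Fraction(2, 7)
p_a_and_b = Fraction(2, 21)

print(are_independent(p_a, p_b, p_a_and_b))  # True
```

Using `Fraction` instead of floats avoids rounding errors that could make an exact equality test fail.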
Examples
● Given S = {1, 2,4,6,8,19,23,100,3} and A = {2, 6, 8} find P(A).
● Given S = {1, 2,4,6,8,19,23,100,3} and A = {2, 6, 8} find P(A’).
● Given S = {1, 2,4,6,8,19,23,100,3}, A = {2, 6, 8}, B =
{2,4,6,23,100,3} find P(A ∩ B).
● Find P(A’ ∩ B)
● Find P(A ∪ B’)
● Find P((A ∪ B)’)
● Are events A and B independent?
Answers
● 3/9 = ⅓
● 1-⅓=⅔.
● 2/9
● 4/9
● 5/9
● 2/9
● Yes
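As a hedged cross-check of the answers above, the sketch below recomputes each one from the sets given in the Examples slide, assuming every element of S is equally likely. The helper `P` and the other variable names are our own.

```python
from fractions import Fraction

S = {1, 2, 4, 6, 8, 19, 23, 100, 3}
A = {2, 6, 8}
B = {2, 4, 6, 23, 100, 3}

def P(event):
    """P(event) = n(event) / n(S), assuming equally likely outcomes."""
    return Fraction(len(event), len(S))

A_c = S - A  # A'
B_c = S - B  # B'

answers = [
    P(A),             # P(A)
    P(A_c),           # P(A')
    P(A & B),         # P(A ∩ B)
    P(A_c & B),       # P(A' ∩ B)
    P(A | B_c),       # P(A ∪ B')
    P(S - (A | B)),   # P((A ∪ B)')
]
independent = P(A & B) == P(A) * P(B)

print([str(a) for a in answers])  # ['1/3', '2/3', '2/9', '4/9', '5/9', '2/9']
print(independent)                # True
```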
Homework
Homework: Answers
