The document provides an introduction to statistical analysis concepts including probability, random variables, probability distributions, and common discrete probability distributions such as the Bernoulli, uniform, and binomial distributions. Key points covered include defining probability as a measure of uncertainty, explaining random variables and how they relate to observable outcomes, distinguishing between continuous and discrete probability density functions, and deriving the mathematical formulas for and providing examples of the Bernoulli, uniform, and binomial distributions.
Weekly Course Objectives
● What exactly is a probability?
● Explaining probability distributions.
● Continuous and discrete probabilities.
● Go over types of discrete probability distributions:
  ○ Bernoulli
  ○ Uniform
  ○ Binomial
● Means and variances of PDFs!

What exactly is a probability?
● Many real-world events aren’t definite. We know that 1 + 1 = 2, and we know that the sun being out brings light, but other events only might happen.
● To describe this uncertainty, we use something called probabilities.
● Probability is the branch of mathematics that deals with how likely events are to occur.
● The probabilities of all possible outcomes of an event must add up to 1. For example, if we are measuring tomorrow’s weather, we might have data that supports a 20% probability of rain, 40% of cloudy skies without rain, and 40% of partially sunny skies.
● The notation for probability is P().

What is a random variable?
● By definition, a variable is something that is not fixed and is susceptible to change as you go from one observation to the next.
● Therefore, a random variable is something that denotes the outcomes of a particular random event that has an element of uncertainty.
● A random variable is typically denoted by X, and an observed value is denoted by x.
● For example, P(X) is the probability of some random variable X.
● This doesn’t tell us much though… so we should say something like P(X = x), which is the probability of the random variable X taking a particular, observable value x.
● For example, if X is the outcome of flipping a coin, then x could be Heads or Tails. Therefore the probability of getting Heads is P(X = heads) = 0.5 (assuming a fair coin).

Probability Distributions
● A lot of data follows assumptions and patterns that resemble some classical distributions, called probability distributions.
● More formally, probability distributions are functions that tell you the likelihood of obtaining each possible value from some random event.
● For example, the probability that:
  ○ You will arrive in the first 5 minutes of class.
  ○ Someone’s height is between 150cm and 160cm.
  ○ The average weight is greater than 200lbs.
● Each of these can be represented by a probability density function (pdf, not the file format!).
● We can draw probabilities from such events because we have data that often follows certain assumptions!

Continuous vs Discrete PDFs
● Recall, numerical data can be discrete or continuous.
● Discrete probability distributions describe events with finite or countable outcomes. A discrete random variable, X, can take on only finite or countably many values.
  ○ For example, the outcome of a die throw. We know it can only be {1,2,3,4,5,6}. It can’t be anything else!
● Continuous probability distributions describe values over an interval that contains infinitely many possible values. In other words, a continuous random variable, X, can take on any value in that interval.
  ○ For example, the heights observed in a classroom. If two observed heights are 180.3 and 182.3, there are infinitely many heights that can occur within that range! For example, 180.30001 is a possibility!
● Important to note: because a continuous random variable X can assume infinitely many values, the probability of X taking on any one specific value is zero. AKA P(X=x) = 0.

Bernoulli Distribution
● For this lecture, we will focus on the discrete PDFs!
● Let me ask you a question: if you had to logically give me the probability function for flipping a fair coin, you would break it down as:
  ○ Probability of heads equals 0.5
  ○ Probability of tails equals 0.5
● The mathematical way of looking at this coin toss is in terms of the success of the next trial. Say the next toss (trial) is a success if we get heads.
  ○ Therefore, the probability of success (the probability we get heads) is 0.5.
  ○ The complement rule states that the probability of failing (the probability we don’t get heads) is 1 - 0.5 = 0.5.

Bernoulli Distribution cont.
● More generally, say we had any experiment. The probability of success can be denoted by p.
● The random outcome can be denoted by X=1 if it succeeds, and X=0 if it fails!
● Combining this information all together yields the Bernoulli Distribution.
● The Bernoulli Distribution is the probability distribution of a single experiment yielding a success or a failure, given as:
  ○ P(X=x) = p^x (1-p)^{1-x}
  ○ x = {0,1} (1 means a success, and 0 means a failure).
  ○ 0 < p < 1
● Therefore P(X=1) = p^1 (1-p)^{1-1} = p^1 (1-p)^0 = p
  ○ Note, any nonzero number to the power of 0 is 1.
● And P(X=0) = p^0 (1-p)^{1-0} = 1 - p
  ○ Logically, this makes sense. The probability your experiment fails is 1 minus the probability of succeeding.

Bernoulli Distribution Examples
● Say we had an unfair coin. The probability of getting heads is ⅓. What does the Bernoulli distribution give for getting heads and for not getting heads?
● In this problem, getting heads is the success outcome. Therefore, x=1 means we get our target outcome.
● p = ⅓
● The probability of success is:
  ○ P(X=1) = p^1 (1-p)^{1-1} = p^1 (1-p)^0 = p = ⅓
● The probability of failing is:
  ○ P(X=0) = p^0 (1-p)^{1-0} = 1 - p = ⅔

Bernoulli Distribution Examples cont.
● Approximately 1 in 200 Canadians are lawyers. If one Canadian is randomly selected, what is the probability they are not a lawyer?
● Let the random variable X denote whether a Canadian is a lawyer. 1 means they are, 0 means they are not.
● Therefore P(X = 0) = ?
  ○ p = 1/200
  ○ P(X=0) = p^0 (1-p)^{1-0} = 1 - p = 199/200
● All we are doing is quantifying something very logical!
● Note, this is based on the assumption that each trial is independent! Meaning one trial should not impact the next.
● If we were selecting multiple marbles (without putting them back), every time you choose a marble your pool of marbles decreases, so the next selection depends on the previous one. We’ll get to this after :)

Uniform Distribution
● Right off the bat, it is worth mentioning the uniform distribution could be continuous as well, depending on the experiment.
● Again, we will focus on the discrete case.
● Say we were rolling a die. We know a fair die can have outcomes of 1, 2, 3, 4, 5 or 6. Each outcome has an equally likely chance of happening (again, assuming it’s fair and not rigged).
● Therefore, the probability of getting a 3 is ⅙.
● How do we express this as a probability distribution?
● Believe it or not, a histogram helps us picture this visually, and a theoretical frequency table helps us quantify it!

Uniform Distribution cont.
● Graphically and tabularly, we have the following for a dice throw:
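A minimal Python sketch of that theoretical frequency table, assuming a fair six-sided die. The names `outcomes` and `pmf` are illustrative choices, not something taken from the slides, and the row of `#` characters is only a rough text stand-in for a histogram bar.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so every face is equally likely.
outcomes = [1, 2, 3, 4, 5, 6]
n = len(outcomes)

# Theoretical frequency table: each outcome x gets probability 1/n.
pmf = {x: Fraction(1, n) for x in outcomes}

print("x  P(X=x)  bar")
for x, prob in pmf.items():
    # One '#' per 1/60 of probability, so every bar has the same length.
    bar = "#" * int(prob * 60)
    print(f"{x}  {str(prob):6}  {bar}")

# The probabilities of all possible outcomes must add up to 1.
total = sum(pmf.values())
print("total =", total)
```

Using `Fraction` keeps each probability as an exact 1/6, so the total comes out as exactly 1 rather than a rounded float. Every bar is the same height, which is what makes the distribution "uniform".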
● Therefore x = {1,2,3,4,5,6}, and n, the number of outcomes, is 6.
● In this case, P(X=x) = 1/n = ⅙, since every outcome is equally likely to occur.
● So with this, we now know that a (discrete) Uniform Distribution is one whose outcomes are all equally likely to occur. If you have n outcomes, P(X=x) = 1/n.

Binomial Distribution
● We know that you can use the Bernoulli Distribution when you want to measure a random event for a single trial… but what about multiple trials?
● Say we wanted to know not the probability that the next coin toss is heads, but the probability of getting 3 heads in 5 consecutive tosses.
  ○ If H = Heads and T = Tails, you could get…
  ○ HHTHT, TTHHH, HHHTT, HTHTH, HTHHT, …
● Is there a more mathematical way of writing this down?
● COMBINATORICS! Out of say n trials, we get x successes.
  ○ nCx = n!/((n-x)! x!)
  ○ In layman’s terms, this just counts all the possible scenarios where we get x successes out of the n trials.

Binomial Distribution cont.
● The Binomial Distribution describes the number of successes in a sequence of n independent experiments, each with the same probability of success p.
● If n is the number of trials or experiments you conduct, then x = {0,1,2,3,...,n}.
● This makes sense: say you had 10 trials of a coin toss. You can get 0 heads, 1 head, 2 heads, 3 heads, all the way to 10 heads in a row. Therefore x = {0,1,2,3,...,10}.
● More formally, the equation is:
  ○ P(X=x) = (nCx) p^x (1-p)^{n-x}
  ○ n is the number of trials, and p the probability of success.

Binomial Distribution Example
● Say we had a special coin that lands on heads with a probability of 70%. If you toss the coin 3 times, what is the probability you get heads exactly 2 times?
● Logically, we can count the possible events:
  ○ HHT, THH, HTH
  ○ The probability of getting the event HHT is
    ■ P(Heads on the first toss) × P(Heads on the second toss) × P(Tails on the third toss)
    ■ P(X=1) × P(X=1) × P(X=0)
    ■ 0.7 × 0.7 × (1-0.7) = 0.7 × 0.7 × 0.3
    ■ This makes sense because each toss doesn’t depend on the last one.
      So the probability of the next toss is independent of the previous one. In general, each trial is independent of the last.
    ■ We can then quickly see that the order doesn’t matter: we still end up with 0.7 × 0.7 × 0.3 for all 3 events.

Binomial Distribution Example cont.
● Therefore, we get 3 × 0.7 × 0.7 × 0.3 = 0.441
  ○ Note we got the 3 in front because there are only 3 cases where we can get 2 heads and 1 tail: HHT, THH, HTH
● Mathematically, 3C2 = 3!/((3-2)! 2!) = 3
● Therefore, using the equation, we also get:
  ○ P(X=2) = (3C2)(0.7)^2 (1-0.7)^{3-2} = 3 × (0.7)^2 × (0.3) = 0.441
● The equation gets us the answer way faster. Let’s try a harder problem.

Binomial Distribution Example 2
● Say we had a special coin that lands on heads with a probability of 70%. If you toss the coin 100 times, what is the probability you get heads exactly 70 times?
● Good luck writing out all the combinations for this… There are around 2.937 × 10^{25} of them (that is 100C70)… this is a very large number.
● So now the equation is much easier…
● P(X=70) = (100C70)(0.7)^{70} (1-0.7)^{100-70} = (100C70) × (0.7)^{70} × (0.3)^{30} ≈ 0.0868
● A lot easier :P

Binomial Distribution Example 3
● Say we had a special coin that lands on heads with a probability of 70%. If you toss the coin 5 times, what is the probability you get heads at least 3 times?
● Here you can get x = 3, 4 or 5, because we said we get heads at least 3 times!
● P(X ≥ 3) = P(X = 3) + P(X = 4) + P(X = 5) = (5C3)(0.7)^3 (1-0.7)^2 + (5C4)(0.7)^4 (1-0.7)^1 + (5C5)(0.7)^5 (1-0.7)^0 = 0.83692
● We can even calculate this using the complementary event, which might be easier at times.
● P(X ≥ 3) = 1 - P(X < 3) = 1 - (P(X = 0) + P(X = 1) + P(X = 2)) = 1 - ((5C0)(0.7)^0 (1-0.7)^5 + (5C1)(0.7)^1 (1-0.7)^4 + (5C2)(0.7)^2 (1-0.7)^3) = 1 - 0.16308 = 0.83692

Mean and Variance of PDFs?!
● Yep, you guessed it. Just like how we calculated averages and dispersions for data, we can do so for our PDFs!
● The Mean or Expected Value is what we expect our outcome to be.
  For this reason, it is a parameter and not a statistic!
● The Expected Value is a predicted value of a variable, calculated as the sum of all possible values, each multiplied by the probability of its occurrence.
● The Variance is how “off” we expect our random variable to be from the expected value. It’s again just a dispersion parameter that we defined before!
● For the discrete case, it is:
  ○ E(X) = µ = Σ x × P(X=x), summed over all possible values of x.
  ○ Var(X) = σ^2 = E((X-µ)^2) = E(X^2 - 2µX + µ^2) = E(X^2) - 2µE(X) + µ^2 = E(X^2) - 2µ^2 + µ^2 = E(X^2) - µ^2 = E(X^2) - E(X)^2
    ■ Here, E(X^2) = Σ x^2 × P(X=x)

Bernoulli Mean and Variance
● If X is a Bernoulli random variable with success probability p, then:
  ○ E(X) = Σ x × P(X=x) = 0 × P(X=0) + 1 × P(X=1) = p^1 (1-p)^0 = p
  ○ Var(X) = E(X^2) - µ^2 = Σ x^2 × P(X=x) - p^2 = 0^2 × P(X=0) + 1^2 × P(X=1) - p^2 = p - p^2 = p(1-p)

Discrete Uniform Mean and Variance
● If X is a discrete uniform random variable with n outcomes 1, 2, …, n, then:
  ○ E(X) = Σ x × P(X=x) = 1 × P(X=1) + 2 × P(X=2) + … + n × P(X=n) = 1 × 1/n + 2 × 1/n + … + n × 1/n = 1/n (1 + 2 + 3 + … + n) = 1/n (n × (n+1)/2) = (n+1)/2
    Note, we did some math magic: 1 + 2 + … + n = n(n+1)/2. Check it out here.
  ○ Var(X) = (n^2 - 1)/12
    Trust me, this one is very weird. But if you want to know more, check it out here.

Binomial Mean and Variance
● If X is a Binomial random variable with n trials and success probability p, then:
  ○ E(X) = np
  ○ Var(X) = np(1-p)
    Trust me, this one is even weirder. But if you want to know more, check it out here.
● Logically, this makes sense.
  ○ Say X is the number of 3 pointers you make in basketball. If your probability of making a 3 pointer is constant, then:
    ■ Taking 100 shots (n = 100) with a probability of making each 3 pointer of 0.4 (p = 0.4), it makes sense to say we expect you to make 40 shots (100 × 0.4). That’s exactly n × p!
● For those wondering, I’d never ask you to prove these. Just knowing them would suffice!

Homework!
Homework Answer!
Homework 2!
Homework 2 Solutions!
Homework 3 Q’s + Solutions!
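The Bernoulli and binomial formulas, the worked coin examples, and the mean and variance results in these slides can be checked numerically. What follows is a rough Python sketch, not part of the original lecture: the function names `bernoulli_pmf`, `binomial_pmf` and `mean_and_variance` are made up for illustration, and the only library call used is `math.comb` from the standard library (Python 3.8 or newer).

```python
from math import comb

def bernoulli_pmf(x, p):
    """P(X = x) for a single trial: p^x * (1 - p)^(1 - x), with x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 (failure) or 1 (success)")
    return p**x * (1 - p) ** (1 - x)

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials: nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def mean_and_variance(pmf):
    """Return E(X) and Var(X) = E(X^2) - E(X)^2 for a dict {value: probability}."""
    mean = sum(x * prob for x, prob in pmf.items())
    second_moment = sum(x**2 * prob for x, prob in pmf.items())
    return mean, second_moment - mean**2

# Lawyer example: p = 1/200, probability of "not a lawyer" is 1 - p.
p_not_lawyer = bernoulli_pmf(0, 1 / 200)

# Coin examples with P(heads) = 0.7.
p_two_of_three = binomial_pmf(2, 3, 0.7)                            # about 0.441
p_at_least_three = sum(binomial_pmf(x, 5, 0.7) for x in (3, 4, 5))  # about 0.83692
p_seventy_of_hundred = binomial_pmf(70, 100, 0.7)                   # roughly 0.087

# Basketball example: n = 100 shots, p = 0.4. Summing over the whole
# distribution should land close to n*p = 40 and n*p*(1-p) = 24.
n, p = 100, 0.4
shots = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
shots_mean, shots_var = mean_and_variance(shots)

# Fair die: should land close to (6+1)/2 = 3.5 and (6^2 - 1)/12.
die = {x: 1 / 6 for x in range(1, 7)}
die_mean, die_var = mean_and_variance(die)

print(p_not_lawyer, p_two_of_three, p_at_least_three, p_seventy_of_hundred)
print(shots_mean, shots_var, die_mean, die_var)
```

Because these are floating-point sums, the printed values agree with the closed forms only up to small rounding error, so compare them with a tolerance rather than with `==`.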