Introduction to Probability

Syllabus : Introduction to probability theory, probability theory - terminology, fundamental concepts in probability - axioms of probability, application of simple probability rules - association rule learning, Bayes' theorem, random variables, Probability Density Function (PDF) and Cumulative Distribution Function (CDF) of a continuous random variable, binomial distribution, Poisson distribution, geometric distribution, parameters of continuous distributions, uniform distribution, exponential distribution, chi-square distribution, Student's t-distribution, F-distribution.

Contents :
3.1 Introduction to Probability Theory
3.2 Axioms of Probability
3.3 Application of Simple Probability Rules
3.4 Bayes' Theorem

3.1 Introduction to Probability Theory
* Probability theory is concerned with the study of random phenomena. Such phenomena are characterized by the fact that their future behaviour is not predictable in a deterministic fashion. The role of probability theory is to analyze the behaviour of a system or algorithm under the given probability assignments and distributions.
* Probability was originally developed to analyze games of chance.
* It is the mathematical modelling of the phenomenon of chance or randomness. The measure of chance is called the probability of the statement.
* The probability of an event is defined as the number of favourable outcomes divided by the total number of possible outcomes.

Classical Definition of Probability

Computing probability using the classical method :
* If an experiment has n equally likely simple events and if the number of ways that an event E can occur is m, then the probability of E, P(E), is

  P(E) = (Number of ways that E can occur) / (Number of possible outcomes) = m / n

* So, if S is the sample space of this experiment, then

  P(E) = N(E) / N(S)

* If E' denotes the event of non-occurrence of E, then the number of elementary events in E' is n - m, and hence the probability of E' is

  P(E') = (n - m) / n = 1 - m/n = 1 - P(E)  =>  P(E) + P(E') = 1

  (here m is a non-negative integer, n is a positive integer and m ≤ n). This is also called mathematical or a priori probability.

Random Experiment
* In some experiments we are not able to control the values of certain variables, so the results will vary from one performance of the experiment to the next even though most of the conditions are the same. Such experiments are called random experiments.

TECHNICAL PUBLICATIONS® - An up thrust for knowledge

* A random experiment is defined as an experiment whose possible outcomes are known before the experiment is performed, but which outcome is going to happen in a particular trial is not known.
* Example : If we roll a die, the result of the experiment is that it will come up with one of the numbers in the set {1, 2, 3, 4, 5, 6}.
* A random variable is simply an expression whose value is the outcome of a particular experiment.
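The classical rule above, including the complement identity P(E) + P(E') = 1, can be checked with a short sketch (the die example is from the text; the function name is illustrative):

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(E) = m / n : favourable outcomes over equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

# Rolling a fair die, E = "an even number turns up"
S = {1, 2, 3, 4, 5, 6}
E = {2, 4, 6}
p = classical_probability(E, S)
p_complement = 1 - p           # P(E') = 1 - P(E)
assert p + p_complement == 1   # P(E) + P(E') = 1
print(p, p_complement)         # 1/2 1/2
```

Using `Fraction` keeps the m/n arithmetic exact, which matches the counting definition better than floating point.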
Sample Space
* The totality of the possible outcomes of a random experiment is called the sample space of the experiment, and it is denoted by the letter 'S'.
* There may be more than one sample space that can describe the outcomes of an experiment, but there is usually only one that will provide the most information.
* The sample space is not determined completely by the experiment. It is partially determined by the purpose for which the experiment is carried out.
* Example : If the experiment consists of flipping two coins, then the sample space consists of the following points :

  S = {(T, T), (T, H), (H, T), (H, H)}

  The outcome will be (T, T) if both coins are tails, (T, H) if the first coin is tails and the second heads, (H, T) if the first is heads and the second tails, and (H, H) if both coins are heads.
* It is often convenient to classify sample spaces according to the number of elements they contain. If a sample space has a finite number of points, it is called a finite sample space. If it has as many points as there are natural numbers 1, 2, 3, ..., it is called a countably infinite sample space.

Events
* An event is a subset A of the sample space S, i.e., it is a set of possible outcomes. If the outcome of an experiment is an element of A, we say that the event A has occurred.
* An event consisting of a single point of S is called a simple or elementary event. A simple event is any single outcome from a probability experiment. Each simple event is denoted e_i.
* A single performance of the experiment is known as a trial.
* As particular events, we have S itself, which is the sure or certain event, since an element of S must occur, and the empty set Ø, which is called the impossible event, because an element of Ø cannot occur.
* An unusual event is an event that has a low probability of occurring.
* Independent events : Let E1, E2 be two events. Then E1, E2 are said to be independent events if

  P(E1 ∩ E2) = P(E1) P(E2)

* Mutually exclusive events : E1, E2, ..., En are said to be mutually exclusive if

  Ei ∩ Ej = Ø, for i ≠ j

* If E1 and E2 are independent and mutually exclusive, then either P(E1) = 0 or P(E2) = 0.
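The two definitions above can be verified by exhaustive enumeration on a small sample space (two dice here; the particular events E1-E4 are illustrative choices, not from the text):

```python
from fractions import Fraction
from itertools import product

# Two fair dice: all 36 equally likely outcomes form the sample space S.
S = set(product(range(1, 7), repeat=2))
P = lambda A: Fraction(len(A), len(S))

E1 = {s for s in S if s[0] % 2 == 0}     # first die is even
E2 = {s for s in S if s[1] == 6}         # second die shows 6
assert P(E1 & E2) == P(E1) * P(E2)       # independent: P(E1 ∩ E2) = P(E1) P(E2)

E3 = {s for s in S if s[0] + s[1] == 2}
E4 = {s for s in S if s[0] + s[1] == 12}
assert E3 & E4 == set()                  # mutually exclusive: E3 ∩ E4 = Ø
```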
* If E1 and E2 are independent events, then E1 and E2' are also independent.
* Mutually exclusive events are sometimes called disjoint events. If A and B are mutually exclusive, then it is not possible for both events to occur on the same trial. If two events are mutually exclusive, then they do not share any outcomes.
* Two events of a sample space whose intersection is empty and whose union is the entire sample space are called complementary events.
* Exhaustive events : The set of all possible outcomes of a random experiment is called the set of exhaustive events. For example, in tossing a coin there are two exhaustive events : head and tail.

De Morgan's law :
* The useful relationships between the three basic operations of forming unions, intersections and complements are known as De Morgan's laws.
* The complement of a union (intersection) of two sets A and B equals the intersection (union) of the complements of A and B. Thus

  (A ∪ B)' = A' ∩ B'
  (A ∩ B)' = A' ∪ B'

Prove that : P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Solution : Let A and B be any two events. A ∪ B can be decomposed into mutually exclusive events A ∩ B', A ∩ B and A' ∩ B :

  A ∪ B = (A ∩ B') ∪ (A ∩ B) ∪ (A' ∩ B)

By axiom 3 :

  P(A ∪ B) = P(A ∩ B') + P(A ∩ B) + P(A' ∩ B)
           = [P(A) - P(A ∩ B)] + P(A ∩ B) + [P(B) - P(A ∩ B)]
           = P(A) + P(B) - P(A ∩ B)

Conditional probability :

  P(A|B) = P(A ∩ B) / P(B)

Association rules :
* Each association rule is assigned a number called the support for the rule. The support is simply the count of transactions that contain all the items in the antecedent and consequent of the rule. The support count of an itemset A, denoted σ(A), is the number of transactions in T that contain A.

Example : A factory production line uses three machines, A, B and C. Machine A is responsible for 25 % of the total output, machine B for 35 % and machine C for the rest. Of the bolts machined, 5 % from machine A are defective, 4 % from machine B and 2 % from machine C. A bolt is chosen at random from the production line and found to be defective. What is the probability that it came from i) machine A, ii) machine B, iii) machine C ?
Solution : Let

  D = {bolt is defective}, A = {bolt is from machine A}, B = {bolt is from machine B}, C = {bolt is from machine C}

Given data : P(A) = 0.25, P(B) = 0.35, P(C) = 0.4, P(D|A) = 0.05, P(D|B) = 0.04, P(D|C) = 0.02.

From Bayes' theorem :

  P(A|D) = P(D|A) P(A) / [P(D|A) P(A) + P(D|B) P(B) + P(D|C) P(C)]
         = (0.05 × 0.25) / (0.05 × 0.25 + 0.04 × 0.35 + 0.02 × 0.4)
         = 0.0125 / (0.0125 + 0.014 + 0.008)
         = 0.0125 / 0.0345 = 0.3623

Similarly,

  P(B|D) = P(D|B) P(B) / 0.0345 = 0.014 / 0.0345 = 0.4058
  P(C|D) = P(D|C) P(C) / 0.0345 = 0.008 / 0.0345 = 0.2319

Random Variables
* A random variable is a set of possible values from a random experiment.
* A random variable, usually written X, is a variable whose possible values are numerical outcomes of a random phenomenon. There are two types of random variables : discrete and continuous.
* Whenever you run an experiment - flip a coin, roll a die, pick a card - you assign a number to represent the value of the outcome that you get. This assignment is called a random variable.
* A random variable is a variable X that assigns a real number x for each and every outcome of a random experiment. If S is the sample space containing all the outcomes S = {e1, e2, e3, ..., en} of a random experiment and X is a random variable defined as a function X(e) on S, then for every outcome e_i (where i = 1, 2, ..., n) in S the random variable X(e_i) will assign a real value x_i.
* The advantage of random variables is that the user can define certain probability functions that make it both convenient and easy to compute the probabilities of various events.

Discrete Random Variable
* The random variable is called a discrete random variable if it is defined over a sample space having a finite or a countably infinite number of sample points. In this case, the random variable takes on discrete values and it is possible to enumerate all the values it may assume.
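The worked example above can be re-checked numerically; the dictionary keys below just mirror the three machines of the problem:

```python
# Re-checking the bolt example: priors P(machine) and defect rates P(D | machine).
priors = {"A": 0.25, "B": 0.35, "C": 0.40}
defect = {"A": 0.05, "B": 0.04, "C": 0.02}

# Total probability of a defective bolt: P(D) = sum over machines of P(D|M) P(M)
p_d = sum(defect[m] * priors[m] for m in priors)   # 0.0345

# Bayes' theorem: P(M|D) = P(D|M) P(M) / P(D)
posterior = {m: defect[m] * priors[m] / p_d for m in priors}
print({m: round(v, 4) for m, v in posterior.items()})
# {'A': 0.3623, 'B': 0.4058, 'C': 0.2319}
```

The posteriors necessarily sum to 1, which is a quick sanity check on any Bayes computation.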
* A discrete random variable can have only a countable number of numerical values. We can have countably infinite discrete random variables if we think about things for which we only have an estimated count. Think about the number of stars in the universe : there is no practical way to count them exactly, so this is an example of a countably infinite discrete random variable.
* The mean of any discrete random variable is an average of the possible outcomes, with each outcome weighted by its probability.
* Examples :
  1. Total of a roll of two dice : 2, 3, ..., 12
  2. Number of desktops sold : 0, 1, ...
  3. Customer count : 0, 1, ...

Continuous Random Variable
* In the case of a sample space having an uncountably infinite number of sample points, the associated random variable is called a continuous random variable, with its values distributed over one or more continuous intervals on the real line.
* A continuous random variable is one which takes an infinite number of possible values. Continuous random variables are usually measurements. Examples include height, weight, the amount of sugar in an orange and the time required to run a mile.
* A continuous random variable is one having a continuous range of values. It cannot be produced from a discrete sample space because of our requirement that all random variables be single-valued functions of all sample space points.
* A continuous random variable is not defined at specific values. Instead, it is defined over an interval of values and is represented by the area under a curve. The probability of observing any single value is equal to 0, since the number of values which may be assumed by the random variable is infinite.
* Both types of random variables are important in science and engineering. A mixed random variable is one for which some of its values are discrete and some are continuous.
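The probability-weighted mean described above can be computed directly for the first example (the total of two dice); this is a sketch by enumeration:

```python
from fractions import Fraction
from itertools import product

# X = total of a roll of two dice (values 2..12), a discrete random variable.
totals = [a + b for a, b in product(range(1, 7), repeat=2)]
pmf = {x: Fraction(totals.count(x), len(totals)) for x in set(totals)}

assert sum(pmf.values()) == 1                 # a valid pmf sums to 1
mean = sum(x * p for x, p in pmf.items())     # outcomes weighted by probability
print(mean)  # 7
```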
* Typically P{X = x} is a discontinuous function of x; it is zero wherever F_X(x) is continuous and nonzero only at discontinuities of F_X(x).

Example : X is the number of heads obtained from tossing a fair coin four times. Determine the cumulative distribution function.
Solution : The probability function is f(x) = C(4, x) / 16, so

  F(0) = f(0) = 1/16
  F(1) = f(0) + f(1) = 1/16 + 4/16 = 5/16
  F(2) = f(0) + f(1) + f(2) = 1/16 + 4/16 + 6/16 = 11/16
  F(3) = f(0) + f(1) + f(2) + f(3) = (1 + 4 + 6 + 4)/16 = 15/16
  F(4) = f(0) + f(1) + f(2) + f(3) + f(4) = (1 + 4 + 6 + 4 + 1)/16 = 16/16 = 1

* Let heads denote the coin landing head side up and tails denote the coin landing tail side up. The possible outcomes are for the coin to land head side up or tail side up. Using this notation,

  P(X = Heads) = 1/2,  P(X = Tails) = 1/2

* The variance of a random variable is

  Var(X) = E[(X - E[X])²] = E[X²] - μ²

Binomial Distribution
* "Bi" means two, so a binomial variable records one of two outcomes. Outcomes of health research are often measured by whether they have occurred or not - for example, recovered from disease, admitted to hospital, died, etc.
* The binomial distribution occurs in games of chance, quality inspection, opinion polls and so on.
* Such data can be modelled by assuming that the number of events in n trials has a binomial distribution with a fixed probability of event p. The binomial distribution arises from a series of Bernoulli trials.

Characteristics of the binomial distribution :
1. The experiment consists of n identical trials.
2. Each trial has only two outcomes.
3. The probability of one outcome is p and of the other is q = 1 - p.
4. The trials are independent.
5. We are interested in x, the number of successes observed during the n trials.

Mean and variance : For X ~ B(n, p), E[X] = np, and

  E[X²] = Σ x² C(n, x) p^x q^(n-x) = n(n-1)p² Σ C(n-2, x-2) p^(x-2) q^(n-x) + np
        = n(n-1)p² (p + q)^(n-2) + np = n(n-1)p² + np

so

  Var(X) = E[X²] - (E[X])² = n(n-1)p² + np - n²p²
  = np - np² = np(1 - p) = npq   (since q = 1 - p)

* The standard deviation (σ) of the binomial distribution is √(npq).

Example : A fair coin is tossed 10 times. Let X be the random variable giving the number of heads that turn up. Find P(X = 3), P(X = 4) and P(X = 5).
Solution : Binomial distribution with n = 10 and p = q = 1/2, so

  P(X = x) = C(10, x) (1/2)^x (1/2)^(10-x) = C(10, x) / 1024

  P(X = 3) = 120/1024 = 15/128
  P(X = 4) = 210/1024 = 105/512
  P(X = 5) = 252/1024 = 63/256
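Both the 10-toss probabilities and the np / npq results derived above can be confirmed exactly with a small sketch:

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, k, p):
    """P(X = k) = C(n, k) p^k q^(n-k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

half = Fraction(1, 2)

# The 10-toss example: P(X = 3), P(X = 4), P(X = 5)
probs = [binom_pmf(10, k, half) for k in (3, 4, 5)]
assert probs == [Fraction(15, 128), Fraction(105, 512), Fraction(63, 256)]

# Mean np and variance npq, checked by direct enumeration over all outcomes
mean = sum(k * binom_pmf(10, k, half) for k in range(11))
var = sum(k * k * binom_pmf(10, k, half) for k in range(11)) - mean**2
assert mean == 10 * half              # np = 5
assert var == 10 * half * (1 - half)  # npq = 5/2
```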
The mean and standard deviation are normal distribution’s sufficient statistics, they completely define the variable's distribution. # the probability a normally distributed variable assumes a value between a and b is. _ equal to the area under the curve between a and b. '* Calculation of the probability that a normal __iéwval: Theoretically, we need to calculate the area under the curve between the end points of the interval. Integration for each and every different normal Hs: Due to the complication of calculation and frequency in which it is done, a ed way has been derived. normal distribution : random variable lies within some The normal distribution with mean = 0 and random varible : The normal random variable with the standard ition is called standard normal random variable. 3-34 Dete Science ——_—___— ———. ‘ror to Probatsity Ed Poisson Distribution * Poisson distribution, named after its invertor simeon poisson who was q French mathematician. He found that if we have a rare event (ie. p is small) and we know the expected or mean (or j)) number of occurances, the Probabilities of 0, 1, 2 ... events are given by : R = Se Poisson distribution : Is a distribution the number of rare events that occur in a unit of time, distance, space and so on. Examples : 1, Number of insurance claims in a unit of time. 2. Number of accidents in a ten-mile highway. 3. Number of airplane crash in triangle area, = When there is a large number of trials, but a small Probability of success, binomial calculate becomes impractical. Example : Number of deaths from horse kicks in the army in different years. The mean number of successes from 1 trials is = np. « If we substitute y/n for p, and let n tend ‘becomes the Poisson distribution : Po) = SRY x! to infinity, the binomial distribution — = Poisson distribution is applied where random events in space or time are “*pected to occur. Deviation from poisson distribution may indicate some degree of non-randomness in the events under study. 
* Example : 64 deaths in 20 years from thousands of soldiers.
* If a mean or average probability of an event happening per unit of time / per page / per mile cycled etc. is given, the Poisson distribution is used. If an exact probability of an event happening is given and you are asked to calculate the probability of this event happening k times out of n, then the binomial distribution must be used.

Example : The probability of an oil strike in the North Sea is 1 in 500 drillings. What is the probability of exactly 3 oil strikes in 1000 explorations ?
Solution : μ = np = 1000 × (1/500) = 2, so

  P(3) = 2³ e^(-2) / 3! = 0.1804

Properties of Poisson's distribution :
* When the number of trials n is large and the probability of success p is very small, the probabilities are approximated by Poisson's distribution.
* np = μ remains constant when n → ∞ and p → 0.

Geometric Distribution
* The geometric distribution represents the number of trials up to the first success in a series of Bernoulli trials. This type of problem shows up frequently in queueing systems where we're interested in the time between events.
* For example, suppose that jobs in our system have exponentially distributed service times. If we have a job that's been running for one hour, what's the probability that it will continue to run for more than two hours ?
* By the definition of conditional probability, we have

  P(X > s + t | X > t) = P(X > s + t, X > t) / P(X > t)

  If X > s + t, then X > t is redundant, so we can simplify the numerator :

  P(X > s + t | X > t) = P(X > s + t) / P(X > t)

  By the CDF of the exponential distribution,

  P(X > s + t | X > t) = e^(-λ(s + t)) / e^(-λt)

  The common terms cancel, giving the surprising result

  P(X > s + t | X > t) = e^(-λs) = P(X > s)

* The memoryless property is an important property that simplifies calculations involving conditional probabilities. The geometric distribution is the only discrete distribution that has the memoryless property.
* The exponential distribution is memoryless because the past has no bearing on its future behaviour : every instant is like the beginning of a new random period, with the same distribution regardless of how much time has already elapsed. It is the only memoryless continuous random variable.
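The oil-strike example and the memoryless derivation above can both be checked numerically (the rate λ and the times s, t in the second part are illustrative values, not from the text):

```python
from math import exp, factorial, isclose

def poisson_pmf(x, mu):
    """P(x) = mu^x e^(-mu) / x!"""
    return mu**x * exp(-mu) / factorial(x)

# Oil-strike example: p = 1/500 per drilling, n = 1000, so mu = np = 2
p3 = poisson_pmf(3, 1000 * (1 / 500))
print(round(p3, 4))  # 0.1804

# Memoryless property of the exponential: P(X > s + t | X > t) = P(X > s)
lam, s, t = 0.5, 2.0, 1.0                 # assumed rate and times
survival = lambda x: exp(-lam * x)        # P(X > x) = e^(-lam x)
assert isclose(survival(s + t) / survival(t), survival(s))
```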
Exponential Distribution
* The probability density function is given as

  f(x) = λ e^(-λx)  for x ≥ 0
         0          for x < 0

* The cumulative distribution function F_X(x) of an exponentially distributed random variable is given as

  F_X(x) = 1 - e^(-λx)  for x ≥ 0
           0            for x < 0

P-values :
* Large p-value : high probability that the test statistic exceeds the observed test statistic. Do not reject the null hypothesis.
* Small p-value : low probability that the test statistic exceeds the observed test statistic. Reject the null hypothesis.

Example : A company manufacturing rivets wants to limit variations in their length. The lengths (in cm) of 10 rivets manufactured by a new process include :

  2.15, 1.99, 2.05, 2.12, 2.17, 2.01, 1.98, 2.03, 2.25, ...

Chi-square Goodness-of-Fit Test
* The chi-square goodness-of-fit test is applied to binned data. It can be applied to discrete distributions such as the binomial and the Poisson.
* The goodness-of-fit test works by hypothesizing that the distribution of a variable behaves in a particular manner. For example, in order to determine the staffing needs of a retail store, the manager may wish to know whether there are an equal number of customers each day of the week.
* An hypothesis of equal numbers of customers on each day could be made, and this would be the null hypothesis.
* Suppose a variable has a frequency distribution with k categories into which the data has been grouped. The frequencies of occurrence of the variable, for each category of the variable, are called the observed values.
* The way in which the chi-square goodness-of-fit test works is to determine how many cases there would be in each category if the sample data were distributed exactly according to the claim. These are termed the expected number of cases for each category. The total of the expected number of cases is always made equal to the total of the observed number of cases.
* The null hypothesis is that the observed number of cases in each category is equal to the expected number of cases in each category. The alternative hypothesis is that the observed and expected numbers of cases differ sufficiently to reject the null hypothesis.
* The chi-square goodness-of-fit test is a non-parametric test that is used to find out whether the observed value of a given phenomenon is significantly different from the expected value. In the goodness-of-fit test, the term goodness of fit is used to compare the observed sample distribution with the expected probability distribution.
* The goodness-of-fit test determines how well a theoretical distribution (such as normal, binomial or Poisson) fits the empirical distribution.
* Suppose that the values x1, x2, ..., xk occur with observed frequencies (Of)1, (Of)2, ..., (Of)k and expected frequencies (Ef)1, (Ef)2, ..., (Ef)k respectively. Then the test statistic is

  X² = Σ ((Of)i - (Ef)i)² / (Ef)i

Steps :
1. State the null hypothesized proportions for each category (pi). The alternative is that at least one of the proportions is different than specified in the null.
2. Calculate the expected counts for each cell as n pi.
3. Calculate the X² statistic :

  X² = Σ (observed - expected)² / expected

4. Compute the p-value as the proportion above the X² statistic for either a randomization distribution or a X² distribution with df = (number of categories - 1), if the expected counts are all ≥ 5.
5. Interpret the p-value in context.

Chi-square Test for Independence of Attributes
* The chi-square test of independence is used to determine if there is a significant relationship between two nominal (categorical) variables.
* The frequency of each category for one nominal variable is compared across the categories of the second nominal variable.
* The data can be displayed in a contingency table where each row represents a category for one variable and each column represents a category for the other variable. For example, say a researcher wants to examine the relationship between gender (male vs. female) and empathy (high vs. low). The chi-square test of independence can be used to examine this relationship.
* The null hypothesis for this test is that there is no relationship between gender and empathy. The alternative hypothesis is that there is a relationship between gender and empathy (e.g. there are more high-empathy females than high-empathy males).
* This test is also known as the chi-square test of association. This test utilizes a contingency table to analyze the data.
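The goodness-of-fit steps above can be sketched on the retail-store scenario; the customer counts below are made-up illustrative data, not from the text:

```python
# Hypothetical observed customer counts for 5 weekdays.
observed = [30, 14, 34, 45, 57]
n = sum(observed)

# Step 2: expected counts under H0 (equal proportions p_i = 1/5): E_i = n p_i
expected = [n / 5] * 5

# Step 3: X^2 = sum of (observed - expected)^2 / expected
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1       # number of categories - 1
print(round(chi2, 2), df)    # 29.06 4 -> compare with the chi-square table at df = 4
```

All expected counts here are 36 ≥ 5, so the X² approximation in step 4 applies.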
* A contingency table is an arrangement in which data is classified according to two categorical variables. The categories for one variable appear in the rows, and the categories for the other variable appear in the columns.
* The expected frequency for each cell is directly proportional to the size of the sample, independent of the relationship between the variables.

Example : Four methods are under development for making superconducting discs. Discs are made by each method and checked for superconductivity. Test whether there is a significant difference between the proportions of superconductors produced, using the chi-square test.
Solution :
Null hypothesis (H0) : the proportions of superconductors are equal.
Alternative hypothesis (H1) : the proportions of superconductors are not equal.
The expected frequencies (Ei) are computed from the row and column totals, and

  X² = (42 - 30)²/30 + (22 - 30)²/30 + (25 - 30)²/30 + ... + (19 - 20)²/20 + (8 - 20)²/20 + (28 - 20)²/20 + ... = 19.50

Step 6 : The calculated chi-square statistic is greater than the tabulated value (7.815). We reject the null hypothesis. Therefore, the proportions of superconductors are not equal.

Student's t-Distribution
* When the sample values come from a normal distribution, the exact distribution of "t" was worked out by W. S. Gossett. He called it the t-distribution.
* Unfortunately, there is not one t-distribution; there are different t-distributions for each different value of n. If n = 7 there is a certain t-distribution, but if n = 13 the t-distribution is a little different. The distribution of the random variable t has n - 1 degrees of freedom.

Fig. 3.13.1 : Student's t-distribution

Properties of Student's t-distribution :
1. The t-distribution is different for different degrees of freedom.
2. The t-distribution is centered at 0 and symmetric about 0.
3. The total area under the curve is 1. The area to the left of 0 is 1/2 and the area to the right of 0 is 1/2.
4. As the magnitude of t increases, the graph approaches, but never equals, 0.
5. The area in the tails of the t-distribution is larger than the area in the tails of the standard normal distribution.
6. The mean and mode of the t-distribution are equal to zero.
7. The area in the tails of the t-distribution is a little greater than the area in the tails of the standard normal distribution, because using s as an estimate of σ introduces further variability.

Example : With n = 6, a sample mean of 58392, a hypothesized mean of 58000 and s = 648,

  t = (58392 - 58000) / (648 / √6) = 1.48

Example : A sample of 100 iron bars is said to be drawn from a large population of iron bars whose lengths are normally distributed with mean 4 feet and standard deviation 0.6 feet. Can the sample be regarded as a truly random sample ?
Solution : Given : sample size n = 100, sample mean x̄ = 4.2, population mean μ = 4, S.D. σ = 0.6.
Null hypothesis (H0) : the sample can be regarded as a truly random sample.
Alternative hypothesis (H1) : the sample cannot be so regarded.
The test statistic is

  z = (x̄ - μ) / (σ/√n) = (4.2 - 4) / (0.6/√100) = 0.2 / 0.06 = 3.33

Since |z| = 3.33 > 1.96, the null hypothesis is rejected at the 5 % level : the sample cannot be regarded as a truly random sample.

F-Distribution
* The probability density function of an F(d1, d2) distributed random variable is given by

  f(x) = sqrt( ((d1 x)^d1 · d2^d2) / (d1 x + d2)^(d1 + d2) ) / ( x · B(d1/2, d2/2) )

  for real x ≥ 0, where d1 and d2 are positive integers, and B is the beta function.
* An F random variable is defined as a ratio of two independent chi-square random variables.

Basic properties of F-distributions :
1. The total area under an F-curve equals 1.
2. An F-curve is only defined for x ≥ 0.
3. An F-curve has value 0 at x = 0, is positive for x > 0, extends indefinitely to the right, and approaches 0 as x → ∞.
4. An F-curve is right-skewed.

Testing the equality of two variances :
1. Test the assumption of equal variances that was made in using the two-sample t-test.
2. Interest is in actually comparing the variance of two populations.
* Assume we repeatedly select random samples from two normal populations. Consider the distribution of the ratio of the two sample variances :

  F = S1² / S2²

* The distribution formed in this manner approximates an F-distribution with the following degrees of freedom : v1 = n1 - 1 and v2 = n2 - 1.
* The F table gives the critical values of the F-distribution for given degrees of freedom. To carry out a test, state the null and alternate hypotheses - for example, that the variability in the new process is lower than in the original process. The test uses two values for degrees of freedom.
* The first degrees-of-freedom value is v1 = n1 - 1 and the second is v2 = n2 - 1; these are used to look up critical values of the F distribution. For example, the critical value for v1 = 7 and v2 = 15 is 2.71, and for v1 = 24 and v2 = 19 it is 2.92.

Example : In one sample of 10 observations, the sum of the squares of the deviations of the sample values from the sample mean was 120, and in another sample of 12 observations it was 314. Test whether the difference is significant at the 5 % level.
Solution : Given data : n1 = 10, n2 = 12, α = 0.05, Σ(x - x̄)² = 120, Σ(y - ȳ)² = 314.
Null hypothesis (H0) : σ1² = σ2²
Alternative hypothesis (H1) : σ1² ≠ σ2²

Calculate S1² and S2² :

  S1² = Σ(x - x̄)² / (n1 - 1) = 120 / 9 = 13.333
  S2² = Σ(y - ȳ)² / (n2 - 1) = 314 / 11 = 28.545

Since S2² > S1², the larger variance goes in the numerator :

  F = S2² / S1² = 28.545 / 13.333 = 2.14

Calculate the degrees of freedom :

  v1 = 12 - 1 = 11,  v2 = 10 - 1 = 9

The table value at the 5 % level of significance is 2.90. Since the calculated value (F = 2.14) < 2.90, the null hypothesis (H0) is accepted.
Conclusion : The difference is not significant at the 5 % level.

Example : For two samples (with n2 = 6) we have sample variances S1² = 15750 and S2² = 10920. Since the ratio of the variances conforms to the F-distribution, the F-value is :

  F = S1² / S2² = 15750 / 10920 = 1.44

From the F-distribution table it can be seen that F0.05(5, 6) = 3.1075. If the variances were really different (i.e. with a probability of 90 %), the F-value would exceed 3.1075. Since F = 1.44 < F0.05 = 3.1075, the variances are the same with a 90 % probability; F = 1.44 corresponds to a probability of 66.8 %, so we cannot say the variances are different.

Tests for difference in means
* We will assume that these two different samples are independent of each other and come from two distinct populations :

  Population 1 : μ1, σ1 → Sample 1 : x̄1, s1, n1
  Population 2 : μ2, σ2 → Sample 2 : x̄2, s2, n2

* If our population SDs σ1 and σ2 are known, the test statistic is :

  z = ((x̄1 - x̄2) - (μ1 - μ2)) / √(σ1²/n1 + σ2²/n2)
* When σ1 and σ2 are unknown, the P-value is calculated using the t distribution.

Two-sample confidence intervals :
* In addition to two-sample t-tests, we can also use the t distribution to construct confidence intervals for the mean difference. When σ1 and σ2 are unknown, we can form the following 100·C % confidence interval for the mean difference μ1 - μ2 :

  (x̄1 - x̄2) ± t* √(s1²/n1 + s2²/n2)

* The critical value t* is calculated from a t distribution with k degrees of freedom, where k is equal to the smaller of (n1 - 1) and (n2 - 1).

Summary of two-sample tests :
1. Two independent samples with known σ1 and σ2 : P-values calculated using the standard normal distribution.
2. Two independent samples with unknown σ1 and σ2 : Use P-values calculated using the t distribution with degrees of freedom equal to the smaller of n1 - 1 and n2 - 1.
3. Two independent samples with unknown σ1 and σ2 and σ1 = σ2 : Use the two-sample t-test with the pooled variance estimator. The P-value is calculated using the t distribution with n1 + n2 - 2 degrees of freedom. The pooled variance estimator is

  Sp² = ((n1 - 1) s1² + (n2 - 1) s2²) / (n1 + n2 - 2)

* The test statistic that should be used in this situation is

  t = (x̄1 - x̄2 - 0) / (Sp √(1/n1 + 1/n2))

* Calculate the P-value by using the t distribution with (n1 + n2 - 2) degrees of freedom, and then compare it to the appropriate significance level. Alternatively, for a two-sided test, we can construct the appropriate confidence interval.
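The pooled two-sample t statistic described in case 3 can be sketched as follows; the sample summaries at the bottom are made-up illustrative numbers, not from the text:

```python
from math import sqrt

def pooled_two_sample_t(x1, s1, n1, x2, s2, n2):
    """Two-sample t statistic with pooled variance (assumes sigma1 = sigma2)."""
    # Pooled variance estimator Sp^2 = ((n1-1)s1^2 + (n2-1)s2^2) / (n1+n2-2)
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (x1 - x2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2      # statistic and its degrees of freedom

# Hypothetical samples: means 5.2 vs 4.8, SDs 0.9 and 1.1, sizes 12 and 15.
t, df = pooled_two_sample_t(5.2, 0.9, 12, 4.8, 1.1, 15)
print(round(t, 3), df)  # compare |t| with the t table at df = 25
```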
