Chapter 1: Probability
1.1 Outcomes and Events
1.2 Probability Functions/Measures
1.2.1 Properties of P[·]
1.3 Conditional Probability and Independence

Chapter 2: Random Variables
2.1 Random Variables and Cumulative Distribution Functions
2.2 Density Functions
2.3 Expectations and Moments
2.4 Expectation of a Function of a Random Variable
2.5 Two Important Inequality Results
2.6 Moments and Moment Generating Functions
2.7 Other Distribution Summaries

Chapter 3: Special Univariate Distributions
3.1 Discrete Distributions
3.1.1 Degenerate Distribution
3.1.2 Two Point Distribution
3.1.3 Uniform Distribution on n Points
3.1.4 Binomial Distribution
3.1.5 Negative Binomial Distribution (Pascal or Waiting-time Distribution)
3.1.6 Hyper-Geometric Distribution
3.1.7 Poisson Distribution
3.1.8 Multinomial Distribution (Generalized Binomial Distribution)
3.2 Continuous Distributions
3.2.1 Uniform Distribution
3.2.2 Gamma Distribution
3.2.3 Beta Distribution
3.2.4 Normal Distribution (Gaussian Law)
3.2.5 Cauchy Distribution

Chapter 4: Joint and Conditional Distributions
4.1 Joint Distributions
4.2 Special Multivariate Distributions
4.2.1 Multinomial Distribution
4.2.2 Bi-variate Normal Distribution
4.3 Conditional Distributions and Densities
4.4 Conditional Expectation
4.5 Conditional Expectations of Functions of Random Variables
4.6 Independence of Random Variables
4.7 Covariance and Correlation
4.8 Transformation of Random Variables: Y = g(X)
4.9 Moment-Generating-Function Technique

Chapter 5: Inference
5.1 Sample Statistics
5.2 Sampling Distributions
5.3 Point Estimation
5.4 Interval Estimation

Chapter 6: Maximum Likelihood Estimation
6.1 Likelihood Function and ML Estimator
6.2 Properties of MLEs

Chapter 7: Hypothesis Testing
7.1 Introduction
7.2 Simple Hypothesis Tests
7.3 Simple Null, Composite Alternative
7.4 Composite Hypothesis Tests

Chapter 8: Elements of Bayesian Inference
8.1 Introduction
8.2 Parameter Estimation
8.3 Conjugate Prior Distributions

Chapter 9: Markov Chains
9.1 Discrete Time Markov Chains

Chapter 10: Miscellaneous
10.1 Reliability Analysis
10.2 Quadratic Forms

Assignment Sheet-1
Assignment Sheet-2
Assignment Sheet-3
Assignment Sheet-4

CHAPTER 1
PROBABILITY

1.1 Outcomes and Events

We consider experiments, which comprise: a collection of distinguishable outcomes, termed elementary events and typically denoted by Ω, and a collection of sets of possible outcomes to which we might wish to assign probabilities, A, the events.

In order to obtain a sensible theory of probability, we require that our collection of events A is an algebra over Ω, i.e. it must possess the following properties:
(i) Ω ∈ A
(ii) If A₁ ∈ A, then A₁ᶜ ∈ A
(iii) If A₁ and A₂ ∈ A, then A₁ ∪ A₂ ∈ A.

In the case of finite Ω, we might note that the collection of all subsets of Ω necessarily satisfies the above properties, and by using this default choice of algebra we can assign probabilities to any possible combination of elementary events.

Proposition 1.1: If A is an algebra, then ∅ ∈ A.

Proposition 1.2: If A₁ and A₂ ∈ A, then A₁ ∩ A₂ ∈ A, for any algebra A.

Proposition 1.3: If A is an algebra and A₁, A₂, ..., Aₙ ∈ A, then ⋃ᵢ₌₁ⁿ Aᵢ ∈ A.
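The closure properties above are easy to check mechanically for a small finite Ω. The following is a minimal Python sketch, assuming a four-element Ω chosen only for illustration and taking the power set as the default algebra; it verifies properties (i)-(iii) together with Proposition 1.1.

```python
from itertools import chain, combinations

# A small finite sample space, chosen purely for illustration.
omega = frozenset({1, 2, 3, 4})

def power_set(s):
    """All subsets of s, each represented as a frozenset."""
    items = list(s)
    return {frozenset(c)
            for c in chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))}

A = power_set(omega)  # the default choice of algebra for a finite omega

assert omega in A                             # (i)  Omega itself is an event
assert all(omega - a in A for a in A)         # (ii) closed under complementation
assert all(a | b in A for a in A for b in A)  # (iii) closed under pairwise union
assert frozenset() in A                       # Proposition 1.1: the empty set is an event
print(len(A), "events; closure properties (i)-(iii) hold")
```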
1.2 Probability Functions/Measures

Let Ω denote the sample space and A denote a collection of events assumed to be a σ-algebra.

Definition 1.1 (Probability Function): A probability function P[·] is a set function with domain A (a σ-algebra of events) and range [0,1], i.e. P: A → [0,1], which satisfies the following axioms:
(i) P[A] ≥ 0 for every A ∈ A
(ii) P[Ω] = 1
(iii) If A₁, A₂, ... is a sequence of mutually exclusive events (i.e. Aᵢ ∩ Aⱼ = ∅ for any i ≠ j) in A, and if ⋃ᵢ₌₁^∞ Aᵢ ∈ A, then
P[⋃ᵢ₌₁^∞ Aᵢ] = Σᵢ₌₁^∞ P[Aᵢ].

1.2.1 Properties of P[·]

A remarkably rich theory emerges from these three axioms (together, of course, with those of set theory). Indeed, all formal probability follows as a logical consequence of these axioms. Some of the most important simple results are summarised here. Throughout this section, assume that Ω is our collection of possible outcomes, A is a σ-algebra over Ω, and P[·] is an associated probability distribution.

Many of these results simply demonstrate that things which we would intuitively want to be true of probabilities do, indeed, arise as logical consequences of this simple axiomatic framework.

Proposition 1.4: P[∅] = 0.

Proposition 1.5: If A₁, A₂, ..., Aₙ are pairwise disjoint elements of A, corresponding to mutually exclusive outcomes in our experiment, then
P[A₁ ∪ A₂ ∪ ... ∪ Aₙ] = Σᵢ₌₁ⁿ P[Aᵢ].

Proposition 1.6: If A ∈ A, then P[Aᶜ] = 1 − P[A].

Proposition 1.7: For any two events A, B ∈ A,
P[A ∪ B] = P[A] + P[B] − P[A ∩ B].

1.3 Conditional Probability and Independence

Definition (Conditional Probability): For events A, B ∈ A with P[B] > 0, the conditional probability of A given B is
P[A | B] = P[A ∩ B] / P[B],
and is left undefined when P[B] = 0.

Exercise 1.3.1: Consider the experiment of tossing two coins, Ω = {(H,H), (H,T), (T,H), (T,T)}, and assume that each point is equally likely. Find:
(i) the probability of two heads given a head on the first coin;
(ii) the probability of two heads given at least one head.

Theorem 1.1 (Law of Total Probability): For a given probability space (Ω, A, P[·]), if B₁, ..., Bₙ is a collection of mutually disjoint events in A satisfying Ω = ⋃ⱼ₌₁ⁿ Bⱼ (i.e. B₁, ..., Bₙ partition Ω) and P[Bⱼ] > 0, j = 1, ..., n, then for every A ∈ A,
P[A] = Σⱼ₌₁ⁿ P[A | Bⱼ] P[Bⱼ].

Conditional probability has a number of useful properties. The following elementary result is surprisingly important and has some far-reaching consequences.

Theorem 1.2 (Bayes' Formula): For a given probability space (Ω, A, P[·]), if A, B ∈ A are such that P[A] > 0 and P[B] > 0, then
P[A | B] = P[B | A] P[A] / P[B].

Theorem 1.3 (Partition Formula): If B₁, ..., Bₙ ∈ A partition Ω, then for any A ∈ A,
P[A] = Σⱼ₌₁ⁿ P[A | Bⱼ] P[Bⱼ].

Theorem 1.4 (Multiplication Rule): For a given probability space (Ω, A, P[·]), let A₁, ..., Aₙ be events belonging to A for which P[A₁ ∩ ... ∩ Aₙ₋₁] > 0. Then
P[A₁ ∩ A₂ ∩ ... ∩ Aₙ] = P[A₁] P[A₂ | A₁] P[A₃ | A₁ ∩ A₂] ... P[Aₙ | A₁ ∩ ... ∩ Aₙ₋₁].

Definition 1.4 (Independent Events): For a given probability space (Ω, A, P[·]), let A and B be two events in A. Events A and B are defined to be independent iff one of the following conditions is satisfied:
(i) P[A ∩ B] = P[A] P[B]
(ii) P[A | B] = P[A] if P[B] > 0
(iii) P[B | A] = P[B] if P[A] > 0.

Exercise 1.3.2: Consider the experiment of rolling two dice. Let A = {total is odd}, B = {6 on the first die}, C = {total is seven}.
(i) Are A and B independent?
(ii) Are A and C independent?
(iii) Are B and C independent?

Definition 1.5 (Independence of Several Events): For a given probability space (Ω, A, P[·]), let A₁, ..., Aₙ be events in A. Events A₁, ..., Aₙ are defined to be independent iff
(i) P[Aᵢ ∩ Aⱼ] = P[Aᵢ] P[Aⱼ] for all i ≠ j,
(ii) P[Aᵢ ∩ Aⱼ ∩ Aₖ] = P[Aᵢ] P[Aⱼ] P[Aₖ] for all i ≠ j, j ≠ k, i ≠ k,
and so on, up to
P[⋂ᵢ₌₁ⁿ Aᵢ] = ∏ᵢ₌₁ⁿ P[Aᵢ].
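Both exercises can be verified by direct enumeration of the equally likely sample points. The sketch below is a minimal Python illustration (the helper name pr is ad hoc, not from the notes); it computes the two conditional probabilities of Exercise 1.3.1 and checks condition (i) of Definition 1.4 for the pairs in Exercise 1.3.2.

```python
from itertools import product
from fractions import Fraction

def pr(event, space):
    """P[event] under the equally-likely assumption."""
    return Fraction(sum(1 for w in space if event(w)), len(space))

# Exercise 1.3.1: two fair coins.
coins = list(product("HT", repeat=2))
two_heads = lambda w: w == ("H", "H")
head_first = lambda w: w[0] == "H"
at_least_one = lambda w: "H" in w

# P[A | B] = P[A and B] / P[B]
p_i = pr(lambda w: two_heads(w) and head_first(w), coins) / pr(head_first, coins)
p_ii = pr(lambda w: two_heads(w) and at_least_one(w), coins) / pr(at_least_one, coins)
print("P[two heads | head on first coin] =", p_i)   # 1/2
print("P[two heads | at least one head]  =", p_ii)  # 1/3

# Exercise 1.3.2: two fair dice.
dice = list(product(range(1, 7), repeat=2))
A = lambda w: (w[0] + w[1]) % 2 == 1   # total is odd
B = lambda w: w[0] == 6                # 6 on the first die
C = lambda w: w[0] + w[1] == 7         # total is seven
for name, E, F in [("A,B", A, B), ("A,C", A, C), ("B,C", B, C)]:
    joint = pr(lambda w, E=E, F=F: E(w) and F(w), dice)
    print(f"{name}: P[E and F] = {joint}, independent: {joint == pr(E, dice) * pr(F, dice)}")
```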
CHAPTER 2
RANDOM VARIABLES

2.1 Random Variables and Cumulative Distribution Functions

We considered random events in the previous chapter: experimental outcomes which either do or do not occur. In general we cannot predict whether or not a random event will occur before we observe the outcome of the associated experiment, although if we know enough about the experiment we may be able to make good probabilistic predictions. The natural generalization of a random event is a random variable: an object which can take values in the set of real numbers (rather than simply happening or not happening), and for which the precise value it takes is not known before the experiment is observed.

The following definition may seem a little surprising if you have seen probability only outside of measure-theoretic settings in the past. In particular, random variables are deterministic functions: neither random nor variable in themselves. This definition is rather convenient; all randomness stems from the underlying probability space, and it is clear that random variables and random events are closely related. This definition also makes it straightforward to define multiple random variables related to a single experiment and to investigate and model the relationships between them.

Definition 2.1 (Random Variable): Given a probability space (Ω, A, P[·]), a random variable, X, is a function with domain Ω and co-domain R (the real line), i.e. X: Ω → R.

Example 2.1: Roll two dice, so Ω = {(i, j): i, j = 1, ..., 6}. Several random variables can be defined, for example X((i, j)) = i + j and Y((i, j)) = |i − j|. Both X and Y are random variables; X can take values 2, 3, ..., 12 and Y can take values 0, 1, ..., 5.

Definition 2.2 (Distribution Function): The distribution function of a random variable X, denoted by F_X(·), is defined to be the function F_X: R → [0,1] which assigns
F_X(x) = P[X ≤ x] = P[{ω: X(ω) ≤ x}] for every x ∈ R.

2.2 Density Functions

If X is a discrete random variable with mass points x₁, x₂, ..., then the discrete density function of X is defined by
f_X(x) = P[X = xⱼ] if x = xⱼ, j = 1, 2, ..., and f_X(x) = 0 otherwise.

Exercise 2.2.1: Consider the experiment of tossing two dice. Let X = {total of upturned faces} and Y = {absolute difference of upturned faces}.
(i) Give the probability function f_X and sketch it.
(ii) Give f_Y.

Definition 2.5: Any function f: R → [0,1] is defined to be a discrete density function if, for some countable set x₁, x₂, ..., xₙ, ...,
(i) f(xⱼ) ≥ 0 for j = 1, 2, ...,
(ii) f(x) = 0 for x ≠ xⱼ, j = 1, 2, ...,
(iii) Σ f(xⱼ) = 1, where the summation is over x₁, x₂, ..., xₙ, ....

Definition 2.6 (Continuous Random Variable): A random variable X is called continuous if there exists a function f_X(·) such that F_X(x) = ∫_{−∞}^{x} f_X(u) du for every x ∈ R.

Definition 2.7 (Probability Density Function of a Continuous Random Variable): If X is a continuous random variable, the function f_X in F_X(x) = ∫_{−∞}^{x} f_X(u) du is called the probability density function of X.

Definition 2.8: A function f: R → [0, ∞) is a probability density function iff
(i) f(x) ≥ 0 for all x,
(ii) ∫_{−∞}^{∞} f(x) dx = 1.

2.3 Expectations and Moments

Definition 2.9 (Expectation, Mean): Let X be a random variable. The mean of X, denoted by μ_X or E[X], is defined by
(i) E[X] = Σⱼ xⱼ f_X(xⱼ) if X is discrete with mass points x₁, x₂, ..., xⱼ, ...;
(ii) E[X] = ∫_{−∞}^{∞} x f_X(x) dx if X is continuous with density f_X(x).
Intuitively, E[X] is the centre of gravity of the unit mass that is specified by the density function.
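To make Definitions 2.2 and 2.9 concrete, here is a minimal Python sketch (illustrative only) that enumerates the dice of Example 2.1, builds the discrete density f_X and distribution function F_X of the total X, and evaluates E[X] as the weighted sum of the mass points.

```python
from itertools import product
from fractions import Fraction

# Example 2.1: roll two fair dice, X((i, j)) = i + j.
omega = list(product(range(1, 7), repeat=2))
X = {w: w[0] + w[1] for w in omega}

# Discrete density (mass) function f_X(x) = P[X = x]
f_X = {}
for w, x in X.items():
    f_X[x] = f_X.get(x, Fraction(0)) + Fraction(1, len(omega))

# Distribution function F_X(x) = P[X <= x] (Definition 2.2)
def F_X(x):
    return sum(p for v, p in f_X.items() if v <= x)

# Expectation (Definition 2.9, discrete case)
E_X = sum(x * p for x, p in f_X.items())

print("f_X :", {x: str(p) for x, p in sorted(f_X.items())})
print("F_X(7) =", F_X(7))   # 21/36 = 7/12
print("E[X]   =", E_X)      # 7
```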
Exercise 2.3.1: Consider the experiment of rolling two dice. Let X denote the total of the two dice and Y their absolute difference. Compute E[X] and E[Y].

Exercise 2.3.2: Let X be a continuous random variable with density f_X(x) = ... for 0 ≤ x ≤ ... (and 0 otherwise); compute E[X].

2.5 Two Important Inequality Results

(Markov's Inequality): If X is a random variable and g(·) is a non-negative function, then for every k > 0,
P[g(X) ≥ k] ≤ E[g(X)] / k.

Corollary 2.1 (Chebyshev's Inequality): If X is a random variable with finite variance σ_X², then
P[|X − μ_X| ≥ r σ_X] = P[(X − μ_X)² ≥ r² σ_X²] ≤ 1/r².
Note that the last statement can also be written as
P[|X − μ_X| < r σ_X] ≥ 1 − 1/r².
Thus the probability that X falls within r σ_X units of μ_X is greater than or equal to 1 − 1/r².

3.1.7 Poisson Distribution

A random variable X is said to have a Poisson distribution with parameter λ > 0 if its PMF is given by
P(X = k) = e^{−λ} λ^k / k!, k = 0, 1, 2, ....

Results: Let X₁, X₂, ..., Xₙ be independent Poisson R.V.'s with Xᵢ ~ P(λᵢ), i = 1, ..., n. Then Sₙ = X₁ + ... + Xₙ is a P(λ₁ + ... + λₙ) R.V. The converse is also true.
Let X and Y be independent R.V.'s with P(λ₁) and P(λ₂) distributions respectively; then the conditional distribution of X given X + Y is binomial. (The converse is also true.)

Uses of the Poisson Distribution
For large n and small p, X ~ Bin(n, p) is approximately distributed as Poi(np). This is sometimes termed the "law of small numbers".
A Poisson process with rate λ per unit time is such that:
(i) X, the number of occurrences of an event in any given time interval of length t, is Poi(λt);
(ii) the numbers of events in non-overlapping time intervals are independent random variables (see later).

3.1.8 Multinomial Distribution (Generalized Binomial Distribution)

Let x₁, ..., x_{k−1} be non-negative integers such that x₁ + ... + x_{k−1} ≤ n. The probability that exactly xᵢ trials terminate in outcome Aᵢ (i = 1, ..., k−1), and hence that x_k = n − (x₁ + ... + x_{k−1}) trials terminate in A_k, is
P(X₁ = x₁, ..., X_{k−1} = x_{k−1}) = [n! / (x₁! x₂! ... x_k!)] p₁^{x₁} p₂^{x₂} ... p_k^{x_k}.
E(Xⱼ) = n pⱼ, Var(Xⱼ) = n pⱼ(1 − pⱼ), Cov(Xⱼ, X_k) = −n pⱼ p_k for j ≠ k.

Summary of the standard discrete distributions:
Poisson P(λ): P(X = k) = e^{−λ} λ^k / k!, k = 0, 1, 2, ...; E(X) = λ; Var(X) = λ; M(t) = exp(λ(e^t − 1)).
Binomial B(n, p): P(X = k) = C(n, k) p^k (1−p)^{n−k}, k = 0, 1, ..., n; E(X) = np; Var(X) = np(1−p); M(t) = (1 − p + p e^t)^n.
Uniform on n points {1, ..., n}: P(X = k) = 1/n; E(X) = (n+1)/2; Var(X) = (n²−1)/12.
Two point (values x₁, x₂ with probabilities p, 1−p): E(X) = p x₁ + (1−p) x₂; Var(X) = p(1−p)(x₁ − x₂)²; M(t) = p e^{t x₁} + (1−p) e^{t x₂}.
Negative binomial NB(r, p) (number of trials needed for the r-th success): P(X = k) = C(k−1, r−1) p^r (1−p)^{k−r}, k = r, r+1, ...; E(X) = r/p; Var(X) = r(1−p)/p²; M(t) = (p e^t / (1 − (1−p) e^t))^r.
Hypergeometric (N, M, n): P(X = k) = C(M, k) C(N−M, n−k) / C(N, n); E(X) = nM/N; Var(X) = n (M/N)(1 − M/N)(N−n)/(N−1).

3.2 Continuous Distributions

3.2.1 Uniform Distribution
An R.V. X is said to have the uniform distribution on [a, b] if its PDF is given by
f(x) = 1/(b − a) for a ≤ x ≤ b; 0 otherwise.
M(t) = (e^{tb} − e^{ta}) / (t(b − a)) for t ≠ 0, and M(0) = 1.
Result: Let X be an R.V. with a continuous DF F; then F(X) has the uniform distribution on [0, 1].

3.2.2 Gamma Distribution
An R.V. X is said to have a gamma distribution with parameters α and β (α > 0, β > 0) if its PDF is
f(x) = x^{α−1} e^{−x/β} / (Γ(α) β^α) for x > 0; 0 otherwise.
We write X ~ G(α, β).
(a) When α = 1, f(x) = (1/β) e^{−x/β}, x > 0, which is the exponential distribution.
(b) When α = n/2 (n > 0 an integer) and β = 2, X is said to have the chi-square χ²(n) distribution, with
E(X) = n, Var(X) = 2n, M(t) = 1/(1 − 2t)^{n/2}.

Results: Let Xᵢ (i = 1, ..., n) be independent R.V.'s such that Xᵢ ~ G(αᵢ, β). Then
Sₙ = Σᵢ Xᵢ ~ G(Σᵢ αᵢ, β).
Corollary:
(i) Take αᵢ = 1 for all i; then Sₙ ~ G(n, β), i.e. a sum of independent exponentials is Gamma.
(ii) Let X ~ G(α₁, β) and Y ~ G(α₂, β) be independent R.V.'s; then X + Y and X/Y are independent, and X + Y and X/(X + Y) are independent. The converse is also true.
(iii) Memoryless property of the exponential: P(X > r + s | X > s) = P(X > r), where X ~ exp(λ).
(iv) If X and Y are independent exponential R.V.'s with the same parameter, then X/(X + Y) has a U(0, 1) distribution.

3.2.3 Beta Distribution
An R.V. X is said to have a beta distribution with parameters α and β (α > 0, β > 0) if its PDF is
f(x) = x^{α−1} (1 − x)^{β−1} / B(α, β) for 0 < x < 1; 0 otherwise.
We write X ~ B(α, β).
E(X) = α/(α + β), Var(X) = αβ / ((α + β)²(α + β + 1)).
Note: if α = β = 1, we have U(0, 1).
Results:
(i) If X ~ B(α, β) then 1 − X ~ B(β, α).
(ii) Let X ~ G(α₁, β) and Y ~ G(α₂, β) be independent; then X/(X + Y) ~ B(α₁, α₂).
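Corollary (i) of the gamma results above can be checked numerically. The following sketch assumes numpy and scipy are available and uses arbitrary illustrative values for n, β and the number of replications; it simulates sums of independent exponentials and compares their moments and distribution with G(n, β).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, beta, reps = 5, 2.0, 100_000   # illustrative choices, not from the notes

# S_n = X_1 + ... + X_n, with X_i independent exponential of scale beta, i.e. G(1, beta)
s = rng.exponential(scale=beta, size=(reps, n)).sum(axis=1)

# G(n, beta) has mean n*beta and variance n*beta^2
print("simulated mean, var :", s.mean().round(3), s.var().round(3))
print("G(n, beta) mean, var:", n * beta, n * beta ** 2)

# Kolmogorov-Smirnov distance against the Gamma(shape=n, scale=beta) CDF should be small
ks = stats.kstest(s, "gamma", args=(n, 0, beta))
print("KS statistic:", round(ks.statistic, 4))
```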
3.2.4 Normal Distribution (Gaussian Law)
(a) An R.V. X is said to have a standard normal distribution if its PDF is
f(x) = (1/√(2π)) e^{−x²/2}, −∞ < x < ∞; we write X ~ N(0, 1).
(b) An R.V. X is said to have a normal distribution with parameters μ and σ (> 0) if its PDF is
f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²)), −∞ < x < ∞; we write X ~ N(μ, σ²).
Central moments: E[(X − μ)ⁿ] = 0 if n is odd, and
E[(X − μ)^{2n}] = [(2n − 1)(2n − 3) ... 3·1] σ^{2n}.

Results:
(i) Let X₁, ..., Xₙ be independent R.V.'s such that Xᵢ ~ N(μᵢ, σᵢ²), i = 1, ..., n. Then
Sₙ = Σᵢ Xᵢ ~ N(Σᵢ μᵢ, Σᵢ σᵢ²).
Corollary:
(a) If the Xᵢ are i.i.d. N(μ, σ²), then X̄ ~ N(μ, σ²/n).
(b) If Xᵢ ~ N(0, 1) (i = 1, ..., n) are independent, then Sₙ/√n ~ N(0, 1).
(ii) Let X and Y be independent R.V.'s; then X + Y is normal iff X and Y are normal.
(iii) Let X and Y be independent N(0, 1) R.V.'s; then X + Y and X − Y are independent.
(iv) Let X₁ and X₂ be independent N(μ₁, σ²) and N(μ₂, σ²) R.V.'s; then X₁ − X₂ and X₁ + X₂ are independent.
(v) (a) X ~ N(0, 1) ⟹ X² ~ χ²(1); (b) X ~ N(μ, σ²) ⟹ aX + b ~ N(aμ + b, a²σ²).
(vi) X ~ N(μ, σ²) ⟹ Z = (X − μ)/σ ~ N(0, 1).
(vii) If X and Y are i.i.d. N(0, σ²) R.V.'s, then X/Y has a standard Cauchy distribution.

3.2.5 Cauchy Distribution
An R.V. X is said to have a Cauchy distribution with parameters μ and θ if its PDF is
f(x) = (θ/π) · 1/(θ² + (x − μ)²), −∞ < x < ∞.

4.2.2 Bi-variate Normal Distribution
The bi-variate normal density has parameters μ₁, μ₂, σ₁ > 0, σ₂ > 0 and −1 < ρ < 1.
(a) ρ is the correlation coefficient.
(b) For normal densities ρ = 0 corresponds to X₁ and X₂ being independent: the bi-variate normal density is then the product of two uni-variate normal densities.
(c) X + Y and X − Y are independent iff σ₁² = σ₂².
(d) X and Y are independent iff ρ = 0.

Extension to the Multivariate Normal Distribution: In fact, it is straightforward to extend the normal distribution to vectors of arbitrary length. The multivariate normal distribution N(x; μ, Σ) has density
f(x; μ, Σ) = (2π)^{−k/2} |Σ|^{−1/2} exp(−(1/2)(x − μ)ᵀ Σ⁻¹ (x − μ)),
where x is a vector; its mean μ is itself a vector, and Σ is the variance-covariance matrix. If x is k-dimensional, then Σ is a k×k matrix.

4.3 Conditional Distributions and Densities

Given several random variables, how much information does knowing one provide about the others? The notion of conditional probability provides an explicit answer to this question.

Definition 4.7 (Conditional Discrete Density Function): For discrete random variables X and Y with probability mass points x₁, x₂, ..., xₙ and y₁, y₂, ..., yₘ,
f_{Y|X}(yⱼ | xᵢ) = P[Y = yⱼ | X = xᵢ] = f_{X,Y}(xᵢ, yⱼ) / f_X(xᵢ)
is called the conditional discrete density function of Y given X = x.

Definition 4.8 (Conditional Discrete Distribution): For jointly discrete random variables X and Y,
F_{Y|X}(y | x) = P[Y ≤ y | X = x] = Σ_{yⱼ ≤ y} f_{Y|X}(yⱼ | x)
is called the conditional discrete distribution of Y given X = x.

Definition 4.9 (Conditional Probability Density Function): For continuous random variables X and Y with joint probability density function f_{X,Y}(x, y),
f_{Y|X}(y | x) = f_{X,Y}(x, y) / f_X(x), if f_X(x) > 0,
where f_X(x) is the marginal density of X.

Conditional Distribution: For jointly continuous random variables X and Y,
F_{Y|X}(y | x) = ∫_{−∞}^{y} f_{Y|X}(z | x) dz, for all x such that f_X(x) > 0.

4.4 Conditional Expectation

We can also ask what the expected behaviour of one random variable is, given knowledge of the value of a second random variable, and this gives rise to the idea of conditional expectation.

Definition 4.10 (Conditional Expectation): The conditional expectation in the discrete and continuous cases corresponds to an expectation with respect to the appropriate conditional probability distribution:
Discrete: E[Y | X = x] = Σⱼ yⱼ f_{Y|X}(yⱼ | x), for all x such that f_X(x) > 0.
Continuous: E[Y | X = x] = ∫_{−∞}^{∞} y f_{Y|X}(y | x) dy, for all x such that f_X(x) > 0.
Note that before X is known to take the value x, E[Y | X] is itself a random variable, being a function of the random variable X.
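As an illustration of Definition 4.10 in the discrete case, the sketch below (plain Python, names chosen ad hoc) returns to the dice of Example 2.1, computes E[Y | X = x] for every value of the total X, and then averages this function of X over f_X, which recovers E[Y] and anticipates the tower property stated next.

```python
from itertools import product
from fractions import Fraction
from collections import defaultdict

# Dice example again: X = total, Y = absolute difference.
omega = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(omega))

joint = defaultdict(Fraction)               # f_{X,Y}(x, y)
for i, j in omega:
    joint[(i + j, abs(i - j))] += p

f_X = defaultdict(Fraction)                 # marginal of X
for (x, y), pr in joint.items():
    f_X[x] += pr

# E[Y | X = x] = sum_y y * f_{Y|X}(y | x), with f_{Y|X} = f_{X,Y} / f_X
cond_exp = {x: sum(y * pr / f_X[x] for (xx, y), pr in joint.items() if xx == x)
            for x in f_X}
for x in sorted(cond_exp):
    print(f"E[Y | X = {x:2d}] = {cond_exp[x]}")

# E[Y | X] is a function of X; averaging it over f_X recovers E[Y]
E_Y = sum(y * pr for (_, y), pr in joint.items())
print("E[E[Y|X]] =", sum(cond_exp[x] * f_X[x] for x in f_X), "  E[Y] =", E_Y)
```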
We might be interested in the distribution of the random variable E[Y | X].

Theorem 4.1 (Tower Property of Conditional Expectation): For any two random variables X₁ and X₂,
E[E[X₁ | X₂]] = E[X₁].

Exercise 4.4.1: Suppose that Θ ~ U[0, 1] and (X | Θ = θ) ~ Bin(2, θ). Find E[X | Θ] and hence, or otherwise, show that E[X] = 1.

4.5 Conditional Expectations of Functions of Random Variables

By extending the theorem on marginal expectations we can relate the conditional and marginal expectations of functions of random variables (in particular, their variances).

Theorem 4.2 (Marginal Expectation of a Transformed Random Variable): For any random variables X₁ and X₂, and for any function g(·),
E[E[g(X₁) | X₂]] = E[g(X₁)].

Theorem 4.3 (Marginal Variance): For any random variables X₁ and X₂,
Var(X₁) = E[Var(X₁ | X₂)] + Var(E[X₁ | X₂]).

4.6 Independence of Random Variables

Whilst the previous sections have been concerned with the information that one random variable carries about another, it would seem that there must be pairs of random variables which each provide no information whatsoever about the other. It is, for example, difficult to imagine that the value obtained when a die is rolled in Coventry will tell us much about the outcome of a coin toss taking place at the same time in Lancaster. There are two equivalent statements of a property termed stochastic independence which capture precisely this idea. The following two definitions are equivalent for both discrete and continuous random variables.

Definition 4.11 (Stochastic Independence):
Definition 1: Random variables X₁, X₂, ..., Xₙ are stochastically independent iff
F_{X₁,...,Xₙ}(x₁, ..., xₙ) = ∏ᵢ₌₁ⁿ F_{Xᵢ}(xᵢ).
Definition 2: Random variables X₁, X₂, ..., Xₙ are stochastically independent iff
f_{X₁,...,Xₙ}(x₁, ..., xₙ) = ∏ᵢ₌₁ⁿ f_{Xᵢ}(xᵢ).
If X₁ and X₂ are independent, then their conditional densities are equal to their marginal densities.

4.7 Covariance and Correlation

Having established that sometimes one random variable does convey information about another, and that in other cases knowing the value of a random variable tells us nothing useful about another random variable, it is useful to have mechanisms for characterising the relationship between pairs (or larger groups) of random variables.

Definition 4.12 (Covariance and Correlation):
Covariance: For random variables X and Y defined on the same probability space,
Cov[X, Y] = E[(X − μ_X)(Y − μ_Y)] = E[XY] − μ_X μ_Y.
Correlation: For random variables X and Y defined on the same probability space,
ρ[X, Y] = Cov[X, Y] / (σ_X σ_Y), provided that σ_X > 0 and σ_Y > 0.

Theorem 4.4 (Cauchy-Schwarz Inequality): Let X and Y have finite second moments. Then
(E[XY])² ≤ E[X²] E[Y²],
with equality if and only if P[Y = cX] = 1 for some constant c.

4.8 Transformation of Random Variables: Y = g(X)

Theorem 4.5 (Distribution of a Function of a Random Variable): Let X be a random variable and Y = g(X), where g is injective (i.e. it maps at most one x to any value y). Then
f_Y(y) = f_X(g⁻¹(y)) |d g⁻¹(y)/dy|,
given that (g⁻¹(y))′ exists and either (g⁻¹(y))′ > 0 for all y or (g⁻¹(y))′ < 0 for all y.
If g is not bijective (one-to-one), there may be values of y for which there exists no x such that y = g(x). Such points clearly have density zero.
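A quick numerical check of Theorem 4.5: the sketch below assumes numpy, takes X exponential and g(x) = x² (both arbitrary illustrative choices, g being injective on x ≥ 0), and compares the empirical distribution of Y = g(X) with the distribution function obtained by integrating the change-of-variables density.

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 1.5                                 # illustrative rate, not from the notes
x = rng.exponential(scale=1 / lam, size=200_000)
y = x ** 2                                # Y = g(X) with g(x) = x**2, injective on x >= 0

# Theorem 4.5 with g^{-1}(y) = sqrt(y) gives
#   f_Y(y) = lam * exp(-lam*sqrt(y)) / (2*sqrt(y)),
# and integrating this density gives F_Y(y) = 1 - exp(-lam*sqrt(y)).
def F_Y(y):
    return 1.0 - np.exp(-lam * np.sqrt(y))

# Compare the empirical CDF of the simulated Y with F_Y at a few points.
for q in (0.1, 0.5, 1.0, 2.0, 4.0):
    print(f"y = {q:4.1f}: empirical {np.mean(y <= q):.4f}  vs  F_Y {F_Y(q):.4f}")
```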
When the conditions of this theorem are not satisfied it is necessary to be a little more careful. The most general approach for finding the density of a transformed random variable is to explicitly construct the distribution function of the transformed random variable, and then to use the standard approach to turn the distribution function into a density (this approach is discussed in Larry Wasserman's "All of Statistics").

Exercise 4.8.1: Let X be distributed exponentially with parameter λ, that is
f_X(x) = λ e^{−λx} for x ≥ 0, and 0 for x < 0.
Find the density function of Y = g(X) with:
(i) g(x) = x² for x ≥ 0 (and 0 for x < 0);
(ii) g(x) = x for 0 ≤ x ≤ 1, and g(x) = 1 for x ≥ 1.

Theorem 4.6 (Probability Integral Transformation): If X is a random variable with continuous F_X(x), then U = F_X(X) is uniformly distributed over the interval (0, 1). Conversely, if U is uniform over (0, 1), then X = F_X⁻¹(U) has distribution function F_X.

4.9 Moment-Generating-Function Technique

The following technique is but one example of a situation in which the moment generating function proves invaluable.

Function of a Variable: For Y = g(X), compute m_Y(t) = E[e^{tY}] = E[e^{t g(X)}]. If the result is the MGF of a known distribution, then it will follow that Y has that distribution.

Sums of Independent Random Variables: For Y = Σᵢ Xᵢ, where the Xᵢ are independent random variables whose MGFs exist for −h < t < h,
m_Y(t) = E[e^{tY}] = ∏ᵢ m_{Xᵢ}(t) for −h < t < h, where m_{Xᵢ}(t) = E[e^{t Xᵢ}].
Thus ∏ᵢ m_{Xᵢ}(t) may be used to identify the distribution of Y as above.

CHAPTER 5
INFERENCE

5.1 Sample Statistics

Suppose we select a sample of size n from a population of size N. For each i in {1, ..., n}, let Xᵢ be a random variable denoting the outcome of the i-th observation of a variable of interest. For example, Xᵢ might be the height of the i-th person sampled. Under the assumptions of simple random sampling, the Xᵢ are independent and identically distributed (iid).

Therefore, if the distribution of a single unit sampled from the population can be characterized by a distribution with density function f, the marginal density function of each Xᵢ is also f, and their joint density function g is a simple product of their marginal densities:
g(x₁, x₂, ..., xₙ) = f(x₁) f(x₂) ... f(xₙ).

In order to make inferences about a population parameter, we use sample data to form an estimate of the population parameter. We calculate our estimate using an estimator or sample statistic, which is a function of the Xᵢ. We have already seen examples of sample statistics, for example the sample mean
X̄ = (1/n) Σᵢ₌₁ⁿ Xᵢ,
where n is the size of the sample, is an estimator of the population mean; e.g. for discrete X we have μ = Σⱼ₌₁^N xⱼ f_X(xⱼ), where N is the number of distinct values which it is possible for an Xᵢ to take.

5.2 Sampling Distributions

Since an estimator θ̂ is a function of random variables, it follows that θ̂ is itself a random variable and possesses its own distribution. The probability distribution of an estimator is itself called a sampling distribution.

Proposition 5.1 (Distribution of the Sample Mean): Let X̄ denote the sample mean of a random sample of size n from a normal distribution with mean μ and variance σ². Then
X̄ ~ N(μ, σ²/n).
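Proposition 5.1 is easy to verify by simulation. The sketch below assumes numpy and uses arbitrary illustrative values of μ, σ and n; it draws many samples of size n from N(μ, σ²) and compares the mean and variance of the resulting sample means with μ and σ²/n.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 10.0, 3.0, 25, 50_000   # illustrative values only

# Draw many random samples of size n from N(mu, sigma^2) and record each sample mean
samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)

# Proposition 5.1: Xbar ~ N(mu, sigma^2 / n)
print("mean of Xbar     :", xbar.mean().round(3), " (theory:", mu, ")")
print("variance of Xbar :", xbar.var().round(3), " (theory:", sigma ** 2 / n, ")")
```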
Theorem 5.1 (Central Limit Theorem): Let f be a density function with mean μ and finite variance σ². Let X̄ be the sample mean of a random sample of size n from f, and let
Zₙ = (X̄ − μ) / (σ/√n).
Then the distribution of Zₙ approaches the standard normal distribution as n → ∞. This is often written as
Zₙ →d N(0, 1),
with →d denoting convergence in distribution.

Thus, if the sample size is "large enough", the sample mean can be assumed to follow a normal distribution regardless of the population distribution. In practice, this assumption is often taken to be valid for a sample size n > 30.

The Chi-Squared Distribution
The chi-squared distribution is a special case of the gamma distribution; the suitably scaled sample variance of a normal random sample has a χ² distribution with n − 1 degrees of freedom.

Definition 5.1: If X is a random variable with density
f(x) = x^{k/2 − 1} e^{−x/2} / (Γ(k/2) 2^{k/2}) for x > 0; 0 otherwise,
then X is defined to have a χ² distribution with k degrees of freedom (χ²_k), where k is a positive integer. Thus the χ²_k density is a gamma density with α = k/2 and β = 2.

Result: If the R.V.'s Xᵢ, i = 1, ..., n, are independently normally distributed with means μᵢ and variances σᵢ², then
U = Σᵢ₌₁ⁿ ((Xᵢ − μᵢ)/σᵢ)²
has a χ²ₙ distribution.

Theorem 5.2: If X₁, ..., Xₙ is a random sample from a normal distribution with mean μ and variance σ², then
(i) X̄ and Σᵢ₌₁ⁿ (Xᵢ − X̄)² are independent;
(ii) Σᵢ₌₁ⁿ (Xᵢ − X̄)² / σ² has a χ²_{n−1} distribution.

Corollary: If S² = (1/(n−1)) Σᵢ₌₁ⁿ (Xᵢ − X̄)² is the sample variance of a random sample of size n from a normal distribution with mean μ and variance σ², then
(n − 1) S² / σ² ~ χ²_{n−1}.

The t Distribution
The t distribution is closely related to the normal distribution and is needed for making inferences about the mean of a normal distribution when the variance is also unknown.

Definition 5.2: If Z ~ N(0, 1), U ~ χ²_k, and Z and U are independent of one another, then
T = Z / √(U/k) ~ t_k,
where t_k denotes a t distribution with k degrees of freedom. The density of the t_k distribution is
f(t) = [Γ((k+1)/2) / (√(kπ) Γ(k/2))] (1 + t²/k)^{−(k+1)/2}, for −∞ < t < ∞.
E[X] = 0 for k > 1 (although an extension known as the Cauchy principal value can be defined more generally), and it can be shown that
Var[X] = k/(k − 2), for k > 2.

Theorem 5.3: If X ~ t_k, then X →d N(0, 1) as k → ∞. That is, as k → ∞ the density of t_k approaches the density of a standard normal.
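Theorem 5.3 can be visualised directly. The sketch below assumes scipy and compares the t_k density with the standard normal density on a grid for a few illustrative degrees of freedom; the maximum gap shrinks as k grows.

```python
import numpy as np
from scipy import stats

# Maximum absolute difference between the t_k density and the N(0,1) density
grid = np.linspace(-6, 6, 1001)
for k in (1, 5, 30, 200):                       # illustrative degrees of freedom
    gap = np.max(np.abs(stats.t.pdf(grid, df=k) - stats.norm.pdf(grid)))
    print(f"k = {k:4d}: max |t_k pdf - N(0,1) pdf| = {gap:.4f}")
```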
Example 5.1: Consider the following estimator ofthe population variance aE (a 2HARI Fen Fos) Sara Haus Kis, Near LUT, Now De rma neiddrnndem cam Web worm dianeadenn com 16, Ph) 26757, Cal 91RD OTT ASRTTID Ea So this estimator is biased with bias equial to—2~ As this decays to zero as. n> it is an asymptotically unbiased estimator. However, we can see that the sample variance Consistency: In order to define a consistent éstimator, we first define ‘convergence in probability. Definition 5.5 (Convergence in Probal Let {X,} be a sequence of random variables and let be @ random variable. We say that X, converges in probability to X if Ve>0 lim P[|X,-¥|2¢]=0, or equivalently tim P(x, —X] x. Definition 5.6 (Consistent Estimator): Let X),..., be a sample from the distribution of X where X is a random variable with distribution function F(x; 0). Let 6 denote a statistic. 6 is a consistent estimator of 0 if, whatever the value of 6, b6—+0 A particular case of consistency is often used to justify using sample averages: Theorem 54 (Weak Law of Large Numbers): Let ¥=15"",x, with Xiong td Then Xu, ite. X is a consistent estimator of 1. "BAL, (Fist Foo) a Sarak Haus King Naw LL, New Dal mai arta coms Webster dieteaden co 6, Ps (I) 27, Cale SOIR A OTT, TD eam = 5.4. Ay Theorene 5.5: Suppose X,—P-»a and the real function g is continuous at aa. Then g(X,)—2->2(2). Consistency according to the definition above may be hard to prove, but it turns out that a sufficient (though not necessary) condition for consistency is that 8ias(8) +0 and Var(8)—+0 as n>. Definition 5.7 (Consistency in Mean-Squared Error): If 6 is an estimator of , then the mean squared error of 6 is defined as wse()=#[(0-0)'] and 6 is said to be consistent in M! if Mse(6) +0. as the. size of the sample on which 6 is based increases to infinity. Result: Mse(6) = var(6) +[sias(@)] Interval Estimation This section describes confidence intervals which ave intervals constructed such that they contain @ with some level of confidence. Definition 5.8 (Confidence Interval): Let Xj,...X, be @ random sample feom a distribution with pdf f(x; 6) where @ is afi unknown parameter in the parameter space ©. 1f {and U arestatisties such that P[Lsesu then the interval (1,U) i8 a 100(1~a)% confidence interval for @ \-a is known as .the confidence coefficient and a is the level of significance. There are several ways in which confidence intervals can be constructed. The basic procedure we shall use t0 construct a 100(I-«)% confidence interval for a parameter is as follows (i) Select a sample statistic to estimate the parameter. (ii) Identify the sampling distribution forthe statistic. (iii) Determine the bounds within which the sample statistic will reside with probability 1a, iv) Invert these bounds to obtain an expression in terms of 6 Confidence Intervals based on the CLT: Suppose that we are interested in the population mean @ and we wish to use the sample mean X as an estimator for @. Then if the population density has variance o the CLT states that 4 5N(0,1) Where nm is the sample size. 7A, rn iow Sis Sara Hoa Khas, Near LT, New De 0016, Pas HART ‘Cosi etsadiracaden com Webuts wormdpaetdemcom Hence P(-196
