Standard-Brm U-2 Notes

BIOSTATISTICS AND RESEARCH METHODOLOGY

Regression is the study of the formulation and determination of an algebraic expression for the relationship between variables. It also predicts the value of one variable from that of the other. For example, the equation of a line is $y = a + bx$. Here the aim is to estimate the relation $y = a + bx$, i.e. the regression of y on x, where y is treated as the dependent variable and x as the independent variable.

REGRESSION LINES

Consider two straight lines, X on Y and Y on X, where X and Y are two variables. In the X on Y line, X is the dependent variable and Y is the independent variable. In the Y on X line, Y is the dependent variable and X is the independent variable.

Let $\bar{X}$ and $\bar{Y}$ be the mean values of the X and Y series respectively, let $\sigma_x$ and $\sigma_y$ be the standard deviations of the X and Y series respectively, and let r be the correlation coefficient.

(i) The regression equation of Y on X is given by
$Y - \bar{Y} = r\dfrac{\sigma_y}{\sigma_x}(X - \bar{X})$ ...(i)
or $Y - \bar{Y} = b_{yx}(X - \bar{X})$, where $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$ ...(ii)

(ii) The regression equation of X on Y is given by
$X - \bar{X} = r\dfrac{\sigma_x}{\sigma_y}(Y - \bar{Y})$ ...(iii)
or $X - \bar{X} = b_{xy}(Y - \bar{Y})$, where $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$ ...(iv)

Here $b_{yx}$ and $b_{xy}$ are known as the regression coefficients, and the value of r lies between -1 and +1.

Remark. Multiplying the coefficients in (ii) and (iv), we get $b_{yx} \cdot b_{xy} = r^2$.

4.2 LEAST SQUARE METHOD

This method fits a curve to a given set of data, e.g. $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Let $y = f(x)$ be a curve fitted to n observations. We define the residual at the point $x_1$ as $E_1 = y_1 - f(x_1)$. Similarly, the residuals at the other points are $E_2 = y_2 - f(x_2)$, $E_3 = y_3 - f(x_3)$, and so on. The residuals may be positive or negative, so they are squared before being summed. The total is $E = \sum_{i=1}^{n} E_i^2$, and the fitted curve is the one that minimizes E.
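The regression coefficients defined in the regression-line equations above can be checked numerically. The following is a minimal sketch, assuming population (divide-by-n) standard deviations; the function name `regression_coefficients` and the four data points are illustrative choices, not part of the notes.

```python
import math


def regression_coefficients(xs, ys):
    """Return (r, b_yx, b_xy) for paired observations xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population standard deviations (divide by n), an assumption of this sketch.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    r = cov / (sd_x * sd_y)      # correlation coefficient
    b_yx = r * sd_y / sd_x       # regression coefficient of Y on X
    b_xy = r * sd_x / sd_y       # regression coefficient of X on Y
    return r, b_yx, b_xy


# Illustrative data only.
r, b_yx, b_xy = regression_coefficients([1, 2, 3, 4], [12, 14, 17, 18])
print("r =", round(r, 4))
print("b_yx =", round(b_yx, 4))
print("b_yx * b_xy - r^2 =", abs(round(b_yx * b_xy - r * r, 10)))
```

The last printed value should be zero up to floating-point rounding, which is the remark that the product of the two regression coefficients equals $r^2$.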
FITTING A STRAIGHT LINE

Let $y = a + bx$ ...(1) be a straight line fitted to n observations. The residuals are $E_i = y_i - (a + bx_i)$, $i = 1, 2, \ldots, n$.

The sum of the squares of the residuals is
$E = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} [y_i - (a + bx_i)]^2$ ...(2)

Since E is a function of the two parameters a and b, the necessary conditions for E to be a minimum are
$\dfrac{\partial E}{\partial a} = 0$ and $\dfrac{\partial E}{\partial b} = 0$.

If $\dfrac{\partial E}{\partial b} = 0$, then $-2\sum x_i[y_i - (a + bx_i)] = 0$, i.e.
$\sum x_i y_i = a\sum x_i + b\sum x_i^2$ ...(3)

Similarly, if $\dfrac{\partial E}{\partial a} = 0$, we get
$\sum y_i = b\sum x_i + na$ ...(4)

Solving the normal equations (3) and (4) gives the required values of a and b.

Q.1. Using the method of least squares, find an equation of the form $y = a + bx$ that fits the given data:

| X | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Y | 1 | 3 | 5 | 8 | 10 | 12 |

Solution: The normal equations for the least squares fit of a straight line are
$\sum XY = a\sum X + b\sum X^2$ ...(1)
and $\sum Y = b\sum X + na$ ...(2)

Given n = 6, construct the calculation table:

| X | Y | XY | X² |
|---|---|---|---|
| 0 | 1 | 0 | 0 |
| 1 | 3 | 3 | 1 |
| 2 | 5 | 10 | 4 |
| 3 | 8 | 24 | 9 |
| 4 | 10 | 40 | 16 |
| 5 | 12 | 60 | 25 |
| Total 15 | 39 | 137 | 55 |

We have $\sum X = 15$, $\sum Y = 39$, $\sum XY = 137$, $\sum X^2 = 55$. Using equations (1) and (2), we get
137 = 15a + 55b
39 = 6a + 15b

Solving these equations, $b = \dfrac{6 \times 137 - 15 \times 39}{6 \times 55 - 15^2} = \dfrac{237}{105} = 2.257$ and $a = \dfrac{39 - 15b}{6} = 0.857$.

Putting these values in Y = a + bX, the fitted equation for the given data is Y = 0.857 + 2.257X.

LINE OF REGRESSION OF Y ON X

Let Y = a + bX be the straight line of Y on X. The residual of Y on X is given by
$E_i = Y_i - (a + bX_i)$ ...(1)

By the definition of the residual, we have
$E = \sum E_i^2 = \sum [Y_i - (a + bX_i)]^2$ ...(2)

The necessary condition for E to be a minimum is
$\dfrac{\partial E}{\partial a} = 0$ and $\dfrac{\partial E}{\partial b} = 0$.

Using (2), we get
$\sum XY = a\sum X + b\sum X^2$
and $\sum Y = b\sum X + na$.

These equations are known as normal equations.
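The two normal equations can be solved directly for a and b. The sketch below is one way to do it in plain Python, applied to the six data points of Q.1; the function name `fit_line` is a hypothetical helper introduced only for illustration.

```python
def fit_line(xs, ys):
    """Solve the normal equations of the least squares line y = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Eliminating a from the two normal equations gives b.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    # The second normal equation, sum_y = b*sum_x + n*a, then gives a.
    a = (sum_y - b * sum_x) / n
    return a, b


# Data of Q.1: X = 0..5, Y = 1, 3, 5, 8, 10, 12.
a, b = fit_line([0, 1, 2, 3, 4, 5], [1, 3, 5, 8, 10, 12])
print(round(a, 4), round(b, 4))  # 0.8571 2.2571
```

Because a is computed as $(\sum Y - b\sum X)/n$, the fitted line always passes through the point of means $(\bar{X}, \bar{Y})$.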
Also, after solving these normal equations we get the values of a and b as
$a = \bar{Y} - b\bar{X}$ and $b = \dfrac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}$.

Finally we get $\bar{Y} = a + b\bar{X}$. Hence the straight line Y = a + bX passes through the point $(\bar{X}, \bar{Y})$. Putting $a = \bar{Y} - b\bar{X}$ in Y = a + bX, we get
$Y - \bar{Y} = b(X - \bar{X})$,
which is known as the regression line of Y on X, where b is known as the regression coefficient. Similarly, we can find the regression line of X on Y.

Q.2. Find the regression line of y on x for the following data:

| x | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| y | 12 | 14 | 17 | 18 |

Solution: The equation of the regression line is
$y - \bar{y} = b_{yx}(x - \bar{x})$ ...(1)
where the regression coefficient is $b_{yx} = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$.

Calculation Table

| x | x² | y | xy |
|---|---|---|---|
| 1 | 1 | 12 | 12 |
| 2 | 4 | 14 | 28 |
| 3 | 9 | 17 | 51 |
| 4 | 16 | 18 | 72 |
| Total Σx = 10 | Σx² = 30 | Σy = 61 | Σxy = 163 |

For n = 4,
$b_{yx} = \dfrac{4 \times 163 - 10 \times 61}{4 \times 30 - (10)^2} = \dfrac{652 - 610}{120 - 100} = \dfrac{42}{20} = 2.1$

With $\bar{x} = 10/4 = 2.5$ and $\bar{y} = 61/4 = 15.25$, from (1) we get
y - 15.25 = 2.1(x - 2.5)
y - 15.25 = 2.1x - 5.25
y = 2.1x + 10
which is the required equation of regression of y on x.

PROBABILITY AND STANDARD DEVIATION

5.0 INTRODUCTION

The basic concept of probability forms the foundation of all decision making and statistical reasoning. It is a numerical measure of the likelihood of an event occurring. If S is the sample space of a random experiment and the event $E \subseteq S$, then the probability of the event is defined as
$P(E) = \dfrac{\text{Number of outcomes favourable to } E}{\text{Number of all possible outcomes}}$

i) Experiment: An operation in which we can get well defined outcomes is known as an experiment, e.g. if a coin is tossed, we may get a head or a tail, so tossing a coin is an experiment.

ii) Random experiment: If in each trial of an experiment conducted under identical conditions the outcome is not always the same, but may be any of the possible outcomes, then such an experiment is called a random experiment. It is denoted by (r.e.), e.g. in tossing a coin it is not certain whether a head or a tail will come up, so it is a random experiment.
iii) Sample space: The set of all possible outcomes of a random experiment is called the sample space and is denoted by S.
e.g. (1) Obtain the sample space for the experiment "two coins are tossed together". The possible outcomes of the experiment are HH, HT, TH, TT, so the sample space is S = {HH, TH, HT, TT}.
Remarks: If a coin is tossed n times, then the number of possible outcomes is $2^n$. If a die is rolled n times, then the number of sample points in the sample space is $6^n$.
e.g. (2) Find the number of sample points for the experiment "a die is rolled twice". Ans. The number of sample points is $n(S) = 6^2 = 36$.

iv) Event: Any subset of a sample space is called an event.

v) Exhaustive events: Events of which at least one must happen.

vi) Mutually exclusive events: Events for which the happening of one excludes the possibility of the happening of the others in the same trial, i.e. no two or more of the events can happen simultaneously.

5.2 PROBABILITY OF AN EVENT

Let S be the sample space and E be a subset of S representing an event. Then the probability of the event E is defined as
$P(E) = \dfrac{\text{Number of elementary events in } E}{\text{Number of elementary events in } S} = \dfrac{n(E)}{n(S)}$

Since $P(\varnothing) = 0$ and $P(S) = 1$, we have $0 \le P(E) \le 1$.

[...] n(E) = 3.
Probability of success in a single throw, $p = \dfrac{n(E)}{n(S)} = \dfrac{3}{6} = \dfrac{1}{2}$.
Probability of failure, $q = 1 - p = \dfrac{1}{2}$.
The die is thrown six times, so n = 6.

(i) $P(X = r) = {}^{n}C_r\, q^{n-r} p^{r}$, so $P(X = 5) = {}^{6}C_5 \left(\dfrac{1}{2}\right)^{1} \left(\dfrac{1}{2}\right)^{5} = \dfrac{6}{64} = \dfrac{3}{32}$

(ii) P(at least 5 successes) = P(X = 5) + P(X = 6) $= \dfrac{3}{32} + \dfrac{1}{64} = \dfrac{7}{64}$

(iii) P(at most 5 successes) = 1 - P(X = 6) $= 1 - \left(\dfrac{1}{2}\right)^{6} = \dfrac{63}{64}$

Example 4. A sample of 5 items is selected at random from a box containing 10 items of which two are defective. Find the possible outcomes of defective combinations of the 5 selected items along with the probability of each defective combination.

Solution. Given n = 5 and $p = \dfrac{2}{10} = \dfrac{1}{5}$, then $q = 1 - p = \dfrac{4}{5}$.

Binomial distribution: $P(X = r) = {}^{5}C_r \left(\dfrac{4}{5}\right)^{5-r} \left(\dfrac{1}{5}\right)^{r}$, r = 0, 1, ..., 5.

Example 5.
If 20% of the bolts produced by a machine are defective, find the probability that out of a sample of 12 bolts selected at random, not more than one bolt will be defective.

Solution. Given the probability of a defective bolt p = 20% = 0.2, so q = 0.8, and the total number of bolts n = 12. Using the binomial distribution, the probability of getting not more than one defective bolt is
$P(X = 0) + P(X = 1) = {}^{12}C_0\, q^{12} p^{0} + {}^{12}C_1\, q^{11} p^{1}$
$= (0.8)^{12} + 12 \times (0.8)^{11} \times (0.2)$
$= 0.0687 + 0.2062 = 0.2749$ (approximately 0.27)

Example 6. Five coins are thrown simultaneously. Find the chance of getting at least four heads.

Solution. Let p be the probability of getting a head. The probability of getting a head in a single throw is $p = \dfrac{1}{2}$. Then $q = 1 - p = \dfrac{1}{2}$ and n = 5.
P(at least four heads) $= P(X = 4) + P(X = 5) = {}^{5}C_4 \left(\dfrac{1}{2}\right)^{5} + {}^{5}C_5 \left(\dfrac{1}{2}\right)^{5} = \dfrac{5}{32} + \dfrac{1}{32} = \dfrac{3}{16}$

5.5 NORMAL DISTRIBUTION

The normal distribution applies to a continuous variable with mean $\mu$ and standard deviation $\sigma$ and is defined as
$P(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$
where
x = value of the continuous random variable, which varies from $-\infty$ to $\infty$
$\mu$ = mean value
$\sigma$ = standard deviation
e = 2.7183 and $\sqrt{2\pi} = 2.5066$

5.5.1 PROPERTIES OF NORMAL DISTRIBUTION

i) Mean, median and mode coincide in the normal distribution.
ii) The normal curve is bell-shaped.
iii) The points of inflexion of the normal curve are $x = \mu \pm \sigma$.
iv) Standard normal variate, $z = \dfrac{x - \mu}{\sigma}$.
v) Mean deviation $\approx \dfrac{4}{5}\sigma$.
vi) Distribution of area under the normal curve:
a) $(\mu \pm 1\sigma)$ covers 68.27% of the area
b) $(\mu \pm 2\sigma)$ covers 95.45% of the area
c) $(\mu \pm 3\sigma)$ covers 99.73% of the area

[...] $p \to 0$, then np = m = finite number.
(ii) The distribution has a single parameter m. If m is given, then we can find the complete distribution.

Conditions for Poisson distribution:
(i) The random variable should be discrete.
(ii) A dichotomy exists, i.e. the happening of the event must be of two alternatives such as success and failure etc.
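(iii) The distribution is applicable in those cases where the number of trials n is very large and the probability of success p is small, such that np = m (a finite number).
(iv) If $p \to 0$ then the distribution is J-shaped and unimodal.
(v) The distribution is applicable in those cases where the happening of an event does not affect the happening of the other events.

PROBABILITY DISTRIBUTION

If $n \to \infty$ and $p \to 0$ such that np = m (a finite number), then starting from the binomial distribution, the probability of r successes is given by
$P(X = r) = {}^{n}C_r\, q^{n-r} p^{r}$
$= \dfrac{n(n-1)(n-2)\cdots(n-r+1)}{r!} \left(1 - \dfrac{m}{n}\right)^{n-r} \left(\dfrac{m}{n}\right)^{r}$ [since q = 1 - p and np = m, so $p = \dfrac{m}{n}$]
$= \dfrac{m^{r}}{r!} \left(1 - \dfrac{1}{n}\right)\left(1 - \dfrac{2}{n}\right)\cdots\left(1 - \dfrac{r-1}{n}\right) \left(1 - \dfrac{m}{n}\right)^{n} \left(1 - \dfrac{m}{n}\right)^{-r}$

As $n \to \infty$, each factor $\left(1 - \dfrac{k}{n}\right) \to 1$, $\left(1 - \dfrac{m}{n}\right)^{-r} \to 1$ and $\left(1 - \dfrac{m}{n}\right)^{n} \to e^{-m}$.

Finally, we get
$P(X = r) = \dfrac{e^{-m} m^{r}}{r!}$, r = 0, 1, 2, ...
which is known as the probability of r successes for the Poisson distribution.

Note:
(i) The parameter m is called the parameter of the distribution.
(ii) [...]
(iii) The sum of all probabilities P(r) for r = 0, 1, 2, ... is 1, i.e. $\sum P(r) = 1$:
$P(0) + P(1) + P(2) + \cdots = e^{-m} + \dfrac{m e^{-m}}{1!} + \dfrac{m^{2} e^{-m}}{2!} + \cdots = e^{-m}\left(1 + m + \dfrac{m^{2}}{2!} + \cdots\right) = e^{-m} e^{m} = 1$

RECURRENCE FORMULA FOR POISSON DISTRIBUTION

Given that $P(X = r) = \dfrac{e^{-m} m^{r}}{r!}$, then $P(X = r + 1) = \dfrac{e^{-m} m^{r+1}}{(r+1)!}$, so
$\dfrac{P(r+1)}{P(r)} = \dfrac{m}{r+1}$
$\Rightarrow P(r+1) = \dfrac{m}{r+1}\, P(r)$, r = 0, 1, 2, ...
which is known as the recurrence formula for the Poisson distribution.

5.8 MEAN OF POISSON DISTRIBUTION

Given that $P(r) = \dfrac{e^{-m} m^{r}}{r!}$, r = 0, 1, 2, ...
Mean $(\mu) = \sum r\, P(r) = 0 \cdot P(0) + 1 \cdot P(1) + 2 \cdot P(2) + 3 \cdot P(3) + \cdots$
$= 0 + \dfrac{m e^{-m}}{1!} + \dfrac{2 m^{2} e^{-m}}{2!} + \dfrac{3 m^{3} e^{-m}}{3!} + \cdots$
$= m e^{-m}\left(1 + m + \dfrac{m^{2}}{2!} + \dfrac{m^{3}}{3!} + \cdots\right) = m e^{-m} e^{m} = m$
Hence, the mean of the Poisson distribution is m.

5.9 VARIANCE OF POISSON DISTRIBUTION

$\sum r^{2} P(r) = 0^{2} \cdot e^{-m} + \dfrac{1^{2}\, m e^{-m}}{1!} + \dfrac{2^{2}\, m^{2} e^{-m}}{2!} + \dfrac{3^{2}\, m^{3} e^{-m}}{3!} + \cdots$
$= m e^{-m}\left(1 + \dfrac{2m}{1!} + \dfrac{3m^{2}}{2!} + \dfrac{4m^{3}}{3!} + \cdots\right)$
$= m e^{-m}\left[\left(1 + m + \dfrac{m^{2}}{2!} + \cdots\right) + m\left(1 + m + \dfrac{m^{2}}{2!} + \cdots\right)\right]$
$= m e^{-m}\left[e^{m} + m e^{m}\right] = m + m^{2}$ ...(1)

Using equation (1), we get
Variance $(\sigma^{2}) = \sum r^{2} P(r) - \mu^{2} = (m + m^{2}) - m^{2} = m$

Note: Standard Deviation (s.d.)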
$= \sigma = \sqrt{\text{Variance}} = \sqrt{m}$

Q.1. The average number of road accidents on any day is 2.5. Find the probability that the number of accidents is: i) at least one, ii) at most one. (Given that $e^{-2.5} = 0.0821$)

Solution: The probability function of the Poisson distribution is $P(X = r) = \dfrac{e^{-m} m^{r}}{r!}$. Given that m = 2.5.

i) $P(X \ge 1) = 1 - P(X = 0) = 1 - e^{-2.5} = 1 - 0.0821 = 0.9179$
The probability that the number of accidents is at least one is 0.9179.

ii) $P(X \le 1) = P(X = 0) + P(X = 1) = e^{-2.5}(1 + 2.5) = 0.0821 \times 3.5 = 0.28735$
The probability that the number of accidents is at most one is 0.28735.

Q.2. If the variance of the Poisson distribution is 4, find the probabilities for r = 1, 2, 3 from the recurrence formula of the Poisson distribution.

Sol. Given that the variance $\sigma^{2} = m = 4$. The recurrence relation for the Poisson distribution is
$P(r+1) = \dfrac{m}{r+1}\, P(r)$ ...(1)
with $P(0) = e^{-4} = 0.01832$. Putting r = 0, 1, 2 in equation (1), we get
$P(1) = \dfrac{4}{1} P(0) = 4 \times 0.01832 \Rightarrow P(1) = 0.0733$
Similarly, $P(2) = \dfrac{4}{2} P(1) = 8 e^{-4} = 8 \times 0.01832 \Rightarrow P(2) = 0.1465$
$P(3) = \dfrac{4}{3} P(2) = \dfrac{32}{3} e^{-4} = \dfrac{32}{3} \times 0.01832 \Rightarrow P(3) = 0.1954$

Q.3. Using the Poisson distribution, obtain the approximate probability of getting 5 heads x times when five coins are tossed 320 times.

Sol. Probability of getting a head with one coin $= \dfrac{1}{2}$.
The probability of getting five heads with five coins $= \left(\dfrac{1}{2}\right)^{5} = \dfrac{1}{32}$.
Average number of times five heads appear in 320 tosses of five coins $= np = 320 \times \dfrac{1}{32} = 10$.
So, the mean of the Poisson distribution is m = 10. The approximate probability of getting five heads x times is
$P(x) = \dfrac{e^{-10}\, 10^{x}}{x!}$

Q.4. Fit a Binomial distribution to the following table and calculate the expected frequencies.

| X | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| f | 7 | 10 | 13 | 11 | 9 | 3 |

Sol.
The probability mass function of the Binomial distribution is
$P(x) = {}^{n}C_x\, p^{x} q^{n-x}$, x = 0, 1, 2, ..., n
where
n = number of trials
p = probability of success
q = probability of failure
$p = \dfrac{\bar{x}}{n}$ and p + q = 1
$\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i}$, with $f_i$ = frequencies, $x_i$ = values, $\sum f_i = N$ = total frequency.

The expected frequencies are f(x) = N·P(x), where $P(x) = {}^{n}C_x\, p^{x} q^{n-x}$ and x = 0, 1, 2, ..., n.

Table

| x_i | f_i | f_i x_i |
|---|---|---|
| 0 | 7 | 0 |
| 1 | 10 | 10 |
| 2 | 13 | 26 |
| 3 | 11 | 33 |
| 4 | 9 | 36 |
| 5 | 3 | 15 |
| Total | 53 | 120 |

$\bar{x} = \dfrac{120}{53} = 2.2642$, so $p = \dfrac{2.2642}{5} = 0.4528$ and q = 1 - 0.4528 = 0.5472.

The expected frequencies are
$f(x) = N\, {}^{n}C_x\, p^{x} q^{n-x} = 53 \cdot {}^{5}C_x\, (0.4528)^{x} (0.5472)^{5-x}$

Put x = 0, f(0) = 2.6 ≈ 3
x = 1, f(1) = 10.7581 ≈ 11
x = 2, f(2) = 17.8020 ≈ 18
x = 3, f(3) = 14.7271 ≈ 15
x = 4, f(4) = 6.09614 ≈ 6
x = 5, f(5) = 1.0088 ≈ 1

Hence the outcome is

| Frequency | 7 | 10 | 13 | 11 | 9 | 3 |
|---|---|---|---|---|---|---|
| Expected frequency | 3 | 11 | 18 | 15 | 6 | 1 |

Q.5. Fit a Poisson distribution to the following data and find the expected frequencies by the direct method.

| x | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| f | 118 | 213 | 128 | 37 | 18 | 3 | 1 |

Sol. Aim: To fit a Poisson distribution to the given data and find the expected frequencies by the direct method.

Procedure: The probability mass function of the Poisson distribution is given by
$P(x) = \dfrac{e^{-m} m^{x}}{x!}$, x = 0, 1, 2, ...
So the expected frequencies can be calculated as
$f(x) = N \cdot P(x) = N\, \dfrac{e^{-m} m^{x}}{x!}$
Put x = 0, 1, 2, 3, 4, 5, 6 to get f(0), f(1), f(2), f(3), f(4), f(5) and f(6).

Table

| x | f | fx |
|---|---|---|
| 0 | 118 | 0 |
| 1 | 213 | 213 |
| 2 | 128 | 256 |
| 3 | 37 | 111 |
| 4 | 18 | 72 |
| 5 | 3 | 15 |
| 6 | 1 | 6 |
| Total | 518 | 673 |

$m = \bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{673}{518} = 1.2992$ and $N = \sum f = 518$.

Therefore, the expected frequencies are
$f(x) = 518\, \dfrac{e^{-1.2992} (1.2992)^{x}}{x!}$

Put x = 0, f(0) = 141.2845 ≈ 141
x = 1, f(1) = 183.5568 ≈ 184
x = 2, f(2) = 119.2385 ≈ 119
x = 3, f(3) = 51.6382 ≈ 52
x = 4, f(4) = 16.7721 ≈ 17
x = 5, f(5) = 4.3581 ≈ 4
x = 6, f(6) = 0.9437 ≈ 1

Inference: The fitted Poisson distribution is given by
$P(x) = \dfrac{e^{-1.2992} (1.2992)^{x}}{x!}$, x = 0, 1, 2, ...

| Frequency | 118 | 213 | 128 | 37 | 18 | 3 | 1 |
|---|---|---|---|---|---|---|---|
| Expected frequency | 141 | 184 | 119 | 52 | 17 | 4 | 1 |

Q6.
Fit a Normal distribution for the following data and find the expected frequencies by using the ordinate method.

Solution. Aim: To fit a normal distribution for the given data and find the expected frequencies by using the ordinate method.

Procedure: The probability density function of the normal distribution is given by
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \bar{x}}{\sigma}\right)^{2}}$
where
A = assumed mid value of the $x_i$'s
C = class interval length
N = total frequency
$x_i$ = mid values of the class intervals
$f_i$ = frequencies

The following steps can be used for finding the expected frequencies by the ordinate method.
1. Find the mid values of the class intervals.
2. Find $z = \dfrac{x_i - \bar{x}}{\sigma}$, where the $x_i$'s are the mid values.
3. Find $\phi(z)$ from the table of normal ordinates.
4. For the expected frequency, $f(x) = \dfrac{NC}{\sigma}\, \phi(z)$.

Table for Calculations:

| Class interval | Mid value x_i | f | d = (x_i - A)/C | fd | fd² |
|---|---|---|---|---|---|
| 10-20 | (10+20)/2 = 15 | 7 | -2 | -14 | 28 |
| 20-30 | (20+30)/2 = 25 | 10 | -1 | -10 | 10 |
| 30-40 | (30+40)/2 = 35 = A | 14 | 0 | 0 | 0 |
| 40-50 | (40+50)/2 = 45 | 8 | 1 | 8 | 8 |
| 50-60 | (50+60)/2 = 55 | 3 | 2 | 6 | 12 |
| Total | | 42 | | -10 | 58 |

The mean $\bar{x}$ and the standard deviation $\sigma$ are obtained from these totals. The solution takes $\sigma = 11.185$, so that $\dfrac{NC}{\sigma} = \dfrac{42 \times 10}{11.185} = 37.5503$.

For finding the expected frequencies we can use the following table.

| x_i (mid value) | φ(z) | NC/σ | f(x) = φ(z)·NC/σ |
|---|---|---|---|
| 15 | 0.1219 | 37.5503 | 4.5774 ≈ 5 |
| 25 | 0.3251 | 37.5503 | 12.2076 ≈ 12 |
| 35 | 0.3867 | 37.5503 | 14.5207 ≈ 15 |
| 45 | 0.2059 | 37.5503 | 7.7316 ≈ 8 |
| 55 | 0.0498 | 37.5503 | 1.8700 ≈ 2 |

Inference: The fitted normal distribution is given by
$f(x) = \dfrac{1}{11.185\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \bar{x}}{11.185}\right)^{2}}$

| Frequency | 7 | 10 | 14 | 8 | 3 |
|---|---|---|---|---|---|
| Expected frequency | 5 | 12 | 15 | 8 | 2 |

6.0 IMPORTANT DEFINITION

Population: A population is a complete set of persons or objects that possess some common characteristic that is of interest to the researcher.

Sample and sample size: A finite subset of the individuals in a population is called a sample, and the total number of individuals in a sample is known as the sample size.

Sampling: Sampling is the process of selecting a portion of the population to represent the entire population.

Sample design: A sample design is a definite plan for obtaining a sample from a population.
It refers to the techniques or the procedure the researcher would adopt in selecting items for the sample.

Element: A single member of a population is called an element. Elements or members of a population are selected from a sampling frame, which is a listing of all elements of a population.

* A sampling frame is a list of all the people that are in the population.
* Sampling error refers to the difference between the value of a sample statistic (e.g. sample mean) and the true value of the population parameter (e.g. population mean).

6.1 SAMPLING CRITERIA

Figure: Sampling Criteria

[...] In this context one must remember that two costs are involved in a sampling analysis:
(i) The cost of collecting the data.
(ii) The cost of an incorrect result resulting from the data.

6.3 RESTRICTION OF SAMPLING CRITERIA

The researcher must keep in view the two causes of incorrect results:
(a) Systematic bias
(b) Sampling error

(a) Systematic bias: A systematic bias results from errors in the sampling procedure, and it cannot be reduced or eliminated by increasing the sample size. Systematic bias is the result of one or more of the following factors:

(i) Inappropriate sampling frame: A biased representation of the population is an inappropriate sampling frame. That is, sometimes the selected samples do not carry the characteristics of the population they are meant to represent. This results in a systematic bias.

(ii) Defective measuring device: If the measuring device is constantly in error, it will result in systematic bias. In survey work this can happen if the questionnaire or the interviewer is biased. Similarly, if the physical measuring device is defective, there will be systematic bias in the data collected through that device.

(iii) Non-respondents: If we are unable to sample all the individuals initially included in the sample, there may arise a systematic bias.
(iv) Indeterminacy principle: Sometimes we find that individuals act differently when kept under observation than they do when not observed. For instance, if workers are aware that somebody is observing them in the course of a work study, on the basis of which the average length of time to complete a task will be determined and the quota for piece work will be set accordingly, they generally tend to work slowly in comparison to the speed at which they work when unobserved. This may cause a systematic bias.

(v) Natural bias in the reporting of data: Natural bias of the respondents in the reporting of data is often the cause of systematic bias in many inquiries. There is usually a downward bias in the income data collected by the government taxation department, whereas we find an upward bias in the income data collected to determine the social status of an individual. People in general understate their income if asked about it for tax purposes, but they overstate it if asked for social status. In psychological surveys, people generally tend to give what they think is the correct answer rather than revealing their true feelings.

(b) Sampling error: It can be defined as the difference between the data obtained from a random sample and the data that would be obtained if the entire population were measured. Error may be contained in sample data even when the most careful random sampling procedure has been used in obtaining the sample. Sampling error is not under the researcher's control; it is caused by the chance variations that may occur when a sample is chosen to represent a population.

Sampling errors arise for the following reasons:
(i) Faulty sample: A defective method is used for selecting the sample.
(ii) Faulty demarcation of sampling units: This is significant in particular area surveys such as agricultural experiments in the field etc.
(iii) Substitution: If one unit is substituted for another because some difficulty arises in studying that particular unit, this leads to some bias in the sample.
(iv) Constant error: Due to a wrong choice of the statistic for estimating the population parameters.

NON-SAMPLING ERROR:

Non-sampling errors arise at the following stages:
(i) Observation
(ii) Ascertainment
(iii) Processing of data

Non-sampling errors can occur at every stage of the planning or execution of a census or sample survey. Some important factors which give rise to non-sampling error are:
(i) Faulty definition
(ii) Response error
(iii) Non-response bias
(iv) Errors in data collection
(v) Compiling error
(vi) Publication error

6.4 SAMPLE DESIGN

A sample design is a definite plan for obtaining a sample from a given population. It refers to the technique or the procedure the researcher would adopt in selecting items for the sample. The sample design may also lay down the number of items to be included in the sample, i.e. the size of the sample. The sample design is determined before data are collected. There are many sample designs from which a researcher can choose. The researcher must select or prepare a sample design which is reliable and appropriate for his/her research study.

The researcher must pay attention to the following points in developing a sample design:
(i) Size of the population under study.
(ii) Homogeneity and heterogeneity of the population.
(iii) Objective of the study.
(iv) Cost limit of the project.
(v) Time limitation of the project.
(vi) The parameters of interest in the research study.
(vii) Feasibility of the sample.
(viii) Research design, methodology and desired generalization also need to be considered.

6.5 CHARACTERISTICS OF A GOOD SAMPLING DESIGN:

(i) The sampling design must result in a truly representative sample.
(ii) The sample design must result in a small sampling error.
(iii) The design must be suitable in the context of the funds available for the research.
(iv) The sample design must be such that it controls systematic bias.
(v) The sample should be truly representative of the population so that the results of the study can be applied in general with a reasonable level of confidence.

6.6 LIMITATION OF SAMPLING

(i) Proper care must be taken in the planning and execution of the sample survey, otherwise the results obtained might be inaccurate and misleading.
(ii) Sampling requires trained and efficient personnel and sophisticated equipment for its planning, execution and analysis. In the absence of these, sampling is not trustworthy.
(iii) If information is wanted on each and every unit of the population, complete enumeration is required. In that case sampling is not an appropriate method.

6.7 TYPES OF SAMPLING

There are different types of sample design:
(a) Probability sampling
(b) Non-probability sampling
(c) Mixed sampling (Quota sampling)

(a.1) Probability sampling technique: Probability sampling is a method of sampling that utilises some form of random selection. The dictionary meaning of the term random is that something occurs haphazardly or without direction. Random sampling, however, is anything but haphazard. It is a very systematic scientific process. The investigator can specify the chance of any one element of the population being selected for the sample. Selections are independent of each other, and the investigator's bias does not enter the selection of the sample. When a random sample is selected, the researcher hopes that the variables of interest in the population will be present in the sample in approximately the same proportions as would be found in the total population. Unfortunately, there is never any guarantee that this will occur. Probability sampling allows the researcher to estimate the chances that any given population element will be included in the sample without bias.
(a.2) Simple random sampling (Lottery method): It is a type of probability sampling that ensures that each element of the population has an equal and independent chance of being chosen. This method is generally used in at least one phase of the other three types of random sampling procedures and will therefore be examined first. The word 'simple' does not mean easy or uncomplicated. In fact, simple random sampling can be quite complex and time consuming, especially if a large sample is desired.

The first step is to identify the accessible population and enumerate the list of all the elements of the population. This listing is called a sampling frame. After the sampling frame is developed, a method must be selected to choose the sample. Slips of paper representing each element in the population could be placed in a hat or a bowl and the sample selected by reaching in and drawing out as many slips of paper as the desired size of the sample.

Merits:
1. This method requires minimum knowledge of the population in advance, which is needed in the case of purposive sampling.
2. This method is free from classification error.
3. Sampling error can be easily computed and the accuracy of the estimate easily assessed.

Demerits:
1. This method does not make use of the knowledge about the population which the researcher may have.
2. From the point of view of field surveys, it has been claimed that cases selected by this sampling tend to be too widely dispersed and that the time and cost of collecting data become too large.
3. The use of simple random sampling necessitates a completely catalogued universe from which to draw the sample. But it is often difficult for the investigator to have up-to-date lists of all elements.

Procedure: Use a table of random numbers, a computer random number generator, or a mechanical device to select the sample.
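The computer random number generator mentioned in the procedure can be sketched with Python's standard `random` module. This is a minimal illustration, assuming the sampling frame is available as a list; the function name `simple_random_sample` and the frame of 100 numbered elements are hypothetical.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame, each equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    # random.Random.sample draws without replacement, like slips from a hat.
    return rng.sample(list(frame), n)


# Hypothetical sampling frame: elements numbered 1 to 100.
frame = list(range(1, 101))
sample = simple_random_sample(frame, 10, seed=42)
print(sorted(sample))
```

Drawing without replacement mirrors the lottery method: once a slip is taken out, the same element cannot be selected again.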
(a.3) Stratified random sampling: In stratified random sampling, the population is divided into subgroups, or strata, according to some variable or variables of importance to the research study. After the population is divided into two or more strata, a simple random sample is taken from each of these subgroups. There are many different characteristics of populations that may call for the use of stratified random sampling. Variables such as age, sex and educational background are examples of variables that might be used as criteria for dividing populations into subgroups.

Merits:
(i) Increases the probability of the sample being representative.
(ii) Assures an adequate number of cases for subgroups.

Demerits:
(i) Requires accurate knowledge of the population.
(ii) May be costly to prepare stratified lists.
(iii) Statistics are more complicated.

(a.4) Cluster random sampling: In large scale studies, where the population is geographically spread out, sampling procedures can be very difficult and time consuming. Also, it may be difficult or impossible to get a total listing of the population. During each phase of sampling from the clusters, either simple, stratified or systematic random sampling can be used.

Because the sample is selected from clusters in two or more separate stages, the approach is sometimes called multistage sampling. Although cluster sampling may be necessary for large scale survey studies, the likelihood of sampling error increases with each stage of sampling. A simple random sample is subject to a single sampling error, whereas a cluster sample is subject to as many sampling errors as there are stages in the sampling procedure. To compensate for the sampling error when cluster sampling is used, larger samples should be selected than would normally be chosen for a simple or stratified random sample.
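Single-stage cluster sampling as described above (pick whole clusters at random, then measure every unit inside them) can be sketched as follows. The representation of clusters as a dictionary, the function name `cluster_sample` and the ward names are assumptions made only for this illustration.

```python
import random


def cluster_sample(clusters, k, seed=None):
    """Randomly choose k clusters and keep all units of each chosen cluster.

    clusters: dict mapping a cluster name to the list of its units.
    """
    rng = random.Random(seed)
    # Sort the names so that the same seed always gives the same selection.
    chosen = rng.sample(sorted(clusters), k)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical population of four wards, each holding a few patient ids.
wards = {
    "ward_A": [1, 2, 3],
    "ward_B": [4, 5],
    "ward_C": [6, 7, 8, 9],
    "ward_D": [10, 11],
}
picked = cluster_sample(wards, 2, seed=7)
for name, units in picked.items():
    print(name, units)
```

Only the clusters are randomized here, so the final sample size depends on which clusters are drawn. That is one reason the text recommends selecting larger samples under cluster sampling.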
In cluster sampling, we use the following steps:
(i) Divide the population into clusters.
(ii) Randomly sample clusters.
(iii) Measure all the units within the sampled clusters.

Merits:
(i) Saves money and time.
(ii) Arrangements are made with a small number of sample units.
(iii) Characteristics of clusters as well as those of populations can be estimated.

Demerits:
(i) Larger sampling errors than other probability samples.
(ii) Requires assignment of each member of the population uniquely to a cluster.
(iii) Statistics are more complicated.

(a.5) Systematic random sampling: It involves selecting every k-th element of the population. The procedure follows these steps:
Step 1: Obtain a list of the total population (N).
Step 2: Decide the sample size (n).
Step 3: Determine the sampling interval width (K) by K = N/n.

This sampling method is the most controversial type of random sampling procedure. In fact, it may be classified as either a probability or a non-probability sampling method. Two criteria are necessary for a systematic sampling procedure to be classified as probability sampling:
(i) The listing of the population must be random with respect to the variable of interest.
(ii) The first element or member of the sample must be selected randomly.

Disadvantages:
(i) Samples may be biased if the ordering of the population is not random.
(ii) After the first sampling element is chosen, population members no longer have an equal chance of being chosen.

(a.6) Multistage sampling: Multistage sampling is a further development of the principle of cluster sampling. Suppose we want to investigate the working efficiency of the nationalised banks of India and we want to take a sample of a few banks. For this purpose, the first stage is to select large primary sampling units such as the states in a country. Then we select certain districts. This would represent a two stage sampling design with the ultimate sampling units being clusters of districts.
If instead of taking a census of all banks within the selected districts, we select certain towns and interview all banks in the chosen towns, this would represent a three stage sampling design. If instead of taking a census of all banks within the selected towns, we randomly sample banks from each selected town, then it is a case of using a four stage sampling plan. If we select randomly at all stages, we have what is known as a multistage random sampling design.

Advantages:
(i) It is easier to administer than most single stage designs, mainly because the sampling frame under multistage sampling is developed in partial units.
(ii) A large number of units can be sampled for a given cost under multistage sampling because of sequential clustering, whereas this is not possible in most of the simple designs.

(b) Non-probability sampling methods

The main difference between non-probability and probability sampling is that non-probability sampling does not involve random selection, while probability sampling does.

In non-probability sampling, the sample elements are chosen from the population by non-random methods. Non-random methods of sampling are more likely to produce a biased sample than random methods. The investigator cannot estimate the probability that each element of the population will be included in the sample. In fact, in non-probability sampling, certain elements of the population may have no chance of being included in the sample. This restricts the generalizations that can be made about the study findings. Despite the limitations of non-probability sampling, most medical research involves this type of sampling procedure. The most frequent reasons for the use of non-probability samples involve convenience and the desire to use available subjects.
Samples can be chosen from available groups of subjects by several different methods, including convenience, quota and purposive sampling. We can divide non-probability sampling methods into two types : convenience and purposive sampling.

(i) Convenience, Accidental or Haphazard sampling : In this method the researcher selects those sample units which appear convenient to him or her and to the management of the organisation conducting the research. Convenience sampling is also referred to as accidental or incidental sampling and involves choosing readily available people or objects for a study. These elements may or may not be typical of the population, and there is no accurate way to determine their representativeness. It is easy to see that this may be a very unreliable method of sampling. Convenience samples are chosen because of the savings in time and money. The researcher may choose a convenience sample from familiar people, as when a teacher uses students in her or his class, or from strangers, as when a medical researcher conducts a survey among family members in an intensive care unit waiting room to determine their attitudes about visiting hour limitations.

Another method of obtaining a convenience sample is 'snowball' sampling. This term describes a method of sampling that involves the assistance of study subjects to help obtain other potential subjects. Suppose the researcher wanted to determine how to help people to stop cigarette smoking. The researcher might know of someone who has been successful in refraining from cigarette smoking for ten years. This person is contacted and asked if he or she knows others who have also been successful. This type of networking is particularly helpful in finding people who are reluctant to make their identity known, such as substance abusers.

(ii) Purposive sampling : Purposive sampling is also called judgment sampling.
These terms indicate selection by design, by choice, not by chance. In purposive sampling a sample is chosen which is thought to be typical of the universe with regard to the characteristics under investigation. The researcher must know the characteristics of the universe beforehand in order to be able to recognise typical items in the universe.

Merits :
(i) It is simple to draw, and people often use it in exploratory investigations which precede a major survey.
(ii) It is less costly and involves less field work.
(iii) It is more representative of 'typical' conditions than the random sample if the size of the sample is small.

Demerits :
(i) It is not always reliable. The human mind has difficulty in recognizing typical items.
(ii) It requires from the researcher considerable knowledge about the population, which he usually does not possess.

(c) Mixed sampling : Quota sampling
Quota sampling is similar to stratified random sampling in that the first step involves dividing the population into homogeneous strata and selecting sample elements from each of these strata. The difference lies in the means of securing potential subjects from these strata. Stratified random sampling uses a random sampling method to obtain sample members, whereas quota sampling obtains members through convenience sampling.

The term 'quota' arises from the researcher's establishment of a desired quota or proportion for some population variable of interest. The basis of stratification should be a variable of importance to the study. Such variables frequently include subject attributes such as age, sex and educational background. The number of elements chosen from each stratum is generally in proportion to the size of that stratum in the total population.
For example, if the researcher wanted to determine whether more males or females receive yearly physical examinations, equal proportions of males and females should be approached for the study. If convenience sampling is used, the two sexes may not be equally represented in the sample. E.g. if a sample of 100 is desired, a quota of 50% males and 50% females would be set.

Merits :
(i) Quota sampling is less costly.
(ii) Quota sampling is administratively easy.
(iii) It is most suitable in a situation where field work has to be done quickly.

Demerits :
(i) In this type the investigator very often selects those respondents whom he knows; this is called the courtesy effect.
(ii) It may not provide a representative sample of respondents.
(iii) It is not possible to estimate sampling errors because quota sampling does not meet the basic requirement of randomness.

A hypothesis is an assumption that we make about a population parameter (which is not necessarily based on statistical data). Testing of hypothesis is a procedure by which we accept or reject a statistical hypothesis on the basis of a sample taken from the population.

In statistics, we are mainly concerned with the study of populations. The most important aspect of a population that is studied statistically is the form of its distribution or the value of the parameters involved in it. A statistical hypothesis is a statement or assertion about the value of the parameters of a population or the form of the distribution of the population.

Objectives of Hypothesis
i) Describe the basic concept of hypothesis testing.
ii) Identify and describe the test for a particular problem.
iii) Analyse the types of errors.

There are two types of statistical hypothesis for each situation :
a) Null Hypothesis
b) Alternative Hypothesis

a) Null Hypothesis
A null hypothesis states that there is no difference between the assumed value and the actual value of the parameter.
It also states that there is no relationship between the variables under study. It is denoted by $H_0$. If we want to test whether the Hb% of patients is 12 or not, we can frame the null hypothesis as
E.g. $H_0$ : the mean Hb% is 12.
In some cases, if we want to compare the means of two groups, we can frame the null hypothesis in the form
$H_0$ : There is no difference between the means of the two groups.
For example, $H_0$ : The IQ of boys and girls is the same.

b) Alternative Hypothesis
When we contradict the null hypothesis, some other statement is true (valid); that statement is called the alternative hypothesis. If the null hypothesis is rejected, we accept some other hypothesis which is true, and that is the alternative hypothesis. It is denoted by $H_1$ (or $H_a$).
For example, $H_1$ : The IQ of boys is greater than that of girls.

The alternative hypothesis can be written in any one of three forms. For example, for 'Hb% of patients is 12 or not' with $H_0 : \mu = 12$ (null hypothesis) :
(i) $H_1 : \mu < 12$ (alternative hypothesis). This alternative hypothesis is called a one tail or left tail alternative hypothesis.
Figure (i) : rejection region in the left tail (one tail); no change elsewhere.
(ii) $H_1 : \mu > 12$ (alternative hypothesis). This alternative hypothesis is called a one tail or right tail alternative hypothesis.
Figure (ii) : rejection region in the right tail (one tail); no change elsewhere.
(iii) $H_1 : \mu \neq 12$ (alternative hypothesis). When the prediction does not specify the direction, we say that the given hypothesis $H_1$ is a two tailed hypothesis.
Figure (iii) : rejection regions in both tails; no change in between.

During the testing of a hypothesis, it is necessary to select a suitable level of significance. The confidence with which a researcher rejects or accepts the null hypothesis depends on the significance level adopted. The probability level at which we reject the hypothesis is called the level of significance. It is denoted by $\alpha$.
The level of significance is the probability of rejecting the null hypothesis when it is true. If the level of significance of a test is $\alpha$ = 1%, it means that if we conduct the testing procedure on 100 different samples from the population, the null hypothesis will be rejected when it is true at most about once, and about 99 times we will accept the null hypothesis when it is true. Usually, the level of significance is fixed at 5%.

Table of Type I and Type II Error

                    Decision
                    Accept ($H_0$)         Reject ($H_0$)
$H_0$ (True)        Correct decision       Type I error ($\alpha$-error)
$H_0$ (False)       Type II error          Correct decision

Type I Error
While conducting a statistical testing procedure, we select a part of the population and make the decision on that basis. So there is a chance of making an error in testing the hypothesis. There are two possible wrong decisions, one when rejecting and one when accepting the null hypothesis.
"The error which arises when we reject the null hypothesis when it is true is called a type I error."
For example : Suppose the average salary of employees is Rs. 7500/- per month, but we select a sample of 40 employees whose average salary is Rs. 7050/- per month, and on that basis we reject the null hypothesis $H_0$ : The mean salary of staff is Rs. 7500/-. Then there is an error in the decision. Errors of this kind, which arise when $H_0$ is rejected although it is true, are called type I errors.

Type II Error
The error which arises when we accept the hypothesis $H_0$ when it is false is called a type II error.
For example : From a box of 100 packs of medicines, we randomly select 5 packs, which are in good condition, but in the entire box there are many defective packs. If we accept the box, i.e. accept the null hypothesis $H_0$ : there is no defective piece in the box, we make an error in the decision. This is called a type II error.

6.12 POWER OF TEST
The power of a test is the probability of rejecting the null hypothesis when it is false. It is equal to (1 - probability of type II error), i.e.
Power of test = $1 - \beta$
where $\beta$ = probability of type II error.

TEST STATISTIC
A suitable statistic used for a testing procedure is known as a test statistic, for example mean, median and mode.

CRITICAL REGION AND ACCEPTING REGION
The range of values of the test statistic calculated on the basis of a sample can be divided into two parts : the critical region (rejection region) and the acceptance region.
(i) Critical Region : When the value of the test statistic falls in the critical region, we reject the null hypothesis.
(ii) Acceptance Region : When the value of the test statistic falls in the acceptance region, we accept the null hypothesis. A value in the acceptance region says that the difference was probably due to chance and the null hypothesis should not be rejected.

Figure 1. Acceptance and rejection regions for a two tailed test at the 5% level of significance. The acceptance region (accept $H_0$ if the sample mean $\bar{x}$ falls in this region) covers 0.475 of the area on each side of $\mu_0$, between z = -1.96 and z = +1.96; each rejection region covers 0.025 of the area. Reject $H_0$ if the sample mean ($\bar{x}$) falls in either of the two rejection regions.

Figure 2. Acceptance and rejection regions in case of a one tailed test (left tail) at the 5% level of significance. The rejection region covers 0.05 of the area to the left of z = -1.645; the acceptance region covers the remaining 0.95 of the area. Reject $H_0$ if the sample mean ($\bar{x}$) falls in the rejection region.

Figure 3. Acceptance and rejection regions in case of a one tailed test (right tail) at the 5% level of significance. The rejection region covers 0.05 of the area to the right of z = +1.645; the acceptance region covers the remaining 0.95 (95%) of the area. Reject $H_0$ if the sample mean ($\bar{x}$) falls in the darkly marked rejection region.
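The tabled values 1.96 and 1.645 quoted in the figures, and the accept/reject rule based on the critical region, can be reproduced with Python's standard library. The following is a minimal sketch, not part of the original notes; the function names `critical_values` and `decide` are illustrative assumptions.

```python
from statistics import NormalDist

std_normal = NormalDist()           # standard normal: mean 0, S.D. 1

def critical_values(alpha, tail):
    """Return the boundary z value(s) of the rejection region."""
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    raise ValueError("tail must be 'two', 'left' or 'right'")

def decide(z, alpha=0.05, tail="two"):
    """Apply the critical-region rule to a calculated z value."""
    cv = critical_values(alpha, tail)
    if tail == "two":
        reject = z < cv[0] or z > cv[1]
    elif tail == "left":
        reject = z < cv[0]
    else:
        reject = z > cv[0]
    return "reject H0" if reject else "accept H0"

print(round(critical_values(0.05, "two")[1], 2))    # about 1.96
print(round(critical_values(0.05, "right")[0], 3))  # about 1.645
print(decide(1.2))                                  # inside the acceptance region
print(decide(-1.7, tail="left"))                    # beyond -1.645 in the left tail
```

The same calculated value can lead to different decisions for one tailed and two tailed tests, which is why the form of the alternative hypothesis has to be fixed before the test is applied.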
STANDARD ERROR
The standard deviation of the sampling distribution of a statistic is known as the standard error. It is denoted by S.E.
(i) For the mean of one group it is defined as
Standard error of mean ($\bar{x}$) = Standard deviation / $\sqrt{\text{No. of items}}$, or
$$S.E.(\bar{x}) = \frac{\sigma}{\sqrt{n}}$$
where $\sigma$ = S.D. and n = sample size.
(ii) In case of two groups, the standard error is defined as
$$S.E.(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

6.16 TESTING OF HYPOTHESIS PROCEDURES
There are two types of testing problems :
a) Large sample test, for which sample size n > 30
b) Small sample test, for which sample size n < 30

Test Diagram :
Testing of Hypothesis
    (a) Large sample test (n > 30)
        Test of significance for one group
        Test of significance for two groups
    (b) Small sample test (n < 30)
        Test of significance when S.D. is known
        Test of significance when S.D. is unknown
        Test of significance by using student 't' test
            Unpaired 't' test
            Paired 't' test

(a) Large Sample Test : Test for large samples, to test the significance of the mean for sample size (n) greater than 30, i.e. n > 30. We have two further types of testing of hypothesis :
i) Testing of the significance of mean of population for one group.
ii) Testing of the significance of mean of population for two groups.

i) Testing of the significance of mean of population for one group :
Test Procedure :
Step 1 : Define the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$).
Step 2 : Check the level of significance, which gives you a tabulated normal or t-value.
Step 3 : Apply the test and calculate the required values.
Step 4 : Display the result. If the calculated value of |z| is less than the tabulated value, we accept the null hypothesis, otherwise we reject $H_0$.

Mathematically, if we want to test whether the mean of the population is equal to a specified value ($\mu_0$), then
$H_0 : \mu = \mu_0$ (specified value)
Against it, we frame the alternative hypothesis
$H_1 : \mu \neq \mu_0$ (two tail test)
For a large sample, we use the z-test statistic :
$$z = \frac{(\bar{x} - \mu_0)\sqrt{n}}{\sigma}$$
where
$\bar{x}$ = Sample mean
n = Sample size (n > 30)
$\sigma$ = Standard deviation
$\mu_0$ = Specified value
Note : The tabled value of z at the 5% level of significance is 1.96.

Example 1 : The average salary of a principal of a private sector college is said to be Rs. 45000 per month. A sample of 50 principals has a mean salary of Rs. 48326, and the S.D. of the population is Rs. 1523. At $\alpha$ = 0.05, can we claim that principals earn Rs. 45000/- per month?
Solution. Here, we frame the null hypothesis
$H_0 : \mu = 45000$
and the alternative hypothesis
$H_1 : \mu \neq 45000$
Since the sample size n = 50 (> 30), we use
$$z = \frac{(\bar{x} - \mu_0)\sqrt{n}}{\sigma}$$
where $\bar{x}$ = sample mean = 48326, $\mu_0$ = specified mean = 45000, n = 50 and S.D. $\sigma$ = 1523.
$$z = \frac{(48326 - 45000)\sqrt{50}}{1523} = \frac{3326 \times 7.07}{1523} = 15.44$$
Since the calculated value of z (15.44) is greater than the tabled value (1.96), we reject the null hypothesis at the 5% level of significance and conclude that the salary is not equal to Rs. 45000/- per month.

Example 2. A sample of 100 B.Sc. students is taken from a medical college, with heights having standard deviation S.D. = 10 cm. The mean height of the sample of students is found to be 168.8 cm. Can we accept that the mean height of students is 170 cm?
Solution. We have the null hypothesis
$H_0 : \mu = 170$ cm
and define the alternative hypothesis against $H_0$ as
$H_1 : \mu \neq 170$ cm
Given a large sample size, n = 100 > 30, we use the test statistic with $\bar{x}$ = 168.8, n = 100, $\mu_0$ = 170, $\sigma$ = 10 :
$$z = \frac{(\bar{x} - \mu_0)\sqrt{n}}{\sigma} = \frac{(168.8 - 170)\sqrt{100}}{10} = -1.2$$
or |z| = 1.2
The tabulated value of z at the 5% level of significance is 1.96. Here the calculated value |z| = 1.2 is less than the tabled value, so the null hypothesis is accepted. We can conclude that the mean height of students is 170 cm.

ii) Testing of the significance of mean of population for two groups :
Here we want to test whether the means of two groups are the same or not. Mathematically,
$H_0$ : Mean of two groups is same
i.e.
$H_0 : \mu_1 = \mu_2$
Against it, we frame the alternative hypothesis
$H_1 : \mu_1 \neq \mu_2$
For large samples, we use the z test statistic
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
where
$$\sigma_1 = \sqrt{\frac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1}} \quad \text{and} \quad \sigma_2 = \sqrt{\frac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1}}$$
Here the mean and S.D. of the $n_1$ samples of the first group are $\bar{x}_1$ and $\sigma_1$, and the mean and S.D. of the $n_2$ samples of the other group are $\bar{x}_2$ and $\sigma_2$. Also, $n_1$ and $n_2$ are both greater than 30.
Note : If the calculated value |z| is greater than the tabled value, we reject the null hypothesis at the 5% level of significance.

Example 3. In the B.Sc. examination, 50 boys in a college scored a mean percentage of 75 with S.D. 9, and 60 girls in the same examination scored a mean percentage of 79 with S.D. 7. Is there any significant difference between the performance of boys and girls in the exam? ($\alpha$ = 0.05)
Solution. For testing, define the null hypothesis as
$H_0$ : Performance of boys and girls is the same, i.e. $H_0 : \mu_1 = \mu_2$
Against it, the alternative hypothesis is
$H_1 : \mu_1 \neq \mu_2$
For large samples, we use the test statistic with $\bar{x}_1$ = 75, $\bar{x}_2$ = 79, $\sigma_1$ = 9, $\sigma_2$ = 7 and $n_1$ = 50, $n_2$ = 60 :
$$z = \frac{75 - 79}{\sqrt{\dfrac{(9)^2}{50} + \dfrac{(7)^2}{60}}} = \frac{-4}{\sqrt{\dfrac{81}{50} + \dfrac{49}{60}}} = \frac{-4}{1.56} = -2.56$$
so |z| = 2.56.
Since the calculated value of |z| is greater than the tabled value (1.96) at the 5% level of significance, we reject the null hypothesis that the performance of boys and girls is the same.

Example 4. Random samples drawn from two states give the following data relating to the height of adult males :

Sample    Mean (height)    S.D.    Sample size
1         67.25            2.5     60
2         67.85            2.58    80

Is there any significant difference between the mean heights?
Solution.
$H_0$ : No difference between mean heights, i.e. $H_0 : \mu_1 = \mu_2$
Against it, the alternative hypothesis is
$H_1 : \mu_1 \neq \mu_2$
For large samples, the test statistic is computed with $\bar{x}_1$ = 67.25, $\bar{x}_2$ = 67.85, $\sigma_1$ = 2.5, $\sigma_2$ = 2.58 and $n_1$ = 60, $n_2$ = 80 :
$$z = \frac{67.25 - 67.85}{\sqrt{\dfrac{(2.5)^2}{60} + \dfrac{(2.58)^2}{80}}} = \frac{-0.6}{\sqrt{\dfrac{6.25}{60} + \dfrac{6.66}{80}}} = \frac{-0.6}{\sqrt{0.1874}} = \frac{-0.6}{0.433} = -1.39$$
so |z| = 1.39.
Since the calculated value of |z| (= 1.39) is less than the tabled value (1.96), the null hypothesis is accepted, i.e. there is no significant difference between the mean heights.
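The large sample z statistics for one group and for two groups can be checked numerically. The sketch below is an illustration only; the function names `z_one_sample` and `z_two_sample` are assumptions, and the inputs are the figures given in Examples 1 to 3.

```python
from math import sqrt

def z_one_sample(xbar, mu0, sigma, n):
    """z = (xbar - mu0) * sqrt(n) / sigma for a single group."""
    return (xbar - mu0) * sqrt(n) / sigma

def z_two_sample(xbar1, xbar2, sigma1, sigma2, n1, n2):
    """z = (xbar1 - xbar2) / sqrt(sigma1^2/n1 + sigma2^2/n2) for two groups."""
    return (xbar1 - xbar2) / sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

z1 = z_one_sample(48326, 45000, 1523, 50)     # Example 1: salaries
z2 = z_one_sample(168.8, 170, 10, 100)        # Example 2: heights
z3 = z_two_sample(75, 79, 9, 7, 50, 60)       # Example 3: boys vs girls

for label, z in (("Example 1", z1), ("Example 2", z2), ("Example 3", z3)):
    verdict = "reject H0" if abs(z) > 1.96 else "accept H0"
    print(label, round(z, 2), verdict)
```

Note that the denominator of the two group statistic is the square root of the summed variance terms; omitting the square root inflates |z| and can reverse the decision.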
(b) Small Sample Test : To test the significance of the mean of a population for a small sample (n < 30), we have the following three tests :
i) When standard deviation is known (for single and independent groups)
ii) When standard deviation is unknown (for single group)
iii) Using student 't'-test.

i(a) When standard deviation is known (for a single group) :
The test statistic is used for testing the null hypothesis for a single group :
$H_0 : \mu = \mu_0$ (Null hypothesis)
$H_1 : \mu \neq \mu_0$ (Alternative hypothesis)
Then the test statistic is
$$z = \frac{(\bar{x} - \mu_0)\sqrt{n}}{\sigma}$$
where
$\bar{x}$ = Sample mean
$\mu_0$ = Specified value
n = Sample size
$\sigma$ = Standard deviation

i(b) When standard deviation is known (for two individual groups) :
The test statistic is used for testing the null hypothesis for two individual groups (where $n_1$, $n_2$ < 30).
$H_0 : \mu_1 = \mu_2$ (Null hypothesis)
$H_1 : \mu_1 \neq \mu_2$ (Alternative hypothesis)
Then the test statistic is
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
where
$\bar{x}_1$ = Sample mean of first group
$\bar{x}_2$ = Sample mean of second group
$\sigma_1$ = S.D. of first group
$\sigma_2$ = S.D. of second group
$n_1$ = Sample size of first group
$n_2$ = Sample size of second group

Example 5. Fifteen samples with a mean Hb% of 14.1 are taken from a population. The population standard deviation of Hb% is 0.5. Can it be concluded that these samples are taken from a normal population with mean Hb% 14?
Solution. To use the test statistic, define $H_0$ and $H_1$ as
$H_0$ : The mean of the sample is 14%, i.e. $H_0 : \mu = 14$ (Null hypothesis)
$H_1 : \mu \neq 14$ (Alternative hypothesis)
We have $\bar{x}$ = 14.1, n = 15, $\mu_0$ = 14, $\sigma$ = 0.5, so
$$z = \frac{(\bar{x} - \mu_0)\sqrt{n}}{\sigma} = \frac{(14.1 - 14)\sqrt{15}}{0.5} = 0.77$$
Since the calculated value (0.77) of z is less than the tabled value (1.96) at the 5% level of significance, the hypothesis $H_0$ : "The mean of the sample is 14%" is accepted.
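Example 5 can also be read in terms of the standard error and the acceptance region: the sample mean is accepted if it lies within 1.96 standard errors of the specified value. The sketch below is a hedged illustration of that reading; the function names `standard_error` and `acceptance_limits` are assumptions, not notation from the notes.

```python
from math import sqrt

def standard_error(sigma, n):
    """S.E. of the mean = sigma / sqrt(n)."""
    return sigma / sqrt(n)

def acceptance_limits(mu0, sigma, n, z_table=1.96):
    """Limits for the sample mean inside which H0 is accepted (two tailed)."""
    se = standard_error(sigma, n)
    return mu0 - z_table * se, mu0 + z_table * se

# Figures from Example 5: mean Hb% 14.1, specified value 14, S.D. 0.5, n = 15
xbar, mu0, sigma, n = 14.1, 14, 0.5, 15
se = standard_error(sigma, n)
z = (xbar - mu0) / se               # same as (xbar - mu0) * sqrt(n) / sigma
low, high = acceptance_limits(mu0, sigma, n)

print(round(se, 3), round(z, 2))    # 0.129 0.77
print(round(low, 2), round(high, 2))  # 13.75 14.25
```

The calculated z of about 0.77 agrees with the worked solution up to rounding, and the sample mean 14.1 lies between the two limits, which is the same decision expressed on the scale of the data.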
(ii) When standard deviation is unknown :
In bio-statistics, we compute the sample standard deviation for a small sample (n < 30) by using the following formula
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
We then test whether the mean is equal to a specified value ($\mu_0$) in a small sample when the population S.D. ($\sigma$) is unknown. Here, we frame the null hypothesis
$H_0 : \mu = \mu_0$
and
$H_1 : \mu \neq \mu_0$ (alternative hypothesis)
For a small sample, we use the test statistic
$$z = \frac{(\bar{x} - \mu_0)\sqrt{n}}{S}$$
where
$\bar{x}$ = Sample mean
$\mu_0$ = Specified value
n = Sample size
S = Calculated S.D.
n - 1 = Degrees of freedom (d.f.)

Example 6. The following observations are the pulse rates of five patients taken from a normal population :
62, 64, 60, 78, 68
Test whether the mean of the distribution is 70 or not. (Given $\alpha$ = 5%)
Solution. To test, frame the null hypothesis
$H_0$ : Mean of population is equal to 70, i.e. $H_0 : \mu = 70$ (Null hypothesis)
$H_1 : \mu \neq 70$ (alternative hypothesis)

x        $x - \bar{x}$     $(x - \bar{x})^2$
62       -4.4              19.36
64       -2.4              5.76
60       -6.4              40.96
78       11.6              134.56
68       1.6               2.56
Total 332                  $\sum (x - \bar{x})^2$ = 203.20

$$\bar{x} = \frac{332}{5} = 66.4, \qquad S = \sqrt{\frac{203.20}{4}} = 7.13$$
$$z = \frac{(\bar{x} - \mu_0)\sqrt{n}}{S} = \frac{(66.4 - 70)\sqrt{5}}{7.13} = \frac{-3.6 \times 2.236}{7.13} = \frac{-8.05}{7.13} = -1.13$$
or |z| = 1.13
Since the calculated value (= 1.13) of |z| is less than the tabled value (1.96), the null hypothesis is accepted, i.e. the mean of the population is equal to 70.

(iii) Test of significance of difference between means of small samples (n < 30) by student's 't' test :
For small samples, the t-test is applied instead of the z-test. It was designed by W. S. Gosset, whose pen name was 'Student'. Hence this test is also called the "student 't'-test". Gosset showed that for small samples the ratio follows a different distribution, called the t-distribution. This t corresponds to z in large samples, but the probability of occurrence (p) of the calculated value is determined by reference to the 't-table'. If the calculated value under the t-test is greater than the tabled value at the 5% level of significance,
then the null hypothesis ($H_0$) is rejected and the alternative hypothesis is accepted.

Properties of 't' distribution :
(i) It is a continuous probability distribution.
(ii) Its value lies between $-\infty$ and $+\infty$.
(iii) It approaches the normal distribution as $n \to \infty$.
(iv) It is symmetrical about the mean.

Uses of 't' test :
(i) It is used to test for a specified value.
(ii) It is used to test the difference between values for independent samples.
(iii) It is used as a paired 't'-test for dependent samples.
(iv) It is used to construct confidence intervals for the estimates.

Application of t-test : It is applied to find the significance of the difference between two means :
(a) Unpaired t-test
(b) Paired t-test

(a) Unpaired t-test : Test for significance of the means of two independent groups with small sample size (n < 30) when the population standard deviations ($\sigma_1$ and $\sigma_2$) are unknown.
The t-test for testing the null hypothesis is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad S = \sqrt{\frac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}}$$
with d.f. $(n_1 - 1) + (n_2 - 1) = (n_1 + n_2 - 2)$.
For a single sample tested against a specified value $\mu$, the 't' statistic is defined as
$$t = \frac{(\bar{x} - \mu)\sqrt{n}}{S}, \qquad S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
where $\bar{x}$ = sample mean and S = standard deviation, with d.f. (n - 1).
Note : If $\sigma_1^2 \neq \sigma_2^2$, i.e. the two population variances are not the same, then they are estimated separately.

Example 7. A random sample of 8 boys of a team has the following weights (kg) :
50, 56, 70, 62, 65, 55, 62, 68
Test at the 5% level of significance whether the average weight can be taken as 60 kg.
Solution. Construct the calculation table, taking the assumed mean A = 62 and n = 8 :

x        d = x - A     $d^2$
50       -12           144
56       -6            36
70       8             64
62       0             0
65       3             9
55       -7            49
62       0             0
68       6             36
Total    $\sum d$ = -8      $\sum d^2$ = 338

$$\bar{x} = A + \frac{\sum d}{n} = 62 + \frac{-8}{8} = 61$$
$$S = \sqrt{\frac{1}{n - 1}\left[\sum d^2 - \frac{(\sum d)^2}{n}\right]} = \sqrt{\frac{1}{7}\left[338 - \frac{64}{8}\right]} = \sqrt{\frac{330}{7}} = 6.87$$
Here, the 't-statistic' follows these steps :
Step 1 - $H_0 : \mu = 60$ (null hypothesis)
$H_1 : \mu \neq 60$ (alternative hypothesis)
Step 2 - Level of significance = 5%
Degrees of freedom = (n - 1) = (8 - 1) = 7
Tabled value = 1.895
Step 3 - Use the t-statistic
$$t = \frac{(\bar{x} - \mu)\sqrt{n}}{S} = \frac{(61 - 60)\sqrt{8}}{6.87} = \frac{2.83}{6.87} = 0.41$$
Step 4 - Result : Since the calculated value (0.41) of t is less than the tabled value (1.895), the null hypothesis is accepted, i.e. the average weight of the boys can be taken as 60 kg.

Example 8. A group of 5 patients treated with medicine A have weights 42, 32, 48, [illegible], 41 kg. A second group of 6 patients from the same hospital, treated with medicine B, have weights 38, 42, 56, 64, 67 and 69 kg. Do you agree with the claim that medicine B increases weight significantly?
Solution. Given sample size of group one $n_1$ = 5 and sample size of the second group $n_2$ = 6. Since the standard deviations are unknown, we use the t-test. First we frame the null hypothesis
$H_0$ : The effects of medicines A and B on patients are the same, i.e. $H_0 : \mu_1 = \mu_2$ (null hypothesis)
$H_1 : \mu_1 \neq \mu_2$ ($\mu_1 < \mu_2$), i.e. one tail test
$H_1$ : Medicine B increases weight more than A.
Construct the table of means for medicines A and B and compute the standard deviation (the working for this step is not legible in this copy). Now, using the t-test with d.f. = $n_1 + n_2 - 2$ = 9,
t = -1.1849, or |t| = 1.1849
Since the calculated value (1.185) of |t| is less than the tabled value 1.833 at the 5% level of significance, the null hypothesis is accepted, i.e. the claim that medicine B increases weight significantly is not supported.

(b) Paired 't'-test
If we want to compare two sets of values of the same sample taken in two different time periods, then we use the paired t-test. Let $x_i$ and $y_i$ be the pre test and post test values, and let $d_i = y_i - x_i$.
For a small sample, we frame the null hypothesis
$H_0$ : No difference between pre test and post test values.
Use the t statistic
$$t = \frac{\bar{d}\sqrt{n}}{S.D.}, \qquad \bar{d} = \frac{\sum d}{n}, \qquad S.D. = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}}$$
Note : If the calculated value of t is less than the tabled value of t at the 5% level of significance for (n - 1) degrees of freedom, we accept the null hypothesis, otherwise we reject the null hypothesis.

Example 9 : An IQ test was administered to 5 students. The results are as follows :

Candidate    IQ before training    IQ after training
(the values of this table are not legible in this copy)

Test whether there is any change in the IQ after the training program.
Solution. We design a study before and after training. The values are taken from the same sample in two different time periods. To check the hypothesis, frame the null hypothesis $H_0$ as
$H_0$ : No difference in IQ before and after training, or
$H_0$ : Training program is not effective (Null hypothesis)
$H_1$ : There is a difference in IQ after the training program (alternative hypothesis)
Now, use the t test with $\bar{d} = \sum d / n = 2$, n = 5 and S.D. = $\sqrt{30}$ = 5.48 :
$$t = \frac{\bar{d}\sqrt{n}}{S.D.} = \frac{2 \times \sqrt{5}}{\sqrt{30}} = \frac{2 \times 2.24}{5.48} = 0.817$$
Since the calculated value (0.817) of t is less than the tabled value (2.132), we accept $H_0$, i.e. there is no difference in IQ before and after training.

EXERCISE 2.1
SHORT ANSWER QUESTIONS
Q.1. Define the probability of an event.
Ans. Probability of an event : Let S be the sample space and E be a subset of S representing an event. Then the probability of the event is defined as
P(E) = (Number of elementary events in E) / (Number of elementary events in S)
$$P(E) = \frac{n(E)}{n(S)}$$
Q.2. What do you understand by random experiment?
Ans. Random experiment : If, in a trial of an experiment conducted under identical conditions, the outcome is not always the same but may be any of the possible outcomes, then such an experiment is called a random experiment. It is denoted by re.
Q.3. Explain the following terms :
a) Binomial distribution
Hint : See article 5.5
b) Normal distribution
Hint : See article 5.6

MULTIPLE CHOICE QUESTIONS
1. The probability can be denoted by -
a) P b) q c) E d) None of these
2.
What is the probability of getting an even number when a die is tossed?
a) 1 b) 1/2 c) 3/8 d) 1/6
3. What is the probability of getting more than 3, if a die is tossed?
a) 1 b) 1/2 c) 1/3 d) 2/3
4. Which of the following are examples of a random experiment?
a) Rolling a die b) Tossing of coins c) Both (a) & (b) d) None of these
5. A die is rolled. Find the probability of an even number -
a) 2 b) 1 c) 1/3 d) 1/2
6. The probability of getting head, if tossing a coin twice, is -
a) [illegible] b) 0.50 c) 0.25 d) 0.125
7. Best method of variability is -
a) Mean b) Median c) Mode d) Standard deviation
8. Standard error is a measure of -
a) Sampling error b) Instrumental error c) Observer error d) None of these
9. True about the normal distribution curve -
a) Mean, median and mode all coincide b) Standard deviation equal to [illegible] c) Mean equal to twice of median d) None of these

ANSWERS (MULTIPLE CHOICE QUESTIONS)
1. a) 2. b) 3. b) (the remaining answers are not legible in this copy)

NUMERICAL QUESTIONS
Q. A bag contains 8 red balls and 5 black balls. Find the probability of getting [illegible].
Q. What is the probability of finding an odd number?
Sol. P = 1/2
Q. A problem in statistics is given to [illegible]. Obtain the probability that the problem will be solved.
Q. The average weight of a baby at birth is 3.05 kg, with a standard deviation of 0.39 kg. If the birth weights are normally distributed, would you regard
i) a weight of 4 kg as abnormal?
ii) a weight of 2.5 kg as normal?
Sol. i) Abnormal in more than 95% of cases. ii) Normal limits are (2.29, 3.81).
Q. Calculate the probability that in a tossing of coins three times, there will appear -
a) Three tails [remaining parts illegible]
Q. [Illegible] getting an even number is a success, what is the probability of [illegible].
Q. [Illegible] the mean deviation from the mean [illegible] standard deviation.
Q. x is normally distributed with mean 4 and variance 9. Find [illegible].

EXERCISE 2.2
SHORT ANSWER QUESTIONS
Q.1. [Illegible] population is called a sample.
Q.2. [Illegible] the characteristics of sample design.
Ans. [Partly illegible] The sample design must be suitable in the context of the funds available for the research study. It should control systematic bias. The sample should be such that the results of the study can be applied in general to the population with a reasonable level of confidence.
Q.3. Write a short note on :
a) Non-probability sampling
b) Probability sampling
Hint : See 6.7
Q.4. Discuss the types of sampling in detail.
Hint : See 6.7
Q.5. Explain the sampling criteria and their procedure.
Hint : See 6.1 and 6.2

MULTIPLE CHOICE QUESTIONS
1. Which of the following techniques yields a random sample?
a) Choosing schools randomly and then sampling everyone within the school
b) Numbering all the elements of a sample design and using a random number table to pick cases from the [illegible]
c) Choosing a proportion from within each ethnic group at random
2. Which of the following statements are true?
a) The larger the sample size, the greater the sampling error
b) As sample size decreases, so does the size of the confidence interval
c) The more categories you want to make in your data analysis, the larger the sample needed
d) None of these
3. Sampling in qualitative research is similar to which type of sampling in quantitative research?
a) Simple random sampling b) Systematic sampling c) Purposive sampling d) All the above
4. How often does the census bureau take a complete population count?
a) Every year b) Every 5th year c) Every 7th year d) Every 10th year
5. For a survey, a village is divided into five lanes, then each lane is sampled randomly. It is an example of :
a) Simple random sampling b) Systematic random sampling c) Multiphase random sampling d) Stratified random sampling
6. In random sampling, the chance of being selected is -
a) Not same & not known b) Same and known c) Same and not known d) Not same but known
7. True about simple random sampling is -
a) All units have unequal chance to be selected
b) All units have equal chance to be selected
c) Every fixed unit is taken for selection
d) None of these
8. Which is true of cluster sampling -
a) Every nth case is chosen for study
b) A stratification of population is done
c) A stratification of population is not done
d) A natural group is taken as sampling unit

ANSWERS (MULTIPLE CHOICE QUESTIONS)
1. b) 2. c) 3. c) 4. d) 5. d) 6. b) 7. b) 8. d)
