CH 5

AN INTRODUCTION TO RESEARCH METHODS
SAMPLING AND SAMPLE DESIGN

5.5 Non-Probability Sampling

Non-probability sampling is a nonrandom and subjective method of sampling where the selection of the population elements comprising the sample depends on the personal judgment or the discretion of the sampler. The distinguishing feature of non-probability sampling is that the selection of population elements is not made through probability sampling. Because of this, the investigator cannot claim that the sample is representative of the larger population. This limits the investigator's ability to generalize the findings beyond the specific sample studied. Further, no confidence interval estimation is possible for non-probability sampling. We discuss below a few non-probability sampling methods.

5.5.1 Convenience Sampling

Non-probability samples that are unrestricted are known as convenience samples. Researchers or field workers have the freedom to choose whomever they find; thus the name convenience. The convenience sample may consist of respondents living in an easily accessible locality. Undoubtedly, it is the simplest and least reliable form of non-probability sampling. Its primary virtue is its low cost. While a convenience sample has no control to ensure precision, this method is quite frequently used, especially in market research and public opinion surveys. Convenience samples are used because probability sampling is often a time-consuming and expensive procedure and, in fact, may not be feasible in many situations. In the early stages of exploratory research, when one is seeking guidance, this approach is recommended.

5.5.2 Accidental Sampling

Accidental sampling is one in which the cases are selected from whatever happens to be available instantly. In such samples, individuals are selected as they appear in a process.
For example, if it is decided that patients with abdominal pain will be chosen as they appear in a queue in front of a hospital counter, the selection follows an accidental sampling procedure.

5.5.3 Purposive Sampling

A non-probability sampling method that conforms to certain criteria is called purposive sampling. There are two major types of purposive sampling: (i) judgment sampling and (ii) quota sampling.

Judgment sampling: In judgment sampling, individuals who are considered to be most representative of the population are selected. It is called judgment sampling because the choice of the individual units depends entirely on the sampler, who, on his own judgment, decides the sample to be selected that conforms to some criteria. In a study of labor problems, you may decide to talk only with those who have experienced discrimination in their jobs. Election results are predicted from only a few selected persons because of their predictive record in past elections.

Quota sampling: Quota sampling is a non-probability sampling method in which the interviewers are told to contact and interview a certain number of individuals from certain sub-groups or strata of the population to make up the total sample. The formation of the strata is usually based on such characteristics as sex, age, social status and region of residence. The characteristics used to form strata are termed 'quota controls'. The technique is widely used by market researchers, political opinion seekers and many others to avoid the cost problems of interviewing the individuals.

The term 'quota' arises from the fact that in this method, the interviewers are given quotas of the sub-groups (i.e. strata) of the population at the very outset to build a sample roughly proportional to the population. That is, quotas of the desired number of sample cases are computed proportionally to the population sub-groups. The sample quotas are divided among the interviewers, who then do their best to choose persons who fit the restrictions of their quota controls.
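The proportional computation of quotas described above can be sketched in a few lines of Python. This is only an illustration: the function name `proportional_quotas` and the age-group counts are hypothetical and do not come from the text.

```python
def proportional_quotas(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to their population sizes."""
    total = sum(stratum_sizes.values())
    return {
        name: round(sample_size * size / total)
        for name, size in stratum_sizes.items()
    }


# Hypothetical population counts by age group (illustrative only).
population = {"18-29": 2500, "30-49": 4500, "50+": 3000}
quotas = proportional_quotas(population, 200)
print(quotas)  # {'18-29': 50, '30-49': 90, '50+': 60}
```

With shares that do not divide evenly, simple rounding may leave the quotas one or two cases above or below the intended total, so in practice a final adjustment of the largest stratum may be needed.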
Suppose you want to conduct a survey of rural and urban residents of a population. How many residents should be chosen from each area? If it is known that one-third of the population lives in urban areas and two-thirds in rural areas, the sample can be selected purposively from urban and rural areas in the same proportion. Thus a total of 300 respondents would mean 100 urban residents and 200 rural residents to be included in the study. Note that quota sampling may be considered equivalent to stratified sampling with the added requirement that each stratum is generally represented in the sample in the same proportion as in the entire population.

Quota sampling is practiced mainly on the grounds that its cost per element is lower than for a probability sample, it is easier to administer, and it can be executed more quickly than a comparable probability sample. Another advantage of quota sampling is that it can always attain the intended sample size in each stratum, whereas with a pre-selected random sample there will be some selected individuals who cannot be found at home, who have migrated elsewhere or who refuse to cooperate, resulting in an increased non-response rate.

Despite its simplicity, quota sampling has certain weaknesses. First, the choice of subjects is left to field workers to make on a judgmental basis, and thus it suffers from selection bias. Secondly, since the procedure of selecting the sample is ill-defined, there is no valid method of estimating the standard error of a sample estimator.

5.5.4 Snowball Sampling

Snowball sampling is the colorful name for a technique of building up a list or a sample of a special population. Some recent authors have referred to snowball sampling as chain referral or network sampling. Snowball sampling is conducted in stages. In the first stage, a few persons possessing the requisite characteristic are identified and interviewed.
These persons are used as informants to identify others who qualify for inclusion in the sample. The second stage involves interviewing these persons, who in turn lead to still more persons who can be interviewed in the third stage, and so on. For example, consider the selection of beggars, for which no frame is available. This can be best done by asking an initial group of beggars to supply the names of other beggars they come across. Selection of street sex workers can also be made following this network approach. If you were able to find a few sex workers willing to talk to you, you might ask them for the names and locations of others they know who might also be willing to be interviewed. The term snowball stems from the analogy of a snowball, which begins small but becomes bigger and bigger as it rolls downhill.

Snowball sampling has been particularly used to study drug cultures, heroin addiction, teenage gang activities, community relations, and other issues where respondents may not be readily visible or are difficult to identify and contact.

5.6 Probability Sampling

Probability sampling is based on the concept of random selection, a controlled procedure that gives each population element a known, non-zero chance of selection. This section is devoted to the discussion of probability sampling designs, which are frequently used in sample selection. These include, among others, the following:

• Simple random sampling
• Systematic sampling
• Stratified random sampling and
• Cluster sampling

We provide below a brief overview of these sampling procedures.

5.6.1 Simple Random Sampling

Simple random sampling is a procedure that gives each of the sampling units in the population an equal and known non-zero probability of being selected. Selecting a simple random sample may be accomplished with the aid of computer software, a table of random numbers, or a scientific calculator. In most instances, random numbers are employed to select samples.
Such a selection procedure ensures that every unit in the population has an equal and known probability of being included in the sample. Drawing a simple random sample from a population requires that in every draw, each eligible population unit be assigned an equal probability of selection. This ensures randomness in the selection, making the sample independent of human judgment.

In reality, a simple random sample is drawn unit by unit. If a list (sampling frame) of the population units is available, the selection of a random sample may be easily accomplished with the use of random numbers. The following 8-step procedure may be followed in drawing a simple random sample of n units using random numbers from a population of N units.

1. Assign serial numbers to the units in the population from 1 through N.
2. Decide on the random number table to be used.
3. Choose a random number with as many digits as N from any point in the random number table.
4. If this random number is less than or equal to N, this is your first selected unit.
5. Move on to the next random number not exceeding N, vertically, horizontally or in any other direction systematically, and choose your second unit.
6. If at any stage of your selection the random number chosen exceeds N, discard it and choose the next random number.
7. If, further, any random number is repeated, it must also be discarded and be replaced by a fresh random number appearing next.
8. The process stops once you arrive at your desired sample size.

The following examples are designed to illustrate how the selection of population elements can be made in practice.

Example 5.3: Draw a simple random sample of size 5 from a population comprising 150 units employing the simple random sampling method.

Solution: Here n=5 and N=150. Assign serial numbers 001, 002, ..., 150 to the 150 units in the population. Since 150 is a three-digit number, we simply read three-digit random numbers presented in Appendix 1. We start from the leftmost digit of the first row of the random number table in Appendix 1 and proceed downward until we achieve a sample of 5 units. The random numbers were as follows:

86 666 277 130 802 108 541 608 497 7 40 515 416 302 413 258 061 608 809 195
414 493 063 609 923 79 381 396 840 474 43 642 668 724 210 953 407 582 895 154

Note that we choose only those numbers which lie in the range 001-150. Any number lying outside this range is omitted, since it does not correspond to any unit in the population. The process stops once we arrive at five numbers. Note that the selected numbers are 130, 108, 61, 63 and 121. These numbers are underlined and shown in bold face. All these numbers are distinct. If a random number occurs twice, the second occurrence is omitted, and another number is selected as its replacement.

Example 5.4: Assume that there are N=1000 records of daily wages of the employees of a pharmaceutical industry. Draw a sample of 25 records using the random numbers shown in Appendix 1.

Solution: The first step is to arrange the wages of the 1000 employees, assigning a number from 000 to 999. That is, we have 1000 three-digit numbers where 001 represents the first record, 999 the 999th record and 000 the 1000th. We may use the first three digits of the first column of random numbers in Appendix 1, which consists of 10 random digits, dropping the last 7 digits of each random number. We see that the first selected number is 853, the second is 540, the third is 985, and so on. Proceeding further down the column, the following random numbers are obtained:

853 540 985 903 6 3°92 64° 99k OTB 495 496 641 417 906 906 715 883 744 104 467 236 159 118 782

Note that renumbering the serials has made the task of choosing the cases much easier and there has been no rejection in the process.
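The selection steps listed above (discard numbers outside 1 to N, discard repeats, stop at n units) can be sketched in Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the short list of numbers is an illustrative stream rather than the printed table in Appendix 1, and a seeded `random.Random` generator merely stands in for a random number table.

```python
import random


def select_from_stream(numbers, N, n):
    """Pick n distinct labels in 1..N from a stream of random numbers,
    discarding out-of-range values and repeats."""
    if n > N:
        raise ValueError("sample size cannot exceed population size")
    chosen = []
    for x in numbers:
        if 1 <= x <= N and x not in chosen:
            chosen.append(x)
            if len(chosen) == n:
                return chosen
    raise ValueError("random-number stream exhausted before n units were selected")


def three_digit_stream(seed):
    """Stand-in for a printed table: an endless supply of numbers 000..999."""
    rng = random.Random(seed)
    while True:
        yield rng.randint(0, 999)


# Hypothetical stream of three-digit numbers (illustrative only).
hypothetical = [277, 130, 802, 108, 541, 130, 61, 63, 121]
print(select_from_stream(hypothetical, N=150, n=5))  # [130, 108, 61, 63, 121]

# The same rule applied to computer-generated numbers.
print(select_from_stream(three_digit_stream(seed=1), N=150, n=5))
```

When software is available, the standard library call `random.sample(range(1, N + 1), n)` draws a simple random sample without replacement directly; the stream version is shown only because it mirrors the table-based steps.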
If the wage records of the employees are actually numbered, we merely choose the records with the corresponding numbers, and these records represent a simple random sample of size n=25 from N=1000.

We illustrate below by an example a relatively efficient method of drawing a simple random sample that has a lower rejection rate.

Example 5.5: Refer to Example 5.3. The population from which a sample of 5 has to be chosen contains 150 units. For selecting a unit from 001-150, follow the steps below:

1. Choose a random number from the random number table provided to you (refer to the random numbers shown in Example 5.3). This number is 277, which exceeds 150.
2. Divide 277 by 150. The remainder is 127. The unit labeled 127 in the population is your first selected unit.
3. To select the second unit, choose the next random number. This number is 130, which is less than 150. We directly choose this number as our second unit in the sample.
4. The next random number is 802, which results in a remainder of 52 when divided by 150. The unit corresponding to this number is our third selected unit.
5. Continuing this process, we arrive at the next two numbers. These are 108 and 91 (the remainder of 541 divided by 150).

The random numbers thus chosen are 52, 91, 108, 127, and 130.

The procedure above is referred to as the remainder method. A remainder of 0 is taken to denote the unit labeled 150. This procedure has the advantage of having a lower rejection rate in the selection process.

5.6.1.1 Determination of sample size in simple random sample

One of the most important problems in planning a sample survey is that of determining how large a sample is needed for the estimates to be reliable enough to meet the objectives of the survey. The decision is important for several reasons. Too large a sample involves huge cost, manpower, materials and time, while too small a sample invalidates the results. Then the question is: what is the optimum size of the sample?
Although general rules are hard to make for the sample size without knowledge of the specific population, around 30 cases seem to be the bare minimum for studies in which statistical data analysis is to be done (Champion 1970: 89). However, many researchers regard 50, and some even 100 cases as the minimum (Fisher et al. 1991). One reason is that there are often several sub-populations the researchers wish to have controlled for. If there are not enough cases in each sub-group of the population, it is sometimes hard to meet the assumptions of standard statistical tests, chi-square in particular. In addition, percentages calculated on the basis of fewer than 30 cases tend to be unreliable.

Fisher et al. (1991) suggest a simple approach for cases when the analysis is based on a contingency table. This approach ensures a minimum number of cases as cell frequencies in a cross-table of variables. Following the approach, let us consider the problem of analyzing the association between nutritional knowledge of mothers and their level of education. In order to analyze such a table, two points are to be kept in mind while determining the sample size:

• Each category of the independent variable should contain at least a specified number of cases (30 in this example);
• The expected number of cases in each cell should be at least 5 (to permit a statistical test, such as chi-square).

In the present example, education is the independent variable while nutritional knowledge is the dependent variable. Let the variable 'education' have 4 levels as below:

Education level      % of mothers
None                 60
Primary              20
Secondary            15
Above secondary      5
Total                100

Now suppose that nutritional knowledge of the mothers has 3 categories, 'no knowledge', 'moderate knowledge' and 'high knowledge', which account respectively for 30%, 20% and 50% of all mothers.

Knowledge level      % of mothers
No knowledge         30
Moderate knowledge   20
High knowledge       50
Total                100

To satisfy the first criterion, the smallest category of the independent variable (5% for above secondary) must contain at least 30 cases, which requires $n = 30/0.05 = 600$. To find the minimum sample size needed to ensure an expected cell frequency of at least 5, we divide 5 by the product of the proportions falling in the smallest categories of the two variables (viz. 5% for above secondary, and 20% for moderate knowledge):

$$n = \frac{5}{(0.05)(0.20)} = 500$$

Since the sample size required must meet both criteria (30 cases in each variable category and 5 cases in each cell), the larger of the two estimates (600 vs 500) should be adopted as the final sample size. This criterion leads to a choice of n=600 as our final sample size. We can verify that the above procedure ensures that none of the cells contains fewer than 5 cases and at the same time each category of the independent variable contains at least 30 cases:

Table 5.3: Cross-table of Education and Nutritional Level

                                     Education level
Nutrition level       None     Primary   Secondary   Above secondary   Total (%)
No knowledge          108      36        27          9                 180 (30%)
Moderate knowledge    72       24        18          6                 120 (20%)
High knowledge        180      60        45          15                300 (50%)
Total                 360      120       90          30                600
(%)                   (60%)    (20%)     (15%)       (5%)              (100%)

The cell values in the above table are calculated as the product of row and column percentages and the estimated sample size (n=600). For example, the first value 108 is calculated as follows:

108 = 0.30 × 0.60 × 600

Similarly, the second value 60 in the third row is calculated as

60 = 0.50 × 0.20 × 600

We present below a more statistically sound approach of determining sample size. In doing so, we consider two cases:

a) Determination of sample size (n) in estimating a population proportion;
b) Determination of sample size (n) in estimating a population mean.

5.6.1.2 Sample size when estimating a population proportion

In sample surveys, we are frequently confronted with the problem of estimating population proportions or percentages, such as the proportion of persons smoking, the proportion of children suffering from malnutrition, the proportion of voters favoring a particular candidate, the percentage of customers arriving at a superstore with a credit card, and the like. Thus if p is such a proportion that has a given attribute, then for a sufficiently large population, the formula for estimating the sample size is

$$n_0 = \frac{z^2 p(1-p)}{d^2} \qquad \text{(c)}$$

where:
$n_0$ = desired sample size
z = standard normal deviate, usually set at 1.96, which corresponds to the 95% confidence level
p = assumed proportion in the target population estimated to have a particular characteristic
d = allowable marginal error in estimating the population proportion.

Example 5.6: A nutrition survey is to be conducted in a refugee camp. Assume that 40% of children suffer from malnutrition. How large a sample would be needed in order to be 95% certain that the estimated prevalence does not differ from the true prevalence by more than 0.05?

Solution: Assuming that the population is large, we employ formula (c) above. Here z=1.96, d=0.05 and p=0.40. We want to estimate the true proportion in the population within 5 percentage points of p, that is, within P = 0.40 ± 0.05. Thus

$$n_0 = \frac{z^2 p(1-p)}{d^2} = \frac{(1.96^2)(0.40)(0.60)}{(0.05)^2} = 369$$

If p is not known or is difficult to assume, the safest procedure is to take it as 0.50, which maximizes the expected variance and therefore indicates a sample size that is sure to be large enough. If the proportion is expected to lie between two values, the value closest to 50% is selected. For example, if p is thought to be between 15% and 30%, then 30% (the larger of the two) should be chosen as the value of p to calculate $n_0$.

A common choice of d is 0.05.
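Formula (c) and the arithmetic of Example 5.6 can be checked with a short Python sketch. The function name `sample_size_proportion` is hypothetical, and rounding the result up to the next whole unit is an assumption made here for illustration; the text itself does not state a rounding rule.

```python
import math


def sample_size_proportion(p, d, z=1.96):
    """Formula (c): n0 = z^2 * p * (1 - p) / d^2, rounded up (assumed rule)."""
    return math.ceil(z**2 * p * (1 - p) / d**2)


print(sample_size_proportion(0.40, 0.05))  # Example 5.6: 369
print(sample_size_proportion(0.50, 0.05))  # p unknown, taken as 0.50: 385

# For p believed to lie between 15% and 30%, the value nearer 50% gives the larger n0.
print(sample_size_proportion(0.15, 0.05), sample_size_proportion(0.30, 0.05))
```

The last line illustrates why the value of p closest to 50% is the conservative choice: the product p(1-p) grows as p approaches 0.50, and the required sample size grows with it.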
The value d = 0.05 does not seem realistic in scenarios where the true value of p lies outside a moderate range, that is, close to 0 or 1. In such cases a smaller value of d, or consideration of a relative margin of error r, is recommended. The relative margin r is computed as a proportion of the assumed true proportion p, so that d = rp. Consideration of this relative rate of allowable error converts equation (c) to

$$n_0 = \frac{z^2 p(1-p)}{(rp)^2} = \frac{z^2 (1-p)}{r^2 p} \qquad \text{(c*)}$$

We check below that formula (c) yields a value of 139 for $n_0$ when d=0.05 and p=0.90:

$$n_0 = \frac{z^2 p(1-p)}{d^2} = \frac{1.96^2 (0.90)(1-0.90)}{0.05^2} \approx 139$$

With the same values of r (0.05) and p (0.90), (c*) yields:

$$n_0 = \frac{z^2 (1-p)}{r^2 p} = \frac{1.96^2 (0.10)}{(0.05)^2 (0.90)} \approx 171$$

If N is small, the formula to be used assumes the following form:

$$n = \frac{N z^2 p(1-p)}{N d^2 + z^2 p(1-p)} \qquad \text{(d)}$$

Formula (d) above can also be expressed as follows:

$$n = \frac{N n_0}{N + n_0}$$

In practice, we first calculate $n_0$. If $n_0/N$ is negligible, then $n_0$ is a satisfactory approximation to n. Assuming that p is difficult to fix in advance, we take it to be 0.50. In that event

$$n_0 = \frac{(1.96^2)(0.50)(0.50)}{(0.05)^2} = 385$$

Suppose N=2000 and we regard this as a small population. Then we revise our estimate of n as follows:

$$n = \frac{N n_0}{N + n_0} = \frac{2000 \times 385}{2000 + 385} \approx 323$$

Yamane (1967) provides a more simplified formula to calculate n. This is

$$n = \frac{N}{1 + N d^2} \qquad \text{(e)}$$

When (e) is applied to the above case,

$$n = \frac{N}{1 + N d^2} = \frac{2000}{1 + 2000(0.05)^2} \approx 333$$

As can be noted, formula (e) results in sampling fewer children than formula (c). It is further easy to verify that formula (c), for given z and d values, will give the same sample size regardless of the size of the population. The following table compares the two formulae numerically:
The following table compares the two forma numerically: Table 5.4: Comparison of Two Sample Size Formulae for | p=0.5, d=0.05 and z=1.96 Population Estimated sample size Estimated sample size size when Nis large when Vis small 50 385 45 100 385 80 500 385 218 1,000 385 279 5,000 385 357 10,000 385 371 50,000 385 382 In comparative studies, one usually wants to demonstrate that there e | significant difference between two groups. If we assume an equal nut of cases (m= n 2= 7) in the two sub-populations, the formula for 7s"? similar to the one above; n 2222p p) a ao! “ f Example 5.7: Suppose we want to compare nutritional status of groups of children. If we expect P (proportion malnourished) © mi and we wish to conclude that an observed difference of 0.10 of ™® SAMPLING AND SAMPLE DESIGN 1 65 ‘ ignificant at the 0.05 | Y vonsidered as signi Ie level (95% confi vrsample size for each group based on (f) works cute te (107 =184, thus, we need 184 children in the first group and another 184 children in the second group. gxample 5.8: How large a random sample of records from a hospital admitting 20,000 patients during a year is required if we want to be 90% confident that our estimate of proportion of patients with hypertension is off by less than .02? An estimate of the proportion of patients having hypertension was available from a neighboring hospital to be 0.35. Solution: In the example we have d=0.02, z =1.64 and p=0.35. Since N is pretty large, Ifthe population would have been small, say 3000, na No _ 3000x1530 N+nq 3000+1530 =1013. Following Yamane na - 0 _ essa 1+Nd? 
1+3000(.02) Example 5.9: The National Board of Revenue (NBR) suspects, on the basis of a preliminary enquiry, that 20% businessmen provide false Statements in their income tax return, They now plan to undertake a ‘ttistical study to estimate the true proportion of persons who falsify in their retums, How large a sample would be needed if they want to be 90 % ie that the error in the estimation in the true proportion will be within ? Witton: Here 2=1.64, p=0.20, 40.05, Under the assumption that Prlation from which the sample is to be taken is large, we have =F l=p) (64) (020)0.80) 72 Ng — 7 q d (0.05) | ; heefore, in order to be 90% confident of estimating a desired Portion within +5 9% ofthe true value, a sample of 172 is needed. “sch Methods—12 ba 0 RESEARCH METHODS 166 AN INTRODUCTION T Id have set p=0.50, so that estimate of p, We WOU! Had there been no 2 p(t p) = (1.647050) (0.50) _ 265 maar (0.05) in determination of sample size is not strictly follow, aiteing an derived for , rather an au ae is adopteg fy ‘one reason or other. In such cases, it may be of interest to know the de of allowable error (d) inherent in the estimated proportions of interes, implied by the arbitrary choice of n. This requires a post-survey estima of d, a formula for which is of the following form: : d= ]-2L=P) , when population is large, and ... ¢ 7 2 de ol evn) » When population is small .., (#4) In Example 5.10: Suppose a sample survey was based on an arbitrary sample size of 386 units from a large population. What was the marginal level of error implied by the choice of p=0.05 and z=1.96? What would be d if Nis assumed to 4000? Solution: Employing (*) above, 2 2 a=, 2p) _ 126x808). 995 or 5% n When N=4000 (this being assumed small), we employ (**) and find 1,96? x0.5(1-0.5)(4000 - 386) d= = =4.7% 4000x386 one 5.6.1.3 Sample size for estimating population mean Very often we want to draw inference on the mean and the total value variables like income, expenditure, age or BMI. 
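Before the formula for the mean is taken up, the proportion formulas (c), (d), (e), (*) and (**) above can be checked numerically. The following is a minimal Python sketch; the function names are hypothetical, and results are left unrounded so that the rounding used in the worked examples stays visible.

```python
import math


def n_large(p, d, z):
    """Formula (c): sample size for a large population."""
    return z**2 * p * (1 - p) / d**2


def n_finite(n0, N):
    """Formula (d) in its second form: n = N * n0 / (N + n0)."""
    return N * n0 / (N + n0)


def n_yamane(N, d):
    """Formula (e), Yamane (1967): n = N / (1 + N * d^2)."""
    return N / (1 + N * d**2)


def margin(p, n, z, N=None):
    """Formulas (*) and (**): post-survey allowable error d.
    Pass N to apply the small-population version."""
    var = z**2 * p * (1 - p) / n
    if N is not None:
        var *= (N - n) / N
    return math.sqrt(var)


# Example 5.8: 90% confidence (z = 1.64), p = 0.35, d = 0.02
n0 = n_large(0.35, 0.02, 1.64)
print(round(n0))                          # 1530
print(round(n_finite(1530, 3000)))        # 1013
print(round(n_yamane(3000, 0.02)))        # 1364

# Example 5.10: n = 386, p = 0.5, z = 1.96
print(round(margin(0.5, 386, 1.96), 3))          # 0.05
print(round(margin(0.5, 386, 1.96, N=4000), 3))  # 0.047
```

Note that formula (e) takes no z or p as input, which is why it can differ noticeably from (d) when the confidence level is not 95% or p is far from 0.50, as in Example 5.8.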
The sample size neede make such inference is somewhat different from the one discussed f0" proportion. For the mean, the formula is joe wo Na? +276? 7% : 16 where 0 is the population variance, The other quantities bear the 5H” meaning as before. When N is large, the sample size can be calculated _d SAMPLING AND SAMPLE DESIGN 167 _ mal are {hy The formula (g) can now be rewritten as Nag ian fi) N+ny gxample 5.11: For a population of 10,000 women, the distribution of the ody mass index (BMI) showed a variance of 15. How large a sample should we draw if we want to be 95% confident that our estimate of the average BMI in the population is off by + 0.32 Solution: Here N=10,000, o*=15, @=0.3. Hence for estimating the mean, the sample size is obtained from (g) as below: 10000 (0.3)? +(1.967)(15) _ Thus a sample of 602 women will be needed to achieve the desired degree of confidence in the estimate. If N were large, n would have been by virtue of (A) 1,96? x15 nie ee oF 640 Example 5.12: Suppose a real estate company wishes to estimate the monthly average consumption of electricity in their apartment complexes located in a city within +50 kwh of the true value and desires to be 95% confident of correctly assessing the true mean. On the basis of a study undertaken elsewhere in a similar environment, the company believes that the standard deviation can be estimated as 200 kwlh. What would be the Sample size? Solution: Using (h) with o? =200=40000, 2=1.96 and d=50, we have _ 1.96? x40000 _ [| 00 1% 61.5 That is, we choose 62 complexes. tFe-cannot be guessed at all or estimated otherwise, a rule of thumb for vavtating the standard deviation is to take one-sixth of the range of the Cites the researcher expects. In the above example, if the company torts that the consumption of electricity will vary from as low as 2500 88 high as 40000 kw{h, then an estimate of ¢ can be taken to be
