
Sample Size Determination

1. It is desired to estimate the mean lifespan of a particular model of television. Given that the
population standard deviation σ=500 hours, how large a sample is needed to be able to assert
with a confidence level of 95% that the mean of the sample will differ from the population mean by
less than 50 hours? (95% confidence level: Z=1.96) (Ans: 385)

2. To estimate the mean battery life of a new type of smartphone battery, it is known that the population
standard deviation σ=200 minutes. How large a sample is needed to be able to assert with a
confidence level of 99% that the mean of the sample will differ from the population mean by less than
20 minutes? (99% confidence level: Z=2.576) (Ans: 664)

3. A manufacturer wants to estimate the mean durability of their car tires. Given that the population
standard deviation σ=10,000 kilometers, how large a sample is needed to ensure that the
mean of the sample will differ from the population mean by less than 1,200 kilometers with 95%
confidence? (95% confidence level: Z=1.96) (Ans: 267)

4. It is desired to estimate the mean lifespan of a new brand of light bulbs. Given that the population
standard deviation σ=150 hours, how large a sample is needed to assert with a confidence level
of 90% that the mean of the sample will differ from the population mean by less than 10 hours? (90%
confidence level: Z=1.645) (Ans: 609)

5. To estimate the mean energy consumption of a new model of refrigerator, it is known that the
population standard deviation σ=30 kWh/year. How large a sample is needed to assert with a
confidence level of 95% that the mean of the sample will differ from the population mean by less than
5 kWh/year? (95% confidence level: Z=1.96) (Ans: 139)
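
Questions 1 to 5 all rest on the same relation, n = (Zσ/E)², rounded up to the next whole unit, where E is the largest acceptable difference between the sample mean and the population mean. A minimal Python sketch follows; the function name `sample_size_for_mean` is an illustrative choice, not part of the problem set.

```python
import math


def sample_size_for_mean(sigma: float, margin: float, z: float) -> int:
    """Smallest whole n for which z * sigma / sqrt(n) does not exceed margin."""
    return math.ceil((z * sigma / margin) ** 2)


# Question 1: sigma = 500 hours, margin = 50 hours, Z = 1.96
print(sample_size_for_mean(sigma=500, margin=50, z=1.96))  # 385

# Question 5: sigma = 30 kWh/year, margin = 5 kWh/year, Z = 1.96
print(sample_size_for_mean(sigma=30, margin=5, z=1.96))  # 139
```

The result is always rounded up, because rounding down would leave the margin of error slightly larger than the one requested.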

6. A consumer electronics company wants to determine the job satisfaction levels of its employees. For
this, they ask a simple question, ‘Are you satisfied with your job?’ It was estimated that no more than
30 per cent of the employees would answer yes. What should be the sample size for this company to
estimate the population proportion to ensure 95 per cent confidence in the result, and to be within 0.04
of the true population proportion? (95% confidence level: Z=1.96) (Ans: 505)

7. A tech company wants to assess the level of employee engagement. It is estimated that no more than
50% of the employees will consider themselves highly engaged. What should be the sample size to
estimate the population proportion with 95% confidence and within 0.05 of the true population
proportion? (95% confidence level: Z=1.96) (Ans: 385)
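
For a proportion, the usual relation is n = Z²p(1−p)/E², rounded up, where p is the anticipated proportion and E the acceptable margin. The sketch below assumes that formula; the function name `sample_size_for_proportion` is illustrative only.

```python
import math


def sample_size_for_proportion(p: float, margin: float, z: float) -> int:
    """Smallest whole n for which z * sqrt(p * (1 - p) / n) does not exceed margin."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Question 7: p = 0.5, margin = 0.05, Z = 1.96
print(sample_size_for_proportion(p=0.5, margin=0.05, z=1.96))  # 385
```

Because p(1−p) peaks at p = 0.5, using 0.5 gives the most conservative (largest) sample size for a given margin.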

8. A population is divided into three strata with sizes N1=1000, N2=1500, and N3=2500. The respective
standard deviations for these strata are σ1=2.0, σ2=3.5, and σ3=4.5. If the sampling costs are Rs 15
per interview for all strata, and the total sample size is n=200, how should the sample be
allocated under a proportionate sampling design and under a disproportionate sampling design that
considers both stratum variability and sampling costs? Explain.

Ans (proportionate allocation): n1=40, n2=60, n3=100
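
The two allocations in Question 8 can be compared numerically. The sketch below assumes the textbook rules: proportionate allocation takes n_i = n·N_i/N, and the disproportionate (optimum) allocation takes n_i proportional to N_i·σ_i/√c_i. The function names are illustrative, and the optimum rule is an assumption about which disproportionate design the question intends.

```python
import math


def proportionate_allocation(sizes, n):
    """Allocate n across strata in proportion to stratum size."""
    total = sum(sizes)
    return [n * size / total for size in sizes]


def optimum_allocation(sizes, sds, costs, n):
    """Allocate n in proportion to N_i * sigma_i / sqrt(c_i) (assumed rule)."""
    weights = [size * sd / math.sqrt(cost)
               for size, sd, cost in zip(sizes, sds, costs)]
    total = sum(weights)
    return [n * w / total for w in weights]


sizes = [1000, 1500, 2500]
sds = [2.0, 3.5, 4.5]
costs = [15, 15, 15]  # Rs per interview, equal in every stratum

print(proportionate_allocation(sizes, 200))  # [40.0, 60.0, 100.0]
print([round(x, 2) for x in optimum_allocation(sizes, sds, costs, 200)])
# [21.62, 56.76, 121.62]
```

With equal costs the cost term cancels, so the disproportionate design depends only on N_i·σ_i. The fractional results still have to be rounded to whole interviews in a way that keeps the total at 200.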


Hypothesis Testing

9. A random sample of 25 students from a high school averaged 78.5 points with a standard deviation of
5.2 points on a standardized math test. Test the school's claim that their students' average score is
greater than 80 points using a 5 percent level of significance. (Table value of t at 5 percent = 1.711).
(Ans: −1.442)

10. A random sample of 30 participants in a weight loss program lost an average of 6.8 kg with a standard
deviation of 2.5 kg. Test the program's claim that participants lose an average of at least 7.5 kg using a
5 percent level of significance. (Table value of t at 5 percent = 1.697) (Ans: −1.535)

11. A random sample of 40 calls to a customer service centre had an average response time of 3.2 minutes
with a standard deviation of 1.1 minutes. Test the company's claim that the average response time is
less than 3.5 minutes using a 5 percent level of significance. (Table value = 1.645) (Ans: −1.734)
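
Questions 9 to 11 use the one-sample statistic (x̄ − μ₀)/(s/√n), which is then compared with the quoted table value. A minimal sketch, with an illustrative function name:

```python
import math


def one_sample_statistic(sample_mean, claimed_mean, sd, n):
    """Test statistic for a single mean: (x_bar - mu_0) / (s / sqrt(n))."""
    standard_error = sd / math.sqrt(n)
    return (sample_mean - claimed_mean) / standard_error


# Question 9: n = 25, sample mean = 78.5, s = 5.2, claimed mean = 80
t = one_sample_statistic(sample_mean=78.5, claimed_mean=80, sd=5.2, n=25)
print(round(t, 3))  # -1.442
```

The same expression serves as a t statistic for small samples and as a z statistic for large ones; only the table value it is compared with changes.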

12. In a department store’s study designed to test whether the mean balance outstanding on 30-day
charge accounts is the same in its two suburban branch stores, random samples yielded the following
results: n1 = 60, mean1 = Rs. 6420, s1 = Rs. 1600, n2 = 100, mean2 = Rs. 7141, s2 = Rs. 2213, where the
subscripts denote branch store 1 and branch store 2. Use the 0.05 level of significance to test the
hypothesis against a suitable alternative. (Table value = 1.96) (Ans: −2.38)

13. A company's study is designed to test whether the mean sales during the holiday season is the same in
its two regional branches. Random samples yielded the following results: n1 = 80, mean1 = $5000, s1
= $1200, n2 = 90, mean2 = $5300, s2 = $1500. Use the 0.05 level of significance to test the
hypothesis against a suitable alternative. (Table value=1.96) (Ans: −1.45)

14. In a supermarket chain's study designed to test whether the mean transaction amount is the same in its
two suburban stores, random samples yielded the following results: n1 = 40, mean1 = $150, s1 = $20,
n2 = 60, mean2 = $160, s2 = $25, where the subscripts denote store 1 and store 2. Use the 0.05 level
of significance to test the hypothesis against a suitable alternative. (Table value=1.96) (Ans: −2.21)
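
Questions 12 to 14 compare two independent large samples with z = (x̄₁ − x̄₂)/√(s₁²/n₁ + s₂²/n₂). The sketch below works through the figures of Question 14; the function name is illustrative, and the two-tailed decision rule is an assumption based on the "is the same" wording of the hypothesis.

```python
import math


def two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """z statistic for the difference between two independent sample means."""
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Question 14: store 1 (n = 40, mean = 150, s = 20), store 2 (n = 60, mean = 160, s = 25)
z = two_sample_z(mean1=150, s1=20, n1=40, mean2=160, s2=25, n2=60)
print(round(z, 2))    # -2.21
print(abs(z) > 1.96)  # True: reject equality of means at the 0.05 level
```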

15. A survey conducted in a city suggests that the population distribution by type of grocery store used is
50% supermarket, 20% local store, 20% convenience store, and 10% online. In a sample of 500
shoppers, 250 reported using supermarkets, 120 used local stores, 100 used convenience stores, and
the rest shopped online. Perform a chi-square test at the 5 percent significance level to assess whether
the observed distribution differs significantly from the expected distribution. (The critical chi-square
value = 7.815) (Ans: 12)

16. A survey conducted in a school suggests that the population distribution by preferred study method is
50% group study, 25% individual study, 15% online resources, and 10% tutoring. In a sample of 200
students, 110 reported preferring group study, 50 preferred individual study, 20 preferred online
resources, and 20 preferred tutoring. Perform a chi-square test at the 5 percent significance level to
assess whether the observed distribution differs significantly from the expected distribution. (The
critical chi-square value = 7.815) (Ans: 4.33)

17. A bookstore owner believes that the proportion of customers buying fiction, non-fiction, magazines,
and stationery follows a ratio of 5:3:2:1. In a sample of 220 customers, 110 bought fiction, 66 bought
non-fiction, 22 bought magazines, and the rest bought stationery. Determine if the observed
distribution differs significantly from the expected distribution using a chi-square test. (Table value
chi-square = 7.815) (Ans: 9.9)

18. A school's cafeteria manager believes that the proportion of students buying sandwiches, salads, fruits,
and beverages follows a ratio of 4:2:2:1. In a sample of 280 students, 140 bought sandwiches, 60
bought salads, 40 bought fruits, and the rest bought beverages. Determine if the observed distribution
differs significantly from the expected distribution using a chi-square test. (Table value chi-square =
7.815) (Ans: 12.50)

19. A survey indicates that the distribution of preferred streaming services among college students is 50%
Netflix, 25% Hulu, 15% Amazon Prime, and 10% Disney+. In a sample of 500 college students, 240
prefer Netflix, 130 prefer Hulu, 80 prefer Amazon Prime, and the rest prefer Disney+. Perform a
chi-square test at the 5 percent significance level to examine if the observed distribution differs
significantly from the expected distribution. (The critical chi-square value = 7.815) (Ans: 0.933)

20. A high school is studying absenteeism in its sports teams. The total number of absentees from practice
sessions is 60. Test the hypothesis at the 5% significance level that absenteeism is the same for all
teams if the actual absentees are as follows: Basketball: 12, Soccer: 14, Baseball: 10, Volleyball: 13,
Track and Field: 11. (Table value is approximately 9.488) (Ans: 0.833)
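
Questions 15 to 20 are goodness-of-fit tests: expected counts come from the hypothesised shares, and the statistic is χ² = Σ (O − E)²/E. The sketch below works through Question 15; both function names are illustrative.

```python
def expected_counts(shares, n):
    """Expected counts for n observations split according to shares or a ratio."""
    total = sum(shares)
    return [n * share / total for share in shares]


def chi_square_statistic(observed, expected):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Question 15: supermarket, local store, convenience store, online
observed = [250, 120, 100, 30]                      # online = 500 - 470
expected = expected_counts([50, 20, 20, 10], 500)   # [250.0, 100.0, 100.0, 50.0]
chi2 = chi_square_statistic(observed, expected)
print(chi2)          # 12.0
print(chi2 > 7.815)  # True: the statistic exceeds the critical value
```

Because `expected_counts` normalises by the sum of the shares, the same helper accepts percentages (50, 20, 20, 10) or a ratio such as 5:3:2:1.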

21. A study examines the effect of different workout routines on weight loss in gym members.
Researchers compare the mean weight loss of participants following cardio exercises, strength
training, and yoga sessions. The ANOVA test yields an F-statistic of 6.78 with a p-value of 0.002.
Post-hoc Tukey's HSD tests reveal a mean difference of 2.5 between cardio exercises and strength
training (p = 0.015), a mean difference of 1.8 between cardio exercises and yoga sessions (p = 0.042),
but a mean difference of 0.7 between strength training and yoga sessions (p = 0.498). Discuss these
findings comprehensively, incorporating both the ANOVA outcomes and additional statistical
comparisons. What insights can be inferred regarding the effectiveness of the workout routines?

22. You are analyzing the factors affecting the demand for a certain commodity. A regression analysis
was conducted using the following variables: price of the commodity, consumer income, and the price
of other commodities. The results of the regression analysis are summarized in the table below:

Variable                         | Coefficient (β)      | p-value
Intercept (β₀)                   | 500 units            | 0.002
Price of the Commodity (β₁)      | -20 units per dollar | <0.001
Consumer Income (β₂)             | 0.8 units per dollar | 0.004
Price of Other Commodities (β₃)  | -15 units per dollar | 0.007

Model Summary | Value
R²            | 0.85
Adjusted R²   | 0.83
F-statistic   | 50.12 (p < 0.001)
AIC           | 1234.56
BIC           | 1250.78

1. Interpret the coefficient for the price of the commodity (β₁). What does this coefficient indicate about
the relationship between the price of the commodity and its demand?
2. What is the significance of the consumer income variable (β₂) in affecting the demand for the
commodity? Explain the meaning of the p-value associated with this variable.
3. Explain the impact of the price of other commodities (β₃) on the demand for the commodity. How does
the coefficient for this variable help in understanding consumer behavior?
4. How well does the model explain the variance in demand for the commodity? Refer to the R² and
Adjusted R² values in your explanation.
5. Evaluate the overall significance of the regression model. What do the F-statistic and its p-value
indicate about the model?
6. Discuss the purpose of the AIC and BIC values provided in the model summary. Why are these values
important in regression analysis?
7. Write the regression equation.
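
As a rough illustration of how the fitted coefficients in Question 22 combine, the sketch below evaluates Demand = 500 − 20·(price) + 0.8·(income) − 15·(price of other commodities). The input values are hypothetical and chosen only to show the arithmetic; the function name is illustrative.

```python
def predicted_demand(price, income, other_price):
    """Predicted demand (units) from the coefficients reported in Question 22."""
    return 500 - 20 * price + 0.8 * income - 15 * other_price


# Hypothetical inputs (not from the problem): price $10, income $1,000, other price $5
print(predicted_demand(price=10, income=1000, other_price=5))  # 1025.0
```

Raising the price by one dollar while holding the other inputs fixed lowers the prediction by 20 units, which is the meaning of β₁ = −20.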

23. This table provides a summary of the regression analysis, showing the impact and significance of each
variable (product price, marketing expenditure, sales team size, and customer satisfaction) on the sales
performance.

Variable                    | Coefficient (β)   | p-value
Intercept (β₀)              | $20,000           | 0.001
Product Price (β₁)          | -$100 per dollar  | <0.001
Marketing Expenditure (β₂)  | $250 per dollar   | <0.001
Sales Team Size (β₃)        | $500 per person   | 0.003
Customer Satisfaction (β₄)  | $1,200 per point  | 0.005

Model Summary | Value
R²            | 0.88
Adjusted R²   | 0.87
F-statistic   | 55.67 (p < 0.001)
AIC           | 890.45
BIC           | 905.67

1. Interpret the coefficient for the product price (β₁). What does a coefficient of -$100 per dollar imply
about the relationship between product price and sales performance?
2. The p-value for marketing expenditure (β₂) is <0.001. Explain the significance of this p-value in the
context of the regression model. What does it suggest about the reliability of the relationship between
marketing expenditure and sales performance?
3. Write the regression equation.
4. How well does the model explain the variance in sales performance? Refer to the R² and Adjusted R²
values in your explanation.

Case Study:
Mr. Mohan Mehta has a chain of restaurants in many cities of northern India and was interested in diversifying
his business. His only son, Kamal, never wanted to be in the hospitality line. To settle Kamal into a line which
would interest him, Mr Mehta decided to venture into garment manufacturing. He gave this idea to his son, who
liked it very much. Kamal had already done a course in fashion designing and wanted to do something different
for the consumers of this industry. An idea struck him that he should design garments for people who are very
bulky but want a lean look after wearing readymade garments. The first thing that came to his mind was to have
an estimate of people who wore large sized shirts (42 size and above) and large sized trousers (38 size and
above).

A meeting of experts from the garment industry and a number of fashion designers was called to discuss
how they should proceed. A common concern for many of them was to know the size of such a market. Another
issue that was bothering them was how to approach the respondents. It was believed that asking people about
the size of their shirt or trousers might put them off, and there might not be any worthwhile response. A suggestion
that came up was that they should employ some observers at the entrances of various malls, whose job would be
to look at people who walked into the malls and see whether each person was wearing a big-sized shirt
or trousers. This would be a better way of approaching the respondents. This procedure would help them to
estimate in a very simple way the proportion of people who wore big-sized garments.
QUESTIONS
1. Name the sampling design that is being used in the study.
2. What are the limitations of the design so chosen?
3. Can you suggest a better design?
4. What method of data collection is being employed?
