Hypothesis Testing 3

Hypothesis Testing

Theory & STATA

Eeman Qureshi
Inferential Statistics
• Inferential statistics provide the basis for predictions, forecasts and estimates that are used to transform information into knowledge.

• In inferential statistics, we take our sample data and calculate our sample statistics.

• We can then use those sample statistics to estimate the population parameter, which is often what we’re really looking to understand.

Overview
• Hypothesis testing is a fundamental statistical technique used to determine whether there is enough evidence in a sample of data to infer that a certain condition holds for the entire population.
• At its core, hypothesis testing involves making an assumption (the null hypothesis) about a population parameter and then using sample data to decide whether to reject this assumption or fail to reject it.

Key Aspects of Hypothesis Testing:

• Null Hypothesis (H0): This is a statement of no effect or no difference and serves as the default assumption. For example, "There is no difference in average test scores between two groups of students."
• Alternative Hypothesis (H1 or Ha): This contradicts the null hypothesis. It is the claim the researcher is looking for evidence to support. For example, "Group A has a higher average test score than Group B."
• Significance Level (α): It's the threshold for rejecting the null hypothesis. It is commonly set at 0.05, which means accepting a 5% chance of rejecting the null hypothesis when it's actually true (Type I error).
• P-value: The probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed in the sample. A low p-value (typically ≤ 0.05) indicates strong evidence against the null hypothesis.
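In symbols, the significance level and the p-value combine into a simple decision rule. This is a standard restatement rather than something taken from the slides; $T$ (the test statistic) and $t_{\text{obs}}$ (its observed value) are notation introduced here for illustration.

```latex
\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}), \qquad
p = P\bigl(T \text{ at least as extreme as } t_{\text{obs}} \mid H_0 \text{ is true}\bigr), \qquad
\text{reject } H_0 \iff p \le \alpha .
```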

Importance in Research

• Decision Making: Hypothesis testing provides a systematic way to make decisions based on data, reducing bias and subjectivity.
• Understanding Relationships: It helps researchers understand relationships and patterns in data.
• Scientific Rigor: Hypothesis testing adds rigor and validity to research findings by checking that results are unlikely to be due to chance alone.

Real World Examples
• Medical Research: Testing the effectiveness of new drugs or
treatments. For example, comparing the recovery rates of patients using
a new medication versus a placebo to determine the drug's effectiveness.
• Environmental Studies: Evaluating the impact of environmental
policies or changes. For instance, testing if a new pollution control
measure leads to a significant reduction in air pollution levels.
• Education: Assessing the impact of new teaching methods or
curriculum changes. For example, testing whether students score higher
on standardized tests after a new teaching method is implemented.

Real World Example in Education: Evaluating the Impact of
Smaller Class Sizes

• Context: A school district implements a policy to reduce class sizes in elementary schools with the belief that smaller class sizes will improve student academic performance. To assess the effectiveness of this policy, a hypothesis test is conducted.

Real World Example in Education:
Evaluating the Impact of Smaller Class Sizes
• H0: Reducing class sizes in elementary schools has no effect on student academic performance.
• Ha: Reducing class sizes in elementary schools improves student academic performance.

Types of hypothesis testing using t-test
• Two-Tailed Test: Appropriate when the research is open to the possibility
that smaller class sizes could either positively or negatively impact
performance. The key here is that any significant difference, regardless of
direction, would lead to rejecting the null hypothesis.
• One-Tailed Test (Positive Direction): Used when the specific theory or
prediction is that smaller class sizes will improve performance. Only a
significant improvement would lead to rejecting the null hypothesis.
• One-Tailed Test (Negative Direction): Applied when there is a specific
prediction that smaller class sizes might harm performance. Only a
significant decrease in performance would lead to rejecting the null
hypothesis.
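Written out for the class-size example, the three tests share one null hypothesis and differ only in the alternative. The symbols $\mu_{\text{small}}$ and $\mu_{\text{regular}}$ (mean performance in smaller and in regular-sized classes) are notation assumed here for illustration, not taken from the slides.

```latex
\begin{aligned}
H_0 &: \mu_{\text{small}} = \mu_{\text{regular}} \\
\text{Two-tailed } H_a &: \mu_{\text{small}} \neq \mu_{\text{regular}} \\
\text{One-tailed (positive) } H_a &: \mu_{\text{small}} > \mu_{\text{regular}} \\
\text{One-tailed (negative) } H_a &: \mu_{\text{small}} < \mu_{\text{regular}}
\end{aligned}
```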

T-Test
1) One-sample t-test: n − 1 degrees of freedom (df)

t-statistic formula: (sample mean minus the mean hypothesized under H0) / standard error
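The same formula in standard notation, as a sketch: $\bar{x}$ is the sample mean, $\mu_0$ the value of the mean under H0, $s$ the sample standard deviation and $n$ the sample size. These symbols are conventional notation and do not appear in the slides.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1
```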

• Assume you want to find out whether the true average mileage is equal to 30, i.e. whether the mean of mpg is equal to 30.

Command given below:

• ttest mpg == 30
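A minimal sketch of the command in context. The slides do not name the dataset; mpg matches a variable in Stata's built-in auto dataset, so loading it with sysuse is an assumption made here. The scalar names se_hand and t_hand are illustrative. The sketch also recomputes the t-statistic by hand from the results that ttest stores, to show where the number in the output comes from.

```stata
* Load Stata's built-in auto dataset (assumed source of mpg)
sysuse auto, clear

* One-sample t-test of H0: mean(mpg) = 30
ttest mpg == 30

* Recompute the t-statistic by hand from the stored results:
* standard error = s / sqrt(n), t = (sample mean - H0 value) / std. error
scalar se_hand = r(sd_1) / sqrt(r(N_1))
scalar t_hand  = (r(mu_1) - 30) / se_hand
display "t from ttest = " r(t) "   t by hand = " t_hand "   df = " r(df_t)
```

The two t values should agree, and df should equal the number of observations minus one.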

Downside of Hypothesis Testing
• Hypothesis testing in STATA makes reverse engineering possible: you can start from the result you want and work back to a hypothesis that delivers it.
• You can arrive at whatever result you want by using hypothesis testing in STATA, but the result is credible only if you can provide substantial evidence for it.
• You need a proper theory behind your hypothesis, which involves a literature review, and a sample size chosen on the basis of that research.

Hypothesis Tests
• H0: Mean = 30 (Null Hypothesis)

• Three alternative hypotheses (highlighted in the STATA output):

● Red box: the left-tail alternative, Ha: mean < 30. Here p < 0.05, so we reject the null.

● Blue box: the right-tail alternative, Ha: mean > 30. Here the p-value is greater than 0.05, so we fail to reject the null.

● Yellow box: the two-tailed alternative, Ha: mean != 30. STATA writes the t-statistic in absolute terms, Pr(|T| > |t|), so it covers both the positive and the negative direction, which is why "two" is not written next to it. Here p < 0.05, so we reject the null of mean = 30.
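The three p-values can also be read from the results that ttest stores, as in the sketch below. It assumes the built-in auto dataset (loaded with sysuse, which the slides do not show) and uses the stored results r(p_l), r(p) and r(p_u).

```stata
sysuse auto, clear
ttest mpg == 30

* The three p-values shown under the three alternative hypotheses
display "Ha: mean < 30   Pr(T < t)     = " r(p_l)
display "Ha: mean != 30  Pr(|T| > |t|) = " r(p)
display "Ha: mean > 30   Pr(T > t)     = " r(p_u)

* The two one-sided p-values sum to 1, and the two-sided p-value is
* twice the smaller of them because the t distribution is symmetric
display "p_l + p_u       = " (r(p_l) + r(p_u))
display "2*min(p_l, p_u) = " (2*min(r(p_l), r(p_u)))
```

This is why only one of the two one-tailed alternatives can be significant for a given sample.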

Two Sample t-test
Q) You have been told that foreign cars have better mileage than domestic cars. Perform a t-test to confirm whether this is true. Based on the statement above, which of the three alternative hypotheses presented in the STATA output will you use?

Command given below:

• ttest mpg, by(foreign)

• Note: In a two-sample t-test, the grouping variable in by() must be categorical with exactly two groups (here foreign vs. domestic); the variable being tested (mpg) is numeric.
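A sketch of how the direction of the test can be checked, assuming the built-in auto dataset (in which foreign is coded 0 = Domestic, 1 = Foreign). Stata reports diff as the mean of the first group minus the mean of the second, so the sign of diff determines which tail matches a directional claim.

```stata
sysuse auto, clear

* Two-sample t-test of mpg across the two groups of foreign
ttest mpg, by(foreign)

* Stata defines diff = mean(Domestic) - mean(Foreign), so "foreign cars
* have better mileage" corresponds to the left-tail alternative, Ha: diff < 0
display "mean mpg, domestic = " r(mu_1)
display "mean mpg, foreign  = " r(mu_2)
display "Pr(T < t)          = " r(p_l)
```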

STATA Result

Matched Pairs T-test
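• Do students score better in math than in English (the reading variable)? Yes. To judge how strong the evidence is, look at the p-value of the one-tailed alternative that matches the claim. STATA tests mean(diff) = mean(math − reading), so "math is higher" corresponds to the right-tail alternative, Ha: mean(diff) > 0. The p-value is the probability of seeing a difference at least this large if the null were true; it is not the probability that the claim itself is true.
• Command given below:
ttest math == reading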
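The slides do not include the data behind this example, so the sketch below types in six made-up pairs of scores purely to show how the command behaves. The numbers are hypothetical and carry no meaning beyond illustration.

```stata
clear
* Hypothetical scores for six students (made up for illustration only)
input math reading
72 65
80 78
65 70
90 82
75 71
68 60
end

* Paired t-test: Stata tests mean(diff) = mean(math - reading) = 0
ttest math == reading

display "t statistic                   = " r(t)
display "degrees of freedom (n - 1)    = " r(df_t)
display "Pr(T > t), Ha: mean(diff) > 0 = " r(p_u)
```

Because each student contributes one difference, the degrees of freedom are the number of pairs minus one.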

Hypothesis Test on a Regression
Coefficient
• H0: mileage has no effect on the price of cars
• Ha: mileage has an effect on the price of cars
• The p-value is less than alpha, i.e. 5%.
• So p < 0.05; therefore you reject the null, and the coefficient on mileage is statistically significant.
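A sketch of where this p-value comes from, assuming the regression is price on mpg in the built-in auto dataset (the slides show only the interpretation, not the command). The scalar names t_mpg and p_mpg are illustrative.

```stata
sysuse auto, clear

* Regress price on mileage; the output reports t and P>|t| for mpg
regress price mpg

* Rebuild the two-sided p-value for H0: coefficient on mpg = 0
scalar t_mpg = _b[mpg] / _se[mpg]
scalar p_mpg = 2 * ttail(e(df_r), abs(t_mpg))
display "t = " t_mpg "   p = " p_mpg

* The same hypothesis as a Wald test (F equals t squared here)
test mpg
```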

Practice Questions: Use Lab 10.xls for Q1-Q6.
In today’s lab you will be using data from Google Trends. Google Trends is a web facility of Google Inc., based on Google Search, that shows how frequently a particular search term is entered relative to the total search volume across various regions of the world and in various languages. In today’s dataset you have the search volume for iPhone in some of the South Asian countries over time. We want to analyze how the search volume in Pakistan compares to the search volume in other South Asian countries.

• Q1) Create a new variable which is the difference between search volume in Pakistan and search volume in Bangladesh. Name this variable Diff1.
• Q2) Create a new variable which is the difference between search volume in Pakistan and search volume in India. Name this variable Diff2.
• Q3) Generate a variable 'Ratio1' that calculates the ratio of search volume in Pakistan to search volume in Bangladesh.
• Q4) Generate a variable 'Ratio2' that calculates the ratio of search volume in Pakistan to search volume in India.
• Q5) Find the mean, median and standard deviation for each of the variables Diff1, Diff2, Ratio1 and Ratio2. Provide the command in your do-files.
• Q6) You have been told that the search volume in Pakistan is greater than the search volume in India. Perform a t-test for this using the Diff2 variable and state which alternative hypothesis matches the statement above.
Use STATA’s sample dataset sp500.dta for these questions.
Q7) Open STATA’s example dataset sp500.dta in STATA.
Q8) Test the null hypothesis that the mean of open (opening price) is equal to 80 against the alternative hypothesis that the mean of opening price is greater than 80.
Q9) You have been told that the mean of opening price is higher than the mean of closing price. Perform a paired t-test to confirm whether this is true and interpret the p-value.
Use STATA’s sample dataset bplong.dta for these questions.
Q10) Open STATA’s example dataset bplong.dta and provide the command in your do-files.
Q11) You have to check whether the statement that recorded bp is higher for males than for females is true. Perform a t-test to check whether this statement holds and select the alternative hypothesis that matches the statement above.
