DA Unit 3 T Test (Class

CSE Department, KIT’s College of Engg.

Course: Data Analytics (Theory), UCS0622, TYCSE: Semester II, 2022-23
Instructor: Dr. Kapil B. Kadam, (KBK)

Data Analytics (DA)

Unit 3

■ The t-test is a statistical test that is mainly used to compare the means of two groups of samples.
■ It evaluates whether the means of the two sets of data are statistically significantly different from each other.

■ One-sample t-test
It is used to compare the mean of a population with a theoretical value.
or
If there is one group being compared against a standard value, perform a one-sample t-test.
For example
- Mean heart rate of a group of people is equal to 65 or not
- Avg marks of a class are equal to 70 or not

■ Two-sample (independent) t-test
It is used to compare the means of two independent samples.
or
If the groups come from two different populations, perform a two-sample or independent t-test.
For example
- Mean heart rates for two groups of people are the same or not
- Avg marks of two different classes are the same or not

■ Paired t-test
It is used to compare the means between two groups of samples that are related.
or
If the groups come from a single population, perform a paired t-test.
For example
- Mean difference in heart rate for the same group of people before and after exercise is zero or not
- Difference between two exams (after some interval) of the same course
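A minimal sketch of the one-sample and paired cases, assuming the standard formula t = (mean - mu0) / (s / sqrt(n)) and, for paired data, the same statistic applied to the per-person differences. The function names and the heart-rate values are hypothetical, invented only for illustration.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t statistic for testing whether the sample mean equals mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (mean - mu0) / (s / math.sqrt(n))


def paired_t(before, after):
    """t statistic for testing whether the mean paired difference is zero."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0.0)


# Hypothetical heart rates in beats per minute (not from the slides).
resting = [66, 68, 64, 70, 67]
print(one_sample_t(resting, 65))   # approximately 2.0

before = [70, 72, 68, 74]
after = [75, 76, 74, 79]
print(paired_t(before, after))     # approximately 12.25
```

A paired test reduces to a one-sample test on the differences, which is why `paired_t` simply reuses `one_sample_t` with a theoretical value of zero.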

How to perform a t-test
For all of the t-tests involving means, you perform the same steps in analysis:

■ Define your hypotheses, null (H0) and alternative (Ha), before collecting your data.

■ Select a Level of Significance i.e. the alpha value (or 𝛼 value). For
example, 𝛼=0.05 when comparing two independent groups.

■ Check the data for errors.

■ Perform the test and draw your conclusion.

■ All t-tests for means involve calculating a test statistic.

■ You compare the test statistic to a theoretical value from the t-distribution.

■ The theoretical value involves both the 𝛼 value and the degrees of freedom for your data.
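The comparison step can be sketched as a table lookup followed by a decision. This is only an illustration: the dictionary holds a few two-tailed critical values for 𝛼 = 0.05 taken from a standard t table, and the names `T_CRITICAL_05` and `decide` as well as the sample t values are hypothetical.

```python
# Two-tailed critical t values for alpha = 0.05, keyed by degrees of freedom.
# Only a handful of rows from a standard t table are included here.
T_CRITICAL_05 = {4: 2.776, 5: 2.571, 6: 2.447, 8: 2.306,
                 10: 2.228, 18: 2.101, 20: 2.086}


def decide(t_value, df, table=T_CRITICAL_05):
    """Compare |t| with the critical value for the given degrees of freedom."""
    critical = table[df]  # raises KeyError if df is not in this small table
    if abs(t_value) > critical:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


print(decide(3.1, 10))    # reject the null hypothesis
print(decide(-1.2, 18))   # fail to reject the null hypothesis
```

The absolute value is used because a two-tailed test treats large negative and large positive t values the same way.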
T-test Formula (for independent samples)

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

t Calculated t-test value
x̅1 Mean of first set of values
x̅2 Mean of second set of values
s1 Standard deviation of first set of values
s2 Standard deviation of second set of values
n1 Total number of values in first set
n2 Total number of values in second set

Standard deviation

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

s The standard deviation for a data set
x Values given in data set
x̅ Mean value of data set
n Total number of values in the data set
■ Find the t-test value for the following given two sets of values:
Set A 7, 2, 9, 8
Set B 1, 2, 3, 4

Null Hypothesis
There is no significant difference between the means of the two groups

Significance level 𝛼 = 0.05

● For 1st dataset
  - Find mean
  - Compute standard deviation
● For 2nd dataset
  - Find mean
  - Compute standard deviation
● Compute t value
● Calculate degrees of freedom
● Compare the absolute t value with the critical t value
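The steps above can be sketched in Python for Set A and Set B. The sketch assumes the sample standard deviation (dividing by n - 1), the standard error sqrt(s1^2/n1 + s2^2/n2), and n1 + n2 - 2 degrees of freedom; the variable names are illustrative, and the critical value is taken from a standard two-tailed t table at 𝛼 = 0.05.

```python
import math
import statistics

set_a = [7, 2, 9, 8]
set_b = [1, 2, 3, 4]

# Steps 1-2: mean and sample standard deviation (n - 1) of each set.
mean_a, sd_a = statistics.mean(set_a), statistics.stdev(set_a)
mean_b, sd_b = statistics.mean(set_b), statistics.stdev(set_b)

# Step 3: t value, using sqrt(s1^2/n1 + s2^2/n2) as the standard error.
n_a, n_b = len(set_a), len(set_b)
t_value = (mean_a - mean_b) / math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)

# Step 4: degrees of freedom, assumed here to be n1 + n2 - 2.
df = n_a + n_b - 2

# Step 5: compare |t| with the two-tailed critical value for df = 6
# at alpha = 0.05, which a standard t table lists as 2.447.
critical = 2.447
reject_null = abs(t_value) > critical

print(mean_a, mean_b)          # 6.5 2.5
print(round(t_value, 2), df)   # 2.38 6
print(reject_null)             # False
```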

Do you wish to try it by yourself?

2.38 < 2.447

Here 2.38 is the absolute t-value and 2.447 is the critical value for 6 degrees of freedom (n1 + n2 - 2) at 𝛼 = 0.05.

If the absolute value of the t-value is greater than the critical value, you reject the null hypothesis.

If the absolute value of the t-value is less than the critical value, you fail to reject the null hypothesis.

So here, the absolute value 2.38 is less than the critical value (2.447), so we fail to reject the null hypothesis.
Take Notes

Exercise 1
Test whether there is a significant difference in the average marks between boys and girls in your class. We randomly select 10 boys and 10 girls and record their marks. The data set is as follows:

Boys: 67, 64, 72, 63, 71, 69, 70, 71, 63, 68
Girls: 62, 70, 66, 68, 65, 61, 72, 64, 76, 63

Conduct an independent samples t-test to determine if there is a significant difference. Use a significance level of 0.05.

Exercise 2
Suppose you are interested in determining whether there is a significant
difference in the mean weight of apples produced by two different
orchards. You randomly select 10 apples from Orchard A and 12 apples
from Orchard B, and weigh each apple. The data set is as follows:

Orchard A: 100g, 120g, 110g, 90g, 115g, 105g, 95g, 110g, 115g, 105g
Orchard B: 130g, 125g, 135g, 120g, 130g, 135g, 130g, 125g, 130g, 135g, 140g, 125g

Conduct an independent samples t-test to determine if there is a significant difference in mean weight between the two orchards. Use a significance level of 0.05.

What if I have more than two datasets?
Set A 7, 2, 9, 8
Set B 1, 2, 3, 4
Set C 2, 5, 6, 7

We use ANOVA (Analysis of Variance).

Explore ANOVA before coming to the next class.

■ https://www.scribbr.com/statistics/t-test/

■ https://www.jmp.com/en_in/statistics-knowledge-portal/t-test.html

■ https://www.toppr.com/guides/maths-formulas/t-test-formula/#Wh

■ https://www.scribbr.com/statistics/students-t-table/

CSE Department, KIT’s College of Engg. Kolhapur
Course: Data Analytics (Theory), UCS0622, TYCSE: Semester II, 2022-23
Instructor: Dr. Kapil B. Kadam, (KBK)

Thank You!
