Week 6

QUALITY AND STATISTICS
DESCRIPTIVE AND INFERENTIAL STATISTICS
Descriptive and inferential statistics are two fields of statistics.
Descriptive statistics is used to describe data. It is used to summarize the
attributes of a sample in such a way that a pattern can be drawn from the
group. It enables researchers to present data in a more meaningful way
such that easy interpretations can be made.
Inferential statistics is a branch of statistics that is used to make inferences
about the population by analyzing a sample. When the population is very large,
it becomes impractical to measure every element. In such cases, samples that
are representative of the entire population are taken. Inferential statistics
draws conclusions regarding the population using these samples.
POPULATION VERSUS SAMPLE
A population is the set of all elements with the relevant characteristics. A
sample is a subset of the population: a limited number of observations taken
from it.

[Figure: POPULATION and SAMPLE diagram, labeled Parameters, Inferential Statistics, and Descriptive Statistics]
DESCRIPTIVE STATISTICS

Mean is the arithmetic average, computed by summing all the values in the
data set and dividing the sum by the number of values. For a finite data set
with measurement values $X_1, X_2, \ldots, X_n$ (a set of $n$ numbers), it is
defined by the formula:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

Median is the middle value of the data set when it is arranged in ascending
order (small to large). If the number of observations $n$ is odd, the median is
the $(n+1)/2$-th ordered value. If $n$ is even, the median is the average of the
two middle values (the $n/2$-th and the $(n/2+1)$-th ordered values).

For a given data set: 12, 14, 11, 12, 12, 12, 15, 17, 22, 15, 12
Ascending order: 11, 12, 12, 12, 12, 12, 14, 15, 15, 17, 22
There are n = 11 observations, so the median is the 6th ordered value: Median = 12

For a given data set: 12, 14, 11, 12, 12, 12, 15, 17, 22, 15, 12, 19
Ascending order: 11, 12, 12, 12, 12, 12, 14, 15, 15, 17, 19, 22
There are n = 12 observations, so the median is the average of the 6th and 7th
ordered values: Median = (12 + 14)/2 = 13
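
The mean and median rules above can be checked with a short Python sketch. The function names `mean` and `median` are illustrative helpers written for this example, not part of the slides.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def median(values):
    """Median by the odd/even rule described above."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd n: the (n+1)/2-th ordered value (1-based) sits at index mid
        return ordered[mid]
    # even n: average of the n/2-th and (n/2 + 1)-th ordered values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [12, 14, 11, 12, 12, 12, 15, 17, 22, 15, 12]
even_data = odd_data + [19]

print(mean(odd_data))     # 14.0
print(median(odd_data))   # 12
print(median(even_data))  # 13.0
```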

Mode is the data point having the highest frequency (maximum occurrences).

For a given data set: 12, 14, 11, 12, 12, 12, 15, 17, 22, 15, 12
Maximum occurring data point, Mode = 12
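
As a small illustration, the mode of the same data set can be found by counting occurrences. This is a sketch using `collections.Counter`; it returns every value tied for the highest frequency, which is an assumption about how ties should be handled.

```python
from collections import Counter

data = [12, 14, 11, 12, 12, 12, 15, 17, 22, 15, 12]

counts = Counter(data)                 # value -> number of occurrences
highest = max(counts.values())         # the maximum frequency
modes = [value for value, count in counts.items() if count == highest]

print(modes, highest)  # [12] 5
```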

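A quartile is any of the three values which divide the sorted data set into
four equal parts, so that each part represents one fourth of the sampled
population. The position of each quartile in the sorted data depends on $n$:

If $n$ is an odd number:

$$Q_1 = \frac{n+1}{4}, \qquad Q_2 = \frac{n+1}{2} \ (\text{Median}), \qquad Q_3 = \frac{3(n+1)}{4}$$

If $n$ is an even number:

$$Q_1 = \frac{n}{4}, \qquad Q_2 = \frac{n}{2} \ (\text{Median}), \qquad Q_3 = \frac{3n}{4}$$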

The difference between the upper quartile ($Q_3$) and the lower quartile ($Q_1$)
is called the interquartile range: $IQR = Q_3 - Q_1$. It can be used to check
for outliers.
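
A minimal sketch of the quartile and interquartile range calculation for the 11-value data set used earlier. It relies on `statistics.quantiles` with `method="exclusive"`, which places the quartiles at positions (n+1)/4, (n+1)/2 and 3(n+1)/4 and so matches the odd-n rule for this data. The outlier fences at 1.5 × IQR are a common convention (Tukey's rule) and an assumption here; the slides do not state which rule they use.

```python
from statistics import quantiles

data = [12, 14, 11, 12, 12, 12, 15, 17, 22, 15, 12]

# Quartiles at positions (n+1)/4, (n+1)/2, 3(n+1)/4 of the sorted data
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

# Assumed outlier rule: values beyond 1.5 * IQR from the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3, iqr)  # 12.0 12.0 15.0 3.0
print(outliers)         # [22]
```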

Standard deviation can be interpreted as the average distance of the
individual observations from the mean. Standard deviation of the population is
represented as "$\sigma$". Standard deviation of the sample is represented as
"$s$".

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \qquad s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

Variance is defined as the square of the standard deviation. Variance of the
population is represented as "$\sigma^2$". Variance of the sample is
represented as "$s^2$".

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \qquad s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
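
The population and sample formulas differ only in the divisor (N versus n − 1). The sketch below applies both to the 11-value data set from the median example; treating the same numbers once as a full population and once as a sample is purely for illustration.

```python
from math import sqrt

data = [12, 14, 11, 12, 12, 12, 15, 17, 22, 15, 12]
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean, shared by both formulas
ss = sum((x - mean) ** 2 for x in data)

pop_var = ss / n           # sigma^2: divide by N
sample_var = ss / (n - 1)  # s^2: divide by n - 1
pop_sd = sqrt(pop_var)
sample_sd = sqrt(sample_var)

print(ss, round(pop_var, 4), round(sample_var, 4))  # 104.0 9.4545 10.4
print(round(pop_sd, 4), round(sample_sd, 4))
```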

Range is defined as the difference between the largest value and the smallest
value in a data set.

$$\text{Range} = \text{Value}_{\max} - \text{Value}_{\min}$$



[Figure: distributions with LEFT SKEW and RIGHT SKEW]


INFERENTIAL STATISTICS
Hypothesis Tests for Means
A company needs aluminum plates with an average thickness of 0.01 cm. The control
staff took a random sample of 100 plates from the batch in production and found a
sample mean thickness of 0.009 cm and a standard deviation of 0.01 cm.
When α = 0.05, can it be said that the mean thickness of the aluminum plates in
the incoming batch is 0.01 cm?

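$n = 100$, $\bar{X} = 0.009$, $s = 0.01$, $\mu_0 = 0.01$

$H_0: \mu = 0.01$
$H_1: \mu \neq 0.01$

One sample, the population variance is unknown, $n \geq 30$, $n < 0.05N$.

$$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{0.009 - 0.01}{0.01/\sqrt{100}} = -1$$

Since $Z = -1$ lies between the two-tailed critical values $\pm 1.96$ for
$\alpha = 0.05$: $H_0$ ACCEPT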

For α = 0.05, we do not have sufficient evidence that the mean thickness of the
aluminum plates produced differs from 0.01 cm.
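
The two-tailed test above can be reproduced with the Python standard library. This is a sketch: the variable names are illustrative, and the critical value is computed from `statistics.NormalDist` rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s, mu0, alpha = 100, 0.009, 0.01, 0.01, 0.05

# Test statistic: sample standard deviation stands in for sigma since n >= 30
z = (x_bar - mu0) / (s / sqrt(n))

# Two-tailed test: alpha is split between both tails
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

print(round(z, 2), round(z_crit, 2), reject_h0)  # -1.0 1.96 False
```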
Hypothesis Tests for Means
It is known that the average strength of the wires produced in a steel wire
production company is 300 kp/mm² and the standard deviation is 24 kp/mm². Because
the management thought that the strength of the wire would increase if it were
produced by another method, it carried out a trial production. The mean strength
of 64 coils of wire randomly selected from this trial production was determined
as 310 kp/mm². Test this idea for α = 0.01.
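$n = 64$, $\mu_0 = 300$, $\sigma = 24$, $\bar{X} = 310$

$H_0: \mu = 300$
$H_1: \mu > 300$

One sample, the population variance is known, $n < 0.05N$.

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{310 - 300}{24/\sqrt{64}} \cong 3.33$$

Since $Z = 3.33$ exceeds the critical value $Z = 2.33$: $H_0$ REJECT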
For α = 0.01, we have sufficient evidence that the new production technique
increases the mean strength of the steel wire.
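
For a right-tailed test the whole of α sits in the upper tail, so the critical value is the 0.99 quantile of the standard normal. A brief sketch of the wire example, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma, mu0, alpha = 64, 310, 24, 300, 0.01

z = (x_bar - mu0) / (sigma / sqrt(n))

# Right-tailed test: reject H0 when z exceeds the upper critical value
z_crit = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z > z_crit

print(round(z, 2), round(z_crit, 2), reject_h0)  # 3.33 2.33 True
```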
Hypothesis Tests for Means
Akçubuk Inc. is a company that produces steel rods. While production was in
progress, the company's control staff randomly selected 64 steel rods and examined
their torsional strength. The rods twisted at an average force of 4700 kg, with a
standard deviation of 800 kg. The company wants the rods it produces to withstand
a force of at least 5000 kg. Examine this situation for α = 0.01.
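$n = 64$, $\bar{X} = 4700$, $s = 800$, $\mu_0 = 5000$

$H_0: \mu \geq 5000$
$H_1: \mu < 5000$

One sample, the population variance is unknown, $n \geq 30$, $n < 0.05N$.

$$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{4700 - 5000}{800/\sqrt{64}} = -3$$

Since $Z = -3$ is below the critical value $Z = -2.33$: $H_0$ REJECT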
For α = 0.01, we have sufficient evidence that the mean torsional strength of
the rods produced by the company is below 5000 kg.
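
The one-sample Z tests in this section differ only in which tail holds the rejection region. The sketch below wraps that choice in a hypothetical helper, `z_test`, whose name and `tail` argument are assumptions made for this example, and applies it to the steel rod data.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu0, sd, n, alpha, tail):
    """Return (z, reject_h0) for a one-sample Z test.

    tail is "left", "right" or "two" (illustrative convention).
    """
    z = (x_bar - mu0) / (sd / sqrt(n))
    std_normal = NormalDist()
    if tail == "left":
        reject = z < std_normal.inv_cdf(alpha)
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    else:
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    return z, reject

# Steel rods: H0: mu >= 5000 against H1: mu < 5000
z, reject_h0 = z_test(4700, 5000, 800, 64, 0.01, "left")
print(z, reject_h0)  # -3.0 True
```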
Hypothesis Tests for Means
In a factory that produces 500-gram boxes of detergent, the standard deviation
of the process is known to be 10 grams. The average weight of 50 boxes randomly
selected from this process was determined as 505 grams. Test whether the
production complies with the standard at the 0.05 significance level.

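$n = 50$, $\mu_0 = 500$, $\sigma = 10$, $\bar{X} = 505$

$H_0: \mu = 500$
$H_1: \mu \neq 500$

One sample, the population variance is known, $n < 0.05N$.

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{505 - 500}{10/\sqrt{50}} \cong 3.54$$

Since $Z = 3.54$ lies outside the two-tailed critical values $\pm 1.96$ for
$\alpha = 0.05$: $H_0$ REJECT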


For α = 0.05, we have sufficient evidence that the mean weight differs from
500 grams, so the detergents produced do not comply with the standard.
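
An equivalent way to reach the same decision is to compare a p-value with α instead of comparing Z with a critical value. The slides use critical values only, so the p-value route below is offered as a supplementary sketch.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma, mu0, alpha = 50, 505, 10, 500, 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))

# Two-tailed p-value: probability of a statistic at least this extreme under H0
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = p_value < alpha

print(round(z, 2), reject_h0)  # 3.54 True
```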
Hypothesis Tests for Means
The battery manufacturer knows that A3 type battery life is normally
distributed with a standard deviation of 1.25 hours. Ten batteries were randomly
selected from the batch of 150 batteries produced last week, and their average
life was determined as 40.5 hours. Can it be said that the A3 type battery life
is more than 40 hours at the 0.05 significance level?
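
The worked solution for this problem is not part of the surviving text, so the following is only one possible sketch. It assumes a right-tailed Z test (σ known, normal population) and, because n = 10 is more than 5% of N = 150, applies the same finite population correction factor that the biscuit example in this deck uses.

```python
from math import sqrt
from statistics import NormalDist

N, n = 150, 10
x_bar, sigma, mu0, alpha = 40.5, 1.25, 40, 0.05

# n > 0.05 * N, so the standard error is scaled by sqrt((N - n) / (N - 1))
assert n > 0.05 * N
std_error = (sigma / sqrt(n)) * sqrt((N - n) / (N - 1))

z = (x_bar - mu0) / std_error
z_crit = NormalDist().inv_cdf(1 - alpha)   # right-tailed, about 1.645
reject_h0 = z > z_crit

print(round(z, 3), round(z_crit, 3), reject_h0)
```

Under these assumptions Z stays below the critical value, so the data would not support the claim that the mean life exceeds 40 hours.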
Hypothesis Tests for Means
In a factory that produces 250-gram boxes of biscuits, whether the production is
under control is examined at frequent intervals by randomly selecting 25 boxes
from every 400 boxes. The box weights are known to follow a normal distribution.
For one such sample, the mean is 247.67 grams and the standard deviation is
6 grams. Examine the production situation at the 0.01 significance level.
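$N = 400$, $n = 25$, $\mu_0 = 250$, $\bar{X} = 247.67$, $s = 6$

$H_0: \mu = 250$
$H_1: \mu \neq 250$

One sample, the population variance is unknown, $n < 30$,
$n > 0.05N$ ($25 > 0.05 \times 400$).

$$t = \frac{\bar{X} - \mu_0}{\dfrac{s}{\sqrt{n}}\sqrt{\dfrac{N - n}{N - 1}}} = \frac{247.67 - 250}{\dfrac{6}{\sqrt{25}}\sqrt{\dfrac{400 - 25}{400 - 1}}} \cong -2.003$$

With 0.005 in each tail, the critical values are $\pm t_{0.005;24} = \pm 2.797$.
Since $t = -2.003$ lies between them: $H_0$ ACCEPT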
For α = 0.01, we do not have sufficient evidence that production is out of control.
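
A short sketch of the same calculation, showing where the finite population correction enters. The critical value 2.797 is taken from the example itself rather than computed, since the Python standard library has no t distribution.

```python
from math import sqrt

N, n = 400, 25
x_bar, s, mu0 = 247.67, 6, 250
t_crit = 2.797  # two-tailed critical value for alpha = 0.01, 24 degrees of freedom

# Finite population correction applies because n > 0.05 * N
fpc = sqrt((N - n) / (N - 1))
std_error = (s / sqrt(n)) * fpc

t = (x_bar - mu0) / std_error
reject_h0 = abs(t) > t_crit

print(round(t, 3), reject_h0)  # -2.003 False
```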
Hypothesis Tests for Means
According to the data below, can it be said that the average is greater than 0.82
when α = 0.05?
0.8411 0.8580 0.8182 0.8125 0.8750
0.8042 0.8532 0.8483 0.8276 0.7983
0.8191 0.8730 0.8282 0.8359 0.8660
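
The solution to this problem is not included in the surviving text. One way to approach it, sketched below, is a right-tailed one-sample t test with n − 1 = 14 degrees of freedom, assuming the data come from an approximately normal population. The critical value 1.761 is the standard table value for t(0.05; 14) and is an input assumed here, not something stated in the slides.

```python
from math import sqrt
from statistics import mean, stdev

data = [
    0.8411, 0.8580, 0.8182, 0.8125, 0.8750,
    0.8042, 0.8532, 0.8483, 0.8276, 0.7983,
    0.8191, 0.8730, 0.8282, 0.8359, 0.8660,
]
mu0 = 0.82
t_crit = 1.761  # assumed table value: right-tailed, alpha = 0.05, df = 14

n = len(data)
x_bar = mean(data)
s = stdev(data)            # sample standard deviation (divisor n - 1)

t = (x_bar - mu0) / (s / sqrt(n))
reject_h0 = t > t_crit     # H0: mu <= 0.82 against H1: mu > 0.82

print(round(x_bar, 4), round(s, 4), round(t, 2), reject_h0)
```

Under these assumptions t exceeds the critical value, which would support the statement that the mean is greater than 0.82.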
INFERENTIAL STATISTICS
Hypothesis Tests for Two Population Means
Hypothesis Tests for Means
A and B are two companies that produce car tires. It has been determined that the
lifespan of the tires produced by these companies follows a normal distribution
with an average of 65000 km and a standard deviation of 4800 km.
An automobile manufacturer, which buys tires from both companies, decided to test
at α = 0.05 whether the tires it bought from the two companies in March have the
same lifespan.
A random sample of 25 tires was taken from the March production of each company.
The average life of the tires produced by company A was found to be 64190 km, and
the average life of the tires produced by company B was 65450 km. Can the
lifespans of the tires produced by the two companies be considered equal?
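$n_1 = 25$, $\bar{X}_1 = 64190$, $n_2 = 25$, $\bar{X}_2 = 65450$, $\sigma = 4800$

$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$

Two samples, difference between two means, independent samples, population
variances are known.

$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{64190 - 65450}{\sqrt{\dfrac{4800^2}{25} + \dfrac{4800^2}{25}}} = -0.928$$

Since $Z = -0.928$ lies between the two-tailed critical values $\pm 1.96$ for
$\alpha = 0.05$: $H_0$ ACCEPT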
For α = 0.05, we do not have sufficient evidence that the lifespans of the tires
produced by the two companies differ, so they can be considered equal.
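
The two-sample Z statistic can be verified in a few lines. This is a sketch with illustrative variable names; the critical value comes from `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

n1, x_bar1 = 25, 64190
n2, x_bar2 = 25, 65450
sigma, alpha = 4800, 0.05   # same known standard deviation for both companies

# Standard error of the difference between two independent sample means
std_error = sqrt(sigma**2 / n1 + sigma**2 / n2)

z = (x_bar1 - x_bar2) / std_error
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

print(round(z, 3), reject_h0)  # -0.928 False
```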
Hypothesis Tests for Means
One of two accumulator manufacturers claims that the batteries it produces last
at least as long as its rival's. It is known that battery durability follows a
normal distribution and that the standard deviations of the two companies are
equal. To investigate the validity of the company's claim at the 0.05
significance level, 15 batteries were randomly selected from the production of
each company and subjected to a life test. The average life of the batteries
produced by the claimant company is 2240 hours and the standard deviation is
100 hours, while the average life of the batteries produced by the rival company
is 2200 hours and the standard deviation is 90 hours. Examine the company's claim.
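$n_1 = 15$, $\bar{X}_1 = 2240$, $s_1 = 100$, $n_2 = 15$, $\bar{X}_2 = 2200$, $s_2 = 90$

$H_0: \mu_1 \geq \mu_2$
$H_1: \mu_1 < \mu_2$

Two samples, difference between two means, independent samples, the population
variances are unknown and considered equal.

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{2240 - 2200}{\sqrt{\dfrac{(15 - 1)100^2 + (15 - 1)90^2}{15 + 15 - 2}}\sqrt{\dfrac{1}{15} + \dfrac{1}{15}}} = 1.152$$

The critical value is $t_{0.05;28} = -1.701$. Since $t = 1.152$ is greater than
$-1.701$: $H_0$ ACCEPT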
For α = 0.05, we do not have sufficient evidence to reject the manufacturer's claim.
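
The pooled-variance t statistic above can be sketched as a small function. `pooled_t` is a hypothetical helper written for this example, and the critical value −1.701 is the one quoted in the example.

```python
from math import sqrt

def pooled_t(x_bar1, s1, n1, x_bar2, s2, n2):
    """Two-sample t statistic assuming equal population variances."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    std_error = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    return (x_bar1 - x_bar2) / std_error

t = pooled_t(2240, 100, 15, 2200, 90, 15)
t_crit = -1.701            # left-tailed critical value t(0.05; 28) from the example

reject_h0 = t < t_crit     # H0: mu1 >= mu2 against H1: mu1 < mu2
print(round(t, 3), reject_h0)  # 1.152 False
```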
Hypothesis Tests for Means
A company procures a raw material in bags from two different suppliers. The
weight of 15 bags of raw material purchased from the first supplier fits a normal
distribution with a mean of 4.7 kg and a standard deviation of 2 kg, while the
weight of 15 bags of raw material bought from the second supplier fits a normal
distribution with a mean of 7.8 kg and a standard deviation of 2.5 kg. Can it be
said that the average weights of the raw material coming from the two suppliers
are the same at the 0.05 significance level?
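
No solution for this problem survives in the text. The sketch below assumes the same pooled-variance t test as the battery example, which requires treating the two population variances as equal; the problem statement does not say so, so that is an assumption. The two-tailed critical value 2.048 is the standard table value for t(0.025; 28), also supplied here rather than taken from the slides.

```python
from math import sqrt

n1, x_bar1, s1 = 15, 4.7, 2.0
n2, x_bar2, s2 = 15, 7.8, 2.5
t_crit = 2.048  # assumed table value: two-tailed, alpha = 0.05, df = 28

# Pooled variance (equal-variance assumption) and standard error
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))

t = (x_bar1 - x_bar2) / std_error
reject_h0 = abs(t) > t_crit   # H0: mu1 = mu2 against H1: mu1 != mu2

print(round(pooled_var, 3), round(t, 2), reject_h0)  # 5.125 -3.75 True
```

Because both samples have the same size, the statistic is the same whether the variances are pooled or not; only the degrees of freedom would change. Under these assumptions the equality of the two means would be rejected.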