CH 1-2 Basic Statistics-2


Chapter One

Introduction
Definition of statistics
Definition:
1. Plural sense (layman's definition):
Statistics is defined as the collection of numerical facts or figures
( or the raw data themselves).
Examples: Statistics of births, deaths, students, imports & exports,
etc.
2. Singular sense (formal definition):
Statistics is the subject that deals with the methods of collecting,
organizing, presenting, analyzing and interpreting numerical data.
Compiled by: Bacha E., Applied Mathematics, ASTU
Statistical methods can be used to find answers to questions like:
• What kind and how much data need to be collected?
• How should we organize and summarize the data?
• How can we analyze the data and draw conclusions from
it?
• How can we assess the strength of the conclusions and
evaluate their uncertainty?
Classifications of Statistics
Depending on how data are used, statistics is sometimes divided
into two main areas or branches.



i. Descriptive Statistics
• Is the branch of statistics devoted to the summarization and
description of data.
• It includes the construction of graphs, charts, and tables, and the
calculation of various descriptive measures such as averages, measures
of variation, and percentiles.
Example:
1. Suppose that the marks of 6 students in the Basic Statistics course
are 40, 45, 50, 60, 70 and 80. The
average mark of the 6 students is 57.5.
2. 85% of the instructors in ASTU are males.
3. The average age of the football players who participated in the 2018 Russia World
Cup was 26 years.
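The average in example 1 can be checked with a short Python sketch. The variable names are illustrative only; the six marks are the ones given in the example.

```python
# Marks of the 6 students from example 1.
marks = [40, 45, 50, 60, 70, 80]

# The average is the sum of the marks divided by the number of students.
average = sum(marks) / len(marks)
print(average)  # 57.5
```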
ii. Inferential Statistics
• Consists of methods for drawing conclusions about a population, and
measuring the reliability of those conclusions, based on information
obtained from a sample of the population.
• It deals with making inferences and/or conclusions about a
population based on data obtained from a sample of observations.
• It involves making predictions and generalizing about phenomena
represented by the data.
• It consists of performing hypothesis testing, determining
relationships among variables and making predictions.

• For example, the average income of all families (the
population) in Ethiopia can be estimated from figures obtained
from a few hundred families (the sample).


Descriptive Statistics:
• Collect
• Organize
• Summarize
• Display
• Analyze
Inferential Statistics:
• Predict and forecast values of population parameters
• Test hypotheses about values of population parameters
• Make decisions

Stages of Statistical Investigation


There are five stages or steps in any statistical investigation:
1. Collection of data
2. Organization of data
3. Presentation of data
4. Analysis of data
5. Interpretation of data
Definition of some statistical terms

Population: the complete set of observations or measurements of the individuals or objects
under study.
• The word population doesn't necessarily refer to people.
e.g. - All clients of a telephone company
- All students of Adama Science and Technology University (ASTU)
- All households in Adama City
Sample: a part or subset of the population under study.
Survey: an investigation of a certain population to assess its characteristics.
It may be a census or a sample survey.
Census survey: a complete enumeration of the population under study.
Sample survey: the process of collecting data covering a representative part or portion
of a population.



• Parameter: a characteristic or measure obtained from
population data.
Examples: - Population mean (µ, read as "mu")
- Population variance (σ², read as "sigma squared")
- Population standard deviation (σ)
- etc.
• Statistic: a characteristic or measure obtained from a sample.
- That is, a statistic describes a characteristic of the sample,
which can then be used to make inferences about unknown
parameters.
Examples: - Sample mean (x̄)
- Sample variance (S²)
- Sample standard deviation (S)
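The difference between a parameter and a statistic can be illustrated with Python's standard `statistics` module. This is only a sketch under stated assumptions: the six marks from the descriptive statistics example are treated as a hypothetical population, and the sample is picked by hand rather than drawn at random.

```python
import statistics

# Assumption: treat the 6 marks from the earlier example as the whole population.
population = [40, 45, 50, 60, 70, 80]
# Assumption: a hand-picked sample (a real sample would be drawn at random).
sample = population[::2]  # [40, 50, 70]

# Parameters: computed from the whole population.
mu = statistics.mean(population)            # population mean
sigma2 = statistics.pvariance(population)   # population variance (divides by N)
sigma = statistics.pstdev(population)       # population standard deviation

# Statistics: computed from the sample only.
x_bar = statistics.mean(sample)             # sample mean
s2 = statistics.variance(sample)            # sample variance (divides by n - 1)
s = statistics.stdev(sample)                # sample standard deviation

print(mu, x_bar)
```

The sample mean differs from the population mean here, which is why inferential methods are needed to judge how reliable such estimates are.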

Sample size (n): the number of elements or observations included in the
sample.
Variable: an item of interest that can take numerical or non-numerical
values for different elements.
Example: Sex, marital status, age, weight, height, expenditure, etc.
There are two types of variable.
1. Qualitative variable: a variable that assumes non-numerical values
e.g. Sex, religion, marital status, nationality, language, hair color, etc.
2. Quantitative variable: a variable that assumes numerical values
e.g. Age, income, height, weight, family size, volume, expenditure, etc.

Note that quantitative variables are either discrete (which can assume only certain
values, and there are usually "gaps" between the values, such as the number of
bedrooms in your house) or continuous (which can assume any value within a specific
range, such as the air pressure in a tire.)

Data: a measurement or observation value recorded for
a certain element or variable.
Based on their nature, data are divided into two types:
1. Qualitative (categorical) data
2. Quantitative (numerical) data, which are further divided into:
   i. Discrete data
   ii. Continuous data


Qualitative data: are the recorded values of a qualitative
variable. Example:
• Gender – Male & Female
• Religion – Orthodox, Muslim, Catholic, Protestant, Wakefeta, etc.
• Marital Status – Single, Married, Divorced and Widowed
• Political Party
Quantitative data: are the recorded values of a quantitative
variable. Example:
• Number of children
• Weight
• Income
• Defects per hour
• Voltage
• Wage of workers

Quantitative data are of two types:
1. Discrete data
2. Continuous data
• Discrete data: take only certain values, which are known and countable.
e.g. Number of children per family, number of students in a class, etc.
• Continuous data: can take any value within a specific range.
Most of them are obtained by measurement.
e.g. Income, length, salary, weight, height, etc.
Applications, Uses and Limitations of Statistics
Applications:
Statistics is a very broad subject, with applications in a vast
number of different fields.
Statistics can be applied in any field of study which seeks
quantitative evidence.
Statistics is a crucial process behind how we make discoveries in
science, make decisions based on data, and make predictions.
Statistics has wide application in science and engineering:
- To determine the reliability of a product.
- To control the quality of products in a given production process.
- In chemical studies, for the collection and analysis of data on chemical
compounds and for more efficient management of the flow of
information.
- etc.
Functions/Uses of Statistics
• It condenses and summarizes a mass of data
• It facilitates comparison of data
• It helps to predict future trends
• It helps to formulate and review policies
• It helps in formulating and testing hypotheses


Limitations of Statistics
• It does not deal with individual values
• It does not deal with qualitative characteristics directly
• Statistical conclusions are not universally true
• It can be misused: statistics cannot be used to full advantage in the
absence of a proper understanding of the subject matter
• etc.
Level of Measurements
Proper knowledge about the nature and type of data to be dealt with is essential in
order to specify and apply the proper statistical method for their analysis and
inferences.
Four levels of measurement scales are commonly distinguished:
1. Nominal scale
- No ranking or ordering
- Arithmetic and relational (ordering) operations are not applicable
- No numerical or quantitative value
Example: - Sex (male or female)
- Marital status (married, single, widowed, divorced)
2. Ordinal scale
- Can be arranged in some order, but the differences between the data values are
meaningless
- Arithmetic operations are not applicable
- All relational operations are applicable
Example: - Letter grading (A, B, C, D, F)
- Rating scales (excellent, very good, good, fair, poor)
- Military rank (general, colonel, lieutenant, etc.)
3. Interval scale
- All relational operations are applicable
- All arithmetic operations except division are applicable: differences are
meaningful, but ratios are not
- There is no true zero, or starting point; that is, zero on the scale is
arbitrary (artificial origin)
Example: - Temperature (°C)
- Intelligence Quotient (IQ)
4. Ratio scale
- All arithmetic and relational operations are applicable
- Zero on the scale implies the absolute absence of the
characteristic under consideration
Example: - Weight, age, number of students, number of children
per family, etc.
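The difference between the interval and ratio scales can be made concrete with a small Python sketch. The temperatures and weights are made-up illustrative values, and the Kelvin conversion (adding 273.15) is used only to show that a ratio of Celsius readings depends on where zero is placed.

```python
# Two temperatures on the Celsius scale (interval level); values are illustrative.
t1_c, t2_c = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference = t2_c - t1_c  # 10 degrees

# Ratios are not: the ratio changes when the arbitrary zero point moves.
ratio_celsius = t2_c / t1_c                       # 2.0
ratio_kelvin = (t2_c + 273.15) / (t1_c + 273.15)  # roughly 1.035

# Weight is ratio level: the ratio is the same in any unit (kg or g).
w1_kg, w2_kg = 30.0, 60.0
ratio_kg = w2_kg / w1_kg
ratio_g = (w2_kg * 1000) / (w1_kg * 1000)

print(ratio_celsius, ratio_kelvin, ratio_kg, ratio_g)
```

So 20 °C is not "twice as hot" as 10 °C, whereas 60 kg really is twice as heavy as 30 kg.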
CHAPTER TWO
Methods of Data Collection and
Presentation
I. Methods of Data Collection
Based on their source, data can be classified into two types:
i. Primary data
ii. Secondary data
Primary data are those collected by the investigator for the purpose of a
specific study, whereas
secondary data are obtained from available data already collected by
some other agency for the same or a different purpose.



Examples of secondary data

Taking data from:
• Different organizations such as
  o Central Statistical Agency (CSA)
  o World Bank
  o Commercial Bank of Ethiopia
  o National Bank of Ethiopia, …
• Books
• Journals
• Internet
• etc.
The common methods of primary data collection are:
i) Direct personal interviews
   • Face-to-face interview
   • Telephone interview
ii) Written questionnaire method
iii) Experimentation
iv) Indirect interview
v) Observation



Methods of Data Presentation
There are two methods of data presentation:
1. Tabular presentation of data (Frequency Distribution)
   • Categorical (qualitative) Frequency Distribution
   • Discrete Frequency Distribution
   • Continuous Frequency Distribution
2. Diagrammatic and Graphical presentation

1. Categorical Frequency Distribution
• Used for data that can be placed in specific categories, such as
nominal or ordinal data.
• Count the occurrences in each category and find the totals.
Example: A social worker collected the following data on marital
status from 60 persons (M = married, S = single, W = widowed,
D = divorced).
M M M S S S S D W W D W D S ……

Table 1. The marital status of 60 adults
Marital status    S     M     D     W     Total
Frequency (fi)    25    20    8     7     60
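Counting the occurrences in each category can be sketched with `collections.Counter` from the Python standard library. The full list of 60 responses is elided in the example, so the list below is rebuilt from the totals in Table 1; the order of the responses is an assumption and does not affect the counts.

```python
from collections import Counter

# Assumption: the raw responses are rebuilt from the Table 1 totals
# (the original list of 60 responses is not given in full).
responses = ["S"] * 25 + ["M"] * 20 + ["D"] * 8 + ["W"] * 7

# Count the occurrences in each category, then find the total.
frequency = Counter(responses)
total = sum(frequency.values())

for status in ["S", "M", "D", "W"]:
    print(status, frequency[status])
print("Total", total)
```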

2. Discrete Frequency Distribution
• Count the number of times each possible value is repeated.
Example: In a survey of 30 families, the number of children per family
was recorded, giving the following data: 4 2 4 3 2 8 3 4 4 2 2
8 5 3 4 5 4 5 4 3 5 2 7 3 3 6 7 3 8 4.
These individual observations can be arranged in ascending order of
magnitude to form an array: 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4,
4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 7, 7, 8, 8, 8.
• The distribution of children in 30 families would be:
No. of children    2    3    4    5    6    7    8    Total
Frequency (fi)     5    7    8    4    1    2    3    30
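The array and the discrete frequency distribution above can be reproduced with a few lines of Python. This is a minimal sketch; the data are the 30 observations from the example and the variable names are illustrative.

```python
from collections import Counter

# Number of children per family for the 30 surveyed families.
children = [4, 2, 4, 3, 2, 8, 3, 4, 4, 2, 2, 8, 5, 3, 4,
            5, 4, 5, 4, 3, 5, 2, 7, 3, 3, 6, 7, 3, 8, 4]

# Arrange the observations in ascending order to form an array.
array = sorted(children)

# Count how many times each value is repeated.
frequency = Counter(children)
table = [(value, frequency[value]) for value in sorted(frequency)]

for value, f in table:
    print(value, f)
print("Total", sum(frequency.values()))
```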
3. Continuous Frequency Distribution: continuous frequency distributions
arise from continuous variables.
- When the range of the data is large, the data must be grouped
into classes that are more than one unit in width.
Basic Terms in a continuous frequency distribution
• Class frequency (or frequency): the number of
items belonging to a class.
• Class limits (C.L.): are divided into two:
i) Lower class limit (LCL)
ii) Upper class limit (UCL)
Example: Consider the marks of 40 students (out of 60) given
below.
LCL_1 = 6, LCL_2 = 12, …, LCL_6 = 36
UCL_1 = 11, UCL_2 = 17, …, UCL_6 = 41
Table 1: The marks of 40 students (out of 60)
Class limits   No. of students   Class boundary   Class mark   Relative frequency   % R.F   L.C.F   M.C.F
(mark)         (fi)              (C.B)            (C.M)        (R.F)
6 - 11         2                 5.5 - 11.5       8.5          0.05                 5       2       40
12 - 17        4                 11.5 - 17.5      14.5         0.1                  10      6       38
18 - 23        10                17.5 - 23.5      20.5         0.25                 25      16      34
24 - 29        16                23.5 - 29.5      26.5         0.4                  40      32      24
30 - 35        5                 29.5 - 35.5      32.5         0.125                12.5    37      8
36 - 41        3                 35.5 - 41.5      38.5         0.075                7.5     40      3



Unit of Measure
• Unit of measure (U): the difference between the lower class limit of a
class and the upper class limit of the preceding class.
U = LCL_{i+1} - UCL_i
e.g. From the above example, LCL_2 = 12 and UCL_1 = 11, so
U = LCL_2 - UCL_1 = 12 - 11 = 1

Class Boundary (C.B)
- Add half the unit of measure to each upper class limit to get
the upper class boundary (UCB).
- Subtract half the unit of measure from each lower class limit to get
the lower class boundary (LCB). That is,
LCB_i = LCL_i - U/2
UCB_i = UCL_i + U/2
e.g. Using the above example (the marks of 40 students), the lower and upper class
boundaries are:
LCB_1 = 5.5, LCB_2 = 11.5, …, LCB_6 = 35.5
UCB_1 = 11.5, UCB_2 = 17.5, …, UCB_6 = 41.5
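The unit of measure and the class boundaries for the marks of the 40 students can be computed as in the following Python sketch. The list names are illustrative; the class limits are the ones from the example (6-11, 12-17, …, 36-41).

```python
# Class limits from the example on the marks of 40 students.
lower_limits = [6, 12, 18, 24, 30, 36]
upper_limits = [11, 17, 23, 29, 35, 41]

# Unit of measure: U = LCL_2 - UCL_1.
u = lower_limits[1] - upper_limits[0]  # 12 - 11 = 1

# LCB_i = LCL_i - U/2 and UCB_i = UCL_i + U/2.
lcb = [lcl - u / 2 for lcl in lower_limits]
ucb = [ucl + u / 2 for ucl in upper_limits]

for low, high in zip(lcb, ucb):
    print(f"{low} - {high}")
```

Note that each upper class boundary coincides with the lower class boundary of the next class, so the boundaries leave no gaps between classes.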

• Class mark (C.M): the mid-point of a class interval.
It is obtained as:
C.M_i = (LCL_i + UCL_i)/2 or C.M_i = (LCB_i + UCB_i)/2
e.g. Using the above example (the marks of 40 students), the class marks are:
C.M_1 = 8.5, C.M_2 = 14.5, C.M_3 = 20.5,
C.M_4 = 26.5, C.M_5 = 32.5, C.M_6 = 38.5
Class width (w): the difference between any two successive (consecutive)
LCLs, UCLs, LCBs, UCBs or class marks. That is:
w = LCL_{i+1} - LCL_i, or
w = UCL_{i+1} - UCL_i, or
w = LCB_{i+1} - LCB_i, or
w = UCB_{i+1} - UCB_i, or
w = C.M_{i+1} - C.M_i
e.g. Calculate the class width of the above example (the marks of 40 students).
Solution:
w = LCL_{i+1} - LCL_i = LCL_2 - LCL_1 = 12 - 6 = 6
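As a quick check, the class marks and the class width for the marks of the 40 students can be computed in Python. This is a sketch with illustrative names; it also confirms that the width is the same whether it is taken from successive lower class limits or from successive class marks.

```python
# Class limits from the example on the marks of 40 students.
lower_limits = [6, 12, 18, 24, 30, 36]
upper_limits = [11, 17, 23, 29, 35, 41]

# Class mark: C.M_i = (LCL_i + UCL_i) / 2.
class_marks = [(lcl + ucl) / 2 for lcl, ucl in zip(lower_limits, upper_limits)]

# Class width from successive lower class limits and from successive class marks.
widths_from_lcl = {b - a for a, b in zip(lower_limits, lower_limits[1:])}
widths_from_cm = {b - a for a, b in zip(class_marks, class_marks[1:])}

print(class_marks)
print(widths_from_lcl, widths_from_cm)
```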



Relative and Percentage frequency distribution
Relative frequency (R.F): the number of objects/cases in a
category divided by the total number of objects.
- It gives the proportion of the total that falls in each category.
R.F = class frequency / total frequency = f_i / n
Percentage frequency distribution
% R.F = (f_i / n) × 100%
Example: see the R.F and % R.F columns in the table of marks of 40 students above.
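Applying these two formulas to the class frequencies of the 40 students' marks (2, 4, 10, 16, 5, 3) can be sketched in Python as follows. The names are illustrative. Before any rounding, the last two classes come out as 0.125 and 0.075 (12.5% and 7.5%).

```python
# Class frequencies from the table of marks of 40 students.
frequencies = [2, 4, 10, 16, 5, 3]
n = sum(frequencies)  # total frequency

# R.F = f_i / n and % R.F = (f_i / n) * 100.
relative = [f / n for f in frequencies]
percent = [100 * f / n for f in frequencies]

for f, rf, pct in zip(frequencies, relative, percent):
    print(f, rf, pct)
```

The relative frequencies always add up to 1 and the percentages to 100, which is a useful check on the arithmetic.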
Cumulative Frequency Distribution
1. Less than cumulative frequency (L.C.F): obtained by adding
the frequency of a class to the frequencies of all the preceding
classes.
- In other words, L.C.F is the total number of observations less than the
UCB of that class.
2. More than cumulative frequency (M.C.F): obtained
by adding the frequency of a class to the frequencies of all the
succeeding classes.
- M.C.F is the total number of observations greater than the LCB of
that class.
Example: Using the above example (the marks of 40 students), find the L.C.F and M.C.F.
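Both cumulative frequencies are running totals, so they can be sketched with `itertools.accumulate` from the Python standard library. The frequencies are those of the 40 students' marks; the variable names are illustrative.

```python
from itertools import accumulate

# Class frequencies from the table of marks of 40 students.
frequencies = [2, 4, 10, 16, 5, 3]

# L.C.F: running total from the first class downwards.
lcf = list(accumulate(frequencies))

# M.C.F: running total from the last class upwards, put back in class order.
mcf = list(accumulate(reversed(frequencies)))[::-1]

print(lcf)  # [2, 6, 16, 32, 37, 40]
print(mcf)  # [40, 38, 34, 24, 8, 3]
```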
Constructing a continuous frequency distribution
• Practical steps in constructing a continuous frequency distribution
1. Determine the number of classes (k)
Using Sturges' rule of thumb:
k = 1 + 3.322 log n
where k is the number of classes,
log is the common logarithm (base 10),
n is the total number of observations in our sample.
2. Determine the class width (w)
w = Range / k, rounded to the nearest integer,
where Range = largest value – smallest value, that is,
R = L – S, also rounded to the nearest integer.
3. Determine the class limits
• The lower class limit of the first class should be less than
or equal to the smallest value of the observations collected from
the field.
• Add the class width to the lower class limit to obtain the lower
class limit of the next higher class.
• Subtract the unit of measure from the 2nd LCL to obtain the 1st UCL.
• Then add the class width to the UCL to obtain the upper class
limit of the next higher class.
Example:- Construct a continuous frequency distribution for the
following raw data on marks (out of 100) obtained by 50
students in Statistics course.
57, 53, 65, 55, 50, 45, 64, 52, 15, 46, 42, 63, 33, 64, 53, 25, 54, 35,
48, 55, 70, 47, 39, 58, 52, 36, 65, 75, 26, 20, 55, 60, 83, 61, 45,
63, 49, 42, 35, 18, 51, 45, 42, 65, 39, 59, 45, 41, 30, 40.
Solution: n = 50, L = 83, S = 15. Then
• k = 1 + 3.322 log n = 1 + 3.322 (log 50) = 6.64 ≈ 7
• R = L – S = 83 – 15 = 68
• w = R/k = 68/7 = 9.71 ≈ 10
Table 2: The marks of 50 students (out of 100) obtained in the
Statistics course.
Class limits   15-24   25-34   35-44   45-54   55-64   65-74   75-84   Total
fi             3       4       10      15      12      4       2       50
C.B
C.M
R.F
L.C.F
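The three construction steps can be followed in a short Python sketch that reproduces the class limits and frequencies of Table 2. Two assumptions are made here: the first lower class limit is set equal to the smallest mark (as in Table 2), and the unit of measure is taken as 1 because the marks are whole numbers.

```python
import math

# Marks (out of 100) of the 50 students from the example.
marks = [57, 53, 65, 55, 50, 45, 64, 52, 15, 46, 42, 63, 33, 64, 53, 25, 54, 35,
         48, 55, 70, 47, 39, 58, 52, 36, 65, 75, 26, 20, 55, 60, 83, 61, 45,
         63, 49, 42, 35, 18, 51, 45, 42, 65, 39, 59, 45, 41, 30, 40]

n = len(marks)

# Step 1: number of classes by Sturges' rule, k = 1 + 3.322 log10(n).
k = round(1 + 3.322 * math.log10(n))   # 6.64 rounds to 7

# Step 2: class width, w = Range / k.
data_range = max(marks) - min(marks)   # 83 - 15 = 68
w = round(data_range / k)              # 9.71 rounds to 10

# Step 3: class limits (assumptions: first LCL = smallest mark, unit of measure = 1).
u = 1
lower_limits = [min(marks) + i * w for i in range(k)]
upper_limits = [lcl + w - u for lcl in lower_limits]

# Count the marks falling in each class.
frequencies = [sum(lcl <= m <= ucl for m in marks)
               for lcl, ucl in zip(lower_limits, upper_limits)]

for lcl, ucl, f in zip(lower_limits, upper_limits, frequencies):
    print(f"{lcl}-{ucl}: {f}")
```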
Diagrammatic and Graphical Method of Data Presentation

i) Diagrammatic Presentation of Data
• It is usually used to present qualitative and discrete data.
• The common diagrammatic presentations of data are:
1. Bar Chart
i) Simple bar chart
ii) Component (subdivided) bar chart
2. Pie Chart
Bar Chart: used to represent and compare the frequency distribution
of discrete data and attributes or categorical data.
- Bars can be drawn either vertically or horizontally.

• Simple bar chart: used to display data on one variable.
  - The bars are thick lines (narrow rectangles) having the same breadth (size).
  - It is used for aggregate data.
• Component bar chart
  - It is used when there is a desire to show how a total (or aggregate) is
divided into its component parts.
Example: The number of students in the four departments of the Science
College is given as follows. Draw simple and component bar charts.


Department            Physics   Maths   Chemistry   Biology
Number of students    200       400     450         600
Male                  170       350     250         200
Female                30        50      200         400
[Figure: Simple bar chart (left) and sub-divided bar chart with male and female components (right); frequency is plotted against department.]
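The two kinds of bar chart can be imitated in plain text with a small Python sketch. This is only a rough illustration, not a substitute for a drawn chart: the scale of one character per 10 students and the symbols `#`, `M` and `F` are arbitrary choices made here.

```python
# Student numbers from the Science College example.
departments = ["Physics", "Maths", "Chemistry", "Biology"]
male = [170, 350, 250, 200]
female = [30, 50, 200, 400]

# The total for each department is the sum of its components.
totals = [m + f for m, f in zip(male, female)]

scale = 10  # assumption: one character stands for 10 students

# Simple bar chart: one bar per department, showing the total only.
simple_bars = ["#" * (t // scale) for t in totals]

# Component bar chart: the same bar split into male (M) and female (F) parts.
component_bars = ["M" * (m // scale) + "F" * (f // scale)
                  for m, f in zip(male, female)]

for name, simple, component in zip(departments, simple_bars, component_bars):
    print(f"{name:<10}{simple}")
    print(f"{'':<10}{component}")
```

Each component bar has the same length as the corresponding simple bar, because the male and female counts add up to the department total.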
Pie Chart
Pie chart: a circle that is divided into sections or wedges
according to the percentage of frequencies in each category of the
distribution.
Example: Draw a pie chart to represent the following population data
in a town.
Men     Women   Girls   Boys
2500    2000    4000    1500
Solution: First find the percentage of each class.
% = (f_i / n) × 100%
Class    Frequency (fi)   Percentage
Men      2500             25%
Women    2000             20%
Girls    4000             40%
Boys     1500             15%
[Figure: Pie chart with wedges Men 25%, Women 20%, Girls 40%, Boys 15%.]
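The percentages in the solution can be computed as in the following Python sketch. The wedge angles are an addition not stated in the example: they use the usual convention that a full circle is 360°, so each wedge gets (f_i / n) × 360 degrees.

```python
# Population data for the town from the example.
population = {"Men": 2500, "Women": 2000, "Girls": 4000, "Boys": 1500}
n = sum(population.values())  # 10000

# Percentage of each class: (f_i / n) * 100.
percentages = {name: 100 * f / n for name, f in population.items()}

# Assumption: wedge angle taken as (f_i / n) * 360 degrees.
angles = {name: 360 * f / n for name, f in population.items()}

for name in population:
    print(name, percentages[name], angles[name])
```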

• Advantages - Pie charts can:
  - display relative proportions of multiple classes of data
  - show areas proportional to the number of data points in each
category
  - summarize a large data set in visual form
  - be visually simpler than other types of graphs
  - permit a visual check of the reasonableness or accuracy of
calculations
• Disadvantages - Pie charts can:
  - reveal little about central tendency, dispersion, skew, or
kurtosis
  - be easily manipulated to yield false impressions
Exercise
Construct a sub-divided bar chart for the four types of products
in relation to the opinion of consumers purchasing the given
products, as given below:
Products     Definitely   Probably   Unsure   No
Product 1    50%          40%        10%      2%
Product 2    60%          30%        12%      15%
Product 3    70%          45%        8%       8%
Product 4    60%          35%        5%       20%



Graphical Presentation of data
• Graphical presentation of data is used to present continuous data.
• The common graphical presentations of data are:
1. Histogram
2. Frequency polygon
3. Cumulative frequency curves (Ogives)
   • Less than ogive (less than cumulative frequency curve)
   • More than ogive (more than cumulative frequency curve)
Histogram
• To construct a histogram,
  - the class boundaries or the class marks are plotted on the
horizontal axis, and
  - the class frequencies are plotted on the vertical axis.
Example: Draw a histogram for the marks of 50 students (out of
100) obtained in the Statistics course (Table 2).



Frequency Polygon
• A frequency polygon is a line graph where class frequencies
are plotted against the class marks and the successive points are
connected by straight lines.
Example: Draw a frequency polygon for the marks of 50 students (out
of 100) obtained in the Statistics course (Table 2).
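The points to be plotted for the frequency polygon of Table 2 can be listed with a short Python sketch. The class marks are computed from the class limits with C.M = (LCL + UCL)/2. Closing the polygon with zero-frequency points one class width before the first class mark and after the last one is a common drawing convention, but it is an assumption here and is not stated in the example.

```python
# Class limits and frequencies from Table 2 (marks of 50 students).
lower_limits = [15, 25, 35, 45, 55, 65, 75]
upper_limits = [24, 34, 44, 54, 64, 74, 84]
frequencies = [3, 4, 10, 15, 12, 4, 2]

# Class marks: mid-points of the classes.
class_marks = [(lcl + ucl) / 2 for lcl, ucl in zip(lower_limits, upper_limits)]

# Points of the frequency polygon: (class mark, class frequency).
points = list(zip(class_marks, frequencies))

# Assumption: close the polygon on the horizontal axis, one class width
# (w = 10) before the first class mark and after the last one.
w = lower_limits[1] - lower_limits[0]
closed = [(class_marks[0] - w, 0)] + points + [(class_marks[-1] + w, 0)]

for x, y in closed:
    print(x, y)
```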



Cumulative frequency curves (Ogives)
• To draw a less than (more than) ogive, the less than (more than) cumulative
frequencies are plotted against the upper (lower) class boundaries
of their respective classes, and the points are joined by either
straight lines or a smooth curve.
Example: Draw cumulative frequency curves for the marks of 50
students (out of 100) obtained in the Statistics course (Table 2).
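The points for both ogives of Table 2 can be generated with the following Python sketch. It takes the unit of measure as 1 (the marks are whole numbers), computes the class boundaries from the class limits, and pairs them with the cumulative frequencies; the names are illustrative.

```python
from itertools import accumulate

# Class limits and frequencies from Table 2 (marks of 50 students).
lower_limits = [15, 25, 35, 45, 55, 65, 75]
upper_limits = [24, 34, 44, 54, 64, 74, 84]
frequencies = [3, 4, 10, 15, 12, 4, 2]

u = 1  # unit of measure: LCL_2 - UCL_1 = 25 - 24

# Class boundaries.
lcb = [lcl - u / 2 for lcl in lower_limits]
ucb = [ucl + u / 2 for ucl in upper_limits]

# Cumulative frequencies.
lcf = list(accumulate(frequencies))
mcf = list(accumulate(reversed(frequencies)))[::-1]

# Less than ogive: L.C.F against the upper class boundaries.
less_than_ogive = list(zip(ucb, lcf))
# More than ogive: M.C.F against the lower class boundaries.
more_than_ogive = list(zip(lcb, mcf))

print(less_than_ogive)
print(more_than_ogive)
```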



Exercise
• Draw (a) a histogram, (b) a frequency polygon and (c) an ogive for the
following frequency distribution of grades in a final examination
in Introduction to Statistics.
