
Applied Statistics
Dr. Mahmoud Abd El-Raouf

DESCRIPTIVE STATISTICS

1. Introduction
For a layman, ‘Statistics’ means numerical information expressed in quantitative terms. This
information may relate to objects, subjects, activities, phenomena, or regions of space. As a matter of
fact, data have no limits as to their reference, coverage, and scope.

2. Meaning and Definitions of Statistics


To begin with, the word ‘statistics’ refers to the whole body of tools used to collect, organize,
and interpret data and to draw conclusions from them. If statistics, as a subject, is inadequate and
rests on poor methodology, we cannot know the right procedure for extracting from the data the
information they contain. Similarly, if our data are defective, inadequate, or inaccurate, we cannot
reach the right conclusions even if the subject itself is well developed.
A.L. Bowley defined statistics in three ways: (i) statistics is the science of counting; (ii) statistics
may rightly be called the science of averages; and (iii) statistics is the science of measurement of
social organism regarded as a whole in all its manifestations.
Boddington defined it as follows: statistics is the science of estimates and probabilities.
Further, W.I. King defined statistics in a wider context: “the science of Statistics is the method
of judging collective, natural or social phenomena from the results obtained by the analysis or
enumeration or collection of estimates”.
Seligman described statistics as a science that deals with the methods of collecting, classifying,
presenting, comparing, and interpreting numerical data collected to throw some light on any sphere
of enquiry.
Spiegel defines statistics by highlighting its role in decision-making, particularly under uncertainty,
as follows: statistics is concerned with scientific methods for collecting, organizing, summarizing,
presenting, and analyzing data, as well as drawing valid conclusions and making reasonable
decisions on the basis of such analysis.
According to Prof. Horace Secrist, statistics is the aggregate of facts, affected to a marked extent by
multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable
standards of accuracy, collected in a systematic manner for a pre-determined purpose, and placed in
relation to each other.

3. Types of Data
Statistical data are the basic raw material of statistics. Data may relate to an activity of
interest, a phenomenon, or a problem situation under study. They arise from the process of
measuring, counting, and/or observing. Statistical data, therefore, refer to those aspects of a problem
situation that can be measured, quantified, counted, or classified. In statistics, data are classified into
two broad categories: quantitative data and qualitative data. This classification is based on the kind
of characteristic that is measured.

Quantitative data are those that can be quantified in definite units of measurement. These refer to
characteristics whose successive measurements yield quantifiable observations. Depending on the
nature of the variable observed for measurement, quantitative data can be further categorized as
continuous and discrete data.
A variable may accordingly be either a continuous variable or a discrete variable.
• Continuous data represent the numerical values of a continuous variable. A continuous variable
is one that can assume any value between any two points on a line segment, thus representing
an interval of values. The values are quite precise and may lie very close to each other, yet remain
distinguishably different. Characteristics such as weight, length, height, thickness, velocity,
temperature, and tensile strength are all examples of continuous data.
• Discrete data are the values assumed by a discrete variable. A discrete variable is one whose
outcomes are measured in fixed numbers. Such data are essentially count data. They are derived
from a process of counting, such as the number of items possessing or not possessing a certain
characteristic. The number of customers visiting a department store every day, the incoming
flights at an airport, and the defective items in a consignment received for sale are all examples
of discrete data.

Qualitative data refer to qualitative characteristics of a subject or an object. A characteristic is
qualitative in nature when its observations are defined and noted in terms of the presence or absence
of a certain attribute, recorded as discrete counts. These data are further classified as nominal and
ordinal data.
• Nominal data are the outcome of classifying the items or units comprising a sample or a
population into two or more categories according to some quality characteristic. Examples are the
classification of students according to sex (as males and females), of workers according to skill
(as skilled, semi-skilled, and unskilled), and of employees according to the level of education (as
matriculates, undergraduates, and post-graduates).
• Ordinal data, on the other hand, are the result of assigning ranks to specify order in terms of the
integers 1, 2, 3, ..., n. Ranks may be assigned according to the level of performance in a test, a
contest, a competition, an interview, or a show. The candidates appearing in an interview, for
example, may be assigned ranks in integers ranging from 1 to n, depending on their performance
in the interview.
4. Types of Statistics
Statistics has two major divisions: descriptive statistics and inferential statistics.

Descriptive statistics deals with collecting, summarizing, and simplifying data, which are
otherwise quite unwieldy and voluminous. It seeks to achieve this in such a manner that meaningful
conclusions can be readily drawn from the data. Descriptive statistics may thus be seen as
comprising methods of bringing out and highlighting the latent characteristics present in a set of
numerical data. It not only facilitates an understanding of the data and their systematic reporting,
but also makes them amenable to further discussion, analysis, and interpretation.

A well thought-out and sharp data classification facilitates easy description of the data by means of a
variety of summary measures. These include measures of central tendency, dispersion, skewness,
and kurtosis, which constitute the essential scope of descriptive statistics.

Inferential statistics goes beyond describing a given problem situation by means of collecting,
summarizing, and meaningfully presenting the related data. Instead, it consists of methods that are
used for drawing inferences, or making broad generalizations, about a totality of observations on the
basis of knowledge about a part of that totality. Thus, obtaining a particular value from the sample
information and using it for drawing an inference about the entire population underlies the subject
matter of inferential statistics.
Notes:
(1) The totality of observations about which an inference may be drawn, or a generalization made,
is called a population.
(2) The part of totality, which is observed for data collection and analysis to gain knowledge about
the population, is called a sample.

Inferential statistics helps to evaluate the risks involved in reaching inferences or generalizations
about an unknown population on the basis of sample information. For example, an inspection of a
sample of five battery cells drawn from a given lot may reveal that all five cells are in perfectly
good condition. This information may be used to decide whether or not the entire lot is good enough
to buy.

5. Importance of Statistics in Business


There are three major functions in any business enterprise in which statistical methods are
useful. These are as follows:

(i) The planning of operations: This may relate either to special projects or to the recurring
activities of a firm over a specified period.

(ii) The setting up of standards: This may relate to the size of employment, volume of sales,
fixation of quality norms for the manufactured product, norms for the daily output, and so
forth.

(iii) The function of control: This involves comparison of the actual production achieved against
the norm or target set earlier. If production has fallen short of the target, the control function
suggests remedial measures so that such a deficiency does not occur again.

Statistical Measures
The description of statistical data may be quite elaborate or quite brief depending on two factors: the
nature of data and the purpose for which the same data have been collected.
1) Measures of Central Tendency:
The measures of central tendency enable us to compare two or more distributions pertaining to
the same time period, or the same distribution over time. For example, the average
consumption of tea in two different territories for the same period, or in one territory for two years,
say, 2003 and 2004, can be compared by means of an average.
• MEAN

The mean is obtained by adding all the observations and dividing the sum by the number of
observations. Symbolically, the mean is

$$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + \dots + X_n}{n}$$

It may be noted that the Greek letter μ is used to denote the mean of a population and N to denote
the total number of observations in a population, while $\bar{X}$ and n refer to a sample.
Example 1: Calculate the average of the following workers' wages:
15, 18, 28, 39, 56, 66
Solution:

$$\bar{X} = \frac{\sum X}{n} = \frac{15 + 18 + 28 + 39 + 56 + 66}{6} = \frac{222}{6} = 37$$
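The calculation in Example 1 can be checked with a short Python sketch. The function name `arithmetic_mean` is chosen here for illustration and is not part of the original notes; `statistics.mean` is the standard-library equivalent.

```python
from statistics import mean


def arithmetic_mean(values):
    """Add all the observations and divide the sum by their number."""
    return sum(values) / len(values)


wages = [15, 18, 28, 39, 56, 66]  # workers' wages from Example 1

print(arithmetic_mean(wages))  # 37.0
print(mean(wages))             # same value from the standard library
```

Both calls follow the formula above directly: the numerator is the sum of the observations and the denominator is n.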
Example 2: Calculate the mean of the following items:
29, 21, 18, 27, 25, 30, 16
Solution:

$$\bar{X} = \frac{\sum X}{n} = \frac{29 + 21 + 18 + 27 + 25 + 30 + 16}{7} = \frac{166}{7} \approx 23.7143$$
• MEDIAN

The median is defined as the value of the middle item (or the mean of the values of the two
middle items) when the data are arranged in ascending or descending order of magnitude. If the
n values are arranged in ascending or descending order of magnitude:

• If n is odd, the median is the middle value, i.e. the $\left(\frac{n+1}{2}\right)^{\text{th}}$ value.
• If n is even, the median is the mean of the two middle values, i.e. the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2}+1\right)^{\text{th}}$ values.
Suppose we have the following series:
15, 19, 21, 7, 10, 33, 25, 18, 5
We first have to arrange it in either ascending or descending order. Arranged in ascending order,
these figures are:
5, 7, 10, 15, 18, 19, 21, 25, 33

As n = 9 is an odd number, the middle item is the $\left(\frac{n+1}{2}\right)^{\text{th}} = 5^{\text{th}}$ item, so

the median = 18


Suppose we have the following series: 5, 7, 10, 15, 18, 19, 21, 23, 25, 33.
As n = 10 is an even number, there are two middle values, and we take the average of the values of
the 5th and 6th items:

$$\text{median} = \frac{18 + 19}{2} = 18.5$$
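The odd and even cases of the median rule can be sketched in Python as follows. This is a minimal illustration; the function name `median_value` is an assumption made here, and the standard library's `statistics.median` applies the same rule.

```python
def median_value(values):
    """Return the middle value, or the mean of the two middle values."""
    ordered = sorted(values)       # arrange in ascending order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)-th item, which is index `mid` when counting from 0
        return ordered[mid]
    # n even: mean of the (n / 2)-th and (n / 2 + 1)-th items
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_series = [15, 19, 21, 7, 10, 33, 25, 18, 5]
even_series = [5, 7, 10, 15, 18, 19, 21, 23, 25, 33]

print(median_value(odd_series))   # 18
print(median_value(even_series))  # 18.5
```

Sorting inside the function means the caller does not need to order the data beforehand.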
Example 3: Calculate the median of the following workers' wages:
15, 18, 28, 39, 56, 66
Solution: The data are already in ascending order and n = 6 is even, so the median is the mean of
the 3rd and 4th values:

$$\text{median} = \frac{28 + 39}{2} = 33.5$$
Example 4: Calculate the median of the following items:
16, 30, 25, 27, 18, 21, 29
Solution: Arranged in ascending order, the items are
16, 18, 21, 25, 27, 29, 30
As n = 7 is odd, the median is the 4th value:
median = 25
• MODE

The mode is another measure of central tendency. It is the value around which the items are most
heavily concentrated, that is, the most frequent value.
Example 5: Calculate the mode of each of the following sets of items:
• 29, 25, 18, 27, 25, 30, 16
mode = 25
• 15, 18, 18, 39, 56, 15
mode = {15, 18}
• 15, 18, 18, 39, 39, 15
No mode (every value occurs equally often).
Of the three measures, the mode is the only one that can take more than one value, and it is also
possible that a data set has no mode at all.
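The three cases in Example 5 can be reproduced with a small Python sketch. The function name `modes` and the choice to return an empty list for "no mode" are assumptions made for illustration. The rule that a set has no mode when every distinct value occurs equally often is inferred from the third case above. Note that the standard library's `statistics.multimode` behaves differently in that case: it returns all the tied values.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] when no value stands out."""
    counts = Counter(values)
    highest = max(counts.values())
    most_frequent = sorted(v for v, c in counts.items() if c == highest)
    # Convention assumed from Example 5: if all distinct values are
    # equally frequent, the data set is reported as having no mode.
    if len(counts) > 1 and len(most_frequent) == len(counts):
        return []
    return most_frequent


print(modes([29, 25, 18, 27, 25, 30, 16]))  # [25]
print(modes([15, 18, 18, 39, 56, 15]))      # [15, 18]
print(modes([15, 18, 18, 39, 39, 15]))      # []
```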
2) Measures of Dispersion:
It may be noted that the measures of central tendency do not indicate the extent of dispersion or
variability in a distribution. Measuring dispersion takes us one step further in understanding
the pattern of the data. Further, a high degree of uniformity (i.e. a low degree of dispersion) is a
desirable quality.
Averages are not sufficient to give a complete description of the data, as they do not measure how
different from, or similar to, each other the data values are. For example, consider the following two
sets of data:
A 30 40 55 60 65 80 90
B 55 57 59 60 61 63 65
The mean and the median of each set are both 60, yet the two sets differ considerably. The values in
group B are close to each other and not far from the mean or median, whereas the values in group A
are more dispersed. Accordingly, to describe a data set accurately, an average alone is not enough;
a measure of dispersion should also be calculated. The commonly used measures are the range, the
variance, and the standard deviation.
• RANGE

The simplest measure of dispersion is the range, which is the difference between the maximum
value and the minimum value of the data.
Range = max − min
A: 90 − 30 = 60
B: 65 − 55 = 10
It is clear that group B is less dispersed than group A. In other words, the elements of group B
are more homogeneous with each other than the elements of group A.
• VARIANCE

The variance is the average of the squared differences between the elements of a group and the
mean of that group; for a sample, the sum of the squared differences is divided by n − 1.

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

              Group A                        Group B
  X_i   X_i − X̄   (X_i − X̄)²      X_i   X_i − X̄   (X_i − X̄)²
   30     −30        900           55      −5         25
   40     −20        400           57      −3          9
   55      −5         25           59      −1          1
   60       0          0           60       0          0
   65       5         25           61       1          1
   80      20        400           63       3          9
   90      30        900           65       5         25
 Σ 420      0       2650          420       0         70

$$S_A^2 = \frac{2650}{7 - 1} \approx 441.6667 \qquad S_B^2 = \frac{70}{7 - 1} \approx 11.6667$$
• STANDARD DEVIATION

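The standard deviation is the positive square root of the variance. It measures, in the original units
of the data, how far the elements of a group typically lie from the mean of that group.

$$S = \sqrt{S^2}$$

$$S_A = \sqrt{441.6667} \approx 21.0159 \qquad S_B = \sqrt{11.6667} \approx 3.4157$$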
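The range, variance, and standard deviation of groups A and B can be recomputed with the following Python sketch. The function names are illustrative assumptions, not part of the original notes; the divisor n − 1 matches the formula for S² used here.

```python
from math import sqrt


def value_range(values):
    """Difference between the maximum and minimum value."""
    return max(values) - min(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Positive square root of the sample variance."""
    return sqrt(sample_variance(values))


group_a = [30, 40, 55, 60, 65, 80, 90]
group_b = [55, 57, 59, 60, 61, 63, 65]

for name, data in (("A", group_a), ("B", group_b)):
    print(name, value_range(data),
          round(sample_variance(data), 4), round(sample_std(data), 4))
# A 60 441.6667 21.0159
# B 10 11.6667 3.4157
```

All three measures rank the groups the same way: B is less dispersed than A.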

Solved Problems
1) Calculate the mean, median and mode of the following data:
i. 18, 10, 15, 13, 17, 15, 12, 15, 18, 16, 11
Solution: Ordered data: 10, 11, 12, 13, 15, 15, 15, 16, 17, 18, 18
Mode = 15, Median = 15, and

$$\text{Mean} = \frac{10 + 11 + 12 + 13 + 15 \times 3 + 16 + 17 + 18 \times 2}{11} = \frac{160}{11} \approx 14.55$$
ii. Find the Average, Median, Mode, Range, Variance, and Standard Deviation for the following
data: 4, 7, 9, 12, 15, 20.
Solution: Ordered data: 4, 7, 9, 12, 15, 20
Mode = no mode, Range = 20 − 4 = 16,

$$\text{Median} = \frac{9 + 12}{2} = 10.5$$

$$\bar{X} = \frac{4 + 7 + 9 + 12 + 15 + 20}{6} = \frac{67}{6} \approx 11.17$$

   x     x − X̄     (x − X̄)²
   4     −43/6     1849/36
   7     −25/6      625/36
   9     −13/6      169/36
  12       5/6       25/36
  15      23/6      529/36
  20      53/6     2809/36
   Σ        0      6006/36 ≈ 166.83

$$v(x) = \frac{166.83}{5} \approx 33.37 \quad \text{and} \quad SD = \sqrt{33.37} \approx 5.78$$
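The fractional deviations in the table for problem ii can be reproduced exactly with Python's `fractions` module. This is only a sketch for checking the arithmetic; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import sqrt

data = [4, 7, 9, 12, 15, 20]
n = len(data)

mean = Fraction(sum(data), n)           # exact mean, 67/6
deviations = [x - mean for x in data]   # -43/6, -25/6, -13/6, 5/6, 23/6, 53/6
squares = [d ** 2 for d in deviations]  # 1849/36, 625/36, ...

sum_of_squares = sum(squares)           # exact sum of squared deviations
variance = sum_of_squares / (n - 1)     # sample variance, divisor n - 1
std_dev = sqrt(float(variance))

print(mean, sum(deviations), sum_of_squares)
print(round(float(variance), 2), round(std_dev, 2))
```

Working in exact fractions avoids rounding the mean before squaring, so the variance is not affected by intermediate rounding.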

2) In the following data, which group is more homogeneous? Why?

Group A 187 284 201 151 100 154 105
Group B 20 15 8 8 15 12 16
Solution:

  x_A   x_A − X̄_A   (x_A − X̄_A)²     x_B   x_B − X̄_B   (x_B − X̄_B)²
  187     127/7      16129/49        20      46/7       2116/49
  284     806/7     649636/49        15      11/7        121/49
  201     225/7      50625/49         8     −38/7       1444/49
  151    −125/7      15625/49         8     −38/7       1444/49
  100    −482/7     232324/49        15      11/7        121/49
  154    −104/7      10816/49        12     −10/7        100/49
  105    −447/7     199809/49        16      18/7        324/49
 Σ 1182     0        23978.86        94       0          115.71

$$\bar{X}_A = \frac{1182}{7} \approx 168.86 \qquad \bar{X}_B = \frac{94}{7} \approx 13.43$$

$$v_A(x) = \frac{23978.86}{6} \approx 3996.48 \qquad v_B(x) = \frac{115.71}{6} \approx 19.29$$

$$SD_A \approx 63.22 \qquad SD_B \approx 4.39$$

Therefore, Group B is more homogeneous than Group A because it has the smaller standard deviation.
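The comparison can also be checked with the standard library's `statistics.stdev`, which uses the same n − 1 divisor as the formula for S. The sketch below uses variable names chosen for illustration and simply picks the group with the smaller standard deviation, as the solution above does.

```python
from statistics import stdev

group_a = [187, 284, 201, 151, 100, 154, 105]
group_b = [20, 15, 8, 8, 15, 12, 16]

sd_a = stdev(group_a)  # sample standard deviation of Group A
sd_b = stdev(group_b)  # sample standard deviation of Group B

# The group with the smaller standard deviation is the more homogeneous one.
more_homogeneous = "B" if sd_b < sd_a else "A"
print(round(sd_a, 2), round(sd_b, 2), more_homogeneous)  # 63.22 4.39 B
```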

Self-Assessment Questions
1. Calculate the Mean, Median and Mode of the following data:
i. 2, 0, 5, 4, 6, 4, 2, 0, 4, 8, 0, 6.
ii. 51, 52, 47, 50, 48, 41, 59, 56, 89.
iii. 1, 2, 4, 5, 1, 2, 5, 7, 0, -1.
iv. 4, 8, 6, 2, 1, 0, -1, 7.
v. 740, 712, 742, 7, 712, 751, 714, 742
vi. 1, 2, 5, 4, 2, 4, 1, 1, 5, 4, 2, 5.
vii. 2, 3, 5, 6, 3, 4, 8, 2, 9, 3, 5, 5, 5, 2, 7.
viii. -1, -2, -3, -9, 0, 4, 9, 7, 5, 6, 4, -1, 0, 2.
ix. -1, 0, 2, -1, 0, 0, 3, 8, 0, 5, -1.
x. 4, 5, 8, 8, 7, 4, 5, 7, 2.
2. Find the Average, Median, Mode, Range, Variance, and Standard Deviation for the
following data:
i. 1, 1, 2, 3, 4, 1, 6, 3, 2, 4, 1, 2
ii. 3, 1, 10, 10, 42, 1, 3, 2, 2, 1, 3, 5, 2, 1.
iii. 3, 1, 2, 3, 4, 1, 2, 3, 5, 7, 6, 2
iv. -1, -4, -3, 1, -4, -4
3. In the following data, which group is more homogeneous? Why?
Group A 22 25 29 28 27 22 20
Group B 7 2 8 9 11 15 19

Group A 24 31 35 39 41 24 36
Group B 19 14 13 16 15 15 19
