
Why Statistics:

The complexity of real situations makes decision making difficult.

Statistics provides methods for collecting, presenting, analyzing, and
meaningfully arranging data.
Types of situations:
• When data need to be presented in a form which helps in easy
grouping (graphs, charts, tables)
• When a hypothesis is to be tested and inferences drawn
• When unknown quantities are to be estimated from observed data
• When a decision is to be made under uncertainty regarding a course of action
Statistics

• Descriptive
  - Data collection & presentation
• Inductive
  - Statistical inferences
  - Hypothesis testing and inferences (e.g. regression, correlation)
• Statistical Decision Theory
  - Decision problems
  - Alternatives
  - Uncertainties
  - Criterion of choice
ARRANGING DATA

Learning Goals

• MEANING OF DATA
• TYPES OF DATA
• DATA COLLECTION
• DATA PRESENTATION DEVICES


MEANING OF DATA

• Data is a collection of related observations, facts or figures.

• A collection of data is called a data set, and each observation a
data point.


TYPES OF DATA

• PRIMARY DATA

• SECONDARY DATA
DATA COLLECTION
• The following questions can be posed to test the validity of
the data:
  - Where does the data originate from?
  - Is the source reliable?
  - Does the data support or contradict previous decisions?
  - Are the conclusions actually derived from the data?
  - What is the size of the sample? Does it represent the
    entire population under consideration for decision
    making?
METHODS OF COLLECTING DATA

• COMPLETE ENUMERATION

• SAMPLE METHOD
CLASSIFICATION OF DATA

• GEOGRAPHICAL

• CHRONOLOGICAL

• QUALITATIVE

• BY MAGNITUDE
TABULAR PRESENTATION OF DATA

OBJECTIVES are:
• To condense complex data
• To show a trend
• To display huge volumes of data in less space
• To highlight key characteristics of data
• To facilitate comparison of data elements
• To help decision making using statistical methods
• To serve as reference for future decisions


PARTS OF AN IDEAL TABLE
• Table number: acts as an identity for the table.
• Title: gives an idea about the nature of the data in
the table.
• Captions: these are the headings given to the vertical
columns; they explain the mode of classification,
e.g. time, quantity, region.
• Stubs: these are the headings explaining the
basis for classifying the rows.
• Body: the data posted in rows and columns,
where the row and column headings explain the
data.
• Footnote: any other information needed to explain
the data in the table.
• Source: the source of the information.
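Table 1.1: Product wise Sales

Product    Year wise Sales
           2001   2002   2003   2004
P1           40     45     40     50
P2           15     20     22     30
P3           20     30     40     50

Source: Economics Time, 22nd Feb. 2005

Parts labelled on the slide: table number ("Table 1.1"), table title
("Product wise Sales"), captions (the year column headings), stubs (the
row headings P1 to P3), and body (the sales figures).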


GRAPHICAL PRESENTATION OF DATA

• LINE CHARTS
• BAR CHARTS
• PIE CHARTS
• PICTOGRAMS
• SCATTER DIAGRAMS
LINE CHART

[Figure 1. Line graph: SALES by year, 1990 to 1997; vertical axis from 0 to 500]
BAR CHARTS

[Figure: bar chart of EXPORTS and IMPORTS by year; horizontal axis labels 1995, 1996, 1997, 1997; vertical axis from 0 to 4500]
ARRANGING DATA

• PIE CHART
• HISTOGRAMS
• FREQUENCY POLYGONS
• SKEWNESS
• KURTOSIS
PIE DIAGRAMS

[Figure: pie diagram with segments for Indian Promoters, Indian institutions/mutual funds, FIIs, and Public]
HISTOGRAMS
• The histogram graphically shows the following:
  - center (i.e., the location) of the data;
  - spread (i.e., the scale) of the data;
  - skewness of the data;
  - presence of outliers; and
  - presence of multiple modes in the data.
HISTOGRAMS
Think of histograms as "sorting bins." You have one variable,
and you sort the data by this variable by placing them into
"bins." Then you count how many pieces of data are in
each bin. The height of the rectangle you draw on top of
each bin is proportional to the number of pieces in that
bin.
On the other hand, in bar graphs you have several
measurements of different items, and you compare them.
The main question a histogram answers is: "How many
measurements are there in each of the classes of
measurements?" The main question a bar graph answers
is: "What is the measurement for each item?"
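The "sorting bins" idea can be sketched in a few lines of Python. This is only an illustration: the tree heights below are hypothetical values (not from the slides), and `histogram_counts` is a helper name made up for this example.

```python
# Hypothetical tree heights in metres (illustrative values, not from the slides).
heights = [2.5, 4.0, 6.1, 7.3, 8.8, 11.0, 12.4, 3.2, 9.9, 14.5]
bin_width = 5  # classes: 0 to 5, 5 to 10, 10 to 15


def histogram_counts(values, width):
    """Count how many values fall in each class [lower, lower + width)."""
    counts = {}
    for v in values:
        lower = int(v // width) * width  # lower edge of the bin holding v
        label = (lower, lower + width)
        counts[label] = counts.get(label, 0) + 1
    return dict(sorted(counts.items()))


counts = histogram_counts(heights, bin_width)

# Text histogram: the bar length is proportional to the count in each bin.
for (low, high), n in counts.items():
    print(f"{low:>2} to {high:<2} m: {'#' * n} ({n})")
```

A bar graph of the same trees would instead draw one bar per tree, with the bar height equal to that tree's own measurement.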
Situation: Bar Graph or Histogram?

1. Situation: We want to compare total revenues of five different
   companies.
   Bar graph. Key question: What is the revenue for each company?

2. Situation: We have measured revenues of several companies. We want
   to compare numbers of companies that make from 0 to 10,000; from
   10,000 to 20,000; from 20,000 to 30,000 and so on.
   Histogram. Key question: How many companies are there in each class
   of revenues?

3. Situation: We want to compare heights of ten oak trees in a city
   park.
   Bar graph. Key question: What is the height of each tree?

4. Situation: We have measured several trees in a city park. We want to
   compare numbers of trees that are from 0 to 5 meters high; from 5 to
   10; from 10 to 15 and so on.
   Histogram. Key question: How many trees are there in each class of
   heights?
FREQUENCY POLYGONS

[Figure: "Less than" ogive of the distribution of 50 employees; vertical axis: cumulative frequency, 0 to 60; horizontal axis classes: <25, <30, <35, <40, <45, <50, <55, <60]
SKEWNESS
Skewness is a measure of symmetry, or more precisely, the lack of
symmetry.

A distribution, or data set, is symmetric if it looks the same to the
left and right of the center point.
SKEWNESS
• A curve is said to be skewed when the values in the
frequency distribution are concentrated more towards
the left or right side of the curve, i.e. the values are
not equally distributed about the centre of the curve.
• A curve is said to be positively skewed when the tail
of the curve is more stretched towards the right side.
It is said to be negatively skewed when the tail is
more stretched towards the left side.
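The sign convention above can be checked numerically. The slides give no formula for skewness, so the sketch below assumes one common choice, the moment coefficient m3 / m2^1.5 (third central moment divided by the 1.5 power of the second). The data sets are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (an assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data sets, chosen only to show the sign of the coefficient.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]   # long tail towards the right
left_tailed = [-x for x in right_tailed]   # mirror image: tail towards the left
symmetric = [1, 2, 3, 4, 5]

print("right-tailed:", skewness(right_tailed))  # positive
print("left-tailed: ", skewness(left_tailed))   # negative
print("symmetric:   ", skewness(symmetric))     # zero
```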
KURTOSIS
• Kurtosis is the degree of peakedness of a
distribution of points.
• Two curves with the same central location and
dispersion may have different degrees of
kurtosis.
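The last point can be illustrated with two small data sets that share the same mean and variance but differ in kurtosis. The slides give no formula, so this sketch assumes the common moment coefficient m4 / m2^2; the data are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment of the data (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def kurtosis(values):
    """Moment coefficient of kurtosis: m4 / m2**2 (an assumed definition)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


# Hypothetical data: both sets have mean 0 and variance 1.
flat = [-1, -1, -1, -1, 1, 1, 1, 1]    # values spread evenly on both sides
peaked = [-2, 0, 0, 0, 0, 0, 0, 2]     # most values piled at the centre

for name, data in (("flat", flat), ("peaked", peaked)):
    mean = sum(data) / len(data)
    print(name, "mean:", mean,
          "variance:", central_moment(data, 2),
          "kurtosis:", kurtosis(data))
```

The more sharply peaked set gives the larger coefficient even though location and dispersion are identical.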
MEASURES OF CENTRAL TENDENCY
• Objectives of Averaging
• Requisites of a Good Average
• Types of Averages
  - Mathematical Averages
  - Positional Averages
CENTRAL TENDENCY

The tendency of the data to cluster around the central
value is known as CENTRAL TENDENCY.

The corresponding numerical measure of this tendency is
known as a measure of central tendency.

The average is of great significance because it depicts the
characteristics of the whole group. Since an average represents the
entire data, its value lies somewhere in between the two extremes, i.e.
the largest and the smallest items. For this reason an average is
frequently referred to as a measure of Central Tendency.
MAIN OBJECTIVES

• To find out one value that represents the whole
mass of data.
• To facilitate comparison.
• To establish relationships.
• To derive inferences about a universe from a
sample.
• To aid decision making.
Requisites of a Good Average
• It should be rigidly defined.
• It should be mathematically expressed.
• It should be readily comprehensible and easy to
calculate.
• It should be calculated based on all the observations.
• It should be least affected by extreme fluctuations in
sampling data.
• It should be suitable for further mathematical
treatment.
Types of Averages

AVERAGES
• Mathematical Averages
  - Arithmetic Mean (A.M.)
  - Geometric Mean (G.M.)
  - Harmonic Mean (H.M.)
• Positional Averages
  - Median (Md)
  - Mode (Mo)
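As a quick illustration, all five averages can be computed with Python's standard `statistics` module (`geometric_mean` needs Python 3.8 or later). The data set is hypothetical and was chosen only so that the values are easy to check by hand.

```python
import statistics

# Hypothetical positive observations (not from the slides).
data = [2, 4, 4, 8]

# Mathematical averages: computed from every observation.
am = statistics.mean(data)            # arithmetic mean: 18 / 4
gm = statistics.geometric_mean(data)  # fourth root of 2 * 4 * 4 * 8
hm = statistics.harmonic_mean(data)   # 4 / (1/2 + 1/4 + 1/4 + 1/8)

# Positional averages: determined by position or frequency in the data.
md = statistics.median(data)          # middle of the sorted values
mo = statistics.mode(data)            # most frequent value

print("A.M.:", am)
print("G.M.:", round(gm, 4))
print("H.M.:", round(hm, 4))
print("Median:", md)
print("Mode:", mo)
```

For positive data that are not all equal, the three mathematical averages satisfy A.M. > G.M. > H.M., which this example also shows.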
ARITHMETIC MEAN
• The ratio obtained on dividing the sum of observations by the total
number of observations is known as the ARITHMETIC MEAN.
• The arithmetic mean is represented by the notation $\bar{X}$ (read "X-bar").

CALCULATING THE MEAN FROM UNGROUPED DATA

The mean $\bar{X}$ of a collection of observations $x_1, x_2, \dots, x_n$ is given by:

$$\bar{X} = \frac{1}{n}(x_1 + x_2 + \dots + x_n) = \frac{\sum x}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
In statistics, the collection of all the elements under study is called a
POPULATION, whereas a collection of some (but not all) of the
elements under study is called a SAMPLE.

It is necessary to distinguish whether we are considering a population
or a sample because certain formulas, like those for computing the
standard deviation of a population, are different from those for
computing the standard deviation of a sample. Hence the population mean
is denoted by $\mu$ and the sample mean by $\bar{X}$:

$$\mu = \frac{\text{sum of all the data points in the population}}{\text{size of population}}$$

$$\bar{X} = \frac{\text{sum of all the data points in the sample}}{\text{size of sample}}$$
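The remark about standard deviation can be made concrete with a small sketch. It uses Python's standard `statistics` module, where `pstdev` divides by n (population formula) and `stdev` divides by n - 1 (sample formula). The data are hypothetical.

```python
import statistics

# Hypothetical observations (not from the slides).
data = [2, 4, 4, 4, 5, 5, 7, 9]

# The mean is computed the same way in both cases; only the symbol differs.
mean = statistics.mean(data)

# Treating the data as a complete population: divide by n.
population_sd = statistics.pstdev(data)

# Treating the same data as a sample from a larger population: divide by n - 1.
sample_sd = statistics.stdev(data)

print("mean:", mean)
print("population standard deviation:", population_sd)
print("sample standard deviation:", round(sample_sd, 4))
```

The sample figure is slightly larger because the same sum of squared deviations is divided by a smaller number.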
The following table gives the annual profits of 10 financial
services companies for the year 2007-2008.
Calculate the arithmetic mean profit of the companies.

Company    Net Profit (Rs. crore)
A            9.19
B            4.27
C            1.74
D            5.71
E            4.80
F            4.01
G            9.22
H            3.00
I           15.16
J            3.93
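One way to work this exercise is to apply the ungrouped-data formula (sum of observations divided by their number) directly. A minimal sketch in Python, using the ten profit figures from the table; the variable names are chosen for this example only.

```python
# Net profit (Rs. crore) of companies A to J, taken from the table above.
profits = {
    "A": 9.19, "B": 4.27, "C": 1.74, "D": 5.71, "E": 4.80,
    "F": 4.01, "G": 9.22, "H": 3.00, "I": 15.16, "J": 3.93,
}

n = len(profits)               # number of observations
total = sum(profits.values())  # sum of all observations
mean_profit = total / n        # arithmetic mean = sum / n

print(f"n = {n}")
print(f"sum = {total:.2f}")
print(f"mean = {mean_profit:.3f} Rs. crore")
```

The sum is 61.03, so the arithmetic mean profit is 61.03 / 10 = 6.103 Rs. crore.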
CALCULATION FOR GROUPED DATA

Discrete Series:

$$\bar{X} = \frac{\sum f x}{\sum f}$$

E.g. In a survey of 50 chemical industries, the following data was collected:

X_i = level of profit (Rs. lakh) earned during 2002-2003
f_i = number of companies that earned X_i amount of profit

  X_i     f_i    X_i f_i
   20      12       240
   16      15       240
   24       8       192
   25       7       175
   31       8       248

TOTAL      50      1095
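The discrete-series formula can be applied to this table in a few lines. A minimal sketch in Python, using the profit levels and frequencies from the table; the variable names are chosen for this example only.

```python
# (X_i, f_i) pairs from the table: profit level in Rs. lakh, number of companies.
series = [(20, 12), (16, 15), (24, 8), (25, 7), (31, 8)]

total_f = sum(f for _, f in series)       # sum of f   (number of companies)
total_fx = sum(x * f for x, f in series)  # sum of f*x (last column of the table)
grouped_mean = total_fx / total_f         # X-bar = sum(f*x) / sum(f)

print("sum of f  =", total_f)
print("sum of fx =", total_fx)
print("mean profit =", grouped_mean, "Rs. lakh")
```

The totals match the table (50 and 1095), giving a mean profit of 1095 / 50 = 21.9 Rs. lakh.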
USES OF A.M.
• The mean is the simplest average to understand and is easy to compute.

• It is relatively reliable in the sense that it does not vary too much when
repeated samples are taken from one and the same population, at least not
as much as other kinds of statistical descriptions.

• The mean is typical in the sense that it is the centre of gravity, balancing the
values on either side of it.
Advantages and Disadvantages of A.M.

+ Its concept is familiar and clear to all.
+ It is easy to understand and easy to calculate.
+ It provides a good basis for comparison.
- It may be unduly affected by extreme values that lie far
from the other values of the group.
- It is very difficult to find the actual mean.
- Calculation of the mean for a data set with open-ended classes is
not possible.
