Introduction To Stats, Datasets
Statistics
Applications in Business and Economics
Data
Descriptive Statistics
Statistical Inference
The term statistics can refer to numerical facts such as
averages, medians, percents, and index numbers that
help us understand a variety of business and economic
situations.
Statistics can also refer to the art and science of
collecting, analyzing, presenting, and interpreting
data.
Definition:
Seligman –
“Statistics is the science which deals with the methods
of collecting, classifying, presenting, comparing and
interpreting numerical data collected to throw some
light on any sphere of enquiry.”
i) Weekly wages of 100 workers of a factory.
You want to extrapolate from the data you have collected to make
general conclusions.
There is a large population of data out there, and you have randomly
sampled parts of it.
INFERENTIAL STATISTICS
Sample: x̄ (Statistic)
Population: µ (Parameter)
Select a random sample.
Functions of Statistics:
1. An understanding of variation
Data Set
Scales of measurement include:
Nominal
Ordinal
Interval
Ratio
1 for Educator
2 for Construction Worker
3 for Manufacturing Worker
Example: Ethnicity
1 for African-American
2 for Anglo-American
3 for Hispanic-American
Scales of Measurement
Nominal
Example:
Students of a university are classified by the
school in which they are enrolled using a
nonnumeric label such as Business, Humanities,
Education, and so on.
Alternatively, a numeric code could be used for
the school variable (e.g. 1 denotes Business,
2 denotes Humanities, 3 denotes Education, and
so on).
Ordinal
1 for President
2 for Vice President
3 for Plant Manager
4 for Department Supervisor
5 for Employee
Ordinal
Example:
Students of a university are classified by their
class standing using a nonnumeric label such as
Freshman, Sophomore, Junior, or Senior.
Alternatively, a numeric code could be used for
the class standing variable (e.g. 1 denotes
Freshman, 2 denotes Sophomore, and so on).
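The numeric coding in the nominal and ordinal examples can be sketched in Python. This is only an illustration, not part of the original slides: the names `School` and `ClassStanding` are hypothetical, and the codes for Junior (3) and Senior (4) are assumed by continuing the pattern the example starts.

```python
from enum import Enum, IntEnum


class School(Enum):
    """Nominal scale: the codes are labels only; their order means nothing."""
    BUSINESS = 1
    HUMANITIES = 2
    EDUCATION = 3


class ClassStanding(IntEnum):
    """Ordinal scale: the codes also record rank order."""
    FRESHMAN = 1
    SOPHOMORE = 2
    JUNIOR = 3   # assumed: continues the 1, 2, ... pattern
    SENIOR = 4   # assumed: continues the 1, 2, ... pattern


# Nominal codes support only equality checks.
same_school = School.BUSINESS == School(1)

# Ordinal codes support order comparisons as well.
is_earlier = ClassStanding.FRESHMAN < ClassStanding.SENIOR
```

Using a plain `Enum` for the nominal variable is a deliberate choice in this sketch: Python refuses `School.BUSINESS < School.HUMANITIES`, which mirrors the fact that ranking nominal categories is not meaningful.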
Ordinal Data
1 2 3 4 5
Interval
Example:
Tushar has an SAT score of 1205, while Priya
has an SAT score of 1090. Tushar scored 115
points more than Priya.
Example: Monetary Variables, such as Profit and Loss, Revenues, and Expenses
Example: Financial ratios, such as P/E Ratio, Inventory Turnover, and Quick
Ratio.
Ratio
Example:
If we compare the cost of Rs. 30000 for one
automobile to the cost of Rs. 15000 for a second
automobile, the ratio property shows that the first
automobile is Rs 30000/Rs 15000= 2 times the cost
of the second one.
Tushar’s college record shows 36 credit hours
earned, while Priya’s record shows 72 credit
hours earned. Priya has twice as many credit
hours earned as Tushar.
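The arithmetic behind the interval and ratio examples can be checked with a few lines of Python. The variable names are illustrative; the figures are the ones quoted in the examples.

```python
# Interval scale: differences are meaningful (SAT scores from the example).
tushar_sat, priya_sat = 1205, 1090
sat_difference = tushar_sat - priya_sat  # points by which Tushar outscored Priya

# Ratio scale: ratios are meaningful as well.
cost_ratio = 30000 / 15000   # first automobile vs. second (Rs.)
credit_ratio = 72 / 36       # Priya's credit hours vs. Tushar's

print(sat_difference, cost_ratio, credit_ratio)  # prints: 115 2.0 2.0
```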
Scales of Measurement
[Figure: Nominal scale. Numbers assigned to runners: 7, 8, 3 (finish line shown).]
Data
Categorical Quantitative
SOURCES OF DATA
Internal External
Primary Secondary
METHODS OF COLLECTING DATA
COLLECTION OF DATA
Primary Secondary
Colour Number
Brown 18
Black 12
Total 30
The bringing together of items with
common characteristics is known
as CLASSIFICATION.
21 50 42 75 55 67 74 55 47 64
71 61 40 25 25 54 64 37 88 44
31 70 81 51 45 63 49 43 35 67
68 31 38 45 59 75 57 29 66 50
56 84 56 88 63 32 55 88 79 78
MARKS IN STATISTICS OF 250 STUDENTS
32 47 41 51 41 30 39 18 48 53
54 32 31 46 15 37 32 56 42 48
38 26 50 40 38 42 35 22 62 51
44 21 45 31 37 41 44 18 37 47
68 41 30 52 52 60 42 38 38 34
41 53 48 21 28 49 42 36 41 29
30 33 37 35 29 37 38 40 32 49
43 32 24 38 38 22 41 50 17 46
46 50 26 15 23 42 25 52 38 46
41 38 40 37 40 48 45 30 28 31
40 33 42 36 51 42 56 44 35 38
31 51 45 41 50 53 50 32 45 48
49 43 40 34 34 44 38 58 49 28
40 45 19 24 34 47 37 33 37 36
36 32 61 30 44 43 50 31 38 45
46 40 32 34 44 54 35 39 31 48
48 50 43 55 43 39 41 48 53 34
32 31 42 34 34 32 33 24 43 39
40 50 27 47 34 44 34 33 47 42
17 42 57 35 38 17 33 46 36 23
48 50 31 58 33 44 26 29 31 37
47 55 57 37 41 54 42 45 47 43
34 52 47 46 44 50 44 38 42 19
52 45 23 41 47 33 42 24 48 39
48 44 60 38 38 44 38 43 40 48
MARKS NO. OF STUDENTS ( F )
15 – 19 9
20 – 24 11
25 – 29 10
30 – 34 44
35 – 39 45
40 – 44 54
45 – 49 37
50 – 54 26
55 – 59 8
60 – 64 5
65 – 69 1
TOTAL 250
MARKS NO. OF STUDENTS ( F )
15 – 20 9
20 – 25 11
25 – 30 10
30 – 35 44
35 – 40 45
40 – 45 54
45 – 50 37
50 – 55 26
55 – 60 8
60 – 65 5
65 – 70 1
TOTAL 250
MARKS   NO. OF STUDENTS ( F )   CUMULATIVE FREQUENCY (<)   CUMULATIVE FREQUENCY (>)
15 – 20 9 9 250
20 – 25 11 20 241
25 – 30 10 30 230
30 – 35 44 74 220
35 – 40 45 119 176
40 – 45 54 173 131
45 – 50 37 210 77
50 – 55 26 236 40
55 – 60 8 244 14
60 – 65 5 249 6
65 – 70 1 250 1
TOTAL 250
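The two cumulative columns can be derived from the class frequencies alone. The following is a minimal Python sketch; the variable names are illustrative, and the frequencies are copied from the table.

```python
from itertools import accumulate

# Class frequencies for the classes 15-20, 20-25, ..., 65-70 (from the table).
freq = [9, 11, 10, 44, 45, 54, 37, 26, 8, 5, 1]

# "Less than" cumulative frequency: running total from the first class onward.
less_than = list(accumulate(freq))

# "More than" cumulative frequency: running total from the last class backward.
more_than = list(accumulate(reversed(freq)))[::-1]

for f, lt, mt in zip(freq, less_than, more_than):
    print(f, lt, mt)
```

The last "less than" value and the first "more than" value both equal the total number of students, which is a quick check on the table.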
Group Data and the Histogram
Spending Class ($) (x)   Cumulative Frequency (F(x))   Cumulative Relative Frequency (F(x)/n)
[Table rows not recovered; final row: 184, 1.000]
Pie Charts
Categories represented as percentages of total
Bar Graphs
Heights of rectangles represent group frequencies
Frequency Polygons
Height of line represents frequency
Ogives
Height of line represents cumulative frequency
Organizing Numerical Data
Numerical Data
Ordered Array
Stem-and-Leaf Display
Frequency Distributions and Cumulative Distributions: Histogram, Polygon, Ogive
The Ordered Array
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
Example: Hudson Auto
Parts Cost ($)   Frequency   Percent Frequency
50-59                2            4
60-69               13           26
70-79               16           32
80-89                7           14
90-99                7           14
100-109              5           10
Total               50          100
Percent frequency = (frequency/50)100; for the 50-59 class, (2/50)100 = 4.
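The table can be reproduced by tallying the raw values. The sketch below assumes that the fifty values listed before this example are the Hudson Auto parts costs (the slides do not label them, but the counts agree); the variable names are illustrative.

```python
# Fifty parts costs ($), assumed to be the Hudson Auto tune-up data.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

# Lower class limits 50, 60, ..., 100; each class spans ten whole dollars.
lower_limits = range(50, 110, 10)

# Count the values falling in each inclusive class lo .. lo + 9.
frequency = [sum(lo <= c <= lo + 9 for c in costs) for lo in lower_limits]

# Percent frequency = (frequency / n) * 100.
percent = [100 * f / len(costs) for f in frequency]

for lo, f, p in zip(lower_limits, frequency, percent):
    print(f"{lo}-{lo + 9}: {f} ({p:.0f}%)")
```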
Example: Hudson Auto
[Histogram: Tune-up Parts Cost. Horizontal axis: Parts Cost ($), with classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-110. Vertical axis: Frequency, scaled from 2 to 18.]
The most common numerical descriptive statistic
is the average (or mean).
The average is a measure of the central
tendency, or central location, of the data for a variable.
Hudson’s average cost of parts, based on the 50
tune-ups studied, is $79 (found by summing the
50 cost values and then dividing by 50).
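Written as a formula, the calculation is the usual sample mean. The total of 3949 is an assumption here: it comes from summing the fifty values listed earlier in the deck, taken to be the Hudson Auto parts costs.

```latex
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{3949}{50} = 78.98 \approx 79
```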
Statistical Inference
RD1 Red 12
RD2 Red 10
RD3 Red 13
RD4 Red 10
RD5 Red 13
BL1 Blue 27
BL2 Blue 24
GR1 Green 35
GR2 Green 35
GY1 Gray 15
GY2 Gray 18
GY3 Gray 17
Sample and Sample Data
RD2 Red 10
RD5 Red 13
GR1 Green 35
GY2 Gray 18
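The difference between a parameter (µ) and a statistic (x̄) can be illustrated with the two listings above. This is a sketch under assumptions: the slides do not say what the numeric column measures, so it is treated simply as "value", and the variable names are illustrative.

```python
from statistics import mean

# (ID, colour, value) rows from the population listing; the meaning of the
# numeric column is not stated in the source.
population = [
    ("RD1", "Red", 12), ("RD2", "Red", 10), ("RD3", "Red", 13),
    ("RD4", "Red", 10), ("RD5", "Red", 13),
    ("BL1", "Blue", 27), ("BL2", "Blue", 24),
    ("GR1", "Green", 35), ("GR2", "Green", 35),
    ("GY1", "Gray", 15), ("GY2", "Gray", 18), ("GY3", "Gray", 17),
]

# The four items that appear in the sample listing.
sample_ids = {"RD2", "RD5", "GR1", "GY2"}
sample = [row for row in population if row[0] in sample_ids]

mu = mean(value for _, _, value in population)  # parameter: population mean
x_bar = mean(value for _, _, value in sample)   # statistic: sample mean

print(f"mu = {mu:.2f}, x_bar = {x_bar}")
```

Here the sample mean lands close to the population mean, which is the idea behind using a statistic to estimate a parameter; a different sample of four items would generally give a different x̄.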
1. Population consists of all tune-ups. Average cost of parts is unknown.
2. A sample of 50 engine tune-ups is examined.