Lec 11 Chapter IV Descriptive and Inferential Statistics


Chapter IV: Descriptive and inferential statistics


4.1 Descriptive statistics
Purpose: To communicate the essential characteristics of a
population through a data set obtained from a sample.
Population sample Data Tables
Graphs
Numerical indexes
(Estimates)
(Averages, Percentages, percentile ranks, variability
measures, correlation coefficients, regression coefficients,
etc) 1
4.1.1 Frequency distributions and Graphs

Frequency Distribution:

1. Ungrouped frequency distribution

2. Grouped frequency distribution

Ungrouped frequency distribution is an arrangement of data in
which the frequency of each data value of a variable is
shown.

Grouped frequency distribution is an arrangement of data in
which the data values of a variable are clustered or grouped into
intervals and the frequency of each interval is shown.
Cont…
• Steps in constructing frequency distribution tables:
– List each data value in ascending order (column 1).
– Count the number of times each value occurs, i.e. its frequency
(column 2). For a grouped frequency distribution, collapse the data
values into non-overlapping intervals and find the frequency in
each interval.
– Find the cumulative frequency of each value/interval (column 3,
optional).
– Find the percentage of each value/interval (column 4, optional).
– Find the cumulative percentage of each value/interval (column 5,
optional).
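
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `frequency_table` and the quiz scores are invented for the example and do not come from the lecture.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, freq, cum_freq, percent, cum_percent)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cum_freq = 0
    for value in sorted(counts):               # column 1: values in ascending order
        freq = counts[value]                   # column 2: frequency
        cum_freq += freq                       # column 3: cumulative frequency
        percent = 100 * freq / total           # column 4: percentage
        cum_percent = 100 * cum_freq / total   # column 5: cumulative percentage
        rows.append((value, freq, cum_freq, percent, cum_percent))
    return rows

# Hypothetical quiz scores, invented for illustration.
scores = [3, 5, 4, 3, 5, 5, 2, 4, 5, 3]
for row in frequency_table(scores):
    print(row)
```

The last row always ends with a cumulative frequency equal to the number of observations and a cumulative percentage of 100, which is a quick check on any hand-built table.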


Categorical Freq. Table

Blood Type   Frequency   Cumulative Frequency   Percent   Cumulative percent
A            5           5                      20        20
B            4           9                      16        36
AB           7           16                     28        64
O            9           25                     36        100

Grouped Frequency Table

Temperature   Frequency   Cumulative Frequency   Percent   Cumulative percent
100-104       2           2                      4         4
105-109       8           10                     16        20
110-114       18          28                     36        56
115-119       13          41                     26        82
120-124       7           48                     14        96
125-129       1           49                     2         98
130-134       1           50                     2         100
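
As a check, the cumulative and percentage columns of the temperature table can be recomputed from the frequency column alone. The sketch below uses only the interval labels and frequencies given in the table; the variable names are chosen for illustration.

```python
from itertools import accumulate

# Interval labels and frequencies taken from the grouped temperature table.
intervals = ["100-104", "105-109", "110-114", "115-119",
             "120-124", "125-129", "130-134"]
frequencies = [2, 8, 18, 13, 7, 1, 1]

total = sum(frequencies)                          # number of observations
cum_freq = list(accumulate(frequencies))          # running total of frequencies
percent = [100 * f / total for f in frequencies]  # share of each interval
cum_percent = [100 * c / total for c in cum_freq] # running share

for row in zip(intervals, frequencies, cum_freq, percent, cum_percent):
    print(row)
```

Because a running total starts with the first frequency itself, the cumulative frequency of the lowest interval equals its own frequency.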

Graphic presentations of data
• Graphs include:
– Bar graphs

– Histograms

– Line graphs

– Scatter plots

– Pie charts

– Pictorial diagrams, etc

Define each and obtain examples of each.


[Example figures not reproduced: bar chart, line chart, histogram, pie chart, scatter diagram]
4.1.2. Measures of central tendency, Variability,
Relative Positions, and Relationships

• Measures of central tendency include:

– Mean: arithmetic average value

– Median: 50th percentile value

– Mode: the most frequent value

Compare the mean, median and mode.


• In a normal distribution (symmetrical, unimodal, bell-shaped),
the three averages are equal.
• In a skewed (asymmetrical) distribution, the three averages
differ.
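
A small sketch with Python's standard `statistics` module illustrates the comparison. Both data sets are hypothetical, made up so that one is symmetrical and the other has a single large value in its right tail.

```python
import statistics

# Hypothetical data sets, invented for illustration.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 14]   # one large value in the right tail

for name, data in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    print(name,
          "mean:", statistics.mean(data),
          "median:", statistics.median(data),
          "mode:", statistics.mode(data))
```

In the symmetrical set the three averages coincide at 3. In the skewed set the single large value pulls the mean up the most, the median less, and leaves the mode unchanged.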
Skewness and Kurtosis
• Normal distribution
mean = median = mode
• Negatively skewed distributions
– Skewed to the left
– Mean (M) < Median (Md) < Mode (Mo)

Cont…
• Positively skewed distributions
– Skewed to the right
– Mode (Mo) < Median (Md) < Mean (M)
• Outliers at only one end cause skewness.
• The tail indicates the direction of the skewness.
Cont…
• Kurtosis describes the shape of the hump at the modal point, i.e.
how peaked or flat the distribution is.
• Both skewness and normality are matters of degree. They are
approximated by histograms of frequency distributions of samples
drawn from skewed and normal populations, respectively.
• According to Karl Pearson, for a moderately skewed distribution the
following empirical relation holds approximately between the mean,
the median and the mode:
mean − mode = 3(mean − median)
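
A short worked example may help. It uses Pearson's empirical relation in its commonly cited form, mean − mode ≈ 3(mean − median); the numbers (mean 50, median 48) are hypothetical.

```latex
\begin{align*}
\text{mean} - \text{mode} &\approx 3(\text{mean} - \text{median}) \\
\text{mode} &\approx 3\,\text{median} - 2\,\text{mean} \\
            &= 3(48) - 2(50) = 44
\end{align*}
```

Here mode < median < mean (44 < 48 < 50), the ordering expected for a positively skewed distribution.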
Measures of variability
Measures of variability include:
– Range: difference between the largest and
the smallest value in the data set.
– Variance: average of squared deviations of
scores from their mean.
• Population variance: $\sigma^2 = \sum (X - \mu)^2 / N$
• Sample variance: $s^2 = \sum (X - M)^2 / (n - 1)$
– Standard deviation: square root of the variance.
– Coefficient of variation: (SD / mean) × 100%.
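
These measures can be computed with Python's standard `statistics` module, as in the sketch below. The scores are hypothetical, and using the population SD in the coefficient of variation is an assumption made for the example.

```python
import statistics

# Hypothetical scores, invented for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(scores) - min(scores)       # largest minus smallest value
pop_var = statistics.pvariance(scores)        # divides by N (population)
sample_var = statistics.variance(scores)      # divides by n - 1 (sample)
pop_sd = statistics.pstdev(scores)            # square root of population variance
cv_percent = 100 * pop_sd / statistics.mean(scores)  # CV, using population SD

print("range:", value_range)
print("population variance:", pop_var)
print("sample variance:", sample_var)
print("population SD:", pop_sd)
print("coefficient of variation (%):", cv_percent)
```

The sample variance is always somewhat larger than the population variance computed from the same numbers, because the same sum of squared deviations is divided by n − 1 rather than N.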
Normal distribution
Circle:
Equation: $x^2 + y^2 = r^2$, where r = the radius.
Helps study wheels and circular motions that are not perfectly circular.
Think of circles of different radii.

Normal curve:
Equation: $y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}$,
where e ≈ 2.718, π ≈ 3.14, µ = population mean, σ = population SD.
Helps study variables that are not perfectly normal.
Think of normal curves of different µ and σ.
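
The normal curve equation can be evaluated directly. The sketch below is a plain translation of the formula; the function name `normal_pdf` and the example values (µ = 100, σ = 15) are illustrative assumptions.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x for mean mu and SD sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Height at the mean of the standard normal curve (mu = 0, sigma = 1).
print(normal_pdf(0.0))

# Hypothetical curve with mu = 100 and sigma = 15, evaluated at its mean.
print(normal_pdf(100.0, mu=100.0, sigma=15.0))
```

Changing µ shifts the curve along the horizontal axis, while a larger σ makes it lower and wider, since the total area under the curve stays the same.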
Shapes of normal distributions
[Figure: two panels, each showing two normal curves labelled 1 and 2]
a) Same mean, different SDs: µ1 = µ2, σ1 > σ2.
b) Different means but same SD: µ1 > µ2, σ1 = σ2.
NB: You can imagine another situation in which we have
different means and different standard deviations.
SD and normal distribution

• For a given normal population distribution,
about 68.26% of cases fall within ±1 SD of the mean
about 95.44% of cases fall within ±2 SD of the mean
about 99.74% of cases fall within ±3 SD of the mean
• Each percentage represents the portion of the area under the
normal curve within the given number of SDs of the mean.
• The area under the whole curve is taken to be 1
or 100%.
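
These areas can be computed rather than read from a table. The sketch below relies on the standard identity that the proportion of a normal distribution within k SDs of the mean equals erf(k/√2); the function name `area_within` is chosen for illustration.

```python
import math

def area_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(k, "SD:", round(100 * area_within(k), 2), "%")
```

Direct computation gives 68.27%, 95.45% and 99.73%. The slide's figures differ only in the second decimal place, which is what one gets by doubling areas read from a four-decimal Z table.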
Measures of relative position
• Provide information about where a score falls with respect to the
other scores in the distribution of data.
• Raw score
• Derived scores: Mean, SD, percentile ranks, deciles, quartiles, Z
scores, etc.
• Percentile rank: The percentage of scores in a reference group
(norm group) that fall below a particular raw score.
• Percentile rank of score X = ((number of scores below X + 0.5) /
total number of scores) × 100%.
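
The percentile rank formula above translates directly into code. In this sketch the function name `percentile_rank` and the list of scores are hypothetical.

```python
def percentile_rank(scores, x):
    """Percentile rank of x: (number of scores below x + 0.5) / N * 100."""
    below = sum(1 for s in scores if s < x)
    return 100 * (below + 0.5) / len(scores)

# Hypothetical reference group of ten scores, invented for illustration.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(scores, 75))   # four scores lie below 75
```

With four of the ten scores below 75, the percentile rank is (4 + 0.5) / 10 × 100 = 45.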
Standard score and Percentile rank
• Quartiles divide the distribution into four parts.
• Deciles divide the distribution into 10 parts.
• The median is the 50th percentile, the 2nd quartile, or the 5th decile.
• Standard scores: scores converted from one scale to
another so that they have a particular mean and
SD and are more interpretable. Ex. Z score.
• Z = (X − M)/SD; Z scores have mean 0 and SD 1. They are
normally distributed only if the raw scores are.
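
A minimal sketch of the Z score conversion follows. The scores are hypothetical, and dividing by the population SD (rather than the sample SD) is an assumption, since the slide does not say which SD is meant.

```python
import statistics

def z_scores(scores):
    """Convert raw scores to Z scores using the mean and population SD."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)   # assumption: population SD
    return [(x - mean) / sd for x in scores]

# Hypothetical raw scores (mean 5, population SD 2), invented for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(scores)
print(z)
print("mean of Z:", statistics.mean(z))
print("SD of Z:", statistics.pstdev(z))
```

Whatever the original scale, the converted scores have mean 0 and SD 1, so a Z score reads directly as "how many SDs above or below the mean".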
Measures of Relationships
• Measures of relationships include:
– Pearson’s correlation coefficient r
– Spearman’s correlation coefficient ρ
– Contingency table
– Regression coefficient beta
• Simple regression coefficient
• Multiple regression coefficients
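
Three of these measures can be sketched in plain Python: Pearson's r, Spearman's ρ (computed as Pearson's r on the ranks, with tied values sharing their average rank), and the simple regression slope. The function names and the small data set are hypothetical, invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank scores."""
    return pearson_r(ranks(x), ranks(y))

def simple_slope(x, y):
    """Least-squares slope b in the simple regression y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical paired data, invented for illustration.
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]
print("Pearson r:", pearson_r(hours, marks))
print("Spearman rho:", spearman_rho(hours, marks))
print("regression slope:", simple_slope(hours, marks))
```

Pearson's r works on the raw values and so reflects linear association, whereas Spearman's ρ uses only the ordering of the values, which makes it less sensitive to outliers.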

Data 1
Exercise 4.1.2
Using SPSS and Data 1 above, answer each of the following.

1. Summarize the data for the three continuous variables (Startsal,
GPA and GRE), reporting averages, SD, range and coefficient of
variation in an appropriate table. Display the data in frequency
distribution tables, histograms with normal curve, and scatter
plots for these variables. Communicate the characteristics of the
population from which this sample is drawn in a paragraph.

2. Draw frequency bar graphs and percentage pie charts for Gender
and Major.
Exercise 4.1.2 (cont)
Based on Data 1, use SPSS to obtain:
3. A contingency table for Startsal by
Gender and Major.
4. The percentile rank of a GPA of 2.9 and
a salary of 33000.
5. The Z score of a GPA of 2.9 and a salary of 33000.
6. Pearson’s r between Startsal and GRE.
7. Spearman’s ρ between rank scores of Startsal
and GRE.
