
LESSON 1

POPULATION AND SAMPLE

Population (N) - collection of persons, things, or objects under study.

Sample (n) - portion of the population.

Sampling - selecting a portion (or subset) of the population.

Sample Size - total number of things in a sample.

PARAMETER AND STATISTIC

Parameter - number that is a property of the population.

Statistic - number that represents a property of the sample.

KEY NOTES

Population - when the problem does not say how many there are.
Sample - when a specific number is given.
Parameter - used when the value describes a population.
Statistic - used when the value describes a sample.

VARIABLES AND DATA

Variables - characteristics of interest for each person or thing, represented by capital letters such as X and Y.

  a. Numerical Variables - take on values with equal units (weight in pounds, time in hours).
  b. Categorical Variables - place things into categories.

Data - actual values of the variable; may be numbers or words.

Note: variables are characteristics that change or vary (age, gender, course).

MEAN AND PROPORTION

Mean (x̄) - average.

Proportion - fraction (ex. 25/50 are men).

DESCRIPTIVE AND INFERENTIAL STATISTICS

Descriptive Statistics - numerical measures of data; deals with organizing and summarizing data by:
  a. Graphing
  b. Using numbers

Inferential Statistics - making decisions or predictions and drawing good conclusions; uses probability to determine how confident we can be.

Probability - mathematical tool for studying randomness; deals with chance.

TYPES OF DATA

Qualitative (Categorical) - results of categorizing or describing qualities; described by words or letters.

Quantitative (Numerical) - always numbers; results of counting or measuring.

  Quantitative Discrete - result of counting.
  Quantitative Continuous - result of measuring.
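The mean and proportion definitions from Lesson 1 can be tried out in a few lines of Python. This is only a minimal sketch: the helper names `mean` and `proportion` and the list of ages are made up for illustration, and only the 25-out-of-50 proportion comes from the notes.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)


def proportion(count_of_interest, total):
    """Proportion: the fraction of the group that has the trait of interest."""
    return count_of_interest / total


# Hypothetical sample of n = 5 ages (illustrative data, not from the notes).
sample_ages = [18, 19, 21, 20, 22]
sample_mean = mean(sample_ages)  # a statistic, because it describes a sample

# Example from the notes: 25 of 50 people are men.
men_proportion = proportion(25, 50)

print(sample_mean, men_proportion)
```

If the same calculations were run on every member of the population instead of a sample, the results would be parameters rather than statistics.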
ORGANIZING DATA

Statistical Table
- use a data distribution to describe what values occur and how often
- can be measured in 3 ways:
  1. Frequency (f)
  2. Relative Frequency (f/n)
  3. Percentage (Rf * 100)

Graphs
- helpful
- no strict rules
  1. Pie chart - wedges in a circle
  2. Bar graph - length of the bar

LESSON 2

SAMPLING METHOD

A. NON-PROBABILITY SAMPLING - members are not given an equal opportunity to be chosen.

1. Convenience - based on easy and convenient access.
2. Quota - to fill a specific quota; chosen according to traits.
3. Judgmental - also called purposive or authoritative; the basis is the researcher's judgment.
4. Snowball - through referrals or recruitment.

B. PROBABILITY SAMPLING - members are given equal chances of being chosen.

1. Simple Random - dice or number generator.
2. Systematic - every r-th member, at a fixed interval.
3. Cluster - for a large population; selected by groups (for example, rows).
4. Stratified - there are subgroups (strata) whose members share the same characteristics.

   sample per stratum = (# of members in the stratum / population) * desired sample

   Note: a sample is drawn from each type of people (each stratum).

CONDUCTING A SURVEY

A. Face-to-face interview
B. Self-administered survey

DESIGNING A SURVEY

1. Identify the goal (problem, variables).
2. Identify the sample (who?).
3. Choose the interviewing method.
4. Decide what questions you will ask.
5. Conduct interviews and collect information.

LESSON 3

LEVELS OF MEASUREMENT

1. Nominal Scale Level
- quality or category
- names, colors, labels, gender, etc.
- ORDER DOES NOT MATTER

2. Ordinal Scale Level
- ranking or placement
- Likert scale (very good, good, etc.)
- order matters
- DIFFERENCE CANNOT BE MEASURED

3. Interval Scale Level
- order matters
- differences can be measured (except ratios)
- NO TRUE "0" STARTING POINT
- example: calendar
4. Ratio Scale Level
- order matters
- differences can be measured
- has a true "0" starting point
- example: score

LESSON 4

A. UNGROUPED DATA

2, 2, 3, 4, 5, 5, 6, 7, 8, 8, 8, 8, 8, 9, 9, 10, 11, 11, 12

X      f     <CFD   >CFD   Rf     CuRf
2      2     2      19     2/19   2/19
3      1     3      17     1/19   3/19
4      1     4      16     1/19   4/19
5      2     6      15     2/19   6/19
6      1     7      13     1/19   7/19
7      1     8      12     1/19   8/19
8      5     13     11     5/19   13/19
9      2     15     6      2/19   15/19
10     1     16     4      1/19   16/19
11     2     18     3      2/19   18/19
12     1     19     1      1/19   1
Total  19                  1

B. GROUPED DATA

(i) STEPS IN CONSTRUCTING A FREQUENCY DISTRIBUTION

1. Determine the largest and smallest value.
2. Determine the number of class intervals (k).

   Formula: k = √n

3. Determine the approximate class size (c), also called bin size or class width.

   Formula: c = (Xmax - Xmin) / k

4. Determine the lower and upper class limits.
5. Write down the class intervals, starting with the decided lower and upper limits, and add the class size to get each succeeding class.
6. Determine the number of observations in each class.

(ii) CLASS BOUNDARY
- true class limit

Formula:

   UCB_i = (UL_i + LL_(i+1)) / 2        i = class or row

EXAMPLE:

65.6   35.6   52.5   74.7   73.0
83.4   63.6   56.4   52.5   49.2
73.3   33.2   10.3   74.7   52.7
76.2   28.6   36.0   64.7   45.8
97.6   41.0   64.5   83.4   45.9
72.1   65.4   78.5   80.1   50.2

1. Xmin = 10.3, Xmax = 97.6
2. k = 5
3. c = (97.6 - 10.3) / 5 = 17.46 ≈ 17.5

CLASS INTERVAL            CLASS BOUNDARY
LL      UL       f        LCB      UCB
10.0    27.4     1        9.95     27.45
27.5    44.9     5        27.45    44.95
45.0    62.4     8        44.95    62.45
62.5    79.9     12       62.45    79.95
80.0    97.4     3        79.95    97.45
97.5    114.9    1        97.45    114.95
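The steps for building a grouped frequency distribution can be sketched in Python using the 30 observations from the EXAMPLE. This is a sketch under stated assumptions, not a definitive method: the function names `class_size` and `frequency_table` are invented here, the first lower limit of 10.0 is taken from the notes' table, and the code assumes data recorded to one decimal place with no value below the first lower limit.

```python
import math

# The 30 observations from the EXAMPLE in the notes.
data = [
    65.6, 35.6, 52.5, 74.7, 73.0,
    83.4, 63.6, 56.4, 52.5, 49.2,
    73.3, 33.2, 10.3, 74.7, 52.7,
    76.2, 28.6, 36.0, 64.7, 45.8,
    97.6, 41.0, 64.5, 83.4, 45.9,
    72.1, 65.4, 78.5, 80.1, 50.2,
]


def class_size(values, k):
    """c = (Xmax - Xmin) / k, rounded up to one decimal place."""
    raw = (max(values) - min(values)) / k
    return math.ceil(raw * 10 - 1e-9) / 10


def frequency_table(values, first_lower, c):
    """Return rows (LL, UL, f, LCB, UCB) for data recorded to one decimal.

    Everything is converted to whole tenths so class edges are exact integers.
    Assumes no value is smaller than first_lower.
    """
    tenths = [round(v * 10) for v in values]
    start, width = round(first_lower * 10), round(c * 10)
    n_classes = (max(tenths) - start) // width + 1
    counts = [0] * n_classes
    for t in tenths:
        counts[(t - start) // width] += 1
    rows = []
    for i, f in enumerate(counts):
        ll = start + i * width  # lower limit, in tenths
        ul = ll + width - 1     # upper limit, in tenths
        # Boundaries sit half a unit (0.05) outside the class limits.
        rows.append((ll / 10, ul / 10, f, (ll * 10 - 5) / 100, (ul * 10 + 5) / 100))
    return rows


k = round(math.sqrt(len(data)))  # k = sqrt(n), rounded to a whole number
c = class_size(data, k)
table = frequency_table(data, first_lower=10.0, c=c)
for row in table:
    print(row)
```

Working in tenths is a design choice to avoid floating-point surprises at class edges such as 27.45; a different rounding rule for c would give different classes.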
LESSON 5

DATA ORGANIZATION

Frequency (f) - number of times a value occurs.

Relative Frequency (f/n) - ratio (fraction or proportion) of the number of times a value occurs to the total number of outcomes.

Cumulative Relative Frequency - accumulation of the previous relative frequencies.

RELATIVE FREQUENCY HISTOGRAMS

- bar graph in which the height of the bar shows "how often".
- subintervals on the horizontal axis and relative frequencies on the vertical axis.
- The height of the bar represents:
  1. The proportion of measurements falling in that class or subinterval.
  2. The probability that a single measurement drawn at random will belong to that class or subinterval.

EXAMPLE:

CLASS INTERVAL            CLASS BOUNDARY
LL      UL       f        LCB      UCB       Xi
10.0    27.4     1        9.95     27.45     18.7
27.5    44.9     5        27.45    44.95     36.2
45.0    62.4     8        44.95    62.45     53.7
62.5    79.9     12       62.45    79.95     71.2
80.0    97.4     3        79.95    97.45     88.7
97.5    114.9    1        97.45    114.95    106.2

LESSON 6

WHY STUDY STATISTICS IN DATA ANALYSIS?

Generalization: states the outcome.
Prediction: predicts based on the outcome.
Decision: supports decision making.

MEASURE OF CENTRAL TENDENCY (UNGROUPED)

- single number that gives a summary of the characteristics of the data.

MOST COMMON MEASURES OF CENTRAL TENDENCY:

1. Mean (arithmetic mean)
- average of the measurements.

Characteristics:
Advantages:
Disadvantages:

2. Median
- middle value; divides the data into two (2) equal parts.

Characteristics:
Advantages:
Disadvantages:

3. Mode
- value with the highest frequency.

Characteristics:
Advantages:
Disadvantages:
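The three measures of central tendency for ungrouped data can be checked with Python's standard `statistics` module. As a small illustration, the sketch below reuses the 19 values listed under UNGROUPED DATA in Lesson 4; the variable names are chosen here for readability.

```python
import statistics

# The 19 ungrouped values from Lesson 4.
values = [2, 2, 3, 4, 5, 5, 6, 7, 8, 8, 8, 8, 8, 9, 9, 10, 11, 11, 12]

mean_value = statistics.mean(values)      # sum of the values / number of values
median_value = statistics.median(values)  # middle value of the sorted data
mode_value = statistics.mode(values)      # value with the highest frequency

print(round(mean_value, 2), median_value, mode_value)
```

With an odd number of values the median is the single middle item (the 10th of 19 here); with an even number, `statistics.median` averages the two middle items.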
MEASURES OF SPREAD (UNGROUPED)

Range - spread of the data from the lowest to the highest value in the distribution; simplest measure of variability.

Average Deviation - average of the variations in a data set; measures the distance of the values from the mean or median.

Variance - spread between numbers; how far each number is from the mean.

Standard Deviation - amount of variation or dispersion.

Note: low standard deviation - values are close to the mean
    : high standard deviation - values are spread out

EXAMPLE:

Brands   No. of Days
A        27   33   29   30   31
B        30   26   32   30   32
C        28   34   30   31   27
D        29   30   32   30   29
E        28   33   30   27   32

X̄ = 30 days

Range
A = 6 days
B = 6 days
C = 7 days
D = 3 days (has the smallest difference)
E = 6 days

Suppose brand D is out of stock.

Brand A          Brand B          Brand C          Brand E
X    (x-X̄)²      X    (x-X̄)²      X    (x-X̄)²      X    (x-X̄)²
27   9           30   0           28   4           28   4
33   9           26   16          34   16          33   9
29   1           32   4           30   0           30   0
30   0           30   0           31   1           27   9
31   1           32   4           27   9           32   4

Variance = Σ(x-X̄)² / n
A = 4        B = 4.8      C = 6        E = 5.2

Standard Deviation = √( Σ(x-X̄)² / n )
A = 2        B ≈ 2.19     C ≈ 2.45     E ≈ 2.28

MEASURES OF SYMMETRY (GROUPED)

Measures of Kurtosis - degree of peakedness of a unimodal distribution.

Note: Peakedness = comparative measure of the height of the peak of a frequency distribution, usually taken relative to a normal distribution (symmetric distribution).
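The range, variance, and standard deviation in the brand example can be reproduced with the standard `statistics` module. This is a sketch under two assumptions: Brand A's second value is read as 33 (the value used in the squared-deviation table), and the variance divides by n, which matches `statistics.pvariance` and `statistics.pstdev`. The helper `value_range` is a name invented here.

```python
import statistics

# Days recorded for each brand; Brand A's second value is assumed to be 33.
brands = {
    "A": [27, 33, 29, 30, 31],
    "B": [30, 26, 32, 30, 32],
    "C": [28, 34, 30, 31, 27],
    "D": [29, 30, 32, 30, 29],
    "E": [28, 33, 30, 27, 32],
}


def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)


for name, days in brands.items():
    spread = value_range(days)
    variance = statistics.pvariance(days)  # sum of squared deviations / n
    std_dev = statistics.pstdev(days)      # square root of that variance
    print(name, spread, round(variance, 2), round(std_dev, 2))
```

Every brand has the same mean of 30 days, so the spread measures are what separate them: a smaller standard deviation means the values sit closer to the mean.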
MEASURES OF POSITION (GROUPED)

QUARTILES (4 equal parts)
.      .      .      .      .
Q0     Q1     Q2     Q3     Q4

PERCENTILES (100 equal parts)
.      .      .      .      .
P0     P25    P50    P75    P100

DECILES (10 equal parts)
.    .    .    .    .    .    .    .    .
D0   D1   D2

D2 = P20
D7 = P70
P25 = Q1

PERCENTAGE VS. PERCENTILE

1, 2, 3, 4, 5

Percentage = no. of interest / total no. of data

Location of P = Percentile * (n + 1) / 100

EXAMPLE:
2, 3, 5, 7, 8, 10, 11, 13, 15, 16, 19        n = 11

Location of P50 = 50(11 + 1) / 100 = 6th place
Location of P30 = 30(11 + 1) / 100 = 3.6th place

MEASURES OF CENTER (GROUPED)

1. Mean

   Mean = ΣFX / ΣF

   where:
   F = class frequency
   X = class mark or midpoint

2. Median

   Median = L1 + ((n/2 - ΣF1) / fmed) * c

   where:
   L1 = LCB of the median class (the class where the (n/2)th item belongs)
   n = total frequency
   fmed = median class frequency
   ΣF1 = sum of the frequencies of all classes lower than the median class
   c = class size = (Xmax - Xmin) / k

3. Mode

   Mode = L1 + (∆1 / (∆1 + ∆2)) * c

   where:
   L1 = LCB of the modal class (highest-frequency class)
   ∆1 = excess of the modal class frequency over the frequency of the next lower class
   ∆2 = excess of the modal class frequency over the frequency of the next higher class
   c = class size
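The grouped mean, median, and mode formulas can be sketched as three small Python functions. The function names are invented for illustration. The class boundaries, midpoints, and class size are those of the notes' grouped example, and the frequencies 1, 5, 8, 13, 3 are the ones its worked solution relies on (1 + 5 + 8, fmed = 13). The mode function assumes a single modal class whose frequency is strictly higher than at least one neighbor.

```python
def grouped_mean(midpoints, freqs):
    """Mean = sum(F * X) / sum(F), with X the class midpoints."""
    return sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)


def grouped_median(lower_boundaries, freqs, c):
    """Median = L1 + ((n/2 - sum of lower frequencies) / fmed) * c."""
    n = sum(freqs)
    cumulative = 0  # total frequency of the classes below the current one
    for lcb, f in zip(lower_boundaries, freqs):
        if cumulative + f >= n / 2:  # the (n/2)th item falls in this class
            return lcb + (n / 2 - cumulative) / f * c
        cumulative += f


def grouped_mode(lower_boundaries, freqs, c):
    """Mode = L1 + (d1 / (d1 + d2)) * c for the highest-frequency class."""
    i = max(range(len(freqs)), key=freqs.__getitem__)
    below = freqs[i - 1] if i > 0 else 0
    above = freqs[i + 1] if i + 1 < len(freqs) else 0
    d1, d2 = freqs[i] - below, freqs[i] - above
    return lower_boundaries[i] + d1 / (d1 + d2) * c


# Grouped example from the notes (class size c = 17.5).
lcb = [10.25, 27.75, 45.25, 62.75, 80.25]
mid = [19.0, 36.5, 54.0, 71.5, 89.0]
freq = [1, 5, 8, 13, 3]

mean_g = grouped_mean(mid, freq)
median_g = grouped_median(lcb, freq, 17.5)
mode_g = grouped_mode(lcb, freq, 17.5)
print(round(mean_g, 2), round(median_g, 2), round(mode_g, 2))
```

All three results are estimates, because grouping replaces the individual values with class midpoints and boundaries.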

EXAMPLE:

CLASS INTERVAL        CLASS BOUNDARY
LL      UL      LCB      UCB      f      Xi
10.3    27.7    10.25    27.75    1      19.0
27.8    45.2    27.75    45.25    5      36.5
45.3    62.7    45.25    62.75    8      54.0
62.8    80.2    62.75    80.25    13     71.5
80.3    97.7    80.25    97.75    3      89.0

Mean:
X̄ = 61

Median:
n = 30, n/2 = 15
1 + 5 + 8 = 14 (not enough), so the (n/2)th item belongs to the class with f = 13
fmed = 13
L1 = 62.75
c = 17.5
ΣF1 = 1 + 5 + 8 = 14
Median = 62.75 + ((15 - 14) / 13) * 17.5 = 64.10

Mode:
L1 = 62.75
∆1 = 13 - 8 = 5
∆2 = 13 - 3 = 10
c = 17.5
Mode = 62.75 + (5 / (5 + 10)) * 17.5 = 68.58

MEASURES OF LOCATION (GROUPED)

Quartiles, Deciles, Percentiles

Formula:

   Qk = L1 + ((k*n/4 - ΣF1) / F) * c
   (use k*n/10 for deciles and k*n/100 for percentiles)

   where:
   L1 = LCB of the observed class
   F = observed class frequency
   ΣF1 = sum of the frequencies of all classes lower than the observed class

Standard Deviation (Sample)

   s = √( ΣF(X - X̄)² / (n - 1) )

   where:
   X = midpoint
   X̄ = sample mean

   n ≤ 100

Standard Deviation (Population)

   σ = √( ΣF(X - μ)² / N )

   where:
   μ = population mean
   N = population size

MEASUREMENT OF SKEWNESS (GROUPED)

Skewness - based on the third central moment (the moment about the mean); determines the symmetry of the distribution.

Formula:

   a3 = ΣF(x - X̄)³ / ((n - 1) s³)

Moment Coefficient of Kurtosis

Formula:

   a4 = m4 / s⁴

   where:
   m4 = the fourth moment about the mean
   s = standard deviation
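The skewness coefficient a3 for grouped data can be sketched in Python as follows. Treat this as an illustration under assumptions rather than the notes' exact procedure: the function names are invented, and the sample standard deviation is assumed to divide by n - 1 so that it pairs with the (n - 1) in the a3 formula. The midpoints are those of the grouped example, with the frequencies 1, 5, 8, 13, 3 used in its worked solution.

```python
import math


def grouped_mean(midpoints, freqs):
    """Mean of grouped data: sum(F * X) / sum(F)."""
    return sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)


def grouped_sample_sd(midpoints, freqs):
    """Sample standard deviation, assuming division by n - 1."""
    n = sum(freqs)
    m = grouped_mean(midpoints, freqs)
    squared = sum(f * (x - m) ** 2 for f, x in zip(freqs, midpoints))
    return math.sqrt(squared / (n - 1))


def skewness_a3(midpoints, freqs):
    """a3 = sum(F * (x - mean)^3) / ((n - 1) * s^3)."""
    n = sum(freqs)
    m = grouped_mean(midpoints, freqs)
    s = grouped_sample_sd(midpoints, freqs)
    cubed = sum(f * (x - m) ** 3 for f, x in zip(freqs, midpoints))
    return cubed / ((n - 1) * s ** 3)


# Midpoints and frequencies of the grouped example in the notes.
mid = [19.0, 36.5, 54.0, 71.5, 89.0]
freq = [1, 5, 8, 13, 3]
a3 = skewness_a3(mid, freq)
print(a3)
```

A value of a3 near zero indicates a symmetric distribution, a negative value indicates a longer tail on the low side, and a positive value a longer tail on the high side.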
