
• DATA MANAGEMENT- students will learn the different measures and how to properly use them to
summarize a data set.


• DATA COLLECTION is the process of gathering and measuring information about the variables under
study, following an established, systematic procedure.
• QUALITATIVE DATA
-Can be observed or recorded.
-Is collected through observation, one-to-one interviews, focus groups,
and similar methods.
-Also known as categorical data (can be arranged into categories based on the attributes and
properties of a thing).
• QUANTITATIVE DATA
-Defined as the value of data in the form of counts or numbers.
-Any quantifiable information that can be used for mathematical calculation and statistical
analysis.
-Answers questions such as “how many?”, “how often?”, and “how much?”
2 types of quantitative data
1. Discrete type- data involves only counts and must be expressed in whole numbers (e.g.,
number of board passers).
2. Continuous type- can take any value within a range, so it is often expressed with fractions or
decimals (e.g., height of a student).
• FOUR LEVELS OF MEASUREMENT
1. Nominal Scale- (lowest level) used for identification purposes only. Qualitative data set whose
categories cannot be arranged in a sequence, e.g., your blood type.
2. Ordinal Scale- (second level) used for categorizing as well as for ordering.
-Qualitative data set whose categories can be put into a sequence, e.g., low | medium | high.
3. Interval Scale- (third level) variables of this type do not have a true zero, e.g.,
the IQ score of a student in class.
4. Ratio Scale- (highest level) possesses the property of identity. Quantitative variables that possess
an absolute zero all belong to the Ratio Scale, e.g., the length of your shoes.
• TYPES OF DATA COLLECTION METHOD
1. Qualitative Data Collection Method- data are usually collected through observation,
focus group discussions, and open-ended questions.
-This method generates longer narratives.
2. Quantitative Data Collection Method- data are usually collected using closed-ended
questions.
-Uses a questionnaire as the form for collecting information; answers are exact and short.
• ORGANIZING DATA
-Gathered data remain meaningless unless organized.
-To organize data is to present it in text, in a table, or in graphical form.
1. Textual Presentation- data are presented in text, in paragraphs and sentences.
-Highlights the most important characteristics and significant figures of the data.
2. Tabular Presentation- systematic and logical arrangement of data in the form of rows and columns.
a. One-Way Table- presents only one categorical variable, followed by its
frequency and percentage.
b. Two-Way Table- uses two categorical variables (the 1st variable is presented in rows, the
second in columns, and both are collected from a single group of observations).
c. Frequency Distribution Table- used to organize quantitative data by summarizing it
using intervals or classes.
1st step. Arrange the data in ascending order (ex. 10 10 12 13 15).
2nd step. Compute the range: R = highest value - lowest value (ex. R = 15 - 10 = 5).
3rd step. Compute the number of classes: K = 1 + 3.322 log(N), where N = no. of observations
(ex. K = 1 + 3.322 log 5 = 3.32).
4th step. Compute the class width: C = R/K (ex. C = 5/3.32 = 1.51, rounded up to 2).
5th step.
5.1 Class limits: start with the lowest value, 10, as the 1st lower limit (LL). Add the class width
to get the next LL (10 + 2 = 12), then subtract 1 to get the 1st upper limit (UL) (12 - 1 = 11).
Repeat: 12 + 2 = 14 (3rd LL) and 14 - 1 = 13 (2nd UL); 14 + 2 = 16 and 16 - 1 = 15 (3rd UL).
Continue until a class contains the highest value.
5.2 Class boundaries: subtract 0.5 from each LL and add 0.5 to each UL.
5.3 Tally the observations that fall within each class, then count the tallies and record the
count as the frequency.
Ex. For the data (10, 10, 12, 13, 15), only 2 observations fall in class 10-11, so the tally is ||.
5.4 To get the less-than cumulative frequency, start with the frequency of the first class
(which for us is 2), then add each succeeding frequency (2 + 2 = 4, then 4 + 1 = 5).
5.5 To get the greater-than cumulative frequency, start with the total number of
observations (which for us is 5), then subtract each frequency in turn (5 - 2 = 3, then
3 - 2 = 1).
5.6 Relative frequency: rf = frequency ÷ number of values × 100% (ex. 2 ÷ 5 × 100% = 0.4 × 100% = 40%).
5.7 Class midpoint: (lower limit + upper limit) ÷ 2 (ex. (10 + 11) ÷ 2 = 10.5,
(12 + 13) ÷ 2 = 12.5, (14 + 15) ÷ 2 = 14.5).

Class     Class       Class     Tally  Frequency  Less-than   Greater-than  Relative
interval  boundary    midpoint                    cumulative  cumulative    frequency
                                                  frequency   frequency
LL-UL     LB-UB                        f          <cf         >cf           rf
10-11     9.5-11.5    10.5      ||     2          2           5             40%
12-13     11.5-13.5   12.5      ||     2          4           3             40%
14-15     13.5-15.5   14.5      |      1          5           1             20%
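
The frequency distribution steps above can be sketched in Python. This is a minimal illustration, not a general-purpose routine: it assumes whole-number data (so each upper limit is the next lower limit minus 1), uses a base-10 logarithm for K, and rounds the class width up, as the worked example does. The names `rows`, `less_cf`, and `greater_cf` are made up for this sketch.

```python
import math

data = sorted([10, 10, 12, 13, 15])        # 1st step: ascending order
n = len(data)
data_range = data[-1] - data[0]            # 2nd step: R = 15 - 10 = 5
k = 1 + 3.322 * math.log10(n)              # 3rd step: number of classes, about 3.32
width = math.ceil(data_range / k)          # 4th step: class width rounded up to 2

rows = []
lower = data[0]
less_cf = 0
while lower <= data[-1]:                   # 5th step: build one row per class
    upper = lower + width - 1              # assumes whole-number data
    freq = sum(lower <= x <= upper for x in data)
    less_cf += freq                        # running total = less-than cumulative frequency
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),
        "midpoint": (lower + upper) / 2,
        "f": freq,
        "less_cf": less_cf,
        "rf": 100 * freq / n,              # relative frequency in percent
    })
    lower += width

remaining = n
for row in rows:                           # greater-than cumulative frequency
    row["greater_cf"] = remaining
    remaining -= row["f"]

for row in rows:
    print(row)
```

Each printed row corresponds to one line of the table: class limits, boundaries, midpoint, frequency, both cumulative frequencies, and the relative frequency.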

3. Graphical Presentation- an attractive way of presenting data visually.
a. Bar chart and pie chart- very useful for visually presenting categories.
Bar chart (the higher the bar, the higher the number)
Pie chart (the bigger the slice, the bigger the frequency)
b. Histogram- uses frequency along the vertical axis and class boundaries along
the horizontal axis.
Vertical scale- identifies the frequencies in the various classes.
Horizontal scale- identifies the variable (it shows values for class boundaries, class limits, or class marks).
c. Frequency Polygon- a visual method of representing quantitative
data and its frequencies.
d. Ogive- cumulative frequency graph for the classes in a frequency
distribution. Typically rises “upward”.
e. Box-and-Whisker Plot- shows the median, the quartiles, and the extremes of a
numerical set of data. Useful for comparing sets of data.
f. Stem-and-Leaf Plot- each data value is divided into two parts. The stem is all digits
except the last (the tens part); the leaf is the last digit and is a single digit
only (the ones part).
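
As a small illustration of the stem-and-leaf idea, the sketch below splits each value into a stem (all digits except the last) and a leaf (the last digit). It assumes non-negative whole numbers; the function name `stem_and_leaf` and the sample values are made up for this example.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its remaining digits (stem)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 43 -> stem 4, leaf 3
        plot[stem].append(leaf)
    return dict(plot)

sample = [43, 45, 48, 51, 58, 65, 78]   # illustrative values only
plot = stem_and_leaf(sample)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```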
• MEASURES OF CENTRAL TENDENCY
-Used to find the middle value in a set of observations.
-3 measures (Mean, Median, Mode)
▪ Mean- balance point of the data. Most commonly used measure of central tendency.
▪ Median- middlemost value when the observations are arranged in ascending or descending order.
▪ Mode- value with the greatest frequency (sometimes it exists, sometimes it does not).
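
The three measures can be checked quickly with Python's standard `statistics` module. This is only a sketch; the data set is the one from the frequency distribution example, and `statistics.multimode` requires Python 3.8 or later.

```python
import statistics

data = [10, 10, 12, 13, 15]

mean = statistics.mean(data)          # balance point: 60 / 5
median = statistics.median(data)      # middle value of the ordered data
modes = statistics.multimode(data)    # list of the most frequent value(s)

# When every value occurs once, multimode returns all of them;
# this is conventionally read as "no mode".
no_repeats = statistics.multimode([1, 2, 3])

print(mean, median, modes, no_repeats)
```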
• MEASURES OF DISPERSION
-A way to find out how the data set tends to spread out from the mean.
-Also known as the measure of spread or measure of variability.
▪ Range- easiest of all the measures of variability. The difference between the
highest and lowest values in the data set.
R = highest value - lowest value
▪ Interquartile Range- a descriptive statistic that tells you the spread of the middle half of your
distribution. (Quartiles segment data ordered from low to high into four equal parts.) The interquartile
range contains the second and third quarters of the data. IQR = Q3 - Q1 (ex. 43 45 48 51 58 65 78)
Q3 = third quartile = value at position 3(n + 1)/4, where n = total no. of values
Q3 position = 3(7 + 1)/4 = 24/4 = 6th, so Q3 = 65
Q1 = first quartile = value at position 1(n + 1)/4
Q1 position = 1(7 + 1)/4 = 8/4 = 2nd, so Q1 = 45
IQR = Q3 - Q1 = 65 - 45 = 20
▪ Variance- average of the squared deviations from the mean. It measures how far a set of
numbers is spread out from the mean.
▪ Standard Deviation- positive square root of the variance. If s² = 15, then
s = √(s²) = √15 ≈ 3.87.
▪ Absolute Deviation- average of the absolute distances of the observations from a
central point.
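
The measures of dispersion above can be sketched together in Python. Several choices here are assumptions rather than something the notes specify: the variance uses the sample form (divisor n - 1, which `statistics.variance` computes), the absolute deviation is taken around the mean, and the hypothetical helper `quartile` interpolates linearly when the k(n + 1)/4 position is not a whole number.

```python
import math
import statistics

def quartile(sorted_data, k):
    """k-th quartile using the k(n + 1)/4 position rule (positions start at 1).

    Assumes the position falls inside the data set.
    """
    pos = k * (len(sorted_data) + 1) / 4
    whole = int(pos)
    frac = pos - whole
    if frac == 0:
        return sorted_data[whole - 1]
    # Assumption: interpolate between neighbours for fractional positions.
    return sorted_data[whole - 1] + frac * (sorted_data[whole] - sorted_data[whole - 1])

data = sorted([43, 45, 48, 51, 58, 65, 78])

data_range = data[-1] - data[0]            # highest value - lowest value
q1 = quartile(data, 1)                     # 2nd value
q3 = quartile(data, 3)                     # 6th value
iqr = q3 - q1

mean = statistics.mean(data)
variance = statistics.variance(data)       # sample variance (divisor n - 1)
std_dev = math.sqrt(variance)              # positive square root of the variance
mad = sum(abs(x - mean) for x in data) / len(data)   # mean absolute deviation

print(data_range, q1, q3, iqr, variance, std_dev, mad)
print(round(math.sqrt(15), 2))             # the s² = 15 example from the notes
```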

• MEASURES OF RELATIVE POSITION


-Positional measures used to describe locations other than the center of a data set.
▪ Quartile- divides the ordered observations into four equal parts. Position of Qk = k(n + 1)/4
▪ Percentile- divides the ordered observations into 100 equal parts, so each part is 1%.
Position of Pk = k(n + 1)/100; Pk can be P1 to P99
▪ Z-score- indicates how many standard deviations an element is from the mean.
• PROBABILITY- used in everyday conversation to measure the likelihood that an event will happen.
The value of a probability ranges from 0 to 1.
-A probability of 0 means that the event will never happen.
-A probability of 1 indicates that the event will surely happen.
CONDITIONAL PROBABILITY is the probability that an event B occurs,
knowing that event A has occurred.
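
Conditional probability is commonly computed as P(B | A) = P(A and B) ÷ P(A). The sketch below works this out for a made-up example, one roll of a fair six-sided die, where A is "the roll is even" and B is "the roll is greater than 3". The events and the helper `prob` are assumptions for illustration only.

```python
from fractions import Fraction

outcomes = set(range(1, 7))                   # one roll of a fair six-sided die
A = {x for x in outcomes if x % 2 == 0}       # event A: the roll is even
B = {x for x in outcomes if x > 3}            # event B: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

p_b_given_a = prob(A & B) / prob(A)           # P(B | A) = P(A and B) / P(A)

print(prob(A), prob(B), p_b_given_a)
```

Knowing that the roll is even raises the probability of B from 1/2 to 2/3, because two of the three even outcomes are greater than 3.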
• NORMAL DISTRIBUTION also known as Gaussian distribution
-Is a bell-shaped curve that extends asymptotically to the horizontal axis from negative to positive
infinity.
-It is described by its mean and standard deviation.
1. Symmetric- a normal distribution is perfectly symmetrical around its center.
2. Unimodal- there is only one mode in a normal distribution.
3. Asymptotic- the extremes come closer and closer to the horizontal axis but
never touch it.
4. Equal values- the mean, median, and mode are equal.
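
These properties can be checked numerically with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The mean of 100 and standard deviation of 15 are arbitrary values chosen for this sketch.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)      # hypothetical mean and standard deviation

# Symmetric: equal distances on either side of the mean have equal density.
left, right = dist.pdf(85), dist.pdf(115)

# Equal values: mean, median, and mode coincide.
center = (dist.mean, dist.median, dist.mode)

# Asymptotic: the density shrinks far from the mean but stays above zero.
tail_near, tail_far = dist.pdf(145), dist.pdf(190)

# Half of the area under the curve lies below the mean.
below_mean = dist.cdf(100)

print(left, right, center, tail_near, tail_far, below_mean)
```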

• LINEAR REGRESSION AND CORRELATION


-Is a statistical method that allows us to summarize and study relationships between two
continuous (quantitative) variables.
-This lesson introduces the concept and basic procedures of simple linear regression.
▪ Simple Linear Regression Analysis
1. Regression Analysis- a statistical method that makes use of the relationship between
two or more quantitative variables.
2. Dependent or Response Variable (Y)- can be explained with knowledge of the values
of the other variable, called the Independent or Explanatory Variable (X).
3. Scatter Diagram- the most popular method for examining data points based on two
variables.
·A strong arrangement of points along a straight line characterizes a linear relationship.
·A curved set of points may denote a nonlinear relationship, or there may be only a seemingly
random pattern of points, indicating no relationship.
·The term 'simple' in simple linear regression is used when we consider only one independent
variable.
·Mathematically, the line of 'best fit' is the line such that when the differences between the
actual values of Y and the predicted values of Y based on the regression line for each
X are squared and summed, the sum is a minimum. For each X, the equation
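
The least-squares idea can be sketched in a few lines of Python. The line is written here as predicted Y = intercept + slope × X, which is a common form but a notation assumed for this sketch; the data points and the function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared differences between actual and predicted Y values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]          # independent variable X (illustrative)
ys = [2, 4, 5, 4, 5]          # dependent variable Y (illustrative)

intercept, slope = fit_line(xs, ys)
best = sum_squared_errors(xs, ys, intercept, slope)
other = sum_squared_errors(xs, ys, intercept, slope + 0.1)   # a slightly different line

print(intercept, slope, best, other)
```

Any other line, such as the one with a slightly larger slope, gives a larger sum of squared differences than the fitted line.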
• CORRELATION ANALYSIS
Correlation analysis is a method of statistical evaluation used to study the strength of a
relationship between two numerically measured, continuous variables (e.g., height and weight). It
attempts to measure the strength of the relationship between two random variables by
means of a single number called a correlation coefficient.
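
One widely used correlation coefficient is Pearson's r. The notes do not say which coefficient is meant, so treating it as Pearson's r is an assumption of this sketch, and the height and weight figures are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

heights_cm = [150, 160, 165, 172, 180]   # hypothetical measurements
weights_kg = [48, 55, 61, 66, 75]        # hypothetical measurements

r = pearson_r(heights_cm, weights_kg)
print(r)
```

The result always lies between -1 and 1: values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.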
