Week 10: Basic Concept
Learning Competency: Uses statistical techniques to analyze data – study of differences and
relationships, limited to bivariate analysis
Basic Concept
Statistics is a form of mathematical analysis that uses quantified models, representations and summaries
for a given set of experimental data or real-life studies. It covers the methods used to gather,
review, analyze and draw conclusions from data. Statistical methods analyze large volumes of data and
their properties. Statistics is used in various disciplines such as psychology, business, physical and
social sciences, humanities, government and manufacturing. Statistical data is gathered using a
sampling procedure or other method. Two types of statistical methods are used in analyzing data:
descriptive statistics and inferential statistics. Descriptive statistics summarize data from a
sample using measures such as the mean or standard deviation. Inferential statistics are used when the
data are viewed as a sample drawn from a larger population about which conclusions are to be made.
Statistical Methodologies
Descriptive Statistics
Descriptive statistics are brief descriptive coefficients that summarize a given data set,
which can be either a representation of the entire population or a sample of it. Descriptive
statistics are broken down into measures of central tendency and measures of variability,
or spread. Measures of central tendency include the mean, median and mode, while
measures of variability include the standard deviation or variance, and the minimum and
maximum values.
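As a minimal sketch of the measures named above, the following Python snippet summarizes a small data set with the standard library's statistics module. The scores are hypothetical values invented only for illustration.

```python
import statistics

# Hypothetical quiz scores, invented for illustration only.
scores = [12, 15, 15, 18, 20, 22, 24]

summary = {
    # Measures of central tendency
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    # Measures of variability (sample versions, dividing by n - 1)
    "variance": statistics.variance(scores),
    "std_dev": statistics.stdev(scores),
    "minimum": min(scores),
    "maximum": max(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

If the data set is the entire population rather than a sample, statistics.pvariance and statistics.pstdev would be the corresponding choices.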
Inferential Statistics
Now, suppose you need to collect data on a very large population. For example, suppose
you want to know the average height of all the men in a city with millions of
residents. It is not practical to measure the height of each man. This is where
inferential statistics comes into play. Inferential statistics makes inferences about a population
using data drawn from that population. Instead of using the entire population to gather the data,
the statistician collects a sample or samples from the millions of residents and makes
inferences about the entire population using the sample. The sample is a set of data taken from
the population to represent the population. Probability distributions, hypothesis testing,
correlation testing and regression analysis all fall under the category of inferential
statistics.
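The idea of estimating a population value from a sample can be sketched as follows. Everything here is an assumption made for illustration: the population is simulated (100,000 heights drawn from a normal distribution with mean 170 cm), the sample size of 400 is arbitrary, and the interval uses the common normal approximation of 1.96 standard errors.

```python
import random
import statistics

random.seed(10)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 simulated heights in cm (not real data).
population = [random.gauss(170, 7) for _ in range(100_000)]

# Measure only a simple random sample instead of everyone.
sample = random.sample(population, 400)

sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

# Approximate 95% interval for the population mean (normal approximation).
lower = sample_mean - 1.96 * standard_error
upper = sample_mean + 1.96 * standard_error

print(f"sample mean: {sample_mean:.1f} cm")
print(f"approximate 95% interval: {lower:.1f} to {upper:.1f} cm")
print("population mean (known only because it is simulated): "
      f"{statistics.mean(population):.1f} cm")
```

In a real study the population mean is unknown, so only the sample mean and the interval around it would be available.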
Types of Statistical Data Analysis
1. Univariate Analysis – analysis of one variable.
2. Bivariate Analysis – analysis of two variables (independent and dependent)
3. Multivariate Analysis – analysis of multiple relations between multiple variables.
Covariance is a statistical measure of the extent to which two random variables change
together. Random variables are data with varied values,
like those measured at the interval level or on a scale (strongly disagree, disagree, neutral,
agree, strongly agree), whose values vary from one respondent to another.
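A minimal sketch of how covariance can be computed, assuming the usual sample formula (the sum of the products of deviations from each mean, divided by n - 1). The function name and the data are hypothetical and chosen only for illustration.

```python
def sample_covariance(xs, ys):
    """Return the sample covariance of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Multiply each pair of deviations from the means, then average over n - 1.
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

# Hypothetical data: hours studied and quiz score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

print(sample_covariance(hours, scores))
```

A positive result means the two variables tend to rise together, and a negative result means one tends to fall as the other rises.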
Cross Tabulation – also called a "crosstab" or "contingency table", it follows the format of
a matrix made up of lines of numbers, symbols, and other expressions. Like a
table, a matrix arranges data in rows and columns. If the table compares data on only two
variables, it is called a bivariate table.
Example:
Secondary School Participants Who Attended the 1st UCNHS Research Conference
SCHOOL         MALE           FEMALE         ROW TOTAL
QMA            152 (18.7%)    127 (15.4%)    279
UNCNHS         120 (14.8%)    98 (11.9%)     218
PUNP           59 (7.2%)      48 (5.8%)      107
UCU            61 (7.5%)      58 (7.0%)      119
LNL            81 (10%)       79 (9.5%)      160
U-Pang.        79 (9.7%)      99 (12%)       178
CLLC           102 (12.6%)    120 (14.5%)    222
ABE            69 (8.5%)      93 (11.3%)     162
STI            83 (10.2%)     101 (12.2%)    184
COLUMN TOTAL   806 (100%)     823 (100%)     1629
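The row totals, column totals and column percentages of a bivariate table like this one can be recomputed with a short script. This is only a sketch: the counts are copied from the table, the helper name column_percent is hypothetical, and the computed percentages may differ slightly from the printed ones because of rounding.

```python
# Counts copied from the table above: school -> (male, female).
counts = {
    "QMA": (152, 127),
    "UNCNHS": (120, 98),
    "PUNP": (59, 48),
    "UCU": (61, 58),
    "LNL": (81, 79),
    "U-Pang.": (79, 99),
    "CLLC": (102, 120),
    "ABE": (69, 93),
    "STI": (83, 101),
}

male_total = sum(m for m, _ in counts.values())
female_total = sum(f for _, f in counts.values())
grand_total = male_total + female_total

def column_percent(count, column_total):
    """Share of a column total, as a percentage rounded to one decimal."""
    return round(100 * count / column_total, 1)

print(f"{'SCHOOL':<10}{'MALE':>14}{'FEMALE':>14}{'ROW TOTAL':>11}")
for school, (male, female) in counts.items():
    male_cell = f"{male} ({column_percent(male, male_total)}%)"
    female_cell = f"{female} ({column_percent(female, female_total)}%)"
    print(f"{school:<10}{male_cell:>14}{female_cell:>14}{male + female:>11}")
print(f"{'TOTAL':<10}{male_total:>14}{female_total:>14}{grand_total:>11}")
```

Checking that the row totals add up to the grand total is a quick way to catch a data-entry slip in a table of this kind.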
Measure of Correlations
Correlation is a bivariate analysis that measures the strength of association between two variables
and the direction of the relationship. In terms of the strength of the relationship, the value of the correlation
coefficient varies between +1 and -1. A value of exactly ±1 indicates
a perfect degree of association between the two variables. As the correlation
coefficient goes towards 0, the relationship between the two variables becomes weaker. The
direction of the relationship is given by the sign of the coefficient: + indicates a positive relationship
between the variables and - indicates a negative relationship. Usually, in statistics,
we measure four types of correlations: Pearson correlation, Kendall rank correlation, Spearman
correlation, and the point-biserial correlation.
KEY TERMS
Effect size: Cohen's standard is used to evaluate the
correlation coefficient to determine the strength of the relationship, or the effect size,
where correlation coefficients between .10 and .29 represent a small association,
coefficients between .30 and .49 represent a medium association, and coefficients of
.50 and above represent a large association or relationship.
Continuous data: Data that are interval or ratio level. This type of data possesses
the properties of magnitude and equal intervals between adjacent units. Equal intervals
between adjacent units means that there are equal amounts of the variable being
measured between adjacent units on the scale. An example would be age. An increase
in age from 21 to 22 would be the same as an increase in age from 60 to 61.
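Cohen's standard quoted above can be sketched as a small lookup function. Two choices here are assumptions and not part of the source: the function name is hypothetical, and the absolute value of r is used so that negative coefficients are graded by strength, with the sign left to indicate direction.

```python
def cohen_effect_size(r):
    """Label a correlation coefficient using the thresholds quoted above."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    size = abs(r)  # assumption: strength ignores the sign of r
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "below the small threshold"

for r in (0.25, -0.42, 0.50, 0.05):
    print(r, cohen_effect_size(r))
```

For example, cohen_effect_size(-0.42) returns "medium": the association is of medium strength and negative in direction.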
KEY TERMS
Concordant: Ordered in the same way.
Discordant: Ordered differently.
Ordinal data: Ordinal scales rank order the items that are being measured to indicate if they possess more,
less, or the same amount of the variable being measured. An ordinal scale allows us to determine if X > Y, Y > X,
or if X = Y. An example would be rank ordering the participants in a dance contest. The dancer who was ranked
one was a better dancer than the dancer who was ranked two. The dancer ranked two was a better dancer than the
dancer who was ranked three, and so on. Although this scale allows us to determine greater than, less than, or
equal to, it still does not define the magnitude of the relationship between units.
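Concordant and discordant pairs are the building blocks of the Kendall rank correlation. The sketch below counts them and computes the simplest variant, often called tau-a, which makes no correction for ties. The function names and the judges' rankings are hypothetical and used only for illustration.

```python
from itertools import combinations

def count_pairs(xs, ys):
    """Count concordant and discordant pairs of observations."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables order the pair the same way
        elif direction < 0:
            discordant += 1   # the two variables order the pair differently
        # direction == 0 means a tie, counted as neither
    return concordant, discordant

def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant, discordant = count_pairs(xs, ys)
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical ranks given to five dancers by two judges.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 3, 5, 4]

print(count_pairs(judge_1, judge_2))
print(kendall_tau_a(judge_1, judge_2))
```

Because only the ordering of each pair matters, this measure suits ordinal data, where the distance between ranks is not defined.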
Activity: Problem solving: Compute the correlation coefficient for the data
obtained in the study of age and blood pressure given below. You may use a separate sheet of paper for
your answer.
Subject   Age (x)   Pressure (y)   xy      x²      y²
A         43        128
B         48        120
C         56        135
D         61        143
E         67        141
F         70        152
          ∑x =      ∑y =           ∑xy =   ∑x² =   ∑y² =
Substitute in the formula and solve for r:
r = [n(∑xy) − (∑x)(∑y)] / √{[n(∑x²) − (∑x)²] [n(∑y²) − (∑y)²]}
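A hand computation can be checked with a short script. This is a sketch that assumes the intended formula is the standard computational form of the Pearson correlation coefficient, built from the column sums ∑x, ∑y, ∑xy, ∑x² and ∑y² in the table. The function name pearson_r is hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from the column sums used in the worksheet table."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from the activity table: subjects A to F.
age = [43, 48, 56, 61, 67, 70]
pressure = [128, 120, 135, 143, 141, 152]

r = pearson_r(age, pressure)
print(f"r = {r:.3f}")
```

Work the table by hand first, then compare each column sum and the final value of r with the script's results.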
Homework:
F.A. 1.1. The first step for a researcher after finishing data collection is to start hypothesis testing
procedures.
F.A. 1.2. A researcher should explore the characteristics of the data and the examined variables to
summarise the data once data is clean and ready for investigation.
F.A. 1.3. One of the important considerations in preliminary analysis is to look for patterns in the data
and to check if any specific variable looks extremely erratic.
F.A. 1.4. Descriptive statistics are mathematical techniques which are used to make inferences about the
population of interest based on data collected from a representative sample.
F.A. 1.5. Blunders are errors made in transferring the manual data onto software for analysis during
data entry or coding.
F.A. 1.6. Preliminary analysis through descriptive summaries can be used to portray frequency of
responses for each of the key study variables through tables or visual charts.
F.A. 1.7. Different data types command the use of similar analysis techniques, whereby statistical
methods for analysing categorical data can also be used for continuous data.
F.A. 1.8. Criteria that can be used to assess the best statistical technique to employ for examining a
phenomenon and testing hypothesis include (please select the answer that DOESN’T apply) ______.
F.A. 1.9. Bivariate statistical methods analyse any number of variables and can offer different types of
inferences, whereas multivariate statistical techniques analyse two variables at a time.
F.A. 1.10. Univariate analysis, through one-way frequencies and descriptive statistics, offers a good
understanding of variables, but only individually.
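1. The first step for a researcher after finishing data collection is to start hypothesis
testing procedures.
a. True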
b. False
Answer: B
2. A researcher should explore the characteristics of the data and the examined variables
to summarise the data once data is clean and ready for investigation.
a. True
b. False
Answer: A
3. One of the important considerations in preliminary analysis is to look for patterns in the
data and to check if any specific variable looks extremely erratic.
a. True
b. False
Answer: A
4. Descriptive statistics are mathematical techniques which are used to make inferences
about the population of interest based on data collected from a representative sample.
a. True
b. False
Answer: B
5. Blunders are errors made in transferring the manual data onto software for analysis
during data entry or coding.
a. True
b. False
Answer: A
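6. Preliminary analysis through descriptive summaries can be used to portray frequency of
responses for each of the key study variables through tables or visual charts.
a. True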
b. False
Answer: A
7. Different data types command the use of similar analysis techniques, whereby
statistical methods for analysing categorical data can also be used for continuous data.
a. True
b. False
Answer: B
8. Criteria that can be used to assess the best statistical technique to employ for
examining a phenomenon and testing hypothesis include (please select the answer that
DOESN’T apply) ______.
Answer: C
9. Bivariate statistical methods analyse any number of variables and can offer different
types of inferences, whereas multivariate statistical techniques analyse two variables at a
time.
a. True
b. False
Answer: B
10. Univariate analysis, through one-way frequencies and descriptive statistics, offers a
good understanding of variables, but only individually.
a. True
b. False
Answer: A
a. True
b. False
Answer: A
12. Parametric statistical techniques are best used to analyse categorical data.
a. True
b. False
Answer: B
13. Non-parametric statistics are less sensitive than parametric statistics, whereby they
may not detect significant relationships or differences when they actually exist.
a. True
b. False
Answer: A
a. True
b. False
Answer: B
15. In using any non-parametric statistical tools, assumptions must be checked first to
ensure they are not violated.
a. True
b. False
Answer: B
16. Independent t-tests compare data about (please select the ONE answer that
applies) ______.
Answer: A
17. To investigate relationships, bivariate correlation analysis can be used when the
analysis involves two continuous variables.
a. True
b. False
Answer:
18. Multiple regression analysis enables an understanding of a set of DVs for their ability
to explain variance in the IV.
a. True
b. False
19. Regression analysis has stringent assumptions that must be checked before running
the tests to ensure they are not violated towards achieving a good regression model.
a. True
b. False
Answer: A
20. It is likely that with large samples even minor differences/relationships may appear to
be statistically significant so the researcher must interpret the results in perspective to
reflect the real theoretical/practical significance of findings.
a. True
b. False
Answer: A