Nature of Statistics
1. Definition of Statistics
⮚ Statistics is a form of mathematical analysis that uses quantified models, representations and synopses for a
given set of experimental data or real-life studies. Statistics studies methodologies to gather, review, analyze
and draw conclusions from data, according to Grant and Kenton (2019).
⮚ Statistics is a collection of mathematical techniques that help to analyze and present data. Statistics is also used
in associated tasks such as designing experiments and surveys and planning the collection and analysis of data
from these (Kalla, 2008).
⮚ Statistics is the science concerned with developing and studying methods for collecting, analyzing, interpreting
and presenting empirical data (Bren, 2019).
According to Grant and Kenton (2019), statistics is a term used to summarize the process that an analyst uses to
characterize a data set. If the data set is a sample drawn from a larger population, the analyst can draw
inferences about the population from the statistical results of the sample. Statistical analysis
involves gathering and evaluating data and then summarizing the data in mathematical form.
2. Branches of Statistics
Descriptive statistics are brief descriptive coefficients that summarize a given data set, which can represent
either an entire population or a sample of it. Descriptive statistics are broken down into measures of central
tendency and measures of variability (spread) (Kenton, 2019).
Inferential statistics aims to draw conclusions from a sample and generalize them to the population. It uses
probability theory to assess how likely it is that the characteristics observed in the sample hold in the population. The
most common methods are hypothesis tests, analysis of variance, etc. (Singh, 2018).
3. Types of Data
Quantitative data are measures of values or counts and are expressed as numbers. Quantitative data are data
about numeric variables (ABS, 2019). Discrete data are quantitative data that can be counted and have a finite
number of possible values, whereas continuous data are quantitative data that can be measured and have an
infinite number of possible values within a selected range.
Qualitative data are measures of 'types' and may be represented by a name, symbol, or number code. Qualitative
data are data about categorical variables.
4. Scale of Measurement
Scales of measurement refer to the ways in which variables or numbers are defined and categorized. Each scale of
measurement has certain properties, which in turn determine which statistical analyses are appropriate.
The four scales of measurement are nominal, ordinal, interval, and ratio (Cornell, 2016).
Nominal: Categorical data and numbers that are simply used as identifiers or names represent a nominal scale
of measurement. At this level of measurement, the numbers in the variable are used only to classify the data.
Words, letters, and alphanumeric symbols can also be used.
Survey on Why People Travel    %
Personal business              14.6
Visit Friends or Relatives     33
Work-related                   22.5
Leisure                        30
The second level of measurement is the ordinal level of measurement. This level of measurement depicts some
ordered relationship among the variable’s observations (SS, 2019). Furthermore, an ordinal scale of measurement
represents an ordered series of relationships or rank order.
Example: Individuals competing in a contest may be fortunate to achieve first, second, or third place. First,
second, and third place represent ordinal data.
Interval scale: An interval scale has ordered numbers with meaningful divisions; the distances between
consecutive points on the scale are equal. Interval scales do not have a true zero; e.g., 0 degrees Celsius does not mean
the absence of heat (Bisht, 2019). Interval scales have the properties of identity, magnitude, and equal
distance.
For example, on a Fahrenheit or Celsius thermometer, 90° is hotter than 45°, and the difference
between 10° and 30° is the same as the difference between 60° and 80°.
Ratio scales have all of the characteristics of interval scales as well as a true zero, which refers to the complete
absence of the characteristic being measured. Physical characteristics of persons and objects can be measured with ratio
scales; thus, height and weight are examples of ratio measurement (Lee, 2019).
Example: A score of 0 means there is a complete absence of height or weight. A person who is 1.2 meters (4 feet)
tall is two-thirds as tall as a person who is 1.8 meters (6 feet) tall. Similarly, a person weighing 45.4 kg (100 pounds) is
two-thirds as heavy as a person who weighs 68 kg (150 pounds).
b. Systematic sampling
Individuals are selected at regular intervals from the sampling frame. The intervals are chosen to ensure an
adequate sample size. If you need a sample size n from a population of size x, you should select every x/nth individual
for the sample. For example, if you wanted a sample size of 100 from a population of 1000, select every 1000/100 =
10th member of the sampling frame.
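The every-(x/n)th rule can be sketched in a few lines of Python. This is only a minimal illustration: the function name `systematic_sample`, the numbered sampling frame, and the random starting point within the first interval are assumptions made for the example, not part of the procedure described above.

```python
import random

def systematic_sample(frame, n, rng=None):
    """Select n members of `frame` at a regular interval k = len(frame) // n."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the size of the sampling frame")
    rng = rng or random.Random()
    k = len(frame) // n           # sampling interval (x / n in the text)
    start = rng.randrange(k)      # assumed: random start within the first interval
    return frame[start::k][:n]    # every k-th member, trimmed to n

# Hypothetical sampling frame: members numbered 1 to 1000
frame = list(range(1, 1001))
sample = systematic_sample(frame, 100, random.Random(0))
print(len(sample))                # 100 members
print(sample[1] - sample[0])      # interval of 10
```

With a population of 1000 and a sample size of 100, the interval is 10, so consecutive selected members are always 10 positions apart.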
c. Stratified sampling
In this method, the population is first divided into subgroups (or strata) whose members share a similar characteristic. It is
used when we might reasonably expect the measurement of interest to vary between the different subgroups, and we
want to ensure representation from all the subgroups. For example, in a study of stroke outcomes, we may stratify the
population by sex to ensure equal representation of men and women.
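As a rough sketch of the idea, the Python snippet below splits a population into strata and draws the same number of members at random from each one. The patient list, the function name `stratified_sample`, and the choice of 10 per stratum are hypothetical and serve only to illustrate equal representation.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, per_stratum, rng=None):
    """Draw `per_stratum` units at random from every stratum."""
    rng = rng or random.Random()
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # group units by stratum
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical population: 60 women and 40 men, stored as (id, sex)
patients = [(i, "F" if i < 60 else "M") for i in range(100)]
chosen = stratified_sample(patients, lambda p: p[1], 10, random.Random(1))
for sex, members in chosen.items():
    print(sex, len(members))      # 10 from each stratum
```

Even though women outnumber men in this made-up population, both strata contribute the same number of participants.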
d. Clustered sampling
In a clustered sample, subgroups of the population are used as the sampling unit, rather than individuals. The
population is divided into subgroups, known as clusters, which are randomly selected to be included in the study.
Cluster sampling is a sampling method in which the entire population of the study is divided into externally
homogeneous, but internally heterogeneous, groups called clusters. Essentially, each cluster is a
mini-representation of the entire population.
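The difference from stratified sampling is that whole clusters are drawn at random, and everyone in a chosen cluster takes part. A minimal Python sketch follows, using hypothetical school names as clusters; the function name `cluster_sample` is likewise an assumption for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, rng=None):
    """Randomly pick whole clusters; all members of a picked cluster are included."""
    rng = rng or random.Random()
    picked = rng.sample(sorted(clusters), n_clusters)   # the sampling unit is the cluster
    return {name: clusters[name] for name in picked}

# Hypothetical clusters: four schools with their students
schools = {
    "School A": ["a1", "a2", "a3"],
    "School B": ["b1", "b2"],
    "School C": ["c1", "c2", "c3", "c4"],
    "School D": ["d1", "d2"],
}
selected = cluster_sample(schools, 2, random.Random(2))
for name, students in selected.items():
    print(name, students)
```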
a. Convenience sampling
Convenience sampling is perhaps the easiest method of sampling, because participants are selected based on
availability and willingness to take part. Useful results can be obtained, but the results are prone to significant bias,
because those who volunteer to take part may be different from those who choose not to (volunteer bias), and the
sample may not be representative of other characteristics, such as age or sex. Note: volunteer bias is a risk of all non-
probability sampling methods.
b. Quota sampling
This method of sampling is often used by market researchers. Interviewers are given a quota of subjects of a
specified type to attempt to recruit. For example, an interviewer might be told to go out and select 20 adult men, 20
adult women, 10 teenage girls and 10 teenage boys so that they could interview them about their television viewing.
Ideally the quotas chosen would proportionally represent the characteristics of the underlying population.
d. Snowball sampling
This method is commonly used in social sciences when investigating hard-to-reach groups. Existing subjects are
asked to nominate further subjects known to them, so the sample increases in size like a rolling snowball. For example,
when carrying out a survey of risk behaviors amongst intravenous drug users, participants may be asked to nominate
other users to be interviewed.
Opposite to closed-ended are open-ended surveys and questionnaires. The main difference between the two is
the fact that closed-ended surveys offer predefined answer options the respondent must choose from, whereas open-
ended surveys allow the respondents much more freedom and flexibility when providing their answers.
7. Measure of Central Tendency (Laerd, 2018)
A measure of central tendency is a single value that attempts to describe a set of data by identifying the central
position within that set of data. As such, measures of central tendency are sometimes called measures of central
location. They are also classed as summary statistics. The mean (often called the average) is most likely the measure of
central tendency that you are most familiar with, but there are others, such as the median and the mode.
The mean, median and mode are all valid measures of central tendency, but under different conditions, some
measures of central tendency become more appropriate to use than others. In the following sections, we will look at the
mean, mode and median, and learn how to calculate them and under what conditions they are most appropriate to be
used.
The mean salary for these ten staff is Php 30.7k. However, inspecting the raw data suggests that this mean value
might not be the best way to accurately reflect the typical salary of a worker, as most workers have salaries in the Php
12k to 18k range. The mean is being skewed by the two large salaries. Therefore, in this situation, we would like to have
a better measure of central tendency. As we will find out later, taking the median would be a better measure of central
tendency in this situation.
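The effect of the two large salaries can be checked with Python's standard `statistics` module. The salary table is not reproduced here, so the ten values below are illustrative only: they were chosen to be consistent with the description (a mean of 30.7k, most salaries between 12k and 18k, and two much larger ones).

```python
from statistics import mean, median

# Illustrative salaries in thousands of pesos (assumed values, see note above)
salaries = [15, 18, 16, 14, 15, 15, 12, 17, 90, 95]

print(mean(salaries))     # 30.7, pulled upward by the two large salaries
print(median(salaries))   # 15.5, close to what most workers earn
```

The median sits inside the 12k to 18k range where most of the salaries fall, while the mean does not describe any actual worker well.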
Variance
The variance is the average of the squared differences from the mean. To figure out the variance, first calculate
the difference between each point and the mean; then, square and average the results.
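The steps above translate directly into code. The sketch below uses a small made-up data set and divides by the number of points, as in the definition given here; note that statistical packages often report the sample variance instead, which divides by n − 1.

```python
from statistics import pvariance

def population_variance(values):
    """Average of the squared differences from the mean."""
    m = sum(values) / len(values)
    squared_diffs = [(x - m) ** 2 for x in values]   # difference from the mean, squared
    return sum(squared_diffs) / len(values)          # average of the squares

data = [2, 4, 4, 4, 5, 5, 7, 9]       # hypothetical data with mean 5
print(population_variance(data))      # 4.0
print(pvariance(data))                # same value from the standard library
```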
Standard Deviation
Standard deviation is a statistic that measures how far a group of numbers lies from the mean; it is the square
root of the variance. The calculation of variance uses squares because squaring weights outliers more heavily than data very
near the mean. Squaring also prevents differences above the mean from canceling out those below, which would
otherwise make the sum of the differences zero.
Standard deviation is calculated as the square root of the variance, which is based on the deviation of each data
point from the mean. If the points are farther from the mean, there is a higher deviation within the data; if they
are closer to the mean, there is a lower deviation. So, the more spread out the group of numbers, the higher the
standard deviation.
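Both points can be seen in a short Python sketch: raw differences from the mean cancel out, and the more spread-out group has the larger standard deviation. The two data sets are hypothetical and were chosen to share the same mean of 10.

```python
from statistics import mean, pstdev

tight = [9, 10, 10, 11]    # hypothetical values clustered near the mean
wide = [2, 8, 12, 18]      # hypothetical values spread far from the mean

for name, values in (("tight", tight), ("wide", wide)):
    m = mean(values)
    raw_sum = sum(x - m for x in values)   # positive and negative differences cancel
    print(name, m, raw_sum, round(pstdev(values), 2))
```

Both groups have a mean of 10 and raw differences that sum to zero, yet the standard deviation of the second group is several times larger.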
Statistical Software
About JASP
⮚ In recognition of Bayesian pioneer Sir Harold Jeffreys, JASP stands for Jeffreys’s Amazing Statistics Program.
⮚ JASP is currently supported by long-term, multi-million euro grants that help fund a team of motivated software
⮚ The JASP application is written in C++, using the Qt toolkit. The analyses themselves are written in either R or
C++. The display layer (where the tables are rendered) is written in JavaScript, and is built on top of jQuery UI and
WebKit.
JASP generally produces APA style results tables and plots to ease publication. It promotes open science by
integration with the Open Science Framework and reproducibility by integrating the analysis settings into the results.
Activity Data:
Res  Sex     Age      Civil Status  FMI     BLS/DRRM          BRGY.     Resilience  Adaptation  Preparedness
1    Female  Younger  Married       Higher  With Training     Malinong  4.13        3.25        3.64
2    Female  Older    Single        Lower   Without Training  Malinong  3.33        2.63        3.14
3    Male    Younger  Married       Lower   Without Training  Malinong  3.47        2.88        3.00
4    Male    Younger  Single        Lower   With Training     Malinong  3.00        2.38        3.07
5    Female  Older    Married       Higher  Without Training  Malinong  3.93        3.38        4.64
6    Male    Older    Married       Higher  With Training     Malinong  4.07        2.94        4.21
7    Female  Older    Single        Higher  Without Training  Malinong  4.07        3.38        4.07
8    Female  Older    Married       Higher  With Training     Malinong  4.20        3.31        4.29
9    Male    Younger  Married       Lower   With Training     Malinong  3.67        3.25        4.07
10   Female  Younger  Married       Higher  With Training     Malinong  3.60        4.13        3.29
11   Female  Younger  Single        Lower   Without Training  Higugma   4.27        4.38        4.21
12   Male    Younger  Married       Higher  With Training     Higugma   4.40        3.69        4.43
13   Female  Older    Married       Lower   Without Training  Higugma   4.20        4.31        4.43
14   Male    Younger  Single        Lower   Without Training  Higugma   4.73        4.56        4.43
15   Female  Younger  Single        Lower   Without Training  Higugma   4.67        3.94        4.07
16   Male    Younger  Married       Higher  Without Training  Higugma   4.53        4.50        4.64
17   Female  Older    Single        Lower   With Training     Higugma   4.27        4.44        4.57
18   Male    Older    Single        Lower   Without Training  Higugma   4.40        4.38        4.43
19   Male    Older    Single        Lower   Without Training  Higugma   4.53        4.56        4.71
20   Male    Younger  Single        Higher  Without Training  Higugma   4.53        4.31        4.36
21   Female  Younger  Married       Lower   Without Training  Paglaum   4.53        4.69        4.57
22   Male    Older    Single        Higher  With Training     Paglaum   2.73        2.38        2.57
23   Male    Older    Married       Lower   With Training     Paglaum   2.73        2.56        2.57
24   Male    Younger  Single        Higher  Without Training  Paglaum   2.00        2.19        2.50
25   Female  Older    Married       Lower   Without Training  Paglaum   2.07        2.31        2.07
26   Female  Younger  Married       Lower   Without Training  Paglaum   1.93        1.94        1.93
27   Male    Younger  Married       Higher  Without Training  Paglaum   3.33        2.31        2.21
28   Female  Younger  Married       Higher  Without Training  Paglaum   2.60        2.13        2.21
29   Male    Younger  Married       Lower   Without Training  Paglaum   2.27        2.13        2.00
30   Female  Younger  Single        Lower   With Training     Paglaum   2.73        2.38        2.57
Legend:
Sex Age Civil Status Basic Life Support Training
2-Female 2-Married (47 years old & above) 2-Married 2-Without training
3-Gabinuligay
Practice:
Procedure:
Step 4. Go to Results
2. What is the level of resiliency of the participants when taken as a whole and grouped according to sex, age, civil
status, family income, basic life support training and barangay?
3. Go to results
Descriptive Statistics
                  Resilience
                  Female    Male
Valid             15        15
Missing           0         0
Mean              3.635     3.626
Std. Deviation    0.894     0.913
Minimum           1.930     2.000
Maximum           4.670     4.730
Based on the results, both the male (M = 3.63, SD = 0.913) and female (M = 3.64, SD = 0.894) groups have a high level of
resiliency. The female group has a slightly higher mean than the male group, which indicates that its level of
resiliency was slightly higher. In addition, the slightly smaller standard deviation of the female
group indicates that its scores were marginally more consistent than those of the male group.
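As a cross-check outside JASP, the same descriptives can be computed with Python's standard `statistics` module. The sketch below takes the Resilience scores from the activity data, grouped by sex. It assumes that the reported standard deviation is the sample standard deviation (dividing by n − 1), which is what `stdev` computes; the function name `describe` is only illustrative.

```python
from statistics import mean, stdev

# Resilience scores from the activity data, grouped by sex
resilience = {
    "Female": [4.13, 3.33, 3.93, 4.07, 4.20, 3.60, 4.27, 4.20,
               4.67, 4.27, 4.53, 2.07, 1.93, 2.60, 2.73],
    "Male":   [3.47, 3.00, 4.07, 3.67, 4.40, 4.73, 4.53, 4.40,
               4.53, 4.53, 2.73, 2.73, 2.00, 3.33, 2.27],
}

def describe(values):
    """Return the same summary rows as the Descriptive Statistics table."""
    return {
        "Valid": len(values),
        "Mean": round(mean(values), 3),
        "Std. Deviation": round(stdev(values), 3),   # sample SD (n - 1)
        "Minimum": min(values),
        "Maximum": max(values),
    }

summary = {group: describe(values) for group, values in resilience.items()}
for group, stats in summary.items():
    print(group, stats)
```

The printed means and standard deviations should agree with the JASP table to three decimal places.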
1. What is the level of preparedness of the participants when taken as a whole and when grouped according to civil status
and barangay?
Note: Present your answers in a Microsoft Word document (A4, 1-inch margins, Calibri font, size 11, 1.5 line
spacing). After presenting your answers, paste your JASP results on the last page as statistical
evidence.
References: