Nature of Statistics

NATURE OF STATISTICS
STATISTICS
- A branch of mathematics that examines and investigates ways to process and analyze the data gathered.
- The study of how to collect, organize, analyze, and interpret numerical information from data.
Descriptive Statistics
- Involves methods of organizing, picturing, and summarizing information from samples.
Inferential Statistics
- Involves methods of using information from a sample to draw conclusions about the population.
Population
Consists of all members of the group.
Sample
A portion, or part, of the population of interest selected for analysis.
Parameter
Numerical index describing a characteristic of a population.
Statistic
Numerical index describing a characteristic of a sample.
Primary Data
Data that come from the original source and are intended to answer specific questions.
Secondary Data
Data that are taken from previously recorded data.
Constant
A characteristic that does not vary.
Variable
A characteristic that can take different values.
Qualitative Variable
Describes an individual by category or group.
Quantitative Variable
Has a value or numerical measurement.

Experimental Classification
A researcher may classify variables according to the function they serve in the experiment.
Independent Variable
A variable controlled by the experimenter and expected to have an effect on the behavior of the subjects.
Dependent Variable
Some measure of the behavior of subjects that is expected to be influenced by the independent variable.
Mathematical Classification
Variables may also be classified in terms of their mathematical values.
Continuous Variable
A variable which can assume any of an infinite number of values.
Discrete Variable
A variable which consists of either a finite number of values or a countable number of values.

Nominal
Used to differentiate classes or categories purely for classification.
Ordinal
Applies to data that can be arranged in order.
Interval
Classifies, orders, and differentiates between classes or categories in terms of degrees of difference. There is no natural zero starting point.
Ratio
Applies to data that can be arranged in order. There is a true zero.

Random Sampling
Type of sampling in which all members have an equal chance of being selected.
- Simple Random Sampling
A process in which members have an equal chance of being selected from the population via random numbers.
- Systematic Sampling
A process of selecting every kth element in the population until the desired number of subjects or respondents is attained.
- Stratified Sampling
Subdivide the population into at least two different subgroups that share the same characteristics, then draw a sample from each subgroup (or stratum).
- Cluster Sampling
Divide the population into sections (or clusters), then randomly select some of those clusters.
Non-Random Sampling
A sampling procedure where samples are selected in a deliberate manner with little or no attention to randomization.
- Convenience Sampling
A process of selecting a group of individuals who (conveniently) are available for study.
- Purposive Sampling
A process of selecting, based on judgment, a sample which the researcher believes, based on prior information, will provide the data they need.
- Quota Sampling
Applied when an investigator collects survey information from an assigned number, or quota, of individuals.
- Snowball Sampling
A technique in which one or more members of a population are located and used to lead the researchers to other members of the population.
- Voluntary Sampling
A technique in which the sample is composed of respondents who self-select into the study/survey.

SUMMARY MEASURES
CENTRAL TENDENCY
A summary measure that attempts to describe a whole set of data with a single value that represents the middle or centre of its distribution.
MEAN
The average of the data set. It is easily affected by outliers.
Outlier
A data point that differs significantly from other observations.
MEDIAN
The middle value when the data are arranged in order. Unlike the mean, it is not strongly affected by outliers, because it depends only on the middle of the ordered data.
MODE
The most frequently occurring value in a distribution.
MIDRANGE
The middle point of a range of numbers: the average of the lowest and highest scores.
POINT MEASURES
Give us a way to see where a certain data point or value falls in a sample or distribution. A point measure can tell us whether a value is about average or whether it is an outlier.
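
The measures of central tendency above, and the effect an outlier has on them, can be illustrated with a short sketch. The scores are made-up values chosen only so that one of them is an obvious outlier, and the helper names `midrange` and `central_tendency` are hypothetical; `mean`, `median`, and `mode` come from Python's standard `statistics` module.

```python
from statistics import mean, median, mode


def midrange(values):
    """Average of the lowest and highest scores."""
    return (min(values) + max(values)) / 2


def central_tendency(values):
    """Collect the four measures of central tendency for one data set."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        "midrange": midrange(values),
    }


# Made-up scores; 98 is added as a deliberate outlier.
scores = [10, 12, 12, 13, 15]
scores_with_outlier = scores + [98]

without_outlier = central_tendency(scores)
with_outlier = central_tendency(scores_with_outlier)

# Print each measure without and with the outlier, side by side.
for name in without_outlier:
    print(name, without_outlier[name], with_outlier[name])
```

With these values the mean and midrange shift substantially once the outlier is added, while the median moves only slightly and the mode does not change.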
MEASURES OF VARIABILITY
Statistics that describe the amount of difference and spread in a data set.
RANGE
The difference between the largest and the smallest measures.
INTERQUARTILE RANGE (IQR)
The distance between the first and third quartile points. It measures the spread of the middle half of the data.
VARIANCE
A measure of dispersion that takes into account the spread of all data points in a data set.
STANDARD DEVIATION
A measure of how dispersed the data are in relation to the mean.
Z-SCORES
(also known as standard scores)
- Help to understand where a score lies in relation to other scores in the distribution.
Standard Score
Allows us to calculate the probability of a score occurring within a normal distribution and enables us to compare two scores that are from different normal distributions.
DEVIATION IQ SCORES
A standard score with a mean of 100 and a standard deviation of 15.
T-SCORES
Standard scores with a mean of 50 and a standard deviation of 10.
SCALED SCORES
Standard scores with a mean of 10 and a standard deviation of 3.
PERCENTILE RANK
- The percentage of scores in a frequency distribution that are equal to or lower than a given score.

RELIABILITY
Refers to how consistently a method measures something. If the same result can be consistently achieved by using the same methods under the same circumstances, the measurement is considered reliable.
Test-retest
A measure of reliability obtained by administering the same test twice over a period of time to a group of individuals.
Internal consistency
The consistency of people’s responses across the items on a multiple-item measure.
Inter-rater reliability
The extent to which different observers are consistent in their judgments.
Parallel Forms Reliability
Measures the correlation between two equivalent versions of a test. You use it when you have two different assessment tools or sets of questions designed to measure the same thing.
Split-half reliability
A measure of consistency whereby a set of items that make up a measure is split in two during the data analysis and the scores on the two halves are compared.

VALIDITY
The extent to which the data or results of a research method represent the intended variable.
CONSTRUCT VALIDITY
Evaluates whether a measurement tool really represents the thing we are interested in measuring. It is central to establishing the overall validity of a method.
CONTENT VALIDITY
Assesses whether a test is representative of all aspects of the construct.
FACE VALIDITY
Considers how suitable the content of a test seems to be on the surface. It is similar to content validity, but face validity is a more informal and subjective assessment.
CRITERION VALIDITY
Evaluates how well a test can predict a concrete outcome, or how well the results of your test approximate the results of another test.

NORMAL DISTRIBUTION
Forms a bell-shaped curve that is symmetric about a vertical line through the mean of the data.
SKEWNESS
A measure of the asymmetry of a distribution. A distribution is asymmetrical when its left and right sides are not mirror images.
POSITIVELY SKEWED
The mean of the data is greater than the median (the long tail is on the right-hand side).
NEGATIVELY SKEWED
The mean of the data is less than the median (the long tail is on the left-hand side).
KURTOSIS
A measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution.
Leptokurtic or heavy-tailed distribution (kurtosis more than the normal distribution).
Mesokurtic (kurtosis the same as the normal distribution).
Platykurtic or short-tailed distribution (kurtosis less than the normal distribution).

Parametric Test
• if the data are normally distributed (checked with the Shapiro-Wilk Test/Kolmogorov-Smirnov Test)
• more than 30 samples
• probability sampling technique was used to select the sample
Pearson’s r
• test used to measure relationship (correlation) between interval variables
Point Biserial
• test used to measure relationship (correlation) between interval and nominal (dichotomous) variables.
Lambda
• test used to measure relationship between two nominal variables
Gamma
• test used to measure relationship between two ordinal variables
Eta
• test used to measure relationship between interval and nominal variables
Linear Regression
• test used to predict the dependent variable from one given independent variable.
Multiple Linear Regression
• test used to predict the dependent variable from several independent variables.
t-test for one sample (One sample t-test)
• used to compare a sample mean to a population mean
t-test for dependent sample
• test used to compare means from the same group
t-test for independent sample
• test used to compare means from two different (independent) groups
One-way ANOVA
• test used to compare three or more means.
Chi square (χ²)
• test between two nominal variables.

Non-Parametric Test
• used if one of the assumptions for a parametric test was violated
Binomial Test
• A non-parametric test used to compare two nominal (dichotomous) data
Spearman rho
• A non-parametric test used to measure relationship (correlation) between interval variables.
Wilcoxon Signed Rank Test
• A non-parametric test used to compare two dependent/related groups
Mann Whitney U Test
• A non-parametric test used to compare two independent/different groups
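
As a rough illustration of how a rank-based (non-parametric) comparison of two independent groups works, the sketch below computes the Mann Whitney U statistic by counting pairs. The group values and the function name `mann_whitney_u` are hypothetical, and the sketch stops at the statistic itself: it does not compute a p-value.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: the number of (a, b) pairs in which a > b,
    counting each tie as half a pair."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical scores from two independent groups.
group_a = [7, 9, 12, 15]
group_b = [3, 5, 8, 9]

u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)

# The two statistics always add up to the total number of pairs.
total_pairs = len(group_a) * len(group_b)
print(u_a, u_b, total_pairs)
```

A U value close to 0 or close to the total number of pairs suggests that one group tends to have higher scores than the other.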
Kruskal Wallis H Test
• A non-parametric test used to compare three or more groups.

Summary (Counterparts)
1. chi-square – Binomial Test
2. Pearson’s r – Spearman rho
3. t-test for dependent sample – Wilcoxon signed rank test
4. t-test for independent sample – Mann Whitney U test
5. One-way ANOVA – Kruskal Wallis H Test

NULL HYPOTHESIS
-the hypothesis that is tested; we either reject it or do not reject it.
ALTERNATIVE HYPOTHESIS
-the hypothesis that is accepted if the null hypothesis is rejected.
TWO TAILED
-used for non-directional hypotheses.
ONE TAILED
-used for directional hypotheses.
TYPE I ERROR
-the null hypothesis is rejected when in fact it is true (FALSE POSITIVE)
TYPE II ERROR
-the null hypothesis is not rejected when in fact it is false (FALSE NEGATIVE)
STANDARD ERROR
-used to estimate the standard deviation of the sampling distribution
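
For the common case of a sample mean, the standard error is usually estimated as the sample standard deviation divided by the square root of the sample size. The sketch below assumes that case; the sample values and the function name `standard_error_of_mean` are hypothetical.

```python
from math import sqrt
from statistics import stdev


def standard_error_of_mean(sample):
    """Estimate the standard deviation of the sampling distribution of the
    mean as s / sqrt(n), where s is the sample standard deviation."""
    return stdev(sample) / sqrt(len(sample))


# Hypothetical sample of five measurements.
sample = [4, 8, 6, 5, 7]
se = standard_error_of_mean(sample)
print(round(se, 4))
```

Larger samples give a smaller standard error for the same spread, which is why estimates from larger samples are more stable.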
DIFFERENCE OF PROPORTIONS
-used to check if the data is reliable and stable.
CONFIDENCE INTERVAL
-is the range of values that you expect your estimate to fall between a certain percentage of the time if you repeat the experiment.
CONFIDENCE LEVEL
-is the percentage of times you expect to reproduce an estimate between the upper and lower bounds of the confidence interval.
POINT ESTIMATE
-is the statistical estimate you are making (for example, a sample mean).
CRITICAL VALUE
-tells you how many standard deviations away from the mean you need to go in order to reach the desired confidence level for your confidence interval.
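
The pieces defined above (point estimate, standard error, critical value, and confidence level) can be combined into a confidence interval for a mean. This is a minimal sketch under stated assumptions: the critical value is taken from the standard normal distribution, whereas for small samples a t-based critical value is often preferred; the sample values and the function name `confidence_interval` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, confidence_level=0.95):
    """Return (lower, upper) bounds: point estimate +/- critical value * SE."""
    point_estimate = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    # Two-sided critical value from the standard normal distribution
    # (an assumption of this sketch; about 1.96 for a 95% level).
    critical_value = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = critical_value * standard_error
    return point_estimate - margin, point_estimate + margin


# Hypothetical sample with a mean of 50.
sample_scores = [52, 48, 50, 55, 47, 51, 49, 53, 50, 45]

ci_95 = confidence_interval(sample_scores, 0.95)
ci_99 = confidence_interval(sample_scores, 0.99)
print(ci_95, ci_99)
```

Raising the confidence level from 95% to 99% increases the critical value, so the interval becomes wider around the same point estimate.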
LEVEL OF SIGNIFICANCE
-the probability threshold (alpha) used to decide whether the null hypothesis is rejected or not rejected.
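
How the level of significance is used in a decision can be sketched with a simple one-sample test. The example assumes a z-test (the population standard deviation is treated as known and the normal distribution is used), which is a simplification and not the t-test listed earlier; the numbers and the function name `z_test_p_value` are hypothetical. The one-tailed branch also assumes the observed difference is in the direction the hypothesis predicted.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(sample_mean, population_mean, population_sd, n, two_tailed=True):
    """P-value for a one-sample z-test (population SD assumed known)."""
    z = (sample_mean - population_mean) / (population_sd / sqrt(n))
    upper_tail = 1 - NormalDist().cdf(abs(z))
    # A two-tailed (non-directional) test counts both tails.
    return 2 * upper_tail if two_tailed else upper_tail


alpha = 0.05  # level of significance chosen before looking at the data

# Hypothetical study: 100 scores averaging 102.7 on a scale with mean 100, SD 15.
p_two_tailed = z_test_p_value(102.7, 100, 15, 100, two_tailed=True)
p_one_tailed = z_test_p_value(102.7, 100, 15, 100, two_tailed=False)

reject_two_tailed = p_two_tailed < alpha
reject_one_tailed = p_one_tailed < alpha
print(round(p_two_tailed, 4), reject_two_tailed)
print(round(p_one_tailed, 4), reject_one_tailed)
```

With these made-up numbers the same data lead to rejecting the null hypothesis under the one-tailed test but not under the two-tailed test, which is why the direction of the hypothesis should be fixed in advance.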
INFERENTIAL STATISTICS
-the process of making inferences or generalizations about a population based on the results of the study on the samples.
HYPOTHESIS
-an assumption made about the probability distribution of the population.
