Cba101 MT

Introduction to Statistics
Statistics is the branch of mathematics that transforms numbers into useful information for decision makers. Statistics lets you know about the risks associated with making a business decision and allows you to understand and reduce the variation in the decision-making process.

I. Statistics in Business
Descriptive statistics
- the methods that help collect, summarize, present, and analyze a set of data.

Inferential statistics
- the methods that use the data collected from a small group to draw conclusions about a larger group.

II. Vocabulary of Statistics
Variable
- a characteristic of an item or individual.

Data
- the different values associated with a variable.

Operational Definitions
- Data values are meaningless unless their variables have operational definitions, universally accepted meanings that are clear to all associated with an analysis.
~ shows how the researcher uses the word.

Population
- consists of all the items or individuals about which you want to draw a conclusion.

Sample
- the portion of a population selected for analysis.

Parameter
- a measurement used to describe the population.

Statistic
- a measurement used to describe the sample.

III. Identifying Types of Variables
Categorical variables (known as qualitative variables)
- have values that can only be placed into categories such as yes and no.
* “Do you currently own bonds?” (yes or no) and the level of risk of a bond fund (below average, average, or above average)

Numerical variables (known as quantitative variables)
- have values that represent quantities. Identified as being either discrete or continuous variables.

Discrete variables
- have numerical values that arise from a counting process.
* “The number of premium cable channels subscribed to”
~ because the response is one of a finite number of integers. You subscribe to zero, one, two, or more channels.
* “The number of items purchased”
~ because you are counting the number of items purchased.

Continuous variables
- produce numerical responses that arise from a measuring process.
* The time you wait for teller service at a bank
~ because the response takes on any value within a continuum, or an interval. Your waiting time could be 1 minute or 1.1 minutes, depending on the precision of the measuring device used.

Measurement Scales

Nominal scale
- classifies data into distinct categories in which no ranking is implied.
- the weakest form of measurement because you cannot specify any ranking across the various categories.

Ordinal scale
- classifies values into distinct categories in which ranking is implied.

Interval scale
- an ordered scale in which the difference between measurements is a meaningful quantity but does not involve a true zero point, so values can fall below zero.
* A noontime temperature reading of 67 degrees Fahrenheit is 2 degrees warmer than a noontime reading of 65 degrees.

Ratio scale
- an ordered scale in which the difference between the measurements involves a true zero point, as in height, weight, age, or salary measurements.
* A person who weighs 240 pounds is twice as heavy as someone who weighs 120 pounds.

Chapter 3.1 Central Tendency
The three measures of central tendency are the mean, the median, and the mode.

I. Mean (Arithmetic Mean)
- the only common measure in which all the values play an equal role.
- serves as a “balance point” in a set of data.
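
To illustrate the mean as a “balance point” (Chapter 3.1), the minimal Python sketch below checks that the deviations from the mean sum to zero, and that changing a single value moves the mean because every value plays an equal role. The data values and the helper name `balance_check` are hypothetical, chosen only for this illustration.

```python
from statistics import mean


def balance_check(values):
    """Return the mean and the sum of deviations from it (hypothetical helper)."""
    m = mean(values)
    deviations = [x - m for x in values]
    return m, sum(deviations)


# Hypothetical waiting times in minutes; not taken from the notes.
times = [1.0, 2.0, 3.0, 4.0, 10.0]
m, total_deviation = balance_check(times)
print(m)                # mean of the five values
print(total_deviation)  # deviations below and above the mean cancel out

# Replacing one value changes the mean, since every value contributes equally.
m_changed, _ = balance_check([1.0, 2.0, 3.0, 4.0, 100.0])
print(m_changed)
```

The second call shows why the mean is sensitive to extreme values: one large observation is enough to pull it well away from the other four.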
II. Median
- the middle value in an ordered array of data that has been ranked from smallest to largest.
- not affected by extreme values, so you can use the median when extreme values are present.
Rule 1 If the data set contains an odd number of values, the median is the measurement associated with the middle-ranked value.
Rule 2 If the data set contains an even number of values, the median is the measurement associated with the average of the two middle-ranked values.

III. Mode
- the value in a set of data that appears most frequently.
- extreme values do not affect the mode. Often, there is no mode or there are several modes in a set of data.

Chapter 3.2 Variation and Shape

Variation
- measures the spread, or dispersion, of values in a data set.

I. Range
- simplest numerical descriptive measure of variation in a set of data.
- measures the total spread in the set of data.
- It does not indicate whether the values are evenly distributed throughout the data set, clustered near the middle, or clustered near one or both extremes. Thus, using the range as a measure of variation when at least one value is an extreme value is misleading.

II. Variance and Standard Deviation
Sum of squares (SS)
- a measure of variation that differs from data set to data set: square the difference between each value and the mean, then sum these squared differences.
- The sample variance is SS divided by n - 1, and the standard deviation is the square root of the variance.

III. Coefficient of Variation
- a relative measure of variation that is always expressed as a percentage rather than in terms of the units of the particular data. The coefficient of variation, denoted by the symbol CV, measures the scatter in the data relative to the mean.
- CV = (standard deviation / mean) x 100%

IV. Z-score
Extreme value or outlier: a value located far away from the mean. Values located far away from the mean will have either very small (negative) Z scores or very large (positive) Z scores.
- Z = (value - mean) / standard deviation

Z-score: Useful in identifying outliers. A value is considered an outlier if its Z score is less than -3.0 or greater than +3.0.

V. Shape
- the pattern of the distribution of data values throughout the entire range of all the values. A distribution is either symmetrical or skewed. Shape also can influence the relationship of the mean to the median. In most cases:

Panel A: negative, or left-skewed. In this panel, most of the values are in the upper portion of the distribution. A long tail and distortion to the left is caused by some extremely small values. These extremely small values pull the mean downward so that the mean is less than the median.

Panel B: symmetrical. Each half of the curve is a mirror image of the other half of the curve. The low and high values on the scale balance, and the mean equals the median.

Panel C: positive, or right-skewed. In this panel, most of the values are in the lower portion of the distribution. A long tail on the right is caused by some extremely large values. These extremely large values pull the mean upward so that the mean is greater than the median.

Skewness statistic: measures the extent to which a set of data is not symmetric.

Kurtosis statistic: measures the relative concentration of values in the center of the distribution of a data set, as compared with the tails.

A symmetric distribution has a skewness value of zero. A right-skewed distribution has a positive
skewness value, and a left-skewed distribution has a negative skewness value.

A bell-shaped distribution has a kurtosis value of zero. A distribution that is flatter than a bell-shaped distribution has a negative kurtosis value. A distribution with a sharper peak (one that has a higher concentration of values in the center of the distribution than a bell-shaped distribution) has a positive kurtosis value.

Chapter 4: Statistical Tools
I. Correlational Research Tools
1. Involves hypothesis testing:
a. Hypothesis of relationship
a.1 Pearson R correlation: parametric data
a.2 Kendall’s tau B; Spearman’s rho; or chi-square: non-parametric data
b. Hypothesis of Difference
b.1 Paired T-test: comparing a pair of variables (interval or ratio)
b.2 Independent samples t-test: comparing a variable (interval or ratio) with a nominal variable with two categories only.
b.3 One-way ANOVA: comparing a variable (interval or ratio) with a nominal variable with 3 or more categories.

2. Factor analysis: to know which factors/variables are valued by the respondents; to know which variables are important, so that the unimportant ones can be eliminated.

3. Multiple Regression Analysis (MRA): to know which of the variables would explain or predict a particular variable.

4. Cluster Analysis: to know how to group the respondents based on their preferences and profile them by cross-tabulating their demographic data.

5. Perceptual mapping: to measure the perception of the respondents and show the result visually through a perceptual map.
a. Multidimensional scaling (MDS): if the data are based on paired comparison (interval or ratio).
b. Multidimensional Unfolding (MDU): if the data are based on ranking (ordinal).
c. Correspondence analysis: if the data are based on multiple options (nominal).

6. Conjoint Analysis: to determine which of the listed valued factors would result in the best combination for the respondents to choose from.

II. Causal research tool
1. Confirmatory Factor Analysis
- to test the existing theoretical framework and validate its measurement variables for research application.

2. If the problem is complex, use structural equation modeling:
Covariance-based structural equation modeling (CB-SEM)
- for theory testing and confirmation
Partial Least Squares Structural Equation Modeling (PLS-SEM)
- for prediction and theory development.

III. Sampling plan
Undergraduate: 100 respondents (9.87% margin of error and 68.3% confidence level)
Graduate: 377 respondents (5% margin of error and 95% confidence level)

Sampling size formula:
1. Slovin’s Formula (population is known)
n = N / (1 + N e^2)
n = # of samples
N = total population
e = error tolerance

2. Cochran’s formula (population is unknown)
n = Z^2 p(1-p) / e^2
n = # of samples
Z = value from the normal distribution table corresponding to the desired level of confidence (95%), which is 1.96
p(1-p) = maximum data variability = .50(1-.50) = .25
e = margin of error

Purposive Sampling
- to determine the target market for a particular organization.
- qualifying the respondents, then selecting once qualified by the quota method.

Random Sampling
- generalize the result across the population

Types of Random Sampling
1. Simple random sampling
- requires a sampling frame
Sampling Frame
- list of the target population; must have a 20% allowance for non-participation.
2. Stratified random sampling
- the target population has natural groupings and some groups have more respondents than the other groups.
3. Poisson Distribution
- the population is unknown
- the presence of an enumerator (research assistant) and a sampling procedure are needed.

Chapter 5
Type of Question
I. Descriptive: simply describing
Type of Data
1. Nominal/ Ordinal
- Frequency Distribution
2. Interval/ Ratio
- Mean and Standard Deviation

II. Correlational: asking for relationship
Is there dependence?
Yes: Regression Analysis (prediction)
No: Type of Data
1. Interval/Ratio: Pearson
2. Ordinal: Spearman
3. Nominal: Chi-square

III. Comparative
Type of Data
1. Nominal/Ordinal: Chi-square
2. Interval/ Ratio: How many to be compared?
If 1: One Sample t-test
If 2: Coming from …?
Same sample: Paired Sample T-test
Different Sample: Independent Sample T-test
If 3 or more: One-way ANOVA

Descriptive Research Measuring Tools
1. Descriptive Statistics
- consist of central tendency, variation, and shape.
Skewness ratio and Kurtosis ratio should fall within the critical values of ±2.58 (0.01 significance level).
Formula:
Skewness Ratio = skewness / standard error of skewness
Indicator
Skewed to the left: less than -2.58
Normally skewed: within ±2.58
Skewed to the right: greater than +2.58

Kurtosis Ratio = kurtosis / standard error of kurtosis
Indicator
Platykurtic: less than -2.58
Mesokurtic: within ±2.58
Leptokurtic: greater than +2.58

2. Frequency Table
3. Charts
4. Mean interpretation table
Level of Interpretation
Formula: (n-1)/n, where n = # of points in the scale (for example, a 5-point scale gives 4/5 = 0.80).

II. Correlational
Hypothesis
Alternative: There is a significant (either relationship or difference)
Null: There is no significant (either relationship or difference)
P-value greater than 0.05: there is no significance; Ho is accepted.
P-value equal to or less than 0.05: there is significance; Ho is rejected.

1. Pearson
Table of Correlations Interpretation
Range of Coefficient        Description
From        To
± 0.81      ± 1.00          Very Strong
± 0.61      ± 0.80          Strong
± 0.41      ± 0.60          Moderate
± 0.21      ± 0.40          Weak
± 0.00      ± 0.20          Weak to No Correlation

Interpretation Format:
Based on the correlation table, it can be noted that the P-values are all 0.000 between (two variables); (two variables); and so on. These imply that the variables are significantly related to each other. Furthermore, the coefficient values are found to be: (value of Pearson correlation) of (two variables); and so on. Based on the Table of Correlations Interpretation, the results signify (description according to range of coefficient) between the variables tested.

Note: categorize the interpretation according to the table of correlations.

2. Paired Sample T-test
Interpretation:
Based on the null hypothesis that there (is or is no) significant difference between (two variables), the result shows that the P-value (value of Sig. 2-tailed) (is or is not) significant. Thus, the null hypothesis is (rejected or accepted).

3. Independent Sample T-test
Interpretation:
The result of the analysis shows that the mean of (variable) for the (1st nominal category) is (value of mean), which is (lesser or greater) than that of the (2nd nominal category) with mean = (value of mean). Based on Levene’s test for equality of variance, the F test (is or is not) significant for equal variances assumed. Thus, the null hypothesis for equal variance assumed (should be rejected or cannot be rejected). When tested with the t-test for Equality of Means, it is revealed that the (variable) (does or does not) significantly differ (P-value = value). This means that the two groups (have or have not) relatively the same level of (variable).
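
The Pearson interpretation format in this section combines two lookups: the p-value rule (Ho is rejected when the p-value is 0.05 or less) and the Table of Correlations Interpretation. A minimal Python sketch of both follows. The function names and the sample scores are hypothetical, and rounding |r| to two decimals before the lookup is an assumption made because the table is stated to two decimals. The p-value itself is assumed to come from statistical software; it is not computed here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient computed from two paired lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def describe_strength(r):
    """Map |r| to the description in the Table of Correlations Interpretation."""
    a = round(abs(r), 2)  # assumption: match the table's two-decimal ranges
    if a >= 0.81:
        return "Very Strong"
    if a >= 0.61:
        return "Strong"
    if a >= 0.41:
        return "Moderate"
    if a >= 0.21:
        return "Weak"
    return "Weak to No Correlation"


def is_significant(p_value, alpha=0.05):
    """Decision rule from the notes: Ho is rejected when p <= 0.05."""
    return p_value <= alpha


# Hypothetical paired scores, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 2), describe_strength(r))
```

The sign of r is ignored in the lookup because the table lists the same descriptions for positive and negative coefficients. `pearson_r` assumes neither list is constant, since a constant list would make the denominator zero.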
