Summary of Research 2
Make conclusions / predictions based on the statistics computed from the sample
by applying mathematical statistics and probability theory.
Methods of data collection
1. Personal interview: This method of data collection may take two forms:
a) Face-to-face: This involves trained interviewers visiting the desired people
(respondents) in person to collect data.
b) Telephone: This involves trained interviewers phoning people to collect data. This
method is quicker and less expensive than face-to-face interviewing.
3. Stratified Random Sampling
Stratified random sampling is the procedure of dividing the population into relatively
homogeneous groups, called strata, and then taking a simple random sample from
each stratum. If the population elements are homogeneous, then there is no need to
apply this technique.
Example: If our interest is the income of households in a city, then our strata may be:
low income households, middle income households, high income households
Take a sample of size proportional to the sub-population (stratum) size, i.e., draw a
large sample from a large stratum and a small sample from a small sub-population.
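As an illustrative sketch of proportional allocation (the stratum sizes and household IDs below are hypothetical, not from the source), each stratum contributes to the sample in proportion to its share of the population:

```python
import random

# Hypothetical strata: household IDs grouped by income level.
strata = {
    "low":    list(range(0, 600)),     # 600 households
    "middle": list(range(600, 900)),   # 300 households
    "high":   list(range(900, 1000)),  # 100 households
}

total = sum(len(members) for members in strata.values())
sample_size = 50  # overall sample size we want

sample = []
for name, members in strata.items():
    # Proportional allocation: a large sample from a large stratum,
    # a small sample from a small one.
    n = round(sample_size * len(members) / total)
    sample.extend(random.sample(members, n))

print(len(sample))  # 50 = 30 (low) + 15 (middle) + 5 (high)
```

With these stratum sizes the allocation is 30, 15, and 5 households respectively.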
4. Cluster Sampling
This is a method of sampling in which the total population is divided into relatively
small subdivisions, called clusters, and then some of these clusters are randomly
selected using simple random sampling. Example: Suppose we want to make a survey
of households in Addis Ababa. Collecting information on each and every household is
impractical from the point of view of cost and time.
What we do is divide the city into a number of relatively small subdivisions, say,
Kebeles. So, the Kebeles are our clusters. Then we randomly select, say, 20 Kebeles
using simple random sampling. Then we randomly select households from each of
these 20 selected Kebeles using simple random sampling.
This method is called two-stage sampling since simple random sampling is applied
twice (first, to select a sample of Kebeles and second, to select a sample of
households from the selected Kebeles).
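The two-stage procedure above can be sketched in a few lines (the number of Kebeles, households per Kebele, and ID format are assumptions for illustration only):

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical frame: 100 Kebeles (clusters), each listing 50 household IDs.
kebeles = {k: [f"K{k}-H{h}" for h in range(50)] for k in range(100)}

# Stage 1: simple random sample of 20 Kebeles.
selected_kebeles = random.sample(list(kebeles), 20)

# Stage 2: simple random sample of, say, 5 households from each selected Kebele.
sample = []
for k in selected_kebeles:
    sample.extend(random.sample(kebeles[k], 5))

print(len(sample))  # 20 Kebeles x 5 households = 100 households
```

Simple random sampling is applied twice, which is exactly what makes this two-stage sampling.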
Chapter four
Data analysis
In most research works data analysis involves three major steps, done in roughly this order:
Cleaning and organizing the data for analysis (data preparation)
Describing the data (descriptive statistics)
Testing hypotheses and models (inferential statistics).
4. If the data are on interval or ratio scales, are parametric methods the only option?
Why?
Parametric methods are not the only option for analysing data on interval and ratio scales,
but they are often preferred for their efficiency, statistical power, and interpretability.
Non-parametric methods remain available, and are the better choice when the assumptions
of parametric models (such as normality) are violated.
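As a minimal sketch of this decision (hypothetical data; numpy and scipy are assumed available), one can check the normality assumption first and fall back to a non-parametric test if it fails:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical interval-scale measurements for two independent groups.
group_a = rng.normal(loc=50, scale=5, size=30)
group_b = rng.normal(loc=53, scale=5, size=30)

# Check the normality assumption with the Shapiro-Wilk test.
normal_a = stats.shapiro(group_a).pvalue > 0.05
normal_b = stats.shapiro(group_b).pvalue > 0.05

if normal_a and normal_b:
    stat, p = stats.ttest_ind(group_a, group_b)     # parametric
else:
    stat, p = stats.mannwhitneyu(group_a, group_b)  # non-parametric fallback

print(p)
```

Either branch yields a valid p-value; the parametric branch is simply more powerful when its assumptions hold.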
5. Distinguish between random and non-random sampling.
Random sampling: a technique in which each member of the population has a known,
non-zero chance of being selected for the sample (in simple random sampling, an equal
chance). Selection is typically done with a random number generator. Random sampling
is the most common and reliable sampling technique, as it ensures that the sample is
representative of the population, so the results of the study can be generalized to the
population as a whole.
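A minimal sketch of selection by random number generator (the population of 1,000 members is hypothetical):

```python
import random

population = list(range(1, 1001))  # hypothetical population of 1,000 members

# random.sample draws without replacement, giving every member
# the same chance of selection.
sample = random.sample(population, 100)

print(len(sample))       # 100
print(len(set(sample)))  # 100 -- no member appears twice
```

In practice the "population" would be a sampling frame such as a list of households or registered students.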
Non-random sampling: is a technique in which the sample is selected based on a
non-random criterion, such as convenience, judgment, or availability. This means that
some members of the population are more likely to be selected than others. Non-
random sampling is less reliable than random sampling, as it can lead to a biased
sample. This means that the results of the study cannot be generalized to the
population as a whole.
6. Can we apply both sampling techniques for inferential purposes? If your answer
is no, discuss the reasons.
There are two main reasons why non-random sampling cannot be used for inferential
purposes:
Bias: Non-random sampling can lead to a biased sample. This means that the sample
is not representative of the population, and the results of the study cannot be
generalized to the population as a whole.
Sampling Error: without random selection, the sampling error cannot be quantified.
This means the results of the study are less reliable, and confidence intervals may
fail to capture the true value of the population parameter.
7. What are the four Scales of measurement of data?
e) Nominal Scale: The nominal scale assigns numbers as a way to label or identify
characteristics. For example, we can record the gender of respondents as 0 and 1,
where 0 stands for male and 1 stands for female.
f) Ordinal Scale: The ordinal scale ensures that the possible categories can be
placed in a specific order (rank) or in some ‘natural’ way. For example, responses
for health service provision can be coded as: 1 for poor – 2 for moderate – 3 for
good – 4 for excellent.
g) Interval Scale: Unlike the nominal and ordinal scales of measurement, the
numbers in an interval scale are obtained as a result of a measurement process and
have some units of measurement. Also, the differences between any two adjacent
points on any part of the scale are meaningful. For example, Celsius temperature
is an interval scale.
h) Ratio Scale: The ratio scale represents the highest form of measurement
precision. In addition to the properties of all lower scales of measurement, it
possesses the additional feature that ratios have meaningful interpretation.
Furthermore, there is no restriction on the kind of statistics that can be computed
for ratio scaled data.
For example, the height of individuals (in centimeters), the annual profit of firms
(in Birr) and plot elevation (in meters) represent ratio scales.
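The interval/ratio distinction can be made concrete with a small sketch (the numbers are illustrative): Celsius has no true zero, so its "ratios" change when we switch units, whereas a ratio-scale quantity like height keeps its ratios under any change of units.

```python
def c_to_f(c):
    """Convert degrees Celsius to Fahrenheit."""
    return c * 9 / 5 + 32

# Interval scale (Celsius): the apparent ratio 20/10 = 2 is not meaningful,
# because converting units changes it.
print(20 / 10)                  # 2.0
print(c_to_f(20) / c_to_f(10))  # 68/50 = 1.36 -- "twice as hot" disappears

# Ratio scale (height): the ratio survives a change of units.
cm = (180, 90)
inches = tuple(v / 2.54 for v in cm)
print(cm[0] / cm[1])            # 2.0
print(inches[0] / inches[1])    # 2.0
```

This is exactly the "ratios have meaningful interpretation" property that separates the ratio scale from the interval scale.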
Answer:
Correlation: Measures the strength and direction of a linear relationship between two
variables. It indicates how much one variable changes in relation to the other, but it
does not imply causation.
Regression: Models the relationship between a dependent variable and one or more
independent variables. It allows you to predict the value of the dependent variable
based on the values of the independent variables.
Answer:
Pearson's correlation coefficient (r): Most common, measures linear relationships (-1
to 1, 0 = no correlation).
Spearman's rank correlation coefficient (rho): Measures monotonic relationships, less
sensitive to outliers than Pearson's.
Kendall's rank correlation coefficient (tau): Similar to Spearman's, measures
concordance between ranks.
Answer:
Answer:
p-value: Indicates the probability of observing the test statistic, or a more extreme one, by
chance if the null hypothesis (no relationship between variables) is true. Lower p-values
(e.g., < 0.05) suggest statistically significant relationships.
Answer:
6. How can you choose between correlation and regression analysis for analyzing your
research data?
Answer:
Answer:
1. Data collection: Gather data on both your dependent and independent variables.
2. Data cleaning and exploration: Check for missing values, outliers, and potential
transformations.
3. Model fitting: Estimate the intercept and slope coefficients using techniques like least
squares.
4. Model evaluation: Assess the goodness-of-fit (e.g., R-squared, p-values) and check
for violations of assumptions (linearity, homoscedasticity, etc.).
5. Model interpretation: Explain the meaning of the coefficients and use them to predict
values of the dependent variable.
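The five steps above can be sketched end-to-end with simulated data (the true intercept and slope are assumptions of the simulation, not results from the source):

```python
import numpy as np

rng = np.random.default_rng(1)

# Steps 1-2: hypothetical clean data -- one predictor, one outcome.
x = rng.uniform(0, 10, size=50)
y = 3.0 + 2.0 * x + rng.normal(0, 1, size=50)  # true intercept 3, slope 2

# Step 3: estimate intercept and slope by ordinary least squares.
X = np.column_stack([np.ones_like(x), x])
(b0, b1), *_ = np.linalg.lstsq(X, y, rcond=None)

# Step 4: evaluate goodness-of-fit with R-squared.
y_hat = b0 + b1 * x
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Step 5: interpret -- b1 estimates the change in y per unit change in x,
# and predictions come from b0 + b1 * x.
print(round(b0, 2), round(b1, 2), round(r_squared, 3))
```

The recovered coefficients land close to the true values 3 and 2, and R-squared is high because the noise is small relative to the signal.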
Answer:
Simple linear regression: Models the relationship between one dependent variable and
one independent variable.
Multiple linear regression: Models the relationship between one dependent variable
and two or more independent variables.
9. Discuss the benefits and limitations of using multiple regression compared to simple
linear regression.
Answer:
Benefits:
Can model the combined influence of several predictors, statistically control for
confounding variables, and usually explains more of the variation in the outcome.
Limitations:
Requires a larger sample for reliable estimates, is harder to interpret, and is
vulnerable to multicollinearity and overfitting when predictors are added without
justification.
10. How can you choose between simple and multiple regression for your research
question?
Answer:
Number of relevant independent variables: If only one variable likely influences your
outcome, simple regression may be sufficient.
Complexity of the relationships: If multiple variables interact or have non-linear
effects, multiple regression might be needed.
Data availability: Multiple regression requires more data for reliable estimates.
11. Explain how multicollinearity can affect multiple regression models and how to
address it.
Answer:
Multicollinearity occurs when independent variables are highly correlated with each other. It
can inflate standard errors and reduce the reliability of coefficient estimates. To address it:
detect the problem with variance inflation factors (VIF) or a correlation matrix, then drop or
combine the offending predictors, or collect more data.
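As a sketch of the standard diagnostic (the simulated predictors are hypothetical), the variance inflation factor VIF_j = 1 / (1 − R²_j), where R²_j comes from regressing predictor j on the others, flags collinear columns:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical predictors: x2 is nearly a copy of x1 (high collinearity).
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)
x3 = rng.normal(size=n)  # independent of the others
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF_j = 1 / (1 - R^2) from regressing column j on the other columns."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1 / (1 - r2)

# A common rule of thumb treats VIF > 10 as problematic.
print([round(vif(X, j), 1) for j in range(3)])
```

Here x1 and x2 produce very large VIFs while the independent x3 stays near 1, so dropping or combining x1/x2 would be the remedy.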
12. Define non-parametric tests and explain their key advantages compared to
parametric tests.
Answer:
Non-parametric tests are statistical methods that don't require assumptions about the
underlying population distribution (e.g., normality). They're advantageous when: the data
are ordinal or ranked, the sample is small, the distribution is clearly non-normal, or
outliers would distort a parametric test.
Answer:
T-test vs. Mann-Whitney U test: Both compare two independent groups, but the t-test
assumes normality, while Mann-Whitney U test ranks data and is distribution-free.
ANOVA vs. Kruskal-Wallis test: Both compare three or more groups, but ANOVA
compares means and assumes normality, while Kruskal-Wallis ranks the data and is
non-parametric.
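Both non-parametric counterparts are one call each in scipy (assumed available); the skewed group scores below are simulated for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical skewed (exponential) scores for three groups -- the kind of
# non-normal data where rank-based tests are appropriate.
g1 = rng.exponential(scale=1.0, size=40)
g2 = rng.exponential(scale=1.5, size=40)
g3 = rng.exponential(scale=2.0, size=40)

# Two independent groups: Mann-Whitney U instead of the t-test.
u_stat, p_u = stats.mannwhitneyu(g1, g2)

# Three or more groups: Kruskal-Wallis instead of one-way ANOVA.
h_stat, p_kw = stats.kruskal(g1, g2, g3)

print(round(p_u, 4), round(p_kw, 4))
```

Both tests work on the ranks of the pooled data, which is why they need no normality assumption.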
Answer:
15. Discuss the importance of checking assumptions before using non-parametric tests.
Answer:
Even though non-parametric tests are less stringent, some basic assumptions might still apply
depending on the test. Checking for outliers, independence of observations, and homogeneity
of variances can enhance the reliability of your results.
16. How do you interpret the p-value and effect size measures in non-parametric tests?
Answer:
P-value: Similar to parametric tests, indicates the probability of observing the test
statistic or more extreme one by chance, assuming no difference between groups.
Lower p-values (e.g., <0.05) suggest statistically significant differences.
Effect size measures: Provide the magnitude and direction of the difference between
groups (e.g., the rank-biserial correlation for the Mann-Whitney U test). Interpretation
depends on the specific test and research context.
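As a sketch (the stress scores are hypothetical, scipy assumed available), the rank-biserial correlation can be derived directly from the Mann-Whitney U statistic as r = 2U₁/(n₁·n₂) − 1, ranging from −1 to 1:

```python
import numpy as np
from scipy import stats

# Hypothetical stress scores for two groups.
on_campus = np.array([72, 68, 75, 80, 77, 69, 74, 81])
off_campus = np.array([60, 64, 58, 66, 62, 59, 65, 61])

res = stats.mannwhitneyu(on_campus, off_campus, alternative="two-sided")

# Rank-biserial correlation from U (scipy reports U for the first sample):
# |r| = 1 means one group's scores completely dominate the other's.
n1, n2 = len(on_campus), len(off_campus)
r_rb = 2 * res.statistic / (n1 * n2) - 1

print(round(res.pvalue, 4), round(r_rb, 3))
```

Here every on-campus score exceeds every off-campus score, so the effect size is at its maximum magnitude and the p-value is small, illustrating how the two numbers answer different questions.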
Answer:
Lower statistical power compared to some parametric tests (especially with small
samples).
May have less informative estimates of effect size.
Certain non-parametric tests can be computationally intensive for large datasets.
Hypothesis Examples for Different Analyses:
1. Correlation:
Hypothesis:
Null Hypothesis (H0): There is no correlation between the two variables. (ρ = 0)
Alternative Hypothesis (Ha): There is a non-zero correlation between the two
variables. (ρ ≠ 0)
2. Simple Linear Regression:
Research Question: How does the average number of hours spent exercising per week
(independent variable) affect an individual's body mass index (BMI) (dependent variable)?
Hypothesis:
Null Hypothesis (H0): There is no linear relationship between the average number of
hours spent exercising per week and an individual's BMI. (β = 0)
Alternative Hypothesis (Ha): There is a negative linear relationship between the
average number of hours spent exercising per week and an individual's BMI. (β < 0)
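Testing this H0: β = 0 is a one-liner with scipy's linregress (assumed available; the exercise/BMI values below are made up for illustration):

```python
from scipy import stats

# Hypothetical data: exercise hours per week vs. BMI for ten individuals.
hours = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
bmi = [31, 30, 29.5, 28, 27.5, 26, 25.5, 24, 23.5, 22]

res = stats.linregress(hours, bmi)

# res.pvalue is the two-sided test of H0: beta = 0. A small p-value combined
# with a negative estimated slope supports the one-sided alternative beta < 0.
print(round(res.slope, 2), round(res.pvalue, 6))
```

With this strongly decreasing pattern the slope estimate is negative and the p-value is far below 0.05, so H0 would be rejected.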
3. Multiple Linear Regression:
Research Question: What are the combined effects of age (A), income (I), and education level
(E) (independent variables) on an individual's life expectancy (LE) (dependent variable)?
Hypothesis:
Null Hypothesis (H0): Age, income, and education level have no statistically
significant combined effect on an individual's life expectancy. (βA = βI = βE = 0)
Alternative Hypothesis (Ha): Age has a negative relationship with life expectancy
(βA < 0), income and education level have positive relationships with life expectancy
(βI > 0, βE > 0), and there might be interaction effects between these variables.
4. Non-Parametric Tests:
a) Mann-Whitney U Test:
Research Question: Do college students who live on campus (Group 1) have higher levels of
stress (measured by a stress score) than students who live off campus (Group 2)?
Hypothesis:
Null Hypothesis (H0): The distributions of stress scores for on-campus and off-campus
students are the same.
Alternative Hypothesis (Ha): Stress scores for on-campus students tend to be higher
than those for off-campus students.
b) Wilcoxon Signed-Rank Test:
Research Question: Do participants experience a change in anxiety levels before and after
practicing mindfulness meditation?
Hypothesis:
Null Hypothesis (H0): The median difference between anxiety scores before and after
meditation is zero.
Alternative Hypothesis (Ha): The median difference between anxiety scores before and
after meditation is not zero.
c) Kruskal-Wallis Test:
Hypothesis: