
Lesson 1: Introduction

Reasons why anecdotal evidence fails as proof:


1. Small number of occurrences/observations.
2. Selection Bias
3. Confirmation Bias
4. Lack of accuracy

Statistics and why they are needed


The statistical approach – In order to overcome the limitations of the anecdotal
approach and personal perspective bias, we will use statistical tools.
In statistics we use data models to support claims and assumptions about the
entire population under consideration.
Statistics is the science concerned with gathering data, describing and
presenting the data, examining the data, and drawing conclusions from the data.
Statistics helps us propose assumptions and theories, and answer the questions
that interest us about those theories.
Any research that is based on data requires knowledge of statistics.

Data is the fuel of the Digital Age


The Status Quo: High availability of data at a relatively low cost.

The Goal: Turn data into knowledge, using data to generate value and translate it
into a competitive advantage.

Required: The know-how to research data and generate significant insights with the
help of statistics. It is not enough to know how to extract data and produce
“beautiful” graphs.

Statistical biases that can lead to wrong conclusions:


• Non-representative sample data
• Poor quality data
• Selection bias - for example, receiving responses only from those who chose
to answer the survey
• Poor wording of questions
• Presenting data using graphs in a misleading way
• Funded research
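
The selection bias item in the list above can be illustrated with a small simulation. This is only a sketch under invented assumptions: the population of satisfaction scores and the rule that people with higher scores are more likely to answer are both hypothetical, chosen to make the effect visible.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: satisfaction scores from 1 (low) to 10 (high).
population = [random.randint(1, 10) for _ in range(10_000)]

# Assumed response behaviour (not from the lesson): the higher the score,
# the more likely a person is to answer the survey (probability = score / 10).
respondents = [score for score in population if random.random() < score / 10]

population_mean = statistics.mean(population)
respondent_mean = statistics.mean(respondents)

print(f"mean score in the whole population: {population_mean:.2f}")
print(f"mean score among respondents only:  {respondent_mean:.2f}")
```

Because the respondents selected themselves, their mean score comes out noticeably higher than the population mean, even though every individual answer is accurate.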

The Stages of Statistical Research:

1. Presenting the research question and defining the research population
2. Designing the study (what data is needed and how it should be collected)
3. Collecting the data (from the entire population or, more commonly, from a sample of it)
4. Organizing the data in tables or charts (descriptive statistics)
5. Summarizing the data using metrics (descriptive statistics)
6. Drawing conclusions about the study population from the sample metrics (statistical inference)
7. Presenting the conclusions

Descriptive Statistics
The field of descriptive statistics deals with methods to organize, describe, and
summarize data collected in statistical research.
Ways to summarize data:
1. Tables and graphs.
2. Numerical methods (such as the mean, also called the average, and the median)
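
As a minimal sketch of the numerical methods listed above, the following uses Python's standard `statistics` module. The salary figures are hypothetical and serve only as an illustration.

```python
import statistics

# Hypothetical data: monthly salaries (in thousands) of seven employees.
salaries = [8, 9, 9, 10, 11, 12, 40]

mean_salary = statistics.mean(salaries)      # arithmetic average
median_salary = statistics.median(salaries)  # middle value of the sorted data

print(f"mean:   {mean_salary:.2f}")
print(f"median: {median_salary}")
```

The single very high salary pulls the mean well above the median, which is one reason to report more than one summary measure.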
Statistical Inference
Study population- The collection of all investigated cases.
In most studies it is not practical to examine the entire population, so we
collect data on only part of the study population. This part constitutes the
sample.
Sample- The part of the study population that is selected for the study.

Random sample- a sample in which each member of the population has an equal
chance of being included, independently of the other members.
Representative sample- represents all the relevant features that exist in the
population with respect to the researched question.
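
One common way to draw a random sample as defined above is simple random sampling without replacement. The sketch below uses Python's standard `random` module; the population of 1,000 ID numbers and the sample size of 50 are assumptions made for the example.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical study population: ID numbers of 1,000 members.
population = list(range(1, 1001))

# Simple random sample without replacement: every member has the same
# chance of being selected, and no member is selected twice.
sample = random.sample(population, k=50)

print(f"sample size: {len(sample)}")
print(f"first five selected IDs: {sample[:5]}")
```

Random selection alone does not guarantee that a particular sample is representative; it only ensures that no member is systematically favored.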
When a sample is used, an additional stage must be added before the results are
presented: using statistical methods to infer, from a measure obtained in the
sample, the corresponding measure for the entire population. This is the stage
of statistical inference, which uses statistical methods to draw conclusions
from “good” data.
Statistical inference deals with methods for generalizing the results of a
representative, random sample to the entire population being studied.
Estimation- estimation of the parameters of the population (prediction) based on the
observations in the sample.
Hypothesis testing- testing hypotheses with respect to the parameters of the
population by using statistical tests.
An inference from a sample is never certain; it holds only with some
probability. Probability is therefore a complementary field to statistical
inference: to understand the probabilistic methods of inference from a sample to
a population, one must learn basic concepts in probability theory. Statistical
inference uses these concepts to determine whether the conclusions drawn are
reliable or not.
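
The idea of estimation described above can be sketched as follows. The population of heights is generated artificially for the example, and the standard error shown is one commonly used measure of the uncertainty of a sample mean; neither comes from the lesson itself.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population of 10,000 heights in cm, generated for illustration.
population = [random.gauss(170, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)  # the parameter, normally unknown

# In practice only a sample is observed.
sample = random.sample(population, k=200)
sample_mean = statistics.mean(sample)          # point estimate of the parameter

# Estimated standard error: how much the sample mean typically varies
# from one random sample to another.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (estimate):      {sample_mean:.2f}")
print(f"estimated standard error:    {standard_error:.2f}")
```

The estimate is close to the parameter but not identical to it, which is why conclusions from a sample are stated in terms of probability rather than certainty.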

Variables
Variable- a characteristic that is studied in the population and takes different
values across its members.
The set of values a variable can take may differ from study to study.

Classification of variables by the type of the variable:


Qualitative variable- takes values that are names (or numbers used only as
labels, with no numerical or ordinal meaning).
Examples: gender, marital status, place of residence, eye color.
Quantitative variable- takes numerical values whose magnitude and order are
meaningful.
Examples: height, weight, salary, number of late arrivals at work.
Quantitative variables are further classified as discrete or continuous:
Discrete variable- the values of the variable are separate and limited to a list
of possible values. Between any two values of the variable there is a finite
number of possible values.

Examples: number of people in a family, number of rooms in an apartment.
Continuous variable- the values of the variable are continuous. Between any two
values of the variable there are infinitely many possible values.
Examples: height, age, weight.

Variable
    Qualitative
    Quantitative
        Discrete
        Continuous

Sometimes, for convenience when processing data in statistical software, the
researcher replaces the categories of a qualitative variable with numerical
values. The numbers and the order between them have no meaning, so the type of
the variable does not change: it remains a qualitative variable.
Example: residential districts coded as North = 1, Haifa = 2, Tel-Aviv = 3,
Center = 4, Jerusalem = 5, South = 6, Yehuda Samaria = 7.
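
The coding in the example above can be sketched in Python. The district codes are taken from the example; the five survey responses are hypothetical.

```python
from collections import Counter

# District codes taken from the example above.
district_codes = {
    "North": 1, "Haifa": 2, "Tel-Aviv": 3, "Center": 4,
    "Jerusalem": 5, "South": 6, "Yehuda Samaria": 7,
}
code_to_district = {code: name for name, code in district_codes.items()}

# Hypothetical responses from five survey participants.
responses = ["Haifa", "South", "Haifa", "North", "Jerusalem"]
coded = [district_codes[name] for name in responses]

# Counting how often each category occurs is meaningful for a
# qualitative variable, whether it is stored as names or as codes.
counts = Counter(code_to_district[code] for code in coded)

# Arithmetic on the codes is not meaningful: this "average district"
# depends only on the arbitrary numbering.
meaningless_average = sum(coded) / len(coded)

print(coded)
print(counts.most_common(1))
print(meaningless_average)
```

Renumbering the districts would change the computed average but not the counts, which shows that the variable remains qualitative despite its numerical codes.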

Classification of variables by the direction of the description:


This distinction is important when examining relationships between variables.
Dependent variable- the variable or phenomenon the researcher wants to explain
or understand. The dependent variable is the one explained by the other
variables (the independent variables).
Examples: private consumption, monthly salary, job satisfaction.
Independent variable (also called explanatory or descriptive variable)- a
variable or phenomenon that can be used to explain the dependent variable.
A study often uses several independent variables to explain the dependent
variable.
Important note:
The classification of variables by the direction of the description can change
depending on the problem being studied. For example, monthly salary can be a
dependent variable explained by independent variables, but it can equally serve
as an independent variable that explains another dependent variable (e.g. job
satisfaction).
