Bsa s01 s02 Ppt-In-class
Gaurav KS
e-mail: gauravks@iima.ac.in
Quiz : 20%
Group Assignments : 30%
MidTerm : 25%
EndTerm : 35%
Objective/Purpose
Using quantitative modeling to help companies/individuals make better
decisions and improve performance.
Business Statistics
Application of statistical tools to business and managerial problems for the
purpose of decision making.
Statistics
Study of numerical data, facts, figures and measurements.
Statistics is used to convert raw numerical data into useful information.
(Data) Analysis
includes data description, data inference, and the search for relationships in
data.
Gaurav KS Business Statistics and Analysis July 30, 2022 9 / 125
“Business Statistics” and “Analysis”
Objective/Purpose
Using quantitative modeling to help companies/individuals make better
decisions and improve performance.
Objective
Objective: Suppose one wants to assess fitness/obesity in a class.
Survey Questionnaire:
1 Name:
2 Gender:
3 Age:
4 Height (in Meters):
5 Weight (in Kg):
6 Rate your fitness on a scale of 1 (lowest) to 5 (highest):
Qualitative
A variable that is described verbally rather than numerically
Also known as categorical data
Quantitative
A variable that assumes meaningful numerical values
Nominal
Variables that store labelled information without any order or quantitative
value
The name “nominal” comes from the Latin word “nomen”, which means
“name”
Arithmetic operations are not meaningful on nominal data, and the categories
cannot be sorted in any meaningful order
Also known as categorical data
Examples
Nationality (Indian, German, American)
Relationship status (Single, Live-in, Committed, Complicated, Married, Widowed,
Open)
Gender (Male, Female, Others)
Eye Color (Black, Brown, etc.)
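As a rough illustration of why counting is the main operation that makes sense for nominal data, the sketch below tallies category frequencies and picks the most frequent label. The responses are made-up values for illustration and are not course data.

```python
from collections import Counter

# Hypothetical nationality responses (illustrative values only).
nationalities = ["Indian", "German", "Indian", "American", "Indian", "German"]

# Counting how often each label occurs is a valid operation on nominal data.
counts = Counter(nationalities)

# The mode (most frequent label) is the sensible "typical value" here;
# a mean or a sorted order of the labels would carry no meaning.
mode_label, mode_count = counts.most_common(1)[0]

print(counts)
print(mode_label, mode_count)
```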
Binary or Dichotomous (a special case of nominal)
A categorical variable that can take only one of two values
Examples: Male and Female, True and False, Day and Night, Pass and Fail, etc.
Ordinal
A variable whose values have a natural order given by their position on a scale.
Commonly used for observations such as customer satisfaction, happiness, etc.
Considered “in between” qualitative data and quantitative data.
Examples
Companies ask for feedback, experience, or satisfaction on a scale of 1 to 10
Letter grades in an exam (A, B, C, D, etc.)
Ranking of people in a competition (First, Second, Third, etc.)
Economic Status (High, Medium, and Low)
Education Level (Higher, Secondary, Primary)
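A minimal sketch of what the ordering buys us: ordinal values can be sorted and a median category can be read off, which is not possible for nominal labels. The responses and the numeric ranks are assumptions chosen for illustration.

```python
# Hypothetical economic-status responses (illustrative values only).
responses = ["Medium", "Low", "High", "Medium", "Low", "Medium", "High"]

# The order is part of the variable's definition, so it is stated explicitly.
rank = {"Low": 1, "Medium": 2, "High": 3}

# Sorting by rank is meaningful for ordinal data.
ordered = sorted(responses, key=rank.get)

# With an odd number of responses, the median is the middle sorted value.
median_label = ordered[len(ordered) // 2]

print(ordered)
print(median_label)
```

Note that the gaps between ranks (Low to Medium, Medium to High) are not assumed to be equal, so the ranks are used only for ordering.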
Quantitative: Discrete and Continuous
Discrete
Discrete data are numerical data made up of whole, distinct values that are
determined by counting.
Discrete data refer to individual, countable items.
Synonyms for the word discrete are disconnected, separate, and distinct. So,
when plotted, discrete values appear as separate, scattered points.
Discrete data are countable and take whole-number (integer) values.
Examples
Total number of students present in a class
Number of employees in a company
Total number of players who participated in a competition
Days in a week
Continuous
Continuous data can take any value within a range, including fractional
values, and are obtained by measuring rather than counting.
Continuous data often describe quantities that change over time and are not
simply countable but require detailed measurements.
So, when plotted, they appear as a continuous line.
Examples
Height of a person
Speed of a vehicle
Time taken to finish a piece of work
Market share price
Interval Scale
Can be categorized
Can be ranked
Differences between scale values are equal
No true zero point as the origin
Example
Temperature (in °C or °F)
Ratio Scale
Can be categorized
Can be ranked
Differences between scale values are equal
Has a true zero point as the origin
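The difference between the two scales can be sketched with temperature. Celsius is used here as the interval scale and kelvin as a ratio scale; that pairing, the helper name celsius_to_kelvin, and the sample values are assumptions for illustration.

```python
def celsius_to_kelvin(c: float) -> float:
    """Convert degrees Celsius to kelvin."""
    return c + 273.15

t1_c, t2_c = 10.0, 20.0

# Differences are meaningful on an interval scale and agree across both scales.
diff_c = t2_c - t1_c
diff_k = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)

# Ratios are only meaningful when the scale has a true zero.
ratio_c = t2_c / t1_c  # 2.0, yet 20 °C is not "twice as hot" as 10 °C
ratio_k = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)  # roughly 1.035

print(diff_c, diff_k)
print(ratio_c, ratio_k)
```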
Survey Questionnaire:
1 Name: Nominal
2 Gender: Nominal/Binary
3 Age: Continuous/Ratio-scale
4 Height (in Meters): Continuous/Ratio-scale
5 Weight (in Kg): Continuous/Ratio-scale
6 Rate your fitness on a scale of 1 (lowest) to 5 (highest): Ordinal
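One possible way to carry this classification into code is to store the measurement scale next to each field and consult it before computing a statistic. The class name Response, the SCALE mapping, the helper allows_mean, and the example record are all hypothetical.

```python
from dataclasses import dataclass, fields

@dataclass
class Response:
    name: str          # nominal
    gender: str        # nominal (binary if only two options are offered)
    age: float         # continuous, ratio scale
    height_m: float    # continuous, ratio scale
    weight_kg: float   # continuous, ratio scale
    fitness: int       # ordinal, 1 (lowest) to 5 (highest)

# Measurement scale of each survey field, following the questionnaire above.
SCALE = {
    "name": "nominal",
    "gender": "nominal",
    "age": "ratio",
    "height_m": "ratio",
    "weight_kg": "ratio",
    "fitness": "ordinal",
}

def allows_mean(field_name: str) -> bool:
    """Strictly speaking, a mean is meaningful only for interval/ratio data."""
    return SCALE[field_name] in ("interval", "ratio")

# Made-up record, for illustration only.
example = Response("Asha", "Female", 24, 1.62, 58.0, 4)
numeric_fields = [f.name for f in fields(Response) if allows_mean(f.name)]

print(numeric_fields)
```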
Cross-section
Data collected by observing many subjects (such as individuals, firms,
countries, or regions) at a single point or period in time
Examples
To measure current obesity levels in a population, we could draw a
sample of 1,000 people randomly from that population (also known as a cross
section of that population), measure their weight and height, and calculate
what percentage of that sample is categorized as obese
Student grades at the end of the current semester
Household data for the previous year: expenditure on food, unemployment,
income, etc.
Car data: average speed, horsepower, color, etc.
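The obesity example can be sketched as follows. The slides do not say how a person is categorized as obese; this sketch assumes the body mass index (weight in kg divided by height in metres squared) with a cut-off of 30, and uses a tiny made-up sample instead of 1,000 people.

```python
# Hypothetical cross-section: (height in metres, weight in kg) per person,
# all measured at the same point in time.
sample = [(1.70, 65.0), (1.60, 80.0), (1.80, 100.0), (1.75, 70.0)]

OBESE_BMI = 30.0  # assumed cut-off: BMI of 30 or more counts as obese

def bmi(height_m: float, weight_kg: float) -> float:
    """Body mass index: weight divided by height squared."""
    return weight_kg / height_m ** 2

# Flag each person, then compute the share of the sample that is obese.
obese_flags = [bmi(h, w) >= OBESE_BMI for h, w in sample]
pct_obese = 100.0 * sum(obese_flags) / len(sample)

print(pct_obese)
```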
Time series
A series of data points indexed (or listed or graphed) in time order.
Examples
India’s monthly inflation for the past 5 years.
Height of a person, measured once every month.
Note: Time series data differ from cross-sectional data because the ordering of
the observations conveys important information.
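A small sketch of why ordering matters: the same set of numbers gives a different picture of change once the time order is disturbed. The months and values below are invented for illustration and are not actual inflation figures.

```python
# Hypothetical monthly readings in time order (illustrative values only).
months = ["2022-01", "2022-02", "2022-03", "2022-04"]
values = [6.0, 6.1, 7.0, 7.8]

# Month-over-month change depends on the order of the observations.
changes = [round(later - earlier, 2)
           for earlier, later in zip(values, values[1:])]

# The same numbers in a different order: the set of values is unchanged,
# but the pattern of change over time is not.
shuffled = [7.0, 6.0, 7.8, 6.1]
shuffled_changes = [round(later - earlier, 2)
                    for earlier, later in zip(shuffled, shuffled[1:])]

for month, change in zip(months[1:], changes):
    print(month, change)
```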
Panel data (or longitudinal data) combine cross-sectional and time series
ideas and look at how the subjects (firms, individuals, etc.) change over
time.
Panel data differ from pooled cross-sectional data across time because they
follow the same subjects at different times, whereas pooled cross-sections
observe different subjects in different time periods.
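The distinction can be checked mechanically: in a panel the same subjects reappear in every period. The sketch below assumes a balanced panel (every subject observed in every period); the firm names, years, and values are hypothetical.

```python
# Each observation is (subject, year, value); values are illustrative only.
panel = [("FirmA", 2020, 10), ("FirmB", 2020, 7),
         ("FirmA", 2021, 12), ("FirmB", 2021, 9)]
pooled = [("FirmA", 2020, 10), ("FirmB", 2020, 7),
          ("FirmC", 2021, 11), ("FirmD", 2021, 8)]

def subjects_by_period(rows):
    """Group the set of observed subjects by time period."""
    groups = {}
    for subject, period, _value in rows:
        groups.setdefault(period, set()).add(subject)
    return groups

def is_panel(rows) -> bool:
    """True when the same subjects are observed in every period."""
    groups = list(subjects_by_period(rows).values())
    return all(group == groups[0] for group in groups)

print(is_panel(panel), is_panel(pooled))
```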
Qualitative
Ordinal
Quantitative
Discrete
Nominal data has the identity property and helps to distinguish between
individual data points. Ordinal data has both the identity and magnitude
properties and helps order the data points in a specific way.
Rainfall data can take any value on the scale and is, therefore, continuous.
Rainfall in mm has a true zero (indicating no rainfall at all) and is,
therefore, ratio scale.
Examples
1 Demand forecasting
2 GDP growth
3 Course grading of PRM students
4 Are the grades across sections significantly different from each other?
A Seven-Step Modeling Process
1 Define the problem
2 Collect and summarize data
3 Develop a model
4 Verify the model
5 Select one or more suitable decisions
6 Present the results to the organization
7 Implement the model and update it over time
Course Overview and Schedule of Sessions
Session 2 & 3: Descriptive Statistics
Objective
To extract meaningful information from data.
How
Descriptive measure or summary statistic
Visuals (Charts)
Descriptive measures
30% of the students in the class are female.
The average height of the students is 5.6 ft.
Visuals
Scatter plot
Frequency plot, histogram
Pie chart
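Descriptive measures like the two above can be computed in a few lines. The class records below are made up for illustration, so the resulting figures are not the ones quoted on the slide.

```python
from statistics import mean

# Hypothetical class records: (gender, height in feet); illustrative values.
students = [("F", 5.2), ("M", 5.8), ("M", 5.9), ("F", 5.4), ("M", 5.7)]

# A proportion summarizes a categorical variable.
share_female = 100.0 * sum(1 for g, _ in students if g == "F") / len(students)

# A mean summarizes a quantitative variable.
avg_height = mean(h for _, h in students)

print(share_female, avg_height)
```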
Session 4: Linear Transformation and Standardization
Objective
To understand the concept of standardization and how it can help in making
comparisons.
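A minimal sketch of standardization, assuming the usual z-score definition (value minus mean, divided by standard deviation). The marks are invented; the exam marks are a linear transformation of the quiz marks (multiplied by 10), so the two sets become directly comparable after standardizing.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; stdev() would give the sample SD
    return [(x - mu) / sigma for x in values]

quiz = [4, 6, 8]        # marks out of 10 (illustrative)
exam = [40, 60, 80]     # marks out of 100 (illustrative)

z_quiz = standardize(quiz)
z_exam = standardize(exam)

# After standardizing, both sets have mean 0 and SD 1 and can be compared.
print(z_quiz)
print(z_exam)
```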
Session 5: Correlation and Covariance
Objective
Introduction to bivariate analysis, the idea behind correlation, and why
correlation does not imply causation.
Correlation
Examples
Temperature and attendance at outdoor events
The age of a car and its value
Years of education and annual earnings
People’s telephone numbers and their IQs
Miles driven and amount of fuel consumed
Amount of smoking and incidence of lung cancer
Correlation: Plot the Data
Correlation as an association
How two variables are related to each other
Correlation: Scatter Plot
From the three scatter plots shown on the slide, what can we say about the
relation between the two variables?
Correlation: Scatter plot inferences
Correlation Coefficient
Correlation
How two variables are related to each other
Two aspects of (co)relation: Direction and Strength
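Direction and strength are both captured by a correlation coefficient. The sketch below assumes the Pearson coefficient, whose sign gives the direction and whose magnitude (between 0 and 1) gives the strength of a linear relation. The function name pearson_r and the data are illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data echoing two of the earlier examples.
miles = [100, 200, 300, 400]
fuel = [4.1, 7.9, 12.2, 15.8]      # rises with miles driven
car_age = [1, 3, 5, 7]
car_value = [9.0, 7.2, 5.1, 3.0]   # falls with age

r_fuel = pearson_r(miles, fuel)          # positive direction, strong
r_value = pearson_r(car_age, car_value)  # negative direction, strong

print(r_fuel, r_value)
```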
Session 6 & 7: Probability and Probability Distributions
Objective
To introduce the notion of probability and understand how it can help in
decision making under uncertainty.
Session 8: Conditional Probability, Bayes’ Theorem, and
Their Applications
Objective
To understand conditional probability and Bayes’ Theorem and how they help
in making better decisions.
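As a hedged illustration, Bayes' theorem for a yes/no hypothesis H and evidence E says that P(H | E) equals P(E | H) P(H) divided by the total probability of E. The function name posterior and the quality-control numbers below are hypothetical.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """P(H | E) via Bayes' theorem for a binary hypothesis H."""
    numerator = p_e_given_h * prior
    # Total probability of the evidence over both cases (H and not H).
    evidence = numerator + p_e_given_not_h * (1.0 - prior)
    return numerator / evidence

# Hypothetical numbers: 2% of items are defective; an inspection flags
# 95% of defective items and 10% of good items.
p_defective_given_flag = posterior(0.02, 0.95, 0.10)

print(p_defective_given_flag)
```

Even with a fairly accurate inspection, a flagged item is still more likely good than defective here, because defects are rare to begin with.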
Session 9 & 10: Normal, Binomial, Poisson, and
Exponential Distributions and Their Applications
Objective
To get familiar with the concept of distributions and how different distributions
can be used in the business world to analyze data.
Session 11: Sampling
Objective
To introduce the different types of probabilistic sampling techniques.
Session 13: Central Limit Theorem
Objective
A useful statistical theorem: for large samples, the distribution of the sample
mean is approximately normal.
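The theorem can be explored with a short simulation. This sketch assumes an exponential population with mean 1 (a skewed distribution, chosen only for illustration) and looks at the means of many samples of size 50; the seed and sizes are arbitrary.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_mean(n: int) -> float:
    """Mean of n draws from an exponential population with mean 1."""
    return mean(random.expovariate(1.0) for _ in range(n))

n = 50
means = [sample_mean(n) for _ in range(2000)]

# The sample means cluster around the population mean (1), with a spread
# close to the population SD divided by sqrt(n), i.e. about 0.141.
centre = mean(means)
spread = pstdev(means)

print(centre, spread)
```

A histogram of the values in means would look roughly bell-shaped even though the underlying population is strongly skewed.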
Session 14 & 15: Estimation
Objective
To understand the logic of estimation and how estimation helps in drawing
meaningful conclusions about a population.
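One common form of estimation is an interval estimate for a population mean. The sketch below assumes a normal approximation with the critical value 1.96 for a 95% interval; for a sample this small a t critical value would be more appropriate. The function name mean_ci_95 and the heights are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def mean_ci_95(sample):
    """Approximate 95% confidence interval for the population mean."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))  # standard error of the mean
    z = 1.96  # normal critical value (assumption; see note above)
    return m - z * se, m + z * se

# Hypothetical heights in feet from a sample of students.
heights = [5.4, 5.6, 5.7, 5.5, 5.8, 5.6, 5.3, 5.9]
low, high = mean_ci_95(heights)

print(low, high)
```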
Session 16 & 17: Testing of Hypothesis
Objective
To understand how to generate and test hypotheses.
Session 18: ANOVA
Objective
To understand the concept of ANOVA and its real-life applications.
Session 19 & 20: Regression Analysis
Objective
To introduce the concept of regression and how it can be used to make
predictions.
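A minimal sketch of prediction with regression, assuming simple linear regression fitted by least squares. The function name fit_line and the data (chosen to be exactly linear so the result is easy to check) are illustrative only.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

# Made-up data: years of education and annual earnings (arbitrary units).
years_education = [10, 12, 14, 16]
earnings = [20.0, 24.0, 28.0, 32.0]

slope, intercept = fit_line(years_education, earnings)

# Use the fitted line to predict earnings for 18 years of education.
predicted = intercept + slope * 18

print(slope, intercept, predicted)
```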
Session Summary
QUESTIONS
THANK YOU