
Paris Graduate School of Management (PGSM)

International Executive Master of Business Administration

Paris Graduate School of Management


École Supérieure de Gestion et Commerce International

INTERNATIONAL EXECUTIVE
MASTER OF BUSINESS ADMINISTRATION

IEMBA

© 2024 International Executive MBA - Paris Graduate School of Management. All rights reserved.

Management Decision Making
January 2024


LECTURES 1 & 2


Chapter 1
Data and Statistics
■ Applications in Business and Economics
■ Data
■ Data Sources
■ Descriptive Statistics
■ Statistical Inference


Applications in Business and Economics
■ Accounting
Public accounting firms use statistical sampling procedures when
conducting audits for their clients.
■ Finance
Financial advisors use a variety of statistical information, including
price-earnings ratios and dividend yields, to guide their
investment recommendations.
■ Marketing
Electronic point-of-sale scanners at retail checkout counters are
being used to collect data for a variety of marketing research
applications.
■ Production
A variety of statistical quality control charts are used to
monitor the output of a production process.
■ Economics
Economists use statistical information in making
forecasts about the future of the economy or some
aspect of it.


Data

■ Elements, Variables, and Observations
■ Scales of Measurement
■ Qualitative and Quantitative Data
■ Cross-Sectional and Time Series Data

Data and Data Sets

■ Data are the facts and figures that are collected,
summarized, analyzed, and interpreted.
■ The data collected in a particular study are referred to as
the data set.


Elements, Variables, and Observations


■ The elements are the entities on which data are
collected.
■ A variable is a characteristic of interest for the
elements.
■ The set of measurements collected for a particular
element is called an observation.
■ The total number of data values in a data set is the
number of elements multiplied by the number of
variables.

Data, Data Sets, Elements, Variables, and Observations

                Stock       Annual        Earn/
Company         Exchange    Sales ($M)    Sh. ($)
Dataram         AMEX           73.10       0.86
EnergySouth     OTC            74.00       1.67
Keystone        NYSE          365.70       0.86
LandCare        NYSE          111.40       0.33
Psychemedics    AMEX           17.60       0.13

Figure labels: the column headings are the variables, the company names are the elements, the full table is the data set, and a single entry is a datum.
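
As a small illustration of the definitions above, the table can be held as a list of observations, one per element. This is only a sketch: the field names (`exchange`, `sales_musd`, `eps_usd`) are invented for the example, and the company name is treated as the element identifier rather than as a variable.

```python
# Each tuple is one observation: (element, stock exchange, annual sales in $M, earnings per share in $).
# The variable names below are illustrative assumptions, not part of the lecture notes.
VARIABLES = ("exchange", "sales_musd", "eps_usd")

data_set = [
    ("Dataram", "AMEX", 73.10, 0.86),
    ("EnergySouth", "OTC", 74.00, 1.67),
    ("Keystone", "NYSE", 365.70, 0.86),
    ("LandCare", "NYSE", 111.40, 0.33),
    ("Psychemedics", "AMEX", 17.60, 0.13),
]

elements = [row[0] for row in data_set]
n_elements = len(elements)
n_variables = len(VARIABLES)

# Total number of data values = number of elements x number of variables.
n_data_values = n_elements * n_variables

print(n_elements, n_variables, n_data_values)  # 5 3 15
```

Counting the entries directly (every value in every row except the company name) gives the same total of 15.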


Scales of Measurement
■ Scales of measurement include:
• Nominal
• Ordinal
• Interval
• Ratio
■ The scale determines the amount of information
contained in the data.
■ The scale indicates the data summarization and
statistical analyses that are most appropriate.

Scales of Measurement
■ Nominal
• Data are labels or names used to identify an
attribute of the element.
• A nonnumeric label or a numeric code may be used.


Scales of Measurement
■ Nominal
• Example:
Students of a university are classified by the school
in which they are enrolled using a nonnumeric label
such as Business, Humanities, Education, and so on.
Alternatively, a numeric code could be used for the
school variable (e.g., 1 denotes Business, 2 denotes
Humanities, 3 denotes Education, and so on).

Scales of Measurement

■ Ordinal
• The data have the properties of nominal data and
the order or rank of the data is meaningful.
• A nonnumeric label or a numeric code may be used.


Scales of Measurement
■ Ordinal
• Example:
Students of a university are classified by their
class standing using a nonnumeric label such as
Freshman, Sophomore, Junior, or Senior.
Alternatively, a numeric code could be used for
the class standing variable (e.g., 1 denotes
Freshman, 2 denotes Sophomore, and so on).

Scales of Measurement
■ Interval
• The data have the properties of ordinal data and the
interval between observations is expressed in terms
of a fixed unit of measure.
• Interval data are always numeric.


Scales of Measurement
■ Interval
• Example:
Melissa has an SAT score of 1205, while Kevin has
an SAT score of 1090. Melissa scored 115 points
more than Kevin.

Scales of Measurement
■ Ratio
• The data have all the properties of interval data and
the ratio of two values is meaningful.
• Variables such as distance, height, weight, and time
use the ratio scale.
• This scale must contain a zero value that indicates
that nothing exists for the variable at the zero point.


Scales of Measurement
■ Ratio
• Example:
Melissa’s college record shows 36 credit hours
earned, while Kevin’s record shows 72 credit
hours earned. Kevin has twice as many credit
hours earned as Melissa.

Qualitative and Quantitative Data

■ Data can be further classified as being qualitative or
quantitative.
■ The statistical analysis that is appropriate depends on
whether the data for the variable are qualitative or
quantitative.
■ In general, there are more alternatives for statistical
analysis when the data are quantitative.


Qualitative Data
■ Qualitative data are labels or names used to identify an
attribute of each element.
■ Qualitative data use either the nominal or ordinal scale
of measurement.
■ Qualitative data can be either numeric or nonnumeric.
■ The statistical analyses available for qualitative data are
rather limited.

Quantitative Data
■ Quantitative data indicate either how many or how
much.
• Quantitative data that measure how many are
discrete.
• Quantitative data that measure how much are
continuous because there is no separation between
the possible values for the data.
■ Quantitative data are always numeric.
■ Ordinary arithmetic operations are meaningful only
with quantitative data.


Cross-Sectional and Time Series Data


■ Cross-sectional data are collected at the same or
approximately the same point in time.
• Example: data detailing the number of building
permits issued in June 2000 in each of the counties
of Texas
■ Time series data are collected over several time
periods.
• Example: data detailing the number of building
permits issued in Travis County, Texas in each of the
last 36 months

Data Sources
■ Existing Sources
• Data needed for a particular application might
already exist within a firm. Detailed information is
often kept on customers, suppliers, and employees,
for example.
• Substantial amounts of business and economic data
are available from organizations that specialize in
collecting and maintaining data.


Data Sources
■ Existing Sources
• Government agencies are another important source
of data.
• Data are also available from a variety of industry
associations and special-interest organizations.

Data Sources
■ Internet
• The Internet has become an important source of
data.
• Most government agencies, like the Bureau of the
Census (www.census.gov), make their data available
through a web site.
• More and more companies are creating web sites
and providing public access to them.
• A number of companies now specialize in making
information available over the Internet.


Data Sources
■ Statistical Studies
• Statistical studies can be classified as either experimental
or observational.
• In experimental studies the variables of interest are first
identified. Then one or more factors are controlled so
that data can be obtained about how the factors influence
the variables.
• In observational (nonexperimental) studies no attempt is
made to control or influence the variables of interest.
• A survey is perhaps the most common type of
observational study.

Data Acquisition Considerations

■ Time Requirement
• Searching for information can be time consuming.
• Information might no longer be useful by the time it
is available.
■ Cost of Acquisition
• Organizations often charge for information even
when it is not their primary business activity.
■ Data Errors
• Using any data that happen to be available or that
were acquired with little care can lead to poor and
misleading information.


Descriptive Statistics

■ Descriptive statistics are the tabular, graphical, and
numerical methods used to summarize data.

Example: Hudson Auto Repair

The manager of Hudson Auto would like to have a better
understanding of the cost of parts used in the engine
tune-ups performed in the shop. She examines 50 customer
invoices for tune-ups. The costs of parts, rounded to the
nearest dollar, are listed below.

91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73


Example: Hudson Auto Repair


■ Tabular Summary (Frequencies and Percent Frequencies)

Parts                     Percent
Cost ($)     Frequency    Frequency
50-59             2            4
60-69            13           26
70-79            16           32
80-89             7           14
90-99             7           14
100-109           5           10
Total            50          100

Example: Hudson Auto Repair


■ Graphical Summary (Histogram)
[Histogram: frequency on the vertical axis (scale 2 to 18) against parts cost ($) on the horizontal axis (50 to 110)]


Example: Hudson Auto Repair


■ Numerical Descriptive Statistics
• The most common numerical descriptive statistic is
the average (or mean).
• Hudson’s average cost of parts, based on the 50
tune-ups studied, is $79 (found by summing the 50
cost values and then dividing by 50).
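
The averaging step described above can be reproduced in a few lines. This is a minimal sketch using only built-in Python, with the 50 invoice values copied from the Hudson Auto example; the variable names are illustrative.

```python
# Parts costs ($) from the 50 Hudson Auto tune-up invoices.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

total = sum(costs)            # sum of the 50 cost values
average = total / len(costs)  # divide by the number of invoices

print(total, len(costs), average)  # 3949 50 78.98
```

The exact average is $78.98, which rounds to the $79 quoted on the slide.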

Statistical Inference
■ Statistical inference is the process of using data
obtained from a small group of elements (the sample)
to make estimates and test hypotheses about the
characteristics of a larger group of elements (the
population).


Example: Hudson Auto Repair


■ Process of Statistical Inference
1. Population consists of all tune-ups. Average cost of parts is unknown.
2. A sample of 50 engine tune-ups is examined.
3. The sample data provide a sample average cost of $79 per tune-up.
4. The value of the sample average is used to make an estimate of the population average.

End of Chapter 1


Chapter 2
Descriptive Statistics:
Tabular and Graphical Methods

■ Summarizing Qualitative Data
■ Summarizing Quantitative Data
■ Exploratory Data Analysis
■ Crosstabulations and Scatter Diagrams

Summarizing Qualitative Data

■ Frequency Distribution
■ Relative Frequency
■ Percent Frequency Distribution
■ Bar Graph
■ Pie Chart


Frequency Distribution
■ A frequency distribution is a tabular summary of data
showing the frequency (or number) of items in each of
several nonoverlapping classes.
■ The objective is to provide insights about the data that
cannot be quickly obtained by looking only at the
original data.

Example: Marada Inn

Guests staying at Marada Inn were asked to rate the quality
of their accommodations as being excellent, above average,
average, below average, or poor. The ratings provided by a
sample of 20 guests are shown below.

Below Average    Average          Above Average
Above Average    Above Average    Above Average
Above Average    Below Average    Below Average
Average          Poor             Poor
Above Average    Excellent        Above Average
Average          Above Average    Average
Above Average    Average


Example: Marada Inn


■ Frequency Distribution

Rating           Frequency
Poor                 2
Below Average        3
Average              5
Above Average        9
Excellent            1
Total               20
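
The frequency distribution above can be rebuilt by tallying the 20 ratings. The sketch below uses `collections.Counter` from the Python standard library; the `ORDER` list is an assumption added only so the classes print from worst to best, as in the table.

```python
from collections import Counter

# The 20 guest ratings from the Marada Inn example.
ratings = [
    "Below Average", "Average", "Above Average",
    "Above Average", "Above Average", "Above Average",
    "Above Average", "Below Average", "Below Average",
    "Average", "Poor", "Poor",
    "Above Average", "Excellent", "Above Average",
    "Average", "Above Average", "Average",
    "Above Average", "Average",
]

# Display order of the classes (worst to best); chosen for readability.
ORDER = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

counts = Counter(ratings)
frequency = {rating: counts[rating] for rating in ORDER}

for rating, freq in frequency.items():
    print(f"{rating:<15}{freq:>3}")
print(f"{'Total':<15}{sum(frequency.values()):>3}")
```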

Relative Frequency Distribution

■ The relative frequency of a class is the fraction or
proportion of the total number of data items belonging
to the class.
■ A relative frequency distribution is a tabular summary
of a set of data showing the relative frequency for each
class.


Percent Frequency Distribution

■ The percent frequency of a class is the relative
frequency multiplied by 100.
■ A percent frequency distribution is a tabular summary
of a set of data showing the percent frequency for each
class.

Example: Marada Inn

■ Relative Frequency and Percent Frequency Distributions

                 Relative     Percent
Rating           Frequency    Frequency
Poor                .10          10
Below Average       .15          15
Average             .25          25
Above Average       .45          45
Excellent           .05           5
Total              1.00         100
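
Starting from the Marada Inn frequencies, the two definitions above (relative frequency = frequency / total, percent frequency = relative frequency x 100) can be sketched as follows. The dictionary layout is an illustrative choice.

```python
# Frequencies from the Marada Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}
total = sum(frequency.values())

# Relative frequency: fraction of all items that fall in the class.
relative = {rating: count / total for rating, count in frequency.items()}
# Percent frequency: relative frequency multiplied by 100.
percent = {rating: 100 * count / total for rating, count in frequency.items()}

for rating in frequency:
    print(f"{rating:<15}{relative[rating]:>6.2f}{percent[rating]:>6.0f}")
```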


Bar Graph
■ A bar graph is a graphical device for depicting qualitative
data that have been summarized in a frequency, relative
frequency, or percent frequency distribution.
■ On the horizontal axis we specify the labels that are used for
each of the classes.
■ A frequency, relative frequency, or percent frequency scale
can be used for the vertical axis.
■ Using a bar of fixed width drawn above each class label, we
extend the height appropriately.
■ The bars are separated to emphasize the fact that each class
is a separate category.

Example: Marada Inn


■ Bar Graph
[Bar graph: frequency on the vertical axis (scale 1 to 9) for each rating on the horizontal axis: Poor, Below Average, Average, Above Average, Excellent]


Pie Chart
■ The pie chart is a commonly used graphical device for
presenting relative frequency distributions for
qualitative data.
■ First draw a circle; then use the relative frequencies to
subdivide the circle into sectors that correspond to the
relative frequency for each class.
■ Since there are 360 degrees in a circle, a class with a
relative frequency of .25 would consume .25(360) =
90 degrees of the circle.
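
The sector rule above (angle = relative frequency x 360) can be applied to every class at once. This sketch uses the Marada Inn frequencies (2, 3, 5, 9, 1 out of 20 guests) and computes each angle as 360 x count / total, which is the same quantity written to avoid rounding noise.

```python
# Frequencies from the Marada Inn example (20 guests).
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}
total = sum(frequency.values())

# Sector angle in degrees = relative frequency x 360.
angles = {rating: 360 * count / total for rating, count in frequency.items()}

for rating, angle in angles.items():
    print(f"{rating:<15}{angle:6.1f} degrees")
```

The "Average" class, with relative frequency .25, takes 90 degrees, and the five sectors together fill the full 360 degrees.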

Example: Marada Inn


■ Pie Chart
[Pie chart titled "Quality Ratings": Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%]


Example: Marada Inn


■ Insights Gained from the Preceding Pie Chart
• One-half of the customers surveyed gave Marada a
quality rating of “above average” or “excellent”
(looking at the left side of the pie). This might
please the manager.
• For each customer who gave an “excellent” rating,
there were two customers who gave a “poor” rating
(looking at the top of the pie). This should displease
the manager.

Summarizing Quantitative Data

■ Frequency Distribution
■ Relative Frequency and Percent Frequency
Distributions
■ Dot Plot
■ Histogram
■ Cumulative Distributions
■ Ogive


Example: Hudson Auto Repair


The manager of Hudson Auto would like to get a better
picture of the distribution of costs for engine tune-up
parts. A sample of 50 customer invoices has been taken
and the costs of parts, rounded to the nearest dollar, are
listed below.
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73

Frequency Distribution
■ Guidelines for Selecting Number of Classes
• Use between 5 and 20 classes.
• Data sets with a larger number of elements usually
require a larger number of classes.
• Smaller data sets usually require fewer classes.


Frequency Distribution
■ Guidelines for Selecting Width of Classes
• Use classes of equal width.
• Approximate Class Width =
(Largest Data Value - Smallest Data Value) / Number of Classes

Example: Hudson Auto Repair

■ Frequency Distribution
If we choose six classes:
Approximate Class Width = (109 - 52)/6 = 9.5 ≈ 10

Cost ($)     Frequency
50-59            2
60-69           13
70-79           16
80-89            7
90-99            7
100-109          5
Total           50
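
The class-width guideline and the resulting frequency distribution can be sketched as below. Two choices here are assumptions rather than rules from the notes: the approximate width of 9.5 is rounded up with `math.ceil`, and the first class is started at 50, a convenient value just below the smallest cost (52).

```python
import math

# Parts costs ($) from the 50 Hudson Auto tune-up invoices.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

num_classes = 6
approx_width = (max(costs) - min(costs)) / num_classes  # (109 - 52) / 6 = 9.5
class_width = math.ceil(approx_width)                   # rounded up to 10
first_lower_limit = 50                                  # assumed convenient starting point

frequency = {}
for k in range(num_classes):
    lower = first_lower_limit + k * class_width
    upper = lower + class_width - 1  # costs are whole dollars, so classes are 50-59, 60-69, ...
    frequency[f"{lower}-{upper}"] = sum(lower <= c <= upper for c in costs)

for label, freq in frequency.items():
    print(f"{label:<10}{freq:>3}")
```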


Example: Hudson Auto Repair


■ Relative Frequency and Percent Frequency Distributions

             Relative     Percent
Cost ($)     Frequency    Frequency
50-59           .04           4
60-69           .26          26
70-79           .32          32
80-89           .14          14
90-99           .14          14
100-109         .10          10
Total          1.00         100

Example: Hudson Auto Repair

■ Insights Gained from the Percent Frequency
Distribution
• Only 4% of the parts costs are in the $50-59 class.
• 30% of the parts costs are under $70.
• The greatest percentage (32% or almost one-third)
of the parts costs are in the $70-79 class.
• 10% of the parts costs are $100 or more.


Dot Plot
■ One of the simplest graphical summaries of data is a
dot plot.
■ A horizontal axis shows the range of data values.
■ Then each data value is represented by a dot placed
above the axis.

Example: Hudson Auto Repair

■ Dot Plot
[Dot plot: one dot per invoice placed above a horizontal axis of cost ($) running from 50 to 110]


Histogram
■ Another common graphical presentation of quantitative data
is a histogram.
■ The variable of interest is placed on the horizontal axis and
the frequency, relative frequency, or percent frequency is
placed on the vertical axis.
■ A rectangle is drawn above each class interval with its height
corresponding to the interval’s frequency, relative frequency,
or percent frequency.
■ Unlike a bar graph, a histogram has no natural separation
between rectangles of adjacent classes.

Example: Hudson Auto Repair

■ Histogram
[Histogram: frequency on the vertical axis (scale 2 to 18) against parts cost ($) on the horizontal axis (50 to 110)]


Cumulative Distribution
■ The cumulative frequency distribution shows the
number of items with values less than or equal to the
upper limit of each class.
■ The cumulative relative frequency distribution shows
the proportion of items with values less than or equal
to the upper limit of each class.
■ The cumulative percent frequency distribution shows
the percentage of items with values less than or equal
to the upper limit of each class.

Example: Hudson Auto Repair

■ Cumulative Distributions

                            Cumulative    Cumulative
             Cumulative     Relative      Percent
Cost ($)     Frequency      Frequency     Frequency
≤ 59              2            .04             4
≤ 69             15            .30            30
≤ 79             31            .62            62
≤ 89             38            .76            76
≤ 99             45            .90            90
≤ 109            50           1.00           100
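
The cumulative columns are running totals of the class frequencies. A minimal sketch using `itertools.accumulate` from the standard library, starting from the Hudson Auto class frequencies (2, 13, 16, 7, 7, 5); the list names are illustrative.

```python
from itertools import accumulate

# Upper class limits and class frequencies from the Hudson Auto frequency distribution.
upper_limits = [59, 69, 79, 89, 99, 109]
frequency = [2, 13, 16, 7, 7, 5]
total = sum(frequency)

# Running total: number of items with values <= each upper class limit.
cum_freq = list(accumulate(frequency))
cum_rel = [c / total for c in cum_freq]        # cumulative relative frequency
cum_pct = [100 * c / total for c in cum_freq]  # cumulative percent frequency

for limit, cf, cr, cp in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {limit:<5}{cf:>4}{cr:>7.2f}{cp:>6.0f}")
```

The last cumulative frequency always equals the number of items (50 here), which is a quick check on the arithmetic.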


Exploratory Data Analysis


■ The techniques of exploratory data analysis consist of
simple arithmetic and easy-to-draw pictures that can
be used to summarize data quickly.
■ One such technique is the stem-and-leaf display.

Crosstabulations and Scatter Diagrams

■ Thus far we have focused on methods that are used to
summarize the data for one variable at a time.
■ Often a manager is interested in tabular and graphical
methods that will help understand the relationship
between two variables.
■ Crosstabulation and a scatter diagram are two methods
for summarizing the data for two (or more) variables
simultaneously.


Crosstabulation
■ Crosstabulation is a tabular method for summarizing the
data for two variables simultaneously.
■ Crosstabulation can be used when:
• One variable is qualitative and the other is quantitative
• Both variables are qualitative
• Both variables are quantitative
■ The left and top margin labels define the classes for the two
variables.

Example: Finger Lakes Homes

■ Crosstabulation
The number of Finger Lakes homes sold for each style
and price for the past two years is shown below.

Price                    Home Style
Range         Colonial   Ranch   Split   A-Frame   Total
≤ $99,000        18         6      19       12       55
> $99,000        12        14      16        3       45
Total            30        20      35       15      100


Example: Finger Lakes Homes

■ Insights Gained from the Preceding Crosstabulation
• The greatest number of homes in the sample (19)
are a split-level style and priced at less than or equal
to $99,000.
• Only three homes in the sample are an A-Frame style
and priced at more than $99,000.

Crosstabulation: Row or Column Percentages

■ Converting the entries in the table into row percentages
or column percentages can provide additional insight
about the relationship between the two variables.


Example: Finger Lakes Homes


■ Row Percentages

Price                    Home Style
Range         Colonial   Ranch   Split   A-Frame   Total
≤ $99,000      32.73     10.91   34.55    21.82     100
> $99,000      26.67     31.11   35.56     6.67     100

Note: row totals are actually 100.01 due to rounding.

Example: Finger Lakes Homes

■ Column Percentages

Price                    Home Style
Range         Colonial   Ranch   Split   A-Frame
≤ $99,000      60.00     30.00   54.29    80.00
> $99,000      40.00     70.00   45.71    20.00
Total         100       100     100      100
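
Row and column percentages both come from the same table of counts, divided by different totals. The sketch below recomputes them from the Finger Lakes crosstabulation; the dictionary layout and the `<=` spelling of the first price class are illustrative choices.

```python
styles = ["Colonial", "Ranch", "Split", "A-Frame"]

# Counts from the Finger Lakes Homes crosstabulation, one list per price range.
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Row percentage: each count divided by its row (price range) total.
row_pct = {
    price: [round(100 * c / sum(row), 2) for c in row]
    for price, row in counts.items()
}

# Column percentage: each count divided by its column (home style) total.
col_totals = [sum(col) for col in zip(*counts.values())]
col_pct = {
    price: [round(100 * c / t, 2) for c, t in zip(row, col_totals)]
    for price, row in counts.items()
}

for price in counts:
    print(price, row_pct[price], col_pct[price])
```

Because each entry is rounded to two decimals, a row of percentages can sum to 100.01 rather than exactly 100, as the note on the slide points out.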


Scatter Diagram
■ A scatter diagram is a graphical presentation of the
relationship between two quantitative variables.
■ One variable is shown on the horizontal axis and the
other variable is shown on the vertical axis.
■ The general pattern of the plotted points suggests the
overall relationship between the variables.

Scatter Diagram
■ A Positive Relationship
[Scatter diagram of y against x illustrating a positive relationship]


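Scatter Diagram
■ A Negative Relationship
[Scatter diagram of y against x illustrating a negative relationship]

Scatter Diagram
■ No Apparent Relationship
[Scatter diagram of y against x illustrating no apparent relationship]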


Example: Panthers Football Team


■ Scatter Diagram
The Panthers football team is interested in investigating
the relationship, if any, between interceptions made and
points scored.

x = Number of      y = Number of
Interceptions      Points Scored
      1                 14
      3                 24
      2                 18
      1                 17
      3                 27

Example: Panthers Football Team

■ Scatter Diagram
[Scatter diagram: y = number of points scored (vertical axis, 0 to 30) against x = number of interceptions (horizontal axis, 0 to 3)]


Example: Panthers Football Team


■ The preceding scatter diagram indicates a positive
relationship between the number of interceptions and
the number of points scored.
■ Higher points scored are associated with a higher
number of interceptions.
■ The relationship is not perfect; not all plotted points in
the scatter diagram lie on a straight line.
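
One way to put a number on "positive but not perfect" is the sample correlation coefficient, which belongs to the measures of association listed for Chapter 3. The slide itself only reads the pattern off the plot, so the calculation below is an added illustration, not the notes' own method.

```python
import math

# Panthers data: x = interceptions, y = points scored.
interceptions = [1, 3, 2, 1, 3]
points = [14, 24, 18, 17, 27]

n = len(interceptions)
x_bar = sum(interceptions) / n
y_bar = sum(points) / n

# Sums of cross-products and squared deviations about the means.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(interceptions, points))
sxx = sum((x - x_bar) ** 2 for x in interceptions)
syy = sum((y - y_bar) ** 2 for y in points)

# Pearson correlation: +1 means all points lie on a rising straight line.
r = sxy / math.sqrt(sxx * syy)
print(round(r, 4))  # 0.9366
```

A value close to, but below, +1 matches the reading of the scatter diagram: a strong positive relationship whose points do not all fall on one line.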

Tabular and Graphical Procedures

Data
■ Qualitative Data
• Tabular Methods: Frequency Distribution, Rel. Freq. Dist., % Freq. Dist., Crosstabulation
• Graphical Methods: Bar Graph, Pie Chart
■ Quantitative Data
• Tabular Methods: Frequency Distribution, Rel. Freq. Dist., Cum. Freq. Dist., Cum. Rel. Freq. Distribution, Stem-and-Leaf Display, Crosstabulation
• Graphical Methods: Dot Plot, Histogram, Ogive, Scatter Diagram


End of Chapter 2

Chapter 3
Descriptive Statistics: Numerical Methods
■ Measures of Location
■ Measures of Variability
■ Measures of Relative Location and Detecting Outliers
■ Exploratory Data Analysis
■ Measures of Association Between Two Variables
■ The Weighted Mean and Working with Grouped Data


Measures of Location
■ Mean
■ Median
■ Mode
■ Percentiles
■ Quartiles

Example: Apartment Rents

Given below is a sample of monthly rent values ($) for
one-bedroom apartments. The data are a sample of 70 apartments
in a particular city. The data are presented in ascending
order.
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615


Mean
■ The mean of a data set is the average of all the data
values.
■ If the data are from a sample, the mean is denoted by $\bar{x}$:
$\bar{x} = \frac{\sum x_i}{n}$
■ If the data are from a population, the mean is denoted
by $\mu$ (mu):
$\mu = \frac{\sum x_i}{N}$

Example: Apartment Rents


■ Mean
$\bar{x} = \frac{\sum x_i}{n} = \frac{34{,}356}{70} = 490.80$
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615


Median
■ The median is the measure of location most often
reported for annual income and property value data.
■ A few extremely large incomes or property values can
inflate the mean.

Median
■ The median of a data set is the value in the middle
when the data items are arranged in ascending order.
■ For an odd number of observations, the median is the
middle value.
■ For an even number of observations, the median is the
average of the two middle values.


Example: Apartment Rents


■ Median
Median = 50th percentile
i = (p/100)n = (50/100)70 = 35
Because i is an integer, average the 35th and 36th data values:
Median = (475 + 475)/2 = 475
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
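
The odd/even rule for the median can be checked against the rent data. The sketch below applies the rule by hand and compares the result with `statistics.median` from the Python standard library; variable names are illustrative.

```python
from statistics import median

# Monthly rents ($) for the 70 apartments, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

ordered = sorted(rents)
n = len(ordered)

if n % 2 == 1:
    # Odd number of observations: take the middle value.
    med = ordered[n // 2]
else:
    # Even number of observations: average the two middle values
    # (the 35th and 36th values when n = 70).
    med = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(n, med, median(rents))
```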

Mode
■ The mode of a data set is the value that occurs with
greatest frequency.
■ The greatest frequency can occur at two or more
different values.
■ If the data have exactly two modes, the data are
bimodal.
■ If the data have more than two modes, the data are
multimodal.


Example: Apartment Rents


n Mode
450 occurred most frequently (7 times)
Mode = 450
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
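
One possible way to find the mode, sketched here with `collections.Counter`. The names `freq`, `top`, and `modes` are illustrative assumptions. Returning a list also covers the bimodal and multimodal cases described earlier.

```python
from collections import Counter

# Sample of 70 monthly apartment rents, transcribed from the table above.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

freq = Counter(rents)        # value -> number of occurrences
top = max(freq.values())     # greatest frequency
# Every value that reaches the greatest frequency is a mode.
modes = sorted(value for value, count in freq.items() if count == top)

print(modes, top)  # [450] 7
```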

Percentiles
n A percentile provides information about how the data
are spread over the interval from the smallest value to
the largest value.
n Admission test scores for colleges and universities are
frequently reported in terms of percentiles.


Percentiles
n The pth percentile of a data set is a value such that at least p
percent of the items take on this value or less and at least (100
- p) percent of the items take on this value or more.
• Arrange the data in ascending order.
• Compute index i, the position of the pth percentile.
i = (p/100)n
• If i is not an integer, round up. The pth percentile is the
value in the ith position.
• If i is an integer, the pth percentile is the average of the
values in positions i and i + 1.

Example: Apartment Rents


n 90th Percentile
i = (p/100)n = (90/100)70 = 63
Averaging the 63rd and 64th data values:
90th Percentile = (580 + 590)/2 = 585
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
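
The percentile procedure from the slides might be sketched as below. The function name `percentile` is hypothetical, and p is assumed to be a whole number strictly between 0 and 100. Library routines such as `numpy.percentile` interpolate linearly by default and may therefore return slightly different values.

```python
# Sample of 70 monthly apartment rents, transcribed from the table above.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def percentile(values, p):
    """Return the pth percentile using the rule from the slides."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    data = sorted(values)
    n = len(data)
    # i = (p/100)n, computed with integers to avoid rounding error.
    whole, remainder = divmod(p * n, 100)
    if remainder:
        # i is not an integer: round up to position whole + 1,
        # which is index `whole` in a 0-based list.
        return data[whole]
    # i is an integer: average the values in positions i and i + 1.
    return (data[whole - 1] + data[whole]) / 2


print(percentile(rents, 90))  # 585.0
print(percentile(rents, 75))  # 525
```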


Quartiles

n Quartiles are specific percentiles


n First Quartile = 25th Percentile
n Second Quartile = 50th Percentile = Median
n Third Quartile = 75th Percentile

Example: Apartment Rents


n Third Quartile
Third quartile = 75th percentile
i = (p/100)n = (75/100)70 = 52.5, rounded up to 53
Third quartile = 53rd data value = 525
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615


Measures of Variability
n It is often desirable to consider measures of variability
(dispersion), as well as measures of location.
n For example, in choosing supplier A or supplier B we
might consider not only the average delivery time for
each, but also the variability in delivery time for each.

Measures of Variability
n Range
n Interquartile Range
n Variance
n Standard Deviation
n Coefficient of Variation


Range

n The range of a data set is the difference between the
largest and smallest data values.
n It is the simplest measure of variability.
n It is very sensitive to the smallest and largest data
values.

Example: Apartment Rents


n Range
Range = largest value - smallest value
Range = 615 - 425 = 190
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615


Interquartile Range
n The interquartile range of a data set is the difference
between the third quartile and the first quartile.
n It is the range for the middle 50% of the data.
n It overcomes the sensitivity to extreme data values.

Example: Apartment Rents


n Interquartile Range
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
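
A short sketch of the range and interquartile range for this sample. The quartile positions are hard-coded from the percentile rule (i = 17.5 rounds up to 18 for Q1, i = 52.5 rounds up to 53 for Q3); all variable names are assumptions.

```python
# Sample of 70 monthly apartment rents, transcribed from the table above.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

data = sorted(rents)

# Range: largest value minus smallest value.
data_range = data[-1] - data[0]

# Quartiles by position (1-based positions 18 and 53).
q1 = data[18 - 1]
q3 = data[53 - 1]
iqr = q3 - q1   # spread of the middle 50% of the data

print(data_range, q1, q3, iqr)  # 190 445 525 80
```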


Variance
n The variance is a measure of variability that utilizes
all the data.
n It is based on the difference between the value of each
observation ($x_i$) and the mean ($\bar{x}$ for a sample, $\mu$ for a
population).

Variance
n The variance is the average of the squared differences
between each data value and the mean.
n If the data set is a sample, the variance is denoted by $s^2$.
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

n If the data set is a population, the variance is denoted by $\sigma^2$.
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$


Standard Deviation
n The standard deviation of a data set is the positive square
root of the variance.
n It is measured in the same units as the data, which makes it
easier than the variance to compare with the mean.
n If the data set is a sample, the standard deviation is denoted
$s$.
$$s = \sqrt{s^2}$$

n If the data set is a population, the standard deviation is
denoted $\sigma$ (sigma).
$$\sigma = \sqrt{\sigma^2}$$

Coefficient of Variation
n The coefficient of variation indicates how large the
standard deviation is in relation to the mean.
n If the data set is a sample, the coefficient of variation is
computed as follows:
$$\frac{s}{\bar{x}} \times 100$$
n If the data set is a population, the coefficient of variation is
computed as follows:
$$\frac{\sigma}{\mu} \times 100$$


Example: Apartment Rents


n Variance
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 2{,}996.16$$

n Standard Deviation
$$s = \sqrt{s^2} = \sqrt{2{,}996.16} = 54.74$$

n Coefficient of Variation
$$\frac{s}{\bar{x}} \times 100 = \frac{54.74}{490.80} \times 100 = 11.15$$
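
The three sample measures can be reproduced with a minimal sketch like the one below, which uses the n - 1 divisor for the sample variance. Variable names are assumptions.

```python
import math

# Sample of 70 monthly apartment rents, transcribed from the earlier table.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)
mean = sum(rents) / n

# Sum of squared deviations from the sample mean.
squared_deviations = sum((x - mean) ** 2 for x in rents)

variance = squared_deviations / (n - 1)   # sample variance s^2
std_dev = math.sqrt(variance)             # sample standard deviation s
cv = std_dev / mean * 100                 # coefficient of variation, in percent

print(f"{variance:.2f} {std_dev:.2f} {cv:.2f}")
# 2996.16 54.74 11.15
```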

Measures of Relative Location
and Detecting Outliers
n z-Scores
n Chebyshev’s Theorem
n Empirical Rule
n Detecting Outliers


z-Scores
n The z-score is often called the standardized value.
n It denotes the number of standard deviations a data value $x_i$
is from the mean.
$$z_i = \frac{x_i - \bar{x}}{s}$$
n A data value less than the sample mean will have a z-score
less than zero.
n A data value greater than the sample mean will have a
z-score greater than zero.
n A data value equal to the sample mean will have a z-score of
zero.

Example: Apartment Rents


n z-Score of Smallest Value (425)
$$z = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20$$

Standardized Values for Apartment Rents
-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27
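
A sketch of how the standardized values might be computed. It uses the unrounded sample standard deviation, so individual z-scores could in principle differ in the last digit from a table built with s = 54.74. Names are assumptions.

```python
import math

# Sample of 70 monthly apartment rents, transcribed from the earlier table.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)
mean = sum(rents) / n
s = math.sqrt(sum((x - mean) ** 2 for x in rents) / (n - 1))

# z_i = (x_i - mean) / s for every observation.
z_scores = [(x - mean) / s for x in rents]

print(f"{z_scores[0]:.2f} {z_scores[-1]:.2f}")  # -1.20 2.27
```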


Chebyshev’s Theorem
At least $1 - 1/k^2$ of the items in any data set will be
within k standard deviations of the mean, where k is
any value greater than 1.
• At least 75% of the items must be within
k = 2 standard deviations of the mean.
• At least 89% of the items must be within
k = 3 standard deviations of the mean.
• At least 94% of the items must be within
k = 4 standard deviations of the mean.

Example: Apartment Rents


n Chebyshev’s Theorem

Let k = 1.5 with $\bar{x}$ = 490.80 and s = 54.74

At least $1 - 1/(1.5)^2$ = 1 - 0.44 = 0.56 or 56%
of the rent values must be between
$\bar{x} - k(s)$ = 490.80 - 1.5(54.74) = 409
and
$\bar{x} + k(s)$ = 490.80 + 1.5(54.74) = 573


Example: Apartment Rents


n Chebyshev’s Theorem (continued)
Actually, 86% of the rent values
are between 409 and 573.
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
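
The comparison between the Chebyshev bound and the observed share can be sketched as follows, with k = 1.5 as in the example. Variable names are assumptions.

```python
import math

# Sample of 70 monthly apartment rents, transcribed from the table above.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)
mean = sum(rents) / n
s = math.sqrt(sum((x - mean) ** 2 for x in rents) / (n - 1))

k = 1.5
bound = 1 - 1 / k**2                  # guaranteed minimum share
lower, upper = mean - k * s, mean + k * s

# Count the observations that actually fall within k standard deviations.
inside = sum(lower <= x <= upper for x in rents)
share = inside / n

print(f"bound = {bound:.2f}, observed = {inside}/{n} = {share:.2f}")
# bound = 0.56, observed = 60/70 = 0.86
```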

Empirical Rule
For data having a bell-shaped distribution:

• Approximately 68% of the data values will be within
one standard deviation of the mean.


Empirical Rule
For data having a bell-shaped distribution:

• Approximately 95% of the data values will be within
two standard deviations of the mean.

Empirical Rule
For data having a bell-shaped distribution:

• Almost all (99.7%) of the items will be within three
standard deviations of the mean.


Example: Apartment Rents


n Empirical Rule
Interval                              % in Interval
Within +/- 1s   436.06 to 545.54      48/70 = 69%
Within +/- 2s   381.32 to 600.28      68/70 = 97%
Within +/- 3s   326.58 to 655.02      70/70 = 100%
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
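
The counts in the table above might be reproduced with the sketch below. The dictionary name `counts` is an assumption, and the interval endpoints use the unrounded standard deviation, so they can differ from the table in the second decimal.

```python
import math

# Sample of 70 monthly apartment rents, transcribed from the table above.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)
mean = sum(rents) / n
s = math.sqrt(sum((x - mean) ** 2 for x in rents) / (n - 1))

counts = {}
for k in (1, 2, 3):
    low, high = mean - k * s, mean + k * s
    # Number of rents within k standard deviations of the mean.
    counts[k] = sum(low <= x <= high for x in rents)
    print(f"+/- {k}s: {counts[k]}/{n} = {counts[k] / n:.0%}")
```

The observed shares (69%, 97%, 100%) are close to the 68%, 95%, and 99.7% that the empirical rule suggests for bell-shaped data.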

Detecting Outliers
n An outlier is an unusually small or unusually large value in a
data set.
n A data value with a z-score less than -3 or greater than +3
might be considered an outlier.
n It might be an incorrectly recorded data value.
n It might be a data value that was incorrectly included in the
data set.
n It might be a correctly recorded data value that belongs in
the data set!


Example: Apartment Rents


n Detecting Outliers
The most extreme z-scores are -1.20 and 2.27.
Using |z| > 3 as the criterion for an outlier,
there are no outliers in this data set.
Standardized Values for Apartment Rents
-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27

Exploratory Data Analysis


n Five-Number Summary
n Box Plot


Five-Number Summary
n Smallest Value
n First Quartile
n Median
n Third Quartile
n Largest Value

Example: Apartment Rents


n Five-Number Summary
Smallest Value = 425 First Quartile = 445
Median = 475 Third Quartile = 525
Largest Value = 615
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615


Box Plot
n A box is drawn with its ends located at the first and third
quartiles.
n A vertical line is drawn in the box at the location of the
median.
n Limits are located (not drawn) using the interquartile range
(IQR).
• The lower limit is located 1.5(IQR) below Q1.
• The upper limit is located 1.5(IQR) above Q3.
• Data outside these limits are considered outliers.
… continued

Box Plot (Continued)


n Whiskers (dashed lines) are drawn from the ends of the
box to the smallest and largest data values inside the
limits.
n The location of each outlier is shown with the symbol *.


Example: Apartment Rents


n Box Plot

Lower Limit: Q1 - 1.5(IQR) = 445 - 1.5(80) = 325
Upper Limit: Q3 + 1.5(IQR) = 525 + 1.5(80) = 645
There are no outliers.

[Figure: box plot of the apartment rents on a horizontal axis running from 375 to 625 in steps of 25]
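
The quantities needed to draw the box plot could be computed as in the sketch below. Q1 and Q3 are taken at ordered positions 18 and 53, following the percentile rule (i = 17.5 and i = 52.5, each rounded up), which gives Q1 = 445 as in the interquartile range example. All variable names are assumptions.

```python
# Sample of 70 monthly apartment rents, transcribed from the earlier table.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

data = sorted(rents)

# Five-number summary (positions are 1-based in the slides).
smallest, largest = data[0], data[-1]
q1 = data[18 - 1]
median = (data[35 - 1] + data[36 - 1]) / 2
q3 = data[53 - 1]

# Limits used to flag outliers.
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_limit or x > upper_limit]

# Whiskers end at the most extreme values still inside the limits.
inside = [x for x in data if lower_limit <= x <= upper_limit]
whisker_low, whisker_high = inside[0], inside[-1]

print(smallest, q1, median, q3, largest)  # 425 445 475.0 525 615
print(lower_limit, upper_limit, outliers)  # 325.0 645.0 []
```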

Measures of Association
Between Two Variables
n Covariance
n Correlation Coefficient


Covariance
n The covariance is a measure of the linear association
between two variables.
n Positive values indicate a positive relationship.
n Negative values indicate a negative relationship.

Covariance
n If the data sets are samples, the covariance is denoted
by $s_{xy}$.
$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

n If the data sets are populations, the covariance is
denoted by $\sigma_{xy}$.
$$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$


Correlation Coefficient
n The coefficient can take on values between -1 and +1.
n Values near -1 indicate a strong negative linear
relationship.
n Values near +1 indicate a strong positive linear
relationship.
n If the data sets are samples, the coefficient is $r_{xy}$.
$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$
n If the data sets are populations, the coefficient is $\rho_{xy}$.
$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$

The Weighted Mean and
Working with Grouped Data

n Weighted Mean
n Mean for Grouped Data
n Variance for Grouped Data
n Standard Deviation for Grouped Data


Weighted Mean
n When the mean is computed by giving each data value a
weight that reflects its importance, it is referred to as a
weighted mean.
n In the computation of a grade point average (GPA), the
weights are the number of credit hours earned for each
grade.
n When data values vary in importance, the analyst must
choose the weight that best reflects the importance of each
value.

Weighted Mean
$$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$$
where:
xi = value of observation i
wi = weight for observation i
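
A minimal sketch of the weighted mean, using the GPA setting mentioned earlier. The grade points and credit hours below are hypothetical values chosen for illustration, and the variable names are assumptions.

```python
# Hypothetical grades (illustrative only): grade points earned and the
# credit hours attached to each grade, used as weights.
grade_points = [4.0, 3.0, 2.0]   # x_i
credit_hours = [6, 3, 1]         # w_i

# Weighted mean: sum of w_i * x_i divided by the sum of the weights.
weighted_sum = sum(w * x for w, x in zip(credit_hours, grade_points))
weighted_mean = weighted_sum / sum(credit_hours)

# Unweighted mean of the same grades, for comparison.
simple_mean = sum(grade_points) / len(grade_points)

print(weighted_mean, simple_mean)  # 3.5 3.0
```

The weighted result is higher than the simple mean because most of the credit hours were earned at the top grade.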


Grouped Data
n The weighted mean computation can be used to obtain
approximations of the mean, variance, and standard
deviation for the grouped data.
n To compute the weighted mean, we treat the midpoint of
each class as though it were the mean of all items in the
class.
n We compute a weighted mean of the class midpoints using
the class frequencies as weights.
n Similarly, in computing the variance and standard
deviation, the class frequencies are used as weights.

Mean for Grouped Data


n Sample Data
$$\bar{x} = \frac{\sum f_i M_i}{\sum f_i}$$

n Population Data
$$\mu = \frac{\sum f_i M_i}{N}$$

where:
fi = frequency of class i
Mi = midpoint of class i


Example: Apartment Rents


Given below is the previous sample of monthly rents for
one-bedroom apartments, presented here as grouped data in the
form of a frequency distribution.
Rent ($) Frequency
420-439 8
440-459 17
460-479 12
480-499 8
500-519 7
520-539 4
540-559 2
560-579 4
580-599 2
600-619 6

Example: Apartment Rents


n Mean for Grouped Data
Rent ($)    fi     Mi       fiMi
420-439      8    429.5    3436.0
440-459     17    449.5    7641.5
460-479     12    469.5    5634.0
480-499      8    489.5    3916.0
500-519      7    509.5    3566.5
520-539      4    529.5    2118.0
540-559      2    549.5    1099.0
560-579      4    569.5    2278.0
580-599      2    589.5    1179.0
600-619      6    609.5    3657.0
Total       70            34525.0

$$\bar{x} = \frac{34{,}525}{70} = 493.21$$

This approximation differs by $2.41 from the actual sample
mean of $490.80.


Variance for Grouped Data


n Sample Data
$$s^2 = \frac{\sum f_i (M_i - \bar{x})^2}{n - 1}$$

n Population Data
$$\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}$$

Example: Apartment Rents


n Variance for Grouped Data
$$s^2 = 3{,}017.89$$

n Standard Deviation for Grouped Data
$$s = \sqrt{3{,}017.89} = 54.94$$

This approximation differs by only $0.20
from the actual standard deviation of $54.74.
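
The grouped-data approximations can be reproduced with a sketch like the following, where each class is represented by its midpoint and weighted by its frequency. The tuple layout and variable names are assumptions; the class limits and frequencies are those of the frequency distribution given earlier.

```python
import math

# (lower class limit, upper class limit, frequency) for each rent class.
classes = [
    (420, 439, 8), (440, 459, 17), (460, 479, 12), (480, 499, 8),
    (500, 519, 7), (520, 539, 4), (540, 559, 2), (560, 579, 4),
    (580, 599, 2), (600, 619, 6),
]

midpoints = [(low + high) / 2 for low, high, _ in classes]   # M_i
freqs = [f for _, _, f in classes]                           # f_i
n = sum(freqs)

# Grouped mean: frequency-weighted mean of the class midpoints.
grouped_mean = sum(f * m for f, m in zip(freqs, midpoints)) / n

# Grouped sample variance and standard deviation (n - 1 divisor).
grouped_var = sum(
    f * (m - grouped_mean) ** 2 for f, m in zip(freqs, midpoints)
) / (n - 1)
grouped_sd = math.sqrt(grouped_var)

print(f"{grouped_mean:.2f} {grouped_var:.2f} {grouped_sd:.2f}")
# 493.21 3017.89 54.94
```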


End of Chapter 3
