Lect 1 - 2
INTERNATIONAL EXECUTIVE
MASTER OF BUSINESS ADMINISTRATION
IEMBA
© 2024 International Executive MBA - Paris Graduate School of Management. All rights reserved.
Management Decision Making
January 2024
LECTURES 1 & 2
Chapter 1
Data and Statistics
n Applications in Business and Economics
n Data
n Data Sources
n Descriptive Statistics
n Statistical Inference
Applications in
Business and Economics
n Accounting
Public accounting firms use statistical sampling procedures when
conducting audits for their clients.
n Finance
Financial advisors use a variety of statistical information, including
price-earnings ratios and dividend yields, to guide their
investment recommendations.
n Marketing
Electronic point-of-sale scanners at retail checkout counters are
being used to collect data for a variety of marketing research
applications.
Applications in
Business and Economics
n Production
A variety of statistical quality control charts are used to
monitor the output of a production process.
n Economics
Economists use statistical information in making
forecasts about the future of the economy or some
aspect of it.
Data
Scales of Measurement
n Scales of measurement include:
• Nominal
• Ordinal
• Interval
• Ratio
n The scale determines the amount of information
contained in the data.
n The scale indicates the data summarization and
statistical analyses that are most appropriate.
Scales of Measurement
n Nominal
• Data are labels or names used to identify an
attribute of the element.
• A nonnumeric label or a numeric code may be used.
Scales of Measurement
n Nominal
• Example:
Students of a university are classified by the school
in which they are enrolled using a nonnumeric label
such as Business, Humanities, Education, and so on.
Alternatively, a numeric code could be used for the
school variable (e.g. 1 denotes Business, 2 denotes
Humanities, 3 denotes Education, and so on).
Scales of Measurement
n Ordinal
• The data have the properties of nominal data and
the order or rank of the data is meaningful.
• A nonnumeric label or a numeric code may be used.
Scales of Measurement
n Ordinal
• Example:
Students of a university are classified by their
class standing using a nonnumeric label such as
Freshman, Sophomore, Junior, or Senior.
Alternatively, a numeric code could be used for
the class standing variable (e.g. 1 denotes
Freshman, 2 denotes Sophomore, and so on).
Scales of Measurement
n Interval
• The data have the properties of ordinal data and the
interval between observations is expressed in terms
of a fixed unit of measure.
• Interval data are always numeric.
Scales of Measurement
n Interval
• Example:
Melissa has an SAT score of 1205, while Kevin has
an SAT score of 1090. Melissa scored 115 points
more than Kevin.
Scales of Measurement
n Ratio
• The data have all the properties of interval data and
the ratio of two values is meaningful.
• Variables such as distance, height, weight, and time
use the ratio scale.
• This scale must contain a zero value that indicates
that nothing exists for the variable at the zero point.
Scales of Measurement
n Ratio
• Example:
Melissa’s college record shows 36 credit hours
earned, while Kevin’s record shows 72 credit
hours earned. Kevin has twice as many credit
hours earned as Melissa.
Qualitative Data
n Qualitative data are labels or names used to identify an
attribute of each element.
n Qualitative data use either the nominal or ordinal scale
of measurement.
n Qualitative data can be either numeric or nonnumeric.
n The statistical analyses available for qualitative data are
rather limited.
Quantitative Data
n Quantitative data indicate either how many or how
much.
• Quantitative data that measure how many are
discrete.
• Quantitative data that measure how much are
continuous because there is no separation between
the possible values for the data.
n Quantitative data are always numeric.
n Ordinary arithmetic operations are meaningful only
with quantitative data.
Data Sources
n Existing Sources
• Data needed for a particular application might
already exist within a firm. Detailed information is
often kept on customers, suppliers, and employees,
for example.
• Substantial amounts of business and economic data
are available from organizations that specialize in
collecting and maintaining data.
Data Sources
n Existing Sources
• Government agencies are another important source
of data.
• Data are also available from a variety of industry
associations and special-interest organizations.
Data Sources
n Internet
• The Internet has become an important source of
data.
• Most government agencies, like the Bureau of the
Census (www.census.gov), make their data available
through a web site.
• More and more companies are creating web sites
and providing public access to them.
• A number of companies now specialize in making
information available over the Internet.
Data Sources
n Statistical Studies
• Statistical studies can be classified as either experimental
or observational.
• In experimental studies the variables of interest are first
identified. Then one or more factors are controlled so
that data can be obtained about how the factors influence
the variables.
• In observational (nonexperimental) studies no attempt is
made to control or influence the variables of interest.
• A survey is perhaps the most common type of
observational study.
Descriptive Statistics
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
[Figure: histogram of the 50 values above; horizontal axis: Parts Cost ($), 50 to 110; vertical axis scaled from 2 to 12]
Statistical Inference
n Statistical inference is the process of using data
obtained from a small group of elements (the sample)
to make estimates and test hypotheses about the
characteristics of a larger group of elements (the
population).
End of Chapter 1
Chapter 2
Descriptive Statistics:
Tabular and Graphical Methods
n Frequency Distribution
n Relative Frequency
n Percent Frequency Distribution
n Bar Graph
n Pie Chart
Frequency Distribution
n A frequency distribution is a tabular summary of data
showing the frequency (or number) of items in each of
several nonoverlapping classes.
n The objective is to provide insights about the data that
cannot be quickly obtained by looking only at the
original data.
Rating            Frequency
Poor                  2
Below Average         3
Average               5
Above Average         9
Excellent             1
Total                20

                  Relative     Percent
Rating            Frequency    Frequency
Poor                .10          10
Below Average       .15          15
Average             .25          25
Above Average       .45          45
Excellent           .05           5
Total              1.00         100
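The relative and percent frequency columns follow directly from the frequency column. A minimal Python sketch, using the rating frequencies from the table (the variable names are illustrative choices, not part of the lecture):

```python
# Frequencies copied from the quality-rating table.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequency.values())

# Relative frequency = class frequency / total number of items.
relative = {rating: count / total for rating, count in frequency.items()}

# Percent frequency = relative frequency expressed as a percentage.
percent = {rating: 100 * count / total for rating, count in frequency.items()}

for rating, count in frequency.items():
    print(f"{rating:<15}{count:>5}{relative[rating]:>8.2f}{percent[rating]:>6.0f}")
print(f"{'Total':<15}{total:>5}{sum(relative.values()):>8.2f}{sum(percent.values()):>6.0f}")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any hand-built table.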
Bar Graph
n A bar graph is a graphical device for depicting qualitative
data that have been summarized in a frequency, relative
frequency, or percent frequency distribution.
n On the horizontal axis we specify the labels that are used for
each of the classes.
n A frequency, relative frequency, or percent frequency scale
can be used for the vertical axis.
n Using a bar of fixed width drawn above each class label, we
extend the height appropriately.
n The bars are separated to emphasize the fact that each class
is a separate category.
[Figure: bar graph of the quality ratings; horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent)]
Pie Chart
n The pie chart is a commonly used graphical device for
presenting relative frequency distributions for
qualitative data.
n First draw a circle; then use the relative frequencies to
subdivide the circle into sectors that correspond to the
relative frequency for each class.
n Since there are 360 degrees in a circle, a class with a
relative frequency of .25 would consume .25(360) =
90 degrees of the circle.
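The sector angles for the quality ratings can be worked out the same way. A short Python sketch, assuming the rating frequencies from the frequency distribution shown earlier (names are illustrative):

```python
# Rating frequencies from the frequency distribution of quality ratings.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequency.values())

# Sector angle = relative frequency * 360 degrees.
angles = {rating: 360 * count / total for rating, count in frequency.items()}

for rating, angle in angles.items():
    print(f"{rating:<15}{angle:>6.0f} degrees")
```

The class with relative frequency .25 (Average) gets the 90-degree sector described above, and the angles add up to the full 360 degrees.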
[Figure: pie chart titled "Quality Ratings"]
n Frequency Distribution
n Relative Frequency and Percent Frequency
Distributions
n Dot Plot
n Histogram
n Cumulative Distributions
n Ogive
Frequency Distribution
n Guidelines for Selecting Number of Classes
• Use between 5 and 20 classes.
• Data sets with a larger number of elements usually
require a larger number of classes.
• Smaller data sets usually require fewer classes.
Frequency Distribution
n Guidelines for Selecting Width of Classes
• Use classes of equal width.
• Approximate class width:
$$\text{Approximate Class Width} = \frac{\text{Largest Data Value} - \text{Smallest Data Value}}{\text{Number of Classes}}$$
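As an illustration of these guidelines, the sketch below builds a frequency distribution for the 50 parts-cost values listed under Descriptive Statistics in Chapter 1. The choice of 6 classes, the rounding of the width up to 10, and the starting point of 50 are assumptions made for this example, not values stated in the lecture.

```python
import math

# The 50 parts-cost values ($) from the Chapter 1 data listing.
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

num_classes = 6  # assumption: a small data set, so near the low end of 5 to 20
approx_width = (max(parts_cost) - min(parts_cost)) / num_classes
class_width = math.ceil(approx_width)  # assumption: round up to a convenient width
first_lower = 50  # assumption: start at a round value at or below the minimum

# Tally each value into its class (all classes have equal width).
counts = [0] * num_classes
for value in parts_cost:
    counts[(value - first_lower) // class_width] += 1

print(f"approximate width = {approx_width}, width used = {class_width}")
for k, count in enumerate(counts):
    lower = first_lower + k * class_width
    print(f"{lower}-{lower + class_width - 1}: {count}")
```

Rounding the approximate width up keeps the class limits easy to read while still covering the largest value.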
Dot Plot
n One of the simplest graphical summaries of data is a
dot plot.
n A horizontal axis shows the range of data values.
n Then each data value is represented by a dot placed
above the axis.
Histogram
n Another common graphical presentation of quantitative data
is a histogram.
n The variable of interest is placed on the horizontal axis and
the frequency, relative frequency, or percent frequency is
placed on the vertical axis.
n A rectangle is drawn above each class interval with its height
corresponding to the interval’s frequency, relative frequency,
or percent frequency.
n Unlike a bar graph, a histogram has no natural separation
between rectangles of adjacent classes.
[Figure: histogram; horizontal axis: Parts Cost ($), 50 to 110; vertical axis scaled from 2 to 12]
Cumulative Distribution
n The cumulative frequency distribution shows the
number of items with values less than or equal to the
upper limit of each class.
n The cumulative relative frequency distribution shows
the proportion of items with values less than or equal
to the upper limit of each class.
n The cumulative percent frequency distribution shows
the percentage of items with values less than or equal
to the upper limit of each class.
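All three cumulative distributions are running totals of the class frequencies. A minimal Python sketch follows; the class limits and frequencies are assumptions for illustration (they correspond to grouping the Chapter 1 parts-cost data into classes of width 10 starting at 50).

```python
from itertools import accumulate

# Assumed classes 50-59, 60-69, ..., 100-109 and their frequencies.
upper_limits = [59, 69, 79, 89, 99, 109]
frequency = [2, 13, 16, 7, 7, 5]

n = sum(frequency)

# Running total of frequencies up to and including each class.
cumulative = list(accumulate(frequency))
cumulative_relative = [c / n for c in cumulative]
cumulative_percent = [100 * c / n for c in cumulative]

for limit, c, rel, pct in zip(upper_limits, cumulative,
                              cumulative_relative, cumulative_percent):
    print(f"<= {limit:>3}: {c:>3} {rel:>5.2f} {pct:>4.0f}")
```

The last class always shows the full count, a cumulative relative frequency of 1, and a cumulative percent frequency of 100.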
Crosstabulation
n Crosstabulation is a tabular method for summarizing the
data for two variables simultaneously.
n Crosstabulation can be used when:
• One variable is qualitative and the other is quantitative
• Both variables are qualitative
• Both variables are quantitative
n The left and top margin labels define the classes for the two
variables.
< $99,000 18 6 19 12 55
> $99,000 12 14 16 3 45
Total 30 20 35 15 100
Scatter Diagram
n A scatter diagram is a graphical presentation of the
relationship between two quantitative variables.
n One variable is shown on the horizontal axis and the
other variable is shown on the vertical axis.
n The general pattern of the plotted points suggests the
overall relationship between the variables.
Scatter Diagram
n A Positive Relationship
[Figure: scatter diagram of y against x showing a positive relationship]
Scatter Diagram
n A Negative Relationship
[Figure: scatter diagram of y against x showing a negative relationship]
Scatter Diagram
n No Apparent Relationship
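[Figure: scatter diagram of y against x showing no apparent relationship]
[Figure: scatter diagram; horizontal axis: Number of Interceptions (x), 0 to 3; vertical axis: y, 0 to 30]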
End of Chapter 2
Chapter 3
Descriptive Statistics: Numerical Methods
n Measures of Location
n Measures of Variability
n Measures of Relative Location and Detecting Outliers
n Exploratory Data Analysis
n Measures of Association Between Two Variables
n The Weighted Mean and
Working with Grouped Data
Measures of Location
n Mean
n Median
n Mode
n Percentiles
n Quartiles
Mean
n The mean of a data set is the average of all the data
values.
n If the data are from a sample, the mean is denoted by $\bar{x}$.
$$\bar{x} = \frac{\sum x_i}{n}$$
n If the data are from a population, the mean is denoted
by $\mu$ (mu).
$$\mu = \frac{\sum x_i}{N}$$
Median
n The median is the measure of location most often
reported for annual income and property value data.
n A few extremely large incomes or property values can
inflate the mean.
Median
n The median of a data set is the value in the middle
when the data items are arranged in ascending order.
n For an odd number of observations, the median is the
middle value.
n For an even number of observations, the median is the
average of the two middle values.
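The odd/even rule can be sketched in a few lines of Python. The income figures below are hypothetical, chosen only to show how one extreme value inflates the mean while barely moving the median.

```python
import statistics


def median(values):
    """Median by the rule above: middle value, or average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical annual incomes in $1000s.
incomes = [32, 35, 38, 41, 45]
incomes_with_outlier = incomes + [400]

print(median(incomes), statistics.mean(incomes))
print(median(incomes_with_outlier), statistics.mean(incomes_with_outlier))
```

With the extreme value added, the median moves from 38 to 39.5 while the mean jumps from 38.2 to 98.5.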
Mode
n The mode of a data set is the value that occurs with
greatest frequency.
n The greatest frequency can occur at two or more
different values.
n If the data have exactly two modes, the data are
bimodal.
n If the data have more than two modes, the data are
multimodal.
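A small Python sketch of these cases, using `statistics.multimode` from the standard library (Python 3.8 or later). The helper name and the sample lists are hypothetical.

```python
from statistics import multimode


def describe_modes(values):
    """Return the modes and a label following the definitions above."""
    modes = multimode(values)  # every value tied for the greatest frequency
    if len(modes) == 1:
        label = "one mode"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label


print(describe_modes([1, 2, 2, 3]))
print(describe_modes([1, 1, 2, 2, 3]))
print(describe_modes([1, 1, 2, 2, 3, 3]))
```

Note that when every value occurs exactly once, all values tie for the greatest frequency, so this sketch would label such data multimodal.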
Percentiles
n A percentile provides information about how the data
are spread over the interval from the smallest value to
the largest value.
n Admission test scores for colleges and universities are
frequently reported in terms of percentiles.
Percentiles
n The pth percentile of a data set is a value such that at least p
percent of the items take on this value or less and at least
(100 - p) percent of the items take on this value or more.
• Arrange the data in ascending order.
• Compute index i, the position of the pth percentile.
$$i = \frac{p}{100}\, n$$
• If i is not an integer, round up. The pth percentile is the
value in the ith position.
• If i is an integer, the pth percentile is the average of the
values in positions i and i + 1.
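The steps above can be sketched in Python as follows. The function name is an illustrative choice, p is assumed to be a whole number strictly between 0 and 100, and integer arithmetic is used so that the "is i an integer?" test is exact. The data are the 50 parts-cost values from Chapter 1.

```python
def percentile(values, p):
    """pth percentile using the index rule i = (p/100)n described above."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)  # step 1: ascending order
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)  # i = whole + remainder/100
    if remainder:
        # i is not an integer: round up to position whole + 1 (1-based),
        # which is index `whole` in a 0-based Python list.
        return ordered[whole]
    # i is an integer: average the values in positions i and i + 1 (1-based).
    return (ordered[whole - 1] + ordered[whole]) / 2


parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

for p in (25, 50, 75, 90):
    print(p, percentile(parts_cost, p))
```

Statistical software often uses interpolation rules that differ slightly from this one, so results from other tools may not match exactly.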
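Quartiles
n Quartiles are specific percentiles.
n First Quartile = 25th Percentile
n Second Quartile = 50th Percentile = Median
n Third Quartile = 75th Percentile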
Measures of Variability
n It is often desirable to consider measures of variability
(dispersion), as well as measures of location.
n For example, in choosing supplier A or supplier B we
might consider not only the average delivery time for
each, but also the variability in delivery time for each.
Measures of Variability
n Range
n Interquartile Range
n Variance
n Standard Deviation
n Coefficient of Variation
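Range
n The range of a data set is the difference between the
largest and smallest data values.
n It is the simplest measure of variability.
n It is very sensitive to the smallest and largest data
values.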
Interquartile Range
n The interquartile range of a data set is the difference
between the third quartile and the first quartile.
n It is the range for the middle 50% of the data.
n It overcomes the range's sensitivity to extreme data values.
Variance
n The variance is a measure of variability that utilizes
all the data.
n It is based on the difference between the value of each
observation ($x_i$) and the mean ($\bar{x}$ for a sample, $\mu$ for a
population).
Variance
n The variance is the average of the squared differences
between each data value and the mean.
n If the data set is a sample, the variance is denoted by $s^2$.
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
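A minimal Python sketch of the sample variance, computed step by step and then checked against the standard library. The five data values are hypothetical.

```python
import statistics

# Hypothetical sample of five observations.
sample = [46, 54, 42, 46, 32]

n = len(sample)
mean = sum(sample) / n

# Squared difference between each observation and the sample mean.
squared_deviations = [(x - mean) ** 2 for x in sample]

# Sample variance divides by n - 1, not n.
sample_variance = sum(squared_deviations) / (n - 1)

print(mean, squared_deviations, sample_variance)
print(statistics.variance(sample))  # library result for comparison
```

For a population the same sum of squared deviations would be divided by N instead, which `statistics.pvariance` does.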
Standard Deviation
n The standard deviation of a data set is the positive square
root of the variance.
n It is measured in the same units as the data, so it is easier
than the variance to compare with the mean.
n If the data set is a sample, the standard deviation is denoted
by s.
$$s = \sqrt{s^2}$$
Coefficient of Variation
n The coefficient of variation indicates how large the
standard deviation is in relation to the mean.
n If the data set is a sample, the coefficient of variation is
computed as follows:
$$\frac{s}{\bar{x}}(100)$$
n If the data set is a population, the coefficient of variation is
computed as follows:
$$\frac{\sigma}{\mu}(100)$$
n Standard Deviation
$$s = \sqrt{s^2} = \sqrt{2996.47} = 54.74$$
n Coefficient of Variation
$$\frac{s}{\bar{x}}(100) = \frac{54.74}{490.80}(100) = 11.15$$
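The arithmetic in this worked example can be checked with a few lines of Python, taking the sample variance (2996.47) and sample mean (490.80) as given:

```python
import math

sample_variance = 2996.47  # s^2 from the worked example
sample_mean = 490.80       # sample mean from the worked example

# Standard deviation is the positive square root of the variance.
s = math.sqrt(sample_variance)

# Coefficient of variation: standard deviation as a percentage of the mean.
cv = s / sample_mean * 100

print(round(s, 2), round(cv, 2))
```

A coefficient of variation of about 11 means the standard deviation is roughly 11% of the mean.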
z-Scores
n The z-score is often called the standardized value.
n It denotes the number of standard deviations a data value $x_i$
is from the mean.
$$z_i = \frac{x_i - \bar{x}}{s}$$
n A data value less than the sample mean will have a z-score
less than zero.
n A data value greater than the sample mean will have a
z-score greater than zero.
n A data value equal to the sample mean will have a z-score of
zero.
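A short Python sketch of the z-score calculation. The five data values are hypothetical; they have a sample mean of 44 and a sample standard deviation of 8, which keeps the arithmetic easy to follow.

```python
import statistics

# Hypothetical sample of five observations.
sample = [46, 54, 42, 46, 32]

mean = statistics.mean(sample)  # sample mean
s = statistics.stdev(sample)    # sample standard deviation (n - 1 divisor)

# z-score: number of standard deviations each value lies from the mean.
z_scores = [(x - mean) / s for x in sample]

for x, z in zip(sample, z_scores):
    print(f"x = {x:>3}  z = {z:+.2f}")
```

Values below the mean (42 and 32) get negative z-scores and values above it get positive ones, as stated above.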
Standardized Values for Apartment Rents
-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27
Chebyshev’s Theorem
At least $(1 - 1/k^2)$ of the items in any data set will be
within k standard deviations of the mean, where k is
any value greater than 1.
• At least 75% of the items must be within
k = 2 standard deviations of the mean.
• At least 89% of the items must be within
k = 3 standard deviations of the mean.
• At least 94% of the items must be within
k = 4 standard deviations of the mean.
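The three percentages come from evaluating the bound at k = 2, 3, and 4. A minimal Python sketch (the function name is an illustrative choice):

```python
def chebyshev_bound(k):
    """Minimum proportion of items within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2


for k in (2, 3, 4):
    print(f"k = {k}: at least {100 * chebyshev_bound(k):.0f}%")
```

Because k only needs to be greater than 1, the same function works for non-integer values such as k = 1.5.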
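Empirical Rule
For data having a bell-shaped distribution:
• Approximately 68% of the items will be within
one standard deviation of the mean.
• Approximately 95% of the items will be within
two standard deviations of the mean.
• Almost all of the items (about 99.7%) will be within
three standard deviations of the mean.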
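For an exactly normal distribution, the share of values within k standard deviations of the mean can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). Treating "bell-shaped" as exactly normal is an assumption of this sketch; real data will only match approximately.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * share_within(k):.2f}%")
```

These shares are far higher than the minimums guaranteed by Chebyshev's theorem, which makes no assumption about the shape of the distribution.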
Detecting Outliers
n An outlier is an unusually small or unusually large value in a
data set.
n A data value with a z-score less than -3 or greater than +3
might be considered an outlier.
n It might be an incorrectly recorded data value.
n It might be a data value that was incorrectly included in the
data set.
n It might be a correctly recorded data value that belongs in
the data set!
Five-Number Summary
n Smallest Value
n First Quartile
n Median
n Third Quartile
n Largest Value
Box Plot
n A box is drawn with its ends located at the first and third
quartiles.
n A vertical line is drawn in the box at the location of the
median.
n Limits are located (not drawn) using the interquartile range
(IQR).
• The lower limit is located 1.5(IQR) below Q1.
• The upper limit is located 1.5(IQR) above Q3.
• Data outside these limits are considered outliers.
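A Python sketch of the limit calculation. The data set is hypothetical, and the quartiles are found with the percentile index rule from the Percentiles slide (the helper function is an illustrative implementation, with p assumed to be a whole number between 0 and 100).

```python
def percentile(values, p):
    """pth percentile using the index rule i = (p/100)n."""
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder:
        return ordered[whole]  # i not an integer: round up
    return (ordered[whole - 1] + ordered[whole]) / 2  # i integer: average


# Hypothetical data with one unusually large value.
data = [10, 12, 13, 15, 16, 18, 19, 20, 22, 24, 25, 60]

q1 = percentile(data, 25)
q3 = percentile(data, 75)
iqr = q3 - q1

# Limits are located 1.5(IQR) below Q1 and above Q3.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_limit or x > upper_limit]

print(q1, q3, iqr, lower_limit, upper_limit, outliers)
```

Here only the value 60 falls outside the limits, so it is the one point the box plot would flag as an outlier.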
Measures of Association
Between Two Variables
n Covariance
n Correlation Coefficient
Covariance
n The covariance is a measure of the linear association
between two variables.
n Positive values indicate a positive relationship.
n Negative values indicate a negative relationship.
Covariance
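n If the data sets are samples, the covariance is denoted
by $s_{xy}$.
$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
n If the data sets are populations, the covariance is denoted
by $\sigma_{xy}$.
$$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$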
Correlation Coefficient
n The coefficient can take on values between -1 and +1.
n Values near -1 indicate a strong negative linear
relationship.
n Values near +1 indicate a strong positive linear
relationship.
n If the data sets are samples, the coefficient is $r_{xy}$.
$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$
n If the data sets are populations, the coefficient is $\rho_{xy}$.
$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$
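The sample covariance and sample correlation coefficient can be computed directly from their definitions. A minimal Python sketch with hypothetical paired data (five observations of x and y):

```python
import math

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance: sum of cross-products of deviations, divided by n - 1.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations of x and y.
s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

# Sample correlation coefficient.
r_xy = s_xy / (s_x * s_y)

print(f"s_xy = {s_xy:.2f}, r_xy = {r_xy:.4f}")
```

The covariance depends on the units of x and y, whereas the correlation coefficient is unit-free and always lies between -1 and +1.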
n Weighted Mean
n Mean for Grouped Data
n Variance for Grouped Data
n Standard Deviation for Grouped Data
Weighted Mean
n When the mean is computed by giving each data value a
weight that reflects its importance, it is referred to as a
weighted mean.
n In the computation of a grade point average (GPA), the
weights are the number of credit hours earned for each
grade.
n When data values vary in importance, the analyst must
choose the weight that best reflects the importance of each
value.
Weighted Mean
$$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$$
where:
xi = value of observation i
wi = weight for observation i
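A short Python sketch of the GPA case mentioned earlier, with credit hours as the weights. The grade points and credit hours are hypothetical.

```python
# Hypothetical grade points (A = 4, B = 3, C = 2) and credit hours earned at each grade.
grade_points = [4.0, 3.0, 2.0]
credit_hours = [6, 3, 1]

# Weighted mean: sum of (weight * value) divided by the sum of the weights.
weighted_mean = (
    sum(w * x for w, x in zip(credit_hours, grade_points)) / sum(credit_hours)
)

# Unweighted mean for comparison: every grade counts equally.
simple_mean = sum(grade_points) / len(grade_points)

print(weighted_mean, simple_mean)
```

The weighted mean (3.5) is higher than the simple mean (3.0) because most of the credit hours were earned at the highest grade.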
Grouped Data
n The weighted mean computation can be used to obtain
approximations of the mean, variance, and standard
deviation for the grouped data.
n To compute the weighted mean, we treat the midpoint of
each class as though it were the mean of all items in the
class.
n We compute a weighted mean of the class midpoints using
the class frequencies as weights.
n Similarly, in computing the variance and standard
deviation, the class frequencies are used as weights.
$$\bar{x} = \frac{\sum f_i M_i}{\sum f_i}$$
where:
$f_i$ = frequency of class i
$M_i$ = midpoint of class i
n Population Data
$$\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}$$
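A Python sketch of the grouped-data approximations, using class frequencies as weights on the class midpoints. The classes and frequencies are assumptions for illustration (the Chapter 1 parts-cost data grouped into classes 50-59 through 100-109). The population variance divides by N; the sample version divides by n - 1, consistent with the sample variance formula.

```python
import math

# Assumed class limits and frequencies for the grouped data.
class_limits = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99), (100, 109)]
frequency = [2, 13, 16, 7, 7, 5]

# Midpoint of each class stands in for every item in that class.
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

n = sum(frequency)

# Weighted mean of the midpoints, with class frequencies as weights.
grouped_mean = sum(f * m for f, m in zip(frequency, midpoints)) / n

# Frequency-weighted sum of squared deviations from the mean.
weighted_ss = sum(f * (m - grouped_mean) ** 2 for f, m in zip(frequency, midpoints))

population_variance = weighted_ss / n    # divide by N
sample_variance = weighted_ss / (n - 1)  # divide by n - 1

print(f"mean = {grouped_mean:.1f}")
print(f"population variance = {population_variance:.2f}, "
      f"standard deviation = {math.sqrt(population_variance):.2f}")
print(f"sample variance = {sample_variance:.2f}, "
      f"standard deviation = {math.sqrt(sample_variance):.2f}")
```

These are approximations: they would match the values computed from the raw data only if every item sat exactly at its class midpoint.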
End of Chapter 3