
ADBM STATISTICAL TECHNIQUES
Kuda Sibanda
Introductory Lecture
Overview of This Lecture
■ What is Statistics?
■ Types of statistics
■ Data Presentation
■ Measures of central tendency
■ Measures of dispersion
Statistics
■ Statistics is a way of gathering, organizing, and interpreting information about a group
to help you make sense of it.
■ Statistics is all around us
– Income
– Profit
– Turnover
– Levels of production
– Forecasting
– Etc

■ Aim: understand statistical reasoning and be able to interpret results accurately


What is Statistics?

■ Statistics involves gathering data, and then classifying, summarising,
interpreting and drawing conclusions from the data
■ It refers to a range of techniques and procedures for analysing, interpreting,
displaying and presenting data in order to support decisions based on that data
Types of Statistics

■ Descriptive Statistics
– summarising and describing data using
frequencies, percentages, measures of central
tendency, dispersion/spread and by looking at
shapes of distributions
■ Inferential Statistics
– makes inferences or generalisations about the
population from which the sample was drawn
Data Presentation
■ The information obtained from a statistical analysis is meaningful to business
managers only when it can be interpreted and communicated effectively and
concisely.
■ It is customary to convey such information through the use of summary tables
and charts.
■ Tables and charts convey information more vividly and quickly than written
reports.
■ One such compact and efficient way of presenting data is in the form of a table.
■ Using a table to list data according to category is often much clearer than
writing out all the information in paragraph form
The Beauty of Using Tables
■ Example: During the 1995-1996 academic year, a survey ranked university
research libraries in the United States and Canada by their holdings.
Syracuse University, in New York, had 2,692,147 holdings and ranked
eighty-first. Harvard University ranked first with 13,369,855 holdings. The
University of Connecticut ranked fiftieth and reported 2,626,066 holdings.
The Massachusetts Institute of Technology reported 2,448,647 holdings and
ranked seventy-third. (Source: Association of Research Libraries)
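The same figures are easier to compare once they are laid out as a table. The sketch below is one possible layout, not the table from the original slide: the column order, the sort by rank and the helper name `format_table` are assumptions, while the numbers are copied from the paragraph above.

```python
# Figures copied from the paragraph above; the column layout is an assumption.
libraries = [
    ("Syracuse University", 2_692_147, 81),
    ("Harvard University", 13_369_855, 1),
    ("University of Connecticut", 2_626_066, 50),
    ("Massachusetts Institute of Technology", 2_448_647, 73),
]

def format_table(rows):
    """Return the rows as aligned text lines, sorted by rank."""
    header = f"{'Rank':>4}  {'University':<40}{'Holdings':>12}"
    lines = [header, "-" * len(header)]
    for name, holdings, rank in sorted(rows, key=lambda row: row[2]):
        lines.append(f"{rank:>4}  {name:<40}{holdings:>12,}")
    return lines

table_lines = format_table(libraries)
print("\n".join(table_lines))
```

Sorting by rank puts Harvard first and makes each library's position and holdings readable at a glance, which is the point the slide makes about tables.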
Frequency Table
A frequency table shows the number of occurrences of a variable falling
into a specific range or category.

■ Example: Consider a group of 47 males of various ages. 12 are between 20 and 29
years of age, 13 are between 30 and 39 years of age, 7 are between 40 and 49
years of age, 8 are between 50 and 59 years of age while the rest are between 60
and 69 years of age.

Immediately we can draw the following conclusions:

1. The majority of the males (25 of 47) are “young”, i.e. below 40 years of age.
2. The largest single group (13 men) lies between 30 and 39 years of age.
3. The sum of the number of men in each interval (i.e. 12+13+7+8+7) must equal the
total number in the group, which is 47 in this case.
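The frequency table and the three conclusions can be checked with a short sketch. The counts come from the example; the percentage column and the variable names are additions for illustration.

```python
# Age intervals and counts from the example; the last count is "the rest".
total = 47
known = {"20-29": 12, "30-39": 13, "40-49": 7, "50-59": 8}
frequencies = dict(known)
frequencies["60-69"] = total - sum(known.values())

print(f"{'Age':<8}{'Frequency':>10}{'Percent':>10}")
for interval, count in frequencies.items():
    print(f"{interval:<8}{count:>10}{100 * count / total:>9.1f}%")

# Conclusion 1: how many men are below 40.
below_40 = frequencies["20-29"] + frequencies["30-39"]
# Conclusion 2: the interval with the highest frequency.
largest = max(frequencies, key=frequencies.get)
# Conclusion 3: the frequencies add up to the group size.
print(below_40, largest, sum(frequencies.values()))
```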
Cumulative Frequency Table
Example
Consider a class consisting of 100 students. Suppose the teacher gives the entire class a statistics
test which has a maximum mark of 100. Upon marking the scripts (which are in no order
whatsoever), he puts the marks into a table as shown below. This represents raw data since there
is no set order of the marks.

On examining the data, we see the following:

Maximum value = 90
Minimum value = 8
Range (max – min) = 82


Cumulative Frequency Table
Since the highest possible mark in this
case is 100, and the lowest mark is 0, an
interval size (width) of 10 is easy to
work with. Hence, we may choose to use
the following intervals:
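As a rough sketch of how such a table can be built, the code below counts marks into intervals of width 10 and then accumulates the counts. The marks are hypothetical and only for illustration, and the choice to fold a mark of 100 into a final 90-100 interval is an assumption rather than the lecture's own scheme.

```python
from itertools import accumulate

# Hypothetical marks for illustration only; not the class's 100 raw marks.
marks = [8, 15, 23, 37, 41, 45, 52, 58, 64, 64, 71, 79, 83, 90]

# Assumed intervals of width 10: 0-9, 10-19, ..., 80-89, then 90-100.
labels = [f"{low}-{low + 9}" for low in range(0, 90, 10)] + ["90-100"]

def interval_index(mark):
    """Map a mark in 0..100 to the index of its interval."""
    return min(mark // 10, 9)

frequency = [0] * len(labels)
for mark in marks:
    frequency[interval_index(mark)] += 1

# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequency))

print(f"{'Marks':<8}{'f':>4}{'cf':>6}")
for label, f, cf in zip(labels, frequency, cumulative):
    print(f"{label:<8}{f:>4}{cf:>6}")
```

The last cumulative frequency always equals the number of marks, which is a quick check that no mark was missed.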
Graphs and Charts
■ A table is a useful way to present detailed data, but a
picture or diagram is more powerful if we want to focus
attention on a particular aspect, such as a dominant
feature or a trend.
■ There are many different types of graphs we can use to
portray data.
■ The important graphs we will discuss are the
– Line graph,
– Bar graph and,
– Pie graph.
Line Graphs
One method of showing trends or comparative trends is to use a line
graph. A good example is one that is used to show the growth in terms of
profits of a particular organization.

A study of the table shows that the profits have dropped considerably.
This trend can be better depicted using a line graph.
Bar charts
A bar chart consists of a series of bars, the length of each bar
representing the value of the variable being plotted. The bars can be either
drawn vertically or horizontally.

Note the features of a bar chart:
1. The width of each bar must be the same (we use length to represent
magnitude of the value, not width).
2. An appropriate scale must be used on each axis such that the
lengths of the bars are reasonable.
3. The distance between each bar must be kept constant to give the bar
chart uniformity.
4. The bars may or may not be coloured (this is purely up to the
person drawing the bar chart).
Pie Charts
A pie chart is usually used when proportions are to be depicted relative to a whole.
In essence, it is a circle divided into segments, with the size of each segment
proportional to the value of the variable, relative to the whole, and is usually
expressed in percentage terms.

Example: Consider a father who gives spending money to each of his three sons.
Josh, the eldest gets R 120, Matt gets R 80 and David, the youngest gets R 50.
This data can be expressed in a pie chart as follows:

Firstly, we calculate the percentage (in terms of the total amount the father
gave out) that each son receives. The total in this case is
R 120 + R 80 + R 50 = R 250.

The percentages are:
Josh: (120/250) x 100 = 48 %
Matt: (80/250) x 100 = 32 %
David: (50/250) x 100 = 20 %
Note: The sum of the three percentages must be 100 % (the full pie!)

[Pie chart: Spending Money, with segments for Josh, Matt and David]
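The percentage calculation can be sketched in a few lines. The amounts are from the example; the segment angles are an extra step not shown in the lecture, assuming each segment takes the same share of the full 360 degrees.

```python
amounts = {"Josh": 120, "Matt": 80, "David": 50}  # rand, from the example
total = sum(amounts.values())

# Share of the total, expressed as a percentage.
percentages = {name: 100 * amount / total for name, amount in amounts.items()}
# Segment angle in degrees: the same share applied to the full circle.
angles = {name: 360 * amount / total for name, amount in amounts.items()}

for name in amounts:
    print(f"{name:<6}{percentages[name]:>6.1f} %{angles[name]:>8.1f} degrees")
```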
Histogram
A histogram is a graphic display of a frequency distribution, using a bar-like graph. Earlier, we constructed
a frequency table for the ages of 47 males. We can illustrate this information on a histogram as follows:
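A rough text rendering gives the idea: one bar per age class, with the bar length equal to the class frequency. This is only a sketch (the bars run sideways and the helper name `text_histogram` is an assumption); a plotted histogram would normally draw the bars upright and touching.

```python
# Age frequencies from the earlier example of 47 males.
frequencies = {"20-29": 12, "30-39": 13, "40-49": 7, "50-59": 8, "60-69": 7}

def text_histogram(freqs, symbol="#"):
    """Return one line per class, with a bar whose length equals the frequency."""
    return [f"{label:<6}| {symbol * count} {count}" for label, count in freqs.items()]

bars = text_histogram(frequencies)
print("\n".join(bars))
```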
Measures of Central Tendency

■ A measure of central tendency is a single value that attempts to describe a set of
data by identifying the central position within that set of data
■ Measures of central tendency help you find the middle, or the average, of a data
set.
■ There are three commonly used numerical measures of central tendency or
central location of a dataset: the mean, the median and the mode.
Two types of Data

■ Ungrouped or Raw data
– data you first gather from an experiment or study
– The data is raw, that is, it’s not sorted
– Ungrouped data is data given as individual data points or a list
■ Grouped Data
– Grouped data is data given in intervals
– data that has been bundled together in categories
– Histograms and frequency tables can be used to show this type of data
Mean, Mode and Median for Ungrouped
Data
■ Mean for Ungrouped or Raw data
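A minimal sketch, assuming the usual definition of the arithmetic mean (the sum of the data values divided by the number of values, n). The marks are hypothetical and only for illustration.

```python
from statistics import mean

# Hypothetical marks (percent), for illustration only.
marks = [55, 64, 72, 64, 80]

manual_mean = sum(marks) / len(marks)   # sum of the values divided by n
library_mean = mean(marks)              # same calculation via the standard library
print(manual_mean, library_mean)
```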
■ Median for Ungrouped or Raw data
– The median is the middle number of an ordered set of data. It divides an
ordered set of data values into two equal halves (i.e. 50% of the data values
lie below the median and 50% lie above it).
– Follow these steps to calculate the median for ungrouped (raw) numeric
data:
■ Arrange the n data values in ascending order
■ Find the median by first identifying the middle position in the data set as
follows:
– If n is odd, the median value lies in position (n + 1)/2 of the ordered data set
– If n is even, the median value is found by identifying position n/2 and then
averaging the data value in this position with the next consecutive data
value
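The steps can be sketched as follows, assuming the usual position rule: position (n + 1)/2 when n is odd, and the average of positions n/2 and n/2 + 1 when n is even. The marks are hypothetical and the function name is an illustrative choice.

```python
from statistics import median

def median_by_position(values):
    """Median via the position rule described in the steps above."""
    ordered = sorted(values)               # step 1: ascending order
    n = len(ordered)
    if n % 2 == 1:
        # Position (n + 1)/2 is 1-based; list indices are 0-based.
        return ordered[(n + 1) // 2 - 1]
    lower = ordered[n // 2 - 1]            # value in position n/2
    upper = ordered[n // 2]                # next consecutive value
    return (lower + upper) / 2

odd_marks = [72, 55, 64, 80, 64]           # hypothetical marks, n = 5
even_marks = [72, 55, 64, 80, 64, 90]      # hypothetical marks, n = 6
print(median_by_position(odd_marks), median_by_position(even_marks))
```

For these lists the results agree with `statistics.median` from the standard library, which applies the same rule.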

■ Mode of Ungrouped data
– the most frequently occurring score, or the number that occurs the most
frequently.
– For example, if a mark of 64% occurs twice in a set of marks and every other
mark occurs once, the mode is 64%
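A short sketch of finding the mode by counting. The marks are hypothetical, chosen so that 64 occurs twice as in the example.

```python
from collections import Counter
from statistics import multimode

# Hypothetical marks (percent) in which 64 occurs twice.
marks = [55, 64, 72, 64, 80]

counts = Counter(marks)
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)   # most frequent mark and how often it occurs

# multimode returns every value tied for the highest frequency.
all_modes = multimode(marks)
print(all_modes)
```

`multimode` is useful when two or more values share the highest frequency, a case in which a single mode is not well defined.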
Mean, Mode and Median for grouped
Data
■ Mean for grouped data
– So far, we have dealt with deriving the middle measure from a group of raw
data, or actual unmanipulated scores.
– However, there will be instances where you will need to analyse grouped
frequency data
– Take for example the following set of scores:
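As a sketch of the calculation, the age frequency table from the earlier example (47 males) can stand in for a set of grouped scores. The method assumed here is the common midpoint approximation: multiply each class midpoint by its frequency, add the products and divide by the total frequency. Treating the class 20-29 as running from 20 up to (but not including) 30, so that its midpoint is 25, is also an assumption.

```python
# Grouped ages of the 47 males: (lower limit, upper limit, frequency).
# Assumption: each class runs from its lower limit up to the next lower limit.
classes = [(20, 30, 12), (30, 40, 13), (40, 50, 7), (50, 60, 8), (60, 70, 7)]

def grouped_mean(groups):
    """Approximate mean: sum of (frequency * midpoint) divided by total frequency."""
    n = sum(f for _, _, f in groups)
    weighted_sum = sum(f * (lower + upper) / 2 for lower, upper, f in groups)
    return weighted_sum / n

mean_age = grouped_mean(classes)
print(round(mean_age, 2))
```

The result is an approximation because every value in a class is treated as if it sat at the class midpoint.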
■ Median for grouped data
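A minimal sketch, assuming the commonly taught interpolation formula median = L + ((n/2 - F) / f) x c, where L is the lower limit of the median class, F the cumulative frequency before it, f its frequency and c the class width. The lecture's own formula may differ in notation. The data is the age frequency table of 47 males from the earlier example, with classes treated as 20 up to 30, 30 up to 40, and so on.

```python
# Grouped ages of the 47 males: (lower limit, upper limit, frequency).
classes = [(20, 30, 12), (30, 40, 13), (40, 50, 7), (50, 60, 8), (60, 70, 7)]

def grouped_median(groups):
    """Interpolated median: L + ((n/2 - F) / f) * c for the median class."""
    n = sum(f for _, _, f in groups)
    cumulative_before = 0                      # F: frequency below the current class
    for lower, upper, f in groups:
        if cumulative_before + f >= n / 2:     # first class reaching the n/2 position
            width = upper - lower              # c: class width
            return lower + (n / 2 - cumulative_before) / f * width
        cumulative_before += f
    raise ValueError("no classes with data")

median_age = grouped_median(classes)
print(round(median_age, 2))
```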
■ Mode for grouped data
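A minimal sketch, assuming the commonly taught interpolation formula mode = L + ((f1 - f0) / (2 x f1 - f0 - f2)) x c, where L is the lower limit of the modal class, f1 its frequency, f0 and f2 the frequencies of the classes before and after it, and c the class width. The lecture's own formula may be written differently. The data is again the age frequency table of 47 males.

```python
# Grouped ages of the 47 males: (lower limit, upper limit, frequency).
classes = [(20, 30, 12), (30, 40, 13), (40, 50, 7), (50, 60, 8), (60, 70, 7)]

def grouped_mode(groups):
    """Interpolated mode: L + (f1 - f0) / (2*f1 - f0 - f2) * c for the modal class.

    Assumes the modal class is strictly larger than at least one neighbour,
    so the denominator is not zero.
    """
    freqs = [f for _, _, f in groups]
    i = freqs.index(max(freqs))                      # modal class (first one if tied)
    lower, upper, f1 = groups[i]
    f0 = freqs[i - 1] if i > 0 else 0                # frequency of the class before
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # frequency of the class after
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)

mode_age = grouped_mode(classes)
print(round(mode_age, 2))
```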
Measures of Dispersion

■ Dispersion (or spread) refers to the extent to which the data values of a numeric
random variable are scattered about their central location value
■ It numerically describes how data is scattered or spread.
■ The measures that are used to measure dispersion are:
– Range
– Variance and Standard deviation
– Interquartile range
– Quartile deviation
Range

■ The range measures the difference between the highest and lowest values in a dataset.
■ It is affected by outliers or extreme values and gives no indication of the clustering of
the data
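A short sketch of the range and its sensitivity to extreme values. The marks are hypothetical, chosen so that the maximum (90) and minimum (8) match the earlier class example.

```python
def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)

# Hypothetical marks whose extremes match the earlier example (max 90, min 8).
marks = [8, 37, 45, 52, 64, 64, 71, 90]
print(value_range(marks))            # 90 - 8 = 82

# A single extreme value changes the range even though the other marks are unchanged.
print(value_range(marks + [100]))
```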
Variance and Standard Deviation

■ Variance for ungrouped data
■ Variance measures the average squared distance of the data values from the mean
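A minimal sketch of the calculation on hypothetical marks. Two conventions are common and it is an assumption which one the lecture uses: the population variance divides the sum of squared deviations by n, while the sample variance divides by n - 1.

```python
from statistics import pvariance, variance

marks = [55, 64, 72, 64, 80]   # hypothetical marks, for illustration only
n = len(marks)
mean_mark = sum(marks) / n
squared_deviations = [(x - mean_mark) ** 2 for x in marks]

population_variance = sum(squared_deviations) / n       # divide by n
sample_variance = sum(squared_deviations) / (n - 1)     # divide by n - 1
print(population_variance, sample_variance)

# The standard library offers the same two conventions.
print(pvariance(marks), variance(marks))
```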

■ Variance for grouped data
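A sketch of one common approach, assuming each class is represented by its midpoint and weighted by its frequency. The data is the age frequency table of 47 males from the earlier example; the `sample` switch between dividing by n - 1 and by n is an assumption, since the convention used in the lecture is not stated.

```python
# Grouped ages of the 47 males: (lower limit, upper limit, frequency).
classes = [(20, 30, 12), (30, 40, 13), (40, 50, 7), (50, 60, 8), (60, 70, 7)]

def grouped_variance(groups, sample=True):
    """Approximate variance from class midpoints, weighted by frequency."""
    n = sum(f for _, _, f in groups)
    midpoints = [((lower + upper) / 2, f) for lower, upper, f in groups]
    mean = sum(f * x for x, f in midpoints) / n
    squared = sum(f * (x - mean) ** 2 for x, f in midpoints)
    return squared / (n - 1 if sample else n)

sample_var = grouped_variance(classes)
population_var = grouped_variance(classes, sample=False)
print(round(sample_var, 2), round(population_var, 2))
```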



■ Standard deviation for ungrouped data
■ The standard deviation is the square root of the variance.
■ It measures the typical deviation of the data values from the mean, in the same
units as the data.
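A minimal sketch on hypothetical marks: compute the variance, then take its square root. Both the population form (divide by n) and the sample form (divide by n - 1) are shown, since it is an assumption which one the lecture intends.

```python
import math
from statistics import pstdev, stdev

marks = [55, 64, 72, 64, 80]   # hypothetical marks, for illustration only
n = len(marks)
mean_mark = sum(marks) / n
sum_of_squares = sum((x - mean_mark) ** 2 for x in marks)

population_sd = math.sqrt(sum_of_squares / n)        # square root of population variance
sample_sd = math.sqrt(sum_of_squares / (n - 1))      # square root of sample variance
print(round(population_sd, 2), round(sample_sd, 2))
```

The results match `statistics.pstdev` and `statistics.stdev` from the standard library.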

■ Standard deviation for grouped data
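A sketch under the same assumptions as the midpoint approach for grouped variance: each class is represented by its midpoint, and the standard deviation is the square root of the frequency-weighted variance. The data is the age frequency table of 47 males; the function name and the `sample` switch are illustrative choices.

```python
import math

# Grouped ages of the 47 males: (lower limit, upper limit, frequency).
classes = [(20, 30, 12), (30, 40, 13), (40, 50, 7), (50, 60, 8), (60, 70, 7)]

def grouped_std(groups, sample=True):
    """Square root of the midpoint-based variance for grouped data."""
    n = sum(f for _, _, f in groups)
    mean = sum(f * (lower + upper) / 2 for lower, upper, f in groups) / n
    squared = sum(f * ((lower + upper) / 2 - mean) ** 2 for lower, upper, f in groups)
    return math.sqrt(squared / (n - 1 if sample else n))

sample_sd_age = grouped_std(classes)
population_sd_age = grouped_std(classes, sample=False)
print(round(sample_sd_age, 2), round(population_sd_age, 2))
```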


Main Assignment
