02 DescriptiveStatistics

ME614 PROBABILITY & STATISTICAL CONCEPTS IN ENGINEERING DESIGN

DESCRIPTIVE STATISTICS
STATISTICS – a branch of knowledge which deals with the collection, organization, analysis and interpretation
of numerical data.

Uses and Importance of Statistics


- provides general information and techniques for solving problems in politics, government, industry,
education, medicine, sports, etc.

Two Types of Statistics


1. Descriptive Statistics – techniques which are concerned with summarizing and describing numerical data
by the use of descriptive measures of statistics like mean, median, standard deviation and others.
2. Inferential Statistics – techniques by which decisions about a statistical population are made based on an
observed sample. This is concerned more with generalizing information or making
predictions about a population.

Statistical Terms
1. Population – consists of the totality of the observations with which we are concerned
2. Sample – a subset of a population, ideally representative of it
3. Variable – a characteristic or phenomenon which may take on different values, such as age, weight, number of
households, etc.
4. Data – the statistical facts, historical facts, principles, opinions, and items of various sources like scores,
income, etc.

Types of Data
Grouped Data – data which are organized, summarized and presented in a frequency distribution
Ungrouped Data - data which are not organized or classified and usually exhibit no pattern

Quantitative Data (Numerical) – data which are capable of being measured like heights, ages, weights, etc.
Qualitative Data (Textual) – data which are not capable of being measured but can only be categorized such
as sex (male, female), marital status (single, married, annulled, widowed), etc.

Types of Quantitative Data


1. Discrete Data (Countable) – are those which assume only specific values like number of persons,
school enrolment, registration, or number of voters
2. Continuous Data (Measurable) – are those which assume any values within a defined range of values
like income, length, weight, height, age, etc.

Data According to Measurement Scale Used


1. Nominal Data – categories of qualitative variables like sex, religion, marital status, etc. They may be
assigned numerals, but only as labels to identify the members of a given class. The numerals do not
have any of the properties of numbers used in arithmetic.
2. Ordinal Data – are values of variables which reflect rank order of the individual or objects. Ordinal
measures are arranged from highest to lowest or vice versa.
3. Interval Data – are values for which equal differences between measurements represent equal differences
among the items measured; the zero point is arbitrary
4. Ratio Data – the highest type of measurement scale. They are measured from a true zero, like heights,
lengths, density

Some Statistical Measures Applicable to Type of Data


1. Nominal Data – proportions and percentage
2. Ordinal Data - median, centiles, rank correlation
3. Interval Data – mean, standard deviation, Pearson’s product moment correlation, t-test, F-test
4. Ratio Data – almost all statistical measures are applicable

Other Statistical Terms


1. Statistic – it is the characteristic of a sample which is measurable
2. Parameter – it is the characteristic of a population which is measurable
3. Random Sampling – selection of samples in such a way that each sample has precisely the same
probability of being selected
4. Tabulation – the process of grouping or classifying data for purposes of interpretation
5. Array – arrangement of data from highest to lowest or vice versa
6. Range – is the difference between the highest and the lowest number
7. Frequency Distribution – tabulation of groups of scores and measures with class interval
Example. Frequency Distribution of the Annual Income of Potential Customers

Annual Income        Frequency
50,000 – 59,999      25
60,000 – 69,999      15
70,000 – 79,999      12
80,000 – 89,999       7
90,000 – 99,999       5

8. Class Interval or Class Limit – is the grouping or category defined by a lower limit and an upper limit
9. Class Boundaries or True Limits – more precise expressions of the class limits, obtained by extending each
limit by half of the unit of measurement (e.g. 0.5 when the limits are whole numbers)
10. Class Frequency – refers to the number of observations belonging to a class interval
11. Class Size – width of the class interval
12. Midpoint or Class Mark – is computed by adding the lower and upper limits of a class and dividing the sum by 2
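
The terms above can be illustrated on the annual-income example. The following is a minimal sketch, not part of the original notes: it assumes incomes are recorded to the nearest whole unit (so each limit is extended by 0.5 to get the boundary), and the names `UNIT` and `describe_class` are hypothetical.

```python
# Class limits (lower, upper) copied from the annual-income example.
class_limits = [
    (50_000, 59_999),
    (60_000, 69_999),
    (70_000, 79_999),
    (80_000, 89_999),
    (90_000, 99_999),
]

UNIT = 1  # assumption: incomes are recorded to the nearest whole unit


def describe_class(lower, upper, unit=UNIT):
    """Return class boundaries, class size, and class mark for one class interval."""
    lower_boundary = lower - unit / 2
    upper_boundary = upper + unit / 2
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "size": upper_boundary - lower_boundary,  # width of the class
        "mark": (lower + upper) / 2,              # midpoint of the limits
    }


summary = [describe_class(lo, hi) for lo, hi in class_limits]
for (lo, hi), row in zip(class_limits, summary):
    print(lo, hi, row["boundaries"], row["size"], row["mark"])
```

Under this assumption the first class has boundaries 49,999.5 and 59,999.5, a class size of 10,000, and a class mark of 54,999.5.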

TYPES OF SAMPLING
1. Simple Random Sampling or “Lottery method” – the items are picked out for sample at random. This
is to eliminate any possibility of a bias in the sampling procedure.
2. Stratified Sampling – selects simple random samples from mutually exclusive subpopulations, or
strata, of the population

City / Municipality    Number of Households    Percentage    No. of Questionnaires
Cebu City              147,600
Mandaue City            54,882
Lapu-Lapu City          44,439
Talisay City            28,751
Minglanilla             14,739
Total

The table shows the major cities and municipalities, with their respective numbers of households, in which a
survey is to be conducted. Using stratified sampling, how many questionnaires should be distributed to each
area if the sample size is 120?

3. Systematic Sampling - the items are chosen from the population at uniform intervals of time, space or
order of occurrence.
Example. Every 5th item in a sequence of electronic components for quality control
Every 10th person listed in the telephone directory is selected
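
The stratified and systematic schemes above can be sketched in a few lines. This is an illustration under stated assumptions rather than the prescribed solution: it assumes proportional allocation (each area receives questionnaires in proportion to its share of households) with rounding to the nearest whole number, and the function names `proportional_allocation` and `systematic_sample` and the numbered component list are hypothetical.

```python
# Household counts copied from the stratified sampling table.
households = {
    "Cebu City": 147_600,
    "Mandaue City": 54_882,
    "Lapu-Lapu City": 44_439,
    "Talisay City": 28_751,
    "Minglanilla": 14_739,
}
SAMPLE_SIZE = 120


def proportional_allocation(strata, sample_size):
    """Allocate the sample to each stratum in proportion to its size (assumed rule)."""
    total = sum(strata.values())
    return {name: round(sample_size * count / total) for name, count in strata.items()}


def systematic_sample(items, step, start=0):
    """Pick every `step`-th item, beginning at index `start`."""
    return list(items[start::step])


total_households = sum(households.values())
allocation = proportional_allocation(households, SAMPLE_SIZE)
for city, n in allocation.items():
    share = 100 * households[city] / total_households
    print(f"{city:15s} {share:6.2f}%  {n:3d}")

# Every 5th component out of 20 numbered components (hypothetical sequence).
components = list(range(1, 21))
print(systematic_sample(components, step=5, start=4))
```

With these figures the rounded allocations are 61, 23, 18, 12 and 6, which add up to 120. Rounding does not guarantee this in general, so the total should always be checked and adjusted if necessary.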

PRESENTATION OF DATA
1. Textual - data are presented with the use of phrases or text
2. Tabular – data are presented in the form of a table
The frequency distribution below shows the study of battery lives

Class Interval    Frequency
1.5 – 1.9          3
2.0 – 2.4          6
2.5 – 2.9         15
3.0 – 3.4         11
3.5 – 3.9          5
3. Graphical
a. Bar Chart – the base of each bar corresponds to a class interval of the frequency distribution
and the heights of the bar chart represent the frequency associated with each class
b. Histogram – bases of each bar are the class boundaries and the heights represent the
frequency associated with each class
c. Frequency Polygon – constructed by plotting class frequency against class marks, then
connecting the consecutive points by straight lines; an additional class interval with zero
frequency is added at each end of the distribution
d. Pie Chart or Circle Graph
Example. Student’s Monthly Expenditures

Expenditures        Amount      Percentage
Food                P 1,800
Boarding House          800
School Supplies         500
Entertainment           650
Miscellaneous           700
Savings                 550
Total               P 5,000
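
To draw the pie chart, each amount is converted to a percentage of the total and to a sector angle out of 360 degrees. The sketch below shows that arithmetic for the expenditure table; the variable names are illustrative only.

```python
# Monthly expenditures (in pesos) copied from the table.
expenditures = {
    "Food": 1800,
    "Boarding House": 800,
    "School Supplies": 500,
    "Entertainment": 650,
    "Miscellaneous": 700,
    "Savings": 550,
}

total = sum(expenditures.values())

# Percentage of the total and central angle of each sector of the pie chart.
percentages = {item: 100 * amount / total for item, amount in expenditures.items()}
angles = {item: 360 * amount / total for item, amount in expenditures.items()}

for item in expenditures:
    print(f"{item:16s} {percentages[item]:5.1f}%  {angles[item]:6.1f} degrees")
```

For example, Food takes 1,800 / 5,000 = 36% of the budget, which corresponds to a sector of 129.6 degrees.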

Graphical Data Presentation


For Discrete Type of Data      For Continuous Type of Data
- Line Graph                   - Frequency Polygon
- Bar Graph                    - Histogram
- Pie Chart                    - Less-than ogive or greater-than ogive
- Pictograph                   - Relative Frequency Polygon

FREQUENCY DISTRIBUTION
- tabulation of scores or measures with class interval
Steps in the Construction of Frequency Distribution
1. Determine the desired number of class intervals, k
Sturges’ Rule: k = 1 + 3.322 log n; n = sample size, log = common (base-10) logarithm
2. Determine the highest and lowest values of the given data set, and find the range R
3. Determine the class size or the width, w, of the class intervals w = R / k
4. Determine the lower limit and upper limit of first class interval
5. Tally the scores / observations falling in each class
6. Determine the frequency of values falling within each class interval

Example. The following data represent the scores in the entrance examination of college freshmen.
Construct a frequency distribution starting with the lowest score.

19 44 24 43 33 29 26 25 29 23
31 33 38 18 33 33 34 33 27 32
36 37 40 24 40 37 57 48 39 48
26 39 42 32 24 30 30 39 35 28
34 45 39 49 26 43 40 34 41 45
32 21 32 33 22 43 33 29 29 19
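
The construction steps can be sketched on this data set as follows. This is one possible working, not the official solution: it assumes that both the number of classes from Sturges' rule and the class size are rounded up to the next whole number, and that the first class starts at the lowest score as the example requests.

```python
import math

# Entrance examination scores copied from the example.
scores = [
    19, 44, 24, 43, 33, 29, 26, 25, 29, 23,
    31, 33, 38, 18, 33, 33, 34, 33, 27, 32,
    36, 37, 40, 24, 40, 37, 57, 48, 39, 48,
    26, 39, 42, 32, 24, 30, 30, 39, 35, 28,
    34, 45, 39, 49, 26, 43, 40, 34, 41, 45,
    32, 21, 32, 33, 22, 43, 33, 29, 29, 19,
]

n = len(scores)
k = math.ceil(1 + 3.322 * math.log10(n))  # Sturges' rule, rounded up (assumption)
data_range = max(scores) - min(scores)    # R = highest - lowest
w = math.ceil(data_range / k)             # class size, rounded up (assumption)

# Build the class intervals from the lowest score and count the scores in each.
classes = []
lower = min(scores)
while lower <= max(scores):
    upper = lower + w - 1
    frequency = sum(lower <= s <= upper for s in scores)
    classes.append((lower, upper, frequency))
    lower += w

for lower, upper, frequency in classes:
    print(f"{lower} - {upper}: {frequency}")
```

Under these rounding choices n = 60 gives k = 7 and w = 6, and the classes run from 18 - 23 up to 54 - 59. A different rounding convention would give a slightly different but equally valid table.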
MEASURES OF LOCATION FOR FREQUENCY DISTRIBUTION

Mean:
$$\mu = \frac{\sum f X_m}{N}$$
where: f = frequency of each class
N = total number of observations (sum of the class frequencies)
X_m = class mark of every class
$$X_m = \frac{LCL + UCL}{2}$$
LCL = lower class limit & UCL = upper class limit

Median:
$$Md = L_o + \left( \frac{\frac{N}{2} - F}{f} \right) c$$
where: L_o = lower class boundary of median class
Median class = class containing the middle term
F = cumulative frequency from the lowest class to the class before the median class
f = frequency of median class
c = class size

Mode:
$$Mo = L_o + \left( \frac{d_1}{d_1 + d_2} \right) c$$
where: L_o = lower class boundary of modal class
Modal class = class having the highest frequency
d_1 = difference between the frequency of the modal class and the frequency of the class before the modal class
d_2 = difference between the frequency of the modal class and the frequency of the class after the modal class
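
As an illustration, the three formulas can be applied to the battery-life frequency distribution given under Presentation of Data. The sketch below assumes the class limits are recorded to one decimal place (so each boundary lies 0.05 beyond the limit) and takes N as the sum of the class frequencies; the variable names are illustrative.

```python
# (lower class limit, upper class limit, frequency) from the battery-life table.
classes = [
    (1.5, 1.9, 3),
    (2.0, 2.4, 6),
    (2.5, 2.9, 15),
    (3.0, 3.4, 11),
    (3.5, 3.9, 5),
]
UNIT = 0.1  # assumption: limits are recorded to one decimal place

N = sum(f for _, _, f in classes)
c = classes[1][0] - classes[0][0]  # class size: 2.0 - 1.5 = 0.5

# Mean: sum of frequency times class mark, divided by N.
mean = sum(f * (lcl + ucl) / 2 for lcl, ucl, f in classes) / N

# Median: locate the class containing the N/2-th term, then interpolate.
cumulative = 0  # F, cumulative frequency before the median class
for lcl, ucl, f in classes:
    if cumulative + f >= N / 2:
        median = (lcl - UNIT / 2) + (N / 2 - cumulative) / f * c
        break
    cumulative += f

# Mode: interpolate inside the class with the highest frequency.
freqs = [f for _, _, f in classes]
i = freqs.index(max(freqs))
before = freqs[i - 1] if i > 0 else 0
after = freqs[i + 1] if i < len(freqs) - 1 else 0
d1, d2 = freqs[i] - before, freqs[i] - after
mode = (classes[i][0] - UNIT / 2) + d1 / (d1 + d2) * c

print(f"mean = {mean:.4f}, median = {median:.4f}, mode = {mode:.4f}")
```

Here N = 40, the median and modal class are both 2.5 – 2.9 with lower boundary 2.45, and the results are a mean of 2.8125, a median of about 2.817 and a mode of about 2.796.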
