Dissertation Design & Data Collection


Dissertation & Data Collection

Dr. Ghassan M
Ghassan Mohamed
Ghassan was born on November 23, 1986, in Giza, Egypt. He graduated from the Faculty of Science,
then earned a master's degree in statistical quality control and quality assurance from Cairo University
with a grade of excellent. He later earned a doctorate in statistical quality control and quality
assurance from Cairo University, also with a grade of excellent.
His educational journey did not end there. This year, he earned diplomas in governance, TOT, and
organizational excellence. Ghassan realized his biggest dream when he became a lecturer at Cairo
University, continuing his learning journey. He is also an assessor in government agencies for the Egypt
Award and Egypt's Vision 2030.

He also achieved another of his dreams by becoming a member of the CHG family, teaching quality, statistics,
and the 7 Habits alongside his daily work as a quality supervisor.

Today, Ghassan encourages promising people to study the different sciences. His advice: follow your
dreams, no matter how great.
CONTENTS

01 Dissertation: Dissertation & dissertation design

02 Data: Data types and how to deal with them

03 Data Collection Methods

04 Exam
00 Learning Objectives
What is a dissertation?
A dissertation is...

● A long piece of academic writing


● Based on original research
● Usually submitted at the end of a degree
● Tests your capacity for independent research
● Sometimes called a thesis

1 Dr. Ghassan M
Ghassan_mohamed@yahoo.com
Check your guidelines!

● The sections included vary


● They may change based on your field and the nature of your specific research
● Check any guidelines you are given
● Ask your supervisor if you’re unsure

Structuring a dissertation or thesis

Title page

✓ Dissertation title
✓ Your name
✓ Type of document
✓ Department and institution
✓ Degree program
✓ Date of submission

Acknowledgements

● Less formal, more personal


● No longer than a page
● Thank people who helped you complete your dissertation
● E.g. supervisors, friends and family, pets!

Abstract

✓ State the main topic and aims of your research


✓ Describe the methods you used
✓ Summarize the main results
✓ State your conclusions
X Not an introduction but a summary

Abstract

In other words, the abstract should be able to stand alone.

For it to stand alone, your abstract should cover the following key points (at a minimum):
1- Your research questions and aims: what key question(s) did your research aim to answer?

2- Your methodology: how did you go about investigating the topic and finding answers to your research
question(s)?

3- Your findings: following your own research, what did you discover?

4- Your conclusions: based on your findings, what conclusions did you draw? What answers did you find to
your research question(s)?
Table of contents

● Lists all sections that come after it


● Can be auto-generated in Word
● Clear, consistent headings

Lists of figures and tables

● Include if your dissertation has a lot of tables and figures


● Tables and figures listed and numbered separately
● Make sure all tables and figures are included
● List in the order they appear in the text

List of abbreviations

● Include if you use a lot of abbreviations


● Define abbreviations here and in the main text
● List alphabetically
● Very well-known abbreviations not included

Glossary

● Include if you use a lot of specialist terms


● List terms alphabetically
● Include a brief definition of each term
● Consult with your supervisor to determine what terms you should
define

Introduction

✓ Establish your research topic


✓ Provide background information
✓ Define the scope of your research
✓ Show your work’s relevance
✓ State your research questions and objectives
✓ Give an overview of your structure

Literature review

✓ Search for sources


✓ Select the most relevant
✓ Critically evaluate sources
✓ Make connections between them
✓ Draw conclusions based on your review
✓ Show how your own research builds on what you found

Theoretical framework

● Often builds upon/includes the literature review


● Define and analyze key theories, concepts & models
● Show how they inform your own approach

Methodology

✓ Overall approach (e.g. qualitative, quantitative, experimental, ethnographic)
✓ Data collection methods (e.g. surveys, interviews, archives)
✓ Details: where, when, who?
✓ Tools and materials (e.g. programs, lab equipment)
✓ Data analysis methods (e.g. statistical analysis)
✓ Obstacles faced during research
Results

✓ Report your results concisely and objectively


✓ Include results relevant to your research questions
✓ May include data visualizations (e.g. graphs, tables)
X Don’t give subjective interpretations

Discussion

● Interpret your results


● Did they meet your expectations?
● Did they fit the established framework?
● What factors might have influenced any unexpected results?
● Consider alternative interpretations
● Acknowledge limitations

Conclusion

✓ Answer your main research question


✓ Make suggestions for future research
✓ Show what you have contributed
✓ Emphasize the importance of your research
X Don’t introduce any new data, interpretations, or arguments

Reference list or bibliography

● Lists all sources cited in your dissertation


● Includes full and accurate details of each source
● Format varies depending on style guide (e.g. APA, MLA)
● Citation generators can help
Appendices

● Present additional data or documents not included in your main text


● E.g. interview transcripts, survey questions, tables of data
● If you have multiple appendices, they are numbered (Appendix 1,
Appendix 2…)

APA Style

An introduction to formatting and citation


What is APA Style?

● APA = American Psychological Association


● Based on the APA Publication Manual
● The 7th edition is the latest, published in 2019
● Used widely, especially in the social sciences
● Provides guidelines for language, formatting, and citations
General formatting

● Times New Roman 12 pt, Calibri 11 pt, Arial 11 pt, etc.


● Double line spacing
● One-inch (2.54 cm) margins
● Page number in the top right
● Running head in the top left (if submitting for publication)
Levels of heading
Title page
Tables and figures
In-text citations
Basic in-text citation format

Parenthetical citation
It is important to avoid plagiarism in academic writing (Smith, 2020, p. 15).

Narrative citation
Smith (2020, p. 15) states that it is important to avoid plagiarism in academic writing.
Multiple authors

              Parenthetical citation     Narrative citation
1 author      (Smith, 2020)              Smith (2020)
2 authors     (Smith & Jones, 2020)      Smith and Jones (2020)
3+ authors    (Smith et al., 2020)       Smith et al. (2020)


Missing information

No author          (Scribbr, 2020)
                   (“Statistical analysis,” 2020)
No date            (Smith, n.d.)
No page numbers    (Smith, 2020, para. 10)


Combining citations

X (Smith, 2020) (Jones, 2015) (McCombes et al., 2017)

✓ (Jones, 2015; McCombes et al., 2017; Smith, 2020)


The reference list
Reference or no reference?
● Page on a website
● Interview you conducted
● Article from an academic journal
● Email from an expert
● Book used as background reading
● Chapter from a book that you cited
● PowerPoint slides from a lecture
● Facebook status

✓ Reference required: page on a website, book, article from an academic journal
X No reference required: interview you conducted, email, background reading
? It depends: Facebook status, lecture slides
Format of the reference page
Components of a reference entry

1. Author
2. Date
3. Title
4. Source
The author component

One author            Anderson, B.
Multiple authors      Andreff, W., Staudohar, P. D., & LaBrode, M.
Corporate author      Scribbr.
With username         Obama, B. [@BarackObama].
With specific role    Scott, R. (Director).


The date component

Year only (books and journals)                              (2020).
Full date (web pages, newspapers, online videos)            (2020, November 20).
No publication date available                               (n.d.).
Retrieval date for online sources that are
continually updated                                         Retrieved December 3, 2020, from …
The title component

Italics for standalone sources (books, films):
    Statistical methods for psychology.
Plain text for sources within sources (articles, chapters, web pages):
    The evolving European model of sports finance.
Square brackets to describe untitled sources:
    [Photograph of a wren].
The source component
Book publisher       Verso.
Journal issue        Journal of Sports Economics, 1(3), 257–276.
                     https://doi.org/10.1177/152700250000100304
Website              BBC News.
                     https://www.bbc.com/news/health-54531075
Physical location    Museo del Prado, Madrid, Spain.


The full reference entry

Anderson, B. (1983). Imagined communities: Reflections on the origins and
spread of nationalism. Verso.

Andreff, W., & Staudohar, P. D. (2000). The evolving European model of
professional sports finance. Journal of Sports Economics, 1(3),
257–276. https://doi.org/10.1177/152700250000100304

Rowlatt, J. (2020, October 19). Could cold water hold a clue to a
dementia cure? BBC News. https://www.bbc.com/news/health-54531075
Data and Data Collection

What does statistics mean?
Statistics is the science of collecting, organizing,
summarizing and analyzing information in order to
draw conclusions.
The Process of Statistics

Step 1: Identify a Research Objective


• The researcher must determine the question to be answered. The question must
be detailed.
• Identify the group to be studied. This group is called the population.
• An individual is a person or object that is a member of the population
being studied.
The Process of Statistics

Step 2: Collect the information needed to answer the questions.


• In conducting research, we typically look at a subset of the population, called
a sample.
Step 3: Organize and summarize the information.
• Descriptive statistics consists of organizing and summarizing the
information collected, using charts, tables, and numerical summaries.
The Process of Statistics

Step 4: Draw conclusions from the information.


• The information collected from the sample is generalized to the population.
• Inferential statistics uses methods that generalize results obtained from a
sample to the population and measure their reliability.
EXAMPLE The Process of Statistics

Many studies evaluate batterer treatment programs, but there are few experiments designed to
compare batterer treatment programs to non-therapeutic treatments, such as community service.
Researchers designed an experiment in which 376 male criminal court defendants who were accused
of assaulting their intimate female partners were randomly assigned into either a treatment group or a
control group. The subjects in the treatment group entered a 40-hour batterer treatment program
while the subjects in the control group received 40 hours of community service. After 6 months, it
was reported that 21% of the males in the control group had further battering incidents, while 10% of
the males in the treatment group had further battering incidents. The researchers concluded that the
treatment was effective in reducing repeat battering offenses.
Source: The Effects of a Group Batterer Treatment Program: A Randomized Experiment in Brooklyn by Bruce G. Taylor et al., Justice Quarterly,
Vol. 18, No. 1, March 2001.
Step 1: Identify the research objective.

To determine whether males accused of battering their intimate female
partners who were assigned to a 40-hour batterer treatment program are less
likely to batter again than those assigned to 40 hours of community
service.
Step 2: Collect the information needed to answer the question.

The researchers randomly divided the subjects into two groups. Group 1
participants received the 40-hour batterer program, while group 2 participants
received 40 hours of community service. Group 1 is called the treatment group
and the program is called the treatment. Group 2 is called the control group.
Six months after the program ended, the percentage of males that battered their
intimate female partner was determined.
Step 3: Organize and summarize the information.

The demographic characteristics of the subjects in the treatment and control
groups were similar. Six months after the program, 21% of the males in the control
group had further battering incidents, while 10% of the males in the treatment
group had further battering incidents.
Step 4: Draw conclusions from the data.

We extend the results of the 376 males in the study to all males who batter
their intimate female partner. That is, males who batter their female partner and
participate in a batterer treatment program are less likely to batter again.
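
The two percentages reported in Step 3 can be turned into simple effect measures. The sketch below uses only the two rates quoted above; the sizes of the two groups are not given in this summary, so no significance test is attempted. The function name `compare_rates` is illustrative, not part of the study.

```python
def compare_rates(control_rate, treatment_rate):
    """Return the absolute risk reduction and the relative risk."""
    risk_difference = control_rate - treatment_rate  # absolute reduction
    relative_risk = treatment_rate / control_rate    # treatment risk relative to control
    return risk_difference, relative_risk


# Rates reported in the example: 21% (control) vs 10% (treatment)
diff, rr = compare_rates(control_rate=0.21, treatment_rate=0.10)
print(f"Absolute reduction: {diff * 100:.0f} percentage points")
print(f"Relative risk: {rr:.2f}")
```

Reading the result: the treatment group's rate of repeat incidents is a little under half that of the control group. Whether that difference generalizes to the population is the inferential question of Step 4.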
Types of data

Data Collection Techniques

● Observations
● Tests
● Surveys
● Document analysis (the research literature)

Key Factors for High Quality Experimental Design

● Data should not be contaminated by poor measurement or errors in procedure.

● Eliminate confounding variables from the study, or minimize their effects on the variables.

● Representativeness: Does your sample represent the population you are studying? You must use random sampling techniques.

What Makes a Good Quantitative Research Design?

4 Key Elements
1. Freedom from Bias
2. Freedom from Confounding
3. Control of Extraneous Variables
4. Statistical Precision to Test Hypothesis

Bias: When observations favor some individuals in the population over others.

Confounding: When the effects of two or more variables cannot be separated.

Extraneous Variables: Any variable, other than the independent variable under study, that
has an effect on the dependent variable. These variables need to be identified and minimized.
E.g., when studying erosion potential as a function of clay content, rainfall intensity,
vegetation, and duration would be considered extraneous variables.

Precision Vs Accuracy

[Figure: four panels showing results that are (1) both accurate and precise, (2) accurate but not precise, (3) precise but not accurate, and (4) neither accurate nor precise]
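
The distinction between accuracy and precision can be made numerical. In the rough sketch below, accuracy is summarized by the bias (mean of the measurements minus the true value) and precision by the standard deviation. The reference value and both sets of readings are hypothetical, and the function name is illustrative.

```python
from statistics import mean, pstdev


def accuracy_and_precision(measurements, true_value):
    """Bias reflects (in)accuracy; standard deviation reflects (im)precision."""
    bias = mean(measurements) - true_value
    spread = pstdev(measurements)
    return bias, spread


TRUE_VALUE = 10.0  # hypothetical reference value

# Hypothetical repeated readings from two instruments
instrument_a = [10.1, 9.9, 10.0, 10.1, 9.9]    # accurate and precise
instrument_b = [12.0, 12.1, 11.9, 12.0, 12.0]  # precise but not accurate

for name, data in [("A", instrument_a), ("B", instrument_b)]:
    bias, spread = accuracy_and_precision(data, TRUE_VALUE)
    print(f"Instrument {name}: bias = {bias:+.2f}, spread = {spread:.2f}")
```

Both instruments have a small spread (they are precise), but only instrument A has a bias near zero (it is also accurate).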
Interpreting Results of Experiments

● The goal of research is to draw conclusions. What did the study mean?

● What, if any, is the cause and effect of the outcome?

Overall Methodology:

* State the objectives of the survey


* Define the target population
* Define the data to be collected
* Define the variables to be determined
* Define the required precision & accuracy
* Define the measurement `instrument'
* Define the sample size & sampling method, then select the sample

Introduction to Sampling

● Sampling is the problem of accurately acquiring the necessary data in order to form a representative view of the problem.

● This is much more difficult to do than is generally realized.

Sampling

Distributions:
When you form a sample, you often display it as a plotted
distribution known as a histogram.

A histogram shows the frequency of occurrence of
a certain variable within each of a set of specified ranges.
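
The counting behind a histogram can be sketched in a few lines. The sample values, the class width, and the function name below are hypothetical choices for illustration.

```python
from collections import Counter


def frequency_table(values, bin_width, start):
    """Count how many values fall in each class [lower, lower + bin_width)."""
    counts = Counter((v - start) // bin_width for v in values)
    return {
        (start + k * bin_width, start + (k + 1) * bin_width): counts[k]
        for k in sorted(counts)
    }


# Hypothetical sample of 12 whole-number measurements
sample = [12, 15, 21, 22, 24, 28, 31, 33, 35, 36, 38, 47]

table = frequency_table(sample, bin_width=10, start=10)
for (lower, upper), count in table.items():
    # One asterisk per observation gives a text version of the histogram bars
    print(f"{lower}-{upper - 1}: {'*' * count}")
```

Each row is one class (range) of the variable, and the length of the bar is the frequency of occurrence within that class.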

Interpreting quantitative findings

Descriptive Statistics

• Relation among the average, median, and mode
Definition

• Measures of dispersion are descriptive statistics that describe how similar a set of
scores are to each other
• The more similar the scores are to each other, the lower the measure of dispersion will be

• The less similar the scores are to each other, the higher the measure of dispersion will be

• In general, the more spread out a distribution is, the larger the measure of dispersion
will be
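
A small numerical illustration of the definition above: two sets of scores with the same mean but different spread. The scores and the function name are hypothetical; the range and the population standard deviation are used here as two common measures of dispersion.

```python
from statistics import mean, pstdev


def describe_spread(scores):
    """Return the mean, the range, and the population standard deviation."""
    return mean(scores), max(scores) - min(scores), pstdev(scores)


# Two hypothetical sets of scores with the same mean
clustered = [4, 5, 5, 5, 6]   # scores similar to each other
spread_out = [1, 3, 5, 7, 9]  # scores less similar to each other

for label, scores in [("clustered", clustered), ("spread out", spread_out)]:
    avg, value_range, sd = describe_spread(scores)
    print(f"{label}: mean = {avg}, range = {value_range}, std dev = {sd:.2f}")
```

Both sets have a mean of 5, but the spread-out set has the larger range and standard deviation, matching the statement that less similar scores give a higher measure of dispersion.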

Measures of Dispersion

• Which of the distributions of scores has the larger dispersion?

[Figure: two frequency distributions of scores from 1 to 10, one above the other]

The upper distribution has more dispersion because the scores are more spread out.
That is, they are less similar to each other.


Graphs

• Histogram
• Pareto Chart
• Cause and Effect Diagram
• Check Sheet
• Process Flow Diagram
• Scatter Diagram
• Control Chart
Why Graphs?
• To reveal a trend or comparison in the data
• Easily understood

(A) Types
There are different kinds of statistical charts, as follows:
1. Line graphs
2. Pie charts
3. Bar graphs
4. Scatter plots
5. Stem-and-leaf plots
6. Histograms
7. Frequency polygons
8. Frequency curves
9. Cumulative frequency curves (ogives)
Line Graph
• A line joining several points, or a line that shows the relationship
between the points
• Drawn on the x-y plane
• Relates an independent variable to a dependent variable



Pie Charts
• A pie chart is a circular graph divided into disjoint slices, each
showing the size of one category of related information.
• The circle represents the whole, and each slice represents a percentage of the
whole.

Advantages
• Visually appealing
• Percentage values are apparent at a glance

Preferred use (Limitation)

• Categorical data, where the reader needs to see what percentage each
category constitutes

Bar Graph
• A bar graph is drawn on x-y axes and has labelled horizontal or
vertical bars that show different values.
• The size, length and color of the bars represent different
values.

Preferred use (Limitation)
• Non-continuous data
• Comparing or contrasting the sizes of the different categories of the
data provided

Stem and Leaf Plot
• A stem-and-leaf plot, also called a stem plot, is used with quantitative data.
It helps to:
• Display the shape of the distribution
• Organize the numbers
• Present them as comprehensibly as possible

Stem and leaf
• A descriptive technique that emphasizes the data provided
• Reveals the shape of a set of data
• Gives a clear view of each data value. The data are arranged by “place value”.
• In a stem plot, each data value is split into two parts: a stem and a
leaf.
• The stem is usually the first digit of the number, written in a vertical column.
• The leaf is the last digit of the number, written in the row to the right of the
corresponding stem.



Stem and Leaf Diagrams

How to Draw One:

1. Put the first digits of each piece of data in numerical order down the left-hand side

2. Go through each piece of data in turn and put the remaining digits in the proper row

3. Re-draw the diagram putting the pieces of data in the right order

4. Add a key



Stem and Leaf Diagrams
Here are the marks gained by 30 students in an examination:
63 58 61 52 59 65 69 75 70 54 57 63 76 81 64
68 59 40 65 74 80 44 47 53 70 81 68 49 57 61

Write the tens figures in the left-hand column of a diagram.
These are the ‘STEMS’.

4 |
5 |
6 |
7 |
8 |


Go through the marks in turn and put the units figure of each mark
in the proper row. These are the ‘LEAVES’. After the first three marks (63, 58, 61):

4 |
5 | 8
6 | 3 1
7 |
8 |


When all the marks are entered, the diagram will look like this:

4 | 0 4 7 9
5 | 8 2 9 4 7 9 3 7
6 | 3 1 5 9 3 4 8 5 8 1
7 | 5 0 6 4 0
8 | 1 0 1


Rewrite the diagram so that the units figures in each row are in order:

4 | 0 4 7 9
5 | 2 3 4 7 7 8 9 9
6 | 1 1 3 3 4 5 5 8 8 9
7 | 0 0 4 5 6
8 | 0 1 1


Add a KEY:

4 | 0 4 7 9
5 | 2 3 4 7 7 8 9 9
6 | 1 1 3 3 4 5 5 8 8 9
7 | 0 0 4 5 6
8 | 0 1 1

Key: 5|2 = 52


Stem and Leaf Diagrams

Remember:
- Always put in a Key
- Always put your data in Order

Median:
- to work out the median, you must find the middle value
- if there are two middle values, you need the average

Range:
- to work out the Range, subtract the smallest number from the
biggest
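
The steps for drawing a stem-and-leaf diagram, and for reading the median and range from it, can be sketched as follows. The marks are the 30 examination marks from the slides; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict
from statistics import median

# The 30 examination marks used in the slides
marks = [63, 58, 61, 52, 59, 65, 69, 75, 70, 54, 57, 63, 76, 81, 64,
         68, 59, 40, 65, 74, 80, 44, 47, 53, 70, 81, 68, 49, 57, 61]


def stem_and_leaf(values):
    """Group two-digit values by tens digit (stem), with units digits (leaves) in order."""
    diagram = defaultdict(list)
    for value in sorted(values):       # sorting first puts the leaves in order
        diagram[value // 10].append(value % 10)
    return dict(diagram)


diagram = stem_and_leaf(marks)
for stem, leaves in diagram.items():
    print(f"{stem} | {' '.join(map(str, leaves))}")
print("Key: 5|2 = 52")

# With 30 values, the median is the average of the 15th and 16th ordered values
print("Median:", median(marks))
print("Range:", max(marks) - min(marks))
```

The printed rows should match the ordered diagram built up step by step in the slides, with the median and range read from the same ordered data.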



Stem and Leaf Diagrams

The stem & leaf diagram below shows the masses in kg of some people in a lift.
(a) How many people were weighed?
(b) What is the range of the masses?
(c) Find the median mass.

Stem (tens) | Leaf (units)
3 | 1 4
4 | 3 3 6
5 | 0 3 4 8
6 | 1 2
7 | 2 2 7
8 | 1 6

(a) 16 people.
(b) 86 – 31 = 55 kg
(c) 56 kg. The median is the mean of the 8th and 9th data values: (54 + 58) / 2 = 56.
Frequency Polygon

• The frequency polygon has most of the properties of a histogram, with an extra
feature: the midpoint of each class is marked on the x-axis. Each midpoint is
plotted against its class frequency, and the points are
connected using line segments.

• The graph is then completed, that is, closed by joining it to the x-axis at both ends. A frequency
polygon gives a less accurate representation of the distribution than a histogram,
because it represents the frequency of each class by a single point rather than by the whole class
interval.

Frequency Curve
• The frequency polygon consists of sharp turns and ups and downs which do not
conform to actual conditions.
• To remove these sharp features, the polygon has to be smoothed. No
definite rule for smoothing the polygon can be laid down.
• The smoothed curve should not deviate sharply
from the polygon.
• To draw a satisfactory frequency curve, first draw the
frequency histogram, then the frequency polygon, and finally the frequency curve.

Cumulative Frequency (Ogive)

• A cumulative frequency graph (ogive) plots cumulative frequencies on the y-axis and class
scores on the x-axis.

• The difference between a frequency curve and an ogive is that in the latter we plot the
cumulative frequency on the y-axis rather than the individual frequencies.

• Advantage: it enables the median, quartiles, etc. to be read from the graph.
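
The points of an ogive are running totals of the class frequencies. The sketch below assumes, for illustration, that the 30 examination marks from the stem-and-leaf slides are grouped into classes of width 10 (40-49, 50-59, and so on); the variable names are illustrative.

```python
from itertools import accumulate

# Class upper boundaries and frequencies (marks grouped as 40-49, 50-59, ...)
upper_bounds = [50, 60, 70, 80, 90]
frequencies = [4, 8, 10, 5, 3]

# Cumulative frequency: number of marks below each upper boundary
cumulative = list(accumulate(frequencies))
ogive_points = list(zip(upper_bounds, cumulative))
for bound, total_below in ogive_points:
    print(f"below {bound}: {total_below}")

# The median lies in the first class whose cumulative frequency reaches n / 2
n = cumulative[-1]
median_class_upper = next(u for u, c in ogive_points if c >= n / 2)
print("Median lies in the class ending at", median_class_upper)
```

On a plotted ogive the same reading is done graphically: find n/2 on the y-axis and read across to the curve, then down to the x-axis.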

Pareto Diagram

• Vilfredo Pareto (1848 – 1923), Italian economist
• Observed that 20% of the population has 80% of the wealth.

• Joseph Juran coined the phrase “vital few and trivial many”.
• He noted that 20% of the quality problems caused 80% of the dollar loss.

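
The table behind a Pareto diagram sorts problem categories from largest to smallest loss and adds a cumulative percentage, so the "vital few" stand out. The defect categories, dollar losses, and function name below are hypothetical.

```python
def pareto_table(losses):
    """Sort categories by loss (largest first) and attach cumulative percentages."""
    total = sum(losses.values())
    running = 0
    rows = []
    for category, loss in sorted(losses.items(), key=lambda item: item[1], reverse=True):
        running += loss
        rows.append((category, loss, 100 * running / total))
    return rows


# Hypothetical dollar losses by defect type
losses = {"Dents": 300, "Other": 40, "Scratches": 500,
          "Discoloration": 60, "Misalignment": 100}

rows = pareto_table(losses)
for category, loss, cumulative_pct in rows:
    print(f"{category:<14}{loss:>5}{cumulative_pct:>8.1f}%")
```

In this made-up data set the first two categories account for 80% of the total loss, which is the pattern a Pareto diagram is designed to expose.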
Cause-and-Effect Diagram

• Shows the relationships between a problem and its possible causes.
• Developed by Kaoru Ishikawa (1953).
• Also known as:
  - Fishbone diagrams
  - Ishikawa diagrams

Check Sheet



5) Flow chart



Outpatients clinics department flow chart:



Scatter plot

• A scatter plot or scatter graph is a type of graph drawn in Cartesian coordinates to
visually represent the values of two variables for a set of data. It is a graphical representation
that shows how one variable is related to the other.

• The data are presented as a collection of points. The value of one variable is positioned on the
horizontal or x-axis (explanatory variable).

• The value of the other variable is positioned on the vertical or y-axis (response variable).

• Used to illustrate the relationship between two variables by plotting one against the
other.



Example

Note that these data are not random


Run Charts

• Also called time series plots
• The basis for control charts



Control Charts

Why use control charts?
• To monitor, control, and improve system or process performance over time by
studying variation and its sources.

What do control charts do?
• Focus attention on detecting and monitoring process variation over time.
• Distinguish special from common causes of variation.
• Serve as a tool for ongoing control of a process.



Variation in a process is due to:
• Random causes (common causes)
• Assignable causes (special causes)



Common cause:
• Variation that is completely random
• Random causes that we cannot identify
• Unavoidable (e.g. slight differences in process variables such as diameter, weight, service time,
temperature)

Special cause:
• Variation that can appear within or outside the control limits,
e.g. trends, step functions, shifts, etc.



Control Chart
A control chart is a time plot of a statistic, such as a sample mean, range,
standard deviation, or proportion, with a center line and upper and lower
control limits. The limits give the desired range of values for the statistic.
When the statistic is outside the bounds, or when its time plot reveals certain
patterns, the process may be out of control.



Control Chart
Upper Control Limit (UCL) = µ + Kσ
Center Line (CL) = µ
Lower Control Limit (LCL) = µ - Kσ
where:
K = distance of the control limits from the center line, in standard deviation units
µ = mean of the sample statistic
σ = standard deviation of the sample statistic

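
The control limit formulas (UCL = µ + Kσ, CL = µ, LCL = µ - Kσ) can be sketched in code. The baseline readings and new points are hypothetical. K = 3 is an assumption here, a common convention that the slides do not state. Estimating σ directly from the baseline values is a simplification chosen for the illustration.

```python
from statistics import mean, pstdev


def control_limits(values, k=3):
    """Return (LCL, CL, UCL) as the mean minus/plus k standard deviations."""
    center = mean(values)
    sigma = pstdev(values)
    return center - k * sigma, center, center + k * sigma


# Hypothetical in-control baseline values of the monitored statistic
baseline = [10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.0]
lcl, cl, ucl = control_limits(baseline)  # k=3 is an assumed convention
print(f"LCL = {lcl:.2f}, CL = {cl:.2f}, UCL = {ucl:.2f}")

# Flag new points that fall outside the limits
new_points = [10.1, 9.9, 11.2]
out_of_control = [x for x in new_points if not lcl <= x <= ucl]
print("Outside the limits:", out_of_control)
```

A point outside the limits is a signal to look for a special cause; points inside the limits can still indicate trouble if they show patterns such as trends or shifts.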


Control Chart
• Control charts are a technique for improving productivity.
• Control charts are effective in defect prevention.
• Control charts prevent unnecessary process adjustment.
• Control charts provide diagnostic information.
• Control charts provide information about process capability.



Correlation & Regression Analysis

Applied Statistical Methods (BMTH113)-----Dr. Rasha El Kholy----Eslsca


• Introduction

• Correlation Analysis

• Coefficient of Determination

• Simple Linear Regression

Introduction:
• Recall that a scatter diagram is a graphical technique used to describe the
relationship between two quantitative variables.

• In this chapter we carry this idea further. We are going to calculate numerical
measures (Correlation Analysis) to express the strength of the relationship
between two variables.

• In addition, an equation (regression line) is used to express the
relationship between variables, allowing us to estimate (predict) one
variable on the basis of another.

Introduction:
• The Dependent Variable is the variable being predicted or estimated (denoted by y).

• The Independent Variable provides the basis for estimation. It is the predictor
variable (denoted by x).

Examples:

1. Age of a bus (independent) and maintenance cost (dependent)

2. Auction price (dependent) and odometer reading (independent)
Correlation Analysis
• Correlation Analysis is the study of the relationship between variables. It is also
defined as a group of techniques to measure the association between two variables.

• The first step in correlation analysis is drawing a scatter diagram to portray the
relationship between the two variables.

1- Correlation Coefficient
The correlation coefficient (r) measures the strength of the linear relationship between
two variables.

• It requires interval or ratio-scaled data.

• It can range from -1.00 to 1.00. Values near -1.00 or 1.00 indicate a strong negative or
positive linear relationship; a value of 0 indicates no linear relationship.

1- Correlation Coefficient

[Figure: two scatter diagrams]
• Strong negative linear relationship (r = -0.933)
• Moderate positive linear relationship (r = 0.518)

1- Correlation Coefficient: Example
The sales manager of Copier Sales of America wants to determine whether there is a
relationship between the number of sales calls made in a month and the number of
copiers sold that month. The manager selects a random sample of 10
representatives and determines the number of sales calls each representative made
last month and the number of copiers sold.

1- Correlation Coefficient: Example

Scatter diagram:
X: sales calls; Y:copiers sold

Positive linear relationship. Is it strong or moderate?


Choose the cell where you want the results to be shown.


r=0.759

r = 0.759. What does that mean?

There is a direct linear relationship between the number of sales calls and the number of copiers
sold. The association is strong.

Note that this doesn’t mean that more sales calls cause more copier sales. We have not
demonstrated cause and effect here, only that the two variables (sales calls and copiers sold) are
related.

Correlation vs. causality: just because two things occur together does not mean that one is
the cause of the other. For example, an increase in ice-cream consumption in the summer is
correlated with an increased rate of drowning deaths. But this doesn’t mean that ice-cream
consumption causes drowning.
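
The correlation coefficient can be computed directly from its definition. The slide's data table did not survive in this text, so the ten pairs below are an assumption: they follow the way this example is commonly tabulated in textbooks, and are used here only because they reproduce the reported r = 0.759. The function name `pearson_r` is illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their own variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# ASSUMED data (not shown in the text of the slides): 10 sales representatives
sales_calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
copiers_sold = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]

r = pearson_r(sales_calls, copiers_sold)
print(f"r = {r:.3f}")
```

With real data the same function applies unchanged; only the two lists need to be replaced.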
https://www.e-education.psu.edu/marcellus/node/636
2- Coefficient of Determination
The coefficient of determination (r²) is the proportion of the total variation in
the dependent variable (Y) that is explained by the variation in the
independent variable (X). It is the square of the coefficient of correlation.

• It ranges from 0 to 1.

• It does not give any information on the direction of the relationship between the variables.

2- Coefficient of Determination: Example

Referring to the sales calls and copiers sales example:

The coefficient of determination is

r² = (0.759)² = 0.576

It means that 57.6% of the variation in the number of copiers sold is explained, or accounted for, by the variation in the number of sales calls.
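
The arithmetic above can be checked in a couple of lines. This small sketch also illustrates why r² carries no information about direction: squaring discards the sign of r. The variable names are illustrative.

```python
r = 0.759  # correlation coefficient from the copier sales example

r_squared = r ** 2
print(f"r^2 = {r_squared:.3f}")  # share of variation in Y explained by X

# Squaring discards the sign, so r = -0.759 would give the same r^2
same_for_negative = (-r) ** 2 == r_squared
print(same_for_negative)
```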

3- Simple Linear Regression Model
We want to draw a straight line that passes among the points and fits (represents) the data fairly well.

This line is called the “best fit line” or “regression line”.

It is used to predict future values.

Simple Linear Regression

• Only one independent variable (X).
• A straight line can represent the relationship.
• Equation of a line.

3- Simple Linear Regression Model
The equation of the regression line (the best fit line) is:

Ŷ = a + bX

where Ŷ (read “Y hat”) is the estimated value of Y for a given value of X,
a is the Y-intercept, the value of Ŷ when X = 0,
b is the slope, the change (increase or decrease) in Ŷ when X increases by one unit.
a and b are the regression coefficients.

3- Simple Linear Regression Model: Example
Recall the example of Copier Sales. Find the linear regression line to express the
relationship between the two variables. What is the expected number of copiers sold by a
representative who made 20 calls?

3- Simple Linear Regression Model: Example

[Regression output, with annotations marking the absolute value of r and r²]

The linear regression equation is: Ŷ = 18.95 + 1.18X
3- Simple Linear Regression Model: Example

The linear regression equation is: Ŷ = 18.95 + 1.18X

Interpretation of the regression coefficients:

Around 19 copiers are expected to be sold even if no sales calls are made. For every additional call, the expected number of copiers sold increases by 1.18.

The estimated number of copiers sold by a representative who made 20 calls is

Ŷ = 18.95 + 1.18(20) = 42.55
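
The prediction step can be written as a small function. This is only a sketch using the coefficients quoted above; the names `A`, `B` and `predict_copiers` are illustrative.

```python
A = 18.95  # Y-intercept of the fitted line
B = 1.18   # slope: extra copiers sold per additional sales call

def predict_copiers(calls):
    """Estimated copiers sold for a given number of sales calls."""
    return A + B * calls

estimate = predict_copiers(20)
print(round(estimate, 2))
```

Predictions are most trustworthy for values of X inside the range of the observed data.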

3- Simple Linear Regression Model
The regression line is determined by the least squares method.

Summary:

Scatter Diagram → linearity, direction

Correlation Coefficient (r) → direction, strength

Coefficient of Determination (r²) → % of variation in Y explained by variation in X

Regression Model → direction, prediction
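
To make the least squares method concrete, here is a minimal sketch using the standard closed-form solution (slope = Sxy / Sxx, intercept = mean of Y minus slope times mean of X). The data are hypothetical and `least_squares_line` is an illustrative name; spreadsheet or statistics software would normally do this for you.

```python
def least_squares_line(x, y):
    """Return (a, b) for the line y_hat = a + b*x that minimizes squared errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the line passes through (mean_x, mean_y)
    return a, b

# Hypothetical data: sales calls (x) and copiers sold (y)
calls = [10, 20, 30, 40, 50]
sold = [25, 40, 35, 60, 55]
a, b = least_squares_line(calls, sold)
print(round(a, 2), round(b, 2))
```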

Example 1:
Ten students were selected at random, and a comparison was made of their high school grade point averages (GPAs) and their grade point averages at the end of their first year in college.

(a) What kind of correlation is present between high school GPA and college GPA?

(b) Using the regression line, what would be the predicted college GPA if a student has a high school GPA of 3.4?

a) r = 0.941088: positive, strong linear relationship.

b) Ŷ = −0.45 + 1.1X. Interpretation!
For X = 3.4, Ŷ = −0.45 + 1.1(3.4) = 3.29. The estimated college GPA is about 3.3 for a high school GPA of 3.4.



Example 2:
Data were obtained in a study on the number of absences and the final grades of seven randomly selected students from a statistics class.
(a) What kind of correlation is present between final grade and absences?
(b) Predict the final grade of a student who missed 3 classes.

a) r = −0.944: negative, strong linear relationship.

b) Ŷ = 102.49 − 3.62X. Interpretation!
For X = 3, Ŷ = 102.49 − 3.62(3) = 91.63. The estimated final grade is 91.63 for a student who missed 3 classes.
Random Variables, Probability Distributions & Normal Distribution
Learning Objectives
1. Understand what a random variable is.
2. Identify the characteristics of a probability distribution.
3. Distinguish between discrete and continuous random variables.
4. Describe the characteristics of a normal probability distribution.
5. Describe the standard normal probability distribution and use it to calculate probabilities.
6. Convert a normal random variable to a standard normal random variable to calculate probabilities.
7. Apply the Empirical Rule.
8. Find the value x from a given probability.
Example

Suppose we are interested in the number of heads showing face up when we toss a coin 3 times.

How many possible outcomes does this experiment have, and what are they?

If we are interested in the number of heads that appear, what are the possible values?
Example…cont’d
Using these data, we construct what we can call a probability distribution of the random variable: the possible values of the random variable and their corresponding probabilities.
Example…cont’d
We can also represent the probability distribution using a chart.
What is a Probability Distribution?
A probability distribution of a random variable is a listing of all possible values of the random variable and the probability associated with each outcome.

CHARACTERISTICS OF A PROBABILITY DISTRIBUTION

1. The probability of a particular outcome (value) is between 0 and 1 inclusive.
2. The outcomes (values) are mutually exclusive.
3. The list of outcomes (values) is exhaustive.
So the sum of the probabilities of the outcomes (values) is 1.
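
The coin example from the earlier slides can be worked out by enumeration. The sketch below assumes a fair coin (so all outcomes are equally likely); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# All equally likely outcomes of tossing a fair coin 3 times
outcomes = list(product("HT", repeat=3))

# Random variable X = number of heads in each outcome
counts = Counter(outcome.count("H") for outcome in outcomes)
distribution = {x: Fraction(k, len(outcomes)) for x, k in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)

# Check the characteristics listed above
assert all(0 <= p <= 1 for p in distribution.values())
assert sum(distribution.values()) == 1
```

Using `Fraction` keeps the probabilities exact, so the check that they sum to 1 is not affected by rounding.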
Random Variable
A quantity resulting from a random experiment that, by chance, can assume different values.

It is a mapping of the outcomes of a random experiment to numbers.

A random variable is denoted by a capital letter: X, Y, Z, …

Random Variable Examples
1) Rolling 2 dice and recording the sum of the upward faces.

2) A bank counting the number of credit cards carried by a group of customers.

3) Distance traveled to work each day.

4) Waiting time (in minutes) to get a service.


Two Types of Random Variables: 1- Discrete Random Variable
One type of random variable is the discrete random variable.

A discrete random variable is a random variable that can assume only certain clearly separated (distinct) values.

Discrete variables are usually the result of counting.

Examples
• Tossing a coin three times and counting the number of heads.
• A bank counting the number of credit cards carried by a group of customers.
• Number of students in a class.
1- Discrete Random Variable
The probability distribution for the number of cards carried can take the following form:

Number of Credit Cards    Prob. (Relative Frequency)
0                         .03
1                         .10
2                         .18
3                         .21
4 or more                 .48
Total                     1.00
Two Types of Random Variables: 2- Continuous Random Variable

Continuous random variables can assume an infinite number of values within a given range.

Continuous variables are usually the result of measuring.

Examples
• The flight time between Atlanta and LA: 4.67 hours, 5.13 hours, and so on.
• The annual snowfall in Minneapolis, measured in inches.
• Sales, in dollars, of a certain company.
Normal Probability Distribution
• The normal probability distribution is a continuous distribution that is widely used in theory and in practice.
• It is used to model natural phenomena such as height, IQ scores, …
• It has the following characteristics:
  • It is bell-shaped and has a single peak at the center of the distribution.
  • It is symmetric about the mean.
  • It is asymptotic, meaning the curve approaches but never touches the X-axis.
  • It is completely described by its mean and standard deviation.
  • The area under the curve equals 1.
Normal Probability Distribution
[Figure: the normal curve]
• There is a family of normal probability distributions: there is an infinite number of normal distributions, one for each combination of mean and standard deviation.

Figure panels: equal means and different standard deviations; different means and standard deviations; different means and equal standard deviations.


Normal Probability Distribution
Probabilities are areas under the curve.

Calculating an area under the curve requires integrating a function. Instead, we use a table for a reference normal distribution called the standard normal distribution.
Standard Normal Probability Distribution

• The standard normal probability distribution is a particular normal distribution with a mean of 0 and a standard deviation of 1.
• It is always denoted by Z.
• We are provided with a table of areas under the curve for Z.
• The standard normal curve has the following characteristics:
  • It is bell-shaped and has a single peak at the center of the distribution.
  • It is asymptotic, meaning the curve approaches but never touches the X-axis.
  • The area under the curve equals 1.
  • It is symmetric about zero.
  • The area to the left of zero equals the area to the right of zero.
How to read the Z table?
P(0 < Z < 0.56)

P(-0.24 < Z < 0)

P(Z > 1.96)

P(Z < 0.56)
Example:
(a) Find P(Z>2)

(b) Find P(-2<Z<2)

(c) Find P(0<Z<1.73)

(d) Find P(1<Z<2)

(e) Find P(Z<-1.94)

(f) Find P(-2.1<Z<0.84)

(g) Find P(-2.5<Z<-2.09)
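
Answers read from the Z table can be cross-checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+); `prob_between` and the `answers` dictionary are illustrative names, and results may differ from table values in the fourth decimal because the table is rounded.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def prob_between(lo, hi):
    """P(lo < Z < hi) as a difference of cumulative probabilities."""
    return Z.cdf(hi) - Z.cdf(lo)

answers = {
    "a": 1 - Z.cdf(2),               # P(Z > 2)
    "b": prob_between(-2, 2),        # P(-2 < Z < 2)
    "c": prob_between(0, 1.73),      # P(0 < Z < 1.73)
    "d": prob_between(1, 2),         # P(1 < Z < 2)
    "e": Z.cdf(-1.94),               # P(Z < -1.94)
    "f": prob_between(-2.1, 0.84),   # P(-2.1 < Z < 0.84)
    "g": prob_between(-2.5, -2.09),  # P(-2.5 < Z < -2.09)
}
for label, p in answers.items():
    print(label, round(p, 4))
```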


Recall

The standard normal probability distribution is a particular
normal distribution. It has a mean of 0 and a standard deviation
of 1.

How to calculate the Z-value?


Any normal probability distribution, with mean μ and standard
deviation σ, can be converted to the standard normal probability
distribution with the following formula

If \(X \sim N(\mu, \sigma^2)\), then \(Z = \frac{X - \mu}{\sigma} \sim N(0, 1)\).


Example
Suppose the weekly income of Uber drivers follows the normal
probability distribution with a mean of $1,000 and a standard
deviation of $100.

a) What is the probability that an Uber driver earns between $1,000 and $1,100?

\[
P(1000 < X < 1100) = P\left(\frac{1000-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{1100-\mu}{\sigma}\right) = P\left(\frac{1000-1000}{100} < Z < \frac{1100-1000}{100}\right) = P(0 < Z < 1) = 0.3413
\]
Example.. Cont’d
b) What is the probability that an Uber driver earns less than $1,200 weekly?

\[
P(X < 1200) = P\left(\frac{X-\mu}{\sigma} < \frac{1200-\mu}{\sigma}\right) = P\left(Z < \frac{1200-1000}{100}\right) = P(Z < 2) = 0.5 + 0.4772 = 0.9772
\]
Example.. Cont’d
c) What is the probability that an Uber driver earns between $900 and $1,100 weekly?

\[
P(900 < X < 1100) = P\left(\frac{900-1000}{100} < Z < \frac{1100-1000}{100}\right) = P(-1 < Z < 1) = 2(0.3413) = 0.6826
\]

d) What is the probability that an Uber driver earns between $800 and $1,200 weekly?

\[
P(800 < X < 1200) = P\left(\frac{800-1000}{100} < Z < \frac{1200-1000}{100}\right) = P(-2 < Z < 2) = 2(0.4772) = 0.9544
\]
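
The four Uber income probabilities can be reproduced without the table. This sketch uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative, and small differences from the table-based answers in the fourth decimal come from table rounding.

```python
from statistics import NormalDist

income = NormalDist(mu=1000, sigma=100)  # weekly income in dollars

p_a = income.cdf(1100) - income.cdf(1000)  # between $1,000 and $1,100
p_b = income.cdf(1200)                     # less than $1,200
p_c = income.cdf(1100) - income.cdf(900)   # between $900 and $1,100
p_d = income.cdf(1200) - income.cdf(800)   # between $800 and $1,200

# The same answer for (a) via standardization: z = (x - mu) / sigma
z = (1100 - income.mean) / income.stdev
p_a_via_z = NormalDist().cdf(z) - 0.5

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(label, round(p, 4))
```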
The Empirical Rule
For a symmetric, bell-shaped distribution:
1. About 68% of the observations lie within μ ± 1σ
2. About 95% of the observations lie within μ ± 2σ
3. About 99.7% of the observations lie within μ ± 3σ

To verify the Empirical Rule for the standard normal distribution:

P(0 < Z < 1) = 0.3413, so 0.3413 × 2 = 0.6826, or about 68%
P(0 < Z < 2) = 0.4772, so 0.4772 × 2 = 0.9544, or about 95%
P(0 < Z < 3) = 0.4987, so 0.4987 × 2 = 0.9974, or about 99.7%
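
The same verification can be sketched in code, computing P(-k < Z < k) for k = 1, 2, 3 with the standard library's `statistics.NormalDist`. The dictionary name `within` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

# P(-k < Z < k) for k = 1, 2, 3 standard deviations
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} standard deviation(s): {p:.4f}")
```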
The Empirical Rule… Example
As part of its quality assurance program, the Autolite Battery Company conducts tests on battery life. For a particular D-cell alkaline battery, the mean life is 19 hours. The useful life of the battery follows a normal distribution with a standard deviation of 1.2 hours.
1. About 68% of the batteries fail between what two values?
μ ± 1σ; 19 ± 1(1.2) hours;
About 68% of batteries will fail between 17.8 and 20.2 hours.
2. About 95% of the batteries fail between what two values?
μ ± 2σ; 19 ± 2(1.2) hours;
About 95% of batteries will fail between 16.6 and 21.4 hours.
3. Virtually all of the batteries fail between what two values?
μ ± 3σ; 19 ± 3(1.2) hours;
Practically all (99.7%) of the batteries will fail between 15.4 and 22.6 hours.
Finding the value of Z given a probability
Find the value of z satisfying each of the following:

(a) P(Z > z) = 0.5

(b) P(Z < z) = 0.8643

(c) P(-z < Z < z) = 0.9

(d) P(-z < Z < z) = 0.99

(e) P(Z < z) = 0.33

a) z = 0   b) z = 1.1   c) z = 1.645   d) z = 2.575   e) z = −0.44
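
Reading the table "backwards" corresponds to the inverse cumulative distribution function. A minimal sketch with `statistics.NormalDist.inv_cdf` from the standard library follows; each probability is first rewritten as a left-tail area P(Z < z), and the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

z_a = Z.inv_cdf(1 - 0.5)         # P(Z > z) = 0.5   ->  P(Z < z) = 0.5
z_b = Z.inv_cdf(0.8643)          # P(Z < z) = 0.8643
z_c = Z.inv_cdf(0.5 + 0.9 / 2)   # P(-z < Z < z) = 0.9  ->  P(Z < z) = 0.95
z_d = Z.inv_cdf(0.5 + 0.99 / 2)  # P(-z < Z < z) = 0.99 ->  P(Z < z) = 0.995
z_e = Z.inv_cdf(0.33)            # P(Z < z) = 0.33

for label, z in zip("abcde", (z_a, z_b, z_c, z_d, z_e)):
    print(label, round(z, 3))
```

The computed values agree with the table answers up to the rounding and interpolation used when reading the table.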
Finding the value of X given a probability
Example 1: Scores on an examination are assumed to be normally distributed with mean 78 and variance 36 (standard deviation 6).
(a) Suppose that students scoring in the top 10% of this distribution are to receive an A grade. What is the minimum score a student must achieve to earn an A grade?

\[
P(Y > A) = 0.1 \Rightarrow P(Z > z) = 0.1 \Rightarrow z = 1.285 = \frac{A - 78}{6} \Rightarrow A = 78 + 1.285 \times 6 = 85.71
\]

(b) What must be the cutoff point for passing the examination if the examiner wants only the top 72% of all scores to be passing?

\[
P(Y > k) = 0.72 \Rightarrow P(Z > z) = 0.72 \Rightarrow z = -0.585 = \frac{k - 78}{6} \Rightarrow k = 78 + (-0.585) \times 6 = 74.49
\]
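
As a cross-check on the exam cutoffs, the sketch below applies the inverse cumulative distribution directly to the score distribution, using `statistics.NormalDist` from the standard library. The variable names are illustrative; the results differ slightly from the table-based answers because the table values of z were interpolated.

```python
from statistics import NormalDist

scores = NormalDist(mu=78, sigma=36 ** 0.5)  # variance 36 -> sigma = 6

a_cutoff = scores.inv_cdf(1 - 0.10)     # top 10% lie above this score
pass_cutoff = scores.inv_cdf(1 - 0.72)  # top 72% lie above this score
print(round(a_cutoff, 2), round(pass_cutoff, 2))
```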
Finding the value of X given a probability
Example 2: Layton Tire and Rubber Company wishes to set a minimum mileage guarantee on its new MX100 tire. Tests reveal the mean mileage is 67,900 miles with a standard deviation of 2,050 miles and that the mileage follows a normal distribution. Let x represent the minimum guaranteed mileage, set so that no more than 4% of tires need to be replaced. Find the value of x.

\[
z = \frac{x - \mu}{\sigma} = \frac{x - 67900}{2050}
\]

From the table, z = −1.755, so

\[
-1.755 = \frac{x - 67900}{2050}, \quad \text{therefore } x = 64302.25 \text{ miles}
\]
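
The guarantee can also be found by asking for the 4th percentile of the mileage distribution. This is a sketch with `statistics.NormalDist` from the standard library; the variable names are illustrative, and the result differs by a few miles from the table-based answer because the table value of z is approximate.

```python
from statistics import NormalDist

mileage = NormalDist(mu=67_900, sigma=2_050)

# Guarantee x such that P(X < x) = 0.04, i.e. about 4% of tires fall short
guarantee = mileage.inv_cdf(0.04)
print(round(guarantee))

# Check: the share of tires below the guarantee should be 4%
share_replaced = mileage.cdf(guarantee)
print(round(share_replaced, 4))
```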
Acceptance Sampling

• Meaning of Acceptance Sampling or Sampling Inspection
• Classification of Acceptance Sampling
• Terms Used in Acceptance Sampling
• Advantages of Acceptance Sampling
• Limitations of Acceptance Sampling

Dr. Ghassan Mohamed


Meaning of Acceptance Sampling or Sampling Inspection

• One method of controlling the quality of a product is 100% inspection, which requires huge expenditure in terms of time, money and labor. Moreover, because of the boredom and fatigue involved in a repetitive inspection process, defects may be overlooked and some defective products may pass the inspection point.



Classification of Acceptance Sampling

• (i) Acceptance sampling on the basis of attributes, i.e. GO and NOT GO gauges.

• (ii) Acceptance sampling on the basis of variables.



The following terms are generally used in acceptance sampling:

• (i) Acceptable Quality Level (AQL):

• (iii) Average outgoing Quality (A.O.Q):

• Operating Characteristic Curve or O.C. Curve



Advantages of Acceptance Sampling

• (i) The method is applicable in industries where there is mass production and a set production procedure is followed.

• (ii) The method is economical and easy to understand.

• (iii) It causes less fatigue and boredom.

• (iv) The computation work involved is comparatively small.

• (v) The people involved in inspection can be trained easily.

• (vi) Products whose inspection is destructive can be assessed by sampling.

• (vii) Because the inspection process is quick, scheduling and delivery times are improved.



Limitations of Acceptance Sampling

• (i) It does not give 100% assurance of conformance to specifications, so there is always some risk of drawing the wrong inference about the quality of the batch/lot.

• (ii) Success of the system depends on sampling randomness, the quality characteristics to be tested, batch size and the criteria for acceptance of the lot.



Producer’s and Consumer’s Risk:
The acceptance or rejection of the whole batch of products in acceptance sampling depends upon the results of the sample inspected. There is always a chance that a sample may not be truly representative of the batch or lot from which it is drawn.

This leads to the following two types of risk:

(i) Producer's risk.
(ii) Consumer's risk.
Single and Double Sampling Plan

Count the number of defectives, ‘d’, in the sample of size ‘n’.

Is ‘d’ ≤ ‘c’?

If yes, then accept the lot. If no, then reject the lot.
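
The single sampling decision rule can be sketched as follows. The plan parameters (n = 50, c = 2) and the function names are hypothetical, and the acceptance probability assumes the number of defectives in the sample follows a binomial distribution, which is a common approximation when the lot is large relative to the sample.

```python
from math import comb

def accept_lot(d, c):
    """Single sampling decision: accept when defectives d do not exceed c."""
    return d <= c

def prob_accept(n, c, p):
    """P(accept) = P(d <= c), assuming d ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))

# Hypothetical plan: sample n = 50 items, accept if at most c = 2 are defective
n, c = 50, 2
print(accept_lot(d=1, c=c), accept_lot(d=3, c=c))

# Acceptance probability for a few hypothetical lot defect rates
for p in (0.01, 0.05, 0.10):
    print(p, round(prob_accept(n, c, p), 3))
```

Plotting `prob_accept` against the lot defect rate p gives the operating characteristic (O.C.) curve of the plan.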


Switching rules

Reduced   Normal   Tightened


Discussion
Thank you
