Dissertation Design & Data Collection
Dr. Ghassan M
Ghassan Mohamed
Ghassan was born on November 23, 1986, in Giza, Egypt. He graduated from the Faculty of Science.
He then earned a master’s degree in statistical quality control and quality assurance from Cairo University
with a grade of Excellent. Later, he earned a doctorate in statistical quality control and quality
assurance from Cairo University, also with a grade of Excellent.
His educational journey did not end there. This year, he earned diplomas in governance, TOT, and
organizational excellence. Ghassan realized his biggest dream: he finally became a lecturer at Cairo
University to complete his learning journey. He is also an assessor in government agencies for the Egypt
Award and Egypt's Vision 2030.
He also achieved one of his dreams by becoming a member of the CHG family, teaching quality, statistics,
and the 7 Habits alongside his daily work as a quality supervisor.
Today, Ghassan encourages promising people to study the different sciences. His advice: follow your
dreams, no matter how great.
CONTENTS
00 Learning Objectives
01 Dissertation: Dissertation & Dissertation design
02 Data: Data types and how to deal with them
03 Data Collection Methods
04 Exam
What is a dissertation?
A dissertation is...
● A long piece of academic writing

● Based on original research
● Usually submitted at the end of a degree
● Tests your capacity for independent research
● Sometimes called a thesis
Ghassan_mohamed@yahoo.com
Check your guidelines!
● The sections included vary

● They may change based on your field and the nature of your specific research
● Check any guidelines you are given
● Ask your supervisor if you’re unsure
Structuring a dissertation or thesis
Title page
✓ Dissertation title
✓ Your name
✓ Type of document
✓ Department and institution
✓ Degree program
✓ Date of submission
Acknowledgements
● Less formal, more personal

● No longer than a page
● Thank people who helped you complete your dissertation
● E.g. supervisors, friends and family, pets!
Abstract
✓ State the main topic and aims of your research

✓ Describe the methods you used
✓ Summarize the main results
✓ State your conclusions
X Not an introduction but a summary
Abstract
In other words, the abstract should be able to stand alone.

For it to stand alone, your abstract should cover the following key points (at a minimum):
1- Your research questions and aims – what key question(s) did your research aim to answer?
2- Your methodology – how did you go about investigating the topic and finding answers to your research
question(s)?
3- Your findings – following your own research, what did you discover?
4- Your conclusions – based on your findings, what conclusions did you draw? What answers did you find to
your research question(s)?
Table of contents
● Lists all sections that come after it

● Can be auto-generated in Word
● Clear, consistent headings
Lists of figures and tables
● Include if your dissertation has a lot of tables and figures

● Tables and figures listed and numbered separately
● Make sure all tables and figures are included
● List in the order they appear in the text
List of abbreviations
● Include if you use a lot of abbreviations

● Define abbreviations here and in the main text
● List alphabetically
● Very well-known abbreviations not included
Glossary
● Include if you use a lot of specialist terms

● List terms alphabetically
● Include a brief definition of each term
● Consult with your supervisor to determine what terms you should
define
Introduction
✓ Establish your research topic

✓ Provide background information
✓ Define the scope of your research
✓ Show your work’s relevance
✓ State your research questions and objectives
✓ Give an overview of your structure
Literature review
✓ Search for sources

✓ Select the most relevant
✓ Critically evaluate sources
✓ Make connections between them
✓ Draw conclusions based on your review
✓ Show how your own research builds on what you found
Theoretical framework
● Often builds upon/includes the literature review

● Define and analyze key theories, concepts & models
● Show how they inform your own approach
Methodology
✓ Overall approach (e.g. qualitative, quantitative, experimental, ethnographic)
✓ Data collection methods (e.g. surveys, interviews, archives)
✓ Details: where, when, who?
✓ Tools and materials (e.g. programs, lab equipment)
✓ Data analysis methods (e.g. statistical analysis)
✓ Obstacles faced during research
Results
✓ Report your results concisely and objectively

✓ Include results relevant to your research questions
✓ May include data visualizations (e.g. graphs, tables)
X Don’t give subjective interpretations
Discussion
● Interpret your results

● Did they meet your expectations?
● Did they fit the established framework?
● What factors might have influenced any unexpected results?
● Consider alternative interpretations
● Acknowledge limitations
Conclusion
✓ Answer your main research question

✓ Make suggestions for future research
✓ Show what you have contributed
✓ Emphasize the importance of your research
X Don’t introduce any new data, interpretations, or arguments
Reference list or bibliography
● Lists all sources cited in your dissertation

● Includes full and accurate details of each source
● Format varies depending on style guide (e.g. APA, MLA)
● Citation generators can help
Appendices
● Present additional data or documents not included in your main text

● E.g. interview transcripts, survey questions, tables of data
● If you have multiple appendices, they are numbered (Appendix 1,
Appendix 2…)
APA Style
An introduction to formatting and citation

What is APA Style?
● APA = American Psychological Association

● Based on the APA Publication Manual
● The 7th edition is the latest, published in 2019
● Used widely, especially in the social sciences
● Provides guidelines for language, formatting, and citations
General formatting
● Times New Roman 12 pt, Calibri 11 pt, Arial 11 pt, etc.

● Double line spacing
● One-inch (2.54 cm) margins
● Page number in the top right
● Running head in the top left (if submitting for publication)
Levels of heading
Title page
Tables and figures
In-text citations
Basic in-text citation format
Parenthetical citation
It is important to avoid plagiarism in academic writing (Smith, 2020, p. 15).
Narrative citation
Smith (2020, p. 15) states that it is important to avoid plagiarism in academic writing.
Multiple authors
Authors | Parenthetical citation | Narrative citation
1 author | (Smith, 2020) | Smith (2020)
2 authors | (Smith & Jones, 2020) | Smith and Jones (2020)
3+ authors | (Smith et al., 2020) | Smith et al. (2020)
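The author-count rule in the table above can be sketched in code. This is a minimal illustration, not an official APA tool: the function name `in_text_citation` and its parameters are hypothetical, and it assumes at least one author surname is supplied.

```python
def in_text_citation(authors, year, narrative=False):
    """Format an APA-style in-text citation from a list of author surnames.

    Hypothetical helper for illustration; assumes at least one author.
    """
    if len(authors) == 1:
        names = authors[0]
    elif len(authors) == 2:
        # "&" inside parentheses, "and" in running text
        joiner = " and " if narrative else " & "
        names = joiner.join(authors)
    else:
        # Three or more authors: first surname followed by "et al."
        names = f"{authors[0]} et al."
    if narrative:
        return f"{names} ({year})"
    return f"({names}, {year})"


print(in_text_citation(["Smith"], 2020))
print(in_text_citation(["Smith", "Jones"], 2020, narrative=True))
print(in_text_citation(["Smith", "Jones", "Lee"], 2020))
```

Missing authors, missing dates, and page numbers are not handled here; they would need extra branches.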

Missing information
No author | (Scribbr, 2020) or (“Statistical analysis,” 2020)
No date | (Smith, n.d.)
No page numbers | (Smith, 2020, para. 10)

Combining citations
X (Smith, 2020) (Jones, 2015) (McCombes et al., 2017)
✓ (Jones, 2015; McCombes et al., 2017; Smith, 2020)

The reference list
Reference or no reference?
● Page on a website
● Interview you conducted
● Article from an academic journal
● Email from an expert
● Book used as background reading
● Chapter from a book that you cited
● PowerPoint slides from a lecture
● Facebook status
✓ Reference required: page on a website, book, article from an academic journal
X No reference required: interview you conducted, email, background reading
? It depends: Facebook status, lecture slides
Format of the reference page
Components of a reference entry
1. Author
2. Date
3. Title
4. Source
The author component
One author | Anderson, B.
Multiple authors | Andreff, W., Staudohar, P. D., & LaBrode, M.
Corporate author | Scribbr.
With username | Obama, B. [@BarackObama].
With specific role | Scott, R. (Director).

The date component
Year only (books and journals) | (2020).
Full date (web pages, newspapers, online videos) | (2020, November 20).
No publication date available | (n.d.).
Retrieval date for online sources that are continually updated | Retrieved December 3, 2020, from …
The title component
Italics for standalone sources (books, films) | Statistical methods for psychology.
Plain text for sources within sources (articles, chapters, web pages) | The evolving European model of professional sports finance.
Square brackets to describe untitled sources | [Photograph of a wren].
The source component
Book publisher | Verso.
Journal issue | Journal of Sports Economics, 1(3), 257–276. https://doi.org/10.1177/152700250000100304
Website | BBC News. https://www.bbc.com/news/health-54531075
Physical location | Museo del Prado, Madrid, Spain.

The full reference entry
Anderson, B. (1983). Imagined communities: Reflections on the origins and
    spread of nationalism. Verso.
Andreff, W., & Staudohar, P. D. (2000). The evolving European model of
    professional sports finance. Journal of Sports Economics, 1(3),
    257–276. https://doi.org/10.1177/152700250000100304
Rowlatt, J. (2020, October 19). Could cold water hold a clue to a
    dementia cure? BBC News. https://www.bbc.com/news/health-54531075
Data and Data Collection
What does Statistics mean?
Statistics is the science of collecting, organizing,
summarizing and analyzing information in order to
draw conclusions.
The Process of Statistics
Step 1: Identify a Research Objective

• The researcher must determine the question he/she wants answered; the question must
be detailed.
• Identify the group to be studied. This group is called the population.
• An individual is a person or object that is a member of the population
being studied
Step 2: Collect the information needed to answer the questions.

• In conducting research, we typically look at a subset of the population, called
a sample.
Step 3: Organize and summarize the information.
• Descriptive statistics consists of organizing and summarizing the
information collected, using charts, tables, and numerical summaries.
Step 4: Draw conclusions from the information.

• The information collected from the sample is generalized to the population.
• Inferential statistics uses methods that generalize results obtained from a
sample to the population and measure their reliability.
EXAMPLE The Process of Statistics
Many studies evaluate batterer treatment programs, but there are few experiments designed to
compare batterer treatment programs to non-therapeutic treatments, such as community service.
Researchers designed an experiment in which 376 male criminal court defendants who were accused
of assaulting their intimate female partners were randomly assigned into either a treatment group or a
control group. The subjects in the treatment group entered a 40-hour batterer treatment program
while the subjects in the control group received 40 hours of community service. After 6 months, it
was reported that 21% of the males in the control group had further battering incidents, while 10% of
the males in the treatment group had further battering incidents. The researchers concluded that the
treatment was effective in reducing repeat battering offenses.
Source: The Effects of a Group Batterer Treatment Program: A Randomized Experiment in Brooklyn, by Bruce G. Taylor et al., Justice Quarterly,
Vol. 18, No. 1, March 2001.
Step 1: Identify the research objective.
To determine whether males accused of battering their intimate female
partners who were assigned to a 40-hour batterer treatment program are less
likely to batter again than those assigned to 40 hours of community
service.
Step 2: Collect the information needed to answer the question.
The researchers randomly divided the subjects into two groups. Group 1
participants received the 40-hour batterer program, while group 2 participants
received 40 hours of community service. Group 1 is called the treatment group
and the program is called the treatment. Group 2 is called the control group.
Six months after the program ended, the percentage of males that battered their
intimate female partner was determined.
Step 3: Organize and summarize the information.
The demographic characteristics of the subjects in the treatment and control
groups were similar. Six months after treatment, 21% of the males in the control
group had had further battering incidents, compared with 10% of the males in the
treatment group.
Step 4: Draw conclusions from the data.
We extend the results of the 376 males in the study to all males who batter
their intimate female partner. That is, males who batter their female partner and
participate in a batterer treatment program are less likely to batter again.
Types of data
Data Collection Techniques
 Observations,
 Tests,
 Surveys,
 Document analysis
(the research literature)
Key Factors for High Quality
Experimental Design
Data should not be contaminated by poor measurement or errors in procedure.
Eliminate confounding variables from the study or minimize their effects on the variables.
Representativeness: Does your sample represent the population you are studying? Use random sampling techniques.
What Makes a Good Quantitative Research Design?
4 Key Elements
1. Freedom from Bias
2. Freedom from Confounding
3. Control of Extraneous Variables
4. Statistical Precision to Test Hypothesis
Bias: When observations favor some individuals in the population over others.
Confounding: When the effects of two or more variables cannot be separated.
Extraneous Variables: Any variable, other than the independent variable under study, that has an effect on the dependent variable.
These variables need to be identified and minimized.
e.g., when studying erosion potential as a function of clay content, rainfall intensity,
vegetation, and duration would be considered extraneous variables.
Precision Vs Accuracy
[Figure: four panels illustrating (1) both accurate and precise, (2) accurate but not precise, (3) not accurate but precise, (4) neither accurate nor precise]
Interpreting Results of Experiments
The goal of research is to draw conclusions. What did the study mean?
What, if any, is the cause and effect of the outcome?
Overall Methodology:
* State the objectives of the survey

* Define the target population
* Define the data to be collected
* Define the variables to be determined
* Define the required precision & accuracy
* Define the measurement 'instrument'
* Define the sample size & sampling method, then select the sample
Introduction to Sampling
• Sampling is the problem of accurately acquiring the necessary data in order to form a representative view of the problem.
• This is much more difficult to do than is generally realized.
Sampling Distributions:
When you form a sample, you often show it as a plotted distribution known as a histogram.
A histogram shows the frequency of occurrence of a certain variable within specified ranges.
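As a rough sketch of how the counts behind a histogram are obtained, the snippet below groups the 30 examination marks used in the stem and leaf example of these slides into classes of width 10. The class width and the text-bar output are illustrative choices, not something the slides prescribe.

```python
from collections import Counter

# The 30 examination marks from the stem and leaf example in these slides.
marks = [63, 58, 61, 52, 59, 65, 69, 75, 70, 54, 57, 63, 76, 81, 64,
         68, 59, 40, 65, 74, 80, 44, 47, 53, 70, 81, 68, 49, 57, 61]

class_width = 10  # assumed class width

# Map each mark to the lower bound of its class, e.g. 63 -> 60.
counts = Counter((m // class_width) * class_width for m in marks)

# Print one text bar per class, lowest class first.
for lower in sorted(counts):
    upper = lower + class_width - 1
    print(f"{lower}-{upper}: {'#' * counts[lower]} ({counts[lower]})")
```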
Interpreting quantitative findings
Descriptive Statistics
Example
Descriptive Statistics
• Relation among average, median, and mode
Definition
• Measures of dispersion are descriptive statistics that describe how similar a set of
scores are to each other
• The more similar the scores are to each other, the lower the measure of dispersion will be
• The less similar the scores are to each other, the higher the measure of dispersion will be
• In general, the more spread out a distribution is, the larger the measure of dispersion
will be
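A small sketch can make the idea concrete. The two score sets below are hypothetical (they are not from the slides); both have the same mean, but one is tightly clustered and the other is spread out. The population variance and standard deviation are used here as one possible choice of dispersion measure.

```python
import statistics

# Hypothetical score sets with the same mean (5) but different spread.
narrow = [4, 5, 5, 5, 6]
wide = [1, 3, 5, 7, 9]


def describe(scores):
    """Return three common measures of dispersion for a list of scores."""
    return {
        "range": max(scores) - min(scores),
        "variance": statistics.pvariance(scores),  # population variance
        "std_dev": statistics.pstdev(scores),      # population standard deviation
    }


print("narrow:", describe(narrow))
print("wide:  ", describe(wide))
```

Every measure is larger for the more spread-out set, which matches the definition above.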
Measures of Dispersion
• Which of the distributions of scores has the larger dispersion?
[Figure: two frequency distributions of scores, x-axis 1 to 10, y-axis 0 to 125]
The upper distribution has more dispersion because the scores are more spread out.
That is, they are less similar to each other.
Example

Graphs
 Histogram
 Pareto Chart
 Cause and Effect Diagram
 Check Sheet
 Process flow Diagram
 Scatter Diagram
 Control Chart
Why Graphs?
• To reveal a trend or comparison in data
• Easily understood

(A) Types
There are different kinds of graphical charts based on
statistics as follows:
1. Line graph
2. Pie chart
3. Bar graph
4. Scatter plot
5. Stem and leaf plot
6. Histogram
7. Frequency polygon
8. Frequency curve
9. Cumulative frequency curve (ogive)
Line Graph
• A line joining several points, or a line that shows the relationship
between the points
• Drawn on the x-y plane
• Relates an independent variable and a dependent variable

Example

Pie Charts
• A pie chart is a circular graph divided into disjoint slices, each displaying
the size of some related piece of information.
• The circle represents a whole, and each slice represents a percentage of the
whole.

Advantages
• Visually appealing
• Percentage values are instantly apparent

Preferred use (Limitation)
• Categorical data, where one wants to understand what percentage each
category constitutes

Example

Final Product
Bar Graph
• A bar graph is drawn on an x-y graph and has labelled horizontal or
vertical bars that show different values.
• The size, length, and color of the bars represent different values.
Preferred use (Limitation)
• Non-continuous data
• Comparing or contrasting the size of the different categories of the
data provided.
Example
Stem and Leaf Plot
• A stem and leaf plot, also called a stem plot, is used with quantitative data. It helps to:
• Display the shape of the distribution,
• Organize the numbers, and
• Make the data as comprehensible as possible.

Stem and leaf
• A descriptive technique that emphasizes the data provided
• Reveals more about the shape of a set of data
• Gives a better view of each data value. The data are arranged by “place value”.
• In a stem plot, each data value is divided into two separate parts: a stem and a leaf.
• A stem is usually the first digit of the number; stems are listed in a vertical column.
• A leaf is the last digit of the number; leaves are written in the row to the right of the
corresponding stem.

Stem and Leaf Diagrams
How to Draw One:
1. Put the first digits of each piece of data in numerical order down the left-hand side
2. Go through each piece of data in turn and put the remaining digits in the proper row
3. Re-draw the diagram putting the pieces of data in the right order
4. Add a key

Here are the marks gained by 30 students in an examination:
63 58 61 52 59 65 69 75 70 54 57 63 76 81 64
68 59 40 65 74 80 44 47 53 70 81 68 49 57 61

Write the tens figures in the left-hand column of a diagram. These are the ‘STEMS’.
4 |
5 |
6 |
7 |
8 |

Go through the marks in turn and put in the units figure of each mark
in the proper row. These are the ‘LEAVES’. After the first three marks (63, 58, 61):
4 |
5 | 8
6 | 3 1
7 |
8 |

When all the marks are entered the diagram will look like this:
4 | 0 4 7 9
5 | 8 2 9 4 7 9 3 7
6 | 3 1 5 9 3 4 8 5 8 1
7 | 5 0 6 4 0
8 | 1 0 1

Rewrite the diagram so that the units figures in each row are in order:
4 | 0 4 7 9
5 | 2 3 4 7 7 8 9 9
6 | 1 1 3 3 4 5 5 8 8 9
7 | 0 0 4 5 6
8 | 0 1 1

Add a KEY:
5 | 2 = 52

Remember:
- Always put in a Key
- Always put your data in Order
Median:
- to work out the median, you must find the middle value
- if there are two middle values, you need the average
Range:
- to work out the Range, subtract the smallest number from the
biggest
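The steps above can be sketched in code using the 30 examination marks from the worked example. The helper name `stem_and_leaf` is hypothetical, and the sketch assumes two-digit whole-number data so that the tens digit is the stem and the units digit is the leaf.

```python
import statistics
from collections import defaultdict

# The 30 examination marks from the worked example in these slides.
marks = [63, 58, 61, 52, 59, 65, 69, 75, 70, 54, 57, 63, 76, 81, 64,
         68, 59, 40, 65, 74, 80, 44, 47, 53, 70, 81, 68, 49, 57, 61]


def stem_and_leaf(values):
    """Group two-digit values by tens digit (stem) with ordered units digits (leaves)."""
    rows = defaultdict(list)
    for v in sorted(values):          # sorting first puts the leaves in order
        rows[v // 10].append(v % 10)
    return dict(rows)


diagram = stem_and_leaf(marks)
for stem, leaves in diagram.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 5 | 2 = 52")

median = statistics.median(marks)       # mean of the two middle values when n is even
data_range = max(marks) - min(marks)    # biggest minus smallest
print("Median:", median, "Range:", data_range)
```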

The stem & leaf diagram below shows the masses in kg of some people in a lift.
(a) How many people were weighed?
(b) What is the range of the masses?
(c) Find the median mass.
Stem (tens) | Leaf (units)
3 | 1 4
4 | 3 3 6
5 | 0 3 4 8
6 | 1 2
7 | 2 2 7
8 | 1 6
(a) 16 people.
(b) 86 – 31 = 55 kg
(c) The median is the mean of the 8th and 9th data values: (54 + 58) / 2 = 56 kg
Frequency Polygon
• The frequency polygon has most of the properties of a histogram, with an extra
feature: the midpoint of each class is marked on the x-axis. Each midpoint and its
class frequency give a plotting point, and these points are connected with line segments.
• The graph is then closed by joining it to the x-axis at both ends. A frequency
polygon gives a less accurate representation of the distribution than a histogram,
because it represents the frequency of each class by a single point rather than by the
whole class interval.
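A minimal sketch of the plotting points for a frequency polygon, using the 30 examination marks from the stem and leaf example. The class limits (40 up to, but not including, 50, and so on) are an assumption made for illustration; the polygon is closed by adding a zero-frequency midpoint on each side.

```python
# The 30 examination marks from the stem and leaf example in these slides.
marks = [63, 58, 61, 52, 59, 65, 69, 75, 70, 54, 57, 63, 76, 81, 64,
         68, 59, 40, 65, 74, 80, 44, 47, 53, 70, 81, 68, 49, 57, 61]

class_width = 10                       # assumed class width
lower_bounds = range(40, 90, class_width)

# Frequency of each class [lower, lower + width).
frequency = {lo: sum(lo <= m < lo + class_width for m in marks)
             for lo in lower_bounds}

# One plotting point per class: (class midpoint, class frequency).
points = [(lo + class_width / 2, f) for lo, f in frequency.items()]

# Close the polygon by joining it to the x-axis at both ends.
closed = ([(points[0][0] - class_width, 0)]
          + points
          + [(points[-1][0] + class_width, 0)])

for midpoint, f in closed:
    print(midpoint, f)
```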
Example
Final Product
Frequency Curve
• The frequency polygon consists of sharp turns and ups and downs that do not
conform to actual conditions.
• To remove these sharp features, the polygon is smoothed. No definite rule for
smoothing the polygon can be laid down.
• The smoothed curve should not deviate sharply from the polygon.
• To draw a satisfactory frequency curve, first draw the frequency histogram, then
the frequency polygon, and finally the frequency curve.
Example
Cumulative Frequency (OGIVE)
• A cumulative frequency graph (ogive) plots cumulative frequencies on the y-axis and class
scores on the x-axis.
• The difference between a frequency curve and an ogive is that in the latter we plot the
cumulative frequency on the y-axis rather than the individual frequencies.
• Advantage: it enables the median, quartiles, etc. to be read from the graph.
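As a rough illustration, the snippet below builds the plotting points of an ogive from class frequencies. The frequencies are those of the 30 examination marks in the stem and leaf example, grouped into assumed classes of width 10 (40 to 50, 50 to 60, and so on); each cumulative frequency is plotted against the upper boundary of its class.

```python
from itertools import accumulate

# Upper class boundaries and class frequencies (assumed classes of width 10
# applied to the 30 examination marks from the stem and leaf example).
upper_bounds = [50, 60, 70, 80, 90]
frequencies = [4, 8, 10, 5, 3]

# Running total of the frequencies gives the cumulative frequencies.
cumulative = list(accumulate(frequencies))
ogive_points = list(zip(upper_bounds, cumulative))

# The median class is the first class whose cumulative frequency reaches n / 2.
n = cumulative[-1]
median_class_upper = next(ub for ub, cf in ogive_points if cf >= n / 2)

print(ogive_points)
print("Median lies in the class ending at", median_class_upper)
```

Reading the median from the ogive amounts to finding where the curve crosses n / 2 on the y-axis; here that happens in the 60 to 70 class.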
Example
Pareto Diagram
• Vilfredo Pareto (1848–1923), Italian economist
• Observed that 20% of the population has 80% of the wealth.
• Joseph Juran used the term “vital few and trivial many”.
• He noted that 20% of the quality problems caused 80% of the dollar loss.
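The table behind a Pareto diagram can be sketched as follows. The defect categories and counts are hypothetical, invented only to illustrate the calculation: categories are sorted from most to least frequent and a cumulative percentage is attached to each.

```python
# Hypothetical defect counts, for illustration only.
defects = {
    "Scratches": 45,
    "Dents": 25,
    "Misalignment": 15,
    "Discoloration": 10,
    "Other": 5,
}

# Sort categories from the most frequent to the least frequent.
ordered = sorted(defects.items(), key=lambda item: item[1], reverse=True)
total = sum(defects.values())

# Build rows of (category, count, cumulative percentage).
pareto_table = []
running = 0
for category, count in ordered:
    running += count
    pareto_table.append((category, count, 100 * running / total))

for category, count, cum_pct in pareto_table:
    print(f"{category:<15}{count:>4}{cum_pct:>8.1f}%")
```

In this made-up data the first two categories account for 70% of all defects, which is the kind of “vital few” pattern the diagram is meant to expose.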
Cause – and – Effect Diagram
 Show the relationships between a problem and its possible causes.

 Developed by Kaoru Ishikawa ( 1953 ).
 Also Known as :
 Fishbone diagrams
 Ishikawa diagrams
Check Sheet
Example
Flow Chart
Outpatient clinics department flow chart:

Scatter plot
• A scatter plot or scatter graph is drawn in Cartesian coordinates to visually represent
the values of two variables for a set of data. It shows how one variable is related to the other.
• The data are presented as a collection of points: the value of one variable (the explanatory
variable) is positioned on the horizontal or x-axis.
• The value of the other variable (the response variable) is positioned on the vertical or y-axis.
• Used to illustrate the relationship between two variables by plotting one against the
other.

Example
Note that these data are not random

Run Charts (Time Series Plots)
• Basis for the control chart.

Control Charts
• Why use control charts?
• To monitor, control, and improve system or process performance over time by
studying variation and its source.
• What do control charts do?
• Focus attention on detecting and monitoring process variation over time.
• Distinguish special from common causes of variation.
• Serve as a tool for ongoing control of a process.

Variation in a process is due to:
• Random causes (common causes)
• Assignable causes (special causes)

Common cause:
• Variation that is completely random.
• Random causes that we cannot identify.
• Unavoidable (e.g. slight differences in process variables like diameter, weight, service time,
temperature).
Special cause:
• Variation that can appear within or outside the control limits, e.g. trends, step functions, shifts, etc.

Control Chart
• A control chart is a time plot of a statistic, such as a sample mean, range,
standard deviation, or proportion, with a center line and upper and lower control limits.
• The limits give the range of values expected for the statistic when the process is in control.
• When the statistic is outside the limits, or when its time plot reveals certain
patterns, the process may be out of control.

Control Chart
Upper Control Limit (UCL) = µ + Kσ
Center Line (CL) = µ
Lower Control Limit (LCL) = µ − Kσ
where:
K = distance of the control limits from the center line, in multiples of σ
µ = mean of the sample statistic
σ = standard deviation of the sample statistic
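A minimal sketch of these limits in code, under stated assumptions: the sample means below are hypothetical, K = 3 is a common but not mandatory choice, and µ and σ are estimated directly from the plotted values. In practice σ is often estimated differently (for example from subgroup ranges), so treat this only as an illustration of the formulas.

```python
import statistics

# Hypothetical sample means from eight subgroups, for illustration only.
sample_means = [10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.0]

k = 3  # assumed distance of the limits from the center line, in multiples of sigma

center_line = statistics.mean(sample_means)   # estimate of mu
sigma = statistics.stdev(sample_means)        # simple estimate of sigma

ucl = center_line + k * sigma
lcl = center_line - k * sigma

# Points outside the limits signal that the process may be out of control.
out_of_control = [x for x in sample_means if x > ucl or x < lcl]

print(f"UCL = {ucl:.2f}, CL = {center_line:.2f}, LCL = {lcl:.2f}")
print("Points outside the limits:", out_of_control)
```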

Control Chart
• Control charts are a technique for improving productivity.
• Control charts are effective in defect prevention.
• Control charts prevent unnecessary process adjustment.
• Control charts provide diagnostic information.
• Control charts provide information about process capability.

Correlation & Regression Analysis
Applied Statistical Methods (BMTH113)-----Dr. Rasha El Kholy----Eslsca

• Introduction
• Correlation Analysis
• Coefficient of Determination
• Simple Linear Regression
Introduction:
• Recall that a scatter diagram is a graphical technique used to describe the
relationship between two quantitative variables.
• In this chapter we carry this idea further. We are going to calculate numerical
measures (correlation analysis) to express the strength of the relationship
between two variables.
• In addition, an equation (the regression line) is used to express the
relationship between variables, allowing us to estimate (predict) one
variable on the basis of another.
• The Dependent Variable is the variable being predicted or estimated (denoted by y).
• The Independent Variable provides the basis for estimation. It is the predictor
variable (denoted by x).
Examples:
1. Age of a bus (independent) and maintenance cost (dependent)
2. Auction price (dependent) and odometer reading (independent)
Correlation Analysis
• Correlation Analysis is the study of the relationship between variables. It is also
defined as group of techniques to measure the association between two variables.
• The first step in correlation analysis is drawing a scatter diagram to portray the
relationship between the two variables.

1- Correlation Coefficient
The correlation coefficient (r) measures the strength of the linear relationship between
two variables.
• It requires interval or ratio-scaled data.
• It can range from -1.00 to 1.00.
[Figure: scale of r from -1.00 to 1.00; r near 0 indicates no linear relationship]
[Figure: two scatter diagrams, one showing a strong negative linear relationship (r = -0.933) and one showing a moderate positive linear relationship (r = 0.518)]
1- Correlation Coefficient: Example
The sales manager of Copier Sales of America wants to determine whether there is a
relationship between the number of sales calls made in a month and the number of
copiers sold that month. The manager selects a random sample of 10
representatives and determines the number of sales calls each representative made
last month and the number of copiers sold.

Scatter diagram:
X: sales calls; Y: copiers sold
[Figure: scatter diagram showing a positive linear relationship. Is it strong or moderate?]
Choose the cell where you want the results to be shown.
r = 0.759
r = 0.759: what does that mean?
There is a direct linear relationship between the number of sales calls and the number of copiers
sold. The association is strong.
Note that this doesn’t mean that more sales calls cause more copier sales. We have not
demonstrated cause and effect here, only that the two variables (sales calls and copiers sold) are
related.
Correlation vs. causality: just because two things occur together does not mean that one is
the cause of the other. For example, an increase in ice-cream consumption in the summer is
correlated with an increased rate of drowning deaths, but this doesn’t mean that ice-cream
consumption causes drowning.
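To show how a value such as r = 0.759 is computed, here is a minimal sketch of the Pearson correlation coefficient. The data table for the copier example did not survive in this text, so the ten pairs below are assumed values, chosen because they reproduce the r quoted above; they should not be read as the original data. The function name `pearson_r` is hypothetical.

```python
import math

# Assumed data for ten sales representatives (not preserved in the slides).
sales_calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
copiers_sold = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(sales_calls, copiers_sold)
print("r   =", round(r, 3))
print("r^2 =", round(r ** 2, 3))
```

The sign of r gives the direction of the relationship and its magnitude gives the strength; squaring it gives the proportion of variation explained.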
https://www.e-education.psu.edu/marcellus/node/636
2- Coefficient of Determination
The coefficient of determination (r²) is the proportion of the total variation in
the dependent variable (Y) that is explained by the variation in the
independent variable (X). It is the square of the coefficient of correlation.
• It ranges from 0 to 1.
• It does not give any information on the direction of the relationship between the variables.
2- Coefficient of Determination: Example
Referring to the sales calls and copiers sales example:
The coefficient of determination
r² = (0.759)² = 0.576
It means that 57.6% of the variation in the number of copiers sold is
explained, or accounted for, by the variation in the number of sales calls.
3- Simple Linear Regression Model
We want to draw a straight line that passes among the points and fits (represents) the data fairly well.
This line is called the “best fit line” or “regression line”.
It is used to predict future values.
Simple Linear Regression
• Simple: only one independent variable (X).
• Linear: a straight line can represent the relationship.
• Regression: equation of a line.
The equation of the regression line (the best fit line) is:
Ŷ = a + bX
where Ŷ (read “y hat”) is the estimated value of Y for a given value of X,
a is the Y-intercept, the value of Y when X = 0,
b is the slope, the change (increase or decrease) in Y when X increases by
one unit.
a and b are the regression coefficients.
3- Simple Linear Regression Model: Example
Recall the example of Copier Sales. Find the linear regression line to express the
relationship between the two variables. What is the expected number of copiers sold by a
representative who made 20 calls?

[Spreadsheet output not reproduced; callouts mark the absolute value of r, r², and the linear regression equation]
The linear regression equation is: Ŷ = 18.95 + 1.18X
Interpretation of the regression coefficients:
Around 19 copiers are expected to be sold even if there are no sales calls. For every additional
call, the expected sales of copiers increase by 1.18.
The estimated number of copiers sold by a representative who made 20 calls is
Ŷ = 18.95 + 1.18(20) = 42.55
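The regression coefficients can be obtained with the least squares formulas b = Sxy / Sxx and a = ȳ − b·x̄. The sketch below applies them to assumed data: the original table for the copier example is not preserved in this text, and the ten pairs here were chosen because they reproduce the equation Ŷ = 18.95 + 1.18X quoted above. The function name `least_squares` is hypothetical.

```python
# Assumed data for ten sales representatives (not preserved in the slides).
sales_calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
copiers_sold = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]


def least_squares(x, y):
    """Return (a, b) for the least squares line y_hat = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # intercept
    return a, b


a, b = least_squares(sales_calls, copiers_sold)
predicted = a + b * 20          # expected copiers sold for 20 sales calls

print(f"Y_hat = {a:.2f} + {b:.2f} X")
print(f"Prediction for 20 calls: {predicted:.2f}")
```

With unrounded coefficients the prediction comes out slightly above the 42.55 shown above, because the slide rounds a and b to two decimals before substituting X = 20.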

The regression line is determined by the least squares method.
Summary:
Scatter Diagram → linearity, direction
Correlation Coefficient (r) → direction, strength
Coefficient of Determination (r²) → % of variation in Y explained by variation in X
Regression Model → direction, prediction
Example 1:
• Ten students were selected at random, and a comparison was made of their high school grade
point averages (GPAs) and their grade point averages at the end of their first year in college.
(a) What kind of correlation is present between high school GPA and college GPA?
(b) Using the regression line, what would be the predicted college GPA
if a student has a high school GPA of 3.4?
a) r = 0.941088: positive, strong linear relationship.
b) Ŷ = −0.45 + 1.1X. Interpretation!
Ŷ = −0.45 + 1.1(3.4) = 3.29. The estimated college GPA is about 3.3 for a high school GPA
of 3.4.

Example 2:
• The data were obtained in a study on the number of absences and the final grades of seven
randomly selected students from a statistics class.
(a) What kind of correlation is present between final grade and absences?
(b) Predict the final grade of a student who missed 3 classes.
a) r = −0.944: negative, strong linear relationship.
b) Ŷ = 102.49 − 3.62X. Interpretation!
Ŷ = 102.49 − 3.62(3) = 91.63. The estimated final grade is 91.63 for a student who missed 3 classes.
Random Variables, Probability Distributions & Normal Distribution
Learning Objectives
1. Understand what a random variable is.
2. Identify the characteristics of a probability distribution.
3. Distinguish between discrete and continuous random variables.
4. Describe the characteristics of a normal probability distribution.
5. Describe the standard normal probability distribution and use it
to calculate probabilities.
6. Convert a normal random variable to a standard normal random variable to
calculate probabilities.
7. Apply the Empirical Rule.
8. Find the value x from a given probability.
Example
Suppose we are interested in the number of heads showing face up when we toss a coin 3 times.
How many and what are the possible outcomes of this experiment?
If we are interested in the number of heads that appear, what are the possible values?
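One way to answer these questions is to enumerate the outcomes directly. The sketch below assumes a fair coin (an assumption not stated in the example), so that all outcomes are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All possible outcomes of tossing a coin 3 times, e.g. "HHT".
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

# Count how many outcomes give each number of heads.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# Assuming a fair coin, each outcome has probability 1/8.
distribution = {
    heads: Fraction(count, len(outcomes))
    for heads, count in sorted(head_counts.items())
}

print(len(outcomes), outcomes)
for heads, probability in distribution.items():
    print(heads, probability)
```

There are 8 outcomes, and the number of heads can take the values 0, 1, 2, or 3.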
Example…cont’d
Using these data, we construct what we can call a Probability Distribution.
[Table: possible values of the random variable and their corresponding probabilities]
Example…cont’d
Also we can represent the Probability distribution using a chart
What is a Probability Distribution?
A probability distribution of a random variable is a listing of all possible values of the
random variable and the probability associated with each outcome.
CHARACTERISTICS OF A PROBABILITY DISTRIBUTION
1. The probability of a particular outcome (value) is between 0 and 1 inclusive.
2. The outcomes (values) are mutually exclusive.
3. The list of outcomes (values) is exhaustive.
So the sum of the probabilities of the outcomes (values) is 1.
Random Variable
A quantity resulting from a random experiment that, by chance, can assume different values.
It is a mapping of the outcomes of a random experiment to numbers.
A random variable is denoted by a capital letter: X, Y, Z, …

Random Variable Examples
1) Rolling 2 dice and observing the sum of the numbers on the 2 upward faces.
2) A bank counting the number of credit cards carried by a group of customers.
3) Distance traveled to work each day.
4) Waiting time (in minutes) to get a service.

Two Types of Random Variables: 1- Discrete Random Variable
One type of random variable is the discrete random variable.
A discrete random variable is a random variable that can assume only certain clearly separated (distinct) values.
Discrete variables are usually the result of counting.
Examples
• Tossing a coin three times and counting the number of heads.
• A bank counting the number of credit cards carried by a group of customers.
• Number of students in a class.
1- Discrete Random Variable
The probability distribution for the number of cards carried can take the following form:

Number of Credit Cards    Prob. (Relative Frequency)
0                         .03
1                         .10
2                         .18
3                         .21
4 or more                 .48
Total                     1.00
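The characteristics of a probability distribution can be checked mechanically for this table. The following is a minimal sketch; the dictionary and function names are made up for the example, and the probabilities are the ones listed above.

```python
from math import isclose

# Probabilities from the credit card table above.
credit_card_distribution = {
    "0": 0.03,
    "1": 0.10,
    "2": 0.18,
    "3": 0.21,
    "4 or more": 0.48,
}


def is_valid_distribution(probabilities: dict) -> bool:
    """Check that every probability is in [0, 1] and that they sum to 1."""
    values = list(probabilities.values())
    return all(0 <= p <= 1 for p in values) and isclose(sum(values), 1.0)


print(is_valid_distribution(credit_card_distribution))
```

Using distinct dictionary keys reflects the requirement that the outcomes are mutually exclusive; the sum check reflects that the list is exhaustive.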
Two Types of Random Variables: 2- Continuous Random Variable
Continuous random variables can assume an infinite number of values within a given range.
Continuous variables are usually the result of measuring.
Examples
• The flight time between Atlanta and LA: 4.67 hours, 5.13 hours, and so on.
• The annual snowfall in Minneapolis, measured in inches.
• Sales, in dollars, of a certain company.
Normal Probability Distribution
• The normal probability distribution is a continuous distribution that is widely used in
theory and in practice.
• It is used to model natural phenomena such as height, IQ scores, …
• It has the following characteristics:

• It is bell-shaped and has a single peak at the center of the distribution
• It is symmetric about the mean
• It is asymptotic, meaning the curve approaches but never
touches the X-axis
• It is completely described by its mean and standard deviation
• The area under the curve equals 1
Normal Probability
Distribution
The Normal curve…
Normal Probability Distribution
• There is a family of normal probability distributions: there is an infinite number of
normal distributions, one for each combination of mean and standard deviation.
[Figures: Equal Means and Different Standard Deviations; Different Means and Standard Deviations;
Different Means and Equal Standard Deviations]

Normal Probability Distribution
The probabilities are the areas under the curve.
Calculating an area under the curve requires integrating a function; instead, we use a table
for a reference normal distribution called the standard normal distribution.
Standard Normal Probability Distribution
The standard normal probability distribution is a particular normal
distribution with a mean of 0 and a standard deviation of 1.
• It is always denoted by Z
• We are provided with a table for the area under the curve for Z
• The standard normal curve has the following characteristics:
• It is bell-shaped and has a single peak at the center of the distribution
• It is asymptotic, meaning the curve approaches but never touches the X-axis
• The area under the curve equals 1
• It is symmetric about zero
• The area to the left of zero equals the area to the right of zero
How to read the Z table?
P(0 < Z < 0.56)
P(−0.24 < Z < 0)
P(Z > 1.96)
P(Z < 0.56)
Example: Find
(a) P(Z > 2)
(b) P(−2 < Z < 2)
(c) P(0 < Z < 1.73)
(d) P(1 < Z < 2)
(e) P(Z < −1.94)
(f) P(−2.1 < Z < 0.84)
(g) P(−2.5 < Z < −2.09)
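Table lookups like these can be cross-checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper `prob_between` and the `answers` dictionary are illustrative names. Because the Z table is rounded to four decimals, hand-computed answers may differ from these in the last digit.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_between(a: float, b: float) -> float:
    """P(a < Z < b), computed as the difference of two cumulative areas."""
    return Z.cdf(b) - Z.cdf(a)


answers = {
    "a": 1 - Z.cdf(2),              # P(Z > 2)
    "b": prob_between(-2, 2),       # P(-2 < Z < 2)
    "c": prob_between(0, 1.73),     # P(0 < Z < 1.73)
    "d": prob_between(1, 2),        # P(1 < Z < 2)
    "e": Z.cdf(-1.94),              # P(Z < -1.94)
    "f": prob_between(-2.1, 0.84),  # P(-2.1 < Z < 0.84)
    "g": prob_between(-2.5, -2.09), # P(-2.5 < Z < -2.09)
}

for label, probability in answers.items():
    print(f"({label}) {probability:.4f}")
```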

Recall…
The standard normal probability distribution is a particular
normal distribution. It has a mean of 0 and a standard deviation
of 1.
How to calculate the Z-value?

Any normal probability distribution, with mean μ and standard
deviation σ, can be converted to the standard normal probability
distribution with the following formula:
If X ~ N(μ, σ²) then Z = (X − μ)/σ ~ N(0, 1)

Example
Suppose the weekly income of Uber drivers follows the normal
probability distribution with a mean of $1,000 and a standard
deviation of $100.
a) What is the probability that an Uber driver earns between $1,000 and $1,100?
P($1,000 < X < $1,100) = P((1000 − μ)/σ < (X − μ)/σ < (1100 − μ)/σ)
= P((1000 − 1000)/100 < Z < (1100 − 1000)/100)
= P(0 < Z < 1) = 0.3413
Example…cont’d
b) What is the probability that an Uber driver earns less than $1,200
weekly?
P(X < $1,200) = P((X − μ)/σ < (1200 − μ)/σ)
= P(Z < (1200 − 1000)/100)
= P(Z < 2)
= 0.5 + 0.4772 = 0.9772
Example…cont’d
c) What is the probability that an Uber driver earns between $900 and $1,100 weekly?
P($900 < X < $1,100) = P((900 − 1000)/100 < Z < (1100 − 1000)/100)
= P(−1 < Z < 1) = 2(0.3413) = 0.6826
d) What is the probability that an Uber driver earns between $800 and $1,200 weekly?
P($800 < X < $1,200) = P((800 − 1000)/100 < Z < (1200 − 1000)/100)
= P(−2 < Z < 2) = 2(0.4772) = 0.9544
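The four parts of the Uber example can be checked with a short script. This is a sketch using `statistics.NormalDist` from the Python standard library; the names `income`, `z_score`, and `prob_between` are illustrative. Results agree with the table-based answers up to rounding in the fourth decimal.

```python
from statistics import NormalDist

# Weekly income: normal with mean $1,000 and standard deviation $100.
income = NormalDist(mu=1000, sigma=100)


def z_score(x: float, dist: NormalDist) -> float:
    """Standardize x: Z = (X - mu) / sigma."""
    return (x - dist.mean) / dist.stdev


def prob_between(dist: NormalDist, low: float, high: float) -> float:
    """P(low < X < high) for a normal random variable X."""
    return dist.cdf(high) - dist.cdf(low)


p_a = prob_between(income, 1000, 1100)  # part a
p_b = income.cdf(1200)                  # part b
p_c = prob_between(income, 900, 1100)   # part c
p_d = prob_between(income, 800, 1200)   # part d

print(z_score(1100, income), z_score(1200, income))
print(f"{p_a:.4f} {p_b:.4f} {p_c:.4f} {p_d:.4f}")
```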
The Empirical Rule
For any symmetric, bell-shaped distribution:
1. About 68% of the observations lie within μ ± 1σ
2. About 95% of the observations lie within μ ± 2σ
3. About 99.7% of the observations lie within μ ± 3σ
To verify the Empirical Rule for the standard normal distribution:
P(0 < Z < 1) = 0.3413, so
0.3413 × 2 = 0.6826 or about 68%
P(0 < Z < 2) = 0.4772, so
0.4772 × 2 = 0.9544 or about 95%
P(0 < Z < 3) = 0.4987, so
0.4987 × 2 = 0.9974 or about 99.7%
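The same verification can be done without a table, using the identity P(−k < Z < k) = erf(k/√2) for a standard normal Z. The sketch below relies only on the Python standard library; `prob_within` is an illustrative name.

```python
from math import erf, sqrt


def prob_within(k: float) -> float:
    """P(-k < Z < k) for a standard normal Z, via the error function."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
```

The computed values differ from the table-based ones only in the fourth decimal, because the table entries are rounded before doubling.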
The Empirical Rule…
Example
As part of its quality assurance program, the Autolite Battery
Company conducts tests on battery life. For a particular D-cell
alkaline battery, the mean life is 19 hours. The useful life of the
battery follows a normal distribution with a standard deviation
of 1.2 hours.
1. About 68% of the batteries failed between what two values?
μ ±1 σ; 19 ± 1(1.2) hours;
About 68% of batteries will fail between 17.8 and 20.2 hours.
2. About 95% of the batteries failed between what two values?
μ ±2 σ; 19 ± 2(1.2) hours;
About 95% of batteries will fail between 16.6 and 21.4 hours.
3. Virtually all of the batteries failed between what two values?
μ ±3 σ; 19 ± 3(1.2) hours;
Practically all (99.7%) of the batteries will fail between 15.4 and 22.6 hours.
Finding the value of Z given a probability
Find the value of z satisfying each of the following:
(a) P(Z > z) = 0.5
(b) P(Z < z) = 0.8643
(c) P(−z < Z < z) = 0.9
(d) P(−z < Z < z) = 0.99
(e) P(Z < z) = 0.33
Answers: a) z = 0   b) z = 1.1   c) z = 1.645   d) z = 2.575   e) z = −0.44
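Reading the table "backwards" corresponds to evaluating the inverse cumulative distribution function. The sketch below uses `NormalDist.inv_cdf` from the Python standard library; the variable names are illustrative. Each problem is first rewritten as a left-tail area P(Z < z). The results are slightly more precise than table interpolation (for example about 2.576 rather than 2.575).

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

z_a = Z.inv_cdf(1 - 0.5)         # P(Z > z) = 0.5   ->  P(Z < z) = 0.5
z_b = Z.inv_cdf(0.8643)          # P(Z < z) = 0.8643
z_c = Z.inv_cdf((1 + 0.90) / 2)  # P(-z < Z < z) = 0.90  ->  P(Z < z) = 0.95
z_d = Z.inv_cdf((1 + 0.99) / 2)  # P(-z < Z < z) = 0.99  ->  P(Z < z) = 0.995
z_e = Z.inv_cdf(0.33)            # P(Z < z) = 0.33

for label, z in zip("abcde", (z_a, z_b, z_c, z_d, z_e)):
    print(f"({label}) z = {z:.3f}")
```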
Finding the value of X given a probability
Example 1: Scores on an examination are assumed to be
normally distributed with mean 78 and variance 36.
(a) Suppose that students scoring in the top 10% of this
distribution are to receive an A grade. What is the minimum
score a student must achieve to earn an A grade?
P(Y > A) = 0.1 ⇒ P(Z > z) = 0.1
⇒ z = 1.285 = (A − 78)/6 ⇒ A = 78 + 1.285 × 6 = 85.71
(b) What must be the cutoff point for passing the
examination if the examiner wants only the top 72% of all
scores to be passing?
P(Y > k) = 0.72 ⇒ P(Z > z) = 0.72
⇒ z = −0.585 = (k − 78)/6 ⇒ k = 78 + (−0.585) × 6 = 74.49
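The two cutoffs can also be computed directly from the inverse cumulative distribution. The sketch below is an illustration with assumed names (`scores`, `cutoff_for_top_fraction`); note that the standard deviation is the square root of the stated variance. The results differ from the worked answers in the second decimal because the table values of z are interpolated.

```python
from math import sqrt
from statistics import NormalDist

# Mean 78 and variance 36, so the standard deviation is 6.
scores = NormalDist(mu=78, sigma=sqrt(36))


def cutoff_for_top_fraction(dist: NormalDist, top_fraction: float) -> float:
    """Smallest score x such that P(X > x) equals top_fraction."""
    return dist.inv_cdf(1 - top_fraction)


a_grade_cutoff = cutoff_for_top_fraction(scores, 0.10)  # part (a)
passing_cutoff = cutoff_for_top_fraction(scores, 0.72)  # part (b)

print(f"A grade: {a_grade_cutoff:.2f}, passing: {passing_cutoff:.2f}")
```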
Example 2: Layton Tire and Rubber Company wishes to set a
minimum mileage guarantee on its new MX100 tire. Tests reveal
the mean mileage is 67,900 with a standard deviation of 2,050
miles and that the distribution follows the normal distribution.
Let x represent the minimum guaranteed mileage so that no more
than 4% of tires need to be replaced. Find the value of x.
z = (x − μ)/σ = (x − 67,900)/2,050, and from the table we find z = −1.755,
so −1.755 = (x − 67,900)/2,050; therefore, x = 64,302.25 miles
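Here the 4% is a left-tail area, so the guarantee is the 4th percentile of the mileage distribution. A brief sketch follows, with illustrative variable names. The exact percentile uses z ≈ −1.751 rather than the interpolated table value −1.755, so the result is a few miles higher than the worked answer.

```python
from statistics import NormalDist

mileage = NormalDist(mu=67_900, sigma=2_050)

# Mileage below which only 4% of tires are expected to fall.
guaranteed_mileage = mileage.inv_cdf(0.04)

print(f"{guaranteed_mileage:,.0f} miles")
```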
Acceptance Sampling
• Meaning of Acceptance Sampling or Sampling Inspection
• Classification of Acceptance Sampling
• Terms Used in Acceptance Sampling
• Advantages of Acceptance Sampling
• Limitations of Acceptance Sampling

Meaning of Acceptance Sampling or Sampling Inspection
• One method of controlling the quality of a product is 100% inspection, which requires huge
expenditure in terms of time, money, and labor. Moreover, because of the boredom and fatigue involved in a
repetitive inspection process, defects may be overlooked and some defective products
may pass the inspection point.
• Acceptance sampling instead inspects a sample drawn from the batch (lot) and accepts or rejects
the whole batch based on the results of that sample.

Classification of Acceptance Sampling
• (i) Acceptance sampling on the basis of attributes, i.e. GO and NOT GO gauges.
• (ii) Acceptance sampling on the basis of variables.

The following terms are generally used in acceptance sampling:
• (i) Acceptable Quality Level (AQL):
• (iii) Average Outgoing Quality (A.O.Q):
• Operating Characteristic Curve or O.C. Curve

Advantages of Acceptance Sampling
• (i) The method is applicable in industries where there is mass production and a
set production procedure is followed.
• (ii) The method is economical and easy to understand.
• (iii) It causes less fatigue and boredom.
• (iv) The computation work involved is comparatively small.
• (v) The people involved in inspection can be trained easily.
• (vi) Products that are destroyed by inspection can be inspected by sampling.
• (vii) Because the inspection process is quick, scheduling and delivery times are improved.

Limitations of Acceptance Sampling
• (i) It does not give 100% assurance of conformance to specifications, so there is always
some likelihood/risk of drawing a wrong inference about the quality of the batch/lot.
• (ii) The success of the system depends on sampling randomness, the quality characteristics to
be tested, the batch size, and the criteria for acceptance of the lot.

Producer’s and Consumer’s Risk:
The acceptance or rejection of the whole batch of products in acceptance sampling depends
upon the results of the sample inspected. There is always a chance that a sample may not be truly
representative of the batch or lot from which it is drawn.
This leads to the following two types of risk:

(i) Producer’s risk: the risk that a good lot is rejected.
(ii) Consumer’s risk: the risk that a bad lot is accepted.
Single and Double Sampling Plan
Count the number of defectives, ‘d’, in the sample of size ‘n’.
Is ‘d’ ≤ ‘c’?
If yes, then accept the lot.
If no, then reject the lot.
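The single sampling decision rule, and the acceptance probability that an O.C. curve plots, can be sketched as follows. Everything beyond the rule d ≤ c is an assumption made for illustration: c is treated as the acceptance number, the plan n = 50, c = 2 is hypothetical, and the binomial model assumes the lot is large relative to the sample.

```python
from math import comb


def accept_lot(d: int, c: int) -> bool:
    """Single sampling plan: accept the lot when defectives d do not exceed c."""
    return d <= c


def prob_accept(n: int, c: int, p: float) -> float:
    """Probability of acceptance when each item is defective with probability p.

    Binomial model: P(d <= c) = sum over k = 0..c of C(n, k) p^k (1 - p)^(n - k).
    """
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


# Hypothetical plan: sample n = 50 items, accept if at most c = 2 are defective.
print(accept_lot(d=1, c=2), accept_lot(d=3, c=2))
for fraction_defective in (0.01, 0.02, 0.05, 0.10):
    print(fraction_defective, round(prob_accept(50, 2, fraction_defective), 4))
```

Plotting `prob_accept` against the fraction defective gives the O.C. curve for the plan; the probability of acceptance falls as lot quality worsens.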

Single and Double Sampling Plan

Switching rules
[Figure: switching between Reduced, Normal, and Tightened inspection]

Discussion
Thank you
