ScientificMethod Part III HCMUT-HCMUNS v2
BK
TP.HCM
Formulating a Problem
[Diagram: Bibliographic Search; ∃ literature or public?; Finalized Problem]
Research Planning
Identifying issues and suggesting research directions are part of problem definition
(literature review and creativity).
The remaining part is to demonstrate that the suggested direction could bring new
knowledge in terms of:
Extended/limited applicability of a theory,
Extended/limited performance of an automation task,
New theory and the boundary of its applicability,
Extended/limited precision of the use of a technology in a “system”.
Practically:
Modeling is required to demonstrate (on paper) the potential of the
suggested direction.
Most frequent methods in CSE:
Graphs to illustrate the static (nodes) and dynamic (arcs) parts of a
“system”,
Physics that generates a description of the actual world, leading to a continuous
mathematical abstraction, then discrete mathematics, and algorithms
(pseudo-programming).
Testing can be performed to see whether or not the desired results are
met. This is sometimes called Simulation.
Validation consists in conducting experiments (with or without
human participants) that look for evidence supporting or contradicting
what the research topic claims.
Required Methods
Observation - Selection,
Modeling - Testing,
Simulation - Validation,
Analyzing - Synthesizing - Reporting.
Conducting Research
[Diagram: Finalize Problem, Planning, Modeling, Simulation, Experimentation, Observation, Data Analysis, Validation, Expectations Match, Communicating, Reports]
Research Methods
Descriptive Methods
  Observational Methods
  Survey Methods
Predictive Methods
  Correlational Research
  Quasi-Experimental Design
  Single-Case Research
Explanatory Methods
  Between-Participants Experimental Design
  Correlated-Groups and Developmental Designs
  Advanced Experimental Designs
Observational Method
• Simple in form but can be complex in nature because, in
most cases, a phenomenon must be observed together with the many
causes leading to its spatiotemporal occurrence.
• These causes can be physical or can stem from human social
behavior, and shall be validated under laboratory or real-world
conditions.
• Two types of observational studies:
• Naturalistic (or field) observation
• Laboratory (or systematic) observation
Computational Mathematics:
Physics-based models (where observation is “sometimes” required),
Mathematical models,
Discretization models,
Programming models.
Naturalistic Observation
Ecological validity.
The extent to which research can be generalized to
real-life situations (Aronson & Carlsmith, 1968)
Options
Laboratory Observation
Options
Approach to Observation
Example:
Effect on a particle, e.g. an electron, when light is used for observation.
The in-coming light collides with the electron, and the electron causes
the light to change direction.
The observer detects the out-going light.
At that moment, the state of the particle has already changed.
Observation Method
Observe
Note
Observed population
Analyzing relations
Observation
Observation:
We are able to describe variations and evolutions, but we may
forget them after a certain time.
The changes must therefore be noted.
Sampling
Sampling
[Figure: plot of f(x) against x; both axes are marked at 1]
Sampling
Sampling
Physical Method:
Example: «Wheel of Fortune»
Equal chances for all numbers.
Each of the N numbers corresponds to a member of the whole population, which must be sorted a priori.
This method can only be applied when N is relatively small, i.e. N < 100.
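The wheel-of-fortune idea can be imitated in software. The following is a minimal sketch, assuming a hypothetical population of 50 named members; the function name `draw_sample` and the data are illustrative and do not come from the lecture.

```python
import random

def draw_sample(population, n, seed=None):
    """Simple random sample without replacement: every member has the same chance."""
    rng = random.Random(seed)
    # Number the members 1..N beforehand (the a priori listing on the wheel).
    numbered = dict(enumerate(population, start=1))
    # "Spin the wheel" n times without drawing the same number twice.
    drawn_numbers = rng.sample(sorted(numbered), n)
    return [numbered[i] for i in drawn_numbers]

# Hypothetical population of N = 50 members.
population = [f"member-{i:02d}" for i in range(1, 51)]
sample = draw_sample(population, n=5, seed=42)
print(sample)
```

Fixing the seed only makes the draw repeatable for checking; an actual study would normally leave it unset.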
Size of Samples
Sampling
Non-probability Sampling
Archival Method
Data was collected a priori for other purposes, and may not fully fit with
the objectives of the research.
No control over how the data was collected.
Qualitative Methods
In-Class Exercise
“In-metro” Exercises
Survey Methods
A typical test item in a Likert scale is a statement. The respondent is asked to indicate his or her degree of agreement with the statement, or any kind of subjective or objective evaluation of the statement. Traditionally a five-point scale is used; however, many psychometricians advocate using a seven- or nine-point scale. A recent empirical study showed that data from 5-point, 7-point and 10-point scales showed very similar characteristics in terms of mean, variance, skewness and kurtosis after a simple re-scaling was applied (although, as discussed below, it is inappropriate to apply such statistics to Likert-scale data).
Likert scaling is a bipolar scaling method, measuring either positive or negative response to a statement. Sometimes a four-point scale is used; this is a forced-choice method since the middle option of "Neither agree nor disagree" is not available.
Likert scales may be subject to distortion from several causes. Respondents may avoid using extreme response categories (central tendency bias); agree with statements as presented (acquiescence bias); or try to portray themselves or their organization in a more favorable light (social desirability bias).
Scoring and analysis: After the questionnaire is completed, each item may be analyzed separately, or in some cases item responses may be summed to create a score for a group of items. Hence, Likert scales are often called summative scales.
Responses to a single Likert item are normally treated as ordinal data because, especially when using only five levels, one cannot assume that respondents perceive the difference between adjacent levels as equidistant. When treated as ordinal data, Likert responses can be collated into bar charts, central tendency summarised by the median or the mode (but not the mean), dispersion summarised by the range across quartiles (but not the standard deviation), or analyzed using non-parametric tests, e.g. Chi-square test, Mann-Whitney test, Wilcoxon signed-rank test, or Kruskal-Wallis test.
Responses to several Likert questions may be summed, provided that all questions use the same Likert scale and that the scale is a defendable approximation to an interval scale, in which case they may be treated as interval data measuring a latent variable. If the summed responses fulfil these assumptions, parametric statistical tests such as the analysis of variance can be applied. These can be applied only when there are more than 5 components.
Data from Likert scales are sometimes reduced to the nominal level by combining all agree and disagree responses into two categories of "accept" and "reject". The Chi-Square, Cochran Q, or McNemar test are common statistical procedures used after this transformation.
Consensus based assessment (CBA) can be used to create an objective standard for Likert scales in domains where no generally accepted standard or objective standard exists. Consensus based assessment (CBA) can be used to refine or even validate generally accepted standards.
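As an illustration of the ordinal treatment described above, here is a minimal sketch using only the Python standard library. The response data are hypothetical, and the split into "agree" (4-5) and "disagree" (1-2) is an assumption about how the nominal reduction might be coded.

```python
from statistics import median, multimode

# Hypothetical responses to one 5-point Likert item
# (1 = strongly disagree, 5 = strongly agree).
responses = [4, 5, 3, 4, 2, 4, 5, 1, 4, 3]

# Ordinal summaries of central tendency: median and mode, not the mean.
item_median = median(responses)
item_modes = multimode(responses)

# Reduction to the nominal level: agree (4-5) versus disagree (1-2).
# Neutral answers (3) are set aside in this sketch.
agree = sum(1 for r in responses if r >= 4)
disagree = sum(1 for r in responses if r <= 2)

print(item_median, item_modes, agree, disagree)
```

The two counts could then feed one of the nominal-level tests named above.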
© University of Technology & University of Sciences - VNU HCM (2008)
Lectures on Scientific Research Methods
Arranging the questions (Dillman, Mail and Telephone Surveys: The Total
Design Method, Wiley & Sons, New York, 1978):
1. Present related questions in subsets to focus on one issue at a time;
2. Place questions that deal with sensitive topics (e.g. drug use or sexual
experiences) at the end of the applicable subset of questions;
3. Last, to prevent participants from losing interest in the survey, place
the demographic questions at the end of the survey.
Phone Surveys
Involves telephoning the participants and reading questions to them.
Advantages:
- representative samples can be obtained, even with the random-digit dialing technique
- respondents can ask for clarification if needed
- researcher can ask follow-up questions if needed (for more reliable data)
- generally higher response rate, 59 to 70% (Groves & Kahn, 1979)
Potential Problems:
- response rate may decrease because of the increase in telemarketing
- more time-consuming and more costly
- interviewer bias can occur
- participants can give socially desirable responses
(a socially desirable response is one the respondent gives because he or she believes it is deemed appropriate by society)
Personal Interviews
A survey in which the questions are asked face to face.
Advantages:
- can be conducted anywhere, e.g. home, street, shopping mall, university, etc.
- allows the researcher to record not only verbal responses but also “body language”
- participants usually devote more time
- clarification of questions and responses is possible
- fairly high response rate, typically around 80% (Dillman, 1978; Erdos, 1983)
Potential Problems:
- similar to phone surveys: interviewer bias, socially desirable responses
- lack of anonymity in personal interviews may affect the responses
A variation on the personal interview is the focus group interview (6-8 persons).
Experimental Method
Experimental method:
Characteristics
Basic Steps of an Experiment:
1. Determine relevant factors,
2. Experiment with two variables,
3. Analyze primary data,
4. Analyze secondary (processed) data,
5. Investigate abnormal data,
6. Test baseline hypothesis,
7. Experiment with remaining variables,
8. Link the experimental results to a model.
Experimental Conditions:
Availability of the system,
Quantification of the problem and accountability of the results,
Absence of external restrictions on the methods used.
Experimental Approach
Characteristics:
Basic steps in an experiment,
Experimental conditions,
Step 1
Step 2
measures
00.0             0.0
10.00 ± 0.01     2.0 ± 0.2
20.01 ± 0.02     2.8 ± 0.1
30.00 ± 0.01     3.5 ± 0.1
40.01 ± 0.01     4.1 ± 0.3
50.00 ± 0.01     4.8 ± 0.2
Step 3
Step 4
Step 5
Step 6
Step 7
Step 8
Example:
The obtained model is a mathematical model known as “Newtonian mechanics”.
If air resistance is neglected, the relation between time and
distance can be written $d = \frac{1}{2} a t^2$, where $a$ is the acceleration.
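Linking measurements to this model amounts to estimating the acceleration from (time, distance) pairs. The sketch below is one possible approach, not the lecture's own procedure: it fits a straight line through the origin in t squared by least squares. The function name and the data (generated with a = 9.8) are hypothetical.

```python
def estimate_acceleration(times, distances):
    """Least-squares estimate of a in d = 0.5 * a * t**2.

    With u = t**2 the model is the line d = (a / 2) * u through the origin,
    whose least-squares slope is sum(d * u) / sum(u * u).
    """
    numerator = sum(d * t**2 for t, d in zip(times, distances))
    denominator = sum(t**4 for t in times)
    return 2.0 * numerator / denominator

# Hypothetical, noise-free measurements generated with a = 9.8.
times = [0.0, 1.0, 2.0, 3.0]
distances = [0.0, 4.9, 19.6, 44.1]

a_hat = estimate_acceleration(times, distances)
print(round(a_hat, 3))
```

With real measurements the estimate would differ from the true value by an amount related to the measurement uncertainties.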
Experimental Conditions
Experimental Method
Research :
Instrument ++
Time ++
Precision
New discoveries
Experimental Method
Generality
The researcher has a responsibility towards the subjects involved in the experiment.
The subjects have the right to privacy and confidentiality, the right to be informed about
the nature of the research, and the right to be treated with respect.
Any subject has the right to withdraw from an experiment at any time.
Any explanation required by the subjects shall be provided, as far as it is compatible with the
purposes of the experiment (i.e. do not reveal the hypotheses beforehand).
Sometimes subjects might feel frustrated if they feel that their performance was not
good enough. The researcher must stress that the main goal of the research is to
assess the performance or to evaluate some properties of some
tools/algorithms/automations in general. The researcher must make sure that every participant feels
comfortable.
It is good practice to obtain informed consent from the participants.
The researcher shall be as neutral as possible and shall not reveal expectations about the
experiments, either explicitly or implicitly.
Other Obligations
Planning an Investigation
Validity
Validity
Validity
The term «a simple design» refers to the fact that the researcher
selected two groups of pupils, one receiving the new teaching
method and the other one receiving a traditional method. This means,
for example, that ten students are allocated to one experimental
condition, and ten students to the other condition. In this case we
have a between-subjects design.
When using different groups of subjects it is necessary to bear in
mind that there could be variability linked to the subjects (e.g. some
students being remarkably good at math).
Proposition I: In order to deal with this type of problem, all the subjects
have to be assigned randomly to the conditions.
Random assignment means that every subject must have an equal
chance of ending up in any of the conditions. Random assignment is
essential. Following non-random strategies may lead to biased results.
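Random assignment to the two teaching conditions can be sketched as follows. This is a minimal illustration, assuming twenty hypothetical pupils; the function name `assign_randomly` and the condition labels are made up for the example.

```python
import random

def assign_randomly(subjects, conditions, seed=None):
    """Shuffle the subjects, then deal them out to the conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # every ordering is equally likely
    groups = {condition: [] for condition in conditions}
    for index, subject in enumerate(shuffled):
        # Dealing in turn keeps the group sizes equal (or within one of each other).
        groups[conditions[index % len(conditions)]].append(subject)
    return groups

# Hypothetical class of 20 pupils and the two conditions of the example.
pupils = [f"pupil-{i:02d}" for i in range(1, 21)]
groups = assign_randomly(pupils, ["new method", "traditional method"], seed=1)
for condition, members in groups.items():
    print(condition, len(members))
```

Shuffling first and dealing afterwards gives every pupil the same chance of landing in either condition while keeping the groups at ten pupils each.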
The mean, mode and median give information about the central
tendency of the scores, that is, where the scores tend to aggregate
and cluster together.
But there are also some “markers” of variability, that is, the opposite
tendency of the scores to spread and differentiate away from each
other, namely the variance and the standard deviation.
The variance ($s^2$) tells us the degree to which the value for each person
deviates from the mean of the group. It is calculated starting from the Sum
of the Squared deviations (SS), obtained by subtracting the mean from each
score and squaring each difference. Then all the squared values are summed up; this
sum is then divided by the number of scores. In other words:
$$s^2 = \frac{\sum_i (x_i - M)^2}{N} = \frac{SS}{N}$$
where $x_i$ represents a score, $M$ is the mean, and $N$ is the number of scores.
The standard deviation (SD) is the square root of the variance.
A simple example of the meaning of variability and spread: if you have a group
of nine scores and the value of every score is five, both the variance and the SD
will be zero: that is, there is no variability in the scores.
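The calculation above can be checked with a few lines of code. This is a minimal sketch that divides SS by N, as in the formula of this section; the second data set is hypothetical.

```python
from math import sqrt

def variance_and_sd(scores):
    """Return (variance, SD), with the variance computed as SS / N."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)  # sum of squared deviations
    variance = ss / n
    return variance, sqrt(variance)

# Nine scores that are all equal to five: no variability at all.
constant_scores = variance_and_sd([5] * 9)
# Hypothetical scores with mean 5 and SS = 32, so variance 4 and SD 2.
spread_scores = variance_and_sd([2, 4, 4, 4, 5, 5, 7, 9])
print(constant_scores, spread_scores)
```

Note that some statistics packages divide by N - 1 instead of N when estimating the variance of a population from a sample; the sketch follows the slide and divides by N.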
Statistical Probability
Statistical Test
Proposition IV: It is a good rule to set the alpha level before the
experiment (e.g. α = .05) and then to stick with that level afterwards,
no matter how significant the results are.
Statistical Tables
These probability levels are provided in statistical tables. These tables
report the critical values for a certain statistic and the associated levels of
significance;
in other words, they support the identification of a certain level of significance
for any given difference.
Another important issue is whether the experimental hypothesis is “open”
or “closed”. To be more precise, we must decide whether the
experimental hypothesis is one-tailed or two-tailed.
This means that the researcher investigating the teaching method could
predict that the new teaching method will affect the math scores in either direction
(the prediction does not go in a specific direction: a two-tailed hypothesis), or that the new teaching
method will increase the math scores of the pupils, or vice versa (a prediction
going in a specific direction: a one-tailed hypothesis).
Error Types
There are two ways in which a researcher could be
wrong about her/his hypotheses. Let’s consider
again the null hypothesis (usually written as H0):
Decision             H0 False        H0 True
Reject H0            Correct         Type I error
Fail to reject H0    Type II error   Correct
Statistical Tests
Degree of Freedom
Another Example:
Now, let’s take a sample of 4 subjects, and consider a situation in which
only the number of subjects (4), the scores of three subjects (e.g. 2, 3 and
6) and the sum of the scores (20) are known.
The score of the fourth subject is then fully determined: 20 - (2 + 3 + 6) = 9. This means that this
fourth score has no freedom to vary, and that there are 3 DOF.
Analysis of Variance
ANOVA
3. $df_{wg} = N - k$
As an example, having three conditions (thus, three independent groups of subjects), and
10 subjects in each group, the degrees of freedom will be:
1. $df_{bg} = 3 - 1 = 2$
2. $df_{tot} = 30 - 1 = 29$
3. $df_{wg} = 30 - 3 = 27$
$$F = \frac{MS_{bg}}{MS_{wg}}$$
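The between-subjects F ratio can be computed step by step. The following is a minimal sketch under the usual definitions of the between-groups and within-groups sums of squares (an assumption here, since the slides that defined them did not survive); the function name and the three small groups are hypothetical.

```python
def one_way_anova(groups):
    """Between-subjects one-way ANOVA: returns df_bg, df_wg and F."""
    k = len(groups)                                  # number of conditions
    all_scores = [x for group in groups for x in group]
    n_total = len(all_scores)                        # N, total number of scores
    grand_mean = sum(all_scores) / n_total
    means = [sum(group) / len(group) for group in groups]

    # Between groups: spread of the group means around the grand mean.
    ss_bg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups: spread of the scores around their own group mean.
    ss_wg = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_bg = k - 1
    df_wg = n_total - k
    f_value = (ss_bg / df_bg) / (ss_wg / df_wg)      # F = MS_bg / MS_wg
    return {"df_bg": df_bg, "df_wg": df_wg, "F": f_value}

# Hypothetical scores for three independent groups of three subjects.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

The resulting F is then compared with the critical F for (df bg, df wg) at the chosen alpha level in a statistical table.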
Sources of variance    SS       df       MS    F    p (alpha)
Within groups          SS wg    df wg
The $SS_{bg}$, $SS_{wg}$, and $SS_{tot}$ can be calculated in the same manner as for the between-subjects
ANOVA. The $SS_{subj}$ can be obtained by taking the average of each subject’s scores over all the
conditions, subtracting the overall mean from it, squaring the result, summing over the subjects, and
multiplying by the number of conditions. In other words:
$$SS_{subj} = K \sum_i (M_{subj_i} - M)^2$$
where $K$ is the number of conditions, $M_{subj_i}$ is the mean of subject $i$, and $M$ is the overall mean.
The df are calculated as follows. Assuming that we have a sample of 10 subjects, each of them
participating in an experiment in which three levels of one independent variable are evaluated, we
will have:
1. $df_{tot} = N_{tot} - 1$ (where $N_{tot}$ is the total number of scores): $df_{tot} = 30 - 1 = 29$
2. $df_{bg} = K - 1$: $df_{bg} = 3 - 1 = 2$
3. $df_{subj} = N - 1$ (where $N$ is the number of subjects): $df_{subj} = 10 - 1 = 9$
4. $df_{wg} = N_{tot} - K$: $df_{wg} = 30 - 3 = 27$
5. $df_{error} = df_{wg} - df_{subj}$: $df_{error} = 27 - 9 = 18$
When the $SS_{bg}$ and the $SS_{error}$ are divided by the corresponding degrees of freedom (cf. Table 5),
the required MS values can be obtained. The F statistic is given by:
$$F = \frac{MS_{bg}}{MS_{error}}$$
When the F value is obtained, the same procedure and the same logic used for the between-subjects
ANOVA apply. Bearing in mind the alpha level and the appropriate degrees of freedom ($F_{2,18}$), the
calculated F value has to be compared with the critical F reported in the statistical table.
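The within-subjects computation can be sketched in the same spirit. The code assumes that the error sum of squares is obtained as SS wg minus SS subj, mirroring the way the error degrees of freedom are obtained above; that formula, the function name, and the data for three subjects under three conditions are assumptions made for illustration.

```python
def within_subjects_anova(scores):
    """scores[i][j] is the score of subject i under condition j."""
    n = len(scores)                  # N, number of subjects
    k = len(scores[0])               # K, number of conditions
    all_scores = [x for row in scores for x in row]
    grand_mean = sum(all_scores) / len(all_scores)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_bg = n * sum((m - grand_mean) ** 2 for m in cond_means)
    ss_wg = sum((row[j] - cond_means[j]) ** 2 for row in scores for j in range(k))
    ss_subj = k * sum((m - grand_mean) ** 2 for m in subj_means)
    ss_error = ss_wg - ss_subj       # assumption: error = within groups minus subjects

    df_bg = k - 1
    df_wg = n * k - k
    df_subj = n - 1
    df_error = df_wg - df_subj
    f_value = (ss_bg / df_bg) / (ss_error / df_error)   # F = MS_bg / MS_error
    return {"df_bg": df_bg, "df_error": df_error, "F": f_value}

# Hypothetical data: 3 subjects, each measured under 3 conditions.
rm_result = within_subjects_anova([[1, 2, 3], [2, 4, 6], [3, 3, 6]])
print(rm_result)
```

Removing the subject-to-subject variability from the within-groups term is what distinguishes this design from the between-subjects ANOVA.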
the smallest value (1) corresponds to the “first” rank (1); while the largest value (7) corresponds to
the “last” rank (6). The aim of this test is to assess whether there is a significant difference
among the rank totals of three groups.
Once the ranks are assigned, the following equation has to be used to obtain a value called H:
$$H = \left[ \frac{12}{N(N+1)} \sum \frac{T^2}{n} \right] - 3(N+1)$$
where $N$ is the number of all subjects; $n$ is the number of subjects in each group; $T^2$
is the square of the rank total for each condition.
Once H is known and the degrees of freedom are calculated (in this case, the df
are given by the number of conditions minus one, that is 3 - 1 = 2 df), the H we found
has to be compared to the critical H reported in the corresponding statistical table. For the
level of significance decided (.05), if the calculated H is equal to or larger than the
critical H in the table, then the difference is significant. Let’s assume that the
calculated H is larger than the critical H reported in the table.
Thus we can say that: df = 2, H = 10, p < .05; we can reject the null hypothesis and accept our
experimental hypothesis. This test only checks for overall differences among conditions
(i.e. a two-tailed hypothesis). Thus, if the calculated value of H is significant, we can
proceed to pairwise comparisons.
In this case the most appropriate test is the Mann-Whitney U test, a non-parametric test for one independent
variable with two levels.
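The H statistic can be computed directly from the rank totals. The sketch below ranks all scores jointly and gives tied scores their average rank; no tie correction is applied to H, which is a simplification. The function name and the data are hypothetical.

```python
def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(T**2 / n) - 3 (N + 1), with T the rank total per group."""
    pooled = sorted(x for group in groups for x in group)
    n_total = len(pooled)

    # Rank all scores together; tied scores share the average of their ranks.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j

    sum_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * sum_term - 3 * (n_total + 1)

# Hypothetical scores for three independent groups (rank totals 6, 15 and 24).
h_value = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h_value, 3))
```

The value is then compared with the critical H for the number of conditions minus one degrees of freedom.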
Friedman Test
Also for the Friedman test the original scores should be ranked.
But, differently from the Kruskal-Wallis test, this time the scores are ranked
within each subject across the conditions (to use a “graphical
explanation”, the ranking is done “horizontally” for each subject).
For example, the scores of subject 1 are ranked (in order) 1, 3
and 2. The scores of subject 2 are ranked 1, 3 and 2, etc.
Once the ranks are assigned, the $\chi_r^2$ value has to be found. The following equation
should be used:
$$\chi_r^2 = \left[ \frac{12}{Nk(k+1)} \sum T^2 \right] - 3N(k+1)$$
where $N$ is the number of subjects (8); $k$ is the number of conditions (3); $T^2$ is the square
of the rank total for each condition. Once the $\chi_r^2$ value is found, it can be compared to the
critical $\chi_r^2$ reported in the corresponding statistical table. In order to find the right $\chi_r^2$ in the
table, the number of subjects ($N$) and the number of conditions ($k$) are required. The
calculated value of $\chi_r^2$ is significant (at any given level of p) if it is equal to or larger than
the critical $\chi_r^2$. Let’s assume now that our $\chi_r^2$ is smaller than the critical $\chi_r^2$ at the level of
significance decided (which was .05, cf. above). This means that we must accept the null
hypothesis and reject the experimental hypothesis. This also means that our analysis
cannot proceed further.
Proposition IX: For an exploratory study, if a statistical test that checks
for overall differences among conditions does not allow accepting
the experimental hypothesis, the pair-wise comparisons (comparing
each condition against each other) cannot be performed.
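The Friedman statistic described earlier in this section can be computed as follows. This is a minimal sketch that ranks the scores of each subject "horizontally" and assumes there are no tied scores within a subject; the function name and the data for four subjects under three conditions are hypothetical.

```python
def friedman_chi_square(scores):
    """chi_r^2 = 12 / (N k (k + 1)) * sum(T**2) - 3 N (k + 1).

    scores[i][j] is the score of subject i under condition j.
    Assumes no tied scores within a subject.
    """
    n = len(scores)          # N, number of subjects
    k = len(scores[0])       # k, number of conditions
    rank_totals = [0.0] * k
    for row in scores:
        # Rank the conditions within this subject: 1 = smallest score.
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_totals[j] += rank
    return 12 / (n * k * (k + 1)) * sum(t ** 2 for t in rank_totals) - 3 * n * (k + 1)

# Hypothetical data: every subject's scores are ranked 1, 3 and 2.
data = [[10, 30, 20], [5, 9, 7], [1, 3, 2], [8, 12, 10]]
chi_r2 = friedman_chi_square(data)
print(chi_r2)
```

Only if this overall value reaches the critical value would the pair-wise comparisons of Proposition IX be carried out.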
Summary
Summary (2)
Summary (3)
Recommended Bibliography
Baird, D. C. (1988). Experimentation: An Introduction to Measurement Theory and Experimental Design, 2nd Ed., Prentice Hall.
Brodsky, B. E., Darkhovsky, B. S. (2000). Non parametric statistical diagnosis. Kluwer Academic Publishers, The Netherlands.
Das, M. and Giri, N. (1989). Design and Analysis of Experiments, 2nd Ed., John Wiley & Sons.
Dean, A., Voss, D. (2000). Design and analysis of experiments. Springer, NY.
Frigon, L., Mathews, D. (1997). A practical guide to experimental design. John Wiley and Sons Inc., NY.
Gooding, D., Pinch, T., and Schaffer, S. (Editors) (1989). The Use of Experiment. Cambridge University Press.
Hicks, C. R., Turner, K. V. (1999). Fundamental concepts in the design of experiments, 5th Ed., Oxford University Press, NY.
Myers, J. L. (1966). Fundamentals of Experimental Design. Allyn & Bacon.
Scheaffer, R. et al. (1986). Elementary Survey Sampling, 3rd Ed., Duxbury Press.
Wilson, E. Bright, Jr. (1990). An Introduction to Scientific Research. Dover (McGraw-Hill 1952, 1980).
Wickens, C. D., Gordon, S. E., Liu, Y. (1997). An introduction to human factors engineering. Addison Wesley, NY.