Research Principles, Methods and Statistics in
Applied Linguistics
Ebrahim Khodadady (PhD)
Ferdowsi University of Mashhad
September 2013
Preface
Research Principles, Methods and Statistics in Applied Linguistics has
been written for university students who major in various fields related
to English language, literature and translation in Iran. Although there are
a number of related textbooks such as Understanding research in second
language learning: A teacher’s guide to statistics and research design
(Brown 1988), Research design and statistics for applied linguistics
(Hatch & Farhady 1982), and The research manual: Design and
statistics for Applied Linguistics (Hatch & Lazaraton 1991) available on
the market, the present textbook has several features which make its
contribution to the field indispensable.
First, Research Principles, Methods and Statistics in Applied Linguistics
has been written with its readership closely in mind. Since the theoretical
aspects of human knowledge and the sciences are emphasized in Iranian high
schools, their graduates find it quite challenging, on entering university,
to adapt themselves to the requirements of academic activities,
particularly conducting research projects. I remember quite well how
surprised I was when I was asked to teach a research course to my grade 12
high school students in Australia in 1994! To my even greater surprise,
these students were already quite familiar with concepts such as
correlations, samples, and variables. Almost none of these concepts are
covered in English classes in Iranian high schools.
Secondly, in addition to being a new subject for Iranian university
students, the very offering of the course in English as a foreign language
adds another dimension to the complexity of teaching and learning
Research Principles, Methods and Statistics in Applied Linguistics.
Most Iranian students employ bilingual dictionaries when they read
textbooks in English. Unlike similar textbooks written in English-speaking
countries, the present book takes nothing for granted and employs footnotes
to define technical terms wherever their use proves necessary. These
footnotes include not only pronunciation, to render them self-contained,
but also background knowledge when a particular personality like Kurt
Lewin is introduced.
Thirdly, every opportunity has been seized to re-present fundamental
concepts along with real examples in various parts of Research Principles,
Methods and Statistics in Applied Linguistics. For example, research
hypotheses are covered in chapter three. However, hypotheses formulated and
explored by researchers are also given in other chapters, such as chapter
four, where the concept of content validity is brought up for the first
time. This recycling of previously taught material helps readers relate
their new knowledge to what they have already learned and thus overcome
forgetting and achieve better and more lasting understanding.
Fourthly, Research Principles, Methods and Statistics in Applied
Linguistics includes three new and separate chapters. The first deals with
population and sampling. Although sampling plays an important role in
conducting research projects, almost none of the available textbooks, e.g.,
Brown (1988), Hatch and Farhady (1982), and Hatch and Lazaraton (1991), has
devoted a specific chapter to the topic to explore it as comprehensively as
it deserves.
The second new and separate chapter in the present textbook, chapter 7,
addresses translation as a research method. It relates translation
research to other types of research projects and then employs schema
theory to explore translation research projects from macrostructural and
microstructural perspectives. The chapter draws on various renderings of
the Quran and Mathnavi Manavi (Rumi, 2001) to provide tangible
examples of qualitative translation research projects.
The third new and separate chapter in the present textbook provides the
literature and rationale to establish a new type of translation research
based on schema theory, according to which translators choose equivalents
on the basis of their background knowledge of both source and target
languages. In addition to presenting this novel method, the chapter
contrasts quantitative research with its qualitative counterpart in order
to help readers decide which type they prefer to utilize in their own
projects.

As its fifth distinctive feature, Research Principles, Methods and
Statistics in Applied Linguistics employs the tables and graphs used
in authentic research papers so that its readers can have a model to
follow. They are preceded or followed by descriptions offered by the
researchers and/or the present author to help readers do the same when
they design similar tables and graphs in their own studies. For example,
Brown (1988) devoted a complete chapter to comparing frequencies,
but nowhere in the chapter can an interested reader find a single table
developed by researchers who used chi-square to confirm or disconfirm
their hypotheses. In other words, most authors have described and
tabulated how to calculate the statistic but have failed to show how it
should be employed in a research paper.
Sixthly, Research Principles, Methods and Statistics in Applied
Linguistics tries to focus on language-related problems faced in Iran.
This deliberate attempt has been made in order to draw its readers’
attention to the facilities they have, e.g., available sites and journals,
and to what their peers have done in their own country, e.g., by
introducing and discussing MA and PhD theses submitted at Iranian
universities in general and Ferdowsi University of Mashhad in particular.
Finally, I hope Research Principles, Methods and Statistics in Applied
Linguistics will contribute to the enhancement of English as an
international language. As members of the present-day international
community, we Iranians have contributed to its development by
establishing a relatively large number of undergraduate and graduate
programs and academic journals and have thus helped achieve global
understanding and communication.
Dr. Ebrahim Khodadady

Ferdowsi University of Mashhad
September 2013

Pronunciation Symbols (British English)

Consonants              Vowels
Symbol  Key Word        Symbol  Key Word
p       pen             i       sit
b       bad             i:      sheep
t       ten             e       bed
d       desk            æ       bad
k       key             ǎ       cut
g       get             a:      father
č       chair           ǒ       pot
j       jump            o       sort
f       few             u       put
v       very            u:      boot
θ       thing           з       father
ð       then            з:      bird
s       soon            ā       make
z       zero            ō       note
š       she             ī       bite
ž       pleasure        aw      now
h       hot             oy      boy
m       sum             iз      here
n       sun             eз      there
ŋ       sing            uз      poor
l       led             āз      player
r       red             ōз      lower
y       yet             īз      tire
w       wet             awз     tower
                        oyз     employer

/ / … slant lines used in pairs to mark the beginning and end of a transcription
' … mark preceding a syllable with primary (strongest) stress, as in discover /dis'kǎvз/
` … mark preceding a syllable with secondary (next-strongest) stress, as in indication /`indi'kāšn/

Table of Contents
Preface
Pronunciation Symbols
Table of Contents
List of Tables
List of Figures
List of Appendices
Chapter 1 Defining Research
1.1 Introduction
1.2 Single Definition
1.3 Composite Definition
1.3.1 Research Starts with Problems
1.3.2 Curious Researchers Face Problems
1.3.3 Research Requires Taking and Checking Notes
1.3.4 Research Depends on References
1.3.5 Research References Are Verifiable
1.3.6 Research Entails Formulating Hypotheses
1.3.7 Research Requires Choosing Certain Methods
1.3.8 Research Is Logical and Systematic
1.3.9 Research Is Convincing
1.3.10 Research Is Objective
1.3.11 Research Is Replicable
1.4 Summary
Chapter 2 Research Variables
2.1 Introduction
2.2 Psychometric Variables
2.2.1 Categorical Variables
2.2.2 Ordinal Variables
2.2.3 Interval Variables
2.2.4 Ratio Variables
2.2.5 String Variables
2.3 Functional Variables
2.3.1 Independent Variables
2.3.2 Dependent Variables
2.3.3 Moderator Variables
2.3.4 Control Variables
2.3.5 Intervening Variables
2.4 Latent Variables
2.5 Summary
Chapter 3 Research Hypotheses
3.1 Introduction
3.2 Hypothesis Defined
3.3 Null Hypothesis
3.4 Directional Hypothesis
3.5 Summary
3.6 Application
Chapter 4 Characteristics of Research
4.1 Introduction
4.2 Validity
4.2.1 Internal Validity
4.2.1.1 Researchers
4.2.1.1.1 Bias
4.2.1.1.2 Implementation
4.2.1.1.3 Presence
4.2.1.2 Participants
4.2.1.2.1 Attitude
4.2.1.2.2 Attrition
4.2.1.2.3 Expectancy
4.2.1.2.4 History
4.2.1.2.5 Maturation
4.2.1.2.6 Selection
4.2.1.2.7 Sympathy
4.2.1.2.8 Tolerance
4.2.1.3 Location
4.2.1.4 Instruments
4.2.1.4.1 Construct Validity
4.2.1.4.2 Content Validity
4.2.1.4.3 Empirical Validity
4.2.1.4.4 Directions
4.2.1.4.5 Subjectivity
4.2.1.4.6 Test Effect
4.2.2 External Validity
4.3 Reliability
4.4 Feasibility
4.5 Summary
Chapter 5 Population and Sampling
5.1 Introduction
5.2 Population Defined
5.2.1 Target Population
5.2.2 Accessible Population
5.2.3 Population and Normal Distribution
5.3 Sampling
5.3.1 Randomness: Homogeneity and Mixture
5.3.2 Stratified Random Sampling
5.3.3 Cluster Sampling
5.3.4 Convenience Sampling
5.3.5 Matched/Block Sampling
5.4 Sample Size
5.5 Intact Groups
5.6 Summary
Chapter 6 Types of Research
6.1 Introduction
6.2 Classification of Research by Purpose
6.2.1 Basic Research
6.2.2 Applied Research
6.2.3 Evaluation Research
6.2.4 Action Research
6.3 Classification of Research by Method
6.3.1 Experimental Research
6.3.2 Correlational Research
6.3.3 Observational Research
6.3.3.1 Longitudinal
6.3.3.2 Cross-sectional
6.3.4 Archival Research
6.3.5 Survey Research
6.3.6 Historical Research
6.4 Summary
Chapter 7 Translation Research
7.1 Introduction
7.2 Purposes of Translation Research
7.2.1 Basic Translation Research
7.2.2 Applied Translation Research
7.2.3 Evaluation Translation Research
7.2.4 Action Translation Research
7.3 Methods Employed in Translation Research
7.3.1 Macrostructural Methods
7.3.1.1 Word-for-Word Translation
7.3.1.2 Literal Translation
7.3.1.3 Faithful Translation
7.3.1.4 Semantic Translation
7.3.1.5 Adaptation
7.3.1.6 Free Translation
7.3.1.7 Idiomatic Translation
7.3.1.8 Communicative Translation
7.3.2 Microstructural Method
7.4 Validity in Translation Research
7.4.1 Internal Validity
7.4.2 External Validity
7.5 Reliability
7.6 Feasibility
7.7 Summary
Chapter 8 Schema-Based Translation Research: A Quantitative Research Method
8.2 Qualitative versus Quantitative Research
8.2.1 Researcher and Participants
8.2.2 Methodological Considerations
8.2.3 Theory and Hypothesis Formation
8.3 Schema-Based Translation Research
8.4 Schemata: Objective Units of Translation
8.4.1 Semantic Schemata
8.4.2 Syntactic Schemata
8.4.3 Parasyntactic Schemata
8.5 Summary
Chapter 9 Statistical Analysis of Categorical Variables
9.2 Schemata as Categorical Variables
9.2.1 Text as a Word File
9.2.2 Categorizing Schemata in Excel File
9.2.3 Naming and Categorizing Variables in SPSS
9.2.4 Transferring the Data from Excel Sheets to SPSS
9.2.5 Copying Data from Different Excel Files and Pasting them in the Same SPSS File
9.3 Utilizing SPSS Facilities to Analyze Data
9.4 Crosstabs Procedure
9.5 Summary
Chapter 10 Statistical Analysis of Ordinal Variables
10.2 Characteristics of Ordinal Variables
10.3 Quantifying Ordinal Variables
10.4 Relationship among Ordinal Variables
10.4.1 Correlational Relationships among Ordinal Variables
10.4.1.1 Spearman Correlation Coefficient (ρ)
10.4.1.2 Pearson Correlation Coefficient
10.4.1.2 Kendall Tau Coefficient (τ)
10.5 Factorial Validity of Ordinal Variables
10.5.1 Utilizing SPSS to Run Factor Analysis
10.6 Criticism of Ordinal Variables
10.7 Summary
Chapter 11 Working with Interval Variables
11.2 Raw Scores
11.3 Central Tendency
11.3.1 Mode
11.3.2 Median
11.3.3 Mean
11.2 Variation
11.2.1 Range
11.2.2 Variance
11.2.3 Standard Deviation
11.4 Using SPSS to Calculate Mean and Standard Deviations
11.5 Normal Distribution, Mean and Standard Deviation
11.6 Summary
Chapter 12 Employing Interval Variables to Evaluate Instruments
12.2 Instrument Reliability
12.2.1 Raw Scores and Reliability
12.2.2 Cronbach's α (alpha)
12.3 Interval Variables and Validity
12.3.1 Interval Variables and Empirical Validity
12.3.2 Interval Variables and Internal Validity
12.4 Summary
Performance in Groups
13.2 Z Statistic
13.3 T Test
13.3.1 Estimating T test
13.3.2 Applying T Test to Raw Scores via SPSS
13.4 One-Way between Groups ANOVA with Post-Hoc Tests
13.4.1 Applying One-Way between Groups ANOVA to Raw Scores via SPSS
13.5 Regression Analysis
13.5.1 Standard Multiple Regression
13.5.2 Hierarchical Multiple Regression
13.6 Summary
Chapter 14 Finding Research Papers
14.2 The Institute for Scientific Information (ISI)
14.3 Scientific Information Database (SID)
14.4 Online Journals
14.5 Summary
References

List of Tables
Table 2.1 Codification of gender as a categorical variable for 10 hypothetical participants
Table 2.2 Ordinal scale of writing ability at advanced level
Table 2.3 Conversion of ranks into ordinal numbers, cardinal numbers and scores
Table 2.4 Five areas of learning addressed by beliefs explored by the BALLI
Table 3.1 Schema domains, genera, type, token and their ordered percentage in the RMTIAR
Table 4.1 Correlation coefficients of three tests
Table 5.1 Classification of test takers based on their IQs
Table 5.2 The first two blocks of the random numbers appearing on page one of Appendix 5.1
Table 5.3 Students enrolled at five strata of school education in Iran in 2004-2005
Table 5.4 Tertiary education students in 2004-2005
Table 5.5 The number of undergraduate and graduate students majoring in the specified academic fields offered in Dr. Ali Shariati Faculty of Literature and Humanities at FUM in 2008
Table 5.6 Random assignment of 135 participants into three groups
Table 5.7 Standard deviation range in samples having normal population
Table 6.1 IQ means and standard deviations for whites and blacks
Table 6.2 Mean values of nasal resistance at AAR (Pa/cc/sec), nasal endoscopic scores and nasal symptoms, before and after treatment
Table 6.3 Comparison and statistical analysis of results
Table 6.4 The raw scores of five participants (Ps) on the TOEFL and C-tests
Table 6.5 Cross-sectional selection of 16 hypothetical bilingual children
Table 6.6 The application of SEM in research projects on assessment
Table 6.7 A variable by case matrix
Table 7.1 Validity analysis of eight renderings of the source schema أَحْبَبْتُ
Table 8.1 Differences between quantitative and qualitative research projects
Table 8.2 The chi-square test of English equivalents provided for the Persian semantic schema ARZYABEE
Table 8.3 Semantic schemata comprising the two contemporary translations of the Quran
Table 8.4 Semantic schemata comprising the first surah of two contemporary translations of the Quran
Table 8.5 Syntactic schemata comprising the first surah of two contemporary translations of the Quran
Table 8.6 Syntactic schemata and their types and tokens
Table 8.7 Parasyntactic schemata comprising the first surah of two contemporary translations of the Quran
Table 8.8 Parasyntactic schema species and their example types
Table 9.1 Comparing the schemata of Surah 1 with those of modern political texts
Table 9.2 Translators by Schema Domain Crosstabulation
Table 9.3 Chi-square test of schema domains
Table 9.4 Chi-square test of schema types
Table 9.5 Chi-square test of schema tokens
Table 10.1 A graduate and undergraduate student’s agreement with five beliefs
Table 10.2 Calculation of Spearman Rank-Order Correlation Coefficient (ρ)
Table 10.3 Spearman's rho obtained on five beliefs
Table 10.4 Calculation of squared individual deviation scores on five beliefs
Table 10.5 Calculation of z scores on five beliefs
Table 10.6 Pearson correlation coefficient obtained on the five beliefs
Table 10.7 The ordering of five beliefs according to graduate student’s ranking
Table 10.8 Kendall's tau_b correlation coefficient obtained by the SPSS on the five beliefs
Table 10.9 KMO and Bartlett's Test of 418 participants as adequate sample
Table 10.10 The extraction of communalities via Principal Component Analysis
Table 10.11 Total Variance Explained
Table 10.12 Component matrix
Table 10.13 Component matrix with the loadings less than .30 suppressed
Table 10.14 Logical areas of 13 beliefs loading on component one
Table 11.1 Percentage and percentile of a set of raw scores
Table 11.2 Steps involved in calculating variance
Table 11.3 Descriptive statistics produced by the SPSS via Frequencies command
Table 11.4 Standard deviations and percentages of a normal set of scores
Table 11.5 Comparing the scores obtained by the sample control group with its population
Table 12.1 Steps involved in calculating
Table 12.2 Pearson correlation coefficients and their significance calculated by SPSS
Table 12.3 Manual calculation of point-biserial correlation coefficient
Table 13.1 Z scores and their percentages in a normal distribution
Table 13.2 Raw scores obtained on the schema-based cloze MCIT administered as a pre-and-post test
Table 13.3 Group statistics
Table 13.4 Levene's Test for Equality of Variances
Table 13.5 T-test for equality of means
Table 13.6 Regression and correlation among three tests (n = 591)
Table 13.7 One-Way ANOVA analysis of scores obtained by control and experimental groups on the syllabus-and-schema based reading comprehension pretest
Table 13.8 One-Way ANOVA analysis of scores obtained by control and experimental groups on the syllabus-and-schema based reading comprehension posttest
Table 13.9 The Scheffe Post Hoc Test of the scores obtained on syllabus-and-schema-based reading comprehension posttest
Table 13.10 The participants’ code (C), group (G) and scores on the unseen reading comprehension test (UR)
Table 14.1 Selected list of journals ranked on the basis of their total cites in 2008
Table 14.2 List of English language related journals published in Iran
Table 14.3 Names and addresses of some online journals

List of Figures
Figure 2.1 Interaction of independent and dependent variables
Figure 5.1 Intelligence IQ normal curve
Figure 6.1 Black Scores on Four Tests of Cognitive Ability
Figure 8.1 Hierarchical relationship of schemata comprising the Quran as a written text
Figure 9.1 Page setup in Word
Figure 9.2 Original typed text in Word
Figure 9.3 Find function activated
Figure 9.4 Replace function activated
Figure 9.5 Successful functioning of Replace All command
Figure 9.6 Depunctualized schemata
Figure 9.7 Commands involved in converting texts to table
Figure 9.8 Last command in converting texts to table
Figure 9.9 Schemata column converted to a table
Figure 9.10 Forming the column of the first variable called schema
Figure 9.11 Schema and its codes as variables
Figure 9.12 Codification of schema tokens
Figure 9.13 Coded schema domains, types and tokens
Figure 9.14 Naming Sheet 2 for sorting schemata
Figure 9.15 Creating variables in the Sorted sheet
Figure 9.16 Activating the Ribbon command
Figure 9.17 Maximized ribbon
Figure 9.18 Activating Sort & Filter dialogue box
Figure 9.19 Sorted schemata
Figure 9.20 Some sorted schemata and their frequency
Figure 9.21 SPSS Data Editor
Figure 9.22 SPSS variable types
Figure 9.23 Numeric variable types
Figure 9.24 Variable Labels
Figure 9.25 Activated Value Labels dialogue box
Figure 9.26 Specified Value labels for schema type variable
Figure 9.27 Active Data View sheet of the SPSS file
Figure 9.28 Completed Data View sheet
Figure 9.29 Adding variable in an SPSS file
Figure 9.30 A new variable added as number 1
Figure 9.31 Defining the values of a new variable
Figure 9.32 Checking the new variable in Data View
Figure 9.33 Adding the data related to the second value of the new variable
Figure 9.34 Activating Crosstabs on SPSS
Figure 9.35 Crosstabs dialogue box
Figure 9.36 Activating expected frequency and percentage
Figure 9.37 Statistics available in Crosstabs
Figure 9.38 Bar chart of schema domains employed by two translators
Figure 10.1 Association of the noun bus with the real vehicle or its picture in human mind
Figure 10.2 Two participants’ ranking as SPSS variables
Figure 10.3 Correlation menu
Figure 10.4 Bivariate Correlation
Figure 10.5 Activating Spearman function
Figure 10.6 Activating Kendall τ function
Figure 10.7 Assigning values to an ordinal variable
Figure 10.8 Activating Factor Analysis menu on the SPSS
Figure 10.9 Dialogue box requiring the specification of variables to be analysed
Figure 10.10 The Descriptives dialogue box of factor analysis
Figure 10.11 Activating Factor Analysis: Extraction box
Figure 10.12 Activating Factor Analysis: Rotation box
Figure 10.13 Activating Suppress absolute values less than
Figure 11.1 Defining an interval variable in the SPSS
Figure 11.2 Entering Raw Score on the SPSS
Figure 11.3 Activating Descriptive Statistics on the SPSS
Figure 11.4 SPSS Frequencies dialogue box
Figure 11.5 Statistics available in SPSS Frequencies
Figure 11.6 Normal curve and the percentages falling between its standard deviation
Figure 12.1 Specifying the values of categorical variable on the SPSS Data View sheet
Figure 12.2 Entering the values of categorical variables on the SPSS Data View sheet
Figure 12.3 Activating Reliability Analysis in SPSS
Figure 12.4 Reliability Analysis dialogue box
Figure 12.5 Defining two interval variables on the SPSS
Figure 12.6 Choosing Correlate from Analyze menu
Figure 12.7 Bivariate Correlations dialogue box
Figure 13.1 Defining two variables for T test analysis
Figure 13.2 Entering data on the Data View sheet
Figure 13.3 Independent-Sample T Test
Figure 13.4 Moving variables
Figure 13.5 Activating Defining Groups box
Figure 13.6 Activating Defining Groups box
Figure 13.7 Defining and assigning values to the variable called Group
Figure 13.8 Activating One-Way ANOVA on the Analyze menu
Figure 13.9 Activating One-Way ANOVA
Figure 13.10 Specifying the dependent and factor variables
Figure 13.11 Activating Post Hoc Multiple Comparison
Figure 13.12 Activating the linear regression dialogue box and specifying variables
Figure 13.13 Marking relevant boxes in Statistics dialogue box
Figure 13.14 Specifying functions in Plots
Figure 13.15 Specifying functions in Save
Figure 13.16 Forming block 1
Figure 13.17 Forming block 2

List of Appendices
Appendix 3.1 Self-assessment used as reading portfolios
Appendix 3.2 Peer-assessment used as reading portfolios
Appendix 3.3 Self-reflections on readings
Appendix 5.1 Table of random numbers
Appendix 12.1 Critical values for the Pearson Product-moment Correlation Coefficients
Appendix 13.1 Critical values for T test
Appendix 13.2 The scores of 30 students obtained on five tests

1 Research: Definition
1.1 Introduction
In general, all types of research are done in order to solve problems.
Before embarking on solving any problem one must first decide what
the nature of the problem is. Without a clear definition of the problem,
taking any steps would be a waste of time, effort, energy and of course
money. The necessity of defining the problem before doing anything
sounds quite obvious. There are, however, a large number of people who
usually do something when they encounter a problem and then start to
think about what they have done.
Many parents, for example, face the problem of leisure time in summer.
Since schools close down, their children stay home and get bored. Not
having anything to do or any places to go may push children towards
juvenile delinquency and thus destroy their future lives. Children's free
time and the necessity of filling it with proper activities therefore pose
a problem which calls for an immediate solution from parents.
You can guess that parents may adopt various solutions. But which one
is the best and the most useful? As soon as we bring up the topic of
finding the best solution, we need certain principles to carry out our
search in the best possible manner. These principles will be discussed in
the context of definitions offered for research by a number of scholars.
1.2 Single Definition

Research has been defined as a “systematic approach to searching for
answers to questions” (Hatch & Lazaraton, 1991, p. 9). The most
important key words used in this single definition are systematic and
approach. Like any unfamiliar term listed in a reference such as a
dictionary, these words will remain of little use unless you know their
meaning and apply them to your future research projects.
One way of getting to know what systematic and approach mean is to
find out how research scholars have defined them. Hatch and Farhady
(1982), for example, defined systematicity as “established principles” (p.
4) and employed approach and way interchangeably. From the
perspective of these scholars, then, researchers must follow certain
principles in an established way if they wish to find the best solution for
their research problems.
What established principles are, and whether approach and way indicate
the same methods or procedures followed in research designs, need to be
explained and discussed by the proposers of single definitions. The
explanation of the key words used in single definitions therefore renders
them complex and difficult to grasp and convey. For this reason, I
believe a composite definition better captures the nature of research
and sheds light on some research features which are usually overlooked
in single definitions.
1.3 Composite Definition

In contrast to a single definition, a composite definition consists of
several sentences which focus on various factors involved in the process
of carrying out any type of research in language education without
forcing the definer to sacrifice comprehensibility for succinctness.
These factors have to be taken into account in order to conduct valid,
reliable and practical research.
1.3.1 Research Starts with Problems

I have been teaching the course Research Principles and Methods to
undergraduate and graduate university students for almost two decades.
After becoming familiar with the principles involved in research, most
of them come to my office and ask: What should I do my research
about? Although this is a good question in itself, it simply shows that
its poser does not yet have a problem of his or her own.
As an instructor I usually give my students some research topics to work
on. These topics are not, however, very interesting to them because they
are my problems, not theirs. A good researcher always has a problem to
solve or successfully formulates one if he lacks one for the present.
1.3.2 Curious Researchers Face Problems

There is a Turkish proverb that captures the fact that only curious
researchers face problems. It reads: Only traveling feet stumble. Nobody
will travel if she has no desire to see the outside world. If you
limit yourself to classes and the university campus all the time, you will
not come across stones or problems to stumble over.
As language researchers, we need to visit various places where language
teaching and learning play a role. These places may include libraries,
language institutes, schools, kindergartens, the Ministry of Education and
conferences, to name a few.
While visiting the Ministry of Education, for example, you may come across
some teachers who complain about a lack of discipline in their classes. If
you are lucky enough, you may notice an announcement on the board
calling on researchers to submit their proposals on a number of issues such
as the effect of media on language use, cultural invasion, and linguistic
imperialism.
1.3.3 Research Requires Taking and Checking Notes

In studying a linguistic or educational topic, a reader might face a
concept that is incomprehensible or ambiguous to her. The reader then has
two alternatives. The first is to ignore it. The second is to look for
possible explanations. If she adopts the first alternative, there will be
no research. This is because the opposite of researching is ignoring,
overlooking, and disregarding (Urdang, 1977, p. 416).
In reading the paragraph dealing with the single definition of research,
as an example of the second option, the reader might encounter the term
systematicity for the first time and wonder what it means. If she asks a
question as simple as “What does systematicity mean?” her research has
started, however simple it might seem to be.
As diligent researchers, we should remember that our memory is limited
and affected by many factors such as weather, context, feelings, and
fatigue. In order to counter the effect of these factors on our memory, we
have to take written notes. These notes have to be handy so that we
won’t face problems in finding them whenever we wish. We also need
to accustom ourselves to checking our notes regularly.
1.3.4 Research Depends on References

When we do not understand the meaning of a word in a written text, we
usually look it up in a dictionary. This is because we view dictionaries
as authoritative references. Some people, however, do not take the
trouble to do so and try to guess the meaning by themselves. Instead of
employing a reference, these people resort to their own reading
comprehension ability to solve their meaning problems. Whatever
meaning they arrive at, their dependence on their own ability makes their
understanding unreliable.
If we look at the process of guessing the meaning of a word as tossing a
coin, it will have two sides: heads or correct and tails or incorrect. On
this view, there is a 50% possibility of guessing the right meaning. If the
guessers arrive at an unacceptable meaning, they have to wipe out their
personally constructed meaning from memory. In other words, they
have to unlearn an unacceptable meaning and do what they should have
done from the very beginning, i.e., look for the acceptable meaning in a
reliable reference.
Valid and objective references therefore help us avoid developing
mistaken ideas on the basis of our own and others’ personal or
subjective interpretations. These references provide us with quotable
findings upon which we should base our own arguments. The soundness
of a research report is, in fact, judged by the type and number of
references employed by its writer to address a given topic.
In covering the course Report Writing, we learned that a research paper
in APA style should have these sections: Abstract, Introduction,
Methodology, Results, Discussion, Conclusion, References (and
Appendices). In the Introduction section of a paper, researchers must
introduce the topic, tell their readers what other researchers have
done so far with respect to the topic, and give the exact bibliographic
details of those findings in the References.
1.3.5 Research References Are Verifiable

When you use a reference in your research, you must assure your
readers that what you cite is based on findings obtained by employing
rigorous methods and statistical operations. (We will cover these
methods later in the book.) For this reason, almost all scholarly
references employed in language education are taken from journals or
textbooks published by educational organizations and governmental
departments.
Perhaps the verifiability of research findings is the only criterion which
distinguishes science from other branches of human knowledge such as
theology and the arts. Many people, for example, look for religious
answers when they face a problem and accept them without any questions.
Compared with solutions reached by research, religious solutions such

as those offered by Islam enjoy absolute certainty. They enjoy certainty
not because they are based on certain principles but because they are
decreed by the Almighty Allah. Since devout people believe that
religious solutions have been given by God, they follow them without
asking for any sort of argumentation or experimentation for that matter.
(Of course, this does not mean that solutions offered by religion are not
scientific at all. Many religious solutions might prove to be scientific in
the sense that observation supports their application.)
1.3.6 Research Entails Formulating Hypotheses

Reading, taking notes and citing references in a research paper are
indispensable for writing its Introduction section and for giving the exact
addresses of the cited works in its References section. They are also vital
because they provide you with enough background information to formulate
your hypotheses or possible answers.
You might, for example, read a number of research papers written on
anxiety. During your extensive reading, you may realize that almost all
studies have focused on anxiety and explored its effect on social
interactions such as casual conversations, dating and attending parties.
As you read deeper, you may notice that the studies you have read
so far have shown a negative relationship between anxiety and social
interactions. If you want to explore the relationship between anxiety and
learning English as a foreign language, what do you think the outcome
of your own research will be? Whatever answer you give to this
question, it will be your hypothesis.
1.3.7 Research Requires Choosing Certain Methods

While reading a passage, you might come across a word like simplicity
whose meaning might not be known to you. What will you do?
Whatever you do, your action will be your adopted method to solve the
problem.
Most students usually follow three methods when they face an unknown
word: contextual, authoritative, and referential. They might read the
sentence in which the term simplicity has occurred several times and
focus on its surrounding words and sentences to guess its meaning. The
second method would be to ask an authority, i.e., their English
instructor. And finally, they might consult a reference book such as a
dictionary as the third method.

Similarly, when you face a problem related to language learning, you
should adopt a certain method to solve it. Naturally, your methods will
depend on the type of your problem. Some problems call for
interviewing participants or asking them to fill out questionnaires, while
other problems require designing and administering certain tests to
measure the participants' ability. There might still be some other
problems which call for studying certain documents to find out what
happened in the past or creating certain situations to find out how
participants behave under certain conditions. We will address various
types of research methods in chapter six.
1.3.8 Research Is Logical and Systematic

Determining which method, i.e., contextual, authoritative or referential,
one is going to employ in order to find out the meaning of systematicity
requires decision making. Suppose that the reader opts for the referential
method and selects a reference book like Longman Dictionary of
Contemporary English (1978) for his research. His search will
result in finding entries for system, systemic, systematic, and systematize.
To his surprise, he will find no entry for systematicity in the dictionary.
A linguistic analysis of system, systemic, systematic, and systematize
might suggest that all of these entries are related to systematicity because
they share the free morph system. Since systematic and systematicity
have the same bound morphs, i.e., at and ic, they should be the closest in
meaning. The reader, therefore, copies the meaning of systematic from
the dictionary: “based on a regular plan or fixed method; thorough”. In
other words, your search for finding the meaning of systematicity does
not end as soon as you find no entry in a dictionary. You employ your
power of reasoning to develop a meaning on the basis of what you
already know, i.e., you use logic to reach a solution.
1.3.9 Research Is Convincing

Imagine that after finding the meaning of systematic in a dictionary, you
formulate your own definition of systematicity as something thorough
which deals with a regular plan or fixed method. The existence of a
vague concept such as something in the definition renders the tentative
definition challenging and a source of uncertainty and anxiety. If a
classmate or a colleague asks the meaning of systematicity, how are you
going to explain the troubling something in the self-constructed meaning
of systematicity?
Research projects will be convincing if researchers use their
background knowledge, employ their data and discuss them in the light
of their results without leaving any point of their research topic
unaddressed. You as a researcher should first of all be convinced of the
answers you have found for your research questions and be able to
convince others of their acceptability. The plausibility of any research
will therefore depend on how it relates to relevant research projects, how
sound and strong its findings are, and how convincingly the researcher
presents his findings.
1.3.10 Research Is Objective

It seems that many students mix up research projects with essays.
Although essays are legitimate academic products by themselves, they
do not qualify as research projects. While research projects have to be
based on representative samples in order to be generalizable, essays are
basically written on personal views dealing with the topics their writers
find worth addressing.
The personal nature of essays does not, however, entail that their writers
do not employ evidence or logical arguments to support their claims and
adopted positions. The evidence, however, might be idiosyncratic.
Orwell (1958), for example, announced that “from a very early age,
perhaps the age of five or six, I knew that when I grew up I should be a
writer” (Bott 1958, p. 99). Orwell’s statement might mean that great
writers know from their early ages that they will become writers!
Besides, such a claim is too subjective to be verified by a third party
under controlled conditions.

The term essay is used in academic circles to refer to a composition in
prose¹ or in verse², "which may be of only a few hundred words … or of
a book length" (Cuddon 1979, p. 244). It addresses a topic of personal
interest formally or informally.
1.3.11 Research Is Replicable

Any type of objective research must have a methodology section. This
section consists of four subsections: Participants, Instruments and/or
Materials, Procedure, and Data Analysis. These subsections require
researchers to give a detailed description of who took part in the
research, what instruments and/or materials were employed to measure
research variables, how and in what sequence the instruments were
administered and what statistics were employed to analyze the data.
I remember quite well that one of my colleagues was doing his PhD in
soil sciences in the agriculture department of a university in Australia. In
order to study the effect of a certain substance on a sample soil, he had
to use a solution which was mentioned in a study. Since the required
solution was not available in the market, he followed the descriptions
given in the Materials subsection to produce the solution in his own lab.
After his frequent attempts bore no fruit, he contacted the author of the
study, who was luckily available. It was then that he realized the writer
of the study had forgotten to describe a small step in the process of
producing the solution!
1.4 Summary
Research is an indispensable part of any tertiary or higher education
center. According to Burns and Sandra (2003), old universities were
founded primarily as “research-based institutions,” (p. 24) where
research was conducted for the sake of research. In other words, old
universities addressed only those topics which attracted their own
scholars’ attention.
¹ Prose /prōz/ n. the ordinary form of written or spoken language
² Verse /vз:s/ n. each of the lines of a poem
Since the scope of topics studied by researchers in old universities was

very narrow, they failed to attract the public. New universities were,
however, established to make higher education available to the public by
addressing and solving their problems. This would not have happened if
the authorities in higher education had not realized their own problem.
Identifying an academic, educational, political, psychological or social
problem is then the first step in doing research.
Upon identifying a research problem, we need to specify it very clearly

so that we can collect the required data. All research problems
regardless of their type and scope deal with variables or features that
differ from situation to situation and must therefore be studied to find
out what makes them change. We will study them in Chapter Two.

2 Research Variables
2.1 Introduction
In Chapter One we realized that there would be no research if we did not
face a problem related to language learning and teaching. We also
learned that all research projects must be replicable, i.e., if other
researchers conduct our research, they must get the same or very similar
results. This means that we need to operationalize our research problems
so that there will be no misunderstanding on the part of our research
users. In other words, we must tell our readers what exactly we mean
when we use particular key words in expressing our problems.
The word “problem” itself, for example, means “difficulty, trouble,

dilemma, conundrum, predicament and complication” (Urdang, 1997, p.
371), to name a few. These synonyms describe the early phase of
conceiving a research problem. Though I believe the most successful
research projects have been conducted by those who have faced a
difficulty or trouble in whatever they were doing, they could not achieve
excellence in research without describing their problems in clear words.
All problems addressed in a research project deal with attributes. The
participants who take part in a research project possess certain attributes
such as nationality, mother language, age, gender, and social, economic,
political and educational backgrounds which make them different from
each other. Some of these attributes may be common, i.e., the
participants share them with each other. For example, if the participants
of your research project are all Iranian students, their nationality is a
common attribute. If a certain attribute like nationality remains common
or the same for all participants, it is called a “constant” (Fraenkel &
Wallen, 1993, p. 46).

The gender or sex of participants in your research project may, however,

be different. Half of the participants may be female and the other half
male. In this case, gender will be a variable that varies from females to
males. A variable is then an attribute that changes from person to person
(Hatch & Lazaraton 1991). If you limit the scope of your research and
include only female students in your project, gender would
become a constant.
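The distinction between constants and variables can also be checked mechanically once the attributes of participants are recorded. The following is only a minimal sketch: the participant records and the function name split_constants_and_variables are hypothetical and are not taken from any study cited in this book.

```python
# Hypothetical participant records; the attribute names and values are
# illustrative assumptions, not data from any study.
participants = [
    {"nationality": "Iranian", "gender": "female", "age": 19},
    {"nationality": "Iranian", "gender": "male", "age": 21},
    {"nationality": "Iranian", "gender": "female", "age": 20},
    {"nationality": "Iranian", "gender": "male", "age": 19},
]


def split_constants_and_variables(records):
    """Return (constants, variables) as two sorted lists of attribute names.

    An attribute is treated as a constant when every record has the same
    value for it, and as a variable when at least two records differ.
    """
    constants, variables = [], []
    for attribute in sorted(records[0]):
        distinct_values = {record[attribute] for record in records}
        if len(distinct_values) == 1:
            constants.append(attribute)
        else:
            variables.append(attribute)
    return constants, variables


constants, variables = split_constants_and_variables(participants)
print("Constants:", constants)  # nationality is shared by everyone
print("Variables:", variables)  # age and gender differ across participants

# Restricting the sample to female students turns gender into a constant.
females_only = [p for p in participants if p["gender"] == "female"]
print(split_constants_and_variables(females_only))
```

In this sketch gender moves from the list of variables to the list of constants as soon as the sample is limited to female students, which mirrors the point made above.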
It does not matter what type of research you wish to conduct, you must
identify both constants and variables of your project from the very
beginning. After identifying the variables of your research, you should
determine their function so that you can explore their possible
relationship. Variables should therefore be approached from two
perspectives in all types of research: psychometrically and functionally.
2.2 Psychometric Variables

Psychometric theory deals with fundamental principles of measurement
(Nunnally, 1978). In this theory, attempts are made to establish rules for
assigning numbers to persons or objects so that their attributes can be
quantified. Based on this theory a variable or variate is defined as “any
class of measurement on which individual observations are made”
(Armitage, 1971, p. 19).
The attributes, variables or variates are, therefore, measurements, which

vary from person to person, object to object or text to text. Variables are
psychometrically classified into categorical, ordinal, interval, and ratio
scales.
It should be noted that psychometric variables are commonly referred to
as “levels of measurement” and “scales” (Brown 1988, p. 21). I prefer to
use the term psychometric variables because they specify the nature of
variables under study rather than scales per se. In other words, scales by
themselves are extraneous and do not bear any relationship to variables.
We need a term which emphasizes variables first and foremost and then
shows how they are quantified in order to be measured. For example,
gender is a variable whose existence can be shown through participants’
sex, i.e., they are either female or male.
2.2.1 Categorical Variables

Variables are categorical if they are qualitatively different. In other
words, they differ in kind. Categorical variables are either present or
absent. Examples include nationality, gender and occupation. A male
person, for example, is qualitatively different from a female person.
Similarly, Iranian learners of English are qualitatively different from
Japanese, African and Australian learners.
Categorical variables are discrete in that their possible categories are
quite distinct and separate. Since the categories of categorical variables
are discrete by nature, they are counted (Armitage, 1971). For example,
participants’ gender, translators of the same text, essay scorers and
interviewers are all discrete variables because we can say how many
participants took part in our research projects, how many translators
rendered a given text, how many scorers marked the same essay and
how many interviewers assessed the participants’ speaking ability.
Most textbooks dealing with research in applied linguistics refer to

categorical variables as nominal scales. Hatch and Farhady (1982), for
example, believed that “nominal refers to naming variables” (p. 13). We
had better use categorical because as Hatch and Lazaraton (1991)
themselves stated, “a nominal variable … names an attribute or category
and classifies the data according to presence or absence of the attribute”
(p. 55). In other words, it is not the name which is important but the
categories into which a given variable is divided in order to explore
their effect on other variables.
When the participants in a research are assigned to either male or female
categories, you can decide how many male and female participants took
part in your study and how they performed on a certain task. A given
variable will therefore be categorical if it has specific categories or
levels. We should bear in mind that specifying the levels of a categorical
variable is very important. As will be discussed shortly, in some
research projects³ these levels assume great value in determining the
relationship between independent and dependent variables.
Upon specifying the levels of a categorical variable, signs or numbers

are given to each level to make its tabulation easier. The assigned signs
or numbers are just codes and do not therefore have any arithmetic
value. Since the signs or numbers used for codifying categorical
variables lack arithmetic value, you can choose whatever signs or
numbers you like. The codification of gender for a hypothetical sample
of participants is shown in Table 2.1 as an example.
Table 2.1
Codification of gender as a categorical variable for 10 hypothetical
participants
No  Name Surname      Gender  Gender  Gender*  Gender  Gender
1   Ali Ahmady        male    +       ♂        1       2
2   Hassan Beighi     male    +       ♂        1       2
3   Shirin Morady     female  -       ♀        0       1
4   Mehdi Ahmady      male    +       ♂        1       2
5   Leila Nouri       female  -       ♀        0       1
6   Parvaneh Seif     female  -       ♀        0       1
7   Reza Babaii       male    +       ♂        1       2
8   Ebrahim Sedghi    male    +       ♂        1       2
9   Sholeh Bagheri    female  -       ♀        0       1
10  Heidar Asady      male    +       ♂        1       2
* Symbols used in medicine (see Clayton, 1985, p. 2206)
As shown in Table 2.1 above, in the fourth and fifth columns, different
signs are used to show the two levels of gender. Similarly, the sixth and
seventh columns show different numbers indicating the same levels.
³ The word project has been used in two senses. It means both a
definitely formulated piece of research and a task engaged in by a
group of students to supplement and apply classroom studies.
Whatever signs or numbers are used in the tabulation, we can readily say
that six males and four females took part in the research. It should be
mentioned that Table 2.1 suffers from a serious fault. Under no
circumstances should you reveal the identity of the participants in your
report; their anonymity must be secured. That is why in the description
of the table it has been mentioned that the names used in the table are
all hypothetical.
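The claim that the codes carry no arithmetic value can be checked with a short sketch. It uses only the gender column of Table 2.1 and leaves the names out, in line with the anonymity requirement. The function name count_levels is a hypothetical helper introduced for illustration.

```python
from collections import Counter

# Gender labels in the order of the ten hypothetical participants of Table 2.1.
genders = ["male", "male", "female", "male", "female",
           "female", "male", "male", "female", "male"]

# Three interchangeable coding schemes taken from the columns of the table.
sign_codes = {"male": "+", "female": "-"}
zero_one_codes = {"male": 1, "female": 0}
one_two_codes = {"male": 2, "female": 1}


def count_levels(labels, codes):
    """Code each label, count the codes, and report the counts per label."""
    counts = Counter(codes[label] for label in labels)
    return {label: counts[code] for label, code in codes.items()}


# Whatever scheme is chosen, the counts per level stay the same.
for scheme in (sign_codes, zero_one_codes, one_two_codes):
    print(count_levels(genders, scheme))
```

Because the codes only label the levels, every scheme yields six males and four females. Adding or averaging the codes themselves would be meaningless.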
2.2.2 Ordinal Variables

Ordinal variables are ranked or ordered on the basis of some hierarchical
system such as degree of presence of a certain attribute. A rank is used
merely to indicate where a particular individual stands with respect to
the other individuals sharing the same attribute (Dixon & Massey, 1983,
p. 5).
Writing ability, for example, is a variable which has to be taught and

measured in applied linguistics as an integral part of learning any
language. One of the ways of measuring writing ability is to ask learners
to write essays. Since marking essays is inherently unreliable (Heaton,
1988, p. 145), it is essential to develop an ordering or ranking variable,
which describes the various grades of achievement expected to be
attained by the learners. Table 2.2 presents a sample ordinal variable
developed by the University of Cambridge Local Examinations Syndicate.
It is used to mark the essays of advanced-level learners.
Table 2.2
Ordinal scale of writing ability at advanced level
Rank        Description
Excellent   Error-free, substantial and varied material, resourceful and
            controlled in language and expression.
Very good   Very good realization of task, ambitious and natural in style.
Good        Sufficient assurance and freedom from basic error to maintain
            theme.
Pass        Clear realization of task, reasonably correct and natural.
Weak        Near to pass level in general scope, but with either numerous
            errors or too elementary or translated in style.
Very poor   Basic errors, narrowness of vocabulary
In scales like the one presented in Table 2.2 the intervals between the
ranks are not equal. We cannot, for instance, say that the interval
between Excellent and Very good is the same as the interval between
Very good and Good. We can, however, claim that the ability of an
excellent student in writing essays is higher than the ability of a very
good student.
The labels of ranks such as excellent, very good and good can be
replaced by ordinal numbers 1st, 2nd, 3rd, … nth. These ordinal numbers
are in turn replaced by cardinal numbers 1, 2, 3, … n for purposes of
calculation. When the ordinal numbers are replaced by cardinal
numbers, the assumption underlying the intervals between ranks
changes.
According to Ferguson (1971, p. 303), the substitution of cardinal

numbers for ordinal numbers always assumes equality of intervals. The
difference between the first and second number is assumed to be equal
to the difference between the second and third and so on. This
assumption underlies all coefficients of rank correlation. Table 2.3
shows the conversion of ranks into ordinal and cardinal numbers as well
as scores.

Table 2.3
Conversion of ranks into ordinal numbers, cardinal numbers and scores
Rank        Ordinal  Cardinal  Score  Description
Excellent   1st      6         18-20  Error-free, substantial and varied
                                      material, resourceful and controlled
                                      in language and expression.
Very good   2nd      5         16-17  Very good realization of task,
                                      ambitious and natural in style.
Good        3rd      4         12-15  Sufficient assurance and freedom from
                                      basic error to maintain theme.
Pass        4th      3         8-11   Clear realization of task, reasonably
                                      correct and natural.
Weak        5th      2         5-7    Near to pass level in general scope,
                                      but with either numerous errors or too
                                      elementary or translated in style.
Very poor   6th      1         0-4    Basic errors, narrowness of vocabulary
As shown in Table 2.3, ordinal numbers in column 2 allow us to

determine where a particular participant stands in terms of his writing
ability among other participants. The third column allows the researcher
to compare the judgment of two or more persons who may examine the
essays, and the fourth column provides the participants with a
quantitative measure of their writing ability.
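The conversion summarized in Table 2.3 can be expressed as a simple lookup. The sketch below assumes whole-number scores out of 20 and uses the score bands exactly as they appear in the table. The function name classify_essay and the sample scores are hypothetical.

```python
# Score bands from Table 2.3: (lowest score, highest score, rank label,
# ordinal number, cardinal number).
BANDS = [
    (18, 20, "Excellent", "1st", 6),
    (16, 17, "Very good", "2nd", 5),
    (12, 15, "Good", "3rd", 4),
    (8, 11, "Pass", "4th", 3),
    (5, 7, "Weak", "5th", 2),
    (0, 4, "Very poor", "6th", 1),
]


def classify_essay(score):
    """Return (rank label, ordinal, cardinal) for a whole-number score out of 20."""
    for low, high, label, ordinal, cardinal in BANDS:
        if low <= score <= high:
            return label, ordinal, cardinal
    raise ValueError(f"score must be a whole number from 0 to 20, got {score}")


# Hypothetical essay scores, used only to exercise the function.
for score in (19, 16, 9, 3):
    print(score, classify_essay(score))
```

Note that the bands differ in width (for example, 16-17 versus 12-15), which is one way of seeing that the intervals between the ranks themselves are not equal.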
2.2.3 Interval Variables

An interval variable is an attribute whose levels are rank ordered and it
is known how far each level is from other levels with respect to the
attribute. For example, on a reading comprehension test four students
may score 18, 17, 16, and 15 out of 20. Reading comprehension ability
will be considered as an attribute which has four levels, i.e., 18, 17, 16,
and 15. The difference between a reading score of 18 and 17 would be
assumed equal to the difference between the scores of 16 and 17.
To use another example, if you develop a cloze test consisting of 20
items and assign the value of one to each item, then the scores obtained
on the test will be an interval variable having 21 possible levels, ranging
from 0 to 20. It will also mean that the difference between the scores of
1 and 2 will be equal to the difference between the scores of 10 and 11 or
19 and 20.
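Scoring such a test can be sketched as follows. The answer key is a hypothetical list of 20 placeholder words, and exact-word scoring (one point only for the exact expected word) is an assumption made for the sake of the example. Other scoring methods exist.

```python
# A hypothetical 20-item answer key; the words themselves are placeholders.
answer_key = ("the of and to in is that it was for "
              "on are as with his they at be this from").split()


def score_cloze(responses, key):
    """Award one point for each response that matches the key exactly."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return sum(1 for given, expected in zip(responses, key)
               if given.strip().lower() == expected)


# Two hypothetical test takers; blanks ("") stand for unanswered items.
student_a = answer_key[:18] + [""] * 2  # 18 items answered correctly
student_b = answer_key[:17] + [""] * 3  # 17 items answered correctly

score_a = score_cloze(student_a, answer_key)
score_b = score_cloze(student_b, answer_key)
print(score_a, score_b, score_a - score_b)
```

Since every item is worth one point, a difference of one between any two scores always stands for exactly one more correct item, which is what treating the scores as an interval variable assumes.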
In general, when cardinal numbers are assigned to the levels of a

variable it becomes interval. In the third column of Table 2.3 above, the
cardinal numbers given to the ranks of writing ability have transformed
it from an ordinal variable to an interval variable.
2.2.4 Ratio Variables

Similar to an interval variable, a ratio variable is an attribute whose
levels are rank-ordered and it is known how far each level is from other
levels with respect to the attribute. Furthermore, the distance from a
rational zero is known for one of the levels. In other words, ratio
variables are interval variables whose levels are calculated or
determined with respect to a rational zero rather than with respect to one
another.
Consider the variable of time as an instance of ratio. In physics time
plays an important role and the zero point (no time at all) is employed in
time-related experiments. This might be equally applicable to
psychological measurement involving the variable of time. However, as
Nunnally (1978, p. 18) emphasized, in nearly all types of investigation
in behavioral sciences like applied linguistics one does not need ratio
variables to conduct research projects. Interval variables accomplish this
purpose. For this reason, most textbooks on research in applied
linguistics avoid discussing ratio variables.
Ordinal, interval and ratio variables are continuous in that their
successive refinements will result in more precise measurements. For
example, speaking, listening, reading and writing proficiencies are all
continuous variables. These proficiencies are recorded in scores, which
can be out of 20, 50, 100 or more. Naturally, a score obtained out of 100
gives a better measurement of a person’s proficiency than a score out of
20 or 50. In addition to being continuous, ordinal, interval and ratio
variables are quantitative in that they vary in amount whereas
categorical variables like sex and nationality vary in kind.
2.2.5 String Variables

Categorical, ordinal, interval, and ratio variables have values and
intervals which can be treated as values and calculated. Gender as a
categorical variable, for example, has two values, i.e., female and male.
We can therefore assign the value of 1 to female and 2 to male
participants and then count these 1s and 2s. (See section 9.2.3 in chapter
9 to find out how we can do this by employing statistical software.)
There are, however, some variables which are made of alphabetic
characters or letters. The words comprising texts, for example, are
string in nature. Such variables are also known as alphanumeric. As long
as the words themselves are studied, they will be string. We can
nonetheless assign these words to various categories, in which case they
will be considered categorical.
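The shift from a string variable to a categorical one can be illustrated with a short sketch. The two categories (function words and content words), the tiny word list and the sample sentence are all hypothetical choices made only for this example.

```python
from collections import Counter

# A tiny, hypothetical list of function words; a real study would need a
# principled and much longer list.
FUNCTION_WORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "is", "are"}


def categorize(word):
    """Assign a word (a string value) to one of two illustrative categories."""
    return "function" if word.lower() in FUNCTION_WORDS else "content"


text = "The meaning of a word is found in the dictionary"
words = text.split()                          # string values
categories = [categorize(w) for w in words]   # categorical values
print(Counter(categories))
```

As long as we look at the words themselves, we are dealing with strings. Once each word is replaced by its category, the categories can be counted like the levels of any other categorical variable.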
2.3 Functional Variables

Psychometric variables are defined in terms of their ability to provide a
quantifying measure of abilities such as reading comprehension or
expressing personal feelings orally. For example, when we look at the
scores obtained on a language proficiency test, we can say a test taker
who has scored 80 out of 100 is more proficient than the test taker who
has scored 50 on the same test.
Quantifying a variable such as language proficiency by itself will have

little meaning if any. When we conduct a research project, we wish to
know whether there is a relationship between two variables such as
language proficiency and success in an academic program. In other
words, we try to find out whether the variables under our investigation
serve any educational functions.
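One common way of expressing the relationship between two quantified variables is a correlation coefficient. The sketch below computes the Pearson coefficient by hand. The choice of this coefficient and the two sets of scores are assumptions made only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    if variation_x == 0 or variation_y == 0:
        raise ValueError("a constant has no correlation with anything")
    return covariation / sqrt(variation_x * variation_y)


# Hypothetical scores for five students: language proficiency (out of 100)
# and average grade in an academic program (out of 20).
proficiency = [50, 60, 70, 80, 90]
program_grade = [12, 13, 15, 16, 18]

print(round(pearson_r(proficiency, program_grade), 2))
```

A coefficient close to 1 would suggest that higher proficiency goes with greater success in the program, while a coefficient close to 0 would suggest no linear relationship. The function refuses to work with a constant, because an attribute that does not vary cannot be related to anything.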

Wardhaugh (1992) expressed the necessity of variable functions within a
research context quite convincingly. He asserted that “data collected for
the sake of collecting data can have little interest, since without some
kind of focus–that is, without some kind of non-trivial motive for
collection–they can tell us little or nothing” (p. 17). If data collection
remains at the level of psychometrics and does not address a motive or
function, it will be of little use. A researcher must, therefore, decide what
roles or functions the data he has collected must be put to. These
functions play a key role in addressing whatever problems researchers
try to solve. They include independent, dependent, moderator, control
and intervening variables. We will study these functional variables in
some detail.
2.3.1 Independent Variables

We have already learned that variables are attributes or properties,
which vary from person to person, object to object, text to text or event
to event. Some of these attributes bring about changes in other attributes.
When an attribute causes some changes in the value of another attribute,
it is called an independent variable. The main function of experimental
and non-experimental research is to explore independent variables.
For example, Felix and Lawson (1994) conducted a research project to
find out whether suggestopedia⁴ affects more sophisticated language
skills than recall. Their findings showed that suggestopedia does in fact
have the potential to positively affect complex language skills such as
transfer of structures and creative writing. In Felix and Lawson’s
project, suggestopedia is, therefore, the independent variable, which has
caused some changes in their participants’ transfer of structures and
creative writing. Twelve high school students took part in their research.
⁴ Suggestopedia, originally developed by Lozanov (1978), is a teaching method
incorporating music, relaxation and suggestion. According to Lozanov, we use only
five to 10 percent of our mental capacity. This percentage can be increased by
desuggesting the feeling that we cannot succeed in learning a modern language.
2.3.2 Dependent Variables

The attribute that an independent variable is presumed to affect is called
the dependent or “outcome” variable (Fraenkel & Wallen, 1993, p. 50).
The value of a dependent variable depends on what the independent
variable does to it or how it affects it.
When an independent variable assumes the role of a cause or stimulus,
the dependent variable would be its effect or response. Purpura (1997),
for example, conducted a research project to investigate the relationship
between test takers’ reported strategy use and performance on second
language tests (SLTP). You can readily see that Purpura’s research deals
with the effect of test takers’ reported strategy use, an independent
variable, on performance on the SLTP, a dependent variable. In other
words, the performance of test takers on the SLTP depends on their
reported strategy use.
2.3.3 Moderator Variables

Some research projects are simple in design and explore the relationship
between an independent variable and a dependent one. The study
designed by Felix and Lawson (1994), for example, investigated the
effect of suggestopedia, an independent variable, on language skills such
as transfer of structure and creative writing, dependent variables.
There are, however, some research projects that are quite complex. One
of these complex projects belongs to Purpura (1997). These projects are
factorial designs, which permit the investigation of additional
independent variables. They also provide researchers with an
opportunity to study the interaction of an independent variable with one
or more other variables called moderator variables.
If you remember, in discussing dependent variables I referred to
Purpura’s (1997) research in a simplified fashion, i.e., the relationship
between test takers’ reported strategy use and performance on second
language tests (SLTP). We observed that test takers’ reported strategy
use is an independent variable in its own right.

However, Purpura (1997) divided test takers’ reported strategies into
two categories: cognitive processing and metacognitive processing
strategies, each having its own levels. In this classification cognitive
processing strategies play the role of an independent variable and
metacognitive processing strategies a moderator variable. The
classification allowed Purpura to explore how metacognitive processing
strategies moderated the effect of cognitive processing strategies.
Cognitive processing, the independent variable, had five levels:
analyzing inductively, clarifying/verifying, associating,
repeating/rehearsing and summarizing. Metacognitive processing, the
moderator variable, had nine levels: assessing the situation, monitoring,
self-evaluating, self-testing, transferring, inferencing, linking
world/prior knowledge, applying rules, and practicing naturalistically.
When there is only one independent variable and one dependent

variable, no complex interactions occur. However, when there are two
independent variables, one of them will naturally assume the role of a
moderator. These two independent variables will interact with each
other and affect their dependent variables. This interaction can be
studied by factorial designs as shown in Figure 2.1.
Figure 2.1
Interaction of independent and dependent variables
                            Dependent: the SLTP
Variables                   Grammar  Vocabulary  Reading  Writing
Cognitive processing        1        2           3        4
Metacognitive processing    5        6           7        8
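The eight numbered cells of Figure 2.1 can be generated by crossing the two rows with the four columns. The sketch below only reproduces the layout of the figure. The variable names strategy_types, sltp_sections and cells are hypothetical.

```python
from itertools import product

# Rows and columns as they appear in Figure 2.1.
strategy_types = ["Cognitive processing", "Metacognitive processing"]
sltp_sections = ["Grammar", "Vocabulary", "Reading", "Writing"]

# Crossing every row with every column gives the cells of the design,
# numbered row by row as in the figure.
cells = dict(enumerate(product(strategy_types, sltp_sections), start=1))

for number, (strategy, section) in cells.items():
    print(number, strategy, "x", section)
```

Each cell stands for one combination that a factorial design allows the researcher to examine, e.g., cell 7 pairs metacognitive processing with the reading section.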

2.3.4 Control Variables

A research project is conducted to explore possible relationships
between variables. When researchers realize that certain levels of the
variables they intend to study may affect the outcome of the research in
an undesirable or unpredictable manner, they will have to identify these
levels and exclude them from their project.
For example, Businco, Businco, Lauriello, and Tirelli (2004) studied
"State and trait anxiety in patients affected by nasal polyposis⁵ before
and after medical treatment." Read the following description of their
participants and decide who were not included in their study.
[Image: Nasal polyposis]
A total of 30 consecutive patients were enrolled (16 male, 14
female, age range 18-77 years, mean 45.6), all affected by
idiopathic⁶ ethmoidal⁷ NP, primary or recurrence. Patients
affected by asthma, mental diseases, chronic diseases, in
general, or used any other drugs during the study, were
excluded (p. 327).
If we look at human diseases as a single variable, for example, it will
have many levels such as nasal polyposis, asthma, arthritis and
schizophrenia, to name a few. Naturally, including all these levels within
a single research project would be too broad to handle. For this reason,
Businco, Businco, Lauriello, and Tirelli (2004) controlled the human
diseases in their study and included only those patients who suffered
from nasal polyposis alone.
5 Polyposis /pǒlз'ōsis/ n. a condition in which numerous small growths
develop in a hollow organ such as the nose
6 Idiopathic /idiз'pæθik/ adj. describes a disease or disorder that has no
known cause
7 Ethmoid /'eθmoyd/ adj. of, relating to, or being a light spongy bone
located between the orbits, forming part of the walls and septum of the
superior nasal cavity, and containing numerous perforations for the
passage of the fibers of the olfactory nerves
2.3.5 Intervening Variables

There is a positive relationship between the number of variables
involved in a research project and its complexity. As the number of
variables increases, so does the complexity of the project. For this
reason, researchers must first determine what their independent,
dependent, moderator and control variables are; otherwise, their project
would lack internal validity.
There are, however, some variables that influence the outcome of a

research project, no matter how hard researchers try and how exact they
are. They are called intervening variables because they interfere with the
results of the study and researchers are aware of their presence in their
study but can do little to control them.
Businco, Businco, Lauriello, and Tirelli (2004), for example, studied the
effect of nasal polyposis on the anxiety level of 30 male and female
patients whose ages ranged between 18 and 77. We may say that the
participants studied in this research might have all been from broken
families where they did not receive emotional support. Patients’ family
background, however, is considered personal and no researcher is
allowed to question it unless the patients volunteer to disclose it
themselves. The family background might therefore have been an
intervening variable in this research.
2.4 Latent Variables

As attributes, variables are also divided into two types: external or
surface variables and internal or latent variables (Tucker 1997, pp. 1-2).
Surface variables are psychometric in nature because they can be
manipulated and measured. The categorical variable gender, for
example, is a surface variable whose categories can be observed by
employing the physical features of femaleness and maleness.

Internal or latent variables are, however, hypothetical constructs which

are formed by employing either reasoning, i.e., logical, or statistics, i.e.,
latent or factorial. Horwitz (1981, 1985, 1988), for example, developed a
questionnaire called Beliefs about Language Learning Inventory
(BALLI) in order to find out what opinions students and teachers of
language hold.
The latest version of the BALLI contains 34 beliefs. Each of these

beliefs is, therefore, a surface variable which can be measured by
developing a five-point Likert scale. The first belief, for example, reads:
It is easier for children than adults to learn English. Horwitz (1988)
believed that the 34 beliefs comprising the BALLI addressed five areas
of foreign language learning.
Table 2.4 presents the five areas along with the number and percentage of
their constituent beliefs. As can be seen, belief one and eight other
beliefs have been subsumed under foreign language aptitude (FLA). Based
on this logical categorization Horwitz claims that the FLA is an internal
variable whose existence is manifested by nine observable ordinal
beliefs.
Table 2.4
Five areas of learning addressed by beliefs explored by the BALLI
#  Areas                                  Beliefs                           # of Beliefs  Percentage
1  Difficulty of language learning        3, 4, 6, 23, 27, 34               6             17.6
2  Foreign language aptitude              1, 2, 10, 14, 21, 28, 31, 32, 33  9             26.5
3  The nature of language learning        5, 8, 11, 15, 19, 24, 25          7             20.6
4  Learning and communication strategies  7, 9, 12, 13, 16, 17, 18, 20      8             23.5
5  Motivations and expectations           22, 26, 29, 30                    4             11.8
   Total                                                                    34            100
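The percentages in Table 2.4 can be rechecked by dividing the number of
beliefs in each area by the 34 beliefs of the BALLI. The following is a
minimal Python sketch of that arithmetic. The names balli_areas and
area_percentages are illustrative only and are not part of the BALLI.

```python
# Belief numbers grouped by area, copied from Table 2.4.
balli_areas = {
    "Difficulty of language learning": [3, 4, 6, 23, 27, 34],
    "Foreign language aptitude": [1, 2, 10, 14, 21, 28, 31, 32, 33],
    "The nature of language learning": [5, 8, 11, 15, 19, 24, 25],
    "Learning and communication strategies": [7, 9, 12, 13, 16, 17, 18, 20],
    "Motivations and expectations": [22, 26, 29, 30],
}


def area_percentages(areas):
    """Return {area: share of all beliefs in percent}, rounded to one decimal."""
    total = sum(len(beliefs) for beliefs in areas.values())
    return {
        area: round(100 * len(beliefs) / total, 1)
        for area, beliefs in areas.items()
    }


for area, pct in area_percentages(balli_areas).items():
    print(f"{area}: {pct}%")
```

Because each belief belongs to exactly one area, the five counts add up to
34 and the percentages to about 100.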

Hong (2006) and Ghobadi (2009) used a statistical test called Factor
Analysis to explore whether Horwitz’s (1988) identification of five
logical areas has any empirical or factorial validity. They administered
the BALLI to 428 Korean speaking and 423 Persian speaking
undergraduate students, respectively, and asked them to indicate
whether they completely agreed, agreed, had no idea, disagreed or
completely disagreed with each statement. Since the participants’
responses to belief one were collected and analyzed statistically, it was
a surface ordinal variable.
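To analyze such responses statistically, the five verbal options must
first be converted to numbers. The Python sketch below shows one way of
doing so. The coding direction (5 for completely agree down to 1 for
completely disagree) and the names LIKERT_CODES and code_responses are
assumptions made for illustration; the sample answers are invented.

```python
from statistics import median

# Assumed coding: 5 = completely agree ... 1 = completely disagree.
LIKERT_CODES = {
    "completely agree": 5,
    "agree": 4,
    "no idea": 3,
    "disagree": 2,
    "completely disagree": 1,
}


def code_responses(responses):
    """Convert verbal Likert responses to their ordinal codes."""
    return [LIKERT_CODES[answer.strip().lower()] for answer in responses]


# Invented answers of five students to belief one.
sample = ["Agree", "completely agree", "no idea", "agree", "disagree"]
codes = code_responses(sample)
print(codes)          # [4, 5, 3, 4, 2]
print(median(codes))  # 4
```

The median is used here because the codes are ordinal: they show the
order of the options but not necessarily equal distances between them.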
2.5 Summary
Research in applied linguistics is conducted to find sound solutions to
language learning and teaching problems by adopting systematic
approaches. All the problems explored in the field basically deal with
attributes which vary from learner to learner, teacher to teacher and
text to text, to name a few. These attributes are technically referred to
as variables. They usually assume three roles in research projects:
psychometric, functional and latent.
The identification and manipulation of variables depend on their nature.

Some variables such as height and weight are physical and can be
measured directly. They are therefore called surface variables and
measured by employing units such as centimeters and grams. Most
variables studied in applied linguistics are, however, mental and
therefore need to be measured by tests and questionnaires.
Foreign language learning is, for example, considered to involve
variables such as aptitude and motivation. These variables are called
internal because nobody can study them directly. In order to do so we
need to see what observable attributes they involve so that we can
measure them objectively. Language aptitude, for example, is said to be
related to age. So a belief such as It is easier for children than adults
to learn English is a surface variable of language learners.

Based on the nature of surface variables, they are converted to
quantitative scales in order to collect data from a sample upon which all
statistical operations are based. If variables differ in kind, they are
labeled categorical; attributes such as gender and mother language fall
into this class. And if they differ in degree, they are classified into
ordinal, interval and ratio scales depending on the assumed distance
between their levels.
Upon determining the nature and level of surface variables studied in a

research project, researchers must decide what functions, i.e.,
independent, dependent, moderator and control, the variables must serve
in the project. This will help them formulate their research questions and
hypotheses as discussed in the next chapter.

3 Research Hypotheses
3.1 Introduction
Identifying a problem will not lead to a verifiable and valid research
project alone if you do not study what other researchers have already
done on related problems and use their findings to formulate your
possible answers to your research problems. This is exactly what great
inventors like Thomas Edison did. According to Microsoft Encarta
(2006), before he started to do any type of experiment, Edison tried to
read all the literature on the subject to avoid repeating experiments that
other people had already conducted. Perhaps the best illustration of
Edison's working method is his own famous statement: "Genius is one
percent inspiration and 99 percent perspiration."
In Chapter two we concluded that research problems are at first personal
because they deal primarily with certain researchers’ own concerns
related to various language issues. The review of literature allows them,
however, to find out whether other researchers have faced similar
problems and what they have found. Their findings will help new
researchers make their research problems more specific by providing
them with the necessary background. You might, for example, have
heard of portfolios as current methods of assessment and wondered whether
they have any relationship with language learning. If you look for some
theses written in your faculty, you may come across some interesting
studies. Faravani (2006, p. 4), for example, raised the questions below in
her MA thesis:
1. Can reading portfolios increase students’ achievement more than

traditional tests?

2. Can reading portfolios increase students’ reading comprehension

ability more than traditional tests?
3. Do portfolios have any effect on students’ critical thinking ability?
In the three questions raised above, there are three key terms which
need to be defined before we proceed with the analyses of hypotheses
formulated on these questions: portfolios, traditional achievement
tests, and critical thinking. (Similarly, whenever you employ certain
words which play a very important role in your research, you ought to
define them very precisely so that there would be no misunderstanding
on the part of your readers if they use the same words with different
meanings.)
According to Abdelwahab (2002), a portfolio assessment is a method

employed to show learners’ work and activities on a certain subject in a
given class. In collaboration with their teacher, the learners establish a
specific rubric or checklist to reflect on the task, collect data, assess and
evaluate themselves and their peers on the basis of the progress they
make from the beginning to the end of a time period. Based on this
definition, Faravani (2006) developed a checklist and asked her students
to assess themselves (see Appendix 3.1) and their peers (see Appendix
3.2) whenever they read a new passage in the class and reflected on its
content (see Appendix 3.3).
In contrast to portfolios in which learners assess themselves and their
peers as they accomplish a task and self-reflect on what they have read,
traditional achievement tests are developed by teachers to assess
learners’ comprehension of materials taught during a complete term.
Since the learners were supposed to read six texts from Passages
(Richards & Sandy, 2000), Faravani (2006) developed a test on these
texts which consisted of 69 schema-based cloze multiple choice items.
Schema-based cloze multiple choice item tests (MCITs) have the
capacity to be developed on all authentic and written texts from which
certain schemata⁸, i.e., words comprising given spoken or written texts,
have been deleted and replaced with numbered blank spaces. For each
deleted schema four choices are given which are syntactically,
semantically and discoursally related to each other. The test takers have
to read what comes before and after the choices in order to decide which
choice fits the blank best. Items 1 to 10 below, for example, were
developed on the text given on page 65 of the book Passages (Richards
& Sandy, 2000). Since the choices of schema-based cloze MCITs are
semantically and syntactically related to each other, the test takers must
resort to their background knowledge as well as what they read in order
to differentiate the choices from each other and choose the right schema.
No one noticed when Mick Novak … (1) little Alex, a sleeping
bundle wrapped in a blanket, onto a NorthStar Airlines … (2). Alex
caused no trouble when he woke up as he was strapped into his own
… (3), purchased at the full fare of $400. He was enjoying his lunch
when the trouble began. A flight attendant … (4), "He's alive!" when
she realized furry little Alex wasn't a … (5) animal. Alex is a 25-
pound chimp. He is tidy, … (6), and pleasant, but he is a chimp, and
NorthStar says he cannot fly economy … (7). In fact, NorthStar
spokesperson Jon Austin said the airline's … (8) is that large animals
have to ride in the … (9) hold. But Novak said, "I would think, given
NorthStar's current financial … (10), they would be happy to take
any paying customer." (Richards & Sandy 1998, p. 65).
 1  A. carried*        B. conveyed          C. moved       D. lifted
 2  A. flight*         B. journey           C. travel      D. trip
 3  A. couch           B. chair             C. seat*       D. bench
 4  A. called          B. screamed*         C. said        D. nagged
 5  A. loaded          B. stuffed*          C. packed      D. filled
 6  A. immobile        B. still             C. quiet*      D. tranquil
 7  A. kind            B. type              C. sort        D. class*
 8  A. tactic          B. way               C. policy*     D. technique
 9  A. cargo*          B. load              C. shipment    D. stock
10  A. difficulties    B. inconveniences    C. troubles    D. problems*
8 A schema, the singular form of schemata, “is any concept realized in a
word or phrase, syntactic or semantic, closed or open, syntagmatic or
paradigmatic, which can stand by itself or combine with other concepts to
produce an idiosyncratic image in the mind of a given person. This image
has a direct relationship with the person’s experiences with the concept
gained through its application with other semantically and syntactically
related concepts. Schemata are idiosyncratic because individuals differ
from each other in terms of their experiences” (Khodadady 2001, p. 111).
Item five above, for example, presents test takers with the four choices
load, stuff, pack and fill. Since these choices all have the meaning put
something in, they compete with each other to be chosen as the correct
answer and are therefore referred to as competitives (Khodadady 1997).
However, the utterance "He's alive!" which comes before the competitives
rules out load, pack and fill because they do not indicate animals which are
killed and treated to be filled with special materials. The only competitive
which describes these animals is stuffed.
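The mechanical part of building a cloze passage, i.e., deleting chosen
schemata and replacing them with numbered blanks, can be sketched in
Python as follows. This is only an illustration: the function name
make_cloze is hypothetical, and choosing the schemata to delete and
writing their competitives remain the test writer’s own judgment.

```python
import re


def make_cloze(text, targets):
    """Replace the first whole-word occurrence of each target with a blank.

    Blanks are numbered in the order in which the targets are listed.
    Returns the cloze text and an answer key {blank number: deleted word}.
    """
    answer_key = {}
    for number, word in enumerate(targets, start=1):
        pattern = rf"\b{re.escape(word)}\b"
        blank = f"… ({number})"
        # The replacement is passed as a function so it is inserted literally.
        text, replaced = re.subn(pattern, lambda _match: blank, text, count=1)
        if replaced == 0:
            raise ValueError(f"{word!r} not found in the text")
        answer_key[number] = word
    return text, answer_key


sentence = "Alex was strapped into his own seat, purchased at the full fare."
cloze, key = make_cloze(sentence, ["seat"])
print(cloze)  # Alex was strapped into his own … (1), purchased at the full fare.
print(key)    # {1: 'seat'}
```

The answer key keeps the deleted schema for each blank, to which the test
writer would then add three competitives.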
Having covered the concepts of portfolios and schema-based cloze MCITs as
achievement tests, we can now focus on the third key term used in
Faravani’s (2006) research questions, i.e., critical thinking. According
to Halpern (1997) and Facione (1998), critical thinking is a purposeful,
reasoned and goal-oriented mental activity. It is done to analyse and
solve problems, formulate inferences, calculate likelihoods, evaluate and
make decisions in a particular context.
A number of tests have been developed by specialists in psychology and

psychometrics to assess critical thinking. Faravani (2006) employed
Watson-Glaser’s Critical Thinking Test (WGCTT) developed by
Psychorp (2000). The WGCTT comprises 80 items which measure test
takers’ ability in five tests dealing with inference, recognition of
assumptions, deduction, interpretation and evaluation of arguments. Test
1 below, for example, explains how the WGCTT measures inference.
Test 1: Inference
An inference is a conclusion a person can draw from certain observed or
supposed facts. For example if the lights are on in a house and music can
be heard coming from the house, a person might infer that someone is at
home. But this inference may or may not be correct. Possibly the people
in the house did not turn the lights and the radio off when they left the
house.
In this test, each exercise begins with a statement of facts that you are
to regard as true. After each statement of facts you will find several
possible inferences, that is, conclusions that some persons might draw
from the stated facts. Examine each inference separately, and make a
decision as to its degree of truth or falsity.
For each inference you will find spaces on the answer sheet labeled T,
PT, ID, PF, and F. For each inference make a mark on the answer sheet
under the appropriate heading as follows:
T if you think the inference is definitely TRUE; that it properly
follows beyond a reasonable doubt from the statement of facts
given.
PT if, in the light of the facts given, you think the inference is
PROBABLY TRUE; that it is more likely to be true than
false.
ID if you decide that there are INSUFFICIENT DATA; that you
cannot tell from the facts given whether the inference is
likely to be true or false; if the facts provide no basis for
judging one way or the other.
PF if, in the light of the facts given, you think the inference is
PROBABLY FALSE; that it is more likely to be false than
true.
F if you think the inference is definitely FALSE; that it is wrong,
either because it misinterprets the facts given, or because it
contradicts the facts or necessary inferences from those facts.
To sum up, Faravani (2006) developed an achievement test on the
passages she taught in the class and administered it with
Watson-Glaser’s Critical Thinking Test to explore whether reading
portfolios affected her students’ achievement and critical thinking
ability. For this purpose, she raised three questions for which she
offered some possible answers. We will turn to these answers in the next
section.
3.2 Hypothesis Defined

Hypotheses are possible answers we expect to get after we have carried
out our research project. They are informed answers because they rest
on reviewing the literature related to our problem. In addition to being
informed, hypotheses must predict a possible relationship between at
least one independent and one dependent variable. If we focus on
Faravani’s (2006) first research question, for example, we can identify
reading portfolio as an independent variable and achievement test as a
dependent variable. After identifying her independent and dependent
variables, what type of relationship did she predict to exist between
them?
3.3 Null Hypothesis

Null literally means non-existent. We formulate a null hypothesis when
we predict that there will be no significant relationship between
independent and dependent variables. A relationship will be significant
if it is supported by statistical analyses of data. In other words, we have
to employ psychometric tests such as schema-based cloze MCITs and
Watson-Glaser’s Critical Thinking Test in our research so that our
participants’ performance on these tests can provide us with the
necessary data, i.e., scores. These scores can then be compared with
each other statistically to see whether there is any significant
relationship among variables.
Faravani (2006), for example, wanted to find out whether using reading
portfolios in her classes would help her students gain higher scores not
only on their achievement test but also on Watson-Glaser’s Critical
Thinking Test. For this purpose, she divided her 32 female general
English students into two classes: a control class and an experimental
class.

Faravani (2006) developed and administered her schema-based cloze

MCIT to measure her learners’ achievement and Watson-Glaser’s
Critical Thinking Test to assess their critical thinking ability before she
started to teach both control and experimental classes. (When you give
some tests before teaching, you should call them pre-tests.) After
administering the tests and recording the students’ scores on these two
pre-tests, she taught six passages in her control and experimental
classes. However, she used reading portfolios only in her experimental
class so that she could compare it with her control class. (Since she did
not use any reading portfolios in her control class, she could claim that if
any significant difference occurred between the two classes, it was
because of reading portfolios.) Then she administered the same schema-
based cloze MCIT and Watson-Glaser’s Critical Thinking Test to both
classes at the end of the term. (When you administer the same tests to
the same students after you have taught them certain passages, they are
called post-tests.)
However, before she started her research project, she formulated three
null hypotheses, two of which are given below. They helped her be
consistent throughout her project.
1. There is no significant difference between the means of pre-tests and

post-tests of critical thinking for the experimental and control classes
after the implementation of reading portfolios.
2. There is no significant difference between the means of pre-tests and
post-tests of schema-based cloze MCIT (achievement) for the
experimental and control groups after the implementation of reading
portfolios.
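Whether null hypotheses such as these are retained or rejected depends on
a statistical comparison of the two classes. As a minimal sketch, the
Python code below runs an exact permutation test on the difference
between two group means. It is an illustration only: the gain scores
(post-test minus pre-test) are invented, the names permutation_p_value,
control_gains and experimental_gains are hypothetical, and the
permutation test is a stand-in chosen for its simplicity, not the
analysis reported by Faravani (2006).

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of two means.

    Every possible split of the pooled scores into groups of the original
    sizes is examined. The p-value is the share of splits whose absolute
    mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        if difference >= observed - 1e-12:  # tolerance for rounding error
            extreme += 1
        splits += 1
    return extreme / splits


# Invented gain scores (post-test minus pre-test) for two small classes.
control_gains = [1, 0, 2, 1, 3, 1]
experimental_gains = [4, 3, 5, 2, 6, 4]
print(permutation_p_value(control_gains, experimental_gains))
```

A small p-value (conventionally below .05) would lead us to reject the
null hypothesis; otherwise the null hypothesis is retained. The exact
test suits only small groups because the number of possible splits grows
very quickly with class size.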
3.4 Directional Hypothesis

In contrast to a null hypothesis which does not predict any sort of
relationship between two variables, a directional hypothesis employs
related literature or reasoning in order to assume the existence of a
relationship between independent and dependent variables.

The words comprising a text, schemata, for example, provide a good

example to formulate some rational hypotheses. Khodadady (1999a)
categorized the schemata used in the 22 passages of the textbook
Reading media texts: Iran-America relations (RMTIAR) into three
domains: Semantic, Syntactic and Parasyntactic.
The semantic domain expresses the main message of a given text and
consists of adjectives, adverbs, nouns and verbs as its genera. They are
open in nature because new adjectives, adverbs, nouns and verbs are
employed by speakers and writers to express new actions,
attitudes, feelings and states whenever they change the topic of their
speech and writing. For this reason, semantic schemata are many in type
but few in frequency or token. For example, 499 different types of
adjective schemata had been used by the authors of the 22 passages
comprising the RMTIAR. In spite of having a key role in Iran-America
relations, the adjective schema new had, however, a relatively low
frequency. Although it enjoyed the position of the second most frequent
adjective among the 499 adjectives, it had been used only 35 times in the
entire textbook, i.e., its token was 35.
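The distinction between type and token can be illustrated with a short
Python sketch. It treats every distinct word form as a type and every
occurrence as a token; it does not classify words into schema domains or
genera, which requires human judgment. The function name
type_token_counts and the sample sentence are invented for illustration.

```python
import re
from collections import Counter


def type_token_counts(text):
    """Count how many times (tokens) each distinct word form (type) occurs."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


counts = type_token_counts(
    "The new policy replaced the old policy, and the new rules apply."
)
print(len(counts))           # number of types: 8
print(sum(counts.values()))  # number of tokens: 12
print(counts["new"])         # token of the type "new": 2
```

In this sentence the type the has a token of three, while most other
types occur only once.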
Syntactic domain schemata, which are traditionally referred to as
“closed-class items” (Quirk, Greenbaum, Leech & Svartvik, 1985, p. 71),
consist of auxiliaries, conjunctions, determiners, prepositions and
pronouns as their genera. In contrast to semantic schemata, syntactic
schemata are few in type but many in token. For example, only 12
auxiliary schemata were employed in the RMTIAR. Among them, the
auxiliary schema have was used 247 times in the textbook. The high
frequency of syntactic schemata such as have is due to their limited and
predetermined syntactic roles such as forming different tenses.
Parasyntactic domain schemata are similar to syntactic schemata
because they depend on and attach to semantic as well as syntactic
schemata in order to constrain them within the variables of place and
time. The schema not is, for example, parasyntactic because it attaches
to a syntactic schema such as does and to a semantic schema such as is
in order to show that particular actions and states have not materialized.
Similar to syntactic schemata, parasyntactic schemata are highly
frequent or have many tokens. The adverb schema not, for example, was
used 80 times in the RMTIAR, i.e., it has a token of 80. Parasyntactic
schemata are also similar to semantic schemata because they can be
open in type. As parasyntactic schemata, numerals, for example, can
include whatever dates speakers or writers may wish to refer to, e.g.,
(December) 1, 1900, 1901, 1902 … n. Table 3.1 provides the type and
token (frequency) of the three schema domains comprising the textbook.
Table 3.1
Schema domains, genera, type, token and their ordered percentage in the
RMTIAR
Schema domain   Genera          Type   Type %   Total type %   Token   Token %   Total token %
Semantic        Nouns           1156   35.9     77             3777    21.4      42
                Verbs            745   23.2                    2236    12.7
                Adjectives       450   14                      1165     6.6
                Adverbs          118    3.7                     179     1
Syntactic       Determiners       59    1.8      6             2256    12.8      44
                Prepositions      49    1.5                    2472    14
                Pronouns          43    1.3                     814     4.6
                Conjunctions      39    1.2                    1188     6.7
                Auxiliaries       12    0.4                    1133     6.4
Parasyntactic   Names            343   10.7     17             1399     7.9      14
                Numerals          74    2.3                     181     1
                Para-adverbs      72    2.2                     536     3
                Abbreviations     58    1.8                     283     1.6
Total                           3218                          17619
Having reviewed the literature, Gholami (2006) decided to find out
whether content schema types, i.e., adjectives, adverbs, nouns and verbs,
had any effect on Iranian test takers’ performance. She developed five
types of schema-based cloze multiple choice item tests (MCITs) on an
authentic text to measure the language proficiency of 92 Iranian
undergraduate university students. The first test was developed only on
adjectives. The second, third and fourth tests were constructed on
adverbs, nouns and verbs, respectively. The fifth was a composite test
because it included all the four types of content schemata.
In addition to schema-based cloze MCITs, Gholami (2006) employed a
disclosed Test of English as a Foreign Language (TOEFL) in order to
validate her schema-based cloze MCITs as measures of English
language proficiency. Before administering the five schema-based cloze
MCITs and TOEFL, she formulated the directional hypotheses (p. 6)
below:
1. Since schemata differ from each other in type, there must be a

significant difference between the tests measuring different types.
2. The highest correlation with the TOEFL is expected to belong to the
[content] schema-based cloze MCIT, since it has all types of
schemata naturally constituting a text.
3. The second highest correlated test with the TOEFL would belong to
the noun-based cloze MCIT which tested the most frequent type of
schema, nouns.
4. The verb schema-based cloze MCIT is expected to be the third
highest correlated test with TOEFL. Since after nouns, verbs are most
frequently used schemata.
5. Adjective schema-based cloze MCIT enjoys the fourth highest
correlation with the TOEFL. Adjective schemata are the next type
used frequently after verbs.
6. Adverb schema-based cloze MCIT is the fifth and the least correlated
one with the TOEFL. Since adverb schemata are the least frequent
among content schemata comprising texts.
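Directional hypotheses such as 2 to 6 above predict a rank order of
correlation coefficients, so testing them requires correlating each
schema-based test with the TOEFL and ordering the results. The Python
sketch below shows this under stated assumptions: all scores are invented
for six imaginary test takers, the names pearson_r and
rank_by_correlation are hypothetical, and the Pearson coefficient is
assumed as the measure of correlation. It does not reproduce Gholami’s
(2006) data or results.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


def rank_by_correlation(tests, criterion):
    """Return (test name, r) pairs ordered from highest to lowest r."""
    coefficients = {
        name: pearson_r(scores, criterion) for name, scores in tests.items()
    }
    return sorted(coefficients.items(), key=lambda item: item[1], reverse=True)


# Invented scores of six imaginary test takers.
toefl = [420, 450, 480, 510, 540, 600]
cloze_tests = {
    "composite": [10, 12, 13, 15, 17, 20],
    "noun": [9, 12, 11, 14, 15, 18],
    "adverb": [11, 9, 13, 10, 14, 12],
}
for name, r in rank_by_correlation(cloze_tests, toefl):
    print(f"{name}: r = {r:.2f}")
```

The hypotheses would be supported only if the observed order matched the
predicted one and the differences were shown to be statistically
significant.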

3.5 Summary
The development and conduct of a research project depend on
facing a genuine problem and taking principled steps to solve it. The
most fundamental step in this regard is reviewing literature and
employing the findings of other researchers in order to refine the
research problem and formulate hypotheses upon which appropriate
research methods can be chosen. Null hypotheses supply researchers
with the most straightforward answer assuming the lack of any
significant relationship between independent and dependent variables.
Directional hypotheses, however, help researchers focus on a significant
relationship which must rationally exist between two variables. The
specification of variables and the statement of their expected
relationships in hypotheses pave the way to ensure that all the
requirements of an acceptable research project are met. These
requirements or characteristics are discussed in chapter four.
3.6 Application
Read the following hypotheses and indicate whether they are null or
directional.
1. Poor reading in the L2 [second language] is due to poor ability in the

L1 [first language]; therefore, poor L1 readers will read poorly in the
L2 and good L1 readers will read well in the L2 (Alderson 1984).
2. Poor reading in the L2 is due to inadequate knowledge of the L2
(Alderson 1984).
3. Poor listening comprehension in the L2 is due to poor listening
comprehension ability in the L1. Poor L1 listeners will listen poorly
in the L2, and good L1 listeners will listen well in the L2
(Vandergrift 2006).
4. Poor listening comprehension in the L2 is due to inadequate
knowledge of the L2 (Vandergrift 2006).
5. There is no significant difference between students’ reading
comprehension of narrative and non-narrative texts especially in the
intermediate stages of language learning (Maibodi 2008).

4 Characteristics of Research
4.1 Introduction
In teaching language, teachers may face many problems requiring
appropriate solutions. The problems may be related to learners, teaching
methods, syllabus, schools, parents and peers, to name a few. In chapter
one, we learned that the first step in conducting research would be
reading the literature in order to find out what others have done so far.
The review of literature will help us not only save time, energy and
resources but also refine our problem and define it in an operationalised
manner which enables us to collect data and run statistical analyses.
In chapter two, we familiarized ourselves with the variables involved in

research by admitting the fact that what brings about different
performances or behaviours stems from differences in individual
attributes. The attributes might be in kind such as gender and nationality
or in amount such as learners’ level of achievement. We also learned
that the variables of a research project must be specified and controlled
so that external factors such as chance and luck do not affect our results.
In chapter three, we noticed that reviewing literature and identifying

various variables involved in our research project should help us predict
the answers we might get after we have collected our data. These
expected answers or hypotheses might be null or directional depending
on whether we predict our treatment will bring about any changes in the
variables we are studying.
In the present chapter, we will focus on three characteristics or standards
all research projects must meet so that their findings can be applied to
solve the problems investigated by those projects. The standards are
validity, reliability and feasibility.
4.2 Validity
Validity denotes the acceptability of an object or an argument as factual
according to certain criteria. For example, the validity of an original
passport depends on its being officially issued for a fixed period of time.
In other words, official approval and having dates of issue and expiry
are among various criteria employed to check the validity of passports.
Another criterion for their validity is bearing a recent photograph of
their holders.
As an example of the validity of an argument, consider the reason
offered for being absent from an examination. The absence will be valid if
it is caused by an accident over which absentees have no control, e.g.,
sickness. The mere announcement of being sick will not be valid unless
they produce the prescription of a trusted physician as a factual
document.
Similar to factual reasons, research projects will be valid if they are
based on observable criteria. Some of these criteria relate to the process
of their conduct, i.e., internal validity, and some concern the findings
obtained during the process, i.e., external validity.
4.2.1 Internal Validity

The internal validity of a research project depends on the variables⁹ on
which research is conducted. The project will be internally valid if, and
only if, its results are the outcome of manipulating or controlling the
variables explored in the research. In other words, as Campbell and
Stanley (1963) noted, internal validity has to do with interpreting
findings of research within the study itself. Researchers, participants,
location, and instruments employed in the projects are among the major
variables that affect their internal validity.
9 Attributes of a person, a piece of text, or an object which vary from
person to person, text to text, object to object, or from time to time
(Hatch & Lazaraton, 1991, p. 51)
4.2.1.1 Researchers
Some research projects are narrow enough to be conducted by a single
researcher. Others are, however, too broad to be handled by one
researcher, especially when a large amount of data must be collected.
In these research projects the bias, implementation, and presence of
researchers and data collectors will have an effect on the nature of the
data they obtain.
4.2.1.1.1 Bias
Bias indicates a researcher’s tendency to consider one person, group, or
idea more favorably than others. This bias stands in sharp contrast to
that of participants, i.e., the halo effect (e.g., Brown 1988, p. 33), and
may influence the researcher’s attitude unfairly in favour of or against
the participants and/or methods he employs in his project.

For example, I supervised a research project dealing with cultural
onslaught and divorce in Sanandaj. Ten male and female undergraduate
university students took part in the project in 1999. They were sent to
local family courts to administer a questionnaire to divorced couples.
After the students’ first visit to the courts, I invited them to my office to
discuss the questionnaire. During the discussion I noticed that one of the
female students was divorced and harbored a negative attitude toward
men. The analysis of her collected data showed that her bias had resulted
in her interviewing only female divorcees.

Fraenkel and Wallen (1993, p. 226) suggested two principal techniques
for handling this problem. The first is to standardize all procedures by
training the data-collectors and the second is planned ignorance, i.e., the
data-collectors lack the information they would need to distort results. I
think the first technique is more plausible and practical than the second.
The main researcher should have as many meetings as possible with his
data-collectors to find and remove their biases and thus standardize all
procedures by their agreement.
4.2.1.1.2 Implementation
Occasions sometimes arise when researchers have to seek the help of
other people to implement their research projects. For example, in
investigating the effect of using two different teaching methods in
Iranian schools, e.g., the audio-lingual method and the silent way, a
female researcher may invite four teachers to join her study. She may ask
two teachers to teach a certain text audiolingually and the other two to
teach the same text through the silent way. It may turn out that students
performed better in the audio-lingual classes simply because their
teachers were better than the other two. This is known as the implementer
threat to internal validity.

The implementer threat may also arise when some implementers are
biased towards certain methods. All four teachers mentioned above
might prefer the audio-lingual method, and the two teachers employing the
silent way in their classes may do so reluctantly. As a result, the
audio-lingual group may perform better on the achievement test.

The implementer threat can be removed by training the teachers and
having them use the two different methods alternately. It is not,
however, practical to have two different teachers teach the same class in
Iran. The best way seems to be to find out the implementers’ preferences
before the research project starts and train the two using the silent way
to teach as required by the method.
4.2.1.1.3 Presence
As researchers we should take whatever steps are necessary in order to be
present in the field where data is collected. Participants in a research
project love to see that their participation is appreciated. They will take
whatever tests are administered to them seriously if the tests are
administered by the researchers themselves. Since the effect of
researchers’ presence was first discovered in a plant named Hawthorne, it
is referred to as the Hawthorne effect in the research literature.

4.2.1.2 Participants
Students, parents or other people involved in educational programs such
as teachers and school staff may form the participants of research
projects in applied linguistics. Some researchers employ the term
subjects instead of participants (e.g., Biria 2002, Tajareh & Tahririan
2003). I believe employing the term participants in research projects
conducted in the social sciences is more descriptive of the people who
voluntarily take part in them. The term participant differentiates humans
from non-human subjects upon whom some experiments are conducted
without their approval in fields such as biology and zoology.

Even in medicine there seems to be an awareness of employing an
appropriate subheading which best represents the people who
voluntarily enroll in medical research projects. For example, Businco,
Businco, Lauriello and Tirelli (2004) labelled their participants patients
instead of subjects. Whatever our choice, the participants of a research
project must be studied in terms of the following variables in order to
ensure its internal validity.
4.2.1.2.1 Attitude
Some research projects require having two groups: experimental and
control. While the experimental group is exposed to the research
treatment, the members of the control group receive no treatment at all.
If they realize that they are not treated like the experimental group,
they may become demoralized or resentful and hence perform more poorly
than the experimental group. It may thus appear that the experimental
group is performing better as a result of the treatment when this is not
the case. The following example is given by Fraenkel and Wallen (1993).
A researcher decides to investigate the possible reduction in
test anxiety by playing classical music during examinations.
She randomly selects 10 freshmen algebra classes from all such
classes in the five high schools in a large urban school district.
In five of these classes, she plays classical music softly in the
background during the administration of examinations. In the
other five (the control group), she plays no music. The students
in the control group, however, learn that music is being played
in the other classes and express some resentment when their
teachers tell them that the music cannot be played in their
classes. This resentment may actually cause them to be more
anxious during exams or to inflate¹⁰ their anxiety scores (p.
229).

Fraenkel and Wallen (1993) offered two solutions to remove the attitude
threat to internal validity. One solution is to provide the control group
with a special treatment and/or novelty similar to that received by the
experimental group. The other solution is to keep them unaware of the
experiment. While the former seems to be more feasible than the latter,
finding a special treatment that does not affect the research outcome
would be very difficult.
4.2.1.2.2 Attrition
When research extends over a long period of time, it may not only affect
the participants’ tolerance but also change the composition of the sample
studied. This change of composition brings about attrition (Seliger &
Shohamy, 1989, p. 101). As research takes months to be completed,
some participants may lose their interest and drop out of the study. It is
also possible that some participants become sick on certain days and fail
to attend, and therefore the data related to their performance on certain
tasks will be missing. This will result in an unequal number of
participants on the tasks, which may put the internal validity of the
project at risk.

Attrition or “mortality” (Fraenkel & Wallen, 1993, p. 223) is a common
problem in questionnaire studies. In such studies, it is not uncommon to
find that 20 percent or more of the subjects involved do not fill out the
questionnaire or return it to the researcher.
¹⁰ A technical term indicating an increase in scores.

Attrition not only limits the generalizability of research but also can
introduce bias–if those participants who dropped out would have
answered the questions differently from those who completed the
research. Quite often this is the case because those who drop out or are
absent act this way for a reason.
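
The point that dropouts may differ from completers can be checked, at least roughly, when some measure was collected from everyone before the dropouts occurred. The following is a minimal Python sketch under that assumption. The function name `attrition_report`, the record fields `pretest` and `completed`, and all scores are hypothetical and serve only as an illustration.

```python
from statistics import mean


def attrition_report(records):
    """Summarise dropout for records like {'pretest': 14, 'completed': True}."""
    if not records:
        raise ValueError("no records supplied")
    stayed = [r["pretest"] for r in records if r["completed"]]
    left = [r["pretest"] for r in records if not r["completed"]]
    return {
        # Share of the original sample that was lost.
        "attrition_rate": len(left) / len(records),
        # Mean pretest score of those who finished the study.
        "completer_pretest_mean": mean(stayed) if stayed else None,
        # Mean pretest score of those who dropped out.
        "dropout_pretest_mean": mean(left) if left else None,
    }


# Hypothetical data: five participants, two of whom dropped out.
sample = [
    {"pretest": 10, "completed": True},
    {"pretest": 12, "completed": True},
    {"pretest": 14, "completed": True},
    {"pretest": 18, "completed": False},
    {"pretest": 20, "completed": False},
]
report = attrition_report(sample)
```

A large gap between the two means would suggest that the participants who left were not a random subset of the sample. It would not show by itself how they would have performed had they stayed.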

Attrition is perhaps the most difficult threat to internal validity to
control. According to Fraenkel and Wallen (1993), it is a misconception
to believe that attrition is eliminated when lost participants are replaced
by new participants. Even if they are replaced by new participants
selected randomly, the researcher can never be sure that the replacement
participants respond as the dropped ones would have. The following
example was given by Fraenkel and Wallen:
A high school teacher decides to teach his two English classes
differently. His one o’clock class spends a large amount of time
writing analyses of plays, whereas his two o’clock class spends
much time acting out and discussing portions of the same play.
Halfway through the semester several students in the two o’clock
class are excused to participate in the annual school play–thus
being “lost” to the study. If they, as a group, are better students
than the rest of their class, their loss will lower the performance of
the two o’clock class (p. 224).
4.2.1.2.3 Expectancy
Participants are indispensable members of the society in which
researchers conduct their projects. They are therefore well aware of the
norms and expectations of their society. When they come across
projects which address their norms, they go with the crowd and answer
what they think the researchers expect them to answer. This threat to
internal validity, which is known as subject expectancy in the literature,
can be removed by adopting appropriate strategies. The first and most
important one requires keeping the identity of participants anonymous.
For this purpose, codes can be specified in advance and the participants
can be asked to employ the codes rather than their names.

However, some research projects require administering various types of
drugs or treatments to different participants. In medical research, for
example, the experimental group receives the drug under investigation
whereas the control group receives nothing. Since the patients in the
control group may get upset if they realize they have not been treated the
same as the experimental group, the researcher must give them a
placebo, i.e., a drug which looks like the experimental one but has no
known effect on the type of disease explored in the research.

The research conductors might, however, influence the outcome of
research projects if they know the participants. The nurses in the hospital
who administer the prescribed drugs may, for example, sympathise with
some patients in the control group and give them the experimental drug
rather than the placebo. In this case, the double-blind technique
(Rosenthal 1966) should be employed, in which neither the research
conductors nor the patients know what type of treatment is administered.
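
The two strategies described above, namely anonymous codes and double-blind administration, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `blind_allocation`, the code format `P001`, and the neutral kit labels `A` and `B` are all introduced here and do not come from any of the studies cited.

```python
import random


def blind_allocation(names, seed=None):
    """Give each participant an anonymous code and a neutrally labelled kit.

    Returns (public_sheet, sealed_key). The public sheet maps code -> kit
    label and is all that data collectors and participants see. The sealed
    key records who holds which code and which kit is the real treatment;
    a third party keeps it until the analysis is finished.
    """
    rng = random.Random(seed)
    shuffled = list(names)
    rng.shuffle(shuffled)  # codes carry no hint of enrolment order
    codes = [f"P{i:03d}" for i in range(1, len(shuffled) + 1)]
    half = len(shuffled) // 2
    kits = ["A"] * half + ["B"] * (len(shuffled) - half)
    rng.shuffle(kits)  # random assignment to the two kits
    public_sheet = dict(zip(codes, kits))
    sealed_key = {
        "names": dict(zip(codes, shuffled)),
        "treatment_kit": rng.choice(["A", "B"]),
    }
    return public_sheet, sealed_key


# Hypothetical participant list.
sheet, key = blind_allocation(
    ["Ali", "Sara", "Reza", "Mina", "Omid", "Neda"], seed=1
)
```

Because the public sheet contains only codes and the labels `A` and `B`, a nurse who handles it cannot tell which patients receive the placebo.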
4.2.1.2.4 History
In addition to participants’ attitude, attrition and expectancy, their
history also affects the internal validity of research projects. History
refers to whatever happens during research with or without the
researcher’s knowledge. If the researcher knows his participants’ history,
he might take the necessary steps to control some variables and thus save
the internal validity of his study. If he ignores the history or remains
unaware of it, he jeopardizes the internal validity.

A male English teacher, for example, may wish to explore whether the
Silent Way¹¹ is a more effective method for teaching English to his
beginner students than the Audio-Lingual Method¹². He assigns his 15
adolescent students to one class and teaches them through the
Audio-Lingual Method. He puts his other 15 beginner students in another
class and employs the Silent Way. However, the results he obtains do not
show any significant difference between the two classes. Both classes
score very high on the final exam. An interview with the students may, to
his disappointment, show that his Silent Way class were studying English
at the same time at a private institute where the same materials were
taught through the Audio-Lingual Method. This means that the teacher did
not get his expected results simply because he failed to check the
history of his Silent Way class and thus violated the internal validity
of his project.
¹¹ A method of foreign-language teaching developed by Gattegno which
makes use of gesture, mime, visual aids, wall charts and wooden sticks of
different lengths and colours. In contrast to methods such as the
Audio-Lingual Method, teachers who employ the Silent Way must stay silent
most of the time and use gestures and various types of aids in order to
help their learners express what they see (see Richards & Rodgers, 1986).
¹² A method of teaching foreign languages based on five principles: (1)
language is speech, not writing, (2) a language is a set of habits, (3)
teach the language, not facts about the language, (4) a language is what
native speakers say, not what someone thinks they ought to say, and (5)
languages are different (Moulton, 1961).
4.2.1.2.5 Maturation
Maturation indicates growing or developing naturally over time. Many
biological, psychological and emotional changes occur over a long
period of time as a result of maturation. If it is not taken into account,
researchers might take those natural changes as the results of their
research treatment.
4.2.1.2.6 Selection
Participants’ selection relates to their being almost the same in all
characteristics except the variables under investigation. In other words,
certain attributes must stay constant in a research project in order to
ensure that the variables under investigation are treated properly. These
constant attributes might include developmental, educational, political,
social and economic backgrounds.

For example, a researcher might wish to find out whether the audio-lingual
method would be efficient for teaching English to his beginner students
in a certain Iranian school. To conduct his research he might put his
adolescent and adult students in one class and teach them audiolingually.
The results he gets might not be what he was looking for, simply because
adolescent learners favour active participation and repetition, the key
requirements for habit formation, whereas adults prefer a more cognitive
approach. Since the participants differ in their age and cognitive
approaches, this research would lack internal validity.

Read the following description of the patients who took part in Businco,
Businco, Lauriello and Tirelli’s (2004) study and decide what attribute
was kept constant in order to ensure the internal validity of their
research.
Participants
A total of 30 consecutive patients were enrolled (16 male, 14
female, age range 18-77 years, mean 45.6), all affected by
idiopathic ethmoidal NP, primary or recurrence. Patients
affected by asthma, mental diseases, chronic diseases, in
general, or used any other drugs during the study, were
excluded (p. 327).
4.2.1.2.7 Sympathy
The sympathy of participants towards researchers is another factor
showing the inappropriateness of calling them subjects. While
participants have the capacity to sympathize with researchers, subjects
such as laboratory animals cannot.

Brown (1988, p. 33) gives an illustrative example. If a questionnaire is
administered to rate teachers on a number of characteristics such as
punctuality and diligence, the students of a given class might rate
their teacher as highly punctual and very hardworking simply because
they like him. In reality, however, the teacher might have been regularly
late and the least hardworking of all. Sympathy is thus different from
the bias discussed in 4.2.1.1.1. While some participants may sympathize
with researchers, researchers may become biased towards certain
participants.
4.2.1.2.8 Tolerance
Tolerance concerns the participants’ capacity to function normally while
a research project is conducted. Although Hatch and Lazaraton (1991,
p. 35) discuss features such as getting tired and/or bored under
maturation, differentiating maturation from tolerance seems necessary.
While maturation entails advancing involuntarily in a developmental
direction over a relatively long period of time, tolerance is affected
when participants experience a physical or emotional change in a short
span of time. Maturation and tolerance should therefore be separated from
each other.

While researchers would never be able to stop their participants’
maturation, they can and should certainly take whatever steps are required
to keep them from getting tired or bored by checking their tolerance
carefully. If participants are required to sit two tests for four
consecutive hours, for example, the mere length of the tests will
exhaust their tolerance. The researchers can administer the tests in two
separate sessions with a break between them and provide refreshments to
help the test takers overcome their exhaustion.

When the participants’ tolerance is affected physically or
psychologically, they cannot perform normally and thus the treatment of
the research will not reveal their true individual differences. If a research
project does not measure true individual differences among participants,
it will not have internal validity.
4.2.1.3 Location
The particular location where a research project is conducted may affect
its outcome. Participants who take a test in a classroom which is larger
and has better lighting and air-conditioning are likely to perform better
than those placed in small, dark and smelly classrooms.

In Western countries such as America and Australia most primary and
high school classes contain small libraries. Unfortunately, in developing
countries like Iran, few books can be found in classes. Even the central
libraries of some Iranian universities lack many newly published books
and journals on English literature and applied linguistics. The number of
specialized books and journals even varies from city to city and
university to university.

The best method to control location is to hold it constant–that is, keep it
the same for all participants. When this is not feasible, you should try to
ensure that different locations of your research do not systematically
differ from each other. This will require the collection of enough
descriptions of the various locations before the research starts. Here is
an example of a location threat to internal validity.
“A researcher designs a study to compare the effects of team
versus individual teaching of US history on student attitudes
toward history. The classroom in which students are taught by a
single teacher have fewer books and material than the ones in
which students are taught by a team of three teachers”
(Fraenkel & Wallen, 1993, p. 225).
4.2.1.4 Instruments
Not only participants’ characteristics and location but also instruments
used in a research project will influence its internal validity. Instruments
relate to whatever tools a researcher employs to quantify his variables.
Some common instruments used in educational research projects include
tests, questionnaires, tapes and films.

A researcher should decide what instrument he is going to use before he
starts his research project. The selection of any instrument depends on
the type of research project he intends to conduct. After selecting the
instrument, the researcher should examine it carefully and describe it in
whatever detail is necessary in his final report. The properties of
research instruments which need to be considered lest the research lack
internal validity include construct validity, content validity, empirical
validity, directions, subjectivity, and test effect.

4.2.1.4.1 Construct Validity

The term “construct”, according to Stratton and Hayes (p. 42), is used in
personal construct theory to “define concepts in a precise way”. Hatch
and Lazaraton (1991) believed that many constructs exist in applied
linguistics for which there are no direct measures, e.g., “motivation,
need achievement, attitude, role identification, acculturation,
communicative competence and so forth” (p. 541). Construct validity is
thus the process of defining these concepts as precisely as possible, and
then designing specific tests to measure them in an objective way.

In order to define a construct such as language proficiency (LP),
researchers do, however, need to have a theory. Khodadady (2012), for
example, believed that three theories have been proposed so far to define
the LP construct, i.e., language components, reduced redundancy, and
schema. The first approaches LP as a construct consisting of four
components, i.e., structure, written expressions, vocabulary and reading
comprehension. Some traditional multiple choice item tests (MCITs) are
then designed on these four components and administered to language
learners, and their scores are employed as objective measures of the
learners’ LP. Khodadady (1997) questioned the first theory on two
grounds. First, it does not address the type of texts on which the MCITs
are developed. Secondly, it does not explain how the MCITs must be
designed.

As the second theory of LP, reduced redundancy (RR) was proposed by
Spolsky (1973) who assumed that
The non-native’s inability to function with reduced redundancy,
evidence that he cannot supply from his knowledge of the
language the experience on which to base his guesses as to what
is missing. In other words, the key thing missing is the richness
of knowledge of probabilities - on all levels, phonological,
grammatical, lexical, and semantic - in the language (p. 17)

Klein-Braley (1997) employed the RR theory to develop conventional
C-Tests. To develop these tests, every second word from the second
sentence of some short passages is mutilated to be restored by test
takers. Klein-Braley argued that “knowing a language certainly involves
the ability to understand a distorted message, [and] to make valid
guesses about a certain percentage of omitted elements” (p. 47).
Khodadady (2013), however, argued that the conventional C-Tests
lacked construct validity because the RR theory can explain neither
their construction nor their development on several short texts.
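
The construction rule described above can be illustrated with a short Python sketch. Two assumptions are made here that the text does not state: "mutilating" a word is taken to mean deleting its second half, a common C-Test convention, and counting starts afresh with the first word of the second sentence. The function names `mutilate` and `make_c_test` and the sample passage are hypothetical.

```python
import re


def mutilate(word):
    """Keep the first half of a word (rounded up) and blank out the rest."""
    keep = (len(word) + 1) // 2
    return word[:keep] + "_" * (len(word) - keep)


def make_c_test(passage):
    """Leave the first sentence intact; mutilate every second word after it."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", passage.strip())
    output = [sentences[0]]
    count = 0  # running word count from the second sentence onward
    for sentence in sentences[1:]:
        tokens = []
        for token in sentence.split():
            match = re.fullmatch(r"(\W*)(\w+)(\W*)", token)
            if match is None:
                tokens.append(token)  # e.g. "don't": left untouched
                continue
            count += 1
            lead, word, trail = match.groups()
            # One-letter words are counted but cannot be halved.
            if count % 2 == 0 and len(word) > 1:
                word = mutilate(word)
            tokens.append(lead + word + trail)
        output.append(" ".join(tokens))
    return " ".join(output)


# Hypothetical passage.
c_test = make_c_test("Cats sleep a lot. They like warm places. Dogs run.")
```

For the sample passage the first sentence is left whole and the result reads "Cats sleep a lot. They li__ warm pla___. Dogs ru_." Test takers would then restore the blanked halves.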

The third theory approaches language proficiency as a construct
entailing understanding all the words constituting written/oral texts in
terms of the syntactic, semantic, and discoursal relationships they hold
with each other. Since they have already been produced by
writers/speakers under real conditions, in real places, at real times and
for real purposes, i.e., being read or heard, the constituting words,
i.e., schemata, of authentic texts enjoy pragmatic relationships among
themselves by their being chosen to convey the message (Khodadady & Elahi,
2012). Schema theory not only specifies the type of texts but also
explains what types of alternatives must be developed if multiple choice
items are to be designed on the texts.

Construct validity of psychological measures such as language
proficiency tests, therefore, involves defining what they measure as
precisely as possible by using a theory to explain the definition. They
must then be administered to a representative sample of test takers to
find out whether they measure what they have been designed to measure.
Construct validity is thus “an integration of any evidence that bears on
the interpretation or meaning of the test scores” (Messick 1989, p. 17).
4.2.1.4.2 Content Validity

Content validity of a research instrument deals basically with the
participants of a research project and what they do, study or use during
the course of the research. For example, Seif and Khodadady (2003)
conducted a study to find out whether schema-based cloze multiple-choice
item tests (MCITs) could be used as indirect measures of university
students’ translation ability. After reviewing the literature, they
decided to develop a schema-based cloze MCIT, a Persian to English MCIT,
and an English to Persian open-ended translation examination (OETE) in
order to explore the hypotheses below:
1. The schema-based cloze MCIT will correlate significantly with the
Persian to English MCIT.
2. The schema-based cloze MCIT will correlate significantly with the
English to Persian OETE.
3. The scores obtained on the schema-based cloze MCIT will be
significantly higher than the Persian to English MCIT and English to
Persian OETE.

Before developing their three tests, Seif and Khodadady (2003),
however, had to specify who their participants were so that they could
base the content of the tests on what the participants had to study. Since
113 undergraduate university students majoring in Arabic Language and
Literature took part in their project, they had to first determine the
participants’ language proficiency level. Based on the students’
enrollment in the course English for Specific Purposes II (ESP II), they
were assigned to an intermediate level of English language proficiency
because they had to pass General English and ESP I before they took
ESP II.

Specifying the participants and their language proficiency level helped
Seif and Khodadady (2003) choose the materials which best suited their
current level of proficiency and thus secured the content validity of their
three tests. They selected a collection of 16 passages which ranged from
400 to 500 words in length. The passages were chosen from The
Literary History of the Arabs (Nicholson, 1969), The Encyclopaedia of
Islam (Brill, 1971), and Anthology of Islamic Literature from rise of
Islam to the present time (Kritzeck, 1964). These references are widely
used as textbooks in Iran. The content of these references is also used
to develop tests employed for the admission of students who wish to
continue their graduate studies in Arabic in Iran.
4.2.1.4.3 Empirical Validity

The term empirical is related to human knowledge and whether it is
inborn or gained through experience. Nativists believe that certain
abilities are innate or inborn and therefore need not be gained through
experience. In contrast, empiricists believe that the human mind is like
a tabula rasa, or blank slate, at birth. All human knowledge, therefore,
comes from sensory experiences. This latter belief has affected the way
the validity of research projects is established.

An instrument employed in a research project will have empirical
validity if at least one piece of external evidence is produced to confirm
the researcher’s claim that it really does what it is designed to do. For
example, a translation examination will be a valid measure of translation
ability if it requires its examinees to translate a linguistic unit such as a
sentence from one language to another. Seif and Khodadady (2003),
therefore, asked the participants of their research project to translate 20
English sentences to Persian. The English sentence below, along with its
Persian translation, is given as an example.
It is generally believed that from the long and monotonous march of
the caravans and the uniform stride of the camels grew the unique
rhythmic song of the riders which incited the camels to a faster pace.
عموماً عقیده بر این است که از راه رفتن طولانی و یکنواخت کاروان و صدای هماهنگ
پای شتران آواز آهنگین سواره ها بوجود آمد که باعث شد شتران با سرعت بیشتر
حرکت کنند.
No one can dispute the validity of the translation question given above
because it requires translating the sentence from English to Persian. Seif
and Khodadady (2003), however, claimed that schema-based cloze
MCITs, which are designed in only one language, i.e., English, can
measure their takers’ Persian translation ability indirectly.

In order to confirm their claim, Seif and Khodadady (2003) formulated a
hypothesis, i.e., the schema-based cloze multiple choice item test
(MCIT) will correlate significantly with the English to Persian
open-ended translation examination (OETE). In other words, they
used the English to Persian OETE as empirical evidence to validate
their schema-based cloze MCIT as an indirect measure of translation
ability.

Naturally, the more external evidence is produced by researchers to
support their hypotheses, the sounder their research project becomes.
For this reason, Seif and Khodadady (2003) formulated another
hypothesis, i.e., the schema-based cloze MCIT will correlate
significantly with the Persian to English MCIT. The administration of
their three tests to 113 undergraduate university students majoring in
Arabic Language and Literature produced the results given in Table 4.1.
Table 4.1
Correlation coefficients of the three tests
Tests                      Persian to English MCIT   English to Persian OETE
Schema-based cloze MCIT    .61*                      .71*
Persian to English MCIT                              .61*
Note: * p < 0.01
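
For readers who have not yet met correlation coefficients, the following Python sketch shows how a Pearson coefficient of the kind reported in Table 4.1 is computed from two sets of scores. It is an illustration only: the function names and the six pairs of scores are hypothetical, and the source does not state which software or which type of coefficient Seif and Khodadady (2003) used.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value used to test whether r differs from zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical scores of six test takers on two tests.
cloze_scores = [12, 15, 9, 20, 17, 11]
translation_scores = [10, 14, 8, 18, 13, 12]
r = pearson_r(cloze_scores, translation_scores)
```

As a rough check on the table, `t_statistic(0.61, 113)` gives a value of about 8.1, which is consistent with the note that the coefficients are significant at p < 0.01.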
4.2.1.4.4 Directions
Directions relate to all instructions the participants in a research project
must follow to complete the required tasks. Before the directions are
included in the final draft, they must be carefully planned and piloted.
While planning requires specifying exactly and clearly what the
participants must do during the research, piloting involves administering
the instrument and directions to a representative sample of participants
and analyzing their performance before administering them to the target
sample.

Directions play a vital role in written tests developed on foreign
languages. The present author was holding and proctoring a test at an
Iranian university when a monitor approached him and asked for help.
Two students had been seated along with his own students and required
to take a test designed by a researcher who was absent himself. The
students wanted to know what they had to do with certain items on a
page.

One of the students handed the test booklet to the present author and he
could find directions neither on the questioned page nor on the previous
one. After close scrutiny he could spot the directions in the top corner of
page three where it was stapled to other pages! They were written in
English and were virtually indistinguishable from the test items. The
test was designed on an ESP¹³ course for Iranian university students
majoring in industrial management. If the students had not sought
help and if he had not been there, they would have answered the questions
differently and thus invalidated the research.

The directions given on any written research instrument such as a test
should stand out from other parts of the instrument. They should also be
clear and easy to understand. If there is any possibility of
misunderstanding, the directions should be given in the participants’
mother tongue (Khodadady, 1999).
4.2.1.4.5 Subjectivity
Instruments employed in a research project may constitute a threat to the
internal validity if they change according to the person who uses them.
This is usually referred to as subjectivity or “instrument decay”
(Fraenkel & Wallen, 1993, p. 225). It occurs when the instrument
permits different responses as in open-ended questions and essays.
¹³ English for specific purposes (ESP) courses focus on distinctive
features of the language, especially vocabulary, that are most immediately
associated with its restricted use, e.g., technical terms in agriculture
(Munby, 1978, p. 2).

I was asked to mark some examination papers as the second marker in a
university. When a high-stakes examination such as selection for graduate
and postgraduate programs is held, another marker is also invited to
remove the subjectivity of marking open-ended questions. The marks I
assigned to the examination papers were drastically different from those
of the other marker because we had apparently adopted two different
approaches: holistic and analytic. Since holistic marking rests on a given
marker’s overall assessment of a question, it is prone to subjectivity. (It
is worth noting that some markers have a special talent for holistic
marking and their marks usually correlate significantly with those of
objective or analytic markers.)

To overcome subjectivity, an analytic approach must be adopted. In this
approach, all possible responses to questions are specified in advance
and an objective procedure for scoring is set accordingly. This procedure
does not remove subjectivity altogether; it does, however, decrease
subjectivity to a very noticeable degree.

In marking the English to Persian open-ended translation examination,
Seif and Khodadady (2003), for example, broke the English sentences
into their constituent parts and assigned a given score to each. They also
translated the English sentences to Persian themselves, broke them into
parts and assigned scores to each part in order to have an objective
measure to mark the translations as shown below.
It is generally believed that from the long and monotonous march of

the caravans (0.25) and the uniform stride of the camels (0.25) grew
the unique rhythmic song of the riders (0.25) which incited the
camels to a faster pace. (0.25)
‫( وﺻﺪاي‬0/25) ‫ﻋﻤﻮﻣًﺎ ﻋﻘﻴﺪﻩ ﺑﺮ اﻳﻦ اﺳﺖ آﻪ از راﻩ رﻓﺘﻦ ﻃﻮﻻﻧﻲ و ﻳﻜﻨﻮاﺧﺖ آﺎروان‬
‫( آﻪ ﺑﺎﻋﺚ ﺷﺪ‬0/25) ‫( ﺁواز ﺁهﻨﮕﻴﻦ ﺳﻮارﻩ هﺎ ﺑﻮﺟﻮد ﺁﻣﺪ‬0/25) ‫هﻤﺎهﻨﮓ ﭘﺎي ﺷﺘﺮان‬
(0/25) ‫ﺷﺘﺮان ﺑﺎ ﺳﺮﻋﺖ ﺑﻴﺸﺘﺮ ﺣﺮآﺖ آﻨﻨﺪ‬

Another solution would be to change the questions and responses into
multiple-choice format. Regardless of the type of the instrument, all
answers must be written by the test designer and answer keys for scoring
the instrument must be prepared before administering the instrument.
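
The analytic procedure illustrated with the caravan sentence above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name analytic_score and the marker's judgments are hypothetical, and only the four segments and their 0.25 weights come from the example.

```python
def analytic_score(key, judgments):
    """Add up the points of every segment the marker judged correct."""
    if len(key) != len(judgments):
        raise ValueError("one judgment is needed for each segment")
    return sum(points for (_, points), ok in zip(key, judgments) if ok)


# Answer key prepared before marking: (segment, points) pairs from the example.
key = [
    ("from the long and monotonous march of the caravans", 0.25),
    ("and the uniform stride of the camels", 0.25),
    ("grew the unique rhythmic song of the riders", 0.25),
    ("which incited the camels to a faster pace", 0.25),
]

# Hypothetical judgments for one translation: the third segment is wrong.
judgments = [True, True, False, True]

score = analytic_score(key, judgments)
print(score)  # 0.75
```

Because the key is fixed in advance, two markers who agree on which segments are rendered correctly will necessarily assign the same score.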
4.2.1.4.6 Test Effect

Some research projects require treating participants with certain
materials over time. For example, a researcher may wish to employ a
new textbook in his English classes. He wants to see if his students will
score higher on an achievement test¹⁴ if they are taught the subject using
this new text than those who study a regular text.

To determine whether the students perform better by using the new
text, the researcher has to develop an achievement test on the basis of
content presented in both new and regular texts and administer it to his
students at the beginning of the term. This is technically referred to as a
pretest.

At the end of the research or term, the researcher must also administer
the same test to the same students in the same location. This
is called a posttest. A significant difference between the performance of
students on the pretest and posttest may show which textbook leads to a
better or higher performance on the part of students.

The students may score higher on the posttest because they studied the
new textbook. They may also be test-wise and use the items of the
pretest to determine what will be studied during the research and
accordingly make a greater effort to learn the material. This is known as
the test effect.

To remove the threat of test effect on internal validity, some researchers
develop parallel tests, i.e., two tests having the same content presented
in different items. However, as Hatch and Lazaraton (1991) pointed out,
the researcher must be completely sure that the two tests are equivalent;
otherwise he cannot support his claim that the students scored higher
because the new textbook was taught instead of the regular one.
¹⁴ A test designed to “measure the degree of students’ learning from a particular set of
instructional materials” (Farhady, Jafarpoor & Birjandi, 1997, p. 24).
4.2.2 External Validity

Seliger and Shohamy (1989) declared that research projects would have
external validity if researchers could apply or generalize their findings to
“situations outside those in which the research was conducted” (p. 105).
According to Seliger and Shohamy, seven factors affect external
validity: population characteristics, interaction of subject selection and
research, the descriptive explicitness of the independent variable, the
effect of the research environment, researcher or experimenter effects,
data collection methodology and the effect of time.
It seems that Seliger and Shohamy (1989) used internal and external
validities interchangeably. This indiscriminate application results from
their attempt to relate the concept of external validity to research types
in terms of their function, i.e., basic, applied and practical. For example,
in their application of the effect of time on external validity Seliger and
Shohamy wrote:

In applying this concept to external validity, we are concerned
with the degree to which the time frame established by the
research context can be extended to the real world to which the
results of the research will be generalized. That is, will the
results of the conditions be valid when they are applied to
situations in which the conditions of time are not controlled and
where, as in second language acquisition, the concern is with
long-term changes? (p. 110)

Seliger and Shohamy (1989) seem to be using acquisition and learning
synonymously. While the former takes place in natural settings such as bus
stations, the latter takes place in formal settings such as classes. This
very distinction between acquisition and learning reveals the irrelevant
nature of their argument. It is the belief of the present author that
research projects differ from each other in terms of their purposes. The
findings of basic research, such as those on language acquisition, are not
meant to be applied to classrooms. This is the responsibility or purpose of
applied research. As long as a research project is announced to be basic
in nature, looking for its external validity would be unreasonable,
though in many cases some researchers do try to find out whether the
findings of basic research have any bearing on real-life situations.

Compared to Seliger and Shohamy (1989), Hatch and Farhady (1982)
approached external validity in a more realistic manner and applied it to
“other similar situations in the real world” (p. 8). Interpreting external
validity in terms of the real world is somewhat perplexing. It seems that
employing other similar situations is more definitive than situations
outside. For example, no one expects patients suffering from pancreas
problems in the real world to take the drugs manufactured for stomach
problems, which are similar to pancreas problems. This similarity has
unfortunately resulted in many deaths, especially those related to
pancreatic cancer. Hatch and Farhady’s concern with the real world led
them to conclude:

… there is a trade-off between maximizing internal and
external validity. In order to have the most valid results we
restrict our procedures as carefully as possible, often to
laboratory procedures, which are not generalizable beyond the
laboratory. And maximizing external validity militates against
internal validity (p. 9).

The present researcher is of the opinion that there is no negative
relationship between internal validity and external validity, i.e., it is
not the case that as the internal validity of a research project increases,
its external validity decreases. On the contrary, the external validity of
research findings increases to the extent to which their internal validity
increases and vice versa. The external validity of any research project
depends on the representativeness of its sample. As the number of
participants in a sample increases, the internal and external validity of
the study increase, too. The external validity of a research project depends
on the extent to which its results can be applied to the
accessible population from which the sample has been selected and
whether the accessible population represents the target population. The
types of populations and sample selection will be discussed in the next
chapter.
4.3 Reliability
Reliability of a research instrument such as a test refers to its ability to
provide its users with dependable results. This means that whatever
instruments you employ in designing and conducting your research
project, they must be robust in their content and design in order to yield
consistent results.

If you remember, we discussed psychometric variables in chapter two
and divided them into five categories: categorical, ordinal, interval, ratio
and string. Among these variables, one can say for sure that ratio
variables will provide the most reliable results because they have a true
zero as their starting point and their values can, therefore, be
meaningfully multiplied and divided.

For example, if we study the time two participants in our
research project take to read a passage and answer questions related to the
passage correctly and notice that the first participant spends 5 minutes
whereas it takes 10 minutes for the second, we can say that the first
participant reads twice as fast as the second.

However, if we ask two university students to write an essay on “Which
is more important in success: nature or nurture?” and they obtain 10 and
20 out of 25, can we say that the student who got 20 on her essay writes
two times better than the student who scored 10? If we mark the essays
again, say after one week, will we give the same marks to the same
students?
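
One hedged way to approach the last question is to mark the same essays on two occasions and correlate the two sets of marks. The sketch below assumes five hypothetical essays with invented marks out of 25; pearson_r is an illustrative helper written for this example, not a procedure taken from this book.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of marks."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical marks (out of 25) given to the same five essays one week apart.
first_marking = [10, 20, 15, 18, 12]
second_marking = [12, 19, 14, 20, 11]

r = pearson_r(first_marking, second_marking)
print(round(r, 2))  # the closer to 1, the more consistent the marker
```

A value near 1 would suggest that the marker ranks the essays consistently, although it would not show that a mark of 20 means "twice as good" as a mark of 10.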

As the examples of time spent on reading and of marking the essays on two
different occasions show, the selection of variables in terms of their
psychometric features will affect the reliability of our results. And since
instruments form indispensable parts of internal validity in any research
project, establishing their reliability will help secure their validity too. In
other words, for a research project to be valid, it must be reliable as well.
4.4 Feasibility
Before any research project starts, its designers must ask themselves
whether conducting their project would be practical in terms of available
expertise, resources and facilities. For example, if we are interested in
exploring a topic which requires highly sophisticated software, we
must decide whether we can master the software and work with it within
the projected time.

If our research project requiring sophisticated software is funded by an
organization, then we will be able to find some experts who could help
us out with the software. In addition to taking the factors of expertise
and expense into account, the researchers must find out whether they
will have access to the software and whether there will be enough
participants to take part in the project.
4.5 Summary
For a research project to be acceptable, it must meet the requirements of
validity, reliability and feasibility. It will be a valid project if its
designers, conductors, participants, instruments and location are studied
carefully and all necessary steps have been taken to foresee and control
extraneous variables which might affect its results.

For a research project to be valid, it must employ objective instruments
to obtain reliable results. The reliability of the project is not limited to its
instruments and involves consistency and care on the part of its
conductors and participants as well. The location and circumstances
under which it is conducted must also be well controlled so that the
results obtained would be the outcome of what has been done rather than
anything else.

Since research projects are conducted to find solutions to the problems
faced by given populations whose representatives will be reflected in the
selected sample, researchers need to be careful as regards who takes part
in the projects. If their participants do not represent the population under
study, their findings will serve little purpose, if any. We will study research
population and sample in chapter five.

5 Population and Sampling
5.1 Introduction
Applied linguistics deals with identifying the findings of almost all
fields which are related to human languages directly and indirectly and
employing them in teaching language. In addition to applying the
findings of various fields as diverse as sociology and physics, applied
linguists themselves design and conduct various types of research
projects in order to find solutions to the problems not addressed by other
fields.

Research projects conducted in any field require their designers'
thorough familiarity with the population on which they are developed.
The researchers have to use various documents, especially those
published by governmental organizations, to obtain the data related to
their population. Naturally, the whole process through which the
projects are designed and carried out will depend on the type and nature
of population under investigation.

Since applied linguistics deals basically with formal education offered to
school and university students, the first question which faces researchers
in Iran, for example, would be their population. There are a number of
well-functioning sites which provide the necessary information related
not only to students but also to other populations such as farmers and
employees online. One of the most convenient ones is the Statistical
Center of Iran. It can be accessed at http://amar.sci.org.ir/.

There are, however, many other sources which deal with certain aspects
of education more specifically and thus provide information in a more
refined manner. These sources can also be accessed electronically. One
of these specific online sources is the Embassy of the Islamic Republic of
Iran in Copenhagen. It can be accessed at
http://www.iran-embassy.dk/eng/index.htm. According to a PDF file accessed in July
2008 on this site, for example, more than 15 million students were enrolled in
different grades in Iranian schools in 2004-2005. These students
might have provided a very interesting population for some research
projects in applied linguistics.
5.2 Population Defined

The word population refers to any specified aggregate or group of
animals, persons, items, objects, or events which share a given attribute.
For example, Persian cats form a population whose members come from
Persia as their place of origin. Being Persian is, therefore, an attribute
which all Persian cats share. Similarly, Iranian students form a
population whose members share one particular nationality.

Although Persian cats and Iranian students are defined by two different
attributes, each forms an indefinite population whose members will vary from place
to place and time to time. They include past, present and future Persian
cats and Iranian students who lived, are living now and might live in
various parts of Iran in particular and of the world in general.

In contrast to their indefinite counterparts, definite populations comprise
a large number of members confined by certain constants such as place
and time. In research projects, they are usually described as carefully as
possible so that whoever uses their findings can clearly decide what
features of the population the participants share and to whom their
findings can be applied. For example, primary and secondary education
students in Iran in 2004-2005 typified a definite population which
consisted of 15,306,757 learners enrolled at primary and secondary
schools in Iran from September 2004 to June 2005.
5.2.1 Target Population

The target population is a type of indefinite population to which researchers
always wish to generalize their findings. It is the ideal choice the
researchers make (Fraenkel & Wallen, 1993) in order to free themselves
from factors of place and time and offer solutions which might be
applicable universally. For example, after conducting their research
project on twelve year 10 students, Felix and Lawson (1994) concluded:

The findings of the present study suggest that the addition of
music, relaxation and suggestion to communicative teaching
does, indeed, have the potential to affect both quantitative and
qualitative measures of language learning (p. 16).

The direct quotation above demonstrates how researchers usually
generalize their findings to their target population. Felix and Lawson
(1994) conducted their research on year 10 students who studied in a
specific metropolitan secondary school. They did, however, generalize
their findings to all foreign language learners by saying the addition of
music, relaxation and suggestion to communicative language teaching
affects “language learning.” Now read the description below and decide
what the target population of the study is.

The subjects were 184 randomly selected undergraduate female
English majors at Azzahra University in Tehran. To secure the
validity of the data, the scores of 34 subjects who did not
complete all the study measures were excluded from the data.
Thus, the data for this study were obtained from 150 subjects.
Based on the performance of the subjects on an earlier version
of the Michigan Language Proficiency Test, they were divided
into two groups: high group, those scoring above the mean, and
low group, those scoring below the mean (Farhady & Sajadi,
1998, pp. 25-26)
5.2.2 Accessible Population

In contrast to target populations, accessible populations are the realistic
choices made by the researcher to explore a research question (Fraenkel
& Wallen, 1993). They are controlled by the place and time at which
research projects are conducted. In other words, the accessible
population of a research project consists of all the individuals who are
present at the time of the project and whose location the researcher
knows. Read the following description of the sample used
by Felix and Lawson (1994) and answer the questions raised on the
description.

One year 10 class consisting of twelve students, five boys and
seven girls (mean age 14 years 11 months), at a metropolitan
secondary school took part. Students were in their third year of
learning German. They had not been streamed for ability (p. 4).

1. How many classes were chosen in the study?
2. How many students took part in the research project?
3. What variables did the participants share?
4. What is the accessible population of the sample?
5.2.3 Population and Normal Distribution

Psychologists have developed various types of intelligence tests in order
to measure human ability in terms of learning and dealing with new
situations. Among these, Stanford-Binet and Wechsler intelligence tests
have been widely used throughout the world to reach educational
decisions (see Fancher, 1985; Reber, 1995; Wechsler, 1944; Zenderland,
1998). Table 5.1 presents the range of intelligence scores or quotients
obtained on these two tests administered to a very large population.

Table 5.1
Classification of test takers based on their IQs.
Intelligence Quotient Range    Classification
140 and over                   Genius or near genius
120-140                        Very superior intelligence
110-120                        Superior intelligence
90-110                         Normal or average intelligence
80-90                          Dullness
70-80                          Borderline deficiency
Below 70                       Definite feeble-mindedness
  50-69                        Moron
  20-49                        Imbecile
  Below 20                     Idiot

We now know that 15,306,757 learners enrolled at primary and
secondary schools in Iran from September 2004 to June 2005. We also
know that the Iranian Ministry of Education administers Stanford-Binet
and/or Wechsler intelligence tests to identify the students who need
special attention and treatment such as being placed in particular
educational programs. If we had access to the learners' files, we could
calculate the mean and standard deviation of the students' intelligence
quotients (IQs). If we did, we could come up with a normal curve which
is the basis of all probabilistic studies.

Figure 5.1 shows a normal curve. If the performance of a large
population, e.g., 15,306,757 pre-primary, primary and high school
students, is analyzed and the mean and standard deviation of raw scores
are estimated and put on a graph like Figure 5.1, we will have a normal
curve. (We will study means and standard deviations in chapter 11.) The
normality of a curve helps us reach conclusions and apply them to our
research projects. For example, as shown in Figure 5.1 below, about 50
percent of all students score 100 or below on the Stanford-Binet and
Wechsler intelligence tests.

Figure 5.1
Intelligence IQ normal curve (Source: de la Jara n.d.)
If you look at the normal curve in Figure 5.1 once again and focus on
the IQs 116 and 84, you will find 84.134 and 15.866 cumulative
percentages under them, respectively. (We will deal with cumulative
percentages or percentiles in chapter 11.) By subtracting these two
percentages from each other, i.e., 84.134 - 15.866, we will get 68.268. This means
that more than 68% of students have an IQ of 84 to 116 if measured by
the Stanford-Binet intelligence test. This knowledge will be very necessary
when we select a sample for our research project in which intelligence
plays an important role.
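
The cumulative percentages read off Figure 5.1 can be checked with Python's standard library. The sketch assumes that IQs follow a normal distribution with a mean of 100 and a standard deviation of 16; the value 16 is an assumption inferred from the figure, in which 84 and 116 lie one standard deviation below and above the mean.

```python
from statistics import NormalDist

# Assumed IQ distribution: mean 100, standard deviation 16 (an assumption).
iq = NormalDist(mu=100, sigma=16)

below_116 = iq.cdf(116) * 100   # cumulative percentage up to IQ 116
below_84 = iq.cdf(84) * 100     # cumulative percentage up to IQ 84
between = below_116 - below_84  # percentage of IQs from 84 to 116
at_or_below_mean = iq.cdf(100) * 100

# Roughly 84.13, 15.87, 68.27 and 50.0
print(round(below_116, 2), round(below_84, 2),
      round(between, 2), round(at_or_below_mean, 1))
```

The small difference from 68.268 arises because the text subtracts two percentages that have already been rounded to three decimal places.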
5.3 Sampling
The ultimate purpose of any research project is to explore the
relationship between two or more variables common in a population.
When a sample, i.e., a group of participants, is chosen, it is not the
sample which is explored but the population from which the sample is
drawn. For this very reason, whatever the size of the sample, it must be
representative of the population of interest. We have already seen that the
selection of participants is a vital variable in internal validity (see chapter
four). If the participants of a research project are not sampled properly, the
project will not have internal validity. For obtaining a representative
sample of any population, two major procedures have gained popularity
and widespread use: stratified random sampling and cluster sampling.
Since both of these sampling procedures depend on the concept of
randomness, we will explore it first and then elaborate on the two
procedures.
5.3.1 Randomness: Homogeneity and Mixture

The participants of a research project will not represent their
population if they have not been selected randomly. The concept of
randomness rests on two principles. First, the population from which a
given sample is taken is homogeneous. Second, the members of the
population are well mixed and there is no systematic order in their
collection.

In order to receive feedback from their students, for example, teachers
usually raise some questions related to previous sessions or even years.
If we view the students of a class, say grade one in an Iranian high
school, as a finite population, selecting a few students from the same
class to answer questions will be sampling.

Since grade one high school students have all passed centrally
administered final examinations when they graduated from guidance
schools, the students of a given grade one class will be homogeneous in
their background knowledge. This means that the first principle of
random selection, i.e., homogeneity, is taken into account. If a language
teacher chooses a student sitting in the front row to answer a
question on a topic like the simple present tense, will his selection meet the
second principle of randomness?

You know well that highly able and active students usually sit in the
front row. If a language teacher chooses one of these students and she
answers the question on the simple present tense correctly, it does not
necessarily mean that the whole class knows this particular tense simply
because the selected sample does not represent the whole class. In other
words, the teacher has violated the second principle, i.e., the student was
not chosen from a well-mixed list.

In order to ensure that the members of a finite population such as a given
grade one high school class are well mixed, a language teacher must first
get the names of all students, write them down on slips of paper, put
them in a container like a hat, mix the slips well and then ask one of the
students to come and take a slip as the first random sample of the class.
(The student should not look at the contents of the hat and should hold her
head up while she takes one slip randomly.) This simple random sampling
is one of the most widely used procedures to avoid sampling
bias.
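
The hat procedure can be imitated on a computer. In the minimal sketch below, the class list, the helper name draw_from_hat and the seed value are all hypothetical; random.sample plays the role of mixing the slips and drawing them without looking.

```python
import random


def draw_from_hat(names, k, seed=None):
    """Draw k different names at random, as if picking slips from a hat."""
    rng = random.Random(seed)  # a fixed seed only makes the draw repeatable
    return rng.sample(names, k)


# Hypothetical class of 30 students identified by number.
class_list = [f"Student {i:02d}" for i in range(1, 31)]

chosen = draw_from_hat(class_list, 3, seed=2013)
print(chosen)
```

Every student has the same chance of being drawn, regardless of where she sits in the classroom.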
While writing, mixing and picking a few slips from a hat might be
practical in a small finite population such as the members of a class, it
becomes cumbersome when researchers start collecting the sample from
a relatively large population. A more practical way of simple random
sampling is using tables of random numbers. A table of random numbers
contains an extremely large list of numbers that has no order or pattern.

Since random numbers must follow no order or pattern,
researchers utilize random number tables prepared by various
organizations. Appendix 5.1 presents parts of the Table of 105,000 Random
Decimal Digits prepared by the Interstate Commerce Commission, Bureau
of Transport Economics and Statistics, Washington, D. C. (Freund, 1973,
pp. 489-492). The table consists of blocks in which five rows and seven
columns of five-digit random numbers are given. Based on this table you can
choose your sample randomly.

For example, to choose a sample of 30 students from an accessible
population of 300, we might number the students 001, 002, 003, … and
300. We can then arbitrarily pick a page in a table of random numbers,
select a block, start anywhere in a column of the chosen block and
start reading three-digit numbers. (Why three digits? Because the final
number, 300, consists of three digits.) Most random number tables
consist of five-digit numbers such as 39876 and 25879. We read the first
three digits of each number from the left, i.e., 398 and 258. In choosing the
numbers we move down the page reading off three-digit numbers, skipping
those which do not apply. For choosing our sample of 30 students we
may arbitrarily use the second block of Appendix 5.1.

Table 5.2 presents the first two blocks of Appendix 5.1. If we look at the
second column and the first row of the second block, we will find
15405. We read the first three digits, i.e., 154, and choose the student
having this number because it does not exceed the 300 individuals in our
accessible population. Look at the second number: It is 21694. We read
the first three digits, i.e., 216, and choose student number 216. The third
and fourth numbers are 49810 and 32196. Since there are no students
numbered 498 and 321 in the population, we skip
them and go to the fifth number: It is 13697. We read the first three
digits and choose student number 136. The sixth number in the table of
random numbers is 47609, so it does not get selected. We go to the next
number and so on until we select a total of 30 students.
Table 5.2
The first two blocks of the random numbers appearing on page one of
Appendix 5.1
04433 80674 24520 18222 10610 0594 37515
60298 47829 72648 37414 75755 04717 29899
67884 59651 67533 68123 17730 95862 08034
89512 32155 51906 61662 64130 16688 37275
32653 01895 12506 88535 36553 23757 34209

95913 15405 13772 76638 48423 25018 99041
55864 21694 13122 44115 01601 50541 00147
35334 49810 91601 40617 72876 33967 73830
57729 32196 76487 11622 96297 24160 09903
86648 13697 63677 70119 94739 25875 38829
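
The reading procedure described for Table 5.2 can be sketched as follows. The helper select_from_table is hypothetical; the first five numbers are the second column of the second block as printed in Table 5.2, and 47609 is the sixth number mentioned in the text. Skipping a student number that has already been chosen is an added assumption (sampling without replacement).

```python
def select_from_table(numbers, population_size, sample_size):
    """Read the leading digits of each table entry and keep valid student numbers."""
    width = len(str(population_size))  # 300 has three digits, so read three
    chosen = []
    for number in numbers:
        candidate = int(number[:width])
        # Skip numbers outside 1..population_size and numbers already chosen.
        if 1 <= candidate <= population_size and candidate not in chosen:
            chosen.append(candidate)
        if len(chosen) == sample_size:
            break
    return chosen


# Entries are kept as strings so that leading zeros are not lost.
column = ["15405", "21694", "49810", "32196", "13697", "47609"]

picked = select_from_table(column, population_size=300, sample_size=30)
print(picked)  # [154, 216, 136]
```

Only three of the six entries yield usable student numbers, which shows why many more table entries than 30 must be read before the sample is complete.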

The greatest advantage of random sampling is that it is very likely to
produce a representative sample of the accessible population. Its greatest
disadvantage lies in the difficulty of identifying each and every member
of the population. In addition to identifying the members one by one, the
researcher has to contact the individuals selected. Some of these
individuals may refuse to take part in the project after being selected.
5.3.2 Stratified Random Sampling

We now know that 15,306,757 learners enrolled at schools in Iran from
September 2004 to June 2005. The members of this finite and accessible
population share one attribute, i.e., being a student. However, as we
know, pre-primary, primary and secondary education differ from each
other in level and quality. In other words, the 15,306,757 students form a
heterogeneous population when their educational levels and what they
study at those levels are taken into account.

English as a subject of study is not, for example, taught in Iranian
pre-primary and primary schools. If researchers wish to study the English
language proficiency of Iranian school students, they must, therefore,
exclude these students from their study. In other words, they must
identify the various strata or levels of an accessible population such as
school students before they choose their sample randomly.

Table 5.3 presents the five strata of school education in Iran in
2004-2005. As can be seen, the very stratification of school education enables
researchers to realize that 44% of the school population, i.e., pre-primary
and primary school students, were not taught English at all during the
specified period. It also specifies the best stratum for conducting research
projects on English language proficiency, i.e., pre-university students. This
specification brings down the population of interest from around 15
million to around 400 thousand.

Table 5.3
Students enrolled at five strata of school education in Iran in 2004-2005
No  Level                                   Number  Percentage
1   Pre-Primary                             543422         3.5
2   Primary                                6206718        40.5
3   Lower Secondary (Guidance School)      4371305        28.5
4   Upper Secondary (High School)          3772585        24.6
5   Pre-University                          412727         2.6
    Total                                 15306757         100

Upon specifying the most suitable stratum, researchers can start
selecting a representative sample by determining the required size,
listing individual members, assigning numerical codes to those members
and choosing the sample by utilizing random number tables. If we
accept 10% of a population consisting of 400 thousand pre-university
students as a representative sample, we have to randomly select 40
thousand students. This very large number necessitates cluster sampling
because the selected students may be spread all over the country and thus
render the project infeasible.
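
The figures in Table 5.3 can be turned into stratum shares and a sample size with a short calculation. The sketch below uses the enrolment numbers from the table; the variable names and the 10% sampling fraction applied to the exact pre-university count (rather than the rounded 400 thousand) are illustrative choices.

```python
# Enrolment by stratum, copied from Table 5.3.
strata = {
    "Pre-Primary": 543422,
    "Primary": 6206718,
    "Lower Secondary": 4371305,
    "Upper Secondary": 3772585,
    "Pre-University": 412727,
}

total = sum(strata.values())
shares = {level: 100 * count / total for level, count in strata.items()}

# Share of students who were not taught English (pre-primary and primary).
no_english = shares["Pre-Primary"] + shares["Primary"]

# A 10% sample of the pre-university stratum.
sample_size = round(0.10 * strata["Pre-University"])

print(total)              # 15306757
print(round(no_english))  # 44
print(sample_size)        # 41273
```

Using the exact count gives 41,273 students, which the text rounds to about 40 thousand.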
5.3.3 Cluster¹⁵ Sampling

Conducting a research project on a population of 400 thousand would be
a mammoth task. Such projects would never be feasible unless they are
supported and funded by states or governments. Even when the projects
are funded by state organizations, choosing 40 thousand individuals
based on a random number table would be impractical because some of the
selected students may live in areas too dispersed to justify research expenses.
This problem can be solved by opting for cluster sampling in
order to overcome the obstacle of choosing a representative sample of
a large population.
¹⁵ Cluster /ˈklʌstə/ n. a group of people, animals or objects that are closely packed
together

Let us look at the accessible population of students studying at tertiary
education centers in 2004-2005. According to the Iranian Ministry of
Science, Research and Technology (n.d.), 1,160,765 students were enrolled in
all degree programs in 30 provinces. Choosing a simple random sample
of this population will not be a good procedure because it is a
heterogeneous population. It may result in selecting a sample from a
particular degree program whose population differs from others. In other
words, we need to stratify the population.

Table 5.4 presents the stratified population of tertiary education students
in Iran from 2004 to 2005. As can be seen, the number of students
enrolled in various degree programs varies from province to province.
Table 5.4
Tertiary education students in 2004-2005
                                 Associate         BA/BS         MA/MS  Professional           PhD
                                   Diploma                                  Doctorate
Province                          F      M      F      M      F      M      F      M      F      M
Ardebil                         304    580   5597   9242     31     43      0      0      0      0
Bushehr                          54     79   4223   7116      8     33      0      0      0      0
Chahar Mahall and Bakhtiari     143    209   6744   6984     44    108     98    184      0      0
East Azarbaijan                 428   1783  11214  32228    688   1578      0      0     56    315
Esfahan                         218   1906  34317  50222   1343   3349      0      0    129    624
Fars                            614   2390  21679  33903    584   1554    147    307    106    448
Gilan                            43     72  10786  12003    229    668      0      0      5     18
Golestan                        614    971   7112   1053    134    337      0      0      3     34
Hamadan                         562    926  12553  23084    167    519      0      0      7     38
Hormozgan                       146    624   4422   7254      5     94      0      0      1      8
Ilam                            472    985   3729   6517      6     16      0      0      0      0
Kerman                          957   1963  16444  38197    330    921    151    280     32    212
Kermanshah                      310    973  11921  20823    170    550      0      0      3     20
Khuzestan                       319    588  17679  26865    374   1209    184    314     21    148
Kohgiluyeh and Buyer Ahmad      142    293   3505   4792      0     11      0      0      0      0
Kordestan                       556   1142   6306  14628     32     92      0      0      0      0
Lorestan                       1056   1739   7217  11391     61    120      0      0      0      0
Markazi                         550   1694  13609  20851     87    178      9      0      1      1
Mazandaran                      121    429  18949  28850    352   1051      0      0      9     65
North Khorasan                  124    173   2716   3203      0      0      0      0      0      0
Qazvin                           26    198   8022  11688     80    232      0      0      0      0
Qom                               0      0   4295   8133    161    602      0      0     12    121
Razavi Khorasan                1344   2915  29480  43863    617   1647    109    213     62    219
Semnan                          632   1444   9595  17174    176    603      0      0      0     21
Sistan and Baluchestan         1939   3834  13493  36155    218    840      0      0      6     39
South Khorasan                  763   1503   4664   7578     80    165      0      0      0      0
Tehran                         3600   8020  63014 100290   8705  34515    306    371   1013   5722
West Azarbaijan                 151    443  14058  24913    120    412    140    298     11     52
Yazd                            154    175  11418  17690    214    608      0      0      1     10
Zanjan                          316    705   7831  11839    131    263      0      0     22     82
Total                         16658  38756 386592 638529  15147  52275   1144   1967   1500   8197
For example, the greatest number of undergraduate students at the Bachelor of Arts (BA)/Bachelor of Science (BS) level studied in Tehran province, i.e., 163304 (63014 female and 100290 male). In contrast, the lowest number of students at the same level studied in North Khorasan province, i.e., 5919 (2716 female and 3203 male). This very process of clustering allows researchers to choose samples accordingly. If they decide to choose 10%, they must sample 16330 and 592 BA/BS students from Tehran and North Khorasan provinces, respectively.
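The arithmetic behind the 10% example can be checked with a few lines of Python. The sketch below only illustrates proportional allocation; the figures are copied from Table 5.4, while the names BA_BS_STUDENTS and stratum_sample_size, and the choice to round to the nearest whole student, are assumptions made for illustration.

```python
# BA/BS enrolment (female, male) copied from Table 5.4 for two provinces.
BA_BS_STUDENTS = {
    "Tehran": (63014, 100290),
    "North Khorasan": (2716, 3203),
}

SAMPLING_FRACTION = 0.10  # the 10% mentioned in the text


def stratum_sample_size(female, male, fraction):
    """Return how many students to draw from one province (one stratum)."""
    # Rounding to the nearest whole student is an assumption of this sketch.
    return round((female + male) * fraction)


sample_sizes = {
    province: stratum_sample_size(female, male, SAMPLING_FRACTION)
    for province, (female, male) in BA_BS_STUDENTS.items()
}

for province, size in sample_sizes.items():
    total = sum(BA_BS_STUDENTS[province])
    print(f"{province}: {size} of {total} BA/BS students")
```

Applying the same fraction to every province keeps each province's share of the sample equal to its share of the population.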
A cluster such as Razavi Khorasan can be further broken into subclusters for finer sample selection. For example, this province consists of various cities in which a number of private and state universities operate. Mashhad, as the capital of the province, accommodates Ferdowsi University of Mashhad (FUM), Emam Reza and Khayyam universities, to name a few. A good sampling procedure in this city would involve contacting these universities and obtaining the statistics related to their programs. Table 5.5 presents the academic fields and the number of students enrolled in those fields at FUM in 2008.
Table 5.5
The number of undergraduate and graduate students majoring in the specified academic fields offered in Dr. Ali Shariati Faculty of Literature and Humanities at FUM in 2008 (Day and Night programs; F = female, M = male)
No  Field                                   Degree  Day F  Day M  Night F  Night M  Total
01  Arabic Language and Literature          BA        110     38      111       13    272
01  Arabic Language and Literature          MA         11     14        3        4     32
01  Arabic Language and Literature          PhD         0      8        0        0      8
02  English Language and Literature         BA        122     48      179       57    406
03  French Language and Literature          BA         99     16       92        8    215
03  French Language and Literature          MA         11      1        6        0     18
04  French: Literary                        BA         31      5        9        0     45
05  Geography                               BA          0      0        1        0      1
06  Geography and Urban Planning            BA        112     53       30       18    213
06  Geography and Urban Planning            MA          2      8        2       10     22
07  Geography: Human and Economic           BA          0      0        1        0      1
08  Geography: Human Rural                  BA         31      1       65       23    120
09  Geography: Natural                      MA          0      0        6        2      8
10  Geography: Rural Planning               BA         18      5       10        1     34
10  Geography: Rural Planning               MA          4      7        5        3     19
10  Geography: Rural Planning               PhD         2     11        0        1     14
11  History                                 BA        122     20       90       27    259
11  History                                 MA          1      0        0        0      1
12  History: Islamic Iran History           MA          7      9        7        2     25
13  Human Geography: Rural                  BA         31      1       65       23    120
14  Linguistics: General                    MA          8      2        3        2     15
14  Linguistics: General                    PhD        13     22        0        0     35
15  Persian Language and Literature         BA        129     35      161       23    348
15  Persian Language and Literature         MA         21     16       20        3     60
15  Persian Language and Literature         PhD        13     22        0        0     35
16  Russian Translation                     BA         69     24        4        2     99
17  Social Sciences                         BA          2      0        0        0      2
18  Social Sciences Research                MA          0      0        1        1      2
19  Social Sciences: Researching            BA        133     48      125       37    343
19  Social Sciences: Researching            MA         10     11        6        4     31
20  Sociology: Economic and Developmental   PhD         1      9        0        0     10
21  Teaching English as a FL                BA          0      0        1        0      1
21  Teaching English as a FL                MA         18      7       16        3     44
22  Teaching Persian as a SL                MA          8      2        3        2     15
    Total                                            1139    443     1022      269   2873

5.3.4 Convenience Sampling

Sometimes researchers select their samples on the basis of their
convenience. For example, some students were conducting a research
project on English newspaper readership in Urmia. They were supposed
to visit some newsstands regularly and administer a questionnaire to the
people who bought those newspapers.
When the research team met to discuss their procedure and pool their
data together, it was found that almost all the members of the team had
targeted the newsstand near Urmia University because it was the closest!
This means that the researchers were treating their sample as if it were random. In other words, they assumed that the buyers visiting this particular newsstand represented those who bought newspapers from the stands located in other parts of the city.
According to Hahn and Meeker (1993), if a convenience sample is used as if it were a random sample, its validity has to be critically assessed. For example, the standard deviation obtained on the performance of the sample on a given task should be closely watched to see whether it represents a normal population. Further, any report of the results should include a discussion of the issue and state clearly why convenience sampling has been adopted.
5.3.5 Matched/Block Sampling

Some research projects rest on participants’ performance on a number of
tests. Khodadady and Herriman (2000), for example, conducted a
research project to determine whether native and non-native speakers
performed differently on schema-based cloze multiple choice item tests
(MCITs), cloze tests and traditional cloze MCITs measuring reading
comprehension proficiency. As the research question indicates, the project required six groups: three groups for native speakers (NSs) and three groups for non-native speakers (NNSs).
One hundred thirty-five first-year undergraduate students took part in Khodadady and Herriman’s (2000) research. Ninety-two students were NSs and 43 were NNSs. How could they assign these NNSs and NSs to the three groups? One possibility was employing simple random selection. They could give a code to each participant from 1 to 135, utilize a random numbers table like the one given in Appendix 5.1, select the participants randomly one by one and assign each participant to one group at a time to come up with the following table of specification.
Table 5.6
Random assignment of 135 participants into three groups
Participants         Schema-based cloze MCIT  Cloze test  Traditional cloze MCIT
Non-native speakers  14                       14          15
Native speakers      31                       31          30
Table 5.6 shows the random assignment of 135 participants to six groups. As you can see, the number of participants in each group is relatively small, especially for NNSs. What would happen if just by chance the NNSs in, say, the Cloze Test group, were all more proficient in reading than the other two groups? Would not their higher proficiency in reading have led to higher performance and thus given the invalid result that cloze tests show the reading comprehension proficiency of NNSs better than the other two tests? This inappropriate selection of participants would have destroyed the internal validity of the research.
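The simple random assignment described above can be sketched in Python as follows. The sketch replaces the random numbers table of Appendix 5.1 with Python's random module. The participant codes, the function name random_assignment and the seed value are hypothetical and serve only to make the example repeatable.

```python
import random

GROUPS = ["schema-based cloze MCIT", "cloze test", "traditional cloze MCIT"]


def random_assignment(participants, seed=None):
    """Shuffle the participants, then deal them to the groups one at a time."""
    rng = random.Random(seed)      # a seeded generator makes the result repeatable
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    groups = {name: [] for name in GROUPS}
    for position, participant in enumerate(shuffled):
        groups[GROUPS[position % len(GROUPS)]].append(participant)
    return groups


# Hypothetical codes for the 43 NNSs and 92 NSs mentioned in the text.
nns_codes = [f"NNS{i}" for i in range(1, 44)]
ns_codes = [f"NS{i}" for i in range(1, 93)]

nns_groups = random_assignment(nns_codes, seed=1)
ns_groups = random_assignment(ns_codes, seed=1)

for name in GROUPS:
    print(name, len(nns_groups[name]), len(ns_groups[name]))
```

Dealing the shuffled participants one at a time yields group sizes of 14, 14 and 15 for NNSs and 30, 31 and 31 for NSs, as in Table 5.6, although which group receives the extra participant may differ. Nothing in the procedure controls how proficient the members of each group are.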
In order to avoid assigning high-ability participants to one or two groups just by chance, Khodadady and Herriman (2000) adopted matched or block sampling. They administered the disclosed Test of English as a Foreign Language (ETS 1991) and used the scores obtained on this proficiency test to assign the participants to the six groups given in Table 5.6. They ordered the scores from the highest to the lowest for both NSs and NNSs and then assigned the first three highest scorers to the traditional cloze MCIT, cloze test, and schema-based cloze MCIT. The order of the groups was changed and then the second three highest scorers were assigned to schema-based cloze MCIT, cloze test and traditional cloze MCIT. The same procedure was followed until all participants were assigned to the specified groups.
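The matching procedure can be sketched in Python as follows. This is only an illustration of the idea, not the authors' actual procedure: it assumes that the order of the groups is simply reversed for every other block of three, and the participant codes and proficiency scores are invented. In the study, the procedure would be run separately for NSs and NNSs.

```python
GROUPS = ["traditional cloze MCIT", "cloze test", "schema-based cloze MCIT"]


def matched_assignment(scores):
    """Assign participants to groups in blocks of three, ordered by score.

    `scores` maps a participant code to a proficiency score. The order of the
    groups is reversed for every second block (an assumption of this sketch)
    so that no single group always receives the best scorer of a block.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # highest score first
    groups = {name: [] for name in GROUPS}
    for start in range(0, len(ranked), len(GROUPS)):
        block = ranked[start:start + len(GROUPS)]
        block_number = start // len(GROUPS)
        order = GROUPS if block_number % 2 == 0 else GROUPS[::-1]
        for participant, group in zip(block, order):
            groups[group].append(participant)
    return groups


# Invented scores for nine hypothetical NNSs: NNS01 = 590, ..., NNS09 = 510.
nns_scores = {f"NNS{i:02d}": 600 - 10 * i for i in range(1, 10)}
result = matched_assignment(nns_scores)

for group, members in result.items():
    print(group, members)
```

With these invented scores, each group receives one participant from the top, the middle and the bottom third of the ranking, so the groups are comparable in proficiency before any test is given.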
5.4 Sample Size

Although the internal validity of all research projects depends on their sample and how it is selected, no authority has offered a definite number for its size. Hatch and Lazaraton (1991, p. 235), however, offered 30 as a magic number and rule of thumb and emphasized that the participants must be selected randomly.
Few scholars have specified a definite size for samples because their selection depends on the type of research project and the statistical tests applied to the analysis of data. For example, while it is possible to design a one-page questionnaire in social surveys and administer it to as many participants as possible, finding volunteers to take part in an experimental study over a long period of time would be really difficult.
The problem of sample size can, however, be easily solved if it is based on a homogeneous and well-mixed population. All of us have gone to a medical laboratory, for example, where a few drops of our blood have been taken. As Cochran (1963) reminded us, these drops are employed in order to reach laboratory diagnoses based on the assumption that the circulating blood is always well mixed and that one drop tells the same story as others do. This means that the blood in our body forms a homogeneous and well-mixed population from which a few drops can be taken as a representative sample.
The homogeneity of a sample can be objectively tested by administering a pre-test before an experiment starts. The results of the test can be used to estimate its standard deviation to decide whether the sample represents the population or not. Table 5.7 presents sample sizes ranging from 10 to 1000. The standard deviations given in the table are taken from Garrett (1938, p. 243), who used the normal distribution to specify sample size.
Table 5.7
Standard deviation range in samples having normal population
Sample Size    Range of Standard Deviation
10             ± 2.0
50             ± 2.5
200            ± 3.0
1000           ± 3.5
As can be seen in Table 5.7, as the size of a given sample increases, so does the range of its scores measured in standard deviation units. The more extreme a deviation from the mean, the lower the probability of its occurrence; hence, in small samples such as 10 cases, wide deviations from the mean should not appear if the sample is truly representative of a normally distributed population.
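The reasoning behind Table 5.7 can be illustrated with a short calculation. The sketch below assumes that the ranges in the table are ± values expressed in standard deviation units and that the scores are normally distributed. For each sample size it computes the expected number of cases falling outside the listed range. The names TABLE_5_7 and prob_beyond are hypothetical; math.erfc is the complementary error function of the Python standard library.

```python
import math

# (sample size, range in standard deviation units) as listed in Table 5.7
TABLE_5_7 = [(10, 2.0), (50, 2.5), (200, 3.0), (1000, 3.5)]


def prob_beyond(k):
    """Two-tailed probability that a normal score lies more than k SDs from the mean."""
    return math.erfc(k / math.sqrt(2))


expected_cases = []
for size, k in TABLE_5_7:
    expected = size * prob_beyond(k)  # expected number of cases outside +/- k SD
    expected_cases.append(expected)
    print(f"n = {size:4d}, range = +/- {k}: about {expected:.2f} cases expected outside")
```

For every row, the expected number of cases outside the range is about half a case (between 0.4 and 0.7). In other words, under these assumptions a truly representative sample of the given size would rarely contain a score beyond the listed range.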
5.5 Intact Groups

Some research projects involve treatment. Researchers need to work on
participants for a specific period of time in order to find out whether the
treatment has produced any significant difference in the participants’
behaviour or performance. In Chapter Three we discussed Faravani’s (2006) research on portfolios. She raised three research questions. In order to answer the questions, she had to find two classes to establish her experimental and control groups.
In two of her advanced classes, Faravani (2006) brought up the topic and asked her own students to take part in her research project voluntarily and out of class. She even offered to teach them for a complete term free of charge. Unfortunately, her advanced students seemed to be too busy and therefore declined to participate in her study. Had they agreed to take part in her study voluntarily, Faravani could have used simple random selection to assign her advanced students to experimental and control groups.
Faravani (2006) had no choice, therefore, but to find an alternative way. She used two of her own classes in a private institute and treated one of them as her experimental group while the other served as her control group. When researchers like Faravani employ their own classes and cannot move or replace the students of the class because the institution where they teach does not allow that, they are using intact groups. Although intact groups are not randomly selected, they usually meet the requirement of experimental research because they are formed on the basis of certain criteria, e.g., achievement in the previous term.
In addition to relying on the criterion on which intact groups are formed, researchers can employ some statistical tests to find out whether the two intact groups taken as experimental and control are significantly different from each other or not. If they do not differ, then they can safely be accepted as experimental and control groups. (We will discuss the tests in a separate chapter.)
5.6 Summary
In applied linguistics and related fields such as translation and education, research projects are carried out in order to solve perceived problems. Researchers select appropriate research methods and follow their principles in order to solve not an isolated problem but one which is common to a given target population.
Whether a given research problem is common to a target or accessible population or not can be determined by selecting representative samples. While simple random sampling has proven to be the best for a homogeneous and well-mixed small population, stratified and cluster sampling must be adopted when the population under study is heterogeneous and of considerable size.
However, sometimes it is not possible to select a random sample for a number of reasons such as feasibility and convenience. In this case, intact groups can be employed along with the required statistical tests to show their normal structure as the most distinctive index of a population. The familiarity with population and samples paves the way for us to focus on some major types of research method.

6 Types of Research
6.1 Introduction
In previous chapters, we learned that in teaching language and mastering
it, not only teachers and students but also parents and language policy
makers may face some problems. If these problems are not properly
addressed and solved, they may entail spending time, energy and public
funds on unproductive programs and activities.
Research projects are not confined just to solving urgent educational problems. They can also be employed to find more efficient and modern ways of learning language. No one can, for example, deny the fact that the internet has influenced global communication in both developed and developing countries. In addition to global communication, the internet has given birth to innovative tools such as weblogs.
Fielder (2003) defined weblogs as reflective conversational tools for self-organized learning, which best capture the constructivist spirit used for fostering autonomous and self-directed learning approaches. Rezaee and Oladi (2008) used the definition in their research project to explore the following two research questions:
1. How do students socialize in their interactions in cyberspace while blogging?
2. Is there any difference between the IELTS writing proficiency results of students who participated in journal writing and those who took part in traditional writing classes?
In order to answer their research questions, Rezaee and Oladi (2008) divided their 160 participants studying at the Medical School of the University of Tehran into three groups: blogging, journal writing and traditional writing. The performance of the blogging group at the beginning of the study showed that they viewed blogs as a place to publish their homework and meet the course requirements, whereas at the end of the term most of them considered blogging an opportunity to interact with their peers and publish their thoughts. Blogging also provided an ideal opportunity for introverted students to interact and participate in class discussions and thus reveal their true abilities without being exposed.
The performance of the three groups on the IELTS writing proficiency tests showed that they were significantly different from each other. The results also showed that the blogging group had “the highest scores in the IELTS writing proficiency test” (Rezaee & Oladi, 2008, p. 84). This finding suggests that modern technologies such as the internet have the capacity to contribute to language learning and should therefore be seized upon in foreign language teaching classes.
If we focus on the two research questions posed by Rezaee and Oladi (2008), we notice that they conducted their research project in order to
achieve a given purpose, i.e., which of the two groups performing a
specific writing task will do better on a writing proficiency test? There
are also other projects which can be classified according to the methods
researchers employ in their study in order to serve their purposes. We
will discuss the purposes and methods of research projects in this
chapter.
6.2 Classification of Research by Purpose

Gay (1990) believed that classification of research by purpose is based primarily on the degree to which findings have direct educational application and the degree to which they are generalizable to other educational situations. While the first criterion is reasonable, the second is hard to support in terms of research purposes. As will be discussed later, the generalizability of any research depends on the research method rather than the research purpose. In any case, research is classified by purpose into four categories: basic, applied, evaluation and action.
6.2.1 Basic Research

It is really difficult to separate basic research from applied research in that even the most basic, pure or theoretical research can find certain application in one setting or another. It is, however, feasible to distinguish basic and applied research from each other because the former is basically carried out to satisfy human curiosity whereas the latter is designed to meet ever-increasing human needs.
Basic research is conducted to develop new theories or refine old ones. Some researchers refer to developing new theories and refining old ones
as exploratory and confirmatory research, respectively.
Basic research is best carried out under the laboratory conditions and controls associated with scientific inquiries. It aims at standardization and regularization and results in establishing general principles. The majority of studies conducted in theoretical linguistics are basic in nature. The conditions under which basic research is conducted are carefully identified and controlled by researchers. For example, Chomsky (1965) stated:
Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance (p. 3).
Dickens and Flynn’s (2006) study provides us with an example of basic research. They conducted a research project to see whether there is any evidence to support the claim that although blacks have had environmental gains, they have made no IQ gains on whites and that, therefore, the gap between black and white IQ has genetic origins.
Dickens and Flynn (2006) utilized the results from standardizations of four tests as follows:
1. The Wechsler Intelligence Scale for Children (WISC): The test was called the WISC-R, WISC-III, and WISC-IV in 1972, 1989, and 2002, respectively (Harcourt, 2005).
2. The Wechsler Adult Intelligence Scale (WAIS), called the WAIS-R and WAIS-III in 1978 and 1995, respectively (Harcourt, 2005).
3. The Armed Forces Qualification Test, called the AFQT, in 1980 and 1997 (Department of Defense, 2005).
4. The Stanford-Binet-4 and the SB-5 administered in 1985 and 2001 (Thorndike, Hagen, & Sattler, 1986, pp. 34-36; Riverside, 2005).
Table 6.1 contains the summary data from the test publishers and the Department of Defense. As can be seen, the mean IQ of whites is higher than that of blacks on every test, and their standard deviation is also larger in most cases. If the inheritance perspective of intelligence is accepted as a fact, the data collected over the years should show the same pattern of difference, i.e., blacks will not gain greater IQs on any of these tests even if their environment becomes better than before.

Table 6.1
IQ means and standard deviations for whites and blacks (Source: Dickens & Flynn, 2006, p. 27)
Test            Version  Mean IQ         Standard deviation  Number of observations
                         White   Black   White   Black       White   Black
Stanford-Binet  4        103.6   90.0    15.37   13.86       3691    711
Stanford-Binet  5        102.9   92.1    13.93   14.47       2070    384
WISC            R        102.3   86.4    14.08   12.63       1870    305
WISC            III      103.5   88.6    13.86   12.83       1543    337
WISC            IV       103.2   91.7    14.52   15.73       1403    343
WAIS            R        101.4   86.8    14.65   13.14       1664    192
WAIS            III      102.6   89.1    14.81   13.31       1523    247
WAIS < 25       R        101.2   87.0    14.28   13.54       519     72
WAIS < 25       III      102.6   90.9    14.59   12.31       413     93
AFQT            80       100.0   82.0    15.00   13.63       5533    2298
AFQT            97       100.0   85.6    15.00   13.23       2880    1191
Figure 6.1 shows the black scores on four tests of cognitive ability on
the next page. Although all the scores are below the white mean score,
i.e. 100, there is consistent gain over years. Based on the analysis of data
from nine standardization samples for four major tests of cognitive
ability, Dickens and Flynn (2006) declared, “blacks have gained 5 or 6
IQ points on non-Hispanic whites between 1972 and 2002. Gains have
been fairly uniform across the entire range of black cognitive ability” (p.
2).
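The narrowing of the gap can also be read directly from the means in Table 6.1. The following sketch subtracts the black mean from the white mean for the earliest and the latest version of each test. It is only a rough check on the raw means: Dickens and Flynn's figure of 5 or 6 points rests on their own analysis, which this sketch does not reproduce. The names MEANS and gap_change are hypothetical.

```python
# (white mean, black mean) copied from Table 6.1:
# earliest and latest version of each test, in that order.
MEANS = {
    "Stanford-Binet": [(103.6, 90.0), (102.9, 92.1)],  # SB-4, SB-5
    "WISC": [(102.3, 86.4), (103.2, 91.7)],            # WISC-R, WISC-IV
    "WAIS": [(101.4, 86.8), (102.6, 89.1)],            # WAIS-R, WAIS-III
    "AFQT": [(100.0, 82.0), (100.0, 85.6)],            # 1980, 1997
}


def gap_change(test):
    """Return (earlier gap, later gap, narrowing) in IQ points for one test."""
    (white_early, black_early), (white_late, black_late) = MEANS[test]
    early_gap = white_early - black_early
    late_gap = white_late - black_late
    # Rounded to one decimal place, the precision of the means in the table.
    return round(early_gap, 1), round(late_gap, 1), round(early_gap - late_gap, 1)


for test in MEANS:
    early, late, narrowing = gap_change(test)
    print(f"{test}: gap {early} -> {late} (narrowed by {narrowing} points)")
```

On all four tests the later gap is smaller than the earlier one, which is the pattern the inheritance perspective would not predict.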

Figure 6.1
Black Scores on Four Tests of Cognitive Ability (Source: Dickens &
Flynn, 2006, p. 29)
6.2.2 Applied Research

As the name implies, applied research is done in order to apply new and/or old theories to educational problems. It is concerned with “what” works best, and thus provides data to support new theories, revise the old ones and suggest the development of new ones. Khodadady’s (1997) and Khodadady and Herriman’s (1998) studies on multiple choice item tests (MCITs) provide a good example of applied research in that they provide the rationale needed to develop objective measures of reading comprehension ability.
Mehrens and Lehman (1991) stated that among various testing methods such as open-ended questions and essay writing, MCITs are the most highly regarded type. Although MCITs are the most popular, reliable, and time- and cost-effective testing method, they suffer from one major shortcoming, i.e., the lack of a solid basis in item writing theory. The underlying rationale for constructing MCITs has been questioned by many scholars (e.g., Bennett, 1993; Haladyna, 1994; Mislevy, 1993; Resnick & Resnick, 1990; Shepard, 1991a, 1991b).
Because MCITs lack a sound theory, multiple-choice item writers are often uncertain as to where to get their plausible and attractive distracters from. As Tindal and Marstorn (1990) stated, “the most difficult problem in writing multiple-choice items is creating effective options among which to include the correct answer” (p. 55). Khodadady and Herriman (1998), therefore, conducted a research project to find out whether applying schema theory to multiple choice item writing would yield theoretically sound items. Their project yielded the findings below.
1. Non-native speakers (NNSs) and native speakers (NSs) perform differently on schema-based cloze multiple choice item tests (MCITs).
2. Traditional cloze MCITs, i.e., the tests that lack a sound item writing theory, are the easiest tests of reading comprehension ability and subsequently fail to discriminate between high-ability and low-ability test takers. The traditional cloze MCITs also lack internal validity because the number of individual items correlating with the total test scores and having the required p-values is smaller than in the other test methods such as the cloze test.
3. As interactive measures of reading comprehension ability, schema-based cloze MCITs are the only valid tests for NNSs. They are not only valid tests of reading comprehension ability but also of structure and vocabulary knowledge for NNSs.
4. Schema-based cloze MCITs are the only tests whose items correlate significantly with the language of test takers, and the number of items showing a significant relationship with language is greater than in the other test methods.
5. Schema-based cloze MCITs are highly reliable tests.
6. Schema-based cloze MCITs are the best measures of reading comprehension ability in terms of their concurrent validity, face validity, time and cost effectiveness.

6.2.3 Evaluation Research

The purpose of evaluation research is to choose among alternatives in order to make decisions. As Gay (1990, p. 8) emphasized, there may be only two alternatives, e.g., adopt a new curriculum [16] or keep the current one, or there may be several alternatives. For example, there may be many textbooks available for adoption in the new curriculum.
Disagreements, nonetheless, exist among researchers regarding evaluation as a type of research. While some scholars believe that evaluation should be considered a separate discipline, I agree with Gay (1990, p. 8), who approached it as a type of research which involves decision making and requires steps which parallel those of the scientific method.
The disagreements among scholars arise from the fact that evaluation
research cannot be as controlled as basic research. In contrast to basic
research, evaluation is conducted in natural and real-world settings such
as classrooms. Naturally, the number of variables involved in evaluation
increases because of the nature of settings. It does not, however,
preclude applying research principles to evaluation, i.e., validity,
reliability and practicality. Maibodi (2008), for example, conducted a
research project to find out whether text genre affects students’ reading
comprehension ability.
Two hundred freshman and sophomore female undergraduate university students majoring in English Translation took part in Maibodi’s (2008) research. Based on their performance on the Oxford Placement Test (OPT), the participants were divided into two groups: A and B. Group A studied Discovering Fiction by Judith Kay and Rosemary Gelshenen for their course Reading Comprehension and Group B studied Simple Prose by Abbas Ali Rezaee and Helen Ouliaenia.
[16] /kз'rikyulзm/ n. designs for carrying out a particular language program. Features include a primary concern with the specification of linguistic and subject-matter objectives, sequencing, and materials to meet the needs of a designated group of learners in a defined context. The term “syllabus” is used more customarily in the United Kingdom to refer to what is referred to as a curriculum in the United States (Brown, 1994, pp. 159-160).
After teaching groups A and B for nearly 12-14 weeks and covering about ten lessons for each group, a TOEFL post-test consisting of 30 reference questions and one short narrative text and another expository text was administered to control the processing time and proficiency level of the students chosen for the study. Those students whose scores fell within one standard deviation above and below the mean were taken as the final sample for further study, i.e., 30 students for each group.
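The selection rule described above, i.e., keeping only the students who score within one standard deviation of the mean, can be sketched as follows. The ten scores are invented for illustration and are not Maibodi's data. The use of the sample standard deviation (statistics.stdev) is also an assumption, since the source does not say which formula was applied.

```python
import statistics

# Invented test scores for ten hypothetical students.
scores = {
    "S01": 10, "S02": 14, "S03": 15, "S04": 16, "S05": 17,
    "S06": 18, "S07": 19, "S08": 20, "S09": 21, "S10": 30,
}


def within_one_sd(score_table):
    """Return the students whose scores lie within one SD of the mean."""
    values = list(score_table.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumed)
    lower, upper = mean - sd, mean + sd
    return [student for student, score in score_table.items()
            if lower <= score <= upper]


selected = within_one_sd(scores)
print(selected)
```

With these invented scores the mean is 18 and the standard deviation is about 5.3, so the two extreme scorers (S01 and S10) are dropped and the remaining eight students form a more homogeneous sample.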
Maibodi (2008) administered another TOEFL post-test consisting of two narrative passages (14 items) and two expository passages (17 items) to the 30 students in each group to address the effect of narrative and non-narrative texts on reading comprehension ability. The results obtained showed that there was a significant difference in the mean scores of Group A, who studied narrative texts, and Group B, who studied expository texts. Group A outperformed Group B in reading comprehension of both narrative and non-narrative texts.
6.2.4 Action Research

Most scholars believe that action research was established by Kurt Lewin [17], working on social sciences in the late 1930s. He encouraged social workers to apply research to their work and thus bring about social change. Noffke (1990) described his formulation of action research as an attempt at instituting change by
Taking actions, carefully collecting information on their efforts, and then evaluating them, rather than formulating hypotheses to be tested, although the eventual development of theory was important. This represents not only a clear distinction from the dominant educational research forms of the time, but also emphasizes Lewin’s concern with resolving issues, not merely collecting information and writing about them. The theory developed as a result of the research was theory about change, not about the problem or topic itself (pp. 35-36).
[17] Kurt Lewin (1890-1947) was a German American psychologist who was born in Mogilno, Prussia (now in Poland). He received his education from the University of Berlin. Lewin immigrated to the United States in 1932 and taught at Stanford, Cornell, and Iowa universities. Finally, he became the director of the Research Center for Group Dynamics at Massachusetts Institute of Technology in 1944. Lewin explored the problems of motivation of individuals and groups and conducted research projects on child development and personality characteristics. His work had a major influence on modern investigations in psychology. Among his books are A Dynamic Theory of Personality (trans. 1935), The Conceptual Representation and Measurement of Psychological Forces (1938), and Resolving Social Conflicts (1947) (Microsoft Encarta, 2006).
The purpose of action research is, therefore, to find solutions to the problems faced in classrooms. In contrast to basic research, action
research is concerned with a local problem and is conducted in a local
setting. Action researchers seem not to be interested in contributing to
science. Their primary goal is to find solutions to their classroom
problems here and now.
Action researchers are thus none other than the language teachers themselves. They are an essential part of the process in which the action research is conducted. This very fact produces a paradox in terms of research principles. Since the teachers themselves act as researchers, their research projects can help them respond to their students’ needs and problems. However, as the findings they obtain in their classes are limited to their local settings, they cannot generalize their findings and their research will therefore lack external validity.
6.3 Classification of Research by Method

All research projects have certain procedures in common. They state a problem, offer hypotheses, collect data, analyze them and draw conclusions. The types of problems researchers face are, however, different. For finding solutions to different types of questions, various research methods have been developed over the years. They include experimental research, correlational research, observational research, archival research, causal-comparative research, survey research and historical research.
6.3.1 Experimental Research

In experimental research, different treatments are established to study
the effect of a certain variable on another or other variables. Anxiety, for
example, has been found to be associated with both physical and mental
states of human beings.
Spielberger (1983) defined anxiety as the subjective feeling of tension, apprehension, nervousness and worry associated with the arousal of the nervous system. He developed a psychometric test which measures two types of anxiety: anxiety as a personality trait and anxiety as a transient state. It has been employed by many researchers in various fields of science to study the effect of anxiety on given variables.
Businco, Businco, Lauriello and Tirelli (2004), for example, conducted a research project in order to explore whether the successful treatment of a disease called nasal polyposis (NP) removed patients’ anxiety. (The disease affects the appearance of the patients and deforms their noses.) The participants of the project were 30 consecutive patients (16 male, 14 female, age range 18-77 years, mean 45.6), all affected by idiopathic ethmoidal NP, primary or recurrent.
Patients affected by asthma, mental diseases, chronic diseases, in general, or used any other drugs during the study were excluded. All patients received montelukast (MLK 10mg/day) + loratadine (LOR 10mg/day) + mometasone furoate (MOM 100 μg per nostril/day) for 7 months and underwent a monthly follow-up, with nasal endoscopy, in order to evaluate the efficacy of the treatment (personal polyps score used: 0 = no polyp, 1 = in middle meatus, 2 = outside middle meatus, 3 = contact with inferior turbinate, 4 = contact with nasal floor), anterior active rhinomanometry (AAR), record cards for nasal symptoms (score for each symptom from 0 = good to 4 = bad). None of these drugs have sedative or anxiety reducing effect (Hindmarch, Johnson, Meadows, Kirkpartrick, & Shamsi, Z., 2001).
All patients were included in the study group after a run-in period of at least 1 month without any therapy apart from nasal washes with saline solution, this was considered necessary also in order to prevent the possible influence of other drugs on mood (Clark, Bauer, Cobbs, 1952; Bender, Lerner, & Poland, 1991). [p. 327]
Businco, Businco, Lauriello and Tirelli (2004) asked their patients to fill out a questionnaire before and at the end of the treatment. It contained two self-rating psychometric tests: one for anxiety and the other for depression. The anxiety test comprised state anxiety and trait anxiety, each consisting of 20 multiple choice items on which a score of > 40 is considered high (Spielberger, Gorsuch, & Lushene, 1983). The test makes it possible to distinguish between existing anxiety and anxiety as a relatively stable personality trait. A medium value of > 40 was used to classify patients as high anxious and low anxious.
For assessing their participants’ depression, Businco, Businco, Lauriello
and Tirelli (2004) utilized the Zung Self-Rating Depression Scale (Zung,
Richards, & Short, 1965). The test consists of 20 multiple-choice items
on which a score of ≥ 49 is considered high. It is effective in diagnosing
current depression in a clinical setting (Kaplan & Saboch, 1995) and its
positive predictive value for diagnosing depression is between 88.7% and
92.3% (Magruder, Norquist, Feil, Kopans, & Jacobs, 1995).
Table 6.2 shows the mean values of AAR and disease symptoms. As can
be seen, all patients revealed a significant reduction of NP at nasal
endoscopy, reduction of symptoms and nasal resistance at AAR.

Table 6.2
Mean values of nasal resistance at AAR¹⁸ (Pa¹⁹/cc²⁰/sec), nasal
endoscopic scores and nasal symptoms, before and after treatment
Time      AAR   Endoscopy  Nasal        Rhinorrhea  Sneezing  Post-nasal
                           obstruction                        drip
T0        2.52  3.2        3.8          3.6         2.6       3.6
7 months  1.22  1.6        1.2          1.5         1.2       1.8
Table 6.3 shows the statistical analysis of the results obtained on the
anxiety and depression tests. As can be seen, before the treatment 19
patients (63.3%) showed high levels of state anxiety, 21 (70%) showed
anxiety as a trait and six (20%) were positive for depression. After the
medical treatment, however, seven patients (23.3%) showed high levels of
state anxiety, eight (26.6%) showed anxiety as a trait and five (16.6%)
were positive for depression. The percentage of patients with high levels
of state anxiety was significantly higher before treatment than after
(63.3% vs. 23.3%; χ² = 0.14; p = 0.004), as was the percentage of patients
with high levels of anxiety as a trait (70% vs. 26.6%; χ² = 0.10;
p = 0.002). There was, nonetheless, no significant difference in depression
before and after treatment.
Table 6.3
Comparison and statistical analysis of results
               N. of patients    N. of patients   % of patients  Chi-    Pearson
               before treatment  after treatment  improved       Square  p
State anxiety  19                7                63.15          0.14    0.004
Trait anxiety  21                8                61.90          0.10    0.002
Depression     6                 5                16.6           1.11    0.090
18 Anterior Active Rhinomanometry /rīnōmз'nǒmзtri/: measurement of the air flow and
pressure within the nose during respiration
19 Pa stands for Pascal, which is a unit of pressure equal to one newton per square meter
20 cc: cubic centimeter
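The "% of patients improved" column in Table 6.3 is not defined in the text. A minimal sketch in Python, assuming it is the drop in the number of high-scoring patients expressed as a percentage of the number before treatment, reproduces the tabled figures to within rounding. The function name percent_improved is illustrative only.

```python
def percent_improved(before: int, after: int) -> float:
    """Percentage of initially high-scoring patients who no longer score high.

    Assumption (not stated in the source): improvement is computed as
    (before - after) / before * 100.
    """
    return (before - after) / before * 100


# Counts of high-scoring patients taken from Table 6.3 (before, after).
counts = {
    "State anxiety": (19, 7),
    "Trait anxiety": (21, 8),
    "Depression": (6, 5),
}

for measure, (before, after) in counts.items():
    print(f"{measure}: {percent_improved(before, after):.2f}%")
```

Under this assumption the values come out at about 63.16, 61.90 and 16.67, which agree with the table's 63.15, 61.90 and 16.6 if the printed figures were truncated rather than rounded.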

6.3.2 Correlational Research

In studying experimental research projects, we saw that researchers
hypothesize that a certain variable such as anxiety exists in a certain
group of people because they suffer from a certain disease such as
nasal polyposis (NP). In other words, the existence of NP in patients
causes anxiety.
Based on a causal relationship, Businco, Businco, Lauriello and Tirelli
(2004) hypothesized that their patients were highly anxious because
they were suffering from NP. If they could treat their patients’ NP, their
anxiety would disappear. A causal relationship of this type can only be
hypothesized in experimental research where researchers control all
variables in their study and treat a particular one in order to see whether
any significant difference would appear as a result of their treatment.
There are, however, many occasions on which it is not possible to single

out certain variables and start a treatment program over time to find out
whether there was a causal relationship between the variables. For
example, every year a large number of high school graduates take part in
the University Entrance Examination (UEE) in Iran. Can we hypothesize
that success at high school results in success at the UEE?
When we start a research project in which a large number of independent
variables affect a dependent variable and it is not possible to
determine their effect by introducing and implementing certain
treatments, we embark on correlational research. In contrast to
experimental research, correlational investigations do not claim any type
of causal relationship among the variables they study.
Khodadady (2007), for example, conducted a research project to find
out whether C-Tests²¹ measure language proficiency or spelling and
vocabulary knowledge. In order to explore his research question,
Khodadady administered the C-Tests designed by Klein-Braley (1997)
and a disclosed Test of English as a Foreign Language (TOEFL) along
with two vocabulary tests to 63 participants. Table 6.4 presents the raw
scores of five students on the C-Tests and TOEFL. (Only five scores
have been given here in order to save space and avoid unnecessary
complexity.)
21 C-Tests are a type of cloze language proficiency test, usually developed on
several short passages. The second half of every second word from the second sentence
of each passage is deleted in order to be restored by test takers.
Table 6.4
The raw scores of five participants (Ps) on the TOEFL and C-tests
Tests P1 P2 P3 P4 P5
C-Tests 80 75 53 50 50
TOEFL 105 100 90 74 60
As can be seen in Table 6.4, the C-Tests and TOEFL correlate with
each other because the scores obtained on both tests by the same test
takers show a positive relationship, i.e., whoever has scored high on the
C-Tests has obtained a high score on the TOEFL, too.
Figure 6.1 presents a visual comparison of the scores obtained on the
C-Tests and TOEFL. As can be seen, there is a positive relationship
between the two proficiency tests, i.e., as the scores on the TOEFL
increase so do the scores on the C-Tests. Although both Table 6.4 and
Figure 6.1 provide a pattern to see the correlation, they do not indicate
the magnitude of correlation.

Figure 6.1
The line chart of five scores obtained on two proficiency tests
Statisticians have fortunately worked out some formulae to determine the
magnitude of correlations between two or more psychometric variables
such as scores on the TOEFL and C-Tests. These formulae can be
employed in correlational research projects to explore all possible
relationships. They can be applied to raw data either manually or
electronically. We will familiarize ourselves with one of these formulae,
i.e., Pearson product moment correlation, in a separate chapter and then
employ software to do the same to see whether we get similar results.
6.3.3 Observational Research

McBurney (1994) classified observational research as a variety of
nonexperimental research, in which the researcher simply observes
ongoing behaviours. These behaviours in applied linguistics can be,
among others, learners’ responses given to oral questions, lectures
delivered in seminars and verbal interactions. Most studies done on
language acquisition are observational in nature. An oft-cited example is
observed and reported by McNeil (1966) and quoted by Atkinson (1992,
p.24):
CHILD: Nobody don’t likes me.

MOTHER: No, say ‘Nobody likes me’.
CHILD: Nobody don’t likes me.
(Dialogue repeated eight times)
MOTHER: Now listen carefully, say ‘Nobody likes me’.
CHILD: Oh! Nobody don’t likes me.
Since observational research projects are conducted in the field, i.e., in
real places and under real circumstances, they are classified into two
major classes: longitudinal and cross-sectional. Each will be discussed
briefly.
6.3.3.1 Longitudinal
Longitudinal research projects are usually conducted over a relatively
long period of time by observing the natural behaviour of a certain
person or a small group of persons. One of the areas which can best be
explored by longitudinal research is the simultaneous acquisition of two
languages from birth, or what is generally referred to as bilingual first
language acquisition (BFLA).
According to Genesee and Nicoladis (2006), BFLA studies are
concerned with the question of whether the developmental path and time
course of language development in BFL learners are the same as those of
children learning only one language. For answering questions of this
type, some researchers have closely observed and studied the language
acquisition of their own children. Genesee and Nicoladis cited two
scholars’ observations of their children as the pioneering studies in
BFLA.
Ronjat, the first scholar, published a detailed description of his son
Louis’s simultaneous acquisition of French and German in 1913. Louis
showed remarkable progress in both his languages and little sign of
confusion. Ronjat attributed Louis’s lack of confusion to each parent’s
use of only one language with him.

Ronjat’s conclusion was, however, brought into doubt in 1949 when
Leopold, the second scholar, published the last volume of a detailed
diary of his daughter Hildegard’s simultaneous acquisition of English
and German. Leopold claimed that the parents were insistent on a one
parent-one language rule. Yet Hildegard passed through a stage when
she used words from both languages, a fact that Leopold interpreted as a
sign that she had confused her two languages and was functioning as a
monolingual.
6.3.3.2 Cross-Sectional
Although longitudinal studies provide ideal opportunities for researchers
to observe bilingual first language acquisition (BFLA) as it takes place,
they might not always be feasible. The researchers might not have
bilingual partners to observe their children’s BFLA or have access to
similar children under natural conditions.
If we accept Genesee and Nicoladis’s (2006) limitation of their
discussion regarding simultaneous acquisition from birth to about four
years of age, we can develop a cross-sectional study to include bilingual
children, say Turkish and Persian speaking ones, whose age falls within
the specified range. Instead of observing a bilingual child over four
years, we can divide the four years into periods of, say, six months, and
choose one child from each period to observe their language production for
a given time.
Table 6.5 shows the selection of hypothetical bilingual children whose

parents speak Persian and Turkish at home. As can be seen, the selection
rests on children’s ages and must therefore reflect their development
both physiologically and psycholinguistically. Other variables such as
gender can also be included to increase the scope and generalizability of
research findings.

Table 6.5
Cross-sectional selection of 16 hypothetical bilingual children
Name  Ali         Mahdi     Babak      Arash    Hassan     Omid     Hamed      Reza
      Toktam      Zahra     Shirin     Leila    Akram      Mina     Raziyeh    Nahid
Age   Six months  One year  1.5 years  2 years  2.5 years  3 years  3.5 years  4 years
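The selection in Table 6.5 amounts to cutting the age range into six-month bands and assigning children to each band. A small Python sketch of that bookkeeping follows; the names are the hypothetical ones from the table, and pairing one boy and one girl per band is an assumption read off the table's two name rows.

```python
# Hypothetical children from Table 6.5, listed in the table's column order.
boys = ["Ali", "Mahdi", "Babak", "Arash", "Hassan", "Omid", "Hamed", "Reza"]
girls = ["Toktam", "Zahra", "Shirin", "Leila", "Akram", "Mina", "Raziyeh", "Nahid"]

# Six-month age bands from 6 to 48 months (four years), expressed in months.
age_bands = list(range(6, 49, 6))

# Assumption: each band is represented by one boy and one girl.
sample = {
    months: (boy, girl)
    for months, boy, girl in zip(age_bands, boys, girls)
}

for months, (boy, girl) in sample.items():
    print(f"{months:>2} months: {boy}, {girl}")
```

Each band is observed once, so the eight bands together stand in for the four years that a longitudinal study would need to follow a single child.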
6.3.4 Archival Research

Archival research is a type of nonexperimental research, in which
existing records are examined in order to test hypotheses about the
relationships between some variables. This type of research is usually
done to establish the state of the art in a particular area or to provide
enough background to follow a particular line of research.
Structural equation modeling (SEM) is, for example, a relatively new
statistical method which is currently employed to explore the
relationships among a host of variables, some of which can predict
others. Kunnan (1998) conducted archival research to pave the way
for its wider application in applied linguistics.
According to Kunnan (1998), Bachman and Palmer were the earliest
researchers who employed SEM in the validation of the FSI Oral Interview
(1981), components of communicative proficiency (1982) and
self-ratings of communicative language ability (1989). Although Kunnan
focuses on what has been investigated through SEM rather than the
years in which those studies were done, they have been chronologically
ordered in Table 6.6.

Table 6.6
The application of SEM in research projects on assessment
Year  Researchers                             Variables Explored
1980  Swinton and Powers                      The component abilities that underlie performance on the TOEFL
1983  Gardner, Lalonde and Pierson            Motivation, aptitude, and attitude as factors that affect second language acquisition
1983  Purcell                                 Models of pronunciation
1985  Fouly                                   The relationships among learner variables and second language proficiency
1985  Clement and Kruidenier                  Motivation, aptitude, and attitude as factors that affect second language acquisition
1986  Ely                                     Motivation, aptitude, and attitude as factors that affect second language acquisition
1987  Gardner, Lalonde, Moorcraft, and Evers  Motivation, aptitude, and attitude as factors that affect second language acquisition
1988  Gardner                                 Motivation, aptitude, and attitude as factors that affect second language acquisition
1988  Wang                                    Cognitive achievement and psychological orientation among language minority groups
1989  Hale, Rock and Jirele                   The factor structure of the TOEFL
1989  Turner                                  Second language cloze test performance
1993  Sasaki                                  The relationships among second language proficiency, foreign language aptitude, and intelligence
1995  Kunnan                                  The influence of some test taker characteristics on test performance in tests of English as a foreign language
1996  Purpura                                 The relationships between test takers’ cognitive and metacognitive strategy use and second language test performance
1998  Ginther and Stevens                     The factor structure of an Advanced Placement Spanish language examination among four different Spanish-speaking test taking groups

6.3.5 Survey Research

Surveys are research methods that are frequently employed in the social
sciences. One of the most common techniques for collecting information
in surveys is the questionnaire. Since questionnaires are the most common
technique, they have become synonymous with surveys. Marsh (1982),
however, insisted that they are not synonymous. According to de Vaus
(1985), other techniques such as structured and in-depth interviews,
observation and content analysis are also appropriate in surveys. He
believes that the distinguishing features of surveys are the form of
data collection and the method of analysis.
Surveys are characterized by a structured or systematic set of data
collection tables called a “case data matrix” (de Vaus, 1985, p.3). They
tabulate the collected information on the basis of variables and cases or
participants in the surveys. Table 6.7 presents a typical data matrix.
Table 6.7
A variable by case matrix
Variables    Cases
             Person 1       Person 2    Person 3     Person 4        Person 5
Sex          Male           Male        Female       Male            Female
Age          36 yrs         19 yrs      30 yrs       55 yrs          42 yrs
Political    Progressive²²  Moderate²³  Progressive  Traditionalist  Traditionalist
orientation
class working Lower class Upper class
Upper Middle
middle
From D. A. de Vaus (1985) Surveys in Social research. Sydney: Allen & Unwin, p. 4
One function of survey analysis is to describe the characteristics of a set
of cases. As shown in Table 6.7, for example, if a researcher wishes to
describe how a group of people will vote, he needs to know their
distinctive characteristics such as age, gender, political orientation and
social class. A variable by case matrix provides this information.
22 Progressive is a person who favours social, economic or political reforms
23 Moderate is a person who holds no extreme opinion in politics and favours gradual
reforms
Survey researchers are also interested in exploring the causes of given
variables such as voting. In contrast to experimental research, however,
survey analysts try to locate causes by comparing cases rather than by
introducing and implementing a particular type of treatment. By
comparing how cases vary on some characteristic (e.g. some cases will
be progressive and others will be traditionalist), survey analysts can
see whether the progressives are also systematically different from the
traditionalists in some other way.
For example, in Table 6.7 there is variation across cases in how they
vote. This is systematically linked to variations in class: the progressives
are working class and the traditionalists are middle class. In other words,
survey research seeks an understanding of what causes some
phenomenon (e.g. vote) by looking at variation in that variable across
cases, and looking for other characteristics which are systematically
linked with it. As such, it aims to draw causal inferences (e.g. class
affects vote) by a careful comparison of the various characteristics of
cases. It does not end there. The next step is to ask why class affects
vote. Survey researchers need to be very careful, however, to avoid
mistaken attribution of causal links (simply to demonstrate that two
things go together does not prove a causal link).
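The comparison of cases described above can be sketched in a few lines of Python. The sketch stores the sex, age and political orientation rows of Table 6.7 as a variable by case matrix and then compares the average age of the cases sharing each orientation; the social class row is left out, and the helper name mean_age_by_orientation is an illustrative assumption. With only five cases the output describes this matrix and nothing more.

```python
from collections import defaultdict

# Variable by case matrix: one dictionary (case) per person, one key per variable.
# Values are taken from the sex, age and political orientation rows of Table 6.7.
cases = [
    {"sex": "male", "age": 36, "orientation": "progressive"},
    {"sex": "male", "age": 19, "orientation": "moderate"},
    {"sex": "female", "age": 30, "orientation": "progressive"},
    {"sex": "male", "age": 55, "orientation": "traditionalist"},
    {"sex": "female", "age": 42, "orientation": "traditionalist"},
]


def mean_age_by_orientation(matrix: list[dict]) -> dict[str, float]:
    """Group cases by political orientation and average their ages."""
    ages = defaultdict(list)
    for case in matrix:
        ages[case["orientation"]].append(case["age"])
    return {group: sum(values) / len(values) for group, values in ages.items()}


for group, mean_age in mean_age_by_orientation(cases).items():
    print(f"{group}: mean age {mean_age:.1f}")
```

In this tiny matrix the traditionalists are on average older than the progressives, but, as cautioned above, such a pattern shows only that two characteristics go together, not that one causes the other.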
This style of research and analysis in sociology can be contrasted with
other methods. For example, the case study method in psychology
involves data collection about one case. Since there are no other cases
for comparison, quite different strategies for understanding the
behaviour and attitudes of that case have to be employed. The
experimental method is similar to the survey method in that data are
collected in the variable by case matrix form, but they are fundamentally
different in that the variation between the attributes of people in
experimental research is created by intervention or treatment from an
experimenter.
6.3.6 Historical Research

Historical research involves studying, understanding, and explaining
past events (Gay 1990). We study past events to find out what
brought them about, what effects they produced at the time of their
occurrence and what trends they went through. The information we
gather may help us explain present events and anticipate future ones.
As you might have figured out, there is a relationship between archival

research and historical research. While the former deals with a specific
set of documents dealing with a particular topic at a specific period of
time, and usually within a given data bank, historical research stretches
back in time and traces the development of an event through the ages.
Miremadi (1991, p. 18), for example, conducted historical research on
Theories of translation and interpretation and developed it into a
textbook in order to “make a review of the translation literature and to
trace the development of the translation theories from antiquity to the
present,” hoping that it will “illustrate some common problems
translators and interpreters face and the techniques how to confront
them.”
6.4 Summary
All research projects require facing a problem, feeling a need to solve
the problem and taking proper steps to solve it. They are conducted in
order to satisfy an internal desire to know, i.e., basic research, to employ
a theory to address an existing problem, i.e., applied research, to decide
which currently practiced approaches yield the best result, i.e.,
evaluation research, or to address a trouble faced in class or at work, i.e.,
action research.
Upon determining the purpose of their research projects, researchers
have to adopt an appropriate method to solve their research problems. If
they have a new approach to teaching language skills such as reading
comprehension ability, they ought to design an experimental research
project by establishing either control and experimental groups or a single
group whose lack of reading ability is evidenced before the project and
whose mastery is confirmed by the data obtained at the end of the
experiment. If the variables involved in the project are many and there is
no possibility of introducing and implementing treatments, the
researchers have to resort to correlational and survey research projects
and study the participants’ behaviour on certain instruments which
quantify the variables under study.
Both experimental and correlational research projects call for
directing participants who are capable of complying with prescribed
treatments such as taking drugs or embarking on certain
activities such as reading and taking tests. If some participants such as
children cannot be directed by researchers, their behaviour must be
studied as it is exhibited under real circumstances in natural places
at an appropriate time, i.e., observational research.
There are, however, some occasions on which there is no possibility of

having access to the people or events whose variables are studied within
given experimental, correlational, observational or survey research
projects. The researchers have no choice but to find archives and/or
written documents upon which they can base their analyses, i.e., archival
and historical research methods.
In addition to experimental, correlational, observational, survey, archival

and historical research projects, there are two other research methods
which, I believe, deserve separate chapters, i.e., translation and text-
based research methods. The next two chapters will deal with these two
methods.

7 Translation Research
7.1 Introduction
Chapter 7 is a revised and enlarged version of a paper entitled,

“Translation: A research method” (Khodadady, 2000). A review of
literature in this paper shows that different types of research projects
have been discussed and conducted in various fields such as education,
psychology and applied linguistics. While the majority of research
methods have received a fair share of attention and elaboration in these
fields, none of the research textbooks published in these fields has
referred to translation as a viable method of investigation.
Chapter 7, therefore, attempts to approach translation as a research
method by resorting to the scientific method and the three widely
recognized principles of research, viz., validity, reliability and
feasibility. Similar to all scientific methods, translation rests not only on
a theory but also on observation. As regards research principles,
translation reaches validity internally but leaves external validity
unaddressed. Consistency in translation is obtained through translator
reliability. The time, money and energy saved by not replicating
research projects already conducted in a given source language
make it the most feasible method of research in Iran.
Approaching translation as a research method is, however, by no means
new. Wills (1982, p. 8), for example, used the term translation research
and did his best to develop it as an empirical concept. Roberts (1973), on
the other hand, adopted a literary approach and maintained that scholars
do research in order to uncover some of the accumulated lore of our
civilization. Roberts defines lore as “the knowledge that presently
exists” (p. 275). Although the attitudes of Wills and Roberts on the
nature of research stand on the two extremes of a continuum (scientific
vs. literary), they provide us with a sound basis to define translation
research as a scientific method to uncover for our target readers the
knowledge that presently exists in a language unknown to the readers.
Providing target language readers with the knowledge that exists in a

language unintelligible to them should pose an educational problem for
translators and thus necessitate a research project to find appropriate
solutions. This view of translation research is compatible with the
definition of research as the formal and systematic application of the
scientific method to the study of problems in education (e.g., Gay,
1990), psychology (e.g., McBurney, 1994), sociology (e.g., de Vaus,
1985) and applied linguistics (e.g., Hatch & Farhady, 1982; Hatch &
Lazaraton, 1991).
According to Lerner, Kendall, Miller, Hultsch and Jensen (1986, p. 5)

the term scientific method refers to (1) the assumptions about how to
understand the world, and (2) the many different techniques of
observation. When the term assumption is used synonymously with the
term theory (see Richards and Rodgers 1986, pp. 15-16) and applied to
translation, translation will be scientific if (1) there is a theory to
explain how it is done and (2) it uses one of the techniques of observation.
No definite theory has so far been offered to explain how translation
occurs. Newmark (1988), however, asserted that “translation theory derives
from comparative linguistics, and within linguistics, it is mainly an
aspect of semantics; all questions of semantics relate to translation
theory” (p.7). I have, however, already shown that multiple choice items
suffered from a similar shortcoming (Khodadady, 1997, 1999a). I
suggest schema as a powerful theory to explain how translation occurs.
My suggestion is supported by a study based on the translation of 22
undergraduate students at Kurdistan University (forthcoming).

Dixon and Massey (1983) referred to observation, the second

requirement of scientific method, as “any sort of numerical recording of
information” (p.5). According to Armitage (1971), there are two types of
observation, qualitative and quantitative. As will be discussed shortly,
translation research is based on qualitative observations.
Although all research textbooks discuss research methods without

focusing on theories, I believe a theoretical explanation of a given
research method provides a coherent view as to what must go on when it
is employed to conduct a project. Schema theory provides such an
explanation and thus can be employed successfully to account for what
happens during translation.
As we realized in chapter six, all research projects have a purpose to

fulfill and a method to be followed. Similarly, translation research
projects serve certain functions and are basically conducted by
employing certain conscious or unconscious and qualitative or
quantitative methods. We shall discuss these functions and methods in
this chapter.
7.2 Purposes of Translation Research

Among various types of research methods, translation is the most
feasible method that provides researchers with an ideal opportunity to
claim one or even all research purposes, i.e., basic, applied, evaluation
and action.
7.2.1 Basic Translation Research

The main function of translation is basic in that a translator wishes to
make the ideas of the author whose works he chooses to translate known
to the community of his mother or target language. Besides appreciating
the work under translation and satisfying his curiosity, the
translator responds to his ego. In his essay Why I write, Orwell (1958)
gave the following reason for writing, which I believe is equally
applicable to translation.

Sheer egoism. Desire to seem clever, to be talked about, to

be remembered after death, to get your own back on grown-
ups who snubbed you in childhood, etc., etc. It is humbug
to pretend this is not a motive, and a strong one. Writers
share this characteristic with scientists, artists, politicians,
lawyers, soldiers, successful businessmen- in short, with the
whole top crust of humanity (Bott 1945, pp. 101-102).
According to Holzman (1970, p. 84), ego is a “slowly developed network
of pathways” which enable an individual to take an active part in finding
satisfaction and in relating himself to people who can help to provide such
satisfaction. Translators respond to their ego when they try to establish
a network with the readers of their translations. Pickthall²⁴ (1930), for
example, tried to establish a network with his English readers who wish
to read the Quran rendered by a Muslim translator as follows:
[Photograph: Marmaduke Pickthall]
'The aim of this work is to present to English readers what
Muslims the world over hold to be the meaning of the words of
the Koran, and the nature of that Book, in not unworthy
language and concisely, with a view to the requirements of
English Muslims. It may be reasonably claimed that no Holy
Scripture can be fairly presented by one who disbelieves its
inspiration and its message; and this is the first English
translation of the Koran by an Englishman who is a Muslim
(quoted in Arberry 1964, p. 20).
24 According to Hadhrami (2009), Muhammad Marmaduke Pickthall was born in
London in 1875 to an Anglican clergyman. He was a contemporary of Winston Churchill
at Harrow private school. He travelled extensively in the Arab world and Turkey and in
1917 reverted to Islam and soon became a leader among the emerging group of British
Muslims.
Similarly, as a non-Muslim translator who has discovered the poetic

aspect of the Holy Quran and thus is eager to make his discovery
accessible to his readers, Arberry (1964) announced that there are a
number of familiar topics which are brought up throughout the Holy
Quran. Each surah elaborates one or more of these topics and reinforces
them by employing a subtle rhythmical flow of the discourse. The non-
Arabic readers of the translated Quran often fail to realize that the
Arabic Quran has a unique musical aspect which makes it a literary
masterpiece.
If this diagnosis of the literary structure of the Koran may be

accepted as true -- and it accords with what we know of the
poetical instinct, indeed the whole aesthetic impulse, of the
Arabs -- it follows that those notorious incongruities and
irrelevancies, even those 'wearisome repetitions', which have
proved such stumbling-blocks in the way of our Western
appreciation will vanish in the light of a clearer understanding
of the nature of the Muslim scriptures. A new vista opens up;
following this hitherto unsuspected and unexplored path, the
eager interpreter hurries forward upon an exciting journey of
discovery, and is impatient to report his findings to a largely
indifferent and incredulous public (Arberry 1964, p. 28).
7.2.2 Applied Translation Research

Translation research can be viewed as applied from two perspectives. First,
the very process of translating entails applying one’s syntactic, semantic,
discoursal and global knowledge of the source text to expressing the
same knowledge in the target text. It is, therefore, a deliberate endeavour
on the part of translators to transfer the content of a given text from one
language to another without their being visible. As Norman Shapiro
(quoted by Venti, 1995) said,

I see translation as the attempt to produce a text so transparent

that it does not seem to be translated. A good translation is like a
pane of glass. You only notice that it’s there when there are little
imperfections- scratches, bubbles. Ideally, there shouldn’t be
any. It should never call attention to itself (p. 1).
Secondly, in contrast to basic translational research, which is primarily

done to satisfy translators’ own ego, applied translational research is
conducted to fulfill various purposes some of which are specified below.
1. To meet a public demand in general. This does not, however, mean

that a given translator might not have both basic and applied purposes
in mind when s/he embarks on translating a given source text. It is in
fact my own understanding that the more purposes translators have in
their translational research, the better their translated target text
would be in terms of its quality and readership.
2. To embark on a mission by exposing the assumed wrong beliefs held
by certain groups of people, for example, to debunk a given religion
like Islam and aid in the conversion of its believers to another (Ross,
1649).
3. To answer polemics faced by certain translators. For example,
Mohammed (2005) assessed some translations to show what opinions
Christians hold about the Quran.
4. To reply to criticism leveled at a particular translated text, e.g., the
manifold criticisms of the Quran by various Christian authors.
7.2.3 Evaluation Translation Research

Translation as a research method is employed to make a foreign
document available to the speakers of a given language as accurately as
possible. Naturally, there are some foreign documents which are of
great value to speakers of all languages.
The Quran is, for example, of great importance not only to English
speaking Muslims but also to those who are in contact with Islamic
societies. Its very importance has encouraged many scholars to translate

the Quran not only from Arabic but also from other languages, e.g.,
Latin and French, to English.
Although evaluating different translations done by various translators is
in itself noteworthy research, some scholars have used their evaluation
in order to justify a new translation of the Glorious Quran. Arberry
(1964), for example, in his preface to Volume One of The Koran
Interpreted has reproduced the rendering of some verses from two
surahs by six translators to show how his translation differs from theirs.
In order to save space, I have reproduced the English translations of

three verses in Surah 19 (Mary), i.e., verses 16, 17 and 18 from six
sources. Arberry (1964) quoted these sources in order to justify his own
translation. To make the evaluation as accurate as possible, Arberry’s own
translation of the verses, along with the original Arabic, is
given at the end. As a class activity, compare the seven translations
with each other and then decide what makes Arberry’s different. It is
worth noting that the translations are given in chronological order, i.e.,
Ross’s translation is the earliest English translation of the Quran.
Ross (1649)
'Remember thou what is written of Mary, she retired towards the East,
into a place far remote from her Kindred, and took a Vail to cover her,
we sent her our Spirit in form of a man; she was afraid, and said, God
will preserve me from thee, if thou have his fear before thine eyes.
Sale (1880)
'And remember in the book of the Koran the story of Mary; when she
retired from her family to a place towards the east, and took a veil to
conceal herself from them; and we sent our spirit Gabriel unto her, and
he appeared unto her in the shape of a perfect man. She said, I fly for
refuge unto the merciful God, that he may defend me from thee: if thou
fearest him, thou wilt not approach me.
Rodwell (1909)

'And make mention in the Book, of Mary, when she went apart from her
family, eastward, and took a veil to shroud herself from them: and we
sent our spirit to her, and he took before her the form of a perfect man.
She said: "I fly for refuge from thee to the God of Mercy! If thou fearest
Him, begone from me."
Palmer (1880)
'And mention, in the Book, Mary; when she retired from her family into
an eastern place; and she took a veil to screen herself from them; and we
sent unto her our spirit; and he took for her the semblance of a well-
made man. Said she, "Verily, I take refuge in the Merciful One from
thee, if thou art pious."
Pickthall (1930)
16. And make mention of Mary in the Scripture, when she had
withdrawn from her people to a chamber looking East.
17. And had chosen seclusion from them. Then We sent unto her Our
spirit and it assumed for her the likeness of a perfect man.
18. She said: Lo! I seek refuge in the Beneficent One from thee, if thou
art God-fearing.
Bell (1960)
16. Make mention in the Book of Mary; When she withdrew from her
people to a place, eastward.
17. And took between herself and them a curtain. Then We sent to her
Our spirit, who took for her the form of a human being, shapely.
18. She said: "Lo, I take refuge with the Merciful from thee, if thou art
pious."
Arberry (1964, p. 331)
'And mention in the Book Mary
when she withdrew from her people
to an eastern place,
and she took a veil apart from them;
then We sent unto her Our Spirit
that presented himself to her
a man without fault.
She said, I take refuge in
the All-merciful from thee!
If thou fearest God …’

وَاذْكُرْ فِي الْكِتَابِ مَرْيَمَ إِذِ انتَبَذَتْ مِنْ أَهْلِهَا مَكَانًا شَرْقِيًّا {16}
فَاتَّخَذَتْ مِن دُونِهِمْ حِجَابًا فَأَرْسَلْنَا إِلَيْهَا رُوحَنَا فَتَمَثَّلَ لَهَا بَشَرًا سَوِيًّا {17}
قَالَتْ إِنِّي أَعُوذُ بِالرَّحْمَن مِنكَ إِن كُنتَ تَقِيًّا {18}
Here is how Arberry (1964) justified his own translation:
The verses into which the individual Surah is divided usually,

but not always, represent rhetorical units, terminated and
connected together by a rhyming word. A few bold spirits have
ventured on occasion to show this feature by rhyming their
translations; the resulting products have not been very
impressive. For my own part I have preferred to indicate these
terminations and connections by rounding off each succession of
loose rhythms with a much shorter line. The function of rhyme
in the Koran is quite different from the function of the rhyme in
poetry; it therefore demands a different treatment in translation.
7.2.4 Action Translation Research

While some scholars may conduct translation research for basic and
evaluation purposes, others may feel that a certain aspect of knowledge
present in a source text is missing in a target population and thus take
action to make the text available to the population. Although this action
may have many reasons such as the ones mentioned in applied
translation research, its basic thrust is to take a practical step to solve a
perceived local problem.
In his foreword to Ma'ariful-Qur'an (Shafi, 1995), Usmani, for example,

declared that although a relatively large number of English translations
of the Quran exists in the market, “yet no comprehensive commentary
of the Holy Qur'an has still appeared in the English language. Some
brief footnotes found with some English translations cannot fulfill the
need of a detailed commentary” (p. xvii).

In order to meet the public demand for an English commentary of the

Quran, Usmani decided to translate Ma'ariful-Qur'an (Shafi, 1995) from
Urdu to English. For this purpose, he invited some Pakistani Muslim
experts in English to help him out in this giant task. During the process
of translating the commentary, however, they faced a new problem
which called for an immediate action. They realized that they needed to
have a translation of the Quran so that they could use it as a reference.
They had three choices (p. xix):
1. To adopt any one of the already available English translations of the

Holy Qur'an, like those of Arberry, Pickthall or Abdullah Yusuf Ali
2. To translate the Urdu translations used in the Ma'ariful-Qur'an into
English.
3. To provide a new translation of their own
After deliberation and consultation, Usmani and his team elected to

work on the third option, i.e. to prepare a new translation of the Quran
themselves. Unfortunately, they did not feel any need to list, let alone
explain, their “manifold” reasons for this option other than a desire “to
prepare a translation which may be closer to the Qur'anic text and easier to
understand” (Shafi, 1995, p. xix).
I have reproduced Usmani and his team’s translation of three verses in

Surah 19 (Mary) i.e., 16, 17 and 18, in order to pave the way for their
comparison with the seven translations given in section 7.2.3.
And mention in the Book (the story of) Maryam, when she withdrew
from her people to a place eastwards, [16] then she used a barrier to
hide herself from them. Then, We sent to her Our Spirit and he took
before her the form of a perfect human. [17] She said, "I seek refuge
with the Rahman (All-Merciful) against you if you are God-fearing."
[18] (Shafi, n.d., p. 34)
As can be seen in the verses translated above, Usmani and his team have
transliterated the prophets’ names according to their Arabic

pronunciation, and not according to their biblical form. For example, the
biblical Mary has been transliterated as Maryam. Similarly, instead of
biblical Abraham, the Qur'anic Ibrāhīm, and instead of Joseph, the
Qur'anic Yūsuf have been kept in English translation. However, for
proper nouns other than those of prophets, like Pharaoh, their English
forms have been retained.
7.3 Methods Employed in Translation Research

In 1963 Campbell and Stanley listed 16 different methods of research.
The list has undergone a number of changes since then and has received
various degrees of emphasis in different fields. In applied linguistics
some methods have been emphasized whereas others have been ignored.
Hatch and Farhady (1982), for example, discussed five methods, namely
pre-experimental, true experimental, quasi-experimental, ex post facto
and factorial methods. Hatch and Lazaraton (1991) left out factorial
design and elaborated on subdesigns of true experimental and quasi-
experimental studies as independent methods. Other methods such as
historical and survey research projects are discussed in textbooks on
educational research (e.g. Fraenkel & Wallen, 1993; Gay, 1990).
Research methods are differentiated from each other in terms of the

techniques they employ to observe and collect data. Questionnaires are,
for example, the most commonly used techniques in survey research
projects. Surveys and questionnaires are therefore used synonymously in
the literature (de Vaus, 1985).
Similar to other research methods, translation employs its own

techniques or procedures. I have divided translation methods into two
categories: macrostructural and microstructural. This classification is
based on schema theory. Almost all translation methods discussed in the
literature are macrostructural.

7.3.1 Macrostructural Methods

Macrostructural methods of research approach translation on the basis of
units larger or other than sentences. These methods attempt to address
the meaning or the message expressed in the source language (SL) as a
whole, i.e., SL text. Newmark (1988), for example, asserted that there
are eight methods of translation, namely, word-for-word, literal, faithful,
semantic, adaptation, free, idiomatic and communicative translations.
7.3.1.1 Word-for-Word Translation

In this method the SL determines the structures used in the target
language (TL). The word order of the SL is, for example, preserved and
the words are translated mainly by their most common meanings. The
context in which words occur in the SL does not play any role. The main
function of this linear translation method is either to understand the
mechanics of the SL or to understand the complex text as a pre-
translation process.
I have translated the first three lines of the Song of the Reed (Rumi
2001, Book 1: Lines 1-34) word by word below.
Listen to the reed because it complains:
From separations it tells stories
From the reed-bed till they cut me away
In my wailing, men and women have cried
A chest I want broken piece by piece in severance
To tell the description of yearning pain

بشنو از نی چون شکایت می کند
از جداییها حکایت می کند
کز نیستان تا مرا ببریده اند
در نفیرم مرد و زن نالیده اند
سینه خواهم شرحه شرحه در فراق
تا بگویم شرح درد اشتیاق
(Rumi, 2001)
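The word-for-word procedure described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the romanized tokens and the one-gloss-per-word glossary are hypothetical and are not taken from any dictionary or from the translations quoted in this chapter. The sketch shows the two defining properties of the method, i.e., the SL word order is preserved and context plays no role in choosing a gloss.

```python
from typing import Dict, List

# Hypothetical mini-glossary: one "most common" English gloss per romanized
# Persian word. Both the romanization and the glosses are illustrative
# assumptions, not dictionary data.
GLOSSARY: Dict[str, str] = {
    "beshno": "listen",
    "az": "from",
    "ney": "reed",
    "chon": "because",
    "shekayat": "complaint",
    "mikonad": "makes",
}


def word_for_word(sl_tokens: List[str], glossary: Dict[str, str]) -> List[str]:
    """Gloss each SL token in its original order, ignoring context.

    Tokens missing from the glossary are kept in square brackets so that
    the translator can see which words still need attention.
    """
    return [glossary.get(token, f"[{token}]") for token in sl_tokens]


first_line = ["beshno", "az", "ney", "chon", "shekayat", "mikonad"]
print(" ".join(word_for_word(first_line, GLOSSARY)))
```

The output keeps the Persian word order and is therefore not natural English, which is why the method mainly serves to expose the mechanics of the SL or as a pre-translation step.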
7.3.1.2 Literal Translation

As an improvement to word-for-word translation, literal translation
changes the structure of the SL into the TL. However, the lexical words
are translated on the basis of their most common meanings in the SL.
This allows translators to specify what problems should be solved.

Nicholson (1926), for example, translated the first three lines described
in 7.3.1.1. As you can see, he has kept the structure of the Persian lines as
much as possible and used the most common meaning of words such as
CHON (چون) as how and JODAIIHA (جداییها) as separations.
Listen to this reed how it complains:
it is telling a tale of separations.
Saying, "Ever since I was parted from the reed-bed
Men and women have moaned in (unison with) my lament.
I want a bosom torn by severance,
that I may unfold (to such a one) the pain of love-desire (Nicholson, 1926).

بشنو از نی چون شکایت می کند
از جداییها حکایت می کند
کز نیستان تا مرا ببریده اند
در نفیرم مرد و زن نالیده اند
سینه خواهم شرحه شرحه در فراق
تا بگویم شرح درد اشتیاق
(Rumi, 2001)
Nicholson’s translation, however, moves beyond literal boundaries by
replacing BOBRIDEAND (ببریده اند) with was parted. He has somehow
misunderstood the meaning of ESHTIYAGH (اشتیاق) by choosing
love-desire as its English equivalent.
7.3.1.3 Faithful Translation

In this method the lexical problems faced in literal translation are
solved by using the context of the SL text. Trying to be faithful to the
author’s intentions expressed in the SL text, translators preserve the
grammatical and lexical deviations from SL norms in the TL text.
Arberry (1961), for example, translated the Song of the Reed (Rumi
2001, Book 1: Lines 1-34) faithfully by adopting one-to-one English
equivalents for original Persian words. He has, however, tried to make
the English translation as natural as possible.
Listen to this reed, how it makes complaint,
telling a tale of separation:
"Ever since I was cut off from my reed-bed,
men and women all have lamented my bewailing.
I want a breast torn asunder by severance,
so that I may fully declare the agony of yearning (Arberry 1961).

بشنو از نی چون شکایت می کند
از جداییها حکایت می کند
کز نیستان تا مرا ببریده اند
در نفیرم مرد و زن نالیده اند
سینه خواهم شرحه شرحه در فراق
تا بگویم شرح درد اشتیاق
(Rumi, 2001)
7.3.1.4 Semantic Translation

The fidelity of translators to the grammatical and lexical deviations in
the SL is replaced by translators’ intuitions and empathy. An attempt is
made to preserve what Newmark (1988) called aesthetic value or the
beautiful and natural sound of the SL text. This is achieved by choosing
TL words which preserve the sounds that resemble each other in
the SL words. These TL words may not convey the meanings expressed
in the SL words.
Redhouse (1881), for example, translated the first three lines of the Song
of the Reed (Rumi 2001, Book 1: Lines 1-34) by following a semantic
method. As can be seen, Redhouse has been quite successful in
preserving the music of the original Persian lines. He has translated
BOBRIDEAND (ببریده اند) as tore. Since it does not rhyme with wept as an
appropriate equivalent of NALIDEAND (نالیده اند), he has replaced it with
eyes have wept right sore not only to rhyme with tore but also to stay
semantically faithful.
From reed-flute hear what tale it tells;
What plaint it makes of absence' ills.
"From jungle-bed since me they tore,
Men's, women's, eyes have wept right sore.
My breast I tear and rend in twain,
To give, through sighs, vent to all my pain.

بشنو از نی چون شکایت می کند
از جداییها حکایت می کند
کز نیستان تا مرا ببریده اند
در نفیرم مرد و زن نالیده اند
سینه خواهم شرحه شرحه در فراق
تا بگویم شرح درد اشتیاق
(Rumi, 2001)
Sometimes preserving and conveying the feeling achieved through
sounds in the original becomes so important that the translator
paraphrases a single schema such as JODAIIHA (جداییها) to achieve an
artistic value. For example, Sir William Jones (quoted in Arberry 1954)
employs phrases such as departed bliss and present woe to describe
JODAIIHA. Similarly, he inserts bleed to rhyme with feel by adding the
compound sentence feel what I sing and bleed when I lament in order to
convey the beauty of meaning expressed through rhyming FERAGH
(فراق) with ESHTIYAGH (اشتیاق).
Hear, how yon reed in sadly pleasing tales
Departed bliss and present woe bewails!
'With me, from native banks untimely torn,
Love-warbling youths and soft-ey'd virgins mourn.
O! Let the heart, by fatal absence rent,
Feel what I sing, and bleed when I lament
(Arberry, 1954)

بشنو از نی چون شکایت می کند
از جداییها حکایت می کند
کز نیستان تا مرا ببریده اند
در نفیرم مرد و زن نالیده اند
سینه خواهم شرحه شرحه در فراق
تا بگویم شرح درد اشتیاق
(Rumi, 2001)
7.3.1.5 Adaptation
Adaptation is a free method of translation employed in dramas and
poetry. While the themes, characters, and plots of the SL text are
preserved, the SL culture is converted to the TL culture and the text is
rewritten by an established dramatist or poet (Newmark, 1988).
Due to rapid developments in technology and human communication,
however, having an established poet rewrite the translated text in the TL
does not seem to be a major concern these days, particularly on the
Internet. World Wide Web surfers, for example, look for materials of
interest to them and make them available to the public even without
mentioning their sources properly. This does not, however, mean that one
cannot find well-referenced sources on the Internet.
A highly self-motivated and enthusiastic scholar called Ibrahim Gamard,
for example, has posted various translations of the Song of the Reed
(Rumi 2001, Book 1: Lines 1-34) at the site
http://www.dar-al-masnavi.org/reedsong.html. Whinfield's (1887)
translation of the first three lines of the song appears below. They
provide a typical example of adaptation.
Hearken to the reed-flute, how it complains,
Lamenting its banishment from its home:
Ever since they tore me from my osier bed,
My plaintive notes have moved men and women to tears.
I burst my breast, striving to give vent to sighs,
And to express the pangs of my yearning for my home (Whinfield, 1887).

بشنو از نی چون شکایت می کند
از جداییها حکایت می کند
کز نیستان تا مرا ببریده اند
در نفیرم مرد و زن نالیده اند
سینه خواهم شرحه شرحه در فراق
تا بگویم شرح درد اشتیاق
(Rumi, 2001)
As can be seen in the translated lines above, Whinfield (1887) adapted
the poem to his own feelings and translated JODAIIHA (جداییها) as
banishment instead of separations. Similarly, instead of translating
HEKAYAT (حکایت) as narrating, he has offered lamenting. It seems that
since Whinfield was a Westerner he could not capture Rumi's message as
well as Shahriari (1998) did, as shown below.
Pay heed to the grievances of the reed
Of what divisive separations breed
From the reedbed cut away just like a weed
My music people curse, warn and heed
Sliced to pieces my bosom and heart bleed
While I tell this tale of desire and need.

بشنو از نی چون شکایت می کند
از جداییها حکایت می کند
کز نیستان تا مرا ببریده اند
در نفیرم مرد و زن نالیده اند
سینه خواهم شرحه شرحه در فراق
تا بگویم شرح درد اشتیاق
(Rumi, 2001)
7.3.1.6 Free Translation

In contrast to adaptation, translators adopt a free approach when they
consider rendering the exact message unnecessary. This might happen
for a number of reasons. I believe the most tangible reason for free
translation might be found in the difficulty translators face
when they come across some unintelligible parts in the source text. By
applying this method they interpret the original and present their own
understanding as the author’s.
The second reason has its roots in translators’ personal understandings

and even religious preferences, which predispose them to take as much
liberty as they want with their rendering. The effect of personal ideas
and inclinations can best be documented in the translation of the Quran.
Let us read verses 32 and 33 of Surah 38 (ص) and see how they have
been translated by different translators.
فَقَالَ إِنِّي أَحْبَبْتُ حُبَّ الْخَيْرِ عَن ذِكْرِ رَبِّي حَتَّى تَوَارَتْ بِالْحِجَابِ {32}
رُدُّوهَا عَلَيَّ فَطَفِقَ مَسْحًا بِالسُّوقِ وَالْأَعْنَاقِ {33}
Arberry (1964)
he said, 'Lo, I have loved the love of good things better than the
remembrance of my Lord, until the sun was hidden behind the veil. Return
them to me!' And he began to stroke their shanks and necks.
Khan (1970)
And he said: "Alas! I did love the good (these horses) instead of
remembering my Lord (in my 'Asr prayer)" till the time was over, and (the
sun) had hidden in the veil (of night). Then he said "Bring them (horses)
back to me." Then he began to pass his hand over their legs and their
necks (till the end of the display).
Mohammad Ali (1917)
So he said, I love the good things on account of the remembrance of my
Lord -- until they were hidden behind the veil. (He said): Bring them back
to me. So he began to stroke (their) legs and necks.
Pickthall (1930)
And he said: Lo! I have preferred the good things (of the world) to the
remembrance of my Lord; till they were taken out of sight behind the
curtain. (Then he said): Bring them back to me, and fell to slashing (with
his sword their) legs and necks.
Sarwar (1981)
he said, "My love of horses for the cause of God has made me continue
watching them until sunset, thus making me miss my prayer". He said,
"Bring them back to me." Then he started to rub their legs and necks.
Shakir (1983)
Then he said: Surely I preferred the good things to the remembrance of my
Lord-- until the sun set and time for Asr prayer was over, (he said): Bring
them back to me; so he began to slash (their) legs and necks.
Sherali (1955)
He said, `I love the love of good things because they remind me of my
Lord.' And when they were hidden behind the veil, He said, `Bring them
back to me.' Then he started stroking their legs and their necks.
Yusufali (1989)
And he said, "Truly do I love the love of good, with a view to the glory of
my Lord,"- until (the sun) was hidden in the veil (of night): "Bring them
back to me." then began he to pass his hand over (their) legs and their
necks.
7.3.1.7 Idiomatic Translation

Idiomatic translation aims at colloquialism and idiomatic style in the TL
text. Although it attempts to reproduce the message conveyed in the SL
text, it distorts some meaning by preferring colloquialisms and idioms
where these do not exist in the original.
Rashad (1978) employed an idiomatic approach to translate verses 32
and 33 of Surah 38 (ص). As can be seen below, instead of the sun was
hidden behind the veil he uses the idiomatic expression the sun was
gone. He has also added "to bid farewell" to verse 33 on his own
initiative by inserting it inside parentheses.
He then said, "I enjoyed the material things more than I

enjoyed worshiping my Lord, until the sun was gone.
"Bring them back." (To bid farewell,) he rubbed their legs
and necks.

It is worth noting that many Islamic scholars avoid mentioning Rashad's

(1978) translation of the Quran because it contains "blasphemous
statements" (Kidwai, 1987). However, some inappropriate translations
such as the ones belonging to Yusufali (1989) have not caught the
attention of critics yet. (An example is given in 7.4.1.)
7.3.1.8 Communicative Translation

Translators who employ communicative translation attempt to convey
the message of the SL text to their TL readers in such a way that they
accept both the content and language of the TL text. Since
communicative translators do not adhere strictly to the content of the SL
text, deleting some concepts from the original and adding their own to the
TL text, communicative translation should be viewed as a free
macrostructural method.
The adoption of a communicative approach towards translation,

however, makes it subjective in the sense that there is still no objective
way to quantify the process. It seems, for example, the translator of
Amouzadeh-Mahdiraji's (2004, p. 65) Persian abstract has done his best
to make his English speaking readers accept the content and language of
the abstract by following the style and format of English journals.
Language and the Representation of Reality
Abstract:
This paper aims to investigate how language forges and structures the
conceptual apparatus of human beings to perceive the world. By
presenting a critical review of major linguistic and philosophical
reflections on language and reality, the paper brings to light certain
aspects of language involved in constructing reality in a specific
ways. On the basis of three theoretical frameworks in linguistics,
namely anthropological linguistics by Whorf, cognitive linguistics by
Lakoff, and social semiotics by Halliday, the current paper analyzes a
number of examples from Persian language to demonstrate that the
perception and conceptualization of reality is, to a great extent, bound
up with language use. Therefore, it can be argued that language does not
map onto external reality, but it represents our conceptualization of
and interaction with the world.
Key words: language, cognition, presentation of reality, linguistic
categories, classification of experiences, selectivity (Amouzadeh-
Mahdiraji 2004, p. 65).

نقش زبان در نمود واقعیت ها
چکیده:
رابطه بین زبان و واقعیت که از دیر باز مورد توجه فلاسفه بوده، در سدة اخیر به
دیگر حوزه های علوم انسان خصوصا روانشناسی و زبانشناسی نیز راه یافته
است. اغلب فیلسوفان، روان شناسان و معناشناسان تا دهه های اخیر زبان را منفک
از مقوله شناخت و واقعیت فرض میکردند. به عبارت دیگر، مقوله شناخت جدا از
ساختار زبان تصور می شد و زبان ابزاری جهت بیان وقایع محسوب می گشت. در
مقابل این گرایش، رویکرد نوین زبان شناسان قرار دارد که با این مساله بنحوی
کاملآ متفاوت برخورد می کند. بر اساس این رویکرد زبان شناسان معتقدند که نظام
ادراکی و شناختی انسان از ساختار زبانی جدا نیست. به تعبیری، زبان نه تنها در بیان
واقعیات و شناخت پیرامون بی تاثیر نیست بلکه در ساخت واقعیت ها نیز موثر است.
به بیان دیگر، زبان مستقیما دنیای بیرون را منعکس نمی کند بلکه بازگو کننده مفاهیم
انسانی و تعامل وی با جهان است. در این رابطه، مقاله حاضر در پی آن است که
نحوة تکوین دیدگاههای مهم زبانشناسی را درباره نقش زبان برای شناخت یا نمود
واقعیت ها مورد بررسی و تحلیل قرار دهد
واژه های کلیدی: زبان، شناخت، نمود واقعیت، مقوله های زبانی، طبقه بندی
تجارب، انتخاب گری (عموزاده مهدیرجی، 1383، ص 1)
7.3.2 Microstructural Method

As stated before, translation research can be generally conducted either
in a literary or in a scientific manner. I believe that macrostructural methods are
literary because concepts such as text and message are too broad and
intrinsically subjective to be captured by operationalizable definitions.
[Interested readers are referred to Unit One of Reading Media Texts:
Iran-America Relations (Khodadady, 1999b) for a discussion of text.]
A microstructural method, however, approaches translation research

scientifically and addresses words comprising texts or “microschemata”
(Khodadady 1997, 1999a) as the basic units of translation. The
technique used in the microstructural method is semantic trait analysis,

which allows a translator to determine what word-meaning participates

in the meaning of another word (Cruse, 1986).
Semantic trait analysis is similar to componential analysis employed in

contrastive analysis. However, while componential analysis aims at
semantic features, which are controversial concepts in linguistics, the
function of semantic trait analysis is to find out which TL schema, among
all semantically, syntactically and discoursally related words, is the best
equivalent for an SL schema. This analysis depends mainly on the context
in which the SL word occurs and is best achieved by activating the
translator’s feelings, attitudes and experiences with the selected
equivalent schema.
For example, the schema arzyabi (ارزیابی) is used in an authentic Persian
text whose first sentence reads: dar arzyabi har padideh aan raa baa
negaahi bebineem keh dar asreh peidaayeshash dideh mee shodeh ast ...
(Shariati, 1362-1981, p. 37). The schema arzyabi is semantically related
to the schema evaluation as are the schemata assessment, estimating,
gauging, appraisal, value, prize, measurement, meter, calculation,
rating, grading, size, stepping, and weighing. The most
appropriate TL equivalent for arzyabi is assessment since it requires a
criterion for assessing a given phenomenon, i.e., the age in which it
appears (asreh peidayeshash).
Whether a research method is experimental, quasi-experimental,

historical, survey or translational, it should be conducted scientifically.
Translation research will be methodologically scientific, if it resorts to
observation in order to meet three standards: validity, reliability and
practicality.
7.4 Validity in Translation Research

Validity denotes acceptability of a given object or concept according to
certain criteria. The criterion is established as soon as the translators
adopt a theory of translation for themselves. Larson (1984), for example,
approached translation as “change of form” and equated the form with

“the actual words, phrases, clauses, sentences, paragraphs, etc.” (p. 3).
Naturally such an approach is too difficult, if not impossible, to follow
because the term “etc” shows that it can include larger units such as
chapters and an entire book. Lagzian (2013) reviewed the literature on
translation and announced that there is virtually no rationale other than
schema theory to explain the process and provide the translators with an
objective measure to evaluate source and target texts.
7.4.1 Internal Validity

The internal validity of translation depends on the TL equivalents a
translator selects for the SL schemata constituting a given text. If the
selection of the TL equivalents is influenced by any variables other than
the semantic traits of the SL schemata, the rendered text will lack
internal validity.
The internal validity of Yusufali’s (1989) translation of verses 32 and 33
of Surah 38 (ص) suffers from his personal desire to save the Prophet
Solomon’s face; he thus unintentionally imposes his own interpretation of
the two verses on the Quran.
And he said, "Truly do I love the love of good, with a view to the glory of
my Lord,"- until (the sun) was hidden in the veil (of night):
"Bring them back to me." then began he to pass his hand over (their) legs
and their necks.

فَقَالَ إِنِّي أَحْبَبْتُ حُبَّ الْخَيْرِ عَن ذِكْرِ رَبِّي حَتَّى تَوَارَتْ بِالْحِجَابِ {32}
رُدُّوهَا عَلَيَّ فَطَفِقَ مَسْحًا بِالسُّوقِ وَالْأَعْنَاقِ {33}
The SL schemata and their TL equivalents form the most important

variables of translation research from a microstructural perspective. In
contrast to macrostructural approaches whose units of translation are
subjective in nature, e.g., message, microstructural translation employs
each and all the schemata comprising the source text (ST). In contrast to
subjective units such as message, each schema can be observed and
controlled both psychometrically and functionally.

The Arabic schema أَحْبَبْتُ in Surah 38, verse 32, for example, consists of
three morphs which have to be translated via three distinct schemata in
English, i.e., the free syntactic sensor I, the bound syntactic finite past
and the free semantic process prefer. The identification and
classification of these three schemata will now help us evaluate the
internal validity of the translations.
Table 7.1 presents the eight translations of the Arabic source schema
أَحْبَبْتُ. As can be seen, only Shakir’s translation might be considered
schematic because he has neither inserted any parasyntactic schemata
such as do and did, as Yusufali (1989) and Khan (1970) have done, nor
chosen the most frequent sense of the semantic schema حب, i.e. love.
Sarwar’s (1981) seems to be the least valid in the sense that he has
changed the verb schema prefer to the noun love and thus depicted the
prophet as a person controlled by his emotions.
Table 7.1
Validity analysis of eight renderings of the source schema أَحْبَبْتُ

English Equivalent    Translator              Year        Method
do I love             Yusufali                1989        Free
I did love            Khan                    1970        Free
I have loved          Arberry                 1964        Literal
I have preferred      Pickthall               1930        Semantic
I love                Mohammad Ali; Sherali   1917; 1955  Literal
I preferred           Shakir                  1983        Schematic
My love               Sarwar                  1981        Free
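The comparison summarized in Table 7.1 can be approximated with a short Python sketch. It rests on a simplifying assumption that is mine, not the author's: the expected schematic rendering is reduced to the two English tokens I and preferred (the sensor I plus the finite past of the process prefer), and any other token counts as either a missing or an inserted schema. The names EXPECTED, compare and is_schematic are hypothetical helpers introduced for illustration only.

```python
from typing import List, Tuple

# Renderings of the source schema as listed in Table 7.1.
RENDERINGS: List[Tuple[str, str, int]] = [
    ("do I love", "Yusufali", 1989),
    ("I did love", "Khan", 1970),
    ("I have loved", "Arberry", 1964),
    ("I have preferred", "Pickthall", 1930),
    ("I love", "Mohammad Ali", 1917),
    ("I love", "Sherali", 1955),
    ("I preferred", "Shakir", 1983),
    ("My love", "Sarwar", 1981),
]

# Assumption: the schematic target is the sensor "I" plus the finite past
# of the process "prefer", written here as the surface form "preferred".
EXPECTED: Tuple[str, ...] = ("I", "preferred")


def compare(rendering: str) -> Tuple[List[str], List[str]]:
    """Return (missing, extra) tokens relative to the expected schemata."""
    tokens = rendering.split()
    missing = [t for t in EXPECTED if t not in tokens]
    extra = [t for t in tokens if t not in EXPECTED]
    return missing, extra


def is_schematic(rendering: str) -> bool:
    """A rendering counts as schematic if nothing is missing or inserted."""
    missing, extra = compare(rendering)
    return not missing and not extra


for rendering, translator, year in RENDERINGS:
    missing, extra = compare(rendering)
    print(f"{translator} ({year}): missing={missing}, extra={extra}")

schematic = [t for r, t, _ in RENDERINGS if is_schematic(r)]
print("Schematic:", schematic)
```

Token matching of this kind is far cruder than semantic trait analysis, but it makes visible the two observations made in the text: inserted schemata such as do and did show up as extra tokens, and love appears in place of the expected prefer.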
SL schemata and their TL equivalents become psychometrically
categorical when they are assigned to three classes: syntactic, semantic
and parasyntactic. This categorization provides translation researchers
with an observational tool not only to study the SL and TL texts
but also to evaluate the schemata employed by two or
more translators of the same SL text.
Each syntactic, parasyntactic and semantic target equivalent schema

might be considered as an ordinal variable whose ranks are determined
by the number of semantic traits it shares with the source schema. These
ranks can then be converted to numerical values if the translator decides
to treat the TL equivalents as interval variables.
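The ranking idea described above can be sketched in Python as follows. The trait labels and the trait sets assigned to each candidate are invented purely for illustration; they are assumptions, not the outcome of an actual semantic trait analysis. The sketch counts the traits each TL candidate shares with the source schema, assigns ordinal ranks from those counts, and keeps the counts themselves as the numerical values one might use when treating the equivalents as interval variables.

```python
from typing import Dict, List, Set, Tuple


def rank_equivalents(
    source_traits: Set[str], candidates: Dict[str, Set[str]]
) -> List[Tuple[str, int, int]]:
    """Return (candidate, shared_trait_count, rank), best candidate first.

    Candidates sharing the same number of traits receive the same rank.
    """
    shared = {word: len(source_traits & traits) for word, traits in candidates.items()}
    distinct_counts = sorted(set(shared.values()), reverse=True)
    rank_of = {count: position + 1 for position, count in enumerate(distinct_counts)}
    ordered = sorted(shared, key=lambda word: (-shared[word], word))
    return [(word, shared[word], rank_of[shared[word]]) for word in ordered]


# Hypothetical traits for the source schema arzyabi and four TL candidates.
SOURCE_TRAITS = {"judging", "worth", "criterion"}
CANDIDATES = {
    "assessment": {"judging", "worth", "criterion"},
    "evaluation": {"judging", "worth"},
    "measurement": {"quantity", "criterion"},
    "calculation": {"quantity"},
}

for word, count, rank in rank_equivalents(SOURCE_TRAITS, CANDIDATES):
    print(f"rank {rank}: {word} (shares {count} traits)")
```

With these invented trait sets, assessment comes first, which mirrors the choice argued for in 7.3.2; with different trait sets the ranking would of course change.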
In terms of function, five types of variables have been identified in the
literature: independent, dependent, moderator, control and intervening.
The words comprising the SL texts are all independent variables and the
TL equivalents form the dependent variables of translation research.
The translator of the SL texts can assume the role of either a control
or a moderator variable. If the text is translated by only one translator,
the translator will become a control variable. If two translators render
the same SL text, one of them will become an independent variable and
the other a moderator.
As human beings, translators differ from each other with respect to
many factors such as intelligence, educational background, attitudes,
personalities and dispositions. These factors will inevitably influence the
quality of their research. The educational background of a researcher, for
example, may be a crucial variable in translation and there may be a
positive relationship between the quality of translation research and the
educational background of the translator. However, even if two
translators have the same educational background in terms of their
academic degrees, they may differ in many factors including the ones
mentioned above. These factors form the intervening variables of
translation research.
7.4.2 External Validity

Hatch and Farhady (1982) defined external validity as “the extent that
the outcome of any research study would apply to other situations in the
real world” (p. 8). This definition implies that external validity has
relevance only to applied research. Since translation is basic in nature,
it does not seem to have external validity.
7.5 Reliability
Reliability literally means the quality of being dependable. In
psychometrics the term reliability is employed to denote consistency.
The reliability of a research project can be approached both internally
and externally. A given translation will be internally reliable if the
same translator translates given parts of a source text independently on
different dates and produces consistent renderings. I will call this
translate-retranslate reliability. The translations can then be scored by
an experienced translator, and the correlation coefficient obtained
between the two sets of scores can be used as an index of reliability. As
Nunnally (1978) emphasized, research projects are reliable to “the extent
that they are repeatable” (p. 22).
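A minimal sketch of how a translate-retranslate reliability index could be computed is given below. It assumes that the same five passages were translated on two dates and that each translation was scored by one rater; the scores themselves are invented for illustration. The Pearson product-moment coefficient is used here as one common choice of correlation coefficient, since the text does not specify which coefficient should be applied.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores given to five passages translated on two dates.
first_translation_scores = [14, 16, 12, 18, 15]
second_translation_scores = [13, 17, 12, 19, 14]

r = pearson_r(first_translation_scores, second_translation_scores)
print(f"translate-retranslate reliability: r = {r:.2f}")
```

The closer the coefficient is to 1, the more consistently the translator has rendered the same passages on the two occasions.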
While some instruments employed in applied linguistics are subjective
by nature, translation projects can be as objective as possible. In contrast
to open-ended questions and essays, which “leave a good deal to the
judgment of the scorer” (Anastasi, 1968, p. 86), all translations are based
on source texts, which can be analyzed or even translated independently
by others in order to assess the reliability of a given translator.
For example, an appropriate sample of translated pages of a given text
can be selected, scored and correlated with each other to explore their
degree of go-togetherness, i.e., their correlation coefficient. The higher
the obtained correlation coefficients, the more reliable the translation
would be. This procedure is similar to designing parallel tests in
applied linguistics to determine their reliability.
Studying a sample of a given translation to determine translator
reliability is practiced in most Iranian universities, though in a
subjective way. When a faculty member announces his or her readiness to
translate a certain text, he or she is required to submit a sample
translation of about 10 pages. The sample is then examined by two or
three independent faculty members who are usually experts on the subject
of the text. The agreement of these experts determines whether the
translation should be conducted or not.
7.6 Feasibility
Research projects other than translation require relatively longer periods
of time. The two academic semesters in Iranian universities start in
Mehr and Bahman and end in Day and Khordad. Each semester lasts for
almost four and a half months of which the first and last months are
usually spent in registration and examination. This leaves only two and a
half months for both teaching and conducting research projects, which is
barely enough to conduct even field studies.
The problems of time aside, students face great difficulty in gaining
access to teachers and classrooms at guidance and high schools, where
English is taught. As Hatch and Lazaraton (1991) pointed out, “in some
school districts (and for good reasons), there is a monumental amount of
red tape involved with school-based research” (p. 19). The problem of
red tape applies equally to other ministries and bureaus. Recently I
supervised a research project dealing with divorce and thus introduced
my students to a given bureau. The authority in charge responded
officially and asked us to obtain permission from somewhere else,
implying a polite refusal.
Even if the problems of time and access are solved in one way or
another, the cost of research remains a major obstacle, especially for
undergraduate students. There is no budget available to cover the
expenses incurred during research such as travelling, typing and copying
tests and questionnaires, and buying necessary equipment.
In contrast to other types of research, translation is the most feasible in
Iran at present. It can be assigned at the very beginning of each
semester, it does not involve red tape and it does not require a large
budget. All it requires is access to a computer to type the TL text
and a library to consult reference and other required books.
7.7 Summary
Translation is an active, cognitive and creative process requiring the
application of scientific method as other types of research do. Scientific
method is systematic in that it follows an “established principle” (Hatch
& Farhady, 1982, p.4). The three principles followed in research
projects are validity, reliability and feasibility. Translation research
projects are valid if the variables involved in conducting them are
identified, controlled and rendered properly. They are also reliable if
independent translators score a sample translation and provide the
researcher with an acceptable reliability coefficient. And they are the
most feasible in Iran at present in that the time, access and cost they
require are fairly manageable.
While the very process of translation is personal and calls for the
qualification and attention of translators, the translated texts can be
compared with their original in order to evaluate the validity of their
content. The very act of contrasting two or more types of texts to find
their similarities and dissimilarities calls for establishing a new
type of research project. Although there is nothing new about this type
of comparative study, the application of schema theory to its design
renders comparative studies more objective in procedure and sounder in
their construct. Chapter 8 addresses this issue and offers text-based
research projects as a new contribution to human investigation.
8 Schema-Based Translation Research: A Quantitative Research Method
8.1 Introduction
Chapter 7 provided the background necessary to approach translation as a
research method which can claim validity, reliability and feasibility if it
is done by employing a scientific method. None of the eight
macrostructural methods employed traditionally in translation research
projects, i.e., adaptation, communicative translation, faithful translation,
free translation, idiomatic translation, literal translation, semantic
translation, and word-for-word translation can claim objectivity based
on scientific observation. This is because they are by their very nature
qualitative. In other words, there is no procedure available to
operationalize these methods so that their functioning can be observed
and quantified.
Adaptation, for example, is a type of translation research which depends
heavily on the translators' cultural, educational, political and social
background, to name a few factors. In Section 7.3.1.5 of Chapter 7 we
noticed how the translation of the same Persian poem by two different
poets yields two different messages. This means that macrostructural
translation projects have questionable internal validity in the sense that
translating the same text by two different translators does not result in
expressing almost the same content.
The objectivity of research findings brings up the debate over
qualitative versus quantitative research. According to Walle (1997),
while both methods are useful and legitimate by themselves, scientific
(or quantitative) methods have dominated all fields of knowledge since
World War II. “As a result, the main role of qualitative research has
typically been reduced to helping create and pose hypotheses which can
then be tested and refined using scientific and/or statistical research
methods and models” (p. 524). In order to appreciate quantitative
translation research projects, we need to discuss qualitative and
quantitative research projects and detail their differences.
8.2 Qualitative versus Quantitative Research

Walle (1997) believed that research methods in the social sciences were
first dichotomized by Pike (1954), who employed two linguistic terms to
differentiate them: phonetic and phonemic. Phonetics is the branch of
linguistics in which researchers record sounds produced by speakers,
define them operationally and then employ instruments such as
audiometers [25] to measure variables such as pitch and perception of
sounds.
Phonemics, on the other hand, does not address the measurable features
of sounds, but addresses conceptual categories which exist within the
minds of sound producers and receivers. The verification of these
categories seemed to be empirically impossible. For example, patients
who suffer from head injuries may produce unusual speech patterns
which do not fit the statistical norms of typical pronunciation.
Phonemically, however, other people can still understand what is
being said because the underlying structure of the language exists in the
minds of both the speaker and the listener. The existence of these
structures, however, is not as observable as normally produced
utterances are.
Pike (1954) generalized phonetics and phonemics into etics and emics,
which stand for quantitative and qualitative research projects,
respectively. He admitted that emic research projects led to unverifiable
conclusions. However, Pike argued that they help researchers not only
understand the culture or language in holistic ways, but also explain the
life, attitudes, motives, interests, responses, conflict, and personality of
specific actors. Pike further argued that etic or scientific research, in
contrast, hinders the ability to deal with these basic considerations
because such phenomena cannot be empirically validated.

[25] An audiometer is an instrument that produces pure tones of various
fixed pitches or frequencies. Patients put on headphones to listen to
these pitches in a soundproof booth to eliminate external noise. They are
provided with a press switch to indicate whether they heard a given tone
or not. Hearing is tested one ear at a time.
Pike’s (1954) support of emic or qualitative research gained wide
popularity among anthropologists, who employed his argument as an
intellectual justification of their methods. He was, therefore, widely
lauded as a convincing and convenient defender of humanistic, artistic
research. Ten years later, however, Harris (1964) convincingly
debunked [26] the emic method. He argued that instead of observing the
behaviour of others, emic researchers deduce what goes on in their minds.
He thus rejected emic research as a mere deductive exercise.
Qualitative or subjective research projects differ from quantitative or
objective research projects in three areas: researcher and participants,
methodological considerations and theory. These areas will be
discussed, albeit briefly.
8.2.1 Researcher and Participants

In quantitative research projects researchers are separated from the
participants. They try to stay away from participants as far and as much
as possible. The objective researchers study the behaviour of their
participants in order to understand objective reality.
Qualitative or subjective research projects, however, put researchers into
the context of a situation to understand it themselves. In fact, they try to
understand the situation through the eyes of the participants, regarding
themselves as the representatives of participants (Chatman, 1984; Fidel,
1993; Mellon, 1990; Westbrook, 1994).

[26] Debunk /diːˈbʌŋk/ v. to show that something is wrong or false; expose;
deflate; demystify; discredit; lay bare; set straight; show up; throw
light on
Some scholars also include the audience as a third party in the
researcher-participant relationship (e.g., Sutton 1993). This is in fact
what happens in macrostructural translation research projects such as
adaptation. Adaptive translators, for example, put themselves in the
shoes of both the readers of the original text, i.e., participants, and
the readers of their own translations, i.e., audience, and try to
understand and reconstruct the message expressed in a source text as
these two groups of readers would have done.
8.2.2 Methodological Considerations

Three methodological considerations should be taken into account in
qualitative research projects. First, subjective (particularly ethnographic)
research attempts to develop a relationship with the participants by
gaining entry, rapport [27], empathy [28] and reciprocity (Chatman 1984).
However, the data thus gathered can be analyzed either in a quantitative
or in a qualitative manner. Content analysis of field notes or transcribed
interviews is a way of quantifying texts.
Secondly, in qualitative research no attempt is made to control the
research environment. It is left totally to the participants in order to
secure their natural behaviour (Bradley 1993; Fidel 1993; Sutton 1993;
Mellon 1990). As in qualitative research, the environment plays a
significant role in quantitative research when data are collected.
However, it is always controlled so that it will not bring about
significant differences in the participants’ behaviour without the
researcher’s knowledge.
And finally, qualitative research uses multiple methods to measure the
same qualities, one verifying the other (Fidel, 1993). Quantitative
research, however, employs multiple methods to measure different
qualities as appropriate. In other words, qualitative research may occupy
itself with one single independent variable whereas quantitative research
explores the relationship of at least two.

[27] Rapport /ræˈpɔː/ n. an emotional bond or friendly relationship between
people based on mutual liking, trust, and a sense that they understand and
share each other's concerns
[28] Empathy /ˈempəθi/ n. the power of mentally identifying oneself with
(and so fully comprehending) a person or object of contemplation: Pity is
feeling sorry for someone; empathy is feeling sorry with someone.
8.2.3 Theory and Hypothesis Formation

Qualitative and quantitative research projects differ from each other in
terms of theory and hypothesis formation. In the former, theory may be
generated by the evidence during the study. The latter, however,
formulate hypotheses or theses prior to the study. The differences
mentioned in 8.2.1, 8.2.2 and 8.2.3 are enumerated in Table 8.1 below
in order to provide a detailed description of quantitative and
qualitative research projects.
Table 8.1
Differences between quantitative and qualitative research projects
No  Quantitative Research Projects          Qualitative Research Projects
1   Are experimental                        Are field based and study events as
                                            they happen in real life
2   Conceive variables a priori, i.e.,      Identify variables a posteriori, i.e.,
    before projects start                   after data are collected
3   Create variables that are amenable to   Seek discourse and long descriptive
    statistical analysis                    texts produced by participants
4   Employ validated instruments which      Seek techniques to explore a
    can be scored objectively               phenomenon from “under the skin” of
                                            another
5   Exert laboratory-like control to        Make researchers invisible so that they
    determine the effect of specific        would have no effect on the phenomenon
    variables                               under study
6   Explore the variables involved in       Investigate how the world is
    experiencing the world                  experienced by individuals
7   Follow logical positivism [29]          Adopt constructivism [30]
8   Manipulate the environment and          Develop trust in the population being
    measure changes                         studied to approach the environment as
                                            it is
9   Raise research questions and            Form theories after the data are
    formulate hypotheses before projects    collected
    start
10  Seek facts and causes apart from        Study a phenomenon from within an
    individual states to secure external    individual’s state of being regardless of
    validity                                its applicability to similar situations
11  View interventions as a way of          Study phenomena without any
    discovery                               intervention
8.3 Schema-Based Translation Research

Khodadady (2001) argued that since translation is a linguistically and
cognitively productive process, it requires a sound theory not only to
describe but also to explain how it takes place. He stated that schema
theory offers such a rationale through two approaches: macrostructure
and microstructure. Macrostructuralists define translation as a process of
meaningful rendering of units larger than sentences whereas
microstructuralists approach it as a process of supplying the best
equivalents for the author’s schemata on the basis of translators’
experiences with the schemata employed in composing the source text
(ST) and supplying their best equivalents on the basis of their textual or
discoursal context.

[29] Logical positivism is a school in philosophy which holds the idea that
personal experiences are the basis of true knowledge if they are verified
scientifically (the Austrian Ludwig Wittgenstein and the British Bertrand
Russell and G. E. Moore belonged to this school).
[30] Constructivism has its origin in developmental psychology founded by
Jean Piaget who argued that children play an active role in their own
learning. When they encounter new experiences, they apply their background
knowledge to modify it and construct the new one (see Ross 2006).
According to Khodadady (2001), translating a text from a source
language to a target language for the first time will be a quantitative
research project provided that the translator employs a microstructural
approach and utilizes semantic trait analysis to choose the best target
equivalent for the source schemata. He defines schemata as the words
comprising the source text and assigns them to three major domains:
semantic, syntactic and parasyntactic. (They will be addressed in sections
8.3.1, 8.3.2 and 8.3.3, respectively.)
Texts as macrostructural products of language consist of chapters, which
in turn consist of paragraphs and sentences. Each sentence employed
within a paragraph comprises phrases. Each phrase in turn is formed by
one or more schemata which perform a given function within a sentence.
As the smallest units of texts, schemata can be broken into morphs in
order to specify their meaning-based syntactic and semantic roles within
sentences. The schemata comprising texts, therefore, have a hierarchical
relationship with each other in terms of their overall structure as shown
in Figure 8.1.
The schemata employed in the composition of authentic texts have a
hierarchical relationship not only with the schemata used throughout the
texts themselves but also with other semantically related schemata
internalized and stored in the authors’ and translators’ minds. These
relationships should be taken into account when translators embark on a
quantitative translation project. The hierarchical relationship is not,
therefore, limited to the texts themselves. It extends to the schemata
stored as background knowledge in their own minds as well.
Figure 8.1
Hierarchical relationship of schemata comprising the Quran as a written text
Surah schemata 1 + … + Surah schemata 114 → The Quranic schemata
↑
Section schemata 1 + Section schemata 2 + Section schemata 3 + … → Surah schemata
↑
Paragraph schemata 1 + Paragraph schemata 2 + Paragraph schemata 3 + … → Section schemata
↑
Sentential schemata 1 + Sentential schemata 2 + Sentential schemata 3 + … → Paragraph schemata
↑
Clausal schemata 1 + Clausal schemata 2 + Clausal schemata 3 + … → Sentential schemata
↑
The + all-merciful + has (Syn.) + taught (Sem.) + the (Syn.) + Quran (parasyntactic) + … → Clausal schemata
↑
The (Syn.) + all (Syn.) - merci (semantic [Sem.]) + ful (bound Syn. morph) + … → Phrasal schemata
↑
The (free syntactic morph [Syn.]) + … → Single schema
According to Faber (1994), all English semantic schemata can be
subsumed under ten domains (Faber & Uson, 1998), i.e., perception,
speech, movement, change, existence, possession, position, cognition,
action and feeling. Each of these domains has a hierarchical relationship
with its constituent semantic schemata. The familiarity of translators as
researchers with these relationships will determine the quality of their
translations.
Khodadady (2001), for example, chose a contemporary Persian text
(Shariati, 1981) and asked his 22 undergraduate students to translate it
into English. The first sentence of the text (translated by the researcher)
reads, “In assessing any phenomenon, we should view it as it was
viewed at the age of its appearance.” The English microschema assess is
the best equivalent of the Persian schema ARZYABEE. It is
hierarchically related to the macroschema of evaluation, which in turn
falls into the semantic domain of cognition. The students would choose
the best equivalent if they knew that
1. to determine the value, significance, worth, size, amount, extent, or
nature of something
1.1 thoroughly, evaluate
1.2 roughly, estimate
1.3 exactly, gauge
1.4 expertly, appraise
1.5 highly in mind, value or prize
1.6 using a criterion, measure
1.7 using a meter, meter
1.8 mathematically, calculate
1.9 monetarily, calculate
1.10 on a rank, rate or grade
1.11 on scales, weigh
1.12 according to bulk, size
1.13 by steps, step had to be used.
2. to determine the rate of tax, assess had to be used.
3. to consider opposite factors to reach a choice or conclusion, weigh,
evaluate or assess had to be used.
4. to standardize, calibrate had to be used.
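The hierarchy listed above can be stored as a simple lookup table so that the manner of determining value points to its schema. The following sketch is only an illustration: the dictionary name and the function are hypothetical, and only part of the list is encoded, using the entries exactly as they are given above.

```python
# Part of the hierarchy above: manner of "determining the value ... of
# something" mapped to the schema(ta) listed for it in the text.
DETERMINE_VALUE_HIERARCHY = {
    "thoroughly": ["evaluate"],
    "roughly": ["estimate"],
    "exactly": ["gauge"],
    "expertly": ["appraise"],
    "highly in mind": ["value", "prize"],
    "using a criterion": ["measure"],
    "on a rank": ["rate", "grade"],
}


def equivalents_for(manner):
    """Return the schemata listed for a manner, or an empty list if none."""
    return DETERMINE_VALUE_HIERARCHY.get(manner.lower().strip(), [])


print(equivalents_for("roughly"))
print(equivalents_for("highly in mind"))
```

A translator who has internalized such a hierarchy can check whether a candidate equivalent carries the manner expressed in the source text before adopting it.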
Based on the relationship between the schemata comprising the Persian
text and their activation of related schemata in the minds of translators
as the readers of source texts, Khodadady (2001) hypothesized that his
22 undergraduate students would choose different schemata related to
assessment because their English background knowledge differed from
each other. If his hypothesis were true, then the frequencies of the
equivalents given to ARZYABEE would differ significantly from each other.
Table 8.2 presents the chi-square analysis of English equivalents written
for the Persian schema ARZYABEE, i.e., assessment/assessing,
estimating, evaluation/evaluating, and studying. (We will study the
chi-square test in another chapter.) As can be seen, the frequencies of
these equivalents are significantly different from each other (χ² = 11.8,
df = 3, p = 0.008). This result confirmed Khodadady’s (2001, p. 116)
hypothesis that there is a significant difference in the number of
equivalents provided for semantic schemata.
Table 8.2
The chi-square test of English equivalents provided for the Persian
semantic schema ARZYABEE
Equivalent Schemata      Observed    Expected    Residual   Test
                         frequency   frequency
Assessment/assessing     4           5.5         -1.5       χ² = 11.8
Estimating               5           5.5         -0.5       df = 3
Evaluation/evaluating    12          5.5         6.5        p = 0.008
Studying                 1           5.5         -4.5
Total                    22
Source: Khodadady (2001, p. 116)
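The figures reported in Table 8.2 can be reproduced with a short calculation. The sketch below assumes a goodness-of-fit chi-square test in which the 22 responses are expected to be spread equally over the four equivalents (22 / 4 = 5.5), which is what the expected frequencies in the table suggest. The function names are hypothetical, and the p value uses the closed-form tail probability that holds only for three degrees of freedom.

```python
from math import erfc, exp, pi, sqrt

# Observed frequencies of the four equivalents (Table 8.2).
observed = {
    "assessment/assessing": 4,
    "estimating": 5,
    "evaluation/evaluating": 12,
    "studying": 1,
}


def chi_square_gof(frequencies):
    """Chi-square goodness of fit against equal expected frequencies."""
    total = sum(frequencies)
    expected = total / len(frequencies)
    residuals = [f - expected for f in frequencies]  # observed - expected
    chi2 = sum(r * r / expected for r in residuals)
    return chi2, len(frequencies) - 1, expected, residuals


def p_value_df3(chi2):
    """Upper-tail probability of chi-square; valid only for df = 3."""
    return erfc(sqrt(chi2 / 2)) + sqrt(2 * chi2 / pi) * exp(-chi2 / 2)


chi2, df, expected, residuals = chi_square_gof(list(observed.values()))
print(f"chi-square = {chi2:.1f}, df = {df}, p = {p_value_df3(chi2):.3f}")
print("residuals:", residuals)
```

Each residual is the observed frequency minus the expected frequency of 5.5, and the statistic is the sum of the squared residuals divided by 5.5.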
One of the most interesting results obtained by Khodadady (2001) was
that novice translators differed not only in their background knowledge
of semantic schemata but also in their background knowledge of
syntactic schemata. More recent studies also show that authentic
English texts differ significantly from each other in the number of
parasyntactic schemata the authors employ in the composition of
domain-specific texts (e.g., Khodadady, 2008). These research projects
provide the background necessary to pursue schema-based translation as
a quantitative research project and thus call for an operationalized
discussion of schemata and their domains, genera, species, types and
tokens.
8.4 Schemata: Objective Units of Translation

Khodadady (2008) defined a schema as a single word used along with
other words to form an authentic text uttered or written for being heard
or read under given conditions at a specific place and time. He argued
that the acceptance and adoption of the schema as the building block of
authentic textual products provides linguists and language teachers alike
with an objective measure on which they can base their analyses and
teaching, respectively. The same argument can be extended to
translation. Any analysis of translated texts would be objective if the
schemata constituting the source texts are identified and their equivalents
are employed in composing the target texts. The schemata and their
equivalents in both source and target texts can then be analyzed
statistically.
The statistical analysis of source and target texts can be achieved if their
constituting schemata are classified hierarchically to reflect the structure
of the human mind. There are three broad domains to which all schemata
can be assigned: semantic, syntactic and parasyntactic. These domains
are further broken down into genera, species, types and tokens as
discussed below.
For example, the semantic domain schemata consist of four genera, i.e.,
adjectives, adverbs, nouns and verbs. The adjective genus of semantic
domain in turn comprises the species of agentive adjective, agentive
complex adjective, comparative adjective, complex adjective, dative
adjective, complex dative adjective, derivational adjective, derivational
complex adjective, nominal adjective, simple adjective, and superlative
adjective. Similarly, the agentive adjective species consists of a
potentially indefinite number of types such as interesting, fascinating and
intriguing. If the adjective schema type interesting is used once in a
given text, it will have a token of 1. This hierarchical classification of
schemata will be elaborated and employed in the remaining sections and
chapters in order to make the arguments coherent.
Furthermore, throughout section 8.4, two translations of Surah 1 of the
Quran will be compared with each other in order to provide an objective
example of schema-based translation research. As we noticed in chapter
seven, Arberry (1964) translated the Quran into English because he
believed none of the translations available in his time could reflect the
rhyming of the original verses.
Instead of focusing on rhyming in Quranic verses, Irving (1985) stated
that his main goal in translating the Quran into English was to convey
its message “in reverent yet contemporary English.” As can be seen
from the dates of their publication, both translations are contemporary in
that they were both produced in the 20th century. An interval of 21
years does not seem to affect any language in terms of its semantic,
syntactic and parasyntactic schemata.
The translated texts of Arberry (1964) and Irving (1985) are given
below. Based on the stated goals of both translators, it is hypothesized
that if rhyming and being contemporary are reflected in the semantic,
syntactic and parasyntactic schemata of the translated texts, they must
differ from each other significantly in terms of schema domains, types
and tokens.
Arberry (1964)
1:1 In the Name of God, the Merciful, the Compassionate
1:2 Praise belongs to God, the Lord of all Being,
1:3 the All-merciful, the All-compassionate,
1:4 the Master of the Day of Doom.
1:5 Thee only we serve; to Thee alone we pray for succour.
1:6 Guide us in the straight path,
1:7 the path of those whom Thou hast blessed, not of those against whom
    Thou art wrathful, nor of those who are astray.

Irving (1985)
1:1 In the name of God, the Mercy-giving, the Merciful!
1:2 Praise be to God, Lord of the Universe,
1:3 the Mercygiving, the Merciful!
1:4 Ruler on the Day for Repayment!
1:5 You do we worship and You do we call on for help.
1:6 Guide us along the Straight Road,
1:7 the road of those whom You have favored, with whom You are not angry,
    nor who are lost!
8.4.1 Semantic Schemata

The semantic schema domain of all texts consists of four genera:
adjectives, adverbs, nouns and verbs. They are traditionally referred to
as “open-class items” (Quirk, Greenbaum, Leech & Svartvik, 1985, p.
73) representing the translators’ personal understanding of the authentic
texts. For example, while the Arabic noun schema RAHIM means
compassionate and all-compassionate to Arberry (1964), it invokes
Irving’s (1985) experiences with, feelings related to and attitudes of
being merciful in personal life.
Semantic schema genera are open in nature because new adjectives,
adverbs, nouns and verbs are employed by speakers and writers when
they express new attributes, complements, ideas, and actions related to
the topics of their speech and writing. For this very reason, the semantic
schema domain comprising source texts consists of many types of
semantic genera but few tokens.
For example, Arberry (1964) and Irving (1985) used 42 and 41 schemata
to translate the whole Surah 1 of the Quran, respectively. Table 8.3
presents the semantic schema types used in their translations. As can be
seen, out of 42 schemata employed by Arberry, 23 (54.8%) belong to the
semantic domain, implying that the Surah is semantically more loaded.
Furthermore, none of the semantic schemata of the surah has a
frequency or token of more than 2, indicating that it calls for its readers’
complete attention in terms of what it purports to convey.
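Counting schema tokens of this kind is easy to automate. The sketch below is a simplified illustration rather than the procedure used to build Table 8.3: it treats every word in Irving's rendering of verses 1:1 to 1:3 as one schema, lowercases it, and removes hyphens so that Mercy-giving and Mercygiving are counted as the same type. The hyphen rule and the function name are assumptions made for this example, and the sketch does not separate semantic from syntactic schemata.

```python
import re
from collections import Counter

# Irving's (1985) rendering of verses 1:1 to 1:3, as quoted in this section.
IRVING_1_1_TO_1_3 = (
    "In the name of God, the Mercy-giving, the Merciful! "
    "Praise be to God, Lord of the Universe, "
    "the Mercygiving, the Merciful!"
)


def schema_tokens(text):
    """Count how often each word (schema type) occurs in a text."""
    normalized = text.lower().replace("-", "")
    return Counter(re.findall(r"[a-z]+", normalized))


counts = schema_tokens(IRVING_1_1_TO_1_3)
for schema_type in ("god", "merciful", "mercygiving", "universe"):
    print(schema_type, counts[schema_type])
print("types:", len(counts), "tokens:", sum(counts.values()))
```

In this excerpt the schemata god, merciful and mercygiving each have a token of 2, while a schema such as universe has a token of 1.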
Table 8.3
Semantic schema types and tokens comprising the first surah of two
No  Arabic      Arberry (1964)      Irving (1985)  Schema     Token (f)  Token (f)
    Schema                                         Type       Arberry    Irving
1   رَّحِيمِ       All-compassionate   Merciful       Adjective  1          2
                Compassionate                                 1
2   رَّحْمـنِ      All-merciful        Mercygiving    Adjective  1          2
                Merciful                                      1
3   ضَّالِّينَ      Astray              Lost           Adjective  1          1
4   مُستَقِيمَ     Straight            Straight       Adjective  1          1
5   مَغضُوبِ      Wrathful            Angry          Adjective  1          1
6   عَالَمِينَ     Being               Universe       Noun       1          1
7   يَوْمِ        Day                 Day            Noun       1          1
8   دِّينِ        Doom                Repayment      Noun       1          1
9   اللهِ        God                 God            Noun       2          2
10  رَبِّ         Lord                Lord           Noun       1          1
11  مَالِكِ       Master              Ruler          Noun       1          1
12  اسم         Name                Name           Noun       1          1
13  صِّرَاطَ      Path                Road           Noun       2          2
14  حَمْدُ        Praise              Praise         Noun       1          1
15  عْبُدُ        Serve               Worship        Noun       1          1
16  ستَعِين      Succour             Help           Noun       1          1
17              Are/art (stray)     Are            Verb       2          2
18  ل           Belongs             Be to          Verb       1          1
19  أَنعَمتَ      Blessed             Favored        Verb       1          1
20  اهدِ        Guide               Guide          Verb       1          1
21  سئل         Pray                Call on        Verb       1          1
Similar to Arberry’s (1964) translation, the majority of schemata
employed by Irving (1985) are semantic in domain. As can be seen in
Table 8.3, out of 41 schemata employed by Irving, 21 (51.2%) are
semantic by domain. The most frequent semantic schemata employed by
Irving are adjectives, nouns and verbs, i.e., f = 2.
As a verb type of semantic domain, are and be differ from other verb
schemata such as belong. Are and be are simple in their structure
whereas belong is a complex verb in that it is derived from the bound
verb morph be and free adjective morph long. Intuitively, one may argue
that the more complex verb schemata there are in a translated text, the
more difficult it would be in terms of its readability level. In order to
explore whether such an argument holds true, semantic schema types are
further broken into semantic schema tokens.
Table 8.4 presents semantic schema types and tokens. As we can see,
there are four semantic types and 42 semantic tokens in English. The
tabulation and codification of semantic tokens allow us to study all
translated texts objectively. Instead of working with the tokens
themselves, we can use their codes as categorical psychometric variables
and apply statistical tests to explore hypotheses.
Table 8.4
Semantic schema species and their example types
Species SC Type
Agentive Adjective 1110 Interesting, fascinating
Agentive Complex Adjective 1111 Flesh-eating, fine-looking
Comparative Adjective 1120 Better; worse, longer
Complex Adjective 1130 antiwar;
Dative Adjective 1140 Interested; devoted,
Complex Dative Adjective 1141 Research-based
Derivational Adjective 1150 Distinctive; gracious, functional,
merciful, compassionate
Derivational Complex Adjective 1151 Nonfunctional, sociopolitical, All-
merciful;
Nominal Adjective 1160 Iraqi, Swedish
Simple Adjective 1170 Good; straight; due
Superlative Adjective 1180 Best, cleverest
Comparative Adverb 1210 Faster; better; more; less

Derivational Adverb 1220 Quickly, remarkably, across
Simple Adverb 1230 Fast; far; long; well
Superlative Adverb 1240 Fastest; latest
Adjectival Noun 1310 Warmth; ability
Complex Noun 1320 Uprise, greatcoat, background,
greenback, aftereffects, breakthrough
Compound Noun 1330 Notebook; bathrooms
Compound Complex Noun 1331 Slide-and-lantern;
Conversion Noun 1332 "Little and often" is a good
expression; ups; why, how, what
(understanding the how, what and why
of assessment)
Derivational Noun (Simple) 1340 Arrival, student
Gerund Noun 1350 Swimming, Reading,
Gerund Noun (Complex) 1351 Understanding; notemaking
Nominal Noun 1370 Iranian; British,
Simple Noun 1380 Book; heaven;
Complex Verb (Base) 1411 Underlie; undertake
Complex Verb (Third Person) 1412 Underlies; undertakes
Complex Verb (Past participle) 1413 Underlined; undertaken
Complex Verb (Present participle) 1414 Underlying; undertaking
Complex Verb (Simple Past) 1415 Underlined; undertook
Derivational Verb (Base) 1421 Realize; darken; enlarge
Derivational Verb (Third Person) 1422 Realizes; darkens; enlarges
Derivational Verb (Past 1423 Realized; darkened; enlarged
Participle)
Derivational Verb (Present 1424 Realizing; darkening; enlarging
participle)
Derivational Verb (Simple Past) 1425 Realized; darkened; enlarged
Phrasal Verb (Base) 1431 Give up; look down;
Phrasal Verb (Third Person) 1432 Gives up; looks down;

Phrasal Verb (Past Participle) 1433 Given up; looked down;
Phrasal Verb (Present Participle) 1434 Giving up; looking down;
Phrasal Verb (Simple Past) 1435 Gave up; looked down;
Simple Verb (Base) 1441 Go; see; take, do, walk
Simple Verb (Third Person) 1442 Goes; sees; takes, does, walks
Simple Verb (Past Participle) 1443 Gone; seen; taken, done, walked
Simple Verb (Present participle) 1444 Going; seeing, taking, doing, walking
Simple Verb (Simple Past) 1445 Went; saw; took, did, walked
(Slang) Verb 1446 Kick the bucket
As can be seen in Table 8.4, all species codes (SC) consist of four digits,
i.e., 0000. From left to right, the first digit shows a schema’s domain,
the second digit represents its genus and the last two digits stand for
its species. The code 1110, for example, shows that the schema it
represents is an agentive adjective in the semantic domain. Similarly,
the code 1411 is a complex verb such as underlie, which consists of the
prepositional adverb under and the simple verb lie.
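The reading of a species code described above can be sketched in a few lines of Python. This is only an illustration: the function name and the two lookup tables are assumptions made for this example, built from the domain numbers used in this book (1 = semantic, 2 = syntactic, 3 = parasyntactic) and the genera appearing in the code tables of this chapter.

```python
# Hypothetical lookup tables assembled from the code tables of this chapter.
DOMAINS = {1: "semantic", 2: "syntactic", 3: "parasyntactic"}
GENERA = {
    (1, 1): "adjective", (1, 2): "adverb", (1, 3): "noun", (1, 4): "verb",
    (2, 1): "conjunction", (2, 2): "determiner", (2, 3): "preposition",
    (2, 4): "pronoun", (2, 5): "syntactic verb",
    (3, 1): "abbreviation", (3, 2): "interjection", (3, 3): "name",
    (3, 4): "numeral", (3, 5): "para-adverb", (3, 6): "particle",
    (3, 7): "symbol",
}


def decode_species_code(code):
    """Split a four-digit species code into domain, genus and species digits."""
    text = str(code)
    if len(text) != 4 or not text.isdigit():
        raise ValueError(f"expected a four-digit code, got {code!r}")
    domain = int(text[0])   # first digit: domain
    genus = int(text[1])    # second digit: genus within that domain
    species = text[2:]      # last two digits: species within that genus
    return DOMAINS[domain], GENERA.get((domain, genus), "unknown"), species


print(decode_species_code(1110))  # agentive adjective
print(decode_species_code(1411))  # complex verb (base)
```

The last two digits are returned as text rather than as a number so that a species such as 10 is not confused with 1.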
8.4 Syntactic Schemata

Syntactic schemata which are traditionally referred to as “closed-class
items” (Quirk, Greenbaum, Leech & Svartvik, 1985, p. 71) consist of
five genera: conjunctions, determiners, prepositions, pronouns and
syntactic verbs. In contrast to semantic schemata, syntactic genera
comprise species whose types are few but their tokens or frequencies are
many.
For example, Arberry (1964) used only 16 types of syntactic schemata

in the translation of Surah 1 of the Quran. Similarly, Irving (1985)
employed 19 syntactic schemata to accomplish the same task. Table 8.5
presents the syntactic schemata employed by both translators.

Table 8.5
Syntactic schemata comprising the first surah of two contemporary
translations of the Quran
No  Arberry (1964)  Schema Type   f   Irving (1985)  Schema Type   f
1   Nor             Conjunction   1   And            Conjunction   1
2   All             Determiners   1   Nor            Conjunction   2
3   The             Determiners   10  The            Determiners   9
4   Against         Preposition   1   Along          Preposition   1
5   For             Preposition   1   For            Preposition   2
6   In              Preposition   2   In             Preposition   1
7   Of              Preposition   7   Of             Preposition   4
8   To              Preposition   2   On             Preposition   1
9   Those           Pronoun       3   With           Preposition   1
10  Thee            Pronoun       2   To             Preposition   1
11  Us              Pronoun       1   Those          Pronoun       2
12  Who             Pronoun       1   Whom           Pronoun       2
13  Whom            Pronoun       2   Us             Pronoun       1
14  Thou            Pronoun       2   You            Pronoun       4
15  We              Pronoun       2   Who            Pronoun       2
16  Hast            Verb          1   We             Pronoun       2
17                                    Are            Verb          1
18                                    Do             Verb          2
19                                    Have           Verb          1
If we compare the results presented in Tables 8.3 and 8.5, we realize that
in contrast to 23 semantic schemata (54.8%), Arberry (1964) used only
16 syntactic schemata (38.1%) in the translation of Surah one.
Although these schemata are fewer in number, their types such as the
determiner the enjoy the highest frequency in the entire translation, i.e.,
f = 10. Similar to their semantic counterparts, syntactic schema species
consist of certain types as shown in Table 8.6.
Table 8.6
Syntactic schema species and their types
Species SC Type
Conjunction (Phrasal) 2110 as well as; so that, such as
Conjunction (Simple) 2120 But, or, as, while, when, since,
Demonstrative Determiner 2210 This, that, these, those, such; both
Interrogative Determiner 2220 What (season), which (place)
Numeral Determiner 2230 Two, ten
Possessive Determiner 2240 My, your, her, his, its, our, your, their
Ranking Determiner 2260 First, second, twelfth
Specifying Determiner 2270 A, an, the
Complex Preposition 2310 Across; around; toward; between;
beyond
Compound Preposition 2320 Upon, into, within; without;
throughout;
Phrasal Preposition 2330 In spite of, because of, according to
Simple Preposition 2340 Up, on, at; of; than, for
Demonstrative Pronoun 2410 This, that, here, there, both, own, same
Emphatic Pronoun 2420 Myself; yourself
Interrogative Pronoun 2430 Who, where
Object Pronoun 2440 Me, you, him, her, it, us, them
Possessive Pronoun 2441 Mine, yours, theirs
Reflexive Pronoun 2450 Myself; himself
Relative Pronoun 2460 Who, where, when, such that
Subject Pronoun 2470 I, you, he, she, it, we, you, they, there,
it (expletive)
Unspecified Pronoun 2480 One, some, few, many, much, several;
others; something; another, plenty
Specified Pronoun 2481 fourth; half; each other;
Conditional Auxiliary 2510 Had (he known the answer, he would
have come)
Past Auxiliary 2511 Was, were, had, did

Past Perfect Auxiliary 2512 Had been
Present Auxiliary 2521 Am, are, is, have, do,
Present Perfect Auxiliary 2522 Has been, have been
Present Perfect Continuous Auxiliary 2523 Has been being, have been being
Past Modal Auxiliary 2531 Might be, should be, could be, would
be
Past Perfect Modal Auxiliary 2532 Might have (been), should have, could
have, would have (been)
Present Modal Auxiliary 2541 Will be, can be, may be
Present Perfect Modal Auxiliary 2542 Will have, may have, shall have, can
have
Future Perfect Continuous Auxiliary 2543 Will have been; shall have been
Past Phrasal Auxiliary 2551 Was/were going to, was/were to, had
to, ought to
Past Perfect Phrasal Auxiliary 2552 Ought to have
Present Phrasal Auxiliary 2561 Am/are/is to, has/have to, ought to,
am/are/is going to
Modal (Present) 2570 Can, may, shall
Modal (Past) 2580 Could, might, should
Syntactic schemata are limited in their types because they depend on and
attach to the semantic schemata comprising the texts in order to
constrain them within the variables of place and time. The auxiliary
schema hast, for example, was used in verse 2, i.e., the path of those
whom Thou hast blessed…, to limit the path only to those blessed by the
Almighty Allah within the constraint of the present perfect tense.
8.5 Parasyntactic Schemata

As a domain, parasyntactic schemata comprise seven genera, i.e.,
abbreviations, adverbs, interjections, names, numerals, particles and
symbols. Each type in turn consists of certain tokens adding up to 27 in
total. Since parasyntactic schemata are similar to syntactic schemata,

they depend on and attach to semantic as well as syntactic schemata in

order to constrain them within the variables of place and time.
Table 8.7 presents the parasyntactic schemata comprising the first surah
of two contemporary translations of the Quran. As can be seen, few
parasyntactic schemata have been employed by both Arberry (1964) and
Irving (1985). This feature might be an intriguing area of research in text
analysis. It seems that religious texts differ from scientific texts in terms
of their constituting parasyntactic schemata.
Table 8.7
Parasyntactic schemata comprising the first surah of two contemporary
translations of the Quran
No  Arberry (1964)  Schema Type  f   Irving (1985)  Schema Type  f
1   Alone           Para-adverb  1   Not            Para-adverb  1
2   Not             Para-adverb  1
3   Only            Para-adverb  1
The schema not is, for example, parasyntactic because it attaches to a

syntactic schema such as does and to a semantic schema such as is in
order to show that particular actions and states have not materialized.
For example, not is used in the last verse of Surah one, i.e., the road of
those whom You have favored, with whom You are not angry, nor who
are lost! (Irving, 1985), to reveal what type of road the believer is not
asking for.
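The domain shares quoted for the two translations (54.8% and 38.1% for Arberry, 51.2% for Irving) can be rechecked with a short Python sketch. The semantic and syntactic counts are those reported in the text, and the parasyntactic counts are read from Table 8.7. The sketch assumes that each translator's total is simply the sum of the three domains; the function and variable names are hypothetical.

```python
# Schema counts per domain for Surah 1, as reported in this chapter.
counts = {
    "Arberry (1964)": {"semantic": 23, "syntactic": 16, "parasyntactic": 3},
    "Irving (1985)": {"semantic": 21, "syntactic": 19, "parasyntactic": 1},
}


def domain_shares(domain_counts):
    """Return each domain's share of all schemata as a percentage."""
    total = sum(domain_counts.values())
    return {domain: round(100 * n / total, 1) for domain, n in domain_counts.items()}


for translator, domain_counts in counts.items():
    total = sum(domain_counts.values())
    print(translator, "total:", total, domain_shares(domain_counts))
```

Under this assumption the totals come to 42 schemata for Arberry and 41 for Irving, which agrees with the total of 41 quoted for Irving in the text.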
Table 8.8 presents parasyntactic schema species and their types. As can
be seen, there are 27 parasyntactic tokens in English among which
adverbs account for 11 tokens.

Table 8.8
Parasyntactic schema species and their example types
Species SC Types
Abbreviations 3110 Adj., Dec., et al, L1; L2; they're; I've;
i.e., e.g.,
Acronyms 3120 NATO, NASA, Scuba, radar
Interjection 3210 Ah, ooh, why! Please,
Name (Full) 3310 Douglas Brown, Ernest Hemingway,
United States
Name (Labeling) 3320 Natural Approach
Name (Organizational) 3330 National Security Council
Name (Single) 3340 Brown, Mary, Iran, America
Name (Titles) 3350 Professor; Sir; Madam
Numeral (Alphabetic ) 3410 A, b, twenty-five, in time,
Numeral (Digital) 3420 0, 1, 20, 3.1,
Numeral (Roman ) 3430 II, xi,
Numeral (Year) 3440 1998,
Para-adverbs (Additive) 3511 Also, too, furthermore
Para-adverbs (Contrasting) 3512 However, nonetheless, nevertheless;
perhaps; regardless; instead
Para-adverbs (Emphatic) 3513 Of course; certainly; on the whole;
indeed, even; at all (in any way)
Para-adverbs (Frequency) 3514 Always, never, ever, again
Para-adverbs (Intensifying) 3515 Very, dead, only, merely; just; most;
more; so; in fact; at least, as…as,
how
Para-adverbs (Interrogative) 3516 How (high is the road?), why, when
Para-adverbs (Manner) 3517 Together; how (to);
Para-adverbs (Negation/Approval) 3518 Not, Yes, No,
Para-adverbs (Prepositional) 3519 Up, upstairs, down,
Para-adverbs (Referential) 3520 Thus, hence, so; first; then; as such;
that is; here; there
Para-adverbs (Time) 3521 Now, ago, tomorrow, yesterday,
since, still; once; already
Para-adverbs (Exemplifying) 3522 For example,

Particle (Complex) 3610 in order to
Particle (Simple) 3611 to (in the need to)
Symbol (Conventional) 3710 $, &,
Symbol (Scientific) 3720 ×, <,
8.6 Summary
Both original and translated texts provide translation researchers with
necessary data to conduct a quantitative research project provided that
their constituting elements are theoretically defined in advance and a
consistent method developed to operationalize the definitions. Not only
does schema theory define translation as a linguistic and cognitive
process through which translators choose the best target equivalents for
source schemata, i.e., words comprising source phrases, clauses,
paragraphs and texts, but also it explains whether two or more translated
texts differ significantly from each other in terms of their domains, types
and tokens. In this chapter we familiarized ourselves with the domains,
types and tokens involved in the translation of source texts into English.
These are by their very nature categories into which all schemata
comprising translated texts can be schematically assigned. In chapter 9
we will learn how these categories can be treated as categorical variables
to study translated texts as objectively as possible.

9 Statistical Analysis of Categorical Variables
9.1 Introduction
Any committed attempt to solve problems poses itself as a research
project which can be conducted qualitatively and quantitatively. While
the former calls for researchers’ personal interpretation of the problems
as their solution, the latter encourages them to review the relevant
literature, formulate hypotheses, adopt an appropriate research method
and stay aloof as much as possible and let the data be collected before
they offer any solutions.
Translating from a source text to a target text can, for example, be
approached as a problem whose solution requires translators to render the
text in a target language, i.e., English. If a translator adopts
adaptation, communicative translation, faithful translation, free
translation, idiomatic translation, literal translation, semantic translation,
or word-for-word translation as his preferred research style, he follows a
qualitative approach in that he will use a number of personally selected
points to support his translation.
In contrast to qualitative approaches to translation such as adaptation,
schema-based translation projects focus on the words comprising a
source text, i.e., schemata, and try to understand them within their
context and then replace them with appropriate equivalents to
create a target text. The schemata comprising the two texts, rather than
the translators’ personal justification, are therefore employed to conduct
or analyze the translation rendered.

While qualitative translation research projects do not supply their users

with any sort of comprehensive data as to what to expect from the
translation, the quantitative approach must specify in advance what its
method would be and what sort of translation its readers must expect.
For example, in chapter eight we studied Arberry (1964) and Irving’s

(1985) translations of the first surah of the Quran. These two translators
have expressed two different objectives as their goals. While Arberry
conducted his translation project in order to convey the message of the
Quran by rendering its rhyming as much and as best as possible, Irving
took his contemporary English knowledge as the medium through
which, he believed, the message of the Quran could be communicated to
his American readers.
Since schema theory focuses on the background knowledge of

translators as the most important variable involved in translation, we can
assume that two or more translations of the same text will be similar if
their translators enjoy similar educational background. Based on this
assumption, in this chapter we will conduct a research project on
Arberry (1964) and Irving’s (1985) translations of the first surah of the
Quran to test the following hypotheses.
1. The semantic, syntactic and parasyntactic domains of the two
translations of Surah 1 will not be significantly different.
2. The semantic, syntactic and parasyntactic types of the two translations
of Surah 1 will not be significantly different.
3. The semantic, syntactic and parasyntactic tokens of the two
translations of Surah 1 will not be significantly different.
The hypotheses above rest on the fact that Irving was a linguist by
profession and served as a professor of Spanish and Arabic at the
University of Minnesota for some time (Khan, 2000). Similarly, Arberry
was a renowned Orientalist and Professor of Arabic at the Universities
of London and Cambridge (Kidwai, 1987).

9.2 Schemata as Categorical Variables

We have already noticed that the schemata comprising a text can be
classified hierarchically into domains, types and tokens. For example,
the schema حَمْدُ (praise) in the second verse of surah 1 in the Quran, is
a simple schema which belongs to the type of nouns expressing a
specific meaning in the domain of semantics. It can therefore be treated
three times as a domain, type and token variable.
In order to treat each and all schemata comprising texts three times as
three different categorical variables, we have no choice but to use
computers. The first step in this process will be typing the text as a word
document if its electronic version is not available. Some students scan
printed pages to bypass the process of typing the texts. Unfortunately,
this does not work because when a given text is scanned, its photo is
taken. For categorizing schemata we need to work with the words
themselves, not their pictures. We will go through the process of
categorizing schemata step by step so that our project will be as
objective and free of error as possible.
9.2.1 Text as a Word File

We must type the text under investigation as carefully as possible.
Before going through the instructions, we need to create a Word file on
our computer's hard disk. Since we are going to analyze the first translated
surah of the Quran, we can create a file called QS001EA.doc. This name
is very handy and at the same time accurate. It is handy because it
consists of only seven characters. It is accurate because its characters
stand for Quran, Surah, 001 (out of 114), English and Arberry. (Since
we are going to compare both Arberry and Irving’s translations, we need
to create a second file called QS001EI.) Follow the instructions below for
whatever type of text you type.
1. Type the entire text as it is in the original form.
2. Follow the original text in terms of its paragraphs. If there are three
paragraphs, you have to type it in three paragraphs.
3. Use Tabs to form paragraphs.
4. If the initial of any word in the text is capitalized, capitalize it.
5. If any Words or Phrases in the original text are capitalized, they must
also be capitalized in the typed text. For example, “it became widely
known in the United States…” (The initials of United States are
capitalized.)
6. Put dots (.) immediately after the words they follow. Do not insert
any spaces between words and dots.
7. Type commas (,), semicolons (;) and dashes (–) immediately after the
words they follow. Do not insert any spaces between words and
commas. (For example, John , my friend , is an American should be
typed John, my friend, is an American.)
8. Do not use underline ( _ ) instead of dashes (–).
9. Leave only one space between two words in the text. For example,
“Seidnstucker and  Plotz were …” should be corrected as “Seidnstucker
and Plotz were …” (There was an extra space between and and Plotz.)
10. Pay attention to the spelling of words in the original text.
11. No plural morphs, i.e., s and es, should be dropped. For example, the
sentence “Typical sentence were…” should have been typed “Typical
sentences were…”
12. Use single line spacing for paragraphs.
13. Use Times New Roman for fonts.
14. Use 1" (one inch) for the top and bottom margins and 1.25" for left
and right margins as shown in the dialogue box in Figure 9.1.
15. Do not decorate the text in any form.
Figure 9.1
Page setup in Word

16. Do not use automatic numbering. Number the items manually. For
example, the following two items are automatically numbered. They
must have been typed manually as they are done in this file.
1. Classroom instruction was conducted exclusively in the target
language.
2. Only everyday vocabulary and sentences were taught.
As you conduct your own research, you will realize how important it is
to number the items manually. When the entire text is broken into its
constituting schemata, whatever comes after the automatic numbers will
be numbered automatically!
17. Pay close attention to the words whose spellings are very similar.
For example, “It offered innovations at the level of teaching
procedures but lacked a through methodological basis.” Through
should have been typed thorough.
18. Pay attention to morphs forming different tenses. For example,
“Once basic proficiency was establish, …” should have been typed
“Once basic proficiency was established, …”
19. Leave one space between symbols and words that follow them. For
example, “Kelly 1969:53” should be corrected as “Kelly 1969: 53”
(There must be a space between : and 53.)
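Some of the typing rules above are mechanical enough to be checked automatically once the text has been typed. The Python sketch below is only an illustration and covers only the spacing rules (6 to 9 and 19); the patterns and the function name are assumptions made for this example. Spelling, capitalization and dropped morphs still have to be checked by eye against the original.

```python
import re

# Each check pairs a message with a regular expression. The patterns are
# illustrative assumptions, not an exhaustive test of the typing rules.
CHECKS = [
    ("space before punctuation", re.compile(r"\s+[.,;:]")),
    ("more than one space between words", re.compile(r"\S {2,}\S")),
    ("underline used instead of a dash", re.compile(r"_")),
    ("no space after colon", re.compile(r":\S")),
]


def check_typed_text(text):
    """Return (line number, message) pairs for lines that break a typing rule."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        for message, pattern in CHECKS:
            if pattern.search(line):
                problems.append((number, message))
    return problems


sample = (
    "John , my friend , is an American.\n"
    "Seidnstucker and  Plotz were there.\n"
    "Kelly 1969:53"
)
for problem in check_typed_text(sample):
    print(problem)
```

The colon check would also flag clock times such as 10:30, so every reported line should be inspected before it is changed.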
If we follow the 19 steps described above, we will have the Word file
QS001EA.doc whose text appears in Figure 9.2. As can be seen, the
verses consist of sentences and clauses. We need to break the sentences
and clauses of the original typed text into their constituting single and
phrasal schemata. (Note that there are no phrasal schemata such as give
up. If there were any, they would have to be put together in one row.) We
do not need to do anything manually because Word itself will break
down the text if we apply the relevant command to it.

Figure 9.2
Original typed text in word
The command which breaks texts down into their constituting schemata
is called Find and Replace. In order to activate this command, you need
to press the keys Ctrl and F together. Upon pressing these
keys, the Find and Replace dialogue box will appear on the screen.
We need to press the space bar once in the box appearing in front of
Find what. Figure 9.3 shows the process.
Figure 9.3
Find function activated

After pressing the space bar once, we have to click on the Replace
button on the top to activate another box called Replace with and type
^p in its front box as shown in Figure 9.4.
Figure 9.4
Replace function activated
Upon typing ^p in the Replace with box, we must click on Replace All.
Word will change the whole document into a single column of schemata
and a message will appear reading, “Word has completed search of the
document and has made 63 replacements”. If you click on OK, the
message will disappear and you will have the column of schemata in
front of you, as shown in Figure 9.5, for further processing.
Figure 9.5
Successful functioning of Replace All command
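For readers who prefer a script to the dialogue box, the effect of replacing every space with a paragraph mark can be imitated in Python. This is a minimal sketch and not part of the procedure in Word; the function name is hypothetical, and the sample line is modeled on the schemata listed for Arberry in Table 8.3.

```python
def split_into_schema_rows(text):
    """Imitate Find and Replace: turn every single space into a line break."""
    replacements = text.count(" ")              # what Word would report
    rows = text.replace(" ", "\n").split("\n")  # one row per schema
    return rows, replacements


verse = "Praise belongs to God, the Lord of all Being,"
rows, replacements = split_into_schema_rows(verse)
print(replacements, "replacements")
for row in rows:
    print(row)
```

As in Word, punctuation marks stay attached to the schemata at this stage, and an accidental double space produces an empty row.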

Upon having the column of schemata on our screen, we have to take two
steps. First, if there are any empty rows between the rows of
schemata, delete them. Secondly, if there are any punctuation marks
such as single or double quotation marks and commas at the beginning or
end of any schemata, delete them. The deletion of punctuation marks is
very important because they affect the way schemata are automatically
sorted in an Excel file.
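The two cleaning steps just described, i.e., deleting empty rows and deleting punctuation marks at the beginning or end of a schema, can be sketched in Python as follows. The function name and the sample rows are hypothetical, and the curly quotation marks are added on the assumption that the word processor has converted straight quotes into them.

```python
import string

# Marks removed from both ends of a row. Hyphens inside a schema such as
# All-merciful are kept because only the edges are stripped.
EDGE_MARKS = string.punctuation + "“”‘’"


def clean_schema_column(rows):
    """Drop empty rows and strip punctuation from the edges of each schema."""
    cleaned = []
    for row in rows:
        schema = row.strip().strip(EDGE_MARKS)
        if schema:  # skip rows that are empty after cleaning
            cleaned.append(schema)
    return cleaned


rows = ["Praise", "belongs", "to", "God,", "", "the", "“Lord”", "All-merciful,"]
print(clean_schema_column(rows))
```

Because the rows keep their original order, the cleaned column can still be read from top to bottom as the original sentence.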
Figure 9.6 shows the polished and depunctualized schemata. As we can see,
there are no marks at the beginning of any schema. Neither are there any
marks at their end, e.g., there is no comma after the noun schema God any
more. As we move down the column of schemata with the downward arrow
key (↓) and delete their punctuation marks, Word automatically capitalizes
each schema. If it does not, press F7 and click on Change whenever
Capitalization appears on the top of the first white box.
Figure 9.6
Depunctualized schemata
If we pay attention to the column of schemata shown in Figure 9.6 and

read them from top to bottom, we will realize that they still have the
structure of the original sentences in which they appeared. This structure
is very important because it will help us decide what domain, type and
token each schema belongs to. For example, from its position in the
sentence we know that praise is a noun not a verb schema in verse 2 of
Surah 1 in the Quran.
We need to take the last step before we start categorizing the schemata
in an Excel file. Press Ctrl + A to select and highlight the schemata.
Then click on the Insert menu in Word 2007. As soon as you click on
Insert, a Table button will appear with a downward arrow. Upon clicking on
the arrow, a dialogue box will appear as shown in Figure 9.7.
Figure 9.7
Commands involved in converting texts to table
Figure 9.8
Last command in converting texts to table
If we click on Convert Text to Table, the dialogue box Convert Text
to Table will appear. Figure 9.8 presents the box. As we can see, it
contains three sections. The first section shows the number of columns,
which is automatically identified as 1. It also shows the number of rows in
an inactive line, i.e., 68. The number of rows created will be exactly the
same as the number of schemata. We need only to click on OK to have
the whole column converted to a table. Figure 9.9 presents the resultant
table.
Microsoft Word is an excellent program to create texts and tables. If we
want to work with numbers within tables and create graphs, we need to
employ another program called Microsoft Excel. All Microsoft Office
CDs or DVDs have both Word and Excel programs.
Figure 9.9
Schemata column converted to a table
9.2.2 Categorizing Schemata in Excel File

Having converted the column of schemata into a column in a table, we
can copy and paste the column in an Excel Work Sheet. Each sheet
consists of columns A, B, C … and rows 1, 2, 3 … At the bottom of the
sheet you will find Sheet 1, Sheet 2 and Sheet 3.
You can insert extra sheets or delete them through certain steps to which
we will turn in due course. For the present, you need to open an Excel
file and give it an appropriate name.
If you remember, we created two Word files and named them QS001EA
and QS001EI to break down Arberry and Irving’s translations. We can
create and give the same names to two Excel files. (We will focus on
working with the first file, hoping that the same procedures will be
followed for establishing and working with the second Excel file.)

After naming and saving the Excel worksheet in an appropriate folder,

we need to name the sheets in each file, too. If you click on Sheet 1 icon
at the bottom of the Excel sheet, it will get highlighted so that you can
write whatever name you choose. Since we are going to paste the
column containing the schemata as they were originally used in each
translation, we name sheet 1, Original.
Each column in Original sheet will be considered as a categorical

variable. Whatever variable we type in each column, we can put its
name in the first cell of the column. For example, since we are going to
categorize schemata, we will call Column A Schema, and type the word
Schema in the first row of Column A.
Upon naming Column A as the first variable, we need to go back to the Word
file, copy the table with the schema column, return to the Excel file, place
the pointer in row 2 of Column A and paste the entire Word column.
Figure 9.10 shows what the result of following the above mentioned steps
would be.
Figure 9.10
Forming the column of the first variable called schema
As soon as we establish the Schema

variable in sheet 1, we can treat the
domain of each schema as another
variable. If you remember we divided all
schemata comprising a translated text
into semantic, syntactic and parasyntactic
domains and coded them 1, 2, and 3
respectively. Since typing the code is
always easier than typing the name of a
variable, we will use the name DOC for
Column B which stands for Domain Code.

Similarly, we can employ the name TYC for Column C to stand for the
type of schema. If you remember, as a domain, semantic schema
consisted of adjectives, nouns and verbs as its types.
And finally, we use the name TOC for Column D to stand for the
schema tokens. As a semantic type, for example, adjectives were divided
into agentive, comparative, complex, dative, derivational, nominal,
simple and superlative tokens.
Figure 9.11 shows the four variables established in sheet 1 named

Original because it contains the schemata in their original order. In
order to codify the schema, it is better we start with codifying tokens.
We can use the codes given in sections 8.4.1, 8.4.2 and 8.4.3 to form a
comprehensive table of codes. The handiest way to work with the table
of codes would be printing and displaying it somewhere near the
monitor of the computer. The printed pages can, for example be stuck to
the sides of monitor by sticky tapes in order to be consulted regularly.
Figure 9.11
Schema and its codes as variables

Figure 9.12 presents the codes given to schema tokens as the fourth
variable, i.e., TOC. If you remember, the digits used in the codes
were systematically chosen in that each digit represents a certain level of
schema classification. The digits on the left, in the middle and on the
right show the domain, type and token, respectively. The schema in, for
example, belongs to the syntactic domain, i.e., 2, as a preposition type,
i.e., 3, which has a simple structure as its token, i.e., 4. The systematic
nature of token codes helps us internalize the codes table and employ our
memory instead of the table after a short period of time. The TOC codes can
also be employed to specify the codes required for schema domains (DOC)
and types (TYC).
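Since the domain and type are already contained in a token code, the DOC and TYC columns can be derived from TOC instead of being typed by hand. The Python sketch below illustrates the idea under one stated assumption: TYC is written as the first two digits of the token code (e.g., 23 for a preposition), so that types from different domains cannot be confused. The figures in this chapter may instead show the second digit alone. The function name and the three sample rows are hypothetical, with codes taken from the code tables of chapter 8.

```python
def domain_and_type_codes(toc):
    """Derive DOC and TYC from a four-digit token code (TOC)."""
    text = str(toc)
    if len(text) != 4 or not text.isdigit():
        raise ValueError(f"expected a four-digit code, got {toc!r}")
    doc = int(text[0])   # first digit: domain code
    tyc = int(text[:2])  # assumption: domain digit plus type digit
    return doc, tyc


# (schema, TOC) pairs: simple preposition, simple noun, negation para-adverb.
coded_rows = [("In", 2340), ("Praise", 1380), ("Not", 3518)]

print("Schema", "DOC", "TYC", "TOC")
for schema, toc in coded_rows:
    doc, tyc = domain_and_type_codes(toc)
    print(schema, doc, tyc, toc)
```

Deriving the two columns in this way also guards against a domain code that contradicts the token code typed in the same row.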
Figure 9.12
Codification of schema tokens
Figure 9.13 presents the codes assigned to schema domains (DOC),

types (TYC) and tokens (TOC) used in Arberry’s (1964) translation of
the first Surah in the Quran.

Figure 9.13
Coded schema domains, types and tokens
Upon codifying schema domains, types and tokens by employing the

context in which they appeared in the original translated text, we can
establish another sheet in order to sort the coded schemata in terms of
their frequency. We need to keep the sheet called Original for future
reference. Whenever we feel that one of the codes does not seem right,
we can refer to the sheet and check its context.
Each Excel file normally consists of three sheets. If you remember, we

clicked on the icon Sheet1 two times and named it Original.
We do the same with Sheet 2 and name it Sorted. Sorting the original
schemata will allow us to count their frequency, and thus decide how many
different schema domains, types and tokens have been used in a given text
and then compare it with other texts. Figure 9.14 shows the Sorted sheet.
This sheet will help us find out how many times a given schema type is
repeated. To achieve this, we need to add a variable called frequency.
Figure 9.14
Naming Sheet 2 for sorting schemata
After adding a variable for the frequency of schemata, we must go to the
Original sheet, copy all the four variables by highlighting them from the
first cell of column A to the last cell of column D, come back to the
Sorted sheet, click in cell 1 of Column A and paste the copied variables as
shown in Figure 9.15.
Figure 9.15
Creating variables in the Sorted sheet
The ribbon of Figure 9.15, i.e., its top area which includes two bars,
is minimized in order to have a larger space to work with the data
in the sheet. If you right click next to any menu such as Developer in
the second bar from the top, and click on the Minimize the Ribbon
command, the ribbon of Figure 9.15 will be automatically maximized.
Figure 9.16 shows the process of deactivating the Minimize the
Ribbon command. Maximizing the ribbon makes a number of
new menus accessible to us.
Figure 9.16
Activating the Ribbon command
Figure 9.17 presents the maximized ribbon. As we can see, the last menu
in the maximized ribbon deals with Editing. In the middle of the
Editing menu, there is another menu called Sort & Filter.
Place your pointer in Cell 1 of Column A and then go to the Sort & Filter
menu and click on it.
Figure 9.17
Maximized ribbon
Figure 9.18 presents a dialogue box which is activated as soon as you click
on the Sort & Filter icon. We can use these commands to sort our variables
either from letter A to Z or from Z to A. The Custom Sort command allows
us to sort more than one variable at a time.
Figure 9.18
Activating Sort & Filter dialogue box
If you look at Figure 9.16, you can see

that the schemata in the first column
have kept their original appearance in
the sentence. We can now use Sort A
to Z command to find out whether any
schema token has been repeated more
than once. Schema tokens are the ones
which are different from each other in terms of their type, not inflection.
For example, go, goes, gone and went are all considered one token of the

175
simple verb go. Therefore, if we have a text in which each of these four
forms has been used only once, we will sort them all as one single
schema token and write 4 in the column specifying their frequency.
The schemata act, active, actively and action will, however, all be
considered different schema types because they differ from each other in
being verb, adjective, adverb and noun, respectively. Similarly, if the
schema act appears two times in text, once as a verb and once as a noun,
they will be considered two different schema tokens and coded
differently.
We can now apply the Sort A to Z command to sort the schema tokens in
Column A. Figure 9.19 presents the sorted schemata comprising Arberry’s
(1964) translation of Surah 1 in the Quran.
Figure 9.19
Sorted schemata
As we can see, the schema Against is used just once. So, we write 1 in
row 2 of the Frequency column. The schema tokens All, All-compassionate,
All-merciful and Alone are also used just once, so we write 1 in the
corresponding cells of the Frequency column. The schema types Are and
Art are, however, two forms of the same main verb. (Art is the archaic
form of Are.) We have to, therefore, delete the Art row by right
clicking on 8 and choosing the command Delete. For further reference, we
can, however, go back to Are and type Are/Art. The frequency of Are/Art
will then be 2.
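The counting described above can also be sketched in a few lines of Python. This is only an illustration and not part of the Excel procedure: the list of tokens is a short subset taken from the examples in the text, and the name same_schema is a hypothetical mapping introduced here to show how two forms of one verb are counted together.

```python
from collections import Counter

# A short illustrative subset of the sorted schemata (not the full data).
tokens = ["Against", "All", "All-compassionate", "All-merciful",
          "Alone", "Are", "Art"]

# Hypothetical mapping: forms of the same schema are counted under one entry.
same_schema = {"Are": "Are/Art", "Art": "Are/Art"}

# Count every token under the entry it belongs to.
frequency = Counter(same_schema.get(token, token) for token in tokens)

for schema, count in sorted(frequency.items()):
    print(schema, count)
```

Each printed line corresponds to one row of the Sorted sheet together with the number we would type in its Frequency column.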

Figure 9.20 shows the frequency of some sorted schemata.

Unfortunately, the whole Sorted sheet could not be reproduced here for
the sake of saving space. However, the first page provides us with a
representative sample. As we can see along with Are/Art, the schemata
for and God have been used two times throughout Surah 1.
Establishing the Frequency column in the Sorted sheet has revealed two
facts about the schemata used in the text under analysis. First, it
shows which schemata have had the highest frequency in the entire text.
Secondly, it allows us to double check the total number of schema types
used in the text.
Figure 9.20
Some sorted schemata and their frequency
For example, if the schema types go, goes, gone, went and going have
each been used once in a text, we will consider them all as one type of
schema with five tokens. In other words, type shows what schemata have
been used in a text, and their tokens show how many times those types
have been repeated or used in their inflected forms.
We can employ schema domains, types and tokens to compare texts with
each other in terms of their lexical density. Since semantic schemata
convey the message of a given text, the higher their percentage in a
text, the more difficult it will be. We can also argue that the higher
the percentage of syntactic and parasyntactic schemata, the easier the
text will be.
Table 9.1 presents the percentage of domains used in modern political
texts and the translation of the Quran (Surah 1). If you are interested
in comparing schema types and tokens as well, read Khodadady’s (2008)
paper from which the data related to the political texts have been taken.
Table 9.1
Comparing the schemata used in the translation of Surah 1 with those of
modern political texts
Text/Schema type   Semantic        Syntactic       Parasyntactic
                   Frequency (%)   Frequency (%)   Frequency (%)
Political          2469 (77%)      202 (6%)        547 (17%)
Arberry            23 (55%)        16 (38%)        3 (7%)
Irving             21 (51%)        18 (44%)        2 (5%)
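The percentages in Table 9.1 follow directly from the frequencies: each frequency is divided by the total of its row. A minimal Python sketch of this arithmetic is given below; the function name domain_percentages is introduced only for this illustration.

```python
# Frequencies copied from Table 9.1 (semantic, syntactic, parasyntactic).
frequencies = {
    "Political": (2469, 202, 547),
    "Arberry": (23, 16, 3),
    "Irving": (21, 18, 2),
}


def domain_percentages(counts):
    """Return each count as a whole-number percentage of the row total."""
    total = sum(counts)
    return tuple(round(100 * count / total) for count in counts)


percentages = {text: domain_percentages(counts)
               for text, counts in frequencies.items()}

for text, row in percentages.items():
    print(text, row)
```

Working with percentages rather than raw frequencies is what allows us to compare a long political text with a short Surah.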

9.2.3 Naming and Categorizing Variables in SPSS
Statistical Package for the Social Sciences (SPSS) is one of the best
statistical software packages available in the market. We will use it in
order to analyze our data and apply various statistical tests to our
hypotheses. Similar to other programs, first we need to give our SPSS
file a name. We can use the same name we gave to Word and Excel files,
i.e., QS001EA, to stand for the translation of Surah 001 of the Quran
into English by Arberry.
Figure 9.21 shows the SPSS Data Editor file which we have named
QS001EA. As we can see, there are two icons at the bottom of the
file: Data View and Variable View. As their names show, if we click
on Data View it will activate an SPSS sheet on which we can enter our
data. If we click on Variable View, it will activate the sheet on which
we can specify the names, types, values and features of our variables.
Let us remember that filling out these sheets is the most important
phase in our analysis of the data. The validity of our results will
directly depend on how exactly we fill them out.
As can be seen in Figure 9.21, the Variable View of our SPSS Data
Editor file is activated. The first variable in our analysis is the
schemata used in the text. We, therefore, click in case 1 under the Name
column and type schemata. The name of our second variable is DOC (domain
code). So, we click in case 2 in the Name column and type doc.
Similarly, we click in cases 3, 4 and 5 in the Name column and type tyc,
i.e., type code, toc, i.e., token code, and freq, i.e., frequency,
respectively.
Figure 9.21
SPSS Data Editor
The second column in the Variable View sheet is called Type. If you
click on case 1 under this column you will see a grey area on the right
having the ellipsis (...). As soon as we click on this ellipsis icon, a
dialogue box appears which asks us to specify what type of variable
schema is.
Figure 9.22
SPSS variable types
Figure 9.22 presents SPSS variable types. As we can see, there are eight
types of variable in SPSS. Since our schema variable consists of
letters, we click in the circle called String. As soon as we click in
the String circle, another box called Characters appears. Each character
stands for one letter. We have chosen 20 because none of the schemata
used in the translation has more than 20 letters. The schema
All-compassionate, for example, consists of 17 characters, including the
hyphen.
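Whether a chosen number of characters is wide enough can be checked by measuring the longest schema in the list. The following Python sketch does this for a short illustrative subset of the schemata; the variable names are introduced only for this example.

```python
# A short illustrative subset of the schemata (not the full data).
schemata = ["Against", "All", "All-compassionate", "All-merciful", "Alone"]

# The longest entry decides how many characters the string variable needs.
longest = max(schemata, key=len)
width_needed = len(longest)  # the hyphen is counted as a character

print(longest, width_needed)
print("Fits in 20 characters:", width_needed <= 20)
```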
The other four variables are, however, numeric in nature because they
have been codified by numbers. For this reason, we need to click in
cases 2, 3, 4, and 5 one by one, activate the Variable Type dialogue box
by clicking on the ellipsis icon and choose Numeric as our variable type
as shown in Figure 9.23.
Figure 9.23
Numeric variable types
As we can see in Figure 9.23, for numeric variable types two other icons
called Width and Decimal Places become active. SPSS automatically
chooses 8 for Width and 0 for Decimal Places. We do not change them
because our codes are whole numbers and we do not need any decimals.
These two features of variables appear as the third and fourth columns
of the Variable View sheet. If we look at Figure 9.23, we can see 8 in
cases 2, 3, 4, and 5 of the column Width.
The column Label is designed so that we can write the full name of the
variables as we wish them to appear in the table in which we will
present our research results. We can, therefore, write the complete
names of the variables in the corresponding cells in the Label column.
For example, for the variables doc, tyc, toc and freq we type Domain,
Type, Token and Frequency as shown in Figure 9.24.
Figure 9.24
Variable Labels
The sixth column of the Variable View sheet deals specifically with
categorical and ordinal variables (see 2.2.1 and 2.2.2 in chapter 2). In
this column called Values, we assign a numeric value to each category
or rank and then specify its label. It is important to remember that string
variables do not have any values (See 2.2.5 in chapter 2). In order to
assign values to our categorical and ordinal variables we need to click in
the relevant cells of the column Values. As soon as we click in one of its
cells, the ellipsis icon (…) on its right becomes active. Upon clicking on
this icon a dialogue box called Value Labels appears.
Figure 9.25 presents the Value Labels dialogue box activated for the
schema domain variable. If you remember, for the variable doc, we
assigned each schema to one of the three domains: semantic, syntactic
and parasyntactic. In the Value box we must type 1 and then go to
Value Label and type Semantic and then click on the Add icon. The phrase
1 = “Semantic” will appear in the third big box. We need to go back to
the Value box and type 2 and write Syntactic as its label. The phrase
2 = “Syntactic” will appear in the third box. We follow the same
procedure for the third value and the phrase 3 = “Parasyntactic” will
appear in the big box. As soon as we have finished specifying the values
and labels of categories, we have to click on OK to ensure that the
software has saved them.
Figure 9.25
Activated Value Labels dialogue box
Since there are two other categorical variables in our file, i.e., tyc
and toc, we should follow exactly the same procedure described above for
both. Figure 9.26 presents the values and labels given to schema types.
There are sixteen schema types in English: four for semantic, five for
syntactic and seven for parasyntactic. In chapter 8 we gave the following
codes or values to these types: 11 = Adjective, 12 = Adverb, 13 = Noun,
14 = Verb, 21 = Conjunction, 22 = Determiner, 23 = Preposition, 24 =
Pronoun, 25 = Verb, 31 = Abbreviation, 32 = Adverb, 33 = Interjection,
34 = Name, 35 = Numeral, 36 = Particle, 37 = Symbol. We must type
and add these values and labels in the Value Labels dialogue box and
click on OK to save them.
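The values and labels listed above behave like a lookup table. A minimal Python sketch of the same coding scheme is given below. It relies on one observation read off the listing, namely that the first digit of every type code is the code of its domain; the names TYPE_LABELS, DOMAIN_LABELS and describe are introduced only for this illustration.

```python
# Codes and labels as listed in the text for the variable tyc.
TYPE_LABELS = {
    11: "Adjective", 12: "Adverb", 13: "Noun", 14: "Verb",
    21: "Conjunction", 22: "Determiner", 23: "Preposition",
    24: "Pronoun", 25: "Verb",
    31: "Abbreviation", 32: "Adverb", 33: "Interjection", 34: "Name",
    35: "Numeral", 36: "Particle", 37: "Symbol",
}

# Codes and labels as listed in the text for the variable doc.
DOMAIN_LABELS = {1: "Semantic", 2: "Syntactic", 3: "Parasyntactic"}


def describe(type_code):
    """Return (domain label, type label) for a two-digit type code."""
    domain_code = type_code // 10  # first digit of the type code
    return DOMAIN_LABELS[domain_code], TYPE_LABELS[type_code]


print(describe(14))
print(describe(25))
```

The two calls show why the label Verb needs two different codes: 14 is the semantic verb and 25 the syntactic one.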

Figure 9.26
Specified Value labels for schema type variable
9.2.4 Transferring the Data from Excel Sheets to SPSS

After specifying the name, type, width, decimals, label and values of our
five variables in the Variable View sheet of SPSS file, we need to go to
the bottom of the file and click on the icon Data View in order to make
it active.
Figure 9.27
Active Data View sheet of the SPSS file
Figure 9.27 above presents the active Data View sheet of the SPSS file
QS001EA and how it looks after we have specified the required features
of our five variables. As we can see, the names of the variables appear
in grey at the top of each column. Fortunately, Microsoft Excel sheets
are compatible with SPSS data sheets. We can therefore copy the data
from our Excel file QS001EA, click in cell 1 under the Schema variable
and paste the data.
Figure 9.28 presents the completed Data View sheet in QS001EA. As soon
as we copy our data from the Excel file and paste it in our Data View
sheet, we will be able to apply statistical tests to our data.
Figure 9.28
Completed Data View sheet
9.2.5 Copying Data from Different Excel Files and Pasting them in
the Same SPSS File
At the beginning of this chapter we decided to compare Arberry’s
(1964) translation of Surah 1 of the Quran with that of Irving (1985). To
achieve this objective we need to follow the procedures described in
sections 9.2.1, 9.2.2 and 9.2.3 in order to establish an Excel file for the
schemata employed by Irving. We can name the Excel file QS001EI.
One of the greatest advantages of SPSS is its capacity to accommodate a
large amount of data collected from different files provided that they
deal with the same variables. After establishing the Excel file QS001EI,
we do not need to establish a new SPSS file. What we need is to add
another variable to show that the data have been collected from two
translators.
In order to add a new variable to our SPSS file, we need to click and
activate the Variable View sheet of QS001EA, click in cell 1 under
Name, i.e., schema, go to the Data menu and activate its dialogue box as
shown below.
Figure 9.29
Adding variable in an SPSS file
Figure 9.30
A new variable added as number 1
Upon clicking on the Insert Variable command a new variable will be
added and our former variable 1, i.e. schema, will become variable 2 on
the sheet. This process is shown in Figure 9.29 above. As we can see,
the name var00001 automatically appears under the Name column. We
must click in this cell and type trans as its name. Since there are no
decimals such as 0.5 in our data, we can type 0 under the column
Decimals and then under the column Label we type Translator.
Since the schemata employed by two translators will be analyzed in this
chapter, we need to treat this variable as categorical and assign two
values for its translators. To achieve this objective we need to activate
the Value Labels dialogue box as shown in Figure 9.31. We will type 1
in the Value box and Arberry in the Value Label box and then click Add.
Then we need to go back to the Value box again and type 2, fill the
Value Label box with Irving and click Add. We should not forget to click
on the OK icon; otherwise, whatever we have typed will be lost.

Figure 9.31
Defining the values of a new variable
Figure 9.32
Checking the new variable in Data View
Figure 9.32 above presents the Data View sheet with the new variable.
Since whatever data we have pasted in the file so far belongs to Arberry
whose value is 1, we type 1 in the cells under the variable trans. As we
type 1 and go down the file we will notice that Arberry has used 42
schema tokens.
We can now copy the data related to our five variables from the Excel
file QS001EI, click in cell 43 under the second column schema and
paste the data. Figure 9.33 presents the pasted data from the file
QS001EI. As we can see, 2 is typed in the first column as a value given
to Irving’s schemata.
Figure 9.33
Adding the data related to the second value of the new variable
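The logic of section 9.2.5, i.e., stacking two sets of rows that share the same variables and marking each row with the code of its translator, can be sketched in Python as follows. The rows and their codes below are hypothetical placeholders and not the actual data of the two translations; the function name tag is also introduced only for this example.

```python
# Hypothetical rows in the order (schema, doc, tyc, toc, freq).
# The codes are placeholders for illustration, not the real data.
arberry_rows = [("Against", 2, 23, 1, 1), ("All", 2, 22, 2, 1)]
irving_rows = [("Against", 2, 23, 1, 1), ("Alone", 1, 12, 3, 1)]

# Values of the variable trans as defined in the text.
TRANSLATOR_CODES = {"Arberry": 1, "Irving": 2}


def tag(rows, translator):
    """Put the translator's numeric code in front of every row."""
    code = TRANSLATOR_CODES[translator]
    return [(code,) + row for row in rows]


# Irving's rows are placed under Arberry's, as in the Data View sheet.
combined = tag(arberry_rows, "Arberry") + tag(irving_rows, "Irving")

for row in combined:
    print(row)
```

Because the first element of every row now tells us who the translator is, the two sets of data can be kept in a single file and compared with each other.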
9.3 Utilizing SPSS Facilities to Analyze Data

Now that we have entered our data in the SPSS Data View sheet, we can
use its Analyze menu to test the three hypotheses we formulated in
section 9.1, i.e.,
1. The semantic, syntactic and parasyntactic domains of the two
translations of Surah 1 will not be significantly different.
2. The semantic, syntactic and parasyntactic types of the two translations
of Surah 1 will not be significantly different.
3. The semantic, syntactic and parasyntactic tokens of the two
translations of Surah 1 will not be significantly different.
The statistical tests we employ in the analysis of our data must be
compatible with the type of data we have collected. Since our variables
are all categorical in nature we must use the Chi-Square /kī skweз/ test
which is represented by χ². We can employ this test via the Crosstabs
procedure in SPSS.
9.4 Crosstabs Procedure

When two categorical variables are analysed to explore whether they are
significantly different from each other, Pearson chi-square (χ²) is
utilized. In order to have the SPSS software calculate it for us, we need
to go to the Analyze menu, click on Descriptive Statistics and then go to
Crosstabs as shown in Figure 9.34.
Figure 9.34
Activating Crosstabs on SPSS
Figure 9.35
Crosstabs dialogue box
Upon clicking on Crosstabs, a dialogue box will appear as shown in
Figure 9.35. Click on the Translator variable to highlight it and then
click on the rightward arrow (►) to transfer it to the Row(s) box. Click
on the Domain variable in the big box on the left again to highlight it
and then click on the rightward arrow (►) in the middle to transfer it
to the Column(s) box. If we want to have graphs, too, we can click in
the small box labeled Display clustered bar charts.

Figure 9.36
Activating expected frequency and percentage
Since χ² is calculated on the basis of observed frequency, i.e., the
number of schemata used in the translations, and expected frequency,
i.e., the number of schemata we would normally expect if they were used
by chance, we need to calculate both. For this purpose we click on the
Cells icon at the bottom of the Crosstabs dialogue box, and another
dialogue box appears as shown in Figure 9.36. We must check the Expected
box. If we want to have the percentage of schema domains in our report,
we can check the Row box under Percentages. We can then click on
Continue to return to the Crosstabs dialogue box.
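The expected frequency of a cell is its row total multiplied by its column total and divided by the total number of cases. The Python sketch below applies this to the observed frequencies of the two translators reported in Table 9.2; the variable names are introduced only for this illustration.

```python
# Observed frequencies (semantic, syntactic, parasyntactic) from Table 9.2.
observed = {"Arberry": [23, 16, 3], "Irving": [21, 18, 2]}

row_totals = {name: sum(row) for name, row in observed.items()}
column_totals = [sum(column) for column in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected count = row total x column total / grand total.
expected = {
    name: [row_totals[name] * column / grand_total
           for column in column_totals]
    for name in observed
}

for name, row in expected.items():
    print(name, [round(value, 1) for value in row])
```

Rounded to one decimal place, the printed values can be compared with the Expected rows which SPSS reports in its crosstabulation.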
In order to tell the software what test to apply to the data, we need to
click on the Statistics icon at the bottom of the Crosstabs dialogue
box. Upon clicking on the Statistics icon, another dialogue box will
appear as shown in Figure 9.37. First, we will check the Chi-Square box
so that we can decide whether there is a significant difference between
translators in terms of their schema domains. As we can see in the
figure, we can explore correlations between categorical (nominal)
variables, categorical and interval variables and two or more ordinal
variables by checking the relevant boxes in the Crosstabs: Statistics
dialogue box.
Figure 9.37
Statistics available in Crosstabs
For example, if we want to find out whether two categorical variables
correlate with each other or not, we can click in the Phi and Cramér's V
box. If we wish to explore the relationship between a categorical
variable and an interval one such as scores on a proficiency test, we
must click in the Eta box.
Upon choosing the relevant statistics in the Crosstabs: Statistics
dialogue box, we have to click on the Continue icon shown in Figure
9.37. This click will take us back to the Crosstabs dialogue box. After
ensuring that we have chosen the right variables for analysis, we have
to click on the OK icon to have the SPSS software calculate the
requested statistics.
SPSS automatically produces three tables among which two are very
important and we must copy and paste them in our report. One of them
displays variable by variable crosstabulation as shown in Table 9.2. The
first row of numbers is the observed frequency so we can replace the
word count with observed to help the readers have a clear picture in their
mind.

Table 9.2
Translators by Schema Domain Crosstabulation
Translators   Count                  Domain                                  Total
                                     Semantic   Syntactic   Parasyntactic
Arberry       Observed               23         16          3               42
              Expected               22.3       17.2        2.5             42.0
Irving        Observed               21         18          2               41
              Expected               21.7       16.8        2.5             41.0
              % within translators   51.2%      43.9%       4.9%            100.0%
Total         Observed               44         34          5               83
              Expected               44.0       34.0        5.0             83.0
Figure 9.38
Bar chart of schema domains employed by two translators

Figure 9.38 presents the information given in Table 9.2 as a bar chart.
As we can see, while Arberry (1964) used 23 semantic schemata in his
translation of Surah 1, Irving (1985) employed 21. The smallest number
of schemata employed by both translators was in the parasyntactic
domain. Another outstanding feature of the two translations is that the
percentage of syntactic schemata used by Arberry, i.e., 38%, and Irving,
i.e., 44%, is much higher than that of political texts, i.e., 6%. (See
Table 9.1)
Table 9.3 presents the Pearson chi-square test conducted on schema
domains. Since the result obtained is not significant, i.e., χ² = .397,
df = 2, p = .820, it supports the first hypothesis that the semantic,
syntactic and parasyntactic domains of the two translations of Surah 1
will not be significantly different. The abbreviation df in column 3 of
Table 9.3 stands for degrees of freedom. It shows how free the
researcher can be in choosing an alternative. For example, if we have
three schemata belonging to different domains and we already know one of
them is semantic, we can readily say that the second schema is either
syntactic or parasyntactic. It is easily determined by subtracting 1
from the number of categories, i.e., 3-1=2. More generally, in a
crosstabulation it equals the number of rows minus 1 multiplied by the
number of columns minus 1, i.e., (2-1) × (3-1) = 2 for two translators
and three domains.
Table 9.3
Chi-square test of schema domains
Tests                          Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square             .397(a)   2    .820
Likelihood Ratio               .398      2    .820
Linear-by-Linear Association   .009      1    .924
N of Valid Cases               83
a 2 cells (33.3%) have expected count less than 5. The minimum expected count is 2.47.
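The Pearson chi-square value can be reproduced by hand from the observed frequencies: for each cell we square the difference between the observed and expected frequency, divide it by the expected frequency, and add up the results. The Python sketch below does this for the schema domains. The shortcut used for the probability, exp(-χ²/2), holds only when df = 2; for other degrees of freedom a statistical table or package is needed.

```python
import math

# Observed frequencies (semantic, syntactic, parasyntactic).
observed = [[23, 16, 3],   # Arberry
            [21, 18, 2]]   # Irving

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
n = sum(row_totals)

# Sum of (observed - expected) squared, divided by expected, over all cells.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * column_totals[j] / n
        chi_square += (count - expected) ** 2 / expected

# Degrees of freedom = (rows - 1) x (columns - 1).
df = (len(observed) - 1) * (len(column_totals) - 1)

# Valid for df = 2 only: upper-tail probability of the chi-square value.
p_value = math.exp(-chi_square / 2)

print(round(chi_square, 3), df, round(p_value, 3))
```

The three printed numbers can be checked against the Pearson Chi-Square row of Table 9.3.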

Table 9.4 presents the Pearson chi-square test conducted on schema
types. Similar to schema domains, the chi-square value obtained for
schema types is not significant, i.e., χ² = 3.999, df = 9, p = .911. It
therefore supports the second hypothesis that the semantic, syntactic
and parasyntactic types of the two translations of Surah 1 will not be
significantly different.
Table 9.4
Chi-square test of schema types
Tests                Value      df   Asymp. Sig. (2-sided)
Pearson Chi-Square   3.999(a)   9    .911
Likelihood Ratio     4.110      9    .904
N of Valid Cases     83
a 11 cells (55.0%) have expected count less than 5. The minimum expected count is .99.

Table 9.5 presents the Pearson chi-square test conducted on schema
tokens. Similar to schema domains and types, the chi-square value
obtained for schema tokens is not significant, i.e., χ² = 14.034, df =
22, p = .900. It therefore supports the third hypothesis that the
semantic, syntactic and parasyntactic tokens of the two translations of
Surah 1 will not be significantly different.
Table 9.5
Chi-square test of schema tokens
Tests                Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square   14.034(a)   22   .900
Likelihood Ratio     18.358      22   .685
N of Valid Cases     83
a 43 cells (93.5%) have expected count less than 5. The minimum expected count is .49.
9.5 Summary
Research projects sometimes require collecting and analyzing categorical
variables such as translators and the schemata they use in their
translation. The statistical analyses of these data are based on the
frequency of their categories or how many times the defined categories
of a given variable are observed among the sample chosen for study. In
this chapter, for example, we treated translators as a variable which
had two categories: 1 = Arberry, 2 = Irving. We also treated schema
domain as a variable which comprised three categories, 1 = semantic, 2 =
syntactic, and 3 = parasyntactic.
Since categorical variables are studied in terms of their frequency, they
are considered discrete. The gender variable, for example, consists of
two discrete categories of females and males. For this reason, specific
tests such as chi-square must be used to study significant differences
among categorical variables. Research projects are not, however, limited
to categorical variables. Some may call for the collection and analysis of
ordinal variables to which we will turn in chapter 10.
10 Statistical Analysis of Ordinal Variables
10.1 Introduction
The creation of man by the Almighty Allah has a direct bearing on
research because He employed an instrument to prove man’s superiority
over angels. According to the Quran (chapter 2: 31-33), God told the
angels that He was going to create man and set him as His viceroy³¹ on
the earth. They objected to his creation and asked why He would create a
creature who would corrupt the earth and shed blood, implying that they
deserved to be His viceroy more than man. In response to their
criticism, God said that He would give a test so that they would know
why He had chosen man.
The test which the Almighty Allah administered to both man and angels
consisted of nouns. In chapter 2 section 2.2.5 we talked about string
variables. We noticed that they are simply a set of characters or letters
such as bus. The string of letters which form nouns or other types of
semantic, syntactic and parasyntactic schemata are by themselves
meaningless unless they are associated with real events, objects, and
people in the mind.
When a child, for example, sees a bus for the first time in her life in
an English speaking environment and looks at her mum to tell her what it
is and her mum says, “bus,” she immediately forms a concept in her mind
by associating the noun bus with the object of bus as it exists in
reality. Figure 10.1 presents the association. If this concept gets
established in her mind and results in saying bus by seeing the vehicle
on the street or recalling its picture upon hearing the verbal noun bus,
it becomes a schema for that particular child.
31 Viceroy /'vīsroy/ n. the governor who represents God on the earth
Figure 10.1
Association of the noun bus with the real vehicle or its picture in human
mind
Bus ↔ stands for ↔ [picture of a bus]
According to the Quran (Al-Baqarah 2:31), angels lack the human mind
and cannot establish any association between a string of sounds/
characters/ letters and what they can represent in reality because when
the Almighty Allah presented both man and angels with the nouns and
asked them to tell Him what they stood for, only man could answer. In
other words, nouns stayed as a string of meaningless letters to the angels
but turned to schemata in man’s mind.
In chapter 9, we learnt that string variables become categorical if we can
say whether they differ from each other in kind. (We should notice that
telling the difference is a cognitive activity which occurs only in the
human mind.) For example, while Arberry (1964) employed compassionate as
the English equivalent of RAHIM, Irving (1985) used merciful. As two
string variables, compassionate and merciful are different because they
consist of different letters. As schema types, however, they both refer to
the same quality in the Almighty Allah, i.e., they are synonymous
adjectives that can be used interchangeably (See Urdang, 1977, p. 282).
Schematically speaking, therefore, compassionate and merciful are
appropriate English equivalents for the same Arabic schema RAHIM for
Arberry and Irving, respectively.
In conducting any research project, we need to perceive a problem. The
perception of a problem shows itself in the research questions we raise.
As we have already realized, some problems are categorical in nature. In
chapter 9, for example, our schema-based research showed that
Arberry’s (1964) translation of Surah 1 of the Quran does not differ
significantly from Irving’s (1985) because they both used almost the
same kinds of schemata in their translation. In this chapter, we will
study variables which differ from each other in rough degrees.
10.2 Characteristics of Ordinal Variables

Some abilities such as speaking a foreign language are ordinal by nature
because listeners rank speakers on the basis of certain subjective criteria
such as accent, fluency, tone, words chosen, and rhythm, to name a few.
There are other variables such as anxiety and willingness to
communicate which are ordinal by nature. Baker (1989) listed the
following characteristics for attitudes as the most widely studied ordinal
variables.
1. Attitudes are both cognitive and affective because we not only think
about something in our mind but also attach feelings and emotions to
it.
2. Attitudes are dimensional, for they vary in degree. They are
therefore not categorical or bipolar, i.e., either/or.
3. Although attitudes predispose learners to act in a certain way, their
relationship with actions is not a strong one.
4. Attitudes are not genetic but learned.
5. Although attitudes persist, they can be modified by experience or
teaching.
10.3 Quantifying Ordinal Variables

Instead of differing in kind, e.g., whether a schema is an adjective or
noun, ordinal variables differ in degree. For example, attitudes, personal
views of certain phenomena such as people and objects, are ordinal by
nature. In your English class, one of your classmates may say, “I don't
think what's happening overseas has much to do with my daily life.”
This personal view poses a question which requires an ordinal answer
having different degrees. You may, for example, strongly agree, agree,
agree somewhat, remain undecided, disagree or strongly disagree with
the view. Naturally, working with these degrees of agreement would be
difficult if we did not assign some sort of numerical values to them.
Assigning values to the degree of agreement a given language learner

expresses regarding a certain attitude provides a researcher with an
objective scale to study that attitude. For example, we can assign values
of one to seven to establish a scale with which your classmates’ attitude
towards “I don't think what's happening overseas has much to do with
my daily life,” can be measured as shown below.
Strongly   Agree   Agree      Undecided   Disagree   Disagree   Strongly
agree              somewhat               somewhat              disagree
1          2       3          4           5          6          7
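Once values are assigned, each verbal answer can be replaced by its number. A minimal Python sketch of the seven-point scale above is given below; the three responses are hypothetical answers invented for this illustration.

```python
# The seven degrees of agreement and the values assigned to them above.
SCALE = {
    "Strongly agree": 1,
    "Agree": 2,
    "Agree somewhat": 3,
    "Undecided": 4,
    "Disagree somewhat": 5,
    "Disagree": 6,
    "Strongly disagree": 7,
}

# Hypothetical answers given by three classmates to the same statement.
responses = ["Agree", "Undecided", "Strongly disagree"]

# Replace every verbal answer with its numerical value.
coded = [SCALE[answer] for answer in responses]
print(coded)
```

The numbers preserve only the order of the answers; they do not tell us that the distance between any two neighbouring degrees is the same.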
10.4 Relationship among Ordinal Variables

One of the recent research projects employing a self-report questionnaire
to study ordinal variables belongs to Ghobadi (2009). She modified and
administered Horwitz’s (1998) 34-item Beliefs about Language
Learning Inventory (BALLI) to 423 undergraduate and graduate
students of English at seven tertiary education centers to find out
whether their beliefs had any significant relationship with their language
proficiency.
Each belief on the BALLI is presented to language learners as an ordinal

variable having five ranks, i.e., Strongly agree, Agree, Undecided,
Disagree and Strongly disagree. Ghobadi (2009) assigned the values of
1 to 5, to the ranks, respectively. This very assignment of cardinal
numbers helps researchers like Ghobadi estimate the extent of whatever
relationships exist among ordinal variables such as beliefs held by
learners. The most widely explored relationships studied among ordinal
variables are correlational validity and factorial validity.
10.4.1 Correlational Relationships among Ordinal Variables

Table 10.1 presents one graduate and one undergraduate student’s
agreement with the first five beliefs explored by the BALLI. As can be
seen, these two students differ from each other as regards the five beliefs
shown in the table. For example, while the graduate student agrees that
some people are born with a special ability which helps them learn
English, the undergraduate disagrees.
Table 10.1
A graduate and undergraduate student’s agreement with five beliefs
Beliefs                                              Graduate         Undergraduate
                                                     (G)              (U)
1. It is easier for children than adults to learn    Strongly agree   Agree
   English.
2. Some people are born with a special ability       Agree            Disagree
   which helps them learn English.
3. Some languages are easier to learn than others.   Undecided        Agree
4. Learning English is very difficult.               Agree            Undecided
5. English is structured in the same way as          Undecided        Strongly disagree
   Persian.
Table 10.2 presents the values assigned to degrees of agreement with the
five beliefs expressed by graduate (G) and undergraduate (U) students.
We can now employ three different types of formula developed to study
the relationship between two ordinal variables: Spearman Correlation
Coefficient (ρ), Pearson Correlation Coefficient (r) and Kendall’s Tau
Correlation Coefficient (τ).
10.4.1.1 Spearman Correlation Coefficient (ρ)

The calculation of Spearman rank-order correlation coefficient (rho or ρ)
is based on the differences between two sets of ranks. The agreement
ranks of an undergraduate and graduate student are given in Table 10.2
along with ρ formula. As can be seen, the five beliefs of the two students
show a correlation coefficient of 0.45.

Table 10.2
Calculation of Spearman Rank-Order Correlation Coefficient (ρ)
Beliefs   G   U   G-U   d    d²
1         1   2   1-2   -1   1
2         2   4   2-4   -2   4
3         3   2   3-2    1   1
4         2   3   2-3   -1   1
5         3   5   3-5   -2   4
                        Σd² = 11

$$\rho = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} = 1 - \frac{6 \times 11}{5(25 - 1)} = 1 - \frac{66}{120} = 1 - 0.55 = 0.45$$
Calculating ρ even for five beliefs takes time. It can be much faster if
we enter the data in SPSS. Since we are exploring the correlation of
beliefs expressed by two participants, we only need to define two
variables, i.e., one for the graduate (Grad) and another one for the
undergraduate (UGra) participant as shown in Figure 10.2.
Figure 10.2
Two participants’ ranking as SPSS variables

To have SPSS estimate ρ, we need to go to the Analyze menu and move down
the list to Correlate. Highlighting this submenu will activate three
commands among which we need to click on Bivariate to activate the
Bivariate Correlation dialogue box as shown in Figures 10.3 and 10.4.

Figure 10.3
Correlation menu
Figure 10.4
Bivariate Correlation
If we transfer Grad and UGra to the Variables box via the rightward
arrow ► in the middle and click in the box appearing in front of
Spearman, the button OK becomes active as shown in Figure 10.5. Upon
clicking on the button OK, the SPSS will automatically calculate the ρ
coefficient.
Figure 10.5
Activating Spearman function
Table 10.3 presents the results produced by the SPSS. In addition to
estimating the ρ coefficient, the SPSS provides us with the significance
level of the correlation and thus saves us the time to check it manually
in tables given by statisticians. As can be seen, the five beliefs of
the undergraduate and graduate students do not show any significant
relationship, i.e., ρ = 0.35, p = .562. This coefficient is lower than
what we obtained manually, i.e., ρ = 0.45. The manual formula was
applied to the assigned values as they stand and assumes that no two
values in a set are equal, whereas both sets here contain tied values.
When tied values are given the average of their ranks, the coefficient
becomes 0.35. According to Howell (2002), scholars do not agree on an
accepted method to calculate the standard error of ρ for small samples.
As a result, computing confidence limits on Spearman rho “is not
practical” (p. 307).
Table 10.3
Spearman's rho obtained on five beliefs
Grad UGra
Correlation Coefficient 1.000 .351
Sig. (2-tailed) . .562
N 5 5
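The coefficient in Table 10.3 can be reproduced by hand if tied values are given the average of the ranks they occupy and the Pearson formula is then applied to those ranks. The sketch below assumes that this is how ties are treated; the helper names average_ranks and pearson are hypothetical and serve only as an illustration.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from lowest to highest; tied values share the mean rank."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1            # 1-based position of first occurrence
        count = ordered.count(v)                # number of tied values
        ranks.append(first + (count - 1) / 2)   # mean of the tied positions
    return ranks

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

grad = [1, 2, 3, 2, 3]
ugra = [2, 4, 2, 3, 5]

rho_ties = pearson(average_ranks(grad), average_ranks(ugra))
print(round(rho_ties, 3))   # 0.351
```

With the averaged ranks, the graduate student's rankings become 1, 2.5, 4.5, 2.5 and 4.5, which is why the result differs from the shortcut value of 0.45.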
10.4.1.2 Pearson Correlation Coefficient

Since the Pearson formula depends on mean and standard deviation, all
the ranks set for ordinal variables are treated as if they were scores
having equal intervals. After obtaining the mean of the scores, their
standard deviation is calculated as shown in Table 10.4. (Mean and
standard deviation are discussed in detail in Chapter 11.)
Table 10.4
Calculation of squared individual deviation scores on five beliefs
Beliefs   G   G-M=g          g²      U   U-M=u          u²
1         1   1-2.2=-1.2     1.44    2   2-3.2=-1.2     1.44
2         2   2-2.2=-0.2     0.04    4   4-3.2=+0.8     0.64
3         3   3-2.2=+0.8     0.64    2   2-3.2=-1.2     1.44
4         2   2-2.2=-0.2     0.04    3   3-3.2=-0.2     0.04
5         3   3-2.2=+0.8     0.64    5   5-3.2=+1.8     3.24
          ΣG=11              Σg²=2.80    ΣU=16          Σu²=6.80
          M=2.2                          M=3.2

Where:
G = Graduate student
U = Undergraduate student
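Variance ($V_G$) = $\frac{\Sigma g^2}{N-1} = \frac{2.80}{4} = 0.7$
Standard Deviation ($SD_G$) = $\sqrt{V_G} = \sqrt{0.7} = 0.84$
Variance ($V_U$) = $\frac{\Sigma u^2}{N-1} = \frac{6.80}{4} = 1.7$
Standard Deviation ($SD_U$) = $\sqrt{V_U} = \sqrt{1.7} = 1.30$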
Upon calculating the standard deviation of the two sets of scores obtained
on five beliefs, their z scores can be computed as shown in Table 10.5. If
we replace the sum of the products of z scores on five beliefs in the
formula $r_{GU} = \Sigma(z_G z_U)/(N-1)$, we will get a coefficient of
0.41, i.e., 1.63 ÷ (5-1) = 1.63 ÷ 4 = 0.41. (When you add up the products
of the z scores of the two participants, pay close attention to the
negative signs; otherwise, you will get a very high correlation!)
Table 10.5
Calculation of z scores on five beliefs
Beliefs g÷SD = zg u÷SD = zu zg × zu zgzu

1 -1.2÷.84= -1.43 -1.2÷1.30= -0.92 -1.43×-.92 1.32
2 - 0.2÷.84=-.24 +0.8÷1.30=0.62 -0.24×.62 -0.15
3 +0.8÷.84=.95 -1.2÷1.30=-0.92 0.95×-.92 -0.88
4 -0.2÷.84=-.24 -0.2÷1.30=-0.15 -0.24×-.15 0.03
5 +0.8÷.84=.95 +1.8÷1.30=1.38 0.95×1.38 1.31
ΣzGzU =1.63
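The steps followed in Tables 10.4 and 10.5 can be sketched in Python as shown below. The sketch assumes, as the hand calculation does, that the standard deviation is obtained by dividing by N-1; the function name z_scores is our own.

```python
from statistics import mean, stdev

grad = [1, 2, 3, 2, 3]   # column G in Table 10.4
ugra = [2, 4, 2, 3, 5]   # column U in Table 10.4

def z_scores(scores):
    """Convert raw scores to z scores using the N-1 standard deviation."""
    m, sd = mean(scores), stdev(scores)   # stdev() divides by N - 1
    return [(s - m) / sd for s in scores]

z_g, z_u = z_scores(grad), z_scores(ugra)

# r = sum of the products of paired z scores divided by N - 1
r = sum(a * b for a, b in zip(z_g, z_u)) / (len(grad) - 1)
print(round(r, 2))   # 0.41
```

Because the code does not round the intermediate z scores, its unrounded result (about .4125) is slightly closer to the value reported by the SPSS than the hand calculation is.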
We can now use SPSS to save time and labour. If we look at Figure 10.4
again, we will see Pearson appearing as the first and default choice.
Instead of clicking in the box labeled Spearman, we can check
Pearson, click on the OK and see what coefficient we will get from the
SPSS.

Table 10.6 presents the Pearson coefficient estimated by the SPSS. If we

compare the result obtained manually with the one obtained by the
SPSS, we will see that they are both the same, i.e., r = 0.41. Similar to
the Spearman coefficient, i.e., .351, Pearson correlation coefficient is not
significant.
Table 10.6
Pearson correlation coefficient obtained on the five beliefs
Grad UGra
Pearson Correlation 1 .413
Sig. (2-tailed) .490
N 5 5
10.4.1.3 Kendall Tau Coefficient (τ)

Kendall Tau is considered as a serious competitor to Spearman rho
because it is not based on treating ranks as scores. According to Howell
(2002), it is based on “the number of inversions in the rankings” (p.
308). In order to estimate Kendall τ, we need to list the beliefs in the
order of rankings given by one of the participants. This is done as an
example in Table 10.7.
Table 10.7
The ordering of five beliefs according to graduate student’s ranking
Participant Belief 1 Belief 4 Belief 2 Belief 3 Belief 5

Graduate 1 2 2 3 3
Undergraduate 2 3 4 2 5
After ordering the responses given by the graduate participant, we need

to draw a line to connect comparable rankings. This provides us with the
easiest way to calculate the number of inversions upon which the
Kendall τ is based (see Howell 2002, p. 308 for more information).
What we need then is to count the number of intersections of the lines

and put it in the formula. (No lines should be drawn when the two
participants give the same responses. The two participants whose beliefs
appear in Table 10.7 did not, of course, give the same responses at all.)
For example, the ranking given by the undergraduate student on Belief 1
is comparable to the ranking given by graduate participant on Belief 4
and her own ranking on Belief 3, so we need to draw lines as shown in
Table 10.7. After drawing the lines and specifying the intersections, we
can apply the formula to our data.
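$$\text{Kendall } \tau = 1 - \frac{2 \times \text{Number of inversions}}{N(N-1)/2}$$
$$\tau = 1 - \frac{2 \times 3}{5(5-1)/2} = 1 - \frac{6}{(5 \times 4)/2} = 1 - \frac{6}{10} = 1 - .6 = 0.4$$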
Similar to Spearman and Pearson correlations, we can employ the SPSS to
calculate the Kendall τ for us. What we need is to go to the Analyze
menu, highlight Correlate to activate the Bivariate Correlation dialogue
box and check the box labeled Kendall's tau-b as shown in Figure 10.6. (Do
not forget to transfer the two variables Grad and UGra to the Variables
box by using the arrow button appearing between the two boxes.)
Figure 10.6
Activating Kendall τ function
Table 10.8 presents the Kendall tau-b coefficient estimated by the SPSS.
If we compare the result obtained manually with the one obtained by the
SPSS, we will see that they are different, i.e., τ = 0.40 vs. 0.35. The
difference lies in the fact that version b of the formula used by the SPSS
adjusts the coefficient for tied ranks, and both participants gave the
same rank to more than one belief. However, neither the manually obtained
coefficient nor the one obtained by the SPSS is significant, and therefore
neither indicates a meaningful relationship between the two sets of
rankings.
Table 10.8
Kendall's tau_b correlation coefficient obtained by the SPSS on the five
beliefs
Grad UGra
Kendall's tau_b Correlation 1.000 .354
N 5 5
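The tau-b value in Table 10.8 can also be approached by counting pairs of beliefs instead of drawing lines. The sketch below is one way of doing so under the usual definition of tau-b, in which pairs tied on either ranking shrink the denominator; the function name kendall_tau_b is hypothetical.

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall tau-b: (C - D) / sqrt((n0 - ties_x) * (n0 - ties_y))."""
    concordant = discordant = ties_x = ties_y = 0
    pairs = list(combinations(zip(x, y), 2))
    for (x1, y1), (x2, y2) in pairs:
        if x1 == x2:
            ties_x += 1          # pair tied on the first ranking
        if y1 == y2:
            ties_y += 1          # pair tied on the second ranking
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1      # both rankings order the pair the same way
        elif product < 0:
            discordant += 1      # the rankings order the pair in opposite ways
    n0 = len(pairs)              # N(N - 1) / 2 pairs in total
    return (concordant - discordant) / sqrt((n0 - ties_x) * (n0 - ties_y))

grad = [1, 2, 3, 2, 3]
ugra = [2, 4, 2, 3, 5]

tau_b = kendall_tau_b(grad, ugra)
print(round(tau_b, 3))   # 0.354
```

When neither ranking contains ties, the denominator reduces to N(N-1)/2 and the result agrees with the simpler inversion formula.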
10.5 Factorial Validity of Ordinal Variables

In section 2.4 of chapter two, we became familiar with latent variables.
In this chapter we will study them in some detail in order to find out
how they contribute to human understanding.
Latent variables are based on dependent variables which provide

researchers with observable and measurable indexes of attributes such as
beliefs. For example, Ghobadi (2009) administered the Beliefs about
Language Learning Inventory (BALLI) to 423 undergraduate and
graduate students. Since the BALLI consists of 34 beliefs, each belief on
the questionnaire is an attribute which tells us what learners believe.
However, what the responses given to each belief on the BALLI do not
show is whether there is a relationship among certain beliefs. The
possible relationships among beliefs, which can be explored only through
certain statistical tests such as Factor Analysis, are referred to as
latent variables. According to Tucker and MacCallum (1997), they are more
fundamental than observed variables because “they cannot be directly
measured” (p. 2).
Two criteria are generally adopted to decide whether factor analysis is

suitable to analyze the collected data (e.g., Pallant, 2007). The first

involves the sample size. While Tabachnick and Fidell (2007) suggested
at least 300 cases, Nunnally (1978) suggested the ratio of 10 participants
to one item.
Ghobadi’s (2009) study meets the first criterion because it explored the
beliefs of 423 participants whose ratio to 34 beliefs is well over 10 to
one. The second criterion is the strength of intercorrelations among
items. They can be explored either by focusing on intercorrelation
coefficients or by employing statistical measures. For the former,
Tabachnick and Fidell (2007) stated that if few items showed
coefficients greater than 0.30, factor analysis might not be appropriate.
For the latter, two statistical measures are employed: Bartlett’s test of
sphericity (Bartlett, 1954), and the Kaiser-Meyer-Olkin (KMO) measure
of sampling adequacy (Kaiser 1970, 1974). Bartlett’s test of sphericity
should be significant (p <.05) for the factor analysis to be appropriate.
The KMO index ranges from 0 to 1, with .6 suggested as the minimum
value for a good factor analysis.
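The first criterion above amounts to two simple checks, which can be sketched as follows. The thresholds are the ones cited in the text (300 cases and 10 participants per item); the function name sample_size_checks is our own.

```python
def sample_size_checks(n_cases, n_items, min_cases=300, min_ratio=10):
    """Check the two sample-size suggestions cited for factor analysis."""
    ratio = n_cases / n_items
    return {
        "ratio": round(ratio, 2),               # participants per item
        "enough_cases": n_cases >= min_cases,   # at least 300 cases
        "enough_per_item": ratio >= min_ratio,  # at least 10 participants per item
    }

checks = sample_size_checks(n_cases=423, n_items=34)
print(checks)
```

For the 423 participants and 34 beliefs mentioned above, the ratio is about 12.4 participants per item, so both suggestions are met.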
10.5.1 Utilizing SPSS to Run Factor Analysis

In order to explore the latent variables within a given instrument such as
the BALLI, we need to employ the SPSS because it is virtually
impossible to estimate correlations among dependent variables such as
34 beliefs manually.
For this purpose we need to open the Variable View sheet of an SPSS file
and define 34 variables. We might, for example, type B01 under the
column Name and specify the type as numeric. Under the sixth column
called Values, we need to specify the five points presented with each
belief. Figure 10.7 shows the values assigned to B01.

Figure 10.7
Assigning values to an ordinal variable
We can follow the procedure described above and establish 34 beliefs

and then enter the data in the Data View sheet of the SPSS. In order to
have the SPSS compute whatever we need, we need to follow the steps
below.
From the main menu on the top of the SPSS Data View we click on
Analyze to activate another menu upon which we need to click on Data
Reduction. (The newer versions of the SPSS such as the IBM SPSS
Statistics 20 have changed the name to Dimension Reduction.) It offers
three submenus called Factor, Correspondence Analysis and Optimal
Scaling. Among these three submenus we need to choose Factor.
Figure 10.8 shows the steps described so far.

Figure 10.8
Activating Factor Analysis menu on the SPSS
Upon clicking on Factor, its dialogue box appears on the monitor which
requires us to specify the variables to be analyzed as shown in Figure
10.9.
Figure 10.9
Dialogue box requiring the specification of variables to be analysed

After selecting the observed variables and transferring them to the
Variables box, we need to click on the Descriptives button to activate
its dialogue box. In the box, there are two sections as shown in Figure
10.10. We need to check Initial solution in the Statistics section and
KMO and Bartlett’s test of sphericity in the Correlation Matrix section.
Figure 10.10
The Descriptives dialogue box of factor analysis
Upon clicking on the Continue button in the Factor Analysis:
Descriptives box, we will be taken back to the Factor Analysis dialogue
box. Click on the second button, Extraction, on the top left corner of the
box to activate the box shown in Figure 10.11. There are three sections in
the Factor Analysis: Extraction box. In the Analyze section we need
to select the Correlation matrix. In the Display section, we should
select Unrotated factor solution and Scree plot. In the Extract section,
we have to make sure that Eigenvalues over 1 is checked.

Figure 10.11
Activating Factor Analysis: Extraction box
Upon clicking on the Continue button in the Factor Analysis: Extraction
box, we will be taken back to the Factor Analysis dialogue box. Click on
the third button, Rotation, on the right-hand corner to activate its
box and select Varimax as shown in Figure 10.12.
Figure 10.12
Activating Factor Analysis: Rotation box
Table 10.9 presents the KMO and Bartlett's Test of the BALLI
administered to 418 graduate and undergraduate students majoring in
English in seven higher education institutions in Mashhad, Iran
(Khodadady 2009). As we can see, there are two key statistics in the
table, i.e., the Kaiser-Meyer-Olkin (KMO) Measure of Sampling
Adequacy and Bartlett's Test of Sphericity. According to DiLalla and
Dollinger (2006), the KMO statistic reflects the degree to which it is
likely that common factors explain the observed correlations among the
variables. It is calculated as the sum of the squared simple
correlations between pairs of variables divided by the sum of squared
simple correlations plus the sum of squared partial correlations. Partial
correlations approach zero as common factors account for increasing
variance among the variables.
Table 10.9
KMO and Bartlett's Test of 418 participants as adequate sample
Kaiser-Meyer-Olkin Measure of Sampling Adequacy .602
Bartlett's Test of Sphericity Approx. Chi-Square 1.262E3

df 561
Sig. .000
The KMO statistic will be higher when a common-factor model is
appropriate for the data. Small values for the KMO statistic indicate that
correlations between pairs of variables cannot be accounted for by
common factors. According to Kaiser (1974), KMOs in the .90s are
“marvelous,” in the .80s “meritorious,” in the .70s “middling,” in the
.60s “mediocre,” in the .50s “miserable” and below .50 “unacceptable”
(cited in DiLalla & Dollinger 2006, p. 250). Since the KMO reported in
Table 10.9 is .60 and falls in the .60s, subjecting the BALLI to factor
analysis is appropriate.
The second key measure presented in Table 10.9 is Bartlett’s test of
sphericity (Bartlett, 1954). When it is reported in a table like Table 10.9,
reporting its significance level would be enough. However, when the
table is not given to save space, its Chi-Square index and df must be
reported, i.e., χ² = 1262, df = 561, p < .001. These results show that
since the χ² is significant, the correlation matrix is not an identity
matrix.
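To see what lies behind the two statistics in Table 10.9, the following sketch computes them from a correlation matrix. It assumes the NumPy library is available and uses a small made-up matrix (R_demo) rather than the BALLI data; the function names kmo and bartlett_sphericity are hypothetical, and the Bartlett statistic follows the commonly cited form -(n - 1 - (2p + 5)/6) ln|R| with p(p-1)/2 degrees of freedom.

```python
import numpy as np

def kmo(R):
    """Overall KMO index from a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                    # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)     # mask selecting off-diagonal cells
    r2 = np.sum(R[off] ** 2)                  # sum of squared simple correlations
    p2 = np.sum(partial[off] ** 2)            # sum of squared partial correlations
    return r2 / (r2 + p2)

def bartlett_sphericity(R, n_cases):
    """Chi-square statistic and df of Bartlett's test for correlation matrix R."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _sign, logdet = np.linalg.slogdet(R)      # natural log of the determinant
    chi2 = -(n_cases - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi2, df

# Made-up correlation matrix for three variables (illustration only)
R_demo = [[1.0, 0.5, 0.5],
          [0.5, 1.0, 0.5],
          [0.5, 0.5, 1.0]]

kmo_demo = kmo(R_demo)
chi2_demo, df_demo = bartlett_sphericity(R_demo, n_cases=100)
df_balli = 34 * (34 - 1) // 2   # degrees of freedom for 34 beliefs

print(round(kmo_demo, 2), round(chi2_demo, 2), df_demo, df_balli)
```

The last line of the calculation shows where the df of 561 in Table 10.9 comes from: 34 beliefs form 34 × 33 ÷ 2 = 561 distinct pairs.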
The second table the SPSS provides us with is the table of
communalities. Table 10.10 presents the communalities of 34 beliefs
extracted through Principal Component Analysis (PCA). As we can see,
each belief starts from an initial communality of 1. The components
themselves are evaluated by their eigenvalues, i.e., the sum of the
squared loadings on a factor (Vollmer & Sang, p. 43). The higher the
eigenvalue, the more it is likely that this factor represents variance in
some meaningful way. This would be even more true when at least two
variables have their highest loadings on just this factor. For several
reasons the limit of an eigenvalue of one has been established (cf.
Kaiser, 1958) as a criterion for stopping the process of extracting
factors.
Table 10.10
The extraction of communalities via Principal Component Analysis
Belief Initial Extraction Belief Initial Extraction Belief Initial Extraction

B01 1 0.51 B13 1 0.62 B25 1 0.64
B02 1 0.60 B14 1 0.60 B26 1 0.61
B03 1 0.72 B15 1 0.60 B27 1 0.60
B04 1 0.59 B16 1 0.47 B28 1 0.70
B05 1 0.67 B17 1 0.59 B29 1 0.67
B06 1 0.68 B18 1 0.57 B30 1 0.52
B07 1 0.43 B19 1 0.63 B31 1 0.63
B08 1 0.58 B20 1 0.55 B32 1 0.66
B09 1 0.57 B21 1 0.71 B33 1 0.66
B10 1 0.58 B22 1 0.56 B34 1 0.49
B11 1 0.53 B23 1 0.63
B12 1 0.57 B24 1 0.58
Extraction Method: Principal Component Analysis

In addition to communalities, we must also present the total variance

explained by each component. Table 10.11 shows the total variance
extracted through the PCA. As we can see, 14 components have an
eigenvalue of over one. This means that based on the PCA, the BALLI
measures 14 latent variables, i.e., the components upon which the
observed 34 beliefs have the highest loadings. It is commonly accepted
that if the loading of a variable on a factor is less than .30, that factor
loading can be safely ignored. Thus, factors with loadings of .30 or less
can be eliminated and this criterion can be used to terminate factor
extraction.
Table 10.11
Total Variance Explained
Component   Initial Eigenvalues                     Extraction Sums of Squared Loadings
            Total   % of Variance   Cumulative %    Total   % of Variance   Cumulative %
1 2.85 8.39 8.39 2.85 8.39 8.39
2 2.16 6.36 14.75 2.16 6.36 14.75
3 1.68 4.94 19.69 1.68 4.94 19.69
4 1.58 4.66 24.35 1.58 4.66 24.35
5 1.55 4.56 28.91 1.55 4.56 28.91
6 1.41 4.14 33.05 1.41 4.14 33.05
7 1.37 4.03 37.08 1.37 4.03 37.08
8 1.22 3.59 40.67 1.22 3.59 40.67
9 1.18 3.47 44.14 1.18 3.47 44.14
10 1.13 3.33 47.47 1.13 3.33 47.47
11 1.10 3.22 50.69 1.10 3.22 50.69
12 1.05 3.09 53.78 1.05 3.09 53.78
13 1.03 3.04 56.83 1.03 3.04 56.83
14 1.01 2.98 59.80 1.01 2.98 59.80

15 0.94 2.75 62.56
16 0.92 2.70 65.26
17 0.89 2.62 67.88
18 0.88 2.58 70.47
19 0.84 2.48 72.94
20 0.81 2.39 75.34
21 0.79 2.31 77.65
22 0.78 2.30 79.95
23 0.74 2.18 82.14
24 0.70 2.06 84.19
25 0.67 1.97 86.16
26 0.63 1.84 88.00
27 0.62 1.81 89.82
28 0.58 1.71 91.53
29 0.55 1.63 93.16
30 0.52 1.54 94.70
31 0.49 1.45 96.15
32 0.48 1.40 97.55
33 0.45 1.31 98.87
34 0.38 1.13 100.00
Extraction Method: Principal Component Analysis.
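The figures in Table 10.11 follow from the eigenvalues alone: each percentage of variance is the eigenvalue divided by the number of variables (34), and the components retained are those with an eigenvalue over one. The sketch below uses the rounded eigenvalues printed in the table, so its percentages may differ from the table in the second decimal place; the variable names are our own.

```python
# Eigenvalues of the 34 components as printed in Table 10.11
eigenvalues = [2.85, 2.16, 1.68, 1.58, 1.55, 1.41, 1.37, 1.22, 1.18, 1.13,
               1.10, 1.05, 1.03, 1.01, 0.94, 0.92, 0.89, 0.88, 0.84, 0.81,
               0.79, 0.78, 0.74, 0.70, 0.67, 0.63, 0.62, 0.58, 0.55, 0.52,
               0.49, 0.48, 0.45, 0.38]

n_variables = len(eigenvalues)                       # 34 beliefs
retained = [e for e in eigenvalues if e > 1]         # eigenvalue-over-one rule
pct_variance = [100 * e / n_variables for e in eigenvalues]
cumulative_retained = sum(pct_variance[:len(retained)])

print(len(retained))                   # 14
print(round(cumulative_retained, 1))   # 59.8
```

In other words, the 14 retained components together account for about 60 percent of the variance in the 34 beliefs.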
Table 10.12 presents the component matrix obtained via PCA. As we

can see, the 34 beliefs of the BALLI load on 14 components. What we

need to do now is to spot loadings higher than .30 in order to see which
observed variables form the 14 latent variables.
Table 10.12
Component matrix
Components
1 2 3 4 5 6 7 8 9 10 11 12 13 14
B01 .42 -.06 .21 -.12 -.03 .13 -.22 .03 .24 -.36 .09 -.06 .07 -.03
B02 .20 .01 .09 .36 -.02 .08 -.32 -.22 .45 .18 .09 .08 .11 .08
B03 .12 -.04 -.07 .14 -.42 .17 .36 .10 .20 -.17 .42 .02 -.30 .04
B04 -.05 -.46 .32 -.25 -.09 -.12 -.12 -.07 .11 .27 -.11 -.09 .25 .09
B05 .06 .17 .16 -.21 .23 -.08 -.06 .36 .42 .33 -.09 .09 -.09 .26
B06 -.01 .58 -.19 .03 .19 -.06 .12 -.03 -.17 .20 -.17 .06 -.15 .36
B07 .44 .02 -.22 .22 .27 -.09 .06 .16 .08 -.08 -.03 .13 .02 .02
B08 .37 .28 .19 -.05 .12 .26 -.27 .08 .13 .13 -.04 -.01 -.02 -.36
B09 .11 -.12 .15 .32 .51 .08 -.01 -.03 -.03 -.26 .04 .17 .15 .17
B10 .18 .34 -.06 -.04 -.17 -.04 .20 -.06 -.08 .02 .46 -.01 .28 .24
B11 .21 .03 .38 -.19 -.06 .03 -.10 .17 -.11 -.09 .14 .37 -.09 .27
B12 .36 .24 .09 -.21 .03 -.01 -.15 .00 -.35 .13 .25 .25 .04 -.22
B13 .10 .27 .16 -.22 -.41 .39 .16 .09 .17 -.03 -.17 .20 .01 -.11
B14 .21 .56 -.10 .14 -.27 -.12 -.15 .03 .10 .22 -.11 -.11 -.01 .10
B15 .42 -.31 -.42 .00 -.08 .24 .02 .02 -.12 -.03 -.18 .07 -.12 .15
B16 .34 -.01 .02 .09 -.25 .04 -.18 .35 -.19 -.16 -.19 -.01 -.13 .08
B17 .17 .51 -.14 .08 .25 .16 -.16 -.28 -.07 .04 .01 -.21 -.16 -.07
B18 .26 -.16 .04 .08 .36 -.23 .06 .35 .16 .05 .28 -.07 -.23 .03
B19 .41 -.39 -.42 .07 -.02 .11 -.06 -.23 -.04 .07 -.05 -.01 .09 .21
B20 .56 -.07 .13 .03 -.12 -.18 -.24 -.17 -.24 -.08 .05 -.09 -.08 -.08
B21 -.03 .03 -.10 .49 -.16 .01 .13 .24 -.16 .20 -.03 .43 .16 -.27
B22 .44 .14 -.07 .02 .01 -.28 .30 .13 .13 -.10 -.21 -.04 .30 -.03

B23 .02 .02 .29 .27 .24 .48 .23 .06 -.18 .06 -.27 .04 .11 .05
B24 .11 .13 .51 .09 -.07 .11 -.10 -.36 -.12 -.14 -.07 .09 -.04 .27
B25 .35 -.45 -.14 -.07 .11 .35 -.11 .10 -.04 .30 .14 .02 .14 .03
B26 .52 .06 .06 -.10 -.07 -.27 .19 -.02 .04 .00 -.16 -.12 .33 -.22
B27 .24 -.21 .13 -.11 -.22 .19 .17 .07 -.18 .44 -.04 -.29 -.19 .09
B28 .06 .07 .25 .54 .03 .15 .06 .11 -.04 .10 .24 -.46 .04 -.10
B29 .18 -.10 .37 -.10 .18 -.22 .47 -.23 -.16 .26 .15 .12 -.01 -.03
B30 .39 -.06 .09 -.18 .22 -.16 .08 .08 -.15 -.14 -.15 -.12 -.37 -.14
B31 .48 -.06 -.07 .21 -.21 -.25 .08 -.35 .21 .07 -.07 .16 -.17 .05
B32 .08 -.01 -.03 -.15 .23 .28 .40 -.34 .32 .00 -.07 .11 -.21 -.23
B33 .23 .25 -.15 -.37 .13 .31 .22 .08 .01 -.20 .08 -.23 .30 .18
B34 .00 .11 -.35 -.31 .19 .07 -.23 -.08 -.05 .15 .27 .18 .04 -.16
Extraction Method: Principal Component Analysis
As we can see in Table 10.12 above, spotting loadings equal to or higher

than .30 is really difficult and we may make mistakes when we do it
visually. The SPSS has a command which suppresses loadings
according to the index specified by its users.
If you remember, on the top right corner of the Factor Analysis box,
there were some buttons. We need to click on the Options button to
activate its box. There are two sections in the box: Missing Values and
Coefficient Display format. In the latter section, there are two boxes as
shown in Figure 10.13. If you click in the box at the beginning of
Suppress absolute values less than: its front box becomes active. You
can now type .30 and click on Continue in order for the SPSS to
provide you with the loadings equal to or higher than .30.
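What the Suppress command does can be imitated with a few lines of Python. The sketch below copies three rows from Table 10.12 and keeps only the loadings whose absolute value is .30 or higher; the function name suppress and the dictionary layout are our own choices, not SPSS terms.

```python
# Three rows copied from Table 10.12 (loadings on components 1 to 14)
loadings = {
    "B01": [.42, -.06, .21, -.12, -.03, .13, -.22, .03, .24, -.36, .09, -.06, .07, -.03],
    "B07": [.44, .02, -.22, .22, .27, -.09, .06, .16, .08, -.08, -.03, .13, .02, .02],
    "B08": [.37, .28, .19, -.05, .12, .26, -.27, .08, .13, .13, -.04, -.01, -.02, -.36],
}

def suppress(row, cutoff=0.30):
    """Keep loadings whose absolute value is at least the cutoff.

    Returns a dictionary mapping component number (1-based) to loading.
    """
    return {i + 1: value for i, value in enumerate(row) if abs(value) >= cutoff}

salient = {belief: suppress(row) for belief, row in loadings.items()}
for belief, kept in salient.items():
    print(belief, kept)
```

For B01, for example, only the loadings on components 1 and 10 remain, which is what the suppressed output of the SPSS shows for this belief.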

Figure 10.13
Activating Suppress absolute values less than
Table 10.13 presents the loadings equal to or higher than .30. The
suppression of the loadings less than .30 makes the identification of
acceptable loadings much easier. However, the interpretation of what the
loadings mean and what the factors upon which the observed variables
load stand for is not as simple as their identification. In fact, the
researchers need to employ their common sense, reasoning power along
with other researchers’ findings to give sense to the factors they extract
from their data. In other words, factor analysis is a test employed in
inferential statistics where the researchers must interpret what their
findings express.
Table 10.13
Component matrix with the loadings less than .30 suppressed
Beliefs   Loadings of .30 or higher (component number in parentheses)
B01       .42 (1), -.36 (10)
B02       .36 (4), -.32 (7), .45 (9)
B03       -.42 (5), .36 (7), .42 (11)
B04       -.46 (2), .32 (3)
B05       .36 (8), .42 (9), .33 (10)
B06       .58 (2), .36 (14)
B07       .44 (1)
B08       .37 (1), -.36 (14)
B09       .32 (4), .51 (5)
B10       .34 (2), .46 (11)
B11       .38 (3), .37 (12)
B12       .36 (1), -.35 (9)
B13       -.41 (5), .39 (6)
B14       .56 (2)
B15       .42 (1), -.31 (2), -.42 (3)
B16       .34 (1), .35 (8)
B17       .51 (2)
B18       .36 (5), .35 (8)
B19       .41 (1), -.39 (2), -.42 (3)
B20       .56 (1)
B21       .49 (4), .43 (12)
B22       .44 (1), .30 (7 or 13)
B23       .48 (6)
B24       .51 (3), -.36 (8)
B25       .35 (1), -.45 (2), .35 (6)
B26       .52 (1), .33 (13)
B27       .44 (10)
B28       .54 (4), -.46 (12)
B29       .37 (3), .47 (7)
B30       .39 (1), -.37 (13)
B31       .48 (1), -.35 (8)
B32       .40 (7), -.34 (8), .32 (9)
B33       -.37 (4), .31 (6), .30 (13)
B34       -.35 (3), -.31 (4)
For example, as can be seen in Table 10.13 above beliefs 1, 7, 8, 12, 15,
16, 19, 20, 22, 25, 26, 30 and 31 all load meaningfully on component
one. If we agree with Tucker and MacCallum (1997), then we should
accept component one as a latent variable consisting of 13 beliefs which
measure a macro belief which cannot be measured directly by any of
these beliefs alone. This means that we need to look for a feature shared
by these 13 beliefs.
Table 10.14 presents the 13 beliefs loading meaningfully on the
extracted component one along with the logical areas the designer of the
BALLI believed they addressed. As can be seen, these beliefs address
four logically different areas of foreign language learning. It is really
difficult, if not psycholinguistically impossible, to claim that belief 1,
which deals with child language learning, is closely related to learning
many grammar rules, i.e., belief 19, or practicing English in the
laboratory, i.e., belief 20. Reasoning based on these arguments can then
be employed to question either the validity of the beliefs or the
suitability of the statistical test used in the study, i.e., Principal
Component Analysis.

Table 10.14
Logical areas of 13 beliefs loading on component one
Logical area: Foreign language aptitude
  01. It is easier for children than adults to learn English.
  31. People who speak more than one language well are very intelligent.
Logical area: The nature of language learning
  08. It is necessary to know English culture in order to speak it.
  15. Learning English is mostly a matter of learning many new
      vocabulary words.
  19. Learning English is mostly a matter of learning many of grammar
      rules.
  25. Learning English is mostly a matter of translating from English into
      Persian.
Logical area: Learning and communication strategies
  07. It is important to speak English with an excellent accent.
  12. If I heard some people speaking English, I would go up to them so
      that I could practice speaking the language.
  16. It is important to repeat and practice often.
  20. It is important to practice in the language laboratory.
Logical area: Motivations and expectations
  22. If I speak English very well, I will have many opportunities to use
      it.
  26. If I learn to speak English very well, it will help me to get a good
      job.
  30. I would like to learn English so that I can get to know its speakers
      better.
10.6 Criticism of Ordinal Variables

Attitudes as the most widely explored ordinal variables are measured
through different instruments. Self-report questionnaires are one of the
direct instruments through which a certain number of statements are
presented to participants. They are then required to give their personal
views on these statements on the basis of a five- or seven-point scale.
Gardner and Lambert (1972), for example, employed this kind of
questionnaire to study their participants' reasons for learning French.
Although some scholars like Oller (1977, 1981) have questioned the
validity and reliability of direct measures of attitudes on the basis of
learners' tendency towards giving socially desirable answers, self-report
questionnaires have been widely used to explore the effect of attitudes
on second language learning (see Ellis, 1994).
10.7 Summary
Ordinal variables provide the first quantitative measures through which
human attributes such as beliefs and attitudes can be objectively
measured and then subjected to inferential statistical tests such as factor
analysis and Amos (Arbuckle & Wothke 1999).
Ordinal variables seem to lend themselves best to designing

questionnaires in which the respondents choose one of the alternatives
presented in a hierarchical order. They are required to choose the
alternative which best describes what they believe, feel or think.
Although ordinal variables are widely used in almost all fields of
science, they are limited in scope simply because their ordinal
alternatives are rather few. When a mental ability such as reading
comprehension is under investigation, we need another type of
psychometric variable called interval. We will discuss it in chapter 11.

11 Working with Interval Variables
11.1 Introduction
If you remember, we talked about Faravani’s (2004) research project in
chapter 3. She conducted her project in order to answer the question,
“Can reading portfolios increase students’ achievement more than
traditional tests?” In order to answer the question, she formulated the
null hypothesis: “there is no significant difference between the means of
pre-tests and post-tests of schema-based cloze multiple choice item test
(achievement) for the experimental and control groups after the
implementation of reading portfolios.”
For testing her null hypothesis above Faravani (2004) designed a

schema-based cloze multiple choice item test (S-Test) consisting of 69
items. This means that she manipulated the S-Test as a variable whose
intervals ranged from 1 to 69. She administered the test two times: once
before she started to teach (pretest) and once at the end of the term
(posttest).
Furthermore, Faravani (2004) administered the S-Test as a pre-and-post

test to two groups: experimental and control in order to compare them
with each other (see chapter 3, section 3.3). We will work with the
scores of these two groups to find out how they can be employed to test
hypotheses.
11.2 Raw scores

Raw scores are the marks students obtain when they take a test. The raw
scores given below were, for example, obtained by the control
group on the S-Test administered as a pretest in Faravani’s (2004) study.

11, 6, 22, 33, 27, 22, 26, 27, 29, 25, 31, 15, 32
Raw scores provide some preliminary information regarding their

obtainers. However, before we do anything with the raw scores, we had
better sort and arrange them from the lowest to the highest as shown
below.
6, 11, 15, 22, 22, 25, 26, 27, 27, 29, 31, 32, 33
By counting the raw scores above, we can say that there are only 13
students in the control group. While two students have scored 22 and the
other two 27, the rest have performed differently on the S-Test. In other
words, with the exception of 22 and 27, nine scores have a frequency of
one. We can then use these frequencies to calculate the percentage and
percentile of raw scores.
Table 11.1 presents the steps involved in calculating the percentage and
percentile of raw scores (X). First, we need to arrange the scores from
the highest to the lowest and put them in column X. The frequency of each
score should then be shown in column F. As we can see in the table, the
scores of 33, 32, 31 and 29 have, for example, an F of 1. Scores 27 and
22, however, have an F of 2, indicating that two students have scored 27
and two others have scored 22.
Based on the frequency given in column F, we can now calculate the

relative frequency (RF) of each score. An RF shows what percentage of
students has obtained a given score on the S-Test. As can be seen in
Table 11.1, out of 13 students, one has scored 33 on the S-Test. If we
divide 1 by 13, i.e., 1÷13, we will get .07, indicating that out of 100
percent, only seven percent have scored 33 on the test.
By employing the frequency given in column F, we can calculate the
cumulative frequency (CF) for each score and present it in a column
having the same name. In order to obtain the CF, we need to add up the
F of each score successively from the bottom of column F. The
frequency given for the lowest score, i.e., 6, at the bottom of column F
is 1, so we insert 1 in the last cell of column CF. Then we look at the F
of the second lowest score, i.e., 1, and add it up with the F of the last
score to get the CF of the second lowest score, i.e., 1+1=2. To obtain
the CF of the third lowest score, we add up the CF of the second lowest
score with the F of the third, i.e., 2+1=3 and enter it in the third cell
from the bottom of the CF column. The same procedure must be followed for
the rest of the scores as shown in Table 11.1.
Table 11.1
Percentage and percentile of a set of raw scores
Test taker   X    F*   CF**   Relative frequency (RF)   Percentage (RF×100)   Percentile (CF/N)×100
1            33   1    13     1/13=.07                  .07×100=7             (13/13)×100 = 100
2            32   1    12     1/13=.07                  .07×100=7             (12/13)×100 = 92
3            31   1    11     1/13=.07                  .07×100=7             (11/13)×100 = 85
4            29   1    10     1/13=.07                  .07×100=7             (10/13)×100 = 76
5 & 6        27   2    9      2/13=.15                  .15×100=15            (9/13)×100 = 69
7            26   1    7      1/13=.07                  .07×100=7             (7/13)×100 = 54
8            25   1    6      1/13=.07                  .07×100=7             (6/13)×100 = 46
9 & 10       22   2    5      2/13=.15                  .15×100=15            (5/13)×100 = 38
11           15   1    3      1/13=.07                  .07×100=7             (3/13)×100 = 23
12           11   1    2      1/13=.07                  .07×100=7             (2/13)×100 = 15
13           6    1    1      1/13=.07                  .07×100=7             (1/13)×100 = 7
N = 13; RF = F/N
*F = Frequency
**CF = Cumulative Frequency
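The columns of Table 11.1 can be generated from the raw scores with a short Python sketch. It is only an illustration; the names rows and by_score are our own. The code keeps unrounded values, so a few of its figures differ slightly from the table, which appears to cut off rather than round some decimals (e.g., 1 ÷ 13 is about .077).

```python
from collections import Counter

# Pretest scores of the control group, in the order given in the text
scores = [11, 6, 22, 33, 27, 22, 26, 27, 29, 25, 31, 15, 32]
n = len(scores)
freq = Counter(scores)                 # F: how many students obtained each score

rows = []
cf = 0
for x in sorted(freq):                 # accumulate from the lowest score upward
    cf += freq[x]
    rows.append({
        "X": x,
        "F": freq[x],
        "CF": cf,                      # cumulative frequency
        "RF": freq[x] / n,             # relative frequency
        "percentile": 100 * cf / n,    # (CF / N) x 100
    })

by_score = {row["X"]: row for row in rows}

for row in reversed(rows):             # print from the highest score, as in Table 11.1
    print(row["X"], row["F"], row["CF"],
          round(row["RF"], 2), round(row["percentile"], 1))
```

Looking up by_score[22], for instance, gives a cumulative frequency of 5 and a percentile of about 38, as in the table.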
Once we have the CF of each score, we can calculate its percentile. Each
percentile shows the relative standing of an individual within a
particular group. For example, the percentile of students 9 and 10 who
have scored 22 in Table 11.1 shows that they have scored as high as or
higher than 38% of the class.

However informative a given student’s rank within a particular group
might be, it does not reveal any accurate information regarding her
ability. We, therefore, need to find some procedures to render the scores
more informative so that we can make a better decision regarding
participants’ performance on research instruments. They include central
tendency and dispersion as discussed below.
11.3 Central Tendency

As the name indicates, central tendency of scores allows us to see what
score falls in the center of the sample whose performance we wish to
study. There are three procedures to determine central tendency within a
set of scores: mode, median and mean.
11.3.1 Mode
In order to find the mode of a set of scores, we ought to sort and arrange
them from the lowest to the highest or vice versa and then look for the
most frequent score(s). The 13 sorted scores obtained by Faravani’s
control group are given below. As we can see, the very act of sorting the
scores, allows us to find the modes, i.e. 22 and 27, easily and
immediately.
6, 11, 15, 22, 22, 25, 26, 27, 27, 29, 31, 32, 33
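The same result can be obtained with Python's standard statistics module, whose multimode function returns every score sharing the highest frequency. The variable names in this small sketch are our own.

```python
from statistics import multimode

control = [6, 11, 15, 22, 22, 25, 26, 27, 27, 29, 31, 32, 33]

modes = multimode(control)   # all scores that occur most often
print(modes)                 # [22, 27]
```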
11.3.2 Median
The median of a set of scores refers to the score which falls in the
middle. In other words, 50% of scores fall below the median and 50%
above. If the set consists of an odd number of scores, then one of them
will fall exactly in the middle. Since there are 13 sorted scores in the
set below, six will be on the right of the median and six on its left.
6, 11, 15, 22, 22, 25, [26], 27, 27, 29, 31, 32, 33
Median = 26
If an observed set consists of an even number of raw scores, we need to
sort them from the lowest to the highest, find the two scores which fall
in the middle, add them up and divide the sum by two in order to find the
median. For example, the sorted scores below were obtained by 10 students
in Faravani’s (2004) experimental group. Since the scores 23 and 24 are
the middlemost scores, the median of the set will therefore be (23 +
24)/2 = 23.5.
20, 20, 21, 21, 23, [23.5], 24, 26, 27, 34, 35
Median = 23.5
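Both cases, i.e., an odd and an even number of scores, are handled by the median function of Python's standard statistics module, as the following sketch shows. The variable names are our own.

```python
from statistics import median

control = [6, 11, 15, 22, 22, 25, 26, 27, 27, 29, 31, 32, 33]   # 13 scores
experimental = [20, 20, 21, 21, 23, 24, 26, 27, 34, 35]         # 10 scores

median_control = median(control)             # the middle score of an odd-sized set
median_experimental = median(experimental)   # the average of the two middle scores

print(median_control, median_experimental)   # 26 23.5
```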
11.3.3 Mean
Although mode and median are the easiest ways of finding the typical
score in an observed set, they suffer from two main shortcomings. First,
they depend on the number of raw scores and usually differ from each
other. For example, while the mode for Faravani’s control group was 22
and 27, its median turned out to be 26. Secondly, they are based on
either the frequency of the scores or the middle scores and thus fail to
take all the raw scores into account.
In contrast to mode and median, the mean is calculated by adding up all
the scores and dividing the sum by the number of test takers as shown in
the simple formula below.
\[ \text{Mean} = \sum X / N \]
The Greek capital letter Σ is conventionally used to show adding up. The
capital letter X refers to individual raw scores. The sloping line, or
solidus[32] (/), shows division. And finally the capital letter N indicates
the number of test takers. We can now apply the formula to the scores
obtained by 13 participants in the control group.
\[ \text{Mean} = \sum X / N = (6+11+15+22+22+25+26+27+27+29+31+32+33)/13 = 306/13 = 23.54 \]
[32] Solidus /ˈsɒlɪdəs/ n. a line sloping from right to left (/) and used to show division
The mean value obtained above, i.e., 23.54, is distinctly different from
the median, i.e., 26. (The actual mean estimated by a calculator is
23.538461538461538461538461538462. As we can see, there are 30
fractional digits in this value, which might need to be reported in
scientific fields such as physics and chemistry. In the social sciences,
however, usually only two decimal places, i.e., .54, are reported. Note
that we need to round .53 up to .54 because the third decimal digit,
i.e., 8, is higher than 5.)
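The same calculation, including the rounding to two decimal places, can be sketched in Python as follows. The variable names are our own and serve illustration only.

```python
# Raw scores of the 13 control-group participants
control = [6, 11, 15, 22, 22, 25, 26, 27, 27, 29, 31, 32, 33]

total = sum(control)          # ΣX, the sum of all raw scores
n = len(control)              # N, the number of test takers
mean = total / n              # unrounded mean
mean_2dp = round(mean, 2)     # the two decimal places usually reported

print(total, n, mean, mean_2dp)
```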
Since the mean of a set of raw scores is calculated on the basis of all
individual scores, we can be sure that it is the most representative of test
takers’ performance. If a participant who scored 15 on the schema-based
cloze MCIT asked Faravani (2004), for example, how she performed on
the test, she could use the mean and say that her score was lower than
that of the average student in the class, i.e., 23.5, and therefore she
needed to exert some extra effort to catch up with the rest of the class. (Notice that
we do not employ the total number of items on the test, i.e., 69, as a
criterion to assess her performance.)
If you remember, Faravani (2004) administered the schema-based cloze
MCIT as a pre-test. Since the mean on this test was 23.5, she could
easily decide that her control group did not know the material she was
going to teach as the main part of her research. That is why the
performance of the student who has scored 15 on the test should not be
assessed on the basis of total number of test items, i.e., 69. The
particular student could, however, ask how far her performance was
from the mean. We discuss the distance between scores when we study
dispersion.
11.2 Variation
When we report raw scores, we must specify what the maximum score
on the test would be if a test taker answered all the items. For example, a
student who has scored 23 on a test consisting of 69 items, must be told
that out of 69 pieces of construct measured by the test, s/he has mastered
only 23. Reporting raw scores on a pre-test alone without drawing the
participants’ attention to its mean, however, can be disappointing and
thus affect them adversely. (They may, for example, decide not to
participate in the research project any more.) We should therefore
describe the purpose of the test as thoroughly as possible and draw their
attention to the mean of the class.
In contrast to individual raw scores, the mean of a set of scores provides
its users with a measure of test takers’ typical performance rather than
the total number of test items. If we divide the number of items on the
schema-based cloze MCIT by 2, i.e., 69 / 2, we will get 34.5. This
means that the student who scored 23 on the test has actually failed
to answer even half of the items correctly. However, when we inform the same
student about the mean, i.e., 23.5, the whole picture changes. It provides
her with a valid criterion to realize that she is nearly average in her
reading comprehension ability when compared to her classmates. But
where would this test taker fall if her ability was compared to all the
students who study English at the same level and took the same pretest?
In other words, how does her score differ from other scores?
11.2.1 Range
The range of a set of scores tells its users how far apart the highest and
lowest ability students are from each other. It is obtained by subtracting
the lowest score from the highest and adding one to reflect the fact that it
includes the scores at both ends of the sorted set. We can apply the formula
to the scores obtained by Faravani’s (2004) control group on the
schema-based cloze MCIT consisting of 69 items as shown below.
6, 11, 15, 22, 22, 25, 26, 27, 27, 29, 31, 32, 33
Range = Highest score – Lowest score + 1 = 33 – 6 + 1 = 28
The range of 28 shows that the highest and lowest ability test takers
differed greatly from each other. As we can see, the value of the
range is drastically influenced by the two extreme scores. Just imagine
one of the participants had scored 50 on the test; the range would then
have changed from 28 to 45! Since the range of a set of scores depends
only on two extreme scores, it does not provide test users with an
accurate measure of test takers’ differences. This is achieved by variance
and standard deviation.
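The sensitivity of the range to a single extreme score can be illustrated with a short Python sketch. The score of 50 is the hypothetical value imagined above, and replacing the highest score with it is our own assumption made for the sake of illustration.

```python
control = [6, 11, 15, 22, 22, 25, 26, 27, 27, 29, 31, 32, 33]

exclusive_range = max(control) - min(control)    # highest minus lowest
inclusive_range = exclusive_range + 1            # plus one, to count both end scores

# Hypothetical case: the top scorer obtains 50 instead of 33
with_extreme = control[:-1] + [50]
range_with_50 = max(with_extreme) - min(with_extreme) + 1

print(exclusive_range, inclusive_range, range_with_50)
```

Only one of the 13 scores changed, yet the range rose from 28 to 45, which is why the range alone is a weak measure of dispersion.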
11.2.2 Variance
Variance is a measure of test takers’ individual differences which shows
to what extent they fall apart from each other on the basis of their mean
score. After sorting the raw scores (X) and calculating their mean (M),
we need to set up a table in which we can tabulate the steps involved in
calculating variance.
Table 11.2
Steps involved in calculating variance
Test     X      Minus   Mean     x (individual     x² (squared individual
taker                            deviations)       deviations)
1        33     -       23.54      9.46              89.49
2        32     -       23.54      8.46              71.57
3        31     -       23.54      7.46              55.65
4        29     -       23.54      5.46              29.81
5        27     -       23.54      3.46              11.97
6        27     -       23.54      3.46              11.97
7        26     -       23.54      2.46               6.05
8        25     -       23.54      1.46               2.13
9        22     -       23.54     -1.54               2.37
10       22     -       23.54     -1.54               2.37
11       15     -       23.54     -8.54              72.93
12       11     -       23.54    -12.54             157.25
13        6     -       23.54    -17.54             307.65
Σ        306                     ≈ 0 (= -0.02)      821.21
Table 11.2 shows the steps involved in calculating variance. They are as
follows.
1. Add up all raw scores (X) and divide the sum (ΣX) by the number of
test takers (N) to obtain the mean (M), i.e., 306/13 = 23.54.
2. Subtract the mean from each score to get the individual deviation
score (x). See column 5 in Table 11.2.
3. Square each individual deviation score (x²) and then add the squares
up (Σx²), i.e., 821.21.
Once we obtain the sum of squared individual deviation scores (Σx²),
we can utilize it to calculate variance as “an exceptionally important
concept and one of the most commonly used statistics” (Howell, 2002,
p. 48). Σx² needs to be divided by N − 1 to obtain the variance (s²), i.e.,
\[ s^2 = \frac{\sum x^2}{N-1} = \frac{821.21}{12} = 68.43 \]
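The three steps can be sketched in Python as shown below. This is an illustration under our own naming, not a prescribed procedure. Because the sketch works with the unrounded mean instead of 23.54, the sum of squared deviations comes out as 821.23 rather than 821.21 and the variance as 68.44 rather than 68.43; the difference is due to rounding only.

```python
from statistics import mean, variance

control = [6, 11, 15, 22, 22, 25, 26, 27, 27, 29, 31, 32, 33]

m = mean(control)                                   # step 1: the mean (M)
deviations = [x - m for x in control]               # step 2: deviation scores (x)
sum_squares = sum(d ** 2 for d in deviations)       # step 3: Σx²
s2 = sum_squares / (len(control) - 1)               # variance: Σx² / (N - 1)

print(round(sum(deviations), 10), round(sum_squares, 2), round(s2, 2))

# The library function uses the same N - 1 denominator
print(round(variance(control), 2))
```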
11.2.3 Standard Deviation

If we focus on Table 11.2 once again, we realize that adding up
individual deviation scores results in zero. This happens because the
mean is the balance point of a set of scores: the deviations of the scores
above the mean are positive, those of the scores below it are negative,
and the two groups cancel each other out. In order to remove these
signs, we square the individual deviation scores.
After obtaining squared individual deviation scores, we added them up
and divided the result by N − 1 to obtain variance. This process is very
similar to that of obtaining the mean. However, instead of resting on raw
scores, variance depends on individual deviation scores and thus can
show how test takers vary in their performance. Because it is based on
squared individual deviation scores, it must undergo a further process
to become a standard measure, i.e., lose its power of 2. Standard
deviation is therefore obtained by taking the square root (√) of
variance.
\[ \text{Standard deviation} = \sqrt{\text{variance}} = \sqrt{s^2} \]
If we apply the formula to the variance obtained on the 13 raw scores,
we will get
\[ \text{Standard deviation} = \sqrt{s^2} = \sqrt{68.43} = 8.27 \]
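As a quick check, the square root of the variance can be taken in Python. The sketch below assumes the standard math and statistics modules and uses our own variable names.

```python
import math
from statistics import stdev, variance

control = [6, 11, 15, 22, 22, 25, 26, 27, 27, 29, 31, 32, 33]

s2 = variance(control)     # sample variance, N - 1 in the denominator
sd = math.sqrt(s2)         # standard deviation as the square root of variance

print(round(s2, 2), round(sd, 2))

# stdev() performs both steps at once
print(round(stdev(control), 2))
```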
11.4 Using SPSS to Calculate Means and Standard Deviations

We have already familiarized ourselves with the SPSS software. Once
we gain some mastery over defining variables and entering our raw data,
we can utilize it to do all our statistical tests. Here we will work once
again on establishing a typical file, and we will sidestep this process in
later chapters in order to save space.
An SPSS file consists of two sheets: Variable View and Data View. In
order to have the SPSS compute the mean of raw scores given in table
11.2 we need to open, name and save a file, e.g., FData (Faravani Data).
Naming a file is very important because in an actual research project, we
may need to work with several files at the same time. Distinctive file
names will save us a lot of confusion.
Once we have named and saved the file FData in the SPSS, we can
activate its Variable View sheet and define the variable whose central
tendency and variation values we want to have calculated. A typical
SPSS Variable View sheet consists of 10 columns: Name, Type, Width,
Decimals, Label, Values, Missing, Columns, Align, and Measure. Click
under the first column Name and do the following. (We should follow
the same instructions for other variables. We can actually define as
many variables as we wish to study in our project.)
Name: pretest
Type: numeric
Width: 8
Decimals: 0
Label: Control Group
Values: None
Missing: None
Columns: 8
Align: Right
Measure: Scale
Figure 11.1 shows how we can define an interval variable in an SPSS
Variable View sheet. After naming the variable pretest, we need to
choose numeric as its type. The fifth column in the Variable View
requires us to label the variable, i.e., Control Group. Labeling the
variable will help us interpret the data when the SPSS produces its
output sheet. It becomes particularly helpful when we have a large
number of variables in one file. The last column in the sheet requires
choosing nominal, ordinal or scale as the variable measure.
Figure 11.1
Defining an interval variable in the SPSS
If you remember, we talked about ordinal variables in section 2.2.2 of
chapter 2 and learned that the levels forming an ordinal variable such as
speaking ability qualify it as excellent, very good, good, poor and very
poor. Instead of using these terms, we can use the numbers 5, 4, 3, 2 and
1 as ordinal levels, respectively. The distances among the levels of a
scale or interval variable, by contrast, are assumed to be always equal. The total
score on the schema-based cloze multiple choice item test (MCIT), i.e.,
69, is, for example, an interval variable which has 69 levels. The
distance among all the 69 levels is equal, i.e., the distance between 19
and 20 is the same as the distance between 9 and 10 or 1 and 2.
Upon defining the variable pretest, we can activate Data View sheet by
clicking on its icon at the bottom of the sheet shown in Figure 11.1.
Once we are in Data View sheet, we can enter the raw scores as shown
in Figure 11.2.

Figure 11.2
Entering Raw Score on the SPSS
Having entered the raw scores on the Data View sheet, we must click on
the top menu, activate Analyze menu, go to Descriptive Statistics and
choose Frequencies as shown in Figure 11.3.
Figure 11.3
Activating Descriptive Statistics on the SPSS

Clicking on Frequencies will activate a dialogue box having the same
name as shown in Figure 11.4.
Figure 11.4
SPSS Frequencies dialogue box
We have to use the rightward arrow to transfer the variable to the box on
the right. At the bottom of the Frequencies box, we will find three
icons. If we click on Statistics, another dialogue box will appear in
which we can choose whatever we want the SPSS to do for us (Figure
11.5). Since we have studied measures of central tendency, i.e., mean,
median and mode, as well as dispersion, i.e., standard deviation,
variance and range, we check the corresponding boxes and click on
Continue. This will take us back to the Frequencies box. We then click on
the OK icon to have these values calculated within a second.

Figure 11.5
Statistics available in SPSS Frequencies
Upon clicking on the OK icon, the software will automatically produce
a sheet called Output 1- SPSS Viewer. We will find two tables in this
output. (Since the second table will be very similar to Table 11.1, we
will not copy and paste it here to save time and avoid repetition.) We
can, nevertheless, click on the first table, copy and paste it in our report
as shown in Table 11.3.
Table 11.3
Descriptive statistics produced by the SPSS via Frequencies command
N                 Valid       13
                  Missing     0
Mean                          23.54
Median                        26.00
Mode                          22(a)
Std. Deviation                8.273
Variance                      68.436
Range                         27
a Multiple modes exist. The smallest value is shown

If we compare Table 11.3 with what we have already calculated
manually, we will find almost every measure the same with the
exception of the range. If you remember, we added one to the outcome
when we subtracted the lowest score (6) from the highest (33), i.e.,
(33 − 6) + 1. One is added to include the two extreme scores in the
calculation (Guilford, 1950, p.89). SPSS does not, however, add 1 when
it subtracts these two scores from each other.
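For readers without access to SPSS, the values in Table 11.3 can be approximated with Python's standard statistics module. The sketch below is our own illustration and makes no claim about how SPSS computes its output internally; it simply follows the same conventions visible in the table, i.e., the smallest of several modes and a range without the added 1.

```python
from statistics import mean, median, multimode, stdev, variance

control = [6, 11, 15, 22, 22, 25, 26, 27, 27, 29, 31, 32, 33]

descriptives = {
    "N": len(control),
    "Mean": round(mean(control), 2),
    "Median": median(control),
    "Mode": min(multimode(control)),            # smallest of the multiple modes
    "Std. Deviation": round(stdev(control), 3),
    "Variance": round(variance(control), 3),
    "Range": max(control) - min(control),       # no 1 is added here
}

for name, value in descriptives.items():
    print(f"{name:<16}{value}")
```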
11.5 Normal Distribution, Mean and Standard Deviation

If you remember, in chapter 5 we encountered the normal curve for the
first time (section 5.3). We learned that a normal curve is formed when
the performance of a very large population is studied on a particular
variable such as intelligence. Statisticians have studied the properties
of the normal curve and calculated the percentages falling between its
standard deviations (σ) as shown in Figure 11.6.
Figure 11.6
Normal curve and the percentages falling between its standard
deviations
As we can see in Figure 11.6, the normal curve has the following properties:
1. It is symmetrical, i.e., half of the scores fall below the mean and the
other half above the mean.
2. Its mean (μ) is always 0 when scores are expressed in standard
deviation units. Remember μ is the population mean, not the sample
mean.
3. Its mode, median and mean (μ) are always the same. Remember these
three measures of central tendency usually differ in real data.
4. Its tails never meet the x axis because no test taker possesses the
complete knowledge or lacks any knowledge of the ability being
measured by the test.
If we remove the curve given in Figure 11.6 and present the data by
employing the terms with which we have studied our samples, we will
have the standard deviations and percentages given in Table 11.4. As we
can see, approximately 68%, 95% and 99.7% of normally distributed
scores fall within 1, 2 and 3 standard deviations of the mean,
respectively.
Table 11.4
Standard deviations and percentages of a normal set of scores
Standard Deviations             Mean             Standard Deviations
-3        -2        -1           0           +1        +2        +3
    2.15%     13.6%     34.1%         34.1%      13.6%     2.15%
                    ←        68.2%        →
          ←                  95.4%                  →
←                            99.7%                            →
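The percentages in Table 11.4 can be recovered from the standard normal curve itself. The following Python sketch assumes the NormalDist class of the standard statistics module; the helper name share_between is our own. Small differences in the last digit, such as 68.3 against 68.2, reflect rounding of the band percentages in the table.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)   # mean 0, standard deviation 1

def share_between(low, high):
    """Percentage of the area under the curve between two standard deviations."""
    return (standard_normal.cdf(high) - standard_normal.cdf(low)) * 100

# Area within 1, 2 and 3 standard deviations on both sides of the mean
within = {k: round(share_between(-k, k), 1) for k in (1, 2, 3)}

# Area of the single bands on one side of the mean
bands = [round(share_between(k, k + 1), 1) for k in (0, 1, 2)]

print(within)
print(bands)
```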
A person who is not interested in statistics at all may ask, "What is the
use of normal curve? What do we need it for?" In response, we should
say that all quantitative research projects depend on the normal curve
because it can be used to convince research users that the results
obtained were the outcome of what the researcher manipulated rather
than chance.

As an intervening variable, chance is controlled by employing statistical
tests. The significance level of these tests is set at 5 in 100 (5%) at
most. This means that the probability of obtaining the results by chance
must be less than 5% in order for the results to be significant, i.e.,
p < 0.05. We can now use the standard deviations and percentages
obtained in the normal curve to say that whatever research result falls
within roughly -2 and +2 standard deviations is NOT significant and
may thus happen by chance in nature.
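The link between p < 0.05 and the band of roughly two standard deviations can be made concrete with a short sketch. It assumes a non-directional (two-tailed) decision and the standard normal curve; the variable names are our own.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

alpha = 0.05                                            # significance level
critical_z = standard_normal.inv_cdf(1 - alpha / 2)     # cut-off in SD units
share_within_2sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(round(critical_z, 2), round(share_within_2sd * 100, 1))
```

The exact cut-off is about 1.96 standard deviations, which is why ±2 is used as a convenient approximation.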
In addition to deciding whether the results of a given research project
are significant or not, we can use the normal curve and its standard
deviations to find out whether our participants are representative of the
population we intend to study. Our participants will represent the
population if their scores fall between -2 and +2 standard deviations.
Table 11.5 shows the population scores which could have been obtained
if all members of the population had participated in Faravani's (2004)
control group. As we can see, the observed mean, i.e., 23.5, has been
taken as the population mean. If we add 16.6, i.e., two standard
deviations, to 23.5 we will get 40.1, indicating that the student who
had scored the highest on the test, i.e., 33, had behaved normally. In
other words, if a given student in the control group had scored 40.1 or
higher, she should have been excluded from the study because she did
not need to study the material presented in the class.
Table 11.5
Comparing the scores obtained by the sample control group with its
population
Test takers: All students studying at the same level
Standard deviations           -3       -2       -1      0 (Mean)   +1       +2       +3
Sample standard deviations    -24.9    -16.6    -8.3    23.5       +8.3     +16.6    +24.9
Scores                        -1.4     6.9      15.2    23.5       31.8     40.1     48.4
Percent                       2.15     13.6     34.1    49.85      83.95    97.55    99.7
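The Scores row of Table 11.5 follows from adding multiples of the standard deviation to the mean. A minimal Python sketch, using the rounded mean (23.5) and standard deviation (8.3) given in the table and our own variable names, is shown below.

```python
control = [6, 11, 15, 22, 22, 25, 26, 27, 27, 29, 31, 32, 33]

mean_score = 23.5    # observed mean, taken as the population mean
sd = 8.3             # observed standard deviation, rounded

# Score lying k standard deviations from the mean, for k = -3 ... +3
cutoffs = {k: round(mean_score + k * sd, 1) for k in range(-3, 4)}
print(cutoffs)

# Scores outside -2 and +2 standard deviations (none in this sample)
outside = [x for x in control if not cutoffs[-2] <= x <= cutoffs[2]]
print(outside)
```

The lowest score, 6, falls just below the -2 cut-off of 6.9, so the check above is worth running even on small samples.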

11.6 Summary
Interval variables are studied in order to find out whether their
manipulation within a research project will result in significant
differences in the sample. To study these variables, the scores
quantifying them must be obtained from the sample and then
their central tendency and variation must be specified by calculating
their mean and standard deviation. In addition to illustrating how the
members of a sample cluster and vary from each other, means and
standard deviations obtained on interval variables can be employed to
study tests and groups as discussed in the next chapter.

12 Employing Interval Variables to Evaluate Instruments
12.1 Introduction
A psychometric test can be viewed as an interval variable whose items
form its levels or values. As a psychometric test, the schema-based cloze
multiple choice item test (MCIT) employed by Faravani (2006) is, for
example, a 69-level interval variable which measures its takers’
comprehension of certain passages upon which it is developed. Since
test takers differ from each other in terms of their background
knowledge and reading comprehension ability, this difference must be
revealed by the number of levels the test takers achieve on the test.
In chapter 11, we learned that the scores obtained on tests such as the
schema-based cloze MCIT can be utilized to calculate their mean,
variance and standard deviation. (While a mean shows the central
tendency of a set of scores, its variance and standard deviation show
how dispersed the scores are.) We also learned that the mean and
standard deviation of a test could be used to compare the performance of
the sample with other test takers who form the population they
represent. (We will cover the statistical tests related to means in chapter
13.)
In this chapter we will use the mean, variance and standard deviation of
a set of scores in order to explore test reliability and validity. While the
former focuses on the consistency of scores, the latter reveals not only
the degree to which two tests measure a common construct but also the
amount of construct each item shares with the total test score.

12.2 Instrument Reliability

Conducting a research project usually entails employing instruments
such as tests to measure variables under investigation. There are some
organizations such as the Educational Testing Service which develop
specific tests and establish them as reliable and valid measures of
specific constructs such as language proficiency.
However, we may conduct a research project which requires employing

a test unavailable in the market. In this case, we have no choice but to
develop our own. After developing and administering the test, we must
estimate its reliability and report the coefficient obtained as part of its
descriptive statistics. The obtained reliability coefficient will show how
consistent the results obtained on the tests are.
There are a number of methods to estimate the reliability of tests. They
depend not only on the type of psychometric variables employed in a
project, i.e., whether they are categorical, ordinal, interval or ratio, but
also on the way they are estimated. While the former emphasizes
choosing appropriate scales, the latter addresses the manner in which the
data are analyzed statistically.
Brown and Rodgers (2002) believed that the reliability of collected data
could be estimated internally and externally. For these scholars, a research
instrument will be internally reliable if the collected data in a research
project are analysed by another researcher and consistent results are
obtained. They defined external reliability as “the degree to which we
can expect consistent results if the study were replicated” (p. 241). In
other words, reliability can also be estimated by correlating the scores
obtained in a study with those obtained by another person or study. We
will address reliability first and then focus on correlation.

12.2.1 Raw Scores and Reliability

The handiest way to estimate the reliability of a given instrument is to
employ raw scores obtained by participants in a study. Among various
formulas designed for this purpose, Kuder-Richardson formula 21 stands
out because it rests on the number of items comprising an instrument
and the mean and variance of scores obtained on these items. After
calculating the mean and variance either manually or via SPSS, we can
insert them in Kuder-Richardson formula 21 (KR-21 rk) as shown below.
\[ \text{KR-21 } r_k = \frac{K}{K-1}\left[1 - \frac{M(K-M)}{K s^2}\right] \]
\[ = \frac{69}{69-1}\left[1 - \frac{23.54\,(69-23.54)}{69 \times 68.44}\right] = \frac{69}{68}\left[1 - \frac{1070.1284}{4722.36}\right] = \frac{69}{68}\,(1 - .23) = \frac{69}{68}\,(.77) = .78 \]
In the above formula K is the number of multiple choice items, M is the
mean of the sample, and s² is the variance of the sample. If you
remember, we had the SPSS calculate the mean and variance of scores
obtained by 13 participants in Faravani’s (2004) control group on the
schema-based cloze MCIT consisting of 69 items (see Table 11.3). Their
mean and variance were 23.54 and 68.44, respectively. Inserting these
values in the formula yields a reliability coefficient of .78.
Reliability coefficients range between 0.00 and 1.00. While a reliability
coefficient of 1.00 indicates that a test is perfectly reliable, a coefficient
of 0.00 shows that the test is totally unreliable. Neither of these two
extremes is, however, realised in real life. For most tests a reliability
coefficient of 0.70 and higher is usually considered as reasonable (e.g.,
Kline, 1986). The schema-based cloze MCIT developed by Faravani
(2006), therefore, enjoys a reasonable reliability.
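Since KR-21 needs only three values, it is easily wrapped in a small function. The Python sketch below is an illustration under our own naming (kr21 and its parameters are not taken from any statistical package); it uses the number of items, mean and variance reported above.

```python
def kr21(k, mean, variance):
    """Kuder-Richardson formula 21.

    k: number of items, mean: mean of total scores,
    variance: variance of total scores.
    """
    return (k / (k - 1)) * (1 - (mean * (k - mean)) / (k * variance))

# Values for the 69-item schema-based cloze MCIT (see Table 11.3)
reliability = kr21(k=69, mean=23.54, variance=68.44)
print(round(reliability, 2))
```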

12.2.2 Cronbach α (alpha)

Cronbach (1951) developed alpha (α) as his first measure of reliability,
hoping to devise other measures in future. Cronbach α is also called the
internal consistency reliability of the test because it measures how well
each individual item on a test correlates with the sum of the remaining
items. The dependency of alpha on the number of items has led some
scholars to question its application. For example, Streiner and Norman
(1989) discussed the problems involved in employing Cronbach alpha as
follows.
It is nearly impossible these days to see a scale development

paper that has not used alpha, and the implication is usually
made that the higher the coefficient, the better. However,
there are problems in uncritically accepting high values of
alpha (or KR-20), and especially in interpreting them as
reflecting simply internal consistency. The first problem is
that alpha is dependent not only on the magnitude of the
correlations among items, but also on the number of items in
the scale. A scale can be made to look more 'homogenous'
simply by doubling the number of items, even though the
average correlation remains the same. This leads directly to
the second problem. If we have two scales which each
measure a distinct construct, and combine them to form one
long scale, alpha would probably be high, although the
merged scale is obviously tapping two different attributes.
Third, if alpha is too high, then it may suggest a high level of
item redundancy; that is, a number of items asking the same
question in slightly different ways (pp. 64-65).
In spite of the problems raised above, Cronbach α is employed by the
majority of researchers, who thus collectively approve its acceptability.
Part of this general consensus might be attributed to its being estimated
by statistical software such as the SPSS.
In order to have the SPSS calculate the Cronbach α coefficient, we need
to create a file, e.g., FPCGAlpha. Upon naming and saving the SPSS
file, we need to activate Variable View sheet and define 69 variables.
(Let us remember that the schema-based cloze MCIT designed by
Faravani (2006) consisted of 69 items.) The answers given to each item
must be recorded as either correct or incorrect. After naming the first
variable i01 and specifying Numeric as its type, we can go to the Values
column and assign 1 and 0 to correct and incorrect responses for item 1,
respectively. This procedure is shown in Figure 12.1.
Figure 12.1
Specifying the values of categorical variable on the SPSS Data View
sheet
We should follow the steps described above and define the remaining 68
numeric variables, each having two values. After defining and saving the
variables, we need to activate the Data View sheet. We must take the
answer sheets of the 13 students in the control group and start from item
number one, i.e., i01. We see that the first student has chosen the
incorrect answer, so under column i01 we type 0. Since the same student
has chosen the correct answer for items 02 and 03, we type 1 under these
items and do the same for all other items as shown in Figure 12.2.
Figure 12.2
Entering the values of categorical variables on the SPSS Data View
sheet
After entering the answers given to each item as a numeric variable, we
can now have the SPSS calculate the Cronbach alpha. First, we need to
go to Analyze menu and click on it to access the types of tests available.
Then we need to click on Scale and choose Reliability Analysis.
Activating Reliability Analysis in SPSS
Reliability Analysis dialogue box
Upon clicking on the Reliability Analysis its dialogue box will appear
as shown in Figure 12.3. In the box on the left, we need to highlight all
the 69 variables and then click on the rightward arrow (►) to transfer
them to item box. As soon as we transfer the items, the OK icon
becomes active.
Upon clicking on the OK icon, the sheet Output 1- SPSS Viewer will
appear on your monitor. We will find the following pieces of
information in the viewer.
** Method 1 (space saver) will be used for this analysis **
R E L I A B I L I T Y   A N A L Y S I S   -   S C A L E   (A L P H A)
Reliability Coefficients
N of Cases = 13.0      N of Items = 69
Alpha = .8201
If we compare the computed KR-21 r (.78) with α (.82), we will find a

difference of .04, which does not seem to be large enough to warrant
severe criticism. In addition to saving time spent on manual calculation
of KR-21 r, creating an SPSS file on individual items as numeric
variables will help us have the SPSS estimate another type of correlation
called point-biserial.
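To make the logic of alpha more tangible, the sketch below computes it in Python with the commonly cited formula α = [k/(k − 1)] × [1 − (sum of item variances / variance of total scores)]. Two points are assumptions on our part: the response matrix is entirely hypothetical (5 test takers by 4 items), because the item-level answers of Faravani's participants are not reproduced here, and the function name cronbach_alpha is our own rather than an SPSS command.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach alpha for a matrix of rows = test takers, columns = items."""
    k = len(responses[0])                               # number of items
    items = list(zip(*responses))                       # one tuple per item
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical answers: 1 = correct, 0 = incorrect
hypothetical_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(hypothetical_responses)
print(round(alpha, 2))
```

Unlike KR-21, this calculation needs the answer given to every single item, which is why the item-by-item SPSS file described above is required.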
12.3 Interval Variables and Validity

After estimating the reliability of a given instrument in a research
project, we need to validate it empirically. In order to have empirical
validity, we may administer the instrument along with an already
established instrument to the same participants either simultaneously or
on two different occasions and then correlate the scores obtained on both
instruments.
Various equations have been devised to estimate correlations between

two instruments in order to explore their possible relationship. The
equations depend on the type of variables investigated in our research
project. In this chapter we focus only on interval variables as the most
objective measures of psychological variables.
12.3.1 Interval Variables and Empirical Validity

The Pearson product-moment correlation coefficient is the most
commonly employed equation in applied linguistics and other fields of
science when the relationship between two tests viewed as two interval
variables is explored (e.g., Linacre 2005). It is based on individual
deviation scores, means and standard deviations as follows:
\[ r = \frac{\sum xy}{N S_x S_y} \]
Where: X = the raw scores obtained on the X instrument
x = the individual deviation score on the X instrument
Y = the raw scores obtained on the Y instrument
y = the individual deviation score on the Y instrument
Mx = the mean of the X instrument
My = the mean of the Y instrument
Sx = the standard deviation of the X instrument
Sy = the standard deviation of the Y instrument
N = the number of participants
We will now apply the formula to the scores obtained by 10 university

students who took a disclosed Test of English as a Foreign Language
(TOEFL) and a schema based cloze MCIT developed by Gholami
(2006) on the adjectives, adverbs, nouns and verbs of an authentic text.
Table 12.1
Steps involved in calculating Pearson product-moment correlation
coefficient
TOEFL       Schema      X - M       Y - M       x        y        xy
X           Y
73          36          73-37.9     36-21.5     35.1     14.5     508.95
55          25          55-37.9     25-21.5     17.1     3.5      59.85
47          21          47-37.9     21-21.5     21.5     -0.5     -10.75
39          18          39-37.9     18-21.5     1.1      -3.5     -3.85
33          16          33-37.9     16-21.5     -4.9     -5.5     26.95
30          15          30-37.9     15-21.5     -7.9     -6.5     51.35
28          21          28-37.9     21-21.5     -9.9     -0.5     4.95
26          20          26-37.9     20-21.5     -11.9    -1.5     17.85
25          23          25-37.9     23-21.5     -12.9    1.5      -19.35
23          20          23-37.9     20-21.5     -14.9    -1.5     22.35
M = 37.90   M = 21.50                                             Σxy = 658.3
S = 16.06   S = 5.9
Now we can insert the values obtained in Table 12.1 in the formula
below.
\[ r = \frac{\sum xy}{N S_x S_y} = \frac{658.3}{10 \times 16.1 \times 5.6} = .73 \]
Gholami (2006) hypothesized that since the semantic schema-based
cloze MCITs are developed on authentic texts whose processing requires
English language proficiency, they will correlate significantly with the
TOEFL. Since we have already calculated the Pearson product-moment
correlation coefficient, i.e., r = 0.73, we can look it up in Appendix 12.1,
which presents the critical values for this coefficient. Since 10 students
took both tests and Gholami’s hypothesis was directional, we spot 10
under the N column and look for the significance level under the
directional decision column. We realize that since the obtained
coefficient, 0.73, is greater than 0.7155, it is significant at the p < .01
level. This result supports the hypothesis.
The manual calculation of the Pearson product-moment correlation
coefficient is time consuming, and afterwards we still need to check its
critical value in a table. We can save time and energy by having the SPSS
calculate the coefficient and provide us with its significance level. We
can, for example, use the scores given in Table 12.1 to define two
variables in the Variable View sheet of an SPSS file named GToeflSchema
as shown in Figure 12.5.
Figure 12.5
Defining two interval variables on the SPSS
Upon defining the scores on the TOEFL and schema based cloze MCIT
as two interval variables, we need to activate Data View sheet and enter
the scores on both tests. The completion of data entry will lead to our
moving to Analyze menu, on which we have to choose Correlate and then
Bivariate as shown in Figure 12.6.
Figure 12.6
Choosing Correlate from Analyze menu
Upon clicking Bivariate option, Bivariate Correlations dialogue box
will appear as shown in Figure 12.7. (Bivariate simply means between
two variables. It is also known as zero-order correlation.) Under the left
box three types of correlation coefficients are offered. Pearson, which
suits interval test scores, is selected by default. We need to
use the rightward arrow (►) to transfer the TOEFL and Schema scores
to the Variables box. The transference of the scores to this box activates
the OK icon. As soon as we click on OK, the sheet Output 1: SPSS
Viewer will appear on our screen.
Figure 12.7
Bivariate Correlations dialogue box
Table 12.2 shows the Pearson correlation coefficient calculated by the
SPSS. As we can see, it is slightly higher than the coefficient we
calculated manually, i.e., .78 versus .73. However, both coefficients are
significant at the .01 level, showing that the probability of getting
this level of correlation between the two tests by chance is less than
1 in 100.
Similar to reliability coefficients, correlation coefficients range from 0

to 1. For any instrument to claim empirical validity, its correlation with
an established instrument must be significant to rule out the effect of
chance on its estimation. After insuring the significance of an obtained
correlation coefficient, we can then square it, e.g., .73 × .73 = .53
(Manual), .78 × .78 = .61 (SPSS), to find out what percentage of
variance they share with each other, i.e., .53 or .61. A common variance
of .53 or .61 tells us 53 or 61 percent of whatever the TEOFL measures,
the semantic schema-based cloze MCIT does, too.
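The calculation of a Pearson coefficient and of the variance two tests share can be sketched in a few lines of Python. This is only an illustration: the function names are our own, and the two short score lists are hypothetical, because the scores of Table 12.1 are not repeated in this section. Only the coefficients .73 and .78 come from the text.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from the two means
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

def shared_variance(r):
    """Proportion of variance two measures share, i.e., the squared coefficient."""
    return r ** 2

# Hypothetical scores for illustration only (NOT the data of Table 12.1)
test_a = [10, 20, 30, 40, 50]
test_b = [12, 22, 32, 42, 52]  # every score is test_a + 2, a perfect linear relation

r_demo = pearson_r(test_a, test_b)
manual_shared = round(shared_variance(0.73), 2)  # coefficient calculated manually
spss_shared = round(shared_variance(0.78), 2)    # coefficient reported by the SPSS
print(r_demo, manual_shared, spss_shared)
```

Squaring the coefficient, rather than reading it directly as a percentage, is what turns a correlation into a statement about common variance.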

Table 12.2
Pearson correlation coefficients and their significance calculated by
SPSS
                                    TOEFL Score   Schema Score
TOEFL Score    Pearson Correlation  1             .777(**)
               N                    10            10
Schema Score   Pearson Correlation  .777(**)      1
               Sig. (1-tailed)      .004          .
               N                    10            10
** Correlation is significant at the 0.01 level (1-tailed).
12.3.2 Interval Variables and Internal Validity

If you remember, we approached the 69 items comprising the schema-based
cloze MCIT as categorical variables consisting of two categories, i.e.,
correct and incorrect, and assigned the values of 1 and 0 to these
categories, respectively (Figure 12.2). Then we had the SPSS correlate
each item with the remaining items of the test in order to obtain one
single reliability coefficient called Cronbach Alpha.
In addition to using the categorical items to estimate Cronbach Alpha for
the whole test, we can use the values of 1 and 0 as correct and incorrect
responses given to each item, respectively, to calculate its correlation
with the total test score. We can, for example, calculate the
point-biserial correlation coefficient of item 1 with the total score
obtained on the schema-based cloze MCIT developed by Faravani (2006) by
employing the equation below.

$$ r_{pbi} = \frac{M_{C1} - M_{I1}}{SD_T} \times \sqrt{C_1 \times I_1} $$
Where: rpbi = Point-biserial correlation coefficient
MC1 = Mean total score of the test takers who answered item 1 correctly
(i.e., the value of 1)
MI1 = Mean total score of the test takers who answered item 1
incorrectly (i.e., the value of 0)
SDT = Standard deviation for the whole test
C1 = Proportion of correct answers given to item 1
I1 = Proportion of incorrect answers given to item 1
If you remember, we presented the correct and incorrect responses given
to the 69 items of the schema-based cloze MCIT in Figure 12.2. Table 12.3
presents the first five items and the total test score.
Table 12.3
Manual calculation of point-biserial correlation coefficient
Test taker  Item1  Item2  Item3  Item4  Item5  Total
1           0      1      1      1      0      33
2           1      1      1      1      0      32
3           1      1      0      1      0      31
4           0      1      1      0      0      29
5           1      1      1      1      0      27
6           1      1      1      0      1      27
7           0      1      0      0      0      26
8           1      1      0      1      0      25
9           0      1      0      1      0      22
10          0      1      1      0      1      22
11          1      0      1      1      0      15
12          1      0      1      0      0      11
13          1      0      0      0      0      6
Calculations
MC = (32+31+27+27+25+15+11+6) ÷ 8 = 174 ÷ 8 = 21.75
MI = (33+29+26+22+22) ÷ 5 = 132 ÷ 5 = 26.4
SDT = 8.27
C1 = 8 ÷ 13 = 0.62
I1 = 5 ÷ 13 = 0.38
If we apply the equation above to the data given in the table we will
obtain the point-biserial correlation coefficient of -0.27 for item 1.
$$ r_{pbi} = \frac{21.75 - 26.4}{8.27} \times \sqrt{.62 \times .38} = \frac{-4.65}{8.27} \times \sqrt{0.24} = -0.56 \times 0.49 = -0.27 $$
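The manual calculation for item 1 can be checked with a short Python sketch. The item responses and total scores are those of Table 12.3; the function name is our own, and we assume that the standard deviation of the whole test is the sample standard deviation (dividing by n - 1), because that is the version which reproduces the value of 8.27 used above.

```python
import math
import statistics

# Item 1 responses (1 = correct, 0 = incorrect) and total scores from Table 12.3
item1 = [0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1]
totals = [33, 32, 31, 29, 27, 27, 26, 25, 22, 22, 15, 11, 6]

def point_biserial(item, total):
    """Point-biserial correlation of a 0/1 item with the total test score."""
    correct = [t for i, t in zip(item, total) if i == 1]
    incorrect = [t for i, t in zip(item, total) if i == 0]
    mean_correct = statistics.mean(correct)      # MC1
    mean_incorrect = statistics.mean(incorrect)  # MI1
    sd_total = statistics.stdev(total)           # SDT, sample SD (assumption)
    p_correct = len(correct) / len(item)         # C1
    p_incorrect = len(incorrect) / len(item)     # I1
    return (mean_correct - mean_incorrect) / sd_total * math.sqrt(p_correct * p_incorrect)

sd_total = statistics.stdev(totals)
r_pbi = point_biserial(item1, totals)
print(round(sd_total, 2), round(r_pbi, 2))
```

The same function can be applied to items 2 to 5 of the table by replacing the list of item responses.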
The point-biserial correlation coefficients are used as item
discrimination indices in testing (e.g., Khodadady 1999). They show
how well each item separates the test takers who have performed well on
the whole test from those who have not. According to Gronlund and Linn
(1990), if an item has an acceptable item discrimination index such as
0.45, discriminates in a positive direction, i.e., the index is not
negative such as -0.45, and has no apparent defect, it can be considered
satisfactory from a technical standpoint.
However, when an item discriminates in a negative direction, for
example, -0.27, it means that low ability test takers have performed
better on it than high ability ones. The item should therefore be studied
carefully to find out whether it has any defects in its context or
alternatives. After exploring the items, if it is observed that most of
them have acceptable levels of discrimination power, the designer of the
test can claim that it enjoys internal validity.
Authorities have different views with respect to the acceptable size of
the discrimination index. While Baker (1989, p. 54) suggested that items
with indices lower than 0.30 be discarded, Madsen (1983, p. 183) believed
that an index of 0.15 or higher should be accepted. It all depends on the
purpose of the test. If it is a high-stakes measure of an ability such as
language proficiency required to meet course requirements, higher levels
of discrimination index should be adopted to secure fairness in results.
12.4 Summary
Interval variables provide psychometric measures to determine the
reliability and validity of instruments employed in research projects.
Among various methods of estimating the reliability of instruments such
as split-half and parallel tests, Kuder-Richardson Formula 21 provides
researchers with an equation to calculate it on the basis of raw scores
obtained on the administration of one single test on a single occasion.
Interval variables are also employed to validate instruments both
internally and externally. Through the application of the point-biserial
correlation equation, categorical items having the values of 1 and 0 are
correlated with total test scores as an interval variable to obtain
coefficients ranging from -1 to +1. Coefficients equal to .20 and higher
are taken as indices of acceptable discriminatory power to validate the
instruments internally. The greater the number of positively
discriminating items in an instrument, the more valid it is considered
to be.
Along with internal indices of validity, instruments are administered
with other established measures of the variable under investigation to
validate them empirically, i.e., to establish convergent validity. Most
studies in applied linguistics employ the Pearson product-moment
correlation equation for this purpose. Upon establishing the reliability
and validity of instruments employed in research projects, researchers
need to analyse and compare the results with each other in order to find
out whether their treatment has been effective or not. We will focus on
this type of exploration in the next chapter.

13 Employing Interval Variables to Evaluate Performance in Groups
13.1 Introduction
Throughout the present textbook we have referred to Faravani’s (2006)
study in order to achieve some sort of unity in the content of topics
explored in a typical research project. In order to conduct her project,
Faravani formed an experimental and control group and taught some
passages of a certain text to both. Since she believed that using reading
portfolios improves learners’ reading comprehension ability better than
traditional approaches, she used portfolios in her experimental group
and employed the conventional methods of testing in her control group.
Based on the content of the passages, she developed a schema-based
cloze MCIT consisting of 69 items and administered it to both
experimental and control groups as a pre-and-post test to achieve two
purposes: testing the homogeneity of both control and experimental
groups and measuring achievement.
Now that we have learned how the mean of a set of scores is calculated,
and we have worked on the scores obtained by Faravani’s (2006)
experimental and control groups, we can explore the type of statistical
tests she used to test her null hypothesis, i.e., “there is no significant
difference between the means of pre-tests and post-tests of schema-
based cloze multiple choice item test (achievement) for the experimental
and control groups after the implementation of reading portfolios.”
13.2 Z Statistic
Z statistic is based on the concept of z scores, which in turn rests on
the concept of the normal curve. If you remember, we talked about the
normal curve in chapter 11, section 11.5. The normal curve is created on
the assumptions below.
1. A well functioning psychometric test such as a schema-based cloze
MCIT is developed.
2. The test consists of a large number of items, e.g., 1000, which
measure an ability such as reading comprehension.
3. Each item on the test addresses one specific area of the ability under
investigation, e.g., how a given adjective schema such as good
describes a particular noun schema such as teacher in an oral or
written text.
4. All areas related to the ability under investigation are addressed by the
items comprising the test.
5. The test is administered to a very large sample which represents the
population under investigation, e.g., 10,000.
If the scores obtained on the test described above are plotted out on the
basis of their frequency, the result will be a normal curve whose mode,
median and mean will be exactly in the middle of the curve. Five
thousand test takers will fall on one side of the curve and the remaining
5000 on the other, if we collect our data from a sample of 10000.
If you remember, we also talked about the standard deviation of a given
test. We realized that each test will have a single standard deviation
which shows how far all scores fall from the mean on average.
Let us imagine that we select 1000 samples, each consisting of 10000
test takers who have varying degrees of the reading comprehension
ability. Let’s further assume that we develop a schema-based cloze MCIT
having 1000 items and administer the test to these 1000 samples. We
will then have 1000 standard deviations for 1000 tests. Since these tests
represent all levels of the ability under investigation, they will show
what percentage of test takers falls in which part of the normal curve
as shown in Table 13.1.

Table 13.1
Z scores and their percentages in a normal distribution
Z score interval   -3 to -2   -2 to -1   -1 to 0 (Mean)   0 (Mean) to +1   +1 to +2   +2 to +3
Percentage         2.15%      13.6%      34.1%            34.1%            13.6%      2.15%
Between -1 and +1: 68.2%
Between -2 and +2: 95.4%
Between -3 and +3: 99.7%
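The percentages in Table 13.1 can be checked against the standard normal distribution available in the Python standard library. The sketch below is only a verification aid; the function name is our own. The exact areas (about 68.3%, 95.4% and 99.7%) differ from the table in the last digit only because of rounding.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0 and standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def area_between(z_low, z_high):
    """Proportion of a normal population falling between two z scores."""
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

within_1 = area_between(-1, 1)
within_2 = area_between(-2, 2)
within_3 = area_between(-3, 3)
for label, area in [("-1 to +1", within_1), ("-2 to +2", within_2), ("-3 to +3", within_3)]:
    print(label, round(area * 100, 1))
```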
We can now use Table 13.1 to tell where a specific test taker falls within
the normal population, if we know his raw score, the mean and standard
deviation of the test he has taken. This is achieved by applying z score
formula shown below.
$$ Z\ \text{score} = \frac{\text{Raw score} - \text{Mean}}{\text{Standard deviation}} = \frac{X - M}{S} $$
In chapter 12, we noticed that Gholami (2006) administered a disclosed
TOEFL test to 10 undergraduate students (see Table 12.1). The first high
ability student had scored 73 on the test whose mean and standard
deviation were 37.9 and 16.06, respectively. We can now insert these
values in the z score formula as shown below.
$$ Z\ \text{score} = \frac{X - M}{S} = \frac{73 - 37.9}{16.06} = \frac{35.1}{16.06} = 2.19 $$
If we look at Table 13.1, we realize that 95.4% of test takers fall
between the z scores of -2 and +2, which means that about 97.7% of them
(50% + 34.1% + 13.6%) fall below the z score of +2. Since 2.19 is higher
than the z score of 2, we can say that the student who has scored 73 on
the TOEFL has performed better than at least 97.7 percent of the
population from which he has been selected. (We should remember that
the sample size was very small in that only 10 students took the test.
Raw scores should be converted to z scores when the sample size is
large, otherwise the test may overestimate its takers’ ability.)
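The conversion of a raw score to a z score, and of the z score to the proportion of a normal population falling below it, can be sketched as follows. The raw score, mean and standard deviation are the values quoted above; the function name is our own, and the use of the normal distribution assumes that the scores are normally distributed, which ten scores cannot guarantee.

```python
from statistics import NormalDist

def z_score(raw, mean, sd):
    """Distance of a raw score from the mean in standard deviation units."""
    return (raw - mean) / sd

# Values reported for the first student on the disclosed TOEFL
z = z_score(73, 37.9, 16.06)

# Proportion of a normal population scoring below this z score
proportion_below = NormalDist().cdf(z)
print(round(z, 2), round(proportion_below * 100, 1))
```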
Z statistic is very similar to z score in that it compares the difference
in the means of two tests to the population of all possible pairs of tests
whose differences have been calculated based on the assumptions we
described at the beginning of this section, i.e., both tests are fairly
comprehensive, exhaust all areas of the variable under investigation
and are administered to a very large number of representative samples.
It is estimated via the formula below.
$$ Z\ \text{statistic} = \frac{\text{Difference between two sample means}}{\text{Standard error of difference between means}} = \frac{M_{T1} - M_{T2}}{\sqrt{\left(\frac{S_{T1}}{\sqrt{n_{T1}}}\right)^2 + \left(\frac{S_{T2}}{\sqrt{n_{T2}}}\right)^2}} $$
Where: MT1 = Mean of test 1
MT2 = Mean of test 2
ST1 = Standard deviation of test 1
nT1 = Number of participants who took test 1
ST2 = Standard deviation of test 2
nT2 = Number of participants who took test 2
We can now apply the z statistic to compare the means obtained on the
69-item schema-based cloze MCIT. Faravani (2006) administered the
test to 15 students in her experimental group. They took the test twice,
once as a pre-test and another time as a posttest. Table 13.2 presents the
scores obtained on the tests.

Table 13.2
Raw scores obtained on the schema-based cloze MCIT administered as a
pre-and-post test
Participant   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Posttest     59  57  57  57  54  53  52  49  49  48  43  38  38  32  29
Pretest      35  27  16  26  21  20  24  13  16  21  19  18   9  20  23
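If we calculate the mean, variance and standard deviations, we will get
MT1 = 47.7, VT1 = 91.7 and ST1 = 9.6 for the posttest and MT2 = 20.5,
VT2 = 38.6 and ST2 = 6.2 for the pretest, respectively. We can now insert
the values in the formula:
$$ Z = \frac{M_{T1} - M_{T2}}{\sqrt{\left(\frac{S_{T1}}{\sqrt{n_{T1}}}\right)^2 + \left(\frac{S_{T2}}{\sqrt{n_{T2}}}\right)^2}} = \frac{47.7 - 20.5}{\sqrt{\left(\frac{9.6}{\sqrt{15}}\right)^2 + \left(\frac{6.2}{\sqrt{15}}\right)^2}} = \frac{27.2}{\sqrt{\left(\frac{9.6}{3.87}\right)^2 + \left(\frac{6.2}{3.87}\right)^2}} $$
$$ = \frac{27.2}{\sqrt{2.48^2 + 1.60^2}} = \frac{27.2}{\sqrt{6.15 + 2.57}} = \frac{27.2}{\sqrt{8.72}} = \frac{27.2}{2.95} = 9.22 $$
The z statistic we calculated for the difference in the means of scores
obtained by the experimental group on the posttest and pretest
schema-based cloze MCIT, i.e., z = 9.22, is significant at p < .01
because it is far greater than 2.58, the z value beyond which only 1
percent of a normal distribution falls. If we look at Table 13.1, we can
see that 99.7% of z scores fall between -3 and +3. The calculated z
statistic therefore shows that there is less than a 1 in 100 chance of
obtaining the result by chance.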
When z statistic is applied to collected data, it is assumed that
1. The scores are normally distributed
2. The number of participants who take both tests is equal
3. Each group consists of 62 or more participants (Brown 1988, p. 165)
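The z statistic for the scores in Table 13.2 can be reproduced with the sketch below. It takes the standard error of the difference as the square root of the sum of the two squared standard errors and uses the sample standard deviation (dividing by n - 1); both function names are our own. With unrounded values the statistic comes out at about 9.21.

```python
import math
import statistics

# Raw scores from Table 13.2
posttest = [59, 57, 57, 57, 54, 53, 52, 49, 49, 48, 43, 38, 38, 32, 29]
pretest = [35, 27, 16, 26, 21, 20, 24, 13, 16, 21, 19, 18, 9, 20, 23]

def standard_error(scores):
    """Standard deviation of a set of scores divided by the square root of n."""
    return statistics.stdev(scores) / math.sqrt(len(scores))

def z_statistic(test1, test2):
    """Difference between two means divided by the standard error of the difference."""
    mean_difference = statistics.mean(test1) - statistics.mean(test2)
    se_difference = math.sqrt(standard_error(test1) ** 2 + standard_error(test2) ** 2)
    return mean_difference / se_difference

z = z_statistic(posttest, pretest)
print(round(statistics.mean(posttest), 1), round(statistics.mean(pretest), 1), round(z, 2))
```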
13.3 T Test
Most experimental studies conducted in applied linguistics are based on
intact groups because researchers are not usually allowed to change the
structure of classes due to institutional restrictions. Besides, the number
of students in these intact groups seldom reaches 62. This means that the
researchers have to put up with small groups whose performance would
be inappropriate to be analysed by z statistic.
T test is the statistic which fits small groups. It has another
advantage over z statistic, i.e., it tolerates a difference in the number
of participants in each group. While z statistic requires comparing
groups having the same number of participants, t test can be safely
applied to two groups whose numbers of members vary. For example, the
number of participants in an experimental group may be more than the
participants in a control group.
In addition to being applicable to groups of varying sizes, t test
can be utilized to compare the performance of the same group on the
same instrument administered on two different occasions, e.g., at the
beginning and end of the term. In other words, t test suits dependent
measures. Z statistic is, however, designed for independent groups, i.e.,
the scores upon which the means are based must be obtained by different
participants. For example, the scores obtained by participants in
experimental and control groups are independent from each other.
13.3.1 Estimating T-test

Estimating the t test value is pretty straightforward as shown and
described in the formula and steps below. (Note that the formula is very
similar to that of the Z statistic. Instead of the standard deviation,
however, the variance is employed.)
$$ t = \frac{M_{T1} - M_{T2}}{\sqrt{\frac{V_{T1}}{n_{T1}} + \frac{V_{T2}}{n_{T2}}}} $$
1. Subtract the mean of test 2 (MT2) from the mean of test 1 (MT1)
2. Divide the variance of test 1 (VT1) by the number of test takers who
took test 1 (nT1)
3. Divide the variance of test 2 (VT2) by the number of test takers who
took test 2 (nT2)
4. Add up the divided VT1 and VT2
5. Take the square root of the result of step 4
6. Divide the subtracted means by the result of step 5
If we insert the mean and variance of the posttest and pretest
(calculated from Table 13.2) in the T test formula we will get:
$$ t = \frac{M_{T1} - M_{T2}}{\sqrt{\frac{V_{T1}}{n_{T1}} + \frac{V_{T2}}{n_{T2}}}} = \frac{47.7 - 20.5}{\sqrt{\frac{91.7}{15} + \frac{38.6}{15}}} = \frac{27.2}{\sqrt{6.11 + 2.57}} = \frac{27.2}{\sqrt{8.68}} = \frac{27.2}{2.95} = 9.22 $$
Upon estimating the T test index, we need to check its critical value in
the table given in Appendix 13.1. There are nine columns in the
appendix. The first column shows the degree of freedom (df). Since two
sets of scores were compared and each set consisted of 15 scores, the df
will be (15 × 2) – 2 = 28. If we check the row having the df of 28 and
move to the ninth column, i.e., p = .0001, we will find 3.674. Since it is
lower than the calculated index of 9.22, we can then say that the result
obtained is significant at the highest level possible, i.e., p < .0001.
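The six steps can be followed literally in a short Python sketch. The scores are those of Table 13.2 and the critical value of 3.674 is the one read from Appendix 13.1 above; the function name is our own, and the variance is the sample variance (dividing by n - 1).

```python
import math
import statistics

# Raw scores from Table 13.2
posttest = [59, 57, 57, 57, 54, 53, 52, 49, 49, 48, 43, 38, 38, 32, 29]
pretest = [35, 27, 16, 26, 21, 20, 24, 13, 16, 21, 19, 18, 9, 20, 23]

CRITICAL_VALUE = 3.674  # value read from Appendix 13.1 for df = 28

def t_test_by_steps(test1, test2):
    """Follow the six steps of the t test formula and return (t, df)."""
    mean_difference = statistics.mean(test1) - statistics.mean(test2)  # step 1
    v1_over_n1 = statistics.variance(test1) / len(test1)               # step 2
    v2_over_n2 = statistics.variance(test2) / len(test2)               # step 3
    summed = v1_over_n1 + v2_over_n2                                   # step 4
    standard_error = math.sqrt(summed)                                 # step 5
    t = mean_difference / standard_error                               # step 6
    df = len(test1) + len(test2) - 2
    return t, df

t, df = t_test_by_steps(posttest, pretest)
is_significant = t > CRITICAL_VALUE
print(round(t, 2), df, is_significant)
```

With unrounded means and variances the index is about 9.21; the value of 9.22 in the manual calculation results from rounding at each step.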

13.3.2 Applying T-test to Raw Scores via SPSS

We have already worked with the SPSS. If you remember, we noticed
that an SPSS file consists of two sheets: Variable View and Data View.
In order to compare the means of the raw scores given in Table 13.2 we
need to open and save an SPSS file such as SPrePost, activate its
Variable View and define two variables.
The first variable will be categorical in nature because the scores
belong to two sets: pretest and posttest. We will therefore name it
group and fill out the first row of the ten columns in the Variable View
sheet as follows:
Name: group
Type: numeric
Width: 8
Decimals: 0
Label: Groups
Values: Two (1 = pretest, 2 = posttest)
Missing: None
Column: 8
Align: Right
Measure: Scale
The second variable will be interval in nature because it is based on the
scores obtained in both pretest and posttest. We will therefore name it
scores and fill out the second row of the ten columns in the Variable
View sheet as tabulated below and shown in Figure 13.1.
Name: scores
Type: numeric
Width: 8
Decimals: 0
Label: Scores
Values: None
Missing: None
Column: 8
Align: Right
Measure: Scale
Figure 13.1
Defining two variables for T test analysis

Upon defining the two variables called group and scores as shown in
Figure 13.1, we need to activate the Data View sheet in order to type the
data given in Table 13.2 under the relevant variables.
Figure 13.2 shows the Data View sheet on which the data related to the
schema-based cloze MCIT have been entered. The test consisted of 69
items and was administered two times: once as a pretest and another
time as a posttest. In order to have the software compare the means, we
have to click on the Analyze menu to activate its box containing Compare
Means, which in turn offers five available tests among which we need to
choose Independent-Samples T Test, which compares the pretest and
posttest scores as the two groups defined by the variable group.
Figure 13.2
Entering data on the Data View sheet
Upon clicking Independent-Samples T Test, the dialogue box shown in
Figure 13.3 will appear. As can be seen, on the right there are two
boxes. The big box on the top is labeled Test Variables. We must
transfer the Scores to this box by clicking on the rightward arrow. The
lower box on the right is labeled Grouping Variable. We have to move
the Groups variable to this box as shown in Figure 13.4. As soon as we
move this variable to the box, a button called Define Groups becomes
active.
Figure 13.3
Independent-Samples T Test
Figure 13.4
Moving variables
We must click on the button Define Groups and define the two groups
as shown in Figure 13.5. We have to type 1 in Group 1 and 2 in Group 2
in its dialogue box as shown in Figure 13.6. As soon as we click on the
Continue button, we will be taken back to the main box with an active
OK button as shown in Figure 13.6.
Figure 13.5
Activating Defining Groups box
Figure 13.6
Activating Defining Groups box
Now the SPSS is ready to calculate the T Test value as soon as we click
on the OK button. It will first provide us with the group statistics
shown in Table 13.3.

Table 13.3
Group statistics
         Groups     N    Mean    Std. Deviation   Std. Error Mean
Scores   Pretest    15   20.53   6.209            1.603
         Posttest   15   47.67   9.574            2.472
The second table produced by the SPSS is pretty wide and thus does not
fit the size of this page. It has, therefore, been divided into two parts
and presented in Tables 13.4 and 13.5.
Table 13.4
Levene's Test for Equality of Variances
                                       F       Sig.
Scores   Equal variances assumed       3.780   .062
         Equal variances not assumed
Table 13.4 presents the Levene’s test for equality of variances. As can
be seen, the F obtained, i.e., 3.780, is not significant because its
significance level, i.e., .062, is greater than .05. We can therefore
assume that the variances of the scores obtained by the same students on
the pretest and posttest are equal and their mean scores can thus be
safely compared.
Table 13.5 shows the T-test for equality of means. As can be seen, the
absolute T test value obtained by the SPSS, i.e., 9.21, is almost the
same as what we got by employing the formula, i.e., 9.22. (The value is
negative in the table because the SPSS subtracts the posttest mean from
the smaller pretest mean.) In addition to dispensing with manual
calculation, the SPSS provides the significance level and thus saves the
time required for checking the critical values in a given table.

Table 13.5
T-test for equality of means
                                                Sig.         Mean         Std. Error   95% Confidence Interval of the Difference
                              t        df       (2-tailed)   Difference   Difference   Lower      Upper
Equal variances assumed       -9.209   28       .000         -27.133      2.946        -33.169    -21.098
Equal variances not assumed   -9.209   24.006   .000         -27.133      2.946        -33.214    -21.052
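The first row of Table 13.5 can be approximated with the sketch below, which pools the two variances as the equal-variances version of the test does. The critical t of 2.048 (two-tailed, .05, df = 28) is taken from a standard t table and is an assumption of this sketch, as are the function names. Because the same 15 learners took both tests, a paired version based on each learner's gain is sketched as well; it is offered only as an alternative way of treating dependent measures and is not part of the SPSS output shown above.

```python
import math
import statistics

# Raw scores from Table 13.2 (same participant order in both lists)
pretest = [35, 27, 16, 26, 21, 20, 24, 13, 16, 21, 19, 18, 9, 20, 23]
posttest = [59, 57, 57, 57, 54, 53, 52, 49, 49, 48, 43, 38, 38, 32, 29]

T_CRITICAL_DF28 = 2.048  # assumed two-tailed .05 critical t for df = 28

def independent_t(group1, group2, t_critical):
    """Equal-variances t test with a confidence interval for the mean difference."""
    n1, n2 = len(group1), len(group2)
    mean_difference = statistics.mean(group1) - statistics.mean(group2)
    pooled_variance = ((n1 - 1) * statistics.variance(group1)
                       + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    se_difference = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return {
        "t": mean_difference / se_difference,
        "df": n1 + n2 - 2,
        "mean_difference": mean_difference,
        "se_difference": se_difference,
        "lower": mean_difference - t_critical * se_difference,
        "upper": mean_difference + t_critical * se_difference,
    }

def paired_t(before, after):
    """t test on each participant's gain (after minus before); returns (t, df)."""
    gains = [a - b for b, a in zip(before, after)]
    se_gain = statistics.stdev(gains) / math.sqrt(len(gains))
    return statistics.mean(gains) / se_gain, len(gains) - 1

# Group 1 = pretest and group 2 = posttest, as coded in the SPSS file
summary = independent_t(pretest, posttest, T_CRITICAL_DF28)
t_paired, df_paired = paired_t(pretest, posttest)
print({key: round(value, 3) for key, value in summary.items()})
print(round(t_paired, 2), df_paired)
```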
13.4 One-Way between Groups ANOVA with Post-Hoc Tests

Based on their findings, Shiotsu and Weir (2007) claimed that syntactic
knowledge is relatively more significant than “vocabulary breadth in
predicting text reading comprehension test performance” (pp. 123-4).
They administered three tests in their study: a 32-item grammar test
designed by Weir (1983, pp. 371–3) and Educational Testing Service, a
20-item reading comprehension test whose passages were taken from
Yang and Weir (1998) and Lee and Schallert (1997), and the 60-item
vocabulary level test developed by Schmitt, Schmitt and Clapham
(2001). Shiotsu and Weir (2007) administered the three tests to 591
participants.
Table 13.6 presents the findings of Shiotsu and Weir (2007). As can be
seen, the correlation coefficient obtained, i.e., r, between the grammar
test and the reading comprehension test (.85) is higher than that between
the vocabulary and reading comprehension tests (.79).

Table 13.6
Regression and correlation among three tests (n = 591)
                      Reading × Grammar   Reading × Vocabulary   Grammar × Vocabulary
Beta                  .64                 .25                    -
r                     .85                 .79                    .84
% explained           72%                 62%                    70%
% jointly explained   74%                                        -
The results presented in Table 13.6 are based on structural equation
modelling (SEM) calculated by utilizing the software AMOS (Arbuckle
& Wothke, 1999). According to Shiotsu and Weir (2007), SEM
“allows the researcher to evaluate regression models that take latent
ability variables as the predictors and criterion, and one of its many
advantages over conventional regression is that it takes into account
and partials out the differences in measurement errors across the
observed variables” (p. 104).
Latent variables or factors described in the quotation above are
themselves not directly observable (Tucker & MacCallum 1997, p. 2)
and are estimated through complex statistical analyses. Furthermore, the
number of factors may vary depending on the way they are extracted
(see Costello & Osborne 2005, Khodadady & Hashemi, in press). For
these reasons, Khodadady, Pishghadam and Fakhar (under review)
designed a study in order to find out whether similar results would be
obtained if the relationships among grammatical knowledge, reading
comprehension ability and vocabulary knowledge were explored
experimentally rather than factorially.
Khodadady, Pishghadam and Fakhar (2010) taught the first five units of
True to life: Intermediate class book (Gairns & Redman 1996, pp. 4-39),
43 units of English vocabulary in use: Pre-intermediate & intermediate
(Redman 2003, pp. 72-178) and 22 units of English Grammar in Use
(Murphy 2004) to three groups named control, grammar and vocabulary
groups. While the three groups were all taught communicatively, only
the grammar and vocabulary groups received explicit teaching in the
grammatical points and the vocabulary used in the units of the three
textbooks, respectively.
Since Khodadady, Pishghadam and Fakhar (2010) were teaching
English to learners at an intermediate level in a private language
institute, they had no choice but to work with intact groups whose
members could not be changed. So they designed a schema-based
cloze multiple choice item test (MCIT) on the first five units of True to
life: Intermediate class book (Gairns & Redman 1996) in order to find
out whether the learners were at the same level of proficiency and their
reading comprehension ability did not differ from each other
significantly when they started their study.
The three groups established by Khodadady, Pishghadam and Fakhar
(2010) form an independent variable with three values, i.e., control,
grammar and vocabulary. The phrase One-Way used in the title, i.e.,
One-Way between Groups ANOVA, thus shows that there is only one
independent variable, which here consists of three groups. The
performance of the learners in the three groups on the schema-based
cloze MCIT forms the dependent variable of the study.
13.4.1 Applying One-Way between Groups ANOVA to Raw Scores

via SPSS
In order to save space we will not focus on the manual calculation of the
One-Way between Groups ANOVA. Instead, we will employ the SPSS
to do the calculation automatically.
First, we need to establish an SPSS file. Since the control and
experimental groups’ performance on the schema-based reading
comprehension test needs to be explored, we will name the file
TotalSchePretest.sav.
In the Variable View sheet of the SPSS file, we will create a variable
called Code and assign a single code to each learner. Each code can be
a number. The names of the learners can form the second, string
variable. The group to which each learner belongs can be established as
the third variable.
In the sixth column of the Variable View sheet, which is called Values,
click on the right side of the cell in front of the third variable named
Group to open the Value Labels dialogue box. Type 1 in the box in front
of Value and Control in front of Label and then click on the button Add
as shown in Figure 13.7. Give the values of 2 and 3 to the Grammar and
Vocabulary groups.
Figure 13.7
Defining and assigning values to the variable called Group
The scores on the schema-based cloze MCIT administered as a pretest
will form the fourth variable, named Schema. Since Group is a
categorical independent variable and Schema a continuous dependent
variable, we can now run the One-Way between Groups ANOVA on the SPSS.
Figure 13.8
Activating One-Way ANOVA on the Analyze menu
Click on the Analyze button in the top menu as shown in Figure 13.8
above. Move down the menu to reach Compare Means and then go to
One-Way ANOVA. As soon as you click on this icon, its dialogue box
will open as shown in Figure 13.9 below. Click on the variable Schema
and then click on the top rightward arrow to move it to the box called
Dependent list. Then click on the variable Group and use the bottom
rightward arrow to move it to the box called Factor as shown in Figure
13.10 below.

Figure 13.9
Activating One-Way ANOVA
Figure 13.10
Specifying the dependent and factor variables
Since we have specified our dependent variable, i.e., Schema, and
independent or factor variable, i.e., Group, we can now click on the OK
button at the bottom left corner of the dialogue box to find out whether
there was any significant difference among the three groups.
Table 13.7 presents the One-Way ANOVA analysis of scores obtained
by control and experimental groups on the syllabus-and-schema based
reading comprehension pretest. (Note that the table has been reformatted
to match the ones given in scholarly journals.)
Table 13.7
One-Way ANOVA analysis of scores obtained by control and
experimental groups on the syllabus-and-schema based reading
comprehension pretest
                 Sum of Squares   df   Mean Square   F      Sig.
Between Groups   6.594            2    3.297         .110   .896
Within Groups    2359.467         79   29.867
Total            2366.061         81

As we can see in Table 13.7, the scores of 82 learners forming the
control, grammar and vocabulary groups did not differ significantly
from each other on the syllabus-and-schema based reading
comprehension pretest because the F obtained, i.e., .110, is not
significant, i.e., p > .05. Based on this result, Khodadady, Pishghadam
and Fakhar (under review) claimed that the participants in the three
groups were all at the same level of reading comprehension ability when
their experiment started.
Table 13.8 presents the One-Way ANOVA analysis of scores obtained
by control and experimental groups on the same syllabus-and-schema
based reading comprehension test administered at the end of the
experiment. As we can see, the F obtained, i.e., 3.736, is significant, i.e.,
p < .05. This result shows that the treatment was effective. However, we
do not know which explicit teaching was more effective, i.e., grammar,
vocabulary or both!
Table 13.8
One-Way ANOVA analysis of scores obtained by control and
experimental groups on the syllabus-and-schema based reading
comprehension posttest
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   297.591          2    148.795       3.736   .028
Within Groups    3146.507         79   39.829
Total            3444.098         81
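Although we do not calculate the One-Way ANOVA manually in this chapter, the way the F values in Tables 13.7 and 13.8 follow from the sums of squares and degrees of freedom can be sketched briefly. The function names and the three small groups of toy scores are our own and serve illustration only; the sums of squares are those reported in the two tables.

```python
import statistics

def one_way_anova(groups):
    """Return (SS between, SS within, df between, df within, F) for lists of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)
    # Between groups: distance of each group mean from the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: distance of each score from its own group mean
    ss_within = sum((s - statistics.mean(g)) ** 2 for g in groups for s in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f

def f_from_table(ss_between, df_between, ss_within, df_within):
    """F is the between-groups mean square divided by the within-groups mean square."""
    return (ss_between / df_between) / (ss_within / df_within)

# Sums of squares and degrees of freedom reported in Tables 13.7 and 13.8
f_pretest = f_from_table(6.594, 2, 2359.467, 79)
f_posttest = f_from_table(297.591, 2, 3146.507, 79)

# Toy scores for illustration only (not data from the study)
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
toy_result = one_way_anova(toy_groups)
print(round(f_pretest, 3), round(f_posttest, 3), toy_result)
```

The function one_way_anova can also be applied to raw scores entered as one list per group.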
In order to find out which of the three groups’ mean scores on the
posttest differed significantly, we need to run the Scheffe Post Hoc Test.
First, we need to specify the members of the three groups by giving
them the values of 1, 2 and 3 as we did for the pretest; then we need to
enter each member’s score on the posttest as a continuous variable. We
have to follow the steps described in Figures 13.9 and 13.10 once again
and designate the groups as a factor or independent variable and the
scores on the posttest as a dependent variable.
As you can see in Figure 13.10, there are three buttons on the right
corner of the dialogue box One-Way ANOVA. If you click on the
second button, i.e., Post Hoc, another dialogue box called One-Way
ANOVA: Post Hoc Multiple Comparisons will appear as shown in
Figure 13.11.
Figure 13.11
Activating Post Hoc Multiple Comparison
Since it is assumed that the members of the three groups will all be of
the same level of reading comprehension ability, in the top part labeled
Equal Variances Assumed click in the box named Scheffe. Then go to
the bottom of the dialogue box and click on Continue. It will take you
back to the One-Way ANOVA dialogue box. If you click on OK
button, the SPSS will provide you with the data related to the Scheffe
Post Hoc Test.
Table 13.9 presents the Scheffe Post Hoc Test of the scores obtained on
the syllabus-and-schema-based reading comprehension posttest. As we can
see, only the mean score of the vocabulary group is significantly
different from that of the control group, i.e., MD = 4.387, p < .05. The
results obtained by Khodadady, Pishghadam and Fakhar (under review),
therefore, show that only the explicit teaching of the vocabulary used in
the units of teaching materials brings about a significant increase in
the learners’ scores on syllabus-and-schema-based reading comprehension
posttests.
Table 13.9
The Scheffe Post Hoc Test of the scores obtained on
syllabus-and-schema-based reading comprehension posttest
                          Mean Difference                       95% Confidence Interval
(I) Group    (J) Group    (I-J)             Std. Error   Sig.   Lower Bound   Upper Bound
Control      Grammar      -3.787            1.752        .103   -8.16         .58
             Vocabulary   -4.387*           1.709        .042   -8.65         -.12
Grammar      Control      3.787             1.752        .103   -.58          8.16
             Vocabulary   -.600             1.674        .938   -4.78         3.58
Vocabulary   Control      4.387*            1.709        .042   .12           8.65
             Grammar      .600              1.674        .938   -3.58         4.78
* The mean difference is significant at the 0.05 level.
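The standard errors and significance decisions in Table 13.9 can be approximated with the sketch below. Several of its ingredients are assumptions rather than values given by the SPSS output: the group sizes (25, 27 and 30) are counted from the group codes in Table 13.10 on the assumption that the same 82 learners took the posttest, the critical F of 3.11 for df = (2, 79) at the .05 level is an approximate value from a standard F table, and the function name is our own. The within-groups mean square comes from Table 13.8 and the mean differences from Table 13.9.

```python
import math

MS_WITHIN = 39.829    # within-groups mean square reported in Table 13.8
K_GROUPS = 3          # control, grammar and vocabulary
F_CRITICAL_05 = 3.11  # assumed approximate critical F for df = (2, 79)

# Assumed group sizes, counted from the group codes in Table 13.10
GROUP_SIZES = {"Control": 25, "Grammar": 27, "Vocabulary": 30}

# Mean differences (I - J) reported in Table 13.9
MEAN_DIFFERENCES = {
    ("Control", "Grammar"): -3.787,
    ("Control", "Vocabulary"): -4.387,
    ("Grammar", "Vocabulary"): -0.600,
}

def scheffe_comparison(group_i, group_j):
    """Return (standard error, Scheffe F, significant?) for one pair of groups."""
    se = math.sqrt(MS_WITHIN * (1 / GROUP_SIZES[group_i] + 1 / GROUP_SIZES[group_j]))
    difference = MEAN_DIFFERENCES[(group_i, group_j)]
    # The Scheffe F divides the squared t-like ratio by the number of groups minus 1
    f_scheffe = (difference / se) ** 2 / (K_GROUPS - 1)
    return se, f_scheffe, f_scheffe > F_CRITICAL_05

results = {pair: scheffe_comparison(*pair) for pair in MEAN_DIFFERENCES}
for pair, (se, f_scheffe, significant) in results.items():
    print(pair, round(se, 3), round(f_scheffe, 2), significant)
```

Under these assumptions only the control and vocabulary comparison exceeds the critical value, which agrees with the asterisks in the table.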
Table 13.10 presents the 82 participants’ code (C), group (G) and scores
on the unseen reading comprehension test (UR) designed and
administered by Khodadady, Pishghadam and Fakhar (under review).
Establish an SPSS file and create three variables corresponding to the
codes, groups and scores given in the table. Then apply One-Way
ANOVA analysis to the variables to find out whether the performance of
the three groups, i.e., Control (1), Grammar (2) and Vocabulary (3), on
the UR was significantly different. Then apply the Scheffe Post Hoc Test
to find out whether any significant difference appears between two or
three groups. Reach conclusions on the basis of the results you obtain
from the analysis and test.

Table 13.10
The participants’ code (C), group (G) and scores on the unseen reading
comprehension test (UR)
C G UR C G UR C G UR C G UR C G UR
1 1 19 18 1 13 35 3 24 52 3 23 69 2 22
2 1 19 19 1 17 36 3 24 53 3 23 70 2 13
3 1 18 20 1 12 37 3 24 54 3 24 71 2 21
4 1 12 21 1 12 38 3 25 55 3 25 72 2 20
5 1 21 22 1 11 39 3 22 56 2 18 73 2 24
6 1 15 23 1 14 40 3 21 57 2 18 74 2 17
7 1 14 24 1 22 41 3 25 58 2 19 75 2 25
8 1 13 25 1 22 42 3 19 59 2 19 76 2 23
9 1 17 26 3 20 43 3 21 60 2 12 77 2 24
10 1 18 27 3 13 44 3 23 61 2 20 78 2 23
11 1 17 28 3 14 45 3 20 62 2 20 79 2 23
12 1 18 29 3 20 46 3 21 63 2 23 80 2 21
13 1 13 30 3 20 47 3 19 64 2 18 81 2 18
14 1 19 31 3 22 48 3 9 65 2 20 82 2 21
15 1 17 32 3 18 49 3 22 66 2 15
16 1 20 33 3 21 50 3 23 67 2 17
17 1 14 34 3 17 51 3 26 68 2 19
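For readers who wish to cross-check their SPSS output, the F ratio of a One-Way ANOVA can also be computed by hand. The sketch below is only an illustration written in Python, not part of the SPSS procedure; the scores are transcribed from Table 13.10, and the names ur_scores and one_way_anova are hypothetical. It returns the F ratio and its degrees of freedom only; the significance level still has to be read from the SPSS output or an F table.

```python
from statistics import mean

# UR scores transcribed from Table 13.10, keyed by group code:
# 1 = Control, 2 = Grammar, 3 = Vocabulary.
ur_scores = {
    1: [19, 19, 18, 12, 21, 15, 14, 13, 17, 18, 17, 18, 13, 19, 17, 20, 14,
        13, 17, 12, 12, 11, 14, 22, 22],
    2: [18, 18, 19, 19, 12, 20, 20, 23, 18, 20, 15, 17, 19,
        22, 13, 21, 20, 24, 17, 25, 23, 24, 23, 23, 21, 18, 21],
    3: [20, 13, 14, 20, 20, 22, 18, 21, 17,
        24, 24, 24, 25, 22, 21, 25, 19, 21, 23, 20, 21, 19, 9, 22, 23, 26,
        23, 23, 24, 25],
}


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of score lists."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    ss_between = 0.0
    ss_within = 0.0
    for scores in groups.values():
        group_mean = mean(scores)
        # Between-groups: distance of each group mean from the grand mean.
        ss_between += len(scores) * (group_mean - grand_mean) ** 2
        # Within-groups: distance of each score from its own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in scores)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


f_ratio, df_between, df_within = one_way_anova(ur_scores)
print(f"F({df_between}, {df_within}) = {f_ratio:.3f}")
```

The F ratio printed here should agree with the value SPSS reports in its ANOVA table for the same data.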
13.5 Regression Analysis

Multiple linear regression models the value of a dependent interval
variable, such as scores on an unseen reading comprehension test, on the
basis of its linear relationship to two or more independent interval
variables or predictors, such as scores on grammar and vocabulary
knowledge tests. It assumes that there is a linear, or "straight line,"
relationship between the dependent variable and each predictor. In this
chapter we will familiarize ourselves with the two most frequently used
analyses, i.e., standard and hierarchical multiple regression.
13.5.1 Standard Multiple Regression

In order to run a sample linear regression analysis, we need to focus on
the assumptions upon which regression is based. As Pallant (2007)
convincingly argued, the analysis “is not all that forgiving if they are
violated” (p. 149). The first and most important assumption deals with
sample size. According to Stevens (1996), in social sciences at least 15
participants per independent variable are required to have a reliable
equation.
Tabachnick and Fidell (2007), however, provided researchers with a
formula, i.e., N > 50 + 8m, where m stands for the number of
independent variables. Based on this formula, we can understand why
Khodadady, Pishghadam and Fakhar (2010) did not run regression
analysis to predict their 27 learners’ performance on the unseen reading
comprehension test, i.e., the dependent variable, by using their scores
on the grammar posttest and syllabus-based reading pretest, i.e., the
independent variables. They would have needed more than 66 participants,
i.e., N > 50 + (8×2) = 66, to do so if they followed Tabachnick and
Fidell’s formula.
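The two sample-size rules mentioned above can be checked with a few lines of code. The following is a minimal sketch in Python; the function names are hypothetical and are used only for illustration.

```python
def meets_tabachnick_fidell(n_cases, n_predictors):
    """True if N > 50 + 8m (Tabachnick and Fidell's formula)."""
    return n_cases > 50 + 8 * n_predictors


def meets_stevens(n_cases, n_predictors, per_predictor=15):
    """True if there are at least 15 participants per independent variable."""
    return n_cases >= per_predictor * n_predictors


def smallest_n_tabachnick_fidell(n_predictors):
    """Smallest whole number of participants satisfying N > 50 + 8m."""
    return 50 + 8 * n_predictors + 1


# 27 learners and two independent variables (the study discussed above).
print(meets_tabachnick_fidell(27, 2))   # False
print(smallest_n_tabachnick_fidell(2))  # 67
# 402 students and three independent variables (the data used in this chapter).
print(meets_tabachnick_fidell(402, 3))  # True
```

Because the formula uses a strict inequality, the smallest acceptable sample for two independent variables is 67 rather than 66.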
The second assumption involves the existence of a relationship between
the dependent variable and each independent variable. Before we apply
regression analysis to our data we need to make sure that the two
variables correlate “above .3 preferably” (Pallant, 2007, p. 155).
Running a bivariate correlation analysis of our data is, therefore,
necessary before we submit them to more sophisticated analyses.
The third assumption deals with multicollinearity. When two
independent variables correlate highly with each other, i.e., r = .90 and
above, they should not be used together to predict the dependent
variable. It must, however, be emphasized that two independent
variables very rarely correlate at .9 and higher. For this reason,
Pallant (2007) suggested “.7 or more” (p. 155) as the cut-off.
The fourth assumption involves singularity. It occurs when one
independent variable forms a part of another independent variable. For
example, the latest paper and pencil version of TOEFL consists of two
sections, i.e., structure and reading comprehension. If we use the
structure section and the TOEFL itself to predict the test takers’
achievement in a course offered at a university, we violate the
singularity assumption. However, we can use the two sections as
independent variables to predict their achievement because neither
forms a part of the other.
The fifth assumption concerns outliers. They are nothing but very high
or very low scores obtained by participants. Before running the analysis
we must screen our data for outliers, which are defined by Tabachnick
and Fidell (2007) as cases whose standardized residual values fall above
3.3 or below -3.3. Residuals are technically the differences between the
obtained and predicted scores on the dependent variable. The outliers of
both dependent and independent variables must be identified, and
modified or deleted if they prove to be many. A few outliers do not,
however, have a noticeable effect if the sample is large enough. When
requested in the Save dialogue box, SPSS estimates the standardized
residual values for the dependent variable and presents them as a
separate variable called ZRE_1 in the Data View sheet.
In order to have hands-on practice with regression analysis, the data
collected by Khodadady (2012) will be used in this chapter. Out of 430
undergraduate and graduate students, the scores of 402 students are
reported in Appendix 13.2 because they all took four tests voluntarily,
i.e., the disclosed 110-item TOEFL, 99-item C-Test (Klein-Braley,
1997), 60-item S-Test (Gholami, 2006) and Nation’s 60-item Lexical
Knowledge Test (LKT) published by Schmitt, Schmitt, and Clapham
(2001). The analysis will be guided by answering the research questions
below:
Q1: How well do the C-Test, S-Test and LKT predict the TOEFL?
Q2: How much variance in the TOEFL scores can be explained by
scores on the three tests?
Q3: Which test is the best predictor of the TOEFL?
To answer the questions above, we need to enter the data in an SPSS
Data View sheet. The first variable will be the Codes given to the
students. The 2nd, 3rd, 4th and 5th will be the scores obtained on the
TOEFL, C-Test, S-Test and LKT, respectively. Since a large number of
students sit for the TOEFL to gain admission to universities in English
speaking countries, it will be treated as the dependent variable, and
the C-Test, S-Test and LKT will be adopted as independent variables to
find out which of these tests is the best predictor of success on the
TOEFL.
In order to run the standard multiple regression analysis we need to click
on the Analyze menu on the top bar, click on Regression and then choose
Linear to have the dialogue box shown in Figure 13.12 activated.

Figure 13.12. Activating the linear regression dialogue box and specifying variables
Figure 13.13. Marking relevant boxes in Statistics dialogue box
In the box shown in Figure 13.12 above, do the following:

1. Use the top rightward arrow to move TOEFL from the box on the left
to the box on the top right labeled Dependent.
2. Use the second rightward arrow to move the C-Test, S-Test and LKT
into the second box labeled Independent(s).
3. Let the Method stay Enter as it is.
4. Click on Statistics button on the right top corner of the box to activate
its dialogue box shown in Figure 13.13.
5. Check (√) Estimates, Confidence Intervals, Model fit,
Descriptives, Part and partial correlations and Collinearity
diagnostics as shown in Figure 13.13 and then click on the Continue
button to return to the Linear Regression dialogue box.
6. On the right top corner click on the second button called Plots to
activate its dialogue box shown in Figure 13.14.
Using the rightward arrow, move *ZRESID into the Y box. Click on
*ZPRED and the arrow button to move it to the X box. Under the box
on the left, two alternatives are offered for Standardized Residual
Plots. Tick the box labeled Normal probability plot. Click on
Continue to return to the Linear Regression dialogue box.

Figure 13.14. Specifying functions in Plots
Figure 13.15. Specifying functions in Save
7. Click on the Save button on the right top corner to activate its
dialogue box as shown in Figure 13.15. In the section labeled
Residuals, tick Standardized and in the Distances section, tick the
Mahalanobis and Cook’s boxes. Click on Continue to return to
Linear Regression.
8. Click on the Options button. In the Missing Values section, click on
Exclude cases pairwise. Click on Continue to return to Linear
Regression.
9. Click on OK.
The output generated by following the steps specified above is
presented and discussed, albeit briefly, table by table below. The first
table presents the correlation coefficients obtained among the
variables. As we can see, the TOEFL shows significant relationships with
the three independent variables, as they do with each other, and none of
the relationships among the independent variables is too high, i.e.,
r = .7 or higher. However, if you find highly correlated independent
variables in your own data, you may need to delete one of them or form a
composite variable from the scores obtained on the two highly
correlating variables.
Correlations
                               TOEFL    C-Test   S-Test   LKT
Pearson Correlation   TOEFL    1.000    .679     .580     .569
                      C-Test   .679     1.000    .524     .451
                      S-Test   .580     .524     1.000    .579
                      LKT      .569     .451     .579     1.000
Sig. (1-tailed)       TOEFL    .        .000     .000     .000
                      C-Test   .000     .        .000     .000
                      S-Test   .000     .000     .        .000
                      LKT      .000     .000     .000     .
N                     TOEFL    402      402      402      402
                      C-Test   402      402      402      402
                      S-Test   402      402      402      402
                      LKT      402      402      402      402
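The two checks made on the correlations table, i.e., that each independent variable correlates above .3 with the dependent variable and that no two independent variables correlate at .7 or higher, can be sketched in Python as follows. The coefficients are copied from the table above; the dictionary and function names are hypothetical.

```python
# Correlations of each independent variable with the TOEFL.
with_toefl = {"C-Test": .679, "S-Test": .580, "LKT": .569}

# Correlations among the independent variables themselves.
among_predictors = {
    ("C-Test", "S-Test"): .524,
    ("C-Test", "LKT"): .451,
    ("S-Test", "LKT"): .579,
}


def weak_predictors(correlations, minimum=0.3):
    """Independent variables that do not correlate above the minimum."""
    return [name for name, r in correlations.items() if abs(r) <= minimum]


def collinear_pairs(correlations, cutoff=0.7):
    """Pairs of independent variables correlating at the cutoff or higher."""
    return [pair for pair, r in correlations.items() if abs(r) >= cutoff]


print(weak_predictors(with_toefl))        # []
print(collinear_pairs(among_predictors))  # []
```

Both lists are empty for our data, which agrees with the reading of the table given above.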
In order to find out whether there is any multicollinearity among the
variables we need to check another table based on the analysis we
requested SPSS to run, i.e., Collinearity diagnostics. The results are
presented in a table called Coefficients. A number of columns have
been deleted from the original table so that its results can be presented
within the width of the present page and the shortened table is given
below.

Coefficients (a)
Model 1       Unstandardized Coefficients   Standardized Coefficients   Correlations                    Collinearity Statistics
              B         Std. Error          Beta                        Zero-order   Partial   Part     Tolerance   VIF
(Constant)    29.027    2.376
C-Test        .565      .048                .466                        .679         .508      .388     .692        1.444
S-Test        .361      .081                .193                        .580         .218      .146     .578        1.729
LKT           .319      .053                .247                        .569         .287      .197     .635        1.575
a. Dependent Variable: TOEFL
Two values need to be consulted in the table above, i.e., Tolerance and
VIF. The first shows how much of the variability of an independent
variable is not explained by the other independent variables in the
model. For each variable it is calculated using the formula 1 - R
squared, where R squared is obtained by regressing that variable on the
other independent variables. If the Tolerance value is very small, i.e.,
less than .10, it shows multicollinearity among the variables. As we can
see, the values for the C-Test, S-Test and LKT are .692, .578, and .635,
respectively. Since they are all greater than .10, they show that we
have not violated the multicollinearity assumption.
The VIF (variance inflation factor) is the inverse of the Tolerance
value. It is obtained by dividing 1 by the Tolerance value. VIF values
above 10 indicate multicollinearity. Since the VIF values of the C-Test,
S-Test and LKT are less than 10, i.e., 1.444, 1.729, and 1.575,
respectively, they provide further evidence that we have not violated
the multicollinearity assumption.
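To see where the Tolerance and VIF values come from, they can be approximated from the correlations among the three independent variables. The sketch below, written in Python, is an illustration only: it relies on the standard formula for R squared when one variable is regressed on two others, the function names are hypothetical, and because the input correlations are rounded to three decimals the results may differ from the SPSS table in the third decimal place.

```python
def r_squared_on_two(r_a, r_b, r_ab):
    """R squared when a variable is regressed on two others.

    r_a and r_b are its correlations with the two other variables;
    r_ab is the correlation between those two variables.
    """
    return (r_a ** 2 + r_b ** 2 - 2 * r_a * r_b * r_ab) / (1 - r_ab ** 2)


def tolerance_and_vif(r_a, r_b, r_ab):
    tolerance = 1 - r_squared_on_two(r_a, r_b, r_ab)
    return tolerance, 1 / tolerance


# Correlations among the independent variables (from the Correlations table).
r_c_s, r_c_l, r_s_l = .524, .451, .579

results = {
    "C-Test": tolerance_and_vif(r_c_s, r_c_l, r_s_l),
    "S-Test": tolerance_and_vif(r_c_s, r_s_l, r_c_l),
    "LKT": tolerance_and_vif(r_c_l, r_s_l, r_c_s),
}

for name, (tolerance, vif) in results.items():
    problem = tolerance < .10 or vif > 10
    print(f"{name}: Tolerance = {tolerance:.3f}, VIF = {vif:.3f}, "
          f"multicollinearity = {problem}")
```

The same two cut-offs used in the text (Tolerance below .10, VIF above 10) are applied in the last step.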
In order to check the normality, linearity and homoscedasticity of data
we can check the Normal Probability Plot (P-P) as shown below.
(Homoscedasticity is the requirement that the variance of one variable
is the same at all values of the other variable, i.e., homogeneity of
variance.) As we can see, there is no major deviation from normality
because all of our points fall in a reasonably straight diagonal line
from the bottom left to the top right of the plot.
The normality of our data can also be checked by inspecting the outliers
in the Scatterplot of residuals given below. According to Tabachnick
and Fidell (2007), cases with standardized residuals of more than 3.3 or
less than -3.3 are outliers. We should, however, remember that it is
common to find some outlying residuals in large samples and ours is no
exception. As we can see in the scatterplot, there are three outliers
whose standardized residuals are less than -3.3.

The outlying standardized residuals can also be checked in our Data
View sheet. If you remember, by activating the Save button (Figure
13.15), we ticked Standardized in the Residuals section and Mahalanobis
and Cook’s in the Distances section. SPSS automatically created three
variables in our Data View sheet called MAH_1, COO_1, and ZRE_1. If
you check the ZRE_1 in Appendix 13.2 you will see that the TOEFL
scores of three students coded 332, 367 and 116 have standardized
residuals of -4.1, -3.7, and -3.6, respectively. These are the cases
placed in the red rectangle specified in the scatterplot above. (The
scores of only 30 students have been given to save space.)
We can also have SPSS find the outliers automatically and report
them in a table. To do this, we need to go to the Residuals section in
the Statistics dialogue box, activated by the first button on the top
right hand corner of the Linear Regression dialogue box, tick Casewise
diagnostics and specify the magnitude of outliers in the standard
deviations box. Since it is set at 3 by default, you can change it to
3.3 as suggested by Tabachnick and Fidell (2007). SPSS will give you the
Casewise Diagnostics table as shown below.
Casewise Diagnostics
Case Number   Std. Residual   TOEFL   Predicted Value   Residual
116           -3.578          28      66.29             -38.294
332           -4.141          43      87.33             -44.330
367           -3.728          34      73.90             -39.903
As can be seen in the table above, our model could not predict the
TOEFL scores of students 116, 332 and 367 very well. For example, the
model predicted a score of 66.29 for student 116, but s/he obtained 28
on the TOEFL. Similarly, the scores predicted for students 332 and 367
are far higher than what they scored on the TOEFL in reality. We
therefore need to check whether these outliers have had any undue
influence on the results for the model as a whole.
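The standardized residuals in the Casewise Diagnostics table can be reproduced, approximately, by dividing each raw residual (obtained minus predicted score) by the standard error of the estimate, which SPSS reports as 10.704 in the Model Summary table. The Python sketch below is only an illustration of this arithmetic; the names used are hypothetical, and small differences from the SPSS values are due to rounding of the predicted scores.

```python
STD_ERROR_OF_ESTIMATE = 10.704  # from the Model Summary table
CUT_OFF = 3.3                   # Tabachnick and Fidell (2007)

# case number: (obtained TOEFL score, predicted TOEFL score)
cases = {116: (28, 66.29), 332: (43, 87.33), 367: (34, 73.90)}


def standardized_residual(obtained, predicted,
                          std_error=STD_ERROR_OF_ESTIMATE):
    """Raw residual divided by the standard error of the estimate."""
    return (obtained - predicted) / std_error


z_residuals = {
    case: standardized_residual(obtained, predicted)
    for case, (obtained, predicted) in cases.items()
}
outliers = [case for case, z in z_residuals.items() if abs(z) > CUT_OFF]

for case, z in z_residuals.items():
    print(f"Case {case}: standardized residual = {z:.3f}")
print("Outliers:", outliers)
```

All three cases fall below -3.3, which is why SPSS lists them as outliers.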
The required results for checking the influence of outliers can be found
in the table labeled Residuals Statistics as presented below. If you
remember, we specified the statistics we need about residuals by
activating the Save dialogue box and ticking the required boxes (see
Figure 13.15). According to Tabachnick and Fidell (2007), cases with
Cook’s Distance values larger than 1 are indicative of problems. If we
check the table for the Maximum value for Cook’s Distance, we find .214,
which is less than 1, suggesting no potential problem. If the Maximum
value we find in the table is greater than 1, we need to go to our Data
View sheet to check the column COO_1 which is automatically produced by
SPSS itself. If the Cook’s Distance values given for the outliers are
greater than 1, we will need to remove them and rerun the analysis.

Residuals Statistics
                                    Minimum    Maximum   Mean     Std. Deviation   N
Predicted Value                     43.60      111.87    78.58    12.241           402
Std. Predicted Value                -2.858     2.720     .000     1.000            402
Standard Error of Predicted Value   .549       2.605     1.017    .327             402
Adjusted Predicted Value            43.76      111.92    78.60    12.236           402
Residual                            -44.330    24.463    .000     10.664           402
Std. Residual                       -4.141     2.285     .000     .996             402
Stud. Residual                      -4.177     2.292     -.001    1.003            402
Deleted Residual                    -45.086    24.607    -.018    10.804           402
Stud. Deleted Residual              -4.266     2.304     -.002    1.007            402
Mahal. Distance                     .059       22.755    2.993    2.889            402
Cook's Distance                     .000       .214      .003     .013             402
Centered Leverage Value             .000       .057      .007     .007             402
Now that we have made certain that the assumptions of regression
analysis have been met by our data, we can check the Model Summary
table given below. If we check the value given under R Square (.569),
we notice that it is almost the same as the value given for
Adjusted R Square (.565). These two values are very much the same
because our sample size is fairly large. With a small sample, it is,
however, better to report the Adjusted R Square. These results show that
about 57% of the variance (.565 × 100 = 56.5) in the TOEFL is explained
by the C-Test, S-Test and LKT.

Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .754 (a)   .569       .565                10.704
a. Predictors: (Constant), LKT, CTest, STest
b. Dependent Variable: TOEFL
In order to ensure that the results we have obtained are significant we
need to check the ANOVA table given below. As we can see, our results
are significant at .000. According to Pallant (2007), .000 is the same
as “p<.0005” (p. 158), indicating that the probability of obtaining
these results by chance is less than five in ten thousand. We can thus
announce that our results are highly significant.
ANOVA (a)
Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   60089.268        3     20029.756     174.816   .000 (b)
Residual     45601.250        398   114.576
Total        105690.518       401
b. Predictors: (Constant), LKT, C-Test, S-Test
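The values reported in the Model Summary and ANOVA tables are related to each other through simple arithmetic. The following Python sketch, given only as an illustration with hypothetical variable names, starts from the sums of squares in the ANOVA table and recovers R Square, Adjusted R Square, the F ratio and the standard error of the estimate.

```python
from math import sqrt

# Values copied from the ANOVA table above.
ss_regression = 60089.268
ss_residual = 45601.250
n_cases = 402
n_predictors = 3  # C-Test, S-Test and LKT

ss_total = ss_regression + ss_residual
df_regression = n_predictors
df_residual = n_cases - n_predictors - 1

# Proportion of the total variance explained by the model.
r_square = ss_regression / ss_total
# Adjustment for the number of predictors and the sample size.
adjusted_r_square = 1 - (1 - r_square) * (n_cases - 1) / df_residual
# F is the ratio of the two mean squares.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual
std_error_of_estimate = sqrt(ms_residual)

print(f"R Square = {r_square:.3f}")
print(f"Adjusted R Square = {adjusted_r_square:.3f}")
print(f"F({df_regression}, {df_residual}) = {f_ratio:.3f}")
print(f"Std. Error of the Estimate = {std_error_of_estimate:.3f}")
```

The adjustment shrinks R Square only slightly here because 402 cases are large relative to three predictors; with a small sample the gap between the two values would be wider.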
Upon securing the significance of our model, we can now focus on each
of the independent variables inserted in the model to determine which
one contributes to the prediction of the dependent variable significantly.
This can be achieved by scrutinizing the Coefficients table given below.
(It is broken into two parts to be fitted in the width of the page.) We
need to employ the Standardized Coefficients to discuss our findings.
We use these particular coefficients because standardization allows us
to compare the results obtained on all tests with each other. (The
Unstandardized coefficients can be used in constructing a regression
equation for applied purposes, which is not the purpose of this study at
present.)
Coefficients (a)
Model 1       Unstandardized Coefficients   Standardized Coefficients   t        Sig.
              B         Std. Error          Beta
(Constant)    29.027    2.376                                           12.216   .000
C-Test        .565      .048                .466                        11.775   .000
S-Test        .361      .081                .193                        4.447    .000
LKT           .319      .053                .247                        5.986    .000

Coefficients (a) (Continued)
Model 1       95.0% Confidence Interval for B   Correlations                    Collinearity Statistics
              Lower Bound   Upper Bound         Zero-order   Partial   Part     Tolerance   VIF
(Constant)    24.355        33.698
C-Test        .471          .659                .679         .508      .388     .692        1.444
S-Test        .201          .520                .580         .218      .146     .578        1.729
LKT           .214          .424                .569         .287      .197     .635        1.575
As we can see in the first part of the Coefficients table, among the
three independent variables, the C-Test has the largest Beta (.466).
This means that the C-Test makes the strongest unique contribution to
explaining the TOEFL when the variance explained by all other variables
in the model is controlled for. Since the value given in the Sig. column
for the C-Test is .000, we can say that it is making a statistically
significant and unique contribution to the equation.
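The Beta values are not independent of the correlations table: with standardized variables, the Betas are the solution of a small system of linear equations built from the correlations among the independent variables and their correlations with the TOEFL. The Python sketch below is an illustration of this relationship only, not of how SPSS computes its output; the function name is hypothetical, and since the input correlations are rounded to three decimals the results agree with the table only approximately.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix * x = vector by Gauss-Jordan elimination."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: bring the largest entry into the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


# Correlations among C-Test, S-Test and LKT (in that order).
among_predictors = [
    [1.000, .524, .451],
    [.524, 1.000, .579],
    [.451, .579, 1.000],
]
# Correlations of C-Test, S-Test and LKT with the TOEFL.
with_toefl = [.679, .580, .569]

betas = solve_linear_system(among_predictors, with_toefl)
# R Square equals the sum of each Beta times its zero-order correlation.
r_squared = sum(b * r for b, r in zip(betas, with_toefl))

for name, beta in zip(["C-Test", "S-Test", "LKT"], betas):
    print(f"{name}: Beta = {beta:.3f}")
print(f"R Square = {r_squared:.3f}")
```

The last step also shows how R Square in the Model Summary table is tied to the Betas and the zero-order correlations.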
We can also get more results from the table of Coefficients above. In the
Correlations columns, we can look for Part correlation coefficients
referred to as “semipartial correlation coefficients” by Tabachnick and
Fidell (2007, p. 145). By squaring the Part coefficients we can
determine how much of the total variance in the TOEFL is explained
uniquely by each independent variable. For example, the Part coefficient
for the C-Test is .388. If we square it, i.e., multiply it by itself
(.388×.388), we obtain 0.15, indicating that the C-Test uniquely
explains 15 percent of the variance in the TOEFL. Similarly, 2%
(.146×.146=0.021) and 4% (.197×.197=0.038) of the variance in the TOEFL
are explained uniquely by the S-Test and LKT, respectively. If we add up
15, 2 and 4 percent we get 21 percent, which is much less than the total
R square value (.57) given in the Model Summary table. This is because
the total R square value includes both the common (.36) and unique (.21)
variances explained by the three independent variables.
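The arithmetic of unique and common variance described above can be laid out as a short Python sketch. It is given only as an illustration; the variable names are hypothetical and the Part coefficients and R Square are copied from the tables above.

```python
part_correlations = {"C-Test": .388, "S-Test": .146, "LKT": .197}
r_square = .569  # from the Model Summary table

# Squaring each Part coefficient gives the variance explained uniquely.
unique_variance = {name: r ** 2 for name, r in part_correlations.items()}
total_unique = sum(unique_variance.values())
# Whatever remains of R Square is shared among the independent variables.
common_variance = r_square - total_unique

for name, value in unique_variance.items():
    print(f"{name}: {value * 100:.1f}% of the variance explained uniquely")
print(f"Total unique variance: {total_unique:.2f}")
print(f"Common variance: {common_variance:.2f}")
```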
13.5.2 Hierarchical Multiple Regression

Hierarchical or sequential multiple regression requires entering the
variables under study in blocks in a predetermined order. Employing the
data analysed in the standard multiple regression, we will, for example,
apply the hierarchical method to answer the research question below.
Q: If we control for the possible effect of age and GPA, are the C-Test,
S-Test and LKT still able to predict a significant amount of the variance
in the TOEFL?
We need to take the steps below to answer the question above:

1. Click on Analyze, choose Regression and then click on Linear.
2. Move the interval variable TOEFL to the Dependent box.
3. Move age and GPA as control variables to the Independent(s) box
to form Block 1 of 1 as shown in Figure 13.16.
4. Click on the Next button to activate a second independent variables
box and enter the variables under study, i.e., C-Test, S-Test and LKT,
as the second block.
Figure 13.16. Forming block 1
Forming block 2
5. Make sure that the Method is the default Enter.

6. Click on the
7. Click on Statistics button to tick the boxes Estimates, Model Fit, R
squared change, Descriptives, Part and partial correlations and
Collinearity diagnostics. Click on Continue.
8. Click on the Options button and tick Exclude cases pairwise in
the Missing Values section. Click on Continue.
9. Click on the Save button and tick Mahalanobis and Cook’s boxes in
Distances section.
10. Click on Continue to return to the main dialogue box and click OK.
The table below provides the model summary. When the variables of age
and GPA are entered as model 1, they explain 2.9% (.029×100) of the
variance, as shown in the R Square Change column. The inclusion of the
C-Test, S-Test and LKT in model 2 explains an additional 54.5%
(.545×100) of the variance in the TOEFL when the effects of age and GPA
are statistically controlled.

Model Summary (c)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate   Change Statistics
                                                                               R Square Change   F Change   df1   df2   Sig. F Change
1       .170 (a)   .029       .024                16.039                       .029              5.916      2     399   .003
2       .758 (b)   .574       .569                10.663                       .545              168.926    3     396   .000
a. Predictors: (Constant), GPA, Age
b. Predictors: (Constant), GPA, Age, LKT, CTest, STest
c. Dependent Variable: TOEFL
The significance of the variance explained is given in the table labeled
ANOVA presented below. As can be seen in the table, model 2 as a whole,
which includes age and GPA as well as the C-Test, S-Test and LKT, is
significant, i.e., F(5, 396) = 106.710, p < .0005.
ANOVA (a)
Model             Sum of Squares   df    Mean Square   F         Sig.
1   Regression    3043.881         2     1521.941      5.916     .003 (b)
    Residual      102646.636       399   257.260
    Total         105690.518       401
2   Regression    60665.047        5     12133.009     106.710   .000 (c)
    Residual      45025.471        396   113.701
    Total         105690.518       401
b. Predictors: (Constant), GPA, Age
c. Predictors: (Constant), GPA, Age, LKT, C-Test, S-Test
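The R Square Change and F Change values in the Model Summary table can be derived from the two ANOVA models. The Python sketch below is an illustration only, with hypothetical variable names; it uses the sums of squares copied from the ANOVA table above.

```python
ss_total = 105690.518
n_cases = 402

# Model 1: age and GPA. Model 2: age, GPA, C-Test, S-Test and LKT.
ss_regression_1, predictors_1 = 3043.881, 2
ss_regression_2, predictors_2 = 60665.047, 5

r_square_1 = ss_regression_1 / ss_total
r_square_2 = ss_regression_2 / ss_total
r_square_change = r_square_2 - r_square_1

# Degrees of freedom for the change: predictors added, and residual df
# of the fuller model.
df1 = predictors_2 - predictors_1
df2 = n_cases - predictors_2 - 1
f_change = (r_square_change / df1) / ((1 - r_square_2) / df2)

print(f"R Square (model 1) = {r_square_1:.3f}")
print(f"R Square (model 2) = {r_square_2:.3f}")
print(f"R Square Change = {r_square_change:.3f}")
print(f"F Change({df1}, {df2}) = {f_change:.3f}")
```

The F Change tests only the contribution of the second block, whereas the F of 106.710 in the ANOVA table tests model 2 as a whole.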
To find out how well each of the variables contributes to the final
equation we need to check the Coefficients table in the model 2 row.
(Due to the large width of the table, it has been scanned so that it can
be presented on the next page.) When we check the Sig. column, we notice
that four variables contribute significantly to the model. Their
relative importance is reflected in the Standardized Coefficients
column. In order of importance, they are the C-Test (beta=.46), LKT
(beta=.25), S-Test (beta=.18), and GPA (beta=.07).
The tables presented in this chapter have to be reformatted according to
the style adopted by journals to which we may wish to submit our
findings. The Publication Manual of the American Psychological
Association (2010, pp. 141-4), for example, provides guidelines and
examples of how to present the results of multiple regression. An
example of how you might choose to present the results of the analyses
conducted in this chapter is presented below.
Hierarchical multiple regression was used to assess the ability of three
measures [C-Test, S-Test and lexical knowledge test (LKT)] to
predict the TOEFL scores of 402 undergraduate and graduate university
students in Mashhad, Iran, after controlling for the influence of age
and GPA. After ensuring that the assumptions of normality, linearity,
multicollinearity and homoscedasticity were not violated, age and GPA
were entered at Step 1, explaining 2.9% of the variance in the TOEFL.
When the C-Test, S-Test and LKT were entered at Step 2, the total
variance explained by the model as a whole was 57.4%, F(5, 396) =
106.710, p < .0005. The three measures explained an additional 55% of
the variance in the TOEFL after controlling for age and GPA, R squared
change = .55, F change (3, 396) = 168.926, p < .0005. In the final
model, four measures were statistically significant, with the C-Test
having the highest beta value (beta = .46) followed by the LKT
(beta = .25), S-Test (beta = .18), and GPA (beta = .07).
13.6 Summary
A large number of human characteristics are identified and
operationalized as variables in applied linguistics so that they can be
measured quantitatively. The quantification of variables endows
researchers with the ability to study their relationships with each other
and determine how they relate to learning languages.
In chapter 13 we learned that interval variables allow researchers not
only to compare the performance of the same group on the same
instruments on different occasions but also to explore whether there is a
significant difference in the performance of two or more groups which
share almost all variables except the treatment given to only one. We
also realized that regression can be employed to predict the behaviour of
learners on a given instrument by utilizing their performance on other
instruments.
It is hoped that readers’ familiarity with and mastery of the principles,
methods and statistics of research in language education will help them
approach language learning and teaching as an exciting and rewarding
process which can ultimately lead to societies populated by ever
conscientious and responsible members. The materials presented in the
chapters so far can be supplemented by reading research papers in which
various methods have been employed. The Iranian sources of these
papers are presented in the last chapter.

14 Finding Research Papers
14.1 Introduction
Similar to all other sciences, the social sciences have witnessed an
ever-increasing number of journals published in their subfields. The
growth has occurred due to a large number of factors, among which
economy and excellence stand out as the two most influential variables
governing organizations offering social sciences.
Privatization has become a reality accepted by almost all nations.
Educational organizations such as universities are now competing for
a fair share of the money learners are ready to pay for their
education. The learners are in turn looking for those
universities whose graduates have been most successful in getting
employed or establishing their own private businesses.
But how can one decide whether a given educational institute such as a
university has been successful? Among various approaches available,
one has proved to be the most widely used index, i.e., citation. It is
argued that the best universities are those whose academic members
have published the most widely cited research. For this very reason, a
special institute was established in America.
14.2 The Institute for Scientific Information (ISI)

According to King (2006), the Institute for Scientific Information (ISI)
is located in Philadelphia, Pennsylvania. It has the largest citation
database in the world. The ISI tracks the papers published in the
humanities, sciences and social sciences every year.
In one study, the ISI decided to find the “best of the best” as judged
by citations. By employing a computer program it extracted 1381
biomedical papers which were published between 1990 and 1996 and cited
at least 300 times. By identifying these “high-impact” papers and
setting 10 of them as the minimum criterion, the ISI could rank the
institutions in which the papers were produced.
When the total number of citations was used as a measure of ranking,
Harvard University in Cambridge, Massachusetts, outranked all other
institutions because its 128 high-impact biomedical papers published
from 1990 to 1996 had been cited more than 60,500 times! Johns Hopkins
University in Baltimore, Maryland, stood in the second position by
producing 56 high-impact papers cited over 35,500 times.
The high-impact papers are not studied by the ISI alone. Other
organizations have followed suit. Thomson Reuters (2009), for
example, does the same. The 79 cited journals given in Table 14.1 have
been taken from among the 1985 journals listed in its 2008 Journal
Citation Report (JCR) Social Science Edition.
Table 14.1
Selected list of Journals ranked on the basis of their total cites in 2008
No  Title of the Journal  ISSN  Total Cites
1  Journal of Memory and Language  0749-596X  4912
2  Brain and Language  0093-934X  4286
3  Journal of Speech Language and Hearing Research  1092-4388  2931
4  Review of Educational Research  0034-6543  2402
5  Multivariate Behavioral Research  0027-3171  1831
6  Journal of Communication  0021-9916  1816
7  Language  0097-8507  1807
8  Intelligence  0160-2896  1611
9  Journal of Pragmatics  0378-2166  1364
10  Journal of Child Language  0305-0009  1327
11  Computational Linguistics  0891-2017  1286
12  Linguistic Inquiry  0024-3892  1269
13  Language and Cognitive Processes  0169-0965  1194
14  Journal of Phonetics  0095-4470  1012
15  Journal of Communication Disorders  0021-9924  966
16  Language Learning  0023-8333  954
17  Journal of Educational Research  0022-0671  950
18  Applied Psycholinguistics  0142-7164  924
19  Modern Language Journal  0026-7902  913
20  TESOL Quarterly  0039-8322  912
21  Studies in Second Language Acquisition  0272-2631  835
22  Applied Linguistics  0142-6001  815
23  American Journal of Speech-Language Pathology  1058-0360  763
24  Journal of Psycholinguistic Research  0090-6905  730
25  Journal of Teacher Education  0022-4871  719
26  Mind & Language  0268-1064  690
27  Journal of Educational Measurement  0022-0655  676
28  Language Speech and Hearing Services in Schools  0161-1461  639
29  Language in Society  0047-4045  634
30  Language and Speech  0023-8309  589
31  Lingua  0024-3841  588
32  Linguistics and Philosophy  0165-0157  557
33  Linguistics  0024-3949  530
34  Bilingualism-Language and Cognition  1366-7289  472
35  International Journal of Language & Communication Disorders  1368-2822  472
36  Annals of Dyslexia  0736-9387  467
37  Natural Language & Linguistic Theory  0167-806X  452
38  Clinical Linguistics & Phonetics  0269-9206  438
39  American Journal of Education  0195-6744  393
40  Cognitive Linguistics  0936-5907  384
41  Journal of Research in Reading  0141-0423  379
42  Journal of Fluency Disorders  0094-730X  377
43  Dyslexia  1076-9242  371
44  Journal of Neurolinguistics  0911-6044  364
45  Journal of Language and Social Psychology  0261-927X  348
46  Phonetica  0031-8388  343
47  Syntax and Semantics  0092-4563  342
48  Research On Language and Social Interaction  0835-1813  328
49  Second Language Research  0267-6583  319
50  Language & Communication  0271-5309  317
51  English for Specific Purposes  0889-4906  316
52  Journal of Second Language Writing  1060-3743  316
53  Journal of Second Language Writing  1060-3743  316
54  American Journal of Evaluation  1098-2140  301
55  Journal of Sociolinguistics  1360-6441  292
56  Quarterly Journal of Speech  0033-5630  292
57  Journal of Linguistics  0022-2267  282
58  Language Learning & Technology  1094-3501  270
59  Foreign Language Annals  0015-718X  253
60  Canadian Modern Language Review  0008-4506  224
61  Linguistic Review  0167-6318  216
62  American Speech  0003-1283  201
63  Adult Education Quarterly  0741-7136  187
64  Applied Measurement in Education  0895-7347  182
65  Language Sciences  0388-0001  163
66  Metaphor and Symbol  1092-6488  162
67  International Journal of Lexicography  0950-3846  156
68  Narrative Inquiry  1387-6740  145
69  Language Teaching Research  1362-1688  124
70  Interaction Studies  1572-0373  113
71  Theoretical Linguistics  0301-4428  90
72  Folia Linguistica  0165-4004  88
73  Probus  0921-4771  79
74  Journal of Pidgin and Creole Languages  0920-9034  57
75  Lexikos  1684-4904  40
76  Text & Talk  1860-7330  35
77  Journal of Historical Pragmatics  1566-5852  25
78  International Journal of Speech Language and the Law  1748-8885  22
79  Interpreter and Translator Trainer  1750-399X  2
80  Across Languages and Cultures  1585-1923  -
81  Child Language Teaching & Therapy  0265-6590  -
82  English Teaching-Practice and Critique  1175-8708  -
83  European Journal of Teacher Education  0261-9768  -
84  Functions of Language  0929-998X  -
85  Journal of Semantics  0167-5133  -
86  Language and Linguistics  1606-822X  -
87  Language and Literature  0963-9470  -
88  Language and Speech  0023-8309  -
89  Language Assessment Quarterly  1543-4303  -
90  Language Awareness  0965-8416  -
91  Language Culture and Curriculum  0790-8318  -
92  Language Matters  1022-8195  -
93  Language Problems & Language Planning  0272-2690  -
94  Language Testing  0265-5322  -
95  Language Variation and Change  0954-3945  -
96  Phonology  0952-6757  -
97  Review of Research in Education  0091-732X  -
98  Teaching and Teacher Education  0742-051X  -
99  Translator  1355-6509  -
Note. JCR stands for Journal Citation Report.
As can be seen in Table 14.1, all listed periodicals have an International
Standard Serial Number (ISSN). The ISSN is an eight-digit number
which identifies periodical publications, including electronic serials.
According to the ISSN homepage, i.e.,
http://www.issn.org/2-22640-Statistics.php, the ISSN given to each
publication is just a numeric code used as an identifier. It neither bears
any significance nor contains any information referring to the origin or
contents of the publication.
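Although the ISSN says nothing about a publication, its eighth character is a check character calculated from the first seven digits. This is why some entries in Table 14.1, e.g., 0015-718X, end in the letter X. The short Python sketch below is offered only as an illustration of the standard modulus-11 rule. The function name is_valid_issn is a hypothetical helper introduced here; it is not part of any library.

```python
def is_valid_issn(issn: str) -> bool:
    """Return True if `issn` (e.g. '0022-2267') passes the ISSN check rule."""
    chars = issn.replace("-", "").upper()
    if len(chars) != 8 or not chars[:7].isdigit():
        return False
    if not (chars[7].isdigit() or chars[7] == "X"):
        return False
    # Weight the first seven digits 8, 7, ..., 2 and add up the products.
    total = sum(int(d) * w for d, w in zip(chars[:7], range(8, 1, -1)))
    # The check value brings the weighted sum up to a multiple of 11.
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[7] == expected


if __name__ == "__main__":
    # Three ISSNs copied from Table 14.1.
    for issn in ("0022-2267", "0015-718X", "1750-399X"):
        print(issn, is_valid_issn(issn))
```

A check of this kind can catch a mistyped ISSN before you search for a journal with it.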

The fourth column of Table 14.1 shows the total cites. The statistics have
been taken from Thomson Reuters (2010). As can be seen, among the 99
journals which publish articles in English, Journal of Memory and
Language, Brain and Language, and Journal of Speech Language and
Hearing Research occupy the first, second and third ranks,
respectively. The ranking is based on the total cites. (There were no
statistics for the last 20 journals.)
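The ranking described above can be reproduced with a few lines of Python. The sketch below is only an illustration: it uses five rows copied from Table 14.1, represents the "-" entries as None, and places journals without statistics at the end. The function name rank_by_cites is a hypothetical helper.

```python
# (journal, total cites) pairs copied from a few rows of Table 14.1;
# None stands for the "-" shown where no statistics were available.
journals = [
    ("Language Testing", None),
    ("Foreign Language Annals", 253),
    ("Journal of Linguistics", 282),
    ("Translator", None),
    ("Language Learning & Technology", 270),
]


def rank_by_cites(rows):
    """Sort rows by total cites, highest first; rows without statistics go last."""
    # The key places rows that have a value first (False sorts before True)
    # and then orders those rows by descending number of cites.
    return sorted(rows, key=lambda row: (row[1] is None, -(row[1] or 0)))


ranked = rank_by_cites(journals)
for rank, (name, cites) in enumerate(ranked, start=1):
    print(rank, name, "-" if cites is None else cites)
```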
14.3 Scientific Information Database (SID)

In a very laudable attempt, Jahad Daneshgahi established the online
Scientific Information Database (SID), which is accessible at
http://sid.ir/en/Subject.asp?ID=5. While access to the ISI is too
expensive for Iranian students, the SID provides a free database
of the papers published in Iran.
Table 14.2 presents nine journals selected from among some 150
journals listed on the SID site. It includes the names of those journals
which publish papers related to applied linguistics and English language
and literature.
Table 14.2
List of English language related Journals published in Iran
No Name Publication Times Place of Publication
1 Ferdowsi Review (not listed yet) Quarterly Ferdowsi University of Mashhad
2 Iranian Journal of Applied Language Studies Quarterly Sistan and Baluchestan University
3 Iranian Journal of Applied Linguistics (IJAL) Semi-annually Teacher Training University
4 Journal of Curriculum Studies (J.C.S) Quarterly The Iranian Curriculum Studies Association (I.C.S.A)
5 Journal of Faculty of Letters and Humanities (Tabriz) Quarterly Tabriz University
6 Language and Linguistics Semi-annually Linguistic Society of Iran
7 Pazhuhesh-e Zabanya-ye Khareji Quarterly Faculty of Foreign Languages, Tehran University
8 Research on Foreign Languages (Journal of Faculty of Letters and Humanities) (Tabriz) Semi-annually Tabriz University
9 Teaching English Language and Literature Society of Iran (TELLSI) Quarterly English Language and Literature Society of Iran
You can access the journals specified above online. For example, click
on the Iranian Journal of Applied Linguistics (IJAL) to be linked to its
homepage as shown below:
Bibliographic Information for : IRANIAN JOURNAL OF APPLIED LINGUISTICS (IJAL)
10 Volume(s)
Name: IRANIAN JOURNAL OF APPLIED LINGUISTICS (IJAL)
Type: SEMI-ANNUALLY
Editor in chief : MOHAMMAD HOSSEIN KESHAVARZ
Publisher : TEACHER TRAINING UNIVERSITY
Manager : MOHAMMAD HOSSEIN KESHAVARZ
Address: NO. 49, SHAHID DR. MOFATTEH AVE., FOREIGN LANGUAGES DEPARTMENT,
OFFICE OF IRANIAN JOURNAL OF APPLIED LINGUISTICS (IJAL), 15614
Tel: (021) 88304896
E.Mail: MHKESHAVARZ@YAHOO.COM
According to SID's Policy, all of the articles published in this journal are
indexed in SID web site from the year of 2000 until now.
List Of Issue(s) : IRANIAN JOURNAL OF APPLIED LINGUISTICS (IJAL)
Year : 2007
Year : 2006
Year : 2005
Year : 2004
Year : 2003
Year : 2001
Year : 2000
There is a + mark in front of each year. If you click on that mark, it will
show you the issues published in that year. For example, at
the time it was accessed, the year 2006 had two numbers published in
March and September.
Year : 2007
Number 1 MARCH 2007
Year : 2006
Number 2 SEPTEMBER 2006
Number 1 MARCH 2006
If you click on Number 2 in Year 2006, the page below will appear:
Papers list in IRANIAN JOURNAL OF APPLIED LINGUISTICS (IJAL)---Number:2
5 Papers
1 : A MULTIPLE INTELLIGENCE-BASED INVESTIGATION INTO THE EFFECTS OF
FEEDBACK CONDITIONS ON EFL WRITING ACHIEVEMENT
FAHIM MANSOUR, NEZHAD ANSARI D.
IRANIAN JOURNAL OF APPLIED LINGUISTICS (IJAL) SEPTEMBER 2006; 9(2):51-78.
Keyword: MULTIPLE INTELLIGENCE THEORY, DOMINANT MI
PROFILE, ALTERNATE/TRIADIC RESPONSES, TUTOR
RESPONSE, SELF RESPONSE, PEER RESPONSE
Reference(s) (0) Citation(s) (0)
2 : DIFFERENTIAL ITEM FUNCTIONING IN HIGH-STAKES TESTS: THE EFFECT
OF FIELD OF STUDY
KETABI SAEID, AHMADI ALI REZA, BARATI H.
IRANIAN JOURNAL OF APPLIED LINGUISTICS (IJAL) SEPTEMBER 2006; 9(2):27-49.
Keyword: DIFFERENTIAL ITEM FUNCTIONING (DIF), FIELD OF
STUDY, IRANIAN NATIONAL UNIVERSITY ENTRANCE EXAM
(INUEE), ENGLISH SUBTEST, IRT
3 : ON THE IMPACT OF CONCEPT MAPPING AS A PREWRITING ACTIVITY ON
EFL LEARNERS' WRITING ABILITY
PISH GHADAM R., GHANI ZADEH AFSANEH
IRANIAN JOURNAL OF APPLIED LINGUISTICS (IJAL) SEPTEMBER 2006; 9(2):101-126.
Keyword: CONCEPT MAPPING, PREWRITING PHASE,
CONTROL GROUP, EXPERIMENTAL GROUP, INTERVIEWS,
WRITING TASK ANALYSIS
4 : RELATIONSHIP BETWEEN MODALITY, TYPES OF PASSAGES, AND
PERFORMANCE OF ADVANCED EFL LEARNERS ON LISTENING
COMPREHENSION TESTS
JALALI SARA, KIANI GHOLAM REZA
IRANIAN JOURNAL OF APPLIED LINGUISTICS (IJAL) SEPTEMBER 2006; 9(2):79-99.
Keyword: ADVANCED EFL LEARNERS, MODALITY OF
LISTENING PASSAGES, AND TYPES OF LISTENING
PASSAGES
5 : THE EFFECT OF PORTFOLIO ASSESSMENT ON METACOGNITIVE READING
STRATEGY AWARENESS OF IRANIAN EFL STUDENTS
ATAEI M.R., NIKOUEINEZHAD F.
IRANIAN JOURNAL OF APPLIED LINGUISTICS (IJAL) SEPTEMBER 2006; 9(2):1-25.
Keyword: ALTERNATIVE ASSESSMENT, PORTFOLIO,
PORTFOLIO ASSESSMENT, METACOGNITIVE AWARENESS,
AUTONOMY, EFL READING COMPREHENSION
If we click on any of the five articles published in Number 2, we will get
its abstract as shown below:
3: IRANIAN JOURNAL OF APPLIED LINGUISTICS (IJAL) SEPTEMBER 2006; 9(2):101-126.
ON THE IMPACT OF CONCEPT MAPPING AS A PREWRITING ACTIVITY ON EFL
LEARNERS' WRITING ABILITY
PISH GHADAM R., GHANI ZADEH AFSANEH
The present study investigated the impact of concept mapping as a
prewriting activity on EFL learners' writing ability in terms of product
and process of writing tasks. Twenty female students at the upper
intermediate level were randomly assigned into two equal groups. The
first group, serving as a control group, was not instructed to use concept
mapping during prewriting phase, while the second group used concept
maps in preparation for writing tasks. The results of pretests and
posttests of the two groups scored by two raters based on predetermined
criteria were compared. Concept mapping was shown to enhance the
learners' writing ability. In order to capture the quality of this
enhancement, the study used comparison and analysis of writing
assignments written every two sessions, along with students' interviews.
The results indicated the improvement of students' writing ability in
terms of quantity and quality of generating, organizing, and associating
ideas. The findings of the present study also suggest that concept
mapping can be effective for affective, as well as cognitive instructional
objectives.
Keyword: CONCEPT MAPPING, PREWRITING PHASE,
CONTROL GROUP, EXPERIMENTAL GROUP, INTERVIEWS,
WRITING TASK ANALYSIS
Printable Version
If you click on the icon, you will access the whole article as a PDF
file. You can then save it on your hard drive.
14.4 Online Journals

In addition to the journals listed in sections 14.2 and 14.3, there are certain
online journals which can be accessed free of charge. Table 14.3
presents the names and homepages of some online journals.
Table 14.3
Name and Address of some online journals
Journal Name Homepage
Asian EFL Journal http://www.asian-efl-journal.com/
Iranian EFL Journal http://www.iranian-efl-journal.com/Feb-08-mr&mn.php
Journal of English as an International Language http://www.eilj.com/
The Philippines ESL Journal http://www.philippine-esl-journal.com/
The Asian ESP Journal http://www.asian-esp-journal.com/
The Linguistics Journal http://www.linguistics-journal.com/
The Reading Matrix: An International Online Journal http://www.readingmatrix.com/editors.html
TESL Canada http://www.tesl.ca/
14.5 Summary
Papers published in research journals provide the best information
regarding the topics related to given fields. The most widely cited papers
published by top-ranking institutes act like pivots upon which solid
research projects must be built. The ISI and SID provide the most
informative lists for researchers in English language teaching, literature
and translation.

Appendix 3.1
Self-assessment used as reading portfolios (Faravani, 2006, p. 84)
Name: Family name:

Reading No.: Date:
Please put a check mark (√) in the box which best describes your own
reading activities
Activities Almost Always Sometimes Almost Never
1. Participates in small group
discussion.
2. Shares responses to the reading.
3. Comprehends questions.
4. Gives quality responses during
small group discussions.
5. Uses structure and background
knowledge (look-in strategy).
6. Uses other sources (dictionary,
thesaurus, content, and Text).
7. Integrates strategies and sources.
8. Uses meaning cues.
9. Uses structural cues.
10. Uses visual cues.
11. Integrates cues (meaning,
structural, visual).
12. Makes predictions and reads to
find out if it was right.
13. Reads the sentences before and
after a word he doesn’t know.
14. Guesses the meanings of the
words he doesn’t know from the
context.

15. Looks for the main idea.
16. Discusses what he reads with
others.
17. Has the ability to self-correct.
18. Recognizes cause and
relationships between/among
sentences.
19. Draws inferences.
20. Can provide examples from
personal experience and/or prior
knowledge and uses relevant
examples from the text.
21. Recognizes logical order.
22. Recognizes paraphrasing.

Appendix 3.2
Peer-assessment used as reading portfolios (Faravani, 2006, p. 85)
Name: Family name:

Reading No.: Date:
Please put a check mark (√) in the box which best describes your
classmate's reading activities.
Activities Almost Always Sometimes Almost Never
1. Participates in small group discussion.
2. Shares responses to the reading.
3. Comprehends questions.
4. Gives quality responses during small
group discussions.
5. Uses structure and background
knowledge (look-in strategy).
6. Uses other sources (dictionary,
thesaurus, content, and Text).
7. Integrates strategies and sources.
8. Uses meaning cues.
9. Uses structural cues.
10. Uses visual cues.
11. Integrates cues (meaning, structural,
visual).
12. Makes predictions and reads to find
out if it was right.
13. Reads the sentences before and after
a word he doesn’t know.
14. Guesses the meanings of the words
he doesn’t know from the context.

15. Looks for the main idea.
16. Discusses what he reads with others.
17. Has the ability to self-correct.
18. Recognizes cause and relationships
between/among sentences.
19. Draws inferences.
20. Can provide examples from personal
experience and/or prior knowledge
and uses relevant examples from the
text.
21. Recognizes logical order.
22. Recognizes paraphrasing.

Appendix 3.3
Self-reflections on readings (Faravani, 2006, p. 86)
Name: Date:
Reading Number: Group:
The main topic of this reading is ….
Have you read this reading before?
The author concludes that …
How is the reading related to your everyday life? What does the author
want you to learn?
One of the new words my group talked about was ….
How did your group figure out its meaning?
Did you like the reading? Why or why not?
What problems did you have when you read this passage?( What were
your weaknesses in reading?)
What do you think about your reading now? Do you think you made
progress? What are your strengths in reading now?

Appendix 5.1
Table of random numbers [34]
04433 80674 24520 18222 10610 0594 37515

60298 47829 72648 37414 75755 04717 29899
67884 59651 67533 68123 17730 95862 08034
89512 32155 51906 61662 64130 16688 37275
32653 01895 12506 88535 36553 23757 34209
95913 15405 13772 76638 48423 25018 99041

55864 21694 13122 44115 01601 50541 00147
35334 49810 91601 40617 72876 33967 73830
57729 32196 76487 11622 96297 24160 09903
86648 13697 63677 70119 94739 25875 38829
30574 47609 07967 32422 76791 39725 53711

81307 43694 83580 79974 45929 85113 72268
02410 54905 79007 54939 21410 80980 91772
18969 75274 52233 62319 08598 09066 95288
87863 82384 66860 62297 80198 19347 73234
68397 71708 15438 62311 72844 60203 46412

28529 54447 58729 10854 99058 18260 38765
44285 06372 15867 70418 57012 72122 36634
86299 83430 33571 23309 57040 29285 67870
84842 68668 90894 61658 15001 94055 36308
56970 83609 52098 04184 54967 72938 56834

83125 71257 60490 44369 66130 72936 69848
55503 52423 02464 26141 68779 66388 75242
47019 76273 33203 29608 54553 25971 69573
84828 32592 79526 29554 84580 37859 28504
[34] Based on parts of Table of 105,000 Random Decimal Digits, Interstate Commerce
Commission, Bureau of Transport Economics and Statistics, Washington, D. C.

68921 08141 79227 05748 51276 57143 31926

36458 96045 30424 98420 72925 40729 22337
95752 59445 36847 87729 81679 59126 59437
26768 47323 58454 56958 20575 76746 49878
42613 37056 43636 58085 06766 60227 96414
95457 30566 65482 25596 02678 54592 63607

95276 17894 63564 95958 39750 64379 46059
66954 52324 64776 92345 95110 59448 77249
17457 18481 14113 62462 02798 54977 48349
03704 36872 83214 59337 01695 60666 97410
21538 86497 33210 60337 27976 70661 08250

57178 67619 98310 70348 11317 71623 55510
31048 97558 94953 55866 96283 46620 52087
69799 55380 16498 80733 96422 58078 99643
90595 61867 59231 17772 67831 33317 00520
33570 04981 98939 78784 09977 29398 93896

15340 93460 57477 13898 48431 72936 78160
64079 42483 36512 56186 99098 48850 72527
63491 05546 67118 62063 74958 20946 28147
92003 63868 41034 28260 79708 00770 88643
52360 46658 66511 04172 73085 11795 52594

74622 12142 68355 65635 21828 39539 18988
04157 50079 61343 64315 70836 82857 35335
86003 60070 66241 32836 27573 11479 94114
41268 80187 20351 09636 84668 42486 71303

48611 62866 33963 14045 79451 04934 45576

78812 03509 78673 73181 29973 18664 04555
19472 63971 37271 31445 49019 49405 46925
51266 11569 08697 91120 64156 40365 74297
55806 96275 26130 47949 14877 69594 83041
77527 81360 18180 97421 55541 90275 18213

77680 58788 33016 61173 93049 04694 43534
15404 96554 88265 34537 38526 67924 40474
14045 22917 60718 66487 46346 30949 03173
68376 43918 77653 04127 69930 43283 35766
93385 13421 67957 20384 58731 53396 59723

09858 52104 32014 53115 03727 98624 84616
93307 34116 49516 42148 57740 31198 70336
04794 01534 92058 03157 91758 80611 45357
86265 49096 97021 92582 61422 75890 86442
65943 79232 45702 67055 39024 57383 44424

90038 94209 04055 27393 61517 23002 96560
97283 95943 78363 36498 40662 94188 18202
21913 72958 75637 99936 58715 07943 23748
41161 37341 81838 19389 80336 46346 91895
23777 98392 31417 98547 92058 02277 50315

59973 08144 61070 73094 27059 69181 55623
82690 74099 77885 23813 10054 11900 44653
83854 24715 48866 65745 31131 47636 45137
61980 34997 41825 11623 07320 15003 56774

99915 45821 97702 87125 44488 77613 56823

48293 86847 43186 42951 37804 85129 28993
33225 31280 41232 34750 91097 60752 69783
06846 32828 24425 30249 78801 26977 92074
32671 45587 79620 84831 38156 74211 82752
82096 21913 75544 55228 89796 05694 91552

51666 10433 10945 55306 78562 89630 41230
54044 67942 24145 42294 27427 84875 37022
66738 60184 75679 38120 17640 36242 99357
55064 17427 89180 74018 44865 53197 74810
69599 60264 84549 78007 88450 06488 72274

64756 87759 92354 78694 63638 80939 98644
80817 74533 68407 55862 32476 19326 95558
39847 96884 84657 33697 39578 90197 80532
90401 41700 95510 61166 33757 23279 85523
78227 90110 81378 96659 37008 04050 04228

87240 52716 87697 79433 16336 52862 69149
08486 10951 26832 39763 02485 71688 90936
39338 32169 03713 93510 61244 73774 01245
21188 01850 69689 49426 49128 14660 14143
13287 82531 04388 64693 11934 35051 68576

53609 04001 19648 14053 49623 10840 31915
87900 36194 31567 53506 34304 39910 79630
81641 00496 36058 75899 46620 70024 88753
19512 50277 71508 20116 79520 06269 74178

24418 23508 91507 76455 54941 72711 39406

57404 73678 08272 62941 02349 71389 45605
77644 98489 86268 73652 98210 44546 27174
68366 65614 01443 07607 11826 91326 29664
64472 72294 95432 53555 96810 17100 35066
88205 37913 98633 81009 81060 33449 68055

98455 78685 71250 10329 56135 80647 51404
48977 36794 56054 59243 57361 65304 93258
93077 72941 92779 23581 24548 56415 61927
84533 26564 91583 83411 66504 02036 02922
11338 12903 14514 27585 45068 05520 56321

23853 68500 92274 87026 99717 01542 72990
94096 74920 25822 98026 05394 61840 83089
83160 82362 09350 98536 38155 42661 02363
97425 47335 69709 01386 74319 04318 99387
83951 11954 24317 20345 18134 90062 10761

93085 35203 05740 03206 92012 42710 34650
33762 83193 58045 89880 78101 44392 53767
49665 85397 85137 30496 23469 42846 94810
37541 82627 80051 72521 35342 56119 97190
22145 85304 35348 82854 55846 18076 12415

27153 08662 61078 52433 22184 33998 87436
00301 49425 66682 25442 83668 66236 79655
43815 43272 73778 63469 50083 70696 13558
14689 86482 74157 46012 97765 27552 49617

16680 55936 82453 19532 49988 13176 94219

86938 60429 01137 86168 78257 86249 46134
33944 29219 73161 46061 30946 22210 79302
16045 67736 18608 18198 19468 76358 69203
37044 52523 25627 63107 30806 80857 84383
61471 45322 35340 35132 42163 69332 98851

47422 21296 16785 66393 39249 51463 95963
24133 39719 14484 58613 88717 29289 77360
67253 67064 10748 16006 16767 57345 42285
62382 76941 01635 35829 77516 98468 51686
98011 16503 09201 03523 87192 66483 55649

37366 24386 20654 85117 74078 64120 04643
73587 83993 54176 05221 94119 20108 78101
33583 68291 50547 96085 62180 27453 18567
02878 33223 39199 49536 56199 05993 71201
91498 41673 17195 33175 04994 09879 70337

91127 19815 30219 55591 21725 43827 78862
12997 55013 18662 81724 24305 37661 18956
96098 13651 15393 69995 14762 69734 89150
97627 17837 10472 18983 28387 99781 52977
40064 47981 31484 76603 54088 91095 00010

16239 68743 71374 55863 22672 91609 51514
58354 24913 20435 30965 17453 65623 93058
52567 65085 60220 84641 18273 49604 47418
06236 29052 91392 07551 83532 68130 56970

94620 27963 96478 21559 19246 88097 44926

60947 60775 73181 43264 56895 04232 59604
27499 53523 63110 57106 20865 91683 80688
01603 23156 89223 43429 95353 44662 59433
00815 01552 06392 31437 70385 45863 75971
83844 90942 74857 52419 68723 47830 63010

06626 10042 93629 37609 57215 08409 81906
56760 63348 24949 11859 29793 37457 59377
64416 29934 00755 09418 14230 62887 92683
63569 17906 38076 32135 19096 96970 75917
22693 35089 72994 04252 23791 60249 83010

43413 59744 01275 71326 91382 45114 20245
09224 78530 50566 49965 04851 18280 14039
67625 34683 03142 74733 63558 09665 22610
86874 12549 98699 54952 91579 26023 81076
54548 49505 62515 63903 13193 33905 66936

73236 66167 49728 03581 40699 10396 81827
15220 66319 13543 14071 59148 95154 72852
16151 08029 36954 03891 38313 34016 18671
43635 84249 88984 80993 55431 90793 62603
30193 42776 85611 57635 51362 79907 77364

37430 45246 11400 20986 43996 73122 88474
88312 93047 12088 86937 70794 01041 74867
98995 58159 04700 90443 13168 31553 67891
51734 20849 70198 67906 00880 82899 66065

88698 41755 56216 66852 17748 04963 54859

51865 09836 73966 65711 41699 11732 17173
40300 08852 27528 84648 79589 95295 72895
02760 28625 70476 76410 32988 10194 94917
78450 26245 91763 73117 33047 03577 62599
50252 56911 62693 73817 98693 18728 94741

07929 66728 47761 81472 44806 15592 71357
09030 39605 87507 85446 51257 89555 75520
56670 88445 85799 76200 21795 38894 58070
48140 13583 94911 13318 64741 64336 95103
36764 86132 12463 28385 94242 32063 45233

14351 71381 28133 68269 65145 28152 39087
81276 00835 63835 87174 42446 08882 27067
55524 86088 00069 59254 24654 77371 26409
78852 65889 32719 13758 23937 90740 16866
11861 69032 51915 23510 32050 52052 24004

67699 01009 07050 73324 06732 27510 33761
50064 39500 17450 18030 63124 48061 59412
93126 17700 94400 76075 08817 27324 72723
01657 92602 41043 05686 15650 29970 95877
13800 76690 75133 60456 28491 03845 11507

98135 42870 48578 29036 69876 86563 61729
08313 99293 00990 13595 77457 79969 11339
90974 83965 62732 85161 54330 22406 86253
33273 61993 88407 69399 17301 70975 99129
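One common way of using a table of random numbers is to read its digits in groups as wide as the population size and to keep every number that falls within the population and has not been drawn already. The Python sketch below illustrates this procedure under stated assumptions: the population consists of 30 numbered cases, a sample of five is required, and reading starts from the second and third rows of Appendix 5.1 (in practice the starting point is chosen arbitrarily). The function name sample_from_table is a hypothetical helper.

```python
def sample_from_table(rows, population_size, sample_size):
    """Pick `sample_size` distinct case numbers in 1..population_size."""
    width = len(str(population_size))        # digits read for each number
    digits = "".join(rows).replace(" ", "")  # read the rows as one digit stream
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        number = int(digits[i:i + width])
        # Skip numbers outside the population and numbers already drawn.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
            if len(chosen) == sample_size:
                return chosen
    raise ValueError("not enough rows to complete the sample")


# The second and third rows of Appendix 5.1.
rows = [
    "60298 47829 72648 37414 75755 04717 29899",
    "67884 59651 67533 68123 17730 95862 08034",
]
print(sample_from_table(rows, population_size=30, sample_size=5))
# prints [29, 14, 17, 16, 12]
```

If the rows supplied run out before the sample is complete, the function raises an error, which signals that more rows of the table should be read.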

Appendix 12.1
Critical values for the Pearson Product-moment Correlation Coefficients
Level of significance for one-tailed test
.05 .025 .01 .005
Level of significance for two-tailed test
df = N-2 .10 .05 .02 .01
1 .988 .997 .9995 .9999
2 .900 .950 .980 .990
3 .805 .878 .934 .959
4 .729 .811 .882 .917
5 .669 .754 .833 .874
6 .622 .707 .789 .834
7 .582 .666 .750 .798
8 .549 .632 .716 .765
9 .521 .602 .685 .735
10 .497 .576 .658 .708
11 .476 .553 .634 .684
12 .458 .532 .612 .661
13 .441 .514 .592 .641
14 .426 .497 .574 .623
15 .412 .482 .558 .606
16 .400 .468 .542 .590
17 .389 .456 .528 .575
18 .378 .444 .516 .561
19 .369 .433 .503 .549
20 .360 .423 .492 .537
21 .352 .413 .482 .526
22 .344 .404 .472 .515
23 .337 .396 .462 .505
24 .330 .388 .453 .495
25 .323 .381 .445 .487
26 .317 .374 .437 .479
27 .311 .367 .430 .471
28 .306 .361 .423 .463
29 .301 .355 .416 .456

30 .296 .349 .409 .449
35 .275 .325 .381 .418
40 .257 .304 .358 .393
45 .243 .288 .338 .372
50 .231 .273 .322 .354
60 .211 .250 .295 .325
70 .195 .232 .274 .302
80 .183 .217 .256 .284
90 .173 .205 .242 .267
100 .164 .195 .230 .254
Source: Fisher, R. A., & Yates, F. (1963). Statistical tables for biological, agricultural and medical research
(4th ed.). Edinburgh: Oliver & Boyd.
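To use the table, find the row for df = N - 2 and compare the absolute value of the obtained r with the critical value in the column for the chosen level of significance. The Python sketch below illustrates this lookup for two-tailed tests. It is a minimal sketch, not a complete implementation: only seven rows of Appendix 12.1 are copied in, and when the exact df is not among them the next lower tabled df is used, which is a conservative choice. The names CRITICAL_R, critical_r and is_significant are hypothetical.

```python
from bisect import bisect_right

# Two-tailed critical values of r copied from a few rows of Appendix 12.1.
# Outer keys are degrees of freedom (df = N - 2); inner keys are alpha levels.
CRITICAL_R = {
    10: {0.10: 0.497, 0.05: 0.576, 0.02: 0.658, 0.01: 0.708},
    20: {0.10: 0.360, 0.05: 0.423, 0.02: 0.492, 0.01: 0.537},
    28: {0.10: 0.306, 0.05: 0.361, 0.02: 0.423, 0.01: 0.463},
    30: {0.10: 0.296, 0.05: 0.349, 0.02: 0.409, 0.01: 0.449},
    40: {0.10: 0.257, 0.05: 0.304, 0.02: 0.358, 0.01: 0.393},
    50: {0.10: 0.231, 0.05: 0.273, 0.02: 0.322, 0.01: 0.354},
    100: {0.10: 0.164, 0.05: 0.195, 0.02: 0.230, 0.01: 0.254},
}


def critical_r(n, alpha=0.05):
    """Return the tabled two-tailed critical r for a sample of n pairs."""
    df = n - 2
    tabled = sorted(CRITICAL_R)
    if df < tabled[0]:
        raise ValueError("df is below the rows copied into this sketch")
    # Use the largest tabled df that does not exceed the actual df.
    row = tabled[bisect_right(tabled, df) - 1]
    return CRITICAL_R[row][alpha]


def is_significant(r, n, alpha=0.05):
    """True if |r| reaches the two-tailed critical value for n pairs."""
    return abs(r) >= critical_r(n, alpha)


# With N = 30 pairs of scores, df = 28 and the .05 critical value is .361.
print(critical_r(30), is_significant(0.40, 30), is_significant(0.30, 30))
```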

Appendix 13.1
Critical values for T test [35]
Two Tailed Significance
df .2 .1 .05 .01 .005 .001 .0005 .0001
2 1.89 2.92 4.30 9.92 14.09 31.60 44.70 100.14
3 1.64 2.35 3.18 5.84 7.45 12.92 16.33 28.01
4 1.53 2.13 2.78 4.60 5.60 8.61 10.31 15.53
5 1.48 2.02 2.57 4.03 4.77 6.87 7.98 11.18
6 1.44 1.94 2.45 3.71 4.32 5.96 6.79 9.08
7 1.41 1.89 2.36 3.50 4.03 5.41 6.08 7.89
8 1.40 1.86 2.31 3.36 3.83 5.04 5.62 7.12
9 1.38 1.83 2.26 3.25 3.69 4.78 5.29 6.59
10 1.37 1.81 2.23 3.17 3.58 4.59 5.05 6.21
11 1.36 1.80 2.20 3.11 3.50 4.44 4.86 5.92
12 1.36 1.78 2.18 3.05 3.43 4.32 4.72 5.70
13 1.35 1.77 2.16 3.01 3.37 4.22 4.60 5.51
14 1.35 1.76 2.14 2.98 3.33 4.14 4.50 5.36
15 1.34 1.75 2.13 2.95 3.29 4.07 4.42 5.24
16 1.34 1.75 2.12 2.92 3.25 4.01 4.35 5.13
17 1.33 1.74 2.11 2.90 3.22 3.97 4.29 5.04
18 1.33 1.73 2.10 2.88 3.20 3.92 4.23 4.97
19 1.33 1.73 2.09 2.86 3.17 3.88 4.19 4.90
20 1.33 1.72 2.09 2.85 3.15 3.85 4.15 4.84
21 1.32 1.72 2.08 2.83 3.14 3.82 4.11 4.78
22 1.32 1.72 2.07 2.82 3.12 3.79 4.08 4.74
23 1.32 1.71 2.07 2.81 3.10 3.77 4.05 4.69
24 1.32 1.71 2.06 2.80 3.09 3.75 4.02 4.65
[35] Miles, J. (2008). Learning Statistics: A blog about learning statistics in psychology,
health and social sciences. Retrieved August 31, 2010 from
http://www.jeremymiles.co.uk/misc/tables/t-test.html

25 1.32 1.71 2.06 2.79 3.08 3.73 4.00 4.62
26 1.31 1.71 2.06 2.78 3.07 3.71 3.97 4.59
27 1.31 1.70 2.05 2.77 3.06 3.69 3.95 4.56
28 1.31 1.70 2.05 2.76 3.05 3.67 3.93 4.53
29 1.31 1.70 2.05 2.76 3.04 3.66 3.92 4.51
30 1.31 1.70 2.04 2.75 3.03 3.65 3.90 4.48
35 1.31 1.69 2.03 2.72 3.00 3.59 3.84 4.39
40 1.30 1.68 2.02 2.70 2.97 3.55 3.79 4.32
45 1.30 1.68 2.01 2.69 2.95 3.52 3.75 4.27
50 1.30 1.68 2.01 2.68 2.94 3.50 3.72 4.23
55 1.30 1.67 2.00 2.67 2.92 3.48 3.70 4.20
60 1.30 1.67 2.00 2.66 2.91 3.46 3.68 4.17
65 1.29 1.67 2.00 2.65 2.91 3.45 3.66 4.15
70 1.29 1.67 1.99 2.65 2.90 3.43 3.65 4.13
75 1.29 1.67 1.99 2.64 2.89 3.42 3.64 4.11
80 1.29 1.66 1.99 2.64 2.89 3.42 3.63 4.10
85 1.29 1.66 1.99 2.63 2.88 3.41 3.62 4.08
90 1.29 1.66 1.99 2.63 2.88 3.40 3.61 4.07
95 1.29 1.66 1.99 2.63 2.87 3.40 3.60 4.06
100 1.29 1.66 1.98 2.63 2.87 3.39 3.60 4.05
200 1.29 1.65 1.97 2.60 2.84 3.34 3.54 3.97
500 1.28 1.65 1.96 2.59 2.82 3.31 3.50 3.92
1000 1.28 1.65 1.96 2.58 2.81 3.30 3.49 3.91
Infinity 1.28 1.64 1.96 2.58 2.81 3.29 3.48 3.89

Appendix 13.2
The scores of 30 students obtained on five tests
Code Age GPA TOEFL C-Test S-Test LKT ZRE_1 MAH_1 COO_1
1 19 17.00 85 49 29 33 0.682 0.910 0.001
2 33 18.50 96 55 43 51 0.384 6.051 0.001
3 19 17.00 45 33 19 4 -1.009 3.980 0.003
4 19 16.50 74 52 26 9 0.313 2.220 0.000
5 19 17.50 65 38 25 14 0.095 2.037 0.000
6 24 14.30 81 43 28 26 0.867 1.058 0.001
7 21 16.50 73 61 30 20 -0.718 0.875 0.001
8 28 15.96 74 51 16 14 0.554 2.508 0.001
9 22 16.00 83 47 30 30 0.656 0.846 0.001
10 22 16.00 90 47 31 30 1.277 0.954 0.002
11 20 16.30 76 42 28 17 0.721 1.530 0.001
12 24 16.50 84 58 26 29 0.375 0.482 0.000
13 22 15.50 68 50 22 24 -0.455 0.710 0.000
14 22 15.50 73 50 25 25 -0.118 0.261 0.000
15 22 15.00 79 55 23 13 0.603 1.426 0.001
16 20 17.30 87 61 30 22 0.530 0.593 0.000
17 20 17.30 89 61 30 22 0.717 0.593 0.001
18 21 16.00 79 57 25 15 0.371 1.165 0.000
19 21 16.00 63 57 25 16 -1.154 0.998 0.002
20 24 16.50 76 62 40 12 -0.589 7.076 0.002
21 19 14.00 52 45 42 8 -1.883 11.990 0.031
22 24 17.00 78 56 32 19 -0.025 1.066 0.000
23 19 17.00 100 77 42 39 -0.011 3.381 0.000
24 20 16.50 75 55 22 42 -0.601 5.027 0.001
25 20 18.00 76 65 28 26 -0.761 0.910 0.001
26 23 14.50 81 65 28 42 -0.771 2.987 0.002
27 23 14.50 60 42 6 7 0.266 7.068 0.000
28 21 16.50 104 71 31 22 1.557 2.394 0.005
29 21 16.83 104 71 31 22 1.557 2.394 0.005
30 21 16.83 89 34 37 53 0.981 13.154 0.009

References
Abdelwahab, S. (2002). Portfolio assessment: A qualitative
investigation of portfolio self-assessment in an intermediate EFL
classroom. Unpublished PhD thesis, University of Ohio.
Alderson, J. C. (1984). Reading in a foreign language: A reading
problem or a language problem? In J. C. Alderson & A. H.
Urquhart (Eds.). Reading in a foreign language (pp. 122–135).
New York: Longman.
American Psychological Association (2010). Publication manual (6th
ed.). Washington: American Psychological Association.
Amouzadeh-Mahdiraji, M. (2004). Language and the representation of
reality. Tabriz University Journal of Faculty of Letters &
Humanities, 190, 1-21.
Anastasi, A. (1968). Psychological testing (3rd ed.). London: Macmillan.
Arberry, A. (1964). The Koran interpreted. London: OUP.
Arberry, A. J. (1954). Persian Poems (translated by Sir William Jones,
pp. 118-119). London.
Arberry, A. J. (1961). Tales from the Masnavi. London: George Allen
and Unwin.
Arbuckle, J., & Wothke, W. (1999). AMOS 4.0 user’s guide.
Chicago: Smallwaters.
Armitage, P. (1971). Statistical methods in medical research. Oxford:
Blackwell.
Atkinson, M. (1992). Children’s syntax: An introduction to principles
and parameters theory. Oxford: Blackwell.
Bachman, L.F. & Palmer, A. (1982). The construct validation of some
components of communicative proficiency. TESOL Quarterly 16,
449–65.
Bachman, L.F. & Palmer, A. (1989). The construct validation of self-
ratings of communicative language ability. Language Testing 6,
14–29.
Bachman, L.F. & Palmer, A. (1981). The construct validation of the FSI
oral interview. Language Learning 31, 67–86.

Baker, D. (1989). Language testing: A critical survey and practical
guide. London: Edward Arnold.
Bartlett, M. S. (1954). A note on the multiplying factors for various chi
square approximations. Journal of the Royal Statistical Society, 16
(Series B), 296-8.
Bell, R. (1960). The Qur’an Translated (2 Vols.). Edinburgh: T. & T.
Clark.
Bender, B. G., Lerner, J. A., & Poland, J. E. (1991). Association between
corticosteroids and psychologic change in hospitalized asthmatic
children. Ann Allergy, 66, 414-9.
Bennett, R. E. (1993). On the meaning of constructed response. In R. E.
Bennett., & W. C. Ward (Eds.), Construction versus choice in
cognitive measurement: Issues in constructed response,
performance testing and portfolio assessment (pp. 1-28). Hillsdale,
NJ: Lawrence Erlbaum Associates.
Biria, R. (2002). Text thematization, deletion method and the validity of
cloze procedure. Sheikhbahee Research Bulletin 1(2), pp. 16-31.
Brill, E. J. (1971). The Encyclopaedia of Islam. London: Luzac & Co.
Brown, H. D. (1994). Principles of language learning and teaching (3rd
ed.). Englewood Cliffs, NJ: Prentice Hall Regents.
Brown, J. D. (1988). Understanding research in second language
learning: A teacher’s guide to statistics and research design.
Cambridge: CUP.
Brown, J. D., & Rodgers, T. S. (2002). Doing second language
research. Oxford: OUP.
Burns, T., & Sinfield, S. (2003). Essential study skills: The complete
guide to success @ university. London: Sage.
Businco, L. D. R., Businco, A. D. R., Lauriello, M., & Tirelli, C. (2004).
State and trait anxiety in patients affected by nasal polyposis
before and after medical treatment. ACTA Otorhinolaryngol ITAL,
24, 326-329.
Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-
experimental designs for research. Washington, DC: American
Educational Research Association (AERA).

Chatman, E. A. (1984). Field research: methodological themes. Library
and Information Science Research, 6, 425-438.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge,
Massachusetts: The MIT Press.
Clark, L. D., Bauer, W., & Cobbs, S. (1952). Preliminary observations of
mental disturbances occurring in patients under therapy with
cortisone and ACTH. N Engl J Med, 246, 205-16.
Clayton, L. T. (1985). Taber’s cyclopedic medical dictionary (16th ed.).
Philadelphia: F. A. Davis Company.
Clement, R., & Kruidenier, B.G. (1985). Aptitude, attitude and
motivation in second language proficiency: a test of Clement’s
model. Journal of Language and Social Psychology 4, 21–37.
Cochran, W. G. (1963). Sampling techniques (2nd ed.). New York, NY:
John Wiley & Sons.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of
tests. Psychometrika, 16(3), 297-334.
Cruse, D. A. (1986). Lexical semantics: Cambridge textbooks in
linguistics. Cambridge: Cambridge University Press.
Cuddon, J. A. (1979). A dictionary of literary terms. New York:
Penguin.
de la Jara, R. (n.d.). IQ comparison site. Retrieved July 8, 2008 from
http://www.iqcomparisonsite.com/IQBasics.aspx#
de Vaus, D. A. (1985). Surveys in social research. Sydney: Allen &
Unwin.
Department of Defense (2005). Unpublished standardization data for the
AFQT from the Profiles of American Youth of 1980 and 1997
provided by the US Department of Defense.
Dickens, W. T., & Flynn, J. R. (2006, October). Black Americans
Reduce the Racial IQ Gap: Evidence from Standardization
Samples. Psychological Science (forthcoming).
DiLalla, D. L., & Dollinger, S. J. (2006). Cleaning up data and running
preliminary analyses. In F. T. L. Leong and J. T. Austin (Eds.).
The psychology research handbook: A guide for graduate students
and research assistants (pp. 241-253). California: Sage.

Dixon, W. J., & Massey, F. J. Jr. (1983). Introduction to statistical
analysis (4th ed.). Auckland: McGraw-Hill.
Educational Testing Service. (1991). Reading for TOEFL. Princeton,
NJ: ETS.
Ely, C. M. (1986). Language learning data: a description and causal
analysis. Modern Language Journal 70, 28–35.
Embassy of Islamic Republic of Iran-Copenhagen (n.d.). Education in
the Islamic Republic of Iran. Retrieved July 7, 2008 from
http://www.iran-embassy.dk/fa/culteral/education%20en.pdf
Faber, P. (1994). The semantic architecture of the lexicon. In K.
Hyldgaard & V. H. Pedersen (Eds.). Symposium on lexicography
VI (pp. 37-50). Tubingen: Max Niemeyer.
Faber, P., & Uson, R. M. (1998). Methodological criteria for the
elaboration of a functional lexicon-based grammar of the semantic
domain of cognitive verbs. In H. Olbertz, K. Hengeveld, & J. S.
Garcia (Eds.), The structure of the lexicon in functional grammar.
Amsterdam: John Benjamins.
Fahim, M., & Pishghadam, R. (2007). On the Role of Emotional,
Psychometric, and Verbal Intelligences in the Academic
Achievement of University Students Majoring in English
Language. Asian EFL Journal, 9/4(Conference Proceedings), 240-
253.
Falk, J. S. (1978). Linguistics and language: A survey of basic concepts
and implications (2nd ed.). New York: John Wiley & Sons.
Fancher, R. (1985). The Intelligence Men: Makers of the IQ
Controversy. New York: W.W. Norton & Company.
Faravani, A. (2006). Investigating the effect of reading portfolios on the
Iranian students’ critical thinking ability, reading comprehension
ability, and reading achievement. Unpublished MA thesis,
Ferdowsi University of Mashhad, Iran.
Farhady, H., & Sajadi, F. (1999). Location of the topic sentence, level of
language proficiency, and reading comprehension. Journal of the
Faculty of Foreign Languages, Allame Tabatabaee University,
308-318.
Farhady, H., Jafarpoor, A., & Birjandi, P. (1994). Testing language
skills: From theory to practice. Tehran: SAMT.
Felix, V., & Lawson, M. (1994). The effects of suggestopedic elements
on qualitative and quantitative measures of language production.
ARAL 17(2), 1-21.
Ferguson, G. A. (1971). Statistical analysis in psychology and education
(3rd ed.). Tokyo: McGraw-Hill Kogakusha.
Fidel, R. (1993). Qualitative methods in information retrieval research.
Library and Information Science Research, 15, 219-247.
Field, A. (2008). Multiple regression using SPSS/PASW. Retrieved May
12, 2010 from http://www.statisticshell.com/multireg.pdf
Fielder, S. (2003). Personal webpublishing as a reflective conversational
tool for self organized learning. Proceedings of BlogTalk: A
European Conference on Weblogs. Vienna, Austria, 23-24 May
2003. Retrieved March 29, 2005 from
http://seblogging.cognitivearchitects.com/stories/storyReader
Finocchiaro, M. (1964). English as a second language: From theory to
practice. New York: Simon & Schuster.
Fisher, R. A., & Yates, F. (1963). Statistical tables for biological,
agricultural and medical research (4th ed.). Edinburgh: Oliver &
Boyd.
Fouly, K. (1985). A confirmatory multivariate study of the nature of
second language proficiency and its relationships to learner
variables. Unpublished Ph.D. dissertation, University of Illinois,
Urbana.
Fraenkel, J. R., & Wallen, N. E. (1993). How to design and evaluate
research in education (2nd ed.). New York: McGraw-Hill.
Freund, J. E. (1974). Modern elementary statistics (4th ed.). London:
Prentice-Hall International.
Gardner, R. C. & Lambert, W. E. (1972). Attitudes and Motivation in
Second Language Learning. Rowley, Mass.: Newbury House.
Gardner, R. C., Lalonde, R. N., Moorcroft, R., & Evers, F. T. (1987).
Second language attrition: the role of motivation and use. Journal
of Language and Social Psychology 6, 1–47.
Gardner, R. C. (1988). The socio-educational model of second language
learning: assumptions, findings and issues. Language Learning 38,
101–26.
Gardner, R.C., Lalonde, R.N., & Pierson, R. (1983). The socio-
educational model of SLA: an investigation using LISREL causal
modeling. Journal of Language and Social Psychology 2, 1–15.
Garrett, H. E. (1938). Statistics in psychology and education (2nd ed.).
New York: Longmans, Green and Co.
Gay, L. R. (1990). Educational research: Competencies for analysis and
application (3rd ed.). New York: Merrill.
Genesee, F., & Nicoladis, E. (2006). Bilingual acquisition. In E. Hoff &
M. Shatz (eds.), Handbook of Language Development, Oxford,
Eng.: Blackwell.
Ghobadi, S. (2009). The link between beliefs about language learning
and language proficiency in an EFL context: A comparative study
of teachers and university student beliefs about language learning
in Mashhad. Unpublished MA thesis, Ferdowsi University of
Mashhad, Iran.
Gholami, M. (2006). The effect of content schema type on Iranian test
takers’ performance. Unpublished MA thesis, Ferdowsi University
of Mashhad, Iran.
Ginther, A., & Stevens, J. (1998). Language background, ethnicity, and
the internal construct validity of the Advanced Placement Spanish
language examination. In Kunnan, A. J. (Ed.), Validation in
language assessment (169-94). Mahwah, NJ: Lawrence Erlbaum
Associates.
Gronlund, N. E., & Linn, R. L. (1990). Measurement and evaluation in
teaching (6th ed.). New York: Macmillan.
Guilford, J. P. (1950). Fundamental statistics in psychology and
education (2nd ed.). New York: McGraw-Hill.
Hadhrami, A. A. (2009). Muhammad Marmaduke Pickthall: A
Servant of Islam. Quran & Science: Where Religion Meets
Science. Retrieved August 27, 2010 from
http://www.quranandscience.com/
Hahn, G. J., & Meeker, W. Q. (1993). Assumptions for statistical
inference. The American Statistician, 47, 1–11.
Haladyna, T. M. (1994). Developing and validating multiple-choice test
items. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Hale, G.A., Rock, D.A., & Jirele, T. (1989). Confirmatory factor
analysis of the TOEFL. TOEFL Research Report 32. Princeton:
Educational Testing Service.
Harcourt (2005a). Unpublished standardization data for Wechsler tests:
the WAIS-R, WAIS-III, WISC-R, WISC-III, and WISC-IV.
Harcourt Assessment, Inc.
Harris, M. (1964). The Nature of Cultural Things: Studies in
Anthropology. New York: Random House.
Hatch, E., & Farhady, H. (1982). Research design and statistics for
applied linguistics. Rowley: Newbury House.
Hatch, E., & Lazaraton, A. (1991). The research manual: Design and
statistics for Applied Linguistics. Boston, Massachusetts: Heinle &
Heinle Publishers.
Heaton, J. B. (1988). Writing English language tests (new ed.). London:
Longman.
Hindmarch, I., Johnson, S., Meadows, R., Kirkpatrick, T., & Shamsi, Z.
(2001). The acute and sub-chronic effects of levocetirizine,
cetirizine, loratadine, promethazine and placebo on cognitive
function, psychomotor performance, and weal and flare. Curr Med
Res Opin, 17, 241-55.
Holzman, P. S. (1970). Psychoanalysis and psychotherapy. New York:
McGraw Hill.
Hong, K. (2006). Beliefs about language learning and language
learning strategy use in an EFL context: a comparison study of
monolingual Korean and bilingual Korean-Chinese university
students. Unpublished doctoral dissertation. University of North
Texas.
Horwitz, E. K. (1981). Beliefs about language learning inventory.
Unpublished instrument: The University of Texas at Austin.
Austin, TX.
Horwitz, E.K. (1985). Using student beliefs about language learning and
teaching in the foreign language methods course. Foreign
Language Annals, 18(4), 333-340.
Horwitz, E.K. (1988). The beliefs about language learning of beginning
university foreign language students. The Modern Language
Journal, 72, 283–294.
Howell, D. C. (2002). Statistical methods for psychology (5th ed.).
Australia: Duxbury.
Irving, T. B. (1985). The Qur'an: First American Version. Brattleboro,
Vt.: Amana Books.
Kaiser, H. (1970). A second generation Little Jiffy. Psychometrika, 35,
401-15.
Kaiser, H. (1974). An index of factorial simplicity. Psychometrika, 39,
31-6.
Kaiser, H. F. (1958). The Varimax Criterion for Analytic Rotation in
Factor Analysis. Psychometrika, 23, 187–200.
Kaplan, H. I., & Sadock, B. J. (1995). Comprehensive textbook of
psychiatry (6th ed.). New York: Williams & Wilkins.
Khan, A. (2000). TB Irving (Al-Hajj Talim Ali: A life well lived). New
Delhi, India: Pharos Media & Publishing Pvt Ltd. Retrieved April
8, 2008 from
http://www.milligazette.com/Archives/15102002/1510200232.htm
Khan, Z. (1970). The Qur'an: Arabic Text and English Translation.
London.
Khodadady, E. (1997). Schemata theory and multiple choice item tests
measuring reading comprehension. Unpublished PhD thesis, the
University of Western Australia.
Khodadady, E. (1999a). Multiple choice items in testing: Practice and
theory. Tehran: Rahnama.
Khodadady, E. (1999b). Reading Media texts: Iran-America relations.
Sanandaj: Kurdistan University Press.
Khodadady, E. (2000). Translation: A research method. Scientific and
Research Journal of Kurdistan University, 3/4, 1-22.
Khodadady, E. (2001). Schema: A theory of translation. In S. Cunico
(Ed.). Training Translators and Interpreters in the New
Millennium, Portsmouth 17th March 2001 Conference Proceedings
(pp. 107-123). Portsmouth, England: University of Portsmouth,
School of Languages and Area Studies.
Khodadady, E. (2007). C-Tests method specific measures of language
proficiency. Iranian Journal of Applied Linguistics (IJAL), 10/2,
1-26.
Khodadady, E. (2008). Schema-based textual analysis of domain-
controlled authentic texts. Iranian Journal of Language Studies
(IJLS), 2/4, 431-448.
Khodadady, E. (2009). The beliefs about language learning inventory:
Factorial validity, formal education and the academic achievement
of Iranian students majoring in English. Iranian Journal of
Applied Linguistics (IJAL), 12(1), 115-165.
Khodadady, E. (2012). Validity and tests developed on reduced
redundancy, language components and schema theory. Theory and
Practice in Language Studies, 2(3), 585-595.
Khodadady, E. (2013). Authenticity and sampling in C-Tests: A
schema-based and statistical response to Grotjahn’s critique. The
International Journal of Language Learning and Applied
Linguistics World (IJLLALW), 2(1), 1-17.
Khodadady, E., Pishghadam, R., & Fakhar, M. (2010). The relationship
among reading comprehension ability, grammar and vocabulary
knowledge: An experimental and schema-based approach. Iranian
EFL Journal, 6(2), 7-49.
Khodadady, E., & Elahi, M. (2012). The effect of schema-vs-translation-
based instruction on Persian medical students’ learning of general
English. English Language Teaching, 5 (1), 146-165. URL:
http://dx.doi.org/10.5539/elt.v5n1p146.
Khodadady, E., & Herriman, M. (2000). Schemata Theory and Selected
Response Item Tests: From Theory to Practice. In A. J. Kunnan
(Ed.), Fairness and validation in language assessment (pp. 201-
222). Cambridge: CUP.
Khodadady, E., & Seif, S. (2006, February). Measuring Translation
Ability and Achievement: A Schema-Based Approach. Paper
presented at the third Annual Conference of the TELLSI, Razi
University of Kermanshah, Iran.
Kidwai, A. R. (1987, Summer). Translating the Untranslatable: A
Survey of English Translations of the Quran. The Muslim World
Book Review, 7(4), 66-71. Retrieved August 2, 2008 from
http://www.soundvision.com/Info/quran/english.asp
Klein-Braley, C. (1997). C-Tests in the context of reduced redundancy
testing: an appraisal. Language Testing, 14(1), 47-84.
Kline, P. (1986). A handbook of test construction. New York: Methuen.
Kritzeck, J. (Ed.) (1964). Anthology of Islamic literature from rise of
Islam to the present time. Penguin: The New American Library.
Kunnan, A. J. (1998). An introduction to structural equation modelling
for language assessment research. Language Testing, 15(3),
295–332.
Kunnan, A.J. (1995). Test taker characteristics and test performance: a
structural modelling approach. Cambridge: Cambridge University
Press.
Lagzian, M. (2013). Textual analysis of an English dentistry text and its
translation in Persian: A schema-based approach. Unpublished
MA thesis, Ferdowsi University of Mashhad.
Larson, M. L. (1984). Meaning-based translation: A guide to cross-
language equivalence. Lanham, MD: University Press of America.
Leopold, W. (1949). Speech development of a bilingual child (Volume
4). Evanston, IL: Northwestern University Press.
Lerner, R. M., Kendall, P. C., Miller, D. T., Hultsch, D. F., & Jensen, R.
A. (1986). Psychology. New York: Macmillan Publishing
Company.
Lester, J. D. (1995). Writing research papers: A complete guide (7th ed.).
New York: Harper Collins.
Linacre, J. M. (2005). Correlation Coefficients: Describing
Relationships. Rasch Measurement Transactions, 19/3, 1028-9.
Retrieved September 19, 2008 from
http://www.rasch.org/rmt/rmt193c.htm
Lozanov, G. (1978). Suggestology and outlines of suggestopedy. New
York: Gordon & Breach.
Madsen, H. (1983). Techniques in testing. New York: OUP.
Magruder, K. M., Norquist, G. S., Feil, M. B., Kopans, B., & Jacobs, D.
(1995). Who comes to a voluntary depression screening program.
Am J Psychiatry, 152, 1915-22.
Maibodi, A. H. (2008). Learning English through short stories. Iranian
Journal of Language Studies (IJLS), 2/1, 41-72.
Marsh, C. (1982). The survey method: the contribution of surveys to
sociological explanation. London: George Allen and Unwin.
McBurney, D. H. (1994). Research methods (3rd ed.). Pacific Grove,
California: Brooks/Cole Publishing Company.
McNeil, D. (1966). Developmental psycholinguistics. In F. Smith & G.
A. Miller (Eds.), The genesis of language: A psycholinguistic
approach. Cambridge, Mass.: MIT Press.
Mehrens, W. A., & Lehmann, I. J. (1991). Measurement and
evaluation in education and psychology (4th ed.). Fort Worth:
Holt, Rinehart and Winston, Inc.
Mellon, C. A. (1990). Naturalistic inquiry for library science: methods
and applications for research, evaluation, and teaching. New
York: Greenwood.
Messick, S. A. (1989). Validity. In R. L. Linn (Ed.), Educational
measurement (3rd ed.). New York: American Council of
Education: Macmillan.
Microsoft Encarta [DVD]. (2006). Kurt Lewin. Redmond, WA:
Microsoft Corporation.
Microsoft Encarta [DVD]. (2006). Thomas Edison. Redmond, WA:
Microsoft Corporation, 2005.
Miremadi, S. A. (1991). Theories of translation and interpretation.
Tehran: SAMT.
Mislevy, R. J. (1993). Foundations of a new test theory. In N.
Frederiksen, R. J. Mislevy, & I. I. Bejar (Eds.), Test theory for a
new generation of tests. Hillsdale, NJ: Lawrence Erlbaum
Associates.
Mohammed, K. (2005, Spring). Assessing English Translations of the
Qur'an. Middle East Quarterly. Retrieved August 2, 2008 from
http://www.meforum.org/article/717
Moulton, W. (1961). Linguistics and language teaching in the United
States, 1940-1960. In C. Mohrmann, A. Sommerfelt, & J.
Whatmough (eds.), Trends in European and American Linguistics
1930-1960. Utrecht: Spectrum.
Munby, J. (1978). Communicative syllabus design. Cambridge: CUP.
Newmark, P. (1988). A textbook of translation. New York: Prentice-
Hall.
Nicholson, R. A. (1969). The literary history of the Arabs. Cambridge:
Cambridge University Press.
Nicholson, R. A. (1926). The Mathnawī of Jalālu'ddīn Rumī. London:
Noffke, S. (1990). Action Research: A multidimensional analysis.
Dissertation: University of Wisconsin-Madison.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York:
McGraw-Hill.
Orwell, G. (1958). Why I write. In G. Bott (ed.). George Orwell: selected
writings (pp. 99-105). London: Heinemann Educational Books.
Palmer, E. (1880). The Qur'an. Clarendon: Oxford Press.
Pei, M. (1966). Glossary of linguistic terminology. New York: Anchor
Books.
Pickthall, M. M. (1930). The Meaning of the Glorious Koran.
Hyderabad: Hyderabad Government Press.
Pike, K. (1954). Language in Relation to a Unified Theory of the
Structure of Human Behavior (prelim. ed.). Glendale CA: Summer
Institute of Linguistics.
Plonsky, M. (1997-2006). Psychological statistics: Analysis of variance-
one way. Retrieved September 23, 2008 from
http://www.uwsp.edu/psych/stat/12/anova-1w.htm
Psychorp (2000). Watson-Glaser critical thinking test. Retrieved January
29, 2006 from
http://www.panttesting.com/products/PsychoCorp/WGCTA.asp.
Purcell, E.T. (1983). Models of pronunciation accuracy. In Oller, J.W.
(Ed.), Issues in language testing research (133–53). Rowley, MA:
Newbury House.
Purpura, J. E. (1997). An analysis of the relationship between test
takers’ cognitive and metacognitive strategy use and second
language test performance. Language Learning, 47/2, pp. 289-325.
Purpura, J.E. (1996). Modeling the relationships between test takers’
reported cognitive and metacognitive strategy use and performance
on language tests. Unpublished Ph.D. dissertation, University of
California, Los Angeles.
Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A
comprehensive grammar of the English language. London:
Longman.
Rashad, K. (1978). The Qur'an: The Final Scripture (Authorized English
Version). Tucson.
Reber, A. S. (1995). The Penguin Dictionary of Psychology (2nd ed.).
Toronto: Penguin.
Redhouse, J. W. (1881). The Mesnevi of Mevlânâ Jelâlu'd-dîn
Muhammed er-Rûmî. Book the First. London.
Resnick, L. B., & Resnick, D. P. (1990). Tests as standards of
achievement in schools. In J. Pfleiderer (Ed.), Proceedings of the
1989 ETS Invitational Conference: The uses of standardised tests
in American education (pp. 63-80). Princeton, NJ: Educational
Testing Service.
Rezaee, A. A., & Oladi, S. (2008). The effect of blogging on language
learner’s improvement in social interactions and writing
proficiency. Iranian Journal of Language Studies, 2/1, pp. 73-88.
Richards, J. & Sandy, C. (2000). Passages. Cambridge: CUP.
Richards, J. C. & Rodgers, T. (1986). Approaches and methods in
language teaching: A description and analysis. Cambridge: CUP.
Riverside (2005). Unpublished standardization data for the SB-5.
Roberts, E. V. (1973). Writing themes about literature (3rd ed.).
Englewood Cliffs, NJ: Prentice-Hall.
Rodwell, J. (1909). The Koran—Translated from the Arabic. London:
J.M. Dent & Co.
Ronjat, J. (1913). Le développement du langage observé chez un enfant
bilingue. Paris: Champion.
Rosenthal, R. (1966). Experimenter effects in behavioural research.
New York: Appleton-Century-Crofts.
Ross, A. (1649). The Alcoran of Mahomet translated out of Arabique
into French, by the Sieur Du Ryer...And newly Englished, for the
satisfaction for all that desire to look into the Turkish vanities.
London.
Ross, T. A. (2006). Child Development. Microsoft Encarta [DVD].
Redmond, WA: Microsoft Corporation.
Rumi, M. J. M. B. (2001). Mathnavi Manavi [Spiritual Mathnavi]
(edited in Persian by T. H. Sobhani 1380). Tehran: Rozaneh.
Sale, G. (1880). The Koran Commonly Called the Al-Koran of
Mohammed. New York: W. L. Allison Co.
Sarwar, S. M. (1981). The Holy Qur'an: Arab Text and English
Translation. Elmhurst.
Sasaki, M. (1993). Relationships among second language proficiency,
foreign language aptitude and intelligence: a structural equation
modeling approach. Language Learning 43, 313–44.
Schmitt, N., Schmitt, D., & Clapham, C. (2001). Developing and
exploring the behavior of two new versions of the Vocabulary
Levels Test. Language Testing, 18 (1), 55-88.
Seif, S., & Khodadady, E. (2003). Schema-based cloze multiple choice
item tests: measures of translation ability. Universite de Tabriz,
Revue de la Faculte des Lettres et Sciences Humaines, Langue
187(46), 73-99.
Seliger, H. W., & Shohamy, E. (1989). Second language research
methods. Oxford: OUP.
Shafi, M. (1995). Ma'ariful-Qur'an, Vol. 1 (translated by M. H. Askari
& M. Shameem) [revised by M. T. Usmani]. Karachi: Darul-
Uloom.
Shafi, M. (n.d.). Ma'ariful-Qur'an, Vol. 6 (translated by Muhammad
Ishrat Husain) [revised by M. T. Usmani]. Karachi: Darul-Uloom.
Shahriari, S. (1998, April 27). The Reed Flute. Vancouver: Canada.
Retrieved August 13, 2008 from
http://www.rumionfire.com/mathnavi/1_1.htm
Shariati, A. (1981). Farhang Loghateh Kotobeh Dr Ali Shariati [1362]
(A glossary of terms used in Dr. Ali Shariati’s works.). Tehran:
Ferdosi Publications.
Shepard, L. (1991a). Interview on assessment issues with Lorrie
Shepard. Educational Researcher, 20/2, 21-23.
Shepard, L. (1991b). Psychometricians’ beliefs about learning.
Educational Researcher, 20/6, 2-16.
Sherali, M. (1955). The Holy Qur'an: Arabic Text with English
Translation. Rabwah.
Spielberger, C. D. (1983). Manual for the state trait anxiety inventory.
CA: Consulting Psychologists Press.
Spielberger, C. D., Gorsuch, R. L., Lushene, R. E. (1983). Manual for
the state-trait anxiety inventory. Palo Alto, California: Consulting
Psychologists Press.
Spolsky, B. (1973). What does it mean to know a language; or how do
you get somebody to perform his competence? In J. W. Oller, Jr.
and J. R. Richards (Eds.). Focus on the learner (pp.164-76).
Rowley, MA: Newbury House.
Stevens, J. (1996). Applied multivariate statistics for the social sciences
(3rd ed.). Mahwah, NJ: Lawrence Erlbaum.
Stratton, P., & Hayes, N. (1988). A student’s dictionary of psychology.
London: Edward Arnold.
Streiner, D. L., & Norman, G. R. (1989). Health measurement scales: A
practical guide to their development and use. New York: Oxford
University Press, Inc.
Sutton, B. (1993). The rationale for qualitative research: a review of
principles and theoretical foundations. Library Quarterly, 63, 411-
430.
Swinton, S.S., & Powers, D.E. (1980). Factor analysis of the TOEFL.
TOEFL Research Report 6. Princeton, NJ: Educational Testing
Service.
Tabachnick, B. G., & Fidell, L. S. (2007). Using Multivariate Statistics
(5th ed.). Boston: Pearson Education.
Tajareh, A. Z., & Tahririan, M. H. (2002). College EFL learners’
achievement and their language learning strategies. Sheikhbahee
Research Bulletin 1/2, pp. 32-46.
The Ministry of Science, Research and Technology (n.d.). Distribution
of students based on provinces and degrees in 2004-5 (83-84)
[Table 2.9]. Retrieved July 22, 2008 from
http://www.msrt.gov.ir/default.aspx
Thomson Reuters (2009). Source publication list for Web of science:
Social Sciences Citation Index. Retrieved May 12, 2010 from
http://science.thomsonreuters.com/mjl/publist_ssci.pdf
Thomson Reuters (2010). ISI Web of Knowledge: Journal Citation
Reports (2008 JCR Social Science Edition). Retrieved May 13,
2010 from
http://admin-apps.isiknowledge.com/JCR/JCR?RQ=SELECT_ALL&cursor=61
Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986). The Stanford-
Binet Intelligence Scale: Technical Manual (4th ed.). Chicago:
Riverside Publishing.
Tindal, G. A., & Marston, D. B. (1990). Classroom-based assessment:
Evaluating instructional outcomes. Columbus, Ohio: Merrill
Publishing Company.
Tucker, L. R., & MacCallum, R. C. (1997). Exploratory Factor
Analysis. Retrieved May 7, 2008 from
http://www.unc.edu/~rcm/book/factor.pdf
Turner, C. (1989). The underlying factor structure of L2 cloze test
performance in francophone, university-level students: causal
modeling as an approach to construct validation. Language Testing
6, 172–97.
Urdang, L. (Ed.) (1977). The Oxford thesaurus (2nd ed.). Oxford: OUP.
Valdez-Pierce, L., & O’Malley, J. (1992). Performance and portfolio
assessment for language minority students. NCBE Program
Information Guide Series, No. 9.
http://ncbe.gwu.edu/ncbepubs/pigs/pig9.htm
Wardhaugh, R. (1972). Introduction to linguistics. New York:
McGraw-Hill.
Vandergrift, L. (2006). Second Language Listening: Listening Ability or
Language Proficiency? The Modern Language Journal, 90,
6–18.
Venuti, L. (1995). The translator’s invisibility: A history of translation.
London: Routledge.
Vollmer, H. J., & Sang, F. (1983). Competing hypotheses about second
language ability: A plea for caution. In J. W. Oller, Jr. (Ed.).
Issues in language testing research. Rowley, MA: Newbury
House.
Walle, A. H. (1997). Quantitative versus qualitative tourism research.
Annals of Tourism Research, 24/3, 524-536.
Wang, L.-S. (1988). A comparative analysis of cognitive achievement
and psychological orientation among language minority groups: a
LISREL approach. Ph.D. dissertation, University of Illinois,
Urbana.
Wardhaugh, R. (1992). An introduction to sociolinguistics (2nd ed.).
Oxford: Blackwell.
Wechsler, D. (1944). The Measurement of Adult Intelligence.
Baltimore: The Williams & Wilkins Company.
Westbrook, L. (1994). Qualitative research methods: a review of major
stages, data analysis techniques, and quality controls. Library and
Information Science Research, 16, 241-254.
Whinfield, E. H. (1887). Masnavī-i Ma'navī, The Spiritual Couplets of
Maulānā Jalālud-dīn Muhammad-i Rumī. London.
Wilss, W. (1982). The science of translation: Problems and methods.
Tübingen: Gunter Narr Verlag.
Wilss, W. (1977). The science of translation: Problems and methods.
Tübingen: Gunter Narr Verlag.
Yano, Y., Long, M. H., & Ross, S. (1994). The effects of simplified and
elaborated texts on foreign language reading comprehension.
Language Learning, 44(2), 189-219.
Yashima, T., Zenuk-Nishide, L. & Shimizu, K. (2004). The influence of
attitudes and affect on willingness to communicate and second
language communication. Language Learning 54/1, pp. 119-152.
Yusuf Ali, A. (1989). The Holy Qur’an: Text, Translation and
Commentary. Washington, DC: Amanah.
Zenderland, L. (1998). Measuring Minds: Henry Herbert Goddard and
the Origins of American Intelligence Testing. Cambridge:
Zung, W. W., Richards, C. B., & Short, M. J. (1965). A self-rating
depression scale in outpatient clinic. Further validation of SDS.
Arch Gen Psychiatry, 13, 508-15.
