1 Contribution from the Inst. of Statistics, North Carolina State Univ., Raleigh, NC 27650.
2 Professors of statistics, North Carolina State Univ., Raleigh, NC 27650.

[Figure: field layout with panels labeled REP and CHEMICAL]
Fig. 1. Example of an aerial spraying experiment for plant disease control in a South American country.
NELSON & RAWLINGS: MISUSES OF STATISTICS IN AGRONOMY 101
When experiments are conducted in series (over sites and/or years) it is necessary to
provide different randomizations for each experiment. This will reduce biases which might
result from two adjacent treatments interacting and will tend to equalize the precision of
all comparisons.

There are also experiments in which the entire experimental process is a chain of
individual steps. For example, plants might be grown according to an experimental design
in the greenhouse and then transferred to the field and used in a second stage experiment
under field environmental conditions. Or, more likely, plants may be grown in a haphazard
arrangement in the greenhouse for part of their growth cycle and then put into a designed
experiment at a specified stage of growth. Proper randomization at that stage would avoid
biasing treatment effects and experimental error (arising from environmental variation
during the earlier stage of growth) but a more efficient experimental design would
incorporate provisions for error control at all stages of plant growth.

Misuse 4--Using an Improper Size of an Experiment

It is important to use an appropriate number of replications in an experiment.
Under-replication could result in very imprecise estimates, whereas over-replication can
be costly. Agronomists probably err on the side of under-replication more often than on
the other side. One still sees self-contained field experiments where the researcher is
attempting to achieve adequate precision with only two replications. There are very few
field situations where this number of replications would be adequate.

Table 2.1 in Cochran and Cox (1957) provides a useful guide to the determination of the
proper number of replications if the approximate variability of the experimental material
(coefficient of variation) is known and the researcher is willing to estimate differences
between means of a given percent. It is also necessary to assume a probability level for
the test (α) and a probability of rejecting a false null hypothesis.

Equally important in agronomic research is for the replication to adequately sample the
reference population of environmental conditions. This is usually accomplished by growing
the test over several years at several locations within the geographical area of interest.
It is very unlikely that a test at one site in 1 year will provide a reliable inference to
any except the most restricted reference population of environments.

Misuse 5--Using Improper Experimental Technique

The precision of data from an experiment depends to a large degree upon the experimental
technique used. Because statistics deal with variability and methods for dealing with it,
experimental technique does fall within the realm of a statistician's concern. In fact,
statisticians have perhaps made one of their more important contributions to agronomic
research in asking questions of the researcher about his experimental technique with a
view to its improvement.

We find that many agronomists do not know the meaning of effective blocking. In many
cases, they are blocking just to provide replication, not error control. Others attempt to
run experiments before they have adequately become familiar with the experimental process
(perhaps through small pilot studies) and so they do not use the best technique. Some lack
care in controlling variation. We say that their experimental technique is out-of-control.
Others do not oversee the experimental process adequately or they do not take note of
unusual events which took place at the experimental site. Consequently, when these unusual
events have generated "outliers" in the experimental data, there is no basis for rejection
of the extreme datum points from the data set.

Overall precision can be increased by using a uniform experimental technique throughout
the series of experiments. Some ways of standardizing technique are to: 1) write out
procedures for conducting various phases of the experiments and a time schedule for their
execution; 2) make all personnel dealing with the treatments, plots and data aware of the
various sources of error and the need for good technique; 3) apply the treatments
uniformly; 4) exercise sufficient control over external influences so that every treatment
produces its effect under controlled, comparable conditions; 5) devise suitable unbiased
measures of the effects of treatments; and 6) prevent gross errors.

There are many aspects to technique such as choice of proper plot size and shape, choice
of dosages in quantitative controlled variable experiments and proper timing of
operations. It is very important from a statistical point of view that individuals who lay
out the experiments are trained in the subject matter discipline as well as in field plot
technique. Otherwise it is impossible to provide credentials to imprecise data once they
have been collected under dubious sets of circumstances.

MISUSES IN ANALYSIS AND INTERPRETATION OF DATA

Misuse 6--Using Inappropriate Error Terms for Testing or for Calculating Standard Errors

Use of an inappropriate error term for testing or for providing standard errors has been a
problem for many years but has increased recently with the common use of statistical
computing packages which use a default error term. By this is meant that all terms not
included in the linear model spelled out in the instructions to the computer are pooled
into an error term which the computer uses for tests and estimates of precision. In a very
large proportion of the cases, the tests of significance automatically provided by
computer packages are incorrect. Each user of a computer package is responsible for the
correctness of his or her analysis.

The analysis of variance of data from a randomized complete block design in which each
plot has been sampled (Table 1) has a sampling error in addition to the
usual experimental error. The appropriate error term for testing treatments is
experimental error, not sampling error. The F of 2 based upon forming the ratio of
Treatment MS to Error MS with 9 and 27 degrees of freedom is not significant at the 0.05
level. If one placed only Blocks and Treatments in the linear model used in fitting by the
computer, the pooled error would be [(27 x 2000) + (200 x 400)]/227 = 590. Using this
inappropriate pooled error, the computer would use a denominator of 590 rather than 2000
in the F ratio and the denominator degrees of freedom would be 227 rather than the correct
value of 27. The resulting inference would be incorrect, i.e., treatments would now be
significant at the 0.05 level.

Table 1. An analysis of variance of data from a randomized complete block experiment (four
blocks and 10 treatments) in which the entire plot was not harvested, but instead six
samples were taken within each plot and analyzed for level of the response variable being
studied.

Source             df      MS      F
Blocks              3    9000
Treatments          9    4000      2 NS
Error              27    2000
Sampling error    200     400

Or if the model is specified so that "error" is separated from "sampling error," some
computer packages will use the sampling error in testing the treatments category resulting
in a large upward bias in the Treatments F ratio.

There are other cases where the researcher desires to use the appropriate error term but
it is difficult due to the nature of the design constraints. A common example is the
design of growth chamber experiments. It is difficult to provide enough chambers for
replication of the temperature-humidity conditions. The chambers are often shared by a
number of researchers and also it is difficult to change the temperature-moisture settings
for a given chamber. Replication of the factor-combinations within the chamber is achieved
more readily but the error term resulting from this second type of replication is not
appropriate for testing temperature-humidity treatments. The answer to this problem
usually is to provide multiple runs of the experiment using the same temperature settings
within a chamber from run to run but with new randomizations of the factor-combinations
within a chamber for the various runs. The run then serves as a block in a randomized
complete block whole-plot arrangement of temperatures and the plots within the chambers
within a run are considered as sub-plots.

In short, the researcher needs to be sure that the appropriate error term is being used
whether the analysis of variance is being conducted by a desk calculator under his
personal supervision or whether by an electronic computer. The computer is in no position
to determine the appropriate error term with which to test so one should not blindly
assume that significance tests or error estimates obtained from the computer were obtained
using the proper error. The procedure of writing out expectations of mean squares which is
described in statistics methods texts is useful in guiding one to the appropriate error
term for testing, especially in situations where there are several plot sizes or levels of
error within the same experiment.

Misuse 7--Failing to Study Patterns in Data

With modern computers, there is a tendency to routinely process data through standard data
analysis routines without careful study of the patterns of variation in the data. As a
result, researchers are much less familiar with the behavior of their experimental data
than when analyses were done by hand and it is easy for badly behaved data to escape
detection. The presence of a single outlier can grossly inflate variances without being
detected if, for example, only routine analyses of variance are run. Heterogeneous
variances, inadequacies of the model, and model assumptions will seldom be detected
without a careful study of the data. If the data set is too large for a careful study by
hand, various computer programs for editing, residual analysis, tests of normality, etc.,
are available for assisting a complete analysis of the data. If causes for the outliers
can be identified, it is possible to replace them by missing plot estimates. In some
cases, entire sections of the data (or even entire data sets) are rendered invalid by
effects of uncontrolled factors or by improper experimental design or layout. It is
important that such cases be recognized and dealt with appropriately.

Detection of portions of the data where the variance differs from that of other parts of
the data may be accomplished by comparing the ranges among the replications for each
treatment or by analysis of residuals. There are several courses of action once it has
been determined that the errors in the data set are heterogeneous. In some cases, a
transformation such as the log-transformation may be used. Another approach is to
partition the data into sets which have homogeneous variance and conduct separate analyses
of variance for each set.

A careful study of the data patterns will also help to determine if the a priori
biological model is adequate or if the patterns show that some other model would be more
appropriate.

Misuse 8--Depending Excessively on One Class of Statistical Analyses

In agronomic research, the analysis of variance has almost become the universal method of
analysis. This is the statistical procedure emphasized in all basic statistical methods
courses and it is familiar to most agronomists. The analysis of variance is a powerful
technique for understanding variational patterns in the data and deserves to be a primary
tool. However, excessive dependence on the analysis of variance, or any other single
statistical method of analysis, is a handicap to the researcher. There are two problems
associated with this dependence. First, the particular statistical analysis simply may be
inappropriate for the problem either because basic assumptions required for the analysis
are not satisfied or because other more powerful methods may be available. Secondly, other
methods of analysis may be more revealing of the basic structure of the data. For example,
a principal component analysis or other multivariate methods can be very informative of
the correlational structure among several variables. The researcher should always be
critical of the standard method of analysis and seek statistical assistance if he suspects
his method is inadequate in any sense.

JOURNAL OF AGRONOMIC EDUCATION, VOL. 12, 1983

Table 2. Example of an analysis of variance table which might be included in the RESULTS
section

Source             df      MS      F
Blocks              3    1900
Treatments          4    1450    4.8*
  Ck vs others      1    2225    7.5*
  Variety (V)       1    1500    5.0*
  N-rate (N)        1    1250    4.2
  V × N             1     825    2.8
Error              12     300
* Significant at the 0.05 level.
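
The arithmetic behind Tables 1 and 2 can be checked with a few lines of code. The following is a minimal sketch in Python and is not part of the original article; the helper name pooled_mean_square and all variable names are illustrative assumptions. It pools the Error and Sampling error lines of Table 1 by weighting each mean square by its degrees of freedom, then recomputes the F ratios of Table 2 and confirms that the four single-degree-of-freedom contrasts add up to the Treatments sum of squares.

```python
def pooled_mean_square(terms):
    """Pool (df, MS) terms into one mean square, weighting each MS by its df.

    Hypothetical helper: it mimics what a package does when every term left
    out of the model statement is lumped into a single default error.
    """
    total_df = sum(df for df, _ in terms)
    total_ss = sum(df * ms for df, ms in terms)
    return total_ss / total_df, total_df


# --- Table 1: randomized complete block design with sampling within plots ---
treatment_ms = 4000.0
experimental_error = (27, 2000.0)   # (df, MS), the correct error term
sampling_error = (200, 400.0)       # (df, MS), within-plot samples

# Correct test: Treatments MS over experimental Error MS (9 and 27 df).
correct_f = treatment_ms / experimental_error[1]

# Default-error test: only Blocks and Treatments named in the model.
pooled_ms, pooled_df = pooled_mean_square([experimental_error, sampling_error])
wrong_f = treatment_ms / pooled_ms

print(f"correct F = {correct_f:.2f} on 9 and {experimental_error[0]} df")
print(f"pooled error MS = {pooled_ms:.1f} on {pooled_df} df")
print(f"F from pooled error = {wrong_f:.2f} on 9 and {pooled_df} df")

# --- Table 2: F ratios and the partition of the Treatments line ---
error_ms = 300.0
treatments = (4, 1450.0)            # (df, MS)
contrasts = {                       # single-df contrasts, (df, MS)
    "Ck vs others": (1, 2225.0),
    "Variety (V)": (1, 1500.0),
    "N-rate (N)": (1, 1250.0),
    "V x N": (1, 825.0),
}

# Each F ratio is the line's mean square divided by the Error mean square.
f_ratios = {"Treatments": treatments[1] / error_ms}
f_ratios.update({name: ms / error_ms for name, (_, ms) in contrasts.items()})

# The contrast sums of squares should add up to the Treatments sum of squares.
treatment_ss = treatments[0] * treatments[1]
contrast_ss = sum(df * ms for df, ms in contrasts.values())

for name, value in f_ratios.items():
    print(f"{name:<14} F = {value:.2f}")
print(f"Treatments SS = {treatment_ss:.0f}, sum of contrast SS = {contrast_ss:.0f}")
```

Run as written, the pooled mean square comes out near 590 on 227 df, so the Treatments F ratio rises from 2 to roughly 6.8, which is the inflation described under Misuse 6. The recomputed ratio for Ck vs others (2225/300, about 7.4) differs in the last digit from the printed 7.5; the table is kept as published.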