
Linear, bilinear, and linear-bilinear fixed and mixed models for analyzing genotype × environment interaction in plant breeding and agronomy
Jose Crossa¹, Mateo Vargas¹,², and Arun Kumar Joshi³
¹Biometrics and Statistics Unit, Crop Research Informatics Laboratory, International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600 México DF, México; ²Universidad Autónoma de Chapingo, Chapingo, México; and ³Global Wheat Breeding Program, CIMMYT, South Asia Office, Kathmandu, Nepal. Received 7 January 2010, accepted 11 March 2010.
Crossa, J., Vargas, M. and Joshi, A. K. 2010. Linear, bilinear, and linear-bilinear fixed and mixed models for analyzing genotype × environment interaction in plant breeding and agronomy. Can. J. Plant Sci. 90: 561–574. The purpose of this manuscript is to review various statistical models for analyzing genotype × environment interaction (GE). The objective is to present parsimonious approaches other than the standard analysis of variance of the two-way effect model. Some fixed effects linear-bilinear models, such as the sites regression model (SREG), are discussed, and a mixed effects counterpart, the factor analytic (FA) model, is explained. The role of these linear-bilinear models in assessing crossover interaction (COI) is explained. One class of linear models, namely factorial regression (FR) models, and one class of bilinear models, namely partial least squares (PLS) regression, allow incorporating external environmental and genotypic covariables directly into the model. Examples illustrating the use of various statistical models for analyzing GE in the context of plant breeding and agronomy are given.

Key words: Least squares, singular value decomposition, environmental and genotypic covariables

Crossa, J., Vargas, M. et Joshi, A. K. 2010. Modèles linéaires, bilinéaires et linéaires-bilinéaires fixes et mixtes pour l’analyse
des interactions du génotype et de l’environnement en phytogénétique et en agronomie. Can. J. Plant Sci. 90: 561–574. Cet
article passe en revue divers modèles statistiques servant à analyser les interactions entre le génotype et l’environnement
(GE). L’objectif consiste à présenter des approches parcimonieuses différentes de l’analyse usuelle de la variance avec le
modèle à deux axes. On y présente des modèles linéaires-bilinéaires aux mêmes effets fixes, comme le modèle de régression
des sites (SREG), et propose une contrepartie à effets mixtes, comme le modèle à analyse factorielle. Les auteurs décrivent
comment les modèles linéaires-bilinéaires servent à évaluer l’interaction des effets croisés. Un type de modèle linéaire, ceux
à régression factorielle, et un type de modèle bilinéaire, ceux à régression partielle des moindres carrés, permettent
d’intégrer directement des covariables environnementales et génotypiques. Suivent des exemples illustrant l’utilisation de
divers modèles statistiques permettant d’analyser les GE dans le contexte de l’amélioration génétique des plantes et de
l’agronomie.

Mots clés: Moindres carrés, décomposition en valeurs singulières, covariables environnementales et génotypiques

Presented at the Statistics Symposium held during the Annual Meeting of the Canadian Society of Agronomy, 7 August 2009, Guelph, Ontario.

The presence of genotype × environment interactions (GE) in plant breeding and agronomy experiments is expressed either as inconsistent responses of some genotypes relative to others due to genotypic rank change or as changes in the absolute differences between genotypes without rank change. Several models are commonly used for describing the mean response of genotypes over environments and for studying and interpreting GE in agricultural experiments: linear models, bilinear models, and linear-bilinear models. One class of linear models, namely factorial regression (FR) models, and one class of bilinear models, namely partial least squares (PLS) regression, allow incorporating external environmental and genotypic covariables directly into the model.

The basic two-way fixed effects linear model for GE analyses considers that the empirical mean response, $\bar{y}_{ij}$, of the $i$th genotype ($i = 1, 2, \ldots, I$) in the $j$th environment ($j = 1, 2, \ldots, J$) with $n$ replications in each of the $I \times J$ cells is expressed as

$$\bar{y}_{ij} = \mu + \tau_i + \delta_j + (\tau\delta)_{ij} + \bar{\varepsilon}_{ij} \qquad (1)$$

where $\mu$ is the grand mean over all genotypes and environments, $\tau_i$ is the additive effect of the $i$th genotype, $\delta_j$ is the additive effect of the $j$th environment, $(\tau\delta)_{ij}$ is the non-additivity interaction (GE) of the $i$th genotype in the $j$th environment, and $\bar{\varepsilon}_{ij}$ is the average error assumed to be NID$(0, \sigma^2/n)$ (where $\sigma^2$ is the within-environment error variance, assumed to be constant). For a complete random model, it is assumed that
$\tau_i$, $\delta_j$, and $(\tau\delta)_{ij}$ are normally and independently distributed, with variances $\sigma^2_{\tau}$, $\sigma^2_{\delta}$, and $\sigma^2_{\tau\delta}$, respectively. Yates and Cochran (1938) introduced the model in which the GE term is linearly related to the environmental main effect.

The purpose of this paper is to present parsimonious approaches to GE analysis other than the model in Eq. 1. Examples illustrating the use of various statistical models for analyzing GE in the context of agronomy and plant breeding experiments are given.

STATISTICAL MODELS FOR GENOTYPE × ENVIRONMENT INTERACTION

Fixed Effect Linear-bilinear Models
Williams (1952) considered the model $\bar{y}_{ij} = \mu + \tau_i + \lambda\alpha_i\gamma_j + \bar{\varepsilon}_{ij}$, where $\lambda$ is the largest singular value of $Z$ (for $Z = [\bar{y}_{ij} - \bar{y}_{i.}]$) and $\alpha_i$ and $\gamma_j$ are elements of the corresponding eigenvectors of $ZZ'$ and $Z'Z$. Gollob (1968) and Mandel (1969, 1971) extended Williams’ (1952) work by considering the bilinear GE term as $(\tau\delta)_{ij} = \sum_{k=1}^{t} \lambda_k \alpha_{ik} \gamma_{jk}$. Thus, the general formulation of the linear-bilinear model is

$$\bar{y}_{ij} = \mu + \tau_i + \delta_j + \sum_{k=1}^{t} \lambda_k \alpha_{ik} \gamma_{jk} + \bar{\varepsilon}_{ij} \qquad (2)$$

where the constant $\lambda_k$ is the singular value of the $k$th multiplicative component, ordered so that $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_t$; the $\alpha_{ik}$ are elements of the $k$th left singular vector of the interaction and represent the genotypic sensitivity to hypothetical environmental factors represented by the $k$th right singular vector with elements $\gamma_{jk}$. The $\alpha_{ik}$ and $\gamma_{jk}$ satisfy the constraints $\sum_i \alpha_{ik}\alpha_{ik'} = \sum_j \gamma_{jk}\gamma_{jk'} = 0$ for $k \ne k'$ and $\sum_i \alpha_{ik}^2 = \sum_j \gamma_{jk}^2 = 1$. Gabriel (1978) described the least squares fit of Eq. 2 and explained how the residual matrix of the GE term

$$Z = [\bar{y}_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..}]$$

is subjected to a singular value decomposition (SVD) after adjusting for the additive (linear) terms. Gauch (1988) called Eq. 2 the Additive Main Effect and Multiplicative Interaction (AMMI) model.

Other classes of linear-bilinear models described by Cornelius et al. (1996) are the Genotypes Regression Model (GREG)

$$\bar{y}_{ij} = \mu_i + \sum_{k=1}^{t} \lambda_k \alpha_{ik} \gamma_{jk} + \bar{\varepsilon}_{ij},$$

the Sites (environments) Regression Model (SREG)

$$\bar{y}_{ij} = \mu_j + \sum_{k=1}^{t} \lambda_k \alpha_{ik} \gamma_{jk} + \bar{\varepsilon}_{ij},$$

the Completely Multiplicative Model (COMM)

$$\bar{y}_{ij} = \sum_{k=1}^{t} \lambda_k \alpha_{ik} \gamma_{jk} + \bar{\varepsilon}_{ij},$$

and the Shifted Multiplicative Model (SHMM)

$$\bar{y}_{ij} = \beta + \sum_{k=1}^{t} \lambda_k \alpha_{ik} \gamma_{jk} + \bar{\varepsilon}_{ij}.$$

In matrix notation, these linear-bilinear models can be expressed as $Y = \sum_{k=1}^{m} \beta_k X_k + A\Lambda\Gamma' + E$ (Cornelius and Seyedsadr 1997), where $Y = [\bar{y}_{ij}]$; $X_k = [x_{kij}]$; $E = [\bar{\varepsilon}_{ij}]$; $\Lambda = \mathrm{diag}(\lambda_k, k = 1, 2, \ldots, t)$, $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_t$; $A = (\alpha_1, \ldots, \alpha_t)$, $\Gamma = (\gamma_1, \ldots, \gamma_t)$, and $A'A = \Gamma'\Gamma = I_t$. The $x_{kij}$ are known constants and $\beta_k$, $\lambda_k$, $\alpha_{ik}$, and $\gamma_{jk}$ are parameters to be estimated.

Fixed Effects Sites Regression Model
Note that in the SREG model, the bilinear term models the main effects of genotypes (G) plus the GE interaction (its biplot is usually called a GGE biplot), and the composition of the two-way $I \times J$ matrix to be subjected to singular value decomposition is different from the one used in the AMMI model. Furthermore, SREG with two components (SREG2) can be perceived as consisting of a set of multiple regression equations (one for each environment), each regression equation consisting of an environmental mean or environmental effect as intercept plus two terms for regression on two genotypic regressor variables, $\alpha_{i1}$ and $\alpha_{i2}$ (either observed or latent), with $\gamma_{j1}$ and $\gamma_{j2}$ as the regression coefficients.

When the correlation between the site means and the first principal components of sites in the SREG2 model is high (say $\ge 0.80$), then the SREG model has the property that the first principal component of SREG2 accounts for non-crossover interaction (non-COI) and the second principal component of SREG2 is due to COI variability; this should hold for FA(2) as well (Burgueño et al. 2008). This is an important property that allows using the biplot for discriminating groups of sites and genotypes with and without crossover interactions.

Recently, the merits and demerits of AMMI vs. GGE biplots for genotype and environment identification (Yan et al. 2007; Gauch et al. 2008) have been examined and discussed. Yang et al. (2009) pointed out the advantages and disadvantages of these fixed effects linear-bilinear models and discussed relevant issues concerning the use of biplot analysis as a descriptive statistical tool. One of the main issues pointed out by Yang et al. (2009) was that genotypes or environments, or both, may be considered realizations of random variables and therefore must be treated as random effects. Another relevant issue pointed out by Yang et al. (2009) is whether the biplot can detect crossover interaction.

Mixed Effect Linear-bilinear Models
A mixed-model analogue of AMMI or SREG has been developed using the factor analytic (FA) model for approximating the variance-covariance GE structure (Piepho 1998; Smith et al. 2002, 2005; Piepho and Mohring 2005). Research conducted by Crossa et al. (2006) and Burgueño et al. (2008) described how to model

variance-covariance GE and GGE using the FA model and how to incorporate the additive (relationship A) matrix and the additive × additive covariance matrix into the FA model based on pedigree information. Burgueño et al. (2008) also described the equivalence between SREG2 and FA(2) for finding subsets of genotypes and environments without COI.

Factor Analytic and Sites Regression Models for Assessing Crossover Genotype × Environment Interaction
In the FA model, the random effect of the $i$th genotype in the $j$th environment ($g_{ij}$) is expressed as a linear function of latent variables $x_{ik}$ with coefficients $\delta_{jk}$ for $k = 1, 2, \ldots, t$, plus a residual, $\eta_{ij}$; i.e., $g_{ij} = \mu_j + \sum_{k=1}^{t} x_{ik}\delta_{jk} + \eta_{ij}$, so that the $ij$th cell mean can be written as $y_{ij} = g_{ij} + \varepsilon_{ij}$. With only the first two latent factors being retained, $g_{ij}$ is approximated by $g_{ij} \approx \mu_j + x_{i1}\delta_{j1} + x_{i2}\delta_{j2} + \eta_{ij}$. Therefore, there is a clear connection between the SREG2 and FA(2) models, as described by Burgueño et al. (2008). A similar connection between the AMMI2 and FA(2) models was established by Smith et al. (2002).

Under principal component rotation, the directions and projections of the vectors of FA(2) and SREG2 in the biplot are the same. Therefore, the property of the SREG by which the first principal component of SREG2 accounts for non-crossover interaction (non-COI) and the second principal component of SREG2 is due to COI variability should hold for FA(2) as well.

It should be pointed out that the absolute values of genotypic and environmental scores under the FA(2) and SREG2 models may not necessarily be the same; the estimates of the random effects in the FA(2) model are BLUPs (Best Linear Unbiased Predictions) (Henderson 1984), whereas the estimates in the fixed effects SREG2 model are least squares estimates, that is, Best Linear Unbiased Estimates (BLUE). Furthermore, the standard errors of the estimable functions of fixed effects under SREG differ from those of predictable functions of a mixture of fixed and random effects under FA, and FA models are more flexible in handling unbalanced data (the SREG model does not handle missing data).

Additional Options for Assessing COI and Non-COI
The GREG linear-bilinear model defined above is a reparameterization of the stability analysis models of the Finlay and Wilkinson (1963) linear regressions of yields on environment means and the Eberhart and Russell (1966) model, with the first multiplicative term, $\lambda_1\alpha_{i1}\gamma_{j1}$, perceived as the genotype regressions, with coefficients $\alpha_{i1}$ on environmental indices $\gamma_{j1}$ (the scale parameter $\lambda_1$ can be absorbed into $\alpha_{i1}$ or $\gamma_{j1}$ or partially into each), and the deviation modeled as multiplicative components provided that $t > 1$.

The simple Finlay and Wilkinson regression shows COI and non-COI because, for any pair of genotypes, if their lines cross, that is COI; if their lines do not cross but have different slopes, that is non-COI; and if their lines are nearly parallel, that is negligible COI. In the special case that GE has a rather simple structure, so that these regressions capture nearly as much GE as do the AMMI1 or SREG2 models, this could be the preferred model because of its conceptual simplicity.

In the most common case of a complex GE, yields for each genotype can be graphed as a linear function of the environment first interaction component scores using the AMMI1 model, as in Fig. 3 of Gauch et al. (2008). This shows COI and non-COI. Furthermore, the genotype means are also shown by the height of each line at an interaction score of zero. Hence, this graph may be considered comparable to the SREG2 counterpart. In case of very complex GE, the AMMI2 model can be used to show COI, as in Fig. 4 of Gauch et al. (2008), which shows COI not for all genotypes, but for those of greatest interest, namely those genotypes that are the winners. Its counterpart, generally recovering a similar amount of GE, would be the SREG3 model, but no graphical representation of that model has yet been published. However, the example used by Gauch et al. (2008) is rather small in size as compared with those usually present in international plant breeding trials.

Given the Finlay-Wilkinson (and joint regression), AMMI1, and AMMI2 counterparts of SREG2 for graphical display of COI and non-COI, various circumstances for a given dataset might indicate a preference for the Finlay-Wilkinson or AMMI1 or AMMI2 model. However, as described and explained by Yang et al. (2009), the fixed effects AMMI and SREG models have the drawback of not incorporating the natural uncertainty present when estimating the GE parameters from the data; on the other hand, the mixed AMMI and SREG versions consider this uncertainty when estimating the Best Linear Unbiased Prediction (BLUP). Finally, relying only on the display of genotypes and sites offered by the biplot from the fixed effects AMMI and SREG models does not seem to be a powerful scientific methodology for predicting the performance of genotypes in other sites in future years.

There are several statistical as well as biological reasons to prefer SREG over AMMI for assessing COI and non-COI under the common situation of a complex GE: (1) SREG is a more parsimonious model than AMMI, (2) SREG incorporates the main effect of genotypes directly into the statistical analysis; this is of importance for breeders’ objectives that require including the main performance of genotypes in the model, (3) the mixed SREG model can be fitted much more easily than the mixed AMMI model, and (4) the mixed SREG model, as proved by Burgueño et al. (2008), is useful for delineating mega-environments using a formal statistical approach based on the factor analytic model rather

than a mere visualization of genotypes and sites given by the biplot.

INCORPORATING EXTERNAL COVARIABLES FOR EXPLAINING GENOTYPE × ENVIRONMENT INTERACTION
Factorial regression (FR) and partial least squares (PLS) analysis (e.g., van Eeuwijk et al. 1996; Vargas et al. 1999) are useful for studying the effects of both genetic and environmental covariables and for explaining the causes of GE. The structural equation model (SEM) using endogenous and exogenous variables is a useful alternative for overcoming some of the limitations of the FR and PLS approaches and for developing functional relationships and predictability with explanatory covariables.

Factorial Regression Model
The GE is modeled directly using regression on environmental (and/or genotypic) variables. A useful linear model for incorporating external environmental (or genotypic) variables is the factorial regression (FR) model (Denis 1988; van Eeuwijk et al. 1996). The FR models are ordinary linear models that approximate the GE effects in Eq. 1 by the products of one or more (1) genotypic covariables (observed) × environmental potentialities (estimated), (2) genotypic sensitivities (estimated) × environmental covariables (observed).

For $g = 1, \ldots, G$ genotypic covariables (centered) represented by $x_{i1}, \ldots, x_{iG}$, Eq. 1 becomes $\bar{y}_{ij} = \mu + \tau_i + \delta_j + \sum_{g=1}^{G} x_{ig}\xi_{jg} + \bar{\varepsilon}_{ij}$, $G \le I - 1$, where $\xi_{jg}$ represents an environmental factor (regression coefficient) with respect to the genotypic covariable, $x_{ig}$. Constraints on the parameters are $\sum_i \tau_i = \sum_j \delta_j = \sum_j \xi_{jg} = 0$. In matrix notation the expectation is

$$E(Y) = \mu 1_I 1_J' + \tau 1_J' + 1_I \delta' + X\Xi' \qquad (3)$$

where $Y = [\bar{y}_{ij}]$ is an $I \times J$ matrix; $1_I$ and $1_J$ are $I \times 1$ and $J \times 1$ vectors with all elements equal to 1, respectively; $\tau = [\tau_i]$ is the $I \times 1$ vector of main effects of genotypes; $\delta = [\delta_j]$ is the $J \times 1$ vector of main effects of environments; $X = [x_{ig}]$ is the $I \times G$ matrix of known genotypic covariables; $\Xi = [\xi_{jg}]$ is the $J \times G$ matrix of unknown environmental factors (regression coefficients).

For $h = 1, \ldots, H$ environmental covariables (centered) represented by $z_{j1}, \ldots, z_{jH}$, Eq. 1 is $\bar{y}_{ij} = \mu + \tau_i + \delta_j + \sum_{h=1}^{H} \zeta_{ih} z_{jh} + \bar{\varepsilon}_{ij}$, $H \le J - 1$, where $\zeta_{ih}$ represents a genotypic sensitivity (regression coefficient) with respect to the environmental covariable, $z_{jh}$. Constraints on the parameters are $\sum_i \tau_i = \sum_j \delta_j = \sum_i \zeta_{ih} = 0$. In matrix notation, the expectation is

$$E(Y) = \mu 1_I 1_J' + \tau 1_J' + 1_I \delta' + \zeta Z' \qquad (4)$$

where $Z = [z_{jh}]$ is the $J \times H$ matrix of known environmental covariables; $\zeta = [\zeta_{ih}]$ is the $I \times H$ matrix of unknown differential genotypic sensitivities.

The FR model including genotypic and environmental covariables simultaneously is

$$\bar{y}_{ij} = \mu + \tau_i + \delta_j + \sum_{g=1}^{G} x_{ig}\xi_{jg} + \sum_{h=1}^{H} \zeta_{ih} z_{jh} + \sum_{h=1}^{H}\sum_{g=1}^{G} x_{ig}\nu_{gh} z_{jh} + \bar{\varepsilon}_{ij}$$

where $\nu_{gh}$ is a constant that scales the cross-product of the genotypic covariable, $x_g$, with the environmental covariable, $z_h$, and can be derived from the two previous FR models by imposing the restriction $\xi_{jg} = \nu_{gh} z_{jh}$ or $\zeta_{ih} = x_{ig}\nu_{gh}$; each cross-product represents one degree of freedom in the GE subspace. In matrix notation, the expectation is

$$E(Y) = \mu 1_I 1_J' + \tau 1_J' + 1_I \delta' + X\nu Z' + X\Xi' + \zeta Z'$$

where the constraint $X\Xi' = \zeta Z' = 0$ (where $0$ is an $H \times G$ matrix of zeros) is required. The model should be fitted for all possible combinations of genotypic covariables and environmental covariables.

When environmental (or genotypic) covariables show high collinearity, interpretation of the least squares regression coefficients is complicated because they are estimated very imprecisely. Consequently, a stepwise procedure for choosing the covariables to include could be useful for model construction. Noise on the response variable also complicates interpretation of the FR parameters. Furthermore, least squares estimation of the parameters in the FR models is not unique when the number of covariables is larger than the number of observations, so an alternative estimation method is needed. Partial least squares (PLS) regression, which overcomes some of these problems, can be used as an alternative estimation method.

Partial Least Squares Regression
Multivariate partial least squares (PLS) regression models (Aastveit and Martens 1986; Helland 1988) are a special class of bilinear models. When genotypic responses over environments ($Y$) are modeled using environmental covariables, the $J \times H$ matrix $Z$ of $H$ ($h = 1, 2, \ldots, H$) environmental covariables can be written in bilinear form as

$$Z = t_1 p_1' + t_2 p_2' + \ldots + t_M p_M' + E_M = TP' + E \qquad (5)$$

where the matrix $T$ contains the $J \times 1$ vectors $t_1, \ldots, t_M$ called latent environmental covariables or Z-scores (indexed by environments), the matrix $P$ contains the $H \times 1$ vectors $p_1, \ldots, p_M$ called Z-loadings (indexed by environmental variables), and $E$ has the residuals. Similarly, the response variable matrix $Y$ in bilinear form is

$$Y = t_1 q_1' + t_2 q_2' + \ldots + t_M q_M' + F_M = TQ' + F \qquad (6)$$

where the matrix $T$ is as in Eq. 5 and the matrix $Q$ contains the $I \times 1$ vectors $q_1, \ldots, q_M$ called Y-loadings

(indexed by genotypes) and $F$ has the residuals. The relationship between $Y$ and $Z$ is transmitted through the latent variable $T$. The PLS algorithm performs separate (but simultaneous) principal component analysis of $Z$ and of $Y$ that allows reducing the number of variables in each system to a smaller number of hopefully more interpretable latent variables.

Helland (1988) showed that a reduced number of PLS latent variables gives a low rank representation of the least squares estimates of the FR with environmental covariables because the expectation of $Y'$ is

$$E(Y') = QT' = Q(ZW)' = (QW')Z' = \zeta Z' = \left[\sum_{h=1}^{H} \zeta_{ih} z_{jh}\right] \qquad (7)$$

as in Eq. 4, where $T$, $Q$, and $Z$ are defined as before and the vector $W$ is $H \times 1$ and contains the Z-loadings (or weights) of the environmental covariables; $\zeta$ contains the PLS approximation to the regression coefficients of the responses in $Y$ to the environmental covariables in $Z$. The matrices $T$ (with $J$ coordinates for environments), $Q$ (with $I$ coordinates for genotypes), and $W$ (with $H$ coordinates for environmental covariables) can be represented in the PLS biplot such that projecting the $j$th environment (row) of $T$ on the $i$th genotype (row) of $Q$ [$Y \approx (TQ')'$] approximates the GE; projecting the $h$th environmental covariable (row) of $W$ on the $i$th genotype (row) of $Q$ ($QW' \approx \zeta$) approximates the regression coefficient of the $i$th genotype on the $h$th environmental covariable (Vargas et al. 1999; van Eeuwijk et al. 2000).

When genotypic covariables are used to model environmental responses over genotypes, the latent genotypic covariables are $T = XW$, where vector $W$ is $G \times 1$ and contains the weights of the genotypic covariables. The expectation of $Y$ is

$$E(Y) = TQ' = XWQ' = X\Xi' = \left[\sum_{g=1}^{G} x_{ig}\xi_{jg}\right] \qquad (8)$$

as in Eq. 3 (Vargas et al. 1999; van Eeuwijk et al. 2000), where $\Xi$ contains the PLS approximation to the regression coefficients of the responses in $Y$ to the genotypic covariables in $X$. The matrices $T$ (with $I$ coordinates for genotypes), $Q$ (with $J$ coordinates for environments), and $W$ (with $G$ coordinates for genotypic covariables) can be represented in a PLS biplot such that projection of the $i$th genotype (row) of $T$ onto the $j$th environment (row) of $Q$ ($Y \approx TQ'$) approximates the GE; projection of the $g$th genotypic covariable (row) of $W$ onto the $j$th environment (row) of $Q$ ($WQ' \approx \Xi'$) approximates the regression coefficient of the $j$th environment on the $g$th genotypic covariable.

Structural Equation Model (SEM)
The SEM approach is similar to multiple regression, because it analyzes a system of equations in which each equation describes a causal relationship among variables considered in the system. The SEM is useful because it can integrate and model intermediate traits, such as yield components in cereals and their interrelationships with other variables, as well as with final grain yield. The SEM allows a researcher to test hypotheses on cause-effect relationships between variables in a complex system, where the initial definition of the SEM comprises a path diagram that outlines the various levels of observed (or latent) independent or dependent variables, as well as the directions of causal relationships among variables.

In an agricultural context, the SEM was first proposed by Dhungana (2004) to study GE of grain yield in wheat and its components, and to account for the importance of intermediate traits associated with yield components. The authors explained yield GE with cross-products of genotypic and environmental covariates as exogenous (independent) variables and observed yield component GE as endogenous (dependent/independent) variables. Dhungana (2004) concluded that SEM on observed variables was an effective way of describing yield GE in wheat, given that the interrelationships and role of yield component GE can be incorporated simultaneously in a single model. Diagrams representing the structural models, known as path diagrams, are useful for visualizing complex models and variable relationships.

EXAMPLES OF THE USE OF FACTORIAL REGRESSION, PARTIAL LEAST SQUARES AND STRUCTURAL EQUATION MODELS IN PLANT BREEDING AND AGRONOMY EXPERIMENTS

Treatment × Environment Interaction Analysis in Agronomy
A description of the treatment × environment (T × E) interaction of 24 agronomic treatments (1–24) [tillage, summer crop, manure, and nitrogen (N)] evaluated over 10 yr (1988–1997) was provided by Vargas et al. (2001). Results of the final FR were compared with those of a partial least squares (PLS) analysis to achieve extra insight into both the T × E and the final FR model.

The FR was applied to year × tillage, year × summer crop, year × manure, year × N, year × summer crop × N, and year × manure × N. Results for the stepwise multiple factorial regression model of the interaction between 27 environmental covariables and tillage showed that the evaporation in December (EVD) × tillage sum of squares accounted for 68% of the whole year × tillage interaction. For year × summer crop, evaporation in April (EVA) accounted for 36% of the year × summer crop interaction. For year × manure, the covariables precipitation in December (PRD) and sun hours in February (SHF) contributed 56% of the year × manure sum of squares. The year × nitrogen (N) interaction accounted for the major part of the year × treatment interaction sum of squares.

The PLS biplot shown in Fig. 1 separated the nine highest yielding treatments (T9, T19, T21, T17, T11, T12, T10, T23, and T18) from the nine lowest yielding treatments (T1, T2, T3, T4, T5, T6, T7, T8, and T16).
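The percentages quoted for the factorial regression (the share of an interaction sum of squares attributed to one covariable) can be illustrated with a small calculation. The Python sketch below double-centers a balanced, complete two-way table of cell means to obtain the interaction residuals, regresses each treatment's residuals on one centered environmental covariable, and reports the fraction of the interaction sum of squares captured by those slopes. It is not the analysis of Vargas et al. (2001): the function names `interaction_residuals` and `covariable_share` and every number in the table are hypothetical, chosen so that the arithmetic can be checked by hand.

```python
def interaction_residuals(table):
    """Double-center a two-way table of cell means (rows = treatments, columns = years)."""
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    return [[table[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_cols)]
            for i in range(n_rows)]


def covariable_share(table, z):
    """Share of the interaction SS explained by one environmental covariable.

    Each row's interaction residuals are regressed on the centered covariable;
    the slopes play the role of the sensitivities in the FR model with a single
    environmental covariable. Assumes a balanced, complete table.
    """
    resid = interaction_residuals(table)
    z_mean = sum(z) / len(z)
    zc = [value - z_mean for value in z]          # centered covariable
    szz = sum(value * value for value in zc)
    total_ss = sum(e * e for row in resid for e in row)
    slopes = [sum(e * value for e, value in zip(row, zc)) / szz for row in resid]
    explained_ss = szz * sum(b * b for b in slopes)
    return explained_ss / total_ss, slopes


# Hypothetical data: 3 treatments x 4 years, built from known components.
mu = 5.0
tau = [2.0, 0.0, -2.0]              # treatment main effects
delta = [0.3, -0.1, 0.2, -0.4]      # year main effects
z = [10.5, 11.5, 12.5, 13.5]        # made-up covariable value for each year
zeta = [1.0, 0.0, -1.0]             # made-up sensitivities (sum to zero)
u = [0.5, -1.0, 0.5]                # extra interaction pattern over treatments
v = [1.0, -1.0, -1.0, 1.0]          # extra pattern over years, orthogonal to z

table = [[mu + tau[i] + delta[j] + zeta[i] * z[j] + u[i] * v[j]
          for j in range(4)]
         for i in range(3)]

share, slopes = covariable_share(table, z)
print(f"share of interaction SS explained: {share:.3f}")
print("estimated sensitivities:", [round(b, 3) for b in slopes])
```

With these made-up values the covariable accounts for 10 of the 16 units of interaction sum of squares, so `share` is 0.625 up to rounding, and the slopes recover the sensitivities used to build the table. A stepwise procedure would compare such shares across candidate covariables before adding the next one; with correlated covariables the joint model would have to be refitted rather than summing individual shares.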

[Figure 1: PLS biplot; horizontal axis: Factor 1; vertical axis: Factor 2; both axes run from −1.0 to 1.0.]

Fig. 1. Biplot of the first and second PLS factors representing the Z-scores of 10 yr (1988–1997) and the Y-loadings of the 24 practice treatments (1–24) enriched with the Z-loadings of 27 environmental variables [extracted from Vargas et al. (2001)]. EV = total monthly evaporation; PR = total monthly precipitation; SH = sun hours per day; mT = mean minimum temperature sheltered; MT = mean maximum temperature sheltered; mTU = mean minimum temperature unsheltered; D = December; J = January; F = February; M = March; A = April.

The nine lowest yielding treatments had a positive interaction with year 1995, which had high mTUF (mean minimum temperature unsheltered in February), mTF (mean minimum temperature in February), and MTA (mean maximum temperature in April), but a negative interaction with year 1988 (opposite quadrant). The PLS biplot contains roughly five clusters of correlated environmental covariables. The order of inclusion of these covariables in the FR using the stepwise procedure for each factor effect corresponds to selecting covariables for the different cluster groups depicted in Fig. 1.

In general, SH, EV, and MT are grouped in the right quadrants of the biplot, whereas PR, mT, and mTU are grouped in the left quadrants of the biplot. It is expected that with more sun hours, there will be higher maximum temperatures and more evaporation, and that with more precipitation, there will be fewer sun hours and thus lower temperature. This is clear for the lower right cluster of variables comprising MT, EV, and SH. The group of environmental variables located in the right upper quadrant indicates that minimum temperature in April with maximum temperature and evaporation in December had a similar effect on the T × E for the treatments located in that quadrant. The two groups of variables in the left upper quadrant indicate that minimum temperatures in December, January, and March are related to precipitation in December, January, and March.

The most highly productive treatments are associated with high N levels (100 and 200 kg ha$^{-1}$) and no precipitation. The explanation may be that precipitation is associated with leaching of N (especially if the texture of the soil is coarse). In addition, higher precipitation is also associated with clouds, which reduce radiation. While radiation is the major yield-limiting factor when N and water are non-limiting, high radiation may also be associated with higher temperatures and excessive evaporative demand. Accelerated development rate may be especially prejudicial to yield during spike growth (February) and, to a lesser extent, during grain filling (March–April). Excessive evaporative demand may reduce the ability of the plant to cool itself directly by not permitting sufficient evapotranspiration or indirectly by reducing soil moisture.

Genotype × Environment Interaction for Zinc and Iron Concentration of Wheat Grain
Zinc and iron are important micronutrients for human health; there is widespread deficiency of these micronutrients in many regions of the world, including South Asia. Breeding efforts aimed at enriching wheat grain with zinc and iron are in progress in India and Pakistan, and at CIMMYT. Greater knowledge of GE of these nutrients in the grain is expected to increase our understanding of the magnitude of this interaction and to help identify more stable genotypes for this trait. Elite wheat lines from CIMMYT were evaluated in multi-environment trials in the Eastern Gangetic Plains of India during three years to study GE interactions for agronomic and nutrient traits. Soil and meteorological data from each of the locations were also used (Joshi et al. 2010).

Joshi et al. (2010) showed the results of factorial regression with contributions by significant environmental and soil covariables to explain GE variability (Table 1). For iron concentration in the grain, four covariables (maximum temperature before flowering, rainfall after flowering, zinc in the 30–60 cm soil depth, and relative humidity after flowering) were significant, accounting for 59.46% of GE variation. For grain Zn concentration, five covariables explained 82.41% of GE variation. These covariables were, in descending order of their contribution, minimum temperature after flowering, Zn in the 30–60 cm soil depth, rainfall after flowering, minimum temperature before flowering, and Zn in the 0–30 cm soil depth. Zn content in the 30–60 cm soil depth was also a significant determinant for grain Fe concentration. These results suggest that the GE was substantial for grain Fe and Zn.

For Fe and Zn concentrations in wheat grain, genotypic responses varied widely across environments, as indicated by vectors that radiated in all directions in the PLS biplot depicting genotypic variables,

environments, and genotypes (Figs. 2 and 3 for Fe and Zn concentration in wheat grain). Variations in the pattern of response within a location are evident from the fact that most locations in 2005 and 2006 were not correlated with respect to GE and were often placed opposite each other. The biplot clearly differentiated the 2005 and 2006 sown environments at all locations, showing the complexity of evaluating grain Fe and Zn concentrations.

Table 1. Proportion of GE variation accounted for by factorial regression analysis for each significant covariable for grain iron and zinc concentration

Variable(z)                       Sum of squares   % variation explained
Fe concentration in the grain
  TMXBF                                  1045.64                   22.21
  RAF                                     644.96                   13.69
  Zn_60                                   644.89                   13.69
  RHAF                                    463.93                    9.85
  % contribution of 4 variables                                    59.46
Zn concentration in the grain
  TMNAF                                  1369.06                   29.11
  Zn_60                                   751.67                   15.98
  RAF                                     653.90                   13.90
  TMNBF                                   604.58                   12.85
  Zn_30                                   495.65                   10.54
  % contribution of 5 variables                                    82.41

(z) TMXBF = maximum temperature before flowering; TMNBF = minimum temperature before flowering; TMNAF = minimum temperature after flowering; RAF = rainfall after flowering; Zn_30 = zinc concentration in 0–30 cm soil depth; Zn_60 = zinc concentration in 30–60 cm soil depth; RHAF = relative humidity after flowering.

Maximum temperature before flowering (TMXBF) appeared to play an important role for both Fe and Zn. This was perhaps because temperature before flowering contributes to proper development of the embryo, where a large portion of micronutrients resides. PLS analysis of genotypic variables and GE for grain Fe concentration placed variables such as Fe_30 and Zn_30 in the right uppermost quadrant, while variables Fe_60 and Zn_60 were in the left uppermost quadrant (Fig. 2).

PLS analysis of environmental variables and GE for grain Zn concentration placed variables such as minimum temperature before and after flowering (TMNBF, TMNAF) in the right uppermost quadrant (Fig. 3), indicating their similar role. On the other hand, maximum temperatures before and after flowering (TMXBF, TMXAF) were in opposite quadrants, indicating their opposite role in grain Zn concentration. Variables Zn_30 and Zn_60 were also in opposite quadrants, indicating that their contribution to grain Zn

[Fig. 2 biplot. Horizontal axis: Factor 1 (18.52%); vertical axis: Factor 2 (16.17%).]

Fig. 2. Partial least squares (PLS) biplot of the number of locations and years with environmental and soil covariables on the performance of iron grain concentration in 20 wheat lines in 10 environments of the eastern Gangetic plains of south Asia. The more significant variables in FR are given in italics. Bhu_5 = Bhurkura in 2005; Ghu_5 = Ghurahoopur in 2005; Bhd_5 = Bhadawal in 2005; Bha_5 = Bhagwanpur in 2005; Mau_5 = Mauparasi in 2005; BHU_5 = Banaras Hindu University in 2005; Bhu_6 = Bhurkura in 2006; Mau_6 = Mauparasi in 2006; BHU_6 = Banaras Hindu University in 2006; Pid_6 = Pidkhir in 2006. TMXBF = maximum temperature before flowering; TMXAF = maximum temperature after flowering; TMNBF = minimum temperature before flowering; TMNAF = minimum temperature after flowering; RHBF = relative humidity before flowering; RHAF = relative humidity after flowering; RBF = rainfall before flowering; RAF = rainfall after flowering; Zn_30 = zinc concentration in 0-30 cm soil depth; Zn_60 = zinc concentration in 30-60 cm soil depth; Fe_30 = iron concentration in 0-30 cm soil depth; Fe_60 = iron concentration in 30-60 cm soil depth.

[Fig. 3 biplot. Horizontal axis: Factor 1 (21.53%); vertical axis: Factor 2 (16.30%).]

Fig. 3. Partial least squares (PLS) biplot of the number of locations and years with environmental and soil covariables on the performance of zinc concentration in the grain of 20 wheat lines in 10 environments of the eastern Gangetic plains of south Asia. The more significant variables in FR are given in italics. Bhu_5 = Bhurkura in 2005; Ghur_5 = Ghurahoopur in 2005; Bhd_5 = Bhadawal in 2005; Bha_5 = Bhagwanpur in 2005; Mau_5 = Mauparasi in 2005; BHU_5 = Banaras Hindu University in 2005; Bhu_6 = Bhurkura in 2006; Mau_6 = Mauparasi in 2006; BHU_6 = Banaras Hindu University in 2006; Pidk_6 = Pidkhir in 2006. TMXBF = maximum temperature before flowering; TMXAF = maximum temperature after flowering; TMNBF = minimum temperature before flowering; TMNAF = minimum temperature after flowering; RHBF = relative humidity before flowering; RHAF = relative humidity after flowering; RBF = rainfall before flowering; RAF = rainfall after flowering; Zn_30 = zinc concentration in 0-30 cm soil depth; Zn_60 = zinc concentration in 30-60 cm soil depth; Fe_30 = iron concentration in 0-30 cm soil depth; Fe_60 = iron concentration in 30-60 cm soil depth.
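The directional reading of the PLS biplots used in this example (covariables lying in the same direction are interpreted as acting similarly on GE, and covariables in opposite quadrants as acting in opposite ways) can be illustrated with the cosine of the angle between two biplot vectors. The short Python sketch below is only an illustration: the coordinates and the names `covariable_1` to `covariable_3` are hypothetical and are not read from Fig. 2 or Fig. 3. Because the first two PLS factors account for only part of the GE, such cosines should be taken as approximate indications, not exact correlations.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two biplot vectors drawn from the origin."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(*u) * math.hypot(*v))

# Hypothetical (Factor 1, Factor 2) coordinates, chosen only for illustration;
# they are NOT taken from the published biplots.
points = {
    "covariable_1": (0.60, 0.80),
    "covariable_2": (0.45, 0.95),    # same quadrant as covariable_1
    "covariable_3": (-0.70, -0.15),  # opposite quadrant
}

same = cosine(points["covariable_1"], points["covariable_2"])
opposite = cosine(points["covariable_1"], points["covariable_3"])

# A cosine near +1 suggests a similar association with GE,
# a negative cosine suggests an opposing one.
print(f"covariable_1 vs covariable_2: {same:.2f}")
print(f"covariable_1 vs covariable_3: {opposite:.2f}")
```

The same function could be applied to a genotype marker and a covariable vector to gauge whether that genotype tends to have positive GE where the covariable is high.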

concentration is quite different. Variable Zn_60 appeared to play a greater role in grain Zn concentration. Genotypes 2, 5, and 6 were on the same quadrant as Zn_30, indicating they can have greater Zn in the grain when the micronutrient is higher in top soil. On the other hand, genotypes 1, 11, 13, 14, 15, and 16 were in the same quadrant as Zn_60, indicating their capacity to take up Zn from deeper soil.

Zinc × Environment Interaction for Zinc and Iron Concentration in Wheat Grain
Current genotypes have relatively low zinc concentration in the grain. A low-cost agronomic intervention using Zn fertilizers could be a complementary approach to enrich wheat grain. A multi-environment trial including 12 zinc application treatments was organized in late-sown environments in four sites of the Eastern Gangetic Plains of India using the most widely sown wheat cultivar in that region. Agronomic performance and Zn and Fe concentrations in the grain were recorded. Factorial regression (FR) and partial least squares (PLS) analysis was performed on the Zn treatment × environment (TE).

For Zn, 12 treatments of ZnSO4 were used in each location: (1) control, (2) soil, 25 kg ha⁻¹; (3) soil, 50 kg ha⁻¹; (4) foliar, 1.0 kg ha⁻¹ at two stages (flag leaf and anthesis); (5) foliar, 2.0 kg ha⁻¹ at two stages; (6) foliar, 4.0 kg ha⁻¹ at two stages; (7) soil (25 kg ha⁻¹) + foliar (2.0 kg ha⁻¹); (8) soil (50 kg ha⁻¹) + foliar (2.0 kg ha⁻¹); (9) soil (0 kg) + foliar @ 0.2%; (10) soil (25 kg) + foliar @ 0.2%; (11) soil (50 kg) + foliar @ 0.2%, and (12) local farmers' practice for Zn, i.e., ZnSO4 @ 5 kg ha⁻¹.

Table 2. Percentage of variation accounted for by each significant covariable for grain yield and grain iron and zinc concentration

Variable                          SS variable   % variation explained
Fe concentration in the grain
  Zn_30(z)                            1660.29                   60.85
  RHBF(z)                              861.96                   31.59
  % contribution of 2 variables                                 92.44
Zn concentration in the grain
  Fe_30(z)                            2025.53                   73.16
  Zn_30(z)                             511.36                   18.47
  % contribution of 2 variables                                 91.63

(z) RHBF = relative humidity before flowering; Zn_30 = zinc concentration in 0-30 cm soil depth; Fe_30 = iron concentration in 0-30 cm soil depth.

Factorial regression analysis showed that for grain Zn, two covariables were significant and contributed around 91% of the variation (Table 2). They were

Fe and Zn in the soil (0-30 cm depth). For Fe concentration in the grain, two covariables (Zn in 0-30 cm soil depth and relative humidity before flowering) were significant and accounted for around 92% of the variation.

For Zn concentration in wheat grain, location responses varied widely, as indicated by vectors that radiated in all directions in the PLS biplot depicting treatment variables, environments, and treatments (Fig. 4). However, for Fe, some locations, such as Banaras Hindu University (BHU) and Ghurahoopur (Ghur), had similar genotypic responses (Fig. 5). The TE interaction was significant for Zn concentration in the grain. Zinc concentrations increased significantly when the micronutrient was applied as a foliar spray. Soil application alone was found not to show an enhanced effect. The highest Zn concentration was recorded when soil and foliar combinations were applied together. Results suggest that foliar applications can be utilized to increase Zn concentration in wheat grain. The combination of soil application and foliar spray can be used to improve grain Zn concentration and increase grain yield, which could lead to better human nutrition in South Asia.

Causes of Genotype × Environment Interaction and its Effects on Grain Yield, Biomass, Yield Components, and Other Traits in Wheat Trials Using a Structural Equation Model
Vargas et al. (2007) showed how the structural equation model (SEM) methodology may be applied to observed yield GE, yield component GE, and other intermediate traits using environmental covariates, for studying the causes of GE and its effects on grain yield, biomass, yield components, and other interrelated traits acting at different development stages in wheat trials.

The given structural equation model explained 0.96 of total variability of yield GE (Table 3). The variables that contributed most to explaining yield GE were GEs of yield components grains per square meter (GM2GE), 1000-kernel weight (TKWGE), grains per spike (GSPGE), and spikes per square meter (SM2GE), with total effects of 1.09, 0.64, 0.56, and 0.54, respectively. The GEs of GM2GE, TKWGE, GSPGE, and SPMGE explained 0.90, 0.43, 0.44, and 0.42, respectively, of total variability

[Fig. 4 biplot. Horizontal axis: Factor 1 (63.75%); vertical axis: Factor 2 (21.40%).]

Fig. 4. Partial least squares (PLS) biplot of the number of locations and treatments with environmental and soil covariables on the performance of grain zinc concentration in wheat cultivar HUW 234 in four locations in the eastern Gangetic plains of south Asia. Pidk = Pidkhir; Ghur = Ghurahoopur; Bhad = Bhadawal; BHU = Banaras Hindu University. The 12 treatments are: (1) control, (2) soil, 25 kg ha⁻¹; (3) soil, 50 kg ha⁻¹; (4) foliar, 1.0 kg ha⁻¹ at two stages (flag leaf and anthesis); (5) foliar, 2.0 kg ha⁻¹ at two stages; (6) foliar, 4.0 kg ha⁻¹ at two stages; (7) soil (25 kg ha⁻¹) + foliar (2.0 kg ha⁻¹); (8) soil (50 kg ha⁻¹ ZnSO4) + foliar (2.0 kg ha⁻¹); (9) soil (0 kg) + foliar @ 0.2%; (10) soil (25 kg) + foliar @ 0.2%; (11) soil (50 kg) + foliar @ 0.2%; (12) local farmers' practice for Zn, i.e., 5 kg ZnSO4 ha⁻¹. Environmental and soil variables are: TMXBF = maximum temperature before flowering; TMXAF = maximum temperature after flowering; TMNBF = minimum temperature before flowering; TMNAF = minimum temperature after flowering; RHBF = relative humidity before flowering; RHAF = relative humidity after flowering; RBF = rainfall before flowering; RAF = rainfall after flowering; Zn_30 = zinc concentration in 0-30 cm soil depth; Zn_60 = zinc concentration in 30-60 cm soil depth; Fe_30 = iron concentration in 0-30 cm soil depth; Fe_60 = iron concentration in 30-60 cm soil depth.

[Fig. 5 biplot. Horizontal axis: Factor 1 (59.11%); vertical axis: Factor 2 (33.17%).]

Fig. 5. Partial least squares (PLS) biplot of the number of locations and treatments with environmental and soil covariables on the performance of grain iron concentration in wheat cultivar HUW 234 in four locations in the Eastern Gangetic Plains of South Asia. Pidk = Pidkhir; Ghur = Ghurahoopur; Bhad = Bhadawal; BHU = Banaras Hindu University. The treatments are: (1) control, (2) soil, 25 kg ha⁻¹; (3) soil, 50 kg ha⁻¹; (4) foliar, 1.0 kg ha⁻¹ at two stages (flag leaf and anthesis); (5) foliar, 2.0 kg ha⁻¹ at two stages; (6) foliar, 4.0 kg ha⁻¹ at two stages; (7) soil (25 kg ha⁻¹) + foliar (2.0 kg ha⁻¹); (8) soil (50 kg ZnSO4 ha⁻¹) + foliar (2.0 kg ha⁻¹); (9) soil (0 kg) + foliar @ 0.2%; (10) soil (25 kg) + foliar @ 0.2%; (11) soil (50 kg) + foliar @ 0.2%; (12) local farmers' practice for Zn, i.e., 5 kg ZnSO4 ha⁻¹. The environmental and soil covariables are: TMXBF = maximum temperature before flowering; TMXAF = maximum temperature after flowering; TMNBF = minimum temperature before flowering; TMNAF = minimum temperature after flowering; RHBF = relative humidity before flowering; RHAF = relative humidity after flowering; RBF = rainfall before flowering; RAF = rainfall after flowering; Zn_30 = zinc concentration in 0-30 cm soil depth; Zn_60 = zinc concentration in 30-60 cm soil depth; Fe_30 = iron concentration in 0-30 cm soil depth; Fe_60 = iron concentration in 30-60 cm soil depth.

(Table 3). Yield component SM2GE had a very small R² value (0.04), but a significant indirect effect on grain yield GE (0.54). The model indicated that GEs of yield components GM2GE and TKWGE had the largest positive direct association with yield GE (1.09 and 0.64, respectively) and no indirect effects (0.0), while GSPGE

Table 3. Direct, indirect, and total effects of yield component GEs and adjusted cross-product covariates on grain yield GE (R² = 0.96) [extracted from Vargas et al. (2007)]

Variable                                           Direct effect   Indirect effect   Total effect     R²
Grains per square meter (GM2GE)                             1.09              0.00           1.09   0.90
1000-kernel weight (TKWGE)                                  0.64              0.00           0.64   0.43
Grains per spike (GSPGE)                                   -0.05              0.61           0.56   0.44
Spikes per square meter (SM2GE)                             0.00              0.54           0.54   0.04
Spike mass (SPMGE)                                          0.00              0.05           0.05   0.42
Relative duration of spike growth (RSGGE)                   0.00              0.09           0.09
Crop growth rate during spike growth (dBMbGE)               0.00              0.07           0.07
Biomass at anthesis (BMAGE)                                 0.00              0.03           0.03
Biomass at the vegetative stage (BMVGE)                     0.00              0.11           0.11
MXT4 × GM2GE (z)                                            0.00              0.39           0.39
MXT4 × GSPGE                                                0.00              0.23           0.23
RAD2 × SM2GE                                                0.00              0.09           0.09
MNT4 × TKWGE                                                0.00              0.59           0.59
RAD2 × TKWGE                                                0.00              0.40           0.40
MXT3 × BMAGE                                                0.00              0.10           0.10
MNT1 × BMAGE                                                0.00              0.01           0.01

(z) MXT = mean daily maximum temperature; MNT = mean daily minimum temperature; RAD = solar radiation. The suffixes 1, 2, 3, and 4 denote the first, second, third, and fourth growth developmental stages, respectively.
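In path analysis, the total effect of a variable is the sum of its direct and indirect effects, which is how the columns of Table 3 relate to one another. As a minimal sketch, the Python lines below check this for the four yield components discussed in the text, using the values in Table 3. One assumption is made and flagged in the code: the direct effect of GSPGE is entered as -0.05, following the text's description of it as a low negative direct effect.

```python
# (direct, indirect, reported total) effects on yield GE, transcribed from Table 3.
# Assumption: the GSPGE direct effect is entered as -0.05 because the text
# describes it as a "low negative direct effect".
effects = {
    "GM2GE": (1.09, 0.00, 1.09),
    "TKWGE": (0.64, 0.00, 0.64),
    "GSPGE": (-0.05, 0.61, 0.56),
    "SM2GE": (0.00, 0.54, 0.54),
}

def total_effect(direct, indirect):
    """Total effect as the sum of direct and indirect effects (2 decimals)."""
    return round(direct + indirect, 2)

for name, (direct, indirect, reported) in effects.items():
    computed = total_effect(direct, indirect)
    status = "ok" if computed == reported else "mismatch"
    print(f"{name}: {direct:+.2f} + {indirect:+.2f} = {computed:+.2f} ({status})")
```

Read this way, GM2GE and TKWGE act on yield GE only directly, whereas GSPGE and SM2GE act almost entirely through other variables.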

and SM2GE GEs had the greatest indirect effects on yield GE (0.61 and 0.54, respectively) and a low negative direct effect (GSPGE -0.05) or no direct effect at all (SM2GE 0.0) on yield GE (Table 3).

The SEM using GE effects is a powerful method that gives a more complete overview of the external and internal variables acting and interacting in the GE of various traits than do the PLS or FR methods. In the current analysis, climatic variables were related mostly to final main yield components GM2GE, GSPGE, and TKWGE. Only more intermediate endogenous spike mass (SPMGE) and SM2GE were affected by minimum temperature in the first developmental stage (MNT1), along with solar radiation in the second developmental stage (RAD2). SEM analysis of GE of variables showed that weather conditions during the spike primordia and early grain-filling stages influence GE of other traits; this result is consistent with PLS results obtained by Reynolds et al. (2004). Furthermore, the result of SEM analysis of GE effects shows the influence of biomass at anthesis (BMAGE) on SPMGE and of MNT1 on SPM.

Interpreting Genotype × Environment Interaction of Fruit Yield in a Tomato Multi-Environment Trial
This example describes results from a multi-environment trial comprising 15 tomato genotypes [seven hybrids (H) and eight open-pollinated (OP)] evaluated in 18 locations of Latin America and the Caribbean; the results were analyzed using FR and PLS (Ortiz et al. 2007). Several environmental covariables were included in the FR and PLS analyses for studying and interpreting GE. Environmental factors, such as days to harvest, soil pH, mean temperature, potassium available in soil, and phosphorus fertilizer, accounted for a sizeable portion of GE for marketable fruit yield, whereas trimming, irrigation, soil organic matter, and nitrogen and phosphorus fertilizers were important environmental covariables for explaining GE of average fruit weight.

The factorial regression model with a stepwise regression procedure for variable selection was used to determine the most informative subset of environmental covariables affecting marketable fruit yield. The subset of independent environmental covariables that explained 62% of total GE included days to harvest (DHA × gen), soil pH (PH × gen), mean temperature (MET × gen), potassium (K × gen), extra phosphorus (EX_P × gen), and minimum temperature (MNT × gen) (Table 4). Days to harvest (DHA) and soil pH (PH) jointly explained 34% of total GE variability with only 28 degrees of freedom (from a total of 238 degrees of freedom). The ability to use nitrogen fertilizer (EX_N) explained a small portion of GE variability. Severe N stress can reduce tomato fruit yield by 60 to 70% (Scholberg et al. 2000). The remaining environmental covariables were significant, but did not explain much of the GE variability for marketable fruit yield.

The first two PLS factors with all 15 tomato genotypes evaluated in 18 environments along with 16

[Fig. 6 path diagram. Nodes: MXT4*GM2GE, MXT4*GSPGE, RAD2*SM2GE, MNT1*BMAGE, MNT4*TKWGE, RAD2*TKWGE, MXT3*BMAGE, RSGGE, dBMdGE, SM2GE, GM2GE, HIAGE, GSPGE, BMVGE, TKWGE, and YLDGE.]

Fig. 6. Path estimates of the structural equation model for endogenous variables associated with grains per square meter GE (GM2GE), grains per spike GE (GSPGE), 1000-kernel weight GE (TKWGE), spikes per square meter GE (SM2GE), spike mass GE (SPMGE), relative duration of spike growth GE (RSGGE), crop growth rate during spike growth GE (dBMbGE), biomass at anthesis GE (BMAGE), biomass at the vegetative stage GE (BMVGE), and yield GE (YLDGE), and cross-products (variables × environmental covariates) [extracted from Vargas et al. (2007)]. MXT = mean daily maximum temperature; MNT = mean daily minimum temperature; RAD = solar radiation. Suffixes 1, 2, 3, and 4 stand for the first, second, third, and fourth crop development stages. Arrows represent the direction of the variables' influence, and the numbers on the arrow lines represent the estimated standardized coefficients.

Table 4. Analysis of variance for the stepwise multiple factorial regression model with environmental covariables for marketable fruit yield. The terms in the factorial regression model appear in the order of inclusion [extracted from Ortiz et al. (2007)]

Source (z)       df   Sum of squares   Mean squares   Prob > F   % of GE explained
Environment      17           701980          41293     <0.001
Genotype         14            31669           2262     <0.001
GE              238           160674            675     <0.001
  DHA × Gen      14            36363           2597     <0.001               22.63
  PH × Gen       14            17989           1285     <0.001               11.19
  MET × Gen      14            14854           1061     <0.001                9.24
  MNT × Gen      14             7627            545     <0.001                4.74
  OM × Gen       14             6054            432     <0.001                3.76
  MXT × Gen      14             5713            408     <0.001                3.56
  IRR × Gen      14             6796            485     <0.001                4.23
  PRC × Gen      14             5802            414     <0.001                3.61
  TRM × Gen      14             4100            293     <0.001                2.55
  DRI × Gen      14             5092            364     <0.001                3.16
  EX_N × Gen     14             5459            390     <0.001                3.39
  EX_P × Gen     14             8108            579     <0.001                5.05
  P × Gen        14             6306            450     <0.001                3.92
  EX_K × Gen     14             4114            294     <0.001                2.56
  DAY × Gen      14             7157            511     <0.001                4.45
  K × Gen        14            15013           1072     <0.001                9.34
  Residual       14             4123            294
Total           269           894324           3324

(z) GE = genotype × environment; MXT = maximum temperature (°C); MNT = minimum temperature (°C); MET = mean temperature (°C); PRC = rainfall (mm); DAY = degree day; PH = soil pH; OM = organic matter; P = phosphorus; K = potassium; EX_N = extra nitrogen; EX_P = extra phosphorus; EXT_K = extra potassium; TRM = trimming; DRI = drivings; IRR = irrigation; DHA = days to harvest.
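The last column of Table 4 can be read as each term's sum of squares expressed as a percentage of the GE sum of squares, and the mean squares as sums of squares divided by their degrees of freedom. The Python sketch below recomputes these quantities for the six covariables named in the text, using values transcribed from Table 4. It is a minimal illustration of the arithmetic, not the stepwise selection procedure itself, and recomputed percentages may differ from the printed column in the last digit because the printed sums of squares are rounded.

```python
# Sums of squares transcribed from Table 4 (Ortiz et al. 2007).
SS_GE = 160674   # GE sum of squares (238 df)
DF_TERM = 14     # df of each covariable x genotype term
ss_terms = {     # the six covariables named in the text
    "DHA": 36363, "PH": 17989, "MET": 14854,
    "K": 15013, "EX_P": 8108, "MNT": 7627,
}

def pct_of_ge(ss):
    """Share of the GE sum of squares accounted for by a term, in percent."""
    return 100.0 * ss / SS_GE

for name, ss in ss_terms.items():
    print(f"{name} x Gen: MS = {ss / DF_TERM:7.1f}, % of GE = {pct_of_ge(ss):5.2f}")

subset_pct = pct_of_ge(sum(ss_terms.values()))
dha_ph_pct = pct_of_ge(ss_terms["DHA"] + ss_terms["PH"])
print(f"Six covariables together: {subset_pct:.1f}% of GE")
print(f"DHA + PH: {dha_ph_pct:.1f}% of GE on {2 * DF_TERM} of 238 df")
```

The two summary lines correspond to the roughly 62% and 34% of GE quoted in the text for the six-covariable subset and for DHA and PH jointly.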

environmental covariables are depicted in Fig. 7.

The environmental covariables that most explained GE in the FR analyses (DHA, PH, MET, MNT, EX_P, and K) tend to be located farther away from the center of the PLS biplot, indicating that these covariables caused large GE for marketable fruit yield, as had already been detected by FR analysis.

The PLS biplot for marketable fruit yield shows general patterns in GE with respect to environments, genotypes, and environmental covariables. Environments located to the right of the PLS biplot (Es_Gu, Co_Sa, SA_Sa, Co_Ho, VS_Ni, SA_CR, SC_DR, Pa_Co, Be_Br, and Ce_TT) had relatively higher values for environmental covariables located in the same direction, MET, MNT, and DAY, whereas sites located on the opposite side of the biplot (BV_Gu, Co_DR, LM_Pe, Sa_Ch, Ch_Ch, Cu_Ch, Co_Ch, and Ca_Pa) tended to have high PH and DHA values. Concerning genotypes, the first PLS axis clearly separates the hybrid tomato genotypes (5, 6, 7, 8, 9, 11, and 15) (located towards the left) from the open-pollinated tomato genotypes (on the right) (1, 2, 3, 4, 10, 12, and 13), whereas the second PLS axis separates open-pollinated tomato genotypes 1, 2, and 4 from genotypes 3, 10, 12, and 13. These results indicate that, in terms of GE, open-pollinated genotypes 1, 2, and 4 performed better in environments Es_Gu, Co_Sa, SA_Sa, SC_DR, and Be_Br, whereas open-pollinated genotypes 4, 10, 12, and 13 performed better in environments Co_Ho, VS_Ni, SA_CR, Pa_Co, and Ce_TT (i.e., they tended to have positive GE in those sites) and thus are favored by the relatively higher values of DAY, MNT, and MET prevailing in those environments.

The PLS biplots show more specific GE between genotypes and environments. The covariables MNT and MET are in the same direction as environments Es_Gu (Estanzuela), Co_Sa (Cogutepeque), SA_Sa (San Andrés), VS_Ni (Valle del Sábaco), SC_DR (San Cristóbal), Pa_Co (Palmira), and Be_Br (Belém), indicating that these locations had relatively high minimum temperature [MNT] and mean temperature [MET]; these conditions favored the marketable fruit yield of AVRDC OP heat-tolerant lines CL 5915-223 (12) and CL 5915-93 (13), which are located in the same direction. The reproductive processes in tomato are sensitive to high temperatures (Abdul-Baki 1991), and the number of pollen grains in heat-tolerant genotypes was higher than that of heat-sensitive genotypes. It appears that proline accumulates in tomato leaf tissue at high temperatures, which leads to its depletion in the reproductive tissue, thus seriously reducing either pollen formation or its viability (Kuo et al. 1986).

The amount of potassium (K) in the soil of environments Cogutepeque (Co_Sa) and Belém (Be_Br) was relatively high, which favored positive GE interaction of OP cultivar Triuque (4) in both locations. The soil in Comayagua (Co_Ho), San Antonio de Belén (SA_CR), and Centeno (Ce_TT) had relatively high organic matter (OM) content; these environments are in the same direction in the biplot (Fig. 7), which favored the positive GE of OP cultivars Licapal 21 (3) and Angela Gigante (10) in these locations. Since the first two PLS

[Fig. 7 biplot. Horizontal axis: Factor 1 (18.52%); vertical axis: Factor 2 (16.17%).]

Fig. 7. Plot of the first two partial least squares regression factors (factor 1 and factor 2) for marketable fruit yield of tomato for 15 cultivars tested across 18 environments in Latin America and the Caribbean (extracted from Ortiz et al. 2007). MXT = maximum temperature (°C); MNT = minimum temperature (°C); MET = mean temperature (°C); PRC = rainfall (mm); DAY = degree day (base 10); PH = soil pH; OM = organic matter (%); P = phosphorus (P2O5 ppm); K = potassium (K2O meq 100 g⁻¹); EX_N = extra nitrogen (kg ha⁻¹); EX_P = extra phosphorus (kg ha⁻¹); EXT_K = extra potassium (kg ha⁻¹); TRM = trimming; DRI = drivings; IRR = irrigation; DHA = days to harvest. Estanzuela, Guatemala (Es_Gu), Baja Verapaz, Guatemala (BV_Gu), Cogutepeque, El Salvador (Co_Sa), San Andrés, El Salvador (SA_Sa), Comayagua, Honduras (Co_Ho), Valle de Sabaco, Nicaragua (VS_Ni), San Antonio de Belén, Costa Rica (SA_CR), San Cristóbal, Dominican Republic (SC_DR), Constanza, Dominican Republic (Co_DR), Palmira, Colombia (Pa_Co), La Molina, Perú (LM_Pe), Santiago, Chile (Sa_Ch), Chillán, Chile (Ch_Ch), Curacaví, Chile (Cu_Ch), Colina, Chile (Co_Ch), Belém, Brazil (Be_Br), Caacupé, Paraguay (Ca_Pa), Centeno, Trinidad and Tobago (Ce_TT).

factors do not explain all the GE for marketable fruit yield, some distortion occurred, e.g., environment Co_DR (Constanza) has relatively high OM content, but is not in the same direction as OM in the PLS biplot.

ACKNOWLEDGEMENTS
The authors acknowledge the time and effort contributed by two anonymous reviewers, which substantially improved the quality of the manuscript. The authors are grateful to national programs that carried the trials and originated the data analyzed and presented in this manuscript.

Abdul-Baki, A. A. 1991. Tolerance of tomato cultivars and selected germplasm to heat stress. J. Am. Soc. Hortic. Sci. 116: 1113-1116.
Aastveit, H. and Martens, H. 1986. Anova interactions interpreted by partial least squares regression. Biometrics 42: 829-844.
Burgueño, J., Crossa, J., Cornelius, P. L. and Yang, R. C. 2008. Using factor analytic models for joining environments and genotypes without crossover genotype × environment interaction. Crop Sci. 48: 1291-1305.
Cornelius, P. L., Crossa, J. and Seyedsadr, M. 1996. Statistical tests and estimators for multiplicative models for cultivar trials. Pages 199-234 in M. S. Kang and H. G. Gauch, Jr., eds. Genotype-by-environment interaction. CRC Press, Boca Raton, FL.
Cornelius, P. L. and Seyedsadr, M. 1997. Estimation of general linear-bilinear models for two-way tables. J. Stat. Comput. Sim. 58: 287-322.
Crossa, J., Burgueño, J., Cornelius, P. L., McLaren, G., Trethowan, R. and Krishnamachari, A. 2006. Modeling genotype × environment interaction using additive genetic covariances of relatives for predicting breeding values of wheat genotypes. Crop Sci. 46: 1722-1733.
Denis, J.-B. 1988. Two-way analysis using covariates. Statistics 19: 123-132.
Dhungana, P. 2004. Structural equation modeling of genotype × environment interaction. Ph.D. Dissertation, University of Nebraska, Lincoln, NE.
Eberhart, S. A. and Russell, W. A. 1966. Stability parameter for comparing varieties. Crop Sci. 6: 36-40.
Finlay, K. W. and Wilkinson, G. N. 1963. The analysis of adaptation in a plant breeding programme. Austr. J. Agric. Res. 14: 742-754.

Gabriel, K. R. 1978. Least squares approximation of matrices by additive and multiplicative models. J. R. Stat. Soc. Ser. B. 40: 186-196.
Gauch, H. G., Jr. 1988. Model selection and validation for yield trials with interaction. Biometrics 44: 705-715.
Gauch, H. G., Piepho, H.-P. and Annicchiarico, P. 2008. Statistical analysis of yield trials by AMMI and GGE: Further considerations. Crop Sci. 48: 866-889.
Gollob, H. F. 1968. A statistical model which combines features of factor analytic and analysis of variance. Psychometrika 33: 73-115.
Helland, I. S. 1988. On the structure of partial least squares. Commun. Stat. Part B Sim. Comput. 17: 581-607.
Henderson, C. R. 1984. Applications of linear models in animal breeding. University of Guelph. Guelph, ON.
Joshi, A. K., Crossa, J., Arun, B., Chand, R., Trethowan, R., Vargas, M. and Ortiz-Monasterio, I. 2010. Genotype × environment interaction for zinc and iron concentration of wheat grain in the eastern Gangetic plains of India. Field Crop Res. 116: 268-277.
Kuo, C. G., Chen, H. M. and Ma, L. H. 1986. Effect of high temperature on proline content in tomato floral buds and leaves. J. Am. Soc. Hortic. Sci. 111: 746-750.
Mandel, J. 1969. The partitioning of interaction in analysis of variance. J. Res. Natl Bur. Stand. Ser. B 73: 309-328.
Mandel, J. 1971. A new analysis of variance model for non-additive data. Technometrics 13: 1-18.
Ortiz, R., Crossa, J., Vargas, M. and Izquierdo, J. 2007. Studying the effect of environmental variables on the genotype × environment interaction of tomato. Euphytica 153: 119-134.
Piepho, H. P. 1998. Methods for comparing the yield stability of cropping systems - a review. J. Agron. Crop Sci. 180: 193-213.
Piepho, H. P. and Mohring, J. 2005. Best linear unbiased prediction of cultivar effects for subdivided target regions. Crop Sci. 45: 1151-1159.
Reynolds, M. P., Trethowan, R., Crossa, J., Vargas, M. and Sayre, K. D. 2004. Physiological factors associated with genotype by environment interaction in wheat. Field Crop Res. 85: 253-274.
Scholberg, J., McNeal, B. L., Boote, K. J., Jones, J. W., Locascio, S. J. and Olson, S. M. 2000. Nitrogen stress effects on growth and nitrogen accumulation by field-grown tomato. Agron. J. 92: 159-167.
Smith, A., Cullis, B. R. and Thompson, R. 2002. Exploring variety-environment data using random effects AMMI models with adjustment for spatial field trends: Part 1: Theory. In M. S. Kang, ed. Quantitative genetics, genomics and plant breeding. CABI Publishing, Oxford, UK.
Smith, A. B., Cullis, B. R. and Thompson, R. 2005. The analysis of crop cultivar breeding and evaluation trials: An overview of current mixed model approaches. J. Agric. Sci. 143: 1-14.
Vargas, M., Crossa, J., Reynolds, M. P., Dhungana, P. and Eskridge, K. M. 2007. Structural equation modelling for studying genotype per environment interactions of physiological traits affecting yield in wheat. J. Agric. Sci. 145: 151-161.
Vargas, M., Crossa, J., van Eeuwijk, F. A., Ramirez, M. E. and Sayre, K. 1999. Using partial least squares, factorial regression and AMMI models for interpreting genotype × environment interaction. Crop Sci. 39: 955-967.
Vargas, M., Crossa, J., van Eeuwijk, F. A., Sayre, K. and Reynolds, M. P. 2001. Interpreting treatment × environment interaction in agronomy trials. Agron. J. 93: 949-960.
van Eeuwijk, F. A., Denis, J.-B. and Kang, M. S. 1996. Incorporating additional information on genotypes and environments in models for two-way genotype by environment tables. In S. Kang and H. G. Gauch, eds. Genotype-by-environment interaction, CRC Press, Boca Raton, FL.
van Eeuwijk, F. A., Crossa, J., Vargas, M. and Ribaut, J. M. 2000. Variants of factorial regression for analysing QTL by environment interaction. Proceedings of the 11th Meeting of the EUCARPIA Section Biometrics in Plant Breeding. In A. Gallais, C. Dillmann, and I. Goldringer, eds. Quantitative genetics and breeding methods: the way ahead. Paris, France.
Williams, E. J. 1952. The interpretation of interactions in factorial experiments. Biometrika 39: 65-81.
Yan, W., Kang, M. S., Ma, B., Woods, S. and Cornelius, P. L. 2007. GGE biplot vs. AMMI analysis of genotype-by-environment data. Crop Sci. 47: 643-655.
Yang, R. C., Crossa, J., Cornelius, P. L. and Burgueño, J. 2009. Biplot analysis of genotype × environment interaction: Proceed with caution. Crop Sci. 49: 1564-1576.
Yates, F. and Cochran, W. G. 1938. The analysis of groups of experiments. J. Agric. Sci. 28: 556-580.
