
Molecular Genetics and Genomics (2023) 298:813–821

https://doi.org/10.1007/s00438-023-02026-0

REVIEW

Understanding the genomic selection for crop improvement: current progress and future prospects
Rabiya Parveen1 · Mankesh Kumar1 · Swapnil2 · Digvijay Singh3 · Monika Shahani4 · Zafar Imam1 ·
Jyoti Prakash Sahoo5

Received: 29 January 2023 / Accepted: 27 April 2023 / Published online: 10 May 2023
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023

Abstract
Although increased use of modern breeding techniques and technology has resulted in long-term genetic gain, the pace
of genetic gain must be sped up to satisfy global agricultural demand. Marker-assisted selection has proven its
potential for improving qualitative traits with large effects regulated by one to a few genes, but its contribution to
the improvement of quantitative traits regulated by many small-effect genes is modest. In this context, genomic
selection (GS) has been regarded as the most promising method for genetically enhancing complex traits that are
regulated by several genes, each of which has a minor effect. By examining a population's phenotypes and high-density
marker scores, genomic selection can forecast the breeding potential of individual lines. Because GS uses all marker
data in the prediction model, it avoids biased marker effect estimates and captures more of the variation caused by
small-effect QTL. It can speed up the breeding cycle, so that superior genotypes are selected rapidly.
Developing the best GS models while taking into account non-additive effects, genotype-by-environment interaction, and
cost-effectiveness will enable the widespread implementation of GS in plants. These steps will also improve heritability
estimation and prediction accuracy. This review focuses on the shift from conventional selection methods to GS, the
underlying statistical tools and methodologies, the state of GS research in agricultural plants, and prospects for its
effective use in the creation of climate-resilient crops.

Keywords Genomic selection · Plant breeding · Crop improvement · Genomics

Communicated by Bing Yang.

* Jyoti Prakash Sahoo
  jyotiprakash.sahoo@cgu-odisha.ac.in; jyotiprakashsahoo2010@gmail.com

1 Department of Genetics and Plant Breeding, Bihar Agricultural University, Sabour, Bhagalpur 813210, India
2 Department of Genetics and Plant Breeding, Centurion University of Technology and Management,
  Paralakhemundi 761211, India
3 Department of Genetics and Plant Breeding, Narayan Institute of Agricultural Sciences, Gopal Narayan Singh
  University, Sasaram 821305, India
4 Department of Genetics and Plant Breeding, Maharana Pratap University of Agriculture and Technology,
  Udaipur 313001, India
5 Department of Agriculture and Allied Sciences, C.V. Raman Global University, Bhubaneswar 752054, India

Introduction

The degree of genetic variability present in the germplasm is a key feature in plant breeding's ability to
successfully manipulate the genotype's genetic architecture in an artistic and scientific manner. The result of all
these breeding techniques is a cultivar that is better and more widely accepted. The methods used earlier for the
selection of plants, and to realize genetic gain for the desired traits, were based solely on phenotypes or visual
selection that might include a few major yield-related attributes. Breeding techniques based on phenotypic selection
(PS) could not effectively address attributes that are regulated by a large number of minor-effect, non-allelic QTLs
and that also show strong genotype × environment (G × E) interactions (Varshney et al. 2012; Moose et al. 2008).
Numerous varieties of various crops have been created following the introduction of DNA-based molecular markers and
their use in MAS (marker-assisted
selection), where the desired genes of different crop plants are tagged with these markers (Collard et al. 2008).
Using gene pyramiding, MAS (Servin et al. 2004) and MARS (marker-assisted recurrent selection) (Crossa et al. 2010)
have been employed to correct the shortcomings of widely adapted cultivars and to introduce novel gene(s) into desired
parents. However, characters of a quantitative or complex nature, governed by a large number of small-effect
quantitative trait loci (QTLs), are not considerably improved by pyramiding a small number of genes or alleles
(Kearsey et al. 1998). Furthermore, the use of low-density marker systems limits the efficiency of these approaches in
genetically improving polygenic traits. In this context, traditional breeding has been greatly strengthened by precise
indirect selection based on molecular or genomic methods that have come into frequent use during the past few years. A
technique called genomic (or genome-wide) selection (GS) has the potential to address the drawbacks of MAS for
quantitative traits (Varshney et al. 2012). Instead of pinpointing a specific QTL, the goal of GS is to evaluate an
individual's overall genetic potential. The core benefit of GS is that it may concurrently improve numerous phenotypes
while also capturing a number of small-effect genetic factors (Xu et al. 2020a, b). The ability to choose parents
solely on the basis of GEBV (genomic estimated breeding value) is also discussed in this review, along with a
comparison of implementation alternatives for GS in breeding programmes and of predictions made within as well as
across breeding cycles. GS was initially developed in livestock breeding as a technique to forecast an individual's
breeding value from markers that cover the whole genome, and it was first evaluated using simulated data
(Meuwissen et al. 2001).

Drawbacks of conventional marker assisted selection

In the indirect selection method known as marker-assisted selection (MAS), individuals are chosen based on known
markers associated with a trait of interest (Fernando et al. 1989). MAS was developed and used to select characters
governed by genes with reasonably large effects, using molecular markers associated with the target characters. Unlike
conventional phenotype-based selection in plant breeding, this method has been found effective for the selection of
individuals, but it is not the best strategy for complex agronomic traits, because it typically relies on predictions
from a small number of markers in linkage disequilibrium (LD) with QTLs that have a large impact on the trait, and so
overlooks the contributions of small- to medium-effect QTL (Bernardo 2008; Heffner et al. 2010). MAS has been widely
used in backcross breeding programs, mainly for the introgression of major-effect QTLs and desired genes, as well as
in marker-assisted recurrent selection (MARS). Backcross breeding is a conventional method in which the recurrent
(recipient) parent is improved only to the degree that the introgressed QTLs/genes allow. It does not produce novel
gene combinations that could be expected to improve the genotype's adaptability and performance potential. The MARS
approach has been criticized as ineffective because it also relies on markers that show a strong correlation with the
trait(s). By merging trait phenotypic data and marker genotype data into a pooled selection index, MARS aims to account
for minor-effect QTLs. A new selection method termed genomic selection (GS), which helps with selection for such
traits by estimating an individual's net genetic merit from the effects of dense markers spread throughout the genome,
has been developed to overcome the drawbacks of both MAS and MARS (Meuwissen et al. 2001).

In essence, GS is a variation of MAS with added benefits and a wider reach. GS, as opposed to MAS, estimates the
breeding value of an individual solely from a large number of markers spread throughout the genome. To determine the
genomic estimated breeding values (GEBVs) of each individual in the breeding population (BP), or target population, GS
builds a prediction model from the phenotypic and genotypic data of the training population (TP) (Meuwissen et al.
2001). Since the marker profiles of these individuals are comparable to those of TP plants that have been shown to
perform well in a particular environment, the GEBVs make it possible to determine which individuals will perform
better and are suitable either for advancement in the breeding program or for use as parents in hybridization.
According to the GS principle, all population-wide genetic variants are captured using genome-wide markers, and each
QTL determining a particular trait is in LD with at least one marker. The precision of GS therefore depends on the LD
between marker alleles and QTLs: the larger the LD between the two, the more precise the genomic predictions (Solberg
et al. 2013). The breeding cycle is accelerated via GS by increasing selection intensity as well as precision (Crossa
et al. 2017) and by facilitating the quick selection of high-performing genotypes. This accurate selection is
necessary for more rapid genetic advancement when breeding for complex traits.

Population used in genomic selection

The process of GS involves two distinct but inter-related populations. The first is the training population (TP), or
reference population, and the other is the breeding population (BP), or testing population (Heffner et al. 2009). The
TP is utilized to train the GS model and to predict the associated
marker effects required to determine the GEBVs of particular genotypes in the BP. The BP, on the other hand, is the
one that undergoes GS to achieve the required progress and to select superior lines for use as parents or as new
varieties or improved hybrids. The estimated breeding values of each genotyped line in the BP are calculated from the
TP data, which are used to assess the effect of each assayed marker, so that the lines can be ranked without
phenotyping. Additionally, the selected individuals can be used as parental lines that are intermated to bring
together advantageous alleles for the next selection cycle (Jonas et al. 2013; Desta et al. 2014; Heffner et al.
2010). To estimate the BVs of the lines being tested, the basic GS process uses only marker data and the statistical
models built in a TP (Meuwissen et al. 2001).

A TP is a population for which complete genotypic and phenotypic data are available and from which the GS model
parameters are obtained. The BVs of the lines in the BP, called genomic estimated breeding values (GEBVs) because they
rest on genotypic data, are then estimated using the GS models. The breeding population must be represented by a set
of lines that are closely related and whose ancestry is well known, such as half-sibs or closely related populations.
A diagram of the basic scheme of the genomic selection process is shown in Fig. 1. A generalized process for genomic
selection can be described as follows: (i) a TP appropriate for the BP concerned is created; (ii) the TP individuals
are genotyped for a large set of markers uniformly distributed throughout the whole genome; (iii) the same individuals
are subjected to extensive phenotypic evaluation in replicated trials over locations and preferably over years; (iv)
GS model parameters are computed from the marker genotypic and phenotypic data, which is called model training; (v)
the BP is genotyped with the same set of markers used to estimate the model parameters, with no phenotypic
evaluation; (vi) GEBVs of the BP lines are estimated from their marker genotypes and the marker effects derived from
the TP; and (vii) the superior genotypes or lines are chosen from the BP based on their GEBV estimates.

The breeder's equation and genetic gain

The following well-known breeder's equation predicts genetic gain in a plant breeding programme:

$$\Delta G = \frac{i\,h\,\sigma_a}{L}$$

where $i$ = selection intensity, $h$ = square root of the narrow-sense heritability, $\sigma_a$ = square root of the
additive genetic variance (the additive genetic standard deviation), and $L$ = length of the breeding cycle interval
or generation.

To achieve higher genetic gains in breeding programs, strategies are required that allow quick increases in the
genetic diversity of the breeding population, the selection intensity, and/or the heritability of characters
(Krishnappa et al. 2021). These approaches also need to allow for a reduction in the number of breeding cycles (Xu et
al. 2016). The breeding cycle interval can be shortened to enhance the genetic gain per unit of time, chiefly by
raising the number of breeding cycles per unit of time and by decreasing the expense of phenotyping (Crossa et al.
2010), particularly for plant species with long breeding cycles (Wong et al. 2008; Sahoo et al. 2018a). GS quickens
the breeding cycle and makes it possible to identify superior genotypes quickly (Crossa et al. 2017; Heffner et al.
2010; Sahoo et al. 2018b).

Design of training population

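Because every design choice for the training population acts through the model that is fitted on the TP and then
applied to the BP, a small numerical sketch of that train-then-predict step (the scheme of Fig. 1, steps (iv) to (vii)
of the generalized process) may be helpful here. It is only an illustration under stated assumptions: the marker
scores and phenotypes are simulated, the ridge penalty `lam` is an arbitrary value rather than an estimated variance
ratio, and the function names `fit_marker_effects` and `predict_gebv` are hypothetical, not taken from any GS package.

```python
import numpy as np


def fit_marker_effects(Z_tp, y_tp, lam):
    """Estimate ridge-shrunken marker effects from TP genotypes and phenotypes."""
    mu = y_tp.mean()                      # overall mean of the TP phenotypes
    col_means = Z_tp.mean(axis=0)         # per-marker means used for centring
    Zc = Z_tp - col_means
    p = Zc.shape[1]
    # Closed-form ridge solution: (Z'Z + lam * I)^-1 Z'(y - mu)
    beta = np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ (y_tp - mu))
    return mu, col_means, beta


def predict_gebv(Z_bp, mu, col_means, beta):
    """GEBVs of BP lines from marker scores only (no BP phenotypes used)."""
    return mu + (Z_bp - col_means) @ beta


# Simulated data (assumption): markers coded 0/1/2, many small additive effects.
rng = np.random.default_rng(0)
n_tp, n_bp, p = 400, 50, 300
Z = rng.integers(0, 3, size=(n_tp + n_bp, p)).astype(float)
true_effects = rng.normal(0.0, 0.1, size=p)
g = Z @ true_effects                                  # true genetic values
y = g + rng.normal(0.0, g.std(), size=n_tp + n_bp)    # heritability near 0.5

Z_tp, y_tp = Z[:n_tp], y[:n_tp]     # TP: genotyped and phenotyped
Z_bp = Z[n_tp:]                     # BP: genotyped only

mu, col_means, beta = fit_marker_effects(Z_tp, y_tp, lam=200.0)
gebv = predict_gebv(Z_bp, mu, col_means, beta)
top5 = np.argsort(gebv)[::-1][:5]   # indices of the five highest-GEBV BP lines
accuracy = np.corrcoef(gebv, g[n_tp:])[0, 1]
print("selected BP lines:", top5, "accuracy vs true values:", round(accuracy, 2))
```

Ranking `gebv` in descending order corresponds to step (vii). In a real program the penalty would normally be
estimated from the data rather than fixed, and accuracy could not be checked against true genetic values as it is in
this simulation.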
Fig. 1 Basic scheme of Genomic Selection process (diagram labels: Population; Training Population, with Phenotype +
Genotype; Breeding Population, with Genotype; GS Model; GEBVs)

By fostering high prediction accuracy or enhancing BP diversity in precise and effective breeding initiatives, TP
design plays a significant role in GS (Isidro et al. 2015; Zhang et al. 2017). The BP composition is the primary
factor to be taken into account when developing a TP, so the BP must be established before developing a TP design
that maximizes candidate prediction accuracy while minimizing phenotyping and genotyping expenses. For high GEBV
accuracy, the training population should contain members of the BP (Rutkoski et al. 2011). A TP could be made up of
historical data or could be an actual population of existing individuals (Singh et al. 2015; Sahoo et al. 2019).
Ideally, a fresh TP should be created for each BP. Since the breeding population and training population would then be
closely related, this strategy will give high accuracy in GEBV prediction: QTL effects, the genetic background, allele
frequencies, etc., will be similar in the two populations. However, this will call for a precise phenotypic
assessment of a distinct TP for each BP in the desired set of environmental conditions. This will slow progress and
add to the cost of the running program.

As an alternative, the entire breeding effort might use a single training population, made up of samples of genotypes
or lines taken from all the BPs being considered in a particular breeding program. GS models trained on such a
population would allow precise estimation of the GEBVs of individual genotypes from each BP represented in the TP.
Numerous simulation experiments show that GS models trained on these populations are quite accurate in predicting the
GEBVs of the populations in consideration, especially when very high marker densities are employed. This strategy
would shorten selection cycles and cut their cost (Singh et al. 2015; Sahoo et al. 2020a). The breeding population's
allelic frequencies and LD structure change as a result of selection; hence the TP should be revised to include
genotypes or lines chosen from the BP, and the GS model should be retrained with every revision of the training
population. To reach the highest level of prediction accuracy, the GS model should be trained over the course of more
than one generation.

Cross validation

Prediction accuracy is crucial for the effective use of GS in routine plant breeding operations, so the trained model
must be cross-validated to ensure that its accuracy is high. The most accurate prediction model is developed in the TP
via cross-validation, and it can then be used to estimate the GEBVs of the BP (Perez-Cabal et al. 2012; Sahoo et al.
2020b). K-fold cross-validation is the most frequently used scheme: the sample is divided into K nearly equal parts,
and each part is predicted using parameters estimated from the other K-1 parts (the sample without the predicted
part). Eventually, all parts are predicted using samples that exclude the part being predicted. In practice, a
prediction model is first built from a large portion of the training population, and the GEBVs of the remaining
members of the TP are then predicted from genotypic information alone. This enables scientists to "test" and improve
the prediction model until its accuracy is good enough for later predictions to be relied upon (Robertson et al.
2019).

Statistical model for genomic selection

Although numerous GS models have been designed to forecast the performance of crops at the genotypic level, a proper
statistical model must be chosen to achieve high predictive ability and, in turn, effective GS. The prediction
accuracy of the various GS models is determined by their assumptions about, and treatment of, marker effects (Liu et
al. 2018; Wang et al. 2018; Samal et al. 2021). The most commonly used models in genomic selection are illustrated in
Fig. 2.

Fig. 2 Most commonly used Models in Genomic Selection (diagram: whole genome regression models for genomic selection,
grouped as Stepwise Regression; BLUP (Best Linear Unbiased Predictor), with G-BLUP (Genomic BLUP) and RR-BLUP (Ridge
Regression BLUP); Bayesian Approach, with Bayes A, Bayes B, Bayes Cπ and Bayes Dπ; Penalized Approach, with RR-BLUP
and LASSO)

Stepwise regression (SR) methods are used in traditional MAS; they treat marker effects as fixed and fit markers
separately or in small groups, which gets around the lack of degrees of freedom. Only those markers that show a
significant effect on the character are retained after this process, and the rest are discarded. The effects of the
significant markers are estimated, while markers with non-significant effects are given "zero" effect values, which is
essential for maintaining model estimability (Lande et al. 1990; Priyadarshini et al. 2020). However, the effects
retained in the model can be substantially overestimated (Beavis 1998; Hayes 2007; Sahoo et al. 2021a), and when only
significant marker effects are estimated, only a portion of the genetic variance is captured (Goddard et al. 2007;
Hayes 2007; Sahoo et al. 2021b). This is especially true when many effects are assessed.

As suggested by Meuwissen et al. (2001), SR in GS simulation showed low GEBV accuracy as a consequence of limited QTL
detection. The two models most frequently used for prediction in GS research in crop plants are RR-BLUP, i.e., ridge
regression best linear unbiased prediction,
and G-BLUP, i.e., genomic best linear unbiased prediction. The RR-BLUP model presupposes that every marker has a minor
effect and that all marker effects share the same variance; the markers can still have different effects despite the
assumption of equal variance (Bernardo et al. 2007). When the character is regulated by numerous loci with minor
effects, RR-BLUP gives good prediction accuracy (Burgueno et al. 2012). Another popular model is G-BLUP, which
estimates additive genetic values from a genomic relationship matrix and is equivalent to RR-BLUP (Wang et al. 2015;
Sahoo et al. 2022a). RR-BLUP and G-BLUP are best suited to quantitative traits regulated by a large number of
small-effect genes, since they share the presumption that the effects of all loci have a common variance (Sahoo et al.
2022b). In practice, most markers have minimal or no effect and only a small number have large effects, so the
presumptions of RR-BLUP and G-BLUP are seldom met.

Both models assume that every marker contributes an equal amount of variance, which is not true for all traits. It is
therefore necessary to estimate the variance of the markers according to the genetic architecture of the trait (Sahoo
et al. 2022c). Several Bayesian models that assume a prior distribution of marker effects have been put forward for
this purpose; inferences about the model's parameters are drawn from the posterior distributions of marker effects.
For genomic prediction, Bayesian models are available in a number of forms, including Bayes A, Bayes B, Bayes Cπ, and
Bayes Dπ (Meuwissen et al. 2001; Habier et al. 2011), as well as derivatives such as the Bayesian LASSO (least
absolute shrinkage and selection operator). Because their shrinkage level is lower than that of Bayes B, Bayes A and
Bayes Cπ are most appropriate for characters controlled by several genes. Most markers are not considered in the model
because the Bayes B model presumes that the majority of loci have no influence on the character. Bayes B fits well if
large-effect QTLs that account for a considerable portion of the genetic variance control the trait (Munkvold et al.
2009). In contrast, the parameter π in Bayes Cπ can be estimated from experimental data, allowing the shrinkage level
to be estimated. As a result, it is better suited than Bayes B for the analysis of real data.

Since Bayesian models capture large-effect QTLs, they will typically give more accurate predictions. These Bayesian
models are sensitive to the number of QTLs; as the number of QTLs rises, so does the predictive ability (Wang et al.
2015). RR-BLUP and G-BLUP, on the other hand, are more suitable for plant traits controlled by a greater number of
minor genes, because their predictive ability frequently remains almost constant regardless of the number of QTLs.
When the number of markers (p) exceeds the number of observations (n), i.e., genotypes/lines, the model is
over-parameterized (the large "p" and small "n" problem, p >> n), which is the major challenge for linear models that
use abundant markers across the genome. This challenge is successfully addressed by the penalized regression technique
known as ridge regression (RR) (Meuwissen et al. 2001). The least absolute shrinkage and selection operator (LASSO), a
form of penalized regression similar to RR, penalizes the estimates so as to obtain a sparse solution. Ridge
regression shrinks all coefficients toward zero without setting any of them exactly to zero, whereas LASSO can set to
zero the coefficients of markers that are not related to the phenotype. Hence, if the phenotype is governed by
numerous markers with minute effects, ridge regression will capture such effects (Heffner et al. 2009), while LASSO
will retain a few markers with major effects (van Eeuwijk et al. 2018).

Markers whose coefficients are set to zero or to a low value during the training phase are omitted from the model, and
their genotypic information is not needed in the breeding phase. A non-parametric technique for genomic prediction is
reproducing kernel Hilbert space (RKHS) regression. Compared with parametric techniques, non-parametric models can
capture non-additive genetic effects more easily in genomic prediction (Gionala et al. 2006). By combining an additive
genetic model with a kernel function (Gionala et al. 2006) and transforming predictor variables into a set of
distances between observations, RKHS produces a definite matrix for use in a linear model. The factors affecting GEBV
accuracy are illustrated in Fig. 3.

Fig. 3 Factors affecting GEBVs Accuracy (diagram labels: Population Size; Marker Density; Gene effects and GEI;
Heritability; Linkage Disequilibrium (LD))

Population size

Whether using traditional MAS or GS, population size, particularly that of the TP, has a considerable impact on
prediction precision. A decline in accuracy is expected if the TP is small, since the model would inaccurately predict
the marker effects and subsequently the prediction
accuracy. Prediction accuracy in GS was noticeably greater when the TP and BP had a closer genetic relationship
(Calus et al. 2007). Additionally, as the effective size of the breeding population grows, so too should the size of
the TP (Nakaya and Isobe 2012).

Marker density

Meuwissen found that prediction accuracy rises with marker density. The ideal number of markers is such that at least
one marker is in strong LD with as many as possible of the QTLs affecting the trait. The extent of LD in the species
under consideration therefore determines the marker density; generally, cross-pollinated species need a substantially
higher marker density than self-pollinated species. Traits with high heritability require a less dense marker set than
those with low heritability (Sahoo et al. 2022d).

Gene effects and GEI

Genotype × environment interactions (GEI) pose a significant challenge to the practical use of GS in agricultural
crops, because most economically important traits are complex and are significantly influenced by environmental
variables through cross-over interactions (Boer et al. 2007; Cooper 1999). The equation $\Delta G = i h \sigma_a / L$
shows how GEI influences genetic gain by dragging heritability downward. As a result, prediction accuracy for traits
across environments would be affected, especially in multi-environment trials (MET). In addition to GEI, non-additive
genetic effects, which may include dominance (intra-locus) or epistasis (inter-locus), might also be problematic for
the use of GS in plant breeding. When non-additive effects are present, breeding populations can have different allele
substitution effects at the relevant QTL. TPs and BPs should therefore both be examined to check the consistency of
QTL allele substitution effects across populations.

Heritability

Trait heritability can also affect prediction accuracy, particularly when it is low (h² = 0.4) (Hayes et al. 2009). In
general, target traits with high heritability are predicted accurately, and vice versa. The majority of agricultural
traits, notably in plants, have low to moderate heritability, which makes genomic selection research difficult. To
achieve the same prediction accuracy as for characters with moderate to high heritability, traits of low heritability
would need a bigger training population. However, if data are available on several traits, the problem of low
heritability can be addressed. With the right multi-trait genomic selection (MTGS) model, a character with low
heritability that is strongly associated with characters of high heritability can draw information from those traits,
so the GEBV can be predicted more precisely and accurately in such cases (Sahoo et al. 2022d).

Linkage disequilibrium (LD)

LD is the non-random association of alleles at different loci. Target marker densities for GS can be calculated from
LD estimates, because at equilibrium recombination counteracts drift by causing LD to decay; adjacent loci should
therefore have greater LD values than distant loci. Before executing GS, it is important to understand fully how LD
will affect its feasibility. An average r² between neighboring markers of 0.15 has been found sufficient for
high-heritability traits, although an r² of 0.2 improves the accuracy of GEBV predictions for low-heritability traits
(Calus et al. 2007; Sahoo et al. 2023). Some applications of GS in plant breeding are listed in Table 1. GS, as a
prominent and promising strategy, will be applied increasingly widely in plant breeding, as in livestock, with the
evolution of key GS components and associated platforms. Breeding programs are often designed to have fewer
replications in the early generations, viz., F2 and subsequent generations, and more replications in the advanced
generations, along with larger plot sizes and multi-location testing (Bernardo 2010; Sahoo et al. 2023). By removing
one or two selfing cycles, genomic selection in the early generations substantially shortens the breeding cycle
(Hickey et al. 2014). The length of the reproductive cycle is shortened by re-introducing a few individuals with high
GEBVs as parents. To balance phenotypic and genomic selection, one or up to two cycles of GS followed by one cycle of
PS are advised, with regard to the costs and genetic gains of the breeding program (Rutkoski et al. 2015; Sahoo et al.
2023). Applications of high-throughput phenotyping (HTP) are actively being pursued in wheat genomics, but their
integration into GS studies remains in its infancy (Sweeny et al. 2019). The development of HTP for routine use in
crop breeding programs is lagging behind genotyping technologies, so more effort is needed to develop cost-effective
and high-performance platforms (Shakoor et al 2017). The recent concept of 'envirotyping' could benefit from advances
in HTP to capture and account for sources of variation in agronomic traits that are related to quantifiable
environmental variables (Cooper 2014). Envirotyping involves collecting and using information on environmental factors
such as soil, geographic and climatic conditions from multi-location empirical evaluations for phenotypic prediction
(Chenu et al 2011; Xu
2016). Parameters associated with envirotyping could also be included in the linear mixed models for genomic
prediction to enhance heritability and thereby the prediction accuracies for different traits (van Eeuwijk 2018). Some
contributions of GS to particular traits are elaborated in Table 2.

Table 1 Applications of GS in plant breeding

Crop | Training population size | Breeding population size | Number of markers | GEBV accuracy^a | GS model^b | References
Maize | 95 | 119 | 1339 | 0.40–0.50 | BLUP | Lorenzana and Bernardo (2009)
 | 28, 35, 70 | 349 | 160 | 0.59–0.72 | BLUP |
Arabidopsis thaliana | 50–133 | 415 | 69 | 0.90–0.93 | BLUP |
Barley | 54, 96, 120 | 150 | 223 | 0.64–0.83 | BLUP |
Maize | 208 | 208 | 136 | 1.00 | Several^c | Piepho (2009)
Wheat | 60 | 599 | 1279 | 0.48–0.61 | PM-RKHS^c | Crossa et al. (2010)
Maize | 270 | 300 | 1148 | 0.42–0.79 | LASSO |
Wheat | 24, 48, 96 | 209 | 399 | 0.32–0.84 | RR-BLUP | Heffner et al. (2010)
Wheat | 24, 48, 96 | 174 | 574 | 0.41–0.73 | RR-BLUP |
Maize | 25–157 for each population | 25 population of 126–196 | 1,106 | 0.26–0.57 | RR-BLUP | Guo et al. (2012)

Conclusion

To meet the rising global demand for food, genetic gain must accelerate, and modern breeding techniques are necessary
to achieve this. One such tried-and-true technology in plant breeding is genomic selection (GS). In general, for
characters controlled by many minor genes with cumulative effects, GS may be a promising method to increase genetic
gain for a given time period and cost. The creation of affordable genotyping platforms and high-efficiency breeding
techniques will aid the extension of GS-assisted breeding from livestock to plants, and from narrow applications in a
few crops for specific traits to broad applications in all main crop plants for all relevant traits. Methodological
advancements (such as the incorporation of G × E interaction, imputation of missing genotypic values, haplotypes,
knowledge of epigenetic regulation, and multiple-trait information into prediction models) will surely help GS to
succeed in plant as well as animal breeding programs. It is highly desirable to update the training dataset for GS
consistently by integrating new markers in every generation. The performance of the prediction models depends strongly
on the evaluation of the TP, which should therefore be done under controlled, well-managed conditions. To deliver
successful outcomes, a structured GS program is required, encompassing trait phenotyping, human resource development,
and advanced data recording technologies.

Acknowledgements Not applicable.

Table 2  Contributions of GS to some traits of interest

| GS for | Contribution |
|---|---|
| Quality traits and yield | Complex polygenic characters governed by many genes with small effects and affected by interactions between genes and environment. Prediction accuracies have been improved by including G × E effects in models. |
| Disease resistance | Aids in breeding for quantitative disease resistance, which is controlled by a large number of genes with small effects and is difficult for pathogens to overcome. The most common targets are Fusarium head blight, wheat rust, and rice blast resistance. |
| Germplasm enhancement | With GS, it is possible to achieve high genome-enabled prediction accuracy, which could help breeding programs incorporate beneficial genetic variants. This favors utilizing GS to incorporate primitive cultivars into superior germplasm and to produce gene pools and populations ideal for genetic enhancement. |
| Hybrid breeding | Improves the performance of crosses based on the performance of genotyped parents. Used to aid in hybrid selection and to predict hybrid performance. |

Author contributions Rabiya: drafting of the manuscript, final referencing and editing; Swapnil: collection of supporting papers and wrote a part of the manuscript; DZ: wrote a part of the manuscript and helped in editing the manuscript; Mankesh: editing the manuscript; Monika: collection of papers; JP: coordinated the process. All authors read and approved the final version of the manuscript.

Funding Not applicable.

Data availability All data generated or analyzed during this study are included in this article.

Declarations

Conflicts of interest The authors declare no conflict of interest.

References

Beavis WD (1998) QTL analyses: Power, precision, and accuracy. In: Patterson AH (ed) Molecular dissection of complex traits. CRC Press, Boca Raton, FL, pp 145–162
Bernardo R (2008) Molecular markers and selection for complex traits in plants: learning from the last 20 years. Crop Science 48(5):1649–1664
Bernardo R (2010) Breeding for Quantitative traits in plants, 2nd edn., Stemma Press, Woodbury, Minnesota, ISBN 978-0-9720724-1-0
Bernardo R, Yu J (2007) Prospects for genome-wide selection for quantitative traits in maize. Crop Sci. 47:1082–1090
Boer MP, Wright D, Feng L, Podlich DW, Luo L, Cooper M, van Eeuwijk FA (2007) A mixed-model quantitative trait loci (QTL) analysis for multiple-environment trial data using environmental co-variables for QTL-by-environment interactions, with an example in maize. Genetics. 177(3):1801–13
Burgueño J, de los Campos G, Weigel K, Crossa J (2012) Genomic prediction of breeding values when modeling genotype × environment interaction using pedigree and dense molecular markers. Crop Sci. 52:707–719
Calus M, Veerkamp R (2007) Accuracy of breeding values when using and ignoring the polygenic effect in genomic breeding value estimation with a marker density of one SNP per cM. J Anim Breed Genet 124:362–368
Chenu K, Cooper M, Hammer GL, Mathews KL, Dreccer MF, Chapman SC (2011) Environment characterization as an aid to wheat improvement. Interpreting genotype-environment interactions by modelling water-deficit patterns in NorthEastern Australia. J Exp Bot 62(6):1743–175
Collard BC, Mackill DJ (2008) Marker-assisted selection: an approach for precision plant breeding in the twenty-first century. Philosophical Trans Royal Soc B 363:557–572
Cooper M (1999) Concepts and strategies for plant adaptation research in rainfed lowland rice. Field Crops Res 64(1–2):13–34
Cooper M, Messina CD, Podlich D, Totir LR, Baumgarten A, Hausmann NJ et al (2014) Predicting the future of plant breeding. Complementing empirical evaluation with genetic prediction. Crop Pasture Sci. 65(4):311
Crossa J, Campos Gde L, Pérez P, Gianola D, Burgueño J, Araus JL, Makumbi D, Singh RP, Dreisigacker S, Yan J, Arief V, Banziger M, Braun HJ (2010) Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics. 186(2):713–724
Crossa J, Perez-Rodriguez P, Cuevas J, Montesinos-López O, Jarquin D, De Los Campos G, Burgueno J, Gonzalez-Camacho JM, Perez-Elizalde S, Beyene Y et al (2017) Genomic Selection in Plant Breeding: Methods, Models, and Perspectives. Trends Plant Sci 22:961–975
Desta ZA, Ortiz R (2014) Genomic selection: genome wide prediction in plant improvement. Trends Plant Sci. 19:592–601
Fernando R, Grossman M (1989) Marker assisted selection using best linear unbiased prediction. Genet Select Evolut 21(421):467–477
Gianola D, Fernando RL, Stella A (2006) Genomic-assisted prediction of genetic value with semi-parametric procedures. Genetics. 173(3):1761–76
Goddard ME, Hayes BJ (2007) Genomic selection. J Anim Breed Genet 124:323–330
Guo Z, Tucker DM, Lu J et al (2012) Evaluation of genome-wide selection efficiency in maize nested association mapping populations. Theoret Appl Genet 124:261–275
Habier D, Fernando RL, Kizilkaya K, Garrick DJ (2011) Extension of the bayesian alphabet for genomic selection. BMC Bioinform 12:186–197
Hayes B (2007) QTL mapping, MAS, and genomic selection. Animal Breeding & Genetics, Department of Animal Science, Iowa State Univ, Ames
Hayes BJ, Bowman PJ, Chamberlain AJ, Goddard ME (2009) Invited review: genomic selection in dairy cattle: progress and challenges. J Dairy Sci 92:433–443
Heffner EL, Sorrells ME, Jannink JL (2009) Genomic selection for crop improvement. Crop Sci 49:1–12
Heffner EL, Lorenz AJ, Jannink JL, Sorrells ME (2010) Plant breeding with genomic selection: gain per unit time and cost. Crop Sci. 50:1681–1690
Hickey JM, Dreisigacker S, Crossa J, Hearne S, Babu R, Prasanna BM, Grondona M, Zambelli A, Windhausen VS, Mathews K, Gorjanc G (2014) Evaluation of genomic selection training population designs and genotyping strategies in plant breeding programs using simulation. Crop Sci 54(4):1476–1488
Isidro J, Jannink JL, Akdemir D et al (2015) Training set optimization under population structure in genomic selection. Theor Appl Genet 128:145–158
Jonas E, de Koning DJ (2013) Does Genomic Selection have a future in plant breeding? Trends Biotechnol. 31:497–504
Kearsey MJ, Farquhar AG (1998) QTL analysis in plants; where are we now? Heredity 80(Pt 2):137–42
Krishnappa G, Savadi S, Tyagi BS, Singh SK, Mamrutha HM, Kumar S, Mishra CN, Khan H, Gangadhara K, Uday G, Singh G (2021) Integrated genomic selection for rapid improvement of crops. Genomics 113(3):1070–1086
Lande R, Thompson R (1990) Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics 124(3):743–756
Liu X, Wang H, Wang H, Guo Z, Xu X, Liu J, Wang S, Li WX, Zou C, Prasanna BM et al (2018) Factors affecting genomic selection revealed by empirical evidence in Maize. Crop J. 6:341–352
Lorenzana RE, Bernardo R (2009) Accuracy of genotypic value predictions for marker-based selection in biparental plant populations. Theor Appl Genet 120:151–161
Meuwissen TH, Hayes BJ, Goddard ME (2001) Prediction of total genetic value using genome-wide dense marker maps. Genetics. 157(4):1819–29
Moose SP, Mumm RH (2008) Molecular plant breeding as the foundation for 21st century crop improvement. Plant Physiol. 147(3):969–77
Munkvold JD, Tanaka J, Benscher D, Sorrells ME (2009) Mapping quantitative trait loci for pre-harvest sprouting resistance in white wheat. Theor. Appl. Genet. 119:1223–1235
Nakaya A, Isobe SN (2012) Will genomic selection be a practical method for plant breeding? Ann Bot. https://doi.org/10.1093/aob/mcs109

Pérez-Cabal MA, Vazquez AI, Gianola D, Rosa GJ, Weigel KA (2012) Accuracy of genome-enabled prediction in a dairy cattle population using different cross-validation layouts. Front Genet 28:3–27
Piepho HP (2009) Ridge regression and extensions for genome-wide selection in Maize. Crop Sci 49:1165–1176
Priyadarshini L, Samal KC, Sahoo JP, Mohapatra U (2020) Morphological, biochemical and molecular characterization of some promising potato (Solanum tuberosum L.) cultivars of Odisha. J Pharmacog Phytochem. 9:1657–1664
Robertsen CD, Hjortshøj RL, Janss LL (2019) Genomic selection in cereal breeding. Agronomy 9(2):95
Rutkoski JE, Heffner EL, Sorrells ME (2011) Genomic selection for durable stem rust resistance in wheat. Euphytica 179:161–173
Rutkoski J, Singh RP, Huerta-Espino J, Bhavani S, Poland J, Jannink JL, Sorrells ME (2015) Genetic gain from phenotypic and genomic selection for quantitative resistance to stem rust of wheat. Plant Genome. 8(2):eplantgenome2014.10.0074
Sahoo JP, Sharma V (2018) Impact of LOD score and recombination frequencies on the microsatellite marker based linkage map for drought tolerance in kharif rice of Assam. Int J Curr Microbiol Appl Sci 7:3299–3304
Sahoo JP, Singh SK, Saha D (2018) A review on linkage mapping for drought stress tolerance in rice. J Pharmacog Phytochem 7:2149–2157
Sahoo JP, Sharma V, Verma RK, Chetia SK, Baruah AR, Modi MK, Yadav VK (2019) Linkage analysis for drought tolerance in kharif rice of Assam using microsatellite markers. Indian J Trad Knowledge. 18:371–375
Sahoo JP, Behera L, Sharma SS, Praveena J, Nayak SK, Samal KC (2020) Omics studies and systems biology perspective towards abiotic stress response in plants. Am J Plant Sci 11:2172–2194
Sahoo JP, Mohapatra U, Mishra P (2020) An outlook on metabolic pathway engineering in crop plants. Arch Agric Environm Sci 5:431–434
Sahoo JP, Behera L, Praveena J, Sawant S, Mishra A, Sharma SS, Samal KC (2021) The golden spice turmeric (Curcuma longa) and its feasible benefits in prospering human health: a review. Am J Plant Sci 12:455–475
Sahoo JP, Mishra AP, Samal KC, Dash AK (2021) Insights into the antibiotic resistance in Biofilms–A Review. Environm Conserv J 22:59–67
Sahoo JP, Mohapatra U, Saha D, Mohanty IC, Samal KC (2022a) Linkage disequilibrium mapping: A journey from traditional breeding to molecular breeding in crop plants. Indian J Trad Knowledge. 21:434–442
Sahoo JP, Dash D, Moharana A, Mahapatra M, Sahoo AK, Samal KC (2022b) The role of transcription factors in response to biotic stresses in Maize. In: Wani SH, Nataraj V, Singh GP (Eds) Transcription Factors for Biotic Stress Tolerance in Plants. Springer, Cham. https://doi.org/10.1007/978-3-031-12990-2_9
Sahoo JP, Mishra P, Mishra AP et al (2022c) Physiological, biochemical, and molecular responses of rice (Oryza sativa L.) towards elevated ozone tolerance. Cereal Res Commun. https://doi.org/10.1007/s42976-022-00316-8
Sahoo JP, Samal KC, Tripathy SK, Lenka D, Mishra P, Behera L, Acharya LK, Sunani SK, Behera B (2022d) Understanding the genetics of Cercospora leaf spot (CLS) resistance in mung bean (Vigna radiata L. Wilczek). Trop Plant Pathol. https://doi.org/10.1007/s40858-022-00525-w. Accessed on: 10th August 2022
Sahoo JP, Samal KC, Lenka D et al (2023) Population genetic structure and marker-trait association studies for Cercospora leaf spot (CLS) resistance in mung bean (Vigna radiata (L.) Wilczek). Trop Plant Pathol. https://doi.org/10.1007/s40858-023-00565-w
Samal KC, Sahoo JP, Behera L, Dash T (2021) Understanding the BLAST (Basic local alignment search tool) Program and a step-by-step guide for its use in life science research. Bhartiya Krishi Anusandhan Patrika. 36:55–61
Servin B, Martin OC, Mézard M, Hospital F (2004) Toward a theory of marker-assisted gene pyramiding. Genetics. 168(1):513–23
Shakoor N, Lee S, Mockler TC (2017) High throughput phenotyping to accelerate crop breeding and monitoring of diseases in the field. Curr Opin Plant Biol 38:184–192
Singh BD, Singh AK (2015) Hybridization-based markers. Marker Assist Plant Breed Princ Pract. https://doi.org/10.1007/978-81-322-2316-0_2
Solberg TR, Sonesson AK, Woolliams JA, Meuwissen TH (2008) Genomic selection using different marker types and densities. J Anim Sci. 86(10):2447–54
Sweeney DW, Sun J, Taagen E, Sorrells ME (2019) Genomic selection in wheat. In: Miedaner T, Korzun V (Eds.) Applications of genetics and genomic research in cereals, Woodhead publisher
van Eeuwijk FA, Bustos-Korts D, Millet EJ, Boer MP, Kruijer W, Thompson A et al (2018) Modelling strategies for assessing and increasing the effectiveness of new phenotyping techniques in plant breeding. Plant Sci 282:23–39
Varshney RK, Ribaut JM, Buckler ES, Tuberosa R, Rafalski JA, Langridge P (2012) Can genomics boost productivity of orphan crops? [Opinion and Comment]. Nat Biotechnol 30(12):1172–1176
Wang X, Yang ZF, Xu CW (2015) A comparison of genomic selection methods for breeding value prediction. Sci Bull 60:925–935
Wang X, Xu Y, Hu Z, Xu C (2018) Genomic selection methods for crop improvement: current status and prospects. Crop J 6:330–340
Wong CK, Bernardo R (2008) Genome wide selection in oil palm: increasing selection gain per unit time and cost with small populations. Theoret Appl Genet 116:815–824
Xu Y (2016) Envirotyping for deciphering environmental impacts on crop plants. Theoret Appl Genet 129:653–673
Xu Y, Li P, Zou C, Lu Y, Xie C, Zhang X, Prasanna BM, Olsen MS (2020) Enhancing genetic gain in the era of molecular breeding. J Exp Bot. 68(11):2641–2666
Xu Y, Liu X, Fu J, Wang H, Wang J, Huang C, Prasanna BM, Olsen MS, Wang G, Zhang A (2020) Enhancing genetic gain through genomic selection: from livestock to plants. Plant Commun 1(1):100005
Zhang XC, Pérez-Rodríguez P, Burgueño J, Olsen M, Buckler E, Atlin G, Prasanna BM, Vargas M, San Vicente F, Crossa J (2017) Rapid cycling genomic selection in a multi-parental tropical maize population. G3 (Bethesda) 7:2315–2326

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
