Genotype To Phenotype L1 - 2023-Sem2


Dr Monika Murcha

Associate Prof
School of Molecular Sciences

SCIE4001
genotype to
phenotype
Lecture Outline:

• Lecture 1: genotype to phenotype
  Functional genomics
  Forward genetic approaches
  Reverse genetic approaches
  Phenotypic analysis

• Lecture 2: examples of functional genomics
A bit about me…
BSc at UWA majoring in Biochemistry and
Molecular Biology in 1995

Honours 1999, Biochemistry at UWA (Prof. J Whelan)

Started my PhD 2000
“Identification and characterisation of plant mitochondrial import components”

Post-doc ICHR
Post-doc ARC Coe PEB

20144 ARC Australian Post-doctoral Fellow 4 yrs

2014: started my own research group, affiliated with the Centre of Excellence Plant Energy Biology

2014 ARC Future fellowship

2015 ARC Discovery Project
DAAD grant Jurgen Soll (Munich)
STINT grant (Stockholm University)

Murchalab.com

Come chat to me about Masters/Honours research projects, vacation projects etc
Learning outcomes: Functional genomics:

• Describe various targeted genetic approaches to determine the function of a protein.

• Describe how plant phenotypes can be used to unravel a protein's function.

• Understand the current research themes and approaches used in reverse genetics and forward genetic screens in the model Arabidopsis thaliana.

• Critically evaluate the discussed techniques available for researchers today.
Reverse genetics:
Discovering the function of a gene by analysing the phenotypic consequences of a gene

Forward genetics/classical genetics:
Determining the genetic basis responsible for a phenotype

If you want some background reading, I have uploaded several reviews and papers for you.
There is no dedicated textbook, but if you do not have a strong genetic background please refer to:

Benjamin A. Pierce, Genetics: A Conceptual Approach (2016, 6th edition)

Also recommended: A.J.F. Griffiths, S.R. Wessler, S.B. Carroll & J. Doebley, Introduction to Genetic Analysis (c 2015, 11th edition)

For Quantitative Genetics: F.W. Nicholas, An Introduction to Veterinary Genetics (2010, 3rd edition)

All in High Demand in the CMO, Biological Sciences Library.
What is functional genomics………Functional genomics uses
genomic data to study gene and protein expression and function
on a global scale (genome-wide or system-wide), focusing on
gene transcription, translation and protein-protein interactions,
and often involving high-throughput methods.

What is the aim of functional genomics…… to understand the complex relationship between genotype and phenotype

Figure 1. Functional genomics integrates information from various molecular methodologies to gain an understanding of how DNA sequence is translated into complex information in a cell (DNA → RNA → Proteins → biological process).
What is functional genomics?
Functional genomics is the study of how genes (and intergenic regions)
contribute to different biological processes.

Using a variety of genomic methods to understand the function of a gene or protein

GENE PHENOTYPE
Why do we need functional genomics?

What are the roles of specific genes/proteins and how are they
regulated.

Bioinformatic approaches:
• Genomes
• Microarray data sets
• Phylogenetics
• Protein structure

Create variants and analyse the phenotypic consequences

Predict pathways and functions, further biochemical characterisation
disclaimer?

Functional genomics will not replace the time-honored use of genetics, biochemistry, cell biology and structural studies in gaining a detailed understanding of biological mechanisms.

The extent to which any functional genomics approach actually defines the function of a particular protein (or set of proteins) will vary depending on the methodology and gene involved.

(Hieter & Boguski 97)


Why do we need functional genomics?

Organism       # genes     % of genes with      Completion date
                           inferred function    of genome

E. coli        4288        60                   1997
yeast          6,600       40                   1996
C. elegans     19,000      40                   1998
Drosophila     12-14K      25                   1999
Arabidopsis    25,000      40                   2000
mouse          ~30,000?    10-20                2002
human          ~30,000?    10-20                2000
27 000 genes, only ~9% have been functionally characterised experimentally
Functional genomics in the model plant Arabidopsis thaliana

• Reverse genetic approaches using Arabidopsis


• Forward genetic approaches using Arabidopsis

Model organism for plant biology: although it is not of agronomic importance, it offers important advantages with regard to genetic and molecular biological research
1. Small genome size; many crop species have large genomes due to polyploidization (maize 19X, wheat 128X)
2. Many tools, techniques and resources are available
3. Easily manipulated, short life cycle, broad knowledge base
4. Translational research: basic understanding can be directed to crop species
Reverse genetics:
Discovering the function of a gene by analysing the phenotypic
consequences of a gene
How do we do that: we need to modify that gene

1. DELETE IT!!!!!!!!!, knock-out, Gene silencing- RNA interference

T-DNA insertional
knock-out lines

SALK-TDNA
GABI-kat
SAIL
FLAG
Once upon a time …

Annual Review of Genetics, 1975

Advantages of using Arabidopsis thaliana as a genetic tool:


• The diploid chromosome number is five pairs
• The life cycle may be completed within one month
or it may be extended, depending on photoperiodic exposure
• Outcrossing (cross-pollination) is minimal (2% in nature)
• One plant can produce more than 50,000 seeds
• Plant can be grown on soil but also as cell cultures

https://commons.wikimedia.org/wiki/File:Arabidopsis_thaliana_seeds.jpg
Arabidopsis as a genetic tool…
Plants can be grown on soil but also as cell cultures

Arabidopsis cell
culture

Arabidopsis hydroponically grown


Why wheat is not a model organism
Wheat (Triticum aestivum) has a large genome
The sequence of the first plant genome was completed and published
at the end of 2000

27,000 genes are present in the Arabidopsis genome (they encode 35,000 proteins)

Genes are annotated along the chromosome arms and the locus identifiers look like AT5g15960
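
As an illustration of how such a locus identifier is put together, the sketch below splits one into its parts. It assumes the usual AGI layout (AT, then the chromosome as 1 to 5 or C/M for the organelle genomes, then G and a five-digit number); the function name `parse_agi` is hypothetical and not part of any database tool.

```python
import re

# Assumed AGI layout: "AT" + chromosome (1-5, C or M) + "G" + five digits.
AGI_PATTERN = re.compile(r"^AT([1-5CM])G(\d{5})$", re.IGNORECASE)

def parse_agi(locus):
    """Return (chromosome, gene_number) for a locus identifier, or None."""
    match = AGI_PATTERN.match(locus.strip())
    if match is None:
        return None  # not a locus identifier, e.g. a SALK line name
    chromosome, number = match.groups()
    return chromosome.upper(), int(number)

print(parse_agi("AT5g15960"))    # chromosome 5
print(parse_agi("SALK_112126"))  # a T-DNA line name, not a locus
```

The same check is handy when sorting a mixed list of gene identifiers and T-DNA line names.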
Current status of the multinational Arabidopsis community

Plant Direct, Volume: 4, Issue: 7, First published: 02 August 2020, DOI: (10.1002/pld3.248)
Generation of T-DNA insertional Knock-out lines:
• 225 000 independent T-DNA insertions created.

• Represent almost a complete library of T-DNA insertions for the whole Arabidopsis genome (29K genes)

• Sequencing/PCR of T-DNA identifies location of the T-DNA insertion

• Invaluable resource for plant researchers that has allowed the functional characterisation of genes.
Generation of T-DNA insertional Knock-out lines:
Agrobacterium-mediated transformation of a plasmid containing a short-strand homologous gene sequence. Agrobacterium is a common soil pathogen that transfers DNA (transferred DNA, T-DNA) into the host plant. Single or multiple T-DNAs can be inserted into the host genome.

• T-DNA-containing plasmid
• Plasmid transformed into Agrobacterium
  o Random insertion within the plant genome
  o Selection markers (antibiotic resistance) sit within the T-DNA; the DNA flanked by the 25 base pair direct repeat border sequences is inserted within the genome
  o Allows for the introduction of DNA
  o Allows for the inactivation of endogenous genes

Nature Protocols 1, 641 - 646 (2006). Published online: 29 June 2006. doi:10.1038/nprot.2006.97
Generation of T-DNA insertional Knock-out lines:
Agrobacterium mediated transformation is a simple process of infecting the immature
flowers with the soil pathogen.
Arabidopsis research drives plant science
How do you find mutant lines?
How do you screen mutant lines?
Obtain and screen T-DNA insertion lines: a cheap and quick resource, with many lines already confirmed.
- It is optimal to have at least 2 lines with independent insertions within the same gene.
- A genomic DNA prep and PCR are required to confirm the line.
- The location of the insert is important: it may not produce a KO if the insert is in the 5’UTR or 3’UTR. Check whether it lies in an intron or an exon, as a T-DNA insert in an intron may be spliced out.
How do you screen mutant lines ?
Genotyping: identification of Homozygous mutants

[Figure: gene models of AtTric1 (At3g49560) and AtTric2 (At5g24650) from ATG to stop codon with 5’UTR and 3’UTR, showing the T-DNA insertion sites SALK_112126, SALK_031707, SALK_149871 and SALK_136525; genotyping PCR gel (786 bp and 780 bp products) comparing Col-0 with each SALK line (lanes 1 to 16, M = marker); crossing of mutant 1 and mutant 2 to give Attric1::2 lines #1, #2 and #3.]
How do you screen mutant lines?
Genotyping: the identification of a Homozygous mutant

Heterozygous (HET) self-pollinates = Homozygous, WT, HET, HET
How do you screen mutant lines?
PCR based genotyping: the identification of a Homozygous mutant

[Figure: PCR gel of four progeny plants (HOMO, WT, HET, HET), each tested with two reactions, S1 and S2.]
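
The logic of reading such a gel can be written out as a small sketch. The slide does not define S1 and S2, so this assumes S1 is a gene-specific primer pair spanning the insertion site (product from the wild-type allele only) and S2 pairs a T-DNA border primer with a gene-specific primer (product from the insertion allele only); `call_genotype` is a hypothetical helper name.

```python
def call_genotype(s1_band, s2_band):
    """Classify a plant from two PCR reactions.

    s1_band: True if the gene-specific reaction (assumed S1) gave a product.
    s2_band: True if the T-DNA border reaction (assumed S2) gave a product.
    """
    if s1_band and s2_band:
        return "HET"   # both alleles present
    if s2_band:
        return "HOMO"  # insertion allele only
    if s1_band:
        return "WT"    # wild-type allele only
    return "failed PCR, repeat"

# Hypothetical band calls for three plants: (S1 band, S2 band).
plants = {"plant 1": (False, True), "plant 2": (True, False), "plant 3": (True, True)}
for name, (s1, s2) in plants.items():
    print(name, call_genotype(s1, s2))
```

A plant with no band in either reaction tells you nothing about genotype, which is why the sketch reports it separately rather than guessing.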
How do you screen mutant lines?
Genotyping: What happens if you cannot find a homozygous knock-out line?

Figure 3.1.2. Silique dissection of heterozygous SALK_02356 T-DNA insertion line for At5g63000. An average of five siliques were collected from five individual plants per line to a total of 24 siliques each from SALK_02356 heterozygotes and Col-0 wild-type. i) Seeds in siliques were classified as either fully developed or defective. On average, Col-0 siliques contained 49 fully developed seeds, and only 1 in 4 siliques presented up to two defective seeds. SALK_02356 heterozygote siliques contained an average of 34 fully developed seeds and 12 defective seeds. Standard error bars are indicated in black for each data plot on the bar graph. Counts and standard errors are reported in Appendix IV. ii) Silique dissections for SALK_02356 heterozygotes in comparison to Col-0. Defective seeds are indicated by arrows.

Panel i) bar values (average number of seeds): fully developed seeds per silique, SALK_02356 33.83 and Col-0 48.71; defective seeds per silique, SALK_02356 11.96 and Col-0 0.29.

Research Thesis, Jessica Lee Andrews
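
A quick calculation shows why these seed counts matter. The sketch below takes the mean counts from the figure and compares the defective fraction with the 1 in 4 expected if seeds homozygous for the insertion fail to develop; that 1 in 4 expectation is an assumption of simple Mendelian segregation, not a result stated on the slide.

```python
def defective_fraction(developed, defective):
    """Fraction of seeds in a silique that are defective."""
    return defective / (developed + defective)

# Mean seeds per silique, as reported in the figure.
het = defective_fraction(developed=33.83, defective=11.96)  # SALK_02356 heterozygote
wt = defective_fraction(developed=48.71, defective=0.29)    # Col-0

# Assumption: a selfed heterozygote gives 1/4 homozygous mutant seeds.
expected_if_lethal = 0.25

print(f"SALK_02356 het: {het:.3f}")
print(f"Col-0:          {wt:.3f}")
print(f"expected if homozygous seeds abort: {expected_if_lethal:.3f}")
```

A heterozygote fraction close to one quarter is what you would look for before concluding that the homozygous knock-out is seed or embryo lethal; a formal test would use the raw counts in the thesis appendix rather than the means.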
How do you confirm mutant lines?
Genotyping: confirmation of mutants

Need to confirm transcript level, design primers after the insert.

Transcript level may not correlate to protein level, therefore it is optimal to create antibodies to confirm total knock-out of the protein.

This is the definitive proof: looking at protein abundance.

[Figure, Research Thesis, Jessica Lee Andrew: cDNA concentration (fmol/µl). A) Col-0 versus SALK_096103, bar labels 8.21E-05 and 4.56E-06. B) bar labels 7.14E-04 and 8.21E-05.]
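
To put a number on "knock-out versus knock-down", the residual transcript level can be expressed as a fraction of wild type. The sketch below uses the two bar labels from panel A and assumes the larger value belongs to Col-0 and the smaller to SALK_096103; the extraction does not make that assignment explicit, so treat it as illustrative.

```python
def residual_transcript(mutant, wild_type):
    """Mutant transcript level as a fraction of the wild-type level."""
    return mutant / wild_type

# Panel A bar labels in fmol/µl (assignment to genotypes is an assumption).
col0 = 8.21e-05
salk_096103 = 4.56e-06

fraction = residual_transcript(salk_096103, col0)
print(f"residual transcript: {fraction:.1%} of Col-0")
```

A small but non-zero residual is one reason the slide recommends confirming at the protein level as well.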
How do you confirm mutant lines?
How effective is T-DNA insertional mutagenesis in Arabidopsis?

[Figure, Research Thesis, Jessica Lee Andrew: cDNA concentration (fmol/µl). A) Col-0 versus SALK_096103, bar labels 8.21E-05 and 4.56E-06. B) Col-0 versus SALK_074544, bar labels 7.14E-04 and 8.21E-05.]
How do you confirm mutant lines?
A T-DNA insert may not produce a knock-out effect

- It may only produce a knock-down.
- No difference in protein abundance: e.g. the insert is in an intron region and is spliced out.
- Production of a truncated protein: an insert towards the end of the gene results in early termination.
- Production of an overexpressor: an insert within the promoter region affects transcription/translation, resulting in overexpression of the protein.
Genotyping: Homozygous mutants that are sterile

The gene is essential for pollen/embryo/seed development or fertilization

Homozygous mutants can grow but cannot make progeny
Multiple genes encode the same or similar proteins = MULTIGENE FAMILIES
Many genes can have redundant functions. Deletion of one gene may therefore have no consequence for the phenotype, because the “other” genes can do the same job.
Multiple genes encode for the same or similar proteins?
• Example: AtTric1/2 single deletions have no consequence to phenotype
• Need to make a double mutant?? HOW??

[Figure: (ai) gene models of AtTric1 (At3g49560) and AtTric2 (At5g24650) with T-DNA insertion sites (SALK_112126, SALK_031707, SALK_149871, SALK_136525) and genotyping PCR (786 bp and 780 bp products; Col-0 versus each SALK line, lanes 1 to 16, M = marker). (aii) Crosses used to generate the double mutants: SALK_031707 X SALK_136525 = Attric1::2 #1; SALK_112126 X SALK_149871 = Attric1::2 #2; SALK_112126 X SALK_136525 = Attric1::2 #3. (aiii) Immunoblots of Col-0, Attric1 (SALK_112126), Attric2 (SALK_149871) and Attric1::2 probed for TIC110 and AtTric1 (28 kDa). Growth of Col-0, Attric1, Attric2 and Attric1::2 #3 at 0, 7, 14, 28 and 33 d.]
Generation of double Homozygous mutants
• Single lines had no obvious defect/effect on plant development.
• Need to make a double mutant plant
• May need to make triple/ quadruple (1-2 yrs)
HM A x HM B (cross) = Het A Het B
Het A Het B (self) = HM A and HM B in 1/16 of progeny
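
The 1/16 figure can be checked by enumerating every gamete combination from a selfed double heterozygote. This is a minimal sketch that assumes the two loci are unlinked and that all gametes are equally viable; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def homozygous_mutant_fraction(n_loci=2):
    """Fraction of selfed progeny that are homozygous mutant at every locus.

    Assumes the parent is heterozygous at each locus and the loci are unlinked.
    """
    # Each gamete carries either the mutant (m) or wild-type (W) allele per locus.
    gametes = list(product("mW", repeat=n_loci))
    total = 0
    hits = 0
    for egg, pollen in product(gametes, repeat=2):
        total += 1
        if all(a == "m" and b == "m" for a, b in zip(egg, pollen)):
            hits += 1
    return Fraction(hits, total)

print(homozygous_mutant_fraction(1))  # single mutant from a selfed HET
print(homozygous_mutant_fraction(2))  # double mutant
print(homozygous_mutant_fraction(3))  # triple mutant
```

Each extra locus divides the expected fraction by four, which is why triple and quadruple mutants need far more plants to be genotyped.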

Genotyping: generation of double Homozygous mutants

[Figure: growth of Col-0, Attric1 (SALK_112126), Attric2 (SALK_149871) and Attric1::2 #3 at 0, 7, 14, 28 and 33 d. Legend fragment: T-DNA insertional knock-outs for AtTric. a, Screening of T-DNA insertion lines for AtTric1 and ... (Col-0 versus SALK_112126, SALK_031707, SALK_149871 and SALK_136525; lanes 1 to 16, M = marker).]
Phenotypic analysis of mutants (KO’s, knock-downs, overexpressors etc)
What are the observable physical consequences of modifying that gene?

1. Normal conditions on MS media and soil.
2. Phenotypic analysis on compromised conditions, e.g. stress, high light, drought etc.

[Figure: AtTric1 and AtTric2 T-DNA lines and Attric1::2 double mutants (#1 to #3) from the crosses SALK_031707 X SALK_136525, SALK_112126 X SALK_149871 and SALK_112126 X SALK_136525, with immunoblots for Actin, AtTric1, AtTric2 and TIC110.]
How do we measure phenotypes?

Stage growth progression: how many days does it take to reach a certain developmental stage?

[Figure: (bi) developmental stage on plates with 0 % sucrose and (bii) with 3 % sucrose, showing growth stages 0.1, 0.5, 0.7, 1.0, 1.02 and 1.04 against days after sowing (0 to 14) for Col-0, Attric1 (SALK_112126), Attric2 (SALK_149871) and Attric1::2; developmental stage on soil at 0, 7, 14, 28 and 33 d. Immunoblot labels: mitochondria, chloroplasts, AtTric1 28 kDa, 30 kDa.]
Phenotypic analysis of mutants ( KO’s, knock-downs,
overexpressors etc)

Growth parameters: height, length, root length, leaf radius


Phenotypic analysis of
mutants
Phenotypic analysis on compromised conditions

1. High temperature (42 °C), then return to normal

2. High light

3. UV light treatment

4. High salinity

5. Stop watering (drought)

6. Pathogen infection (biotic)


Phenotypic analysis of
mutants
developmental parameters
Eg. Germination, senescence

If your gene is expressed in a


developmental specific manner

You should only observe the phenotype


at a particular stage

[Figure: germination (%, 0 to 100) over time (12, 18, 24, 36 and 48 h) on 0% sucrose/glucose for Col-0, Attim17-1 KO1, Attim17-1 KO2, Attim17-1 KO1/AtTim17-1, Attim17-1 KO2/AtTim17-1, AtTim17-1 OX1 and AtTim17-1 OX2.]

Phenotypic analysis of mutants: developmental parameters

The effects of this gene KO can only be observed at senescence

Look in online expression data to see where your gene is expressed

[Figure: A) number of samples by stage of development. B) Col-0, SALK_016767 and GABI_369G03 leaves after 3 days of dark-induced senescence treatment. C) Relative chlorophyll content (0.0 to 1.0) at 0 days and 3 days for Col-0, SALK_016767 and GABI_369G03.]
Molecular and
Biochemical analysis of
mutants
No obvious developmental growth
defect

Can have a biochemical or Molecular


defect

Eg. Defective in a biochemical pathway

Know what you're looking for.

Whole transcriptome, proteome,


metabolome analysis.
Generation of mutants
Overexpressor lines
Introduce another copy of the gene into the genome. It can be under the native promoter (endogenous expression) or under a strong constitutive promoter.
You may tag your new protein: FLAG tag for pull-downs, GFP tag for localisation.

RNAi knock-downs
The Nobel Prize in Physiology or Medicine in 2006 was awarded to Craig Mello and Andrew Fire for the development of essentially a new field, RNA interference or RNAi, in C. elegans.
RNA molecules are used to inhibit gene expression: siRNA molecules bind to mRNA.
In plant research it is a very powerful technique as many species are polyploid, e.g. wheat.

CRISPR/Cas9: directed mutations/deletions/insertions
Specifically target regions within the genome, make truncations, make specific mutations, make knock-outs where none are available.

Complementation of Knock-out mutants


The only way to confirm that your gene is responsible for the phenotype observed is to complement it.
Return a functional copy of the gene back into the genome…… but will it be under the native promoter, will it be in the same position within the genome, will expression levels be restored as in wildtype?
Generation of complementation lines and overexpressor lines.
Restore the gene of interest and restore the phenotype observed. Can create lines with either a strongly expressed promoter (CaMV) or an endogenous promoter (native promoter). CaMV will most likely produce overexpression lines, though it can get silenced over generations.

Obtain varying levels of transcript and protein expression due to multiple individual T-DNA lines with varied location and number of T-DNA inserts.

Allows you to make tagged variants for biochemical characterisation, e.g. pull-down approaches.
Reverse genetics:
Discovering the function of a gene by analysing the phenotypic consequences of a gene

Forward genetics: classical genetics
Determining the genetic basis responsible for a phenotype
Forward genetics: classical genetics
Determining the genetic basis responsible for a phenotype

1. Mutagen treat seeds (EMS)
2. Screen for desired phenotype
3. Identify the gene responsible by traditional mapping techniques and deep sequencing

EMS mutagenesis (Ethyl Methanesulphonate)
• Introduces random point mutations (single base changes); favours a C to T change.
• Can produce a single amino acid change or a stop codon (only 5%).
• Requires at least 125 000 seeds.
• Multiple mutations can occur per line depending on the concentration and length of treatment (~700).
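
To see how a single EMS-type base change plays out at the codon level, the sketch below applies a C to T change (or G to A, which is the same change read from the opposite strand) to one position of a codon and reports the consequence. It uses the standard genetic code; treating EMS as causing only these two changes is a simplification, and `ems_effect` is a hypothetical helper name.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered T, C, A, G at each position.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
EMS_CHANGE = {"C": "T", "G": "A"}  # simplified model of EMS changes

def ems_effect(codon, position):
    """Return (mutant_codon, effect) for an EMS-type change, or None."""
    base = codon[position]
    if base not in EMS_CHANGE:
        return None  # A and T are not changed in this simplified model
    mutant = codon[:position] + EMS_CHANGE[base] + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if after == before:
        return mutant, "silent"
    if after == "*":
        return mutant, "stop codon"
    return mutant, "amino acid change"

print(ems_effect("CAG", 0))  # glutamine codon, first base changed
print(ems_effect("GGA", 0))  # glycine codon, first base changed
print(ems_effect("CTG", 0))  # leucine codon, first base changed
```

Only a few codons sit one such change away from a stop codon, which fits the point above that stop codons are the minority outcome.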
Forward genetics: classical genetics
Determining the genetic basis responsible for a phenotype

1. Mutagen treat seeds: use mainly EMS
   (radiation, natural variation, recombination)

2. Screen for desired phenotype
   Simple, easy to identify screen: visual, to allow screening of 10’s-100’s mutants at one time

3. Identify the gene responsible by traditional mapping techniques and deep sequencing
Mapping

Positional mapping
Next Generation Sequencing

1-3 Mb chromosomal
region
Forward genetics: Positional mapping.
• Single nucleotide polymorphism (SNP) mapping is employed to narrow down a segregating genomic region.
• A cross is set up between the mutant and an alternative strain of the same species that contains polymorphic nucleotides, e.g. Col-0 and Ler.
• Because of recombination during meiosis, the polymorphisms from the mutant and mapping strains will be distributed in a 50/50 ratio,
• except for SNPs that are genetically linked to the causal mutation.

Positional mapping can take years to do: PCR, sequencing, crossing etc. Traditionally this was the bottleneck in EMS mutant screens.

Thanks to NGS we can sequence genomes, so we use a combination of mapping and sequencing.

Still need to confirm the mutation is responsible. How would you do this? Complementation!!!
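
The SNP mapping idea can be sketched in a few lines: in a pool of progeny that show the mutant phenotype, unlinked SNPs sit near a 50/50 ratio, while SNPs near the causal mutation come almost entirely from the mutant parent. The positions and frequencies below are invented for illustration, and the 0.9 threshold and function name are assumptions, not values from the lecture.

```python
def find_linked_region(snps, threshold=0.9):
    """Return (start, end) spanning SNPs at or above the threshold, or None.

    snps: list of (position_in_Mb, mutant_parent_allele_frequency) tuples.
    Simplification: reports one span and ignores gaps between linked SNPs.
    """
    linked = [pos for pos, freq in snps if freq >= threshold]
    if not linked:
        return None
    return min(linked), max(linked)

# Hypothetical frequencies along one chromosome (illustration only).
example = [(1.0, 0.52), (5.0, 0.61), (9.0, 0.83), (11.0, 0.95),
           (12.0, 1.00), (13.5, 0.93), (16.0, 0.78), (20.0, 0.55)]

print(find_linked_region(example))
```

The interval found this way is where candidate mutations from sequencing would be inspected first, before confirming the gene by complementation.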
Genetic screen

EMS treatment of
seeds

M1 plants
2000

Screen of M2
phenotype
Forward genetics:
• Time consuming
• Expert geneticists needed
• Need easy to screen phenotype: pale plant, small plant, large plant etc
• Unbiased discovery
• Still need to work out the mechanisms

Reverse genetics:
• Faster (still takes months to make mutants)
• T-DNA cheap and available
• Proven techniques and methodology
• Multiple genes
• Can't find a phenotype
• Is the observed phenotype a direct consequence, or due to downstream effects?
• Researcher bias
• Still need to work out mechanisms
