Diagnosis of Retinitis Pigmentosa Through Exome Analysis

Abstract
Genome sequencing is becoming increasingly common in medical diagnostics, sometimes pre-empting
symptoms. A patient with retinal issues sought genetic counselling for a diagnosis. This research
aimed to form a diagnosis and identify the mutated gene, while illustrating the benefits and
possible inaccuracies of sequencing. A pedigree analysis indicated either autosomal dominant or
autosomal recessive inheritance. Applying this information to a search in the Online Mendelian
Inheritance in Man (OMIM) database made Retinitis Pigmentosa (RP) the most likely diagnosis. Among
the genes linked to this disease, PRPF4 stood out. For this reason, the gene was isolated, amplified
through a polymerase chain reaction (PCR) and sequenced with the Sanger method for further
analysis. All sequencing was performed by Radboud UMC (RUMC). To rule out known benign variants,
the detected variants were looked up in the ClinVar database. Only one variant was found, and
finding out it was benign marked the start of the next part of the project: targeted exome
sequencing (TES) of all genes associated with RP. The raw data was processed into a list of
variants and annotated for analysis. A Python 3.12 script was designed to apply a series of filters
to these variants, resulting in two variants in the USH2A gene. However, a segregation analysis
determined that these variants could not be the cause. Whole exome sequencing (WES) of the patient
produced one new variant, in CNGB1. Inspecting the original sequencing data in the Integrative
Genomics Viewer (IGV) showed that this variant is likely an error, leading to the conclusion that
none of the sequenced variants is the cause. A case can be made that the filters applied to the
data were too strict, causing potential mutations to be filtered out. It also is not clear whether
the erroneous CNGB1 variant points to an error in the TES or in the WES.
There is no conclusive evidence either way, but this process can be applied to further diagnoses.
A next step could be to expand the scope even further through whole genome sequencing (WGS).

Materials and Methods


A Temporary Diagnosis
The subject was a patient who presented to a clinical geneticist at the RUMC for a genetic
consultation. The research commenced with the goal of forming a diagnosis and determining whether
a mutated gene was responsible for the disease. Starting around the age of 22, the proband
manifested a range of clinical symptoms, namely progressive retinal degeneration, nyctalopia and a
slow decrease of central vision. Assuming that the condition was monogenic and Mendelian in
nature, a pedigree analysis was performed to determine the most likely inheritance pattern. For
this process a four-generation pedigree chart was used. The next step was to apply the acquired
information to a search in the OMIM (OMIM, 2023) database. The specific search term used was:
“progressive degeneration retina, bad night vision, decreased central vision”. From the genes
linked to the RP entry in the OMIM database (Hamosh et al., 2023), those matching the inheritance
pattern and symptoms can be isolated from the proband, sequenced and then compared to a reference
sequence (RefSeq) to determine whether a pathogenic mutation is present.

Mutation Analysis
One such gene, PRPF4, was selected for further analysis. To isolate and sequence the PRPF4 gene
from the proband, primers were designed based on the transcript variant 2 mRNA RefSeq
(NM_001244926.2), sourced from the National Center for Biotechnology Information (NCBI) nucleotide
database (NCBI, 2023). The primer pair used to isolate and amplify this gene by PCR was designed
with the Primer-BLAST tool (NCBI, 2023). As the PRPF4 coding sequence (CDS) spans base pairs
87-1652, the forward primer range was set from 1 to 86 and the reverse primer range from 1652 to
2897, 2897 being the end of the 3’ untranslated region (UTR). For the primer parameters, a minimum
product size of 1565 and a maximum of 2897 base pairs were chosen, and the number of primers to
return was set to 1. The melting temperature (Tm) was set to a minimum of 57.0, an optimum of 60.0
and a maximum of 63.0, with a maximum Tm difference of 4 between the pair. The settings in the
“Exon/intron selection” and “Primer Pair Specificity Checking Parameters” sections, including the
advanced specificity settings, were left on default. In the advanced primer parameters, the primer
size was set to a minimum of 15, an optimum of 20 and a maximum of 25, and the primer GC content
(%) to a minimum of 40 and a maximum of 60. The PRPF4 gene was then amplified by PCR following the
protocol in Genetische Mutaties DNA Profiling (Olde Loohuis et al., 2022), using the forward
primer 5’GGACGGTCTGAAAGGGAGTG3’ (length 20, Tm 60.04, GC% 60.00) paired with the reverse primer
5’TGGGGGCTGACATGGAAATC3’ (length 20, Tm 60.03, GC% 55.00). To validate the amplified DNA fragment,
gel electrophoresis was performed following the protocol in Genetische Mutaties DNA Profiling (Olde
Loohuis et al., 2022). After validation, the fragment was sequenced by the RUMC using the Sanger
method following the protocol in Next-generation genetic testing for retinitis pigmentosa
(Neveling et al., 2012). The Sanger method was chosen because it is more accurate than next
generation sequencing (NGS). The resulting sequence was then compared to the PRPF4 RefSeq with
accession code NM_001244926.2 (NCBI, 2023) to see if any mutations were present.
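The length and GC% reported for the two primers can be recomputed directly from their sequences.
The sketch below does this and checks the values against the size and GC limits entered in
Primer-BLAST. The function names are illustrative and not part of any tool used in this study, and
the Tm is not recomputed because it depends on the thermodynamic model Primer-BLAST applies.

```python
def gc_content(primer: str) -> float:
    """Return the percentage of G and C bases in a primer sequence."""
    primer = primer.upper()
    gc_bases = sum(base in "GC" for base in primer)
    return 100.0 * gc_bases / len(primer)


def within_limits(primer: str, min_len: int = 15, max_len: int = 25,
                  min_gc: float = 40.0, max_gc: float = 60.0) -> bool:
    """Check a primer against the size and GC limits described in the text."""
    return (min_len <= len(primer) <= max_len
            and min_gc <= gc_content(primer) <= max_gc)


# Primer sequences as reported in the text (5' to 3').
FORWARD = "GGACGGTCTGAAAGGGAGTG"
REVERSE = "TGGGGGCTGACATGGAAATC"

for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), f"{gc_content(primer):.2f}", within_limits(primer))
```

Both primers are 20 bases long and give a GC% of 60.00 and 55.00 respectively, matching the
reported values.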
Any variants found would first be looked up in the ClinVar database (NCBI, 2023), so that any
known benign variants could be ruled out.
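As a minimal sketch of the comparison step, assuming the sequenced fragment has already been
aligned to the reference and has the same length, single-base substitutions can be listed position
by position and then checked against a set of known benign variants. The sequences and the
`known_benign` set below are toy placeholders, not PRPF4 data or ClinVar content, and the function
names are hypothetical.

```python
def find_substitutions(reference: str, sample: str, first_position: int = 1):
    """List (position, reference base, sample base) for every mismatch.

    Assumes both sequences are already aligned and equally long, so only
    substitutions (no insertions or deletions) are detected.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (first_position + index, ref_base, sample_base)
        for index, (ref_base, sample_base)
        in enumerate(zip(reference.upper(), sample.upper()))
        if ref_base != sample_base
    ]


def drop_known_benign(variants, known_benign):
    """Keep only the variants that are not listed as known benign."""
    return [variant for variant in variants if variant not in known_benign]


# Toy sequences for illustration only.
reference = "ATGGCTTCAGGA"
sample = "ATGGCTTCGGGA"

found = find_substitutions(reference, sample)
print(found)

# Hypothetical stand-in for the result of a manual ClinVar lookup.
known_benign = {(9, "A", "G")}
print(drop_known_benign(found, known_benign))
```

In this toy case the single substitution is listed as benign, so no candidate variant remains.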

Targeted Exome Sequencing


Further increasing the scale of the research, the next step was targeted sequencing of all the
genes whose mutations have been linked to the disease according to OMIM (Hamosh et al., 2000; OMIM,
2023), a process known as targeted exome sequencing (TES). The sequencing of the proband’s DNA was
performed by RUMC using NGS, following the protocol in Next-generation genetic testing for
retinitis pigmentosa (Neveling et al., 2012). NGS is less accurate than Sanger sequencing, but it
is a faster, more cost-effective method for sequencing larger datasets. The resulting sequence
reads were then mapped against the reference genome (hg18) and the discovered variants were
annotated for further analysis.
To narrow down which variants had the highest chance of being pathogenic mutations, a Python 3.11
script (Van Rossum & Drake, 2009) was designed to apply filters to some of the variant information
and relevant annotations. Starting with the “gene component” category, the focus was placed on
coding regions by filtering out every variant that was not located in an exon or splice site
(annotated as “EXON_REGION”, “SA_SITE” and “SD_SITE”). Next, synonymous variants were excluded by
keeping only nonsynonymous variants (annotated as “TRUE” when synonymous and “False” otherwise).
To ensure the quality and accuracy of the variants, any positions covered by fewer than 5 reads
and/or with a variation percentage below 20% were filtered out. Then, to ensure the variants were
previously unknown, a filter was applied to the Single Nucleotide Polymorphism Database (dbSNP)
(NCBI, 2023) annotation, filtering out any variants that have any information in this category.
Because dbSNP (NCBI, 2023) can include variants that are pathogenic, any variant annotated with a
Human Gene Mutation Database (HGMD) (Stenson et al., 2014) entry was retained regardless of the
dbSNP (NCBI, 2023) filter.
With this part of the work performed by the script, the remaining filtering was done by hand in
Microsoft Excel (Microsoft, 2023). The resulting variants were first validated by RUMC through
resequencing with the Sanger method following the protocol in Next-generation genetic testing for
retinitis pigmentosa (Neveling et al., 2012). As Sanger is a more accurate sequencing method than
NGS, any variants that failed validation were filtered out as erroneous. Then any variants
annotated with a PhyloP (Nassar et al., 2023) score of <1 were filtered out: a low score indicates
that the position was not strongly conserved during evolution, so variants scoring under 1 were
considered benign. The tools SIFT (Ng et al., 2003), PolyPhen-2 (Adzhubei et al., 2013) and
MutPred2 (Pejaver et al., 2020) each predict the effect of a missense mutation on a protein. Any
remaining variants deemed benign by at least 2 of the 3 tools were filtered out. dbSNP (NCBI, 2023)
was then used to filter out any remaining variants with a minor allele frequency (MAF) of >10%; a
higher frequency means the variant is more common in the population, making it less likely to be a
pathogenic mutation.
Finally, to verify whether any of the resulting variants were indeed the mutations linked to the
disease, a segregation analysis was performed on the proband’s family. This shows whether the
family members who carry the resulting variants also display the phenotypes linked to the disease.
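The filter steps described in this section can be sketched as two predicates, one for the scripted
filters and one for the filters applied by hand. This is not the script used in the study: the
field names, the "INTRON_REGION" label, the predictor labels and all variant values are
hypothetical, and the MAF is assumed to be stored as a fraction that is absent (None) for variants
without a dbSNP record.

```python
CODING_COMPONENTS = {"EXON_REGION", "SA_SITE", "SD_SITE"}
PREDICTORS = ("sift", "polyphen2", "mutpred2")


def passes_script_filters(variant: dict) -> bool:
    """Filters described as being applied by the script."""
    if variant["gene_component"] not in CODING_COMPONENTS:
        return False  # not in an exon or splice site
    if variant["synonymous"]:
        return False  # synonymous variants are excluded
    if variant["reads"] < 5 or variant["variation_pct"] < 20:
        return False  # too few reads or too low a variation percentage
    # Variants unknown to dbSNP pass; known ones pass only with an HGMD entry.
    return not variant["dbsnp_id"] or bool(variant["hgmd_id"])


def passes_manual_filters(variant: dict) -> bool:
    """Filters described as being applied by hand afterwards."""
    if not variant["sanger_validated"]:
        return False  # not confirmed by Sanger resequencing
    if variant["phylop"] < 1:
        return False  # position not conserved
    benign_votes = sum(variant[tool] == "benign" for tool in PREDICTORS)
    if benign_votes >= 2:
        return False  # benign according to at least 2 of the 3 tools
    if variant["maf"] is not None and variant["maf"] > 0.10:
        return False  # too common in the population
    return True


# Hypothetical variant that passes every filter; the others differ in one or two fields.
BASE = dict(gene_component="EXON_REGION", synonymous=False, reads=30,
            variation_pct=48.0, dbsnp_id="", hgmd_id="", sanger_validated=True,
            phylop=3.2, sift="damaging", polyphen2="damaging",
            mutpred2="benign", maf=None)

variants = {
    "v1": {**BASE},
    "v2": {**BASE, "gene_component": "INTRON_REGION"},
    "v3": {**BASE, "reads": 4},
    "v4": {**BASE, "dbsnp_id": "rs_example"},
    "v5": {**BASE, "dbsnp_id": "rs_example", "hgmd_id": "hgmd_example", "phylop": 0.4},
    "v6": {**BASE, "sift": "benign", "mutpred2": "benign"},
}

after_script = [name for name, v in variants.items() if passes_script_filters(v)]
candidates = [name for name in after_script if passes_manual_filters(variants[name])]
print(after_script)
print(candidates)
```

Keeping each filter as a separate early return makes it straightforward to relax a single
threshold, for example the read count or the MAF cut-off, and see which variants reappear.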

Whole Exome Sequencing


Scaling up once more, the entirety of the proband’s exome was sequenced. Again the sequencing was
performed by RUMC using NGS following the protocol in Next-generation genetic testing for
retinitis pigmentosa (Neveling et al., 2012), and the data provided was comparable to the data
from the TES. For this reason, the script used earlier was reapplied and the same steps to
determine pathogenicity were followed. Any new variants that were not found in the previous step
were then analysed further. To verify the new variants, the raw sequencing data was viewed in the
Integrative Genomics Viewer (IGV) (Robinson et al., 2011).
Any new variants reported in the data that do not show up in the original reads indicate an error
somewhere in the sequencing process. To determine whether it is worth re-sequencing such erroneous
variants, their pathogenicity can be assessed using the annotated data. Should a variant be
predicted to be sufficiently pathogenic, the gene can be re-sequenced using the Sanger method for
higher accuracy.
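Identifying the variants that are new in the WES data amounts to a set difference against the TES
results, followed by a decision on whether a variant that is not supported by the reads still
merits Sanger re-sequencing. The sketch below assumes variants are keyed by (chromosome, position,
reference base, alternative base); the coordinates are placeholders and the function names are
hypothetical.

```python
def new_variants(tes_variants, wes_variants):
    """Return the variants present in the WES results but not in the TES results."""
    return sorted(set(wes_variants) - set(tes_variants))


def worth_resequencing(seen_in_reads: bool, predicted_pathogenic: bool) -> bool:
    """A variant missing from the reads is re-sequenced only if predicted pathogenic."""
    return (not seen_in_reads) and predicted_pathogenic


# Placeholder coordinates for illustration only.
tes = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")}
wes = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr16", 500, "G", "A")}

for variant in new_variants(tes, wes):
    # Both flags would come from inspecting the reads in IGV and from the annotations.
    print(variant, worth_resequencing(seen_in_reads=False, predicted_pathogenic=True))
```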

References
Adzhubei, I., Jordan, D. M., & Sunyaev, S. R. (2013). Predicting functional effect of human missense
mutations using PolyPhen-2. Current protocols in human genetics, Chapter 7, Unit7.20.
https://doi.org/10.1002/0471142905.hg0720s76.

ClinVar [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for
Biotechnology Information; 2004 – [cited 2023 Dec 21]. Available from:
https://www.ncbi.nlm.nih.gov/clinvar/

dbSNP [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for
Biotechnology Information; 2004 – [cited 2023 Dec 27]. Available from:
https://www.ncbi.nlm.nih.gov/snp/

Microsoft Corporation. (2023). Microsoft Excel [Software]. Retrieved from
https://www.microsoft.com/excel

Nassar, L. R., Barber, G. P., Benet-Pagès, A., Casper, J., Clawson, H., Diekhans, M., Fischer, C.,
Gonzalez, J. N., Hinrichs, A. S., Lee, B. T., Lee, C. M., Muthuraman, P., Nguy, B., Pereira, T., Nejad, P.,
Perez, G., Raney, B. J., Schmelter, D., Speir, M. L., Wick, B. D., … Kent, W. J. (2023). The UCSC Genome
Browser database: 2023 update. Nucleic acids research, 51(D1), D1188–D1195.
https://doi.org/10.1093/nar/gkac1072

Neveling, K., Collin, R. W., Gilissen, C., van Huet, R. A., Visser, L., Kwint, M. P., Gijsen, S. J., Zonneveld,
M. N., Wieskamp, N., de Ligt, J., Siemiatkowska, A. M., Hoefsloot, L. H., Buckley, M. F., Kellner, U.,
Branham, K. E., den Hollander, A. I., Hoischen, A., Hoyng, C., Klevering, B. J., van den Born, L. I., …
Scheffer, H. (2012). Next-generation genetic testing for retinitis pigmentosa. Human mutation, 33(6),
963–972. https://doi.org/10.1002/humu.22045

National Center for Biotechnology Information (NCBI)[Internet]. Bethesda (MD): National Library of
Medicine (US), National Center for Biotechnology Information; [1988] – [cited 2023 Dec 21].
Available from: https://www.ncbi.nlm.nih.gov/

Nucleotide [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for
Biotechnology Information; [1988] – . Accession No. NM_001244926.2, Homo sapiens pre-mRNA
processing factor 4 (PRPF4), transcript variant 2, mRNA; [cited 2023 Dec 21]. Available from:
https://www.ncbi.nlm.nih.gov/nuccore/NM_001244926.2

O'Leary, N. A., Wright, M. W., Brister, J. R., Ciufo, S., Haddad, D., McVeigh, R., Rajput, B., Robbertse,
B., Smith-White, B., Ako-Adjei, D., Astashyn, A., Badretdin, A., Bao, Y., Blinkova, O., Brover, V.,
Chetvernin, V., Choi, J., Cox, E., Ermolaeva, O., Farrell, C. M., … Pruitt, K. D. (2016). Reference
sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional
annotation. Nucleic acids research, 44(D1), D733–D745. https://doi.org/10.1093/nar/gkv1189

Olde Loohuis N, Goosen T, de Boer E. (2022). Genetische Mutaties DNA Profiling (2022/23). Cited
2023 Dec 21, from https://han.onderwijsonline.nl/elearning/lesson/pqg1wKXy

Online Mendelian Inheritance in Man, OMIM®. Johns Hopkins University, Baltimore, MD. MIM
Number: 268000: 06/15/2023. World Wide Web URL: https://omim.org/

Online Mendelian Inheritance in Man, OMIM®. McKusick-Nathans Institute of Genetic Medicine,
Johns Hopkins University (Baltimore, MD), 19/12/2023. World Wide Web URL: https://omim.org/

Pejaver, V., Urresti, J., Lugo-Martinez, J., Pagel, K. A., Lin, G. N., Nam, H., Mort, M., Cooper,
D. N., Sebat, J., Iakoucheva, L. M., Mooney, S. D., & Radivojac, P. (2020). Inferring the molecular
and phenotypic impact of amino acid variants with MutPred2. Nature communications, 11, 5918.

Primer-BLAST [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for
Biotechnology Information; 2004 – [cited 2023 Dec 21]. Available from:
https://www.ncbi.nlm.nih.gov/tools/primer-blast/

Robinson, J. T., Thorvaldsdóttir, H., Winckler, W., Guttman, M., Lander, E. S., Getz, G., & Mesirov, J. P.
(2011). Integrative genomics viewer. Nature biotechnology, 29(1), 24–26.
https://doi.org/10.1038/nbt.1754

Stenson, P. D., Mort, M., Ball, E. V., Shaw, K., Phillips, A., & Cooper, D. N. (2014). The Human
Gene Mutation Database: building a comprehensive mutation repository for clinical and molecular
genetics, diagnostic testing and personalized genomic medicine. Human genetics, 133, 1-9.
https://doi.org/10.1007/s00439-013-1358-4

Van Rossum, G., & Drake, F. L. (2009). Python 3 Reference Manual. Scotts Valley, CA: CreateSpace.