CHARNEY & NESTLER’S
NEUROBIOLOGY OF MENTAL ILLNESS

Charney and Nestler's Neurobiology of Mental Illness, edited by Dennis S. Charney, et al., Oxford University Press USA - OSO, 2018.

CHARNEY & NESTLER’S
NEUROBIOLOGY OF
MENTAL ILLNESS
FIFTH EDITION

EDITED BY

Dennis S. Charney, MD
ANNE AND JOEL EHRENKRANZ DEAN
ICAHN SCHOOL OF MEDICINE AT MOUNT SINAI
PRESIDENT FOR ACADEMIC AFFAIRS
MOUNT SINAI HEALTH SYSTEM
PROFESSOR, DEPARTMENTS OF PSYCHIATRY, NEUROSCIENCE, AND PHARMACOLOGICAL SCIENCES
NEW YORK, NEW YORK

Pamela Sklar, MD, PhD
MOUNT SINAI PROFESSOR IN PSYCHIATRIC GENOMICS
CHAIR, DEPARTMENT OF GENETICS AND GENOMIC SCIENCES
PROFESSOR OF GENETIC AND GENOMIC SCIENCES, PSYCHIATRY, AND NEUROSCIENCE
ICAHN SCHOOL OF MEDICINE AT MOUNT SINAI
NEW YORK, NEW YORK

Joseph D. Buxbaum, PhD
PROFESSOR OF PSYCHIATRY, NEUROSCIENCE, AND GENETICS AND GENOMIC SCIENCES
DIRECTOR OF THE SEAVER AUTISM CENTER FOR RESEARCH AND TREATMENT
ICAHN SCHOOL OF MEDICINE AT MOUNT SINAI
NEW YORK, NEW YORK

Eric J. Nestler, MD, PhD
NASH FAMILY PROFESSOR OF NEUROSCIENCE
DIRECTOR OF THE FRIEDMAN BRAIN INSTITUTE
DEAN FOR ACADEMIC AND SCIENTIFIC AFFAIRS
ICAHN SCHOOL OF MEDICINE AT MOUNT SINAI
NEW YORK, NEW YORK

Oxford University Press is a department of the University of Oxford. It furthers
the University’s objective of excellence in research, scholarship, and education
by publishing worldwide. Oxford is a registered trade mark of Oxford University
Press in the UK and certain other countries.

Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America.

4th edition: 2013
3rd edition: 2011
2nd edition: 2004
© Oxford University Press 2018

All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by license, or under terms agreed with the appropriate reproduction
rights organization. Inquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above.

You must not circulate this work in any other form
and you must impose this same condition on any acquirer.

Library of Congress Cataloging-in-Publication Data
Names: Charney, Dennis S., editor. | Sklar, Pamela B., editor. |
Buxbaum, Joseph D., editor. | Nestler, Eric J. (Eric Jonathan), 1954– , editor.
Title: Charney & Nestler’s neurobiology of mental illness /
edited by Dennis S. Charney, Pamela Sklar, Joseph D. Buxbaum, Eric J. Nestler.
Other titles: Neurobiology of mental illness. | Charney and Nestler’s neurobiology of mental illness
Description: Fifth edition. | New York, NY : Oxford University Press, [2018] |
Preceded by Neurobiology of mental illness / edited by Dennis S. Charney ... [et al.].
4th ed. 2013. | Includes bibliographical references.
Identifiers: LCCN 2017046721 | ISBN 9780190681425 (hardcover)
Subjects: | MESH: Mental Disorders—etiology |
Mental Disorders—physiopathology | Mental Disorders—therapy |
Neurobiology
Classification: LCC RC341 | NLM WM 140 | DDC 616.8—dc23
LC record available at https://lccn.loc.gov/2017046721

This material is not intended to be, and should not be considered, a substitute for medical
or other professional advice. Treatment for the conditions described in this material is highly
dependent on the individual circumstances. And, while this material is designed to offer accurate
information with respect to the subject matter covered and to be current as of the time it
was written, research and knowledge about medical and health issues is constantly evolving and
dose schedules for medications are being revised continually, with new side effects recognized and
accounted for regularly. Readers must therefore always check the product information and
clinical procedures with the most up-to-date published product information and data sheets
provided by the manufacturers and the most recent codes of conduct and safety regulation.
The publisher and the authors make no representations or warranties to readers, express or implied, as to
the accuracy or completeness of this material. Without limiting the foregoing, the publisher and
the authors make no representations or warranties as to the accuracy or efficacy of the drug
dosages mentioned in the material. The authors and the publisher do not accept, and expressly
disclaim, any responsibility for any liability, loss or risk that may be claimed or incurred as
a consequence of the use and/or application of any of the contents of this material.

1 3 5 7 9 8 6 4 2
Printed by Sheridan Books, Inc., United States of America

CONTENTS

Contributors

SECTION 1
EMERGING AND ESTABLISHED TECHNOLOGIES
Eric J. Nestler and Karl Deisseroth

1. Genetic Methodologies and Applications
   Shaun M. Purcell
2. Network Methods for Elucidating the Complexity of Common Human Diseases
   Eric E. Schadt
3. The Human Brain and Its Epigenomes
   Andrew Chess and Schahram Akbarian
4. Methods for In Vivo Gene Manipulation
   Lisa M. Monteggia and Wei Xu
5. Application of Stem Cells to Understanding Psychiatric Disorders
   Kristen Brennand
6. Optogenetics and Related Technologies for Psychiatric Disease Research: Current Status and Challenges
   Lief E. Fenno and Karl Deisseroth
7. In Vivo Circuit Analysis
   Ryan Bowman, Hannah Schwennesen, Kafui Dzirasa, and Rainbo Hultman
8. Magnetic Resonance Methodologies
   Peter A. Bandettini and Hanzhang Lu
9. PET Brain Imaging Methodologies
   Ansel T. Hillmer, Kelly P. Cosgrove, and Richard E. Carson
10. Neuromodulation and Psychiatric Disorders
    Wayne K. Goodman and Mark S. George
11. The Neurobiology of Sleep
    Giulio Tononi and Chiara Cirelli

SECTION 2
PSYCHOTIC DISORDERS
Pamela Sklar

12. Diagnosis and Epidemiology of Psychotic Disorders
    Emma Meyer, Julie Walsh-Messinger, and Dolores Malaspina
13. Genetics of Schizophrenia and Bipolar Disorder
    Alexander Charney and Pamela Sklar
14. Neuroimaging of Psychotic Disorders
    Stephan Heckers, Neil Woodward, and Dost Öngür
15. Animal and Cellular Models of Psychotic Disorders
    Mikhail V. Pletnikov, Guo-Li Ming, and Christopher A. Ross
16. Cognitive and Motivational Neuroscience of Psychotic Disorders: Animal and Human Studies
    Jared W. Young, Alan Anticevic, and Deanna M. Barch
17. Synaptic Mechanisms of Psychotic Disorders: Animal and Human Studies
    Seth G. N. Grant
18. Cellular Mechanisms of Psychotic Disorders: Human Studies
    Samuel J. Dienel and David A. Lewis
19. Neurodevelopmental Mechanisms for Psychotic Disorders: Animal and Human Studies
    Nao J. Gamo, Takeshi Sakurai, Hanna Jaaro-Peled, and Akira Sawa
20. The Neurobiology and Treatment of Bipolar Disorder
    Katherine E. Burdick, Luz H. Ospina, Stephen J. Haggarty, and Roy H. Perlis
21. Novel Approaches for Treating Psychotic Disorders
    Tiago Reis Marques and Shitij Kapur
22. Current Treatments for Psychotic Disorders
    Deepak K. Sarpal and Anil K. Malhotra

SECTION 3
DEPRESSION
Helen Mayberg

23. Diagnosis and Epidemiology of Depression
    Nicholas T. Van Dam, Brian M. Iacoviello, and James W. Murrough
24. Genetics of Depression
    Douglas F. Levinson and Walter E. Nichols
25. Neuroimaging of Depression
    Michele A. Bertocci and Mary L. Phillips

26. Animal Models of Mood Disorders
    Lyonna F. Alcantara, Eric M. Parise, and Carlos A. Bolaños-Guzmán
27. Neurotrophic Mechanisms of Depression: Animal and Human Studies
    Ronald S. Duman
28. Immune Mechanisms of Depression
    Caroline Ménard, Madeline L. Pfau, Georgia E. Hodes, and Scott J. Russo
29. Neuroendocrine Mechanisms of Depression: Clinical and Preclinical Evidence
    Jill M. Goldstein, L. Holsen, S. Cherkerzian, M. Misra, and R.J. Handra
30. New Approaches for Treating Depression
    Eric J. Nestler
31. Current Treatments for Depression
    John H. Krystal and Dennis S. Charney

SECTION 4
ANXIETY DISORDERS
Kerry J. Ressler

32. Diagnosis and Epidemiology of Anxiety, Obsessive-Compulsive, and Trauma and Stressor-Related Disorders
    Murray B. Stein, Meghan E. Keough, and Peter P. Roy-Byrne
33. Genetics of Anxiety Disorders
    Takeshi Otowa, Roxann Roberson-Nay, Mandakh Bekhbat, Gretchen N. Neigh, and John M. Hettema
34. Functional Neurocircuitry and Neuroimaging Studies of Anxiety Disorders
    Madeleine S. Goodkind and Amit Etkin
35. Animal Models and Assays Probing Anxiety Related Behaviors and Neural Circuits
    Ramon Tasan and Nicolas Singewald
36. What Are Fear and Anxiety? Listening to the Brain
    Joseph LeDoux
37. Synaptic and Circuit Mechanisms of Anxiety Disorders: Animal and Human Studies
    Anfei Li and Francis S. Lee
38. The Neurobiology of Resilience
    Adriana Feder, Sarah R. Horn, Margaret Haglund, Steven M. Southwick, and Dennis S. Charney
39. Novel Approaches for Treating Anxiety Disorders
    David A. Sturman, Milissa L. Kaufman, Cara E. Bigony, and Kerry J. Ressler
40. Current and Experimental Treatments for Anxiety Disorders
    Adam J. Guastella, Alice Norton, Gail A. Alvares, and Christine Yun Ju Song

SECTION 5
SUBSTANCE USE DISORDERS
Antonelli Bonci and Nora Volkow

41. Epidemiology of Substance Use Disorders
    Denise B. Kandel, Mei-Chen Hu, Pamela C. Griesler, Bradley T. Kerridge, and Bridget F. Grant
42. The Genetic Basis of Addictive Disorders
    David Goldman, Zhifeng Zhou, and Colin Hodgkinson
43. Animal Models of Addiction
    Christopher J. Evans, Brigitte L. Kieffer, David Jentsch, and Rafael J. Maldonado
44. Reward Circuitry and Drug Addiction
    Vaughn R. Steele, Vani Pariyadath, Rita Z. Goldstein, and Elliot A. Stein
45. Molecular Neuroimaging in Addictive Disorders
    Edythe D. London and Chelsea L. Robertson
46. Cellular and Molecular Mechanisms of Addiction
    Kathryn J. Reissner and Peter W. Kalivas
47. Brain Development and the Risk for Substance Abuse
    Mary M. Heitzeg and B.J. Casey
48. Novel Approaches for Treating Addiction
    Jane B. Acri and Phil Skolnick
49. Current Approved Pharmacotherapies for Substance Use Disorders
    Alexis S. Hammond and Eric C. Strain

SECTION 6
DEMENTIA
Alison M. Goate

50. Diagnosis and Epidemiology of Dementia
    William C. Kreisl and Christiane Reitz
51. Genetics of Dementia
    Alan E. Renton and Alison M. Goate
52. Neuroimaging and Cerebrospinal Fluid Biomarkers of Alzheimer’s Disease
    Brian A. Gordon, Stephanie J.B. Vos, and Anne M. Fagan
53. Animal Models of Alzheimer’s Disease
    David Morgan
54. Cellular Mechanisms of Dementia: Animal and Human Studies
    Li Gan
55. Neurobiology of Lewy Body Dementias: Animal and Human Studies
    James E. Galvin and Jose Tomas Bras
56. Neurobiology of FTD: Animal and Human Studies
    Dah-eun Chloe Chung, Jeannette N. Stankowski, and Leonard Petrucelli
57. Current Treatments for Alzheimer’s Disease
    Mary Sano and Judith Neugroschil

SECTION 7
PEDIATRIC PSYCHIATRIC DISORDERS
Joseph D. Buxbaum

58. Diagnosis and Epidemiology of Pediatric Psychiatric Disorders
    Elise B. Robinson, Benjamin M. Neale, and Mark J. Daly
59. Genetics of Pediatric Psychiatric Disorders
    Silvia De Rubeis, M. Pilar Trelles, and Joseph D. Buxbaum
60. Neuroimaging in Pediatric Psychiatric Disorders
    Timothy P.L. Roberts and Luke Bloy
61. Animal and Cellular Models of Pediatric Psychiatric Disorders
    Elodie Drapeau, Hala Harony-Nicolas, and Jacqueline N. Crawley
62. Neurodevelopmental Mechanisms of Pediatric Psychiatric Disorders: Animal and Human Studies
    Silvia De Rubeis, Kathryn Roeder, and Bernie Devlin
63. Neurobiology of Autism Spectrum Disorder and Intellectual Disability: Animal and Human Studies
    Jesse Costales, Silvia De Rubeis, Jennifer Foss-Feig, Patrick R. Hof, Joseph D. Buxbaum, and Alexander Kolezvon
64. Neurobiology of Attention Deficit Hyperactivity Disorder: Animal and Human Studies
    Stephen V. Faraone, Pradeep G. Bhide, and Joseph Biederman
65. The Neurobiology of Tic Disorders and Obsessive-Compulsive Disorder: Animal and Human Studies
    Christopher Pittenger
66. Neurobiology of Eating Disorders: Animal and Human Studies
    Thomas Hildebrandt and Ashley Heywood
67. Novel Approaches for Treating Pediatric Psychiatric Disorders
    Supritha Prasad and Edwin H. Cook, Jr.
68. Current Treatments for Pediatric Psychiatric Disorders
    M. Pilar Trelles, Paige M. Siper, and Dorothy E. Grice

SECTION 8
FUTURE OF PSYCHIATRIC DIAGNOSIS: TOWARD PRECISION PSYCHIATRY
Dennis S. Charney

69. DSM-5 Overview and Goals
    Darrel A. Regier, Sarah E. Morris, and Susan K. Schultz
70. The Present and Future of Psychiatric Diagnosis
    Steven E. Hyman
71. The NIMH Research Domain Criteria Project: Toward Precision Medicine in Psychiatry
    Bruce N. Cuthbert
72. Computational Psychiatry and the Bayesian Brain
    Karl J. Friston and Raymond J. Dolan

Index

CONTRIBUTORS

Jane B. Acri, PhD
Chief, Medication Discovery & Toxicology Branch
Division of Therapeutics & Medical Consequences
National Institute on Drug Abuse
National Institutes of Health
Bethesda, Maryland

Schahram Akbarian, MD, PhD
Friedman Brain Institute
Departments of Psychiatry and Neuroscience
Icahn School of Medicine
Mount Sinai
New York, New York

Lyonna F. Alcantara, MS
Department of Psychology
Texas A&M University
College Station, Texas

Gail A. Alvares, PhD
Brain and Mind Centre
Sydney Medical School
University of Sydney
Sydney, New South Wales, Australia

Alan Anticevic, PhD
Department of Psychiatry
Yale University School of Medicine
NIAAA Center for the Translational Neuroscience of Alcoholism
Abraham Ribicoff Research Facilities
Connecticut Mental Health Center
New Haven, Connecticut

Peter A. Bandettini, PhD
Principal Investigator
National Institutes of Mental Health
Bethesda, Maryland

Deanna M. Barch, PhD
Department of Psychological & Brain Sciences
Department of Psychiatry
Department of Radiology
Washington University in St. Louis
St. Louis, Missouri

Mandakh Bekhbat, BA
Department of Physiology
Emory University School of Medicine
Atlanta, Georgia

Michele A. Bertocci, PhD
Department of Psychiatry
University of Pittsburgh
Western Psychiatric Institute and Clinic
Pittsburgh, Pennsylvania

Pradeep G. Bhide, PhD
Florida State University College of Medicine
Pediatric Psychopharmacology Unit of the Child Psychiatry Service
Tallahassee, Florida

Joseph Biederman, MD
Pediatric Psychopharmacology Unit
Child Psychiatry Service
Massachusetts General Hospital
Harvard Medical School
Boston, Massachusetts

Cara E. Bigony, BA
Department of Psychiatry
McLean Hospital
Harvard Medical School
Belmont, Massachusetts

Luke Bloy, PhD
Lurie Family Foundations MEG Imaging Center
Department of Radiology
Children’s Hospital of Philadelphia
Philadelphia, Pennsylvania

Carlos A. Bolaños-Guzmán, PhD
Department of Psychology
Texas A&M University
College Station, Texas

Antonelli Bonci, MD
Scientific Director
National Institute on Drug Abuse
Baltimore, Maryland

Ryan Bowman
Department of Psychiatry
Duke University School of Medicine
Durham, North Carolina

Jose Tomas Bras, PhD
Department of Molecular Neuroscience
UCL, Institute of Neurology

Kristen Brennand, PhD
Associate Professor
Genetics and Genomic Sciences, Neuroscience, Psychiatry
Icahn School of Medicine at Mount Sinai
New York, New York

Katherine E. Burdick, PhD
Professor of Psychiatry and Neuroscience
Mount Sinai School of Medicine
New York, New York

Richard E. Carson, PhD
Department of Radiology and Biomedical Imaging
Department of Biomedical Engineering
Yale University
New Haven, Connecticut

B.J. Casey, PhD
Professor of Psychology
Yale University
New Haven, Connecticut

Alexander Charney, MD
Instructor, Neuroscience
Icahn School of Medicine at Mount Sinai
New York, New York

S. Cherkerzian, SCD
Departments of Psychiatry and Medicine
Harvard Medical School
Brigham and Women’s Hospital
Connors Center for Women’s Health & Gender Biology
Boston, Massachusetts

Andrew Chess, MD
Department of Developmental and Regenerative Biology
Icahn School of Medicine
Mount Sinai
New York, New York

Dah-eun Chloe Chung, BA
Department of Neuroscience
Mayo Clinic
Neurobiology of Disease
Mayo Clinic Graduate School of Biomedical Sciences
Rochester, Maine

Chiara Cirelli, PhD
Professor, Department of Psychiatry
Neuroscience Training Program
University of Wisconsin-Madison
Madison, Wisconsin

Edwin H. Cook, Jr, MD, IJR
Department of Psychiatry
University of Illinois at Chicago
Chicago, Illinois

Kelly P. Cosgrove, PhD
Department of Radiology and Biomedical Imaging
Department of Psychiatry
Yale University
New Haven, Connecticut

Jesse Costales, MD
Seaver Autism Center for Research and Treatment
Departments of Psychiatry and Neuroscience
Icahn School of Medicine at Mount Sinai
New York, New York

Jacqueline N. Crawley, PhD
Robert E. Chason Endowed Chair in Translational Research
MIND Institute
Professor, Department of Psychiatry and Behavioral Neuroscience
University of California Davis School of Medicine
Sacramento, California

Bruce N. Cuthbert, PhD
Department of Psychiatry
University of Pittsburgh
Pittsburgh, Pennsylvania

Mark J. Daly, PhD
Analytic and Translational Genetics Unit
Massachusetts General Hospital
Program in Medical and Population Genetics
Broad Institute of MIT and Harvard
Boston, Massachusetts

Silvia de Rubeis, PhD
Seaver Autism Center for Research and Treatment
Department of Psychiatry
Icahn School of Medicine at Mount Sinai
New York, New York

Karl Deisseroth, MD, PhD
Howard Hughes Medical Institute
Departments of Bioengineering and Psychiatry
Stanford University
Stanford, California

Bernie Devlin, PhD
Department of Statistics
Carnegie Mellon University
Department of Psychiatry
University of Pittsburgh School of Medicine
Pittsburgh, Pennsylvania

Samuel J. Dienel, MD
Department of Psychiatry
University of Pittsburgh
Pittsburgh, Pennsylvania

Raymond J. Dolan, FRS
Wellcome Trust Centre for Neuroimaging
Institute of Neurology
University College London
London, England, UK

Elodie Drapeau, PhD
Seaver Autism Center for Research and Treatment
Department of Psychiatry
Friedman Brain Institute
Icahn School of Medicine at Mount Sinai
New York, New York

Ronald S. Duman, PhD
Laboratory of Molecular Psychiatry
Departments of Psychiatry and Pharmacology
Yale University School of Medicine
Connecticut Mental Health Center
New Haven, Connecticut

Kafui Dzirasa, MD, PhD
Department of Psychiatry
Duke University School of Medicine
Durham, North Carolina

Amit Etkin, MD, PhD
Associate Professor of Psychiatry and Behavioral Sciences
Stanford University School of Medicine
Stanford, California

Christopher J. Evans, PhD
UCLA Brain Research Institute
University of California, Los Angeles
Los Angeles, California

Anne M. Fagan, PhD
The Knight Alzheimer’s Disease Research Center
Department of Neurology
Washington University in St. Louis
St. Louis, Missouri

Stephen V. Faraone, PhD
Department of Psychiatry
SUNY Upstate Medical University
Center for Brain Repair
Department of Biomedical Sciences
Syracuse, New York

Adriana Feder, MD
Associate Professor of Psychiatry
Associate Director for Research
World Trade Center Mental Health Program
Icahn School of Medicine at Mount Sinai
New York, New York

Lief E. Fenno, MD, PhD
Howard Hughes Medical Institute
Departments of Bioengineering and Psychiatry
Stanford University
Stanford, California

Jennifer Foss-Feig, PhD
Seaver Autism Center for Research and Treatment
Departments of Psychiatry and Neuroscience
Icahn School of Medicine at Mount Sinai
New York, New York

Karl J. Friston, FRS, FMedSci, FRSB
Wellcome Trust Centre for Neuroimaging
Institute of Neurology
University College London
London, England, UK

James E. Galvin, MD, MPH
Comprehensive Center for Brain Health
Charles E. Schmidt College of Medicine
Florida Atlantic University
Boca Raton, Florida

Nao J. Gamo, PhD
Department of Psychiatry and Behavioral Sciences
Johns Hopkins University School of Medicine
Baltimore, Maryland

Li Gan, PhD
Gladstone Institutes
University of California, San Francisco
San Francisco, California

Mark S. George, MD
Departments of Psychiatry, Radiology, and Neuroscience
Medical University of South Carolina
Ralph H. Johnson VA Medical Center
Charleston, South Carolina

Alison M. Goate, D.Phil
Ronald M. Loeb Center for Alzheimer’s Disease
Department of Neuroscience
Icahn School of Medicine at Mount Sinai
New York, New York

David Goldman, MD
Clinical Assistant Professor
Department of Psychiatry
NYU Langone Health
New York University
New York, New York

Jill M. Goldstein, PhD
Departments of Psychiatry and Medicine
Harvard Medical School
Brigham and Women’s Hospital
Connors Center for Women’s Health & Gender Biology
BWH, departments of Psychiatry and Medicine
Boston, Massachusetts

Rita Z. Goldstein, PhD
Departments of Psychiatry & Neuroscience
Icahn School of Medicine at Mount Sinai
New York, New York

Madeleine S. Goodkind, PhD
University of California, Berkeley
Berkeley, California

Wayne K. Goodman, MD
Menninger Department of Psychiatry and Behavioral Sciences
Baylor College of Medicine
Houston, Texas

Brian A. Gordon, PhD
Department of Radiology
The Knight Alzheimer’s Disease Research Center
Washington University in St. Louis
St. Louis, Missouri

Bridget F. Grant, PhD
National Institutes of Health
Bethesda, Maryland

Seth G. N. Grant
Centre for Clinical Brain Sciences
The University of Edinburgh
Edinburgh, Scotland, UK

Dorothy E. Grice, MD
Division of Tics, OCD and Other Related Disorders
Department of Psychiatry
Icahn School of Medicine at Mount Sinai
New York, New York

Pamela C. Griesler
Department of Psychiatry
Mailman School of Public Health
Columbia University School of Medicine
New York, New York

Adam J. Guastella, PhD
Brain and Mind Centre
Sydney Medical School
University of Sydney
Sydney, New South Wales, Australia

Stephen J. Haggarty, PhD
Associate Professor
Department of Neurology
Harvard Medical School
Boston, Massachusetts

Margaret Haglund, MD
Department of Psychiatry & Behavioral Neurosciences
Cedars-Sinai Medical Group
Beverly Hills, California

Alexis S. Hammond, MD, PhD
Behavioral Pharmacology Research Unit
Department of Psychiatry and Behavioral Sciences
Johns Hopkins University School of Medicine
Baltimore, Maryland

R.J. Handra, PhD
Department of Biomedical Sciences
Colorado State University
Fort Collins, Colorado

Hala Harony-Nicolas, PhD
Seaver Autism Center for Research and Treatment
Department of Psychiatry
Friedman Brain Institute
Icahn School of Medicine at Mount Sinai
New York, New York

Stephan Heckers, MD
Director, Vanderbilt Early Psychosis Program
Department of Psychiatry
Vanderbilt University Medical Center
Nashville, Tennessee

Mary M. Heitzeg, PhD
Department of Psychiatry
University of Michigan
Ann Arbor, Michigan

John M. Hettema, MD, PhD
Department of Psychiatry
Virginia Institute for Psychiatric and Behavioral Genetics
Virginia Commonwealth University
Richmond, Virginia

Ashley Heywood, BS
Icahn School of Medicine at Mount Sinai
New York, New York

Thomas Hildebrandt, PsyD
Icahn School of Medicine at Mount Sinai
New York, New York

Ansel T. Hillmer, PhD
Department of Radiology and Biomedical Imaging
Department of Psychiatry
Yale University
New Haven, Connecticut

Georgia E. Hodes, PhD
Fishberg Department of Neuroscience
Friedman Brain Institute
Icahn School of Medicine
Mount Sinai
New York, New York

Colin Hodgkinson, PhD
Section of Human Neurogenetics
National Institute on Alcohol Abuse and Alcoholism
National Institutes of Health
Bethesda, Maryland

Patrick R. Hof, MD
Seaver Autism Center for Research and Treatment
Departments of Psychiatry and Neuroscience
Icahn School of Medicine at Mount Sinai
New York, New York

L. Holsen, PhD
Departments of Psychiatry and Medicine
Harvard Medical School
Brigham and Women’s Hospital
Connors Center for Women’s Health & Gender Biology
BWH, departments of Psychiatry and Medicine
Boston, Massachusetts

Sarah R. Horn
Department of Psychology
University of Oregon
Eugene, Oregon

Mei-Chen Hu, PhD
Associate Research Scientist
Department of Psychiatry
Columbia University Medical Center
New York, New York

Rainbo Hultman, PhD
Department of Psychiatry
Duke University School of Medicine
Durham, North Carolina

Steven E. Hyman, MD
Director, Stanley Center for Psychiatric Research
Broad Institute of MIT and Harvard
Boston, Massachusetts

Brian M. Iacoviello, PhD
Mood and Anxiety Disorders Program
Department of Psychiatry
Icahn School of Medicine
Mount Sinai
New York, New York

Hanna Jaaro-Peled, PhD
Department of Psychiatry and Behavioral Sciences
Johns Hopkins University School of Medicine
Baltimore, Maryland

David Jentsch, PhD
Professor of Psychiatry
Binghamton University, State University of New York
Binghamton, New York

Peter W. Kalivas, PhD
Department of Neurosciences
Medical University of South Carolina
Charleston, South Carolina

Denise B. Kandel, PhD
Professor of Sociomedical Sciences in Psychiatry
Department of Psychiatry & Mailman School of Public Health
Columbia University
New York, New York

Shitij Kapur, FRCPC, PhD, FMedSci
Dean, Faculty of Medicine, Dentistry, and Health Sciences
Assistant Vice-Chancellor (Health)
University of Melbourne
Melbourne, Victoria, Australia

Milissa L. Kaufman, MD, PhD
Department of Psychiatry
McLean Hospital
Harvard Medical School
Belmont, Massachusetts

Meghan E. Keough, PhD
University of Washington, Seattle
Seattle, Washington

Bradley T. Kerridge, MD
National Institutes of Health
Bethesda, Maryland

Brigitte L. Kieffer, PhD
Scientific Director
Douglas Institute Professor
Department of Psychiatry
McGill University Chair
McGill University
Montreal, Canada

Alexander Kolezvon, MD
Seaver Autism Center for Research and Treatment
Departments of Psychiatry and Neuroscience
Icahn School of Medicine at Mount Sinai
New York, New York

William C. Kreisl, MD
Taub Institute for Research on Alzheimer’s Disease and the Aging Brain
Department of Neurology
College of Physicians and Surgeons
Columbia University
New York, New York

John H. Krystal, MD
Departments of Psychiatry and Neuroscience
Yale University School of Medicine
Behavioral Health services
New Haven Hospital
New Haven, Connecticut
Clinical Neuroscience Division
VA National Center for PTSD
VA Connecticut Healthcare System
West Haven, Connecticut

Joseph LeDoux, PhD
New York University
New York, New York

Francis S. Lee, MD, PhD
Sackler Institute for Developmental Psychobiology
Weill Cornell Medical College of Cornell University
New York, New York

Douglas F. Levinson, MD
Professor of Psychiatry
Department of Psychiatry
Stanford University
Palo Alto, California

David A. Lewis, MD
Department of Psychiatry
University of Pittsburgh
Pittsburgh, Pennsylvania

Anfei Li
Sackler Institute for Developmental Psychobiology
Department of Psychiatry
Weill Cornell Medical College of Cornell University
New York, New York

Edythe D. London, PhD
Department of Molecular and Medical Pharmacology
Department of Psychiatry and Biobehavioral Sciences
David Geffen School of Medicine
Brain Research Institute
University of California Los Angeles
Los Angeles, California

Hanzhang Lu, PhD
Professor of Radiology and Radiological Science
Johns Hopkins University School of Medicine
Baltimore, Maryland

Dolores Malaspina, MD
Department of Psychiatry
Columbia University Medical Center
New York, New York

Rafael J. Maldonado, MD, PhD
Department of Experimental and Health Sciences
University Pompeu Fabra
Barcelona, Catalunya, Spain

Anil K. Malhotra, MD
Professor, The Center for Psychiatric Neuroscience
The Feinstein Institute for Medical Research
Director, Psychiatry Research
Zucker Hillside Hospital
Professor, Molecular Medicine and Psychiatry
Hofstra Northwell School of Medicine
New York, New York

Tiago Reis Marques, MD, PhD
Department of Psychosis Studies
King’s College London
London, England, UK

Helen Mayberg, MD
Professor of Psychology, Neurology, and Radiology
Dorothy C. Fucqua Chair
Psychiatric Neuroimaging and Therapeutics
Department of Psychiatry and Behavioral Sciences
Emory University School of Medicine

M. Misra, MD, MPH
Departments of Psychiatry and Medicine
Harvard Medical School
Brigham and Women’s Hospital
Connors Center for Women’s Health & Gender Biology
BWH, departments of Psychiatry and Medicine
Boston, Massachusetts

Lisa M. Monteggia, PhD
Department of Neuroscience
UT Southwestern Medical Center
Dallas, Texas

David Morgan, PhD
CEO, Byrd Alzheimer’s Institute
Distinguished Professor of Pharmacology and Physiology
University of South Florida
Tampa, Florida

Sarah E. Morris, PhD
Chief, Adult Psychopathology and Psychosocial Intervention Development Branch
Associate Head, RDoC Unit
Program Officer, Schizophrenia Spectrum Disorders Program
National Institute of Mental Health
Bethesda, Maryland

James W. Murrough, MD
Mood and Anxiety Disorders Program
Department of Psychiatry
Fishberg Department of Neuroscience
Friedman Brain Institute
Icahn School of Medicine
Mount Sinai
New York, New York

Benjamin M. Neale, PhD
Analytic and Translational Genetics Unit
Massachusetts General Hospital
Atlanta, Georgia Program in Medical and Population Genetics
Broad Institute of MIT and Harvard
Caroline Ménard, PhD Stanley Center for Psychiatric Research
Fishberg Department of Neuroscience Broad Institute of MIT and Harvard
Friedman Brain Institute Boston, Massachusetts
Icahn School of Medicine at
Mount Sinai Gretchen N. Neigh, PhD
New York, New York Departments of Anatomy and Neurobiology
Virginia Commonwealth University
Emma Meyer, MD Richmond, Virginia
Department of Psychiatry
New York University School of Medicine Judith Neugroschil, MD
New York, New York Alzheimer’s Disease Research Center
Icahn School of Medicine at Mount Sinai
Guo-​Li Ming, MD, PhD New York, New York
Johns Hopkins University School of Medicine
Baltimore, Maryland

Walter E. Nichols, MD Mary L. Phillips, MD
Professor in the School of Medicine Department of Psychiatry
Department of Psychiatry University of Pittsburgh
Program on the Genetics of Brain Function Western Psychiatric Institute and Clinic
Stanford University Pittsburgh, Pennsylvania
Palo Alto, California
Christopher Pittenger, MD, PhD
Alice Norton, PhD Department of Psychiatry
Brain and Mind Centre Yale University
Sydney Medical School New Haven, Connecticut
University of Sydney
Mikhail V. Pletnikov, MD, PhD
Sydney, New South Wales,
Johns Hopkins University School of Medicine
Australia
Baltimore, Maryland
Dost Öngür, MD, PhD
Supritha Prasad, IJR
Chief, Psychotic Disorders Division
Department of Psychiatry
Director, Schizophrenia and Bipolar
University of Illinois at Chicago
Disorder Research Program
Chicago, Illinois
McLean Hospital
Associate Professor of Psychiatry Shaun M. Purcell, PhD
Harvard Medical School Associate Professor, Psychiatry
Boston, Massachusetts Associate Professor, Genetics and Genomic Sciences
Icahn School of Medicine at Mount Sinai
Luz H. Ospina, MA, PhD
New York, New York
Icahn School of Medicine
at Mount Sinai Darrel A. Regier, MD, MPH
New York, New York Center for the Study of Traumatic Stress
Department of Psychiatry
Takeshi Otowa, MD, PhD
Uniformed Services University
Graduate School of Clinical Psychology
Bethesda, Maryland
Teikyo Heisei University
Tokyo, Japan Kathryn J. Reissner, PhD
Department of Psychology & Neuroscience
Eric M. Parise, PhD
University of North Carolina at Chapel Hill
Fishberg Department of Neuroscience
Chapel Hill, North Carolina
The Mount Sinai School of Medicine
New York, New York Christiane Reitz, MD, PhD
Taub Institute for Research on Alzheimer’s Disease and the
Vani Pariyadath, PhD
Aging Brain
National Institute on Drug Abuse
Department of Neurology
Bethesda, Maryland
Gertrude H. Sergievsky Center
Roy H. Perlis, MD, MSc Department of Epidemiology
Professor of Psychiatry Mailman School of Public Health
Harvard Medical School College of Physicians and Surgeons
Director, Center for Experimental Drugs Columbia University
and Diagnostics New York, New York
Center for Genomic Medicine
Alan E. Renton, PhD
Massachusetts General Hospital
Ronald M. Loeb Center for Alzheimer’s Disease
Boston, Massachusetts
Department of Neuroscience
Leonard Petrucelli, PhD Icahn School of Medicine at Mount Sinai
Department of Research, Neuroscience New York, New York
Mayo Clinic College of Medicine Kerry J. Ressler, MD, PhD
Jacksonville, Florida Department of Psychiatry
Madeline L. Pfau, PhD McLean Hospital
Fishberg Department of Neuroscience Harvard Medical School
Friedman Brain Institute Belmont, Massachusetts
Icahn School of Medicine Department of Psychiatry and Behavioral Sciences
Mount Sinai Emory University School of Medicine
New York, New York Atlanta, Georgia

Roxann Roberson-​Nay, PhD Deepak K. Sarpal, MD
Departments of Psychiatry and Psychology Assistant Professor of Psychiatry
Virginia Institute for Psychiatric and Behavioral Department of Psychiatry
Genetics University of Pittsburgh
Virginia Commonwealth University Pittsburgh, Pennsylvania
Richmond, Virginia Akira Sawa, MD, PhD
Timothy P.L. Roberts, PhD Department of Psychiatry and Behavioral Sciences
Lurie Family Foundations MEG Johns Hopkins University School of Medicine
Imaging Center Baltimore, Maryland
Department of Radiology Eric E. Schadt, PhD
Children’s Hospital of Philadelphia Department of Genetics and Genomic Sciences
Philadelphia, Pennsylvania Mount Sinai School of Medicine
Chelsea L. Robertson, PhD New York, New York
Department of Molecular and Susan K. Schultz, MD DFAPA
Medical Pharmacology Geriatric Psychiatry, James A. Haley Veterans Hospital
Department of Psychiatry and Professor of Psychiatry, Courtesy
Biobehavioral Sciences University of South Florida College of Medicine
David Geffen School of Medicine Adjunct Professor of Psychiatry
University of California Los Angeles University of Iowa Carver College of Medicine
Los Angeles, California Tampa, Florida
Elise B. Robinson, ScD Hannah Schwennesen, MD
Analytic and Translational Genetics Unit Department of Psychiatry
Massachusetts General Hospital Duke University School of Medicine
Boston, Massachusetts Durham, North Carolina
Kathryn Roeder, PhD Nicolas Singewald, PhD
Computational Biology Department Department of Pharmacology and Toxicology
Carnegie Mellon University Inst. Pharmacy and CMBI
Pittsburgh, Pennsylvania University of Innsbruck
Christopher A. Ross, MD Innsbruck, Austria
Johns Hopkins University School Paige M. Siper, PhD
of Medicine Seaver Autism Center for Research and Treatment
Baltimore, Maryland Department of Psychiatry
Peter P. Roy-​Byrne, MD Icahn School of Medicine at Mount Sinai
Professor Emeritus, Department of Psychiatry New York, New York
University of Washington School of Medicine Phil Skolnick, PhD, DSC (Hon)
Seattle, Washington Director
Scott J. Russo, PhD Division of Therapeutics & Medical Consequences
Fishberg Department of Neuroscience National Institute on Drug Abuse
Friedman Brain Institute National Institutes of Health
Icahn School of Medicine Bethesda, Maryland
Mount Sinai Christine Yun Ju Song, PhD
New York, New York Brain and Mind Centre
Takeshi Sakurai, MD, PhD Sydney Medical School
Department of Drug Discovery Medicine University of Sydney
Medical Innovation Center Sydney, New South Wales, Australia
Kyoto University Graduate School of Medicine Steven M. Southwick, MD
Kyoto, Japan Glenn H. Greenberg Professor of Psychiatry
Mary Sano, PhD Yale University School of Medicine
Alzheimer’s Disease Research Center New Haven, Connecticut
Icahn School of Medicine at Mount Sinai Jeannette N. Stankowski, PhD
New York, New York Department of Neuroscience
James J. Peters VAMC Mayo Clinic College of Medicine
Bronx, New York Jacksonville, Florida

Vaughn R. Steele, PhD Nicholas T. Van Dam, PhD
Neuroimaging Research Branch Mood and Anxiety Disorders Program
National Institute of Drug Abuse Department of Psychiatry
Intramural Research Program Icahn School of Medicine
National Institutes of Health Mount Sinai
Baltimore, Maryland New York, New York
Elliot A. Stein, PhD Nora Volkow, MD
Neuroimaging Research Branch Senior Investigator, Laboratory of Neuroimaging
National Institute of Drug Abuse National Institute on Alcohol Abuse and Alcoholism
Intramural Research Program Director, National Institute on Drug Abuse
National Institutes of Health Rockville, Maryland
Baltimore, Maryland
Stephanie J.B. Vos, PhD
Murray B. Stein, MD, MPH, FRCPC Department of Psychiatry and Neuropsychology
Distinguished Professor, Psychiatry Alzheimer Center Limburg
Distinguished Professor, Family Medicine School for Mental Health and Neuroscience
and Public Health Maastricht University
Vice Chair for Clinical Research in Psychiatry Maastricht, the Netherlands
University of California, San Diego
Julie Walsh-​Messinger, MA, PhD
San Diego, California
Assistant Professor
Eric C. Strain, MD Department of Psychiatry
Behavioral Pharmacology Research Unit University of Dayton
Department of Psychiatry and Behavioral Sciences Dayton, Ohio
Johns Hopkins University School of Medicine
Neil Woodward, PhD
Baltimore, Maryland
Vanderbilt Early Psychosis Program
David A. Sturman, MD, PhD Department of Psychiatry
Department of Psychiatry Vanderbilt University Medical Center
McLean Hospital Nashville, Tennessee
Harvard Medical School
Wei Xu, PhD
Belmont, Massachusetts
Department of Neuroscience
MGH/​McLean Adult Psychiatry Residency Program
UT Southwestern Medical Center
Harvard Medical School
Dallas, Texas
Boston, Massachusetts
Jared W. Young, PhD
Ramon Tasan, PhD
Department of Psychiatry
Department of Pharmacology
University of California San Diego
Medical University Innsbruck
La Jolla, California
Innsbruck, Austria
Desert-​Pacific Mental Illness Research Education and
Giulio Tononi, MD, PhD Clinical Center
Professor, Department of Psychiatry VA San Diego Healthcare System
Neuroscience Training Program San Diego, California
University of Wisconsin-​Madison
Zhifeng Zhou, PhD
Madison, Wisconsin
Section of Human Neurogenetics
M. Pilar Trelles, MD National Institute on Alcohol Abuse and Alcoholism
Seaver Autism Center for Research and Treatment Bethesda, Maryland
Department of Psychiatry
Icahn School of Medicine at Mount Sinai
New York, New York

SECTION 1

EMERGING AND ESTABLISHED TECHNOLOGIES

1.
GENETIC METHODOLOGIES AND APPLICATIONS
Shaun M. Purcell

INTRODUCTION

The past decade has witnessed tremendous advances in the molecular technologies and data-analytic methods at our disposal for studying the genetic bases of complex diseases and traits. These advances have enabled the creation of comprehensive catalogs of different forms of human genetic variation, as well as large-scale studies focused on specific diseases or traits. In this chapter, we outline the general principles behind some of these advances and discuss their application to studying complex traits, with a focus on neuropsychiatric disease.

MOTIVATIONS FOR MAPPING THE GENETIC BASIS OF DISEASE

Genetic epidemiology is fundamentally concerned with relating genotype (i.e., variation between individuals’ genomes) to phenotype (i.e., the presence or absence of a disease, or measure of a trait such as height or cholesterol level) (Altshuler et al., 2008). There are a number of relatively distinct motivations for this work, which can be conceived of both in terms of proximal and distal goals of the research. Recently, there has been a great deal of focus on identifying specific alleles (variable forms of a locus, which is a gene or region) that “explain the heritability” as a primary benchmark and major goal of genetic studies, as discussed later. For many downstream applications, however, perhaps an equally important, but distinct, proximal goal of genetics is to point to the genes and/or gene networks that are causally associated with disease.

Following from these proximal goals (identifying the specific alleles that explain heritability and identifying the relevant genes and pathways) there are several distinct, more distal goals or applications, the success of which will depend on different aspects of the genetic discoveries made. In theory, understanding the genetics of a disease could be used for risk prediction, either at the population level or within families (following the model of genetic counseling for Mendelian disease); for prediction of disease course, severity, or drug response in affected individuals; to identify targets for drug discovery research; to inform on the relationships and comorbidities between different diseases; or even to provide a framework for causal inference around environmental effects (Smith and Ebrahim, 2003). More generally, advances in understanding disease genetics will ultimately, but undoubtedly, provide fundamental insights into human biology, development, and evolution. However, the ease with which genetics will achieve success in these various applications relates to different aspects of the unknown, underlying genetic architecture of any particular disease or trait.

The question of the genetic architecture of common disease has been a central one: it relates to the types of approaches that will work best to map genes, as well as to what we can expect to learn from genetic studies in the near future. For a heritable disease, genetic architecture describes how many independent genetic effects contribute to risk, at the level of both the population and the specific individual; it also describes the typical frequency and effect size of these variants, how they combine to produce a phenotype (e.g., additively or interactively), and the extent to which multiple genetic risk factors for a disease coalesce into a smaller number of distinct biological pathways or networks. Other aspects of genetic architecture include the mode of inheritance (e.g., recessive effects), the presence of positive, negative, or balancing selection acting on risk variants, the extent to which genetic effects are shared (or contribute to different disease rates) across populations, the extent to which variants influence multiple outcomes through pleiotropy (one gene having multiple downstream effects), and the extent to which genetic effects are moderated by environmental exposures (gene–environment interaction).

The success of risk prediction, for example, in the general population will be crucially dependent on the proportion of variance explained by detected variants, which is a function of both the frequency and penetrance (a measure of effect size that equals the chance that a carrier develops disease) of risk alleles. By learning which specific alleles (the particular variants of genes) increase or decrease risk or type or course of disease, one can in theory predict an individual’s risk or provide tailored medical treatment to patients based on their genotype. In practice, truly personalized genomic medicine is still only a long-term goal in most instances rather than a current or imminent reality, although this is likely to be an area of great progress over the coming decade.

However, inasmuch as the distal goals relate to identifying loci, to point to potential drug targets, for example, the extent to which detected variants account for heritability might not be critically relevant: for instance, there are multiple examples of genetic studies that have pointed to weak genetic effects in genes that are already known targets of existing, successful therapies.

Thus, genetic studies have a parallel set of aims that are almost orthogonal to the goal of explaining variability in a population, involving the identification of the networks of genes implicated in disease. Here the aim is to use this information to point to the biological mechanisms involved in disease pathogenesis.

CLASSICAL GENETIC EPIDEMIOLOGY: FROM FAMILY STUDIES, SEGREGATION, AND LINKAGE ANALYSIS TO LINKAGE DISEQUILIBRIUM MAPPING

Classical genetic epidemiology posed a series of increasingly specific questions: For a particular disease or trait, are there genetic influences? Is the genetic basis simple or complex? Where are those genes located? Which specific forms of the gene cause disease? The tools to answer these questions were, respectively, family and twin studies, segregation analysis, linkage analysis, and association analysis. Twin and family studies are used primarily to estimate the heritability of a trait (the extent to which variation in outcome is due to variation in genes) by contrasting the phenotypic similarity of relatives of differing genetic similarity. More recently, twin and family study designs have also proved useful in molecular studies of genetic and epigenetic variation (van Dongen et al., 2012). One notable family study of schizophrenia and bipolar disorder involved tens of thousands of patients from Sweden and showed clear evidence for a shared genetic basis common to both disorders (Lichtenstein et al., 2009). Looking at a range of first-degree relative classes, such studies estimate the probandwise concordance rate (the probability an individual develops disease given they have an affected relative of a particular type) and the familial relative risk (λ), which, for a given class of relative, is the concordance rate divided by the population prevalence of disease. Both approaches ask how much more likely an individual is to develop disease if he or she has an affected relative. Estimates of λ for MZ twins, full siblings, parent–offspring pairs, and half-siblings track strongly with the extent of genetic similarity in those pairs, indicative of a considerable genetic basis for these diseases. This and other studies put the heritability of schizophrenia to be very high, with estimates from 60% to 80%, for example.

Segregation analysis considers the broader pattern of disease within larger pedigrees. For Mendelian disease, segregation analysis can estimate whether there is likely to be a single disease allele in each family, and if so, its mode of inheritance. For complex diseases that are caused by multiple genes and environmental influences, segregation analysis is typically uninformative (beyond demonstrating above-chance levels of familial clustering). Linkage analysis also uses pedigrees to identify (very broad) chromosomal loci that cosegregate with disease in a particular family. Linkage analysis primarily gained popularity after the introduction of molecular marker maps in the 1980s. For example, by genotyping 300–400 “microsatellite” markers (short tandem repeats that vary in length between individuals), one can infer the pattern of gene flow in a family (specifically, of shared chromosomal regions coinherited from a single ancestor and so identical-by-descent, IBD) and then search for chromosomal positions at which the profile of IBD maximally correlates with the coinheritance pattern of the phenotype. Linkage analysis proved spectacularly useful in mapping Mendelian disease genes of major effect: rare mutations that almost always lead to correspondingly rare diseases. In contrast, for complex common diseases, linkage analysis has yielded very few durable results (for neuropsychiatric disease, one notable exception is the DISC1 locus). This is, in large part, because linkage analysis has low power to detect variants of only modest effect. Given that it has, in fact, been widely applied for many complex diseases, including schizophrenia, the failure of linkage analysis suggests that the genetic architecture of most common diseases is unlikely to contain any real “hotspots”: genes or loci at which a sizeable proportion of cases carry a highly (or even moderately) penetrant risk variant.

Association analysis (or linkage disequilibrium mapping) has replaced linkage analysis as the workhorse of genetic epidemiology over the past decade. Association analysis is conceptually straightforward: typically in populations of unrelated individuals, association analysis simply looks for specific variants (alleles) that are significantly more frequent in people with the disease compared with those without. Compared with linkage analysis, this approach is more powerful to detect variants of smaller effect (Risch and Merikangas, 1996).

To contrast the effect sizes expected for a “major gene” disorder versus a complex, common disease, consider that for a rare disease, say, affecting 1 in 10,000 individuals, a major gene effect may increase risk more than 10,000-fold: for example, if baseline risk in noncarriers of the gene is 0.00003, then the penetrance (risk of disease given genotype) would be 30% or more. In this scenario, even though the gene is not completely Mendelian (deterministic in its effect), a very large proportion (more than one third) of carriers will develop the disease. Conversely, a very large proportion of all affected individuals will carry that particular disease allele (again, more than one third). In comparison, for a common disease with a population prevalence of 1 in 100 individuals, researchers expect effect sizes for common alleles to be at most 1.2-fold, rather than 10,000-fold increases in risk. If a 1.2-fold risk allele has a population frequency of, say, 40%, it implies that carriers have ~1.2% risk of developing disease, and we would expect to see the allele in ~44% of cases compared to ~40% of unaffected individuals. This relatively small difference means that the variant is harder to detect statistically. It also means that this allele, by itself, will have very little predictive utility: in other words, knowing an individual’s genotype at this locus would only marginally improve one’s ability to predict whether or not the individual will develop disease. Of course, for a heritable disease we would expect many such loci to contribute to disease risk, which could be informative for prediction if analyzed collectively.

Historically, the principal limitation in applying association analysis broadly was that testing a specific marker for association only queries a tiny proportion of the total extent of variability that exists genome-wide. This arises from the properties of linkage disequilibrium in human populations, as described later. In contrast, linkage analysis only requires a relatively modest number of molecular markers to provide genome-wide surveys of gene flow within families, albeit very

low-resolution ones (because very large chunks of chromosome are shared between closely related individuals). For association studies, it became apparent that hundreds of thousands of markers would be needed to cover the whole genome and capture the majority of common variation. In practice, for a long time this meant that association analysis was limited to testing a small number of variants in a small number of candidate genes. Candidates were usually selected on the basis of prior knowledge, or assumptions, about the pathophysiology of disease. In neuropsychiatric genetics, despite a considerable body of work, studies of candidate genes largely failed to lead to broadly reproducible results. There are multiple reasons to explain this state of affairs (reviewed by Kim et al., 2011). Perhaps most obviously, many of the original hypotheses about the disease may have been incorrect, or at least fundamentally incomplete descriptions of a much more complex process. For a number of diseases such as Type II diabetes and Crohn’s disease, the biology pointed to by recent, robust genetic findings from genome-wide association studies (described later) has often been at odds with the prior assumptions about what would be genetically important. Of course, this is actually a good thing from the perspective of genetic studies, inasmuch as we strive for genetics to be a source of novel insights and hypotheses.

Typically, error rates in candidate gene studies were high, too: false positives (Type I errors in hypothesis testing) were hard to control, given varying degrees of multiple testing, and false negatives (Type II errors) were also likely as sample sizes used for most candidate gene studies were typically very small by today’s standards. For schizophrenia, as of 2011, 732 autosomal genes had been tested by 1,374 hypothesis-driven candidate gene studies, although most genes were investigated in only one (61%) or two (16%) studies (Kim et al., 2011). Typically no replication was attempted, or it was underpowered, or the statistical evidence was hard to reconcile with the literature. For example, often different markers in the same gene were tested across different studies, or replication was claimed but the direction of effect differed between studies. Furthermore, genetic variation in candidate genes was typically only very poorly captured, even for common variation, often with only one or two markers being genotyped per gene.

EXPANDING KNOWLEDGE OF THE HUMAN GENOME

Reference maps and databases have been critical in many areas of genomics, from the human genome reference sequence itself to maps of coding and other functional elements in the sequence. Equally important for disease and population genetics has been the more recent construction of maps, or catalogs, of observed variation within and between different human populations. The two most notable efforts are the International HapMap project (International HapMap Consortium, 2007) and the 1000 Genomes Project (1000 Genomes Consortium, 2010). The HapMap project employed large-scale genotyping to type almost 4 million known single-nucleotide polymorphisms (SNPs) in 270 individuals of African, Asian, and European ancestry. As well as generating lists of technically validated polymorphic sites along with estimates of allele frequencies in multiple populations, a central aim was to characterize and describe the patterns of correlation between nearby variants, referred to as linkage disequilibrium (LD). As illustrated in Figure 1.1, two or more alleles at nearby sites are said to be in LD if they co-occur more often than expected by chance, that is, more often than if they were inherited independently of each other. In reality, haplotypes (collections of alleles on the same physical stretch of chromosome) are the primary unit of inheritance, not individual alleles. Two alleles on the same haplotype will tend to be either both cotransmitted from parent to offspring, or both untransmitted, thereby inducing a correlation between the alleles at the population level. The further apart two sites physically reside on the chromosome, the more likely it is that they will be separated by a meiotic recombination event. Thus, LD between any two sites tends to “break down,” or be attenuated, over distance. This property can be used to localize genes, in that it implies that two sites that are in LD are also likely to be physically colocated on the same stretch of chromosome. This is the principle behind linkage disequilibrium mapping.

Obtaining genotype data on an individual for two nearby heterozygous sites does not directly reveal the underlying haplotypes carried by that individual, although in families the haplotype can often be inferred straightforwardly. For example, if the individual carries an A/C (heterozygous) genotype for the first site and G/T for the second, there are two possible haplotypic configurations: that the AG haplotype was inherited from one parent and therefore CT from the second, or that AT was inherited from the first and CG from the second. The process of resolving which configuration is more likely is called phasing. As in Figure 1.1, phase is often unambiguous when one studies multiple members of the same family. Alternatively, statistical approaches (based on algorithms such as expectation maximization [EM] or Markov Chain Monte Carlo [MCMC] and population genetic models) can be used to resolve phase in samples of unrelated individuals by considering the observed correlation between sites and treating the unknown phase information statistically in terms of a missing-data problem (Browning and Browning, 2012). In some situations it is also possible to use sequencing to type haplotypes directly, using molecular rather than statistical means, sequencing along the same physical stretch of chromosome.

The actual structure and extent of LD in humans reflects both demographic factors and the history of the population studied and biological properties of the genome, influencing the rate of recombination at particular sites. The typical structure and extent of LD is of critical importance to the implementation of association analysis as applied to large genomic regions. Fundamentally, association mapping (sometimes known as linkage disequilibrium mapping, as previously noted) relies on the fact that by testing a particular variant, one is implicitly testing a host of nearby variants for which the genotyped markers act as proxies, or tags.

The HapMap project provides a comprehensive empirical description of the typical profiles of LD in the populations studied. To a first approximation, patterns of LD can be well characterized by “haplotype blocks,” meaning that there are regions of the genome (very variable in size, but often on the order of 10 to

[Figure 1.1. Panel (a): unphased genotypes and resolved haplotypic phase for a genotyped trio. Panel (b): estimated haplotype frequencies from a population, R² = 0.351.]

Haplotype   Frequency   Expectation under LE
CT          0.000       0.135
AT          0.300       0.165
CG          0.450       0.315
AG          0.250       0.385

Figure 1.1 Linkage disequilibrium and haplotype phasing. (A) Using family information can often resolve phase unambiguously. Here a trio is genotyped for two biallelic SNPs: for the first site, A or C alleles (top genotype in all plots); for the second site, G or T alleles (bottom genotype in all plots). From inspection, the mother necessarily transmits the CG haplotype, implying that the offspring carries AT and CG haplotypes, rather than AG and CT haplotypes. (B) In the absence of family data, it is still possible to estimate haplotype frequencies from genotypes at SNPs in linkage disequilibrium. In this illustrative toy example, the EM algorithm would conclude that the CT haplotype does not exist in this population based on this very small sample of 10, meaning that the two SNPs are in LD (here R² is estimated at 0.351). Individuals would be assigned a combination of AT, CG, and AG haplotypes only, which will be consistent with their SNP genotypes.
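
The haplotype frequencies in Figure 1.1(b) are enough to reproduce both the expectations under linkage equilibrium (LE) and the reported R² of 0.351. The Python sketch below is an illustration only: the function names are hypothetical, and it takes the estimated haplotype frequencies as given rather than running the EM algorithm itself. It uses the standard two-locus definitions, D = f(AT) - p(A)p(T) and R² = D² / [p(A)p(C)p(G)p(T)].

```python
# Estimated haplotype frequencies from Figure 1.1(b).
# First letter: allele at SNP 1 (A or C); second letter: allele at SNP 2 (G or T).
hap_freq = {"CT": 0.000, "AT": 0.300, "CG": 0.450, "AG": 0.250}


def allele_freq(freqs, position, allele):
    """Sum the frequencies of all haplotypes carrying `allele` at `position`."""
    return sum(f for hap, f in freqs.items() if hap[position] == allele)


def ld_summary(freqs):
    """Return (expected frequencies under LE, D, R^2) for two biallelic SNPs."""
    snp1 = {a: allele_freq(freqs, 0, a) for a in "AC"}
    snp2 = {a: allele_freq(freqs, 1, a) for a in "GT"}
    # Under linkage equilibrium a haplotype frequency is the product
    # of the frequencies of the two alleles it carries.
    expected = {hap: snp1[hap[0]] * snp2[hap[1]] for hap in freqs}
    # D is the departure of one observed haplotype frequency from that product.
    d = freqs["AT"] - snp1["A"] * snp2["T"]
    r2 = d ** 2 / (snp1["A"] * snp1["C"] * snp2["G"] * snp2["T"])
    return expected, d, r2


expected, d, r2 = ld_summary(hap_freq)
for hap in hap_freq:
    print(hap, f"observed={hap_freq[hap]:.3f}", f"expected={expected[hap]:.3f}")
print(f"D = {d:.3f}, R^2 = {r2:.3f}")  # D = 0.135, R^2 = 0.351
```

The expected column matches the "Expectation under LE" values in the figure, and the excess of AT and CG haplotypes over those expectations is what the nonzero R² summarizes.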

100 kilobases (kb); 1 kb = 1000 basepairs) in which there is very high LD, meaning that only a small subset of all possible haplotypes (combinations of alleles in that region) are observed in the population. For example, considering 10 SNPs, each with two alleles, there are 2^10 = 1024 possible haplotypes, although under very strong LD we may observe only two or three of these at appreciable population frequencies. These “blocks” are separated by “recombination hotspots,” places in the genome with a historically higher rate of recombination, which acts to reduce LD by separating alleles on the recombinant haplotype. The results from the HapMap helped inform the design of experiments that aimed to intelligently select the smallest possible set of markers necessary to capture, or tag, most of the known common variation in a region. In the 10-SNP example, it may only be necessary to genotype 1 or 2 SNPs, for example, without significant loss of information compared with genotyping all 10.

A common measure of LD in association studies is R², where a value of 0 indicates no LD (two sites are statistically independent) and 1 indicates that one marker is effectively a perfect proxy for the second. An intermediate value, say of 0.8, indicates that one marker, used as a proxy, captures 80% of the information one would obtain by directly genotyping the second marker. If the untyped marker is a causal risk factor for disease, then one may still expect to observe a statistical signal of association (e.g., based on a simple comparison of case and control allele frequencies) at the genotyped marker, albeit one that is attenuated due to incomplete LD. (In fact, to retain equivalent power to detect association at the marker, in this case one would require 1/R² = 1/0.8 or 125% of the sample size compared with typing the causal marker directly.) By estimating the average extent of LD, analyses of HapMap data showed that one could expect to capture the majority of common (typically defined as above 5% minor allele frequency) variation in European and Asian populations at a reasonable level of certainty (e.g., R² > 0.8) by genotyping on the order of 500,000 SNPs genome-wide. This paved the way for the first genome-wide association studies (GWASs), which began typing 100,000–300,000 markers using newly developed, standardized commercial microarrays, soon establishing 500,000–1,000,000 SNPs as routine (Carlson et al., 2004). As described later, association analysis of these datasets has driven many genetic discoveries in the past decade.

Superseding the tagging approach in many respects, the more general approach of imputation leverages the actual HapMap sample data itself to fill in data that are “missing” in a GWAS but present in the HapMap, relying on LD information implicit in the HapMap across all SNPs. Imputation allows researchers to probabilistically assign genotypes for all common HapMap SNPs (over 2 million in the European samples), even if only 500,000 have been directly genotyped in the study, by taking advantage of the redundancy due to LD. One of the major applications of imputation is to facilitate the comparison and aggregation of studies that use different GWAS arrays, by mapping everything to the common set of HapMap SNPs. This also obviates many of the practical difficulties that plagued candidate gene studies, in which different markers were typed in different studies.

The HapMap and GWAS in general are largely focused on assaying only common genetic variation: typically sites at which at least 5% of chromosomes carry an “alternate” allele compared with the reference sequence. The vast majority of

variants that have population frequencies below 1% will not be present in the HapMap or on standard microarrays, so will be effectively invisible to GWAS approaches. A major push in recent years has been to leverage advances in so-called next-generation sequencing (NGS) technologies to build catalogs of lower frequency variation. This technology employs massively parallel approaches to sequence many millions of small fragments of DNA, generating very large numbers of short reads (around 100 bases) that can be mapped back to the reference sequence so that variant sites can be called in an individual. The 1000 Genomes Project (www.1000genomes.org/) has used this technology to sequence the entire genomes of over 1,000 individuals, in order to create maps of known low-frequency variants and reference panels for imputation. Combining publicly available 1000 Genomes data with standard GWAS data, one can reliably impute over 10,000,000 polymorphic sites, many of which are of low frequency (under 1%) and many of which represent potentially functional polymorphisms (e.g., nonsynonymous allelic substitutions in genes, or short insertions and deletions that shift the reading frame of a gene). To measure very rare mutations that are specific to a family or a particular ancestral group that is not represented in the 1000 Genomes data, it will still be necessary to sequence samples directly. But given current cost constraints, the 1000 Genomes data afford a new lease of life for existing GWAS samples. In addition to utility in imputing a good deal of low-frequency variation, these data may be particularly helpful in ascribing a putative function to associated regions or haplotypes, as a consequence of the near-complete ascertainment of all commonly variable sites. Recent efforts such as the Haplotype Reference Consortium (http://www.haplotype-reference-consortium.org) now allow researchers to perform imputation analysis leveraging tens of thousands of reference samples.

Another type of genomic map that has recently been reported, and that will likely play a critical part in both the analysis and interpretation of many genetic studies of disease, comes from the ENCODE project (Encyclopedia of DNA Elements; http://www.genome.gov/10005107). This project aimed to map all functional elements in the human genome sequence beyond protein-coding genes: for example, regions (that may often be cell- and tissue-specific) related to factors such as chromatin structure, methylation, histone modification, sequence-specific transcription factors, and RNA-binding proteins. As many association signals from GWAS fall outside of known protein-coding genes, a more comprehensive annotation and understanding of the full sequence will be important in translating statistical signal into biological knowledge (Degner et al., 2012). Ultimately, a better accounting of the diversity of cell types in humans, and in particular in the brain, will be necessary to fully understand how genes act and how to interpret association signals in concert with single-cell molecular studies.

GENOME-WIDE ASSOCIATION STUDIES

In many respects, the development of reliable, cost-effective, high-throughput genotyping technologies, using microarrays that can simultaneously assay hundreds of thousands of single-nucleotide polymorphisms, has addressed the basic limitations inherent in the early application of association (or linkage disequilibrium) mapping. Because most of the common variation in the genome can be assayed, genetic studies have become fundamentally data-driven enterprises and do not rely on prior biological hypotheses. Order-of-magnitude cheaper per-genotype costs have enabled a large amount of genetic data to be amassed; the use of standardized microarrays (combined with imputation analysis) has also facilitated pooling of data across studies to achieve larger samples through meta-analysis, and therefore greater power, which is vitally important in complex trait genetics (Lohmueller et al., 2003). Also, GWASs generally do a more comprehensive job of capturing common variation in a given gene compared with early candidate-based studies using older genotyping technologies, including capturing the vast amount of variation in flanking intergenic and intronic regions. At the same time, the large multiple-testing burden inherent in GWAS forced investigators to address the issue of false positive rates early and head on. Based on empirical and theoretical considerations, most investigators require a p-value of less than 5 × 10⁻⁸ for an association to be declared genome-wide significant. In a well-controlled study, findings that reach this stringent threshold have been shown to have a very high probability of replicating in subsequent studies.

APPLICATIONS OF GENOME-WIDE ASSOCIATION MAPPING AND ANALYTIC ISSUES

Genome-wide association studies have been very widely adopted for a large number of diseases. One of the pioneering studies, by the Wellcome Trust Case Control Consortium (2007), examined seven diseases and a shared control sample. The U.S. National Human Genome Research Institute (NHGRI) maintains a catalog (www.genome.gov/GWAStudies) of published associations from GWAS for a diverse range of diseases and traits. To date, over 1,600 associations have been published, all meeting the strict threshold of genome-wide significance (Figure 1.2).

For most common diseases, these genome-wide findings likely represent the tip of the iceberg of true common variant associations. In many cases, including for neuropsychiatric disease, there are multiple lines of evidence that point to an abundance of true signals below the formal threshold for genome-wide significance. When looking at many replicated genome-wide significant results, the statistical power to detect them (given their frequency and reported effect size) would typically have been low. (In practice, reported effect sizes are often inflated by the so-called “winner’s curse” effect, meaning that variants detected at strict significance thresholds may have needed the “luck of the draw” from sampling variation to push them over the bar.) Low power a priori implies either that the investigator was extremely lucky (managing to detect one particular


Figure 1.2 The National Human Genome Research Institute GWAS catalog. A list of published GWAS associations (accessed 9/2016) (http://www.ebi.ac.uk/gwas/home). Shaded circles indicate different classes of phenotype. Many of these discovered loci were completely novel.
true positive despite very low chances to do so) or, more parsimoniously, that there must be a substantially larger reservoir of similar effects truly existing, from which this study sampled only a particular subset, in proportion to the statistical power. More directly, one can take sets of independent, subthreshold associations (e.g., SNPs with p-values between 1 × 10⁻⁴ and 5 × 10⁻⁸) and ask whether more than expected are nominally significant in an independent sample (e.g., at P < 0.01 or P < 0.05) or show effects in a consistent direction (above the 50% correspondence of risk versus protective effects expected by chance alone, often referred to as a “sign test”). For many diseases, such analyses strongly support the presence of many subthreshold true associations. Furthermore, approaches such as gene-set enrichment analysis applied to lists of subthreshold associations can be used to indicate whether the genes implicated appear to be a random selection of all genes, as would be expected if the associated regions were, in fact, selected purely by chance, or whether they instead preferentially belong to certain known pathways, or cluster in networks, beyond chance expectation; the latter is consistent with a nontrivial proportion of the associations being true positives. For example, Lango Allen et al. (2010) reported hundreds of variants influencing human height clustered in functionally related pathways. Evidence for a substantial number of likely true subthreshold associations for a given disease can be taken to indicate that larger sample sizes will yield genome-wide significant associations, as more true positives are pushed over the threshold.

Other studies have taken more direct approaches to address the idea of highly polygenic disease architectures (i.e., involving hundreds or thousands of distinct genetic loci). In particular, analyses of common variants in GWAS data for various highly heritable phenotypes, including height (Yang et al., 2010) and schizophrenia (International Schizophrenia Consortium, 2009), have indicated that a sizeable proportion of the total heritability may be due to the combined action of extremely modest effects across many loci (many of which may never be expected to rise to the level of genome-wide significance even in very large samples). Under such models it is likely unrealistic to ever expect a “complete” genetic model of a disease in the sense of accounting for all risk genes and alleles. Nonetheless, it is important to note that although very high polygenicity reduces power to unambiguously detect any one particular variant, it does not by itself preclude progress toward the broader goals of genetic studies, namely, the identification of critical biological pathways and networks and even individual risk prediction and personalized therapies.

Table 1.1 gives concrete numbers for the sample sizes required under different genetic models, for both common and rare variants of varying effect sizes. Given the large sample sizes indicated in Table 1.1 for the type of variant that characterizes most “GWAS hits,” meta-analysis (or combined, mega-analysis) has played an increasingly important role in genetic disease studies, in which consortia of studies (and then consortia of consortia) pool results or raw genotype data to collectively achieve greater power to detect variants of small effect.

Although it has become clear that Type II errors (false negatives) are the primary hurdle in GWAS (low power to detect small effects), there has also been considerable attention to the issue of Type I errors (false positives). At the dawn of the GWAS era, many researchers were reasonably concerned that the massive multiple testing, as well as the scope for bias from technical artifact or epidemiological confounding, would lead to hopelessly inflated false-positive signals. Given that most GWASs have been population-based (utilizing samples of unrelated cases and controls) as opposed to family-based, one concern was that population stratification might give rise

Table 1.1 SAMPLE SIZES REQUIRED (CASE/CONTROL PAIRS FOR A 1% DISEASE) UNDER DIFFERENT GENETIC MODELS

CAUSAL ALLELE      GENOTYPED MARKER                  REQUIRED SAMPLE SIZE FOR 80% POWER
MAF     GRR        MAF     R²                        α = 0.05      α = 5 × 10⁻⁸

0.40    1.2        (Directly typed causal allele)    949           4,792
0.40    1.2        0.50    0.67                      1,400         7,064
0.40    1.2        0.10    0.17                      5,880         29,668
0.01    3.0        (Directly typed causal allele)    410           2,070
0.01    3.0        0.50    0.01                      21,213        107,030
0.01    3.0        0.10    0.09                      2,533         12,780

Contrasting power under two particular scenarios, involving a common and a low-frequency variant. Power was calculated using the Genetic Power Calculator (http://pngu.mgh.harvard.edu/purcell/gpc/); the table shows the number of case/control pairs required to achieve 80% power (i.e., an 80% chance of correctly rejecting the null hypothesis when the SNP truly has an effect) for two significance thresholds: a nominal 0.05, and genome-wide significant 5 × 10⁻⁸. (These α values represent the chance of a false-positive test result.) The two causal scenarios are not intended to be directly comparable; rather, the numbers presented are meant to show the impact of requiring a strict significance threshold on required sample size, and the impact of incomplete LD (by genotyping a marker instead of directly genotyping the causal variant) under the two scenarios. Aside from the fact that, in general, large sample sizes are required for these types of effects, we see in particular that if the marker has a frequency very different from the causal variant, the R², which is always set at the highest possible value given the two allele frequencies, will necessarily be low, and therefore power will be low and the sample size required to achieve 80% power will be high. The first scenario represents the type of SNP we may expect to find in a GWAS; the second scenario represents a (perhaps optimistically large) effect as one might hope to see in exome sequencing or an exome array study.
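
The R² column of Table 1.1 can be checked by hand, because the note states that R² is set to the highest value the two allele frequencies allow. The Python sketch below is a minimal illustration with hypothetical function names. It assumes both frequencies are minor allele frequencies (at most 0.5), in which case the maximum is reached when the rarer allele always sits on haplotypes carrying the other minor allele. It also restates the 1/R² rule of thumb from the text; the sample sizes in the table itself come from the Genetic Power Calculator and are not recomputed here.

```python
def max_r2(maf_causal, maf_marker):
    """Largest possible R^2 between two biallelic sites, given their MAFs.

    Assumes both inputs are minor allele frequencies (<= 0.5).
    """
    p_low, p_high = sorted((maf_causal, maf_marker))
    # Maximum D is p_low * (1 - p_high); dividing D^2 by the product of all
    # four allele frequencies simplifies to the ratio below.
    return (p_low * (1 - p_high)) / (p_high * (1 - p_low))


def sample_size_inflation(r2):
    """Approximate factor by which sample size must grow when a marker in
    incomplete LD (R^2 < 1) is typed instead of the causal variant."""
    return 1.0 / r2


# (causal MAF, marker MAF) pairs from the indirectly typed rows of Table 1.1.
table_rows = [(0.40, 0.50), (0.40, 0.10), (0.01, 0.50), (0.01, 0.10)]
for causal, marker in table_rows:
    print(f"causal MAF {causal:.2f}, marker MAF {marker:.2f}: "
          f"max R^2 = {max_r2(causal, marker):.2f}")

# The worked example from the text: R^2 = 0.8 implies 125% of the sample size.
print(f"inflation at R^2 = 0.8: {sample_size_inflation(0.8):.2f}")
```

The four printed values (0.67, 0.17, 0.01, 0.09) agree with the R² column, which makes concrete why a marker whose frequency differs greatly from that of the causal variant can never be a good proxy for it.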

to false positives. If cases and controls are not well matched for ancestry, because different populations systematically vary in allele frequency at many sites across the genome for reasons unrelated to the disease being studied, this could induce spurious associations. In contrast, association analyses that adopt a family-based approach (e.g., the transmission disequilibrium test, or TDT, which tests for overtransmission of a specific allele from heterozygous parents to affected offspring) implicitly guard against such confounding effects (e.g., by contrasting transmitted versus untransmitted alleles from within the same parent, in the case of the TDT). In practice, the presence of genome-wide genotypic data allows one to empirically assess the presence of heterogeneity in ancestry in a sample of individuals (Rosenberg et al., 2002) and to correct for it statistically in tests of association (using approaches such as principal components analysis). Although most GWASs have been conducted in populations of European descent, there is potentially a lot to be learned from application to a more diverse range of populations, along with new analytic challenges, for example, in highly admixed populations (Rosenberg et al., 2010). Quality control procedures play an important role in GWAS: for example, testing for deviations from Hardy-Weinberg equilibrium, or detecting SNPs with particularly high rates of failed genotyping. GWAS can still be prone to false positives from technical bias or other types of analytic error, simply by virtue of the large number of tests performed: in large part, this concern is addressed by placing a strong emphasis on the need to seek replication of any putative signals in independent samples.

Although one can, in theory, approach the analysis of genotype–phenotype relations using GWAS data in a number of ways, in practice most substantive findings (as represented in the NHGRI catalog) come from simple, sequential tests of one SNP (either imputed or directly genotyped) at a time. Typically, a technique such as logistic or linear regression is employed, assuming a purely additive dosage model at each site. Simpler alternatives include the Armitage trend test or Fisher’s exact test; more complex alternatives include nonparametric regression models, linear mixed models, and Bayesian approaches. In broad terms, however, it does not appear that the precise choice of statistical machinery employed has altered the general trends of results and substantive conclusions to date.

Subsequent chapters summarize the results from GWAS and other types of genetic studies for a range of neuropsychiatric diseases. Compared with certain other common diseases such as Crohn’s disease or type 1 diabetes, there arguably has been less “low-hanging fruit” to emerge from neuropsychiatric GWAS. Nonetheless, numerous genome-wide significant hits have now been reported, particularly for schizophrenia and bipolar disorder. Also, as noted, consideration of subthreshold results strongly suggests that more are to be expected with larger sample sizes. Most notably, the Psychiatric Genomics Consortium (PGC) reported a combined GWAS of over 35,000 schizophrenia patients and 100,000 controls that detected 108 independent genome-wide significant associated loci (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014). More recent work has begun to unpick the underlying causal allelic architecture at individual loci and its relation to disease mechanisms, notably for the complement component 4 (C4) gene and its impact on synaptic pruning (Sekar et al., 2016).

POLYGENIC ANALYSES WITHIN AND BETWEEN TRAITS AND DISEASES

One interesting class of analytic approach to emerge from GWAS focuses on genome-wide patterns of variation in order to make inferences about genetic architecture for individual traits or diseases, and also the genetic overlap between different traits or diseases. Variance components models, implemented in the GCTA package (Yang et al., 2013), can estimate the heritability of a trait without using traditional family-based samples. Using SNP data to infer the degree of distant genetic relatedness between all pairs of individuals in a population-based sample, the same underlying logic of the classical twin study is applied: individuals who are genetically more similar should also be phenotypically more similar if the trait is heritable. Instead of comparing pairs sharing either 100% or 50% of their genome, as identical and fraternal twins do, this approach considers large numbers of individuals that perhaps share ~0.5%, as estimated from the SNP data. Nonetheless, in large samples, using linear mixed models one can estimate heritability arising from the shared SNPs. These models can be extended to consider multiple diseases and to estimate the genetic correlations between them, which represent the proportion of genetic effects that are shared by two diseases. Other approaches to studying pleiotropic gene effects include using polygenic risk scores (International Schizophrenia Consortium, 2009) and LD-score regression (Bulik-Sullivan et al., 2015b). These approaches have demonstrated, for example, that schizophrenia and bipolar disorder are highly genetically correlated (Cross-Disorder Group of the Psychiatric Genomics Consortium, 2013) and are now being routinely applied to broader panels of phenotypes with available GWAS (Bulik-Sullivan et al., 2015a). Studying the genetic, nosological boundaries of disorders will help gene discovery efforts and may lead to a better understanding of the heterogeneity within disorders.

THE FREQUENCY SPECTRUM OF DISEASE ALLELES: MODELS OF RARE AND COMMON VARIATION

Most genetic variation in the human genome is attributable to common polymorphism. For this reason, along with the fact that common SNPs in any one population constitute a relatively limited and easily assayable universe, common variation was an obvious first target for large-scale, genome-wide genetic studies, in the form of SNP-based GWASs. It has, of course, long been recognized that common SNPs are by no means the only class of variation a geneticist may wish to study. Particularly in the context of disease, one can argue (supported by observations in rare, Mendelian disease) that larger types of variant might be more likely to have a strong impact on disease risk, as, unlike SNPs, they impact more than just a single (usually intergenic) nucleotide. Structural variants are one such

class, involving the deletion, duplication, inversion, or translocation of potentially millions of nucleotides. Similarly, evolutionary arguments can be used to suggest that alleles of high penetrance are unlikely to be very common: assuming the disease has had a continued negative impact on fitness over many generations, such alleles would have been selected against. The hypothesis that rare variants may primarily underlie common disease risk, in the same way they do for rare disease, expresses this logic (Cirulli and Goldstein, 2010).

For schizophrenia, examples of very rare structural variants that are large-effect risk alleles were identified over two decades ago, using the classical techniques of cytogenetics and linkage mapping in extended pedigrees. For example, a 1.5–3 Mb microdeletion at 22q11.2 leads to velo-cardio-facial syndrome (VCFS), a phenotypically heterogeneous syndrome, which displays an approximately 30% probability of leading to schizophrenia. Because the deletion occurs at one in ~4,000 live births, this variant is expected to contribute to risk in ~1% of all schizophrenia patients.

A second example of a rare and highly penetrant structural variant is the balanced translocation between 1q42 and 11q14, segregating with major psychiatric disease in a single extended Scottish pedigree and mapped using linkage analysis. One of the translocation’s breakpoints was later shown to disrupt a gene, now known as DISC1, “disrupted in schizophrenia 1” (St. Clair et al., 1990). The success of mapping DISC1 prompted a wave of functional studies to investigate its roles in neurodevelopment, although the precise mechanism by which the translocation acts to increase risk for major psychiatric illness in this family still eludes researchers. Whether or not that mechanism is ever fully understood, many would argue that the finding provides a window into the larger, more complex pathways involved in the disease.

In its extreme form, the multiple rare variant model is taken to mean that although many rare disease variants may exist in a population, most affected individuals will carry only one, which is sufficient to cause their disease; similarly, most unaffected individuals would not be expected to carry any risk alleles. This model is in contrast to the polygenic common variant model, in which both affected and unaffected individuals would be expected to carry many risk alleles: under this model, cases simply carry more of them on average, as a consequence of the increased genetic burden leading to increased risk of disease. The extreme form of the multiple rare variant model essentially recasts a common disease as a collection of multiple, clinically indistinguishable diseases that could in theory also be etiologically distinct in a fundamental manner, but that should often be amenable to the same family-based genetic approaches that worked for Mendelian disease (i.e., if most affected families are, in fact, segregating a single, high-penetrance allele). In practice, extreme forms of the multiple rare variant model are unlikely to be the general rule for any common disease: if linkage analysis has been adequately performed in appropriately sized pedigree collections, this model can already be ruled out.

Perhaps a better default or working model for most common diseases should instead be that multiple variants of varying effect sizes are likely to exist anywhere across the frequency spectrum (Gibson, 2012; Owen et al., 2010). At least for diseases with childhood or early-adult onsets, we would expect selection to constrain alleles of larger effect to have lower population frequencies. Although the exact relationship between frequency and effect size that arises from the action of selection is hard to predict generally, it is safe to conclude that common variants of very large effect are unlikely to exist; otherwise, all combinations of variant will likely occur, in proportion to the frequency spectrum of neutral variation. What may make some diseases, including neuropsychiatric disease, particularly challenging from a genetic perspective is likely to be the sheer number of loci in the genome that, if perturbed by either a rare or common variant, can increase risk for disease. This challenge will be equally pertinent for various study designs, from sequencing to GWAS.

STUDIES OF RARE STRUCTURAL VARIATION: COPY NUMBER VARIANTS AND NEUROPSYCHIATRIC DISEASE

Structural variants, such as the 22q11.2 deletion previously described, have a well-established role in a range of rare disease phenotypes, as well as in genomic alterations that occur in cancers (Mills et al., 2011; Wain et al., 2009). Technologies such as array-CGH (comparative genomic hybridization) are now routinely used in prenatal screening as well as research settings, replacing traditional karyotype techniques for detecting unbalanced chromosomal changes. Rare copy number variants (CNVs, deletions or duplications of genetic material) ranging from 100 kb or less to multiple megabases can also be called from analysis of the same SNP microarrays used in GWASs: this fortuitous fact has meant that relatively large GWAS samples could be assayed for copy number variation.

For autism and schizophrenia (International Schizophrenia Consortium, 2008; Sebat et al., 2007; Pinto et al., 2010, 2014; Perkins et al., 2016), such events clearly play an important role. Several studies have found, in particular, an increased rate of de novo CNVs in both autism and schizophrenia patients: such events will effectively be uncensored with respect to natural selection. The increased rate of de novo mutation in schizophrenia patients is also consistent with epidemiological observations of increased paternal age (as the probability of a germline mutation in the father is known to increase with his age).

Approximately a dozen specific loci have been mapped with high statistical confidence, being likely to harbor CNVs that increase risk for disease (Sullivan et al., 2012; Perkins et al., 2016). Such events are typically large (often impacting dozens of genes), rare in the general population (with a frequency under 1/1000), and are estimated to increase risk for disease by up to tenfold or more. Interestingly, the same CNVs have been shown to increase risk both for autism and schizophrenia as well as other neurodevelopmental and behavioral disorders. In addition, autism and schizophrenia patients show a modest but significant increased burden of rare CNVs across their genomes, again consistent with the high polygenicity of neuropsychiatric disease. For other neuropsychiatric diseases, either the role of CNVs is less pronounced or no relationship has yet been clearly established.
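
The estimate quoted earlier for the 22q11.2 deletion (about 1 in 4,000 live births, roughly 30% of carriers developing schizophrenia, and therefore a contribution to about 1% of cases) can be checked with one line of arithmetic. The Python sketch below is illustrative only. The function name is hypothetical, and the population risk of schizophrenia of about 1% is an assumption introduced here, not a figure stated alongside the estimate.

```python
def fraction_of_cases_carrying_variant(carrier_freq, penetrance, prevalence):
    """Share of all cases who carry the variant.

    carrier_freq: population frequency of carriers
    penetrance:   probability that a carrier develops the disease
    prevalence:   overall probability of disease in the population
    """
    return carrier_freq * penetrance / prevalence


# 1 in ~4,000 births and ~30% penetrance are from the text;
# the 1% prevalence is an assumed value for illustration.
fraction = fraction_of_cases_carrying_variant(
    carrier_freq=1 / 4000, penetrance=0.30, prevalence=0.01
)
print(f"{fraction:.2%} of cases")  # 0.75% of cases
```

Under these inputs the deletion accounts for a little under 1% of cases, in line with the approximate figure given in the text. The same calculation shows why even a highly penetrant variant explains only a small share of a common disease when the variant itself is rare.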

NEXT-GENERATION SEQUENCING TECHNOLOGIES AND MEDICAL SEQUENCING

Next-generation sequencing (NGS), as well as driving large genomics projects such as the 1000 Genomes, has been widely and largely very successfully applied to a host of rare, Mendelian diseases over the past few years. One of the most common applications of NGS to date has been whole-exome sequencing (Bamshad et al., 2011). Here targeted approaches allow investigators to first greatly enrich the pool of DNA fragments to be sequenced for particular regions of interest: in the case of whole-exome sequencing, this involves “capturing” the ~1% of the genome that is known to contain exons of protein-coding genes. This relatively small fraction of the genome can then be sequenced at high depth (i.e., with 20 or more reads spanning most targeted bases) to ensure high sensitivity to detect, if not all, then at least the vast majority of variant (nonreference) sites present in an individual’s exome. In comparison with sequencing the whole genome, exome sequencing is still considerably cheaper per sample, although per base sequenced it is less cost effective. In practice, though, sequence data on the exome is typically more valuable in the sense that any one variant has a higher prior likelihood of being functional, and that one can more readily ascribe and interpret that function in terms of its impact on the resulting gene product and what else is known about that gene (e.g., where it is expressed, what other disorders are associated with mutations in that gene, what other proteins interact with the protein coded by that gene). Perhaps the main drawback with exome sequencing is the expanding definition of what is practically implied by “the exome”: other interesting regions such as regulatory regions near genes, rare transcripts, and noncoding RNAs are typically not captured comprehensively, and this fact alone may for many motivate the move to whole-genome sequencing. The amount of data generated by whole-genome sequencing is orders of magnitude larger than for the exome, and so computational challenges in analyzing and even storing the data become major concerns for large studies.

A typical exome sequencing experiment on one individual currently targets around 200,000 genomic intervals, each usually corresponding to one exon of a protein-coding gene, around 150 bases in length, targeting around 20,000 RefSeq genes and spanning around 30 Mb of genomic sequence. In a high-depth sequencing study, each targeted base is often covered, on average, by as many as 50 to 100 “short reads.” These reads are typically 70–100 bases in length, often physically paired such that any two reads are expected to fall at nearby genomic locations. Variants are discovered by aligning these reads to the reference sequence and looking for differences: this is a technically involved and potentially error-prone procedure, although the informatics for this have improved markedly in the past few years, in no small part driven by large projects such as the 1000 Genomes. From a whole-genome sequencing study, one expects to find something on the order of 3 to 4 million variant sites; from whole-exome sequencing, this figure is typically in the range of 15,000–20,000 (depending on experimental details as well as the ancestry of the sampled individual). When sequencing more than a few individuals, a very large proportion of all sites discovered will be “singletons” (variants observed in only one of the sequenced samples), and most of these will be novel in the sense that they will not have been previously identified and deposited in databases such as dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/), which currently contains around 50 million known variants. This fact alone clearly poses challenges for the analysis of sequence data to map risk alleles for disease. In practice, the rarity of individual variants means that researchers employ a range of methods to statistically aggregate multiple mutations across a particular gene and collectively test them for association with a disease, in so-called gene-based rare variant analysis. Although large studies of thousands of patients and controls are underway, across a range of diseases, unambiguous discoveries are yet to emerge from these studies. For common, complex traits, exome sequencing will be much more challenging than for Mendelian disorders, and very large sample sizes may well be required, as is the case for studies of common variation (Kiezun et al., 2012). With sample sizes still small by GWAS standards, early applications of exome sequencing to schizophrenia have yet to unambiguously pinpoint many specific genes. Consistent with the polygenic models from GWAS, there is nonetheless evidence of an increased burden of rare, damaging mutations across many genes, especially for genes involved in brain development and synaptic function (e.g., Genovese et al., 2016). Recently, great efforts have been made to aggregate data on rare variants across many different exome sequencing studies (Exome Aggregation Consortium, ExAC), which has the potential to empower both clinical and research sequencing studies. The large ExAC reference can help to identify which mutations observed in a given study are more likely to be pathogenic: namely, those that are truly rare in the large reference panel, and those that occur in genes that appear to be intolerant to damaging mutations, based on an analysis of ExAC data (Lek et al., 2016).

Because genotyping technology is still cheaper and more accurate than sequencing, a number of groups have collaborated to create an exome array: a standard SNP microarray using the same technology deployed for GWAS, but that primarily contains approximately 200,000 low-frequency mutations that are nonsynonymous (alter the protein’s amino-acid sequence) and observed in at least two studies (and so represent variants that are segregating in populations at low frequencies, perhaps 0.1%, as opposed to truly “private” mutations that may be specific to single families and may never be seen again). Although comprehensive results from these studies are not yet available, early applications do not suggest that this particular slice of the frequency spectrum of nonsynonymous SNPs plays the major role or completely accounts for any “missing heritability,” however.

Other applications of sequencing to map rare variants for common diseases are using families rather than standard case-control, population-based designs. Families can have a number of advantages: ascertaining families with an unusually high “density” of affected individuals for a given disease increases the probability that a rare highly penetrant variant is present in that family. One can, in principle, use IBD information from linkage analysis to prioritize specific regions of the genome for sequencing or analysis. One can use family information to resolve haplotype phase and to impute sequence data across family members (as related individuals,

by definition, represent different combinations of the same smaller set of “founder” chromosomes). One disadvantage is that for many adult-onset diseases it is far harder to collect intact family collections in large numbers.

Additionally, one can use families to detect new, or de novo, mutations. In neuropsychiatric disease, and particularly autism and schizophrenia, the hypothesis that de novo mutation may play a significant role in disease risk is attractive to many researchers and is supported by the epidemiological observation that affected individuals tend to have older fathers (which is, in turn, known to correlate with increased germline mutation that will be transmitted to offspring). A number of exome sequencing studies using trios (affected offspring and two parents) have been published for these two diseases (Neale et al., 2012; De Rubeis et al., 2014; Iossifov et al., 2014; Sanders et al., 2015; Xu et al., 2012; Fromer et al., 2014). The results to date are interesting and do point to nonrandom networks of genes that are enriched for highly deleterious mutations in patients. At the same time, it does not appear to be the case that a sizeable proportion of affected individuals carry a de novo mutation that is likely to be the sole cause of their disease. In contrast to dozens of genes identified in autism using this approach, for schizophrenia relatively few genes have emerged that are observed to be recurrently hit by de novo mutations across these studies beyond the level expected by chance, again speaking to the very high polygenicity of such diseases. The genes and mutations in specific patients that do emerge from this approach may well be particularly interesting to study, however, in that (because de novo mutations are effectively uncensored with respect to natural selection) they could in theory display a very high penetrance. Such “large-effect” alleles could in many cases be preferable mutations to follow up in functional studies, for example, using induced pluripotent stem cells or animal models.

Figure 1.4 illustrates some of the different genetic designs and technologies currently available for relating DNA

Figure 1.3 Reporting GWAS results: Q-Q, Manhattan, and “regional” plots. These figures are taken from the Psychiatric Genomics Consortium Bipolar Disorder Working Group’s Nature Genetics 2011 report of a mega-analysis of bipolar disorder GWAS data. (A) A so-called “Manhattan plot,” in which individual SNP association statistics are ordered along the x-axis; the p-value is plotted on a −log10(P) scale, so values over 7.3 represent genome-wide significance. (B) The same data are shown in a Q–Q plot (quantile–quantile), which plots the observed statistic (−log10(P)) in rank order against the expected value under the global null hypothesis of no association. Points along the diagonal are therefore consistent with chance. The plots can show evidence of systematic bias (if the entire line grossly departs from the diagonal) or signal that is more likely to be true (if only the top portion of the data does, indicating there are more nominally significant hits than would be expected by chance). (C) A third commonly used plot when reporting GWAS results is a “region” plot. This shows the association statistics in a particular region as well as gives information on the LD (R2) between markers. (Psychiatric GWAS Consortium Bipolar Disorder Working Group, 2011).
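The quantities plotted in panels (A) and (B) can be illustrated with a short sketch. The function names below are hypothetical, and two choices are assumptions rather than anything stated in the caption: the 7.3 cutoff is derived here from the commonly used convention of a 0.05 error rate spread over roughly one million independent tests, and the expected Q-Q quantiles use the (rank − 0.5)/n plotting position, which is only one of several conventions.

```python
import math


def neg_log10(p):
    """Convert a p-value to the -log10(P) scale used on both plots."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p-value must lie in (0, 1]")
    return -math.log10(p)


def genome_wide_threshold(alpha=0.05, n_independent_tests=1_000_000):
    """-log10 of a Bonferroni-style cutoff (assumed: 0.05 / 1e6 = 5e-8)."""
    return neg_log10(alpha / n_independent_tests)


def is_genome_wide_significant(p):
    """True when a SNP would sit above the threshold line on a Manhattan plot."""
    return neg_log10(p) > genome_wide_threshold()


def qq_points(p_values):
    """Pair each observed -log10(P) with its expectation under the global null.

    P-values are sorted ascending; the i-th smallest of n is compared with
    the uniform quantile (i - 0.5) / n (an assumed plotting position).
    Returns a list of (expected, observed) tuples, strongest signal first.
    """
    n = len(p_values)
    points = []
    for rank, p in enumerate(sorted(p_values), start=1):
        expected_p = (rank - 0.5) / n
        points.append((neg_log10(expected_p), neg_log10(p)))
    return points


# Toy illustration with made-up p-values (not real association results).
toy_p_values = [0.5, 0.01, 0.2, 0.9]
for expected, observed in qq_points(toy_p_values):
    print(f"expected={expected:.3f} observed={observed:.3f}")
print(f"threshold on -log10 scale: {genome_wide_threshold():.2f}")
```

Points whose observed value sits well above the expected value only at the top of the ranking correspond to the "signal" pattern described in panel (B); a wholesale shift of all points would instead suggest systematic bias.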

[Figure 1.4: horizontal bar of minor allele frequency, labeled Unique, 1/1,000,000, 1/100,000, 1/10,000, 1/1,000, 1/100, 1/10. Variant classes shown along the bar: de novo mutation; “private” mutations; “singletons”; low-frequency segregating variants; common polymorphism. Study types shown along the bar: exome sequencing in families; exome sequencing in populations; exome array; GWAS.]

Figure 1.4 Summary of genetic study types targeting different intervals of the allelic frequency spectrum. The values along the horizontal bar indicate the minor allele frequency that is targeted by different genetic technologies, from common variation to sequencing for newly arising mutation.
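At the rare end of the spectrum in Figure 1.4, individual variants are usually too infrequent to test one at a time, which is why the text describes aggregating mutations across a gene. The sketch below shows one simple way this could be done: a collapsing test that marks each person as a carrier or noncarrier of any rare variant in a gene and compares cases with controls using a one-sided Fisher exact test. This is only one of the range of methods the text alludes to, not the specific method of any cited study; the function names, the 0.001 frequency cutoff, and the toy genotypes are all illustrative assumptions.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b = carrier and noncarrier cases; c, d = carrier and noncarrier
    controls. Returns P(X >= a) under the hypergeometric null, where X is
    the number of carriers among cases with all margins held fixed.
    """
    n_cases = a + b
    n_carriers = a + c
    total = a + b + c + d
    upper = min(n_cases, n_carriers)
    numerator = sum(
        comb(n_carriers, k) * comb(total - n_carriers, n_cases - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(total, n_cases)


def collapse_carriers(genotypes, mafs, maf_cutoff=0.001):
    """Flag individuals carrying at least one rare allele in the gene.

    genotypes: one list per individual of minor-allele counts (0, 1, or 2),
    with one entry per variant in the gene. mafs: reference minor allele
    frequency of each variant. The cutoff is an assumed example value.
    """
    rare = [j for j, freq in enumerate(mafs) if freq < maf_cutoff]
    return [any(person[j] > 0 for j in rare) for person in genotypes]


def gene_burden_test(case_genotypes, control_genotypes, mafs, maf_cutoff=0.001):
    """Return the 2x2 carrier table and its one-sided Fisher p-value."""
    case_flags = collapse_carriers(case_genotypes, mafs, maf_cutoff)
    control_flags = collapse_carriers(control_genotypes, mafs, maf_cutoff)
    a = sum(case_flags)
    b = len(case_flags) - a
    c = sum(control_flags)
    d = len(control_flags) - c
    return (a, b, c, d), fisher_exact_greater(a, b, c, d)


# Toy data (made up): three variants in one gene, the third being common.
mafs = [0.0005, 0.0002, 0.2]
cases = [[1, 0, 0], [0, 1, 0], [1, 0, 1], [0, 1, 2], [0, 0, 1], [0, 0, 0]]
controls = [[0, 0, 1], [0, 0, 0], [1, 0, 0], [0, 0, 2], [0, 0, 0], [0, 0, 1]]

table, p_value = gene_burden_test(cases, controls, mafs)
print(table, round(p_value, 4))
```

Collapsing to carrier status discards information about how many rare alleles a person carries and treats all rare variants as acting in the same direction, so it is best read as the simplest member of this family of tests.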

variation to phenotype, in relation to the part of the allelic frequency spectrum they are designed to probe. Ultimately, it is likely that approaches that look for convergence of genetic signals across these different studies may be fruitful (Nejentsev et al., 2009).

INTEGRATIVE ANALYSES OF GENETIC NETWORKS AND PATHWAYS

Future progress in complex trait genetics is likely to rely on two factors, no matter what particular type of genetic study is adopted: (1) increasingly large sample collections and (2) integrative modeling approaches that not only consider genetic information from different studies as illustrated in Figure 1.4, but also consider multiple genetic signals in their broader context (Raychaudhuri, 2011). This includes intersection of multilocus genotype data with functional information, from gene expression studies, from protein–protein interaction networks, or from other curated gene sets and pathways. For example, the CommonMind Consortium has used gene expression profiles in postmortem brain samples of schizophrenia patients and controls to help to interpret GWAS signals (Fromer et al., 2016). Jointly modeling the impact of risk variants on intermediate phenotypes or endophenotypes (Gottesman and Gould, 2003), for example, from brain imaging studies, and a fuller analysis of pleiotropic effects, where the same variant influences multiple (and potentially seemingly unconnected) disorders or traits (Cotsapas et al., 2011; Craddock et al., 2009), are both likely to be powerful approaches moving forward, particularly when seeded by solid knowledge of multiple associated loci from the primary genetics studies.

ALTERNATIVE GENETIC MODELS

The majority of genetic studies assume simple, additive models of effect, whether the variant is common or rare. This is typically a convenient, simplifying assumption made during analysis, although in practice it is often likely to be a reasonable one. Although there is little empirical evidence for nonadditive effects being a generally important component of the architecture of common disease, finding specific instances of such effects could be very informative. Examples of nonadditive effects include basic dominant/recessive (and compound heterozygote) models at a single locus and extended regions of homozygosity due to recent inbreeding, unmasking rare recessive effects (Keller et al., 2012), interaction between genes (epistasis as reviewed by Cordell, 2009), and between genes and environments (Thomas, 2010), as well as sex-specific, imprinting, and parent-of-origin effects. Whether or not allowing for these more complex models will help to map disease genes is unclear. Nonetheless, studying the growing number of genes already mapped by the additive models with respect to these alternative models (including pleiotropic effects on other phenotypes) has the potential to be of great value.

SUMMARY

The tools available to the complex trait geneticist have evolved rapidly over the past decade. Consequently, psychiatric genetics has made considerable progress during the same time frame (Sullivan et al., 2012). Different genetic strategies, from studies of de novo variation in exome sequencing, large deletion and duplication copy number variants, and rare and low-frequency variants segregating in populations to common polymorphisms, are underway. It seems clear that all approaches will continue to bear fruit in the coming years, although the full promise of neuropsychiatric genetics is not yet achieved. In the (hopefully not too distant) future, the interpretation of multiple genetic associations in their biological context, rather than their initial discovery per se, will increasingly become the central challenge faced, but it will remain critically grounded on the initial gene discovery work going on today.

DISCLOSURE

Dr. Purcell has no conflicts of interest to disclose.

REFERENCES

Altshuler, D., Daly, M.J., et al. (2008). Genetic mapping in human disease. Science 322(5903):881–888.
Bamshad, M.J., Ng, S.B., et al. (2011). Exome sequencing as a tool for Mendelian disease gene discovery. Nat Rev Genet 12(11):745–755.
Browning, S.R., and Browning, B.L. (2012). Haplotype phasing: existing methods and new developments. Nat Rev Genet 12:703–714.
Bulik-Sullivan, B., Finucane, H.K., et al. (2015a). An atlas of genetic correlations across human diseases and traits. Nat Genet 47:1236–1241.
Bulik-Sullivan, B., Loh, P.R., et al. (2015b). LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet 47:291–295.
Carlson, C.S., Eberle, M.A., et al. (2004). Mapping complex disease loci in whole-genome association studies. Nature 429(6990):446–452.
Cirulli, E.T., and Goldstein, D.B. (2010). Uncovering the roles of rare variants in common disease through whole-genome sequencing. Nat Rev Genet 11(6):415–425.
Cordell, H.J. (2009). Detecting gene–gene interactions that underlie human diseases. Nat Rev Genet 10:392–404.
Cotsapas, C., Voight, B.F., et al. (2011). Pervasive sharing of genetic effects in autoimmune disease. PLoS Genet 7(8):e1002254.
Cross-Disorder Phenotype Group of the Psychiatric GWAS Consortium, Craddock, N., et al. (2009). Dissecting the phenotype in genome-wide association studies of psychiatric illness. Br J Psychiatry 195(2):97–99.
Cross-Disorder Group of the Psychiatric Genomics Consortium, Lee, S.H., et al. (2013). Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs. Nat Genet 45(9):984–994.
Degner, J.F., Pai, A.A., et al. (2012). DNase I sensitivity QTLs are a major determinant of human expression variation. Nature 482(7385):390–394.
De Rubeis, S., He, X., et al. (2014). Synaptic, transcriptional and chromatin genes disrupted in autism. Nature 515(7526):209–215.
ENCODE Project Consortium. (2012). http://www.nature.com/encode/.
Encyclopedia of DNA Elements. URL http://www.genome.gov/10005107.
Fromer, M., Pocklington, A.J., et al. (2014). De novo mutations in schizophrenia implicate synaptic networks. Nature 506(7487):179–184.
Fromer, M., Roussos, P., et al. (2016). Gene expression elucidates functional impact of polygenic risk for schizophrenia. Nat Neurosci 19(11):1442–1453.
Genetic Power Calculator. http://pngu.mgh.harvard.edu/purcell/gpc/.
Genovese, G., Fromer, M., et al. (2016). Increased burden of ultra-rare protein-altering variants among 4,877 individuals with schizophrenia. Nat Neurosci 19(11):1433–1441.
Gibson, G. (2012). Rare and common variants: twenty arguments. Nat Rev Genet 13(2):135–145.
Gottesman, I., and Gould, T. (2003). The endophenotype concept in psychiatry: etymology and strategic intentions. Am J Psych 160(4):636–645.
International HapMap Consortium. (2007). A second generation human haplotype map of over 3.1 million SNPs. Nature 449:851–861.
International Schizophrenia Consortium. (2008). Rare chromosomal deletions and duplications increase risk of schizophrenia. Nature 455:237–241.
International Schizophrenia Consortium. (2009). Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460(7256):748–752.
Iossifov, I., O’Roak, B.J., et al. (2014). The contribution of de novo coding mutations to autism spectrum disorder. Nature 515(7526):216–221.
Keller, M.C., Simonson, M.A., et al. (2012). Runs of homozygosity implicate autozygosity as a schizophrenia risk factor. PLoS Genet 8(4):e1002656.
Kiezun, A., Garimella, K., et al. (2012). Exome sequencing and the genetic basis of complex traits. Nat Genet 44(6):623–630.
Kim, Y., Zerwas, S., et al. (2011). Schizophrenia genetics: where next? Schizophr Bull 37(3):456–463.
Lango Allen, H., Estrada, K., et al. (2010). Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467:832–838.
Lek, M., Karczewski, K.J., et al. (2016). Analysis of protein-coding genetic variation in 60,706 humans. Nature 536(7616):285–291.
Lichtenstein, P., Yip, B.H., et al. (2009). Common genetic determinants of schizophrenia and bipolar disorder in Swedish families: a population-based study. Lancet 373(9659):234–239.
Lohmueller, K.E., Pearce, C.L., et al. (2003). Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet 33:177–182.
Mills, R.E., Walter, K., et al.; 1000 Genomes Project. (2011). Mapping copy number variation by population-scale genome sequencing. Nature 470(7332):59–65.
Neale, B.M., Kou, Y., et al. (2012). Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485(7397):242–245.
Nejentsev, S., Walker, N., et al. (2009). Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes. Science 324(5925):387–389.
NHGRI GWAS Catalog: A Catalog of Published Genome-Wide Association Studies. http://www.genome.gov/gwastudies/.
Owen, M.J., Craddock, N., et al. (2010). Suggestion of roles for both common and rare risk variants in genome-wide studies of schizophrenia. Arch Gen Psychiatry 67(7):667–673.
Perkins, O., Pers, T.H., et al. (2016). Contribution of copy number variants to schizophrenia from a genome-wide study of 41,321 subjects. Nat Genet 49(1):27–35.
Pinto, D., Pagnamenta, A.T., et al. (2010). Functional impact of global rare copy number variation in autism spectrum disorders. Nature 466(7304):368–372.
Pinto, D., Delaby, Y., et al. (2014). Convergence of genes and cellular pathways dysregulated in autism spectrum disorders. Am J Hum Genet 94(5):677–694.
Psychiatric GWAS Consortium Bipolar Disorder Working Group. (2011). Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near ODZ4. Nat Genet 43(10):977–983.
Raychaudhuri, S. (2011). Mapping rare and common causal alleles for complex human diseases. Cell 147(1):57–69.
Rosenberg, N.A., Huang, L., et al. (2010). Genome-wide association studies in diverse populations. Nat Rev Genet 11(5):356–366.
Rosenberg, N.A., Pritchard, J.K., et al. (2002). Genetic structure of human populations. Science 298(5602):2381–2385.
Risch, N., and Merikangas, K. (1996). The future of genetic studies of complex human diseases. Science 273(5281):1516–1517.
Schizophrenia Working Group of the Psychiatric Genomics Consortium. (2014). Biological insights from 108 schizophrenia-associated genetic loci. Nature 511(7510):421–427.
Sanders, S.J., He, X., et al. (2015). Insights into autism spectrum disorder genomic architecture and biology from 71 risk loci. Neuron 87(6):1215–1233.
Sebat, J., Lakshmi, B., et al. (2007). Strong association of de novo copy number mutations with autism. Science 316:445–449.
Sekar, A., Bialas, A.R., et al. (2016). Schizophrenia risk from complex variation of complement component 4. Nature 530(7589):177–183.
Smith, G.D., and Ebrahim, S. (2003). “Mendelian randomization”: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol 32(1):1–22.
St Clair, D., Blackwood, D., et al. (1990). Association within a family of a balanced autosomal translocation with major mental illness. Lancet 336(8706):13–16.
Sullivan, P.F., Daly, M.J., et al. (2012). Genetic architectures of psychiatric disorders: the emerging picture and its implications. Nat Rev Genet 13(8):537–551.
Thomas, D. (2010). Gene-environment-wide association studies: emerging approaches. Nat Rev Genet 11(4):259–272.

The 1000 Genomes Project Consortium. (2010). A map of human genome variation from population-scale sequencing. Nature 467(7319):1061–1073.
van Dongen, J., Slagboom, P.E., et al. (2012). The continuing value of twin studies in the omics era. Nat Rev Genet 13:640–653.
Wain, L.V., Armour, J.A., et al. (2009). Genomic copy number variation, human health, and disease. Lancet 374(9686):340–350.
The Wellcome Trust Case Control Consortium. (2007). Genome-wide association study of 14,000 cases of seven common diseases and 3000 shared controls. Nature 447:661–678.
Xu, B., Ionita-Laza, I., et al. (2012). De novo gene mutations highlight patterns of genetic and neural complexity in schizophrenia. Nat Genet 44(12):1365–1369.
Yang, J., Benyamin, B., et al. (2010). Common SNPs explain a large proportion of the heritability for human height. Nat Genet 42(7):565–569.
Yang, J., Lee, S.H., et al. (2013). Genome-wide complex trait analysis (GCTA): methods, data analyses, and interpretations. Methods Mol Biol 1019:215–236.

2.
NETWORK METHODS FOR ELUCIDATING
THE COMPLEXITY OF COMMON HUMAN DISEASES
Eric E. Schadt

INTRODUCTION

Our understanding of common human diseases and how best to treat them is hampered by the complexity of the human system in which they are manifested. Unlike simple Mendelian disorders in which highly expressive, highly penetrant mutations make it possible to identify the causal genes within families segregating traits associated with the disorders, the common human diseases originate from a more complex interplay between constellations of changes in DNA (both rare and common variations) and a broad range of environmental factors like diet, age, sex, and exposure to environmental toxins (R. Chen et al., 2016).

With roughly 3 billion nucleotides making up the human genome, the number of nucleotide changes that can affect the activities of a moderate to large number of genes vastly exceeds our ability to experimentally determine the effects of combinations of such changes. Whereas in years past the focus regarding DNA variation and its association with disease was on protein-coding sequences, given declarations of intergenic DNA being composed mainly of “junk” (Smith, Brookhaven National Laboratory et al., 1972), we know today that more than 80% of the human genome is actively bound by proteins that regulate the expression of genes (Ecker et al., 2012), providing a vast array of knobs and switches to modulate not only the activity of genes but also the activity of whole gene networks. Therefore, leveraging naturally occurring DNA variation in human populations can be considered among the most attractive approaches to inferring the constellation of genes that affect disease risk. For most noncancer human diseases such as Alzheimer’s disease, autism, and schizophrenia, changes in DNA that correlate with changes in disease can be inferred as tagging or directly representing causal components of disease (Zhang et al., 2013; Fromer et al., 2016). In this way, the DNA variation directly elucidates disease etiology and so is extremely useful. Genome-wide association studies (GWAS) are now well proven to uncover genetic loci that affect disease risk or disease progression (Witte, 2010; Welter et al., 2014).

The complex array of interacting factors does not influence the activity of single genes in isolation but, instead, affects entire network states that, in turn, increase or decrease the risk of disease or affect disease severity. In the context of common human diseases, the disease states can be considered as emergent properties of molecular networks (Schadt, 2009; Schadt et al., 2009; Califano et al., 2012; Argmann et al., 2016) as opposed to responses to changes in a small number of genes driving core biological processes associated with the disease. Integrating large-scale, high-dimensional molecular and physiological data holds promise in not only defining the molecular networks that directly respond to genetic and environmental perturbations that associate with disease, but also in causally associating such networks with the physiological states associated with disease.

Of course, genetics is but one dimension in a big sea of data dimensions that we can now leverage to better understand human conditions such as psychiatric disorders. Models of disease that consider a greater diversity of data that inform on disease will necessarily deliver more accurate diagnoses. In fact, we are in the midst of a big data revolution that permeates nearly every aspect of our lives. Electronic devices that consume much of our attention on a daily basis enable rapid transactions among individuals on unprecedented scales, where all of the information involved in these daily transactions can be seamlessly stored in digital form, whether the transactions involve monitoring of activity levels, cell phone calls, text messages, credit card purchases, e-mail, or visits to the doctor’s office in which all tests carried out are digitized and entered into one’s electronic medical record (Figure 2.1). In fact, devices such as the Apple iPhone now provide platforms such as HealthKit, ResearchKit, and CareKit to facilitate larger scale collections of data around individuals using smart devices such as an iPhone, as well as better engagement around the acquired data to facilitate increased wellness and even impact clinical care decisions. The digital universe of data more generally now far exceeds one zettabyte (that is 21 zeros or one billion terabytes; think 63 billion 16-gigabyte iPhones). Thus, our ability to store and access unimaginable scales of data has been revolutionized by technological innovations, some of which (such as DNA sequencing technologies) have been observed to operate at super Moore’s law rates.

The life and biomedical sciences have not stood on the sidelines of this revolution. There has been an incredible wave of new technologies in genomics, such as next-generation sequencing technologies (Eid et al., 2009), sophisticated imaging systems, and mass spectrometry-based flow cytometry (Bandura et al., 2009), enabling data to be generated at very large scales. As a result we can monitor the expression of tens of thousands of protein-coding and noncoding genes simultaneously

[Figure 2.1 labels: data sources (GPS; weather and climate; air traffic; security and automation; biomedical research and personal health; cell phone, texts, digital music and movies; real time traffic; financial markets and services); DATA ANALYTICS; individual patient; PREDICTIVE MODELS OF DISEASE (coronary artery disease, type 2 diabetes, obesity, osteoarthritis); diagnosis and treatment assignment; diagnostics; therapies.]

Figure 2.1 Big data is all around us, enabled by technological advances in micro- and nanoelectronics, nanomaterials, interconnectivity provided by sophisticated telecommunication infrastructure, massive network-attached storage capabilities, and commodity-based high-performance compute infrastructures. The ability to store all credit card transactions, all cell phone traffic, all e-mail traffic, video from extensive networks of surveillance devices, and satellite and ground sensing data informing on all aspects of the weather and overall climate, and to now generate and store massive data informing on our personal health including whole-genome sequencing data and extensive imaging data, is driving a revolution in high-end data analytics to make sense of the big data and drive more accurate descriptive and predictive models that inform decision making on every level, whether identifying the next big security threat or making the best diagnosis and treatment choice for a given patient.
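The storage figures quoted in the Introduction (one zettabyte, "one billion terabytes," roughly 63 billion 16-gigabyte iPhones) can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, so that one gigabyte is 10^9 bytes; the constant and function names are illustrative only.

```python
# Decimal (SI) byte units; this choice of units is an assumption.
GIGABYTE = 10**9
TERABYTE = 10**12
ZETTABYTE = 10**21


def devices_needed(total_bytes, device_capacity_bytes):
    """Number of devices of a given capacity needed to hold total_bytes."""
    return total_bytes / device_capacity_bytes


terabytes_per_zettabyte = ZETTABYTE // TERABYTE
iphones_per_zettabyte = devices_needed(ZETTABYTE, 16 * GIGABYTE)

print(f"{terabytes_per_zettabyte:,} terabytes per zettabyte")
print(f"{iphones_per_zettabyte / 1e9:.1f} billion 16-GB devices per zettabyte")
```

Under these assumptions the device count comes to 62.5 billion, which agrees with the rounded figure of 63 billion given in the text.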

(Y. Chen et al., 2008; Emilsson et al., 2008; Zhang et al., 2013; Franzen et al., 2016), score hundreds of thousands of SNPs (single-nucleotide polymorphisms) in individual samples (R. Shi et al., 2016; Kilpelainen et al., 2016; Lek et al., 2016; Lu et al., 2016), sequence entire human genomes for less than $1000, and relate all of these data patterns to a great diversity of other biologically relevant information (clinical data, biochemical data, social networking data, etc.). Given technologies on the horizon like the IBM DNA transistor with theoretical sequencing limits in the hundreds of millions of bases per second per transistor (imagine millions of these transistors packed together in a single handheld device) (Schadt et al., 2010), we won’t be talking in the future about Google rolling through neighborhoods with Wi-Fi-sniffing equipment (Kravets, 2010); rather, we will be talking about DNA-sniffing equipment rolling through neighborhoods, sequencing everything it encounters in real time, and then pumping such data into big data clouds to link with all other available information in the digital universe.
If we want to achieve understanding from big data, organize it, compute on it, and build predictive models from it, then we must employ statistical reasoning beyond the more classic hypothesis testing of yesteryear. We have moved well beyond the idea that we can simply repeat experiments to validate findings generated in populations. In fact, while first instances of the central dogma of biology looked something like the simple graph depicted in Figure 2.2 (top), today the central dogma is evolving to look something more like the graph depicted in Figure 2.2 (bottom), because the complex interplay of multiple dimensions of data (DNA, RNA, protein, metabolite, cellular, physiologic, ecologic, and social structures more generally) demands a more holistic view in which we embrace complexity in its entirety. Our emerging view of complex biological systems is one of a dynamic, fluid system that is able to reconfigure itself as conditions demand (Barabasi and Oltvai, 2004; Han et al., 2004; Luscombe et al., 2004; Pinto et al., 2004; Zerhouni, 2003). Despite these transformative advances in technology and the need to embrace complexity, it remains difficult to assess where we are with respect to our understanding of living systems relative to a complete comprehension of such systems. One of the primary difficulties in making such an assessment is that the suite of research tools available to us seldom provides insights into aspects of the overall picture of the system that are not directly measured.

[Figure 2.2 image. Top panel, "Original central dogma of biology": DNA, RNA, Protein. Bottom panel, "Evolving central dogma of biology": DNA (replication; epigenetic modified bases), transcription, RNA (mRNA, tRNA, rRNA, tmRNA, snoRNA, snRNA, miRNA, viRNA, siRNA, scRNA, piwi RNA; RNA replication), translation, proteins (phosphorylation; prions), RNA binding proteins, reverse transcriptase, spliceosome, ADAR (RNA editing).]

Figure 2.2 The evolving central dogma of biology. The upper panel represents the original central dogma of biology, a simple view driven by early observations with low-resolution tools that uncovered a central relationship between DNA, RNA, and proteins, namely that RNA is transcribed from DNA, and RNA, in turn, is translated into proteins. New higher resolution technologies have enabled a far more complex view of the central dogma to emerge (bottom panel), with epigenetic changes to DNA that are transgenerational, leading to non-Mendelian patterns of inheritance, a complex array of RNA molecules such as microRNA, viRNA, piwiRNA, and siRNA that do not code for proteins but carry out complex regulatory functions, and sophisticated protein complexes involved in splicing, RNA editing, and RNA binding all feeding back on transcription, leading to a more network-oriented view of the central dogma.

In this chapter I discuss a particular class of modeling approaches that integrate diverse types of data on broad scales in ways that enable others to interpret their data in a more holistic, informative context, to derive predictions that inform decision making on multiple levels, whether deciding on the next set of genes to validate experimentally or the best treatment for a given individual given detailed molecular and higher order data on their condition. Central to these models will be inferring causality among molecular traits and between molecular and higher order traits by leveraging DNA as a systematic source of perturbation. In contrast to the more qualitative approaches biological researchers have employed in the past, getting the most from these new types of high-dimensional, large-scale data requires constructing more complex, predictive models from them; refining the ability of such models to assess disease risk, progression, and best treatment strategies; and ultimately translating these complex models into a clinical setting where doctors can employ them as tools to understand most optimally a patient’s current condition and how best to improve it. Such solutions require a robust engineering approach, where integrating the new breed of large-scale datasets streaming out of the biological sciences and constructing predictive models from them will require approaches more akin to those employed by physicists, climatologists, and researchers in other strongly quantitative disciplines that have mastered the collection and predictive modeling of high-dimensional data.

THE MANY MOVING PIECES OF BIOLOGICAL SYSTEMS: A MOVIE ANALOGY

Tools to interrogate biological systems in the past were crude and did not permit the more holistic querying of such systems at multiple scales. In fact, if we were to view the full suite of interacting parts in living systems, from the molecular on up to the ecological levels, we would achieve a more complete understanding of the cellular-, organ-, and organism-level processes that underlie complex phenotypes such as disease, much in the same way we achieve understanding by watching a movie. The continuous flow of information in a movie enables our minds to exercise an array of priors that provide the appropriate context and that constrain the possible relationships (structures), not only within a given frame or scene but also over the entire course of the movie. As our senses take in all of the streaming audio and visual information, our internal network reconstruction engine (centered at the brain) pieces the information together to represent highly complex and nonlinear relationships depicted in the movie, so that in the end we are able to achieve an understanding of what the movie intends to convey at a hierarchy of levels.
What if we were to view a movie as we have viewed biological systems in the past? What if, instead of viewing a movie as a continuous stream of frames of coherent pixels and sound, we viewed single dimensions of these data, and we viewed them

independently from one another? Understanding in this case would likely be difficult, if not impossible, to achieve. As an example, consider a 2-hour feature length film composed of 216,000 frames (30 frames per second), where each frame is composed of 1,280 × 720 pixels (roughly one million pixels). First, it is worth noting that the number of pixels of information, roughly 199 billion, represented in this film is quite large (if each pixel were represented by 32 bits, the film would comprise more than 6 terabits, or roughly 0.8 terabytes, of information). Suppose we decided to use the tools of reductionist biology to view the film, where instead of viewing the film as a rapid succession of frames of one million pixels each, we viewed a single frame in which the intensity value for each pixel across all 216,000 frames in the movie was averaged. This gross, aggregate average would provide very little, if any, information regarding the movie, not unlike our attempts to understand complex living systems by examining single snapshots of a subset of molecular traits in a single cell type and in a single context at a single point in time. Even if we viewed our movie as independent, one-dimensional slices through its frames, where each slice was viewed as pixel intensities across that one dimension changing over time (like a dynamic mass spectrometry trace), this view would provide significantly more information, but it would still be very difficult to understand the meaning of the movie by looking at all of the one-dimensional traces independently, unless more sophisticated mathematical algorithms were employed to link the information together.
Despite the complexity of biological systems, even at the cellular level, research in the context of large-scale, high-dimensional omics data has tended to focus on single data dimensions, whether constructing coexpression networks based on gene expression data, carrying out genome-wide association analyses based on DNA variation information, or constructing protein interaction networks based on protein–protein interaction data. While we achieve some understanding in this way, progress is limited because none of the dimensions on its own provides a complete enough context within which to interpret results fully. This type of limitation has become apparent in genome-wide association studies or whole-exome or genome-sequencing studies, where thousands of loci have been identified and highly replicated as associated with disease, but our understanding of disease is still limited because the genetic loci do not necessarily inform on the gene affected, on how gene function is altered, or, more generally, how the biological processes involving a given gene are altered at particular points of time or in particular contexts (Altshuler et al., 2008; Y. Chen et al., 2008; Emilsson et al., 2008; Zhang et al., 2013; Franzen et al., 2016). It is apparent that if different biological data dimensions could be formally considered simultaneously, we would achieve a more complete understanding of biological systems (Y. Chen et al., 2008; Emilsson et al., 2008; Zhong, Beaulaurier et al., 2010; Zhu et al., 2012; Zhang et al., 2013; Franzen et al., 2016). (See the documentary film The New Biology at http://www.youtube.com/watch?v=sjTQD6E3lH4.)
To form a more complete understanding of complex human diseases like psychiatric disorders, we must not only evolve technologies to sample systems at ever higher rates and with ever greater breadth, but we must also innovate methods that consider many different dimensions of information to produce more descriptive models (movies) of the system. There are, of course, many different types of modeling approaches that have been and continue to be explored. Descriptive models quantify relationships among variables in data that can, in turn, enable classification of systems under study into different meaningful groups. Whether stratifying disease populations into disease subtypes to assign patients to the most appropriate treatment or categorizing customers by product preference, descriptive models are useful for classifying but cannot necessarily be used to predict how any given variable will respond to another at the individual level. For example, while patterns of gene expression such as those identified for breast cancer and now in play at companies like Genomic Health can very well distinguish good from poor prognoses (van’t Veer, Dai et al., 2002; van de Vijver, He et al., 2002), such models are not generally as useful for understanding how genes in patterns associated with disease are causally related or for distinguishing key driver genes from passenger genes.
Predictive models, on the other hand, incorporate historic and current data to predict how one variable may respond to another in a particular context or predict response or future states of components of a system at the individual level. In the biological context, predictive models aim to accurately predict (in silico, that is, using the model to run simulations on a computer) molecule expression-level changes, cell state dynamics, and phenotype transitions in response to specific perturbation events. For example, understanding how the genes identified for diseases like schizophrenia or autism (Roussos et al., 2014; Fromer et al., 2016; Lek et al., 2016) are actually related to one another in probabilistic causal ways can lead to an understanding of how perturbing a given gene (say, for treatment) will impact the corresponding molecular networks and ultimately the pathophysiology of the diseases they impact. Key to constructing predictive models is elucidating causal relationships between traits of interest. Resolving causal relationships requires a systematic source of perturbation, and here I discuss the use of DNA variation as a systematic perturbation source to infer causal relationships among molecular traits and between molecular traits and higher order traits like disease (Schadt et al., 2005; Y. Chen et al., 2008; Emilsson et al., 2008; Zhu et al., 2008; Millstein et al., 2009; Millstein et al., 2011; Zhu et al., 2012; Zhang et al., 2013; Chang et al., 2015; Franzen et al., 2016). However, before diving into this specific type of modeling approach, it is worth reviewing the general ways in which biological data can be modeled.

MODELING BIOLOGICAL DATA

A true understanding of complex systems and the complex behaviors they exhibit can only be achieved if we understand the causal relationships among the hierarchy of constituent components comprising the system. However, inferring causality between variables, especially recovering causal networks from observational data, is a particularly challenging task.
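One reason that recovering causal structure from observational data is hard can be seen in a small simulation. The following is a minimal sketch and not part of the original chapter: the function names, the effect size `beta`, the sample size, and the seeds are illustrative assumptions. It generates data under X → Y and under Y → X, with unit-variance variables in both cases, and compares the resulting correlations.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate(direction, n=20000, beta=0.6, seed=0):
    """Draw (x, y) pairs from a linear Gaussian model (hypothetical helper).

    direction == "x_to_y": X is the cause and Y = beta * X + noise.
    direction == "y_to_x": Y is the cause and X = beta * Y + noise.
    The noise variance is chosen so that both variables have unit variance.
    """
    rng = random.Random(seed)
    noise_sd = sqrt(1 - beta ** 2)
    cause = [rng.gauss(0, 1) for _ in range(n)]
    effect = [beta * c + rng.gauss(0, noise_sd) for c in cause]
    return (cause, effect) if direction == "x_to_y" else (effect, cause)


# Opposite causal directions, same observable correlation structure.
x1, y1 = simulate("x_to_y", seed=1)
x2, y2 = simulate("y_to_x", seed=2)
r_xy = pearson(x1, y1)
r_yx = pearson(x2, y2)
print(f"corr under X -> Y: {r_xy:.3f}")
print(f"corr under Y -> X: {r_yx:.3f}")
```

Both correlations come out close to `beta`, so the correlation alone does not reveal which variable is the cause. This is the symmetry that a systematic perturbation source, such as the DNA variation discussed in this chapter, is meant to break.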

Given the complexity of biological data and the complexity of methods that can be applied to deriving meaning from such data, an awareness of the different classes of models is warranted, even though in this chapter I focus primarily on probabilistic causal reasoning. The different types of modeling that can be applied to biological data can be broken down into a number of different classes, with the selection of modeling approach to employ dependent on a number of factors such as extent of prior knowledge, dimensionality of the data to be modeled, the scale of data available to model, and of course what one hopes to derive from the data and the model (Figure 2.3).
In the spectrum of modeling classes ranging from those assuming the most complete knowledge of pathways and networks to those assuming no knowledge, preferring instead to learn the network structures directly from the data, the kinetic models are at the most extreme end of the distribution with respect to requiring extensive prior knowledge. Kinetic models are typically represented as systems of ordinary differential equations (ODEs), which require extensive prior knowledge as the ODEs fix the connectivity structure among the variables being modeled (e.g., the pathway is assumed to be known). The model is then defined by a series of parameters that are fit from the data, and with these parameter estimates the behavior of the system can be directly explored via simulations run on the model. Via these simulations, kinetic models provide for greater mechanistic insights. These models can also be fit from smaller, more focused datasets, although typically this modeling approach is restricted to smaller network structures, and the models can be difficult to calibrate (Azeloglu and Iyengar, 2015). The dynamics of physiologic glucose-insulin levels, metabolic flux, and drug response are just a few of many examples that have been effectively modeled using this approach. Logic models represent another class of models that require significant prior knowledge, but that also have an adaptive component that can be learned from the data and thus can reduce dependence on the extent of knowledge required to model the biological system of interest. Logic models also maintain a simple and intuitive framework for understanding complex signaling networks (Morris et al., 2010). In addition, this type of modeling approach still provides for direct mechanistic insights to be derived from simulations. Kinetic and logic models are more representative of what I refer to as bottom-up modeling approaches that begin with strong prior knowledge regarding how pathways are put together, but then define the kinetic parameters on those pathways that describe the flow of information through the system.
Boolean network modeling is another class of approaches that provides an even more flexible framework for modeling biomolecules as binary variables that directly relate to state information that is relevant to downstream biological processes. However, the regulation of the different states represented is described in a parameter-free way (in contrast to kinetic models that are defined by kinetic parameters), providing for an approach that enables a more exploratory

[Figure 2.3 table, reconstructed.]
Columns: Type of model; Model size (# of variables modeled); Minimal sample size required to fit model (given comparable complexity); Prior knowledge dependence; Novel mechanistic insights

Bottom-up modeling
  Kinetic: Limited to small #; Very large # of data points needed to fit model; Extensive prior knowledge required; Can reveal strong mechanistic insights
  Fuzzy logic: Limited to small to moderate #; Larger # of data points needed to fit model; Strong prior knowledge required; Can reveal strong mechanistic insights
Top-down modeling
  Boolean network: Can have moderate #; Larger # of data points needed to fit model; Less prior knowledge required; Potential to provide mechanistic insights
  Bayesian network: Can have moderate to large #; Moderate to large # of data points to fit model; Prior knowledge not required but can be leveraged; Can learn novel causal relationships
Correlation-based modeling
  PLS regression: Can have large #; Small to moderate # of data points to fit model; Prior knowledge not required, some ability to model prior data; Does not implicitly infer causality but informs on relationships
  PCA multi-regression & WGCNA: Can have very large #; Small # of data points to fit model; Prior knowledge not required, limited ability to incorporate prior knowledge; Little ability to gain mechanistic insights, association based

Figure 2.3 Modeling biological data using different classes of mathematical modeling approaches. The primary aim of these different approaches is uncovering relationships in the data that may help predict phenotypes of interest, elucidate causal relationships among traits and biological processes, and derive mechanistic insights into the causes of disease, wellness, drug response, and other phenotypes of interest. A more detailed description of the different modeling approaches is given in the main text.
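To make the kinetic class in Figure 2.3 concrete, the sketch below integrates a one-reaction kinetic model, A → B with rate constant k, written as the ODE system dA/dt = −kA, dB/dt = kA. This is an illustrative toy and not a model from the chapter: the reaction, the function name `simulate_first_order`, the parameter values, and the forward-Euler scheme are all assumptions chosen for brevity. It shows how fixing the pathway structure in advance leaves only the parameters (here k) to be estimated, after which the behavior of the system is explored by simulation.

```python
from math import exp


def simulate_first_order(a0, k, t_end, dt=1e-3):
    """Forward-Euler integration of dA/dt = -k*A, dB/dt = k*A.

    The connectivity (A converts to B) is fixed in advance; only the
    kinetic parameter k would need to be fit from data.
    Returns the concentrations (A, B) at time t_end.
    """
    a, b = a0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        rate = k * a          # mass-action rate of the A -> B reaction
        a -= rate * dt
        b += rate * dt
    return a, b


# Hypothetical parameter values, used only for illustration.
a_end, b_end = simulate_first_order(a0=1.0, k=0.5, t_end=2.0)
analytic = 1.0 * exp(-0.5 * 2.0)  # closed-form solution A(t) = A0 * exp(-k t)
print(f"simulated A(2.0) = {a_end:.4f}, analytic = {analytic:.4f}")
print(f"mass conserved: A + B = {a_end + b_end:.6f}")
```

Realistic kinetic models involve many coupled equations and would normally use an adaptive ODE solver; the simple fixed-step loop here is only meant to show the structure of the approach.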

characterization of the dynamics of a complex system (Albert and Thakar, 2014). While these types of models can represent many more variables than kinetic models, they provide less mechanistic insight. Bayesian network models, the approach discussed in depth in this chapter, provide an even more flexible framework for modeling complex biological processes, requiring no prior knowledge but still providing for a natural and mathematically elegant way to incorporate prior knowledge. Bayesian networks provide a way to learn regulatory relationships directly from the data. With the use of heuristic searching, networks composed of many thousands of variables can be constructed, although equally large sets of data are required to effectively construct this type of model. The causal relationships represented in these models are statistically inferred, so deriving mechanistic insights is more difficult. The Boolean and Bayesian network modeling approaches are examples of what I refer to as top-down modeling approaches that seek to learn relationships directly from the data (structure-based learning).
The final classes of modeling approaches are correlation based and are more exploratory in nature, seeking to elucidate the correlation structures in extensive datasets in order to begin to understand the relationships that may be well reflected in them, and that may aid in understanding key processes involved in complex processes associated with phenotypes of interest such as disease. Partial least squares regression and principal component analysis (PCA) multi-regression are examples of two classes of such modeling approaches. They do not require any prior knowledge to fit the models; they can operate on extremely large datasets, scaling to any number of variables that give rise to very large-scale networks; and they are easy to calibrate. However, such models do not explicitly infer causality but rather reflect connections and influences on those connections, a first step for learning important relationships that are involved in complex processes such as disease.
In this broad spectrum of methods, Bayesian networks strike a nice balance between resolving mechanisms and structure and more broadly reflecting connections and their influences, thereby providing an efficient path for understanding information flow. Whereas ODEs are hypothesis driven, where the relationships among variables are assumed known, Bayesian methods operate in a hypothesis-free context in which we attempt to infer the relationships among variables given the data. As a result, Bayesian networks have emerged as a state-of-the-art approach for understanding complex systems in which the relationships among the constituent components of the system are not generally known, since they can seamlessly incorporate existing knowledge as structural and parameter priors and then infer directed relationships among the nodes in the network using conditional dependency arguments (Zhu et al., 2008; Zhu et al., 2012; Chang et al., 2015). However, there are also limitations with this modeling approach that relate to the ability of Bayesian networks to distinguish causal structures that have equivalent joint probability and conditional independence structures (Markov equivalence). The severity of this problem cannot be overstated, since statistically indistinguishable structures may reflect completely contradictory causal relationships. I will explore how appropriate prior information can be incorporated to help resolve these and related issues.

CAUSALITY AS A STATISTICAL INFERENCE

In the life sciences, most researchers are accustomed to thinking about causality from the standpoint of physical interactions. In the molecular biology or biochemistry setting, when two molecular traits are indicated as causally related we typically mean that one of the molecular entities (e.g., a small molecule compound) has been determined experimentally to physically interact with or to induce processes that directly affect the other molecular entity (e.g., the target protein of the small molecule) and consequently lead to a phenotypic change of interest (e.g., lower LDL cholesterol levels). In this case we have an understanding of the causal factors relevant to the activity of interest, so that careful experimental manipulation of these factors allows for the identification of genuine causal relationships. However, in the context of many thousands of variables related in unknown ways, the aim is to examine the behavior of those variables across populations in ways that facilitate statistically inferring causal relationships. For example, statistical associations between changes in DNA, changes in molecular phenotypes, and changes in higher order phenotypes like functional MRI readouts or disease can be examined for patterns of conditional dependency among the variables that allow directionality to be inferred among them. In this case we can employ indirect measures of processes that mediate changes in one trait conditional on another, to make a statistically inferred causal link. This is not unlike the types of statistical inferences that are leveraged in other disciplines to make new discoveries. For example, fewer than 5% of known extrasolar planets have been directly observed, so that most are observed indirectly. One method for detecting planets that cannot be directly observed considers that when a planet is orbiting a star, the gravitational pull of the planet on the star will place the star into a subtle orbit, which from our vantage point will appear as the star moving closer to and farther away from the Earth in a cyclical fashion. Such movement can be measured as displacements in the star’s spectral lines due to the Doppler effect (Eriskine et al., 2005), and so the presence of the planet acting on the star can be statistically inferred.
Similarly, consider genetic variants associated with, say, schizophrenia or autism (many such loci have now been identified; see Roussos et al., 2014; Fromer et al., 2016; Lek et al., 2016). Further, suppose the expression of some number of genes assayed in relevant regions of the brain relating to these disorders were also associated with these same genetic variants. By examining the changes in the levels of expression of these genes in response to changes in genotype at any of the genetic loci of interest, one can directly assess the extent to which these expression changes induced by the genetic loci well explain the degree of association between the locus genotypes and disease trait. In this way, just as the characteristic wobble

of a star induced by an orbiting planet predicts the presence of the planet, the characteristic “wobble” of the expression levels of a gene and its association to the disease state predicts a causal path between the gene and disease state, as described in more detail below.
Critical to identifying causal relationships is distinguishing between correlation and causation. The old adage, “correlation does not imply causation,” is familiar to most. This is among the first fallacies one learns about in beginning logic courses: post hoc ergo propter hoc (Latin for “after this, therefore because of this”). Measurements taken over time on independent variables can be correlated because trends reflected by such variables are coincidentally similar or changes in each variable are independently caused by a common source, in addition to being correlated as a result of a cause–effect relationship. It is also interesting to note that while correlation and causation are related, our intuitive notion that causation implies correlation is not always correct either. For example, suppose U and V are random variables with the same distribution, and suppose X = U + V and Y = U − V. In this case the covariance between X and Y (defined as E(XY) − E(X)E(Y), where E represents the expectation function) equals Var(U) − Var(V) = 0, and so the correlation is 0 even though there is a direct functional dependence between the variables (Feller, 1967). Only when two variables are linearly dependent (which is often the case in research) is our intuitive notion of functional dependence implying perfect correlation correct.
Structure learning approaches that seek to infer causal relationships among correlated variables often employ conditional dependency arguments or mutual information measures to resolve causality by introducing a third correlated variable. By conditioning each of the variables on the third and examining the residual correlation between them in each case, a decision can be made as to the direction of the flow of information between the variables. However, this type of reasoning has generally failed to result in predictive causal inference, because in the absence of systematic perturbations the number of graphs that can be represented between just three traits is large (125 graphs representing directed and undirected relationships between three correlated variables are possible), and many of these possible relationships between the traits are not statistically distinguishable (Sieberts and Schadt, 2007). For example, if variables X, Y, and Z are observed in a population to be correlated (e.g., suppose X, Y, and Z represent the expression levels of three genes assayed in a given region of the brain in a population of individuals with schizophrenia) and the true relationship between the variables is X → Z ← Y, this relationship cannot be statistically distinguished from X → Y ← Z and Z → X ← Y, even though these relationships give rise to contradictory causal relationships.
To break this type of statistical symmetry, a source of perturbation is required. Classically in biology we have introduced artificial perturbations by knocking a gene out, overexpressing a gene, or chemically perturbing a given protein to assess the consequences on a given trait of interest. More recently, in the neurosciences, optogenetics methods have provided novel ways to perturb genes on the short time scales needed to elucidate the complexity of networks at play in neurons in living mammals (Boyden et al., 2005). If experimentally controlled artificial perturbations on a given gene cause a change in a trait of interest, then we infer a causal relationship between that gene and trait. However, DNA variation in the germline provides an excellent systematic perturbation source that can also be used to resolve causal relationships in biological systems. Because variations in DNA cause variations in RNA, proteins, metabolites, and subsequently, higher order phenotypes, this source of variation can be leveraged to infer causality. Unlike artificial perturbations such as gene knockouts, transgenics, or chemical or optogenetic perturbations that may induce artificial correlations that are not observed in more natural settings, naturally occurring genetic variation defines those perturbations that give rise to the broad array of phenotypic variations (such as disease and drug response) that we are precisely interested in elucidating. The past 10 years (Nitsch et al., 2006; Lawlor et al., 2008) have demonstrated that causal links between DNA variations and molecular and higher order phenotypes can provide information on causal relationships between those traits (Schadt et al., 2005; Y. Chen et al., 2008; Emilsson et al., 2008; Millstein et al., 2009; Yang et al., 2009; Zhong, Beaulaurier et al., 2010; Zhong, Yang et al., 2010; Zhang et al., 2013; Franzen et al., 2016). Causality in this instance can be inferred because there is random segregation of the chromosomes during gametogenesis, thus providing the appropriate randomization mechanism to protect against confounding, similar to what is achieved in randomized clinical trials by randomly assigning patients to treatments to test the causal effects of a drug of interest (Nitsch et al., 2006; Lawlor et al., 2008). However, quantifying the uncertainty in making such causal calls has been challenging. For example, causal effect estimates often considered in Mendelian randomization approaches can be confounded by pleiotropic effects and reverse causation, limiting the utility of such approaches for problems that involve the reconstruction of regulatory networks, in which pleiotropy is common and there may be little a priori information regarding the structure of the causal relationships between the traits of interest (Millstein et al., 2009).
Recently, though, formal statistical tests for inferring causal relationships between quantitative traits mediated by a common genetic locus have been developed (Millstein et al., 2009). To understand how such a test works, consider marker genotypes at a given DNA locus L that are correlated with a given molecular phenotype, G, and a higher order phenotype T (Figure 2.4). The causal relationship G → T is implied if three conditions are satisfied under the assumption that L is sufficiently randomized: (1) L and G are associated, (2) L and T are associated, and (3) L is independent of T given G (i.e., L and T|G are not associated) (L.S. Chen et al., 2007). If a given locus L is independent of G given T (G|T), this is consistent with T being causal for G (T → G), and if L is associated with G|T, then this is consistent with G being causal for T (G → T). We can boil all of these observations down to four conditions from which a statistical test can be formed to test for causality: (1) L and T are

Charney and Nestler's Neurobiology of Mental Illness, edited by Dennis S. Charney, et al., Oxford University Press USA - OSO, 2018.
associated, (2) L is associated with G|T, (3) G is associated with T|L, and (4) L is independent of T|G. Each of these conditions can be assessed with a corresponding statistical test. For example, if we assume the marker corresponding to locus L is biallelic, where $L_1$ and $L_2$ represent indicator variables for the two alleles in a codominant coding scheme, then the four conditions above can be tested in the parameters of the following three regression models:

\[
\begin{aligned}
T_i &= \alpha_1 + \beta_1 L_{1i} + \beta_2 L_{2i} + \varepsilon_{1i} \\
G_i &= \alpha_2 + \beta_3 T_i + \beta_4 L_{1i} + \beta_5 L_{2i} + \varepsilon_{2i} \\
T_i &= \alpha_3 + \beta_6 G_i + \beta_7 L_{1i} + \beta_8 L_{2i} + \varepsilon_{3i},
\end{aligned}
\]

where $G_i$ and $T_i$ represent the gene and trait levels, respectively, for individual i in a population of interest, and the $\varepsilon_{ji}$ represent independently distributed random noise variables with variance $\sigma_j^2$ (L.S. Chen et al., 2007). Given these models the four component tests of interest are:

\[
\begin{aligned}
H_0 &: \{\beta_1, \beta_2\} = 0, & H_1 &: \{\beta_1, \beta_2\} \neq 0 \\
H_0 &: \{\beta_4, \beta_5\} = 0, & H_1 &: \{\beta_4, \beta_5\} \neq 0 \\
H_0 &: \beta_6 = 0, & H_1 &: \beta_6 \neq 0 \\
H_0 &: \{\beta_7, \beta_8\} \neq 0, & H_1 &: \{\beta_7, \beta_8\} = 0.
\end{aligned}
\]

The four conditions of interest can be tested using standard F-tests for linear model coefficients (conditions 1–3) and a slightly more involved test for the last condition, since it is an equivalence testing problem (Millstein et al., 2009). Given these individual statistical tests on the different regression parameters, a causal inference test can then be carried out by testing the strength of the chain of mathematical conditions that collectively are consistent with causal mediation (i.e., the strength of the chain is only as strong as its weakest link, so that the intersection of the rejection regions of the component tests provides for the causality test we seek). For a series of statistical tests of size $\alpha_\gamma$ and rejection region $R_\gamma$, the "intersection union" test with rejection region equal to the intersection over all $R_\gamma$ is a level $\sup_\gamma(\alpha_\gamma)$ test, so that the p-value for the causal inference test corresponds to the p-value for an intersection union test, or, simply, the supremum of the four p-values for the component tests (L.S. Chen et al., 2007). This test has been implemented as the CIT package in the R statistical programming language and is freely available.

Figure 2.4 Given that two traits G and T are correlated in a given population with changes in DNA at locus L, there are five basic causal models to consider in testing the hypothesis that variations in trait G cause variations in trait T. Here H denotes an unmeasured molecular or higher order trait. [Panels: independent model; independent/hidden model; causal model; causal/independent model; causal/hidden model.]

This type of test can be applied to resolve the types of causal relationships depicted in Figure 2.4. Application of these ideas in segregating mouse populations has led to the identification and validation of many genes causal for a number of metabolic traits, including obesity, diabetes, and heart disease. In one such population constructed between the B6 and DBA inbred strains of mouse, 111 F2 intercross animals were placed on a high-fat, atherogenic diet for 4 months at 12 months of age. All animals were genotyped using a genome-wide panel of markers, clinically characterized with respect to a number of metabolic traits, and the livers were expression profiled using a comprehensive gene expression microarray. Given the pattern of genetic association between the metabolic and gene expression traits, causal inference testing was carried out to identify the genes in this population best supported as causal for obesity-related traits (Monks et al., 2004; Schadt et al., 2005). Of the top nine genes identified in this study supported as causal for obesity-related traits, eight of the genes were ultimately experimentally validated (Millstein et al., 2009). The only gene that failed to validate was an X-linked gene that was lethal if completely knocked out and so represented a more complicated example for which the appropriate tools to validate could not be constructed.

Of course, this exact same type of reasoning can be used to causally relate imaging traits, DNA variation, and expression data to clinical phenotype data in the context of psychiatric disorders (Figure 2.5). Consider associations identified between SNP genotypes and gene expression traits assayed in dorsal-lateral prefrontal cortex (DLPFC). Given the association of SNPs with expression in DLPFC, such SNPs are of interest for testing association to functional MRI (fMRI) traits. Given a set of SNPs in which there is an association between gene expression in DLPFC, fMRI, and schizophrenia status, we can statistically model whether the relationship between the traits is causal, reactive, or independent as described above (Figure 2.5). This provides a causal statistical inference procedure applied to functional MRI and disease trait data, using DNA variation as the systematic perturbation source that can address the pressing question of whether changes in neuroimaging traits are the result of schizophrenia or whether these changes lead to the schizophrenia phenotype.
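The intersection-union rule described above for the causal inference test can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the function name `intersection_union_p` and the four example p-values are hypothetical, and this is not the interface of the CIT R package. It shows only that the combined p-value is the largest of the component p-values, so the chain is as strong as its weakest link.

```python
def intersection_union_p(component_p_values):
    """Return the p-value of an intersection-union test.

    The combined test rejects only when every component test rejects,
    so its p-value is the largest of the component p-values.
    """
    p_values = list(component_p_values)
    if not p_values:
        raise ValueError("at least one component p-value is required")
    if any(p < 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    return max(p_values)


# Hypothetical p-values for the four conditions (illustrative numbers only).
example = {
    "L associated with T": 0.001,
    "L associated with G|T": 0.004,
    "G associated with T|L": 0.0002,
    "L independent of T|G": 0.03,
}
combined = intersection_union_p(example.values())
```

With these made-up inputs the weakest component (the equivalence test for the fourth condition) determines the overall result, which is why a single poorly supported condition is enough to withhold a causal call.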

Figure 2.5 Inferring causal relationships between functional MRI traits and schizophrenia traits using SNPs that associate with the expression of genes in the dorsal-lateral prefrontal cortex as a perturbation source. The heat map represents a two-dimensional hierarchical clustering of functional MRI traits in which the highlighted cluster distinguishes schizophrenia cases from controls. Associations between functional MRI traits, gene expression, disease status, and SNP genotypes can be integrated to infer causal relationships between functional MRI traits and disease status. [Panels: heat map (VB, 600 intersection, 162 samples); causal model, reactive model, and independent model, each relating SNP, fMRI, and schizophrenia.]

FROM ASSESSING CAUSAL RELATIONSHIPS AMONG TRAIT PAIRS TO PREDICTIVE GENE NETWORKS

Leveraging DNA variation as a systematic perturbation source to resolve the causal relationships among traits is necessary but not sufficient for understanding the complexity of living systems. Cells are composed of many tens of thousands of proteins, metabolites, RNA, and DNA, all interacting in complex ways. Complex biological systems are composed of many different types of cells operating within and between many different types of tissues that make up different organ systems, all of which interact in complex ways to give rise to a vast array of phenotypes that manifest themselves in living systems. Modeling the extent of such relationships between molecular entities, between cells, and between organ systems is a daunting task. Networks are a convenient framework for representing the relationships among these different variables. In the context of biological systems, a network can be viewed as a graphical model that represents relationships among DNA, RNA, protein, metabolite, and higher order phenotypes like disease state. In this way, networks provide a way to represent extremely large-scale and complex relationships among molecular and higher order phenotypes like disease in any given context.

BUILDING FROM THE BOTTOM UP OR TOP DOWN?

Two fundamental approaches to the reconstruction of molecular networks dominate computational biology today. The first is what is referred to as the bottom-up approach, in which fundamental relationships between small sets of genes that may comprise a given pathway are established, thus providing the fundamental building blocks of higher order processes that are then constructed from the bottom up. This approach typically assumes that we have more complete knowledge regarding the fundamental topology (connectivity structure) of pathways, and, given this knowledge, models are constructed that precisely detail how changes to any component of the pathway affect other components as well as the known functions carried out by the pathway (i.e., bottom-up approaches are hypothesis driven). The second approach is referred to as a top-down approach in which we take into account all data and our existing understanding of systems and construct a model that reflects whole-system behavior, and from there tease apart the fundamental components from the top down. This approach typically assumes that our understanding of how the network is actually wired is sufficiently incomplete that we must objectively infer the relationships by considering large-scale, high-dimensional data that inform on all relationships of interest (i.e., top-down approaches are data driven).

Given our incomplete understanding of more general networks and pathways in living systems, in this chapter I focus on a top-down approach to reconstructing predictive networks, given that this type of structure learning from data is critical to derive hypotheses that cannot otherwise be efficiently proposed in the context of what is known (from the literature, pathway databases, or other such sources). However, top-down and bottom-up approaches are complementary to one

another, although these approaches have largely been pursued as separate disciplines with, interestingly, little crosstalk occurring between them. One of the future directions I discuss in the conclusion is the need to mathematically unify these two classes of predictive modeling to produce probabilistic causal networks that more fully leverage all available data and knowledge.

In the context of integrating genetic, molecular profiling, and higher order phenotypic data, biological networks are composed of nodes that represent molecular entities that are observed to vary in a given population under study (e.g., DNA variations, RNA levels, protein states, or metabolite levels). Edges between the nodes represent relationships between the molecular entities, and these edges can either be directed, indicating a cause–effect relationship, or undirected, indicating an association or interaction. For example, a DNA node in the network representing a given locus that varies in a population of interest may be connected to a transcript abundance trait, indicating that changes at the particular DNA locus induce changes in the levels of the transcript. The potentially millions of such relationships represented in a network define the overall connectivity structure of the network, or what is otherwise known as the topology of the network. Any realistic network topology will be necessarily complicated and nonlinear from the standpoint of the more classic biochemical pathway diagrams represented in textbooks and pathway databases like KEGG (Kanehisa et al., 2016). The more classic pathway view represents molecular processes on an individual level, while networks represent global (population-level) metrics that describe variation between individuals in a population of interest, which, in turn, define coherent biological processes in the tissue or cells associated with the network. One way to manage the complexity of the network structures that can arise is to impose constraints on them to make them more computationally tractable. For example, it is common when learning network structures to disallow loops or cycles in the network structure (otherwise known as the network topology, the connectivity structure of the network), in which case we refer to the network as acyclic.

The neurosciences have a rich history of employing network-based approaches to understand the complexity of the human brain and the causes of psychiatric illnesses. Resources like the Allen Brain Atlas (http://www.alleninstitute.org) provide an anatomically comprehensive map of gene expression of the human brain that can facilitate network-based analyses (Ding et al., 2016). Others have employed techniques developed for constructing gene coexpression networks to construct interaction networks on fMRI data (Mumford et al., 2010), and others still have generated protein interaction networks to reflect features of the network architecture in brains of those with illnesses such as Huntington's disease (Shirasaki et al., 2012). Larger scale efforts have also been undertaken to integrate large-scale transcriptomic data in the context of diseases like autism to understand how changes in these networks may give rise to autism or reflect the types of pathways or biological processes involved in such a disease (Voineagu et al., 2011). These efforts are important not only for better understanding psychiatric diseases, but also for elucidating novel drug targets or biomarkers that better assess disease risk or severity. However, most of these current efforts do not lead to predictive models of disease but, rather, provide a descriptive framework within which to uncover associations between a myriad of molecular, cellular, imaging, and clinical traits and disease.

AN INTEGRATIVE GENOMICS APPROACH TO CONSTRUCTIVE PREDICTIVE NETWORK MODELS

Systematically integrating different types of data into probabilistic networks using Bayesian networks has been proposed and applied for the purpose of predicting protein–protein interactions (Jansen et al., 2003) and protein function (Lee et al., 2004). However, these Bayesian networks are still based on associations between nodes in the network as opposed to causal relationships. As discussed for the simple case of two traits, from these types of networks we cannot infer whether a specific perturbation will affect a complex disease trait. To make such predictions, we need networks capable of representing causal relationships. Probabilistic causal networks are one way to model such relationships from the top down, where causality again in this context reflects a probabilistic belief that one node in the network affects the behavior of another. Bayesian networks (Pearl, 1988) are one type of probabilistic causal network that provides a natural framework for integrating highly dissimilar types of data.

Bayesian networks are directed acyclic graphs in which the edges of the graph are defined by conditional probabilities that characterize the distribution of states of each node given the state of its parents (Pearl, 1988). The network topology defines a partitioned joint probability distribution over all nodes in a network, such that the probability distribution of states of a node depends only on the states of its parent nodes: formally, a joint probability distribution $p(X)$ on a set of nodes $X$ can be decomposed as $p(X) = \prod_i p(X_i \mid \mathrm{Pa}(X_i))$, where $\mathrm{Pa}(X_i)$ represents the parent set of $X_i$. The biological networks of interest we wish to construct are composed of nodes that represent a quantitative trait such as the transcript abundance of a given gene or levels of a given metabolite. The conditional probabilities reflect not only relationships between genes, but also the stochastic nature of these relationships, as well as noise in the data used to reconstruct the network.

The aim in any network reconstruction such as this is to find the best model, that is, the model that best reflects the relationships between all of the variables under consideration, given a set of data that informs on the variables of interest. In a probabilistic sense, we want to search the space of all possible networks (or models) for the network that has the highest likelihood given the data. Bayes' formula allows us to determine the likelihood of a network model M given observed data D as a function of our prior belief that the model is correct and the probability of the observed data given the model: $P(M \mid D) \propto P(D \mid M)\,P(M)$. The number of possible network structures grows superexponentially with the number of nodes, so an exhaustive search of all possible structures to find the one best supported by the data is not feasible,

even for a relatively small number of nodes. A number of algorithms exist to find the optimal network without searching exhaustively, like Markov chain Monte Carlo (MCMC) simulation (Madigan and York, 1995). With the MCMC algorithm, optimal networks are constructed from a set of starting conditions. This algorithm is run thousands of times to identify different plausible networks, each time beginning with different starting conditions. These most plausible networks can then be combined to obtain a consensus network. For each of the reconstructions using the MCMC algorithm, the starting point is a null network. Small random changes are made to the network by flipping, adding, or deleting individual edges, ultimately accepting those changes that lead to an overall improvement in the fit of the network to the data. To assess whether a change improves the network model or not, information measures like the Bayesian Information Criterion (BIC) (Schwarz, 1978) are employed, which reduce overfitting by imposing a cost on the addition of new parameters. This is equivalent to imposing a lower prior probability $P(M)$ on models with larger numbers of parameters.

Even though edges in Bayesian networks are directed, we cannot in general infer causal relationships from the structure directly, just as I discussed in relation to the causal inference test. For a network with three nodes, $X_1$, $X_2$, and $X_3$, there are multiple groups of structures that are mathematically equivalent. For example, the three models $M_1: X_1 \to X_2, X_2 \to X_3$; $M_2: X_2 \to X_1, X_2 \to X_3$; and $M_3: X_2 \to X_1, X_3 \to X_2$ are all Markov-equivalent, meaning that they all encode the same conditional independence relationship, $X_1 \perp X_3 \mid X_2$; that is, $X_1$ and $X_3$ are independent conditional on $X_2$. In addition, these models are mathematically equivalent:

\[
\begin{aligned}
p(X) &= p(M_1 \mid D) = p(X_2 \mid X_1)\, p(X_1)\, p(X_3 \mid X_2) \\
     &= p(M_2 \mid D) = p(X_1 \mid X_2)\, p(X_2)\, p(X_3 \mid X_2) \\
     &= p(M_3 \mid D) = p(X_2 \mid X_3)\, p(X_3)\, p(X_1 \mid X_2).
\end{aligned}
\]

Thus, from correlation data alone we cannot infer from these types of structures whether $X_1$ is causal for $X_2$ or vice versa. It is worth noting, however, that there is a class of structures, V-shape structures (e.g., $M_v: X_1 \to X_2, X_3 \to X_2$), that have no Markov-equivalent structure. In such cases it is possible to infer causal relationships based on correlation data alone. Because there are more parameters to estimate in the $M_v$ model than in the $M_1$, $M_2$, or $M_3$ models, there is a large penalty in the BIC score for the $M_v$ model. Therefore, in practice, a large sample size is needed to differentiate the $M_v$ model from the $M_1$, $M_2$, or $M_3$ models.

INTEGRATING GENETIC DATA AS A STRUCTURE PRIOR TO ENHANCE CAUSAL INFERENCE IN THE BAYESIAN NETWORK RECONSTRUCTION PROCESS

In general, Bayesian networks can only be solved to Markov-equivalent structures, so it is often not possible to determine the causal direction of a link between two nodes even though Bayesian networks are directed graphs. However, the Bayesian network reconstruction algorithm can take advantage of genetic data to break the symmetry among nodes in the network that leads to Markov-equivalent structures, thereby providing a way to infer causal directions in the network in an unambiguous fashion (Zhu et al., 2004). The reconstruction algorithm can be modified to incorporate genetic data as prior evidence that two quantitative traits may be causally related based on a previously described causality test (Zhu et al., 2004). The genetic priors can be constructed from three basic sources. First, gene expression traits associated with DNA variants that are coincident with the gene's physical location (referred to as cis-acting expression quantitative trait loci or cis eQTLs) (Doss et al., 2005) are allowed to be parent nodes of genes with coincident trans eQTLs (the gene in this case does not physically reside at the genetic locus of interest), $p(\mathrm{cis} \to \mathrm{trans}) = 1$, but genes with trans eQTLs are not allowed to be parents of genes with cis eQTLs, $p(\mathrm{trans} \to \mathrm{cis}) = 0$. Second, after identifying all associations between different genetic loci and expression traits at some reasonable significance threshold, genes from this analysis with cis or trans eQTLs can be tested individually for pleiotropic effects at each of their eQTLs to determine whether any other genes in the set are driven by common eQTLs (Lum et al., 2006). If such pleiotropic effects are detected, the corresponding gene pair and locus giving rise to the pleiotropic effect can then be used to infer a causal/reactive or independent relationship based on the causality test described. If an independent relationship is inferred, then the prior probability that gene A is a parent of gene B can be scaled as

\[
p(A \to B) = 1 - \frac{\sum_i p(A \perp B \mid A, B, l_i)}{\sum_i 1},
\]

where the sums are taken over all loci used to infer the relationship. If a causal or reactive relationship is inferred, then the prior probability is scaled as

\[
p(A \to B) = \frac{2 \sum_i p(A \to B \mid A, B, l_i)}{\sum_i \left[ p(A \to B \mid A, B, l_i) + p(B \to A \mid A, B, l_i) \right]}.
\]

Finally, if the causal/reactive relationship between genes A and B cannot be determined from the first two sources, the complexity of the eQTL signature for each gene can be taken into consideration. Genes with a simpler, albeit stronger, eQTL signature (i.e., a small number of eQTLs that explain the genetic variance component for the gene, with a significant proportion of the overall variance explained by the genetic effects) can be considered as more likely to be causal compared with genes with more complex and possibly weaker eQTL signatures (i.e., a larger number of eQTLs explaining the genetic variance component for the gene, with less of the overall variance explained by the genetic effects). The structure prior that gene A is a parent of gene B can then be taken to be
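The Markov-equivalence argument made earlier for the three-node models can be checked numerically. The sketch below is a minimal illustration, assuming binary nodes and made-up conditional probability tables (all numbers and helper names such as `marginal` are hypothetical). It builds the joint distribution from the chain factorization and then refactors it in the two alternative orderings, showing that all three describe the same distribution.

```python
from itertools import product
from math import isclose

STATES = (0, 1)

# Hypothetical conditional probability tables for the chain X1 -> X2 -> X3.
p_x1 = {0: 0.7, 1: 0.3}
p_x2_given_x1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}    # p(x2 | x1)
p_x3_given_x2 = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}  # p(x3 | x2)

# Joint under the first ordering: p(x1) p(x2 | x1) p(x3 | x2).
joint_m1 = {
    (x1, x2, x3): p_x1[x1] * p_x2_given_x1[x1][x2] * p_x3_given_x2[x2][x3]
    for x1, x2, x3 in product(STATES, repeat=3)
}


def marginal(joint, axes):
    """Sum the joint over every variable whose index is not in `axes`."""
    out = {}
    for states, prob in joint.items():
        key = tuple(states[i] for i in axes)
        out[key] = out.get(key, 0.0) + prob
    return out


p2 = marginal(joint_m1, (1,))
p3 = marginal(joint_m1, (2,))
p12 = marginal(joint_m1, (0, 1))
p23 = marginal(joint_m1, (1, 2))

# Second ordering: p(x2) p(x1 | x2) p(x3 | x2).
joint_m2 = {
    (x1, x2, x3): p2[(x2,)] * (p12[(x1, x2)] / p2[(x2,)]) * (p23[(x2, x3)] / p2[(x2,)])
    for x1, x2, x3 in product(STATES, repeat=3)
}

# Third ordering: p(x3) p(x2 | x3) p(x1 | x2).
joint_m3 = {
    (x1, x2, x3): p3[(x3,)] * (p23[(x2, x3)] / p3[(x3,)]) * (p12[(x1, x2)] / p2[(x2,)])
    for x1, x2, x3 in product(STATES, repeat=3)
}

same_distribution = all(
    isclose(joint_m1[k], joint_m2[k]) and isclose(joint_m1[k], joint_m3[k])
    for k in joint_m1
)
```

Because the three factorizations agree on every state, observational data generated from any one of them cannot favor it over the others, which is the symmetry that the genetic priors are meant to break.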

07:33:44.
Another random document with
no related content on Scribd:
which the proposition expresses. Mersenne further proceeds to show
the effect of thickness and tension. He finds (Prop. 7) that a string
must be four times as thick as another, to give the octave below; he
finds, also (Prop. 8), that the tension must be about four times as
great in order to produce the octave above. From these proportions
various others are deduced, and the law of the phenomena of this
kind may be considered as determined. Mersenne also undertook to
measure the phenomena numerically, that is to determine the
number of vibrations of the string in each of such cases; which at
first might appear difficult, since it is obviously impossible to count
with the eye the passages of a sounding string backwards and
forwards. But Mersenne rightly assumed, that the number of
vibrations is the same so long as the tone is the same, and that the
ratios of the numbers of vibrations of different strings may be
determined from the numerical relations of their notes. He had,
therefore, only to determine the number of vibrations of one certain
string, or one known note, to know those of all others. He took a
musical string of three-quarters of a foot long, stretched with a
weight of six pounds and five eighths, which he found gave him by
its vibrations a certain standard note in his organ: he found that a
string of the same material and tension, fifteen feet, that is, twenty
times as long, made ten recurrences in a second; and he inferred
that the number of vibrations of the shorter string must also be
twenty times as great; and thus such a string must make in one
second of time two hundred vibrations.
5 Hist. Son. et Aud. vol. ix. p. 71.

6 L. i. Prop. 15.

7 L. ii. Prop. 6.
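Mersenne's arithmetic can be retraced in a short sketch. It assumes only what the passage states: that for strings of the same material and tension the number of vibrations is inversely proportional to the length. The function name `scaled_vibrations` is a hypothetical helper introduced for illustration.

```python
from fractions import Fraction


def scaled_vibrations(reference_rate, reference_length, target_length):
    """Vibrations per second of a string of the same material and tension,
    assuming the rate is inversely proportional to the length."""
    return reference_rate * Fraction(reference_length) / Fraction(target_length)


long_length = Fraction(15)       # feet: the string slow enough to count
short_length = Fraction(3, 4)    # feet: the string sounding the standard note
length_ratio = long_length / short_length

# The long string made ten recurrences in a second.
rate = scaled_vibrations(10, long_length, short_length)
```

Here `length_ratio` is the factor of twenty mentioned in the text, and `rate` is the inferred number of vibrations per second of the shorter string.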
This determination of Mersenne does not appear to have attracted
due notice; but some time afterwards attempts were made to
ascertain the connexion between the sound and its elementary
pulsations in a more direct manner. Hooke, in 1681, produced
sounds by the striking of the teeth of brass wheels, 8 and Stancari, in
1706, by whirling round a large wheel in air, showed, before the
Academy of Bologna, how the number of vibrations in a given note
might be known. Sauveur, who, though deaf for the first seven years
of his life, was one of the greatest promoters of the science of sound,
and gave it its name of Acoustics, endeavored also, about the same
time, to determine the number of vibrations of a standard note, or, as
he called it, Fixed Sound. He employed two methods, both ingenious
and both indirect. The first was the method of beats. Two organ-
pipes, which form a discord, are often heard to produce a kind of
howl, or wavy noise, the sound swelling and declining at small
intervals of time. This was readily and rightly ascribed to the
coincidences of the pulsations of sound of the two notes after certain
cycles. Thus, if the number of vibrations of the notes were as fifteen
to sixteen in the same time, every fifteenth vibration of the one would
coincide with every sixteenth vibration of the other, while all the
intermediate vibrations of the two tones would, in various degrees,
disagree with each other; and thus every such cycle, of fifteen and
sixteen vibrations, might be heard as a separate beat of sound. Now,
Sauveur wished to take a case in which these beats were so slow as
to be counted, 9 and in which the ratio of the vibrations of the notes
was known from a knowledge of their musical relations. Thus if the
two notes form an interval of a semitone, their ratio will be that above
supposed, fifteen to sixteen; and if the beats be found to be six in a
second, we know that, in that time, the graver note makes ninety and
the acuter ninety-six vibrations. In this manner Sauveur found that an
open organ-pipe, five feet long, gave one hundred vibrations in a
second.
8 Life, p. xxiii.

9 Ac. Sc. Hist. 1700, p. 131.
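Sauveur's method of beats reduces to a small calculation, sketched below under the assumption used in the passage: one beat is heard per cycle in which the two notes realign, so the number of beats per second equals the difference between the two vibration rates. The function name `notes_from_beats` is hypothetical.

```python
from fractions import Fraction


def notes_from_beats(beats_per_second, low_ratio, high_ratio):
    """Recover the vibration rates of two notes from their beat rate.

    The notes vibrate in the ratio low_ratio : high_ratio, and the beat
    rate is assumed to equal the difference of the two vibration rates.
    """
    if high_ratio <= low_ratio:
        raise ValueError("high_ratio must exceed low_ratio")
    unit = Fraction(beats_per_second, high_ratio - low_ratio)
    return unit * low_ratio, unit * high_ratio


# A semitone (fifteen to sixteen) beating six times in a second.
graver, acuter = notes_from_beats(6, 15, 16)
```

With the semitone example from the text, `graver` and `acuter` come out as the ninety and ninety-six vibrations per second that Sauveur deduced.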

Sauveur’s other method is more recondite, and approaches to a
mechanical view of the question. 10 He proceeded on this basis; a
string, horizontally stretched, cannot be drawn into a mathematical
straight line, but always hangs in a very flat curve, or festoon. Hence
Sauveur assumed that its transverse vibrations may be conceived to
be identical with the lateral swingings of such a festoon. Observing
that the string C, in the middle of a harpsichord, hangs in such a
festoon to the amount of 1⁄323rd of an inch, he calculates, by the laws
of pendulums, the time of oscillation, and finds it 1⁄122nd of a second.
Thus this C, his fixed note, makes one hundred and twenty-two
vibrations in a second. It is curious that this process, seemingly so
arbitrary, is capable of being justified on mechanical principles;
though we can hardly give the author credit for the views which this
justification implies. It is, therefore, easy to understand that it agreed
with other experiments, in the laws which it gave for the dependence
of the tone on the length and tension.
10 Ac. Sc. Hist. 1713.

The problem of satisfactorily explaining this dependence, on
mechanical principles, naturally pressed upon the attention of
mathematicians when the law of the phenomena was thus
completely determined by Mersenne and Sauveur. It was desirable
to show that both the circumstances and the measure of the
phenomena were such as known mechanical causes and laws would
explain. But this problem, as might be expected, was not attacked till
mechanical principles, and the modes of applying them, had become
tolerably familiar.

As the vibrations of a string are produced by its tension, it
appeared to be necessary, in the first place, to determine the law of
the tension which is called into action by the motion of the string;
for it is manifest that, when the string is drawn aside from the straight
line into which it is stretched, there arises an additional tension,
which aids in drawing it back to the straight line as soon as it is let
go. Hooke (On Spring, 1678) determined the law of this additional
tension, which he expressed in his noted formula, “Ut tensio sic vis,”
the Force is as the Tension; or rather, to express his meaning more
clearly, the Force of tension is as the Extension, or, in a string, as the
increase of length. But, in reality, this principle, which is important in
many acoustical problems, is, in the one now before us, unimportant;
the force which urges the string towards the straight line, depends,
with such small extensions as we have now to consider, not on the
extension, but on the curvature; and the power of treating the
mathematical difficulty of curvature, and its mechanical
consequences, was what was requisite for the solution of this
problem.

The problem, in its proper aspect, was first attacked and mastered
by Brook Taylor, an English mathematician of the school of Newton,
by whom the solution was published in 1715, in his Methodus
Incrementorum. Taylor’s solution was indeed imperfect, for it only
pointed out a form and a mode of vibration, with which the string
might move consistently with the laws of mechanics; not the mode in
which it must move, supposing its form to be any whatever. It
showed that the curve might be of the nature of that which is called
the companion to the cycloid; and, on the supposition of the curve of
the string being of this form, the calculation confirmed the previously
established laws by which the tone, or the time of vibration, had
been discovered to depend on the length, tension, and bulk of the
string. The mathematical incompleteness of Taylor’s reasoning must
not prevent us from looking upon his solution of the problem as the
most important step in the progress of this part of the subject: for the
difficulty of applying mechanical principles to the question being
once overcome, the extension and correction of the application was
sure to be undertaken by succeeding mathematicians; and,
accordingly, this soon happened. We may add, moreover, that the
subsequent and more general solutions require to be considered
with reference to Taylor’s, in order to apprehend distinctly their
import; and further, that it was almost evident to a mathematician,
even before the general solution had appeared, that the dependence
of the time of vibration on the length and tension, would be the same
in the general case as in the Taylorian curve; so that, for the ends
of physical philosophy, the solution was not very incomplete.
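
The dependence of the time of vibration on the length, tension, and bulk of the string may be put, for the reader's convenience, in modern notation. What follows is only a sketch: the symbols f, L, T, and \mu are modern conveniences assumed here, and are not Taylor's own.

```latex
% f   : number of vibrations in a unit of time (the reciprocal of the time of vibration)
% L   : length of the string
% T   : tension of the string
% \mu : mass of the string per unit of length (its "bulk")
f = \frac{1}{2L}\sqrt{\frac{T}{\mu}}
```

Thus halving the length, or quadrupling the tension, doubles the number of vibrations and raises the note an octave.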

John Bernoulli, a few years afterwards, 11 solved the problem of
vibrating chords on nearly the same principles and suppositions as
Taylor; but a little later (in 1747), the next generation of great
mathematicians, D’Alembert, Euler, and Daniel Bernoulli, applied the
increased powers of analysis to give generality to the mode of
treating this question; and especially the calculus of partial
differentials, invented for this purpose. But at this epoch, the
discussion, so far as it bore on physics, belonged rather to the
history of another problem, which comes under our notice hereafter,
that of the composition of vibrations; we shall, therefore, defer the
further history of the problem of vibrating strings, till we have to
consider it in connexion with new experimental facts.
11 Op. iii. p. 207.

CHAPTER III.

Problem of the Propagation of Sound.

WE have seen that the ancient philosophers, for the most part,
held that sound was transmitted, as well as produced, by some
motion of the air, without defining what kind of motion this was; that
some writers, however, applied to it a very happy similitude, the
expansive motion of the circular waves produced by throwing a
stone into still water; but that notwithstanding, some rejected this
mode of conception, as, for instance, Bacon, who ascribed the
transmission of sound to certain “spiritual species.”

Though it was an obvious thought to ascribe the motion of sound
to some motion of air; to conceive what kind of motion could and did
produce this effect, must have been a matter of grave perplexity at
the time of which we are speaking; and is far from easy to most
persons even now. We may judge of the difficulty of forming this
conception, when we recollect that John Bernoulli the younger 12
declared, that he could not understand Newton’s proposition on this
subject. The difficulty consists in this; that the movement of the parts
of air, in which sound consists, travels along, but that the parts of
air themselves do not so travel. Accordingly Otto Guericke, 13 the
inventor of the air-pump, asks, “How can sound be conveyed by the
motion of the air? when we find that it is better conveyed through air
that is still, than when there is a wind.” We may observe, however,
that he was partly misled by finding, as he thought, that a bell could
be heard in the vacuum of his air-pump; a result which arose,
probably, from some imperfection in his apparatus.
12 Prize Dis. on Light, 1736.
13 De Vac. Spat. p. 138.

Attempts were made to determine, by experiment, the
circumstances of the motion of sound; and especially its velocity.
Gassendi 14 was one of the first who did this. He employed fire-arms
for the purpose, and thus found the velocity to be 1473 Paris feet in
a second. Roberval found a velocity so small (560 feet) that it threw
uncertainty upon the rest, and affected Newton’s reasonings
subsequently. 15 Cassini, Huyghens, Picard, Römer, found a velocity
of 1172 Paris feet, which is more accurate than the former. Gassendi
had been surprised to find that the velocity with which sounds travel,
is the same whether they are loud or gentle.
14 Fischer, Gesch. d. Physik. vol. i. 171.

15 Newt. Prin. B. ii. P. 50, Schol.

The explanation of this constant velocity of sound, and of its
amount, was one of the problems of which a solution was given in
the Great Charter of modern science, Newton’s Principia (1687).
There, for the first time, were explained the real nature of the
motions and mutual action of the parts of the air through which
sound is transmitted. It was shown 16 that a body vibrating in an
elastic medium, will propagate pulses through the medium; that is,
the parts of the medium will move forwards and backwards, and this
motion will affect successively those parts which are at a greater and
greater distance from the origin of motion. The parts, in going
forwards, produce condensation; in returning to their first places,
they allow extension; and the play of the elasticities developed by
these expansions and contractions, supplies the forces which
continue to propagate the motion.
16 Newt. Prin. B. ii. P. 43.

The idea of such a motion as this, is, as we have said, far from
easy to apprehend distinctly: but a distinct apprehension of it is a
step essential to the physical part of the sciences now under notice;
for it is by means of such pulses, or undulations, that not only sound,
but light, and probably heat, are propagated. We constantly meet
with evidence of the difficulty which men have in conceiving this
undulatory motion, and in separating it from a local motion of the
medium as a mass. For instance, it is not easy at first to conceive
the waters of a great river flowing constantly down towards the sea,
while waves are rolling up the very same part of the stream; and
while the great elevation, which makes the tide, is travelling from the
sea perhaps with a velocity of fifty miles an hour. The motion of such
a wave, or elevation, is distinct from any stream, and is of the nature
of undulations in general. The parts of the fluid stir for a short time
and for a small distance, so as to accumulate themselves on a
neighboring part, and then retire to their former place; and this
movement affects the parts in the order of their places. Perhaps if
the reader looks at a field of standing corn when gusts of wind are
sweeping over it in visible waves, he will have his conception of this
matter aided; for he will see that here, where each ear of grain is
anchored by its stalk, there can be no permanent local motion of the
substance, but only a successive stooping and rising of the separate
straws, producing hollows and waves, closer and laxer strips of the
crowded ears.

Newton had, moreover, to consider the mechanical consequences
which such condensations and rarefactions of the elastic medium,
air, would produce in the parts of the fluid itself. Employing known
laws of the elasticity of air, he showed, in a very remarkable
proposition, 17 the law according to which the particles of air might
vibrate. We may observe, that in this solution, as in that of the
vibrating string already mentioned, a rule was exhibited according to
which the particles might oscillate, but not the law to which they must
conform. It was proved that, by taking the motion of each particle to
be perfectly similar to that of a pendulum, the forces, developed by
contraction and expansion, were precisely such as the motion
required; but it was not shown that no other type of oscillation would
give rise to the same accordance of force and motion. Newton’s
reasoning also gave a determination of the speed of propagation of
the pulses: it appeared that sound ought to travel with the velocity
which a body would acquire by falling freely through half the height
of a homogeneous atmosphere; “the height of a homogeneous
atmosphere” being the height which the air must have, in order to
produce, at the earth’s surface, the actual atmospheric pressure,
supposing no diminution of density to take place in ascending. This
height is about 29,000 feet; and hence it followed that the velocity
was 968 feet. This velocity is really considerably less than that of
sound; but at the time of which we speak, no accurate measure
had been established; and Newton persuaded himself, by
experiments made in the cloister of Trinity College, his residence,
that his calculation was not far from the fact. When, afterwards, more
exact experiments showed the velocity to be 1142 English feet,
Newton attempted to explain the difference by various
considerations, none of which were adequate to the purpose;—as,
the dimensions of the solid particles of which the fluid air consists;—
or the vapors which are mixed with it. Other writers offered other
suggestions; but the true solution of the difficulty was reserved for a
period considerably subsequent.
17 Princ. B. ii. P. 48.
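
Newton's rule for the velocity admits of a short numerical check. The sketch below is a modern illustration and not Newton's own computation: the value taken for gravity (32.2 English feet per second in each second) is an assumption supplied here, and the function name is hypothetical. With the height of 29,000 feet stated above, it gives a velocity a little under 970 feet a second, near the 968 of the text, and falling short of the measured 1142 feet by something less than one-sixth.

```python
import math

# Assumed value of gravity in English feet per second squared (not given in the text).
G_FT_PER_S2 = 32.2


def newton_sound_speed(height_ft, g=G_FT_PER_S2):
    """Velocity acquired by a body falling freely through half of height_ft.

    A body falling from rest through a distance d acquires the velocity
    sqrt(2 * g * d); with d = height_ft / 2 this reduces to sqrt(g * height_ft).
    """
    return math.sqrt(2.0 * g * (height_ft / 2.0))


# Height of the homogeneous atmosphere and measured velocity, both as stated in the text.
speed = newton_sound_speed(29_000.0)
shortfall = (1142.0 - speed) / 1142.0

print(f"calculated velocity: {speed:.0f} feet per second")
print(f"shortfall against 1142 feet: {shortfall:.1%}")
```
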
Newton’s calculation of the motion of sound, though logically
incomplete, was the great step in the solution of the problem; for
mathematicians could not but presume that his result was not
restricted to the hypothesis on which he had obtained it; and the
extension of the solution required only ordinary talents. The
logical defect of his solution was assailed, as might have been
expected. Cramer (professor at Geneva), in 1741, conceived that
he was destroying the conclusiveness of Newton’s reasoning, by
showing that it applied equally to other modes of oscillation. This,
indeed, contradicted the enunciation of the 48th Prop. of the Second
Book of the Principia; but it confirmed and extended all the general
results of the demonstration; for it left even the velocity of sound
unaltered, and thus showed that the velocity did not depend
mechanically on the type of the oscillation. But the satisfactory
establishment of this physical generalization was to be supplied from
the vast generalizations of analysis, which mathematicians were now
becoming able to deal with. Accordingly this task was performed by
the great master of analytical generalization, Lagrange, in 1759,
when, at the age of twenty-three, he and two friends published the
first volume of the Turin Memoirs. Euler, as his manner was, at once
perceived the merit of the new solution, and pursued the subject on
the views thus suggested. Various analytical improvements and
extensions were introduced into the solution by the two great
mathematicians; but none of these at all altered the formula by which
the velocity of sound was expressed; and the discrepancy between
calculation and observation, about one-sixth of the whole, which had
perplexed Newton, remained still unaccounted for.

The merit of satisfactorily explaining this discrepancy belongs to
Laplace. He was the first to remark 18 that the common law of the
changes of elasticity in the air, as dependent on its compression,
cannot be applied to those rapid vibrations in which sound consists,
since the sudden compression produces a degree of heat which
additionally increases the elasticity. The ratio of this increase
depended on the experiments by which the relation of heat and air is
established. Laplace, in 1816, published 19 the theorem on which the
correction depends. On applying it, the calculated velocity of sound
agreed very closely with the best antecedent experiments, and was
confirmed by more exact ones instituted for that purpose.
18 Méc. Cél. t. v. l. xii. p. 96.

19 Ann. Phys. et Chim. t. iii. p. 288.
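
The effect of Laplace's correction may be illustrated numerically. The following is a sketch under stated assumptions, not a reproduction of Laplace's theorem: it supposes that the heat of sudden compression multiplies the elasticity by a constant ratio, taken here at the modern figure of about 1.4 for air, so that the velocity is multiplied by the square root of that ratio. The ratio, the value of gravity, and the function names are all supplied here and do not come from the text.

```python
import math

# Assumptions supplied for illustration; none of these figures is stated in the text
# except the height of the homogeneous atmosphere.
G_FT_PER_S2 = 32.2    # gravity, English feet per second squared (assumed)
HEIGHT_FT = 29_000.0  # height of the homogeneous atmosphere (from the text)
HEAT_RATIO = 1.4      # assumed ratio by which heat increases the elasticity of air


def uncorrected_speed(height_ft, g=G_FT_PER_S2):
    """Velocity by Newton's rule, with no allowance for the heat of compression."""
    return math.sqrt(g * height_ft)


def corrected_speed(height_ft, heat_ratio=HEAT_RATIO, g=G_FT_PER_S2):
    """Velocity when the elasticity is increased in the ratio heat_ratio."""
    return math.sqrt(heat_ratio) * uncorrected_speed(height_ft, g)


print(f"uncorrected: {uncorrected_speed(HEIGHT_FT):.0f} feet per second")
print(f"corrected:   {corrected_speed(HEIGHT_FT):.0f} feet per second")
```

On these assumptions the corrected velocity comes within a few feet of the 1142 English feet found by experiment.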

This step completes the solution of the problem of the propagation
of sound, as a mathematical induction, obtained from, and verified
by, facts. Most of the discussions concerning points of analysis to
which the investigations on this subject gave rise, as, for instance,
the admissibility of discontinuous functions into the solutions of
partial differential equations, belong to the history of pure
mathematics. Those which really concern the physical theory of
sound may be referred to the problem of the motion of air in tubes, to
which we shall soon have to proceed; but we must first speak of
another form which the problem of vibrating strings assumed.

It deserves to be noticed that the ultimate result of the study of the
undulations of fluids seems to show that the comparison of the
motion of air in the diffusion of sound with the motion of circular
waves from a centre in water, which is mentioned at the beginning of
this chapter, though pertinent in a certain way, is not exact. It
appears by Mr. Scott’s recent investigations concerning waves, 20
that the circular waves are oscillating waves of the Second order,
and are gregarious. The sound-wave seems rather to resemble the
great solitary Wave of Translation of the First order, of which we
have already spoken in Book vi. chapter vi.
20 Brit. Ass. Reports for 1844, p. 361.

CHAPTER IV.

Problem of different Sounds of the same String.

IT had been observed at an early period of acoustical knowledge,
that one string might give several sounds. Mersenne and others
had noticed 21 that when a string vibrates, one which is in unison with
it vibrates without being touched. He was also aware that this was
true if the second string was an octave or a twelfth below the first.
This was observed as a new fact in England in 1674, and
communicated to the Royal Society by Wallis. 22 But the later
observers ascertained further, that the longer string divides itself into
two, or into three equal parts, separated by nodes, or points of rest;
this they proved by hanging bits of paper on different parts of the
string. The discovery so modified was again made by Sauveur 23
about 1700. The sounds thus produced in one string by the vibration
of another, have been termed Sympathetic Sounds. Similar sounds
are often produced by performers on stringed instruments, by
touching the string at one of its aliquot divisions, and are then called
the Acute harmonics. Such facts were not difficult to explain on
Taylor’s view of the mechanical condition of the string; but the
difficulty was increased when it was noticed that a sounding body
could produce these different notes at the same time. Mersenne had
remarked this, and the fact was more distinctly observed and
pursued by Sauveur. The notes thus produced in addition to the
genuine note of the string, have been called Secondary Notes; those
usually heard are, the Octave, the Twelfth, and the Seventeenth
above the note itself. To supply a mode of conceiving distinctly, and
explaining mechanically, vibrations which should allow of such an
effect, was therefore a requisite step in acoustics.
21 Harm. lib. iv. Prop. 28 (1636).

22 Ph. Tr. 1677, April.

23 A. P. 1701.

This task was performed by Daniel Bernoulli in a memoir
published in 1755. 24 He there stated and proved the Principle of the
coexistence of small vibrations. It was already established, that a
string might vibrate either in a single swelling (if we use this word to
express the curve between two nodes which Bernoulli calls a
ventre), or in two or three or any number of equal swellings with
immoveable nodes between. Daniel Bernoulli showed further, that
these nodes might be combined, each taking place as if it were the
only one. This appears sufficient to explain the coexistence of the
harmonic sounds just noticed. D’Alembert, indeed, in the article
Fundamental in the French Encyclopédie, and Lagrange in his
Dissertation on Sound in the Turin Memoirs, 25 offer several
objections to this explanation; and it cannot be denied that the
subject has its difficulties; but still these do not deprive Bernoulli of
the merit of having pointed out the principle of Coexistent Vibrations,
or divest that principle of its value in physical science.
24 Berlin Mem. 1753, p. 147.

25 T. i. pp. 64, 103.

Daniel Bernoulli’s Memoir, of which we speak, was published at a
period when the clouds which involve the general analytical
treatment of the problem of vibrating strings, were thickening about
Euler and D’Alembert, and darkening into a controversial hue; and
as Bernoulli ventured to interpose his view, as a solution of these
difficulties, which, in a mathematical sense, it is not, we can hardly
be surprised that he met with a rebuff. The further prosecution of the
different modes of vibration of the same body need not be here
considered.

The sounds which are called Grave Harmonics, have no analogy
with the Acute Harmonics above-mentioned; nor do they belong to
this section; for in the case of Grave Harmonics, we have one sound
from the co-operation of two strings, instead of several sounds from
one string. These harmonics are, in fact, connected with beats, of
which we have already spoken; the beats becoming so close as to
produce a note of definite musical quality. The discovery of the
Grave Harmonics is usually ascribed to Tartini, who mentions them
in 1754; but they are first noticed 26 in the work of Sorge On tuning
Organs, 1744. He there expresses this discovery in a query.
“Whence comes it, that if we tune a fifth (2 : 3), a third sound is
faintly heard, the octave below the lower of the two notes? Nature
shows that with 2 : 3, she still requires the unity, to perfect the order
1, 2, 3.” The truth is, that these numbers express the frequency of
the vibrations, and thus there will be coincidences of the notes 2 and
3, which are of the frequency 1, and consequently give the octave
below the sound 2. This is the explanation given by Lagrange, 27 and
is indeed obvious.
26 Chladni. Acoust. p. 254.

27 Mem. Tur. i. p. 104.
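
The arithmetic of this explanation may be shown in a few lines. The sketch below follows the account attributed above to Lagrange, in which the grave harmonic answers to the rate at which the vibrations of the two notes coincide; that rate is the greatest common measure of the two frequencies. The particular frequencies of 200 and 300 vibrations in a second, and the function name, are hypothetical values chosen only to give a fifth in the ratio 2 : 3.

```python
from math import gcd


def coincidence_frequency(lower, upper):
    """Rate at which the vibrations of two notes fall together.

    Both frequencies are whole numbers of vibrations in a unit of time;
    their vibrations coincide as often as the greatest common divisor.
    """
    return gcd(lower, upper)


# A hypothetical fifth in the ratio 2 : 3.
lower_note, upper_note = 200, 300
grave = coincidence_frequency(lower_note, upper_note)

# The ratio of the lower note to the grave harmonic; 2 means an octave below.
print(grave, lower_note // grave)
```

For the fifth, the rate of coincidence is also the difference of the two frequencies, and stands to the lower note as 1 to 2, which is the octave below.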


CHAPTER V.

Problem of the Sounds of Pipes.

IT was taken for granted by those who reasoned on sounds, that
the sounds of flutes, organ-pipes, and wind-instruments in general,
consisted in vibrations of some kind; but to determine the nature
and laws of these vibrations, and to reconcile them with mechanical
principles, was far from easy. The leading facts which had been
noticed were, that the note of a pipe was proportional to its length,
and that a flute and similar instruments might be made to produce
some of the acute harmonics, as well as the genuine note. It had
further been noticed, 28 that pipes closed at the end, instead of giving
the series of harmonics 1, ½, ⅓, ¼, &c., would give only those notes
which answer to the odd numbers 1, ⅓, ⅕, &c. In this problem also,
Newton 29 made the first step to the solution. At the end of the
propositions respecting the velocity of sound, of which we have
spoken, he noticed that it appeared by taking Mersenne’s or
Sauveur’s determination of the number of vibrations corresponding
to a given note, that the pulse of air runs over twice the length of the
pipe in the time of each vibration. He does not follow out this
observation, but it obviously points to the theory, that the sound of a
pipe consists of pulses which travel back and forwards along its
length, and are kept in motion by the breath of the player. This
supposition would account for the observed dependence of the note
on the length of the pipe. The subject does not appear to have been
again taken up in a theoretical way till about 1760; when Lagrange in
the second volume of the Turin Memoirs, and D. Bernoulli in the
Memoirs of the French Academy for 1762, published important
essays, in which some of the leading facts were satisfactorily
explained, and which may therefore be considered as the principal
solutions of the problem.
28 D. Bernoulli, Berlin. Mem. 1753, p. 150.

29 Princip. Schol. Prop. 50.

In these solutions there was necessarily something hypothetical.
In the case of vibrating strings, as we have seen, the Form of the
vibrating curve was guessed at only, but the existence and position
of the Nodes could be rendered visible to the eye. In the vibrations of
air, we cannot see either the places of nodes, or the mode of
vibration; but several of the results are independent of these
circumstances. Thus both of the solutions explain the fact, that a
tube closed at one end is in unison with an open tube of double the
length; and, by supposing nodes to occur, they account for the
existence of the odd series of harmonics alone, 1, 3, 5, in closed
tubes, while the whole series, 1, 2, 3, 4, 5, &c., occurs in open ones.
Both views of the nature of the vibration appear to be nearly the
same; though Lagrange’s is expressed with an analytical generality
which renders it obscure, and Bernoulli has perhaps laid down an
hypothesis more special than was necessary. Lagrange 30 considers
the vibration of open flutes as “the oscillations of a fibre of air,” under
the condition that its elasticity at the two ends is, during the whole
oscillation, the same as that of the surrounding atmosphere.
Bernoulli supposes 31 the whole inertia of the air in the flute to be
collected into one particle, and this to be moved by the whole
elasticity arising from this displacement. It may be observed that
both these modes of treating the matter come very near to what we
have stated as Newton’s theory; for though Bernoulli supposes all
the air in the flute to be moved at once, and not successively, as by
Newton’s pulse, in either case the whole elasticity moves the whole
air in the tube, and requires more time to do this according to its
quantity. Since that time, the subject has received further
mathematical development from Euler, 32 Lambert, 33 and
Poisson; 34 but no new explanation of facts has arisen. Attempts
have however been made to ascertain experimentally the places of
the nodes. Bernoulli himself had shown that this place was affected
by the amount of the opening, and Lambert 35 had examined other
cases with the same view. Savart traced the node in various musical
pipes under different conditions; and very recently Mr. Hopkins, of
Cambridge, has pursued the same experimental inquiry. 36 It appears
from these researches, that the early assumptions of
mathematicians with regard to the position of the nodes, are not
exactly verified by the facts. When the air in a pipe is made to vibrate
so as to have several nodes which divide it into equal parts, it had
been supposed by acoustical writers that the part adjacent to the
open end was half of the other parts; the outermost node, however,
is found experimentally to be displaced from the position thus
assigned to it, by a quantity depending on several collateral
circumstances.
30 Mém. Turin, vol. ii. p. 154.

31 Mém. Berlin, 1753, p. 446.

32 Nov. Act. Petrop. tom. xvi.

33 Acad. Berlin, 1775.

34 Journ. Ec. Polyt. cap. 14.

35 Acad. Berlin, 1775.

36 Camb. Trans. vol. v. p. 234.
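
The relations between open and closed pipes described above may be sketched numerically. This is an idealized illustration only: it supposes, with the early writers, that the pulse runs over twice the length of an open pipe, and four times the length of a closed one, in the time of each vibration, and it neglects the displacement of the outermost node which experiment has since detected. The velocity of 1142 feet is that quoted earlier; the lengths of the pipes and the function names are hypothetical.

```python
import math


def open_pipe_notes(length_ft, speed_ft_s, count):
    """Frequencies of the first `count` notes of an idealized open pipe (series 1, 2, 3, ...)."""
    return [k * speed_ft_s / (2.0 * length_ft) for k in range(1, count + 1)]


def closed_pipe_notes(length_ft, speed_ft_s, count):
    """Frequencies of the first `count` notes of an idealized closed pipe (series 1, 3, 5, ...)."""
    return [(2 * k - 1) * speed_ft_s / (4.0 * length_ft) for k in range(1, count + 1)]


SPEED_FT_S = 1142.0  # velocity of sound in English feet per second, as quoted in the text

closed = closed_pipe_notes(2.0, SPEED_FT_S, 3)    # hypothetical closed pipe of two feet
open_double = open_pipe_notes(4.0, SPEED_FT_S, 3) # open pipe of double the length

print("closed pipe:", closed)
print("open pipe of double length:", open_double)
print("in unison:", math.isclose(closed[0], open_double[0]))
```

The lowest notes of the two pipes agree, while the higher notes of the closed pipe stand to its lowest as 3 and 5 to 1, and those of the open pipe as 2 and 3 to 1.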


Since our purpose was to consider this problem only so far as it
has tended towards its mathematical solution, we have avoided
saying anything of the dependence of the mode of vibration on the
cause by which the sound is produced; and consequently, the
researches on the effects of reeds, embouchures, and the like, by
Chladni, Savart, Willis, and others, do not belong to our subject. It is
easily seen that the complex effect of the elasticity and other
properties of the reed and of the air together, is a problem of which
we can hardly hope to give a complete solution till our knowledge
has advanced much beyond its present condition.

Indeed, in the science of Acoustics there is a vast body of facts to
which we might apply what has just been said; but for the sake of
pointing out some of them, we shall consider them as the subjects of
one extensive and yet unsolved problem.
