
Neuroscience Data in the Cloud:

Opportunities and Challenges:


Proceedings of a Workshop 1st Edition
National Academies Of Sciences
Neuroscience Data
in the Cloud
Opportunities and Challenges

PROCEEDINGS OF A WORKSHOP

Lisa Bain, Amanda Wagner Gee, and Clare Stroud, Rapporteurs

Forum on Neuroscience and Nervous System Disorders

Board on Health Sciences Policy

Health and Medicine Division


THE NATIONAL ACADEMIES PRESS
Washington, DC
www.nap.edu
THE NATIONAL ACADEMIES PRESS 500 Fifth Street, NW Washington, DC
20001

This activity was supported by contracts between the National Academy of Sciences and the
Alzheimer’s Association; Cohen Veterans Bioscience; Department of Health and Human
Services’ Food and Drug Administration (5R13FD005362-05) and National Institutes of
Health (NIH) (75N98019F00769 [Under Master Base HHSN263201800029I]) through the
National Center for Complementary and Integrative Health, National Eye Institute, National
Institute of Mental Health, National Institute of Neurological Disorders and Stroke, National
Institute on Aging, National Institute on Alcohol Abuse and Alcoholism, National Institute on
Drug Abuse, and NIH Blueprint for Neuroscience Research; Department of Veterans Affairs
(VA240-14-C-0057); Eisai Inc.; Eli Lilly and Company; Foundation for the National Institutes
of Health; Gatsby Charitable Foundation; Janssen Research & Development, LLC; The Kavli
Foundation; Lundbeck Research USA; Merck Research Laboratories; The Michael J. Fox
Foundation for Parkinson’s Research; National Multiple Sclerosis Society; National Science
Foundation (BCS-1064270); One Mind; Sanofi; Society for Neuroscience; Takeda
Pharmaceuticals International, Inc.; The University of Rhode Island; and Wellcome Trust.
Any opinions, findings, conclusions, or recommendations expressed in this publication do
not necessarily reflect the views of any organization or agency that provided support for the
project.

International Standard Book Number-13: 978-0-309-67055-5


International Standard Book Number-10: 0-309-67055-1
Digital Object Identifier: https://doi.org/10.17226/25653
Epub ISBN: 978-0-309-67058-6

Additional copies of this publication are available from the National Academies Press, 500
Fifth Street, NW, Keck 360, Washington, DC 20001; (800) 624-6242 or (202) 334-3313;
http://www.nap.edu.

Copyright 2020 by the National Academy of Sciences. All rights reserved.

Printed in the United States of America

Suggested citation: National Academies of Sciences, Engineering, and Medicine. 2020.
Neuroscience data in the cloud: Opportunities and challenges: Proceedings of a workshop.
Washington, DC: The National Academies Press. https://doi.org/10.17226/25653.
The National Academy of Sciences was established in 1863 by
an Act of Congress, signed by President Lincoln, as a private,
nongovernmental institution to advise the nation on issues related to
science and technology. Members are elected by their peers for
outstanding contributions to research. Dr. Marcia McNutt is
president.

The National Academy of Engineering was established in 1964
under the charter of the National Academy of Sciences to bring the
practices of engineering to advising the nation. Members are elected
by their peers for extraordinary contributions to engineering. Dr.
John L. Anderson is president.

The National Academy of Medicine (formerly the Institute of
Medicine) was established in 1970 under the charter of the National
Academy of Sciences to advise the nation on medical and health
issues. Members are elected by their peers for distinguished
contributions to medicine and health. Dr. Victor J. Dzau is president.

The three Academies work together as the National Academies of
Sciences, Engineering, and Medicine to provide independent,
objective analysis and advice to the nation and conduct other
activities to solve complex problems and inform public policy
decisions. The National Academies also encourage education and
research, recognize outstanding contributions to knowledge, and
increase public understanding in matters of science, engineering,
and medicine.

Learn more about the National Academies of Sciences, Engineering,
and Medicine at www.nationalacademies.org.
Consensus Study Reports published by the National Academies of
Sciences, Engineering, and Medicine document the evidence-based
consensus on the study’s statement of task by an authoring
committee of experts. Reports typically include findings, conclusions,
and recommendations based on information gathered by the
committee and the committee’s deliberations. Each report has been
subjected to a rigorous and independent peer-review process and it
represents the position of the National Academies on the statement
of task.

Proceedings published by the National Academies of Sciences,
Engineering, and Medicine chronicle the presentations and
discussions at a workshop, symposium, or other event convened by
the National Academies. The statements and opinions contained in
proceedings are those of the participants and are not endorsed by
other participants, the planning committee, or the National
Academies.

For information about other products and activities of the National
Academies, please visit www.nationalacademies.org/about/whatwedo.
PLANNING COMMITTEE ON NEUROSCIENCE
DATA IN THE CLOUD1

DEANNA BARCH (Co-Chair), Washington University in St. Louis


MICHAEL HUERTA (Co-Chair), National Library of Medicine,
National Institutes of Health
AMIEE ALOI, Gates Ventures
ROSA CANET-AVILÉS, Foundation for the National Institutes of
Health
JONATHAN COHEN, Princeton University
PATRICK CULLINAN, bluebird bio
GREGORY FARBER, National Institute of Mental Health
DANIEL GESCHWIND, University of California, Los Angeles
MAGALI HAAS, Cohen Veterans Bioscience
MICHAEL HAWRYLYCZ, Allen Institute for Brain Science
STUART HOFFMAN, Department of Veterans Affairs
MICHAEL MILHAM, Child Mind Institute
KARLA MILLER, University of Oxford
BENJAMIN NEALE, Massachusetts General Hospital; Broad
Institute
KRISTEN ROSATI, Coppersmith Brockelman, PLC
LUBA SMOLENSKY, The Michael J. Fox Foundation for Parkinson’s
Research
MARGARET SUTHERLAND, Chan Zuckerberg Initiative

Health and Medicine Division Staff


CLARE STROUD, Director, Forum on Neuroscience and Nervous
System Disorders
SHEENA M. POSEY NORRIS, Program Officer
PHOENIX WILSON, Senior Program Assistant
ANDREW M. POPE, Senior Director, Board on Health Sciences
Policy

__________________
1 The National Academies of Sciences, Engineering, and Medicine’s planning committees
are solely responsible for organizing the workshop, identifying topics, and choosing
speakers. The responsibility for the published Proceedings of a Workshop rests with the
workshop rapporteurs and the institution.
FORUM ON NEUROSCIENCE AND NERVOUS
SYSTEM DISORDERS1

FRANCES JENSEN (Co-Chair), University of Pennsylvania


JOHN KRYSTAL (Co-Chair), Yale University
SUSAN AMARA, Society for Neuroscience
RITA BALICE-GORDON, Sanofi
KATJA BROSE, Chan Zuckerberg Initiative
EMERY BROWN, Harvard Medical School and Massachusetts
Institute of Technology
DANIEL BURCH, Pharmaceutical Product Development, LLC
JOSEPH BUXBAUM, Icahn School of Medicine at Mount Sinai
SARAH CADDICK, Gatsby Charitable Foundation
ROSA CANET-AVILÉS, Foundation for the National Institutes of
Health
MARIA CARRILLO, Alzheimer’s Association
EDWARD CHANG, University of California, San Francisco
TIMOTHY COETZEE, National Multiple Sclerosis Society
JONATHAN COHEN, Princeton University
ROBERT CONLEY, Eli Lilly and Company
JAMES DESHLER, National Science Foundation
BILLY DUNN, Food and Drug Administration
MICHAEL EGAN, Merck Research Laboratories
NITA FARAHANY, Duke University School of Law
JOSHUA GORDON, National Institute of Mental Health
RAQUEL GUR, University of Pennsylvania
MAGALI HAAS, Cohen Veterans Bioscience
RAMONA HICKS, One Mind
RICHARD HODES, National Institute on Aging
STUART HOFFMAN, Department of Veterans Affairs
JONATHAN HORSFORD, National Institute of Dental and
Craniofacial Research
YASMIN HURD, Icahn School of Medicine at Mount Sinai
STEVEN HYMAN, Broad Institute of Massachusetts Institute of
Technology and Harvard University
MICHAEL IRIZARRY, Eisai Inc.
GEORGE KOOB, National Institute on Alcohol Abuse and Alcoholism
WALTER KOROSHETZ, National Institute of Neurological Disorders
and Stroke
STORY LANDIS, National Institute of Neurological Disorders and
Stroke (Director Emeritus)
ALAN LESHNER, American Association for the Advancement of
Science (Emeritus)
HUSSEINI MANJI, Janssen Research & Development, LLC
CAROLINE MONTOJO, The Kavli Foundation
STEVEN PAUL, Voyager Therapeutics, Inc.
EMILIANGELO RATTI, Takeda Pharmaceuticals International, Inc.
TODD SHERER, The Michael J. Fox Foundation for Parkinson’s
Research
DAVID SHURTLEFF, National Center for Complementary and
Integrative Health
SANTA TUMMINIA, National Eye Institute
NORA VOLKOW, National Institute on Drug Abuse
ANDREW WELCHMAN, Wellcome Trust
DOUG WILLIAMSON, Lundbeck
STEVIN ZORN, MindImmune Therapeutics, Inc.

Health and Medicine Division Staff


CLARE STROUD, Forum Director
SHEENA M. POSEY NORRIS, Program Officer
AMANDA WAGNER GEE, Program Officer
PHOENIX WILSON, Senior Program Assistant
BARDIA MASSOUDKHAN, Financial Business Partner
ANDREW M. POPE, Senior Director, Board on Health Sciences
Policy
__________________
1 The National Academies of Sciences, Engineering, and Medicine’s forums and
roundtables do not issue, review, or approve individual documents. The responsibility for
the published Proceedings of a Workshop rests with the workshop rapporteurs and the
institution.
Reviewers

This Proceedings of a Workshop was reviewed in draft form by
individuals chosen for their diverse perspectives and technical
expertise. The purpose of this independent review is to provide
candid and critical comments that will assist the National Academies
of Sciences, Engineering, and Medicine in making each published
proceedings as sound as possible and to ensure that it meets the
institutional standards for quality, objectivity, evidence, and
responsiveness to the charge. The review comments and draft
manuscript remain confidential to protect the integrity of the
process.
We thank the following individuals for their review of this
proceedings:

SEAN HILL, Centre for Addiction and Mental Health


REBECCA LI, Vivli
MARYANN MARTONE, University of California, San Diego

Although the reviewers listed above provided many constructive
comments and suggestions, they were not asked to endorse the
content of the proceedings nor did they see the final draft before its
release. The review of this proceedings was overseen by LUCILA
OHNO-MACHADO, University of California, San Diego. She was
responsible for making certain that an independent examination of
this proceedings was carried out in accordance with standards of the
National Academies and that all review comments were carefully
considered. Responsibility for the final content rests entirely with the
rapporteurs and the National Academies.
Contents

1 INTRODUCTION AND BACKGROUND


Workshop Objectives
Organization of Proceedings

2 HARNESSING CLOUD-BASED TECHNOLOGIES TO ADVANCE
NEUROSCIENCE RESEARCH: SELECT CURRENT
INITIATIVES
International Neuroinformatics Coordinating Facility
OpenNeuro
STRIDES

PART 1
CLOUD-BASED TECHNOLOGIES FOR NEUROSCIENCE
RESEARCH: CHALLENGES AND POTENTIAL SOLUTIONS

3 PROTECTING PRIVACY IN THE CLOUD


Current Promising Practices to Protect Privacy
Key Privacy-Related Issues to Be Resolved

4 MANAGING DATA AND PROMOTING INTEROPERABILITY
IN THE CLOUD
Current Promising Practices Regarding Standards Development and
Interoperability
Data Management Issues to Be Resolved

5 ASSIGNING CREDIT, DETERMINING OWNERSHIP, AND
LICENSING DATA IN THE CLOUD
Current Promising Practices for Assigning Credit and Licensing Data
Credit, Ownership, and Licensing Issues to Be Resolved

6 GOVERNING, FUNDING, AND SUSTAINING CLOUD-BASED
PLATFORMS
Current Promising Practices for Data Governance in the Cloud
Issues to Be Resolved Regarding Data Use and Access, Analysis,
User Training, and Platforms Sustainability

PART 2
DIFFERENT TYPES OF NEUROSCIENCE DATA: CHALLENGES
AND POTENTIAL OPPORTUNITIES

7 CLINICAL TRIAL AND RESEARCH DATA


Current Promising Practices in Clinical Trial and Research Data
Sharing
Issues to Be Resolved for Sharing Clinical Trial and Research Data

8 GENETIC DATA
Current Promising Practices for Managing Genetic Data in the
Cloud
Issues to Be Resolved Regarding Genetic Data in the Cloud

9 NEUROIMAGING DATA
Current Promising Practices for Neuroimaging Data in the Cloud
Issues to Be Resolved to Advance Cloud-Based Neuroimaging Data
Resources

10 REAL-WORLD DATA
Current Promising Practices for Managing Real-World Data in the
Cloud
Issues to Be Resolved to Incorporate Real-World Data into Clinical
Studies
11 FUTURE DIRECTIONS
Technology and Methods: Progress and Challenges
Training the Next Generation of Scientists
Funding: Current Commitments and Future Needs
Potential Next Steps: Working Groups to Move the Field Forward

APPENDIXES
A References
B Workshop Agenda
C Registered In-Person Attendees
1

Introduction and Background1

Now is an exciting time for brain research due to enormous global
investments that have enabled creation of the infrastructure required to
generate great pools of neuroscience data and develop novel
techniques. This has been facilitated in part by a shift in the way
research data are shared over the past decade (see Figure 1-1). In the
traditional model, data generated by one group of investigators may be
shared with one or more other research groups, each of which builds its
own tools to analyze and manipulate the data, resulting in a
proliferation of datasets and research tool versions. This contrasts with
the cloud model, where data and tools are co-located in a platform that
enables multiple investigators to work with a single copy of the dataset
using common tools.
The cloud model has led to a vast increase in the quantity and
complexity of data and expanded access to these data, which has
attracted many more researchers, enabled multi-national neuroscience
collaborations, and facilitated the development of many new tools. Yet,
the cloud model has also produced new challenges related to data
storage, organization, and protection. Merely switching the technical
infrastructure from local repositories to cloud repositories is not enough
to optimize data use.
FIGURE 1-1 A shift in the research model. In the cloud model, rather than data being
dispersed to multiple researchers for analysis with different sets of tools, data and tools
are co-located in the cloud, allowing multiple researchers to access these data using a
common set of analytical tools.
SOURCE: Presented by Nick Weber, September 24, 2019.

Thirty years ago, the National Institute of Mental Health (NIMH), the
National Institute on Drug Abuse (NIDA), and the National Science
Foundation (NSF) commissioned the Institute of Medicine (IOM) to
consider the future of digital and networked neuroscience, recalled
Michael Huerta, associate director of the National Library of Medicine
(NLM). It is fitting that 30 years later, a group reconvened at the
National Academies of Sciences, Engineering, and Medicine (the
National Academies), which incorporates the former IOM, to explore the
burgeoning use of cloud computing in neuroscience, said Huerta. On
September 24, 2019, the National Academies’ Forum on Neuroscience
and Nervous System Disorders hosted a workshop on neuroscience data
in the cloud, co-chaired by Huerta and Deanna Barch, chair of the
Department of Psychological and Brain Sciences at Washington
University in St. Louis.2 Box 1-1 provides definitions for some of the
core concepts related to cloud computing discussed throughout the
workshop. The intention of the workshop, said Barch, was to focus on
maximizing the benefits that can be realized from neuroscience data.
BOX 1-1
Definition of Cloud Computing and Select
Related Concepts
Cloud computing: as defined by the National Institute of
Standards and Technology, “is a model for enabling ubiquitous,
convenient, on-demand network access to a shared pool of
configurable computing resources (e.g., networks, servers,
storage, applications, and services) that can be rapidly provisioned
and released with minimal management effort or service provider
interaction” (Mell and Grance, 2011).

Data integration: “the process of combining data generated
using a variety of different research methods to enable detection
of underlying themes….”a

Data model: “organizes data elements and standardizes how the
data elements relate to one another. Since data models document
real life people, places, and things and the events between them,
the data model represents reality.”b

Federation: a data organizational scheme that supports “the
sharing of arbitrary resources from arbitrary application domains
with arbitrary consumer groups across multiple domains.… Any
type of organizational collaboration could be facilitated by a secure
method to selectively share data with specific partners” (Lee et al.,
2020).

Interoperability: the ability to exchange and use information
from various sources and of different types in computer systems.c

Platform: a grouping of software or hardware upon which other
technologies are developed.c

__________________
a From Nature Research, available at https://www.nature.com/subjects/data-integration
(accessed January 16, 2020).
b From the Princeton University Center for Data Analytics & Reporting, available at
https://cedar.princeton.edu/understanding-data/what-data-model (accessed January 17,
2020).
c From the National Institutes of Health Strategic Plan for Data Science, available at
https://datascience.nih.gov/strategicplan (accessed January 16, 2020).

WORKSHOP OBJECTIVES
The workshop brought together a broad range of stakeholders
involved in cloud-based neuroscience initiatives and research to explore
the use of cloud technology to advance neuroscience research and
share approaches to address current barriers. These stakeholders
represented academia, government, foundations, the pharmaceutical
and information technology industries, and the legal system. They were
tasked not only with identifying challenges, but also with suggesting
solutions and best practices that can help optimize the utility and
increase the efficiency of cloud-based neuroscience initiatives, support
ongoing efforts, and share information about the work of others, said
Barch.
In addition to cloud-specific issues, the workshop covered a number
of topics related to encouraging data sharing and open science, which
are integrally relevant for, but not specific to, cloud-based platforms.
Many discussions at the workshop covered issues, such as privacy
protection, that are common across many types of data, not just
neuroscience. The workshop provided a venue for members of the
neuroscience community to come together to discuss approaches for
tackling these common challenges, as well as challenges that are
specific to neuroscience data and the cloud-based platforms that are
dedicated to neuroscience or are frequently used by this community.
Box 1-2 provides the workshop Statement of Task.

ORGANIZATION OF PROCEEDINGS
These proceedings reflect the organization of the meeting. Chapter 2
summarizes talks about the landscape of cloud-based technologies for
neuroscience research. Two sets of breakout sessions are summarized
in Parts 1 and 2, which organize issues by content area and types of
data, respectively. In Part 1, Chapter 3 covers issues related to the
protection of privacy; Chapter 4 addresses data management and
interoperability issues; Chapter 5 examines issues related to assignment
of credit and data ownership; and Chapter 6 discusses platform
governance. In Part 2, challenges related to different types of
neuroscience data are examined: Chapter 7, clinical trial and research
data; Chapter 8, genetic data; Chapter 9, neuroimaging data; and
Chapter 10, real-world data. Chapter 11 concludes with a discussion of
future directions, including identifying tangible next steps and promising
areas for future action.

BOX 1-2
Statement of Task
• Review the landscape of major neuroscience cloud-based
  initiatives and other uses of cloud technology within
  neuroscience research.
• Discuss aspirational goals for maximizing benefit from data
  and compute in the cloud by empowering broad and
  meaningful data sharing and fostering open science.
• Consider best practices and policies that would increase
  efficiencies within and across cloud resources, including
  aspects such as:
    – Consent and data use agreements
    – Authorization for and accessibility to a variety of data
      types by a variety of users
    – Protection of privacy
    – Assignment of credit, ownership, and licensing
    – Technical issues
    – Researcher support and training
• Explore potential next steps to move the field forward to
  develop and deploy best practices in the service of achieving
  aspirational goals.
__________________
1 The planning committee’s role was limited to planning the workshop, and the Proceedings
of a Workshop was prepared by the workshop rapporteurs as a factual summary of what
occurred at the workshop. Statements, recommendations, and opinions expressed are those of
individual presenters and participants; have not been endorsed or verified by the Health and
Medicine Division (HMD) of the National Academies of Sciences, Engineering, and Medicine; and
should not be construed as reflecting any group consensus.
2 For further information about the workshop, including slides presented by speakers, see
http://www.nas.edu/NeuroForum (accessed January 17, 2020).
2

Harnessing Cloud-Based Technologies to
Advance Neuroscience Research: Select
Current Initiatives

Highlightsa
• The International Neuroinformatics Coordinating Facility
  coordinates the development and endorsement of findable,
  accessible, interoperable, and reusable (FAIR) community
  standards and best practices (Martone).
• The cloud alone is not a replacement for depositing data into
  a proper archive. To maximally impact neuroscience research,
  data need to be maintained in a manner that makes them
  FAIR (Martone).
• To enable data sharing, data standards are needed at all levels
  of datasets and become more specialized according to user
  needs (Martone).
• The platform OpenNeuro enables free and open sharing of
  multiple data types using a community standard called the
  Brain Imaging Data Structure (BIDS) (Poldrack).
• BIDS apps, which are containerized applications of diverse
  neuroimaging software packages, allow users to analyze data
  easily and reproducibly, which provides users with additional
  benefits and incentivizes adherence to the BIDS standards
  (Poldrack).
• The Science and Technology Research Infrastructure for
  Discovery, Experimentation, and Sustainability (STRIDES)
  Initiative is designed to make it easier for researchers to use
  cloud-based services such as Google Cloud and Amazon Web
  Services (Weber).
• STRIDES has facilitated the building of several large and high-
  value datasets managed by the National Institutes of Health
  and enabled the transfer of more than 30 petabytes of data to
  the cloud, where they are accessible to the research
  community (Weber).

__________________
a These points were made by the individual workshop participants identified above.
They are not intended to reflect a consensus among workshop participants.

Neuroscience is well on its way to moving to the cloud if it is not
there already, said Maryann Martone, professor emerita at the
University of California, San Diego, and chair of the governing board of
the International Neuroinformatics Coordinating Facility (INCF). The
cloud is ubiquitous across global neuroscience projects, including those
pictured in Figure 2-1, due to the recognition that the cloud is
necessary both for compiling large amounts of data and for taking
algorithms to the data, said Martone. She added that many other
projects beyond those included in the figure are using cloud
infrastructure.

INTERNATIONAL NEUROINFORMATICS
COORDINATING FACILITY
Neuroscience is never going to be served by a single platform or a
single infrastructure because there are too many different types of data
and too much technological flux, Martone said, and the cloud alone
should not be seen as a replacement for contributing data to a proper
data repository or archive. To seriously impact the field and move
neuroscience forward, efforts are needed to ensure that data are
maintained in a manner that makes them accessible to both humans
and machines, she said. INCF1 was initially established to create global
infrastructure and data standards with a goal of facilitating organization
and usability of neuroscience data. INCF has grown into a membership
organization with members from 18 countries across 4 different
continents. It aims to coordinate the development and endorsement of
open and FAIR—findable, accessible, interoperable, and reusable—
community standards and best practices that will enable data to be
shared in a maximally useful way for both humans and machines. INCF
also focuses on developing and providing training and educational
resources, and serves as an interface among international large-scale
brain projects, said Martone.
Martone noted that developing and implementing FAIR standards
requires a partnership among researchers, repositories, indexers, and
aggregators. Moreover, for any given dataset there may be dozens of
standards and best practices that need to be brought together in a way
that can be navigable by a range of users. Martone illustrated how
these various standards relate to one another using what she calls the
FAIR onion, shown in Figure 2-2. A host of organizations, societies, and
others act as convening authorities to bring experts together to
establish standards at the different layers of the onion. At the outer
layer of the onion where data are more specialized, problems can only
be solved by the neuroscientists generating those data rather than by
general organizations, said Martone. Organizations like INCF play a
critical role in bringing these researchers together at the outer layers of
the onion, she said.
FIGURE 2-1 An exciting time for global neuroscience. Global neuroscience projects such
as those pictured here have been enabled by the cloud.
SOURCE: Presented by Maryann Martone, September 24, 2019.

To facilitate the development of standards, INCF is developing a
standards portal and will institute a review and endorsement process
with a consistent set of criteria and clear governance procedures, said
Martone. The portal will also house “TrainingSpace”—a collection of
neuroinformatics courses given by worldwide experts, said Martone.
Although standards are important to enable data sharing and many
types of standards have been created, Martone noted that few have
been piloted, tested, and validated. Moreover, standards and best
practices will always be in flux as technologies evolve, she said.
Neuroscientists, other researchers, and infrastructure providers have
little experience working with standards. The first thing they want to do
is change and adapt the standard to their needs. However, Martone said
research requires a delicate touch when it comes to revising standards.
She suggested that the research community will need to learn and
monitor how far one can deviate from a standard before it becomes
meaningless. Nearly all standards have a core that will work for most
use cases, she said. The edges are where modifications are needed. By
contrast, more rigid standards may be appropriate for purely clinical or
purely industrial users, she said.

OPENNEURO
OpenNeuro2 is a platform that enables free and open sharing of
magnetic resonance imaging (MRI), magnetoencephalography (MEG),
electroencephalography (EEG), invasive EEG (iEEG), and
electrocorticography (ECoG) data. According to Russell Poldrack,
director of the Stanford Center for Reproducible Neuroscience,
OpenNeuro was built on an early project called OpenfMRI, a resource
that was developed to enable open sharing of data from task-based
functional MRI (fMRI) studies (Poldrack et al., 2013). In creating
OpenfMRI, Poldrack and colleagues developed a data organization
scheme that was specific to the type of data that would be submitted
and that would allow automatic analysis of these data. There was no
way to validate a dataset other than to run it through the pipeline. If
the pipeline crashed, manual curators at Stanford had to figure out
what went wrong. The process was very labor intensive, said Poldrack.
FIGURE 2-2 The FAIR onion. Standards are needed at all levels of datasets, as
illustrated by the FAIR onion. At the core, standards for basic data descriptors are
needed. As data become more complex and specialized, additional layers of standards are
required: first, standardized community vocabularies and data types; followed by domain-
specific vocabularies, minimal information models, and common data elements. At the
outer layers of the onion, specialized vocabularies and information models as well as
customized standards and formats are needed for specific applications.
NOTE: CDE = common data elements; FAIR = findable, accessible, interoperable,
reusable.
SOURCE: Presented by Maryann Martone, September 24, 2019.

Realizing the need for a less costly data-sharing approach, Poldrack
and his colleague Krzysztof Gorgolewski created a community standard
called the Brain Imaging Data Structure (BIDS), with funding from the
Arnold Foundation and the National Institutes of Health (NIH). BIDS
specifies file naming and organization and also a metadata structure,
said Poldrack. By using relatively simple directory and file naming
templates and common formats, BIDS reduces the learning curve for
users. BIDS also has an automated validator, built in JavaScript, that
users can run in the browser before uploading their data. The validator
can check a huge dataset in just a few seconds and quickly provides
feedback about whether that dataset meets the standard.
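The idea of validating a dataset against naming templates, rather than by running it through an analysis pipeline, can be illustrated with a minimal Python sketch. It assumes a heavily simplified subset of the BIDS layout for functional MRI files (a subject folder, an optional session folder, a func folder, and a filename built from sub, ses, task, and run labels). The names BOLD_NAME, expected_dir, and check_path are hypothetical, and this is not the official validator, which covers far more of the specification.

```python
import re

# Simplified template for a BIDS-style functional MRI filename, for example
# sub-01_ses-02_task-rest_run-1_bold.nii.gz
# Assumption: this covers only a small, illustrative subset of the standard.
BOLD_NAME = re.compile(
    r"^sub-(?P<sub>[A-Za-z0-9]+)"
    r"(?:_ses-(?P<ses>[A-Za-z0-9]+))?"
    r"_task-(?P<task>[A-Za-z0-9]+)"
    r"(?:_run-(?P<run>[0-9]+))?"
    r"_bold\.nii(?:\.gz)?$"
)


def expected_dir(match):
    """Build the directory implied by the labels in the filename."""
    parts = ["sub-" + match["sub"]]
    if match["ses"]:
        parts.append("ses-" + match["ses"])
    parts.append("func")
    return "/".join(parts)


def check_path(path):
    """Return a list of problems for one relative file path (empty if none)."""
    directory, _, filename = path.rpartition("/")
    match = BOLD_NAME.match(filename)
    if match is None:
        return ["filename does not match the simplified naming template"]
    if directory != expected_dir(match):
        return ["file should live under " + expected_dir(match) + "/"]
    return []


if __name__ == "__main__":
    for p in [
        "sub-01/func/sub-01_task-rest_bold.nii.gz",
        "sub-02/func/sub-01_task-rest_bold.nii.gz",
        "sub-01/func/subject1_rest.nii.gz",
    ]:
        print(p, check_path(p))
```

Because a check of this kind only inspects names and paths, it can give feedback before any data are uploaded or processed, which is the property Poldrack highlighted.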
Because they wanted users to accrue benefits from moving to the
standards, they also created BIDS apps—containerized applications for
more than 30 diverse neuroimaging software packages, said Poldrack.
BIDS is flexible enough that by sticking to the core elements, users
realize how they can work with standards and still achieve the
specificity they need, noted Martone. The BIDS apps allow users to run
large data analysis packages easily and reproducibly without having to
reformat their data for those specific packages (Gorgolewski et al.,
2017). Each version of a shared dataset is given a digital object
identifier (DOI) that can be cited in published papers, said Poldrack.
To facilitate data sharing, OpenNeuro also creates a discussion page
for each dataset through which questions can be submitted to the
dataset owner. The project has been funded by the BRAIN Initiative
through 2023 and is growing at a rate of about 10 to 20 new datasets
and has a total of 5,000 to 6,000 users per month, said Poldrack. He
added that OpenNeuro enables anyone to download de-identified data
with no restrictions and no data use agreements, thus providing an
unparalleled degree of openness. The goal, he said, is to maximize the
value of these data.

STRIDES
The NIH Center for Information Technology has also made a major
commitment to adopting and developing best practices related to cloud
technologies as a means of supporting the research community, said
Nick Weber, program manager for Cloud Services at the NIH Center for
Information Technology. The Science and Technology Research
Infrastructure for Discovery, Experimentation, and Sustainability
(STRIDES) Initiative, launched in July 2018, now has partnerships with
Google Cloud and Amazon Web Services (AWS) and is working on
additional partnerships with other commercial providers, said Weber.
STRIDES aims to make it easier for researchers to use these services,
access data, and employ the latest tools and technologies while
protecting the security and privacy of data, he said. Other important
elements of the STRIDES Initiative include a training component for
the full range of users, including technical staff, bench researchers, data
scientists, and informaticians, and an effort to gain insight into
sustainability by gathering data on data usage to inform funding
decisions, said Weber.
STRIDES has facilitated building the operational environment for
several large and high-value NIH-managed datasets such as those
generated by Common Fund programs, the Trans-Omics for Precision
Medicine (TOPMed) program sponsored by the National Heart, Lung,
and Blood Institute (NHLBI), and the Accelerating Medicines
Partnership-Parkinson’s Disease (AMP-PD) program, said Weber.
Already, STRIDES investments have provided benefits to these research
programs in terms of cost savings and improved access to professional
services and enterprise support, he said. He added that STRIDES has
enabled the transfer of more than 30 petabytes of data into the cloud,
making them more widely accessible to the research community. Ultimately,
Weber predicts that STRIDES will facilitate improved interconnections
among datasets that otherwise would not have been connected. To
achieve this, he said, STRIDES has initiated efforts to make sure
funding agencies and partners understand how to leverage STRIDES
resources, for example, by including information about STRIDES in
funding opportunity announcements.

__________________
1 For more information, see https://www.incf.org (accessed November 12, 2019).
2 For more information, see https://openneuro.org (accessed November 12, 2019).
Part 1

Cloud-Based Technologies for
Neuroscience Research: Challenges and
Potential Solutions

The biggest barriers to cloud-based studies are ethical, legal, and
administrative, said Benjamin Neale, associate professor in the
Analytic and Translational Genetics Unit at Massachusetts General
Hospital and the Broad Institute of the Massachusetts Institute of
Technology (MIT) and Harvard. There is no consistent framework for
what types of data should be available for what kinds of analyses, he
said, and questions related to privacy and identifiability countervail
against the drive to promote the use of the cloud for neuroscience
research. Maryann Martone added that liability issues and legal
uncertainty have become major problems for the research
community, with the unintended consequence of clamping down on
progress. She added that reuse of data, a central aim of the cloud,
imposes burdens in data management, governance, and privacy.
These and other topics of discussion at the workshop are
summarized in the following four chapters: issues related to
protecting privacy in the cloud (Chapter 3); data management and
promotion of interoperability (Chapter 4); considerations for
assigning credit, determining ownership, and licensing data (Chapter
5); and governance, long-term funding, and sustainability of cloud-
based platforms (Chapter 6).
3

Protecting Privacy in the Cloud

Highlightsa
A complicated web of laws, including the General Data
Protection Regulation in Europe and the Health Insurance
Portability and Accountability Act and the Common Rule in the
United States, regulates privacy and security in research and
complicates efforts to share data across geographical
boundaries (Rosati).
Regulations regarding whether data can be shared are in
constant flux, and upcoming changes to the Common Rule are
likely to cause confusion about sharing genomic information
(Rosati).
Different frameworks and models are required to protect
patient privacy depending on the design of a study, its
governance model, and its infrastructure (Hanson, Mackay).
Federated data-sharing platforms protect data by bringing
analytical tools to the data rather than permitting the data
to be downloaded for analysis, which allows analyses of
multiple datasets without violating restrictions on transfer of
personal information mandated by privacy laws (Rosati).
New approaches to informed consent are needed to enable
greater data sharing and collaboration (Barch, Haas, Rosati).

__________________
a These points were made by the individual workshop participants identified above.
They are not intended to reflect a consensus among workshop participants.

Data privacy is a hot issue right now, according to Kristen Rosati, a
partner at Coppersmith Brockelman, PLC. In the research community,
there is broad support for open data and increased data accessibility,
she said, with the acknowledgment that individual privacy must be
protected. Concerns about the extent to which data sharing poses
privacy risks were fueled recently by a study from scientists at
Imperial College London and the Université Catholique de Louvain in
Belgium, covered widely in the popular press, including The New
York Times, demonstrating that nearly all Americans could be correctly
re-identified in any “anonymized” dataset using only 15 demographic
variables, such as zip code, date of birth, and gender (Rocher et al.,
2019).
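The intuition behind this kind of finding can be sketched in a few lines of Python. The example below is not the method used by Rocher and colleagues; it is a much simpler count of how many records become unique as more demographic fields are combined. The function name unique_fraction, the field names, and the toy records are all hypothetical.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose combination of the given fields is unique."""
    if not records:
        return 0.0
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(keys)


# Toy, made-up records with no direct identifiers such as names.
records = [
    {"zip": "02139", "birth_year": 1980, "gender": "F"},
    {"zip": "02139", "birth_year": 1980, "gender": "M"},
    {"zip": "02139", "birth_year": 1975, "gender": "F"},
    {"zip": "94305", "birth_year": 1980, "gender": "F"},
    {"zip": "94305", "birth_year": 1980, "gender": "F"},
    {"zip": "02139", "birth_year": 1975, "gender": "F"},
]

if __name__ == "__main__":
    for fields in (["zip"], ["zip", "birth_year"], ["zip", "birth_year", "gender"]):
        print(fields, round(unique_fraction(records, fields), 2))
```

In this toy data no record is unique on zip code alone, but some become unique once birth year and gender are added, which is the general pattern that makes removing names insufficient on its own.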
Rosati described a complicated web of laws that exist in Europe, the
United States, and individual states regulating privacy and security in
research. The complexity arises, she said, because of different laws
from different sources that impact different institutions and different
types of information. In the European Union (EU), the General Data
Protection Regulation (GDPR) applies to organizations within the
European Economic Area (EEA), which comprises the EU member states plus
Iceland, Liechtenstein, and Norway, as well as to organizations outside
the EEA that offer goods or services to data subjects within the EEA or
monitor the behavior of data subjects within the EEA. The regulation
governs any data that directly or indirectly identify a living individual
and adds special protections to what it deems sensitive data, such as
genetic, biometric, health, and demographic data. Importantly, said
Rosati, because GDPR
applies restrictions to the transfer of personal data, it impacts even
researchers with organizations not directly required to comply with
GDPR, for example, organizations that want to collaborate with
organizations in Europe.
In the United States, multiple regulations add complexity to the web
of laws, according to Rosati. The Health Insurance Portability and
Accountability Act (HIPAA) regulates covered entities such as health
care providers “and their business associates” and applies to “protected
health information”; the Common Rule regulates federally funded
research; the federal substance use disorder treatment regulations
govern substance abuse information in a much more protective way
than HIPAA does; the Food and Drug Administration (FDA) oversees
clinical trials and regulates a subset of data collected in support of
drug approvals; and NIH policies and regulations provide for certificates
of confidentiality. Layered on top of those regulations are state laws
such as the California Consumer Privacy Act and state health
information confidentiality laws and state licensure requirements, said
Rosati.
Each of these regulations defines which data are protected somewhat
differently as well as what constitutes de-identified information (under
HIPAA) or non-identifiable information (under the Common Rule).
Moreover, these regulations are in constant flux, said Rosati. For
example, she said the Office for Human Research Protections and other
agencies that enforce the Common Rule are expected to issue guidance
soon on whether whole-genome analysis or other types of genetic
information should be treated as identifiable information even if it is not
paired with data elements that directly identify individuals. This could
widen the difference in how the Common Rule and HIPAA interpret the
identifiability of data, which will likely cause substantial confusion about
the sharing of genomic information, said Rosati.
Rosati added that the patient privacy advocacy community is very
active and has a voice in Congress about putting restrictions around the
use of de-identified data without individual consent. This would be a
disaster for research, she said, noting that the research community
needs similar advocates to explain to the public and Congress the
importance of data sharing and open access to data to advance good
science. A better public policy would be to prohibit re-identification of
individuals from de-identified datasets. Moreover, while security of
patient information is important, Magali Haas, chief executive officer
(CEO) and president of Cohen Veterans Bioscience, noted that patients
are providing informed consent for a purpose, including for research.
Haas and Clare Mackay, professor of imaging neuroscience at the
University of Oxford, said that laws, regulations, and data access
policies governing research should take participants’ perspectives into
account.

CURRENT PROMISING PRACTICES TO PROTECT
PRIVACY
One of the inspirations for this workshop was an experience
described by Jonathan Cohen, Robert Bendheim and Lynn Thoman
Professor of Neuroscience at Princeton University. His group was
developing methods for putting real-time image analysis of fMRI data
on the cloud and wanted to test them in a clinical trial. Running this type
of analysis in the cloud is helpful, and in some instances necessary, because
the computing requirements are so high that they may be out of reach of
clinical facilities with access to relevant patient populations. Because
Princeton does not have a medical school, his team partnered with
William Hanson at the Perelman School of Medicine at the University of
Pennsylvania (Penn Med). When they were ready to start the project,
however, administrators at Penn Med were concerned that there were
no standards to ensure these data could be shared in a way that
protected the privacy of patients. Hanson said there were also issues
with Penn Med’s Information Services department and Institutional
Review Board (IRB) that had to be overcome to enable moving forward
with the project. He explained that health systems such as Penn Med
are extremely risk averse and under constant threat of data breaches
by hackers trying to fish for patient data in their systems. As the chief
medical information officer at Penn Med, Hanson and his team worked
to put together principles for access, use, and disclosure, starting with
the principle that data sharing is in the interest of patient treatment.
Patients who choose not to have their data shared must opt out.
Hanson said they are now working on a framework to guide all efforts
around data access, use, and disclosure of patient information. These
principles fall into six domains: lawful basis, institutional mission and
values, trustworthiness and accountability, risk mitigation, strong
security controls, and documentation (see Figure 3-1). They plan to
apply these principles to specific use cases in order to operationalize
the Penn Med position and convene a data governance committee to
oversee the process. Hanson said that rather than viewing these
protections as obstacles, he believes they facilitate research.
Many other models have also been developed to protect the privacy
of patients. Mackay described three projects in the United Kingdom that
have navigated the question of privacy protection in different ways:

The UK Biobank, possibly the largest cohort study in the world with
500,000 individuals, multiple data types, and multiple institutions,
but a single principal investigator and single IRB.1
The Wellcome Center for Integrative Neuroimaging (WIN), a study
taking place at a single institution with multiple study types,
principal investigators, and IRBs.2
Dementias Platform UK (DPUK), a substudy of the Field platform
(which also includes Health Data Research UK, or HDRUK) that is
taking place at multiple institutions with multiple study types, IRBs,
consent, and principal investigators.3

The message Mackay delivered to the workshop is that one size does
not fit all. Each project has a different governance model and different
infrastructure designed to fit the types of data gathered, the study
participants, and the users of the data.
The UK Biobank, for example, is open by design, said Mackay.
Participants consent upon enrollment to having their data shared, not
only imaging data, but also a range of sensitive health and personal
information. Data access is governed through a data access committee;
researchers apply and pay an administrative fee for access, said
Mackay. She added that an important aspect of the project is that there
is no disclosure of any health information to the participants, which
limits the ability to recruit potential participants for future studies or for
participants to self-identify for future studies, since the health
information that would make them eligible for the study cannot be
disclosed.
FIGURE 3-1 Penn Med’s principles for access, use, and disclosure of patient information.
Penn Med is working to develop a framework for enabling data sharing while protecting
the privacy of patients.
NOTE: BAA = business associate agreement; HIPAA = Health Insurance Portability and
Accountability Act; IS = information services; PM = Perelman School of Medicine at the
University of Pennsylvania; SSN = Social Security number; TCPA = Telephone Consumer
Protection Act.
SOURCE: Presented by William Hanson, September 24, 2019.

WIN also takes an open science approach and is building a
centralized infrastructure to facilitate data sharing, in which
responsibility for privacy remains with the study’s principal investigator,
according to Mackay. A similar model is being used for the Field
platform, which has 35 cohorts and more than 3 million participants,
she said. The Field platform infrastructure has been developed not only
to facilitate data sharing, but data aggregation and cross-cohort
analyses as well. It uses a single point of entry for requesting access to
data but, like WIN, assigns responsibility for granting access to the
individual data access committees.
DPUK uses another model, where individual cohort owners share
their data, but users are unable to download the data, said Mackay. In
this model, all computation happens on the platform that holds the
dataset—a sort of data playground—said Rosati. The global data
platform Vivli also uses this approach as one of its pillars of data
security, said Rebecca Li, executive director of Vivli. Vivli also requires
anonymized data and manages access to the platform, requiring users to
have specialized expertise.
Rosati also mentioned the FDA Sentinel Initiative.4 In developing its
privacy rules, she said, the initiative chose a federated model in which
all participating institutions, both health systems and health insurance
companies, put data into a common data model and retain them locally
rather than depositing them in a central databank. In addition to
protecting data, this system allows investigators to bring analytical tools
to multiple datasets, enabling powerful research across different
databases, she said.
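The federated pattern described here, in which the analysis travels to the data and only aggregates leave each site, can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of how the Sentinel Initiative or any other platform is implemented: the names LocalSummary, summarize_locally, and pooled_mean_and_variance are hypothetical, and a real system would add access controls, a shared data model, and safeguards such as suppression of small counts.

```python
from dataclasses import dataclass


@dataclass
class LocalSummary:
    """Aggregates that a site is willing to release; no individual rows."""
    n: int
    total: float
    total_sq: float


def summarize_locally(values):
    """Runs inside one institution on its own individual-level data."""
    return LocalSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pooled_mean_and_variance(summaries):
    """Runs at the coordinator, which sees only the per-site aggregates."""
    n = sum(s.n for s in summaries)
    if n < 2:
        raise ValueError("need at least two observations across all sites")
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance recovered from the pooled sums and sums of squares.
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


if __name__ == "__main__":
    site_a = [1.0, 2.0, 3.0]   # stays at institution A
    site_b = [4.0, 5.0]        # stays at institution B
    summaries = [summarize_locally(site_a), summarize_locally(site_b)]
    print(pooled_mean_and_variance(summaries))
```

The pooled result matches what a central analysis of the combined rows would give, even though neither site's individual-level values were transferred.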
Mackay noted that every research cohort has a participant panel that
plays an important role in determining rules regarding data access.
While an individual investigator may argue that participants in his or her
study did not consent to data sharing, Mackay pointed out that neither
did they consent to the use of those data to further the career of one
individual investigator. Moreover, if a cohort sets up its governance in a
way that prohibits re-identification, it may deprive a participant of
the opportunity to take part in a future drug trial of a potentially
beneficial treatment. Indeed, said Mackay, while researchers are rightly
concerned about security and privacy, the perspective of participants is
equally important. As a way of combating the risk aversion that
prevents many researchers and institutions from sharing data, she
suggested discussing case studies of where open science provided
positive benefits that could not have been realized had data access
been restricted.

KEY PRIVACY-RELATED ISSUES TO BE
RESOLVED
No federal laws in the United States prohibit re-identification of
individuals, said Haas, adding that this needs to be addressed at the
policy and legislative levels. Rosati noted that this is a particular
concern in countries like the United States, which does not have a
national health system and where people worry about losing health
insurance, disability insurance, etc. Individual privacy also needs to be addressed
with regard to employment laws, said Haas.
Informed consent is a critical component in data sharing via cloud-
based platforms, but there are multiple issues still to be addressed.
Many participants in the privacy breakout session, including Mackay and
Deanna Barch, advocated development of best practices to guide the
development of consent forms for clinical trials and participation in
repositories and biobanks. Ideal consent forms enable individuals to
share their data with large data-sharing platforms and other
collaborative efforts. Haas added that templates for consent should use
universally accepted definitions regarding what types of data are
particularly sensitive in this changing landscape of regulation and what
qualifications researchers should have to gain access to data. Rosati
said NIH has developed template language for clinical trials and
research repositories, for example, to enable sharing of genome-wide
association study (GWAS) data. Barch added that consent forms are
also needed that distinguish between data already collected and data
to be collected in the future. She advocated optimizing consent forms
so that participants understand their rights of privacy but consent to as
many uses as they are comfortable with.
To implement the use of informed consent language that would both
enable data sharing and protect participants, Michael Milham, vice
president of research and founding director of the Center for the
Developing Brain at the Child Mind Institute, suggested including a
range of use cases about how someone’s data might be used in the
future. Granting agencies could require such language unless reasons
are given for why it should not be required. Milham added that beyond
developing best practices, mechanisms are needed to implement the
use of consent forms that enable future use of data.
Other unresolved issues raised in the privacy breakout session
overlap with those from the data management and platform
governance sessions, which are covered in Chapters 4 and 6. For
example, Haas suggested that better definitions need to be developed
for what constitutes sensitive data and what qualifications should be
required for an investigator to access shared data. Barch added that
there should be some consequence for investigators who violate data
use agreements designed to protect individuals’ privacy. Rosati noted
that most data use agreements fail to specify whether an individual
researcher or the institution is responsible if data are misused. Barch
and Hanson discussed the utility of having institution-oriented resources
of common practices for data sharing and use in agreements with
commercial entities, including details such as secondary relationship
governance in the case of the original company being acquired.

__________________
1 For more information, see https://www.ukbiobank.ac.uk (accessed November 10, 2019).
2 For more information, see https://www.ndcn.ox.ac.uk/divisions/fmrib (accessed November
10, 2019).
3 For more information, see https://www.dementiasplatform.uk (accessed November 10,
2019).
4 For more information, see https://www.fda.gov/safety/fdas-sentinel-initiative (accessed
November 10, 2019).
4

Managing Data and Promoting
Interoperability in the Cloud

Highlightsa
Developing interoperability mechanisms that enable data
platforms to talk to each other is critical, whether or not
these platforms exist in the cloud (Evans).
The National Library of Medicine is working to accelerate
the promotion and adoption of Fast Healthcare
Interoperability Resources standards to promote data
exchange across the National Institutes of Health (Huerta).
Although housing data in a single place could support more
rapid research progress, sometimes it may be more
practical to use federated models and store different levels
of data in different ways. For example, the Psychiatric
Genomics Consortium shares data on a compute cluster in
the Netherlands that, while not in the cloud, uses similar
data sharing and standardized processing approaches
(Neale).
Harmonized approaches, funding, and training are needed
to enable transforming data from a raw state to a
standardized format, which is costly and time consuming
(Huerta, Nalls, Ramoni, Snyder).
A common coordination frame would be needed to merge
different types of data in repositories and platforms
(Marcus).

__________________
a These points were made by the individual workshop participants identified above.
They are not intended to reflect a consensus among workshop participants.

Many of the issues related to data management and integration
are not cloud specific, said Alan Evans, James McGill Professor of
Neurology and Psychiatry at McGill University. Indeed, he said,
getting the major platforms to develop interoperability definitions to
enable data sharing transcends the cloud. But without that
cooperation, there will continue to be islands and communities that
are unable to communicate.
The web of regulations referred to in the section on privacy (see
Chapter 3) further complicates efforts to integrate data across
geographic boundaries, noted Eline Applemans, scientific program
manager in neuroscience at the Foundation for the National
Institutes of Health (FNIH). Benjamin Neale, associate professor in
the Analytic and Translational Genetics Unit at Massachusetts
General Hospital and the Broad Institute of MIT and Harvard,
concurred that the GDPR regulations require cloud environments to
be set up in each country, allowing investigators to analyze data
within national boundaries. He suggested that although research
could proceed more rapidly if data were housed in a single place, the
community should be open to federated models and storage of
different levels of data in different ways. For example, summary-
level information might be shared in a highly interoperable
environment, while individual-level data may be housed in a more
restricted capacity.
Interoperability is facilitated by standards, but developing widely
accepted data standards requires cooperation and is itself
challenging. Data standards could provide the opportunity for large
cloud-based neuroscience resources to work together; however, in a
dynamic field like neuroscience with changing data modalities and
technologies, it can be difficult to corral standards, said Michael
Huerta. Daniel Marcus, professor of imaging neuroscience at the
Washington University School of Medicine in St. Louis, suggested
that inadequate cooperation arises not from a lack of interest but from
a lack of incentives. Governments can play a role in creating such
incentives, as well as in coordinating collaborative efforts, said
Rebecca Li.
Maryann Martone added that the ideal people to develop
standards may not be researchers themselves because they may
lack expertise in informatics and coding. However, the standards
developed should map onto what the researchers actually do in a
way that they can understand how these constructs represent their
experimental paradigms, she said. Thus, she said, it is probably
helpful to start by asking researchers what they need from the data
and what they are willing or unwilling to do to achieve their goals.
Data management can be costly and time consuming, said Huerta.
Researchers should think about data integration and data sharing
from the beginning as they are developing and designing their
projects and should balance the costs versus benefits (value
assessment) in deciding what level of data management is needed,
he said. Martone suggested that it may be helpful to ask researchers
to fill out templates of metadata schemes. These templates, she
said, should be simple and not overly prescriptive. Huerta added that
NIH staff need to understand the complexities of data management;
one example is that data cleaning is essential and can be expensive,
said Huerta.

CURRENT PROMISING PRACTICES
REGARDING STANDARDS DEVELOPMENT AND
INTEROPERABILITY
Huerta recalled that about 20 years ago, the major neuroimaging labs
and software developers came together for workshops to develop the
Neuroimaging Informatics Technology Initiative (NIFTI) imaging
standard, which is still widely used. Now, he said, his office is working to
accelerate the promotion and adoption of Fast Healthcare
Interoperability Resources (FHIR) standards to promote health care
information exchange across NIH. He added that NIH is preparing to
release for public comment a data management and sharing policy,
which will require NIH-funded researchers to include a data
management and sharing plan in their grant proposals.1
Neale suggested that genetics is one domain within the field of
neuroscience that has already made progress in sharing data. The
Psychiatric Genomics Consortium (PGC) was launched in 2007 with
the goal of conducting huge genome-wide analyses of psychiatric
disorders by bringing researchers together from around the world to
work collaboratively (Psychiatric GWAS Consortium Steering
Committee, 2009). The more than 800 investigators from 38
countries that have joined this consortium share data on a research
compute cluster in the Netherlands that functions in a manner
similar to a cloud, enabling many different groups to share and work
together with a standardized kind of processing and analysis, said
Neale.
The UK Biobank has created a different kind of data model in
which data are made available for downloading, said Neale. He
suggested that it may be possible to set up cloud-based methods
that would enable investigators to point to and analyze those data
without downloading them.
Meanwhile, the National Center for Biotechnology Information
(NCBI) has developed a database of genotypes and phenotypes
(dbGaP)2 to archive and distribute data and results from
genotype/phenotype studies conducted in humans, said Neale. He
added that the National Human Genome Research Institute (NHGRI)
But the native woman obstinately declares that she will not go on
to the Mandalinati hills, and it is only upon a promise of receiving
double pay that she at last complainingly consents to accompany her
mistress to the castle. Ethel has to suffer, however, for descending to
bribery, as before the ascent commences every servant in her
employ has bargained for higher wages, and unless she wishes to
remain in the plains she is compelled to comply with their demands.
But she determines to write and tell Charlie of their extortion by the
first opportunity, and hopes that the intelligence may bring him up,
brimming with indignation, to set her household in order. Her first
view of the castle, however, repays her for the trouble she has had in
getting there. She thinks she has seldom seen a building that strikes
her with such a sense of importance. It is formed of a species of
white stone that glistens like marble in the sunshine, and it is
situated on the brow of a jutting hill that renders it visible for many
miles round. The approach to it is composed of three terraces of
stone, each one surrounded by mountainous shrubs and hill-bearing
flowers, and Ethel wonders why the Rajah Mati Singh, having built
himself such a beautiful residence, should ever leave it for the use of
strangers. She understands very little of the native language, but
from a few words dropt here and there she gathers that the castle
was originally intended for a harem, and supposes the rajah’s wives
found the climate too cold for susceptible natures. If they disliked the
temperature as much as her native servants appear to do, it is no
wonder that they deserted the castle, for their groans and moans
and shakings of the head quite infect their mistress, and make her
feel more lonely and nervous than she would otherwise have done,
although she finds the house is so large that she can only occupy a
small portion of it. The dining-hall, which is some forty feet square, is
approached by eight doors below, two on each side, whilst a gallery
runs round the top of it, supported by a stone balustrade and
containing eight more doors to correspond with those on the ground
floor. These upper doors open into the sleeping chambers, which all
look out again upon open-air verandahs commanding an extensive
view over the hills and plains below. Mrs Dunstan feels very dismal
and isolated as she sits down to her first meal in this splendid dining-
hall, but after a few days she gets reconciled to the loneliness, and
sits with Katie on the terraces and amongst the flowers all day long,
praying that the fresh breeze and mountain air may restore the roses
to her darling’s cheeks. One thing, however, she cannot make up her
mind to, and that is to sleep upstairs. All the chambers are furnished,
for the Rajah Mati Singh is a great ally of the British throne, and
keeps up this castle on purpose to ingratiate himself with the English
by lending it for their use; but Ethel has her bed brought downstairs,
and occupies two rooms that look out upon the moonlit terraces. She
cannot imagine why the natives are so averse to this proceeding on
her part. They gesticulate and chatter—all in double Dutch, as far as
she is concerned—but she will have her own way, for she feels less
lonely when her apartments are all together. Her Dye goes on her
knees to entreat her mistress to sleep upstairs instead of down; but
Ethel is growing tired of all this demonstration about what she knows
nothing, and sharply bids her do as she is told. Yet, as the days go
on, there is something unsatisfactory—she cannot tell what—about
the whole affair. The servants are gloomy and discontented, and
huddle together in groups, whispering to one another. The Dye is
always crying and hugging the child, while she drops mysterious
hints about their never seeing Mudlianah again, which make Ethel’s
heart almost stop beating, as she thinks of native insurrections and
rebellions, and wonders if the servants mean to murder her and
Katie in revenge for having been forced to accompany them to
Mandalinati.
Meanwhile, some mysterious circumstances occur for which Mrs
Dunstan cannot account. One day, as she is sitting at her solitary
dinner with two natives standing behind her chair, she is startled by
hearing a sudden rushing wind, and, looking up, sees the eight doors
in the gallery open and slam again, apparently of their own accord,
whilst simultaneously the eight doors on the ground floor which were
standing open shut with a loud noise. Ethel looks round; the two
natives are green with fright; but she believes that it is only the wind,
though the evening is as calm as can be. She orders them to open
the lower doors again, and having done so, they have hardly
returned to their station behind her chair before the sixteen doors
open and shut as before. Mrs Dunstan is now very angry; she
believes the servants are playing tricks upon her, and she is not the
woman to permit such an impertinence with impunity. She rises from
table majestically and leaves the room, but reflection shows her that
the only thing she can do is to write to her husband on the subject,
for she is in the power of her servants so long as she remains at the
castle, where they cannot be replaced.
She stays in the garden that evening, thinking over this occurrence
and its remedy, till long after her child has been put to bed—and as
she re-enters the castle she visits Katie’s room before she retires to
her own, and detects the Dye in the act of hanging up a large black
shawl across the window that looks out upon the terrace.
‘What are you doing that for?’ cries Ethel impetuously, her
suspicions ready to be aroused by anything, however trivial.
The woman stammers and stutters, and finally declares she
cannot sleep without a screen drawn before the window.
‘Bad people’s coming and going at night here!’ she says in
explanation, ‘and looking in at the window upon the child; and if they
touch missy she will die. Missus had better let me put up curtain to
keep them out. They can’t do me any harm. It is the child they come
for.’
‘Bad people coming at night! What on earth do you mean, Dye?
What people come here but our own servants? If you go on talking
such nonsense to me I shall begin to think you drink too much
arrack.’
‘Missus, please!’ replies the native with a deprecatory shrug of the
shoulders; ‘but Dye speaks the truth! A white woman walks on this
terrace every night looking for her child, and if she sees little missy,
she will take her away, and then you will blame poor Dye for losing
her. Better let me put up the curtain so that she can’t look in at
window.’
END OF VOL. II.
*** END OF THE PROJECT GUTENBERG EBOOK A MOMENT OF
MADNESS, AND OTHER STORIES (VOL. 2 OF 3) ***

Updated editions will replace the previous one—the old editions will
be renamed.

Creating the works from print editions not protected by U.S.
copyright law means that no one owns a United States copyright in
these works, so the Foundation (and you!) can copy and distribute it
in the United States without permission and without paying copyright
royalties. Special rules, set forth in the General Terms of Use part of
this license, apply to copying and distributing Project Gutenberg™
electronic works to protect the PROJECT GUTENBERG™ concept
and trademark. Project Gutenberg is a registered trademark, and
may not be used if you charge for an eBook, except by following the
terms of the trademark license, including paying royalties for use of
the Project Gutenberg trademark. If you do not charge anything for
copies of this eBook, complying with the trademark license is very
easy. You may use this eBook for nearly any purpose such as
creation of derivative works, reports, performances and research.
Project Gutenberg eBooks may be modified and printed and given
away—you may do practically ANYTHING in the United States with
eBooks not protected by U.S. copyright law. Redistribution is subject
to the trademark license, especially commercial redistribution.

START: FULL LICENSE


THE FULL PROJECT GUTENBERG LICENSE
PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK

To protect the Project Gutenberg™ mission of promoting the free
distribution of electronic works, by using or distributing this work (or
any other work associated in any way with the phrase “Project
Gutenberg”), you agree to comply with all the terms of the Full
Project Gutenberg™ License available with this file or online at
www.gutenberg.org/license.

Section 1. General Terms of Use and Redistributing Project Gutenberg™
electronic works
1.A. By reading or using any part of this Project Gutenberg™
electronic work, you indicate that you have read, understand, agree
to and accept all the terms of this license and intellectual property
(trademark/copyright) agreement. If you do not agree to abide by all
the terms of this agreement, you must cease using and return or
destroy all copies of Project Gutenberg™ electronic works in your
possession. If you paid a fee for obtaining a copy of or access to a
Project Gutenberg™ electronic work and you do not agree to be
bound by the terms of this agreement, you may obtain a refund from
the person or entity to whom you paid the fee as set forth in
paragraph 1.E.8.

1.B. “Project Gutenberg” is a registered trademark. It may only be
used on or associated in any way with an electronic work by people
who agree to be bound by the terms of this agreement. There are a
few things that you can do with most Project Gutenberg™ electronic
works even without complying with the full terms of this agreement.
See paragraph 1.C below. There are a lot of things you can do with
Project Gutenberg™ electronic works if you follow the terms of this
agreement and help preserve free future access to Project
Gutenberg™ electronic works. See paragraph 1.E below.

1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright law in
the United States and you are located in the United States, we do
not claim a right to prevent you from copying, distributing,
performing, displaying or creating derivative works based on the
work as long as all references to Project Gutenberg are removed. Of
course, we hope that you will support the Project Gutenberg™
mission of promoting free access to electronic works by freely
sharing Project Gutenberg™ works in compliance with the terms of
this agreement for keeping the Project Gutenberg™ name
associated with the work. You can easily comply with the terms of
this agreement by keeping this work in the same format with its
attached full Project Gutenberg™ License when you share it without
charge with others.

1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside the
United States, check the laws of your country in addition to the terms
of this agreement before downloading, copying, displaying,
performing, distributing or creating derivative works based on this
work or any other Project Gutenberg™ work. The Foundation makes
no representations concerning the copyright status of any work in
any country other than the United States.

1.E. Unless you have removed all references to Project Gutenberg:

1.E.1. The following sentence, with active links to, or other
immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project Gutenberg™
work (any work on which the phrase “Project Gutenberg” appears, or
with which the phrase “Project Gutenberg” is associated) is
accessed, displayed, performed, viewed, copied or distributed:

This eBook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this eBook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.

1.E.2. If an individual Project Gutenberg™ electronic work is derived
from texts not protected by U.S. copyright law (does not contain a
notice indicating that it is posted with permission of the copyright
holder), the work can be copied and distributed to anyone in the
United States without paying any fees or charges. If you are
redistributing or providing access to a work with the phrase “Project
Gutenberg” associated with or appearing on the work, you must
comply either with the requirements of paragraphs 1.E.1 through
1.E.7 or obtain permission for the use of the work and the Project
Gutenberg™ trademark as set forth in paragraphs 1.E.8 or 1.E.9.

1.E.3. If an individual Project Gutenberg™ electronic work is posted
with the permission of the copyright holder, your use and distribution
must comply with both paragraphs 1.E.1 through 1.E.7 and any
additional terms imposed by the copyright holder. Additional terms
will be linked to the Project Gutenberg™ License for all works posted
with the permission of the copyright holder found at the beginning of
this work.

1.E.4. Do not unlink or detach or remove the full Project
Gutenberg™ License terms from this work, or any files containing a
part of this work or any other work associated with Project
Gutenberg™.

1.E.5. Do not copy, display, perform, distribute or redistribute this
electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1 with
active links or immediate access to the full terms of the Project
Gutenberg™ License.

1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if you
provide access to or distribute copies of a Project Gutenberg™ work
in a format other than “Plain Vanilla ASCII” or other format used in
the official version posted on the official Project Gutenberg™ website
(www.gutenberg.org), you must, at no additional cost, fee or expense
to the user, provide a copy, a means of exporting a copy, or a means
of obtaining a copy upon request, of the work in its original “Plain
Vanilla ASCII” or other form. Any alternate format must include the
full Project Gutenberg™ License as specified in paragraph 1.E.1.

1.E.7. Do not charge a fee for access to, viewing, displaying,
performing, copying or distributing any Project Gutenberg™ works
unless you comply with paragraph 1.E.8 or 1.E.9.

1.E.8. You may charge a reasonable fee for copies of or providing
access to or distributing Project Gutenberg™ electronic works
provided that:

• You pay a royalty fee of 20% of the gross profits you derive from
the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”

• You provide a full refund of any money paid by a user who
notifies you in writing (or by e-mail) within 30 days of receipt that
s/he does not agree to the terms of the full Project Gutenberg™
License. You must require such a user to return or destroy all
copies of the works possessed in a physical medium and
discontinue all use of and all access to other copies of Project
Gutenberg™ works.

• You provide, in accordance with paragraph 1.F.3, a full refund of
any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.

• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.

1.E.9. If you wish to charge a fee or distribute a Project Gutenberg™
electronic work or group of works on different terms than are set
forth in this agreement, you must obtain permission in writing from
the Project Gutenberg Literary Archive Foundation, the manager of
the Project Gutenberg™ trademark. Contact the Foundation as set
forth in Section 3 below.

1.F.

1.F.1. Project Gutenberg volunteers and employees expend
considerable effort to identify, do copyright research on, transcribe
and proofread works not protected by U.S. copyright law in creating
the Project Gutenberg™ collection. Despite these efforts, Project
Gutenberg™ electronic works, and the medium on which they may
be stored, may contain “Defects,” such as, but not limited to,
incomplete, inaccurate or corrupt data, transcription errors, a
copyright or other intellectual property infringement, a defective or
damaged disk or other medium, a computer virus, or computer
codes that damage or cannot be read by your equipment.

1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES - Except
for the “Right of Replacement or Refund” described in paragraph
1.F.3, the Project Gutenberg Literary Archive Foundation, the owner
of the Project Gutenberg™ trademark, and any other party
distributing a Project Gutenberg™ electronic work under this
agreement, disclaim all liability to you for damages, costs and
expenses, including legal fees. YOU AGREE THAT YOU HAVE NO
REMEDIES FOR NEGLIGENCE, STRICT LIABILITY, BREACH OF
WARRANTY OR BREACH OF CONTRACT EXCEPT THOSE
PROVIDED IN PARAGRAPH 1.F.3. YOU AGREE THAT THE
FOUNDATION, THE TRADEMARK OWNER, AND ANY
DISTRIBUTOR UNDER THIS AGREEMENT WILL NOT BE LIABLE
TO YOU FOR ACTUAL, DIRECT, INDIRECT, CONSEQUENTIAL,
PUNITIVE OR INCIDENTAL DAMAGES EVEN IF YOU GIVE
NOTICE OF THE POSSIBILITY OF SUCH DAMAGE.

1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If you
discover a defect in this electronic work within 90 days of receiving it,
you can receive a refund of the money (if any) you paid for it by
sending a written explanation to the person you received the work
from. If you received the work on a physical medium, you must
return the medium with your written explanation. The person or entity
that provided you with the defective work may elect to provide a
replacement copy in lieu of a refund. If you received the work
electronically, the person or entity providing it to you may choose to
give you a second opportunity to receive the work electronically in
lieu of a refund. If the second copy is also defective, you may
demand a refund in writing without further opportunities to fix the
problem.

1.F.4. Except for the limited right of replacement or refund set forth in
paragraph 1.F.3, this work is provided to you ‘AS-IS’, WITH NO
OTHER WARRANTIES OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR ANY PURPOSE.

1.F.5. Some states do not allow disclaimers of certain implied
warranties or the exclusion or limitation of certain types of damages.
If any disclaimer or limitation set forth in this agreement violates the
law of the state applicable to this agreement, the agreement shall be
interpreted to make the maximum disclaimer or limitation permitted
by the applicable state law. The invalidity or unenforceability of any
provision of this agreement shall not void the remaining provisions.

1.F.6. INDEMNITY - You agree to indemnify and hold the
Foundation, the trademark owner, any agent or employee of the
Foundation, anyone providing copies of Project Gutenberg™
electronic works in accordance with this agreement, and any
volunteers associated with the production, promotion and distribution
of Project Gutenberg™ electronic works, harmless from all liability,
costs and expenses, including legal fees, that arise directly or
indirectly from any of the following which you do or cause to occur:
(a) distribution of this or any Project Gutenberg™ work, (b)
alteration, modification, or additions or deletions to any Project
Gutenberg™ work, and (c) any Defect you cause.

Section 2. Information about the Mission of Project Gutenberg™
Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new computers.
It exists because of the efforts of hundreds of volunteers and
donations from people in all walks of life.

Volunteers and financial support to provide volunteers with the
assistance they need are critical to reaching Project Gutenberg™’s
goals and ensuring that the Project Gutenberg™ collection will
remain freely available for generations to come. In 2001, the Project
Gutenberg Literary Archive Foundation was created to provide a
secure and permanent future for Project Gutenberg™ and future
generations. To learn more about the Project Gutenberg Literary
Archive Foundation and how your efforts and donations can help,
see Sections 3 and 4 and the Foundation information page at
www.gutenberg.org.

Section 3. Information about the Project Gutenberg Literary Archive Foundation
The Project Gutenberg Literary Archive Foundation is a non-profit
501(c)(3) educational corporation organized under the laws of the
state of Mississippi and granted tax exempt status by the Internal
Revenue Service. The Foundation’s EIN or federal tax identification
number is 64-6221541. Contributions to the Project Gutenberg
Literary Archive Foundation are tax deductible to the full extent
permitted by U.S. federal laws and your state’s laws.

The Foundation’s business office is located at 809 North 1500 West,
Salt Lake City, UT 84116, (801) 596-1887. Email contact links and up
to date contact information can be found at the Foundation’s website
and official page at www.gutenberg.org/contact

Section 4. Information about Donations to the Project Gutenberg Literary
Archive Foundation
Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission of
increasing the number of public domain and licensed works that can
be freely distributed in machine-readable form accessible by the
widest array of equipment including outdated equipment. Many small
donations ($1 to $5,000) are particularly important to maintaining tax
exempt status with the IRS.

The Foundation is committed to complying with the laws regulating
charities and charitable donations in all 50 states of the United
States. Compliance requirements are not uniform and it takes a
considerable effort, much paperwork and many fees to meet and
keep up with these requirements. We do not solicit donations in
locations where we have not received written confirmation of
compliance. To SEND DONATIONS or determine the status of
compliance for any particular state visit www.gutenberg.org/donate.

While we cannot and do not solicit contributions from states where
we have not met the solicitation requirements, we know of no
prohibition against accepting unsolicited donations from donors in
such states who approach us with offers to donate.

International donations are gratefully accepted, but we cannot make
any statements concerning tax treatment of donations received from
outside the United States. U.S. laws alone swamp our small staff.

Please check the Project Gutenberg web pages for current donation
methods and addresses. Donations are accepted in a number of
other ways including checks, online payments and credit card
donations. To donate, please visit: www.gutenberg.org/donate.

Section 5. General Information About Project Gutenberg™ electronic works
Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could be
freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose network of
volunteer support.

Project Gutenberg™ eBooks are often created from several printed
editions, all of which are confirmed as not protected by copyright in
the U.S. unless a copyright notice is included. Thus, we do not
necessarily keep eBooks in compliance with any particular paper
edition.

Most people start at our website which has the main PG search
facility: www.gutenberg.org.

This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg Literary
Archive Foundation, how to help produce our new eBooks, and how
to subscribe to our email newsletter to hear about new eBooks.
