Cell - Vol. 181 (Nº7)
Leading Edge
Editorial
Science Has a Racism Problem
We are the editors of a science journal, committed to publishing and disseminating exciting work across the biological sciences. We are 13 scientists. Not one of us is Black. Underrepresentation of Black scientists goes beyond our team, to our authors, reviewers, and advisory board. And we are not alone. It is easy to divert blame, to point out that the journal is a reflection of the scientific establishment, to quote statistics. But it is this epidemic of denial of the integral role that each and every member of our society plays in supporting the status quo by failing to actively fight it that has allowed overt and systemic racism to flourish, crippling the lives and livelihoods of Black Americans, including Black scientists.

Science has a racism problem.

Look to the history of human genetics, a field that has been used repeatedly as scientific rationale for the definition of human “races” and to support inherent inequalities. Proponents of eugenics use the alleles we carry as reason to declare racial superiority, as if expression of a lactase gene has bearing on one’s humanity. Race is not genetic.

Look to the exploitation of Black research subjects. Acknowledge the sheer volume of past and current scientific research made possible by cells stolen decades ago from Henrietta Lacks, a Black woman with cancer. Remember the Tuskegee syphilis study that intentionally withheld appropriate treatment to hundreds of Black men. Think about the issues of consent, of ownership, and of medical ethics and do not overlook the shared role of race in these violations.

Look to the extreme disparity in the genetic and clinical databases scientists have built, with the overwhelming majority of data from white Americans of European descent and the resulting dearth of understanding of health and disease in Black individuals. Read statistics about morbidity and mortality disparities in hospitals around the country, highlighted by the current pandemic, and ask why Black women are five times more likely than white women to die during pregnancy, or why Black infants are twice as likely to die as white babies born in the US. Black health has never been the priority.

Science has a racism problem. And it is not limited to scientific discoveries and their attendant usage. The scientific establishment, scientific education, and the metrics used to define scientific success have a racism problem as well.

Black Americans face a mountain of challenges built on centuries of systemic structural racism and the United States’ history of slavery and racial oppression. Educational opportunities, mentorship and representation, and our ingrained, often unconscious attitudes all play a role. The gatekeeping system in academia, industry, and scientific organizations was not designed to correct for centuries of compounded disadvantage and oppression. It is time for renovation.

We urge our community members who have the means to enact change to do so. Hiring committees, educators, mentors, admissions committees, classmates, researchers: what can you do to raise up Black students and colleagues in your communities and institutions? None of us individually can stem the tide of racism or rebuild an unjust society, but every action helps.

We are part of the problem, as are all of us who do not press for change on a daily basis. It should not have taken the recent deaths of George Floyd, Breonna Taylor, and Ahmaud Arbery for us to speak and to act. We are asking ourselves what we can do to be stronger allies, stronger anti-racists.

Cell stands with our Black readers, reviewers, authors, and colleagues. We are committed to listening to and amplifying their voices, to educating ourselves, and to finding ways that we can help and do better. We alone cannot fix racism. But we have the advantage of having a platform, so we will put in the work, we will listen, and we will act.

As a start, we are committing to the following actions to highlight and increase representation of Black scientists:

1. Representing – we will feature and amplify Black and other underrepresented minority authors of Cell papers on social media. If you are a person of color and you wish to be highlighted in this way, please tell us. Email the editor of your paper with the subject line “Faces of Cell” at any point in the publication process, and we will be honored to post about your paper with your photo and/or your Twitter handle and to re-tweet and amplify your own posts and stories.
2. Educating – we are committed to featuring issues of importance to the scientific community in our pages. We pledge to purposefully highlight Black authors and perspectives in the review and commentary content that we commission and publish and to share these with the greater scientific community. Has your department or institute already made changes or launched successful initiatives? Tell us, and we will try to find ways to share those stories. Have new ideas? Let us know.
3. Diversifying – we pledge to improve the diversity of our advisory board and our reviewer pool, using our experience with gender equity initiatives to increase representation of non-white scientists, which is far too low. We are actively investigating ways to improve diversity through our outreach, recruiting, and hiring efforts, at Cell and across Cell Press. If you are a Black scientist with an interest in editorial careers, get in touch. We’re eager to talk.
4. Listening – we are editors because we want to learn. If there are ways that we can use our voice and our platform to help the Black scientist community, we want to hear them. Please email us if you have concrete ideas for perspectives you want to see or creative ways that you think we can help. We promise to hear them.

We and our colleagues across Cell Press hope to serve as one small part of amplifying Black voices in STEM, and this is just the beginning. We are learning, and we will almost certainly make mistakes along the way. But silence is not, and never should have been, an option.

Science has a racism problem. Scientists are problem solvers. Let’s get to it.

Cell 181, June 25, 2020 © 2020 Published by Elsevier Inc.
Commentary
How Support of Early Career Researchers
Can Reset Science in the Post-COVID19 World
Erin M. Gibson,1,12,* F. Chris Bennett,2 Shawn M. Gillespie,3 Ali Deniz Güler,4 David H. Gutmann,5 Casey H. Halpern,6
Sarah C. Kucenas,4 Clete A. Kushida,1 Mackenzie Lemieux,7 Shane Liddelow,8 Shannon L. Macauley,9 Qingyun Li,10
Matthew A. Quinn,11 Laura Weiss Roberts,1 Naresha Saligrama,5 Kathryn R. Taylor,3 Humsa S. Venkatesh,3 Belgin Yalçın,3
and J. Bradley Zuchero6
1Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA 94305, USA
2Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
3Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, CA 94305, USA
4Department of Biology, University of Virginia, Charlottesville, VA 22903, USA
5Department of Neurology, Washington University School of Medicine, St. Louis, MO 63110, USA
6Department of Neurosurgery, Stanford University School of Medicine, Palo Alto, CA 94305, USA
7The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
8Neuroscience Institute, NYU School of Medicine, New York, NY 10016, USA
9Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
10Departments of Neuroscience and Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
11Department of Pathology, Section on Comparative Medicine, Wake Forest School of Medicine, Winston-Salem, NC 27517, USA
12Lead Contact
*Correspondence: egibson1@stanford.edu
https://doi.org/10.1016/j.cell.2020.05.045
The COVID19 crisis has magnified the issues plaguing academic science, but it has also provided the scientific establishment with an unprecedented opportunity to reset. Shoring up the foundation of academic science will require a concerted effort between funding agencies, universities, and the public to rethink how we support scientists, with a special emphasis on early career researchers.

The novel coronavirus, SARS-CoV-2, has placed science at the center of every conversation, amplifying the importance of scientific research to economic stability, healthcare infrastructure, and disaster preparedness. In academic science, recovery from the immediate COVID19 crisis will require departments, universities, private foundations, federal agencies, and the public to work together collaboratively and comprehensively. The goal of recovery should not be to return to “normal” but, rather, to reset. Here, we argue that recovery provides us with the opportunity to address three systemic issues that plague the conduct of research in the twenty-first century, with an emphasis on supporting early career researchers who are the most vulnerable. The strategies needed to ensure stability and success of early career scientists post-COVID19 can be adapted to chip away at the systemic issues affecting the scientific establishment.

Excess Does Not Equal Excellence
Science has changed immensely over the past 50 years. More has become better: more experiments per paper, more papers per year, more expectations and requirements for grants and tenure, more opinions from reviewers. The scientific community rewards quantity over quality. Most scientists can easily name a seminal paper; many were published long before the 2000s, and many had, at most, a handful of figures. Today, papers are often published with a plethora of supplemental figures that will largely go unread and underappreciated. The desire for “more” results in delays in publication, the awarding of grants, and career advancement for early career researchers; it also stymies creativity and encourages the proliferation of low-quality journals.

Diversification Leads to Discovery
This crisis is exacerbating the well-documented discrimination afflicting academic science (Monroe et al., 2008). Women, parents, and individuals who identify as racial or ethnic minorities leave science, technology, engineering, and math (STEM) fields as early career researchers at an excessively high rate in the best of times and will undoubtedly suffer more from the present lab closures. The responsibilities of family life disproportionately impact women. A parent who is trying to homeschool their children, manage household duties, and work will have left little time to further their own scientific agenda. Faculty with family responsibilities, women specifically, must be supported. The COVID19 crisis will only highlight the rampant diversity issues plaguing the scientific establishment, many of which begin with the loss of women and minorities during early career stages and may lead to further disenfranchisement of the disadvantaged (Malisch et al., 2020).

Rethink the Fundamentals of Funding
The current model of academic science is heavily reliant upon federal funding, even though agencies such as the National Institutes of Health (NIH) were not built to sustain such expectations. The federal government’s funding capacity has significantly diminished as the cost of science has radically increased. The 2019 defense
budget was $685 billion while the 2019 NIH budget was $39 billion. The COVID19 crisis has clearly amplified that the greatest risk to American life is not war, but disease. Funding is needed at all levels; however, early career researchers should be particularly supported as the consistent trend of shifting funding away from younger researchers has no end in sight (Daniels, 2015).

Ensuring a Durable Future for Academic Science Post-COVID19
Recovery from the immediate COVID19 crisis necessitates a multi-pronged approach including fiscal and non-fiscal strategies to help graduate students, postdoctoral fellows, and early and later career faculty. This pandemic has particularly impacted senior postdoctoral fellows seeking academic faculty positions and early career faculty seeking to establish themselves as independent investigators. Special consideration for these early career researchers is key to overcoming the crisis and strengthening the foundations of academic science. Our action plan proposed below is not an exhaustive list of all possible recommendations for supporting scientists, nor is it inclusive of every academic scientist’s specific circumstance. Not all of our suggestions are applicable at every university or institution, as each will have its own unique set of challenges. We acknowledge that monetary support will be limited due to the deteriorating economic situation and drastic loss of revenue from clinical operations for most medical campuses. While the immediate goal of the recommendations is to provide support for scientists from funding agencies, universities, departments, and the public following COVID19, this support also provides solutions to the three major challenges. Solutions to these systemic issues (i.e., Excess Does Not Equal Excellence, Diversification Leads to Discovery, or Funding Agencies) are interwoven across the structure of academic science, allowing us to comprehensively tackle these issues at all levels. Plans for recovery from the COVID19 pandemic must ensure as much continuity as possible in research while improving upon existing infrastructures in order to provide a more inclusive, cohesive, and efficient future for the next generation of independent scientists.

Funding Agencies
Grantsmanship
The resiliency of research is dependent upon the support of funding agencies. Like the broader scientific community, funding agencies will need to adapt their strategies and structure to fit the changing times. Simplification of grant application processes, including fewer supplemental documentations and more implementation of letter-of-intent formats prior to full proposals, could increase efficiency for both the funding agency and researcher. Lab closures will undoubtedly create a void in the preliminary data that are necessary to obtain most awards. Early career researchers who had less time to acquire these data prior to lab shutdowns will be the most affected. Funding agencies could introduce policies and programs targeted at early investigators that require fewer preliminary data (similar to the National Institute of Mental Health [NIMH] Brain Research through Advancing Innovative Neurotechnologies [BRAIN] Initiative R01 or the DP2), reducing the excess in data required for most grants. Grants submitted by graduate students, postdoctoral fellows, and early career faculty who do not have sufficient preliminary data per current standards should be given special consideration. Currently, many of the new funding opportunities by funding agencies, such as the NIH, are geared toward supplements to existing grants or COVID-related research. As there will likely be restrictions or reductions to new funding opportunities in the coming years due to fiscal shortages, faculty with existing grants might help early career faculty by including them in their supplemental applications. Including early career faculty will also foster collaboration and resource sharing, both of which will be vital during this time (Excess Does Not Equal Excellence and Rethink the Fundamentals of Funding).

Extension of Deadlines, Timelines, and Funding
Numerous funding agencies have already implemented deadline extensions, but deadlines must be further extended for the duration of lab disruptions. It is also imperative that funding agencies extend early investigator status for grant applications and implement no-cost extensions for currently held grants. Additional bridge funding programs may be especially important for faculty who are between projects or aiming to switch areas of study following the COVID19 crisis.

Universities
Extensions for Tenure: Faculty
Most universities have added one-year extensions to the tenure tracks of early career researchers, but sliding extensions may better support the success of vulnerable academics. Many early career investigators may request extensions during lab closures, but they should also have the ability to go up for tenure early if the opportunity arises. Ensuring the promotion and advancement of marginalized groups such as women, who make up <30% of STEM faculty, is even more imperative post-COVID19. COVID19-initiated resetting of expectations for the publishing, teaching, mentorship, and service requirements for tenure may not only help minimize the excesses innate to the current tenure structure, but also may help foster environments that can acknowledge implicit biases and keep marginalized groups from disproportionately leaving STEM fields. Tenure expectations for the next generation of early career researchers may need to account for increased variability between faculty that is exacerbated by the COVID19 crisis and allow for more flexibility in the process. This crisis has amplified how the antiquated one-size-fits-all guidelines only encourage the disenfranchisement of women and racial or ethnic minorities (Diversification Leads to Discovery and Excess Does Not Equal Excellence).

Trainees
The current crisis will have a dramatic trickle-down effect, and numerous hiring freezes are already in place. Mechanisms to allow postdoctoral fellows or graduate students in their final year to continue in their current positions should be enacted, if necessary, and if labs or universities are able to provide fiscal support. Current closures are also disrupting the ability of many graduate students to complete their rotations. Universities could extend the timeline for rotations and potentially cover graduate students’ stipends. Trainees, particularly postdoctoral fellows, may
have limited ability to extend their period of training due to visa restrictions. Universities should coordinate with federal agencies to pursue strategies aimed at extending visa expiration timelines, allowing trainees to complete work that was delayed due to the COVID19 crisis. These mechanisms are needed to assure that we do not lose an entire generation of scientists following the coronavirus crisis.

Curtailment of Applicable Hiring Freezes
Many universities have implemented hiring freezes for faculty and staff for the remainder of the year or beyond. Universities should not limit the ability of early career faculty to hire postdoctoral fellows and staff, however. Restricting early career faculty from hiring technical assistance and lab managers will stymie their ability to generate preliminary data, which will consequently limit grant and paper submissions and delay career advancement. Even a short hiring freeze could have devastating effects on the ability of early career faculty labs to succeed. Allowing early career faculty to continue hiring will also help to ease the bottleneck of graduate students looking for postdoctoral or research scientist positions within the next few years. Hiring freezes at any level will disproportionately affect early career individuals and oversaturate the market with qualified candidates. Permitting ongoing interviews for faculty positions, even if the official hire date is postponed, could alleviate stress on the postdoc population and expedite the hiring process when hiring freezes are lifted. The faculty search process serves as a valuable feedback mechanism for postdoctoral fellows that sometimes has an impact on career path. Halting all hiring and all faculty searches may drive talented postdocs, especially women and members of ethnic or racial minorities, out of academia (Diversification Leads to Discovery).

Institutional Funds and Startup Packages
Although universities may curtail spending from institutional funds, special consideration should be given to new and early career faculty. Early career faculty must retain access to their startup packages during this time. Institutional funds should be released for salary support for early career faculty and for all staff, students, and trainees in their labs. If startup funds are set to expire, the expiration date should be extended. New faculty should be given the funds needed to establish their labs once research activities resume (Rethink the Fundamentals of Funding).

Supplementation
The economic toll caused by shelter-in-place will undoubtedly be significant, including the reduction in funding through endowments and charitable giving. We fully acknowledge that monetary supplementation may be difficult for universities following the COVID19 crisis. Any combination of fiscal supplementation with other mechanisms of non-fiscal support should be considered. Universities might implement new or expanded fellowships for postdocs and graduate students, add to existing startup packages for faculty, assist with the purchasing of equipment or expand shared equipment funding, or create subsidies or joint ventures with federal programs similar to unemployment or re-deployment programs. Universities might supplement pay or provide reimbursement for staff, postdoc, and graduate student salaries during the duration of academic closures.

Supplementation: Per Diem Costs
Many universities have per diem policies that differ based on funding source, with reduced per diem costs associated with federal grants. Early career faculty without federal funding have per diem costs double that of other labs. Universities could implement mechanisms to reduce or supplement animal costs that will be accrued during lab closures and when labs reopen and expand their animal colonies (Rethink the Fundamentals of Funding).

Supplementation: Childcare Initiatives
Onsite daycare facilities support postdoctoral fellows and faculty with young children. These family care centers are critical to narrowing the gap and slowing the attrition of women and parents in science. Universities could work with early childhood education programs to establish or expand daycare and preschool programs, providing free or subsidized childcare for faculty and teaching opportunities for early education majors. Universities might also reach out to current or retired teachers seeking supplemental income (Diversification Leads to Discovery).

Supplementation: Access to Technology
Universities should encourage and enable graduate students and postdocs to use this time to learn new computational skills in anticipation of reductions in ability to do work at the bench. Many university-offered computational courses were over-committed during lab closures due to a significant increase in enrollment requests. Universities should make a concerted effort to increase bandwidth and capacity for computational courses. Many free online resources are also available to supplement the acquisition of coding skills.

Departments: Administrative and Teaching Load
Administrative and teaching expectations should be reevaluated during university closures. Departments should reassess administrative and teaching loads, especially for early career faculty whose promotions are contingent upon teaching requirements. This is especially important, since female scientists generally have increased teaching loads and more advisory expectations than male scientists (Gibney, 2017), which could disproportionately delay scientific recovery of female scientists from COVID19 closures (Diversification Leads to Discovery).

Mentorship
Mental Health
The COVID19 crisis and subsequent lab closures will take an incredible toll on mental health. Early career faculty who have yet to establish themselves or their research independently and postdocs whose future job prospects are now significantly limited will be especially impacted by prolonged lab shutdowns. Department chairs, division leaders, and mentors should do their best to check in with early career faculty and postdocs during this time. Mentoring will be key both during and after this crisis. Establishing scheduled virtual meetings during social distancing and in-person meetings after labs are reopened could help alleviate some mental stress. University mental health resources are also available for anyone who needs support. As students generally contact female faculty about mental health issues more frequently than male
faculty (Bennett, 1982), equal encouragement of mentorship from all faculty is essential to not overburdening women faculty during this time (Diversification Leads to Discovery).

Figure 1. The COVID19 crisis has magnified the systemic issues plaguing academic research. These include the often stifling excess requirements in publication, tenure, and grant processes; the reliance on funding from national agencies that is catered towards senior level researchers; and the lack of diversity in academic research due to the attrition of women and racial or ethnic minorities during early career stages.

Graduate Student Programs
Mentoring graduate students throughout lab closures and after reopening should be strongly encouraged. Those conducting experiments will be most affected by lab closures, and this should be explicitly acknowledged by faculty and mentors. Universities must assure graduate students that graduate programs will be stabilized and that admittance will not be decreased. For many faculty, graduate students are the major workforce of the lab. To ensure that faculty can successfully build and sustain a lab, continued ability to attract graduate students is necessary. This is especially important for new investigators, as getting postdoctoral fellows can be more challenging for newer faculty.

Faculty Mentorship Programs
Once labs are reopened, pairing early career faculty with a later career faculty mentor of an established lab could facilitate more effective research programs and allow for resource sharing. Later career faculty could be incentivized to help early career researchers through reductions in teaching or administrative loads, supplementations to animal care costs, core facility usages, or other means of reimbursement and/or subsidizations. Investment of later career faculty in the success of early career faculty will help to ensure stability and success in the younger generation of independent researchers.

Clinician-Scientists
Faculty who have clinical responsibilities also necessitate special consideration during this time, especially if they are on the front lines. These individuals will not only lose productivity due to lab closures

Public Initiatives
Make Science a National Priority
The current crisis has brought the importance of science and research to the forefront of public life. Not only is science critical for public health decision-making, but a sustained investment in research better positions political leaders to efficiently deploy testing and therapeutic solutions. Capitalizing on this momentum is crucial to engaging the public in science and science funding. Providing additional funding sources focused on conveying science to the greater public and stimulating interest in science through educational outreach is critical. Exploiting technology and social media to bring science and research directly to the public will be vital in the post-COVID19 world. Such technology might include mechanisms to allow private citizens to directly invest in science and scientists (Else, 2019; Miller, 2019), including simplified website-based donation platforms or inclusion on election ballots. This is necessary for establishing new funding sources for scientists, potentially supplementing the dearth of funding for early career researchers at federal funding agencies (Rethink the Fundamentals of Funding).

Enhanced Scientific Transparency
The COVID19 crisis has revealed a lack of public understanding about how science is funded, conducted, and reported. The current administration’s belief that the NIH is “giving away $32 billion a year” should be cause for concern (DeYoung et al., 2020). Much of the mistrust evident between the scientific establishment and the general population is rooted in lack of transparency and community
involvement in science. Taking scientists out of the “ivory tower” and increasing accessibility through technology may help to assuage the mistrust that hinders our preparedness in times of crisis. People cannot support what they do not understand. Removing excess requirements in publishing, grantsmanship, and tenure expectations could have the added benefit of creating more time for scientists to interact in the public domain. Scientists must work on building the trust that is imperative to success as a community, and early career scientists are primed to help pave this new future (Excess Does Not Equal Excellence).

Conclusions
Beyond the immediate challenges of returning to laboratories and research careers, the COVID19 crisis has exposed some of the underlying weaknesses and problems that permeate the current scientific enterprise (Figure 1). For example, editors are asking reviewers to not request more experiments unless absolutely necessary to validate the core claims of a manuscript during the review process. Most are applauding this effort to minimize excess and calling for its continued implementation even after scientists are able to get back to the bench. All institutions, funding agencies, departments, and members of the scientific community should speak openly and honestly about the difficulties faced during the current situation. Early career researchers should be involved in the decision-making processes, as they represent the future of science and academic leadership. The COVID19 crisis has provided us with the unique opportunity to reflect upon the present norms and enact change through fiscal and non-fiscal strategies. Our hope is that this pandemic will allow us to chart a new course for science, both academically and socially, and to begin to address the core challenges of research, with a special focus on supporting the next generation of independent scientists.

DECLARATION OF INTERESTS
Dr. Roberts serves as Editor-in-Chief of books for the American Psychiatric Association Publishing Division and as Editor-in-Chief of the journal Academic Medicine. Unrelated to this publication, Dr. Roberts serves as an advisor for the Bucksbaum Institute of the University of Chicago Pritzker School of Medicine and owns the small business Terra Nova Learning Systems.

REFERENCES
Bennett, S.K. (1982). Student perceptions of and expectations for male and female instructors: Evidence relating to the question of gender bias in teaching evaluation. J. Educ. Psychol. 74, 170–179.
Daniels, R.J. (2015). A generation at risk: young investigators and the future of the biomedical workforce. Proc. Natl. Acad. Sci. USA 112, 313–318.
DeYoung, K., Sun, L.H., and Rauhala, E. (2020). Americans at World Health Organization transmitted real-time information about coronavirus to Trump administration. The Washington Post. Available from: https://www.washingtonpost.com/world/national-security/americans-at-world-health-organization-transmitted-real-time-information-about-coronavirus-to-trump-administration/2020/04/19/951c77fa-818c-11ea-9040-68981f488eed_story.html.
Else, H. (2019). Crowdfunding research flips science’s traditional reward model. Nature. Available from: https://www.nature.com/articles/d41586-019-00104-1.
Gibney, E. (2017). Teaching load could put female scientists at career disadvantage. Nature. Available from: https://www.nature.com/news/teaching-load-could-put-female-scientists-at-career-disadvantage-1.21839.
Malisch, J.L., Harris, B.N., Sherrer, S.M., Lewis, K.A., Shepherd, S.L., McCarthy, P.C., Spott, J.L., Karam, E.L., Moustaid-Moussa, N., McCrory Calarco, et al. (2020). In the wake of COVID-19, academia needs new solutions to ensure gender equity. Proceedings of the National Academy of Sciences. https://doi.org/10.1073/pnas.2010636117.
Miller, Z. (2019). The best platforms for crowdfunding science research. The Balance: Small Business. Available from: https://www.thebalancesmb.com/top-sites-for-crowdfunding-scientific-research-985238.
Monroe, K., Ozyurt, S., Wrigley, T., and Alexander, A. (2008). Gender Equality in Academia: Bad News from the Trenches, and Some Possible Solutions. Perspectives on Politics 6, 215–233.
Previews
Cap-Snatching Leads to Novel Viral Proteins
Alistair B. Russell1,*
1Division of Biology, University of California, San Diego, San Diego, CA, USA
*Correspondence: a5russell@ucsd.edu
https://doi.org/10.1016/j.cell.2020.05.044
Some negative-sense RNA viruses prime mRNA transcription using host 5′ cap sequences, usurping host
translational machinery and evading antiviral surveillance. In this issue of Cell, Ho et al. identify an additional
consequence of this viral strategy: the acquisition of upstream start codons from host-derived sequences
and subsequent translation of novel viral products.
Canonical eukaryotic mRNAs possess a 5′ methylguanosine cap and a 3′ polyadenosine tail. These features, combined, recruit cellular translational machinery and mark these molecules as "belonging" in the cytoplasmic compartment wherein they are accessible to ribosomes. Influenza A virus (IAV), lacking its own capping machinery, produces capped mRNAs through a process known as cap-snatching. The viral polymerase associates with an actively transcribing RNA polymerase II complex and cleaves the nascent host mRNA, thereafter using the cleaved product to prime viral mRNA production. This process appears to roughly target caps according to their relative abundance in the nuclear compartment, resulting in diverse cap sequences associated with a given viral mRNA (Walker and Fodor, 2019).
Previously, little to no consequence has been ascribed to the composition of host-derived sequences appended to the 5′ end of viral mRNAs, and the heterogeneity of such sequences is certainly consistent with a lack of function in viral transcription and translation. Ho et al. (2020) posited that AUG codons snatched in such a manner might permit the recruitment of ribosomes upstream of the canonical IAV start codons, producing either N-terminal extensions to known viral proteins or novel frameshifted peptides. The authors found that AUG codons could be readily identified in host-derived 5′ sequences in two different IAV strains in two different cell types, indicating that there appeared to be no specific mechanism precluding their acquisition by viral transcriptional machinery. Furthermore, Ho et al. (2020) were able to detect initiating ribosomes associated with upstream, host-derived start codons, and, for a subset of these proteins, they could detect expression by mass spectrometry. Combined, these represent relatively unequivocal evidence that host-derived alternative start codon acquisition can drive protein production in IAV.
What is the consequence of translation from host-derived start codons? Such proteins could play a role in viral infection via at least two non-exclusive mechanisms. (1) They could represent bona fide, functional, overprinted products that serve significant roles in the viral life cycle, and (2) they could provide targets for adaptive immunity (Figure 1). With respect to the first, the genomes of RNA viruses are restricted in length, and overprinting, encoding multiple proteins from the same overlapping sequence, is a common mechanism by which additional coding capacity is procured. Ribosomal frameshifting, encoded alternative start sites, and alternative splicing are, among other strategies, utilized by IAV to generate alternative protein products (Chen et al., 2001; Dubois et al., 2014; Jagger et al., 2012). Ho et al. (2020) found broad conservation between IAV strains of an overprinted protein that would be generated from upstream host-derived start codons, consistent with a functional hypothesis. However, other evolutionary constraints such as the sequence of the "primary" influenza protein, genome packaging constraints, and viral promoter sequences would also produce signatures of conservation in these genomic regions, a point also noted by the authors. Furthering a functional argument, Ho et al. (2020) found evidence that disrupting the expression of one of these overprinted proteins or an N-terminal extension to a viral protein driven by an upstream host-derived start codon altered in vivo pathogenesis. Caution should still be taken when trying to link these virulence data to a true biological function; infection outcome for the host is ancillary to viral fitness, and, even in a model that perfectly recapitulates the host response, the route and dosage of delivery can produce an incredible breadth of outcomes that may have little resemblance to the normal course of disease. Therefore, without a specific proposed mechanism, these data do not yet support inferring a function for these novel proteins in IAV but also do not preclude such a role.
In addition to potential novel functional proteins, this discovery is incredibly important in light of the second possibility: that these upstream start codons may generate targets for immune surveillance. Any protein, regardless of function or lack thereof, once degraded can generate potential peptides for display via MHC-I and mark a cell for subsequent destruction and recruitment of a more robust immune response. One could even posit that, unlike full-length, canonical viral proteins, out-of-frame, or even N-terminal extended, proteins may lack functional selection driving stability. Without such selection, these proteins may be rapidly targeted to the proteasome, and thus preferentially presented via MHC-I, and represent a heretofore unrecognized target for antiviral immunity. It has been increasingly recognized that "off-products" of viral replication, be they aberrant peptides or even aberrant genomes, are frequently the trigger for an immune response (López, 2014; Wei and Yewdell, 2019). This is perhaps an
unavoidable feature of the rapid, exponential growth of viral populations wherein fidelity of replication may be at odds with replicative speed (Fitzsimmons et al., 2018). Ho et al. (2020) found predicted MHC-I epitopes within an overprinted peptide, that these epitopes varied between IAV strains consistent with potential immune pressure, and that an out-of-frame canonical MHC-I peptide could be produced and displayed; thus, formally, MHC-I targets can be generated via acquisition of host-derived start codons. A critical next step is establishing whether peptides derived from native proteins generated by upstream host-derived start codons are displayed via MHC-I, and if so, what role they may play in viral surveillance and clearance. Such work becomes even more important in light of recent attempts to generate T cell responses against IAV, which, although they may not prevent infection, are still thought to have the potential to reduce morbidity and mortality (La Gruta and Turner, 2014).
Cap-snatching is not unique to IAV. To that end, Ho et al. (2020) explored 5′ cap sequences from influenza B virus and Lassa virus and found host-acquired AUG codons in both viral species. Using a plasmid-based system that recapitulates the replicon of Heartland banyangvirus, a Bunyavirus, they further confirmed translation of a luciferase lacking a start codon, consistent with acquisition of host-derived start codons in frame with their reporter. Therefore, the findings with IAV have widespread relevance across many negative-sense RNA viruses. What we make of this moving forward will take a significant amount of work, but such work will have been made possible by this initial observation. We must uncover whether these alternative products serve functional roles, whether they represent significant contributors to MHC recognition of viral infection, and, not touched upon in this study but important nevertheless, whether selection against particular N-terminal extensions or immunogenic peptides constrains the evolutionary trajectories of viral species that rely on cap-snatching. Ho et al. (2020) have provided us with the critical step, identifying that such features can impact viral biology, and it will be exciting to explore the implications of this study.
ACKNOWLEDGMENTS
Work from A.B.R. in this field is supported in part by the Damon Runyon Cancer Research Foundation (DFS-36-19) and NIAID (1K22AI141678).
REFERENCES
Chen, W., Calvo, P.A., Malide, D., Gibbs, J., Schubert, U., Bacik, I., Basta, S., O’Neill, R., Schickli, J., Palese, P., et al. (2001). A novel influenza A virus mitochondrial protein that induces cell death. Nat. Med. 7, 1306–1312.
Decroly, E., Ferron, F., Lescar, J., and Canard, B. (2011). Conventional and unconventional mechanisms for capping viral mRNA. Nat. Rev. Microbiol. 10, 51–65.
Dubois, J., Terrier, O., and Rosa-Calatrava, M. (2014). Influenza viruses and mRNA splicing: doing more with less. MBio 5, e00070–14.
Fitzsimmons, W.J., Woods, R.J., McCrone, J.T., Woodman, A., Arnold, J.J., Yennawar, M., Evans, R., Cameron, C.E., and Lauring, A.S. (2018). A speed-fidelity trade-off determines the mutation rate and virulence of an RNA virus. PLoS Biol. 16, e2006459.
Ho, J.S.Y., Angel, M., Ma, Y., Sloan, E., Wang, G., Martinez-Romero, C., Alenquer, M., Roudko, V., Chung, L., Zheng, S., et al. (2020). Hybrid gene origination creates human-virus chimeric proteins during infection. Cell 181, this issue, 1502–1517.
Jagger, B.W., Wise, H.M., Kash, J.C., Walters, K.-A., Wills, N.M., Xiao, Y.-L., Dunfee, R.L., Schwartzman, L.M., Ozinsky, A., Bell, G.L., et al. (2012). An overlapping protein-coding region in influenza A virus segment 3 modulates the host response. Science 337, 199–204.
La Gruta, N.L., and Turner, S.J. (2014). T cell mediated immunity to influenza: mechanisms of viral control. Trends Immunol. 35, 396–402.
López, C.B. (2014). Defective viral genomes: critical danger signals of viral infections. J. Virol. 88, 8720–8723.
Walker, A.P., and Fodor, E. (2019). Interplay between Influenza Virus and the Host RNA Polymerase II Transcriptional Machinery. Trends Microbiol. 27, 398–407.
Wei, J., and Yewdell, J.W. (2019). Flu DRiPs in MHC Class I Immunosurveillance. Virol. Sin. 34, 162–167.
In this issue of Cell, Alavi et al. report that infection by Vibrio cholerae is blocked by gut microbiome-mediated
hydrolysis of bile acids. Cholera therefore joins amebic dysentery and Clostridioides difficile colitis as enteric
infections profoundly influenced by the microbiome’s impact on bile acid metabolism.
Vibrio cholerae causes watery diarrhea so severe that it kills by dehydration within hours. We are now experiencing the 7th pandemic of cholera, all 7 of which likely originated in the Indian subcontinent, with current estimates of up to 3 million cases and 100,000 deaths annually. That cholera is water-borne was established by the physician John Snow in 1854 by linking victims of the London Broad Street cholera epidemic not to bad air but to the Broad Street water pump. V. cholerae exists in aquatic environments on the surface and in the intestine of copepods (a type of small crustacean). This leads to sporadic outbreaks near rivers in the Indian subcontinent, amplified by human fecal-oral spread of the bacteria during outbreaks causing pandemics.
Diarrhea is caused by cholera toxin, an enzyme that ADP-ribosylates the Gs protein that regulates adenylate cyclase, leading to cyclic AMP (cAMP)-mediated chloride ion (Cl⁻) secretion. Cholera toxin and the toxin coregulated pili (TCP) are regulated by TcpP, a membrane-bound transcriptional activator. There is substantial person-to-person variation in the severity of cholera, with one explanation being personal differences in the microbiome regulating virulence gene expression by TcpP. However, we still lack a complete understanding of the factors that cause person-to-person variation in the severity of cholera.
In this issue of Cell, Ansel Hsiao and colleagues demonstrate that hydrolysis of the bile acid taurocholate to cholate by the gut microbiome blocks TcpP activation and cholera colonization. This paper adds to emerging literature on how the microbiome impacts infectious diseases via microbial metabolism of bile acids in the gut.
The primary bile acids cholate (CA) and chenodeoxycholate (CDCA) are made by the liver, where they are conjugated with a glycine or taurine before being secreted into the duodenum. As they make their way through the small intestine, 95% of bile acids are absorbed in the terminal ileum through the enterohepatic system, a majority being conjugated bile acids (Figure 1). Gut microbes that encode bacterial bile salt hydrolase (bsh) genes can deconjugate, or cleave, the glycine and taurine from conjugated bile acids to yield deconjugated bile acids (e.g., taurocholate [TCA] to taurine and CA). This is a critical first step in microbial bile acid metabolism that leads to all subsequent biotransformations. The deconjugated bile acids that reach the large intestine are then metabolized by members of the gut microbiota into secondary bile acids, including deoxycholate (DCA) (Foley et al., 2019).
Here, Alavi et al. (2020) demonstrate that the composition of the gut microbiome contributes to resistance to cholera. By reconstituting germ-free mice with defined communities of human microbiome bacteria, they discovered that the commensal bacterium Blautia obeum mediates resistance. They show that B. obeum mediates resistance in mice through degrading TCA to CA and that the abundance of the B. obeum BSH enzyme correlated with resistance in humans. In the absence of B. obeum-dependent degradation, TCA induces the TcpP virulence regulator to cause disease. Cholera therefore joins a list of enteropathogens whose ability to cause disease is regulated by microbiome metabolism of bile acids.
A second example is the parasite Entamoeba histolytica, an ameba that invades the intestine by eating the epithelial lining in a process called trogocytosis. Burgess et al. (2020) recently identified the intestinal bacterium Clostridium scindens as providing protection from amebic colitis. Introduction of C. scindens into the microbiome of a mouse altered the bone marrow by inducing expansion of granulocyte-monocyte progenitors (GMPs). C. scindens-mediated protection from amebiasis could be transferred with adoptive transfer of bone marrow to a naive mouse and acted via an increased recruitment of polymorphonuclear neutrophils to the colon. Because C. scindens can dehydroxylate CA to DCA, Burgess et al. (2020) tested whether this conversion mediated the alteration of the marrow. In fact, administration of DCA alone provided complete protection from amebiasis via GMP expansion, demonstrating microbiome-to-bone-marrow communication via bile acids.
A final example relates to Clostridioides difficile infection (CDI). C. difficile is a Gram-positive spore-forming bacillus and the most common cause of hospital-acquired, antibiotic-associated diarrhea. Ingestion of the spore form of C. difficile in an individual with a dysbiotic microbiome (usually due to prior antibiotic therapy) leads to infection of the large intestine by the vegetative stage of C. difficile. Primary and secondary bile acids have been shown to impact C. difficile vegetative growth as well as spore germination and toxin activity. For example, the primary bile acid TCA
induces spore germination, and DCA inhibits C. difficile growth.
Like the amebic colitis example, C. scindens is implicated as having a protective role in CDI. First, depletion of C. scindens is associated with more severe disease in humans and mice. Moreover, in the mouse model of CDI, reconstitution of C. scindens was able to partially restore colonization resistance against CDI, and colonization resistance was associated with secondary bile acid synthesis (Buffie et al., 2015). In patients with recurrent CDI, high levels of conjugated primary bile acids and reduced secondary bile acids were observed in feces when compared to healthy individuals. Successful treatment of recurrent CDI with fecal microbial transplant restored the level of fecal secondary bile acids, specifically, DCA and LCA (Weingarden et al., 2015; Seekatz et al., 2018). Most recently, Reed et al. (2020) have shown that several, but not all, commensal Clostridia encoding the bai operon (which encodes enzymes that convert cholate into the secondary bile acid deoxycholate) are able to inhibit C. difficile growth due to the conversion of CA to DCA, which should provide protection. To summarize, there is strong in vitro and supportive in vivo evidence that microbial metabolism of bile acids has a direct impact on C. difficile spore germination, growth, and toxin production and activity.
Bile acids also regulate the immune system, as demonstrated in amebic dysentery with DCA protecting by increasing marrow GMPs. Additional examples of a direct impact of bile acids on the immune system include the secondary bile acid LCA regulating Th17 responses by interfering with RORγt transcriptional activity (Hang et al., 2019) and, in the context of colitis, bile acids interacting with macrophages to induce IL-10, which polarizes T cells to a regulatory phenotype (Biagioli et al., 2017).
In summary, Alavi et al. (2020) have deepened our understanding of how the composition of the gut microbiome protects from cholera by discovering the role of bacterially encoded bile salt hydrolases. As microbiome science moves from description to mechanism, bile acids synthesized by the host and further converted by the microbiota are becoming central players, both for their direct impact on enteropathogens and for their impact on the immune system.
ACKNOWLEDGMENTS
Work from the authors’ labs is supported by National Institutes of Health grants R35 GM119438 (to C.M.T.), 2R37 AI026649-31, and R01 AI043596-22 (to W.A.P.) and the Henske and McGrath families.
REFERENCES
Alavi, S., Mitchell, J.D., Cho, J.Y., Liu, R., MacBeth, J.C., and Hsiao, A. (2020). Interpersonal gut microbiome variation drives susceptibility and resistance to Vibrio cholerae. Cell 181, this issue, 1533–1546.
Biagioli, M., Carino, A., Cipriani, S., Francisci, D., Marchianò, S., Scarpelli, P., Sorcini, D., Zampella, A., and Fiorucci, S. (2017). The bile acid receptor GPBAR1 regulates the M1/M2 phenotype of intestinal macrophages and activation of GPBAR1 rescues mice from murine colitis. J. Immunol. 199, 718–733.
Buffie, C.G., Bucci, V., Stein, R.R., McKenney, P.T., Ling, L., Gobourne, A., No, D., Liu, H., Kinnebrew, M., Viale, A., et al. (2015). Precision microbiome reconstitution restores bile acid mediated
*Correspondence: l.akkari@nki.nl
https://doi.org/10.1016/j.cell.2020.06.003
Despite its success in multiple tumor types, immunotherapy remains poorly efficacious in brain malignancies.
In this issue of Cell, Friebel et al. and Klemm et al. provide in-depth insights into the versatile nuances of
immune cells in primary and metastatic brain tumors, giving the field a rich framework to explore novel
therapeutic avenues.
The development of therapeutic strategies enlisting cells composing the tumor microenvironment (TME) has revolutionized clinical approaches in recent years, exemplified by T cell immunotherapy in cancer treatment (Waldman et al., 2020). Yet, in organs as unique as the brain with its blood-brain barrier (BBB)-protected niche, the current understanding of how tumors intrinsically emerging in the central nervous system or originating from extracranial sites sculpt the TME to their advantage remains limited, thwarting efficient immune cell harnessing in therapeutic intervention.
Glioblastoma (GBM) and brain metastases (BrMs) bear some of the worst prognoses for cancer patients. Discoveries of brain metastases drivers (Brastianos et al., 2015), genomic alterations of GBM subtypes (Brennan et al., 2013), and diversity at the single-cell level (Patel et al., 2014; Tirosh et al., 2016) have shed light on inter- and intra-tumoral heterogeneity of brain lesions with a focus on tumor cell features, while introducing the notion that stromal composition can be shaped according to different cancer mutational statuses. Further evidence elaborating on this concept has delineated the particular phenotype of tumor-associated macrophages (TAMs) in specific subsets of GBM (Wang et al., 2017), with TAM enrichment generally associated with poor disease outcome (Kielbassa et al., 2019). However, the extent of myeloid cell diversity, including TAM ontogeny and distinct education, has remained largely unexplored in human brain tumors.
In this issue of Cell, two studies (Friebel et al., 2020; Klemm et al., 2020) performed extensive and comprehensive analyses of GBM and BrM immune landscapes from large cohorts of patients, thus providing much needed information on the phenotype and transcriptional programs acquired by components of the TME, in a cell-type- and disease-specific manner.
Through the use of multiparameter fluorescence-activated cell sorting followed by RNA sequencing (Klemm et al., 2020) and CyTOF analyses (Friebel et al., 2020), the authors assessed the abundance and heterogeneity of tissue-resident and peripherally recruited leucocytes in BrMs and gliomas. Microglia (MG) dominated the TME of IDHmut gliomas, considered to be less aggressive brain lesions that displayed limited infiltration of leucocytes. These results contrasted with the immunological landscapes of IDHwt GBM and BrMs, which were enriched in peripherally recruited monocyte-derived macrophages (MDMs). BrMs from multiple primary tumor origins exhibited substantial infiltration of T cells and neutrophils, with melanoma-to-brain BrMs singled out in both studies, harboring numerous CD4+ and CD8+ T cell subsets, and neutrophils heavily infiltrating breast BrMs (Figure 1A).
The prominence of TAMs prompted the authors to perform further analyses of MG and MDM subset composition, spatial
Figure 1. Immune Landscape Composition and Features across Different Types of Brain Tumors
(A) Content of immune cells in the GBM and BrM TME. Tissue-resident microglia (MG) and monocyte-derived macrophages (MDMs) constitute the dominant cell
types composing the immune landscape of GBM in the depicted distinct abundances, according to the IDH mutational status of GBM. Recruited at the tumor site
from the peripheral circulation, leucocytes, including CD4+ T cells, CD8+ T cells, neutrophils, and monocytes, enter the brain parenchyma through the blood-brain
barrier and infiltrate brain tumors, a process enhanced in brain metastases, resulting in increased content of these recruited leucocytes in the BrM TME.
(B) Overview of the functional features acquired by each immune cell type in a disease-specific manner. Macrophages (MG and MDMs) are heterogeneous across
tumor types and display the highest magnitude of plasticity, which ranges from ECM remodeling to immunomodulatory roles in IDHwt GBM and BrMs. Greatly
enriched in BrMs, CD4+ T cells and CD8+ T cells display features of hyporesponsiveness and exhaustion, whereas T cells in GBM are characterized by low
proliferative abilities and activation markers. Natural killer cells present an immature phenotype in IDHwt GBM in comparison to the cytotoxic features displayed in
IDHmut gliomas and BrMs.
organization, and phenotype and transcriptional programs in GBM and BrM in order to establish their nexus role in regulating the immune landscape in a disease-specific manner. Using orthogonal approaches of RNA sequencing (Klemm et al., 2020) and mass cytometry (Friebel et al., 2020), both studies showed that the plasticity of TAMs largely relied upon the high magnitude of MDM phenotype adaptation and transcriptional programs, leading to limited commonalities between TAM subsets across different diseases. The prognostic value of myeloid cell abundance was similarly relevant only when examining MDM features in GBM. Indeed, monocyte infiltration, which was prominent in IDHmut GBM, correlated only with a trend toward increased GBM patient survival, whereas expression of the pan-MDM marker CD163+ (Friebel et al., 2020) or expression of the MDM-representative gene set (Klemm et al., 2020) both correlated with poor survival in low- and high-grade GBM.
Strikingly, MDMs’ heterogeneity was not a random feature across brain tumors but distinct and dictated by each tumor type. The origin of MDM diversity was proposed to rely upon distinct differentiation trajectories from their monocyte progenitors, thus giving rise to multiple MDM subsets carrying specific phenotypes with different prognostic values. Single-cell proteomic analysis was leveraged to ascribe a positive survival outcome to the CD163+MRC1+ MDM subset in low-grade GBM (Friebel et al., 2020).
The findings that MG and MDMs shared some, but limited, features within the same disease setting suggest that the type of tumor, in addition to the affected tissue, weighs heavily on the programming of recruited immune cells. This is further exemplified by the distinct reactive phenotype acquired by MG in IDHmut GBM compared to IDHwt tumors (Friebel et al., 2020). Importantly, the transcriptomic education acquired by either TAM subset did not reflect the defined M1-like or M2-like macrophage polarization phenotypes but included shades of either activation state (Klemm et al., 2020).
A key distinctive feature between BrM and GBM was revealed when assessing the spatial organization and phenotype of infiltrating leucocytes. TAMs and T cells were closely interacting in the BrM TME but not in gliomas (Klemm et al., 2020), with BrM CD4+ and CD8+ T cells exhibiting anergic and exhaustion signatures, respectively. Flawed T cell features can be explained by chronic activation through excessive antigen presentation, including by TAMs, which exert greater immunomodulatory functions in BrM than in IDHwt GBM and express multiple co-inhibitory receptors (Figure 1B). These major differences are in line with the outcome of immune checkpoint blockade, which showed promising efficacy in melanoma BrMs, where T cells are abundant and can be rewired, but little to no response in T cell-excluded primary brain tumors.
Primary cancer cells or their metastatic counterparts face considerable challenges to thrive within the unique brain TME, which can only be overcome by hijacking a selective niche, leading to a paralleled evolution of stromal and immune cells. The deep insights into the diversity of specialized immune landscapes shaped by GBM and BrMs presented in the studies discussed here highlight the need for careful identification of the ontogeny, cellular phenotype, and local education of myeloid cell subsets, without simply utilizing content to predict survival outcomes or to devise therapeutic interventions. Holistic approaches using single-cell RNA sequencing and spatial determinants of the GBM and BrM TME will be necessary to understand how cellular plasticity gives rise to these subtype-specific immunological niches. Indeed, several questions remain to be answered in light of these novel resources: do brain tumors poise peripheral immune cells for reprogramming, beyond affecting their recruitment? How does the genetic make-up of brain tumor cells shape the education of each distinct immune cell type? What underlies the dynamic acquisition of these phenotypes in the course of tumor malignancy, and are these altered by standard of care treatment? The ability of TAMs to remodel the extracellular matrix (ECM) landscape in GBM and BrM additionally opens the perspective to apply tumor-tissue architecture therapies to brain malignancies and either disrupt the locally shaped niche or facilitate drug penetration through the BBB. A major challenge will then be to design rational and optimized timing and methodology of combination therapies to account for the progressive changes undergone by the TME (Rothschilds and Wittrup, 2019). These may include neoadjuvant treatment approaches, which have proven successful in recurrent GBM (Cloughesy et al., 2019) and could be guided by artificial-intelligence-based radiogenomics to dynamically monitor the brain TME (Rudie et al., 2019) and stratify patients into tailored therapeutic intervention.
The vivid heterogeneity of immune cells parallels the remarkable challenge of targeting them to achieve clinical responses in brain tumors and emphasizes the need to further enrich our knowledge of the immune contexture in a dynamic setting. Collectively, these studies provide much needed insights into the multifaceted nuances of immune cells in primary and metastatic brain tumors and resources for the brain tumor community to explore novel therapeutic targets.
REFERENCES
Brastianos, P.K., Carter, S.L., Santagata, S., Cahill, D.P., Taylor-Weiner, A., Jones, R.T., Van Allen, E.M., Lawrence, M.S., Horowitz, P.M., Cibulskis, K., et al. (2015). Genomic Characterization of Brain Metastases Reveals Branched Evolution and Potential Therapeutic Targets. Cancer Discov. 5, 1164–1177.
Brennan, C.W., Verhaak, R.G., McKenna, A., Campos, B., Noushmehr, H., Salama, S.R., Zheng, S., Chakravarty, D., Sanborn, J.Z., Berman, S.H., et al.; TCGA Research Network (2013). The somatic genomic landscape of glioblastoma. Cell 155, 462–477.
Cloughesy, T.F., Mochizuki, A.Y., Orpilla, J.R., Hugo, W., Lee, A.H., Davidson, T.B., Wang, A.C., Ellingson, B.M., Rytlewski, J.A., Sanders, C.M., et al. (2019). Neoadjuvant anti-PD-1 immunotherapy promotes a survival benefit with intratumoral and systemic immune responses in recurrent glioblastoma. Nat. Med. 25, 477–486.
Friebel, E., Kapolou, K., Unger, S., Núñez, N.G., Utz, S., Rushing, E.J., Regli, L., Weller, M., Greter, M., Tugues, S., et al. (2020). Single-Cell Mapping of Human Brain Cancer Reveals Tumor-Specific Instruction of Tissue-Invading Leukocytes. Cell 181, this issue, 1626–1642.
Kielbassa, K., Vegna, S., Ramirez, C., and Akkari, L. (2019). Understanding the Origin and Diversity of Macrophages to Tailor Their Targeting in Solid Cancers. Front. Immunol. 10, 2215.
Klemm, F., Maas, R.R., Bowman, R.L., Kornete, M., Soukup, K., Nassiri, S., Brouland, J.P., Iacobuzio-Donahue, C.A., Brennan, C., Tabar, V., et al. (2020). Interrogation of the microenvironmental landscape in brain tumors reveals disease-specific
Perspective
Pandemic Preparedness: Developing Vaccines
and Therapeutic Antibodies For COVID-19
Gregory D. Sempowski,1,2,* Kevin O. Saunders,3 Priyamvada Acharya,3 Kevin J. Wiehe,1 and Barton F. Haynes1,4,*
1Department of Medicine, Duke Human Vaccine Institute, Duke University School of Medicine, Durham, NC 27710, USA
2Department of Pathology, Duke Human Vaccine Institute, Duke University School of Medicine, Durham, NC 27710, USA
3Department of Surgery, Duke Human Vaccine Institute, Duke University School of Medicine, Durham, NC 27710, USA
4Department of Immunology, Duke Human Vaccine Institute, Duke University School of Medicine, Durham, NC 27710, USA
The pandemic of SARS-CoV-2, the virus that causes COVID-19 respiratory syndrome, has caused global public health and economic crises, necessitating rapid development of vaccines and therapeutic countermeasures. The worldwide response to the COVID-19 pandemic has been unprecedented, with government, academic, and private partnerships working together to rapidly develop vaccine and antibody countermeasures. Many of the technologies being used are derived from prior government-academic partnerships for response to other emerging infections.
Figure 2. Accelerated Platform Technology for Rapid B Cell Screening and Isolation of Pathogen Neutralizing Antibodies Being Used to
Isolate SARS-CoV-2 Antibodies
isolated and delivered a Chikungunya neutralizing antibody by using mRNA in LNPs in a Phase 1 human study (Kose et al., 2019). In addition to mRNA-LNP, DNA or viral vector approaches are also being rapidly developed for pandemic prevention antibody delivery (Balazs et al., 2011; Muthumani et al., 2013).
SARS-CoV-2 vaccine development. Vaccines are the time-honored method for establishing long-lived immune memory for controlling infectious diseases, and technologies have been developed such that vaccines can now be developed faster than in previous times (Figure 1) (Graham, 2020; Graham et al., 2018; Graham and Sullivan, 2018). Over 100 companies or academic institutions are working on COVID-19 vaccines with strategies that include recombinant vectors, mRNA in lipid nanoparticles, DNA, inactivated virus, live attenuated virus, virus-like particles, and protein subunits (Thanh Le et al., 2020; WHO, 2020b). Three vaccine candidates have already advanced to Phase II testing: an mRNA vaccine encoding the viral spike protein from Moderna, an adenovirus type 5 vector vaccine expressing the S protein from CanSino Biologics, and a chimpanzee adenovirus encoding the spike protein from the Jenner Institute in Oxford, UK. Five other vaccine candidates are also now in Phase I trials, including other mRNA/LNP or DNA vaccines as well as three forms of whole inactivated vaccines (WHO, 2020b).
As seen from this rapid movement of SARS-CoV-2 vaccine candidates into human trials, the time it takes to develop vaccines for emerging pathogens is decreasing from that in the past for traditional childhood vaccines. Recently, a DNA vaccine for the original SARS (SARS-CoV-1) was developed in 20 months, a vaccine for H5 influenza A/Indonesia/2006 in 11 months, a vaccine for H1 influenza A/California/2009 in 4 months, and a Zika virus vaccine in 3.5 months (Graham et al., 2018). These successes have been brought about by innovative technology and approaches that have allowed for rapid identification and sequencing of new viral pathogens and new technologies for vaccine delivery (Graham and Sullivan, 2018). For the past 15 years, the HIV vaccine field has pioneered development and use of recombinant antibody technology, advanced computational methods, novel animal models, and new vaccine delivery approaches to accelerate HIV vaccine immunogen design and development (Burton et al., 2012; Caskey et al., 2019; Haynes et al., 2019; Haynes et al., 2016; Klein et al., 2013; Kwong and Mascola, 2018; Liao et al., 2009). This work has deciphered the roadblocks for this most difficult-to-develop HIV vaccine (Haynes et al., 2019). It is expected that developing a SARS-CoV-2 vaccine will be much faster and much easier than for HIV-1. Investigators have worked to integrate these iterative approaches for vaccine and antibody countermeasure development and applied them to a rapid response to the COVID-19 disease epidemic caused by SARS-CoV-2 (Figure 1).
The COVID-19 pandemic caused by SARS-CoV-2 was first widely recognized in December 2019, and the first virus sequence was published online in January 2020. By March 16, 2020, the first mRNA/LNP vaccine trial developed by the VRC in collaboration with Moderna had begun (NIH, 2020). A rapid SARS-CoV-2 vaccine development approach involves the integration of computational and structural-based immunogen design strategies; production of immunogens as inactivated virus, DNA, mRNA, vectors, or protein subunits; and immunogenicity profiling in animal models prior to vaccine manufacturing and testing in clinical trials (Figure 1). Computational biology techniques have facilitated the rapid analysis of antibody and virus sequences for influenza, HIV, and now SARS-CoV-2 to enable vaccine development (GISAID, 2020; Los Alamos National Laboratory, 2020a; Saunders et al., 2019; Wiehe et al., 2018). Monitoring of HIV evolution by using the Los Alamos HIV Sequence Database (Los Alamos National Laboratory, 2020a) has been critical for HIV vaccine immunogen designs. The SARS-CoV-2 virus is evolving, albeit at a slower rate than HIV, and virus evolution is a concern for successful COVID-19
vaccine development. The SARS-CoV-2 spike protein RBD is the prime target for vaccine-induced neutralizing antibodies, although other spike protein neutralizing epitopes are of interest. However, comparison of the SARS-CoV-2 RBD with that of SARS-CoV-1 reveals only partial homology, although both retain the ability to bind ACE2 as a receptor (Wrapp et al., 2020). The GISAID database is proving to be a helpful resource for monitoring the viral evolution of SARS-CoV-2 (GISAID, 2020), and the Los Alamos National Laboratory is developing a website with tools for analysis of global SARS-CoV-2 spike protein sequences (Korber et al., 2020; Los Alamos National Laboratory, 2020b).
Structural determination of the primary targets of neutralizing antibodies, for example, hemagglutinin in influenza, Env in HIV, and now the spike protein in SARS-CoV-2, has provided valuable atomic-level insight for vaccine design strategies. In particular, cryo-electron microscopy has enabled the rapid solution of the structure of the SARS-CoV-2 spike protein (Walls et al., 2020; Wrapp et al., 2020). Structural biology analysis of pathogens combines structural, computational, biophysical, and biochemical methods to understand interactions of pathogens with the immune system (Henderson et al., 2020; LaBranche et al., 2019; Murin et al., 2019; Saunders et al., 2019). Early this year, structural biologists pivoted to apply technology developed for HIV-1 envelope or respiratory syncytial virus (RSV) structural biology to fast-track structure-based vaccine design for COVID-19 (Lan et al., 2020; Walls et al., 2020; Wrapp et al., 2020; Yuan et al., 2020). Currently, established pipelines for high-resolution cryo-EM structural determination of the SARS-CoV-2 spike are integrated with the computational teams, thus providing atomic-level feedback to COVID-19 vaccine designs.
The past eight years of HIV-1 antibody discovery have provided templates for HIV-1 vaccine design aiming to elicit broadly reactive neutralizing antibodies (Kwong and Mascola, 2018; Sok and Burton, 2018). From the study of the ontogeny of HIV neutralizing antibodies, it has become clear that an effective vaccine will likely require multiple immunogens administered in a specific order to facilitate proper antibody development to multiple neutralizing targets on HIV (Haynes et al., 2019). Hopefully, the development of SARS-CoV-2 neutralizing antibodies will require a much simpler vaccination regimen, like that of the Zika vaccine, where one immunization with one immunogen was sufficient to elicit protective neutralizing antibodies (Pardi et al., 2017a). Such a vaccine would be amenable to rapid development, large-scale manufacturing, and global administration.
What will follow rapidly now for prevention of COVID-19 will be a number of mRNA/LNP (e.g., from Moderna/NIAID, BioNTech/

rapidly than proteins or viral vectors and can be more cost effective.
Although efficacy is the primary goal of SARS-CoV-2 vaccine development, safety is also a major concern (Peeples, 2020). Immunization with a SARS-CoV-1 vaccine has induced vaccine-associated immunopathology in the lung (Bolles et al., 2011; Liu et al., 2019; Tseng et al., 2012). Both CD4 and CD8 T cell responses have been suggested to also be protective against SARS-CoV-1 (Zhao et al., 2016). Thus, careful animal preclinical studies as well as intense monitoring of human clinical trials will be of critical importance to developing safe and effective anti-COVID-19 antibody and vaccine countermeasures.
Finally, collaboration and coordination will be essential to ending the pandemic. Globally, one example of private support for COVID-19 research is the Coalition for Epidemic Preparedness Innovations (CEPI). CEPI is raising funds for COVID-19 vaccine development and is also funding vaccine development projects (Gouglas et al., 2019). The Bill & Melinda Gates Foundation has made a $250 million commitment to fight COVID-19 and has established a Coronavirus Immunotherapy Consortium, or CoVIC, to foster sharing and comparison of SARS-CoV-2 antibodies to speed therapeutic antibody development (Bill & Melinda Gates Foundation, 2020). The recent formation of an NIH-organized public-private partnership, termed Accelerating COVID-19 Therapeutic Interventions and Vaccines (ACTIV) (Corey et al., 2020; Kaiser, 2020), is necessary and will facilitate a coordinated COVID-19 pandemic response. Globally, the World Health Organization is playing a critical multinational coordination and informational role (WHO, 2020a).
Summary
Government-funded and private initiatives have synergized to provide countermeasure platforms to rapidly respond to the SARS-CoV-2 pandemic. Continued cooperation among public and private institutions, coupled with speed of development of antibody countermeasures and vaccines, rapid evaluation of their safety and efficacy, and early planning for scale-up and manufacture, will be critical for expeditious control of the global COVID-19 pandemic.
ACKNOWLEDGMENTS
Funded by NIH grants AI142596 (B.F.H.), AI145687 and AI150415 (P.A.), AI058607 (G.D.S.); Department of Defense HR0011-17-2-0069 (G.D.S.); and the Translating Duke Health Initiative (P.A.). All authors wrote and edited the manuscript. We thank Megan Averill for editorial assistance and David Easterhoff and QiFeng Han for their contributions to the DARPA P3 program.
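The comparison of the SARS-CoV-2 and SARS-CoV-1 RBDs discussed in this article ("only partial homology") rests on a sequence identity calculation. The following is a minimal sketch of that arithmetic, not the authors' pipeline: it assumes the two sequences have already been aligned by some external tool, and the function name `percent_identity` and the toy sequences are hypothetical placeholders rather than real RBD sequences.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs are assumed to be pre-aligned (equal length, gaps marked with `gap`).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence is gapped
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy fragments for illustration only (not real RBD sequences).
TOY_A = "ACDEFGHIKL"
TOY_B = "ACDQFG-IKV"

identity = percent_identity(TOY_A, TOY_B)  # 7 matches over 9 ungapped columns
```

Other conventions exist (for example, counting gapped columns in the denominator), so any reported identity should state which definition was used.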
Cable, J., Srikantiah, P., Crowe, J.E., Jr., Pulendran, B., Hill, A., Ginsberg, A., Koff, W., Mathew, A., Ng, T., Jansen, K., et al. (2020). Vaccine innovations for emerging infectious diseases-a symposium report. Ann. N Y Acad. Sci. 1462, 14–26.
Carroll, D., Daszak, P., Wolfe, N.D., Gao, G.F., Morel, C.M., Morzaria, S., Pablos-Méndez, A., Tomori, O., and Mazet, J.A.K. (2018). The Global Virome Project. Science 359, 872–874.
Carter, P.J., and Lazar, G.A. (2018). Next generation antibody drugs: pursuit of the ‘high-hanging fruit’. Nat. Rev. Drug Discov. 17, 197–223.
Caskey, M., Schoofs, T., Gruell, H., Settler, A., Karagounis, T., Kreider, E.F., Murrell, B., Pfeifer, N., Nogueira, L., Oliveira, T.Y., et al. (2017). Antibody 10-1074 suppresses viremia in HIV-1-infected individuals. Nat. Med. 23, 185–191.
Caskey, M., Klein, F., and Nussenzweig, M.C. (2019). Broadly neutralizing anti-HIV-1 monoclonal antibodies in the clinic. Nat. Med. 25, 547–553.
Corey, B.L., Mascola, J.R., Fauci, A.S., and Collins, F.S. (2020). A strategic approach to COVID-19 vaccine R&D. Science, eabc5312.
DARPA (2017). Pandemic Prevention Platform (P3). https://www.darpa.mil/program/pandemic-prevention-platform.
Dong, E., Du, H., and Gardner, L. (2020). An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect. Dis. 20, 533–534.
Empey, K.M., Peebles, R.S., Jr., and Kolls, J.K. (2010). Pharmacologic advances in the treatment and prevention of respiratory syncytial virus. Clin. Infect. Dis. 50, 1258–1267.
Gaudinski, M.R., Houser, K.V., Doria-Rose, N.A., Chen, G.L., Rothwell, R.S.S., Berkowitz, N., Costner, P., Holman, L.A., Gordon, I.J., Hendel, C.S., et al.; VRC 605 study team (2019). Safety and pharmacokinetics of broadly neutralising human monoclonal antibody VRC07-523LS in healthy adults: a phase 1 dose-escalation clinical trial. Lancet HIV 6, e667–e679.
GISAID (2020). Next hCoV-19 App (Germany: Munich). https://www.gisaid.org/epiflu-applications/next-hcov-19-app/.
Gouglas, D., Christodoulou, M., Plotkin, S.A., and Hatchett, R. (2019). CEPI: Driving Progress Toward Epidemic Preparedness and Response. Epidemiol. Rev. 41, 28–33.
Graham, B.S. (2020). Rapid COVID-19 vaccine development. Science, eabb8923.
Graham, B.S., and Sullivan, N.J. (2018). Emerging viral diseases from a vaccinology perspective: preparing for the next pandemic. Nat. Immunol. 19, 20–28.
Graham, B.S., Mascola, J.R., and Fauci, A.S. (2018). Novel Vaccine Technologies: Essential Components of an Adequate Response to Emerging Viral Diseases. JAMA 319, 1431–1432.
Klein, F., Mouquet, H., Dosenovic, P., Scheid, J.F., Scharf, L., and Nussenzweig, M.C. (2013). Antibodies in HIV-1 vaccine development and therapy. Science 341, 1199–1204.
Korber, B., Fischer, W., Gnanakaran, S., Yoon, H., Theiler, J., Abfalterer, W., Foley, B., Giorgi, E., Bhattacharya, T., Parker, M., et al. (2020). Spike mutation pipeline reveals the emergence of a more transmissible form of SARS-CoV-2. bioRxiv. https://doi.org/10.1101/2020.04.29.069054.
Kose, N., Fox, J.M., Sapparapu, G., Bombardi, R., Tennekoon, R.N., de Silva, A.D., Elbashir, S.M., Theisen, M.A., Humphris-Narayanan, E., Ciaramella, G., et al. (2019). A lipid-encapsulated mRNA encoding a potently neutralizing human monoclonal antibody protects against chikungunya infection. Sci. Immunol. 4, eaaw6647.
Kwong, P.D., and Mascola, J.R. (2012). Human antibodies that neutralize HIV-1: identification, structures, and B cell ontogenies. Immunity 37, 412–425.
Kwong, P.D., and Mascola, J.R. (2018). HIV-1 Vaccines Based on Antibody Identification, B Cell Ontogeny, and Epitope Structure. Immunity 48, 855–871.
LaBranche, C.C., Henderson, R., Hsu, A., Behrens, S., Chen, X., Zhou, T., Wiehe, K., Saunders, K.O., Alam, S.M., Bonsignori, M., et al. (2019). Neutralization-guided design of HIV-1 envelope trimers with high affinity for the unmutated common ancestor of CH235 lineage CD4bs broadly neutralizing antibodies. PLoS Pathog. 15, e1008026.
Lan, J., Ge, J., Yu, J., Shan, S., Zhou, H., Fan, S., Zhang, Q., Shi, X., Wang, Q., Zhang, L., and Wang, X. (2020). Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature 581, 215–220.
Liao, H.X., Levesque, M.C., Nagel, A., Dixon, A., Zhang, R., Walter, E., Parks, R., Whitesides, J., Marshall, D.J., Hwang, K.K., et al. (2009). High-throughput isolation of immunoglobulin genes from single human B cells and expression as monoclonal antibodies. J. Virol. Methods 158, 171–179.
Liu, L., Wei, Q., Lin, Q., Fang, J., Wang, H., Kwok, H., Tang, H., Nishiura, K., Peng, J., Tan, Z., et al. (2019). Anti-spike IgG causes severe acute lung injury by skewing macrophage responses during acute SARS-CoV infection. JCI Insight 4, e123158.
Los Alamos National Laboratory (2020a). Los Alamos HIV Sequence Database (US: Los Alamos, NM). http://www.hiv.lanl.gov/.
Los Alamos National Laboratory (2020b). SARS-CoV-2 Sequence Analysis Pipeline. https://cov.lanl.gov/.
Lu, S., Zhao, Y., Yu, W., Yang, Y., Gao, J., Wang, J., Kuang, D., Yang, M., Yang, J., Ma, C., et al. (2020). Comparison of SARS-CoV-2 infections among 3 species of non-human primates. bioRxiv. https://doi.org/10.1101/2020.04.08.031807.
Mendoza, P., Gruell, H., Nogueira, L., Pai, J.A., Butler, A.L., Millard, K., Lehmann, C., Suárez, I., Oliveira, T.Y., Lorenzi, J.C.C., et al. (2018). Combination
Perspective
Molecular Transducers of Physical Activity
Consortium (MoTrPAC): Mapping the Dynamic
Responses to Exercise
James A. Sanford,1,12 Christopher D. Nogiec,2,12 Malene E. Lindholm,3,12 Joshua N. Adkins,1 David Amar,3
Surendra Dasari,4 Jonelle K. Drugan,5 Facundo M. Fernández,6 Shlomit Radom-Aizik,7 Simon Schenk,8
Michael P. Snyder,3 Russell P. Tracy,9 Patrick Vanderboom,4 Scott Trappe,10,11,12,* Martin J. Walsh,2,11,12,* and the
Molecular Transducers of Physical Activity Consortium
1Pacific Northwest National Laboratory, Richland, WA 99354, USA
2Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
3Stanford University, Stanford, CA 94305, USA
4Mayo Clinic, Rochester, MN 55901, USA
5National Institutes of Health, Bethesda, MD 20892, USA
6Georgia Institute of Technology, Atlanta, GA 30322, USA
7University of California, Irvine, Irvine, CA 92617, USA
8University of California, San Diego, La Jolla, CA 92093, USA
9University of Vermont, Burlington, VT 05405, USA
10Ball State University, Muncie, IN 47306, USA
11Senior author
12These authors contributed equally
Exercise provides a robust physiological stimulus that evokes cross-talk among multiple tissues and, when repeated regularly (i.e., training), improves physiological capacity, benefits numerous organ systems, and decreases the risk for premature mortality. However, a gap remains in identifying the detailed molecular signals induced by exercise that benefit health and prevent disease. The Molecular Transducers of Physical Activity Consortium (MoTrPAC) was established to address this gap and generate a molecular map of exercise. Preclinical and clinical studies will examine the systemic effects of endurance and resistance exercise across a range of ages and fitness levels by molecular probing of multiple tissues before and after acute and chronic exercise. From this multi-omic and bioinformatic analysis, a molecular map of exercise will be established. Altogether, MoTrPAC will provide a public database that is expected to enhance our understanding of the health benefits of exercise and to provide insight into how physical activity mitigates disease.
INTRODUCTION

Exercise perturbs multiple systems from the whole body to the molecular level in an integrated manner (Hawley et al., 2014). However, in-depth fundamental knowledge of the molecular and cellular mechanisms that are responsible for physical activity’s benefits on multiple organ systems, and of the diseases and disorders that derive from inactivity, is incomplete (Booth et al., 2017; Neufer et al., 2015). A better understanding of these biological processes and pathways would allow for the development of targeted exercise interventions and prescriptions and provide a foundation for developing exercise-mimetic pharmacologic interventions.
The Molecular Transducers of Physical Activity Consortium (MoTrPAC) was established to elucidate how exercise improves health and ameliorates diseases by building a map of the molecular responses to acute and chronic exercise. In 2014, a portfolio analysis of National Institutes of Health (NIH) grants revealed that most research regarding physical activity involved disease prevention or treatment. In fact, almost all the grants that employed an exercise intervention addressed only health outcomes and adherence issues. The MoTrPAC initiative provides a much needed comprehensive program to understand the interplay between these biological systems with the goal of improving the design of physical activity interventions. In addition, there is a potential to identify molecular targets that can be manipulated to mimic the effects of exercise in persons unable to exercise for a variety of reasons, such as physical disability, coma, or paralysis.
To address the gaps in knowledge about how exercise enhances health and ameliorates disease, multiple agencies at the NIH, including the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS), the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), the National Institute on Aging (NIA), and other institutes and centers that participated in the trans-NIH Exercise Interest Group, proposed the Common Fund program supporting MoTrPAC. To create a substantive complex map of molecular transducers in
long-duration primary responses as well as secondary molecular events (detailed protocols are available at https://motrpac.org/protocols.cfm). To study the biological events that occur during the early, intermediate, and later stages of endurance training, the PASS study design for the chronic response to exercise, which has been completed, entailed up to 8 weeks of treadmill training (5 days per week at 70% VO2max), with tissues collected 48 h after 1, 2, 4, and 8 weeks of training; incline, duration, and speed of exercise progressively increased on a daily to weekly basis during the initial 6 weeks of training. Sex differences in both the acute exercise response and the training response are being investigated along with other study aims.
The most powerful aspect of the PASS design is the breadth of tissues collected. In addition to being studied in the context of MoTrPAC, these will serve as a data resource for generating hypotheses for future studies. For both the acute exercise and chronic exercise training studies (including the non-exercise controls), as many as 27 biospecimens per rat are being collected for potential analysis. In addition to biospecimen collection, other phenotypic outcomes are being collected, including blood lactate concentration, maximal oxygen consumption (VO2max), and body composition. At Chemical Analysis Sites, initial biospecimens of focus will include those that overlap with the human studies (i.e., plasma, skeletal muscle, white adipose), as well as liver, heart, kidney, lungs, brain, and brown adipose. It is expected that nucleic acid, proteomic, and targeted metabolomic assays will be performed only on tissues where the amount is non-limiting, whereas transcriptomics, non-targeted metabolomics, and non-targeted lipidomics will be performed on all tissues. Together, these assays are expected to provide molecular and physiological insights about the effect of exercise on many different organs. Ultimately, MoTrPAC should begin to explain how molecular transducers function across an entire mammal (Pedersen and Febbraio, 2012).
After preliminary characterization of the changes that occur in the initial set of analyses, a second phase of the PASS will include mechanistic studies of exercise-induced molecules that transduce stress resistance and circulating factors that might be implicated in the health benefits of exercise. Additional studies will focus on the adaptation to chronic resistance exercise and the impact of age and sex on these responses, as well as other studies that have yet to be determined.

Human Clinical Exercise Sites
The human component of MoTrPAC is an in-depth study of the effects of two different forms of exercise (endurance and resistance training) across multiple individuals of different ages
(including children) and sexes, as well as sedentary and highly active individuals. This large cohort will be used to study the response to exercise at the whole body and cellular levels and attempt to identify the molecular underpinnings that might be responsible for the adaptive process and variation among individuals. Several traditional methods from the field of exercise physiology will be combined with novel biospecimen sampling and high-throughput molecular analytical approaches that will likely yield important insights into the effects of exercise on health. The human study has many unique aspects that are highlighted below and will be conducted as a randomized controlled trial (RCT) with an intent-to-treat design.
Participants
The goal is to recruit 270 children and adolescents (10–17 years of age) who are low-active in endurance-type exercise and 1,980 healthy sedentary adults (age 18 years or greater) who will be medically screened and randomly assigned to endurance training (170 youth, 840 adults), resistance training (840 adults), or non-exercise control (50 youth, 300 adults) (Figure 2). An additional group of highly active endurance- (50 youth, 150 adults) and resistance- (150 adults) trained individuals will serve as comparators and will not participate in the MoTrPAC exercise training programs. The recruitment and enrollment approach will be sex balanced and provide participants across a wide range of ages (10–17, 18–39, 40–59, and ≥60 year age groups) and of different races.
Exercise Training Program: Adults
The sedentary adult participants randomized to endurance or resistance exercise training will perform 12 weeks of supervised exercise, 3 days per week, with progression in both volume and intensity. Each endurance training session will be 1 h in duration and be evenly split between cycling and treadmill (walking/running) exercise with intensity set to 60%–80% of heart rate reserve and monitored in real time during each session. Each resistance training session will target the whole body and consist of eight total exercises (five upper body: chest press, military press, seated row, triceps extension, biceps curl; three lower body: leg press, leg curl, knee extension) at a prescribed plan of 3 sets of 8–12 repetitions at an intensity of 60%–80% of maximum for each exercise. These exercise protocols are well known to improve clinically relevant parameters (i.e., VO2max and muscular strength and hypertrophy) via alterations in metabolic, biochemical, and molecular signatures (Coggan et al., 1990; Gollnick et al., 1973; Raue et al., 2012; Rönn et al., 2014; Timmons et al., 2010).
Acute Exercise Bout and Biospecimen Collections: Adults
A unique feature of the MoTrPAC adult protocol will be the integration of strategic biospecimen collections (blood, muscle, and adipose) before, during, and after standardized bouts of acute exercise. Participants will perform a 40- to 45-min bout of exercise (exercise-mode specific; rest for the non-exercise controls) with biospecimens collected before and after 12 weeks of training. The highly active group will perform the exercise-mode specific bout only once. Compared to resting homeostasis, these types of exercise challenges are expected to dramatically increase metabolic rate 5- to 10-fold (Coggan et al., 1990; Farinatti and Castinheiras Neto, 2011; Mulla et al., 2000; Romijn et al., 1993), increase bioenergetic flux >10-fold (Kjaer et al., 1991; Romijn et al., 1993; Steensberg et al., 2000), produce a large dynamic range in gene expression, from small to >100-fold changes (Louis et al., 2007; Radom-Aizik et al., 2013, 2014), and likely enhance cross-talk among many organs (Pedersen and Febbraio, 2012).
Standardized conditions that control for physical activity, time of day, and dietary intake will be implemented prior to the acute exercise bout. On the day of an acute exercise bout with biospecimen collections, volunteers will arrive at a Human Clinical Center in the morning after an overnight fast and rest comfortably for 0.5 h prior to obtaining baseline blood (antecubital vein), skeletal muscle (vastus lateralis), and adipose (periumbilical region) samples. Participants will then perform the standardized acute exercise bout (or rest for the non-exercise control group) with additional biospecimen samples (blood, muscle, adipose) obtained 0.5 h (early), 4 h (middle), and 24 h (late) after exercise. These time points were chosen to capture the dynamic changes in the response to exercise, as metabolic, post-translational, and epigenetic modifications can occur quite rapidly (Barrès et al., 2012; Bolster et al., 2003; Hoffman et al., 2015; Romijn et al., 1993), whereas mRNA induction generally peaks a few hours after exercise (Louis et al., 2007; Yang et al., 2005), and increases in protein synthesis rates are detectable in the hours and days following exercise (Phillips et al., 1997) (Figure 3). Additional blood samples will be collected during the endurance exercise bout (20- and 40-min time points) and shortly after (10 min) both endurance and resistance exercise bouts. All participants will have pre-exercise biospecimen collections, but to reduce participant burden in the post-exercise phase, sedentary participants will undergo skeletal muscle and adipose biopsies at one of three time points (early, middle, late). The highly active participants will have muscle biopsies and blood collected at all time points, whereas adipose biopsies will be collected at the pre-exercise and middle time points.
Acute Exercise Bout and Biospecimen Collections: Pediatrics
Children and adolescents undergo critical periods of growth and development, which are distinct from adult physiology. Pediatric studies must also comply with additional ethical considerations (Radom-Aizik and Cooper, 2016). Consequently, although the pediatric arm of the study will mimic the adult protocol as closely as possible, there are a few notable exceptions: (1) children who are low-active in endurance-type exercise will be recruited (versus sedentary adults) to account for the fact that children are naturally more active than adults and also participate in mandatory physical education classes; (2) no tissue biopsies will be taken (only blood will be collected); (3) an acute bout of endurance exercise with blood collection will be performed in both the training intervention and no-exercise control groups; (4) blood samples will be collected in all participants before exercise, at 20 and 40 min during exercise, and at 10 min, 0.5 h, and 3.5 h into recovery; and (5) for a subgroup of 170 low-active endurance exercise children and adolescents who will be randomized to receive 12-week endurance training (N = 120) or continue their standard practice (N = 50) (Figure 2), the endurance exercise intervention will be modified to provide an exercise intervention that is appropriate to the pediatric participants’ age group. For middle and high school
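The adult endurance prescription above sets intensity at 60%–80% of heart rate reserve. As a hedged illustration of what that range means in beats per minute, the sketch below uses the standard heart-rate-reserve (Karvonen) calculation, target = resting HR + fraction × (maximal HR − resting HR). The text does not specify how MoTrPAC measures resting or maximal heart rate, so the function names and the example participant values are assumptions made only for illustration.

```python
def target_heart_rate(hr_rest: float, hr_max: float, fraction: float) -> float:
    """Heart-rate-reserve target: rest + fraction * (max - rest), in beats/min."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if hr_max <= hr_rest:
        raise ValueError("maximal heart rate must exceed resting heart rate")
    return hr_rest + fraction * (hr_max - hr_rest)


def training_zone(hr_rest: float, hr_max: float,
                  low: float = 0.60, high: float = 0.80) -> tuple[float, float]:
    """Lower and upper heart-rate bounds for a low-high fraction of reserve."""
    return (target_heart_rate(hr_rest, hr_max, low),
            target_heart_rate(hr_rest, hr_max, high))


# Hypothetical participant (assumed values): resting 60 bpm, maximal 180 bpm.
# The reserve is 120 bpm, so 60%-80% of reserve spans roughly 132 to 156 bpm.
zone = training_zone(hr_rest=60, hr_max=180)
```

Because the calculation depends on each participant's own resting and maximal heart rate, two participants working at the same percentage of reserve can have quite different absolute targets.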
platforms will be conducted. Investigators are exploring the potential for collecting and analyzing microbiome samples from a subgroup of adult participants.
Implementation
To optimize this complex protocol, the adult component of the study will be implemented in two phases. The first phase will involve 150 adult participants and will require 4–6 months, enabling assessment of participant and clinical burden and feasibility, as well as allowing for refinement of the MoTrPAC protocol. In phase two, the remainder of the project with a target of over 2,000 participants will be implemented.

Consortium Coordinating Center
The Consortium Coordinating Center (CCC) is composed of four parts: an Administrative Coordinating Center (ACC), a Data Management and Quality Control (DMAQC) Core, an Exercise Intervention Core (EIC), and a central Biorepository. The role of the ACC is to enable the organization and governance of MoTrPAC by facilitating key processes such as meeting logistics, IRB submission, and preparation of Manuals of Operations.
The Biorepository, working with the preclinical and clinical sites, the DMAQC, and the Chemical Analysis Sites, oversees sample collection, shipping, archiving, and distribution of human and animal samples. This includes ensuring that homogeneous cryo-pulverization of tissue samples occurs prior to distribution of aliquots to the various Chemical Analysis Sites. Uniform sample processing is important to ensure that diverse data types can be directly compared. Each tissue sample will also be stored for future use by MoTrPAC and non-MoTrPAC investigators. Samples include serum, EDTA plasma, PAXgene-protected whole blood and peripheral blood mononuclear cells, and vastus lateralis skeletal muscle and subcutaneous abdominal adipose tissue from humans and >20 different tissues from the preclinical animals. Each sample will be analyzed by the Chemical Analysis Sites, and additional material will be archived for future use. The Biorepository inventory system interacts with the DMAQC to enable sample tracking, quality control, and other process support systems.

specific targets to benefits (Carter et al., 2017). Although several studies (for review see Loos et al., 2015; Warburton et al., 2006) have provided a rich source of information to develop a foundation for larger and more comprehensive genomic, epigenomic, and transcriptomic (GET) analyses, sufficiently powered studies with the complementary detailed study design to make accurate predictions as machine-learned models remain underdeveloped. Moreover, the information needed to understand the role genetic variation plays in the response of individuals to acute and chronic exercise remains limited. MoTrPAC, although predicted to be statistically underpowered for a genome-wide association study (GWAS), should be able to be statistically organized and prioritized (Cantor et al., 2010) so that there is potential benefit from the orthogonal measurements assessed through other ‘omes, leading to improved mechanistic insight. Such information can begin as a knowledge base for enabling better treatment considerations for a variety of diseases (whether acute or chronic) through recognizing potential genetic and epigenetic differences in responses to exercise and training. This could be accomplished through identifying novel gene/genetic network involvement, the corresponding changes in RNA transcripts, how such genes are regulated at the epigenetic level across adult and adolescent individuals and between athletic and sedentary individuals, and associated sex differences in response to acute and chronic exercise.
The goal of the GET assays is to map and measure changes in the (1) RNA transcriptome and transcript isoforms, including small and micro RNA, using RNA sequencing; (2) DNA methylation and chromatin accessibility from rat and human tissues using reduced representation bisulfite sequencing (RRBS) for rat or methyl CpG hybrid capture for human specimens and ATAC-seq (assay for transposase-accessible chromatin with sequencing), respectively; and (3) genomic sequence and structure of all human participants. The assays are expected to provide insights into changes in biological processes as well as gene regulatory networks that occur in response to acute and chronic exercise. The GET assay component of MoTrPAC will involve comprehensive analyses of extensively curated rat and human MoTrPAC sam-
ples with an exercise intervention, contribute these data to public
Chemical Analysis Sites databases, help identify candidate molecular transducers, eluci-
To understand the exercise response in detail, an in-depth date new mechanisms that might explain the human response to
analysis of molecular and ‘omic assays will be performed using exercise, and cooperate with the Bioinformatics Center (BIC) to
state-of-the-art laboratory techniques. Technologies include ge- develop predictive models of the individual response to physical
nomics, transcriptomics, DNA methylomics, targeted and untar- activity.
geted proteomics, and targeted and untargeted metabolomics. Proteomics
Genomic, Transcriptomic, and Regulatory Analyses Proteins are important drivers of cellular structure, function, and
Evidence has shown, through more than 150 small cohort signal mediation (Cox and Mann, 2011); thus, uncovering the
studies (typically with under 50 participants analyzed) (Bouchard pathways through which physical activity influences health re-
et al., 2011; Pacheco et al., 2018), that exercise is accompanied quires analysis of the proteome and the critical signaling-associ-
with massive changes at both the transcriptional and epige- ated post-translational modifications of the proteome in various
nomic levels in muscle, adipose, and most other tissue systems tissues. To date, a number of proteomic studies have shown
(Lindholm et al., 2014; Ling and Rönn, 2014; Rönn and Ling, important changes influenced by exercise (Burniston, 2008; Hoff-
2013) with the poorly understood influence of the underlying hu- man et al., 2015; Magherini et al., 2012; Sollanek et al., 2017).
man genetic/environmental variation that exists between and The majority of this work has focused on skeletal muscle, which
within populations (Leon ska-Duniec et al., 2016). Therefore, is the tissue that actively performs the motions involved in exer-
recent scientific studies have been conducted generating data cise, and blood and plasma, which circulate signals systemically
reflecting some of the underlying genetic and epigenetic basis through the body and may be responsible for facilitating cross-
for responses to exercise, physical activity, and training linking talk between organ systems. Furthermore, this research is often
performed in the context of diabetes because of the role of muscle and the interplay of exercise with insulin-resistance (Kleinert et al., 2018). Although these studies have largely been constrained to experimental models of exercise in animals or very small cohorts of human subjects, the results are tantalizing and have identified several proteins and signaling molecules that potentially play a key role in the response to exercise. The large-scale and well-controlled preclinical and clinical protocols adopted for MoTrPAC will allow for expansion of this knowledge by providing a deeper interrogation of the proteomic response to acute and chronic exercise in numerous tissues from individuals across a range of fitness levels.
Importantly, proteomic analyses should be inclusive of not only protein expression but also the state of protein post-translational modifications, such as phosphorylation or acetylation, because these chemical moieties can act as rapid integrators by dictating protein localization and enzymatic activity (Brandes et al., 2009; Choudhary et al., 2014; Emmerich et al., 2011; Hunter, 1995). Primarily, untargeted mass spectrometry methods and targeted aptamer-based detection techniques will be employed to probe changes in protein abundance and modifications induced by exercise. Given that distinct tissues present technological challenges to discovery-based proteomic analysis (e.g., dynamic range in skeletal muscle), state-of-the-art instrumentation and protocols, including tandem mass tag labeling and fractionation (Mertins et al., 2018), will be employed. Indeed, pilot discovery-based proteomics efforts with muscle and other tissues have yielded robust datasets with levels of protein coverage exceeding previous studies, presenting a wealth of opportunities to elucidate proteomic response to exercise and integrate these findings with data obtained from GET and metabolomic studies of the same tissues.
Metabolomics
Complementing genomics, transcriptomics, epigenomics, and proteomic studies, MoTrPAC will also carry out a highly comprehensive mapping of exercise-associated alterations in the metabolome of both rats and humans. The metabolome is the total collection of biologically active small molecules in a given organism (Nicholson and Wilson, 2003). This includes endogenous molecules that are biosynthesized by metabolic networks in primary metabolism, molecules derived from diet or environmental exposures (the exposome; Wild, 2005), and molecules derived from the biosynthetic interactions with the microbiome. Metabolomics can either be ‘‘targeted’’ to a set of known compounds (e.g., certain acylcarnitines) or ‘‘non-targeted,’’ which attempts to detect and relatively quantify as many metabolites as possible (Dettmer et al., 2007). In the context of acute and chronic exercise, metabolomics can provide sensitive and dynamic phenotypic patterns that closely reflect cellular and molecular changes and will likely improve our understanding of the effects of exercise beyond the individual pathway level (Heaney et al., 2017). A number of studies have documented profound metabolomic alterations associated with exercise (Fukai et al., 2016; Heaney et al., 2017; Lewis et al., 2010; Xiao et al., 2016), but these typically involve smaller cohorts (n < 100), are limited to only one (or a handful of) metabolomics assays, or focus primarily on alterations in energy production pathways. The vast chemical diversity of the metabolome, which includes lipids, sugars, amino acids, and myriad other molecule types, and its wide dynamic range (sub-nM to mM) implies that no single chemical assay can adequately profile all metabolites in one experiment (Smilde et al., 2005). To this end, MoTrPAC will employ a combination of non-targeted and targeted approaches for mapping the broader effects of exercise on both the metabolome and lipidome. These will range from triple-quadrupole-based liquid chromatography-mass spectrometry (LC-MS) using stable isotope-labeled internal standards for absolute quantification to high resolution MS and tandem MS using reversed phase and hydrophilic interaction LC for mapping relative changes in both known and unknown molecular transducers. Targeted and non-targeted LC-MS assays that focus on the non-polar fraction of the metabolome (the lipidome) will also be leveraged to map exercise effects on lipid metabolism and oxidation (Nieman et al., 2013, 2014). It is expected that these studies will provide insights into energy metabolism and signaling molecules involved in the response to exercise.
Exosomes
Exercise is a potent stimulus that has broad ranging systemic effects that are indubitable (Egan and Zierath, 2013). One prevailing hypothesis is that circulating extracellular vesicles termed exosomes play an important role in carrying training-induced protein, mRNA, and microRNA (miRNA) cargo between organs as a means of integrating responses to exercise (Safdar and Tarnopolsky, 2018) (Figure 4). Many techniques have been developed for isolating exosomes from plasma to analyze their molecular cargo (Barrachina et al., 2019). Importantly, these techniques have been used to demonstrate that acute exercise increases the abundance of a wide variety of exosome-associated proteins related to metabolic and immune regulation (Whitham et al., 2018). Exosome isolation and analysis of MoTrPAC specimens will further investigate these effects by describing how exosome content is modulated in response to endurance and resistance exercise. The identification of protein and RNA signatures associated with exercise will shed light on exosome-mediated inter-organ cross-talk and provide a framework for studies to characterize the systemic response to physical activity.

Bioinformatics Center
The immediate goals of MoTrPAC will be vested in the ‘omic platforms used and the data being generated, the quality of this data, both meta and experimental, and how it will be utilized to map the molecular transducers involving the responses to acute and chronic exercise. Data from each assay will be collected at the BIC and analyzed using consistent bioinformatic and analytic pipelines, whenever possible. This will improve reproducibility, interpretability, and ease in data harmonization across sites. Assay-specific quantitative data will undergo quality control assessment and be normalized to reduce undesirable sample-to-sample variation, minimize batch effects, and deal with analyte heteroskedasticity typically observed in molecular abundance datasets. Relative levels of analytes will be determined and the changes in molecules and pathways in response to exercise deduced.
Investigators across the consortium will conduct a series of integrative analyses with the end goal of creating a map of
and proposal template are available at https://motrpac.org/ancillarystudyguidelines.cfm.

Study Challenges
There are many challenges associated with a large multicenter project generating a wide variety of data types. Standardization procedures, operation manuals, and quality control steps are in place to reduce the variation in exercise performance and evaluation and sample collections of the animal and human samples. Participant and clinic burden are a concern given the large scope and complexity of MoTrPAC; the consortium has made strategic choices regarding the protocol design and biospecimens sampling to help mitigate these challenges. Similarly, standardization and quality control steps are in place for the data generated using multiple ‘omics platforms even for similar data types (e.g., metabolomics and proteomics). An initial implementation phase (described above) was also put in place for the human studies to further evaluate the adult protocol during the early stages of recruitment, data collection, and analysis to identify any unforeseen issues.
Integrating heterogeneous data types across MoTrPAC will require sophisticated tracking, data normalization, and analytic approaches. MoTrPAC will generate large amounts of data, and many measurements may not meet statistical significance at the individual molecule level but may do so at the pathway level. State-of-the-art analytic tools are in place to help manage the depth and breadth of data analysis, and these tools will continue to evolve as MoTrPAC progresses. Finally, the data are complex; ensuring that they are accessible and understandable to the broad scientific community in a timely fashion is essential for this project to be successful. Protection of participant clinical data (PHI, genomics data) will be given the highest priority to protect each individual’s identity. Incorporation of useful visualization tools as well as active engagement with the broader scientific community will be equally important to fully capitalize on the MoTrPAC project.

Summary
When complete, MoTrPAC will deliver a map of the biological molecules and pathways underlying the systemic effects of acute and chronic exercise. The data, which will ultimately be made freely available to the scientific community, will provide unprecedented opportunities to begin to understand the pathways by which physical activity influences health. In the future, it is expected that the knowledge gained will allow researchers and health professionals to develop personalized exercise recommendations and provide insights into molecular targets that could be manipulated to mimic some of the effects of exercise in persons unable to do so.

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.06.004.

ACKNOWLEDGMENTS

The authors would like to gratefully acknowledge Jill K. Gregory, CMI, FAMI (Certified Medical Illustrator) of the Icahn School of Medicine at Mount Sinai for working with the writing group to generate the figures enclosed within this article. Additionally, the authors would like to gratefully acknowledge the expert administrative functions by Heather Kiesel of the University of Florida for help organizing efforts by the MoTrPAC Writing Group to better enable the completion of this article. The MoTrPAC Study is supported by NIH grants U24OD026629 (Bioinformatics Center), U24DK112349, U24DK112342, U24DK112340, U24DK112341, U24DK112326, U24DK112331, U24DK112348 (Chemical Analysis Sites), U01AR071133, U01AR071130, U01AR071124-01, U01AR071128, U01AR071150, U01AR071160, U01AR071158 (Clinical Centers), U24AR071113 (Consortium Coordinating Center), U01AG055133, U01AG055137, and U01AG055135 (PASS/Animal Sites). The views expressed are those of the authors and do not necessarily reflect those of the NIH or the Department of Health and Human Services of the United States.

REFERENCES

Amar, D., and Shamir, R. (2014). Constructing module maps for integrated analysis of heterogeneous biological networks. Nucleic Acids Res. 42, 4208–4219.
Amar, D., Yekutieli, D., Maron-Katz, A., Hendler, T., and Shamir, R. (2015). A hierarchical Bayesian model for flexible module discovery in three-way time-series data. Bioinformatics 31, i17–i26.
Barrachina, M.N., Calderón-Cruz, B., Fernandez-Rocca, L., and García, Á. (2019). Application of Extracellular Vesicles Proteomics to Cardiovascular Disease: Guidelines, Data Analysis, and Future Perspectives. Proteomics 19, e1800247.
Barrès, R., Yan, J., Egan, B., Treebak, J.T., Rasmussen, M., Fritz, T., Caidahl, K., Krook, A., O’Gorman, D.J., and Zierath, J.R. (2012). Acute exercise remodels promoter methylation in human skeletal muscle. Cell Metab. 15, 405–411.
Bolster, D.R., Kubica, N., Crozier, S.J., Williamson, D.L., Farrell, P.A., Kimball, S.R., and Jefferson, L.S. (2003). Immediate response of mammalian target of rapamycin (mTOR)-mediated signalling following acute resistance exercise in rat skeletal muscle. J. Physiol. 553, 213–220.
Booth, F.W., Roberts, C.K., Thyfault, J.P., Ruegsegger, G.N., and Toedebusch, R.G. (2017). Role of Inactivity in Chronic Diseases: Evolutionary Insight and Pathophysiological Mechanisms. Physiol. Rev. 97, 1351–1402.
Bouchard, C., Rankinen, T., and Timmons, J.A. (2011). Genomics and genetics in the biology of adaptation to exercise. Compr. Physiol. 1, 1603–1648.
Brandes, N., Schmitt, S., and Jakob, U. (2009). Thiol-based redox switches in eukaryotic proteins. Antioxid. Redox Signal. 11, 997–1014.
Burniston, J.G. (2008). Changes in the rat skeletal muscle proteome induced by moderate-intensity endurance exercise. Biochim. Biophys. Acta 1784, 1077–1086.
Cantor, R.M., Lange, K., and Sinsheimer, J.S. (2010). Prioritizing GWAS results: A review of statistical methods and recommendations for their application. Am. J. Hum. Genet. 86, 6–22.
Carter, A.C., Chang, H.Y., Church, G., Dombkowski, A., Ecker, J.R., Gil, E., Giresi, P.G., Greely, H., Greenleaf, W.J., Hacohen, N., et al. (2017). Challenges and recommendations for epigenomics in precision health. Nat. Biotechnol. 35, 1128–1132.
Choudhary, C., Weinert, B.T., Nishida, Y., Verdin, E., and Mann, M. (2014). The growing landscape of lysine acetylation links metabolism and cell signalling. Nat. Rev. Mol. Cell Biol. 15, 536–550.
Coggan, A.R., Kohrt, W.M., Spina, R.J., Bier, D.M., and Holloszy, J.O. (1990). Endurance training decreases plasma glucose turnover and oxidation during moderate-intensity exercise in men. J. Appl. Physiol. 68, 990–996.
Cowen, L., Ideker, T., Raphael, B.J., and Sharan, R. (2017). Network propagation: a universal amplifier of genetic associations. Nat. Rev. Genet. 18, 551–562.
Cox, J., and Mann, M. (2011). Quantitative, high-resolution proteomics for data-driven systems biology. Annu. Rev. Biochem. 80, 273–299.
Emmerich, C.H., Schmukle, A.C., and Walczak, H. (2011). The emerging role of linear ubiquitination in cell signaling. Sci. Signal. 4, re5.
Farinatti, P.T.V., and Castinheiras Neto, A.G. (2011). The effect of between-set rest intervals on the oxygen uptake during and after resistance exercise sessions performed with large- and small-muscle mass. J. Strength Cond. Res. 25, 3181–3190.
Fukai, K., Harada, S., Iida, M., Kurihara, A., Takeuchi, A., Kuwabara, K., Sugiyama, D., Okamura, T., Akiyama, M., Nishiwaki, Y., et al. (2016). Metabolic Profiling of Total Physical Activity and Sedentary Behavior in Community-Dwelling Men. PLoS ONE 11, e0164877.
Gallant, A., Leiserson, M.D.M., Kachalov, M., Cowen, L.J., and Hescott, B.J. (2013). Genecentric: a package to uncover graph-theoretic structure in high-throughput epistasis data. BMC Bioinformatics 14, 23.
Gollnick, P.D., Armstrong, R.B., Saltin, B., Saubert, C.W., 4th, Sembrowich, W.L., and Shepherd, R.E. (1973). Effect of training on enzyme activity and fiber composition of human skeletal muscle. J. Appl. Physiol. 34, 107–111.
Hawley, J.A., Hargreaves, M., Joyner, M.J., and Zierath, J.R. (2014). Integrative biology of exercise. Cell 159, 738–749.
Heaney, L.M., Deighton, K., and Suzuki, T. (2017). Non-targeted metabolomics in sport and exercise science. J. Sports Sci. 37, 959–967.
Hoffman, N.J., Parker, B.L., Chaudhuri, R., Fisher-Wellman, K.H., Kleinert, M., Humphrey, S.J., Yang, P., Holliday, M., Trefely, S., Fazakerley, D.J., et al. (2015). Global Phosphoproteomic Analysis of Human Skeletal Muscle Reveals a Network of Exercise-Regulated Kinases and AMPK Substrates. Cell Metab. 22, 922–935.
Hofree, M., Shen, J.P., Carter, H., Gross, A., and Ideker, T. (2013). Network-based stratification of tumor mutations. Nat. Methods 10, 1108–1115.
Hunter, T. (1995). Protein kinases and phosphatases: the yin and yang of protein phosphorylation and signaling. Cell 80, 225–236.
Jo, K., Jung, I., Moon, J.H., and Kim, S. (2016). Influence maximization in time bounded network identifies transcription factors regulating perturbed pathways. Bioinformatics 32, i128–i136.
Kjaer, M., Kiens, B., Hargreaves, M., and Richter, E.A. (1991). Influence of active muscle mass on glucose homeostasis during exercise in humans. J. Appl. Physiol. 71, 552–557.
Kleinert, M., Parker, B.L., Jensen, T.E., Raun, S.H., Pham, P., Han, X., James, D.E., Richter, E.A., and Sylow, L. (2018). Quantitative proteomic characterization of cellular pathways associated with altered insulin sensitivity in skeletal muscle following high-fat diet feeding and exercise training. Sci. Rep. 8, 10723.
Leońska-Duniec, A., Ahmetov, I.I., and Zmijewski, P. (2016). Genetic variants influencing effectiveness of exercise training programmes in obesity - an overview of human studies. Biol. Sport 33, 207–214.
Lewis, G.D., Farrell, L., Wood, M.J., Martinovic, M., Arany, Z., Rowe, G.C., Souza, A., Cheng, S., McCabe, E.L., Yang, E., et al. (2010). Metabolic signatures of exercise in human plasma. Sci. Transl. Med. 2, 33ra37.
Lindholm, M.E., Marabita, F., Gomez-Cabrero, D., Rundqvist, H., Ekström, T.J., Tegnér, J., and Sundberg, C.J. (2014). An integrative analysis reveals coordinated reprogramming of the epigenome and the transcriptome in human skeletal muscle after training. Epigenetics 9, 1557–1569.
Ling, C., and Rönn, T. (2014). Epigenetic adaptation to regular exercise in humans. Drug Discov. Today 19, 1015–1018.
Loos, R.J.F., Hagberg, J.M., Pérusse, L., Roth, S.M., Sarzynski, M.A., Wolfarth, B., Rankinen, T., and Bouchard, C. (2015). Advances in exercise, fitness, and performance genomics in 2014. Med. Sci. Sports Exerc. 47, 1105–1112.
Louis, E., Raue, U., Yang, Y., Jemiolo, B., and Trappe, S. (2007). Time course of proteolytic, cytokine, and myostatin gene expression after acute exercise in human skeletal muscle. J. Appl. Physiol. 103, 1744–1751.
Mertins, P., Tang, L.C., Krug, K., Clark, D.J., Gritsenko, M.A., Chen, L., Clauser, K.R., Clauss, T.R., Shah, P., Gillette, M.A., et al. (2018). Reproducible workflow for multiplexed deep-scale proteome and phosphoproteome analysis of tumor tissues by liquid chromatography-mass spectrometry. Nat. Protoc. 13, 1632–1661.
Mulla, N.A., Simonsen, L., and Bülow, J. (2000). Post-exercise adipose tissue and skeletal muscle lipid metabolism in humans: the effects of exercise intensity. J. Physiol. 524, 919–928.
Neufer, P.D., Bamman, M.M., Muoio, D.M., Bouchard, C., Cooper, D.M., Goodpaster, B.H., Booth, F.W., Kohrt, W.M., Gerszten, R.E., Mattson, M.P., et al. (2015). Understanding the Cellular and Molecular Mechanisms of Physical Activity-Induced Health Benefits. Cell Metab. 22, 4–11.
Nicholson, J.K., and Wilson, I.D. (2003). Opinion: understanding ‘global’ systems biology: metabonomics and the continuum of metabolism. Nat. Rev. Drug Discov. 2, 668–676.
Nieman, D.C., Gillitt, N.D., Knab, A.M., Shanely, R.A., Pappan, K.L., Jin, F., and Lila, M.A. (2013). Influence of a polyphenol-enriched protein powder on exercise-induced inflammation and oxidative stress in athletes: a randomized trial using a metabolomics approach. PLoS ONE 8, e72215.
Nieman, D.C., Shanely, R.A., Luo, B., Meaney, M.P., Dew, D.A., and Pappan, K.L. (2014). Metabolomics approach to assessing plasma 13- and 9-hydroxy-octadecadienoic acid and linoleic acid metabolite responses to 75-km cycling. Am. J. Physiol. Regul. Integr. Comp. Physiol. 307, R68–R74.
Pacheco, C., Felipe, S.M.D.S., Soares, M.M.D.C., Alves, J.O., Soares, P.M., Leal-Cardoso, J.H., Loureiro, A.C.C., Ferraz, A.S.M., de Carvalho, D.P., and Ceccatto, V.M. (2018). A compendium of physical exercise-related human genes: an ’omic scale analysis. Biol. Sport 35, 3–11.
Pedersen, B.K., and Febbraio, M.A. (2012). Muscles, exercise and obesity: skeletal muscle as a secretory organ. Nat. Rev. Endocrinol. 8, 457–465.
Phillips, S.M., Tipton, K.D., Aarsland, A., Wolf, S.E., and Wolfe, R.R. (1997). Mixed muscle protein synthesis and breakdown after resistance exercise in humans. Am. J. Physiol. 273, E99–E107.
Radom-Aizik, S., and Cooper, D.M. (2016). Bridging the Gaps: the Promise of Omics Studies in Pediatric Exercise Research. Pediatr. Exerc. Sci. 28, 194–201.
Radom-Aizik, S., Zaldivar, F., Haddad, F., and Cooper, D.M. (2013). Impact of brief exercise on peripheral blood NK cell gene and microRNA expression in young adults. J. Appl. Physiol. 114, 628–636.
Radom-Aizik, S., Zaldivar, F.P., Jr., Haddad, F., and Cooper, D.M. (2014). Impact of brief exercise on circulating monocyte gene and microRNA expression: implications for atherosclerotic vascular disease. Brain Behav. Immun. 39, 121–129.
Raue, U., Trappe, T.A., Estrem, S.T., Qian, H.-R., Helvering, L.M., Smith, R.C., and Trappe, S. (2012). Transcriptome signature of resistance exercise adaptations: mixed muscle and fiber type specific profiles in young and old adults. J. Appl. Physiol. 112, 1625–1636.
Romijn, J.A., Coyle, E.F., Sidossis, L.S., Gastaldelli, A., Horowitz, J.F., Endert, E., and Wolfe, R.R. (1993). Regulation of endogenous fat and carbohydrate metabolism in relation to exercise intensity and duration. Am. J. Physiol. 265, E380–E391.
Rönn, T., and Ling, C. (2013). Effect of exercise on DNA methylation and metabolism in human adipose tissue and skeletal muscle. Epigenomics 5, 603–605.
Rönn, T., Volkov, P., Tornberg, A., Elgzyri, T., Hansson, O., Eriksson, K.-F., Groop, L., and Ling, C. (2014). Extensive changes in the transcriptional profile of human adipose tissue including genes involved in oxidative phosphorylation after a 6-month exercise intervention. Acta Physiol. (Oxf.) 211, 188–200.
Safdar, A., and Tarnopolsky, M.A. (2018). Exosomes as Mediators of the Systemic Adaptations to Endurance Exercise. Cold Spring Harb. Perspect. Med. 8, a029827.
Correspondence
benno@pasteur.fr (B.S.),
ido.amit@weizmann.ac.il (I.A.),
zhangzheng1975@aliyun.com (Z.Z.)
In Brief
A computational framework that allows
for the identification and characterization
of virus-infected cells as well as
bystander cell responses reveals how
SARS-CoV-2 alters the immune
responses of patients.
Highlights
• Viral-Track: a computational framework to analyze host-viral infection maps
Article
Host-Viral Infection Maps Reveal Signatures
of Severe COVID-19 Patients
Pierre Bost,1,2,3,6 Amir Giladi,1,6 Yang Liu,4,6 Yanis Bendjelal,2 Gang Xu,4 Eyal David,1 Ronnie Blecher-Gonen,1
Merav Cohen,1 Chiara Medaglia,1 Hanjie Li,1 Aleksandra Deczkowska,1 Shuye Zhang,5 Benno Schwikowski,2,*
Zheng Zhang,4,* and Ido Amit1,7,*
1Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
2Systems Biology Group, Department of Computational Biology and USR 3756, Institut Pasteur and CNRS, Paris 75015, France
3Sorbonne Université, Complexité du vivant, Paris 75005, France
4Institute for Hepatology, National Clinical Research Center for Infectious Disease, Shenzhen Third People’s Hospital, School of Medicine,
Southern University of Science and Technology, Shenzhen 518112, Guangdong Province, China
5Shanghai Public Health Clinical Center and Institute of Biomedical Sciences, Fudan University, Shanghai 201508, China
6These authors contributed equally
7Lead Contact
SUMMARY
Viruses are a constant threat to global health as highlighted by the current COVID-19 pandemic. Currently,
lack of data underlying how the human host interacts with viruses, including the SARS-CoV-2 virus, limits
effective therapeutic intervention. We introduce Viral-Track, a computational method that globally scans un-
mapped single-cell RNA sequencing (scRNA-seq) data for the presence of viral RNA, enabling transcriptional
cell sorting of infected versus bystander cells. We demonstrate the sensitivity and specificity of Viral-Track to
systematically detect viruses from multiple models of infection, including hepatitis B virus, in an unsupervised
manner. Applying Viral-Track to bronchoalveolar lavage samples from severe and mild COVID-19 patients
reveals a dramatic impact of the virus on the immune system of severe patients compared to mild
cases. Viral-Track detects an unexpected co-infection of the human metapneumovirus, present mainly in
monocytes perturbed in type-I interferon (IFN)-signaling. Viral-Track provides a robust technology for dis-
secting the mechanisms of viral-infection and pathology.
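
The "transcriptional cell sorting of infected versus bystander cells" described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the input layout (pairs of cell barcode and UMI for reads already mapped to a viral genome), and the `min_viral_umis` cutoff are all assumptions made for illustration only.

```python
from collections import defaultdict


def sort_cells_by_viral_load(viral_reads, all_cells, min_viral_umis=1):
    """Split cells into infected and bystander sets (illustrative only).

    viral_reads: iterable of (cell_barcode, umi) pairs for reads that
        mapped to a viral genome (hypothetical input layout).
    all_cells: iterable of every cell barcode in the sample.
    min_viral_umis: hypothetical cutoff on distinct viral UMIs per cell.
    """
    all_cells = set(all_cells)

    # Collapse PCR duplicates: reads sharing a barcode and UMI count once.
    umis_per_cell = defaultdict(set)
    for barcode, umi in viral_reads:
        umis_per_cell[barcode].add(umi)

    infected = {
        cell for cell in all_cells
        if len(umis_per_cell.get(cell, ())) >= min_viral_umis
    }
    bystander = all_cells - infected
    return infected, bystander


# Toy example: one cell carries two distinct viral UMIs, one carries a single UMI.
reads = [("AAAC", "u1"), ("AAAC", "u1"), ("AAAC", "u2"), ("TTTG", "u9")]
cells = ["AAAC", "TTTG", "GGGA"]
infected, bystander = sort_cells_by_viral_load(reads, cells, min_viral_umis=2)
```

Counting distinct UMIs rather than raw reads keeps amplification duplicates from inflating a cell's apparent viral load; the cutoff itself would need to be chosen from the data.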
Cell 181, 1475–1488, June 25, 2020 © 2020 Elsevier Inc.

INTRODUCTION

The development of efficient vaccines against viral pathogens is considered one of the biggest achievements of modern medicine and has significantly contributed to the increase in life expectancy worldwide. However, no vaccines exist for many life-threatening viruses such as HIV (Burton, 2019), Zika virus (Pierson and Diamond, 2018), or hepatitis C virus (HCV) (Bailey et al., 2019). Additionally, efficient broad-spectrum antiviral drugs are still missing, making infectious diseases a significant challenge for modern health systems. Viruses can also trigger or fuel non-infectious diseases such as cancer (Young and Rickinson, 2004) and are suspected to contribute to various other chronic diseases such as Alzheimer disease (Itzhaki, 2018) and various auto-immune disorders (Münz et al., 2009). The recent emergence of highly pathogenic viruses such as the Ebola virus and the emerging SARS-CoV-2 pandemic recalls the constant threat that viruses represent to global health. So far, the SARS-CoV-2 pandemic has caused a global financial and social catastrophe and is expected to make a significant long-lasting impact on human health (Zhu et al., 2020). Despite intensive research efforts, little is known thus far regarding the interaction of the SARS-CoV-2 virus with the human host and, as a consequence, no efficient treatment has been designed so far (Chen et al., 2020). Moreover, only few therapeutic targets have been identified, highlighting the urgency to develop additional strategies to dissect the virus-host interactions.
Single-cell RNA sequencing (scRNA-seq) is an emerging technology that has been extensively used to study several complex diseases, including cancer (Li et al., 2019), neurodegeneration (Keren-Shaul et al., 2017), and auto-immune (Zhang et al., 2019) and metabolic diseases (Jaitin et al., 2019), providing new insights and revealing new therapeutic targets and strategies (Yofe et al., 2020). In the context of infectious diseases, scRNA-seq studies identified the underlying cells and pathways interacting with various pathogens (Drayman et al., 2019; Shnayder et al., 2018; Steuerman et al., 2018; Zanini et al., 2018). During the immune response to a pathogen, a limited number of antigen-positive or infected cells initiate and modulate the host immune response (Blecher-Gonen et al., 2019), while most of the tissue response is propagated through cytokines, such as type I interferon (IFN) signaling, to bystander, uninfected cells. It is therefore essential to develop new analytical tools to identify the rare infected cells
in order to better understand complex host-virus interactions underlying these pathologies. Multiple experimental tools have been developed over the years to track virus-infected cells in vivo, characterize the cellular state of the infected cells, and differentiate them from their bystander neighbors. These include fluorescently labeled pathogens or pathogens expressing fluorescent proteins (De Baets et al., 2015; Blecher-Gonen et al., 2019), as well as reporter mice (Lienenklaus et al., 2009). However, in the case of human clinical samples, these tools are limited, making the pathogen-infected cells and viral reservoir cell types hard to detect.
Viruses exploit their host cells to first express viral genes, optimize the cellular environment, and then fully activate the viral replication program. Because scRNA-seq technologies rely on polyadenylated RNA isolation and amplification, current scRNA-seq methods can, in theory, detect these viral RNA programs and therefore enable accurate identification of the bona fide infected cells and their unique properties at single-cell resolution. While such an approach has already been used to study both in vitro (Drayman et al., 2019; Shnayder et al., 2018) and in vivo infection models (Steuerman et al., 2018), no general computational framework has been developed to detect viruses and analyze host-viral maps in clinical samples. Here, we present a new computational tool, called Viral-Track, that is designed to systematically scan for viral RNA in scRNA-seq data of physiological viral infections using a direct mapping strategy. Viral-Track performs comprehensive mapping of scRNA-seq data onto a large database of known viral genomes, providing precise annotation of the cell types associated with viral infections. Integrating these data with the host transcriptome enables transcriptional sorting and differential profiling of the viral-infected cells compared to bystander cells. Using a new statistical approach for differential gene expression between infected and bystander cells, we are able to recover virus-induced programs and reveal key host factors required for viral replication. Viral-Track is able to annotate the viral program with high accuracy and sensitivity, as we demonstrate in several in vivo mouse models of infection, as well as human samples of hepatitis B virus (HBV) infection. Applying Viral-Track on bronchoalveolar lavage (BAL) samples from moderate and severe COVID-19 patients, we reveal the infection landscape of SARS-CoV-2 and its interaction with the host tissue. Our analysis shows a dramatic impact of the SARS-CoV-2 virus on the immune system of severe patients, compared to mild cases, including replacement of the tissue-resident alveolar macrophages with recruited inflammatory monocytes, neutrophils, and macrophages and an altered CD8+ T cell cytotoxic response. We find that SARS-CoV-2 mainly infects the epithelial and macrophage subsets. In addition, Viral-Track detects an unexpected co-infection with the human metapneumovirus in one of the severe patients. This study establishes Viral-Track as a broadly applicable tool for dissecting mechanisms of viral infections, including identification of the cellular and molecular signatures involved in virus-induced pathologies.

RESULTS

Viral-Track: An Unsupervised Pipeline for Characterization of Viral Infections in scRNA-Seq Data
All scRNA-seq computational packages implement a pipeline that initially aligns the sequenced reads to the expressed part of a reference host genome of the relevant profiled organism. Irrelevant reads, representing other organisms, primers, adaptors, template switching oligonucleotides, and other contaminants, are then commonly discarded. We reasoned that during infection, and likely many other pathological processes, these reads can potentially carry valuable information about viral RNA that is discarded in this filtering step. In order to efficiently detect viral reads from raw scRNA-seq data in an unsupervised manner, we developed Viral-Track, an R-based computational pipeline (Figure 1A; STAR Methods). Briefly, Viral-Track relies on the STAR aligner (Dobin et al., 2013) to map the reads of scRNA-seq data to both the host reference genome and an extensive list of high-quality viral genomes (Stano et al., 2016). Because viral reads are highly repetitive and generate substantial sequencing artifacts, the viral genomes identified in Viral-Track with a sufficient number of mapped reads are then filtered, based on read mapping quality, nucleotide composition, sequence complexity, and genome coverage, to limit the occurrence of false positives (STAR Methods). Due to the lack of high-quality viral genome annotations, Viral-Track includes de novo transcriptome assembly of the identified viruses using StringTie (Pertea et al., 2015). Finally, viral reads are demultiplexed, quantified using unique molecular identifiers (UMI), and assigned to unique viral transcripts and cells (Figures 1A and S1A). The Viral-Track algorithm has been designed to robustly handle various types of scRNA-seq datasets, as illustrated below, and is publicly accessible at https://github.com/PierreBSC/Viral-Track.
In order to evaluate the specificity and sensitivity of Viral-Track, we benchmarked Viral-Track on several scRNA-seq datasets (Table S1). These datasets include a large number of experiments we conducted, as well as published studies, that span several tissues (lung, spleen, liver, and lymph node) and a wide range of viruses: influenza A, lymphocytic choriomeningitis virus (LCMV), vesicular stomatitis virus (VSV), herpes simplex virus 1 (HSV-1), human immunodeficiency virus (HIV), and HBV. We first evaluated mouse lungs infected in vivo by influenza A virus and sequenced using MARS-seq2.0 (Keren-Shaul et al., 2019; Steuerman et al., 2018). Viral-Track analysis specifically detected the 8 distinct influenza A viral segments (NC_002016 to NC_002023 Refseq nucleotide sequences) from the specific infecting strain (H1N1 Puerto Rico 8 strain) (Figure 1B). We performed transcriptome assembly to test the feasibility of reconstructing the viral transcriptome from 3′-enriched scRNA-seq data. The results were highly consistent with the current knowledge of the influenza A transcriptome, exemplified by Viral-Track’s ability to identify documented spliced transcript structures with single-nucleotide precision. For instance, we identified the exact location of the key splicing site on segment 7 that gives rise to the M2 transcript and links nucleotides 51 and 740 (Dubois et al., 2014) (Figure 1C). Quantification of the number of viral reads across different experimental conditions was consistent with current knowledge of the disease, with lung stromal cells of non-immune lineages (CD45−) exhibiting a significantly higher viral load compared to immune cells (CD45+) (p = 0.039, two-tailed Welch’s t test) (Figure 1D).
As inbred mice lack the influenza-specific restriction factor Mx1, influenza A infection is extremely virulent in inbred mice (Haller et al., 1980). Moreover, all influenza A mRNAs are capped
and polyadenylated, making them an optimal substrate for scRNA-seq isolation and amplification protocols. We therefore evaluated the sensitivity and specificity of Viral-Track in a more challenging dataset. In this model, photoactivatable-GFP (PA-GFP) mice were infected with LCMV (Armstrong acute strain), a virus lacking strong poly(A) mRNA signals (Burrell et al., 2017), via injection to the footpad. At 72 h post-infection, CD45+ splenic immune cells from different spatial niches (T zone, B zone, marginal zone, and total spleen) were profiled using the NICHE-seq technology (Medaglia et al., 2017). Even though the LCMV viral mRNAs are not polyadenylated, we detected mRNA molecules that were converted to cDNA through priming of the MARS-seq oligo(dT) RT primer, and Viral-Track successfully identified the two viral segments (LCMV segment L [NC_004291] and S [NC_004294]) (Figure S1B), although the number of detected reads was an order of magnitude lower than the number observed in influenza A infection (Figure 1E). We detected viral reads in samples from the marginal zone, B zone, and the total spleen, but not in T zone samples, and marginal zone samples exhibited significantly higher viral load compared to B zone and total spleen samples (Figure 1E; p = 0.0067 and 0.0083, respectively, two-tailed Welch’s t test). This observation is in line with the biology of LCMV, which primarily infects macrophages and lymphocytes from the marginal zone of the spleen (Müller et al., 2002).
We next evaluated whether Viral-Track is sensitive to barcode swapping during Illumina-based scRNA-seq (Griffiths et al., 2018), which, in the case of viral RNA detection, can lead to the false assignment of viral reads to uninfected cells. To this end, we infected mice with one of two different viruses, LCMV and VSV, and performed MARS-seq2.0 on CD45+CD19−CD3− non-B/T cells from the auricular draining lymph node 1 day after infection (STAR Methods). All samples were sequenced concurrently to test for cross-sample viral read contamination. For both viruses, Viral-Track was able to identify the correct viral segments (Figures S1C and S1D), with no cross-contamination, as evidenced by the absence of VSV reads detected in the LCMV-infected cells and vice versa (Figure S1E). We further generalized Viral-Track for commonly used scRNA-seq technologies and non-RNA viruses. We applied Viral-Track to scRNA-seq data from a recent publication of human primary cells infected ex vivo with HSV-1, a linear double-stranded DNA virus, generated by the Drop-seq platform (Drayman et al., 2019; Macosko et al., 2015). We found that Viral-Track correctly detected and identified HSV-1 RNA specifically in the infected samples but not in the controls (NC_001806 Refseq nucleotide sequences) (Figures S1F and S1G). Finally, we analyzed scRNA-seq data of CD4+ T cells infected ex vivo with HIV-1 (Bradley et al., 2018), generated using the droplet-based Chromium platform (Zheng et al., 2017). Viral-Track successfully identified HIV as the unique virus present in the infected samples (Figures S1H and S1I), but detected significant amounts of HIV-1 viral reads in one control sample, probably due to ambient contamination (Yang et al., 2020).

Defining the Host Viral Interactions of HBV Using Viral-Track
We further tested Viral-Track’s applicability for detecting viral reads in human clinical samples. For this purpose, we generated scRNA-seq data from a liver biopsy of an untreated hepatitis B patient and analyzed the data using Viral-Track. Viral-Track successfully identified HBV as the only virus present in the sample (Figure 1F) with 18,420 reads assigned to the HBV genome (NC_003977 Refseq sequence). Coverage analysis revealed a strong peak located at the 5′ end of the C gene, encoding for the main core protein, suggesting that the HBV virus is actively producing virions (Figure 1G). We then overlaid the viral data on the host transcriptome to identify infected and bystander populations. A total of 13,803 cells passed a lenient quality control, permitting apoptotic signals that may arise from viral infection. We identified several non-immune cell types (Figure S1J), including hepatocytes (expressing ALB and APOA2), as well as hepatocytes showing apoptotic signatures (ALB with high expression of mitochondrial genes), sinusoidal endothelial cells (FCN2), and epithelial cells (KRT7). We also observed several subsets of immune cells such as B cells (MS4A1), plasma cells (MZB1), conventional dendritic cells 1 (cDC1; XCR1),
Figure 1. Viral-Track Retrieves Viral Reads in a Variety of Tissues, Viral Strains, and Sequencing Platforms
(A) Schematics of the Viral-Track approach. Single-cell sequencing data of cells from an infected tissue, containing infected and bystander cells are analyzed by
Viral-Track. Viral-Track maps the sequenced reads to both the host reference genome and a database of viral genomes, overlaying infection status on top of the
host transcriptional landscape.
(B) Results of Viral-Track analysis on scRNA-seq data from influenza A PR8-infected mouse lungs. For each viral segment, represented by a dot, the complexity of
the sequences (measured by entropy, i.e., how repetitive the mapped sequences are) and the percentage of the segment that is mapped are plotted. Dark red
dots correspond to viral segments of the influenza A PR8 strain and yellow dots to segments belonging to other H1N1 influenza strains. Viral segments with more
than 50 mapped reads are plotted.
(C) Coverage plot of the influenza A segment NC_002016 (influenza A PR8 segment 7). The M2 transcript location estimated using StringTie is shown below with the
splicing site position.
(D) Quantification of the number of reads assigned to influenza viral segments across experimental settings. Each dot corresponds to a technical replicate (384-well plate). Two-tailed Welch’s t test was used to compare viral load between CD45− and CD45+ cells (p = 0.039).
(E) Quantification of the number of reads assigned to LCMV viral segments in the different zones of the spleen. Each dot corresponds to a technical replicate (384-well plate). Two-tailed Welch’s t test was used to compare viral load between cells from the infected marginal zone and cells from the B zone or the whole spleen (p = 0.0067 and 0.0083, respectively).
(F) Result of Viral-Track analysis on scRNA-seq data from an HBV patient. For each viral segment, represented by a dot, the entropy of the sequence and the
percentage of the segment that is mapped are plotted. Green dots correspond to viral segments that passed quality control. Viral segments with more than 50
mapped reads are plotted.
(G) Coverage plot of the HBV genome. Locations of the different viral genes from NCBI database are depicted at the bottom.
(H) Enrichment of infected cells across hepatic cell subsets (left panel); red line corresponds to an enrichment of one. Distribution of the number of HBV UMIs per
cell in each cell subset (right panel).
See also Figure S1.
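The two quantities plotted in (B) and (F), sequence entropy and the percentage of the segment that is mapped, can be illustrated with a minimal Python sketch. This is not the Viral-Track implementation (the pipeline itself is R-based, and its exact definitions are given in STAR Methods). The sketch assumes entropy is the Shannon entropy of the pooled nucleotide composition of the mapped reads. The function names and the `min_entropy` and `min_fraction` cutoffs are hypothetical; only the "more than 50 mapped reads" rule is taken from the legend.

```python
import math
from collections import Counter


def nucleotide_entropy(reads):
    """Shannon entropy (bits) of the pooled A/C/G/T composition of the reads.

    Ranges from 0.0 (one repeated base) to 2.0 (four bases equally frequent).
    Assumed here as a simple stand-in for a sequence-complexity measure.
    """
    counts = Counter(b for read in reads for b in read.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def fraction_mapped(intervals, segment_length):
    """Fraction of segment positions covered by at least one aligned read.

    `intervals` holds 0-based, half-open (start, end) alignment coordinates.
    """
    covered = set()
    for start, end in intervals:
        covered.update(range(max(start, 0), min(end, segment_length)))
    return len(covered) / segment_length


def passes_filter(reads, intervals, segment_length,
                  min_reads=50, min_entropy=1.2, min_fraction=0.05):
    """Keep a viral segment only if reads, complexity, and coverage suffice.

    min_reads mirrors the ">50 mapped reads" rule in the legend; the other
    two cutoffs are hypothetical placeholders, not values from the paper.
    """
    return (len(reads) > min_reads
            and nucleotide_entropy(reads) >= min_entropy
            and fraction_mapped(intervals, segment_length) >= min_fraction)
```

Under these assumptions, a segment supported only by homopolymer reads has zero entropy and is rejected regardless of read count, which is the kind of repetitive-sequence artifact the filtering step is meant to exclude.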
plasmacytoid dendritic cells (pDCs) (TCF4), and three different macrophage subsets (expressing TREM2, CD163, and FCN1, respectively). We observed a large diversity within the lymphocyte compartment with CD8+ T cells (CD8A), Th17 cells (CCR6, IL23A), γδ T cells (TRGC1), activated CD4+ T cells (LEF1, OX40), natural killer (NK) cells (NKG7), and a distinct cluster of activated CD8+ T cells (CSF2 and TOX2). We analyzed infected cells using automated thresholding over the viral signal (Figure S1J; STAR Methods). As expected, hepatocytes and apoptotic hepatocytes were strongly enriched among the infected cells (Figures 1H and S1K). Interestingly, we also detected viral reads in non-hepatocyte clusters, including two subsets of macrophages (CD163+ and TREM2+ populations, respectively), the cDC1 subset (XCR1+), as well as endothelial (OIT3+ cells) and epithelial cells (KRT7+) (Figures 1H and S1K). Infection of non-hepatocyte clusters, although with relatively low viral load, is consistent with several studies reporting active infection of macrophages (Faure-Dupuy et al., 2019).
Together, this extensive list of validations demonstrates that Viral-Track is a sensitive and accurate method to detect and identify, in an unsupervised manner, virus strains in diverse scRNA-seq samples, in different tissues, and at varying viral types and loads. Importantly, Viral-Track can be applied to human clinical samples to extract valuable insight into the biology of the host-virus interactions.

Viral-Track Identifies Infected versus Bystander Cells and Uncovers Virus-Induced Pathways
To further evaluate the accuracy of Viral-Track against a well-established model for tracking infection in single cells, we infected mice with a GFP-expressing LCMV virus (LCMV-GFP virus) (Medaglia et al., 2017). We performed MARS-seq on GFP+ splenocytes
and total spleen cells 72 h post-infection and analyzed the sequenced cells (Figures S2A and S2B; STAR Methods). GFP+ cells were enriched for vUMI+ cells compared to total spleen (Figure S2A). We then assessed whether the cells positive for the LCMV-GFP signal (GFP+ cells) were similar to the ones designated by Viral-Track as containing viral UMIs (vUMI+). Following clustering and annotation, we observed similar proportions of GFP+ and vUMI+ cells across cell clusters (Figures 2A and S2C; R = 0.95, p = 9.0 × 10⁻¹²), with monocytes, marginal zone B cells (MZBs), and macrophages being the major infected cell types. We then evaluated the transcriptional signatures within these two sets of cells by computing the Pearson correlation between each pair of cells. We observed a similar distribution of Pearson correlation within the GFP+ and vUMI+ monocyte cells (Figure 2B) that was significantly higher (median correlation of 0.65, 0.64, and 0.51, respectively) than the correlation observed between GFP− vUMI− bystander monocytes. We conclude that Viral-Track correctly identifies a homogeneous set of infected cells from in vivo scRNA-seq samples similar to the one identified by conventional reporter viruses, even in the more difficult scenario in which viral transcripts are poorly polyadenylated.
We next evaluated the ability of Viral-Track to detect host factors associated with virus replication. For this purpose, we developed a statistical method that detects differentially expressed genes based on data binarization and complementary log-log regression (STAR Methods; Methods S1). We used this approach to test for transcriptional differences between bystander and infected cells during spleen LCMV infection across the three main infected cell types: macrophages, MZB cells, and monocytes. We observed that MZB cells were the most influenced by the viral infection, compared to monocytes and macrophages (107, 42, and 3 genes upregulated, respectively, Z score >3) (Figure 2C). We performed Gene Ontology enrichment analysis on the upregulated genes in MZB cells and observed a significant enrichment in several pathways, including “chromosome organization,” “DNA replication,” and “cell cycle,” suggesting that LCMV triggers cell division in MZB cells (Figure 2D). Indeed, LCMV-infected MZB cells exhibited higher levels of cell cycle-related genes such as Smc2 (required for chromatin condensation), Cdc6 (regulator of DNA replication), and Stmn1 (regulator of the mitotic spindle) (Figures 2E and S2D), but also fibrillarin (Fbl), a host factor whose expression is required by several viruses (Deffrasnes et al., 2016) (Figure 2E). This is in line with a previous report highlighting the ability of LCMV to trigger an abortive form of cell division blocked in the G1 phase (Beier et al., 2015). Altogether, our results show that Viral-Track is sufficient to detect infected cells in in vivo scRNA-seq data and infer the differential gene expression in infected versus bystander cells.

A Single-Cell Map of SARS-CoV-2 Infection in Mild and Severe Patients
COVID-19 is a viral disease caused by SARS-CoV-2 infection, which has recently been recognized as the cause of a pandemic (Wang et al., 2020a). Little is currently known about the course of the disease and how the virus interacts with the host immune system in its mild and severe manifestations. To gain insights on the infection course in humans, we performed scRNA-seq and Viral-Track analysis on BALF samples from three mild and six severe COVID-19 patients (Liao et al., 2020). In total, 50,615 cells passed quality control and were analyzed using the MetaCell algorithm (Baran et al., 2019) (Figure 3A; STAR Methods). Metacell analysis coarsely grouped the metacells into the myeloid, lymphoid, and epithelial lineages, and each lineage was further subdivided into smaller subsets (Figures 3A, 3B, and S3A). Among epithelial cells, we identified epithelial progenitors (expressing SOX4), type II alveolar cells (AT2, expressing SFTPB), ciliated cells (FOXJ1), ionocytes (CFTR), goblet cells (MUC5B), and club cells (SCGB1A1; Figure S3B). Lymphoid cells consisted of several subtypes of CD4+ T cells, including naive CD4+ T cells (expressing CCR7), regulatory T cells (Treg, expressing FOXP3), and T follicular helper cells (Tfh, expressing CXCL13 and PDCD1), but also diverse CD8+ subsets, such as NK cells (NCAM1), resident memory CD8+ T cells (Trm, CD8A, and ZNF683), effector CD8+ T cells (GZMA and GZMK), and cytotoxic CD8+ T cells (GNLY, PRF1), as well as B cells (CD79A; Figure S3C). The myeloid compartment exhibited a high diversity of cell states, including neutrophils (FCGR3B), mast cells (CPA3), alveolar macrophages (FABP4), dendritic cells (DCs; FSCN1), and plasmacytoid DCs (pDC; TCF4) as well as a large diversity of monocyte (FCN1) and monocyte-derived macrophage (SPP1) sub-populations (Figure S3D). These results were robust across different analysis platforms (Liao et al., 2020).
Comparison of the cellular landscape of mild and severe patients revealed key differences in the composition of BAL
samples (Figures 3B and 3C). We found changes to each of the three compartments (Figures 3D–3F and S3E–S3G). While alveolar macrophages and pDCs were enriched in the myeloid compartment in the mild patients, the severe patients’ myeloid cells were characterized by a patient-specific diversity associated with accumulation of neutrophils, FCN1+ monocytes, and monocyte-derived SPP1+ macrophages (Figures 3D and S3E). Additionally, NK cells and naive CCR7+ CD4+ T cells were consistently enriched across severe patients’ BAL, while ZNF683hi CD8+ Trm cells were specific to mild patients (Figures 3E and S3F). We also observed changes in the epithelial compartment, as severe patients exhibited higher numbers of club cells and AT2 cells (Figures 3F and S3G). By investigating expression patterns of shared gene expression programs, we observed that cytotoxic CD8+ cells and the CD4+ Tfh cells are the most proliferative compartments (Figure 3G), while a broad interferon type I response, a hallmark of viral response, is mainly expressed by neutrophils and, to a lesser extent, FCN1+ monocytes (Figure 3H). We next performed in-depth differential gene expression analysis between subsets characteristic of mild or severe patients. We found that CD4+ T cells in the severe patients exhibit a more naive phenotype, expressing higher levels of IL7R, CCR7, S1PR1, and LTB. The CD8+ Trm cell signatures are restricted to the mild patients and have higher levels of the effector molecules XCL1, ITGAE, CXCR6, and ZNF683 (Figure 3I). Comparing gene expression differences in myeloid types between severe and mild patients revealed disease severity-associated upregulation of inflammatory chemokine genes in SPP1+ monocyte-derived macrophage populations (CCL2, CCL3, CCL4, CCL7, and CCL8; Figure 3J), as well as genes associated with hypoxia or oxidative stress (HMOX1 and HIF1A), and downregulation of MHC class II (HLA-A and HLA-DQA1) and type I IFN genes (IFIT1 and OAS1). Alveolar macrophages displayed a severity-associated signature, including upregulation of the chemokines CCL18 and CCL4L2 and the cathepsins CTSL and CTSB (Figure 3J). Together, we identified dramatic differences between the mild and severe COVID-19 patients, including an inflammatory signature and a perturbed immune response associated with the severe manifestation of the COVID-19 disease. These findings also highlight potential immunotherapy treatment of the severe patients by targeting the hyperinflammatory response that is activated by inflammatory cytokines such as interleukin (IL)-6 and IL-8 (Liu et al., 2019) (Figure S3H).

Viral-Track Identifies Co-infection of SARS-CoV-2 with the Human Metapneumovirus
To characterize the in vivo crosstalk of SARS-CoV-2 with its human host, we applied Viral-Track on the data generated from the nine SARS-CoV-2 patients and the rich cellular landscape we identified. SARS-CoV-2 transcripts were detected in all six severe samples in variable amounts, ranging from less than 400 transcripts to more than 15,000 (Figures 4A and S4A). In contrast, no viral reads were detected in the three mild patients (Figure 4A). Coverage analysis revealed that the majority of the viral reads mapped to the 3′ end of the viral segment and corresponded to positive-stranded RNA (Figure 4B). This is in agreement with coronavirus transcription: due to a nested transcription process, all genomic and subgenomic RNA molecules share the same 3′ end (Masters, 2006). We then analyzed the enrichment of vUMIs in the cell populations represented in the BAL samples. We observed a strong enrichment of viral reads in the ciliated and epithelial progenitor populations, two known cellular targets of the virus, which express the main receptor of the SARS-CoV-2 virus, ACE2, as well as TMPRSS2, a protease essential for SARS-CoV-2 entry (Figures 4C and S4B; Table S2) (Hoffmann et al., 2020). We also observed enrichment of SARS-CoV-2 reads in the SPP1+ macrophage population, suggesting either that SARS-CoV-2 can infect immune cells from the myeloid compartment or that SPP1+ macrophages phagocytose infected cells or viral particles. Differential gene expression analysis between vUMI+ infected and vUMI− bystander SPP1+ macrophages in the patients with the highest viral load revealed that infected macrophages have a higher expression of chemokines (CCL7, CCL8, and CCL18) and APOE, and a lower expression of TAOK1, a serine/threonine-protein kinase in the p38 MAPK cascade (Figure S4C). Interestingly, CD147 (also known as BSG), a potential new SARS-CoV-2 receptor (Wang et al., 2020b), is expressed by all cell types, including immune cells, suggesting alternative routes for the virus to infect these cells.
Often in cases of infectious diseases, the specific infecting virus is not known, or may be accompanied by co-infection with additional unknown viruses. Viral-Track applies an unsupervised mapping strategy and is optimally designed to systematically profile the source of infection or co-infections in human clinical samples. To our surprise, Viral-Track analysis of data from one of the severe patients (S1) revealed the presence of a second virus, the human metapneumovirus (hMPV) (NC_039199 Refseq sequence, Figure 4D) with more than one million reads mapped
to hMPV in this specific patient. hMPV is a non-segmented, single-stranded, and negative-sense RNA virus that is responsible for upper and lower respiratory tract infections in mostly young (<5 years) children but can also target the elderly as well as immuno-compromised patients (Panda et al., 2014). hMPV has been implicated as a possible source of co-infection with the original SARS-CoV virus (Chan et al., 2003).
Coverage analysis revealed that most reads fall into the N, P, M, F, M2, SH, and G, but not L, genes of hMPV (Figure 4E). We observed a typical pattern of biased scRNA-seq coverage, indicating that the N, P, M, F, M2, SH, and G genes are actively transcribed, and suggesting that the hMPV was active and replicating at the time of sample collection. Analysis of the viral UMI distribution across cells revealed a substantial viral load in a large subset of the cells, spanning hundreds to thousands of vUMIs per infected cell (Figure 4F), independently of the total host UMIs in that cell (Figure S4D). We mapped the infected cells and characterized their distribution across cell types. The infected patient is characterized by high levels of monocytes and CD4+ T cells (Figure S4E). Unlike the SARS-CoV-2 virus infection map, hMPV-infected cells were highly enriched in the monocyte compartment but not in the epithelial and SPP1+ macrophage compartments (Figure 4G). We tested whether the hMPV could alter the function of the infected monocytes, and therefore influence the course of the disease. Using Viral-Track, we detected a large number of up- and downregulated genes in infected monocytes compared to bystander monocytes (Figure 4H). Interestingly, several key receptor genes required for monocyte activation such as CD16 (FCGR3B), G-CSF receptor (CSF3R), and the formyl peptide receptor (FPR1) were downregulated in the infected compared to the bystander cells. Moreover, we observed a dramatic downregulation of type I interferon signaling and interferon-stimulated genes (ISGs), including viral restriction factors (e.g., IFIT3). A gene set enrichment analysis (Figure S4F) revealed a strong enrichment of interferon response genes in the downregulated gene set, suggesting that the hMPV is strongly downregulating the IFN response pathway. Several anti-inflammatory genes were upregulated, including LILRB4 (a potent inhibitor of monocyte activation) (Lu et al., 2009) and MITF, a transcription factor known to be a critical suppressor of innate immunity (Harris et al., 2018). Last, we observed a positive and significant association between total number of hMPV UMIs and production of type I IFN, highlighting that while hMPV dampens the response to type I IFN, production of this signal is highly restricted to a rare (~1%) population of cells with a high viral load (Figure S4G). Altogether, our analysis described the distribution of SARS-CoV-2-infected cells in patients’ BAL and revealed the presence of a viral co-infection by the hMPV that dampens the immune activation of the monocyte compartment in the infected patient. Further large-scale analyses of mild versus severe patients need to be conducted to better understand if the co-infection is correlated or even causative in SARS-CoV-2 pathology.

DISCUSSION

The virosphere contains hundreds of thousands of species that constantly interact with their host cells. Over the years, several genomic techniques have been developed to detect virus-derived sequences in human samples. For instance, deep sequencing assays are unbiased and sensitive in their ability to detect extremely rare viral sequences (Moustafa et al., 2017), but do not provide information about the infected cells and the cellular changes induced by the infection. Alternatively, it is possible to combine DNA probes with scRNA-seq to enrich for viral sequences and increase the sensitivity of the assay, but this requires prior knowledge of the viruses present in each sample (Zanini et al., 2018). Here, we present Viral-Track, a robust and unsupervised computational pipeline that can detect viral RNA in any scRNA-seq dataset without the need for experimental modifications or prior knowledge of the infecting agent. Viral-Track was benchmarked on data originating from various tissues, infected by viruses with marked differences in their RNA properties, and generated with different scRNA-seq platforms. We demonstrate that Viral-Track can readily provide essential information on infection status in clinical samples, identify infected cells, probe viral-induced transcriptional alterations, and reveal cases of co-infection.
In practice, only 70%–85% of scRNA-seq reads map to the host genome and represent polyadenylated exonic host transcripts, whereas the remainder of the data is usually overlooked in analysis. We show that these unmapped scRNA-seq reads, in pathological human samples, potentially contain valuable information on viral infection and can be effectively used for viral genome assembly. Viral-Track can resolve complex cellular ecosystems perturbed by viral infection and provide an unbiased map of the infected cells, as well as the transcriptional perturbations induced by the virus at the single-cell level. We combine Viral-Track with a novel statistical approach to detect differentially expressed genes from scRNA-seq data, therefore allowing the detection of gene expression changes triggered by viral infection and differentiating them from the more abundant bystander effects, such as type I IFN signaling, at the single-cell level. Further advances will focus on applying Viral-Track on large-scale datasets containing scRNA-seq data from dozens of samples, leading to robust single-cell viral metagenomic studies that characterize the viral evolution and interactions of virus-induced disease mechanisms with host genetics.
Here, we applied scRNA-seq and Viral-Track analysis to COVID-19 patient-derived samples to provide a cellular and viral atlas of the BAL lung cells from COVID-19 patients. This analysis revealed the diversity of the immune responses across COVID-19 patients and between mild and severe patients. We expect that as the pandemic keeps spreading and global research efforts grow, additional scRNA-seq samples from COVID-19 patients will be generated, including patients treated with emerging immunotherapies (Liu et al., 2019). Such an approach might help to solve key questions including the contribution of the humoral response (Iwasaki and Yang, 2020), the role of the IL-6 pathway (Herold et al., 2020), and the immune memory induced by the virus (Prompetchara et al., 2020). Viral-Track can contribute to the global effort to identify the different cellular compartments that are targeted and affected by COVID-19 and other viruses and to detect possible co-infection by unexpected viruses. Co-infections are gaining recognition in the scientific and medical community as critical factors in disease prognosis (Zhang et al., 2020). So far, research has focused mainly on co-infections of bacterial sources or of well-known viruses such as influenza A (Wu et al., 2020). Understanding the diversity of viral co-infections and their mechanisms of immune suppression at the cellular
and molecular level could therefore provide highly valuable information and lead toward possible therapeutic targets, especially for severe patients, whose treatment options are limited.

Limitations
Viral-Track is a new and powerful tool to decipher host-viral interactions. However, its impact is dependent on several factors, the most critical one being the biochemical and pathophysiological properties of the virus. The absence of a poly(A) tail at the end of viral RNA molecules can significantly decrease their capture rate efficiency in current scRNA-seq techniques, as shown by the LCMV example. This may hinder Viral-Track’s ability to robustly identify infected cells or discern differential expression between infected and bystander cells in such viruses. Other properties of the viral RNA molecules, absence/presence of 5′ capping, nucleotide composition, or dependence on RNA binding proteins, may also affect capture efficiency, and as the technology develops, further research will focus on the classification of molecular features that facilitate or prevent virus identification

• QUANTIFICATION AND STATISTICAL ANALYSIS
  ◦ Read mapping/alignment
  ◦ Viral database and STAR Index building
  ◦ Processing and filtering of the BAM files
  ◦ Transcript reconstruction
  ◦ MARS-seq data demultiplexing and UMI count
  ◦ Drop-seq and 10X data download, pre-processing and demultiplexing
  ◦ Analysis of the MARS-seq spleen LCMV dataset
  ◦ Analysis of the 10X HBV liver dataset
  ◦ Analysis of the COVID-19 BAL dataset
  ◦ Testing for infection specificity in COVID-19 BAL dataset
  ◦ Dichotomized differential gene expression analysis
  ◦ Automate thresholding to detect HBV and hMPV infected cells
  ◦ Gene set enrichment analysis

SUPPLEMENTAL INFORMATION
Liu, L., Wei, Q., Lin, Q., Fang, J., Wang, H., Kwok, H., Tang, H., Nishiura, K., Peng, J., Tan, Z., et al. (2019). Anti-spike IgG causes severe acute lung injury by skewing macrophage responses during acute SARS-CoV infection. JCI Insight 4, 123158.
Lu, H.K., Rentero, C., Raftery, M.J., Borges, L., Bryant, K., and Tedla, N. (2009). Leukocyte Ig-like receptor B4 (LILRB4) is a potent inhibitor of FcgammaRI-mediated monocyte activation via dephosphorylation of multiple kinases. J. Biol. Chem. 284, 34839–34848.
Macosko, E.Z., Basu, A., Satija, R., Nemesh, J., Shekhar, K., Goldman, M., Tirosh, I., Bialas, A.R., Kamitaki, N., Martersteck, E.M., et al. (2015). Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell 161, 1202–1214.
Masters, P.S. (2006). The molecular biology of coronaviruses. Adv. Virus Res. 66, 193–292.
McInnes, L., Healy, J., and Melville, J. (2018). Umap: Uniform manifold approximation and projection for dimension reduction. ArXiv, ArXiv1802.03426.
Medaglia, C., Giladi, A., Stoler-Barak, L., De Giovanni, M., Salame, T.M., Biram, A., David, E., Li, H., Iannacone, M., Shulman, Z., et al. (2017). Spatial reconstruction of immune niches by combining photoactivatable reporters and scRNA-seq. Science 358, 1622–1626.
Moustafa, A., Xie, C., Kirkness, E., Biggs, W., Wong, E., Turpaz, Y., Bloom, K., Delwart, E., Nelson, K.E., Venter, J.C., and Telenti, A. (2017). The blood DNA virome in 8,000 humans. PLoS Pathog. 13, e1006292.
Müller, S., Hunziker, L., Enzler, S., Bühler-Jungo, M., Di Santo, J.P., Zinkernagel, R.M., and Mueller, C. (2002). Role of an intact splenic microarchitecture in early lymphocytic choriomeningitis virus production. J. Virol. 76, 2375–2383.
Münz, C., Lünemann, J.D., Getts, M.T., and Miller, S.D. (2009). Antiviral immune responses: triggers of or triggered by autoimmunity? Nat. Rev. Immunol. 9, 246–258.
Otsu, N. (1979). A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man Cybern. 9, 62–66.
Panda, S., Mohakud, N.K., Pena, L., and Kumar, S. (2014). Human metapneumovirus: review of an important respiratory pathogen. Int. J. Infect. Dis. 25, 45–52.
Pertea, M., Pertea, G.M., Antonescu, C.M., Chang, T.-C., Mendell, J.T., and Salzberg, S.L. (2015). StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295.
Pierson, T.C., and Diamond, M.S. (2018). The emergence of Zika virus and its new clinical syndromes. Nature 560, 573–581.
Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S., and Mesirov, J.P. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. USA 102, 15545–15550.
Svensson, V. (2020). Droplet scRNA-seq is not zero-inflated. Nat. Biotechnol. 38, 147–150.
Svensson, V., da Veiga Beltrame, E., and Pachter, L. (2019). Quantifying the tradeoff between sequencing depth and cell number in single-cell RNA-seq. bioRxiv. https://doi.org/10.1101/762773.
Townes, F.W., Hicks, S.C., Aryee, M.J., and Irizarry, R.A. (2019). Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model. Genome Biol. 20, 295.
Wang, C., Horby, P.W., Hayden, F.G., and Gao, G.F. (2020a). A novel coronavirus outbreak of global health concern. Lancet 395, 470–473.
Wang, K., Chen, W., Zhou, Y.-S., Lian, J.-Q., Zhang, Z., Du, P., Gong, L., Zhang, Y., Cui, H.-Y., Geng, J.-J., et al. (2020b). SARS-CoV-2 invades host cells via a novel route: CD147-spike protein. BioRxiv. https://doi.org/10.1101/2020.03.14.988345.
Wu, X., Cai, Y., Huang, X., Yu, X., Zhao, L., Wang, F., Li, Q., Gu, S., Xu, T., Li, Y., et al. (2020). Co-infection with SARS-CoV-2 and Influenza A Virus in Patient with Pneumonia, China. Emerg. Infect. Dis. 26 https://doi.org/10.3201/eid2606.200299.
Yang, S., Corbett, S.E., Koga, Y., Wang, Z., Johnson, W.E., Yajima, M., and Campbell, J.D. (2020). Decontamination of ambient RNA in single-cell RNA-seq with DecontX. Genome Biol. 21, 57.
Yofe, I., Dahan, R., and Amit, I. (2020). Single-cell genomic approaches for developing the next generation of immunotherapies. Nat. Med. 26, 171–177.
Young, L.S., and Rickinson, A.B. (2004). Epstein-Barr virus: 40 years on. Nat. Rev. Cancer 4, 757–768.
Zanini, F., Robinson, M.L., Croote, D., Sahoo, M.K., Sanz, A.M., Ortiz-Lasso, E., Albornoz, L.L., Rosso, F., Montoya, J.G., Goo, L., et al. (2018). Virus-inclusive single-cell RNA sequencing reveals the molecular signature of progression to severe dengue. Proc. Natl. Acad. Sci. USA 115, E12363–E12369.
Zhang, F., Wei, K., Slowikowski, K., Fonseka, C.Y., Rao, D.A., Kelly, S., Goodman, S.M., Tabechian, D., Hughes, L.B., Salomon-Escoto, K., et al.; Accelerating Medicines Partnership Rheumatoid Arthritis and Systemic Lupus Erythematosus (AMP RA/SLE) Consortium (2019). Defining inflammatory cell states in rheumatoid arthritis joint synovial tissues by integrating single-cell transcriptomics and mass cytometry. Nat. Immunol. 20, 928–942.
STAR+METHODS
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact Ido Amit
(ido.amit@weizmann.ac.il).
Materials Availability
This study did not generate new unique reagents.
Mice
C57BL/6 mice were purchased from Jackson Laboratories and bred and housed at the Weizmann Institute of Science animal facility,
under specific pathogen-free conditions. Female mice, 6-8 weeks of age, were used for all experiments. Experimental protocols were
approved by the Weizmann Institute of Science Ethics Committee and were performed according to institutional guidelines.
LCMV/VSV infections
For LCMV infection, 1 × 10^5 Focus-Forming Units (FFUs) of the LCMV-Arm strain were injected. For VSV, 1 × 10^5 Plaque-Forming Units (PFUs) of the VSV Indiana strain were used. Mice were anesthetized and viruses administered by intradermal injection into the ear pinna. 24 h later, mice were sacrificed and auricular LN were harvested.
Subjects
This study was conducted according to the principles expressed in the Declaration of Helsinki. Ethical approval was obtained from
the Research Ethics Committee of Shenzhen Third People’s Hospital. All participants provided written informed consent for sample
collection and subsequent analyses.
METHOD DETAILS
plates containing lysis buffer before processing the plate according to the MARS-seq protocol (Jaitin et al., 2014). All infectious work
was performed in designated Biosafety Level 2 (BSL-2) and BSL-3 workspaces in accordance with institutional guidelines.
Read mapping/alignment
Reads were aligned using STAR 2.7.0 (Dobin et al., 2013) in the two-pass mode using the following parameters: --runThreadN was set to 14, --outSAMattributes to ‘NH HI AS nM NM XS’, --outSAMtype to ‘BAM SortedByCoordinate’, --outFilterScoreMinOverLread to 0.6, --outFilterMatchNminOverLread to 0.6, and --twopassMode to ‘Basic’.
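As an illustration only, the alignment settings listed above could be assembled into a STAR command line as in the Python sketch below. The index directory, input file, and output prefix are hypothetical placeholders that do not come from this study; only the six parameter values named in the text are taken from it.

```python
import shlex


def build_star_command(genome_dir, fastq, out_prefix, threads=14):
    """Assemble a STAR argument list using the parameters listed in the text.

    genome_dir, fastq and out_prefix are hypothetical inputs for illustration.
    """
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,          # hypothetical index location
        "--readFilesIn", fastq,             # hypothetical input reads
        "--outFileNamePrefix", out_prefix,  # hypothetical output prefix
        "--outSAMattributes", "NH", "HI", "AS", "nM", "NM", "XS",
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFilterScoreMinOverLread", "0.6",
        "--outFilterMatchNminOverLread", "0.6",
        "--twopassMode", "Basic",
    ]


cmd = build_star_command("star_index", "reads.fastq", "sample_")
# Print the command as a single shell-quoted string for inspection.
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run` without a shell, which avoids quoting problems with the multi-valued options.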
Empirically, we determined that a mean sequence entropy greater than 1.2, a coverage greater than 5%, and a longest contig more than three times the mean read length are sufficient to consider a viral segment to be present. This filter configuration eliminated all manually identified artifacts in the various benchmarked datasets and was used unchanged in the HBV and COVID-19 patient data analysis.
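The three empirical thresholds can be read as a simple conjunction. The sketch below is a minimal illustration, assuming the per-segment summary statistics have already been computed; the `SegmentStats` container and its field names are hypothetical and are not part of Viral-Track.

```python
from dataclasses import dataclass


@dataclass
class SegmentStats:
    """Hypothetical per-segment summary; field names are illustrative."""
    mean_entropy: float      # mean sequence entropy of mapped reads
    coverage: float          # fraction of the segment covered (0 to 1)
    longest_contig: int      # length of the longest covered stretch, in bases
    mean_read_length: float  # mean length of mapped reads, in bases


def is_segment_present(stats, min_entropy=1.2, min_coverage=0.05, contig_factor=3.0):
    """Return True only if all three thresholds from the text are exceeded."""
    return (
        stats.mean_entropy > min_entropy
        and stats.coverage > min_coverage
        and stats.longest_contig > contig_factor * stats.mean_read_length
    )


# A segment passing all three criteria, and one failing on coverage alone.
passing = SegmentStats(mean_entropy=1.5, coverage=0.10, longest_contig=400, mean_read_length=100.0)
low_coverage = SegmentStats(mean_entropy=1.5, coverage=0.02, longest_contig=400, mean_read_length=100.0)
print(is_segment_present(passing), is_segment_present(low_coverage))
```

All comparisons are strict, following the wording of the text, so a segment sitting exactly on a threshold is rejected.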
When using this strategy, we observed two different kinds of ‘contamination’:
• The first consists of the detection of retroviruses specific to the sequenced host species: this is likely due to the expression of host endogenous retroviral elements that are highly similar to ‘real’ retroviruses.
• The second is the presence of a plant virus, the Tomato brown rugose fruit virus: this is an emerging virus that infects tomatoes and peppers and is endemic in Israel and Jordan. It is highly contagious and spreads easily. We detected this virus only in samples sequenced in Rehovot (Israel), suggesting that it was due to an airborne contamination.
To improve computation speed, this step was parallelised using the doParallel R package.
Transcript reconstruction
As viral genomes are poorly annotated, we decided to systematically reconstruct the transcriptome of each viral segment detected using the transcript assembler StringTie (Pertea et al., 2015). StringTie was used with default parameters, except for the minimum isoform abundance parameter -f, which was set to 0.01 to detect lowly abundant transcripts, and the minimal distance between two transcripts, -g, which was set to 5.
the getdiffGenes function with default parameters. Data were visualized using UMAP (McInnes et al., 2018) implemented by the uwot
package.
procedures such as TPM, especially for lowly expressed genes (Hafemeister and Satija, 2019). We therefore improved the method
used in our former paper (Blecher-Gonen et al., 2019) that was based on logistic regression.
Briefly, our method builds on the global trend in the field of sequencing large numbers of cells at a limited sequencing depth. Such an approach produces mostly ‘binary’ data and seems to represent the best compromise between cost and efficiency (Svensson et al., 2019). So far, several statistical models have been used to model and analyze scRNA-seq count data, most of them being based on the zero-inflated negative-binomial (ZINB) distribution (Finak et al., 2015; Kharchenko et al., 2014). However, recent studies suggested that those models are too complex and introduce artificial complexity (Silverman et al., 2018; Svensson, 2020; Townes et al., 2019). We hypothesize that with such binary data, current models will not fit properly and that better-suited ones need to be developed.
We therefore developed a new approach based on the binomial complementary log-log regression (cloglog model): once a given group of cells has been isolated, through Louvain’s clustering for instance (Blondel et al., 2008), we first dichotomized gene expression (if the normalized expression is greater than 0, the gene is considered as expressed) and then computed a binomial Generalized Linear Model (GLM) with a complementary log-log link function (cloglog) using the glm() R function. To mitigate the variation of the library size as well as the global effect of the infection (bystander effect), we include both variables in the regression model. The corresponding p values are then computed using a Likelihood Ratio Test (LRT) and corrected using the Benjamini-Hochberg correction (Benjamini and Hochberg, 1995).
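The pieces surrounding the model fit can be sketched in a few lines. The Python below is an illustration under stated assumptions, not the authors' implementation: it shows the dichotomization rule, the complementary log-log link and its inverse, and a standard Benjamini-Hochberg adjustment. The GLM fit and likelihood ratio test themselves, which the text performs with R's glm(), are not reproduced here, and all function names are hypothetical.

```python
import math


def dichotomize(values):
    """Mark a gene as expressed (1) when normalized expression is greater than 0."""
    return [1 if v > 0 else 0 for v in values]


def cloglog(p):
    """Complementary log-log link: eta = log(-log(1 - p)), for 0 < p < 1."""
    return math.log(-math.log(1.0 - p))


def inv_cloglog(eta):
    """Inverse link: p = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


binary = dichotomize([0.0, 0.5, 2.0, 0.0])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(binary, adjusted)
```

In a full analysis, the p values passed to `benjamini_hochberg` would come from one likelihood ratio test per gene, comparing the cloglog model with and without the infection-status term.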
For a more comprehensive description of the approach please see Methods S1.
Supplemental Figures
Figure S2. Comparison of Viral-Track Performance to Fluorescence Tagging Techniques, Related to Figure 2
A. Proportion of vUMI+ cells from total spleen and the LCMV-GFP+ population. B. UMAP plot of the spleen LCMV data; spots are colored based on Louvain clustering. C. UMAP plot of the spleen LCMV data; bystander cells are colored in gray, vUMI+ cells in red, and GFP+ cells in green. D. Mean gene expression in bystander and infected MZB cells. Genes with a log2FC greater than 1 or lower than -1 and a corrected p value lower than 0.01 are colored in orange.
Figure S3. Detailed Molecular and Cellular Profiling of COVID-19 BAL Samples, Related to Figure 3
A. The confusion matrix of the MetaCell model shown in Figure 3A. Entries denote, for each pair of metacells, the propensity of cells from both metacells to be clustered together in a bootstrap analysis. B-D. Gene expression profiles of cells belonging to the epithelial (B), lymphoid (C), and myeloid (D) compartments. In A-D, color bars indicate association to 27 cell subsets depicted in Figure 3A. E-G. Quantification of the frequency of specific cell subsets in the myeloid (E), lymphoid (F), and epithelial (G) compartments, across the nine patients. Diamond marks patient S1, co-infected with the human Metapneumovirus (Figures 4D-4H). Horizontal lines indicate mean frequency. H. Projection of IL6 and IL8 (CXCL8) expression on the 2D map shown in Figure 4A. Colors represent expression quantiles.
Correspondence
shane@lji.org (S.C.),
alex@lji.org (A.S.)
In Brief
An analysis of immune cell responses to SARS-CoV-2 from recovered patients identifies the regions of the virus that are targeted and also reveals cross-reactivity with other common circulating coronaviruses.
Highlights
• Measuring immunity to SARS-CoV-2 is key for understanding COVID-19 and vaccine development
Article
Targets of T Cell Responses to SARS-CoV-2
Coronavirus in Humans with COVID-19
Disease and Unexposed Individuals
Alba Grifoni,1 Daniela Weiskopf,1 Sydney I. Ramirez,1,2 Jose Mateus,1 Jennifer M. Dan,1,2
Carolyn Rydyznski Moderbacher,1 Stephen A. Rawlings,2 Aaron Sutherland,1 Lakshmanane Premkumar,3
Ramesh S. Jadi,3 Daniel Marrama,1 Aravinda M. de Silva,3 April Frazier,1 Aaron F. Carlin,2 Jason A. Greenbaum,1
Bjoern Peters,1,2 Florian Krammer,4 Davey M. Smith,2 Shane Crotty,1,2,5,* and Alessandro Sette1,2,5,6,*
1Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology, La Jolla, CA 92037, USA
2Department of Medicine, Division of Infectious Diseases and Global Public Health, University of California, San Diego, La Jolla, CA
92037, USA
3Department of Microbiology and Immunology, University of North Carolina School of Medicine, Chapel Hill, NC 27599-7290, USA
4Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
5These authors contributed equally
6Lead Contact
SUMMARY
Understanding adaptive immunity to SARS-CoV-2 is important for vaccine development, interpreting coronavirus disease 2019 (COVID-19) pathogenesis, and calibration of pandemic control measures. Using HLA class I and II predicted peptide “megapools,” circulating SARS-CoV-2-specific CD8+ and CD4+ T cells were identified in 70% and 100% of COVID-19 convalescent patients, respectively. CD4+ T cell responses to spike, the main target of most vaccine efforts, were robust and correlated with the magnitude of the anti-SARS-CoV-2 IgG and IgA titers. The M, spike, and N proteins each accounted for 11%–27% of the total CD4+ response, with additional responses commonly targeting nsp3, nsp4, ORF3a, and ORF8, among others. For CD8+ T cells, spike and M were recognized, with at least eight SARS-CoV-2 ORFs targeted. Importantly, we detected SARS-CoV-2-reactive CD4+ T cells in 40%–60% of unexposed individuals, suggesting cross-reactive T cell recognition between circulating “common cold” coronaviruses and SARS-CoV-2.
Cell 181, 1489–1501, June 25, 2020 © 2020 Elsevier Inc.
(Vennema et al., 1990). Protective immunity, immunopathogenesis, and vaccine development for COVID-19 are each briefly discussed below, related to introducing the importance of defining T cell responses to SARS-CoV-2.
Based on data from SARS patients in 2003–2004 (caused by SARS-CoV, the most closely related human betacoronavirus to SARS-CoV-2), and based on the fact that most acute viral infections result in development of protective immunity (Sallusto et al., 2010), a likely possibility has been that substantial CD4+ T cell, CD8+ T cell, and neutralizing antibody responses develop to SARS-CoV-2, and all contribute to clearance of the acute infection, and, as a corollary, some of the T and B cells are retained long term (i.e., multiple years) as immunological memory and protective immunity against SARS-CoV-2 infection (Guo et al., 2020b; Li et al., 2008). However, a contrarian viewpoint is also legitimate. While most acute infections result in the development of protective immunity, available data for human coronaviruses suggest the possibility that substantive adaptive immune responses can fail to occur (Choe et al., 2017; Okba et al., 2019; Zhao et al., 2017) and robust protective immunity can fail to develop (Callow et al., 1990). A failure to develop protective immunity could occur due to a T cell and/or antibody response of insufficient magnitude or durability, with the neutralizing antibody response being dependent on the CD4+ T cell response (Crotty, 2019; Zhao et al., 2016). Thus, there is urgent need to understand the magnitude and composition of the human CD4+ and CD8+ T cell responses to SARS-CoV-2. If natural infection with SARS-CoV-2 elicits potent CD4+ and CD8+ T cell responses commonly associated with protective antiviral immunity, COVID-19 is a strong candidate for rapid vaccine development.
Immunopathogenesis in COVID-19 is a serious concern (Cao, 2020; Peeples, 2020). It is most likely that an early CD4+ and CD8+ T cell response against SARS-CoV-2 is protective, but an early response is difficult to generate because of efficient innate immune evasion mechanisms of SARS-CoV-2 in humans (Blanco-Melo et al., 2020). Immune evasion by SARS-CoV-2 is likely exacerbated by reduced myeloid cell antigen-presenting cell (APC) function or availability in the elderly (Zhao et al., 2011). In such cases, it is conceivable that late T cell responses may instead amplify pathogenic inflammatory outcomes in the presence of sustained high viral loads in the lungs, by multiple hypothetical possible mechanisms (Guo et al., 2020a; Li et al., 2008; Liu et al., 2019). Critical (ICU) and fatal COVID-19 (and SARS) outcomes are associated with elevated levels of inflammatory cytokines and chemokines, including interleukin-6 (IL-6) (Giamarellos-Bourboulis et al., 2020; Wong et al., 2004; Zhou et al., 2020).
Vaccine development against acute viral infections classically focuses on vaccine-elicited recapitulation of the type of protective immune response elicited by natural infection. Such foundational knowledge is currently missing for COVID-19, including how the balance and the phenotypes of responding cells vary as a function of disease course and severity. Such knowledge can guide selection of vaccine strategies most likely to elicit protective immunity against SARS-CoV-2. Furthermore, knowledge of the T cell responses to COVID-19 can guide selection of appropriate immunological endpoints for COVID-19 candidate vaccine clinical trials, which are already starting.
Limited information is also available about which SARS-CoV-2 proteins are recognized by human T cell immune responses. In some infections, T cell responses are strongly biased toward certain viral proteins, and the targets can vary substantially between CD4+ and CD8+ T cells (Moutaftsi et al., 2010; Tian et al., 2019). Knowledge of SARS-CoV-2 proteins and epitopes recognized by human T cell responses is of immediate relevance, as it will allow for monitoring of COVID-19 immune responses in laboratories worldwide. Epitope knowledge will also assist candidate vaccine design and facilitate evaluation of vaccine candidate immunogenicity. Almost all of the current COVID-19 vaccine candidates are focused on the spike protein.
A final key issue to consider in the study of SARS-CoV-2 immunity is whether some degree of cross-reactive coronavirus immunity exists in a fraction of the human population, and whether this might influence susceptibility to COVID-19 disease. This issue is also relevant for vaccine development, as cross-reactive immunity could influence responsiveness to candidate vaccines (Andrews et al., 2015).
In sum, the ability to measure and understand the human CD4+ and CD8+ T cell responses to SARS-CoV-2 infection is a major knowledge gap currently impeding COVID-19 vaccine development, interpretation of COVID-19 disease pathogenesis, and calibration of future social distancing pandemic control measures.

RESULTS

SARS-CoV-2 Peptides and Predicted Class I and Class II Epitopes
We recently predicted SARS-CoV-2 T cell epitopes utilizing the Immune Epitope Database and Analysis Resource (IEDB) (Dhanda et al., 2019; Vita et al., 2019). Utilizing bioinformatic approaches, we identified specific peptides in SARS-CoV-2 with increased probability of being T cell targets (Grifoni et al., 2020). We previously developed the megapool (MP) approach to allow simultaneous testing of large numbers of epitopes. By this technique, numerous epitopes are solubilized, pooled, and re-lyophilized to avoid cell toxicity problems (Carrasco Pro et al., 2015). These MPs have been used in human T cell studies of a number of indications, including allergies (Hinz et al., 2016), tuberculosis (Lindestam Arlehamn et al., 2016), tetanus, pertussis (Bancroft et al., 2016; da Silva Antunes et al., 2017), and dengue virus, for both CD4+ and CD8+ T cell epitopes (Grifoni et al., 2017; Weiskopf et al., 2015). Here, we generated MPs based on predicted SARS-CoV-2 epitopes. Specifically, one MP corresponds to 221 predicted HLA class II CD4+ T cell epitopes (Grifoni et al., 2020) covering all proteins in the viral genome, apart from the spike (S) antigen (CD4_R MP). The prediction strategy utilized is geared to capture 50% of the total response (Dhanda et al., 2018; Paul et al., 2015) and was designed and validated to predict dominant epitopes independently of ethnicity and HLA polymorphism. This approach takes advantage of the extensive cross-reactivity and repertoire overlap between different HLA class II loci and allelic variants to predict promiscuous epitopes, capable of binding many of the most common HLA class II prototypic specificities (Greenbaum
et al., 2011; O’Sullivan et al., 1991; Sidney et al., 2010a, 2010b; Southwood et al., 1998).
For the spike protein, to ensure that all T cell reactivity against this important antigen can be detected, we generated a separate MP covering the entire antigen with 253 15-mer peptides overlapping by 10 residues (MP_S, Table S1). As stated above, the MP used to probe the non-spike regions is expected to capture 50% of the total response. The use of overlapping peptides spanning entire open reading frames (ORFs) instead allows for a more complete characterization but also requires more cells. This factor should be kept in mind in terms of comparison of the magnitude of the CD4+ T cell responses to those pools.
In the case of CD8 epitopes, since the overlap between different HLA class I allelic variants and loci is more limited to specific groups of alleles, or supertypes (Sidney et al., 2008), we targeted a set of the 12 most prominent HLA class I A and B alleles, which together allow broad coverage (>85%) of the general population. Two class I MPs were synthesized based on epitope predictions for those 12 most common HLA A and B alleles (Grifoni et al., 2020), which collectively encompass 628 predicted HLA class I CD8+ T cell epitopes from the entire SARS-CoV-2 proteome (CD8 MP-A and MP-B).

Table 1. Participant Characteristics
                                           Unexposed (n = 20)               COVID-19 (n = 20)
Age (years)                                20–66 (median = 31, IQR = 21)    20–64 (median = 44, IQR = 9)
Gender
  Male (%)                                 35% (7/20)                       45% (9/20)
  Female (%)                               65% (13/20)                      55% (11/20)
Residency
  California (%)                           95% (19/20)                      100% (20/20)
  USA, Non-California (%)                  5% (1/20)                        0% (0/20)
Sample Collection Date                     March 2015–March 2018            March–April 2020
SARS-CoV-2 PCR Positivity                  N/A                              100% (16/16 tested)
Antibody Test Positivity (a)               N/A                              90% (18/20)
Disease Severity (b)
  Mild                                     N/A                              70% (14/20)
  Moderate                                 N/A                              20% (4/20)
  Severe                                   N/A                              10% (2/20)
  Critical                                 N/A                              0% (0/20)
Symptoms
  Cough                                    N/A                              79% (15/19)
  Fatigue                                  N/A                              42% (8/19)
  Fever                                    N/A                              37% (7/19)
  Anosmia                                  N/A                              21% (4/19)
  Dyspnea                                  N/A                              16% (3/19)
  Diarrhea                                 N/A                              5% (1/19)
Days Post Symptom Onset at Collection      N/A                              20–36 (18/20) (median = 26, IQR = 7)
Past Medical History
  No known                                 N/A                              65% (13/20)
  Hyperlipidemia                           N/A                              15% (3/20)
  Hypertension                             N/A                              10% (2/20)
  Asthma                                   N/A                              10% (2/20)
Known or suspected sick contact/exposure   N/A                              75% (15/20)
(a) Commercial skin prick lateral flow assay.
(b) WHO criteria.

Immunological Phenotypes of Recovered COVID-19 Patients
To test for the generation of SARS-CoV-2 CD4+ and CD8+ T cell responses following infection, we initially recruited 20 adult patients who had recovered from COVID-19 disease (Table 1). We also utilized peripheral blood mononuclear cell (PBMC) and plasma samples from local healthy control donors collected in 2015–2018 (see STAR Methods). Blood samples were collected at 20–35 days post-symptoms onset from non-hospitalized COVID-19 patients who were no longer symptomatic. SARS-CoV-2 infection was determined by swab test viral PCR during the acute phase of the infection. Verification of SARS-CoV-2 exposure was attempted both by lateral flow serology and SARS-CoV-2 spike protein receptor binding domain (RBD) ELISA (Stadlbauer et al., 2020), using plasma from the convalescence stage blood draw. Most patients were confirmed positive by lateral flow immunoglobulin (Ig) tests (Table 1). All patients were confirmed COVID-19 cases by SARS-CoV-2 RBD ELISA (Figures 1 and S1). All cases were IgG positive; anti-RBD IgM and IgA was also detected in the large majority of cases (Figures 1 and S1).
We defined a 21-color flow cytometry panel of mononuclear leukocyte lineage and phenotypic markers (Table S2) to broadly assess the immunological cellular profile of recovered COVID-19 patients (Figures 1 and S2). The frequency of CD3+ cells was slightly increased in recovered COVID-19 patients relative to non-exposed controls, while no significant differences overall were observed in the frequencies of CD4+ or CD8+ T cells between the two groups. Frequencies of CD19+ cells were somewhat decreased, while no differences were observed in the frequencies of CD3–CD19– cells or CD14+CD16– monocytes (Figures 1 and S2). No evidence of general lymphopenia was observed in the convalescing patients, consistent with the literature. Next, we utilized the SARS-CoV-2 MPs to probe CD4+ and CD8+ T cell responses.

Identification and Quantitation of SARS-CoV-2-Specific CD4+ T Cell Responses
We utilized T cell receptor (TCR) dependent activation induced marker (AIM) assays to identify and quantify SARS-CoV-2-specific CD4+ T cells in recovered COVID-19 patients. Initial definition and assessment of human antigen-specific SARS-CoV-2 T cell responses are best made with direct ex vivo T cell assays using broad-based epitope pools, such as MPs, and assays capable of detecting T cells of unknown cytokine
polarization and functional attributes. AIM assays are cytokine-independent assays to identify antigen-specific CD4+ T cells (Havenar-Daughton et al., 2016; Reiss et al., 2017). AIM assays have been successfully used to identify virus-specific, vaccine-specific, or tuberculosis-specific CD4+ T cells in a range of studies (Dan et al., 2016, 2019; Herati et al., 2017; Morou et al., 2019).

[Figure panels (axes: % CD3+ cells, % CD4+ T cells, % CD8+ T cells, % CD19+ cells, % CD14+CD16- monocytes; groups: Neg, COVID).] *p < 0.05, ****p < 0.0001. See also Figures S1 and S2.

We stimulated PBMCs from 10 COVID-19 cases and 11 healthy controls (SARS-CoV-2 unexposed, collected in 2015–2018) with a spike MP (MP_S) and the class II MP covering the remainder of the SARS-CoV-2 orfeome (“non-spike,” MP CD4_R). A CMV MP was used as a positive control, while DMSO was used as the negative control (Figures 2 and S3). SARS-CoV-2 spike-specific CD4+ T cell responses (OX40+CD137+) were detected in 100% of COVID-19 cases (p < 0.0001 versus unexposed donors spike MP, Figures 2A and 2B. p = 0.002 versus DMSO control, Figure 2C). CD4+ T cell responses to the remainder of the SARS-CoV-2 orfeome were also detected in 100% of COVID-19 cases (p < 0.0079 versus unexposed donors non-spike MP, Figures 2A and 2B. p = 0.002, non-spike versus DMSO control, Figure 2C). The magnitude of the SARS-CoV-2-specific CD4+ T cell responses measured was similar to that of the CMV MP (Figure S3C). The concordance between SARS-CoV-2-specific CD4+ T cell measurements in independent experiments was high (p < 0.0002, Figure S3D). To assess functionality and polarization of the SARS-CoV-2-specific CD4+ T cell response, we measured cytokines secreted in response to MP stimulation.
functional, as the cells produced IL-2 in response to non-spike and spike MPs (Figure 2D). Polarization of the cells appeared to be a classic TH1 type, as substantial interferon (IFN)-γ was produced (Figure 2E), while little to no IL-4, IL-5, IL-13, or IL-17a was expressed (Figures S3G–S3J).
Thus, recovered COVID-19 patients consistently generated a substantial CD4+ T cell response against SARS-CoV-2. Similar conclusions were reached using stimulation index as the metric (Figures S3E and S3F). In terms of total CD4+ T cell response per donor (Figure 2A), on average 50% of the detected response was directed against the spike protein, and 50% was directed against the MP representing the remainder of the SARS-CoV-2 orfeome (Figure 2A). This is of significance, since the SARS-CoV-2 spike protein is a key component of the vast majority of candidate COVID-19 vaccines under development. Of note, given the nature of the MP_R peptide predictions, the actual CD4+ T cell response to be ascribed to non-spike ORFs was likely to be higher, addressed in further experiments below.

Identification and Quantitation of SARS-CoV-2-Specific CD8+ T Cell Responses
To measure SARS-CoV-2-specific CD8+ T cells in the recovered COVID-19 patients, we utilized two complementary methodologies, AIM assays and intracellular cytokine staining (ICS). The two SARS-CoV-2 class I MPs were used, CD8-A and CD8-B, with CMV MP and DMSO serving as positive and negative controls, respectively (Figures 3 and S4). CD8+ T cell responses were detected by AIM (CD69+CD137+) in 70% of COVID-19 cases (p < 0.0011 versus unexposed donors “CD8 total,”
Figures 3A and 3B; p = 0.002, CD8-A or CD8-B versus DMSO control, Figure S4B). MP CD8-A contains spike epitopes, among epitopes to other proteins. The magnitude of the SARS-CoV-2 reactive CD8+ T cell responses measured by AIM was somewhat lower than the CMV MP (Figure S4C). Similar conclusions were reached using stimulation index (Figures S3D and S3E).

Independently, ICS assays detected IFN-γ+ SARS-CoV-2-specific CD8+ T cells in the majority of COVID-19 cases (Figures 3C and 3D). The majority of IFN-γ+ cells co-expressed granzyme B (Figures 3D and 3E). A substantial fraction of the IFN-γ+ cells expressed tumor necrosis factor (TNF) but not IL-10 (Figure 3D). Thus, the majority of recovered COVID-19 patients generated a CD8+ T cell response against SARS-CoV-2.

Relationship between SARS-CoV-2-Specific CD4+ T Cell Responses and IgG and IgA Titers

Most protective antibody responses are dependent on CD4+ T cell help. Therefore, we assessed whether stronger SARS-CoV-2-specific CD4+ T cell responses were associated with higher antibody titers in COVID-19 cases. Given that spike is the primary target of SARS neutralizing antibodies, we examined spike-specific CD4+ T cells. Spike-specific CD4+ T cell responses correlated well with the magnitude of the anti-spike RBD IgG titers (R = 0.81; p < 0.0001; Figure 4A). Similar results were obtained using stimulation index (Figure S5A). The non-spike SARS-CoV-2-specific CD4+ T cell response did not correlate as well with anti-spike RBD IgG titers (Figures 4B and S5B), consistent with a common requirement for intramolecular CD4+ T cell help (Sette et al., 2008). Anti-spike IgA titers also correlated with spike-specific CD4+ T cells (p < 0.0002, Figure S5). Thus, COVID-19 patients make anti-spike RBD antibody responses commensurate with the magnitude of their spike-specific CD4+ T cell response. We then assessed the relationship between the CD4+ and CD8+ T cell responses to SARS-CoV-2. SARS-CoV-2-specific CD4+ and CD8+ T cell responses were well correlated (R = 0.62; p = 0.0025; Figures 4C and S5). Thus, antibody, CD4+, and CD8+ T cell responses to SARS-CoV-2 were generally well correlated.
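The R values quoted in this section are Spearman rank correlations, according to the Figure 4 legend. As an illustration only, the following minimal Python sketch shows how such a coefficient can be computed as the Pearson correlation of tie-averaged ranks. The function names and the donor values are hypothetical and are not taken from the study, and the p value calculation is omitted.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical per-donor values (NOT study data): percentage of
    # spike-specific CD4+ T cells and anti-spike RBD IgG endpoint titer.
    cd4_percent = [0.12, 0.45, 0.30, 0.80, 0.22, 0.60]
    igg_titer = [400, 3200, 800, 6400, 1600, 3200]
    print(f"Spearman R = {spearman_r(cd4_percent, igg_titer):.2f}")
```

Because the calculation uses ranks rather than raw values, it is insensitive to the scale of the titers and to any monotonic transformation such as a log scale.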
Figure 4. Correlations between SARS-CoV-2-Specific CD4+ T Cells, Antibodies, and CD8+ T Cells
(A) Correlation between SARS-CoV-2 spike-specific CD4+ T cells (%) and anti-spike RBD IgG.
(B) Correlation between SARS-CoV-2 non-spike-specific CD4+ T cells (%) and anti-spike RBD IgG.
(C) Correlation between SARS-CoV-2-specific CD4+ T cells and SARS-CoV-2-specific CD8+ T cells. Total MP responses per donor were used in each case (“Non-spike” + “spike” (CD4_R + MP_S) for CD4+ T cells, CD8_A + CD8_B for CD8+ T cells).
Statistical comparisons were performed using Spearman correlation. See also Figure S5.

Pre-existing Cross-Reactive Coronavirus-Specific T Cells

While spike- and non-spike-specific CD4+ T cell responses were detectable in all COVID-19 cases, cells were also detected in unexposed individuals (Figures 3A and 3B). These responses were statistically significant for non-spike-specific CD4+ T cell reactivity (non-spike, p = 0.039; spike, p = 0.067; Figures 5A and 5B). Non-spike-specific CD4+ T cell responses were above the limit of detection in 50% of donors based on stimulation index (SI) (Figure S3E). All of the donors were recruited between 2015 and 2018, excluding any possibility of exposure to SARS-CoV-2. Four human coronaviruses are known causes of seasonal “common cold” upper-respiratory tract infections: HCoV-OC43, HCoV-HKU1, HCoV-NL63, and HCoV-229E. We tested the SARS-CoV-2 unexposed donors for seroreactivity to HCoV-OC43 and HCoV-NL63 as a representative betacoronavirus and alphacoronavirus, respectively. All donors were IgG seropositive to HCoV-OC43 and HCoV-NL63 RBD, to varying degrees (Figure 5C), consistent with the endemic nature of these viruses (Gorse et al., 2010; Huang et al., 2020; Severance et al., 2008). We therefore examined whether these represented true pan-coronavirus T cells capable of recognizing SARS-CoV-2 epitopes.

SARS-CoV-2 ORF Targets of CD4+ and CD8+ T Cells

A most pressing, yet unresolved, set of issues in understanding SARS-CoV-2 immune responses is what antigens are targeted by CD4+ and CD8+ T cells, whether the corresponding antigens are the same or different, and how they reflect the antigens currently considered for COVID-19 vaccine development. We synthesized sets of overlapping peptides spanning the entire sequence of SARS-CoV-2 and pooled them separately so that each pool would represent one antigen (with the exception of nsp3, for which two pools were made; Table S1).

In the case of CD4+ T cell responses, no obvious pattern of antigen specificity was observed based on SARS-CoV-2 genome organization; however, coronaviruses increase protein synthesis of certain ORFs in infected cells via subgenomic RNAs. Accounting for the relative abundance of subgenomic RNAs (Figure 6A) (Irigoyen et al., 2016; Snijder et al., 2003; Xie et al., 2020), the ORFs were re-ordered based on predicted protein abundance (Figure 6B). A clear hierarchy of SARS-CoV-2-specific CD4+ T cell targets was then apparent, with the majority of the CD4+ T cell response in COVID-19 cases directed against highly expressed SARS-CoV-2 ORFs spike, M, and N. On average, these antigens accounted for 27%, 21%, and 11% of the total CD4+ T cell response, respectively. Most COVID-19 cases also had CD4+ T cells specific for SARS-CoV-2 nsp3, nsp4, and ORF8 (Figure 6B), on average each accounting for 5% of the total CD4+ T cell response (Figure 6C). E, ORF6, hypothetical ORF10, and nsp1 are all small antigens (or potentially not expressed, in the case of ORF10) and were most likely predominantly unrecognized as a result. These results are somewhat unexpected, because data for other coronaviruses, from 27 different studies curated in the IEDB, reported that spike accounted for nearly two-thirds of reported CD4+ T cell reactivity (Table S3). N accounted for most of the remaining epitopes in the published literature, although human N-specific CD4+ T cell responses were not observed in one of the most comprehensive studies of human SARS-CoV-1 T cell responses (Li et al., 2008). Coronavirus M has not previously been described as a prominent target of CD4+ T cell responses (Table S3). In sum, these results, fully scanning the SARS-CoV-2 orfeome, demonstrate a pattern of robust and diverse SARS-CoV-2-specific CD4+ T cell reactivity in convalescing COVID-19 cases that correlated largely with predicted viral protein abundance in infected cells.

When examining the non-exposed donors, the pattern of CD4+ T cell targets changed. While S was still a relatively prominent target (23% of total, on average), there was no, or marginal, reactivity against SARS-CoV-2 N and M. Among donors with detectable CD4+ T cells, a shift in reactivity was observed toward SARS-CoV-2 nsp14 (25%), nsp4 (15%), and nsp6 (14%) (Figures 6B and 6C). SARS-CoV-2-reactive CD4+ T cells were detected in at least six different unexposed donors, demonstrating that the cross-reactivity is relatively widely distributed (Figure S6A).

Having scanned the full SARS-CoV-2 orfeome for CD4+ T cell reactivity in multiple donors, it was possible to assess whether the epitope prediction MP approach successfully enriched for SARS-CoV-2 epitopes targeted by human CD4+ T cells. When the total reactivity observed with the CD4_R MP was plotted versus the sum total of all antigen pools (excluding spike, given that spike predictions were not included in the CD4_R MP), a significant correlation was observed (p < 0.0002, Figure S6C). The single MP-R captured 50% (44% +/- range 28%–80%) of the non-spike response per COVID-19 donor, demonstrating the success of the prediction approach, which, as mentioned above, was devised to attempt to capture approximately 50% of the total response (Dhanda et al., 2018; Paul et al., 2015).

In the case of CD8+ T cell responses, the data in the literature from other coronaviruses (57 different studies curated in the IEDB; Table S3) reported spike accounting for 50% and N accounting for 36% of the defined epitopes. In a large study of human SARS-CoV-1 responses, spike was reported as essentially the only target of CD8+ T cell responses (Li et al., 2008), while in a study of MERS CD8+ T cells, responses were noted for spike, N, and a pool of M/E peptides (Zhao et al., 2017). Few epitopes have been reported from other coronavirus antigens (Table S3). Here, we scanned the full SARS-CoV-2 orfeome for CD8+ T cell recognition. Our data indicate a somewhat different pattern of immunodominance for SARS-CoV-2 CD8+ T cell reactivity (Figures 6D and 6E), with spike protein accounting for 26% of the reactivity, and N accounting for 12%. Significant reactivity in COVID-19 recovered subjects was derived from other antigens, such as M (22%), nsp6 (15%), ORF8 (10%), and ORF3a (7%) (Figures 6D and 6E). In unexposed donors, SARS-CoV-2-reactive CD8+ T cells were detected in at least four different donors (Figure S7), with less clear targeting of specific SARS-CoV-2 proteins than was observed for CD4+ T cells, suggesting that coronavirus CD8+ T cell cross-reactivity exists but is less widespread than CD4+ T cell cross-reactivity.

DISCUSSION

There is a critical need for foundational knowledge about T cell responses to SARS-CoV-2. Here, we report functional validation of predicted epitopes when arranged in epitope MPs, utilizing PBMCs derived from convalescing COVID-19 cases. The experiments also used protein-specific peptide pools to determine which SARS-CoV-2 proteins are the predominant targets of human SARS-CoV-2-specific CD4+ and CD8+ T cells generated during COVID-19 disease. Importantly, we utilized the exact same series of experimental techniques with blood samples from healthy control donors (PBMCs collected in the 2015–2018 time frame), and substantial cross-reactive coronavirus T cell memory was observed.

Our results demonstrate that the epitope MPs are reagents well suited to analyze and detect SARS-CoV-2-specific T cell responses with limited sample material. We also developed and tested peptide pools corresponding to each of the 25 proteins encoded in the SARS-CoV-2 genome. Data from both the epitope MPs and protein peptide pool experiments can be interpreted in the context of previously reported T cell response immunodominance patterns observed for other coronaviruses, particularly the SARS and MERS viruses, which have been studied in humans, HLA-transgenic mice, wild-type mice, and other species. In the case of CD4+ T cell responses, data for other coronaviruses found that spike accounted for nearly two-thirds of reported CD4+ T cell reactivity, with N and M accounting for limited reactivity, and no reactivity in one large study of human SARS-CoV-1 responses (Li et al., 2008). Our SARS-CoV-2 data reveal that the pattern of immunodominance in COVID-19 is different. In particular, M, spike, and N proteins were clearly co-dominant, each recognized by 100% of COVID-19 cases studied here. Significant CD4+ T cell responses were also directed against nsp3, nsp4, ORF3a, ORF7a, nsp12, and ORF8. These data suggest that a candidate COVID-19 vaccine consisting only of SARS-CoV-2 spike would be capable of eliciting SARS-CoV-2-specific CD4+ T cell responses of similar representation to that of natural COVID-19 disease, but the data also indicate that there are many potential CD4+ T cell targets in SARS-CoV-2, and inclusion of additional SARS-CoV-2 structural antigens such as M and N would better mimic the natural SARS-CoV-2-specific CD4+ T cell response observed in mild to moderate COVID-19 disease.

Regarding SARS-CoV-2 CD8+ T cell responses, the pattern of immunodominance found here differed from the literature for other coronaviruses. However, stringent comparisons are not possible, as some earlier studies were not similarly comprehensive and did not utilize the same experimental strategy. The spike protein was a target of human SARS-CoV-2 CD8+ T cell responses, but it was not dominant. SARS-CoV-2 M was just as strongly recognized, and significant reactivity was noted for other antigens, mostly nsp6, ORF3a, and N, which comprised nearly 50% of the total CD8+ T cell response, on average. Thus, these data indicate that candidate COVID-19 vaccines endeavoring to elicit CD8+ T cell responses against the spike protein will be eliciting a relatively narrow CD8+ T cell response compared to the natural CD8+ T cell response observed in mild to moderate COVID-19 disease. An optimal vaccine CD8+ T cell response to SARS-CoV-2 might benefit from additional class I epitopes, such as the ones derived from M, nsp6, ORF3a, and/or N.

There have been concerns regarding vaccine enhancement of disease by certain candidate COVID-19 vaccine approaches, via antibody-dependent enhancement (ADE) or development of TH2 responses (Peeples, 2020). Herein, we saw predominant TH1 responses in convalescing COVID-19 cases, with little to no TH2 cytokines. Clearly more studies are required, but the data here appear to predominantly represent a classic TH1 response to SARS-CoV-2.
Figure 6. Protein Immunodominance of SARS-CoV-2-Specific CD4+ and CD8+ T Cells in COVID-19 Cases and Unexposed Donors
(A) SARS-CoV-2 genome organization and predicted viral protein abundance in infected cells.
(B) SARS-CoV-2 antigen-specific CD4+ T cells (AIM+, OX40+CD137+) quantified by stimulation index, using a peptide pool for each viral protein (with two ex-
ceptions, see Table S1). COVID-19 cases (top, in blue. n = 10) and unexposed donors (bottom, in white. n = 10). Data are expressed as geometric mean and
geometric SD.
(C) Fraction of SARS-CoV-2 proteins recognized by CD4+ T cells in COVID-19 cases (top) and unexposed donors (bottom).
(D) SARS-CoV-2 antigen-specific CD8+ T cells (AIM+, CD69+CD137+) quantified by stimulation index, using a peptide pool for each viral protein (with two exceptions, see Table S1). COVID-19 cases (top, in red. n = 10) and unexposed donors (bottom, in gray. n = 10). Data are expressed as geometric mean and geometric SD.
(E) Fraction of SARS-CoV-2 proteins recognized by CD8+ T cells in COVID-19 cases (top) and unexposed donors (bottom).
See also Figures S6 and S7 and Table S6.
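The quantities named in the Figure 6 legend (stimulation index, geometric mean and geometric SD, and the fraction of the total response directed at each protein) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the stimulation index is assumed here to be the percentage of AIM+ cells after peptide stimulation divided by the percentage in the DMSO control, which this section does not spell out; negative background-subtracted responses are clipped to zero as a simplifying choice; and all function names and numbers are hypothetical rather than taken from the study.

```python
import math
from statistics import geometric_mean, stdev


def stimulation_index(stimulated_pct, dmso_pct):
    """Assumed definition: % AIM+ cells with peptide / % AIM+ cells with DMSO."""
    if dmso_pct <= 0:
        raise ValueError("DMSO control percentage must be positive")
    return stimulated_pct / dmso_pct


def geometric_sd(values):
    """Geometric SD: exponential of the sample SD of the log-values."""
    logs = [math.log(v) for v in values]
    return math.exp(stdev(logs))


def response_fractions(net_responses):
    """Share of the summed response attributed to each antigen pool.

    net_responses maps pool name -> background-subtracted response;
    negative values are treated as zero (an assumption of this sketch).
    """
    clipped = {name: max(value, 0.0) for name, value in net_responses.items()}
    total = sum(clipped.values())
    if total == 0:
        return {name: 0.0 for name in clipped}
    return {name: value / total for name, value in clipped.items()}


if __name__ == "__main__":
    # Hypothetical single-donor values (NOT study data).
    print("SI:", stimulation_index(0.5, 0.1))
    print("Fractions:", response_fractions({"S": 3.0, "M": 1.0, "N": -0.5}))
    # Hypothetical stimulation indices for one pool across four donors.
    si_values = [2.0, 4.0, 8.0, 3.0]
    print("Geometric mean:", geometric_mean(si_values))
    print("Geometric SD:", geometric_sd(si_values))
```

Summarizing ratios such as the stimulation index with a geometric mean and geometric SD treats a two-fold increase and a two-fold decrease symmetrically, which is the usual reason for preferring them over arithmetic summaries for this kind of data.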
While it was important to identify antigen-specific T cell responses in COVID-19 cases, it is also of great interest to understand whether cross-reactive immunity exists between coronaviruses to any degree. A key step in developing that understanding is to examine antigen-specific CD4+ and CD8+ T cells in COVID-19 cases and in unexposed healthy controls, utilizing the exact same antigens and series of experimental techniques. CD4+ T cell responses were detected in 40%–60% of unexposed individuals. This may be reflective of some degree of cross-reactive, preexisting immunity to SARS-CoV-2 in some, but not all, individuals. Whether this immunity is relevant in influencing clinical outcomes is unknown (and cannot be known without T cell measurements before and after SARS-CoV-2 infection of individuals), but it is tempting to speculate that the cross-reactive CD4+ T cells may be of value in protective immunity, based on SARS mouse models (Zhao et al., 2016). Clear identification of the cross-reactive peptides, and their sequence homology relation to other coronaviruses, requires deconvolution of the positive peptide pools, which is not feasible with the cell numbers presently available and within the time frame of the present study.

Regarding the value of cross-reactive T cells, influenza (flu) immunology in relationship to pandemics may be instructive. In the context of the 2009 H1N1 influenza pandemic, preexisting T cell immunity existed in the adult population, which focused on the more conserved internal influenza viral proteins (Greenbaum et al., 2009). The presence of cross-reactive T cells was found to correlate with less severe disease (Sridhar et al., 2013; Wilkinson et al., 2012). The frequent availability of cross-reactive memory T cell responses might have been one factor contributing to the lesser severity of the H1N1 flu pandemic (Hancock et al., 2009). Cross-reactive immunity to influenza strains has been modeled to be a critical influencer of susceptibility to newly emerging, potentially pandemic, influenza strains (Gostic et al., 2016). Given the severity of the ongoing COVID-19 pandemic, it has been modeled that any degree of cross-protective coronavirus immunity in the population could have a very substantial impact on the overall course of the pandemic, and the dynamics of the epidemiology for years to come (Kissler et al., 2020).

Limitations and Future Directions

Caveats of this study include the sample size and the focus on non-hospitalized COVID-19 cases. Sample size was limited by expediency. The focus on non-hospitalized cases of COVID-19 is a strength, in that these donors had uncomplicated disease of moderate duration, and thus it was encouraging that substantial CD4+ T cell and antibody responses were detected in all cases, and CD8+ T cell responses in the majority of cases. Complementing these data with MP T cell data from acute patients and patients with complicated disease course will also be of clear value, as will studies on the longevity of SARS-CoV-2 immunological memory. Additionally, lack of detailed information on common cold history or matched blood samples pre-exposure to SARS-CoV-2 prevents conclusions regarding the abundance of cross-reactive coronavirus T cells before exposure to SARS-CoV-2 and any potential protective efficacy of such cells. Finally, full epitope mapping in the future will add important detailed resolution of the human coronavirus-specific T cell responses.

In sum, we measured SARS-CoV-2-specific CD4+ and CD8+ T cell responses in COVID-19 cases. Using multiple experimental approaches, SARS-CoV-2-specific CD4+ T cell and antibody responses were observed in all COVID-19 cases, and CD8+ T cell responses were observed in most. Importantly, pre-existing SARS-CoV-2-cross-reactive T cell responses were observed in healthy donors, indicating some potential for pre-existing immunity in the human population. ORF mapping of T cell specificities revealed valuable targets for incorporation in candidate vaccine development and revealed distinct specificity patterns between COVID-19 cases and unexposed healthy controls.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead Contact
  ◦ Materials Availability
  ◦ Data and Code Availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ◦ Human Subjects
• METHOD DETAILS
  ◦ Peptide Pools
  ◦ PBMC isolation
  ◦ SARS-CoV-2 RBD ELISA
  ◦ OC43 and NL63 coronavirus RBD ELISA
  ◦ Flow Cytometry
  ◦ Cytokine bead assays
  ◦ Identification of coronavirus epitopes and associated literature references
• QUANTIFICATION AND STATISTICAL ANALYSIS

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.015.

ACKNOWLEDGMENTS

We would like to thank Cheryl Kim, director of the LJI flow cytometry core facility for outstanding expertise. We thank Prof. Peter Kim, Abigail Powell, PhD, and colleagues (Stanford) for RBD protein synthesized from Prof. Florian Krammer (Mt. Sinai) constructs. J.M. was supported by PhD student fellowships from the Departamento Administrativo de Ciencia, Tecnologia e Innovacion (COLCIENCIAS), and Pontificia Universidad Javeriana. This work was funded by the NIH NIAID under awards AI42742 (Cooperative Centers for Human Immunology) (S.C. and A.S.), National Institutes of Health contract Nr. 75N9301900065 (A.S. and D.W.), and U19 AI118626 (A.S. and B.P.). The BD FACSymphony purchase was partially funded by the Bill and Melinda Gates Foundation and LJI Institutional Funds (S.C. and A.S.). This work was additionally supported in part by the Johnathan and Mary Tu Foundation (D.M.S.), the NIAID under K08 award AI135078 (J.D.), and UCSD T32s AI007036 and AI007384 Infectious Diseases Division (S.I.R. and S.A.R.).
Alshukairi, A.N., Zheng, J., Zhao, J., Nehdi, A., Baharoon, S.A., Layqah, L., Bokhari, A., Al Johani, S.M., Samman, N., Boudjelal, M., et al. (2018). High Prevalence of MERS-CoV Infection in Camel Workers in Saudi Arabia. MBio 9, e01985–e01918.
Amanat, F., and Krammer, F. (2020). SARS-CoV-2 Vaccines: Status Report. Immunity 52, 583–589.
Andrews, S.F., Huang, Y., Kaur, K., Popova, L.I., Ho, I.Y., Pauli, N.T., Henry Dunand, C.J., Taylor, W.M., Lim, S., Huang, M., et al. (2015). Immune history profoundly affects broadly protective B cell responses to influenza. Sci. Transl. Med. 7, 316ra192.
Bancroft, T., Dillon, M.B., da Silva Antunes, R., Paul, S., Peters, B., Crotty, S., Lindestam Arlehamn, C.S., and Sette, A. (2016). Th1 versus Th2 T cell polarization by whole-cell and acellular childhood pertussis vaccines persists upon re-immunization in adolescence and adulthood. Cell. Immunol. 304-305, 35–43.
Blanco-Melo, D., Nilsson-Payant, B.E., Liu, W.-C., Møller, R., Panis, M., Sachs, D., Albrecht, R.A., and tenOever, B.R. (2020). SARS-CoV-2 launches a unique transcriptional signature from in vitro, ex vivo, and in vivo systems. bioRxiv. https://doi.org/10.1101/2020.03.24.004655.
Callow, K.A., Parry, H.F., Sergeant, M., and Tyrrell, D.A. (1990). The time course of the immune response to experimental coronavirus infection of man. Epidemiol. Infect. 105, 435–446.
Cao, X. (2020). COVID-19: immunopathology and its implications for therapy. Nat. Rev. Immunol. 20, 269–270.
Carrasco Pro, S., Sidney, J., Paul, S., Lindestam Arlehamn, C., Weiskopf, D., Peters, B., and Sette, A. (2015). Automatic Generation of Validated Specific Epitope Sets. J. Immunol. Res. 2015, 763461.
Choe, P.G., Perera, R.A.P.M., Park, W.B., Song, K.H., Bang, J.H., Kim, E.S., Kim, H.B., Ko, L.W.R., Park, S.W., Kim, N.J., et al. (2017). MERS-CoV Antibody Responses 1 Year after Symptom Onset, South Korea, 2015. Emerg. Infect. Dis. 23, 1079–1084.
Crotty, S. (2019). T Follicular Helper Cell Biology: A Decade of Discovery and Diseases. Immunity 50, 1132–1148.
da Silva Antunes, R., Paul, S., Sidney, J., Weiskopf, D., Dan, J.M., Phillips, E., Mallal, S., Crotty, S., Sette, A., and Lindestam Arlehamn, C.S. (2017). Definition of Human Epitopes Recognized in Tetanus Toxoid and Development of an Assay Strategy to Detect Ex Vivo Tetanus CD4+ T Cell Responses. PLoS ONE 12, e0169086.
Dan, J.M., Lindestam Arlehamn, C.S., Weiskopf, D., da Silva Antunes, R., Havenar-Daughton, C., Reiss, S.M., Brigger, M., Bothwell, M., Sette, A., and Crotty, S. (2016). A Cytokine-Independent Approach To Identify Antigen-Specific Human Germinal Center T Follicular Helper Cells and Rare Antigen-Specific CD4+ T Cells in Blood. J. Immunol. 197, 983–993.
Gorse, G.J., Patel, G.B., Vitale, J.N., and O’Connor, T.Z. (2010). Prevalence of antibodies to four human coronaviruses is lower in nasal secretions than in serum. Clin. Vaccine Immunol. 17, 1875–1880.
Gostic, K.M., Ambrose, M., Worobey, M., and Lloyd-Smith, J.O. (2016). Potent protection against H5N1 and H7N9 influenza via childhood hemagglutinin imprinting. Science 354, 722–726.
Greenbaum, J.A., Kotturi, M.F., Kim, Y., Oseroff, C., Vaughan, K., Salimi, N., Vita, R., Ponomarenko, J., Scheuermann, R.H., Sette, A., and Peters, B. (2009). Pre-existing immunity against swine-origin H1N1 influenza viruses in the general human population. Proc. Natl. Acad. Sci. USA 106, 20365–20370.
Greenbaum, J., Sidney, J., Chung, J., Brander, C., Peters, B., and Sette, A. (2011). Functional classification of class II human leukocyte antigen (HLA) molecules reveals seven different supertypes and a surprising degree of repertoire sharing across supertypes. Immunogenetics 63, 325–335.
Grifoni, A., Angelo, M.A., Lopez, B., O’Rourke, P.H., Sidney, J., Cerpas, C., Balmaseda, A., Silveira, C.G.T., Maestri, A., Costa, P.R., et al. (2017). Global Assessment of Dengue Virus-Specific CD4+ T Cell Responses in Dengue-Endemic Areas. Front. Immunol. 8, 1309.
Grifoni, A., Sidney, J., Zhang, Y., Scheuermann, R.H., Peters, B., and Sette, A. (2020). A Sequence Homology and Bioinformatic Approach Can Predict Candidate Targets for Immune Responses to SARS-CoV-2. Cell Host Microbe 27, 671–680.
Guo, T., Fan, Y., Chen, M., Wu, X., Zhang, L., He, T., Wang, H., Wan, J., Wang, X., and Lu, Z. (2020a). Cardiovascular Implications of Fatal Outcomes of Patients With Coronavirus Disease 2019 (COVID-19). JAMA Cardiol. Published online March 27, 2020. https://doi.org/10.1001/jamacardio.2020.1017.
Guo, X., Guo, Z., Duan, C., Chen, Z., Wang, G., Lu, Y., Li, M., and Lu, J. (2020b). Long-Term Persistence of IgG Antibodies in SARS-CoV Infected Healthcare Workers. medRxiv. https://doi.org/10.1101/2020.02.12.20021386.
Hancock, K., Veguilla, V., Lu, X., Zhong, W., Butler, E.N., Sun, H., Liu, F., Dong, L., DeVos, J.R., Gargiullo, P.M., et al. (2009). Cross-reactive antibody responses to the 2009 pandemic H1N1 influenza virus. N. Engl. J. Med. 361, 1945–1952.
Havenar-Daughton, C., Reiss, S.M., Carnathan, D.G., Wu, J.E., Kendric, K., Torrents de la Peña, A., Kasturi, S.P., Dan, J.M., Bothwell, M., Sanders, R.W., et al. (2016). Cytokine-Independent Detection of Antigen-Specific Germinal Center T Follicular Helper Cells in Immunized Nonhuman Primates Using a Live Cell Activation-Induced Marker Technique. J. Immunol. 197, 994–1002.
Herati, R.S., Muselman, A., Vella, L., Bengsch, B., Parkhouse, K., Del Alcazar, D., Kotzin, J., Doyle, S.A., Tebas, P., Hensley, S.E., et al. (2017). Successive annual influenza vaccination induces a recurrent oligoclonotypic memory response in circulating T follicular helper cells. Sci Immunol 2, eaag2152.
Hinz, D., Seumois, G., Gholami, A.M., Greenbaum, J.A., Lane, J., White, B., Sallusto, F., Lanzavecchia, A., Araki, K., and Ahmed, R. (2010). From vaccines
Broide, D.H., Schulten, V., Sidney, J., Bakhru, P., et al. (2016). Lack of allergy to memory and back. Immunity 33, 451–463.
to timothy grass pollen is not a passive phenomenon but associated with the Sette, A., Moutaftsi, M., Moyron-Quiroz, J., McCausland, M.M., Davies, D.H.,
allergen-specific modulation of immune reactivity. Clin. Exp. Allergy 46, Johnston, R.J., Peters, B., Rafii-El-Idrissi Benhnia, M., Hoffmann, J., Su, H.P.,
705–719. et al. (2008). Selective CD4+ T cell help for antibody responses to a large viral
Huang, A.T., Garcia-Carreras, B., Hitchings, M.D.T., Yang, B., Katzelnick, L., pathogen: deterministic linkage of specificities. Immunity 28, 847–858.
Rattigan, S.M., Borgert, B., Moreno, C., Solomon, B.D., Rodriguez-Barraquer,
Severance, E.G., Bossis, I., Dickerson, F.B., Stallings, C.R., Origoni, A.E., Sul-
I., et al. (2020). A systematic review of antibody mediated immunity to
lens, A., Yolken, R.H., and Viscidi, R.P. (2008). Development of a nucleo-
coronaviruses: antibody kinetics, correlates of protection, and association
capsid-based human coronavirus immunoassay and estimates of individuals
of antibody responses with severity of disease. medRxiv,
exposed to coronavirus in a U.S. metropolitan population. Clin. Vaccine Immu-
2020.2004.2014.20065771.
nol. 15, 1805–1810.
Irigoyen, N., Firth, A.E., Jones, J.D., Chung, B.Y., Siddell, S.G., and Brierley, I.
Sidney, J., Peters, B., Frahm, N., Brander, C., and Sette, A. (2008). HLA class I
(2016). High-Resolution Analysis of Coronavirus Gene Expression by RNA
supertypes: a revised and updated classification. BMC Immunol. 9, 1.
Sequencing and Ribosome Profiling. PLoS Pathog. 12, e1005473.
Sidney, J., Steen, A., Moore, C., Ngo, S., Chung, J., Peters, B., and Sette, A.
Jurtz, V., Paul, S., Andreatta, M., Marcatili, P., Peters, B., and Nielsen, M.
(2010a). Divergent motifs but overlapping binding repertoires of six HLA-DQ
(2017). NetMHCpan-4.0: Improved Peptide-MHC Class I Interaction Predic-
molecules frequently expressed in the worldwide human population.
tions Integrating Eluted Ligand and Peptide Binding Affinity Data.
J. Immunol. 185, 4189–4198.
J. Immunol. 199, 3360–3368.
Sidney, J., Steen, A., Moore, C., Ngo, S., Chung, J., Peters, B., and Sette, A.
Kissler, S.M., Tedijanto, C., Goldstein, E., Grad, Y.H., and Lipsitch, M. (2020).
(2010b). Five HLA-DP molecules frequently expressed in the worldwide human
Projecting the transmission dynamics of SARS-CoV-2 through the postpan-
population share a common HLA supertypic binding specificity. J. Immunol.
demic period. Science, eabb5793.
184, 2492–2503.
Li, C.K., Wu, H., Yan, H., Ma, S., Wang, L., Zhang, M., Tang, X., Temperton,
Snijder, E.J., Bredenbeek, P.J., Dobbe, J.C., Thiel, V., Ziebuhr, J., Poon, L.L.,
N.J., Weiss, R.A., Brenchley, J.M., et al. (2008). T cell responses to whole
Guan, Y., Rozanov, M., Spaan, W.J., and Gorbalenya, A.E. (2003). Unique and
SARS coronavirus in humans. J. Immunol. 181, 5490–5500.
conserved features of genome and proteome of SARS-coronavirus, an early
Lindestam Arlehamn, C.S., McKinney, D.M., Carpenter, C., Paul, S., Rozot, V., split-off from the coronavirus group 2 lineage. J. Mol. Biol. 331, 991–1004.
Makgotlho, E., Gregg, Y., van Rooyen, M., Ernst, J.D., Hatherill, M., et al.
Southwood, S., Sidney, J., Kondo, A., del Guercio, M.F., Appella, E., Hoffman,
(2016). A Quantitative Analysis of Complexity of Human Pathogen-Specific
S., Kubo, R.T., Chesnut, R.W., Grey, H.M., and Sette, A. (1998). Several com-
CD4 T Cell Responses in Healthy M. tuberculosis Infected South Africans.
mon HLA-DR types share largely overlapping peptide binding repertoires.
PLoS Pathog. 12, e1005760.
J. Immunol. 160, 3363–3373.
Liu, L., Wei, Q., Lin, Q., Fang, J., Wang, H., Kwok, H., Tang, H., Nishiura, K., Peng, J., Tan, Z., et al. (2019). Anti-spike IgG causes severe acute lung injury by skewing macrophage responses during acute SARS-CoV infection. JCI Insight 4, 123158.
Morou, A., Brunet-Ratnasingham, E., Dubé, M., Charlebois, R., Mercier, E., Darko, S., Brassard, N., Nganou-Makamdop, K., Arumugam, S., Gendron-Lepage, G., et al. (2019). Altered differentiation is central to HIV-specific CD4+ T cell dysfunction in progressive disease. Nat. Immunol. 20, 1059–1070.
Moutaftsi, M., Tscharke, D.C., Vaughan, K., Koelle, D.M., Stern, L., Calvo-Calle, M., Ennis, F., Terajima, M., Sutter, G., Crotty, S., et al. (2010). Uncovering the interplay between CD8, CD4 and antibody responses to complex pathogens. Future Microbiol. 5, 221–239.
O’Sullivan, D., Arrhenius, T., Sidney, J., Del Guercio, M.F., Albertson, M., Wall, M., Oseroff, C., Southwood, S., Colón, S.M., Gaeta, F.C., et al. (1991). On the interaction of promiscuous antigenic peptides with different DR alleles. Identification of common structural motifs. J. Immunol. 147, 2663–2669.
Okba, N.M.A., Raj, V.S., Widjaja, I., GeurtsvanKessel, C.H., de Bruin, E., Chandler, F.D., Park, W.B., Kim, N.J., Farag, E.A.B.A., Al-Hajri, M., et al. (2019). Sensitive and Specific Detection of Low-Level Antibody Responses in Mild Middle East Respiratory Syndrome Coronavirus Infections. Emerg. Infect. Dis. 25, 1868–1877.
Paul, S., Lindestam Arlehamn, C.S., Scriba, T.J., Dillon, M.B., Oseroff, C., Hinz, D., McKinney, D.M., Carrasco Pro, S., Sidney, J., Peters, B., and Sette, A. (2015). Development and validation of a broad scheme for prediction of HLA class II restricted T cell epitopes. J. Immunol. Methods 422, 28–34.
Paul, S., Sidney, J., Sette, A., and Peters, B. (2016). TepiTool: A Pipeline for Computational Prediction of T Cell Epitope Candidates. Curr. Protoc. Immunol. 114, 18.19.1–18.19.24.
Peeples, L. (2020). News Feature: Avoiding pitfalls in the pursuit of a COVID-19 vaccine. Proc. Natl. Acad. Sci. USA 117, 8218–8221.
Reiss, S., Baxter, A.E., Cirelli, K.M., Dan, J.M., Morou, A., Daigneault, A., Brassard, N., Silvestri, G., Routy, J.P., Havenar-Daughton, C., et al. (2017). Comparative analysis of activation induced marker (AIM) assays for sensitive identification of antigen-specific CD4 T cells. PLoS ONE 12, e0186998.
Sridhar, S., Begom, S., Bermingham, A., Hoschler, K., Adamson, W., Carman, W., Bean, T., Barclay, W., Deeks, J.J., and Lalvani, A. (2013). Cellular immune correlates of protection against symptomatic pandemic influenza. Nat. Med. 19, 1305–1312.
Stadlbauer, D., Amanat, F., Chromikova, V., Jiang, K., Strohmeier, S., Arunkumar, G.A., Tan, J., Bhavsar, D., Capuano, C., Kirkpatrick, E., et al. (2020). SARS-CoV-2 Seroconversion in Humans: A Detailed Protocol for a Serological Assay, Antigen Production, and Test Setup. Curr. Protoc. Microbiol. 57, e100.
Takano, T., Kawakami, C., Yamada, S., Satoh, R., and Hohdatsu, T. (2008). Antibody-dependent enhancement occurs upon re-infection with the identical serotype virus in feline infectious peritonitis virus infection. J. Vet. Med. Sci. 70, 1315–1321.
Thanh Le, T., Andreadakis, Z., Kumar, A., Gómez Román, R., Tollefsen, S., Saville, M., and Mayhew, S. (2020). The COVID-19 vaccine development landscape. Nat. Rev. Drug Discov. 19, 305–306.
Tian, Y., Grifoni, A., Sette, A., and Weiskopf, D. (2019). Human T Cell Response to Dengue Virus Infection. Front. Immunol. 10, 2125.
Vennema, H., de Groot, R.J., Harbour, D.A., Dalderup, M., Gruffydd-Jones, T., Horzinek, M.C., and Spaan, W.J. (1990). Early death after feline infectious peritonitis virus challenge due to recombinant vaccinia virus immunization. J. Virol. 64, 1407–1409.
Vita, R., Mahajan, S., Overton, J.A., Dhanda, S.K., Martini, S., Cantrell, J.R., Wheeler, D.K., Sette, A., and Peters, B. (2019). The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 47 (D1), D339–D343.
Weiskopf, D., Angelo, M.A., de Azeredo, E.L., Sidney, J., Greenbaum, J.A., Fernando, A.N., Broadwater, A., Kolla, R.V., De Silva, A.D., de Silva, A.M., et al. (2013). Comprehensive analysis of dengue virus-specific responses supports an HLA-linked protective role for CD8+ T cells. Proc. Natl. Acad. Sci. USA 110, E2046–E2053.
Weiskopf, D., Cerpas, C., Angelo, M.A., Bangs, D.J., Sidney, J., Paul, S., Peters, B., Sanches, F.P., Silvera, C.G., Costa, P.R., et al. (2015). Human CD8+ T-Cell Responses Against the 4 Dengue Virus Serotypes Are Associated With Distinct Patterns of Protein Targets. J. Infect. Dis. 212, 1743–1751.
STAR★METHODS
Continued
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Critical Commercial Assays
CoronaCheck COVID-19 Rapid Antibody Test Kit | 20/20 BioResponse | https://coronachecktest.com/
Deposited Data
Wuhan-Hu-1 RNA isolate | NCBI nuccore database | GenBank: MN908947
ORF10 protein | NCBI protein database | NCBI: YP_009725255.1
Nucleocapsid phosphoprotein | NCBI protein database | NCBI: YP_009724397.2
ORF8 protein | NCBI protein database | NCBI: YP_009724396.1
ORF7a protein | NCBI protein database | NCBI: YP_009724395.1
ORF6 protein | NCBI protein database | NCBI: YP_009724394.1
membrane glycoprotein | NCBI protein database | NCBI: YP_009724393.1
envelope protein | NCBI protein database | NCBI: YP_009724392.1
ORF3a protein | NCBI protein database | NCBI: YP_009724391.1
surface glycoprotein | NCBI protein database | NCBI: YP_009724390.1
orf1ab polyprotein | NCBI protein database | NCBI: YP_009724389.1
Software and Algorithms
IEDB | Vita et al., 2019 | https://www.iedb.org
IEDB-AR (analysis resource) | Dhanda et al., 2019 | http://tools.iedb.org
NetMHCpan EL 4.0 | Jurtz et al., 2017 | http://tools.iedb.org/mhci/
Tepitool | Paul et al., 2016; Paul et al., 2015 | http://tools.iedb.org/tepitool/
FlowJo 10 | FlowJo | https://www.flowjo.com/
GraphPad Prism 8.4 | GraphPad | https://www.graphpad.com/
LEGENDplex v8.0 | Biolegend | https://www.biolegend.com/
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Dr. Alessandro Sette (alex@lji.org).
Materials Availability
Aliquots of synthesized sets of peptides utilized in this study will be made available upon request. There are restrictions to the availability of the peptide reagents due to cost and limited quantity.
Human Subjects
Healthy Unexposed Donors
Samples from healthy adult donors were obtained by the La Jolla Institute for Immunology (LJI) Clinical Core or provided by a
commercial vendor (Carter Blood Care) for prior, unrelated studies between early 2015 and early 2018. These samples were
considered to be from unexposed controls, given that SARS-CoV-2 emerged as a novel pathogen in late 2019, more than one
year after the collection of any of these samples. These donors were considered healthy in that they had no known history of
any significant systemic diseases, including, but not limited to, autoimmune disease, diabetes, kidney or liver disease, congestive
heart failure, malignancy, coagulopathy, hepatitis B or C, or HIV. An overview of the characteristics of these unexposed donors is
provided in Table 1.
The LJI Institutional Review Board approved the collection of these samples (LJI; VD-112). At the time of enrollment in the
initial studies, all individual donors provided informed consent that their samples could be used for future studies, including
this study.
Convalescent COVID-19 Donors
The Institutional Review Boards of the University of California, San Diego (UCSD; 200236X) and La Jolla Institute (LJI; VD-214)
approved blood draw protocols for convalescent donors. All human subjects were assessed for capacity using a standardized
and approved assessment. Subjects deemed to have capacity voluntarily gave informed consent prior to being enrolled in the study.
Individuals did not receive compensation for their participation.
Study inclusion criteria included subjects over the age of 18 years, regardless of disease severity, race, ethnicity, gender, pregnancy or nursing status, who were willing and able to provide informed consent, or with a legal guardian or representative willing and able to provide informed consent when the participant could not personally do so. Study exclusion criteria included lack of willingness or ability to provide informed consent, or lack of an appropriate legal guardian or representative to provide informed consent.
Blood from convalescent donors was obtained at a UC San Diego Health clinic. Blood was collected in acid citrate dextrose (ACD)
tubes and stored at room temperature prior to processing for PBMC isolation and plasma collection. A separate serum separator
tube (SST) was collected from each donor. Samples were de-identified prior to analysis. Other efforts to maintain the confidentiality
of participants included referring to specimens and other records via an assigned, coded identification number.
Prior to enrollment in the study, donors were asked to provide proof of positive testing for SARS-CoV-2, and screened for clinical history and/or epidemiological risk factors consistent with the World Health Organization (WHO) or Centers for Disease Control and Prevention (CDC) case definitions of COVID-19 or Persons Under Investigation (PUI) (https://www.who.int/emergencies/diseases/novel-coronavirus-2019/technical-guidance/surveillance-and-case-definitions, https://www.cdc.gov/coronavirus/2019-nCoV/hcp/clinical-criteria.html). Per CDC and WHO guidance, clinical features consistent with COVID-19 included subjective or measured fever, signs or symptoms of lower respiratory tract illness (e.g., cough or dyspnea). Epidemiologic risk factors included close contact with a laboratory-confirmed case of SARS-CoV-2 within 14 days of symptom onset or a history of travel to an area with a high rate of COVID-19 cases within 14 days of symptom onset.
Disease severity was defined as mild, moderate, severe or critical based on a modified version of the WHO interim guidance, “Clinical management of severe acute respiratory infection when COVID-19 is suspected” (WHO Reference Number: WHO/2019-nCoV/clinical/2020.4). Mild disease was defined as an uncomplicated upper respiratory tract infection (URI) with potential non-specific symptoms (e.g., fatigue, fever, cough with or without sputum production, anorexia, malaise, myalgia, sore throat, dyspnea, nasal congestion, headache; rarely diarrhea, nausea and vomiting) that did not require hospitalization. Moderate disease was defined as the presence of lower respiratory tract disease or pneumonia without the need for supplemental oxygen, without signs of severe pneumonia, or a URI requiring hospitalization (including observation admission status). Severe disease was defined as severe lower respiratory tract infection or pneumonia with fever plus any one of the following: tachypnea (respiratory rate > 30 breaths per minute), respiratory distress, or oxygen saturation less than 93% on room air. Critical disease was defined as the need for ICU admission or the presence of acute respiratory distress syndrome (ARDS), sepsis, or septic shock, as defined in the WHO guidance document.
Convalescent donors were screened for symptoms prior to scheduling blood draws, and had to be symptom-free and approximately 3 weeks out from symptom onset at the time of the initial blood draw. Following enrollment, whole blood from convalescent donors was run on a colloidal-gold immunochromatographic ‘lateral flow’ assay to evaluate for prior exposure to SARS-CoV-2. This assay detects IgM or IgG antibodies directed against recombinant SARS-CoV-2 antigen labeled with a colloidal gold tracer (20/20 BioResponse CoronaCheck). Ninety percent of convalescent donors tested positive for IgM or IgG to SARS-CoV-2 by this assay (Table 1).
Convalescent donors were California residents, who were either referred to the study by a health care provider or self-referred. The majority (75%) of donors had a known sick contact with COVID-19 or suspected exposure to SARS-CoV-2 (Table 1). The most common symptoms reported were cough, fatigue, fever, anosmia, and dyspnea. Seventy percent of donors experienced mild illness. Donors were asked to self-report any known medical illnesses. Of note, 65% of these individuals had no known underlying medical illnesses.
METHOD DETAILS
Peptide Pools
Epitope MegaPool (MP) design and preparation
SARS-CoV-2 virus-specific CD4 and CD8 peptides were synthesized as crude material (A&A, San Diego, CA), resuspended in DMSO, pooled and sequentially lyophilized as previously reported (Carrasco Pro et al., 2015). SARS-CoV-2 epitopes were predicted using the protein sequences derived from the SARS-CoV-2 reference (GenBank: MN908947) and the IEDB analysis resource as previously described (Dhanda et al., 2019; Grifoni et al., 2020). Specifically, CD4 SARS-CoV-2 epitope prediction was carried out using a previously described approach in the Tepitool resource in IEDB (Paul et al., 2015; Paul et al., 2016), to select peptides with median consensus percentile ≤ 20, similar to what was previously described, but removing the resulting spike glycoprotein epitopes from this prediction (CD4-R (remainder) “Non-spike” MP, n = 221). This approach takes advantage of the extensive cross-reactivity and repertoire overlap between different HLA class II loci and allelic variants to predict promiscuous epitopes, capable of binding across the most common HLA class II prototypic specificities (Greenbaum et al., 2011; O’Sullivan et al., 1991; Sidney et al., 2010a, b; Southwood et al., 1998). The algorithm utilizes predictions for seven common HLA-DR alleles (DRB1*03:01, DRB1*07:01, DRB1*15:01, DRB3*01:01, DRB3*02:02, DRB4*01:01 and DRB5*01:01) empirically determined to allow coverage of diverse populations and for different pathogens and antigen systems (Dhanda et al., 2018; Paul et al., 2015).
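The selection rule described above (a median consensus percentile of at most 20 across the seven HLA-DR alleles) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the input layout, and the peptide labels and rank values are hypothetical, and the per-allele percentile ranks are assumed to have been obtained elsewhere (for example, from the IEDB tools); it is not the Tepitool implementation.

```python
from statistics import median

# The seven HLA-DR alleles named in the text.
HLA_DR_ALLELES = (
    "DRB1*03:01", "DRB1*07:01", "DRB1*15:01", "DRB3*01:01",
    "DRB3*02:02", "DRB4*01:01", "DRB5*01:01",
)

def median_consensus_percentile(ranks_by_allele):
    """Median of one peptide's per-allele consensus percentile ranks."""
    return median(ranks_by_allele[allele] for allele in HLA_DR_ALLELES)

def select_promiscuous(peptides, cutoff=20.0):
    """Keep peptides whose median percentile rank is <= cutoff.

    `peptides` maps a peptide label to a dict of {allele: percentile rank};
    a lower rank means stronger predicted binding.
    """
    return [
        label for label, ranks in peptides.items()
        if median_consensus_percentile(ranks) <= cutoff
    ]

# Hypothetical ranks, for illustration only (not data from the study).
example = {
    "PEPTIDE_A": dict(zip(HLA_DR_ALLELES, [5, 12, 18, 25, 40, 8, 15])),
    "PEPTIDE_B": dict(zip(HLA_DR_ALLELES, [30, 45, 22, 60, 19, 80, 21])),
}
selected = select_promiscuous(example)
```

Using the median rather than the best single-allele rank favors peptides predicted to bind most of the alleles, which is the sense of "promiscuous" in the paragraph above.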
To investigate spike-specific CD4 T cells in depth, 15-mer peptides (overlapping by 10 amino acids) spanning the entire antigen were synthesized and pooled separately (CD4-S (spike) MP, n = 253).
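The overlapping-peptide design can be illustrated with a short sketch. It assumes a 5-residue step (15-mers overlapping by 10) and, as an assumption not stated in the text, a final window anchored at the C-terminus when the sequence length does not fit the step exactly. The function name is hypothetical and the placeholder sequences below are not real protein sequences.

```python
def overlapping_peptides(sequence, length=15, overlap=10):
    """Tile a protein sequence with fixed-length peptides.

    Consecutive peptides share `overlap` residues. If the last regular
    window does not reach the C-terminus, one extra window ending at the
    final residue is added (an assumption made for this sketch).
    """
    step = length - overlap
    if len(sequence) <= length:
        return [sequence]
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]

# Placeholder sequences of the right length (not real sequences).
spike_like = "A" * 1273   # length of the SARS-CoV-2 surface glycoprotein
n_spike_peptides = len(overlapping_peptides(spike_like))
```

Under these assumptions a 1273-residue protein yields 253 peptides, which agrees with the n = 253 reported for the spike pool.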
In the case of CD8 epitopes, since the overlap between different HLA class I allelic variants and loci is more limited to specific groups of alleles, or supertypes (Sidney et al., 2008), we targeted a set of the 12 most prominent HLA class I A and B alleles (A*01:01, A*02:01, A*03:01, A*11:01, A*23:01, A*24:02, B*07:02, B*08:01, B*35:01, B*40:01, B*44:02, B*44:03), which have been shown to allow broad coverage of the general population. CD8 SARS-CoV-2 epitope prediction was performed as previously reported, using the NetMHCpan EL 4.0 algorithm (Jurtz et al., 2017) for the 12 most frequent HLA alleles and selecting the top 1 percentile of predicted epitopes per HLA allele, clustered with nested/overlap reduction (Grifoni et al., 2020). The 628 predicted CD8 epitopes were split into two CD8 MPs containing 314 peptides each (CD8-A and CD8-B). The CMV MP is a pool of previously reported class I and class II epitopes (Carrasco Pro et al., 2015).
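A rough sketch of the per-allele selection and pool split follows. It assumes percentile ranks per allele are already available, takes "top 1 percentile" to mean a rank of at most 1.0, and omits the nested/overlap clustering step entirely. How peptides were actually assigned to CD8-A versus CD8-B is not described in the text, so the simple halving below, like the function names and peptide labels, is hypothetical.

```python
def select_top_percentile(ranks_by_allele, cutoff=1.0):
    """Union of peptides with percentile rank <= cutoff for at least one allele.

    `ranks_by_allele` maps allele -> {peptide: percentile rank}.
    Clustering and nested/overlap reduction are NOT performed here.
    """
    selected = set()
    for allele_ranks in ranks_by_allele.values():
        for peptide, rank in allele_ranks.items():
            if rank <= cutoff:
                selected.add(peptide)
    return sorted(selected)

def split_into_two_pools(peptides):
    """Split a peptide list into two halves (assignment rule is an assumption)."""
    half = len(peptides) // 2
    return peptides[:half], peptides[half:]

# Hypothetical ranks, for illustration only.
ranks = {
    "A*02:01": {"PEP1": 0.4, "PEP2": 3.0},
    "B*07:02": {"PEP2": 0.9, "PEP3": 1.5},
}
chosen = select_top_percentile(ranks)

# 628 placeholder labels split into two pools of 314, as in the text.
pool_a, pool_b = split_into_two_pools([f"PEP{i}" for i in range(628)])
```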
Protein peptide pools
In the case of the protein pools, peptides of 15 amino acid length overlapping by 10 spanning each entire protein sequence were
tested in a single MP (6-253 peptides per pool). Table S1 lists the number of peptides pooled for each of the viral proteins. Upon
request we are prepared to make these MP available to the scientific community for use in a diverse set of investigations.
PBMC isolation
For all samples, whole blood was collected in ACD tubes (COVID-19 donors) or heparin-coated blood bags (healthy unexposed donors). Whole blood was then centrifuged for 15 min at 1850 rpm to separate the cellular fraction and plasma. The plasma was then carefully removed from the cell pellet and stored at −20°C.
Peripheral blood mononuclear cells (PBMC) were isolated by density-gradient sedimentation using Ficoll-Paque (Lymphoprep, Nycomed Pharma, Oslo, Norway) as previously described (Weiskopf et al., 2013). Isolated PBMC were cryopreserved in cell recovery media containing 10% DMSO (GIBCO), supplemented, depending on the processing laboratory, with 10% heat-inactivated fetal bovine serum (FBS; Hyclone Laboratories, Logan, UT), and stored in liquid nitrogen until used in the assays.
Flow Cytometry
Direct ex vivo PBMC immune cell phenotyping
For the surface stain, 1 × 10⁶ PBMCs were resuspended in 100 μl PBS with 2% FBS (FACS buffer) and stained with antibody cocktail for 1 hour at 4°C in the dark. Following surface staining, cells were washed twice with FACS buffer. Cells were then fixed/permeabilized for 40 min at 4°C in the dark using the eBioscience FoxP3 transcription factor buffer kit (ThermoFisher Scientific, Waltham, MA). Following fixation/permeabilization, cells were washed twice with 1x permeabilization buffer, resuspended in 100 μl permeabilization buffer and stained with intracellular/intranuclear antibodies for 1 hour at 4°C in the dark. Samples were washed twice with 1x permeabilization buffer following staining. After the final wash, cells were resuspended in 200 μl FACS buffer. All samples were acquired on a BD FACSymphony cell sorter (BD Biosciences, San Diego, CA). A list of antibodies used in this panel can be found in Table S2.
T cell stimulations
For all flow cytometry assays of stimulated T cells, cryopreserved cells were thawed by diluting them in 10 mL complete RPMI 1640 with 5% human AB serum (Gemini Bioproducts) in the presence of benzonase [20 μl/10 mL]. All samples were acquired on a ZE5 Cell Analyzer (Bio-Rad Laboratories), and analyzed with FlowJo software (Tree Star, San Carlos, CA).
Activation induced cell marker assay
Cells were cultured for 24 hours in the presence of SARS-CoV-2 specific MPs [1 μg/ml] or 10 μg/mL PHA in 96-well U-bottom plates at 1 × 10⁶ PBMC per well. A stimulation with an equimolar amount of DMSO was performed as negative control; phytohemagglutinin (PHA, Roche, 1 μg/ml) and stimulation with a combined CD4 and CD8 cytomegalovirus MP (CMV, 1 μg/ml) were included as positive controls. Supernatants were harvested at 24 hours post-stimulation for multiplex detection of cytokines. Antibodies used in the AIM assay are listed in Table S4. AIM assays shown in Figures 2 and 3 and AIM assays shown in Figure 6 had five COVID-19 donors in common and nine Unexposed donors. Full raw data are listed in Table S6.
Intracellular cytokine staining assay
For the intracellular cytokine staining, PBMC were cultured in the presence of SARS-CoV-2 specific MPs [1 μg/ml] for 9 hours. Golgi-Plug containing brefeldin A (BD Biosciences, San Diego, CA) and monensin (Biolegend, San Diego, CA) were added 3 hours into the culture. Cells were then washed and surface stained for 30 minutes on ice, fixed with 1% paraformaldehyde (Sigma-Aldrich, St. Louis, MO) and kept at 4°C overnight. Antibodies used in the ICS assay are listed in Table S5. The gates applied for the identification of IFNγ, GzB, TNFα, or IL-10 production on the total population of CD8+ T cells were defined according to the cells cultured with DMSO for each individual.
Data and statistical analyses were done in FlowJo 10 and GraphPad Prism 8.4, unless otherwise stated. The statistical details of the experiments are provided in the respective figure legends. Data plotted in linear scale were expressed as Mean ± Standard Deviation (SD). Data plotted in logarithmic scales were expressed as Geometric Mean ± Geometric Standard Deviation (SD). Correlation analyses were performed using Spearman, while Mann-Whitney or Wilcoxon tests were applied for unpaired or paired comparisons, respectively. Details pertaining to significance are also noted in the respective legends. T cell data have been calculated as background subtracted data or stimulation index. Background subtracted data were derived by subtracting the percentage of AIM+ cells after DMSO stimulation from the percentage of AIM+ cells after SARS-CoV-2 stimulation. Stimulation Index was calculated instead by dividing the percentage of AIM+ cells after SARS-CoV-2 stimulation by the percentage of AIM+ cells derived from DMSO stimulation. If the AIM+ cells percentage after DMSO stimulation was equal to 0, the minimum value across each cohort was used. When two stimuli were combined together, the percentages of AIM+ cells after SARS-CoV-2 stimulation were combined, and twice the percentage of AIM+ cells derived from DMSO stimulation was either subtracted from the combined value or used to divide it. Additional data analysis techniques are described in the STAR Methods sections above.
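The two response measures can be written out as a small sketch. All function names are hypothetical. Two points are assumptions rather than statements from the text: the zero-replacement rule is read here as "substitute the smallest non-zero DMSO value in the cohort", and "combined" stimuli are read as a simple sum.

```python
def replace_zero_background(dmso_values):
    """Replace DMSO percentages equal to 0 with the cohort's smallest non-zero value."""
    nonzero = [v for v in dmso_values if v > 0]
    if not nonzero:
        raise ValueError("cohort has no non-zero DMSO value to substitute")
    floor = min(nonzero)
    return [v if v > 0 else floor for v in dmso_values]

def background_subtracted(stim_pct, dmso_pct):
    """Percentage of AIM+ cells after stimulation minus the DMSO background."""
    return stim_pct - dmso_pct

def stimulation_index(stim_pct, dmso_pct):
    """Fold change of AIM+ cells over the DMSO background (dmso_pct must be > 0)."""
    return stim_pct / dmso_pct

def combined_background_subtracted(stim_a, stim_b, dmso_pct):
    """Two stimuli summed, with the DMSO background subtracted twice."""
    return (stim_a + stim_b) - 2 * dmso_pct

def combined_stimulation_index(stim_a, stim_b, dmso_pct):
    """Two stimuli summed, divided by twice the DMSO background."""
    return (stim_a + stim_b) / (2 * dmso_pct)

# Hypothetical percentages, for illustration only.
cohort_dmso = replace_zero_background([0.0, 0.02, 0.05])
si = stimulation_index(0.5, 0.125)
```

Subtracting or dividing by twice the DMSO value keeps the combined measure comparable to the single-stimulus case, since each stimulated well carries its own background.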
Supplemental Figures
[Figure panels not recoverable from extraction. Surviving labels: (A–C) anti-RBD IgG, anti-RBD IgM, and anti-RBD IgA, axis OD450; (D–F) SARS-CoV-2 spike RBD IgM (AUC) and SARS-CoV-2 spike RBD IgG (AUC).]
Figure S3. SARS-CoV-2-Specific CD4+ T Cell Responses of Recovered COVID-19 Patients, Related to Figure 2
(A) Example flow cytometry gating strategy.
(B) FACS plot examples for controls. DMSO negative control, CMV positive control, PHA positive control.
(C) CMV-specific CD4+ T cells as percentage of AIM+ (OX40+CD137+) CD4+ T cells after stimulation of PBMCs with CMV peptide pool. Data were background subtracted against DMSO negative control and are shown with geometric mean and geometric standard deviation. Samples were from unexposed donors (“Unexposed,” n = 11) and recovered COVID-19 patients (“COVID-19,” n = 10).
(D) Spearman correlation of SARS-CoV-2 spike specific CD4+ T cells (AIM+ (OX40+CD137+) CD4+ T cells, background subtracted) after stimulation with spike
pool run on the same donors in two independent experiment series run on different dates. COVID-19 patient samples shown in blue. Unexposed donor samples
shown in black.
(E-F) Stimulation index quantitation of AIM+ (OX40+CD137+) CD4+ T cells; the same samples as in Figure 2 and Figure S3C were analyzed.
(G-H) Cytokine levels in the supernatants of AIM assays after stimulation with (G) Spike MP (MP_S), or (H) CD4-R (“Non-spike”). Data are shown in comparison to the negative control (DMSO), per donor.
(I-J) Cytokine production by CD4+ T cells in response to Non-spike (CD4-R MP) or Spike (MP_S) peptide pools (“CoV antigen (Ag)”) was confirmed by analyzing cytokine secretion from the subset of COVID-19 donors determined to have low or negative CD8+ T cell responses (< 0.1% by AIM) to the same peptide pool determined positive for SARS-CoV-2 specific CD4+ T cells by AIM. (I) IL-2. (J) IFNγ.
Statistical comparisons across cohorts were performed with the Mann-Whitney test, while paired sample comparisons were performed with the Wilcoxon test.
**p < 0.01; ***p < 0.001. ns not significant.
Figure S4. SARS-CoV-2-Specific CD8+ T Cell Responses of Recovered COVID-19 Patients, Related to Figure 3
(A) Flow cytometry gating strategy.
(B) SARS-CoV-2 specific CD8+ T cells as determined by AIM+ (CD69+CD137+) CD8+ T cells. Response of PBMCs from COVID-19 cases between the negative
control (DMSO) and antigen specific stimulation.
(C) CMV-specific CD8+ T cells as percentage of AIM+ (CD69+CD137+) CD8+ T cells after stimulation of PBMCs with CMV peptide pool. Data were background subtracted against DMSO negative control and are shown with geometric mean and geometric standard deviation. Samples were from unexposed donors (“Unexposed,” n = 11) and recovered COVID-19 patients (“COVID-19,” n = 10).
(D-E) Stimulation index quantitation of AIM+ (CD69+CD137+) CD8+ T cells; the same samples as in Figure 2 and Figure S4C were analyzed.
Statistical comparisons across cohorts were performed with the Mann-Whitney test, while paired sample comparisons were performed with the Wilcoxon test.
**p < 0.01; ***p < 0.001. ns not significant.
Figure S5. Correlations between SARS-CoV-2-Specific CD4+ T Cells, Antibodies, and CD8+ T Cells, Related to Figure 4
(A) Correlation between SARS-CoV-2 spike specific CD4+ T cells and anti-spike RBD IgG, using CD4+ T cell stimulation index.
(B) Correlation between SARS-CoV-2 non-spike specific CD4+ T cells and anti-spike RBD IgG, using CD4+ T cell stimulation index.
(C) Correlation between SARS-CoV-2 spike specific CD4+ T cells (%) and anti-spike RBD IgA.
(D) Correlation between SARS-CoV-2 spike specific CD4+ T cells (%) and anti-spike RBD IgA.
(E) Correlation between SARS-CoV-2 specific CD4+ T cells and SARS-CoV-2 specific CD8+ T cells, using stimulation index. Total MP responses per donor were used in each case (“Non-spike” + “spike” (CD4_R + MP_S) for CD4+ T cells, CD8-A + CD8-B for CD8+ T cells).
Statistical comparisons were performed using Spearman correlation.
Figure S6. Protein Immunodominance of SARS-CoV-2 Specific CD4+ T Cells in Recovered COVID-19 Patients and Unexposed Donors,
Related to Figure 6
(A) The same data as Figure 6B, but with each unexposed donor color coded.
(B) The same experiment as Figure 6B, but with SARS-CoV-2 specific CD4+ T cells measured as percentage of AIM+ (OX40+CD137+) CD4+ T cells, after
background subtraction. COVID-19 cases (top, in blue. n = 10) and unexposed donors (bottom, in white. n = 10).
(C) Correlation of SARS-CoV-2 specific CD4+ T cells detected using the epitope prediction approach (CD4_R MP) compared against the sum total of all antigen
pools of overlapping peptides (excluding spike), run with samples from the same donors in two different experiment series. Dotted line indicates 1:1 concordance.
Statistical comparison was performed using Spearman correlation.
Figure S7. Protein Immunodominance of SARS-CoV-2-Specific CD8+ T Cells in Recovered COVID-19 Patients and Unexposed Donors,
Related to Figure 6
(A) The same data as Figure 6D, but with each unexposed donor color coded.
(B) The same experiment as Figure 6D, but with SARS-CoV-2 specific CD8+ T cells measured as percentage of AIM+ (CD69+CD137+) CD8+ T cells, after
background subtraction. COVID-19 cases (top, in red. n = 10) and unexposed donors (bottom, in gray. n = 10).
[Graphical abstract not recoverable from extraction. Surviving labels: In-frame; Off-frame; Canonical viral proteins; Viral proteins with host-encoded extensions; Upstream Frankenstein ORF (UFO); Novel host-virus encoded proteins.]
Highlights
• A mechanism of hybrid gene birth is employed by many families of RNA viruses
Article
Hybrid Gene Origination Creates
Human-Virus Chimeric Proteins during Infection
Jessica Sook Yuin Ho,1,26 Matthew Angel,2,26 Yixuan Ma,1,26 Elizabeth Sloan,14,26 Guojun Wang,1,12,25
Carles Martinez-Romero,1,12,13 Marta Alenquer,15 Vladimir Roudko,6,7,8,9 Liliane Chung,16 Simin Zheng,1 Max Chang,4
Yesai Fstkchyan,1 Sara Clohisey,16 Adam M. Dinan,17 James Gibbs,2 Robert Gifford,14 Rong Shen,20 Quan Gu,14
Nerea Irigoyen,17 Laura Campisi,1 Cheng Huang,19 Nan Zhao,1 Joshua D. Jones,17,22 Ingeborg van Knippenberg,14,23
Zeyu Zhu,1 Natasha Moshkina,1 Léa Meyer,14 Justine Noel,1 Zuleyma Peralta,5 Veronica Rezelj,14,24 Robyn Kaake,3
Brad Rosenberg,1 Bo Wang,16 Jiajie Wei,2 Slobodan Paessler,19 Helen M. Wise,16 Jeffrey Johnson,1,3
Alessandro Vannini,20,21 Maria João Amorim,15 J. Kenneth Baillie,16 Emily R. Miraldi,10,11 Christopher Benner,4
Ian Brierley,17 Paul Digard,16 Marta Łuksza,5 Andrew E. Firth,17 Nevan Krogan,3 Benjamin D. Greenbaum,6,7,8,9
Megan K. MacLeod,18 Harm van Bakel,5 Adolfo García-Sastre,1,12,13 Jonathan W. Yewdell,2 Edward Hutchinson,14,27,*
and Ivan Marazzi1,12,27,28,*
1Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
2Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD 20892, USA
3Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
4Department of Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92037, USA
5Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
6Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
7Department of Medicine, Hematology and Medical Oncology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
8Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
9Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
10Divisions of Immunobiology and Biomedical Informatics, Cincinnati Children’s Hospital, Cincinnati, OH 45229, USA
11Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45257, USA
12Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
13Division of Infectious Diseases, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
14MRC-University of Glasgow Centre for Virus Research, Glasgow G61 1QH, UK
15Instituto Gulbenkian de Ciência, 2780-156 Oeiras, Portugal
16The Roslin Institute, University of Edinburgh, Edinburgh EH25 9PS, UK
17Division of Virology, Department of Pathology, University of Cambridge, Cambridge CB2 0SP, UK
18Centre for Immunobiology, Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow G12 8QQ, UK
19Department of Pathology, the University of Texas Medical Branch, Galveston, TX 77555, USA
20Division of Structural Biology, The Institute of Cancer Research, London SW7 3RP, UK
21Fondazione Human Technopole, Structural Biology Research Centre, 20157 Milan, Italy
22Present address: Infection Medicine, Edinburgh Medical School: Biomedical Sciences, University of Edinburgh, Edinburgh, UK
23Present address: Department of Learning and Teaching Enhancement, Sighthill Court, Edinburgh Napier University, Edinburgh, UK
24Present address: Viral Populations and Pathogenesis Unit, Department of Virology, Institut Pasteur, CNRS UMR 3569, Paris, France
25Present address: The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences,
SUMMARY
RNA viruses are a major human health threat. The life cycles of many highly pathogenic RNA viruses like influenza A virus (IAV) and Lassa virus depend on host mRNA, because viral polymerases cleave 5′-m7G-capped host transcripts to prime viral mRNA synthesis (“cap-snatching”). We hypothesized that start codons within cap-snatched host transcripts could generate chimeric human-viral mRNAs with coding potential. We report the existence of this mechanism of gene origination, which we named “start-snatching.” Depending on the reading frame, start-snatching allows the translation of host and viral “untranslated regions” (UTRs) to create N-terminally extended viral proteins or entirely novel polypeptides by genetic overprinting. We show that both types of chimeric proteins are made in IAV-infected cells, generate T cell responses, and contribute to virulence. Our results indicate that during infection with IAV, and likely a multitude of other human, animal and plant viruses, a host-dependent mechanism allows the genesis of hybrid genes.
Cell 181, 1502–1517, June 25, 2020 © 2020 The Authors. Published by Elsevier Inc.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
[Figure panels not recoverable from extraction. Surviving labels: mRNA schematic (cap, UTR, AUGGAA..., Met Glu ..., canonical viral protein); panels C and D; axes: CS length (nt), Segment (PB2, PB1, PA, HA, NP, NA, M, NS), Percentage of CS sequences, Percent of all unique CS sequences; legend: All CS Sequences, CS containing AUGs.]
segments predicted to make sizable products (>30 aa) (Figure 2B). These ORFs overlap with canonical viral genes but are read in different frames (overprinted). They range from over 40 residues (HA) to nearly 80 residues (PB1). Where N-terminal extensions of the major ORF were possible, these ranged from 8–21 aa in length (Figure 2B).
Thus, uvORFs are present in all genome segments and, if licensed by host-derived uAUG-containing RNAs, could generate polypeptides of varying length (Figure 2B).
Host-Virus mRNA Chimeras Associate with Elongating Ribosomes
If cap-snatched host uAUGs did initiate translation of viral 5′ UTRs, the 5′ termini of viral mRNAs would be bound by initiating ribosomes. We therefore performed ribosomal profiling of IAV-infected cells, in the presence of harringtonine, which blocks elongation of de novo assembled 80S initiation complexes but not of those already engaged in elongation. Ribosome-protected fragments (RPFs) were mapped to both the human and viral genomes (Figures 3A and S3A–S3C). Mapping of RPF sequences revealed an accumulation of ribosomes at the canonical initiation site in mRNAs transcribed from all eight genome segments (Figure 3B; main ORF AUG), consistent with previous reports (Machkovech et al., 2019). As well as observing ribosomes accumulating at the canonical initiation sites, we also observed RPFs mapping to the host-derived sequence upstream of the 5′ UTR, suggesting that translation initiated in this region (Figure 3B, insets). The total number of RPF reads mapping to host-derived sequences for each segment was 5%–20% of the reads mapping to the canonical start codon (Figure 3C), broadly consistent with the proportion of cap-snatched sequences containing uAUGs (Figures 1D and S1B).
[Figure panels not recoverable from extraction. Surviving labels: panels A, C, and D; mRNA 1, mRNA 2; per-segment plots (PA, HA, NP, NA, M, and NS segments named) of Normalized Read Counts versus Distance from first viral nucleotide, with insets labeled CS Reads and Canonical AUG; Percentage CS; conditions DMSO and Harringtonine; Segment axis PB2, PB1, PA, HA, NP, NA, M, NS.]
Precisely mapping initiation sites very close to the cap is chal- segment), and less frequent toward the 5′ end of the host-derived
lenging, because many of the heterogeneous 5′ mRNA ends would sequence (Figure S3D). As well as inferring upstream ribosome
be too short to extrude from the ribosome, making P-site phasing initiation by mapping RPFs to protected uAUGs, we could test for
problematic by standard Ribo-seq analysis. To address this, we it directly by comparing ribosomal profiles with and without harring-
used the location of AUGs within the RPF to identify the reading tonine arrest. Harringtonine increased the proportion of RPFs from
frame being translated. This suggested that initiation occurred in cap-snatched sequences that contained an AUG, indicating trans-
all three reading frames (Figure S3D). uAUG codons were more lation was initiating on uAUGs in these host-derived sequences
frequently close to the start of the viral UTR sequences, peaking (Figure 3D). Taken together, our data show that translation initiates
at the 2 position of mRNAs from all genome segments (numbered from cap-snatched host-derived uAUGs in viral mRNA chimeras,
from the first position in the coding sense of the viral genome albeit at lower frequencies than at canonical start codons.
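The AUG-based frame assignment and the harringtonine comparison described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the function names, the coordinate convention (first viral nucleotide at position 0, host-derived nucleotides at negative positions), and the example sequences are all assumptions made for the sketch.

```python
from typing import Iterable, List


def to_rna(seq: str) -> str:
    """Upper-case a sequence and write it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def find_uaugs(host_leader: str, viral_mrna: str) -> List[int]:
    """Positions of AUGs whose A lies in the host-derived leader.

    Assumed convention: the first viral nucleotide is position 0, so
    host-derived positions are negative. An AUG that straddles the
    host-virus junction is reported, because its A is host-derived.
    """
    mrna = to_rna(host_leader + viral_mrna)
    junction = len(host_leader)
    hits = []
    for i in range(len(mrna) - 2):
        if i < junction and mrna[i:i + 3] == "AUG":
            hits.append(i - junction)
    return hits


def frame_vs_main_orf(uaug_pos: int, main_aug_pos: int) -> int:
    """0 if the uAUG is in frame with the canonical AUG, else 1 or 2."""
    return (main_aug_pos - uaug_pos) % 3


def fraction_with_aug(leaders: Iterable[str]) -> float:
    """Fraction of host-derived leaders that contain an AUG.

    Simplification: only AUGs lying entirely within the leader count.
    """
    leaders = list(leaders)
    if not leaders:
        return 0.0
    return sum("AUG" in to_rna(s) for s in leaders) / len(leaders)


# Hypothetical leaders recovered from RPFs under two treatments.
dmso = ["GCAGUCA", "GGGCCC", "AAUGUU", "CCCC"]
harringtonine = ["GCAUGCA", "GGAUGC", "AAUGUU", "CCCC"]
enrichment = fraction_with_aug(harringtonine) - fraction_with_aug(dmso)
```

A positive `enrichment` in this toy setting corresponds to the observation in the text that harringtonine increases the proportion of cap-snatched RPFs containing an AUG.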
Host-Virus Protein Chimeras Are Expressed during I-restricted epitope of ovalbumin (Porgador et al., 1997). Based
Infection, Recognized by T Cells, and Affect Virulence on the uvORFs predicted from our in silico analyses, we inserted
To demonstrate that chimeric proteins are expressed during the epitope (OVAI; OVA 257-264; SL8; SIINFEKL) in frame with
infection, we performed mass spectrometry analyses of cell ly- the longest uvORF (PB1 frame 3 uvORF; PB1-UFO(SIIN) (Fig-
sates from infected cells. We also checked whether any chimeric ure 4C) and one of the shortest uvORFs (NS, frame 2 uvORF;
proteins could be integrated into viral progeny by analyzing pu- NS-UFO(SIIN) (Figure 4D). In the case of PB1 segment, we inte-
rified virions (Figures 4A, S4A, and S4B). grated sequences encoding OVAI directly into the UTR, placing
There are limitations to this approach, as the likelihood of a the epitope within the uvORF encoding PB1-UFO (Figure 4C, top
tryptic digest generating peptides that can be detected by the panels). For the NS segment, we used synonymous mutations in
mass spectrometer is lower for short proteins. This issue the canonical viral gene to delete five naturally occurring stop co-
reduces the chance of finding peptides derived from small over- dons in the uvORF; we then inserted OVAI into the extended
printed uvORFs (<30 aa), or that map to short N-terminal exten- uvORF, positioning the insertion in a flexible ‘‘linker’’ region of
sions. Nevertheless, we were able to identify at least 2 distinct the major viral gene NS1 (Thulasi Raman and Zhou, 2016). This
peptides that were derived from the two long overprinted uvORFs genetic configuration was chosen to ascertain whether uvORFs
in the PB1 and PB2 segments, which we named PB1-UFO and are translated by default provided that they are not interrupted by
PB2-UFO, respectively (for ‘‘Upstream Frankenstein ORF’’). In stop codons (Figure 4D, top panels).
addition, we detected a UTR-encoded N-terminal extension of Mouse DC2.4 cells infected with PB1-UFO(SIIN) activated trans-
NP, which we named NP-extension (NP-ext) (Figures 4A, 4B, genic OT-I CD8+ T cells (that are highly specific for mouse H-2 Kb
S4A, and S4B; Table S2A). Peptides from all three proteins were class I molecule complexed with SIINFEKL; Kb-SIIN) (Hogquist
present in PR8 IAV infected cell lysates (Figures 4B, left panels, et al., 1994) as determined by upregulation of CD25 and CD69 (Fig-
and S4A; Table S2A). These novel viral peptides were not de- ure 4C, lower panels). Recombinant IAV expressing SIIN(PB1-Ub-
tected in uninfected controls (Figure S4A). We were also able to SIIN) at high levels (Wei et al., 2019) was used as a positive control
identify peptides derived from the PB1-UFO protein when we (Figure 4C, right panels). No upregulation of CD25 and CD69 was
re-analyzed three previously published proteomic datasets of observed in mock treated samples. Similar results were obtained
IAV infection (Heaton et al., 2016) (Figure S4C; Table S2C). Only with the NS-UFO(SIIN) virus. Here, OT-I CD8+ T cells were acti-
NP-ext was detected in virions (Figure S4B; Table S2B), presum- vated when incubated with bone marrow-derived dendritic cells
ably because influenza virions specifically package hundreds of (BMDCs) infected with the NS-UFO(SIIN) virus (Figure 4D, right
copies of NP, while there is no known mechanism to specifically panels). This was comparable to the activation seen in a control
package other uvORF-encoded proteins (Hutchinson et al., 2014). experiment using a virus in which OVAI was inserted into the
Quantification of the PB1-UFO, PB2-UFO, and NP-ext pro- stem of the viral NA protein (NA-SIIN) (Figure 4D, middle panels)
teins indicated that, although they are less abundant than the (Bottermann et al., 2018). Again, no upregulation was observed
major viral proteins, they are expressed at detectable levels during mock infection. Taken together, our data with both the
within an infected cell. When quantified, tryptic peptides from PB1-UFO(SIIN) and the NS-SIIN viruses indicate that, unless
these proteins were found between the 20th and 40th percentile blocked by stop codons, uvORFs are translated and expressed
of normalized peptide intensities, including both host and viral during infection, and T cell immunosurveillance extends to pep-
proteins, within our samples (Figures S4A and S4B). Taken tides encoded by uvORFs.
together, our data show that N-terminal extensions and over- Next, to probe if the expression of chimeric host-viral proteins
printed uvORFs are synthesized during IAV infection and are pre- has an impact on viral pathogenesis, we generated a battery of
sent at a moderate abundance within infected cells. recombinant viruses, in which specific N-terminal extensions
We next asked whether chimeric host-viral proteins could or uvORFs were knocked out through the introduction of
be recognized by the host’s immune system. To test this, premature stop codons (NP-Δext and UFOΔ, respectively). The
we created modified IAVs containing insertions of a class viruses were generated either in the PR8 (Figures 4E, 4F, and
Figure 4. uvORFs Are Expressed during Infection and Can Contribute to Virulence
(A) The number of upstream viral open reading frames (uvORFs) that could be translated for each segment of the IAV genome (empty circles), highlighting those
detected in infected cell lysates by mass spectrometry (filled red circles).
(B) Tryptic peptides that map to translated uvORFs, detected by mass spectrometry across multiple experiments (summarizing data in Figures S4A and S4C).
(C) Schematic showing the generation of the PB1-UFO(SIIN) virus. DC2.4 cells were infected with the indicated viruses and co-cultured with OT-I CD8+ T cells.
OT-I activation, assessed by CD69 and CD25 expression, was assayed by flow cytometry at 24 h post co-culture. vmRNA, viral mRNA.
(D) Schematic showing the generation of the NS-SIIN virus. Red bars indicate stop codons mutated to permit uninterrupted NS1-UFO translation. Mouse BMDC
cells were incubated with IAV antigen presentations, and co-cultured with OT-I CD8+ T cells. OT-I activation, assessed by CD69 and CD25 expression, was
assayed by flow cytometry at 24 h post co-culture.
(E) Upper panel: schematic showing mutations that truncate NP-ext (NP-ΔEXT) and control mutations (NP-SYN), as engineered into the IAV PR8. Wild-type PR8 is
also shown. Lower panel: weight loss and survival curves of 6- to 8-week-old BALB/c mice infected with 15 plaque-forming units (PFU)/mouse of the indicated
viruses. Data are an aggregate of 2 independent experiments of n = 3 mice, using 2 independently plaque purified clones of the NP-ΔEXT or PR8;NP-SYN viruses
(total n = 6/condition). *p < 0.05; data are shown as the mean ± SEM.
(F) Upper panel: schematic showing mutations that knocked out PB1-UFO (PB1-UFOΔ) and control mutations (PB1-UFOSYN), as engineered into the IAV PR8.
Wild-type PR8 is also shown. Lower panel: weight loss and survival curves of 6- to 8-week-old BALB/c mice infected with the indicated dose (per mouse) of the
indicated viruses. n = 10 mice/condition. *p < 0.05. Data are shown as the mean ± SEM.
[Figure 5 (plots only; no caption survives in this span).
(A) Sequence logo (bits) and percent conservation across PB1-UFO amino acid positions 1–77, colored by residue chemistry (acidic, basic, hydrophobic, neutral, polar).
(C) Propagator model analysis in PB1-UFO. g(X) quantifies the relative likelihood that a test class of mutations reaches frequency > X when compared to a neutral class of mutations: $g(X) = G(X) / G_0(X)$, where G(X) is the chance that a class of mutation reaches frequency X and G_0(X) is the chance that a neutral class of mutations reaches the same frequency X. The neutral class is defined as synonymous mutations that occur in the PB1 reading frame and do not overlap with PB1-UFO. Positive selection: mutation class likely to be propagated. g(X) = 1: neutral/heterogeneous selection. Negative selection: mutation class not likely to be propagated.
(D–F) g(X) plotted against frequency X for test and neutral mutation classes defined by codon positions (R1, R2, R3) in the PB1-UFO and PB1 frames.
(G and H) Position of epitopes within the amino acid sequence of PB1-UFO (positions 0–77), number of unique HLA-epitope pairs (Kd < 500 nM), and percent identity across unique PB1 sequences, for HLA-B5801, HLA-B4001, HLA-B3901, HLA-B2705, HLA-B1501, HLA-B0801, HLA-B0702, HLA-A2601, HLA-A2402, and HLA-A0301.
Plotted values are not recoverable.]
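The propagator ratio defined in the Figure 5C panel text, g(X) = G(X)/G0(X), can be illustrated with a short Python sketch. This is a deliberately simplified reading of the definition, not the published frequency propagator method: it assumes each mutation is summarized by the maximum frequency it reaches in the strain tree, and the function names, the tolerance `tol`, and the example frequencies are hypothetical.

```python
import math
from typing import Sequence


def reach_probability(max_freqs: Sequence[float], x: float) -> float:
    """G(X): fraction of mutations in a class that reach frequency > x."""
    if not max_freqs:
        raise ValueError("empty mutation class")
    return sum(f > x for f in max_freqs) / len(max_freqs)


def propagator_ratio(test: Sequence[float],
                     neutral: Sequence[float],
                     x: float) -> float:
    """g(X) = G(X) / G0(X); NaN when no neutral mutation reaches x."""
    g0 = reach_probability(neutral, x)
    if g0 == 0.0:
        return math.nan
    return reach_probability(test, x) / g0


def classify(g: float, tol: float = 0.1) -> str:
    """Label g relative to 1; tol is an arbitrary illustrative band."""
    if math.isnan(g):
        return "undefined"
    if g > 1 + tol:
        return "positive selection"
    if g < 1 - tol:
        return "negative selection"
    return "neutral/heterogeneous"


# Hypothetical maximum frequencies reached by each mutation.
test_class = [0.05, 0.10, 0.20, 0.60]
neutral_class = [0.05, 0.30, 0.50, 0.90]
g = propagator_ratio(test_class, neutral_class, 0.25)
label = classify(g)
```

With these made-up numbers, one of four test mutations and three of four neutral mutations exceed a frequency of 0.25, so g is one third and the class is labeled as under negative selection.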
S4D), A/WSN/33(H1N1) (WSN) (Figure S4E), or mouse-adapted Chimeric Host-IAV Proteins Are Conserved
A/California/04/2009(H1N1) (Cal09) (Figure S4F) backgrounds. We next asked if NP-ext and PB1-UFO are conserved across
We also generated the reciprocal control viruses carrying synon- different strains. The ability to express NP-ext without interrup-
ymous mutations (NPSYN;UFOSYN). Both genomic configurations tion by stop codons in the 5′ UTR was maintained in 99.9% of
of control and knockout viruses maintained intact the canonical IAV isolates present in the NCBI Influenza database (Zhang
viral ORFs (Table S3). et al., 2017) (Figures S2, S5A, and S5B). Sequence analysis of
The mutant viruses did not display gross alterations in viral the translated 5′ UTR also suggested that N-terminally extended
growth in vitro (Figures S4D–S4F). This was independent of sequences would be similar within IAV subtypes (Figure S5C).
viral background and also of the cell type infected (Figures There are many reasons why these sequences are conserved,
S4D–S4F). To determine if interrupting upstream translation including constraints imposed by RNA structure and the require-
had effects in vivo, we focused on the NP-Δext and PB1- ment to interact with the viral polymerase complex (Fodor, 2013).
UFOΔ viruses in the PR8 background. The strategy used to Whatever the primary selective pressure, the result of the con-
generate these viruses is shown in the top panels of Figures servation of the 5′ UTR sequence is that the ability to express
4E and 4F. NP-ext is nearly universal among IAV strains.
We found that the NP-Δext viruses were less virulent in mice The ability to express PB1-UFO requires not only a lack of stop
compared to the control NP-SYN viruses (Figure 4E), suggest- codons in the appropriate frame of the 5′ UTR, but also the main-
ing that NP-ext expression contributes to virulence. A similar tenance of a uvORF overprinted on the canonical PB1 ORF. We
role for NP-ext was recently proposed for the pandemic first analyzed sequences of the IAV subtypes H1N1, H3N2, and
2009 IAV (pdm2009) strain, in which an extended NP protein H5N1. We found that PB1-UFO is conserved within each of these
was found to contribute to virulence in mice and pigs (Wise three virus subtypes (Figure 5A), and stop codons resulting in
et al., 2019). Importantly, however, pdm2009 viruses translate PB1-UFO proteins <77 aa long were infrequent (Figure 5B).
NP-ext from a uAUG encoded in the 5′ UTR of NP, but no cor- To understand the factors that contribute to the maintenance
responding uAUG is encoded by the PR8 virus used in of PB1-UFO ORF length and amino acid sequence composition
our study. within the IAV, we first looked at the probability that an ORF
The PB1-UFOΔ viruses displayed increased virulence when similar in length to PB1-UFO could have arisen stochastically
compared to the PB1-UFOSYN viruses in vivo, although in this in the IAV PB1 segment. We used a sequence randomization
case an effect was only observed at high infectious doses (Fig- model (Figure S5D) on the H3N2 subtype of IAV, the subtype
ure 4F). Gene expression analyses suggested that there were for which the greatest number of complete sequences were
distinct transcriptomic signatures in the lungs of mice infected available. We found that 77% of the sequences in the NCBI
with high doses of the PB1-UFOΔ or PB1-UFOSYN viruses Influenza database (Zhang et al., 2017) encoded a 77-aa PB1-
(Figures S4G and S4H; Table S4A). Gene Ontology analysis of UFO (Figure S5E) that is significantly longer than the 15–30
differentially expressed genes indicated changes in a number aa long ORFs expected by chance (Figures S5E–S5G). We
of pathways, including leukocyte activation and pro-inflamma- also found that these predicted ORFs would require multiple
tory cytokine secretion (Figure S4I; Table S5). Immune cell dys- (30–70) additional synonymous mutations in order to generate
regulation may therefore be at least partially responsible for the an ORF that is of similar length to PB1-UFO (Figure S5H).
differences in morbidity and mortality during infection with the The above analysis does not take into account constraints
PB1-UFOΔ or PB1-UFOSYN viruses. imposed by nucleotide biases in the viral UTR or canonical
Together, these functional data show that uvORFs are ex- PB1 ORF or from viral RNA structure. To examine their roles in
pressed during IAV infections, can be detected by the adap- the maintenance of the PB1-UFO ORF we used the frequency
tive immune system, and can modulate the severity of propagator method (Luksza and Lässig, 2014; Strelkowa and
infection. Lässig, 2012) (Figures 5C and S6A). This method can determine
if these factors imposed constraints on the PB1-UFO amino acid LASV genomes comprise two ambisense segments. The median
sequence. The model and its possible outcomes are shown and cap-snatched length of LASV mRNAs was seven nucleotides
discussed in detail in Figures 5C and S6A and the STAR (Figure S7C) in agreement with structural prediction of the
Methods. LASV polymerase (Wallat et al., 2014). Sequence analysis indi-
Briefly, mutations that occur in the viral UTR region, which cates that these uAUGs could lead to the translation of N-termi-
encodes the N-terminal part of PB1-UFO, undergo negative se- nal extensions of the GPC protein, as well as the formation of two
lection (Figure 5D; g < 1). This indicates that mutations in the viral overprinted new ORFs of 50 and 80 aa from the viral mRNAs
UTR, should they occur, have a low probability of being propa- encoding the nucleoprotein (N) and Z proteins of LASV (Fig-
gated down the IAV strain tree. On the other hand, when we ure 6C, 6D, and S7D). The proportions of uAUGs detected in
consider the nucleotide sequences that encode the overlapping cap-snatched sequences from IBV and LASV were dependent
regions of PB1-UFO and the canonical PB1 ORF, we see that on viral segments and ranged between 4% and 12% (Table S7).
there is heterogeneous/neutral selection occurring on mutations We also tested the hypothesis that translation of UTR-derived
in the PB1-UFO ORF (g ≈ 1). This is most likely shaped by the sequences could occur in other sNSVs by using minireplicon as-
requirement to maintain the main PB1 ORF sequence, as muta- says encoding a luciferase reporter to a member of the Phenui-
tions that maintain the PB1 ORF aa sequence (synonymous mu- viridae (Heartland banyangvirus; L segment UTRs). By mutating
tations in PB1 ORF) are more likely to be fixed in the population the canonical AUG, we identified low but readily detectable
(Figure 5E; red line; g > 1). Mutations that change the PB1 amino levels of upstream translation (Figure S7E).
acid sequence instead undergo negative selection (Figure 5F; Overall, these data suggest that generation of chimeric virus-
blue line; g < 1) and are unlikely to be propagated down the strain host ORFs is a common feature of sNSVs. To quantify the poten-
tree, consistent with PB1 ORF being fixed and essential for IAV. tial pervasiveness of this mechanism and the likelihood of novel
Selection in these regions is unlikely to be dominated by RNA ORFs being conserved and functionalized into new genes, we
structural constraints because similar effects are observed when analyzed RNA virus genomic sequences for their propensity to
RNA secondary structure is taken into account for our analysis generate novel proteins by performing in silico analyses of their
(Figures S6B–S6D). Overall, our analyses suggest that PB1- genomes. Although the exact levels of upstream translation will
UFO conservation is largely dictated by the need to preserve depend on a range of factors, including the intrinsic properties
both the viral UTR nucleotide sequence and the amino acid of viral polymerase complexes and, potentially, mechanisms
sequence of the main PB1 ORF. Taken together, this suggests that modulate upstream AUG translation, our results indicate
that the evolution of the PB1-UFO ORF is heavily constrained the genomic potential of start-snatching (Figure 7). Given that
by converging selective pressures. viral mRNA and proteins are among the most highly expressed
Because we had shown that peptides derived from PB1-UFO biotypes in infected cells, our data support the idea that all
could be presented to the immune system (Figures 4C and 4D), cap-snatching viruses could expand their proteome by start-
we asked whether epitope-HLA class I interactions could play a snatching uAUGs from their hosts.
role in shaping PB1-UFO sequence. We found that multiple
unique PB1-UFO peptides were predicted to bind to and interact DISCUSSION
with various HLA types (Figure 5G; Table S6). Notably, high-affin-
ity (<500 nM) HLA-epitope pairs were concentrated in regions of In this manuscript, we describe the existence of a mechanism em-
PB1-UFO where conservation was low, suggesting that immune ployed by sNSVs to generate chimeric host-virus genes. This
pressure on PB1-UFO may lead to some diversifying selection mechanism, ‘‘start-snatching,’’ involves the co-opting of start co-
on the protein (Figure 5H). dons from host mRNA sequences to expand the viral proteome.
This mechanism appears to be accessible to all sNSVs, including
Chimeric Host-Virus Proteins of Other Viruses major human pathogens such as IAV and LASV. Start-snatching
Finally, we asked whether our finding that start-snatching gener- allows the translation of proteins from cryptic uvORFs, either as
ates novel ORFs could be generalized from IAV to other sNSVs. canonical viral proteins with N-terminal extensions, or as UFO pro-
We began by looking at another member of the Orthomyxoviri- teins overprinted on the canonical viral ORF. In this study, we have
dae family, influenza B virus (IBV), by performing DEFEND-seq identified examples of both types of uvORF in IAV infections. We
on A549 cells infected with IBV. The host-derived sequences have shown that translation can initiate on uAUGs in the host-
that IBV obtains by cap-snatching had comparable median derived sequence of viral mRNAs, and that this leads to the
lengths to those appropriated by IAV (Figure S7A). Sequence expression of chimeric host-virus proteins that can be detected
analysis indicates that uAUG-initiated translation could read in infected cells. In our hands, the ablation of uvORFs did not
through the 5′ UTR of every IBV genome segment in at least impact viral replication in vitro but had a moderate effect in vivo,
one frame and predicted at least two long overprinted new which would be consistent with uvORFs encoding accessory pro-
ORFs (PA and NA segments) (Figures 6A and 6B), as well as teins. We found that uvORFs can be recognized by the immune
N-terminal extensions of six of the eight major viral proteins (Fig- system, and we modeled the contribution of different evolutionary
ures 6A and S7B). forces at play on uvORFs by characterizing viral-intrinsic and host-
Next, we looked at other families of sNSVs. We performed immune features that contribute to their evolution. Finally, we
CAGE analysis on cells infected by Lassa virus (LASV), a member showed experimentally and by sequence analysis that the capa-
of the family Arenaviridae and an emerging virus that in the past bility to express uvORFs through start-snatching is widespread
decade has caused several epidemics of hemorrhagic fever. among the sNSVs.
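The in silico step used above for IBV and LASV (asking, frame by frame, whether translation started upstream could read through the 5′ UTR) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: it assumes that a host-derived uAUG exists in each frame, counts codons from the first viral nucleotide only, and uses a hypothetical function name and a made-up toy sequence.

```python
from typing import Dict, Tuple

STOP_CODONS = {"UAA", "UAG", "UGA"}


def scan_frames(viral_mrna: str, main_aug: int) -> Dict[int, Tuple[str, int]]:
    """Classify read-through from the 5' end in each of the three frames.

    viral_mrna: viral mRNA sequence starting at the first viral nucleotide.
    main_aug:   0-based index of the A of the canonical start codon.
    Returns, per frame, a label and a length in codons:
      - "N-terminal extension": frame matches the canonical ORF and no stop
        codon occurs before the canonical AUG (length = added codons);
      - "uvORF": any other frame, or an in-frame reading terminated within
        the UTR (length = codons read before the first stop).
    """
    seq = viral_mrna.upper().replace("T", "U")
    results = {}
    for frame in range(3):
        codons = 0
        stop_at = None
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                stop_at = i
                break
            codons += 1
        in_frame = (main_aug - frame) % 3 == 0
        if in_frame and (stop_at is None or stop_at > main_aug):
            results[frame] = ("N-terminal extension", (main_aug - frame) // 3)
        else:
            results[frame] = ("uvORF", codons)
    return results


# Toy sequence: 6-nt UTR, then AUG GCC AAA UAA.
toy = "AGCAAAAUGGCCAAAUAA"
frames = scan_frames(toy, main_aug=6)
```

In the toy sequence, frame 0 reads through the UTR into the canonical ORF and yields a two-codon extension, while frames 1 and 2 give out-of-frame readings that would correspond to overprinted uvORFs if a host-derived uAUG were supplied in those frames.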
Chimeric mRNAs Encode Novel Viral Proteins Furthermore, uvORFs are translated in at least three of the eight
We hypothesized that cap-snatching of sNSVs could generate IAV genome segments, generating NP-ext, PB2-UFO, and PB1-
ORFs that are encoded by two genomes (human and virus). UFO (Figures 2 and 4). Genetic evidence suggested that many
Consistent with this, our analyses indicate that roughly 10% of other uvORF proteins are also likely to be expressed, although
IAV mRNA contains host-derived uAUGs (Figures 1D and S1B). we did not detect them in our current study, potentially due to
Conservation and Function of uvORFs Gene Origination through Overprinting and the Mis-
Our analysis shows that most sNSV infections lead to expression naming of "UTRs"
of chimeric genes and uvORFs. Because they are host and virally Genetic overprinting typically occurs when a pre-existing
encoded, it is therefore reasonable to ask who benefits from their reading frame acquires mutations that enable translation in alter-
expression. Key considerations in this regard, and based on our native reading frames while maintaining the function of the
analysis are: ancestral frame. This is an important mechanism for the creation
of new proteins, especially in the context of compact genomes
(1) Epitopes encoded in uvORFs are recognized by the adap- (viral, prokaryotic, and eukaryotic organelles) with little coding
tive immune system. MHC I presentation of uvORF- capacity (Keese and Gibbs, 1992; Kovacs et al., 2010; Poulin
derived peptides poses the risk of an adaptive immune et al., 2003; Sabath et al., 2012).
response against cells infected with sNSVs, analogous While genetic overprinting could be selectively advantageous
to the risks posed to IAV by the presentation of alternative for some organisms, the evolution of overprinted genes is prob-
reading frames (ARFs) and defective ribosomal products lematic. Any evolution of the overprinted ORF will be constrained
(DRiPs) (Dolan et al., 2010; Wei et al., 2019; Wei and Yew- by the effects of mutations in the underlying ORF. In addition, es-
dell, 2017, 2019; Zanker et al., 2019). Indeed, the risks tablished overlapping ORFs typically have dedicated mecha-
posed to the virus by the presentation of uvORFs are nisms for their expression, such as ribosomal scanning or frame-
potentially even higher due to the high conservation of shifting, which allow for efficient and regulated expression
these sequences. patterns. Exploring the limited evolutionary space that satisfies
(2) Two uvORFs considered here (NP-ext and PB1-UFO) are all of these constraints presumably requires the overprinted
both highly conserved across multiple strains of IAV. gene to provide a strong selective advantage.
However, merely assessing conservation is insufficient, Start-snatching exposes the 5′ coding regions of sNSV ge-
as other forms of selection also act on IAV genome se- nomes to low levels of non-specific out-of-frame translation.
quences. In particular, genome packaging signals in the This ‘‘genetic feature’’ could facilitate the evolution of novel
primary RNA sequence are concentrated in the terminal genes through genetic overprinting, without having to evolve a
regions of each genome segment (Dadonaite et al., dedicated method to express an overprinted ORF before that
2019; Gog et al., 2007; Hutchinson et al., 2010), resulting ORF could provide a selective advantage.
in a suppression of synonymous codon usage (Gog et al., A similar argument applies to the evolution of alternative up-
2007; Jagger et al., 2012). In overprinted regions, like stream translation mechanisms for N-terminally extended pro-
PB1-UFO, there is also selective pressure conferred by teins: if an N-terminal extension provided by start-snatching was
the sequence encoding the canonical ORF. We observe selectively advantageous, the virus could evolve to directly
encode an uAUG in the UTR and make the generation of extended protein host-independent and heritable. In this respect, it is interesting to note that some recent strains of IAV have evolved to encode a uAUG in the UTR of NP that allows it to express an N-terminally extended protein that can modulate virulence (Wise et al., 2019). In essence, start-snatching might simply be a way to increase the chances of UTR translation by outsourcing uAUG to non-viral genomic material.
The translation of 5′ UTRs (that implies their misnaming) occurs frequently in eukaryotic genes. uORFs are, in fact, pervasively expressed, with some functioning as short biologically active polypeptides (Andrews and Rothnagel, 2014; Calvo et al., 2009; Combier et al., 2008; Sendoel et al., 2017; Starck et al., 2016; Wang and Rothnagel, 2004; Wen et al., 2009). uORFs are abundantly expressed in cancer cells (Sendoel et al., 2017) and activated T cells (Starck et al., 2016). Overall, future work will be needed to redefine what, in reality, a gene is.
○ Mouse Infection studies
○ Preparation of RNA sequencing Libraries (Infected Mice)
○ SIINFEKL expression analysis
○ Minireplicon Assays
● QUANTIFICATION AND STATISTICAL ANALYSES
○ Mouse Infection Studies
○ Quantitative qPCR assays
○ CAGE sequencing of WSN IAV virus infected cells
○ Ribosome sequencing analyses
○ RNA sequencing Analyses
○ LASV CAGE sequencing Analyses
○ Sequence Randomization Model for PB1-UFO length
○ Frequency Propagator Ratio Analysis
○ Epitope predictions for PB1-UFO
SUPPLEMENTAL INFORMATION
and A.E.F.; Investigation, J.S.Y.H., M.A., Y.M., E.S., G.W., C.M.-R., M.A., V.R., Decroly, E., Ferron, F., Lescar, J., and Canard, B. (2011). Conventional and un-
L. Campisi, S.Z., M.C., Y.F., S.C., A.M.D., J.G., R.G., R.S., Q.G., N.I., L. Chung, conventional mechanisms for capping viral mRNA. Nat. Rev. Microbiol.
N.Z., J.D.J., I.v.K., Z.Z., N.M., L.M., J.N., Z.P., V.R., R.K., B.R., B.W., J.W., 10, 51–65.
H.W., J.J., A.V., M.J.A., E.R.M., C.B., I.B., P.D., M.L., A.E.F., N.K., B.D.G., Dias, A., Bouvier, D., Crépin, T., McCarthy, A.A., Hart, D.J., Baudin, F., Cusack,
M.K.M., and H.v.B.; Data Curation, M.A., Y.M., H.v.B., R.G., and J.J.; Re- S., and Ruigrok, R.W. (2009). The cap-snatching endonuclease of influenza vi-
sources, Y.M., M.A., J.J., M.C., H.v.B., E.R.M., A.G.-S., S.P., and C.H.; Writing rus polymerase resides in the PA subunit. Nature 458, 914–918.
– Original Draft, I.M. and E.H.; Writing – Review & Editing, E.H., I.M., J.W.Y.,
Dikstein, R. (2012). Transcription and translation in a package deal: the TISU
Y.M., J.S.Y.H., M.A., E.S., A.M.D., H.v.B., M.L., B.D.G., E.R.M., A.G.-S.,
paradigm. Gene 491, 1–4.
P.D., and A.E.F.; Visualization, E.H., Y.M., M.A., G.W., J.H., J.J., M.C., Z.P.,
E.R.M., and Z.Z.; Funding Acquisition, A.G.-S., I.M., J.K.B., A.E.F., I.B., P.D., Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut,
M.J.A., E.H., and M.K.M.; Project Administration, I.M. and E.H.; Supervision, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq
E.H. and I.M. aligner. Bioinformatics 29, 15–21.
Dolan, B.P., Li, L., Takeda, K., Bennink, J.R., and Yewdell, J.W. (2010). Defec-
tive ribosomal products are the major source of antigenic peptides endoge-
DECLARATION OF INTERESTS
nously generated from influenza A virus neuraminidase. J. Immunol. 184,
1419–1424.
The authors declare no competing interests.
Edgar, R.C. (2004). MUSCLE: multiple sequence alignment with high accuracy
Received: March 29, 2019 and high throughput. Nucleic Acids Res. 32, 1792–1797.
Revised: February 26, 2020 Elfakess, R., and Dikstein, R. (2008). A translation initiation element specific to
Accepted: May 18, 2020 mRNAs with very short 5’UTR that also regulates transcription. PLoS ONE
Published: June 18, 2020 3, e3094.
Fodor, E. (2013). The RNA polymerase of influenza a virus: mechanisms of viral
transcription and replication. Acta Virol. 57, 113–122.
REFERENCES
Fodor, E., Devenish, L., Engelhardt, O.G., Palese, P., Brownlee, G.G., and Gar-
Andreatta, M., and Nielsen, M. (2016). Gapped sequence alignment using arti- cı́a-Sastre, A. (1999). Rescue of influenza A virus from recombinant DNA.
ficial neural networks: application to the MHC class I system. Bioinformatics J. Virol. 73, 9679–9682.
32, 511–517. Forrest, A.R., Kawaji, H., Rehli, M., Baillie, J.K., de Hoon, M.J., Haberle, V.,
Andreev, D.E., O’Connor, P.B., Fahey, C., Kenny, E.M., Terenin, I.M., Dmitriev, Lassmann, T., Kulakovskiy, I.V., Lizio, M., Itoh, M., et al.; FANTOM Consortium
S.E., Cormican, P., Morris, D.W., Shatsky, I.N., and Baranov, P.V. (2015). and the RIKEN PMI and CLST (DGT) (2014). A promoter-level mammalian
Translation of 5′ leaders is pervasive in genes resistant to eIF2 repression. eLife expression atlas. Nature 507, 462–470.
4, e03971. Gaush, C.R., and Smith, T.F. (1968). Replication and plaque assay of influenza
Andrews, S.J., and Rothnagel, J.A. (2014). Emerging evidence for functional virus in an established line of canine kidney cells. Appl. Microbiol. 16, 588–594.
peptides encoded by short open reading frames. Nat. Rev. Genet. 15, Gog, J.R., Afonso, Edos.S., Dalton, R.M., Leclercq, I., Tiley, L., Elton, D., von
193–204. Kirchbach, J.C., Naffakh, N., Escriou, N., and Digard, P. (2007). Codon conser-
Bottermann, M., Foss, S., van Tienen, L.M., Vaysburd, M., Cruickshank, J., vation in the influenza A virus genome defines RNA packaging signals. Nucleic
O’Connell, K., Clark, J., Mayes, K., Higginson, K., Hirst, J.C., et al. (2018). Acids Res. 35, 1897–1907.
TRIM21 mediates antibody inhibition of adenovirus-based gene delivery and Gruber, A.R., Lorenz, R., Bernhart, S.H., Neuböck, R., and Hofacker, I.L.
vaccination. Proc. Natl. Acad. Sci. USA 115, 10440–10445. (2008). The Vienna RNA websuite. Nucleic Acids Res. 36, W70-4.
Buchholz, U.J., Finke, S., and Conzelmann, K.K. (1999). Generation of bovine Gu, W., Gallagher, G.R., Dai, W., Liu, P., Li, R., Trombly, M.I., Gammon, D.B.,
respiratory syncytial virus (BRSV) from cDNA: BRSV NS2 is not essential for vi- Mello, C.C., Wang, J.P., and Finberg, R.W. (2015). Influenza A virus preferen-
rus replication in tissue culture, and the human RSV leader region acts as a tially snatches noncoding RNA caps. RNA 21, 2067–2075.
functional BRSV genome promoter. J. Virol. 73, 251–259. Haimov, O., Sinvani, H., Martin, F., Ulitsky, I., Emmanuel, R., Tamarkin-Ben-
Calvo, S.E., Pagliarini, D.J., and Mootha, V.K. (2009). Upstream open reading Harush, A., Vardy, A., and Dikstein, R. (2017). Efficient and Accurate Transla-
frames cause widespread reduction of protein expression and are polymor- tion Initiation Directed by TISU Involves RPS3 and RPS10e Binding and Differ-
phic among humans. Proc. Natl. Acad. Sci. USA 106, 7507–7512. ential Eukaryotic Initiation Factor 1A Regulation. Mol. Cell. Biol. 37, e00150-17.
Clohisey, S., Parkinson, N., Wang, B., Bertin, N., Wise, H., Tomoiu, A., Sum- Heaton, N.S., Moshkina, N., Fenouil, R., Gardner, T.J., Aguirre, S., Shah, P.S.,
mers, K.M., Hendry, R.W., Carninci, P., Forrest, A.R.R., et al.; FANTOM5 Con- Zhao, N., Manganaro, L., Hultquist, J.F., Noel, J., et al. (2016). Targeting Viral
sortium (2020). Comprehensive characterisation of transcriptional activity dur- Proteostasis Limits Influenza Virus, HIV, and Dengue Virus Infection. Immunity
ing influenza A virus infection reveals biases in cap-snatching of host RNA 44, 46–58.
sequences. J. Virol. 94, e01720-19. Hoffmann, E., Neumann, G., Kawaoka, Y., Hobom, G., and Webster, R.G.
Combier, J.P., de Billy, F., Gamas, P., Niebel, A., and Rivas, S. (2008). Trans- (2000). A DNA transfection system for generation of influenza A virus from eight
regulation of the expression of the transcription factor MtHAP2-1 by a uORF plasmids. Proc. Natl. Acad. Sci. USA 97, 6108–6113.
controls root nodule development. Genes Dev. 22, 1549–1559. Hogquist, K.A., Jameson, S.C., Heath, W.R., Howard, J.L., Bevan, M.J., and
Cox, J., and Mann, M. (2008). MaxQuant enables high peptide identification Carbone, F.R. (1994). T cell receptor antagonist peptides induce positive se-
rates, individualized p.p.b.-range mass accuracies and proteome-wide pro- lection. Cell 76, 17–27.
tein quantification. Nat. Biotechnol. 26, 1367–1372. Hutchinson, E.C., and Stegmann, M. (2018). Purification and Proteomics of
Dadonaite, B., Gilbertson, B., Knight, M.L., Trifkovic, S., Rockman, S., Laeder- Influenza Virions. Methods Mol. Biol. 1836, 89–120.
ach, A., Brown, L.E., Fodor, E., and Bauer, D.L.V. (2019). The structure of the Hutchinson, E.C., Curran, M.D., Read, E.K., Gog, J.R., and Digard, P. (2008).
influenza A virus genome. Nat. Microbiol. 4, 1781–1789. Mutational analysis of cis-acting RNA signals in segment 7 of influenza A virus.
de Wit, E., Spronken, M.I., Bestebroer, T.M., Rimmelzwaan, G.F., Osterhaus, J. Virol. 82, 11869–11879.
A.D., and Fouchier, R.A. (2004). Efficient generation and growth of influenza vi- Hutchinson, E.C., von Kirchbach, J.C., Gog, J.R., and Digard, P. (2010).
rus A/PR/8/34 from eight cDNA fragments. Virus Res. 103, 155–161. Genome packaging in influenza A virus. J. Gen. Virol. 91, 313–328.
Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12.
Masella, A.P., Bartram, A.K., Truszkowski, J.M., Brown, D.G., and Neufeld, J.D. (2012). PANDAseq: paired-end assembler for illumina sequences. BMC Bioinformatics 13, 31.
McGlincy, N.J., and Ingolia, N.T. (2017). Transcriptome-wide measurement of translation by ribosome profiling. Methods 126, 112–129.
Ohno, S. (1970). Evolution by Gene Duplication (Springer).
Plotch, S.J., Bouloy, M., Ulmanen, I., and Krug, R.M. (1981). A unique cap(m7GpppXm)-dependent influenza virion endonuclease cleaves capped RNAs to generate the primers that initiate viral RNA transcription. Cell 23, 847–858.
Porgador, A., Yewdell, J.W., Deng, Y., Bennink, J.R., and Germain, R.N. (1997). Localization, quantitation, and in situ detection of specific peptide-MHC class I complexes using a monoclonal antibody. Immunity 6, 715–726.
Poulin, F., Brueschke, A., and Sonenberg, N. (2003). Gene fusion and overlapping reading frames in the mammalian genes for 4E-BP3 and MASK. J. Biol. Chem. 278, 52290–52297.
Price, M.N., Dehal, P.S., and Arkin, A.P. (2010). FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490.
Reich, S., Guilligay, D., Pflug, A., Malet, H., Berger, I., Crépin, T., Hart, D., Lunardi, T., Nanao, M., Ruigrok, R.W., and Cusack, S. (2014). Structural insight
Takahashi, H., Lassmann, T., Murata, M., and Carninci, P. (2012). 5′ end-centered expression profiling using cap-analysis gene expression and next-generation sequencing. Nat. Protoc. 7, 542–561.
Thulasi Raman, S.N., and Zhou, Y. (2016). Networks of Host Factors that Interact with NS1 Protein of Influenza A Virus. Front. Microbiol. 7, 654.
Tilston-Lunel, N.L., Shi, X., Elliott, R.M., and Acrani, G.O. (2017). The Potential for Reassortment between Oropouche and Schmallenberg Orthobunyaviruses. Viruses 9, 220.
Tyanova, S., Temu, T., and Cox, J. (2016). The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat. Protoc. 11, 2301–2319.
Wallat, G.D., Huang, Q., Wang, W., Dong, H., Ly, H., Liang, Y., and Dong, C. (2014). High-resolution structure of the N-terminal endonuclease domain of the Lassa virus L polymerase in complex with magnesium ions. PLoS ONE 9, e87577.
Wang, X.Q., and Rothnagel, J.A. (2004). 5′-untranslated regions with multiple upstream AUG codons can support low-level translation via leaky scanning and reinitiation. Nucleic Acids Res. 32, 1382–1391.
Wei, J., and Yewdell, J.W. (2017). Autoimmune T cell recognition of alternative-reading-frame-encoded peptides. Nat. Med. 23, 409–410.
Wei, J., and Yewdell, J.W. (2019). Flu DRiPs in MHC Class I Immunosurveillance. Virol. Sin. 34, 162–167.
Wei, J., Kishton, R.J., Angel, M., Conn, C.S., Dalla-Venezia, N., Marcel, V., Vincent, A., Catez, F., Ferre, S., Ayadi, L., et al. (2019). Ribosomal Proteins Regulate MHC Class I Peptide Generation for Immunosurveillance. Mol. Cell 73, 1162–1173.
Wen, Y., Liu, Y., Xu, Y., Zhao, Y., Hua, R., Wang, K., Sun, M., Li, Y., Yang, S., Zhang, X.J., et al. (2009). Loss-of-function mutations of an inhibitory upstream ORF in the human hairless transcript cause Marie Unna hereditary hypotrichosis. Nat. Genet. 41, 228–233.
Westerhof, L.M., McGuire, K., MacLellan, L., Flynn, A., Gray, J.I., Thomas, M., Goodyear, C.S., and MacLeod, M.K. (2019). Multifunctional cytokine production reveals functional superiority of memory CD4 T cells. Eur. J. Immunol. 49, 2019–2029.
Wise, H.M., Gaunt, E., Ping, J., Holzer, B., Jasim, S., Lycett, S.J., Murphy, L., Livesey, A., Brown, R., Smith, N., et al. (2019). An alternative AUG codon that produces an N-terminally extended form of the influenza A virus NP is a virulence factor for a swine-derived virus. bioRxiv. https://doi.org/10.1101/738427.
Ye, J., Sorrell, E.M., Cai, Y., Shao, H., Xu, K., Pena, L., Hickman, D., Song, H., Angel, M., Medina, R.A., et al. (2010). Variations in the hemagglutinin of the 2009 H1N1 pandemic virus: potential for strains with altered virulence phenotype? PLoS Pathog. 6, e1001145.
Young, S.K., and Wek, R.C. (2016). Upstream Open Reading Frames Differentially Regulate Gene-specific Translation in the Integrated Stress Response. J. Biol. Chem. 291, 16927–16935.
Zanker, D.J., Oveissi, S., Tscharke, D.C., Duan, M., Wan, S., Zhang, X., Xiao, K., Mifsud, N.A., Gibbs, J., Izzard, L., et al. (2019). Influenza A Virus Infection Induces Viral and Cellular Defective Ribosomal Products Encoded by Alternative Reading Frames. J. Immunol. 202, 3370–3380.
Zhang, Y., Aevermann, B.D., Anderson, T.K., Burke, D.F., Dauphin, G., Gu, Z., He, S., Kumar, S., Larsen, C.N., Lee, A.J., et al. (2017). Influenza Research Database: An integrated bioinformatics resource for influenza virus research. Nucleic Acids Res. 45 (D1), D466–D474.
Zhou, Y., Zhou, B., Pache, L., Chang, M., Khodabakhshi, A.H., Tanaseichuk, O., Benner, C., and Chanda, S.K. (2019). Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 10, 1523.
STAR★METHODS
Continued
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Biological Samples
Primary CD14+ human monocytes | The Roslin Institute, University of Edinburgh; As Clohisey et al., 2020 | N/A
Chemicals, Peptides, and Recombinant Proteins
Dulbecco’s Modified Eagle Medium (DMEM) | Thermo Fisher / GIBCO | Cat#11965175
Minimum Essential Medium (MEM) | Sigma-Aldrich | Cat# 51411C
Purified Agar | Oxoid | Cat #: LP0028
Trypsin from bovine pancreas, TPCK-treated | Sigma-Aldrich | Cat #: T1426-500MG
Protease Inhibitor Cocktail Set III, EDTA-Free - Calbiochem | EMD Millipore | Cat# 539134-10ML
Trypsin | Sigma-Aldrich | Cat# T8802-100MG
TRIzol Reagent | Thermo Fisher Scientific | Cat#15596018
SimplyBlue™ SafeStain | Thermo Fisher Scientific | Cat# LC6060
NuPAGE 4–12% BT Gel 1.5mm 12w 10 Per Box | Thermo Fisher Scientific | Cat# NP0322BOX
MG-132 | Sigma-Aldrich | Cat# M7449-1ML
NuPAGE MOPS SDS Running Buffer (20X) | Thermo Fisher Scientific | Cat# NP0001
Ovalbumin (257-264) chicken | Sigma-Aldrich | Cat# S7951
LT-1 transfection reagent | Mirus | Cat# MIR 2304
recombinant human colony-stimulating factor 1 | A gift from Chiron, Emeryville, CA, US; As Clohisey et al., 2020 | N/A
Lys-C lysyl endopeptidase | Wako | 121-05063
Harringtonine | LKT biochemicals | H0169
Cycloheximide | Sigma-Aldrich | Cat# C7698
Sequencing grade modified trypsin | Promega | 9PIV511
Critical Commercial Assays
Dual Luciferase Reporter Assay System | Promega | Cat#E1910
CD8a+ T Cell Isolation Kit | Miltenyi Biotec | Cat#130-104-075
EasySep Mouse CD8+ T Cell Isolation Kit | StemCell Technologies | Cat# 19853
PureLink RNA Mini Kit 250 Reactions | Thermo Fisher Scientific | Cat# 12183025
PureLink DNase Set | Thermo Fisher Scientific | Cat# 12185010
miRNeasy Mini Kit | QIAGEN | Cat# 217004
Q5 site directed mutagenesis kit | NEB | Cat# E0554S
Ribo-Zero Gold rRNA Removal Kit (Human/Mouse/Rat) | Illumina | Cat# MRZG12324
SMARTer total RNA Pico kit | Clontech | Cat# 634411
TruSeq Stranded Total RNA Library Prep Kit | Illumina | Cat # 20020596
Deposited Data
CAGE sequencing of WSN IAV virus infected cells | Clohisey et al., 2020 | https://fantom.gsc.riken.jp/5/data/
DEFEND seq of PR8 IAV infected A549 cells | Rialdi et al., 2017 | GEO: GSE96677
DEFEND seq of IBV infected A549 cells | This study | GEO: GSE85474
Ribosome Profiling of PR8 IAV infected cells | This study | GEO: GSE148245
CAGE sequencing of LASV infected vero cells | This study | GEO: GSE148122
RNA seq of PR8; PB1-UFOD and PR8;PB1-UFOSYN infected mouse lungs | This study | GEO: GSE128519
GISAID Database | Shu and McCauley, 2017 | https://www.gisaid.org
NCBI Influenza Virus Database | Zhang et al., 2017 | http://www.ncbi.nlm.nih.gov/genomes/FLU/Database
Mass spectrometry Data: PR8 IAV infected A549 and 293T cells | This study | Table S2A
Mass spectrometry Data: WSN IAV Virions | Hutchinson et al., 2014 | https://massive.ucsd.edu/ProteoSAFe/datasets.jsp using the MassIVE ID MSV000078740; Table S2B
Mass spectrometry Data: Immunoprecipitation of PR8 IAV RdRp | Heaton et al., 2016 | Table S2C
Experimental Models: Cell Lines
Dog: MDCK | ATCC | CCL-34; RRID: CVCL_0422
Human: A549 | ATCC | CCL-185; RRID: CVCL_0023
Human: 293T | ATCC | CRL-3216; RRID: CVCL_0063
Cow: MDBK | Sigma | 90050801-1VL; RRID: CVCL_0421
Monkey: Vero | ATCC | CCL-81; RRID: CVCL_0059
Mouse: DC2.4 | Sigma-Aldrich | Cat# SCC142; RRID: CVCL_J409
Hamster: BSR-T7/5 | Buchholz et al., 1999 | N/A
Experimental Models: Organisms/Strains
Mouse: BALB/cJ (6-8 weeks) | Jackson Laboratories | 00651
Chicken: Specific Pathogen Free Fertile Eggs | Charles River | Cat #: 10100329
Mouse: OT-I: C57BL/6-Tg(TcraTcrb)1100Mjb/J | The Jackson Laboratory / in-house; Hogquist et al., 1994 | Cat# 003831; RRID: IMSR_JAX:003831
Mouse: C57BL/6 (10-14 weeks) | Envigo | N/A
Oligonucleotides
DEFEND-seq cDNA synthesis–3’ primer | Rialdi et al., 2017 | N/A
qPCR Primers | This Study | Table S4B
Recombinant DNA
PR8 pDUAL plasmids | A kind gift of Prof Ron Fouchier; de Wit et al., 2004 | N/A
Cal09 pDP2002 plasmids | A kind gift of Prof Daniel Perez; Ye et al., 2010 | N/A
pT7HRTMRen(-) | MRC-University of Glasgow Centre for Virus Research; Rezelj et al., 2019 | N/A
pTMHRTN | MRC-University of Glasgow Centre for Virus Research; Rezelj et al., 2019 | N/A
pTMHRTL | MRC-University of Glasgow Centre for Virus Research; Rezelj et al., 2019 | N/A
pTM1-FFLuc | MRC-University of Glasgow Centre for Virus Research; Rezelj et al., 2019 | N/A
pRL-TK | Promega | E2241
Software and Algorithms
DESeq2 | Love et al., 2014 | https://bioconductor.org/packages/release/bioc/html/DESeq2.html
Bowtie | Langmead et al., 2009 | http://bowtie-bio.sourceforge.net/index.shtml
MaxQuant | Cox and Mann, 2008 | https://www.biochem.mpg.de/5111795/maxquant
Cutadapt | Martin, 2011 | https://cutadapt.readthedocs.io/en/stable/
STAR | Dobin et al., 2013 | https://github.com/alexdobin/STAR
FlowJo | Treestar | N/A
Metascape | Zhou et al., 2019 | https://metascape.org/gp/index.html#/main/step1
Vienna RNA Webserver | Gruber et al., 2008 | http://rna.tbi.univie.ac.at
FastTree | Price et al., 2010 | http://www.microbesonline.org/fasttree
RAxML | Stamatakis, 2014 | https://cme.h-its.org/exelixis/web/software/raxml/index.html
TreeTime | Sagulenko et al., 2018 | https://github.com/neherlab/treetime
PANDASeq | Masella et al., 2012 | https://github.com/neufeld/pandaseq
NetMHC (v3.4 and v4.0) | Andreatta and Nielsen, 2016 | https://services.healthtech.dtu.dk/service.php?NetMHC-4.0
MUSCLE | Edgar, 2004 | https://www.drive5.com/muscle/
HISAT2 | Kim et al., 2015 | http://daehwankimlab.github.io/hisat2
Prism 8 | Graphpad | N/A
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for reagents may be directed to and will be fulfilled by Lead Contact Ivan Marazzi (ivan.marazzi@mssm.edu).
Materials Availability
All unique/stable reagents generated in this study are available from the Lead Contact with a completed Materials Transfer
Agreement.
Cell cultures
Madin–Darby Canine Kidney (MDCK) cells, A549 human lung epithelial cells, Vero (ATCC-CCL81) and 293T human embryonic kidney
cells were cultured in Dulbecco’s Modified Eagle’s Medium (DMEM; GIBCO) supplemented with 10% fetal bovine serum (FBS;
GIBCO). Madin-Darby Bovine Kidney (MDBK) cells were cultured in Minimum Essential Medium (MEM; Sigma) supplemented
with 2 mM L-glutamine and 10% fetal calf serum (FCS). BSR-T7/5 golden hamster cells (Buchholz et al., 1999) were cultured in Glas-
gow Minimal Essential Medium (GMEM) supplemented with 10% FCS and 10% tryptose phosphate broth under G418 selection. All
cells were maintained at 37°C and 5% CO2.
Mice
For infection studies: Six to eight-week-old female BALB/c mice were obtained from Jackson Laboratories (Bar Harbor, ME). All mice
infection procedures were performed following protocols approved by the Icahn School of Medicine at Mount Sinai Institutional
Animal Care and Use Committee (IACUC). Animal studies were carried out in strict accordance with the recommendations in the
Guide for the Care and Use of Laboratory Animals of the National Research Council.
For antigen presentation experiments: female OTI (Hogquist et al., 1994) mice were bred in-house on a mixed genetic background.
Animals were kept in dedicated barrier facilities, proactive in environmental enrichment under the EU Directive 2010 and Animal (Sci-
entific Procedures) Act (UK Home Office license number 70/8645) with ethical review approval (University of Glasgow). Animals were
cared for by trained and licensed individuals and humanely sacrificed using Schedule 1 methods.
For BMDC Isolation: 10-14 week old naive female C57BL/6 mice were purchased from Envigo (UK) and maintained at the University of
Glasgow under standard animal husbandry conditions in accordance with UK Home Office regulations, as approved by the local
ethics committee.
Virus Strains
Wild-type viruses
A/Puerto Rico/8/34(H1N1) (PR8) virus was generated by reverse genetics and propagated in 9-11 day old embryonated chicken eggs
(Charles River, Cat # 10100329). Mouse-adapted A/California/04/09(H1N1) (Cal09) was generated by reverse genetics (Ye et al.,
2010) and propagated on MDCK cells in the presence of 1 µg/ml TPCK-trypsin, as described previously (Hutchinson et al., 2008).
The influenza virus A/WSN/33(H1N1) (WSN) (Hoffmann et al., 2000) was propagated on MDBK cells. A/Udorn/72(H3N2) (Udorn)
was propagated on MDCK cells in the presence of 1 µg/ml TPCK-trypsin, as described previously (Clohisey et al., 2020; Hutchinson
et al., 2014). Plaque assays were carried out in MDCK cells and visualized by immunocytochemistry or staining with crystal violet or
Coomassie blue, as previously described (Gaush and Smith, 1968) (See below also for method details).
Mutant viruses
All mutant and control viruses were generated using a plasmid-based reverse genetics system (Fodor et al., 1999; Ye et al., 2010),
using either the A/Puerto Rico/8/1934 (PR8), A/WSN/33 (WSN) or mouse-adapted A/California/4/09 (Cal09) strains as the backbone.
Plasmids used for reverse genetics were the PR8 pDUAL plasmids (de Wit et al., 2004) and the Cal09 pDP2002 plasmids (Ye et al.,
2010) (a kind gift of Prof Daniel R. Perez, University of Georgia, USA). Site-directed mutagenesis of plasmids was performed using the
Q5 site-directed mutagenesis kit (NEB); the edited NS segment sequence required for the PR8-NS.F3.SIIN mutant virus
(described in Figure 4) was synthesized by Genewiz.
PB1-UFO(SIIN) virus
The OVA257-264 (SIINFEKL) epitope was inserted into the 5′ UTR of the PB1 segment of the influenza A virus (IAV) genome at position 1
before the PB1 start codon. This insertion did not result in an N-terminal extension of or mutations in the PB1 protein, but resulted in the
insertion of the OVA257-264 antigenic epitope in frame with the PB1-UFO protein.
NS-UFO(SIIN) virus
The OVA257-264 (SIINFEKL) epitope was inserted into frame 2 of the NS segment of the IAV genome, in a region corresponding to the
linker sequence of the NS1 protein (encoded in frame). This effectively replaced codons 79-84 of NS1, while retaining the sequence of
NEP. The replacement sequence was flanked by two upstream nucleotides and one downstream nucleotide to introduce a frameshift
into frame 2. Premature stop codons in frame 2 were also mutated at positions 4, 27, 32, 74 and 77, relative to the start codon of
NS1, to generate a 106 amino acid long NS-UFO sequence, extending it from the original 4 amino acid long uvORF in reading frame 2.
PB1-SIIN virus and NA-SIIN viruses
These viruses have been described in Wei et al. (2019) and Bottermann et al. (2018) respectively.
PR8; PB1-UFOD, PR8; PB1-UFOSYN, PR8; PB1-EXT+ viruses
PR8; PB1-UFOD contains a C to T nucleotide substitution 9 nucleotides after the start of PB1 open reading frame. This generates a
premature stop codon in the PB1-UFO ORF. Its control virus, PR8; PB1-UFOSYN, contains a C to G nucleotide substitution at the
same position. Both viruses retain the amino acid sequence of the PB1 ORF. PR8; PB1-EXT+ contains a T to C nucleotide substi-
tution three nucleotides before the start of PB1 open reading frame. This disrupts a conserved stop codon (“TGA”) in frame with
PB1 ORF, resulting in the N-terminal extension of the PB1-ORF. PB1-UFO ORF is maintained in this virus. Mutations were confirmed
by sequencing both plasmids and viruses. All viruses were expanded in 9-11 day old embryonated chicken eggs after rescue. The
stock virus titers were calculated from the average of three independent experiments.
PR8; NP-EXTD, PR8; NP-EXTSYN viruses
PR8; NP-EXTD contains an A to T nucleotide substitution 6 nucleotides before the start of the NP open reading frame. This generates
an in-frame stop codon that results in the loss of the N-terminal NP-extension. Its control virus, PR8; NP-EXTSYN, bears an A to G
nucleotide substitution at the same position in the UTR, preserving the NP-extension. Mutations were confirmed by sequencing
both plasmids and viruses, and 3 independent plaque purified clones of each virus, grown on MDCK cells, were used in subsequent
experiments. Stock virus titers were calculated from the average of three independent experiments.
WSN; PB1-UFOD, WSN; PB1-UFOSYN, Cal09; PB1-UFOD, Cal09; PB1-UFOSYN viruses
WSN; PB1-UFOD and Cal09; PB1-UFOD viruses contain C to U nucleotide substitutions 9 nucleotides after the start of PB1 open
reading frame. This generates a premature stop codon in the PB1-UFO ORF. Their control viruses, WSN; PB1-UFOSYN and
Cal09; PB1-UFOSYN respectively, contain C to G nucleotide substitutions at the same positions. All the viruses retain the amino
acid sequence of the PB1 ORF.
METHOD DETAILS
solution while agitating. Lysates were rotated for 30 min at 4°C before an equal volume of water was added to the sample to bring the
NaCl concentration back to 250 mM. Samples were then centrifuged at full speed for 15 min at 4°C. 4x Laemmli buffer (200 mM
Tris-HCl pH 6.8, 8% SDS, 40% glycerol, 0.588 M β-mercaptoethanol, 50 mM EDTA and 0.08% Bromophenol Blue) was then added
to the supernatant to 1x concentration, and 5 µl of the lysate was loaded on a 4%–12% Bis-Tris gel (Novex). Gels were run under a
hood at 150 V for 1 h 15 min in 1X MOPS running buffer and stained in SimplyBlue™ SafeStain (Invitrogen), following the manufacturer’s
recommended protocol. Once stained, gel bands corresponding to 40-60 kDa and < 15 kDa were excised. Gel slices were subjected
to in-gel tryptic digests as previously described (Rosenfeld et al., 1992).
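The dilution steps in this paragraph follow from C1V1 = C2V2. The short Python sketch below works through that arithmetic. The function names and example volumes are illustrative assumptions, and the 500 mM starting NaCl concentration is inferred from the equal-volume dilution to 250 mM rather than stated in the text.

```python
def diluted_concentration(c_start, v_start, v_added):
    """Concentration after adding v_added of diluent to v_start of solution."""
    return c_start * v_start / (v_start + v_added)


def stock_volume_for_final(v_sample, stock_x, final_x=1.0):
    """Volume of a stock_x-concentrated buffer to add to v_sample so that
    the mixture ends up at final_x (e.g. 4x Laemmli buffer brought to 1x)."""
    return v_sample * final_x / (stock_x - final_x)


# Adding an equal volume of water halves the salt concentration, so
# reaching 250 mM implies a 500 mM lysate (inferred, hypothetical volumes).
nacl_mM = diluted_concentration(500.0, v_start=200.0, v_added=200.0)

# Hypothetical 300 µl of supernatant: 100 µl of 4x buffer gives 1x in 400 µl.
laemmli_ul = stock_volume_for_final(300.0, stock_x=4.0)

print(nacl_mM, laemmli_ul)
```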
Digested samples were analyzed on a Thermo Fisher Orbitrap Fusion mass spectrometry system equipped with an Easy nLC 1200
ultra-high pressure liquid chromatography system interfaced via a Nanospray Flex nanoelectrospray source. Samples were injected
on a C18 reverse phase column (25 cm × 75 µm packed with ReprosilPur C18 AQ 1.9 µm particles). Peptides were separated by an
organic gradient from 5% to 30% ACN in 0.1% formic acid over 70 minutes at a flow rate of 300 nL/min. The MS continuously ac-
quired spectra in a data-dependent manner throughout the gradient, acquiring a full scan in the Orbitrap (at 120,000 resolution with
an AGC target of 200,000 and a maximum injection time of 100 ms) followed by as many MS/MS scans as could be acquired on the
most abundant ions in 3 s in the dual linear ion trap (rapid scan type with an intensity threshold of 5000, HCD collision energy of 29%,
AGC target of 10,000, a maximum injection time of 35 ms, and an isolation width of 1.6 m/z). Singly and unassigned charge states
were rejected. Dynamic exclusion was enabled with a repeat count of 1, an exclusion duration of 20 s, and an exclusion mass width of
± 10 ppm. Raw mass spectrometry data were assigned to human protein sequences and MS1 intensities extracted with the Max-
Quant software package (version 1.6.8) (Cox and Mann, 2008). Data were searched against the SwissProt human protein database
(downloaded on October 10, 2019) and a custom influenza A virus database comprising all six open-reading frames greater than 10
amino acids for the IAV (strain PR-8) genomic sequence. Variable modifications were allowed for N-terminal protein acetylation,
methionine oxidation, and lysine acetylation. A static modification was indicated for carbamidomethyl cysteine. All other settings
were left using MaxQuant default settings.
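How the custom viral database was built is not detailed here. One plausible reading is a six-frame translation of each genome segment, keeping reading frames longer than 10 amino acids. The Python sketch below illustrates that reading only: the function names, the stop-to-stop definition of a reading frame and the toy sequence are all assumptions, not the authors' pipeline.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_peptides(seq, min_aa=10):
    """Return (strand, frame, peptide) for every stop-to-stop stretch
    longer than min_aa residues in all six reading frames."""
    seq = seq.upper().replace("U", "T")
    peptides = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for segment in translate(s[frame:]).split("*"):
                if len(segment) > min_aa:
                    peptides.append((strand, frame, segment))
    return peptides


# Toy sequence (hypothetical): ATG, twelve alanine codons, then a stop codon.
toy = "ATG" + "GCT" * 12 + "TAA"
hits = six_frame_peptides(toy)
```

Each retained peptide could then be written out as a FASTA entry for the search engine. Whether reading frames should instead be required to start at an AUG is a separate design choice that the text does not settle.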
BSL4 facilities in Galveston National Laboratory in the University of Texas Medical Branch in accordance with institutional health and
safety guidelines and federal regulations. Total RNA from the TRIzol-treated lysates was isolated and DNase treated using the PureLink
RNA Minikit (Invitrogen). The purified RNA was then submitted for CAGE-sequencing at Kabushiki Kaisha DNAFORM, Japan.
Minireplicon Assays
Minireplicon assays were performed as previously described (Rezelj et al., 2019; Tilston-Lunel et al., 2017). Briefly, and using the
plasmids indicated above, LT-1 transfection reagent (Mirus) was used to transfect sub-confluent BSR-T7/5 cells. After 24 h cells
were processed using a Dual-Luciferase Reporter Assay System (Promega), with luciferase measured using a GloMax 20/20
luminometer (Promega).
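Dual-luciferase readouts are commonly normalised to the co-transfected control luciferase and then expressed relative to the wild-type construct. The Python sketch below shows that calculation under stated assumptions: the readings and function names are hypothetical, and which luciferase acts as reporter or control depends on the plasmids used.

```python
from statistics import mean, stdev


def normalised_ratios(reporter, control):
    """Per-well ratio of reporter signal to transfection-control signal."""
    return [r / c for r, c in zip(reporter, control)]


def percent_of_wt(sample_ratios, wt_ratios):
    """Mean and sample s.d. of a construct, as a percentage of the WT mean."""
    wt_mean = mean(wt_ratios)
    values = [100.0 * r / wt_mean for r in sample_ratios]
    return mean(values), stdev(values)


# Hypothetical triplicate luminometer readings (arbitrary units).
wt = normalised_ratios([200.0, 220.0, 180.0], [2.0, 2.0, 2.0])
mutant = normalised_ratios([50.0, 60.0, 40.0], [2.0, 2.0, 2.0])
mutant_mean, mutant_sd = percent_of_wt(mutant, wt)
```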
where,
The frequency propagator ratio takes into account both numbers and histories of the mutation class of interest. It is a robust mea-
sure of selection because it is (a) largely independent of data entry frequency, and (b) insensitive to clonal expansion of mutations.
At the limit x = 1, the propagator ratio g(x) reduces to g, where
$$ g = \frac{d/n}{d_0/n_0} $$
and,
Selection on a mutation class of interest can be inferred from the value of g. g < 1 suggests evolutionary constraints (negative selection)
on the mutation class of interest relative to the reference class, where a fraction (1 − g) of the mutations are under negative
selection. g > 1 suggests that fixation of the mutation class of interest undergoes positive selection, and that at least a fraction
(g − 1)/g of that mutation class is beneficial. g ≈ 1 suggests weak or heterogeneous selection acting on the mutation class of interest,
relative to that of the neutral reference class.
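The interpretation rules above can be sketched in a few lines of Python. This assumes that d and n are, respectively, the number of mutations in the class of interest that reach fixation and the number that originate, with d0 and n0 the same counts for the neutral reference class; those definitions are not restated in this passage. The function names and the tolerance `tol` used to call g ≈ 1 are hypothetical.

```python
def propagator_ratio(d, n, d0, n0):
    """g = (d / n) / (d0 / n0) at the fixation limit x = 1."""
    if min(n, d0, n0) <= 0:
        raise ValueError("n, d0 and n0 must be positive")
    return (d / n) / (d0 / n0)


def interpret(g, tol=0.1):
    """Return (selection regime, fraction of mutations implied by g)."""
    if g < 1 - tol:
        return "negative", 1 - g        # fraction under negative selection
    if g > 1 + tol:
        return "positive", (g - 1) / g  # minimum beneficial fraction
    return "weak or heterogeneous", 0.0


# Hypothetical counts: 5 of 100 mutations fixed in the class of interest,
# 10 of 100 fixed in the neutral reference class.
g = propagator_ratio(5, 100, 10, 100)
regime, fraction = interpret(g)
```

In practice the sampling uncertainty on g would decide whether a value counts as close to 1; the fixed tolerance here is only a placeholder.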
To quantify selection occurring across the PB1-UFO frame, we calculated mutation frequencies in the set of codons derived from
the following three regions (R1-R3)
We chose to use synonymous mutations in the main PB1 ORF (reading frame) in R3 as our neutral reference class to calculate
G0(x), as we reasoned that the majority of such mutations evolve near neutrality.
To quantify selection on the N-terminus of PB1-UFO in R1, we calculated G(x) for two classes of mutations: those that changed
(non-synonymous in PB1-UFO) or did not change (synonymous in PB1-UFO) the amino acid sequence of PB1-UFO. We used
synonymous mutations occurring in the PB1 ORF in R3 as our neutral reference class (G0(x)). We found that g < 1 for both cases,
suggesting that mutations occurring in this region of PB1-UFO were not likely to be fixed over time, and mostly undergo negative
selection, relative to our reference class.
To quantify selection on the C-terminus of PB1-UFO in R2, we again calculated G(x) for mutations that changed (non-synonymous
in PB1-UFO) or did not change (synonymous in PB1-UFO) the amino acid sequence of PB1-UFO. We used synonymous
mutations occurring in the PB1 ORF in R3 as our neutral reference class (G0(x)). We found that g ≈ 1 for both cases, suggesting
that mutations occurring in this region of PB1-UFO underwent heterogeneous selection, relative to that of the reference class.
Since R2 mutations in PB1-UFO appear to undergo heterogeneous selection, we asked if selection occurring on the main PB1 ORF
was a contributing factor. To do so, we calculated G(x) for mutations that changed (non-synonymous in PB1) or did not change
(synonymous in PB1) the amino acid sequence of PB1 in R2. We again used synonymous mutations occurring in the PB1 ORF in R3 as our neutral
reference class (G0(x)). Here we found that g > 1 for synonymous mutations and g < 1 for non-synonymous mutations, suggesting
that mutations that do not alter the amino acid sequence of PB1 are preferentially fixed over time. This suggests to us that part of the
reason why PB1-UFO is undergoing heterogeneous selection in R2 is that there is a requirement to maintain the protein sequence of
PB1. This is not surprising, given that PB1 is an integral part of the viral RNA-dependent RNA polymerase complex.
Finally, to interrogate the effect of RNA structure, we classified nucleotides as pairing or non-pairing based on the MFE structure
(discussed above) calculated by RNAFold. We masked nucleotides that were predicted to base pair (‘‘stem-forming’’) from down-
stream analyses as we reasoned that mutations in these nucleotides are likely to affect both RNA structure AND protein sequence,
thus confounding later interpretations of the data. Regions that were not predicted to base pair (‘‘loop nucleotides’’) were then used
for downstream calculations of frequency propagator ratios. Mutation frequencies were calculated in the same regions (R1, R2 and
R3) and reading frames (PB1-UFO versus PB1) as described above. We found similar effects to before, suggesting
that RNA structure was not a major contributor to the maintenance of the PB1-UFO frame.
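The masking step can be illustrated with a small Python sketch. It assumes the MFE structure is available in dot-bracket notation, the text format in which RNAfold reports structures; the function names and the toy structure are hypothetical, and pseudoknot brackets are not handled.

```python
def paired_mask(dot_bracket):
    """Return one boolean per nucleotide: True if it is base paired."""
    stack = []
    paired = [False] * len(dot_bracket)
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            paired[i] = paired[j] = True
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return paired


def loop_positions(dot_bracket):
    """0-based positions of unpaired ('loop') nucleotides kept for analysis."""
    return [i for i, is_paired in enumerate(paired_mask(dot_bracket))
            if not is_paired]


# Toy structure (hypothetical): a two-base-pair stem followed by a tail.
kept = loop_positions("((..))..")
```

Mutations falling at the retained positions would then be counted in R1, R2 and R3 exactly as for the unmasked analysis.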
Note: The absolute number of polymorphism histories that reach a given frequency is finite (since the tree is constructed over a
defined period of time). This can give rise to sampling fluctuations. These sampling uncertainties are reported as error bars in
our figures.
Supplemental Figures
Figure S3. IAV mRNAs Can Be Translated from Host-Derived AUGs, Related to Figure 3
(A) Length distribution of ribosome profiling reads that aligned to human (left panel) and viral (right panel) transcripts in DMSO (Ribo) or harringtonine (Ribo + Harr)
treated samples.
(B) Metagene alignment of average P site density around annotated start codons in human (left panel) or viral (right panel) transcripts in DMSO treated samples.
(C) Metagene alignment of average P site density around annotated start codons in human (left panel) or viral (right panel) transcripts in harringtonine treated
samples.
(D) Frequency of AUG codons by position relative to the viral transcription initiation site. Bars show the mean frequency and are color coded according to frame.
Error bars indicate the standard deviation.
Figure S4. uvORFs Are Expressed during Infection and Can Contribute to Virulence, Related to Figure 4
(A) Plots showing the position of uvORF peptides found in lysates of cells (A549 or 293) infected with A/PR/8/34 virus at 8 or 24h post infection. The specific cell
lysates they were found in are indicated on the right. 1: MG132 treated, 2: DMSO treated. Peptide locations are drawn relative to uvORFs (gray regions) and
canonical ORFs (blue regions) and are colored by the log10 of their intensities, relative to the sample median.
(B) Same as in (A), but for uvORF peptides found within purified A/WSN/33 virions.
(C) Same as in (A), but for uvORF peptides found from an independent, previously published dataset.
(D) In vitro growth curves of the indicated mutant (UFOD) and control (UFOSYN) viruses made in the PR8 background, and performed on MDCK cells. Error bars
indicate the standard deviation of 3 replicates.
(E) In vitro growth curves of the indicated mutant (UFOD) and control (UFOSYN) viruses made in the WSN/33 background, and performed on MDCK cells.
(F) In vitro growth curves of the indicated mutant (UFOD) and control (UFOSYN) viruses made in the Cal/09 background, and performed on A549 cells. Error bars
indicate the standard deviation of 3 replicates.
(G) Heatmap of differentially expressed genes (Fold Change > 2, p < 0.01) found in the lungs of mice infected with 100PFU of either the PR8;PB1-UFOD or
PR8;PB1-UFOSYN viruses at day 6 post infection.
(H) qPCR validation of four significantly changed genes identified in (G) (highlighted with green text). Each dot represents the lung of one mouse infected with
100PFU of the indicated viruses, collected at day 6 post infection. P values were calculated through a one tailed t test. *p < 0.05
(I) Gene ontology analysis of genes shown in (G).
Figure S7. DEFEND-Seq and CAGE Analysis of Other Cap-Snatching Viruses, Related to Figure 6
(A) Distribution of lengths for cap-snatched sequences found in IBV, as determined by DEFEND-seq.
(B) Host derived uAUGs give rise to long uvORFs (> 30aa). (Upper panels) Predicted peptide sequences derived upon translation of all three ribosome reading
frames in the indicated IBV genome segments. (Lower panels) Predicted distribution of the lengths of new ORF and extension peptides generated from each
reading frame of the viral 5′ UTR. Peptide lengths are calculated based on AUG positions obtained through DEFEND-sequencing.
(C) Distribution of lengths for cap-snatched sequences found in LASV infected cells, as determined by CAGE-seq.
(D) Host derived uAUGs enable reverse sense genome segments of Lassa virus L and S to give rise to uvORFs and extensions. (Upper panels) Schematic of
proteins encoded in the indicated reading frames in either the L or S segment. Lassa virus RNA is ambisense. (Middle panels) Predicted peptide sequences
derived upon translation of all three reading frames in the reverse sense L and S segments. (Lower panels) Predicted distribution of the lengths of new ORFs and
extension peptides generated from each reading frame of the viral 5′ UTR. Peptide lengths are calculated based on AUG positions obtained through CAGE.
(E) (Left panels) Schematic showing (in coding sense) the 5′ termini of viral reporter RNAs, in which a viral untranslated region (UTR) flanks a luciferase (Luc)
reporter gene. Reporter RNAs were used to assess upstream translation in the mRNAs of Heartland virus (HRTV). The 5′ terminus of the mRNAs consisted of cap-snatched
sequence from host mRNAs (cap), followed by a viral 5′ UTR (5′ UTR) and the reporter gene (Luc). Cap structures are indicated as circles, the most
N-terminal AUG as a triangle, AUG mutations as crosses and stop codons as lines. (Right panels) Luc expression when these reporters were included in minireplicon
assays, as a percentage of expression with the WT construct, showing the means and s.d. of 3 repeats compared to WT-STOP by Student’s 2-tailed t
test (n.s.: p ≥ 0.05, *p < 0.05, ***p ≤ 0.0005).
Article
Correspondence
zgitai@princeton.edu
In Brief
A compound that kills both Gram-positive
and Gram-negative bacteria through two
independent mechanisms may provide a
platform for the development of future
antibiotics.
Highlights
• SCH-79797 kills Gram-negative and Gram-positive bacteria with undetectable resistance
Article
A Dual-Mechanism Antibiotic Kills Gram-Negative
Bacteria and Avoids Drug Resistance
James K. Martin II,1,7 Joseph P. Sheehan,1,7 Benjamin P. Bratton,1,2,7 Gabriel M. Moore,1 André Mateus,3
Sophia Hsin-Jung Li,1 Hahn Kim,4,5 Joshua D. Rabinowitz,2,4 Athanasios Typas,3 Mikhail M. Savitski,3
Maxwell Z. Wilson,1,6 and Zemer Gitai1,8,*
1Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
3European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
4Department of Chemistry, Princeton University, Princeton, NJ 08544, USA
5Princeton University Small Molecule Screening Center, Princeton University, Princeton, NJ 08544, USA
6Department of Molecular, Cellular, and Developmental Biology, Center for BioEngineering, University of California, Santa Barbara, Santa
*Correspondence: zgitai@princeton.edu
https://doi.org/10.1016/j.cell.2020.05.005
SUMMARY
The rise of antibiotic resistance and declining discovery of new antibiotics have created a global health crisis. Of particular concern, no new antibiotic classes have been approved for treating Gram-negative pathogens in decades. Here, we characterize a compound, SCH-79797, that kills both Gram-negative and Gram-positive bacteria through a unique dual-targeting mechanism of action (MoA) with undetectably low resistance frequencies. To characterize its MoA, we combined quantitative imaging, proteomic, genetic, metabolomic, and cell-based assays. This pipeline demonstrates that SCH-79797 has two independent cellular targets, folate metabolism and bacterial membrane integrity, and outperforms combination treatments in killing methicillin-resistant Staphylococcus aureus (MRSA) persisters. Building on the molecular core of SCH-79797, we developed a derivative, Irresistin-16, with increased potency and showed its efficacy against Neisseria gonorrhoeae in a mouse vaginal infection model. This promising antibiotic lead suggests that combining multiple MoAs onto a single chemical scaffold may be an underappreciated approach to targeting challenging bacterial pathogens.
1518 Cell 181, 1518–1532, June 25, 2020 © 2020 Elsevier Inc.
selecting for resistant mutants is the most common method for characterizing MoA, making the characterization of new antibiotic MoAs without resistance mutants a significant challenge. Phenotypic methods, such as macromolecular synthesis assays, have been previously used in such cases, as was done for teixobactin (Ling et al., 2015). However, these assays only allow the classification of molecules with previously described MoAs (King and Wu, 2009). Thus, there is also a need for resistance-independent approaches for the de novo characterization of antibiotic MoA.
Here, we describe a compound, SCH-79797, which is bactericidal toward both Gram-negative and Gram-positive bacteria, including clinically significant bacterial pathogens such as methicillin-resistant Staphylococcus aureus (MRSA), Enterococcus faecalis, Neisseria gonorrhoeae, and Acinetobacter baumannii, with no signs of resistance. In an animal host model, SCH-79797 blocked infection by A. baumannii with low toxicity to the host at the dose required for effective antibiotic activity. To rapidly and efficiently classify the MoA of SCH-79797, we used a variant of a recently described quantitative imaging-based approach known as bacterial cytological profiling (BCP) (Nonejuie et al., 2013). This effort showed that SCH-79797 functions through a mechanism distinct from that of most known classes of antibiotics. In the absence of being able to evolve resistant mutants, we used thermal proteome profiling (Savitski et al., 2014), CRISPRi genetic sensitivity (Peters et al., 2016), and metabolomic profiling (Kwon et al., 2008; Kwon et al., 2010) to characterize the MoA of SCH-79797. Using this multi-dimensional, systems-level approach, we identified the candidate targets of SCH-79797 as dihydrofolate reductase and the bacterial membrane. Classical enzymology and membrane permeability and polarization assays confirmed the targets identified by our high-throughput approaches. By analyzing derivatives of the SCH-79797 structure, we demonstrated that the two pharmacophores of this compound can be distinguished. Finally, we describe a derivative of SCH-79797, Irresistin-16 (IRS-16), with improved potency that demonstrates efficacy in a mouse vaginal Neisseria gonorrhoeae model. Thus, our findings identify and characterize a promising antibiotic candidate and provide a potential roadmap for future antibiotic discovery efforts.

RESULTS

SCH-79797 Is a Broad-Spectrum, Bactericidal Antibiotic
With the aim of finding antibiotics with novel mechanisms of action (MoA), we began with an unbiased, whole-cell screening approach. To include antibiotics that target both Gram-negative and Gram-positive bacteria, we screened for compounds that inhibited the growth of E. coli lptD4213, which has a compromised outer membrane that makes it partially permeable to antibiotics that would otherwise have difficulty penetrating the Gram-negative lipopolysaccharide (Ruiz et al., 2006). We screened a small molecule library of 33,000 unique compounds and one of our most potent hits was SCH-79797, a compound that had been previously reported as a human PAR-1 antagonist (Ahn et al., 2000). This finding was surprising because there are no PAR-1 homologs in bacteria. A recent report suggested that SCH-79797 increases the ability of neutrophils to kill bacteria, perhaps by directly functioning as an antibiotic (Gupta et al., 2018). Given that studies focusing on characterizing its anticoagulant activities suggested that at least 5 mg/kg SCH-79797 can be safely tolerated in animals (Gobbetti et al., 2012; Strande et al., 2007) and its emergence as a potential antimicrobial with no known bacterial target (Gupta et al., 2018), we decided to further characterize SCH-79797 as a candidate antibiotic.
To assess the spectrum of bacterial species susceptible to SCH-79797, we measured the minimal inhibitory concentration (MIC) of SCH-79797 against several clinically relevant pathogens, including the ESKAPE pathogens (Boucher et al., 2009). In this study, we define MIC as the concentration of drug that results in no visible bacterial growth after 14 h of growth at 37°C. We found that SCH-79797 significantly hindered the growth of multiple Gram-negative and Gram-positive pathogens including Neisseria gonorrhoeae, two clinical isolates of Acinetobacter baumannii, Enterococcus faecalis, and Staphylococcus aureus (Figure 1A; Table S1). SCH-79797 also exhibited potent activity against several antibiotic-resistant pathogen strains including multi-drug-resistant WHO-L N. gonorrhoeae and MRSA S. aureus. Using the E. coli lptD4213 strain from our original screen, SCH-79797 exhibited potent and rapid bactericidal activity (Figures 1B and 1C). SCH-79797 also exhibited similar bactericidal activity against a clinical isolate of S. aureus MRSA USA300 (Tenover and Goering, 2009) suggesting that its bactericidal activity is not species-specific (Figure S1A).

SCH-79797 Is Effective In Vivo and Has a Low Frequency of Resistance
Given SCH-79797’s promising ability to kill bacteria, we sought to determine if it can function as an effective antibiotic in vivo. To test its antibiotic activity in the context of an animal host infection, we focused on A. baumannii as it has emerged as an important Gram-negative pathogen that is targeted by relatively few available antibiotics, and has a well-established host animal model in the wax worm, Galleria mellonella (Gebhardt et al., 2015; Peleg et al., 2009). We first established that injecting G. mellonella with SCH-79797 at concentrations four times higher than the MIC of SCH-79797 toward A. baumannii did not result in higher host toxicity than the solvent-only control (Figures S1E and S1F; Table S2).
We next tested the ability of SCH-79797 to treat infection of G. mellonella with a lethal dose of A. baumannii AB17978. Treatment with SCH-79797 significantly prolonged the survival of A. baumannii-infected G. mellonella (p < 0.001) (Figures 1D, S1G, and S1H). The survival rate of G. mellonella treated with SCH-79797 was similar to the control antibiotics meropenem, rifampicin, and gentamicin (Figures 1D, S1G, and S1H) (Karlowsky et al., 2003; Viehman et al., 2014).
To further characterize its promise as an antibiotic, we attempted to determine the frequency with which bacteria develop resistance toward SCH-79797. Because spontaneous suppressors can restore E. coli lptD4213’s membrane barrier functionality, we focused our resistance studies on S. aureus MRSA USA300 (Tenover and Goering, 2009). We were unable to isolate stable SCH-79797-resistant mutants upon plating 10⁸ CFU of MRSA USA300 onto agar containing 25 μg/mL SCH-79797 (4× MIC). We were also unable to isolate SCH-79797-resistant mutants upon plating 10⁸ colony-forming units (CFUs) of
Figure 1. SCH-79797 Is a Broad-Spectrum, Bactericidal Antibiotic that Is Effective in an Animal Model and Has a Low Frequency of
Resistance
(A) The MIC of SCH-79797 against Gram-negative (red) and Gram-positive (black) bacteria. The MICs of E. cloacae and P. aeruginosa were greater than the maximal
drug concentrations tested. See also Table S1.
(B) The relative growth of E. coli lptD4213 after treatment with SCH-79797. Bacterial growth was measured as the optical density at 600 nm (OD600) 14 h following
inoculation. Each data point represents 2 biological replicates. Mean ± SD are shown.
(C) Colony forming units (CFUs mL⁻¹) after 3-h treatment of E. coli lptD4213 with 1% DMSO (solvent control), 6.2 μg/mL SCH-79797 (2× MIC), 0.12 μg/mL
ampicillin (2× MIC), or 0.48 μg/mL novobiocin (4× MIC). Data points at 1 × 10² CFU mL⁻¹ are below the level of detection. Each data point represents 3 biological
replicates. Mean ± SD are shown.
(D) The percent survival of G. mellonella wax worms infected with A. baumannii and concomitantly treated with 2 μL/larva 100% DMSO, 67 μg/larva SCH-79797,
67 μg/larva gentamicin, 67 μg/larva meropenem, or 67 μg/larva rifampicin. Data represents a typical cohort (n = 12) from a biological triplicate. p values are
determined from a Mantel-Cox test using Prism (**p < 0.01; ***p < 0.001), and the pooled results are presented in the supplemental material (Figure S1G). For other
Mantel-Cox comparisons, see Table S2.
(E) Fold increase in resistance of S. aureus MRSA USA300 to SCH-79797, novobiocin, trimethoprim, or nisin after 25 days of serial passaging in each drug. Data
represents one biological replicate and the data for a second replicate is shown in Figure S1B.
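The MIC definition used in this study (the drug concentration that results in no visible growth after 14 h) can be illustrated with a minimal sketch that reads the MIC off endpoint OD600 values. The OD600 cutoff for "no visible growth", the function name, and the example readings are illustrative assumptions, not values or methods taken from this study.

```python
def mic_from_od600(readings, no_growth_od=0.05):
    """Return the lowest concentration at and above which no growth is seen.

    readings: dict mapping drug concentration -> endpoint OD600.
    no_growth_od: assumed cutoff below which a well counts as "no visible growth".
    Returns None if growth is seen at the highest concentration tested
    (the MIC is then greater than the maximal concentration tested).
    """
    mic = None
    # Walk from the highest concentration down; stop at the first well that grows.
    for conc in sorted(readings, reverse=True):
        if readings[conc] < no_growth_od:
            mic = conc
        else:
            break
    return mic


# Hypothetical two-fold dilution series (concentration -> OD600 after 14 h).
example = {0.78: 0.62, 1.56: 0.55, 3.13: 0.31, 6.25: 0.02, 12.5: 0.01, 25.0: 0.01}
print(mic_from_od600(example))  # 6.25
```

Scanning from the top of the dilution series means a single clear well below a turbid one is not mistaken for the MIC.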
B. subtilis, suggesting that the difficulty in developing resistant mutants is not species-specific. To address resistance rates more quantitatively, we serial-passaged 2 biologically independent cultures of S. aureus MRSA USA300 in sub-lethal concentrations of SCH-79797, as well as three control antibiotics: novobiocin, trimethoprim, and nisin. Over the course of 25 days, we
Because we had gold standards of the BCP results of SCH-79797 and of several antibiotics representing different classes and sub-groups within classes, we applied a machine learning approach to classify the BCP data. Using a one-way MANOVA, we performed dimensionality reduction to remove the influence of naturally covarying metrics, such as cell length and cell perimeter. We then used single-linkage clustering to cluster treatment groups by their neighborhood representation vectors, such that samples whose neighborhoods were similar would be clustered together. This analysis indicated that SCH-79797 resulted in a phenotypic death-state that was different from the other antibiotics tested (Figure 2B). We also tested an additional method to assess whether this result was robust to multiple statistical methods. Using UMAP for dimensionality reduction on Z score normalized data (McInnes et al., 2018), we found that SCH-79797-treated cells formed a distinct peak separated from other treatments (Figure S2). This result supports our conclusion that SCH-79797 is dissimilar from most antibiotics tested and therefore likely possesses a MoA distinct from that of any of the antibiotics in our training set.

Thermal Profiling and CRISPRi Genetics Demonstrate that SCH-79797 Targets Dihydrofolate Reductase
In the absence of resistant mutants or similarity to antibiotics with known MoA by BCP, we turned to a high-throughput proteomics-based approach for de novo identification of candidate SCH-79797 targets. Specifically, we used thermal proteome profiling, an assay that uses mass spectrometry to compare the thermal stability of the entire proteome with and without drug treatment (schematized in Figure 3A) (Mateus et al., 2018; Savitski et al., 2014). Briefly, intact cells or cell lysate samples treated with a range of compound concentrations are heated to a series of increasing temperatures and the soluble proteins at each temperature are collected (Becher et al., 2016). Proteins that bind to the drug are thermally stabilized, which leads to a shift in the temperature at which those proteins precipitate (Figure 3A). Using E. coli lptD4213, we treated intact cells and cell lysates with SCH-79797 and found that it significantly shifted the thermal stability of dihydrofolate reductase (the DHFR homolog in E. coli is known as FolA) (Figures 3B and S3A). The fact that the same result was observed with both intact cells and cell lysates (Figures 3B and S3A) suggests that SCH-79797 enters E. coli cells and directly binds to FolA. As a positive control, we used a well-characterized antibiotic that targets DHFR, trimethoprim, and found that it also thermally stabilizes its known target, the E. coli DHFR, FolA (Figures 3C and S3B) (Gleckman et al., 1981).
To test both the physiological significance and species-specificity of the suggestion that SCH-79797 binds to DHFR, we took advantage of a collection of B. subtilis essential gene CRISPR-interference (CRISPRi) knockdown mutants (Peters et al., 2016). In each of these mutants, an essential gene is targeted by CRISPRi to reduce its expression 3-fold. A strain with
reduced levels of the SCH-79797 target should be sensitized to sub-lethal doses of SCH-79797. Given the thermal profiling result, we focused on mutants in the folate biosynthesis pathway (schematized in Figure 4A). As a negative control, we confirmed that CRISPRi knockdowns of genes unrelated to folate metabolism are not sensitized to SCH-79797 (Figure S4). As a positive control for our assay, we again utilized trimethoprim. We confirmed that dihydrofolate reductase (dfrA in B. subtilis) and dihydrofolate synthase (folC, an enzyme that acts upstream of DfrA) knockdowns are hypersensitive to trimethoprim, while knockdowns of enzymes that function downstream of DHFR, folD and glyA, are not (Figure 4B). SCH-79797 exhibited the same genetic sensitivity pattern as trimethoprim in that both dfrA and folC, but not folD and glyA knockdowns, were sensitized to SCH-79797 (Figure 4B).

SCH-79797 Inhibits DHFR Activity in Cells and in Purified Enzymatic Assays
To determine how SCH-79797 affects folate metabolism in living cells, we used mass spectrometry to measure the relative abundance of folate metabolite pools in E. coli NCM3722 treated with SCH-79797. E. coli NCM3722 was used because these bacteria lack mutations that disrupt primary metabolism in other lab strains of E. coli (Soupene et al., 2003). E. coli NCM3722 cells were grown in Gutnick Minimal Media and treated with 13.9 μg/mL SCH-79797 (1× MIC) for 15 min (Kwon et al., 2008, 2010). In response to SCH-79797 treatment, the levels of the DHFR substrate, 7,8-dihydrofolate (DHF), rose 10-fold compared to untreated cells, while the levels of folate metabolites downstream of DHF dropped significantly (Figure 4C). This metabolic response is characteristic of dihydrofolate reductase (FolA in E. coli) inhibition as we observed a similar pattern upon treatment with trimethoprim, a known inhibitor of DHFR (Figure 4C) (Gleckman et al., 1981).
To determine whether SCH-79797 inhibits DHFR directly, we obtained purified E. coli FolA protein and measured its enzymatic activity in the presence of increasing concentrations of SCH-79797. We found that SCH-79797 has an IC50 of 8.6 ± 3 μM against FolA (Figure 4D). We also measured the initial velocity of FolA activity at various DHF substrate concentrations to establish if SCH-79797 acts competitively or non-competitively. A Michaelis-Menten fit to the data demonstrated that 8.6 μM SCH-79797 (the IC50) increases the Km from 32 ± 25 μM to 100 ± 80 μM. These results indicate that SCH-79797 functions at least partially as a competitive inhibitor of FolA’s activity on its DHF substrate (Figure 4E). Likewise, the Michaelis-Menten fits for the established FolA inhibitor trimethoprim show a very similar effect IC50 = 15 ± 4 nM (Figure 4D), consistent with previous measurements of the tight binding between trimethoprim and E. coli FolA (Cammarata et al., 2017).

SCH-79797 Also Disrupts Bacterial Membrane Potential and Permeability Barrier
The similarities between SCH-79797 and trimethoprim with respect to FolA inhibition helped confirm DHFR as a target of SCH-79797 but were also surprising because these two compounds did not generate similar profiles in our BCP analysis (Figure 2B). One potential explanation is that SCH-79797 has additional targets that are not shared with trimethoprim. If this was the case, we would expect that cells resistant to trimethoprim would still be susceptible to SCH-79797. Previous studies demonstrated that resistance to trimethoprim can be achieved by deleting thyA and supplementing the media with thymine (Amyes and Smith, 1975). We confirmed that deleting thyA from E. coli lptD4213 in the presence of excess thymine led to trimethoprim resistance (Figure 5A). We also found that thymine supplementation decreased the sensitivity of E. coli lptD4213 to SCH-79797 (comparing “WT” of Figure 5A to Figure 1B; Table S1). The findings that reducing cellular dependence on DHFR activity (by thymine supplementation) makes cells less sensitive to SCH-79797 treatment, while increasing reliance on DHFR activity (by dfrA CRISPRi) makes cells more sensitive to SCH-79797, together indicate that DHFR inhibition is a physiologically important target of SCH-79797. Nevertheless, comparing cells with and without thyA in the presence of thymine showed no change in sensitivity to SCH-79797 (Figure 5A), suggesting that SCH-79797 is likely to have a second, folate-independent MoA.
To obtain clues about the potential additional MoA of SCH-79797, we revisited our fluorescent BCP images of E. coli lptD4213 cells treated with SCH-79797. We observed SYTOX Green staining in some of the bacteria (Figure 2A), suggesting that SCH-79797 compromises the integrity of the bacterial membrane. To directly quantify the effect of SCH-79797 on bacterial membrane integrity, we used flow cytometry to measure the membrane potential and permeability of E. coli lptD4213 in the presence of the fluorescent dyes, DiOC2(3) and TO-PRO-3. DiOC2(3) is a cationic dye that accumulates in the cytoplasm of cells with an active membrane potential and shifts its fluorescence from red to green in these cells, providing a measure of membrane potential (Figure 5B) (Novo et al., 1999). TO-PRO-3 is a nucleic acid stain that only accumulates in cells with compromised membranes, providing an independent measure of membrane permeability (Figure 5B) (Novo et al., 2000). As positive controls, we observed the expected shifts in both DiOC2(3) and TO-PRO-3 staining using CCCP, a membrane-decoupler that affects membrane potential but not permeability (Novo et al., 2000), and two compounds that disrupt both membrane potential and permeability: nisin, a pore-forming antibacterial peptide (Prince et al., 2016; Wiedemann et al., 2001), and polymyxin B, a small lipopeptide membrane destabilizer (Warren et al., 1957) (Figure 5C). As negative controls, we confirmed that antibiotics that do not target the membrane, including ampicillin, rifampicin, and novobiocin, do not shift DiOC2(3) or TO-PRO-3 staining (Figures S5A–S5D). After 15 min of treatment with SCH-79797, subsequent DiOC2(3) and TO-PRO-3 staining revealed significant defects in both membrane polarization and permeability (Figure 5C). These effects on the membrane are not secondary consequences of DHFR inhibition, as trimethoprim-treated E. coli showed no significant changes in DiOC2(3) and TO-PRO-3 staining (Figure 5C). The membrane-targeting effect of SCH-79797 is also not species-specific, as similar results were seen with SCH-79797-treated B. subtilis 168 (Figures S5E–S5G). These findings indicate that independent of its ability to inhibit DHFR activity, SCH-79797 disrupts both membrane potential and permeability barrier.
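To make the enzymology reasoning in this section concrete, the following sketch shows how a Km shift translates into an inhibition constant if one assumes purely competitive inhibition, where the apparent Km equals Km × (1 + [I]/Ki). The text only claims that SCH-79797 is "at least partially" competitive, so this is an illustrative simplification. The function names are hypothetical, and the calculation uses the reported point estimates (Km of 32 rising to 100 at an inhibitor concentration of 8.6, all in the same concentration unit) while ignoring their large error bars.

```python
def apparent_km(km, inhibitor_conc, ki):
    """Apparent Km under purely competitive inhibition: Km * (1 + [I] / Ki)."""
    return km * (1.0 + inhibitor_conc / ki)


def ki_from_km_shift(km, km_apparent, inhibitor_conc):
    """Invert the relation above to estimate Ki from an observed Km shift."""
    if km_apparent <= km:
        raise ValueError("a competitive inhibitor must raise the apparent Km")
    return inhibitor_conc / (km_apparent / km - 1.0)


# Point estimates as reported in the text; all three share one concentration unit.
km, km_app, inhibitor = 32.0, 100.0, 8.6
ki = ki_from_km_shift(km, km_app, inhibitor)
print(round(ki, 2))
```

Given the uncertainty on both Km values, any Ki obtained this way should be read as an order-of-magnitude estimate only.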
Figure 5. SCH-79797 Is Distinct from Other Dihydrofolate Reductase Inhibitors and Disrupts Membrane Integrity
(A) The growth of wild-type (WT) and ΔthyA E. coli lptD4213 relative to a DMSO-treated control after SCH-79797 and trimethoprim treatment. Bacterial growth
was measured for 14 h and the final OD600 of each condition was plotted against drug concentration. Each data point represents 2 biological replicates. Mean ±
SD are shown.
(B) Schematic of flow cytometry data showing the expected results for each class of polarized, depolarized, permeable, and impermeable bacteria.
(C) Flow cytometry analysis of the membrane potential and permeability of E. coli lptD4213 cells after 15 min incubation with 1% DMSO (solvent control), 5 μM
CCCP, 25 μg/mL nisin (2× MIC), 0.8 μg/mL polymyxin B (2× MIC), 12.5 μg/mL SCH-79797 (2× MIC), or 2 μg/mL trimethoprim (10× MIC). The limits for the
depolarized region were defined by comparing the values in the CCCP and solvent only controls. The limits for the permeabilized region were defined by
comparing the nisin and solvent only controls.
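The control-based gating described in (C) can be sketched as follows. The caption states only that the depolarized and permeabilized regions were defined by comparing controls, so the midpoint-between-medians rule, the function names, and all signal values below are assumptions chosen for illustration, not the authors' procedure.

```python
from statistics import median


def midpoint_gate(control_a, control_b):
    """Place a gate halfway between the medians of two control populations."""
    return (median(control_a) + median(control_b)) / 2.0


def classify(cells, depol_gate, perm_gate):
    """Label each cell as (depolarized?, permeabilized?).

    cells: list of (potential_signal, topro3_signal) tuples.
    Assumes a lower potential signal means depolarized and a higher
    TO-PRO-3 signal means permeabilized.
    """
    return [(pot < depol_gate, perm > perm_gate) for pot, perm in cells]


# Hypothetical per-cell signals (arbitrary units) for the controls.
solvent_potential = [9.0, 10.0, 11.0]
cccp_potential = [1.0, 2.0, 3.0]
solvent_topro3 = [1.0, 1.5, 2.0]
nisin_topro3 = [8.0, 9.0, 10.0]

depol_gate = midpoint_gate(solvent_potential, cccp_potential)
perm_gate = midpoint_gate(solvent_topro3, nisin_topro3)

treated = [(2.5, 9.5), (9.5, 1.2), (2.0, 1.0)]
print(classify(treated, depol_gate, perm_gate))
```

In this toy data the third cell is depolarized but not permeabilized, the pattern the text attributes to CCCP.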
SCH-79797 Treatment Can Kill Bacteria in Contexts Where Combination Therapy Fails
Having established that SCH-79797 disrupts both folate metabolism and membrane integrity, we sought to determine if these two targets can together explain how SCH-79797 kills bacteria. To address this question, we used BCP analysis to compare the cell morphology of bacteria treated with SCH-79797 to that of bacteria treated with a combination of two different antibiotics, one of which targets DHFR and one of which targets membrane integrity. Qualitative inspection suggested that SCH-79797-treated E. coli appeared similar to E. coli lptD4213 cells treated with a combination of trimethoprim and nisin (Figure 6A). Quantification of the images confirmed that SCH-79797 clusters with the co-treatment of trimethoprim and nisin (Figure 6A). Co-treatment with polymyxin B and trimethoprim similarly clustered with SCH-79797, suggesting that this effect is due to membrane perturbation and not specific to the complex MoA of nisin (Hasper et al., 2006; Prince et al., 2016; Wiedemann et al., 2001)
Figure 6. SCH-79797 Mimics Co-treatment with Folate Metabolism and Membrane Integrity Disruptors but Can Be More Effective Than Their
Combination
(A) BCP analysis of E. coli lptD4213 cells after 30 min of treatment with 1% DMSO, 6.3 μg/mL SCH-79797 (1× MIC), 2 μg/mL trimethoprim (10× MIC), 25 μg/mL
nisin (2× MIC), or the combination of 2 μg/mL trimethoprim (10× MIC) and 25 μg/mL nisin (2× MIC). Cytological profiles were clustered by the first three principal
components that account for at least 90% of the variance between samples. Cells were stained with DAPI, FM4-64, and SYTOX Green. Shown here are the
merged images of DAPI (blue) and FM4-64 (red). Scale bar, 2 μm.
(B) The viability of E. coli lptD4213 cells measured in CFU mL⁻¹ after 2 h of treatment with 1% DMSO (solvent control), 2 μg/mL trimethoprim (10× MIC), 25 μg/mL
nisin (2× MIC), the combination of 2 μg/mL trimethoprim (10× MIC) and 25 μg/mL nisin (2× MIC), 0.8 μg/mL polymyxin B (2× MIC), the combination of 2 μg/mL
trimethoprim (10× MIC) and 0.8 μg/mL polymyxin B (2× MIC), or 3.1 μg/mL SCH-79797 (1× MIC). Each bar represents 3 biological replicates. Mean ± SD
are shown.
(C) Viability of S. aureus MRSA USA300 persister cells measured in CFU mL⁻¹ after 2 h of treatment with 1% DMSO (solvent control), 63 μg/mL trimethoprim (10×
MIC), 100 μg/mL nisin (2× MIC), the combination of 63 μg/mL trimethoprim (10× MIC) and 50 μg/mL nisin (2× MIC), 63 μg/mL daptomycin (2× MIC), the
combination of 63 μg/mL trimethoprim (10× MIC) and 63 μg/mL daptomycin (2× MIC), or 6.3 μg/mL SCH-79797 (1× MIC). Each bar represents 3 biological
replicates. Mean ± SD are shown.
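The criterion in (A), keeping the leading principal components that together account for at least 90% of the variance, can be sketched as a cumulative explained-variance cutoff. The eigenvalues below are hypothetical and the function name is an assumption; the sketch shows only the selection rule, not the imaging pipeline that produced the features.

```python
def components_for_variance(eigenvalues, threshold=0.90):
    """Smallest number of leading principal components whose cumulative
    explained variance reaches `threshold` (a fraction between 0 and 1)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues of the covariance matrix of normalized cell features.
eigs = [5.2, 2.6, 1.4, 0.5, 0.2, 0.1]
print(components_for_variance(eigs))  # 3
```

With these made-up eigenvalues the first two components explain 78% of the variance and the first three explain 92%, so three components are kept.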
Figure 7. Derivatives of SCH-79797 Show Increased Potency and the Ability to Help Clear Infection in Mouse Vaginal N. gonorrhoeae Model
(A) Structures of SCH-79797, the pyrroloquinazolinediamine core lacking the side chains (IRS-10), and the pyrroloquinazolinediamine core with a biphenyl
decoration (IRS-16).
(B) The MICs of SCH-79797, IRS-10, and IRS-16 against a few selected species. For the MICs against additional strains, see Table S1.
(C) The growth of CRISPRi B. subtilis knockdown mutants involved in folate synthesis relative to a DMSO-treated control after treatment with IRS-10 or IRS-16.
Bacterial growth was measured for 14 h and the final optical density (OD600) of each condition was plotted against drug concentration. Each data point represents
2 biological replicates. Mean ± SD are shown.
(D) Flow cytometry analysis of the membrane potential and permeability of E. coli lptD4213 cells after 15 min incubation with 0.4 μg/mL IRS-10 (1× MIC) or
0.02 μg/mL IRS-16 (1× MIC).
(E) Therapeutic index of SCH-79797 and IRS-16 was calculated by dividing the MIC of each drug for the indicated mammalian cell line by its MIC against E. coli
lptD4213. The MIC of IRS-16 against PBMC was greater than the maximal drug concentrations tested.
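The therapeutic index defined in (E) is a simple ratio, sketched below. The function name and the numbers are hypothetical and are not values from this study. When the mammalian MIC exceeds the maximal concentration tested, as noted for IRS-16 against PBMC, substituting that maximal concentration yields only a lower bound on the index.

```python
def therapeutic_index(mic_mammalian, mic_bacterial):
    """Ratio of the mammalian-cell MIC to the bacterial MIC (same units)."""
    if mic_bacterial <= 0:
        raise ValueError("bacterial MIC must be positive")
    return mic_mammalian / mic_bacterial


# Hypothetical MICs in matching units; a larger ratio means a wider safety margin.
print(therapeutic_index(mic_mammalian=50.0, mic_bacterial=5.0))  # 10.0
```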
(Figure S6). The fact that SCH-79797 clusters more closely to the that the pyrroloquinazolinediamine core is sufficient to target
co-treatments than to the individual treatments with trimetho- DHFR (Figure 7C). However, unlike SCH-79797, DiOC2(3) and
prim or nisin/polymyxin B reinforces the conclusion that SCH- TO-PRO-3 staining showed that IRS-10 does not disrupt mem-
79797 kills bacteria by targeting both DHFR and the membrane. brane polarity or permeability (Figure 7D). We also confirmed
There are no other antibiotics that have been shown to target that IRS-10 directly inhibits the enzymatic activity of purified
both folate metabolism and membrane integrity, indicating that E. coli FolA (Figures 4D and 4E). IRS-10 proved to be a more
SCH-79797 represents an antibiotic with a unique MoA. This potent inhibitor of FolA than SCH-79797 (IRS-10 IC50 = 65 ±
result also explains why SCH-79797 failed to cluster with any 19 nM) (Figure 4D), suggesting that its increased efficacy against
of the known antibiotics in our BCP analysis (Figure 2B). E. coli in cell culture can be explained by increased activity to-
Combination antibiotic therapy has been suggested as a po- ward DHFR. Together, these findings suggest that the pyrrolo-
tential means of circumventing the rise of antibiotic resistance quinazolinediamine core of SCH-79797 targets DHFR.
(Tamma et al., 2012; Tyers and Wright, 2019) but it has remained We next sought to determine if the isopropylbenzene side
unclear whether it is better to combine multiple activities on the group is responsible for the membrane-targeting properties of
same molecule. To probe this issue, we measured the synergy of SCH-79797. We thus obtained isopropylbenzene alone (also
co-treatment with one antibiotic targeting dihydrofolate reduc- known as cumene) (Figure S7A) and determined its effects on
tase and another targeting the membrane and compared their membrane integrity and folate biosynthesis. DiOC2(3) and TO-
combined effectiveness to that of SCH-79797. Interestingly, PRO-3 staining showed that isopropylbenzene disrupts both
when E. coli lptD4213 cells were co-treated with trimethoprim membrane polarity and permeability (Figure S7B). Meanwhile,
and nisin, or co-treated with trimethoprim and polymyxin B, the reduction of dfrA or folC levels by CRISPRi had no effect on
two antibiotics antagonized one another’s activity, resulting in the sensitivity of bacteria to isopropylbenzene (Figure S7C).
a greater number of viable cells remaining after 2 h of co-treat- These results support the conclusion that SCH-79797 is a
ment (Figure 6B). MRSA USA300 persister cells (Kim et al., dual-targeting compound where the pyrroloquinazolinediamine
2018) are resistant to treatment with the membrane-disrupting core specifically targets folate metabolism while the isopropyl-
daptomycin (Chen et al., 2014; Taylor and Palmer, 2016). Treat- benzene group specifically targets membrane integrity.
ing MRSA USA300 persister cells with 63 μg/mL daptomycin (1× As a further test of whether the hydrophobic isopropylbenzene
MIC) for 2 h did not reduce the number of CFUs remaining in the chain functions to target the membrane, we generated another
culture (Figure 6C). However, SCH-79797 treatment of MRSA small molecule, Irresistin-16 (IRS-16), in which we decorated
robustly killed these persister cells while the combination of the pyrroloquinazolinediamine core with a biphenyl group that
trimethoprim with either nisin or daptomycin could not (Fig- is even more hydrophobic than isopropylbenzene (Figure 7A).
ure 6C). These results suggest that the combination of two As predicted from the inclusion of both folate-targeting and
different antibacterial activities on the same molecular scaffold membrane-targeting moieties, dfrA and folC CRISPRi mutants
can, at least in the case of SCH-79797, produce a more potent proved hypersensitive to IRS-16 and IRS-16 disrupted mem-
antibacterial effect than co-treating with two antibiotics with brane permeability and polarity by DiOC2(3) and TO-PRO-3
the two separate targeting activities. staining (Figures 7C and 7D). IRS-16 also had more potent anti-
biotic activity than IRS-10 against most bacteria tested (Fig-
The Chemical Basis of the Two MoAs of SCH-79797 ure 7B; Table S1), suggesting that targeting both membrane
SCH-79797 consists of a pyrroloquinazolinediamine core that is integrity and folate biosynthesis is more powerful than targeting
substituted with an isopropylphenyl group on one side and a cy- folate metabolism alone.
clopropyl moiety on the other. In order to test the function of the
pyrroloquinazolinediamine core on the antibiotic activity of SCH- The SCH-79797 Derivative IRS-16 Is Efficacious in a
79797, we synthesized a derivative of SCH-79797 (Irresistin-10 Mouse Infection Model
or IRS-10) that lacks both side groups (Figure 7A). When An effective antibiotic needs to be able to target pathogenic bac-
compared to the parent molecule SCH-79797, removing the iso- teria without killing mammalian hosts. To determine the concen-
propylphenyl and cyclopropyl groups increased the potency trations required to inhibit the growth of mammalian cells, we
against E. coli lptD4213 but decreased the potency against treated several mammalian cell lines with both SCH-79797 and
B. subtilis 168, MRSA USA300, and A. baumannii AB17978 (Fig- IRS-16, the derivative with the most potent antibiotic activity.
ure 7B; Table S1). To determine whether the pyrroloquinazoline- SCH-79797 showed promising results with PBMC cells, as
diamine core of SCH-79797 is specifically involved in targeting they required more than 10-fold higher doses for growth inhibi-
folate metabolism or membrane integrity, we assessed the activ- tion than those required for killing E. coli lptD4213 (Figure 7E).
ity of IRS-10 using the dfrA and folC CRISPRi hypersensitivity However, SCH-79797 inhibited the growth of other mammalian
assay and the quantitative flow cytometry membrane integrity cell lines, including HK-2, HEK293, and HLF, at doses compara-
assay. The CRISPRi hypersensitivity assay indicated that IRS- ble to the doses needed to kill bacteria. In contrast, IRS-16 killed
10 maintains the ability to inhibit folate metabolism, suggesting bacteria at 100–1,000-fold lower doses than those required to
(F) The stability of IRS-16 was measured following incubation with mouse liver microsomes. Each data point represents 2 biological replicates. Mean ± SD
are shown.
(G) Treatment of mice with IRS-16 (10 mg/kg, i.v., twice a day [b.i.d.]) reduces the vaginal burden of N. gonorrhoeae 24 h after inoculation. p value from one-
factor ANOVA.
affect mammalian cells in all cell lines tested (Figure 7E). In light tential target for SCH-79797, the bacterial membrane. Quantita-
of this larger therapeutic window, we focused our further in vivo tive flow cytometry with dyes that report on membrane
analysis efforts on IRS-16. permeability and polarity confirmed that SCH-79797 has a
Because IRS-16 preferentially killed bacteria in culture folate-independent effect on bacterial membrane integrity (Fig-
models, we proceeded to characterize its in vivo effects on ure 5C). Together, these assays constitute a pipeline that can
mice. We determined the maximal tolerated dose (MTD) of be used in the future to rapidly characterize antibiotic MoAs de
IRS-16 to be 15 mg/kg when administered intravenously. A phar- novo. Such a pipeline is especially important for compounds
macokinetic analysis revealed that at this MTD, the plasma con- such as SCH-79797 that are not prone to resistance and do
centration of IRS-16 peaked at 1.4 mg/mL with a half-life of 15.8 h not mimic known MoAs. BCP, thermal proteome profiling, me-
(Figures S7D and S7E). Consistent with this robust in vivo stabil- tabolomics, CRISPRi sensitivity, and flow cytometry are all as-
ity, a mouse liver microsome study showed that IRS-16 is says that can be performed in small volumes, such that they
extremely stable as compared to a control drug, verapamil can be readily scaled without the need for synthesizing large
(Figure 7F). amounts of the compound in question. The orthogonal nature
Finally, we determined whether IRS-16 has antibiotic activity in of the assays enables the independent identification of multiple
a mouse bacterial infection model. For this purpose, we focused MoAs, which may help in the discovery of unique antibiotic
on N. gonorrhoeae as it is a Gram-negative pathogen for which classes.
there is an acute need for new antibiotics due to widespread Both of the targets of SCH-79797 are relevant for its function
resistance toward existing drugs (CDC, 2019). There is also as an antibiotic. The CRISPRi and metabolomic studies
a well-validated mouse vaginal infection model for demonstrate that SCH-79797 actively disrupts folate meta-
N. gonorrhoeae (Jerse et al., 2002; Song et al., 2008). In bacterial bolism in multiple bacterial species in a manner that is rate-
culture, IRS-16 showed robust activity toward N. gonorrhoeae, limiting for growth (Figures 4B and 4C). Meanwhile, the flow cy-
with an MIC of 0.03 mg/mL (Table S1). Our pharmacokinetic anal- tometry assay demonstrates that SCH-79797 simultaneously
ysis indicated that IRS-16 should persist in mice at concentra- disrupts membrane integrity even though folate inhibition itself
tions above this MIC for nearly 48 h (Figures S7D and S7E). To has no effect on the membrane (Figure 5C). The ability of SCH-
test its in vivo efficacy, we inoculated the vaginal tracts of 79797 to disrupt membrane integrity is particularly interesting
BALB/c mice with 1.85 × 10⁶ CFU/mouse of N. gonorrhoeae given that membrane-disruptors are often selective for either
ATCC 700825, treated with intravenous (i.v.) doses of either Gram-positive or Gram-negative bacteria (Ling et al., 2015;
10 mg/kg IRS-16 or vehicle control at 2 and 14 h post-infection, Taylor and Palmer, 2016; Warren et al., 1957), while SCH-
and assayed vaginal N. gonorrhoeae CFUs at 26 h post-infec- 79797 proved potent against both Gram-positive pathogens
tion. IRS-16 significantly reduced the vaginal load of like S. aureus and E. faecalis as well as Gram-negative patho-
N. gonorrhoeae (p < 0.05) as compared to the vehicle control gens like A. baumannii, N. gonorrhoeae, and pathogenic
(Figure 7G). Consistent with its favorable therapeutic index and E. coli (Figure 1A). Host toxicity is often a concern for mem-
pharmacokinetic profile, this result confirms that IRS-16 can brane-targeting antibiotics, and while SCH-79797 was well
function as an effective antibiotic in an in vivo mouse gonorrhea tolerated by some animal cells like G. mellonella wax worms
infection model. and PBMC cells, it killed other mammalian cell lines at doses
similar to those at which it functions as an antibiotic. Mean-
DISCUSSION while, IRS-16, a derivative of SCH-79797, increased antibiotic
activity without increasing mammalian toxicity, thereby
Due to the rise in resistance to known antibiotics, there is an increasing its therapeutic window >100-fold. The ability of
acute need for new antibiotics with the key features of having IRS-16 to selectively target bacteria is consistent with a recent
unique MoAs, potency toward Gram-negatives, and reduced study of retinoid derivatives that provided proof-of-principle
susceptibility to resistance. Here, we describe a promising com- that small molecules can preferentially target bacterial mem-
pound, SCH-79797, and its derivative, IRS-16, that are effective branes (Kim et al., 2018). Future biophysical characterization
in animals and address these key criteria with a unique dual-tar- and medicinal chemistry will help to further increase potency
geting MoA, the ability to kill both Gram-negative and Gram-pos- and reduce toxicity.
itive pathogens, and an undetectably low frequency of The undetectably low frequency of resistance to SCH-79797
resistance. We also describe a systems-level pipeline that com- could result from its two distinct targets. Specifically, we were
bines independent orthogonal approaches to characterize the successful in isolating resistance mutants for mimics of each
MoA of SCH-79797 in the absence of resistant mutants. Specif- of its two individual targets, trimethoprim and nisin, but not for
ically, we used BCP classification to categorize the MoA of SCH- SCH-79797 (Figure 1E). The average mutation rate in E. coli is
79797 as distinct from those of 37 known antibiotics (Figure 2B). 2.1 × 10⁻⁷ per gene per generation (Chen and Zhang, 2013).
We then used thermal proteome profiling to identify DHFR as a If E. coli required 2 mutations to acquire resistance to
candidate binding partner of SCH-79797 and confirmed that SCH-79797, the number of bacteria that would be necessary
SCH-79797 inhibits folate metabolism through metabolomic to find a resistant mutant would be in the range of 10¹⁴. Even if
analysis and CRISPRi genetic hypersensitivity (Figures 3B, 4B, that represents an overestimate, humans are estimated to carry
and 4C). We confirmed that SCH-79797 directly inhibits DHFR roughly 4 × 10¹³ bacteria in total, so such low frequencies of
activity by acting competitively toward its DHF substrate (Fig- resistance would be unlikely to result in resistant mutants in a
ures 4D and 4E). The BCP images also alerted us to a second po- clinical context.
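The two-mutation resistance estimate in the Discussion can be checked with simple arithmetic. The sketch below assumes, as a simplification not stated in the text, that both mutations arise independently at the cited per-gene rate, so the double-mutant frequency is the square of that rate. The function and constant names are illustrative only.

```python
# Back-of-the-envelope check of the two-mutation resistance estimate.
# Assumption (ours): both mutations arise independently at the same per-gene rate.

PER_GENE_MUTATION_RATE = 2.1e-7  # per gene per generation (Chen and Zhang, 2013)
HUMAN_BACTERIAL_LOAD = 4e13      # approximate number of bacteria carried by one human


def multi_mutant_frequency(rate: float, n_mutations: int = 2) -> float:
    """Probability that a single cell acquires n independent mutations."""
    return rate ** n_mutations


def cells_per_expected_mutant(rate: float, n_mutations: int = 2) -> float:
    """Number of cells needed to expect one multi-mutant."""
    return 1.0 / multi_mutant_frequency(rate, n_mutations)


if __name__ == "__main__":
    needed = cells_per_expected_mutant(PER_GENE_MUTATION_RATE)
    print(f"double-mutant frequency: {multi_mutant_frequency(PER_GENE_MUTATION_RATE):.2e}")
    print(f"cells per expected double mutant: {needed:.2e}")
    print(f"ratio to one human's bacterial load: {needed / HUMAN_BACTERIAL_LOAD:.2f}")
```

Under these assumptions the requirement comes out between 10¹³ and 10¹⁴ cells, which is comparable to the total bacterial load of a single person and is in line with the argument that double-resistant mutants should be rare in a clinical setting.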
Our studies suggest that SCH-79797 is more potent than com- B Materials Availability
bination treatment with antibiotics that mimic its two activities, the B Data and Code Availability
DHFR-inhibitor trimethoprim and the membrane-disruptor nisin. B Bacterial Cytological Profiling Code
Similarly, co-treatment with trimethoprim and polymyxin B B Thermal Proteome Profiling Data
showed antagonistic interactions (Figure 6B), while MRSA B Flow Cytometry Data
persister cells were killed by SCH-79797 but not by combined B Additional Raw Data
treatment with trimethoprim and daptomycin (Figure 6C). A poten- d EXPERIMENTAL MODEL AND SUBJECT DETAILS
tial explanation for the potency of SCH-79797 is that recruiting a B Bacterial strains and growth conditions
DHFR inhibitor to the membrane could increase its effective con- B Mammalian cell lines
centration or potentiate its inactivation of DHFR by sequestering it. B Animal models
Permeabilizing the membrane could also enhance the access of d METHOD DETAILS
SCH-79797 to its cytoplasmic DHFR target. The difference be- B Minimum inhibitory concentration assays
tween SCH-79797 and the combination treatments could also B Compound library
be based on non-primary target effects such as differences in B Galleria mellonella killing assay
localized synergistic drug concentrations, drug uptake or efflux. B Colony forming units assay
Membrane-targeting molecules can act either synergistically or B S. aureus MRSA persister cell assay
antagonistically with antibiotics with different MoAs (Brochado B Serial passaging assay to evolve resistance
et al., 2018). Because trimethoprim and various membrane dis- B Bacterial cytological profiling
ruptors antagonize each other separately, but DHFR inhibition B Thermal proteome profiling
and membrane disruption synergize in the context of SCH- B Metabolomics
79797, combining antibiotic activities onto the same molecule B Dihydrofolate reductase activity assay
could present a solution for bypassing this antagonistic effect. In B Membrane potential and permeability assay
any event, our results suggest that despite the promise of combi- B Mammalian cell cytotoxicity
nation antibiotic therapies (Brochado et al., 2018; Tyers and B Mouse liver microsomal stability
Wright, 2019), an even more powerful approach could be to B Pharmacokinetic analysis
combine different targeting moieties onto the same chemical B Neisseria gonorrhea vaginal infection model
scaffold. d QUANTIFICATION AND STATISTICAL ANALYSIS
Discovering the MoA of SCH-79797 also enabled us to design B Center, spread, and statistical significance
derivatives that improve its efficacy. Our most promising deriva- B Bacterial cytological profiling
tive currently is IRS-16, in which we replaced the isopropyl- B Thermal proteome profiling
phenyl group with a biphenyl group (with the idea to increase B Flow cytometry analysis
the membrane-targeting activity) and removed the cyclopropyl B Pharmacokinetic analysis
side chain (to enhance the DHFR inhibition). IRS-16 showed
improved ability to kill bacteria, with significantly lower MICs SUPPLEMENTAL INFORMATION
than SCH-79797. More importantly, IRS-16 did not similarly
Supplemental Information can be found online at https://doi.org/10.1016/j.
enhance the growth inhibition of mammalian cells. Thus, IRS-
cell.2020.05.005.
16 exhibited a promising therapeutic index, as reducing the con-
centration necessary to kill bacteria without affecting the con- ACKNOWLEDGMENTS
centration necessary to kill mammalian cells resulted in a com-
pound that is >100-fold more potent toward bacteria than The B. subtilis CRISPR knockdown library was a kind gift from Jason M. Pe-
hosts. IRS-16 was also stable in mice and tolerated at doses ters. Flow cytometry was performed in collaboration with Christina DeCoste
(Princeton University Flow Cytometry Resource Facility [FCRF]). We appre-
significantly above the MIC for several hours. Finally, we
ciate the support and feedback from lab members in the Gitai and Shaevitz
confirmed that IRS-16 significantly reduced the burden of labs. Funding was provided in part by NIH (DP1AI124669 to Z.G., J.P.S.,
N. gonorrhoeae in a mouse vaginal infection model. B.P.B., and J.K.M. and T32 GM007388 to J.K.M. and G.M.M.), as well as
N. gonorrhoeae is a Gram-negative pathogen with some of the Princeton DFR Innovation Funds for New Ideas in Science (to J.K.M. and
highest rates of drug resistance for any pathogen. The acute J.P.S.). Additional funding provided by the National Science Foundation
need for new antibiotics to treat N. gonorrhoeae makes IRS-16 (NSF PHY-1734030 to B.P.B.) and for the FCRF by the National Cancer Insti-
tute (NCI-CCSG P30CA072720-5921). The opinions, findings, and conclu-
a particularly promising small molecule candidate for future
sions or recommendations expressed in this material contents are solely the
development. responsibility of the authors and do not necessarily represent the official views
of the NIH or the National Science Foundation.
STAR★METHODS
AUTHOR CONTRIBUTIONS
Detailed methods are provided in the online version of this paper Conceptualization, Z.G., J.K.M., M.Z.W., and H.K.; Methodology, Z.G., B.P.B.,
and include the following: M.Z.W., A.M., A.T., M.M.S., J.R., S.H.-J.L., and H.K.; Software, M.Z.W.,
J.K.M., B.P.B., A.T., and M.M.S.; Validation, J.P.S.; Formal Analysis, B.P.B.
d KEY RESOURCES TABLE and G.M.M.; Investigation, M.Z.W., J.K.M., J.P.S., G.M.M., A.M., A.T.,
d RESOURCE AVAILABILITY M.M.S., S.H.-J.L., and B.P.B.; Resources, J.P.S. and M.Z.W.; Writing – Orig-
B Lead Contact inal Draft, Z.G. and J.K.M.; Writing – Reviewing & Editing, Z.G., B.P.B.,
ll
Article
membrane permeability, and bacterial counts of Staphylococcus aureus and Micrococcus luteus. Antimicrob. Agents Chemother. 44, 827–834.
O’Neill, J. (2014). AMR Review Paper–Tackling a Crisis for the Health and Wealth of Nations (AMR Review Paper).
Peleg, A.Y., Jara, S., Monga, D., Eliopoulos, G.M., Moellering, R.C., Jr., and Mylonakis, E. (2009). Galleria mellonella as a model system to study Acinetobacter baumannii pathogenesis and therapeutics. Antimicrob. Agents Chemother. 53, 2605–2609.
Peters, J.M., Colavin, A., Shi, H., Czarny, T.L., Larson, M.H., Wong, S., Hawkins, J.S., Lu, C.H.S., Koo, B.-M.M., Marta, E., et al. (2016). A Comprehensive, CRISPR-based Functional Analysis of Essential Genes in Bacteria. Cell 165, 1493–1506.
Prince, A., Sandhu, P., Ror, P., Dash, E., Sharma, S., Arakha, M., Jha, S., Akhter, Y., and Saleem, M. (2016). Lipid-II Independent Antimicrobial Mechanism of Nisin Depends On Its Crowding And Degree Of Oligomerization. Sci. Rep. 6, 37908.
Randall, L.B., Georgi, E., Genzel, G.H., and Schweizer, H.P. (2016). Finafloxacin overcomes Burkholderia pseudomallei efflux-mediated fluoroquinolone resistance. J. Antimicrob. Chemother. 72, 1258–1260.
Ruiz, N., Kahne, D., and Silhavy, T.J. (2006). Advances in understanding bacterial outer-membrane biogenesis. Nat. Rev. Microbiol. 4, 57–66.
Savitski, M.M., Reinhard, F.B.M., Franken, H., Werner, T., Savitski, M.F., Eberhard, D., Martinez Molina, D., Jafari, R., Dovega, R.B., Klaeger, S., et al. (2014). Tracking cancer drugs in living cells by thermal profiling of the proteome. Science 346, 1255784.
Song, W., Condron, S., Mocca, B.T., Veit, S.J., Hill, D., Abbas, A., and Jerse, A.E. (2008). Local and humoral immune responses against primary and repeat Neisseria gonorrhoeae genital tract infections of 17beta-estradiol-treated mice. Vaccine 26, 5741–5751.
Soupene, E., van Heeswijk, W.C., Plumbridge, J., Stewart, V., Bertenthal, D., Lee, H., Prasad, G., Paliy, O., Charernnoppakul, P., and Kustu, S. (2003). Physiological studies of Escherichia coli strain MG1655: growth defects and apparent cross-regulation of gene expression. J. Bacteriol. 185, 5611–5626.
Strande, J.L., Hsu, A., Su, J., Fu, X., Gross, G.J., and Baker, J.E. (2007). SCH 79797, a selective PAR1 antagonist, limits myocardial ischemia/reperfusion injury in rat hearts. Basic Res. Cardiol. 102, 350–358.
Tamma, P.D., Cosgrove, S.E., and Maragakis, L.L. (2012). Combination therapy for treatment of infections with gram-negative bacteria. Clin. Microbiol. Rev. 25, 450–470.
Taylor, S.D., and Palmer, M. (2016). The action mechanism of daptomycin. Bioorg. Med. Chem. 24, 6253–6268.
Tenover, F.C., and Goering, R.V. (2009). Methicillin-resistant Staphylococcus aureus strain USA300: origin and epidemiology. J. Antimicrob. Chemother. 64, 441–446.
Tyers, M., and Wright, G.D. (2019). Drug combinations: a strategy to extend the life of antibiotics in the 21st century. Nat. Rev. Microbiol. 17, 141–155.
Ursell, T., Lee, T.K., Shiomi, D., Shi, H., Tropini, C., Monds, R.D., Colavin, A., Billings, G., Bhaya-Grossman, I., Broxton, M., et al. (2017). Rapid, precise quantification of bacterial cellular dimensions across a genomic-scale knockout library. BMC Biol. 15, 17.
Viehman, J.A., Nguyen, M.H., and Doi, Y. (2014). Treatment options for carbapenem-resistant and extensively drug-resistant Acinetobacter baumannii infections. Drugs 74, 1315–1333.
Warren, G.H., Gray, J., and Yurchenco, J.A. (1957). Effect of polymyxin on the lysis of Neisseria catarrhalis by lysozyme. J. Bacteriol. 74, 788–793.
Werner, T., Sweetman, G., Savitski, M.F., Mathieson, T., Bantscheff, M., and Savitski, M.M. (2014). Ion coalescence of neutron encoded TMT 10-plex reporter ions. Anal. Chem. 86, 3594–3601.
Wiedemann, I., Breukink, E., van Kraaij, C., Kuipers, O.P., Bierbaum, G., de Kruijff, B., and Sahl, H.-G. (2001). Specific binding of nisin to the peptidoglycan precursor lipid II combines pore formation and inhibition of cell wall biosynthesis for potent antibiotic activity. J. Biol. Chem. 276, 1772–1779.
STAR★METHODS
Continued
REAGENT or RESOURCE SOURCE IDENTIFIER
FlowJo FlowJo, LLC https://www.flowjo.com/solutions/flowjo/downloads
Other
Synergy HT microplate reader BioTek N/A
InfiniteM200 Pro microplate reader Tecan N/A
27G x 0.5 inch needle BD Biosciences 305109
QuantaMaster 40 Spectrophotometer HORIBA Instruments N/A
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Zemer
Gitai (zgitai@princeton.edu).
Materials Availability
The unique strain (E. coli lptD4213 ΔthyA) generated in this study is available from the Lead Contact.
Animal models
Pharmacokinetics determination
Care and handling of male CD-1 mice approximately 6-8 weeks old conformed to institutional animal care and use policies as carried
out at Pharmaron, Inc. (Beijing, ROC).
Neisseria gonorrhoeae infection model
Care and handling of 5-week old ovariectomized BALB/c mice conformed to institutional animal care and use policies as carried out
at Pharmacology Discovery Services Taiwan, Ltd. (New Taipei City, TW).
METHOD DETAILS
Compound library
Compounds were sourced from commercial vendors: MicrosourceDiversity, Aldrich, Selleckchem, Chiromics, and Chembridge.
Each compound was dissolved in DMSO at 50 mM and then screened for antibiotic activity against E. coli lptD4213 growing in Terrific
Broth. After normalizing for plate-to-plate variation, an OD600 of half the median plate OD600 was used as a cutoff, below which any
compound was assumed to have inhibited the growth of E. coli lptD4213 and above which compounds were assumed to be ineffec-
tive. Compounds that either had not been previously identified as antibiotics or had unknown or ambiguous mechanisms of action
were chosen for further investigation and their MICs were measured using the microdilution method described above.
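The hit-calling rule described above (a cutoff at half the median plate OD600) can be sketched as follows. This is a minimal illustration, assuming that taking the median within each plate is what handles plate-to-plate variation; the well names and readings are hypothetical and are not data from the study.

```python
from statistics import median


def call_hits(plate_od600: dict[str, float]) -> list[str]:
    """Return the wells whose OD600 falls below half of the plate's median OD600."""
    cutoff = 0.5 * median(plate_od600.values())
    return sorted(well for well, od in plate_od600.items() if od < cutoff)


if __name__ == "__main__":
    # Hypothetical readings for one plate (well -> OD600).
    plate = {"A1": 0.82, "A2": 0.79, "A3": 0.12, "A4": 0.85, "A5": 0.45, "A6": 0.81}
    print(call_hits(plate))  # wells treated as growth-inhibited
```

Using the per-plate median as the reference makes the cutoff robust to a few strongly inhibited wells, which would pull a plate mean downward.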
Metabolomics
Overnight E. coli NCM3722 cultures were grown and diluted 1:100 in Gutnick Minimal Media and grown to early-mid exponential
phase (OD600 = 0.4-0.6). Cultures were treated with either 13.9 mg/mL SCH-79797 (1X MIC) or 0.15 mg/mL trimethoprim (1X MIC)
for 15 minutes. Folates were extracted by vacuum filtering 15 mL of treated cells using 0.45 μm HNWP Millipore nylon membranes
and immediately placing filters into an ice-cold quenching solution containing 40:40:20 methanol:acetonitrile:25 mM NH4OAc + 0.1%
sodium ascorbate in HPLC-grade water. The resulting solution was then centrifuged at 16,000 × g for 1.5 min at 4 °C and the supernatant
was saved for mass spectrometry analysis. Mass spectrometry analysis was performed as described in Chen et al. (2017).
normalized to a standard condition (60 mM NADPH and 100 mM DHF) that was measured immediately before the sample of interest.
The relative activity was calculated from (b_sample − b_noEnzyme)/(b_standard − b_noEnzyme).
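The normalization above can be written as a small helper. In this sketch the three b values are treated as already-measured rates for the sample, the standard condition, and the no-enzyme background; how b is obtained is not defined in this excerpt, so that reading is an assumption, and the function name is illustrative.

```python
def relative_activity(b_sample: float, b_standard: float, b_no_enzyme: float) -> float:
    """Background-subtracted sample rate divided by background-subtracted standard rate."""
    denominator = b_standard - b_no_enzyme
    if denominator == 0:
        raise ValueError("standard and no-enzyme rates are identical; cannot normalize")
    return (b_sample - b_no_enzyme) / denominator


if __name__ == "__main__":
    # Hypothetical rates in arbitrary units, not values from the study.
    print(relative_activity(b_sample=6.0, b_standard=10.0, b_no_enzyme=2.0))
```

Subtracting the no-enzyme rate from both numerator and denominator removes any background signal change before the ratio is taken.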
Pharmacokinetic analysis
6-8 week old male CD-1 mice were injected intravenously with a single dose of IRS-16 at 15 mg/kg in 5% DMSO + 95% (20% HP-β-CD
in water, w/v) and showed no adverse effects. Plasma measurements were averaged from 2 mice at the indicated time points
following administration. Concentrations were determined by LC/MS. The pharmacokinetic assay was performed by Pharmaron, Inc.
(Beijing, ROC).
Pharmacokinetic analysis
Plasma measurements were averaged from 2 mice at the indicated time points following administration. After the rapid initial
approach to pseudoequilibrium, the data points shown as filled symbols were used as the input for terminal half-life determination
(Figure S7D). Pharmacokinetic parameters were estimated from a non-compartmental model of IRS-16 serum levels (Figure S7D).
These details are included in the respective figure legends.
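A terminal half-life of the kind described above is commonly obtained by a log-linear fit to the late time points. The sketch below assumes first-order elimination over the terminal phase and uses synthetic concentrations generated from a 15.8 h half-life; it is not the analysis code used for Figure S7D, and a full non-compartmental analysis reports further parameters that are not shown here.

```python
import math


def terminal_half_life(times_h: list[float], concentrations: list[float]) -> float:
    """Fit ln(C) = ln(C0) - k*t by least squares and return ln(2)/k in hours."""
    if len(times_h) != len(concentrations) or len(times_h) < 2:
        raise ValueError("need at least two matched (time, concentration) points")
    logs = [math.log(c) for c in concentrations]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    elimination_rate = -sxy / sxx  # k, per hour
    return math.log(2) / elimination_rate


if __name__ == "__main__":
    # Synthetic terminal-phase points, generated from an assumed 15.8 h half-life.
    times = [4.0, 8.0, 24.0, 48.0]
    conc = [1.0 * 0.5 ** (t / 15.8) for t in times]
    print(f"estimated terminal half-life: {terminal_half_life(times, conc):.1f} h")
```

Restricting the fit to points after the initial distribution phase matters, because including the early, rapidly falling concentrations would bias the estimate toward a shorter half-life.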
Supplemental Figures
Figure S1. SCH-79797 Is Bactericidal against Staphylococcus aureus, Exhibits Undetectably Low Rates of Resistance, and Is an Effective
Antibiotic in an Infection Model of Galleria mellonella by Acinetobacter baumannii, Related to Figure 1
A. Colony forming units (CFU mL⁻¹) after 3-hour treatment of S. aureus MRSA USA300 with solvent only (1% DMSO), 6.3 mg/mL SCH-79797 (1X MIC), or
4.0 mg/mL novobiocin (5X MIC). Each data point represents 3 independent samples and 3 technical replicates. Mean ± SD are shown. B. Fold increase in
resistance of S. aureus MRSA USA300 to SCH-79797, novobiocin, trimethoprim, or nisin after 25 days of serial passaging in each drug. C. Fold increase in
resistance of A. baumannii AB17978 to SCH-79797 and gentamicin after 5 days of serial passaging in each drug. D. Fold increase in the susceptibility of S. aureus
MRSA USA300 mutants to the indicated antibiotics. Trimethoprim and nisin resistant mutants were obtained from serial passaging in respective antibiotics. E-F.
The percent survival of non-infected G. mellonella wax worms after treatment with 2 mL/larva of 100% DMSO, 67 mg/larva SCH-79797, 67 mg/larva gentamicin,
67 mg/larva rifampicin, or 67 mg/larva meropenem. Data in (E) represents a typical cohort (n = 12) from a biological triplicate and the pooled results are presented in
(F). p values are determined from a Mantel-Cox test using Prism (n.s., p ≥ 0.05; *p < 0.05). G. The percent survival of G. mellonella wax worms infected with
A. baumannii (AB 17978) and concomitantly treated with 67 mg/larva SCH-79797, 67 mg/larva gentamicin, 67 mg/larva rifampicin, or 67 mg/larva meropenem. Data
represents the pooled results from a biological triplicate. H. The relative survival of drug-treated G. mellonella wax worms infected with A. baumannii (AB 17978)
and treated concomitantly with antibiotics relative to larvae treated with antibiotics only without infection. Data represents the pooled results from a biological
triplicate.
Figure S2. Bacterial Cytological Profiling of SCH-79797 and Antibiotics with Known Mechanisms of Action, Related to Figure 2
The UMAP dimensionality reduction of all treatments was replotted using two color channels, one for SCH-79797 treatments only and one for the remaining
conditions. Density was smoothed with a Gaussian kernel of width σ = 0.1.
Figure S3. Thermal Stability of Dihydrofolate Reductase Increases after SCH-79797 and Trimethoprim Treatment, Related to Figure 3
A-B. The relative thermal stability of E. coli dihydrofolate reductase (FolA) after treatment of whole cell and cell lysate samples with (A) SCH-79797 and (B)
trimethoprim. Changes in thermal stability were determined by measuring changes in the abundance of FolA across 10 different temperatures ranging from
42–72 °C and 4 drug concentrations and a vehicle control.
Figure S4. CRISPRi Mutants Not Involved in Folate Metabolism Are Not Sensitized to SCH-79797, Related to Figure 4
A. The growth of CRISPRi B. subtilis knockdown mutants relative to a DMSO-treated control after SCH-79797 treatment. Bacterial growth was measured for 14 h
and the final optical density (OD600) of each condition was plotted against drug concentration. Each data point represents 2 independent replicates. Mean ± SD
are shown.
Figure S5. Treatment with Ampicillin, Rifampicin, and Novobiocin Does Not Disrupt Membrane Integrity while SCH-79797 Disrupts Bacillus
subtilis 168 Membrane Integrity, Related to Figure 5
A-D. Flow cytometry analysis of the membrane potential and permeability of E. coli lptD4213 cells after 15 minute incubation with (A) 0.06 mg/mL ampicillin (2X
MIC), (B) 0.002 mg/mL rifampicin (2X MIC), (C) 3.1 mg/mL SCH-79797 (1X MIC), or (D) 0.12 mg/mL novobiocin (2X MIC). The limits for the depolarized region were
defined by comparing the values in the CCCP and solvent only controls. The limits for the permeabilized region were defined by comparing the nisin and solvent
only controls. E-G. Flow cytometry analysis of the membrane potential and permeability of B. subtilis 168 cells after 15 minute incubation with (E) 1% DMSO, (F)
3.1 mg/mL SCH-79797 (1X MIC), or (G) 6.2 mg/mL SCH-79797 (2X MIC). The limits for the regions were defined from the solvent only controls.
Figure S6. SCH-79797 Mimics Co-treatment with Folate Metabolism and Alternate Membrane Integrity Disruptors, Related to Figure 6
BCP analysis of E. coli lptD4213 cells after 30 minutes of treatment with 1% DMSO, 6.3 mg/mL SCH-79797 (1X MIC), 2 mg/mL trimethoprim (10X MIC), 0.8 mg/mL
polymyxin B (2X MIC), or the combination of 2 mg/mL trimethoprim (10X MIC) and 0.8 mg/mL polymyxin B (2X MIC). Cytological profiles were clustered by the first
three principal components that account for at least 90% of the variance between samples. Cells were stained with DAPI, FM4-64, and SYTOX Green. Shown
here are the merged images of DAPI (blue) and FM4-64 (red). Scale bar is 2 μm.
Figure S7. Cumene (Isopropylbenzene) Disrupts Membrane Integrity with No Additional Sensitivity in Folate-Metabolism Mutants and
Pharmacokinetic Analysis of IRS-16 Stability, Related to Figure 7
(A) Structure of cumene. (B) Flow cytometry analysis of the membrane potential and permeability of E. coli lptD4213 cells after 15 minute incubation with
17000 μg/mL cumene (1X MIC). The limits for the depolarized region were defined by comparing the values in the CCCP and solvent only controls. The limits for
the permeabilized region were defined by comparing the nisin and solvent only controls. (C) The growth of CRISPRi B. subtilis knockdown mutants relative to a
DMSO-treated control after cumene treatment. Bacterial growth was measured for 14 h and the final optical density (OD600) of each condition was plotted against
drug concentration. (D) Plasma concentrations of IRS-16 were measured following a single 15000 μg/kg IV dose. After the rapid initial approach to
pseudoequilibrium, the data points shown as filled symbols were used as the input for terminal half-life determination. (E) Pharmacokinetic parameters
were estimated from a non-compartmental model of IRS-16 serum levels.
Article
Correspondence
ansel.hsiao@ucr.edu
In Brief
Differences in the gut microbiome
between individuals determine resistance
to cholera infection through the effects on
the activity of a bile salt enzyme.
Highlights
• Interpersonal human gut microbiome variation confers variable infection resistance
Article
Interpersonal Gut Microbiome Variation Drives
Susceptibility and Resistance to Cholera Infection
Salma Alavi,1,5 Jonathan D. Mitchell,1,5 Jennifer Y. Cho,1,2 Rui Liu,1,3 John C. Macbeth,1,4 and Ansel Hsiao1,6,*
1Department of Microbiology and Plant Pathology, University of California, Riverside, Riverside, CA, USA
2Department of Biochemistry, University of California, Riverside, Riverside, CA, USA
3Graduate Program in Genetics, Genomics, and Bioinformatics, University of California, Riverside, Riverside, CA, USA
4Division of Biomedical Sciences, School of Medicine, University of California, Riverside, Riverside, CA, USA
5These authors contributed equally
6Lead Contact
*Correspondence: ansel.hsiao@ucr.edu
https://doi.org/10.1016/j.cell.2020.05.036
SUMMARY
The gut microbiome is the resident microbial community of the gastrointestinal tract. This community is high-
ly diverse, but how microbial diversity confers resistance or susceptibility to intestinal pathogens is poorly
understood. Using transplantation of human microbiomes into several animal models of infection, we
show that key microbiome species shape the chemical environment of the gut through the activity of the
enzyme bile salt hydrolase. The activity of this enzyme reduced colonization by the major human diarrheal
pathogen Vibrio cholerae by degrading the bile salt taurocholate that activates the expression of virulence
genes. The absence of these functions and species permits increased infection loads on a personal micro-
biome-specific basis. These findings suggest new targets for individualized preventative strategies of
V. cholerae infection through modulating the structure and function of the gut microbiome.
Cell 181, 1533–1546, June 25, 2020 © 2020 Elsevier Inc.
Figure 1. Model Human Gut Microbiomes Replicate Structure of Communities Affected by Diarrhea-Induced Dysbiosis
(A) Defined human gut communities.
(B) Composition of healthy US human donor fecal microbiomes.
(C) Principal coordinates analysis of defined and complete human gut microbiomes based on weighted UniFrac distance, % variance explained shown in pa-
rentheses. Ellipses show 95% confidence intervals.
(D) Weighted UniFrac distance to indicated defined human model microbiomes of fecal samples from cholera patients at the end of diarrhea (left) and healthy
human donors (right) *p < 0.05, ****p < 0.0001, Mann-Whitney U-test. Boxplots show inter-quartile range, whiskers minimum to maximum.
gut microbial community to a highly dysbiotic, low-diversity state dominated by Streptococci, which recovers to a configuration similar to non-diarrheal individuals over the course of weeks after the cessation of acute disease (Hsiao et al., 2014). This has also been observed in other diarrheal infections, such as enterotoxigenic E. coli and rotavirus (David et al., 2015; Kieser et al., 2018), and other gut pathologies, such as severe malnutrition (Subramanian et al., 2014). Principal coordinates analysis (PCoA) of a human cohort from Bangladesh (Hsiao et al., 2014) displays the dysbiosis caused by cholera (Figure 1C, Diarrhea (start)) and community structure weeks after the cessation of diarrhea (Diarrhea (end) to Recovery (end)), when the microbiome becomes similar to that of individuals in the same area not suffering from acute malnutrition or diarrhea (Subramanian et al., 2014). Interpersonal microbiome variation in Bangladesh was far higher than among healthy US individuals sampled as part of this study; indeed, some Bangladeshi “healthy” microbiomes closely resemble cholera-diarrheal communities. As infectious diarrhea and malnutrition are frequent in cholera endemic areas, we hypothesized that the distinctive dysbiotic microbiome structure observed during recovery from multiple sources of environmental insult to the gut may be a recurring window of vulnerability to cholera.
We then used human-derived isolates to assemble defined gut communities (Figure 1A) based on these metagenomic analyses. One model microbiome (“CR”) was based on metagenomic surveys of healthy individuals, characterized by high taxonomic diversity but commonly including members of the genera Bacteroides, Clostridium, and Blautia (Arumugam et al., 2011; Qin et al., 2010; Yatsunenko et al., 2012). Another (“DS”) model microbiome is characteristic of the dysbiotic state found in cholera-endemic areas, comprising Streptococci, Enterococcus faecalis, and E. coli. 16S sequencing analysis confirmed that the CR community is more similar to healthy human microbiomes than dysbiotic diarrheal microbiomes, while the DS model community was more similar to microbiomes at the conclusion of cholera (Figures 1C and 1D).
We grew bacterial species from each defined group in pure culture and used culture optical density (OD600) to introduce equivalent amounts of each member species with V. cholerae to germfree (GF) adult C57BL/6 mice by intra-gastric gavage. Overall, microbial load during infection was equivalent as measured by qPCR of 16S gene levels (Figure 2F). Mice that received the CR microbiome at infection were resistant to V. cholerae colonization, compared to animals receiving DS microbes (Figures 2A and 2B). We observed these colonization phenotypes in both feces and in the medial and distal thirds of the small intestine. In prior studies in adult mice, small intestinal colonization by V. cholerae required antibiotics (Freter, 1955, 1956) and ketamine anesthesia (Olivier et al., 2009). Our results with human, as opposed to murine, gut bacteria suggest that microbiome differences across host species and inter-individual variation within host species both play key roles in determining pathogen susceptibility. Significantly, we could restore colonization resistance by mixing the CR and DS bacteria, suggesting that susceptibility is reversible through microbiome modification (Figures 2A and 2B). In “Mix” groups, where mice were administered a 1:1 mixture of CR and DS, V. cholerae levels were significantly less compared to that in DS mice, in fact dropping below the level of CR mice 2 days post-infection.
We also observed increased colonization susceptibility of DS microbiomes when compared to a simplified model healthy microbiome (“SR”) when GF mice were colonized with defined communities for 2 weeks prior to introduction of V. cholerae (Figure 2C). The SR community consisted of three species representing major phylogenetic lineages commonly found in the healthy human gut. To model an attempted microbiome restoration of a fully established and dense gut community, we also introduced DS microbes for 10 days, followed by a gavage of SR microbes 4 days prior to infection with V. cholerae. In this Mix group, pathogen colonization was strongly inhibited compared to DS-colonized mice, suggesting that microbiome modification could be used to restore colonization resistance even to entrenched dysbiotic communities.
We then profiled gut microbiome structure during infection in feces and small intestines of gnotobiotic animals with different communities (Figures 3A–3C and 3G). In these samples, the CR and DS communities were distinct and the CR community more similar to complete fecal microbiomes of healthy US donors, while co-inoculation of CR and DS led to an intermediate final microbiome.
Non-dysbiotic Human Microbiomes Reduce Virulence Gene Expression and Colonization of V. cholerae in a Suckling Mouse Model of Infection
While gnotobiotic adult mice serve as a useful microbial-interaction model, we extended our studies to suckling mice, where the pathology and virulence gene expression observed during V. cholerae infection is closer to that of humans (Klose, 2000). First, we recapitulated the CR and DS resistance phenotypes in suckling GF C57BL/6 animals (Figure 2D). We then constructed a more accessible and scalable model of microbiome-
Figure 3. Addition of the CR Model Human Microbiome to Mice Hosting DS Microbes Yields a Community Structure Closer to Complete
Fecal Communities of Healthy Human Volunteers
(A–F) Principal coordinates analysis (PCoA) of microbial community diversity based on weighted UniFrac distance, % variance explained shown in parentheses
for each axis. Ellipses show 95% confidence intervals. (A) PCoA of fecal samples and distal third of small intestine of GF mice with model communities during
V. cholerae infection compared to healthy US donor fecal samples, with (B) PC1 positions and (C) all pairwise weighted UniFrac distances to healthy US donor
fecal samples. (D) PCoA of model communities and healthy human donor communities in suckling mice, with (E) PC1 positions and (F) all pairwise weighted
UniFrac distances to healthy US donor fecal samples.
pathogen interaction by clearing the native murine flora of CD-1 pups with streptomycin before introduction of human-associated species. Using this system, we observed similar microbiome-dependent infection outcomes; competitive CR/DS transplantation yielded a dominant CR-like phenotype (Figure 2E), while CR and DS colonization load did not differ in non-Vibrio-infected animals (Figure 2F). This pattern was reflected in 16S sequencing data (Figures 3D–3F; Table S3). During infection, pups receiving CR microbes had very different community structure (with Vibrio reads filtered) compared to animals with DS microbes, and animals receiving a mixed inoculum (CR+DS) closely resembled CR mice.
Total microbial diversity was not a dominant factor in infection resistance, because we restored colonization resistance in DS mice to almost full CR levels by transplanting only a small subset of CR species (SR). Expression levels of the key colonization factor tcpA were reduced 9.7-fold in SR compared to DS animals (Figure 2G). We did not observe significant microbiome-dependent differences in cholera toxin gene expression, diarrhea, or fluid accumulation in these animals (Figure S1).
Recent studies have shown type VI secretion system (T6SS) killing of murine commensal E. coli acts to drive increased virulence in infection of suckling mice (Zhao et al., 2018). As our DS model community contains E. coli, albeit a different strain, we tested the effects of T6SS on V. cholerae colonization and E. coli levels in our experimental system. A T6SS ΔvasK mutant was deficient for colonization compared to wild-type V. cholerae in antibiotic-cleared suckling mice (Figure 2H). However, T6SS activity did not significantly alter levels of a co-inoculated streptomycin-resistant K12 E. coli, and we observed E. coli at comparable levels in DS and Mix (DS+CR) communities in the small intestine (Figure 3G). These differences may be E. coli strain-specific, or due to the much higher levels of V. cholerae used in previous T6SS studies.
Together, our data suggested that the mechanism for improved colonization resistance of CR/SR microbes lay in T6SS-independent manipulation of V. cholerae virulence gene expression.
Inter-individual Variation in Pathogen Colonization Resistance of Human Gut Microbiomes
Our microbiome transplantation system in suckling mice allowed us to screen numerous intact human fecal microbiomes collected from healthy adult volunteers without malnutrition or recent antibiotic usage or diarrhea for effects on V. cholerae (Figure 1B). These complete fecal communities were taxonomically similar to the CR, but not DS microbiomes in both original microbial content and community structure upon transplantation (Figures 3D–3F). This was unsurprising, because the CR model community represented up to 73% of genus-level diversity by total relative abundance in these samples, while members of the DS community only represented <1% of the total (Table S4).
We normalized fecal slurries and transplanted these samples into antibiotic-treated suckling mice with V. cholerae (Figure 4), with dramatically different effects on V. cholerae colonization, even though these fecal communities colonized suckling animals at similar efficiencies and density (Figure 2F). We observed an 1.5-log10 range of V. cholerae colonization depending on the human donor (Figure 4), suggesting wide variation in infection outcomes based on individual gut microbiome structure. The higher basal microbiome diversity in Bangladesh (Figure 1C) suggests that interpersonal variations in infection resistance in endemic areas could be substantially higher.
A Pipeline for Randomization of Microbiome Members Identifies Commensal Species Consistently Correlated with V. cholerae Infection Outcome
We hypothesized that the CR species best able to colonize intestines with the DS community might be prophylactic for infection, because transplantation of CR microbes into DS communities reduced V. cholerae colonization. Therefore, we examined gut microbiome structure during V. cholerae infection in GF animals with a mixed CR+DS community (Figures 3A and 3G). The CR community in the small intestine was quite distinct from that in feces, but all animals with this community were consistently colonized by Blautia and Bacteroides species. The DS community was consistent in feces and small intestine, and dominated by Streptococci. The lack of generalizability of fecal data to other gut compartments suggested that fecal sampling studies may mask important differences for pathogenesis in the proximal gastrointestinal tract.
In the small intestine, B. obeum maintained its relative abundance in gnotobiotic CR+DS and CR-only animals (Figure 3H), suggesting that it may play a role in CR infection resistance and in transmitting this phenotype to DS animals. However, these findings had potentially limited translational applicability given the basal inter-personal diversity in humans; the ability to displace one dysbiotic microbiome and resist V. cholerae colonization may not be representative across diverse individuals and microbial communities. To identify Vibrio-antagonistic microbes across many different microbiome contexts, we generated random, unique, 5-member combinations drawn from CR and DS strains (Figures 5A and 5B) and established OD600-normalized mixtures of these bacteria in antibiotic-cleared suckling animals with and without Vibrio infection. We reasoned that species whose presence/absence or abundance consistently correlated against V. cholerae colonization in many different communities would be excellent putative targets for anti-Vibrio interventions.
We identified several species consistently associated with pathogen levels across multiple microbiome combinations. Higher levels of B. obeum were significantly associated with reduced V. cholerae colonization (Figure 5C), but did not directly correlate with V. cholerae abundance. This is consistent with the effects of the mixed CR/DS microbiome on infection; B. obeum consistently established in the DS small intestine, but high levels
(G) Microbiome structure during infection with V. cholerae and host reads filtered (left) and in antibiotic-treated suckling mice without V. cholerae (right).
(H) B. obeum abundance in adult GF mice containing indicated microbiomes during V. cholerae infection (4 days post-infection). *p < 0.05, ****p < 0.0001; n.s. not
significant (Mann-Whitney U-test). Error bars represent mean ± SEM.
Figure 5. An Unbiased Combinatorial Strategy for Identifying Commensal Correlates of Protection Against and Susceptibility to V. cholerae
colonization
(A) Combinatorial strategy.
(B) 5-member microbiome embodiments randomly generated using CR/DS members (left). Resulting V. cholerae infection of antibiotic-treated suckling mice
containing defined microbiome embodiments are shown at right.
(C) Mean V. cholerae colonization in suckling mice bearing communities containing B. obeum or Streptococcus species alone and in combination. Data
normalized across experiments as fold colony-forming unit (CFU)-gavaged V. cholerae recovered after infection.
(D) Abundance of DS member species in randomized microbiomes correlated to resulting V. cholerae abundance after infection. Points represent mice receiving
different microbiome embodiments. *p < 0.05 (Mann-Whitney U-test); n.s., not significant. Error bars represent mean ± SEM.
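The combinatorial strategy in Figure 5 relies on drawing random, unique, 5-member communities from the pooled CR and DS strains. A minimal sketch of that sampling step is given below, assuming a simple reject-duplicates approach; the function name, the seed, and the eight strain labels are illustrative assumptions (the labels are species named in this paper, not the complete CR and DS strain lists), and this is not presented as the authors' pipeline.

```python
import random

# Illustrative subsets only; the full CR and DS strain lists are not given here.
CR_STRAINS = ["B. obeum", "B. vulgatus", "C. scindens", "B. torques"]
DS_STRAINS = ["S. salivarius", "S. infantarius", "E. faecalis", "E. coli"]


def random_communities(strains, size=5, count=10, seed=0):
    """Draw up to `count` unique communities of `size` members each."""
    rng = random.Random(seed)          # fixed seed so the draw is repeatable
    seen = set()
    communities = []
    attempts = 0
    max_attempts = count * 100         # guard against asking for too many
    while len(communities) < count and attempts < max_attempts:
        attempts += 1
        combo = frozenset(rng.sample(strains, size))
        if combo not in seen:          # keep each membership set only once
            seen.add(combo)
            communities.append(sorted(combo))
    return communities


pool = CR_STRAINS + DS_STRAINS
communities = random_communities(pool, size=5, count=10)
```

Treating each community as an unordered set means two draws with the same members in a different order count as duplicates, which matches the requirement that the combinations be unique.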
expression under anaerobic conditions through modulating the structure and activity of the upstream virulence activator TcpP (Yang et al., 2013). Similarly, we saw that TC activated PtcpA, while CA was not an efficient inducer (Figure 6B). Intestinal homogenate effects on tcpA were bile-specific; pre-treatment of homogenates with the bile-sequestering resin cholestyramine ablated their ability to activate PtcpA (Figure 6A).
We next screened the DS and CR species for their effects on TC. We incubated TC solutions at a physiologically relevant concentration with pure cultures of these microbes, heat-treated
Figure 6. B. obeum Exerts Effects on V. cholerae Colonization through Degradation of the In Vivo Virulence Gene Activating Signal Taurocholate (TC)
PtcpA activity normalized to tcpA induction by 125 mM TC unless noted.
(A) Modulation of tcpA-activating signals in suckling CD-1 mouse intestinal homogenates by pure cultures of B. obeum and S. salivarius, with heat treatment.
(B) Bile effects on tcp gene expression in vitro.
(C) Effects of CR and DS pure cultures on TC activation of virulence in vitro.
(D) Effects of B. obeum bsh enzyme expression on TC-mediated tcp activation in vitro.
and filter-sterilized the resulting supernatants, and measured their ability to induce PtcpA (Figure 6C). We observed dramatic differences in the ability of these strains to affect TC virulence induction, with members of the CR/SR microbiomes in general being better able to prevent tcp activation. The ability to ablate TC-mediated induction of tcp expression varied at genus level, with Blautia torques unable to affect tcp induction by TC in comparison to B. obeum, and Streptococcus infantarius able to process TC in contrast to other DS Streptococci. Of SR species, B. vulgatus, but not C. scindens, showed efficient TC processing.
Because the SR community largely recapitulated the CR colonization resistance phenotype, and B. vulgatus demonstrated high activity against TC in vitro, we wanted to examine the relative contribution of B. obeum and B. vulgatus on TC levels in the distal small intestine. We colonized adult GF mice with CR members, or CR species without B. obeum, and measured TC and CA in the distal third of the small intestine 2 days post-colonization, compared to GF mice (Figure S4A). As expected given the absence of microbial bsh, GF distal small intestine showed a high TC/CA ratio, while the presence of CR microbes efficiently processed TC to CA, yielding low TC/CA ratios. Strikingly, the removal of B. obeum restored the TC/CA ratio in the distal small intestine to a level not statistically significantly different from GF animals, although there was a trend toward more TC in GF animals. This suggested that although other CR organisms can contribute to TC processing to CA, B. obeum is particularly well suited to manipulating the level of this bile acid in the distal small intestine. This agrees with our findings that B. obeum efficiently colonizes the small intestine (Figures 3 and 5), and the presence of B. vulgatus and B. obeum together does not significantly improve the ability of a microbiome to exclude V. cholerae (Figure S2).
A Bile Salt Hydrolase Enzyme Encoded by B. obeum Is Able to Degrade Virulence-Activating Signals in the Gut
To determine the molecular basis for B. obeum’s effect on TC-dependent virulence induction, we examined genetic determinants of bile acid processing. The B. obeum genome encodes for a hypothetical choloylglycine hydrolase (EC 3.5.1.24). Such bile salt hydrolase (bsh) enzymes catalyze the removal of the conjugated amino acids of bile salts, for example the removal of taurine from TC to form CA.
Putative bsh genes are broadly distributed across gut microbial species, because the ability to survive the inhibitory effects of bile is extremely important for enteric commensals (De Smet et al., 1995). A recent study classed bacterial bsh genes into several phylotypes based on sequence similarity and showed that bsh phylotypes have variable and substrate-dependent deconjugation activity (Song et al., 2019). We binned predicted bsh genes in the CR and DS genomes into these phylotypes (Table S5). By sequence alignment and predicted structure, B. obeum encodes for predicted type 1 bsh enzymes, which are highly effective at deconjugating TC, GC, glycodeoxycholate, and taurodeoxycholate (Song et al., 2019), the strongest activators of tcpA expression in V. cholerae (Figure S5). All DS members except S. infantarius lacked annotated bsh genes, while many CR species encoded bsh homologs. Although S. infantarius showed high activity against TC in vitro, despite bearing bsh homologs to phylotypes with poor predicted TC activity, we observed no statistically significant difference in effects on V. cholerae colonization compared to S. salivarius (Figure S4B). This suggests that there may be differences in in vivo regulation of these enzymes in S. infantarius. Conversely, B. torques, despite encoding several putative bsh genes, was not able to prevent tcpA induction by TC. B. vulgatus was an efficient TC processor in vitro, but despite encoding 3 putative bsh genes, could not drive significantly lower levels of TC to CA processing in the distal small intestine in the absence of B. obeum (Figure S4A), further suggesting that enzyme expression or function may diverge in vitro versus in vivo.
V. cholerae also encodes a putative bsh enzyme (VCA0877). However, this is a predicted phylotype 6 bsh, which has poor activity against TC but higher activity against rarer bile acids that do not participate in regulation of virulence but may be bacteriostatic in vivo (Table S5). V. cholerae cannot prevent TC-mediated tcp activation in vitro, suggesting that V. cholerae has limited bsh activity against TC in comparison to B. obeum (Figure S4C).
We constitutively expressed the B. obeum bsh RUMOBE_000028 by cloning this locus downstream of a constitutive PLtet-O1 promoter in E. coli, generating strain bshC. This bshC strain efficiently reduced levels of TC and tcp activation compared to the isogenic vector control in both pure TC solutions (Figure 6D) and intestinal homogenates (Figure 6E). Significantly, given the dominant effect of B. obeum on Vibrio resistance, pure cultures of either B. obeum or bshC reduce tcp activation by TC (Figure 6D) and TC levels in intestinal homogenates (Figure 6E) in the presence of S. salivarius. The activity of this B. obeum enzyme is able to affect V. cholerae in vivo, because V. cholerae is unable to colonize mice gavaged with bshC as effectively as mice with the vector control (Figure 6F).
Bile Salt Hydrolase Levels in Human Gut Microbial Communities Are Correlated to V. cholerae Infection Outcome
To determine the distribution of bsh phylotypes in human gut microbiomes predicted to be dysbiotic or healthy, we re-examined an existing deeply sequenced shotgun metagenomic dataset of human cholera patients in Bangladesh (Table S2D) (David et al., 2015). Importantly, data was available from patients at presentation at clinic for cholera (“Diarrhea (d0)”) without any prior antibiotic usage and from patients that received oral antibiotics as
(E) Mass spectrometry measurement of TC in suckling CD-1 mouse intestines after incubation with pure cultures of indicated strains.
(F) V. cholerae infection of suckling CD-1 mice after 1-day of colonization with indicated E. coli strains. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001 (unpaired
Student’s t test). Error bars represent mean ± SEM. n = 3–10 for all experiments.
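The Figure 6 legend states that PtcpA activity is normalized to tcpA induction by the TC reference condition. A minimal sketch of that kind of normalization follows, assuming each condition has a single raw reporter reading and that the reference is a "TC only" sample; the function name, condition labels, and numeric readings are hypothetical and are not data from the study.

```python
def normalize_to_reference(readings, reference_key="TC only"):
    """Express each raw reporter reading as a fraction of the reference."""
    reference = readings[reference_key]
    if reference == 0:
        raise ValueError("reference reading must be non-zero")
    return {name: value / reference for name, value in readings.items()}


# Hypothetical raw reporter readings (arbitrary units).
raw = {
    "TC only": 2000.0,
    "TC + B. obeum": 200.0,
    "TC + S. salivarius": 1900.0,
}
normalized = normalize_to_reference(raw)
```

After normalization the reference condition equals 1.0 by construction, so values well below 1.0 indicate that a treatment reduced TC-dependent induction relative to TC alone.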
part of the standard of care for cholera (“Diarrhea + abx”). These data showed that bsh levels were already affected by diarrhea prior to any clinical intervention. We found that several bsh phylotypes followed diarrhea-dependent patterns in comparison to a healthy Bangladeshi cohort. Phylotypes 1, 3, 4, and 5 were significantly depleted in dysbiotic samples compared to healthy controls (Figures 7A and S6). Of these, phylotypes 1, 3, and 4 were shown to be highly active against TC (Song et al., 2019) (summarized in Table S5). These data suggested that a characteristic of healthy human microbiomes that may modulate V. cholerae susceptibility is their ability to deconjugate TC into non virulence-inducing forms.
We then assayed whether complete healthy US fecal communities were differentially able to convert TC to non-tcp-activating forms. We took the first six healthy US donors and anaerobically cultured bacteria from their fecal samples in vitro and were able to recover species representing 66%–99% relative abundance of the original sample (Table S4C). We inoculated these complex fecal specimens in media and used standardized amounts of the resulting mixed cultures to treat TC solutions, which we then used for virulence reporter assays. Strikingly, we observed that microbiomes (subjects B and D) that allowed higher V. cholerae levels when transplanted in suckling animals were also unable to completely remove TC from solution after 24 h, whereas communities exhibiting stronger colonization resistance (A, C, and E) reduced TC to undetectable levels (Figure 7B).
Because species in genus Blautia demonstrated differences in bsh activity in vitro, as well as association with cholera patients and uninfected family members (Midani et al., 2018), we assayed for the level of the bsh gene of B. obeum specifically in total DNA extracted from human fecal samples by real-time PCR. This also served as a function-specific measure of B. obeum abundance in these complex fecal mixtures. We found that communities associated with higher V. cholerae colonization had lower levels of B. obeum bsh (Figure 7C), suggesting that B. obeum abundance and specifically the presence of the bsh activity is associated with resistance to V. cholerae infection.
DISCUSSION
A role for gastrointestinal microbes in resistance to enteropathogens such as V. cholerae has been recognized for some time (Freter, 1955, 1956). However, this colonization resistance has often been examined in dichotomous terms, either germfree or conventional, or “normal” and damaged by specific factors such as antibiotics. Our results suggest that beyond extreme fluctuations in structure, such as those due to diarrhea and severe malnutrition, diversity even within otherwise “healthy”
human populations can serve as predictive markers for infection resistance.
A key difficulty in identifying taxa that can drive infection susceptibility is the complexity of animal gut microbiomes, compounded by dramatic differences in the taxonomic diversity across host species (Seedorf et al., 2014). The limited taxonomic and functional resolution of many commonly employed metagenomic techniques is also a barrier for converting observations in large population studies to mechanistic insights on microbial effects on pathogenesis and other phenotypes. Findings in this study and others (Hsiao et al., 2014), in which single genes encoded by specific microbiome members are able to affect V. cholerae colonization in isolation from other functions, suggest that correlative studies have distinct limits in their abilities to provide mechanistic insights to microbiome-pathogen interactions. For instance, a recent sequencing-based study that sampled gut microbes of cholera patients and healthy household contacts identified microbes from the same genus as associated with both individuals with cholera and individuals with putative exposure but non-progression to disease (Midani et al., 2018). Thus, genus-, and possibly even species-level data may be insufficient to identify clear targets for future mechanistic studies in the absence of experimental manipulations.
Recent developments in high-throughput sequencing, anaerobic microbiology, and gnotobiotic animal systems have allowed for mechanistic studies of the interaction of human commensals and human pathogens in animal models of colonization and disease. Specific target taxa identified by multi-omics approaches can be established in animals, with species and gene content defined before introduction of pathogens. These types of approaches allowed us to identify B. obeum as a key member of the human gut microbiome that drove infection resistance and microbial interactions with bile acids as a driver of virulence gene regulation in V. cholerae and a mechanism of protection against infection.
We hypothesize that microbial bile metabolism most affects V. cholerae pathogenesis during early infection. V. cholerae tightly regulates gene expression in response to host environmental signals such as bile. Bile acids are thought to stabilize the structure of the key virulence regulator TcpP (Yang et al., 2013), which drives the activation of toxT transcription. ToxT then causes full induction of tcp and cholera toxin, with TCP-dependent colonization thought to begin prior to toxin expression (Lee et al., 1999). Following colonization, the activity of bile becomes less clear. Some studies have demonstrated that the unsaturated fatty acids in bile are able to modulate the binding of ToxT to target promoters, reducing CT and TCP expression (Plecha and Withey, 2015). The deconjugated bile salt sodium deoxycholate promotes interaction between the virulence activators ToxR and ToxS and subsequent activity (Midgett et al., 2017). However, ToxRS likely does not directly activate toxT, but rather acts to boost the activity of TcpP at the toxT promoter (Krukonis et al., 2000). Both conjugated and deconjugated bile acids are also able to induce ToxT-independent activation of cholera toxin dependent on ToxRS (Hung and Mekalanos, 2005).
Our results suggest a model where, at the initial point of colonization in the small intestine, tcp gene expression and thus TCP biogenesis is determined by the balance of bile acids that is modulated by commensal microbes with differential bsh activity. Differences in early tcp gene expression, and thus colonization, may have disproportionate impact on the progression of V. cholerae infections; variation in microbiome bsh activity may help determine whether infection proceeds to fulminant diarrhea or low or temporary colonization that leads to mild or asymptomatic infections that are common in cholera endemic areas (King et al., 2008). Once severe diarrhea has begun, the commensal community and existing lumenal bile are depleted, and any future bile secretion is predominantly conjugated primary species that stimulate virulence.
Although B. obeum bile modification is an important regulator of V. cholerae colonization, there may be additive effects on pathogen behavior of multiple community members and through other mechanisms. Removal of B. obeum was sufficient to raise TC levels in the mouse distal small intestine, but although constitutive expression of B. obeum bsh yielded a 1 log drop in V. cholerae colonization, the full CR microbiome yielded almost 2 log-fold differences in colonization compared to DS microbiomes. In V. cholerae, virulence gene expression is negatively regulated by several different quorum sensing systems involving the sensing of specific autoinducers (Jung et al., 2015). Prior studies identified B. obeum-produced autoinducer AI-2 as a suppressor of tcpA, functioning through a pathway bypassing the canonical AI-2 receptor LuxP and involving the regulator VqmA (Hsiao et al., 2014). VqmA has been shown to activate the master quorum-sensing regulator HapR, which leads to repression of tcpP (Liu et al., 2006; Zhu et al., 2002). Thus, AI-2 expression and bsh activity by B. obeum may synergize to reduce the level and activity of TcpP during infection. Additional studies will be required to determine how these two inputs exert their effects on regulation of virulence determinants of V. cholerae in vivo. Non-B. obeum CR members may also impact V. cholerae colonization, both at the level of virulence regulation and possibly metabolic competition for colonization niches. Host diet may play a role in colonization; bile is secreted from the gallbladder in response to food ingestion, and this varies by the fat content of the meal (Marciani et al., 2013). Diet is also a potent driver of microbiome structure (Faith et al., 2011), but the effect of different dietary compositions on driving the microbiome to infection resistant or susceptible states has not been well studied. Ingestion of food has been shown to dramatically reduce the infectious dose of V. cholerae due to buffering of stomach pH, but also possibly by raising the levels of virulence-activating conjugated bile acids secreted into the small intestine.
Taken together, our results suggest that variation in human gut microbiomes is a significant contributor to V. cholerae infection risk, and this can be modulated through introduction of a human gut commensal with multiple molecular effects on V. cholerae, able to affect levels of both virulence-activating and virulence-suppressing signals at the site of infection. This suggests that targeted microbiome
The authors declare no competing interests.
Received: December 5, 2019
Revised: March 16, 2020
Accepted: May 18, 2020
Published: June 16, 2020
Jones, B.V., Begley, M., Hill, C., Gahan, C.G., and Marchesi, J.R. (2008). Functional and comparative metagenomic analysis of bile salt hydrolase activity in the human gut microbiome. Proc. Natl. Acad. Sci. USA 105, 13580–13585.
Jung, S.A., Chapman, C.A., and Ng, W.L. (2015). Quadruple quorum-sensing inputs control Vibrio cholerae virulence and maintain system robustness. PLoS Pathog. 11, e1004837.
STAR★METHODS
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Human volunteer donor fecal sample | This paper | Subject J
Human volunteer donor fecal sample | This paper | Subject K
Human volunteer donor fecal sample | This paper | Subject L
Human volunteer donor fecal sample | This paper | Subject M
Human volunteer donor fecal sample | This paper | Subject N
Human volunteer donor fecal sample | This paper | Subject O
Human volunteer donor fecal sample | This paper | Subject P
Chemicals, Peptides, and Recombinant Proteins
Zeocin | Research Products International Corporation | Cat# 1006-33-0
Sodium taurocholate hydrate | Sigma Aldrich | Cat# 86339
Sodium glycocholate hydrate | Sigma Aldrich | Cat# G7132
Cholic acid | Alfa Aesar | Cat# A1125714
Sodium taurodeoxycholate hydrate | Sigma Aldrich | Cat# T0557
Sodium glycodeoxycholate | Sigma Aldrich | Cat# G9910
Deoxycholic acid | MP Biomedicals | Cat# 0210149610
Tauro-β-muricholic acid | Steraloids Inc. | Cat# C1899-000
β-muricholic acid | Steraloids Inc. | Cat# C1895-000
Cholestyramine | Sigma Aldrich | Cat# C4650
Critical Commercial Assays
iQ SYBR Green Supermix | Biorad | Cat# 170882
Platinum Hot Start PCR Master Mix | Thermo Scientific | Cat# 13000013
SuperScript IV First-Strand Synthesis System | Invitrogen | Cat# 18091200
Gibson Master Mix | New England Biolabs | Cat# E2611S
Deposited Data
Short-read sequencing data | This paper | European Nucleotide Archive (ENA) PRJEB31497
Short-read sequencing data for meta-analysis | European Nucleotide Archive | See Table S2 for accession numbers
Experimental Models: Organisms/Strains
Mouse: C57BL/6 | UCR gnotobiotic facility | N/A
Mouse: CD-1 IGS | Charles River Laboratories | N/A
Oligonucleotides
All primers for study, see Table S6. | This paper | N/A
Recombinant DNA
B. obeum codon optimized LuxS placed downstream of the PLtet-O-1 constitutive promoter sequence derived from the plasmid vector pZE21 vector (pMK_B. obeum_luxS) | This paper | N/A
Plasmid: Constitutive expression construct for B. obeum bsh (bshc) | This paper | N/A
Software and Algorithms
QIIME | Caporaso et al., 2010 | http://qiime.org/
Phyre2 | Kelley et al., 2015 | http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index
Chimera | Pettersen et al., 2004 | https://www.cgl.ucsf.edu/chimera/
HISAT2 | Kim et al., 2015 | http://ccb.jhu.edu/software/hisat2/manual.shtml
Other
Lab diet | Newco Distributors | Cat# 5K52
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources should be directed to and will be fulfilled by the Lead Contact, Ansel Hsiao (ansel.hsiao@ucr.edu).
Materials Availability
Unique plasmids, strains, and reagents generated in this study are available from the Lead Contact with a completed Materials Trans-
fer Agreement.
Human studies
All human samples were part of a study approved by the UCR Institutional Review Board and followed NIH guidelines. We collected
intact fecal samples from a cohort of healthy adult volunteers at the University of California, Riverside. Inclusion criteria were: 1) age
between 18 and 40 years, 2) must be able to provide signed and dated informed consent, 3) must be willing and able to provide stool
specimen. Exclusion criteria were: 1) systemic antibiotic usage (oral, intramuscular, or intravenous) in the 2 weeks prior to sampling;
2) acute disease at time of enrollment (presence of moderate or severe illness with or without fever); 3) diarrhea (liquid or very loose
stools not associated with a change in diet) in the 2 weeks prior to sampling; 4) active uncontrolled GI disorders or diseases including
inflammatory bowel disease (IBD), ulcerative colitis, Crohn’s disease, or indeterminate colitis, persistent, infectious gastroenteritis,
colitis, or gastritis, and chronic constipation; 5) major surgery of the GI tract, excluding cholecystectomy and appendectomy, but
including major bowel resection at any time. Age inclusion criteria were chosen to avoid age-related microbiome differences, which
are strongest in early life (Yatsunenko et al., 2012). Fecal samples were collected aseptically from each person at UCR and
immediately preserved at −80°C until processing for DNA extraction, culturing, and animal colonization. Stocks of fecal slurries
for subsequent experiments were prepared by resuspending samples at 1:3 weight/volume in sterile reduced PBS and adding sterile
glycerol to a final concentration of 25% volume/volume.
Animal studies
All animal experiments used protocols approved by the Institutional Animal Care and Use Committee of the University of California,
Riverside (UCR) and followed NIH guidelines. All CD-1 suckling animals were purchased from Charles River Laboratories. Suckling
and adult germfree C57BL/6 mice were reared at the UCR gnotobiotic facility. No distinction was made between male and female
animals for bacterial studies. Adult animals were used at > 3 weeks of age. Germfree suckling mice were used at 5–6 days of age. Animals
were checked for signs of moribund condition prior to use in experiments, and used for one experimental procedure only. Adult an-
imals were co-housed in cages without mixing sex. Male mice were separated except in cases of littermates.
For the antibiotic-cleared suckling mouse model, 4-day-old suckling CD-1 animals were fasted for 1.5 hours, then orally dosed with
1 mg/g body weight streptomycin using 30-gauge plastic tubing, after which the animals were placed with a lactating dam for 1 day.
After 24 hours, mice received microbial communities with V. cholerae in a maximum gavage volume of 50 µL. At 18 hours post-infection,
animals were sacrificed, and relevant sections of intestinal tissue were dissected and homogenized for CFU enumeration and nucleic
acid extraction.
Germ-free C57BL/6 mice were bred and maintained in plastic gnotobiotic isolators at University of California, Riverside. Mice were
fed an autoclaved, low-fat plant polysaccharide-rich mouse chow (Lab Diet 5K52) and were 5–8 weeks old at time of gavage. Bacterial
cultures were prepared as described above. Mice were fasted for 30 minutes prior to introduction of bacteria, and stomach pH
was buffered by intra-gastric gavage of 100 µL of 1 M NaHCO3, followed by gavage with 150 µL of balanced defined microbial libraries.
Fecal samples were collected across the course of the experiment. Mice were sacrificed 4 days post gavage, and the small intestine was
collected and cut into three equal sections (proximal, medial, distal) by length. Samples were homogenized and used for CFU enumeration
of bacteria on LB agar containing 200 µg/mL streptomycin.
For establishing defined microbiomes prior to V. cholerae infection, germ-free C57BL/6 mice were maintained as mentioned above
and used at 6–13 weeks of age. Mice were fasted and given NaHCO3 as previously described and then given 150 µL of either the simple
resistant (SR) or the dysbiotic (DS) community. For the Mix group, mice were initially given the DS microbiome. Ten days
after microbiome introduction and 4 days prior to V. cholerae infection, the SR community was introduced into the Mix group animals
by gavage. Two weeks after human commensal colonization, each group was infected with 5 × 10^9 CFU V. cholerae O1 El Tor C6706.
Fecal samples were suspended in 500 µL of PBS and homogenized using a bead beater (BioSpec) at 1,400 RPM for 30 s. CFU
enumeration of V. cholerae was done on LB agar containing 200 µg/mL streptomycin.
We used the antibiotic-treated infant mouse model described above to determine which members of the healthy human micro-
biome contribute most to resistance to V. cholerae. We made 18 random combinations of human gut microbiome strains and intro-
duced them to suckling mice along with V. cholerae. Each combination included five unique strains, and each gavage contained the
equivalent total microbial mass of 300 µL of OD600 = 0.4 culture, divided evenly across all constituent strains. After introducing human
microbiome and V. cholerae to suckling mice, V. cholerae levels in homogenized intestines were determined by plating on selective
agar. The absolute abundance of each species was determined with a combination of 16S rRNA gene qPCR and 16S rRNA
sequencing.
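The community design described above can be sketched in a few lines of Python. This is only an illustration: the strain names, the fixed random seed, the requirement that the 18 combinations be distinct, and the assumption that biomass scales linearly with OD600 are not taken from the study, and volumes are taken to be in microliters.

```python
import math
import random


def random_communities(strains, n_communities=18, size=5, seed=0):
    """Draw distinct random combinations of `size` unique strains.

    Distinctness of the combinations and the seed are assumptions made
    for this sketch; the source only states that combinations were random.
    """
    if n_communities > math.comb(len(strains), size):
        raise ValueError("not enough strains for that many distinct combinations")
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < n_communities:
        # rng.sample never repeats a strain within one combination
        chosen.add(tuple(sorted(rng.sample(strains, size))))
    return sorted(chosen)


def per_strain_volume_ul(culture_od600, n_strains=5,
                         total_volume_ul=300.0, reference_od600=0.4):
    """Volume of one culture giving an equal share of the gavage biomass.

    Assumes biomass is proportional to OD600 * volume (hypothetical
    simplification), so each strain contributes
    total_volume_ul * reference_od600 / n_strains OD-microliter units.
    """
    od_units_per_strain = total_volume_ul * reference_od600 / n_strains
    return od_units_per_strain / culture_od600


if __name__ == "__main__":
    # Placeholder strain pool; the real strain list is not given here.
    pool = [f"strain_{i:02d}" for i in range(1, 11)]
    for combo in random_communities(pool)[:3]:
        print(combo)
    print(per_strain_volume_ul(0.8))
```

Under these assumptions, a culture already at OD600 = 0.4 would contribute one fifth of the total volume, and a denser culture proportionally less.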
METHOD DETAILS
PCR purification columns (QIAGEN) and subjected to sequencing using the Illumina MiSeq platform. Paired-end 150nt reads were
assembled, de-multiplexed, rarefied to > 900 reads per sample, and analyzed using the QIIME 1.9.1 software package (Caporaso
et al., 2010). Sequencing run results are summarized in Tables S2A and S3.
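Rarefaction, mentioned above, equalizes sequencing depth by subsampling each sample's reads without replacement. The sketch below shows the idea in plain Python; it is not the QIIME implementation, and the depth of 900, the taxon labels, and the read counts are hypothetical values chosen for illustration.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample a {taxon: read_count} table to exactly `depth` reads.

    Sampling is without replacement. Samples with fewer than `depth`
    reads return None, meaning they would be excluded from analysis.
    """
    if sum(counts.values()) < depth:
        return None
    # Expand the table into one entry per read, then draw `depth` of them.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    picked = random.Random(seed).sample(pool, depth)
    rarefied = {taxon: 0 for taxon in counts}
    for taxon in picked:
        rarefied[taxon] += 1
    return rarefied


if __name__ == "__main__":
    # Hypothetical read counts for one sample.
    sample = {"taxon_A": 1200, "taxon_B": 300, "taxon_C": 40}
    print(rarefy(sample, depth=900))
```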
with the SuperScript IV First-Strand Synthesis System (Invitrogen) following manufacturers’ instructions. Real-time PCR was performed
using conditions 95°C for 3 min, followed by 39 cycles (95°C for 10 s, 55°C for 30 s, 95°C for 10 s, 65°C for 5 s, 95°C for
5 s).
then returned to lactating dams. Overnight cultures of bshC and vector strains were normalized to the equivalent of 300 µL of OD600 = 0.4
culture, and cells were pelleted and resuspended in fresh LYH-BHI. 50 µL of this was then introduced via intra-gastric gavage into antibiotic-treated
suckling mice that had been fasted for 1.5 hours. Pups were then returned to a lactating dam. After 1 day of pre-infection
colonization with E. coli, animals were gavaged with V. cholerae as described above.
Statistical tests were performed in the GraphPad Prism software package. Results are representative of two independent experi-
ments. Statistical parameters for studies are reported in relevant figure legends and tables.
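The statistical tests were run in GraphPad Prism, and the supplemental figure legends report Mann-Whitney U-tests. For readers who want to see what that statistic measures, here is a minimal Python sketch of the U statistic with average ranks for ties. It does not reproduce Prism's output and computes no p value; the function name and the example data are hypothetical.

```python
def mann_whitney_u(x, y):
    """Return the Mann-Whitney U statistic (the smaller of U1 and U2)."""
    # Pool both samples, remembering which group each value came from.
    combined = sorted((value, group)
                      for group, sample in enumerate((x, y))
                      for value in sample)
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    n1, n2 = len(x), len(y)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return min(u1, u2)


if __name__ == "__main__":
    # Hypothetical CFU-like values for two groups.
    print(mann_whitney_u([1, 3, 5], [2, 4, 6]))
```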
Supplemental Figures
Figure S1. V. cholerae Pathology and Gut Distribution in Different Microbiome Contexts, Related to Figure 2
All experiments are in antibiotic-cleared suckling CD-1 mice. (A) Expression of ctxA in intestinal tissues of infected mice containing defined model human microbiomes.
(B) Fluid accumulation in intestines of infected mice containing defined model human microbiomes. (C) Distribution of V. cholerae in infected mice
containing defined model human microbiomes. n.s. not significant (Mann-Whitney U-test). Error bars represent mean ± SEM.
Figure S2. Mean V. cholerae Colonization in Antibiotic-Cleared Suckling CD-1 Mice Bearing Communities Containing B. obeum in Combinations of SR Species, Related to Figure 2
Normalized colonization across experiments reported as fold CFU V. cholerae gavaged recovered after infection. n.s. not significant (Mann-Whitney U-test).
Figure S3. Induction of BB170 AI-2 Reporter by Indicated Cell-Free Supernatants, Related to Figure 6
**p < 0.01, (Mann-Whitney U-test). Error bars represent mean ± SEM.
Figure S6. Levels of Different Phylotypes of Microbial bsh Enzymes in Metagenomic Sequencing of Fecal Microbiomes of Cholera Patients
Pre- (d0) and Post- (+abx) Antibiotic Treatment Compared to Healthy Individuals in Bangladesh, Related to Figure 7
*p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001 (Mann-Whitney U-test), n.s. not-significant. Error bars represent mean ± SEM.
Article
Correspondence
liboxing@mail.sysu.edu.cn (B.L.),
richard.tsien@nyulangone.org (R.W.T.)
In Brief
Silencing neuronal activity triggers similar
molecular mechanisms as activating
neurons during long-term potentiation,
demonstrating Hebbian mechanisms of
homeostatic spike regulation.
Highlights
• Chronic spike blockade with tetrodotoxin causes homeostatic spike broadening
Article
Neuronal Inactivity Co-opts LTP Machinery
to Drive Potassium Channel Splicing
and Homeostatic Spike Widening
Boxing Li,1,2,* Benjamin S. Suutari,2,3,7 Simon D. Sun,2,3,7 Zhengyi Luo,4,7 Chuanchuan Wei,1,7 Nicolas Chenouard,2
Nataniel J. Mandelberg,2 Guoan Zhang,5 Brie Wamsley,2,6 Guoling Tian,2 Sandrine Sanchez,2 Sikun You,1
Lianyan Huang,1 Thomas A. Neubert,5 Gordon Fishell,2,6 and Richard W. Tsien2,3,8,*
1Neuroscience Program, Guangdong Provincial Key Laboratory of Brain Function and Disease, Zhongshan School of Medicine and The Fifth
Affiliated Hospital, Sun Yat-sen University, Guangzhou 510810, China
2Department of Neuroscience and Physiology, Neuroscience Institute, NYU Grossman Medical Center, New York, NY 10016, USA
3Center for Neural Science, New York University, New York, NY 10003, USA
4Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, RNA Biomedical Institute, Sun Yat-sen
Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510120, China
5Department of Biochemistry and Molecular Pharmacology and Skirball Institute, NYU Grossman Medical Center, New York, NY 10016, USA
6Stanley Center for Psychiatric Research, The Broad Institute, Cambridge, MA 02142, USA
7These authors contributed equally
8Lead Contact
SUMMARY
Homeostasis of neural firing properties is important in stabilizing neuronal circuitry, but how such plasticity
might depend on alternative splicing is not known. Here we report that chronic inactivity homeostatically in-
creases action potential duration by changing alternative splicing of BK channels; this requires nuclear
export of the splicing factor Nova-2. Inactivity and Nova-2 relocation were connected by a novel synapto-nu-
clear signaling pathway that surprisingly invoked mechanisms akin to Hebbian plasticity: Ca2+-permeable
AMPA receptor upregulation, L-type Ca2+ channel activation, enhanced spine Ca2+ transients, nuclear trans-
location of a CaM shuttle, and nuclear CaMKIV activation. These findings not only uncover commonalities be-
tween homeostatic and Hebbian plasticity but also connect homeostatic regulation of synaptic transmission
and neuronal excitability. The signaling cascade provides a full-loop mechanism for a classic autoregulatory
feedback loop proposed 25 years ago. Each element of the loop has been implicated previously in neuro-
psychiatric disease.
Cell 181, 1547–1565, June 25, 2020 © 2020 Elsevier Inc.
Answering these questions would delineate a classic homeostatic feedback loop
ΔFiring → ΔCa2+ signaling
↑ ↓
ΔCellular proteins ← ΔGene expression
proposed more than 25 years ago at Brandeis (LeMasson et al., 1993; Marder et al., 1996; Siegel et al., 1994). This scheme embodies the feedback principles of neuronal homeostasis but has never been worked out in a full loop for any aspect of neuronal function. We focused on action potential duration (APD), fundamental in tuning neurons, synapses, and circuits. Even small changes in APD greatly affect Ca2+ influx (Borst and Sakmann, 1998; Geiger and Jonas, 2000; Llinás et al., 1982) and, thus, neurotransmitter release, gene expression, and overall neuronal function (Byrne and Kandel, 1996; Deng et al., 2013; Jackson et al., 1991; Matthews et al., 2009; Sabatini and Regehr, 1997). APD is generally prolonged by inactivity (Kim and Tsien, 2008; Trasande and Ramirez, 2007), an appropriate response to homeostatically restore Ca2+ entry, but remarkably little is known about underlying mechanisms.
Here we report that inactivity-induced homeostatic regulation of APD in excitatory neurons is controlled by AS of the large-conductance Ca2+-activated potassium channel (BK channel). We deciphered the underlying signaling mechanism: a specific change in BK AS driven by a novel cascade involving Ca2+-permeable glutamate receptors, voltage-gated Ca2+ channels, calcium/calmodulin-dependent (CaM) kinase kinase β (βCaMKK), CaM kinase IV (CaMKIV), and the splicing factor Nova-2. We find that complete silencing of an excitatory neuron triggers signaling similar to that activated during direct depolarization. Strikingly, each signaling player is encoded by a gene implicated previously in neuropsychiatric disease.
RESULTS
Homeostatic AP Broadening Driven by BK Current Attenuation
Although homeostasis of action potential (AP) frequency is well studied (Desai et al., 1999; Lee and Chung, 2014; Maffei and Turrigiano, 2008), how APD is regulated remains mysterious. This distinction is critical because spike width regulates neuronal excitability, Ca2+ influx, and neurotransmission and is governed by its own set of ion channels operating in parallel with channels controlling spike frequency (Kimm et al., 2015).
To examine homeostasis of APD, we blocked spiking of cultured cortical neurons by chronic (24- or 48-h) sodium channel blockade with tetrodotoxin (TTX) and examined APD after TTX removal. Chronically silenced neurons had a significantly longer APD than mock-treated controls (Figures 1A and S1), and AP-induced Ca2+ transients were elevated 1.5- to 2-fold in amplitude and duration (Figure 1B), as expected from APD-dependent prolongation of Ca2+ channel activation (Bischofberger et al., 2002). Thus, chronic inactivity drove compensatory increases in APD and Ca2+ entry as negative feedback autoregulation.
The slowing of repolarization that drove APD prolongation was associated with blunted afterhyperpolarization (AHP) (Figure 1A). Both changes would ensue if potassium channel activity was reduced. We considered the BK channel because it regulates APD and AHP (Hu et al., 2001; Lee and Cui, 2010; Sausbier et al., 2004; Shao et al., 1999). Indeed, inhibiting BK channels with a specific blocker, iberiotoxin (IbTx), lengthened AP half-width (Figures 1C and 1D). To test whether BK channels are involved in inactivity-induced APD widening, we applied IbTx to TTX-silenced cortical cultures (48 h) just before AP recording. Strikingly, IbTx did not further widen APs (Figures 1C and 1D); the effect of inactivity largely occluded IbTx’s effect on APD. Evidently, neuronal silencing and IbTx converge on a common pathway, inhibiting BK activity, in widening the AP.
Chronic Inactivity Alters BK Channel AS
Chronic inactivity might inhibit BK channels via gene expression and membrane trafficking. However, decreases in transcription, mRNA translation, or protein trafficking were ruled out by measurements of BK mRNA, total protein, and surface membrane expression (Figures S2A–S2D). BK channels also undergo AS, with profound effects on activity (Fodor and Aldrich, 2009; Shelley et al., 2013; Shipston and Tian, 2016; Xie and McCobb, 1998; Zarei et al., 2001). We assessed the expression of exons predicted to undergo splicing (X1–X6; Figure 1E) via RT-PCR (Pietrzykowski et al., 2008). In cultured cortical neurons, BK underwent AS in X3, X4, and X6 but not in X1, X2, and X5 (Figures 1F and S3A). Notably, only X6 (called E29 hereafter) splicing decreased after chronic TTX treatment (Figure 1F); AS of other exons remained unchanged (Figure S3B), narrowing our search for the basis of BK modulation.
We also tested the effect of enhanced activity by depolarization with K+-rich culture media (20, 40, and 60 mM K+; Figure S3C). Surprisingly, chronic depolarization decreased E29 exon inclusion (Figures 1G and S3D); the magnitude of shift in AS varied with strength and duration of depolarization (Figure S3D), but always in the same direction as chronic inactivity. We return later to resolve the paradox of how chronic silencing and depolarization produce similar changes in BK AS.
Depolarization- or seizure-induced changes in AS involve activation of CaM-dependent kinases (Xie and Black, 2001). In contrast, little is known about AS induced by inactivity, leading us to focus on its functional effect and underlying mechanisms.
Altered E29 Inclusion Dampens K+ Current and Broadens Spikes
E29 encodes 27 amino acids strategically located within the regulator of K+ conductance (RCK) domain, near the Ca2+ bowl, one of the Ca2+ binding sites promoting BK activation (Figure 1E). E29 inclusion affects BK responses to Ca2+ (Ha et al., 2000) and to alcohol-induced microRNAs (miRNAs) (Pietrzykowski et al., 2008). We expressed BK channels in HEK293 cells to find out whether E29 inclusion influences BK currents. BK currents were smaller with ΔE29 than with E29 (Figures 1H and 1I), whereas the membrane expression level was no different (Figures S2E–S2G). Thus, the reduction of E29 inclusion after
Figure 1. Chronic Spike Blockade-Induced BK Channel AS Is Responsible for Homeostatic Prolongation of AP Duration
(A) Action potentials (APs) recorded from sham control- or chronic TTX-treated (48 h) neurons (left); AP duration (APD) was measured as half-width (right, n = 10).
Scale bars, 20 mV, 1 ms.
(B) Single AP-elicited Ca2+ transients from control or TTX neurons transfected with GCaMP6s (left) and ΔF/F (right, n = 6).
(C) APs recorded from control (left) and TTX (right) neurons with (red) or without (black) IbTx. Scale bar, 20 mV and 1 ms.
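Because APD is reported as half-width in (A), a short Python sketch of one common way to compute it may be useful. It measures the width of the spike at the voltage halfway between a baseline and the peak, using linear interpolation between samples. The choice of baseline, the function name, and the example trace are assumptions for illustration; the measurement procedure actually used for the figure is not described here.

```python
def ap_half_width_ms(time_ms, voltage_mv, baseline_mv):
    """Width of a single spike at half of its (peak - baseline) amplitude."""
    last = len(voltage_mv) - 1
    peak = max(range(len(voltage_mv)), key=voltage_mv.__getitem__)
    half_level = baseline_mv + 0.5 * (voltage_mv[peak] - baseline_mv)

    def crossing(i_below, i_above):
        # Linear interpolation between a sample below half_level and
        # a neighboring sample at or above it.
        t0, v0 = time_ms[i_below], voltage_mv[i_below]
        t1, v1 = time_ms[i_above], voltage_mv[i_above]
        return t0 + (half_level - v0) * (t1 - t0) / (v1 - v0)

    # Walk left from the peak to the first sample below half_level.
    i = peak
    while i > 0 and voltage_mv[i - 1] >= half_level:
        i -= 1
    # Walk right from the peak to the first sample below half_level.
    j = peak
    while j < last and voltage_mv[j + 1] >= half_level:
        j += 1
    if i == 0 or j == last:
        raise ValueError("trace does not cross the half-amplitude level twice")
    return crossing(j + 1, j) - crossing(i - 1, i)


if __name__ == "__main__":
    # Hypothetical triangular "spike" sampled every 1 ms.
    t = [0.0, 1.0, 2.0, 3.0, 4.0]
    v = [-60.0, -60.0, 40.0, -60.0, -60.0]
    print(ap_half_width_ms(t, v, baseline_mv=-60.0))
```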
TTX treatment likely contributes to the dampening of BK outward current.
We returned to cultured cortical neurons to test directly whether a drop in E29 inclusion leads to APD prolongation. Upon overexpression of the ΔE29 BK construct, APD was longer, in line with lower BK conductance, relative to that observed for its E29 counterpart. Furthermore, action potentials of E29-expressing neurons failed to broaden with chronic silencing. Likewise, the already widened AP of ΔE29-expressing neurons showed no further prolongation (Figures 1J and 1K). Thus, inclusion or exclusion of E29 in overexpressed channels overrides the regulation engaged by chronic inactivity, suggesting that such regulation involves E29 splicing. Clinching this calls for understanding how the splicing is controlled.
Nova-2 Binds to the Intron Downstream of E29
AS of BK channels has been intensely studied, but how E29 inclusion is controlled remains unknown. Generally, splicing regulatory factors favor exon inclusion when binding to downstream introns but inhibit it when binding to upstream introns or the exon itself (Black, 2003; Ule et al., 2006). Our bioinformatics analysis of the intronic sequence past E29 revealed YCAY (Y=C/U) clusters (Figure 2A), consensus sequences for binding of Nova proteins, a well-studied class of neuron-specific RNA-binding proteins that regulate AS of neuronal proteins (Buckanovich and Darnell, 1997; Ule et al., 2005). Mice lacking Nova-2, the predominant isoform in the neocortex (Yang et al., 1998), are deficient in E29 inclusion (Ule et al., 2005).
To see whether E29 splicing is directly regulated by Nova-2, we tested whether Nova-2 binds to synthetic RNA oligonucleotides spanning the putative binding region in the downstream intron (probe A1; Figure 2A). Pull-down assays showed that probe A1 binds efficiently to endogenous Nova-2 from mouse cortical lysates. In contrast, no binding to Nova-2 was detected with a probe with mutations in putative Nova-2-binding (YCAY) sites (probe A2) or a control probe spanning a 35-bp intronic stretch upstream of A1 without YCAY motifs (probe B; Figures 2A and 2B). We asked whether Nova-2 also bound to the intron downstream of E29 in vivo, subjecting cortical lysates to RNA immunoprecipitation (IP) with Nova-2 antibodies and assaying with RT-PCR using primers flanking the YCAY sites. RT-PCR product was detected following IP with Nova-2 antibodies but not with control immunoglobulin G (IgG) (Figure 2C). No product from the Nova-2 immunoprecipitate was detected without reverse transcriptase, excluding an effect of genomic DNA contamination (Figure 2C). Thus, in vivo and in vitro experiments show that Nova-2 binds directly to the YCAY sites in the intron downstream of E29.
Nova-2 Is Required for E29 Splicing
Testing the necessity of Nova-2 for E29 inclusion, we knocked down Nova-2 in cultured cortical neurons using short hairpin RNAs (shRNAs) that targeted its coding sequence (CDS), or its untranslated region (UTR) (Figure S4A); the UTR-directed shRNA spared an exogenous Nova-2 construct lacking the UTR (Nova-2(R); Figure S4B). E29 inclusion was sharply reduced by lentiviral delivery of UTR-targeting shRNA but not of scrambled shRNA and not of exogenous Nova-2(R) (Figure 2D). This match to the pattern of Nova-2 knockdown and rescue (Figure S4B) suggested that Nova-2 is necessary for the AS event.
Changes in Nuclear Localization of Nova-2 Can Regulate E29 Splicing
Because Nova-2 acts outside of the nucleus (Racca et al., 2010), not just within it, we assessed the effect of cellular locale by comparing the effects of Nova-2 lacking its nuclear localization signal (ΔNLS) and with the NLS intact (wild type [WT]). Although WT Nova-2 was concentrated in the nucleus, ΔNLS-Nova-2 was mainly cytoplasmic (Figure 2E). We verified that Nova-2 directly regulates E29 inclusion using a splice reporter (Stoilov et al., 2008) containing E29 and partial flanking intron sequences (Figure 2F) in HEK293 cells (largely lacking endogenous Nova-2). In controls, E29 inclusion was knocked down by Nova-2 shRNA and rescued by Nova-2(R) (Figure S4C). Critically, ΔNLS-Nova-2 largely failed to induce E29 inclusion (Figures 2F and S4C), supporting the importance of nuclear localization.
Nova-2 Is Necessary and Sufficient for TTX-Induced E29 Exclusion
To confirm in neurons that Nova-2 is critical for chronic inactivity-induced E29 reduction, we assayed the effects of chronic TTX after knockdown of Nova-2 (Figure 2G). Although scrambled lentiviral shRNA spared the reduction of E29 inclusion following 48-h TTX treatment, knockdown of Nova-2 mimicked and occluded the effect of chronic inactivity on E29 inclusion (Figure 2G). These experiments showed that Nova-2 binds directly to the intron downstream of E29 and is necessary and sufficient for regulation of E29 inclusion by inactivity.
Chronic Inactivity Reduces Nova-2 Nuclear Localization In Vitro
This sets up the question of how Nova-2 effectiveness is linked to chronic inactivity. Because Nova-2 must be in the nucleus to control splicing, its cellular relocalization is a potential control point. We assessed Nova-2 localization in cultured cortical neurons after 48-h treatment with TTX. In TTX-treated neurons,
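The search for YCAY (Y = C/U) clusters in the intron downstream of E29, described earlier in this section, can be illustrated with a small Python sketch. It is not the authors' bioinformatics pipeline: the example sequence is invented, and the sliding-window size used to summarize clustering is a hypothetical parameter.

```python
import re

# Lookahead so that overlapping motifs (e.g. UCAUCAU) are all reported.
YCAY = re.compile(r"(?=([CU]CA[CU]))")


def find_ycay_sites(sequence):
    """Return 0-based start positions of YCAY motifs in an RNA/DNA string."""
    rna = sequence.upper().replace("T", "U")  # accept DNA input as well
    return [match.start() for match in YCAY.finditer(rna)]


def densest_window(sites, window):
    """Largest number of motif starts falling within any `window` nt."""
    best = 0
    for start in sites:
        in_window = sum(1 for s in sites if start <= s < start + window)
        best = max(best, in_window)
    return best


if __name__ == "__main__":
    # Invented sequence, not the real BK intron.
    example = "UCAUCCACGGUCAU"
    sites = find_ycay_sites(example)
    print(sites, densest_window(sites, window=8))
```

A mutated probe such as A2 would, under this simple model, return no sites, which is the property the pull-down control relies on.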
Inactivity-induced Nova-2 translocation was also observed with a monocular deprivation (MD) paradigm (Wiesel and Hubel, 1963). A lid suture was applied in juvenile mice for 5 days during the critical period of visual development (post-natal days 26–31) (Figure 3H), and subcellular localization of Nova-2 in monocular V1 was assayed by immunostaining. The nuclear intensity of Nova-2 in the deprived hemisphere (contralateral to the closed eye) was less than in the non-deprived hemisphere (ipsilateral); opposite differences were seen in cytoplasmic intensity (Figures 3I and 3J). Thus, MD-induced inactivity also induced in vivo Nova-2 translocation from the nucleus to the cytosol. Strikingly, MD also reduced E29 inclusion and increased APD in the vision-deprived hemisphere compared with its non-deprived counterpart (Figures 3K–3N). Thus, chronic inactivity in vivo leads to nucleus-to-cytosol Nova-2 redistribution, reduction of E29 inclusion, and increased APD.
CaMKIV Signaling Drives Nuclear Nova-2 Export
Is mimicking CaMKIV activation sufficient to induce Nova-2 translocation? To test this, we expressed constitutively activated CaMKIV (CA-CaMKIV) (Cruzalegui and Means, 1993) and assayed Nova-2 localization. In vector control-transfected neurons, Nova-2 was predominantly located in the nucleus (Figure 4D, top row). In contrast, in CA-CaMKIV-transfected neurons, Nova-2 nuclear intensity was lower, and cytoplasmic intensity was higher (Figure 4D, second row), causing a significant drop in nuclear/cytoplasmic (nuc/cyt) ratio (Figure 4E). Evidently, activation of CaMKIV is sufficient to drive Nova-2 translocation. Continuous activation of CaMKIV requires its binding to Ca2+/CaM. Accordingly, we overexpressed a nucleus-localized Ca2+/CaM trap, CaMBP4(nuc) (Cohen et al., 2016; Wang et al., 1995), before examining Nova-2 location. This intervention abolished the chronic inactivity-induced nuclear export of Nova-2 (Figures 4D, fourth row, and 4E), supporting the idea that Nova-2 translocation is regulated by activated CaMKIV. These experiments indicate that CaMKIV activation is sufficient and necessary to induce Nova-2 translocation from the nucleus to the cytosol.
CaMKIV Phosphorylates Nova-2 and Regulates Its Nuclear Localization
To find out whether CaMKIV phosphorylates Nova-2, we co-expressed CA-CaMKIV (or GFP as a control) with FLAG-tagged
Figure 3. Chronic Spike Blockade Reduces Nova-2 Nuclear Localization and E29 Inclusion
(A) Immunofluorescence of Nova-2 from neurons treated with the sham Con or 48 h TTX. Nuclei are indicated by dashes. Scale bar, 10 µm.
(B) Quantification of the nuc/cyt ratio of Nova-2 immunofluorescence intensity from neurons sham- or 48 h TTX-treated (left) and sham- or 24 h 60 mM K+ solution-
treated (right) (n = 30 cells).
(C) Western blots from nuclear (left), cytosolic (center), or whole-cell fractions (right), with Nova-2, Lamin B, and Glyceraldehyde 3-phosphate dehydrogenase
(GAPDH) levels.
(D) Quantification of (C) (for each condition, n = 4).
(E) Diagram of eliminated thalamic inputs to the cortex.
(F) Nova (red) localization in cortical neurons from Olig3 Cre or Olig3 Cre; TeToxf/+ mice. Scale bar, 20 µm.
(G) Quantified nuc/cyt ratio of Nova immunofluorescence intensity from the groups in (F) (n = 20 cells).
(H) Diagram of visual pathways involved in monocular deprivation (MD), with the monocular region of the primary visual cortex (V1mono) indicated (boxes).
(I) Nova-2 (red) localization in the V1mono region, contralateral (Contra) or ipsilateral (Ipsi) to visual deprived eye (MD). Scale bar, 10 µm.
(J) Quantified nuc/cyt ratio of Nova-2 immunofluorescence intensity from the groups in (I) (n = 20 cells).
(K) E29 splicing in V1mono Contra or Ipsi to the visually deprived eye (MD).
(L) Quantification of E29 splicing in (K) (n = 4).
(M) AP waveforms recorded from the V1mono region, Contra or Ipsi to the visually deprived eye (MD).
(N) Half-width of APs in (M) (n = 12–14).
For (B), (D), (G), (J), (L), and (N), data are represented as mean ± SEM; **p < 0.01; N.S. represents p > 0.05.
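The nuc/cyt ratios and the mean ± SEM summaries used in this legend can be computed as in the Python sketch below. The intensity values are invented, and the sketch assumes one nuclear and one cytoplasmic intensity per cell with no background correction, which is a simplification rather than the imaging analysis actually used.

```python
import math
import statistics


def nuc_cyt_ratio(nuclear_intensity, cytoplasmic_intensity):
    """Ratio of nuclear to cytoplasmic fluorescence for one cell."""
    return nuclear_intensity / cytoplasmic_intensity


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


if __name__ == "__main__":
    # Hypothetical per-cell intensities (arbitrary units).
    nuclear = [10.0, 20.0, 30.0]
    cytoplasmic = [10.0, 10.0, 10.0]
    ratios = [nuc_cyt_ratio(n, c) for n, c in zip(nuclear, cytoplasmic)]
    mean, sem = mean_sem(ratios)
    print(f"nuc/cyt = {mean:.2f} ± {sem:.2f} (n = {len(ratios)})")
```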
Nova-2 in HEK293 cells and probed Nova-2 phosphorylation Spontaneous Spine Depolarizations Detected with a
with mass spectrometry (liquid chromatography-tandem mass Membrane Voltage Probe
spectrometry [LC-MS/MS]) (Figures S5A–S5D and 5D5G). How does chronic elimination of AP firing lead to activation of
Compared with the control, CA-CaMKIV drove Nova-2 phos- CaMKIV and nuclear exit of Nova-2? Activity blockade with
phorylation at three sites: serine 25 (site 1), threonine 27 (site TTX abolishes evoked vesicle release but spares spontaneous,
2), and serine 194 (site 3). Sites 1 and 2 lie within or next to the AP-independent vesicle release. We asked whether sponta-
NLS in Nova-2, predictive of influence on nuclear localization neous synaptic transmission might be sufficient to activate
(Harreman et al., 2004); site 3 resides in the KH2 domain, one signaling to CaMKIV activation and Nova-2 translocation.
of the RNA binding domains (Yang et al., 1998). To test whether phosphorylation affects Nova-2 location in cultured neurons, we probed the distribution of FLAG-tagged Nova-2 constructs with the sites mutated to glutamic acid (E) to mimic the negative charge phosphorylation confers or to alanine (A) to prevent phosphorylation. In contrast to the mostly nuclear positioning of WT Nova-2, Nova-2 with phosphomimetic mutations at sites 1 and 2 (1E2E) was mostly cytoplasmic (Figures 5H and 5I). Single mutations (1E or 2E) reduced Nova-2 nuclear localization, even when the other site was mutated to alanine (1E2A or 1A2E) (Figures 5H and 5I). Conversely, Nova-2 with single or double mutations to alanine (2A and 1A2A) was predominantly nuclear, like the WT (Figures 5H and 5I). In contrast, glutamate mutations at site 3 had no effect on Nova-2 localization (Figures 5H and 5I) or on Nova-2 binding to RNA (Figure S5E). These results showed that active CaMKIV reduces Nova-2 nuclear localization by phosphorylating sites 1 and 2 (S25, T27).
To test whether CaMKIV phosphorylation of Nova-2 controlled its ability to regulate E29 splicing, we co-expressed the 1E2E double mutant Nova-2 with the E29 splicing reporter in HEK293 cells. Unlike WT Nova-2, the 1E2E variant was unable to induce E29 inclusion, like Nova-2 lacking its NLS (ΔNLS; Figure 5J). Evidently, CaMKIV binds to Nova-2 in the nucleus, primed to phosphorylate serine 25 and threonine 27 sites near the NLS of Nova-2; the resulting Nova-2 nuclear exit prevents its action in splicing, reducing exon 29 inclusion.
To test this, we monitored the membrane potential in dendritic spines of cortical neurons using the genetically encoded voltage indicator ASAP1 (St-Pierre et al., 2014). ASAP1 fluorescence was seen in the plasma membrane of somata, dendritic shafts, and spines (Figure 6A). Decreases in ASAP1 fluorescence were linearly related to membrane depolarizations (St-Pierre et al., 2014) imposed by K+-rich external solution (Figure 6B, gray symbols). In cortical cultures acutely exposed to TTX, dendritic spines exhibited dips in relative fluorescence intensity (ΔF/F), reflecting spontaneous excitatory postsynaptic potentials (EPSPs) (Figure 6C, green exemplar trace, and 6B, pooled data). After chronic TTX, spontaneous spine depolarizations grew by 10 mV (Figure 6C, red trace), exceeding depolarization attained with 20 mM K+ (Figure 6B, pooled data), sufficient to recruit L-type Ca2+ channel- and NMDA receptor (NMDAR)-mediated Ca2+ influx (Helton et al., 2005; Mayer et al., 1984; Nowak et al., 1984). For comparison, even larger ΔF/F transients were observed in dendritic spines of control neurons during backpropagating dendritic APs (bAPs) or somatic action potentials recorded without TTX (Figures 6C, blue and violet exemplar traces, and 6B, pooled data).
Chronic Inactivity Enhances Spontaneous Spine Ca2+ Transients
To study spontaneous Ca2+ transients in dendritic spines, we expressed the fluorescent Ca2+ indicator GCaMP6s
(Figures 6D–6G). Although the soma of cortical neurons was totally silent in the presence of TTX (Figures S6A and S6B), dendritic spines remained active despite spike blockade (Figures 6D–6G and S6C). More spines were active in neurons chronically treated with TTX than in controls (Figure 6H). Likewise, the Ca2+ transients were taller, broader, and larger in area (Figures 6I–6K), whereas fluorescent punctum size and event frequency per punctum were no different (Figures S6D and S6E). Thus, chronic blockade of APs enhances spontaneous Ca2+ transients in spines, in line with optical voltage recordings.
Contributions of Various Ca2+ Pathways during Spontaneous Transmission
We characterized the elevated synaptic Ca2+ transients, mindful of increases in Ca2+-permeable, GluA1-containing AMPA receptors (AMPARs) following activity blockade (Kim and Ziff, 2014; Thiagarajan et al., 2005). Live-labeled surface GluA1 increased in neurons that had undergone chronic inactivity (Figures S6F and S6G). Elevated surface GluA1 contributed to the enlarged Ca2+ transient, indicated by a sharp drop in synaptic Ca2+ transients upon acute exposure to the GluA1 antagonist philanthotoxin (PhTx) (Figure 6L). Importantly, PhTx completely blocked chronic inactivity-induced AP prolongation (Figures 6M and 6N), indicating that GluA1 activation was critical for homeostatic regulation of APD.
Following chronic inactivity, elevated Ca2+ transients were also inhibited by blockade of NMDAR with APV ((2R)-amino-5-phosphonovaleric acid; (2R)-amino-5-phosphonopentanoate) and depletion of internal Ca2+ stores with thapsigargin (Figure 6L). This aligns with AMPAR activation recruiting Ca2+ delivery via NMDAR and intracellular Ca2+ stores (Emptage et al., 1999). Voltage-dependent L-type Ca2+ channels (CaV1) also contributed to spine Ca2+ transients during chronic inactivity, as judged by partial reduction with nimodipine (Figure 6L). A glutamate-gated cation current would create a voltage drop across the spine neck resistance (Harnett et al., 2012; Palmer and Stuart, 2009), giving rise to directly measured depolarizations and CaV1- and NMDA-mediated Ca2+ influx (Figure 6L).
We verified inactivity-induced engagement of Ca2+ signaling by testing for activation of CaMKII and CaMKI in dendritic spines, which, respectively, undergo autophosphorylation or CaM kinase kinase (CaMKK)-mediated phosphorylation in response to local Ca2+ signals (Wayman et al., 2008). Using site-specific anti-phospho-Thr antibodies, we showed CaMKII Thr286 autophosphorylation and CaMKI Thr177/178 phosphorylation, respectively (Figures S6H–S6K). Chronic inactivity augmented the intensity of both markers relative to the control (Figures S6H–S6K), providing independent biochemical evidence that local Ca2+ signaling in spines is enhanced by chronic inactivity.
The CaV1-CaMKK-CaMKIV Pathway Drives Reduced E29 Inclusion
Intensified minis and spine Ca2+ signaling could link inactivity to nuclear AS. To find out whether this involves a classical CaV1-CaMKK-CaMKIV cascade, like that engaged by depolarization (West et al., 2002), we monitored phosphorylation of CaMKIV as a pivotal step. Inactivity induced elevation of phosphorylated CaMKIV (pCaMKIV) but not in the presence of nimodipine (a CaV1 blocker), KN93 (a CaMK inhibitor), or STO-609 (a CaMKK blocker). This pattern was consistent in western blots of nuclear extracts (Figures 7A and 7B) and in nuc/cyto ratios obtained by immunocytochemistry (Figures S7A and S7B). Nova-2 localization showed a reciprocal pattern (Figures 7C, 7D, S7C, and S7D), which was mirrored by TTX-induced reduction in E29 inclusion (Figures 7E and 7F). These results supported a chain of signaling events emanating from spontaneously active spines whereby a CaV1-CaMKK-CaMKIV pathway drives reductions in nuclear Nova-2 and in E29 inclusion.
Activation of a CaV1-CaMKK-CaMKIV pathway by chronic inactivity appears surprising because it seemingly recapitulates effects of hyperactivity (Deisseroth et al., 2003; Ma et al., 2014). However, we found that directly imposed depolarization caused effects similar to those of chronic TTX on Nova-2 translocation and BK E29 exclusion (Figures S7E and S7F), with the latter effect completely prevented by nimodipine or KN93 (Figures 7G, 7H, and S7F). Thus, activation of CaV1 channels and CaMKs drives Nova-2 relocation and E29 splicing, irrespective of how the pathway is engaged.
βCaMKK Translocation Is Required for E29 Inclusion
A remaining question is how, without spiking, CaV1 activation at dendritic sites causes activation of CaMKIV in the nucleus. In excitation-transcription coupling (E-T coupling) following acute depolarization, signaling to nuclear CaMKIV involves translocation of Ca2+/calmodulin via different shuttle proteins: γCaMKII in cortical, hippocampal, and sympathetic neurons (Ma et al., 2014) and γCaMKI in parvalbumin-positive inhibitory neurons (Cohen et al., 2016). In seeking a translocator for excitation-AS coupling, we looked for an increase in nuclear level paired with a drop in cytoplasmic level during chronic inactivity. This pattern was not evident for αCaMKI, βCaMKI, γCaMKI, δCaMKI, γCaMKII, αCaMKK, and CaM itself (Figures 7J, S7G, and S7H). In contrast, levels of βCaMKK rose in the nucleus and fell in
Figure 6. Chronic Spike Blockade Leads to Elevated Depolarization and Ca2+ Transients in Dendritic Spines
(A) Micrograph of a neuron expressing ASAP1. Scale bar, 10 μm.
(B) Quantification of membrane potential (ordinate, left) of somata and spines induced by APs (AP in a soma, purple; bAP in a spine, blue) or by AP-independent synaptic transmission (spine depolarization in sham Con cultures, Vspine green; spine depolarization in 48 h TTX cultures, Vspine red), plotted against the corresponding change in ASAP1 fluorescence from the respective events (ΔF/F). See also STAR Methods.
(C) Example traces of ASAP1 fluorescence intensity (ΔF/F) in the spines and somata in (B).
(D and F) GCaMP6s expression in Con (D) and 48 h TTX neurons (F). Active spines with Ca2+ transients (during 5 min of imaging) are indicated (red dots). Scale bar, 10 μm. See also STAR Methods.
(E and G) GCaMP6s signal (spontaneous Ca2+ signals were recorded in the presence of TTX) from active spines (y axis, red dots in D and F) in Con (E) and 48 h TTX neurons (G).
(H–K) Comparison of synaptic Ca2+ transients between Con (black) and TTX (red) neurons. Shown are (H) number of active puncta (putative active spines) during
5 min of recording, (I) mean amplitude, (J) mean duration, and (K) normalized total fluorescence in Con and TTX (48 h) neurons (n = 14–15).
(L) Quantification of Ca2+ transient amplitudes in dendritic spines with wash on PhTx, APV, thapsigargin, or nimodipine (n = 8).
(M and N) AP waveforms (M) and pooled half-widths (N) after sham (black), 48 h TTX (red), or 48 h TTX with co-application of PhTx (green). Scale bars, 20 mV, 1 ms.
For (H)–(L) and (N), data are represented as mean ± SEM; *p < 0.05; **p < 0.01; N.S. represents p > 0.05. See also Figure S6.
the cytosol; its nuc/cyto ratio increased by more than 50% (Figures 7I and 7J). βCaMKK can phosphorylate CaMKIV and is thus a plausible mediator of cytosol-to-nucleus signaling and nuclear CaMKIV activation. We probed βCaMKK’s involvement by shRNA knockdown. This completely blocked inactivity-induced reduction of E29 inclusion, whereas concomitantly expressing shRNA-resistant βCaMKK fully rescued it (Figures 7K and 7L). Thus, βCaMKK translocates to the nucleus and is necessary for chronic inactivity-induced E29 splicing.
DISCUSSION
We found an unexpected mechanism by which excitatory neurons modify their APs in responding homeostatically to chronic inactivity (Figure 7M). The adaptation arises from a well-defined change in splicing and is mediated by a novel signaling cascade that mobilizes Hebbian-type signaling, even in the absence of spikes. Remarkably, each element of this signaling pathway has been genetically implicated in neuropsychiatric disease (see below).
Regulation of BK Channel Splicing Selectively Controls APD
We found that splicing of BK channels is necessary and sufficient for inactivity-induced lengthening of APD. Under typical circumstances, regulatory changes in one repolarizing current (e.g., BK) would be offset by compensatory changes in others (e.g., KV2) (Kimm et al., 2015). Here, however, the key findings are null effects (Figures 1C, 1D, 1J, 1K, 6M, and 6N); without changes in spike waveform, altered recruitment of other voltage-gated channels is not expected. Kimm et al. (2015) have further shown that blockade of BK channels with IbTx spares the current-frequency relationship. Thus, mechanisms other than BK regulation are expected and seen for homeostatic adjustments of firing frequency (Lee and Chung, 2014; Lee et al., 2015). For APD, we cannot exclude ancillary changes enabled by E29 inclusion, such as BK channel phosphorylation or modified auxiliary subunits. However, the brain-dominant BK auxiliary subunit β4 is not significantly altered by TTX treatment (Lee et al., 2015).
CaV1-CaMKIV Signaling Is Engaged by Chronic Inactivity and Acute Depolarization
Elegant work shows how AS can be regulated by depolarization or activity elevation (Ding et al., 2017; Iijima et al., 2011; Mauger et al., 2016; Xie and Black, 2001). Here we show that AS is also controlled by chronic inactivity and dominates homeostatic regulation of AP shape. Strikingly, chronic inactivity-induced AS of BK, although homeostatic in outcome, relies on CaV1-CaMKK-CaMKIV signaling, like acute depolarization-induced splicing. Activation of Ca2+ signaling by silencing neuronal spiking is counterintuitive; the expectation is that Ca2+ entry would be dampened (Bridi et al., 2018). We resolved the paradox by optically tracking transient depolarizations in dendritic spines of “inactive” neurons (Figures 6A–6C), which were strong enough to activate the CaV1 channels present in dendritic spines (Stanika et al., 2016). This demystified the recruitment of CaV1 signaling, already indicated pharmacologically.
A potent set of synaptic events combines to drive homeostatic readjustment of APD. Chronic spike blockade drives enhanced spontaneous presynaptic vesicle release (Jakawich et al., 2010; Lindskog et al., 2010) and incorporation of postsynaptic, high-conductance, PhTx-sensitive GluA1 receptors (Kim and Ziff, 2014; Thiagarajan et al., 2005). This increases glutamate-gated cation influx, drives greater spine depolarization, and recruits L-type Ca2+ channels in spine heads (Obermair et al., 2004; Yasuda et al., 2003), Mg2+-unblocked NMDARs, and Ca2+ release from internal stores (Dittmer et al., 2017). Our scenario (Figure 7M) accounts for L-type channel participation in responses triggered by chronic TTX; L-type channel involvement in responses to chronic depolarization is well known (O’Leary et al., 2010). Ca2+ channel-triggered signaling in response to chronic inactivity and depolarization runs counter to conventional expectations of diametrically opposing effector actions. Our data suggest that L-type-dependent signaling can be
Figure 7. CaV1-CaMK Signaling Is Required for Chronic Spike Blockade-Induced Nova-2 Translocation and BK Channel AS
(A) Expression level of pCaMKIV and CaMKIV in neurons treated with nimodipine, KN93, or STO-609.
(B) Quantification of CaMKIV activation from (A) (n = 3).
(C) Western blot indicating nuclear Nova-2 levels from neurons treated as indicated.
(D) Quantification of the nuclear Nova-2 level from (C) (n = 3).
(E) E29 splicing after 48 h TTX with other treatments as indicated.
(F) Quantification of E29 splicing from (E) (n = 4).
(G) E29 splicing after chronic depolarization (60K for 24 h) with or without nimodipine or KN93.
(H) Quantification of E29 splicing from (G) (n = 4).
(I) βCaMKK immunostaining (green) with MAP2 (red) and DAPI (blue). Nuclei are indicated by white dashed circles. Scale bar, 10 μm.
(J) Fold changes of αCaMKK and βCaMKK expression in the nucleus, cytosol, and nuc/cyt ratio after TTX (48 h) relative to Con neuron levels (n = 30).
(K) E29 splicing from neurons expressing viral shRNA constructs against endogenous βCaMKK with or without co-expression of shRNA-resistant βCaMKK (βCaMKK (R)). Scrambled shRNA was used as Con.
(L) Quantification of E29 splicing from (K) (n = 4).
(M) Diagram of the homeostatic signaling loop regulating APD. (1) Chronic spike blockade leads to upregulation of synaptic GluA1 (synaptic scaling). (2) Activation of GluA1 mediates excessive cation influx, induces membrane depolarization in dendritic spines, and facilitates opening of the NMDAR and CaV1 channels. (3) CaV1 opening leads to activation of CaMKs, including CaMKI, CaMKII, and βCaMKK. (4) βCaMKK translocates from the cytosol to nucleus and activates CaMKIV. (5) Activated CaMKIV phosphorylates nuclear Nova-2. (6) Phosphorylated Nova-2 translocates to the cytosol. (7) This leads to reduced E29 inclusion in BK channel pre-mRNA. (8) BK channel activity is inhibited when lacking E29, broadening APDs, the observed homeostatic response to chronic spike blockade.
(N) The autoregulatory feedback loop of neuronal excitability (LeMasson et al., 1993; Marder et al., 1996; Siegel et al., 1994) with components described in this
study. Various genes implicated in neuropsychiatric diseases are highlighted.
For (B), (D), (F), (H), (J), and (L), data are represented as mean ± SEM; **p < 0.01; N.S. represents p > 0.05. See also Figure S7.
mobilized to achieve the appropriate response irrespective of the direction of the initial perturbation.
Strikingly, the CaV1 blocker nimodipine completely abolished TTX-induced CaMK activation, Nova-2 translocation, and E29 splicing but only partially inhibited the Ca2+ transient in spines induced by chronic TTX. We suggest that other Ca2+ influx pathways contribute to Ca2+ flux (Wheeler et al., 2012), whereas only nimodipine-sensitive CaV1 channels provide critical voltage-dependent conformational (VDC) signaling (Li et al., 2016) to help trap activated CaMKII (Wang et al., 2017).
Multiple CaM Translocators Support Different Forms of Surface-to-Nucleus Communication
Translocation of a CaMK is a recurrent theme in cytonuclear signaling in neurons, but the identity of the kinase varies (Cohen et al., 2016; Ma et al., 2014). In the present case of chronic inactivity-induced signaling, sensitivity to the selective CaMKK inhibitor STO-609 indicated that a CaMKK was involved. These findings resembled studies in C. elegans where cytonuclear signaling relies on a monomeric CaMKK (CKK-1) that translocates across the nuclear membrane (Kimura et al., 2002). Precedent exists for regulated cytonuclear distribution of βCaMKK (Cao et al., 2011; Karacosta et al., 2012; Nakamura et al., 2001), but its molecular mechanism needs elucidation. βCaMKK lacks a classic NLS, but it might rely on regulation of its nuclear export signal (NES) (Xu et al., 2012) through interaction with a resident nuclear protein.
CaMKIV was, as expected, the target of βCaMKK activation, as judged by elevation of pCaMKIV and its blockade by STO-609. Finding that CaMKIV was prebound to its substrate Nova-2 opens up the possibility of local signaling events. This would minimize the perturbation of nuclear Ca2+ regulation overall, desirable for signaling extending over hours rather than seconds to minutes. To be activated by CaMKK, CaMKIV must be bound to Ca2+/CaM. In one scenario, βCaMKK could locally transfer Ca2+/CaM to CaMKIV while retaining most of its catalytic activity, a known feature of this enzyme (Anderson et al., 1998; Edelman et al., 1996; Tokumitsu and Soderling, 1996). Consistent with such an intranuclear handoff, buffering of nuclear free Ca2+/CaM by CaMBP4nuc completely inhibited chronic inactivity-induced Nova-2 translocation (Figures 4D and 4E).
Roles of CaMKIV in Diverse Aspects of Homeostatic Plasticity
CaMKIV has been identified previously as a key player in homeostatic plasticity of excitatory neurons by Ibata et al. (2008) and Joseph and Turrigiano (2017), who found that the synaptic response to 4-h TTX treatment, a relatively brief period of inactivity, could be mimicked by CaMKIV inhibition using STO-609 or overexpression of a dominant-negative CaMKIV. Those findings might seem at odds with enhanced pCaMKIV in neurons undergoing chronic inactivity (Figures 4 and 7). However, reconciliation may be possible based on the dynamics of the homeostatic response. During TTX treatment for 24–48 h, Kim and Ziff (2014) found initial inhibition of Ca2+ signaling for at least 6 h, followed by late activation at 48 h, indicating that homeostatic modulation occurs in two phases. Perhaps early inhibition of CaMKIV, together with inhibition of calcineurin, contributes to up-scaling of AMPARs, critical for activation of synaptic NMDARs and CaV1; with further inactivity, enhanced synaptic events lead to recruitment of calcium signaling and enhanced nuclear pCaMKIV, vital for regulation of APD.
Regulation of AS by CaMKIV has been studied previously in the context of acute depolarization, acting at conserved CaMKIV-responsive RNA element (CaRRE) sequences in target precursor mRNAs (pre-mRNAs) (Iijima et al., 2011; Lee et al., 2007; Liu et al., 2012; Xie and Black, 2001). We looked for regulation of inclusion of stress-axis-regulated (STREX) exon (X4 in our study) and for CaRRE sequences flanking E29 but did not find either (Figure S3). This is not surprising in light of evidence that multiple activity-dependent splicing factors (hnRNP L, SAM68, and related STAR [signal transduction and activation of RNA] family members) have specific effects on individual pre-mRNA targets (Iijima et al., 2011). For Nova-2 and other splicing factors, puzzles remain regarding how selective control could be exerted if the splicing events are controlled by the same regulator, CaMKIV. Possible differences in the dynamics and localization of specific splicing events need further study.
Similar to Rbfox1 (Lee et al., 2009), Nova function is regulated by cytonuclear relocation (Racca et al., 2010). Nova-2 positioning is a U-shaped function of activity level, with nuclear exit favored by hyperactivity (seizure induction [Eom et al., 2013] or sustained depolarization; Figure S7E) or chronic inactivity. This echoes the U-shaped relationship of CaMKIV activation to activity. Although we focused on Nova-2 actions in the nucleus, Nova-2 has also been found in dendrites, co-localized with its target mRNAs (Racca et al., 2010). Outside of the nucleus, splicing factors might facilitate translocation of their binding partners (e.g., CaMKIV) and regulate the stability and translation of their target mRNAs (see Lee et al., 2016, for such a role of Rbfox1).
Implications for Autism, Schizophrenia, and Other Neuropsychiatric Diseases
In the specific feedback loop we circumnavigated (Figures 7M and 7N), each element is genetically implicated in autism spectrum disorder (ASD), schizophrenia, or other neuropsychiatric disorders. First, L-type Ca2+ channels have been implicated repeatedly in ASD and schizophrenia (Bhat et al., 2012; Purcell et al., 2014; Splawski et al., 2004, 2005) and affect downstream signaling of many forms (Deisseroth et al., 2003). Second, βCaMKK (gene name CAMKK2) exhibits sporadic mutations in individuals with schizophrenia with severe biochemical effects (Luo et al., 2014; O’Brien et al., 2017) and affect multiple target kinases (Marcelo et al., 2016). Third, Nova-2 mediates splicing of hundreds of pre-mRNAs, encoding proteins prominent at synaptic sites (Saito et al., 2016; Ule et al., 2005) whose patterns are altered in Fragile X, a form of ASD (Kuwano et al., 2011; Lewis et al., 2000). Finally, KCNMA1, encoding the BK channel, is causally involved in certain sporadic forms of ASD (Laumonnier et al., 2006), mental retardation, schizophrenia, and epilepsy (Du et al., 2005; Higgins et al., 2008; Zhang et al., 2006); inclusion of BK exon E29
is significantly lower in ASD samples than in matched controls (Parikshak et al., 2016). Together, these findings suggest that individual steps in the adaptive feedback pathway not only contribute to homeostatic regulation but might also go awry in neuropsychiatric disorders (Mullins et al., 2016; Figure 7N). In pinpointing players in E-AS coupling that seem to support pathogenesis, our results align with generally altered splicing in ASD (Parikshak et al., 2016), possibly arising from this kind of regulatory loop or side branches from it. Thus, correction of faulty E-AS coupling merits consideration as a therapeutic strategy (Hébert et al., 2014).
STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:
• KEY RESOURCES TABLE
• LEAD CONTACT AND MATERIALS AVAILABILITY
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Cell lines
  ○ Primary cell cultures
  ○ Animals
• METHOD DETAILS
  ○ Constructs
  ○ Transfection and treatment of cortical neurons
  ○ Lentiviral transduction of cortical neurons
  ○ Immunocytochemistry and image acquisition and analysis
  ○ Transfection and electrophysiology of HEK293 cells
  ○ Electrophysiological recording of action potential half-width in cultured cortical neurons
  ○ Recording Current-Voltage relationships in HEK cells transfected with BK channels
  ○ Ca2+ imaging on dendrite and soma of cultured neurons
  ○ Voltage imaging of cultured neurons
  ○ Immunoprecipitation
  ○ Protein sample preparation and western blot
  ○ Oligo-RNA pull-down
  ○ RNA immunoprecipitation
  ○ RT-PCR and Realtime qPCR
  ○ Thalamic input elimination and immunohistochemistry
  ○ Monocular deprivation
  ○ Protein sample preparation and mass spectrometry analysis
  ○ Phosphopeptide identification and quantitation
• QUANTIFICATION AND STATISTICAL ANALYSIS
• DATA AND CODE AVAILABILITY
SUPPLEMENTAL INFORMATION
Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.013.
ACKNOWLEDGMENTS
We thank Dr. Robert B. Darnell for providing human anti-Nova serum and Dr. Peter Stoilov and Dr. Douglas Black for providing AS reporter constructs. We thank Xiaohan Wang and other Tsien lab members for advice and comments on the manuscript. This work was supported by research grants from the NIGMS (GM058234), NINDS (NS24067), NIMH (MH071739), NIDA (DA040484), Druckenmiller Foundation, Simons Foundation, Mathers Foundation, and Burnett Family Foundation (to R.W.T.); the National Key R&D Program of China (2018YFA0108300 to B.L.); the National Natural Science Foundation of China (81622016 and 31571034 to B.L. and 81871048 and 81741063 to L.H.); the Guangdong Natural Science Foundation (Grants for Distinguished Young Scholars 2015A030306019 to B.L. and 2018B030311034 to L.H.); and the Guangdong Provincial Key R&D Programs (Key Technologies for Treatment of Brain Disorders 2018B030332001 and Development of New Tools for Diagnosis and Treatment of Autism 2018B030335001 to L.H. and B.L.).
AUTHOR CONTRIBUTIONS
B.L. and R.W.T. conceived the project. B.S.S., S.D.S., and Z.L. performed electrophysiology experiments. B.L. and C.W. performed subcellular fractionation and immunostaining. N.C. provided analysis tools for calcium and voltage imaging. N.J.M. and S.D.S. performed immunostaining and analysis for GluA1. G.Z. and T.A.N. performed mass spectrometry. B.W. and G.F. performed thalamic input elimination and immunohistochemistry for Nova. G.T., S.S., and S.D.S. performed cell culture. S.Y. and L.H. performed qRT-PCR and immunostaining. B.L. performed all other experiments and data analysis. B.L., R.W.T., and S.D.S. wrote the manuscript with advice from L.H. and G.F.
DECLARATION OF INTERESTS
The authors declare no competing interests.
Received: April 24, 2019
Revised: January 28, 2020
Accepted: May 4, 2020
Published: June 2, 2020
REFERENCES
Anderson, K.A., Means, R.L., Huang, Q.H., Kemp, B.E., Goldstein, E.G., Selbert, M.A., Edelman, A.M., Fremeau, R.T., and Means, A.R. (1998). Components of a calmodulin-dependent protein kinase cascade. Molecular cloning, functional characterization and cellular localization of Ca2+/calmodulin-dependent protein kinase kinase β. J. Biol. Chem. 273, 31880–31889.
Bhat, S., Dao, D.T., Terrillion, C.E., Arad, M., Smith, R.J., Soldatov, N.M., and Gould, T.D. (2012). CACNA1C (Cav1.2) in the pathophysiology of psychiatric disease. Prog. Neurobiol. 99, 1–14.
Bischofberger, J., Geiger, J.R., and Jonas, P. (2002). Timing and efficacy of Ca2+ channel activation in hippocampal mossy fiber boutons. J. Neurosci. 22, 10593–10602.
Bito, H., Deisseroth, K., and Tsien, R.W. (1996). CREB phosphorylation and dephosphorylation: a Ca(2+)- and stimulus duration-dependent switch for hippocampal gene expression. Cell 87, 1203–1214.
Black, D.L. (2003). Mechanisms of alternative pre-messenger RNA splicing. Annu. Rev. Biochem. 72, 291–336.
Borst, J.G., and Sakmann, B. (1998). Calcium current during a single action potential in a large presynaptic terminal of the rat brainstem. J. Physiol. 506, 143–157.
Bridi, M.C.D., de Pasquale, R., Lantz, C.L., Gu, Y., Borrell, A., Choi, S.Y., He, K., Tran, T., Hong, S.Z., Dykman, A., et al. (2018). Two distinct mechanisms for experience-dependent homeostasis. Nat. Neurosci. 21, 843–850.
Buckanovich, R.J., and Darnell, R.B. (1997). The neuronal RNA binding protein Nova-1 recognizes specific RNA targets in vitro and in vivo. Mol. Cell. Biol. 17, 3194–3201.
Byrne, J.H., and Kandel, E.R. (1996). Presynaptic facilitation revisited: state and time dependence. J. Neurosci. 16, 425–435.
Splawski, I., Timothy, K.W., Sharpe, L.M., Decher, N., Kumar, P., Bloise, R., Napolitano, C., Schwartz, P.J., Joseph, R.M., Condouris, K., et al. (2004). Ca(V)1.2 calcium channel dysfunction causes a multisystem disorder including arrhythmia and autism. Cell 119, 19–31.
Splawski, I., Timothy, K.W., Decher, N., Kumar, P., Sachse, F.B., Beggs, A.H., Sanguinetti, M.C., and Keating, M.T. (2005). Severe arrhythmia disorder caused by cardiac L-type calcium channel mutations. Proc. Natl. Acad. Sci. USA 102, 8089–8096, discussion 8086–8088.
Wondolowski, J., and Dickman, D. (2013). Emerging links between homeostatic synaptic plasticity and neurological disease. Front. Cell. Neurosci. 7, 223.
Xie, J., and Black, D.L. (2001). A CaMK IV responsive RNA element mediates depolarization-induced alternative splicing of ion channels. Nature 410, 936–939.
Xie, J., and McCobb, D.P. (1998). Control of alternative splicing of potassium channels by stress hormones. Science 280, 443–446.
STAR+METHODS
Continued
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Magna RIP RNA-Binding Protein Immunoprecipitation Kit | Sigma | Cat#17-700
Pierce Cell Surface Protein Isolation Kit | Thermo Scientific | Cat#89881
Pierce Nuclear Protein Extraction Kit | Thermo Scientific | Cat#78833
CelLytic M Cell Lysis Reagent | Sigma | Cat#C3228
Experimental Models: Organisms/Strains
Mouse: Olig3-Cre mouse | gift from Y. Nakagawa, University of Minnesota | N/A
Mouse: R26floxstopTeNT | gift from M. Goulding at the Salk Institute for Biological Studies | N/A
Oligonucleotides
Probes and shRNAs, see Table S1 | This paper | N/A
Primers for RT-PCR and qPCR, see Table S1 | This paper | N/A
Primers to insert E29 and flanking introns to the splicing reporter: F: CCGGAATTCCGGCTATGTGGCAACCCTAC | This paper | N/A
Primers to insert E29 and flanking introns to the splicing reporter: R: CGCGGATCCGCGTCTCCTTTGACTTCCTCT | This paper | N/A
Primers to measure E29 splicing in the splicing reporter: F: GGAGAAGTCTGCCGTTACTGCCCTGTG (DY-782 labeled) | This paper | N/A
Primers to measure E29 splicing in the splicing reporter: R: CCGTCGTCCTTGAAGAAGATGGTGC | This paper | N/A
Recombinant DNA
mouse βCaMKK construct: Lentiviral CaMKKbeta | Green et al., 2011b | Addgene Plasmid #33322; RRID:Addgene_33322
rat βCaMKK 1-460 construct: pSG5-FLAG-CaMKKbeta rat 1-460 | Green et al., 2011a | Addgene Plasmid #33324; RRID:Addgene_33324
Human Nova-2 ORF | Origene | Cat# RC216200L1V, RC216200L2V
pcDNA3-BK-GFP | Li et al., 2014 | N/A
pCKII-GFP | Li et al., 2016 | N/A
BK channel without E29 | This paper | N/A
BK channel with E29 | This paper | N/A
splice reporter pFlare5 vector | Stoilov et al., 2008 | N/A
pGFP-C-shLenti βCaMKK shRNA constructs | Origene | Cat#TL711303
pGFP-C-shLenti Nova2 shRNA constructs | Origene | Cat#TL508674
pcDNA3.1/Puro-CAG-ASAP1 | St-Pierre et al., 2014 | Addgene Plasmid # 52519; RRID:Addgene_52519
AAV-CaMKIIa-GCaMP6s-P2A-nls-tdTomato | Gift from Jonathan Ting (unpublished) | Addgene Plasmid #51086; RRID:Addgene_51086
CaMKIV-GFP construct | Gift from Haruhiko Bito | N/A
CA-CaMKIV construct | Gifts from Tian-Ming Gao | N/A
CaMBP4 construct | Gifts from Tian-Ming Gao | N/A
Software and Algorithms
pClamp 9 | Molecular Devices | https://www.moleculardevices.com/
Prism | GraphPad | https://www.graphpad.com/
MATLAB | Mathworks | https://www.mathworks.com/
ImageJ | Schneider et al., 2012 | https://imagej.nih.gov/ij/
Further information and requests for resources and reagents should be addressed to Lead Contact, Richard W. Tsien (richard.tsien@nyulangone.org).
All unique reagents generated in this study are available from the Lead Contact with a completed Materials Transfer Agreement.
Cell lines
HEK293 cell lines (human, female) were purchased from the American Type Culture Collection (ATCC). Cells were cultured in standard Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% FBS, 100 U/mL penicillin, 100 μg/mL streptomycin, and 2 mM L-glutamine. All cells used were negative for mycoplasma. The cell lines were authenticated by the source repository.
Animals
C57BL/6 male and female mice (postnatal day 26), purchased from the Charles River Laboratory, were used for monocular deprivation experiments. The Olig3Cre mouse line was a gift from Y. Nakagawa, University of Minnesota. The R26floxstop-TeNT (tetoxf/f) mouse line was a gift from M. Goulding at the Salk Institute for Biological Studies. These two mouse strains were maintained on a mixed background (Swiss Webster and C57BL/6). All mice were housed with a 12-hour light-dark cycle. Mixed cohorts of female and male mice were used for all experiments to minimize gender effects. Animal protocols were performed in accordance with NIH guidelines and approved by the Institutional Animal Care and Use Committee at New York University and Sun Yat-sen University.
METHOD DETAILS
Constructs
Nova-2 mutants were produced using the Phusion Site-Directed Mutagenesis Kit (Thermo Scientific). BK channel without E29 was first cloned from pcDNA3-BK-GFP (Li et al., 2014) via PCR and inserted into the pCKII-GFP construct with an αCaMKII promoter to restrict expression to pyramidal cells (Li et al., 2016). The E29 sequence was synthesized and inserted using the Phusion Site-Directed Mutagenesis Kit (Thermo Scientific). The splice reporter containing E29 and partial flanking intron sequences was cloned by PCR from genomic DNA obtained from rat brain tissue and inserted into the pFlare5 vector (Stoilov et al., 2008). For other constructs, see the Key Resources Table.
cell debris by filtering through a 0.45 μm filter. The viral particles were concentrated by centrifuging the filtrate at 70,000 × g for 2 hr at 4°C using a Beckman SW28 rotor. The viral pellet was then resuspended in sterile PBS, aliquoted, and stored at −80°C. Lentivirus particles (0.5–1 μL of viral stock diluted in 20 μL of PBS per coverslip) were added to cortical cultures containing 500 μL of medium on DIV7. The experiments with overexpressed proteins or shRNAs were performed on DIV14.
(1) Analysis of Nova-2, pCaMKIV, βCaMKK, and flag-tagged Nova-2 (e.g., Figures 3 and 4). Nuclear and cytosolic regions of interest
were manually drawn while viewing only the DAPI or MAP2 channels, blinded to the Nova-2 color channel. Background-
subtracted mean intensity was quantified and normalized to control conditions.
(2) Analysis of pCaMKII, pCaMKI, and surface GluR1 intensity (e.g., Figure S6) was restricted to MAP2-positive regions of interest
containing a proximal dendrite (60 µm in length, 15 µm in width) at least 30 µm from the cell soma. For GluR1, PSD-95 puncta
were identified, and the average GluR1 signal intensity was measured in the region of those puncta. For pCaMKII and
pCaMKI, the regions of interest were manually drawn while viewing the MAP2 channel, blinded to the pCaMKII color channel.
The background was subtracted from the pCaMKII intensity. Data represent mean ± SEM over 20 such dendrites, normalized to
control conditions.
candidate ROIs.
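The quantification steps described above (background subtraction, normalization to control, and mean ± SEM) can be sketched as follows. This is a minimal illustration only: the function names, the intensity values, and the single scalar background are hypothetical and are not taken from the authors' analysis pipeline.

```python
from math import sqrt
from statistics import mean, stdev


def background_subtracted(roi_means, background):
    """Subtract one background value from each ROI mean intensity."""
    return [m - background for m in roi_means]


def normalize_to_control(values, control_values):
    """Express each value as a fraction of the control-group mean."""
    control_mean = mean(control_values)
    return [v / control_mean for v in values]


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of values."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical ROI mean intensities in arbitrary units (not real data).
control_rois = [120.0, 130.0, 110.0, 125.0]
treated_rois = [90.0, 95.0, 85.0, 100.0]
background = 20.0

ctrl = background_subtracted(control_rois, background)
trt = background_subtracted(treated_rois, background)
ctrl_norm = normalize_to_control(ctrl, ctrl)
trt_norm = normalize_to_control(trt, ctrl)

print("control: %.3f ± %.3f" % mean_sem(ctrl_norm))
print("treated: %.3f ± %.3f" % mean_sem(trt_norm))
```

By construction, the normalized control group has a mean of 1, so treated values read directly as fold change relative to control.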
Immunoprecipitation
Immunoprecipitation was performed with the Dynabeads Protein G Immunoprecipitation Kit (Life Technologies) following the kit protocol
as described (Li et al., 2016). Briefly, to bind the antibody to the beads, 5 µg of antibody or control IgG was crosslinked and
incubated at room temperature with 100 µL of Dynabeads. The supernatant was removed with the help of a magnet to retain the
bead-antibody complex, which was then washed with Ab Binding & Washing Buffer. For lysate preparation, cultured cortical neurons
or HEK293 cells were washed with PBS and lysed on ice with IP Lysis Buffer (Pierce) containing protease and phosphatase inhibitors. The
lysate was centrifuged at 16,000 × g for 10 min at 4°C to pellet the cell debris. The resulting supernatant was incubated with the bead-antibody
complex with rotation for 30 min at room temperature. The bead-antibody-antigen complex was then washed 3 times using
washing buffer. The protein complex was eluted with Elution Buffer, and the eluate was mixed with 2× SDS sample buffer
containing β-mercaptoethanol and heated at 90°C for 10 min.
Oligo-RNA pull-down
RNA pull-down was performed with the Magnetic RNA-Protein Pull-Down Kit (Thermo) following the manual. Briefly, synthesized 5′-biotinylated
2′-OMe-RNA oligonucleotides (Eurofins) (Key Resources Table) were bound to streptavidin magnetic beads (Thermo)
in RNA Capture Buffer at room temperature with agitation. Cleared brain lysates from 1-month-old mouse cortex (Figure 2B) or
transfected HEK cells (Figure S5E), prepared in IP Lysis Buffer (Pierce) containing protease and phosphatase inhibitors, were mixed with
Protein-RNA Binding Buffer and then incubated with the packed beads at 4°C for 1 hr. Beads were washed three times with
wash buffer, and the precipitate was subjected to immunoblot analysis. Signal intensities were quantified with an Odyssey imaging
system (Li-Cor).
RNA immunoprecipitation
RNA immunoprecipitation was performed with the Magna RIP RNA-Binding Protein Immunoprecipitation Kit (Sigma) following
the manual. Briefly, 5 µg of anti-Nova-2 antibody or control IgG was incubated with magnetic beads in RIP wash buffer at room
temperature for 30 min. The antibody-prebound beads were then washed 3 times with RIP wash buffer at room temperature. For
lysate preparation, mouse cortical lysates were prepared with complete RIP Lysis Buffer. The lysates were mixed with the prebound
beads in RIP Immunoprecipitation Buffer containing RIP wash buffer, EDTA, and RNase inhibitor, and incubated overnight with rotation at 4°C.
The magnetic bead-antibody-RNA-binding protein complex was washed with wash buffer and then resuspended in proteinase
K buffer at 55°C for 30 min. Phenol:chloroform:isoamyl alcohol and chloroform were used to separate the phases, and the
RNA was precipitated with Salt Solution I, Salt Solution II, Precipitate Enhancer, and absolute ethanol at −80°C overnight.
The RNA precipitate was centrifuged and washed with 80% ethanol. The pellets were dried, resuspended in
RNase-free water, and used for RT-PCR.
was performed with AmpliTaq Gold 360 Master Mix with DY-682 labeled primers (for X6). The PCR products were either resolved
on agarose gel (X1-X5) or denatured and resolved on 6% polyacrylamide/8 M urea denaturing gels (X6). The band density was
analyzed by an Odyssey imaging system (Li-Cor). qPCR was performed with SYBR-green PCR master mix (Fermentas) with primers
(Table S1) using the DNA engine Opticon 2 (Bio-Rad).
Monocular deprivation
Monocular deprivation was performed as previously described (Ma et al., 2013). Briefly, mice (postnatal day 26) were anesthetized
intraperitoneally with a mixture of 100 mg/kg ketamine hydrochloride and 10 mg/kg xylazine hydrochloride. Under a microscope
with an illuminator, the eyelashes and the edges of the eyelids were cut using spring scissors. The eyelids of the right eye were
sutured with three mattress sutures. A thin layer of 2% xylocaine jelly and bacitracin zinc ointment was applied to the sutured eyelids.
After 5 days of monocular deprivation, mouse brains were sliced for immunostaining or RNA isolation. Nova-2 localization or E29 splicing
in the monocular region of primary visual cortex (V1), both contralateral and ipsilateral to the visually deprived eye, was analyzed.
All statistical analyses were performed using Prism (GraphPad Software). Means of two groups were compared using Student’s t
test. One-way ANOVA was used to compare group means between more than two groups, followed by either Dunnett’s multiple
comparisons test when all other groups were compared to the control group, or Tukey’s multiple comparisons test when
means of each pair of groups were compared. All comparisons are two-sided. Statistical significance was determined as follows:
*p < 0.05, **p < 0.01. Data are shown as mean ± SEM. The statistical details of all experiments, including the number of samples
and p values, can be found in figures and figure legends.
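As a rough illustration of the two-group comparison and the significance notation described above, the sketch below computes a pooled-variance Student's t statistic and maps a p value to the asterisk code. It is not the Prism workflow used in the study: the function names are hypothetical, the "ns" label for p ≥ 0.05 is an added assumption, and the p value itself would come from a t distribution with the returned degrees of freedom (not computed here).

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t_stat, df


def significance_label(p_value):
    """Map a p value to the notation used here: * p < 0.05, ** p < 0.01."""
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"  # assumed label for non-significant results


# Hypothetical measurements (not real data).
t_stat, df = student_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(t_stat, df, significance_label(0.03))
```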
All data supporting the findings of this study and custom MATLAB code for calcium imaging are available upon request from the
Lead Contact.
Supplemental Figures
Figure S1. APD Increased after Chronic TTX Treatment, Related to Figure 1
(A) Representative waveforms of action potential recorded after TTX treatment (24 or 48 h). (B) Quantification of half width of action potentials in the groups in (A)
(n = 10. Data for control and 48 h groups is from Figure 1A).
ll
Article
Figure S2. Effects of Inactivity and E29 Splicing on Trafficking and Expression of BK Channels, Related to Figure 1
(A) Expression level of BK channel mRNA was assayed by RT-PCR (n = 4). (B) BK channel expression in the plasma membrane was assayed by western blotting
after biotinylation-mediated membrane protein fractionation. (C) Immunofluorescence of surface BK channel. Surface BK channel was probed with antibodies
targeting extracellular N terminus. (D) Quantification of intensity of surface BK channel (n = 10). (E) Overexpression of GFP-tagged BK channel isoforms in HEK293
cells. (F) Quantification of surface fluorescence intensity of GFP-tagged BK channel isoforms. (G) Expression level of GFP-tagged BK channel isoforms was
assayed by western blotting after biotinylation-mediated membrane protein fractionation. Na/K ATPase, Lamin B and GAPDH are markers for plasma membrane,
nucleus and cytosol.
Figure S3. Effects of Chronic Inactivity and Depolarization on AS of the BK Channel, Related to Figure 1
(A) Impact of chronic inactivity on alternative splicing of X1 to X5. Cultured cortical neurons were sham- or TTX-treated for 48 h. Separation of RT-PCR products
obtained with specific primers flanking each alternatively spliced site (X1 to X5). Closed arrows indicate the longer RT-PCR products obtained if the alternative
exon was included; open arrows indicate the shorter products if that exon was excluded. (B) Quantification of inclusion or exclusion of X1 to X6 by quantitative RT-PCR.
(C) High K+ solution maintained long-term depolarization of the resting membrane potential. (D) Impact of sustained depolarization on E29 inclusion. Cultured
cortical neurons were sham- or high K+-treated (20 mM, 40 mM, and 60 mM, respectively) for 12 or 24 h. E29 inclusion was examined by RT-PCR.
Figure S5. Identification of CaMKIV-Mediated Nova-2 Phosphorylation Sites by Mass Spectrometry, Related to Figure 5
(A) Schematic diagram of Nova-2 protein as in Figure 7. (B to D) Mass spectrometry results for the phosphorylation level of the S25, T27, and S194 sites. (E) WT and S194E Nova-2
do not differ in RNA binding. Flag-tagged WT Nova-2 or S194E Nova-2 was overexpressed in HEK293 cells. An RNA pull-down assay was performed on the cell
lysates with probe A1 (Figure 2). The RNA-binding ability of WT and S194E Nova-2 was assayed by western blotting with anti-flag antibody.
Figure S6. Ca2+ Transients, Surface GluA1 Expression, and Activation of CaMKs in Con or TTX-Treated Neurons, Related to Figure 6
(A) Ca2+ transients were imaged from the soma of control neurons, with no TTX present. (B) Ca2+ imaging, from the soma of control neurons (upper) or TTX-treated
(48 h) neurons, was performed in the acute presence of TTX. (C) Ca2+ transients induced by action potential-independent synaptic transmission (red, imaged with
TTX present) or by back-propagating action potentials (bAP) (blue, imaged without TTX). (D and E) Puncta area (indicating active spines, D) and the number of
Ca2+ transients (per minute, E) in each puncta (active spine) in control and TTX-treated (48 h) neurons (n = 14 to 15). (F) The influence of chronic inactivity on
surface GluA1 expression. Surface labeling of GluA1 in the dendrites of control and TTX-treated (48 h) neurons. PSD95 serves as a synaptic marker and MAP2 as a dendrite
marker. Scale bar, 10 µm. (G) Quantification of surface GluA1 immunofluorescence intensity in the dendrites of control and TTX-treated (48 h) neurons. (H) The
activation of CaMKII in control and TTX-treated neurons. Immunofluorescence of CaMKII phosphorylated at Thr286, indicating the level of autophosphorylated
CaMKII. (I) Quantification of dendritic pCaMKII intensity in control and TTX groups. (J) The activation of CaMKI in control and TTX-treated neurons.
Immunofluorescence of CaMKI phosphorylated at Thr177/178, indicating the activation of CaMKI by CaMKK. (K) Quantification of dendritic pCaMKI
intensity in control and TTX groups. MAP2 serves as a dendrite marker. Scale bar, 10 µm.
Figure S7. Regulation of pCaMKIV Activation, Nova-2 Localization, E29 Splicing, and CaMK Localization by Chronic Inactivity and CaV1-
CaMK Blockers or Inhibitors, Related to Figure 7
(A) The activation of nuclear CaMKIV in different groups as indicated. Immunofluorescence of phosphorylated CaMKIV on Thr196 site indicating the level of
CaMKIV activation by CaMKK. (B) Normalized nuclear/cytosolic ratio of pCaMKIV immunofluorescence intensity in each group as indicated. (C) The localization
of Nova-2 in different groups as indicated. (D) Normalized nuclear/cytosolic ratio of Nova-2 immunofluorescence intensity in each group as indicated. (E) 12 h and
24 h high-potassium media treatment (20 mM, 40 mM, and 60 mM, respectively) led to Nova-2 translocation from nucleus to cytosol. (F) Long-term depolarization
reduced the inclusion of E29, which could be blocked by the CaV1 blocker nimodipine and the CaM kinase blocker KN-93. (G and H) Fold changes in abundance of
CaM, αCaMKI, βCaMKI, γCaMKI, δCaMKI, and γCaMKII in the nucleus, cytosol, and nucleus/cytosol ratio. (G) Representative micrographs of different proteins in
control or TTX-treated neurons. (H) Ratios of levels after TTX treatment (48 h) relative to levels in control neurons. (I) Validation of the knockdown of βCaMKK
expression by shRNAs. Flag-tagged βCaMKK was co-expressed with shRNAs targeting different locations of the rat βCaMKK CDS. The efficiency of shRNAs was
assayed by western blotting.
Article
Correspondence
mustafa.aydogan@path.ox.ac.uk
(M.G.A.),
mb915@cam.ac.uk (M.A.B.),
jordan.raff@path.ox.ac.uk (J.W.R.)
In Brief
Feedback-driven oscillations in centriolar
Plk4 kinase levels—normally entrained by
the cell-cycle oscillator but capable of
running autonomously—trigger and time
centriole biogenesis to ensure that
daughter centrioles grow at the right time
and to the right size.
Highlights
• Centriolar Plk4 levels oscillate and act as a switch for centriole biogenesis
Article
An Autonomous Oscillation
Times and Executes Centriole Biogenesis
Mustafa G. Aydogan,1,5,* Thomas L. Steinacker,1,5 Mohammad Mofatteh,1 Zachary M. Wilmott,1,2 Felix Y. Zhou,3
Lisa Gartenmann,1 Alan Wainman,1 Saroj Saurya,1 Zsofia A. Novak,1 Siu-Shing Wong,1 Alain Goriely,2
Michael A. Boemo,1,4,* and Jordan W. Raff1,6,*
1Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, UK
2Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK
3Ludwig Institute for Cancer Research, University of Oxford, Oxford OX3 7DQ, UK
4Present address: Department of Pathology, University of Cambridge, Cambridge CB2 1QP, UK
5These authors contributed equally
6Lead Contact
SUMMARY
The accurate timing and execution of organelle biogenesis is crucial for cell physiology. Centriole biogenesis is
regulated by Polo-like kinase 4 (Plk4) and initiates in S-phase when a daughter centriole grows from the side of a
pre-existing mother. Here, we show that a Plk4 oscillation at the base of the growing centriole initiates and times
centriole biogenesis to ensure that centrioles grow at the right time and to the right size. The Plk4 oscillation is
normally entrained to the cell-cycle oscillator but can run autonomously of it—potentially explaining why cen-
trioles can duplicate independently of cell-cycle progression. Mathematical modeling indicates that the Plk4
oscillation can be generated by a time-delayed negative feedback loop in which Plk4 inactivates the interaction
with its centriolar receptor through multiple rounds of phosphorylation. We hypothesize that similar organelle-
specific oscillations could regulate the timing and execution of organelle biogenesis more generally.
1566 Cell 181, 1566–1581, June 25, 2020 © 2020 The Author(s). Published by Elsevier Inc.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Figure 1. Plk4 Levels Oscillate at the Centriole in a Process Entrained by the CCO
(A) Top panel: micrograph shows an image from a time-lapse movie of an embryo expressing Plk4-NG. Middle panels: micrographs illustrate the centriolar Plk4-
NG oscillation during nuclear cycle 12—obtained by superimposing all the Plk4-NG foci (n = 60) at each time point (see STAR Methods). Bottom panel:
quantification of centriolar Plk4-NG levels during nuclear cycles 11–13 in a single embryo (red arrows highlight equivalent time points in the middle panels).
(B) Graphs show the mathematical regression of centriolar Plk4-NG dynamics during S-phase of cycles 11–13 (regression mean ± SEM). R² values indicate
goodness-of-fit. N ≥ 15 embryos; n = 24, 37, and 53 centrioles (mean) per embryo over cycles 11–13, respectively.
(C) The bar charts quantify the oscillation parameters—derived from the data shown in (B). Data are presented as mean ± SD. Statistical significance was as-
sessed using an ordinary one-way ANOVA test (for Gaussian-distributed data) or a Kruskal-Wallis test (***p < 0.001; ****p < 0.0001; ns, not significant).
(D) Micrographs show, and pie charts quantify, the distribution of Plk4-NG at centrioles assessed by 3D-SIM at the indicated phases of the nuclear cycle (see
STAR Methods). N = 6 embryos per cell-cycle stage; n = 20 centrioles per embryo; all images were scored blindly by 3 assessors and the mean score is shown
(scale bar, 0.5 µm).
Recent studies have shown that Plk4 localizes to centrioles in a cyclical manner in both fly embryos (Aydogan et al., 2018) and human cultured cells (Takao et al., 2019), but the functional significance of this localization pattern is unclear. Here, we show that a Plk4 oscillation at the base of the growing centriole initiates and times centriole biogenesis in fly embryos.

RESULTS AND DISCUSSION

Plk4 Levels Oscillate at the Base of Growing Daughter Centrioles
To investigate the cyclical recruitment of Plk4 to the centrioles, we generated flies transgenically expressing Plk4-mNeonGreen (Plk4-NG) under the control of its own promoter in a Plk4 mutant background. We monitored centriolar Plk4-NG levels in living Drosophila syncytial embryos, where the duration of S-phase gradually elongates over nuclear cycles 11–13 (Figures 1A, S1, and S2A; Video S1). Centriolar Plk4-NG levels oscillated during each cycle: levels started to rise in M-phase, peaked in early-mid S-phase, and were minimal by the next M-phase (Figures 1A and S2A). We fit the S-phase oscillations in individual embryos (Figures S1C and S1D) to derive an average S-phase oscillation for each cycle (Figure 1B).
Not surprisingly, the Plk4 oscillations appeared to be entrained by the core Cdk/Cyclin oscillator as their period increased as nuclear cycles slowed during cycles 11–13 (Figure 1C). Moreover, genetically altering the duration of the nuclear cycles elicited corresponding alterations in the Plk4 oscillation period (Figures 1E and 1F). Interestingly, however, the Plk4 oscillation exhibited adaptive behavior: as the period (T) of the oscillation tended to increase at successive cycles, its amplitude (A) tended to decrease, so that the total amount of Plk4 recruited to centrioles—i.e., the area under the S-phase oscillation curve (area under the curve [U])—remained relatively constant (Figure 1C).
Plk4 is initially recruited to a ring around the mother centriole that resolves into a single hub that defines the site of daughter centriole assembly (Banterle and Gönczy, 2017; Fırat-Karalar and Stearns, 2014; Nigg and Holland, 2018). To examine how this localization related to the Plk4 oscillations, we used 3D-structured illumination super-resolution microscopy (3D-SIM) to assess the centriolar localization of Plk4 during the nuclear cycles in living embryos. Plk4-NG was only very briefly detectable in a ring during late-mitosis; at all other stages it appeared largely as a single hub (Figure 1D). Thus, the recruitment and loss of Plk4 from the centriole wall is not responsible for the S-phase oscillation we observe in these embryos; instead, centriolar Plk4-NG levels oscillate at the base of the growing daughter centriole.

Plk4 Oscillations Time and Execute Centriole Biogenesis
To test whether the Plk4 oscillations were important for centriole biogenesis, we generated flies co-expressing Plk4-NG (in a Plk4 mutant background) and the centriole cartwheel component Sas-6-mCherry, which is irreversibly incorporated into the base of the growing daughter centriole cartwheel and can be used to monitor centriole growth in fly embryos (Aydogan et al., 2018). These flies laid embryos that often failed to hatch (Figure S3C), but we simultaneously measured Plk4 oscillations and centriole growth in those embryos that appeared to be developing normally (Figures 2A, S3A, and S3B; Video S2). The mother centrioles in these embryos were often slightly delayed in initiating daughter centriole growth (Figures 2A, S3D, and S3E), allowing us to measure the amount of Plk4 at the centrioles when daughter centrioles either started or stopped growing (Figure 2A, colored dotted lines).
Strikingly, the centriolar levels of Plk4 at which centriole growth initiated at each cycle (“Start”; Figures 2A and 2B) were not significantly different than the levels at which centriole growth stopped (“Stop”; Figures 2A and 2B). This suggests that at each cycle there is a threshold level of centriolar Plk4 that is required to support centriole growth: above this threshold the centrioles can grow, below this threshold they cannot. If the threshold concept is correct, then mother centrioles that failed to recruit sufficient Plk4 should not grow a daughter. We observed that the centrioles in a fraction of the embryos expressing both Plk4-NG and Sas-6-mCherry (mostly at nuclear cycle 13) separated at the start of S-phase but did not detectably incorporate Sas-6-mCherry, indicating that daughter centrioles did not grow (Figures S3D and S3E)—a defect that may explain why many of these embryos failed to hatch (Figure S3C). Intriguingly, centriolar Plk4 levels continued to oscillate in these embryos, but the average amplitude of these oscillations was lower than in the embryos in which centrioles continued to duplicate—and it was almost always below the average threshold at which centriole growth was normally initiated (Figure 2C). Together, these results suggest that the Plk4 oscillations initiate, and determine the duration of, centriole growth.

Mathematical Modeling of the Plk4 Oscillation
Oscillations in biology are often generated by delayed feedback circuits (Tsai et al., 2008). In Drosophila, Plk4 is recruited to centrioles by Asterless (Asl), which also activates Plk4, allowing it to phosphorylate both itself and Asl at multiple sites (Boese et al., 2018; Dzhindzhev et al., 2010; Klebba et al., 2015). Human Asl (Cep152) also binds, and is phosphorylated by, Plk4 in vitro (Cizmecioglu et al., 2010; Hatch et al., 2010). We realized that this
(E) Graph shows the mean regression of Plk4-NG oscillations in nuclear cycle 12 of WT embryos (green), or in embryos where the genetic dose of either cyclin B
(CycB+/−; blue) or grapes (Drosophila Chk1) (grp+/−; red) has been halved to slow or speed up the nuclear cycles, respectively. Dashed lines mark the center (peak)
of the Plk4-NG oscillations (denoted with C), and dotted lines indicate the time of NEB (denoted with N) for each genotype. N ≥ 14 embryos for each condition; n =
55, 43, and 44 centrioles (mean) per embryo in WT, CycB+/−, and grp+/− embryos, respectively. To clearly illustrate the phase shift in the oscillations, the highest
mean fluorescence signal for each group was normalized to 1.
(F) Bar charts quantify the time at which the Plk4-NG oscillations peaked, the length of S-phase, and the ratio between them (C/N)—derived from the data shown
in (E). Data are presented as mean ± SD. Statistical significance was assessed using an ordinary one-way ANOVA test (for Gaussian-distributed data) or a
Kruskal-Wallis test (**p < 0.01; ***p < 0.001; ns, not significant).
See also Figures 6, S1, and S2.
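The adaptive behavior reported for Figure 1C (amplitude falling as the oscillation broadens, with the area under the curve staying roughly constant) can be illustrated with a Lorentzian peak, the functional form named in the Figure 3C legend. The sketch below is an assumption-laden toy: the parameterization by `amplitude`, `center`, and `half_width`, and the example values, are hypothetical and are not the fitted parameters from the study. For a Lorentzian, the area over the whole time axis is π × amplitude × half-width, so halving the amplitude while doubling the width leaves the area unchanged.

```python
from math import pi


def lorentzian(t, amplitude, center, half_width):
    """Lorentzian peak with the given height, center, and half-width at half-maximum."""
    return amplitude * half_width ** 2 / ((t - center) ** 2 + half_width ** 2)


def full_area(amplitude, half_width):
    """Analytic area under a Lorentzian over the whole time axis."""
    return pi * amplitude * half_width


def numeric_area(amplitude, center, half_width, t_min, t_max, n=40000):
    """Trapezoid-rule area over a finite window [t_min, t_max]."""
    dt = (t_max - t_min) / n
    total = 0.5 * (lorentzian(t_min, amplitude, center, half_width)
                   + lorentzian(t_max, amplitude, center, half_width))
    for i in range(1, n):
        total += lorentzian(t_min + i * dt, amplitude, center, half_width)
    return total * dt


# Hypothetical "early" (tall, narrow) and "late" (short, broad) oscillations.
early = {"amplitude": 2.0, "half_width": 1.0}
late = {"amplitude": 1.0, "half_width": 2.0}
print(full_area(**early), full_area(**late))
```

Over a finite S-phase window the two areas differ slightly because the broader peak loses more of its tails, which is one reason the constancy in real data is only approximate.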
system could form a time-delayed negative feedback network capable of generating Plk4 oscillations if the activation of Plk4 by Asl eventually led to the inhibition of their interaction.
A simple version of such a scenario is illustrated in Figures 3A and 3B. At the start of each oscillation cycle, we envisage that unphosphorylated Asl receptors on the mother centriole recruit Plk4 to the site of daughter centriole assembly with high affinity (Figure 3A, (i)). Binding activates Plk4, allowing it to phosphorylate itself (Cunha-Ferreira et al., 2013; Holland et al., 2010; Klebba et al., 2013), Ana2/STIL (Dzhindzhev et al., 2014; Kratz et al., 2015; McLamarrah et al., 2018; Ohta et al., 2014) and Asl/Cep152 (Boese et al., 2018; Hatch et al., 2010) at multiple sites (Figure 3A, (ii)). The phosphorylated Ana2 promotes cartwheel assembly, potentially explaining why a threshold level of Plk4 is required to promote centriole growth—but in our model this reaction is not important for the Plk4 oscillation per se, so we do not consider it further. We speculate that the phosphorylation of Asl at multiple sites reduces its affinity for Plk4, so that the bound Plk4 molecules are released, leaving behind the phosphorylated Asl-receptor that can no longer recruit Plk4 (Figure 3A, (iii)) (see the end of this section for how this network can be reset to trigger subsequent rounds of oscillations).
This network (Figure 3B; see mathematical model 1 in STAR Methods) maps onto a set of coupled linear ordinary differential equations, which we solved analytically. Solutions to this first model (model 1 in STAR Methods) fit the discrete Plk4 oscillation data from each S-phase of nuclear cycles 11–13 very well (Figure 3C; R² > 0.99). Although the model may overfit the data,
Figure 3. A Simple Mathematical Model of the Plk4 Oscillation and Experimental Investigations to Test Its Predictions
(A) Diagram of the model. (i) During mitosis, Asl receptors (red) on the surface of the mother centriole start to bind Plk4 (green) with high affinity (k1). (ii) Once bound,
Plk4 is activated, and it starts to phosphorylate itself, Ana2 (black) and Asl (k2) at multiple sites (indicated by dotted black arrows and black dots). (iii) We speculate
that, after several rounds of phosphorylation, Asl is converted to a state with low affinity for Plk4, so phosphorylated Plk4 is released (k3)—and likely degraded.
These Asl receptors are now inactivated and can no longer bind Plk4 to promote centriole growth.
(B) Schematic depicts the topology of the mathematical model (see STAR Methods for full details of the model). Asl-p and Plk4-p indicate phosphorylated
proteins. Bold arrows indicate the dominant direction of the reactions. This model discretely examines centriolar Plk4-NG levels only during S-phase of each
cycle. We speculate that a phosphatase normally removes the phosphate groups from Asl during mitosis to reset the system for the next oscillation (red arrow; k4),
and we extend the model to include this step elsewhere (Figures S4A and S4B).
(C) Graphs show the Lorentzian fit of the Plk4-NG oscillation data during S-phase of cycles 11–13 (solid lines) overlaid with analytical solutions to the model
(dotted lines). R² values indicate goodness-of-fit.
(D) Bar charts quantify the average cytosolic concentration and centriolar fluorescence of Asl-GFP at the start of S-phase of each cycle. Cytosolic Asl-GFP was
measured using FCS; each data point represents the average of 4–6 10-s recordings from a single embryo (Figure S5). Centriolar Asl-GFP was measured using
confocal microscopy, as described in STAR Methods. N ≥ 14 embryos for each cell cycle; n = 48, 70, and 130 centrioles (mean) per embryo in cycles 11–13,
respectively. Data are presented as mean ± SEM. Statistical significance was assessed using an ordinary one-way ANOVA test (ns, not significant).
these solutions were within a reasonable and generally narrow parameter space (Figures 3C and S5; Data S1, first and fourth charts). Nevertheless, we believe this model is likely to be oversimplified. Plk4’s ability to phosphorylate itself, for example, could help to generate the oscillation by promoting Plk4 degradation (Cunha-Ferreira et al., 2009; Guderian et al., 2010; Holland et al., 2010; Rogers et al., 2009) or lowering the affinity of the Asl::Plk4 interaction—as has recently been demonstrated (Park et al., 2019). Moreover, the model considers the behavior of only Asl and Plk4, when other factors, such as Ana2/STIL, are likely to modulate the system’s behavior (Arquint and Nigg, 2016; Gönczy and Hatzopoulos, 2019). Finally, the model does not consider the possibility that Plk4 bound to one receptor could phosphorylate nearby receptors, or the Plk4 bound to nearby receptors, to influence their behavior—a concept that may be important when considering how Plk4 ultimately localizes to only a single site on the side of the mother centriole (Leda et al., 2018; Takao et al., 2019).
In order to demonstrate how this network could be reset for the next oscillation, we extended our model (model 2 in STAR Methods) to allow a protein phosphatase (PPTase) to be activated during M-phase to dephosphorylate Asl (Figures 3B, red arrow, and S4A). This resetting is biologically plausible, because the activities of several PPTases are regulated during the cell cycle (Nilsson, 2019). This model can be solved exactly, and its solutions generate robust centriolar Plk4 oscillations within the context of a system that, like the early Drosophila embryo, alternates between periods of S- and M-phases (Figure S4B). Thus, our minimal model illustrates that a classical “time-delayed negative-feedback” network (Novák and Tyson, 2008) can generate Plk4 oscillations, although the precise molecular details of this system remain to be fully elucidated.

Testing Predictions of the Mathematical Models
A key feature of our models is that the phosphorylation of Asl by Plk4 reduces their affinity (although, as discussed above, Plk4’s ability to phosphorylate itself, and other factors, could also help to generate the oscillation). To test the plausibility of this idea, we mutated 13 potential Plk4 phosphorylation sites in Asl to Ala (Asl-13A) (Figure S4C). These sites were selected based on their conservation, their similarity to known Plk-family consensus sites (Leung et al., 2007), their proximity to the N- and C-terminal regions of Asl that are thought to interact with Plk4 (Boese et al., 2018), and a previous analysis of sites in the Asl N-terminal region that are either phosphorylated by Plk4 kinase domain in vitro or have been shown to be phosphorylated in cultured Drosophila cells (Boese et al., 2018). If some of these sites are normally phosphorylated by Plk4 to reduce the affinity of the Asl::Plk4 interaction, we would predict that expressing Asl-13A in the presence of endogenous, unlabeled Asl would lead to an increase in centriolar Plk4-NG levels—because the Plk4 should unbind from the mutant Asl receptors less efficiently. Although Asl-13A-mKate2 localized to centrioles less efficiently than Asl-WT-mKate2 (Figure S4D), expressing untagged Asl-13A increased the amplitude of the Plk4-NG oscillation (Figure S4E), consistent with our idea that phosphorylating Asl can reduce its affinity for Plk4 (Figures 3A and 3B).
An inspection of the parameters generated by our model revealed that the reduction in the amplitude of the Plk4 oscillation at successive nuclear cycles was driven primarily by a reduction in the cytosolic concentration of Plk4 (that determines k1, the rate at which Plk4 binds to Asl), while total levels of the Asl receptor (Atot) remain relatively constant (Data S1, first chart). To test if this was the case, we first used fluorescence correlation spectroscopy (FCS) (Figure S5) to examine the cytosolic concentration of Asl-GFP. Although the number of centrioles assembled doubles at each successive cycle, the average cytosolic concentration of Asl-GFP, and the average centriolar levels of Asl-GFP, remained relatively constant at the start of each successive cycle (Figure 3D), as predicted by our model. Unfortunately, the cytosolic concentration of Plk4-NG was too low to be measured by conventional FCS, so we developed a new method, peak counting spectroscopy (PeCoS), to measure relative protein abundance at lower concentrations (see STAR Methods) (Figure S6). This revealed that, in contrast to Asl-GFP, the cytosolic levels of Plk4-NG tended to decrease at successive nuclear cycles (Figure 3E), as predicted by the model.
Why do cytosolic Plk4 levels decrease at successive nuclear cycles? Our modeling suggests that if total Plk4 levels in the developing embryo remain constant (i.e., the rate of Plk4 degradation and synthesis are balanced), then the doubling of centriole numbers at each cycle can lead to the depletion of cytosolic Plk4—particularly during later nuclear cycles—as an increasing fraction of the protein is sequestered by the increasing number of centrioles (Figure S4F). Alternatively (or additionally), Plk4 molecules that are activated by binding to Asl may be more likely to phosphorylate themselves to stimulate their degradation, ensuring that more Plk4 is degraded at each cycle as the number of centrioles increases. Interestingly, in either of these scenarios, increasing centriole numbers lead to more Plk4 depletion from the cytosol, potentially allowing embryos to effectively “count” their centrioles.

The Plk4 Oscillation Can Adapt to Changes in Plk4 Levels to Maintain a Constant Centriole Size
Our finding that cytosolic levels of Asl remain constant at successive cycles while cytosolic Plk4 levels decrease suggests a rationale for why centriole biogenesis may be regulated by an oscillatory system. In our models, Asl effectively functions as an integrator (Ferrell, 2016; Somvanshi et al., 2015) whose levels are kept constant so that it can measure changes in the input (cytosolic Plk4 levels) and adapt the oscillation to maintain a constant output (centriole size). If this interpretation is correct, then the Plk4 oscillation should adapt to maintain a constant centriole size when Plk4 levels change, but not when Asl levels change. To test this,
(E) Bar chart shows the relative abundance of Plk4-NG at the start of each nuclear cycle measured by PeCoS (see STAR Methods; Figure S6). Each data point
represents a single 180-s recording from a single embryo. Statistical significance was assessed using a Kruskal-Wallis test (****p < 0.0001). Data are presented as
mean ± SD.
See also Figures S4, S5, and S6 and Data S1 (first and fourth charts as well as the Monte Carlo analysis).
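The sequestration scenario discussed in the main text (constant total Plk4, centriole numbers doubling at each cycle) can be illustrated with a simple mass-balance sketch. This is not the authors' model: the function name, the total amount, the amount bound per centriole, and the starting centriole count are all hypothetical values chosen only to show the trend.

```python
def cytosolic_fraction(total, bound_per_centriole, n_centrioles):
    """Fraction of a fixed protein pool left in the cytosol after centrioles bind their share."""
    bound = min(total, bound_per_centriole * n_centrioles)
    return (total - bound) / total


# All numbers below are hypothetical (arbitrary units), not measurements.
total_plk4 = 1000.0
bound_per_centriole = 0.05
centrioles = 512

fractions = []
for cycle in range(10, 14):
    fractions.append((cycle, cytosolic_fraction(total_plk4, bound_per_centriole, centrioles)))
    centrioles *= 2  # centriole number doubles at each nuclear cycle

for cycle, fraction in fractions:
    print(f"cycle {cycle}: cytosolic fraction = {fraction:.4f}")
```

Under these assumptions the cytosolic fraction falls faster at later cycles, because each doubling removes twice as much protein as the one before.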
Figure 4. The Plk4 Oscillator Can Adapt to Changes in Plk4 Concentration but Not to Changes in Asl Concentration
(A and B) Graphs show the regression data (solid lines) and mathematical solutions (dotted lines) for Plk4-NG oscillations in cycle 12 for experiments where either
(A) the genetic dose of Plk4-NG was halved (Plk4-NG1/2), or (B) the genetic dose of asl was halved (asl1/2) (gray lines) compared to controls (green lines). (A) N ≥ 11
embryos for each condition; n = 47 and 42 centrioles (mean) per embryo in control or Plk4-NG1/2 groups, respectively. (B) N = 18 embryos for each condition; n =
44 and 43 centrioles (mean) per embryo in control or asl1/2 groups, respectively. Data are presented as mean ± SEM. Bar charts quantify oscillation parameters,
as indicated; data are presented as mean ± SD.
we monitored Plk4-NG oscillations in embryos laid by mothers in which we genetically halved the dose of either Plk4-NG (hereafter Plk4-NG1/2 embryos) or asl (hereafter asl1/2 embryos). Centrioles appeared to duplicate normally in both sets of embryos, but the Plk4 oscillation parameters were altered: in Plk4-NG1/2 embryos, A decreased but there was a compensatory increase in T, so U remained relatively constant (Figures 4A and S7A); in asl1/2 embryos, A decreased, but there was no compensatory change in T, so U decreased (Figures 4B and S7B–S7D).
Our mathematical model (model 1) could fit both sets of data well (Figures 4A and 4B; R² > 0.99), generating a reasonable range of parameters (Data S1, second and third charts), several of which we again validated experimentally (Figure S7; see mathematical modeling section in STAR Methods). Interestingly, if we took the normal parameters derived from our model and simply adjusted the amount of Asl or Plk4 in the model to the levels we experimentally measured in the half-dose embryos, the model fit the data less well (not shown). This suggests that changing the concentration of one component is likely to influence the concentration and/or behavior of other components so that several parameters of the Plk4 oscillation are altered. This seems plausible, as the core centriole duplication proteins are known to interact with and influence each other in multiple ways (Arquint and Nigg, 2016; Gönczy and Hatzopoulos, 2019; Nigg and Holland, 2018).
Consistent with our observation that the Plk4-NG oscillations adapt in Plk4-NG1/2 embryos by reducing A and increasing T to maintain a relatively constant U, we previously showed that halving the genetic dose of Plk4 led to the centrioles growing slowly, but for a longer period of time, to maintain a constant size (Aydogan et al., 2018). In contrast, we would predict that daughter centrioles in asl1/2 embryos should grow more slowly (as A is decreased), but for a normal period (as T is unchanged), and so centrioles would be too short (as U decreases). We measured the parameters of daughter centriole growth in asl1/2 embryos and confirmed that this was the case (Figure 4C). Together, these experiments suggest that the Plk4 oscillatory network functions to maintain a constant centriole size even when Plk4 levels vary.
Plk4 Oscillations Can Execute Centriole Duplication Independently of a Robust Cdk/Cyclin Cell-Cycle Oscillator
Although the Plk4 oscillations in fly embryos are normally entrained by the cell-cycle oscillator (CCO) (Figures 1E and 1F), it has long been known that centrioles can continue to duplicate in many systems even when several other aspects of cell-cycle progression are blocked (Balczon et al., 1995; Gard et al., 1990; Sluder et al., 1990). We wondered whether this might be because Plk4 oscillations can continue to drive centriole biogenesis even in the absence of a robust CCO. To test this possibility, we injected embryos with double-stranded RNAs (dsRNAs) targeting the three embryonic mitotic cyclins: A, B, and B3. These embryos arrest in an interphase-like state with intact nuclei that do not duplicate their DNA, but where centrosomes can continue to duplicate (McCleland and O’Farrell, 2008). We initially injected embryos in nuclear cycles 7–8 and monitored Plk4-NG behavior 30 min later. In all such embryos, we observed an initial synchronous round of centriole duplication without NEB (indicating that the CCO was perturbed), followed by one or more rounds of less synchronous centriole duplication (Figures 5A and S2B; Video S3). Strikingly, a normal Plk4-NG oscillation was associated with the first, synchronous, round of centriole duplication, but subsequent oscillations were more variable (Figures 5A and S2B).
We reasoned that any residual Plk4-NG oscillations in these embryos might be triggered by residual CCO oscillations that could trigger centriole duplication, but not DNA synthesis or NEB. While one can never rule out the possibility of residual CCO activity, we tried to overcome this potential problem by examining centriole behavior in embryos in which the CCO was likely to be more fully suppressed by injecting the embryos earlier (nuclear cycles 2–4) and monitoring them later (after 90 min). The centrioles in these embryos were now completely dissociated from the non-dividing nuclei and they appeared to divide stochastically, with some centrioles duplicating one or more times, and others not duplicating at all (Figure S8; Video S4). The CCO coordinates cell-cycle events in normal early embryos by spreading as a chemical trigger wave (Chang and Ferrell, 2013; Deneke et al., 2016), but duplicating centrioles did not detectably trigger the duplication of nearby centrioles (Figure S8F). Thus, the “decision” to duplicate in these CCO-suppressed embryos appears to be largely intrinsic to each individual centriole.
To test whether these stochastic centriole duplications were triggered by Plk4 oscillations, we measured Plk4-NG fluorescence levels at individual centrioles. The raw intensity data were noisy, but duplicating “fertile” centrioles appeared to exhibit more prominent Plk4-NG oscillations than non-duplicating “sterile” centrioles (Figure 5B). Moreover, the average centriolar Plk4-NG fluorescence level (expressed as signal-to-noise ratio [SNR]) was significantly higher at fertile centrioles (Figure S8B), and Plk4-NG SNR values could distinguish fertile and sterile centrioles, correctly predicting centriole fertility or sterility 74% and 71% of the time, respectively (Figures S8C and S8D).
Upon filtering the raw oscillation data, we found that the peaks of the Plk4-NG oscillations (see STAR Methods for a description of peak-calling methodology) were often associated with centriole duplication events (Figure 5B). An unbiased computational analysis of all the 45 fertile centrioles that we observed in 3 different embryos revealed that the computationally detected Plk4-NG oscillation peaks predicted centriole duplication events with high precision (40/49 Plk4-NG peaks were associated with a duplication event that occurred within ±5 min of the peak) and recall (40/52 duplication events occurred within ±5 min of a Plk4-NG oscillation peak) (Figures 5C and 5D). Computer simulations revealed
(C) Graph quantifies the parameters of cartwheel growth—as measured by Sas-6-GFP fluorescence incorporation (Aydogan et al., 2018)—in WT and asl1/2
embryos; data are presented as mean ± SEM. Bar charts quantify growth parameters presented as mean ± SD. N = 17 embryos for each condition; n = 77 and 72
centrioles (mean) per embryo in WT or asl1/2 groups, respectively. Statistical significance was assessed using an unpaired t test with Welch’s correction (for
Gaussian-distributed data) or an unpaired Mann-Whitney test (*p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001; ns, not significant). R² values indicate goodness-of-fit
for the mathematical solutions.
See also Figure S7 and Data S1 (second and third charts).
Figure 5. Plk4 Levels Can Continue to Oscillate and Promote Centriole Duplication Even When the CCO is Perturbed
(A) Graph shows the Plk4-NG oscillations in an embryo injected with dsRNA against cyclin A-B-B3; the schema above the graph illustrates the experimental
protocol. The nuclei in this embryo arrest in interphase, but centrioles go through an additional round of division—centriole separation (CS)—accompanied by a
Plk4 oscillation. See Figure S2B for additional examples; n = 30 centrioles (mean) per embryo.
(B) Graphs show the raw (black lines) and filtered (red lines) fluorescence intensity data of 3 individual ‘‘fertile’’ centrioles and 3 individual ‘‘sterile’’ centrioles within
the same cyclin-depleted embryo. The fertile centrioles duplicate (black dotted lines), and these events were often closely associated with computed Plk4
oscillation peaks (red dotted lines) (see STAR Methods for further details of the peak calling methodology).
(C) An unbiased computational analysis of all 45 fertile centrioles in 3 embryos reveals that >80% of the computationally detected Plk4 oscillation peaks occur
within 5 min of an experimentally observed duplication event. A simulation with randomly distributed centriole duplication events and Plk4 oscillation peaks
showed a mean time separation of 10.5 min (data not shown).
(D) Venn diagram shows how, using a 5-min window, the oscillation peaks can be used to predict duplication events with both high precision and high recall (40/49
Plk4 oscillation peaks are associated with a duplication event, and 40/52 duplication events are associated with a Plk4 oscillation peak).
(E) Graph shows the ability of Plk4 oscillation peaks to ‘‘retrieve’’ centriole duplication events across all peak prominences. All detected oscillation peaks were
ranked in order of their peak prominence from high to low (black dots) and assigned uniquely to a duplication event if within a 5-min time window. The graph then
plots the precision and recall values if the threshold for calling a peak were set as the peak prominence value of each peak (in descending order). Below the
detected peak that is associated with a peak prominence threshold of 0.12, the precision dramatically drops, suggesting the existence of a minimum peak
amplitude for centriole duplication. At this threshold, precision and recall are jointly optimized. Note, if there were no overall correlation between Plk4 peaks and a
duplication event, the integrated area under the curve across all peak prominences or average precision (AP) for the 5-min time window (AP5min) would be ~50%
(given by # duplications/(# duplications + # peaks)); so the score of ~75% indicates a meaningful correlation.
(F) Graph shows the correlation between the time of the computationally determined Plk4 peaks and their respective experimentally observed duplication events.
Correlation strength was examined using Pearson’s correlation coefficient (r < 0.40 weak; 0.40 < r < 0.60 moderate; r > 0.60 strong); significance of correlation
was determined by the p value (p < 0.05) (see STAR Methods for a full description of this analysis).
See also Figures S2 and S8.
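The precision and recall values quoted in (D), and the chance-level average precision quoted in (E), follow from simple counting once peaks have been paired with duplication events. The sketch below reproduces that arithmetic from the reported counts. The greedy one-to-one pairing rule and the toy times are assumptions made for illustration only; the assignment procedure actually used is described in STAR Methods and may differ.

```python
def match_peaks_to_events(peaks, events, window=5.0):
    """Greedily pair each peak with the nearest unused event within `window` minutes."""
    unused = sorted(events)
    pairs = []
    for peak in sorted(peaks):
        candidates = [e for e in unused if abs(e - peak) <= window]
        if candidates:
            nearest = min(candidates, key=lambda e: abs(e - peak))
            unused.remove(nearest)  # each event can be claimed by one peak only
            pairs.append((peak, nearest))
    return pairs


def precision_recall(n_matched, n_peaks, n_events):
    """Precision = matched / all peaks; recall = matched / all events."""
    return n_matched / n_peaks, n_matched / n_events


# Counts reported in the caption: 40 matched pairs, 49 peaks, 52 duplication events.
precision, recall = precision_recall(40, 49, 52)

# Chance-level average precision as stated in (E): #duplications / (#duplications + #peaks).
baseline_ap = 52 / (52 + 49)

# Hypothetical times in minutes, only to exercise the matching rule.
toy_peaks = [10.0, 31.0, 52.0, 80.0]
toy_events = [12.5, 29.0, 60.0]
toy_pairs = match_peaks_to_events(toy_peaks, toy_events)

print(f"precision = {precision:.3f}, recall = {recall:.3f}, baseline AP = {baseline_ap:.3f}")
print("toy pairs:", toy_pairs)
```

With the reported counts this gives a precision of about 0.82 and a recall of about 0.77, against a chance-level average precision of about 0.51.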
that a random distribution of the duplication events leads to an average time of >10 min between the peaks and duplication events, indicating that the observed association was not random. Moreover, a rank ordering of the Plk4-NG oscillations based on amplitude revealed that the higher the amplitude of the oscillation, the more likely it was to be associated with a centriole duplication event (Figure 5E), while plotting the relative timing of the Plk4-NG oscillations and the centriole duplication
events revealed a strong positive correlation (Figure 5F; Pearson r = 0.9580, p < 0.0001). We conclude that individual centrioles can organize autonomous Plk4 oscillations that can drive centriole duplication even in the absence of a robust CCO. This potentially explains how centrioles can continue to duplicate independently of many other cell-cycle events.
The CCO Can Phase-Lock the Plk4 Oscillation to Coordinate Centriole Duplication with Other Cell-Cycle Events
It is widely believed that the CCO acts primarily as a “ratchet” whose activity increases over the cell cycle to trigger the sequential execution of cell-cycle events such as DNA replication, centriole duplication, nuclear envelope breakdown (NEB), and spindle assembly (Stern and Nurse, 1996; Swaffer et al., 2016, 2018). An interesting alternative possibility is that the CCO could act as a “phase-locker” whose function is simply to entrain the phase of a network of autonomous oscillations, each of which is responsible for the execution of a specific cell-cycle event (Lu and Cross, 2010). The Plk4 oscillation appears to time and execute centriole biogenesis, and it can trigger centriole duplication independently of a robust CCO, so it is an excellent candidate for such an autonomous oscillation.
To better understand how the CCO might entrain the Plk4 oscillation, we measured the average period of the stochastic Plk4 oscillations in cyclin-depleted embryos (20.5 ± 4.6 min) and compared this to the average period of the Plk4 oscillations in cycles 11–12 (11.7 ± 0.7 min) and 12–13 (14.9 ± 1.7 min). The natural period of the autonomous Plk4 oscillation in these early embryos is therefore similar to, but slightly slower than, the period of the Plk4 oscillations normally enforced by the CCO, indicating that the CCO could entrain the Plk4 oscillation by speeding up a phase of its natural cycle.
To examine which phase this might be, we tested for correlations between various parameters of the Plk4 oscillation and the length of S- or M-phase. During cycles 11–13, we observed a significant correlation between the timing of the Plk4-NG oscillation trough in M-phase and the duration of M-phase (Figure 6, lower scatterplots in the light yellow panel), suggesting that the CCO entrains the Plk4 oscillation by speeding it up during M-phase. This is consistent with our minimal model, in which the CCO entrains the Plk4 oscillation by ensuring the rapid and coordinated dephosphorylation of Asl during M-phase (Figures S4A and S4B).
We also noticed an additional correlation between the peak of the Plk4-NG oscillation and S-phase length in cycle 13 (Figure 6, upper rightmost scatterplot in the light yellow panel). This is not surprising, as a Wee1-dependent checkpoint dramatically slows the CCO (and many other aspects of S-phase progression), particularly during nuclear cycle 13 (Deneke et al., 2016; Stumpff et al., 2004). Moreover, in Wee1−/− embryos, the correlation between the Plk4-NG oscillation trough and M-phase length was maintained (Figure 6, lower rightmost scatterplot in the light yellow panel), while the correlation between the Plk4-NG oscillation peak and S-phase length was lost (Figure 6, upper rightmost scatterplot in the light yellow panel), demonstrating that Wee1 can influence the Plk4 oscillation in S-phase. Interestingly, the cytosolic levels of Plk4-NG were essentially the same in wild-type (WT) and Wee1−/− embryos (Figure S6E), indicating that cell-cycle regulators can influence the Plk4 oscillation without changing Plk4’s cytosolic concentration. This supports the model prediction that the drop in cytosolic Plk4 levels at successive nuclear cycles (Figure 3E) is not, on its own, sufficient to account for the change in Plk4 oscillation parameters we observe from cycles 11–13. This presumably explains why the model requires several parameters to change slightly at each successive cycle to best fit the data (Data S1, first chart).
Taken together, our observations are consistent with the phase-locker model of cell-cycle regulation (Lu and Cross, 2010). We propose that the Plk4 oscillation may be an exemplar of an autonomously oscillating system that can independently drive a cellular event (centriole duplication), but that is normally phase-locked by the CCO to ensure its proper coordination with other biological events and with cell division.
A Model to Generate Autonomous Plk4 Oscillations in the Absence of a CCO
How can a Plk4 oscillation be generated independently of the CCO? Our mathematical model (model 2 in STAR Methods) cannot explain this, as it requires a PPTase to reset the system specifically during M-phase (Figures S4A and S4B). Interestingly, if we extend the model to allow the PPTase to have a constant low level of activity (10% of the level normally required to reset the system in M-phase) (Figure S8G), this new model (model 3 in STAR Methods) recapitulates several features of centriole duplication in the cyclin-depleted embryos (Figure S8H). This model predicts that after a last round of mitosis the centrioles in the cyclin-depleted embryos will undergo a single synchronous Plk4 oscillation (as all of the Asl receptors start this first cyclin-depleted cycle in a dephosphorylated state), but subsequent Plk4 oscillations rapidly dampen as the individual Asl receptors lose synchrony, and the system tends toward a steady state, where some of the centriolar Asl receptors are Plk4-bound and being phosphorylated, while others are not Plk4-bound and are being dephosphorylated (Figure S8H). Intriguingly, the inherent noise in the system generated stochastic Plk4 oscillations that could plausibly drive centriole duplication (Figure S8H), potentially mimicking the stochastic Plk4 oscillations and centriole duplication events that we observe in the cyclin-depleted embryos (Figure 5B).
In this model, each Asl receptor effectively behaves as an independent oscillator, alternating between a Plk4-bound form that is being phosphorylated and a non-Plk4-bound form that is being dephosphorylated. In the presence of the CCO, the Asl receptors generate coordinated Plk4 oscillations because the CCO synchronizes them every nuclear cycle by providing a coordinated burst of PPTase activity during mitosis.
Plk4 Oscillations Are Detectable in Non-dividing Mouse Liver Cells and Can Be Entrained by the Circadian Clock
In species as distant as cyanobacteria and mammals, the CCO can be entrained to the circadian clock (Matsuo et al., 2003; Yang et al., 2010). We wondered, therefore, whether the autonomous Plk4 oscillation could also be entrained by the circadian clock. We examined a recently published diurnal proteome from non-regenerating mouse liver (Wang et al., 2018), where hepatocytes, the major building blocks of the liver, are largely quiescent (Friedman, 2000). Several key cell-cycle regulators (such as Cdk1, cyclin E,
Figure 6. The CCO Phase-Locks the Plk4 Oscillations in Mitosis of Cycles 11–13 Independently of Wee1 and in Interphase of Cycle 13 in a
Wee1-Dependent Manner
Scatterplots illustrate correlations between various parameters of the Plk4 oscillation and the length of S- or M-phase in nuclear cycles 11–13 in WT and Wee1−/−
embryos (see Plk4-NG smooth curve fitting and parameter extraction in STAR Methods for details of how these parameters [along with their descriptions] were
obtained in an unbiased way). During cycles 11–13, there is a significant correlation between the timing of the Plk4 oscillation trough in M-phase and the duration
of M-phase (lower scatterplots in the light yellow panel), suggesting that the CCO entrains the Plk4 oscillation during M-phase. This entrainment is not altered in
the Wee1−/− embryos. During nuclear cycle 13, there is an additional correlation between the peak of the Plk4 oscillation and S-phase length that is lost in the
Wee1−/− embryos (upper rightmost scatterplot in the light yellow panel). The plots for the WT group were generated with the data obtained from Figures 1A and
S2A, as well as 5 additional embryos of the same genotype. N = 10 embryos; n = 23 centrioles (mean; starting from cycle 11) per embryo in Wee1−/− group.
Correlation strength was examined using Pearson’s correlation coefficient (r < 0.40 weak; 0.40 < r < 0.60 moderate; r > 0.60 strong); significance of correlation
was determined by the p value (p < 0.05).
See also Figures S4 and S6E.
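The strength categories used in this caption can be applied mechanically once Pearson's r has been computed. The following is a minimal sketch with hypothetical paired measurements (M-phase duration and trough timing, in minutes), not data from the study. Two choices are assumptions: the labels are applied to the absolute value of r, and values exactly at 0.40 or 0.60, which the caption leaves unassigned, are treated as moderate. The significance test (p value) is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r):
    """Label |r| using the caption's cutoffs (boundary handling is an assumption)."""
    magnitude = abs(r)
    if magnitude < 0.40:
        return "weak"
    if magnitude <= 0.60:
        return "moderate"
    return "strong"


# Hypothetical values in minutes, for illustration only.
m_phase_duration = [4.0, 4.5, 5.0, 5.5, 6.0]
trough_timing = [3.1, 3.4, 4.2, 4.4, 5.0]

r = pearson_r(m_phase_duration, trough_timing)
print(f"r = {r:.3f} ({correlation_strength(r)})")
```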
cyclin B1, and Plk1) were not detectable at any stage of the diurnal cycle, confirming that these cells were largely quiescent. In contrast, Plk4 protein (but not transcript) levels exhibited a striking oscillation that was entrained to the light/dark cycles (Figures 7A–7C). We presume that this oscillation is sub-threshold for centriole biogenesis (because centrioles should not be duplicating in these non-dividing cells) and simply reflects the ability of the Plk4 system to oscillate in a way that can be entrained by the circadian clock.
In our model, a mitotic PPTase that dephosphorylates Asl receptors out of phase with Plk4 is required to generate Plk4 oscillations (Figures S4A and S4B). We therefore used the mouse dataset to examine the behavior of the mouse homologs of all the mitotic PPTase subunits that function in flies (Chen et al., 2007). Among the 27 PPTase subunits examined, only PPP2CB exhibited a clear oscillatory behavior that is similar to Plk4, and these oscillations were precisely out of phase with the Plk4 oscillation (Figure 7D, highlighted with a red dotted frame). Intriguingly, PPP2CB is the homolog of Mts, the catalytic subunit of PP2A in Drosophila that localizes to centrosomes specifically during mitosis in fly cells, and its knockdown leads to centrosome duplication defects (Dobbelaere et al., 2008). Thus, PP2A is an excellent candidate for the PPTase that may normally dephosphorylate centriolar Asl during mitosis.
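One way to make “precisely out of phase” concrete is to check that two series with the same period are strongly anti-correlated at zero lag and best aligned at a shift of half a period. The sketch below does this for synthetic sine waves. The 3-h sampling interval, the 24-h period, and the series themselves are assumptions for illustration, not the proteome data, and this is not necessarily the analysis used in the paper.

```python
import math


def zscore(values):
    """Standardize a series to zero mean and unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def lagged_correlation(a, b, lag):
    """Correlation between a[i] and b[i + lag], wrapping around the series."""
    za, zb = zscore(a), zscore(b)
    n = len(za)
    return sum(za[i] * zb[(i + lag) % n] for i in range(n)) / n


def best_lag(a, b, max_lag):
    """Lag (in samples, 0 <= lag < max_lag) at which the two series align best."""
    return max(range(max_lag), key=lambda lag: lagged_correlation(a, b, lag))


# Synthetic example: two 24-h rhythms sampled every 3 h over 48 h (hypothetical).
SAMPLE_INTERVAL_H = 3
PERIOD_H = 24
times = range(0, 48, SAMPLE_INTERVAL_H)
plk4_like = [math.sin(2 * math.pi * t / PERIOD_H) for t in times]
pptase_like = [math.sin(2 * math.pi * t / PERIOD_H + math.pi) for t in times]

zero_lag_r = lagged_correlation(plk4_like, pptase_like, 0)
lag_samples = best_lag(plk4_like, pptase_like, PERIOD_H // SAMPLE_INTERVAL_H)
lag_hours = lag_samples * SAMPLE_INTERVAL_H

print(f"zero-lag r = {zero_lag_r:.2f}; best alignment at a shift of {lag_hours} h")
```

For exactly anti-phase series the zero-lag correlation approaches −1 and the best alignment is at half the period (12 h here); real, noisy data would give weaker values.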
Remarkably, 8% of the 6,800 proteins in the mouse dataset exhibited a 24 h-entrained oscillatory behavior. It is unclear why so many proteins oscillate in this way, or whether any of these oscillations are of functional significance. Nevertheless, these observations indicate that there are many other proteins, and so perhaps many different biological processes, that have a largely under-appreciated ability to oscillate.
Concluding Remarks
There is great interest in determining the physical and molecular principles that cells use to regulate the biogenesis of their organelles (Liu et al., 2018; Mukherji and O’Shea, 2014). The idea that an organelle-specific oscillation could time and execute organelle biogenesis has, to our knowledge, not been proposed previously. We suggest that the Plk4 centriole oscillation could be a paradigm for a general mechanism describing the regulation of organelle biogenesis: oscillations in the levels/activity of key regulatory factors essential for organelle biogenesis could precisely time the initiation and duration of the growth process, ensuring that organelles grow at the right time and to the appropriate size. In such a model, the CCO and circadian clocks could act simply as “phase-lockers” (Lu and Cross, 2010; Morgan, 2010), whose function is to entrain the phase of a network of autonomous oscillators to ensure that biological processes occur in a coordinated manner.
STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following:
• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead Contact
  ◦ Materials Availability
  ◦ Data and Code Availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ◦ D. melanogaster stocks and husbandry
• METHOD DETAILS
  ◦ Hatching experiments
  ◦ Synthesis of double-stranded RNA
  ◦ Embryo collections and dsRNA injections
  ◦ Immunoblotting
  ◦ Image acquisition, processing, and analysis
  ◦ Analysis of centriole “fertility” in embryos injected with dsRNA against cyclin A-B-B3
  ◦ Spatiotemporal heatmap of centriole duplications
  ◦ Spatial clustering assessment of centriole duplications
  ◦ Plk4-NG smooth curve fitting and parameter extraction
  ◦ 3D-Structured Illumination Microscopy (3D-SIM)
  ◦ Mathematical modeling and its experimental validation
  ◦ Model 2: Generating robust Plk4 oscillations entrained by the CCO
  ◦ Model 3: Stochastic duplications
  ◦ Fluorescence Correlation Spectroscopy (FCS)
  ◦ FCS background corrections
  ◦ Data restriction
  ◦ Peak Counting Spectroscopy (PeCoS)
• QUANTIFICATION AND STATISTICAL ANALYSIS
SUPPLEMENTAL INFORMATION
Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.018.
ACKNOWLEDGMENTS
We are grateful to Laura Hankins, Fabio Echegaray Iturra, Marjorie Fournier, Christoffer Lagerholm, and Bela Novak for advice and discussion and Alissa M. Kleinnijenhuis and members of the Raff laboratory for critically reading the manuscript. Microscopy was performed at the Micron Oxford Advanced Bioimaging Unit, funded by a Strategic Award from the Wellcome Trust (107457). The research was funded by a Wellcome Trust Senior Investigator Award (104575 to T.L.S., M.M., Z.M.W., A.W., S.S., and Z.A.N.), Edward Penley Abraham Scholarships (to M.G.A. and L.G.), a Cancer Research UK Oxford Centre Prize DPhil Studentship (C5255/A23225 to S.-S.W.), a Balliol Jason Hu Scholarship (to S.-S.W.), a Clarendon Scholarship (to S.-S.W.), and Ludwig Institute for Cancer Research funding (to F.Y.Z.). M.A.B. was supported by a Biotechnology and Biological Sciences Research Council grant (BB/N016858/1) and the St. Cross Emanoel Lee Junior Research Fellowship.
AUTHOR CONTRIBUTIONS
This study was conceptualized by M.G.A., T.L.S., M.A.B., and J.W.R. Investigation was done by M.G.A., T.L.S., M.M., Z.M.W., L.G., A.W., S.S., and M.A.B. Data were analyzed by M.G.A., T.L.S., Z.M.W., L.G., F.Y.Z., and M.B.A. Methodology was developed by M.G.A., T.L.S., M.M., Z.M.W., S.-S.W., F.Y.Z., M.A.B., and J.W.R. Project was administered by M.G.A., M.A.B., and J.W.R. Resources were shared/made by M.G.A., M.M., L.G., A.W., S.S., S.-S.W., Z.A.N., and M.A.B. Software development was carried out by M.G.A., T.L.S., Z.M.W., S.-S.W., F.Y.Z., and M.A.B. Overall supervision was done by M.G.A., A.G., and J.W.R. Validation experiments/analyses were carried out by M.G.A., A.W., S.S., and J.W.R. M.G.A., T.L.S., A.G., M.A.B., and J.W.R. wrote and edited the draft with significant input from all authors.
Figure 7. Plk4 Levels May Autonomously Oscillate in Mouse Liver Cells Entrained by the Circadian Clock Where Levels of PP2A Catalytic
Subunit Oscillate Precisely out of Phase
(A) Diagram shows the workflow used by Wang et al. (2018) to obtain a diurnal proteome of the whole liver of light/dark-entrained mice.
(B) Graphs reproduced from Wang et al. (2018) show the relative diurnal expression of the circadian clock transcripts Bmal1 and Per1 as internal controls. We re-
analyzed the diurnal proteome produced in this study—comprising a matrix of Z scores for 6,780 proteins identified during 2 circadian cycles (supplemental
dataset 9 from Wang et al. [2018]).
(C) Graphs we derived show the relative protein levels of Plk4 and the cartwheel component STIL (Ana2 in flies). Plk4 levels strongly spike in a periodic manner
every circadian cycle, whereas STIL levels appear to randomly fluctuate and show neither a discernible pattern of oscillation nor any entrainment to the circadian
clock. Because these cells are generally not proliferating, centrioles should not be duplicating, so the Plk4 oscillations are presumably sub-threshold for centriole
biogenesis. Thus, Plk4 oscillations are detectable in non-dividing mammalian cells, where they are entrained by the circadian clock.
(D) Graphs examine in the non-dividing liver cells the behavior of mouse homologs of all the mitotic PPTase subunits that function in flies (Chen et al., 2007).
Among the 27 PPTase subunits examined, only PPP2CB (highlighted with a red dotted frame) exhibited a clear oscillatory behavior that is similar to Plk4, and the
period of these oscillations was precisely out of phase with the Plk4 oscillation.
Liu, T.-L., Upadhyayula, S., Milkie, D.E., Singh, V., Wang, K., Swinburne, I.A., Rogers, G.C., Rusan, N.M., Roberts, D.M., Peifer, M., and Rogers, S.L. (2009).
Mosaliganti, K.R., Collins, Z.M., Hiscock, T.W., Shea, J., et al. (2018). The SCF Slimb ubiquitin ligase regulates Plk4/Sak levels to block centriole
Observing the cell in its native state: Imaging subcellular dynamics in multicel- reduplication. J. Cell Biol. 184, 225–239.
lular organisms. Science 360, eaaq1392. Rüttinger, S., Buschmann, V., Krämer, B., Erdmann, R., Macdonald, R., and
Lu, Y., and Cross, F.R. (2010). Periodic cyclin-Cdk activity entrains an auton- Koberling, F. (2008). Comparison and accuracy of methods to determine the
omous Cdc14 release oscillator. Cell 141, 268–279. confocal volume for quantitative fluorescence correlation spectroscopy.
J. Microsc. 232, 343–352.
Markow, T.A., Beall, S., and Matzkin, L.M. (2009). Egg size, embryonic development time and ovoviviparity in Drosophila species. J. Evol. Biol. 22, 430–434.
Marsh, B.J., Mastronarde, D.N., Buttle, K.F., Howell, K.E., and McIntosh, J.R. (2001). Organellar relationships in the Golgi region of the pancreatic beta cell line, HIT-T15, visualized by high resolution electron tomography. Proc. Natl. Acad. Sci. USA 98, 2399–2406.
Marshall, W.F. (2016). Cell Geometry: How Cells Count and Measure Size. Annu. Rev. Biophys. 45, 49–64.
Matsuo, T., Yamaguchi, S., Mitsui, S., Emi, A., Shimoda, F., and Okamura, H. (2003). Control mechanism of the circadian clock for timing of cell division in vivo. Science 302, 255–259.
McCleland, M.L., and O’Farrell, P.H. (2008). RNAi of mitotic cyclins in Drosophila uncouples the nuclear and centrosome cycle. Curr. Biol. 18, 245–254.
McLamarrah, T.A., Buster, D.W., Galletta, B.J., Boese, C.J., Ryniawec, J.M., Hollingsworth, N.A., Byrnes, A.E., Brownlee, C.W., Slep, K.C., Rusan, N.M., and Rogers, G.C. (2018). An ordered pattern of Ana2 phosphorylation by Plk4 is required for centriole assembly. J. Cell Biol. 217, 1217–1231.
Morgan, D.O. (2010). The hidden rhythms of the dividing cell. Cell 141, 224–226.
Mukherji, S., and O’Shea, E.K. (2014). Mechanisms of organelle biogenesis govern stochastic fluctuations in organelle abundance. eLife 3, e02678.
Nigg, E.A., and Holland, A.J. (2018). Once and only once: mechanisms of centriole duplication and their deregulation in disease. Nat. Rev. Mol. Cell Biol. 19, 297–312.
Nigg, E.A., and Raff, J.W. (2009). Centrioles, centrosomes, and cilia in health and disease. Cell 139, 663–678.
Nilsson, J. (2019). Protein phosphatases in the regulation of mitosis. J. Cell Biol. 218, 395–409.
Novák, B., and Tyson, J.J. (2008). Design principles of biochemical oscillators. Nat. Rev. Mol. Cell Biol. 9, 981–991.
Novak, Z.A., Conduit, P.T., Wainman, A., and Raff, J.W. (2014). Asterless licenses daughter centrioles to duplicate for the first time in Drosophila embryos. Curr. Biol. 24, 1276–1282.
Ohta, M., Ashikawa, T., Nozaki, Y., Kozuka-Hata, H., Goto, H., Inagaki, M., Oyama, M., and Kitagawa, D. (2014). Direct interaction of Plk4 with STIL ensures formation of a single procentriole per parental centriole. Nat. Commun. 5, 5267.
Park, J.-E., Zhang, L., Bang, J.K., Andresson, T., DiMaio, F., and Lee, K.S. (2019). Phase separation of Polo-like kinase 4 by autoactivation and clustering drives centriole biogenesis. Nat. Commun. 10, 4959.
Petrásek, Z., and Schwille, P. (2008). Precise measurement of diffusion coefficients using scanning fluorescence correlation spectroscopy. Biophys. J. 94, 1437–1448.
Schönle, A., Von Middendorff, C., Ringemann, C., Hell, S.W., and Eggeling, C. (2014). Monitoring triplet state dynamics with fluorescence correlation spectroscopy: bias and correction. Microsc. Res. Tech. 77, 528–536.
Schwarz, G. (1978). Estimating the Dimension of a Model. Ann. Stat. 6, 461–464.
Shaner, N.C., Lambert, G.G., Chammas, A., Ni, Y., Cranfill, P.J., Baird, M.A., Sell, B.R., Allen, J.R., Day, R.N., Israelsson, M., et al. (2013). A bright monomeric green fluorescent protein derived from Branchiostoma lanceolatum. Nat. Methods 10, 407–409.
Shcherbo, D., Murphy, C.S., Ermakova, G.V., Solovieva, E.A., Chepurnykh, T.V., Shcheglov, A.S., Verkhusha, V.V., Pletnev, V.Z., Hazelwood, K.L., Roche, P.M., et al. (2009). Far-red fluorescent tags for protein imaging in living tissues. Biochem. J. 418, 567–574.
Sibon, O.C., Stevenson, V.A., and Theurkauf, W.E. (1997). DNA-replication checkpoint control at the Drosophila midblastula transition. Nature 388, 93–97.
Sluder, G., Miller, F.J., Cole, R., and Rieder, C.L. (1990). Protein synthesis and the cell cycle: centrosome reproduction in sea urchin eggs is not under translational control. J. Cell Biol. 110, 2025–2032.
Somvanshi, P.R., Patel, A.K., Bhartiya, S., and Venkatesh, K.V. (2015). Implementation of integral feedback control in biological systems. Wiley Interdiscip. Rev. Syst. Biol. Med. 7, 301–316.
Stern, B., and Nurse, P. (1996). A quantitative model for the cdc2 control of S phase and mitosis in fission yeast. Trends Genet. 12, 345–350.
Stumpff, J., Duncan, T., Homola, E., Campbell, S.D., and Su, T.T. (2004). Drosophila Wee1 kinase regulates Cdk1 and mitotic entry during embryogenesis. Curr. Biol. 14, 2143–2148.
Swaffer, M.P., Jones, A.W., Flynn, H.R., Snijders, A.P., and Nurse, P. (2016). CDK Substrate Phosphorylation and Ordering the Cell Cycle. Cell 167, 1750–1761.
Swaffer, M.P., Jones, A.W., Flynn, H.R., Snijders, A.P., and Nurse, P. (2018). Quantitative Phosphoproteomics Reveals the Signaling Dynamics of Cell-Cycle Kinases in the Fission Yeast Schizosaccharomyces pombe. Cell Rep. 24, 503–514.
Takao, D., Watanabe, K., Kuroki, K., and Kitagawa, D. (2019). Feedback loops in the Plk4-STIL-HsSAS6 network coordinate site selection for procentriole formation. Biol. Open 8, bio047175.
Tinevez, J.-Y., Perry, N., Schindelin, J., Hoopes, G.M., Reynolds, G.D., Laplantine, E., Bednarek, S.Y., Shorte, S.L., and Eliceiri, K.W. (2017). TrackMate: An open and extensible platform for single-particle tracking. Methods 115, 80–90.
Tsai, T.Y.-C., Choi, Y.S., Ma, W., Pomerening, J.R., Tang, C., and Ferrell, J.E., Jr. (2008). Robust, tunable biological oscillations from interlinked positive and negative feedback loops. Science 321, 126–129.
van Breugel, M., Hirono, M., Andreeva, A., Yanagisawa, H.-A., Yamaguchi, S., Nakazawa, Y., Morgner, N., Petrovich, M., Ebong, I.-O., Robinson, C.V., et al.
STAR★METHODS
REAGENT or RESOURCE | SOURCE | IDENTIFIER
D. melanogaster: Plk4-GFP / Cyo; Plk4Aa74 / Plk4Aa74 | Aydogan et al., 2018 | N/A
D. melanogaster: Plk4-mNeonGreen, Plk4Aa74 / Plk4Aa74 | This paper | N/A
D. melanogaster: Asl-mCherry / +; Plk4Aa74 / + | This paper | N/A
D. melanogaster: Plk4-mNeonGreen / +; Plk4-mNeonGreen, Plk4Aa74 / Plk4Aa74 | This paper | N/A
D. melanogaster: Plk4-mNeonGreen / +; Plk4-mNeonGreen, Plk4Aa74 / aslB46, Plk4Aa74 | This paper | N/A
D. melanogaster: Asl-GFP / +; aslB46 / aslB46 | This paper | N/A
D. melanogaster: Plk4-GFP / Cyo; aslB46, Plk4Aa74 / Plk4Aa74 | This paper | N/A
D. melanogaster: Sas-6-GFP / +; aslB46 / + | This paper | N/A
D. melanogaster: Asl-13A-mKate2 / Asl-13A-mKate2; aslB46 / aslB46 | This paper | N/A
D. melanogaster: Asl-mKate2 / Asl-mKate2; aslB46 / aslB46 | This paper | N/A
D. melanogaster: Asl-13A / +; Plk4-mNeonGreen, Plk4Aa74 / Plk4-mNeonGreen, Plk4Aa74 | This paper | N/A
D. melanogaster: Asl / +; Plk4-mNeonGreen, Plk4Aa74 / Plk4-mNeonGreen, Plk4Aa74 | This paper | N/A
D. melanogaster: wee1* / wee1*; Plk4-mNeonGreen, Plk4Aa74 / Plk4-mNeonGreen, Plk4Aa74 | This paper | N/A
Oligonucleotides
Primers to introduce the NheI restriction enzyme sites into the mCherry C-terminal Gateway vector, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to replace the mCherry tag with mNeonGreen by homologous recombination on the destination vector, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to replace the mCherry tag with mKate2 by homologous recombination on the destination vector, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to remove the NheI restriction enzyme sites from the destination vector via site-directed mutagenesis (mNeonGreen vector), see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to remove the NheI restriction enzyme sites from the destination vector via site-directed mutagenesis (mKate2 vector), see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to amplify Cyclin A, B or B3, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to introduce various site directed mutations for Asl-13A construct, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to delete mKate2 to generate endogenous Asl-13A construct without a fluorescent tag, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to generate endogenous Asl construct without a fluorescent tag, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Recombinant DNA
mCherry C-terminal Gateway vector | Basto et al., 2008 | N/A
pDONR-Zeo vector | Thermo Fisher Scientific | Cat# 12535035
mNeonGreen vector | Shaner et al., 2013 | N/A
mKate2 vector | Shcherbo et al., 2009 | N/A
Asl-mKate2 P-element transformation vector | This study | N/A
Software and Algorithms
Fiji (ImageJ) | National Institutes of Health | https://imagej.nih.gov/ij/
TrackMate | Tinevez et al., 2017 | https://imagej.net/TrackMate
Prism 7 and 8 | GraphPad | https://www.graphpad.com/scientific-software/prism/
Scipy’s find_peaks function | Jones et al., 2001 | https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks.html
Asymmetric baseline smoothing | Eilers and Boelens, 2005 | N/A
Zen Black Software | Zeiss | https://www.zeiss.com/microscopy/us/products/microscope-software/zen.html
FoCuS-Point Software | Waithe et al., 2016 | N/A
The equations used for mathematical modeling and regressions | This paper | https://github.com/RaffLab/centriole_oscillator_model
Python script to automate PeCoS analysis | This paper | https://github.com/RaffLab/centriole_oscillator_model
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Jordan W.
Raff (jordan.raff@path.ox.ac.uk).
Materials Availability
All unique/stable reagents generated in this study are available from the Lead Contact without restriction, unless for commercial
application, in which case a completed Materials Transfer Agreement will be requested. Availability of the dsRNA cocktails produced
in this study is restricted, as they last for only 6 months without degradation (if preserved under the conditions indicated
in the STAR Methods); we therefore recommend making these cocktails fresh using the protocol described in the
STAR Methods. Fly alleles and plasmids (with original source species) generated in this study will be requested by FlyBase admin-
istration to deposit onto FlyBase public archives within 6 months following the publication of this study. Compound and recombinant
flies are deposited to the Lead Contact’s laboratory stocks (without direct public access), but are available without restriction upon
request.
without restriction, via file transfer systems, when requested from the Lead Contact – unless for commercial application, in which
case a completed Materials Transfer Agreement will be requested.
METHOD DETAILS
Hatching experiments
To measure embryo hatching rates, 0-3 h embryos were collected and aged for 24 h, and the % of embryos that hatched out of their
chorion was calculated.
Immunoblotting
Immunoblotting was performed as described previously (Aydogan et al., 2018). Primary antibodies used in this study are as follows:
mouse anti-GFP (Roche; RRID: AB_390913) and mouse anti-Actin (Sigma; RRID: AB_476730). Both antibodies were used at
1:500 dilution in blocking solution (Aydogan et al., 2018). For all blots, 10, 20 or 30 staged early embryos were boiled in sample buffer
and loaded in each lane. The incubation period for primary antibodies was 1 h (or overnight at 4°C). Membranes were quickly washed
3x in TBST (TBS and 0.1% Tween 20) and then incubated with HRPO-linked anti-mouse IgG (both GE Healthcare) diluted 1:3,000 in
blocking solution for 45 min. Membranes were washed 3 x 15 min in TBST and then incubated in SuperSignal West Femto Maximum
Sensitivity Substrate (Thermo Fisher Scientific). Membranes were exposed to film using exposure times that ranged from <1 to 60 s.
Analysis of centriole ‘‘fertility’’ in embryos injected with dsRNA against cyclin A-B-B3
In experiments where we depleted embryos of mitotic cyclins during early rounds of nuclear division, we observed qualitatively that
‘‘fertile’’ centrioles exhibited distinct Plk4-NG fluorescence peaks that often appeared to correlate with centriole duplication events,
while ‘‘sterile’’ centrioles exhibited no obvious peaks (Figure 5B). To test if we could more quantitatively distinguish between fertile
and sterile centrioles, we computationally analyzed all 81 centrioles that we could track throughout the observation period in 3
different embryos. We first assessed the average signal-to-noise ratio (SNR) of Plk4-NG fluorescence of each centriole over the entire
observation period and found that fertile centrioles exhibited a significantly higher SNR than sterile centrioles, as assessed using a t
test assuming equal variance (Figure S8B). Within each class (sterile or fertile), the SNR distribution was unimodal and symmetric
(Figure S8C), so we attempted to classify centrioles in an unbiased way by thresholding the SNR. Based on the
bimodality of the pooled SNR distribution, an automatic threshold was determined from the data using Otsu thresholding (red dashed line, Figure S8C);
the classification performance was summarized in a visual confusion matrix, which shows the proportion of correctly and falsely
classified signals (Figure S8D). This unbiased computational method correctly classified 74% of the fertile centrioles and 71% of
the sterile centrioles.
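The thresholding step can be illustrated with a minimal sketch of Otsu's method applied to a list of per-centriole SNR values. This is not the authors' script: the function names, the bin count, and the example values are assumptions chosen for illustration only.

```python
def otsu_threshold(values, n_bins=64):
    """Return the cut that maximizes the between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # all values identical: no meaningful split
    width = (hi - lo) / n_bins
    # Histogram the values into equal-width bins.
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    total = len(values)
    total_sum = sum(c * m for c, m in zip(counts, centers))

    best_cut, best_score = lo, -1.0
    n_low, sum_low = 0, 0.0
    for i in range(n_bins - 1):
        n_low += counts[i]
        sum_low += counts[i] * centers[i]
        n_high = total - n_low
        if n_low == 0 or n_high == 0:
            continue
        mean_low = sum_low / n_low
        mean_high = (total_sum - sum_low) / n_high
        # Proportional to the between-class variance.
        score = n_low * n_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score, best_cut = score, lo + (i + 1) * width
    return best_cut


def classify(snr_values, threshold):
    """Label each centriole by comparing its SNR to the threshold."""
    return ["fertile" if s > threshold else "sterile" for s in snr_values]


# Hypothetical SNR values, for illustration only.
example_snr = [0.05, 0.06, 0.07, 0.08, 0.30, 0.32, 0.35, 0.33]
example_cut = otsu_threshold(example_snr)
example_labels = classify(example_snr, example_cut)
```

Comparing such labels with the manual fertile/sterile annotation would give the entries of a confusion matrix like the one described above.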
Peak Calling
We next tested whether computationally identified peaks in the Plk4-NG signal were correlated with centriole duplication events.
Plk4-NG peaks were only called on signals whose fluctuation (as measured by signal-to-noise ratio, SNR) was greater than a
defined threshold (0.1, see below). A peak was defined as a local maximum in intensity. To call a peak, the Plk4-NG signal
intensity was compared to the signal intensity at neighboring times. Here an unbiased distance = 1 was set; that is, an intensity at time t
is a peak only if the intensity is higher than those at both t − 1 and t + 1. To filter noise detections, a threshold of 0.1 was placed on the
peak prominence. Peak prominence measures the extent to which a detected peak stands out from its surroundings; it is defined as
the vertical distance between the peak and its lowest contour line (Scipy’s find_peaks function) (Jones et al., 2001). The choice of 0.1
as a threshold was guided by comparing a peak’s predictive power given no cut-off with the (ground truth) duplication time. This
analysis indicated that the optimal peak prominence cut-off (i.e., the point at which the power of the peaks to predict duplication events,
see below, sharply drops off) was 0.12 (green dot, Figure 5E). The observed steep drop-off in predictive power below this threshold
supports the view that there is likely to be a minimal amount of centriolar Plk4-NG that is necessary to trigger duplication under
these conditions. Moreover, this unbiased computational approach identified 4x as many peaks in the fertile centrioles when
compared to the sterile centrioles.
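As a rough illustration of this peak-calling step, the sketch below applies Scipy's find_peaks with distance = 1 and a prominence threshold of 0.1. The wrapper name call_peaks and the toy signal are assumptions, and the trace is assumed to be normalized so that a prominence of 0.1 is meaningful.

```python
import numpy as np
from scipy.signal import find_peaks


def call_peaks(intensity, min_prominence=0.1):
    """Return indices and prominences of local maxima that pass the prominence filter."""
    intensity = np.asarray(intensity, dtype=float)
    # distance=1 compares each point only with its immediate neighbors.
    idx, props = find_peaks(intensity, distance=1, prominence=min_prominence)
    return idx, props["prominences"]


# Toy trace (hypothetical): the small bump at index 3 has prominence 0.05
# and is therefore filtered out as noise.
toy_signal = [0.0, 1.0, 0.0, 0.05, 0.0, 0.5, 0.0]
toy_peaks, toy_prominences = call_peaks(toy_signal)
```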
To determine whether the filtered Plk4-NG peaks were predictive of centriole duplication, we determined all the peaks above the
0.1 threshold for the fertile centriole signals and assessed whether these peaks could be used to ‘‘retrieve’’ the real or relevant time
points for centriole duplication. The performance of such retrieval can be evaluated using ‘‘precision’’ (the number of relevant re-
trievals among all retrieved instances—in this case the number of Plk4 peaks associated with a centriole duplication event divided
by the total number of Plk4 peaks) and ‘‘recall’’ (the number of relevant instances retrieved of the total relevant instances—in this case
the number of Plk4 peaks associated with a centriole duplication event divided by the total number of centriole duplication events), as
defined below.
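$$\mathrm{Precision} = \frac{\text{Number of (Relevant \& Retrieved)}}{\text{Number of (Retrieved)}}, \qquad \mathrm{Recall} = \frac{\text{Number of (Relevant \& Retrieved)}}{\text{Number of (Relevant)}}$$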
The evaluation of such a system naturally depends on the cut-off to call a positive match between the Plk4-NG signal peak and its
corresponding centriole duplication time. Too small a cut-off (e.g., 0 minutes) is unrealistic: no system can predict time perfectly;
while too large a cut-off (e.g., 15 min) is too lenient and non-specific. Figure 6C plots the precision evaluated over all centrioles
for different temporal cut-offs attempting to uniquely match Plk4-NG peaks to the nearest duplication time within a given time win-
dow. The elbow point (red dashed line) at 5 min was selected as an appropriate cut-off with a precision of 80%. (Note that the recall
was not plotted in this graph, but it exhibits a similar behavior to the precision: the Number of (Relevant) = 52, while the Number of
(Retrieved) = 49). This temporal cut-off can also be interpreted as an estimate of the temporal accuracy to which Plk4-NG peak time
associates with centriole duplication time. For comparison, we also derived the mean temporal separation distance of peaks and
duplication events, if the same number of experimental centriole duplication times were randomly distributed over the same time
interval for each embryo. 1000 simulations were run per embryo to produce a distribution. Across all embryos, the average temporal
separation distance (for randomly distributed duplication times) was 10.5 minutes (data not shown), about twice as long as the chosen
5 min cut-off, indicating that the association is unlikely to be coincidental.
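The retrieval evaluation can be sketched as follows. The greedy nearest-first assignment used here to enforce unique matches is an assumption, since the text does not specify how ties were resolved; function names and example times are hypothetical.

```python
def match_peaks(peak_times, dup_times, cutoff=5.0):
    """Uniquely pair peaks with duplication events that lie within `cutoff` minutes."""
    candidates = sorted(
        (abs(p - d), i, j)
        for i, p in enumerate(peak_times)
        for j, d in enumerate(dup_times)
        if abs(p - d) <= cutoff
    )
    used_peaks, used_dups, matches = set(), set(), []
    for _, i, j in candidates:  # closest pairs are assigned first
        if i not in used_peaks and j not in used_dups:
            used_peaks.add(i)
            used_dups.add(j)
            matches.append((i, j))
    return matches


def precision_recall(peak_times, dup_times, cutoff=5.0):
    """Precision = hits / peaks called; recall = hits / duplication events."""
    hits = len(match_peaks(peak_times, dup_times, cutoff))
    precision = hits / len(peak_times) if peak_times else 0.0
    recall = hits / len(dup_times) if dup_times else 0.0
    return precision, recall


# Hypothetical times in minutes: two of three peaks fall within 5 min of an event.
example_precision, example_recall = precision_recall([10, 30, 52], [12, 33, 70, 90])
```

Sweeping cutoff over a range of values and plotting the resulting precision would produce a curve of the kind used above to select the 5 min elbow point.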
In addition, we assessed the precision and recall performance over different possible threshold values (based on peak prominence)
used to call a Plk4 ‘peak’. To do this, we computed the precision-recall curve. All detected Plk4 peaks (black dots; Figure 5E) were
ranked according to their peak prominence from high to low and were assigned uniquely to a duplication event according to a 5 min
time window for determining a positive match. Peaks that could not be uniquely assigned in such a manner were regarded as
‘negatives’. The graph then plots the precision and recall values obtained if the threshold for calling a peak were set at the peak prominence value of
each peak in descending order. Beyond the detected peak associated with a peak prominence of 0.12 (i.e., points right of this point),
the precision drops sharply. At this threshold, precision and recall are jointly optimized. This suggests that a minimum level of
Plk4-NG peak fluorescence intensity is required to predict duplication. The ability of Plk4 peaks to predict duplication across all peak
prominences (over the selected time window of 5 min) is quantified by the integrated area under the curve or average precision
(AP5min). If there were no overall correlation between a Plk4 peak and a duplication event, AP5min would be 51.5% (given by
# duplications / (# duplications + # peaks)); the score of 75% indicates a strong overall correlation (Figure 5E).
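As a quick check of the no-correlation baseline, and assuming that the counts entering this expression are the ones reported above for the 5 min cut-off (52 duplication events and 49 peaks), the arithmetic works out as:

```latex
\frac{\#\,\text{duplications}}{\#\,\text{duplications} + \#\,\text{peaks}}
  = \frac{52}{52 + 49}
  = \frac{52}{101}
  \approx 0.515
```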
Finally, the correlation between the Plk4-NG peaks and times of centriole duplication was examined (Figure 5F), which provided an
alternative accuracy test. Plk4-NG peaks were uniquely matched to the nearest centriole division times without using a temporal
cut-off over individual centrioles from three independent embryos. Pearson correlation r, R² and P values are reported as goodness of fit.
The fitted regression line is y = 0.87x + 3.69. Together, these unbiased computational analyses indicate that the Plk4 oscillations at
individual centrioles are highly correlated with the time at which these centrioles duplicate.
$\lambda$ is the average density of points (estimated as $n/A$, where the total number of points, n, is divided by the area of the region containing
all points, A), and I is the indicator counting function (I = 1 if its operand is true, 0 otherwise). Thus, if points are homogeneously spread
in 2D, the Ripley K statistic should vary quadratically as $\pi d^2$. The basic test assumes (x,y) points occur at any spatial position
continuously in the image. However, centrioles only duplicate in certain discrete positions within fly embryos. Thus, to examine evidence of
spatial clustering from the natural distribution of centrioles, we assessed the difference in the Ripley K statistics computed from the (x,y)
positions of all duplicating centrioles and the (x,y) positions of all centrioles accumulated over time.
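For orientation, the sketch below computes a Ripley K statistic in its textbook form, K(d) = (1 / (λ n)) Σ_{i≠j} I(d_ij < d), without edge correction. Whether the authors used exactly this estimator is an assumption, as are the function name and the example coordinates.

```python
import math


def ripley_k(points, d, area):
    """Naive Ripley K estimate at distance d for (x, y) points in a region of given area."""
    n = len(points)
    density = n / area  # average density of points, n / A
    close_pairs = 0
    for i in range(n):
        for j in range(n):
            # Indicator function: count ordered pairs closer than d.
            if i != j and math.dist(points[i], points[j]) < d:
                close_pairs += 1
    return close_pairs / (density * n)


# Hypothetical positions: three clustered points and one distant point.
example_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
example_k = ripley_k(example_points, d=1.5, area=36.0)
```

Evaluating the same function on duplicating centrioles and on all centrioles, and taking the difference at each d, mirrors the comparison described above.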
$$y(t) = (A + Bt) + \sum_{i=1}^{N} C_i \, e^{-\frac{(t - \mu_i)^2}{\sigma_i^2}}$$
where t is time, A and B are the constant and slope of a linear trend line, and $C_i$, $\mu_i$, and $\sigma_i$ are the amplitude, time and temporal duration of the
ith Gaussian, respectively. This function was fit in two steps. In the first step, A and B were fit by applying least-squares linear regression on
the baseline trend line that is extracted from asymmetric baseline smoothing (Eilers and Boelens, 2005). In the second step, peak and
trough positions were first detected on the de-trended signal, $y'(t)$, obtained by subtracting the trend line fitted in the first step from the
original signal, $y(t)$, so as to determine the number N of Gaussians to fit. The mixture of Gaussians was then fit by iterative non-linear
regression using a robust Cauchy loss function. From the fit signal, $y(t)_{fitted}$, peak and trough positions were re-detected, and the
following signal parameters were extracted (Figure 6):
• Acceleration rate (ΔFluo. A.U./sec): The maximum rate of increase in fluorescence between successive time points during
a trough to peak oscillation phase.
• Deceleration rate (ΔFluo. A.U./sec): The maximum rate of decrease in fluorescence between successive time points during
a peak to trough oscillation phase.
• Oscillation peak time (s): The time point corresponding to the maximum fluorescence.
• Oscillation trough time (s) into mitosis: The end point of a trough after which the fluorescence begins to accelerate upward.
The method described here was also used to determine the period of Plk4 oscillations (measuring peak-to-peak time; see Results
and Discussion) both in normal embryos and in embryos where dsRNA was injected against cyclins A, B and B3 to halt the
progression of cell cycle.
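A minimal sketch of how the fitted curve and the rate parameters might be evaluated is given below. It only evaluates a linear trend plus Gaussians with hypothetical parameter values and reads the rates off successive time points; the fitting itself (baseline smoothing and Cauchy-loss regression) is not reproduced, and all names are assumptions.

```python
import math


def oscillation_model(t, A, B, gaussians):
    """y(t) = (A + B t) + sum_i C_i exp(-(t - mu_i)^2 / sigma_i^2)."""
    return A + B * t + sum(
        C * math.exp(-((t - mu) ** 2) / sigma ** 2) for C, mu, sigma in gaussians
    )


def max_rates(times, values):
    """Largest rise and largest fall per unit time between successive points."""
    slopes = [
        (values[k + 1] - values[k]) / (times[k + 1] - times[k])
        for k in range(len(values) - 1)
    ]
    return max(slopes), min(slopes)


# Hypothetical single oscillation sampled every 30 s: (C, mu, sigma) = (2, 300 s, 100 s).
sample_times = [float(s) for s in range(0, 601, 30)]
sample_fit = [oscillation_model(t, 1.0, 0.0, [(2.0, 300.0, 100.0)]) for t in sample_times]
acceleration, deceleration = max_rates(sample_times, sample_fit)
peak_time = sample_times[sample_fit.index(max(sample_fit))]
```

For a trace with several oscillations, the same rate calculation would be restricted to each trough-to-peak or peak-to-trough segment in turn.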
$$\frac{d[A_1^*]}{dt} = k_2[A_0^*] - k_2[A_1^*] \qquad (2)$$
$$\frac{d[A_N^*]}{dt} = k_2[A_{N-1}^*] - k_3[A_N^*] \qquad (3)$$
$$\frac{d[A_0]}{dt} = -k_1[A_0] \qquad (4)$$
In Equation 5, the positive constant $\hat{A}_0^*$ is the initial amount of Plk4-bound Asl at the centriole at the start of each S-phase, which is
determined experimentally for each cell cycle using the techniques described in the Image acquisition, processing and analysis
section of STAR Methods. The constant $\hat{A}_0$ is the initial amount of unbound Asl, so the total amount of Asl in the system is given by
$A_{tot} = \hat{A}_0^* + \hat{A}_0$.
Since the model specified by Equations 1–5 is a system of linear differential equations with constant coefficients, it has an analytical
solution that can be expressed as a sum of exponentials. Values for the parameters $\hat{A}_0$, $k_1$, $k_2$, and $k_3$ can then be determined by fitting
the curve $[A_0^*](t) + [A_1^*](t) + \dots + [A_N^*](t)$ to the experimentally measured data for the amount of Asl-bound Plk4 (i.e., the Plk4 that is
recruited to the centriole) over time. Fitting was done using a trust-region algorithm to optimize a nonlinear least-squares penalty
function.
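To make the structure of this model concrete, the sketch below integrates the chain of Plk4-bound Asl states numerically (fourth-order Runge-Kutta) rather than using the analytical sum-of-exponentials solution. Several points are assumptions: the form of the first equation (binding at rate k1 feeding the first bound state, which is then phosphorylated at rate k2), the values of k1 and k2, the starting amounts, and the step size. Only k3 = 0.06906 is taken from the text.

```python
def derivatives(state, k1, k2, k3):
    """Right-hand side for [unbound A0, bound A0, bound A1, ..., bound AN]."""
    unbound, bound = state[0], state[1:]
    last = len(bound) - 1
    d_bound = []
    for n, amount in enumerate(bound):
        # The first bound state is fed by binding; later ones by phosphorylation.
        inflow = k1 * unbound if n == 0 else k2 * bound[n - 1]
        # The last bound state releases Plk4 at k3; the others advance at k2.
        outflow = (k3 if n == last else k2) * amount
        d_bound.append(inflow - outflow)
    return [-k1 * unbound] + d_bound


def rk4_step(state, dt, k1, k2, k3):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(slope, h):
        return [x + h * s for x, s in zip(state, slope)]

    s1 = derivatives(state, k1, k2, k3)
    s2 = derivatives(shifted(s1, dt / 2), k1, k2, k3)
    s3 = derivatives(shifted(s2, dt / 2), k1, k2, k3)
    s4 = derivatives(shifted(s3, dt), k1, k2, k3)
    return [
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, s1, s2, s3, s4)
    ]


def bound_plk4_trace(unbound0, bound0, k1, k2, k3, n_sites=9, dt=0.5, steps=3000):
    """Total Asl-bound Plk4 over time (the quantity compared with the data)."""
    state = [unbound0, bound0] + [0.0] * n_sites
    trace = []
    for _ in range(steps):
        trace.append(sum(state[1:]))
        state = rk4_step(state, dt, k1, k2, k3)
    return trace


# Hypothetical k1 and k2; k3 as quoted in the text.
model_trace = bound_plk4_trace(1.0, 0.0, k1=0.01, k2=0.05, k3=0.06906)
```

With these illustrative values the total bound Plk4 rises and then falls as a single pulse, which is the qualitative behavior the fit is meant to capture within one S-phase.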
Parameter fitting
The fitting was constrained to enforce that all parameters were positive, and k1 and k2 were taken to be less than 1. Each cycle was
fitted individually using the discrete Plk4-NG oscillation data from S-phase of cycles 11, 12 and 13 (Figure 3C). Parameter values are
shown in Data S1 (First, second and third charts). As explained below, the solutions to this model are very insensitive to variations in k3
(see Data S1, the Monte Carlo analysis), so in the solutions presented here k3 was kept at a constant value of 0.06906, which was the
best-fit parameter value for cycle 12 (Data S1; first, second and third charts).
Picking the value of N, the number of phosphorylation sites:
In the model, we assumed that Asl had to be phosphorylated by Plk4 N times before it switched to a low-affinity state, indicated by
variables $[A_0^*], \dots, [A_N^*]$. We tested the effect of the number of phosphorylation sites on the model solution by using N = 1, 4, 9, 14, or 16.
The best fit curves for $[A_0^*](t) + [A_1^*](t) + \dots + [A_N^*](t)$ suggested that the model is a good fit for the data for any value of N > 4 (N = 1 (R² =
0.9152), N = 4 (R² = 0.9886), N = 9 (R² = 0.9996), N = 14 (R² = 0.9962) or N = 16 (R² = 0.9931)). So, we use N = 9 corresponding to 10
phosphorylation sites of Asl (1 unbound, 9 bound with various stages of phosphorylation) in all subsequent modeling, although we
note that any value above 4 works essentially equally well.
Data S1, first chart, shows that the trust-region algorithm finds a very good fit (R2 > 0.99) for the model to the experimental data
(Figure 3C), but this provides little information about uniqueness of the fit as there may be other subsets of the parameter space that
also provide a good fit to the data. To see if any such regions could be detected, the parameter space was further explored by using a
Metropolis-Hastings Markov chain Monte Carlo algorithm. Four Markov chains were started at the positions in the parameter space
specified in Data S1 (fourth chart). The Monte Carlo analysis in Data S1 shows the six two-dimensional traces of the four-dimensional
parameter space. For clarity, only points that provided a good fit to the cycle 12 data (R2 > 0.95) are shown.
The results in Data S1 (the Monte Carlo analysis) reveal how sensitive the model is to changes in each parameter value. The model
only fit the data well for a relatively narrow range of values for $k_1$, $k_2$, and $\hat{A}_0$. In contrast, the fits are mostly insensitive to $k_3$. This is
likely because the rate of phosphorylating Asl at multiple sites is relatively slow compared to the rate at which Plk4 is subsequently
released from the multiply phosphorylated Asl, so the rate of release is not limiting. The Monte Carlo simulations also reveal
correlations between $k_1$ and $k_2$, $k_1$ and $\hat{A}_0$, and $k_2$ and $\hat{A}_0$. For example, these results show that if $\hat{A}_0$ (the initial amount of unbound Asl
receptor at the start of S-phase) is reduced, the model can still fit the data well if $k_2$ is decreased and $k_1$ is increased. While these
results suggest that there is a single, continuous region of the parameter space that provides a good fit to the data, it is still possible
that there are other such regions that the Markov chains in Data S1 (the Monte Carlo analysis) did not explore. However, the results in
Data S1 (the Monte Carlo analysis, panel B) show that the points identified at the center of the parameter region provide
the best fit to the data. This suggests that the nonlinear least-squares minimum found by the trust-region fitting is insensitive to the
initial seed.
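The kind of parameter-space exploration described here can be sketched with a generic random-walk Metropolis-Hastings sampler. The Gaussian proposal, the toy one-parameter decay model, the noise scale, and all names below are assumptions for illustration; they are not the authors' settings or their four-parameter model.

```python
import math
import random


def metropolis_hastings(log_post, start, step, n_iter, seed=0):
    """Random-walk Metropolis-Hastings; returns the chain of visited parameter tuples."""
    rng = random.Random(seed)
    current, current_lp = list(start), log_post(start)
    chain = [tuple(current)]
    for _ in range(n_iter):
        proposal = [x + rng.gauss(0.0, s) for x, s in zip(current, step)]
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(tuple(current))
    return chain


# Toy problem (hypothetical): recover the decay rate of y = exp(-0.5 t).
toy_times = list(range(10))
toy_data = [math.exp(-0.5 * t) for t in toy_times]


def toy_log_post(params, sigma=0.05):
    (k,) = params
    if k <= 0:
        return float("-inf")  # enforce positivity of the rate constant
    sse = sum((math.exp(-k * t) - y) ** 2 for t, y in zip(toy_times, toy_data))
    return -sse / (2 * sigma ** 2)


toy_chain = metropolis_hastings(toy_log_post, start=[1.0], step=[0.05], n_iter=5000)
```

Running several chains from different starting points and overlaying their traces is one way to look for additional well-fitting regions of the parameter space.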
Interestingly, the best-fit parameters for cycles 11–13 showed that the biggest difference between the parameters of the Plk4 os-
cillations at each cycle is in k1—the rate at which Plk4 binds to Asl (which is dependent on the cytosolic concentration of Plk4).
Although our model assumes that the cytosolic concentration of Plk4 remains constant during the S-phase period within each cycle,
if the phosphorylated Plk4 molecules that are released from the Asl receptor are ultimately degraded—and there is good evidence
that Asl activates Plk4 to promote Plk4 degradation (Klebba et al., 2015)—there could be a wave of phosphorylated-Plk4 degradation
in the cytoplasm toward the end of S-phase. If so, the cytosolic levels of Plk4 would get successively lower at the start of each suc-
cessive cycle, as our PeCoS analysis indicates is the case (Figure 3E).
The effects of reducing the genetic dose of Plk4 by half (Plk4-NG1/2 embryos—see Results and Discussion for details) were
analyzed. Our PeCoS analysis indicated that there was a 45% drop in the cytosolic levels of the Plk4-NG protein in the Plk4-
NG1/2 embryos (Figure S6C). When the model was fit to Plk4-NG1/2 oscillation, the best-fit (R2 = 0.996) parameter had a k1 value
that was 39% of the control value (Data S1, second chart), so in reasonable agreement with the 45% drop in cytosolic Plk4 levels
we measured experimentally. These parameter values also suggested that the total amount of centriolar Asl ($A_{tot}$) should remain
relatively unchanged between the Plk4-NG and Plk4-NG1/2 conditions (Data S1, second chart). Centriolar Asl levels were analyzed in
embryos expressing Asl-mCherry in either WT or Plk41/2 conditions, and our findings showed that this was indeed the case
(Figure S7A).
Next, the effects of reducing the genetic dose of asl by half (asl1/2 embryos—see Results and Discussion for details) were analyzed.
Interestingly, the best-fit parameter values (R2 = 0.999) predicted that the total centriolar Asl levels (Atot) would be reduced by only
28% in asl1/2 embryos (Data S1, third chart). This value was therefore directly measured in embryos expressing either one or two
copies of Asl-GFP (under the control of its own promoter in an asl mutant background). Encouragingly our findings showed that
reducing the genetic dose of Asl-GFP by half led to a reduction of only 30% in centriolar Asl-GFP levels (Figure S7B). Moreover,
the parameter values suggested that the concentration of Plk4 (incorporated in the k1 term) should not vary significantly between
WT and asl1/2 conditions (Data S1, third chart). We confirmed this prediction using western blotting for Plk4-GFP and PeCoS for
Plk4-NG (both transgenically expressed from their own promoters in a Plk4 mutant background) in control and asl1/2 embryos (Fig-
ures S7C and S7D).
Taken together, these analyses indicate that our model can robustly describe the Plk4-NG oscillations under normal conditions
(Figure 3C) and when the levels of either Plk4 or Asl are perturbed experimentally (Figures 4A and 4B). Moreover, the model makes
several plausible predictions about the relative levels of these proteins in the perturbed conditions that are close to the levels that we
measured experimentally.
Finally, the best-fit value of k2 (reflecting the kinase activity of individual Plk4 molecules) decreased slightly between cycles 11 to 12
and decreased more significantly between cycles 12 to 13 (by 9% and 37%, respectively); k2 also decreased when levels of Plk4
were genetically reduced in Plk4-NG1/2 embryos (by 25%)—but not when Asl levels were genetically reduced in asl1/2 embryos
(Data S1; first, second and third charts). The molecular basis for this inferred decrease in kinase activity remains unknown, but we
believe it is biologically plausible. We previously suggested that centriolar Plk4 was likely to integrate several inputs at the start of
each cycle (from, for example, cell cycle regulators, or its activator Ana2/STIL) and adjust its kinase activity in response to the length-
ening of S-phase during successive nuclear cycles (Aydogan et al., 2018). Moreover, our finding that Wee1 kinase, an important cell
cycle regulator, can influence the Plk4 oscillation parameters in S-phase strongly supports this hypothesis (Figure 6).
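$$\frac{d[A_1^*]}{dt} = k_2[A_0^*] - k_2[A_1^*] - k_4[A_1^*] \qquad (7)$$
$$\frac{d[A_N^*]}{dt} = k_2[A_{N-1}^*] - k_3[A_N^*] - k_4[A_N^*] \qquad (8)$$
$$\frac{d[A_0]}{dt} = k_4[A_1] - k_1[A_0] \qquad (9)$$
$$\frac{d[A_1]}{dt} = k_4[A_2] - k_4[A_1] \qquad (10)$$
$$\frac{d[A_N]}{dt} = k_3[A_N^*] - k_4[A_N] \qquad (11)$$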
It is further assumed that the embryo is in mitosis for 30% of the total time in each nuclear cycle and all cycle times are kept constant.
Hence, $k_4 = 0$ for $0 < t \bmod T < 0.7T$ and a positive constant for $0.7T < t \bmod T < T$, where T is the period of the cell cycle. Values for
the rate constants are determined by fitting the exact analytical solution in the S-phase of cycle 12 to the Lorentzian regression of the
experimental data (R² = 0.9870). In the first instance, we assumed that the cytosolic concentration of Plk4 remains constant over the
nuclear cycles (see below). We plot the exact solution for the percentage of Asl-bound Plk4 molecules for a total of 14 nuclear cycles,
as is the case in fly embryos. This minimal model was sufficient to generate sustained oscillations in centriolar Plk4 levels (Figure S4B;
$k_4$ = 0.0708).
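The periodic switching of the dephosphorylation rate can be written as a small helper. This is only a sketch: the function name and the 600 s cycle length in the example are assumptions, while the 30% mitotic fraction and k4 = 0.0708 are the values quoted above.

```python
def phosphatase_rate(t, period, k4_mitosis=0.0708, mitosis_fraction=0.3):
    """Return k4 at time t: zero outside mitosis, constant during the last 30% of each cycle."""
    phase = (t % period) / period  # position within the current cycle, in [0, 1)
    return k4_mitosis if phase >= 1.0 - mitosis_fraction else 0.0


# Hypothetical 600 s cycle: mid-S-phase versus mitosis.
rate_in_s_phase = phosphatase_rate(300.0, period=600.0)
rate_in_mitosis = phosphatase_rate(540.0, period=600.0)
```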
As an alternative to the assumption that the cytosolic concentration of Plk4 is constant over the cycles, we also considered the
case where the total number of Plk4 molecules in the embryo is kept constant (so any Plk4 degradation is balanced by new synthesis).
In this model, as the number of centrioles, $N_C(t)$, increases at successive cycles, the number of available Plk4 molecules in the
cytosol initially decreases during S-phase (as Plk4 binds to centriolar Asl receptors), and then increases (as Plk4 unbinds from Asl
receptors). We estimate that there are $N_P = 10^5$ molecules of Plk4 in an embryo of 0.01 mm³ volume (Markow et al., 2009); a
concentration of 10 nM, in agreement with that measured in human cells (Yamamoto and Kitagawa, 2019), but potentially higher than we
infer from our observation that Plk4 levels are too low to be measured by FCS (as we cannot infer absolute protein concentration from
our PeCoS experiments). To simulate the effect of centriole duplications, we double the number of centrioles each cycle and assume
that, at each centriole duplication, the bound Plk4 (attached to Asl receptors) is equally split between the mother and separating
daughter. To account for the Cdk/Cyclin trigger wave that sweeps through embryos (Deneke et al., 2016), it is assumed that the
duplicated centrioles separate nearly synchronously over the last 10% of the time-window in each cycle. Based on our 3D-SIM
microscopy data, we assume that each centriole has 30 Asl receptors, as we essentially only need to consider Plk4 binding to Asl at
the site of centriole assembly (the model works well for between 20-80 receptors), and that there is a single centriole in cycle 1. With
these modifications to the model, the system reads
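$$\frac{d[A_0^*]}{dt} = k_1[P][A_0] - k_2[A_0^*] - \frac{1}{N_C}[A_0^*]\frac{dN_C}{dt} \qquad (13)$$
$$\frac{d[A_1^*]}{dt} = k_2[A_0^*] - k_2[A_1^*] - k_4[A_1^*] - \frac{1}{N_C}[A_1^*]\frac{dN_C}{dt} \qquad (14)$$
$$\frac{d[A_N^*]}{dt} = k_2[A_{N-1}^*] - k_3[A_N^*] - k_4[A_N^*] - \frac{1}{N_C}[A_N^*]\frac{dN_C}{dt} \qquad (15)$$
$$\frac{d[A_0]}{dt} = k_4[A_1] - k_1[A_0] + \frac{1}{N_C}[A_0^*]\frac{dN_C}{dt} \qquad (16)$$
$$\frac{d[A_1]}{dt} = k_4[A_2] - k_4[A_1] + \frac{1}{N_C}[A_1^*]\frac{dN_C}{dt} \qquad (17)$$
$$\frac{d[A_N]}{dt} = k_3[A_N^*] - k_4[A_N] + \frac{1}{N_C}[A_N^*]\frac{dN_C}{dt} \qquad (18)$$
$$[P] = 1 - \frac{N_R N_C(t)}{N_P}\sum_{n=0}^{N}[A_n^*] \qquad (19)$$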
At any given time, precisely one entry of V is equal to unity, corresponding to the state which the receptor is in at that moment, and all
other entries are equal to zero. We allow the receptor to change state over time according to the transition matrix:
where
\begin{equation}
P_i = 1 - e^{-k_i}, \qquad P_{i,j} = \frac{k_i}{k_i + k_j}\left(1 - e^{-(k_i + k_j)}\right), \tag{22}
\end{equation}
describe the probabilities of a receptor changing state and remaining in the same state which arise in our model. We may allow the
cytosolic Plk4 concentration to vary in this model by making the substitution k_1 → k_1[P] and using (19) to compute [P] (evaluating the
sum over all receptors being simulated). We also assume that, if a receptor is in a Plk4-bound state (A_n) during the last 10% of a cycle,
the Plk4 will unbind (A_n → [A_n]) with 50% probability during that time period in order to simulate mother-daughter separation.
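As an illustration of how the probabilities in (22) could be used in a discrete-time simulation, the following Python sketch evaluates P_i and P_{i,j} and draws one update for a receptor with two competing exits. It is a minimal sketch, not the authors' implementation: it assumes a unit time step, reads P_i as the chance that a single transition with rate k_i fires within that step, and reads P_{i,j} as the chance that channel i fires when it competes with channel j. The function names `p_single`, `p_competing`, and `step` are hypothetical.

```python
import math
import random


def p_single(k_i):
    """P_i = 1 - exp(-k_i): one exit channel with rate k_i over a unit time step (assumed)."""
    return 1.0 - math.exp(-k_i)


def p_competing(k_i, k_j):
    """P_{i,j} = k_i / (k_i + k_j) * (1 - exp(-(k_i + k_j))): channel i fires while competing with j."""
    total = k_i + k_j
    if total == 0.0:
        return 0.0  # no transition is possible when both rates are zero
    return (k_i / total) * (1.0 - math.exp(-total))


def step(k_i, k_j, rng=random):
    """Draw one time step for a receptor with two competing exits (hypothetical helper).

    Returns "i" or "j" if that transition fires, or "stay" if the receptor keeps its state.
    """
    u = rng.random()
    p_i = p_competing(k_i, k_j)
    p_j = p_competing(k_j, k_i)
    if u < p_i:
        return "i"
    if u < p_i + p_j:
        return "j"
    return "stay"
```

Under this reading, the two competing probabilities sum to the probability that any transition fires, 1 - exp(-(k_i + k_j)), so the remainder is the probability of staying in the same state.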
In this model, each Asl receptor behaves as an independent oscillator, alternating between a Plk4-bound form that is being
phosphorylated and an unbound form that is being dephosphorylated. In the presence of the CCO, the individual Asl receptors generate
coordinated oscillations because the CCO effectively synchronizes them every cycle by ensuring a coordinated burst of PPTase
activity during mitosis. This activity is lost in the absence of the CCO, but instead we allow the PPTase to be active at a low, but constant,
level (10% of the mitotic activity in cycling embryos). We plot the Asl-bound Plk4 levels for a total of 10 centrioles (each with 30
receptors as assumed above; Figure S8H). We observe that the centrioles are initially synchronized, since they all start in an unbound
state, and display a single round of Plk4 binding. However, as time progresses, the Asl receptors lose synchrony, and each centriole
exhibits stochastic, low-amplitude oscillations. Such oscillations may be sufficient to trigger duplications at individual centrioles, as
evident from our experimental observations (Figures 5B–5F and S8B–S8F; Video S4).
All the equations used for mathematical modeling and regressions are available at the following web link:
https://github.com/RaffLab/centriole_oscillator_model
where ⟨ ⟩ denotes a time average, δI(t) describes the intensity fluctuation at the time point t, and τ states the lag time of the
autocorrelation.
All 10 s-recordings were then fitted with 8 different 3D diffusion models using the software FoCuS-Point (Waithe et al., 2016) with
the following equation:
where A_k defines the fraction of a diffusing species for which the sum of all diffusing species equals 1, τ_xy describes the average
residence time of the diffusing species in V_eff, α accounts for anomalous subdiffusion within the cytoplasm, and AR is a structural
parameter that describes the relationship among the x, y and z-axes of the excitation volume.
Dark states of the fluorophore were fitted with the following formula:
\begin{equation}
G_T(\tau) = 1 + \sum_{j=1}^{T_s} \frac{T_j}{1 - T_j} \, e^{-\tau/\tau_{T_j}}
\end{equation}
where T_j depicts the triplet population, and τ_{T_j} states the triplet correlation time during which the fluorophore stays in the dark state
(Schönle et al., 2014).
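The dark-state term can be evaluated numerically as in the following Python sketch. It is a minimal illustration of the formula above rather than the FoCuS-point code; the function name `dark_state_term` and the example values are hypothetical, and the lag time and correlation times are assumed to share the same unit.

```python
import math


def dark_state_term(tau, populations, correlation_times):
    """Evaluate G_T(tau) = 1 + sum_j T_j / (1 - T_j) * exp(-tau / tau_Tj).

    populations       -- dark-state fractions T_j, each in [0, 1)
    correlation_times -- matching correlation times tau_Tj (same unit as tau)
    """
    if len(populations) != len(correlation_times):
        raise ValueError("each dark state needs one population and one correlation time")
    total = 1.0
    for t_j, tau_tj in zip(populations, correlation_times):
        total += t_j / (1.0 - t_j) * math.exp(-tau / tau_tj)
    return total
```

With no dark states the term reduces to 1, so a model such as M1 (0 blinking states, 0 triplet states) leaves the diffusion part of the fit unchanged.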
The data were fitted within the boundaries of 4 × 10^-4 ms and 1.5 × 10^3 ms, and the dark states were restricted to 10-300 ms for the
blinking state, and 1-10 ms for the triplet state. The models (Ms) were defined as the following: M1) 1 diffusing species (ds) 0 blinking
states (bs) 0 triplet states (ts); M2) 1 ds 1bs 0ts; M3) 1 ds 0bs 1ts; M4) 1 ds 1bs 1ts; M5) 2 ds 0bs 0ts; M6) 2 ds 1bs 0ts; M7) 2 ds 0bs
1ts; M8) 2 ds 1bs 1ts. In all models, the structural parameter AR and the anomalous subdiffusion parameter α were kept constant at 5
and 0.7, respectively.
In order to avoid over-fitting the data, the most plausible model to describe the autocorrelation functions was selected using the
Bayesian Information Criterion (BIC), which is based on the likelihood function, but introduces a penalty term for the complexity
(number of variables) of the models (Schwarz, 1978). In this study, M4 was the preferred model to describe Asl-GFP diffusion
(Figure S5A(iv)). The concentration was calculated from the FoCuS-point fit data of the preferred model:
\begin{equation}
\langle N \rangle = \frac{1}{G_0}; \qquad \mathrm{conc.} = \frac{\langle N \rangle}{V_{eff}}
\end{equation}
where ⟨N⟩ states the average number of particles within the effective volume V_eff, and G_0 represents the height of the autocorrelation
function at τ = 0.
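As a worked example of this calculation, the Python sketch below converts an autocorrelation amplitude into a particle number and then into a molar concentration. It is a sketch under stated assumptions: the division by Avogadro's number and the unit choices (V_eff in femtoliters, result in nM) are added here for illustration and are not specified in the text, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def particle_number(g0):
    """<N> = 1 / G_0, the average number of particles in the effective volume."""
    return 1.0 / g0


def concentration_nM(g0, v_eff_fl):
    """conc. = <N> / V_eff, expressed in nM (assumes V_eff is given in femtoliters)."""
    v_eff_litres = v_eff_fl * 1e-15           # 1 fL = 1e-15 L
    moles = particle_number(g0) / AVOGADRO    # molecules -> moles
    return moles / v_eff_litres * 1e9         # mol/L -> nmol/L
```

For example, an amplitude of G_0 = 0.5 corresponds to ⟨N⟩ = 2 particles, which in a hypothetical 1 fL effective volume is roughly 3.3 nM.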
\begin{equation}
\langle N \rangle = \frac{1}{c^{2} G_0}
\end{equation}
where ⟨b⟩ denotes the average background and ⟨f⟩ states the average count rate of the sample.
Data restriction
In some FCS measurements a sudden drop in CPM was observed, possibly due to movements within the embryo or the embryo
drifting away from the measurement plane. When this happened, a strong, often unreasonable increase in concentration was
observed. These outliers were therefore discarded based on a ROUT outlier test (with the aggression factor Q = 1%), which was
performed on all 10 s-long concentration measurements (the red data points in Figure S5(vi)). Only the embryos with at least 4 × 10 s
recordings (after discarding outliers and erratically shaped ACFs) were included in the final analysis.
The details for quantification, statistical tests, sample numbers, definitions of center, and the measures for dispersion and precision
are described in the main text, relevant figure legends, or relevant sections of STAR Methods. Significance in statistical tests was
defined by p < 0.05. To determine whether the data values were normally distributed, a D’Agostino–Pearson omnibus normality
test was applied. Prism 7 and 8 were used for all the modeling and statistical analyses.
Supplemental Figures
Figure S1. Summary of the Protocol for Image Acquisition, Processing, and Analysis of the Plk4-NG Oscillations, Related to Figure 1
(A) Diagram illustrates the centrioles in ~2 h old embryo expressing Plk4-NG being imaged on a spinning-disk confocal system.
(B) Micrograph shows a typical image of the tracks of the Plk4-NG centrioles in S-phase of cycle 12, tracked using the ImageJ plugin, TrackMate.
(C) Graphs show the Plk4-NG oscillation during cycle 12 in a single embryo quantified from the tracks of either several individual centriole pairs (i), or the Mean ±
SD oscillation calculated from the tracks of > 90 centriole pairs (ii). The data for each embryo was then regressed using a Lorentzian equation (red line, iii)—see (D)
for an explanation of the rationale for choosing this function. This process was repeated for multiple embryos to calculate a Mean ± SEM regression for nuclear
cycle 12 (iv). R2 values indicate the goodness-of-fit (Mean ± SD) of the regression. CS = time of centrosome separation (set to 0); NEB = time of nuclear envelope
breakdown.
(D) Table shows the various models that were tested to fit the Plk4-NG oscillation data. R2 and SSAbs (absolute sum of squares) values indicate the goodness of fit.
The Lorentzian function was the best fit for the majority of embryos, so it was used for all further analyses.
Further details of these models are provided in STAR Methods.
Figure S3. Simultaneously Measuring Centriole Growth and the Plk4 Oscillation in the Same Embryos, Related to Figure 2
(A and B) Graphs show the same data presented in Figure 2A, but with the SEM included (as these error bars were omitted from Figure 2A for ease of pre-
sentation). CS = centrosome separation and NEB = nuclear envelope breakdown. R2 values indicate the goodness of fit.
(C) Graph quantifies the embryo hatching frequency in embryos laid by either wild-type (Oregon-R) females or females simultaneously expressing Sas-6-mCherry
and Plk4-NG in a Plk4 mutant background (all mated with WT males). At least 4 technical repeats were carried out over several days, and a total of at least 400
embryos were analyzed.
(D) Cartoon graphs (i.e., imaginary data) illustrate the three different centriole growth phenotypes we observed in the Plk4 mutant embryos that simultaneously
express 2 copies of Plk4-NG and one copy of Sas-6-mCherry. In our previous analysis of centriole growth kinetics (Aydogan et al., 2018) almost all embryos
started to incorporate Sas-6-GFP at the very start of S-phase (‘‘Growth on time,’’ left graph). In the embryos analyzed here (with a more complicated genotype,
and expressing Sas-6-mCherry rather than Sas-6-GFP), some of the embryos exhibited a clear delay in initiating the incorporation of Sas-6-mCherry (‘‘Late
growth,’’ middle graph), while others did not appear to incorporate significant amounts of Sas-6-mCherry at all (‘‘No growth,’’ right graph).
(E) Pie charts quantify the percentage of embryos exhibiting each centriole growth phenotype at each nuclear cycle. Note that embryos exhibiting the ‘‘No
growth’’ phenotype were excluded from the analysis shown in (A) and (B) and in Figure 2A, although the amplitude of the Plk4 oscillations in these embryos was
analyzed separately (Figure 2C): we observed 8 embryos in total that exhibited the ‘‘No growth’’ phenotype (1 in cycle 12, and 7 in cycle 13). Centriolar Plk4-NG
levels continued to oscillate in these embryos, and the scatter graph shown in Figure 2C plots the peak amplitude of the Plk4-NG oscillations in these 8 embryos
overlaid on the average ‘‘threshold’’ level of Plk4-NG at which centrioles started to grow in the population of embryos that did exhibit Sas-6-mCherry incor-
poration. This threshold was very similar at cycle 12 and 13, so the threshold shown in Figure 2C is taken from cycle 13 embryos (as 7 of the 8 embryos shown here
were at cycle 13). The Plk4-NG oscillation in all but one of the 8 embryos failed to reach the average ‘‘threshold’’ level that would normally initiate centriole growth
in these embryos.
Figure S4. Theoretical and Experimental Assessment of Several Assumptions Made in the Mathematical Model, Related to Figure 3
(A) Our mathematical model depicted in Figures 3A and 3B only discretely examines the Plk4-NG oscillation during S-phase of each nuclear cycle. The schematic
here shows our speculation that a phosphatase normally removes the phosphate groups (dotted circles) from Asl (red) during mitosis to reset the system for the
next oscillation at rate k4 (dotted black arrow).
(B) We implemented this step to extend the original model and plotted the mathematical solution for the percentage of Asl-bound Plk4 molecules (black curve) for
a total of 14 nuclear cycles. For simplicity we kept the length of S-phase and mitosis constant through all 14 cycles (see STAR Methods for further details of this
extended model).
(C) Schematic shows the Serine (S) and Threonine (T) residues (in bold) that were mutated to Alanine in the Asl-13A construct. Dark gray boxes show the relative
positions of the previously mapped Plk4-interacting regions within the N-terminal (Dzhindzhev et al., 2010) and C-terminal (Klebba et al., 2015) regions of Asl.
(D) Micrographs show images from time-lapse movies of embryos expressing Asl-WT-mKate2 and Asl-13A-mKate2 (under the control of their own promoters in
an asl mutant background), respectively.
(E) Graphs show the regression data (solid lines) for Plk4-NG oscillations in cycle 12 in embryos expressing either Asl-WT (green) or Asl-13A (dark gray) (both
without any fluorescent tag) simultaneously with Plk4-NG. N ≥ 25 embryos for each condition; n = 71 and 68 centrioles (mean) per embryo in Asl-WT or Asl-13A,
respectively (collection of two trials performed by two independent researchers, blinded for each other’s data). Data are presented as Mean ± SEM. R2 values
indicate goodness-of-fit for the regressions. CS = Centrosome separation; NEB = Nuclear envelope breakdown.
(F) In (B) it is assumed that the cytosolic concentration of Plk4 is kept constant over all cycles. The graph here plots an alternative model where the total number of
Plk4 molecules in the embryo is kept constant at all cycles. The number of centrioles doubles each cycle, and the mathematical solution for the percentage of Asl-
bound Plk4 molecules (black curve), and the percentage of Plk4 molecules that remain in the cytoplasm (red curve), is depicted over 14 nuclear cycles (see STAR
Methods for further details and implications of this model). For the first few nuclear cycles, almost all of the Plk4 remains in the cytoplasm since there are only a few
centrioles. In the later cycles, however, the amount of Plk4 sequestered by the Asl receptors increases exponentially, as the number of centrioles increase by a
factor of 2 in each cycle. Therefore, the rate at which the Asl receptors are able to recruit Plk4 from the cytoplasm decreases, resulting in a reduction in the
amplitude of the Plk4 oscillation. This aspect of the model is consistent with our experimental observations that the amplitude of the Plk4 oscillation decreases at
later cycles (Figure 1), as does the cytosolic concentration of Plk4 (Figure 3E). An alternative, or additional, mechanism that might explain these observations is
that the Plk4 molecules activated by binding to Asl may be more likely to autophosphorylate to stimulate their degradation, so ensuring that more Plk4 is degraded
at each cycle as the number of centrioles increase. Interestingly, in either of these scenarios, increasing centriole numbers leads to increasing Plk4 depletion from
the cytosol, potentially allowing embryos to effectively ‘‘count’’ their centrioles.
Figure S6. Peak Counting Spectroscopy Analysis of Cytosolic Plk4 Levels, Related to Figures 3 and 6
(A) Schematic workflow describes the acquisition and analysis of Peak Counting Spectroscopy (PeCoS) measurements. (i) In addition to embryos expressing
Plk4-NG under its own endogenous promoter, embryos of two other genotypes were placed on the same imaging dish: one expressing a green-fluorescent
centriole marker to allow correction of the spherical aberration caused by coverslip thickness variation, the other expressing Asl-mKate2 to determine the
autofluorescence background threshold for the Plk4-NG expressing embryos. Asl-mKate2 allows one to determine the correct plane (containing the centrioles;
white arrows) for background measurement, while the mKate2 fluorophore does not interfere with the PeCoS measurements. (ii) As for FCS (see Figure S5), a
488 nm laser beam is positioned near the cortex of embryos, and the measurements are taken at a single point in the cytosol (red crosshairs) at the beginning of
S-phase, but for 1 × 180 s, in both control and Plk4-NG expressing embryos (iii). Afterward, (iv) an appropriate threshold is calculated from the control embryos, so
that the background contributes fewer than 5 peaks on average during each recording. Following background subtraction, (v and vi) the number of peaks is
quantified.
(B) To compare the effective linear concentration range of FCS and PeCoS we assessed a two-fold dilution series of the Alexa488 dye. At high dye concentrations,
FCS (black symbols) exhibits a near-linear response, while PeCoS (gray symbols) is saturated—presumably because there are too many fluorophores in the
effective volume (Veff) for them to be measured as individual peaks. At intermediate dye concentrations, both methods exhibit a near linear response. At low
concentrations (< ~0.2 nM), however, FCS becomes unreliable while PeCoS continues to have a near-linear response.
(C) The bar chart shows the in vivo validation of PeCoS. A significant difference in the number of peaks per minute was observed between embryos expressing
either 1x or 2x copies of Plk4-NG (under the control of its endogenous promoter), which were measured at the beginning of S-phase in nuclear cycle 12. Each data
point represents a 180 s recording from a single embryo. Statistical significance was assessed using Mann-Whitney test (***p < 0.001). Data are presented as
Mean ± SD.
(D) Western blot analysis of Plk4-GFP (arrow) levels in early and late embryos supports the conclusion from the PeCoS analysis (Figure 3E) that cytosolic Plk4
levels are lower in late embryos than in early embryos. Prominent non-specific bands are indicated (*). A representative blot is shown from two technical repeats.
(E) The bar chart compares the cytosolic levels of Plk4-NG (under the control of its endogenous promoter; at the beginning of S-phase in Cycle 13) between WT
and Wee1−/− embryos (the same genotypes as in Figure 6). Statistical significance was assessed using an ordinary unpaired t test (ns, not significant). Data are
presented as Mean ± SD.
Figure S7. Quantification of Centriolar Asl and Cytosolic Plk4 Levels When the Genetic Dose of asl or Plk4 Is Halved, Related to Figure 4
(A) Micrograph shows an image of Asl-mCherry at centrioles in an embryo in early S-phase (just after centrosome separation). Bar charts quantify the average
centriolar Asl-mCherry levels in early S-phase in either WT embryos (WT) or in embryos where the genetic dose of Plk4 has been halved (Plk41/2). N = 17 embryos
for each condition; n = 67 and 58 centrioles (mean) per embryo in WT or Plk41/2 groups, respectively. Average centriolar Asl levels do not change significantly
when the genetic dosage of Plk4 is halved, in agreement with the prediction of our model (see Data S1; first, second and third charts).
(B) Same schema as (A), but showing the localization of Asl-GFP, and quantifying the centriolar levels of Asl-GFP in asl mutant embryos expressing either 1 (Asl-
GFP1x) or 2 (Asl-GFP2x) copies of Asl-GFP. N = 10 embryos for each condition; n = 59 and 54 centrioles (mean) per embryo in Asl-GFP1x or Asl-GFP2x groups,
respectively. This analysis reveals that centriolar Asl-GFP levels drop by ~30% when the genetic dosage of Asl-GFP is halved, in good agreement with the
prediction of our model (see Data S1; first, second and third charts). Data are represented as Mean ± SEM. Statistical significance was assessed using an
unpaired t test with Welch’s correction (for Gaussian-distributed data) or an unpaired Mann-Whitney test (**p < 0.01; ns, not significant).
(C) Western blot compares the protein levels of Plk4-GFP (arrow) (expressed under the control of its own promoter in a Plk4 mutant background) in otherwise WT
embryos or in embryos in which the genetic dosage of asl has been halved. This analysis reveals that Plk4-GFP levels in the embryo do not change dramatically
when the genetic dosage of asl is halved, in agreement with the prediction of our model. WT embryos (Lane 1) are shown as a negative control to demonstrate that
the Plk4-GFP band is only detected in embryos expressing Plk4-GFP. Prominent non-specific bands are indicated (*). Actin is shown as a loading control. A
representative blot is shown from two technical repeats.
(D) The bar chart compares the number of Plk4-NG peaks per minute that was observed between normal embryos (WT) or embryos where the genetic dose of Asl
was halved (asl1/2). Measurements were performed at the beginning of S-phase in nuclear cycle 12. Each data point represents a 180 s recording from a single
embryo. Statistical significance was assessed using an ordinary unpaired t test (for Gaussian-distributed data) or a Mann-Whitney test (ns, p > 0.05). Data are
presented as Mean ± SD.
Figure S8. The Average Centriolar Plk4-NG Level on Individual Centrioles Can Be Used to Predict Stochastic Centriole Duplications in
Embryos Arrested in Interphase by Mitotic Cyclin Depletion, Related to Figures 5 and S2
(A) The pie chart quantifies the percentage of centrioles that continued to duplicate in embryos where cyclin A-B-B3 dsRNA was injected into embryos at nuclear
cycle 2-4, and centriole behavior assessed ~90 min later. Ambiguous (gray) indicates the fraction of centrioles whose duplication state could not be unam-
biguously determined due to their drifting out of focus during imaging.
(B) Bar chart shows the mean signal-to-noise ratio (SNR) of Plk4-NG fluorescence signals from sterile and fertile centrioles (red and green, respectively) through
the entire period of observation. Data are presented as Mean ± SD. Statistical significance of SNR was tested using a t test assuming equal variance (***p < 0.001).
(C) Heatmap histogram of all SNR values from sterile and fertile centrioles. Red dashed line shows the unbiased threshold, determined automatically from Otsu
thresholding for distinguishing sterile and fertile centrioles. Heatmap (Red: Sterile and Green: Fertile) indicates the fraction of fertile/sterile centrioles in each
column. Note that, the higher the SNR, the more fertile the centrioles are.
(D) Confusion matrix shows the classification performance of sterile versus fertile centriole Plk4-NG signals using the Otsu threshold in (C) as a proportion of the
total number of signals, n = 81 centrioles from 3 embryos.
(E) Heatmap plots demonstrate the spatial (x,y) coordinates of all centriole duplication events in a representative cycling embryo (left; at the beginning of cycle 13
when centrioles are separating over the course of ~3.5 min) and non-cycling embryo (right; captured over ~60 min), with each duplication event colored light blue to
dark red to represent early and late time points, respectively. The black points plot the observed spatial (x,y) positions of all centrioles, duplicating and
non-duplicating, at all time points. Note that the duplications in the cycling embryo are spatially and temporally coordinated (tending to divide first at the top of the
embryo and later at the bottom of the embryo), while the duplications in the non-cycling embryo occur over a longer time-scale and do not appear to be
coordinated in space or time.
(F) To test more rigorously whether the centriole duplication events in non-cycling embryos are largely stochastic, we calculated Ripley's K statistics for all the
non-cycling embryos used in (A–D). This statistic provides a measure of whether the temporal duplication events have spatial preference by measuring the average
number of events that occur as a function of distance from individual centrioles. Curves were computed from the (x,y) coordinates of only the duplicating
centrioles (denoted K_divpoint, red line) and of all centrioles at all times (denoted K_allpoints, black line). The sigmoidal increase in the statistic as a function of distance in
both cases suggests that duplication events do not cluster spatially at short distances (< 50-100 pixels). The trend and amplitudes of red and black lines (mean ±
SD) are very similar and fall in each other’s statistical confidence range, indicating that duplicating centrioles do not exhibit additional spatial clustering above the
natural spatial distribution of centrioles.
(G) Schematic depicts the topology of the mathematical model that illustrates how Plk4 oscillations at individual centrioles could be generated to trigger
stochastic duplications in non-cycling embryos. Briefly, we no longer assume that a PPTase acts in discrete bursts during mitosis and instead assume a continuous
low-level PPTase activity (10% of the activity in cycling embryos). We allow individual Asl receptors to bind Plk4 and be phosphorylated until they release Plk4 (as
in our original model), and to be continuously and slowly dephosphorylated by the PPTase. Asl-p and Plk4-p indicate phosphorylated proteins. Bold arrows
indicate the dominant direction of the reactions.
(H) Graph shows how the percentage of Asl receptors that are bound to Plk4 changes over time at 10 individual centrioles, two of which have been colored red or
green. The centrioles are initially synchronized, as their Asl receptors all start in a dephosphorylated, unbound, state and so exhibit a coordinated pulse of Plk4
binding. As time progresses, however, the Asl receptors lose synchrony (as their dephosphorylation is no longer entrained by the CCO), and so each centriole
exhibits low amplitude stochastic oscillations. These oscillations may be sufficient to trigger centriole duplication under these conditions of interphase arrest with
low CCO activity, as evident from our experimental observations (Figures 5B–5F and Video S4; see STAR Methods for full details of the model).
Article
Correspondence
srj2003@med.cornell.edu
In Brief
Analysis of the transcriptome-wide
effects of m6A-mRNA effectors, known as
YTHDF proteins, demonstrates that they
act redundantly to induce degradation of
the same subset of mRNAs, with no
evidence of a direct role in promoting
translation.
Highlights
• YTHDF proteins function together to mediate degradation of
m6A-mRNAs
Article
A Unified Model for the Function of YTHDF Proteins
in Regulating m6A-Modified mRNA
Sara Zaccara1 and Samie R. Jaffrey1,2,*
1Department of Pharmacology, Weill Cornell Medicine, Cornell University, New York, NY 10065, USA
2Lead Contact
*Correspondence: srj2003@med.cornell.edu
https://doi.org/10.1016/j.cell.2020.05.012
SUMMARY
N6-methyladenosine (m6A) is the most abundant mRNA nucleotide modification and regulates critical as-
pects of cellular physiology and differentiation. m6A is thought to mediate its effects through a complex
network of interactions between different m6A sites and three functionally distinct cytoplasmic YTHDF
m6A-binding proteins (DF1, DF2, and DF3). In contrast to the prevailing model, we show that DF proteins
bind the same m6A-modified mRNAs rather than different mRNAs. Furthermore, we find that DF proteins
do not induce translation in HeLa cells. Instead, the DF paralogs act redundantly to mediate mRNA degrada-
tion and cellular differentiation. The ability of DF proteins to regulate stability and differentiation becomes
evident only when all three DF paralogs are depleted simultaneously. Our study reveals a unified model of
m6A function in which all m6A-modified mRNAs are subjected to the combined action of YTHDF proteins
in proportion to the number of m6A sites.
1582 Cell 181, 1582–1595, June 25, 2020 © 2020 Elsevier Inc.
Figure 1. DF Proteins Bind the Same m6A Sites throughout the Transcriptome
(A) The amino acids that contact m6A and the m6A-proximal nucleotides are conserved in the DF1, DF2, and DF3 YTH domain. Conserved (gray) and non-
conserved (blue) amino acids are shown on the YTH domain, rendered from the DF1 m6A-RNA structure (Xu et al., 2015) (PDB: 4RCJ). Conserved is defined as
amino acids identical in all three DF paralogs, whereas non-conserved is defined as amino acids that are different in at least one DF paralog. Shown are a front
view (left) and back view (rotated 180°, right).
(B) DF1, DF2, and DF3 have similar binding preferences for different m6A submotifs. Shown is the prevalence of different binding sites recognized by DF1, DF2,
and DF3 based on iCLIP binding data. For comparison, the percentage of each DRACH motif identified by miCLIP is shown. The three DF paralogs have similar
binding site preferences. Their binding preferences are similar to the prevalence of the m6A sequence motifs.
(C) Each m6A site in the transcriptome binds DF1, DF2, and DF3. DF1, DF2, and DF3 binding at 4,182 m6A mRNAs was based on DF1, DF2, and DF3 iCLIP
datasets in HEK293T cells (Patil et al., 2016). m6A sites are plotted as points in which the x and y coordinates represent the number of normalized DF iCLIP reads
proteins, and their shared subcellular localizations. Furthermore, the different DF paralogs to bind different m6A sequence motifs.
we show that the effects of m6A on cellular differentiation can be m6A residues are almost exclusively found in a single highly
explained by the combined action of the three DF proteins rather conserved sequence motif: DR-m6A-CH (D = A, G, U; R = A,
than individual DF proteins. Overall, these studies reveal a new G; H = A, C, U) (Canaani et al., 1979; Dimock and Stoltzfus,
unified model of m6A function in which m6A predominantly influ- 1977; Linder et al., 2015; Wei et al., 1976). We reasoned that
ences mRNA degradation via the combined action of the three the differential association of DF paralogs with distinct sets of
largely redundant DF proteins. m6A sites could be explained by their preferences for certain
DR-m6A-CH submotifs, such as GG-m6A-CU versus AG-
RESULTS m6A-CU.
To determine the binding preference for each DF paralog,
Structural Analysis of Different YTH Domains Reveals we examined our recently generated transcriptome-wide
Their Similar RNA-Binding Properties iCLIP (individual-nucleotide-resolution UV crosslinking and
An important concept in m6A-mediated gene regulation is that immunoprecipitation) maps of the binding sites of endogenously
the different DF proteins bind distinct subsets of m6A residues expressed DF1, DF2, and DF3 in HEK293T cells (Patil et al.,
in the transcriptome (Han et al., 2019; Liu et al., 2018; Shi et al., 2016). We calculated the percentage of each DR-m6A-CH
2017, 2018, 2019; Wang et al., 2015). As a result, m6A is thought sequence submotif at each DF-binding site identified by iCLIP
to influence mRNAs in different ways, depending on which DF pa- and ranked the most common sequence submotifs bound by
ralog it binds. Based on this, a set of DF1, DF2, or DF3 ‘‘unique’’ each DF paralog. These analyses showed a nearly identical fre-
m6A sites is commonly used for analysis of the function of each quency of each DR-m6A-CH submotif bound by each DF paralog
DF paralog (Han et al., 2019; Liu et al., 2018; Park et al., 2019; (Figure 1B). The rank order of the prevalence of the submotifs
Shi et al., 2017, 2018, 2019). Because this m6A-specific binding recognized by the DF paralogs was similar to the overall preva-
behavior of the DF paralogs ultimately determines how m6A lence of each m6A submotif in the transcriptome (Linder et al.,
regulates cellular processes, a major goal is to understand the 2015; Figure 1B). Thus, each DF paralog shows the same binding
basis of selective m6A recognition by the DF paralogs. preferences, which largely correlate with the prevalence of m6A
To address this, we first asked whether differences in the motifs, rather than a paralog-specific sequence preference.
YT521-B homology (YTH) domains might account for the Although the DF paralogs seem to bind m6A in proportion
different binding preferences of the DF proteins. The YTH to the prevalence of the m6A motif, each paralog may be targeted
domain comprises ~134 amino acids and mediates selective to a different set of m6A sites, as suggested by previous studies
binding to m6A (Li et al., 2014; Patil et al., 2018; Xu et al., 2015). We used the crystal structures of the DF1 and DF2 YTH domains bound to m6A-containing RNA (Li et al., 2014; Xu et al., 2015) to annotate the amino acids that recognize m6A and the adjacent nucleotides. In the case of DF3, an RNA-bound structure has not yet been reported; however, each of the amino acids that bind m6A and the adjacent nucleotides in DF1 and DF2 is conserved in DF3 (Figure S1A), suggesting that the structural mechanism of m6A binding is the same for all three DF proteins.
It remains possible that the few amino acids that differ between the YTH domains (Figure S1A) could affect RNA binding. However, these amino acids map on the surface opposite the RNA-binding pocket (Figure 1A), suggesting that they might not affect YTH-RNA interactions. Furthermore, each full-length DF protein shows a similar binding affinity for an m6A-containing RNA oligonucleotide (Figure S1B), consistent with previous studies (Arguello et al., 2019; Wang et al., 2014a; Xu et al., 2015). Overall, the RNA-binding surfaces for the three YTH domains appear to be identical.

The DF Paralogs Show Equivalent Binding to All m6A Sites throughout the Transcriptome
The different transcriptome-wide binding properties reported for the DF paralogs (Shi et al., 2017) could reflect the ability of

(Shi et al., 2017, 2019; Figure S1C). Thus, we sought to identify the m6A sites uniquely bound by each DF paralog.
To do this, we quantified the number of DF1, DF2, and DF3 reads from the iCLIP datasets performed in HEK293T cells (Patil et al., 2016) that map to each HEK293T m6A site (Linder et al., 2015) or, similarly, the FLAG-DF1 and FLAG-DF2 reads from the PhotoActivatable Ribonucleoside-enhanced Crosslinking and ImmunoPrecipitation (PAR-CLIP) datasets performed in HeLa cells (Shi et al., 2017; Wang et al., 2014a, 2015) and mapped at each HeLa m6A site (Ke et al., 2017). We could not use the FLAG-DF3 PAR-CLIP dataset (Shi et al., 2017) because it contained a low number of reads mappable to cytosolic mRNAs (Figure S1D).
Using this m6A-centric approach, we found a strong linear correlation in pairwise comparisons between DF1, DF2, and DF3 reads at each m6A site. This correlation was seen in the PAR-CLIP datasets in HeLa cells mapped onto HeLa m6A sites (Figures S1E–S1F), as well as the iCLIP datasets in HEK293T cells mapped onto HEK293T m6A sites (Figure 1C). Notably, in contrast to the previous model where many m6A sites show DF paralog-specific binding, we found no preferential enrichment of iCLIP or PAR-CLIP reads for any paralog at any m6A site. Overall, these results suggest that each m6A site is bound by DF1, DF2, and DF3 in similar proportions.
overlapping that site (log2 normalized). We did not find m6A sites that preferentially bound either DF1, DF2, or DF3. The high Pearson correlation coefficients (r)
show that DF paralogs have highly similar binding preferences. Similar results were found in HeLa cells using DF PAR-CLIP datasets (Figure S1E).
(D) DF1, DF2, and DF3 iCLIP reads show a similar distribution in mRNAs that resembles the miCLIP read distribution. Shown are representative examples of DF1,
DF2, and DF3 iCLIP read distribution and miCLIP read distribution on HNRNPF and MDM2. The iCLIP and miCLIP data shown here were obtained in HEK293T
cells and were similar to data obtained in HeLa cells (Figures S1F–S1H).
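The pairwise comparison summarized by these Pearson coefficients can be illustrated with a minimal sketch. It assumes per-site read counts are already available for each paralog; the reads-per-million scaling, the pseudocount, the function names, and the toy counts are all illustrative assumptions, not the normalization or data used in the study.

```python
import numpy as np

def log2_normalized(counts, pseudocount=1.0):
    """Scale counts to reads per million, then take log2.

    The RPM scaling and pseudocount are assumptions made for this sketch.
    """
    counts = np.asarray(counts, dtype=float)
    rpm = counts / counts.sum() * 1e6
    return np.log2(rpm + pseudocount)

def pairwise_pearson(site_counts):
    """Return {(a, b): r} for each pair of datasets, correlating across m6A sites."""
    names = sorted(site_counts)
    logged = {name: log2_normalized(site_counts[name]) for name in names}
    result = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            result[(a, b)] = float(np.corrcoef(logged[a], logged[b])[0, 1])
    return result

# Toy, made-up read counts at six hypothetical m6A sites (not data from the paper).
toy = {
    "DF1": [120, 15, 300, 48, 9, 75],
    "DF2": [250, 33, 610, 90, 20, 160],
    "DF3": [60, 8, 140, 25, 4, 39],
}

for pair, r in pairwise_pearson(toy).items():
    print(pair, round(r, 3))
```

In a comparison of this kind, a site bound preferentially by one paralog would sit far from the diagonal of the corresponding scatterplot and lower the coefficient for that pair.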
Because we were surprised by the lack of m6A sites uniquely bound by each of the DF proteins, we reexamined the previously reported DF1 and DF2 unique RNA-binding sites (Shi et al., 2017). We asked whether DF1 binding, as measured by PAR-CLIP signal, was missing at DF2 unique sites and vice versa. To test this, we plotted DF2 binding at each DF1 unique site and vice versa. This analysis shows that there is a comparable level of DF1 and DF2 binding at each DF1 unique site and a comparable level of DF1 and DF2 binding at each DF2 unique site (Figures S1G and S1H). Thus, the previously described unique sites instead appear to be indistinguishable in terms of their DF1 and DF2 binding.
Next, we directly examined the PAR-CLIP reads on individual transcripts proposed to be DF1 unique or DF2 unique. After extensive visual inspection, we found essentially no differences in the distribution of DF1 and DF2 PAR-CLIP reads for any of these transcripts (Figure S1J). Notably, the PAR-CLIP reads correlated with the location of m6A-called sites, supporting the idea that DF binding is due to m6A. A similar effect was seen when using the iCLIP datasets for DF1, DF2, and DF3 (Figure 1D). These data further suggest that the sites that were previously described as being DF paralog unique are not, in fact, unique.
Our reexamination of the iCLIP and PAR-CLIP studies contrasts with previous reports showing that each DF paralog has only partial overlap with m6A sites and with each other (Shi et al., 2017; Wang et al., 2014a, 2015; Figure S1C). In the PAR-CLIP studies, DF paralog-binding sites were determined based on a threshold number of misincorporation-containing reads at any individual site in the transcriptome. However, using a threshold approach in independent experiments can often produce false negative sites. These arise because of the arbitrary nature of a threshold, which can cause some sites to be missed when they fall just beneath that threshold. For this reason, the threshold approach is not generally used for comparative analysis of datasets (Chakrabarti et al., 2018; Guertin et al., 2018; Landt et al., 2012). Instead, comparative analysis is often performed by selecting a set of specific transcriptomic sites, such as sequence motifs, and determining whether one CLIP dataset or another shows preferential binding to any of these sites (Dominguez et al., 2018; Wheeler et al., 2018). Our approach is similar; by comparing DF paralog binding at mapped m6A sites, we find that the RNA binding of DF paralogs is essentially identical, even when using different cell lines and different CLIP methodologies (iCLIP and PAR-CLIP).
Overall, these diverse lines of evidence, including structural similarity, motif analysis, and analysis of DF binding at each m6A site in the transcriptome using CLIP datasets from two cell lines, suggest that the DF paralogs do not exhibit different patterns of m6A-binding in the transcriptome. Instead, they appear to have similar binding preferences and affinities.

DF Proteins Exhibit Similar Protein Interaction Networks
Although we find that the DF paralogs bind the same m6A sites, an unresolved question is how these nearly identical proteins exert different molecular effects on m6A-mRNAs, especially in the case of DF1 (translation) compared with DF2 (degradation). A compelling possibility for divergent functions of DF paralogs might lie within their effector domain, an ~40-kDa low-complexity domain that comprises the remainder of the protein outside of the YTH domain (Patil et al., 2018). Recruitment of a distinct set of proteins to the effector domain of each DF paralog may allow the DF paralogs to exert different effects on m6A-mRNAs. We thus analyzed the differences in the effector domains between DF paralogs as well as their protein interaction partners.
Examination of the three effector domains shows them to be superficially similar. Each is a proline- and glutamine-rich low-complexity domain with ~60% amino acid identity and 70% amino acid similarity (Figure S2A). Furthermore, the hydrophobicity and charge distribution along the length of the low-complexity domains are highly similar. Additionally, the positions of the disordered regions within the low-complexity domains are similar for each paralog (Figure 2A).
The different DF proteins may have different protein interaction partners to mediate their different functions. We therefore examined a recent comprehensive Bio-ID study in which 139 proteins were BirA tagged, and interacting proteins were detected based on their proximity-induced biotinylation (Youn et al., 2018). In these experiments, 937–1,360 proteins were detected as possible DF interactors, of which 63–103 were identified as high-confidence interactors (63 of 937 DF1 interactors, 100 of 1,270 DF2 interactors, and 103 of 1,360 DF3 interactors).
To determine which interactors are preferentially bound by each DF paralog, we performed a scatterplot analysis in which
each biotinylated interactor was plotted as a circle, with the x- and y-axis positions each reflecting the average spectral counts for one of the three DF paralogs. In this pairwise analysis, we found similar proteins bound by all three paralogs. Additionally, we found a marked linear correlation in the spectral counts for each high-confidence protein interactor when examined in pairwise comparisons of DF1, DF2, and DF3 (Figure 2B). Thus, the DF binding partners seem to be shared and bind with the same overall rank order preference for all three DF paralogs.
Notably, the top 25 interactors of all three DF paralogs were highly similar and included high-scoring interactions with components of the Carbon Catabolite Repression-Negative On TATA-less (CCR4-NOT) RNA degradation complex, such as CNOT1, CNOT7, and CNOT10 (Figures 2B and S2B). Other high-confidence interactors of all three DF paralogs included RNA degradation proteins such as PATL1, XRN1, LSM12, and DDX6. In addition, all three DF paralogs interact with protein components of stress granules (Figures 2B and S2B). Stress granule proteins interact even in the absence of stress (Markmiller et al., 2018; Youn et al., 2018). Notably, all of these interactors have been seen in other studies. For example, CNOT1 immunoprecipitation studies have identified all three DF paralogs (Du et al., 2016). Additionally, an engineered ascorbate peroxidase (APEX)-based proteomics analysis of the G3BP1 stress granule protein found all three DF paralogs to be interactors (Markmiller et al., 2018). Thus, all three DF paralogs show similar patterns of binding proteins, and the major interactions are seen across independent proteomic datasets.
Because DF1 is thought to regulate translation through its interactions with the eIF3A and eIF3B translation initiation factors (Shi et al., 2017; Wang et al., 2015), we wanted to determine whether DF1 shows selective and robust interactions with these proteins. The Bio-ID proteomics analysis (Figure 2B) shows weak interactions of eIF3A and eIF3B with all three DF paralogs, and in each case, the probability of interaction was low. Previous analysis of DF1-binding partners identified eIF3A and eIF3B based on mass spectrometry analysis of DF1 immunoprecipitates (Wang et al., 2015). However, in that study, eIF3A and eIF3B were among the lowest-scoring DF1 interactors (Figure S2C). The weak binding seen in immunoprecipitates is consistent with the low probability seen in the Bio-ID studies (Figure 2B). Thus, each proteomic study suggests that all three DF paralogs exhibit nonspecific or low-level binding to eIF3A and eIF3B in cells.
Overall, the protein-protein interactions of all three DF paralogs are very similar, rather than different, with high-confidence interactions with RNA degradation machinery and low-confidence interactions with translation machinery.

DF Proteins Exhibit Similar Intracellular Localizations
To further understand the different functions of the DF paralogs, we examined their subcellular localizations because distinct subcellular localizations might be expected for proteins with different functions. In the Bio-ID study (Youn et al., 2018), a non-negative matrix factorization classification analysis was developed to predict the subcellular localization of 139 RNA-binding proteins based on their interaction partners. The prominent localizations predicted for all three DF paralogs were P-bodies and cytoplasmic mRNA ribonucleoprotein (mRNP) complexes (Figure 2C). Thus, DF paralogs are predicted to have similar subcellular localizations.
To further test this, we examined the localization of each DF paralog by immunofluorescence. Here we found that all three DF paralogs showed essentially identical localization throughout the cytoplasm in small punctate structures and, to a lesser extent, in larger punctate structures. Although the smaller ones resemble the granule-like particles detected previously under unstressed conditions (Youn et al., 2018), the larger punctate structures were identified as P-bodies based on their colocalization with EDC4, a P-body marker (Figure 2D). This is consistent with an independent proteomic analysis using P-body markers that also identified all three DF paralogs in P-bodies (Youn et al., 2018). Thus, rather than exhibiting distinct localizations in cells, DF proteins have similar punctate cytoplasmic localizations, some of which include localization to P-bodies. Overall, these data suggest that DF proteins have highly similar sequences, functional domains, protein binding partners, and intracellular localizations.

The Combined Activity of DF Proteins Leads to Degradation of m6A-Modified mRNA
Because all DF paralogs show high-confidence interactions with RNA degradation pathway proteins (Youn et al., 2018; Figure 2B), we considered the possibility that each paralog may function to mediate degradation of m6A-mRNA.
Previous comparative studies of the DF paralogs found that some of them induce mRNA degradation when artificially tethered to a reporter transcript (Kennedy et al., 2016; Shi et al., 2017; Tirumuru et al., 2016; Wang et al., 2014a, 2015). However, because these studies used heterologously overexpressed DF proteins, we were concerned about the potential for these proteins to aggregate into stress granule-like structures when overexpressed. This phenomenon is seen in proteins, like the DF paralogs, that contain low-complexity domains (Alberti et al., 2019). Indeed, we observed variable levels of DF granule-like structures after transfecting cells with DF-expressing plasmids (Figure S2D). Thus, DF overexpression experiments may be misleading because of the variable degrees of DF aggregation, which could sequester proteins and potentially suppress their function. We therefore re-examined the role of DF paralogs using knockdown approaches instead.
We first examined mRNA abundance using RNA sequencing (RNA-seq) after small interfering RNA (siRNA)-mediated depletion of DF1, DF2, and DF3 in HeLa cells. We used siRNA that we validated for selective high-efficiency DF1, DF2, and DF3 knockdown (Patil et al., 2016; Figures S3A–S3C). Because m6A sites on mRNAs bind all three DF paralogs (Figures 1 and S1), we examined how m6A-mRNAs were affected by DF depletion and whether the effects were correlated with the number of m6A sites (Table S1). As in a previous study of DF1 (Wang et al., 2015), we found no effect of DF1 depletion on m6A-mRNA abundance compared with non-methylated mRNAs (Figure 3A). In contrast, as reported previously (Wang et al., 2014a), depletion of DF2, the most highly expressed DF paralog in HeLa cells (Patil et al., 2018; Figure S3D), was associated with a small but
statistically significant increase in the abundance of m6A-mRNAs (Figure 3B). This effect was more pronounced for mRNAs with high numbers of annotated m6A sites. No increase in m6A-mRNA abundance was observed following DF3 depletion (Figure 3C), which, like DF1, is expressed at a lower level in HeLa cells (Patil et al., 2018; Figure S3D). Overall, our data confirm the idea that DF2, but not DF1 or DF3, destabilizes m6A-mRNAs.
We then considered the possibility that the DF paralogs may be functionally redundant and that our inability to detect a stability-regulatory effect was due to compensation by the other paralogs. Additionally, we noticed that knockdown of any of the DF paralogs was associated with a compensatory increase in the expression of the other paralogs in HeLa cells (Figures S3A–S3B). Thus, compensatory upregulation of the other DF paralogs could further mask an effect of knockdown.
We therefore tested simultaneous knockdown of different combinations of two DF paralogs or all three DFs. Knocking down any two DF paralogs increased the overall abundance of m6A-mRNA (Figures S3E–S3I). Importantly, the selective increase in m6A-mRNA expression was largest upon triple knockdown (Figure 3D), and, in each case, it was directly correlated with the number of m6A sites per mRNA (Figures 3D and S3F–S3H). Double knockdown and triple knockdown exhibited greater increases in m6A-mRNA abundance than DF2 knockdown alone (Figure 3E). These data are consistent with a model in which any of the DF proteins can promote degradation of m6A-mRNAs, but m6A-mRNA degradation is most effective when all three DF paralogs are available, resulting in maximal DF-dependent m6A-mRNA degradation.
To determine whether the expression levels seen upon triple DF knockdown are correlated with changes in mRNA stability levels, we measured the levels of m6A-mRNAs after transcription inhibition with actinomycin D. Because m6A-mRNAs tend to have a short half-life (Ke et al., 2017; Schwartz et al., 2014), we treated cells with actinomycin D for 2 h and measured the amount of each mRNA remaining using RNA-seq. mRNA stability was relatively unaffected upon depletion of each DF paralog individually, except in the case of DF2, where a small stabilizing effect was seen (Figure 3F). However, when all three DF paralogs were knocked down, a substantial increase in mRNA stability was seen for m6A-containing mRNAs compared with non-m6A-mRNAs (Figure 3F). These effects were also seen in qRT-PCR experiments examining the stability of specific mRNAs annotated to contain high numbers of m6A sites or annotated to lack m6A (Figure S3J).
Overall, these data further suggest that the most efficient degradation of m6A-mRNAs occurs when all three DF paralogs are present. When DF1 or DF3 is knocked down alone, the effects are nearly undetectable, but knockdown of DF1 and DF3 together or in combination with DF2 resulted in readily detectable increases in m6A-mRNA expression (Figure 3E). This suggests that the other DF paralogs can partially compensate for the loss of another DF. The ability of any DF protein to compensate is likely to be influenced by its expression level, with the lower expression of DF1 and DF3 (Figure S3D) making them less able to compensate for DF2 depletion. In this model, when all three DF paralogs are depleted, compensation cannot occur, and m6A-mRNA stability is increased most robustly.

DF Paralogs Do Not Affect the Translation of m6A-Modified mRNAs
Although our data strongly indicate that DF paralogs act together to destabilize m6A-mRNA, it remains possible that one or more of these proteins could also control m6A-mRNA translation. Importantly, DF1 and DF3 have been described to be enhancers of m6A-mRNA translation (Shi et al., 2017; Wang et al., 2015). Therefore, we examined the role of the DF paralogs in translational regulation of m6A-mRNAs.
DF1 has been proposed to enhance translation by binding m6A and recruiting eIF3 to 3′ UTRs. This subsequently facilitates loading of eIF3 onto mRNA caps (Wang et al., 2015). This is reminiscent of the poly(A)-binding protein PABPC1, which also binds 3′ UTRs, binds an initiation factor, and promotes formation of initiation complexes at mRNA caps (Le et al., 1997; Ozoe et al., 2013; Wells et al., 1998). The ability of PABPC1 to enhance mRNA translation is evident based on its enrichment in the polysome fractions corresponding to highly translated mRNAs (Arava et al., 2003).
We therefore examined whether DF1 is similarly enriched with highly translated mRNAs. However, all three DF paralogs are mostly excluded from the polysome fraction (Figures 4A and S4A) and highly enriched in the fractions at the beginning of the gradients, which represent cytoplasmic mRNPs (Arava et al., 2003). This result is not consistent with a model in which any DF protein is stably bound to mRNA 3′ UTRs, enhancing their translation.
Despite the absence of DF paralogs from the highly translating mRNA pool, we wanted to examine whether translation of m6A-mRNAs is enhanced by any of the DF paralogs.
We first examined the published ribosome profiling data that revealed the translation-enhancing effect of DF1 (Wang et al., 2015). In this study, processed data were provided, listing the abundance of ribosome-protected fragments that map to each gene. When we examined the processed data obtained after DF1 silencing, it was evident that m6A-mRNAs were less translated than in control siRNA-transfected samples (Figure S4B). Thus, consistent with the previous study (Wang et al., 2015), but in contrast to what was expected from the polysome analysis (Figure S4A), DF1 silencing causes a reduction in translation of m6A-mRNAs. This analysis was performed using the average of the two ribosome profiling replicates, as described in the original study.
However, we found different results when the processed data of the two DF1 siRNA replicates were analyzed separately. Replicate 1 showed a prominent decrease in translation of m6A-mRNAs upon DF1 depletion, consistent with the idea that DF1 enhances translation of m6A-mRNA (Figure S4C). However, replicate 2 showed no change in translation of m6A-mRNAs upon DF1 silencing (Figure S4D). Thus, the overall reduction in m6A-mRNA translation we saw in the averaged data (Figure S4B) was driven by the large drop in m6A-mRNA translation seen in replicate 1.
Because we were surprised by the different roles of DF1 implied by the two different replicates, we re-processed the raw
transcriptome. Additionally, rather than exerting different effects on different m6A-mRNAs, we show that the three DF proteins function together to mediate degradation of m6A-containing mRNAs. Although depletion of single DF paralogs leads to mild or no effects on mRNA abundance and stability, depleting all three DF proteins leads to robust stabilization of m6A-mRNAs, suggesting that each paralog can fully or partially compensate for the function of the other DF paralogs. Lastly, we find that previous studies that linked DF proteins to mRNA translation were affected by bioinformatic and technical issues, which led to the incorrect view that a major function of DF proteins was to promote translation. Instead, we show that DF paralogs do not appear to directly enhance translation of m6A-mRNAs in HeLa cells. Our comprehensive analysis of DF paralog function reveals a unified model of DF protein binding and function, with the major effect of m6A being to mediate mRNA degradation through the combined effects of all three DF paralogs.
The issue of whether DF paralogs bind different mRNAs or the same mRNAs is a central question for understanding m6A function in cells. If each DF paralog binds different mRNAs, then each DF would affect different cellular pathways and processes. This prevailing model has created the impetus for knockout studies focusing on separate analysis of each DF paralog, with the goal of understanding the specific functions of each. However, this model lacks a clear mechanism to explain how the DF paralogs could bind different m6A sites.
Our analysis supports the opposite conclusion. We find that all m6A sites bind all DF proteins in an essentially indistinguishable manner, with the main determinant of DF paralog binding simply being the presence of the m6A site. The level of DF binding is likely to be correlated with m6A stoichiometry, which would positively correlate with the degradation effect. It should be noted that some m6A sites may bind DF paralogs poorly when they are obscured by local RNA structure or by binding of nearby RNA-binding proteins that limit access to m6A.
Another central question is that of establishing the function of the major cytoplasmic m6A readers DF1, DF2, and DF3. This is arguably the most important step in understanding m6A biology because it can explain how the effects of m6A can be rationalized in diverse m6A-dependent processes. The current concept is that m6A, through the action of different DF proteins, can induce mRNA translation, mRNA degradation, or both and that this effect is transcript specific (Shi et al., 2017, 2019).
In contrast to this prevailing model, we find diverse lines of evidence supporting the idea that DF proteins have similar rather than different functions. In addition to the high sequence identity in the RNA-binding and effector domains, DF proteins have similar cytoplasmic localizations, including P-body localization, and similar protein interactors, which are hallmarks of proteins with similar functions. Most notable among the binding partners for all three DF proteins are CCR4-NOT deadenylation complex proteins, supporting the idea that all DF paralogs have a common role in mRNA degradation. A recent study showed that all three DF proteins, when heterologously expressed, bind CNOT1 and elicit mRNA deadenylation (Du et al., 2016). The ability of all three DF proteins to mediate deadenylation supports the overall finding that the major function of these proteins is to mediate degradation of m6A-mRNAs. Further studies may also address whether the position of m6A along the transcript body has an effect on DF-mediated mRNA degradation.
Our data do not support the idea that DF paralogs have a direct role in regulating mRNA translation efficiency. Our reanalysis of previously published data, together with our independent ribosome profiling datasets and analysis of individual m6A-mRNAs by polysome fractionation analysis performed in the same cell line, shows no evidence of decreased translation of m6A-mRNAs upon depletion of DF proteins. This conclusion is consistent with the previously published protein polysome fractionation analysis (Wang et al., 2015) as well as the new one presented here, which do not show an enrichment of DF proteins in the fractions containing highly translated mRNAs. Thus, DF proteins, including DF1, do not behave like other translation-enhancing RNA-binding proteins that bind the 3′ UTR (Le et al., 1997; Ozoe et al., 2013; Wells et al., 1998). Additionally, protein interaction analysis shows a lack of high-confidence interactions of the DF paralogs with translation initiation factors in cells (Youn et al., 2018). Overall, these diverse lines of evidence suggest that DF proteins do not promote translation. Although we cannot exclude a role of DF proteins in translation enhancement in other cell lines or conditions, our current findings are not consistent with a role of any DF paralog in regulating m6A-mRNA translation, at least in the original cell line and conditions in which DF1 was originally linked to translational regulation.
Although DF proteins do not promote translation, m6A can promote translation through DF-independent mechanisms. m6A can affect 3′ UTR length (Ke et al., 2017), which can indirectly affect translation. m6A may directly bind eIF3 when m6A is in the 5′ UTR (Meyer et al., 2015), or m6A may be associated with bound METTL3, which may facilitate translation (Choe et al., 2018). However, because eIF3 and METTL3 are thought to bind just a small subset of m6A sites, the role of eIF3 and METTL3 binding in the overall cellular effects of m6A is likely to be limited to specific transcripts (Zaccara et al., 2019).
Our finding that all DF paralogs contribute to mRNA destabilization reconciles findings made by diverse groups in which depletion of DF1 was not associated with reduced mRNA translation. For example, DF1 depletion does not affect translation of a heterologously expressed m6A-mRNA in MCF7 cells (Slobodin et al., 2017) and does not affect translation of m6A-mRNAs in neurons when using a DF1 tethering system unless a stimulus is added (Shi et al., 2018). Thus, the lack of translation regulation seen upon DF1 depletion can now be explained by the new model of DF function presented here.
In light of our current findings, phenotypes seen upon DF depletion likely derive from upregulation of m6A-mRNAs. Depletion of any single DF protein can affect mRNA levels to a small degree, which can still result in clear cellular effects. For example, recent studies show that m6A suppresses interferon beta (IFNB1) mRNA levels and that knockdown of different DF paralogs can cause a small increase in the abundance and translation of this transcript (Winkler et al., 2019). Because cells are sensitive to small increases in IFNB1, single DF knockdown can lead to readily detectable cellular effects. These hypomorphic phenotypes may differ from those of METTL3 depletion because METTL3 depletion would cause
a more substantial increase in the levels of highly m6A-modified mRNAs and affect multiple cellular pathways, potentially leading to very different phenotypes.
Although DF proteins appear to have similar functions, depletion of different DF paralogs can have different effects. This is because DF proteins exhibit markedly different expression levels (Patil et al., 2018). Therefore, depletion of a low-abundance DF paralog, such as DF3, is likely to affect only a small number of highly sensitive mRNAs, whereas depletion of a higher-abundance DF paralog would affect a larger subset of mRNAs, causing a different phenotype. Additionally, because DF paralogs may be expressed at different levels in different tissues, single DF paralog depletion can result in tissue-specific phenotypes (Nishizawa et al., 2017). DF paralogs may also be phosphorylated in a paralog-specific manner (Patil et al., 2018; Zaccara et al., 2019) or selectively induced in response to specific stimuli, as seen with p63-dependent induction of DF3 (Birkaya et al., 2007; Shi et al., 2017). These pathways and subtle DF-paralog amino acid sequence differences may confer additional modes of regulation that can affect the ability of DF paralogs to induce m6A-mRNA degradation.
Notably, the functions of the DF paralogs may differ depending on the cellular context. For example, in cell stress, DF proteins bind m6A-mRNA and relocalize to stress granules, but the mRNA is not targeted for degradation (Ries et al., 2019). In neurons, DF proteins have also been localized to transport granules and may therefore have roles in trafficking m6A-mRNA to dendrites and, thus, indirectly promote translation in neurons (Merkurjev et al., 2018; Ries et al., 2019). However, for cell types that exhibit the classic mRNA-destabilization effect of m6A, which has been seen in diverse cell types (Geula et al., 2015; Ke et al., 2017; Vu et al., 2017) and was originally described in 1978 (Sommer et al., 1978), the likely mediator of this m6A-dependent destabilization effect is the combined action of DF1, DF2, and DF3.

Detailed methods are provided in the online version of this paper and include the following:
  ○ Generation of the ribosome profiling and RNA-seq libraries upon DF silencing
  ○ Actinomycin D treatment
  ○ Quantitative PCR analysis
  ○ Microscale Thermophoresis (MST)
● QUANTIFICATION AND STATISTICAL ANALYSIS
  ○ Protein-protein interactome analysis
  ○ Analysis of the physicochemical properties of each DF paralog
  ○ Reanalysis of publicly available PAR-CLIP data of DF1, DF2, and DF3 in HeLa cells
  ○ Comparison of the coverage of DF proteins at each m6A site on mRNAs throughout the transcriptome
  ○ Calculation of coverage at each DF1 and DF2 unique site
  ○ Analysis of the ribosome profiling and RNA-seq data upon silencing of DF proteins
  ○ Reanalysis of publicly available ribosome profiling data upon silencing of DF1 in HeLa cells
  ○ Analysis of m6A mRNA stability upon DF paralog depletion

SUPPLEMENTAL INFORMATION
Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.012.

ACKNOWLEDGMENTS
We thank members of the Jaffrey laboratory for comments and suggestions, in particular V. Despic and D. Patil for early contributions to this project. We thank members of the Epigenomics and Flow Cytometry Weill Cornell Cores. We thank A. North and the Rockefeller Bio-Imaging Resource Center for assistance with SIM microscopy. We thank the Rockefeller High-Throughput Screening Resource Center for assistance with the MST experiments. We thank Y. Cheng and M. Kharas for advice regarding the experiments using leukemia cells. This work was supported by NIH grants R35NS111631 and R01CA186702 (to S.R.J.) and an American-Italian Cancer Foundation fellowship (to S.Z.).

S.Z. and S.R.J. designed the experiments. S.Z. performed the experiments. S.Z. and S.R.J. wrote the manuscript.
Clancy, M.J., Shambaugh, M.E., Timpte, C.S., and Bokar, J.A. (2002). Induction of sporulation in Saccharomyces cerevisiae leads to the formation of N6-methyladenosine in mRNA: a potential mechanism for the activity of the IME4 gene. Nucleic Acids Res. 30, 4509–4518.
Cui, Q., Shi, H., Ye, P., Li, L., Qu, Q., Sun, G., Sun, G., Lu, Z., Huang, Y., Yang, C.-G., et al. (2017). m6A RNA Methylation Regulates the Self-Renewal and Tumorigenesis of Glioblastoma Stem Cells. Cell Rep. 18, 2622–2634.
Dimock, K., and Stoltzfus, C.M. (1977). Sequence specificity of internal methylation in B77 avian sarcoma virus RNA subunits. Biochemistry 16, 471–478.
Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21.
Dodt, M., Roehr, J.T., Ahmed, R., and Dieterich, C. (2012). FLEXBAR-Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms. Biology (Basel) 1, 895–905.
Dominguez, D., Freese, P., Alexis, M.S., Su, A., Hochman, M., Palden, T., Bazile, C., Lambert, N.J., Van Nostrand, E.L., Pratt, G.A., et al. (2018). Sequence, Structure, and Context Preferences of Human RNA Binding Proteins. Mol. Cell 70, 854–867.e9.
Du, H., Zhao, Y., He, J., Zhang, Y., Xi, H., Liu, M., Ma, J., and Wu, L. (2016). YTHDF2 destabilizes m(6)A-containing RNA through direct recruitment of the CCR4-NOT deadenylase complex. Nat. Commun. 7, 12626.
Le, H., Tanguay, R.L., Balasta, M.L., Wei, C.C., Browning, K.S., Metz, A.M., Goss, D.J., and Gallie, D.R. (1997). Translation initiation factors eIF-iso4G and eIF-4B interact with the poly(A)-binding protein and increase its RNA binding activity. J. Biol. Chem. 272, 16247–16255.
Lee, H., Bao, S., Qian, Y., Geula, S., Leslie, J., Zhang, C., Hanna, J.H., and Ding, L. (2019). Stage-specific requirement for Mettl3-dependent m6A mRNA methylation during haematopoietic stem cell differentiation. Nat. Cell Biol. 21, 700–709.
Li, F., Zhao, D., Wu, J., and Shi, Y. (2014). Structure of the YTH domain of human YTHDF2 in complex with an m(6)A mononucleotide reveals an aromatic cage for m(6)A recognition. Cell Res. 24, 1490–1492.
Li, A., Chen, Y.-S., Ping, X.-L., Yang, X., Xiao, W., Yang, Y., Sun, H.-Y., Zhu, Q., Baidya, P., Wang, X., et al. (2017). Cytoplasmic m6A reader YTHDF3 promotes mRNA translation. Cell Res. 27, 444–447.
Linder, B., Grozhik, A.V., Olarerin-George, A.O., Meydan, C., Mason, C.E., and Jaffrey, S.R. (2015). Single-nucleotide-resolution mapping of m6A and m6Am throughout the transcriptome. Nat. Methods 12, 767–772.
Liu, J., Eckert, M.A., Harada, B.T., Liu, S.M., Lu, Z., Yu, K., Tienda, S.M., Chryplewicz, A., Zhu, A.C., Yang, Y., et al. (2018). m6A mRNA methylation regulates AKT activity to promote the proliferation and tumorigenicity of endometrial cancer. Nat. Cell Biol. 20, 1074–1083.
Madeira, F., Park, Y.M., Lee, J., Buso, N., Gur, T., Madhusoodanan, N., Basutkar, P., Tivey, A.R.N., Potter, S.C., Finn, R.D., and Lopez, R. (2019). The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res. 47 (W1), W636–W641.
Geula, S., Moshitch-Moshkovitz, S., Dominissini, D., Mansour, A.A., Kol, N., Markmiller, S., Soltanieh, S., Server, K.L., Mak, R., Jin, W., Fang, M.Y., Luo, E.-
Salmon-Divon, M., Hershkovitz, V., Peer, E., Mor, N., Manor, Y.S., et al. C., Krach, F., Yang, D., Sen, A., et al. (2018). Context-Dependent and Disease-
(2015). Stem cells. m6A mRNA methylation facilitates resolution of naı̈ve plu- Specific Diversity in Protein Interactions within Stress Granules. Cell 172, 590–
ripotency toward differentiation. Science 347, 1002–1006. 604.e13.
Quinlan, A.R., and Hall, I.M. (2010). BEDTools: a flexible suite of utilities for Winkler, R., Gillis, E., Lasman, L., Safra, M., Geula, S., Soyris, C., Nachshon,
comparing genomic features. Bioinformatics 26, 841–842. A., Tai-Schmiedel, J., Friedman, N., Le-Trilling, V.T.K., et al. (2019). m6A modi-
fication controls the innate immune response to infection by targeting type I in-
Ramı́rez, F., Ryan, D.P., Grüning, B., Bhardwaj, V., Kilpert, F., Richter, A.S., terferons. Nat. Immunol. 20, 173–182.
Heyne, S., Dündar, F., and Manke, T. (2016). deepTools2: a next generation
Wu, C.C.-C., Zinshteyn, B., Wehner, K.A., and Green, R. (2019). High-Res-
web server for deep-sequencing data analysis. Nucleic Acids Res. 44 (W1),
olution Ribosome Profiling Defines Discrete Ribosome Elongation States
W160-5.
and Translational Regulation during Cellular Stress. Mol. Cell 73,
Ries, R.J., Zaccara, S., Klein, P., Olarerin-George, A., Namkoong, S., 959–970.e5.
Pickering, B.F., Patil, D.P., Kwak, H., Lee, J.H., and Jaffrey, S.R. (2019).
Xu, C., Liu, K., Ahmed, H., Loppnau, P., Schapira, M., and Min, J. (2015). Struc-
m6A enhances the phase separation potential of mRNA. Nature 571,
tural Basis for the Discriminative Recognition of N6-Methyladenosine RNA by
424–428.
the Human YT521-B Homology Domain Family of Proteins. J. Biol. Chem. 290,
Risso, D., Ngai, J., Speed, T.P., and Dudoit, S. (2014). Normalization of RNA- 24902–24913.
seq data using factor analysis of control genes or samples. Nat. Biotechnol. Youn, J.-Y., Dunham, W.H., Hong, S.J., Knight, J.D.R., Bashkurov, M., Chen,
32, 896–902. G.I., Bagci, H., Rathod, B., MacLeod, G., Eng, S.W.M., et al. (2018). High-Den-
Schwartz, S., Mumbach, M.R., Jovanovic, M., Wang, T., Maciag, K., Bushkin, sity Proximity Mapping Reveals the Subcellular Organization of mRNA-Asso-
G.G., Mertins, P., Ter-Ovanesyan, D., Habib, N., Cacchiarelli, D., et al. (2014). ciated Granules and Bodies. Mol. Cell 69, 517–532.e11.
Perturbation of m6A writers reveals two distinct classes of mRNA methylation Zaccara, S., Ries, R.J., and Jaffrey, S.R. (2019). Reading, writing and erasing
at internal and 50 sites. Cell Rep. 8, 284–296. mRNA methylation. Nat. Rev. Mol. Cell Biol. 20, 608–624.
Shi, H., Wang, X., Lu, Z., Zhao, B.S., Ma, H., Hsu, P.J., Liu, C., and He, C. Zhong, S., Li, H., Bodi, Z., Button, J., Vespa, L., Herzog, M., and Fray, R.G.
(2017). YTHDF3 facilitates translation and decay of N6-methyladenosine- (2008). MTA is an Arabidopsis messenger RNA adenosine methylase and inter-
modified RNA. Cell Res. 27, 315–328. acts with a homolog of a sex-specific splicing factor. Plant Cell 20, 1278–1288.
STAR+METHODS
Continued
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Oligonucleotides
Single m6A-DRACH 10-nt oligo: rUrCrCrGrG/iN6Me-rA/rCrUrGrU | Xu et al., 2015 | https://doi.org/10.1074/jbc.M115.680389
Human YTHDF2 shRNA | Sigma | Cat# TRCN000254410
Human YTHDF3 shRNA | Sigma | Cat# TCN000365173
Human YTHDF1, YTHDF2 and YTHDF3 shRNA | Patil et al., 2016 | https://doi.org/10.1038/nature19342
ERCC RNA Spike-in Mix | Thermo Fisher Scientific | Cat# 4456740
Recombinant DNA
pcDNA3-FLAG-HA | Patil et al., 2016 | https://doi.org/10.1038/nature19342
pcDNA3-FLAG-HA-YTHDF1 | Patil et al., 2016 | https://doi.org/10.1038/nature19342
pcDNA3-FLAG-HA-YTHDF2 | Patil et al., 2016 | https://doi.org/10.1038/nature19342
pcDNA3-FLAG-HA-YTHDF3 | Patil et al., 2016 | https://doi.org/10.1038/nature19342
Software and Algorithms
GraphPad Prism 8 | GraphPad Software | https://www.graphpad.com/scientific-software/prism/
RStudio (1.0.153) | RStudio | https://rstudio.com/
Fiji (ImageJ 2.0.0-rc-68/1.52h) | NIH | https://fiji.sc/
Imaris | Bitplane | https://imaris.oxinst.com
FLEXBAR | Dodt et al., 2012 | https://github.com/seqan/flexbar/wiki
riboWaltz | Lauria et al., 2018 | https://github.com/LabTranslationalArchitectomics/riboWaltz
STAR | Dobin et al., 2013 | https://github.com/alexdobin/STAR
BEDTools | Quinlan and Hall, 2010 | https://github.com/arq5x/bedtools2
MEME Suite | Bailey et al., 2009 | http://meme-suite.org/index.html
DeepTools | Ramírez et al., 2016 | https://deeptools.readthedocs.io/en/develop/
RESOURCE AVAILABILITY
Lead Contact
Information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Samie R. Jaffrey
(srj2003@med.cornell.edu).
Material Availability
This study did not generate new unique reagents.
Cell Culture
HeLa (ATCC CCL-2) cells of female origin were maintained in 1x DMEM (11995-065, Life Technologies) with 10% FBS, 100 U ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin under standard culture conditions. Cells were split with TrypLE Express (Life Technologies) according to the manufacturer’s instructions. HeLa cells were authenticated. MOLM-13 cells (of male origin) were maintained in 1x RPMI with 10% FBS, 100 U ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin under standard culture conditions. MOLM-13 cells were a kind donation from the laboratory of M. Kharas and were not authenticated during this study.
METHOD DETAILS
Flow cytometry
After 5 days of DF silencing, MOLM-13 cells were tested for the expression of the CD14 differentiation marker using the following
protocol. Cells were collected by centrifugation, washed once with 1X PBS+2% FBS and then counted using a hemocytometer.
One million cells were used to perform cellular staining using PE-CD14 antibody (BD PharMingen). The antibody dilution was per-
formed according to the manufacturer’s instructions. Staining was performed on ice for 1 h. After the staining, excess unbound anti-
body was removed by two washes with 1x PBS. DAPI was added prior to analysis. Cells were analyzed on a BD FACS ARIA instru-
ment. To calculate the percentage of CD14 positive cells, data were processed using FlowJo.
permeabilization/blocking, cells were incubated with the primary antibody in a humidified chamber for 2 h. Cells were then washed three times with 1x PBS and incubated with the appropriate secondary antibody (anti-rabbit Alexa Fluor 568, anti-mouse Alexa Fluor 488) for 1 h at room temperature. Following the incubation, cells were washed as before. Hoechst staining was performed for 5 min. After additional washes, coverslips were mounted in mounting media (Prolong Diamond, P36961, Life Technologies) and quickly sealed with nail polish. Stained cells were imaged by super-resolution 3D-SIM on an OMX Blaze 3D-SIM super-resolution microscope (Applied Precision) equipped with a 100x/1.40 UPLSAPO oil objective. To reduce spherical aberrations, an oil with the optimal refractive index was first identified at the beginning of every acquisition session. Image reconstruction and alignment were performed using SoftWoRx. Because all acquired images have AF-568 staining (red staining), Optimal-Red transfer functions (OTFs) were used during the image processing. Further processing was performed in Imaris.
Polysome
HeLa cells were seeded on 10-cm dishes. At 70%–80% confluency, cells were treated with 100 µg/mL cycloheximide for 10 min at 37 °C and then collected. Briefly, cells were washed with PBS under mild cycloheximide conditions and then lysed directly on the dish using 400 µl of lysis buffer (20 mM Tris HCl pH 7.4, 100 mM KCl, 5 mM MgCl2, 1% Triton X-100, 100 µg/ml cycloheximide, 2 mM DTT, 1x cOmplete no EDTA protease inhibitor cocktail). The lysate was left on ice for 10 min and then cleared by centrifugation at 12,000 × g for 15 min. The cleared lysates were loaded on 15%–50% linear sucrose gradients, ultracentrifuged and fractionated with an automated fraction collector. Proteins were extracted from each fraction using trichloroacetic acid. Sodium deoxycholate was added as a carrier to assist in the protein precipitation. After acetone addition and washes, proteins were resuspended in protein loading buffer, denatured at 95 °C for 10 min, and used for detecting DF1, DF2, DF3 and RPS6 by western blot. RNA was extracted from each fraction using TRIzol-LS reagent (Invitrogen). To account for differences in extraction efficiency, 2 ng of a spike-in luciferase mRNA (TriLink Biotechnologies) was added to each fraction before RNA extraction. Briefly, RNA extraction was performed using a ratio of 2:1 between the volume of TRIzol LS reagent and the sample. After TRIzol addition, RNA isolation was performed according to the manufacturer’s instructions.
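The spike-in correction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how the luciferase signal was applied computationally, so dividing each fraction's target signal by its recovered spike-in signal, and then expressing fractions as shares of the gradient total, is an assumed convention. The function name and the input values are hypothetical.

```python
def spike_in_normalized_shares(target_signal, spike_in_signal):
    """Return each fraction's share of the target RNA after spike-in correction.

    Both arguments are per-fraction measurements in arbitrary units
    (hypothetical inputs, for example qPCR-derived abundances).
    """
    if len(target_signal) != len(spike_in_signal):
        raise ValueError("one spike-in value is needed per fraction")
    if any(s <= 0 for s in spike_in_signal):
        raise ValueError("spike-in signal must be positive in every fraction")
    # Dividing by the recovered spike-in corrects for extraction efficiency
    # (assumed usage; not stated explicitly in the methods).
    corrected = [t / s for t, s in zip(target_signal, spike_in_signal)]
    total = sum(corrected)
    if total == 0:
        raise ValueError("target RNA was not detected in any fraction")
    # Express each fraction as a share of the whole gradient.
    return [c / total for c in corrected]


# Made-up values for three gradient fractions.
shares = spike_in_normalized_shares([10.0, 20.0, 30.0], [1.0, 2.0, 1.0])
```

Because the same amount of spike-in (2 ng) enters every fraction, a fraction with twice the recovered spike-in signal is treated as having been extracted twice as efficiently.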
Generation of the ribosome profiling and RNA seq libraries upon DFs silencing
Ribosome profiling was performed following the original protocol (McGlincy and Ingolia, 2017) with minor modifications. Briefly, HeLa cells were plated on a 10-cm dish and each DF or combination of DF transcripts was silenced as described. After the silencing, to inhibit ribosome run-off during the collection and lysis, cells were rapidly washed once with ice-cold PBS containing 50 µg/ml cycloheximide and the plates were immersed in liquid nitrogen and placed on dry ice as previously described (Calviello et al., 2016). After allowing cells to reach 4 °C by keeping them on ice, cells were immediately lysed in 400 µL of cell lysis buffer (20 mM Tris pH 7.4, 150 mM NaCl, 5 mM MgCl2, 1 mM DTT, 100 µg/mL cycloheximide, 1% Triton, 25 U DNase I) and triturated through a 26-gauge needle ten times to further lyse the cellular material. 5% of the total cellular extract was used for RNA-seq library preparation. 10% of the total cellular extract was used for western blot validation of the silencing. Lysate was then clarified by centrifugation at 20,000 × g for 10 min at 4 °C. Supernatant was collected, flash-frozen and stored at −80 °C.
For the ribosome profiling library preparation, the RNA concentration in the cleared lysate was quantified using RiboGreen. 30 µg of RNA was digested using RNase I (Epicenter) as previously described (McGlincy and Ingolia, 2017). After RNase digestion, lysates were loaded on a sucrose gradient and centrifuged at 100,000 rpm (TLA 100.3 rotor) for 1 h to pellet ribosomes. RNA from the resuspended ribosomal pellet was purified using Trizol and run on a gel to selectively excise foot-printed RNAs (from 17 nt to 34 nt in length). To reduce ribosomal RNA contamination in the library preparation steps, we then performed Ribo Zero Gold depletion of the foot-printed RNA. The rRNA-depleted RNA fragments were dephosphorylated, and the adaptor was ligated. To specifically deplete unligated linker, yeast 5′-deadenylase and RecJ exonuclease digestion was performed. From this point, the library preparation steps were performed essentially as described previously (McGlincy and Ingolia, 2017). Briefly, reverse transcription was performed using SuperScript III. To avoid untemplated nucleotide addition, reverse transcription was carried out at 57 °C, as described (McGlincy and Ingolia, 2017). cDNA purification, circularization, and amplification were performed as previously described (Linder et al., 2015). Using this method, every ribosomal footprint represents a unique ribosome-binding event. Libraries were sequenced with a single-end 50 bp run on an Illumina HiSeq 2500 platform.
For the RNA-seq library preparation, RNA was extracted using Trizol LS from the 5% of the lysate set aside before the RNase digestion. The RNA quality was assessed by Bioanalyzer analysis. 1 µg of total RNA was used for library preparation using the NEBNext Ultra Directional RNA Library Prep Kit. Ribosomal RNA was removed using the NEBNext rRNA Depletion Kit. The libraries were sequenced on the Illumina HiSeq 2500 instrument, in single-read mode, 50 bases per read.
Actinomycin D treatment
HeLa cells were plated on a 10-cm dish and each DF or combination of DF proteins was silenced as described. After five days of silencing, cells were treated with 5 µg/ml of actinomycin D or vehicle (DMSO) for 2 h to inhibit transcription. To ensure that the treatment did not affect cell viability, cells were counted before collection. Total RNA was extracted using TRIzol and 1 µg of total RNA was used to perform RNA-seq library preparation using the NEBNext Ultra Directional RNA Library Prep Kit. Ribosomal RNA was removed using the NEBNext rRNA Depletion Kit. Prior to ribosomal RNA depletion, 2 µL of a 1:100 dilution of the ERCC (External RNA Controls Consortium) RNA Spike-in Control Mix 1 was added to each 1 µg total RNA sample, as recommended by the manufacturer.
Reanalysis of publicly available PAR-CLIP data of DF1, DF2 and DF3 in HeLa cells
Data from the DF3 PAR-CLIP (GEO: GSE86214) were downloaded and used for analysis in this study. Read processing was performed using Flexbar (Dodt et al., 2012). First, the adaptor sequence reported in the GEO submission was removed using the following set-up: flexbar -r SRR509926X.fastq -f i1.8 -t SRR509926X.noadapter --max-uncalled 1 -a adapters_par-clip.fasta --pre-trim-phred 20 -n 20 --min-read-length 1. After the adaptor removal, read quality was checked using Fastqc (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). According to the Fastqc results, 40% of the reads of each DF3
Comparison of the coverage of DF proteins at each m6A site on mRNAs throughout the transcriptome.
To compare iCLIP coverage at each m6A site, we calculated the number of normalized DF iCLIP read counts at each m6A site using a previously described approach (Patil et al., 2016). We first calculated the number of unique read counts per million mapped reads obtained from the CITS analysis of each iCLIP dataset (Patil et al., 2016). In this approach, each iCLIP read represents a unique RNA-bound protein. Only m6A sites mapping to mRNAs were used in this analysis. A 5-nucleotide window from the m6A site genomic coordinate was selected and the number of iCLIP reads per million uniquely mapped reads was calculated at every m6A coordinate using BEDTools (Quinlan and Hall, 2010). The following formula was used: (r × 10^6)/R, where r = number of unique CLIP reads at the m6A window and R = total number of uniquely mapped unique iCLIP reads in the whole iCLIP library. We refer to the resulting number as ‘‘coverage at each m6A site’’ throughout the manuscript. For comparing the reanalyzed DF1 and DF2 PAR-CLIP coverage at each m6A site, we used a similar approach. For each DF1, DF2 and DF3 iCLIP and miCLIP, the enriched motif presented in Figure 1 was obtained using MEME suite (Bailey et al., 2009) as previously described (Patil et al., 2016).
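As a minimal sketch of the coverage formula (r × 10^6)/R, the Python below counts reads near a site and scales them to reads per million. It is not the BEDTools-based pipeline used in the study. The function names and example coordinates are hypothetical, and the window is assumed here to extend 2 nucleotides on each side of the m6A coordinate (5 nucleotides in total), which is one possible reading of "a 5-nucleotide window".

```python
def count_reads_in_window(read_positions, site, half_width=2):
    """Count reads whose position lies within the window centered on the site.

    half_width=2 gives a 5-nt window (site +/- 2); this centering is an
    assumption, not a detail confirmed by the methods text.
    """
    return sum(1 for pos in read_positions if abs(pos - site) <= half_width)


def coverage_per_million(reads_in_window, total_unique_mapped_reads):
    """Apply (r * 10^6) / R from the text."""
    if total_unique_mapped_reads <= 0:
        raise ValueError("library size R must be positive")
    return reads_in_window * 1_000_000 / total_unique_mapped_reads


# Made-up example: four read positions near a site at coordinate 101,
# in a library of two million uniquely mapped unique reads.
r = count_reads_in_window([100, 101, 103, 110], site=101)
coverage = coverage_per_million(r, 2_000_000)
```

Scaling by the library size R is what makes coverage values comparable between DF1, DF2 and DF3 datasets of different sequencing depth.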
For representation of iCLIP, PAR-CLIP and miCLIP tracks in Figure 1 and Figure S1, the normalized number of read counts from
iCLIP, PAR-CLIP and miCLIP datasets at the indicated genomic coordinate is presented. The reported annotated m6A sites in the
respective GEO submission were downloaded (GEO: GSE86336, GSE63753). In order to calculate the number of m6A sites per tran-
script, m6A sites were assigned to the gene transcript using MetaplotR (Olarerin-George and Jaffrey, 2017).
Analysis of the ribosome profiling and RNA seq data upon silencing of DF proteins
After the sequencing of the ribosome profiling libraries performed in this study, reads were quality-trimmed and reads shorter than 16 nucleotides were excluded. The adaptor was removed using Flexbar (Dodt et al., 2012). Demultiplexing was performed based on the experimental barcode using the pyBarcodeFilter.py script. The second part of the random barcode was then moved to the read header. After removal of the PCR duplicates based on the UMI sequence, ribosomal RNA reads were removed using STAR aligner (Dobin et al., 2013). As previously reported, a high percentage of reads in the library is represented by ribosomal RNA. To avoid possible contamination by reads representing the Ago complex recruited to the siRNA target sequence, reads were also mapped to the siRNA sequences to remove these sequences. The siRNA sequences were previously described and are reported in the GEO submission related to this previous study. Reads with no acceptable alignment to ribosomal and siRNA sequences were then mapped to the hg38 transcriptome. The STAR genome index was built using the annotation obtained from GENCODE (version 26). The following parameters were used to align to the genome: STAR --runThreadN 20 --genomeDir ./star_hg38 --readFilesIn reads_rRNA_siRNA_Unmapped.fastq --outSAMtype BAM SortedByCoordinate --sjdbGTFfile gencode.v26.protein_coding.annotation.gtf --outFilterMultimapNmax 8 --limitBAMsortRAM 10000000000 --alignEndsType EndToEnd --outWigType bedGraph --outWigStrand Stranded --outWigNorm None --quantMode TranscriptomeSAM GeneCounts --outFilterMismatchNmax 8 --outSAMattributes All --outFilterIntronMotifs RemoveNoncanonicalUnannotated. These parameters were previously used (Calviello et al., 2016). The mapped reads that represent ribosome-protected fragments (reads longer than 21 nucleotides, but shorter than 32 nucleotides) with a 40% minimum level of periodicity were considered for the subsequent analysis. We specifically considered only ribosome-protected fragment reads mapped to the coding sequence to avoid any possible contamination coming from the untranslated area of the genome. To measure the degree to which transcripts are translated independently from the initiation and termination rate, we excluded reads mapping to the first 15 nucleotides from the start codon and the 9 nucleotides before the stop codon of each coding sequence, as recommended in previous studies (Lauria et al., 2018). To precisely determine the localization of each ribosome with respect to the start and stop codon, the P-site offset was estimated as previously described using riboWaltz (Lauria et al., 2018). Reads matching all these criteria were considered to determine the number of ribosome-protected fragments per gene. Given the level of variability and the possibility of contamination of the ribosome profiling library preparation, all these criteria have been reported as essential analysis steps to better estimate the number of ribosome-protected fragments mapped to each transcript (Calviello et al., 2016; McGlincy and Ingolia, 2017). Only the longest splice isoform of each gene was considered. Genes with fewer than 10 ribosome-protected fragments were excluded from the analysis. For each DF paralog silencing experiment, TMM normalization, empirical Bayes estimate of the negative binomial dispersion, and measurement of the level of change in translation (log2 Fold Change) compared to the control condition were performed using edgeR (McCarthy et al., 2012). To account for sample-to-sample variability, all replicates were analyzed at the same time to define a unique log2 Fold Change measurement.
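The footprint selection criteria above (read length, position within the coding sequence, and a minimum count per gene) can be illustrated with a short Python sketch. It is a simplified stand-in rather than the riboWaltz and edgeR workflow: the periodicity filter and P-site offset estimation are not implemented, P-site positions are assumed to be already computed, and the function names, coordinate convention (0-based transcript coordinates, with `stop_start` marking the first nucleotide of the stop codon) and example values are hypothetical.

```python
from collections import Counter


def keep_footprint(length, p_site, cds_start, stop_start):
    """Apply the length and CDS-position criteria from the text."""
    # Longer than 21 nt but shorter than 32 nt.
    length_ok = 21 < length < 32
    # Skip the first 15 nt after the start codon position and the
    # 9 nt immediately before the stop codon.
    position_ok = cds_start + 15 <= p_site < stop_start - 9
    return length_ok and position_ok


def count_footprints(reads, cds_bounds, min_count=10):
    """Count retained footprints per gene and drop genes below min_count.

    reads: iterable of (gene, read_length, p_site_position)
    cds_bounds: dict mapping gene -> (cds_start, stop_start)
    """
    counts = Counter()
    for gene, length, p_site in reads:
        cds_start, stop_start = cds_bounds[gene]
        if keep_footprint(length, p_site, cds_start, stop_start):
            counts[gene] += 1
    return {gene: n for gene, n in counts.items() if n >= min_count}


# Made-up reads: GENE_A has 12 valid footprints plus three that fail a filter;
# GENE_B has only 3 valid footprints and is therefore excluded.
cds_bounds = {"GENE_A": (100, 400), "GENE_B": (50, 350)}
reads = [("GENE_A", 28, 150)] * 12
reads += [("GENE_A", 20, 150), ("GENE_A", 28, 105), ("GENE_A", 28, 395)]
reads += [("GENE_B", 29, 200)] * 3
gene_counts = count_footprints(reads, cds_bounds)
```

The resulting per-gene counts are the kind of table that would then be passed to a differential analysis tool such as edgeR.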
After the sequencing of the RNA-seq libraries, low-quality reads and reads shorter than 18 nt were discarded. Ribosomal reads were removed using STAR aligner. The remaining reads were mapped to the hg38 genome using STAR and the data were used to normalize the Riboseq data and derive the change in RNaseq expression. For each DF paralog silencing, TMM normalization, empirical Bayes estimate of the negative binomial dispersion, and measurement of the level of change in RNA expression (log2 Fold Change) compared to the control condition were performed using edgeR. To account for sample-to-sample variability, all replicates were analyzed at the same time to define a unique log2 Fold Change measurement.
For the translational efficiency measurement, the Riboseq change was compared to the RNaseq change.
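One common way to make this comparison, shown here only as a hedged sketch, is to subtract the RNA-seq log2 fold change from the ribosome profiling log2 fold change for each gene. The text says only that the two changes were "compared", so the subtraction is an assumption, and the function name and values below are hypothetical.

```python
def translational_efficiency_change(ribo_log2fc, rna_log2fc):
    """Return log2FC(Ribo-seq) - log2FC(RNA-seq) for genes present in both.

    Both inputs map gene name -> log2 fold change versus control.
    """
    shared_genes = ribo_log2fc.keys() & rna_log2fc.keys()
    return {gene: ribo_log2fc[gene] - rna_log2fc[gene] for gene in shared_genes}


# Made-up fold changes; GENE_C lacks Ribo-seq data and is dropped.
ribo = {"GENE_A": 1.0, "GENE_B": -0.5}
rna = {"GENE_A": 0.25, "GENE_B": -0.5, "GENE_C": 2.0}
te_change = translational_efficiency_change(ribo, rna)
```

Under this convention, a value near zero (as for GENE_B) means that the change in ribosome footprints is explained by the change in mRNA abundance alone.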
Reanalysis of publicly available ribosome profiling data upon silencing of DF1 in HeLa cells
To analyze the publicly available ribosome profiling data upon DF1 silencing, we used two independent approaches. First, we downloaded the table ‘‘GSE63591_C-Y1-Ribosome_profiling.xlsx’’ available at GEO: GSE63591 and assigned to each reported transcript a number of m6A sites. To define whether m6A mRNAs were affected by the DF1 silencing, we used the translational efficiency values reported in the table without performing any further processing step. Second, the original sequencing data of GEO: GSE63591 were downloaded and processed as follows. Quality check was performed using Flexbar. After quality check, sequences longer than 21 nt were first mapped to the siRNA sequences reported in the published study (Wang et al., 2015), to the ribosomal RNA, and then to the hg38 transcriptome. All mapping steps were performed using STAR as mentioned above. Reads mapping to the coding sequence that represent ribosome-protected fragments and have a high level of periodicity were considered for the analysis steps as described above. We excluded reads mapping to the first 5 codons from the start codon and the 3 codons before the stop codon of each coding sequence as described above.
We noticed a major difference between the original publicly available table and the new tables generated by our reanalysis. In the
case of Replicate 2, the number of ribosome-protected fragments assigned to the DF1 mRNA in the processed table by Wang et al.
(2015) showed a significant reduction in the number of ribosome-protected fragments for the DF1-silenced sample compared to the
matching control sample. This is expected for the DF1 knockdown condition. When we reanalyzed the original sequencing reads from
GEO: GSE63591, we reproduced this loss of ribosome-protected fragments assigned to DF1 upon DF1 silencing. In contrast, the
number of ribosome-protected fragments assigned to the DF1 mRNA in Replicate 1, as shown in the processed table
‘‘GSE63591_C-Y1-Ribosome_profiling.xlsx,’’ shows a substantially higher number of ribosome-protected fragments for the DF1-
silenced sample compared to the matching control sample (15-fold increase). This would indicate a condition of overexpression
instead of knockdown. We therefore examined the raw sequencing reads for Replicate 1 to generate a new table of ribosome-pro-
tected fragments. Upon reanalysis of the Replicate 1 ribosome-profiling data (Table S2), the number of ribosome-protected frag-
ments mapped to DF1 mRNA was significantly lower than the number indicated in the published processed table ‘‘GSE63591_C-
Y1-Ribosome_profiling.xlsx.’’ In the reanalysis, we observed 98% fewer ribosome-protected fragments mapped to DF1 than in the matching control sample. This is consistent with the raw data representing a DF1 knockdown condition rather than an overexpression condition.
We used a standard protocol for determining ribosome-protected reads, which inherently discards any reads that derive from the
siRNA itself because these fragments would be too short to be a ribosome-protected fragment, as described above (Calviello et al.,
2016; McGlincy and Ingolia, 2017). It is possible that the large number of ribosome-protected fragments that map to DF1 in the pub-
lished processed table for Replicate 1 derive from DF1-specific siRNA molecules that were cloned into the library but not removed, if
there was no appropriate size filter used in the alignment protocol. The length, sequence and alignment position of each ribosome-protected
fragment are not provided in the processed table. Alternatively, the ribosome-protected reads assigned to DF1 could derive
from a sample which contains overexpressed DF1 mRNA. Because of this uncertainty, we generated new DF1 knockdown ribosome
profiling datasets in the same cell line. The pipeline used for the reanalysis of these datasets is reported (see ‘‘Data and Code
Availability’’).
Supplemental Figures
Figure S1. DF Paralogs Bind the Same m6A Sites throughout the Transcriptome, Related to Figure 1
(A) The YTH domains of DF1, DF2 and DF3 exhibit high sequence homology. Shown is a detailed representation of the aligned amino acid sequences of the YTH
domains of DF1, DF2 and DF3. Amino acids previously described to be essential for the m6A binding based on the DF1- m6A RNA and DF2- m6A RNA crystal
structures (Li et al., 2014; Xu et al., 2015) (PDB: 4RCJ, 4RDN) are highlighted. Pink indicates the three tryptophan residues that form the aromatic cage
surrounding m6A. Green and Blue indicate amino acids that make additional points of contact between the YTH domain and the nucleotides adjacent to m6A. Of
note, each YTH-DF domain shares each of these amino acids. Shown below in a yellow color code scheme is the level of conservation of every amino acid among
the three YTHDF proteins with a range from 1 (low conservation) to 10 (high conservation) as predicted by the Clustal Omega alignment algorithm (Madeira et al.,
2019). Most residues (88%) are fully conserved across all DF paralogs, as indicated by a conservation score of 10. Importantly, amino acids essential
for recognizing m6A and adjacent residues in RNA are fully conserved. As indicated in Figure 1A, amino acids with a conservation score lower than 10 are located
on the opposite side of the YTH domain away from the RNA-binding pocket, suggesting that they might not affect the YTH-m6A RNA interaction.
(B) DF1, DF2 and DF3 have similar binding affinity to m6A-modified mRNA. Previous studies suggest that DF1, DF2 and DF3 bind with similar affinity to m6A-
modified mRNA (Patil et al., 2018; Wang et al., 2014a; Xu et al., 2015). To compare the affinities side-by-side, full length DF1, DF2 and DF3 were prepared as
recombinant proteins in E. coli and the affinity to an m6A-modified RNA was measured by microscale thermophoresis (MST). Shown is the percentage of protein
bound (fraction bound) at increasing concentration of the m6A-RNA. These data show that the three DFs have similar in vitro binding affinity to m6A. Data are
represented as mean ± s.d. (n = 2 replicates).
(C) The previously reported (Shi et al., 2017) level of overlap of the DF1, DF2, and DF3 targets is summarized. In these previous studies, the targets of each DF
paralog were first identified by PAR-CLIP (Shi et al., 2017; Wang et al., 2014a, 2015). Then, to determine which of these mRNAs represent valid DF-mRNA in-
teractions, RIP (immunoprecipitation of the DF paralog, followed by reverse transcription and sequencing of the bound RNA) was performed. An overlap
approach was used to identify targets of each DF paralog, and subsequently to define DF1, DF2 or DF3 unique targets or common targets. This analysis resulted
in the conclusion that only a minority of mRNAs (23.98%) are bound by all DF paralogs, as shown by the pie chart. Most of the mRNAs were annotated to be either
unique targets of one DF or shared by two DFs. The finding that different DF proteins bind distinct subsets of m6A-containing mRNAs in the transcriptome has
been the foundation of the current model of how DFs differentially regulate different m6A mRNA cohorts (Han et al., 2019; Liu et al., 2018; Park et al., 2019; Shi
et al., 2017, 2018, 2019).
(D) Read quality analysis of the previously performed DF1, DF2 and DF3 PAR-CLIP datasets. For each PAR-CLIP dataset (Shi et al., 2017; Wang et al., 2014a,
2015), reads are classified based on the following quality parameters: read length (yellow, whether a read is shorter than 18 nucleotides, and thus cannot be
aligned to the genome), presence of PCR duplicates (blue, duplicates category), and mappability to the genome using Bowtie and the previously reported
parameters (categorized into ‘‘mappable’’ (green) or ‘‘fail to map’’ (light blue) reads). The DF3 PAR-CLIP dataset was unusual for several reasons. First, 44.11% of
all reads mapped to a single site in a single gene, MT-RNR2 (STAR Methods). These reads were identical, consistent with a PCR duplication event and were not
found in the DF1 or DF2 PAR-CLIP datasets or any miCLIP datasets. Second, of the remaining reads, only a small percentage of these were mappable (indicated
in green in the 10 × 10 dot plot). Overall, the DF3 PAR-CLIP dataset lacks sufficient read depth to detect endogenous DF3-binding sites in the transcriptome as
efficiently as the other PAR-CLIP datasets.
(E) Schematic representation of the method used to identify m6A sites bound by one DF or shared by two or more DFs. Scatterplots are used to represent the
pairwise comparison of the number of normalized DF reads calculated from the iCLIP or PAR-CLIP datasets at each m6A site. Yellow areas indicate where m6A
sites that uniquely bind one DF should be located. For instance, DF1-unique sites with a high level of coverage for DF1 (high number on x axis) and a low level of
coverage for DF2 (low number on the y axis) will be shown in the lower right area of the scatterplot. m6A sites bound by two or more DFs with a similar level of
binding for each DF will be found in the indicated light red area.
(F) Essentially all m6A sites show highly correlated binding of DF1 and DF2 based on previous PAR-CLIP datasets (Shi et al., 2017; Wang et al., 2014a, 2015). As
described in (E), to identify m6A sites that preferentially bind DF1 or DF2, their level of binding was quantified at each m6A site in the transcriptome (Ke et al., 2017).
In this experiment, we used the previously reported PAR-CLIP datasets prepared in HeLa cells (Shi et al., 2017; Wang et al., 2014a, 2015). According to the
previous analysis (Shi et al., 2017; Wang et al., 2014a, 2015), nearly 50% of m6A sites should be uniquely bound by one DF. However, few if any m6A sites show
evidence of disproportionate binding of one DF paralog compared to the other. Additionally, the high Pearson correlation coefficients (r) show that DF1 and DF2
have highly similar binding preferences for each m6A site in the transcriptome of HeLa cells. Similar results are shown in Figure 1C using iCLIP data for DF1, DF2
and DF3 in HEK293T cells (Linder et al., 2015; Patil et al., 2016). Thus, the binding of all m6A sites to each DF paralog is not cell-type specific, and not specific to
the type of CLIP assay used (PAR-CLIP versus iCLIP).
(G, H) DF1-unique sites show extensive levels of bound DF2, and vice versa. Previous studies show that nearly half of all m6A sites are uniquely bound by only one
DF paralog (Shi et al., 2017; Wang et al., 2014a, 2015). To confirm whether DF1-unique mRNAs are indeed uniquely bound by DF1, we asked if there are markedly
fewer DF2 PAR-CLIP reads than DF1 PAR-CLIP reads located at DF1-unique sites. For this analysis, we selected the previously described DF1-uniquely bound
sites and DF2-uniquely bound sites (Shi et al., 2017; Wang et al., 2014a, 2015). Shown is the DF1 and DF2 PAR-CLIP coverage at DF1-unique sites (G) and at DF2-
unique sites (H). For each site, PAR-CLIP DF1 or DF2 coverage is shown, with each row representing the coverage for a different unique site and its surrounding
area. The rows are ordered based on the degree of DF1 PAR-CLIP coverage (G) or DF2 PAR-CLIP coverage (H). In principle, DF1-unique sites should show a higher
level of DF1 PAR-CLIP coverage than DF2. However, the PAR-CLIP coverage for both DF1 and DF2 appears largely identical at both DF1-unique sites and DF2-
unique sites. We were unable to define the DF3 coverage level at each site given the low number of mappable reads of the DF3 PAR-CLIP dataset, as discussed in
Figure S1C.
(J) DF1 and DF2 PAR-CLIP read tracks are highly similar even on transcripts thought to be DF1-unique or DF2-unique (Shi et al., 2017; Wang et al., 2014a, 2015).
DF1-unique and DF2-unique transcripts should show unique peaks or patterns of DF1 and DF2 binding. However, we found that transcripts generally appear to
show identical PAR-CLIP read coverage along the entire length of the transcript body. When inspecting different mRNAs, differences in PAR-CLIP coverage were
only seen if read coverage was low and therefore susceptible to read noise. Shown is the normalized read distribution of DF1 PAR-CLIP (in blue) and DF2 PAR-
CLIP (in green) on OGT, an mRNA previously thought to be a DF1-unique target, and on DUSP1, an mRNA previously thought to be a DF2-unique target. The
‘‘called’’ DF peaks were previously reported by Wang et al. (2015) and Wang et al. (2014a) based on the statistical peak-finding algorithm PARalyzer v1.1 and are
presented below their respective tracks. Each row represents called peaks in a different PAR-CLIP replicate. Notably, the location of peaks and the relative
heights of each peak are similar for DF1 and DF2 on both transcripts. All the peaks overlap substantially with called m6A sites in HeLa cells (Ke et al., 2017). In the
original DF1, DF2, and DF3 mapping studies, the PAR-CLIP called sites were overlapped with a list of target mRNAs immunoprecipitated using the RIP protocol
(Shi et al., 2017; Wang et al., 2014a, 2015). Compared to the total number of m6A-containing mRNAs or the total number of mRNAs containing DF1 or DF2 PAR-
CLIP peaks, only a few mRNAs were successfully immunoprecipitated using this method (1747 for DF1, 1592 for DF2, 2080 for DF3). Therefore, the use of this
approach makes the final list of DF-specific mRNA targets highly dependent on this low efficiency mRNA immunoprecipitation method.
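The pairwise comparison described in (E) and (F) can be illustrated with a minimal sketch. It assumes that normalized read counts for two DF paralogs are already available per m6A site as two equal-length lists. The function names, the pseudocount, and the 4-fold ratio cutoff are hypothetical choices made for illustration and are not taken from the original analysis.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def classify_sites(df1_reads, df2_reads, fold_cutoff=4.0, pseudocount=1.0):
    """Label each m6A site by the ratio of DF1 to DF2 normalized reads.

    fold_cutoff and pseudocount are illustrative assumptions, not values
    used in the paper.
    """
    labels = []
    for a, b in zip(df1_reads, df2_reads):
        ratio = (a + pseudocount) / (b + pseudocount)
        if ratio >= fold_cutoff:
            labels.append("DF1-biased")
        elif ratio <= 1.0 / fold_cutoff:
            labels.append("DF2-biased")
        else:
            labels.append("shared")
    return labels

# Toy normalized read counts at five m6A sites (made-up numbers).
df1 = [10.0, 20.0, 30.0, 40.0, 50.0]
df2 = [12.0, 18.0, 33.0, 41.0, 47.0]
r = pearson_r(df1, df2)
labels = classify_sites(df1, df2)
```

With highly correlated coverage, as in the toy data, every site falls in the shared class, which mirrors the pattern described for the light red area in (E).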
Figure S2. DF Proteins Have Similar Effector Domains and Protein-Protein Interactions, Related to Figure 2
(A) DF1, DF2 and DF3 exhibit generally high sequence identity and similarity in their effector domain. In Figure S1, we examined the high sequence identity of all
DFs in their YTH domain. Shown here is a schematic representation of the aligned amino acid sequence of the remaining part of the DF1, DF2 and DF3 protein.
There was overall similarity along the entire length of the effector domain, with small regions where there were amino acid differences. These small differences
may account for the previously described differences of DF function, which are examined in this study. Residues are classified according to the ClustalW Multiple
Sequence Alignment score (Madeira et al., 2019) based on whether the residue is exactly identical among all DFs, or ‘‘strongly and weakly similar’’ residues (the
amino acids share higher or lower level of similarity in their physicochemical properties among all DFs). Shown in the table is the overall percentage of sequence
identity and similarity calculated by the pairwise sequence EMBOSS Needle aligner (Madeira et al., 2019). The pairwise comparisons of the DF amino acid
sequences confirm the high sequence similarity and identity of all DFs.
(B) High-confidence interactors of each DF paralog generally show high-confidence interactions with the other DF paralogs. As described in Figure 2B, high-
confidence interactors of each DF paralog in vivo were identified previously using a Bio-ID proteomic approach (Youn et al., 2018). The top 25 interacting proteins
for each DF were previously determined based on their length-normalized spectral counts and the average probability of interaction calculated across different
replicates (Youn et al., 2018). The heatmap shows the AvgP, average probability of interaction, for each of the top 25 interactors of DF1, DF2 and DF3. The majority
of the top 25 interactors have a high average probability of interaction with all three paralogs. These proteins include the main components of the CCR4-NOT
RNA-degradation complex and proteins previously identified as stress granule components. A recent study showed that all three DF proteins bind CNOT1 and all
three can elicit mRNA deadenylation when tethered in cells to a reporter mRNA (Du et al., 2016), supporting the overall finding that the major function of the DF
proteins is to mediate degradation of m6A-mRNAs. Interestingly, interactions with components or repressors of the translation process may suggest a possible
involvement of DF proteins in the translational repression of m6A mRNAs. This repression may then be interconnected to the observed DF-dependent regulation
of m6A-mRNA stability. Shown at the bottom is the AvgP of interaction of each DF with the other DF paralogs.
(C) The Bio-ID DF1 interactome data (Youn et al., 2018) did not identify eIF3A and eIF3B as high-confidence DF1 interactors. However, in Wang et al. (2015), eIF3A and
eIF3B were identified in FLAG-tagged DF1 immunoprecipitates by mass spectrometry. Shown is the reported score and number of unique
peptides mapped to each putative DF1 interactor from Wang et al. (2015). However, even in this study, eIF3A and eIF3B are not among the top DF1 interactors
when ranking all the interactors by the reported score. Moreover, as shown in red, the number of unique peptides is less than 5. This low level of interaction is
consistent with the Bio-ID interactome studies presented in Figure 2B. Thus, eIF3A and eIF3B are not seen as DF1 interactors in the Bio-ID study (Youn et al.,
2018) and are weak interactors in the Wang et al. (2015) study.
(D) Heterologous expression of DF proteins causes formation of DF stress granule-like structures. Plasmids expressing FLAG-3xHA tagged DF proteins, or a
control FLAG-3xHA construct were transfected in HeLa cells and staining was performed 24 h after transfection. Areas where the HA staining (green) overlaps
with the DF staining (red) indicate the FLAG-3xHA tagged DF localization (yellow). As indicated by the arrows, the FLAG-3xHA tagged protein formed granule-like
structures of different sizes in some, but not all, cells. This phenomenon occurs with proteins with low-complexity domains (Alberti et al., 2019). As shown by
the absence of these structures in the control-transfected cells (FLAG-3xHA), the granule-like structures are not an artifact of transfection. Thus, expressing DF
proteins may lead to uninterpretable results due to the variable formation of protein aggregates that could act to sequester DF proteins and mask their function.
Scale bar, 20 μm.
Figure S3. Analysis and Validation of RNA-Seq Data Obtained upon Single, Double, and Triple DF Protein Silencing in HeLa Cells, Related to
Figure 3
(A) Validation of DF1, DF2 and DF3 knockdown upon silencing of each of the DF paralogs alone or together. Western blotting was used to confirm knockdown
using the indicated DF paralog-specific antibodies on each sample used to perform RNA-seq and ribosome profiling analyses. HeLa cells were transfected with
the same total concentration (30 nM) of siRNA in each condition. Western blotting was performed 4 days after transfection. In each silencing condition, the siRNA
sequences were specific to the DF paralog of interest, as shown by the selective knockdown observed with each siRNA. GAPDH was used as loading control.
(B) Relative quantification of the DF1, DF2 and DF3 protein band intensities shown in (A). The western blot chemiluminescent signal of every band was measured using
ImageLab Image tools and normalized to the GAPDH loading control. Knockdown of any of the DF paralogs was associated with a compensatory increase in the
expression of the other paralogs. The number of dots indicates the number of western blot replicates for each condition. Error bars indicate standard deviation.
(C) Reproducibility of the RNA-seq library replicates. For each silencing condition tested (DF1, DF2, DF3 and triple silencing), three independent replicates were
performed. The Pearson correlation coefficient of the normalized number of mapped reads across replicates was calculated and presented in each heatmap.
(D) DF1, DF2 and DF3 RNA and protein levels in HeLa cells measured by RNA-seq and Ribo-seq. The normalized counts of reads mapped to each paralog after
RNA-seq (mRNA) and ribosome profiling (RPFs, ribosome-protected fragments) were used as a proxy of the mRNA expression and protein expression levels,
respectively. As shown by the heatmap, DF1 has mRNA levels similar to DF2. However, DF2 is the most highly translated DF paralog. This is expected to result in a higher
amount of DF2 protein in HeLa cells. Thus, at least in HeLa cells, DF2 is likely to be the most abundant DF paralog.
(E) Validation of the simultaneous knockdown of DF1 and DF2, DF1 and DF3, DF2 and DF3 four days after siRNA transfection. Cells were transfected with siRNA
on Day 1 and Day 3, and then western blotted to detect levels of DF1, DF2, or DF3 on Day 4. Knockdown of two DF paralogs was associated with a modest
compensatory increase in the expression of the remaining DF paralog. GAPDH was used as loading control.
(F-H) Cumulative distribution plots related to Figure 3E. In Figures 3A–3D, the cumulative distribution plots were shown only for the single knockdown experiments
and the triple knockdown experiment. Here we show the cumulative distribution plots for the double knockdown experiments. The abundance of each mRNA
(based on RNA-seq counts) was compared between the control and the indicated double DF silenced condition. mRNAs were binned based on the number of
annotated m6A sites. m6A mRNAs show higher expression levels compared to non-methylated mRNAs upon silencing any two DF paralogs. In F-H, each m6A
binned group was compared to the non-methylated mRNA group using a two-tailed Mann–Whitney test. Only significant p values are shown in each graph and
the exact p values are reported. n = 3 biological replicates.
(I) Reproducibility of the RNA-seq library replicates. For each silencing condition tested (DF1-DF2, DF1-DF3, DF2-DF3), three independent replicates were
performed. The Pearson correlation coefficient of the normalized number of reads across replicates was calculated and presented in each heatmap.
(J) RT-qPCR validation of the increase in stability of m6A mRNAs upon the silencing of the three paralogs. Highly methylated mRNAs (ZNF503 – 4 annotated m6A
sites, SGK1 – 4 annotated m6A sites, ID3 – 7 annotated m6A sites) were selected for validation. Transcripts lacking annotated m6A sites (PP1E3C and RPS28)
were chosen as controls. The stability of m6A-modified mRNAs and non-methylated mRNAs was determined by quantifying by RT-qPCR the mRNA levels
immediately before (0 h), 1 h and 2 h after actinomycin D treatment. The amount of mRNA detected at each time point is shown as percentage of the total mRNA
amount quantified at time 0 and normalized to RPS28 abundance. The increase in mRNA stability is most apparent when all three DF paralogs were knocked
down. (n = 2 biological replicates ± s.d.).
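The comparison used in (F-H), in which each m6A bin is tested against the non-methylated group with a two-tailed Mann–Whitney test, can be sketched as follows. This is a minimal illustration under stated assumptions: the bin boundaries (0, 1, 2, 3, 4+ sites), the gene identifiers, and the fold-change values are hypothetical, and the p value uses the normal approximation with tie and continuity corrections rather than whatever implementation was used for the published analysis.

```python
import math

def mann_whitney_u(x, y):
    """Two-tailed Mann-Whitney U test (normal approximation).

    Returns (U statistic for x, approximate two-tailed p value).
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        t = j - i                      # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = max((abs(u_x - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    return u_x, min(math.erfc(z / math.sqrt(2.0)), 1.0)

def bin_by_m6a(log2fc_by_gene, m6a_sites_by_gene):
    """Group log2 fold changes by number of annotated m6A sites.

    The bins 0, 1, 2, 3, 4+ are an assumption made for this sketch.
    """
    bins = {}
    for gene, lfc in log2fc_by_gene.items():
        n_sites = m6a_sites_by_gene.get(gene, 0)
        key = "4+" if n_sites >= 4 else str(n_sites)
        bins.setdefault(key, []).append(lfc)
    return bins

# Toy log2 fold changes (knockdown vs. control) for hypothetical genes.
log2fc = {"g1": -0.1, "g2": 0.0, "g3": 0.05, "g4": -0.05,
          "g5": 0.6, "g6": 0.8, "g7": 0.7, "g8": 0.9}
sites = {"g5": 4, "g6": 5, "g7": 7, "g8": 4}
bins = bin_by_m6a(log2fc, sites)
u, p = mann_whitney_u(bins["4+"], bins["0"])
```

With only four values per group the normal approximation is rough; an exact test would be preferable for groups this small.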
Figure S4. Reanalysis of Previously Performed Ribosome Profiling Data Shows that DF1 and DF3 Do Not Control the Translational Efficiency
of m6A-mRNAs, Related to Figure 4
(A) Unlike PABPC1, DF1 is not enriched in the polysomal fraction. DF1 has been proposed to enhance translation by facilitating loading of eIF3 to mRNA 5′ UTRs.
In this model, DF1 binds to an mRNA 3′ UTR, and binds eIF3. This can create an mRNA loop in which the 3′ UTR and 5′ UTR are connected via the DF1-eIF3
complex. This is similar to PABPC1, which also binds to mRNA 3′ UTRs and binds a translation initiation factor, which results in mRNA looping. PABPC1
function in enhancing mRNA translation is evidenced by its enrichment in the sucrose fractions occupied by mRNAs containing high levels of actively translating
ribosomes (polysomes). We therefore tested if DF1 is also enriched in the polysome fractions by treating HeLa cells with cycloheximide and fractionating
polysomes on a sucrose gradient. The RNA fractions indicated here are related to the UV trace as shown in Figure 4A. DF1 is not enriched in these fractions based
on paralog-specific western blotting (also see Figure 4A). In contrast, PABPC1 is readily detected in the actively translating ribosomal fractions. The ribosomal
protein RPS6 was used to confirm the efficiency in recovering the translating ribosomes from the polysomal sucrose fractions. Thus, these data do not support a
model in which DF1 enhances the translation of m6A-mRNAs by binding to highly translated mRNAs.
(B) m6A-mRNAs are downregulated upon depletion of DF1, based on ribosome profiling datasets provided in Wang et al. (2015). Here, we used two ‘‘processed’’
datasets provided in the GEO submission for Wang et al. (2015) (GSE63591_C-Y1-Ribosome_profiling.xlsx). These datasets are referred to as processed since
the number of ribosome-protected fragments, RNA-Seq data, and translational efficiency (TE) are listed for each transcript based on the authors’ calculations and
analysis of their next-generation sequencing data. We used this processed data to calculate the change in translational efficiency of mRNAs upon DF1 silencing
based on the number of annotated m6A sites. As can be seen, m6A-modified mRNAs show reduced translation upon DF1 knockdown. This result is essentially
identical to the cumulative distribution plots presented in Wang et al. (2015). Notably, the data presented here is based on the average of two replicates provided
by Wang et al. (2015), designated as Replicate 1 and Replicate 2.
(C) Replicate 1 shows a more prominent reduction in m6A-mRNA translation upon depletion of DF1. The effect seen in Replicate 1 is more pronounced than the
effect seen in (B), which is the average of Replicate 1 and Replicate 2. Only significant p values are shown in each graph and the exact p values are reported.
(D) Replicate 2 shows no reduction in m6A-mRNA translation efficiency upon depletion of DF1. Surprisingly, and in contrast to Replicate 1 (shown in (C)), Replicate
2 does not show a drop in m6A mRNA translation efficiency upon depletion of DF1. The effect seen in (B) arose because it is the average of Replicate 1 and
Replicate 2, which show different effects on m6A mRNA. In B-D, each m6A group was compared to the non-methylated mRNA group using a two-tailed Mann–
Whitney test. Only significant p values are shown in each graph and the exact p values are reported.
(E) The reanalyzed ribosome profiling datasets do not show reduced m6A-mRNA translation efficiency upon depletion of DF1. In this experiment, we accessed the
original next-generation sequencing data generated by Wang et al. (2015) (GEO: GSE63591) and generated a new table of ribosome-protected fragments using a
standard ribosome-profiling analysis pipeline (McGlincy and Ingolia, 2017). We refer to these reanalyzed datasets as ‘‘Replicate R1’’ and ‘‘Replicate R2.’’ mRNAs
were binned based on the number of annotated m6A sites. In contrast to the results presented in (B), analysis of the two replicates (presented as an average)
shows that m6A mRNAs do not show reduced translation efficiency upon DF1 silencing.
(F) The reanalyzed ribosome profiling sample, i.e., Replicate R1, no longer shows a link between m6A mRNA translation and DF1. In contrast to the original
processed data presented in (C), reanalysis of the underlying next-generation sequencing data from the ribosome profiling experiments in Wang et al. (2015) no longer
shows decreased m6A-mRNA translation upon depletion of DF1.
(G) Replicate R2, like the original Replicate 2 shown in (D), shows no change in m6A-mRNA translation efficiency upon depletion of DF1. In E-G, each m6A binned
group was compared to the non-methylated mRNA group using a two-tailed Mann–Whitney test. Only significant p values are shown in each graph and the exact
p values are reported.
(H) The number of ribosome-protected fragments per mRNA reported by Wang et al. (2015) for Replicate 1 only modestly correlates to the number of ribosome-
protected fragments of Replicate 2. Both the control replicates, and the DF1-depleted replicates were compared. For this analysis, we used the number of
ribosome-protected fragments reported for the two replicates in the GEO submission for Wang et al. (2015) (GSE63591_C-Y1-Ribosome_profiling.xlsx). These
datasets are referred to as ‘‘processed’’ since the number of ribosome-protected fragments is listed based on the authors’ analysis of their next-generation
sequencing raw data. For each tested condition (transfection with a control siRNA or with siRNAs targeting DF1), a plot showing the comparison between the
number of ribosome-protected fragments calculated for Replicate 1 and Replicate 2 is presented. A Pearson correlation was calculated for each comparison. To
reduce the complexity of the analysis, only the ribosome-protected fragments mapped to the m6A-mRNAs are presented.
(I) Upon our re-examination of the raw data reported by Wang et al. (2015), the number of ribosome-protected fragments per mRNA in Replicate 1 strongly
correlates with the number of ribosome-protected fragments of Replicate 2. This analysis was performed for both the control and DF1-depleted samples. Here,
we accessed the original next-generation sequencing data generated by Wang et al. (2015) (GEO: GSE63591) and generated a new table of ribosome-protected
fragments per gene using a standard ribosome-profiling analysis pipeline (McGlincy and Ingolia, 2017). We refer to these as ‘‘Reanalysis of the data from Wang
et al. (2015).’’ As in (H), a plot showing the comparison between the number of ribosome-protected fragments calculated for Replicate 1 and Replicate 2 is
presented for each tested condition. In contrast to (H), there is a high level of correlation between replicates, independent of the tested condition. Such correlation is
generally expected when analyzing replicates of the same condition and gives a higher level of confidence in interpreting the possible effect of DF1 on the translation of
m6A genes.
(J) m6A-mRNAs do not show a reduction in translation efficiency upon depletion of DF3. We asked if m6A-mRNA translation is affected upon depletion of DF3,
using the ribosome profiling datasets generated by Shi et al. (2017). Unlike in the DF1 knockdown dataset, the processed data provided by Shi et al. (2017)
presented the ribosome-protected fragments from two replicates as a log2-fold-change relative to a single control ribosome profiling experiment. mRNAs were
binned based on the number of annotated m6A sites. Here, we saw no change in translation of m6A-annotated mRNAs using the two knockdown replicates
provided in Shi et al. (2017).
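The quantity compared throughout (B-J), the change in translational efficiency (TE) upon knockdown, can be sketched as follows. This assumes the conventional definition of TE as the ratio of ribosome-protected fragment counts to mRNA counts for the same transcript; the pseudocount, the function names, and the gene identifiers are hypothetical, and the processed tables in the cited GEO submissions may have been computed differently.

```python
import math

def translation_efficiency(rpf, mrna, pseudocount=1.0):
    """TE as normalized RPF counts over normalized mRNA counts.

    The pseudocount avoids division by zero and is an illustrative choice.
    """
    return (rpf + pseudocount) / (mrna + pseudocount)

def log2_te_change(ctrl, kd):
    """log2(TE knockdown / TE control) for genes present in both samples.

    ctrl and kd map gene -> (RPF count, mRNA count).
    """
    changes = {}
    for gene in ctrl.keys() & kd.keys():
        te_ctrl = translation_efficiency(*ctrl[gene])
        te_kd = translation_efficiency(*kd[gene])
        changes[gene] = math.log2(te_kd / te_ctrl)
    return changes

# Toy counts for two hypothetical transcripts.
ctrl = {"gene_a": (99.0, 49.0), "gene_b": (199.0, 99.0)}
kd = {"gene_a": (49.0, 49.0), "gene_b": (199.0, 99.0)}
changes = log2_te_change(ctrl, kd)
```

In this toy example gene_a loses half of its ribosome footprints at constant mRNA abundance, giving a log2 TE change of -1, while gene_b is unchanged. The per-gene values would then be binned by the number of annotated m6A sites before plotting cumulative distributions.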
Figure S5. Quality Control Analysis of Ribosome Profiling Datasets and Validation of DF Paralog Silencing in MOLM-13 Cells, Related to
Figures 4 and 5
(A) Reproducibility of the ribosome profiling library replicates. For each silencing condition tested (DF1, DF2, DF3), three independent replicates were performed.
The Pearson correlation coefficient of the normalized number of ribosome-protected fragment reads across replicates was calculated and presented in each
heatmap.
(B) Validation of ribosome profiling datasets. Distribution of read length in the ribosome profiling experiments in each tested condition (DF1, DF2 and DF3
silencing). Most of the reads are 29-30 nt in length, which reflects the expected size of the ribosome footprint when a ribosome is translating on the mRNA and the
A site is occupied (McGlincy and Ingolia, 2017; Wu et al., 2019). The samples were prepared in the absence of cycloheximide as recommended in the most current
ribosome profiling protocols (McGlincy and Ingolia, 2017). The percentage of P-sites assigned to the coding sequence, 3′ UTR, or the 5′ UTR is presented as a bar
plot for each condition. As expected, most of the reads map to the coding region. (lower right) The position of ribosome footprints relative to the reading frame was
determined for each dataset. Overall, these analyses indicate that a high fraction of the reads obtained in these datasets represent ribosome-protected fragments
that reflect the position of the translating ribosomes in cells.
(C) Validation of the DF1, DF2 and DF3 knockdown. To confirm knockdown of the DF paralogs, we quantified the number of ribosome-protected fragments (RPFs)
that mapped to each transcript before and after the silencing of each DF paralog. Upon the silencing, the reduction in the RPFs confirms that DF1, DF2 and DF3
are less translated, consistent with efficient knockdown. These results are consistent with the western blot validation presented in Figures S3A and S3B.
(D) Effect of DF1 depletion on the distribution of m6A-modified mRNAs (OSGIN2, CDC27 and YY1) and non-m6A-modified mRNA (RPS28) along the sucrose
gradient. To independently test whether m6A-mRNAs are less translated, we performed polysome fractionation on a sucrose gradient of DF1-silenced and
control siRNA-treated cytoplasmic lysates (upper graph). The levels of m6A-mRNAs reported to exhibit markedly reduced translation upon DF1 silencing
(YY1 and OSGIN2, 5.6 log2 fold change; CDC27, 3.66 log2 fold change; Wang et al., 2015) were measured in each fraction by qRT-PCR. The quantity of each
mRNA per fraction was normalized to the amount of spike-in luciferase mRNA in each fraction and presented as percentage of the total amount measured in all
fractions. As shown in the lower graphs, DF1 silencing does not affect the distribution of m6A-mRNAs along the gradient. Thus, the effect previously shown by the
ribosome profiling for the top-regulated m6A-mRNAs cannot be independently validated using polysome fractionation and qPCR analysis. Data are means ±
s.e.m. (n = 2).
(E) Reproducibility of the DF1, DF2, and DF3 triple knockdown ribosome profiling dataset. As shown in (A), the Pearson correlation coefficient of the normalized
number of ribosome-protected fragments across replicates was calculated and presented in a heatmap.
(F) Validation of the ribosome profiling dataset obtained upon depletion of DF1, DF2, and DF3. As presented in (B), shown are the distribution of the ribosome
profiling read lengths, the percentage of P-sites assigned to the coding sequence, 3′ UTR, and 5′ UTR, and the position of ribosome footprints relative to the
reading frame. Overall, these analyses indicate that a high fraction of the reads obtained in this dataset represent ribosome-protected fragments that reflect the
position of the translating ribosomes in cells.
(G) Confirmation of DF1, DF2 and DF3 knockdown in MOLM-13 cells. Western blotting was performed 5 days after shRNA transduction. GAPDH was used
as loading control.
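The normalization described in (D), in which the quantity of each mRNA per sucrose fraction is normalized to the spike-in luciferase mRNA and expressed as a percentage of the total across fractions, can be sketched as follows. The function name and the input values are hypothetical, and the inputs are assumed to be relative quantities already derived from the qRT-PCR measurements.

```python
def fraction_percentages(mrna_per_fraction, spike_in_per_fraction):
    """Percentage of an mRNA found in each gradient fraction.

    Each fraction is first normalized to its spike-in signal, then divided
    by the sum over all fractions.
    """
    normalized = [m / s for m, s in zip(mrna_per_fraction, spike_in_per_fraction)]
    total = sum(normalized)
    return [100.0 * value / total for value in normalized]

# Toy relative quantities across four fractions (made-up numbers).
mrna = [2.0, 4.0, 6.0, 8.0]
spike_in = [2.0, 2.0, 2.0, 2.0]
percentages = fraction_percentages(mrna, spike_in)
```

Because each profile sums to 100%, distributions from control and DF1-silenced lysates can be overlaid directly to look for a shift between light and heavy fractions.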
Article
Correspondence
z.takats@imperial.ac.uk (Z.T.),
george.poulogiannis@icr.ac.uk (G.P.)
In Brief
Metabolic fingerprinting using the iKnife
offers near real-time diagnosis of PIK3CA
mutant breast cancers and connects
oncogenic PIK3CA with enhanced
arachidonic acid metabolism. cPLA2
inhibition shows remarkable synergy with
dietary fat restriction to restore tumoral
immune cell infiltration and inhibit growth
of mutant PIK3CA-bearing breast tumors.
Highlights
• The iKnife offers near real-time diagnosis of PIK3CA mutant
breast cancers
Article
Metabolic Fingerprinting Links Oncogenic PIK3CA
with Enhanced Arachidonic Acid-Derived Eicosanoids
Nikos Koundouros,1,2 Evdoxia Karali,1 Aurelien Tripp,1 Adamo Valle,1,3,4,5 Paolo Inglese,2 Nicholas J.S. Perry,1
David J. Magee,1,6 Sara Anjomani Virmouni,1,7 George A. Elder,1,8 Adam L. Tyson,9,10 Maria Luisa Dória,2
Antoinette van Weverwijk,11,12 Renata F. Soares,2 Clare M. Isacke,11 Jeremy K. Nicholson,2,13 Robert C. Glen,2,14
Zoltan Takats,2,* and George Poulogiannis1,2,15,*
1Signalling and Cancer Metabolism Team, Division of Cancer Biology, The Institute of Cancer Research, 237 Fulham Road, London SW3
6JB, UK
2Division of Systems Medicine, Department of Metabolism Digestion and Reproduction, Imperial College London, London SW7 2AZ, UK
3Energy Metabolism and Nutrition, Research Institute of Health Sciences (IUNICS), University of Balearic Islands, 07122 Palma de
Mallorca, Spain
4Health Research Institute of the Balearic Islands (IdISBa), University of Balearic Islands, 07120 Palma de Mallorca, Spain
5Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition (CIBERobn CB06/03/0043), Instituto de Salud Carlos
SW3 6JB, UK
10Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, 25 Howland Street, London W1T 4JG, UK
11Breast Cancer Now Research Centre, The Institute of Cancer Research, 237 Fulham Road, London SW3 6JB, UK
12Division of Tumor Biology and Immunology, the Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands
13The Australian National Phenome Centre, Health Futures Institute, Murdoch University, Perth WA6150, WA, Australia
14Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
15Lead Contact
SUMMARY
Oncogenic transformation is associated with profound changes in cellular metabolism, but whether tracking
these can improve disease stratification or influence therapy decision-making is largely unknown. Using the
iKnife to sample the aerosol of cauterized specimens, we demonstrate a new mode of real-time diagnosis,
coupling metabolic phenotype to mutant PIK3CA genotype. Oncogenic PIK3CA results in an increase in
arachidonic acid and a concomitant overproduction of eicosanoids, acting to promote cell proliferation
beyond a cell-autonomous manner. Mechanistically, mutant PIK3CA drives a multimodal signaling network
involving mTORC2-PKCζ-mediated activation of the calcium-dependent phospholipase A2 (cPLA2).
Notably, inhibiting cPLA2 synergizes with fatty acid-free diet to restore immunogenicity and selectively
reduce mutant PIK3CA-induced tumorigenicity. Besides highlighting the potential for metabolic phenotyping
in stratified medicine, this study reveals an important role for activated PI3K signaling in regulating arachi-
donic acid metabolism, uncovering a targetable metabolic vulnerability that largely depends on dietary fat
restriction.
Cell 181, 1596–1611, June 25, 2020 © 2020 The Author(s). Published by Elsevier Inc.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
of proliferating cells (Carracedo and Pandolfi, 2008; Dibble and (Figures 1C and S1C), suggesting that the modulation of ER
Manning, 2013; Fruman et al., 2017; Lien et al., 2016). signaling induces distinct lipidomic alterations, which are detect-
These observations raise some fundamental questions: able by REIMS and are reversible by ER inhibition.
whether we can use metabolic tracking for more effective With a robust lipidomic profile obtained using REIMS, we
screening of the molecular features underlying tumor pathogen- next performed unsupervised hierarchical clustering to partition
esis and, ultimately, whether this information can be translated all breast cancer cell lines on the basis of their spectral similar-
into better and more efficacious treatment strategies for each ities measured over 872 lipid species. This analysis revealed
patient. Indeed, the concept of metabotyping has been widely two subtypes with distinctive signatures, in which REIMS-de-
applicable in characterizing functionally distinct traits that have tected lipid species were significantly enriched (black) or
the power to influence clinical decision-making (Gavaghan depleted (gray) (Figure 1D). The observed clusters were also
et al., 2000; Holmes et al., 2008; Nicholson et al., 2002, 2012). confirmed using a consensus non-negative matrix factorization
Here, we used rapid evaporative ionization mass spectrometry (NMF) (Figure S1D).
(REIMS), which coupled to the intelligent surgical device, also To shed light on the mechanism that is driving this unique
known as iKnife, allows for instantaneous chemical analysis of metabolic classification, we examined mutational enrichment
the aerosol generated during electrosurgical tissue ablation of the cells between the two subtypes. Out of the top 150 genes
and cauterization, in the form of gas-phase ionic species. Unlike that are frequently (>20%) mutated in these cell lines, oncogenic
other technologies that are commonly used for metabolite mutation in PIK3CA was the only one to be significantly (Fisher’s
profiling, such as liquid chromatography-mass spectrometry test, p value = 0.019) overrepresented in the lipid-enriched clus-
(LC-MS), REIMS analysis requires no sample preparation and al- ter (Figure 1E; Table S2). In accordance with this finding, analysis
lows for near real-time (1–2 s) lipidomic analysis and tissue of isogenic MCF10A PIK3CA wild-type (WT) and mutant (MUT)
recognition, based on multivariate classification analysis of (E545K and H1047R) cell lines also revealed clustering of the
spectral libraries of reference mass spectra. latter in the lipid-enriched group, both when cells were cultured
The iKnife/REIMS can be used both in the intraoperative and in 2D, or 3D as spheroids (Figures 1E and S1E). Consistent with
biopsy collection settings to differentiate cancerous from non- this stratification, performing gene and functional pathway
cancerous tissues with very high precision, based on their lipido- enrichment analyses revealed KEGG-pathway ontologies
mic composition (Alexander et al., 2017; Balog et al., 2013; St relating to metabolic pathways that were significantly associated
John et al., 2017). However, the potential of using this technology with the lipid-enriched subtype (Figures S2A and S2B). Specific
beyond the pattern-level identification of tissues, to reveal the overexpressed genes included FASN and ELOVL6, which are
biological mechanisms underlying unique metabolic signatures, involved in de novo lipogenesis, and LDLRAP1, which facilitates
or identify which patients will likely benefit from a given treat- exogenous lipid uptake (Figures S2C–S2E). Indeed, PIK3CA
ment, has not yet been explored. MUT cells displayed elevated induction of the de novo lipogen-
esis transcriptional regulator SREBP1 (Figure 1F) and higher
RESULTS exogenous FA uptake capacity (Figure 1G), suggesting that
both could contribute to the lipid-enriched metabotype.
Metabolic Phenotyping Using REIMS Predicts Molecular Most importantly, the observed metabolic stratification was
Markers Including Oncogenic Mutations in PIK3CA also evident among PIK3CA WT and MUT breast cancer PDXs
We first examined whether REIMS-detected lipid signatures (Figure 1H) and primary tumors (Figure 1I; Table S3). Among
correlate with any established molecular markers of breast can- the PDX tumors assessed (n = 18), only one was misclassified
cer of known prognostic and therapeutic value (Figure 1A). For (BR5017), and this harbored a rare I391M mutation that has ac-
this, we selected a panel of 43 breast cancer cell lines, 18 pa- tivity reminiscent of WT PIK3CA (Dan et al., 2010) (Figure 1H).
tient-derived xenograft (PDX), and 12 primary breast tumors Overall, PIK3CA mutation status in both PDX and primary tumors
that are well characterized for their estrogen (ER), progesterone could be classified with an accuracy of 90% using all measurable
(PR), and HER2 receptor status. REIMS profiling of cell lines lipid species (Figure S1F), suggesting that the iKnife/REIMS
consistently classified ER, HER2, and triple negative status could be used for near real-time diagnosis of PIK3CA MUT
(TN) with area under the curve (AUC) accuracies between 0.8– breast cancers by MS analysis of aerosolized tissue material.
0.9, and 0.6–0.7 for PR (Figure 1B).
Consistent with previous studies (Hilvo et al., 2011), the most mTORC2 Signaling Downstream of Oncogenic PIK3CA
striking differences in lipid profiles were observed between ER- Drives the Lipid-Enriched Phenotype
positive (+ve) and -negative (−ve) breast cancer cell lines (Fig- The effects of PI3K/Akt/mTOR signaling on lipid metabolism have
ures 1B and S1A; Table S1) and tumor specimens (Figure S1B). been observed on numerous levels (Dibble and Manning, 2013;
A surrogate marker for ER positivity, aside from its routine deter- Lien et al., 2016; Saxton and Sabatini, 2017). To elucidate the spe-
mination by immunohistochemistry (IHC), is expression of the cific mechanisms underlying the observed phenotype, we treated
estrogen receptor 1 (ESR1) gene. We built a regression model PIK3CA MUT cells with inhibitors targeting the activity of key no-
to predict ESR1 expression based on the spectral profiles ob- des in the PI3K pathway (Figure 2A), albeit at concentrations that
tained by REIMS and tested this in representative ER+ve cell lines do not affect cell viability (Figure S3A). PI3K (BYL719, BKM120)
treated with or without 4-hydroxy-tamoxifen (4-OHT). Of note, and mTOR (rapamycin, torin 1) inhibition dramatically reduced
the predicted ESR1 expression was significantly reduced relative phospholipid levels, but surprisingly, Akt inhibition with
following 4-OHT treatment as compared to untreated controls either MK2206 or GSK690693 did not (Figure 2B). Similar results
Figure 1. REIMS Analysis Predicts Breast Cancer Molecular Markers Including Oncogenic Mutations in PIK3CA
(A) Schematic overview of sample preparation for REIMS analysis.
(B) Area under the curve (AUC) classification accuracies for ER, PR, HER2 receptor, and triple negative status of 43 breast cancer (BC) cell lines (median intensity
of n = 3 biological replicates) following feature selection for phospholipids in the m/z range 600–900, and leave-one-out cross validation.
(C) Immunoblot analysis of estrogen inducible protein pS2 and predicted ESR1 expression in ER+ve MCF7 cells following treatment with 0.1% DMSO or indicated
concentrations of 4-OHT for 72 h.
(D) Unsupervised hierarchical clustering of 872 lipid species detected by REIMS across 43 BC cell lines.
(E) Dendrogram of BC cell lines and isogenic MCF10A cells harboring either WT or MUT (E545K or H1047R) PIK3CA.
(F) Immunoblot analysis of mature SREBP1 transcription factor expression in nuclear extracts of the MCF10A PIK3CA isogenic panel.
(G) Relative exogenous fatty acid uptake in MCF10A PIK3CA WT and MUT cells following serum starvation for 1 h and supplementation with fluorescently labeled
dodecanoic acid (n = 5 replicates).
(H and I) Unsupervised hierarchical clustering of 9 PIK3CA WT and 9 MUT breast PDX tumors (H) and (I) 5 WT and 7 MUT primary breast tumors. Individual rows in
the heatmaps in (D), (H) and (I) correspond to scaled Z score phospholipid intensities (n = 3 biological replicates). Error bars represent ± SEM. n.s., not significant;
*p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001. p values in (C, bottom panel) and (G) were calculated with one-way ANOVA, followed by unpaired, two-tailed Student’s t test
with Bonferroni correction.
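As an illustration of the leave-one-out cross-validated AUC reported in (B), the following minimal Python sketch scores each held-out sample with a classifier trained on the remaining samples and then computes the AUC from the rank statistic. The nearest-centroid classifier, the function names, and the toy data are assumptions made for illustration only; the legend does not state which classification model was used.

```python
from statistics import mean

def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def loocv_scores(X, y):
    """Score each sample with a model fitted on all other samples."""
    scores = []
    for i in range(len(X)):
        train = [(x, lab) for j, (x, lab) in enumerate(zip(X, y)) if j != i]
        c_pos = centroid([x for x, lab in train if lab == 1])
        c_neg = centroid([x for x, lab in train if lab == 0])
        # Higher score = held-out sample lies closer to the positive centroid.
        scores.append(sq_dist(X[i], c_neg) - sq_dist(X[i], c_pos))
    return scores

# Toy data (invented): two hypothetical lipid intensity features per cell line.
X = [[1.0, 0.2], [0.9, 0.1], [1.1, 0.3], [0.2, 1.0], [0.1, 0.9], [0.3, 1.1]]
y = [1, 1, 1, 0, 0, 0]  # 1 = marker positive, 0 = marker negative
print(auc(loocv_scores(X, y), y))  # 1.0 for this cleanly separated toy set
```

Because every sample is scored by a model that never saw it, the resulting AUC is less prone to the optimism of scoring the training data directly.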
Figure 2. Oncogenic PIK3CA Drives the Lipid-Enriched Phenotype via mTORC2 Signaling
(A) MCF10A PIK3CA MUT cells were treated with BYL719 or BKM120 (100 nM), MK2206, or GSK690693 (150 nM) for 72 h, or rapamycin (20 nM) for 4 h, or
rapamycin or torin 1 (20 nM) for 72 h.
(B) Unsupervised hierarchical clustering of MCF10A E545K and H1047R MUT cells treated with PI3K, AKT, and mTOR inhibitors.
(C and D) Immunoblot analysis (C) and unsupervised hierarchical clustering (D) of MCF10A E545K and H1047R cells transfected with RAPTOR, RICTOR, or mTOR
siRNA. Individual rows in the heatmaps in (B) and (D) correspond to scaled Z score phospholipid intensities (n = 3 biological replicates).
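The row scaling mentioned for the heatmaps can be sketched as follows. This is a minimal Python example that assumes each row holds the intensities of one phospholipid species across samples and that the sample standard deviation is used; the function name and the example values are hypothetical, and the legend does not specify these details.

```python
from statistics import mean, stdev

def zscore_rows(matrix):
    """Scale each row (one lipid species) to mean 0 and unit sample SD."""
    scaled = []
    for row in matrix:
        mu, sd = mean(row), stdev(row)
        # A constant row has no spread; map it to zeros instead of dividing by 0.
        scaled.append([0.0 if sd == 0 else (v - mu) / sd for v in row])
    return scaled

# Invented intensities for two lipid species across three samples.
intensities = [[1.0, 2.0, 3.0], [5.0, 5.0, 5.0]]
print(zscore_rows(intensities))
```

Scaling per row places abundant and scarce lipid species on a common scale before clustering, so that the grouping reflects relative changes across samples.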
were observed in a panel of 5 PIK3CA MUT breast cancer cell which are established products of elevated lipogenesis and in
lines, with the exception of MCF7 cells that also responded to line with the observed lipid enriched phenotype (Figure 1D; Table
MK2206 (Figure S3B). S4). Interestingly, the second most significantly elevated FA after
Interestingly, we did not observe an effect on relative phos- palmitoleate was arachidonic acid (AA) (FA20:4), an omega-6 FA
pholipid abundances following acute exposure to rapamycin which is predominantly found in animal fats and is of particular
for 4 h, despite inhibition of mTORC1 (Figures 2A, bottom left relevance as a major regulator of pro-inflammatory responses
panel, and 2B). Because extended rapamycin treatment inhibits in cancer, through the production of bio-active lipids known as
mTORC1 and mTORC2 (Figure 2A, bottom right panel) (Sarbas- eicosanoids (Wang and Dubois, 2010) (Figure 3A; Table S4).
sov et al., 2006), and both impinge upon lipogenesis (Düvel et al., Importantly, in addition to the cell lines, significant elevations in
2010; Griffiths et al., 2013; Guri et al., 2017; Lee et al., 2017; AA were also observed in all the PIK3CA MUT breast PDX and
Ricoult et al., 2016), we sought to investigate which of these primary tumors (Figures 3B and 3C) and across tumors of other
complexes might contribute to the regulation of the observed tissue types including ovarian, pancreatic, and sarcomas (Fig-
phenotype. Knockdown of RICTOR or mTOR, but not RAPTOR, ure 3D). In agreement with our REIMS findings, AA and down-
led to a significant reduction in relative phospholipid abun- stream eicosanoids were also found to be significantly elevated
dances (Figures 2C and 2D), pointing to a PIK3CA- and in both PIK3CA MUT cells using LC-MS (Figures 3E, S4A,
mTORC2-dependent metabolic phenotype that is largely inde- and S4B).
pendent of mTORC1 or Akt inhibition. To measure FAs that are secreted from cells, as opposed to
those that might already exist in serum-supplemented media,
Oncogenic PIK3CA Drives Enhanced Arachidonic Acid cells were grown under FA-deprived conditions. Pro-inflamma-
Metabolism, thereby Promoting Cell Proliferation tory derivatives of AA were significantly increased in the media
beyond a Cell-Autonomous Manner of PIK3CA MUT cells (Figure 3F), signifying a potential role for
Given that global lipidomic profiles could stratify breast cancer these bio-active lipids in tumor microenvironment (TME)
cell lines and tumors based on PIK3CA mutation status, we interactions.
next aimed to characterize specific lipid alterations that are Next, to ascertain the functional consequences of elevated
associated with oncogenic PIK3CA. Fatty acids (FAs), which eicosanoid metabolism, the effects of PIK3CA MUT-derived
are the main constituents of phospholipids and have additional conditioned media (CM) were assessed. PIK3CA WT cells dis-
effector functions in cancer pathogenesis, were profiled in the played dramatically increased proliferative rate following incuba-
PIK3CA isogenic panel using REIMS. Among the most abundant tion with CM obtained from MUT cells, and this was effectively
FAs in PIK3CA MUT compared to WT cells were palmitoleate rescued by depleting the lipids from the media (Figures 3G,
(FA16:1), palmitic acid (FA16:0), and oleic acid (FA18:1), all of S4C, and S4D). Moreover, the proliferation of both PIK3CA
MUT cell lines was significantly reduced following incubation data support not only a critical role for oncogenic PIK3CA in influ-
with their respective lipid-deprived CM (Figures 3H and S4E), encing autocrine- and paracrine-mediated cell proliferation, but
whereas supplementation of lipid-deprived CM with AA, but also point to AA as an easily measured metabolic biomarker that
not palmitate or palmitoleate, restored proliferation in both WT could help with the diagnosis and treatment of PIK3CA MUT
and MUT cells (Figures 3G, 3H, S4D, and S4E). Together, these tumors.
Oncogenic PIK3CA Promotes Enhanced Arachidonic mTORC2 signaling to this process is unknown, as is the role of
Acid Production via mTORC2-PKCζ-cPLA2 Signaling PKCζ in the regulation of cPLA2. Consistent with PKCζ being a
To better understand the mechanism by which PIK3CA MUT direct substrate of PDK-1 (Chou et al., 1998) and mTORC2 phos-
cells have elevated AA, we assessed various pathways which phorylation (Li and Gao, 2014), it was found to be hyperphos-
contribute to its cellular pool, including: direct exogenous up- phorylated in PIK3CA MUT cells (Figure 4F) and breast PDX
take, synthesis from linoleic acid, hydrolysis from diacylglycerol tumors (Figure S5L). Moreover, its inhibition led to a marked
(DAG), or endogenous release from membrane phospholipid reduction in MAPK/ERK signaling (Figure 4G), as well as p38
through phospholipase (PLAs) activity. Curiously, we noted a MAPK phosphorylation and active GTP-bound Rac-1 (Fig-
persistent increase in AA in PIK3CA MUT isogenic panel even ure 4H), culminating in reduced cPLA2 phosphorylation at the
when cells were cultured with FA-free media (Figure 3A). Addi- S505 site (Figures 4G and S5L).
tionally, DAG levels—that can be in part generated from the hy- In addition to S505 phosphorylation, an increase in intracel-
drolysis of phosphatidylinositol-4,5-bisphosphate (PIP2)—were lular calcium levels is essential for sustained phospholipase ac-
significantly reduced in PIK3CA MUT cells (Figures S4F tivity and liberation of AA by cPLA2 (Ambs et al., 1995; Clark
and S4G). et al., 1995). Given that elevated PIP3 levels, induced by onco-
These results pointed to a potential role for phospholipases genic PIK3CA, promote the activation of phospholipase C
(PLAs), of which three main classes predominate: cytosolic/cal- gamma 1 (PLCγ1), leading to an increase in cytosolic calcium
cium-dependent (cPLA2), calcium-independent (iPLA2), and via generation of inositol-1,4,5-trisphosphate (IP3) (Rameh
secretory (sPLA2) phospholipase A2 (Burke and Dennis, 2009). et al., 1998), we hypothesized that this signaling node could
Among these, only cPLA2 displayed significantly higher enzy- also play a role in regulating cPLA2 activity and AA release down-
matic activity (Figure 4A), as well as elevated total protein levels stream of active PI3Kα. In line with this premise, we observed
and stability (Figures S5A and S5B) in the presence of oncogenic higher phosphorylation of PLCγ1 (Figure S6A) and significantly
PIK3CA, whereas expression of PLA2G4A, the gene encoding elevated intracellular calcium levels in PIK3CA MUT cells (Fig-
cPLA2α, remained unchanged (Figure S5C). ure S6B). Although genetic and pharmacological (U73122) inhibi-
Consistent with the predominant role of mTORC2 in driving the tion of PLCγ1 led to a significant reduction in calcium flux in both
lipid enriched phenotype in PIK3CA MUT cells (Figures 2C and PIK3CA WT and MUT cells (Figures S6C–S6E), an inhibitory ef-
2D), RICTOR, but not RAPTOR, silencing rescued cPLA2 activity fect on cPLA2 activity and AA levels was only observed in the
(Figure 4B), and led to a concomitant reduction in AA and pros- MUT cells (Figures S6F–S6H), highlighting the importance of
taglandin E2 (PGE2) levels (Figures 4C and S5D–S5F). Impor- PLCγ1-mediated Ca2+ flux in sustaining elevated cPLA2 activity
tantly, mTORC2-specific inhibition was also accompanied by in the context of oncogenic PIK3CA.
decreased cPLA2 stability (Figure S5G). To elucidate the mech- Finally, because cPLA2 has a predicted PKCζ phosphorylation
anism underlying this observation, known substrates of site (T376), we tested the possibility that it could serve as a direct
mTORC2 including serum/glucocorticoid-regulated kinase 1 substrate for PKCζ. In vitro kinase assays using purified PKCζ
(SGK-1) and the protein kinase C (PKC) isoforms were inhibited and cPLA2 suggested a direct interaction and phosphorylation
in PIK3CA MUT cells. ASB14780—an indole-based compound (Figure 4I), and this was confirmed following immunoprecipita-
that inhibits both cPLA2 translocation to membrane compart- tion (Figure 4J) and proximity ligation activity (PLA) assays (Fig-
ments and the interaction between phospholipid substrates ures S6I and S6J). To further evaluate cPLA2 as a candidate sub-
with the enzyme active site (McKew et al., 2008; Tomoo et al., strate for PKCζ, we developed a custom antibody recognizing
2014)—was used as a positive control. Importantly, a significant the cPLA2 T376 phosphorylation site. Specificity was validated
reduction in AA and cPLA2’s enzymatic activity, reminiscent of in both serum starved/stimulated samples (Figure S6K, left
that observed upon ASB14780 inhibition, was only observed panel), following PKCζ inhibition (Figures S5K and S6K, middle),
following pharmacological and small interfering RNA (siRNA)- and in cPLA2 CRISPR knockout cells overexpressing a phos-
mediated inhibition of PKCζ (Figures 4D, 4E, and S5H–S5K). phoresistant MUT (T376A) cPLA2 (Figure S6K, right panel).
Previous studies have shown that phosphorylation of cPLA2 Importantly, increased T376 phosphorylation was observed in
on S505 regulates its activity and stability and that this is medi- the presence of oncogenic PIK3CA and this was reduced to
ated, at least in part, by the p38 MAPK/ERK signaling pathway levels equivalent to PIK3CA WT cells upon PKCζ inhibition
(Kramer et al., 1996; Lin et al., 1993). The contribution of PI3K- (Figure 4K).
(F and G) Immunoblot analysis of the MCF10A PIK3CA isogenic panel following growth factor deprivation for 16 h and 30 min stimulation with serum and growth
factors (F) or PKCζ inhibition with 1 μM peptide inhibitor for 72 h (G).
(H) Immunoblot analysis of activated Rac-1 and p38 MAPK in the MCF10A PIK3CA isogenic panel following PKCζ inhibition with 1 μM peptide inhibitor for 72 h.
(I) In vitro kinase assay of 100 ng and 0.5 mg/mL purified PKCζ and cPLA2 proteins, respectively.
(J) Immunoblot analysis of anti-HA immunoprecipitates derived from HA-tagged cPLA2 transfected MCF10A PIK3CA WT and MUT cells.
(K) Immunoblot analysis of anti-HA immunoprecipitates derived from HA-tagged cPLA2 transfected MCF10A PIK3CA WT and MUT cells treated where indicated
with 1 μM PKCζ peptide inhibitor for 48 h.
(L) AA levels across H1047R MUT cells with CRISPR knockout of PLA2G4A reconstituted with WT or phosphoresistant cPLA2 isoforms.
(M) Diagram summarizing the proposed model for PI3K-mTORC2-PKCζ and calcium-dependent activation of cPLA2, leading to a concomitant increase in AA
and downstream eicosanoids. Data are presented as the mean ± SEM of n = 3–6 biological replicates and are representative of at least two independent
experiments. n.s., not significant; *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001. p values in (A) were calculated with unpaired, two-tailed Student’s t test, and in (B)–(E), (I), and
(L) with one-way ANOVA, followed by unpaired, two-tailed Student’s t test with Bonferroni correction.
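The Bonferroni step named in the legend can be illustrated with a short Python sketch. It assumes that raw p values from the pairwise t tests are already available, multiplies each by the number of comparisons, and maps the adjusted values to significance symbols at the 0.05, 0.01, and 0.001 thresholds. The function names and the raw p values are invented for illustration and do not come from the study.

```python
def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def stars(p):
    """Map an adjusted p value to a significance symbol (assumed inclusive cutoffs)."""
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "n.s."

raw = [0.0002, 0.004, 0.03]  # invented raw p values for three comparisons
adjusted = bonferroni(raw)
labels = [stars(p) for p in adjusted]
print(labels)  # ['***', '*', 'n.s.']
```

The third comparison shows why the correction matters: a raw p value of 0.03 would pass the 0.05 threshold on its own but is no longer significant after adjusting for three comparisons.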
Figure 5. Genetic and Pharmacological Inhibition of cPLA2 Selectively Reduces Oncogenic PIK3CA-Mediated Tumorigenicity
(A–D) Cell viability of (A) PIK3CA WT and MUT MCF10A cells, and (B) breast cancer cell lines (PIK3CA WT: MDAMB134, Hs578T, AU565; PIK3CA MUT: MCF-7,
CAL-51, MDAMB453) following treatment with increasing concentrations (20 nM–10 μM) of ASB14780 under full serum conditions for 72 h. The same treatments
were also performed under fatty acid-free conditions in (C) and (D), in the presence or absence of exogenous supplementation of 25 μM AA.
(E) Clonogenic assays of MCF10A PIK3CA WT and MUT cells treated with increasing concentrations of ASB14780 as in (A)–(D). Treatments were performed
under fatty acid-free conditions, with or without the supplementation of 25 μM AA.
(F) Immunoblot analysis confirming specific knockdown of cPLA2 using two independent constitutive shRNAs (sh1 and sh5) (left) and reduction in AA levels in
MCF10A E545K/H1047R MUT cells using REIMS.
(G) Proliferation of MCF10A H1047R MUT cells expressing shGFP, cPLA2-sh1, or cPLA2-sh5 under exogenous FAF conditions. Sulforhodamine B (SRB) protein
staining was used to measure cell proliferation over 5 days.
To ascertain the functional significance of these two phos- cPLA2, we demonstrated that PIK3CA MUT, but not WT cells,
phorylation sites (S505 and T376), endogenous cPLA2 in rely on cPLA2 to form epithelial acini in 3D culture and sustain
PIK3CA WT and H1047R MUT cells was reconstituted with WT their proliferation (Figures 6A–6C).
or phosphoresistant MUT of cPLA2 (S505A or T376A) (Figure 4L). To further evaluate the therapeutic effect of cPLA2 inhibition in
Knockout of cPLA2 in H1047R cells reduced AA to levels equiv- primary breast cancers, we treated triple negative breast cancer
alent to the PIK3CA WT background, and this could be rescued (TNBC) (Figure 6D) PIK3CA WT (Figure 6E) and MUT (Figure 6F)
with ectopic expression of the WT, but not MUT (S505A or PDX-bearing mice with the ASB14780 inhibitor in conjunction
T376A) cPLA2 (Figure 4L). Interestingly, the activity of exoge- with a normal or near-isocaloric fat-free diet. A significant reduc-
nously expressed WT or MUT cPLA2 largely mirrored the trends tion in tumor weight was only observed in the PIK3CA MUT PDX
in AA that were previously detected by REIMS (Figure S6L), sug- model when both the inhibitor and a fat-free diet were adminis-
gesting that cPLA2 activity is highly dependent on the phosphor- tered in combination (Figure 6F). Corroborating these observa-
ylation of S505 and T376 in the context of oncogenic PIK3CA. tions, histological analysis did not reveal any changes in tumor
Overall, our model provides a unifying framework for several pre- area for the PIK3CA WT BR1458 model (Figures 6G, left panels,
viously unconnected components of PI3K signaling, which and 6H), while a striking reduction in viable tumor regions was
converge on cPLA2 activation and enhanced AA metabolism observed in PIK3CA MUT PDX-bearing mice treated with
(Figure 4M). ASB14780 in fat-free diet (Figures 6G, right panels, and 6I).
The concomitant increase in necrotic regions (as indicated by
cPLA2 Inhibition and Dietary Fat Restriction Suppress areas of pale eosinophilic cytoplasms, in addition to loss of
PIK3CA-Induced Tumorigenicity and Restore Anti- nuclei and karyolysis), evidently contributed to a substantial pro-
cancer Immune Responses portion of the overall weight of the PIK3CA MUT-bearing tumor
If cPLA2 is required for PIK3CA oncogenicity, then targeting this that was left after treatment of ASB14780 in fat-free diet (Fig-
enzyme could represent an attractive therapeutic strategy. For ure 6G, bottom right panel). In the interest of measuring AA levels
this, we used the inhibitor ASB14780, which displays excellent at the end of the treatment regime (Figure 6D), resected tumors
oral bioavailability and higher specificity for the cPLA2 isoform were analyzed directly with REIMS. Although a fat-free diet alone
than other commonly used compounds such as Efipladib and led to a modest, yet significant reduction in the AA levels of the
Ecopladib (Lee et al., 2007; Tomoo et al., 2014). Furthermore, PDX tumors, the decrease was significantly more pronounced
ASB14780 has been shown to ameliorate inflammatory pathol- when accompanied with cPLA2 inhibition (Figures 6J and 6K),
ogies including non-alcoholic fatty liver disease (NAFLD) and highlighting the role of dietary AA restriction in therapy response
chronic obstructive pulmonary disease (COPD) through sup- (Figures 6E–6I).
pression of AA and prostaglandin synthesis (Kanai et al., 2016; The observation that cPLA2 inhibition and fat-free diet selec-
Tomoo et al., 2014). However, the potential anti-neoplastic prop- tively reduces the tumorigenicity of PIK3CA MUT cells raises
erties of this inhibitor remain obscure. Of note, PIK3CA mutation the interesting prospect of modulating this response by altering
sensitized cells to pharmacological inhibition of cPLA2 with dietary fat content. Indeed, it is becoming increasingly appreci-
ASB14780 (Figures 5A and 5B), and this effect was more prom- ated that the diet of Western populations commonly contains
inent under exogenous FA-free conditions (Figures 5C and 5D), an excess of pro-inflammatory omega-6 to omega-3 FAs by up
alleviating any compensatory mechanisms to obtain AA. In addi- to 50 times (popularly referred to as the ‘‘Western’’ diet), and
tion to viability, cPLA2 inhibition significantly reduced the clono- this may be implicated in the progression of breast and colo-
genicity of PIK3CA MUT cells, and, importantly, both could be rectal cancers (Patterson et al., 2012; Simopoulos, 2008). To
rescued by exogenous supplementation of AA (Figures 5C, 5D, further explore this premise, we injected triple negative CAL51
and 5E). Near-identical results were obtained following genetic (PIK3CA MUT) and Hs578T (PIK3CA WT) cell lines stably ex-
knockdown of cPLA2 using two constitutive short-hairpin pressing either control shGFP, or two independent shRNAs tar-
RNAs (shRNAs, denoted as cPLA2-sh1 and cPLA2-sh5) (Figures geting cPLA2 (cPLA2-sh1 or sh5) into the mammary fat pad of
5F, 5G, and S6M), while WT cells were unaffected, suggesting BALB/c nude mice that had been preconditioned on either fat-
that cPLA2 is dispensable in this setting (Figure S6N). To further free, balanced (omega6:omega3 = 1:1) or ‘‘Western’’ (omega6:o-
confirm the importance of cPLA2-induced AA metabolism in mega3 = 50:1) diets (Figure 7A). Consistent with pharmacolog-
PIK3CA MUT cancers, we suppressed cPLA2 in the isogenic ical inhibition, knockdown of cPLA2 significantly impaired the
panel and assessed their ability to form colonies under FA-free growth of PIK3CA MUT tumor xenografts under fat-free diet con-
conditions. Although no significant difference in colony forma- ditions (Figure 7B), and this therapeutic effect was completely
tion was detected in PIK3CA WT cells, there was a marked reversed when animals were fed the AA-enriched ‘‘Western’’
reduction in the number of colonies following cPLA2 knockdown diet (Figures 7C and 7D). Although there was a trend toward a
in the MUT cells that was restored in the presence of AA reduction in overall tumor weights under a balanced diet, this
(Figure 5H). Furthermore, using an inducible shRNA against did not reach statistical significance (Figures S7A and S7B). In
(H) Clonogenic assays of MCF10A PIK3CA WT and MUT cells expressing shGFP, cPLA2-sh1, or cPLA2-sh5 under FAF conditions, supplemented with or without
25 μM AA. Data in (A)–(H) are presented as the mean ± SEM of n = 3–4 biological replicates and are representative of at least two independent experiments. Data
in (D) are presented as the mean viability of three PIK3CA MUT (MCF-7, CAL-51, MDAMB453) and WT (MDAMB134, Hs578T, AU565) measured in triplicate wells.
n.s., not significant; *p < 0.05; **p < 0.01; ***p < 0.001; p values in (A)–(D) and (G) were calculated using two-way ANOVA. For (E, right), (F, right), and (H, right), one-
way ANOVA followed by unpaired, two-tailed Student’s t test with Bonferroni correction was applied.
accordance with our model, PIK3CA WT tumors were unaffected In agreement with the chemokine analysis, tumor infiltration of
by these treatments (Figures 7E–7G, S7C, and S7D). Further NK cells was significantly increased in PIK3CA MUT tumors by
corroborating our findings, genetic inhibition of cPLA2 only dietary and therapeutic interventions, with dual inhibition (either
reduced viable tumor area in animals with PIK3CA MUT tumors with ASB14780 or shRNA) of cPLA2 and fat-free diet leading to
when those were fed a fat-free diet, while no anti-neoplastic the largest increase in NKp46 staining (Figures S7K, S7L, S7N,
benefit was conferred under a ‘‘Western’’ diet (Figures 7H–7J). and S7O). It is also noteworthy that PIK3CA WT PDX and cell
REIMS profiling of excised tumors revealed that AA levels line-derived xenograft tumors contained relatively higher base-
were significantly altered in concordance with targeting cPLA2 line levels of CCL5, CX3CL1, and NKp46 expression, as
and dietary fat intake (Figures 7K and S7E). Interestingly, a compared to MUT tumors, and these were not significantly
more substantial AA reduction was observed in PIK3CA MUT altered by cPLA2 inhibition and/or changes in the diet (Figures
xenograft tumors following cPLA2-knockdown and administra- S7G–S7J, S7M, and S7P). Considered together, these data sug-
tion of fat-free diet (40%–50% decrease), as compared to WT tu- gest that oncogenic PIK3CA might suppress BC immunoge-
mors (20% decrease) (Figures 7K and 7L, S7E, and S7F). In line nicity, at least in part, through regulation of AA metabolism,
with our previous in vivo study (Figure 6), these findings demon- and this can be reversed through co-administration of cPLA2 in-
strate that the modulation of dietary fat content, and supplemen- hibition and dietary fat restriction.
tation of omega-6 FAs either in a balanced ratio with omega-3
FAs, or to a much larger extent in the ‘‘Western’’ diet, completely DISCUSSION
abolishes the therapeutic benefit of cPLA2 inhibition in PIK3CA
MUT tumors. Unraveling the interplay between genotype and metabolic
In addition to promoting growth and proliferation through both phenotype, as well as their complex interactions with nutrient
autocrine and paracrine mechanisms, AA and its downstream me- availability, unequivocally plays a major role in understanding
tabolites have also been implicated in the metabolic remodeling of disease pathogenesis and identifying novel therapeutic interven-
the tumor microenvironment. One of their major consequences is tions. Here, we demonstrate that the iKnife/REIMS enables close
inhibition of the anti-cancer immune responses, ultimately leading to real-time prediction of clinically relevant tumor features based
to immune evasion and tumor progression (Böttcher et al., 2018; on their metabolic fingerprints, offering a novel repertoire for
Zelenay et al., 2015). Although the BALB/c nude mice used in cancer diagnosis and therapy decision-making. Among these,
this study lack adaptive immunity in the form of T cells, these an- cancer diagnosis and therapy decision-making. Among these is oncogenic PIK3CA, which triggers almost ‘‘the perfect storm’’
imals mount robust innate immune responses predominantly of signaling events, culminating in the overproduction of AA and
mediated by natural killer (NK) cells (Lee et al., 2015; Okada downstream eicosanoids via the activation of cPLA2.
et al., 2019). We therefore sought to investigate how the various We have evidence of the central role of AA and eicosanoids in
therapeutic and dietary regimes that were used in this study a wide range of disorders including cancer, obesity, diabetes,
impact NK cell responses. To do this, the levels of type I inter- asthma, and autoimmune disorders (Dennis and Norris, 2015;
feron-induced chemokines (CCL5 and CX3CL1) and expression Sonnweber et al., 2018; Wang and Dubois, 2010). However,
of a major NK cell-activating receptor (NKp46) were measured the signal transduction pathways behind their activation, as
in tumors from different treatment groups. well as the molecular cues that link these bio-active lipids with
Coinciding with the largest reduction in AA levels (Figure 6K), a growth factor-independent cell proliferation have remained
marked increase in CCL5 and CX3CL1 was only observed in the largely obscure. Our results demonstrate that mTORC2 down-
BR1282 (PIK3CA MUT) PDX tumor following co-administration stream of oncogenic PIK3CA acts as a pivotal signaling hub for
of ASB14780 and fat-free diet (Figures S7G and S7I). Similar re- driving enhanced AA metabolism to sustain cell proliferation
sults were obtained in the CAL51 (PIK3CA MUT)-derived xeno- beyond a cell autonomous manner. This is particularly interesting
graft tumor model, where the increase in chemokine levels was in light of evidence obtained from in situ single-cell analysis of
rescued when mice were fed the AA-enriched ‘‘Western’’ diet primary breast tumors, showing often that only a small fraction
(Figures S7H and S7J). of cancer cells within a tumor carry PIK3CA mutations, while
Figure 6. Oncogenic PIK3CA Serves as a Defining Biomarker for Sensitivity of Pre-clinical Models to cPLA2 Inhibition
(A) Immunoblot analysis confirming inducible knockdown of cPLA2 following induction with 2 μg/mL doxycycline.
(B) 3D acini formation of MCF10A PIK3CA WT and MUT cells following doxycycline-induced cPLA2-sh1 or shGFP expression. Cells were stained for Ki-67 (pink,
Alexa Fluor 546), F-actin (red, Phalloidin 633), and DAPI (blue).
(C) Quantification of Ki-67 staining from treatments in (B).
(D) Schematic of in vivo experimental design and tumor profiling with REIMS.
(E and F) Tumor weights of (E) PIK3CA WT (BR1458) and (F) C420R MUT (BR1282) breast PDX tumors treated with 100 mg/kg of the cPLA2α pharmacological
inhibitor ASB14780 under FAF diet (n = 8 mice for the BR1458 model, and n = 7 mice for the BR1282 for both the vehicle- and ASB14780-treated groups).
(G) Representative images of H&E staining from resected tumors in (E) and (F). The black masks in (G) represent viable tumor area, while unshaded regions
correspond to necrotic tissue.
(H and I) Quantification of viable tumor area from (H) PIK3CA WT (BR1458) and (I) PIK3CA MUT (BR1282) tumor sections based on the analysis depicted in (G).
(J and K) AA levels measured by REIMS in (J) PIK3CA WT (BR1458) and (K) MUT (BR1282) tumors excised and snap frozen 2 h after the final dosing. Error bars in
(C), (J), and (K) represent mean ± SEM, with data in (J) and (K) corresponding to tumor REIMS measurements from n = 7–8 mice. n.s., not significant;*p < 0.05;
**p < 0.01; ***p < 0.001; p values in (E), (F), and (H)–(K) were calculated using one-way ANOVA followed by unpaired, two-tailed Student’s t test with Bonferroni
correction.
many of the neighboring cancer and stromal cells are WT (Janis- oncogenic PIK3CA has been shown to sensitize cancer cells to
zewska et al., 2015). Given that AA has been shown to induce aspirin (Henry et al., 2017; Liao et al., 2012) and, based on our
both PI3K (Hughes-Fulford et al., 2006) and MAPK (Alexander findings, it is tempting to speculate that this connection could
et al., 2006) signaling, PIK3CA MUT cells could trigger a snow- be true, in part because of the heightened capacity of PIK3CA
ball effect, through overproduction of AA, affecting not only their MUT cells for high AA production. However, further studies are
own signaling and proliferation, but also that of their adjacent needed to ascertain this connection, because evidence sug-
PIK3CA WT cells. Moreover, in light of the role of prostaglandins gests that the growth inhibitory effect of aspirin in PIK3CA
in lymphangiogenesis (Lala et al., 2018), the paracrine effects of MUT cells is likely to be COX-2-independent (Henry et al., 2017).
AA could be of further relevance to the activity of PI3Kα in endo- Although PI3K pathway inhibitors have shown some efficacy in
thelial cells (Okkenhaug et al., 2016; Wang and Dubois, 2010). treating advanced solid tumors, the majority has been associ-
Eicosanoids no longer represent the missing link between ated with only partial tumor remission and they are often accom-
inflammation and cancer (Greene et al., 2011). Elevated tumor- panied by severe side effects (Fruman et al., 2017; Li et al., 2018).
derived PGE2 contributes to immune evasion by preventing the Recent evidence suggests that one way to enhance their efficacy
interferon gamma (IFNg)-dependent upregulation of ICAM-1 is by suppressing their insulin feedback through adoption of a
that is pertinent for complete CD8(+) T cells activation (Basingab ketogenic diet (Hopkins et al., 2018). Indeed, diet could play a
et al., 2016). In addition, autocrine PGE2 impairs NK cell viability much more significant role in therapy response than previously
and chemokine production and leads to downregulation of the anticipated. Our data suggest that a diet rich in FAs limits the ef-
chemokine receptors of cDC1 that promote their recruitment ficacy of the cPLA2 inhibitor, as PIK3CA MUT tumors likely
into tumors (Böttcher et al., 2018). Importantly, there is a strong depend on their high flux of extracellular FA intake to compen-
positive correlation between the gene signature of cDC1 and sate for the loss of AA. This observation raises the possibility
NK cells and better overall survival in melanoma and breast can- that adopting a diet without meat and dairy products (major sour-
cers (Böttcher et al., 2018), suggesting that monitoring the immu- ces of AA) could dramatically improve the sensitivity of the
nomodulatory functions of prostaglandins via PI3K/Akt pathway cPLA2 inhibitor and help restore tumor immunogenicity, sug-
inhibition could have important clinical implications. Indeed, we gesting a novel path for future clinical trials where nutrition will
have shown that modulation of AA levels in PIK3CA MUT tumors play a major role in disease management and treatment.
through cPLA2 inhibition in combination with dietary fat restriction
increases intra-tumor infiltration of NK cells and their associated STAR+METHODS
chemokines, while this can be reversed by the ‘‘Western’’ diet,
which contains an excess of omega6-FAs. NK cell markers Detailed methods are provided in the online version of this paper
were largely unaffected in PIK3CA WT tumors, and this could and include the following:
reflect their lower intra-tumor AA levels that are likely attributable
to reduced cPLA2 activity and FA uptake, as compared to d KEY RESOURCES TABLE
PIK3CA MUT cells. In light of recent evidence showing that block- d RESOURCE AVAILABILITY
ing PI3K signaling with the pan-PI3K inhibitor BKM120 increases B Lead Contact
tumor-immune infiltrate and renders PIK3CA MUT mouse bladder B Materials Availability
tumors more susceptible to PD-1 blockade (Borcoman et al., B Data and Code Availability
2019), our model raises important considerations for how immu- d EXPERIMENTAL MODEL AND SUBJECT DETAILS
notherapies may be successfully applied to oncogenic PIK3CA B Human samples
MUT tumors that may be inherently less immunogenic, at least B Animals
in part, due to enhanced AA production. B Cell culture
Another way to relieve the immunosuppressive effects of tu- d METHOD DETAILS
mor cells is by inhibiting COX activity via the use of non-steroidal B Experimental design
anti-inflammatory drugs (NSAIDs), such as aspirin. Notably, B Mass spectrometry analysis
Figure 7. Dietary Supplementation of Arachidonic Acid Reverses the Sensitivity of PIK3CA Mutant Tumors to cPLA2 Inhibition
(A) Schematic of in vivo experimental design and profiling of breast cancer cell line xenografts with REIMS.
(B and C) Relative tumor growth of CAL-51 (PIK3CA MUT)-derived xenografts stably expressing control shGFP or two independent shRNAs targeting cPLA2
(cPLA2-sh1 and cPLA2-sh5) under (B) fat-free or (C) “Western” diets.
(D) Weights of tumors excised at the end of the experiments (B) and (C).
(E and F) Relative tumor growth of Hs578T (PIK3CA WT)-derived xenografts stably expressing control shGFP or two independent shRNAs targeting cPLA2 under
(E) fat-free or (F) “Western” diets.
(G) Weights of tumors excised at the end of the experiments (E) and (F).
(H) Representative images of H&E staining from resected tumors in (D) and (G). The black masks in (H) represent viable tumor area, while unshaded regions
correspond to necrotic tissue.
(I and J) Quantification of viable tumor area from (I) PIK3CA MUT (CAL-51) and (J) PIK3CA WT (Hs578T) tumor sections based on the analysis depicted in (H).
(K and L) AA levels measured by REIMS in (K) PIK3CA MUT (CAL-51) and (L) PIK3CA WT (Hs578T) snap frozen excised tumors. AA intensities are reported as
scaled values to the appropriate shGFP-fat-free diet condition. Data in (B), (C), (E), (F), (K), and (L) represent the mean ± SEM of relative tumor growth or tumor
REIMS measurements from n = 3–5 mice. n.s., not significant; *p < 0.05; **p < 0.01; ***p < 0.001; p values in (B), (C), (E), and (F) were calculated using two-way
ANOVA, and one-way ANOVA followed by unpaired, two-tailed Student’s t test with Bonferroni correction was used in (D), (G), and (I)–(L).
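The scaling and summary statistics described in this legend (AA intensities scaled to the shGFP fat-free diet condition and reported as mean ± SEM) can be sketched as follows. This is a minimal illustration in Python: the function names and intensity values are hypothetical, and scaling to the mean of the control group is an assumption about how the reference value is defined.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    mean = statistics.mean(values)
    # SEM = sample standard deviation / sqrt(number of replicates)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def scale_to_control(values, control_values):
    """Express each value relative to the mean of the control group."""
    reference = statistics.mean(control_values)
    return [v / reference for v in values]


# Hypothetical AA intensities (arbitrary units), illustrative only.
control = [980.0, 1020.0, 1000.0]   # e.g. shGFP, fat-free diet
knockdown = [410.0, 390.0, 450.0]   # e.g. cPLA2 shRNA, fat-free diet

scaled = scale_to_control(knockdown, control)
mean, sem = mean_sem(scaled)
print(f"scaled AA level: {mean:.2f} ± {sem:.2f} (mean ± SEM, n = {len(scaled)})")
```

With this convention the control group has a scaled mean of 1, so the other conditions read directly as fold changes.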
  ○ Metabolomics data pre-processing and analysis
  ○ Transfections and site directed mutagenesis
  ○ Cell based assays
  ○ Three-dimensional cell culture
  ○ Confocal microscopy
  ○ Enzymatic assays
  ○ Immunoblot analysis
  ○ Immunoprecipitation analysis
  ○ Immunohistochemistry analysis
  ○ Proximity ligation assay
  ○ Fluorescent calcium assay
  ○ Quantitative RT-PCR and PIK3CA mutation analysis
  ○ In vitro kinase assay
  ○ Chemokine assays
  ○ Lipid extraction and eicosanoid profiling
  ○ Bioinformatic analysis
● QUANTIFICATION AND STATISTICAL ANALYSIS
SUPPLEMENTAL INFORMATION
Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.053.
A video abstract is available at https://doi.org/10.1016/j.cell.2020.05.053#mmc8.
ACKNOWLEDGMENTS
We thank Naomi Guppy and Farzana Noor (Breast Cancer Now Histopathology, ICR, London, UK) and Elena Miranda and Adriana Resende Alves (Pathology Core Facility, University College London Cancer Institute) for support with immunohistochemistry and hematoxylin and eosin analysis; and Champions Oncology (London, UK) for kindly providing breast, pancreatic, ovarian, sarcoma, and colorectal cancer PDX tumor specimens. We would also like to thank Edward St. John for enabling access to primary breast tumor samples, Verena M. Horneffer-van der Sluis for assistance with eicosanoid analysis, and the Biological Services Unit staff at the Institute of Cancer Research (Chelsea site) for their assistance with in vivo experiments. N.K. was supported by an ICR PhD studentship. The work described and the laboratory of G.P. were supported by the Institute of Cancer Research and a Cancer Research UK Grand Challenge award (C59824/A25044). Work in the Z.T. lab was supported by the European Research Council (MASSLIP Consolidator grant), Cancer Research UK Grand Challenge award (C59824/A25044), and the National Institute for Health Research (Imperial Biomedical Research Centre). A.V. was funded by the Ministry of Education, Culture and Sport under the Program for Promoting and Hiring of Talent and its Employability (Subprogram for Mobility “José Castillejo”) of the Spanish Government and by Comunitat Autònoma de les Illes Balears, Direcció General d’Innovació i Recerca (AAEE003/2017) and Fons Europeu de Desenvolupament Regional de la Unió Europea (FEDER).
AUTHOR CONTRIBUTIONS
N.K., Z.T., and G.P. designed the study with contributions from J.K.N. and R.C.G. Z.T. and G.P. directed the project and supervised data analysis. N.K. performed and analyzed most experiments. E.K. performed 3-D acini, proximity ligation assays, calcium flux analyses, the CRISPR knockdown generation, the site-directed mutagenesis constructs, and assisted with xenograft studies. A.T. performed in vitro kinase assays, ELISAs, and assisted with xenograft studies and REIMS analysis of tumor samples. A.V. assisted with estrogen receptor signaling experiments, the generation of dox-inducible cPLA2-knockdown cell lines, and lipid extractions for LC-MS analysis. P.I. developed essential platforms for pre-processing and interpretation of REIMS data. N.J.S.P. assisted with xenograft studies. D.J.M. assisted with chemokine assays. S.A.V. and G.A.E. cultured and analyzed cell lines grown under different microenvironment conditions. A.L.T. carried out image analysis. M.L.D. assisted with LC-MS profiling of eicosanoids. A.v.W. and C.M.I. helped with animal experiments. R.F.S. provided essential technical support for sample processing with REIMS. N.K. and G.P. wrote the manuscript. All authors edited the manuscript.
DECLARATION OF INTERESTS
N.K. and G.P. are inventors on a patent application covering new methods and compositions useful in the treatment of cancers with PIK3CA mutation (application number GB 2005874.9).
Received: August 5, 2019
Revised: March 7, 2020
Accepted: May 28, 2020
Published: June 18, 2020
REFERENCES
Alexander, L.D., Ding, Y., Alagarsamy, S., Cui, X.L., and Douglas, J.G. (2006). Arachidonic acid induces ERK activation via Src SH2 domain association with the epidermal growth factor receptor. Kidney Int. 69, 1823–1832.
Alexander, J., Gildea, L., Balog, J., Speller, A., McKenzie, J., Muirhead, L., Scott, A., Kontovounisios, C., Rasheed, S., Teare, J., et al. (2017). A novel methodology for in vivo endoscopic phenotyping of colorectal cancer based on real-time analysis of the mucosal lipidome: a prospective observational study of the iKnife. Surg. Endosc. 31, 1361–1370.
Ambs, P., Baccarini, M., Fitzke, E., and Dieter, P. (1995). Role of cytosolic phospholipase A2 in arachidonic acid release of rat-liver macrophages: regulation by Ca2+ and phosphorylation. Biochem. J. 311, 189–195.
Balog, J., Sasi-Szabó, L., Kinross, J., Lewis, M.R., Muirhead, L.J., Veselkov, K., Mirnezami, R., Dezső, B., Damjanovich, L., Darzi, A., et al. (2013). Intraoperative tissue identification using rapid evaporative ionization mass spectrometry. Sci. Transl. Med. 5, 194ra93.
Basingab, F.S., Ahmadi, M., and Morgan, D.J. (2016). IFNγ-Dependent Interactions between ICAM-1 and LFA-1 Counteract Prostaglandin E2-Mediated Inhibition of Antitumor CTL Responses. Cancer Immunol. Res. 4, 400–411.
Bligh, E.G., and Dyer, W.J. (1959). A rapid method of total lipid extraction and purification. Can. J. Biochem. Physiol. 37, 911–917.
Borcoman, E., De La Rochere, P., Richer, W., Vacher, S., Chemlali, W., Krucker, C., Sirab, N., Radvanyi, F., Allory, Y., Pignot, G., et al. (2019). Inhibition of PI3K pathway increases immune infiltrate in muscle-invasive bladder cancer. OncoImmunology 8, e1581556.
Böttcher, J.P., Bonavita, E., Chakravarty, P., Blees, H., Cabeza-Cabrerizo, M., Sammicheli, S., Rogers, N.C., Sahai, E., Zelenay, S., and Reis E Sousa, C. (2018). NK Cells Stimulate Recruitment of cDC1 into the Tumor Microenvironment Promoting Cancer Immune Control. Cell 172, 1022–1037.e14.
Burke, J.E., and Dennis, E.A. (2009). Phospholipase A2 structure/function, mechanism, and signaling. J. Lipid Res. 50 (Suppl), S237–S242.
Carracedo, A., and Pandolfi, P.P. (2008). The PTEN-PI3K pathway: of feedbacks and cross-talks. Oncogene 27, 5527–5541.
Chambers, M.C., Maclean, B., Burke, R., Amodei, D., Ruderman, D.L., Neumann, S., Gatto, L., Fischer, B., Pratt, B., Egertson, J., et al. (2012). A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotechnol. 30, 918–920.
Chou, M.M., Hou, W., Johnson, J., Graham, L.K., Lee, M.H., Chen, C.S., Newton, A.C., Schaffhausen, B.S., and Toker, A. (1998). Regulation of protein kinase C zeta by PI 3-kinase and PDK-1. Curr. Biol. 8, 1069–1077.
Clark, J.D., Schievella, A.R., Nalefski, E.A., and Lin, L.L. (1995). Cytosolic phospholipase A2. J. Lipid Mediat. Cell Signal. 12, 83–117.
Dan, S., Okamura, M., Seki, M., Yamazaki, K., Sugita, H., Okui, M., Mukai, Y., Nishimura, H., Asaka, R., Nomura, K., et al. (2010). Correlating phosphatidylinositol 3-kinase inhibitor efficacy with signaling pathway status: in silico and biological evaluations. Cancer Res. 70, 4982–4994.
Sarbassov, D.D., Ali, S.M., Sengupta, S., Sheen, J.H., Hsu, P.P., Bagley, A.F., Markhard, A.L., and Sabatini, D.M. (2006). Prolonged rapamycin treatment inhibits mTORC2 assembly and Akt/PKB. Mol. Cell 22, 159–168.
Saxton, R.A., and Sabatini, D.M. (2017). mTOR Signaling in Growth, Metabolism, and Disease. Cell 168, 960–976.
Simopoulos, A.P. (2008). The importance of the omega-6/omega-3 fatty acid ratio in cardiovascular disease and other chronic diseases. Exp. Biol. Med. (Maywood) 233, 674–688.
Sonnweber, T., Pizzini, A., Nairz, M., Weiss, G., and Tancevski, I. (2018). Arachidonic Acid Metabolites in Cardiovascular and Metabolic Diseases. Int. J. Mol. Sci. 19, 3285.
St John, E.R., Balog, J., McKenzie, J.S., Rossi, M., Covington, A., Muirhead, L., Bodai, Z., Rosini, F., Speller, A., Shousha, S., et al. (2017). Rapid evaporative ionisation mass spectrometry of electrosurgical vapours for the identification of breast pathology: towards an intelligent knife for breast cancer surgery. Breast Cancer Res. 19.
Tarrado-Castellarnau, M., de Atauri, P., and Cascante, M. (2016). Oncogenic regulation of tumor metabolic reprogramming. Oncotarget 7, 62726–62753.
Tomoo, T., Nakatsuka, T., Katayama, T., Hayashi, Y., Fujieda, Y., Terakawa, M., and Nagahira, K. (2014). Design, synthesis, and biological evaluation of 3-(1-Aryl-1H-indol-5-yl)propanoic acids as new indole-based cytosolic phospholipase A2α inhibitors. J. Med. Chem. 57, 7244–7262.
Vander Heiden, M.G., and DeBerardinis, R.J. (2017). Understanding the Intersections between Metabolism and Cancer Biology. Cell 168, 657–669.
Wang, D., and Dubois, R.N. (2010). Eicosanoids and cancer. Nat. Rev. Cancer 10, 181–193.
Wiederschain, D., Wee, S., Chen, L., Loo, A., Yang, G., Huang, A., Chen, Y., Caponigro, G., Yao, Y.M., Lengauer, C., et al. (2009). Single-vector inducible lentiviral RNAi system for oncology target validation. Cell Cycle 8, 498–504.
Wolfer, A.M., Gaudin, M., Taylor-Robinson, S.D., Holmes, E., and Nicholson, J.K. (2015). Development and Validation of a High-Throughput Ultrahigh-Performance Liquid Chromatography-Mass Spectrometry Approach for Screening of Oxylipins and Their Precursors. Anal. Chem. 87, 11721–11731.
Yu, G., and He, Q.-Y. (2016). ReactomePA: an R/Bioconductor package for reactome pathway analysis and visualization. Mol. Biosyst. 12, 477–479.
Zelenay, S., van der Veen, A.G., Böttcher, J.P., Snelgrove, K.J., Rogers, N., Acton, S.E., Chakravarty, P., Girotti, M.R., Marais, R., Quezada, S.A., et al. (2015). Cyclooxygenase-Dependent Tumor Growth through Evasion of Immunity. Cell 162, 1257–1270.
STAR+METHODS
REAGENT or RESOURCE SOURCE IDENTIFIER
Rabbit polyclonal anti-phospho-cPLA2 Cell Signaling Technology Cat# 2831; RRID: AB_2164445
(Ser505)
Rabbit monoclonal anti-HA-Tag Cell Signaling Technology Cat# 3724; RRID: AB_1549585
Rabbit monoclonal anti-GAPDH Cell Signaling Technology Cat# 2118; RRID: AB_561053
Rabbit polyclonal anti-PLCgamma1 Cell Signaling Technology Cat# 2822; RRID: AB_2163702
Rabbit polyclonal anti-phospho- Cell Signaling Technology Cat# 2821; RRID: AB_330855
PLCgamma1 (Tyr783)
Rabbit polyclonal anti-phospho-threonine Cell Signaling Technology Cat# 9381; RRID: AB_330301
Rabbit polyclonal anti-IkBalpha Cell Signaling Technology Cat# 9242; RRID: AB_331623
Rabbit monoclonal anti-phospho- Cell Signaling Technology Cat# 2859; RRID: AB_561111
IkBalpha (Ser32)
Rabbit monoclonal anti-Stat3 Cell Signaling Technology Cat# 4904; RRID: AB_331269
Rabbit polyclonal anti-phospho-Stat3 Cell Signaling Technology Cat# 9134; RRID: AB_331589
(Ser727)
Rabbit monoclonal anti-PKD/PKCm Cell Signaling Technology Cat# 90039; RRID: AB_2800149
Rabbit polyclonal anti-phospho-PKD/PKCm Cell Signaling Technology Cat# 2054; RRID: AB_2172539
(Ser744/748)
Mouse monoclonal anti-Rb (4H1) Cell Signaling Technology Cat# 9309; RRID: AB_823629
Rabbit monoclonal anti-estrogen inducible Abcam Cat# ab92377; RRID: AB_10562122
protein pS2
Rabbit polyclonal anti-PKCzeta Abcam Cat# ab59364; RRID: AB_944858
Rabbit monoclonal anti-phospho-PKCzeta Abcam Cat# ab62372; RRID: AB_946309
(Thr560)
Rabbit polyclonal anti-PKCepsilon Abcam Cat# ab63638; RRID: AB_1142276
Rabbit polyclonal anti-phospho- Abcam Cat# ab63387; RRID: AB_1142277
PKCepsilon (Ser729)
Rabbit monoclonal anti-secretory Abcam Cat# ab139692
phospholipase A2
Mouse monoclonal anti-PKCbeta II (F-7) Santa Cruz Biotechnology Cat# sc-13149; RRID: AB_628144
Mouse monoclonal anti-PKCzeta (B-7) Santa Cruz Biotechnology Cat# sc-393218
Mouse monoclonal anti-phospho-RKIP Santa Cruz Biotechnology Cat# sc-135779; RRID: AB_2163163
(Ser153)
Normal rabbit IgG Santa Cruz Biotechnology Cat# sc-2027; RRID: AB_737197
Mouse monoclonal anti-phospho-Rb Santa Cruz Biotechnology Cat# sc-271930; RRID: AB_670923
(Thr821/826)
Rabbit polyclonal anti-phospholipase Sigma-Aldrich Cat# SAB4200129; RRID: AB_11129638
A2 (iPLA2)
Rabbit monoclonal anti-phospho-PKCzeta Thermo Fisher Scientific Cat# MA5-15060; RRID: AB_10983263
(Thr410)
Mouse monoclonal anti-SREBP1 BD Biosciences Cat# 557036; RRID: AB_396559
Goat anti-rabbit IgG (H+L)-HRP conjugate Bio-Rad Cat# 170-6515; RRID: AB_11125142
Goat anti-mouse IgG (H+L)-HRP conjugate Bio-Rad Cat# 170-6516; RRID: AB_11125547
Mouse polyclonal anti-Nkp46/ncr1 R&D Systems Cat# AF2225; RRID: AB_355192
Rabbit monoclonal anti-Ki-67 (D3B5) Cell Signaling Technology Cat# 9129; RRID: AB_2687446
Goat anti-Mouse IgG (H+L) secondary Thermo Fisher Scientific Cat# A-11003; RRID: AB_2534071
antibody, Alexa Fluor 546
Phalloidin-iFluor 633 Abcam Cat# ab176758
Rabbit polyclonal anti-phospho-cPLA2 This paper (produced by Thermo Fisher Cat# UE1820P-T-AB1792
(Thr376) (Peptide name: PLA2G4A- Scientific)
369:383-pT376)
Ovarian PDX model Champions Oncology Cat# CTG0252
Ovarian PDX model Champions Oncology Cat# CTG0258
Ovarian PDX model Champions Oncology Cat# CTG0259
Ovarian PDX model Champions Oncology Cat# CTG0486
Pancreatic PDX model Champions Oncology Cat# CTG0292
Pancreatic PDX model Champions Oncology Cat# CTG0381
Pancreatic PDX model Champions Oncology Cat# CTG0391
Pancreatic PDX model Champions Oncology Cat# CTG1485
Pancreatic PDX model Champions Oncology Cat# CTG2205
Pancreatic PDX model Champions Oncology Cat# CTG0282
Pancreatic PDX model Champions Oncology Cat# CTG0283
Pancreatic PDX model Champions Oncology Cat# CTG0284
Pancreatic PDX model Champions Oncology Cat# CTG0285
Pancreatic PDX model Champions Oncology Cat# CTG0286
Sarcoma PDX model Champions Oncology Cat# CTG0886
Sarcoma PDX model Champions Oncology Cat# CTG1084
Sarcoma PDX model Champions Oncology Cat# CTG1116
Sarcoma PDX model Champions Oncology Cat# CTG1255
Sarcoma PDX model Champions Oncology Cat# CTG1628
Sarcoma PDX model Champions Oncology Cat# CTG0142
Sarcoma PDX model Champions Oncology Cat# CTG0143
Sarcoma PDX model Champions Oncology Cat# CTG0241
Sarcoma PDX model Champions Oncology Cat# CTG0242
Sarcoma PDX model Champions Oncology Cat# CTG0243
Colorectal PDX model Champions Oncology Cat# CTG0083
Colorectal PDX model Champions Oncology Cat# CTG0129
Colorectal PDX model Champions Oncology Cat# CTG0360
Colorectal PDX model Champions Oncology Cat# CTG0706
Colorectal PDX model Champions Oncology Cat# CTG0799
Colorectal PDX model Champions Oncology Cat# CTG0058
Colorectal PDX model Champions Oncology Cat# CTG0062
Colorectal PDX model Champions Oncology Cat# CTG0063
Colorectal PDX model Champions Oncology Cat# CTG0065
Colorectal PDX model Champions Oncology Cat# CTG0066
Chemicals, Peptides, and Recombinant Proteins
Hydrocortisone Sigma-Aldrich Cat# H-0888
Insulin-transferrin-selenium GIBCO Cat# 41400-045
Epidermal growth factor (EGF) Peprotech Cat# AF-100-15
Cholera toxin Sigma-Aldrich Cat# C-8052
Insulin Sigma-Aldrich Cat# I-1882
Fatty acid free bovine serum albumin Sigma-Aldrich Cat# A8806
PKC zeta pseudosubstrate inhibitory Sigma-Aldrich Cat# P1614
peptide
PKC beta II peptide inhibitor Sigma-Aldrich Cat# P-0102
FIPI hydrochloride hydrate Sigma-Aldrich Cat# F5808
4-hydroxytamoxifen Sigma-Aldrich Cat# T176
Bromoenol lactone Sigma-Aldrich Cat# B1552
Cycloheximide Sigma-Aldrich Cat# C7698
Doxycycline hyclate Sigma-Aldrich Cat# D9891
Arachidonic acid Sigma-Aldrich Cat# 10931
Palmitoleate Sigma-Aldrich Cat# P9417
Palmitate Sigma-Aldrich Cat# P0500
PKC alpha (C2-4) inhibitor peptide Santa Cruz Biotechnology Cat# sc-304
PKC epsilon inhibitor peptide Cambridge Bioscience Cat# CAY17476
Rapamycin Selleckchem Cat# S1039
Torin 1 Selleckchem Cat# S2827
BYL719 Selleckchem Cat# S2814
BKM120 Selleckchem Cat# S2247
MK2206 Selleckchem Cat# S1078
GSK690693 Selleckchem Cat# S1113
GSK650394 Tocris Bioscience Cat# 3572
U73122 Tocris Bioscience Cat# 1268
ASB14780 Axon Medchem Cat# 2578
DharmaFECT-1 transfection reagent Dharmacon Cat# T-2001-02
FuGENE HD Transfection reagent Promega Cat# E2311
Lipid Removal Adsorbent Supelco Cat# 13358
Matrigel Corning Cat# 354230
Paraformaldehyde Sigma-Aldrich Cat# 158127
Triton X-100 Sigma-Aldrich Cat# X100
DAPI (4’,6-Diamidino-2-Phenylindole, Thermo Fisher Scientific Cat# D1306
Dihydrochloride)
RIPA buffer Thermo Fisher Scientific Cat# 89900
Leupeptin Sigma-Aldrich Cat# L2884
Pepstatin Sigma-Aldrich Cat# P5318
Na3VO4 Sigma-Aldrich Cat# 450243
DL-Dithiothreitol Sigma-Aldrich Cat# 646653
Calyculin A Cell Signaling Technology Cat# 9902
Beta-glycerophosphate Sigma-Aldrich Cat# G9422
PMSF protease inhibitor Cell Signaling Technology Cat# 8553
Bradford reagent Bio-Rad Cat# 5000006
ALLN protease inhibitor Merck-Millipore Cat# 208719
2x Laemmli sample buffer Bio-Rad Cat# 161-0737
4x Laemmli sample buffer Bio-Rad Cat# 161-0747
10X Cell lysis buffer Cell Signaling Technology Cat# 9803
Recombinant human protein kinase C zeta Insight Biotechnology Cat# TP302472
Recombinant human phospholipase A2, Insight Biotechnology Cat# TP320972
group IVA
Magnesium chloride Sigma-Aldrich Cat# M8266
Bovine serum albumin Sigma-Aldrich Cat# A2153
Puromycin Invivogen Cat# ant-pr-1
Blasticidin Invivogen Cat# ant-bl-1
Critical Commercial Assays
QuikChange Lightning Site-Directed Agilent Cat# 210518
Mutagenesis Kit
CellTiter96 Aqueous Non-radioactive (MTS) Promega Cat# G5430
cell proliferation assay
Fatty Acid Uptake Kit Sigma-Aldrich Cat# MAK156
Cytosolic Phospholipase A2 Assay Kit Abcam Cat# ab133090
Secretory Phospholipase A2 Assay Kit Abcam Cat# ab133089
Diacylglycerol (DAG) Assay Kit Cell Biolabs Inc Cat# MET-5028
Active Rac1 Detection Kit Cell Signaling Technology Cat# 8815
Duolink In Situ Detection Reagents Red Kit Sigma-Aldrich Cat# DUO92008
Minus and Plus PLA probes Sigma-Aldrich Cat# DUO92004 and DUO92002
Fluo-4 Direct Calcium Assay Kit Thermo Fisher Scientific Cat# F10471
RNeasy Plus Mini Kit QIAGEN Cat# 74134
QuantiTect Reverse Transcription Kit QIAGEN Cat#205311
SYBR Select Master Mix Thermo Fisher Scientific Cat# 4472908
QIAamp DNA mini kit QIAGEN Cat# 51304
PNAClamp PIK3CA Mutation Detection Kit Panagene Cat# PNAC-4001
ADP-Glo Kinase Assay Promega Cat# V6930
Arachidonic Acid ELISA Kit Generon Cat# CEB098Ge
Prostaglandin E2 ELISA Kit Enzo Life Sciences Cat# ADI-900-001
Mouse RANTES (CCL5) ELISA Kit Abcam Cat# ab100739
Mouse Fractalkine (CX3CL1) ELISA Kit Abcam Cat# ab100683
Pierce BCA Protein Assay Kit Thermo Fisher Scientific Cat# 23225
Deposited Data
Custom script for quantification of proximity ligation assay images This manuscript GitHub: https://github.com/adamltyson/foci2D
Custom script for quantification of acini images This manuscript GitHub: https://github.com/adamltyson/cell-coloc-3D
Custom script for quantification of calcium flux This manuscript GitHub: https://github.com/adamltyson/CalciumAnalysis
REIMS data for Figure 1D This manuscript Mendeley Data; https://doi.org/10.17632/xcgc5kpntm.1
REIMS data for Figure 1E This manuscript Mendeley Data; https://doi.org/10.17632/xcgc5kpntm.1
REIMS data for Figure 1H This manuscript Mendeley Data; https://doi.org/10.17632/xcgc5kpntm.1
REIMS data for Figure 3B This manuscript Mendeley Data; https://doi.org/10.17632/xcgc5kpntm.1
REIMS data for Figure 3D This manuscript Mendeley Data; https://doi.org/10.17632/xcgc5kpntm.1
Significantly altered phospholipids across breast cancer cell lines of different receptor, or triple negative status This manuscript Table S1
Significantly different fatty acids between MCF10A PIK3CA wild-type and E545K/H1047R mutant cells This manuscript Table S4
Experimental Models: Cell Lines
Human PIK3CA (H1047R/+) MCF10A Horizon Discovery Cat# HD 101-011
Human PIK3CA (E545K/+) MCF10A Horizon Discovery Cat# HD 101-002
AU565 (human breast carcinoma) ATCC Cat# CRL-2351; RRID: CVCL_1074
BT20 (human breast carcinoma) ATCC Cat# HTB-19; RRID: CVCL_0178
BT474 (human breast carcinoma) ATCC Cat# HTB-20; RRID: CVCL_0179
BT549 (human breast carcinoma) ATCC Cat# HTB-122; RRID: CVCL_1092
CAL51 (human breast carcinoma) DSMZ Cat# ACC-302; RRID: CVCL_1110
CAMA1 (human breast carcinoma) ATCC Cat# HTB-21; RRID: CVCL_1115
EFM19 (human breast carcinoma) DSMZ Cat# ACC-231; RRID: CVCL_0253
Hs578T (human breast carcinoma) ATCC Cat# HTB-126; RRID: CVCL_0332
JIMT1 (human breast carcinoma) DSMZ Cat# ACC-589; RRID: CVCL_2077
KPL1 (human breast carcinoma) DSMZ Cat# ACC-317; RRID: CVCL_2094
MCF7 (human breast carcinoma) ATCC Cat# HTB-22; RRID: CVCL_0031
MDAMB134 (human breast carcinoma) ATCC Cat# HTB-23; RRID: CVCL_0617
MDAMB157 (human breast carcinoma) ATCC Cat# HTB-24; RRID: CVCL_0618
MDAMB231 (human breast carcinoma) ATCC Cat# HTB-26; RRID: CVCL_0062
MDAMB361 (human breast carcinoma) ATCC Cat# HTB-27; RRID: CVCL_0620
MDAMB436 (human breast carcinoma) ATCC Cat# HTB-130; RRID: CVCL_0623
MDAMB453 (human breast carcinoma) ATCC Cat# HTB-131; RRID: CVCL_0418
MDAMB468 (human breast carcinoma) ATCC Cat# HTB-132; RRID: CVCL_0419
MFM223 (human breast carcinoma) DSMZ Cat# ACC-422; RRID: CVCL_1408
S68 (human breast carcinoma) Breast Cancer Now (Institute of Cancer RRID: CVCL_5585
Research)
SKBR3 (human breast carcinoma) ATCC Cat# HTB-30; RRID: CVCL_0033
T47D (human breast carcinoma) ATCC Cat# HTB-133; RRID: CVCL_0553
UACC812 (human breast carcinoma) ATCC Cat# CRL-1897; RRID: CVCL_1781
VP229 (human breast carcinoma) Breast Cancer Now (Institute of Cancer RRID: CVCL_2754
Research)
BT483 (human breast carcinoma) ATCC Cat# HTB-121; RRID: CVCL_2319
HCC1143 (human breast carcinoma) ATCC Cat# CRL-2321; RRID: CVCL_1245
HCC1395 (human breast carcinoma) ATCC Cat# CRL-2324; RRID: CVCL_1249
HCC1428 (human breast carcinoma) ATCC Cat# CRL-2327; RRID: CVCL_1252
HCC1500 (human breast carcinoma) ATCC Cat# CRL-2329; RRID: CVCL_1254
HCC1569 (human breast carcinoma) ATCC Cat# CRL-2330; RRID: CVCL_1255
HCC1937 (human breast carcinoma) ATCC Cat# CRL-2336; RRID: CVCL_0290
HCC1954 (human breast carcinoma) ATCC Cat# CRL-2338; RRID: CVCL_1259
HCC202 (human breast carcinoma) ATCC Cat# CRL-2316; RRID: CVCL_2062
HCC38 (human breast carcinoma) ATCC Cat# CRL-2314; RRID: CVCL_1267
HCC70 (human breast carcinoma) ATCC Cat# CRL-2315; RRID: CVCL_1270
SUM52 (human breast carcinoma) Breast Cancer Now (Institute of Cancer RRID: CVCL_3425
Research)
ZR751 (human breast carcinoma) ATCC Cat# CRL-1500; RRID: CVCL_0588
ZR7530 (human breast carcinoma) ATCC Cat# CRL-1504; RRID: CVCL_1661
SUM44 (human breast carcinoma) Breast Cancer Now (Institute of Cancer RRID: CVCL_3424
Research)
SUM159 (human breast carcinoma) Breast Cancer Now (Institute of Cancer RRID: CVCL_5423
Research)
SUM149 (human breast carcinoma) Breast Cancer Now (Institute of Cancer RRID: CVCL_3422
Research)
SUM225 (human breast carcinoma) Breast Cancer Now (Institute of Cancer RRID: CVCL_5593
Research)
SUM229 (human breast carcinoma) Breast Cancer Now (Institute of Cancer RRID: CVCL_5594
Research)
HEK293T (human embryonic kidney) ATCC Cat# CRL-3216; RRID: CVCL_0063
Human MCF10A PIK3CA WT CRISPR This manuscript N/A
control cell line
Human MCF10A PIK3CA WT cPLA2 This manuscript N/A
CRISPR cell line
Human MCF10A PIK3CA H1047R (+/) This manuscript N/A
CRISPR control cell line
Human MCF10A PIK3CA H1047R (+/) This manuscript N/A
cPLA2 CRISPR cell line
Experimental Models: Organisms/Strains
Mouse: BALB/c nude (female, age: 6– Beijing Anikeeper Biotech (Beijing, China) N/A
8 weeks)
Mouse: BALB/c nude (female, age: 7– Envigo N/A
9 weeks)
Oligonucleotides
Primers for cPLA2 shRNA amplification Sigma-Aldrich N/A
(Forward 5′-GTGGAAAGGACGAAACACCGGT-3′,
Reverse 5′-TTTGTCTCGAGGTCGAGAATTC-3′)
Mutagenesis primers to generate cPLA2 S505A Sigma-Aldrich N/A
(Forward 5′-GCAAAGTCACTCAAAGGAGCCAGTGGATAAGATGTATTG-3′,
Reverse 5′-CAATACATCTTATCCACTGGCTCCTTTGAGTGACTTTGC-3′)
Mutagenesis primers to generate cPLA2 T376A Sigma-Aldrich N/A
(Forward 5′-TTCTTCATACTTCTTAACGACTGCTCCCATAAAAAATTTGCTTCCAA-3′,
Reverse 5′-TTGGAAGCAAATTTTTTATGGGAGCAGTCGTTAAGAAGTATGAAGAA-3′)
qPCR primers for human cPLA2 (PLA2G4A) Sigma-Aldrich N/A
(Forward 5′-GATGAAACTCTAGGGACAGCAAC-3′,
Reverse 5′-CTGGGCATGAGCAAACTTCAA-3′)
qPCR primers for human beta-actin Sigma-Aldrich N/A
(Forward 5′-GACCCAGATCATGTTTGAGACC-3′,
Reverse 5′-CTTCATGAGGTAGTCAGTCAGG-3′)
Recombinant DNA
pLKO-Tet-On Vector Wiederschain et al., 2009 Addgene ID: 21915
pCMV3-HA-PLA2G4A Sino Biological Cat# HG13126-NY
RICTOR ON-TARGETplus SMARTPool Dharmacon Cat# L-016984-00-0005
human siRNA
RAPTOR ON-TARGETplus SMARTPool Dharmacon Cat# L-004107-00-0005
human siRNA
FRAP1 ON-TARGETplus SMARTPool Dharmacon Cat# L-003008-00-0005
human siRNA
PLCg1 ON-TARGETplus SMARTPool Dharmacon Cat# L-003559-00-0005
human siRNA
PRKCZ ON-TARGETplus SMARTPool Dharmacon Cat# L-003526-00-0005
human siRNA
Non-targeting siRNA control Dharmacon Cat# D-001810-01-05
TRC Lentiviral eGFP shRNA positive control Dharmacon Cat# RHS4459
TRC Lentiviral Human PLA2G4A shRNA 1 Dharmacon Cat# TRCN0000050263
TRC Lentiviral Human PLA2G4A shRNA 5 Dharmacon Cat# TRCN0000050267
LentiCRISPR v2 Sanjana et al., 2014 Addgene ID: 52961
PLA2G4A sgRNA CRISPR/Cas9 All-in-One Applied Biological Materials Cat# K1659207
Lentivector Target 2
pCMV-HA-PLA2G4A-S505A This manuscript N/A
pCMV-HA-PLA2G4A-T376A This manuscript N/A
Inducible pLKO-Tet-On-TRC Lentiviral This manuscript N/A
Human PLA2G4A shRNA 1
Software and Algorithms
R statistical software (version 3.5.1) The R Project https://www.r-project.org/
ProteoWizard MsConvert software (version 3.0.11781) Chambers et al., 2012 http://proteowizard.sourceforge.net/download.html
MALDIquant package Gibb and Strimmer, 2012 https://cran.r-project.org/web/packages/MALDIquant/index.html
ReactomePA package Yu and He, 2016 https://bioconductor.org/packages/release/bioc/html/ReactomePA.html
MATLAB (2014a, version 8.3.0.532) Mathworks https://www.mathworks.com/products/matlab.html
GraphPad Prism (version 8.0.1) GraphPad https://www.graphpad.com/scientific-software/prism/
Database for Annotation, Visualization and Integrated Discovery (DAVID, version 6.8) Huang et al., 2009 https://david.ncifcrf.gov/
TargetLynx software Waters Corporation https://www.waters.com/waters/home.htm
Image Lab Software (version 5.2.1) Bio-Rad https://www.bio-rad.com/en-uk/product/image-lab-software?ID=KRE6P5E8Z
ImageJ (version 1.51) NIH https://imagej.nih.gov/ij/download.html
Other
Protein G Sepharose beads Sigma-Aldrich Cat# P3296
Dulbecco’s Modified Eagle’s GIBCO Cat# 41965-039
Medium (DMEM)
Roswell Park Memorial Institute GIBCO Cat# 10220-106
(RPMI) 1640
Ham’s F12 media Thermo Fisher Scientific Cat# 11765054
DMEM/F-12 GIBCO Cat# 31330-038
Horse Serum Thermo Fisher Scientific Cat# 16050-122
4–15% Criterion TGX Precast Midi Protein Gel Bio-Rad Cat# 5671084
4–15% Mini-PROTEAN TGX Precast Protein Gel Bio-Rad Cat# 4561083
Electrosurgical bipolar forceps Erbe Elektromedizin (Germany) N/A
ForceTriad electrosurgical unit Covidien (Ireland) N/A
Thermo Exactive orbitrap instrument Thermo Scientific N/A
Waters HSS T3 UPLC column Waters Corporation Cat# 186005614
Waters Xevo TQ-S triple quadrupole mass spectrometer Waters Corporation N/A
Normal diet (for PDX study) Keaoxieli Feed (Beijing, China) Cat# 2152
Fat free diet (for PDX study) Xietong Organism (Beijing, China) Cat# RD17112401
Western diet (omega-3/omega-6 = 1:50) (For cell line xenograft study) Research Diets Cat# D19032707
Balanced diet (omega-3/omega-6 = 1:1) (For cell line xenograft study) Research Diets Cat# D19032708
Fat free diet (For xenograft study) Research Diets Cat# D19032705
Precellys Lysing Soft tissue homogenizing kit Precellys Cat# P000912-LYSK0
Precellys24 homogenizer Bertin Instruments Cat# P000669-PR240-A
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, George
Poulogiannis (george.poulogiannis@icr.ac.uk).
Materials Availability
All unique/stable reagents generated in this study are available from the Lead Contact without restriction.
Human samples
12 primary breast cancer samples from female patients (> 18 years of age) who consented to utilization of tissue for research were
provided by the Imperial College Healthcare NHS Trust Tissue Bank. Other investigators may have received samples from these
same tissues. The research was supported by the National Institute for Health Research (NIHR) Biomedical Research Centre based
at Imperial College Healthcare NHS Trust and Imperial College London. The views expressed are those of the author(s) and not
necessarily those of the NHS, the NIHR or the Department of Health. Human samples used in this research project were obtained
with evaluation and approval from the Wales Research Ethics Committee Reference 17/WA/0161 (Imperial College Healthcare Tissue
Bank Human Tissue Authority license: 12275; Project number R18024), the East of England – Cambridge East Research Ethics Com-
mittee Reference 14/EE/0024, and the project was registered under the Imperial College Tissue Bank.
Animals
Mouse PDX experiments were performed by Crown Bioscience in accordance with approved Institutional Animal Care and Use Com-
mittee (IACUC) protocols and ethical guidelines, and in strict accordance with the Crown Bioscience Guidelines and Standard Oper-
ating Procedures. Two primary human triple-negative breast cancer PDX corresponding to PIK3CA WT (BR1458) or PIK3CA C420R
MUT (BR1282) tumor fragments (2-3 mm in diameter) were inoculated subcutaneously into the breast pad of 7-9-week-old female
immunodeficient BALB/c nude mice weighing 17-23 g and which had not received previous treatments or procedures. Once tumors
reached a volume of 100-200 mm³, mice were randomized into four groups corresponding to either a normal or fatty acid free (FAF)
diet and administered with vehicle (0.5% hydroxypropyl cellulose in sterile water) or 100 mg/kg cPLA2 inhibitor ASB14780 (2578,
Axon Medchem) daily through oral gavage for 21 days. Animals were housed in a specific pathogen free facility in individually vented
cages and provided with diets and distilled water ad libitum. Room temperature was monitored and maintained at 20-25 °C with the
light cycle set at 12 hours. All animals were checked daily for signs of ill health, as well as for any effects of tumor growth and
treatments on behavior such as mobility, food and water consumption, and body weight gain/loss. Researchers were not blinded
to treatment groups. Tumors were excised 2 hours after the final dosing and snap frozen in preparation for metabolomics/REIMS
processing and histopathological assessment. The mean tumor area as a percent of the total tissue area was initially assessed by
an independent histopathologist, and subsequently quantified using ImageJ version 1.51. Normal and FAF diets were purchased
from Keaoxieli Feed (2152) and Xietong Organism (RD17112401), respectively, and their compositions are summarized in Table S6.
All animal work undertaken at the Institute of Cancer Research was carried out under UK Home Office Project Licenses
P6AB1448A (Establishment License, X702B0E74 70/2902) and was approved by the Animal Welfare and Ethical Review Body at
the ICR. For cell line-derived xenograft studies, 7-9 week old female immunodeficient BALB/c nude mice weighing 18-25 g and which
had not received previous treatments or procedures were initially pre-conditioned on fat free, balanced (omega3/omega6 1:1) or
Western (omega3/omega6 1:50) diets for 2 weeks to assess tolerability. In order for the animal feed to be controlled, cages were ran-
domized to treatment groups rather than individual mice, and this occurred prior to orthotopic injections. Animals were subsequently
injected with 2.5 × 10⁶ triple negative CAL51 (PIK3CA mutant) or Hs578T (PIK3CA WT) cells expressing either control shGFP or two
independent shRNAs targeting cPLA2 (cPLA2-sh1 or cPLA2-sh5) in 100 µL PBS:Matrigel (50:50) into the right mammary fat pad. Animals
were housed in a specific pathogen free facility in individually vented cages (no more than 4 mice per cage) and provided with
diets and distilled water ad libitum. Room temperature was monitored and maintained at 20-25 °C with the light cycle set at 12 hours.
All animals were checked daily for signs of ill health, as well as for any effects of tumor growth and treatments on behavior such as
mobility, food and water consumption, and body weight gain/loss. Owing to the nature of the diets, blinding was not possible. Tumor
measurements were taken twice weekly in three dimensions (width, length and depth), and presented as relative tumor growth
normalized to first measurement day. Mice were excluded from the analysis if the primary tumor engrafted subcutaneously or into
the peritoneum instead of the mammary fat pad. On the final day of the experiment, tumors were excised and immediately snap
frozen in preparation for metabolomics/REIMS processing and histopathological assessment. The mean tumor area was quantified
as a percent of the total tissue area using ImageJ version 1.51. Fat free, balanced and Western diets were purchased from Research
Diets Inc. (D19032705, D19032708, D19032707, respectively). The composition of the diets used for the cell line xenograft study is
summarized in Table S6.
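The normalization of tumor measurements to the first measurement day can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the text states only that width, length and depth were measured, so the ellipsoid volume formula is an assumption, and the function names and caliper readings are hypothetical.

```python
import math


def ellipsoid_volume(width_mm, length_mm, depth_mm):
    """Tumor volume in mm^3, ASSUMING an ellipsoid (formula not stated in the text)."""
    return math.pi / 6.0 * width_mm * length_mm * depth_mm


def relative_growth(volumes):
    """Normalize a series of volumes to the first measurement day."""
    if not volumes or volumes[0] <= 0:
        raise ValueError("first measurement must be a positive volume")
    first = volumes[0]
    return [v / first for v in volumes]


# Hypothetical caliper readings (width, length, depth) in mm, taken twice weekly.
measurements = [(4.0, 5.0, 3.0), (5.0, 6.0, 4.0), (6.0, 8.0, 5.0)]
volumes = [ellipsoid_volume(*m) for m in measurements]
growth = relative_growth(volumes)
```

Because every series is divided by its own first value, the first entry is always 1 and later entries read as fold change relative to the first measurement day.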
Fresh frozen breast patient-derived xenograft (PDX) tumors were obtained from Crown Bioscience, and additional breast, ovarian,
pancreatic, sarcoma and colorectal PDX tumors were kindly provided by Champions Oncology. These are described in the Key Re-
sources Table. The PIK3CA mutational status for all 66 PDX tumor samples is summarized in Table S5.
Cell culture
Human female breast carcinoma cell lines AU565, BT20, BT474, BT549, CAL51, CAMA1, EFM19, Hs578T, JIMT1, KPL1, MCF7,
MDAMB134, MDAMB157, MDAMB231, MDAMB361, MDAMB436, MDAMB453, MDAMB468, MFM223, S68, SKBR3, T47D,
UACC812 and VP229 cells were cultured in Dulbecco’s Modified Eagle’s Medium (DMEM) (GIBCO, 41965-039) and BT483,
HCC1143, HCC1395, HCC1428, HCC1500, HCC1569, HCC1937, HCC1954, HCC202, HCC38, HCC70, SUM52, ZR751 and
ZR7530 cells were cultured in Roswell Park Memorial Institute (RPMI) 1640 media (Sigma-Aldrich, R8758), both supplemented
with 10% fetal bovine serum (FBS) (GIBCO, 10220-106). SUM44 (human, female), SUM159 (human, female), SUM149 (human, fe-
male), SUM225 (human, female) and SUM229 (human, female) were cultured in Ham’s F12 media (Thermo Fisher Scientific,
11765054) supplemented with 5% FBS, 0.5 µg/ml hydrocortisone (Sigma-Aldrich, H-0888) and 0.4% insulin-transferrin-selenium
(GIBCO, 41400-045) and MCF10A (human, female) cells (including the PIK3CA MUT isogenic panel) were cultured in DMEM/F-12
(GIBCO, 31330-038) supplemented with 5% horse serum (Thermo Fisher Scientific, 16050-122), 20 ng/ml epidermal growth factor
(EGF) (Peprotech, AF-100-15), 100 ng/ml cholera toxin (Sigma-Aldrich, C-8052), 0.5 µg/ml hydrocortisone (Sigma-Aldrich, H-0888)
and 10 µg/ml insulin (Sigma-Aldrich, I1882). For cell culture conditions free of exogenous sources of fatty acids, 10% FBS or 5%
horse serum was replaced with 1% fatty acid-free bovine serum albumin (BSA) (Sigma-Aldrich, A8806). All cell lines were maintained
at 37 °C, 5% CO2. All cell lines were authenticated by short tandem repeat analysis (Eurofins Scientific) and were tested and
confirmed to be negative for mycoplasma infection.
METHOD DETAILS
Experimental design
Experiments were repeated multiple times across different cell line and tumor models with similar results as indicated in the figure leg-
ends. Key findings from in vivo experiments were reproduced using orthogonal approaches including cell line xenograft models and
genetic inhibitions. Animals were randomized into treatment groups either individually following PDX engraftment and growth to
100-200 mm³, or in cages following diet preconditioning. Throughout the study researchers were not blinded as data analysis required
prior knowledge of the sample annotation. For in vitro and in vivo experiments, sample size was chosen based on preliminary exper-
iments and previous experience with protocols. No completed data were excluded from the analysis performed in this manuscript.
REIMS analysis was performed with commercially available electrosurgical bipolar forceps (Erbe Elektromedizin, Germany) con-
nected to a ForceTriad electrosurgical unit (Covidien, Ireland) programmed in Macro bipolar setting using 4 W or 30 W power for cell
lines and tumors, respectively. Bipolar forceps were connected to the inlet capillary of a Thermo Exactive orbitrap instrument (Thermo
Scientific) using PTFE tubing, allowing for the direct suction of aerosol generated from rapid biomass heating to the mass spectrom-
eter (set up is shown in Figure 1A). The mass spectrometer settings used for phospholipid and fatty acid profiling are summarized in
Table S7.
PCR cycling parameters were as follows: initial denaturing at 95 °C for 2 min, followed by 18 cycles of 20 s denaturing (95 °C), 10 s
annealing (60 °C) and 4.5 min elongation (68 °C). A final elongation step occurred for 5 min (68 °C). To assess the effects of these
phosphomutants on cPLA2 activity or arachidonic acid (AA) levels, cells were transiently transfected with 9 µg pCMV3-HA-PLA2G4A,
pCMV3-HA-PLA2G4A-S505A or pCMV3-HA-PLA2G4A-T376A vector DNA and 18 µL FuGENE HD Transfection reagent, and
experiments were performed 48 hours post transfection.
Confocal microscopy
For immunofluorescence, cells were fixed with 4% PFA at room temperature, permeabilized with pre-chilled 0.5% Triton X-100 for
10 min prior to blocking with 2% BSA/PBS for 1 hour. After blocking, cells were incubated overnight at 4 °C with primary Ki-67 antibody
(Cell Signaling Technology, 9129) in a humidified chamber. The following day, cells were incubated for 1 hour in 1% BSA/PBS with an
Alexa Fluor 546-conjugated secondary antibody (Thermo Fisher Scientific, A-11003) and Phalloidin 633
(Abcam, ab176758) to visualize Ki-67 and F-actin, respectively. Slides were mounted using Prolong
Gold anti-fade reagent with DAPI (Invitrogen, D1306). Images were captured using a Zeiss AxioObserver microscope equipped
with a Yokogawa CSU-W1 spinning disk unit (Intelligent Imaging Innovations) and a 40x oil objective. Serial z stacks of the acini
structures were acquired at 5 µm intervals (usually 10-15 sections per field), and then analyzed with a custom MATLAB script (2017b, The
Mathworks Inc.). Images were resampled to isotropic resolution and each spheroid was manually segmented. The DAPI signal was
thresholded using Otsu’s method (Otsu, 1979) following intensity depth correction and smoothing. Holes were then filled and small,
non-cellular objects were removed. The resulting binary nuclei image was used as a mask to measure cellular proliferation.
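The thresholding step can be illustrated with a short Python sketch of Otsu's method on a flat list of integer intensities. This is not the authors' MATLAB script; the function name and pixel values are hypothetical, and the depth correction, smoothing, hole filling and small-object removal described above are omitted.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with intensity > t are treated as foreground (nuclei).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]          # pixels at or below t form the background
        if w_bg == 0:
            continue
        w_fg = total - w_bg      # remaining pixels form the foreground
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:   # keep the first threshold reaching the maximum
            best_var, best_t = between, t
    return best_t


# Hypothetical 8-bit DAPI intensities: dim background and bright nuclei.
pixels = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 40
threshold = otsu_threshold(pixels)
nuclei_mask = [p > threshold for p in pixels]
```

For a real z stack the same function would be applied to the flattened, corrected image, and the resulting binary mask would then be cleaned before measuring proliferation.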
Enzymatic assays
cPLA2, iPLA2 and sPLA2 activities were measured using commercially available assays (Abcam, ab133089 or ab133090) according
to the manufacturer’s instructions. Briefly, total cell lysates were obtained using 1x Cell Lysis Buffer (Cell Signaling Technology, 9803)
under non-denaturing conditions. For cPLA2 activity, 10 µL lysate, 5 µL Assay buffer and 200 µL substrate solution containing
arachidonoyl Thio-PC were incubated at room temperature for one hour. For iPLA2 activity, cell lysates were either untreated (measuring
both cPLA2 and iPLA2 activity) or treated with 5 mM of the iPLA2 specific inhibitor bromoenol lactone (BEL), and activity was
determined as follows: iPLA2 activity = (Activity without BEL) – (Activity with BEL). For sPLA2 activity, conditioned media was obtained
from MCF10A PIK3CA WT and MUT cells and concentrated using a centrifugal vacuum evaporator (‘‘SpeedVac’’). Dried samples
were resuspended in 100 µL Assay Buffer, and 10 µL of this was used for the assay. Cellular diacylglycerol (DAG) levels were
measured using a DAG assay kit according to the manufacturer’s instructions (Cell Biolabs Inc., MET-5028). Briefly, 1 × 10⁷ cells
were harvested by scraping with 1 mL cold PBS, and pellets were obtained after centrifugation at 1500 g for 10 min. Lipids were
extracted following sonication and incubation with methanol, sodium chloride and chloroform. The lower chloroform phase was
washed twice with pre-equilibrated upper phase (PEU) and dried under a stream of nitrogen. 50 µL of assay buffer was used to
resuspend the dried sample, 20 µL of which was used for the assay.
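The BEL subtraction used for iPLA2 activity in this section amounts to a difference of two readings. A minimal sketch, assuming replicate readings are averaged before subtraction (an assumption; the text gives only the formula), with hypothetical names and arbitrary activity units:

```python
from statistics import mean


def ipla2_activity(without_bel, with_bel):
    """iPLA2 activity = (activity without BEL) - (activity with BEL).

    Each argument is a list of replicate readings; averaging the
    replicates first is an assumption made for this illustration.
    """
    return mean(without_bel) - mean(with_bel)


# Hypothetical replicate readings in arbitrary activity units.
untreated = [12.0, 13.0, 14.0]   # cPLA2 + iPLA2 activity
bel_treated = [8.0, 9.0, 10.0]   # iPLA2 inhibited by BEL
activity = ipla2_activity(untreated, bel_treated)
```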
Immunoblot analysis
Cells were washed with ice-cold PBS and lysed on ice for 30 min with cell lysis buffer containing RIPA buffer (Thermo Scientific,
89900) supplemented with 4 mg each of leupeptin (Sigma-Aldrich, L2884) and pepstatin (Sigma-Aldrich, P5318), 2 mM Na3VO4
(Sigma-Aldrich, 450243), 1 mM DL-Dithiothreitol (Sigma-Aldrich, 646653), 10 mM Calyculin A (Cell Signaling Technology, 9902),
250 mM β-glycerophosphate (Sigma-Aldrich, G9422) and 400 mM PMSF protease inhibitor (Cell Signaling Technology, 8553). Lysates
were subjected to centrifugation at 12,000 g for 30 min at 4 °C, and protein concentrations were determined using the Bradford assay
(Bio-Rad, 5000006). Nuclear isolation for mature SREBP probing was performed using a Nuclear Extract Kit (Active Motif, 40010) with
10 mg/ml of the protease inhibitor ALLN (Millipore, 208719). Protein lysates were boiled for 10 min and subjected to SDS-PAGE elec-
trophoresis using 4%–15% precast gels (Bio-Rad, 567-1084). Densitometry was calculated using the Image Lab Software 5.2.1 (Bio-
Rad). Affinity purified custom antibodies for phosphorylated cPLA2 on Thr376 were developed by Thermo Fisher Scientific using the
PLA2G4A-369:383 peptide antigen and used at a dilution of 1:500 in 5% BSA/TBST. All other primary antibodies were used at a dilu-
tion of 1:1000 in 5% BSA/TBST solution, and secondary antibodies at 1:5000 in 5% milk/TBST.
Immunoprecipitation analysis
MCF10A PIK3CA WT and MUT isogenics were transiently transfected with HA-tagged cPLA2 (pCMV3-HA-PLA2G4A, Sino Biological,
HG13126-NY) and lysed with 1X Cell Lysis Buffer (Cell Signaling Technologies, 9803) 48 hours post transfection. Equal volume of diluted
cell lysates containing 4 mg of soluble protein were incubated with 2 mg rabbit anti-HA (Cell Signaling Technology, 3724) or 2 mg normal
rabbit IgG (Santa-Cruz Biotechnology, sc-2027) as a negative control, and were pre-coupled with protein G Sepharose beads
(Sigma-Aldrich, P3296) following incubation at 4 °C for 4 hours rotating. 1X Cell Lysis Buffer was used to wash the beads four times, after which
samples were resuspended in 30 µL 2x Laemmli sample buffer (Bio-Rad, 161-0737) and boiled for 10 min to release bound proteins.
To detect GTP-bound Rac1, the Active Rac1 Detection Kit (Cell Signaling Technology, 8815) was used, following the
manufacturer’s instructions. Briefly, cell lysates were harvested under non-denaturing conditions using 1X Cell Lysis Buffer. To affinity
precipitate activated G-protein, spin cups were incubated with 100 µL of glutathione resin and subsequently washed with 1X Cell Lysis
Buffer. 20 µg of GST-PAK1-PBD was added to spin cups containing glutathione resin, after which 700 µL of the cell lysate containing
1 mg total protein was added, and the mix was incubated at 4 °C for 1 hour with gentle shaking. Following three washes with 400 µL 1X
Cell Lysis Buffer, samples were eluted by adding 2X reducing sample buffer containing 200 mM DTT in 2X SDS sample buffer, and
subsequently heated for 5 min at 95 °C. 25 µL of eluted samples were loaded onto an SDS-PAGE gel for subsequent immunoblot
analysis using the provided Rac1 mouse monoclonal antibody (1:1000 dilution).
Immunohistochemistry analysis
10 µm thick sections were obtained from formalin-fixed and paraffin-embedded tumors and stained using an anti-phospho PKCζ
Thr560 antibody (Abcam, ab62372) at a dilution of 1:100. For breast PDX and cell line-derived xenograft tumors, fresh frozen tumor
pieces were mounted on Optimal Cutting Temperature (OCT) compound and sectioned to 10 µm thickness. To assess the presence
of activated natural killer cells, slides were stained using goat anti-mouse NKp46/ncr1 polyclonal antibody (R&D Systems, 170-
6516) at a final concentration of 3 mg/ml. Images were captured on a Hamamatsu NanoZoomer, and staining was quantified using the
IHC Profiler plug-in for ImageJ.
media of the 2x Fluo-4 Direct™ calcium assay reagent solution for 30 min at 37 °C, 5% CO2, without removing the assay media. The
induced cells were stimulated with full media for 30 min, while incubated with the calcium assay reagent solution. Cells were then
analyzed by monitoring the fluorescence of the Fluo-4 dye using an ImageXpress Micro Confocal High-Content Imaging System
(Molecular Devices) and a 20x objective with 488 nm excitation. The fluorescence per field before and after induction was calculated
using a MetaXpress Software Custom Module.
To assess the effects of PLCγ1 inhibition on calcium flux, cells were treated with either 2 mM U73122 for 24 hours, or transiently
transfected with 25 nM ON-TARGETplus SMARTpool human siRNA targeting PLCγ1 for 48 hours. For the final 18 hours, full MCF10A
growth medium was removed and replaced with FBS-free medium. Immediately prior to the assay, cells were treated with equal
volume culture media and 2x Fluo-4 Direct™ calcium assay reagent, and baseline fluorescence was monitored using an ImageXpress
Micro Confocal High-Content Imaging System (Molecular Devices). After 300 ms, cells were induced by adding 1 mL full MCF10A
growth medium, and fluorescence of the Fluo-4 dye was monitored using a 20x objective with 488 nm excitation. Calcium flux
was calculated by first normalizing fluorescence readings to baseline measurements using the formula: dF = (F(t) – F(0))/F(0), where
F = fluorescence at 488 nm, t = time, and F(0) represents the average baseline readings from 1-299 ms. Fluorescence values were
subsequently normalized to a unit interval between 0 and 1 and presented as the time required post stimulation to reach a maximal
calcium intensity.
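The calcium flux calculation can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the MetaXpress module or the authors' code; the function names, the example trace and the choice of the first maximum as the peak are assumptions.

```python
def delta_f_over_f0(trace, n_baseline):
    """dF = (F(t) - F(0)) / F(0), with F(0) the mean of the baseline readings."""
    f0 = sum(trace[:n_baseline]) / n_baseline
    return [(f - f0) / f0 for f in trace]


def min_max_scale(values):
    """Rescale values to the unit interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("trace is flat; cannot rescale")
    return [(v - lo) / (hi - lo) for v in values]


def time_to_peak(times, scaled, stim_time):
    """Time after stimulation at which the scaled signal first reaches its maximum."""
    peak_index = max(range(len(scaled)), key=scaled.__getitem__)
    return times[peak_index] - stim_time


# Hypothetical trace: three baseline frames, stimulation at 300 ms.
times_ms = [0, 100, 200, 300, 400, 500, 600]
trace = [100, 100, 100, 150, 250, 300, 280]
d_f = delta_f_over_f0(trace, n_baseline=3)
scaled = min_max_scale(d_f)
latency_ms = time_to_peak(times_ms, scaled, stim_time=300)
```

The parentheses around F(t) – F(0) matter: without them only F(0) would be divided by F(0), and the expression would reduce to F(t) – 1.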
reverse: 5′-CTTCATGAGGTAGTCAGTCAGG. Reactions were performed with SYBR Select Master Mix (ThermoFisher, 4472908)
using the TProfessional Thermocycler from Biometra and analyzed with the qPCRsoft version 3.1 (Thistle Scientific/Analytik Jena).
The following cycle reactions were used: pre-denaturation for 3 min at 95 °C, followed by 45 cycles of 5 s at 95 °C (denaturation), 5 s at
55.8 °C (annealing) and 15 s at 72 °C (elongation).
To detect PIK3CA mutations in primary breast tumor samples, DNA was extracted from 10 mg of tissue using the QIAamp DNA
mini kit (QIAGEN, 51304). Mutations in exon 9 (helical domain) and exon 20 (kinase domain) of PIK3CA were assessed using the PNA-
Clamp PIK3CA Mutation Detection Kit (Panagene, PNAC-4001), according to the manufacturer’s instructions. Briefly, reactions were
performed using 10 ng DNA with a SYBR Green PCR reaction premix and primer premixes detecting E542, E545, Q546 and H1047
mutations using the TProfessional Thermocycler. The following cycle reactions were used: pre-denaturation for 5 min at 94 °C,
followed by 40 cycles of 30 s at 94 °C (denaturation), 20 s at 70 °C (peptide nucleic acid clamping), 30 s at 63 °C (annealing) and 30 s
at 72 °C (extension). PIK3CA mutations were assessed based on manufacturer’s instructions.
Chemokine assays
Tumor samples were placed in 2 mL lysing tubes prefilled with 1.4 mm ceramic (zirconium oxide) beads and 1 mL of chilled PBS.
Samples were homogenized with a Precellys24 homogenizer programmed with three 30 s cycles at 6,500 Hz and 4-min pause times.
At the end of the homogenization cycle, samples were centrifuged at 5,000 rpm for 10 min at 4 °C, and supernatants were transferred
to fresh Eppendorf tubes.
To measure CCL5 (RANTES) chemokine levels, the CCL5 Mouse ELISA kit (Abcam, ab100739) was used according to the
manufacturer’s instructions. Briefly, 100 µL of standards prepared in Assay Diluent A and sample supernatants were added to appropriate
wells and incubated at room temperature for 2.5 hours. Wells were subsequently washed 4 times with 1X Wash Solution, and 100 µL
of 1X Biotinylated CCL5 (RANTES) Detection Antibody was added and incubated for 1 hour at room temperature with gentle shaking.
Following the incubation with the detection antibody, wells were washed 4 times and incubated with 100 µL HRP-Streptavidin
solution for 45 min at room temperature with gentle shaking. Following a final set of 4 washes, 100 µL of TMB One-Step Substrate
Reagent was added to each well and incubated for 30 min at room temperature in the dark with gentle shaking. 50 µL of stop solution
was subsequently added, and measurements were taken immediately at 450 nm.
To measure CX3CL1 chemokine levels, the CX3CL1 Mouse ELISA kit (Abcam, ab100683) was used according to the
manufacturer’s instructions. Briefly, 100 µL of standards prepared in Assay Diluent C and sample supernatants were added to appropriate
wells and incubated at room temperature for 2.5 hours. Wells were subsequently washed 4 times with 1X Wash Solution, and 100 µL
of 1X Biotinylated CX3CL1 Detection Antibody was added and incubated for 1 hour at room temperature with gentle shaking.
Following the incubation with the detection antibody, wells were washed 4 times and incubated with 100 µL HRP-Streptavidin
solution for 45 min at room temperature with gentle shaking. Following a final set of 4 washes, 100 µL of TMB One-Step Substrate
Reagent was added to each well and incubated for 30 min at room temperature in the dark with gentle shaking. 50 µL of stop solution
was subsequently added, and measurements were taken immediately at 450 nm. All measurements were normalized to total protein
content as determined using the BCA protein assay (Thermo Fisher Scientific, 23225).
Bioinformatic analysis
Gene-centric RMA-normalized expression data and mutation status for available cell lines were obtained from the Cancer Cell Line
Encyclopedia (CCLE). 150 frequently mutated genes were identified based on a mutation frequency of ≥ 20% in n = 35
available cell lines, and significant enrichment in observed clusters was assessed using Fisher’s exact test. A functional annotation
analysis using the Database for Annotation, Visualization and Integrated Discovery (DAVID, v6.8) (Huang et al., 2009) was used to
identify biologically relevant pathways represented by a set of 512 genes, which were significantly upregulated in the PIK3CA mu-
tation- and lipid-enriched cluster. Interactions between genes involved in the identified pathways were visualized by functional
gene networks using the ‘ReactomePA’ package in R (Yu and He, 2016).
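The mutation-frequency filter and the enrichment test can be illustrated with a small, dependency-free Python sketch. It is not the authors' R code: the function names and counts are hypothetical, and the two-sided p value is computed by summing all tables no more probable than the observed one, which is one common convention for Fisher's exact test.

```python
from math import comb


def is_frequently_mutated(n_mutated, n_lines, min_fraction=0.20):
    """True when a gene is mutated in at least min_fraction of the cell lines."""
    return n_mutated / n_lines >= min_fraction


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: a gene mutated in 8 of 35 cell lines,
# 6 of 10 lines inside a cluster versus 2 of 25 lines outside it.
frequent = is_frequently_mutated(8, 35)
p_value = fisher_exact_two_sided(6, 4, 2, 23)
```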
Student’s t test, Fisher’s exact test, and one- or two-way ANOVA were used to evaluate statistical significance as indicated in the
respective figure legends. The Shapiro-Wilk test was used to assess normality. The ‘N’ for each experiment can be found in the figure
legends and represents independently generated samples for in vitro experiments, wells for cell-based assays, or mice for in vivo
experiments. Bar graphs present the mean ± standard error of the mean (SEM). Significance was defined as p < 0.05, and denoted
by asterisks throughout the figures as follows: n.s. (not significant), *(p < 0.05), **(p < 0.01), ***(p < 0.001). No statistical methods were
used to determine sample size, and no completed data were excluded from the analysis performed in this manuscript. Statistical
tests were performed using GraphPad Prism (version 8.0.1) and R statistical software (version 3.5.1).
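The summary statistics and asterisk notation described above can be sketched as follows. This is an illustrative Python example rather than the GraphPad Prism or R workflow used in the study; function names and values are hypothetical.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))  # sample SD / sqrt(n)
    return mean, sem


def significance_label(p):
    """Map a p value to the asterisk notation used in the figures."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "n.s."


# Hypothetical replicate values for one bar of a bar graph.
bar_mean, bar_sem = mean_sem([2.0, 4.0, 6.0])
labels = [significance_label(p) for p in (0.2, 0.03, 0.004, 0.0002)]
```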
Supplemental Figures
48 hours post transfection with siPLCγ1, or (E) treatment with 2 mM U73122 for 24 hours. For the final 18 hours of the treatments, cells were serum and growth
factor deprived, and stimulated with full media immediately prior to the assay. cPLA2 activity in MCF10A PIK3CA WT and MUT following (F) siRNA-mediated
knockdown of PLCγ1 for 48 hours, or (G) treatment with 2 mM U73122 for 24 hours. (H) AA levels measured by REIMS in MCF10A PIK3CA isogenics following
treatment with 2 mM U73122 for 24 hours. (I) Representative confocal images and (J) quantification of in situ proximity ligation assay (PLA) between cPLA2 and
phospho-Thr560 PKCζ in MCF10A PIK3CA WT and MUT cells. (K) Immunoblot analysis of phospho-cPLA2 (T376) custom antibody in the MCF10A isogenic panel
following serum and growth factor deprivation for 18 hours and subsequent stimulation for 30 min (left), treatment with 1 mM PKCζ peptide inhibitor for 72 hours
(middle), and in MCF10A H1047R cPLA2 CRISPR knockout cells overexpressing a phosphoresistant mutant (T376A) cPLA2 (right). (L) Activity of cPLA2 in
MCF10A PIK3CA WT or H1047R cPLA2 CRISPR knockout cells transfected with 9 µg of either WT-cPLA2, or S505A/T376A phosphoresistant mutant cPLA2
constructs. Activity was measured 48 hours post-transfection. Cell proliferation of MCF10A (M) PIK3CA WT and (N) E545K MUT cells expressing control shGFP,
cPLA2-sh1 or sh5 under exogenous FAF conditions. Sulforhodamine B (SRB) protein staining was used to measure cell proliferation over 5 days. Data in (B), (D),
(E), (F), (G), (H), (L), (M) and (N) are presented as the mean ± SEM of n = 3-6 biological replicates and are representative of at least two independent experiments.
n.s., not significant; *p < 0.05; **p < 0.01; ***p < 0.001; P values in (B), (D), (E), (M) and (N) were calculated using two-way ANOVA. One-way ANOVA followed by
Student’s t test with Bonferroni correction was used for (F), (G), (H), (J) and (L).
Correspondence
jimmie.ye@ucsf.edu (C.J.Y.),
lawrence.fong@ucsf.edu (L.F.)
In Brief
Single-cell RNA and paired T cell receptor
sequencing highlights enrichment of
cytotoxic CD4+ T cells rather than CD8+
T cells in human bladder cancer. These
CD4+ T cells are capable of killing
autologous tumor cells and are subjected
to inhibition by Tregs.
Highlights
• Human bladder tumors contain multiple clonally expanded
cytotoxic CD4+ T cell states
Article
Intratumoral CD4+ T Cells Mediate Anti-tumor
Cytotoxicity in Human Bladder Cancer
David Y. Oh,1,11 Serena S. Kwek,1,11 Siddharth S. Raju,3,6,8 Tony Li,1 Elizabeth McCarthy,3 Eric Chow,4 Dvir Aran,5
Arielle Ilano,1 Chien-Chun Steven Pai,1,12 Chiara Rancan,1 Kathryn Allaire,1 Arun Burra,1 Yang Sun,3
Matthew H. Spitzer,2,6,8 Serghei Mangul,9 Sima Porten,7 Maxwell V. Meng,7 Terence W. Friedlander,1
Chun Jimmie Ye,2,3,5,10,* and Lawrence Fong1,2,13,*
1Division of Hematology/Oncology, Department of Medicine, University of California, San Francisco, San Francisco, CA 94143, USA
2Parker Institute for Cancer Immunotherapy, University of California, San Francisco, San Francisco, CA 94143, USA
3Division of Rheumatology, Department of Medicine; Department of Epidemiology and Biostatistics; and Institute for Human Genetics,
94143, USA
5Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA
6Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA 94143, USA
7Department of Urology, University of California, San Francisco, San Francisco, CA 94143, USA
8Department of Otolaryngology – Head and Neck Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
9Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA
10Chan Zuckerberg Biohub, San Francisco, CA, USA
11These authors contributed equally
12Present address: Department of Oncology Biomarker Development, Genentech, 1 DNA Way, South San Francisco, CA 94080, USA
13Lead Contact
SUMMARY
Responses to anti-PD-1 immunotherapy occur but are infrequent in bladder cancer. The specific T cells that
mediate tumor rejection are unknown. T cells from human bladder tumors and non-malignant tissue were as-
sessed with single-cell RNA and paired T cell receptor (TCR) sequencing of 30,604 T cells from 7 patients. We
find that the states and repertoires of CD8+ T cells are not distinct in tumors compared with non-malignant
tissues. In contrast, single-cell analysis of CD4+ T cells demonstrates several tumor-specific states, including
multiple distinct states of regulatory T cells. Surprisingly, we also find multiple cytotoxic CD4+ T cell states
that are clonally expanded. These CD4+ T cells can kill autologous tumors in an MHC class II-dependent
fashion and are suppressed by regulatory T cells. Further, a gene signature of cytotoxic CD4+ T cells in tu-
mors predicts a clinical response in 244 metastatic bladder cancer patients treated with anti-PD-L1.
1612 Cell 181, 1612–1625, June 25, 2020 © 2020 The Authors. Published by Elsevier Inc.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
and tumor mutational burden and, conversely, a lower score of transforming growth factor β (TGF-β) gene signature, particularly in immune-excluded tumors, were associated with response to the anti-PD-L1 agent atezolizumab (Mariathasan et al., 2018). However, the importance of heterogeneous subsets of TILs in TCC beyond canonical CD8+ cytotoxic and exhausted phenotypes in response to PD-1 blockade remains unexplored. In particular, the role of CD4+ T cells in controlling or enhancing TCC tumor growth remains largely unknown. Although regulatory CD4+ T cells (Tregs) in the TCC environment have been associated with adverse outcomes (Baras et al., 2016), and a CD4+ subset expressing the inducible costimulator (ICOS) that produces interferon-gamma (IFNγ) in response to anti-CTLA-4 therapy has been described in human bladder tumors (Liakou et al., 2008), the presence of other CD4+ T cell subsets that directly promote cell-mediated immunity through other effector mechanisms remains unclear. Detailed characterization of the T lymphocytes in the tumor is critically needed for precisely mapping the cells responsible for tumor recognition and control and defining predictive markers of response to CPI in bladder cancer.
To address these points, we interrogated the tumor microenvironment of patients with localized muscle-invasive bladder TCC who received or did not receive neoadjuvant anti-PD-L1 immunotherapy (atezolizumab, Genentech) prior to surgical resection. Droplet single-cell RNA-seq (dscRNA-seq) and paired T cell receptor sequencing (TCR-seq) of more than 30,000 CD4+ and CD8+ T cells from paired tumors and adjacent non-malignant tissues revealed heterogeneity in known CD4+ states, such as regulatory T cells, which were also enriched and clonally expanded in tumors. In addition, several states of cytotoxic CD4+ T cells expressing cytolytic effector proteins were identified, some of which are enriched in tumors. Cytotoxic CD4+ T cells were clonally expanded in tumors and could kill autologous tumors ex vivo. Cytotoxic CD4+ T cells existed in discrete proliferating and non-proliferating states in tumors. A gene signature of cytotoxic CD4+ T cells was predictive of a response to PD-1 blockade in an orthogonal RNA-seq dataset of metastatic bladder cancer patients treated with anti-PD-L1. Overall, these findings highlight the importance of CD4+ T cell heterogeneity and the relative balance between activation of cytotoxic CD4+ effectors and inhibitory regulatory cells for killing autologous tumors.

RESULTS

Canonical CD8+ T Cell States Were Not Enriched in the Bladder Tumor Microenvironment
To assess the T cell composition of the tumor environment, we profiled T cells from dissociated bladder tumors and adjacent uninvolved bladder tissues using single-cell RNA and TCR sequencing (see schematic in Figure S1A). We used the 10X Genomics Chromium platform (Zheng et al., 2017b) to sequence 8,833 tumor-derived and 1,929 non-malignant tissue-derived CD8+ T cells from 7 patients (Table S1). All samples were muscle-invasive bladder cancer (MIBC) from 2 standard-of-care-untreated patients ("untreated"), 1 chemotherapy-treated patient (gemcitabine + carboplatin, "chemo"), and 4 anti-PD-L1-treated patients ("anti-PD-L1") with detailed clinical annotations (Table S1). To assess the shared heterogeneity of T cells across samples, we restricted the analysis to highly variable genes and used an empirical Bayes approach (ComBat; Johnson et al., 2007; Büttner et al., 2019) to account for preparation batch among individual samples. We subsequently used Leiden clustering (Traag et al., 2019) to define clusters that were visualized using uniform manifold approximation and projection (UMAP) (McInnes and Healy, 2018). Tumor- and non-malignant-derived CD8+ T cells formed 11 clusters, each populated by cells from all samples, suggestive of shared states in TCC regardless of the treatment regimen (Figure 1A; Figure S2A). Differential expression analyses comparing each cluster with all other cells combined identified 1,067 differentially expressed genes in at least one cluster (adjusted P value (Padj) < 0.05, |log2(fold change, FC)| > 0.5) (Table S2). The identified states include known CD8+ subtypes (Figures 1B and 1C): cells expressing HAVCR2 (TIM-3), LAG3, ENTPD1, as well as the chemokine CXCL13 (CD8ENTPD1: log2(FC) = 1.4–3.7), described previously as tumor-reactive CD8+ T cells (Duhen et al., 2018); effector cells expressing FGFBP2 and GNLY, a granule-associated pore-forming protein known to function in pathogen killing (Krensky and Clayberger, 2009) (CD8FGFBP2: log2(FC) = 3.6–5.3); naive cells expressing CCR7 and GZMK (CD8NAIVE: log2(FC) = 0.9–2.8); central memory cells expressing CCR7 and SELL (L-selectin) (CD8CM: log2(FC) = 1.5–1.7); and mucosal-associated invariant T (MAIT) cells expressing KLRB1 (CD8MAIT: log2(FC) = 2.7) that preferentially use the known semi-invariant TCR α chains TRAV1-2 and/or TRAJ33 (Kurioka et al., 2016; Figure 1D). Of note, we also found MKI67+ proliferating cells (CD8PROLIF: log2(FC) = 6.5) as well as cells expressing the chemokines XCL1/2 (CD8XCL: log2(FC) = 5.2–5.6). Similar states were also identified in the tumor environment of hepatocellular carcinoma based on scRNA-seq (Zheng et al., 2017a). Surprisingly, although the frequency of CD8ENTPD1 cells was higher in tumors, none of the CD8+ states displayed statistically significant differences in frequency between the tumor and non-malignant bladder (exact permutation test; Figure 1E; density plots in Figure 1F).

Tregs Included Heterogeneous States that Are Enriched in Bladder Tumors
Given the lack of tumor enrichment of CD8+ states and the higher frequency of CD4+ over CD8+ T cells in bladder tumors (Figure S1B), we investigated CD4+ T cell heterogeneity in a similar fashion to determine their contribution to anti-tumor responses. We sequenced and analyzed 16,995 tumor- and 2,847 non-malignant tissue-infiltrating CD4+ T cells isolated from the same patients. Tumor-derived and non-malignant tissue-derived CD4+ T cells formed 11 clusters, each with representation from all individual patients (Figure 2A; Figure S2B). We identified 1,511 differentially expressed genes in at least one cluster (Padj < 0.05, |log2(FC)| > 0.5; Table S2; Figures 2B and 2C) defining several canonical CD4+ T cell states. These include CCR7+ cells, which demonstrated a central memory phenotype (CD4CM) based on parallel flow cytometry data showing that these were CD45RA− (see below) as well as cells expressing high levels of CXCL13 and IFNG (CD4CXCL13: log2(FC) = 5.9 and 1.4), which were also likely to be exhausted based on overexpression of TOX (log2(FC) = 1.9) (Yao et al., 2019) and whose presence has
been associated with improved outcomes in breast, gastric, and microsatellite-unstable colorectal carcinoma, which is an immune-responsive tumor (Schmidt et al., 2018; Gu-Trantien et al., 2013, 2017; Wei et al., 2018; Zhang et al., 2018). Other states included Th17 cells expressing IL17A (CD4TH17: log2(FC) = 4.7), which represented important anti-tumor effectors (Kryczek et al., 2011); activated cells expressing CD69 (CD4ACTIVATED: log2(FC) = 2.2) but not FOXP3 (log2(FC) < 0.5) (Figures 2B and 2C); as well as several important additional states described in further detail below. Notably, some of these states were selectively enriched in specific compartments. CD4CXCL13 demonstrated significant enrichment in tumor compared with non-malignant tissue (tumor versus non-malignant: 6.5% versus 3.0%, p = 0.015, exact permutation test), whereas states enriched in non-malignant tissue included CD4CM (tumor versus non-malignant: 30% versus 42%, p = 0.008) and CD4ACTIVATED (tumor versus non-malignant: 7.5% versus 10%, p = 0.02) (density plots in Figure 2D; tumor and non-malignant frequencies in Figure 2E).
Tregs were abundant constituents of the bladder tumor microenvironment with demonstrated heterogeneity. We found two states of Tregs (CD4IL2RAHI and CD4IL2RALO), together constituting 26% ± 1.9% (mean ± SEM) of tumor-infiltrating CD4+ cells, which co-expressed FOXP3 (CD4IL2RAHI: log2(FC) = 2.7; CD4IL2RALO: log2(FC) = 1.2) and known immune checkpoints, including IL2RA, TIGIT, TNFRSF4/9/18, and CD27 (Philip et al., 2017; Zheng et al., 2017a; Plitas et al., 2016; De Simone et al., 2016) (CD4IL2RAHI and CD4IL2RALO: log2(FC) > 0.65; Figures 2B, 2C, and 3A). With the exception of TIGIT, these immune checkpoints are minimally expressed by other CD4+ states, such as CD4CM (Figure 3A). The two Treg states were distinguished by higher expression of IL2RA, TNFRSF4, TNFRSF9, and TNFRSF18 in CD4IL2RAHI cells (CD4IL2RAHI: log2(FC) = 2.5–3.6; CD4IL2RALO: log2(FC) = 0.4–1.6; Figure 3A; Table S2). Of note, both Treg states were significantly enriched in tumor compared with adjacent non-malignant tissue (CD4IL2RAHI: 14.3% versus 4.6%, p = 0.002; CD4IL2RALO: 11.1% versus 6.7%, p = 0.002; exact permutation test; Figure 2E). We confirmed, by flow cytometry from 7 additional bladder tumors, that multiple tumors contained distinct regulatory states that expressed graded protein levels of IL2RA and co-expressed significantly different levels of immune checkpoints, such as TNFRSF18 (p < 0.05 for TNFRSF18 expression in FOXP3+ CD25low versus CD25hi
Figure 2. CD4+ T Cells in Bladder Tumors Are Composed of Multiple Distinct Functional States
(A) UMAP plots of 19,842 single sorted CD3+ CD4+ T cells obtained from bladder tumors and adjacent non-malignant tissue (N = 7 patients). Each distinct
phenotypic cluster identified using Leiden clustering is identified with a distinct color. Annotation of each unbiased cluster was performed by manual inspection of
the highest-ranked differentially expressed genes for each cluster and using reference signature-based correlation methods (SingleR) as described in the text.
(B) Relative intensity of expression of select genes superimposed on the UMAP projections shown in (A).
(C) Violin plot showing relative expression of select differentially expressed genes (columns) for each cluster shown in (A) (rows) (all Padj < 0.05).
(D) Density plots showing distribution of cells in tumor or non-malignant samples.
(E) The frequency of cells in individual CD4+ T cell states defined by scRNA-seq clustering is shown as a proportion of total CD4+ cells within either tumor or non-
malignant compartments across all patients (orange, tumor; blue, non-malignant). A box and whisker plot is shown with formatting as in Figure 1E. *p < 0.05,
**p < 0.01 by exact permutation test.
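The exact permutation test named in (E) can be illustrated with a minimal Python sketch. This is not the authors' code: it assumes an unpaired two-sample design with the absolute difference in mean frequency as the test statistic, and the function name and the per-sample frequencies are hypothetical. With 7 tumor and 6 non-malignant samples there are only 1,716 possible relabelings, so every one can be enumerated rather than sampled.

```python
from itertools import combinations


def exact_permutation_test(group_a, group_b):
    """Two-sided exact permutation p value for a difference in group means.

    Every way of relabeling the pooled values into groups of the original
    sizes is enumerated; the p value is the share of relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    n_extreme = 0
    n_total = 0
    for idx in combinations(range(len(pooled)), n_a):
        stat = abs_mean_diff(sum(pooled[i] for i in idx))
        # Small tolerance so ties with the observed value count as extreme.
        if stat >= observed - 1e-12:
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total


# Hypothetical per-sample frequencies (%) of one CD4+ state; not data from the paper.
tumor = [6.1, 7.0, 5.8, 6.9, 6.4, 7.2, 6.0]
non_malignant = [3.2, 2.7, 3.5, 2.9, 3.1, 2.6]
print(exact_permutation_test(tumor, non_malignant))
```

Because the p value is a ratio of counts over all relabelings, its smallest attainable value is bounded by the number of samples, which is worth keeping in mind when comparing p values across states.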
populations by Wilcoxon signed-rank test; Figure 3B; gating strategy in Figures S1C–S1D). This heterogeneity may be consequential because Tregs expressing higher levels of immune checkpoints have been shown to be correlated with poorer outcomes in non-small cell lung cancer (Guo et al., 2018). Both regulatory states also demonstrated a common tumor-specific gene expression program that included several heat shock proteins compared with non-malignant tissue (Figure S2C; Table S2).

Tregs Are Clonally Expanded in Bladder Tumors
To query the TCR sequence in the same single cells for which we obtained whole-transcriptome data, we PCR-amplified and sequenced to saturation the complementarity-determining region 3 (CDR3) of the TCR alpha (TRA) and beta (TRB) loci from the barcoded full-length cDNA library (primers in Table S3). After filtering for matching whitelisted cell barcodes (Cell Ranger), this approach yielded 11,081 CD4+ T cells and 5,779 CD8+ T cells with paired TRA and TRB CDR3 sequences (49% and 47% recovery, respectively; summary in Table S3). These results are consistent with expected frequencies based on the average recovery of individual TRA (CD4+, 54%; CD8+, 50%) and TRB (CD4+, 68%; CD8+, 67%) sequences across whitelisted cells. Overall, the TCR repertoire was more restricted in the tumor microenvironment than in adjacent non-malignant tissue based on two analyses. First, in intratumoral CD4+ T cells, 10.8% ± 1.6% of unique clonotypes are shared by 2 or more cells; this degree of sharing was significantly greater than in the non-malignant compartment (5.1% ± 1.6%, unpaired t test, p = 0.033) and is not seen in blood from healthy donors (0.12%–0.16%) or from publicly available reference circulating CD4+ T cell data (0%) (Figure S3A). Second, we observed skewing of the intratumoral CD4+ T cell
Figure 3. Regulatory CD4+ T Cells Are Heterogeneous, Enriched, and Clonally Expanded in Bladder Tumors
(A) Heatmap showing the expression of select regulatory T cell marker genes (rows) for individual single cells (columns) within the CD4IL2RAHI and CD4IL2RALO
clusters compared with the CD4CM cluster. Cells were grouped based on their annotations by tissue (tumor or non-malignant), treatment, and patient. Log2-
transformed expression of each gene was row scaled.
(B) Flow cytometry staining of CD4+ FOXP3+ TILs from a bladder tumor, showing the gating strategy for CD25neg, CD25low, and CD25hi (top left), and histograms of
TNFRSF18 staining from each CD25 gate (top right). Mean fluorescence intensity of TNFRSF18 and percent TNFRSF18+ from the parental gate are shown for
CD25 gates across samples (N = 7 tumors, mean ± SEM). *p < 0.05 by Wilcoxon signed-rank test.
(C) Gini coefficients for regulatory populations (CD4IL2RAHI and CD4IL2RALO, red labels at far left) and other CD4+ T cell populations within tumor and non-malignant
compartments across all samples. For each cluster, a box and whisker plot is shown with the median, IQR (box), and 1.5 times the IQR (whiskers), with outliers
exceeding 1.5 times the IQR beyond lower and upper quartiles. *p < 0.05, **p < 0.01 by exact permutation test. N = 7 tumor samples and 6 non-malignant
samples.
(D) Left: single cells expressing the top 3 most expanded clonotypes found in the combined regulatory populations (CD4IL2RAHI and CD4IL2RALO) are shown in red
in the same UMAP space as in Figure 2A. The regions composed of regulatory, cytotoxic, and proliferating T cells are outlined and superimposed on the UMAP
projection. Right: density plots for total CD4+ T cell distribution within tumor and non-malignant compartments are reproduced from Figure 2D for ease of visual
comparison.
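To make the Gini coefficients in (C) concrete, the following Python sketch computes a Gini coefficient over clonotype sizes. It is an illustration only: the legend does not state which estimator the authors used, so the standard rank-based formula is assumed here, and the function names and clonotype labels are hypothetical. Under this definition, a compartment in which every clonotype is seen in the same number of cells scores 0, and the score rises as a few clonotypes account for more of the cells.

```python
from collections import Counter


def gini(counts):
    """Gini coefficient of non-negative values (0 = perfectly even).

    Uses the rank-based identity G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    with x sorted in ascending order and i running from 1 to n.
    """
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def clonotype_gini(cell_clonotypes):
    """Gini coefficient of clonotype sizes, given one clonotype label per cell."""
    sizes = Counter(cell_clonotypes).values()
    return gini(sizes)


# Hypothetical clonotype labels, one per cell (not data from the paper).
print(clonotype_gini(["c1", "c2", "c3", "c4"]))  # every clonotype seen once: 0.0
print(clonotype_gini(["c1", "c1", "c1", "c2"]))  # one expanded clonotype: 0.25
```

The coefficient depends on how many clonotypes are sampled, so comparisons between compartments with very different cell numbers should be read with that in mind.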
repertoire toward an increased cumulative frequency of clonotypes over fewer cells (Figure S3B) and a corresponding higher Gini coefficient (0.21 for tumor versus 0.05 for non-malignant tissue, Wilcoxon signed-rank test with Benjamini-Hochberg correction, p = 0.009; Figure S3C) compared with the non-malignant compartment and healthy controls.
When we assigned TCR sequences to cells with cluster identities (9,770 CD4+ and 5,151 CD8+ T cells with a paired TRA/TRB had an assigned phenotypic cluster, or 49% and 48% of all T cells with assigned clusters, respectively; merged TCR sequences and phenotypic clusters for CD4+ and CD8+ T cells in Table S4), we found that clonal expansion of Tregs
contributes to intratumoral CD4+ T cell repertoire restriction. Compared with paired non-malignant tissue, both regulatory states exhibited increased Gini coefficients in tumors (CD4IL2RAHI: Ginitumor 0.17 versus Gininormal 0, p = 0.003; CD4IL2RALO: Ginitumor 0.06 versus Gininormal 0.003, p = 0.009; exact permutation test; Figure 3C). The most expanded clonotypes within the Tregs were specific to regulatory cells but not other cell states (all single cells expressing the top expanded regulatory clonotypes are shown in Figure 3D). The CXCL13-expressing state CD4CXCL13 (discussed in greater detail below) was also restricted in tumors (Ginitumor 0.07 versus Gininormal 0, p = 0.02, exact permutation test; Figure 3C). Gini coefficients for CD4+ states did not differ significantly by anti-PD-L1 treatment (Figure S3G). In contrast, although repertoire restriction was also seen in CD8+ T cells from the same samples, this was observed in both tumor (percent unique clonotypes shared between cells: 15.1% ± 1.1%; Ginitumor: 0.36 ± 0.04) and non-malignant compartments (percent unique clonotypes shared between cells: 14.6% ± 0.2%; Gininormal: 0.39 ± 0.06; Figures S3D–S3F). Furthermore, no significant increase in Gini coefficient in tumor over non-malignant tissue was seen for any CD8+ state, including with anti-PD-L1 treatment (Figures S3H and S3I). Hence, an important contributor to increased repertoire restriction of tumor-infiltrating CD4+ T cells over non-malignant tissue, which was not seen in the CD8+ compartment, involved clonal expansion of several distinct regulatory T cell states that differed in their levels of immune checkpoint expression, which may be driven by tumor-associated antigens and the tumor-specific microenvironment.

Bladder Tumors Possessed Multiple Cytotoxic CD4+ T Cell States
In addition to regulatory states, we also found 2 distinct states of cytotoxic CD4+ T cells in all samples constituting 15% ± 0.9% of tumor-infiltrating CD4+ T cells. CD4GZMB and CD4GZMK cytotoxic cells expressed a core set of cytolytic effector molecules (log2(FC) > 0.5, Padj < 0.05): GZMA (granzyme A), GZMB (granzyme B), and NKG7 (a granule protein that translocates to the surface of natural killer (NK) cells following target cell recognition; Medley et al., 1996) (Figures 2B, 2C, and 4A; Table S2). Each cytotoxic CD4+ state was distinguished by the expression of specific effector molecules. CD4GZMB cells co-expressed high levels of GZMB, the pore-forming protein PRF1 (perforin), and the granule-associated proteins GNLY and NKG7 (CD4GZMB: log2(FC) = 5.7, 3.4, 5.1, and 4.4, respectively), whereas CD4GZMK cells co-expressed high levels of the distinct granzyme GZMK and lower levels of NKG7 (CD4GZMK: log2(FC) = 6.3 and 3.9) (Figure 4A; Table S2). These shared cytolytic molecules were not expressed by other CD4+ states, including regulatory and central memory T cells (Figure 4A). Cytotoxic CD4+ cells co-expressed additional molecules, which may further contribute to anti-tumor effector function. Notably, IFNG was expressed by both cytotoxic states, which may contribute to tumor cell death, including ferroptosis (Wang et al., 2019) (CD4GZMB and CD4GZMK: log2(FC) = 2.1). Of note, the minority of CD4GZMB cells that expressed IFNG appeared to also express TNF as well as specific immune checkpoints, such as PDCD1, LAG3, and HAVCR2 (TIM-3) (Figure 4A). A larger proportion of CD4GZMB cells expressed CXCR6 (CD4GZMB: log2(FC) = 1.3; Figure 4A). This chemokine receptor is expressed in regulatory and non-regulatory CD4+ TILs from colorectal carcinoma, nasopharyngeal carcinoma, and renal cell carcinoma and, together with its ligand CXCL16, can mediate TIL chemotaxis (Löfroos et al., 2017; Parsonage et al., 2012; Oldham et al., 2012). Finally, CD4GZMB and CD4GZMK cells did not express high levels of other checkpoints associated with regulatory T cells, such as IL2RA, TIGIT, or TNFRSF4/9/18 (log2(FC) < 0.5; Figure 4A), nor did they express the exhaustion marker TOX (Table S2). Similar states were found with unbiased clustering without batch correction for paired tumor- and
Figure 4. Multiple Cytotoxic CD4+ T Cell States Are Enriched and Clonally Expanded in Bladder Tumors and Possess Lytic Capacity against
Tumors
(A) Heatmap showing the expression of select cytotoxic or regulatory T cell marker genes (rows) for individual single cells (columns) within the cytotoxic CD4GZMB
and CD4GZMK clusters compared with regulatory (CD4IL2RAHI and CD4IL2RALO) and CD4CM clusters. Cells were grouped based on their annotations by tissue (tumor
or non-malignant), treatment, and patient. Log2-transformed expression of each gene was row scaled.
(B) Flow cytometry staining of GZMB, perforin, or GZMK in CCR7− CD4+ FOXP3− T cells.
(C) Percentage of cells expressing GZMB, GZMK, or perforin from CCR7− CD4+ FOXP3− T cells by flow cytometry (left) and the percentage of cells co-expressing
perforin within GZMB+ or GZMK+ CCR7− CD4+ FOXP3− T cells (right) (N = 7 tumors, mean + SEM).
(D) Representative flow cytometry staining of IFNγ and TNF-α expression in GZMB+ or GZMK+ CCR7− CD4+ FOXP3− T cells stimulated with PMA and ionomycin.
(E) Percentages of cells expressing IFNγ, TNF-α, or both from GZMB+ or GZMK+ CCR7− CD4+ FOXP3− T cells with and without stimulation (N = 11 tumors,
mean + SEM).
(F) Multiplex immunofluorescent staining of DAPI (blue), CD4 (immunohistochemistry, red), GZMK (RNAscope probe, green), and GZMB (RNAscope probe, white)
and overlay without DAPI from a cystectomy tumor region from a patient with parallel scRNA-seq and TCR-seq data (anti-PD-L1 C, top row) and from a cor-
responding tumor field with negative control staining (bottom row). CD4+ cells that co-express GZMK (arrows) or GZMB (arrowhead) are indicated. Scale
bar, 10 µm.
(G) The ratio of abundances of all regulatory T cell populations (CD4IL2RAHI and CD4IL2RALO) to all cytotoxic CD4+ populations (CD4GZMB and CD4GZMK) across all
tumor and non-malignant samples (mean + SEM shown; *p < 0.05 by unpaired t test, assuming unequal variance).
(H) Gini coefficients for each of the cytotoxic CD4+ populations within tumor and non-malignant compartments across all samples (box and whisker plot is shown
with formatting as in Figure 3C; *p < 0.05, **p < 0.01, exact permutation test, N = 7 tumor samples and 6 non-malignant samples).
(I) Left panel: quantitation of Annexin V+ apoptotic cells over time from a time-lapse cytotoxicity experiment with tumor cells cultured alone or with bulk CD4+ TILs
(CD4total) or CD4+ TILs depleted of regulatory T cells (CD4eff) at a 30:1 effector:target ratio. Right panel: CD4eff TILs and tumor cells (30:1 effector:target ratio) were
co-cultured with a pan-anti-MHC class II antibody or isotype control. All traces were from the same culture and cytotoxicity assay from the same patient. All traces
show relative change in cell death from time point 0. Cytotoxicity with CD4eff is representative of independent experiments from 4 different patients. Mean ± SEM
from multiple technical replicates for each experiment is shown.
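The comparison in (G) combines a per-sample ratio with an unpaired t test that does not assume equal variances (Welch's test). A minimal Python sketch of both steps follows; it is not the authors' implementation. The dictionary keys, function names, and example values are hypothetical, and only the t statistic and the Welch-Satterthwaite degrees of freedom are computed, since turning them into a p value requires a t distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def treg_to_cytotoxic_ratio(freqs):
    """Ratio of summed regulatory to summed cytotoxic CD4+ state frequencies.

    `freqs` maps hypothetical state labels to that sample's frequency (%).
    """
    regulatory = freqs["CD4_IL2RA_HI"] + freqs["CD4_IL2RA_LO"]
    cytotoxic = freqs["CD4_GZMB"] + freqs["CD4_GZMK"]
    return regulatory / cytotoxic


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of the mean of a
    vb = variance(b) / len(b)  # squared standard error of the mean of b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-sample ratios; not data from the paper.
tumor_ratios = [1.9, 1.6, 2.3, 1.4, 1.8, 2.0, 1.7]
non_malignant_ratios = [1.2, 0.8, 1.3, 1.0, 0.9, 1.4]
t_stat, dof = welch_t(tumor_ratios, non_malignant_ratios)
print(t_stat, dof)
```

In practice a library routine such as SciPy's `scipy.stats.ttest_ind` with `equal_var=False` returns the same statistic together with its p value.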
non-malignant-derived CD4+ cells from individual patients (Figures S2D and S2E).
We validated the presence and functional heterogeneity of cytotoxic CD4+ T cells using several orthogonal and complementary methods. Using flow cytometry, we confirmed the presence of cytotoxic CD4+ T cells with an effector memory (CCR7− CD45RA−) or effector (CCR7− CD45RA+) phenotype that express GZMB, GZMK, and perforin protein in tumors from multiple independent replicate samples (N = 7 tumors; Figure 4B; gating strategy in Figures S1C and S1D). Across this sample set, 9% ± 2.9% (mean ± SEM) of CD4+ FOXP3− CCR7− cells expressed GZMB, whereas 16% ± 4.5% expressed GZMK and 5.3% ± 2.6% expressed perforin (Figure 4C, left panel), at lower frequencies than CCR7− CD8+ cytotoxic cells from the same patients (Figures S1E–S1G). Importantly, 25.9% ± 8.7% of GZMB+ CD4+ FOXP3− CCR7− and 8.6% ± 3.5% of GZMK+ CD4+ FOXP3− CCR7− cytotoxic T cells showed co-expression of perforin with granzymes, in agreement with the scRNA-seq data (Figure 4C, right panel); these frequencies of granzyme and perforin co-expression were lower than those of CCR7− CD8+ cytotoxic cells from the same patients (Figures S1F and S1G). Importantly, CD45− bladder tumor cells express multiple major histocompatibility complex (MHC) class II molecules (data not shown), which would allow antigen recognition by TCRs expressing CD4 as a co-receptor. Flow cytometry of a separate set of 11 muscle-invasive bladder tumors confirmed the functional capacity of cytotoxic CD4+ T cells to produce multiple cytokines. In agreement with the scRNA-seq data, 56% ± 4.8% (mean ± SEM) of CD4+ CCR7− cells were polyfunctional and could produce both IFNγ and tumor necrosis factor alpha (TNF-α), whereas a minority of these cells secreted only IFNγ or only TNF-α after stimulation and, therefore, may demonstrate signs of exhaustion (IFNγ+ TNF-α−: 2.0% ± 0.76%; IFNγ− TNF-α+: 19% ± 3.3%) (Figures 4D and 4E). The frequency of polyfunctional cytotoxic CD4+ T cells was similar to stimulated CD8+ CCR7− T cells from the same patients (IFNγ+ TNF-α+: 55% ± 6.3%), although CD8+ CCR7− T cells that were monofunctional demonstrated an increased trend toward preferential IFNγ production alone over TNF-α production compared with cytotoxic CD4+ T cells (IFNγ+ TNF-α−: 14% ± 4.7%; IFNγ− TNF-α+: 7.2% ± 2.1%) (Figure S1H).
As further validation of the cytotoxic CD4+ T cell phenotype in tissue, multiplex immunofluorescence tissue staining of bladder tumor tissue from a patient in the scRNA-seq dataset demonstrated CD4+ T cells that also expressed GZMB or GZMK (Figure 4F, top row; tissue staining from an additional patient in Figure S1I) at levels not seen with negative control staining (Figure 4F, bottom row).
Overall annotation of clusters from the scRNA-seq data was supported by an independent analysis that assigns each single cell to the best-known published immune subset profiled by bulk expression analysis after sorting (SingleR) (Aran et al., 2019). This corroborated the identification of Tregs (90% and 78% of CD4IL2RAHI and CD4IL2RALO cells are assigned to Treg annotations, respectively) and further demonstrated that both cytotoxic CD4+ states are most similar to CD8+ effector memory T cells (37% and 45% of CD4GZMB and CD4GZMK cells, respectively, are assigned to effector memory CD8+ cell annotations), reinforcing their cytotoxicity profile (Figure S2F). Finally, an internal comparison of the transcriptional profiles from CD4+ and CD8+ TIL clusters from our scRNA-seq data indicated that, although most CD4+ clusters are most similar to other CD4+ clusters, cytotoxic CD4+ T cells are an exception. CD4GZMB cytotoxic cells were most correlated with tumor-specific CD8ENTPD1 cells (Pearson correlation coefficient = 0.92), whereas CD4GZMK cytotoxic cells were most correlated with CD8CM and CD8NAIVE cells (Pearson correlation coefficient = 0.98 for both) (Figure S2G). The tumor-specific gene expression program of these cytotoxic CD4+ cells was marked by heat shock protein expression in both states as well as tumor overexpression of CXCL13 and numerous immune checkpoints (TNFRSF18/LAG3/TIGIT/HAVCR2) as well as ENTPD1 within CD4GZMB cells (Figure S2C; Table S2).

Cytotoxic CD4+ T Cells Were Enriched and Clonally Expanded in Bladder Tumors
Of the 2 cytotoxic CD4+ states, CD4GZMK cells were significantly enriched in abundance in tumors (CD4GZMK in tumor versus non-malignant tissues: 7.2% ± 0.5% versus 5.0% ± 0.5%, exact permutation test, p = 0.01; Figure 2E). Overall, the CD4+ compartment exhibited a bias toward regulatory over cytotoxic CD4+ T cells in tumors (regulatory CD4+/cytotoxic CD4+ ratio = 1.8 ± 0.2) compared with non-malignant tissues, where proportions of regulatory and cytotoxic CD4+ T cells were more balanced (regulatory CD4+/cytotoxic CD4+ ratio = 1.1 ± 0.2, t test, p = 0.04; Figure 4G). Cytotoxic CD4+ T cell states contributed to intratumoral CD4+ repertoire restriction. Both cytotoxic CD4+ T cell states have significantly increased Gini coefficients in tumor compared with non-malignant tissues, with CD4GZMB representing the more restricted cytotoxic state in tumors (CD4GZMB: Ginitumor 0.21 versus Gininormal 0.06; CD4GZMK: Ginitumor 0.12 versus Gininormal 0; exact permutation test, p = 0.04 for CD4GZMB and p = 0.002 for CD4GZMK; Figure 4H). Hence, unbiased dscRNA-seq revealed that heterogeneous cytotoxic CD4+ T cells, a subset of which are closely related to conventional cytotoxic CD8+ T cells based on their functional program, are unexpected but frequent constituents of the bladder tumor microenvironment, some of which are quantitatively enriched in tumors. The tumor-specific clonal expansion of both cytotoxic CD4+ states suggests that their restricted repertoire may result from recognition of MHC class II cognate antigens that may include bladder tumor antigens.

Cytotoxic CD4+ T Cells Possessed Lytic Capacity against Autologous Tumor Cells that Was Restricted by Autologous Tregs
To validate the functional relevance of cytotoxic CD4+ T cells in bladder tumors, we isolated CD4+ TILs by fluorescence-activated cell sorting (FACS) and then cultured the cells ex vivo with interleukin-2 (IL-2). We then co-cultured these cells with autologous tumor cells in an imaging-based time-lapse cytotoxicity assay, assessing for cell death with Annexin V. We found that expanded CD4+ TILs were cytotoxic and could trigger increased tumor apoptosis ("CD4total:tumor," Figure 4I, left panel). However, when we performed the same co-cultures but with CD4+ TILs
Figure 5. Proliferating CD4+ T Cells Contain Regulatory and Cytotoxic Cell States
(A) Heatmap showing expression of select cytotoxic, regulatory, and proliferating marker genes (rows) for individual single cells (columns) within the CD4PROLIF
cluster. Samples were hierarchically clustered. Log2-transformed expression of each gene was row scaled.
(B) Representative flow cytometry staining from a bladder tumor showing expression of CD25, GZMB, GZMK, and Ki67.
(C) Single cells expressing the top 3 most expanded clonotypes found in the CD4PROLIF T cell population are shown in red in the same UMAP space as in
Figure 2A. The regions composed of proliferating, regulatory, and cytotoxic T cells are outlined and superimposed on the UMAP projection for visualization.
(D) Left panel: pseudotime trajectories derived from all tumors (N = 7 samples) and non-malignant samples (N = 6 samples). Cells with expanded TCRs from the
proliferating (CD4PROLIF, green), regulatory (CD4IL2RAHI and CD4IL2RALO, shades of red), and cytotoxic (CD4GZMB and CD4GZMK, shades of purple) states were
used for this analysis. Specific branches corresponding to proliferating cytotoxic cells (top right), non-proliferating cytotoxic cells (bottom right), proliferating
regulatory cells (top left), and non-proliferating regulatory cells (bottom left) are labeled. Right panel: branches are color-coded according to the above prolif-
erating or non-proliferating identities. Also labeled are branch points that discriminate proliferating and non-proliferating cytotoxic CD4+ T cells (branch point 1)
and proliferating and non-proliferating regulatory T cells (branch point 2).
from the same patient that were depleted of Tregs, we found that confirmed the presence of discrete regulatory or cytotoxic pop-
killing was increased (‘‘CD4eff:tumor,’’ Figure 4I, left panel), indi- ulations of Ki67+ CD4+ T cells that co-expressed CD25, GZMB,
cating that autologous Tregs can inhibit the activity of cytotoxic or GZMK (Figure 5B). Across multiple independent samples,
CD4+ T cells. Significant tumor death was seen in co-cultures 4.7% ± 1.0% (mean ± SEM) of CD4+ FOXP3+ cells co-ex-
with CD4eff TILs compared with tumors alone (Figure 4I, left pressed Ki67 and CD25, whereas 1.2% ± 0.5% of CD4+
panel; representative of 3 independent experiments from FOXP3− CCR7− cells co-expressed Ki67 and GZMB, and
different patients). Furthermore, the cytotoxic activity of CD4eff 1.0% ± 0.1% of CD4+ FOXP3− CCR7− cells co-expressed
was at least partially dependent on MHC class II recognition Ki67 and GZMK (N = 7 tumors; Figure S1J). Proliferating
because tumor apoptosis was inhibited with pre-incubation Ki67+ GZMB+ cells are also seen, using flow cytometry, within
with a pan-anti-MHC class II antibody that was not seen with the CD8+ compartment of TCC patients (Figure S1K). Examina-
an isotype control antibody (Figure 4I, right panel; representative tion of exact TCR clonotype sharing of the most expanded
of 2 independent patients). Independent experiments with an CD4PROLIF clones identified sharing with regulatory and cyto-
alternative death indicator (Cytotox Red) confirmed increased toxic CD4+ T cells, further underscoring the contribution of
autologous tumor killing with tumor/CD4eff co-cultures (Fig- each state to CD4PROLIF cells (Figure 5C).
ure S4A), MHC class II dependence of CD4eff killing (Figure S4B), Given that regulatory and cytotoxic CD4+ T cells were heterogeneous
as well as similar MHC class I-dependent autologous tumor and composed of cells that were proliferating to a different
killing with expanded CD8+ T cells (Figures S4C and S4D). extent, existing clusters may fail to resolve the separate contri-
Hence, flow cytometry and functional analyses from multiple in- bution of specific expression programs from subsets with
dependent patients confirmed not only that cytotoxic CD4+ different proliferative capacity. Hence, we used pseudotime
T cells expressed cytolytic proteins, such as granzymes and per- analysis to separate regulatory and cytotoxic cells into prolifer-
forin, in tumor tissue but that these cells can recognize bladder ating and non-proliferating components (Qiu et al., 2017). This
tumor antigens in an MHC class II-dependent fashion and analysis divided CD4PROLIF cells into two groups, each lying
were functionally competent to lyse autologous tumor cells in a along a branch specific for proliferating regulatory or cytotoxic
manner that can be suppressed by autologous Tregs. CD4+ T cells, with separate branches for non-proliferating regu-
latory and cytotoxic cells (Figure 5D). This underscored that reg-
Proliferating CD4+ T Cells Contained Regulatory and ulatory and cytotoxic CD4+ T cells consist of distinct proliferating
Cytotoxic Cells and non-proliferating states in TCC, based on transcriptomic
Induction of proliferating T cells can be beneficial for anti-tumor and clonotypic analyses.
immune responses. Proliferating CD4+ T cells are rapidly
induced in the periphery within weeks of initiating checkpoint A Signature of Cytotoxic CD4+ T Cells Predicts Clinical
blockade in prostate cancer patients (Kavanagh et al., 2008) Response to Anti-PD-L1
and in separate cohorts of thymic epithelial tumors and non- To assess the importance of the specific proliferating and non-
small cell lung cancer treated with anti-PD-1; a higher fold proliferating cytotoxic CD4+ T cell states for patient outcomes,
change in Ki67+ cells among PD-1+ CD8+ T cells in the periph- we performed branched expression analysis modeling (BEAM)
ery after a week was predictive of durable clinical benefit, pro- to identify all genes that were differentially expressed between
gression-free survival, and (in the non-small cell lung cancer branches at branchpoint 1 of the pseudotime trajectory. This
cohorts) overall survival (Kim et al., 2019). Within our tumor- branchpoint divided proliferating cytotoxic CD4+ T cells, non-
infiltrating CD4+ T cell compartment in TCC, we also identified proliferating cytotoxic CD4+ T cells, and all other regulatory cells
proliferating cells (CD4PROLIF) expressing MKI67, microtubule- (Figure 5D, right panel). Hierarchical clustering identified genes
associated markers (e.g., STMN1/TUBB), and DNA-binding upregulated preferentially in the proliferating cytotoxic branch
proteins associated with cell cycle progression, such as (cluster 7) or the non-proliferating cytotoxic branch (cluster 4)
PCNA, HMGB1, and HMGB2, which were expressed at lower but not in regulatory branches within this analysis (all genes
levels in regulatory and cytotoxic CD4+ T cells (CD4PROLIF: with q < 0.05; heatmap of clusters and branches in Figure 5E;
log2(FC) > 2.1; Figure 2C; Table S2). A similar signature was branch-specific signatures in Table S5). We developed a gene
also seen in the CD8+ compartment (CD8PROLIF; Figure 1C; Ta- signature from this analysis consisting of genes that were upre-
ble S2). Higher-resolution clustering revealed that this prolifer- gulated specifically in proliferating or non-proliferating cytotoxic
ating state is comprised of discrete groups of cells co-express- CD4+ T cells (from cluster 7: ABCB1; from cluster 4: APBA2,
ing regulatory or cytotoxic genes but not both simultaneously SLAMF7, GPR18, and PEG10; Figure 5E) but were not upregu-
(Figure 5A). Flow cytometry analysis of separate TCC samples lated in any of the CD8+ T cell states from our scRNA-seq
(E) Heatmap showing all differentially expressed genes (columns) between branches for branch point 1 across cells in the pseudotime analysis (rows). Cells are
grouped by their proliferating or non-proliferating branch assignments, color-coded at the right of the heatmap and corresponding to colors in (D). Genes are
grouped by color-coded clusters (1–8) shown at the top of the plot, which result from hierarchical clustering based on co-regulation in specific branches.
(F) Cytotoxic CD4+ T cell gene signature scores were plotted in clinical responders (complete response or partial response) versus non-responders (stable
disease or progressive disease) from baseline metastatic biopsies from bladder cancer patients with inflamed tumors on the IMvigor210 clinical trial (N = 62
tumors). The signature score was obtained from the IMvigor210 bulk RNA-seq dataset for the cytotoxic CD4+ T cell-specific genes derived from non-proliferating
(cluster 4) and proliferating (cluster 7) cytotoxic CD4+ clusters from the pseudotime analysis shown below the heatmap in (E). Median ± SEM is shown; *p = 0.037
by two-tailed t test.
analysis (Table S2). We then tested this gene signature’s ability to predict treatment response using bulk RNA-seq data from pre-treatment tumors from a separate phase 2 trial of atezolizumab for metastatic bladder cancer (IMvigor210; Mariathasan et al., 2018). In 244 metastatic bladder cancer patients with pre-treatment RNA-seq data, immunohistochemistry (IHC) information regarding immune phenotype (immune desert, immune-excluded, or inflamed), and information regarding clinical response, this gene signature was significantly correlated with clinical response to anti-PD-L1 therapy in inflamed samples (p = 0.037, two-tailed t test, N = 62 inflamed samples; Figure 5F), which was not seen in samples with an immune-excluded or immune desert phenotype. Hence, we used a composite signature containing genes that discriminated proliferating and non-proliferating cytotoxic CD4+ T cells to assess the specific contributions of these discrete states and found that this signature is associated with response to PD-L1 blockade in a large cohort of TCC patients. This result highlights the potential clinical importance of possessing intratumoral cytotoxic CD4+ T cell activity in response to anti-PD-L1 treatment.
DISCUSSION
Current efforts to dissect the mechanism of tumor immune surveillance and enhance the efficacy of cancer immunotherapies have primarily focused on conventional cytotoxic CD8+ T cell-mediated responses. However, given the known functional diversity of CD4+ T cell effector responses and emerging data that CD4+ T cell recognition may be important for anti-tumor responses (for instance, in the context of a neoantigen vaccine; Ott et al., 2017; Sahin et al., 2017), the role of specific CD4+ states in enhancing or suppressing immune responses in the tumor microenvironment and how these are modulated by systemic therapies, including immunotherapy, remains unknown. Here we use unbiased massively parallel genotypic and phenotypic profiling of the T cell compartment in localized bladder tumors and the adjacent non-malignant compartment, including those treated with anti-PD-L1 immunotherapy, as a tool to finely dissect heterogeneity in CD4+ T cell subsets. We identified specific CD4+ T cell states with functional relevance for response to immunotherapy and clinical outcomes. We not only confirmed the presence of CD4+ T cell states with known contributions to anti-tumor immune responses, such as CXCL13+ CD4+ T cells (Schmidt et al., 2018; Gu-Trantien et al., 2013, 2017; Wei et al., 2018; Zhang et al., 2018) as well as Th17 cells (Kryczek et al., 2011), we also uncovered insights into the contribution of CD4+ TILs to tumor control by the immune system in bladder cancer.
First, we identified distinct states of Tregs that differed based on the level of expression of IL2RA and immune checkpoints, such as TNFRSF18, which was confirmed at the protein level. These Tregs possessed a private repertoire with no detected clonotype sharing with other T cell states, which would suggest that these are not induced Tregs. Because a gene signature from checkpoint-high Tregs is associated with worse outcome in non-small cell lung cancer (Guo et al., 2018), it is possible that these regulatory cells are responsible for setting a basal state of more potent immunosuppression and adverse outcomes in TCC.
Second, we identified heterogeneous states of cytotoxic CD4+ T cells that were unexpected and differed in their expression of canonical cytolytic effector molecules (GZMB, GZMK, and PRF1 [perforin]) as well as other granule-associated proteins (GNLY [granulysin] and NKG7) that may have roles in target cell killing. These were distinct populations based on scRNA-seq and orthogonal validation by flow cytometry and multiplex immunofluorescence tissue staining. Our annotation using SingleR indicated that effector states such as cytotoxic CD4+ T cells found in the tumor microenvironment may not yet be annotated, and, based on “best fit” comparisons with external reference data and transcriptional correlation within our own data, these cells were, in fact, most similar to conventional effector memory cytotoxic CD8+ T cells. The functional similarity between cytotoxic CD4+ T cells and conventional CD8+ T cells was underscored by our finding that CD4GZMB TILs were actually most similar to tumor-specific CD8ENTPD1 cells (Duhen et al., 2018), based on transcriptional data, whereas CD4GZMK TILs were most similar to CD8CM and CD8NAIVE cells. Although these were distinct cell types, based on separate CD4 and CD8 co-receptor expression, this may indicate shared modes of tumor recognition and tumor clearance by cytotoxic CD4+ and CD8+ T cells. Although cytotoxic CD4+ T cells are present in non-small cell lung and hepatocellular carcinoma (Zheng et al., 2017a; Guo et al., 2018), circulate with ipilimumab treatment in metastatic melanoma (Kitano et al., 2013), and also are present in an infectious context, where they represent a clonally expanded dengue virus-specific effector subset (Patil et al., 2018), the extent of their heterogeneity in other solid tumors (including bladder cancer) and whether these cells are important for systemic immunotherapy has remained unclear prior to this work. We found that cytotoxic CD4+ subsets in bladder tumors were clonally expanded, which may be the result of recognition of cognate bladder tumor antigens. Their functional importance was confirmed by their ability to kill autologous tumors ex vivo. The mechanism by which these cells kill target tumor cells involves contact-dependent mechanisms based on inhibition of killing by anti-MHC class II antibodies, although other mechanisms may also contribute. We documented that these cytotoxic CD4+ T cells are polyfunctional and secrete multiple cytokines, such as TNF-α and IFNγ; the latter may contribute to tumor death as well through ferroptosis (Wang et al., 2019) in addition to contact-dependent cytotoxicity. Of note, apart from the subset of cells that co-express TNF-α, IFNγ, PDCD1, LAG3, and HAVCR2, cytotoxic CD4+ T cells are found to generally lack surface expression of many immune checkpoints currently being tested with therapeutic antibodies in pre-clinical and clinical testing, suggesting that these effector cells may have distinct requirements for activation.
Importantly, a gene signature derived from single-cell analysis of proliferating and non-proliferating cytotoxic CD4+ T cells is predictive of the response to anti-PD-L1 therapy in a separate set of 62 patients with inflamed metastatic bladder cancer. Most of these genes have been previously implicated in the biology of cytotoxic effector cells or specific human CD4+ T cell responses in pathogenesis or autoimmunity, including human cytotoxic CD4+ T cells (Arlehamn et al., 2014; Burel et al., 2018; Campbell et al., 2018; Imbeault et al., 2012; Mattoo
et al., 2016; Sumida and Cyster, 2018; Wang et al., 2014). Overall, the predictive value of this cytotoxic CD4+ T cell-specific signature in a large cohort of anti-PD-L1-treated metastatic TCC patients highlights how anti-PD-L1 therapy may alter the immune microenvironment to favor activation of cytotoxic CD4+ effectors, particularly in patients with pre-existing cytotoxic CD4+ T cell activity.
The importance of the relative balance between regulatory and effector T cells is well known for conventional effectors; the regulatory CD4+:cytotoxic CD8+ ratio has been associated with improved survival or response to therapy in several cancers, including bladder (Preston et al., 2013; Sato et al., 2005; Baras et al., 2016; Takada et al., 2018). This work identifies the biological importance of another axis involving the relative balance of regulatory T cells and these cytotoxic CD4+ effectors for anti-tumor activity: removal of regulatory T cells enhanced tumor killing by cytotoxic CD4+ T cells. Our findings suggest that manipulating the balance between cytotoxic CD4+ and regulatory T cell states can lead to therapeutic benefit in TCC.
Finally, the origin of cytotoxic CD4+ T cell effectors within tumors remains unclear. We do not find direct evidence of plas-
STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:
• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
    ◦ Lead Contact
    ◦ Materials Availability
    ◦ Data and Code Availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
• METHOD DETAILS
    ◦ Tissue processing
    ◦ Flow cytometry/FACS
    ◦ Single cell RNA sequencing
    ◦ TCR sequencing
    ◦ Expression analysis
    ◦ TCR analysis
    ◦ Tumor infiltrating lymphocyte (TIL) isolation and culturing
    ◦ Cytotoxic T lymphocyte (CTL) killing assay
Baras, A.S., Drake, C., Liu, J.J., Gandhi, N., Kates, M., Hoque, M.O., Meeker, A., Hahn, N., Taube, J.M., Schoenberg, M.P., et al. (2016). The ratio of CD8 to Treg tumor-infiltrating lymphocytes is associated with response to cisplatin-based neoadjuvant chemotherapy in patients with muscle invasive urothelial carcinoma of the bladder. OncoImmunology 5, e1134412.
Bolotin, D.A., Poslavsky, S., Mitrophanov, I., Shugay, M., Mamedov, I.Z., Putintseva, E.V., and Chudakov, D.M. (2015). MiXCR: software for comprehensive adaptive immunity profiling. Nat. Methods 12, 380–381.
Burel, J.G., Lindestam Arlehamn, C.S., Khan, N., Seumois, G., Greenbaum, J.A., Taplitz, R., Gilman, R.H., Saito, M., Vijayanand, P., Sette, A., and Peters, B. (2018). Transcriptomic Analysis of CD4+ T Cells Reveals Novel Immune Signatures of Latent Tuberculosis. J. Immunol. 200, 3283–3290.
Büttner, M., Miao, Z., Wolf, F.A., Teichmann, S.A., and Theis, F.J. (2019). A test metric for assessing single-cell RNA-seq batch correction. Nat. Methods 16, 43–49.
Campbell, K.S., Cohen, A.D., and Pazina, T. (2018). Mechanisms of NK Cell Activation and Clinical Activity of the Therapeutic SLAMF7 Antibody, Elotuzumab in Multiple Myeloma. Front. Immunol. 9, 2551.
Cancer Genome Atlas Research Network (2008). Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061–1068.
De Simone, M., Arrigoni, A., Rossetti, G., Gruarin, P., Ranzani, V., Politano, C., Bonnal, R.J.P., Provasi, E., Sarnicola, M.L., Panzeri, I., et al. (2016). Transcriptional Landscape of Human Tissue Lymphocytes Unveils Uniqueness of Tumor-Infiltrating T Regulatory Cells. Immunity 45, 1135–1147.
Duhen, T., Duhen, R., Montler, R., Moses, J., Moudgil, T., de Miranda, N.F., Goodall, C.P., Blair, T.C., Fox, B.A., McDermott, J.E., et al. (2018). Co-expression of CD39 and CD103 identifies tumor-reactive CD8 T cells in human solid tumors. Nat. Commun. 9, 2724.
Gu-Trantien, C., Loi, S., Garaud, S., Equeter, C., Libin, M., de Wind, A., Ravoet, M., Le Buanec, H., Sibille, C., Manfouo-Foutsop, G., et al. (2013). CD4+ follicular helper T cell infiltration predicts breast cancer survival. J. Clin. Invest. 123, 2873–2892.
Gu-Trantien, C., Migliori, E., Buisseret, L., de Wind, A., Brohée, S., Garaud, S., Noël, G., Chi, V.L.D., Lodewyckx, J.N., Naveaux, C., et al. (2017). CXCL13-producing TFH cells link immune suppression and adaptive memory in human breast cancer. JCI Insight 2, 91487.
Guo, X., Zhang, Y., Zheng, L., Zheng, C., Song, J., Zhang, Q., Kang, B., Liu, Z., Jin, L., Xing, R., et al. (2018). Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. Nat. Med. 24, 978–985.
Kim, K.H., Cho, J., Ku, B.M., Koh, J., Sun, J.M., Lee, S.H., Ahn, J.S., Cheon, J., Min, Y.J., Park, S.H., et al. (2019). The First-week Proliferative Response of Peripheral Blood PD-1+CD8+ T Cells Predicts the Response to Anti-PD-1 Therapy in Solid Tumors. Clin. Cancer Res. 25, 2144–2154.
Kitano, S., Tsuji, T., Liu, C., Hirschhorn-Cymerman, D., Kyi, C., Mu, Z., Allison, J.P., Gnjatic, S., Yuan, J.D., and Wolchok, J.D. (2013). Enhancement of tumor-reactive cytotoxic CD4+ T cell responses after ipilimumab treatment in four advanced melanoma patients. Cancer Immunol. Res. 1, 235–244.
Koshkin, V.S., and Grivas, P. (2018). Emerging Role of Immunotherapy in Advanced Urothelial Carcinoma. Curr. Oncol. Rep. 20, 48.
Krensky, A.M., and Clayberger, C. (2009). Biology and clinical relevance of granulysin. Tissue Antigens 73, 193–198.
Kryczek, I., Zhao, E., Liu, Y., Wang, Y., Vatan, L., Szeliga, W., Moyer, J., Klimczak, A., Lange, A., and Zou, W. (2011). Human TH17 cells are long-lived effector memory cells. Sci. Transl. Med. 3, 104ra100.
Kurioka, A., Walker, L.J., Klenerman, P., and Willberg, C.B. (2016). MAIT cells: new guardians of the liver. Clin. Transl. Immunology 5, e98.
Liakou, C.I., Kamat, A., Tang, D.N., Chen, H., Sun, J., Troncoso, P., Logothetis, C., and Sharma, P. (2008). CTLA-4 blockade increases IFNgamma-producing CD4+ICOShi cells to shift the ratio of effector to regulatory T cells in cancer patients. Proc. Natl. Acad. Sci. USA 105, 14987–14992.
Löfroos, A.B., Kadivar, M., Resic Lindehammer, S., and Marsal, J. (2017). Colorectal cancer-infiltrating T lymphocytes display a distinct chemokine receptor expression profile. Eur. J. Med. Res. 22, 40.
Mariathasan, S., Turley, S.J., Nickles, D., Castiglioni, A., Yuen, K., Wang, Y., Kadel, E.E., III, Koeppen, H., Astarita, J.L., Cubas, R., et al. (2018). TGFβ attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells. Nature 554, 544–548.
Martincorena, I., and Campbell, P.J. (2015). Somatic mutation in cancer and normal cells. Science 349, 1483–1489.
Mattoo, H., Mahajan, V.S., Maehara, T., Deshpande, V., Della-Torre, E., Wallace, Z.S., Kulikova, M., Drijvers, J.M., Daccache, J., Carruthers, M.N., et al. (2016). Clonal expansion of CD4(+) cytotoxic T lymphocytes in patients with IgG4-related disease. J. Allergy Clin. Immunol. 138, 825–838.
McInnes, L., and Healy, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv, arXiv:1802.03426. https://arxiv.org/abs/1802.03426.
Medley, Q.G., Kedersha, N., O’Brien, S., Tian, Q., Schlossman, S.F., Streuli, M., and Anderson, P. (1996). Characterization of GMP-17, a granule membrane protein that moves to the plasma membrane of natural killer cells following target cell recognition. Proc. Natl. Acad. Sci. USA 93, 685–689.
STAR+METHODS
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Critical Commercial Assays
GentleMACS | Miltenyi Biotec | Cat# 130-093-235
FoxP3/transcription factor staining buffer set | eBioscience | Cat# 00-5523-00
Cell stimulation cocktail | eBioscience | Cat# 00-4975
Chromium Single Cell 3′ Library, Gel Bead & Multiplex Kit | 10X Genomics | Cat# 120233 (discontinued)
Chromium Single Cell 3′ Chip Kit | 10X Genomics | Cat# 120232 (discontinued)
Dynabeads Human T-Activator CD3/CD28/CD137 | GIBCO | Cat# 11162D
Opal 7-color manual IHC kit | Perkin Elmer | Cat# NEL811001KT
Deposited Data
Processed data | This study | NCBI GEO: GSE149652
Healthy human donor TCR data for CD4+ peripheral blood mononuclear cells | 10X Genomics | https://support.10xgenomics.com/single-cell-vdj/datasets/2.2.0/vdj_v1_hs_cd4_t
Healthy human donor TCR data for CD8+ peripheral blood mononuclear cells | 10X Genomics | https://support.10xgenomics.com/single-cell-vdj/datasets/2.2.0/vdj_v1_hs_cd8_t
Human reference genome, build hg19 | 10X Genomics | http://software.10xgenomics.com/
Oligonucleotides
TCR sequencing primers | Table S3 | N/A
Software and Algorithms
Cell Ranger v1.1 | 10X Genomics | http://software.10xgenomics.com/
Scanpy v1.4.3 | Wolf et al., 2018 | https://scanpy.readthedocs.io/en/stable/index.html
miXCR v2.1.12 | Bolotin et al., 2015 | https://mixcr.readthedocs.io/en/latest/
Monocle v2.10.1 | Qiu et al., 2017 | Bioconductor
SingleR v1.1.9 | Aran et al., 2019 | Bioconductor
FlowJo | TreeStar | N/A
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Lawrence
Fong (lawrence.fong@ucsf.edu).
Materials Availability
Primer sequences for TCR sequencing are enumerated in Table S3. All unique reagents generated in this study are available from the
Lead Contact upon request.
Tissues were obtained from patients with localized bladder transitional cell carcinoma (TCC) who either received 1-2 doses of neo-
adjuvant atezolizumab as part of an ongoing clinical trial (UCSF IRB# 14-15423, patients were accrued sequentially to receive
increasing numbers of atezolizumab doses), or standard of care treatments recommended by their treating physician including
chemotherapy (gemcitabine/carboplatin) or no systemic therapy prior to planned cystectomy (these patients were consented for tis-
sue collection under a separate protocol, UCSF IRB# 10-04057). All studies with patients and patient samples were conducted with
appropriate institutional IRB approval and oversight. Patient demographics, including age, gender, disease state (all were localized
muscle-invasive bladder cancer), neoadjuvant treatment, and presence of tumor and pathologic staging at the time of surgery are
provided in Table S1. No formal sample size calculations were conducted for this particular collection.
METHOD DETAILS
Tissue processing
Tissues were obtained from patients with localized bladder transitional cell carcinoma (TCC) who either received neoadjuvant ate-
zolizumab, standard of care chemotherapy (gemcitabine/carboplatin), or no systemic therapy as per standard of care prior to
planned cystectomy. Cystectomy surgical specimens were obtained fresh from the operating field, and dissected in surgical pathol-
ogy where grossly apparent tumor or adjacent bladder not grossly affected by tumor (‘‘non-malignant’’) were isolated, minced, and
transported at room temperature immersed in L15 media with 15 mM HEPES and 600 mg% glucose. Once received, these were
digested using Liberase TL as well as mechanical dissociation with heat (gentleMACS) using standard protocols. Single cell suspen-
sions were obtained and counted for viability before staining for FACS. Healthy donor blood was separately collected, processed by
gradient centrifugation to peripheral blood mononuclear cells (PBMCs), and cryopreserved to be thawed later for control
experiments.
Flow cytometry/FACS
Freshly dissociated TILs and previously frozen healthy donor PBMCs were used for sorting. Samples were stained with designated
panels for 30 minutes at 4 °C and washed twice with FACS buffer (PBS, 2% FBS, 1 mM EDTA). Cells were incubated with Draq7
(Biolegend, Cat# 424001) for 5 minutes at room temperature to stain dead cells. Samples were sorted on a FACSAria Fusion
(Becton Dickinson) using FACSDiva software with single channel compensation controls acquired on the same day.
For RNA sequencing flow validation, previously frozen TILs were thawed into complete media (RPMI, 10% heat-inactivated FBS,
1% non-essential amino acids solution, 10 µM HEPES, 1 mM sodium pyruvate, 2 mM L-glutamine, 100 U/ml penicillin-streptomycin)
and washed once with PBS. Cells were incubated with Live/Dead Fixable Near-IR dead cell stain (Invitrogen, Cat# L34975) for
30 minutes at room temperature and washed once with FACS buffer. Samples were stained with designated panels for 30 minutes at 4 °C
and washed twice with FACS buffer. Cells requiring intracellular staining were fixed and permeabilized with the eBioscience
FoxP3/Transcription factor staining buffer set (Cat# 00-5523-00) according to the manufacturer’s protocol. Intracellular staining
with antibodies was carried out for 30 minutes at room temperature, followed by two washes with FACS wash. Cells were fixed with
FluoroFix buffer (Biolegend, Cat# 422101) and washed once with FACS buffer. Cells were acquired the next day on a FACSymphony
(Becton Dickinson) using FACSDiva with single channel compensation controls acquired on the same day. Data were analyzed offline
using FlowJo analysis software (FlowJo, LLC).
For cytokine expression, cells were resuspended in complete media and divided into two T25 flasks. One flask was activated with
cell stimulation cocktail (eBioscience, Cat# 00-4975, containing phorbol 12-myristate 13-acetate, ionomycin, brefeldin A, and
monensin at final concentrations of 81 nM, 1.34 nM, 10.6 mM, and 2 mM, respectively), and both flasks were incubated upright for
3 h in a CO2 incubator at 37 °C. Cells were collected and washed once with PBS prior to Live/Dead Fixable Near-IR dead cell
staining and surface and intracellular flow staining as described above.
Antibodies used for sorting were Brilliant Violet 605 CD25 (Biolegend, clone BC96, Cat# 302632), Brilliant Violet 786 CD127
(Biolegend, clone A019D5, Cat# 351330), Brilliant Violet 421 CD4 (Biolegend, clone OKT4, Cat# 317434), Brilliant Violet 650 CD3
(Biolegend, clone UCHT1, Cat# 300468), Brilliant Ultraviolet 395 CD45 (Becton Dickinson, clone HI30, Cat# 563792), and Alexa Fluor
647 CD8 (Biolegend, clone SK1, Cat# 344726). Antibodies used for RNA sequencing flow validation were FITC GZMK (Biolegend,
clone GM26E7, Cat# 370508), PerCP-Cy5.5 HLA-DR (Biolegend, clone L243, Cat# 307630), APC-R700 CCR7 (Becton Dickinson,
clone 3D12, Cat# 565867), Brilliant Violet 480 CD3 (Becton Dickinson, clone UCHT1, Cat# 566105), Brilliant Violet 510 GZMB (Becton
Dickinson, clone GB11, Cat# 563388), Brilliant Violet 605 Ki67 (Biolegend, clone Ki-67, Cat# 350522), Brilliant Violet 650 CD45RA
(Biolegend, clone HI100, Cat# 304136), Brilliant Violet 786 CD25 (Biolegend, clone BC96, Cat# 302638), Brilliant Violet 711 TNFRSF18
(Biolegend, clone 108-17, Cat# 371212), Brilliant Ultraviolet 395 CD4 (Becton Dickinson, clone RPA-T4, Cat# 564724), Brilliant
Ultraviolet 496 CD8 (Becton Dickinson, clone RPA-T8, Cat# 564808), Brilliant Ultraviolet 805 CD45 (Becton Dickinson, clone HI30,
Cat# 564914), PE-CF594 FoxP3 (Becton Dickinson, clone 259D/C7), and PE-Cy7 Perforin (Biolegend, clone B-D48, Cat# 353316).
Antibodies used for cytokine staining in addition to those used above were Alexa Fluor 647 IFNγ (Biolegend, clone 4S.B3, Cat#
502516) and PE anti-human TNFα (Biolegend, clone Mab11, Cat# 502909).
non-malignant tissues, or Ficoll-purified and previously cryopreserved healthy control PBMCs, into 500 µl of PBS/0.04% BSA for
loading onto 10X. Following library preparation, sequencing was performed on an Illumina HiSeq 2500 (Rapid Run mode). Paired
samples from the same experiment and patient were processed in parallel during library preparation, and sequenced on the
same flowcell to minimize batch effects.
TCR sequencing
In brief, approximately 10% of the barcoded cDNA from the 10X workflow was utilized for TCR analysis. Primers used for TCR
sequencing are listed in Table S3. cDNA was first amplified with 6-12 amplification cycles using a template switching oligonucleotide
(TSO) and P7 primers. A pool of forward Vα and Vβ primers containing the TruSeq Read 1 primer sequence was then used in
conjunction with a reverse P7 primer to amplify CDR3 sequences from the TCR alpha and beta loci. An additional amplification
step using forward primers containing the Illumina P5, i5, and TruSeq Read 1 sequences was used with a reverse P7 primer to create
final TCR libraries for sequencing. Deep sequencing was done on an Illumina NovaSeq S1 with separate lanes for the TCR alpha and
TCR beta sequencing. Read 1 contained 280 bp of the TCR alpha or beta CDR3 sequence, and the i7 read contained the 14 bp 10X
barcode.
Expression analysis
After 10X sequencing data were processed through the Cell Ranger pipeline (version 1.1, hg19 genome assembly) with default
settings, filtered gene-barcode matrices for single tumors were analyzed using the scanpy toolkit (Wolf et al., 2018). Genes
detected in fewer than three cells were filtered out, as were cells with more than ten percent mitochondrial genes or with fewer
than 100 or more than 1200 detected genes. Cells that were annotated as red blood cells (HBB) or macrophages (CD14, CD68, CD163)
were also excluded from downstream analyses. The gene expression values were log2(x + 1) transformed and normalized to 10,000
counts per cell. The resulting matrix was batch corrected by regressing out total UMI counts and percent mitochondrial genes
using the built-in scanpy function, followed by the scanpy implementation of ComBat (Johnson et al., 2007) with each well acting
as a batch (13 wells total). The adjusted matrix was scaled to a mean of zero and a variance of one. Highly variable genes were
selected using the embedded scanpy function, followed by principal component analysis (PCA), Leiden clustering, and UMAP plotting
with default settings, except that a resolution of 1.5 was used for CD4+ T cells and 1.0 for CD8+ T cells in the Leiden
clustering. This yielded 19 clusters, which were collapsed to 11 cell types based on manual gene annotations (for CD4+ cells),
and 11 clusters (for CD8+ cells). We performed differential expression to identify marker genes that were upregulated in each
individual cluster relative to the combination of all other single cells (regardless of tumor or non-malignant tissue origin), or
genes that were upregulated in tumor versus non-malignant compartments. We compared the gene lists to the published literature to
label the clusters and used SingleR (Aran et al., 2019) to map the expression signature for each cluster to the best-correlated
candidate immune reference signature, using the Monaco bulk RNA-seq reference of sorted human immune cell populations (Monaco
et al., 2019). Significant differences between the cell type abundances for the normal and tumor tissue samples were assessed using
an exact permutation test on the abundances.
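The quality-control thresholds stated above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' scanpy code. The function name, the input layout (a dictionary of per-cell raw counts), and the set of mitochondrial gene names are hypothetical. The sketch also assumes that "percent mitochondrial genes" means the fraction of a cell's counts, that detected genes are counted before gene filtering, and that depth normalization to 10,000 counts precedes the log2(x + 1) transform.

```python
import math


def filter_and_normalize(counts, mito_genes, min_cells=3, min_genes=100,
                         max_genes=1200, max_mito_frac=0.10, target_sum=10_000):
    """Apply the stated QC thresholds to raw counts and return log2(x + 1) values.

    counts: dict mapping cell barcode -> dict mapping gene -> raw UMI count.
    mito_genes: set of gene names treated as mitochondrial (hypothetical input).
    """
    # Count how many cells detect each gene, then drop rarely detected genes.
    cells_per_gene = {}
    for genes in counts.values():
        for gene, n in genes.items():
            if n > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1
    kept_genes = {g for g, k in cells_per_gene.items() if k >= min_cells}

    normalized = {}
    for cell, genes in counts.items():
        total = sum(genes.values())
        detected = sum(1 for n in genes.values() if n > 0)
        mito = sum(n for g, n in genes.items() if g in mito_genes)
        # Remove cells outside the detected-gene window.
        if total == 0 or not (min_genes <= detected <= max_genes):
            continue
        # Remove cells whose mitochondrial count fraction is too high (assumed meaning).
        if mito / total > max_mito_frac:
            continue
        # Scale each cell to target_sum counts, then apply log2(x + 1).
        normalized[cell] = {
            g: math.log2(n * target_sum / total + 1)
            for g, n in genes.items()
            if g in kept_genes and n > 0
        }
    return normalized
```

Regression of covariates, ComBat batch correction, and clustering are left out because they depend on the scanpy implementation named in the text.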
Correlation analysis between gene expression from distinct clusters was performed by restricting to genes expressed across all
clusters being tested, and then correlating the scaled expression of the multidimensional vector of shared genes between pairs of
clusters and computing the Pearson correlation coefficient.
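As a rough illustration of this cluster-to-cluster comparison, the sketch below restricts two expression profiles to their shared genes and computes the Pearson correlation coefficient. The function name and the dictionary representation (gene name to scaled expression, with absence meaning "not expressed") are assumptions made for the example and are not taken from the authors' code.

```python
import math


def cluster_correlation(expr_a, expr_b):
    """Pearson correlation of two cluster profiles over their shared genes.

    expr_a, expr_b: dict mapping gene -> scaled expression for one cluster.
    """
    # Restrict to genes present in both clusters (assumed meaning of "expressed").
    shared = sorted(set(expr_a) & set(expr_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared genes")
    xs = [expr_a[g] for g in shared]
    ys = [expr_b[g] for g in shared]
    n = len(shared)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation undefined for a constant profile")
    return cov / (sd_x * sd_y)
```

Applying the function to every pair of clusters would give a correlation matrix of the kind used to compare CD4+ and CD8+ states.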
TCR analysis
TRA and TRB CDR3 nucleotide reads were demultiplexed by matching reads to 10X barcodes from cells with existing expression
data that passed filtering in the Cell Ranger pipeline, excluding cell barcodes that overlapped between multiple samples. Following
demultiplexing of the TRA and TRB CDR3s, reads were aligned against known TRA/TRB CDR3 sequences then assembled into clo-
notype families using miXCR (Bolotin et al., 2015) with similar methodologies to a previous study (Zemmour et al., 2018). For any given
10X barcode, the most abundant TRA or TRB clonotype was accepted for further analysis; if 2 TRA or TRB clonotypes were equally
abundant for a given 10X barcode, the clonotype with the highest sequence alignment score was used for further analysis. Detailed
sequencing statistics and saturation analysis are provided in Table S3. Only cells with paired TRA and TRB were used for further
downstream analysis. Analysis utilizing TCR data only (number of unique cells sharing a specific TRA/TRB clonotype sequence,
Gini coefficient) utilized cells both with and without a specific functional population that had been assigned by clustering. Analysis
involving both TCR clonotype and function was restricted to cells with both a mapped TRA/TRB and a functional population from
clustering. Statistical comparisons of Gini coefficients across compartments were performed using the Wilcoxon signed-rank test with
Benjamini-Hochberg correction for multiple testing; statistical testing of differences in Gini coefficients between tumor and non-ma-
lignant compartments across all phenotypic clusters was performed using exact permutation testing.
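The per-barcode clonotype selection rule and the Gini coefficient described above can be sketched as follows. This is an illustrative reading, not the authors' pipeline. The tuple layout `(cdr3, read_count, alignment_score)`, the function names, and the specific Gini formula (the standard rank-weighted form over clone sizes) are assumptions, since the text does not state which estimator was used.

```python
from collections import Counter


def pick_clonotype(candidates):
    """Choose one clonotype for a barcode and chain.

    candidates: list of (cdr3, read_count, alignment_score) tuples.
    The most abundant clonotype wins; ties go to the higher alignment score.
    """
    return max(candidates, key=lambda c: (c[1], c[2]))[0]


def gini(clone_sizes):
    """Gini coefficient of clone sizes (0 = perfectly even repertoire)."""
    xs = sorted(clone_sizes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def clonotype_gini(cells):
    """Gini coefficient over paired TRA/TRB clonotypes.

    cells: dict mapping barcode -> {"TRA": [...], "TRB": [...]} candidate lists.
    Only cells with both chains are kept, as stated in the text.
    """
    paired = []
    for chains in cells.values():
        if chains.get("TRA") and chains.get("TRB"):
            paired.append((pick_clonotype(chains["TRA"]),
                           pick_clonotype(chains["TRB"])))
    return gini(list(Counter(paired).values()))
```

A value near 0 indicates evenly sized clones, and values approaching 1 indicate a repertoire dominated by a few expanded clones.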
Cells were subsequently stained and sorted by FACS. CD4 TIL (Draq7-CD45+CD3+CD4+ that were not CD25+CD127lo) and CD8 TIL
(Draq7-CD45+CD3+CD8+) were sorted into ImmunoCult XF complete medium (Medium + 10% FCS + 1% penicillin/streptomycin;
STEMCELL Technologies #10981). T cells were pooled together for culturing. After centrifugation, T cells were suspended in Immu-
noCult XF complete medium, and Dynabeads Human T-Activator CD3/CD28/CD137 (GIBCO #11162D) were added to the culture per
manufacturer’s protocol. T cells were cultured in 96 well U-bottom plates, and briefly centrifuged to ensure cell contact with Dyna-
beads. T cell expansion was managed in two phases. For the first week of T cell expansion, TILs were maintained with ImmunoCult XF
complete medium + 200 IU/ml of human recombinant IL-2 (Peprotech #200-02). From the second week onward, IL-2 concentration
was gradually increased from 200 IU/ml to 2000 IU/ml based on cell growth kinetics (which varied by patient sample). T cells were
harvested between 5-8 weeks for functional killing assays.
Pseudotime analysis
Pseudotime analysis, including branched expression analysis modeling (BEAM) to identify all genes with branch-dependent differential
expression followed by unbiased clustering of genes based on patterns of co-expression in specific branches, was performed
using Monocle v2.10.1 as described (Qiu et al., 2017), for the combination of proliferating (CD4PROLIF), regulatory (CD4IL2RAHI,
CD4IL2RALO) and cytotoxic (CD4GZMB, CD4GZMK) states from scRNA-seq clustering.
Specific statistical tests and metrics (median, mean, standard error) used for comparisons, along with sample sizes, are described in
the Results and figure legends. The chemotherapy sample was included in unbiased clustering, testing for conserved marker genes
and tumor versus non-malignant testing, but was excluded from analyses of treatment effect (anti-PD-L1 versus untreated).
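Several of the comparisons in this study are reported with Benjamini-Hochberg correction for multiple testing. A minimal Python sketch of that adjustment is given below for orientation; the function name and the example p values are hypothetical, and this is not the code used in the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values stay
    # monotone: each one is the minimum of p * m / rank over all ranks
    # at or above the current rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values from four comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"p = {p:.3f} -> adjusted p = {q:.3f}")
```

A comparison would then be called significant when its adjusted p value falls below the chosen threshold (for example 0.05).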
ADDITIONAL RESOURCES
The clinical trial of neoadjuvant atezolizumab prior to planned cystectomy for localized bladder cancer is registered under
clinicaltrials.gov (NCT02451423).
Supplemental Figures
Figure S1. Flow Cytometry and Immunofluorescence Validation of T Cell Phenotypes in Bladder Tumors, Related to Figures 1, 2, 3, 4, and 5
(A) Schematic of processing for paired tumor and adjacent non-malignant tissue from either anti-PD-L1-treated, or standard-of-care (untreated/chemotherapy-
treated) cystectomy patients. FACS-sorted CD4+ or CD8+ T cells were subjected to droplet-based single-cell RNA sequencing (dscRNA-seq) with paired T cell
receptor (TCR) sequencing as described in the text. (B) Parallel flow cytometry data from the same single-cell digest used for dscRNA-seq from 4 anti-PD-L1-
treated tumors, showing the percentage of CD4+ or CD8+ T cells from total CD3+ cells. (C) Gating strategy for flow cytometric analysis of populations in CD4+ and
CD8+ T cells from RNA-seq. CD4+ and CD8+ populations were gated out of CD3+ CD45+ single live cells. CD4+ cells were further gated as FoxP3- and FoxP3+.
Treg cells are gated as FOXP3+ CD25+ cells. FOXP3- CD4+ and CD8+ cells were gated into central memory (CM, CCR7+ CD45RA-), and CCR7- cells (a com-
bination of effector memory CCR7- CD45RA- and effector CCR7- CD45RA+). Boolean gating of CCR7- cells was used to obtain GZMK+, GZMB+ and Ki67+
populations for further marker analysis. Plots are shown here to demonstrate the presence of these populations. (D) Representative gates shown for each marker
for CD4+ and CD8+ T cells were used for Boolean gating for the populations described above. (E) Flow cytometry staining of GZMB, GZMK, or perforin versus CD3
in CCR7- CD8+ T cells. Gates used for Boolean analysis are shown. (F) Flow cytometry staining of GZMB or GZMK co-expression with perforin in CCR7- CD8+
T cells. (G) Percentage of cells expressing GZMB, GZMK, or perforin from CCR7- CD8+ T cells by flow cytometry (left), and the percentage of cells co-expressing
perforin within GZMB+ or GZMK+ CCR7- CD8+ T cells (right), are shown (N = 7 tumors, mean + SEM). (H) Percentages of cells expressing IFNγ, TNFα, or both from
GZMB+ or GZMK+ CCR7- CD8+ T cells with and without stimulation (N = 11 tumors, mean + SEM). (I) Multiplex immunofluorescent staining of DAPI (blue), CD4
(red), GZMK (green), GZMB (white) and overlay without DAPI are shown from a cystectomy tumor region from an additional patient with parallel scRNA-seq and
TCR-seq data (anti-PD-L1 D). CD4+ cells that co-express GZMK (arrows) or GZMB (arrowhead) are indicated. Scale bar, 10 μm. (J) Percentage of cells
co-expressing Ki67 and either GZMB or GZMK from CCR7- CD4+ FOXP3- T cells (left), or Ki67 and CD25 from CD4+ FOXP3+ T cells (right), by flow cytometry are
shown, with dots for values from individual tumors (N = 7 tumors, mean ± SEM). (K) Flow cytometry staining showing co-expression of GZMB and Ki67, or GZMK
and Ki67, from CCR7- CD8+ T cells.
Figure S2. Clustering, Differential Expression, Annotation, and Correlation Analysis of T Cell Transcriptional Phenotypes, Related to Figures
1, 2, 3, and 4
(A-B) UMAP plots showing cluster representation for CD8+ (A) and CD4+ (B) TIL from individual patients. (C) Volcano plots showing adjusted P values versus
log2(FC) for differential testing of genes between tumor and non-malignant compartments for regulatory T cell populations (top, CD4IL2RAHI, CD4IL2RALO) and
cytotoxic CD4+ populations (bottom, CD4GZMB, CD4GZMK). Genes whose expression is significantly different between compartments with Padj < 0.05 and
|log2(FC)| > 1.4 are shown in red. (D-E) Unbiased clustering of CD4+ T cells from tumor and adjacent non-malignant tissue from a single patient (anti-PD-L1 C). (D)
UMAP plot showing individual cells coded by cluster or by tissue of origin. (E) Violin plot showing top 5 differentially expressed marker genes for each unbiased
cluster. (F) Annotations of single CD4+ T cells from tumor and adjacent non-malignant tissue using SingleR. (G) Correlation matrix of all CD4+ and CD8+ pop-
ulations from tissue (combined tumor and non-malignant tissues) based on expression of shared genes. Pearson correlation coefficient is shown. Populations
were arranged based on hierarchical clustering using Euclidean distance metric.
Figure S3. T Cell Receptor Repertoire Analysis of CD4+ and CD8+ Bladder Tumor- and Non-malignant Tissue-Infiltrating T Cells, Related to
Figures 3 and 4
(A) The percentage of unique paired TRA and TRB CDR3 nucleotide sequences that are expressed by one cell (blue), shared by two cells (green), or shared by three or
more cells (red) is indicated for CD4+ T cells from individual tumor (darker shades) and non-malignant tissues (lighter shades) from anti-PD-L1-treated (‘‘PD-L1’’),
untreated, and chemotherapy-treated (‘‘chemo’’) patients. Triplicate control samples from a single healthy donor’s CD4+ T cells sorted from peripheral blood and
processed for scRNA-seq and TCR in identical fashion in separate sequencing runs are shown (‘‘healthy 1-3’’), as well as reference publicly available data from peripheral
blood CD4+ from a healthy donor. (B) Lorenz curves showing the cumulative frequency distributions for unique CD4+ T cells and unique CD4+ T cell clonotypes for
tumor, non-malignant tissues, and healthy donor blood. Mean ± SD is shown. (C) Gini coefficients for CD4+ T cell clonotypes from tumor, non-malignant tissues, and
healthy donor blood, calculated from the Lorenz curves in (B); p = 0.009 by Wilcoxon with Benjamini-Hochberg correction for tumor versus non-malignant tissues. For
(B) and (C): N = 7 tumor samples; 6 non-malignant samples, 4 healthy donor samples (3 replicates from one healthy donor, 1 dataset from 10X Genomics). (D-F) Paired
TRA/TRB clonotype sharing between cells, Lorenz curves, and Gini coefficients for CD8+ clonotype data as in (A-C). (G) Gini coefficients for tissue-infiltrating CD4+ in
individual populations, separated by treatment type. (H-I) Gini coefficients for CD8+ T cells in individual populations, separated by tumor versus non-malignant tissue (H)
and treatment type (I). All box and whisker plots are formatted as in Figure 3C.
Figure S4. Autologous MHC-Dependent Killing of Bladder Tumors by CD4+ and CD8+ TIL, Related to Figure 5
Analysis of the increase in the number of dead cells over time from the same killing assay for CD4eff TIL (i.e., cultures with Tregs sorted out during expansion) at 30:1
effector:target ratio (A), CD4eff TIL at 30:1 effector:target ratio with a pan-anti-MHCII antibody (B), CD8+ TIL at 30:1 effector:target ratio (C), or CD8+ TIL at 30:1
effector:target ratio with a pan-anti-MHCI antibody (D), are shown. Control traces from separate wells with tumor only are included. All traces were normalized to
the number of dead cells per mm2 at time point 0. Experiments were done using Cytotox Red. The observation of autologous tumor killing by CD4+ and CD8+ TIL
above the background level of spontaneous death is representative of 2 independent experiments involving distinct aliquots from the same patient.
Resource
Correspondence
becher@immunology.uzh.ch
In Brief
High-parametric single-cell mapping of
the tumor microenvironment of patients
with primary brain tumors or brain
metastases reveals that the immune
response to cancer in the brain is shaped
by cancer type, with metastases favoring
T cell and monocyte-derived
macrophage invasion and gliomas
characterized by activated microglia.
Highlights
• Leukocyte invasion is higher in brain metastasis than in CNS-endogenous cancers
Single-Cell Mapping of Human Brain Cancer
Reveals Tumor-Specific Instruction
of Tissue-Invading Leukocytes
Ekaterina Friebel,1,5 Konstantina Kapolou,2,5 Susanne Unger,1 Nicolás Gonzalo Núñez,1 Sebastian Utz,1
Elisabeth Jane Rushing,4 Luca Regli,3 Michael Weller,2 Melanie Greter,1 Sonia Tugues,1 Marian Christoph Neidert,2,3,6
and Burkhard Becher1,6,7,*
1Institute of Experimental Immunology, University of Zurich, Zurich 8057, Switzerland
2Laboratory of Molecular Neuro-Oncology, Department of Neurology, Clinical Neuroscience Center, University Hospital Zurich and University
of Zurich, Zurich 8091, Switzerland
3Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zurich and University of Zurich, Zurich 8091, Switzerland
4Department of Neuropathology, University Hospital Zurich and University of Zurich, Zurich 8091, Switzerland
5These authors contributed equally
6These authors contributed equally
7Lead Contact
*Correspondence: becher@immunology.uzh.ch
https://doi.org/10.1016/j.cell.2020.04.055
SUMMARY
Brain malignancies can either originate from within the CNS (gliomas) or invade from other locations in the
body (metastases). A highly immunosuppressive tumor microenvironment (TME) influences brain tumor
outgrowth. Whether the TME is predominantly shaped by the CNS micromilieu or by the malignancy itself
is unknown, as is the diversity, origin, and function of CNS tumor-associated macrophages (TAMs). Here,
we have mapped the leukocyte landscape of brain tumors using high-dimensional single-cell profiling (Cy-
TOF). The heterogeneous composition of tissue-resident and invading immune cells within the TME alone
permitted a clear distinction between gliomas and brain metastases (BrM). The glioma TME presented pre-
dominantly with tissue-resident, reactive microglia, whereas tissue-invading leukocytes accumulated in BrM.
Tissue-invading TAMs showed a distinctive signature trajectory, revealing tumor-driven instruction along
with contrasting lymphocyte activation and exhaustion. Defining the specific immunological signature of
brain tumors can facilitate the rational design of targeted immunotherapy strategies.
INTRODUCTION
Malignant brain tumors can be either primary tumors that arise in the brain (e.g., gliomas) or secondary tumors of extracranial origin (metastases), most frequently invading from non-small-cell lung carcinoma (NSCLC), breast cancer, and melanoma (Quail and Joyce, 2013). Both primary and metastatic brain malignancies have a very poor prognosis, mainly due to limitations of standard therapies (e.g., surgery, radio/chemotherapy) (Aldape et al., 2019). In this scenario, immunotherapy poses a promising treatment option; however, programmed death (PD)-1 blockade has not shown a survival benefit in recurrent glioblastoma (NCT 02017717). Only a selective group of patients with temozolomide-induced hypermutations in gliomas have been shown to benefit from immunotherapy (Daniel et al., 2019; Johanns et al., 2016; Bouffet et al., 2016). In contrast, a considerable number of patients with brain metastases (BrM) respond to checkpoint blockade with concordant responses in extracerebral disease and brain lesions (Goldberg et al., 2016; Kluger et al., 2019; Tawbi et al., 2018). These divergent clinical responses to immunotherapy may be explained by tumor cell genomic (intrinsic) properties regulating T cell antigen recognition and immune responses. Alternatively, tumor-extrinsic features such as the tumor microenvironment (TME) may represent resistance pathways to immune-mediated interventions (Keenan et al., 2019). Thus, in-depth knowledge of the immune cell composition within the TME across different types of brain tumors may be critical to predict not only how tumors will progress, but also their immunotherapeutic outcomes.
Uniquely, the CNS is both a site of immune privilege and a complex leukocyte landscape (Keren-Shaul et al., 2017; Forrester et al., 2018; Mrdjen et al., 2018; Van Hove et al., 2019). An abundant cellular component of the brain TME are tumor-associated macrophages (TAMs), which possess both tumor-promoting and immunosuppressive capacities (Quail and Joyce, 2013; Takenaka et al., 2019). This dichotomy is further complicated by the heterogeneity of TAMs. TAM populations can be subdivided ontogenetically into at least two major populations:
Cell 181, 1626–1642, June 25, 2020 © 2020 Elsevier Inc.
Figure 1. Mass Cytometry Analysis Reveals Unique Changes in the Leukocyte Composition of Brain Tumors
For a Figure360 author presentation of Figure 1, see https://doi.org/10.1016/j.cell.2020.04.055.
(A) Experimental approach.
(B) MDS plot showing the mean antigen expression across leukocytes. The clinical groups are indicated by varying point shapes and colors.
(C) A representative UMAP map showing the FlowSOM-guided meta-clustering of CD45+ cells.
(1) tissue-resident microglia and macrophages of embryonic origin, and (2) tissue-invading monocyte-derived macrophages (hereafter termed MDM) (Croxford et al., 2015; Kiss et al., 2018). Tissue-resident populations comprise microglia and border-associated macrophages (BAMs), representing the majority of phagocytes in the healthy steady-state brain. However, tumor progression induces the recruitment of blood-borne monocytes, which differentiate into MDMs (Bowman et al., 2016; Ginhoux et al., 2010; Goldmann et al., 2016; Mrdjen et al., 2018). Concomitant with the expansion of MDMs, CNS-resident macrophages undergo marked phenotypical and functional changes, which further complicates the identification of the origin of TAMs (Hambardzumyan et al., 2016; Bowman et al., 2016; Quail and Joyce, 2017). Importantly, the majority of studies assessing the contribution of TAMs in brain cancer have been performed in mouse models of malignant glioma or restricted cohorts of glioblastoma patients, with only a few at single-cell resolution (Darmanis et al., 2017; Quail et al., 2016; Venteicher et al., 2017).
Brain tumors also harbor variable lymphocyte infiltrates (TILs) (Bieńkowski and Preusser, 2015). Both CD8 and CD4 T cell subsets (including regulatory T cells [Tregs]) increase with tumor grade (Jacobs et al., 2010; Kuppner et al., 1988). Paradoxically, prognostically more favorable diffuse gliomas with mutations in the isocitrate dehydrogenases 1 and 2 are associated with reduced T cell abundance, presumably due to the effect of the oncometabolite (R)-2-hydroxyglutarate on the TME (Bunse et al., 2018; Kohanbash et al., 2017). The extent of T cell infiltration in brain tumors of extracranial origin has also been examined in some studies. In melanoma BrM, for instance, high densities of TILs have been observed mainly within the tumor stroma and surrounding brain (Amit et al., 2013). Nevertheless, none of these reports have specifically investigated T cell and other lymphocyte infiltration in relation to the overall immune landscape of the TME.
To provide a more detailed insight into the nature of the brain TME with a focus on the immune cell composition, we performed single-cell mass cytometry (CyTOF) on ex vivo surgical resections of brain tumors and non-tumor controls. Our in-depth immune cell phenotyping has revealed fundamental differences in cellular frequencies and phenotypes within primary and secondary brain tumor TMEs. Our findings may explain the divergent results of immunotherapy in patients with brain tumors and allow for the data-driven design of novel therapeutic interventions.
RESULTS
Mass Cytometry Analysis Reveals Unique Changes in the Leukocyte Composition of Brain Tumors
The overall strategy involved harvesting freshly resected tissue from 38 patients undergoing neurosurgery for the treatment of gliomas, BrM (metastatic melanoma, NSCLC, and other tumors) and epilepsy (serving as a quasi-steady-state control) (Figure 1A). Gliomas were further characterized according to IDH1 R132H mutation (IDH1mut and IDH1wt) and methylation status of the O6-methylguanine DNA methyltransferase (MGMT) promoter. Some patients with BrM had a long disease history and underwent either chemotherapy or treatment with immune checkpoint inhibitors. Clinical information and treatment regimens prior to surgery are summarized in Table S1.
To map the complexity of the leukocyte compartment of the TME, we designed two CyTOF panels together, measuring 74 parameters at the single-cell level. The first panel was myeloid-focused (Table S2). Here, we combined antibodies to capture the entire spectrum of phagocytes in the brain TME, together with lineage-identifying markers to map the major leukocyte populations and their relative cellular frequencies. The second panel was lymphoid-focused (Table S3) and was designed to deeply interrogate the lymphoid compartment. We started the analysis of leukocytes obtained from the single-cell suspension of brain tissues with the myeloid-focused panel and estimated similarities between glioma, BrM, and non-tumor control samples. To do so, we applied multidimensional scaling (MDS) to the total leukocytes (Figure 1B), where the distances, or relative similarities, between samples were calculated using the mean value of antigen expressions (Buja et al., 2008). The first dimension featured an unambiguous separation between the glioma and BrM TME, although the single epilepsy sample was mapped closer to the glioma samples. The second dimension showed relative variety between samples within the tumor group (Figure 1B). In order to better understand the basis of the group separation in the MDS analysis, we visualized the mean antigen expressions across leukocytes using a heatmap with unsupervised hierarchical clustering (Figure S1) (Nowicka et al., 2017). The most informative differentially expressed markers were those commonly expressed by most phagocytes (CD11c, CD64, HLA-DR, and CX3CR1), as well as common T cell-restricted antigens (CD3, PD-1, and CD38).
To visualize every single immune population isolated from the different brain tumor samples, we created a two-dimensional graph using the dimensionality reduction algorithm uniform manifold approximation and projection (UMAP) (Figures 1C and S2A) (Mcinnes et al., 2018). To compute UMAP, we specified the lineage markers, listed in Figure 1D to be considered for the estimation of cell similarity. By reducing the high-dimensional data into two dimensions, we could present the quantification of all measured marker expressions on all cellular subtypes, simultaneously (Figure S2A). Next, we categorized the various embedded cell clusters using self-organizing maps (FlowSOM) (Van Gassen et al., 2015; Hartmann et al., 2016). This strategy allowed us to create a map of diverse immune cells including, CNS-resident and invading TAMs/monocytes (CD64+, CD11c+,
(D) Heatmap displaying the median antigen intensity of markers used to generate (C).
(E) The relative frequencies of immune populations of brain tumors and non-tumor control. Only statistically significant p values were displayed (p < 0.05, Mann-
Whitney-Wilcoxon test, Benjamini-Hochberg correction). Error bars define an interval of max/min value ± SD.
(F) Circos plots showing the multiple correlation matrix between the leukocyte frequencies. Results are plotted in order to display statistical significance (p < 0.05).
Here, correlation coefficients (R) higher than 0.6 are represented in red, and those lower than 0.6 in blue.
See also Figures S1 and S2 and Table S1.
and CD11b+), neutrophils (CD66b+ and CD16+), two subsets of dendritic cells (CD141+ and CADM1+ for cDC1 and CD1c+ for cDC2), T cells (CD3+), natural killer (NK) cells (CD56+CD16+), B cells (CD19+ and HLA-DR+), and plasma cells (CD19+ and CD38high) (Figures 1C and 1D). Notably, TAMs and monocytes comprised up to 80% (±18%) of leukocytes in the IDH1mut or IDH1wt gliomas similar to the epilepsy non-tumor control, while T cells represented only 13% (±10%) (Figures 1E and S2B). Further stratification of IDH1wt glioblastoma patients according to the MGMT promoter methylation status showed comparable frequencies of major immune populations (Figure S2C). Interestingly, we observed the opposite situation in BrM, such as melanoma and carcinoma (Figure 1E), with significantly higher relative frequencies of T cells (up to 50% ± 16%) and lower frequencies of TAMs (up to 40% ± 18%) in the TME. Additionally, the metastatic TME in both melanoma and carcinoma showed significantly higher frequencies of plasma cells, which were absent in gliomas (Figure 1E).
To determine whether therapy (including immunotherapy) was driving these inter-tumor variations, we tracked treated patients in our analysis. However, we could not find a significant treatment effect across the samples, and hence all samples were analyzed together and not stratified across therapeutic interventions (Figure 1E). The relative distribution across leukocytes and their relationship within brain tumors differed (Figure 1F). In glioma, an increase in the relative frequencies of T cells, neutrophils, and pDCs correlated negatively with TAM/monocyte frequencies, whereas T cell frequencies positively correlated with pDCs and cDCs frequencies. In BrM, on the other hand, we only observed a negative correlation between T cell and TAM/monocyte frequencies. Taken together, our results show that gliomas and BrM shape TMEs with a distinct immune cell composition: TAMs dominate the TME in gliomas whereas TILs dominate BrM. A close analysis of the immune compartment of TME alone allowed for a clear separation into cancers of CNS origin or extracranial CNS-invading metastasis.
The Brain TME Harbors a Heterogeneous Mononuclear Phagocyte Population
TAMs within the CNS may originate from the CNS-resident microglia and/or from blood-derived monocytes that invade the TME and transform into MDMs. To study the relative contribution of CNS-resident versus invading TAMs across gliomas and BrM, we re-applied FlowSOM to CD64+, CD11c+, CD11b+, CD1c-, and CD66b- cells (Figures 1C, 1D) identified in the TME and non-tumor control. This time, however, we essentially considered the expression of the macrophage activation markers and initially clustered cells into 100 nodes, where each node represents a group of cells. For visualization, we chose a single-cell analysis by fixed force- and landmark-directed (Scaffold) layout (Figure 2A) (Spitzer et al., 2015). Here, one cluster node (cellular population) represents one FlowSOM node and will be mapped closer to proximal nodes indicative of shared and similar features. The size of the node represents the number of cells within the group. In addition, the Scaffold layout allowed for the introduction of manually gated data, which were used as landmark populations for the reference map. This permitted a direct comparison of single-cell data from different species and/or single-cell maps acquired using different techniques. This way, CyTOF and fluorescence-activated cell sorting (FACS) data obtained from patients and mice could be mapped side-by-side. To build the reference map, we used equally normalized pooled data from glioma and BrM samples, to ensure that all TAM subsets were present in our reference sample. Here, we focused on the expression of the integrin alpha 4 (ITGA4/CD49d), which was previously reported to specifically mark CNS-invading macrophages (Bowman et al., 2016) (Figure 2A). Last, we created the reference map for the Scaffold layout, considering three major cellular clusters: (1) CNS-resident microglia (CD49d-, Mertk+, CX3CR1+, CD11c+, and CD64+), (2) monocytes (CD14+, CCR2+, and CD11b+), and (3) MDMs (CD49d+, Mertk+, CD163+, and CD64+) (Figures 2A and 2B).
Of note, our study revealed additional markers (CD45RA, CD141, and ICAM), which were differentially expressed by invading MDMs compared to CNS-resident microglia (Figure 2B). Cellular identification by computational tools was confirmed using manual gating of the CyTOF data (Figure S3A). The CyTOF spectrum of CNS TAMs was complemented by 24-parameter fluorescence cytometry including the microglia-specific protein P2Y12 (Butovsky et al., 2014) in epilepsy, glioma, and BrM samples. We repeated the same strategy using FlowSOM of the CD11b+CD11c+CD66b-lin- cells measured by FACS. Overlaying FACS data onto the reference map from the CyTOF data showed the expected position of nodes around the fixed landmark populations (Figures 2C, 2D). FACS data confirmed both the phenotype and frequencies of the CyTOF populations and categorized the exclusive expression of CD49d on CNS-invading and P2Y12 on CNS-resident macrophages. As expected, we identified only P2Y12+ cells in epilepsy, as an approximation to the steady-state CNS, whereas glioma and BrM contained both populations, CD49d+ invading MDMs and P2Y12+ CNS-resident microglia.
Even though P2Y12 is one of the markers most commonly used to identify mouse and human microglia, it has been reported that P2Y12 expression is reduced after microglial activation (Haynes et al., 2006). Therefore, to solidify the notion that CD49d expression distinguishes MDMs from microglia even within the TME, we used a preclinical model of glioma combined with genetic fate mapping Sall1 expression, a transcriptional regulator exclusive to microglia within the hematopoietic system (Figure 2E) (Buttgereit et al., 2016; Mrdjen et al., 2018). This model showed that Sall1 expression by microglia remains high even during inflammatory conditions, and this marker is not expressed by any other CNS-invading cell types or BAMs. Mouse
(E) Sall1 and CD49d expression of mouse TAMs/monocytes, overlaid onto the reference map (A).
(F) TAMs/monocytes of glioma and BrM overlaid onto the reference map (A).
(G) Relative frequencies of the three TAM populations among CD64+ cells. Only statistically significant p values were displayed (p < 0.05, Mann-Whitney-Wil-
coxon test, Benjamini-Hochberg correction), whiskers within 1.5x IQR.
See also Figure S3 and Table S1.
Figure 3. TAM Instruction Is Driven by the Type of Tumor Rather Than the Local Tissue Microenvironment
(A) Two-dimensional dot-plots displaying cellular markers among microglia.
(B) Boxplots quantify the mean antigen intensity of data shown in (A), whiskers within 1.5x IQR.
(C) A force-directed graph displays MDMs/monocytes from glioma and BrM. Individual plots are overlaid with the most differentially expressed markers
among MDMs.
CD64+ phagocytes were analyzed using the same reference Scaffold map as for human CyTOF data, excluding markers that are differentially expressed in human and mouse (CD11c) or were not used in both panels (Ly6C and CD14). The map faithfully separated mouse microglia (Sall1 YFP+CD49d-) from other phagocytes (Sall1 YFP-CD49d+) and invading phagocytes and confirmed that CD49d is preferentially expressed among CNS-invading phagocytes (MDMs) (Figure 2E).
Using a combination of experimental approaches, we could reliably identify the origin of each TAM population stemming either from the embryonically derived CNS-resident microglia or from blood-derived MDMs. Glioma TME predominantly contained TAMs of microglial origin, whereas BrM had a higher invasion of MDMs, with a particular extension in the TME of carcinoma BrM (Figures 2F and 2G). MDM accumulation was also observed in the IDH1wt glioma TME, albeit to a lesser extent, whereas the macrophage composition of the TME of IDH1mut gliomas—associated with a better prognosis—did not differ significantly from the epilepsy case (Figures 2F and 2G). Monocytes were present at similarly low frequencies across all samples (Figures 2F and 2G). Overall, the data show that the TME of IDH1mut tumors of neuronal derivation is dominated by TAMs of microglial origin. The more aggressive gliomas (IDH1wt) showed an increased invasion by MDM TAMs, independently of the methylation status of the MGMT promoter (Figure S3B). Along this trajectory, brain tumors of extracranial origin are predominantly invaded by MDMs, which represent the majority of the TAM population.
TAM Instruction Is Driven by the Type of Tumor Rather Than the Local Tissue Microenvironment
TAM plasticity and polarization in the TME across various cancer entities are subjects of intense investigation (Kiss et al., 2018). Here, we explored the expression of a large number of markers used for macrophage phenotyping and polarization (Table S2) on human CNS-resident microglia with single-cell resolution. Of those, for instance, CD169, CD206, and CD209 were virtually absent from CNS-resident tumor-associated microglia (Figure 2B). We also examined the expression level of markers, which were previously used to describe the ‘‘reactivity’’ of CNS-resident microglia in different pathological settings (Figure 3A) (Hopperton et al., 2018; Keren-Shaul et al., 2017; Mrdjen et al., 2018; Walker and Lue, 2015). Microglia in IDH1mut glioma had a comparable expression of HLA-DR to microglia in IDH1wt glioma and BrM (Figure 3B). However, in both IDH1wt gliomas and BrM, but not in IDH1mut gliomas, these cells upregulated CD14 and CD64 (Figure 3B). The ‘‘reactive’’ phenotype of tumor-associated microglia in IDH1wt glioblastomas and BrM was corroborated by co-staining of brain sections with Iba1 and CD163, revealing an amoeboid microglial morphology of these cells. In contrast, microglia in IDH1wt diffuse astrocytoma preserved a more ramified shape (Figure S4A).
Next, we investigated the monocyte-to-macrophage transition, which is initiated by the invasion of the monocytes from the blood into the tissue and is primarily dictated by cues in the local tissue microenvironment (Ginhoux and Guilliams, 2016; Okabe and Medzhitov, 2016). The fundamental question addressed was whether the phenotypic and functional features of MDMs are driven by the nature of the pathogenic insult (i.e., the tumor type growing within the brain or instead by the CNS tissue itself), which is generally deficient in blood-borne leukocytes during the steady state. In order to chart the monocyte/MDM developmental trajectory in relation to either gliomas versus BrM, we built a force-directed graph using Vortex (Figure 3C) (Good et al., 2019; Samusik et al., 2016). The algorithm was applied to a combined CyTOF dataset of CNS-invading TAMs of glioma and BrM TMEs, focusing on the expression level of monocyte/macrophage markers (Figure 3C). The resulting graph depicts a continuous process of monocyte/MDM differentiation, which was characterized by the downregulation of CCR2 and CD33, concomitant with the upregulation of CD163 and Mertk (Figure 3C). Of note, among MDMs, we observed a trifurcation in the developmental trajectory displayed in three branches on the force-directed map, which was driven by the differential expression of CD169, CD206, CD209, CD38, and PD-L1. To infer cell lineages and pseudotimes, we computed the slingshot algorithm (Figure 3D) (Street et al., 2018). The algorithm modeled the developmental trajectory by connecting adjacent clusters, using the monocyte cluster as a starting point to construct branching curves (Figure 3C). To explore different subsets of MDM, we performed FlowSOM analysis using the differentially expressed antigens (Figure 3C). Our analysis revealed the presence of four MDM clusters and one monocyte cluster (Figures 3D and 3E).
Next, we separated the force-directed map of invading MDMs/monocytes for IDH1mut and IDH1wt gliomas versus melanoma and carcinoma BrM to analyze the trajectories and relative frequencies of different MDM subsets (Figure 3F). Importantly, we found that the developmental traces of the monocyte-to-MDM transition is not a random feature across brain tumors, but that each tumor entity clearly dictates the development of its own MDM type with a distinctive and tumor type-specific phenotypic signature. For instance, CD163+ CX3CR1+ CADM1+ MDMs (MDM 4) were found almost exclusively in the IDH1wt glioma TME (Figures 3F and 3G). MDMs expressing CD163, CD206, and CD169 (MDM 2) showed higher relative frequencies in carcinoma BrM, whereas MDMs with higher levels of CD209, CD38, PD-L1, and PD-L2 (MDM 3) were equally infiltrating the melanoma and carcinoma BrM (Figures 3F and 3G). CNS-invading phagocytes in the TME of IDH1mut glioma were predominantly composed of monocytes and low frequencies of MDMs. These
(D–F) FlowSOM-guided metaclustering overlaid on (C). (E) Heatmap displaying the median antigen intensity of markers used to generate (D). (F) A force-directed
graph displays changes in the brain TME (D and F). The overlaid lines represent the results of the slingshot pseudotime analysis.
(G) Bar plots representing the relative frequencies of monocytes and MDMs in the brain TME. The IDH1mut glioma TME was composed of monocytes 74.5% ± 3.3%,
MDM1 13.8% ± 3.3%, MDM2 1.9% ± 0.6%, MDM3 5.7% ± 1.5%, and MDM4 4.1% ± 2.1%; the IDH1wt glioma: monocytes 20.4% ± 21.4%, MDM1 26.9% ±
12.8%, MDM2 17.3% ± 15.5%, MDM3 7.4% ± 10.2%, and MDM4 27.9% ± 20.7%; the melanoma BrM: monocytes 24.8% ± 18.1%, MDM1 22.8% ± 6.9%,
MDM2 17.5% ± 8.2%, MDM3 33.7% ± 27.9%, and MDM4 1.2% ± 1.3%; carcinoma BrM: monocytes 14.2% ± 5.8%, MDM1 26.1% ± 5.9%, MDM2 34.6% ±
7.8%, MDM3 22.7% ± 12.3%, and MDM4 2.4% ± 0.5%.
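The relative frequencies in (G) can be illustrated with a minimal Python sketch. It assumes per-sample cell counts for each subset and summarizes them as mean and sample standard deviation across samples. Whether the caption's ± values are standard deviations or standard errors is not stated here, so the choice of `stdev` is an assumption, and the function names and counts are hypothetical, not data or code from the study.

```python
from statistics import mean, stdev


def relative_frequencies(counts):
    """Convert one sample's per-subset cell counts into percentages."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no cells")
    return {subset: 100.0 * n / total for subset, n in counts.items()}


def summarize(samples):
    """Return {subset: (mean %, spread %)} across samples.

    Spread is the sample standard deviation (an assumption; the
    caption does not say which dispersion measure was used).
    """
    per_sample = [relative_frequencies(s) for s in samples]
    subsets = list(per_sample[0].keys())
    summary = {}
    for subset in subsets:
        values = [p[subset] for p in per_sample]
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[subset] = (mean(values), spread)
    return summary


# Hypothetical counts for two samples (not data from the study).
example_samples = [
    {"monocytes": 750, "MDM1": 150, "MDM2": 100},
    {"monocytes": 700, "MDM1": 200, "MDM2": 100},
]
example_summary = summarize(example_samples)
```

Percentages are computed per sample before averaging, so that samples with very different cell yields contribute equally to the summary.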
Figure 5. Overall Glioma Patient Survival Correlates with the Presence of MDM Signatures
(A) Kaplan-Meier curve shows OS in glioblastoma patients (TCGA-GBM database) with high and low CD163 or CX3CR1 gene expression.
(B–E) Kaplan-Meier survival analysis in patient groups of (B) TCGA-LGG and (C) TCGA-GBM databases with high and low microglial signature. (D and E) Kaplan-
Meier survival analysis in patient groups of (D) TCGA-LGG and (E) TCGA-GBM databases correlated with MDM2/3 signature. (B–E) Heatmaps show the selected
groups of patients according to the gene expression.
See also Figure S4E and Table S1.
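The Kaplan-Meier curves in Figure 5 compare patient groups with high and low expression. As a rough illustration only, the sketch below implements the standard product-limit estimator in plain Python and splits patients at the median expression value. The median cutoff is an assumption (this section does not state how the high and low groups were defined), and the function names are hypothetical.

```python
from statistics import median


def split_by_median(values):
    """Label each value as high (True) or low (False) relative to the median.

    The median cutoff is an illustrative assumption.
    """
    cutoff = median(values)
    return [v > cutoff for v in values]


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve
```

In use, one would call `kaplan_meier` separately on the high and low groups returned by `split_by_median` and compare the two curves, for example with a log-rank test from a statistics library.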
Figure 6. Tregs Accumulation and T Cell Exhaustion Characterize the TME of Brain Metastases
(A) Relative frequencies of the main T cell populations among lymphocytes. Only statistically significant p values were displayed (p < 0.05, Mann-Whitney-
Wilcoxon test, Benjamini-Hochberg correction), whiskers within 1.5x IQR.
(B) One-SENSE analysis comparing the lineage and activation profiles of CD8 T cells.
(C) Representative histograms showing differentially expressed markers on CD8 RM and CD8 EM subsets.
See also Figures S5, S6, and S7A and Table S1.
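Panel (A) reports p values after Benjamini-Hochberg correction. The following Python sketch shows only that correction step, applied to a list of raw p values such as those a Mann-Whitney-Wilcoxon test would produce. The function name and the example p values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values
    # never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values from four comparisons.
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

A comparison would then be displayed only if its adjusted value falls below the 0.05 threshold named in the caption.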
Laurens van der Maaten, 2008). Here, the x-axis represents a naive/memory profile and the y-axis the activation profile, including expression of co-stimulatory and co-inhibitory receptors (Figure 6B). Based on the expression of five markers (x-axis: CD45RA, CD45RO, CCR7, CD127, and CD103), T cells can be differentiated into naive, central memory (CM), effector memory (EM), terminally differentiated effector memory (TEMRA), and non-circulating tissue-resident (RM) T cells (Smolders et al., 2018) (Figures 6B and S6A). The majority of T cells analyzed were memory T cells, without statistically significant variations in terms of relative frequencies among total T cell numbers between the TME of gliomas versus BrM (Figure S6B). However, we observed a positive correlation of CD4 CM (p = 0.021) and CD8 CM (p = 0.077) T cell frequencies with OS/number of follow-up days for patients with IDH1wt glioma (Figure S6C).

Further analysis of additional T cell markers demonstrated a higher expression of co-stimulatory receptors (ICOS, CD27, and CD137), co-inhibitory receptors (2B4, TIGIT, and PD-1), the activation marker CD38, and effector (CD57 and granzyme B) and proliferation (Ki-67) markers among RM and EM CD8 T cells in melanoma BrM (Figures 6B, 6C, and S7A). The same phenotype, albeit with lower frequencies, was identified in carcinoma BrM. Of note, CD8 RM and EM T cells in IDH1mut glioma had lower expression of proliferation and activation markers compared to IDH1wt glioma. The same analysis strategy was applied to CD4 T cells; however, here we did not observe major differences between glioma and BrM.

Altogether, the data revealed the accumulation of Tregs in the TME of IDH1wt gliomas and BrM, with the highest frequencies in the carcinoma BrM. In IDH1wt gliomas, the accumulation of CD4 CM and CD8 CM in the TME may be a positive prognostic indicator of better OS. In addition, the expansion of the analyzed markers revealed unique phenotypic features of EM and RM CD8 T cells. The metastatic TME was composed of activated/exhausted T cells, whereas the glioma samples showed a lower expression of activation markers.

Immature NK Cells Accumulate in Glioblastoma
In view of the potential role of innate lymphoid cells in anti-tumor immunity (Tugues et al., 2019), we next used the lymphoid panel (Table S3) dataset to interrogate NK cells (CD56+CD3-) (Figure 7A). We assigned cells mainly corresponding to the combinatorial expression of CD16 and CD56 and identified two major populations of CD56int/brightCD16- and CD56intCD16+, which correspond to the immature and the high cytotoxic populations of NK cells, respectively (Simoni et al., 2017). Among CD56int/brightCD16- cells, we also found a major population of CD69+CD103+CD56+ cells, which closely resemble intraepithelial ILC1-like cells (ie-ILC1-like cells) (Figure 7A) (Simoni et al., 2017). Next, we compared the frequencies of these three above-described CD56+ subsets in gliomas and BrM of different origins. In the IDH1wt gliomas, we observed the enrichment of immature CD56int/brightCD16- NK cells among lymphocytes, whereas in both the IDH1mut gliomas and BrM, predominantly CD56int/brightCD16+ NK cells (Figures 7B and S7B) accumulated. Further splitting of the IDH1wt glioma cohort according to the methylation status of the MGMT promoter indicated a trend (p = 0.1) toward a higher proportion of CD56int/brightCD16+ NK cells in the unmethylated cases (Figure S7C). We also correlated the frequencies of CD56+ subsets with OS or number of follow-up days in the IDH1wt cohort, which suggested ie-ILC1-like cell accumulation as a potential marker of OS (Figure S7D).

We next characterized the activation status of the three subsets of CD56+ cells, represented in the y-axis of the One-SENSE map (Cheng et al., 2016; Laurens van der Maaten, 2008) (Figure 7C). Detailed analysis of the CD56int/brightCD16+ population across patients revealed a trend toward higher expression of 2B4, CD38, and Ki-67 in BrM (Figures 7C and 7D). The CD56int/brightCD16+ population in IDH1wt gliomas was distinguished by CD57 and TIM3 expression, with the latter also observed in the IDH1mut gliomas. The profile of metastatic and glioma CD56int/brightCD16+ population was relatively homogeneous. Taken together, NK cell profiles also display tumor specificity, highlighted by the proportion of infiltrating cytotoxic and immature NK cells, with the prevalence of the latter in the IDH1wt gliomas. Additionally, we observed heterogeneity of infiltrating CD56int/brightCD16- NK cells between glioma and BrM.

DISCUSSION

In this study, we combined extensive single-cell proteome analysis, immunofluorescence imaging, and genetic fate-mapping to interrogate the leukocyte landscape of the TME in brain tumors. Our analyses revealed major changes in the CNS-resident and invading leukocyte populations, which were dictated by the type of brain tumor. We found resident phagocytes to be abundant in the glioma TME, whereas invading leukocytes dominate the immune landscape of BrM.

Previous studies have shown that TAMs are the largest population of leukocytes in the glioma TME. Despite the correlation of TAMs with clinical prognosis and grade of glioma (Quail and Joyce, 2017; Venteicher et al., 2017), it appears that cellular phenotypes are more indicative of clinical outcomes than the mere number of infiltrating macrophages (Müller et al., 2017; Pyonteck et al., 2013). For instance, blocking the colony-stimulating factor receptor 1 (CSF-1R) resulted in a significant reduction of tumor growth in a preclinical glioma model, which was associated with a “re-education” of TAMs toward a pro-inflammatory tumor-suppressive phenotype (Coniglio et al., 2012; Pyonteck et al., 2013). Despite the general interest in targeting TAMs for anti-glioma therapy, only a few studies performed in preclinical models have taken into account the dual origin of the TAM population (Bowman et al., 2016; Chen et al., 2017). Discrimination of microglia and the blood-derived MDM revealed the predominance of “reactive” microglia-derived TAMs in glioma lesions, which is in line with a recent report by Sankowski et al. (2019). This phenotype likely results from interferon (IFN)-γ produced by TILs (both T and NK cells), a notion that can be further investigated in preclinical mouse models.

Of note, we observed high relative frequencies of MDMs (not measured in Sankowski et al. [2019]) in IDH1wt glioma and BrM. In contrast to microglia frequencies, MDMs significantly correlated with clinical outcome of glioma patients and particularly LGG. Interestingly, CX3CR1+CADM1+ MDMs observed in the IDH1wt glioma TME were phenotypically different from those
identified in BrM. Recently developed preclinical glioma models will likely fuel the study of novel roles as well as unique trajectories of MDMs. For instance, the model mimicking the overproduction of (R)-2-hydroxyglutarate driven by IDH mutation (Amankulor et al., 2017) or those recreating various molecular types of gliomas (Miyai et al., 2017) can be used to further investigate differentially matured MDMs upon extrinsic tumor factors. Another attractive approach to study unique combinations of TAMs for personalized therapy is the engraftment of patient-derived glioblastoma organoids, which represent the molecular heterogeneity among gliomas (Jacob et al., 2020; Mansour et al., 2018).

In the BrM TME, we found a unique subpopulation of MDMs expressing CD206, CD209, CD169, and CD163 as well as high levels of CD38, PD-L1, and PD-L2, which is reminiscent of a tumor population of TAMs described by Ries et al. (2014) in NSCLC. MDMs expressing CD38 were also described in vitro on IFN-γ stimulation, whereas CD38 expression was strongly associated with phagocytosis (Schulz et al., 2019). In our study, we did not observe a clear separation of TAMs based on the M1/M2 model of macrophage polarization (Murray et al., 2014), which underestimates the dynamic nature of pro- or anti-inflammatory properties across macrophages (Xue et al., 2014). Instead, the unique signature of cytokines, growth factors, and mutational landscapes that instruct TAMs in each brain tumor type may trigger the phenotypic differences observed.

In most cases, we found MDMs localized near blood vessels in glioma and BrM. Previous preclinical glioma models showed a close connection of MDMs and glioma stem cells, which were mainly found in perivascular niches and reported to secrete periostin to attract TAMs (Zhou et al., 2015). In turn, TAMs are a rich source of soluble mediators (e.g., interleukin [IL]-6, IL-10, and transforming growth factor-β1 and pleiotrophin) capable of sustaining malignant stem cells (Shi et al., 2017). Regarding BrM, we hypothesize that MDMs localize near blood vessels to establish the metastatic niche. However, further studies using autochthonous mouse models of BrM (An et al., 2017) are required to reveal potential targets for immunotherapy.

Regarding the lymphocyte compartment, our analyses revealed that Tregs preferentially accumulate in BrM rather than in gliomas. Importantly, the presence of PD-L1+ TAMs has been correlated with the Treg frequencies in several solid tissue tumors (Harter et al., 2015) as well as IDH1wt gliomas (Berghoff et al., 2017). In turn, Tregs secrete IL-10, IL-4, and IL-13, which may trigger the development of TAMs with immunosuppressive properties (Mantovani et al., 2017). An increased number of Tregs might also result in the suppression of cytotoxic CD8 T cell responses. Brain tumors establish an immunosuppressive TME that leads to T cell dysfunction (Quail and Joyce, 2017). T cells infiltrating glioblastoma, for instance, were found to express high amounts of multiple immune checkpoints (e.g., PD-1, Lag-3, TIM-3, or TIGIT), which correlated with a loss of effector function (Woroniecka and Fecci, 2018). PD-1 expression has been found in a high percentage of TILs in melanoma BrM (Berghoff et al., 2015). However, we found the greatest changes in T cell activation in BrM, with CD8 TRM (CD103+; CD69+) and TEM (CCR7-; CD45RA-) displaying an activated phenotype characterized by high amounts of both co-stimulatory and co-inhibitory molecules, as well as proliferation markers. How this activated phenotype is translated into certain functional properties is not yet known and will require further investigation. However, the map provided here represents an important resource for the informed design of novel therapeutic avenues.

A similar trend was observed in the NK cell compartment where the proportion of infiltrating cytotoxic and immature NK cells decreased with disease severity. In gliomas, the impaired lymphocyte populations are in line with the TCGA data that characterize IDH1wt gliomas as lymphocyte-depleted and IDH1mut gliomas as immunologically silent (Kiss et al., 2018; Thorsson et al., 2018). In neuro-oncology, the IDH1 status is leveraged as an important classifier (Louis et al., 2016), with IDH1mut patients showing favorable survival (Eckel-Passow et al., 2015; Unruh et al., 2019). It is becoming more evident that IDH1 status is not only prognostically relevant but represents a marker for a distinct disease entity within gliomas. The distinct molecular characteristics of IDH1mut glioma and the distinct clinical behavior are now also well characterized by a distinct TME: immune cells show the least amount of activation across all brain tumor samples.

Taken together, we show that the immune cell compartment of brain TMEs is mainly shaped by the specific tumor type rather than by the CNS as an “immune-privileged niche.” We observed a continuous increase in bone marrow-derived infiltrates, such as MDMs and T cells, along the axis of IDH1mut, IDH1wt gliomas, and BrM, and increasing dominance of CNS-resident cells, such as microglia, in the glioma counterparts. The richness and activation of the BrM TMEs regarding cellular subtypes and frequencies as well as functional states parallel their favorable clinical response to checkpoint inhibitors. In-depth knowledge of the specific immunological TME signatures across brain tumors is a major step forward for the rational design of targeted immunotherapy strategies.

STAR+METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead Contact
  ◦ Materials Availability
  ◦ Data and Code Availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ◦ Human Brain Tissue Samples
  ◦ Animal Models
  ◦ Cell Lines
• METHOD DETAILS
  ◦ Processing of Human Samples for Cytometry Analysis
  ◦ Tissue Collection for Immunofluorescence
  ◦ Orthotopic Glioma Cell Injection
  ◦ In Vivo Bioluminescent Imaging
  ◦ Harvesting and Processing of Mouse Brain Samples
  ◦ Mass-Tag Cellular Barcoding
  ◦ Metal-Isotope-Tagged Antibodies
  ◦ Cell Surface Staining for Cytometry
Conceptualization, B.B., E.F., M.W., and M.C.N.; Methodology, E.F. and K.K.; Software, E.F.; Formal Analysis, E.F. and N.G.N.; Investigation, E.F., K.K., S. Unger, S. Utz, and N.G.N.; Resources, B.B., M.C.N., M.W., and L.R.; Data Curation, E.F., S. Unger, N.G.N., K.K., M.C.N., and E.J.R.; Writing – Original Draft, E.F., B.B., and S.T.; Writing – Review & Editing, E.F., B.B., S.T., K.K., M.C.N., E.J.R., M.W., and M.G.; Visualization, E.F.; Funding Acquisition and Supervision, M.C.N., M.W., and B.B.

DECLARATION OF INTERESTS

The authors declare no competing interests.

Received: November 27, 2019
Revised: March 11, 2020
Accepted: April 28, 2020
Published: May 28, 2020

REFERENCES

Aldape, K., Brindle, K.M., Chesler, L., Chopra, R., Gajjar, A., Gilbert, M.R., Gottardo, N., Gutmann, D.H., Hargrave, D., Holland, E.C., et al. (2019). Challenges to curing primary brain tumours. Nat. Rev. Clin. Oncol. 16, 509–520.
Amankulor, N.M., Kim, Y., Arora, S., Kargl, J., Szulzewsky, F., Hanke, M., Margineantu, D.H., Rao, A., Bolouri, H., Delrow, J., et al. (2017). Mutant IDH1 regulates the tumor-associated immune system in gliomas. Genes Dev. 31, 774–786.
Amit, M., Laider-Trejo, L., Shalom, V., Shabtay-Orbach, A., Krelin, Y., and Gil, Z. (2013). Characterization of the melanoma brain metastatic niche in mice and humans. Cancer Med. 2, 155–163.
An, J., Wang, L., Zhao, Y., Hao, Q., Zhang, Y., Zhang, J., Yang, C., Liu, L., Wang, W., Fang, D., et al. (2017). Effects of FSTL1 on cell proliferation in breast
Buja, A., Swayne, D.F., Littman, M.L., Dean, N., Hofmann, H., and Chen, L. (2008). Data Visualization with Multidimensional Scaling Introduction. J. Comput. Graph. Stat. 17, 444–472.
Bunse, L., Pusch, S., Bunse, T., Sahm, F., Sanghvi, K., Friedrich, M., Alansary, D., Sonner, J.K., Green, E., Deumelandt, K., et al. (2018). Suppression of antitumor T cell immunity by the oncometabolite (R)-2-hydroxyglutarate. Nat. Med. 24, 1192–1203.
Butovsky, O., Jedrychowski, M.P., Moore, C.S., Cialic, R., Lanser, A.J., Gabriely, G., Koeglsperger, T., Dake, B., Wu, P.M., Doykan, C.E., et al. (2014). Identification of a unique TGF-β-dependent molecular and functional signature in microglia. Nat. Neurosci. 17, 131–143.
Buttgereit, A., Lelios, I., Yu, X., Vrohlings, M., Krakoski, N.R., Gautier, E.L., Nishinakamura, R., Becher, B., and Greter, M. (2016). Sall1 is a transcriptional regulator defining microglia identity and function. Nat. Immunol. 17, 1397–1406.
Chen, Z., Feng, X., Herting, C.J., Garcia, V.A., Nie, K., Pong, W.W., Rasmussen, R., Dwivedi, B., Seby, S., Wolf, S.A., et al. (2017). Cellular and molecular identity of tumor-associated macrophages in glioblastoma. Cancer Res. 77, 2266–2278.
Cheng, Y., Wong, M.T., van der Maaten, L., and Newell, E.W. (2016). Categorical Analysis of Human T Cell Heterogeneity with One-Dimensional Soli-Expression by Nonlinear Stochastic Embedding. J. Immunol. 196, 924–932.
Colaprico, A., Silva, T.C., Olsen, C., Garofano, L., Cava, C., Garolini, D., Sabedot, T.S., Malta, T.M., Pagnotta, S.M., Castiglioni, I., et al. (2016). TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Research 44, e71.
Coniglio, S.J., Eugenin, E., Dobrenis, K., Stanley, E.R., West, B.L., Symons, M.H., and Segall, J.E. (2012). Microglial stimulation of glioblastoma invasion involves epidermal growth factor receptor (EGFR) and colony stimulating factor 1 receptor (CSF-1R) signaling. Mol. Med. 18, 519–527.
Mrdjen, D., Hartmann, F.J., and Becher, B. (2017). High Dimensional Cytometry of Central Nervous System Leukocytes During Neuroinflammation. Methods Mol. Biol. 1559, 321–332.
Mrdjen, D., Pavlovic, A., Hartmann, F.J., Schreiner, B., Utz, S.G., Leung, B.P., Lelios, I., Heppner, F.L., Kipnis, J., Merkler, D., et al. (2018). High-Dimensional Single-Cell Mapping of Central Nervous System Immune Cells Reveals Distinct Myeloid Subsets in Health, Aging, and Disease. Immunity 48, 380–395.
Müller, S., Kohanbash, G., Liu, S.J., Alvarado, B., Carrera, D., Bhaduri, A., Watchmaker, P.B., Yagnik, G., Di Lullo, E., Malatesta, M., et al. (2017). Single-cell profiling of human gliomas reveals macrophage ontogeny as a basis for regional differences in macrophage activation in the tumor microenvironment. Genome Biol. 18, 234.
Murray, P.J., Allen, J.E., Biswas, S.K., Fisher, E.A., Gilroy, D.W., Goerdt, S., Gordon, S., Hamilton, J.A., Ivashkiv, L.B., Lawrence, T., et al. (2014). Macrophage activation and polarization: nomenclature and experimental guidelines. Immunity 41, 14–20.
Noble, W.S. (2009). How does multiple testing correction work? Nat. Biotechnol. 27, 1135–1137.
Nowicka, M., Krieg, C., Crowell, H.L., Weber, L.M., Hartmann, F.J., Guglietta, S., Becher, B., Levesque, M.P., and Robinson, M.D. (2017). CyTOF workflow: differential discovery in high-throughput high-dimensional cytometry datasets. F1000Res. 6, 748.
Okabe, Y., and Medzhitov, R. (2016). Tissue biology perspective on macrophages. Nat. Immunol. 17, 9–17.
Pyonteck, S.M., Akkari, L., Schuhmacher, A.J., Bowman, R.L., Sevenich, L., Quail, D.F., Olson, O.C., Quick, M.L., Huse, J.T., Teijeiro, V., et al. (2013). CSF-1R inhibition alters macrophage polarization and blocks glioma progression. Nat. Med. 19, 1264–1272.
Quail, D.F., and Joyce, J.A. (2013). Microenvironmental regulation of tumor progression and metastasis. Nat. Med. 19, 1423–1437.
Quail, D.F., and Joyce, J.A. (2017). The Microenvironmental Landscape of Brain Tumors. Cancer Cell 31, 326–341.
Quail, D.F., Bowman, R.L., Akkari, L., Quick, M.L., Schuhmacher, A.J., Huse, J.T., Holland, E.C., Sutton, J.C., and Joyce, J.A. (2016). The tumor microenvironment underlies acquired resistance to CSF-1R inhibition in gliomas. Science 352, aad3018.
R Development Core Team (2008). R: A language and environment for statistical computing (Vienna, Austria: R Foundation for Statistical Computing).
RStudio Team (2015). RStudio: Integrated Development for R (Boston, MA: RStudio, Inc.).
Ries, C.H., Cannarile, M.A., Hoves, S., Benz, J., Wartha, K., Runza, V., Rey-Giraud, F., Pradel, L.P., Feuerhake, F., Klaman, I., et al. (2014). Targeting tumor-associated macrophages with anti-CSF-1R antibody reveals a strategy for cancer therapy. Cancer Cell 25, 846–859.
Samusik, N., Good, Z., Spitzer, M.H., Davis, K.L., and Nolan, G.P. (2016). Automated mapping of phenotype space with single-cell data. Nat. Methods 13, 493–496.
Sankowski, R., Böttcher, C., Masuda, T., Geirsdottir, L., Sagar, Sindram, E., Seredenina, T., Muhs, A., Scheiwe, C., Shah, M.J., et al. (2019). Mapping microglia states in the human brain through the integration of high-dimensional techniques. Nat. Neurosci. 22, 2098–2110.
Simoni, Y., Fehlings, M., Kløverpris, H.N., McGovern, N., Koo, S.L., Loh, C.Y., Lim, S., Kurioka, A., Fergusson, J.R., Tang, C.L., et al. (2017). Human Innate Lymphoid Cell Subsets Possess Tissue-Type Based Heterogeneity in Phenotype and Frequency. Immunity 46, 148–161.
Smolders, J., Heutinck, K.M., Fransen, N.L., Remmerswaal, E.B.M., Hombrink, P., Ten Berge, I.J.M., van Lier, R.A.W., Huitinga, I., and Hamann, J. (2018). Tissue-resident memory T cells populate the human brain. Nat. Commun. 9, 4593.
Spitzer, M.H., Gherardini, P.F., Fragiadakis, G.K., Bhattacharya, N., Yuan, R.T., Hotson, A.N., Finck, R., Carmi, Y., Zunder, E.R., Fantl, W.J., et al. (2015). IMMUNOLOGY. An interactive reference framework for modeling a dynamic immune system. Science 349, 1259425.
Street, K., Risso, D., Fletcher, R.B., Das, D., Ngai, J., Yosef, N., Purdom, E., and Dudoit, S. (2018). Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19, 477.
Takasato, M., Osafune, K., Matsumoto, Y., Kataoka, Y., Yoshida, N., Meguro, H., Aburatani, H., Asashima, M., and Nishinakamura, R. (2004). Identification of kidney mesenchymal genes by a combination of microarray analysis and Sall1-GFP knockin mice. Mech. Dev. 121, 547–557.
Takenaka, M.C., Gabriely, G., Rothhammer, V., Mascanfroni, I.D., Wheeler, M.A., Chao, C.C., Gutiérrez-Vázquez, C., Kenison, J., Tjon, E.C., Barroso, A., et al. (2019). Control of tumor-associated macrophages and T cells in glioblastoma via AHR and CD39. Nat. Neurosci. 22, 729–740.
Tawbi, H.A., Forsyth, P.A., Algazi, A., Hamid, O., Hodi, F.S., Moschos, S.J., Khushalani, N.I., Lewis, K., Lao, C.D., Postow, M.A., et al. (2018). Combined Nivolumab and Ipilimumab in Melanoma Metastatic to the Brain. N. Engl. J. Med. 379, 722–730.
Thorsson, V., Gibbs, D.L., Brown, S.D., Wolf, D., Bortone, D.S., Ou Yang, T.-H., Porta-Pardo, E., Gao, G.F., Plaisier, C.L., Eddy, J.A., et al.; Cancer Genome Atlas Research Network (2018). The Immune Landscape of Cancer. Immunity 48, 812–830.
Tugues, S., Ducimetiere, L., Friebel, E., and Becher, B. (2019). Innate lymphoid cells as regulators of the tumor microenvironment. Semin. Immunol. 41, 101270.
Uhl, M., Aulwurm, S., Wischhusen, J., Weiler, M., Ma, J.Y., Almirez, R., Mangadu, R., Liu, Y.-W., Platten, M., Herrlinger, U., et al. (2004). SD-208, a Novel Transforming Growth Factor β Receptor I Kinase Inhibitor, Inhibits Growth and Invasiveness and Enhances Immunogenicity of Murine and Human Glioma Cells In vitro and In vivo. Cancer Res. 64, 7954–7961.
Unruh, D., Zewde, M., Buss, A., Drumm, M.R., Tran, A.N., Scholtens, D.M., and Horbinski, C. (2019). Methylation and transcription patterns are distinct in IDH mutant gliomas compared to other IDH mutant cancers. Sci. Rep. 9, 8946.
Van Gassen, S., Callebaut, B., Van Helden, M.J., Lambrecht, B.N., Demeester, P., Dhaene, T., and Saeys, Y. (2015). FlowSOM: Using self-organizing maps for visualization and interpretation of cytometry data. Cytometry A 87, 636–645.
Van Hove, H., Martens, L., Scheyltjens, I., De Vlaminck, K., Pombo Antunes, A.R., De Prijck, S., Vandamme, N., De Schepper, S., Van Isterdael, G., Scott, C.L., et al. (2019). A single-cell atlas of mouse brain macrophages reveals unique transcriptional identities shaped by ontogeny and tissue environment. Nat. Neurosci. 22, 1021–1035.
Venteicher, A.S., Tirosh, I., Hebert, C., Yizhak, K., Neftel, C., Filbin, M.G., Hovestadt, V., Escalante, L.E., Shaw, M.L., Rodman, C., et al. (2017). Decoupling
STAR+METHODS
Continued
REAGENT or RESOURCE SOURCE IDENTIFIER
anti-human CCR2 (K036C2), purified Biolegend Cat# 357202;RRID: AB_2561851
anti-human CCR4 (205410), Gd-158 Fluidigm Cat# 3158006A, RRID: AB_2687647
anti-human CCR6 (11A9), Pr-141 Fluidigm Cat# 3141014A, RRID: N/A
anti-human CCR7 (G043H7), Er-167 Fluidigm Cat# 3167009A, RRID: N/A
anti-human CD103 (Ber-ACT8), Eu-151 Fluidigm Cat# 3151011B, RRID: AB_2756418
anti-human CD127 (A019D5), Ho-165 Fluidigm Cat# 3165008B, RRID: N/A
anti-human CD137 (4B4-1), purified Biolegend Cat# 309802, RRID: AB_314781
anti-human CD16 (3G8), Bi-209 Fluidigm Cat# 3209002B, RRID: AB_2756431
anti-human CD19 (HIB19), Nd-142 Fluidigm Cat# 3142001B, RRID: AB_2651155
anti-human CD25 (2A3), Sm-149 Fluidigm Cat# 3149010B, RRID: AB_2756416
anti-human CD27 (L127), Gd-155 Fluidigm Cat# 3155001B, RRID: AB_2687645
anti-human CD28 (CD28.2), purified Biolegend Cat# 302902, RRID: AB_314304
anti-human CD3 (UCHT1), Sm-154 Fluidigm Cat# 3154003B, RRID: AB_2687853
anti-human CD38 (HIT2), purified Biolegend Cat# 303502, RRID: AB_314354
anti-human CD4 (RPA-T4), Nd-145 Fluidigm Cat# 3145001B, RRID: AB_2661789
anti-human CD45 (HI30), purified Biolegend Cat# 304002, RRID: AB_314390
anti-human CD45RA (HI100), Eu-153 Fluidigm Cat# 3153001B, RRID: AB_2802108
anti-human CD45RO (UCHL1), Dy-164 Fluidigm Cat# 3164007B, RRID: AB_2811092
anti-human CD45RO (UCHL1), purified Biolegend Cat# 304202, RRID: AB_314418
anti-human CD56 (NCAM16), purified BD Cat# 559043, RRID: AB_397180
anti-human CD57 (hCD57), Yb-172 Fluidigm Cat# 3172009B, RRID: N/A
anti-human CD69 (FN50), Nd-144 Fluidigm Cat# 3144018, RRID: AB_2687849
anti-human CD8 (RPA-T8), Nd-146 Fluidigm Cat# 3146001B, RRID: AB_2687641
anti-human CD90 (5E10), Tb-159 Fluidigm Cat# 3159007B, RRID: N/A
anti-human CD95 (DX2), Dy-164 Fluidigm Cat# 3164008B, RRID: N/A
anti-human CRTH2 (BM16), purified Biolegend Cat# 350102, RRID: AB_10639863
anti-human CTLA-4 (14D3), Dy-161 Fluidigm Cat# 3161004B, RRID: AB_2687649
anti-human CXCR3 (G025H7), Gd-156 Fluidigm Cat# 3156004B, RRID: AB_2687646
anti-human Granzyme B (GB11), Yb-171 Fluidigm Cat# 3171002B, RRID: AB_2687652
anti-human HLA-DR (TU36), Nd-150 Fluidigm Cat# 3150028B, RRID: N/A
anti-human ICOS (C398.4A), purified Biolegend Cat# 313502, RRID: AB_416326
anti-human Ki-67 (B56), Er-168 Fluidigm Cat# 3168007B, RRID: AB_2800467
anti-human KLRG1 (14C2A07), purified Biolegend Cat# 368602, RRID: AB_2566256
anti-human LAG-3 (11C3C65), Er-168 Fluidigm Cat# 3165037B, RRID: AB_2810971
anti-human Nkp44 (P44-8), purified Biolegend Cat# 325102, RRID: AB_756094
anti-human CD134 (OX-40; Ber-ACT35 (ACT35)), purified Biolegend Cat# 350002, RRID: AB_10639951
anti-human PD-1 (EH12.2H7), Lu-175 Fluidigm Cat# 3175008, RRID: AB_2687629
anti-human SOX2 (14A6A34), purified Biolegend Cat# 656102 ; RRID: AB_2562246
anti-human TCRgd (11F8), Sm-152 Fluidigm Cat# 3152008B, RRID: AB_2687643
anti-human TIGIT (MBSA43), Tm-169 eBioscience Cat# 16-9500-82, RRID: AB_10718831
anti-human Tim3 (F38-2E2), Biotin Miltenyi Cat# 130-098-945, RRID: N/A
anti-human Tim3 (F38-2E2), VioBright FITC Miltenyi Cat# 130-104-646, RRID: N/A
Flow cytometry panel, human brain samples
anti-human CCR2 (K036C2), BV605 Biolegend Cat# 357213; RRID:AB_2562702
anti-human CD11b (ICRF-44), FITC Biolegend Cat# 301330 ; RRID: AB_2561703
anti-human CD11c (B-ly6), BV570 BD Cat# 624298; RRID: N/A
anti-human CD123 (6H6), BV711 Biolegend Cat# 306029; RRID: AB_2566353
anti-human CD14 (6H6), BUV737 BD Cat# 564444; RRID:AB_2744285
anti-human CD141 (1A4), BB700 BD Cat# 742245; RRID:AB_2740668
anti-human CD16 (3G8), BUV496 BD Cat# 321102; RRID: N/A
anti-human CD163 (GHI/61), BV650 BD Cat# 563888; RRID:AB_2738468
anti-human CD169 (7-239), PE/Dazzle 594 Biolegend Cat# 346015; RRID:AB_2750265
anti-human CD19 (REA675), APC-Vio770 Miltenyi Cat# 130-110-252; RRID: N/A
anti-human CD1c (F10/21A3), BB660 BD Cat# 624295; RRID: N/A
anti-human CD206 (15-2), BV421 Biolegend Cat# 321126; RRID: AB_2563839
anti-human CD235A (REA175), APC-Vio770 Miltenyi Cat# 130-100-264; RRID:AB_2656505
anti-human CD273 (MIH18), AF700 BD Cat# 565189; RRID:AB_2739102
anti-human CD274 (MIH1), PE-Cy7 BD Cat# 558017; RRID:AB_396986
anti-human CD3 (REA613), APC-Vio770 Miltenyi Cat# 130-109-543; RRID:AB_2657072
anti-human CD33 (WM53), BUV395 BD Cat# 740293; RRID:AB_2740032
anti-human CD38 (HIT2), BV785 Biolegend Cat# 303529; RRID: AB_2561368
anti-human CD45 (HI-30), BUV805 BD Cat# 564915; RRID: N/A
anti-human CD45RA (MEM-56), PE-Cy5.5 Life Technologies, Thermo Fisher Scientific Cat# MHCD45RA18; RRID:AB_10372221
anti-human CD49d (9F10), Biotin Biolegend Cat# 304334; RRID: AB_2749896
anti-human CD56 (HCD56), APC-C7 Biolegend Cat# 318332; RRID: AB_10896424
anti-human CD66b (G10F5), BB790-P BD Customized
anti-human CD86 (IT2.2), PE-Cy5 Biolegend Cat# 305407; RRID: AB_314527
anti-human CX3CR1 (2A9-1), BV480 BD Cat# 746723; RRID:AB_2743987
anti-human HLA-DR (G46-6), BUV661 BD Cat# 565073; RRID:AB_2722500
anti-human MERTK (125518), APC R&D Cat# FAB8912A; AB_357213
anti-human P2Y12 (S16001E), PE Biolegend Cat# 392104; RRID: AB_2716007
Flow cytometry panel, mouse brain samples
anti-mouse MHCII (M5/114.15.2), BB700 BD Cat# 746197; RRID:AB_2743544
anti-mouse CCR2 (SA203G11), BV650 BD Cat# 747968; RRID: N/A
anti-mouse CD11b (M1/70), BUV737 BD Cat# 564443; RRID:AB_2738811
anti-mouse CD11c (N418), BV570 Biolegend Cat# 117331; RRID: AB_10900261
anti-mouse CD163 (TNKUPJ), biotin eBioscience Cat# 14-1631-82; RRID:AB_2716934
anti-mouse CD206 (C068C2), AF700 Biolegend Cat# 141734; RRID: AB_2629637
anti-mouse CD209a (MMD3), PE Biolegend Cat# 833004; RRID: AB_2721637
anti-mouse CD38 (90), PE-Dazzle594 Biolegend Cat# 102729; RRID: AB_2632890
anti-mouse CD45 (30-F11), BUV395 BD Cat# 564279; RRID:AB_2651134
anti-mouse CD49d (R1-2), PE-Cy7 BioLegend Cat# 103618; RRID:AB_2563700
anti-mouse CD64 (X54-5/7.1), BV421 Biolegend Cat# 139309; RRID: AB_2562694
anti-mouse CX3CR1 (SA011F11), BV510 Biolegend Cat# 149025; RRID: AB_2565707
anti-mouse F4/80 (BM8), PE-Cy5 Biolegend Cat# 123112; RRID: AB_893482
anti-mouse Ly6C (HK1.4), BV711 Biolegend Cat# 128037; RRID: AB_2562630
anti-mouse Ly6G (1A8), BUV563 BD Cat# 565707; RRID:AB_2739334
anti-mouse MerTK (DS5MMER), SuperBright 780 eBioscience Cat# 78-5751-82; RRID: AB_2762814
anti-mouse PD-L1 (10F.9G2), BV605 Biolegend Cat# 124321; RRID: AB_2563635
anti-mouse Siglec1 (3D6.112), APC Biolegend Cat# 142417; RRID: AB_2565640
Streptavidin, BUV661 BD Customized
REAGENT or RESOURCE SOURCE IDENTIFIER
Immunohistochemistry
CD163 (163C01/10D6) NeoMarkers / Lab Vision Corporation Cat# MS-1103-S, RRID:AB_64138
Iba1 Wako Chemicals Cat# 019-19741, RRID:AB_839504
Immunofluorescence
VE-cadherin (C-19) (polyclonal) Santa Cruz Cat# sc-6458, RRID:AB_2077955
anti-human P2RY12 (S16001E), PE Biolegend Cat# 392104; RRID:AB_2716007
anti-human CD163 (GHI/61), PE Biolegend Cat# 333606; RRID:AB_1134002
anti-human CD169 (7-239), PE-Dazzle594 Biolegend Cat# 346015; RRID:AB_2750265
anti-human CD206 (15-2), PE-Dazzle594 Biolegend Cat# 321106; RRID:AB_571911
anti-human CD209 (DCS-8C1), PE Biolegend Cat# 343004; RRID:AB_2074328
Biological Samples
Brain tumor (glioma and brain metastases) and non-tumor brain tissue (epilepsy) samples University Hospital Zurich N/A
Chemicals, Peptides, and Recombinant Proteins
16% Paraformaldehyde aqueous solution Electron Microscopy Sciences/LucernaChem Cat# 15710; RRID: N/A
2-methylbutane Sigma Aldrich Cat# M32631-25.L; RRID: N/A
Antibody Stabilizer PBS Candor Bioscience Cat# 131 050; RRID: N/A
Antifading Mounting Medium with DAPI Dianova Cat# SCR-038448; RRID: N/A
Bambanker LubioScience GmbH Cat# 523303 (BB02); RRID: N/A
Benzonase nuclease Sigma-Aldrich Cat# E1014-25KU; RRID: N/A
Bovine Serum Albumin (BSA) Sigma-Aldrich Cat# B2064; RRID: N/A
Bromoacetamidobenzyl-EDTA (BABE) Dojindo Laboratories Cat# B437-10; RRID: N/A
Cell-ID™ Intercalator-Ir Fluidigm Cat# 201192B; RRID: N/A
Cell-ID Cisplatin Fluidigm Cat# 201064; RRID: N/A
CO2-Independent Medium Thermo Fisher Scientific Cat# 18045-070; RRID: N/A
Collagenase from Clostridium histolyticum, type IV Sigma-Aldrich Cat# C5138; RRID: N/A
Cryo Embedding Medium Medite Cat# 41-3011-00; RRID: N/A
Dead Cell Removal Kit Miltenyi Cat# 130-090-101; RRID: N/A
Deoxyribonuclease I from bovine pancreas Sigma-Aldrich Cat# DN25-1G; RRID: N/A
D-Luciferin Perkin Elmer Cat# 122799; RRID: N/A
DMSO Sigma-Aldrich Cat# D2438; RRID: N/A
DMSO; Dimethyl Sulfoxide, anhydrous, 99.7% Fisher BioReagents Cat# BP231-1; RRID: N/A
EDTA StemCell Technologies, Inc Cat# EDS-100G; RRID: N/A
EQ Four Element Calibration Fluidigm Cat# 201078; RRID: N/A
Formaldehyde 4.0% PanReac Cat# 252931.1211; RRID: N/A
Foxp3 / Transcription Factor Staining Buffer Set eBioscience Cat# 00-5523-00; RRID: N/A
HBSS ThermoFisher Scientific Cat# 14175095; RRID: N/A
Human TruStain FcX Biolegend Cat# 422302; RRID:AB_2818986
Indium (115In) Trace Sciences International N/A
Iridium (191Ir, 193Ir) Fluidigm Cat# 201192A; RRID: N/A
Isoflurane Minrad N/A
Maleimido-mono-amide-DOTA (mDOTA) Macrocyclics Cat# B-272; RRID: N/A
Maxpar X8 Multimetal Labeling Kit Fluidigm Cat# 201300; RRID: N/A
Maxpar Fix and Perm Buffer Fluidigm Cat# 201067; RRID: N/A
Normal goat serum ThermoFisher Scientific Cat# PCN5000; RRID: N/A
Palladium (104Pd,105Pd, 106Pd, 108Pd, 110Pd) Trace Sciences International N/A
Percoll GE Cat# P4937; RRID: N/A
Percoll Sigma-Aldrich Cat# GE17-0891; RRID: N/A
Phosphate-buffered saline Homemade N/A
Phosphate-buffered saline, DPBS, no calcium, no magnesium Life Technologies Cat# 14190094; RRID: N/A
RPMI 1640 Bioswisstec; Seraglob Cat# M 3413; RRID: N/A
RPMI 1640 Medium, HEPES, no glutamine Life Technologies Cat# 42401042; RRID: N/A
Saponin Sigma-Aldrich Cat# S7900; RRID: N/A
Sudan black B Sigma-Aldrich Cat# 199664; RRID: N/A
Deposited Data
TCGA-LGG - Harmonized The Cancer Genome Atlas https://portal.gdc.cancer.gov - mRNA gene quantification HTSeq - Counts
TCGA-GBM - Legacy The Cancer Genome Atlas https://portal.gdc.cancer.gov/legacy-archive - mRNA gene expression and quantification HT_HG-U133A
TCGA-GBM - Harmonized The Cancer Genome Atlas https://portal.gdc.cancer.gov - mRNA gene quantification HTSeq - Counts
Mass- and flow cytometry data this study https://data.mendeley.com/datasets/jk8c3c3nmz/draft?a=c0a9d8dc-8ac2-4942-baf9-208de7a8c310
Experimental Models: Cell Lines
GL-261 cells A. Fontana, Experimental Immunology, University of Zurich, Zurich, Switzerland N/A
Experimental Models: Organisms/Strains
Sall1CreER/+ Ryuichi Nishinakamura (Kumamoto University) RRID:MGI:4818961
R26YFP The Jackson Laboratory RRID:IMSR_JAX:006148
Software and Algorithms
MATLAB R2016a N/A https://www.mathworks.com/
Normalizer Finck et al., 2013 https://github.com/nolanlab/bead-normalization/releases
FlowJo V10.6.1.1 Tree Star https://www.flowjo.com/
R version 3.5 R Development Core Team, 2008 https://www.r-project.org/
RStudio RStudio Team, 2015 https://www.rstudio.com/
FlowSOM Van Gassen et al., 2015 https://github.com/SofieVG/FlowSOM
Circlize Gu et al., 2014 https://cran.r-project.org/web/packages/circlize/index.html
MCF-data-analysis Hartmann et al., 2016 https://github.com/hartmannfj/MCF-data-analysis
CyTOF workflow Nowicka et al., 2017 https://f1000research.com/articles/6-748#
UMAP McInnes et al., 2018 https://github.com/lmcinnes/umap
SCAFFoLD Spitzer et al., 2015 https://github.com/nolanlab/scaffold
VorteX Samusik et al., 2016 https://github.com/nolanlab/vortex/wiki/Getting-Started
Slingshot Street et al., 2018 https://bioconductor.org/packages/release/bioc/html/slingshot.html
t-SNE Laurens van der Maaten, 2008 https://github.com/jkrijthe/Rtsne
One-SENSE Cheng et al., 2016 N/A
flowStats Hahne et al., 2020 https://www.bioconductor.org/packages/release/bioc/html/flowStats.html
pheatmap Kolde, 2019 https://cran.r-project.org/web/packages/pheatmap/index.html
dplyr Wickham et al., 2019a https://cran.r-project.org/web/packages/dplyr/index.html
ggplot2 Wickham et al., 2019b https://cran.r-project.org/web/packages/ggplot2/index.html
gplots Warnes et al., 2019 https://cran.r-project.org/web/packages/gplots/index.html
Hmisc Harrell, 2020 https://cran.r-project.org/web/packages/Hmisc/index.html
flowWorkspaceData Finak, 2018 N/A
flowCore Ellis et al., 2019 N/A
TCGAbiolinks Colaprico et al., 2016 https://bioconductor.org/packages/release/bioc/html/TCGAbiolinks.html
Gephi Bastian et al., 2009 https://gephi.org/
Fiji Schindelin et al., 2012 https://imagej.net/Fiji
Living Image 2.5 Caliper Life Sciences N/A
Leica Bond III N/A https://www.leicabiosystems.com
Adobe Illustrator CS6 Adobe https://www.adobe.com/ch_de/products/illustrator.html
Other
gentleMACS Octo Dissociator with Heaters Miltenyi Biotec Cat# 130-096-427
Bone wax Aesculap; B. Braun N/A
FACSymphony BD N/A
Gentle MACS C-tubes Miltenyi Biotec Cat# 130-096-334
Hamilton 75N syringe Sigma-Aldrich Cat# 28613-U
Hamilton syringe 5 μl Sigma-Aldrich Cat# 26286
Helios CyTOF2 Fluidigm N/A
Hyrax C60 Cryostat Zeiss N/A
MACS columns MS Miltenyi Biotec Cat# 130-042-201
Microinjection pump (UMP-3) World Precision Instruments N/A
Olympus IX81 Olympus N/A
Stereotactic frame David Kopf Instruments N/A
Tissue glue Indermil; Henkel N/A
Xenogen IVIS 100 Caliper Life Sciences N/A
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources should be directed to the Lead Contact, Burkhard Becher (becher@immunology.uzh.ch).
Materials Availability
This study did not generate new unique reagents.
Animal Models
Mice were bred in house (see Key Resources Table): R26YFP (de Boer et al., 2003). Sall1CreER were kindly provided by R. Nishina-
kamura (Kumamoto University) (Inoue et al., 2010; Takasato et al., 2004). All ‘Cre’ and ‘CreER’ strains were used as heterozygotes.
Six-week-old mice (female and male) were used for all experiments of the glioma preclinical model. All animal experiments performed
in this study were approved by the Swiss Veterinary Office. All mice were on a C57BL/6 background and kept in individually ventilated
cages under specific-pathogen-free conditions. Animals were monitored once a week to assess weight loss and physical/neurolog-
ical abnormalities. In vivo measurements to assess tumor growth were also performed once a week.
Cell Lines
GL-261 cells, which are syngeneic in C57BL/6 mice, were stably transfected with pGL3-ctrl and pGK-Puro (Promega) and selected
with puromycin (Sigma-Aldrich) to generate luciferase-stable GL-261 cells. A single clone was isolated by limiting dilution and
passaged in vivo by intracranial tumor inoculation. Subsequently, cells were transfected with pCEP4-mIgG3, pCEP4-mIl-
12mIgG3, or pCEP4-mIl23mIgG3, and cytokine production was detected by ELISA and RT-PCR, as previously described (Eisenring
et al., 2010). SMA-560 spontaneous murine astrocytoma cells were characterized previously (Uhl et al., 2004).
METHOD DETAILS
1.5 mm lateral and 1 mm frontal from the bregma. A 5 μl syringe (Hamilton; Sigma-Aldrich) was inserted to a depth of 4 mm below
the skull and retracted 1 mm, forming a reservoir. Using a microinjection pump (UMP-3; World Precision Instruments Inc.), 5 × 10^4
GL-261 cells were injected in a volume of 2 μl at 1 μl/minute. After the needle had rested for 2 minutes, it was retracted at a speed of
1 mm/minute. The injection hole was closed with bone wax (Aesculap; B. Braun), and the scalp wound was sealed with tissue
glue (Indermil; Henkel).
Every animal was assigned an ID code. A ‘‘C’’ in the animal ID marked the control animals, which did not receive an injection of tumor
cells but had Sall1/YFP expression. Samples LH60, LV57, L57, RV58, RH57, RH58 and RV57 developed a large tumor, and the rest had
an intermediate to small tumor. Samples RV58, LH60, L57 and RH60 were wild-type control animals (YFP-) that received an injection
of tumor cells.
Metal-Isotope-Tagged Antibodies
All anti-human antibodies, with the corresponding clone and tagged metal isotope for mass cytometry analysis, are listed in Tables S2
and S3 and the Key Resources Table. Antibodies were either purchased preconjugated to metal isotopes from Fluidigm or purchased
from commercial suppliers in purified form and conjugated in house using the Maxpar X8 chelating polymer kit (Fluidigm) according
to the manufacturer’s instructions.
Immunohistochemistry
Immunohistochemistry was performed on 4-μm-thick tissue sections. Double immunostaining was performed for Iba1 (Wako Chemicals
USA, 019-19741; pretreatment with TrisEDTABorat antigen retrieval for 32 minutes, followed by 32 minutes of incubation at a dilution
of 1/1000; OptiView DAB detection kit [Ventana]) on an automated Ventana Benchmark Ultra, and for CD163 (NeoMarkers / Lab Vision
Corporation, MS-1103-S; prediluted, 32 minutes of incubation; Bond Polymer Refine Detection kit) on a Leica Bond III. In
total, we analyzed 27 glioma, 7 BrM, and 2 epilepsy samples from the CyTOF cohort.
Immunofluorescence
Frozen tissues were cryosectioned (10 μm thick) using a Hyrax C60 Cryostat (Zeiss) and stored at −20°C. Sections were fixed with
4% PFA (PanReac), washed in PBS, and incubated with a blocking solution consisting of PBS supplemented with 1% BSA (Sigma-
Aldrich) and 0.3% Triton X-100 (Sigma-Aldrich). Subsequently, sections were incubated with the following primary antibodies (diluted
in blocking solution) at 4°C overnight: rabbit anti-Iba1 antibody (Wako; polyclonal; 1:500) and goat anti-VE-cadherin antibody (Santa
Cruz; polyclonal; 1:00). Sections were then washed with blocking solution and incubated with AF647-labeled donkey anti-rabbit and
AF488-labeled donkey anti-goat secondary antibodies (Life Technologies; 1:500) and one of the following directly labeled antibodies
(all diluted in blocking solution) at room temperature for 2 h: P2RY12-PE (Biolegend; clone S16001E; 1:100), CD163-PE (Biolegend;
clone GHI/61; 1:200), CD169-PE (Biolegend; clone 7-239; 1:100), CD206-PE-Dazzle594 (Biolegend; clone 15-2; 1:100), or CD209-PE
(Biolegend; clone DCS-8C1; 1:100). Sections were washed with blocking solution and incubated at room temperature for 30 minutes
in Sudan black B (Sigma-Aldrich) dissolved in 70% ethanol to reduce tissue autofluorescence. Finally, sections were washed with HBSS
(ThermoFisher Scientific) and mounted with Immunoselect Antifading Mounting Medium with DAPI (Dianova). Fluorescence
photomicrographs were captured with an Olympus IX81 microscope using the 40× objective, and images were processed with Fiji
software (Schindelin et al., 2012). A Gaussian filter (σ = 1) was applied to the P2RY12 image before merging with the other channels.
into RStudio using the R packages ‘‘flowCore’’ and ‘‘flowWorkspaceData’’ (R Foundation for Statistical Computing) (Ellis et al.,
2019; Finak, 2018). Before automated high-dimensional data analysis, the mass cytometry data were transformed using an inverse
hyperbolic sine (arcsinh) function with a cofactor between 5 and 60 (Bendall et al., 2011).
For flow cytometry data, the compensation matrix was corrected using FlowJo software (Tree Star). Afterwards, live, single,
CD45-positive, compensated cells were exported and imported into RStudio. Before automated high-dimensional data analysis,
flow cytometry data were transformed using an inverse hyperbolic sine (arcsinh) function with a cofactor between
300 and 600.
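The arcsinh transformation described above can be sketched in a few lines. This is a minimal illustration in Python rather than the authors' R code; the function name `arcsinh_transform` and the example intensities are hypothetical, and the cofactors are simply the lower ends of the ranges quoted in the text.

```python
import math

def arcsinh_transform(values, cofactor):
    """Return asinh(x / cofactor) for each raw intensity x."""
    if cofactor <= 0:
        raise ValueError("cofactor must be positive")
    return [math.asinh(x / cofactor) for x in values]

# Hypothetical raw intensities for one marker (not data from the study).
raw = [0.0, 5.0, 50.0, 500.0]

# Assumed cofactors: lower ends of the quoted ranges (5-60 and 300-600).
mass_like = arcsinh_transform(raw, cofactor=5)
flow_like = arcsinh_transform(raw, cofactor=300)
```

A smaller cofactor keeps a narrower band around zero approximately linear, whereas a larger cofactor widens that band, which is one common reason for choosing much larger values for fluorescence data than for mass cytometry data.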
Additionally, all cytometry data were normalized between 0 and 1 to the 99.999th percentile of the merged sample in each
batch. To control for batch effects, we included the same clinical sample in both acquisition rounds. For the mass cytometry data,
the marker expression distributions were compared between the two acquisition batches using the R package ‘‘flowStats’’ (Hahne
et al., 2020).
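One way to read the percentile normalization above is sketched below. It is an illustration under stated assumptions, not the authors' implementation: the percentile is taken to be 99.999, zero is assumed to be the lower bound, values above the percentile are clipped to 1, and the helper names `percentile` and `normalize_to_percentile` are hypothetical.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("empty input")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def normalize_to_percentile(values, q=99.999):
    """Scale values to [0, 1], using the q-th percentile as the upper bound."""
    upper = percentile(sorted(values), q)
    if upper <= 0:
        return [0.0 for _ in values]
    # Assumption: values above the percentile are clipped to 1, below 0 to 0.
    return [min(max(v / upper, 0.0), 1.0) for v in values]
```

Using a high percentile instead of the maximum keeps a handful of extreme events from compressing the rest of the marker's range.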
Survival Analysis
For overall survival (OS) data, we used The Cancer Genome Atlas (TCGA) data from 164 (harmonized data, RNA-seq) and 558 (legacy
data, Affymetrix) glioblastoma samples (TCGA-GBM). To analyze OS in low-grade gliomas (TCGA-LGG), we used 516 samples
(harmonized data, RNA-seq). To obtain the signature of genes associated with outcome, data were extracted using TCGAbiolinks
in RStudio together with the gene list obtained from CyTOF. For harmonized data (TCGA-GBM/LGG), genes were plotted as a heatmap
and selected according to high or low expression (gene groups). For legacy data, median levels were used to segregate cancer patients
according to OS outcome.
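The median-based segregation mentioned for the legacy data can be illustrated with a short sketch. The function name `split_by_median`, the patient labels, and the expression values are hypothetical, and assigning values equal to the median to the "low" group is an assumption the text does not specify.

```python
from statistics import median

def split_by_median(expression_by_patient):
    """Assign each patient to 'high' or 'low' relative to the cohort median."""
    cutoff = median(expression_by_patient.values())
    # Assumption: values exactly at the median fall into the 'low' group.
    return {
        patient: ("high" if value > cutoff else "low")
        for patient, value in expression_by_patient.items()
    }

# Hypothetical patients and expression values (illustrative only).
groups = split_by_median({"P1": 2.0, "P2": 7.5, "P3": 4.1, "P4": 9.3})
```

The resulting two groups would then be compared with a survival method of choice; that step is outside the scope of this sketch.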
Statistical Analysis
P values were calculated to compare the relative frequencies of leukocytes or the median antigen expression of immune cell types
between glioma IDH1mut versus IDH1wt versus melanoma BrM versus carcinoma BrM using nonparametric Mann-Whitney-Wilcoxon
tests and were controlled for multiple testing using the Benjamini-Hochberg procedure (Field et al., 2013; Noble, 2009). The relative
frequencies of individual patients’ leukocyte subsets from the first and second mass cytometry batches were combined to perform
the statistical analysis. P values below 0.05 were considered statistically significant and are displayed on the corresponding
graph. In Figures 1E, S2C, S3B, and S5E, error bars define an interval of max/min value ± SD, and the horizontal line indicates the
mean value. In Figures 2G, 6A, 7B, 7D, S5D, S6B, and S7A–S7C, boxplots represent the interquartile range (IQR) 50% and whiskers 25%.
The listed figures showing the relative frequencies of leukocytes or the median antigen expression of immune cell types were generated
using the R package ggplot2 (Wickham et al., 2019b). The Pearson correlation matrix between the relative frequencies of immune
populations was calculated in the R environment (‘‘Hmisc’’ R package) and includes p values (p) and correlation coefficients (R)
(Harrell, 2020). Correlations were considered statistically significant if the p value was below 0.05 and the R value was below −0.6
or above 0.6, and they were visualized using the R package ‘‘circlize’’ (Gu et al., 2014).
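For readers unfamiliar with the Benjamini-Hochberg correction named above, the adjustment can be sketched as follows. This is a generic textbook version of the step-up procedure in Python, not the authors' R code; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values from four pairwise comparisons.
p = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(p)
significant = [q < 0.05 for q in adj]
```

In this toy example only the smallest p-value remains below 0.05 after adjustment, which shows how comparisons that look significant individually can drop out once multiple testing is taken into account.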
Supplemental Figures
Figure S1. Mass Cytometry Analysis Reveals Unique Changes in the Leukocyte Composition of Brain Tumors, Related to Figure 1
Heatmap of the mean expression of all markers in the myeloid panel, calculated across CD45+ leukocytes (value range 0-1 post-transformation/normalization).
Figure S2. Mass Cytometry Analysis Reveals Unique Changes in the Leukocyte Composition of Brain Tumors, Related to Figure 1
(A) Individual UMAP plots are overlaid with all markers from the myeloid panel. (B) Composition of major immune populations found in the brain samples. Selected
bars show the reference sample included in both CyTOF acquisition rounds, which allowed normalization and batch correction. Samples were ordered according
to patients’ clinical diagnosis. (C) Frequencies of the main immune populations among CD45+ cells in the glioblastoma cohort stratified according to methylation
status of the MGMT promoter. Error bars define an interval of max/min value ± SD, horizontal line indicates the mean value. P-values were calculated using a
non-parametric Mann-Whitney-Wilcoxon test. P-values greater than 0.05 were considered statistically non-significant and were not displayed.
Figure S3. The Brain TME Harbors a Heterogeneous Mononuclear Phagocyte Population, Related to Figure 2
(A) Manual gating strategy to validate the identification of the major immune populations by unsupervised machine-learning algorithms. (B) Relative frequencies of
three TAM populations among CD64+ cells in the glioblastoma cohort stratified according to methylation status of the MGMT promoter. Error bars define an
interval of max/min value ± SD, horizontal line indicates the mean value. P-values were calculated using a non-parametric Mann-Whitney-Wilcoxon test. P-values
greater than 0.05 were considered statistically non-significant and were not displayed.
Figure S4. TAM Instruction Is Driven by the Type of Tumor Rather Than the Local Tissue Microenvironment, Related to Figures 4 and 5
(A) Representative immunohistochemistry images of CD163 (brown) and Iba1 (red) co-staining for glioma and BrM samples. Selected areas showing Iba1+
CD163- microglia with amoeboid (1) and ramified (2) morphology. Arrows depicting the blood vessel. (B) Representative immunofluorescence image of CD209+
CNS phagocytes in recurrent anaplastic oligodendroglioma (ZH927). (C) Representative immunofluorescence image of CD169+ CNS phagocytes in the
melanoma BrM (ZH879). (D) Representative immunofluorescence images of CD163+, CD206+ and CD169+ CNS phagocytes in NSCLC BrM (ZH968). (E) The
Pearson correlation between TAM frequencies and OS in the IDH1wt glioma group. Relative frequencies of microglia were calculated among CD64+ cells. Relative
frequencies of monocytes and MDM subsets were calculated among parental MDM/monocyte population. Patients that have survived till day of analysis (hence
past 400 days) have been highlighted using an asterisk (*) alongside their Patient ID.
Figure S5. Preferential Treg Accumulation in the TME of Brain Metastases, Related to Figure 6
(A) A representative UMAP map displaying 120,000 singlets, live, TILs (CD3, CD19, CD56, CD90 expressing cells) equally proportioned from glioma and
metastasis samples. Individual UMAP plots are overlaid with all markers in the lymphoid panel. (B) Heatmap displaying the median antigen intensity of markers
used to generate part (C). (C) UMAP map overlaid with FlowSOM-guided manual meta-clusters. (D) Relative frequencies of the main TILs populations identified
among lymphocytes in the brain tumors. (E) Relative frequencies of the main TILs populations identified in patients with glioblastoma stratified according to
methylation status of the MGMT promoter. (D, E) Only statistically significant p-values were displayed (p < 0.05, Mann-Whitney-Wilcoxon test, Benjamini-
Hochberg correction).
Figure S6. Characterization of T Cell Memory Formation and Correlation with Glioblastoma Patient Survival, Related to Figure 6
(A) Gating strategy of naive/memory T cells. (B) Relative frequencies of main T cell subsets among CD3+ cells. (C) The Pearson correlation between T cell subset
frequencies and OS in the IDH1wt glioma group. Relative frequencies of naive/memory populations were calculated among T cells. Patients that have survived till
day of analysis (hence past 400 days) have been highlighted using an asterisk (*) alongside their Patient ID.
Figure S7. T Cell Exhaustion Characterizes the TME of Brain Metastases, whereas Immature NK Cells Accumulate in Glioblastoma, Related to
Figures 6 and 7
(A) Median expression of antigens derived from the most differentially expressed genes among CD8 RM and CD8 EM. Boxplots quantify the mean antigen
intensity of data shown in Figure 6C. Point shapes differentiate patients who received therapy prior to surgery (including immune-, radio- and chemotherapy). Only
statistically significant p-values were displayed (p < 0.05, Mann-Whitney-Wilcoxon test, Benjamini-Hochberg correction). (B) Relative frequencies of three CD56
expressing populations identified in the brain samples among lymphocytes. Point shapes differentiate patients who received therapy prior to surgery (including
immune-, radio- and chemotherapy). (C) Relative frequencies of ILC populations among CD56+ cells identified in the glioblastoma cohort stratified according to
methylation status of the MGMT promoter. (B, C) Only statistically significant p-values were displayed (p < 0.05, Mann-Whitney-Wilcoxon test, Benjamini-
Hochberg correction). (D) The Pearson correlation between ILC cell subset frequencies and OS in the IDH1wt glioma group. Relative frequencies of ILC
populations were calculated among CD56+ cells. Patients that have survived till day of analysis (hence past 400 days) have been highlighted using an asterisk
(*) alongside their Patient ID.
Resource
Correspondence
johanna.joyce@unil.ch
In Brief
High-dimensional, multi-omics
characterization of the brain tumor
microenvironment, including
comparisons of gliomas and brain
metastases, suggests that education of
immune cell types in the TME depends on
tumor origin and IDH mutational status.
Highlights
• Flow cytometry, RNA-seq, and protein and image analyses reveal brain TME complexity
Resource
Interrogation of the Microenvironmental Landscape
in Brain Tumors Reveals Disease-Specific
Alterations of Immune Cells
Florian Klemm,1,2 Roeltje R. Maas,1,2,3,4 Robert L. Bowman,5 Mara Kornete,1,2 Klara Soukup,1,2 Sina Nassiri,1,2,6
Jean-Philippe Brouland,7 Christine A. Iacobuzio-Donahue,8 Cameron Brennan,9 Viviane Tabar,9 Philip H. Gutin,9
Roy T. Daniel,4 Monika E. Hegi,3,4 and Johanna A. Joyce1,2,10,*
1Department of Oncology, University of Lausanne, Lausanne, Switzerland
2Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
3Neuroscience Research Center, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
4Department of Neurosurgery, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
5Memorial Sloan Kettering Cancer Center, New York, NY, USA
6Bioinformatics Core Facility, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
7Department of Pathology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
8Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
9Department of Neurosurgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA
10Lead Contact
*Correspondence: johanna.joyce@unil.ch
https://doi.org/10.1016/j.cell.2020.05.007
SUMMARY
Brain malignancies encompass a range of primary and metastatic cancers, including low-grade and high-
grade gliomas and brain metastases (BrMs) originating from diverse extracranial tumors. Our understanding
of the brain tumor microenvironment (TME) remains limited, and it is unknown whether it is sculpted
differentially by primary versus metastatic disease. We therefore comprehensively analyzed the brain TME
landscape via flow cytometry, RNA sequencing, protein arrays, culture assays, and spatial tissue character-
ization. This revealed disease-specific enrichment of immune cells with pronounced differences in propor-
tional abundance of tissue-resident microglia, infiltrating monocyte-derived macrophages, neutrophils,
and T cells. These integrated analyses also uncovered multifaceted immune cell activation within brain ma-
lignancies entailing converging transcriptional trajectories while maintaining disease- and cell-type-
specific programs. Given the interest in developing TME-targeted therapies for brain malignancies, this
comprehensive resource of the immune landscape offers insights into possible strategies to overcome
tumor-supporting TME properties and instead harness the TME to fight cancer.
Cell 181, 1643–1660, June 25, 2020 © 2020 Elsevier Inc.
Immune checkpoint blockade (ICB), adoptive cell therapy, and vaccines represent treatments targeted against immune cells
within the TME and systemically. The success of immunotherapies in certain extracranial cancers has led to clear motivation
for their evaluation in brain malignancies. However, although they show some clinical efficacy in a subset of BrM patients
(Hendriks et al., 2019; Long et al., 2018; Tawbi et al., 2018), ICB has only resulted in responses in isolated cases of primary
gliomas to date (Lim et al., 2018; Schalper et al., 2019). Beyond tumor cell-intrinsic effects, this may be attributed in part to
immune-suppressive components of the brain TME, including tumor-associated macrophages (TAMs), which have emerged as
prominent players in brain cancers (Gutmann and Kettenmann, 2019; Quail and Joyce, 2017).
Lineage-tracing experiments in mice revealed that brain TAMs can originate from tissue-resident MG or monocyte-derived
macrophages (MDMs) recruited from the peripheral circulation (Bowman et al., 2016; Chen et al., 2017). TAMs are highly plastic
cells that integrate input from cytokines, growth factors, and other stimuli, resulting in diverse activation states and cellular
phenotypes, including promotion of invasion, angiogenesis, metastasis, and immune suppression (Mantovani et al., 2017; Noy and
Pollard, 2014). This plasticity and their position at the nexus between malignant cells and tumor-infiltrating T cells makes TAMs a
promising target of TME-directed therapies in different cancers. Indeed, studies in mice showed that phenotypic alteration of TAMs
results in anti-tumor efficacy in glioblastoma (Pyonteck et al., 2013; Quail et al., 2016; Yan et al., 2017), whereas TAM depletion
prevents BrM outgrowth (Qiao et al., 2019).
Despite these preclinical studies, the precise contribution of the two ontogenetically distinct TAM cell types in human brain
malignancies is unclear, which hinders clinical translation. For example, previous studies interrogating the role of TAMs in patient
brain tumors did not distinguish between MG and MDMs based on use of lineage tracing-derived markers (Gabrusiewicz
et al., 2016; Sankowski et al., 2019; Szulzewsky et al., 2016) or focused solely on gliomas (Müller et al., 2017; Venteicher et al.,
2017). We therefore interrogated the TME landscape in gliomas and BrMs, with an emphasis on exploring TAMs, while also
investigating their relation to other immune cells and structures in the TME. We leveraged this multimodal resource to address
a number of questions. Do tumors arising within the brain shape their TME differently than cancers that metastasize from
extracranial sites? Does IDH mutation status affect the TME? How do distinct TME compositions potentially modulate the activation
states of immune cells? By integrating the answers to these questions, we provide insights into potential strategies to
harness the brain TME in the fight against these deadly diseases.
RESULTS
Tumor Origin and IDH Mutational Status Influence the Immune Composition of Brain Malignancies
We first determined the broad immune cell abundance in the brain TME by analyzing the pan-leukocyte marker CD45 through
immunofluorescence (IF) staining of whole-tissue sections and flow cytometry (FCM) analyses of non-tumor brain tissue, IDH
mut low-grade and IDH WT high-grade gliomas, and BrMs originating from different primaries, including breast cancer, lung
cancer, and melanoma (Figures 1A, 1B, and S1A). This showed a leukocyte abundance from 20%–40% across the cancer
samples. Stratification of CD45+ cells into myeloid and lymphoid lineages revealed a significant increase in myeloid cells in IDH
mut and IDH WT gliomas and of lymphocytes in IDH WT tumors and BrMs compared with non-tumor tissue (Figure 1B; p < 0.05,
one-sided Student’s t test). We used multicolor fluorescence-activated cell sorting (FACS) to analyze 14 major immune cell
populations across 100 clinical samples (Figure S1A; Tables S1 and S2) and collected cells for RNA sequencing (RNA-seq)
from 48 patients (Table S3; full clinical annotation).
By incorporating cell lineage tracing and mouse models of high-grade gliomas and BrM, we previously identified the cell
surface marker integrin alpha 4, ITGA4/CD49D, as a means to discriminate tumor-associated MG (T-MG) from tumor-associated
MDMs (T-MDMs) (Bowman et al., 2016), which we integrated here into clinical sample analyses. This enabled sorting
of CD45− non-immune cells, CD49Dlow MG, CD49Dhigh MDMs, neutrophils, and CD4+ and CD8+ T cells (Figure S1A; Tables S2
and S3A) for transcriptome analysis by RNA-seq. We assessed sorting fidelity by FCM re-analysis of the sorted CD49Dlow and
CD49Dhigh TAM populations (purity, 98.4%–99.8%) and by investigating the frequency of the canonical IDH codon 132
missense mutation in the RNA-seq reads from CD45− cells and CD49Dlow and CD49Dhigh TAM populations. Although we
observed a mean mutated allele frequency of 0.43 in CD45− cells from IDH mut gliomas (range, 0.3–0.61), this was very rare in TAMs (mean, 0.01; range, 0.0–0.09), indicating reliable separation of cell populations. In a t-distributed stochastic neighbor embedding (t-SNE) visualization of sorted populations, samples clustered mostly by cell type (Figure S1B), with gliomas and BrMs discernible as separate groups in the CD45− population. In this global expression analysis in the context of the other major brain TME components, CD49Dlow and CD49Dhigh TAM populations clustered closely, suggesting broad transcriptomic similarity. We thus further interrogated the utility of CD49D to differentiate between TAM populations by analyzing association of MG- and MDM-specific ontogeny core gene sets, identified previously from lineage-tracing studies (Bowman et al., 2016), in human CD49Dlow and CD49Dhigh cells sorted from non-malignant and brain cancer tissues. This revealed enrichment of ontogeny core gene sets in the corresponding cell type (Figure 1C), demonstrating our ability to accurately distinguish MG and MDMs in human samples across different disease entities. Interestingly, these core signatures were influenced within certain tumor types, with T-MDMs showing an increased MG core gene set signal in IDH mut gliomas and T-MG acquiring MDM features in BrMs, suggesting tissue-dependent transcriptional programming of these cells, as further interrogated below.
We next assessed the landscape of intratumoral immune cell populations (Figure S1A; Table S2) using clustering analysis to identify patterns of cellular abundance (Figure 1D; chi-square test for independence, p < 0.0001). This revealed three major clusters: (1) non-tumor samples and IDH mut gliomas characterized by dominance of MG with low numbers of other immune cells; (2) IDH WT gliomas and several BrMs with an influx of MDMs and, to some extent, neutrophils into the tumor while mostly excluding lymphocytes; and (3) predominantly BrMs and few IDH WT gliomas exhibiting the most diverse immune cell landscape with substantial infiltration of T cells and neutrophils. Certain tumors contained CD14low/CD16+ non-classical monocytes, CD14+/CD16+ intermediate monocytes, CD16 granulocytes, dendritic cells (DCs), or immature myeloid cells. Across all samples, the lymphocyte compartment was mostly composed of T cells with fewer natural killer (NK) cells and B cells. Principal-component analysis (PCA) of the relative abundance of all investigated populations confirmed that MG, MDMs, neutrophils, and CD4+ and CD8+ T cells are the major immune cell determinants of the brain TME landscape (Figure 1E). Principal component 1 (PC1) separated non-tumor tissue and IDH mut gliomas from IDH WT gliomas and BrMs, whereas PC2 distinguished IDH WT gliomas and BrMs. Further analysis stratifying for IDH status in gliomas and the primary tumor site in BrMs verified a substantially higher proportion of lymphocytes in BrMs (Figure 1F; mean lymphocytes [% of CD45+] = 46.23%, SEM = 4.15, t test, p < 0.0001). Melanoma BrMs exhibited the most abundant lymphocyte infiltrate with a sizeable CD8+ T cell fraction (mean CD8+ [% of CD45+] = 33.01%, SEM = 5.82, one-way ANOVA, p < 0.01). Regulatory T cells (Tregs) were detected in certain BrMs (mean Treg [% of CD45+] = 1.2%, SEM = 0.36) but were rare in gliomas (mean Treg [% of CD45+] = 0.25%, SEM = 0.05, t test, p < 0.05).
Because of the prominence of T-MG and T-MDMs in the myeloid compartment of brain malignancies, we used IF staining and deconvolution analyses to independently validate their presence. Commonly employed MG markers, such as P2RY12, TMEM119, and SALL1, and MDM-associated genes, such as AHR and VDR, showed varying RNA expression levels across different brain malignancies while maintaining their cell type specificity (Figure S2A) in a similar manner as observed for the ontogeny core gene sets (Figure 1C). An equivalent pattern was observed at the protein level (Figure S2B), where P2RY12 showed the highest expression in non-tumor tissue, and CD68 was most abundant in BrM-TAM populations. This necessitated use of both markers complemented by CD49D to reliably identify MG and MDMs in IF analyses (Figure S2C). We used this strategy to interrogate a cohort of non-tumor, glioma, and BrM samples by whole-section quantification, confirming MDM accumulation in IDH WT gliomas and BrMs (Figures 2A–2C). Furthermore, comparison of tissue processed independently for IF and FCM from the same individual samples demonstrated significant concordance (Figure S2D).
We queried the sorted cell populations for T-MG- and T-MDM-specific differentially expressed genes (DEGs) that separate these two populations from the most abundant other cell types; i.e., CD45− cells, neutrophils, and T cells (Figure S2E). Several of the genes highly expressed in T-MG are well-established MG markers (P2RY12, TMEM119, and TAL1), whereas genes highly expressed in T-MDMs include markers of alternative macrophage polarization (FCGR2B and CLEC10A) and DC-like phenotypes (CD1C, CD1B, and CD207) with increased phagocytic and antigen cross-presentation ability (CD209). These gene sets also allowed us to utilize a publicly available integrated dataset (Vivian et al., 2017) containing bulk expression data of healthy cortical brain tissue from the Genotype-Tissue Expression project (GTEx; GTEx Consortium, 2013) and low- and high-grade glioma samples from The Cancer Genome Atlas (TCGA; Ceccarelli et al., 2016) in a bulk tissue transcriptome deconvolution approach (Racle et al., 2017). The estimates obtained of MG and MDM proportions in this external dataset (n = 711 samples) verified the prevalence of MG in IDH mut gliomas and MDM enrichment in IDH WT gliomas (Figure 2D).
MG and MDMs Exhibit a Multifaceted Polarization Phenotype in Brain Malignancies
We next employed PCA to specifically focus on TAMs and analyze genes whose expression was influenced by tissue type (i.e., reference MDMs, non-tumor brain, gliomas, and BrMs) and cell type (i.e., MG and MDMs) (Figure 3A). Within the first two PCs, MG and MDMs projected into different spaces, with in vitro differentiated MDMs distinct from tissue-derived samples. We observed a gradient across PC1 with non-tumor brain tissue at one end, traversing IDH mut and IDH WT gliomas, and ending with BrMs. Thus, TAM transcriptomic changes are influenced by the brain TME per se and also by the specific type of malignancy.
We contrasted T-MG and T-MDMs from BrMs or gliomas (regardless of IDH mutation status) with MG from non-tumor brain or in vitro differentiated MDMs from healthy donors, respectively (Figure S3A; Tables S3A and S4). This revealed profound expression changes in both populations, with T-MDMs exhibiting a higher magnitude in their transcriptional response
compared with T-MG (Figure S3A). The intersect of DEGs in gliomas and BrMs was highest in T-MDMs (Figure S3B), potentially reflecting the greater changes experienced by these cells upon entering the completely foreign environment of a brain tumor. This was also evident when focusing on genes upregulated in gliomas and BrMs that are exclusive to T-MG or T-MDMs (Figure 3B). In T-MG and T-MDMs, the number of shared genes was higher across different diseases than between these two cell populations within the same tumor type. Consequently, only a small number of genes (n = 137) showed concordant upregulation across a comparison of all diseases and TAM types (Figure 3B).
To explore the underlying biological processes conserved in gliomas and BrMs, we examined the intersect of upregulated genes (Figure S3B) in T-MG or T-MDMs using gene set overrepresentation analysis (ORA). In the Molecular Signatures Database (MSigDB; Liberzon et al., 2015) “hallmark” collection of major biological categories, T-MG and T-MDMs showed pathway enrichment in (1) modeling of the TME (“Angiogenesis,” “Hypoxia”), (2) inflammation (“Inflammatory Response,” “Allograft Rejection”), and (3) immune cell activation states (“TNFα Signaling via NFκB,” “Interferon α Response,” “Interferon γ Response,” “IL2 STAT5 Signaling,” and “IL6 JAK STAT3 Signaling”) (Figure S3C).
We also assessed the M1 and M2 polarization status of T-MG and T-MDMs using a panel of marker genes (Murray et al., 2014). However, no evident pattern emerged of a defined M1 versus M2 phenotype in glioma or BrM T-MG or T-MDMs (Figure S3D). To further explore the activation state of T-MG and T-MDMs, we subjected their respective upregulated genes to ORA of macrophage stimulus-specific programs (Xue et al., 2014). This revealed a multifaceted response (Figure 3C) incorporating canonical M1 (interferon γ [IFNγ]) and M2 polarization (interleukin-4 [IL-4]), including expression changes associated with chronic inflammatory stimuli (tumor necrosis factor alpha [TNF-α] + prostaglandin E2 [PGE2] and TNFα + PGE2 + Pam3CysSerLys4 [TPP]) and exposure to free fatty acids (oleic acid [OA] and palmitic acid [PA]), which have been implicated in modulating myeloid cell function (Thapa and Lee, 2019). This indicates diverse transcriptional programming of T-MG and T-MDMs in gliomas and BrMs extending beyond simple M1 versus M2 polarization.
To understand which processes are linked to and potentially driving these responses, we identified the gene set enrichment
analysis (GSEA; Subramanian et al., 2005) leading-edge genes in T-MG and T-MDMs in gliomas and BrMs and clustered them into leading-edge metagenes (LEMs) with non-negative matrix factorization (Godec et al., 2016). This identified up to 5 distinct LEMs per cell type and comparison that were tested for significant overlap in a pairwise fashion (Figure S3E) and annotated using Gene Ontology (GO) terms (Figure 3D). LEMs associated with mitosis and cell proliferation were present in T-MG and T-MDMs in gliomas and BrMs (Figure 3D, group 1). The biological validity of these LEMs was verified by staining for Ki-67, a marker of cell proliferation, in non-tumor, glioma, and BrM tissue sections (Figure 3E), showing increased proliferation in T-MG and T-MDMs in IDH WT gliomas and BrMs and in T-MG in IDH mut gliomas. Interestingly, LEMs enriched for type I IFN signaling were detected in glioma and BrM T-MDMs and in BrM T-MG but not in glioma T-MG (Figure 3D, group 2). Sustained type I IFN signaling has been implicated in mediating immune suppression and ICB resistance (Benci et al., 2016). The stringency of these group 2 LEMs was validated by building a protein-protein interaction (PPI) network of the shared LEM genes (Figure S3F). Beyond their role in antiviral responses, the genes highlighted at the center of the PPI network (Figure S3F, red nodes) have been implicated in a variety of tumor-promoting and -suppressing roles (Benci et al., 2016). Similarly, the more peripheral network nodes IL15 and TNFSF10 are potentially able to modulate an effective immunological anti-tumor response or induce apoptosis in cancer cells, respectively (Bouralexis et al., 2005; Santana Carrero et al., 2019). We asked whether these genes were directly induced by secreted factors in the brain TME and established cell-based assays to expose MDMs to TME conditioned medium (CM) generated from single-cell suspensions of freshly isolated glioma or BrM samples in culture. All genes analyzed were upregulated by BrM-TME-CM and to a lesser extent by glioma TME-CM (Figure 3F). We also detected induction of inflammation- and nuclear factor κB (NFκB) signaling-associated LEMs in BrM-MG, glioma MDMs, and BrM-MDMs (Figure 3D, group 3). LEMs that point toward a Th17 response (group 4) and recruitment of immune cells and interactions between different immune cell compartments were exclusively detected in MDMs (group 5). Collectively, these analyses reveal acquisition of a multifaceted activation state of MG and MDMs upon their integration into the TME of brain malignancies.
IDH Mutation Status Associated with Changes in Glioma TAM Activation
We next asked whether MG and MDMs occupy distinct regions within the TME of IDH WT gliomas. Spatial analysis of tissue sections showed significant enrichment of both populations in the perivascular niche (Figures 4A and S4A). Analysis of their distribution relative to CD31+ vascular structures showed a closer proximity of T-MDMs compared with T-MG (Figures 4B and S4A). Interrogation of anatomical transcriptome data from the Ivy Glioblastoma Atlas Project (Ivy GAP) study (Puchalski et al., 2018) also demonstrated enrichment of T-MDMs in the microvascular compartment (Figure S4B). This enrichment coincided with CD4+ and CD8+ T cells, indicating further spatial TME organization in IDH WT gliomas.
We assessed whether the distinct T-MG and T-MDM distributions and cell numbers are paralleled by their activation state. In the LEM analysis, we had detected a type I IFN response in glioma MDMs but not MG (Figure 3D); we therefore queried the FCM data to analyze levels of major histocompatibility complex (MHC) class II human leukocyte antigen-DR isotype (HLA-DR) expression. This showed significantly increased HLA-DR in T-MDMs compared with T-MG in IDH mut and IDH WT tumors (Figure 4C). We screened the associated RNA-seq data for antigen processing and presentation pathway gene sets using GSEA and gene set variation analysis (GSVA) (Figure 4D). Interestingly, we found evidence of increased expression of MHC class II antigen presentation gene sets in IDH WT glioma MDMs and also antigen processing-associated pathways (Figure S4C) and MHC class I presentation gene sets (Figure 4D). Although these findings suggest the potential of TAMs, particularly T-MDMs, to initiate an immune response, this potential is generally not realized in the glioma TME, based on the current status of ICB trials in this disease, and we thus asked whether there was also evidence of pro-tumor states in these cell populations.
We compared T-MG and T-MDMs from IDH WT gliomas with T-MG from IDH mut gliomas because they constitute the most abundant TME cell types in these tumors, respectively (Figures 1F and 2C; Table S5). This revealed 489 DEGs in T-MG (Figure 4E; Table S5; 406 up- and 83 downregulated), and 1,478 DEGs in T-MDMs (Figure 4F; Table S5; 903 up- and 575 downregulated). Although these gene lists were generated by comparing T-MDMs from IDH WT gliomas with T-MG from IDH mut gliomas, they similarly separated T-MDMs in IDH mut versus IDH WT disease in a clustering analysis (Figure 4F), indicating that they indeed reflect T-MDM alterations based on the IDH status of the tumor. A total of 421 genes exhibit a similar pattern across both TAM cell types (343 up- and 78 downregulated), suggesting that T-MG and T-MDMs can also acquire a common transcriptional pattern in IDH WT tumors. Among the shared genes were several encoding extracellular matrix (ECM) proteins (Figure 4G, FN1 and VCAN) and ECM-associated matricellular proteins (THBS1, TGFBI, LGALS3, and ANGPTL4) that regulate the availability of ECM-sequestered ligands, angiogenesis, and tumor
interferon gamma; LA, lauric acid; LiA, linoleic acid; OA, oleic acid; PA, palmitic acid; PGE2, prostaglandin E2; sLPS, standard lipopolysaccharide; TNF-α, tumor
necrosis factor alpha; TPP, TNFα + PGE2 + Pam3CysSerLys4; IL-10, interleukin 10.
(D) Heatmap of GO overrepresentation analysis of leading-edge metagenes (LEMs) in MG and MDMs from gliomas and BrMs. Tile fill indicates significance
(hypergeometric test, −log10[adjusted p value]; terms were filtered by significance).
(E) IF quantification of the proportion of proliferating Ki67+ MG and MDMs in non-tumor tissue (n = 5), IDH mut (n = 10) and IDH WT gliomas (n = 9), and BrMs (n = 8).
Means were compared with one-tailed t test: *p < 0.05.
(F) qRT-PCR of type I IFN LEM marker genes from group 2 (Figure S3F) in in-vitro-generated MDMs stimulated with the indicated TME culture-conditioned
medium (TME CM). Fold changes were calculated relative to colony-stimulating factor-1 (CSF-1)-treated MDM baseline (one-way ANOVA, p < 0.1, nMDM = 4–11).
Data are represented as mean ± SEM.
See also Figure S3 and Table S4.
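The hypergeometric overrepresentation test named in (D) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy gene identifiers, and the choice of Benjamini-Hochberg adjustment are assumptions made for illustration only (the legend states only that p values were adjusted).

```python
from math import comb, log10


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N genes from a universe of M that holds n set members."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


def ora_pvalues(query, gene_sets, universe):
    """Overrepresentation p value of each gene set within the query gene list."""
    universe = set(universe)
    query = set(query) & universe
    pvals = {}
    for name, members in gene_sets.items():
        members = set(members) & universe
        overlap = len(query & members)
        pvals[name] = hypergeom_sf(overlap, len(universe), len(members), len(query))
    return pvals


def benjamini_hochberg(pvals):
    """BH-adjusted p values (assumed adjustment method, not stated in the source)."""
    ordered = sorted(pvals, key=pvals.get)
    m = len(ordered)
    adjusted, running_min = {}, 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        name = ordered[rank - 1]
        running_min = min(running_min, pvals[name] * m / rank)
        adjusted[name] = running_min
    return adjusted


def tile_fill(p_adj):
    """-log10 of the adjusted p value, the quantity used as heatmap fill."""
    return -log10(p_adj)


# Hypothetical toy data: 20-gene universe, one 5-gene set, 5-gene query.
universe = {f"g{i}" for i in range(20)}
gene_sets = {"setA": {f"g{i}" for i in range(5)}}
query = {"g0", "g1", "g2", "g3", "g10"}
pvals = ora_pvalues(query, gene_sets, universe)
```

In this toy case, four of the five query genes fall in the five-gene set, so the tail probability is (75 + 1) / 15,504, roughly 0.0049.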
immunity (Mushtaq et al., 2018). This suggests that TAMs help shape the composition and effector functions of ECM proteins in IDH WT tumors. We also found the anti-inflammatory molecules ANXA1 and GPNMB (Figure 4G), previously implicated in pro-tumorigenic macrophage polarization and inhibition of T cell activation (Kobayashi et al., 2019; Ripoll et al., 2007), to be upregulated in T-MG and T-MDMs.
We next investigated inflammation mediators within the CD45− population of IDH WT tumors in parallel with their corresponding receptors in TAMs. TGFB2 expression was elevated compared with IDH mut CD45− cells, and the accessory transforming growth factor β (TGF-β) receptor ENG was highly expressed in IDH WT TAMs (Figure 4H). TGFB2 has pleiotropic effects in inflammation and tissue remodeling during wound healing and has been implicated in an autocrine signaling loop in glioblastoma cells (Rodón et al., 2014). The neuroinflammatory cytokine MDK, which modulates TAM polarization to an M2-like phenotype in glioma (Meng et al., 2019), was upregulated in CD45− cells from IDH WT tumors, and its receptors SDC4 and ITGA4/CD49D were differentially expressed in T-MDMs versus T-MG (Figure 4H), suggesting cell-type-specific effects of this inferred signaling loop.
We asked whether a T-MDM-specific gene set generated from IDH WT gliomas was associated with a survival difference in patients. By logistic regression, we derived a representative signature consisting of 36 genes (Figure S4D) from the total number of genes upregulated in TAMs in brain malignancies (Figure 3B). This included the macrophage marker RUNX3; the atypical chemokine receptor ACKR3, which can regulate CXCL12-CXCR4 signaling; the endoplasmic reticulum (ER) stress protein HERPUD1 and the inhibitory Fc receptor FCGR2B, which can modulate macrophage activation (Bournazos et al., 2016; Li et al., 2018); and the cytokine IL19, which affects angiogenesis and macrophage polarization (Richards et al., 2015). The signature was used to classify patients in a merged TCGA dataset of low- and high-grade gliomas (Figures 4I and S4E). In IDH mut patients, a decrease in median overall survival was associated with enrichment of the T-MDM IDH WT signature, whereas IDH WT patients with a low enrichment score showed increased survival. This was confirmed in a multivariate Cox proportional hazards model that included the transcriptomic glioma subtypes (as annotated in the TCGA dataset) and IDH status (Figure 4J). To verify that this effect did not simply reflect changes in T-MDM number, we classified the TCGA cohort based on enrichment of the T-MDM-specific gene set used for deconvolution, which showed a low effect on survival (Figure S4F).
In light of disappointing outcomes from PD1 or PDL1 ICB trials in glioblastoma to date, we queried whether the abundant T-MG and T-MDMs could contribute to the limited therapeutic efficacy. We performed ORA of a panel of 20 gene sets previously associated with innate anti-PD1 resistance (IPRES; Hugo et al., 2016) in the TAM DEGs of IDH WT gliomas and found a sizeable fraction to be upregulated in T-MG and T-MDMs (Figure S4G). We then included the CD45− population and interrogated enrichment of IPRES gene sets on the single-sample level by GSVA (Figure S4H). This yielded a diverse picture with tumor cells and TAMs enriched for IPRES gene sets to varying degrees. Therefore, TAMs and CD45− cells from IDH WT gliomas may contribute to mediating innate ICB resistance.
The Immune Contexture Influences the TME on a Global Level
Through integrated analysis of protein and gene expression data, we next explored the effect of immune cell infiltration. Of 200 inflammation-associated proteins assessed, 55 were differentially detected in our sample cohort (for clinical information, see Table S3B). Unsupervised clustering analysis revealed distinct clusters with abundant inflammatory proteins in tumors (Figure 5A). The profile of IDH WT gliomas and BrMs showed a sizeable overlap (protein cluster 1), encompassing angiogenic factors (VEGFA and ANG), growth factors (PDGFA, TGFB1, SPP1, and GDF15), several proteases and protease inhibitors (SERPINE1, CTSS, and TIMP1), the proteolysis cascade regulator PLAUR, and the cytokines CCL2 and CCL5 (Figures 5A and S5A). However, we also found distinct protein patterns between gliomas and BrMs. The neurotrophic growth factor FGF2 and neuronal cell adhesion molecules, including ALCAM, which regulates immune cell infiltration during
neuroinflammation (Lécuyer et al., 2017), were highly expressed in non-tumor brain, IDH mut, and IDH WT samples (protein cluster 3; Figure S5A). Conversely, BrM samples had abundant immune-regulatory molecules affecting myeloid and lymphocytic cells and their heterotypic signaling (protein cluster 2; Figure S5A), such as CD40L, IL6R, INHBA, and AREG (Morianos et al., 2019; Zaiss et al., 2015), possibly reflecting the greater immune cell diversity in BrMs. This orthogonal dataset reinforces the RNA-seq analyses showing that inflammatory signaling pathways are highly enriched in brain tumors.
We integrated the cell-type-specific RNA-seq data and bulk protein data to distinguish proteins with more restricted expression versus those that are expressed across a range of cell types. Transcriptome data from CD45− cells, TAMs, neutrophils, and CD4+ and CD8+ T cells from all tumor samples were clustered using a self-organizing map (SOM). This yielded 6 SOM spots (i.e., metagenes of co-expressed genes; Figure 5B) that recapitulated the respective cell lineages (Figure S5B). The CD45− populations were assigned to three distinct spots that were associated with more aggressive IDH WT gliomas and BrMs (spot VI) or reflected the brain-intrinsic or -extrinsic tumor origin (spots I and V). These cell-type-associated SOM spots overlapped considerably with the protein data (30 of 55 proteins, Fisher’s exact test, p < 0.0001; Figure 5C). Although VEGFA, ANG, and TGFB1 were expressed by diverse cell types in gliomas and BrMs, other genes, such as GDF15 and IGFBP2, showed more CD45− cell-restricted expression (Figure 5D). The significant contribution of TAMs to production of key inflammatory proteins, including SPP1 and INHBA, is reflected by TAM SOM spot III, constituting the largest group of proteins with cell type-specific expression (Figures 5C and 5D).
Myeloid Cells Show a Distinct Phenotype in BrMs
Our global analysis juxtaposing the expression patterns of TAMs in gliomas (regardless of IDH status) with BrMs showed disease- and cell-type-specific transcriptomic changes. We thus explored BrM-specific alterations by focusing on genes upregulated only in relation to the corresponding reference and to IDH WT gliomas (Figures S6A and S6B; Table S6). Various cytokines, chemokines, and pro-inflammatory molecules were elevated in BrM-MG and BrM-MDMs (Figure 6A), including the potent mediators of autoimmune neuroinflammation CSF2 and IL23A (Zhao et al., 2017) and the pattern recognition receptor MARCO. Intriguingly, antibody-mediated MARCO targeting in extracranial tumors increases M1-like polarization of TAMs and enhances ICB efficacy (Georgoudaki et al., 2016). These effects relied on interaction of the antibody with FCGR2B, which is also part of the T-MDM IDH WT signature (Figures S2E and S4C). Finally, RETN, which is involved in systemic inflammatory disorders (Filková et al., 2009), was upregulated in BrM-TAMs (Figure 6A).
Analysis of individual BrM-TAM populations uncovered distinct expression patterns. BrM-MG showed restricted upregulation of IL6 (Figure 6A), which exerts immunosuppressive effects on T cell function in cancer and mediates ICB resistance (Tsukamoto et al., 2018), and the receptor TREM1, which modulates pro-inflammatory responses in MG and systemically in myeloid cells during neuroinflammation (Liu et al., 2019; Xu et al., 2019). Among the upregulated chemokines, we found increases in both TAM cell types (CCL23) and BrM-MG-restricted (CXCL5 and CXCL8) or BrM-MDM-restricted increases (CCL8, CCL13, CCL17, and CCL18) (Figure 6A). These results reveal distinct contributions of TAM populations to the inflammatory TME milieu in a disease-specific manner.
GSEA identified additional cell-type-specific enrichment patterns. BrM-MG showed evidence of IL-6 pathway activity (Figure S6C), and in BrM-MDMs, the “Naba core matrisome” gene set was significantly enriched (Figure S6D). This prompted us to assess expression of genes encoding ECM and matricellular proteins in BrM-MDMs versus BrM-MG, which revealed genes encoding matrix proteins, including type III and IV collagens, FN1, the proteoglycans LUM and OGN, and the matricellular proteins ECM1, SPARC, and SPARCL1 as highly expressed in BrM-MDMs (Figure 6B). Although ECM remodeling has been implicated in tumor progression, LUM, OGN, SPARC, and SPARCL1 exhibit pro- and anti-metastatic properties, which underscores the complex context-dependent role of the ECM (Kai et al., 2019). We also found high expression of the cathepsin proteases CTSB and CTSW in BrM-MDMs (Figure 6B), which participate in multiple tumor-promoting processes, including invasion and metastasis (Olson and Joyce, 2015). The hyaluronan receptor HMMR, involved in macrophage chemotaxis and fibrosis in lung injury (Cui et al., 2019), was also higher in BrM-MDMs (Figure 6B). Together, these data suggest that the ECM is not only shaped by macrophages at the primary site (Afik et al., 2016) but that T-MDMs may also play a pivotal role in ECM niche construction in BrM that is distinct from IDH WT gliomas (Figure 4G).
Given the upregulation of CXCL8, a key neutrophil chemoattractant, by BrM-MG (Figure 6A), we explored the TME contribution to recruitment of neutrophils, which were highly abundant in BrM (Figure 1F). Analysis of major neutrophil-recruiting chemokines and their receptors showed broad expression across all interrogated myeloid cells (Figure S6E). To explore the phenotype of BrM-associated neutrophils, we queried the RNA-seq data, which revealed BrM-specific upregulation of ITGA3 (Figure 6C), which is involved in neutrophil tissue infiltration in sepsis, and CXCL17, previously implicated in neutrophil and macrophage recruitment in cancer (Li et al., 2014). We also observed upregulation of the
are hierarchically clustered. Clinical data are annotated per row; column annotation reflects the major protein clusters (further information can be found in
Table S3B).
(B) Self-organizing map (SOM) of RNA expression data of major cell populations in glioma and BrM samples. SOM spots are highlighted, numbered with Roman
numerals, and annotated with their cell type association.
(C) Overlap of individual proteins and SOM spot metagenes; tile color fill reflects protein cluster membership from (A).
(D) RNA-seq counts (normalized, scaled to expression range) of proteins from (A) across major cell types in IDH mut and IDH WT gliomas and BrMs. SOM spot
membership of individual genes is indicated per row.
See also Figure S5.
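Panel (D) reports counts "scaled to expression range." One plausible reading, which is an assumption here rather than a documented detail of the authors' method, is per-gene min-max scaling so that each row spans 0 to 1 across cell types. A minimal sketch with hypothetical values:

```python
def scale_to_range(values):
    """Min-max scale one gene's normalized counts to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat profile carries no range information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def scale_rows(matrix):
    """Apply the scaling independently to every gene (row) of a count table."""
    return {gene: scale_to_range(row) for gene, row in matrix.items()}


# Hypothetical normalized counts across three cell populations.
counts = {"GENE_A": [2.0, 4.0, 6.0], "GENE_B": [5.0, 5.0, 5.0]}
scaled = scale_rows(counts)
```

Scaling each gene separately makes expression patterns comparable between highly and lowly expressed genes, at the cost of hiding absolute abundance.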
adenosine receptor ADORA2A (Figure 6C), which attenuates the phenotype of pro-inflammatory neutrophils (Barletta et al., 2012). Furthermore, we found increased expression of CD177 (Figure 6C), a cell surface receptor that modulates neutrophil migration and activation and serves as a marker for PR3-positive neutrophils, which, in turn, negatively affect T cell proliferation (Yang et al., 2018). Notably, MET, which has been linked to recruitment of immunosuppressive neutrophils in cancer (Glodde et al., 2017), was upregulated in neutrophils in a BrM-specific manner (Figure 6C). In sum, we have uncovered multiple disease-specific alterations of myeloid cells extending beyond BrM-TAMs to neutrophils, which has potential implications for the recruitment and activation of other cell types within the TME, including T cells.
TAMs Are Poised toward an Immunomodulatory Phenotype in BrMs
Although we found a significant accumulation of CD4+ and CD8+ T cells in BrMs versus IDH WT gliomas by FCM, this analysis of dissociated tissue samples lacks structural information. We thus performed neighborhood analysis of IF-phenotyped IDH WT and BrM tissue sections to elucidate whether there is a spatial relationship between TAMs and CD3+ T cells in BrMs. In IDH WT gliomas, T-MG and T-MDMs mostly neighbored homotypic cells while lacking T cells in their close vicinity (Figures 7A, 7B, and S7A), possibly reflecting the general T cell sparseness in these tumors. In contrast, both TAM populations neighbored T cells far more frequently in BrMs, indicating the potential for interaction (Figures 7A, 7B, and S7A).
We thus investigated the T cell activation state in BrMs in relation to unmatched healthy donor blood and also juxtaposed them to the corresponding populations from IDH WT gliomas. Compared with controls, CD4+ T cells from BrM showed evidence of a hyporesponsive, anergic phenotype (Figure 7C), whereas CD8+ T cells exhibited an exhaustion signature (Figure 7D), which usually occurs upon chronic activation, resulting in upregulation of inhibitory receptors. These defective T cell states can be caused by aberrant activation or T cell inhibition by tumor cells and antigen-presenting cells in the TME and are a major obstacle in treating cancers.
To delineate putative mechanisms in the BrM TME that may drive these alterations, we probed the RNA-seq data from CD45− cells, TAMs, and T cells (Figure 7E) for expression of activating and inhibitory immunomodulatory signals (Wei et al., 2018). This revealed upregulation of various canonical T cell
activators and co-activators but also mediators of inhibition in T cells (PDCD1/PD1, CD28, and CTLA4), whereas T cell-inhibiting and activating signals were detected in both TAM populations (CD274/PD-L1 and PDCD1LG2/PD-L2). Notably, we found an upregulation of CD80, which has diverse roles in T cell activation because it heterodimerizes with CD274, provides co-stimulatory signals to T cells via CD28, and exerts inhibitory effects via interaction with CTLA4 (Zhao et al., 2019), in both TAM populations compared with their normal references and IDH WT tumor populations (Figure 7E). The potential contribution of TAMs to metabolic immune evasion is also suggested by high expression of IDO1 and IDO2 (Zhai et al., 2018) in BrM (Figure 7E).
We investigated additional immunomodulatory mediators using weighted gene correlation network analysis (WGCNA; Langfelder and Horvath, 2008) and correlated the resulting expression patterns with paired FCM abundance of CD4+ and CD8+ cells in a disease- and cell-type-specific manner. We identified 15 unique co-expression modules showing significant correlation (p < 0.05) of their eigengenes (i.e., the first PC of the module expression data) with any of the provided sample traits (Figure S7B). Among these, the “brown” WGCNA module correlated with a specific BrM-MDM annotation and CD4+ T cell abundance. ORA of this module revealed signals for pathways such as coagulation and ECM modulation (Figure S7C) that affect the availability and activity of growth factors and cytokines within the TME (Mohan et al., 2020). We ranked genes by module membership strength and correlation with CD4+ T cell abundance, which identified several factors with opposing immunomodulatory functions (Figure 7F). Although the receptors CD300E and BST1 promote monocyte motility and survival (Isobe et al., 2018; Ortolan et al., 2019), we also detected effectors of immunosuppression, such as the actin-associated regulatory protein CNN2, which negatively regulates macrophage motility and phagocytic activity (Huang et al., 2008). The leukocyte immunoglobulin-like receptor subfamily B members LILRB2 and LILRB3, which attenuate myeloid cell activation (van der Touw et al., 2017), are also highly ranked genes within this module. Interestingly, LILRB2 has been identified as a novel myeloid immune checkpoint that limits antitumor immunity (Chen et al., 2018). We also found evidence of effects on T cells; CD52, which, in its soluble form, inhibits T cell function, was among the BrM-MDM module genes. The notion that BrM-MDMs undergo disease-specific alterations distinct from the primary extracranial tumor is supported by upregulation of these genes (Figure 7G) in our analysis of an external cohort of BrM samples compared with their matched primary tumor tissue (Varešlija et al., 2019).
DISCUSSION
Brain tumors, including glioblastoma and BrMs, confer some of the poorest prognoses for patients with cancer, with survival often measured in just months. Given the current dearth of effective therapeutic options for these patients and the modest effects of the various immunotherapies evaluated to date, it is of critical urgency to identify novel targets for future clinical evaluation. One potentially rich source of therapeutic targets is the TME. However, even though the TME is now widely accepted as an important regulator of cancer progression and therapeutic response, our knowledge of the brain TME is restricted to individual brain tumor types or cellular compartments and lacks comprehensive and integrative analysis.
In this study, we leveraged a diverse panel of analyses to deeply interrogate the immune landscape of primary and metastatic brain cancers. Through integration of multiparameter FCM analyses, RNA-seq data, TME cell culture assays, protein arrays, and spatial tissue characterization, we uncovered critical insights into the composition and transcriptomes of the most abundant immune cell populations in patient samples from IDH mut and WT gliomas and BrMs originating from distinct extracranial primary tumors.
By exploring the broad immune landscape, we uncovered several pronounced differences between gliomas and BrMs when directly compared side by side. In brain tumors, TAMs are composed of tissue-resident MG and recruited MDMs, and we found a significant shift in the ratio of MG to MDMs between IDH mut and IDH WT gliomas. Additionally, gliomas contain an abundance of TAMs, whereas T cells are far less numerous, particularly in IDH mut tumors. This confirms the notion that gliomas are immunologically cold tumors (Jackson et al., 2019). Although T cell sequestration in the bone marrow has been observed in glioma mouse models and following intracranial implantation of brain-extrinsic tumors (Chongsathidkiet et al., 2018), our clinical
Li, L., Yan, J., Xu, J., Liu, C.Q., Zhen, Z.J., Chen, H.W., Ji, Y., Wu, Z.P., Hu, J.Y., Zheng, L., and Lau, W.Y. (2014). CXCL17 expression predicts poor prognosis and correlates with adverse immune infiltration in hepatocellular carcinoma. PLoS ONE 9, e110064.
Li, Y., Xie, Y., Hao, J., Liu, J., Ning, Y., Tang, Q., Ma, M., Zhou, H., Guan, S., Zhou, Q., and Lv, X. (2018). ER-localized protein-Herpud1 is a new mediator of IL-4-induced macrophage polarization and migration. Exp. Cell Res. 368, 167–173.
Liberzon, A., Birger, C., Thorvaldsdóttir, H., Ghandi, M., Mesirov, J.P., and Tamayo, P. (2015). The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425.
Lim, M., Xia, Y., Bettegowda, C., and Weller, M. (2018). Current state of immunotherapy for glioblastoma. Nat. Rev. Clin. Oncol. 15, 422–442.
Liu, Q., Johnson, E.M., Lam, R.K., Wang, Q., Ye, B.H., Wilson, E.N., Minhas, P.S., Liu, L., Swarovski, M.S., Tran, S., et al. (2019). Peripheral TREM1 responses to brain and intestinal immunogens amplify stroke severity. Nat. Immunol. 20, 1023–1034.
Löffler-Wirth, H., Kalcher, M., and Binder, H. (2015). oposSOM: R-package for high-dimensional portraying of genome-wide expression landscapes on bioconductor. Bioinformatics 31, 3225–3227.
Long, G.V., Atkinson, V., Lo, S., Sandhu, S., Guminski, A.D., Brown, M.P., Wilmott, J.S., Edwards, J., Gonzalez, M., Scolyer, R.A., et al. (2018). Combination nivolumab and ipilimumab or nivolumab alone in melanoma brain metastases: a multicentre randomised phase 2 study. Lancet Oncol. 19, 672–681.
Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550.
Mantovani, A., Marchesi, F., Malesci, A., Laghi, L., and Allavena, P. (2017). Tumour-associated macrophages as treatment targets in oncology. Nat. Rev. Clin. Oncol. 14, 399–416.
Meng, X., Duan, C., Pang, H., Chen, Q., Han, B., Zha, C., Dinislam, M., Wu, P., Li, Z., Zhao, S., et al. (2019). DNA damage repair alterations modulate M2 polarization of microglia to remodel the tumor microenvironment via the p53-mediated MDK expression in glioma. EBioMedicine 41, 185–199.
Mohan, V., Das, A., and Sagi, I. (2020). Emerging roles of ECM remodeling processes in cancer. Semin. Cancer Biol. 62, 192–200.
Morianos, I., Papadopoulou, G., Semitekolou, M., and Xanthou, G. (2019). Activin-A in the regulation of immunity in health and disease. J. Autoimmun. 104, 102314.
Müller, S., Kohanbash, G., Liu, S.J., Alvarado, B., Carrera, D., Bhaduri, A., Watchmaker, P.B., Yagnik, G., Di Lullo, E., Malatesta, M., et al. (2017). Single-cell profiling of human gliomas reveals macrophage ontogeny as a basis for regional differences in macrophage activation in the tumor microenvironment. Genome Biol. 18, 234.
Murray, P.J., Allen, J.E., Biswas, S.K., Fisher, E.A., Gilroy, D.W., Goerdt, S., Gordon, S., Hamilton, J.A., Ivashkiv, L.B., Lawrence, T., et al. (2014). Macro-
Pyonteck, S.M., Akkari, L., Schuhmacher, A.J., Bowman, R.L., Sevenich, L., Quail, D.F., Olson, O.C., Quick, M.L., Huse, J.T., Teijeiro, V., et al. (2013). CSF-1R inhibition alters macrophage polarization and blocks glioma progression. Nat. Med. 19, 1264–1272.
Qiao, S., Qian, Y., Xu, G., Luo, Q., and Zhang, Z. (2019). Long-term characterization of activated microglia/macrophages facilitating the development of experimental brain metastasis through intravital microscopic imaging. J. Neuroinflammation 16, 4.
Quail, D.F., and Joyce, J.A. (2017). The Microenvironmental Landscape of Brain Tumors. Cancer Cell 31, 326–341.
Quail, D.F., Bowman, R.L., Akkari, L., Quick, M.L., Schuhmacher, A.J., Huse, J.T., Holland, E.C., Sutton, J.C., and Joyce, J.A. (2016). The tumor microenvironment underlies acquired resistance to CSF-1R inhibition in gliomas. Science 352, aad3018.
R Core Team (2018). R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing).
Racle, J., de Jonge, K., Baumgaertner, P., Speiser, D.E., and Gfeller, D. (2017). Simultaneous enumeration of cancer and immune cell types from bulk tumor gene expression data. eLife 6, e26476.
Rau, A., Gallopin, M., Celeux, G., and Jaffrézic, F. (2013). Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29, 2146–2152.
Richards, J., Gabunia, K., Kelemen, S.E., Kako, F., Choi, E.T., and Autieri, M.V. (2015). Interleukin-19 increases angiogenesis in ischemic hind limbs by direct effects on both endothelial cells and macrophage polarization. J. Mol. Cell. Cardiol. 79, 21–31.
Ripoll, V.M., Irvine, K.M., Ravasi, T., Sweet, M.J., and Hume, D.A. (2007). Gpnmb is induced in macrophages by IFN-gamma and lipopolysaccharide and acts as a feedback regulator of proinflammatory responses. J. Immunol. 178, 6557–6566.
Robinson, J.T., Thorvaldsdóttir, H., Wenger, A.M., Zehir, A., and Mesirov, J.P. (2017). Variant Review with the Integrative Genomics Viewer. Cancer Res. 77, e31–e34.
Rodón, L., Gonzàlez-Juncà, A., Inda, Mdel.M., Sala-Hojman, A., Martínez-Sáez, E., and Seoane, J. (2014). Active CREB1 promotes a malignant TGFβ2 autocrine loop in glioblastoma. Cancer Discov. 4, 1230–1241.
Sankowski, R., Böttcher, C., Masuda, T., Geirsdottir, L., Sagar, Sindram, E., Seredenina, T., Muhs, A., Scheiwe, C., Shah, M.J., et al. (2019). Mapping microglia states in the human brain through the integration of high-dimensional techniques. Nat. Neurosci. 22, 2098–2110.
Santana Carrero, R.M., Beceren-Braun, F., Rivas, S.C., Hegde, S.M., Gangadharan, A., Plote, D., Pham, G., Anthony, S.M., and Schluns, K.S. (2019). IL-15 is a component of the inflammatory milieu in the tumor microenvironment promoting antitumor responses. Proc. Natl. Acad. Sci. USA 116, 599–608.
Schalper, K.A., Rodriguez-Ruiz, M.E., Diez-Valle, R., López-Janeiro, A., Porciuncula, A., Idoate, M.A., Inogés, S., de Andrea, C., López-Diaz de Cerio,
STAR+METHODS
REAGENT or RESOURCE SOURCE IDENTIFIER
IF: AF555 donkey anti-rabbit IgG, 1:1000 dilution Thermo Fisher Scientific Cat#A31572; RRID:AB_162543
IF: AF555 donkey anti-mouse IgG, 1:500 dilution Thermo Fisher Scientific Cat#A32773; RRID:AB_2762848
IF: AF488 donkey anti-rat IgG, 1:500 dilution Thermo Fisher Scientific Cat#A21208; RRID:AB_141709
IF: AF647 donkey anti-rat IgG, 1:500 dilution abcam Cat#ab150155; RRID:AB_2813835
IF: DyLight755 donkey anti-goat IgG, 1:500 dilution Thermo Fisher Scientific Cat# SA5-10091; RRID:AB_2556671
IF: AF555 donkey anti-sheep IgG, 1:500 dilution Thermo Fisher Scientific Cat#A21436; RRID:AB_2535857
Biological Samples
Non-tumor, glioma and brain metastasis tissue Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland N/A
Non-tumor, glioma and brain metastasis tissue Memorial Sloan Kettering Cancer Center, New York, NY, USA N/A
Healthy donor blood Transfusion Interrégionale Croix-Rouge Suisse, Epalinges, Switzerland N/A
Healthy donor blood New York Blood Bank, New York, NY, USA N/A
Chemicals, Peptides, and Recombinant Proteins
DMEM-F12 (1:1), GlutaMAX GIBCO Cat#31331028
DMEM, high glucose, GlutaMAX, pyruvate GIBCO Cat#31966021
Penicillin/Streptomycin GIBCO Cat#15140122
Human recombinant CSF-1 R&D Systems Cat#216-MC-025
Ficoll-Paque Premium GE Cat#17-5442-02
Trizol Thermo Fisher Scientific Cat#15596018
Trizol LS Thermo Fisher Scientific Cat#10296028
Tween 20 Applied Chemicals Cat#A4974
Triton X-100 Applied Chemicals Cat#A4975
TNB Blocking Reagent Perkin Elmer Cat#FP1020
Fluorescence Mounting Medium Dako Cat#S302380
Critical Commercial Assays
Brain Tumor Dissociation Kit (P) Miltenyi Cat#130-095-942
Tumor Dissociation Kit, human Miltenyi Cat#130-095-929
Myelin Removal Beads Miltenyi Cat#130-096-733
CD14 MicroBeads, human Miltenyi Cat#130-050-201
Human TruStain FcX BioLegend Cat#422302
ZombieNIR Fixable Viability Kit BioLegend Cat#423106
High Capacity cDNA Reverse Transcription Kit Applied Biosystems Cat#4368814
TaqMan Universal PCR Master Mix Applied Biosystems Cat#4304437
Quantibody Array Q4000 ELISA Raybiotech Cat#QAH-CAA-4000-1
Deposited Data
RNAseq count data This paper https://joycelab.shinyapps.io/braintime/
Human reference genome, hg38 Genomic Data Commons https://gdc.cancer.gov/about-data/data-harmonization-and-generation/gdc-reference-files
TCGA LGG and GBM datasets Genomic Data Commons https://portal.gdc.cancer.gov/
TOIL TCGA TARGET GTEx datasets Vivian et al., 2017 https://xenabrowser.net/datapages/
Ivy Glioblastoma Atlas Project RNA sequencing data Puchalski et al., 2018 https://glioblastoma.alleninstitute.org/static/download.html
STRING Protein-Protein-Interaction database, version 10.5 Szklarczyk et al., 2017 https://version-10-5.string-db.org/cgi/download.pl
Molecular Signatures Database gene set collection Liberzon et al., 2015; Subramanian et al., 2005 https://www.gsea-msigdb.org/gsea/msigdb/
RNA sequencing count matrix from matched breast cancer primaries and brain metastases Varešlija et al., 2019 https://github.com/npriedig
Oligonucleotides
See Table S7 N/A
Software and Algorithms
FlowJo, version 10.4 BD https://www.flowjo.com/
BBDuk, version 38.12 Joint Genome Institute https://jgi.doe.gov/data-and-tools/bbtools/
STAR aligner, version 2.5.2b Dobin et al., 2013 https://github.com/alexdobin/STAR
R environment, version 3.5.0 R Core Team, 2018 https://www.r-project.org/
VIS Image Analysis, version 2019.7 Visiopharm https://www.visiopharm.com/
Other
gentleMACS Octo Dissociator Miltenyi Cat#130-095-937
gentleMACS C Tubes Miltenyi Cat#130-096-334
LS Columns Miltenyi Cat#130-042-401
SepMate-50 StemCell Cat#85450
PermaLife Cell Culture Bags OriGen Biomedical Cat#PL30-2G
LSR II flow cytometer BD N/A
Fortessa flow cytometer BD N/A
FACSAria III, flow cytometer & cell sorter BD N/A
Axio Scan.Z1 slide scanner Zeiss N/A
QuantStudio 6 Flex Applied Biosystems N/A
Omni Tissue Homogenizer (TH) Omni International Cat#TH220
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources should be directed to the Lead Contact, Johanna Joyce (johanna.joyce@unil.ch).
Materials Availability
This study did not generate new unique reagents.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional
and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical
standards.
Informed consent was obtained from all individual participants included in this study. The collection of non-tumor and tumor tissue
samples at the Centre Hospitalier Universitaire Vaudois (CHUV, Lausanne, Switzerland) was approved by the Commission cantonale
d’éthique de la recherche sur l’être humain (CER-VD, protocol PB 2017-00240, F25 / 99). Sample collection at Memorial Sloan Kettering
Cancer Center (MSKCC, New York, NY, USA) was approved by the institutional review board (IRB, protocols #06-107 and
#14-230). Non-tumor samples of cerebral cortex tissue were collected at CHUV during medically indicated surgical treatment of
patients with refractory epilepsy, and at MSKCC from normal brain distant from the tumor in patients with low-grade glioma or from
post-mortem samples of donors with no history of brain malignancy, collected through the rapid autopsy program.
Tissue specimens were immediately collected from the operating room and processed as described below. All patient-related data
and unique identifiers were removed so that human samples were anonymized before any further processing.
Pathological review and molecular analysis of tumor samples were performed as part of standard clinical care at the respective
locations (CHUV or MSKCC). In all glioma samples subjected to RNA sequencing, the IDH1 and IDH2 mutation status was verified
by inspecting the reads from the CD45- population that aligned to the IDH1 and IDH2 loci with the Integrative Genomics Viewer (IGV;
Robinson et al., 2017). For immunofluorescence sections, the tumor diagnosis was confirmed independently; for all non-tumor
samples, the absence of malignancy was likewise confirmed by a pathologist.
Peripheral blood and buffy coats were obtained from the Transfusion Interrégionale, Croix-Rouge Suisse (Epalinges/Lausanne,
Switzerland), the New York Blood Center (New York, NY, USA), and healthy donors.
METHOD DETAILS
Clinical sample processing, flow cytometry (FCM) and fluorescence activated cell sorting (FACS)
Tissue specimens were washed in HBSS and macro-dissected under sterile conditions. Parts of the tissue were either immediately
frozen by submerging the sample in liquid nitrogen-cooled 2-methyl butane (Sigma-Aldrich) or OCT-embedded (Tissue-Tek) before
freezing for subsequent sectioning and immunofluorescence staining. OCT embedding was performed by placing the sample in a
freezing mold filled with OCT and then submerging the mold in 2-methyl butane cooled with dry ice.
The remaining tissue was further processed with either the Brain Tumor Dissociation Kit (Miltenyi) for non-tumor tissue and gli-
omas, or the Tumor Dissociation Kit for BrMs (Miltenyi) using the gentleMACS Octo Dissociator (Miltenyi). Myelin debris in cell sus-
pensions from non-tumor and glioma tissues was removed by incubating the cells with Myelin Removal Beads (Miltenyi) and mag-
netic-activated cell sorting (MACS) using LS columns (Miltenyi) according to the manufacturer’s instructions. All tissue suspensions
were filtered through a 40 μm filter and underwent red blood cell lysis (BioLegend). Single cell suspensions were stained with a fixable
live-dead stain (Zombie NIR, BioLegend), Fc-blocked for 10 min (Human TruStain FcX, BioLegend) and then incubated with direct
fluorophore-conjugated antibodies for 20 min at 4 °C. All FCM antibodies were titrated in a lot-specific manner. Antibody details are
listed in the Key Resources Table. Cells were washed with PBS +2% fetal bovine serum (FBS) +0.5 mM EDTA and stored at 4 °C in the
dark until FAC-sorting.
All FCM acquisition was completed on either a BD Fortessa or a BD LSR II device (BD), and cell sorting was performed on a
FACSAria III (BD) using FACSDiva (BD). Cells were sorted directly into Trizol LS (Thermo Fisher Scientific) and immediately snap
frozen with liquid nitrogen. Analysis of FCM data was performed with FlowJo (BD).
TNB Blocking Reagent (Perkin Elmer), followed by incubation with primary antibody in the same buffer overnight at 4 °C. Primary
antibody information and dilutions are listed in the Key Resources Table. Sections were washed with PBS +0.2% Tween 20 before
incubation with fluorophore-conjugated secondary antibodies at a dilution of 1:500 in PBS +0.5% Tween 20 +1% TNB Blocking
Reagent +1 μg/ml DAPI at room temperature. Directly-conjugated primary antibodies were employed where indicated after an initial
round of primary and secondary antibody staining, to avoid potential for cross reactivity. Finally, sections were washed with
PBS +0.2% Tween 20 and mounted with Fluorescence Mounting Medium (Dako).
Stained tissue sections were imaged with an Axio Scan.Z1 slide scanner (Zeiss) equipped with a Colibri 7 LED light source (Zeiss)
using a Plan-Apochromat 20x/0.8 DIC M27 coverslip-corrected objective (Zeiss). All slides from the same staining panel were digitized
using identical acquisition settings.
low-grade glioma ‘‘TCGA-LGG’’ and high-grade ‘‘TCGA-GBM’’ and ‘‘frontal cortex’’ GTEx samples to integrate bulk glioma expression
data with unmatched non-tumor samples. MG- and MDM-specific marker genes were derived by identifying differentially expressed
genes in these two populations versus all other sorted populations in a pairwise fashion, determining the intersect and
ranking the resulting genes by their fold change versus the CD45- population. The 20 highest ranked genes were then used as
cell type-specific marker genes. Deconvolution of MG- and MDM-proportions in tumor and non-tumor sample expression data
was done with the EPIC R package (Racle et al., 2017) using these marker genes and providing the expression data from the sorted
populations as reference profiles. As the exact amount of RNA per cell in the estimated cell types is not known, the corresponding
per-cell mRNA content parameter was set to 1 for all cell types when running the deconvolution.
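The marker-selection step described above (pairwise contrasts, intersection, ranking by fold change versus CD45-, top 20) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R code: the function name `select_markers`, the input layout, and the toy gene names and values are all hypothetical.

```python
def select_markers(de_sets, log2fc_vs_cd45neg, n_top=20):
    """Pick up to n_top marker genes for one cell type (hypothetical helper).

    de_sets: list of sets, one per pairwise contrast (target population
        versus one other sorted population), each holding the genes called
        differentially expressed in that contrast.
    log2fc_vs_cd45neg: dict mapping gene -> log2 fold change versus the
        CD45- population; used only to rank the shared genes.
    """
    if not de_sets:
        return []
    # Keep only genes that are differentially expressed in every contrast.
    shared = set.intersection(*de_sets)
    # Rank the shared genes by fold change versus CD45-, highest first.
    ranked = sorted(shared, key=lambda g: log2fc_vs_cd45neg[g], reverse=True)
    return ranked[:n_top]


# Toy input (invented values, for illustration only).
toy_de_sets = [
    {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    {"GENE_A", "GENE_B", "GENE_C", "GENE_E"},
    {"GENE_A", "GENE_B", "GENE_C"},
]
toy_fc = {"GENE_A": 6.1, "GENE_B": 7.4, "GENE_C": 5.0, "GENE_D": 4.2, "GENE_E": 1.0}
toy_markers = select_markers(toy_de_sets, toy_fc, n_top=2)
```

The resulting marker list would then be passed, together with reference profiles from the sorted populations, to the deconvolution tool; that step is not reproduced here.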
Cell type abundance estimation in spatial Ivy Glioblastoma Atlas Project (GAP) data
The micro-dissected Ivy GAP (Puchalski et al., 2018) RNA-seq RSEM count data and sample annotation containing anatomical loca-
tion were downloaded from the Ivy GAP website (https://glioblastoma.alleninstitute.org/static/download.html) and normalized using
DESeq2. The relative abundance of cell types was estimated by deriving marker genes through a multinomial logistic regression
model on the normalized expression data of the FAC-sorted cell types of interest in IDH WT tumors and then computing the
GSVA enrichment scores in the Ivy GAP samples.
Expression analysis of external dataset of matched primary breast cancer and BrMs
RNA-seq raw count data from patient-matched primary breast tumors and corresponding BrMs (Varešlija et al., 2019) were
downloaded (https://github.com/npriedig/jnci_2018) and variance-stabilized using DESeq2. The statistical significance of gene expression
changes between primary tumors and BrMs was assessed with a two-tailed Wilcoxon signed-rank test on the variance-stabilized
counts.
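For readers unfamiliar with the paired test named above, the following is a minimal sketch of a two-tailed Wilcoxon signed-rank test for one gene across patient-matched samples. It is an illustration only: the source ran the analysis in R, whereas this version uses plain Python with exact enumeration of sign patterns, which is an assumption of the sketch and is practical only for small numbers of pairs. The function names are hypothetical.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank_exact(primary, brm):
    """Return (W+, two-tailed exact p) for paired values; zero differences are dropped."""
    diffs = [b - a for a, b in zip(primary, brm) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    if n > 20:
        raise ValueError("exact enumeration is only intended for small n")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    center = sum(ranks) / 2  # expected W+ under the null hypothesis
    observed = abs(w_plus - center)
    # Enumerate every assignment of signs to the ranks (2**n patterns).
    hits = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - center) >= observed - 1e-12:
            hits += 1
    return w_plus, hits / 2 ** n


# Toy paired expression values for a single gene (invented numbers).
w_plus, p_value = wilcoxon_signed_rank_exact([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

With five pairs that all change in the same direction, the smallest attainable two-tailed p value is 2/32, which shows why this test needs a reasonable number of matched pairs to reach conventional significance thresholds.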
Summary data are presented as mean ± standard error of the mean (SEM) or as Tukey boxplots drawn with ‘‘ggplot2.’’ Numerical data were
analyzed using the statistical tests noted within the corresponding sections of the article. Hierarchical clustering was performed using
Ward’s method with 1 − Pearson correlation coefficient as the distance metric unless noted otherwise. P values were annotated as
follows: * < 0.05, ** < 0.01, *** < 0.001, **** < 0.0001, ns ≥ 0.05.
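The conventions in this paragraph can be made concrete with a short sketch. The Python below is illustrative only (the source used R and ggplot2); the function names are hypothetical, and treating a p value of exactly 0.05 as ‘‘ns’’ is an assumption of this sketch.

```python
import math


def p_to_stars(p):
    """Map a p value to the annotation scheme described in the text."""
    if p < 0.0001:
        return "****"
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"  # assumption: p == 0.05 is treated as not significant


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


def pearson_distance(a, b):
    """Return 1 - Pearson r, the clustering distance named in the text."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1 - cov / (norm_a * norm_b)
```

The distance ranges from 0 for perfectly correlated profiles to 2 for perfectly anti-correlated ones, so profiles with the same shape cluster together regardless of their absolute expression level.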
Supplemental Figures
Figure S3. Analysis of DEGs and TAM Activation Patterns, Related to Figure 3
(A) Summary of contrasts applied when performing differential gene expression (DEG) analysis in MG and MDMs in gliomas (regardless of IDH status) and BrMs
(from all primaries) in comparison to normal controls (non-tumor brain MG and in vitro differentiated MDMs respectively) with the corresponding log2(fold-change)
versus -log10(adjusted p value) volcano plots. (B) Euler plot of the number of differentially expressed genes (DEG, log2(fc) > 1, p.adj < 0.01) that overlap in MG and
MDMs as shown in (A). (C) Molecular Signatures Database (MSigDB) ‘‘Hallmark’’ gene set collection overrepresentation analysis (ORA) in genes upregulated in
both gliomas and BrMs versus non-tumor brain tissue or healthy donors in MDMs and MG. Dot sizes reflect the fraction of gene set members
found within the analyzed DEGs, and dot color indicates cell type. (D) Heatmap of fold changes of macrophage M1 and M2 polarization marker genes (absolute
log2(fc) > 1, p.adj < 0.05) in MDMs and MG in gliomas and BrMs. Blank tiles indicate the lack of significant fold change. Genes are annotated with their canonical
stimuli and the associated polarization phenotype. (GC = glucocorticoid, Ic = immune complexes, IFNg = Interferon gamma, IL10 = interleukin 10, IL4 = interleukin
4, LPS = lipopolysaccharide, TGFb = transforming growth factor beta). (E) Overlap between leading edge metagenes (LEMs) in MG and MDMs in gliomas and
BrMs. Tile fill color indicates significance of overlap determined by hypergeometric testing (-log10(p.adj)). (F) String-DB protein-protein-interaction network of the
intersect from IFN Type-1 group 2 modules from LEMs ‘‘BrM-MG 1,’’ ‘‘Glioma MDM 1’’ and ‘‘BrM-MDM 4.’’ Genes selected for validation through qRT-PCR are
highlighted in red (corresponding data shown in Figure 3E). Node size indicates the centrality, while edge width corresponds to the String-DB interaction score
(only scores > 700, i.e., with a high degree of confidence have been included).
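The hypergeometric overlap test mentioned for panel (E) can be sketched as below. This is a minimal Python illustration, not the authors' implementation: the function name and arguments are hypothetical, the size of the gene universe is something the analyst must choose, and the multiple-testing adjustment behind the reported p.adj values is not shown.

```python
from math import comb


def hypergeom_overlap_p(universe_size, size_a, size_b, overlap):
    """Probability of seeing at least `overlap` shared genes by chance.

    Models drawing `size_b` genes without replacement from a universe of
    `universe_size` genes, of which `size_a` belong to the first gene set.
    """
    max_overlap = min(size_a, size_b)
    total = comb(universe_size, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe_size - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy example: two 3-gene sets from a 10-gene universe sharing all 3 genes.
toy_p = hypergeom_overlap_p(universe_size=10, size_a=3, size_b=3, overlap=3)
```

A small p value indicates that two metagenes share more genes than expected for random sets of the same sizes; the result is sensitive to the choice of universe, which is why that choice should be reported alongside the test.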
Figure S5. Protein Concentration in Bulk Tumor Tissues and Relation to Cell-Type-Associated SOM Spots, Related to Figure 5
(A) Bulk tissue protein concentrations of indicated proteins in non-tumor brain (n = 3), gliomas (n = 14) and BrMs (n = 12). Color indicates disease type and IDH
status. (B) Heatmap of self-organizing map (SOM) spot metagene expression across the analyzed samples. Rows were z-scored and have been hierarchically
clustered, columns were ordered by cell type, disease type and IDH mutation status.
Figure S7. Correlation of WGCNA Modules with External Traits and Module Pathway ORA, Related to Figure 7
(A) Representative immunofluorescence images in IDH WT gliomas and BrMs. Scale bars = 100 μm, boxed area is shown in higher magnification in Figure 7A. (B)
Heatmap of the weighted gene correlation network analysis (WGCNA) module eigengene (= first principal component of expression data, columns, module
columns are labeled with a color code) correlation to the traits (rows, cell type and disease, abundance of CD4+ or CD8+ T cells in % of CD45+). Values inside the
cells state Pearson’s r and the associated p value. (C) ‘‘Brown’’ BrM-MDM module MSigDB ‘‘C2CP’’ ORA results (p value < 0.01) enrichment map network
visualization. Node size represents p value, edge thickness reflects overlap of genes between gene sets.
Resource
Correspondence
donia@princeton.edu
In Brief
Each human has a diverse gut
microbiome, which can metabolize drugs
differently. In this resource, Javdan et al.
present a way to capture and grow much
of the unique diversity of human
microbiomes in culture and also a way to
detect many of our microbiome-derived
metabolites. Together, they use these
unique gut communities and the
metabolomics pipeline to see how
personalized microbiomes metabolize
drugs in different ways.
Highlights
• Development of subject-personalized ex vivo batch cultures of the gut microbiome
Resource
Personalized Mapping of Drug Metabolism
by the Human Gut Microbiome
Bahar Javdan,1,4 Jaime G. Lopez,2,4 Pranatchareeya Chankhamjon,1 Ying-Chiang J. Lee,1 Raphaella Hull,1 Qihao Wu,1
Xiaojuan Wang,1 Seema Chatterjee,1 and Mohamed S. Donia1,3,5,*
1Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
3Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
4These authors contributed equally
5Lead Contact
*Correspondence: donia@princeton.edu
https://doi.org/10.1016/j.cell.2020.05.001
SUMMARY
The human gut microbiome harbors hundreds of bacterial species with diverse biochemical capabilities.
Dozens of drugs have been shown to be metabolized by single isolates from the gut microbiome, but the
extent of this phenomenon is rarely explored in the context of microbial communities. Here, we develop a
quantitative experimental framework for mapping the ability of the human gut microbiome to metabolize
small molecule drugs: Microbiome-Derived Metabolism (MDM)-Screen. Included are a batch culturing sys-
tem for sustained growth of subject-specific gut microbial communities, an ex vivo drug metabolism screen,
and targeted and untargeted functional metagenomic screens to identify microbiome-encoded genes
responsible for specific metabolic events. Our framework identifies novel drug-microbiome interactions
that vary between individuals and demonstrates how the gut microbiome might be used in drug development
and personalized medicine.
INTRODUCTION
The oral route is the most common route for drug administration. Upon exiting the stomach, drugs can be absorbed in the small
and/or large intestine to reach systemic circulation and eventually the liver or directly transported there via the portal vein. In
the liver, drugs may be metabolized and secreted back to the intestines through bile, via enterohepatic circulation (Kimura
et al., 1994; Li and Jia, 2013). Even parenterally administered drugs and their metabolites can reach the intestines through
biliary secretion. Thus, whether prior to or after absorption, some administered drugs will spend a considerable amount of
time in the small and large intestines, where our human gut microbiome resides. It is therefore important to study gut
microbiome composition and function, specifically as it relates to drug interactions, while accounting for the significant
variability between individuals (Falony et al., 2016).
Broadly speaking, the microbiome interacts with drugs both directly and indirectly. Indirect interactions include competition
between microbiome-derived metabolites and administered drugs for the same host metabolizing enzymes (Clayton et al.,
2009), microbiome effects on the immune system in anticancer immunotherapy (Iida et al., 2013; Sivan et al., 2015; Vétizou
et al., 2015), microbiome reactivation of secreted inactive metabolites of the drug (Wallace et al., 2010), and overall
microbiome effects on the levels of metabolizing enzymes in the liver and intestine (Meinl et al., 2009). Direct interactions
include the partial or complete biochemical transformation of a drug into more or less active metabolites by microbiome-derived
enzymes (termed herein: Microbiome-Derived Metabolism, or MDM).
The human gut microbiome harbors hundreds of bacterial species, encoding an estimated 100 times more genes than
the human genome (Qin et al., 2010). This enormous diversity and richness represent a repertoire of yet-uncharacterized
biochemical activities capable of metabolizing ingested chemicals (Bäckhed et al., 2005; Koppel et al., 2017). Although MDM
has been observed in dozens of examples for the past 50 years, this process is still mostly overlooked in the drug development
pipeline where little to no effort is spent on determining the specific role of MDM in pharmacokinetics (Ilett et al., 1990; Li and
Jia, 2013; Scheline, 1973; Spanogiannopoulos et al., 2016). This is because of the vast complexity of the microbiome, and
overwhelming technical challenge of testing hundreds of drugs against thousands of cultured isolates under multiple conditions.
In contrast to liver-derived metabolism, we lack a systematic and standardized map of MDM, hindering our ability to reliably
predict and eventually interfere with undesired microbiome effects on drug pharmacokinetics and pharmacodynamics.
To address this gap in knowledge, we developed a quantitative experimental workflow for mapping MDM of orally
administered drugs using personalized gut microbiome-derived microbial communities (MDM-Screen). The methods and
Cell 181, 1661–1679, June 25, 2020 © 2020 Elsevier Inc. 1661
findings reported here provide a framework for discovering and characterizing novel cases of MDM and for potentially
incorporating an ‘‘MDM module’’ in the drug development pipeline.

RESULTS

Mapping the Capacity of a Single Subject’s Microbiome to Metabolize Hundreds of Drugs
A major challenge in studying the capacity of the human gut microbiome to metabolize orally administered drugs is the diversity
of bacterial species and strains involved (Almeida et al., 2019; Lloyd-Price et al., 2017; Nayfach et al., 2019; Pasolli et al.,
2019; Qin et al., 2010). Because it is impractical to systematically screen thousands of isolated strains against hundreds of drugs,
previous studies have relied mainly on monocultures of a selected set of representative species. However, gene expression
and biochemical transformation profiles vary dramatically between a strain grown in monoculture versus in a mixed
community. To address these challenges, we sought to develop an optimized ex vivo mixed culturing system that supports the
growth of a large proportion of the species from a given microbiome sample and is amenable to high-throughput (HT)
biochemical screens.
We began our screening efforts focused on a single microbiome donor (pilot donor, PD). To identify the medium and
culturing period that can support the growth of a batch culture whose composition is maximally similar to the original PD
microbiome, freshly collected and glycerol-stocked human feces from PD were cultured in 14 different media and sampled daily for four
days. We then extracted DNA from all samples, amplified the V4 region of the bacterial 16S rRNA gene, and deeply sequenced
the amplicons (Figure 1A). From the sequencing results, amplicon sequence variants (ASVs) were inferred and the taxonomic
composition at different levels was determined for each sample (see STAR Methods) (Bokulich et al., 2018; Bolyen et al., 2018;
Callahan et al., 2016; McDonald et al., 2012). We then quantified the differences between the various media and PD at the family
level (using the Jensen-Shannon divergence, DJS), and the variant recovery from PD at the single ASV level.
As expected, we observed a great level of variation in both the taxonomic composition and diversity between the different
media and culturing periods. Some media led to highly diverse communities that captured portions of the original fecal diversity,
while others became dominated almost exclusively by a single family. Among the 14 media commonly used in gut microbiome
cultivation efforts (Rettedal et al., 2014), we identified one, modified Gifu Anaerobic Medium (mGAM), that supported the growth
of a bacterial community most similar in composition and diversity to PD’s (Figures 1B and S1A). At the family level, mGAM
cultures largely match the composition of PD, differing primarily in a commonly observed expansion of the facultative anaerobes,
Enterobacteriaceae, at the expense of the obligate anaerobes, Ruminococcaceae (McDonald et al., 2018). Among all tested
media, mGAM cultures showed the lowest DJS divergence from PD, becoming increasingly similar to the original sample

versity across all media, and the closest one to PD) (Figure 1C). In PD, there are 33 ASVs present above a relative abundance of
1%, 26 (79%) of which are present in mGAM day two culture. Overall, total shared ASVs between PD and mGAM day two
account for 70% of the PD composition (by relative abundance), indicating that the mGAM culture recapitulates the bulk of the
original community. Taken together, and consistent with previous reports showing that mGAM can support the growth of a
wide variety of gut microorganisms in monoculture (Rettedal et al., 2014; Tramontano et al., 2018), our results support the
use of mGAM day two cultures as a viable ex vivo batch culturing model for the PD microbiome.
With an optimized ex vivo culturing system for PD in hand, we next developed a combined biochemical and analytical
chemistry approach to map the capacity of PD-derived microbial communities to metabolize clinically used, orally administered drugs
(MDM-Screen) (Figure 2A). Three samples were prepared per drug of interest: (1) a 24-h mGAM ex vivo culture of PD,
incubated with the drug of interest (final concentration 33 μM, in line with estimates of drug concentrations in the gastrointestinal
tract) (Maier et al., 2018), (2) a similar culture incubated with a vehicle control (DMSO), and (3) an equal volume of sterile
mGAM, incubated with the same drug concentration. Cultures and controls were then incubated for an additional 24 h at
37 °C in an anaerobic chamber, chemically extracted, and analyzed using high performance liquid chromatography
coupled with mass spectrometry (HPLC-MS). To verify the reproducibility, the entire procedure was repeated three
consecutive times. We tested a diverse library of 575 orally administered drugs, and although the majority of the drugs in this library are
currently being used in the clinic, less than 10% of them had been previously explored with respect to MDM (Table S1A). A
drug was deemed MDM+ when (1) a new metabolite was observed when incubated with PD culture or (2) the drug was
no longer detected when incubated with PD culture, indicating that it is either consumed entirely or metabolized into a molecule
that fails our detection, and (3) the drug was metabolized in the same manner during at least two of three independent
experiments.

MDM-Screen Identifies Known and Novel Drug-Microbiome Interactions
Among the 575 drugs, we successfully analyzed 438 (76%); the remaining 137 failed MDM-Screen because of issues related to
drug stability or incompatibilities with the extraction or chromatography methods employed (see Discussion). Of the
successfully analyzed drugs, we identified 57 (13%) as MDM+. These spanned 28 pharmacological classes and even more based on
their chemical structure (Figure 2B; Table S1B; Data S1). As expected, several previously reported MDM cases were identified.
These include the nitroreduction of the muscle relaxant dantrolene (Kuroiwa et al., 1985), the antiepileptic clonazepam (Elmer
and Remmel, 1984; Zimmermann et al., 2019b), and the antihypertensive drug nicardipine (Kuroiwa et al., 1986); hydrolysis of
the isoxazole moiety in the antipsychotic risperidone (Mannens
as growth proceeds (Figure S1A). et al., 1993; Meuldermans et al., 1994); and azoreduction of
Even at the single ASV level, mGAM cultures capture much of the anti-inflammatory prodrug sulfasalazine (Azadkhan et al.,
the diversity in PD (mGAM cultures have the highest Shannon di- 1982; Peppercorn and Goldman, 1972).
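The family-level comparison between cultured communities and the original fecal sample relies on the Jensen-Shannon divergence (D_JS). The following is a minimal sketch of that calculation, not the authors' pipeline: the function name, the dictionary input format, and the abundance values are illustrative assumptions. The natural logarithm is used because the text reports D_JS in base e.

```python
import math


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two compositions.

    p and q map a taxon name to an abundance (counts or fractions). Each
    composition is normalized to sum to 1; a taxon missing from one side
    is treated as having abundance 0 there.
    """
    p_total = sum(p.values())
    q_total = sum(q.values())
    if p_total <= 0 or q_total <= 0:
        raise ValueError("each composition needs a positive total abundance")
    divergence = 0.0
    for taxon in set(p) | set(q):
        p_i = p.get(taxon, 0.0) / p_total
        q_i = q.get(taxon, 0.0) / q_total
        m_i = 0.5 * (p_i + q_i)  # mixture of the two distributions
        if p_i > 0:
            divergence += 0.5 * p_i * math.log(p_i / m_i)
        if q_i > 0:
            divergence += 0.5 * q_i * math.log(q_i / m_i)
    return divergence


# Hypothetical family-level relative abundances (illustrative values only).
feces = {"Bacteroidaceae": 0.40, "Ruminococcaceae": 0.30,
         "Lachnospiraceae": 0.25, "Enterobacteriaceae": 0.05}
culture = {"Bacteroidaceae": 0.45, "Ruminococcaceae": 0.10,
           "Lachnospiraceae": 0.20, "Enterobacteriaceae": 0.25}
print(round(jensen_shannon_divergence(feces, culture), 3))
```

With the natural logarithm the divergence lies between 0 (identical compositions) and ln 2 (no shared taxa), so lower values indicate a culture closer to the original sample.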
More importantly, we identified a suite of novel MDM cases (45 cases, 80% of the MDM+ drugs): ten resulted from full depletion of the parent drug (or full conversion to a metabolite that evades our detection), while 35 resulted from the appearance of a new metabolite (Figures 2C and 2D; Table S1B; Data S1A). In most cases, the new metabolites showed a high-resolution mass spectrometry (HRMS) profile that differed only slightly from that of their parent drugs and/or a similar tandem MS fragmentation (HRMS/MS) pattern, indicating that they are derivatives (Wang et al., 2016) (Table S1B; Data S1B). An aggregate statistical analysis of MDM+ and MDM− drugs revealed specific structural features that are significantly enriched in MDM+ drugs (e.g., a steroidal skeleton, nitro groups, ketones, among others) (see STAR Methods; Table S1C).

Although HRMS and HRMS/MS analyses can narrow down the number of possibilities for the molecular structure of a given metabolite, they are not sufficient for full structural determination. Thus, we selected seven MDM+ examples for detailed characterization of their resulting metabolites: spironolactone (anti-hypertensive), tolcapone (anti-Parkinson’s), misoprostol (anti-ulcer), mycophenolate mofetil (immunosuppressant), capecitabine (anticancer), and finally, hydrocortisone and hydrocortisone acetate (two steroidal anti-inflammatory drugs that produced an identical MDM metabolite). In all but one example, no direct drug-microbiome interactions had been previously reported. The exception was hydrocortisone where several metabolites had been previously reported from individual gut isolates
(Ridlon et al., 2013; Winter et al., 1982), but the identity of the MDM metabolite we observed could not be accurately matched to any of them based on MS alone.

To unequivocally determine the structure of the resulting metabolites, we isolated them from scaled-up biochemical incubations with PD cultures and elucidated their structures using nuclear magnetic resonance (NMR) and/or comparison to an authentic standard (STAR Methods; Data S2). For hydrocortisone, we determined that MDM results in the reduction of the ketone group at C20, producing 20β-dihydrocortisone. For hydrocortisone acetate, the same modification occurs but is accompanied by deacetylation of the C21 hydroxyl group (Figure S2). While C20 ketone reduction was previously reported for hydrocortisone (to produce either 20β-dihydrocortisone or 20α-dihydrocortisone depending on the gut isolate incubated with it; Ridlon et al., 2013; Winter et al., 1982), neither MDM deacetylation nor C20 ketone reduction was reported for hydrocortisone acetate. For capecitabine, we show that MDM results in complete deglycosylation; for misoprostol and mycophenolate mofetil, we observed an ester hydrolysis transformation; and in the case of spironolactone, a thioester hydrolysis one. None of these MDM transformations were previously reported for these drugs. Finally, for tolcapone, we observed two consecutive transformations, a typical nitroreduction followed by a relatively uncommon N-acetylation, neither of which had been previously linked to the microbiome for this drug (Figure 2D). Taken together, these results establish MDM-Screen as a viable method for identifying both known and novel biochemical modifications of structurally and pharmacologically diverse drugs by the gut microbiome.

Interestingly, based on already known pharmacokinetic studies in humans, some of these new MDM cases may have direct consequences on the activation or toxicity of the drugs involved. For example, in the case of spironolactone we observed the production of 7α-thiospironolactone, a postulated intermediate en route to the drug’s main active metabolite, 7α-thiomethylspironolactone (Gardiner et al., 1989; Sica, 2005). For the prodrugs misoprostol and mycophenolate mofetil, we observed the production of their active metabolites misoprostol acid (Schoenhard et al., 1985; Tsai et al., 1991) and mycophenolic acid (Bullingham et al., 1998), respectively. Interestingly, mycophenolic acid has been linked to the clinically observed gastrointestinal toxicity associated with mycophenolate mofetil use (Taylor et al., 2019), albeit generated via a different route: hydrolysis of a biliary secreted glucuronide conjugate by gut microbiome-derived β-glucuronidases. Finally, in the case of tolcapone, N-acetylamino-tolcapone has been detected systemically in humans post-tolcapone administration and was suggested to be involved in liver toxicity observed clinically following tolcapone use, yet the mechanism of its production remains unknown (Jorga et al., 1999; Smith et al., 2003). Our discovery that the same metabolite can be produced via MDM provides a possible explanation, and a potential link between the gut microbiome and tolcapone toxicity. In all four cases, additional experiments need to be performed to differentiate the contribution of human- and microbiome-derived metabolism to the observed drug pharmacokinetics and/or toxicity in humans.

Expanding MDM-Screen to Multiple Subjects
Next, we sought to expand our framework to accommodate multiple subjects. To accomplish this goal, we needed to first design a generalizable quantitative metric for assessing the best culturing medium for microbiome samples (Figures 3A and 3B). In our analysis of PD’s ex vivo cultures, we applied a variety of metrics and found a medium that was the best trade-off between richness, evenness, and compositional similarity. However, this approach is not scalable to a large number of donors and also ignores the role of community biomass, which may lead to suboptimal media selection. Therefore, we developed a metric called Expected Number of Detectable Strains (ENDS), a corrected richness metric where the contribution of each ASV is weighted by the probability that its metabolite can be detected while considering total biomass (STAR Methods; Figure 3B; Methods S1). The core idea is that we desire a medium that supports the highest number of different bacterial ASVs, while ensuring that the metabolic contributions of these ASVs are detectable by our experimental method. ENDS utilizes two data inputs related to the ex vivo culture composition: relative abundance at a given taxonomic level and total community biomass; and it also utilizes two inputs related to the instrument detection sensitivity: a model of instrument background noise and a model of instrument measurement noise. Using this information, a simple mechanistic model of MDM metabolite production, and estimations of statistical power, we compute the probabilities that metabolic reactions performed by each strain will be detected in the ex vivo culture (see STAR Methods for a detailed mathematical description of ENDS).

With this quantitative framework in hand, we collected additional fresh fecal samples from 20 healthy donors (D1-20) and processed them in the same manner as PD. We then cultured each sample in nine representative media and used 16S rRNA gene sequencing to determine the composition of the cultured communities as previously described (Figures 3A–3C and S1; Table S2A). To measure community biomass, one mL of each culture was pelleted and weighed (STAR Methods; Tables S2C and S2D). We observed a wide variation in culture characteristics, with the richness ranging from 20–135 ASVs and mean
Figure 2. Screening of the PD Microbiome against Orally Administered Drugs Identifies Novel Drug-Microbiome Interactions
(A) Schematic representation of MDM-Screen. A drug was considered MDM+ if a new metabolite is produced (e.g., drug 3) or if the drug is no longer detectable
(e.g., drug 5) after incubation with the microbiome, as compared with abiotic media controls.
(B) A bar graph showing the pharmacological classes of MDM+ drugs discovered by MDM-Screen with the PD microbiome. ‘‘Others’’ include one drug each from
14 additional classes.
(C) Examples of MDM+ drugs where the drug is no longer detectable after incubation with the PD microbiome.
(D) Examples of MDM+ drugs where a new metabolite is discovered by MDM-Screen and fully characterized in this study.
See also Table S1; Data S1 and S2.
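The MDM+ call described in (A) and in the text (a new metabolite, or loss of the parent drug, seen in the same manner in at least two of three independent experiments) can be sketched as a small decision rule. This is an illustrative sketch rather than the authors' code: the `ExperimentResult` type, its field names, and the choice to label a run "new_metabolite" when both signals occur are assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ExperimentResult:
    """Outcome of one independent experiment for one drug (hypothetical type)."""
    drug_detected: bool   # parent drug still detected after incubation with the culture
    new_metabolite: bool  # metabolite seen with the culture but not in the controls


def metabolism_mode(result):
    """Return how the drug was metabolized in one experiment, or None."""
    if result.new_metabolite:
        return "new_metabolite"   # assumption: takes priority if both signals occur
    if not result.drug_detected:
        return "drug_depleted"
    return None


def is_mdm_positive(results, min_concordant=2):
    """MDM+ if the same mode of metabolism recurs in enough experiments."""
    modes = Counter(m for m in map(metabolism_mode, results) if m is not None)
    return any(count >= min_concordant for count in modes.values())


# Three hypothetical runs: a new metabolite appears in two of the three.
runs = [ExperimentResult(True, True),
        ExperimentResult(True, True),
        ExperimentResult(True, False)]
print(is_mdm_positive(runs))
```

Counting by mode, rather than counting any positive run, reflects the requirement that the drug be metabolized in the same manner across experiments.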
biomass density ranging from 2–27.9 g/L (Figure 3D). mGAM and Bryant and Burkey (BB) media consistently performed well with all 20 donors. Interestingly, mGAM had moderate ASV richness and high biomass, while BB yielded a much lower biomass with high richness and did not suffer from the Enterobacteriaceae expansion observed in mGAM. We calculated that a 70/30 BB/mGAM mixture would yield an optimal medium with moderate biomass, high richness, and a reduced Enterobacteriaceae expansion (Methods S1B); thus, we included this mixture (named BG) as a 10th medium in our culturing trials.

Next, we wondered whether the ex vivo cultured communities are truly personalized per subject, an important prerequisite if cultured communities are to be used for assessing inter-individual variability in MDM. Personalization between cultures was clearly observed at the ASV level, with clear specific patterns unique to individual donors and their cultures (Figure 3C). We found significantly more ASVs shared between donor feces and their self ex vivo cultures than non-self (47.1 versus 27.5 ASVs, p < 0.001, permutation test), partially recapitulating the inherent personalization between the donor fecal samples (Figures 3E and 3F). Moreover, we identified 167 ASVs that were unique to one of the 20 donors in their fecal samples (8.4 ASVs per donor on average) and were concordantly unique to the same donor in their ex vivo cultures (Table S2F). Finally, we grew and sequenced multiple replicates of mGAM and BG ex vivo cultures from different donors (all 20 donors, three replicates each for BG, and eight donors, six replicates each for mGAM). The analysis of these replicates, whether cultured from the same or separate glycerol stock aliquots, revealed a high correlation between ASV abundances of replicates from the same donor and ensured that the community assembly process was replicable (Pearson correlation coefficient >0.9) (Methods S1A; Figure S1). These analyses confirm that our approach results in personalized and replicable microbial communities.

To select a single medium that would be on average optimal for use in a 20-donor screen, we computed the average ENDS of all media across a range of reaction rates (Figure 3G). We found that BG is on average an optimal medium at reaction rates we estimated to be the most physiologically relevant (Methods S1D). In addition, BG recapitulates a large portion of the microbial community in the original fecal sample. On average, BG cultures recover 76.6% of the ASVs above 1% in the original fecal sample, which translates to 84.7%, 88.3%, and 92.7% recovery rate on the species, genus, and family levels of taxa above 1% in the original sample, respectively (Figure 3H). In terms of recovery of all elements, BG recovers 43.3%, 57.3%, 60.6%, and 62.8% on the ASV, species, genus, and family levels, respectively. BG is also, on average, the closest in composition to the original sample (D_JS = 0.16, computed at the family level in base e), has the highest average diversity (H = 4.3, computed at the ASV level in base 2), and shows the highest ASV richness (ranging from 34–133 ASVs with an average of 77). We therefore selected BG as the medium to use in our 20-donor screen.

We next sought to develop a HT, quantitative metabolomic approach to assess MDM inter-individual variability with a subset of drugs. Several improvements were made to the original drug metabolism screen. All experimental steps including incubation, chemical extraction, and HPLC-HRMS analysis were performed in microtiter 96-well plates instead of individual tubes, and at a 400 µL volume instead of 3 mL. This lowered the amount of drug used per incubation, allowed us to perform triplicate reactions simultaneously, and streamlined our chemical extraction and analysis procedures. We also spiked a known concentration of an internal standard prior to the chemical extraction, which allowed us to precisely quantify partial, in addition to complete, drug depletion.

We chose a 23-drug subset to test the ability of our quantitative approach to reveal potential inter-individual variability in MDM under the MDM-Screen conditions. Thirteen drugs had at least one defined metabolite with a known chemical structure, allowing us to unambiguously compare their levels between samples (Figure S3; Table S3). For all 20 donors, ex vivo cultures (in BG medium) were incubated in triplicate in a 96-well microtiter plate with each of the 23 drugs at a final concentration of 33 µM, or with DMSO (Figure 4A). In addition, an abiotic medium-drug plate as well as a heat-killed-microbiome-drug (HKM-drug) plate
were prepared in the same manner. After 24 h incubation, culture and control plates were chemically extracted and analyzed using HPLC-HRMS (STAR Methods).

We then devised a targeted metabolomics strategy to quantify MDM. We calculated percent drug remaining and metabolite level, to assess drug depletion and metabolite production in the presence of microbiome cultures, respectively. Both metrics were calculated using area under the curve (AUC) integration with normalization to the internal standard (see STAR Methods; Tables S3J–S3M). We determined statistical significance for metabolite production using one-sided Welch’s t tests between the donor-drug condition and the donor-DMSO, medium-drug, and HKM-drug conditions and corrected the resulting p values for multiple hypotheses using the Benjamini-Hochberg method, requiring that tests against all three control conditions have a false discovery rate (FDR) corrected p < 0.01 (Benjamini and Hochberg, 1995). For drug depletion, we used the same method with the donor-drug and HKM-drug conditions as controls and included an additional fold-change cutoff of two (Figures 4B and 4C; Tables S3N–S3P). We also performed untargeted metabolomics analyses for new metabolite discovery by identifying unique molecular features from all samples, determining statistical significance using similar methods as for the targeted metabolomics and verifying the metabolite’s relationship to the parent drug based on their HRMS/MS fragmentation pattern (Wang et al., 2016) (STAR Methods; Tables S3C and S3D). All verified metabolites from the untargeted metabolomics approach were then quantified using the same targeted metabolomics workflow described above (Figure 4C; Tables S3N–S3P).

We observed cases of consistently negative MDM across donors (ketoconazole, praziquantel, ropinirole, and torsemide), consistently positive MDM in either drug depletion (misoprostol, nicardipine, and spironolactone), metabolite production (tolcapone and vorinostat), or both (clonazepam, risperidone, and sulfasalazine), and variable MDM (Figures 4B–4D). This variability was in drug depletion (ketoprofen and levonorgestrel), metabolite production (misoprostol, nicardipine, and spironolactone), or both (capecitabine, clofazimine, digoxin, hydrocortisone, lovastatin, mycophenolate mofetil, sulindac, and vorinostat). We quantified this variability by computing the Shannon entropy (in base 2) of the distribution of metabolizers and non-metabolizers, denoted as H_V. This metric is maximal (H_V = 1) when half of donors metabolize the drug and is minimal when the drug is either always or never metabolized (H_V = 0).

The observed variability ranged widely from 1/20 to 19/20 donors deemed MDM+ for a given type of drug depletion or metabolite production. In the case of digoxin, for example, 3/20 donors (H_V = 0.61) produced the known metabolite dihydrodigoxin in statistically significant amounts (Figures 4C and 4E). Inter-individual variability in digoxin MDM has been clinically known for decades, where significant reduction of the drug into dihydrodigoxin and related metabolites occurs in only a subset of patients (Lindenbaum et al., 1981). These results demonstrate that our screen can quantitatively assess the inter-individual variability of MDM between personalized gut microbial communities cultured under identical ex vivo conditions. Follow-up studies will need to be performed to evaluate whether our screening results directly correlate with clinical outcomes.

Next, we sought to determine whether the depletion of drugs in our screen can be explained by the production of associated metabolites. If changes in drug levels are primarily because of conversion to a detected metabolite, there should exist a strong negative correlation between depletion and metabolite production, corresponding to a stoichiometric mass balance. The absence of such a correlation, on the other hand, would suggest additional events that are not accounted for (e.g., the production of additional unknown or undetectable metabolites, the conversion of the initial metabolite into a second one, or bacterial consumption of the parent drug). For drugs with variable MDM (H_V > 0.5 for at least one metabolite or the parent drug), we computed the Pearson correlation coefficient of the drug signal and the sum of known metabolite signals in all donor-drug ex vivo samples. We then determined whether a drug has statistically significant correlation by performing t tests, correcting p values using the Benjamini-Hochberg method, and requiring FDR corrected p < 0.01. For vorinostat and digoxin, for example, we found a significant negative correlation between metabolite production and drug depletion (Pearson correlation coefficient of −0.91 and −0.79, respectively), suggesting that the majority of drug
Figure 4. A HT, Quantitative Metabolomic Approach to Assess Inter-Individual Variability in MDM Using Personalized Microbial Communities
(A) Schematic representation of quantitative MDM-Screen with 20 donors and 23 selected drugs.
(B) Heatmap of drug depletion showing the mean fraction of drug remaining after 24 h for each donor-drug combination. The fraction remaining is computed
relative to the medium-drug control, and fractions above 1 are truncated to 1 for simplicity.
(C) Heatmap of metabolite production showing the mean level of metabolite after 24 h, normalized to the maximum level of that metabolite across all donors.
Metabolites in red were discovered using the untargeted metabolomics approach, while ones in black were discovered previously or by MDM-Screen with the PD
microbiome (Table S3B). In (B) and (C), *statistically significant metabolism in the donor condition as compared with controls. The upper inset axes represent
inter-individual variability in MDM using the Shannon entropy (calculated in base 2) of the distribution of donors with significant and non-significant metabolism.
(D) Cumulative histogram of the number of significant donors for both metabolite production and parent drug depletion. For parent drugs, the y axis is normalized
to the total number of drugs tested (23), and for metabolite production, it is normalized to the total number of metabolites produced (32).
(E) Levels of metabolite production (measured by HPLC-HRMS in AUC normalized to an internal standard) for four drugs, with the variability entropy indicated
above. Filled data points indicate that the replicates are significantly higher than control conditions, while hollow data points indicate that they are not.
(F) The upper three scatterplots show significant negative correlation between drug depletion and metabolite production, with the Pearson correlation coefficient
indicated above. The line shown is a linear regression fit of the data. The lower bar plot indicates the Pearson correlation coefficient between remaining drug levels
and total metabolite production for all computed cases. *FDR corrected two-sided t test p < 0.01. For drugs with multiple metabolites, we sum the normalized
AUC of all metabolites.
(G) Correlation between drug depletion and metabolite production for nicardipine before and after inclusion of metabolites discovered by untargeted
metabolomics.
See also Table S3.
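The variability entropy H_V shown in the inset axes of (B) and (C) is the base-2 Shannon entropy of the split between donors with significant and non-significant metabolism. The function below is a minimal sketch of that definition; its name and argument names are illustrative and do not come from the paper.

```python
import math


def variability_entropy(n_metabolizers, n_donors):
    """Base-2 Shannon entropy of the metabolizer / non-metabolizer split.

    Returns 0.0 when every donor (or no donor) metabolizes the drug and
    1.0 when exactly half of the donors do.
    """
    if n_donors <= 0 or not 0 <= n_metabolizers <= n_donors:
        raise ValueError("need 0 <= n_metabolizers <= n_donors and n_donors > 0")
    p = n_metabolizers / n_donors
    if p in (0.0, 1.0):
        return 0.0  # the limit of p * log2(p) as p approaches 0 is 0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# Digoxin example from the text: 3 of 20 donors produced dihydrodigoxin.
print(round(variability_entropy(3, 20), 2))  # 0.61, matching the reported H_V
```

Because the entropy is symmetric in p and 1 - p, a drug metabolized by 3 of 20 donors and one metabolized by 17 of 20 receive the same H_V.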
depletion can be explained by the production of the quantified metabolite (Figure 4F). Nicardipine, on the other hand, exhibited a very poor correlation initially (Pearson correlation coefficient of −0.22), implying that additional unknown factors are at play. Interestingly, our untargeted metabolomics pipeline detected 11 additional metabolites of nicardipine, which upon inclusion in the analysis resulted in a stronger negative correlation (Pearson correlation coefficient of −0.6, FDR corrected p = 0.0102) (Figure 4G; Table S3E). Since our screen is based on microbial communities and not individual strains, it provides a powerful platform to discover interacting factors that influence drug and metabolite levels under realistic conditions, as exemplified by the varying number of nicardipine metabolites observed per personalized community.

Next, we assessed whether we could predict MDM using taxonomic data. We computed Spearman correlations between absolute abundances of taxonomic elements (at different levels) in the BG ex vivo cultures and measured drug and metabolite levels in matching donors but found no significant correlations, even in specific cases of MDM where metabolism has been previously attributed to a single species (e.g., digoxin reduction by Eggerthella lenta) (Haiser et al., 2013). This is likely due to a combination of two factors. First, as has been previously observed (Haiser et al., 2013; Maini Rekdal et al., 2019), taxonomic classifications may not reflect the presence or absence of gene variants that encode strain-specific drug-metabolizing enzymes, even at the ASV level. Second, the observed level of MDM may not be monotonically dependent on a single taxon’s abundance if confounding community effects are at play. Examples of such effects include the contribution of several community members to the production of the metabolite(s), the consumption of the drug or metabolite(s), or the inhibition of the metabolite-producing or drug-depleting bacterium or enzyme (for a mathematical analysis of the impact of these factors on the correlation, see Methods S1C). These results emphasize the importance of considering whole community effects in MDM. While our ex vivo communities may not fully recapitulate all possible community effects that occur in humans, they represent an important step toward identifying and quantifying them.

Linking MDM to Specific Genes in the Human Microbiome
Next, we sought to link the observed biochemical transformations to specific microbiome-derived enzymes. We picked two representative cases of MDM transformations: MDM deglycosylation of capecitabine into deglycocapecitabine and C20 ketone reduction of hydrocortisone into 20β-dihydrocortisone. Three main approaches had been previously employed to identify genes responsible for a specific MDM transformation: comparative transcriptomics, which assumes that the expression of metabolizing enzymes is induced in the presence of their substrates (e.g., digoxin) (Haiser et al., 2013; Koppel et al., 2018), homology-based discovery, which assumes that related classes of enzymes metabolize similar substrates (e.g., levodopa) (Maini Rekdal et al., 2019; van Kessel et al., 2019; Zimmermann et al., 2019b), and HT mutagenesis screens, which identify metabolizing enzymes by isolating loss-of-function mutant strains (Zimmermann et al., 2019b). For capecitabine deglycosylation, we elected to use a homology-based approach.

Characterizing the Genetic Basis of MDM Deglycosylation Using a Homology-Based Approach
To identify a specific microbiome-derived isolate where a homology-based approach can be employed, we explored the ability of a limited panel of bacterial isolates to deglycosylate capecitabine, including strains isolated originally from PD. Interestingly, capecitabine deglycosylation was mainly performed by Proteobacteria (including Escherichia coli), and one of two tested Bacteroidetes: Parabacteroides distasonis, providing genetically tractable organisms for functional studies (Figure S4). In humans, thymidine phosphorylase (TP) and uridine phosphorylase (UP), both part of the pyrimidine salvage pathway, catalyze the deglycosylation of 5′-deoxy-5-fluorouridine (a late metabolite of capecitabine) to yield 5-fluorouracil (5-FU) (Temmink et al., 2007). To test whether bacterial homologs of human TP and/or UP are responsible for the observed MDM deglycosylation of capecitabine, we generated strains of E. coli BW25113 that are knockouts for TP (ΔdeoA), UP (Δudp), or both, and compared their ability to metabolize capecitabine with that of wild-type (WT) E. coli (Figure 5A). While WT E. coli efficiently deglycosylates capecitabine (30% conversion rate), the deglycosylating activity of the Δudp and ΔdeoA/Δudp strains is significantly diminished (less than 4% conversion rate, p value < 0.001, two-tailed t test) (Figure 5B). Surprisingly, the ΔdeoA strain showed a significant increase in its deglycosylating activity in comparison with WT (50% conversion rate, p value < 0.01, two-tailed t test), possibly because of a compensating mechanism (e.g., overexpression of udp) in the absence of deoA. These results indicate that microbiome-derived UP is, at least in part, responsible for the deglycosylation of capecitabine.

Capecitabine is one of several generations of antimetabolite chemotherapeutic agents, many of which are prodrugs for 5-FU, and are known collectively as the oral fluoropyrimidines (FPs) (Lamont and Schilsky, 1999; Longley et al., 2003). Importantly, oral FPs’ bioavailability and toxicity vary widely among patients (Cleary et al., 2017; Zampino et al., 1999), but the human gut microbiome’s contribution to this variability had not been explored. To determine whether deglycosylation occurs with other FPs, and whether the same enzymes are involved, we investigated the MDM of two additional oral FPs (doxifluridine and trifluridine) using WT and mutant E. coli. Unlike with capecitabine, almost complete deglycosylation was observed for both drugs with WT E. coli, and the activity was dependent on both TP and UP (Figures S4 and S5). These results indicate a level of deglycosylation specificity for TP/UP among the FPs (Figure 5C). Remarkably, the consequences of the same modification differ depending on the tested drug. For trifluridine, the resulting metabolite (trifluorothymine) is inactive (Figures 5D and S4): trifluridine is typically incorporated intact into DNA to cause cytotoxicity (Cleary et al., 2017; Lenz et al., 2015). Such a premature intestinal inactivation by the microbiome may thus be an unknown contributor to the established low bioavailability of trifluridine, in addition to the known contribution of human TP (Cleary et al., 2017). For doxifluridine, however, the resulting metabolite is the active 5-FU (Figures 5E and S5). This premature activation
Figure 5. Genetic Basis and Widespread Nature of MDM Deglycosylation among the FPs and in Human Gut Metagenomes
(A) Genetic organization of the udp and deoA loci in the genome of E. coli BW25113.
(B) A bar graph indicating percent conversion of capecitabine to deglycocapecitabine by wild-type (WT) E. coli BW25113 and Δudp, ΔdeoA, and ΔdeoA/Δudp
mutants (each tested in triplicate). ***p value < 0.001, while **p value < 0.01, two-tailed t test. Error bars represent the standard deviation.
(C) Biochemical reaction catalyzed by thymidine and uridine phosphorylases on their natural substrates.
(D) MDM deglycosylation of the oral anticancer drug trifluridine leads to its premature inactivation, since trifluorothymine is no longer active.
(E) MDM deglycosylation of the anticancer prodrug doxifluridine leads to its premature activation, since 5-FU is the intended active metabolite.
(F) Heatmaps indicating the prevalence (in percent of subjects harboring the gene of interest) and abundance (in median RPKM for all positive subjects) of E. coli-
derived deoA and udp across six gut metagenomic cohorts. The number of subjects analyzed in each cohort is indicated on the right. For cohorts with multiple
sub-cohorts, the values reported are the averages of the sub-cohort values.
(G) Jitter plots of E. coli-derived deoA and udp abundances (in RPKM) for all positive subjects in the same cohorts.
See also Figures S4 and S5; Table S4.
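The two cohort-level metrics in (F) and (G), prevalence and abundance in RPKM among positive subjects, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the input format, the 1,000 bp gene length, the read counts, and the rule that a subject is "positive" when at least one read maps to the gene are all hypothetical (the actual threshold is given in STAR Methods).

```python
from statistics import median


def rpkm(mapped_reads, gene_length_bp, total_reads):
    """Reads per kilobase of gene per million sequenced reads."""
    return mapped_reads / (gene_length_bp / 1_000) / (total_reads / 1_000_000)


def cohort_summary(samples, gene_length_bp):
    """Return (prevalence in percent, median RPKM among positive subjects).

    samples is a list of (reads_mapped_to_gene, total_sequenced_reads),
    one tuple per subject. Assumption for this sketch: a subject counts
    as positive when at least one read maps to the gene.
    """
    if not samples:
        raise ValueError("cohort has no samples")
    positive = [rpkm(mapped, gene_length_bp, total)
                for mapped, total in samples if mapped > 0]
    prevalence = 100 * len(positive) / len(samples)
    abundance = median(positive) if positive else 0.0
    return prevalence, abundance


# Hypothetical cohort of four subjects and a hypothetical 1,000 bp gene.
cohort = [(10, 1_000_000), (0, 2_000_000), (30, 1_000_000), (0, 1_000_000)]
print(cohort_summary(cohort, gene_length_bp=1_000))
```

Reporting the median over positive subjects only keeps abundance independent of prevalence, so a gene that is rare in a cohort but abundant where present remains visible.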
of the prodrug may therefore lead to gastrointestinal toxicity, again a side effect commonly associated with oral doxifluridine (Kim et al., 2001; Min et al., 2000). Additional studies are necessary to directly correlate the level of MDM deglycosylation of different FPs in humans to their clinically observed pharmacokinetics and/or toxicity.
Because capecitabine was significantly and variably metabolized into deglycocapecitabine in 17/20 donors, we sought to examine the representation of FP deglycosylating enzymes in the gut microbiome of the human population at large. We specifically focused on enzymes that we experimentally verified to have a role in FP deglycosylation: E. coli-derived TP and UP. Overall, we analyzed six large and diverse human cohorts: the Human Microbiome Project (HMP-1-1 and HMP-1-2, 299 subjects from the USA) (Human Microbiome Project Consortium, 2012; Lloyd-Price et al., 2017), the Metagenomics of the Human Intestinal Tract Consortium (MetaHIT, 219 subjects from Spain and 176 subjects from Denmark) (Nielsen et al., 2014), a Chinese cohort (194 subjects) (Qin et al., 2012), and a Fijian cohort (Fijicomp, 192 subjects) (Brito et al., 2016). We mapped fecal metagenomic reads from each of the cohort samples to the DNA sequence of deoA and udp, and calculated two metrics: prevalence, i.e., the percent of subjects from each cohort that are positive for a given gene, and abundance of the gene among positive samples (calculated in reads per Kbp per million of sequenced reads, or RPKM) (STAR Methods; Tables S4B and S4C). Interestingly, we found that both genes were most prevalent in non-Western cohorts (Chinese 74%/76% positive subjects, and Fijicomp 64%/70% positive subjects for deoA/udp, respectively) in comparison with Western ones (24%/26% on average, for deoA/udp, respectively) and that their abundance per positive samples varies widely within and between cohorts (from 10¹ to 10² RPKM) (Figures 5F and 5G; Tables S4B and S4C). These results indicate that FP deglycosylating enzymes are both widespread and variable in the gut microbiome of diverse human cohorts (even when considering the contribution of a single bacterial species, E. coli) and further highlight the importance of considering MDM deglycosylation of FPs in clinical studies.

An Untargeted Functional Metagenomic Screening Approach for Identifying Metabolizing Enzymes
Although the homology-based approach was relatively straightforward in identifying responsible species and enzymes for the deglycosylation of FPs, it is not widely applicable. Unlike pyrimidine phosphorylases, oxidoreductases (the enzyme class likely responsible for hydrocortisone reduction to 20β-dihydrocortisone) are extremely diverse and typically substrate specific, with numerous homologs found per bacterial genome. Moreover, homology-based discovery, as well as comparative transcriptomics and HT mutagenesis screens, typically require the identification of an isolated strain that performs the modification of interest and its use as the basis for genetic manipulations and/or functional analyses. These two limitations motivated us to employ an orthogonal strategy that is not reliant on enzymatic homology nor isolated strains.
While no human gut microbiome-derived enzymes had previously been deemed responsible for converting hydrocortisone into 20β-dihydrocortisone, a cat microbiome-derived enzyme had: a 20β-hydroxysteroid dehydrogenase (20β-HSDH) from the cat fecal isolate Butyricicoccus desmolans ATCC 43058 (Devendran et al., 2017). Neither this enzyme nor close homologs thereof (at 60% protein sequence identity or above) could be identified in a deep metagenomic sequencing dataset that we generated from the PD fecal DNA (STAR Methods). We therefore decided to use this example as a test case for developing an untargeted functional metagenomic screening strategy for metabolizing enzymes. In typical functional metagenomic screens, metagenomic DNA is cloned into a vector that replicates in E. coli and functional screens are performed in either a selective manner (e.g., for antibiotic resistance or an engineered circuit for survival) (Genee et al., 2016; Sommer et al., 2009; Uribe et al., 2019) or a visual readout (e.g., a colorimetric or antibacterial one) (Brady et al., 2002; Cohen et al., 2015; Gillespie et al., 2002; Rondon et al., 2000). Here, we use a functional metagenomic screen where the readout is a specific MDM transformation that is detected by MS. For metagenomic genes that are successfully expressed and produce functional gene products in E. coli, this approach would allow access to enzymes encoded by cultured and not-yet cultured members of the microbiome, and to ones that share no close homology with previously characterized enzymes. Two major technical challenges in this strategy, however, are to produce a large-enough metagenomic library that captures the majority of the genetic content in the complex microbiome, and to develop a HT analytical chemistry approach that permits the screening of such a library.
We isolated metagenomic DNA from PD and used it to construct a 3 × 10⁶-member clone library (PD-CL) in an E. coli expression vector (insert size 2–4 Kbp) (Figure 6A). To determine whether PD-CL is truly representative of the genetic content in PD, we deeply sequenced a representative pool that contains 10⁵ unique clones (PD-CL-100) and compared it with the deeply sequenced PD fecal metagenome. We mapped metagenomic reads from either PD or PD-CL-100 to assembled scaffolds from the PD metagenome (25,529 scaffolds ≥ 2 Kbp). Satisfyingly, reads from PD-CL-100 (which represents only 3% of the full PD-CL) map to 21% of the PD scaffolds, including ones that originate from all major bacterial phyla and varying coverages in the PD microbiome (Figure 6B; Table S4A). These results indicate that PD-CL represents a large component of the genetic content in PD and that it is adequate for use in functional metagenomics screens.
During the construction of PD-CL, we split it into 80 pools of 2–6 × 10⁴ unique clones (UCs) each and preserved them in corresponding glycerol stocks (see STAR Methods). We tested each of these pools for the ability to convert hydrocortisone into 20β-dihydrocortisone and identified six that showed significant metabolism. To reach a single functional clone, we performed 10-fold serial dilutions of a selected positive pool of 2 × 10⁴ UCs, by following positive sub-pools at the 2 × 10³, 2 × 10², and 2 × 10¹ UC levels. We then plated the 20-UCs positive sub-pool and screened individual clones in a 96-well plate format to reach a single positive clone: Hyd-red-1 (Figures 6C and S6). Sequencing of Hyd-red-1 revealed that it likely originated from a Bifidobacterium sp. Analysis of the genetic context of Hyd-red-1 in a PD scaffold revealed a single putative oxidoreductase in the cloned insert (Figure 6D). We then cloned and heterologously expressed this single gene and showed that it is
indeed a 20β-HSDH (Figures 6C and 6D). A second round of screening of PD-CL performed in a similar manner revealed a different clone, Hyd-red-2, harboring the same gene and confirming our findings (Figure 6D). These results indicate that combining MDM-Screen with a functional metagenomics approach is a valid strategy to link MDM transformations to metabolizing enzymes from diverse bacteria without the need for bacterial isolation.
We then sought to further probe the biological relevance of the discovered 20β-HSDH. Because it was discovered by heterologous expression of PD-derived DNA in E. coli, we wondered whether it is actually expressed under host colonization conditions. To answer this question, we isolated RNA from PD, subjected it to deep metatranscriptomic sequencing, and mapped resulting reads to the PD scaffold harboring the 20β-HSDH gene (see STAR Methods). We observed robust expression of the 20β-HSDH gene in PD-derived metatranscriptomic data but not of neighboring genes, suggesting that it is expressed individually and not as part of a gene cluster (Figure 6E). To determine whether the identified 20β-HSDH is unique to PD or widespread in the human population, we mapped fecal metagenomic reads from the same six human cohorts mentioned above to the DNA sequence of its gene and to that of a previously identified 20α-HSDH gene from the gut microbiome isolate Clostridium scindens ATCC 35704 for comparison (which converts hydrocortisone to 20α-dihydrocortisone) (Ridlon et al., 2013). While the C. scindens-derived 20α-HSDH gene was rare (present in only 0.6% of subjects, on average), the PD-derived 20β-HSDH gene was widespread in all cohorts (present in 36% of subjects, on average), and its abundance varied widely between subjects and cohorts (Figures 6F and 6G; Tables S4B and S4C).
Although Bifidobacterium adolescentis had been known to convert hydrocortisone into 20β-dihydrocortisone for almost 40 years (Winter et al., 1982), no responsible enzymes have been identified from it. Interestingly, while this manuscript was under revision, a different study published the crystal structure of a 20β-HSDH from B. adolescentis L2-32 (which is 98% identical to the 20β-HSDH we identified from the PD microbiome), further corroborating our findings (Doden et al., 2019). As mentioned above (Figure 2D), we also observed the production of 20β-dihydrocortisone from hydrocortisone acetate when incubated with PD. This transformation would require two steps: deacetylation at the C21 hydroxyl, by a yet-unidentified enzyme and reduction of the ketone at C20 by a 20β-HSDH. Interestingly, when we incubated hydrocortisone acetate with either P. distasonis or C. bolteae, it was deacetylated to yield hydrocortisone but not further reduced, implying that the two metabolic steps at play here can be uncoupled and performed by different members of the microbiome in a sequential manner (Figure S2).

MDM Deglycosylation Occurs In Vivo
Although MDM-Screen is able to uncover novel microbiome-drug interactions, it is unclear whether these results (observed ex vivo) can be recapitulated within the gastrointestinal tract of a live mammalian host (in vivo). To address this question, we sought to monitor one MDM transformation, MDM deglycosylation of FPs, in an in vivo pharmacokinetic study that is performed in a microbiome-dependent manner. Capecitabine was among the initial hits that resulted from MDM-Screen and its modification yields a novel metabolite (deglycocapecitabine) that has not been previously reported in humans or animals; we selected its MDM deglycosylation as a test case for in vivo studies and a proxy for other FPs. We treated two groups of C57BL/6 mice with a cocktail of antibiotics for 14 days to eliminate their native microbiome, then colonized one group with PD while the control group remained non-colonized (see STAR Methods). The two groups were then treated with a single human-equivalent oral dose of capecitabine (755 mg/kg), and blood and feces were collected from each mouse at 0, 20, 40, 60, 120, and 240 min post-drug administration (Figures 7A and 7B). We then quantified capecitabine and its metabolites in the serial fecal and blood samples using HPLC-HRMS. In blood samples, capecitabine and its major liver-derived metabolite (5′-deoxy-5-fluorocytidine), but not deglycocapecitabine, were readily detected and showed no significant differences between the two groups (Figure S7). In fecal samples, however, deglycocapecitabine was detected from animals colonized with PD
as early as 20 min after dosing and was almost completely absent in non-colonized ones (Figure 7C). These results indicate that, at least in the case of FP deglycosylation, MDM transformations observed ex vivo by MDM-Screen are recapitulated in vivo (i.e., in mice); establishing the same results in humans awaits further studies. They also suggest that MDM deglycosylation of certain FPs (e.g., doxifluridine, which is prematurely activated into 5-FU upon deglycosylation) should be investigated as a potential contributor to their undesired intestinal toxicity observed in the clinic, although future in vivo studies with different dosing regimens and a variety of FPs need to be performed.

DISCUSSION

In the current study, we developed a quantitative experimental workflow for assessing the ability of the human gut microbiome to directly metabolize orally administered drugs, using a combination of microbial community cultivation, small-molecule structural analysis, quantitative metabolomics, functional genomics and metagenomics, and mouse colonization assays. Several key differences set our approach apart from previous studies in this area. First, instead of relying on single isolates in performing the initial screen, we use well-characterized, subject-personalized microbial communities. Despite the technical challenges associated with characterizing and maintaining stable microbial communities in batch cultures, three main advantages make this strategy worth pursuing: (1) the extent of a biochemical transformation performed by single isolates cultured individually may be different than that performed by the same isolates when cultured as part of a complex community; (2) the net result of several members of the microbiome acting on the same drug can only be identified in mixed communities and not in single-isolate experiments, unless all pairwise and higher order permutations are tested; and (3) our strategy is “personalized.” The results obtained here (including the extent and type of certain modifications) are specific to the strain-level composition of each donor’s microbiome. MDM-Screen thus has a good potential for assessing inter-individual variability in MDM.
Second, most previous studies have focused on certain combinations of drugs and species that have historically been deemed important (e.g., have been readily observed in humans) or that are manageable experimentally. By default, our microbial-community setup allows us to screen a wider range of combinations, which enabled us to expand in either the drug or subject spaces and to discover drug-microbiome interactions never reported before. Notably, while this manuscript was under revision, an elegant study reported the screening of 271 orally administered drugs against 76 bacterial isolates of the human gut microbiome (Zimmermann et al., 2019a). Two thirds of the tested drugs were shown to be significantly depleted by at least one of the tested isolates, further emphasizing the great potential of gut microbes to metabolize orally administered small-molecule drugs. We view these two approaches as complementary: while screening drugs against optimized, well-characterized, donor-derived microbial communities in MDM-Screen provides a personalized view of drug metabolism that takes into account strain-level and community-wide contributions, screening drugs against a set of representative gut isolates streamlines the identification and characterization of specific taxon-drug and gene-drug interactions. Combined together, the results from the two approaches serve as a valuable resource for the scientific community to further study the mechanistic details and pharmacological consequences of newly discovered drug-microbiome interactions.
Despite these advances, our approach is still subject to several limitations. First, 24% of the drugs tested failed to be analyzed using the general analytical chemistry workflow described in MDM-Screen. These drugs fell into one or more of three main categories: unstable after overnight incubation in no-microbiome controls,
could not be extracted using ethyl acetate, or could not be analyzed using reverse phase chromatography. An alternative chemical analysis method will need to be developed for these molecules in order to assess their MDM. Second, we focused initially on oral drugs, yet several parenteral drugs and their liver-derived metabolites may be subject to important MDM transformations after biliary secretion. Third, even in our most diverse ex vivo cultures, we fail to support the growth of 100% of the community in the original sample. This limitation can potentially be overcome by utilizing multiple distinct media conditions that each capture unique portions of the community. ENDS provides the theoretical framework for selecting an optimal ensemble of media conditions, and we show in STAR Methods how to compute a version of ENDS that estimates the number of detectable strains gained by testing additional media.
We developed our screen in two stages. We began with a single human sample, PD, and incubated its ex vivo culture with 575 drugs. We then transitioned into a HT format with more rigorous methods for media selection, drug and metabolite quantification, and metabolite discovery and used these methods to screen ex vivo cultures from 20 human donors against 23 drugs. A simultaneous expansion into hundreds of drugs and hundreds of donor samples is necessary to reveal the complete biochemical potential of MDM: it is very likely that the types of MDM transformations observed here are an underestimation of all possible ones. With the HT experimental approach and automatic targeted and untargeted metabolomic analyses developed here, we have laid the groundwork for this expansion. Finally, and most relevant from a clinical standpoint, a direct comparison between drug metabolism outcomes in humans and in MDM-Screen for the same cohort of donors is important to establish which MDM transformations can be observed in humans, and to quantify the magnitude by which inter-individual variability in MDM-Screen recapitulates that which occurs in humans. Our quantitative framework (on both the microbial community and metabolomic angles) provides the necessary tools to perform such a comparison.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:
  ○ Ex vivo screening of the drug library with PD
  ○ Structural elucidation of selected metabolites
  ○ Molecular networking analysis in PD screen
  ○ Enrichment analysis for drugs in PD screen
  ○ Gene abundance analysis in metagenomic cohorts
  ○ ENDS (Expected Number of Detectable Strains)
  ○ High-throughput screen with D1-20
  ○ Targeted quantitative metabolomics analysis
  ○ Untargeted metabolomics analysis
  ○ Isolate screen for capecitabine
  ○ Metagenomic library construction
  ○ Functional screening of the metagenomic library
  ○ Heterologous expression of PD-derived 20β-HSDH
  ○ Metagenomic and metatranscriptomic analyses
  ○ TP and UP gene deletions in E. coli BW25113
  ○ MDM-Screen of capecitabine using E. coli mutants
  ○ MDM-Screen of other FPs using E. coli mutants
  ○ Microbiome-dependent pharmacokinetic experiment
● QUANTIFICATION AND STATISTICAL ANALYSIS

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.001.

ACKNOWLEDGMENTS

We would like to thank Wei Wang and the Lewis Sigler Institute sequencing core facility for assistance with HT sequencing; Matthew Cahn and Abhishek Biswas for assistance with sequencing data analysis; Shuo Wang for assistance with the functional group analysis; Joseph Koos, A. James Link, and Yuki Sugimoto for assistance with Mass Spectrometry; Riley Skeen-Gaar for assistance with statistical analysis; Joseph Sheehan and Zemer Gitai for assistance with obtaining the Keio library mutants; the Laboratory Animal Resources at Princeton University for assistance with mouse studies; Janie Kim for illustrating the graphical abstract; and members of the Donia lab for useful discussions. We are grateful to the 21 anonymous donors who provided the fecal samples that made this project possible. Figure S6 and a part of Figure 7 were created with BioRender.com. Funding for this project has been provided by an Innovation Award from the Department of Molecular Biology, Princeton University and an NIH Director’s New Innovator Award (1DP2AI124441), both to M.S.D. B.J. is funded by a New Jersey Commission on Cancer Research Pre-doctoral award (DFHS18PPC056), Y.-C.J.L. is funded by a training grant from the National Institute of General Medical Sciences (T32GM007388), and J.G.L. is funded by a National Science Foundation Graduate Research Fellowship (2017249408).
Maini Rekdal, V., Bess, E.N., Bisanz, J.E., Turnbaugh, P.J., and Balskus, E.P. (2019). Discovery and inhibition of an interspecies gut bacterial pathway for Levodopa metabolism. Science 364, eaau6323.
Mannens, G., Huang, M.L., Meuldermans, W., Hendrickx, J., Woestenborghs, R., and Heykants, J. (1993). Absorption, metabolism, and excretion of risperidone in humans. Drug Metab. Dispos. 21, 1134–1141.
McDonald, D., Price, M.N., Goodrich, J., Nawrocki, E.P., DeSantis, T.Z., Probst, A., Andersen, G.L., Knight, R., and Hugenholtz, P. (2012). An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J. 6, 610–618.
McDonald, D., Hyde, E., Debelius, J.W., Morton, J.T., Gonzalez, A., Ackermann, G., Aksenov, A.A., Behsaz, B., Brennan, C., Chen, Y., et al.; American Gut Consortium (2018). American Gut: an Open Platform for Citizen Science Microbiome Research. mSystems 3, e00031-18.
McKinney, W.G. (2010). Data structures for statistical computing in python. Proceedings of the 9th Python in Science Conference. 445, 52–56.
Meinl, W., Sczesny, S., Brigelius-Flohé, R., Blaut, M., and Glatt, H. (2009). Impact of gut microbiota on intestinal and hepatic levels of phase 2 xenobiotic-metabolizing enzymes in the rat. Drug Metab. Dispos. 37, 1179–1186.
Meuldermans, W., Hendrickx, J., Mannens, G., Lavrijsen, K., Janssen, C., Bracke, J., Le Jeune, L., Lauwers, W., and Heykants, J. (1994). The metabolism and excretion of risperidone after oral administration in rats and dogs. Drug Metab. Dispos. 22, 129–138.
Min, J.S., Kim, N.K., Park, J.K., Yun, S.H., and Noh, J.K. (2000). A prospective randomized trial comparing intravenous 5-fluorouracil and oral doxifluridine as postoperative adjuvant treatment for advanced rectal cancer. Ann. Surg. Oncol. 7, 674–679.
Nayfach, S., Shi, Z.J., Seshadri, R., Pollard, K.S., and Kyrpides, N.C. (2019). New insights from uncultivated genomes of the global human gut microbiome. Nature 568, 505–510.
Nielsen, H.B., Almeida, M., Juncker, A.S., Rasmussen, S., Li, J., Sunagawa, S., Plichta, D.R., Gautier, L., Pedersen, A.G., Le Chatelier, E., et al.; MetaHIT Consortium (2014). Identification and assembly of genomes
Ridlon, J.M., Ikegawa, S., Alves, J.M., Zhou, B., Kobayashi, A., Iida, T., Mitamura, K., Tanabe, G., Serrano, M., De Guzman, A., et al. (2013). Clostridium scindens: a human gut microbe with a high potential to convert glucocorticoids into androgens. J. Lipid Res. 54, 2437–2449.
Rondon, M.R., August, P.R., Bettermann, A.D., Brady, S.F., Grossman, T.H., Liles, M.R., Loiacono, K.A., Lynch, B.A., MacNeil, I.A., Minor, C., et al. (2000). Cloning the soil metagenome: a strategy for accessing the genetic and functional diversity of uncultured microorganisms. Appl. Environ. Microbiol. 66, 2541–2547.
Scheline, R.R. (1973). Metabolism of foreign compounds by gastrointestinal microorganisms. Pharmacol. Rev. 25, 451–523.
Schmieder, R., and Edwards, R. (2011). Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863–864.
Schoenhard, G., Oppermann, J., and Kohn, F.E. (1985). Metabolism and pharmacokinetic studies of misoprostol. Dig. Dis. Sci. 30 (11, Suppl), 126S–128S.
Shannon, P., Markiel, A., Ozier, O., Baliga, N.S., Wang, J.T., Ramage, D., Amin, N., Schwikowski, B., and Ideker, T. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504.
Sica, D.A. (2005). Pharmacokinetics and pharmacodynamics of mineralocorticoid blocking agents and their effects on potassium homeostasis. Heart Fail. Rev. 10, 23–29.
Sivan, A., Corrales, L., Hubert, N., Williams, J.B., Aquino-Michaels, K., Earley, Z.M., Benyamin, F.W., Lei, Y.M., Jabri, B., Alegre, M.L., et al. (2015). Commensal Bifidobacterium promotes antitumor immunity and facilitates anti-PD-L1 efficacy. Science 350, 1084–1089.
Smith, K.S., Smith, P.L., Heady, T.N., Trugman, J.M., Harman, W.D., and Macdonald, T.L. (2003). In vitro metabolism of tolcapone to reactive intermediates: relevance to tolcapone liver toxicity. Chem. Res. Toxicol. 16, 123–128.
Sommer, M.O.A., Dantas, G., and Church, G.M. (2009). Functional characterization of the antibiotic resistance reservoir in the human microflora. Science 325, 1128–1131.
STAR★METHODS
Continued
REAGENT OR RESOURCE SOURCE IDENTIFIER
L-arabinose Sigma Cat#A3256-25G
L-Cysteine Fisher Scientific Cat#ICN19464625
LB Broth Sigma Cat#L3522-1Kg
Levonorgestrel Santa Cruz Biotech Cat#SC205731
Liver broth Sigma Cat#61724-500 g
Lovastatin Sigma Cat#PHR1285
M17 BROTH Fisher Scientific Cat#OXCMO817B
Magnesium Sulfate, heptahydrate Sigma Cat#230391-500 g
Meat extract Sigma-Aldrich Cat#70164-500G
MegaX DH10b Electrocompetent Cells Thermo Fisher scientific Cat#C640003
Methanol Fisher Scientific Cat#A452-4
Metronidazole Sigma Cat#M3761
Milli-q water Millipore Corporation N/A
Misoprostol Sigma Cat#M6807
Misoprostol free acid Sigma Cat#M6932
MRS Broth Sigma/Millipore Cat#69966-500G
Mycophenolic acid Sigma Cat#M5255
Mycophenolate mofetil Sigma Cat#SML0284
Neomycin Sigma-Aldrich Cat#N1876-25G
Nicardipine Sigma Cat#N7510
Nitrofurantoin Sigma Cat#N7878
Pellet Paint NF Co-Precipitant EMD Millipore Cat#70748
pGFPuv Clontech Laboratories Cat#632312
Phusion High-Fidelity DNA Polymerase New England Biolabs (NEB) Cat#M0530S
Plasmid pCP20 encoding the FLP recombinase Gitai lab, Princeton University (Datsenko and Wanner, 2000)
Plasmid pKD46 expressing the Lambda Red recombinase Fischbach lab, Stanford University (Datsenko and Wanner, 2000)
Potassium phosphate monobasic Sigma Cat#P5655-500G
Praziquantel Sigma Cat#P4668
Reinforced Clostridial Medium Fisher Scientific Cat#DF1808-17-3
Resazurin Santa Cruz Biotechnology Cat#62758-13-8
Risperidone Sigma Cat#PHR1631
Ropinirole Sigma Cat#R2530
Screen-Well FDA approved drug library Enzo Life Sciences Cat#BML-2843-0100
Sodium Bicarbonate Sigma Cat#S6014-500 g
Sodium chloride Sigma-Aldrich Cat#S7653-5KG
Sodium hydroxide Sigma-Aldrich Cat#795429-500G
Spironolactone Sigma Cat#S3378
Sulfamethoxazole Sigma Cat#S7507
Sulfasalazine Sigma Cat#S0883
Sulindac Cayman Cat#38194-50-2
T4 DNA ligase NEB Cat#M0202T
Terrific Broth Fisher Scientific Cat#DF0438-17
Thioglycolate Broth Sigma Cat#70157-500 g
Tinidazole Santa Cruz Cat#sc-205862
Tolcapone Sigma Cat#SML0150
Trace Mineral Supplement ATCC ATCC MD-TMS
Trifluridine Sigma Cat#T2255-100MG
Tryptic Soy Broth Fisher Scientific Cat#DF0370-17-3
Tryptic Soy Broth Thermo Fisher Cat#DF0370-17-3
Trypticase Peptone VWR Cat#90000-434 (EA)
Tween 80 Sigma-Aldrich Cat#P1754-500ml
Vancomycin Sigma-Aldrich Cat#V2002-1G
Vitamin K1 Sigma-Aldrich Cat#95271-250MG
Vitamin Mix ATCC ATCC MD-VS
Voriconazole Cayman Cat#15633
Vorinostat Sigma Aldrich Cat#SML0061
Yeast extract VWR Cat#90000-026 (EA)
Zidovudine Sigma Cat#PHR1292
Experimental models: Organisms/Strains
Anaerococcus prevotii Fischbach lab, Stanford University N/A
Anaerostipes caccae DSM 14662 DSMZ DSM 14662
Clostridium bolteae ATCC BAA-613 ATCC ATCC BAA-613
E. coli BW25113 Keio collection (Baba et al., 2006)
E. coli BW25113 mutants harboring replacement of deoA or udp with a kanamycin resistance gene Keio collection (Baba et al., 2006)
E. coli BW25113 clean mutants of deoA or udp or both This study Methods
E. coli BL21 NEB Cat#C2527I
E. coli Stellar Clontech Laboratories Cat#636763
Enterococcus faecalis TYG’11 This paper Methods
Escherichia coli TYG1 This paper Methods
Escherichia coli TYG2 This paper Methods
Lactobacillus gasseri JV-V03 BEI Resources HM-104
Parabacteroides distasonis CL09T03C24 Comstock lab, Harvard Medical School N/A
Prevotella bivia ATCC 29303 ATCC ATCC 29303
Salmonella enterica SL484 Ravel lab, UMB N/A
Serratia marcescens Fischbach lab, Stanford University N/A
Oligonucleotides
Primers used in this study This study Methods
Software and Algorithms
Adobe Illustrator Adobe N/A
Agilent Profinder 8.0 Agilent N/A
Agilent Qualitative Analysis 10.0 Agilent N/A
Agilent Quantitative Analysis 9.0 Agilent N/A
Bowtie2 (Langmead and Salzberg, 2012) http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
Blast 2.7.1+ NCBI https://blast.ncbi.nlm.nih.gov/Blast.cgi
ChemDraw Professional 16.0 PerkinElmer N/A
Cytoscape (Shannon et al., 2003) https://cytoscape.org/
ENDS This study https://github.com/donia-lab/personalized_community_MDM_screen
Geneious R9 Geneious https://www.geneious.com/
Global Natural Products Social Molecular Networking (Wang et al., 2016) https://gnps.ucsd.edu
Illumina HiSeq Control Software Illumina N/A
Kraken-1.1.1 (Wood and Salzberg, 2014) http://ccb.jhu.edu/software/kraken
MATLAB R2017b Mathworks https://www.mathworks.com
Matplotlib 2.1.0 (Hunter, 2007) https://matplotlib.org/
MestreNova 10.0 Mestrelab Research N/A
NumPy 1.14.2 (Oliphant, 2006) https://www.numpy.org/
Open Babel (O’Boyle et al., 2011) http://openbabel.org
Pandas 0.22.0 (McKinney, 2010) https://pandas.pydata.org/
PRINSEQ-lite v0.20.4B (Schmieder and Edwards, 2011) http://prinseq.sourceforge.net/
ProteoWizard N/A http://proteowizard.sourceforge.net/
Python 3.6.3 Python Core Team https://www.python.org/
QIIME2 version 2018.6 (Bolyen et al., 2018) https://qiime2.org
SPAdes v3.11.0 (Bankevich et al., 2012) http://cab.spbu.ru/software/spades/
Targeted and untargeted metabolomics pipeline This study https://github.com/donia-lab/personalized_community_MDM_screen
Deposited Data
Sequencing data This study NCBI BioProject: PRJNA593062
Metabolomics data This study MassIVE: MSV000084641
Critical Commercial Assays
DNeasy Power Soil Kit(100) QIAGEN Cat#12888-100
End-It DNA End-Repair Kit Epicenter Cat#ER0720
In-Fusion HD Cloning System (50RXNS) TAKARA Cat#639646
Qiaprep Spin mini prep(250) kit QIAGEN Cat#27106
Qiaquick Gel Extraction Kit,250 QIAGEN Cat#8706
QIAquick PCR Purification kit QIAGEN Cat#28106
RNAlater Stabilization Solution Fisher Scientific Cat#AM7021
Zymoclean Large Fragment DNA Recovery Kit Zymo Research Cat#D4046
Animal Models
C57BL/6 mice Jackson Laboratories (Jax-West facility) N/A
Others
96-well plate, 1.0 mL, round wells, U shape, polypropylene, 32 mm, 50/pk Agilent Cat#5043-9305
Agilent 1290 Infinity II LC System Agilent N/A
Agilent 2100 Bioanalyzer Agilent N/A
Agilent 6120 quadrupole mass spectrometer Agilent N/A
Agilent 6530 Q-TOF LC/MS equipment Agilent N/A
Agilent 6545 Q-TOF LC/MS equipment Agilent N/A
Agilent Eclipse Plus C18 RRHD column 1.8 μm (2.1 × 50 mm) Agilent Part Number 959757-902
Agilent Poroshell 120 EC-C18 column 2.7 μm (2.1 × 100 mm) Agilent Part Number 695775-902
Agilent Poroshell 120 EC-C18 column 2.7 μm (4.6 × 100 mm) Agilent Part Number 695975-902
Agilent Poroshell 120 EC-C18 column 2.7 μm (4.6 × 50 mm) Agilent Part Number 699975-902
Anaerobic chamber COY N/A
EquaVAP 96-Well Blowdown Evaporator Fisher Analytical Sales & Services Cat#23096
g-TUBE Covaris N/A
Mega Bond Elut-C18 10 g Agilent Part Number 12256031
NanoDrop 2000 Thermo Fisher Scientific N/A
Nunc 96-Well Cap Mats, case of 50 ThermoFisher Cat#276000
Nunc 96-Well Polypropylene DeepWell Storage Plates, case of 60, 2 mL FisherScientific Cat#278743
Reservoir w/Lid 75 mL 25/Pk RV-L25 Mettler-Toledo Rainin LLC Cat#17007886
Sealing mat, 96 wells, round, preslitted, silicone, 50/pk Agilent Part Number 5043-9317
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Mohamed S. Donia (donia@princeton.edu).
Materials Availability
All unique/stable reagents generated in this study are available from the Lead Contact, but we may require a completed Materials
Transfer Agreement if there is potential for commercial application.
Mice
8-to-10-week-old male and female C57BL/6 mice (25–30 g) were purchased from Jackson Laboratories. All animals were housed and maintained in a certified animal facility, and all experiments were conducted according to the US Public Health Service Policy on Humane Care and Use of Laboratory Animals. All protocols were approved by the Institutional Animal Care and Use Committee, protocol 2087-16 (Princeton University). The sex and number of animals are specified for the pharmacokinetic study.
METHOD DETAILS
ex vivo culture of PD
A small aliquot (20 mL) from a PD glycerol stock was used to inoculate 10 mL of 14 different media: Liver Broth (Liver), Brewer Thiogly-
colate Medium (BT), Bryant and Burkey Medium (BB), Cooked Meat Broth (Meat), Thioglycolate Broth (TB), Luria-Bertani Broth (LB) (ob-
tained from Sigma Aldrich, USA), Brain Heart Infusion (BHI), MRS (MRS), Reinforced Clostridium Medium (RCM), M17 (M17) (obtained
from Becton Dickinson, USA), modified Gifu Anaerobic Medium (mGAM) (obtained from HyServe, Germany), Gut Microbiota Medium
(GMM (Goodman et al., 2011)), TYG, and a 1:1 mix of each (BestMix), and cultures were incubated at 37 °C in an anaerobic chamber.
One mL was harvested from each culture each day for 4 consecutive days, and centrifuged to recover the resulting bacterial pellets.
rotary evaporator (Speed Vac). This extraction method recovers organic molecules from both cells and broths of the cultures, and
therefore is not affected by cases of bacterial sequestration of the parent drugs. The dried extracts were suspended in 250 mL
MeOH, centrifuged at 15000 rpm for 5 min to remove any particulates, and analyzed using HPLC-MS (Agilent Single Quad, column:
Poroshell 120 EC-C18 2.7 µm, 4.6 × 50 mm, flow rate 0.8 mL/min, 0.1% formic acid in water (solvent A), 0.1% formic acid in acetonitrile
(solvent B), gradient: 1 min, 0.5% B; 1-20 min, 0.5%–100% B; 20-25 min, 100% B). We tested each drug twice, along with matching
no-drug and no-microbiome controls. If drugs were deemed positive for MDM in one or both of the two trials they were tested a third
time and analyzed using both HPLC-MS and HR-HPLC-MS/MS (Agilent QTOF, column: Poroshell 120 EC-C18 2.7 µm, 2.1 × 100 mm,
flow rate 0.25 mL/min, 0.1% formic acid in water (solvent A), 0.1% formic acid in acetonitrile (solvent B), gradient: 1 min, 0.5% B;
1-20 min, 0.5%–100% B; 25-30 min, 100% B). For selected molecules, cultures were scaled up and metabolites were isolated
and their structures were elucidated using NMR and/or comparison to an authentic standard obtained commercially using HPLC-
HRMS/MS (see below). See Data S1 for the chromatograms of all MDM+ metabolites.
Other than natural and synthetic classifications, we also looked more closely at functional groups that are enriched or depleted in
MDM+ drugs. We generated a list of 94 common functional groups and structural features and searched for them within all of the
drugs tested in our screen. To determine whether certain functional groups are enriched in MDM positive drugs, we aggregated
the SMARTS of common functional groups and the SMILES of all drugs within our screen. We then searched for these functional
groups within the drugs using the obgrep function within Open Babel (O’Boyle et al., 2011). We then tested for enrichment or deple-
tion of these groups within MDM+ drugs using two-sided proportion z-tests, correcting the resulting p values using the Benjamini-
Hochberg method and requiring that the false discovery rate (FDR) corrected p value is less than 0.01. The n for these tests is based
on the number of molecules with and without the functional group. Not surprisingly, we observed an enrichment of the following func-
tional groups in MDM+ drugs: nitro groups (FDR corrected p = 3e-16), ketones (FDR corrected p = 3e-8), carbonyl groups with one
carbon attachment (FDR corrected p = 8e-4), azo groups (FDR corrected p = 0.001), imines (FDR corrected p = 0.002), and alkenes
(FDR corrected p = 0.001). These results are consistent with common reduction and hydrolysis reactions often performed by gut bac-
teria. On the other hand, we observed a general depletion of arenes and nitrogen atoms in MDM+ drugs (FDR corrected p = 7e-5 and
FDR corrected p = 1e-7, respectively). However, when we excluded steroids and repeated the analysis we found that the depletions
in arenes and nitrogen atoms were no longer statistically significant (corrected p = 0.7 and corrected p = 0.7, respectively). This in-
dicates that the original statistically significant depletions were the result of steroids being a highly modified class that generally does
not contain these functional groups, rather than the functional groups themselves being important predictors for the lack of meta-
bolism (Figure 2D and Table S1). The exclusion of steroids, on the other hand, did not affect the observed enrichments we found
for nitro groups, imines, azo groups and ketones (FDR corrected p < 0.01). It is important to note that the results of our analysis
of MDM enrichment are based on a single subject’s microbiome, and should be repeated in the future with data from a much larger
set of donors.
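The enrichment test described above can be sketched in a few lines. This is an illustration only, not the code used in the study: the function names and the counts in the example are hypothetical, and the pooled-variance form of the z statistic is an assumption, since the text specifies only a two-sided proportion z-test followed by Benjamini-Hochberg correction.

```python
import math


def two_proportion_ztest(hits_with, n_with, hits_without, n_without):
    """Two-sided z-test comparing the MDM+ fraction among drugs with and
    without a functional group, using the pooled proportion (assumed form).

    Returns (z, p_value). If the pooled proportion is 0 or 1 the statistic
    is undefined, so (0.0, 1.0) is returned as a conservative fallback.
    """
    p_with = hits_with / n_with
    p_without = hits_without / n_without
    pooled = (hits_with + hits_without) / (n_with + n_without)
    variance = pooled * (1.0 - pooled) * (1.0 / n_with + 1.0 / n_without)
    if variance == 0.0:
        return 0.0, 1.0
    z = (p_with - p_without) / math.sqrt(variance)
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts, NOT values from the screen:
# (MDM+ with group, drugs with group, MDM+ without group, drugs without group)
example_counts = {
    "group_a": (12, 15, 45, 560),
    "group_b": (5, 60, 52, 515),
}
raw_p = [two_proportion_ztest(*c)[1] for c in example_counts.values()]
fdr_p = benjamini_hochberg(raw_p)
significant = [name for name, p in zip(example_counts, fdr_p) if p < 0.01]
```

The 0.01 cutoff on the adjusted p values matches the threshold stated in the text; everything else in the example is a placeholder.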
Where $E[N_s]$ is the expected number of detectable microbes, and $B(x_i)$ is the probability of microbe $i$'s reaction being detected with
an absolute population of size $x_i$.
How can we construct $B(x_i)$? In statistical terms, $B(x_i)$ is equivalent to the power (the probability of deciding there is a reaction
when the reaction is actually present) of the hypothesis testing method used to analyze the data. In this case, we are using a one-sided
unequal variances t test with cutoff $\alpha$. Now that we have a framework for calculating $B(x_i)$, we must relate a given $x_i$ to a
null and alternative distribution of measurements.
We assume that the metabolite measurements are composed of two types of signal: background noise, $X$, and compound signal, $Y$.
Both signals are assumed to be normally distributed (we show in Methods S1F that an empirically estimated power function provides
similar results). The background noise $X \sim N(\mu_1, \sigma_1)$ is the signal present when no actual metabolite is present. The compound signal
$Y \sim N(\mu_2, f(\mu_2))$ is the portion of the signal due to measurement of an actual metabolite. The measurements in the control condition
are modeled by $X$, while the measurements in the experimental conditions are modeled as
$Z = X + Y \sim N\left(\mu_1 + \mu_2, \sqrt{\sigma_1^2 + f(\mu_2)^2}\right)$.
We will now relate the population abundance $x_i$ to the mean of $Y$, $\mu_2$. In real terms, $\mu_2$ is the average level of a metabolite produced
by $x_i$. We assume the production of the metabolite is governed by the dynamics $d[M]/dt = r x_i$, where $[M]$ is the concentration of the
metabolite and $r$ is the rate of metabolite production per cell. If we assume the drug is added at stationary phase such that $dx_i/dt = 0$ for all
$t$ after drug addition, the total amount of metabolite produced is $[M] = t r x_i$, where $t$ is the incubation time. The rate $r$ can vary widely, and to
account for this it can be set using the rate of a known MDM reaction (see Methods S1D).
We must now estimate the distribution of X. We estimate this by computing the mean and standard deviation of spurious peaks
detected when samples not containing the compound being measured are quantified.
Now we must define the standard deviation of $Y$, $f(\mu_2)$. This is clearly dependent on the instrument being used. By plotting the
standard deviations of triplicate measurements from our machine against their mean, we can estimate $f(\mu_2)$. To ensure we are
capturing only measurement signal, we based our model only on measurements that are largely composed of measurement signal
(more than three standard deviations above the mean of the null distribution). We have found that a power law $f(\mu_2) = a\mu_2^b$ fits the
data well. With the distributions of $X$ and $Z$, we then estimate $B(x_i)$ using existing methods (Harrison and Brady, 2004).
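One common way to estimate the power-law parameters $a$ and $b$ is ordinary least squares on log-transformed values, since $\log f = \log a + b \log \mu_2$. The sketch below assumes that approach; the text does not state how the fit was actually performed, and the function name and the synthetic numbers are hypothetical.

```python
import math


def fit_power_law(means, sds):
    """Fit sd = a * mean**b by least squares in log-log space.

    Assumes all means and standard deviations are strictly positive.
    Returns the tuple (a, b).
    """
    xs = [math.log(m) for m in means]
    ys = [math.log(s) for s in sds]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope = exponent of the power law
    a = math.exp(y_bar - b * x_bar)    # intercept = log of the prefactor
    return a, b


# Synthetic triplicate summaries (made-up values) that follow sd = 2 * mean**0.5.
example_means = [1e3, 1e4, 1e5, 1e6]
example_sds = [2.0 * m ** 0.5 for m in example_means]
a_hat, b_hat = fit_power_law(example_means, example_sds)
```

A log-log fit weights relative rather than absolute errors, which is usually reasonable when the signal spans several orders of magnitude, as instrument intensities often do.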
What if we want to create an ensemble of multiple culture conditions to detect even more microbial reactions? To do this we can
define the expected number of new detectable microbes gained if we include another medium, $E[\Delta N_s]$. For each ASV in all media, we
take the product of the probability that the reaction is not detected in the existing media and the probability that the reaction is
detected in the new medium.
$$E[\Delta N_s] = \sum_{i=1}^{n} B(x_i) \prod_{j=1}^{m} \left(1 - B(y_{ij})\right)$$
Where $m$ is the number of media in the existing ensemble, and $y_{ij}$ is the abundance of ASV $i$ in existing medium $j$.
In our actual computation of ENDS, we exclude samples with fewer than 10,000 reads and take the optimal medium as the one with
the highest ENDS averaged across all twenty donors.
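The ensemble expression can be evaluated directly once per-ASV detection probabilities are available. The sketch below takes those probabilities as given inputs (it does not compute the power function itself), and the function name and example values are hypothetical.

```python
from math import prod


def expected_new_detectable(b_new, b_existing):
    """Evaluate sum_i B(x_i) * prod_j (1 - B(y_ij)).

    b_new[i]          detection probability of ASV i in the candidate medium
    b_existing[i][j]  detection probability of ASV i in existing medium j
    """
    total = 0.0
    for b_i, row in zip(b_new, b_existing):
        # Probability that ASV i is missed by every medium already in the ensemble.
        missed_everywhere = prod(1.0 - b_ij for b_ij in row)
        total += b_i * missed_everywhere
    return total


# Made-up probabilities for three ASVs and two existing media.
b_candidate = [0.9, 0.5, 0.2]
b_ensemble = [
    [0.5, 0.0],   # ASV 1 is already detectable half the time in medium 1
    [0.0, 0.0],   # ASV 2 is never detected by the current ensemble
    [1.0, 0.3],   # ASV 3 is always detected already, so it adds nothing
]
gain = expected_new_detectable(b_candidate, b_ensemble)
```

With an empty ensemble (every row empty) each product equals 1, so the function returns the plain sum of the candidate medium's detection probabilities.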
acquisition of 8 MS1 spectra/s, acquisition of 6 MS2 spectra/s, maximum of 5 precursors per cycle, precursor selection threshold of
2000 counts.
To verify that the concentration of the internal standard (voriconazole) used in the screen was below the saturation limit of the ma-
chine, we constructed a standard curve of voriconazole. 12 mL of 1 mM voriconazole was added to 228 mL of methanol and serial
dilutions were performed by a factor of three to cover the concentration ranges of 40 mM to 0.165 mM. These samples were analyzed
using the 2GHz setting described above to match the setting used for drug quantification in the multi-donor screen.
Drugs and their detected metabolites were quantified in the MS1 of all samples using MassHunter Quantitative Analysis with the
Agile2 integrator. The metabolites quantified here were either ones that we previously discovered during the PD screen, ones that
were previously reported in the literature, or novel metabolites from the multi-donor screen identified using untargeted metabolomics
and verified by molecular networking (See below, Table S3, Figure S3). For quantification of dihydrodigoxin, we required a highly
sensitive integration method to distinguish between dihydrodigoxin and the isotopes of digoxin, since parent and metabolite eluted
at similar retention times. To do this, we performed integrations within MassHunter Qualitative, specifying a mass range of 805.43 -
805.44 m/z for dihydrodigoxin. We verified this method could differentiate the two compounds using authentic standards of dihydro-
digoxin and digoxin, showing that it could accurately quantify dihydrodigoxin and that it did not detect dihydrodigoxin when only
digoxin was present.
Following quantification, all further data processing was performed in MATLAB. For each plate, we removed any samples whose
internal standard AUC was more than three interquartile ranges above the third quartile or below the first quartile. In order to correct
for differences in extraction efficiency, all peak areas in a given sample were divided by the corresponding internal standard area. This
ratio was then used for hypothesis testing and all other downstream analyses. For drug depletion, unadjusted p values were obtained
by one-sided Welch’s t tests testing whether drug levels are significantly lower in the donor-drug conditions than in controls; p values
were computed for tests against controls where the drug was incubated with BG medium (medium-drug) and incubated with the
heat-killed-microbiome (HKM-drug) controls. For all metabolite quantification statistical tests, n is the number of replicates passing
quality control (maximum 3). For metabolite production, unadjusted p values were obtained for one-sided Welch’s t tests testing
whether metabolite levels are significantly higher in donor-drug conditions than in control conditions; p values were computed for
tests against medium-drug, HKM-drug, as well as donor-DMSO (where the cultures are incubated with only the vehicle, DMSO) con-
trols. Correction for multiple hypotheses was performed using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995).
Metabolite production and drug depletion p values were adjusted separately. For depletion to be considered significant, we required
FDR corrected p value < 0.01 for both the medium-drug and HKM-drug adjusted p values and also depletion was required to be
greater than 50% relative to both controls. For metabolite production to be considered significant, we required that FDR corrected
p value < 0.01 for the medium-drug, HKM-drug and donor-DMSO p values.
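The quality-control and decision steps above can be sketched as follows. This is an illustrative Python rendering, not the MATLAB code used in the study: the function names are hypothetical, the quartile convention (Python's "inclusive" method) may differ from MATLAB's, and the adjusted p values are assumed to have been computed elsewhere.

```python
import statistics


def internal_standard_mask(istd_areas, k=3.0):
    """Return True for samples whose internal standard area lies within
    k interquartile ranges of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(istd_areas, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [low <= area <= high for area in istd_areas]


def normalize_peaks(peak_areas, istd_area):
    """Divide every peak area in one sample by that sample's internal standard area."""
    return {name: area / istd_area for name, area in peak_areas.items()}


def depletion_is_significant(adj_p_medium, adj_p_hkm,
                             remaining_vs_medium, remaining_vs_hkm,
                             alpha=0.01, min_depletion=0.5):
    """Apply both criteria from the text: FDR-corrected p < alpha against both
    controls, and more than 50% depletion relative to both controls.
    remaining_* is the fraction of drug left relative to that control."""
    return (adj_p_medium < alpha and adj_p_hkm < alpha
            and (1.0 - remaining_vs_medium) > min_depletion
            and (1.0 - remaining_vs_hkm) > min_depletion)


# Made-up internal standard areas for one plate; the last sample is an outlier.
areas = [100.0, 102.0, 98.0, 101.0, 99.0, 500.0]
keep = internal_standard_mask(areas)
ratios = normalize_peaks({"drug": 50.0, "metabolite": 5.0}, istd_area=100.0)
```

Filtering on the internal standard before normalizing prevents a failed extraction from producing inflated or deflated ratios in the downstream tests.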
We then looked for correlation between the results of the targeted metabolomics and the compositions of the BG communities. We
restricted the drugs and metabolites tested by requiring that they must have at least one associated compound (parent drug or
metabolite) with an inter-individual variability entropy > 0.5, and that there exists at least one sample with more than 20% of the
drug remaining relative to medium-drug controls. We tested only taxonomic elements present in at least three samples with a
biomass of at least 1 mg/L in at least one sample, and corrected the resulting p values for multiple hypotheses at each taxonomic
level using the Benjamini-Hochberg method. The n for these tests is based on the number of observations used to compute the cor-
relations. Spearman correlation was computed for this analysis. The mean BG community composition for each donor was used for
the correlation analysis. We tested taxonomic elements at the ASV, species, genus, and family levels.
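For reference, Spearman correlation is the Pearson correlation of rank-transformed values, with tied values receiving their average rank. The sketch below is a plain-Python illustration of that definition; the function names and the example abundances are hypothetical, and it returns only the coefficient, not a p value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant input")
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up per-donor values: taxon abundance versus fraction of drug remaining.
taxon_abundance = [0.1, 2.0, 5.0, 9.0, 30.0]
drug_remaining = [0.95, 0.80, 0.40, 0.35, 0.05]
rho = spearman(taxon_abundance, drug_remaining)
```

Because only ranks enter the calculation, the coefficient is unaffected by any monotonic rescaling of abundance or metabolite level.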
To determine whether a metabolite is linked to its parent by the molecular ion networking, we first identify whether the drug and
metabolite are present in the network. For this, we require that the retention time and mass found in the molecular ion networking
differ by less than 0.2 min and 0.02 Da, respectively, from the properties reported by the initial donor-drug stage of the pipeline.
We call the two compounds related if they are in the same connected component of the graph. In the cases where either the metab-
olite or the parent drug or both were not picked up in the molecular ion networking analysis, we deem the linkage ‘‘undetermined.’’
There are several reasons why the metabolites or drugs are not picked up in the analysis, including the abundance of the ions and the
number and abundance of fragment ions.
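The linkage rule can be sketched as a node lookup followed by a graph search. The data layout (a dictionary of nodes and a list of edges), the function names, the example values, and the label "not related" for the remaining case are all assumptions made for illustration; only the tolerances and the "undetermined" outcome come from the text.

```python
from collections import deque


def find_node(nodes, mass, rt, mass_tol=0.02, rt_tol=0.2):
    """Return the id of the first network node within both tolerances, else None.
    nodes maps node id -> (mass in Da, retention time in min)."""
    for node_id, (node_mass, node_rt) in nodes.items():
        if abs(node_mass - mass) < mass_tol and abs(node_rt - rt) < rt_tol:
            return node_id
    return None


def same_component(edges, start, goal):
    """Breadth-first search over an undirected edge list."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def classify_linkage(nodes, edges, drug, metabolite):
    """drug and metabolite are (mass, rt) pairs from the donor-drug stage."""
    drug_node = find_node(nodes, *drug)
    metabolite_node = find_node(nodes, *metabolite)
    if drug_node is None or metabolite_node is None:
        return "undetermined"
    if same_component(edges, drug_node, metabolite_node):
        return "related"
    return "not related"


# Made-up network with two connected nodes and one isolated node.
example_nodes = {"n1": (363.217, 8.10), "n2": (365.233, 7.95), "n3": (200.100, 3.00)}
example_edges = [("n1", "n2")]
result = classify_linkage(example_nodes, example_edges, (363.22, 8.0), (365.23, 8.0))
```

Returning "undetermined" when either compound is absent keeps a missing node from being read as evidence against a parent-metabolite relationship.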
Forward: CTGGTGCCGCGCGGCAGCCATATGGCAGACGAATCATCGAAGATTC
Reverse: GGTGGTGCTCGAGTGCGGCCCAAATGGGGTACGGTGATTAGAAGAC
Plasmid was recovered using a QIAprep Spin Miniprep Kit (QIAGEN, USA) and sequenced (Sanger) to confirm the presence of
the correct insert.
Cultures of BL21-DE3 E. coli harboring either the codon optimized or native sequence of the 20β-HSDH gene were grown from
glycerol stock overnight at 37 °C, 220 RPM. The next day, cultures were back diluted to an OD600 of 0.05, grown at 37 °C,
220 RPM until OD600 was 0.4 and induced with a final concentration of 1 mM IPTG. After four h of growth (from the time of back
dilution), 10 mL of 10 mM hydrocortisone was added. All cultures were then grown for 20 h at 37 °C, 220 RPM; chemical extraction and
HPLC-HRMS analysis were performed as described above.
only grow on LB, but not on LB-ampicillin (confirming the loss of the pCP20 plasmid), nor on LB-kanamycin (confirming the excision of
the kanamycin resistance gene) were confirmed to harbor the correct deletion using PCR and DNA sequencing. Primers
deoA-Check-F: 5′-CGCATCCGGCAAAAGCCGCCTCATACTCTTTTCCTCGGGAGGTTACCTTG-3′,
deoA-Check-R: 5′-CAAATTTAAATGATCAGATCAGTATACCGTTATTCGCTGATACGGCGATA-3′,
udp-Check-F: 5′-CGCGTCGGCCTTCAGACAGGAGAAGAGAATTACAGCAGACGACGCGCCGC-3′, and
udp-Check-R: 5′-TGTCTTTTTGCTTCTTCTGACTAAACCGATTCACAGAGGAGTTGTATATG-3′
were used in PCR experiments to confirm the deletion of the deoA or udp genes and the kanamycin resistance gene replacing
them (Baba et al., 2006). To construct the ΔdeoA/Δudp double knockout, the in-frame Δudp knockout obtained above was
used as a starting point. Plasmid pKD46 expressing the Lambda Red recombinase was transformed into it using electroporation
(Datsenko and Wanner, 2000) and transformants were selected on LB-Ampicillin at 30 °C for 16 h. One Ampicillin-resistant transformant
was then cultured at 30 °C in 50 mL of LB-Ampicillin, with an added 50 mL of 1 M L-arabinose to induce the expression of the
recombinase. At an optical density of 0.4-0.6, electrocompetent cells were prepared from the growing culture by serial washes in ice cold
10% glycerol, and 300 ng of a linear PCR product were transformed into them by electroporation. This PCR product was prepared by
using the deoA-Check-F and deoA-Check-R primers on a template DNA prepared from the deoA mutant of the Keio library, in which a
kanamycin resistance gene replaces deoA. After electroporation, transformants were selected on LB-kanamycin at 37 °C to induce
the loss of the temperature sensitive pKD46 plasmid, cultured in LB-kanamycin overnight at 37 °C, and checked by PCR to confirm
the correct recombination position. Finally, the kanamycin resistance gene was excised from the deoA locus by the FLP recombinase
using the same strategy explained above, resulting in the final ΔdeoA/Δudp mutant.
and transferred to an autosampler vial for HPLC-HR-MS analysis. For fecal samples, pellets were weighed (for later normalizations),
and suspended in 500 mL sterile Milli-Q water (Millipore Corporation, USA). 2 mL of an internal standard solution (0.5 mg / mL of vor-
iconazole) were added to the sample, and the mixtures were extracted with 500 mL 1:1 ethyl acetate: MeOH. Fecal debris were then
spun down and collected supernatants were dried under vacuum using a rotary evaporator (Speed Vac). The dried residues were
suspended in 100 mL MeOH. The final solutions were centrifuged at 15000 rpm and transferred to autosampler vials.
The prepared samples were analyzed by HPLC-HR-MS (Agilent QTOF). Chromatography separation was carried out on a Poroshell
120 EC-C18 2.7 µm, 2.1 × 100 mm column (Agilent, USA) with the gradient: 99.5% A, 0.5% B to 100% B in 20 min and a flow rate
of 0.25 mL/min, where A = 0.1% formic acid in water and B = 0.1% formic acid in acetonitrile. A 10 mL aliquot of the reconstituted
extract was injected into the HR-HPLC-MS system, and the Area Under the Curve (AUC) was integrated for each metabolite and
normalized by the internal standard’s AUC. Peak identities were confirmed by accurate mass, and by comparison of chromato-
graphic retention time and MS/MS spectra to those of authentic standards.
All statistical analyses were performed in MATLAB. p values less than 0.01 (after correction for multiple hypotheses, if applicable)
were considered significant. For comparisons of the means of two populations, Welch’s t test was generally used. In cases where
the independence assumption of this test was not met (as determined by the form of the null hypothesis), permutation tests were
used instead. Comparison of multiple means was done via ANOVA. Comparison of two proportions was done via a proportions
z-test. For all analyses, the meaning and value of n and the measures of center, dispersion, and precision used can be found in
the relevant main text or in Method Details.
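The text does not describe the permutation scheme that was used. As a generic illustration only, a two-sample permutation test on the difference in means can be sketched as below; the function name, the number of permutations, the fixed seed, and the example values are all assumptions.

```python
import random


def permutation_test_mean_diff(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p value for a difference in means.

    Group labels are shuffled n_permutations times; the p value is the share
    of shuffles whose absolute mean difference is at least the observed one,
    with the usual +1 correction so the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        perm_x, perm_y = pooled[:n_x], pooled[n_x:]
        diff = abs(sum(perm_x) / len(perm_x) - sum(perm_y) / len(perm_y))
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            extreme += 1
    return (extreme + 1) / (n_permutations + 1)


# Made-up measurements for two groups.
group_a = [10.0, 11.0, 12.0, 13.0, 14.0]
group_b = [0.0, 1.0, 2.0, 3.0, 4.0]
p_perm = permutation_test_mean_diff(group_a, group_b)
```

With a finite number of shuffles the smallest attainable p value is 1 / (n_permutations + 1), so the permutation count should be chosen with the 0.01 significance threshold in mind.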
Supplemental Figures
Figure S1. Expanded Bacterial Community Composition Analysis for PD, D1-20 and Their Ex Vivo Cultures, Related to Figures 1 and 3
A) Four-day time course of PD cultured ex vivo in various media. Family level bacterial composition of the original PD fecal sample (far left), as well as that of PD
ex vivo cultures grown anaerobically in 14 different media over four days (.01, .02, .03, .04). 16S rRNA gene amplicon sequences that could not be classified at the
family level, and families with less than 1% relative abundance in all samples are grouped into ‘‘Other.’’ Cultures are ordered according to their Jensen-Shannon
divergence (DJS) from the original PD sample (upper axes, computed at the family level), where lower values indicate higher similarity to PD. Note that cultures
grown in mGAM are the most similar to PD. B) Family level bacterial composition of the original D1-20 fecal samples and corresponding BG cultures. ‘‘Other’’
includes 16S rRNA gene amplicon sequences not classified at the family level and taxa that are below 5% abundance in all sequenced samples. Replicate BG
cultures for each donor are labeled as ‘‘BG_1,’’ ‘‘BG_2,’’ and ‘‘BG_3.’’ See also STAR Methods, Table S2.
Figure S2. Sequential Metabolism of Hydrocortisone Acetate by the PD Microbiome, Related to Figure 2
HPLC-MS analysis of hydrocortisone acetate (1) and hydrocortisone (2) incubated with PD mGAM.02 culture, mGAM.02 broth, P. distasonis culture, or C. bolteae
culture. An HPLC chromatogram at an absorbance of 254 nm is shown for all samples, indicating the conversion of hydrocortisone acetate to hydrocortisone by
both P. distasonis and C. bolteae, and the conversion of both hydrocortisone acetate (in two steps) and hydrocortisone (in one step) to 20β-dihydrocortisone (3) in
the presence of the PD microbiome.
Figure S3. Structures of All Drugs Quantified in the D1-20 MDM-Screen and Their Known or Predicted Metabolites, Related to Figure 4
Figure S4. MDM Deglycosylation of the Anticancer Drugs Capecitabine and Trifluridine, Related to Figure 5
A) A heatmap indicating the ability of each of 12 tested bacterial strains to perform capecitabine deglycosylation (d-G) in order to identify candidate species for the
homology-based gene finding approach. E. faecalis TYG11, E. coli TYG1, and E. coli TYG2 were originally isolated from PD. B) HPLC-MS analysis of trifluridine (1)
incubated with wild type E. coli BW25113 (WT), and Δudp, ΔdeoA, and ΔdeoA/Δudp mutants in M9 medium. An HPLC chromatogram at an absorbance of
250 nm is shown for all samples, indicating the conversion of trifluridine (1) to trifluorothymine (2) by wild type E. coli BW25113 (WT), Δudp, and ΔdeoA, but not the
ΔdeoA/Δudp mutant.
Figure S5. MDM Deglycosylation of the Anticancer Prodrug Doxifluridine, Related to Figure 5
HPLC-MS analysis of doxifluridine (1) incubated with wild type E. coli BW25113 (WT), and Δudp, ΔdeoA, and ΔdeoA/Δudp mutants in M9 medium. Extracted Ion
Chromatograms for both doxifluridine (1) and its resulting MDM metabolite 5-fluorouracil (2) are shown for all samples, indicating the complete conversion of
doxifluridine (1) to 5-fluorouracil (2) by wild type E. coli BW25113 (WT), Δudp, and ΔdeoA, but not the ΔdeoA/Δudp mutant.
Figure S6. Functional Metagenomic Screening for Hydrocortisone Metabolizing Enzymes in the PD Microbiome, Related to Figure 6
Schematic representation of the screening scheme followed to identify the 20β-HSDH gene from the PD microbiome metagenomic clone library.
Correspondence
mpsnyder@stanford.edu (M.S.),
mmelbye@stanford.edu (M.M.)
In Brief
Identification of blood metabolites in
pregnant women that can accurately
predict gestational age and provide
insights into pregnancy variations
undetected by ultrasound.
Highlights
• Weekly metabolome of maternal blood changes dynamically
through healthy pregnancy
Resource
Metabolic Dynamics and Prediction of Gestational
Age and Time to Delivery in Pregnant Women
Liang Liang,1 Marie-Louise Hee Rasmussen,2 Brian Piening,1,8 Xiaotao Shen,1 Songjie Chen,1 Hannes Röst,1,9
John K. Snyder,3 Robert Tibshirani,4 Line Skotte,2 Norman CY. Lee,3 Kévin Contrepois,1 Bjarke Feenstra,2
Hanyah Zackriah,5 Michael Snyder,1,10,* and Mads Melbye2,6,7,*
1Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
2Department of Epidemiology Research, Statens Serum Institut, Copenhagen, 2300, Denmark
3Department of Chemistry and the Chemical Instrumentation Center, Boston University, Boston, Massachusetts 02215, USA
4Department of Statistics and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
5Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
6Department of Clinical Medicine, University of Copenhagen, Copenhagen, 2200, Denmark
7Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
8Present address: Earle A. Chiles Research Institute, Providence Portland Medical Center, Portland, OR, 97213 USA
9Present address: Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON M5S 3E1, Canada
10Lead Contact
SUMMARY
Metabolism during pregnancy is a dynamic and precisely programmed process, the failure of which can bring
devastating consequences to the mother and fetus. To define a high-resolution temporal profile of metabo-
lites during healthy pregnancy, we analyzed the untargeted metabolome of 784 weekly blood samples from
30 pregnant women. Broad changes and a highly choreographed profile were revealed: 4,995 metabolic fea-
tures (of 9,651 total), 460 annotated compounds (of 687 total), and 34 human metabolic pathways (of 48 total)
were significantly changed during pregnancy. Using linear models, we built a metabolic clock with five me-
tabolites that time gestational age in high accordance with ultrasound (R = 0.92). Furthermore, two to three
metabolites can identify when labor occurs (time to delivery within two, four, and eight weeks, AUROC ≥ 0.85).
Our study represents a weekly characterization of the human pregnancy metabolome, providing a
high-resolution landscape for understanding pregnancy with potential clinical utilities.
1680 Cell 181, 1680–1692, June 25, 2020 © 2020 The Authors. Published by Elsevier Inc.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Table 1. Demographics and Birth Characteristics of the Discovery and Validation Cohorts
                                        Discovery      Test Set 1     Test Set 2
                                        N = 21         N = 9          N = 8
Demographics
Maternal age at birth, years            29.8 ± 3.1     29.7 ± 3.3     31.4 ± 1.0
Previous births, No. (%)
  0                                     13 (61.9)      6 (66.7)       4 (50.0)
  1                                     8 (38.1)       2 (22.2)       3 (37.5)
  ≥2                                    0 (0)          1 (11.1)       1 (12.5)
Pre-pregnancy BMI, kg/m2                22.1 ± 2.9     21.2 ± 3.4     21.1 ± 1.6
Smoking during pregnancy, No. (%)
  Yes                                   0 (0)          0 (0)          1 (12.5)
  No                                    18 (85.7)      9 (100)        6 (75.0)
  Missing                               3 (14.3)       0 (0)          1 (12.5)
Alcohol during pregnancy, No. (%)
  Yes                                   5 (23.8)       1 (11.1)       1 (12.5)
  Average number of units per week      0.80           1.0            0.25
  No                                    13 (61.9)      8 (88.9)       6 (75.0)
  Missing                               3 (14.3)       0 (0)          1 (12.5)
Birth characteristics
Gestational age, days                   281 ± 8.4      280.7 ± 8.3    279.3 ± 9.5
Mode of delivery, No. (%)
  Spontaneous vaginal birth             10 (47.6)      5 (55.6)       4 (50.0)
  Induced vaginal birth                 7 (33.3)       1 (11.1)       3 (37.5)
  C-section before onset of labor       1 (4.8)        3 (33.3)       1 (12.5)
  C-section during labor                3 (14.3)       0 (0)          0 (0)
Birth weight, grams                     3,638 ± 500    3,803 ± 662    3,362 ± 493
Birth length, centimeters               52.4 ± 2       53.3 ± 2       51 ± 2.3
Gender of child, No. (%)
  Male                                  9 (42.9)       5 (55.6)       5 (62.5)
  Female                                12 (57.1)      4 (44.4)       3 (37.5)
Values are means (SDs) or numbers (percentages).

et al., 2018). Therefore, a more accurate and cost-effective method for estimating gestational age and delivery time,
possibly using blood metabolites, is needed. In addition, current clinical tests often only focus on a few markers, whereas
research covering more molecules often examines the profiles at one or a few time points during pregnancy (Bahado-Singh
et al., 2012; Chan et al., 2003; Dudzik et al., 2014; Gagnon et al., 2008; Kenny et al., 2010; Koh et al., 2014; López-Hernández
et al., 2019; Romero et al., 2010; Sachse et al., 2012; Soldin et al., 2005). Thus, a high-resolution landscape of pregnancy-related
metabolites during healthy pregnancy and the postpartum period is still poorly understood.
Here, we use untargeted metabolomics (Kaddurah-Daouk et al., 2008) to systematically profile blood metabolites
throughout pregnancy with weekly sampling of maternal blood. The study identified a large number of pregnancy-related
metabolites and metabolic pathways offering a comprehensive view of the metabolite changes during healthy pregnancy and the
postpartum period. Leveraging the high-resolution datasets, we built a metabolic clock that not only predicts gestational age in high
accordance with the first-trimester ultrasound, the clinical gold standard, but also recovers personal pregnancy variations
undetected by ultrasound but capable of affecting delivery time.

RESULTS

Danish Pregnancy Cohort: A Study of Normal Pregnancy with High-Density Sampling
To capture the highly dynamic pregnancy process at high resolution, we established a multi-year single-center Danish normal
pregnancy cohort and a design of high-density blood sampling. Consenting female participants submitted weekly blood draws
beginning in week 5 of pregnancy and ending in the postpartum period. A total of 30 women with weekly blood samples were
assigned to a discovery (N = 21) and a validation (test set 1, N = 9) cohort (Table 1; Figures 1A and S1A). The samples were
analyzed in two separate years. In addition, another separate set of women (N = 8) was included as the secondary validation
cohort. These samples were analyzed independently three years apart from the discovery cohort (test set 2) (Table 1).

Weekly Pregnancy Progression Is Precisely Ordered by Metabolites
We randomized the 784 samples from the first 30 subjects within each cohort (Discovery and test set 1), processed them by using
a standardized protocol (Contrepois et al., 2015), and analyzed them by liquid chromatography-mass spectrometry (LC-MS)
for untargeted metabolomics across two separate years. After quality control, data filtering, and normalization (see STAR
Methods) (N = 30), we identified 9,651 metabolic features across the different samples. Of these, 4,995 features (51.7%) were
altered during pregnancy and/or the postpartum period (false discovery rate [FDR] < 0.05), suggesting extensive metabolic
changes occur during pregnancy. We examined the data globally with principal component analysis (PCA), in which the
samples were distributed on the basis of the first two principal components according to their gestational stages (Figure 1B; Scree
plot in Figure S1B and the partial least-squares discriminant analysis [PLS-DA] results in Figure S1C), regardless of individual
variation and batches (Figures S1D and S1E). Interestingly, we found that metabolites with uni-directional behaviors dominated
the features, and over half of them increased across pregnancy until reaching their peaks immediately before labor (Figures S1F
and S1G).
To understand the potential function of pregnancy-related metabolites, we first annotated metabolic features by using an
in-house library and a combined public spectral database (see details in STAR Methods). A total of 952 metabolic features
were mapped to 687 compounds, which include plasma metabolites with important functions in humans. We then applied
significance analysis for microarrays (SAM) to examine the correlation between the abundance of each compound and the
reported gestational age of a woman at blood sampling. Among
[Figure 1, panels B–E. (B) PCA plot with samples colored by gestational age in weeks (10–40) and postpartum. (C) Plot of metabolites labeled as top increased, top decreased, and all others. (D and E) Heatmaps of Log2(Intensity) across gestational weeks 16–40 and postpartum (PP), with metabolites grouped as amino acid metabolism, bile acid biosynthesis, caffeine metabolism, fatty acid metabolism, phospholipid metabolism, steroid hormone biosynthesis, and others.]
Figure 1. Untargeted Metabolomics Cluster the Weekly Plasma Samples Precisely According to Gestational Age
(A) Sampling scheme. Note that validation cohort refers to test set 1 in Table 1.
(B) Principal component analysis (PCA) distributed individual samples according to pregnancy stages (based on 9,651 features). The two PCs explaining the largest part of the variation are shown.
(C) Plot shows the top 15 increased (red) and decreased (blue) metabolites (with MSI level 1 or 2 identification) in pregnancy.
(D and E) Heatmap displays the metabolite signal intensity averaged across individuals, showing the top 68 altered metabolites (D) increased and (E) decreased by the end of pregnancy. Abbreviations are as follows: PP, postpartum. The gestational ages (GAs) were calculated by scaling delivery events to 40 weeks. The week order, which mostly coincides with the actual order, was determined by hierarchical clustering on the basis of Manhattan distances. The intensities averaged before 14 weeks of all women were used as the baseline.
See also Figure S1 and Table S1.

the 687 annotated compounds, 460 compounds were significantly associated with pregnancy (67.0%; FDR < 0.05, SAM). In addition, 264 compounds were identified with a metabolomics standards initiative (MSI) level 1 or 2 identification (Viant et al., 2017), among which 176 compounds (66.7%) were significantly associated with pregnancy, as determined by linear regression with gestational age (FDR < 0.05, SAM).

Our dense sampling revealed detailed temporal patterns of molecular changes. Among the top 68 metabolites (of the 176) that changed by more than 50% over the whole pregnancy, those that increased (N = 30) included the steroid hormones estriol-16-glucuronide, estrone 3-sulfate, and tetrahydrodeoxycorticosterone (THDOC). All three increased more rapidly than the well-known steroids progesterone and 17α-hydroxyprogesterone (FDR < 0.05, SAM) (Figures 1C and 1D). By contrast, the top metabolites that decreased during pregnancy (N = 38) were mostly lipids or lipid-like molecules, such as monoacylglycerides (MGs), lysophosphatidylcholines (LPC or lysoPC), and oleoylcarnitine (Figures 1C and 1E). Hierarchical clustering of the weekly samples on the basis of the top 68 altered metabolites revealed a week order mostly consistent with the actual progression of gestational age (Table S1; Figures 1D and 1E). Intriguingly, most of these metabolite changes rapidly returned to baseline after childbirth (postpartum) (Figures 1B, 1D, and 1E). Together, these results suggest a dramatic and programmed change of human blood metabolites at a system level during pregnancy.

Metabolite Groups Altered during Pregnancy
To detect the functional groups of metabolites that change during pregnancy, we performed correlation analysis on the temporal intensity profiles of the top 68 pregnancy-related compounds mentioned above. In Figure S2, metabolites that were significantly increased or decreased tended to cluster together. Using existing structural and biological information, we first categorized the top changing compounds in pregnancy into seven groups. Interestingly, compounds of the same group tended to cluster together in the correlation matrix. On the basis of the correlation relationships, we constructed a regularized partial correlation network using all pregnancy-related compounds to explore the potential regulatory relationships (Figures 2A and S2B). The topology of the network indicates that different metabolite groups occupied different positions; dense interactions occurred both between and within metabolite groups, with the densest interactions between central steroid hormones (Figure 2A). These findings highlight that even though the amount of each compound changes dynamically during pregnancy, a highly coordinated metabolite regulatory network underlies the pregnancy process.

We next examined the main clusters that were present in the correlation analysis. Three main clusters emerged from the hierarchical clustering of metabolites (Figure S2A), with a steroid cluster (e.g., androstane-3,17-diol, estriol-16-glucuronide, progesterone, 17α-hydroxyprogesterone, and THDOC) sitting between the large clusters of lipids and non-lipid molecules. Compared with the other steroids in this cluster that slowly but steadily increased throughout pregnancy, estriol-16-glucuronide exhibited a rapid increase before week 24 (Figures 1D and 2B). Nearly all upregulated metabolites positively correlated with this cluster of steroids, whereas all downregulated metabolites negatively correlated with this cluster (Figure S2A). This result suggests that different steroid hormones might regulate global metabolome dynamics during pregnancy.

Within the lipid cluster, intra-correlation was relatively high. The largest cluster was composed of LysoPCs (Figures 2A and S2A), a class of phospholipids. LysoPCs gradually decreased during pregnancy and increased after childbirth in a pattern that highly correlates with the steroid dehydroepiandrosterone sulfate (DHEA-S) (Figure 2C). LysoPCs are bioactive pro-inflammatory lipids that have been linked with organismal oxidative stress and inflammation (Sevastou et al., 2013). The second-largest cluster of lipids included several free fatty acids that were highly correlated within the cluster (Figures 2A and S2A). Long-chain fatty acids showed intricate dynamics in their amounts, revealed by the dense sampling. Hexadecadienoylcarnitine and tetracosahexaenoic acid (THA) decreased at the beginning of pregnancy, followed by waves of increased amounts similar to other fatty acids in the late second and third trimesters (Figure 2D). After childbirth, the amounts of most long-chain fatty acids decreased, except for hexadecadienoylcarnitine (Figure 2D).

Within the non-lipid cluster, one sub-cluster included five highly correlated metabolites belonging to the same caffeine metabolism pathway (Figures 2A and S2A). All five metabolites were consistently elevated during pregnancy, and caffeine reached a concentration three times higher at the end of pregnancy than at the beginning (Figure 2E). This elevation might be due to a slower caffeine metabolism in pregnant women rather than an increase in coffee intake (Knutti et al., 1981). Overall, among the 68 top-altered metabolites in pregnancy, functional metabolite groups (e.g., steroids, LysoPCs, fatty acids, and caffeine metabolites) were altered in an orchestrated manner during pregnancy, and individual compounds within each group were correlated with one another (Figure 2A).

Orchestrated Metabolome Reconfigurations Span Multiple Pathways during Pregnancy
Next, we longitudinally examined the global pathway changes of all 687 annotated compounds during normal pregnancy. Among the 48 mapped Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, 34 showed significant changes (70.8%, adjusted FDR < 0.05, global test) (Goeman et al., 2004; Xia and Wishart, 2010a) through MetaboAnalystR (Figure 3A) (Chong et al., 2018), suggesting large-scale pathway changes of metabolism in pregnancy. To quantify the pathway activities through gestational age, we calculated the average intensity of metabolites in the pathways (Figure 3B; see STAR Methods). Among the top altered pathways (Figure 3A), steroid hormone biosynthesis showed elevated activity precisely timed to gestation, peaking before the end of pregnancy and then declining sharply shortly after delivery (Figure 3B). Along with the essential roles of steroid
[Figure 2 (graphic not reproduced). (A) Network of pregnancy-related metabolites. Group legend: amino acid metabolism, bile acid biosynthesis, caffeine metabolism, fatty acid metabolism, phospholipid metabolism, steroid hormone biosynthesis, others. (B) Steroid hormone biosynthesis. (C) Phospholipids and DHEA-S. (D) Long-chain fatty acids. (E) Caffeine metabolism. In panels B–E the y axis is mean log2(Intensity) and the x axis is gestational age (weeks), from 14 weeks to PP.]
hormones in maintaining pregnancy and later inducing parturition (Mendelson, 2009), we observed an orchestrated elevation of many components centered on progesterone, including some less well-characterized hormones (Figure S3A). Consistent with known sources of pregnancy metabolites (e.g., hormones) (Maltepe and Fisher, 2015), metabolite set enrichment analysis (MSEA) (Xia and Wishart, 2010b) revealed that the adrenal cortex, gonad, and placenta were among the top origins of pregnancy-related metabolites (Figure S3B). The ability to recognize many well-known and less-characterized steroid hormone changes across pregnancy validates our approach.

In addition to the steroid pathway, we observed a dynamic pattern of metabolite changes with pregnancy in other pathways, such as the arachidonic acid metabolism pathway (Figures 3A, 3B, and S3C). We observed that 20-HETE amounts increased until week 34; 20-HETE is potentially linked to the regulation of blood pressure and renal function during pregnancy (Wang et al., 2002; Wu et al., 2014) (Figures S3C and S3D). By contrast, 5-HETE amounts generally decreased during pregnancy, potentially associated with its regulation of the uterus (Figures S3C and S3D) (Edwin et al., 1997; Pearson et al., 2010). Thus, beyond energy metabolism and hormones, a system-wide reconfiguration of the metabolome occurs as the mother adapts to pregnancy. In addition, based on MSEA, many pregnancy-related metabolites are implicated in human disease states, including obesity and prepartum depression (Figure 3C) (Xia and Wishart, 2010b).

The Metabolic Clock of Normal Pregnancy Identified by Machine Learning
We next determined whether we can build a metabolic clock based on the high-resolution profile to predict gestational age for individual plasma samples. In the discovery cohort (samples n = 507, subjects N = 21), we applied feature selection (lasso [least absolute shrinkage and selection operator]) with all 9,651 features to build the linear regression model that shows optimal cross-validation performance for predicting a given phenotype in this cohort. We then ran the validation cohort data (test set 1, samples n = 245, subjects N = 9) through the model established in the discovery cohort to measure the independent performance of our model (Figure 4A; see STAR Methods).

We first tested whether the metabolome change can quantitatively determine the gestational age in normal pregnant women. Feature selection in the discovery cohort yielded a linear model that included 42 metabolic features (Figure S4A; Table S2). In the cross-validation test of 507 samples in the discovery cohort, the metabolic model predicted gestational age in weeks (GAmetabolic) that correlated with gestational age estimated by the first-trimester ultrasound (GAultrasound, in compliance with the clinical standard of care) with a Pearson correlation coefficient (R) of 0.96 (R² = 0.93, p < 1 × 10^−100, root mean squared error [RMSE] = 2.49) (Figure S4B). In the independent-validation cohort, the model yielded a similar R of 0.95 (R² = 0.91, p < 1 × 10^−100, RMSE = 2.76, test set 1) (Figure S4C). This indicates that metabolic features can accurately predict the gestational age on the basis of a blood sample from a pregnant woman.

For potential clinical use, we next tested whether we can use the annotated compounds in blood to predict the gestational age in pregnant women. We performed feature selection in the discovery cohort by using the 264 level 1 and level 2 compounds identified in the Human Metabolome Database (HMDB) (Table S3). This yielded a linear model including five compounds (Figures S4D and 4D) that together are highly predictive. We first evaluated the performance of the model in a 10-fold cross-validation (CV) test in the discovery cohort, in which samples were distributed into folds by subject instead of by sample to prevent person-specific information cross-over between the training folds and the test fold. In the CV test, the metabolic-clock model produced a result (GAmetabolic) that correlated with the gestational age estimated by the first-trimester ultrasound (GAultrasound) with a Pearson correlation coefficient (R) of 0.92 (R² = 0.85, p = 8e−222, RMSE = 3.67) (Figure 4B). To avoid the hyperparametric selection bias, we further evaluated the performance of our model in two independent validation cohorts (test set 1 and test set 2). In test set 1, the model yielded an R of 0.89 (R² = 0.80, p = 8e−93, RMSE = 4.11) (Figure 4C). The model, including four steroids and one lipid (Figure 4D), was further verified in a second independent-validation cohort of eight individuals with R of 0.91 (R² = 0.83, RMSE = 3.05, samples n = 32, test set 2) (Table 1; Figure 4E). The compound identifications were confirmed by chemical standards (Figures 4F–4H, S4E, and S4F; Table S4; see STAR Methods). We noted that four of these five compounds are among the central steroid cluster forming a dense correlation network with one another (Figure 2A).

As pregnancy progresses toward term, clinical classifications and decisions often need to be made based on the timing of pregnancy (e.g., <37 weeks for preterm birth). Babies born before 37 weeks are considered preterm, those born before 20 weeks are considered a miscarriage, and those born before 24 weeks have low survival. Therefore, for clinical action it is important to accurately classify the gestational age by clinical cutoff points at weeks 20–37. As a proof-of-principle, we tested the potential use of the metabolome data to classify the normal pregnancy samples as before or after 20, 24, 28, 32, and 37 gestational weeks (Figure S5A). First, using only samples from the third trimester (>28 weeks of gestation), the time window where women were more susceptible to preterm delivery, we determined whether the identified maternal blood metabolites can distinguish the sample gestational age as before or after 37 weeks. Both the discovery and the validation prediction yielded an area under the receiver operating characteristic curve (AUROC) over or close to 0.90 (Figure S5B; see STAR Methods). Remarkably, the prediction model contained only three metabolites, and the abundance range of each individual metabolite separated the >37 week samples from the <37 week samples for all but one to two validation subjects (Figures S5C–S5F). Similarly, using samples across the whole pregnancy, we found that metabolites can also accurately distinguish pregnancy samples before or after other important gestational age cutoffs, such as 20, 24, 28, and 32 gestational weeks (Figures S5A and S5G–S5J).

Personal Metabolic Clock of Pregnancy Linked with Timing of Delivery and Fetal Growth
Next, we examined the metabolic clock prediction performance in individuals. First, we noted that for most individuals, our model produced predictions consistently aligned with the gestational age estimated by the first-trimester ultrasound (Figures 5A and 5B). In both cross validations for Discovery and test set 1, the prediction deviation (measured by RMSE) in individuals centered around 3 weeks (Figures S6A and S6B). However, in each dataset, there is a small population of individuals with higher prediction deviation. When we examined these individuals (e.g., subjects 1, 2, and 4), we found the predictions were not more randomly scattered than those of other individuals. Rather, in the majority of them, predictions shifted away from the actual gestational ages in a portion of the pregnancy duration (Figures 5A and 5B), suggesting effects from non-random causes.

We hypothesized that some of these large prediction deviations might arise from biological causes, particularly from the maternal-fetal interaction. It is reported that the fetoplacental unit secretes hormones in conjunction with fetal growth and development (Murphy et al., 2006). Indeed, we noted that the average prediction deviation strongly correlated with adjusted infant birth weight (Figure 5C, adjusted for gestational length; see Figure S6C and STAR Methods). Thus, the overall metabolic clock tends to outpace the gestational age estimation determined by first-trimester ultrasound in mothers with a heavier fetus while being delayed in the mothers with a lighter fetus. The finding suggests that fetal growth appears to be one of the inputs read by the metabolic clock.

Figure 4. Metabolic Clock of Pregnancy: Five Metabolites Selected by Machine Learning Can Accurately Predict the Timing of Normal Pregnancy Progression in Both a Discovery and Two Validation Cohorts
(A) Design of the analytical pipeline.
(B and C) Gestational age (GA) predicted by the linear model consisting of five identified metabolites (GAmetabolic, y axis) highly correlates with clinical values determined by the standard of care (by first-trimester ultrasound [GAultrasound], x axis) in the Discovery (B) and the validation cohort (test set 1) (C). Note that two samples presented as outliers in the validation cohort, possibly because of occasional mass-spectrometry signal instability in given samples. The 95% confidence interval for the linear regression is represented by the gray area.
(D) Contribution of the five metabolites to the gestational age prediction model.
(E) Gestational age predicted by the five metabolites (GAmetabolic, y axis, scaled) correlates with clinical values determined by the standard of care (by first-trimester ultrasound [GAultrasound], x axis) in the test set 2 cohort. The 95% confidence interval for the linear regression is represented by the gray area.
(F–H) Confirmation of the metabolites predicting gestational age in the metabolic clock model by standard compounds, THDOC (F), estriol-16-glucuronide (G), and progesterone (H) (see two additional compounds PE(P-16:0e/0:0) and DHEA-S in Figures S4E and S4F). Measured MS/MS spectral fragmentation profiles (top, in black) matching chemical standards (bottom, in red). Note that the discovery results were from the 10-fold CV to avoid over-fitting (see STAR Methods).
See also Figures S4 and S5 and Tables S2–S4.
[Figure 5 (graphic not reproduced). Values recovered from the panels:
(A and B) Per-subject plots of GAmetabolic (weeks), axes 0–40, each annotated with RMSE, R², and p.
Panel A: Subject 18: RMSE 2.6, R² 0.88, p 5e−11; Subject 11: 4.2, 0.90, 2e−13; Subject 15: 3.1, 0.93, 3e−14; Subject 5: 2.4, 0.94, 2e−19; Subject 16: 2.1, 0.88, 3e−09; Subject 7: 3.2, 0.93, 2e−09; Subject 19: 3.4, 0.89, 7e−12; Subject 21: 2.7, 0.96, 2e−10; Subject 4: 5.7, 0.82, 3e−12; Subject 17: 3.3, 0.93, 3e−16; Subject 1: 4.8, 0.90, 2e−17; Subject 10: 2.0, 0.94, 7e−16; Subject 2: 4.6, 0.90, 3e−14; Subject 3: 4.5, 0.89, 5e−15; Subject 12: 3.3, 0.88, 3e−11; Subject 8: 4.1, 0.89, 3e−15; Subject 20: 2.6, 0.97, 1e−08; Subject 14: 3.5, 0.93, 4e−17; Subject 13: 2.5, 0.94, 3e−14; Subject 9: 4.1, 0.87, 1e−08; Subject 6: 3.0, 0.84, 5e−08.
Panel B: Subject 30: RMSE 4.3, R² 0.88, p 9e−14; Subject 25: 3.0, 0.92, 2e−17; Subject 26: 2.9, 0.90, 1e−12; Subject 28: 3.6, 0.91, 2e−13; Subject 22: 3.0, 0.88, 6e−12; Subject 24: 4.6, 0.82, 3e−14; Subject 29: 2.6, 0.90, 9e−13; Subject 27: 3.3, 0.86, 7e−12; Subject 23: 7.5, 0.56, 3e−05.
(C) x axis: adjusted birthweight deviation from the population mean (g); y axis: average discrepancies (Δ(GAmetabolic − GAultrasound)) (weeks). P = 0.01, R² = 0.22, N = 29 with birthweight info.
(D) x axis: average discrepancies (Δ(GAmetabolic − GAultrasound)) (weeks); y axis: GA at delivery (weeks). P = 0.02, R² = 0.28, N = 18 with natural labor onsets.
(E) Weeks to delivery: 8, 4, 2; AUC in validation: 0.91, 0.87, 0.86. Predictors listed: THDOC, androstane-3,17-diol, estriol-16-glucuronide, 17alpha-hydroxyprogesterone, PE(P-16:0e/0:0).
(F) AUC: 0.88, 95% CI: 0.82–0.95 (Discovery CV, N = 128); AUC: 0.86 (Test Set 1, N = 65).
(G) y axis: contribution. (H) y axis: log2(Intensity) of THDOC.]
Figure 5. Personal Metabolic Clock of Pregnancy Linked with Timing of Delivery and Fetal Growth
(A and B) Highly correlated patterns of the metabolic-clock-predicted gestational age (GAmetabolic) of the five-metabolite model with the gestational age estimated
by the first-trimester ultrasound (GAultrasound) at the individual level in the cross validation (A) and test set 1 (B). Note that the outlier sample with negative prediction
value in Figure 4C belonged to the last subject of the test set 1 and did not show in the current plot with the y axis scale limitation.
(C) The average discrepancies between metabolic-clock-predicted gestational age and ultrasound-estimated gestational age (Δ(GAmetabolic − GAultrasound)) were significantly correlated with the fetal growth deviation from the population by person. All 29 subjects who had baby birth weight information are included here. The 95% confidence interval for the linear regression is represented by the gray area.
(D) Average discrepancies between GAmetabolic and GAultrasound (Δ(GAmetabolic − GAultrasound)) were negatively correlated with the actual delivery weeks (by ultrasound estimation). All 18 subjects who had natural labor onset are included here. Dashed lines mark the ultrasound-estimated GA at 40 weeks (due date, black), GAmetabolic one week earlier than the GAultrasound (blue), and GAmetabolic one week later than the GAultrasound (red). The 95% confidence interval for the linear regression is represented by the gray area.
(E) Summary of prediction models of 2, 4, and 8 weeks approaching delivery, using two to three metabolites. The contribution rank of each predictor in every
model is listed as number 1, 2, and 3. The weeks to delivery were built using samples of the third trimester (> 28 weeks). AUCs in the validation cohort (test set 1)
are listed.
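The per-subject quantity plotted in panels C and D, the average of GAmetabolic minus GAultrasound, can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function names, the three subjects, and all numbers are invented, and the birth-weight adjustment for gestational length described in STAR Methods is not reproduced.

```python
import math
from collections import defaultdict

def mean_discrepancy_by_subject(records):
    """records: iterable of (subject, ga_metabolic, ga_ultrasound) in weeks.
    Returns the mean of (ga_metabolic - ga_ultrasound) per subject."""
    totals = defaultdict(lambda: [0.0, 0])
    for subject, ga_met, ga_us in records:
        totals[subject][0] += ga_met - ga_us
        totals[subject][1] += 1
    return {s: total / n for s, (total, n) in totals.items()}

def pearson_r(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)

# Hypothetical predictions for three subjects (two samples each).
records = [
    ("s1", 21.0, 20.0), ("s1", 31.5, 30.0),   # clock runs ahead
    ("s2", 19.0, 20.0), ("s2", 29.5, 30.0),   # clock runs behind
    ("s3", 20.0, 20.0), ("s3", 30.5, 30.0),
]
# Hypothetical adjusted birth-weight deviation from the population mean (g).
birthweight_dev = {"s1": 300.0, "s2": -250.0, "s3": 50.0}

disc = mean_discrepancy_by_subject(records)
order = sorted(disc)
r = pearson_r([disc[s] for s in order], [birthweight_dev[s] for s in order])
print(disc, round(r, 3))
```

A positive correlation in this toy setup corresponds to the pattern reported in the text, where the clock outpaces ultrasound in mothers with a heavier fetus.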
In addition, within the 18 women with natural labor onset (i.e., excluding women with induction before labor onset and scheduled cesarean-section), we found that the women whose overall metabolic clock of pregnancy outpaced ultrasound evaluation tended to deliver earlier, whereas a delay in metabolic clock correlated with a delayed time to child delivery compared to the ultrasound-estimated due date (Figure 5D). Interestingly, five out of six women (83%) with a metabolic-clock-predicted gestational age more than one week later than the ultrasound-estimated gestational age had natural labor onset after their due date (estimated by ultrasound, marked in red in Figure 5D), and four out of five women (80%) with metabolic-clock-predicted gestational age more than one week earlier than the ultrasound-estimated gestational age had natural labor onset before their due date (marked in blue in Figure 5D). These results suggest the metabolic clock of pregnancy with maternal metabolites contains information on the timing of delivery in normal pregnancy.

Prediction for Timing of Delivery
We then tested whether the maternal blood metabolites can also predict the timing of a normal delivery event within a defined period (2, 4, and 8 weeks from delivery) approaching the labor events (in the third trimester). We first examined whether metabolites can predict a delivery within 2 weeks (weeks to delivery [WD] < 2w). To predict delivery triggered naturally without outside procedures (such as scheduled cesarean-section), we only included delivery events naturally triggered (subjects N = 18, samples n = 193). With just three metabolites, the metabolome accurately predicted an upcoming delivery event within 2 weeks in both discovery and validation cohorts with AUROC close to 0.9 (Figures 5E–5H, S6D, and S6E; see STAR Methods). Similarly, identified metabolites can also be used to predict the timing of a normal delivery event within 4 and 8 weeks (Figures 5E and S6F–S6I). Intriguingly, the panels of metabolites partially overlapped between the models, whereas the individual metabolites contributed differently to the models (Figure 5E). All of the metabolite markers were identified as steroids, except for phospholipid PE(P-16:0e/0:0), and most of them (three out of five in total) also appeared in the aforementioned metabolic clock for gestational age (Figure 5E; Table S4). These results demonstrate that we can precisely categorize critical pregnancy stages in normal subjects by using a small number of maternal blood metabolites, which can be further validated in larger and independent cohorts.

DISCUSSION

In this study, we performed untargeted metabolomics profiling and identified highly dynamic temporal regulation of metabolic changes in human pregnancy: more than half of the measured metabolites and metabolic pathways changed during pregnancy. We were able to detect many of the pregnancy-associated metabolite profiles revealed in previous targeted studies (Tulchinsky et al., 1972; Wang et al., 2016) (such as progesterone, 17-hydroxyprogesterone, and the linoleic acid pathway), validating our approach. At the same time, we also noted that a large portion of the pregnancy-related metabolites identified in our study were less well studied. For example, over 95% of the pregnancy-related metabolites identified in our study were not recovered from a targeted metabolic profiling study on pregnancy (Wang et al., 2016), demonstrating the power of unbiased and hypothesis-independent profiling. Among the changing metabolites, the major class that increased was steroids, including progesterone, which interacts with the hypothalamic-pituitary-adrenal axis (HPA axis) (Chrousos et al., 1998), and estriol-16-glucuronide produced by the placenta (Levitz et al., 1984). Here, the detailed differences in their temporal profiles were revealed by the weekly sampling design of the study (Figures 1D and 2B). In addition, we discovered less well-studied steroids in pregnancy, such as the neurosteroid THDOC, an allosteric modulator of the GABA-A receptor that potentially affects stress and depression in human pregnancy (Hosie et al., 2006; Reddy, 2003). Intriguingly, many pregnancy-related metabolites that changed, including steroids, quickly returned to the maternal non-pregnant state after childbirth (Figures 1B, 1D, and 1E). In addition, we also identified a wide variety of non-steroid hormones whose abundance altered during pregnancy progression.

These metabolite changes presumably accommodate and/or reflect important maternal biological physiology during pregnancy and fetal growth (Bispham et al., 2003; Prentice and Goldberg, 2000). For maternal nutrient metabolism, one of the decreased carnitines, oleoylcarnitine (Figure 1C), accumulates during certain metabolic conditions, including fasting (Hoppel and Genuth, 1980; Minkler et al., 2005). Also, one phosphatidylcholine that functions as a micronutrient, lecithin, increased in pregnancy, suggesting a systematic change in the maternal nutritional status during gestation. Within molecules reflecting pregnancy-related physiological changes, consistent with decreased blood pressure (Hermida et al., 1997), the antihypertensive molecule 20-HETE of the arachidonic acid metabolism pathway is elevated during pregnancy until the early third trimester, and its synthesis is regulated in a renal-specific manner (Wang et al., 2002; Wu et al., 2014). This reveals the highly dynamic temporal regulation of 20-HETE in blood pressure and kidney function during pregnancy. In contrast, compared with early pregnancy and postpartum, the amount of 5-HETE in the same pathway was generally lower in the second and third trimesters with an increasing trend right before childbirth, consistent with previous findings that 5-HETE is elevated in the uterus and amniotic fluid at the onset of human labor (Edwin et al., 1997; Pearson et al., 2010). In the developing fetus, changes in hexadecadienoylcarnitine amounts are associated with congenital heart defects (Bahado-Singh et al., 2014).
Figure 5 (continued).
(F) The logistic regression model based on three metabolites can accurately identify the third-trimester plasma samples approaching the delivery (weeks to delivery [WD] < 2w; only women with natural labor onset included).
(G) Contribution of the three metabolites to the prediction model of 2 weeks approaching delivery.
(H) Metabolite THDOC showed abundance separations before or after 2 weeks approaching the delivery, except in one subject. See Figure S6 for other metabolites in the model. Note that the discovery results were from the 10-fold CV instead of direct fitting to avoid over-fitting.
See also Figure S6 and Table S4.
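The classification target and its AUROC can be sketched as below. This is a hedged illustration only: it labels third-trimester samples by whether delivery follows within 2 weeks and scores a single made-up marker intensity directly, rather than fitting the three-metabolite logistic regression described in the text. The function names and all values are hypothetical.

```python
def delivery_within(ga_at_sampling, ga_at_delivery, window_weeks=2.0):
    """Label 1 if the sample was drawn less than `window_weeks` before delivery."""
    return 1 if (ga_at_delivery - ga_at_sampling) < window_weeks else 0

def auroc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative) pairs
    in which the positive sample scores higher; ties count one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical third-trimester samples:
# (GA at sampling, GA at delivery, log2 intensity of one marker)
samples = [
    (30.0, 39.0, 11.6),
    (34.0, 39.0, 12.0),
    (36.0, 39.0, 12.3),
    (37.5, 39.0, 12.2),
    (38.0, 39.0, 12.7),
    (38.5, 39.0, 12.9),
]
labels = [delivery_within(s, d) for s, d, _ in samples]
scores = [x for _, _, x in samples]
auc = auroc(scores, labels)
print(labels, round(auc, 3))
```

In a fuller analysis the scores would be held-out probabilities from a model trained with the same by-subject folds used elsewhere in the study, so that the AUROC is not inflated by fitting and scoring on the same women.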
Here, we revealed that the amount of hexadecadienoylcarnitine in the blood decreased continuously until week 24, then steadily increased thereafter (Figure 2D). In addition, the amount of long-chain fatty acids in maternal blood samples is associated with childhood metabolic health (Maslova et al., 2018). Here, the omega-3 fatty acid THA decreased during early pregnancy and gradually increased before childbirth (Figure 2D), suggesting gestation-related changes in the formation of docosahexaenoic acid (DHA) (Moore et al., 1995). Our findings are robust even without a requirement for prior fasting. It will be interesting to validate these findings in cohorts that have dietary information and detailed clinical measurements, to define critical “nutritional time-zones” for micronutrient amounts and further understand the metabolite changes that are important for physiologic changes across pregnancy.

Our high-density sampling scheme allowed us to study the temporal alteration of metabolite levels at weekly resolution. For example, even though many steroid metabolites were elevated during pregnancy, our profiling was able to show that there were at least two different behaviors: an early wave (such as progesterone and 17α-hydroxyprogesterone) and a second wave (such as estriol-16-glucuronide). These temporal changes of steroids across pregnancy and after childbirth are at least partially regulated by the fetoplacental unit, including both maternal adrenal gland and placenta and fetal adrenals and liver (Diczfalusy, 1953; Frandsen and Stakemann, 1961; Raeside, 2017). Further investigation into the interaction of fetal-maternal contribution will be necessary for understanding the temporal regulation of these metabolites.

Untargeted metabolome and high-density sampling enabled us to identify a broad set of high-resolution temporal profiles of metabolites during pregnancy. We hypothesized that this information might help us to understand the underlying metabolic clock that times the progression of pregnancy. We found that solely using the abundance of five compounds, without any other inputs from clinical features, we can precisely determine the gestational age of a healthy pregnant woman. The precision surpasses the recent cell-free RNA model by using maternal blood (Ngo et al., 2018). Similarly, with two to three compounds, we can categorically predict many pregnancy cutoff times with high AUC: we can determine whether a woman has reached 20, 24, 28, 32, or 37 weeks (clinical cutoffs for miscarriage, age of viability, extremely preterm, very preterm, and prematurity, respectively) into her pregnancy (Figure S5A), or whether a woman will enter into labor within the next two, four, or eight weeks (Figure 5E). The proof-of-principle study suggested that metabolome bears rich quantitative information about pregnancy progression. However, our study has its limitations. The studied population consisted of healthy Caucasian pregnant women with small variations in clinical characteristics. In the future, we need to test the models in a larger cohort with diverse ethnicities and complications. Meanwhile, targeted chemical assays need to be developed on the small panels of identified metabolite markers that were discovered by untargeted metabolomics to measure the metabolite concentration independent of batches. Intriguingly, we found the metabolic clock of pregnancy to be robust in general, but small personal deviations can be observed, most likely affected by the fetal growth (Figure 5C). Lastly, we also found that the discrepancies between metabolic timing and ultrasound suggested biological significance: the women that had advanced metabolic clock tended to deliver earlier than predicted by ultrasound, whereas a delay in metabolic clock correlated with a delayed time to child delivery (Figure 5D).

In summary, combining untargeted metabolome and high-density sampling revealed the landscape of metabolome changes during pregnancy and the postpartum period with high resolution. The data itself can serve as a resource for future research. As a proof-of-principle, we also demonstrated that the temporal abundance information of metabolome can be used to predict gestational age with high accuracy in a cohort of healthy women. There is a great need for accurate timing of pregnancy: in the US alone, 900,000 women annually missed their first-trimester ultrasound (Martin et al., 2018), currently the only accurate timing method for pregnancy (Committee on Obstetric Practice, the American Institute of Ultrasound in Medicine, and the Society for Maternal-Fetal Medicine, 2017). In low and middle-income countries, accessibility to ultrasound is even more scarce, complicating many pregnancies and fetal care down-stream (e.g., identify imminent labor, manage complications, etc.). Our study demonstrated that the development of clinical tools with a few metabolites in maternal blood to time pregnancy is promising. Testing of blood drawn from the pregnant woman would likely be limited to once or a few times to be informative and have the potential to benefit pregnant women in both developed and developing worlds.

STAR+METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead Contact
  ◦ Materials Availability
  ◦ Data and Code Availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ◦ Pregnancy cohort
• METHOD DETAILS
  ◦ Plasma sample preparation
  ◦ Chemical materials for untargeted metabolomics
  ◦ MS acquisition
  ◦ Chromatographic conditions
• QUANTIFICATION AND STATISTICAL ANALYSIS
  ◦ Section 1: Metabolomics Data Processing
  ◦ Section 2: Metabolic Features Identification
  ◦ Section 3: Identify Significantly Altered Features/Compounds
  ◦ Section 4: Regularized Partial Correlation Network
  ◦ Section 5: Pathway Analysis
  ◦ Section 6: Machine Learning for Pregnancy Timing
  ◦ Section 7: Analyze the discrepancies between metabolic clock (GA prediction model) and first-trimester ultrasound estimations

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.002.
AUTHOR CONTRIBUTIONS

M.-L.H.R. and M.M. organized and contributed to the collection of pregnancy samples. L.L., B.P., B.F., M.S., and M.M. conceptualized the study. L.L., M.S., and M.M. created the analysis plan. L.L. processed and analyzed the samples. L.L. and H.R. analyzed the data. L.L., X.S., S.C., J.S., and N.L. contributed to the spectral analysis. L.L., M.S., and M.M. wrote the manuscript, and all authors contributed to reviewing and editing the manuscript.

DECLARATION OF INTERESTS

M.S. is a co-founder and member of the scientific advisory boards of the following: Personalis, SensOmics, Filtricine, Qbio, January, Mirvie, and Oralome. He is a member of the scientific advisory board of Jungla. M.M. is a co-founder of Mirvie. L.L., M.S., and M.M. are inventors on the patent application PCT/US2019/052515 related to this work.

Received: September 4, 2018
Revised: March 11, 2020
Accepted: April 29, 2020
Published: June 25, 2020

REFERENCES

Alkema, L., Chou, D., Hogan, D., Zhang, S., Moller, A.B., Gemmill, A., Fat, D.M., Boerma, T., Temmerman, M., Mathers, C., and Say, L.; United Nations Maternal Mortality Estimation Inter-Agency Group collaborators and technical advisory group (2016). Global, regional, and national levels and trends in maternal mortality between 1990 and 2015, with scenario-based projections to 2030: a systematic analysis by the UN Maternal Mortality Estimation Inter-Agency Group. Lancet 387, 462–474.
Altman, N.S. (1992). An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. Am. Stat. 46, 175–185.
Bahado-Singh, R.O., Akolekar, R., Mandal, R., Dong, E., Xia, J., Kruger, M., Wishart, D.S., and Nicolaides, K. (2012). Metabolomics and first-trimester prediction of early-onset preeclampsia. J Matern Fetal Neonatal Med 25, 1840–1847.
Bahado-Singh, R.O., Ertl, R., Mandal, R., Bjorndahl, T.C., Syngelaki, A., Han, B., Dong, E., Liu, P.B., Alpay-Savasan, Z., Wishart, D.S., et al. (2014). Metabolomic prediction of fetal congenital heart defect in the first trimester. A J Obstet Gynecol 211, e1–e14.
Baumann, D., and Baumann, K. (2014). Reliable estimation of prediction errors for QSAR models under model uncertainty using double cross-validation. J. Cheminform. 6, 47.
Bispham, J., Gopalakrishnan, G.S., Dandrea, J., Wilson, V., Budge, H., Keisler, D.H., Broughton Pipkin, F., Stephenson, T., and Symonds, M.E. (2003). Maternal endocrine adaptation throughout pregnancy to nutritional manipulation: consequences for maternal plasma leptin and cortisol and the programming of fetal adipose tissue development. Endocrinology 144, 3575–3585.
Blencowe, H., Cousens, S., Chou, D., Oestergaard, M., Say, L., Moller, A.B., Kinney, M., Lawn, J., and Born Too Soon Preterm Birth Action, G.; Born Too
Chrousos, G.P., Torpy, D.J., and Gold, P.W. (1998). Interactions between the hypothalamic-pituitary-adrenal axis and the female reproductive system: clinical implications. Ann. Intern. Med. 129, 229–240.
Committee on Obstetric Practice, the American Institute of Ultrasound in Medicine, and the Society for Maternal-Fetal Medicine (2017). Committee Opinion No 700: Methods for Estimating the Due Date. Obstet. Gynecol. 129, e150–e154.
Contrepois, K., Jiang, L., and Snyder, M. (2015). Optimized Analytical Procedures for the Untargeted Metabolomic Profiling of Human Urine and Plasma by Combining Hydrophilic Interaction (HILIC) and Reverse-Phase Liquid Chromatography (RPLC)-Mass Spectrometry. Mol. Cell. Proteomics 14, 1684–1695.
Diczfalusy, E. (1953). Chorionic gonadotrophin and oestrogens in the human placenta. Acta Endocrinol. Suppl. (Copenh.) 11, 1–175.
Donahue, S.M., Kleinman, K.P., Gillman, M.W., and Oken, E. (2010). Trends in birth weight and gestational length among singleton term births in the United States: 1990-2005. Obstet. Gynecol. 115, 357–364.
Dudzik, D., Zorawski, M., Skotnicki, M., Zarzycki, W., Kozlowska, G., Bibik-Malinowska, K., Vallejo, M., García, A., Barbas, C., and Ramos, M.P. (2014). Metabolic fingerprint of Gestational Diabetes Mellitus. J. Proteomics 103, 57–71.
Edwin, S.S., Mitchell, M.D., and Dudley, D.J. (1997). Action of immunoregulatory agents on 5-HETE production by cultured human amnion cells. J. Reprod. Immunol. 36, 111–121.
Epskamp, S., and Fried, E.I. (2018). A tutorial on regularized partial correlation networks. Psychol. Methods 23, 617–634.
Frandsen, V.A., and Stakemann, G. (1961). The site of production of oestrogenic hormones in human pregnancy. Hormone excretion in pregnancy with anencephalic foetus. Acta Endocrinol. (Copenh.) 38, 383–391.
Gagnon, A., and Wilson, R.D.; Society of Obstetricians and Gynaecologists of Canada Genetics Committee (2008). Obstetrical complications associated with abnormal maternal serum markers analytes. Journal d’obstetrique et gynecologie du Canada 30, 918–932.
Goeman, J.J., and Bühlmann, P. (2007). Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics 23, 980–987.
Goeman, J.J., van de Geer, S.A., de Kort, F., and van Houwelingen, H.C. (2004). A global test for groups of genes: testing association with a clinical outcome. Bioinformatics 20, 93–99.
Hermida, R.C., Ayala, D.E., Mojón, A., Fernández, J.R., Silva, I., Ucieda, R., and Iglesias, M. (1997). High sensitivity test for the early diagnosis of gestational hypertension and preeclampsia. IV. Early detection of gestational hypertension and preeclampsia by the computation of a hyperbaric index. J. Perinat. Med. 25, 254–273.
Hochreiter, S., and Schmidhuber, J. (1997). Long short-term memory. Neural Comput. 9, 1735–1780.
Hoppel, C.L., and Genuth, S.M. (1980). Carnitine metabolism in normal-weight and obese human subjects during fasting. Am. J. Physiol. 238, E409–E415.
Hosie, A.M., Wilkins, M.E., da Silva, H.M., and Smart, T.G. (2006). Endogenous neurosteroids regulate GABAA receptors through two discrete transmembrane sites. Nature 444, 486–489.
STAR+METHODS
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Mike
Snyder (mpsnyder@stanford.edu).
Materials Availability
This study did not generate new unique reagents.
Pregnancy cohort
We recruited pregnant women through family doctors and advertisements (Danish IRB number H-3-2014-004). At enrollment, all
women were screened to ensure that they were healthy at baseline, without chronic conditions, and without medication intake of
any kind (ages 23 to 36 at giving birth). From each woman, non-fasting blood samples were collected weekly during pregnancy
and one sample was collected after pregnancy (2x9 mL EDTA tube and 1xPaxGene RNA tube).
METHOD DETAILS
MS acquisition
Metabolic extracts were analyzed by reversed-phase liquid chromatographic (RPLC)-mass spectrometry (MS) in both positive and
negative ionization modes. Thermo Q Exactive Hybrid Quadrupole-Orbitrap plus and Q Exactive mass spectrometers (Xcalibur,
Thermo Scientific, San Jose, CA, USA) were operated in full MS-scan mode for data acquisition (acquisition from m/z 500 to
2,000) with a scan rate of approximately 4 Hz and a resolution set at 30,000 (at m/z 400). The MS/MS spectra of the QC sample
were acquired under different fragmentation energy (25 NCE and 50 NCE) of the top 10 parent ions. The resulting mass spectra
were exported into Progenesis QI Software (Nonlinear Dynamics, Durham, NC, USA) for further processing.
Chromatographic conditions
RPLC separation was performed using Zorbax SB columns (2.1 × 50 mm, 1.8 μm, 600 bar; 827700-914) purchased from Agilent
Technologies (Santa Clara, CA, USA). Mobile phases for RPLC consisted of 0.06% acetic acid in water (phase A) and 0.06% acetic
acid in MeOH (phase B). Metabolites were eluted from the column at a flow rate of 0.6 mL/min, leading to a backpressure of 220–
280 bar at 99% phase A. A linear 1%–80% phase B gradient was applied over 9–10 min. The oven temperature was set to 60°C,
and the sample injection volume was 5 μL.
was performed under default settings in Progenesis QI. Acquired data were processed using an analysis pipeline written in R (https://
www.R-project.org). Progenesis QI output was then processed by removing all metabolites that were quantified in less than 30% of
the samples or had a median intensity of less than twofold signal over the noise threshold (S/N < 2). The noise threshold was esti-
mated by using the median signal across all the blank runs (if no quantitation was reported in any of the blank runs, the feature
was also included in the analysis, as it likely had good S/N characteristics). Then the data were log-transformed and normalized.
For each run, the median of all features was centered to correct for variation in the sample amount. Then for each analyte, a linear
correction was applied per batch to correct for any linear decrease or increase in abundance during the acquisition of a batch. In
short, for each analyte and each batch, a linear model was fitted with the log-abundance of the analyte as the dependent variable
and the acquisition number [run order (randomized)] as the independent variable. The model prediction was interpreted as an under-
lying drift in mass spectrometric sensitivity and subtracted from the analyte level to yield within-batch normalized abundances.
Finally, for each analyte, the abundances were median centered by batch to correct for sensitivity differences between batches.
The positive- and negative-mode features were then concatenated for downstream analysis. In total, 9,651 features were included
in the final analysis. In addition, for samples with more than 50% of the values missing, the sample was removed (one sample in total).
The remaining missing values were imputed by the nearest 10 neighbors using the k-Nearest Neighbor algorithm (Altman, 1992). Note
that Discovery and Test Set 1 were normalized together, while samples of Test Set 2 were normalized independently.
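The within-batch drift correction and batch median centering described above can be illustrated with a short sketch. This is a minimal Python illustration, not the authors' R pipeline; the function names and the toy values are hypothetical, and it handles one analyte in one batch, assuming at least two distinct run-order positions.

```python
from statistics import median


def fit_line(x, y):
    """Ordinary least-squares fit of y ~ intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx  # assumes the run orders are not all identical
    return mean_y - slope * mean_x, slope


def normalize_batch(run_order, log_abundance):
    """Subtract the fitted linear drift, then median-center the batch."""
    intercept, slope = fit_line(run_order, log_abundance)
    detrended = [y - (intercept + slope * x)
                 for x, y in zip(run_order, log_abundance)]
    center = median(detrended)
    return [value - center for value in detrended]


# Hypothetical toy data: one analyte in one batch of five runs, with the
# log-abundance drifting downward by 0.1 per run.
run_order = [1, 2, 3, 4, 5]
log_abundance = [10.0, 9.9, 9.8, 9.7, 9.6]
normalized = normalize_batch(run_order, log_abundance)
```

In this toy case the signal is pure drift, so every normalized value is zero up to floating-point error. With real data, the values left after both steps carry the variation that is not explained by run order or batch.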
We applied principal component analysis (PCA) to examine the overall distribution of the sample data (with all 9,651 features) and
check the run quality. The gestational ages (based on first-trimester ultrasound measurements) were superimposed to facilitate the
analysis. The vast majority of the samples separated into pre- and postpartum groups in the PCA space defined by the two
components that explained the largest variation (PC1 and PC2, Figure 1B), whereas two samples from the same subject (the last two in her
collection, before and after childbirth) behaved irregularly in PCA and in unsupervised clustering analysis. These two samples
were treated as outliers and excluded from further analysis. We also performed partial least-squares discriminant analysis (PLS-DA)
according to the categories of gestational age (using the mixOmics package).
To identify top changed compounds with abundance increases or decreases more than 50% during the whole pregnancy
(40 weeks), we performed a linear regression between log2 abundance and the gestational weeks of samples, and only those com-
pounds with absolute slope larger than log2(1.5)/40 weeks = 0.015 were chosen.
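The slope cutoff follows directly from the fold change and the length of pregnancy. The sketch below reproduces that arithmetic; the function names are hypothetical. Because the cutoff is symmetric on the log2 scale, a decreasing compound passes it when its abundance falls below 1/1.5 (about 67%) of its starting level over 40 weeks.

```python
import math


def slope_threshold(fold_change=1.5, weeks=40):
    """Minimum absolute log2 slope per week for the given total fold change."""
    return math.log2(fold_change) / weeks


def is_top_changed(slope, fold_change=1.5, weeks=40):
    """True if a fitted log2-abundance slope exceeds the cutoff in either direction."""
    return abs(slope) > slope_threshold(fold_change, weeks)


threshold = slope_threshold()  # log2(1.5) / 40, about 0.0146 per week
```

Rounded to three decimals this gives the 0.015 quoted in the text.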
Section 7: Analyze the discrepancies between metabolic clock (GA prediction model) and first-trimester ultrasound
estimations
We evaluated how well the predictions of the metabolic clock agreed with the estimations from first-trimester ultrasound within each
individual. For each person, we computed Pearson’s correlation between the metabolic clock predictions and the gestational age based
on first-trimester ultrasound. We then performed a meta-analysis across persons, using Fisher’s method, to generate a summary
p value describing the overall correlation in each cohort (cross-validation in the Discovery cohort, independent validation in
Test Set 1).
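Fisher's method combines k independent p values through the statistic X = −2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The sketch below is a standard-library illustration of that combination step only. The function name is hypothetical, the per-person p values are assumed to be already computed and greater than zero, and it is not the authors' code.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p values with Fisher's method.

    Returns the test statistic and the summary p value. For an even number
    of degrees of freedom (2k), the chi-square survival function has the
    closed form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical per-person p values from three individual correlations.
statistic, summary_p = fisher_combined_p([0.01, 0.03, 0.20])
```

With a single p value the method returns that value unchanged, which is a quick sanity check on the closed form.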
Previous literature (Donahue et al., 2010) and our own observations suggest that birth weight and gestational length are positively
correlated; later delivery is associated with a heavier absolute birth weight of an infant. To determine whether an infant’s birth weight
falls above or below the group mean, we performed a linear regression between the two parameters and took the residuals to repre-
sent the birth weight deviation adjusted for delivery timing.
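The residual-based adjustment can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up toy values, not the authors' code. A positive residual means the infant was heavier than expected for its gestational length.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y ~ intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def birth_weight_residuals(gestational_length, birth_weight):
    """Birth weight deviation adjusted for delivery timing (observed minus fitted)."""
    intercept, slope = fit_line(gestational_length, birth_weight)
    return [w - (intercept + slope * g)
            for g, w in zip(gestational_length, birth_weight)]


# Hypothetical toy data: gestational length in weeks, birth weight in grams.
lengths = [38.0, 39.0, 40.0, 41.0]
weights = [3100.0, 3400.0, 3500.0, 3800.0]
residuals = birth_weight_residuals(lengths, weights)
```

In this toy example the fitted slope is 220 g per week, and the second infant is 60 g heavier than the fit predicts for 39 weeks.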
Average Δ(GAmetabolic − GAultrasound): For each person, at each time point, we examined the difference between the metabolic clock
and first-trimester ultrasound estimation of gestational age. These values were averaged for each person to represent the overall
relative pace of the metabolic clock compared to the first-trimester ultrasound estimation. We then examined the correlation between
delivery timing adjusted birth weights and average Δ(GAmetabolic − GAultrasound) (Figure 5C).
To examine whether an accelerated metabolic clock (compared to the first-trimester ultrasound estimation) is associated with
earlier delivery, we computed the correlation between average Δ(GAmetabolic − GAultrasound) and delivery timing, only in women
with a natural labor onset (Figure 5D).
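The per-person averaging and the subsequent correlation can be sketched as below. The function names and toy values are hypothetical, and the sketch reports only the Pearson coefficient, not its p value.

```python
import math
from statistics import mean


def average_delta(ga_metabolic, ga_ultrasound):
    """Mean of (GAmetabolic - GAultrasound) over one person's time points."""
    return mean(m - u for m, u in zip(ga_metabolic, ga_ultrasound))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data: per-person (metabolic, ultrasound) gestational ages in weeks.
subjects = {
    "A": ([12.5, 20.5, 30.5], [12.0, 20.0, 30.0]),   # clock runs ahead
    "B": ([11.0, 19.0], [12.0, 20.0]),               # clock runs behind
    "C": ([15.0, 25.0], [15.0, 25.0]),               # clock on time
}
deltas = {name: average_delta(m, u) for name, (m, u) in subjects.items()}

# Hypothetical gestational age at delivery (weeks) for the same subjects.
delivery = {"A": 39.0, "B": 41.0, "C": 40.0}
names = sorted(subjects)
r = pearson_r([deltas[n] for n in names], [delivery[n] for n in names])
```

A negative coefficient in this setup corresponds to the pattern described in the text: a metabolic clock running ahead of ultrasound goes with earlier delivery.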
Supplemental Figures
[Figure S1 image, panels A–G. Recoverable labels: (A) pregnancy samples, postpartum samples, and childbirth events plotted against gestational age (weeks); (C) PLS-DA, Component 1 (5%) against Component 2 (2%), with GA categories under 10, 10−20, 20−30, over 30, and PP; (D and E) PCA plots with PC2 (3.9%) on the y axis, batches labeled Discovery and Validation; (F) log10(Frequency) against linear fitting slope, ln(Intensity) ~ Gestational age; (G) Log2(Intensity).]
Figure S1. Untargeted Metabolomics for Longitudinal Pregnancy Samples, Related to Figure 1
(A) High-density longitudinal sampling of pregnancies.
(B) The Scree plot of the principal component analysis.
(C) The PLS-DA result according to the categories of gestational age. GA: gestational age; PP: postpartum.
(D and E) Principal component analysis based on all 9,651 features shows that the samples do not separate according to the 30 subjects (D; samples from
individual subjects are represented by different colors) or according to the experimental batches of Discovery and Validation (Test Set 1), which were analyzed across two different years (E;
samples of the discovery cohort are presented in red and samples of the validation cohort [Test Set 1] in blue).
(F) Histogram shows the distribution of slopes in the linear fitting model of the 9,651 features (intensities against the gestational ages).
(G) For each of the 30 women, the intensities of an example metabolic feature are shown over the course of gestation, which reveals consistent increases in
abundance according to gestational age among 30 subjects, despite individual differences.
[Figure S2 image, panels A and B. Recoverable labels: (A) correlation matrix of pregnancy-related compounds colored by Pearson correlation coefficient (1 to −1), annotated by pregnancy alteration (increase or decrease) and by group (amino acid metabolism, bile acid biosynthesis, caffeine metabolism, fatty acid metabolism, phospholipid metabolism, steroid hormone biosynthesis, others); (B) strength, closeness, and betweenness of each metabolite in the network.]
Figure S2. Functional Metabolite Groups Altered during Pregnancy, Related to Figure 2
(A) Correlation matrix colored by the Pearson correlation coefficient of each pair of pregnancy-related compounds across samples.
(B) The strength, closeness, and betweenness of metabolites in the regularized partial correlation network indicate how important the metabolites are in the
network. Metabolite names are listed on the left side, ranked by closeness; the names of the seven compounds used in the prediction models of Figure 4 and
Figure 5 are shown in bold.
Figure S3. Pregnancy-Related Metabolic Pathways and Metabolite Origin Analysis, Related to Figure 3
(A) Steroid hormone biosynthesis pathway, with metabolite increases (in red) or decreases (in blue) over the course of gestation.
(B) Numerous metabolites in plasma that were altered during pregnancy can be traced back to organs by metabolite set enrichment analysis (MSEA).
(C) Arachidonic acid metabolism pathway, with metabolite increases (in red) or decreases (in blue) over the course of gestation.
(D) The average levels of the 20-HETE and 5-HETE changes against the gestational progression. The intensities were normalized to the baseline, which was
defined by averaging all samples before 14 weeks. The standard errors, derived from 30 subjects, are shown. The gestational ages were adjusted by scaling
delivery events to 40 weeks. PP, postpartum.
[Figure S4 image, panels A–F. Recoverable labels: (A and D) model reduction, number of predictors against cross-validation RMSE; (B and C) GAmetabolic (weeks) against GAultrasound (weeks) for Discovery (R² = 0.93, N = 507) and Test Set 1 (R² = 0.91, N = 245), with P < 1 × 10⁻¹⁰⁰; (E) relative intensity against mass to charge ratio (m/z), m/z 438.2974, RT error 0.6 s, standard PE(P-16:0e/0:0); (F) relative intensity against mass to charge ratio (m/z), m/z 367.1583, RT error 9 s, standard DHEA-S (dehydroepiandrosterone sulfate).]
Figure S4. Metabolites Predict Gestational Age in Machine-Learning Models, Related to Figure 4
(A) Feature selection for predicting gestational age (GA) using metabolomic features.
(B and C) GA predicted by metabolic features (GAmetabolic, y axis) highly correlates with clinical values determined by standard of care (by first-trimester ultra-
sound, GAultrasound, x axis) in the Discovery (B) and the validation cohort (Test Set 1) (C). The 95% confidence interval for the linear regression is represented by the
gray area.
(D) Feature selection for predicting GA using identified metabolites.
(E and F) Measured MS/MS fragmentation profiles (upper) matching of PE(P-16:0e/0:0) (E) and DHEA-S (F) with the MS/MS of standard compounds (lower). GA,
gestational age.
[Figure S5 image, panels A–J. Recoverable labels: (A) predictors THDOC, progesterone, androstane-3,17-diol, and estriol-16-glucuronide, with AUC in validation of 0.98, 0.98, 0.97, 0.89, and 0.87 for gestational age cutoffs of 20, 24, 28, 32, and 37 weeks; (B) ROC curves for GA > 37w, Discovery N = 222, AUC 0.91 (95% CI: 0.87–0.96), Test Set 1 N = 98, AUC 0.87; (C) contribution of each metabolite; (D–F) log2(Intensity) by subject ID; (G) ROC curves for GA > 20w, Discovery N = 507, AUC 0.99 (95% CI: 0.97–0.99), Validation N = 245, AUC 0.98 (95% CI: 0.96–0.99); (H) ROC curves for GA > 24w, Discovery N = 507, AUC 0.97 (95% CI: 0.96–0.98), Validation N = 245, AUC 0.98 (95% CI: 0.97–0.99); (I and J) ROC curves for GA > 28w and GA > 32w, true positive rate against false positive rate.]
Figure S5. Metabolites Selected by Machine Learning Can Accurately Predict Gestational Age before or after 20, 24, 28, 32, and 37 Weeks in
Both the Discovery and Validation Cohort (Test Set 1), Related to Figure 4
(A) Summary of prediction models of gestational age (GA) before or after 20, 24, 28, 32, and 37 weeks, using two to three metabolites. Note that the prediction
models for 20, 24, and 28 gestational weeks were built using samples from all three trimesters and the ones for late pregnancy (32 and 37 weeks) were built using
third-trimester samples. The contribution rank of each predictor in every model is listed as number 1, 2, or 3. Areas under the curve (AUCs) in the validation
cohort (Test Set 1) are listed.
(B) The logistic regression model based on three metabolites can accurately distinguish the third-trimester plasma samples before or after 37 weeks.
(C) Contribution of the three metabolites to the prediction model of gestational age before or after 37 weeks.
(D) Estriol-16-Glucuronide shows intensity range separations before and after 37 weeks.
(E and F) THDOC and androstane-3,17-diol show intensity range separations before/after 37 weeks.
(G–J) The logistic regression models can accurately distinguish pregnancy samples before or after 20 (G), 24 (H), and 28 (I) weeks, and the third-trimester plasma
samples before or after 32 weeks (J). GA, gestational age.
[Figure S6 image, panels A–I. Recoverable labels: (A and B) frequency histograms; (D) androstane-3,17-diol and (E) estriol-16-glucuronide, log2(Intensity) by subject ID for WD > 2w and WD < 2w; (F) relative intensity, m/z 257.226076, RT error 2.4 s; (G) relative intensity, m/z 331.226241, RT error 29.4 s; (H) ROC curves for the prediction of weeks to delivery (WD) < 4w; (I) ROC curves for the prediction of WD < 8w, true positive rate against false positive rate.]
Figure S6. Identified Compounds Predict Gestational Age and 4 and 8 Weeks Approaching Delivery, Related to Figure 5
(A and B) Histograms show the distribution of prediction deviation (RMSE) in the cross-validation of the discovery cohort (A) and in the validation cohort,
Test Set 1 (B).
(C) The baby birth weight shows correlation with the gestational length (gestational age at childbirth). All 29 subjects who had baby birth weight information are
included here. The 95% confidence interval for the linear regression is represented by the gray area.
(D and E) Androstane-3,17-diol (D) and estriol-16-Glucuronide (E) show intensity range separations before or after 2 weeks approaching the delivery.
(F and G) Measured MS/MS fragmentation profiles (upper) of androstane-3,17-diol (F) and 17α-hydroxyprogesterone (G) matched against the MS/MS spectra of standard
compounds (lower).
(H and I) The logistic regression models can accurately identify the third trimester plasma samples approaching delivery (weeks to delivery, WD < 4w (H), WD < 8w
(I); only includes women with natural labor onset). Note that the discovery results were from the 10-fold cross-validation (CV) instead of direct fitting to avoid
over-fitting.
Correction
An Engineered CRISPR-Cas9 Mouse Line
for Simultaneous Readout of Lineage Histories
and Gene Expression Profiles in Single Cells
Sarah Bowling, Duluxan Sritharan, Fernando G. Osorio, Maximilian Nguyen, Priscilla Cheung, Alejo Rodriguez-Fraticelli,
Sachin Patel, Wei-Chien Yuan, Yuko Fujiwara, Bin E. Li, Stuart H. Orkin, Sahand Hormoz,* and Fernando D. Camargo*
*Correspondence: sahand_hormoz@hms.harvard.edu (S.H.), fernando.camargo@childrens.harvard.edu (F.D.C.)
https://doi.org/10.1016/j.cell.2020.06.018
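\[
\begin{aligned}
M_{j,k} &= \left\{ m_{j-1,k-1} \oplus \begin{bmatrix} s_j \\ r_k \end{bmatrix} ;\ m_{j-1,k-1} \in M_{j-1,k-1} \right\} \\
&\quad \cup = \left\{ d_{j-1,k-1} \oplus \begin{bmatrix} s_j \\ r_k \end{bmatrix} ;\ d_{j-1,k-1} \in D_{j-1,k-1} \right\} \\
&\quad \cup = \left\{ i_{j-1,k-1} \oplus \begin{bmatrix} s_j \\ r_k \end{bmatrix} ;\ i_{j-1,k-1} \in I_{j-1,k-1} \right\} \\
D_{j,k} &= \left\{ m_{j,k-1} \oplus \begin{bmatrix} \circ \\ r_k \end{bmatrix} ;\ m_{j,k-1} \in M_{j,k-1} \right\} \\
&\quad \cup = \left\{ d_{j,k-1} \oplus \begin{bmatrix} \circ \\ r_k \end{bmatrix} ;\ d_{j,k-1} \in D_{j,k-1} \right\}
\cup = \left\{ i_{j,k-1} \oplus \begin{bmatrix} \circ \\ r_k \end{bmatrix} ;\ i_{j,k-1} \in I_{j,k-1} \right\} \\
I_{j,k} &= \left\{ m_{j-1,k} \oplus \begin{bmatrix} s_j \\ \circ \end{bmatrix} ;\ m_{j-1,k} \in M_{j-1,k} \right\} \\
&\quad \cup = \left\{ d_{j-1,k} \oplus \begin{bmatrix} s_j \\ \circ \end{bmatrix} ;\ d_{j-1,k} \in D_{j-1,k} \right\}
\cup = \left\{ i_{j-1,k} \oplus \begin{bmatrix} s_j \\ \circ \end{bmatrix} ;\ i_{j-1,k} \in I_{j-1,k} \right\}
\end{aligned}
\]
Incorrect equation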
Cell 181, 1693–1694, June 25, 2020 © 2020 Elsevier Inc.
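\[
\begin{aligned}
M_{j,k} &= \left\{ m_{j-1,k-1} \oplus \begin{bmatrix} s_j \\ r_k \end{bmatrix} ;\ m_{j-1,k-1} \in M_{j-1,k-1} \right\} \\
&\quad \cup \left\{ d_{j-1,k-1} \oplus \begin{bmatrix} s_j \\ r_k \end{bmatrix} ;\ d_{j-1,k-1} \in D_{j-1,k-1} \right\} \\
&\quad \cup \left\{ i_{j-1,k-1} \oplus \begin{bmatrix} s_j \\ r_k \end{bmatrix} ;\ i_{j-1,k-1} \in I_{j-1,k-1} \right\} \\
D_{j,k} &= \left\{ m_{j,k-1} \oplus \begin{bmatrix} \circ \\ r_k \end{bmatrix} ;\ m_{j,k-1} \in M_{j,k-1} \right\} \\
&\quad \cup \left\{ d_{j,k-1} \oplus \begin{bmatrix} \circ \\ r_k \end{bmatrix} ;\ d_{j,k-1} \in D_{j,k-1} \right\} \\
&\quad \cup \left\{ i_{j,k-1} \oplus \begin{bmatrix} \circ \\ r_k \end{bmatrix} ;\ i_{j,k-1} \in I_{j,k-1} \right\} \\
I_{j,k} &= \left\{ m_{j-1,k} \oplus \begin{bmatrix} s_j \\ \circ \end{bmatrix} ;\ m_{j-1,k} \in M_{j-1,k} \right\} \\
&\quad \cup \left\{ d_{j-1,k} \oplus \begin{bmatrix} s_j \\ \circ \end{bmatrix} ;\ d_{j-1,k} \in D_{j-1,k} \right\} \\
&\quad \cup \left\{ i_{j-1,k} \oplus \begin{bmatrix} s_j \\ \circ \end{bmatrix} ;\ i_{j-1,k} \in I_{j-1,k} \right\}
\end{aligned}
\]
Correct equation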
These errors have now been corrected online, and we apologize for the inconvenience.
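As an illustration only, the three-state recursion in the corrected equation can be read as building, for each position (j, k), the sets of partial alignments that end in a matched column (M), a column with a gap opposite r_k (D), or a column with a gap opposite s_j (I). The minimal Python sketch below enumerates those sets for short sequences. It is not the authors' implementation: the function name `enumerate_alignments`, the `"-"` gap character, the reading of the column operator as "append one aligned column", and the base case (a single empty alignment at position (0, 0)) are all assumptions made for this example.

```python
from functools import lru_cache

GAP = "-"  # assumed gap symbol; the notice does not specify a character


def enumerate_alignments(s, r):
    """Return (M, D, I) for the full sequences s and r.

    Each set holds alignments as tuples of (s_char, r_char) columns,
    grouped by the type of their final column.
    """

    @lru_cache(maxsize=None)
    def states(j, k):
        # Assumed base case: one empty alignment, stored in M at (0, 0).
        if j == 0 and k == 0:
            return (frozenset({()}), frozenset(), frozenset())
        M = D = I = frozenset()
        if j > 0 and k > 0:
            # M_{j,k}: extend every alignment at (j-1, k-1) with (s_j, r_k).
            col = (s[j - 1], r[k - 1])
            M = frozenset(a + (col,) for prev in states(j - 1, k - 1) for a in prev)
        if k > 0:
            # D_{j,k}: extend every alignment at (j, k-1) with (gap, r_k).
            col = (GAP, r[k - 1])
            D = frozenset(a + (col,) for prev in states(j, k - 1) for a in prev)
        if j > 0:
            # I_{j,k}: extend every alignment at (j-1, k) with (s_j, gap).
            col = (s[j - 1], GAP)
            I = frozenset(a + (col,) for prev in states(j - 1, k) for a in prev)
        return (M, D, I)

    return states(len(s), len(r))


if __name__ == "__main__":
    M, D, I = enumerate_alignments("AC", "AG")
    print(len(M), len(D), len(I))
```

Because every set is enumerated explicitly, the number of alignments grows very quickly with sequence length, so this sketch is only useful for checking the recursion on toy inputs; a practical aligner would keep a score per state rather than the full sets.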
Retraction
Retraction Notice to:
A Monoclonal Antibody that Targets
a NaV1.7 Channel Voltage Sensor
for Pain and Itch Relief
Jun-Ho Lee, Chul-Kyu Park, Gang Chen, Qingjian Han, Rou-Gang Xie, Tong Liu, Ru-Rong Ji,* and Seok-Yong Lee*
*Correspondence: ru-rong.ji@duke.edu (R.-R.J.), sylee@biochem.duke.edu (S.-Y.L.)
https://doi.org/10.1016/j.cell.2020.06.019
The first author, Jun-Ho Lee, did not respond to the request to sign this retraction.
Cell 181, 1695, June 25, 2020 © 2020 Elsevier Inc.
SnapShot: JAK-STAT Signaling II
Alejandro V. Villarino,1 Massimo Gadina,1 John J. O’Shea,1 and Yuka Kanno1
1National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS), National Institutes of Health (NIH), Bethesda, MD 20892, USA
[Figure: schematic of JAK-STAT signaling. Panels show the JAK and STAT domain structures; receptor-bound JAKs at the plasma membrane; the cytoplasmic STAT pool (monomers, parallel and antiparallel dimers); STAT complexes (heterotrimer, tetramer, heterodimer, homodimer); STAT3 in the mitochondrion; and the IFN-β-induced STAT2 and IL-4-induced STAT6 binding motifs.]
JAK domains by JAK homology (JH) region: FERM domain (JH6-7), SH2-like domain (JH3-5), pseudokinase domain (JH2), kinase domain (JH1).
Abbreviations: CCD, coiled-coil domain; CD, C-terminal trans-activation domain; DBD, DNA-binding domain; FERM, 4.1 protein, ezrin, radixin, moesin; JH, JAK homology (1-7); ND, N-terminal domain; PTP, protein tyrosine phosphatase; SH2, Src homology 2 domain; SK, serine kinase; SOCS, suppressor of cytokine signaling; USP18, ubiquitin specific peptidase 18.
JAK3 | T−B+NK− SCID | T−B+NK− SCID | Leukemia
TYK2 | Viral susceptibility, diminished responses to type I IFNs, IL-12, IL-23 | Primary immunodeficiency, variants protect against autoimmunity | Not reported
STAT5A | Impaired mammary gland development | Not reported | Leukemia, lymphomas, other cancers
STAT5B | Impaired NK cell function and sexual dimorphic growth | Dwarfism, autoimmunity | Somatic mutations and eosinophilia, urticaria, leukemia, lymphomas, other cancers
STAT6 | Impaired type II immune responses | Not reported | Lymphoma
The discovery of the Janus kinase (JAK) and signal transducer and activator of transcription (STAT) pathway arose from investigation of interferon (IFN) signaling (Darnell et
al., 1994; Leonard and O’Shea, 1998). Canonical JAK-STAT signaling begins with extracellular apposition of members of a family of structurally related cytokines, interleukins,
interferons, colony-stimulating factors, and some hormones with their corresponding structurally related transmembrane receptors. This enables trans-activation of receptor-
bound JAKs that catalyze tyrosine phosphorylation (p-Tyr) of receptors and STATs, resulting in the formation of homodimers and/or heterodimers that accumulate in the
nucleus and instruct gene transcription. Interferons also induce a complex of STAT1, STAT2, and IRF9.
In mammals, there are 4 JAKs sharing 4 major structural domains: (1) the FERM (band-four-point-one ezrin radixin moesin) domain, which mediates interaction with
receptors and promotes kinase function, (2) the SH2-like domain, which mediates interaction with receptors, (3) the pseudokinase domain, which regulates kinase activity,
and (4) the kinase domain. Germline loss of function (LOF) mutations of JAK3 and TYK2 underlie human primary immunodeficiency disorders (Tangye et al., 2017). Somatic
LOF mutations of JAK1 can arise in tumor cells and are associated with resistance to IFN-γ and cancer evasion. Gain of function (GOF) mutations underlie systemic autoimmunity
(JAK1), polycythemia vera (JAK2 kinase-like domain), leukemias, lymphomas, and other malignancies. JAKs were recognized as pivotal drug targets and multiple JAK
inhibitors (jakinibs) have been approved for the treatment of myeloproliferative neoplasms, graft versus host disease, rheumatoid arthritis, psoriatic arthritis, and inflammatory
bowel disease (Gadina et al., 2020). Most jakinibs target the JAK kinase domain, but newer agents act via allosteric mechanisms (e.g., target the kinase-like domain). Despite
the clinical success, questions remain regarding the optimal degree of JAK inhibition for specific cell types in various tissues and disorders. Compared with first-generation
pan-jakinibs, selective jakinibs may provide advantages in terms of reduced toxicity. For example, selective targeting of JAK1, TYK2, and JAK3 would avoid disrupting actions
of JAK2-dependent cytokines involved in hematopoiesis (e.g., erythropoietin). In some circumstances, though, selectivity might result in reduced efficacy.
There are seven mammalian STAT family members bearing five major structural domains. In addition to JAKs, STATs may be phosphorylated by receptor tyrosine kinases,
src family kinases, and bacterial and parasite enzymes. STAT expression and availability are subject to a range of intrinsic (e.g., cellular lineage) and extrinsic (e.g., cytokines)
factors. They are regulated at transcriptional, post-transcriptional, and post-translational levels, including by protein tyrosine phosphatases (PTPs), which “de-activate” STATs
and mediate degradation or recycling. STATs are subject to serine phosphorylation (p-Ser), which influences both p-Tyr-dependent and p-Tyr-independent functions. The latter
are particularly relevant for “unphosphorylated” STATs (uSTATs), which are arranged as anti-parallel dimers (unlike conventional p-Tyr-dependent parallel dimers). uSTATs can
modulate the pool of cytoplasmic STATs and mediate gene transcription (Stark et al., 2018; Stark and Darnell, 2012). STATs are present in mitochondria and positively regulate
complexes I, II, and V of the electron transport chain, elevating mitochondrial membrane potential and favoring mitochondrial respiration (Garama et al., 2016).
STATs are classical transcription factors (TFs) that engage DNA regulatory elements (DREs) bearing a defined sequence motif and instruct transcription of protein-coding
(mRNAs) or non-coding genes (microRNAs, long non-coding RNAs). This is achieved through a combination of proximal DREs, located close to transcriptional start sites
(TSSs) or within gene bodies, and distal elements, which can physically interact via chromatin looping. STATs bind broadly throughout the genome at promoters and, even more so, at
enhancers. STATs often congregate at both conventional and super-enhancers; notably, the genes encoding STATs themselves bear super-enhancers, in line with the idea that
they are tightly transcriptionally regulated.
STATs typically engage multiple DREs associated with a given gene, and genes are often bound by multiple STAT family members, along with other TFs, such as IRF, AP-1,
and NF-κB family proteins, creating a platform for multi-molecular networks to control gene expression (Villarino et al., 2017). STATs also influence chromatin remodeling by
recruiting histone-modifying enzymes. Although originally identified as transcriptional activators, STAT binding is associated with both induction and repression of genes.
Germline STAT mutations are associated with primary immunodeficiency and autoimmunity, whereas somatic GOF STAT mutations are associated with cancer. STATs do
not have enzymatic activity and are more challenging than JAKs as therapeutic targets. Three potential strategies have been employed: (1) inhibitory peptides, which sequester
STATs from upstream receptors and kinases, (2) small-molecule inhibitors, which impede STAT activation and/or function, and (3) decoy oligonucleotides, which sequester
STATs away from genomic binding sites.
The JAK-STAT pathway is negatively regulated in multiple ways. Aside from dephosphorylation, suppressor of cytokine signaling (SOCS) proteins are induced by cytokines
and IFNs and inhibit proximal receptor signaling (Morris et al., 2018). Ubiquitin specific peptidase 18 (USP18) is induced by IFNs and binds to STAT2, negatively regulating its
function; mutations that interfere with USP18 action are associated with lethal interferonopathy (Basters et al., 2018). STATs are also targeted and degraded by viruses.
In summary, the JAK-STAT pathway has emerged as a paradigm for membrane-to-nucleus signaling, and building on the legacy of the first quarter century of JAK-STAT
research, the field continues to deliver transformative, clinically relevant insights on the nature of intercellular communication, gene expression, and signal-dependent
transcription factors. Still, there is much to learn. Detailed molecular and cell-biologic understanding of cytokine receptor/JAK structure-function relationships will be further
enhanced by advanced real-time measurement of signaling and gene transcription by imaging. These advances are eagerly awaited and will no doubt offer many new
translational insights and therapeutic opportunities.
ACKNOWLEDGMENTS
This work was supported by the NIAMS Intramural Research Program.
REFERENCES
Basters, A., Knobeloch, K.P., and Fritz, G. (2018). USP18 - a multifunctional component in the interferon response. Biosci. Rep. 38, BSR20180250.
Darnell, J.E., Jr., Kerr, I.M., and Stark, G.R. (1994). Jak-STAT pathways and transcriptional activation in response to IFNs and other extracellular signaling proteins. Science 264,
1415–1421.
Gadina, M., Chisolm, D.A., Philips, R.L., McInnes, I.B., Changelian, P.S., and O’Shea, J.J. (2020). Translating JAKs to Jakinibs. J. Immunol. 204, 2011–2020.
Garama, D.J., White, C.L., Balic, J.J., and Gough, D.J. (2016). Mitochondrial STAT3: Powering up a potent factor. Cytokine 87, 20–25.
Leonard, W.J., and O’Shea, J.J. (1998). Jaks and STATs: biological implications. Annu. Rev. Immunol. 16, 293–322.
Morris, R., Kershaw, N.J., and Babon, J.J. (2018). The molecular details of cytokine signaling via the JAK/STAT pathway. Protein Sci. 27, 1984–2009.
Stark, G.R., and Darnell, J.E., Jr. (2012). The JAK-STAT pathway at twenty. Immunity 36, 503–514.
Stark, G.R., Cheon, H., and Wang, Y. (2018). Responses to Cytokines and Interferons that Depend upon JAKs and STATs. Cold Spring Harb. Perspect. Biol. 10, a028555.
Tangye, S.G., Pelham, S.J., Deenick, E.K., and Ma, C.S. (2017). Cytokine-Mediated Regulation of Human Lymphocyte Development and Function: Insights from Primary Immunodeficiencies. J. Immunol. 199, 1949–1958.
Villarino, A.V., Kanno, Y., and O’Shea, J.J. (2017). Mechanisms and consequences of Jak-STAT signaling in the immune system. Nat. Immunol. 18, 374–384.
1696.e1 Cell 181, June 25, 2020 © 2020 Published by Elsevier Inc. DOI https://doi.org/10.1016/j.cell.2020.04.052