Cell - Vol. 181 (Nº7)

Leading Edge
Editorial
Science Has a Racism Problem
We are the editors of a science journal, committed to publishing and disseminating exciting work across the biological sciences. We are 13 scientists. Not one of us is Black. Underrepresentation of Black scientists goes beyond our team, to our authors, reviewers, and advisory board. And we are not alone. It is easy to divert blame, to point out that the journal is a reflection of the scientific establishment, to quote statistics. But it is this epidemic of denial of the integral role that each and every member of our society plays in supporting the status quo by failing to actively fight it that has allowed overt and systemic racism to flourish, crippling the lives and livelihoods of Black Americans, including Black scientists.

Science has a racism problem.

Look to the history of human genetics, a field that has been used repeatedly as scientific rationale for the definition of human “races” and to support inherent inequalities. Proponents of eugenics use the alleles we carry as reason to declare racial superiority, as if expression of a lactase gene has bearing on one’s humanity. Race is not genetic.

Look to the exploitation of Black research subjects. Acknowledge the sheer volume of past and current scientific research made possible by cells stolen decades ago from Henrietta Lacks, a Black woman with cancer. Remember the Tuskegee syphilis study that intentionally withheld appropriate treatment to hundreds of Black men. Think about the issues of consent, of ownership, and of medical ethics and do not overlook the shared role of race in these violations.

Look to the extreme disparity in the genetic and clinical databases scientists have built, with the overwhelming majority of data from white Americans of European descent and the resulting dearth of understanding of health and disease in Black individuals. Read statistics about morbidity and mortality disparities in hospitals around the country, highlighted by the current pandemic, and ask why Black women are five times more likely than white women to die during pregnancy, or why Black infants are twice as likely to die as white babies born in the US. Black health has never been the priority.

Science has a racism problem. And it is not limited to scientific discoveries and their attendant usage. The scientific establishment, scientific education, and the metrics used to define scientific success have a racism problem as well.

Black Americans face a mountain of challenges built on centuries of systemic structural racism and the United States’ history of slavery and racial oppression. Educational opportunities, mentorship and representation, and our ingrained, often unconscious attitudes all play a role. The gatekeeping system in academia, industry, and scientific organizations was not designed to correct for centuries of compounded disadvantage and oppression. It is time for renovation.

We urge our community members who have the means to enact change to do so. Hiring committees, educators, mentors, admissions committees, classmates, researchers: what can you do to raise up Black students and colleagues in your communities and institutions? None of us individually can stem the tide of racism or rebuild an unjust society, but every action helps.

We are part of the problem, as are all of us who do not press for change on a daily basis. It should not have taken the recent deaths of George Floyd, Breonna Taylor, and Ahmaud Arbery for us to speak and to act. We are asking ourselves what we can do to be stronger allies, stronger anti-racists.

Cell stands with our Black readers, reviewers, authors, and colleagues. We are committed to listening to and amplifying their voices, to educating ourselves, and to finding ways that we can help and do better. We alone cannot fix racism. But we have the advantage of having a platform, so we will put in the work, we will listen, and we will act.

As a start, we are committing to the following actions to highlight and increase representation of Black scientists:

1. Representing – we will feature and amplify Black and other underrepresented minority authors of Cell papers on social media. If you are a person of color and you wish to be highlighted in this way, please tell us. Email the editor of your paper with the subject line “Faces of Cell” at any point in the publication process, and we will be honored to post about your paper with your photo and/or your Twitter handle and to re-tweet and amplify your own posts and stories.
2. Educating – we are committed to featuring issues of importance to the scientific community in our pages. We pledge to purposefully highlight Black authors and perspectives in the review and commentary content that we commission and publish and to share these with the greater scientific community. Has your department or institute already made changes or launched successful initiatives? Tell us, and we will try to find ways to share those stories. Have new ideas? Let us know.
3. Diversifying – we pledge to improve the diversity of our advisory board and our reviewer pool, using our experience with gender equity initiatives to increase representation of non-white scientists, which is far too low. We are actively investigating ways to improve diversity through our outreach, recruiting, and hiring efforts, at Cell and across Cell Press. If you are a Black scientist with an interest in editorial careers, get in touch. We’re eager to talk.
4. Listening – we are editors because we want to learn. If there are ways that we can use our voice and our platform to help the Black scientist community, we want to hear them. Please email us if you have concrete ideas for perspectives you want to see or creative ways that you think we can help. We promise to hear them.

We and our colleagues across Cell Press hope to serve as one small part of amplifying Black voices in STEM, and this is just the beginning. We are learning, and we will almost certainly make mistakes along the way. But silence is not, and never should have been, an option.

Science has a racism problem. Scientists are problem solvers. Let’s get to it.

The Cell Editorial Team

https://doi.org/10.1016/j.cell.2020.06.009
Cell 181, June 25, 2020 © 2020 Published by Elsevier Inc.

Leading Edge
Commentary
How Support of Early Career Researchers
Can Reset Science in the Post-COVID19 World
Erin M. Gibson,1,12,* F. Chris Bennett,2 Shawn M. Gillespie,3 Ali Deniz Güler,4 David H. Gutmann,5 Casey H. Halpern,6
Sarah C. Kucenas,4 Clete A. Kushida,1 Mackenzie Lemieux,7 Shane Liddelow,8 Shannon L. Macauley,9 Qingyun Li,10
Matthew A. Quinn,11 Laura Weiss Roberts,1 Naresha Saligrama,5 Kathryn R. Taylor,3 Humsa S. Venkatesh,3 Belgin Yalçın,3
and J. Bradley Zuchero6
1Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA 94305, USA
2Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
3Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, CA 94305, USA
4Department of Biology, University of Virginia, Charlottesville, VA 22903, USA
5Department of Neurology, Washington University School of Medicine, St. Louis, MO 63110, USA
6Department of Neurosurgery, Stanford University School of Medicine, Palo Alto, CA 94305, USA
7The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
8Neuroscience Institute, NYU School of Medicine, New York, NY 10016, USA
9Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
10Departments of Neuroscience and Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
11Department of Pathology, Section on Comparative Medicine, Wake Forest School of Medicine, Winston-Salem, NC 27517, USA
12Lead Contact
*Correspondence: egibson1@stanford.edu
The COVID19 crisis has magnified the issues plaguing academic science, but it has also provided the scientific establishment with an unprecedented opportunity to reset. Shoring up the foundation of academic science will require a concerted effort between funding agencies, universities, and the public to rethink how we support scientists, with a special emphasis on early career researchers.

Cell 181, June 25, 2020 © 2020 Elsevier Inc.

The novel coronavirus, SARS-CoV-2, has placed science at the center of every conversation, amplifying the importance of scientific research to economic stability, healthcare infrastructure, and disaster preparedness. In academic science, recovery from the immediate COVID19 crisis will require departments, universities, private foundations, federal agencies, and the public to work together collaboratively and comprehensively. The goal of recovery should not be to return to “normal” but, rather, to reset. Here, we argue that recovery provides us with the opportunity to address three systemic issues that plague the conduct of research in the twenty-first century, with an emphasis on supporting early career researchers who are the most vulnerable. The strategies needed to ensure stability and success of early career scientists post-COVID19 can be adapted to chip away at the systemic issues affecting the scientific establishment.

Excess Does Not Equal Excellence
Science has changed immensely over the past 50 years. More has become better: more experiments per paper, more papers per year, more expectations and requirements for grants and tenure, more opinions from reviewers. The scientific community rewards quantity over quality. Most scientists can easily name a seminal paper; many were published long before the 2000s, and many had, at most, a handful of figures. Today, papers are often published with a plethora of supplemental figures that will largely go unread and underappreciated. The desire for “more” results in delays in publication, the awarding of grants, and career advancement for early career researchers; it also stymies creativity and encourages the proliferation of low-quality journals.

Diversification Leads to Discovery
This crisis is exacerbating the well-documented discrimination afflicting academic science (Monroe et al., 2008). Women, parents, and individuals who identify as racial or ethnic minorities leave science, technology, engineering, and math (STEM) fields as early career researchers at an excessively high rate in the best of times and will undoubtedly suffer more from the present lab closures. The responsibilities of family life disproportionately impact women. A parent who is trying to homeschool their children, manage household duties, and work will have little time left to further their own scientific agenda. Faculty with family responsibilities, women specifically, must be supported. The COVID19 crisis will only highlight the rampant diversity issues plaguing the scientific establishment, many of which begin with the loss of women and minorities during early career stages and may lead to further disenfranchisement of the disadvantaged (Malisch et al., 2020).

Rethink the Fundamentals of Funding
The current model of academic science is heavily reliant upon federal funding, even though agencies such as the National Institutes of Health (NIH) were not built to sustain such expectations. The federal government’s funding capacity has significantly diminished as the cost of science has radically increased. The 2019 defense

budget was $685 billion while the 2019 NIH budget was $39 billion. The COVID19 crisis has clearly amplified that the greatest risk to American life is not war, but disease. Funding is needed at all levels; however, early career researchers should be particularly supported as the consistent trend of shifting funding away from younger researchers has no end in sight (Daniels, 2015).

Ensuring a Durable Future for Academic Science Post-COVID19
Recovery from the immediate COVID19 crisis necessitates a multi-pronged approach including fiscal and non-fiscal strategies to help graduate students, postdoctoral fellows, and early and later career faculty. This pandemic has particularly impacted senior postdoctoral fellows seeking academic faculty positions and early career faculty seeking to establish themselves as independent investigators. Special consideration for these early career researchers is key to overcoming the crisis and strengthening the foundations of academic science. Our action plan proposed below is not an exhaustive list of all possible recommendations for supporting scientists, nor is it inclusive of every academic scientist’s specific circumstance. Not all of our suggestions are applicable at every university or institution, as each will have its own unique set of challenges. We acknowledge that monetary support will be limited due to the deteriorating economic situation and drastic loss of revenue from clinical operations for most medical campuses. While the immediate goal of the recommendations is to provide support for scientists from funding agencies, universities, departments, and the public following COVID19, this support also provides solutions to the three major challenges. Solutions to these systemic issues (i.e., Excess Does Not Equal Excellence, Diversification Leads to Discovery, or Funding Agencies) are interwoven across the structure of academic science, allowing us to comprehensively tackle these issues at all levels. Plans for recovery from the COVID19 pandemic must ensure as much continuity as possible in research while improving upon existing infrastructures in order to provide a more inclusive, cohesive, and efficient future for the next generation of independent scientists.

Funding Agencies
Grantsmanship
The resiliency of research is dependent upon the support of funding agencies. Like the broader scientific community, funding agencies will need to adapt their strategies and structure to fit the changing times. Simplification of grant application processes, including fewer supplemental documentations and more implementation of letter-of-intent formats prior to full proposals, could increase efficiency for both the funding agency and researcher. Lab closures will undoubtedly create a void in the preliminary data that are necessary to obtain most awards. Early career researchers who had less time to acquire these data prior to lab shutdowns will be the most affected. Funding agencies could introduce policies and programs targeted at early investigators that require fewer preliminary data (similar to the National Institute of Mental Health [NIMH] Brain Research through Advancing Innovative Neurotechnologies [BRAIN] Initiative R01 or the DP2), reducing the excess in data required for most grants. Grants submitted by graduate students, postdoctoral fellows, and early career faculty who do not have sufficient preliminary data per current standards should be given special consideration. Currently, many of the new funding opportunities by funding agencies, such as the NIH, are geared toward supplements to existing grants or COVID-related research. As there will likely be restrictions or reductions to new funding opportunities in the coming years due to fiscal shortages, faculty with existing grants might help early career faculty by including them in their supplemental applications. Including early career faculty will also foster collaboration and resource sharing, both of which will be vital during this time (Excess Does Not Equal Excellence and Rethink the Fundamentals of Funding).

Extension of Deadlines, Timelines, and Funding
Numerous funding agencies have already implemented deadline extensions, but deadlines must be further extended for the duration of lab disruptions. It is also imperative that funding agencies extend early investigator status for grant applications and implement no-cost extensions for currently held grants. Additional bridge funding programs may be especially important for faculty who are between projects or aiming to switch areas of study following the COVID19 crisis.

Universities
Extensions for Tenure: Faculty
Most universities have added one-year extensions to the tenure tracks of early career researchers, but sliding extensions may better support the success of vulnerable academics. Many early career investigators may request extensions during lab closures, but they should also have the ability to go up for tenure early if the opportunity arises. Ensuring the promotion and advancement of marginalized groups such as women, who make up less than 30% of STEM faculty, is even more imperative post-COVID19. COVID19-initiated resetting of expectations for the publishing, teaching, mentorship, and service requirements for tenure may not only help minimize the excesses innate to the current tenure structure, but also may help foster environments that can acknowledge implicit biases and keep marginalized groups from disproportionately leaving STEM fields. Tenure expectations for the next generation of early career researchers may need to account for increased variability between faculty that is exacerbated by the COVID19 crisis and allow for more flexibility in the process. This crisis has amplified how the antiquated one-size-fits-all guidelines only encourage the disenfranchisement of women and racial or ethnic minorities (Diversification Leads to Discovery and Excess Does Not Equal Excellence).

Trainees
The current crisis will have a dramatic trickle-down effect, and numerous hiring freezes are already in place. Mechanisms to allow postdoctoral fellows or graduate students in their final year to continue in their current positions should be enacted, if necessary, and if labs or universities are able to provide fiscal support. Current closures are also disrupting the ability of many graduate students to complete their rotations. Universities could extend the timeline for rotations and potentially cover graduate students’ stipends. Trainees, particularly postdoctoral fellows, may
have limited ability to extend their period of training due to visa restrictions. Universities should coordinate with federal agencies to pursue strategies aimed at extending visa expiration timelines, allowing trainees to complete work that was delayed due to the COVID19 crisis. These mechanisms are needed to assure that we do not lose an entire generation of scientists following the coronavirus crisis.

Curtailment of Applicable Hiring Freezes
Many universities have implemented hiring freezes for faculty and staff for the remainder of the year or beyond. Universities should not limit the ability of early career faculty to hire postdoctoral fellows and staff, however. Restricting early career faculty from hiring technical assistance and lab managers will stymie their ability to generate preliminary data, which will consequently limit grant and paper submissions and delay career advancement. Even a short hiring freeze could have devastating effects on the ability of early career faculty labs to succeed. Allowing early career faculty to continue hiring will also help to ease the bottleneck of graduate students looking for postdoctoral or research scientist positions within the next few years. Hiring freezes at any level will disproportionately affect early career individuals and oversaturate the market with qualified candidates. Permitting ongoing interviews for faculty positions, even if the official hire date is postponed, could alleviate stress on the postdoc population and expedite the hiring process when hiring freezes are lifted. The faculty search process serves as a valuable feedback mechanism for postdoctoral fellows that sometimes has an impact on career path. Halting all hiring and all faculty searches may drive talented postdocs, especially women and members of ethnic or racial minorities, out of academia (Diversification Leads to Discovery).

Institutional Funds and Startup Packages
Although universities may curtail spending from institutional funds, special consideration should be given to new and early career faculty. Early career faculty must retain access to their startup packages during this time. Institutional funds should be released for salary support for early career faculty and for all staff, students, and trainees in their labs. If startup funds are set to expire, the expiration date should be extended. New faculty should be given the funds needed to establish their labs once research activities resume (Rethink the Fundamentals of Funding).

Supplementation
The economic toll caused by shelter-in-place will undoubtedly be significant, including the reduction in funding through endowments and charitable giving. We fully acknowledge that monetary supplementation may be difficult for universities following the COVID19 crisis. Any combination of fiscal supplementation with other mechanisms of non-fiscal support should be considered. Universities might implement new or expanded fellowships for postdocs and graduate students, add to existing startup packages for faculty, assist with the purchasing of equipment or expand shared equipment funding, or create subsidies or joint ventures with federal programs similar to unemployment or re-deployment programs. Universities might supplement pay or provide reimbursement for staff, postdoc, and graduate student salaries during the duration of academic closures.

Supplementation: Per Diem Costs
Many universities have per diem policies that differ based on funding source, with reduced per diem costs associated with federal grants. Early career faculty without federal funding have per diem costs double that of other labs. Universities could implement mechanisms to reduce or supplement animal costs that will be accrued during lab closures and when labs reopen and expand their animal colonies (Rethink the Fundamentals of Funding).

Supplementation: Childcare Initiatives
Onsite daycare facilities support postdoctoral fellows and faculty with young children. These family care centers are critical to narrowing the gap and slowing the attrition of women and parents in science. Universities could work with early childhood education programs to establish or expand daycare and preschool programs, providing free or subsidized childcare for faculty and teaching opportunities for early education majors. Universities might also reach out to current or retired teachers seeking supplemental income (Diversification Leads to Discovery).

Supplementation: Access to Technology
Universities should encourage and enable graduate students and postdocs to use this time to learn new computational skills in anticipation of reductions in ability to do work at the bench. Many university-offered computational courses were over-committed during lab closures due to a significant increase in enrollment requests. Universities should make a concerted effort to increase bandwidth and capacity for computational courses. Many free online resources are also available to supplement the acquisition of coding skills.

Departments: Administrative and Teaching Load
Administrative and teaching expectations should be reevaluated during university closures. Departments should reassess administrative and teaching loads, especially for early career faculty whose promotions are contingent upon teaching requirements. This is especially important, since female scientists generally have increased teaching loads and more advisory expectations than male scientists (Gibney, 2017), which could disproportionately delay scientific recovery of female scientists from COVID19 closures (Diversification Leads to Discovery).

Mentorship
Mental Health
The COVID19 crisis and subsequent lab closures will take an incredible toll on mental health. Early career faculty who have yet to establish themselves or their research independently and postdocs whose future job prospects are now significantly limited will be especially impacted by prolonged lab shutdowns. Department chairs, division leaders, and mentors should do their best to check in with early career faculty and postdocs during this time. Mentoring will be key both during and after this crisis. Establishing scheduled virtual meetings during social distancing and in-person meetings after labs are reopened could help alleviate some mental stress. University mental health resources are also available for anyone who needs support. As students generally contact female faculty about mental health issues more frequently than male
faculty (Bennett, 1982), equal encouragement of mentorship from all faculty is essential to not overburdening women faculty during this time (Diversification Leads to Discovery).

Figure 1. The COVID19 crisis has magnified the systemic issues plaguing academic research. These include the often stifling excess requirements in publication, tenure, and grant processes; the reliance on funding from national agencies that is catered towards senior level researchers; and the lack of diversity in academic research due to the attrition of women and racial or ethnic minorities during early career stages.

Graduate Student Programs
Mentoring graduate students throughout lab closures and after reopening should be strongly encouraged. Those conducting experiments will be most affected by lab closures, and this should be explicitly acknowledged by faculty and mentors. Universities must assure graduate students that graduate programs will be stabilized and that admittance will not be decreased. For many faculty, graduate students are the major workforce of the lab. To ensure that faculty can successfully build and sustain a lab, continued ability to attract graduate students is necessary. This is especially important for new investigators, as getting postdoctoral fellows can be more challenging for newer faculty.

Faculty Mentorship Programs
Once labs are reopened, pairing early career faculty with a later career faculty mentor of an established lab could facilitate more effective research programs and allow for resource sharing. Later career faculty could be incentivized to help early career researchers through reductions in teaching or administrative loads, supplementations to animal care costs, core facility usages, or other means of reimbursement and/or subsidizations. Investment of later career faculty in the success of early career faculty will help to ensure stability and success in the younger generation of independent researchers.

Clinician-Scientists
Faculty who have clinical responsibilities also necessitate special consideration during this time, especially if they are on the front lines. These individuals will not only lose productivity due to lab closures and curtailment of patient enrollment in clinical trials, they will also have the extra physical and mental stressors of working in the hospital during a crisis. Establishment of protocols to aid clinician-scientists is imperative to ensuring their important contributions to science. Just as senior faculty mentoring will be critical for junior faculty and graduating postdocs to successfully transition to a post-COVID era in the basic sciences, this type of mentorship protocol may be even more critical for clinician-scientists, many of whom do not have doctorates beyond the medical degree.

Public Initiatives
Make Science a National Priority
The current crisis has brought the importance of science and research to the forefront of public life. Not only is science critical for public health decision-making, but a sustained investment in research better positions political leaders to efficiently deploy testing and therapeutic solutions. Capitalizing on this momentum is crucial to engaging the public in science and science funding. Providing additional funding sources focused on conveying science to the greater public and stimulating interest in science through educational outreach is critical. Exploiting technology and social media to bring science and research directly to the public will be vital in the post-COVID19 world. Such technology might include mechanisms to allow private citizens to directly invest in science and scientists (Else, 2019; Miller, 2019), including simplified website-based donation platforms or inclusion on election ballots. This is necessary for establishing new funding sources for scientists, potentially supplementing the dearth of funding for early career researchers at federal funding agencies (Rethink the Fundamentals of Funding).

Enhanced Scientific Transparency
The COVID19 crisis has revealed a lack of public understanding about how science is funded, conducted, and reported. The current administration’s belief that the NIH is “giving away $32 billion a year” should be cause for concern (DeYoung et al., 2020). Much of the mistrust evident between the scientific establishment and the general population is rooted in lack of transparency and community
involvement in science. Taking scientists out of the “ivory tower” and increasing accessibility through technology may help to assuage the mistrust that hinders our preparedness in times of crisis. People cannot support what they do not understand. Removing excess requirements in publishing, grantsmanship, and tenure expectations could have the added benefit of creating more time for scientists to interact in the public domain. Scientists must work on building the trust that is imperative to success as a community, and early career scientists are primed to help pave this new future (Excess Does Not Equal Excellence).

Conclusions
Beyond the immediate challenges of returning to laboratories and research careers, the COVID19 crisis has exposed some of the underlying weaknesses and problems that permeate the current scientific enterprise (Figure 1). For example, editors are asking reviewers to not request more experiments unless absolutely necessary to validate the core claims of a manuscript during the review process. Most are applauding this effort to minimize excess and calling for its continued implementation even after scientists are able to get back to the bench. All institutions, funding agencies, departments, and members of the scientific community should speak openly and honestly about the difficulties faced during the current situation. Early career researchers should be involved in the decision-making processes, as they represent the future of science and academic leadership. The COVID19 crisis has provided us with the unique opportunity to reflect upon the present norms and enact change through fiscal and non-fiscal strategies. Our hope is that this pandemic will allow us to chart a new course for science, both academically and socially, and to begin to address the core challenges of research, with a special focus on supporting the next generation of independent scientists.

DECLARATION OF INTERESTS
Dr. Roberts serves as Editor-in-Chief of books for the American Psychiatric Association Publishing Division and as Editor-in-Chief of the journal Academic Medicine. Unrelated to this publication, Dr. Roberts serves as an advisor for the Bucksbaum Institute of the University of Chicago Pritzker School of Medicine and owns the small business Terra Nova Learning Systems.

REFERENCES
Bennett, S.K. (1982). Student perceptions of and expectations for male and female instructors: Evidence relating to the question of gender bias in teaching evaluation. J. Educ. Psychol. 74, 170–179.

Daniels, R.J. (2015). A generation at risk: young investigators and the future of the biomedical workforce. Proc. Natl. Acad. Sci. USA 112, 313–318.

DeYoung, K., Sun, L.H., and Rauhala, E. (2020). Americans at World Health Organization transmitted real-time information about coronavirus to Trump administration. The Washington Post. Available from https://www.washingtonpost.com/world/national-security/americans-at-world-health-organization-transmitted-real-time-information-about-coronavirus-to-trump-administration/2020/04/19/951c77fa-818c-11ea-9040-68981f488eed_story.html.

Else, H. (2019). Crowdfunding research flips science’s traditional reward model. Nature. Available from https://www.nature.com/articles/d41586-019-00104-1.

Gibney, E. (2017). Teaching load could put female scientists at career disadvantage. Nature. Available from https://www.nature.com/news/teaching-load-could-put-female-scientists-at-career-disadvantage-1.21839.

Malisch, J.L., Harris, B.N., Sherrer, S.M., Lewis, K.A., Shepherd, S.L., McCarthy, P.C., Spott, J.L., Karam, E.L., Moustaid-Moussa, N., McCrory Calarco, et al. (2020). In the wake of COVID-19, academia needs new solutions to ensure gender equity. Proceedings of the National Academy of Science. https://doi.org/10.1073/pnas.2010636117.

Miller, Z. (2019). The best platforms for crowdfunding science research. The Balance: Small Business. Available from https://www.thebalancesmb.com/top-sites-for-crowdfunding-scientific-research-985238.

Monroe, K., Ozyurt, S., Wrigley, T., and Alexander, A. (2008). Gender Equality in Academia: Bad News from the Trenches, and Some Possible Solutions. Perspectives on Politics 6, 215–233.
Cell 181, June 25, 2020 1449

Leading Edge
Previews
Cap-Snatching Leads to Novel Viral Proteins
Alistair B. Russell1,*
1Division of Biology, University of California, San Diego, San Diego, CA, USA
*Correspondence: a5russell@ucsd.edu
Some negative-sense RNA viruses prime mRNA transcription using host 5′ cap sequences, usurping host
translational machinery and evading antiviral surveillance. In this issue of Cell, Ho et al. identify an additional
consequence of this viral strategy: the acquisition of upstream start codons from host-derived sequences
and subsequent translation of novel viral products.
Canonical eukaryotic mRNAs possess a 5′ methylguanosine cap and a 3′ polyadenosine tail. These features, combined, recruit cellular translational machinery and mark these molecules as “belonging” in the cytoplasmic compartment wherein they are accessible to ribosomes. Influenza A virus (IAV), lacking its own capping machinery, produces capped mRNAs through a process known as cap-snatching. The viral polymerase associates with an actively transcribing RNA polymerase II complex and cleaves the nascent host mRNA, thereafter using the cleaved product to prime viral mRNA production. This process appears to roughly target caps according to their relative abundance in the nuclear compartment, resulting in diverse cap sequences associated with a given viral mRNA (Walker and Fodor, 2019).

Previously, little to no consequence has been ascribed to the composition of host-derived sequences appended to the 5′ end of viral mRNAs, and the heterogeneity of such sequences is certainly consistent with a lack of function in viral transcription and translation. Ho et al. (2020) posited that AUG codons snatched in such a manner might permit the recruitment of ribosomes upstream of the canonical IAV start codons, producing either N-terminal extensions to known viral proteins or novel frameshifted peptides. The authors found that AUG codons could be readily identified in host-derived 5′ sequences in two different IAV strains in two different cell types, indicating that there appeared to be no specific mechanism precluding their acquisition by viral transcriptional machinery. Furthermore, Ho et al. (2020) were able to detect initiating ribosomes associated with upstream, host-derived start codons, and, for a subset of these proteins, they could detect expression by mass spectrometry. Combined, these represent relatively unequivocal evidence that host-derived alternative start codon acquisition can drive protein production in IAV.

What is the consequence of translation from host-derived start codons? Such proteins could play a role in viral infection via at least two non-exclusive mechanisms. (1) They could represent bona fide, functional, overprinted products that serve significant roles in the viral life cycle, and (2) they could provide targets for adaptive immunity (Figure 1). With respect to the first, the genomes of RNA viruses are restricted in length, and overprinting, encoding multiple proteins from the same overlapping sequence, is a common mechanism by which additional coding capacity is procured. Ribosomal frameshifting, encoded alternative start sites, and alternative splicing are, among other strategies, utilized by IAV to generate alternative protein products (Chen et al., 2001; Dubois et al., 2014; Jagger et al., 2012). Ho et al. (2020) found broad conservation between IAV strains of an overprinted protein that would be generated from upstream host-derived start codons, consistent with a functional hypothesis. However, other evolutionary constraints such as the sequence of the “primary” influenza protein, genome packaging constraints, and viral promoter sequences would also produce signatures of conservation in these genomic regions, a point also noted by the authors. Furthering a functional argument, Ho et al. (2020) found evidence that disrupting the expression of one of these overprinted proteins or an N-terminal extension to a viral protein driven by an upstream host-derived start codon altered in vivo pathogenesis. Caution should still be taken when trying to link these virulence data to a true biological function; infection outcome for the host is ancillary to viral fitness, and, even in a model that perfectly recapitulates the host response, the route and dosage of delivery can produce an incredible breadth of outcomes that may have little resemblance to the normal course of disease. Therefore, without a specific proposed mechanism, these data do not yet support inferring a function for these novel proteins in IAV but also do not preclude such a role.

In addition to potential novel functional proteins, this discovery is incredibly important in light of the second possibility: that these upstream start codons may generate targets for immune surveillance. Any protein, regardless of function or lack thereof, once degraded can generate potential peptides for display via MHC-I and mark a cell for subsequent destruction and recruitment of a more robust immune response. One could even posit that, unlike full-length, canonical viral proteins, out-of-frame, or even N-terminally extended, proteins may lack functional selection driving stability. Without such selection, these proteins may be rapidly targeted to the proteasome, and thus preferentially presented via MHC-I, and represent a heretofore unrecognized target for antiviral immunity. It has been increasingly recognized that “off-products” of viral replication, be they aberrant peptides or even aberrant genomes, are frequently the trigger for an immune response (López, 2014; Wei and Yewdell, 2019). This is perhaps an
1450 Cell 181, June 25, 2020 ª 2020 Elsevier Inc.

Figure 1. Biological Impacts of Cap-Snatching

IAV engages in cap-snatching, producing chimeric host-viral mRNAs with a 5′ methylguanosine cap (host sequence, blue; viral, black). Previously identified
implications of this strategy (left) include global host transcriptional suppression (cap-stolen from actively transcribing RNA polymerase II, shown in blue),
recruitment of ribosomes to viral messages (ribosome, brown), and evasion of antiviral surveillance by RIG-I (red)-like receptors (Decroly et al., 2011). New
implications from this study identifying translation from host-derived start codons (left, N-terminal extension and frameshifted proteins, red) include potential
novel viral functions and additional targets for host surveillance (T cell engaging in MHC-I detection shown).
unavoidable feature of the rapid, exponential growth of viral populations wherein fidelity of replication may be at odds with replicative speed (Fitzsimmons et al., 2018). Ho et al. (2020) found that an overprinted peptide contains predicted MHC-I epitopes, that these epitopes varied between IAV strains consistent with potential immune pressure, and that an out-of-frame canonical MHC-I peptide could be produced and displayed; thus, formally, MHC-I targets can be generated via acquisition of host-derived start codons. A critical next step is establishing whether peptides derived from native proteins generated by upstream host-derived start codons are displayed via MHC-I, and if so, what role they may play in viral surveillance and clearance. Such work becomes even more important in light of recent attempts to generate T cell responses against IAV, which, although they may not prevent infection, are still thought to have the potential to reduce morbidity and mortality (La Gruta and Turner, 2014).

Cap-snatching is not unique to IAV. To that end, Ho et al. (2020) explored 5′ cap sequences from influenza B virus and Lassa virus and found host-acquired AUG codons in both viral species. Using a plasmid-based system that recapitulates the replicon of Heartland banyangvirus, a Bunyavirus, they further confirmed translation of a luciferase lacking a start codon, consistent with acquisition of host-derived start codons in frame with their reporter. Therefore, the findings with IAV have widespread relevance across many negative-sense RNA viruses. What we make of this moving forward will take a significant amount of work, but such work will have been made possible by this initial observation. We must uncover whether these alternative products serve functional roles, whether they represent significant contributors to MHC recognition of viral infection, and, not touched upon in this study but important nevertheless, whether selection against particular N-terminal extensions or immunogenic peptides constrains the evolutionary trajectories of viral species that rely on cap-snatching. Ho et al. (2020) have provided us with the critical step, identifying that such features can impact viral biology, and it will be exciting to explore the implications of this study.

ACKNOWLEDGMENTS
Work from A.B.R. in this field is supported in part by the Damon Runyon Cancer Research Foundation (DFS-36-19) and NIAID (1K22AI141678).

REFERENCES
Chen, W., Calvo, P.A., Malide, D., Gibbs, J., Schubert, U., Bacik, I., Basta, S., O’Neill, R., Schickli, J., Palese, P., et al. (2001). A novel influenza A virus mitochondrial protein that induces cell death. Nat. Med. 7, 1306–1312.
Decroly, E., Ferron, F., Lescar, J., and Canard, B. (2011). Conventional and unconventional mechanisms for capping viral mRNA. Nat. Rev. Microbiol. 10, 51–65.
Dubois, J., Terrier, O., and Rosa-Calatrava, M. (2014). Influenza viruses and mRNA splicing: doing more with less. MBio 5, e00070–14.
Fitzsimmons, W.J., Woods, R.J., McCrone, J.T., Woodman, A., Arnold, J.J., Yennawar, M., Evans, R., Cameron, C.E., and Lauring, A.S. (2018). A speed-fidelity trade-off determines the mutation rate and virulence of an RNA virus. PLoS Biol. 16, e2006459.
Ho, J.S.Y., Angel, M., Ma, Y., Sloan, E., Wang, G., Martinez-Romero, C., Alenquer, M., Roudko, V., Chung, L., Zheng, S., et al. (2020). Hybrid gene origination creates human-virus chimeric proteins during infection. Cell 181, this issue, 1502–1517.
Jagger, B.W., Wise, H.M., Kash, J.C., Walters, K.-A., Wills, N.M., Xiao, Y.-L., Dunfee, R.L., Schwartzman, L.M., Ozinsky, A., Bell, G.L., et al. (2012). An overlapping protein-coding region in influenza A virus segment 3 modulates the host response. Science 337, 199–204.
La Gruta, N.L., and Turner, S.J. (2014). T cell mediated immunity to influenza: mechanisms of viral control. Trends Immunol. 35, 396–402.
López, C.B. (2014). Defective viral genomes: critical danger signals of viral infections. J. Virol. 88, 8720–8723.
Walker, A.P., and Fodor, E. (2019). Interplay between Influenza Virus and the Host RNA Polymerase II Transcriptional Machinery. Trends Microbiol. 27, 398–407.
Wei, J., and Yewdell, J.W. (2019). Flu DRiPs in MHC Class I Immunosurveillance. Virol. Sin. 34, 162–167.
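
The distinction drawn in this Preview between N-terminal extensions and novel frameshifted peptides comes down to the reading frame of a host-derived AUG relative to the canonical viral start codon. The following is a minimal illustrative sketch of that frame logic, not the analysis pipeline of Ho et al. (2020). The function name, the labels, and the example sequences are hypothetical, and the sketch assumes that the first AUG in the viral portion of the chimeric mRNA is the canonical start.

```python
from typing import List, Tuple

START = "AUG"
STOPS = {"UAA", "UAG", "UGA"}


def classify_upstream_starts(host_leader: str, viral_mrna: str) -> List[Tuple[int, str]]:
    """Classify every AUG that begins in the host-derived leader.

    host_leader: hypothetical cap-snatched host sequence (5' to 3').
    viral_mrna:  hypothetical viral sequence that follows the leader.
    Assumption: the first AUG in the viral portion is the canonical start.
    Returns (position in the chimeric mRNA, label) pairs.
    """
    # Accept DNA or RNA spelling, in any case.
    chimera = (host_leader + viral_mrna).upper().replace("T", "U")
    canonical = chimera.find(START, len(host_leader))
    if canonical < 0:
        raise ValueError("no canonical start codon found in the viral sequence")

    results = []
    for pos in range(len(host_leader)):
        if chimera[pos:pos + 3] != START:
            continue
        if (canonical - pos) % 3 != 0:
            # Different reading frame from the canonical viral protein.
            results.append((pos, "out-of-frame (novel frameshifted peptide)"))
            continue
        # Same frame: an extension needs an open frame up to the canonical AUG.
        codons = (chimera[i:i + 3] for i in range(pos, canonical, 3))
        if any(codon in STOPS for codon in codons):
            results.append((pos, "in-frame but terminated before canonical start"))
        else:
            results.append((pos, "in-frame (N-terminal extension)"))
    return results
```

Calling `classify_upstream_starts("GAUGCCA", "AAAAUGGCC")` on these made-up sequences reports one upstream AUG in frame with the downstream start. Real analyses would also need the true position of the canonical start and the actual 5′ untranslated region of each viral segment, which this sketch does not model.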

Role of Microbiota-Derived Bile Acids in Enteric Infections
Casey M. Theriot1 and William A. Petri, Jr.2,*
1North Carolina State University, Raleigh, NC, USA
2Department of Medicine, University of Virginia, PO Box 801340, Charlottesville, VA 22908-1340, USA
*Correspondence: wap3g@virginia.edu
In this issue of Cell, Alavi et al. report that infection by Vibrio cholerae is blocked by gut microbiome-mediated
hydrolysis of bile acids. Cholera therefore joins amebic dysentery and Clostridioides difficile colitis as enteric
infections profoundly influenced by the microbiome’s impact on bile acid metabolism.
Vibrio cholerae causes watery diarrhea so severe that it kills by dehydration within hours. We are now experiencing the 7th pandemic of cholera, all 7 of which likely originated in the Indian subcontinent, with current estimates of up to 3 million cases and 100,000 deaths annually. That cholera is water-borne was established by the physician John Snow in 1854 by linking victims of the London Broad Street cholera epidemic not to bad air but to the Broad Street water pump. V. cholerae exists in aquatic environments on the surface and in the intestine of copepods (a type of small crustacean). This leads to sporadic outbreaks near rivers in the Indian subcontinent, amplified by human fecal-oral spread of the bacteria during outbreaks causing pandemics.

Diarrhea is caused by cholera toxin, an enzyme that ADP-ribosylates the Gs protein that regulates adenylate cyclase, leading to cyclic AMP (cAMP)-mediated chloride ion (Cl⁻) secretion. Cholera toxin and the toxin coregulated pili (TCP) are regulated by TcpP, a membrane-bound transcriptional activator. There is substantial person-to-person variation in the severity of cholera, with one explanation being personal differences in the microbiome regulating virulence gene expression by TcpP. However, we still lack a complete understanding of the factors that cause person-to-person variation in the severity of cholera.

In this issue of Cell, Ansel Hsiao and colleagues demonstrate that hydrolysis of the bile acid taurocholate to cholate by the gut microbiome blocks TcpP activation and cholera colonization. This paper adds to emerging literature on how the microbiome impacts infectious diseases via microbial metabolism of bile acids in the gut.

The primary bile acids cholate (CA) and chenodeoxycholate (CDCA) are made by the liver, where they are conjugated with a glycine or taurine before being secreted into the duodenum. As they make their way through the small intestine, 95% of bile acids are absorbed in the terminal ileum through the enterohepatic system, a majority being conjugated bile acids (Figure 1). Gut microbes that encode bacterial bile salt hydrolase (bsh) genes can deconjugate or cleave the glycine and taurine from conjugated bile acids to yield deconjugated bile acids (e.g., taurocholate [TCA] to taurine and CA). This is a critical first step in microbial bile acid metabolism that leads to all subsequent biotransformations. The deconjugated bile acids that reach the large intestine are then metabolized by members of the gut microbiota into secondary bile acids, including deoxycholate (DCA) (Foley et al., 2019).

Here, Alavi et al. (2020) demonstrate that the composition of the gut microbiome contributes to resistance to cholera. By reconstituting germ-free mice with defined communities of human microbiome bacteria, they discovered that the commensal bacterium Blautia obeum mediates resistance. They show that B. obeum mediates resistance in mice through degrading TCA to CA and that the abundance of the B. obeum BSH enzyme correlated with resistance in humans. In the absence of B. obeum-dependent degradation, TCA induces the TcpP virulence regulator to cause disease. Cholera therefore joins a list of enteropathogens whose ability to cause disease is regulated by microbiome metabolism of bile acids.

A second example is the parasite Entamoeba histolytica, an ameba that invades the intestine by eating the epithelial lining in a process called trogocytosis. Burgess et al. (2020) recently identified the intestinal bacterium Clostridium scindens as providing protection from amebic colitis. Introduction of C. scindens into the microbiome of a mouse altered the bone marrow by inducing expansion of granulocyte-monocyte progenitors (GMPs). C. scindens-mediated protection from amebiasis could be transferred with adoptive transfer of bone marrow to a naive mouse and acted via an increased recruitment of polymorphonuclear neutrophils to the colon. Because C. scindens can dehydroxylate CA to DCA, Burgess et al. (2020) tested if this mediated alteration of the marrow. In fact, administration of DCA alone provided complete protection from amebiasis via GMP expansion, demonstrating microbiome-to-bone-marrow communication via bile acids.

A final example relates to Clostridioides difficile infection (CDI). C. difficile is a Gram-positive spore-forming bacillus and the most common cause of hospital-acquired, antibiotic-associated diarrhea. Ingestion of the spore form of C. difficile in an individual with a dysbiotic microbiome (usually due to prior antibiotic therapy) leads to infection of the large intestine by the vegetative stage of C. difficile. Primary and secondary bile acids have been shown to impact C. difficile vegetative growth as well as spore germination and toxin activity. For example, the primary bile acid TCA

Figure 1. Role of Microbially-Derived Bile Acids in Enteric Infections

Role of hepatically synthesized primary and microbially derived secondary bile acids (green color) in enteric infections. Primary bile acids cholate (CA) and
chenodeoxycholate (CDCA) are synthesized from cholesterol by hepatocytes. Primary bile acids can be further modified via conjugation with taurine or glycine
within the liver (T(G)CA, T(G)CDCA). Once synthesized, primary bile acids enter into bile. Bile is stored in the gallbladder until released in the duodenum following
ingestion of a meal. Once within the GI tract, gut microbial enzymes convert host-derived conjugated primary bile acids into secondary bile acids. For example,
the primary bile acid taurocholate (TCA) is metabolized by the bile salt hydrolase (bsh) of Blautia obeum into CA. CA in turn is metabolized by Clostridium scindens
7α-dehydroxylation via the bai operon into deoxycholate (DCA). The primary bile acid TCA is essential for V. cholerae TcpP virulence activation and for germination
of C. difficile spores. Conversion of TCA to CA by B. obeum BSH protects against cholera and C. difficile. Conversion by the C. scindens bai operon of the primary
bile acid CA to the secondary bile acid DCA increases granulocyte-monocyte progenitors (GMPs) in the bone marrow to provide gut polymorphonuclear neutrophil (PMN) protection from
the parasite Entamoeba histolytica.
induces spore germination, and DCA inhibits C. difficile growth.

As in the amebic colitis example, C. scindens is implicated as having a protective role in CDI. First, depletion of C. scindens is associated with more severe disease in humans and mice. Moreover, in the mouse model of CDI, reconstitution of C. scindens was able to partially restore colonization resistance against CDI, and colonization resistance was associated with secondary bile acid synthesis (Buffie et al., 2015). In patients with recurrent CDI, high levels of conjugated primary bile acids and reduced secondary bile acids were observed in feces when compared to healthy individuals. Successful treatment of recurrent CDI with fecal microbial transplant restored the level of fecal secondary bile acids, specifically DCA and lithocholate (LCA) (Weingarden et al., 2015; Seekatz et al., 2018). Most recently, Reed et al. (2020) have shown that several, but not all, commensal Clostridia encoding the bai operon (which encodes enzymes that convert cholate into the secondary bile acid deoxycholate) are able to inhibit C. difficile growth due to the conversion of CA to DCA, which should provide protection. To summarize, there is strong in vitro and supportive in vivo evidence that microbial metabolism of bile acids has a direct impact on C. difficile spore germination, growth, and toxin production and activity.

Bile acids also regulate the immune system, as demonstrated in amebic dysentery with DCA protecting by increasing marrow GMPs. Additional examples of a direct impact of bile acids on the immune system include the secondary bile acid LCA regulating Th17 responses by interfering with RORγt transcriptional activity (Hang et al., 2019) and, in the context of colitis, bile acids interacting with macrophages to induce IL-10, which polarizes T cells to a regulatory phenotype (Biagioli et al., 2017).

In summary, Alavi et al. (2020) have deepened our understanding of how the composition of the gut microbiome protects from cholera by discovering the role of bacterially encoded bile salt hydrolases. As microbiome science moves from description to mechanism, bile acids synthesized by the host and further converted by the microbiota are becoming central players, both for their direct impact on enteropathogens and for their impact on the immune system.

ACKNOWLEDGMENTS
Work from the authors’ labs is supported by National Institutes of Health grants R35 GM119438 (to C.M.T.), 2R37 AI026649-31, and R01 AI043596-22 (to W.A.P.) and the Henske and McGrath families.

REFERENCES
Alavi, S., Mitchell, J.D., Cho, J.Y., Liu, R., MacBeth, J.C., and Hsiao, A. (2020). Interpersonal gut microbiome variation drives susceptibility and resistance to Vibrio cholerae. Cell 181, this issue, 1533–1546.
Biagioli, M., Carino, A., Cipriani, S., Francisci, D., Marchianò, S., Scarpelli, P., Sorcini, D., Zampella, A., and Fiorucci, S. (2017). The bile acid receptor GPBAR1 regulates the M1/M2 phenotype of intestinal macrophages and activation of GPBAR1 rescues mice from murine colitis. J. Immunol. 199, 718–733.
Buffie, C.G., Bucci, V., Stein, R.R., McKenney, P.T., Ling, L., Gobourne, A., No, D., Liu, H., Kinnebrew, M., Viale, A., et al. (2015). Precision microbiome reconstitution restores bile acid mediated resistance to Clostridium difficile. Nature 517, 205–208.
Burgess, S.L., Leslie, J.L., Uddin, M.J., Oakland, D.N., Gilchrist, C.A., Moreau, G.B., Watanabe, K., Saleh, M.M., Simpson, M., Thompson, B.A., et al. (2020). Gut microbiome communication with bone marrow regulates susceptibility to amebiasis. J. Clin. Invest. https://doi.org/10.1172/JCI133605.
Foley, M.H., O’Flaherty, S., Barrangou, R., and Theriot, C.M. (2019). Bile salt hydrolases: Gatekeepers of bile acid metabolism and host-microbiome crosstalk in the gastrointestinal tract. PLoS Pathog. 15, e1007581.
Hang, S., Paik, D., Yao, L., Kim, E., Trinath, J., Lu, J., Ha, S., Nelson, B.N., Kelly, S.P., Wu, L., et al. (2019). Bile acid metabolites control TH17 and Treg cell differentiation. Nature 576, 143–148.
Reed, A.D., Nethery, M.A., Stewart, A., Barrangou, R., and Theriot, C.M. (2020). Strain-dependent inhibition of Clostridioides difficile by commensal Clostridia encoding the bile acid inducible (bai) operon. J. Bacteriol. https://doi.org/10.1128/JB.00039-20.
Seekatz, A.M., Theriot, C.M., Rao, K., Chang, Y.M., Freeman, A.E., Kao, J.Y., and Young, V.B. (2018). Restoration of short chain fatty acid and bile acid metabolism following fecal microbiota transplantation in patients with recurrent Clostridium difficile infection. Anaerobe 53, 64–73.
Weingarden, A., González, A., Vázquez-Baeza, Y., Weiss, S., Humphry, G., Berg-Lyons, D., Knights, D., Unno, T., Bobr, A., Kang, J., et al. (2015). Dynamic changes in short- and long-term bacterial composition following fecal microbiota transplantation for recurrent Clostridium difficile infection. Microbiome 3, 10. https://doi.org/10.1186/s40168-015-0070-0.

Mapping the Uncharted Territories of Human Brain Malignancies
Daan Juri Kloosterman1 and Leila Akkari1,*
1Division of Tumor Biology and Immunology, Netherlands Cancer Institute, Oncode Institute, Amsterdam 1066CX, the Netherlands
*Correspondence: l.akkari@nki.nl
Despite its success in multiple tumor types, immunotherapy remains poorly efficacious in brain malignancies.
In this issue of Cell, Friebel et al. and Klemm et al. provide in-depth insights into the versatile nuances of
immune cells in primary and metastatic brain tumors, providing the field with a rich framework to explore novel
therapeutic avenues.
The development of therapeutic strategies enlisting cells composing the tumor microenvironment (TME) has revolutionized clinical approaches in recent years, exemplified by T cell immunotherapy in cancer treatment (Waldman et al., 2020). Yet, in organs as unique as the brain with its blood-brain barrier (BBB)-protected niche, the current understanding of how tumors intrinsically emerging in the central nervous system or originating from extracranial sites sculpt the TME to their advantage remains limited, thwarting efficient immune cell harnessing in therapeutic intervention.

Glioblastoma (GBM) and brain metastases (BrMs) bear some of the worst prognoses for cancer patients. Discoveries of brain metastases drivers (Brastianos et al., 2015), genomic alterations of GBM subtypes (Brennan et al., 2013), and diversity at the single-cell level (Patel et al., 2014; Tirosh et al., 2016) have shed light on inter- and intra-tumoral heterogeneity of brain lesions with a focus on tumor cell features, while introducing the notion that stromal composition can be shaped according to different cancer mutational statuses. Further evidence elaborating on this concept has delineated the particular phenotype of tumor-associated macrophages (TAMs) in specific subsets of GBM (Wang et al., 2017), with TAM enrichment generally associated with poor disease outcome (Kielbassa et al., 2019). However, the extent of myeloid cell diversity, including TAM ontogeny and distinct education, has remained largely unexplored in human brain tumors.

In this issue of Cell, two studies (Friebel et al., 2020; Klemm et al., 2020) performed extensive and comprehensive analyses of GBM and BrM immune landscapes from large cohorts of patients, thus providing much needed information on the phenotype and transcriptional programs acquired by components of the TME, in a cell-type- and disease-specific manner.

Through the use of multiparameter fluorescence-activated cell sorting followed by RNA sequencing (Klemm et al., 2020) and CyTOF analyses (Friebel et al., 2020), the authors assessed the abundance and heterogeneity of tissue-resident and peripherally recruited leucocytes in BrMs and gliomas. Microglia (MG) dominated the TME of IDHmut gliomas, considered to be less aggressive brain lesions that displayed limited infiltration of leucocytes. These results contrasted with the immunological landscapes of IDHwt GBM and BrMs, which were enriched in peripherally recruited monocyte-derived macrophages (MDMs). BrMs from multiple primary tumor origins exhibited substantial infiltration of T cells and neutrophils, with melanoma-to-brain BrMs singled out in both studies, harboring numerous CD4+ and CD8+ T cell subsets, and neutrophils heavily infiltrating breast BrMs (Figure 1A).

The prominence of TAMs prompted the authors to perform further analyses of MG and MDM subset composition, spatial

Figure 1. Immune Landscape Composition and Features across Different Types of Brain Tumors
(A) Content of immune cells in the GBM and BrM TME. Tissue-resident microglia (MG) and monocyte-derived macrophages (MDMs) constitute the dominant cell
types composing the immune landscape of GBM in the depicted distinct abundances, according to the IDH mutational status of GBM. Recruited at the tumor site
from the peripheral circulation, leucocytes, including CD4+ T cells, CD8+ T cells, neutrophils, and monocytes, enter the brain parenchyma through the blood-brain
barrier and infiltrate brain tumors, a process enhanced in brain metastases, resulting in increased content of these recruited leucocytes in the BrM TME.
(B) Overview of the functional features acquired by each immune cell type in a disease-specific manner. Macrophages (MG and MDMs) are heterogenous across
tumor types and display the highest magnitude of plasticity, which ranges between ECM remodeling to immunomodulatory roles in IDHwt GBM and BrMs. Greatly
enriched in BrMs, CD4+ T cells and CD8+ T cells display features of hyporesponsiveness and exhaustion, whereas T cells in GBM are characterized by low
proliferative abilities and activation markers. Natural killer cells present an immature phenotype in IDHwt GBM in comparison to the cytotoxic features displayed in
IDHmut gliomas and BrMs.

organization, and phenotype and tran- BrM TME but not in gliomas (Klemm methodology of combination therapies
scriptional programs in GBM and BrM in et al., 2020), with BrM CD4+ and CD8+ to account for the progressive changes
order to establish their nexus role in regu- T cells exhibiting anergic and exhaustion undergone by the TME (Rothschilds and
lating the immune landscape in a disease- signatures, respectively. Flawed T cell Wittrup, 2019). These may include neo-
specific manner. Using orthogonal ap- features can be explained by chronic acti- adjuvant treatment approaches, which
proaches of RNA sequencing (Klemm vation through excessive antigen presen- have proven successful in recurrent
et al., 2020) and mass cytometry (Friebel tation, including by TAMs, which exert GBM (Cloughesy et al., 2019) and could
et al., 2020), both studies showed that greater immunomodulatory functions in be guided by artificial-intelligence-based
the plasticity of TAMs largely relied upon BrM than in IDHwt GBM and express mul- radiogenomics to dynamically monitor
the high magnitude of MDM phenotype tiple co-inhibitory receptors (Figure 1B). the brain TME (Rudie et al., 2019) and
adaptation and transcriptional programs, These major differences are in line with stratify patients into tailored therapeutic
leading to limited commonalities between the outcome of immune checkpoint intervention.
TAM subsets across different diseases. blockade, which showed promising effi- The vivid immune cell heterogeneity
The prognostic value of myeloid cell cacy in melanoma BrMs, where T cells parallels their remarkably challenging tar-
abundance was similarly relevant only are abundant and can be rewired, but little geting to achieve clinical response in
when examining MDM features in GBM. to no response in T cell-excluded primary brain tumors and emphasize the need to
Indeed, monocyte infiltration, which was brain tumors. further enrich our knowledge of the im-
prominent in IDHmut GBM, only correlated Primary cancer cells or their metastatic mune contexture in a dynamic setting.
with a trend in increased GBM patient sur- counterparts face considerable chal- Collectively, these studies provide much
vival, whereas expression of the pan- lenges to strive within the unique brain needed insights into the multifaceted nu-
MDM marker CD163+ (Friebel et al., TME, which can only be overcome by ances of immune cells in primary and met-
2020) or expression of the MDM-repre- hijacking a selective niche, leading to a astatic brain tumors and resources for the
sentative gene set (Klemm et al., 2020) paralleled evolution of stromal and im- brain tumor community to explore novel
both correlated with poor survival in low- mune cells. The deep insights into the di- therapeutic targets.
and high-grade GBM. versity of specialized immune landscapes
Strikingly, MDMs’ heterogeneity was shaped by GBM and BrMs presented in REFERENCES
not a random feature across brain tumors the studies discussed here highlight the
but distinct and dictated by each tumor need for careful identification of the Brastianos, P.K., Carter, S.L., Santagata, S., Cahill,
D.P., Taylor-Weiner, A., Jones, R.T., Van Allen,
type. The origin of MDM diversity was ontogeny, cellular phenotype, and local
E.M., Lawrence, M.S., Horowitz, P.M., Cibulskis,
proposed to rely upon distinct differentia- education of myeloid cell subsets, without K., et al. (2015). Genomic Characterization of Brain
tion trajectories from their monocyte pro- simply utilizing content to predict survival Metastases Reveals Branched Evolution and
genitors, thus giving rise to multiple MDM outcomes or to devise therapeutic inter- Potential Therapeutic Targets. Cancer Discov. 5,
subsets carrying specific phenotypes ventions. Holistic approaches using sin- 1164–1177.
with different prognostic values. Single- gle-cell RNA sequencing and spatial Brennan, C.W., Verhaak, R.G., McKenna, A., Cam-
cell proteomic analysis was leveraged to determinants of the GBM and BrM TME pos, B., Noushmehr, H., Salama, S.R., Zheng, S.,
Chakravarty, D., Sanborn, J.Z., Berman, S.H.,
ascribe a positive survival outcome to will be necessary to understand how
et al.; TCGA Research Network (2013). The so-
the CD163+MRC1+ MDM subset in low- cellular plasticity gives rise to these matic genomic landscape of glioblastoma. Cell
grade GBM (Friebel et al., 2020). subtype-specific immunological niches. 155, 462–477.
The findings that MG and MDMs shared Indeed, several questions remained to Cloughesy, T.F., Mochizuki, A.Y., Orpilla, J.R.,
some, but limited, features within the be answered in light of these novel re- Hugo, W., Lee, A.H., Davidson, T.B., Wang, A.C.,
same disease setting suggest that the sources: do brain tumors poise peripheral Ellingson, B.M., Rytlewski, J.A., Sanders, C.M.,
type of tumor in addition to the affected immune cells for reprogramming, beyond et al. (2019). Neoadjuvant anti-PD-1 immuno-
tissue heavily weighs onto the program- affecting their recruitment? How does the therapy promotes a survival benefit with intratu-
moral and systemic immune responses in recur-
ming of recruited immune cells. This is genetic make-up of brain tumor cells
rent glioblastoma. Nat. Med. 25, 477–486.
further exemplified by the distinct reactive shape the education of each distinct im-
Friebel, E., Kapolou, K., Unger, S., Núñez, N.G.,
phenotype acquired by MG in IDHmut mune cell type? What underlies the dy-
Utz, S., Rushing, E.J., Regli, L., Weller, M., Greter,
GBM compared to IDHwt tumors (Friebel namic acquisition of these phenotypes in M., Tugues, S., et al. (2020). Single-Cell Mapping of
et al., 2020). Importantly, the transcrip- the course of tumor malignancy, and are Human Brain Cancer Reveals Tumor-Specific In-
tomic education acquired by either TAM these altered by standard of care treat- struction of Tissue-Invading Leukocytes. Cell
subsets did not reflect the defined M1- ment? The ability of TAMs to remodel 181, this issue, 1626–1642.
like or M2-like macrophage polarization the extracellular matrix (ECM) landscape Kielbassa, K., Vegna, S., Ramirez, C., and Akkari,
phenotypes but included shades of either in GBM and BrM additionally opens the L. (2019). Understanding the Origin and Diversity
of Macrophages to Tailor Their Targeting in Solid
activation states (Klemm et al., 2020). perspective to apply tumor-tissue archi-
Cancers. Front. Immunol. 10, 2215.
A key distinctive feature between BrM tecture therapies to brain malignancies
Klemm, F., Maas, R.R., Bowman, R.L., Kornete,
and GBM was revealed when assessing and either disrupt the locally shaped niche
M., Soukup, K., Nassiri, S., Brouland, J.P., Iacobu-
the spatial organization and phenotype or facilitate drug penetration though the zio-Donahue, C.A., Brennan, C., Tabar, V., et al.
of infiltrating leucocytes. TAMs and BBB. A major challenge will then be to (2020). Interrogation of the microenvironmental
T cells were closely interacting in the design rational and optimized timing and landscape in brain tumors reveals disease-specific
alterations of immune cells. Cell 181, this issue, 1643–1660.
Patel, A.P., Tirosh, I., Trombetta, J.J., Shalek, A.K., Gillespie, S.M., Wakimoto, H., Cahill, D.P., Nahed, B.V., Curry, W.T., Martuza, R.L., et al. (2014). Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 344, 1396–1401.
Rothschilds, A.M., and Wittrup, K.D. (2019). What, Why, Where, and When: Bringing Timing to Immuno-Oncology. Trends Immunol. 40, 12–21.
Rudie, J.D., Rauschecker, A.M., Bryan, R.N., Davatzikos, C., and Mohan, S. (2019). Emerging Applications of Artificial Intelligence in Neuro-Oncology. Radiology 290, 607–618.
Tirosh, I., Izar, B., Prakadan, S.M., Wadsworth, M.H., 2nd, Treacy, D., Trombetta, J.J., Rotem, A., Rodman, C., Lian, C., Murphy, G., et al. (2016). Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189–196.
Waldman, A.D., Fritz, J.M., and Lenardo, M.J. (2020). A guide to cancer immunotherapy: from T cell basic science to clinical practice. Nat. Rev. Immunol. https://doi.org/10.1038/s41577-020-0306-5.
Wang, Q., Hu, B., Hu, X., Kim, H., Squatrito, M., Scarpace, L., deCarvalho, A.C., Lyu, S., Li, P., Li, Y., et al. (2017). Tumor Evolution of Glioma-Intrinsic Gene Expression Subtypes Associates with Immunological Changes in the Microenvironment. Cancer Cell 32, 42–56.e6.
Leading Edge
Perspective
Pandemic Preparedness: Developing Vaccines
and Therapeutic Antibodies For COVID-19
Gregory D. Sempowski,1,2,* Kevin O. Saunders,3 Priyamvada Acharya,3 Kevin J. Wiehe,1 and Barton F. Haynes1,4,*
1Department of Medicine, Duke Human Vaccine Institute, Duke University School of Medicine, Durham, NC 27710, USA
2Department of Pathology, Duke Human Vaccine Institute, Duke University School of Medicine, Durham, NC 27710, USA
3Department of Surgery, Duke Human Vaccine Institute, Duke University School of Medicine, Durham, NC 27710, USA
4Department of Immunology, Duke Human Vaccine Institute, Duke University School of Medicine, Durham, NC 27710, USA
*Correspondence: gregory.sempowski@duke.edu (G.D.S.), barton.haynes@duke.edu (B.F.H.)

The pandemic of SARS-CoV-2, the virus that causes the COVID-19 respiratory syndrome, has caused
global public health and economic crises, necessitating rapid development of vaccines and therapeutic
countermeasures. The worldwide response to the COVID-19 pandemic has been unprecedented, with
government, academic, and private partnerships working together to rapidly develop vaccine
and antibody countermeasures. Many of the technologies being used are derived from prior
government-academic partnerships formed to respond to other emerging infections.
Introduction

The novel coronavirus outbreak of SARS-CoV-2, as of May 14, 2020, has resulted in more than 4,300,000 individuals infected and over 290,000 deaths worldwide (Dong et al., 2020). The current pandemic, as well as the potential of future pandemics based on estimates of undiscovered zoonotic infections (Carroll et al., 2018), has brought to the forefront the urgency and necessity of rapid development of pandemic countermeasures. Two countermeasures with promise for controlling the current SARS-CoV-2 pandemic are recombinant neutralizing antibodies (Ju et al., 2020; Walker and Burton, 2018) and vaccines (Graham, 2020; Graham et al., 2018; Graham and Sullivan, 2018) directed against the virus that causes COVID-19, SARS-CoV-2. In particular, over the past 15 years, the NIAID Center for HIV/AIDS Vaccine Immunology (CHAVI) program (Burton et al., 2012; Haynes et al., 2016), the NIH Vaccine Research Center (Kwong and Mascola, 2012) as well as others, and, for the past three years, the DARPA Pandemic Prevention Platform (P3) program (Cable et al., 2020; DARPA, 2017; Kose et al., 2019) have worked to define the platforms and enable technology for HIV vaccine development and rapid response to viral pandemics. Although an HIV vaccine has not yet been developed, much of the technology the HIV vaccine field has developed is now being used to fight the COVID-19 pandemic. From the HIV field and the DARPA preparedness programs have come teams and technologies that are now responding to the COVID-19 epidemic to both isolate SARS-CoV-2 neutralizing antibodies and develop SARS-CoV-2 vaccine candidates. Here we comment on some of the strategies that are being used to develop antibody and vaccine countermeasures for SARS-CoV-2 (Figure 1).
Neutralizing antibodies. Antibodies isolated from a single B cell are called monoclonal antibodies (mAbs) and have become an effective new biologic class in our pharmacopeia with a wide range of FDA-approved mAbs for indications such as arthritis and other inflammatory diseases, heart disease, hypercholesterolemia, osteoporosis, cancer, and infectious diseases (Shepard et al., 2017). Recombinant human or humanized monoclonal antibodies are proving to be safe, effective, and highly specific in their ability to target a pathway, process, or invading pathogen. More than 70 recombinant monoclonal antibodies have now been approved by the FDA for use in the treatment of infectious, autoimmune and inflammatory, malignant, or cardiovascular diseases (Carter and Lazar, 2018; Shepard et al., 2017). Specifically, recombinant neutralizing antibodies for infectious diseases, such as for protection from anthrax toxin and for the prevention of respiratory syncytial virus infection (Empey et al., 2010; Shepard et al., 2017), have been approved by the FDA. Neutralizing antibodies are currently in development for prevention and/or treatment of HIV (Caskey et al., 2019; Gaudinski et al., 2019) and pending approval for Ebola (Saphire et al., 2018).
Thus, recombinant neutralizing antibodies isolated from those infected with SARS-CoV-2 are the most rapid and readily manufacturable immune intervention for passive administration that might be developed to either prevent or treat COVID-19 disease (Andreano et al., 2020; Brouwer et al., 2020; Ju et al., 2020; Rogers et al., 2020; Seydoux et al., 2020). SARS-CoV-2 antibody countermeasures will benefit from the last 20 years of antibody optimization research that has discovered point mutations in the Fc portion of antibodies that fine-tune antibody function and circulation half-life (Saunders, 2019). Such mutations have been described for the Fc region of IgG that have prolonged antibody half-life for up to 6–7 weeks (Gaudinski et al., 2019; Robbie et al., 2013; Yu et al., 2016). Additionally, mutations are known that can increase antibody-dependent infected cell killing and antibody-dependent complement activation (Idusogie et al., 2001; Richards et al., 2008). Given the ability of certain antibodies to facilitate SARS-CoV-1 virus entry via engagement of Fc receptors on host cells (Jaume et al., 2011), the introduction of mutations that inhibit Fc binding to Fc receptors could also

be important for successful development of SARS-CoV-2 neutralizing antibody treatments.

Figure 1. Schema of Iterative and Synergistic Approaches Being Used to Simultaneously Develop Both Vaccines and Antibody Countermeasures for SARS-CoV-2/COVID-19

Neutralizing antibodies to the spike protein receptor binding domain (RBD) protect mice from MERS, SARS-CoV-1, and SARS-CoV-2 infection (Quinlan et al., 2020; Wang et al., 2018; Zhou et al., 2018). Thus, neutralizing antibodies are under development as proteins or gene-delivered formulations to prevent or treat SARS-CoV-2 infection. One example of technology now brought to bear on SARS-CoV-2 countermeasure work is the strategy developed to isolate and screen for HIV neutralizing antibodies without antibody gene cloning (Liao et al., 2009), and this technology was recently used in China to isolate the first neutralizing antibodies to SARS-CoV-2 (Ju et al., 2020). As seen with SARS-CoV-1 and HIV, viruses resistant to a particular antibody can emerge among the circulating virus variants (Caskey et al., 2017; ter Meulen et al., 2006). In these cases, combinations of two or more neutralizing antibodies can broaden the efficacy of neutralizing antibody prophylaxis or treatment (Mendoza et al., 2018; ter Meulen et al., 2006). Thus, large numbers of effective neutralizing antibodies are needed so that antibody cocktails can be formulated to provide optimal coverage of circulating SARS-CoV-2 isolates.
The Defense Advanced Research Projects Agency (DARPA) Pandemic Prevention Platform (P3) within the Department of Defense has played a key role in funding and driving public/private development of end-to-end platforms to rapidly develop antibody countermeasures. The DARPA P3 program aims to identify and propagate viral pathogens, isolate human neutralizing antibody sequences, develop effective approaches to deliver neutralizing antibodies as nucleic acids, and develop good manufacturing practices for a final drug product to submit to the Agency for Phase I approval within 60 days of receiving an outbreak blood specimen (Cable et al., 2020; DARPA, 2017).
To meet this challenge, the Duke DARPA P3 program developed a “thaw and infect” permissive cell line array to rapidly grow Risk Group 2 and 3 viruses, such as SARS-CoV-2. This team also developed platform technologies for real-time virus growth monitoring, fluorescently labeling viruses, antibody focus/plaque-reduction neutralization assays, and whole virus binding assays. By using innovative and time-tested approaches from the HIV field (Klein et al., 2013; Liao et al., 2009), the Duke P3 group developed a rapid, virus-independent, high-throughput, semi-automated Ab sequence isolation and screening pipeline in high containment (Figure 2). This pipeline has recently isolated a panel of H3N2 influenza neutralizing antibodies that provide protection in mouse challenge models and is now fully engaged in isolating a cocktail of neutralizing SARS-CoV-2 human monoclonal antibodies. The human angiotensin-converting enzyme 2 (ACE2) transgenic mouse (Bao et al., 2020) will be incorporated into the pipeline as well as rhesus and cynomolgus macaque models (Lu et al., 2020; Rockx et al., 2020) as SARS-CoV-2 acquisition models to rapidly evaluate the ability of antibodies to protect from SARS-CoV-2 challenge and the possibility of antibody-induced enhancement of disease (Peeples, 2020). Rapid movement of potent viral neutralizing antibodies to testing in the clinic requires an alternative to typical recombinant antibody protein production. DARPA P3 performer teams and others are therefore developing gene-delivered approaches (Figure 1) for rapid cGMP-scalable manufacturing. One approach being used is modified RNA encapsulated in lipid nanoparticles (LNPs) due to its proven utility and potential for rapid manufacturing (Kose et al., 2019; Pardi et al., 2018). In pre-clinical models, protective concentrations of recombinant neutralizing antibodies have been successfully expressed from intravenously and intramuscularly delivered LNP-encapsulated mRNA, supporting this approach for use in response to a rapidly spreading pandemic (Kose et al., 2019; Pardi et al., 2017b). Moreover, Crowe et al. (Vanderbilt P3 Performer Site), in collaboration with Moderna, has successfully
isolated and delivered a Chikungunya neutralizing antibody by using mRNA in LNPs in a Phase I human study (Kose et al., 2019). In addition to mRNA-LNP, DNA or viral vector approaches are also being rapidly developed for pandemic prevention antibody delivery (Balazs et al., 2011; Muthumani et al., 2013).

Figure 2. Accelerated Platform Technology for Rapid B Cell Screening and Isolation of Pathogen Neutralizing Antibodies Being Used to Isolate SARS-CoV-2 Antibodies

SARS-CoV-2 vaccine development. Vaccines are the time-honored method for establishing long-lived immune memory for controlling infectious diseases, and technologies have been developed such that vaccines can now be developed faster than in previous times (Figure 1) (Graham, 2020; Graham et al., 2018; Graham and Sullivan, 2018). Over 100 companies or academic institutions are working on COVID-19 vaccines with strategies that include recombinant vectors, mRNA in lipid nanoparticles, DNA, inactivated virus, live attenuated virus, virus-like particles, and protein subunits (Thanh Le et al., 2020; WHO, 2020b). Three vaccine candidates have already advanced to Phase II testing: an mRNA vaccine encoding the viral spike protein from Moderna, an Adeno-type 5 vector vaccine expressing the S protein from CanSino Biologics, and a chimpanzee adenovirus encoding the spike protein from the Jenner Institute in Oxford, UK. Five other vaccine candidates are also now in Phase I trials, including other mRNA/LNP or DNA vaccines as well as three forms of whole inactivated vaccines (WHO, 2020b).
As seen from this rapid movement of SARS-CoV-2 vaccine candidates into human trials, the time it takes to develop vaccines for emerging pathogens is decreasing relative to that required in the past for traditional childhood vaccines. Recently, a DNA vaccine for the original SARS (SARS-CoV-1) was developed in 20 months, a vaccine for H5 influenza A/Indonesia/2006 in 11 months, a vaccine for H1 influenza A/California/2009 in 4 months, and a Zika virus vaccine in 3.5 months (Graham et al., 2018). These successes have been brought about by innovative technology and approaches that have allowed for rapid identification and sequencing of new viral pathogens and new technologies for vaccine delivery (Graham and Sullivan, 2018). For the past 15 years, the HIV vaccine field has pioneered development and use of recombinant antibody technology, advanced computational methods, novel animal models, and new vaccine delivery approaches to accelerate HIV vaccine immunogen design and development (Burton et al., 2012; Caskey et al., 2019; Haynes et al., 2019; Haynes et al., 2016; Klein et al., 2013; Kwong and Mascola, 2018; Liao et al., 2009). This work has deciphered the roadblocks for this most difficult-to-develop HIV vaccine (Haynes et al., 2019). It is expected that developing a SARS-CoV-2 vaccine will be much faster and much easier than for HIV-1. Investigators have worked to integrate these iterative approaches for vaccine and antibody countermeasure development and applied them to a rapid response to the COVID-19 disease epidemic caused by SARS-CoV-2 (Figure 1).
The COVID-19 pandemic caused by SARS-CoV-2 was first widely recognized in December 2019, and the first virus sequence was published online in January 2020. By March 16, 2020, the first mRNA/LNP vaccine trial developed by the VRC in collaboration with Moderna had begun (NIH, 2020). A rapid SARS-CoV-2 vaccine development approach involves the integration of computational and structure-based immunogen design strategies; production of immunogens as inactivated virus, DNA, mRNA, vectored, or protein subunits; and immunogenic profiling in animal models prior to vaccine manufacturing and testing in clinical trials (Figure 1). Computational biology techniques have facilitated the rapid analysis of antibody and virus sequences for influenza, HIV, and now SARS-CoV-2 to enable vaccine development (GISAID, 2020; Los Alamos National Laboratory, 2020a; Saunders et al., 2019; Wiehe et al., 2018). Monitoring of HIV evolution by using the Los Alamos HIV Sequence Database (Los Alamos National Laboratory, 2020a) has been critical for HIV vaccine immunogen designs. The SARS-CoV-2 virus is evolving, albeit at a slower rate than HIV, and virus evolution is a concern for successful COVID-19
vaccine development. The SARS-CoV-2 spike protein RBD is the prime target for vaccine-induced neutralizing antibodies, although other spike protein neutralizing epitopes are of interest. However, comparison of the SARS-CoV-2 RBD with that of SARS-CoV-1 reveals only partial homology, although both retain the ability to bind to ACE2 as a receptor (Wrapp et al., 2020). The GISAID database is proving to be a helpful resource for monitoring the viral evolution of SARS-CoV-2 (GISAID, 2020), and the Los Alamos National Laboratory is developing a website with tools for analysis of global SARS-CoV-2 spike protein sequences (Korber et al., 2020; Los Alamos National Laboratory, 2020b).
Structural determination of the primary targets of neutralizing antibodies, for example, hemagglutinin in influenza, Env in HIV, and now the spike protein in SARS-CoV-2, has provided valuable atomic-level insight for vaccine design strategies. In particular, cryo-electron microscopy has enabled the rapid solution of the structure of the SARS-CoV-2 spike protein (Walls et al., 2020; Wrapp et al., 2020). Structural biology analysis of pathogens combines structural, computational, biophysical, and biochemical methods to understand interactions of pathogens with the immune system (Henderson et al., 2020; LaBranche et al., 2019; Murin et al., 2019; Saunders et al., 2019). Early this year, structural biologists pivoted to apply technology developed for HIV-1 envelope or respiratory syncytial virus (RSV) structural biology to fast-track structure-based vaccine design for COVID-19 (Lan et al., 2020; Walls et al., 2020; Wrapp et al., 2020; Yuan et al., 2020). Currently, established pipelines for high-resolution cryo-EM structural determination of the SARS-CoV-2 spike are integrated with the computational teams, thus providing atomic-level feedback to COVID-19 vaccine designs.
The past eight years of HIV-1 antibody discovery have provided templates for HIV-1 vaccine design aiming to elicit broadly reactive neutralizing antibodies (Kwong and Mascola, 2018; Sok and Burton, 2018). From the study of the ontogeny of HIV neutralizing antibodies it has become clear that an effective vaccine will likely require multiple immunogens administered in a specific order to facilitate proper antibody development to multiple neutralizing targets on HIV (Haynes et al., 2019). Hopefully, the development of SARS-CoV-2 neutralizing antibodies will require a much simpler vaccination regimen like the Zika vaccine, where one immunization with one immunogen was sufficient to elicit protective neutralizing antibodies (Pardi et al., 2017a). Such a vaccine would be amenable to rapid development, large-scale manufacturing, and global administration.
What will follow rapidly now for prevention of COVID-19 will be a number of mRNA/LNP (e.g., from Moderna/NIAID, BioNTech/Fosun Pharma/Pfizer) or DNA (e.g., Inovio) vaccines as well as attenuated viruses, proteins, nanoparticles, and viral vectors containing SARS-CoV-2 viral genes as vaccine candidates moving through safety and immunogenicity trials, and a smaller subset of vaccine candidates will be tested in Phase III or efficacy trials to continue to determine if they are safe, as well as to determine their efficacy. In parallel now with Phase I and II trials, it is important to develop capacity for large-scale vaccine production, in the event of a successful efficacy trial (Corey et al., 2020; Graham, 2020). It is possible that genetic immunization strategies such as DNA or mRNA in LNPs can be manufactured more rapidly than proteins or viral vectors and can be more cost effective.
In addition to efficacy being the primary goal of SARS-CoV-2 vaccine development, safety is also a major concern (Peeples, 2020). Immunization with a SARS-CoV-1 vaccine has induced vaccine-associated immunopathology in the lung (Bolles et al., 2011; Liu et al., 2019; Tseng et al., 2012). Both CD4 and CD8 T cell responses have been suggested to also be protective for SARS-CoV-1 (Zhao et al., 2016). Thus, careful animal preclinical studies as well as intense monitoring of human clinical trials will be of critical importance to developing safe and effective anti-COVID-19 antibody and vaccine countermeasures.
Finally, collaboration and coordination will be essential to ending the pandemic. Globally, one example of private support for COVID-19 research is the Coalition for Epidemic Preparedness Innovations (CEPI). CEPI is raising funds for COVID-19 vaccine development and is also funding vaccine development projects (Gouglas et al., 2019). The Bill & Melinda Gates Foundation has made a $250 million commitment to fight COVID-19 and has established a Coronavirus Immunotherapy Consortium, or CoVIC, to foster sharing and comparison of SARS-CoV-2 antibodies to speed therapeutic antibody development (Bill & Melinda Gates Foundation, 2020). The recent formation of an NIH-organized public-private partnership, termed Accelerating COVID-19 Therapeutic Interventions and Vaccines (ACTIV) (Corey et al., 2020; Kaiser, 2020), is necessary and will facilitate a coordinated COVID-19 pandemic response. Globally, the World Health Organization is playing a critical multinational coordination and informational role (WHO, 2020a).

Summary
Government-funded and private initiatives have synergized to provide countermeasure platforms to rapidly respond to the SARS-CoV-2 pandemic. Continued cooperation among public and private institutions coupled with speed of development of antibody countermeasures and vaccines, with rapid evaluation of their safety and efficacy, and early planning for scale-up and manufacture will be critical for expeditious control of the global COVID-19 pandemic.

ACKNOWLEDGMENTS

Funded by NIH grants AI142596 (B.F.H.), AI145687 and AI150415 (P.A.), AI058607 (G.D.S.); Department of Defense HR0011-17-2-0069 (G.D.S.); and the Translating Duke Health Initiative (P.A.). All authors wrote and edited the manuscript. We thank Megan Averill for editorial assistance and David Easterhoff and QiFeng Han for their contributions to the DARPA P3 program.

DECLARATION OF INTERESTS

B.F.H. and G.D.S. have a patent submitted for the emerging infections preparedness platform and influenza antibodies (No. 62_902705).

REFERENCES

Andreano, E., Nicastri, E., Paciello, I., Pileri, P., Manganaro, N., Piccini, G., Manenti, A., Pantano, E., Kabanova, A., Troisi, M., et al. (2020). Identification of neutralizing human monoclonal antibodies from Italian Covid-19 convalescent patients. bioRxiv. https://doi.org/10.1101/2020.05.05.078154.
Balazs, A.B., Chen, J., Hong, C.M., Rao, D.S., Yang, L., and Baltimore, D. Haynes, B.F., Shaw, G.M., Korber, B., Kelsoe, G., Sodroski, J., Hahn, B.H.,
(2011). Antibody-based protection against HIV infection by vectored immuno- Borrow, P., and McMichael, A.J. (2016). HIV-Host Interactions: Implications
prophylaxis. Nature 481, 81–84. for Vaccine Design. Cell Host Microbe 19, 292–303.
Bao, L., Deng, W., Huang, B., Gao, H., Liu, J., Ren, L., Wei, Q., Yu, P., Xu, Y., Haynes, B.F., Burton, D.R., and Mascola, J.R. (2019). Multiple roles for HIV
Qi, F., et al. (2020). The pathogenicity of SARS-CoV-2 in hACE2 transgenic broadly neutralizing antibodies. Sci. Transl. Med. 11, eaaz2686.
mice. Nature. https://doi.org/10.1038/s41586-020-2312-y. Henderson, R., Lu, M., Zhou, Y., Mu, Z., Parks, R., Han, Q., Hsu, A.L., Carter,
Bill & Melinda Gates Foundation (2020). Press Release: COVID-19 Therapeu- E., Blanchard, S.C., Edwards, R.J., et al. (2020). Disruption of the HIV-1 Enve-
tics Accelerator Awards $20 Million in Initial Grants to Fund Clinical Trials. lope allosteric network blocks CD4-induced rearrangements. Nat. Commun.
11, 520.
Bolles, M., Deming, D., Long, K., Agnihothram, S., Whitmore, A., Ferris, M.,
Funkhouser, W., Gralinski, L., Totura, A., Heise, M., and Baric, R.S. (2011). A Idusogie, E.E., Wong, P.Y., Presta, L.G., Gazzano-Santoro, H., Totpal, K.,
double-inactivated severe acute respiratory syndrome coronavirus vaccine Ultsch, M., and Mulkerrin, M.G. (2001). Engineered antibodies with increased
provides incomplete protection in mice and induces increased eosinophilic activity to recruit complement. J. Immunol. 166, 2571–2575.
proinflammatory pulmonary response upon challenge. J. Virol. 85, Jaume, M., Yip, M.S., Cheung, C.Y., Leung, H.L., Li, P.H., Kien, F., Dutry, I.,
12201–12215. Callendret, B., Escriou, N., Altmeyer, R., et al. (2011). Anti-severe acute respi-
Brouwer, P., Caniels, T., van Straten, K., Snitselaar, J., Aldon, Y., Bangaru, S., ratory syndrome coronavirus spike antibodies trigger infection of human im-
Torres, J., Okba, N., Claireaux, M., Kerster, G., et al. (2020). Potent neutralizing mune cells via a pH- and cysteine protease-independent FcgR pathway.
antibodies from COVID-19 patients define multiple targets of vulnerability. bio- J. Virol. 85, 10582–10597.
Rxiv. https://doi.org/10.1101/2020.05.12.088716. Ju, B., Zhang, Q., Ge, X., Wang, R., Yu, J., Shan, S., Zhou, B., Song, S., Tang,
X., Yu, J., et al. (2020). Potent human neutralizing antibodies elicited by SARS-
Burton, D.R., Ahmed, R., Barouch, D.H., Butera, S.T., Crotty, S., Godzik, A.,
CoV-2 infection. bioRxiv. https://doi.org/10.1101/2020.03.21.990770.
Kaufmann, D.E., McElrath, M.J., Nussenzweig, M.C., Pulendran, B., et al.
(2012). A Blueprint for HIV Vaccine Discovery. Cell Host Microbe 12, 396–407. Kaiser, J. (2020). NIH organizes hunt for drugs. Science 368, 351, 351.
Cable, J., Srikantiah, P., Crowe, J.E., Jr., Pulendran, B., Hill, A., Ginsberg, A., Klein, F., Mouquet, H., Dosenovic, P., Scheid, J.F., Scharf, L., and Nussenz-
Koff, W., Mathew, A., Ng, T., Jansen, K., et al. (2020). Vaccine innovations for weig, M.C. (2013). Antibodies in HIV-1 vaccine development and therapy. Sci-
emerging infectious diseases-a symposium report. Ann. N Y Acad. Sci. 1462, ence 341, 1199–1204.
14–26. Korber, B., Fischer, W., Gnanakaran, S., Yoon, H., Theiler, J., Abfalterer, W.,
Foley, B., Giorgi, E., Bhattacharya, T., Parker, M., et al. (2020). Spike mutation
Carroll, D., Daszak, P., Wolfe, N.D., Gao, G.F., Morel, C.M., Morzaria, S., Pa-
pipeline reveals the emergence of a more transmissible form of SARS-CoV-2.
blos-Méndez, A., Tomori, O., and Mazet, J.A.K. (2018). The Global Virome
bioRxiv. https://doi.org/10.1101/2020.04.29.069054.
Project. Science 359, 872–874.
Kose, N., Fox, J.M., Sapparapu, G., Bombardi, R., Tennekoon, R.N., de Silva,
Carter, P.J., and Lazar, G.A. (2018). Next generation antibody drugs: pursuit of
A.D., Elbashir, S.M., Theisen, M.A., Humphris-Narayanan, E., Ciaramella, G.,
the ‘high-hanging fruit’. Nat. Rev. Drug Discov. 17, 197–223.
et al. (2019). A lipid-encapsulated mRNA encoding a potently neutralizing hu-
Caskey, M., Schoofs, T., Gruell, H., Settler, A., Karagounis, T., Kreider, E.F., man monoclonal antibody protects against chikungunya infection. Sci. Immu-
Murrell, B., Pfeifer, N., Nogueira, L., Oliveira, T.Y., et al. (2017). Antibody 10- nol. 4, eaaw6647.
1074 suppresses viremia in HIV-1-infected individuals. Nat. Med. 23, 185–191.
Kwong, P.D., and Mascola, J.R. (2012). Human antibodies that neutralize HIV-
Caskey, M., Klein, F., and Nussenzweig, M.C. (2019). Broadly neutralizing anti- 1: identification, structures, and B cell ontogenies. Immunity 37, 412–425.
HIV-1 monoclonal antibodies in the clinic. Nat. Med. 25, 547–553. Kwong, P.D., and Mascola, J.R. (2018). HIV-1 Vaccines Based on Antibody
Corey, B.L., Mascola, J.R., Fauci, A.S., and Collins, F.S. (2020). A strategic Identification, B Cell Ontogeny, and Epitope Structure. Immunity 48, 855–871.
approach to COVID-19 vaccine R&D. Science, eabc5312. LaBranche, C.C., Henderson, R., Hsu, A., Behrens, S., Chen, X., Zhou, T.,
DARPA (2017). Pandemic Prevention Platform (P3). https://www.darpa.mil/ Wiehe, K., Saunders, K.O., Alam, S.M., Bonsignori, M., et al. (2019). Neutrali-
program/pandemic-prevention-platform. zation-guided design of HIV-1 envelope trimers with high affinity for the unmu-
tated common ancestor of CH235 lineage CD4bs broadly neutralizing anti-
Dong, E., Du, H., and Gardner, L. (2020). An interactive web-based dashboard
bodies. PLoS Pathog. 15, e1008026.
to track COVID-19 in real time. Lancet Infect. Dis. 20, 533–534.
Lan, J., Ge, J., Yu, J., Shan, S., Zhou, H., Fan, S., Zhang, Q., Shi, X., Wang, Q.,
Empey, K.M., Peebles, R.S., Jr., and Kolls, J.K. (2010). Pharmacologic ad-
Zhang, L., and Wang, X. (2020). Structure of the SARS-CoV-2 spike receptor-
vances in the treatment and prevention of respiratory syncytial virus. Clin.
binding domain bound to the ACE2 receptor. Nature 581, 215–220.
Infect. Dis. 50, 1258–1267.
Liao, H.X., Levesque, M.C., Nagel, A., Dixon, A., Zhang, R., Walter, E., Parks,
Gaudinski, M.R., Houser, K.V., Doria-Rose, N.A., Chen, G.L., Rothwell, R.S.S.,
R., Whitesides, J., Marshall, D.J., Hwang, K.K., et al. (2009). High-throughput
Berkowitz, N., Costner, P., Holman, L.A., Gordon, I.J., Hendel, C.S., et al.; VRC
isolation of immunoglobulin genes from single human B cells and expression
605 study team (2019). Safety and pharmacokinetics of broadly neutralising
as monoclonal antibodies. J. Virol. Methods 158, 171–179.
human monoclonal antibody VRC07-523LS in healthy adults: a phase 1
dose-escalation clinical trial. Lancet HIV 6, e667–e679. Liu, L., Wei, Q., Lin, Q., Fang, J., Wang, H., Kwok, H., Tang, H., Nishiura, K.,
Peng, J., Tan, Z., et al. (2019). Anti-spike IgG causes severe acute lung injury
GISAID (2020). Next hCoV-19 App (Germany: Munich). https://www.gisaid. by skewing macrophage responses during acute SARS-CoV infection. JCI
org/epiflu-applications/next-hcov-19-app/. Insight 4, e123158.
Gouglas, D., Christodoulou, M., Plotkin, S.A., and Hatchett, R. (2019). CEPI: Los Alamos National Laboratory (2020a). Los Alamos HIV Sequence Database
Driving Progress Toward Epidemic Preparedness and Response. Epidemiol. (US: Los Alamos, NM). http://www.hiv.lanl.gov/.
Rev. 41, 28–33.
Los Alamos National Laboratory (2020b). SARS-CoV-2 Sequence Analysis
Graham, B.S. (2020). Rapid COVID-19 vaccine development. Science, Pipeline. https://cov.lanl.gov/.
eabb8923.
Lu, S., Zhao, Y., Yu, W., Yang, Y., Gao, J., Wang, J., Kuang, D., Yang, M.,
Graham, B.S., and Sullivan, N.J. (2018). Emerging viral diseases from a vacci- Yang, J., Ma, C., et al. (2020). Comparison of SARS-CoV-2 infections among
nology perspective: preparing for the next pandemic. Nat. Immunol. 19, 20–28. 3 species of non-human primates. bioRxiv. https://doi.org/10.1101/2020.04.
Graham, B.S., Mascola, J.R., and Fauci, A.S. (2018). Novel Vaccine Technol- 08.031807.
ogies: Essential Components of an Adequate Response to Emerging Viral Dis- Mendoza, P., Gruell, H., Nogueira, L., Pai, J.A., Butler, A.L., Millard, K., Leh-
eases. JAMA 319, 1431–1432. mann, C., Suárez, I., Oliveira, T.Y., Lorenzi, J.C.C., et al. (2018). Combination
1462 Cell 181, June 25, 2020

ll
Perspective
therapy with anti-HIV-1 antibodies maintains viral suppression. Nature 561, acterization of neutralizing antibodies from a SARS-CoV-2 infected individual.
479–484. bioRxiv. https://doi.org/10.1101/2020.05.12.091298.
Murin, C.D., Wilson, I.A., and Ward, A.B. (2019). Antibody responses to viral in- Shepard, H.M., Phillips, G.L., D Thanos, C., and Feldmann, M. (2017). Devel-
fections: a structural perspective across three different enveloped viruses. opments in therapy with monoclonal antibodies and related proteins. Clin.
Nat. Microbiol. 4, 734–747. Med. (Lond.) 17, 220–232.
Muthumani, K., Flingai, S., Wise, M., Tingey, C., Ugen, K.E., and Weiner, D.B. Sok, D., and Burton, D.R. (2018). Recent progress in broadly neutralizing anti-
(2013). Optimized and enhanced DNA plasmid vector based in vivo construc- bodies to HIV. Nat. Immunol. 19, 1179–1188.
tion of a neutralizing anti-HIV-1 envelope glycoprotein Fab. Hum. Vaccin. Im- ter Meulen, J., van den Brink, E.N., Poon, L.L., Marissen, W.E., Leung, C.S.,
munother. 9, 2253–2262. Cox, F., Cheung, C.Y., Bakker, A.Q., Bogaards, J.A., van Deventer, E., et al.
NIH (2020). Press Release: NIH clinical trial of investigational vaccine for (2006). Human monoclonal antibody combination against SARS coronavirus:
COVID-19 begins. synergy and coverage of escape mutants. PLoS Med. 3, e237.
Pardi, N., Hogan, M.J., Pelc, R.S., Muramatsu, H., Andersen, H., DeMaso, Thanh Le, T., Andreadakis, Z., Kumar, A., Gómez Román, R., Tollefsen, S., Sa-
C.R., Dowd, K.A., Sutherland, L.L., Scearce, R.M., Parks, R., et al. (2017a). ville, M., and Mayhew, S. (2020). The COVID-19 vaccine development land-
Zika virus protection by a single low-dose nucleoside-modified mRNA vacci- scape. Nat. Rev. Drug Discov. 19, 305–306.
nation. Nature 543, 248–251. Tseng, C.T., Sbrana, E., Iwata-Yoshikawa, N., Newman, P.C., Garron, T., At-
Pardi, N., Secreto, A.J., Shan, X., Debonera, F., Glover, J., Yi, Y., Muramatsu, mar, R.L., Peters, C.J., and Couch, R.B. (2012). Immunization with SARS co-
H., Ni, H., Mui, B.L., Tam, Y.K., et al. (2017b). Administration of nucleoside- ronavirus vaccines leads to pulmonary immunopathology on challenge with
modified mRNA encoding broadly neutralizing antibody protects humanized the SARS virus. PLoS ONE 7, e35421.
mice from HIV-1 challenge. Nat. Commun. 8, 14630. Walker, L.M., and Burton, D.R. (2018). Passive immunotherapy of viral infec-
Pardi, N., Hogan, M.J., Porter, F.W., and Weissman, D. (2018). mRNA vaccines tions: ‘super-antibodies’ enter the fray. Nat. Rev. Immunol. 18, 297–308.
- a new era in vaccinology. Nat. Rev. Drug Discov. 17, 261–279. Walls, A.C., Park, Y.J., Tortorici, M.A., Wall, A., McGuire, A.T., and Veesler, D.
Peeples, L. (2020). News Feature: Avoiding pitfalls in the pursuit of a COVID-19 (2020). Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glyco-
vaccine. Proc. Natl. Acad. Sci. USA 117, 8218–8221. protein. Cell 181, 281–292.e6, e286.
Quinlan, B.D., Mou, H., Zhang, L., Guo, Y., He, W., Ojha, A., Parcells, M.S., Wang, L., Shi, W., Chappell, J.D., Joyce, M.G., Zhang, Y., Kanekiyo, M.,
Luo, G., Li, W., Zhong, G., et al. (2020). The SARS-CoV-2 receptor-binding Becker, M.M., van Doremalen, N., Fischer, R., Wang, N., et al. (2018). Impor-
domain elicits a potent neutralizing response without antibody-dependent tance of Neutralizing Monoclonal Antibodies Targeting Multiple Antigenic Sites
enhancement. bioRxiv. https://doi.org/10.1101/2020.04.10.036418. on the Middle East Respiratory Syndrome Coronavirus Spike Glycoprotein To
Richards, J.O., Karki, S., Lazar, G.A., Chen, H., Dang, W., and Desjarlais, J.R. Avoid Neutralization Escape. J. Virol. 92, e02002–e02017.
(2008). Optimization of antibody binding to FcgammaRIIa enhances macro- WHO (2020a). Coronavirus disease (COVID-2019) technical guidance, World
phage phagocytosis of tumor cells. Mol. Cancer Ther. 7, 2517–2527. Health Organization. https://www.who.int/emergencies/diseases/novel-
Robbie, G.J., Criste, R., Dall’acqua, W.F., Jensen, K., Patel, N.K., Losonsky, coronavirus-2019/technical-guidance.
G.A., and Griffin, M.P. (2013). A novel investigational Fc-modified humanized WHO (2020b). Draft landscape of COVID-19 candidate vaccines – 20 March
monoclonal antibody, motavizumab-YTE, has an extended half-life in healthy 2020. https://www.who.int/blueprint/priority-diseases/key-action/novel-
adults. Antimicrob. Agents Chemother. 57, 6147–6153. coronavirus-landscape-ncov.pdf?ua=1.
Rockx, B., Kuiken, T., Herfst, S., Bestebroer, T., Lamers, M.M., Oude Munnink, Wiehe, K., Bradley, T., Meyerhoff, R.R., Hart, C., Williams, W.B., Easterhoff, D.,
B.B., de Meulder, D., van Amerongen, G., van den Brand, J., Okba, N.M.A., Faison, W.J., Kepler, T.B., Saunders, K.O., Alam, S.M., et al. (2018). Functional
et al. (2020). Comparative pathogenesis of COVID-19, MERS, and SARS in a Relevance of Improbable Antibody Mutations for HIV Broadly Neutralizing
nonhuman primate model. Science, eabb7314. Antibody Development. Cell Host Microbe 23, 759–765.e6, e756.
Rogers, T.F., Zhao, F., Huang, D., Beutler, N., Abbott, R.K., Callaghan, S., Gar- Wrapp, D., Wang, N., Corbett, K.S., Goldsmith, J.A., Hsieh, C.L., Abiona, O.,
cia, E., He, W.-t., Hurtado, J., Limbo, O., et al. (2020). Rapid isolation of potent Graham, B.S., and McLellan, J.S. (2020). Cryo-EM structure of the 2019-
SARS-CoV-2 neutralizing antibodies and protection in a small animal model. nCoV spike in the prefusion conformation. Science 367, 1260–1263.
bioRxiv. https://doi.org/10.1101/2020.05.11.088674. Yu, X.Q., Robbie, G.J., Wu, Y., Esser, M.T., Jensen, K., Schwartz, H.I., Bell-
Saphire, E.O., Schendel, S.L., Fusco, M.L., Gangavarapu, K., Gunn, B.M., amy, T., Hernandez-Illas, M., and Jafri, H.S. (2016). Safety, Tolerability, and
Wec, A.Z., Halfmann, P.J., Brannan, J.M., Herbert, A.S., Qiu, X., et al.; Viral Pharmacokinetics of MEDI4893, an Investigational, Extended-Half-Life, Anti-
Hemorrhagic Fever Immunotherapeutic Consortium (2018). Systematic Anal- Staphylococcus aureus Alpha-Toxin Human Monoclonal Antibody, in Healthy
ysis of Monoclonal Antibodies against Ebola Virus GP Defines Features that Adults. Antimicrob. Agents Chemother. 61, 61.
Contribute to Protection. Cell 174, 938–952.e13, e913. Yuan, M., Wu, N.C., Zhu, X., Lee, C.D., So, R.T.Y., Lv, H., Mok, C.K.P., and Wil-
Saunders, K.O. (2019). Conceptual Approaches to Modulating Antibody son, I.A. (2020). A highly conserved cryptic epitope in the receptor binding do-
Effector Functions and Circulation Half-Life. Front. Immunol. 10, 1296. mains of SARS-CoV-2 and SARS-CoV. Science 368, 630–633.
Saunders, K.O., Wiehe, K., Tian, M., Acharya, P., Bradley, T., Alam, S.M., Go, Zhao, J., Zhao, J., Mangalam, A.K., Channappanavar, R., Fett, C., Meyerholz,
E.P., Scearce, R., Sutherland, L., Henderson, R., et al. (2019). Targeted selec- D.K., Agnihothram, S., Baric, R.S., David, C.S., and Perlman, S. (2016). Airway
tion of HIV-specific antibody mutations by engineering B cell maturation. Sci- Memory CD4(+) T Cells Mediate Protective Immunity against Emerging Respi-
ence 366, eaay7199. ratory Coronaviruses. Immunity 44, 1379–1391.
Seydoux, E., Homad, L.J., MacCamy, A.J., Parks, K.R., Hurlburt, N.K., Jenne- Zhou, Y., Jiang, S., and Du, L. (2018). Prospects for a MERS-CoV spike vac-
wein, M.F., Akins, N.R., Stuart, A.B., Wan, Y.-H., Feng, J., et al. (2020). Char- cine. Expert Rev. Vaccines 17, 677–686.
Cell 181, June 25, 2020 1463

ll
Leading Edge
Perspective
Molecular Transducers of Physical Activity
Consortium (MoTrPAC): Mapping the Dynamic
Responses to Exercise
James A. Sanford,1,12 Christopher D. Nogiec,2,12 Malene E. Lindholm,3,12 Joshua N. Adkins,1 David Amar,3
Surendra Dasari,4 Jonelle K. Drugan,5 Facundo M. Fernández,6 Shlomit Radom-Aizik,7 Simon Schenk,8
Michael P. Snyder,3 Russell P. Tracy,9 Patrick Vanderboom,4 Scott Trappe,10,11,12,* Martin J. Walsh,2,11,12,* and the
Molecular Transducers of Physical Activity Consortium
1Pacific Northwest National Laboratory, Richland, WA 99354, USA
2Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
3Stanford University, Stanford, CA 94305, USA
4Mayo Clinic, Rochester, MN 55901, USA
5National Institutes of Health, Bethesda, MD 20892, USA
6Georgia Institute of Technology, Atlanta, GA 30322, USA
7University of California, Irvine, Irvine, CA 92617, USA
8University of California, San Diego, La Jolla, CA 92093, USA
9University of Vermont, Burlington, VT 05405, USA
10Ball State University, Muncie, IN 47306, USA
11Senior author
12These authors contributed equally
*Correspondence: strappe@bsu.edu (S.T.), martin.walsh@mssm.edu (M.J.W.)

Exercise provides a robust physiological stimulus that evokes cross-talk among multiple tissues and, when repeated regularly (i.e., training), improves physiological capacity, benefits numerous organ systems, and decreases the risk of premature mortality. However, a gap remains in identifying the detailed molecular signals induced by exercise that benefit health and prevent disease. The Molecular Transducers of Physical Activity Consortium (MoTrPAC) was established to address this gap and generate a molecular map of exercise. Preclinical and clinical studies will examine the systemic effects of endurance and resistance exercise across a range of ages and fitness levels by molecular probing of multiple tissues before and after acute and chronic exercise. From this multi-omic and bioinformatic analysis, a molecular map of exercise will be established. Altogether, MoTrPAC will provide a public database that is expected to enhance our understanding of the health benefits of exercise and to provide insight into how physical activity mitigates disease.
INTRODUCTION

Exercise perturbs multiple systems from the whole body to the molecular level in an integrated manner (Hawley et al., 2014). However, in-depth fundamental knowledge of the molecular and cellular mechanisms that are responsible for physical activity’s benefits on multiple organ systems, and of the diseases and disorders that derive from inactivity, is incomplete (Booth et al., 2017; Neufer et al., 2015). A better understanding of these biological processes and pathways would allow for the development of targeted exercise interventions and prescriptions and provide a foundation for developing exercise-mimetic pharmacologic interventions.

The Molecular Transducers of Physical Activity Consortium (MoTrPAC) was established to elucidate how exercise improves health and ameliorates diseases by building a map of the molecular responses to acute and chronic exercise. In 2014, a portfolio analysis of National Institutes of Health (NIH) grants revealed that most research regarding physical activity involved disease prevention or treatment. In fact, almost all the grants that employed an exercise intervention addressed only health outcomes and adherence issues. The MoTrPAC initiative provides a much-needed comprehensive program to understand the interplay between these biological systems with the goal of improving the design of physical activity interventions. In addition, there is a potential to identify molecular targets that can be manipulated to mimic the effects of exercise in persons unable to exercise for a variety of reasons, such as physical disability, coma, or paralysis.

To address the gaps in knowledge about how exercise enhances health and ameliorates disease, multiple agencies at the NIH, including the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS), the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), the National Institute on Aging (NIA), and other institutes and centers that participated in the trans-NIH Exercise Interest Group, proposed the Common Fund program supporting MoTrPAC. To create a substantive complex map of molecular transducers in

Figure 1. General Overview of MoTrPAC
Preclinical Animal Study Sites (PASSs) (rats) and Human Clinical Exercise Sites will collect biospecimen samples after acute and chronic exercise. The biospecimen samples will be sent to a central biorepository where they will be logged, processed, and distributed to various Chemical Analysis Sites (CASs) for ‘omics analysis. A portion of the biospecimen samples will be kept at the biorepository for future ancillary study opportunities by the scientific community. Data generated from the CASs will be transferred to the Bioinformatics Center (BIC) for a multi-omics, multi-species, multi-tissue, and multi-time point integration of the data with the goal of generating a molecular map of exercise. The data will be available to the scientific community via the MoTrPAC Data Hub: https://motrpac-data.org.
Figure design by Jill K. Gregory, Mount Sinai.
diverse populations across the lifespan, MoTrPAC established a multi-site collaboration across the United States encompassing various scientific disciplines: preclinical animal study sites and human clinical exercise sites to perform the exercise testing, exercise interventions, and biospecimen collection; a consortium coordinating center to manage sample collection, distribution of samples, and consortium logistics; chemical analysis sites to perform ‘omics analysis from the samples collected; and a bioinformatics center to collaboratively facilitate data quality control, bioinformatics analysis, and dissemination to make the data and related resources available to the broad scientific community (Figure 1). The animal studies will enable analysis of the effects of exercise on many different tissues that are not readily obtainable in humans, thereby enabling a broad view of the systemic effects of exercise. The collection of human specimens (blood, muscle, and adipose) will permit the analysis of these critical systems, which are central to the energetics of exercise and appear to interact in a coordinated manner to improve overall metabolic health (Pedersen and Febbraio, 2012; Romijn et al., 1993; Stanford and Goodyear, 2018). In addition to providing information concerning the effects of exercise at different physiological and molecular levels, the large scope of this study (humans: N = 2,600; rats: N = 820) will create a complex, integrative dataset that will be available to the scientific community. Other groups can leverage this dataset and some associated biospecimens by proposing ancillary studies to MoTrPAC.

Preclinical Animal Study Sites

A primary goal of the Preclinical Animal Study Sites (PASSs) investigations is to enable the analysis of the systemic effects of exercise across many different organs and blood, most of which cannot be collected in humans. The first phase of the PASS studies is being conducted across three separate sites, with each collecting numerous biospecimens after acute (i.e., single bout) treadmill exercise or chronic treadmill training of male and female F344 rats at 6 and 18 months of age (Figure 2). For both acute and chronic exercise studies, the same biospecimens are being collected from non-exercised control animals; for the acute studies this includes groups accounting for potential effects of time of day and time of feeding. This rat strain was selected for MoTrPAC because there is a large body of previous exercise research utilizing this strain, and rats provide larger amounts of tissue than mice. Larger tissue samples allow multiple assays to be performed on the same sample, which supports bioinformatics analysis for the development of the molecular map of exercise. Larger tissues will also provide additional material for ancillary studies. To control variation across the three PASSs, the rats are being provided from the same NIA colony, and the housing and feeding conditions are standardized across the sites. Furthermore, because rats are most active during the “dark” phase of the light cycle, all rats are first being acclimated to a reverse light cycle for a minimum of 10 days, and all exercise bouts are conducted during the dark cycle.

The PASS study design for the acute response to exercise entails a single 30-min bout of treadmill running (intensity: 80%–90% VO2max; incline: 5°; speed at 6 months: male 16.8 m/min, female 18.0 m/min; speed at 18 months: male 12.0 m/min, female 13.8 m/min), with tissue collections occurring immediately post-exercise and at six additional times up to 48 h after the exercise bout. This sampling series is weighted toward early time points (0, 0.5, 1, and 4 h post-exercise) to capture the temporal dynamics of the molecular responses, but it also includes later time points (7, 24, and 48 h post-exercise) to capture

Figure 2. A General Schematic of the Preclinical and Clinical Studies

(A) Biospecimen collection for the rat studies (N = 820, including non-exercised controls) is ongoing and will be completed in Fall 2020. For both the acute (i.e.,
single bout; n = 526) and chronic phases (i.e., training; n = 294), 6- and 18-month-old male and female rats are being studied, and both exercise phases include a
cohort of non-exercised controls.
(B) 2,600 healthy individuals will be evaluated physiologically, and biospecimens will be collected to accommodate molecular probing of various tissues before
and after acute and chronic endurance or resistance exercise. For the adult cohort, the control participants will rest quietly (i.e., no exercise) during the acute bout
with biospecimen collections.
*Assessments include blood profile, body composition, heart health, VO2max and strength.
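As a minimal sketch, the rat cohort sizes given in the Figure 2 caption can be tallied and crossed with the age and sex strata described for the study. Only the counts (526, 294, 820) and the ages and sexes come from the text; the dictionary keys, function names, and stratum labels are illustrative assumptions and are not part of any MoTrPAC software. No per-stratum group sizes are assumed, because the caption does not state them.

```python
from itertools import product

# Counts as stated in the Figure 2 caption (non-exercised controls included).
# All names below are hypothetical and chosen only for illustration.
RAT_COHORT = {"acute": 526, "chronic": 294}
STATED_TOTAL = 820
AGES_MONTHS = (6, 18)
SEXES = ("male", "female")


def cohort_total(cohort):
    """Sum the number of animals across exercise phases."""
    return sum(cohort.values())


def phase_shares(cohort):
    """Return the fraction of the whole cohort assigned to each phase."""
    total = cohort_total(cohort)
    return {phase: n / total for phase, n in cohort.items()}


def strata():
    """Enumerate the age-by-sex strata studied in both phases."""
    return list(product(AGES_MONTHS, SEXES))


if __name__ == "__main__":
    print("phases sum to stated total:", cohort_total(RAT_COHORT) == STATED_TOTAL)
    for phase, share in phase_shares(RAT_COHORT).items():
        print(f"{phase}: {share:.1%} of animals")
    print("age/sex strata:", strata())
```

The check confirms that the acute and chronic counts in the caption add up to the stated N = 820, with four age-by-sex strata per phase.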
long-duration primary responses as well as secondary molecular events (detailed protocols are available at https://motrpac.org/protocols.cfm). To study the biological events that occur during the early, intermediate, and later stages of endurance training, the PASS study design for the chronic response to exercise, which has been completed, entailed up to 8 weeks of treadmill training (5 days per week at 70% VO2max), with tissues collected 48 h after 1, 2, 4, and 8 weeks of training; incline, duration, and speed of exercise progressively increased on a daily to weekly basis during the initial 6 weeks of training. Sex differences in both the acute exercise response and the training response are being investigated along with other study aims.

The most powerful aspect of the PASS design is the breadth of tissues collected. In addition to being studied in the context of MoTrPAC, these will serve as a data resource for generating hypotheses for future studies. For both the acute exercise and chronic exercise training studies (including the non-exercise controls), as many as 27 biospecimens per rat are being collected for potential analysis. In addition to biospecimen collection, other phenotypic outcomes are being collected, including blood lactate concentration, maximal oxygen consumption (VO2max), and body composition. At Chemical Analysis Sites, initial biospecimens of focus will include those that overlap with the human studies (i.e., plasma, skeletal muscle, white adipose), as well as liver, heart, kidney, lungs, brain, and brown adipose. It is expected that nucleic acid, proteomic, and targeted metabolomic assays will be performed only on tissues where the amount is non-limiting, whereas transcriptomics, non-targeted metabolomics, and non-targeted lipidomics will be performed on all tissues. Together, these assays are expected to provide molecular and physiological insights about the effect of exercise on many different organs. Ultimately, MoTrPAC should begin to explain how molecular transducers function across an entire mammal (Pedersen and Febbraio, 2012).

After preliminary characterization of the changes that occur in the initial set of analyses, a second phase of the PASS will include mechanistic studies of exercise-induced molecules that transduce stress resistance and circulating factors that might be implicated in the health benefits of exercise. Additional studies will focus on the adaptation to chronic resistance exercise and the impact of age and sex on these responses, as well as other studies that have yet to be determined.

Human Clinical Exercise Sites
The human component of MoTrPAC is an in-depth study of the effects of two different forms of exercise (endurance and resistance training) across multiple individuals of different ages

(including children) and sexes, as well as sedentary and highly active individuals. This large cohort will be used to study the response to exercise at the whole body and cellular levels and attempt to identify the molecular underpinnings that might be responsible for the adaptive process and variation among individuals. Several traditional methods from the field of exercise physiology will be combined with novel biospecimen sampling and high-throughput molecular analytical approaches that will likely yield important insights into the effects of exercise on health. The human study has many unique aspects that are highlighted below and will be conducted as a randomized controlled trial (RCT) with an intent-to-treat design.

Participants
The goal is to recruit 270 children and adolescents (10–17 years of age) who are low-active in endurance-type exercise and 1,980 healthy sedentary adults (age 18 years or greater) who will be medically screened and randomly assigned to endurance training (170 youth, 840 adults), resistance training (840 adults), or non-exercise control (50 youth, 300 adults) (Figure 2). An additional group of highly active endurance- (50 youth, 150 adults) and resistance- (150 adults) trained individuals will serve as comparators and will not participate in the MoTrPAC exercise training programs. The recruitment and enrollment approach will be sex balanced and provide participants across a wide range of ages (10–17, 18–39, 40–59, and ≥60 year age groups) and of different races.

Exercise Training Program: Adults
The sedentary adult participants randomized to endurance or resistance exercise training will perform 12 weeks of supervised exercise, 3 days per week, with progression in both volume and intensity. Each endurance training session will be 1 h in duration and be evenly split between cycling and treadmill (walking/running) exercise, with intensity set to 60%–80% of heart rate reserve and monitored in real time during each session. Each resistance training session will target the whole body and consist of eight total exercises (five upper body: chest press, military press, seated row, triceps extension, biceps curl; three lower body: leg press, leg curl, knee extension) at a prescribed plan of 3 sets of 8–12 repetitions at an intensity of 60%–80% of maximum for each exercise. These exercise protocols are well known to improve clinically relevant parameters (i.e., VO2max and muscular strength and hypertrophy) via alterations in metabolic, biochemical, and molecular signatures (Coggan et al., 1990; Gollnick et al., 1973; Raue et al., 2012; Rönn et al., 2014; Timmons et al., 2010).

Acute Exercise Bout and Biospecimens Collections: Adults
A unique feature of the MoTrPAC adult protocol will be the integration of strategic biospecimen collections (blood, muscle, and adipose) before, during, and after standardized bouts of acute exercise. Participants will perform a 40- to 45-min bout of exercise (exercise-mode specific; rest for the non-exercise controls) with biospecimens collected before and after 12 weeks of training. The highly active group will perform the exercise-mode specific bout only once. Compared to resting homeostasis, these types of exercise challenges are expected to dramatically increase metabolic rate 5- to 10-fold (Coggan et al., 1990; Farinatti and Castinheiras Neto, 2011; Mulla et al., 2000; Romijn et al., 1993) and bioenergetic flux >10-fold (Kjaer et al., 1991; Romijn et al., 1993; Steensberg et al., 2000), to produce a large dynamic range in gene expression, from small to >100-fold changes (Louis et al., 2007; Radom-Aizik et al., 2013, 2014), and likely to enhance cross-talk among many organs (Pedersen and Febbraio, 2012).

Standardized conditions that control for physical activity, time of day, and dietary intake will be implemented prior to the acute exercise bout. On the day of an acute exercise bout with biospecimen collections, volunteers will arrive at a Human Clinical Center in the morning after an overnight fast and rest comfortably for 0.5 h before baseline blood (antecubital vein), skeletal muscle (vastus lateralis), and adipose (periumbilical region) samples are obtained. Participants will then perform the standardized acute exercise bout (or rest for the non-exercise control group) with additional biospecimen samples (blood, muscle, adipose) obtained 0.5 h (early), 4 h (middle), and 24 h (late) after exercise. These time points were chosen to capture the dynamic changes in the response to exercise, as metabolic, post-translational, and epigenetic modifications can occur quite rapidly (Barrès et al., 2012; Bolster et al., 2003; Hoffman et al., 2015; Romijn et al., 1993), whereas mRNA induction generally peaks a few hours after exercise (Louis et al., 2007; Yang et al., 2005), and increases in protein synthesis rates are detectable in the hours and days following exercise (Phillips et al., 1997) (Figure 3). Additional blood samples will be collected during the endurance exercise bout (20- and 40-min time points) and shortly after (10 min) both endurance and resistance exercise bouts. All participants will have pre-exercise biospecimen collections, but to reduce participant burden in the post-exercise phase, sedentary participants will undergo skeletal muscle and adipose biopsies at one of three time points (early, middle, late). The highly active participants will have muscle biopsies and blood at all time points, whereas adipose biopsies will be collected at the pre and middle time points.

Acute Exercise Bout and Biospecimens Collections: Pediatrics
Children and adolescents undergo critical periods of growth and development, which are distinct from adult physiology. Pediatric studies must also comply with additional ethical considerations (Radom-Aizik and Cooper, 2016). Consequently, although the pediatric arm of the study will mimic the adult protocol as closely as possible, there are a few notable exceptions: (1) children who are low-active in endurance-type exercise will be recruited (versus sedentary adults) to account for the fact that children are naturally more active than adults and also participate in mandatory physical education classes; (2) no tissue biopsies will be taken (only blood will be collected); (3) an acute bout of endurance exercise with blood collection will be performed in both the training intervention and no-exercise control groups; (4) blood samples will be collected in all participants before exercise, at 20 and 40 min during exercise, and at 10 min, 0.5 h, and 3.5 h into recovery; and (5) for a subgroup of 170 low-active endurance exercise children and adolescents who will be randomized to receive 12-week endurance training (N = 120) or continue their standard practice (N = 50) (Figure 2), the endurance exercise intervention will be modified to provide an exercise intervention that is appropriate to the pediatric participants’ age group. For middle and high school

Figure 3. Overview of Biospecimen Collections and Integrated Analysis Plan
Preclinical studies: For the acute exercise arm, biospecimens (N = 25 per animal) are being collected 0 (i.e., immediately post), 0.5, 1, 4, 7, 24, or 48 h following exercise. For the chronic exercise training arm, which has been completed, biospecimens (N = 28) were collected 48 h after rats completed 1, 2, 4, or 8 weeks of treadmill training. For both study arms, the same biospecimens are being collected from an unexercised group of rats.
Clinical studies: Human biospecimens (blood, muscle, adipose; blood only for pediatrics) will be obtained before and at 0.5, 4, and 24 h following endurance or resistance exercise. The sedentary participants will perform the acute exercise bout with biospecimen collections twice (before and after 12 weeks of training) whereas the highly active participants will perform this one time (see Human Clinical Sites description in the text for more detail). The ‘omics data generated from multiple different genomics, proteomics, and metabolomics assays will be processed in assay-specific pipelines. Omic-specific analyses followed by state-of-the-art integrative methods will be applied for a multi-omic analysis of the multiple time points and tissues collected in MoTrPAC with the goal of creating a map of the molecular transducers of exercise. All data (pipelines, raw and processed data, results, and integrative analyses) will be made publicly available through the MoTrPAC Data Hub: https://motrpac-data.org.
Phenotype Measures: Pediatrics

and Adults
To complement the molecular map,
selected phenotypic measures will be ob-
tained. Participants will be assessed
before and after the 12-week intervention
period, whereas the highly active adult
comparator group members will be
assessed once. These measurements
include maximal oxygen consumption
on a cycle ergometer (VO2max), grip
strength, maximal isometric knee exten-
sion strength, body composition (DXA,
dual-energy X-ray absorptiometry), clin-
ical blood profiles, heart rate profiles dur-
ing the acute endurance and resistance
exercise bouts, substrate utilization (car-
bohydrate and fat) during the acute
endurance exercise bout at 65%
VO2max (adult endurance participants
only), and upper and lower body strength
(one-repetition maximum; adult resis-
tance participants only). Most of the adult
students the modes of endurance exercise will include cycling participants also will provide information on self-reported health
and treadmill and also an option for elliptical and rowing ma- outcomes using PROMIS measures (www.nihpromis.org) that
chines. Elementary school students will be trained in a form of will provide opportunities to investigate the effect of exercise
circuit training (e.g., endurance activity stations: cycle ergom- on mood, anxiety, and depression. In a subgroup of adult partic-
eter, steppers, individual jump rope, pacer, and sliders) to ipants, skeletal muscle and adipose histology (cell type, size,
keep the younger participants engaged. capillarization) and single cell analysis across various ‘omics
platforms will be conducted. Investigators are exploring the potential for collecting and analyzing microbiome samples from a subgroup of adult participants.
Implementation
To optimize this complex protocol, the adult component of the study will be implemented in two phases. The first phase will involve 150 adult participants and will require 4–6 months, enabling assessment of participant and clinical burden and feasibility, as well as allowing for refinement of the MoTrPAC protocol. In phase two, the remainder of the project with a target of over 2,000 participants will be implemented.

Consortium Coordinating Center
The Consortium Coordinating Center (CCC) is composed of four parts: an Administrative Coordinating Center (ACC), a Data Management and Quality Control (DMAQC) Core, an Exercise Intervention Core (EIC), and a central Biorepository. The role of the ACC is to enable the organization and governance of MoTrPAC by facilitating key processes such as meeting logistics, IRB submission, and preparation of Manuals of Operations.
The Biorepository, working with the preclinical and clinical sites, the DMAQC, and the Chemical Analysis Sites, oversees sample collection, shipping, archiving, and distribution of human and animal samples. This includes ensuring that homogeneous cryo-pulverization of tissue samples occurs prior to distribution of aliquots to the various Chemical Analysis Sites. Uniform sample processing is important to ensure that diverse data types can be directly compared. Each tissue sample will also be stored for future use by MoTrPAC and non-MoTrPAC investigators. Samples include serum, EDTA plasma, PAXgene-protected whole blood and peripheral blood mononuclear cells, and vastus lateralis skeletal muscle and subcutaneous abdominal adipose tissue from humans and >20 different tissues from the preclinical animals. Each sample will be analyzed by the Chemical Analysis Sites, and additional material will be archived for future use. The Biorepository inventory system interacts with the DMAQC to enable sample tracking, quality control, and other process support systems.

Chemical Analysis Sites
To understand the exercise response in detail, an in-depth analysis of molecular and ‘omic assays will be performed using state-of-the-art laboratory techniques. Technologies include genomics, transcriptomics, DNA methylomics, targeted and untargeted proteomics, and targeted and untargeted metabolomics.
Genomic, Transcriptomic, and Regulatory Analyses
Evidence has shown, through more than 150 small cohort studies (typically with under 50 participants analyzed) (Bouchard et al., 2011; Pacheco et al., 2018), that exercise is accompanied by massive changes at both the transcriptional and epigenomic levels in muscle, adipose, and most other tissue systems (Lindholm et al., 2014; Ling and Rönn, 2014; Rönn and Ling, 2013) with the poorly understood influence of the underlying human genetic/environmental variation that exists between and within populations (Leońska-Duniec et al., 2016). Therefore, recent scientific studies have been conducted generating data reflecting some of the underlying genetic and epigenetic basis for responses to exercise, physical activity, and training linking specific targets to benefits (Carter et al., 2017). Although several studies (for review see Loos et al., 2015; Warburton et al., 2006) have provided a rich source of information to develop a foundation for larger and more comprehensive genomic, epigenomic, and transcriptomic (GET) analyses, sufficiently powered studies with the complementary detailed study design to make accurate predictions as machine-learned models remain underdeveloped. Moreover, the information needed to understand the role genetic variation plays in the response of individuals to acute and chronic exercise remains limited. MoTrPAC, although predicted to be statistically underpowered for a genome-wide association study (GWAS), should be able to be statistically organized and prioritized (Cantor et al., 2010) so that there is potential benefit from the orthogonal measurements assessed through other ‘omes, leading to improved mechanistic insight. Such information can begin as a knowledge base for enabling better treatment considerations for a variety of diseases (whether acute or chronic) through recognizing potential genetic and epigenetic differences in responses to exercise and training. This could be accomplished through identifying novel gene/genetic network involvement, their corresponding changes in RNA transcripts and how such genes are regulated at the epigenetic level from adult and adolescent and between athletic and sedentary individuals, and associated sex differences in response to acute and chronic exercise.
The goal of the GET assays is to map and measure changes in the (1) RNA transcriptome and transcript isoforms including small and micro RNA using RNA sequencing, (2) DNA methylation and chromatin accessibility from rat and human tissues using reduced representation bisulfite sequencing (RRBS) for rat or methyl CpG hybrid capture for human specimens and ATAC-seq (assay for transposase-accessible chromatin with sequencing), respectively, and (3) genomic sequence and structure of all human participants. The assays are expected to provide insights into changes in biological processes as well as gene regulatory networks that occur in response to acute and chronic exercise. The GET assay component of MoTrPAC will involve comprehensive analyses of extensively curated rat and human MoTrPAC samples with an exercise intervention, contribute these data to public databases, help identify candidate molecular transducers, elucidate new mechanisms that might explain the human response to exercise, and cooperate with the Bioinformatics Center (BIC) to develop predictive models of the individual response to physical activity.
Proteomics
Proteins are important drivers of cellular structure, function, and signal mediation (Cox and Mann, 2011); thus, uncovering the pathways through which physical activity influences health requires analysis of the proteome and the critical signaling-associated post-translational modifications of the proteome in various tissues. To date, a number of proteomic studies have shown important changes influenced by exercise (Burniston, 2008; Hoffman et al., 2015; Magherini et al., 2012; Sollanek et al., 2017). The majority of this work has focused on skeletal muscle, which is the tissue that actively performs the motions involved in exercise, and blood and plasma, which circulate signals systemically through the body and may be responsible for facilitating cross-talk between organ systems. Furthermore, this research is often
performed in the context of diabetes because of the role of muscle and the interplay of exercise with insulin resistance (Kleinert et al., 2018). Although these studies have largely been constrained to experimental models of exercise in animals or very small cohorts of human subjects, the results are tantalizing and have identified several proteins and signaling molecules that potentially play a key role in the response to exercise. The large-scale and well-controlled preclinical and clinical protocols adopted for MoTrPAC will allow for expansion of this knowledge by providing a deeper interrogation of the proteomic response to acute and chronic exercise in numerous tissues from individuals across a range of fitness levels.
Importantly, proteomic analyses should be inclusive of not only protein expression but also the state of protein post-translational modifications, such as phosphorylation or acetylation, because these chemical moieties can act as rapid integrators by dictating protein localization and enzymatic activity (Brandes et al., 2009; Choudhary et al., 2014; Emmerich et al., 2011; Hunter, 1995). Primarily, untargeted mass spectrometry methods and targeted aptamer-based detection techniques will be employed to probe changes in protein abundance and modifications induced by exercise. Given that distinct tissues present technological challenges to discovery-based proteomic analysis (e.g., dynamic range in skeletal muscle), state-of-the-art instrumentation and protocols, including tandem mass tag labeling and fractionation (Mertins et al., 2018), will be employed. Indeed, pilot discovery-based proteomics efforts with muscle and other tissues have yielded robust datasets with levels of protein coverage exceeding previous studies, presenting a wealth of opportunities to elucidate the proteomic response to exercise and integrate these findings with data obtained from GET and metabolomic studies of the same tissues.
Metabolomics
Complementing genomics, transcriptomics, epigenomics, and proteomic studies, MoTrPAC will also carry out a highly comprehensive mapping of exercise-associated alterations in the metabolome of both rats and humans. The metabolome is the total collection of biologically active small molecules in a given organism (Nicholson and Wilson, 2003). This includes endogenous molecules that are biosynthesized by metabolic networks in primary metabolism, molecules derived from diet or environmental exposures (the exposome; Wild, 2005), and molecules derived from the biosynthetic interactions with the microbiome. Metabolomics can either be “targeted” to a set of known compounds (e.g., certain acylcarnitines) or “non-targeted,” which attempts to detect and relatively quantify as many metabolites as possible (Dettmer et al., 2007). In the context of acute and chronic exercise, metabolomics can provide sensitive and dynamic phenotypic patterns that closely reflect cellular and molecular changes and will likely improve our understanding of the effects of exercise beyond the individual pathway level (Heaney et al., 2017). A number of studies have documented profound metabolomic alterations associated with exercise (Fukai et al., 2016; Heaney et al., 2017; Lewis et al., 2010; Xiao et al., 2016), but these typically involve smaller cohorts (n < 100), are limited to only one (or a handful of) metabolomics assays, or focus primarily on alterations in energy production pathways. The vast chemical diversity of the metabolome, which includes lipids, sugars, amino acids, and myriad other molecule types, and its wide dynamic range (sub-nM to mM) implies that no single chemical assay can adequately profile all metabolites in one experiment (Smilde et al., 2005). To this end, MoTrPAC will employ a combination of non-targeted and targeted approaches for mapping the broader effects of exercise on both the metabolome and lipidome. These will range from triple-quadrupole-based liquid chromatography-mass spectrometry (LC-MS) using stable isotope-labeled internal standards for absolute quantification to high-resolution MS and tandem MS using reversed-phase and hydrophilic interaction LC for mapping relative changes in both known and unknown molecular transducers. Targeted and non-targeted LC-MS assays that focus on the non-polar fraction of the metabolome (the lipidome) will also be leveraged to map exercise effects on lipid metabolism and oxidation (Nieman et al., 2013, 2014). It is expected that these studies will provide insights into energy metabolism and signaling molecules involved in the response to exercise.
Exosomes
Exercise is a potent stimulus that has broad-ranging systemic effects that are indubitable (Egan and Zierath, 2013). One prevailing hypothesis is that circulating extracellular vesicles termed exosomes play an important role in carrying training-induced protein, mRNA, and microRNA (miRNA) cargo between organs as a means of integrating responses to exercise (Safdar and Tarnopolsky, 2018) (Figure 4). Many techniques have been developed for isolating exosomes from plasma to analyze their molecular cargo (Barrachina et al., 2019). Importantly, these techniques have been used to demonstrate that acute exercise increases the abundance of a wide variety of exosome-associated proteins related to metabolic and immune regulation (Whitham et al., 2018). Exosome isolation and analysis of MoTrPAC specimens will further investigate these effects by describing how exosome content is modulated in response to endurance and resistance exercise. The identification of protein and RNA signatures associated with exercise will shed light on exosome-mediated inter-organ cross-talk and provide a framework for studies to characterize the systemic response to physical activity.

Bioinformatics Center
The immediate goals of MoTrPAC will be vested in the ‘omic platforms used and the data being generated, the quality of these data, both meta and experimental, and how they will be utilized to map the molecular transducers involved in the responses to acute and chronic exercise. Data from each assay will be collected at the BIC and analyzed using consistent bioinformatic and analytic pipelines, whenever possible. This will improve reproducibility, interpretability, and ease in data harmonization across sites. Assay-specific quantitative data will undergo quality control assessment and be normalized to reduce undesirable sample-to-sample variation, minimize batch effects, and deal with analyte heteroskedasticity typically observed in molecular abundance datasets. Relative levels of analytes will be determined and the changes in molecules and pathways in response to exercise deduced.
Investigators across the consortium will conduct a series of integrative analyses with the end goal of creating a map of
revealing significant differential responses possibly interacting with other clinical variables (e.g., sex, age, and VO2max). Systems biology methods and network algorithms will be used to correlate the different ‘omics and provide interpretable modules, focusing on regulatory processes. This will include basic interpretation of differential abundance patterns via network and pathway enrichment analyses. Additional analysis will be performed to understand the correlation structure between ‘omics, detecting latent subject classes with different temporal responses to exercise, and network propagation techniques to highlight tissue-specific response (Amar and Shamir, 2014; Amar et al., 2015; Cowen et al., 2017; Gallant et al., 2013; Hofree et al., 2013; Jo et al., 2016; Schulz et al., 2012). In addition to identifying differential molecular responses and their associations with physiological changes (e.g., molecules and pathways that are associated with VO2max), MoTrPAC will attempt to build predictive models of the effects of exercise on change in these parameters. Plans are in place to study the effects of age, sex, race, and exercise type on these changes. To enhance interpretability, the results from all analyses above will be summarized using multiple visualization techniques.
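As an illustration of the pathway enrichment step mentioned above, the following is a minimal sketch and not the MoTrPAC pipeline. It assumes a one-sided hypergeometric (over-representation) test followed by Benjamini-Hochberg adjustment, which is one common choice among several. The function names (`hypergeom_enrichment_p`, `benjamini_hochberg`, `enrich`) and the toy analyte and pathway labels are hypothetical and exist only for this example.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    n_universe: analytes measured; n_pathway: of those, in the pathway;
    n_selected: analytes called differential; n_overlap: differential
    analytes that fall in the pathway.
    """
    max_k = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, max_k + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enrich(differential, universe, pathways):
    """Test each pathway for over-representation among differential analytes."""
    universe = set(universe)
    hits = set(differential) & universe
    names = sorted(pathways)
    p_values = []
    for name in names:
        members = set(pathways[name]) & universe
        overlap = len(members & hits)
        p_values.append(
            hypergeom_enrichment_p(len(universe), len(members), len(hits), overlap)
        )
    q_values = benjamini_hochberg(p_values)
    return {name: (p, q) for name, p, q in zip(names, p_values, q_values)}


# Toy data (hypothetical): 20 measured analytes, two 5-member pathways.
universe = [f"m{i}" for i in range(20)]
pathways = {
    "pathway_A": ["m0", "m1", "m2", "m3", "m4"],
    "pathway_B": ["m10", "m11", "m12", "m13", "m14"],
}
differential = ["m0", "m1", "m2", "m3", "m15"]

results = enrich(differential, universe, pathways)
for name, (p, q) in results.items():
    print(f"{name}: p={p:.4g}, adjusted p={q:.4g}")
```

In this toy case four of the five differential analytes fall in `pathway_A` and none in `pathway_B`, so only the former is flagged. A real analysis would also need to choose the background universe carefully, since that choice drives the result.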
Data and Resources Dissemination
As an NIH Common Fund Program, MoTrPAC aims to provide a foundation for further research to be used by the broad biomedical research community. The goal is to share all MoTrPAC data that do not compromise personal health information (PHI) for general research use. All of the produced raw and processed data, analysis pipelines, and results will be made rapidly available to the scientific community through the MoTrPAC Data Hub (https://motrpac-data.org) and, whenever suitable, in the appropriate public repositories (e.g., Metabolomics Workbench for metabolomics data and Database of Genotypes and Phenotypes for sequencing data). We plan to follow the FAIR (findable, accessible, interoperable, and reusable) data principles (Wilkinson et al., 2016) to ensure that the data are widely available to and usable by the scientific community. In addition, we will be collaborating with the Common Fund Data Ecosystem (https://commonfund.nih.gov/dataecosystem) team on tools, techniques, and external user training to maximize the utility of the MoTrPAC data to the scientific community. The MoTrPAC Data Hub itself is a cloud-native application that uses a service-oriented architecture. It is being hosted on the Google Cloud Platform. In addition to providing direct access to data, there will be tools for customizable web-based data visualizations provided on the site.
Figure 4. Role of Exosomes in Integrating the Exercise Response across Organ Systems
Exosomes are small extracellular vesicles (EVs) that are packaged with functional molecular cargo and released from most cell types. Exosomes released in response to exercise can be transported in the blood to various other tissues (adipose, brain, liver, etc.) where their contents can be released and have biological impact. Protein and miRNA cargo of circulating EVs from MoTrPAC samples will be analyzed to provide additional insight into this emerging biological phenomenon.
Figure design by Jill K. Gregory, Mount Sinai.

molecular transducers of physical activity and understanding the dynamic biological changes that occur in response to acute and chronic exercise (Figure 3). Unbiased statistical analysis will be performed to understand the variability in the data, thereby

Scientific Opportunities/Ancillary Studies
Although many activities are occurring within MoTrPAC, there will be opportunities to expand the breadth and depth of MoTrPAC. These opportunities include (1) additional interrogation of clinical metadata derived from animal and human studies and through the ‘omics applications and the interpretation of the data beyond what has been proposed and approved through the main MoTrPAC study, (2) addition or expansion of data collection to the current parent protocol, and (3) accession of the biorepository for remaining biospecimen samples obtained from the animal and human studies to perform complementary analyses on these tissues. The MoTrPAC Ancillary Study policy
and proposal template are available at https://motrpac.org/ancillarystudyguidelines.cfm.

Study Challenges
There are many challenges associated with a large multicenter project generating a wide variety of data types. Standardization procedures, operation manuals, and quality control steps are in place to reduce the variation in exercise performance and evaluation and in the collection of animal and human samples. Participant and clinic burden are a concern given the large scope and complexity of MoTrPAC; the consortium has made strategic choices regarding the protocol design and biospecimen sampling to help mitigate these challenges. Similarly, standardization and quality control steps are in place for the data generated using multiple ‘omics platforms even for similar data types (e.g., metabolomics and proteomics). An initial implementation phase (described above) was also put in place for the human studies to further evaluate the adult protocol during the early stages of recruitment, data collection, and analysis to identify any unforeseen issues.
Integrating heterogeneous data types across MoTrPAC will require sophisticated tracking, data normalization, and analytic approaches. MoTrPAC will generate large amounts of data, and many measurements may not meet statistical significance at the individual molecule level but may do so at the pathway level. State-of-the-art analytic tools are in place to help manage the depth and breadth of data analysis, and these tools will continue to evolve as MoTrPAC progresses. Finally, the data are complex; ensuring that they are accessible and understandable to the broad scientific community in a timely fashion is essential for this project to be successful. Protection of participant clinical data (PHI, genomics data) will be given the highest priority to protect each individual’s identity. Incorporation of useful visualization tools as well as active engagement with the broader scientific community will be equally important to fully capitalize on the MoTrPAC project.

Summary
When complete, MoTrPAC will deliver a map of the biological molecules and pathways underlying the systemic effects of acute and chronic exercise. The data, which will ultimately be made freely available to the scientific community, will provide unprecedented opportunities to begin to understand the pathways by which physical activity influences health. In the future, it is expected that the knowledge gained will allow researchers and health professionals to develop personalized exercise recommendations and provide insights into molecular targets that could be manipulated to mimic some of the effects of exercise in persons unable to exercise.

SUPPLEMENTAL INFORMATION
Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.06.004.

ACKNOWLEDGMENTS
The authors would like to gratefully acknowledge Jill K. Gregory, CMI, FAMI (Certified Medical Illustrator) of the Icahn School of Medicine at Mount Sinai for working with the writing group to generate the figures enclosed within this article. Additionally, the authors would like to gratefully acknowledge the expert administrative functions by Heather Kiesel of the University of Florida for help organizing efforts by the MoTrPAC Writing Group to better enable the completion of this article. The MoTrPAC Study is supported by NIH grants U24OD026629 (Bioinformatics Center), U24DK112349, U24DK112342, U24DK112340, U24DK112341, U24DK112326, U24DK112331, U24DK112348 (Chemical Analysis Sites), U01AR071133, U01AR071130, U01AR071124-01, U01AR071128, U01AR071150, U01AR071160, U01AR071158 (Clinical Centers), U24AR071113 (Consortium Coordinating Center), U01AG055133, U01AG055137, and U01AG055135 (PASS/Animal Sites). The views expressed are those of the authors and do not necessarily reflect those of the NIH or the Department of Health and Human Services of the United States.

REFERENCES
Amar, D., and Shamir, R. (2014). Constructing module maps for integrated analysis of heterogeneous biological networks. Nucleic Acids Res. 42, 4208–4219.
Amar, D., Yekutieli, D., Maron-Katz, A., Hendler, T., and Shamir, R. (2015). A hierarchical Bayesian model for flexible module discovery in three-way time-series data. Bioinformatics 31, i17–i26.
Barrachina, M.N., Calderón-Cruz, B., Fernandez-Rocca, L., and García, Á. (2019). Application of Extracellular Vesicles Proteomics to Cardiovascular Disease: Guidelines, Data Analysis, and Future Perspectives. Proteomics 19, e1800247.
Barrès, R., Yan, J., Egan, B., Treebak, J.T., Rasmussen, M., Fritz, T., Caidahl, K., Krook, A., O’Gorman, D.J., and Zierath, J.R. (2012). Acute exercise remodels promoter methylation in human skeletal muscle. Cell Metab. 15, 405–411.
Bolster, D.R., Kubica, N., Crozier, S.J., Williamson, D.L., Farrell, P.A., Kimball, S.R., and Jefferson, L.S. (2003). Immediate response of mammalian target of rapamycin (mTOR)-mediated signalling following acute resistance exercise in rat skeletal muscle. J. Physiol. 553, 213–220.
Booth, F.W., Roberts, C.K., Thyfault, J.P., Ruegsegger, G.N., and Toedebusch, R.G. (2017). Role of Inactivity in Chronic Diseases: Evolutionary Insight and Pathophysiological Mechanisms. Physiol. Rev. 97, 1351–1402.
Bouchard, C., Rankinen, T., and Timmons, J.A. (2011). Genomics and genetics in the biology of adaptation to exercise. Compr. Physiol. 1, 1603–1648.
Brandes, N., Schmitt, S., and Jakob, U. (2009). Thiol-based redox switches in eukaryotic proteins. Antioxid. Redox Signal. 11, 997–1014.
Burniston, J.G. (2008). Changes in the rat skeletal muscle proteome induced by moderate-intensity endurance exercise. Biochim. Biophys. Acta 1784, 1077–1086.
Cantor, R.M., Lange, K., and Sinsheimer, J.S. (2010). Prioritizing GWAS results: A review of statistical methods and recommendations for their application. Am. J. Hum. Genet. 86, 6–22.
Carter, A.C., Chang, H.Y., Church, G., Dombkowski, A., Ecker, J.R., Gil, E., Giresi, P.G., Greely, H., Greenleaf, W.J., Hacohen, N., et al. (2017). Challenges and recommendations for epigenomics in precision health. Nat. Biotechnol. 35, 1128–1132.
Choudhary, C., Weinert, B.T., Nishida, Y., Verdin, E., and Mann, M. (2014). The growing landscape of lysine acetylation links metabolism and cell signalling. Nat. Rev. Mol. Cell Biol. 15, 536–550.
Coggan, A.R., Kohrt, W.M., Spina, R.J., Bier, D.M., and Holloszy, J.O. (1990). Endurance training decreases plasma glucose turnover and oxidation during moderate-intensity exercise in men. J. Appl. Physiol. 68, 990–996.
Cowen, L., Ideker, T., Raphael, B.J., and Sharan, R. (2017). Network propagation: a universal amplifier of genetic associations. Nat. Rev. Genet. 18, 551–562.
Cox, J., and Mann, M. (2011). Quantitative, high-resolution proteomics for data-driven systems biology. Annu. Rev. Biochem. 80, 273–299.
Dettmer, K., Aronov, P.A., and Hammock, B.D. (2007). Mass spectrometry-based metabolomics. Mass Spectrom. Rev. 26, 51–78.
Egan, B., and Zierath, J.R. (2013). Exercise metabolism and the molecular regulation of skeletal muscle adaptation. Cell Metab. 17, 162–184.
Emmerich, C.H., Schmukle, A.C., and Walczak, H. (2011). The emerging role of linear ubiquitination in cell signaling. Sci. Signal. 4, re5.
Farinatti, P.T.V., and Castinheiras Neto, A.G. (2011). The effect of between-set rest intervals on the oxygen uptake during and after resistance exercise sessions performed with large- and small-muscle mass. J. Strength Cond. Res. 25, 3181–3190.
Fukai, K., Harada, S., Iida, M., Kurihara, A., Takeuchi, A., Kuwabara, K., Sugiyama, D., Okamura, T., Akiyama, M., Nishiwaki, Y., et al. (2016). Metabolic Profiling of Total Physical Activity and Sedentary Behavior in Community-Dwelling Men. PLoS ONE 11, e0164877.
Gallant, A., Leiserson, M.D.M., Kachalov, M., Cowen, L.J., and Hescott, B.J. (2013). Genecentric: a package to uncover graph-theoretic structure in high-throughput epistasis data. BMC Bioinformatics 14, 23.
Gollnick, P.D., Armstrong, R.B., Saltin, B., Saubert, C.W., 4th, Sembrowich, W.L., and Shepherd, R.E. (1973). Effect of training on enzyme activity and fiber composition of human skeletal muscle. J. Appl. Physiol. 34, 107–111.
Hawley, J.A., Hargreaves, M., Joyner, M.J., and Zierath, J.R. (2014). Integrative biology of exercise. Cell 159, 738–749.
Heaney, L.M., Deighton, K., and Suzuki, T. (2017). Non-targeted metabolomics in sport and exercise science. J. Sports Sci. 37, 959–967.
Hoffman, N.J., Parker, B.L., Chaudhuri, R., Fisher-Wellman, K.H., Kleinert, M., Humphrey, S.J., Yang, P., Holliday, M., Trefely, S., Fazakerley, D.J., et al. (2015). Global Phosphoproteomic Analysis of Human Skeletal Muscle Reveals a Network of Exercise-Regulated Kinases and AMPK Substrates. Cell Metab. 22, 922–935.
Hofree, M., Shen, J.P., Carter, H., Gross, A., and Ideker, T. (2013). Network-based stratification of tumor mutations. Nat. Methods 10, 1108–1115.
Hunter, T. (1995). Protein kinases and phosphatases: the yin and yang of protein phosphorylation and signaling. Cell 80, 225–236.
Jo, K., Jung, I., Moon, J.H., and Kim, S. (2016). Influence maximization in time bounded network identifies transcription factors regulating perturbed pathways. Bioinformatics 32, i128–i136.
Kjaer, M., Kiens, B., Hargreaves, M., and Richter, E.A. (1991). Influence of active muscle mass on glucose homeostasis during exercise in humans. J. Appl. Physiol. 71, 552–557.
Kleinert, M., Parker, B.L., Jensen, T.E., Raun, S.H., Pham, P., Han, X., James, D.E., Richter, E.A., and Sylow, L. (2018). Quantitative proteomic characterization of cellular pathways associated with altered insulin sensitivity in skeletal muscle following high-fat diet feeding and exercise training. Sci. Rep. 8, 10723.
Leońska-Duniec, A., Ahmetov, I.I., and Zmijewski, P. (2016). Genetic variants influencing effectiveness of exercise training programmes in obesity - an overview of human studies. Biol. Sport 33, 207–214.
Lewis, G.D., Farrell, L., Wood, M.J., Martinovic, M., Arany, Z., Rowe, G.C., Souza, A., Cheng, S., McCabe, E.L., Yang, E., et al. (2010). Metabolic signatures of exercise in human plasma. Sci. Transl. Med. 2, 33ra37.
Magherini, F., Abruzzo, P.M., Puglia, M., Bini, L., Gamberi, T., Esposito, F., Veicsteinas, A., Marini, M., Fiorillo, C., Gulisano, M., and Modesti, A. (2012). Proteomic analysis and protein carbonylation profile in trained and untrained rat muscles. J. Proteomics 75, 978–992.
Mertins, P., Tang, L.C., Krug, K., Clark, D.J., Gritsenko, M.A., Chen, L., Clauser, K.R., Clauss, T.R., Shah, P., Gillette, M.A., et al. (2018). Reproducible workflow for multiplexed deep-scale proteome and phosphoproteome analysis of tumor tissues by liquid chromatography-mass spectrometry. Nat. Protoc. 13, 1632–1661.
Mulla, N.A., Simonsen, L., and Bülow, J. (2000). Post-exercise adipose tissue and skeletal muscle lipid metabolism in humans: the effects of exercise intensity. J. Physiol. 524, 919–928.
Neufer, P.D., Bamman, M.M., Muoio, D.M., Bouchard, C., Cooper, D.M., Goodpaster, B.H., Booth, F.W., Kohrt, W.M., Gerszten, R.E., Mattson, M.P., et al. (2015). Understanding the Cellular and Molecular Mechanisms of Physical Activity-Induced Health Benefits. Cell Metab. 22, 4–11.
Nicholson, J.K., and Wilson, I.D. (2003). Opinion: understanding ‘global’ systems biology: metabonomics and the continuum of metabolism. Nat. Rev. Drug Discov. 2, 668–676.
Nieman, D.C., Gillitt, N.D., Knab, A.M., Shanely, R.A., Pappan, K.L., Jin, F., and Lila, M.A. (2013). Influence of a polyphenol-enriched protein powder on exercise-induced inflammation and oxidative stress in athletes: a randomized trial using a metabolomics approach. PLoS ONE 8, e72215.
Nieman, D.C., Shanely, R.A., Luo, B., Meaney, M.P., Dew, D.A., and Pappan, K.L. (2014). Metabolomics approach to assessing plasma 13- and 9-hydroxy-octadecadienoic acid and linoleic acid metabolite responses to 75-km cycling. Am. J. Physiol. Regul. Integr. Comp. Physiol. 307, R68–R74.
Pacheco, C., Felipe, S.M.D.S., Soares, M.M.D.C., Alves, J.O., Soares, P.M., Leal-Cardoso, J.H., Loureiro, A.C.C., Ferraz, A.S.M., de Carvalho, D.P., and Ceccatto, V.M. (2018). A compendium of physical exercise-related human genes: an ’omic scale analysis. Biol. Sport 35, 3–11.
Pedersen, B.K., and Febbraio, M.A. (2012). Muscles, exercise and obesity: skeletal muscle as a secretory organ. Nat. Rev. Endocrinol. 8, 457–465.
Phillips, S.M., Tipton, K.D., Aarsland, A., Wolf, S.E., and Wolfe, R.R. (1997). Mixed muscle protein synthesis and breakdown after resistance exercise in humans. Am. J. Physiol. 273, E99–E107.
Radom-Aizik, S., and Cooper, D.M. (2016). Bridging the Gaps: the Promise of Omics Studies in Pediatric Exercise Research. Pediatr. Exerc. Sci. 28, 194–201.
Radom-Aizik, S., Zaldivar, F., Haddad, F., and Cooper, D.M. (2013). Impact of brief exercise on peripheral blood NK cell gene and microRNA expression in young adults. J. Appl. Physiol. 114, 628–636.
Radom-Aizik, S., Zaldivar, F.P., Jr., Haddad, F., and Cooper, D.M. (2014). Impact of brief exercise on circulating monocyte gene and microRNA expression: implications for atherosclerotic vascular disease. Brain Behav. Immun. 39, 121–129.
Raue, U., Trappe, T.A., Estrem, S.T., Qian, H.-R., Helvering, L.M., Smith, R.C., and Trappe, S. (2012). Transcriptome signature of resistance exercise adaptations: mixed muscle and fiber type specific profiles in young and old adults. J. Appl. Physiol. 112, 1625–1636.
Romijn, J.A., Coyle, E.F., Sidossis, L.S., Gastaldelli, A., Horowitz, J.F., Endert, E., and Wolfe, R.R. (1993). Regulation of endogenous fat and carbohydrate metabolism in relation to exercise intensity and duration. Am. J. Physiol. 265, E380–E391.
Lindholm, M.E., Marabita, F., Gomez-Cabrero, D., Rundqvist, H., Ekström, T.J., Tegnér, J., and Sundberg, C.J. (2014). An integrative analysis reveals coordinated reprogramming of the epigenome and the transcriptome in human
skeletal muscle after training. Epigenetics 9, 1557–1569. Rönn, T., and Ling, C. (2013). Effect of exercise on DNA methylation and meta-
Ling, C., and Rönn, T. (2014). Epigenetic adaptation to regular exercise in hu- bolism in human adipose tissue and skeletal muscle. Epigenomics 5, 603–605.
mans. Drug Discov. Today 19, 1015–1018. Rönn, T., Volkov, P., Tornberg, A., Elgzyri, T., Hansson, O., Eriksson, K.-F.,
Loos, R.J.F., Hagberg, J.M., Pérusse, L., Roth, S.M., Sarzynski, M.A., Wolf- Groop, L., and Ling, C. (2014). Extensive changes in the transcriptional profile
arth, B., Rankinen, T., and Bouchard, C. (2015). Advances in exercise, fitness, of human adipose tissue including genes involved in oxidative phosphorylation
and performance genomics in 2014. Med. Sci. Sports Exerc. 47, 1105–1112. after a 6-month exercise intervention. Acta Physiol. (Oxf.) 211, 188–200.
Louis, E., Raue, U., Yang, Y., Jemiolo, B., and Trappe, S. (2007). Time course Safdar, A., and Tarnopolsky, M.A. (2018). Exosomes as Mediators of the Sys-
of proteolytic, cytokine, and myostatin gene expression after acute exercise in temic Adaptations to Endurance Exercise. Cold Spring Harb. Perspect. Med.
human skeletal muscle. J. Appl. Physiol. 103, 1744–1751. 8, a029827.
Cell 181, June 25, 2020 1473

ll
Perspective
Schulz, M.H., Devanny, W.E., Gitter, A., Zhong, S., Ernst, J., and Bar-Joseph, Warburton, D.E.R., Nicol, C.W., and Bredin, S.S.D. (2006). Health benefits of
Z. (2012). DREM 2.0: Improved reconstruction of dynamic regulatory networks physical activity: the evidence. CMAJ 174, 801–809.
from time-series expression data. BMC Syst. Biol. 6, 104. Whitham, M., Parker, B.L., Friedrichsen, M., Hingst, J.R., Hjorth, M., Hughes,
Smilde, A.K., van der Werf, M.J., Bijlsma, S., van der Werff-van der Vat, B.J.C., W.E., Egan, C.L., Cron, L., Watt, K.I., Kuchel, R.P., et al. (2018). Extracellular
and Jellema, R.H. (2005). Fusion of mass spectrometry-based metabolomics Vesicles Provide a Means for Tissue Crosstalk during Exercise. Cell Metab.
data. Anal. Chem. 77, 6729–6736. 27, 237–251.e4.
Sollanek, K.J., Burniston, J.G., Kavazis, A.N., Morton, A.B., Wiggs, M.P., Ahn, Wild, C.P. (2005). Complementing the genome with an ‘‘exposome’’: the
B., Smuder, A.J., and Powers, S.K. (2017). Global Proteome Changes in the outstanding challenge of environmental exposure measurement in molecular
Rat Diaphragm Induced by Endurance Exercise Training. PLoS ONE 12, epidemiology. Cancer Epidemiol. Biomarkers Prev. 14, 1847–1850.
e0171007. Wilkinson, M.D., Dumontier, M., Aalbersberg, I.J.J., Appleton, G., Axton, M.,
Stanford, K.I., and Goodyear, L.J. (2018). Muscle-Adipose Tissue Cross Talk. Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L.B., Bourne, P.E.,
Cold Spring Harb. Perspect. Med. 8, a029801. et al. (2016). The FAIR Guiding Principles for scientific data management
Steensberg, A., van Hall, G., Osada, T., Sacchetti, M., Saltin, B., and Klarlund and stewardship. Sci. Data 3, 160018.
Pedersen, B. (2000). Production of interleukin-6 in contracting human skeletal Xiao, Q., Moore, S.C., Keadle, S.K., Xiang, Y.-B., Zheng, W., Peters, T.M.,
muscles can account for the exercise-induced increase in plasma interleukin- Leitzmann, M.F., Ji, B.-T., Sampson, J.N., Shu, X.-O., and Matthews, C.E.
6. J. Physiol. 529, 237–242. (2016). Objectively measured physical activity and plasma metabolomics in
Timmons, J.A., Knudsen, S., Rankinen, T., Koch, L.G., Sarzynski, M., Jensen, the Shanghai Physical Activity Study. Int. J. Epidemiol. 45, 1433–1444.
T., Keller, P., Scheele, C., Vollaard, N.B.J., Nielsen, S., et al. (2010). Using mo- Yang, Y., Creer, A., Jemiolo, B., and Trappe, S. (2005). Time course of
lecular classification to predict gains in maximal aerobic capacity following myogenic and metabolic gene expression in response to acute exercise in hu-
endurance exercise training in humans. J. Appl. Physiol. 108, 1487–1496. man skeletal muscle. J. Appl. Physiol. 98, 1745–1752.
1474 Cell 181, June 25, 2020

Article
Host-Viral Infection Maps Reveal Signatures of Severe COVID-19 Patients
Authors
Pierre Bost, Amir Giladi, Yang Liu, ..., Benno Schwikowski, Zheng Zhang, Ido Amit
Correspondence
benno@pasteur.fr (B.S.),
ido.amit@weizmann.ac.il (I.A.),
zhangzheng1975@aliyun.com (Z.Z.)
In Brief
A computational framework that allows
for the identification and characterization
of virus-infected cells as well as
bystander cell responses reveals how
SARS-CoV-2 alters the immune
responses of patients.
Highlights
• Viral-Track: a computational framework to analyze host-viral infection maps
• Viral-Track sorts infected from bystander cells and reveals virus-induced expression
• SARS-CoV-2 infects epithelial cells and alters immune landscape in severe patients
• Co-infection of SARS-CoV-2 and hMPV affects monocytes and dampens interferon response
Bost et al., 2020, Cell 181, 1475–1488
June 25, 2020 © 2020 Elsevier Inc.
https://doi.org/10.1016/j.cell.2020.05.006
Article
Host-Viral Infection Maps Reveal Signatures
of Severe COVID-19 Patients
Pierre Bost,1,2,3,6 Amir Giladi,1,6 Yang Liu,4,6 Yanis Bendjelal,2 Gang Xu,4 Eyal David,1 Ronnie Blecher-Gonen,1
Merav Cohen,1 Chiara Medaglia,1 Hanjie Li,1 Aleksandra Deczkowska,1 Shuye Zhang,5 Benno Schwikowski,2,*
Zheng Zhang,4,* and Ido Amit1,7,*
1Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
2Systems Biology Group, Department of Computational Biology and USR 3756, Institut Pasteur and CNRS, Paris 75015, France
3Sorbonne Université, Complexité du vivant, Paris 75005, France
4Institute for Hepatology, National Clinical Research Center for Infectious Disease, Shenzhen Third People’s Hospital, School of Medicine,
Southern University of Science and Technology, Shenzhen 518112, Guangdong Province, China
5Shanghai Public Health Clinical Center and Institute of Biomedical Sciences, Fudan University, Shanghai 201508, China
7Lead Contact
*Correspondence: benno@pasteur.fr (B.S.), zhangzheng1975@aliyun.com (Z.Z.), ido.amit@weizmann.ac.il (I.A.)

SUMMARY
Viruses are a constant threat to global health, as highlighted by the current COVID-19 pandemic. Currently, a lack of data on how the human host interacts with viruses, including the SARS-CoV-2 virus, limits effective therapeutic intervention. We introduce Viral-Track, a computational method that globally scans unmapped single-cell RNA sequencing (scRNA-seq) data for the presence of viral RNA, enabling transcriptional cell sorting of infected versus bystander cells. We demonstrate the sensitivity and specificity of Viral-Track to systematically detect viruses from multiple models of infection, including hepatitis B virus, in an unsupervised manner. Applying Viral-Track to bronchoalveolar lavage samples from severe and mild COVID-19 patients reveals a dramatic impact of the virus on the immune system of severe patients compared to mild cases. Viral-Track detects an unexpected co-infection with human metapneumovirus, present mainly in monocytes perturbed in type-I interferon (IFN) signaling. Viral-Track provides a robust technology for dissecting the mechanisms of viral infection and pathology.
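As a rough illustration of the "transcriptional cell sorting" idea in the Summary, the sketch below counts distinct viral UMIs per cell from already-aligned reads and labels each cell as infected or bystander. It is not the Viral-Track implementation (which is R-based): the `(cell_barcode, umi, reference_name)` record layout, the function names, and the fixed `min_umis` cutoff are all assumptions made for this example, and the article itself describes an automated threshold rather than a fixed one.

```python
from collections import defaultdict


def count_viral_umis(reads, viral_refs):
    """Count distinct UMIs per cell among reads mapped to a viral reference.

    reads: iterable of (cell_barcode, umi, reference_name) tuples
           (hypothetical record layout, chosen for illustration).
    viral_refs: set of reference names treated as viral.
    """
    umis = defaultdict(set)
    for cell, umi, ref in reads:
        if ref in viral_refs:
            # PCR duplicates share a UMI, so storing them in a set collapses them.
            umis[cell].add((ref, umi))
    return {cell: len(seen) for cell, seen in umis.items()}


def label_cells(all_cells, viral_umi_counts, min_umis=1):
    """Label cells with a fixed viral-UMI cutoff (an assumed, simplified rule)."""
    return {
        cell: "infected" if viral_umi_counts.get(cell, 0) >= min_umis else "bystander"
        for cell in all_cells
    }


# Toy input: NC_003977 is the HBV RefSeq accession quoted in the article.
reads = [
    ("AAAC", "u1", "NC_003977"),
    ("AAAC", "u1", "NC_003977"),  # PCR duplicate of the read above
    ("AAAC", "u2", "NC_003977"),
    ("CCGT", "u7", "host_chr1"),  # host read, ignored
    ("GGTA", "u3", "NC_003977"),
]
counts = count_viral_umis(reads, viral_refs={"NC_003977"})
labels = label_cells(["AAAC", "CCGT", "GGTA"], counts, min_umis=2)
```

Counting UMIs rather than raw reads keeps amplification bias out of the per-cell viral load; the cutoff would need tuning per dataset, for example to guard against ambient viral RNA.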
INTRODUCTION

The development of efficient vaccines against viral pathogens is considered one of the biggest achievements of modern medicine and has significantly contributed to the increase in life expectancy worldwide. However, no vaccines exist for many life-threatening viruses such as HIV (Burton, 2019), Zika virus (Pierson and Diamond, 2018), or hepatitis C virus (HCV) (Bailey et al., 2019). Additionally, efficient broad-spectrum antiviral drugs are still missing, making infectious diseases a significant challenge for modern health systems. Viruses can also trigger or fuel non-infectious diseases such as cancer (Young and Rickinson, 2004) and are suspected to contribute to various other chronic diseases such as Alzheimer disease (Itzhaki, 2018) and various auto-immune disorders (Münz et al., 2009). The recent emergence of highly pathogenic viruses such as the Ebola virus and the emerging SARS-CoV-2 pandemic recalls the constant threat that viruses represent to global health. So far, the SARS-CoV-2 pandemic has caused a global financial and social catastrophe and is expected to make a significant long-lasting impact on human health (Zhu et al., 2020). Despite intensive research efforts, little is known thus far regarding the interaction of the SARS-CoV-2 virus with the human host and, as a consequence, no efficient treatment has been designed so far (Chen et al., 2020). Moreover, only few therapeutic targets have been identified, highlighting the urgency to develop additional strategies to dissect the virus-host interactions.
Single-cell RNA sequencing (scRNA-seq) is an emerging technology that has been extensively used to study several complex diseases, including cancer (Li et al., 2019), neurodegeneration (Keren-Shaul et al., 2017), and auto-immune (Zhang et al., 2019) and metabolic diseases (Jaitin et al., 2019), providing new insights and revealing new therapeutic targets and strategies (Yofe et al., 2020). In the context of infectious diseases, scRNA-seq studies identified the underlying cells and pathways interacting with various pathogens (Drayman et al., 2019; Shnayder et al., 2018; Steuerman et al., 2018; Zanini et al., 2018). During the immune response to a pathogen, a limited number of antigen-positive or infected cells initiate and modulate the host immune response (Blecher-Gonen et al., 2019), while most of the tissue response is propagated through cytokines, such as type I interferon (IFN) signaling, to bystander, uninfected cells. It is therefore essential to develop new analytical tools to identify the rare infected cells
in order to better understand complex host-virus interactions underlying these pathologies. Multiple experimental tools have been developed over the years to track virus-infected cells in vivo, characterize the cellular state of the infected cells, and differentiate them from their bystander neighbors. These include fluorescently labeled pathogens or pathogens expressing fluorescent proteins (De Baets et al., 2015; Blecher-Gonen et al., 2019), as well as reporter mice (Lienenklaus et al., 2009). However, in the case of human clinical samples, these tools are limited, making the pathogen-infected cells and viral reservoir cell types hard to detect.
Viruses exploit their host cells to first express viral genes, optimize the cellular environment, and then fully activate the viral replication program. Because scRNA-seq technologies rely on polyadenylated RNA isolation and amplification, current scRNA-seq methods can, in theory, detect these viral RNA programs and therefore enable accurate identification of the bona fide infected cells and their unique properties at single-cell resolution. While such an approach has already been used to study both in vitro (Drayman et al., 2019; Shnayder et al., 2018) and in vivo infection models (Steuerman et al., 2018), no general computational framework has been developed to detect viruses and analyze host-viral maps in clinical samples. Here, we present a new computational tool, called Viral-Track, that is designed to systematically scan for viral RNA in scRNA-seq data of physiological viral infections using a direct mapping strategy. Viral-Track performs comprehensive mapping of scRNA-seq data onto a large database of known viral genomes, providing precise annotation of the cell types associated with viral infections. Integrating these data with the host transcriptome enables transcriptional sorting and differential profiling of the viral-infected cells compared to bystander cells. Using a new statistical approach for differential gene expression between infected and bystander cells, we are able to recover virus-induced programs and reveal key host factors required for viral replication. Viral-Track is able to annotate the viral program with high accuracy and sensitivity, as we demonstrate in several in vivo mouse models of infection, as well as human samples of hepatitis B virus (HBV) infection. Applying Viral-Track on bronchoalveolar lavage (BAL) samples from moderate and severe COVID-19 patients, we reveal the infection landscape of SARS-CoV-2 and its interaction with the host tissue. Our analysis shows a dramatic impact of the SARS-CoV-2 virus on the immune system of severe patients, compared to mild cases, including replacement of the tissue-resident alveolar macrophages with recruited inflammatory monocytes, neutrophils, and macrophages and an altered CD8+ T cell cytotoxic response. We find that SARS-CoV-2 mainly infects the epithelial and macrophage subsets. In addition, Viral-Track detects an unexpected co-infection of the human metapneumovirus in one of the severe patients. This study establishes Viral-Track as a broadly applicable tool for dissecting mechanisms of viral infections, including identification of the cellular and molecular signatures involved in virus-induced pathologies.

RESULTS

Viral-Track: An Unsupervised Pipeline for Characterization of Viral Infections in scRNA-Seq Data
All scRNA-seq computational packages implement a pipeline that initially aligns the sequenced reads to the expressed part of a reference host genome of the relevant profiled organism. Irrelevant reads, representing other organisms, primers, adaptors, template switching oligonucleotides, and other contaminants are then commonly discarded. We reasoned that during infection, and likely many other pathological processes, these reads can potentially carry valuable information about viral RNA that is discarded in this filtering step. In order to efficiently detect viral reads from raw scRNA-seq data in an unsupervised manner, we developed Viral-Track, an R-based computational pipeline (Figure 1A; STAR Methods). Briefly, Viral-Track relies on the STAR aligner (Dobin et al., 2013) to map the reads of scRNA-seq data to both the host reference genome and an extensive list of high-quality viral genomes (Stano et al., 2016). Because viral reads are highly repetitive and generate substantial sequencing artifacts, the viral genomes identified in Viral-Track with a sufficient number of mapped reads are then filtered, based on read mapping quality, nucleotide composition, sequence complexity, and genome coverage, to limit the occurrence of false-positives (STAR Methods). Due to the lack of high-quality viral genome annotations, Viral-Track includes de novo transcriptome assembly of the identified viruses using StringTie (Pertea et al., 2015). Finally, viral reads are demultiplexed, quantified using unique molecular identifiers (UMI), and assigned to unique viral transcripts and cells (Figures 1A and S1A). The Viral-Track algorithm has been designed to robustly handle various types of scRNA-seq datasets, as illustrated below, and is publicly accessible at https://github.com/PierreBSC/Viral-Track.
In order to evaluate the specificity and sensitivity of Viral-Track, we benchmarked Viral-Track on several scRNA-seq datasets (Table S1). These datasets include a large number of experiments we conducted, as well as published studies, that span several tissues (lung, spleen, liver, and lymph node) and a wide range of viruses: influenza A, lymphocytic choriomeningitis virus (LCMV), vesicular stomatitis virus (VSV), herpes simplex virus 1 (HSV-1), human immunodeficiency virus (HIV), and HBV. We first evaluated mouse lungs infected in vivo by influenza A virus and sequenced using MARS-seq2.0 (Keren-Shaul et al., 2019; Steuerman et al., 2018). Viral-Track analysis specifically detected the 8 distinct influenza A viral segments (NC_002016 to NC_002023 Refseq nucleotide sequences) from the specific infecting strain (H1N1 Puerto Rico 8 strain) (Figure 1B). We performed transcriptome assembly to test the feasibility of reconstructing the viral transcriptome from 3′-enriched scRNA-seq data. The results were highly coherent with the current knowledge of influenza A transcriptome, exemplified by Viral-Track’s ability to identify documented spliced transcript structures with single-nucleotide precision. For instance, we identified the exact location of the key splicing site on segment 7 that gives rise to M2 transcript and links nucleotides 51 and 740 (Dubois et al., 2014) (Figure 1C). Quantification of the number of viral reads across different experimental conditions was consistent with current knowledge of the disease, with lung stromal cells of non-immune lineages (CD45−) exhibiting a significantly higher viral load compared to immune cells (CD45+) (p = 0.039, two-tailed Welch’s t test) (Figure 1D).
As inbred mice lack the influenza-specific restriction factor Mx1, influenza A infection is extremely virulent in inbred mice (Haller et al., 1980). Moreover, all influenza A mRNA are capped
and polyadenylated, making them an optimal substrate for scRNA-seq isolation and amplification protocols. We therefore evaluated the sensitivity and specificity of Viral-Track in a more challenging dataset. In this model, photoactivatable-GFP (PA-GFP) mice were infected with LCMV (Armstrong acute strain), a virus lacking strong poly(A) mRNA signals (Burrell et al., 2017), via injection to the footpad. 72 h post-infection, CD45+ splenic immune cells from different spatial niches (T zone, B zone, marginal zone, and total spleen) were profiled using the NICHE-seq technology (Medaglia et al., 2017). Even though the LCMV viral mRNAs are not polyadenylated, we detected mRNA molecules that were converted to cDNA through priming of the MARS-seq oligo(dT) RT primer, and Viral-Track successfully identified the two viral segments (LCMV segment L [NC_004291] and S [NC_004294]) (Figure S1B), albeit the number of detected reads was an order of magnitude lower than the number observed in influenza A infection (Figure 1E). We detected viral reads in samples from the marginal zone, B zone, and the total spleen, but not in T zone samples, and marginal zone samples exhibited significantly higher viral load compared to B zone and total spleen samples (Figure 1E; p = 0.0067 and 0.0083, respectively, two-tailed Welch’s t test). This observation is in line with the biology of LCMV, which primarily infects macrophages and lymphocytes from the marginal zone of the spleen (Müller et al., 2002).
We next evaluated whether Viral-Track is sensitive to barcode swapping during Illumina-based scRNA-seq (Griffiths et al., 2018), which, in the case of viral RNA detection, can lead to the false assignment of viral reads to uninfected cells. To this end, we infected mice with one of two different viruses, LCMV and VSV, and performed MARS-seq2.0 on CD45+CD19−CD3− non-B/T cells from the auricular draining lymph node 1 day after infection (STAR Methods). All samples were sequenced concurrently to test for cross-sample viral read contamination. For both viruses, Viral-Track was able to identify the correct viral segments (Figures S1C and S1D), with no cross-contamination, evident by the absence of VSV reads detected in the LCMV-infected cells and vice versa (Figure S1E). We further generalized Viral-Track for commonly used scRNA-seq technologies and non-RNA viruses. We applied Viral-Track to scRNA-seq data from a recent publication of human primary cells infected ex vivo with HSV-1, a linear double-stranded DNA virus, generated by the Drop-seq platform (Drayman et al., 2019; Macosko et al., 2015). We found that Viral-Track correctly detected and identified HSV-1 RNA specifically in the infected samples but not in the controls (NC_001806 Refseq nucleotide sequences) (Figures S1F and S1G). Finally, we analyzed scRNA-seq data of CD4+ T cells infected ex vivo with HIV-1 (Bradley et al., 2018), generated using the droplet-based Chromium platform (Zheng et al., 2017). Viral-Track successfully identified HIV as the unique virus present in the infected samples (Figures S1H and S1I), but detected significant amounts of HIV-1 viral reads in one control sample, probably due to ambient contamination (Yang et al., 2020).

Defining the Host Viral Interactions of HBV Using Viral-Track
We further tested Viral-Track’s applicability for detecting viral reads in human clinical samples. For this purpose, we generated scRNA-seq data from a liver biopsy of an untreated hepatitis B patient and analyzed the data using Viral-Track. Viral-Track successfully identified HBV as the only virus present in the sample (Figure 1F) with 18,420 reads assigned to the HBV genome (NC_003977 Refseq sequence). Coverage analysis revealed a strong peak located at the 5′ end of the C gene, encoding for the main core protein, suggesting that the HBV virus is actively producing virions (Figure 1G). We then overlaid the viral data on the host transcriptome to identify infected and bystander populations. A total of 13,803 cells passed a lenient quality control, permitting apoptotic signals that may arise from viral infection. We identified several non-immune cell types (Figure S1J), including hepatocytes (expressing ALB and APOA2), as well as hepatocytes showing apoptotic signatures (ALB with high expression of mitochondrial genes), sinusoidal endothelial cells (FCN2), and epithelial cells (KRT7). We also observed several subsets of immune cells such as B cells (MS4A1), plasma cells (MZB1), conventional dendritic cells 1 (cDC1; XCR1),
Figure 1. Viral-Track Retrieves Viral Reads in a Variety of Tissues, Viral Strains, and Sequencing Platforms
(A) Schematics of the Viral-Track approach. Single-cell sequencing data of cells from an infected tissue, containing infected and bystander cells are analyzed by
Viral-Track. Viral-Track maps the sequenced reads to both the host reference genome and a database of viral genomes, overlaying infection status on top of the
host transcriptional landscape.
(B) Results of Viral-Track analysis on scRNA-seq data from influenza A PR8-infected mouse lungs. For each viral segment, represented by a dot, the complexity of
the sequences (measured by entropy, i.e., how repetitive are the mapped sequences) and the percentage of the segment that is mapped are plotted. Dark red
dots correspond to viral segments of the influenza A PR8 strain and yellow dots to segments belonging to other H1N1 influenza strains. Viral segments with more
than 50 mapped reads are plotted.
(C) Coverage plot of the influenza A segment NC_002016 (influenza A PR8 segment 7), M2 transcript location estimated using StringTie is shown below with the
splicing site position.
(D) Quantification of the number of reads assigned to influenza viral segments across experimental settings. Each dot corresponds to a technical replicate (384-well plate). Two-tailed Welch’s t test was used to compare viral load between CD45− and CD45+ cells (p = 0.039).
(E) Quantification of the number of reads assigned to LCMV viral segments in the different zones of the spleen. Each dot corresponds to a technical replicate (384-well plate). Two-tailed Welch’s t test was used to compare viral load between cells from the infected marginal zone and cells from the B zone or the whole spleen (p = 0.0067 and 0.0083, respectively).
(F) Result of Viral-Track analysis on scRNA-seq data from an HBV patient. For each viral segment, represented by a dot, the entropy of the sequence and the percentage of the segment that is mapped are plotted. Green dots correspond to viral segments that passed quality control. Viral segments with more than 50 mapped reads are plotted.
(G) Coverage plot of the HBV genome. Locations of the different viral genes from NCBI database are depicted at the bottom.
(H) Enrichment of infected cells across hepatic cell subsets (left panel); red line corresponds to an enrichment of one. Distribution of the number of HBV UMIs per
cell in each cell subset (right panel).
See also Figure S1.
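Panels (B) and (F) place each viral segment by two quantities: the entropy of its mapped sequences and the fraction of the segment that is covered. A minimal sketch of how such a two-axis filter could be computed is given below. The entropy definition (Shannon entropy of pooled nucleotide composition, in bits) and the `min_entropy` and `min_covered` thresholds are illustrative assumptions, not the values used by Viral-Track; only the "more than 50 mapped reads" cutoff comes from the legend.

```python
import math
from collections import Counter


def shannon_entropy(sequences):
    """Shannon entropy (bits) of the pooled nucleotide composition.

    Ranges from 0 (a single repeated base) to 2 (A, C, G, T equally frequent).
    """
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def fraction_covered(alignments, segment_length):
    """Fraction of segment positions covered by at least one read.

    alignments: iterable of (start, end) intervals, 0-based and half-open.
    """
    covered = set()
    for start, end in alignments:
        covered.update(range(max(start, 0), min(end, segment_length)))
    return len(covered) / segment_length


def passes_filter(sequences, alignments, segment_length,
                  min_entropy=1.5, min_covered=0.1, min_reads=50):
    """Keep a segment only if it has enough reads, complexity, and coverage.

    min_entropy and min_covered are placeholder thresholds (assumptions).
    """
    return (
        len(sequences) > min_reads
        and shannon_entropy(sequences) >= min_entropy
        and fraction_covered(alignments, segment_length) >= min_covered
    )
```

Low-complexity reads (for example poly-A stretches) pile up on a narrow region of an unrelated genome, so they score low on both axes; a genuinely present virus tends to score high on both.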
Figure 2. Viral-Track Identifies Virus-Modified Transcription in Infected Cell Subsets
(A) Distribution of vUMI+ and GFP+ cells across cell types found in the spleen.
(B) Distribution of the Pearson correlation between GFP+ cells, vUMI+, and bystander (GFP−vUMI−) cells. Two-tailed Kruskal-Wallis test.
(C) Number of differentially expressed genes between bystander and infected cells in MZB cells, monocytes, and macrophages.
(D) Top 10 enriched terms identified by Gene Ontology enrichment analysis.
(E) Mean expression of four top differentially expressed genes in bystander and infected MZB cells.
See also Figure S2.
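The gene counts in panel (C) come from a test that the Results describe as data binarization followed by complementary log-log (cloglog) regression, with genes called at Z score > 3. The sketch below is a deliberately reduced version of that idea for one gene and a single infected-versus-bystander covariate; it is an assumption-laden illustration, not the authors' model, which may include further covariates. One common motivation for the cloglog link: if a gene's UMI count is Poisson with rate λ, the probability of detecting it is 1 − exp(−λ), so cloglog of the detection probability equals log λ, and a difference on the cloglog scale reads as a log fold change in rate.

```python
import math


def cloglog(p):
    """Complementary log-log link: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))


def cloglog_z(detected_infected, n_infected, detected_bystander, n_bystander):
    """Effect size and Z score for one gene, infected versus bystander cells.

    Inputs are binarized data: the number of cells in each group in which the
    gene was detected (count > 0) and the group sizes. With a single binary
    covariate the regression is saturated, so the fitted coefficient is the
    difference of the cloglog-transformed detection rates; standard errors
    use the delta method. Requires 0 < detected < n in both groups.
    """
    def link_and_variance(k, n):
        p = k / n
        variance = p / (n * (1.0 - p) * math.log(1.0 - p) ** 2)
        return cloglog(p), variance

    eta_inf, var_inf = link_and_variance(detected_infected, n_infected)
    eta_bys, var_bys = link_and_variance(detected_bystander, n_bystander)
    beta = eta_inf - eta_bys
    return beta, beta / math.sqrt(var_inf + var_bys)


# Toy numbers: gene detected in 60 of 100 infected and 30 of 100 bystander cells.
beta, z = cloglog_z(60, 100, 30, 100)
is_upregulated = z > 3  # threshold quoted in the Results
```

Binarizing discards expression magnitude but makes the test insensitive to a few cells with very high counts, which can matter for sparse scRNA-seq data.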
plasmacytoid dendritic cells (pDCs) (TCF4), and three different macrophage subsets (expressing TREM2, CD163, and FCN1, respectively). We observed a large diversity among the lymphocyte compartment with CD8+ T cells (CD8A), Th17 cells (CCR6, IL23A), γδ T cells (TRGC1), activated CD4 T cells (LEF1, OX40), natural killer (NK) cells (NKG7), and a distinct cluster of activated CD8+ T cells (CSF2 and TOX2). We analyzed infected cells using automated thresholding over the viral signal (Figure S1J; STAR Methods). As expected, hepatocytes and apoptotic hepatocytes were strongly enriched among the infected cells (Figures 1H and S1K). Interestingly, we also detected viral reads in non-hepatocyte clusters, including two subsets of macrophages (CD163+ and TREM2+ populations, respectively), the cDC1 subset (XCR1+), as well as endothelial (OIT3+ cells) and epithelial cells (KRT7+) (Figures 1H and S1K). Infection of non-hepatocyte clusters, although with relatively low viral load, is consistent with several studies reporting active infection of macrophages (Faure-Dupuy et al., 2019).
Together, this extensive list of validations demonstrates that Viral-Track is a sensitive and accurate method to detect and identify, in an unsupervised manner, virus strains in diverse scRNA-seq samples, in different tissues, and at varying viral types and loads. Importantly, Viral-Track can be applied to human clinical samples to extract valuable insight into the biology of the host-virus interactions.

Viral-Track Identifies Infected versus Bystander Cells and Uncovers Virus-Induced Pathways
To further evaluate the accuracy of Viral-Track against a well-established model for tracking infection in single cells, we infected mice with a GFP-expressing LCMV virus (LCMV-GFP virus) (Medaglia et al., 2017). We performed MARS-seq on GFP+ splenocytes
and total spleen cells 72 h post-infection and analyzed the sequenced cells (Figures S2A and S2B; STAR Methods). GFP+ cells were enriched for vUMI+ cells compared to total spleen (Figure S2A). We then calculated whether the cells positive for the LCMV-GFP signal (GFP+ cells) were similar to the ones designated by Viral-Track as containing viral UMIs (vUMI+). Following clustering and annotation, we observed similar proportions of GFP+ and vUMI+ cells across cell clusters (Figures 2A and S2C; R = 0.95, p = 9.0 × 10^−12), with monocytes, marginal zone B cells (MZBs), and macrophages being the major infected cell types. We then evaluated the transcriptional signatures within these two sets of cells by computing the Pearson correlation between each pair of cells. We observed a similar distribution of Pearson correlation within the GFP+ and vUMI+ monocyte cells (Figure 2B) that was significantly higher (median correlation of 0.65, 0.64, and 0.51, respectively) than the correlation observed between GFP−vUMI− bystander monocytes. We conclude that Viral-Track correctly identifies a homogeneous set of infected cells from in vivo scRNA-seq samples similar to the one identified by conventional reporter viruses, even in the more difficult scenario in which viral transcripts are poorly polyadenylated.
We next evaluated the ability of Viral-Track to detect host factors associated with virus replication. For this purpose, we developed a statistical method that detects differentially expressed genes based on data binarization and complementary log-log regression (STAR Methods; Methods S1). We used this approach to test for transcriptional differences between bystander and infected cells during spleen LCMV infection across the three main infected cell types: macrophages, MZB cells, and monocytes. We observed that MZB cells were the most influenced by the viral infection, compared to monocytes and macrophages (107, 42, and 3 genes upregulated, respectively, Z score >3) (Figure 2C). We performed Gene Ontology enrichment analysis on the upregulated genes in MZB cells and observed a significant enrichment in several pathways, including “chromosome organization,” “DNA replication,” and “cell cycle,” suggesting that LCMV triggers cell division in MZB cells (Figure 2D). Indeed, LCMV-infected MZB cells exhibited higher levels of cell cycle-related genes such as Smc2 (required for chromatin condensation), Cdc6 (regulator of DNA replication), and Stmn1 (regulator of mitotic spindle) (Figures 2E and S2D), but also fibrillarin (Fbl), a host factor whose expression is required by several viruses (Deffrasnes et al., 2016) (Figure 2E). This is in line with a previous report highlighting the ability of LCMV to trigger an abortive form of cell division blocked in the G1 phase (Beier et al., 2015). Altogether, our results show that Viral-Track is sufficient to detect infected cells in in vivo scRNA-seq data and infer the differential gene expression in infected versus bystander cells.

A Single-Cell Map of SARS-CoV-2 Infection in Mild and Severe Patients
COVID-19 is a viral disease caused by SARS-CoV-2 infection, which has recently been recognized as the cause for a pandemic (Wang et al., 2020a). Little is currently known about the course of the disease and how the virus interacts with the host immune system in its mild and severe manifestations. To gain insights on the infection course in humans, we performed scRNA-seq and Viral-Track analysis on BALF samples from three mild and six severe COVID-19 patients (Liao et al., 2020). In total, 50,615 cells passed quality control and were analyzed using the MetaCell algorithm (Baran et al., 2019) (Figure 3A; STAR Methods). Metacell analysis coarsely grouped the metacells into the myeloid, lymphoid, and epithelial lineages, and each lineage was further subdivided into smaller subsets (Figures 3A, 3B and S3A). Among epithelial cells, we identified epithelial progenitors (expressing SOX4), type II alveolar cells (AT2, expressing SFTPB), ciliated cells (FOXJ1), ionocytes (CFTR), goblet cells (MUC5B), and club cells (SCGB1A1; Figure S3B). Lymphoid cells consisted of several subtypes of CD4+ T cells, including naive CD4+ T cells (expressing CCR7), regulatory T cells (Treg, expressing FOXP3), and T follicular helper cells (Tfh, expressing CXCL13 and PDCD1), but also diverse CD8+ subsets, such as NK cells (NCAM1), resident memory CD8+ T cells (Trm, CD8A, and ZNF683), effector CD8+ T cells (GZMA and GZMK), and cytotoxic CD8+ T cells (GNLY, PRF1), as well as B cells (CD79A; Figure S3C). The myeloid compartment exhibited a high diversity of cell states, including neutrophils (FCGR3B), mast cells (CPA3), alveolar macrophages (FABP4), dendritic cells (DCs; FSCN1), and plasmacytoid DCs (pDC; TCF4) as well as a large diversity of monocytes (FCN1) and monocyte-derived macrophages (SPP1) sub-populations (Figure S3D). These results were robust across different analysis platforms (Liao et al., 2020).
Comparison of the cellular landscape of mild and severe patients revealed key differences in the composition of BAL
Figure 3. scRNA-Seq of 9 COVID-19 Samples Reveals Myeloid Remodeling in Severe Patients

(A) A 2-dimensional visualization of 50,615 single cells from three mild and six severe COVID-19 patients, generated by the MetaCell algorithm. Colors indicate
grouping of cells into 27 subsets, based on transcriptional similarity (Figure S3A).
(B) Quantification of the three main compartments, myeloid, lymphoid, and epithelial, across the three mild (M1–M3) and six severe (S1–S6) patients.
(C) Density plots depicting projection of cells from the mild (left) and severe (right) patients on the 2D map shown in (A).
(D–F) Quantification of the frequency of specific cell subsets in the myeloid (D), lymphoid (E), and epithelial (F) compartments, across the nine patients. Diamond
marks patient S1, co-infected with the human metapneumovirus (Figures 4D–4H). Horizontal lines indicate mean frequency.
(G) Percentage of proliferating cells (determined by thresholding over a cell-cycle-related gene module, detailed in Table S3) in each of 455 metacells, projected
on the 2D map shown in (A).
(H) Quantification of the type I interferon response gene module across 455 metacells, projected on the 2D map shown in (A). Color scale represents log2 fold
change over the median expression of the module across all metacells.
(I) Differential gene expression analysis. Each panel compares pooled gene expression between naive and non-naive CD4+ T cells (left) and effector and cytotoxic
CD8+ T cells (right) cell subsets.
(J) Differential gene expression analysis between cells belonging to AM (left) and SPP1hiC1Qhi macrophages (right) from mild (x axis) and severe (y axis) patients. (I
and J) Values represent log2 size-normalized expression (transcripts per 1,000 UMI).
See also Figure S3.
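
The normalizations named in panels (H) to (J) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the pseudocount `pseudo`, and the regularizer `eps` are hypothetical and are introduced here only to keep the logarithms defined at zero. The caption itself specifies only the units (transcripts per 1,000 UMI) and the reference (median across all metacells).

```python
import math
from statistics import median


def log2_size_normalized(gene_umis, total_umis, scale=1000.0, pseudo=1.0):
    """log2 of a gene's UMIs per `scale` total UMIs in a pooled cell subset.

    `pseudo` is an assumed pseudocount (not taken from the paper) that keeps
    the logarithm defined when the gene has zero UMIs.
    """
    if total_umis <= 0:
        raise ValueError("total_umis must be positive")
    per_scale = gene_umis / total_umis * scale
    return math.log2(per_scale + pseudo)


def module_log2_fold_change(module_fraction, eps=1e-5):
    """log2 fold change of a gene-module score over its median across metacells.

    `module_fraction` maps a metacell identifier to the fraction of its UMIs
    that come from module genes. `eps` is an assumed regularizer.
    """
    reference = median(module_fraction.values())
    return {
        metacell: math.log2((fraction + eps) / (reference + eps))
        for metacell, fraction in module_fraction.items()
    }


if __name__ == "__main__":
    # Toy values, not data from the study.
    print(log2_size_normalized(30, 10_000))
    print(module_log2_fold_change({"mc1": 0.01, "mc2": 0.02, "mc3": 0.04}))
```

Taking the median as the reference means that a metacell with a typical module score maps to a value near zero, so positive values mark metacells with an above-typical interferon response.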
Cell 181, 1475–1488, June 25, 2020 1481

samples (Figures 3B and 3C). We found changes to each of the three compartments (Figures 3D–3F and S3E–S3G). While alveolar macrophages and pDC were enriched in the myeloid compartment in the mild patients, the severe patients’ myeloid cells were characterized by a patient-specific diversity associated with accumulation of neutrophils, FCN1+ monocytes, and monocyte-derived SPP1+ macrophages (Figures 3D and S3E). Additionally, NK cells and naive CCR7+ CD4+ T cells were consistently enriched across severe patients’ BAL, while ZNF683hi CD8+ Trm cells were specific to mild patients (Figures 3E and S3F). We also observed changes in the epithelial compartment, as severe patients exhibited higher numbers of club cells and AT2 cells (Figures 3F and S3G). By investigating expression patterns of shared gene expression programs, we observed that cytotoxic CD8+ cells and the CD4+ Tfh cells are the most proliferative compartments (Figure 3G), while a broad interferon type I response, a hallmark of viral response, is mainly expressed by neutrophils and, to a lesser extent, FCN1+ monocytes (Figure 3H). We next performed in-depth differential gene expression analysis between subsets characteristic of mild or severe patients. We found that CD4+ T cells in the severe patients exhibit a more naive phenotype, expressing higher levels of IL7R, CCR7, S1PR1, and LTB. The CD8+ Trm cell signatures are restricted to the mild patients and have higher levels of the effector molecules XCL1, ITGAE, CXCR6, and ZNF683 (Figure 3I). Comparing gene expression differences in myeloid types between severe and mild patients revealed disease severity-associated upregulation of inflammatory chemokine genes in SPP1+ monocyte-derived macrophage populations (CCL2, CCL3, CCL4, CCL7, and CCL8; Figure 3J), as well as genes associated with hypoxia or oxidative stress (HMOX1 and HIF1A), and downregulation of MHC class II (HLA-A and HLA-DQA1) and type I IFN genes (IFIT1 and OAS1). Alveolar macrophages displayed a severity-associated signature, including upregulation of the chemokines CCL18 and CCL4L2 and the cathepsins CTSL and CTSB (Figure 3J). Together, we identified dramatic differences between the mild and severe COVID-19 patients, including an inflammatory signature and a perturbed immune response associated with the severe manifestation of the COVID-19 disease. These also highlight potential immunotherapy treatment of the severe patients by targeting the hyperinflammatory response that is activated by inflammatory cytokines such as interleukin (IL)-6 and IL-8 (Liu et al., 2019) (Figure S3H).

Viral-Track Identifies Co-infection of SARS-CoV-2 with the Human Metapneumovirus
To characterize the in vivo crosstalk of SARS-CoV-2 with its human host, we applied Viral-Track on the data generated from the nine SARS-CoV-2 patients and the rich cellular landscape we identified. SARS-CoV-2 transcripts were detected in all six severe samples in variable amounts, ranging from less than 400 transcripts to more than 15,000 (Figures 4A and S4A). In contrast, no viral reads were detected in the three mild patients (Figure 4A). Coverage analysis revealed that the majority of the viral reads mapped to the 3′ end of the viral segment and corresponded to positive-stranded RNA (Figure 4B). This is in agreement with the coronavirus transcription: due to a nested transcription process, all genomic and subgenomic RNA molecules share the same 3′ end (Masters, 2006). We then analyzed the enrichment of vUMIs in the cell populations represented in the BAL samples. We observed a strong enrichment of viral reads in the ciliated and epithelial progenitor population, two known cellular targets of the virus, which express the main receptor of the SARS-CoV-2 virus, ACE2, as well as TMPRSS2, a protease essential for SARS-CoV-2 entry (Figures 4C and S4B; Table S2) (Hoffmann et al., 2020). We also observed enrichment of SARS-CoV-2 reads in the SPP1+ macrophage population, suggesting either that SARS-CoV-2 can infect immune cells from the myeloid compartment or that SPP1+ macrophages phagocytose infected cells or viral particles. Differential gene expression analysis between vUMI+ infected and vUMI− bystander SPP1+ macrophages in the patients with the highest viral load revealed that infected macrophages have a higher expression of chemokines (CCL7, CCL8, and CCL18) and APOE, and a lower expression of TAOK1, a serine/threonine-protein kinase in the p38 MAPK cascade (Figure S4C). Interestingly, CD147 (also known as BSG), a potential new SARS-CoV-2 receptor (Wang et al., 2020b), is expressed by all cell types, including immune cells, suggesting alternative routes for the virus to infect these cells.
Often in cases of infectious diseases, the specific infecting virus is not known, or may be accompanied by co-infection with additional unknown viruses. Viral-Track applies an unsupervised mapping strategy and is optimally designed to systematically profile the source of infection or co-infections in human clinical samples. To our surprise, Viral-Track analysis of data from one of the severe patients (S1) revealed the presence of a second virus, the human metapneumovirus (hMPV) (NC_039199 Refseq sequence, Figure 4D) with more than one million reads mapped
Figure 4. Viral-Track Reveals Infection Specificity and a Co-infection in Severe COVID-19

(A) Total number of viral reads mapped to the SARS-CoV-2 viral genome in the profiled COVID-19 patients.
(B) Coverage plot of the SARS-CoV-2 viral genome.
(C) Enrichment of viral UMIs over expected values across 361 metacells, projected on the 2D map shown in Figure 3A. Color scale indicates log2 observed/
expected vUMIs. Only metacells with more than one expected UMI are plotted.
(D) Result of Viral-Track analysis on patient S1. For each viral segment, represented by a dot, the entropy of the sequence (how repetitive the mapped
sequences are) and the percentage of the segment that is mapped are plotted. Green dots correspond to viral segments that have passed quality control. Viral
segments with more than 50 mapped reads are plotted.
(E) Coverage plot of the human metapneumovirus (hMPV) genome.
(F) Distribution of hMPV UMIs across patient S1 sequenced cells. Red dashed line indicates automatic thresholding of vUMI+ cells.
(G) Enrichment of vUMI+ cells over expected values across 297 metacells, projected on the 2D map shown in Figure 3A. Color scale indicates log2 observed/
expected. Only metacells with more than one expected vUMI+ cell are plotted.
(H) Volcano plot showing the relative expression between infected and bystander monocytes of patient S1. Genes that are differentially expressed (>1 log2 fold change) and
statistically significant (p value <0.01) are colored in orange.
See also Figure S4.
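
The observed/expected enrichment plotted in panels (C) and (G) can be illustrated with a short Python sketch. It assumes that the expected number of viral UMIs in a metacell is the total viral UMI count scaled by that metacell's share of all UMIs; the caption does not state how the expectation is computed, so this null model, the function name, the dictionary inputs, and the pseudocount are all hypothetical choices made for illustration.

```python
import math


def viral_enrichment(metacell_umis, metacell_vumis, min_expected=1.0, pseudo=1.0):
    """log2 observed/expected viral UMIs per metacell.

    metacell_umis:  dict, metacell id -> total UMI count (host + viral)
    metacell_vumis: dict, metacell id -> viral UMI count (missing means 0)
    min_expected:   metacells with an expected count at or below this value
                    are skipped, mirroring the "more than one expected UMI"
                    filter described in the caption
    pseudo:         assumed pseudocount so that zero observed vUMIs is finite
    """
    total_umis = sum(metacell_umis.values())
    total_vumis = sum(metacell_vumis.get(mc, 0) for mc in metacell_umis)
    if total_umis <= 0:
        raise ValueError("no UMIs to analyze")

    enrichment = {}
    for metacell, umis in metacell_umis.items():
        # Assumed null model: vUMIs are spread in proportion to metacell size.
        expected = total_vumis * umis / total_umis
        if expected <= min_expected:
            continue
        observed = metacell_vumis.get(metacell, 0)
        enrichment[metacell] = math.log2((observed + pseudo) / (expected + pseudo))
    return enrichment


if __name__ == "__main__":
    # Toy counts, not data from the study.
    umis = {"ciliated": 1000, "t_cell": 3000, "tiny": 10}
    vumis = {"ciliated": 30, "t_cell": 10}
    for metacell, value in viral_enrichment(umis, vumis).items():
        print(metacell, round(value, 2))
```

Filtering on the expected count rather than the observed count avoids reporting extreme ratios for very small metacells, where a single viral UMI would dominate the estimate.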
to hMPV in this specific patient. hMPV is a non-segmented, single-stranded, and negative-sense RNA virus that is responsible for upper and lower respiratory tract infections in mostly young (<5 years) children but can also target the elderly as well as immuno-compromised patients (Panda et al., 2014). hMPV has been implicated as a possible source of co-infection with the original SARS-CoV virus (Chan et al., 2003).
Coverage analysis revealed that most reads fall into the N, P, M, F, M2, SH, G, but not L, genes of hMPV (Figure 4E). We observed a typical pattern of biased scRNA-seq coverage, indicating that the N, P, M, F, M2, SH, and G genes are actively transcribed, and suggesting that the hMPV was active and replicating at the time of sample collection. Analysis of the viral UMI distribution across cells revealed a substantial viral load in a large subset of the cells, spanning hundreds to thousands of vUMIs per infected cell (Figure 4F), independently of the total host UMIs in that cell (Figure S4D). We mapped the infected cells and characterized their distribution across cell types. The infected patient is characterized by high levels of monocytes and CD4+ T cells (Figure S4E). Unlike the SARS-CoV-2 virus infection map, hMPV-infected cells were highly enriched in the monocyte compartment but not in the epithelial and SPP1+ macrophage compartments (Figure 4G).
We tested whether the hMPV could alter the function of the infected monocytes, and therefore influence the course of the disease. Using Viral-Track, we detected a large number of up- and downregulated genes in infected monocytes compared to bystander monocytes (Figure 4H). Interestingly, several key receptor genes required for monocyte activation such as CD16 (FCGR3B), G-CSF receptor (CSF3R), and the formyl peptide receptor (FPR1) were downregulated in the infected compared to the bystander cells. Moreover, we observed a dramatic downregulation of type I interferon signaling and interferon-stimulated genes (ISGs), including viral restriction factors (e.g., IFIT3). A gene set enrichment analysis (Figure S4F) revealed a strong enrichment of interferon response genes in the downregulated gene set, suggesting that the hMPV is strongly downregulating the IFN response pathway. Several anti-inflammatory genes were upregulated, including LILRB4 (a potent inhibitor of monocyte activation) (Lu et al., 2009) and MITF, a transcription factor known to be a critical suppressor of innate immunity (Harris et al., 2018). Last, we observed a positive and significant association between total number of hMPV UMIs and production of type I IFN, highlighting that while hMPV dampens the response to type I IFN, production of this signal is highly restricted to a rare (~1%) population of cells with a high viral load (Figure S4G). Altogether, our analysis described the distribution of SARS-CoV-2-infected cells in patients’ BAL and revealed the presence of a viral co-infection by the hMPV that dampens the immune activation of the monocyte compartment in the infected patient. Further large-scale analyses of mild versus severe patients need to be conducted to better understand if the co-infection is correlated or even causative in SARS-CoV-2 pathology.

DISCUSSION

The virosphere contains hundreds of thousands of species that constantly interact with their host cells. Over the years, several genomic techniques have been developed to detect virus-derived sequences in human samples. For instance, deep sequencing assays are unbiased and sensitive in their ability to detect extremely rare viral sequences (Moustafa et al., 2017), but do not provide information about the infected cells and the cellular changes induced by the infection. Alternatively, it is possible to combine DNA probes with scRNA-seq to enrich for viral sequences and increase the sensitivity of the assay, but this requires prior knowledge of the viruses present in each sample (Zanini et al., 2018). Here, we present Viral-Track, a robust and unsupervised computational pipeline that can detect viral RNA in any scRNA-seq dataset without the need for experimental modifications or prior knowledge of the infecting agent. Viral-Track was benchmarked on data originating from various tissues, infected by viruses with marked differences in their RNA properties, and generated with different scRNA-seq platforms. We demonstrate that Viral-Track can readily provide essential information on infection status in clinical samples, identify infected cells, probe viral-induced transcriptional alterations, and reveal cases of co-infection.
In practice, only 70%–85% of scRNA-seq reads map to the host genome and represent polyadenylated exonic host transcripts, whereas the remainder of the data is usually overlooked in analysis. We show that these unmapped scRNA-seq reads, in pathological human samples, potentially contain valuable information on viral infection and can be effectively used for viral genome assembly. Viral-Track can resolve complex cellular ecosystems perturbed by viral infection and provide an unbiased map of the infected cells, as well as the transcriptional perturbations induced by the virus at the single-cell level. We combine Viral-Track with a novel statistical approach to detect differentially expressed genes from scRNA-seq data, therefore allowing the detection of gene expression changes triggered by viral infection and differentiating them from the more abundant bystander effects, such as type I IFN signaling, at the single-cell level. Further advances will focus on applying Viral-Track on large-scale datasets containing scRNA-seq data from dozens of samples, leading to robust single-cell viral metagenomic studies that characterize the viral evolution and interactions of virus-induced disease mechanisms with host genetics.
Here, we applied scRNA-seq and Viral-Track analysis to COVID-19 patient-derived samples to provide a cellular and viral atlas of the BAL lung cells from COVID-19 patients. This analysis revealed the diversity of the immune responses across COVID-19 patients and between mild and severe patients. We expect that as the pandemic keeps spreading and global research efforts grow, additional scRNA-seq samples from COVID-19 patients will be generated, including patients treated with emerging immunotherapies (Liu et al., 2019). Such an approach might help to solve key questions including the contribution of the humoral response (Iwasaki and Yang, 2020), the role of the IL6 pathway (Herold et al., 2020), and the immune memory induced by the virus (Prompetchara et al., 2020). Viral-Track can contribute to the global effort to identify the different cellular compartments that are targeted and affected by COVID-19 and other viruses and to detect possible co-infection by unexpected viruses. Co-infections are gaining recognition in the scientific and medical community as critical factors in disease prognosis (Zhang et al., 2020). So far, research focused mainly on co-infections of bacterial sources or of well-known viruses such as influenza A (Wu et al., 2020). Understanding the diversity of viral co-infections and their mechanisms of immune suppression at the cellular
and molecular level could therefore provide highly valuable information and lead toward possible therapeutic targets, especially for severe patients, whose treatment options are limited.

Limitations
Viral-Track is a new and powerful tool to decipher host-viral interactions. However, its impact is dependent on several factors, the most critical one being the biochemical and pathophysiological properties of the virus. The absence of a poly(A) tail at the end of viral RNA molecules can significantly decrease their capture rate efficiency in current scRNA-seq techniques, as shown by the LCMV example. This may hinder Viral-Track’s ability to robustly identify infected cells or discern differential expression between infected and bystander cells in such viruses. Other properties of the viral RNA molecules, such as absence/presence of 5′ capping, nucleotide composition, or dependence on RNA binding proteins, may also affect capture efficiency, and as the technology develops, further research will focus on the classification of molecular features that facilitate or prevent virus identification by scRNA-seq. Notably, non-poly(A)-based scRNA-seq techniques, such as RamDA-seq (Hayashi et al., 2018), can be potentially used when profiling these datasets.
Another limiting factor for Viral-Track’s applicability is the potential scarcity of viral reads and infected cells in the sample. As shown in our analysis of SARS-CoV-2-infected samples, only a limited number of viral reads are detected in some of the samples. This may be due to the specific stage of the disease (He et al., 2020), or sampling biases favoring mainly the lung immune populations, with lower representation of non-immune cells that are the primary targets of the virus. Therefore, future COVID-19 scRNA-seq studies should consider this limitation in their experimental design and aim for a better representation of the upper respiratory tissue and the lung parenchyma. Alternative approaches may rely on index sorting and single-cell transcriptome-trained sorting to design optimal gating strategies for capturing and enriching the stromal populations.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:
● KEY RESOURCES TABLE
● RESOURCE AVAILABILITY
  ○ Lead Contact
  ○ Materials Availability
  ○ Data and Code Availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Mice
  ○ LCMV/VSV infections
  ○ Subjects
● METHOD DETAILS
  ○ Lymph Node MARS-seq data generation
  ○ Influenza MARS-seq data generation
  ○ LCMV spleen MARS-seq data generation
  ○ 10X HBV liver data generation
  ○ 10X COVID-19 data generation
● QUANTIFICATION AND STATISTICAL ANALYSIS
  ○ Read mapping/alignment
  ○ Viral database and STAR Index building
  ○ Processing and filtering of the BAM files
  ○ Transcript reconstruction
  ○ MARS-seq data demultiplexing and UMI count
  ○ Drop-seq and 10X data download, pre-processing and demultiplexing
  ○ Analysis of the MARS-seq spleen LCMV dataset
  ○ Analysis of the 10X HBV liver dataset
  ○ Analysis of the COVID-19 BAL dataset
  ○ Testing for infection specificity in COVID-19 BAL dataset
  ○ Dichotomized differential gene expression analysis
  ○ Automate thresholding to detect HBV and hMPV infected cells
  ○ Gene set enrichment analysis

SUPPLEMENTAL INFORMATION

cell.2020.05.006.

ACKNOWLEDGMENTS

We thank Dr. Noam Stern Ginossar and Dr. Yoav Golan for careful evaluation of the manuscript; Dr. Etienne Simon-Loriere, Thomas Jacquemont, and Alice Balfourier for valuable advice; Tali Wiesel from the Scientific Illustration unit of the Weizmann Institute for artwork; and members of the Amit laboratory for discussions. I.A. is an Eden and Steven Romick Professorial Chair and supported by Merck KGaA (Darmstadt, Germany), the Chan Zuckerberg Initiative (CZI), an HHMI International Scholar award, the European Research Council Consolidator Grant (ERC-COG) 724471-HemTree2.0, an SCA award from the Wolfson Foundation and Family Charitable Trust, the Thompson Family Foundation, an MRA Established Investigator Award (509044), the Israel Science Foundation (703/15), the Ernest and Bonnie Beutler Research Program for Excellence in Genomic Medicine, the Helen and Martin Kimmel award for innovative investigation, a NeuroMac DFG/Transregional Collaborative Research Center grant, an International Progressive MS Alliance/NMSS (PA-1604 08459), and an Adelis Foundation grant. P.B. is supported by a PhD scholarship from the Ecole Normale Supérieure, Paris. B.S. has received funding from the French Government’s Investissement d’Avenir program, Laboratoire d’Excellence “Integrative Biology of Emerging Infectious Diseases” (ANR-10-LABX-62-IBEID). Z.Z. and Y.L. were supported by funding from the National Natural Science Foundation of China (91442127 to Z.Z.; 81700540 to Y.L.).

AUTHOR CONTRIBUTIONS

P.B. designed and developed Viral-Track, performed various computational analyses, and wrote the manuscript. A.G. performed computational analyses and wrote the manuscript. Y.L., G.X., and S.Z. designed and performed experiments. Y.B. developed Viral-Track. E.D. contributed to data processing and analysis. R.B.-G., M.C., and C.M. designed and performed experiments. H.L. contributed to data analysis. A.D. contributed to data analysis, manuscript writing, and scientific communication. B.S. contributed to development of computational methods and bioinformatic analysis and wrote the manuscript. Z.Z. conceived, designed, and analyzed experiments and wrote the manuscript. I.A. directed the project, conceived, designed, and analyzed experiments, and wrote the manuscript.

DECLARATION OF INTERESTS

The authors declare no competing interests.
Received: March 30, 2020
Revised: April 16, 2020
Accepted: May 1, 2020
Published: May 8, 2020

REFERENCES

Bailey, J.R., Barnes, E., and Cox, A.L. (2019). Approaches, Progress, and Challenges to Hepatitis C Vaccine Development. Gastroenterology 156, 418–430.
Baran, Y., Bercovich, A., Sebe-Pedros, A., Lubling, Y., Giladi, A., Chomsky, E., Meir, Z., Hoichman, M., Lifshitz, A., and Tanay, A. (2019). MetaCell: analysis of single-cell RNA-seq data using K-nn graph partitions. Genome Biol. 20, 206.
Beier, J.I., Jokinen, J.D., Holz, G.E., Whang, P.S., Martin, A.M., Warner, N.L., Arteel, G.E., and Lukashevich, I.S. (2015). Novel mechanism of arenavirus-induced liver pathology. PLoS ONE 10, e0122839.
Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B 57, 289–300.
Blecher-Gonen, R., Bost, P., Hilligan, K.L., David, E., Salame, T.M., Roussel, E., Connor, L.M., Mayer, J.U., Bahar Halpern, K., Tóth, B., et al. (2019). Single-Cell Analysis of Diverse Pathogen Responses Defines a Molecular Roadmap for Generating Antigen-Specific Immunity. Cell Syst. 8, 109–121.
Blondel, V.D., Guillaume, J.-L., Lambiotte, R., and Lefebvre, E. (2008). Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008.
Bradley, T., Ferrari, G., Haynes, B.F., Margolis, D.M., and Browne, E.P. (2018). Single-Cell Analysis of Quiescent HIV Infection Reveals Host Transcriptional Profiles that Regulate Proviral Latency. Cell Rep. 25, 107–117.
Burrell, C.J., Howard, C.R., and Murphy, F.A. (2017). Fenner and White’s Medical Virology (Academic Press).
Burton, D.R. (2019). Advancing an HIV vaccine; advancing vaccinology. Nat. Rev. Immunol. 19, 77–78.
Chan, P.K.S., Tam, J.S., Lam, C.-W., Chan, E., Wu, A., Li, C.-K., Buckley, T.A., Ng, K.-C., Joynt, G.M., Cheng, F.W.T., et al. (2003). Human metapneumovirus detection in patients with severe acute respiratory syndrome. Emerg. Infect. Dis. 9, 1058–1063.
De Baets, S., Verhelst, J., Van den Hoecke, S., Smet, A., Schotsaert, M., Job, E.R., Roose, K., Schepens, B., Fiers, W., and Saelens, X. (2015). A GFP expressing influenza A virus to report in vivo tropism and protection by a matrix protein 2 ectodomain-specific monoclonal antibody. PLoS ONE 10, e0121491.
Deffrasnes, C., Marsh, G.A., Foo, C.H., Rootes, C.L., Gould, C.M., Grusovin, J., Monaghan, P., Lo, M.K., Tompkins, S.M., Adams, T.E., et al. (2016). Genome-wide siRNA screening at biosafety level 4 reveals a crucial role for fibrillarin in henipavirus infection. PLoS Pathog. 12, e1005478.
Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21.
Drayman, N., Patel, P., Vistain, L., and Tay, S. (2019). HSV-1 single-cell analysis reveals the activation of anti-viral and developmental programs in distinct sub-populations. eLife 8, e46339.
Dubois, J., Terrier, O., and Rosa-Calatrava, M. (2014). Influenza viruses and mRNA splicing: doing more with less. MBio 5, e00070-14.
Faure-Dupuy, S., Delphin, M., Aillot, L., Dimier, L., Lebossé, F., Fresquet, J., Parent, R., Matter, M.S., Rivoire, M., Bendriss-Vermare, N., et al. (2019). Hepatitis B virus-induced modulation of liver macrophage function promotes hepatocyte infection. J. Hepatol. 71, 1086–1098.
Finak, G., McDavid, A., Yajima, M., Deng, J., Gersuk, V., Shalek, A.K., Slichter, C.K., Miller, H.W., McElrath, M.J., Prlic, M., et al. (2015). MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278.
Griffiths, J.A., Richard, A.C., Bach, K., Lun, A.T.L., and Marioni, J.C. (2018). Detection and removal of barcode swapping in single-cell RNA-seq data. Nat. Commun. 9, 2667.
Hafemeister, C., and Satija, R. (2019). Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biol. 20, 296.
Haller, O., Arnheiter, H., Lindenmann, J., and Gresser, I. (1980). Host gene influences sensitivity to interferon action selectively for influenza virus. Nature 283, 660–662.
Harris, M.L., Fufa, T.D., Palmer, J.W., Joshi, S.S., Larson, D.M., Incao, A., Gildea, D.E., Trivedi, N.S., Lee, A.N., Day, C.-P., et al.; NISC Comparative Sequencing Program (2018). A direct link between MITF, innate immunity, and hair graying. PLoS Biol. 16, e2003648.
Hayashi, T., Ozaki, H., Sasagawa, Y., Umeda, M., Danno, H., and Nikaido, I. (2018). Single-cell full-length total RNA sequencing uncovers dynamics of recursive splicing and enhancer RNAs. Nat. Commun. 9, 619.
He, X., Lau, E.H.Y., Wu, P., Deng, X., Wang, J., Hao, X., Lau, Y.C., Wong, J.Y., Guan, Y., Tan, X., et al. (2020). Temporal dynamics in viral shedding and transmissibility of COVID-19. Nat. Med. 26, 672–675.
Herold, T., Jurinovic, V., Arnreich, C., Hellmuth, J.C., von Bergwelt-Baildon, M., Klein, M., and Weinberger, T. (2020). Level of IL-6 predicts respiratory failure in hospitalized symptomatic COVID-19 patients. medRxiv. https://doi.org/10.1101/2020.04.01.20047381.
Hoffmann, M., Kleine-Weber, H., Schroeder, S., Krüger, N., Herrler, T., Erichsen, S., Schiergens, T.S., Herrler, G., Wu, N.-H., Nitsche, A., et al. (2020). SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor. Cell 181, 271–280.
Itzhaki, R.F. (2018). Corroboration of a Major Role for Herpes Simplex Virus Type 1 in Alzheimer’s Disease. Front. Aging Neurosci. 10, 324.
Iwasaki, A., and Yang, Y. (2020). The potential danger of suboptimal antibody responses in COVID-19. Nat. Rev. Immunol. Published online April 21, 2020. https://doi.org/10.1038/s41577-020-0321-6.
Jaitin, D.A., Kenigsberg, E., Keren-Shaul, H., Elefant, N., Paul, F., Zaretsky, I., Mildner, A., Cohen, N., Jung, S., Tanay, A., and Amit, I. (2014). Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types. Science 343, 776–779.
Jaitin, D.A., Adlung, L., Thaiss, C.A., Weiner, A., Li, B., Descamps, H., Lundgren, P., Bleriot, C., Liu, Z., Deczkowska, A., et al. (2019). Lipid-Associated Macrophages Control Metabolic Homeostasis in a Trem2-Dependent Manner. Cell 178, 686–698.
Chen, J., Liu, D., Liu, L., Liu, P., Xu, Q., Xia, L., Ling, Y., Huang, D., Song, S., Zhang, D., et al. (2020). A pilot study of hydroxychloroquine in treatment of patients with common coronavirus disease-19 (COVID-19). J. Zhejiang Univ. Medical Sci. 49 https://doi.org/10.3785/j.issn.1008-9292.2020.03.030.
Keren-Shaul, H., Spinrad, A., Weiner, A., Matcovitch-Natan, O., Dvir-Szternfeld, R., Ulland, T.K., David, E., Baruch, K., Lara-Astaiso, D., Toth, B., et al. (2017). A Unique Microglia Type Associated with Restricting Development of Alzheimer’s Disease. Cell 169, 1276–1290.
Keren-Shaul, H., Kenigsberg, E., Jaitin, D.A., David, E., Paul, F., Tanay, A., and Amit, I. (2019). MARS-seq2.0: an experimental and analytical pipeline for indexed sorting combined with single-cell RNA sequencing. Nat. Protoc. 14, 1841–1862.
Kharchenko, P.V., Silberstein, L., and Scadden, D.T. (2014). Bayesian approach to single-cell differential expression analysis. Nat. Methods 11, 740–742.
Lake, B.B., Chen, S., Sos, B.C., Fan, J., Kaeser, G.E., Yung, Y.C., Duong, T.E., Gao, D., Chun, J., Kharchenko, P.V., and Zhang, K. (2018). Integrative single-cell analysis of transcriptional and epigenetic states in the human adult brain. Nat. Biotechnol. 36, 70–80.
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., and Durbin, R.; 1000 Genome Project Data Processing Subgroup (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079.
Li, H., van der Leun, A.M., Yofe, I., Lubling, Y., Gelbard-Solodkin, D., van Akkooi, A.C.J., van den Braber, M., Rozeman, E.A., Haanen, J.B.A.G., Blank, C.U., et al. (2019). Dysfunctional CD8 T Cells Form a Proliferative, Dynamically Regulated Compartment within Human Melanoma. Cell 176, 775–789.
Liao, M., Liu, Y., Yuan, J., Wen, Y., Xu, G., Zhao, J., Cheng, L., Li, J., Wang, X., Wang, F., et al. (2020). The landscape of lung bronchoalveolar immune cells in COVID-19 revealed by single-cell RNA sequencing. medRxiv. https://doi.org/10.1101/2020.02.23.20026690.
Liberzon, A., Birger, C., Thorvaldsdóttir, H., Ghandi, M., Mesirov, J.P., and Tamayo, P. (2015). The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425.
Lienenklaus, S., Cornitescu, M., Ziętara, N., Łyszkiewicz, M., Gekara, N., Jabłónska, J., Edenhofer, F., Rajewsky, K., Bruder, D., Hafner, M., et al. (2009). Novel reporter mouse reveals constitutive and inflammatory expression of IFN-β in vivo. J. Immunol. 183, 3229–3236.
Liu, L., Wei, Q., Lin, Q., Fang, J., Wang, H., Kwok, H., Tang, H., Nishiura, K., Peng, J., Tan, Z., et al. (2019). Anti-spike IgG causes severe acute lung injury by skewing macrophage responses during acute SARS-CoV infection. JCI Insight 4, 123158.
Lu, H.K., Rentero, C., Raftery, M.J., Borges, L., Bryant, K., and Tedla, N. (2009). Leukocyte Ig-like receptor B4 (LILRB4) is a potent inhibitor of FcgammaRI-mediated monocyte activation via dephosphorylation of multiple kinases. J. Biol. Chem. 284, 34839–34848.
Macosko, E.Z., Basu, A., Satija, R., Nemesh, J., Shekhar, K., Goldman, M., Tirosh, I., Bialas, A.R., Kamitaki, N., Martersteck, E.M., et al. (2015). Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell 161, 1202–1214.
Masters, P.S. (2006). The molecular biology of coronaviruses. Adv. Virus Res. 66, 193–292.
McInnes, L., Healy, J., and Melville, J. (2018). Umap: Uniform manifold approximation and projection for dimension reduction. ArXiv, ArXiv1802.03426.
Medaglia, C., Giladi, A., Stoler-Barak, L., De Giovanni, M., Salame, T.M., Biram, A., David, E., Li, H., Iannacone, M., Shulman, Z., et al. (2017). Spatial reconstruction of immune niches by combining photoactivatable reporters and scRNA-seq. Science 358, 1622–1626.
Moustafa, A., Xie, C., Kirkness, E., Biggs, W., Wong, E., Turpaz, Y., Bloom, K., Delwart, E., Nelson, K.E., Venter, J.C., and Telenti, A. (2017). The blood DNA virome in 8,000 humans. PLoS Pathog. 13, e1006292.
Müller, S., Hunziker, L., Enzler, S., Bühler-Jungo, M., Di Santo, J.P., Zinkernagel, R.M., and Mueller, C. (2002). Role of an intact splenic microarchitecture in early lymphocytic choriomeningitis virus production. J. Virol. 76, 2375–2383.
Münz, C., Lünemann, J.D., Getts, M.T., and Miller, S.D. (2009). Antiviral immune responses: triggers of or triggered by autoimmunity? Nat. Rev. Immunol. 9, 246–258.
Otsu, N. (1979). A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man Cybern. 9, 62–66.
Panda, S., Mohakud, N.K., Pena, L., and Kumar, S. (2014). Human metapneumovirus: review of an important respiratory pathogen. Int. J. Infect. Dis. 25, 45–52.
Pertea, M., Pertea, G.M., Antonescu, C.M., Chang, T.-C., Mendell, J.T., and Salzberg, S.L. (2015). StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295.
Pierson, T.C., and Diamond, M.S. (2018). The emergence of Zika virus and its new clinical syndromes. Nature 560, 573–581.
Prompetchara, E., Ketloy, C., and Palaga, T. (2020). Immune responses in COVID-19 and potential vaccines: Lessons learned from SARS and MERS epidemic. Asian Pac. J. Allergy Immunol. 38, 1–9.
Shnayder, M., Nachshon, A., Krishna, B., Poole, E., Boshkov, A., Binyamin, A., Maza, I., Sinclair, J., Schwartz, M., and Stern-Ginossar, N. (2018). Defining the Transcriptional Landscape during Cytomegalovirus Latency with Single-Cell RNA Sequencing. MBio 9, e00013-18.
Silverman, J.D., Roche, K., Mukherjee, S., and David, L.A. (2018). Naught all zeros in sequence count data are the same. bioRxiv. https://doi.org/10.1101/477794.
Smith, T., Heger, A., and Sudbery, I. (2017). UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy. Genome Res. 27, 491–499.
Stano, M., Beke, G., and Klucar, L. (2016). viruSITE-integrated database for viral genomics. Database J. Biol. Databases Curation 2016. https://doi.org/10.1093/database/baw162.
Steuerman, Y., Cohen, M., Peshes-Yaloz, N., Valadarsky, L., Cohn, O., David, E., Frishberg, A., Mayo, L., Bacharach, E., Amit, I., and Gat-Viks, I. (2018). Dissection of Influenza Infection In Vivo by Single-Cell RNA Sequencing. Cell Syst. 6, 679–691.
Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S., and Mesirov, J.P. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. USA 102, 15545–15550.
Svensson, V. (2020). Droplet scRNA-seq is not zero-inflated. Nat. Biotechnol. 38, 147–150.
Svensson, V., da Veiga Beltrame, E., and Pachter, L. (2019). Quantifying the tradeoff between sequencing depth and cell number in single-cell RNA-seq. bioRxiv. https://doi.org/10.1101/762773.
Townes, F.W., Hicks, S.C., Aryee, M.J., and Irizarry, R.A. (2019). Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model. Genome Biol. 20, 295.
Wang, C., Horby, P.W., Hayden, F.G., and Gao, G.F. (2020a). A novel coronavirus outbreak of global health concern. Lancet 395, 470–473.
Wang, K., Chen, W., Zhou, Y.-S., Lian, J.-Q., Zhang, Z., Du, P., Gong, L., Zhang, Y., Cui, H.-Y., Geng, J.-J., et al. (2020b). SARS-CoV-2 invades host cells via a novel route: CD147-spike protein. BioRxiv. https://doi.org/10.1101/2020.03.14.988345.
Wu, X., Cai, Y., Huang, X., Yu, X., Zhao, L., Wang, F., Li, Q., Gu, S., Xu, T., Li, Y., et al. (2020). Co-infection with SARS-CoV-2 and Influenza A Virus in Patient with Pneumonia, China. Emerg. Infect. Dis. 26. https://doi.org/10.3201/eid2606.200299.
Yang, S., Corbett, S.E., Koga, Y., Wang, Z., Johnson, W.E., Yajima, M., and Campbell, J.D. (2020). Decontamination of ambient RNA in single-cell RNA-seq with DecontX. Genome Biol. 21, 57.
Yofe, I., Dahan, R., and Amit, I. (2020). Single-cell genomic approaches for developing the next generation of immunotherapies. Nat. Med. 26, 171–177.
Young, L.S., and Rickinson, A.B. (2004). Epstein-Barr virus: 40 years on. Nat. Rev. Cancer 4, 757–768.
Zanini, F., Robinson, M.L., Croote, D., Sahoo, M.K., Sanz, A.M., Ortiz-Lasso, E., Albornoz, L.L., Rosso, F., Montoya, J.G., Goo, L., et al. (2018). Virus-inclusive single-cell RNA sequencing reveals the molecular signature of progression to severe dengue. Proc. Natl. Acad. Sci. USA 115, E12363–E12369.
Zhang, F., Wei, K., Slowikowski, K., Fonseka, C.Y., Rao, D.A., Kelly, S., Goodman, S.M., Tabechian, D., Hughes, L.B., Salomon-Escoto, K., et al.; Accelerating Medicines Partnership Rheumatoid Arthritis and Systemic Lupus Erythematosus (AMP RA/SLE) Consortium (2019). Defining inflammatory cell states in rheumatoid arthritis joint synovial tissues by integrating single-cell transcriptomics and mass cytometry. Nat. Immunol. 20, 928–942.
Zhang, H., Wang, X., Fu, Z., Luo, M., Zhang, Z., Zhang, K., He, Y., Wan, D., Zhang, L., Wang, J., et al. (2020). Potential Factors for Prediction of Disease Severity of COVID-19 Patients. MedRxiv. https://doi.org/10.1101/2020.03.20.20039818.
Zheng, G.X.Y., Terry, J.M., Belgrader, P., Ryvkin, P., Bent, Z.W., Wilson, R., Ziraldo, S.B., Wheeler, T.D., McDermott, G.P., Zhu, J., et al. (2017). Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049.
Zhu, N., Zhang, D., Wang, W., Li, X., Yang, B., Song, J., Zhao, X., Huang, B., Shi, W., Lu, R., et al.; China Novel Coronavirus Investigating and Research Team (2020). A Novel Coronavirus from Patients with Pneumonia in China, 2019. N. Engl. J. Med. 382, 727–733.
STAR+METHODS
KEY RESOURCES TABLE
REAGENT or RESOURCE | SOURCE | IDENTIFIER

Antibodies
anti-mouse TCRβ biotin (clone H57-597) | Biolegend | Cat#:109203; RRID:AB_313426
anti-mouse CD3 biotin (clone 17A2) | Biolegend | Cat#:100243; RRID:AB_2563946
anti-mouse CD19 biotin (clone 6D5) | Biolegend | Cat#:115503; RRID:AB_313638
Bacterial and Virus Strains
Vesicular Stomatitis Virus (VSV) Indiana Strain | In house | N/A
Lymphocytic choriomeningitis virus (LCMV)-Armstrong (Arm) strain | In house | N/A
LCMV-Arm-eGFP | In house | N/A
Biological Samples
COVID-19 BAL samples | Shenzhen Third People’s Hospital | N/A
HBV liver sample | Shenzhen Third People’s Hospital | N/A
Chemicals, Peptides, and Recombinant Proteins
Liberase TL | Roche | Cat#:5401020001
DNase I, grade II | Roche | Cat#:10104159001
Critical Commercial Assays
Chromium Single Cell 3ʹ Reagent Kit (v3 chemistry) | 10X Genomics | 1000075
Chromium Single Cell V(D)J Reagent Kits (v1 Chemistry) | 10X Genomics | 1000006
Deposited Data
Raw data files for the 10X COVID-19 and HBV patients | This paper | GEO: GSE145926
Raw data files for the LCMV/VSV single-cell RNA-seq | This paper | GEO: GSE149443
Experimental Models: Organisms/Strains
Mouse: C57BL/6 WT | Jackson Laboratories | RRID:IMSR_JAX:000664
Software and Algorithms
R (3.5.0) | The R project | https://www.r-project.org
Python (3.6.5) | Python Software Foundation | https://www.python.org
STAR (2.7.0) | Dobin et al., 2013 | https://github.com/alexdobin/STAR
Samtools (1.4.0) | Li et al., 2009 | http://www.htslib.org/download/
StringTie (1.3.5) | Pertea et al., 2015 | https://ccb.jhu.edu/software/stringtie/
UMI-tools (1.0.0) | Smith et al., 2017 | https://umi-tools.readthedocs.io/en/latest/
Pagoda2 (0.1.0) | Lake et al., 2018 | https://github.com/hms-dbmi/pagoda2/
MetaCell (0.3.41) | Baran et al., 2019 | https://github.com/tanaylab/metacell
Cell Ranger (3.1.0) | N/A | https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger
Other
MARS-seq reagents | Jaitin et al., 2014 | N/A
Cell 181, 1475–1488.e1–e6, June 25, 2020 e1

RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to, and will be fulfilled by, the Lead Contact, Ido Amit
(ido.amit@weizmann.ac.il).
Materials Availability
This study did not generate new unique reagents.
Data and Code Availability

The whole Viral-Track pipeline is freely available at https://github.com/PierreBSC/Viral-Track. The datasets generated during this
study were deposited to the Gene Expression Omnibus (GEO) repository with accession codes GEO: GSE145926 and GSE149443.
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Mice
C57BL/6 mice were purchased from Jackson Laboratories and bred and housed at the Weizmann Institute of Science animal facility,
under specific pathogen-free conditions. Female mice, 6-8 weeks of age, were used for all experiments. Experimental protocols were
approved by the Weizmann Institute of Science Ethics Committee and were performed according to institutional guidelines.
LCMV/VSV infections
For LCMV infection, 1 × 10^5 focus-forming units (FFU) of the LCMV-Arm strain were injected. For VSV, 1 × 10^5 plaque-forming units
(PFU) of the VSV Indiana strain were used. Mice were anesthetized and viruses were administered by intradermal injection into the ear
pinna. 24 h later, mice were sacrificed and auricular lymph nodes (LNs) were harvested.
Subjects
This study was conducted according to the principles expressed in the Declaration of Helsinki. Ethical approval was obtained from
the Research Ethics Committee of Shenzhen Third People’s Hospital. All participants provided written informed consent for sample
collection and subsequent analyses.
METHOD DETAILS
Lymph Node MARS-seq data generation

To prepare single-cell suspensions for MARS-seq and flow cytometry, auricular LNs were digested in IMDM containing 100 μg/mL
Liberase TL and 100 μg/mL DNase I (both from Roche, Germany) for 20 minutes at 37°C. In the last 5 minutes of incubation, EDTA was
added at a final concentration of 10 mM. Cells were collected, filtered through a 70 μm cell strainer, washed with IMDM, and
maintained strictly at 4°C. Cells were sorted with a FACSAria Fusion (BD Biosciences, San Jose, CA). Prior to sorting, all samples
were filtered through a 70 μm nylon mesh. Isolated cells were single-cell sorted into 384-well cell capture plates containing 2 μL
of lysis solution and barcoded poly(T) reverse-transcription (RT) primers for single-cell RNA-seq (Jaitin et al., 2014). Four empty wells
were kept in each 384-well plate as a no-cell control for data analysis. Immediately after sorting, each plate was spun down to ensure
cell immersion into the lysis solution, snap frozen on dry ice, and stored at −80°C until processing. Single-cell RNA-seq libraries were
prepared as previously described (Jaitin et al., 2014). In brief, mRNA from single cells sorted into capture plates was barcoded and
converted into cDNA, and then pooled using an automated pipeline. The pooled sample was linearly amplified by T7 in vitro
transcription, and the resulting RNA was fragmented and converted into a sequencing-ready library by tagging the samples with pool
barcodes and Illumina sequences during ligation, RT, and PCR. Each pool of cells was tested for library quality and concentration as
described previously (Jaitin et al., 2014).
Influenza MARS-seq data generation

A full description of the protocol used to generate the influenza A lung data can be found in Steuerman et al. (2018). Influenza PR8
H1N1 virus (A/Puerto Rico/8/34) was cultivated in hen egg anion. Mice were inoculated intranasally with 40 μL of diluted virus
(6 × 10^3 PFU per mouse), or with 40 μL of PBS for the control mice. Mice were killed 48 or 72 h post infection and the lungs were perfused.
Immune and non-immune cells were then extracted using two different extraction protocols before being single-cell sorted into
384-well plates and sequenced using the original MARS-seq protocol (Jaitin et al., 2014).
LCMV spleen MARS-seq data generation

A description of the full protocol used to generate the NICHE-Seq spleen data can be found in Medaglia et al. (2017). Briefly, female
mice received 1 × 10^6 FFU of LCMV-Arm or LCMV-Arm-eGFP in the footpad. 72 hours after injection, spleens were harvested and
forced through a 70 μm mesh to form a single-cell suspension. Cells were then single-cell sorted using a SORP-aria into 384-well
plates containing lysis buffer before processing the plate according to the MARS-seq protocol (Jaitin et al., 2014). All infectious work
was performed in designated Biosafety Level 2 (BSL-2) and BSL-3 workspaces in accordance with institutional guidelines.
10X HBV liver data generation

The approximately 1 cm long liver biopsy was homogenized by mincing with scissors into smaller pieces (~0.5 mm^2 per piece). The
tissue was then transferred into 10 mL of enzyme mix consisting of 0.3 mg/ml collagenase type IV (Sigma, C9891) and DNase I (Sigma,
D5025) for mild enzymatic digestion for 1 h at 37°C while shaking. 5 mL of Dulbecco’s phosphate-buffered saline (DPBS, Thermo,
14190250) supplemented with 5% FBS was added to interrupt digestion, and dissociated cells in suspension were passed through a
40 μm strainer and centrifuged at 300 × g for 5 min at 4°C. Erythrocytes were lysed using Ammonium-Chloride-Potassium (ACK,
Thermo, A1049201), and finally cells were re-suspended in DPBS supplemented with 1% FBS at a concentration of 2,000
cells/μl for scRNA-Seq. Single-cell capture and downstream library construction were performed using the Chromium Single
Cell 3′ V3 library preparation kit according to the manufacturer’s protocol (10x Genomics). Full-length cDNA along with cell-barcode
identifiers were PCR-amplified, and sequencing libraries were prepared and normalized to 3 nM. The constructed library was
sequenced on the BGI MGISEQ-2000 platform. The Cell Ranger Software Suite (Version 3.1.0) was then used to perform sample
de-multiplexing, barcode processing, and single-cell 3′ UMI counting with human GRCh38 as the reference genome.
10X COVID-19 data generation

20 mL of BALF was obtained and placed on ice. BALF was processed within 2 hours and all operations were performed in a BSL-3
laboratory. BALF was passed through a 100 μm nylon cell strainer to filter out lumps; the supernatant was then centrifuged and the cells
were re-suspended in cooled RPMI 1640 complete medium. The cells were then counted in 0.4% trypan blue, centrifuged, and
re-suspended at a concentration of 2 × 10^6/ml for further use. A total of 11 μl of single-cell suspension and 40 μl of barcoded Gel Beads
were loaded onto Chromium Chip A to generate single-cell gel bead-in-emulsions (GEMs). The poly-adenylated transcripts were
subsequently reverse-transcribed. Single-cell capture and downstream library construction were performed using the Chromium Single Cell 5′
library preparation kit according to the manufacturer’s protocol (10x Genomics). Full-length cDNA along with cell-barcode identifiers
were PCR-amplified, and sequencing libraries were prepared and normalized to 3 nM. The constructed library was sequenced on the BGI
MGISEQ-2000 platform. Each sample was sequenced on a different sequencing run to avoid contamination between samples. The
Cell Ranger Software Suite (Version 3.1.0) was then used to perform sample de-multiplexing, barcode processing, and single-cell 5′
UMI counting with human GRCh38 as the reference genome. A more extensive description of the data generation process can be
found in Liao et al. (2020).
QUANTIFICATION AND STATISTICAL ANALYSIS
Read mapping/alignment
Reads were aligned using STAR 2.7.0 (Dobin et al., 2013) in two-pass mode using the following parameters: --runThreadN was set
to 14, --outSAMattributes to ‘NH HI AS nM NM XS’, --outSAMtype to ‘BAM SortedByCoordinate’, --outFilterScoreMinOverLread to
0.6, --outFilterMatchNminOverLread to 0.6, and --twopassMode to ‘Basic’.
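As an illustration only, the parameters listed above could be assembled into a STAR command line as in the following Python sketch. The index directory, input file, and output prefix are hypothetical placeholders that do not come from the original pipeline; the command is built and printed, not executed.

```python
import shlex


def star_align_args(genome_dir, fastq, out_prefix, threads=14):
    """Build a STAR argument list using the alignment parameters listed above."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,          # hypothetical index location
        "--readFilesIn", fastq,             # hypothetical input FASTQ
        "--outFileNamePrefix", out_prefix,  # hypothetical output prefix
        "--outSAMattributes", "NH", "HI", "AS", "nM", "NM", "XS",
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFilterScoreMinOverLread", "0.6",
        "--outFilterMatchNminOverLread", "0.6",
        "--twopassMode", "Basic",
    ]


args = star_align_args("star_index", "reads.fastq", "sample_")
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(a) for a in args))
```

Passing the list to subprocess.run() would launch the alignment on a machine where STAR and a matching index are available.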
Viral database and STAR Index building

As STAR performance drops drastically when the reference index contains more than 10,000 scaffolds/chromosomes, we decided
to base our analysis on the limited, but high-quality, viruSITE database (Stano et al., 2016), derived from the NCBI RefSeq database.
The corresponding FASTA file was downloaded from the viruSITE website (http://www.virusite.org/archive/2019.1/genomes.fasta.
zip). STAR indexes were built for both human and mouse samples using, respectively, the GRCh38 (hg38) and GRCm38 (mm10)
reference genomes in addition to the whole viruSITE database. Both reference genomes were downloaded at http://www.ensembl.
org//useast.ensembl.org/info/data/ftp/index.html?redirectsrc=//www.ensembl.org%2Finfo%2Fdata%2Fftp%2Findex.html.
For the analysis of COVID-19 patients, we added the official SARS-CoV-2 reference genome from the RefSeq database
(NC_045512.2), as it had not yet been added to the viruSITE database. In total, this database contains 11,988 viral segments from
9,431 different viruses.
Processing and filtering of the BAM files

We empirically observed that viral genome sequences can contain highly repetitive subsequences and can therefore create
false-positive signals. Moreover, some viral genes can share significant similarity with host genes and also generate mapping artifacts.
To remove those, we implemented a strict filtering approach in which, for each viral segment, a list of mapping features is measured
and used to estimate the quality of the mapping.
Following the alignment, the resulting BAM files were processed using the samtools toolbox (Li et al., 2009): first, the BAM files were
indexed using the samtools index command. Viral segments with more than 50 mapped reads were detected using the samtools
idxstats command, and a separate BAM file was then created for each of these viral segments using the samtools view command.
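The segment-selection step can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: it assumes the standard four-column, tab-separated output of samtools idxstats (reference name, length, mapped reads, unmapped reads) and a user-supplied set of viral segment names. The example counts are made up.

```python
def viral_segments_to_keep(idxstats_text, viral_names, min_reads=50):
    """Return {segment: mapped_reads} for viral segments with more than min_reads mapped reads."""
    kept = {}
    for line in idxstats_text.strip().splitlines():
        name, _length, mapped, _unmapped = line.split("\t")
        if name in viral_names and int(mapped) > min_reads:
            kept[name] = int(mapped)
    return kept


# Made-up idxstats output: one host chromosome, two viral segments, unmapped reads.
example = "\n".join([
    "chr1\t248956422\t120000\t0",
    "NC_045512.2\t29903\t812\t0",
    "NC_001802.1\t9181\t12\t0",
    "*\t0\t0\t350",
])
kept = viral_segments_to_keep(example, {"NC_045512.2", "NC_001802.1"})
print(kept)
```

Each retained segment name could then be passed to samtools view to write its own BAM file.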
Each viral BAM file was then loaded into an R environment using the readGAlignments() function from the GenomicAlignments
package. Various features were then extracted to assess the quality of the mapping:
• The length of the longest mapped contig, computed using the coverage() function.
• The percentage of the viral segment that is mapped, also computed using the coverage() function.
• The mean sequencing quality of the mapped reads.
• The number and percentage of uniquely mapped reads.
• The mean sequence entropy of the mapped reads, defined as follows: for each mapped read, the frequency of each nucleotide was
extracted using the alphabetFrequency() function of the Biostrings package and averaged over the reads. The corresponding
Shannon entropy was then computed using the natural (Napierian) logarithm.
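The sequence-entropy feature in the last item can be sketched in Python as follows. This is one reading of the description above, not the authors' R code: it assumes that per-read nucleotide proportions (A, C, G, T) are averaged across reads and that the Shannon entropy of the averaged frequencies is computed with the natural logarithm, so that the maximum value is ln 4 ≈ 1.386.

```python
import math


def mean_nucleotide_frequencies(reads):
    """Average the per-read A/C/G/T proportions over all reads."""
    totals = dict.fromkeys("ACGT", 0.0)
    used = 0
    for read in reads:
        counts = {base: read.upper().count(base) for base in "ACGT"}
        n = sum(counts.values())
        if n == 0:
            continue  # skip reads without any A/C/G/T
        used += 1
        for base in "ACGT":
            totals[base] += counts[base] / n
    if used == 0:
        raise ValueError("no usable reads")
    return {base: totals[base] / used for base in "ACGT"}


def shannon_entropy(freqs):
    """Shannon entropy in nats (natural logarithm)."""
    return -sum(p * math.log(p) for p in freqs.values() if p > 0)


balanced = shannon_entropy(mean_nucleotide_frequencies(["ACGT", "AACCGGTT"]))
repetitive = shannon_entropy(mean_nucleotide_frequencies(["AAAA", "AAAAAAAA"]))
print(balanced, repetitive)
```

Low-complexity reads, such as homopolymer runs, give entropies near zero, which is why this feature helps flag mappings to repetitive viral subsequences.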
Empirically, we determined that a mean sequence entropy greater than 1.2, a coverage greater than 5%, and a longest contig longer
than three times the mean read length are sufficient to consider a viral segment to be present. This filter configuration eliminated all
manually identified artifacts in the various benchmarked datasets and was used unchanged in the HBV and COVID-19 patient
data analysis.
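The three thresholds can be combined into a simple decision rule. The Python sketch below is illustrative: it assumes a per-base depth vector for one viral segment (the kind of information the coverage() function provides) and uses made-up numbers; the function names are hypothetical.

```python
def coverage_features(depth):
    """Return (fraction of positions covered, length of longest covered run)."""
    covered = sum(1 for d in depth if d > 0)
    longest = run = 0
    for d in depth:
        run = run + 1 if d > 0 else 0
        longest = max(longest, run)
    return covered / len(depth), longest


def segment_is_present(mean_entropy, covered_fraction, longest_contig, mean_read_length):
    """Apply the three empirical thresholds described above."""
    return (
        mean_entropy > 1.2
        and covered_fraction > 0.05
        and longest_contig > 3 * mean_read_length
    )


# Toy depth vector: 10 positions, two covered stretches of length 3 and 2.
fraction, longest = coverage_features([0, 1, 1, 1, 0, 2, 2, 0, 0, 0])
print(fraction, longest)
print(segment_is_present(1.31, 0.40, 900, 100))   # passes all three thresholds
print(segment_is_present(0.85, 0.40, 900, 100))   # fails on entropy
```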
When using this strategy, we observed two different kinds of ‘contamination’:
• The first consists of the detection of retroviruses specific to the sequenced host species. This is likely due to the expression
of host endogenous retroviral elements that are highly similar to ‘real’ retroviruses.
• The second is the presence of a plant virus, the Tomato brown rugose fruit virus. This is an emerging virus that infects tomatoes
and peppers and is endemic in Israel and Jordan. It is highly contagious and spreads easily. We detected this virus only in
samples sequenced in Rehovot (Israel), suggesting that it was due to airborne contamination.
To improve computation speed, this step was parallelised using the doParallel R package.
Transcript reconstruction
As viral genomes are poorly annotated, we decided to systematically reconstruct the transcriptome of each detected viral segment
using the transcript assembler StringTie (Pertea et al., 2015). StringTie was used with default parameters, except for the minimum isoform
abundance parameter -f, which was set to 0.01 to detect lowly abundant transcripts, and the minimal distance between two transcripts,
-g, which was set to 5.
MARS-seq data demultiplexing and UMI count

In order to have a UMI-counting procedure adapted to viral genomes, i.e., one that distinguishes spliced from unspliced RNA molecules, we
developed an in-house R script based on the GenomicRanges, GenomicAlignments, and GenomicFeatures packages that uses the
same strategy as the commercial CellRanger toolkit. Briefly, cell barcodes were extracted and compared with a cell barcode whitelist
provided by the MARS-seq2 demultiplexing pipeline (Keren-Shaul et al., 2019): cell barcodes that belonged to the whitelist were kept,
while cell barcodes that did not belong to the whitelist but had a highly similar whitelisted barcode (Hamming distance equal to one,
computed using the stringdist() function from the stringdist package) were corrected and kept. UMIs were also extracted and
mono-nucleotide UMIs were filtered out. Hamming distances between UMIs assigned to the same cell and the same gene were then
computed similarly to cell barcodes, and UMIs with a Hamming distance equal to one were aggregated and considered redundant
UMIs. Lastly, the mapping file was loaded using the readGAlignments() function from the GenomicAlignments package and reads
were assigned to a specific viral gene using the findOverlaps() function from the same package. If a read mapped to a given
viral transcript but was not assigned to any viral gene, it was considered as coming from an unspliced viral RNA molecule.
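The barcode correction and UMI aggregation described above can be sketched in Python. This is a simplified illustration, not the in-house R script: discarding barcodes with several whitelist neighbours, and greedily merging UMIs in sorted order, are assumptions made here to keep the example short.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(barcode, whitelist):
    """Keep whitelisted barcodes; rescue those at Hamming distance 1 from a single whitelist entry."""
    if barcode in whitelist:
        return barcode
    neighbours = [w for w in whitelist if len(w) == len(barcode) and hamming(barcode, w) == 1]
    # Assumption: ambiguous barcodes (several neighbours) are discarded.
    return neighbours[0] if len(neighbours) == 1 else None


def count_umis(umis):
    """Count UMIs for one (cell, gene) pair after removing mono-nucleotide and redundant UMIs."""
    unique = sorted({u for u in umis if len(set(u)) > 1})  # drop e.g. 'AAAA'
    representatives = []
    for umi in unique:
        # A UMI within distance 1 of an already kept UMI is treated as redundant.
        if not any(hamming(umi, rep) <= 1 for rep in representatives):
            representatives.append(umi)
    return len(representatives)


whitelist = {"AACC", "GGTT"}
print(correct_barcode("AACG", whitelist))   # one mismatch to AACC, so corrected
print(correct_barcode("ACGG", whitelist))   # too distant from every entry
print(count_umis(["AAAA", "ACGT", "ACGA", "TTGC", "ACGT"]))
```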
Drop-seq and 10X data download, pre-processing and demultiplexing

Fastq files were downloaded through the SRA Explorer tool (https://sra-explorer.info/#). Identification and correction of cellular
barcodes, as well as UMI demultiplexing, were performed using UMI-tools 1.0.0 (Smith et al., 2017). First, cell barcodes were extracted and
a putative whitelist was computed using the umi_tools whitelist command with the parameters ‘--stdin
--bc-pattern=CCCCCCCCCCCCCCCCNNNNNNNNNN --log2stderr’ for the 10X data. For Drop-seq data, the same command was used except that
the --bc-pattern option was set to CCCCCCCCCCCCNNNNNNNN. Collapsing of the UMIs was performed using the command umi_tools
extract with the parameters ‘--bc-pattern=CCCCCCCCCCCCCCCCNNNNNNNNNN --stdin --filter-cell-barcode’ on the 10X data,
and with the same command for Drop-seq data except that the --bc-pattern option was set to ‘CCCCCCCCCCCCNNNNNNNN’. Following
the mapping of the reads to viral genomes and transcript assembly, the mapped reads were assigned to transcripts using the R
package Rsubread through the function featureCounts() with default parameters. The command umi_tools count was then used to compute
the final expression table with the following parameters: --per-gene --gene-tag=XT --assigned-status-tag=XS --per-cell.
Analysis of the MARS-seq spleen LCMV dataset

High-level analyses were performed using the R-based Pagoda2 pipeline (https://github.com/hms-dbmi/pagoda2/) (Lake et al., 2018)
in addition to an in-house R script. Briefly, UMI tables were loaded and cells with fewer than 350 UMIs were removed. Lowly abundant
genes (fewer than 100 UMIs) were also removed from the analysis. Analysis of the filtered dataset was then performed as in our
previous paper (Blecher-Gonen et al., 2019), using the 1,500 most variable genes and 100 PCs for dimensionality reduction. A kNN graph
was built with parameter K equal to 30, and Louvain’s method was used for clustering. Cluster marker genes were computed by using
the getdiffGenes function with default parameters. Data were visualized using UMAP (McInnes et al., 2018) implemented by the uwot
package.
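The cell and gene filtering step can be illustrated with a small Python sketch. It is not the Pagoda2 or in-house R code: the UMI table is represented as a nested dictionary, the gene names and counts are made up, and computing gene totals after cell filtering is an assumption about the order of operations.

```python
def filter_umi_table(counts, min_cell_umis=350, min_gene_umis=100):
    """Drop cells with fewer than min_cell_umis UMIs, then genes with fewer than min_gene_umis UMIs.

    counts maps gene -> {cell: UMI count}.
    """
    cells = {cell for per_cell in counts.values() for cell in per_cell}
    cell_totals = {c: sum(counts[g].get(c, 0) for g in counts) for c in cells}
    kept_cells = {c for c, total in cell_totals.items() if total >= min_cell_umis}
    filtered = {}
    for gene, per_cell in counts.items():
        row = {c: n for c, n in per_cell.items() if c in kept_cells}
        if sum(row.values()) >= min_gene_umis:
            filtered[gene] = row
    return filtered


# Made-up toy table: cell c1 has 405 UMIs in total, cell c2 only 60.
toy = {
    "Gzmb": {"c1": 300, "c2": 10},
    "Actb": {"c1": 100, "c2": 50},
    "Rare": {"c1": 5},
}
filtered = filter_umi_table(toy)
print(filtered)
```

The same function covers the HBV liver thresholds by calling it with min_cell_umis=1000 and min_gene_umis=50.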
Analysis of the 10X HBV liver dataset

High-level analyses were performed using the R-based Pagoda2 pipeline (https://github.com/hms-dbmi/pagoda2/) (Lake et al., 2018)
in addition to an in-house R script. Briefly, UMI tables were loaded and cells with fewer than 1,000 UMIs were removed. Lowly abundant
genes (fewer than 50 UMIs) were also removed from the analysis. Analysis of the filtered dataset was then performed as in our
previous paper (Blecher-Gonen et al., 2019), using the 1,000 most variable genes and 100 PCs for dimensionality reduction. A kNN graph
was built with parameter K equal to 30, and Louvain’s method was used for clustering. Cluster marker genes were computed by using
the getdiffGenes function with default parameters. Data were visualized using UMAP (McInnes et al., 2018) implemented by the uwot
package.
Analysis of the COVID-19 BAL dataset

Upstream processing of reads was done with the CellRanger toolkit, resulting in a UMI table of 75,790 cells with a median UMI count
of 2,442 and a median of 868 genes per cell. Cells with fewer than 500 UMIs, or with more than 50% mitochondrial genes, were excluded.
We used the MetaCell package (Baran et al., 2019) to partition single cells from all patients into transcriptionally homogeneous
groups, termed metacells. We first removed mitochondrial genes, ERCC, and the diverse immunoglobulin genes (IGH, IGK,
and IGL).
Gene features for metacell covers were selected using the parameter Tvm = 0.4, total umi > 30, and more than 4 UMI in at least 3
cells (using the functions mcell_gset_filter_varmean and mcell_gset_filter_cov). We excluded gene features associated with the cell
cycle, stress response, type I interferon, and batch-specific genes via a clustering approach (using the functions
mcell_mat_rpt_cor_anchors and mcell_gset_split_by_dsmat). To this end, we first identified all genes with a correlation coefficient of at least 0.1 with one of
the anchor genes TOP2A, MKI67, PCNA, MCM4, UBE2C, STMN1 (cell cycle), HSPA1B, HSPA1A, DNAJB1, HSPB1, HSPA6, FOS,
JUN, CCL4, CCL4L2, MT1E, MT1X, MT1F, TYMS, GAPDH, DUT, HMGB2 (stress and batch effect), IFIT1, IFIT3, OASL, IRF7, IRF1,
STAT1, and STAT3 (type I IFN). We then hierarchically clustered the correlation matrix between these genes (filtering genes with low
coverage and computing correlation using a down-sampled UMI matrix) and selected the gene clusters that contained the above
anchor genes. We thus retained 402 genes as features (Table S3). We used metacell to build a kNN graph, perform bootstrapped
co-clustering (500 iterations; resampling 70% of the cells in each iteration), and derive a cover of the co-clustering kNN graph (K =
100). Outlier cells with expression of at least one gene more than 4-fold higher than the geometric mean in their metacell were
discarded.
Annotation of the metacell model was done using the metacell confusion matrix and analysis of marker genes. Detailed annotation
within the myeloid, lymphoid, and epithelial compartments was performed using hierarchical clustering of the metacell confusion
matrix (Figure S3A) and supervised analysis of enriched genes. Metacells enriched for markers from more than one lineage (T cells
(TRBC2), myeloid cells (S100A8, C1QB), epithelial cells (KRT18), or plasma cells (XBP1)) were marked as doublets and discarded from further
analysis. We additionally discarded metacells of erythrocytes or plasma cells from further analysis.
To derive cell cycle and type I interferon response co-expressed gene modules, we used a clustering-approach as described in the
previous paragraphs (using the functions mcell_mat_rpt_cor_anchors and mcell_gset_split_by_dsmat) on a set of cell cycle and
interferon genes. We clustered, and manually inspected the resulting clusters, retrieving 72 cell-cycle related and 65 interferon
related genes (Table S3).
To extract the proportion of proliferating cells (Figure 3G), we calculated for each cell the number of cell-cycle related transcripts per
1,000 UMI. Cells with more than 8 such transcripts were classified as proliferating.
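The proliferation call reduces to a per-cell normalization followed by a threshold, as in this short Python sketch. The input counts are made up and the function names are hypothetical.

```python
def is_proliferating(cell_cycle_umis, total_umis, threshold_per_1000=8):
    """True if the cell has more than threshold_per_1000 cell-cycle transcripts per 1,000 UMI."""
    return 1000 * cell_cycle_umis / total_umis > threshold_per_1000


def proliferating_fraction(cells):
    """cells is a list of (cell_cycle_umis, total_umis) pairs."""
    flags = [is_proliferating(cc, total) for cc, total in cells]
    return sum(flags) / len(flags)


# Made-up cells: about 12.3, exactly 8.0, and 2.0 cell-cycle transcripts per 1,000 UMI.
toy_cells = [(30, 2442), (8, 1000), (1, 500)]
print(proliferating_fraction(toy_cells))
```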
Testing for infection specificity in COVID-19 BAL dataset

To test for SARS-CoV-2 infection specificity in different cell populations, we computed for each metacell the total number of host
UMIs (hUMI) and viral UMIs (vUMI) in the three severe patients (S1-3). We then computed for each metacell its expected vUMI
count, based on its total UMI count (hUMI + vUMI) and the total vUMI proportion across all cells. Figure 4C shows the log2 fold change
between the observed and expected vUMI in each metacell, after adding a regularization factor of 5 to both values. The log2 fold change
for the 27 subsets in Figure 3A, calculated for each severe patient separately, is shown in Table S2.
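The enrichment score described above can be written out explicitly. The Python sketch below assumes that the regularization constant of 5 is added to both the observed and the expected vUMI counts before taking the ratio; the metacell names and counts are made up.

```python
import math


def viral_enrichment(host_umis, viral_umis, regularization=5.0):
    """log2((observed vUMI + r) / (expected vUMI + r)) for each metacell."""
    total_viral = sum(viral_umis.values())
    total_all = total_viral + sum(host_umis.values())
    viral_proportion = total_viral / total_all  # global vUMI proportion
    scores = {}
    for metacell in host_umis:
        observed = viral_umis[metacell]
        expected = (host_umis[metacell] + observed) * viral_proportion
        scores[metacell] = math.log2((observed + regularization) / (expected + regularization))
    return scores


# Made-up counts: both metacells hold 1,000 UMIs, but only mc1 carries viral UMIs.
host = {"mc1": 900, "mc2": 1000}
viral = {"mc1": 100, "mc2": 0}
scores = viral_enrichment(host, viral)
print(scores)
```

With these numbers, each metacell is expected to hold 50 viral UMIs, so mc1 scores log2(105/55) and mc2 scores log2(5/55).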
Testing for hMPV infection specificity was done in a similar manner. However, since the UMI distribution across cells was abundant
and heavy-tailed, we computed for each metacell the expected number of vUMI+ cells instead of its total vUMI count. A cell was
classified as vUMI+ if it had more than 10 viral UMI, as determined by automatic thresholding (Figure 4F). Figure 4G shows the log2
fold change between the observed and expected number of vUMI+ cells in each metacell, after adding a regularization factor of 5 to both
values.
Dichotomized differential gene expression analysis

ScRNA-seq data are intrinsically noisy, with a large proportion of zero values (previously called dropouts) due to limited
sampling of the initial mRNA molecule pool. In addition, cell library size is a major confounding variable, even after common normalization
procedures such as TPM, especially for lowly expressed genes (Hafemeister and Satija, 2019). We therefore improved the method
used in our former paper (Blecher-Gonen et al., 2019) that was based on logistic regression.
Briefly, our method builds on the current trend in the field of sequencing large numbers of cells at a limited
sequencing depth. Such an approach produces mostly ‘binary’ data and seems to represent the best compromise between cost and
efficiency (Svensson et al., 2019). So far, several statistical models have been used to model and analyze scRNA-seq count
data, most of them based on the zero-inflated negative-binomial (ZINB) distribution (Finak et al., 2015; Kharchenko et al., 2014).
However, recent studies suggested that those models are too complex and introduce artificial complexity (Silverman et al., 2018;
Svensson, 2020; Townes et al., 2019). We hypothesize that current models will not fit such binary data properly and that better-suited
ones need to be developed.
We therefore developed a new approach based on binomial complementary log-log regression (cloglog model): once a given group of cells has been isolated, for instance through Louvain clustering (Blondel et al., 2008), we first dichotomized gene expression (a gene is considered expressed if its normalized expression is greater than 0) and then fitted a binomial Generalized Linear Model (GLM) with a complementary log-log (cloglog) link function using the glm() R function. To mitigate the variation in library size as well as the global effect of the infection (bystander effect), we included both variables in the regression model. The corresponding p values were then computed using a Likelihood Ratio Test (LRT) and corrected using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995).
For a more comprehensive description of the approach please see Methods S1.
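The building blocks of this procedure can be sketched in Python, with the caveat that the authors worked in R with glm() and that the model fit itself is deliberately left out here. The sketch shows the dichotomization rule, the cloglog link and its inverse, a likelihood ratio test p value under the assumption that a single coefficient is tested (1 degree of freedom), and the Benjamini-Hochberg adjustment. All function names are hypothetical.

```python
import math


def dichotomize(normalized_expression):
    """1 if the gene is detected in the cell (expression > 0), else 0."""
    return [1 if value > 0 else 0 for value in normalized_expression]


def cloglog(p):
    """Complementary log-log link: eta = log(-log(1 - p)), for 0 < p < 1."""
    return math.log(-math.log(1.0 - p))


def inv_cloglog(eta):
    """Inverse link: p = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


def lrt_pvalue_1df(loglik_full, loglik_reduced):
    """LRT p value, assuming the nested models differ by one parameter.

    For 1 degree of freedom the chi-square tail equals erfc(sqrt(stat / 2)).
    """
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return math.erfc(math.sqrt(stat / 2.0))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

The inverse link makes the motivation visible: if a gene's counts in a cell are roughly Poisson with mean exp(eta), the probability of detecting at least one molecule is 1 - exp(-exp(eta)), so covariates such as library size act on the log of the underlying rate.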
Automated thresholding to detect HBV and hMPV infected cells

In the case of the HBV and hMPV infections, we observed that cells could contain from one to several thousand viral UMIs. To distinguish truly infected cells from those containing viral UMIs due to ambient contamination, we applied Otsu’s
thresholding after logarithmic transformation. Otsu’s method was implemented using an in-house R script (Otsu, 1979).
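As an illustration of this step, the following Python sketch applies a histogram-based Otsu threshold to log10-transformed viral UMI counts. It is not the authors' in-house R script: the number of bins, the use of log10, the exclusion of cells with zero viral UMIs before thresholding, and all function names are assumptions made for the example.

```python
import math


def otsu_threshold(values, n_bins=64):
    """Cut point maximizing the between-class variance of a histogram."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    total = len(values)
    total_sum = sum(c * x for c, x in zip(counts, centers))
    best_score, best_cut = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(n_bins - 1):
        w0 += counts[i]
        sum0 += counts[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        score = w0 * w1 * (mean0 - mean1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_cut = score, lo + (i + 1) * width
    return best_cut


def call_infected(viral_umis, n_bins=64):
    """Return (log10 cut, per-cell flags); cells with 0 viral UMIs are never flagged."""
    logs = [math.log10(u) for u in viral_umis if u > 0]
    cut = otsu_threshold(logs, n_bins)
    flags = [u > 0 and math.log10(u) > cut for u in viral_umis]
    return cut, flags


# Hypothetical counts: an ambient-level group and a clearly infected group.
cut, flags = call_infected([1, 1, 2, 1, 3, 2, 1, 2, 500, 800, 1200, 2000, 0])
```

Working on the log scale matters because the counts span several orders of magnitude; on the raw scale the few cells with thousands of UMIs would dominate the variance and push the cut far too high.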
Gene set enrichment analysis

Gene set enrichment analysis was performed using the online GSEA tool https://www.gsea-msigdb.org/gsea/index.jsp (Liberzon
et al., 2015; Subramanian et al., 2005). The enrichment analysis was performed using the Hallmark and Gene Ontology biological
process databases. The false discovery rate threshold was set to 0.05. Only the top 10 most enriched terms were reported.
e6 Cell 181, 1475–1488.e1–e6, June 25, 2020
Supplemental Figures
Figure S1. Benchmarking of Viral-Track on Diverse Infection Models, Related to Figure 1

A. Chart representing the different steps of the Viral-Track pipeline. B-D. Results of Viral-Track analysis performed on LCMV spleen, LCMV lymph node, and VSV lymph node datasets, respectively. Viral segments with more than 50 mapped reads are plotted. E. Number of detected LCMV (left panel) and VSV (right panel) reads in the different samples from the lymph node experiment. F. Results of Viral-Track analysis performed on the in vitro HSV-1 data. G. Quantification of the number of HSV-1 reads in HSV-1-infected and control samples. H. Results of Viral-Track analysis performed on the in vitro HIV data. I. Quantification of the number of HIV reads in HIV-infected and control samples. J. UMAP plot of the liver HBV data; dots are colored by cell subset assignment based on Louvain clustering. K. UMAP plot of the liver HBV data; infected cells are colored in orange and bystander cells in gray.
Figure S2. Comparison of Viral-Track Performance to Fluorescence Tagging Techniques, Related to Figure 2
A. Proportion of vUMI+ cells from total spleen and the LCMV-GFP+ population. B. UMAP plot of the spleen LCMV data; spots are colored based on Louvain clustering. C. UMAP plot of the spleen LCMV data; bystander cells are colored in gray, vUMI+ cells in red, and GFP+ cells in green. D. Mean gene expression in bystander and infected MZB cells. Genes with a log2FC greater than 1 or lower than −1 and a corrected p value lower than 0.01 are colored in orange.
Figure S3. Detailed Molecular and Cellular Profiling of COVID-19 BAL Samples, Related to Figure 3
A. The confusion matrix of the MetaCell model shown in Figure 3A. Entries denote for each pair of metacells the propensity of cells from both metacells to be
clustered together in a bootstrap analysis. B-D. Gene expression profiles of cells belonging to the epithelial (B), lymphoid (C), and myeloid (D) compartments. In A-D, color bars indicate association to the 27 cell subsets depicted in Figure 3A. E-G. Quantification of the frequency of specific cell subsets in the myeloid (E), lymphoid (F), and epithelial (G) compartments, across the nine patients. Diamond marks patient S1, co-infected with the human metapneumovirus (Figures 4D-4H). Horizontal lines indicate mean frequency. H. Projection of IL6 and IL8 (CXCL8) expression on the 2D map shown in Figure 4A. Colors represent expression quantiles.
Figure S4. Viral-Track Performance on COVID-19 BAL Samples, Related to Figure 4

A. Results of Viral-Track analysis performed on samples with highest viral load (patients S2 and S3). B. Mean normalized expression of ACE2, TMPRSS2, and BSG across the 27 cell subsets. C. Log2 fold change between vUMI+ and vUMI- SPP1+ monocyte-derived macrophages in patient S2 (x axis) and patient S3 (y axis). D.
Relation between total human and viral UMIs in cells from patient S1. E. Projection of cells from patient S1, co-infected with hMPV, on the metacell map from
Figure 3A. F. Enrichment analysis of the downregulated genes in hMPV infected monocytes. G. Number of hMPV UMIs in cells producing type I IFN or not. P value
was computed by fitting a logistic regression predicting if a cell would produce type I IFN using total host and viral UMIs.
Article
Targets of T Cell Responses to SARS-CoV-2 Coronavirus in Humans with COVID-19 Disease and Unexposed Individuals
Alba Grifoni, Daniela Weiskopf,
Sydney I. Ramirez, ..., Davey M. Smith,
Shane Crotty, Alessandro Sette
Correspondence
shane@lji.org (S.C.),
alex@lji.org (A.S.)
In Brief
An analysis of immune cell responses to SARS-CoV-2 from recovered patients identifies the regions of the virus that are targeted and also reveals cross-reactivity with other common circulating coronaviruses.
Highlights
• Measuring immunity to SARS-CoV-2 is key for understanding COVID-19 and vaccine development
• Epitope pools detect CD4+ and CD8+ T cells in 100% and 70% of convalescent COVID patients
• T cell responses are focused not only on spike but also on M, N, and other ORFs
• T cell reactivity to SARS-CoV-2 epitopes is also detected in non-exposed individuals
Grifoni et al., 2020, Cell 181, 1489–1501

Targets of T Cell Responses to SARS-CoV-2
Coronavirus in Humans with COVID-19
Disease and Unexposed Individuals
Alba Grifoni,1 Daniela Weiskopf,1 Sydney I. Ramirez,1,2 Jose Mateus,1 Jennifer M. Dan,1,2
Carolyn Rydyznski Moderbacher,1 Stephen A. Rawlings,2 Aaron Sutherland,1 Lakshmanane Premkumar,3
Ramesh S. Jadi,3 Daniel Marrama,1 Aravinda M. de Silva,3 April Frazier,1 Aaron F. Carlin,2 Jason A. Greenbaum,1
Bjoern Peters,1,2 Florian Krammer,4 Davey M. Smith,2 Shane Crotty,1,2,5,* and Alessandro Sette1,2,5,6,*
1Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology, La Jolla, CA 92037, USA
2Department of Medicine, Division of Infectious Diseases and Global Public Health, University of California, San Diego, La Jolla, CA
92037, USA
3Department of Microbiology and Immunology, University of North Carolina School of Medicine, Chapel Hill, NC 27599-7290, USA
4Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
6Lead Contact
*Correspondence: shane@lji.org (S.C.), alex@lji.org (A.S.)

SUMMARY
Understanding adaptive immunity to SARS-CoV-2 is important for vaccine development, interpreting coro-
navirus disease 2019 (COVID-19) pathogenesis, and calibration of pandemic control measures. Using HLA
class I and II predicted peptide ‘‘megapools,’’ circulating SARS-CoV-2-specific CD8+ and CD4+ T cells
were identified in 70% and 100% of COVID-19 convalescent patients, respectively. CD4+ T cell responses
to spike, the main target of most vaccine efforts, were robust and correlated with the magnitude of the anti-
SARS-CoV-2 IgG and IgA titers. The M, spike, and N proteins each accounted for 11%–27% of the total CD4+
response, with additional responses commonly targeting nsp3, nsp4, ORF3a, and ORF8, among others. For
CD8+ T cells, spike and M were recognized, with at least eight SARS-CoV-2 ORFs targeted. Importantly, we
detected SARS-CoV-2-reactive CD4+ T cells in 40%–60% of unexposed individuals, suggesting cross-
reactive T cell recognition between circulating ‘‘common cold’’ coronaviruses and SARS-CoV-2.
INTRODUCTION

COVID-19 is a worldwide emergency. The first cases occurred in December 2019, and now more than 240,000 deaths and 3,000,000 cases of SARS-CoV-2 infection have been reported worldwide as of May 1st (Dong et al., 2020; Wu and McGoogan, 2020). Vaccines against SARS-CoV-2 are just beginning development (Amanat and Krammer, 2020; Thanh Le et al., 2020). An understanding of human T cell responses to SARS-CoV2 is lacking, due to the rapid emergence of the pandemic. There is an urgent need for foundational information about T cell responses to this virus.
The first steps for such an understanding are the ability to quantify the virus-specific CD4+ and CD8+ T cells. Such knowledge is of immediate relevance, as it will provide insights into immunity and pathogenesis of SARS-CoV-2 infection, and the same knowledge will assist vaccine design and evaluation of candidate vaccines. Estimations of immunity are also central to epidemiological model calibration of future social distancing pandemic control measures (Kissler et al., 2020). Such projections are dramatically different depending on whether SARS-CoV-2 infection creates substantial immunity, and whether any cross-reactive immunity exists between SARS-CoV-2 and circulating seasonal “common cold” human coronaviruses. Definition and assessment of human antigen-specific SARS-CoV2 T cell responses are best made with direct ex vivo T cell assays using broad-based epitope pools and assays capable of detecting T cells of any cytokine polarization. Herein, we have completed such an assessment with blood samples from COVID-19 patients.
There is also great uncertainty about whether adaptive immune responses to SARS-CoV-2 are protective or pathogenic, or whether both scenarios can occur depending on timing, composition, or magnitude of the adaptive immune response. Hypotheses range the full gamut (Peeples, 2020), based on available clinical data from severe acute respiratory disease syndrome (SARS) or Middle East respiratory syndrome (MERS) (Alshukairi et al., 2018; Wong et al., 2004; Zhao et al., 2017) or animal model data with SARS in mice (Zhao et al., 2009, 2010, 2016), SARS in non-human primates (NHPs) (Liu et al., 2019; Takano et al., 2008) or feline infectious peritonitis virus (FIPV) in cats
(Vennema et al., 1990). Protective immunity, immunopathogenesis, and vaccine development for COVID-19 are each briefly discussed below, related to introducing the importance of defining T cell responses to SARS-CoV-2.
Based on data from SARS patients in 2003–2004 (caused by SARS-CoV, the most closely related human betacoronavirus to SARS-CoV-2), and based on the fact that most acute viral infections result in development of protective immunity (Sallusto et al., 2010), a likely possibility has been that substantial CD4+ T cell, CD8+ T cell, and neutralizing antibody responses develop to SARS-CoV-2, and all contribute to clearance of the acute infection, and, as a corollary, some of the T and B cells are retained long term (i.e., multiple years) as immunological memory and protective immunity against SARS-CoV-2 infection (Guo et al., 2020b; Li et al., 2008). However, a contrarian viewpoint is also legitimate. While most acute infections result in the development of protective immunity, available data for human coronaviruses suggest the possibility that substantive adaptive immune responses can fail to occur (Choe et al., 2017; Okba et al., 2019; Zhao et al., 2017) and robust protective immunity can fail to develop (Callow et al., 1990). A failure to develop protective immunity could occur due to a T cell and/or antibody response of insufficient magnitude or durability, with the neutralizing antibody response being dependent on the CD4+ T cell response (Crotty, 2019; Zhao et al., 2016). Thus, there is urgent need to understand the magnitude and composition of the human CD4+ and CD8+ T cell responses to SARS-CoV-2. If natural infection with SARS-CoV-2 elicits potent CD4+ and CD8+ T cell responses commonly associated with protective antiviral immunity, COVID-19 is a strong candidate for rapid vaccine development.
Immunopathogenesis in COVID-19 is a serious concern (Cao, 2020; Peeples, 2020). It is most likely that an early CD4+ and CD8+ T cell response against SARS-CoV-2 is protective, but an early response is difficult to generate because of efficient innate immune evasion mechanisms of SARS-CoV-2 in humans (Blanco-Melo et al., 2020). Immune evasion by SARS-CoV-2 is likely exacerbated by reduced myeloid cell antigen-presenting cell (APC) function or availability in the elderly (Zhao et al., 2011). In such cases, it is conceivable that late T cell responses may instead amplify pathogenic inflammatory outcomes in the presence of sustained high viral loads in the lungs, by multiple hypothetical possible mechanisms (Guo et al., 2020a; Li et al., 2008; Liu et al., 2019). Critical (ICU) and fatal COVID-19 (and SARS) outcomes are associated with elevated levels of inflammatory cytokines and chemokines, including interleukin-6 (IL-6) (Giamarellos-Bourboulis et al., 2020; Wong et al., 2004; Zhou et al., 2020).
Vaccine development against acute viral infections classically focuses on vaccine-elicited recapitulation of the type of protective immune response elicited by natural infection. Such foundational knowledge is currently missing for COVID-19, including how the balance and the phenotypes of responding cells vary as a function of disease course and severity. Such knowledge can guide selection of vaccine strategies most likely to elicit protective immunity against SARS-CoV-2. Furthermore, knowledge of the T cell responses to COVID-19 can guide selection of appropriate immunological endpoints for COVID-19 candidate vaccine clinical trials, which are already starting.
Limited information is also available about which SARS-CoV-2 proteins are recognized by human T cell immune responses. In some infections, T cell responses are strongly biased toward certain viral proteins, and the targets can vary substantially between CD4+ and CD8+ T cells (Moutaftsi et al., 2010; Tian et al., 2019). Knowledge of SARS-CoV-2 proteins and epitopes recognized by human T cell responses is of immediate relevance, as it will allow for monitoring of COVID-19 immune responses in laboratories worldwide. Epitope knowledge will also assist candidate vaccine design and facilitate evaluation of vaccine candidate immunogenicity. Almost all of the current COVID-19 vaccine candidates are focused on the spike protein.
A final key issue to consider in the study of SARS-CoV-2 immunity is whether some degree of cross-reactive coronavirus immunity exists in a fraction of the human population, and whether this might influence susceptibility to COVID-19 disease. This issue is also relevant for vaccine development, as cross-reactive immunity could influence responsiveness to candidate vaccines (Andrews et al., 2015).
In sum, the ability to measure and understand the human CD4+ and CD8+ T cell responses to SARS-CoV-2 infection is a major knowledge gap currently impeding COVID-19 vaccine development, interpretation of COVID-19 disease pathogenesis, and calibration of future social distancing pandemic control measures.

RESULTS

SARS-CoV-2 Peptides and Predicted Class I and Class II Epitopes
We recently predicted SARS-CoV-2 T cell epitopes utilizing the Immune Epitope Database and Analysis Resource (IEDB) (Dhanda et al., 2019; Vita et al., 2019). Utilizing bioinformatic approaches, we identified specific peptides in SARS-CoV-2 with increased probability of being T cell targets (Grifoni et al., 2020). We previously developed the megapool (MP) approach to allow simultaneous testing of large numbers of epitopes. By this technique, numerous epitopes are solubilized, pooled, and re-lyophilized to avoid cell toxicity problems (Carrasco Pro et al., 2015). These MPs have been used in human T cell studies of a number of indications, including allergies (Hinz et al., 2016), tuberculosis (Lindestam Arlehamn et al., 2016), tetanus, pertussis (Bancroft et al., 2016; da Silva Antunes et al., 2017), and dengue virus, for both CD4+ and CD8+ T cell epitopes (Grifoni et al., 2017; Weiskopf et al., 2015). Here, we generated MPs based on predicted SARS-CoV-2 epitopes. Specifically, one MP corresponds to 221 predicted HLA class II CD4+ T cell epitopes (Grifoni et al., 2020) covering all proteins in the viral genome, apart from the spike (S) antigen (CD4_R MP). The prediction strategy utilized is geared to capture 50% of the total response (Dhanda et al., 2018; Paul et al., 2015) and was designed and validated to predict dominant epitopes independently of ethnicity and HLA polymorphism. This approach takes advantage of the extensive cross-reactivity and repertoire overlap between different HLA class II loci and allelic variants to predict promiscuous epitopes, capable of binding many of the most common HLA class II prototypic specificities (Greenbaum
1490 Cell 181, 1489–1501, June 25, 2020
et al., 2011; O’Sullivan et al., 1991; Sidney et al., 2010a, 2010b; Southwood et al., 1998).
For the spike protein, to ensure that all T cell reactivity against this important antigen can be detected, we generated a separate MP covering the entire antigen with 253 15-mer peptides overlapping by 10 residues (MP_S, Table S1). As stated above, the MP used to probe the non-spike regions is expected to capture 50% of the total response. The use of overlapping peptides spanning entire open reading frames (ORFs) instead allows for a more complete characterization but also requires more cells. This factor should be kept in mind in terms of comparison of the magnitude of the CD4+ T cell responses to those pools.
In the case of CD8 epitopes, since the overlap between different HLA class I allelic variants and loci is more limited to specific groups of alleles, or supertypes (Sidney et al., 2008), we targeted a set of the 12 most prominent HLA class I A and B alleles, which together allow broad coverage (>85%) of the general population. Two class I MPs were synthesized based on epitope predictions for those 12 most common HLA A and B alleles (Grifoni et al., 2020), which collectively encompass 628 predicted HLA class I CD8+ T cell epitopes from the entire SARS-CoV-2 proteome (CD8 MP-A and MP-B).

Immunological Phenotypes of Recovered COVID-19 Patients
To test for the generation of SARS-CoV-2 CD4+ and CD8+ T cell responses following infection, we initially recruited 20 adult patients who had recovered from COVID-19 disease (Table 1). We also utilized peripheral blood mononuclear cell (PBMC) and plasma samples from local healthy control donors collected in 2015–2018 (see STAR Methods). Blood samples were collected at 20–35 days post-symptoms onset from non-hospitalized COVID-19 patients who were no longer symptomatic. SARS-CoV-2 infection was determined by swab test viral PCR during the acute phase of the infection. Verification of SARS-CoV-2 exposure was attempted both by lateral flow serology and SARS-CoV-2 spike protein receptor binding domain (RBD) ELISA (Stadlbauer et al., 2020), using plasma from the convalescence stage blood draw. Most patients were confirmed positive by lateral flow immunoglobulin (Ig) tests (Table 1). All patients were confirmed COVID-19 cases by SARS-CoV-2 RBD ELISA (Figures 1 and S1). All cases were IgG positive; anti-RBD IgM and IgA was also detected in the large majority of cases (Figures 1 and S1).

Table 1. Participant Characteristics
Characteristic                            Unexposed (n = 20)               COVID-19 (n = 20)
Age (years)                               20–66 (median = 31, IQR = 21)    20–64 (median = 44, IQR = 9)
Gender
  Male (%)                                35% (7/20)                       45% (9/20)
  Female (%)                              65% (13/20)                      55% (11/20)
Residency
  California (%)                          95% (19/20)                      100% (20/20)
  USA, Non-California (%)                 5% (1/20)                        0% (0/20)
Sample Collection Date                    March 2015–March 2018            March–April 2020
SARS-CoV-2 PCR Positivity                 N/A                              100% (16/16 tested)
Antibody Test Positivity (a)              N/A                              90% (18/20)
Disease Severity (b)
  Mild                                    N/A                              70% (14/20)
  Moderate                                N/A                              20% (4/20)
  Severe                                  N/A                              10% (2/20)
  Critical                                N/A                              0% (0/20)
Symptoms
  Cough                                   N/A                              79% (15/19)
  Fatigue                                 N/A                              42% (8/19)
  Fever                                   N/A                              37% (7/19)
  Anosmia                                 N/A                              21% (4/19)
  Dyspnea                                 N/A                              16% (3/19)
  Diarrhea                                N/A                              5% (1/19)
Days Post Symptom Onset at Collection     N/A                              20–36 (18/20) (median = 26, IQR = 7)
Past Medical History
  No known                                N/A                              65% (13/20)
  Hyperlipidemia                          N/A                              15% (3/20)
  Hypertension                            N/A                              10% (2/20)
  Asthma                                  N/A                              10% (2/20)
Known or suspected sick contact/exposure  N/A                              75% (15/20)
(a) Commercial skin prick lateral flow assay.
(b) WHO criteria.

We defined a 21-color flow cytometry panel of mononuclear leukocyte lineage and phenotypic markers (Table S2) to broadly assess the immunological cellular profile of recovered COVID-19 patients (Figures 1 and S2). The frequency of CD3+ cells was slightly increased in recovered COVID-19 patients relative to non-exposed controls, while no significant differences overall were observed in the frequencies of CD4+ or CD8+ T cells between the two groups. Frequencies of CD19+ cells were somewhat decreased, while no differences were observed in the frequencies of CD3–CD19– cells or CD14+CD16– monocytes (Figures 1 and S2). No evidence of general lymphopenia was observed in the convalescing patients, consistent with the literature. Next, we utilized the SARS-CoV-2 MPs to probe CD4+ and CD8+ T cell responses.

Identification and Quantitation of SARS-CoV-2-Specific CD4+ T Cell Responses
We utilized T cell receptor (TCR) dependent activation induced marker (AIM) assays to identify and quantify SARS-CoV-2-specific CD4+ T cells in recovered COVID-19 patients. Initial definition and assessment of human antigen-specific SARS-CoV-2 T cell responses are best made with direct ex vivo T cell assays using broad-based epitope pools, such as MPs, and assays capable of detecting T cells of unknown cytokine
polarization and functional attributes. AIM assays are cytokine-independent assays to identify antigen-specific CD4+ T cells (Havenar-Daughton et al., 2016; Reiss et al., 2017). AIM assays have been successfully used to identify virus-specific, vaccine-specific, or tuberculosis-specific CD4+ T cells in a range of studies (Dan et al., 2016, 2019; Herati et al., 2017; Morou et al., 2019).

Figure 1. SARS-CoV-2 IgM, IgA, and IgG Responses of Recovered COVID-19 Patients
(A–C) Plasma ELISA titers to SARS-CoV-2 spike RBD. (A) IgG. (B) IgM. (C) IgA. Neg, unexposed donors from 2015–2018 (n = 20); COVID, convalescing COVID-19 patients (n = 20). All data are shown as ELISA titers based on a standard. The dotted line indicates limit of detection. Geometric mean titers with geometric SDs are indicated.
(D–I) Immunophenotyping of mononuclear leukocytes. Frequency of (D) CD3+ total T cells, (E) CD4+ T cells (CD4+CD3+), (F) CD8+ T cells (CD8+CD3+), (G) CD19+ B cells (CD19+CD3–), (H) CD3–CD19– cells, and (I) CD14+CD16– monocytes (CD3–CD19–CD56–) from the PBMCs of unexposed donors (Neg, n = 13) or convalescing COVID-19 patients (COVID, n = 14). Data were analyzed using the Mann-Whitney test with mean and standard deviation shown.
*p < 0.05, ****p < 0.0001. See also Figures S1 and S2.

We stimulated PBMCs from 10 COVID-19 cases and 11 healthy controls (SARS-CoV-2 unexposed, collected in 2015–2018) with a spike MP (MP_S) and the class II MP covering the remainder of the SARS-CoV-2 orfeome (“non-spike,” MP CD4_R). A CMV MP was used as a positive control, while DMSO was used as the negative control (Figures 2 and S3). SARS-CoV-2 spike-specific CD4+ T cell responses (OX40+CD137+) were detected in 100% of COVID-19 cases (p < 0.0001 versus unexposed donors spike MP, Figures 2A and 2B. p = 0.002 versus DMSO control, Figure 2C). CD4+ T cell responses to the remainder of the SARS-CoV-2 orfeome were also detected in 100% of COVID-19 cases (p < 0.0079 versus unexposed donors non-spike MP, Figures 2A and 2B. p = 0.002, non-spike versus DMSO control, Figure 2C). The magnitude of the SARS-CoV-2-specific CD4+ T cell responses measured was similar to that of the CMV MP (Figure S3C). The concordance between SARS-CoV-2-specific CD4+ T cell measurements in independent experiments was high (p < 0.0002, Figure S3D). To assess functionality and polarization of the SARS-CoV-2-specific CD4+ T cell response, we measured cytokines secreted in response to MP stimulation. The SARS-CoV-2-specific CD4+ T cells were functional, as the cells produced IL-2 in response to non-spike and spike MPs (Figure 2D). Polarization of the cells appeared to be a classic TH1 type, as substantial interferon (IFN)-γ was produced (Figure 2E), while little to no IL-4, IL-5, IL-13, or IL-17a was expressed (Figures S3G–S3J).
Thus, recovered COVID-19 patients consistently generated a substantial CD4+ T cell response against SARS-CoV-2. Similar conclusions were reached using stimulation index as the metric (Figures S3E and S3F). In terms of total CD4+ T cell response per donor (Figure 2A), on average 50% of the detected response was directed against the spike protein, and 50% was directed against the MP representing the remainder of the SARS-CoV-2 orfeome (Figure 2A). This is of significance, since the SARS-CoV-2 spike protein is a key component of the vast majority of candidate COVID-19 vaccines under development. Of note, given the nature of the MP_R peptide predictions, the actual CD4+ T cell response to be ascribed to non-spike ORFs was likely to be higher, addressed in further experiments below.

Identification and Quantitation of SARS-CoV-2-Specific CD8+ T Cell Responses
To measure SARS-CoV-2-specific CD8+ T cells in the recovered COVID-19 patients, we utilized two complementary methodologies, AIM assays and intracellular cytokine staining (ICS). The two SARS-CoV-2 class I MPs were used, CD8-A and CD8-B, with CMV MP and DMSO serving as positive and negative controls, respectively (Figures 3 and S4). CD8+ T cell responses were detected by AIM (CD69+CD137+) in 70% of COVID-19 cases (p < 0.0011 versus unexposed donors “CD8 total,”
Figure 2. SARS-CoV-2-Specific CD4+ T Cell Responses of Recovered COVID-19 Patients

(A) SARS-CoV-2-specific CD4+ T cells measured as percentage of AIM+ (OX40+CD137+) CD4+ T cells after stimulation of PBMCs with peptide pools encom-
passing spike only (Spike) MP or the CD4_R MP representing all the proteome without spike (Non-spike). Data were background subtracted against DMSO
negative control and are shown with geometric mean and geometric standard deviation. Samples were from unexposed donors (Unexposed, n = 11) and
recovered COVID-19 patients (COVID-19, n = 10).
(B) Fluorescence-activated cell sorting (FACS) plot examples, gated on total CD4+ T cells.
(C) AIM+ CD4+ T cell reactivity in COVID-19 cases between the negative control (DMSO) and antigen-specific stimulations.
(D and E) Cytokine levels in the supernatant of PBMCs from COVID-19 donors after stimulation with peptide pools (Spike and Non-spike) or the negative control
(DMSO). (D) IL-2. (E) IFN-γ.
Statistical comparisons across cohorts were performed with the Mann-Whitney test. Pairwise comparisons (C–E) were performed with the Wilcoxon test. **p <
0.01; ***p < 0.001. See also Figure S3 and Table S6.
Figures 3A and 3B; p = 0.002, CD8-A or CD8-B versus DMSO control, Figure S4B). MP CD8-A contains spike epitopes, among epitopes to other proteins. The magnitude of the SARS-CoV-2 reactive CD8+ T cell responses measured by AIM was somewhat lower than the CMV MP (Figure S4C). Similar conclusions were reached using stimulation index (Figures S3D and S3E).
Independently, ICS assays detected IFN-γ+ SARS-CoV-2-specific CD8+ T cells in the majority of COVID-19 cases (Figures 3C and 3D). The majority of IFN-γ+ cells co-expressed granzyme B (Figures 3D and 3E). A substantial fraction of the IFN-γ+ cells expressed tumor necrosis factor (TNF) but not IL-10 (Figure 3D). Thus, the majority of recovered COVID-19 patients generated a CD8+ T cell response against SARS-CoV-2.

Relationship between SARS-CoV-2-Specific CD4+ T Cell Responses and IgG and IgA Titers
Most protective antibody responses are dependent on CD4+ T cell help. Therefore, we assessed whether stronger SARS-CoV-2-specific CD4+ T cell responses were associated with higher antibody titers in COVID-19 cases. Given that spike is the primary target of SARS neutralizing antibodies, we examined spike-specific CD4+ T cells. Spike-specific CD4+ T cell responses correlated well with the magnitude of the anti-spike RBD IgG titers (R = 0.81; p < 0.0001; Figure 4A). Similar results were obtained using stimulation index (Figure S5A). The non-spike SARS-CoV-2-specific CD4+ T cell response did not correlate as well with anti-spike RBD IgG titers (Figures 4B and S5B), consistent with a common requirement for intramolecular CD4+ T cell help (Sette et al., 2008). Anti-spike IgA titers also correlated with spike-specific CD4+ T cells (p < 0.0002, Figure S5). Thus, COVID-19 patients make anti-spike RBD antibody responses commensurate with the magnitude of their spike-specific CD4+ T cell response. We then assessed the relationship between the CD4+ and CD8+ T cell responses to SARS-CoV-2. SARS-CoV-2-specific CD4+ and CD8+ T cell responses were well correlated (R = 0.62; p = 0.0025; Figures 4C and S5). Thus, antibody, CD4+, and CD8+ T cell responses to SARS-CoV-2 were generally well correlated.
Figure 3. SARS-CoV-2-Specific CD8+ T Cell Responses by Recovered COVID-19 Patients

(A) SARS-CoV-2-specific CD8+ T cells measured as percentage of AIM+ (CD69+CD137+) CD8+ T cells after stimulation of PBMCs with class I MPs (CD8-A, CD8-
B, and the combined data [Total]). Data were background subtracted against DMSO negative control and are shown with geometric mean and geometric
standard deviation. Samples were from unexposed donors (Unexposed, n = 11) and recovered COVID-19 patients (COVID-19, n = 10).
(B) FACS plot examples.
(C) Percentage of CD8+ T cells producing IFN-γ in response to SARS-CoV-2 MPs, or CMV MP, in PBMCs from COVID-19 and unexposed donors after background subtraction. Data are shown with geometric mean and geometric standard deviation.
(D) Functional profile of IFN-γ+CD8+ T cells producing granzyme B (GzB), TNF-α (TNF), or IL-10 in response to SARS-CoV-2 MPs. Mean and SD are shown.
(E) FACS plot examples of IFN-γ and granzyme B co-expression.
Statistical comparisons across cohorts were performed with the Mann-Whitney test. *p < 0.05; **p < 0.01; ns, not significant. See also Figure S4 and Table S6.
Pre-existing Cross-Reactive Coronavirus-Specific T Cells
While spike- and non-spike-specific CD4+ T cell responses were detectable in all COVID-19 cases, cells were also detected in unexposed individuals (Figures 3A and 3B). These responses were statistically significant for non-spike-specific CD4+ T cell reactivity (non-spike, p = 0.039; spike, p = 0.067; Figures 5A and 5B). Non-spike-specific CD4+ T cell responses were above the limit of detection in 50% of donors based on stimulation index (SI) (Figure S3E). All of the donors were recruited between 2015 and 2018, excluding any possibility of exposure to SARS-CoV-2. Four human coronaviruses are known causes of seasonal “common cold” upper-respiratory tract infections: HCoV-OC43, HCoV-HKU1, HCoV-NL63, and HCoV-229E. We tested the SARS-CoV-2 unexposed donors for seroreactivity to HCoV-OC43 and HCoV-NL63 as a representative betacoronavirus and alphacoronavirus, respectively. All donors were IgG seropositive to HCoV-OC43 and HCoV-NL63 RBD, to varying degrees (Figure 5C), consistent with the endemic nature of these viruses (Gorse et al., 2010; Huang et al., 2020; Severance et al., 2008). We therefore examined whether these represented true pan-coronavirus T cells capable of recognizing SARS-CoV-2 epitopes.

SARS-CoV-2 ORF Targets of CD4+ and CD8+ T Cells
A most pressing, yet unresolved, set of issues in understanding SARS-CoV-2 immune responses is what antigens are targeted by CD4+ and CD8+ T cells, whether the corresponding antigens

Figure 4. Correlations between SARS-CoV-2-Specific CD4+ T Cells, Antibodies, and CD8+ T Cells
(A) Correlation between SARS-CoV-2 spike-specific CD4+ T cells (%) and anti-spike RBD IgG.
(B) Correlation between SARS-CoV-2 non-spike-specific CD4+ T cells (%) and anti-spike RBD IgG.
(C) Correlation between SARS-CoV-2-specific CD4+ T cells and SARS-CoV-2-specific CD8+ T cells. Total MP responses per donor were used in each case
(“Non-spike” + “spike” [CD4_R + MP_S] for CD4+ T cells, CD8_A + CD8_B for CD8+ T cells).
Statistical comparisons were performed using Spearman correlation. See also Figure S5.
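For readers unfamiliar with the statistic named in this caption, the following is a minimal sketch of a Spearman rank correlation: rank each variable (averaging ranks for ties) and take the Pearson correlation of the ranks. The function names and the per-donor numbers are hypothetical, and the sketch does not compute a p value; it is not the authors' analysis code.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-donor values (not data from the paper).
cd4_percent = [0.10, 0.25, 0.40, 0.55, 0.90]
igg_titer = [200, 150, 800, 600, 1600]
print(f"rho = {spearman_rho(cd4_percent, igg_titer):.2f}")
```

Because the calculation uses ranks only, it is insensitive to the scale of either axis, which suits percentages plotted against titers.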
are the same or different, and how they reflect the antigens currently considered for COVID-19 vaccine development. We synthesized sets of overlapping peptides spanning the entire sequence of SARS-CoV-2 and pooled them separately so that each pool would represent one antigen (with the exception of nsp3, for which two pools were made; Table S1).
In the case of CD4+ T cell responses, no obvious pattern of antigen specificity was observed based on SARS-CoV-2 genome organization; however, coronaviruses increase protein synthesis of certain ORFs in infected cells via subgenomic RNAs. Accounting for the relative abundance of subgenomic RNAs (Figure 6A) (Irigoyen et al., 2016; Snijder et al., 2003; Xie et al., 2020), the ORFs were re-ordered based on predicted protein abundance (Figure 6B). A clear hierarchy of SARS-CoV-2-specific CD4+ T cell targets was then apparent, with the majority of the CD4+ T cell response in COVID-19 cases directed against highly expressed SARS-CoV-2 ORFs spike, M, and N. On average, these antigens accounted for 27%, 21%, and 11% of the total CD4+ T cell response, respectively. Most COVID-19 cases also had CD4+ T cells specific for SARS-CoV-2 nsp3, nsp4, and ORF8 (Figure 6B), on average each accounting for 5% of the total CD4+ T cell response (Figure 6C). E, ORF6, hypothetical ORF10, and nsp1 are all small antigens (or potentially not expressed, in the case of ORF10) and were most likely predominantly unrecognized as a result. These results are somewhat unexpected, because data for other coronaviruses, from 27 different studies curated in the IEDB, reported that spike accounted for nearly two-thirds of reported CD4+ T cell reactivity (Table S3). N accounted for most of the remaining epitopes in the published literature, although human N-specific CD4+ T cell responses were not observed in one of the most comprehensive studies of human SARS-CoV-1 T cell responses (Li et al., 2008). Coronavirus M has not previously been described as a prominent target of CD4+ T cell responses (Table S3). In sum, these results, fully scanning the SARS-CoV-2 orfeome, demonstrate a pattern of robust and diverse SARS-CoV-2-specific CD4+ T cell reactivity in convalescing COVID-19 cases that correlated largely with predicted viral protein abundance in infected cells.
When examining the unexposed donors, the pattern of CD4+ T cell targets changed. While S was still a relatively prominent target (23% of total, on average), there was no, or marginal, reactivity against SARS-CoV-2 N and M. Among donors with detectable CD4+ T cells, a shift in reactivity was observed toward SARS-CoV-2 nsp14 (25%), nsp4 (15%), and nsp6 (14%) (Figures 6B and 6C). SARS-CoV-2-reactive CD4+ T cells were detected in at least six different unexposed donors, demonstrating that the cross-reactivity is relatively widely distributed (Figure S6A).
Having scanned the full SARS-CoV-2 orfeome for CD4+ T cell reactivity in multiple donors, it was possible to assess whether the epitope prediction MP approach successfully enriched for SARS-CoV-2 epitopes targeted by human CD4+ T cells. When the total reactivity observed with the CD4_R MP was plotted versus the sum total of all antigen pools (excluding spike, given that spike predictions were not included in the CD4_R MP), a significant correlation was observed (p < 0.0002, Figure S6C). The single MP-R captured 50% (44% ± range 28%–80%) of the non-spike response per COVID-19 donor, demonstrating the success of the prediction approach, which, as mentioned above, was devised to attempt to capture approximately 50% of the total response (Dhanda et al., 2018; Paul et al., 2015).
In the case of CD8+ T cell responses, the data in the literature from other coronaviruses (57 different studies curated in the IEDB; Table S3) reported spike accounting for 50% and N accounting for 36% of the defined epitopes. In a large study of human SARS-CoV-1 responses, spike was reported as essentially the only target of CD8+ T cell responses (Li et al., 2008), while in a study of MERS CD8+ T cells, responses were noted for spike, N, and a pool of M/E peptides (Zhao et al., 2017). Few epitopes have been reported from other coronavirus antigens (Table S3). Here, we scanned the full SARS-CoV-2 orfeome for CD8+ T cell recognition. Our data indicate a somewhat different pattern of immunodominance for SARS-CoV-2 CD8+ T cell reactivity (Figures 6D and 6E), with spike protein accounting for 26% of the reactivity, and N accounting for 12%. Significant reactivity in COVID-19 recovered subjects was derived from other antigens, such as M (22%), nsp6 (15%), ORF8 (10%), and ORF3a (7%) (Figures 6D and 6E). In unexposed donors, SARS-

Figure 5. SARS-CoV-2 Epitope Reactivity in Unexposed Individuals

(A) SARS-CoV-2-reactive CD4+ T cells measured as percentage of AIM+ (OX40+CD137+) CD4+ T cells in unexposed (n = 11) donors.
(B) FACS plot examples, gated on total CD4+ T cells.
(C) Plasma IgG ELISAs for seroreactivity to RBD of HCoV-OC43 or HCoV-NL63. Data are expressed as geometric mean and geometric SD.
Pairwise statistical comparisons (A) were performed with the Wilcoxon test. *p < 0.05; ns, not significant.
CoV-2-reactive CD8+ T cells were detected in at least four different donors (Figure S7), with less clear targeting of specific SARS-CoV-2 proteins than was observed for CD4+ T cells, suggesting that coronavirus CD8+ T cell cross-reactivity exists but is less widespread than CD4+ T cell cross-reactivity.

DISCUSSION

There is a critical need for foundational knowledge about T cell responses to SARS-CoV-2. Here, we report functional validation of predicted epitopes when arranged in epitope MPs, utilizing PBMCs derived from convalescing COVID-19 cases. The experiments also used protein-specific peptide pools to determine which SARS-CoV-2 proteins are the predominant targets of human SARS-CoV-2-specific CD4+ and CD8+ T cells generated during COVID-19 disease. Importantly, we utilized the exact same series of experimental techniques with blood samples from healthy control donors (PBMCs collected in the 2015–2018 time frame), and substantial cross-reactive coronavirus T cell memory was observed.
Our results demonstrate that the epitope MPs are reagents well suited to analyze and detect SARS-CoV-2-specific T cell responses with limited sample material. We also developed and tested peptide pools corresponding to each of the 25 proteins encoded in the SARS-CoV-2 genome. Data from both the epitope MPs and protein peptide pool experiments can be interpreted in the context of previously reported T cell response immunodominance patterns observed for other coronaviruses, particularly the SARS and MERS viruses, which have been studied in humans, HLA-transgenic mice, wild-type mice, and other species. In the case of CD4+ T cell responses, data for other coronaviruses found that spike accounted for nearly two-thirds of reported CD4+ T cell reactivity, with N and M accounting for limited reactivity, and no reactivity in one large study of human SARS-CoV-1 responses (Li et al., 2008). Our SARS-CoV-2 data reveal that the pattern of immunodominance in COVID-19 is different. In particular, M, spike, and N proteins were clearly co-dominant, each recognized by 100% of COVID-19 cases studied here. Significant CD4+ T cell responses were also directed against nsp3, nsp4, ORF3a, ORF7a, nsp12, and ORF8. These data suggest that a candidate COVID-19 vaccine consisting only of SARS-CoV-2 spike would be capable of eliciting SARS-CoV-2-specific CD4+ T cell responses of similar representation to that of natural COVID-19 disease, but the data also indicate that there are many potential CD4+ T cell targets in SARS-CoV-2, and inclusion of additional SARS-CoV-2 structural antigens such as M and N would better mimic the natural SARS-CoV-2-specific CD4+ T cell response observed in mild to moderate COVID-19 disease.
Regarding SARS-CoV-2 CD8+ T cell responses, the pattern of immunodominance found here differed from the literature for other coronaviruses. However, stringent comparisons are not possible, as some earlier studies were not similarly comprehensive and did not utilize the same experimental strategy. The spike protein was a target of human SARS-CoV-2 CD8+ T cell responses, but it was not dominant. SARS-CoV-2 M was just as strongly recognized, and significant reactivity was noted for other antigens, mostly nsp6, ORF3a, and N, which comprised nearly 50% of the total CD8+ T cell response, on average. Thus, these data indicate that candidate COVID-19 vaccines endeavoring to elicit CD8+ T cell responses against the spike protein will be eliciting a relatively narrow CD8+ T cell response compared to the natural CD8+ T cell response observed in mild to moderate COVID-19 disease. An optimal vaccine CD8+ T cell response to SARS-CoV-2 might benefit from additional class I epitopes, such as those derived from M, nsp6, ORF3a, and/or N.
There have been concerns regarding vaccine enhancement of disease by certain candidate COVID-19 vaccine approaches, via antibody-dependent enhancement (ADE) or development of TH2 responses (Peeples, 2020). Herein, we saw predominant TH1 responses in convalescing COVID-19 cases, with little to no TH2 cytokines. Clearly more studies are required, but the data here appear to predominantly represent a classic TH1 response to SARS-CoV-2.

Figure 6. Protein Immunodominance of SARS-CoV-2-Specific CD4+ and CD8+ T Cells in COVID-19 Cases and Unexposed Donors
(A) SARS-CoV-2 genome organization and predicted viral protein abundance in infected cells.
(B) SARS-CoV-2 antigen-specific CD4+ T cells (AIM+, OX40+CD137+) quantified by stimulation index, using a peptide pool for each viral protein (with two exceptions, see Table S1). COVID-19 cases (top, in blue; n = 10) and unexposed donors (bottom, in white; n = 10). Data are expressed as geometric mean and geometric SD.
(C) Fraction of SARS-CoV-2 proteins recognized by CD4+ T cells in COVID-19 cases (top) and unexposed donors (bottom).
(D) SARS-CoV-2 antigen-specific CD8+ T cells (AIM+, CD69+CD137+) quantified by stimulation index, using a peptide pool for each viral protein (with two exceptions, see Table S1). COVID-19 cases (top, in red; n = 10) and unexposed donors (bottom, in gray; n = 10). Data are expressed as geometric mean and geometric SD.
(E) Fraction of SARS-CoV-2 proteins recognized by CD8+ T cells in COVID-19 cases (top) and unexposed donors (bottom).
See also Figures S6 and S7 and Table S6.
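The two quantities reported in this figure can be illustrated with a short sketch. It assumes that the stimulation index is the ratio of the stimulated AIM+ percentage to the DMSO control percentage, and that each protein's share is its response divided by the summed response across all pools; both definitions, the function names, and the numbers are assumptions for illustration rather than the authors' stated procedure.

```python
def stimulation_index(stimulated_pct, dmso_pct):
    """Ratio of the stimulated response to the DMSO control (assumed definition)."""
    if dmso_pct <= 0:
        raise ValueError("DMSO control must be positive to form a ratio")
    return stimulated_pct / dmso_pct


def response_fractions(responses):
    """Map each antigen to its share of the summed response across all pools."""
    total = sum(responses.values())
    if total <= 0:
        return {name: 0.0 for name in responses}
    return {name: value / total for name, value in responses.items()}


# Hypothetical per-pool responses for one donor (not data from the paper).
donor = {"spike": 0.54, "M": 0.42, "N": 0.22, "nsp3": 0.10}
shares = response_fractions(donor)
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.0%} of total response")
```

Averaging such per-donor shares across a cohort would give figures comparable in form to the percentages quoted in the text, under the assumptions stated above.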

While it was important to identify antigen-specific T cell responses in COVID-19 cases, it is also of great interest to understand whether cross-reactive immunity exists between coronaviruses to any degree. A key step in developing that understanding is to examine antigen-specific CD4+ and CD8+ T cells in COVID-19 cases and in unexposed healthy controls, utilizing the exact same antigens and series of experimental techniques. CD4+ T cell responses were detected in 40%–60% of unexposed individuals. This may be reflective of some degree of cross-reactive, preexisting immunity to SARS-CoV-2 in some, but not all, individuals. Whether this immunity is relevant in influencing clinical outcomes is unknown (and cannot be known without T cell measurements before and after SARS-CoV-2 infection of individuals), but it is tempting to speculate that the cross-reactive CD4+ T cells may be of value in protective immunity, based on SARS mouse models (Zhao et al., 2016). Clear identification of the cross-reactive peptides, and their sequence homology relation to other coronaviruses, requires deconvolution of the positive peptide pools, which is not feasible with the cell numbers presently available, and time frame of the present study.
Regarding the value of cross-reactive T cells, influenza (flu) immunology in relationship to pandemics may be instructive. In the context of the 2009 H1N1 influenza pandemic, preexisting T cell immunity existed in the adult population, which focused on the more conserved internal influenza viral proteins (Greenbaum et al., 2009). The presence of cross-reactive T cells was found to correlate with less severe disease (Sridhar et al., 2013; Wilkinson et al., 2012). The frequent availability of cross-reactive memory T cell responses might have been one factor contributing to the lesser severity of the H1N1 flu pandemic (Hancock et al., 2009). Cross-reactive immunity to influenza strains has been modeled to be a critical influencer of susceptibility to newly emerging, potentially pandemic, influenza strains (Gostic et al., 2016). Given the severity of the ongoing COVID-19 pandemic, it has been modeled that any degree of cross-protective coronavirus immunity in the population could have a very substantial impact on the overall course of the pandemic, and the dynamics of the epidemiology for years to come (Kissler et al., 2020).

Limitations and Future Directions
Caveats of this study include the sample size and the focus on non-hospitalized COVID-19 cases. Sample size was limited by expediency. The focus on non-hospitalized cases of COVID-19 is a strength, in that these donors had uncomplicated disease of moderate duration, and thus it was encouraging that substantial CD4+ T cell and antibody responses were detected in all cases, and CD8+ T cell responses in the majority of cases. Complementing these data with MP T cell data from acute patients and patients with complicated disease course will also be of clear value, as will studies on the longevity of SARS-CoV-2 immunological memory. Additionally, lack of detailed information on common cold history or matched blood samples pre-exposure to SARS-CoV-2 prevents conclusions regarding the abundance of cross-reactive coronavirus T cells before exposure to SARS-CoV-2 and any potential protective efficacy of such cells. Finally, full epitope mapping in the future will add important detailed resolution of the human coronavirus-specific T cell responses.
In sum, we measured SARS-CoV-2-specific CD4+ and CD8+ T cell responses in COVID-19 cases. Using multiple experimental approaches, SARS-CoV-2-specific CD4+ T cell and antibody responses were observed in all COVID-19 cases, and CD8+ T cell responses were observed in most. Importantly, pre-existing SARS-CoV-2-cross-reactive T cell responses were observed in healthy donors, indicating some potential for pre-existing immunity in the human population. ORF mapping of T cell specificities revealed valuable targets for incorporation in candidate vaccine development and revealed distinct specificity patterns between COVID-19 cases and unexposed healthy controls.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

● KEY RESOURCES TABLE
  ○ Lead Contact
  ○ Materials Availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Human Subjects
● METHOD DETAILS
  ○ Peptide Pools
  ○ PBMC isolation
  ○ SARS-CoV-2 RBD ELISA
  ○ OC43 and NL63 coronavirus RBD ELISA
  ○ Flow Cytometry
  ○ Cytokine bead assays
  ○ Identification of coronavirus epitopes and associated literature references
● QUANTIFICATION AND STATISTICAL ANALYSIS

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.015.

ACKNOWLEDGMENTS

We would like to thank Cheryl Kim, director of the LJI flow cytometry core facility, for outstanding expertise. We thank Prof. Peter Kim, Abigail Powell, PhD, and colleagues (Stanford) for RBD protein synthesized from Prof. Florian Krammer (Mt. Sinai) constructs. J.M. was supported by PhD student fellowships from the Departamento Administrativo de Ciencia, Tecnologia e Innovacion (COLCIENCIAS), and Pontificia Universidad Javeriana. This work was funded by the NIH NIAID under awards AI42742 (Cooperative Centers for Human Immunology) (S.C. and A.S.), National Institutes of Health contract Nr. 75N9301900065 (A.S. and D.W.), and U19 AI118626 (A.S. and B.P.). The BD FACSymphony purchase was partially funded by the Bill and Melinda Gates Foundation and LJI Institutional Funds (S.C. and A.S.). This work was additionally supported in part by the Johnathan and Mary Tu Foundation (D.M.S.), the NIAID under K08 award AI135078 (J.D.), and UCSD T32s AI007036 and AI007384 Infectious Diseases Division (S.I.R. and S.A.R.).

AUTHOR CONTRIBUTIONS

Conceptualization, A.G., D.W., S.C., and A.S.; Investigation, A.G., D.W., J.M., C.R.M., J.M.D., D.M., L.P., R.S.J., A.S., and D.W.; Formal Analysis, A.G., D.W., C.R.M., J.M.D., J.M., and S.C.; Resources, S.I.R., S.A.R., D.M.S., A.F.C., F.K., S.C., and A.S.; Data Curation, J.A.G. and B.P.; Writing, S.C., A.S., A.G., and D.W.; Supervision, B.P., A.M.d.S., S.C., and A.S.; Project Administration, A.F.; Funding Acquisition, S.C., A.S., D.W., D.S., and J.D.

DECLARATION OF INTERESTS

The authors declare no competing interests.

Received: April 20, 2020
Revised: May 4, 2020
Accepted: May 7, 2020
Published: May 14, 2020

REFERENCES

Alshukairi, A.N., Zheng, J., Zhao, J., Nehdi, A., Baharoon, S.A., Layqah, L., Bokhari, A., Al Johani, S.M., Samman, N., Boudjelal, M., et al. (2018). High Prevalence of MERS-CoV Infection in Camel Workers in Saudi Arabia. MBio 9, e01985–e01918.
Amanat, F., and Krammer, F. (2020). SARS-CoV-2 Vaccines: Status Report. Immunity 52, 583–589.
Andrews, S.F., Huang, Y., Kaur, K., Popova, L.I., Ho, I.Y., Pauli, N.T., Henry Dunand, C.J., Taylor, W.M., Lim, S., Huang, M., et al. (2015). Immune history profoundly affects broadly protective B cell responses to influenza. Sci. Transl. Med. 7, 316ra192.
Bancroft, T., Dillon, M.B., da Silva Antunes, R., Paul, S., Peters, B., Crotty, S., Lindestam Arlehamn, C.S., and Sette, A. (2016). Th1 versus Th2 T cell polarization by whole-cell and acellular childhood pertussis vaccines persists upon re-immunization in adolescence and adulthood. Cell. Immunol. 304–305, 35–43.
Blanco-Melo, D., Nilsson-Payant, B.E., Liu, W.-C., Møller, R., Panis, M., Sachs, D., Albrecht, R.A., and tenOever, B.R. (2020). SARS-CoV-2 launches a unique transcriptional signature from in vitro, ex vivo, and in vivo systems. bioRxiv. https://doi.org/10.1101/2020.03.24.004655.
Callow, K.A., Parry, H.F., Sergeant, M., and Tyrrell, D.A. (1990). The time course of the immune response to experimental coronavirus infection of man. Epidemiol. Infect. 105, 435–446.
Cao, X. (2020). COVID-19: immunopathology and its implications for therapy. Nat. Rev. Immunol. 20, 269–270.
Carrasco Pro, S., Sidney, J., Paul, S., Lindestam Arlehamn, C., Weiskopf, D., Peters, B., and Sette, A. (2015). Automatic Generation of Validated Specific Epitope Sets. J. Immunol. Res. 2015, 763461.
Choe, P.G., Perera, R.A.P.M., Park, W.B., Song, K.H., Bang, J.H., Kim, E.S., Kim, H.B., Ko, L.W.R., Park, S.W., Kim, N.J., et al. (2017). MERS-CoV Antibody Responses 1 Year after Symptom Onset, South Korea, 2015. Emerg. Infect. Dis. 23, 1079–1084.
Crotty, S. (2019). T Follicular Helper Cell Biology: A Decade of Discovery and Diseases. Immunity 50, 1132–1148.
da Silva Antunes, R., Paul, S., Sidney, J., Weiskopf, D., Dan, J.M., Phillips, E., Mallal, S., Crotty, S., Sette, A., and Lindestam Arlehamn, C.S. (2017). Definition of Human Epitopes Recognized in Tetanus Toxoid and Development of an Assay Strategy to Detect Ex Vivo Tetanus CD4+ T Cell Responses. PLoS ONE 12, e0169086.
Dan, J.M., Lindestam Arlehamn, C.S., Weiskopf, D., da Silva Antunes, R., Havenar-Daughton, C., Reiss, S.M., Brigger, M., Bothwell, M., Sette, A., and Crotty, S. (2016). A Cytokine-Independent Approach To Identify Antigen-Specific Human Germinal Center T Follicular Helper Cells and Rare Antigen-Specific CD4+ T Cells in Blood. J. Immunol. 197, 983–993.
Dan, J.M., Havenar-Daughton, C., Kendric, K., Al-Kolla, R., Kaushik, K., Rosales, S.L., Anderson, E.L., LaRock, C.N., Vijayanand, P., Seumois, G., et al. (2019). Recurrent group A Streptococcus tonsillitis is an immunosusceptibility disease involving antibody deficiency and aberrant TFH cells. Sci. Transl. Med. 11, eaau3776.
Dhanda, S.K., Karosiene, E., Edwards, L., Grifoni, A., Paul, S., Andreatta, M., Weiskopf, D., Sidney, J., Nielsen, M., Peters, B., and Sette, A. (2018). Predicting HLA CD4 Immunogenicity in Human Populations. Front. Immunol. 9, 1369.
Dhanda, S.K., Mahajan, S., Paul, S., Yan, Z., Kim, H., Jespersen, M.C., Jurtz, V., Andreatta, M., Greenbaum, J.A., Marcatili, P., et al. (2019). IEDB-AR: immune epitope database-analysis resource in 2019. Nucleic Acids Res. 47 (W1), W502–W506.
Dong, E., Du, H., and Gardner, L. (2020). An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect. Dis. 20, 533–534.
Giamarellos-Bourboulis, E.J., Netea, M.G., Rovina, N., Akinosoglou, K., Antoniadou, A., Antonakos, N., Damoraki, G., Gkavogianni, T., Adami, M.E., Katsaounou, P., et al. (2020). Complex Immune Dysregulation in COVID-19 Patients with Severe Respiratory Failure. Cell Host Microbe. Published online April 17, 2020. https://doi.org/10.1016/j.chom.2020.04.009.
Gorse, G.J., Patel, G.B., Vitale, J.N., and O’Connor, T.Z. (2010). Prevalence of antibodies to four human coronaviruses is lower in nasal secretions than in serum. Clin. Vaccine Immunol. 17, 1875–1880.
Gostic, K.M., Ambrose, M., Worobey, M., and Lloyd-Smith, J.O. (2016). Potent protection against H5N1 and H7N9 influenza via childhood hemagglutinin imprinting. Science 354, 722–726.
Greenbaum, J.A., Kotturi, M.F., Kim, Y., Oseroff, C., Vaughan, K., Salimi, N., Vita, R., Ponomarenko, J., Scheuermann, R.H., Sette, A., and Peters, B. (2009). Pre-existing immunity against swine-origin H1N1 influenza viruses in the general human population. Proc. Natl. Acad. Sci. USA 106, 20365–20370.
Greenbaum, J., Sidney, J., Chung, J., Brander, C., Peters, B., and Sette, A. (2011). Functional classification of class II human leukocyte antigen (HLA) molecules reveals seven different supertypes and a surprising degree of repertoire sharing across supertypes. Immunogenetics 63, 325–335.
Grifoni, A., Angelo, M.A., Lopez, B., O’Rourke, P.H., Sidney, J., Cerpas, C., Balmaseda, A., Silveira, C.G.T., Maestri, A., Costa, P.R., et al. (2017). Global Assessment of Dengue Virus-Specific CD4+ T Cell Responses in Dengue-Endemic Areas. Front. Immunol. 8, 1309.
Grifoni, A., Sidney, J., Zhang, Y., Scheuermann, R.H., Peters, B., and Sette, A. (2020). A Sequence Homology and Bioinformatic Approach Can Predict Candidate Targets for Immune Responses to SARS-CoV-2. Cell Host Microbe 27, 671–680.
Guo, T., Fan, Y., Chen, M., Wu, X., Zhang, L., He, T., Wang, H., Wan, J., Wang, X., and Lu, Z. (2020a). Cardiovascular Implications of Fatal Outcomes of Patients With Coronavirus Disease 2019 (COVID-19). JAMA Cardiol. Published online March 27, 2020. https://doi.org/10.1001/jamacardio.2020.1017.
Guo, X., Guo, Z., Duan, C., Chen, Z., Wang, G., Lu, Y., Li, M., and Lu, J. (2020b). Long-Term Persistence of IgG Antibodies in SARS-CoV Infected Healthcare Workers. medRxiv. https://doi.org/10.1101/2020.02.12.20021386.
Hancock, K., Veguilla, V., Lu, X., Zhong, W., Butler, E.N., Sun, H., Liu, F., Dong, L., DeVos, J.R., Gargiullo, P.M., et al. (2009). Cross-reactive antibody responses to the 2009 pandemic H1N1 influenza virus. N. Engl. J. Med. 361, 1945–1952.
Havenar-Daughton, C., Reiss, S.M., Carnathan, D.G., Wu, J.E., Kendric, K., Torrents de la Peña, A., Kasturi, S.P., Dan, J.M., Bothwell, M., Sanders, R.W., et al. (2016). Cytokine-Independent Detection of Antigen-Specific Germinal Center T Follicular Helper Cells in Immunized Nonhuman Primates Using a Live Cell Activation-Induced Marker Technique. J. Immunol. 197, 994–1002.
Herati, R.S., Muselman, A., Vella, L., Bengsch, B., Parkhouse, K., Del Alcazar, D., Kotzin, J., Doyle, S.A., Tebas, P., Hensley, S.E., et al. (2017). Successive annual influenza vaccination induces a recurrent oligoclonotypic memory response in circulating T follicular helper cells. Sci. Immunol. 2, eaag2152.

Hinz, D., Seumois, G., Gholami, A.M., Greenbaum, J.A., Lane, J., White, B., Sallusto, F., Lanzavecchia, A., Araki, K., and Ahmed, R. (2010). From vaccines
Broide, D.H., Schulten, V., Sidney, J., Bakhru, P., et al. (2016). Lack of allergy to memory and back. Immunity 33, 451–463.
to timothy grass pollen is not a passive phenomenon but associated with the Sette, A., Moutaftsi, M., Moyron-Quiroz, J., McCausland, M.M., Davies, D.H.,
allergen-specific modulation of immune reactivity. Clin. Exp. Allergy 46, Johnston, R.J., Peters, B., Rafii-El-Idrissi Benhnia, M., Hoffmann, J., Su, H.P.,
705–719. et al. (2008). Selective CD4+ T cell help for antibody responses to a large viral
Huang, A.T., Garcia-Carreras, B., Hitchings, M.D.T., Yang, B., Katzelnick, L., pathogen: deterministic linkage of specificities. Immunity 28, 847–858.
Rattigan, S.M., Borgert, B., Moreno, C., Solomon, B.D., Rodriguez-Barraquer,
Severance, E.G., Bossis, I., Dickerson, F.B., Stallings, C.R., Origoni, A.E., Sul-
I., et al. (2020). A systematic review of antibody mediated immunity to
lens, A., Yolken, R.H., and Viscidi, R.P. (2008). Development of a nucleo-
coronaviruses: antibody kinetics, correlates of protection, and association
capsid-based human coronavirus immunoassay and estimates of individuals
of antibody responses with severity of disease. medRxiv,
exposed to coronavirus in a U.S. metropolitan population. Clin. Vaccine Immu-
2020.2004.2014.20065771.
nol. 15, 1805–1810.
Irigoyen, N., Firth, A.E., Jones, J.D., Chung, B.Y., Siddell, S.G., and Brierley, I.
Sidney, J., Peters, B., Frahm, N., Brander, C., and Sette, A. (2008). HLA class I
(2016). High-Resolution Analysis of Coronavirus Gene Expression by RNA
supertypes: a revised and updated classification. BMC Immunol. 9, 1.
Sequencing and Ribosome Profiling. PLoS Pathog. 12, e1005473.
Sidney, J., Steen, A., Moore, C., Ngo, S., Chung, J., Peters, B., and Sette, A.
Jurtz, V., Paul, S., Andreatta, M., Marcatili, P., Peters, B., and Nielsen, M.
(2010a). Divergent motifs but overlapping binding repertoires of six HLA-DQ
(2017). NetMHCpan-4.0: Improved Peptide-MHC Class I Interaction Predic-
molecules frequently expressed in the worldwide human population.
tions Integrating Eluted Ligand and Peptide Binding Affinity Data.
J. Immunol. 185, 4189–4198.
J. Immunol. 199, 3360–3368.
Sidney, J., Steen, A., Moore, C., Ngo, S., Chung, J., Peters, B., and Sette, A.
Kissler, S.M., Tedijanto, C., Goldstein, E., Grad, Y.H., and Lipsitch, M. (2020).
(2010b). Five HLA-DP molecules frequently expressed in the worldwide human
Projecting the transmission dynamics of SARS-CoV-2 through the postpan-
population share a common HLA supertypic binding specificity. J. Immunol.
demic period. Science, eabb5793.
184, 2492–2503.
Li, C.K., Wu, H., Yan, H., Ma, S., Wang, L., Zhang, M., Tang, X., Temperton,
Snijder, E.J., Bredenbeek, P.J., Dobbe, J.C., Thiel, V., Ziebuhr, J., Poon, L.L.,
N.J., Weiss, R.A., Brenchley, J.M., et al. (2008). T cell responses to whole
Guan, Y., Rozanov, M., Spaan, W.J., and Gorbalenya, A.E. (2003). Unique and
SARS coronavirus in humans. J. Immunol. 181, 5490–5500.
conserved features of genome and proteome of SARS-coronavirus, an early
Lindestam Arlehamn, C.S., McKinney, D.M., Carpenter, C., Paul, S., Rozot, V., split-off from the coronavirus group 2 lineage. J. Mol. Biol. 331, 991–1004.
Makgotlho, E., Gregg, Y., van Rooyen, M., Ernst, J.D., Hatherill, M., et al.
Southwood, S., Sidney, J., Kondo, A., del Guercio, M.F., Appella, E., Hoffman,
(2016). A Quantitative Analysis of Complexity of Human Pathogen-Specific
S., Kubo, R.T., Chesnut, R.W., Grey, H.M., and Sette, A. (1998). Several com-
CD4 T Cell Responses in Healthy M. tuberculosis Infected South Africans.
mon HLA-DR types share largely overlapping peptide binding repertoires.
PLoS Pathog. 12, e1005760.
J. Immunol. 160, 3363–3373.
Liu, L., Wei, Q., Lin, Q., Fang, J., Wang, H., Kwok, H., Tang, H., Nishiura, K.,
Peng, J., Tan, Z., et al. (2019). Anti-spike IgG causes severe acute lung injury by skewing macrophage responses during acute SARS-CoV infection. JCI Insight 4, 123158.
Morou, A., Brunet-Ratnasingham, E., Dubé, M., Charlebois, R., Mercier, E., Darko, S., Brassard, N., Nganou-Makamdop, K., Arumugam, S., Gendron-Lepage, G., et al. (2019). Altered differentiation is central to HIV-specific CD4+ T cell dysfunction in progressive disease. Nat. Immunol. 20, 1059–1070.
Moutaftsi, M., Tscharke, D.C., Vaughan, K., Koelle, D.M., Stern, L., Calvo-Calle, M., Ennis, F., Terajima, M., Sutter, G., Crotty, S., et al. (2010). Uncovering the interplay between CD8, CD4 and antibody responses to complex pathogens. Future Microbiol. 5, 221–239.
O’Sullivan, D., Arrhenius, T., Sidney, J., Del Guercio, M.F., Albertson, M., Wall, M., Oseroff, C., Southwood, S., Colón, S.M., Gaeta, F.C., et al. (1991). On the interaction of promiscuous antigenic peptides with different DR alleles. Identification of common structural motifs. J. Immunol. 147, 2663–2669.
Okba, N.M.A., Raj, V.S., Widjaja, I., GeurtsvanKessel, C.H., de Bruin, E., Chandler, F.D., Park, W.B., Kim, N.J., Farag, E.A.B.A., Al-Hajri, M., et al. (2019). Sensitive and Specific Detection of Low-Level Antibody Responses in Mild Middle East Respiratory Syndrome Coronavirus Infections. Emerg. Infect. Dis. 25, 1868–1877.
Paul, S., Lindestam Arlehamn, C.S., Scriba, T.J., Dillon, M.B., Oseroff, C., Hinz, D., McKinney, D.M., Carrasco Pro, S., Sidney, J., Peters, B., and Sette, A. (2015). Development and validation of a broad scheme for prediction of HLA class II restricted T cell epitopes. J. Immunol. Methods 422, 28–34.
Paul, S., Sidney, J., Sette, A., and Peters, B. (2016). TepiTool: A Pipeline for Computational Prediction of T Cell Epitope Candidates. Curr. Protoc. Immunol. 114, 18.19.1–18.19.24.
Peeples, L. (2020). News Feature: Avoiding pitfalls in the pursuit of a COVID-19 vaccine. Proc. Natl. Acad. Sci. USA 117, 8218–8221.
Reiss, S., Baxter, A.E., Cirelli, K.M., Dan, J.M., Morou, A., Daigneault, A., Brassard, N., Silvestri, G., Routy, J.P., Havenar-Daughton, C., et al. (2017). Comparative analysis of activation induced marker (AIM) assays for sensitive identification of antigen-specific CD4 T cells. PLoS ONE 12, e0186998.
Sridhar, S., Begom, S., Bermingham, A., Hoschler, K., Adamson, W., Carman, W., Bean, T., Barclay, W., Deeks, J.J., and Lalvani, A. (2013). Cellular immune correlates of protection against symptomatic pandemic influenza. Nat. Med. 19, 1305–1312.
Stadlbauer, D., Amanat, F., Chromikova, V., Jiang, K., Strohmeier, S., Arunkumar, G.A., Tan, J., Bhavsar, D., Capuano, C., Kirkpatrick, E., et al. (2020). SARS-CoV-2 Seroconversion in Humans: A Detailed Protocol for a Serological Assay, Antigen Production, and Test Setup. Curr. Protoc. Microbiol. 57, e100.
Takano, T., Kawakami, C., Yamada, S., Satoh, R., and Hohdatsu, T. (2008). Antibody-dependent enhancement occurs upon re-infection with the identical serotype virus in feline infectious peritonitis virus infection. J. Vet. Med. Sci. 70, 1315–1321.
Thanh Le, T., Andreadakis, Z., Kumar, A., Gómez Román, R., Tollefsen, S., Saville, M., and Mayhew, S. (2020). The COVID-19 vaccine development landscape. Nat. Rev. Drug Discov. 19, 305–306.
Tian, Y., Grifoni, A., Sette, A., and Weiskopf, D. (2019). Human T Cell Response to Dengue Virus Infection. Front. Immunol. 10, 2125.
Vennema, H., de Groot, R.J., Harbour, D.A., Dalderup, M., Gruffydd-Jones, T., Horzinek, M.C., and Spaan, W.J. (1990). Early death after feline infectious peritonitis virus challenge due to recombinant vaccinia virus immunization. J. Virol. 64, 1407–1409.
Vita, R., Mahajan, S., Overton, J.A., Dhanda, S.K., Martini, S., Cantrell, J.R., Wheeler, D.K., Sette, A., and Peters, B. (2019). The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 47 (D1), D339–D343.
Weiskopf, D., Angelo, M.A., de Azeredo, E.L., Sidney, J., Greenbaum, J.A., Fernando, A.N., Broadwater, A., Kolla, R.V., De Silva, A.D., de Silva, A.M., et al. (2013). Comprehensive analysis of dengue virus-specific responses supports an HLA-linked protective role for CD8+ T cells. Proc. Natl. Acad. Sci. USA 110, E2046–E2053.
Weiskopf, D., Cerpas, C., Angelo, M.A., Bangs, D.J., Sidney, J., Paul, S., Peters, B., Sanches, F.P., Silvera, C.G., Costa, P.R., et al. (2015). Human CD8+ T-Cell Responses Against the 4 Dengue Virus Serotypes Are Associated With Distinct Patterns of Protein Targets. J. Infect. Dis. 212, 1743–1751.

Wilkinson, T.M., Li, C.K.F., Chui, C.S.C., Huang, A.K.Y., Perkins, M., Liebner, J.C., Lambkin-Williams, R., Gilbert, A., Oxford, J., Nicholas, B., et al. (2012). Preexisting influenza-specific CD4+ T cells correlate with disease protection against influenza challenge in humans. Nat. Med. 18, 274–280.
Wong, C.K., Lam, C.W., Wu, A.K., Ip, W.K., Lee, N.L., Chan, I.H., Lit, L.C., Hui, D.S., Chan, M.H., Chung, S.S., and Sung, J.J. (2004). Plasma inflammatory cytokines and chemokines in severe acute respiratory syndrome. Clin. Exp. Immunol. 136, 95–103.
Wu, Z., and McGoogan, J.M. (2020). Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention. JAMA 323, 1239–1242.
Xie, X., Muruato, A., Lokugamage, K.G., Narayanan, K., Zhang, X., Zou, J., Liu, J., Schindewolf, C., Bopp, N.E., Aguilar, P.V., et al. (2020). An Infectious cDNA Clone of SARS-CoV-2. Cell Host Microbe 27, 841–848.
Zhao, J., Zhao, J., Van Rooijen, N., and Perlman, S. (2009). Evasion by stealth: inefficient immune activation underlies poor T cell response and severe disease in SARS-CoV-infected mice. PLoS Pathog. 5, e1000636.
Zhao, J., Zhao, J., and Perlman, S. (2010). T cell responses are required for protection from clinical disease and for virus clearance in severe acute respiratory syndrome coronavirus-infected mice. J. Virol. 84, 9318–9325.
Zhao, J., Zhao, J., Legge, K., and Perlman, S. (2011). Age-related increases in PGD(2) expression impair respiratory DC migration, resulting in diminished T cell responses upon respiratory virus infection in mice. J. Clin. Invest. 121, 4921–4930.
Zhao, J., Zhao, J., Mangalam, A.K., Channappanavar, R., Fett, C., Meyerholz, D.K., Agnihothram, S., Baric, R.S., David, C.S., and Perlman, S. (2016). Airway Memory CD4(+) T Cells Mediate Protective Immunity against Emerging Respiratory Coronaviruses. Immunity 44, 1379–1391.
Zhao, J., Alshukairi, A.N., Baharoon, S.A., Ahmed, W.A., Bokhari, A.A., Nehdi, A.M., Layqah, L.A., Alghamdi, M.G., Al Gethamy, M.M., Dada, A.M., et al. (2017). Recovery from the Middle East respiratory syndrome is associated with antibody and T-cell responses. Sci. Immunol. 2, eaan5393.
Zhou, F., Yu, T., Du, R., Fan, G., Liu, Y., Liu, Z., Xiang, J., Wang, Y., Song, B., Gu, X., et al. (2020). Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet 395, 1054–1062.

STAR★METHODS
KEY RESOURCES TABLE

REAGENT or RESOURCE SOURCE IDENTIFIER
Antibodies
M5E2 (V500) [anti-CD14] Becton Dickinson 561391 (RRID:AB_10611856)
HIB19 (V500) [anti-CD19] Becton Dickinson 561121 (RRID:AB_10562391)
FN50 (BV605) [anti-CD4] Becton Dickinson 562989 (RRID:AB_2737935)
RPA-T8 (BV650) [anti-CD8] BioLegend 301042 (RRID:AB_2563505)
FN50 (PE-CF594) [anti-CD69] Becton Dickinson 562617 (RRID:AB_2737680)
Ber-ACT35 (PE-Cy7) [anti-OX40] Biolegend 350012 (RRID:AB_10901161)
4B4-1 (APC) [anti-CD137] BioLegend 309810 (RRID:AB_830672)
OKT3 (AF700) [anti-CD3] Biolegend 317340 (RRID:AB_2563408)
G043H7 (BV421) [anti-CD45RA] BioLegend 353207 (RRID:AB_10915137)
4S.B3 (FITC) [anti-IFNg] Thermo Fisher Scientific 11-7319-82 (RRID: AB_465415)
GB11 (PE) [anti-Granzyme B] Thermo Fisher Scientific 12-8899-41 (RRID: AB_1659718)
Mab11 (PeCy7) [anti-TNFa] ebioscience 25-7349-82 (RRID:AB_469686)
JES3-19F1 (APC) [anti-IL-10] BioLegend 506807 (RRID:AB_315457)
3D12 (APC ef780) [anti-CCR7] eBioscience 47-1979-42 (RRID:AB_1518794)
B56 (FITC) [anti-KI67] Becton Dickinson 556026 (RRID:AB_396302)
SK3 (percp efluor710) [anti-CD4] Invitrogen 46-0047-42 (RRID:AB_1834401)
GB11 (af647) [anti-GzmB] Biolegend 515406 (RRID:AB_2566333)
MHM-88 (af700) [anti-IgM] Biolegend 314538 (RRID:AB_2566615)
O323 (APC cy7) [anti-CD27] Biolegend 302816 (RRID:AB_571977)
IA6-2 (PE) [anti-IgD] Becton Dickinson 555779 (RRID:AB_396114)
HCD56 (PE Dazzle) [anti-CD56] Biolegend 318348 (RRID:AB_2563564)
HIB19 (PE-Cy5) [anti-CD19] Biolegend 302210 (RRID:AB_314240)
HIT2 (PECy7) [anti-CD38] Invitrogen 25-0389-42 (RRID:AB_1724057)
J252D4 (bv421) [anti-CXCR5] Biolegend 356920 (RRID:AB_2562303)
63D3 (bv510) [anti-CD14] Biolegend 367123 (RRID:AB_2716228)
HI100 (bv570) [anti-CD45RA] Biolegend 304132 (RRID:AB_2563813)
G025H7 (bv605) [anti-CXCR3] Biolegend 353728 (RRID:AB_2563157)
2H7 (bv650) [anti-CD20] Biolegend 302336 (RRID:AB_2563806)
G043H7 (bv711) [anti-CCR7] Biolegend 353228 (RRID:AB_2563865)
EH12.2H7 (bv786) [anti-PD-1] Biolegend 329930 (RRID:AB_2563443)
UCHT1 (buv395) [anti-CD3] Becton Dickinson 563546 (RRID:AB_2744387)
Live/dead (UV) [Zombie] Biolegend 423108
11A9 (buv496) [anti-CCR6] Becton Dickinson 612948 (RRID:AB_2833076)
3G8 (buv737) [anti-CD16] Becton Dickinson 612786 (RRID:AB_2833077)
SK1 (buv805) [anti-CD8] Becton Dickinson 612889 (RRID:AB_2833078)
LEGENDplex 13-plex kit Biolegend 740809
Biological Samples
Healthy donor blood samples Carter BloodCare http://www.carterbloodcare.org/
Healthy donor blood samples LJI Clinical Core https://www.lji.org/
Convalescent donor blood samples UC San Diego Health http://www.health.ucsd.edu/
Synthetic peptides Synthetic Biomolecules (aka A&A) http://www.syntheticbiomolecules.com/
SARS-CoV-2 Receptor Binding Domain (RBD) protein Stadlbauer et al., 2020 N/A
CoronaCheck COVID-19 Rapid Antibody Test Kit 20/20 BioResponse https://coronachecktest.com/
Deposited Data
Wuhan-Hu-1 RNA isolate NCBI nuccore database GenBank: MN908947
ORF10 protein NCBI protein database NCBI: YP_009725255.1
Nucleocapsid phosphoprotein NCBI protein database NCBI: YP_009724397.2
ORF7a protein NCBI protein database NCBI: YP_009724395.1
membrane glycoprotein NCBI protein database NCBI: YP_009724393.1
envelope protein NCBI protein database NCBI: YP_009724392.1
ORF3a protein NCBI protein database NCBI: YP_009724391.1
surface glycoprotein NCBI protein database NCBI: YP_009724390.1
orf1ab polyprotein NCBI protein database NCBI: YP_009724389.1
IEDB Vita et al., 2019 https://www.iedb.org
IEDB-AR (analysis resource) Dhanda et al., 2019 http://tools.iedb.org
NetMHCpan EL 4.0 Jurtz et al., 2017 http://tools.iedb.org/mhci/
IEDB Vita et al., 2019 https://www.iedb.org
Tepitool Paul et al., 2016; Paul et al., 2015 http://tools.iedb.org/tepitool/
FlowJo 10 FlowJo https://www.flowjo.com/
GraphPad Prism 8.4 GraphPad https://www.graphpad.com/
LEGENDplex v8.0 Biolegend https://www.biolegend.com/
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Dr. Ales-
sandro Sette (alex@lji.org).
Aliquots of synthesized sets of peptides utilized in this study will be made available upon request. There are restrictions to the avail-
ability of the peptide reagents due to cost and limited quantity.

The published article includes all data generated or analyzed during this study, and summarized in the accompanying tables, figures
and Supplemental materials.
Human Subjects
Healthy Unexposed Donors
Samples from healthy adult donors were obtained by the La Jolla Institute for Immunology (LJI) Clinical Core or provided by a
commercial vendor (Carter Blood Care) for prior, unrelated studies between early 2015 and early 2018. These samples were
considered to be from unexposed controls, given that SARS-CoV-2 emerged as a novel pathogen in late 2019, more than one
year after the collection of any of these samples. These donors were considered healthy in that they had no known history of
any significant systemic diseases, including, but not limited to, autoimmune disease, diabetes, kidney or liver disease, congestive
heart failure, malignancy, coagulopathy, hepatitis B or C, or HIV. An overview of the characteristics of these unexposed donors is
provided in Table 1.
The LJI Institutional Review Board approved the collection of these samples (LJI; VD-112). At the time of enrollment in the
initial studies, all individual donors provided informed consent that their samples could be used for future studies, including
this study.
Convalescent COVID-19 Donors
The Institutional Review Boards of the University of California, San Diego (UCSD; 200236X) and La Jolla Institute (LJI; VD-214)
approved blood draw protocols for convalescent donors. All human subjects were assessed for capacity using a standardized
and approved assessment. Subjects deemed to have capacity voluntarily gave informed consent prior to being enrolled in the study.
Individuals did not receive compensation for their participation.
Study inclusion criteria included subjects over the age of 18 years, regardless of disease severity, race, ethnicity, gender, preg-
nancy or nursing status, who were willing and able to provide informed consent, or with a legal guardian or representative willing
and able to provide informed consent when the participant could not personally do so. Study exclusion criteria included lack of
willingness or ability to provide informed consent, or lack of an appropriate legal guardian or representative to provide informed
consent.
Blood from convalescent donors was obtained at a UC San Diego Health clinic. Blood was collected in acid citrate dextrose (ACD)
tubes and stored at room temperature prior to processing for PBMC isolation and plasma collection. A separate serum separator
tube (SST) was collected from each donor. Samples were de-identified prior to analysis. Other efforts to maintain the confidentiality
of participants included referring to specimens and other records via an assigned, coded identification number.
Prior to enrollment in the study, donors were asked to provide proof of positive testing for SARS-CoV-2, and screened for clinical
history and/or epidemiological risk factors consistent with the World Health Organization (WHO) or Centers for Disease Control and
Prevention (CDC) case definitions of COVID-19 or Persons Under Investigation (PUI) (https://www.who.int/emergencies/diseases/
novel-coronavirus-2019/technical-guidance/surveillance-and-case-definitions, https://www.cdc.gov/coronavirus/2019-nCoV/hcp/
clinical-criteria.html). Per CDC and WHO guidance, clinical features consistent with COVID-19 included subjective or measured
fever, signs or symptoms of lower respiratory tract illness (e.g., cough or dyspnea). Epidemiologic risk factors included close contact
with a laboratory-confirmed case of SARS-CoV-2 within 14 days of symptom onset or a history of travel to an area with a high rate of
COVID-19 cases within 14 days of symptom onset.
Disease severity was defined as mild, moderate, severe or critical based on a modified version of the WHO interim guidance,
‘‘Clinical management of severe acute respiratory infection when COVID-19 is suspected’’ (WHO Reference Number: WHO/2019-
nCoV/clinical/2020.4). Mild disease was defined as an uncomplicated upper respiratory tract infection (URI) with potential non-
specific symptoms (e.g., fatigue, fever, cough with or without sputum production, anorexia, malaise, myalgia, sore throat, dys-
pnea, nasal congestion, headache; rarely diarrhea, nausea and vomiting) that did not require hospitalization. Moderate disease
was defined as the presence of lower respiratory tract disease or pneumonia without the need for supplemental oxygen, without
signs of severe pneumonia, or a URI requiring hospitalization (including observation admission status). Severe disease was
defined as severe lower respiratory tract infection or pneumonia with fever plus any one of the following: tachypnea (respiratory
rate > 30 breaths per minute), respiratory distress, or oxygen saturation less than 93% on room air. Critical disease was defined as
the need for ICU admission or the presence of acute respiratory distress syndrome (ARDS), sepsis, or septic shock, as defined in
the WHO guidance document.
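The severity definitions above can be read as an ordered set of rules. The following is a minimal Python sketch of that reading, not the procedure used to grade donors in the study. The `Presentation` record, its field names, and the `classify_severity` function are hypothetical, and combinations the text does not address are returned as "unclassified" rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Presentation:
    """Hypothetical per-donor clinical summary; field names are illustrative."""
    hospitalized: bool = False
    lower_respiratory_disease: bool = False  # lower respiratory tract disease or pneumonia
    supplemental_oxygen: bool = False
    fever: bool = False
    respiratory_rate: Optional[float] = None  # breaths per minute
    respiratory_distress: bool = False
    spo2_room_air: Optional[float] = None  # oxygen saturation on room air, percent
    icu_admission: bool = False
    ards_sepsis_or_septic_shock: bool = False


def classify_severity(p: Presentation) -> str:
    """Return 'mild', 'moderate', 'severe', 'critical', or 'unclassified'."""
    # Critical: ICU admission, or ARDS, sepsis, or septic shock.
    if p.icu_admission or p.ards_sepsis_or_septic_shock:
        return "critical"
    # Any one of the three signs listed for severe disease.
    severe_sign = (
        (p.respiratory_rate is not None and p.respiratory_rate > 30)
        or p.respiratory_distress
        or (p.spo2_room_air is not None and p.spo2_room_air < 93)
    )
    # Severe: lower respiratory infection or pneumonia with fever plus one sign.
    if p.lower_respiratory_disease and p.fever and severe_sign:
        return "severe"
    if p.lower_respiratory_disease:
        # Assumption: the text does not label lower respiratory disease that
        # needs oxygen or shows a severe sign without meeting the full severe
        # definition, so this sketch leaves such cases unclassified.
        if p.supplemental_oxygen or severe_sign:
            return "unclassified"
        return "moderate"
    # Upper respiratory infection: moderate if hospitalized, otherwise mild.
    return "moderate" if p.hospitalized else "mild"
```

The rules are checked from most to least severe, so a donor who meets more than one definition receives the highest applicable grade.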
Convalescent donors were screened for symptoms prior to scheduling blood draws, and had to be symptom-free and approxi-
mately 3 weeks out from symptom onset at the time of the initial blood draw. Following enrollment, whole blood from convalescent
donors was run on a colloidal-gold immunochromatographic ‘lateral flow’ assay to evaluate for prior exposure to SARS-CoV-2. This
assay detects IgM or IgG antibodies directed against recombinant SARS-CoV-2 antigen labeled with a colloidal gold tracer (20/20
BioResponse CoronaCheck). Ninety percent of convalescent donors tested positive for IgM or IgG to SARS-CoV-2 by this assay
(Table 1).
Convalescent donors were California residents, who were either referred to the study by a health care provider or self-referred. The
majority (75%) of donors had a known sick contact with COVID-19 or suspected exposure to SARS-CoV-2 (Table 1). The most com-
mon symptoms reported were cough, fatigue, fever, anosmia, and dyspnea. Seventy percent of donors experienced mild illness.
Donors were asked to self-report any known medical illnesses. Of note, 65% of these individuals had no known underlying medical
illnesses.
METHOD DETAILS
Peptide Pools
Epitope MegaPool (MP) design and preparation
SARS-CoV-2 virus-specific CD4 and CD8 peptides were synthesized as crude material (A&A, San Diego, CA), resuspended in
DMSO, pooled and sequentially lyophilized as previously reported (Carrasco Pro et al., 2015). SARS-CoV-2 epitopes were predicted
using the protein sequences derived from the SARS-CoV-2 reference (GenBank: MN908947) and IEDB analysis-resource as previ-
ously described (Dhanda et al., 2019; Grifoni et al., 2020). Specifically, CD4 SARS-CoV-2 epitope prediction was carried out using a
previously described approach in Tepitool resource in IEDB (Paul et al., 2015; Paul et al., 2016), to select peptides with median
consensus percentile ≤ 20, similar to what was previously described, but removing the resulting spike glycoprotein epitopes
from this prediction (CD4-R (remainder) ‘‘Non-spike’’ MP, n = 221). This approach takes advantage of the extensive cross-reactivity
and repertoire overlap between different HLA class II loci and allelic variants to predict promiscuous epitopes, capable of binding
across the most common HLA class II prototypic specificities (Greenbaum et al., 2011; O’Sullivan et al., 1991; Sidney et al.,
2010a, b; Southwood et al., 1998). The algorithm utilizes predictions for seven common HLA-DR alleles (DRB1*03:01,
DRB1*07:01, DRB1*15:01, DRB3*01:01, DRB3*02:02, DRB4*01:01 and DRB5*01:01) empirically determined to allow coverage of
diverse populations and for different pathogens and antigen systems (Dhanda et al., 2018; Paul et al., 2015).
To investigate spike-specific CD4 T cells in depth, 15-mer peptides (overlapping by 10 amino acids) spanning the entire antigen
were synthesized and pooled separately (CD4-S (spike) MP, n = 253).
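As an illustration of the tiling described above, the following minimal Python sketch generates 15-mers that overlap by 10 residues from a protein sequence. The function name `overlapping_peptides` and the handling of the C-terminus (one extra peptide anchored at the last residue when the regular tiling stops short) are assumptions made for this example; they are not taken from the synthesis protocol.

```python
from typing import List


def overlapping_peptides(sequence: str, length: int = 15, overlap: int = 10) -> List[str]:
    """Tile a protein sequence with fixed-length peptides sharing `overlap` residues."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    if len(sequence) <= length:
        return [sequence] if sequence else []
    step = length - overlap  # 5 residues for 15-mers overlapping by 10
    starts = list(range(0, len(sequence) - length + 1, step))
    # Assumption: if the regular tiling stops short of the C-terminus, add one
    # final peptide anchored at the end so that every residue is covered.
    if starts[-1] + length < len(sequence):
        starts.append(len(sequence) - length)
    return [sequence[s:s + length] for s in starts]


toy = "ACDEFGHIKLMNPQRSTVWYACDEFGH"  # 27-residue toy sequence, not a viral protein
peptides = overlapping_peptides(toy)
```

Under this end-handling assumption, a 1,273-residue sequence (the length of the reference spike protein) yields 253 peptides, which agrees with the pool size reported above, although the actual synthesis design may have differed.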
In the case of CD8 epitopes, since the overlap between different HLA class I allelic variants and loci is more limited to specific
groups of alleles, or supertypes (Sidney et al., 2008), we targeted a set of the 12 most prominent HLA class I A and B alleles
(A*01:01, A*02:01, A*03:01, A*11:01, A*23:01, A*24:02, B*07:02, B*08:01, B*35:01, B*40:01, B*44:02, B*44:03), which have been
shown to allow broad coverage of the general population. CD8 SARS-CoV-2 epitope prediction was performed as previously reported,
using the NetMHCpan EL 4.0 algorithm (Jurtz et al., 2017) for the 12 most frequent HLA alleles and selecting the top 1
percentile of predicted epitopes per HLA allele, clustered with nested/overlap reduction (Grifoni et al., 2020). The 628 predicted CD8 epitopes
were split into two CD8 MPs containing 314 peptides each (CD8-A and CD8-B). The CMV MP is a pool of previously reported
class I and class II epitopes (Carrasco Pro et al., 2015).
Protein peptide pools
In the case of the protein pools, peptides of 15 amino acid length overlapping by 10 spanning each entire protein sequence were
tested in a single MP (6-253 peptides per pool). Table S1 lists the number of peptides pooled for each of the viral proteins. Upon
request we are prepared to make these MP available to the scientific community for use in a diverse set of investigations.
PBMC isolation
For all samples, whole blood was collected in ACD tubes (COVID-19 donors) or heparin-coated blood bags (healthy unexposed donors).
Whole blood was then centrifuged for 15 min at 1850 rpm to separate the cellular fraction and plasma. The plasma was
then carefully removed from the cell pellet and stored at −20°C.
Peripheral blood mononuclear cells (PBMC) were isolated by density-gradient sedimentation using Ficoll-Paque (Lymphoprep,
Nycomed Pharma, Oslo, Norway) as previously described (Weiskopf et al., 2013). Isolated PBMC were cryopreserved in cell recovery
media containing 10% DMSO (GIBCO), supplemented, depending on the processing laboratory, with 10% heat-inactivated fetal bovine serum
(FBS; Hyclone Laboratories, Logan, UT), and stored in liquid nitrogen until used in the assays.
SARS-CoV-2 RBD ELISA

SARS-CoV-2 Receptor Binding Domain (RBD) protein was obtained courtesy of Florian Krammer and Peter Kim (Stadlbauer et al.,
2020). Corning 96-well half-area plates (ThermoFisher 3690) were coated with 1 μg/mL SARS-CoV-2 RBD overnight at 4°C. The ELISA
protocol generally followed that of the Krammer lab, which previously demonstrated specificity (Stadlbauer et al., 2020). Plates
were blocked the next day with 3% milk (Skim Milk Powder ThermoFisher LP0031 by weight/volume) in Phosphate Buffered Saline
(PBS) containing 0.05% Tween-20 (ThermoScientific J260605-AP) for 2 hours at room temperature. Plasma was then added to the
plates and incubated for 1.5 hours at room temperature. Prior to addition to the plates, plasma was heat inactivated at 56°C
for 30-60 minutes. Plasma was diluted in 1% milk in 0.05% PBS-Tween 20, starting at a 1:3 dilution and serially diluting each sample by 1:3.
Plates were then washed 5 times with 0.05% PBS-Tween 20. Secondary antibodies were diluted in 1% milk in 0.05% Tween-20 and
incubated for 1 hour. For IgG, anti-human IgG peroxidase antibody produced in goat (Sigma A6029) was used at a 1:5000 dilution. For
IgM, anti-human IgM peroxidase antibody produced in goat (Sigma A6907) was used at a 1:10,000 dilution. For IgA, anti-human IgA
horseradish peroxidase antibody (Hybridoma Reagent Laboratory HP6123-HRP) was used at a 1:1,000 dilution. Plates were washed
5 times with 0.05% PBS-Tween 20. Plates were developed with TMB Substrate Kit (ThermoScientific 34021) for 15 minutes at room
temperature. The reaction was stopped with 2M sulfuric acid. Plates were read on a Spectramax Plate Reader at 450 nm using Soft-
Max Pro, and ODs were background subtracted. A positive control standard was created by pooling plasma from six convalescing
COVID-19 patients. Positive control standard was run on each plate and was used to calculate titers (relative units) for all samples
using non-linear regression interpolations, done to quantify the amount of anti-RBD IgG, anti-RBD IgM, and anti-RBD IgA present in
each specimen. Titers were plotted for each specimen and compared to COVID-19 negative specimens. As a second analytical
approach, area under the curve (AUC) was also calculated for each specimen to compare COVID-19 to negative specimens, using a baseline
of 0.05 for peak calculations.
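The threefold plasma dilution series used in this ELISA can be written out explicitly. The short Python sketch below lists the dilution factors starting at 1:3. The function name `serial_dilutions` is illustrative, and the number of dilution points (`steps`) is not stated in the text, so the value used here is only a placeholder.

```python
from fractions import Fraction
from typing import List


def serial_dilutions(start: int = 3, fold: int = 3, steps: int = 4) -> List[Fraction]:
    """Return dilution factors 1:start, 1:(start*fold), ... as exact fractions."""
    # `steps` is a placeholder; the number of dilution points is an assumption.
    return [Fraction(1, start * fold ** i) for i in range(steps)]


series = serial_dilutions()
labels = [f"1:{d.denominator}" for d in series]
```

With the placeholder of four steps, `labels` is `['1:3', '1:9', '1:27', '1:81']`.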
OC43 and NL63 coronavirus RBD ELISA

An in-house ELISA at UNC was performed by coating with recombinant S RBD antigens (SARS-CoV-2, SARS-CoV, OC43-CoV and
NL63-CoV) in TBS for 1 h at 37°C. After blocking, we added 1:20 diluted serum and incubated at 37°C for 1 h. Antigen-specific
antibodies (Ig) were measured at 405 nm by using alkaline phosphatase conjugated goat anti-human IgG, IgA and IgM Abs and
4-Nitrophenyl phosphate.
Flow Cytometry
Direct ex vivo PBMC immune cell phenotyping
For the surface stain, 1 × 10⁶ PBMCs were resuspended in 100 μl PBS with 2% FBS (FACS buffer) and stained with antibody cocktail
for 1 hour at 4°C in the dark. Following surface staining, cells were washed twice with FACS buffer. Cells were then fixed/permeabilized
for 40 min at 4°C in the dark using the eBioscience FoxP3 transcription factor buffer kit (ThermoFisher Scientific, Waltham, MA).
Following fixation/permeabilization, cells were washed twice with 1x permeabilization buffer, resuspended in 100 μl permeabilization
buffer and stained with intracellular/intranuclear antibodies for 1 hour at 4°C in the dark. Samples were washed twice with 1x permeabilization
buffer following staining. After the final wash, cells were resuspended in 200 μl FACS buffer. All samples were acquired
on a BD FACSymphony cell sorter (BD Biosciences, San Diego, CA). A list of antibodies used in this panel can be found in Table S2.
T cell stimulations
For all flow cytometry assays of stimulated T cells, cryopreserved cells were thawed by diluting them in 10 mL complete RPMI 1640
with 5% human AB serum (Gemini Bioproducts) in the presence of benzonase [20 μl/10 mL]. All samples were acquired on a ZE5 Cell
Analyzer (Bio-Rad Laboratories), and analyzed with FlowJo software (Tree Star, San Carlos, CA).
Activation induced cell marker assay
Cells were cultured for 24 hours in the presence of SARS-CoV-2 specific MPs [1 μg/ml] or 10 μg/mL PHA in 96-well U-bottom plates
at 1 × 10⁶ PBMC per well. Stimulation with an equimolar amount of DMSO was performed as a negative control; phytohemagglutinin
(PHA, Roche, 1 μg/ml) and stimulation with a combined CD4 and CD8 cytomegalovirus MP (CMV, 1 μg/ml) were included as positive
controls. Supernatants were harvested at 24 hours post-stimulation for multiplex detection of cytokines. Antibodies used in the AIM
assay are listed in Table S4. AIM assays shown in Figures 2 and 3 and AIM assays shown in Figure 6 had five COVID-19 donors and
nine unexposed donors in common. Full raw data are listed in Table S6.
Intracellular cytokine staining assay
For the intracellular cytokine staining, PBMC were cultured in the presence of SARS-CoV-2 specific MPs [1 μg/ml] for 9 hours. Golgi-Plug
containing brefeldin A (BD Biosciences, San Diego, CA) and monensin (Biolegend, San Diego, CA) were added 3 hours into the culture.
Cells were then washed and surface stained for 30 minutes on ice, fixed with 1% paraformaldehyde (Sigma-Aldrich, St. Louis, MO) and
kept at 4°C overnight. Antibodies used in the ICS assay are listed in Table S5. The gates applied for the identification of IFNγ, GzB, TNFα,
or IL-10 production on the total population of CD8+ T cells were defined according to the cells cultured with DMSO for each individual.
Cytokine bead assays

Supernatants were collected from 24-hour stimulation cultures of the AIM assays and stored in 96-well plates at −20°C. Cytokines in
cell culture supernatants of the same samples used for AIM were quantified using a human Th cytokine panel (13-plex) kit (LEGENDplex,
Biolegend) according to the manufacturer’s instructions. Supernatants were mixed with beads coated with capture antibodies
specific for IL-5, IL-13, IL-2, IL-6, IL-9, IL-10, IFNγ, TNFα, IL-17A, IL-17F, IL-4, IL-21 and IL-22 and incubated on a 96-well filter plate
for 2 hours. Beads were washed and incubated with biotin-labeled detection antibodies for 1 hour, followed by a final incubation with
streptavidin-PE. Beads were analyzed by flow cytometry using a FACS Canto cytometer. Analysis was performed using the
LEGENDplex analysis software v8.0, which distinguishes between the 13 different analytes on basis of bead size and internal dye.
Identification of coronavirus epitopes and associated literature references

To identify coronavirus epitopes and associated references, the IEDB was searched (on April 16, 2020) utilizing the following queries.
A first query was run to identify references associated with class I restricted CD8 epitopes, which utilized the criteria settings ‘‘Antigen’’:
Organism = Coronavirus (taxonomy ID 11118); ‘‘Assay’’: Positive assays only; ‘‘Assay’’: T cell assay; ‘‘MHC restriction’’ = MHC
Class I; no parameters were defined for ‘‘Host’’ or ‘‘Disease.’’ This query identified 57 references, which are listed and displayed
under the ‘‘References’’ tab on the results page.
A second query was run to identify references associated with class II restricted CD4 epitopes which utilized the criteria settings
‘‘Antigen’’: Organism = Coronavirus (taxonomy ID 11118); ‘‘Assay’’: Positive assays only; ‘‘Assay’’: T cell assay; ‘‘MHC restriction’’ =
MHC Class II; no parameters were defined for ‘‘Host’’ or ‘‘Disease.’’ This query identified 27 references, which are listed and
displayed under the ‘‘References’’ tab on the results page.
A third query was run to specifically capture epitopes and map them back to the antigen of origin using the setting; ‘‘Antigen’’: Or-
ganism = Coronavirus (taxonomy ID 11118); ‘‘Assay’’: Positive assays only; ‘‘Assay’’: T cell assay; no parameters were defined for
‘‘MHC restriction,’’ ‘‘Host’’ or ‘‘Disease.’’ Results were exported as csv files, and then examined in Excel to tabulate the number
of CD4 and CD8 epitopes recognized in humans, mice, transgenic mice and other hosts associated with each respective antigen.
Data and statistical analyses were done in FlowJo 10 and GraphPad Prism 8.4, unless otherwise stated. The statistical details of the
experiments are provided in the respective figure legends. Data plotted in linear scale were expressed as Mean + Standard Deviation
(SD). Data plotted in logarithmic scales were expressed as Geometric Mean + Geometric Standard Deviation (SD). Correlation an-
alyses were performed using Spearman, while Mann-Whitney or Wilcoxon tests were applied for unpaired or paired comparisons,
respectively. Details pertaining to significance are also noted in the respective legends. T cell data were calculated as
background subtracted data or stimulation index. Background subtracted data were derived by subtracting the percentage of AIM+
cells after DMSO stimulation from the percentage of AIM+ cells after SARS-CoV-2 stimulation. The stimulation index was instead
calculated by dividing the percentage of AIM+ cells after SARS-CoV-2 stimulation by the percentage of AIM+ cells derived from
DMSO stimulation. If the percentage of AIM+ cells after DMSO stimulation was equal to 0, the minimum value across each cohort was used.
When two stimuli were combined, the percentages of AIM+ cells after each SARS-CoV-2 stimulation were summed, and the percentage of
AIM+ cells derived from DMSO stimulation was either subtracted twice from this sum, or the sum was divided by twice that DMSO
value. Additional data analysis techniques are described in the
STAR Methods sections above.
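The two AIM readouts described in this section can be sketched in a few lines of Python. This is an illustrative reading of the text rather than the analysis code used for the study. All function names are hypothetical, and `cohort_min` stands for the cohort-wide minimum used when the DMSO value is 0; how that minimum was determined is not specified here, so the caller must supply it.

```python
from typing import Sequence


def background_subtracted(stim_pct: float, dmso_pct: float) -> float:
    """Percentage of AIM+ cells after stimulation minus the DMSO background."""
    return stim_pct - dmso_pct


def stimulation_index(stim_pct: float, dmso_pct: float, cohort_min: float) -> float:
    """Ratio of stimulated to DMSO AIM+ percentages."""
    # When the DMSO value is 0, fall back to the cohort minimum (supplied by
    # the caller; an assumption of this sketch) to avoid division by zero.
    denominator = dmso_pct if dmso_pct > 0 else cohort_min
    return stim_pct / denominator


def combined_background_subtracted(stim_pcts: Sequence[float], dmso_pct: float) -> float:
    """Sum the stimulated percentages and subtract DMSO once per stimulus."""
    return sum(stim_pcts) - len(stim_pcts) * dmso_pct


def combined_stimulation_index(
    stim_pcts: Sequence[float], dmso_pct: float, cohort_min: float
) -> float:
    """Sum the stimulated percentages and divide by DMSO counted once per stimulus."""
    denominator = dmso_pct if dmso_pct > 0 else cohort_min
    return sum(stim_pcts) / (len(stim_pcts) * denominator)
```

For two stimuli, `len(stim_pcts)` is 2, which reproduces the "subtracted twice" and "divided by twice" rules; for example, stimulated values of 0.5% and 0.25% with a DMSO value of 0.125% give a combined background subtracted value of 0.5% and a combined stimulation index of 3.0.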
[Figure S1 image: panels A-C, ELISA curves (OD450) for anti-RBD IgG, anti-RBD IgM, and anti-RBD IgA; panels D-F, SARS-CoV-2 spike RBD IgG, IgM, and IgA (AUC, logarithmic scale).]

Figure S1. SARS-CoV-2 Spike Protein RBD Serology, Related to Figure 1

(A-C) ELISA curves for (A) IgG, (B) IgM, and (C) IgA from 10 representative donors. Five COVID-19 cases and
(D-F) Area under the curve (AUC) SARS-CoV-2 spike protein RBD (D) IgG (E) IgM, and (F) IgA, ELISA quantitation, from the same donors and experiments shown
in Figure 1. Geometric mean titers with geometric SDs are indicated. P values are two-tailed Mann-Whitney tests.
Figure S2. Phenotyping Flow Cytometry, Related to Figure 1

Representative gating of CD3+ T cells, CD19+ B cells, CD3-CD19- cells, CD4+ T cells, CD8+ T cells and CD14+ monocytes from donor PBMCs is shown. Briefly,
mononuclear cells were gated out of all events followed by subsequent singlet gating. Live cells are gated as Zombie UV-. Cells were then gated as CD19-PE-
Cy5+, CD3-buv395+ or CD19-CD3- cells. T cells were further subdivided into either CD8-buv805+ or CD4-PerCPefluor710+ populations. CD3-CD19- cells were
defined as CD56-PE-Dazzle bright NK cells, CD56 dim CD16-buv737+ NK cells or CD56- monocytes. Monocytes were further classified on differential expression of
CD14-bv510 and CD16.
Figure S3. SARS-CoV-2-Specific CD4+ T Cell Responses of Recovered COVID-19 Patients, Related to Figure 2
(A) Example flow cytometry gating strategy.
(B) FACS plot examples for controls. DMSO negative control, CMV positive control, PHA positive control.
(C) CMV-specific CD4+ T cells as percentage of AIM+ (OX40+CD137+) CD4+ T cells after stimulation of PBMCs with CMV peptide pool. Data were background
subtracted against DMSO negative control and are shown with geometric mean and geometric standard deviation. Samples were from unexposed donors
(‘‘Unexposed,’’ n = 11) and recovered COVID-19 patients (‘‘COVID-19,’’ n = 10).
(D) Spearman correlation of SARS-CoV-2 spike specific CD4+ T cells (AIM+ (OX40+CD137+) CD4+ T cells, background subtracted) after stimulation with spike
pool run on the same donors in two independent experiment series run on different dates. COVID-19 patient samples shown in blue. Unexposed donor samples
shown in black.
(E-F) Stimulation index quantitation of AIM+ (OX40+CD137+) CD4+ T cells; the same samples as in Figure 2 and Figure S3C were analyzed.
(G-H) Cytokine levels in the supernatants of AIM assays after stimulation with (G) Spike MP (MP_S), or (H) CD4-R (‘‘Non-spike’’). Data are shown in comparison to
the negative control (DMSO), per donor.
(I-J) Cytokine production by CD4+ T cells in response to Non-spike (CD4-R MP) or Spike (MP_S) peptide pools (‘‘CoV antigen (Ag)’’) was confirmed by analyzing
cytokine secretion from the subset of COVID-19 donors determined to have low or negative CD8+ T cell responses (< 0.1% by AIM) to the same peptide pool
determined positive for SARS-CoV-2 specific CD4+ T cells by AIM. (I) IL-2. (J) IFNg.
Statistical comparisons across cohorts were performed with the Mann-Whitney test, while paired sample comparisons were performed with the Wilcoxon test.
**p < 0.01; ***p < 0.001. ns not significant.
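The legends above refer to background subtraction against the DMSO control, a stimulation index, and geometric means with geometric SDs. The following is a minimal sketch of those three quantities. It assumes the stimulation index is the ratio of the stimulated to the DMSO signal, which the legends do not state, and all function names and numbers are hypothetical.

```python
import math

def background_subtract(stim_pct, dmso_pct, floor=0.0):
    """Subtract the DMSO (negative control) signal from the stimulated signal.
    Clamping at `floor` is an assumption, not something the legend specifies."""
    return max(stim_pct - dmso_pct, floor)

def stimulation_index(stim_pct, dmso_pct):
    """Assumed definition: stimulated signal divided by the DMSO signal."""
    if dmso_pct <= 0:
        raise ValueError("DMSO signal must be positive to form a ratio")
    return stim_pct / dmso_pct

def geometric_mean_sd(values):
    """Geometric mean and geometric SD of strictly positive values."""
    logs = [math.log(v) for v in values]
    n = len(logs)
    mean_log = sum(logs) / n
    var_log = sum((x - mean_log) ** 2 for x in logs) / (n - 1)  # sample variance
    return math.exp(mean_log), math.exp(math.sqrt(var_log))

# Invented example values (percent AIM+ cells), for illustration only.
print(background_subtract(0.5, 0.1))
print(stimulation_index(0.5, 0.1))
print(geometric_mean_sd([1.0, 10.0, 100.0]))
```

Because the geometric mean is computed on logarithms, values that are zero after background subtraction would need a separate convention (for example a detection-limit floor) before they could be included.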
Figure S4. SARS-CoV-2-Specific CD8+ T Cell Responses of Recovered COVID-19 Patients, Related to Figure 3
(A) Flow cytometry gating strategy.
(B) SARS-CoV-2 specific CD8+ T cells as determined by AIM+ (CD69+CD137+) CD8+ T cells. Response of PBMCs from COVID-19 cases between the negative
control (DMSO) and antigen specific stimulation.
(C) CMV-specific CD8+ T cells as percentage of AIM+ (CD69+CD137+) CD8+ T cells after stimulation of PBMCs with CMV peptide pool. Data were background
subtracted against DMSO negative control and are shown with geometric mean and geometric standard deviation. Samples were from unexposed donors
(‘‘Unexposed,’’ n = 11) and recovered COVID-19 patients (‘‘COVID-19,’’ n = 10).
(D-E) Stimulation index quantitation of AIM+ (CD69+CD137+) CD8+ T cells; the same samples as in Figure 2 and Figure S4C were analyzed.
Statistical comparisons across cohorts were performed with the Mann-Whitney test, while paired sample comparisons were performed with the Wilcoxon test.
**p < 0.01; ***p < 0.001. ns not significant.
Figure S5. Correlations between SARS-CoV-2-Specific CD4+ T Cells, Antibodies, and CD8+ T Cells, Related to Figure 4
(A) Correlation between SARS-CoV-2 spike specific CD4+ T cells and anti-spike RBD IgG, using CD4+ T cell stimulation index.
(B) Correlation between SARS-CoV-2 non-spike specific CD4+ T cells and anti-spike RBD IgG, using CD4+ T cell stimulation index.
(C) Correlation between SARS-CoV-2 spike specific CD4+ T cells (%) and anti-spike RBD IgA.
(D) Correlation between SARS-CoV-2 spike specific CD4+ T cells (%) and anti-spike RBD IgA.
(E) Correlation between SARS-CoV-2 specific CD4+ T cells and SARS-CoV-2 specific CD8+ T cells, using stimulation index. Total MP responses per donor
were used in each case (‘‘Non-spike’’ + ‘‘spike’’ (CD4_R + MP_S) for CD4+ T cells, CD8-A + CD8-B for CD8+ T cells).
Statistical comparisons were performed using Spearman correlation.
Figure S6. Protein Immunodominance of SARS-CoV-2 Specific CD4+ T Cells in Recovered COVID-19 Patients and Unexposed Donors,
Related to Figure 6
(A) The same data as Figure 6B, but with each unexposed donor color coded.
(B) The same experiment as Figure 6B, but with SARS-CoV-2 specific CD4+ T cells measured as percentage of AIM+ (OX40+CD137+) CD4+ T cells, after
background subtraction. COVID-19 cases (top, in blue. n = 10) and unexposed donors (bottom, in white. n = 10).

(C) Correlation of SARS-CoV-2 specific CD4+ T cells detected using the epitope prediction approach (CD4_R MP) compared against the sum total of all antigen
pools of overlapping peptides (excluding spike), run with samples from the same donors in two different experiment series. Dotted line indicates 1:1 concordance.
Statistical comparison was performed using Spearman correlation.
Figure S7. Protein Immunodominance of SARS-CoV-2-Specific CD8+ T Cells in Recovered COVID-19 Patients and Unexposed Donors,
Related to Figure 6
(A) The same data as Figure 6D, but with each unexposed donor color coded.
(B) The same experiment as Figure 6D, but with SARS-CoV-2 specific CD8+ T cells measured as percentage of AIM+ (CD69+CD137+) CD8+ T cells, after
background subtraction. COVID-19 cases (top, in red. n = 10) and unexposed donors (bottom, in gray. n = 10).
Article
Hybrid Gene Origination Creates Human-Virus Chimeric Proteins during Infection

[Graphical abstract. IAV mRNA synthesis during infection: cleavage of the host mRNA cap, annealing to viral genomic RNA, initiation of viral RNA transcription by the viral polymerase (vPol), generation of chimeric host-virus mRNA, and translation. Known mechanism: canonical viral proteins. Novel mechanism: chimeric proteins, either in-frame (viral proteins with host-encoded extensions) or off-frame (Upstream Frankenstein ORF, UFO; novel host-virus encoded proteins).]

Jessica Sook Yuin Ho, Matthew Angel, Yixuan Ma, ..., Jonathan W. Yewdell, Edward Hutchinson, Ivan Marazzi

Correspondence
edward.hutchinson@glasgow.ac.uk (E.H.), ivan.marazzi@mssm.edu (I.M.)

In Brief
The process by which RNA viruses, such as influenza virus, cleave capped host transcripts to drive viral mRNA production leads to the translation of host and viral RNA to make hybrid proteins that then generate T cell responses and contribute to virulence.

Highlights
• A mechanism of hybrid gene birth is employed by many families of RNA viruses
• Human RNA and viral RNA encode new genes together
• Hybrid genes either make extensions of viral proteins or novel proteins (UFOs)
• Human-virus genes and proteins play roles in pathogenesis and are conserved

Ho et al., 2020, Cell 181, 1502–1517
June 25, 2020 © 2020 The Authors. Published by Elsevier Inc.
OPEN ACCESS
Article
Hybrid Gene Origination Creates Human-Virus Chimeric Proteins during Infection
Jessica Sook Yuin Ho,1,26 Matthew Angel,2,26 Yixuan Ma,1,26 Elizabeth Sloan,14,26 Guojun Wang,1,12,25
Carles Martinez-Romero,1,12,13 Marta Alenquer,15 Vladimir Roudko,6,7,8,9 Liliane Chung,16 Simin Zheng,1 Max Chang,4
Yesai Fstkchyan,1 Sara Clohisey,16 Adam M. Dinan,17 James Gibbs,2 Robert Gifford,14 Rong Shen,20 Quan Gu,14
Nerea Irigoyen,17 Laura Campisi,1 Cheng Huang,19 Nan Zhao,1 Joshua D. Jones,17,22 Ingeborg van Knippenberg,14,23
Zeyu Zhu,1 Natasha Moshkina,1 Léa Meyer,14 Justine Noel,1 Zuleyma Peralta,5 Veronica Rezelj,14,24 Robyn Kaake,3
Brad Rosenberg,1 Bo Wang,16 Jiajie Wei,2 Slobodan Paessler,19 Helen M. Wise,16 Jeffrey Johnson,1,3
Alessandro Vannini,20,21 Maria João Amorim,15 J. Kenneth Baillie,16 Emily R. Miraldi,10,11 Christopher Benner,4
Ian Brierley,17 Paul Digard,16 Marta Łuksza,5 Andrew E. Firth,17 Nevan Krogan,3 Benjamin D. Greenbaum,6,7,8,9
Megan K. MacLeod,18 Harm van Bakel,5 Adolfo García-Sastre,1,12,13 Jonathan W. Yewdell,2 Edward Hutchinson,14,27,*
and Ivan Marazzi1,12,27,28,*
1Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
2Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD 20892, USA
3Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
4Department of Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92037, USA
5Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
6Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
7Department of Medicine, Hematology and Medical Oncology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
8Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
9Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
10Divisions of Immunobiology and Biomedical Informatics, Cincinnati Children’s Hospital, Cincinnati, OH 45229, USA
11Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45257, USA
12Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
13Division of Infectious Diseases, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
14MRC-University of Glasgow Centre for Virus Research, Glasgow G61 1QH, UK
15Instituto Gulbenkian de Ciência, 2780-156 Oeiras, Portugal
16The Roslin Institute, University of Edinburgh, Edinburgh EH25 9PS, UK
17Division of Virology, Department of Pathology, University of Cambridge, Cambridge CB2 0SP, UK
18Centre for Immunobiology, Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow G12 8QQ, UK
19Department of Pathology, the University of Texas Medical Branch, Galveston, TX 77555, USA
20Division of Structural Biology, The Institute of Cancer Research, London SW7 3RP, UK
21Fondazione Human Technopole, Structural Biology Research Centre, 20157 Milan, Italy
22Present address: Infection Medicine, Edinburgh Medical School: Biomedical Sciences, University of Edinburgh, Edinburgh, UK
23Present address: Department of Learning and Teaching Enhancement, Sighthill Court, Edinburgh Napier University, Edinburgh, UK
24Present address: Viral Populations and Pathogenesis Unit, Department of Virology, Institut Pasteur, CNRS UMR 3569, Paris, France
25Present address: The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences,
Inner Mongolia University, Hohhot, China

27Senior author
28Lead Contact
*Correspondence: edward.hutchinson@glasgow.ac.uk (E.H.), ivan.marazzi@mssm.edu (I.M.)

SUMMARY
RNA viruses are a major human health threat. The life cycles of many highly pathogenic RNA viruses, like influenza A virus (IAV) and Lassa virus, depend on host mRNA, because viral polymerases cleave 5′-m7G-capped
host transcripts to prime viral mRNA synthesis (‘‘cap-snatching’’). We hypothesized that start codons within
cap-snatched host transcripts could generate chimeric human-viral mRNAs with coding potential. We report
the existence of this mechanism of gene origination, which we named ‘‘start-snatching.’’ Depending on the
reading frame, start-snatching allows the translation of host and viral ‘‘untranslated regions’’ (UTRs) to create
N-terminally extended viral proteins or entirely novel polypeptides by genetic overprinting. We show that
both types of chimeric proteins are made in IAV-infected cells, generate T cell responses, and contribute
to virulence. Our results indicate that during infection with IAV, and likely a multitude of other human, animal
and plant viruses, a host-dependent mechanism allows the genesis of hybrid genes.
Cell 181, 1502–1517, June 25, 2020 © 2020 The Authors. Published by Elsevier Inc.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
INTRODUCTION

In eukaryotes, ribosomes typically recognize mRNAs with a terminal 5′ cap structure followed by an untranslated region (UTR), which can be tens to hundreds of nucleotides in length (Decroly et al., 2011; Kochetov et al., 2008; Leppek et al., 2018). However, a growing body of work has shown that translation can initiate in the 5′ UTRs of a large proportion of eukaryotic mRNAs, sometimes extremely close to the 5′ cap, resulting in upstream open reading frames (uORFs) (Andreev et al., 2015; Calvo et al., 2009; Dikstein, 2012; Elfakess and Dikstein, 2008; Haimov et al., 2017; Johnstone et al., 2016; Kochetov et al., 2008; Young and Wek, 2016).

A large subphylum of RNA viruses, the segmented negative strand RNA viruses (sNSVs), makes direct use of the 5′ termini of host mRNAs when transcribing their own genes. The sNSVs include the families Arenaviridae, Peribunyaviridae, and Orthomyxoviridae. Highly contagious human and animal viruses like influenza A virus (IAV) and Lassa virus (LASV) belong to these families and are responsible for significant levels of morbidity and mortality worldwide. In sNSVs, viral mRNA synthesis is primed using short 5′ methyl-7-guanosine (m7G) capped RNA sequences, which the viral polymerase cleaves from host RNA polymerase II (RNAPII) transcripts in a process known as ‘‘cap-snatching’’ (Dias et al., 2009; Plotch et al., 1981; Reich et al., 2014; Rialdi et al., 2017). Cap-snatching creates viral transcripts that are genetic hybrids of host and viral sequences, with the host-derived 5′ sequences being highly diverse (Gu et al., 2015; Koppstein et al., 2015; Rialdi et al., 2017; Sikora et al., 2017). Once made, viral mRNAs are translated by the host machinery.

In this manuscript, we hypothesized that by appropriating 5′ terminal mRNA sequences from their hosts, sNSVs could obtain functional upstream start codons (uAUGs), a mechanism we termed ‘‘start-snatching.’’ Translation from host-derived upstream start codons in chimeric host-viral transcripts would access upstream viral ORFs (uvORFs). Depending on the frame of the uAUG relative to that of the canonical viral protein, two novel types of chimeric protein could be generated in infected cells: canonical viral proteins with host and viral UTR-derived N-terminal extensions, and previously uncharacterized proteins read from ORFs that are out-of-frame with, and overprinted on, canonical viral ORFs. Below, we report on how we tested this hypothesis using genomics, cell biology, virology, and phylogenetic analyses.

RESULTS

IAV Cap-Snatches Sequences Containing uAUGs
IAV gene transcription is initiated by cap-snatching from a host mRNA (Figure 1A). This process generates an IAV mRNA with a 5′ end portion derived from the host. This mechanism is used to express viral genes that encode canonical viral proteins (Figure 1B, OUTCOME 1). We hypothesized that AUGs within host sequences could generate upstream host-virus chimeric ORFs with coding potential. Depending on the reading frame, a host-derived uAUG might initiate the synthesis of one of two novel chimeric gene products: an N-terminally extended viral protein (Figure 1B, OUTCOME 2, upper panel) or, alternatively, an entirely novel protein overprinted on the canonical viral ORF (Figure 1B, OUTCOME 2, lower panel). These outcomes are contingent on two assumptions: (1) uAUGs are present in cap-snatched host sequences and can enable translation initiation, and (2) the 5′ mRNA transcribed from the viral UTR should lack stop codons. Furthermore, the absence of stop codons interrupting UTRs or the downstream ORFs should be evolutionarily conserved.

To address the first point, we determined the abundance of uAUGs in cap-snatched host sequences archived in a Decap and 5′ end sequencing (DEFEND-seq) dataset (Rialdi et al., 2017) that we had previously generated from A549 cells infected with the IAV A/Puerto Rico/8/34(H1N1) (PR8) (Figure 1C). AUG-containing, host-derived capped sequences (Figure 1C, red bars) ranged from 7–20 nt, with a median length of 11 nt, similar to the distribution obtained for all cap-snatched sequences (Figure 1C, gray bars). Host-derived oligonucleotides with AUG codons were present at similar ratios in all eight genome segments of the virus and were present in all three reading frames, constituting 12% of all cap-snatched sequences (Figures 1D and S1A). Similar results were also obtained when we performed cap analysis of gene expression (CAGE) on primary human monocyte-derived macrophages infected with a different strain of IAV (A/Udorn/72(H3N2); Udorn) (Figure S1B; Table S1). These results indicate that, upon infection, neither the virus nor the host cells appear to prevent the formation of chimeric RNAs with hybrid coding potential.

IAV 5′ UTRs Are Translatable
We next performed a bioinformatic analysis to determine if stop codons were absent from IAV sequences within the 5′ UTRs and, if so, whether this was evolutionarily conserved across IAV strains. First, we analyzed the nucleotide sequence variability of the 5′ UTRs of all eight segments, using all IAV H1N1 strains available from the NCBI Influenza Virus database (Zhang et al., 2017). The 5′ UTRs are highly conserved within each individual segment, as shown by the positional weight matrices (Figure S2, top panels) and sequence alignment (Figure S2, lower panels). We then translated the 5′ UTR of each genome segment in silico in all possible frames (Figure 2A, upper panels). This revealed that the 5′ UTR of every IAV genome segment remains free of stop codons in at least one reading frame (Figure 2A, upper panels, stop codons indicated by red boxes).

We found that the 5′ UTRs of five out of the eight genome segments (PB2, HA, NP, NA, and NS) lacked upstream stop codons in-frame with the major ORF (Figure 2A, upper panels, major ORF start codons indicated by green boxes). These segments thus have the potential to code for N-terminally extended viral proteins. Stop codons were also absent from the 5′ UTRs of six of the eight genome segments when these were read out of frame with the major ORF (Figure 2A; segments PB2, PB1, PA, NA, M, and HA). This suggested the intriguing possibility that, in the presence of a host-donated start codon, these genome segments could make novel genes encoding hybrid polypeptides.

To probe the length of uvORFs, we translated viral sequences that had cap-snatched uAUGs in our dataset in silico. The result of these analyses (Figure 2A, lower panels) indicated the general propensity to create chimeric ORFs, with half of the viral genome
[Figure 1 image. (A) Host mRNA, viral RNA (vRNA), and viral mRNA (vmRNA) during infection: 1, cleavage; 2, annealing to viral genomic RNA; 3, primed viral RNA transcription. (B) OUTCOME 1, known mechanism: canonical viral protein. OUTCOME 2, novel mechanism, depending on frame: viral protein with human-encoded extension, or new human-virus encoded protein. (C) x axis, CS length (nt); y axis, percentage of CS sequences; all CS sequences versus CS containing AUGs; 11 nt marked. (D) x axis, segment (PB2, PB1, PA, HA, NP, NA, M, NS); y axis, percent of all unique CS sequences that contain uAUGs.]
Figure 1. Upstream AUGs Are Present in Host-Derived Sequences of Viral mRNAs

(A) Schematic of cap-snatching during the transcription of a segmented negative sense RNA virus (sNSV) such as influenza A virus (IAV).
(B) Schematic showing how the presence of upstream AUGs (uAUGs) in host-derived cap-snatched RNA sequences may drive the formation of novel host-viral
chimeric proteins.
(C) Histograms showing the length distributions of all cap-snatched (CS) sequences (gray bars) or only CS sequences containing uAUGs (red bars) in A549 cells
infected with IAV (strain PR8) for 4 h, as determined by DEFEND-seq.
(D) Bar plots showing the percentages of uAUG containing CS sequences in each IAV genome segment.
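The quantities plotted in (C) and (D) can be sketched in a few lines. This is an illustrative outline only, not the authors' DEFEND-seq pipeline: the input format (pairs of segment name and cap-snatched sequence), the function names, and the toy reads are all assumptions.

```python
from collections import Counter

def normalize(seq):
    """Upper-case a read and express it as RNA (T -> U)."""
    return seq.upper().replace("T", "U")

def summarize_cs(reads):
    """Summarize cap-snatched (CS) sequences.

    reads: iterable of (segment, cs_sequence) pairs (hypothetical format).
    Returns three items:
      - length histogram of all unique CS sequences,
      - length histogram of unique CS sequences that contain an AUG,
      - percent of unique CS sequences containing an AUG, per segment.
    """
    unique = {(seg, normalize(seq)) for seg, seq in reads}
    all_lengths = Counter(len(seq) for _, seq in unique)
    aug_lengths = Counter(len(seq) for _, seq in unique if "AUG" in seq)
    totals = Counter(seg for seg, _ in unique)
    with_aug = Counter(seg for seg, seq in unique if "AUG" in seq)
    percent = {seg: 100.0 * with_aug[seg] / totals[seg] for seg in totals}
    return all_lengths, aug_lengths, percent

# Toy reads, invented for illustration only.
toy_reads = [
    ("PB2", "GCAUGGCAAUC"),
    ("PB2", "GCAAAGCAAUC"),
    ("NP", "ATGGCGCAAGC"),   # DNA-style read, collapses with the next one
    ("NP", "AUGGCGCAAGC"),
]
all_lengths, aug_lengths, percent = summarize_cs(toy_reads)
print(sorted(percent.items()))
```

The sketch only looks for AUGs that lie entirely within the host-derived sequence. Start codons that span the host-virus junction would need the first viral nucleotides appended before searching.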
segments predicted to make sizable products (>30 aa) (Figure 2B). These ORFs overlap with canonical viral genes but are read in different frames (overprinted). They range from over 40 residues (HA) to nearly 80 residues (PB1). Where N-terminal extensions of the major ORF were possible, these ranged from 8–21 aa in length (Figure 2B).

Thus, uvORFs are present in all genome segments and, if licensed by host-derived uAUG-containing RNAs, could generate polypeptides of varying length (Figure 2B).

Host-Virus mRNA Chimeras Associate with Elongating Ribosomes
If cap-snatched host uAUGs did initiate translation of viral 5′ UTRs, the 5′ termini of viral mRNAs would be bound by initiating ribosomes. We therefore performed ribosomal profiling of IAV-infected cells in the presence of harringtonine, which blocks elongation of de novo assembled 80S initiation complexes but not of those already engaged in elongation. Ribosome-protected fragments (RPFs) were mapped to both the human and viral genomes (Figures 3A and S3A–S3C). Mapping of RPF sequences revealed an accumulation of ribosomes at the canonical initiation site in mRNAs transcribed from all eight genome segments (Figure 3B; main ORF AUG), consistent with previous reports (Machkovech et al., 2019). As well as observing ribosomes accumulating at the canonical initiation sites, we also observed RPFs mapping to the host-derived sequence upstream of the 5′ UTR, suggesting that translation initiated in this region (Figure 3B, insets). The total number of RPF reads mapping to host-derived sequences for each segment was 5%–20% of the reads mapping to the canonical start codon (Figure 3C), broadly consistent with the proportion of cap-snatched sequences containing uAUGs (Figures 1D and S1B).
Figure 2. IAV 5′ UTRs Are Conserved and Translatable

(A) Sequence analysis of all unique 5′ UTR sequences from each segment of 10,904 H1N1 subtype IAV genomes (coding sense), showing (upper panels) the
translation of the 5′ UTR in all three reading frames; and (lower panels) the predicted amino acid length (aa) distributions of N-terminal extensions to the major
gene product and of overprinted new ORFs. This is calculated from the distribution of uAUG positions in DEFEND-seq data and (for overprinted new ORFs) from
the position of stop codons in the IAV PR8.
(B) The numbers of translatable products that could be accessed from uAUGs in each genome segment of IAV.
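The frame analysis summarized in this figure can be illustrated with a small sketch. It joins a cap-snatched host sequence to a viral mRNA 5′ end, finds each uAUG that starts in the host-derived part, and reports whether it gives an N-terminal extension (in frame with the main AUG, no intervening stop) or an overprinted uvORF (out of frame). The function names and the toy sequences are hypothetical and are not taken from the authors' analysis; only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, bases ordered U, C, A, G (first base varies slowest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(rna, start):
    """Translate from `start` until a stop codon or the end of the sequence.
    Returns (peptide, hit_stop)."""
    peptide = []
    for i in range(start, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            return "".join(peptide), True
        peptide.append(aa)
    return "".join(peptide), False

def classify_uaugs(cs, viral, main_aug):
    """Classify uAUGs that start in the cap-snatched sequence `cs`.

    viral: viral mRNA from its first nucleotide; main_aug: 0-based index of
    the canonical start codon within `viral`.
    Returns a list of (position in chimera, category, peptide).
    """
    chimera = (cs + viral).upper().replace("T", "U")
    main_pos = len(cs) + main_aug
    results = []
    pos = chimera.find("AUG")
    while pos != -1 and pos < len(cs):
        peptide, _ = translate(chimera, pos)
        upstream_codons, offset = divmod(main_pos - pos, 3)
        if offset != 0:
            results.append((pos, "overprinted uvORF", peptide))
        elif len(peptide) > upstream_codons:
            # Read-through into the main ORF: report only the extension.
            results.append((pos, "N-terminal extension", peptide[:upstream_codons]))
        else:
            results.append((pos, "terminated upstream of main AUG", peptide))
        pos = chimera.find("AUG", pos + 1)
    return results

# Toy viral 5' end: 15 nt UTR followed by a short ORF (invented sequence).
TOY_VIRAL = "AGCAAAAGCAGGGUC" + "AUGGAACGCAUCAAAUAA"
print(classify_uaugs("GCAUGGCAAUC", TOY_VIRAL, 15))
print(classify_uaugs("GAUGCAAUCUC", TOY_VIRAL, 15))
```

Applied across many cap-snatched sequences per segment, the peptide lengths from such a routine would give length distributions of the kind shown in the lower panels of (A).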
[Figure 3 image. (A) Aligned reads (%) for mRNA 1, mRNA 2, Ribo 1, Ribo 2, Ribo + Harr 1, and Ribo + Harr 2. (B) One panel per segment (PB2, PB1, PA, HA, NP, NA, M, NS): normalized read counts (y axis) against distance from first viral nucleotide (x axis), with the canonical AUG marked and CS reads shown in insets. (C) Proportion of total CS reads relative to reads at the main ORF AUG (y axis) for each segment (x axis). (D) Percentage of CS sequences with AUG for DMSO, harringtonine, and mRNA samples, all segments.]
Figure 3. IAV mRNAs Can Be Translated from Host-Derived AUGs

(A) Proportion of reads that align to viral and human transcripts for the indicated experimental conditions.
(B) 5′ end mapping of ribosome protected fragments (RPFs) in harringtonine-treated A549 cells infected with the IAV PR8 at 8 h post-infection, showing for each
segment of the IAV genome the distribution of reads in the cap-snatched regions (shown in insets) and virally encoded mRNA up to 10 nt after the canonical start
codon. The x axis is shown relative to the first virally encoded nucleotide.
(C) For each IAV genome segment, the number of ribosome-protected fragments (RPFs) upstream of the canonical AUG as a proportion of those mapping to the
canonical AUG is shown. Data are shown as the mean ± SD.
(D) Barplots showing the percentages of RPFs that contain an AUG when cells were treated with DMSO (black bars) or harringtonine (gray bars) immediately prior
to harvest, or from total mRNA-seq (white bars). Results from two sequencing replicates are shown as points, with bars showing the mean.
Precisely mapping initiation sites very close to the cap is challenging, because many of the heterogeneous 5′ mRNA ends would be too short to extrude from the ribosome, making P-site phasing problematic by standard Ribo-seq analysis. To address this, we used the location of AUGs within the RPF to identify the reading frame being translated. This suggested that initiation occurred in all three reading frames (Figure S3D). uAUG codons were more frequently close to the start of the viral UTR sequences, peaking at the −2 position of mRNAs from all genome segments (numbered from the first position in the coding sense of the viral genome segment), and less frequent toward the 5′ end of the host-derived sequence (Figure S3D). As well as inferring upstream ribosome initiation by mapping RPFs to protected uAUGs, we could test for it directly by comparing ribosomal profiles with and without harringtonine arrest. Harringtonine increased the proportion of RPFs from cap-snatched sequences that contained an AUG, indicating translation was initiating on uAUGs in these host-derived sequences (Figure 3D). Taken together, our data show that translation initiates from cap-snatched host-derived uAUGs in viral mRNA chimeras, albeit at lower frequencies than at canonical start codons.
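The idea of reading the frame from the position of an AUG inside each RPF, rather than from P-site phasing, can be sketched as follows. This is a simplified illustration under stated assumptions: positions are 0-based offsets from the first viral nucleotide (host-derived nucleotides are negative), the input format and function names are hypothetical, and the toy reads are invented.

```python
from collections import Counter

def upstream_aug_frames(rpf, five_prime_pos, main_aug_pos):
    """Locate AUGs that start in the host-derived part of one RPF.

    five_prime_pos: position of the RPF's first nucleotide relative to the
    first viral nucleotide (negative = host-derived; assumed convention).
    main_aug_pos: position of the canonical AUG in the same coordinates.
    Returns (position, frame) pairs; frame 0 is in frame with the main ORF.
    """
    rpf = rpf.upper().replace("T", "U")
    hits = []
    i = rpf.find("AUG")
    while i != -1:
        pos = five_prime_pos + i
        if pos < 0:
            hits.append((pos, (pos - main_aug_pos) % 3))
        i = rpf.find("AUG", i + 1)
    return hits

def summarize_rpfs(rpfs, main_aug_pos):
    """rpfs: list of (sequence, five_prime_pos) pairs.
    Returns (frame counts, fraction of RPFs with at least one upstream AUG)."""
    frames = Counter()
    n_with_aug = 0
    for seq, pos in rpfs:
        hits = upstream_aug_frames(seq, pos, main_aug_pos)
        if hits:
            n_with_aug += 1
        frames.update(frame for _, frame in hits)
    fraction = n_with_aug / len(rpfs) if rpfs else 0.0
    return frames, fraction

# Toy RPFs, invented for illustration only.
toy_rpfs = [
    ("GCAUGGCAAUCAGCAAAAGCAGG", -11),
    ("GAUGCAAUCUCAGCAAAAGC", -11),
    ("GCAAAGCAAUCAGCAAAAGCAGG", -11),
]
frames, fraction = summarize_rpfs(toy_rpfs, main_aug_pos=15)
print(dict(frames), fraction)
```

Comparing the returned fraction between harringtonine-treated and untreated libraries would mirror the comparison described for Figure 3D.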
Host-Virus Protein Chimeras Are Expressed during Infection, Recognized by T Cells, and Affect Virulence
To demonstrate that chimeric proteins are expressed during infection, we performed mass spectrometry analyses of cell lysates from infected cells. We also checked whether any chimeric proteins could be integrated into viral progeny by analyzing purified virions (Figures 4A, S4A, and S4B).

There are limitations to this approach, as the likelihood of a tryptic digest generating peptides that can be detected by the mass spectrometer is lower for short proteins. This issue reduces the chance of finding peptides derived from small overprinted uvORFs (<30 aa), or that map to short N-terminal extensions. Nevertheless, we were able to identify at least 2 distinct peptides that were derived from the two long overprinted uvORFs in the PB1 and PB2 segments, which we named PB1-UFO and PB2-UFO, respectively (for ‘‘Upstream Frankenstein ORF’’). In addition, we detected a UTR-encoded N-terminal extension of NP, which we named NP-extension (NP-ext) (Figures 4A, 4B, S4A, and S4B; Table S2A). Peptides from all three proteins were present in PR8 IAV-infected cell lysates (Figures 4B, left panels, and S4A; Table S2A). These novel viral peptides were not detected in uninfected controls (Figure S4A). We were also able to identify peptides derived from the PB1-UFO protein when we re-analyzed three previously published proteomic datasets of IAV infection (Heaton et al., 2016) (Figure S4C; Table S2C). Only NP-ext was detected in virions (Figure S4B; Table S2B), presumably because influenza virions specifically package hundreds of copies of NP, while there is no known mechanism to specifically package other uvORF-encoded proteins (Hutchinson et al., 2014).

Quantification of the PB1-UFO, PB2-UFO, and NP-ext proteins indicated that, although they are less abundant than the major viral proteins, they are expressed at detectable levels within an infected cell. When quantified, tryptic peptides from these proteins were found between the 20th and 40th percentile of normalized peptide intensities, including both host and viral proteins, within our samples (Figures S4A and S4B). Taken together, our data show that N-terminal extensions and overprinted uvORFs are synthesized during IAV infection and are present at a moderate abundance within infected cells.

We next asked whether chimeric host-viral proteins could be recognized by the host’s immune system. To test this, we created modified IAVs containing insertions of a class I-restricted epitope of ovalbumin (Porgador et al., 1997). Based on the uvORFs predicted from our in silico analyses, we inserted the epitope (OVAI; OVA 257-264; SL8; SIINFEKL) in frame with the longest uvORF (PB1 frame 3 uvORF; PB1-UFO(SIIN)) (Figure 4C) and one of the shortest uvORFs (NS, frame 2 uvORF; NS-UFO(SIIN)) (Figure 4D). In the case of the PB1 segment, we integrated sequences encoding OVAI directly into the UTR, placing the epitope within the uvORF encoding PB1-UFO (Figure 4C, top panels). For the NS segment, we used synonymous mutations in the canonical viral gene to delete five naturally occurring stop codons in the uvORF; we then inserted OVAI into the extended uvORF, positioning the insertion in a flexible ‘‘linker’’ region of the major viral gene NS1 (Thulasi Raman and Zhou, 2016). This genetic configuration was chosen to ascertain whether uvORFs are translated by default provided that they are not interrupted by stop codons (Figure 4D, top panels).

Mouse DC2.4 cells infected with PB1-UFO(SIIN) activated transgenic OT-I CD8+ T cells (that are highly specific for mouse H-2 Kb class I molecule complexed with SIINFEKL; Kb-SIIN) (Hogquist et al., 1994) as determined by upregulation of CD25 and CD69 (Figure 4C, lower panels). Recombinant IAV expressing SIIN (PB1-Ub-SIIN) at high levels (Wei et al., 2019) was used as a positive control (Figure 4C, right panels). No upregulation of CD25 and CD69 was observed in mock-treated samples. Similar results were obtained with the NS-UFO(SIIN) virus. Here, OT-I CD8+ T cells were activated when incubated with bone marrow-derived dendritic cells (BMDCs) infected with the NS-UFO(SIIN) virus (Figure 4D, right panels). This was comparable to the activation seen in a control experiment using a virus in which OVAI was inserted into the stem of the viral NA protein (NA-SIIN) (Figure 4D, middle panels) (Bottermann et al., 2018). Again, no upregulation was observed during mock infection. Taken together, our data with both the PB1-UFO(SIIN) and the NS-SIIN viruses indicate that, unless blocked by stop codons, uvORFs are translated and expressed during infection, and T cell immunosurveillance extends to peptides encoded by uvORFs.

Next, to probe if the expression of chimeric host-viral proteins has an impact on viral pathogenesis, we generated a battery of recombinant viruses, in which specific N-terminal extensions or uvORFs were knocked out through the introduction of premature stop codons (NP-Δext and UFOΔ, respectively). The viruses were generated either in the PR8 (Figures 4E, 4F, and
Figure 4. uvORFs Are Expressed during Infection and Can Contribute to Virulence
(A) The number of upstream viral open reading frames (uvORFs) that could be translated for each segment of the IAV genome (empty circles), highlighting those
detected in infected cell lysates by mass spectrometry (filled red circles).
(B) Tryptic peptides that map to translated uvORFs, detected by mass spectrometry across multiple experiments (summarizing data in Figures S4A and S4C).
(C) Schematic showing the generation of the PB1-UFO(SIIN) virus. DC2.4 cells were infected with the indicated viruses and co-cultured with OT-I CD8+ T cells.
OT-I activation, assessed by CD69 and CD25 expression, was assayed by flow cytometry at 24 h post co-culture. vmRNA, viral mRNA.
(D) Schematic showing the generation of the NS-SIIN virus. Red bars indicate stop codons mutated to permit uninterrupted NS1-UFO translation. Mouse BMDC
cells were incubated with IAV antigen presentations, and co-cultured with OT-I CD8+ T cells. OT-I activation, assessed by CD69 and CD25 expression, was
assayed by flow cytometry at 24 h post co-culture.
(E) Upper panel: schematic showing mutations that truncate NP-ext (NP-ΔEXT) and control mutations (NP-SYN), as engineered into the IAV PR8. Wild-type PR8 is
also shown. Lower panel: weight loss and survival curves of 6- to 8-week-old BALB/c mice infected with 15 plaque-forming unit (PFU)/mouse of the indicated
viruses. Data are an aggregate of 2 independent experiments of n = 3 mice, using 2 independently plaque purified clones of the NP-ΔEXT or PR8;NP-SYN viruses
(total n = 6/condition). *p < 0.05; data are shown as the mean ± SEM.
(F) Upper panel: schematic showing mutations that knocked out PB1-UFO (PB1-UFOΔ) and control mutations (PB1-UFOSYN), as engineered into the IAV PR8.
Wild-type PR8 is also shown. Lower panel: weight loss and survival curves of 6- to 8-week-old BALB/c mice infected with the indicated dose (per mouse) of the
indicated viruses. n = 10 mice/condition. *p < 0.05. Data are shown as the mean ± SEM.
[Figure 5 image, panels A–H; see the Figure 5 legend. Panel C defines the frequency propagator ratio: G(X) is the chance that a class of mutations in PB1-UFO reaches frequency X; G0(X) is the chance that a neutral class of mutations (synonymous mutations that occur in the PB1 reading frame and do not overlap with PB1-UFO) reaches the same frequency X; g(X) = G(X) / G0(X). g(X) > 1: positive selection (mutation class likely to be propagated); g(X) ≈ 1: neutral or heterogeneous selection; g(X) < 1: negative selection (mutation class not likely to be propagated).]
S4D), A/WSN/33(H1N1) (WSN) (Figure S4E), or mouse-adapted A/California/04/2009(H1N1) (Cal09) (Figure S4F) backgrounds. We also generated the reciprocal control viruses carrying synonymous mutations (NPSYN;UFOSYN). Both genomic configurations of control and knockout viruses kept the canonical viral ORFs intact (Table S3).
The mutant viruses did not display gross alterations in viral growth in vitro (Figures S4D–S4F). This was independent of viral background and also of the cell type infected (Figures S4D–S4F). To determine if interrupting upstream translation had effects in vivo, we focused on the NP-Δext and PB1-UFOΔ viruses in the PR8 background. The strategy used to generate these viruses is shown in the top panels of Figures 4E and 4F.
We found that the NP-Δext viruses were less virulent in mice compared to the control NP-SYN viruses (Figure 4E), suggesting that NP-ext expression contributes to virulence. A similar role for NP-ext was recently proposed for the pandemic 2009 IAV (pdm2009) strain, in which an extended NP protein was found to contribute to virulence in mice and pigs (Wise et al., 2019). Importantly, however, pdm2009 viruses translate NP-ext from a uAUG encoded in the 5′ UTR of NP, but no corresponding uAUG is encoded by the PR8 virus used in our study.
The PB1-UFOΔ viruses displayed increased virulence when compared to the PB1-UFOSYN viruses in vivo, although in this case an effect was only observed at high infectious doses (Figure 4F). Gene expression analyses suggested that there were distinct transcriptomic signatures in the lungs of mice infected with high doses of the PB1-UFOΔ or PB1-UFOSYN viruses (Figures S4G and S4H; Table S4A). Gene Ontology analysis of differentially expressed genes indicated changes in a number of pathways, including leukocyte activation and pro-inflammatory cytokine secretion (Figure S4I; Table S5). Immune cell dysregulation may therefore be at least partially responsible for the differences in morbidity and mortality during infection with the PB1-UFOΔ or PB1-UFOSYN viruses.
Together, these functional data show that uvORFs are expressed during IAV infections, can be detected by the adaptive immune system, and can modulate the severity of infection.

Chimeric Host-IAV Proteins Are Conserved
We next asked if NP-ext and PB1-UFO are conserved across different strains. The ability to express NP-ext without interruption by stop codons in the 5′ UTR was maintained in 99.9% of IAV isolates present in the NCBI Influenza database (Zhang et al., 2017) (Figures S2, S5A, and S5B). Sequence analysis of the translated 5′ UTR also suggested that N-terminally extended sequences would be similar within IAV subtypes (Figure S5C). There are many reasons why these sequences are conserved, including constraints imposed by RNA structure and the requirement to interact with the viral polymerase complex (Fodor, 2013). Whatever the primary selective pressure, the result of the conservation of the 5′ UTR sequence is that the ability to express NP-ext is nearly universal among IAV strains.
The ability to express PB1-UFO requires not only a lack of stop codons in the appropriate frame of the 5′ UTR, but also the maintenance of a uvORF overprinted on the canonical PB1 ORF. We first analyzed sequences of the IAV subtypes H1N1, H3N2, and H5N1. We found that PB1-UFO is conserved within each of these three virus subtypes (Figure 5A), and stop codons resulting in PB1-UFO proteins <77 aa long were infrequent (Figure 5B).
To understand the factors that contribute to the maintenance of PB1-UFO ORF length and amino acid sequence composition within the IAV, we first looked at the probability that an ORF similar in length to PB1-UFO could have arisen stochastically in the IAV PB1 segment. We used a sequence randomization model (Figure S5D) on the H3N2 subtype of IAV, the subtype for which the greatest number of complete sequences were available. We found that 77% of the sequences in the NCBI Influenza database (Zhang et al., 2017) encoded a 77-aa PB1-UFO (Figure S5E) that is significantly longer than the 15–30 aa long ORFs expected by chance (Figures S5E–S5G). We also found that these predicted ORFs would require multiple (30–70) additional synonymous mutations in order to generate an ORF that is of similar length to PB1-UFO (Figure S5H).
The above analysis does not take into account constraints imposed by nucleotide biases in the viral UTR or canonical PB1 ORF or from viral RNA structure. To examine their roles in the maintenance of the PB1-UFO ORF we used the frequency propagator method (Luksza and Lässig, 2014; Strelkowa and Lässig, 2012) (Figures 5C and S6A). This method can determine
Figure 5. uvORFs Are Conserved

(A) Conservation analysis of PB1-UFO protein sequences across all IAV subtypes.
(B) Pie charts showing percentages of sequences in H1N1, H3N2, and H5N1 IAV subtypes that have a PB1-UFO that is 77 aa long (blue), 50–77 aa long (gray), 30–
50 aa long (orange), and <30 aa long (yellow).
(C) Outline of the propagator model analysis. Diagrams describe possible outcomes and interpretations of calculated g(X) ratios.
(D) Frequency propagator ratios of the indicated classes of mutations occurring in PB1-UFO relative to the PB1 open reading frame of H3N2 viruses. Top: regions
used for the test (G(X); yellow) and neutral class (G0(X); blue) ratios are shown. The test class is the region of PB1-UFO ORF that overlaps only with the viral 5′ UTR;
the neutral class consists of synonymous mutations in the PB1 ORF that do not overlap with PB1-UFO. All nucleotide positions were considered. Error bars
indicate sampling uncertainties. See also Figure 5C for interpretations.
(E) Frequency propagator ratios, as in (D), but with the test class comprising the C-terminal region of the PB1-UFO ORF.
(F) Frequency propagator ratios, as in (D), but with the test class comprising the region in the main PB1 ORF overlapping the PB1-UFO reading frame.
(G) Number of predicted PB1-UFO epitope-allele interactions for 11 frequent human HLA alleles. Heatmaps show number of PB1-UFO epitopes derived from all
possible unique identities and predicted to bind selected MHC-I alleles. Number of unique identities (i.e., unique influenza A virus sequences) encoding predicted
epitopes are shown in histograms, next to the heatmaps.
(H) Locations of PB1-UFO peptides that are predicted to result in strong (Kd <500 nM) unique interacting HLA-epitope pairs across the PB1-UFO reading frame.
This plot is juxtaposed with percent identity plot of PB1-UFO (lower panel) across 3,140 unique PB1-UFO sequences taken from the NCBI Influenza Database
(Zhang et al., 2017).
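The ratio outlined in Figure 5C can be illustrated with a minimal Python sketch. It assumes that each mutation has already been reduced to the highest frequency it reached in the population; the published method (Strelkowa and Lässig, 2012) works on a reconstructed strain tree and estimates sampling uncertainties, neither of which is attempted here. The function names, the tolerance `tol`, and the example frequencies are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def propagator(peak_freqs: Sequence[float], x: float) -> float:
    """G(x): fraction of mutations in a class whose peak frequency exceeds x."""
    if not peak_freqs:
        raise ValueError("mutation class is empty")
    return sum(f > x for f in peak_freqs) / len(peak_freqs)


def propagator_ratio(test: Sequence[float], neutral: Sequence[float], x: float) -> float:
    """g(x) = G(x) / G0(x); returns NaN when no neutral mutation exceeds x."""
    g0 = propagator(neutral, x)
    return propagator(test, x) / g0 if g0 > 0 else float("nan")


def interpret(g: float, tol: float = 0.1) -> str:
    """Map g(x) onto the three outcomes of Figure 5C (tol is an arbitrary margin)."""
    if g != g:  # NaN: the neutral class never reaches this frequency
        return "undefined"
    if g > 1 + tol:
        return "positive selection"
    if g < 1 - tol:
        return "negative selection"
    return "neutral / heterogeneous selection"


# Hypothetical peak frequencies, not data from the study.
test_class = [0.05, 0.1, 0.2, 0.6]
neutral_class = [0.1, 0.3, 0.6, 0.9]
g = propagator_ratio(test_class, neutral_class, x=0.25)
print(round(g, 3), interpret(g))
```

Evaluating `propagator_ratio` over a grid of x values between 0 and 1 would give a curve comparable in shape to those plotted in panels D–F.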
if these factors imposed constraints on the PB1-UFO amino acid sequence. The model and its possible outcomes are shown and discussed in detail in Figures 5C and S6A and the STAR Methods.
Briefly, mutations that occur in the viral UTR region, which encodes the N-terminal part of PB1-UFO, undergo negative selection (Figure 5D; g < 1). This indicates that mutations in the viral UTR, should they occur, have a low probability of being propagated down the IAV strain tree. On the other hand, when we consider the nucleotide sequences that encode the overlapping regions of PB1-UFO and the canonical PB1 ORF, we see that there is heterogeneous/neutral selection occurring on mutations in the PB1-UFO ORF (g ≈ 1). This is most likely shaped by the requirement to maintain the main PB1 ORF sequence, as mutations that maintain the PB1 ORF aa sequence (synonymous mutations in PB1 ORF) are more likely to be fixed in the population (Figure 5E; red line; g < 1). Mutations that change the PB1 amino acid sequence instead undergo negative selection (Figure 5F; blue line; g < 1) and are unlikely to be propagated down the strain tree, consistent with PB1 ORF being fixed and essential for IAV. Selection in these regions is unlikely to be dominated by RNA structural constraints because similar effects are observed when RNA secondary structure is taken into account for our analysis (Figures S6B–S6D). Overall, our analyses suggest that PB1-UFO conservation is largely dictated by the need to preserve both the viral UTR nucleotide sequence and the amino acid sequence of the main PB1 ORF. Taken together, this suggests that the evolution of the PB1-UFO ORF is heavily constrained by converging selective pressures.
Because we had shown that peptides derived from PB1-UFO could be presented to the immune system (Figures 4C and 4D), we asked whether epitope-HLA class I interactions could play a role in shaping PB1-UFO sequence. We found that multiple unique PB1-UFO peptides were predicted to bind to and interact with various HLA types (Figure 5G; Table S6). Notably, high-affinity (<500 nM) HLA-epitope pairs were concentrated in regions of PB1-UFO where conservation was low, suggesting that immune pressure on PB1-UFO may lead to some diversifying selection on the protein (Figure 5H).

Chimeric Host-Virus Proteins of Other Viruses
Finally, we asked whether our finding that start-snatching generates novel ORFs could be generalized from IAV to other sNSVs. We began by looking at another member of the Orthomyxoviridae family, influenza B virus (IBV), by performing DEFEND-seq on A549 cells infected with IBV. The host-derived sequences that IBV obtains by cap-snatching had comparable median lengths to those appropriated by IAV (Figure S7A). Sequence analysis indicates that uAUG-initiated translation could read through the 5′ UTR of every IBV genome segment in at least one frame and predicted at least two long overprinted new ORFs (PA and NA segments) (Figures 6A and 6B), as well as N-terminal extensions of six of the eight major viral proteins (Figures 6A and S7B).
Next, we looked at other families of sNSVs. We performed CAGE analysis on cells infected by Lassa virus (LASV), a member of the family Arenaviridae and an emerging virus that in the past decade has caused several epidemics of hemorrhagic fever. LASV genomes comprise two ambisense segments. The median cap-snatched length of LASV mRNAs was seven nucleotides (Figure S7C) in agreement with structural prediction of the LASV polymerase (Wallat et al., 2014). Sequence analysis indicates that these uAUGs could lead to the translation of N-terminal extensions of the GPC protein, as well as the formation of two overprinted new ORFs of 50 and 80 aa from the viral mRNAs encoding the nucleoprotein (N) and Z proteins of LASV (Figures 6C, 6D, and S7D). The proportions of uAUGs detected in cap-snatched sequences from IBV and LASV were dependent on viral segments and ranged between 4% and 12% (Table S7).
We also tested the hypothesis that translation of UTR-derived sequences could occur in other sNSVs by using minireplicon assays encoding a luciferase reporter to a member of the Phenuiviridae (Heartland banyangvirus; L segment UTRs). By mutating the canonical AUG, we identified low but readily detectable levels of upstream translation (Figure S7E).
Overall, these data suggest that generation of chimeric virus-host ORFs is a common feature of sNSVs. To quantify the potential pervasiveness of this mechanism and the likelihood of novel ORFs being conserved and functionalized into new genes, we analyzed RNA virus genomic sequences for their propensity to generate novel proteins by performing in silico analyses of their genomes. Although the exact levels of upstream translation will depend on a range of factors, including the intrinsic properties of viral polymerase complexes and, potentially, mechanisms that modulate upstream AUG translation, our results indicate the genomic potential of start-snatching (Figure 7). Given that viral mRNA and proteins are among the most highly expressed biotypes in infected cells, our data support the idea that all cap-snatching viruses could expand their proteome by start-snatching uAUGs from their hosts.

DISCUSSION

In this manuscript, we describe the existence of a mechanism employed by sNSVs to generate chimeric host-virus genes. This mechanism, “start-snatching,” involves the co-opting of start codons from host mRNA sequences to expand the viral proteome. This mechanism appears to be accessible to all sNSVs, including major human pathogens such as IAV and LASV. Start-snatching allows the translation of proteins from cryptic uvORFs, either as canonical viral proteins with N-terminal extensions, or as UFO proteins overprinted on the canonical viral ORF. In this study, we have identified examples of both types of uvORF in IAV infections. We have shown that translation can initiate on uAUGs in the host-derived sequence of viral mRNAs, and that this leads to the expression of chimeric host-virus proteins that can be detected in infected cells. In our hands, the ablation of uvORFs did not impact viral replication in vitro but had a moderate effect in vivo, which would be consistent with uvORFs encoding accessory proteins. We found that uvORFs can be recognized by the immune system, and we modeled the contribution of different evolutionary forces at play on uvORFs by characterizing viral-intrinsic and host-immune features that contribute to their evolution. Finally, we showed experimentally and by sequence analysis that the capability to express uvORFs through start-snatching is widespread among the sNSVs.
Figure 6. uvORFs Are Encoded by Cap-Snatching Viruses from Diverse Families

(A) The number of host-virus chimeric protein species potentially encoded by influenza B virus (IBV; B/Wisconsin/01/2010).
(B) Sequence analysis of the PA and NA segments of IBV, showing the translation of the 5′ UTR in all three reading frames and the predicted length distributions of
N-terminal extensions to the main ORF and of overprinted new ORFs, calculated from uAUG positions in DEFEND-seq data.
(C) The number of host-virus chimeric protein species potentially encoded by the ambisense genome of Lassa virus (LASV; Josiah strain), in both forward and
reverse senses. The ORF encoded by the segment is indicated in the square brackets.
(D) Sequence analysis of L and S segments of LASV in the indicated orientations, showing a schematic of genome organization, the translation of the 5′ UTR in all
three reading frames, and the predicted length distributions of overprinted new ORFs, calculated from uAUG positions in CAGE-seq data.
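The kind of reading-frame analysis summarized in panels B and D can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' pipeline: the chimeric mRNA is given as a single string, the uAUG and canonical AUG positions are known 0-based offsets, and the sequences below are invented toy examples. The function name `classify_uaug` and its return labels are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_uaug(mrna: str, uaug_pos: int, canonical_start: int) -> tuple[str, int]:
    """Classify the product of translation initiating at an upstream AUG.

    Returns (kind, length_in_codons), where kind is one of
    "N-terminal extension", "overprinted ORF", or "terminated upstream".
    Positions are 0-based offsets into the mRNA string.
    """
    seq = mrna.upper().replace("U", "T")
    if not 0 <= uaug_pos < canonical_start:
        raise ValueError("uAUG must lie upstream of the canonical start")
    if seq[uaug_pos:uaug_pos + 3] != "ATG":
        raise ValueError("no AUG at uaug_pos")
    if seq[canonical_start:canonical_start + 3] != "ATG":
        raise ValueError("no AUG at canonical_start")

    # Scan codon by codon from the uAUG until the first in-frame stop codon.
    stop_at = None
    for i in range(uaug_pos, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            stop_at = i
            break
    end = stop_at if stop_at is not None else len(seq)

    if end <= canonical_start:
        # Translation stops before reaching the canonical ORF.
        return ("terminated upstream", (end - uaug_pos) // 3)
    if (canonical_start - uaug_pos) % 3 == 0:
        # In frame with the canonical AUG: codons added to the N terminus.
        return ("N-terminal extension", (canonical_start - uaug_pos) // 3)
    # Out of frame: a new ORF overlapping the canonical ORF.
    return ("overprinted ORF", (end - uaug_pos) // 3)


# Toy sequences, invented for illustration only.
print(classify_uaug("GCATGGCAGGTATGAAATAA", 2, 11))
print(classify_uaug("GATGGCAGCATGGCTCAAGTAATGTAA", 1, 9))
print(classify_uaug("ATGTAAGGATGAAATAA", 0, 8))
```

Applying such a function to every uAUG position observed in sequencing reads would yield length distributions of the sort plotted in the figure, although the real analysis must also handle the variable host-derived leader of each read.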
Chimeric mRNAs Encode Novel Viral Proteins
We hypothesized that cap-snatching of sNSVs could generate ORFs that are encoded by two genomes (human and virus). Consistent with this, our analyses indicate that roughly 10% of IAV mRNAs contain host-derived uAUGs (Figures 1D and S1B). Furthermore, uvORFs are translated in at least three of the eight IAV genome segments, generating NP-ext, PB2-UFO, and PB1-UFO (Figures 2 and 4). Genetic evidence suggested that many other uvORF proteins are also likely to be expressed, although we did not detect them in our current study, potentially due to
the substantial sequence overlap of N-terminal extensions with canonical viral proteins and the short lengths of many overprinted ORFs. Overall, our analysis indicates that multiple families of viruses can generate chimeric RNAs and could produce proteins via this mechanism (Figures 6 and 7).

Figure 7. Start-Snatching Increases the Number of Potential ORFs in sNSVs
The increase in number of potential ORFs in cap-snatching viruses when uvORFs are considered. Black, number of canonical ORFs; yellow, number of new overprinted ORFs >30 aa; red, number of new extensions. LCMV, lymphocytic choriomeningitis virus; EMARV, European mountain ash ringspot-associated emaravirus.
[Bar chart. y axis: number of ORFs (0–25). x axis: Influenza A, Influenza B, Influenza C, Thogoto, LCMV, Lassa, Tacaribe, Bunyamwera, La Crosse, Herbert, Hantaan, EMARV, grouped by viral family (Orthomyxoviridae, Arenaviridae, Peribunyaviridae, Fimoviridae, Hantaviridae).]

Conservation and Function of uvORFs
Our analysis shows that most sNSV infections lead to expression of chimeric genes and uvORFs. Because they are host and virally encoded, it is therefore reasonable to ask who benefits from their expression. Key considerations in this regard, and based on our analysis are:
(1) Epitopes encoded in uvORFs are recognized by the adaptive immune system. MHC I presentation of uvORF-derived peptides poses the risk of an adaptive immune response against cells infected with sNSVs, analogous to the risks posed to IAV by the presentation of alternative reading frames (ARFs) and defective ribosomal products (DRiPs) (Dolan et al., 2010; Wei et al., 2019; Wei and Yewdell, 2017, 2019; Zanker et al., 2019). Indeed, the risks posed to the virus by the presentation of uvORFs are potentially even higher due to the high conservation of these sequences.
(2) Two uvORFs considered here (NP-ext and PB1-UFO) are both highly conserved across multiple strains of IAV. However, merely assessing conservation is insufficient, as other forms of selection also act on IAV genome sequences. In particular, genome packaging signals in the primary RNA sequence are concentrated in the terminal regions of each genome segment (Dadonaite et al., 2019; Gog et al., 2007; Hutchinson et al., 2010), resulting in a suppression of synonymous codon usage (Gog et al., 2007; Jagger et al., 2012). In overprinted regions, like PB1-UFO, there is also selective pressure conferred by the sequence encoding the canonical ORF. We observe both of these effects (Figures 5D–5F). Despite this, we also observe that (1) nonsense mutations do not occur frequently in the population (Figures S5D–S5H), and (2) missense mutations that do eventually accumulate in the PB1-UFO ORF tend to be those that change potentially immunogenic epitopes (Figures 5G and 5H).
This information, and the mere fact that full-length PB1-UFO is present in more than 75% of all IAV isolates and NP extensions are present in more than 99% of IAV isolates, suggests that multiple forces at the host-virus interface drive the virus to maintain the full-length proteins in their sequences. The relative contributions of distinct evolutionary forces in maintaining these proteins are not yet clear.
An important point to be made about uvORFs is that conservation and/or expression does not equate to functionality. While some uvORFs might have gained functions, we predict others will exist as afunctional, evolutionary spandrels. Such uvORFs are stuck in a place where they have to be made but suffer too many external constraints to productively sample evolutionary space for functionalization. All things considered, we can fairly surmise that any cost the virus might incur through uvORFs being made is outweighed by the fitness benefits of maintaining a genetic architecture that allows for their expression. The awareness of uvORF existence, and their pervasiveness in the viral world, is thus critical for our understanding of viral biology, viral evolution, and host immune surveillance.

Gene Origination through Overprinting and the Misnaming of "UTRs"
Genetic overprinting typically occurs when a pre-existing reading frame acquires mutations that enable translation in alternative reading frames while maintaining the function of the ancestral frame. This is an important mechanism for the creation of new proteins, especially in the context of compact genomes (viral, prokaryotic, and eukaryotic organelles) with little coding capacity (Keese and Gibbs, 1992; Kovacs et al., 2010; Poulin et al., 2003; Sabath et al., 2012).
While genetic overprinting could be selectively advantageous for some organisms, the evolution of overprinted genes is problematic. Any evolution of the overprinted ORF will be constrained by the effects of mutations in the underlying ORF. In addition, established overlapping ORFs typically have dedicated mechanisms for their expression, such as ribosomal scanning or frameshifting, which allow for efficient and regulated expression patterns. Exploring the limited evolutionary space that satisfies all of these constraints presumably requires the overprinted gene to provide a strong selective advantage.
Start-snatching exposes the 5′ coding regions of sNSV genomes to low levels of non-specific out-of-frame translation. This “genetic feature” could facilitate the evolution of novel genes through genetic overprinting, without having to evolve a dedicated method to express an overprinted ORF before that ORF could provide a selective advantage.
A similar argument applies to the evolution of alternative upstream translation mechanisms for N-terminally extended proteins: if an N-terminal extension provided by start-snatching was selectively advantageous, the virus could evolve to directly
encode a uAUG in the UTR and make the generation of extended protein host-independent and heritable. In this respect, it is interesting to note that some recent strains of IAV have evolved to encode a uAUG in the UTR of NP that allows it to express an N-terminally extended protein that can modulate virulence (Wise et al., 2019). In essence, start-snatching might simply be a way to increase the chances of UTR translation by outsourcing uAUG to non-viral genomic material.
The translation of 5′ UTRs (which implies their misnaming) occurs frequently in eukaryotic genes. uORFs are, in fact, pervasively expressed, with some functioning as short biologically active polypeptides (Andrews and Rothnagel, 2014; Calvo et al., 2009; Combier et al., 2008; Sendoel et al., 2017; Starck et al., 2016; Wang and Rothnagel, 2004; Wen et al., 2009). uORFs are abundantly expressed in cancer cells (Sendoel et al., 2017) and activated T cells (Starck et al., 2016). Overall, future work will be needed to redefine what, in reality, a gene is.

Lessons for Other Viruses
The capacity of a pathogen to overcome host barriers and establish infection is based on the expression of pathogen-derived proteins. To understand how a pathogen antagonizes the host and establishes infection we need to have a clear understanding of what proteins a pathogen encodes, how they function, and the manner in which they contribute to virulence. The current dogma about many life-threatening pathogens is that they encode a small repertoire of proteins because of their limited genome size. RNA viruses, such as IAV, are a prime example of this. Here, we have shown that IAV, IBV, LASV, and likely most, if not all, other sNSVs, can use host RNA to expand their genetic repertoire. Similar to novel human genes which originated from other mechanisms and contributed to organismal evolution (Kaessmann, 2010; Ohno, 1970), we expect chimeric genes to shape (and have shaped) host-virus relationships.

STAR+METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead Contact
  ◦ Materials Availability
  ◦ Data and Code Availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ◦ Cells cultures
  ◦ Mice
  ◦ Virus Strains
• METHOD DETAILS
  ◦ Growth kinetics of Viruses in Cell Culture
  ◦ Quantification of IAV titers by Plaque Assays
  ◦ Ribosome profiling and analysis
  ◦ Mass Spectrometry experiments (in infected cell lysates)
  ◦ Mass Spectrometry experiments (in virions)
  ◦ DEFEND sequencing of IBV infected cells
  ◦ Preparation of CAGE libraries from LASV infected cells
  ◦ Mouse Infection studies
  ◦ Preparation of RNA sequencing Libraries (Infected Mice)
  ◦ SIINFEKL expression analysis
  ◦ Minireplicon Assays
• QUANTIFICATION AND STATISTICAL ANALYSES
  ◦ Mouse Infection Studies
  ◦ Quantitative qPCR assays
  ◦ CAGE sequencing of WSN IAV virus infected cells
  ◦ Ribosome sequencing analyses
  ◦ RNA sequencing Analyses
  ◦ LASV CAGE sequencing Analyses
  ◦ Sequence Randomization Model for PB1-UFO length
  ◦ Frequency Propagator Ratio Analysis
  ◦ Epitope predictions for PB1-UFO

SUPPLEMENTAL INFORMATION

cell.2020.05.035.

ACKNOWLEDGMENTS

We thank the Genomics and Mouse facility at Icahn School of Medicine at Mount Sinai, the Global Health and Emerging Pathogens Institute (GHEPI) at Mount Sinai, and the entire Marazzi Lab team. The authors would also like to thank Ervin Fodor, University of Oxford for support and critical comments on the project; Svenja Hester, Benjamin Thomas, and Shabaz Mohammed of the Advanced Proteomics Facility, University of Oxford for proteomics; Elly Gaunt, University of Edinburgh and Michael Goodin, University of Kentucky, for critical reading of the manuscript; and staff within the Institute of Infection, Immunity, and Inflammation Flow Cytometry Facility, the Central Research Facility at the University of Glasgow, Thomas Purnell of the Institute of Infection, Immunity and Inflammation, University of Glasgow and Dimitris Athineos of the Beatson Institute for technical assistance and discussion. I.M. is supported by the Burroughs Wellcome Fund (United States; 1017892) and by the Chan Zuckerberg Initiative (United States; 2018-191895). I.M. and H.v.B. are supported by the NIH (United States; R01AI113186). A.G.-S. and I.M. are supported by the NIH (U19AI135972 FLUOMICS). E.H., E.S., and Q.G. are supported by an MRC (United Kingdom) Career Development Award (MR/N008618/1), and E.H. carried out proteomics work when funded by an MRC Programme Grant to Prof. Ervin Fodor, University of Oxford (MR/K000241/1). V.R. and I.v.K. were supported by a Wellcome Trust (United Kingdom) Senior Investigator award (099220/Z/12/Z) awarded to Prof. Richard M. Elliott, University of Glasgow. R.G. and Q.G. were supported by the MRC (MC_UU_12014/12). J.K.B. was supported by a Wellcome Trust Intermediate Clinical Fellowship (103258/Z/13/Z), a Wellcome-Beit Prize (103258/Z/13/A), and the UK Intensive Care Society. J.K.B. and S.C. acknowledge the BBSRC (United Kingdom) Institute Strategic Programme Grant to the Roslin Institute. B.W. was supported by a SHIELD (MR/N02995X/1) Edinburgh Global Research Scholarship. A.E.F. was supported by the Wellcome Trust (106207) and European Research Council (European Union; 646891). I.B., P.D., and H.M.W. were supported by an MRC project grant (MR/M011747/1). P.D. was supported by the BBSRC Institute Strategic Programme (BB/J004324/1 and BB/P013740/1). J.D.J. was funded by a Wellcome Trust PhD scholarship. M. Alenquer and M.J.A. were supported by the FCT (Portugal) award PTDC/BIA-CEL/32211/2017 and IF/00899/2013, respectively. M.K.M. was supported by a Wellcome Investigator Award (210703/Z/18/Z).

AUTHOR CONTRIBUTIONS

Conceptualization, I.M., E.H., A.G.-S., J.W.Y., and E.S.; Methodology, I.M., J.W.Y., Y.M., M.A., G.W., and J.S.Y.H.; Formal Analysis, Y.M., M.A., G.W., J.S.Y.H., N.Z., J.N., N.M., J.G., J.W., J.J., M.C., Z.P., H.v.B., M.L., E.R.M.,
and A.E.F.; Investigation, J.S.Y.H., M.A., Y.M., E.S., G.W., C.M.-R., M.A., V.R., L. Campisi, S.Z., M.C., Y.F., S.C., A.M.D., J.G., R.G., R.S., Q.G., N.I., L. Chung, N.Z., J.D.J., I.v.K., Z.Z., N.M., L.M., J.N., Z.P., V.R., R.K., B.R., B.W., J.W., H.W., J.J., A.V., M.J.A., E.R.M., C.B., I.B., P.D., M.L., A.E.F., N.K., B.D.G., M.K.M., and H.v.B.; Data Curation, M.A., Y.M., H.v.B., R.G., and J.J.; Resources, Y.M., M.A., J.J., M.C., H.v.B., E.R.M., A.G.-S., S.P., and C.H.; Writing – Original Draft, I.M. and E.H.; Writing – Review & Editing, E.H., I.M., J.W.Y., Y.M., J.S.Y.H., M.A., E.S., A.M.D., H.v.B., M.L., B.D.G., E.R.M., A.G.-S., P.D., and A.E.F.; Visualization, E.H., Y.M., M.A., G.W., J.H., J.J., M.C., Z.P., E.R.M., and Z.Z.; Funding Acquisition, A.G.-S., I.M., J.K.B., A.E.F., I.B., P.D., M.J.A., E.H., and M.K.M.; Project Administration, I.M. and E.H.; Supervision, E.H. and I.M.

DECLARATION OF INTERESTS

The authors declare no competing interests.

Received: March 29, 2019
Revised: February 26, 2020
Accepted: May 18, 2020
Published: June 18, 2020

REFERENCES

Andreatta, M., and Nielsen, M. (2016). Gapped sequence alignment using artificial neural networks: application to the MHC class I system. Bioinformatics 32, 511–517.
Andreev, D.E., O’Connor, P.B., Fahey, C., Kenny, E.M., Terenin, I.M., Dmitriev, S.E., Cormican, P., Morris, D.W., Shatsky, I.N., and Baranov, P.V. (2015). Translation of 5′ leaders is pervasive in genes resistant to eIF2 repression. eLife 4, e03971.
Andrews, S.J., and Rothnagel, J.A. (2014). Emerging evidence for functional peptides encoded by short open reading frames. Nat. Rev. Genet. 15, 193–204.
Decroly, E., Ferron, F., Lescar, J., and Canard, B. (2011). Conventional and unconventional mechanisms for capping viral mRNA. Nat. Rev. Microbiol. 10, 51–65.
Dias, A., Bouvier, D., Crépin, T., McCarthy, A.A., Hart, D.J., Baudin, F., Cusack, S., and Ruigrok, R.W. (2009). The cap-snatching endonuclease of influenza virus polymerase resides in the PA subunit. Nature 458, 914–918.
Dikstein, R. (2012). Transcription and translation in a package deal: the TISU paradigm. Gene 491, 1–4.
Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21.
Dolan, B.P., Li, L., Takeda, K., Bennink, J.R., and Yewdell, J.W. (2010). Defective ribosomal products are the major source of antigenic peptides endogenously generated from influenza A virus neuraminidase. J. Immunol. 184, 1419–1424.
Edgar, R.C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797.
Elfakess, R., and Dikstein, R. (2008). A translation initiation element specific to mRNAs with very short 5′UTR that also regulates transcription. PLoS ONE 3, e3094.
Fodor, E. (2013). The RNA polymerase of influenza A virus: mechanisms of viral transcription and replication. Acta Virol. 57, 113–122.
Fodor, E., Devenish, L., Engelhardt, O.G., Palese, P., Brownlee, G.G., and García-Sastre, A. (1999). Rescue of influenza A virus from recombinant DNA. J. Virol. 73, 9679–9682.
Forrest, A.R., Kawaji, H., Rehli, M., Baillie, J.K., de Hoon, M.J., Haberle, V., Lassmann, T., Kulakovskiy, I.V., Lizio, M., Itoh, M., et al.; FANTOM Consortium and the RIKEN PMI and CLST (DGT) (2014). A promoter-level mammalian expression atlas. Nature 507, 462–470.
Gaush, C.R., and Smith, T.F. (1968). Replication and plaque assay of influenza virus in an established line of canine kidney cells. Appl. Microbiol. 16, 588–594.
Gog, J.R., Afonso, Edos.S., Dalton, R.M., Leclercq, I., Tiley, L., Elton, D., von Kirchbach, J.C., Naffakh, N., Escriou, N., and Digard, P. (2007). Codon conservation in the influenza A virus genome defines RNA packaging signals. Nucleic Acids Res. 35, 1897–1907.
Bottermann, M., Foss, S., van Tienen, L.M., Vaysburd, M., Cruickshank, J., O’Connell, K., Clark, J., Mayes, K., Higginson, K., Hirst, J.C., et al. (2018). TRIM21 mediates antibody inhibition of adenovirus-based gene delivery and
Gruber, A.R., Lorenz, R., Bernhart, S.H., Neuböck, R., and Hofacker, I.L.
vaccination. Proc. Natl. Acad. Sci. USA 115, 10440–10445. (2008). The Vienna RNA websuite. Nucleic Acids Res. 36, W70-4.
Buchholz, U.J., Finke, S., and Conzelmann, K.K. (1999). Generation of bovine Gu, W., Gallagher, G.R., Dai, W., Liu, P., Li, R., Trombly, M.I., Gammon, D.B.,
respiratory syncytial virus (BRSV) from cDNA: BRSV NS2 is not essential for vi- Mello, C.C., Wang, J.P., and Finberg, R.W. (2015). Influenza A virus preferen-
rus replication in tissue culture, and the human RSV leader region acts as a tially snatches noncoding RNA caps. RNA 21, 2067–2075.
functional BRSV genome promoter. J. Virol. 73, 251–259. Haimov, O., Sinvani, H., Martin, F., Ulitsky, I., Emmanuel, R., Tamarkin-Ben-
Calvo, S.E., Pagliarini, D.J., and Mootha, V.K. (2009). Upstream open reading Harush, A., Vardy, A., and Dikstein, R. (2017). Efficient and Accurate Transla-
frames cause widespread reduction of protein expression and are polymor- tion Initiation Directed by TISU Involves RPS3 and RPS10e Binding and Differ-
phic among humans. Proc. Natl. Acad. Sci. USA 106, 7507–7512. ential Eukaryotic Initiation Factor 1A Regulation. Mol. Cell. Biol. 37, e00150-17.
Clohisey, S., Parkinson, N., Wang, B., Bertin, N., Wise, H., Tomoiu, A., Sum- Heaton, N.S., Moshkina, N., Fenouil, R., Gardner, T.J., Aguirre, S., Shah, P.S.,
mers, K.M., Hendry, R.W., Carninci, P., Forrest, A.R.R., et al.; FANTOM5 Con- Zhao, N., Manganaro, L., Hultquist, J.F., Noel, J., et al. (2016). Targeting Viral
sortium (2020). Comprehensive characterisation of transcriptional activity dur- Proteostasis Limits Influenza Virus, HIV, and Dengue Virus Infection. Immunity
ing influenza A virus infection reveals biases in cap-snatching of host RNA 44, 46–58.
sequences. J. Virol. 94, e01720-19. Hoffmann, E., Neumann, G., Kawaoka, Y., Hobom, G., and Webster, R.G.
Combier, J.P., de Billy, F., Gamas, P., Niebel, A., and Rivas, S. (2008). Trans- (2000). A DNA transfection system for generation of influenza A virus from eight
regulation of the expression of the transcription factor MtHAP2-1 by a uORF plasmids. Proc. Natl. Acad. Sci. USA 97, 6108–6113.
controls root nodule development. Genes Dev. 22, 1549–1559. Hogquist, K.A., Jameson, S.C., Heath, W.R., Howard, J.L., Bevan, M.J., and
Cox, J., and Mann, M. (2008). MaxQuant enables high peptide identification Carbone, F.R. (1994). T cell receptor antagonist peptides induce positive se-
rates, individualized p.p.b.-range mass accuracies and proteome-wide pro- lection. Cell 76, 17–27.
tein quantification. Nat. Biotechnol. 26, 1367–1372. Hutchinson, E.C., and Stegmann, M. (2018). Purification and Proteomics of
Dadonaite, B., Gilbertson, B., Knight, M.L., Trifkovic, S., Rockman, S., Laeder- Influenza Virions. Methods Mol. Biol. 1836, 89–120.
ach, A., Brown, L.E., Fodor, E., and Bauer, D.L.V. (2019). The structure of the Hutchinson, E.C., Curran, M.D., Read, E.K., Gog, J.R., and Digard, P. (2008).
influenza A virus genome. Nat. Microbiol. 4, 1781–1789. Mutational analysis of cis-acting RNA signals in segment 7 of influenza A virus.
de Wit, E., Spronken, M.I., Bestebroer, T.M., Rimmelzwaan, G.F., Osterhaus, J. Virol. 82, 11869–11879.
A.D., and Fouchier, R.A. (2004). Efficient generation and growth of influenza vi- Hutchinson, E.C., von Kirchbach, J.C., Gog, J.R., and Digard, P. (2010).
rus A/PR/8/34 from eight cDNA fragments. Virus Res. 103, 155–161. Genome packaging in influenza A virus. J. Gen. Virol. 91, 313–328.
Cell 181, 1502–1517, June 25, 2020 1515

Hutchinson, E.C., Charles, P.D., Hester, S.S., Thomas, B., Trudgian, D., Martínez-Alonso, M., and Fodor, E. (2014). Conserved and host-specific features of influenza virion architecture. Nat. Commun. 5, 4816.
Jagger, B.W., Wise, H.M., Kash, J.C., Walters, K.A., Wills, N.M., Xiao, Y.L., Dunfee, R.L., Schwartzman, L.M., Ozinsky, A., Bell, G.L., et al. (2012). An overlapping protein-coding region in influenza A virus segment 3 modulates the host response. Science 337, 199–204.
Johnstone, T.G., Bazzini, A.A., and Giraldez, A.J. (2016). Upstream ORFs are prevalent translational repressors in vertebrates. EMBO J. 35, 706–723.
Kaessmann, H. (2010). Origins, evolution, and phenotypic impact of new genes. Genome Res. 20, 1313–1326.
Keese, P.K., and Gibbs, A. (1992). Origins of genes: ‘‘big bang’’ or continuous creation? Proc. Natl. Acad. Sci. USA 89, 9489–9493.
Kim, D., Langmead, B., and Salzberg, S.L. (2015). HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360.
Kochetov, A.V., Ahmad, S., Ivanisenko, V., Volkova, O.A., Kolchanov, N.A., and Sarai, A. (2008). uORFs, reinitiation and alternative translation start sites in human mRNAs. FEBS Lett. 582, 1293–1297.
Koppstein, D., Ashour, J., and Bartel, D.P. (2015). Sequencing the cap-snatching repertoire of H1N1 influenza provides insight into the mechanism of viral transcription initiation. Nucleic Acids Res. 43, 5052–5064.
Kovacs, E., Tompa, P., Liliom, K., and Kalmar, L. (2010). Dual coding in alternative reading frames correlates with intrinsic protein disorder. Proc. Natl. Acad. Sci. USA 107, 5429–5434.
Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25.
Leppek, K., Das, R., and Barna, M. (2018). Functional 5′ UTR mRNA structures in eukaryotic translation regulation and how to find them. Nat. Rev. Mol. Cell Biol. 19, 158–174.
Liao, Y., Smyth, G.K., and Shi, W. (2014). featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930.
Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550.
Luksza, M., and Lässig, M. (2014). A predictive fitness model for influenza. Nature 507, 57–61.
Machkovech, H.M., Bloom, J.D., and Subramaniam, A.R. (2019). Comprehensive profiling of translation initiation in influenza virus infected cells. PLoS Pathog. 15, e1007518.
Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12.
Masella, A.P., Bartram, A.K., Truszkowski, J.M., Brown, D.G., and Neufeld, J.D. (2012). PANDAseq: paired-end assembler for illumina sequences. BMC Bioinformatics 13, 31.
McGlincy, N.J., and Ingolia, N.T. (2017). Transcriptome-wide measurement of translation by ribosome profiling. Methods 126, 112–129.
Ohno, S. (1970). Evolution by Gene Duplication (Springer).
Plotch, S.J., Bouloy, M., Ulmanen, I., and Krug, R.M. (1981). A unique cap(m7GpppXm)-dependent influenza virion endonuclease cleaves capped RNAs to generate the primers that initiate viral RNA transcription. Cell 23, 847–858.
Porgador, A., Yewdell, J.W., Deng, Y., Bennink, J.R., and Germain, R.N. (1997). Localization, quantitation, and in situ detection of specific peptide-MHC class I complexes using a monoclonal antibody. Immunity 6, 715–726.
Poulin, F., Brueschke, A., and Sonenberg, N. (2003). Gene fusion and overlapping reading frames in the mammalian genes for 4E-BP3 and MASK. J. Biol. Chem. 278, 52290–52297.
Price, M.N., Dehal, P.S., and Arkin, A.P. (2010). FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490.
Reich, S., Guilligay, D., Pflug, A., Malet, H., Berger, I., Crépin, T., Hart, D., Lunardi, T., Nanao, M., Ruigrok, R.W., and Cusack, S. (2014). Structural insight into cap-snatching and RNA synthesis by influenza polymerase. Nature 516, 361–366.
Rezelj, V.V., Mottram, T.J., Hughes, J., Elliott, R.M., Kohl, A., and Brennan, B. (2019). M Segment-Based Minigenomes and Virus-Like Particle Assays as an Approach To Assess the Potential of Tick-Borne Phlebovirus Genome Reassortment. J. Virol. 93, e02068-18.
Rialdi, A., Hultquist, J., Jimenez-Morales, D., Peralta, Z., Campisi, L., Fenouil, R., Moshkina, N., Wang, Z.Z., Laffleur, B., Kaake, R.M., et al. (2017). The RNA Exosome Syncs IAV-RNAPII Transcription to Promote Viral Ribogenesis and Infectivity. Cell 169, 679–692.
Rosenfeld, J., Capdevielle, J., Guillemot, J.C., and Ferrara, P. (1992). In-gel digestion of proteins for internal sequence analysis after one- or two-dimensional gel electrophoresis. Anal. Biochem. 203, 173–179.
Sabath, N., Wagner, A., and Karlin, D. (2012). Evolution of viral proteins originated de novo by overprinting. Mol. Biol. Evol. 29, 3767–3780.
Sagulenko, P., Puller, V., and Neher, R.A. (2018). TreeTime: Maximum-likelihood phylodynamic analysis. Virus Evol. 4, vex042.
Schwanhäusser, B., Busse, D., Li, N., Dittmar, G., Schuchhardt, J., Wolf, J., Chen, W., and Selbach, M. (2011). Global quantification of mammalian gene expression control. Nature 473, 337–342.
Sendoel, A., Dunn, J.G., Rodriguez, E.H., Naik, S., Gomez, N.C., Hurwitz, B., Levorse, J., Dill, B.D., Schramek, D., Molina, H., et al. (2017). Translation from unconventional 5′ start sites drives tumour initiation. Nature 541, 494–499.
Shu, Y., and McCauley, J. (2017). GISAID: Global initiative on sharing all influenza data - from vision to reality. Euro Surveill. 22, 30494.
Sikora, D., Rocheleau, L., Brown, E.G., and Pelchat, M. (2017). Influenza A virus cap-snatches host RNAs based on their abundance early after infection. Virology 509, 167–177.
Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313.
Starck, S.R., Tsai, J.C., Chen, K., Shodiya, M., Wang, L., Yahiro, K., Martins-Green, M., Shastri, N., and Walter, P. (2016). Translation from the 5′ untranslated region shapes the integrated stress response. Science 351, aad3867.
Strelkowa, N., and Lässig, M. (2012). Clonal interference in the evolution of influenza. Genetics 192, 671–682.
Stuller, K.A., Cush, S.S., and Flaño, E. (2010). Persistent gamma-herpesvirus infection induces a CD4 T cell response containing functionally distinct effector populations. J. Immunol. 184, 3850–3856.
Takahashi, H., Lassmann, T., Murata, M., and Carninci, P. (2012). 5′ end-centered expression profiling using cap-analysis gene expression and next-generation sequencing. Nat. Protoc. 7, 542–561.
Thulasi Raman, S.N., and Zhou, Y. (2016). Networks of Host Factors that Interact with NS1 Protein of Influenza A Virus. Front. Microbiol. 7, 654.
Tilston-Lunel, N.L., Shi, X., Elliott, R.M., and Acrani, G.O. (2017). The Potential for Reassortment between Oropouche and Schmallenberg Orthobunyaviruses. Viruses 9, 220.
Tyanova, S., Temu, T., and Cox, J. (2016). The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat. Protoc. 11, 2301–2319.
Wallat, G.D., Huang, Q., Wang, W., Dong, H., Ly, H., Liang, Y., and Dong, C. (2014). High-resolution structure of the N-terminal endonuclease domain of the Lassa virus L polymerase in complex with magnesium ions. PLoS ONE 9, e87577.
Wang, X.Q., and Rothnagel, J.A. (2004). 5′-untranslated regions with multiple upstream AUG codons can support low-level translation via leaky scanning and reinitiation. Nucleic Acids Res. 32, 1382–1391.
Wei, J., and Yewdell, J.W. (2017). Autoimmune T cell recognition of alternative-reading-frame-encoded peptides. Nat. Med. 23, 409–410.
Wei, J., and Yewdell, J.W. (2019). Flu DRiPs in MHC Class I Immunosurveillance. Virol. Sin. 34, 162–167.
Wei, J., Kishton, R.J., Angel, M., Conn, C.S., Dalla-Venezia, N., Marcel, V., Vincent, A., Catez, F., Ferre, S., Ayadi, L., et al. (2019). Ribosomal Proteins Regulate MHC Class I Peptide Generation for Immunosurveillance. Mol. Cell 73, 1162–1173.
Wen, Y., Liu, Y., Xu, Y., Zhao, Y., Hua, R., Wang, K., Sun, M., Li, Y., Yang, S., Zhang, X.J., et al. (2009). Loss-of-function mutations of an inhibitory upstream ORF in the human hairless transcript cause Marie Unna hereditary hypotrichosis. Nat. Genet. 41, 228–233.
Westerhof, L.M., McGuire, K., MacLellan, L., Flynn, A., Gray, J.I., Thomas, M., Goodyear, C.S., and MacLeod, M.K. (2019). Multifunctional cytokine production reveals functional superiority of memory CD4 T cells. Eur. J. Immunol. 49, 2019–2029.
Wise, H.M., Gaunt, E., Ping, J., Holzer, B., Jasim, S., Lycett, S.J., Murphy, L., Livesey, A., Brown, R., Smith, N., et al. (2019). An alternative AUG codon that produces an N-terminally extended form of the influenza A virus NP is a virulence factor for a swine-derived virus. bioRxiv. https://doi.org/10.1101/738427.
Ye, J., Sorrell, E.M., Cai, Y., Shao, H., Xu, K., Pena, L., Hickman, D., Song, H., Angel, M., Medina, R.A., et al. (2010). Variations in the hemagglutinin of the 2009 H1N1 pandemic virus: potential for strains with altered virulence phenotype? PLoS Pathog. 6, e1001145.
Young, S.K., and Wek, R.C. (2016). Upstream Open Reading Frames Differentially Regulate Gene-specific Translation in the Integrated Stress Response. J. Biol. Chem. 291, 16927–16935.
Zanker, D.J., Oveissi, S., Tscharke, D.C., Duan, M., Wan, S., Zhang, X., Xiao, K., Mifsud, N.A., Gibbs, J., Izzard, L., et al. (2019). Influenza A Virus Infection Induces Viral and Cellular Defective Ribosomal Products Encoded by Alternative Reading Frames. J. Immunol. 202, 3370–3380.
Zhang, Y., Aevermann, B.D., Anderson, T.K., Burke, D.F., Dauphin, G., Gu, Z., He, S., Kumar, S., Larsen, C.N., Lee, A.J., et al. (2017). Influenza Research Database: An integrated bioinformatics resource for influenza virus research. Nucleic Acids Res. 45 (D1), D466–D474.
Zhou, Y., Zhou, B., Pache, L., Chang, M., Khodabakhshi, A.H., Tanaseichuk, O., Benner, C., and Chanda, S.K. (2019). Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 10, 1523.
STAR+METHODS
KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
anti-CD25-APC (PC61.5) | Thermo Fisher | Cat #17-0251-82; RRID: AB_469366
anti-CD8-Alexa Fluor 488 (53-6.7) | Thermo Fisher | Cat #53-0081-82; RRID: AB_469897
anti-Vα2-E450 (B20.1) | Thermo Fisher | Cat #48-5812-82; RRID: AB_10804752
anti-Vβ5-PE (MR9-4) | BD Biosciences | Cat # 553190; RRID: AB_394698
anti-CD44-PerCp-Cyanine5.5 (IM7) | Thermo Fisher | Cat #45-0441-82; RRID: AB_925746
anti-CD69-PE-Cy7 (H1.2F3) | Thermo Fisher | Cat #25-0691-82; RRID: AB_469637
Anti-NP antibody | BioRad | Cat # MCA400; RRID: AB_2151884
m-IgGκ BP-HRP | Santa Cruz | Cat # sc-516102; RRID: AB_2687626
Bacterial and Virus Strains
A/Puerto Rico/8/34 (H1N1) (PR8) | de Wit et al., 2004 | N/A
PR8; PB1-UFOD | This study | N/A
PR8; PB1-UFOSYN | This study | N/A
PR8; PB1-EXT+ | This study | N/A
PR8; NP-EXTD | This study | N/A
PR8; NP-EXTSYN | This study | N/A
A/California/04/09(H1N1) (Cal09) | Ye et al., 2010 | N/A
Cal09; PB1-UFOD | This study | N/A
Cal09; PB1-UFOSYN | This study | N/A
Cal09; PB2-UFOD | This study | N/A
Cal09; PB2-UFOSYN | This study | N/A
Cal09; PA-UFOD | This study | N/A
Cal09; PB1-UFOSYN | This study | N/A
Cal09; HA-UFOD | This study | N/A
Cal09; HA-UFOSYN | This study | N/A
A/WSN/33(H1N1) (WSN) | Hoffmann et al., 2000 | N/A
WSN; PB1-UFOD | This study | N/A
WSN; PB1-UFOSYN | This study | N/A
WSN; PB2-UFOD | This study | N/A
WSN; PB2-UFOSYN | This study | N/A
WSN; PA-UFOD | This study | N/A
WSN; PA-UFOSYN | This study | N/A
WSN; HA-UFOD | This study | N/A
WSN; HA-UFOSYN | This study | N/A
PB1-UFO(SIIN) | This study | N/A
NS-UFO(SIIN) | This study | N/A
PB1-SIIN | Wei et al., 2019 | N/A
NA-SIIN | MRC-University of Glasgow Centre for Virus Research; As Bottermann et al., 2018 | N/A
A/Udorn/72(H3N2) | The Roslin Institute, University of Edinburgh; As Clohisey et al., 2020 | N/A
LASV (Josiah strain) | Department of Pathology, the University of Texas Medical Branch | N/A
B/Wisconsin/01/2010 | Department of Microbiology, Icahn School of Medicine at Mount Sinai | N/A
Biological Samples
Primary CD14+ human monocytes | The Roslin Institute, University of Edinburgh; As Clohisey et al., 2020 | N/A
Chemicals, Peptides, and Recombinant Proteins
Dulbecco’s Modified Eagle Medium (DMEM) | Thermo Fisher / GIBCO | Cat#11965175
Minimum Essential Medium (MEM) | Sigma-Aldrich | Cat# 51411C
Purified Agar | Oxoid | Cat #: LP0028
Trypsin from bovine pancreas, TPCK-treated | Sigma-Aldrich | Cat #: T1426-500MG
Protease Inhibitor Cocktail Set III, EDTA-Free - Calbiochem | EMD Millipore | Cat# 539134-10ML
Trypsin | Sigma-Aldrich | Cat# T8802-100MG
TRIzol Reagent | Thermo Fisher Scientific | Cat#15596018
SimplyBlue™ SafeStain | Thermo Fisher Scientific | Cat# LC6060
NuPage 4-12% BT Gel 1.5mm 12w 10 Per Box | Thermo Fisher Scientific | Cat# NP0322BOX
MG-132 | Sigma-Aldrich | Cat# M7449-1ML
NuPAGE MOPS SDS Running Buffer (20X) | Thermo Fisher Scientific | Cat# NP0001
Ovalbumin (257-264) chicken | Sigma-Aldrich | Cat# S7951
LT-1 transfection reagent | Mirus | Cat# MIR 2304
recombinant human colony-stimulating factor 1 | A gift from Chiron, Emeryville, CA, US; As Clohisey et al., 2020 | N/A
Lys-C lysyl endopeptidase | Wako | 121-05063
Harringtonine | LKT biochemicals | H0169
Cycloheximide | Sigma-Aldrich | Cat# C7698
Sequencing grade modified trypsin | Promega | 9PIV511
Critical Commercial Assays
Dual Luciferase Reporter Assay System | Promega | Cat#E1910
CD8a+ T Cell Isolation Kit | Miltenyi Biotec | Cat#130-104-075
EasySep Mouse CD8+ T Cell Isolation Kit | StemCell Technologies | Cat# 19853
PureLink RNA Mini Kit 250 Reactions | Thermo Fisher Scientific | Cat# 12183025
PureLink DNase Set | Thermo Fisher Scientific | Cat# 12185010
miRNeasy Mini Kit | QIAGEN | Cat# 217004
Q5 site directed mutagenesis kit | NEB | Cat# E0554S
Ribo-Zero Gold rRNA Removal Kit (Human/Mouse/Rat) | Illumina | Cat# MRZG12324
SMARTer total RNA Pico kit | Clontech | Cat# 634411
TruSeq Stranded Total RNA Library Prep Kit | Illumina | Cat # 20020596
Deposited Data
CAGE sequencing of WSN IAV virus infected cells | Clohisey et al., 2020 | https://fantom.gsc.riken.jp/5/data/
DEFEND seq of PR8 IAV infected A549 cells | Rialdi et al., 2017 | GEO: GSE96677
DEFEND seq of IBV infected A549 cells | This study | GEO: GSE85474
Ribosome Profiling of PR8 IAV infected cells | This study | GEO: GSE148245
CAGE sequencing of LASV infected vero cells | This study | GEO: GSE148122
RNA seq of PR8; PB1-UFOD and PR8;PB1-UFOSYN infected mouse lungs | This study | GEO: GSE128519
GISAID Database | Shu and McCauley, 2017 | https://www.gisaid.org
NCBI Influenza Virus Database | Zhang et al., 2017 | http://www.ncbi.nlm.nih.gov/genomes/FLU/Database
Mass spectrometry Data: PR8 IAV infected A549 and 293T cells | This study | Table S2A
Mass spectrometry Data: WSN IAV Virions | Hutchinson et al., 2014 | https://massive.ucsd.edu/ProteoSAFe/datasets.jsp using the MassIVE ID MSV000078740; Table S2B
Mass spectrometry Data: Immunoprecipitation of PR8 IAV RdRp | Heaton et al., 2016 | Table S2C
Experimental Models: Cell Lines
Dog: MDCK | ATCC | CCL-34; RRID: CVCL_0422
Human: A549 | ATCC | CCL-185; RRID: CVCL_0023
Human: 293T | ATCC | CRL-3216; RRID: CVCL_0063
Cow: MDBK | Sigma | 90050801-1VL; RRID: CVCL_0421
Monkey: Vero | ATCC | CCL-81; RRID: CVCL_0059
Mouse: DC2.4 | Sigma-Aldrich | Cat# SCC142; RRID: CVCL_J409
Hamster: BSR-T7/5 | Buchholz et al., 1999 | N/A
Experimental Models: Organisms/Strains
Mouse: BALB/cJ (6-8 weeks) | Jackson Laboratories | 00651
Chicken: Specific Pathogen Free Fertile Eggs | Charles River | Cat #: 10100329
Mouse: OT-I: C57BL/6-Tg(TcraTcrb)1100Mjb/J | The Jackson Laboratory / in-house; Hogquist et al., 1994 | Cat# 003831; RRID: IMSR_JAX:003831
Mouse: C57BL/6 (10-14 weeks) | Envigo | N/A
Oligonucleotides
DEFEND-seq cDNA synthesis–3’ primer | Rialdi et al., 2017 | N/A
qPCR Primers | This Study | Table S4B
Recombinant DNA
PR8 pDUAL plasmids | A kind gift of Prof Ron Fouchier; de Wit et al., 2004 | N/A
Cal09 pDP2002 plasmids | A kind gift of Prof Daniel Perez; Ye et al., 2010 | N/A
pT7HRTMRen(-) | MRC-University of Glasgow Centre for Virus Research; Rezelj et al., 2019 | N/A
pTMHRTN | MRC-University of Glasgow Centre for Virus Research | N/A
pTMHRTL | MRC-University of Glasgow Centre for Virus Research | N/A
pTM1-FFLuc | MRC-University of Glasgow Centre for Virus Research | N/A
pRL-TK | Promega | E2241
Software and Algorithms
DESeq2 | Love et al., 2014 | https://bioconductor.org/packages/release/bioc/html/DESeq2.html
Bowtie | Langmead et al., 2009 | http://bowtie-bio.sourceforge.net/index.shtml
MaxQuant | Cox and Mann, 2008 | https://www.biochem.mpg.de/5111795/maxquant
Cutadapt | Martin, 2011 | https://cutadapt.readthedocs.io/en/stable/
STAR | Dobin et al., 2013 | https://github.com/alexdobin/STAR
FlowJo | Treestar | N/A
Metascape | Zhou et al., 2019 | https://metascape.org/gp/index.html#/main/step1
Vienna RNA Webserver | Gruber et al., 2008 | http://rna.tbi.univie.ac.at
FastTree | Price et al., 2010 | http://www.microbesonline.org/fasttree
RAxML | Stamatakis, 2014 | https://cme.h-its.org/exelixis/web/software/raxml/index.html
TreeTime | Sagulenko et al., 2018 | https://github.com/neherlab/treetime
PANDASeq | Masella et al., 2012 | https://github.com/neufeld/pandaseq
NetMHC (v3.4 and v4.0) | Andreatta and Nielsen, 2016 | https://services.healthtech.dtu.dk/service.php?NetMHC-4.0
MUSCLE | Edgar, 2004 | https://www.drive5.com/muscle/
HISAT2 | Kim et al., 2015 | http://daehwankimlab.github.io/hisat2
Prism 8 | Graphpad | N/A
Lead Contact
Further information and requests for reagents may be directed to, and will be fulfilled by, the Lead Contact, Ivan Marazzi (ivan.marazzi@mssm.edu).
All unique/stable reagents generated in this study are available from the Lead Contact with a completed Materials Transfer
Agreement.

The datasets for CAGE sequencing of A/Udorn/72 (H3N2) IAV-infected cells are reported in Clohisey et al. (2020) and deposited at https://fantom.gsc.riken.jp/5/data/. Datasets for DEFEND-seq of PR8 IAV-infected A549 cells were taken from a pre-existing dataset [GEO: GSE96677] (Rialdi et al., 2017). DEFEND-seq data for IBV-infected cells were generated in this study and deposited in GEO: GSE85474. Ribosome profiling data for PR8 IAV-infected cells were generated in this study and deposited in GEO: GSE148245. The datasets for CAGE sequencing of LASV-infected Vero cells were generated in this study and deposited in GEO: GSE148122. RNA-seq data for PR8; PB1-UFOD and PR8; PB1-UFOSYN infected mouse lungs were generated in this study and deposited in GEO: GSE128519. Mass spectrometry data for PR8 IAV-infected A549 and 293T cells were generated in this study and are presented in Table S2A. Mass spectrometry data for WSN IAV virions were analyzed from datasets generated in Hutchinson et al. (2014), taken from https://massive.ucsd.edu/ProteoSAFe/datasets.jsp using the MassIVE ID MSV000078740; the corresponding tables are also provided in Table S2B. Mass spectrometry data for PB1-UFO interactions with IAV polymerase subunits were analyzed using datasets from Heaton et al. (2016) and are presented in Table S2C.
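For readers who want to retrieve the GEO series listed above, the accessions can be gathered in a small lookup. This is a minimal Python sketch and not part of the original study: the dictionary labels and the helper name `geo_url` are illustrative assumptions, and the URL pattern is assumed to be the standard GEO accession query page.

```python
from urllib.parse import urlencode

# Assumed base address of the GEO accession query page.
GEO_QUERY = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"

# Accessions as stated in the text; the labels are shorthand chosen here.
GEO_SERIES = {
    "DEFEND-seq, PR8 IAV-infected A549 cells (Rialdi et al., 2017)": "GSE96677",
    "DEFEND-seq, IBV-infected cells": "GSE85474",
    "Ribosome profiling, PR8 IAV-infected cells": "GSE148245",
    "CAGE, LASV-infected Vero cells": "GSE148122",
    "RNA-seq, infected mouse lungs": "GSE128519",
}


def geo_url(accession: str) -> str:
    """Return the query URL for a GEO series accession such as 'GSE96677'."""
    if not (accession.startswith("GSE") and accession[3:].isdigit()):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    query = urlencode({"acc": accession})
    return f"{GEO_QUERY}?{query}"


if __name__ == "__main__":
    for label, accession in GEO_SERIES.items():
        print(f"{label}: {geo_url(accession)}")
```

The input check only guards against typos in the accession format; it does not confirm that a series exists or is public.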
Cell cultures
Madin–Darby Canine Kidney (MDCK) cells, A549 human lung epithelial cells, Vero (ATCC-CCL81) and 293T human embryonic kidney
cells were cultured in Dulbecco’s Modified Eagle’s Medium (DMEM; GIBCO) supplemented with 10% fetal bovine serum (FBS;
GIBCO). Madin-Darby Bovine Kidney (MDBK) cells were cultured in Minimum Essential Medium (MEM; Sigma) supplemented
with 2 mM L-glutamine and 10% fetal calf serum (FCS). BSR-T7/5 golden hamster cells (Buchholz et al., 1999) were cultured in Glas-
gow Minimal Essential Medium (GMEM) supplemented with 10% FCS and 10% tryptose phosphate broth under G418 selection. All
cells were maintained at 37°C and 5% CO2.
Mice
For infection studies: Six to eight-week-old female BALB/c mice were obtained from Jackson Laboratories (Bar Harbor, ME). All mouse
infection procedures were performed following protocols approved by the Icahn School of Medicine at Mount Sinai Institutional
Animal Care and Use Committee (IACUC). Animal studies were carried out in strict accordance with the recommendations in the
Guide for the Care and Use of Laboratory Animals of the National Research Council.
For antigen presentation experiments: female OT-I (Hogquist et al., 1994) mice were bred in-house on a mixed genetic background. Animals were kept in dedicated barrier facilities with proactive environmental enrichment, under the EU Directive 2010 and the Animals (Scientific Procedures) Act (UK Home Office license number 70/8645), with ethical review approval (University of Glasgow). Animals were cared for by trained and licensed individuals and humanely sacrificed using Schedule 1 methods.
For BMDC isolation: 10-14-week-old naive female C57BL/6 mice were purchased from Envigo (UK) and maintained at the University of Glasgow under standard animal husbandry conditions, in accordance with UK Home Office regulations and as approved by the local ethics committee.
Virus Strains
Wild-type viruses
A/Puerto Rico/8/34(H1N1) (PR8) virus was generated by reverse genetics and propagated in 9-11 day old embryonated chicken eggs
(Charles River, Cat # 10100329). Mouse-adapted A/California/04/09(H1N1) (Cal09) was generated by reverse genetics (Ye et al.,
2010) and propagated on MDCK cells in the presence of 1 mg/ml TPCK-trypsin, as described previously (Hutchinson et al., 2008).
The influenza virus A/WSN/33(H1N1) (WSN) (Hoffmann et al., 2000) was propagated on MDBK cells. A/Udorn/72(H3N2) (Udorn)
was propagated on MDCK cells in the presence of 1 mg/ml TPCK-trypsin, as described previously (Clohisey et al., 2020; Hutchinson
et al., 2014). Plaque assays were carried out in MDCK cells and visualized by immunocytochemistry or staining with crystal violet or
Coomassie blue, as previously described (Gaush and Smith, 1968) (See below also for method details).
Mutant viruses
All mutant and control viruses were generated using a plasmid-based reverse genetics system (Fodor et al., 1999; Ye et al., 2010),
using either the A/Puerto Rico/8/1934 (PR8), A/WSN/33 (WSN) or mouse-adapted A/California/4/09 (Cal09) strains as the backbone.
Plasmids used for reverse genetics were the PR8 pDUAL plasmids (de Wit et al., 2004) and the Cal09 pDP2002 plasmids (Ye et al., 2010) (a kind gift of Prof Daniel R. Perez, University of Georgia, USA). Site-directed mutagenesis of plasmids was performed using the Q5 site-directed mutagenesis kit (NEB); the edited NS segment sequence required for the PR8-NS.F3.SIIN mutant virus (described in Figure 4) was synthesized by Genewiz.
PB1-UFO(SIIN) virus
The OVA257-264 (SIINFEKL) epitope was inserted into the 5′ UTR of the PB1 segment of the influenza A virus (IAV) genome at position 1 before the PB1 start codon. This insertion does not result in an N-terminal extension of, or mutations in, the PB1 protein, but places the OVA257-264 antigenic epitope in frame with the PB1-UFO protein.
NS-UFO(SIIN) virus
The OVA257-264 (SIINFEKL) epitope was inserted into frame 2 of the NS segment of the IAV genome, in a region corresponding to the
linker sequence of the NS1 protein (encoded in frame). This effectively replaced codons 79-84 of NS1, while retaining the sequence of
NEP. The replacement sequence was flanked by two upstream nucleotides and one downstream nucleotide to introduce a frameshift
into frame 2. Premature stop codons in frame 2 were also mutated at positions 4, 27, 32, 74 and 77, relative to the start codon of
NS1, to generate a 106 amino acid long NS-UFO sequence, extending it from the original 4 amino acid long uvORF in reading frame 2.
PB1-SIIN virus and NA-SIIN viruses
These viruses have been described in Wei et al. (2019) and Bottermann et al. (2018) respectively.
PR8; PB1-UFOD, PR8; PB1-UFOSYN, PR8; PB1-EXT+ viruses
PR8; PB1-UFOD contains a C to T nucleotide substitution 9 nucleotides after the start of PB1 open reading frame. This generates a
premature stop codon in the PB1-UFO ORF. Its control virus, PR8; PB1-UFOSYN, contains a C to G nucleotide substitution at the
same position. Both viruses retain the amino acid sequence of the PB1 ORF. PR8; PB1-EXT+ contains a T to C nucleotide substi-
tution three nucleotides before the start of PB1 open reading frame. This disrupts a conserved stop codon (‘‘TGA’’) in frame with
PB1 ORF, resulting in the N-terminal extension of the PB1-ORF. PB1-UFO ORF is maintained in this virus. Mutations were confirmed
by sequencing both plasmids and viruses. All viruses were expanded in 9-11 day old embryonated chicken eggs after rescue. The
stock virus titers were calculated from the average of three independent experiments.
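The design logic of the paired substitutions described above (one base change that is silent in the main ORF but creates a stop codon in the overlapping frame, and a control change at the same position that is silent in both) can be checked computationally. The following is a minimal Python sketch under stated assumptions: the sequence `TOY` is an invented 15-nucleotide example, not the PB1 segment, and the function names, the mutated position, and the choice of frame offsets are hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate every complete codon of seq starting at offset frame."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def classify(seq, pos, new_base, main_frame=0, alt_frame=2):
    """Report the effect of one substitution on the main and overlapping frames."""
    mutant = seq[:pos] + new_base + seq[pos + 1:]
    return {
        "main_unchanged": translate(seq, main_frame) == translate(mutant, main_frame),
        "alt_has_stop": "*" in translate(mutant, alt_frame),
    }


# Invented toy sequence: main frame ATG GCC AAG CTG ACC, overlapping frame at offset 2.
TOY = "ATGGCCAAGCTGACC"

if __name__ == "__main__":
    print("knockout-like C to T:", classify(TOY, 5, "T"))
    print("control-like C to G:", classify(TOY, 5, "G"))
```

In this toy case the C to T change turns the overlapping-frame codon CAA into TAA while the main-frame codon stays alanine, and the C to G change leaves both frames free of stop codons. That mirrors the knockout and synonymous-control pairing, but only as an illustration.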
PR8; NP-EXTD, PR8; NP-EXTSYN viruses
PR8; NP-EXTD contains an A to T nucleotide substitution 6 nucleotides before the start of the NP open reading frame. This generates
an in-frame stop codon that results in the loss of the N-terminal NP-extension. Its control virus, PR8; NP-EXTSYN, bears an A to G
nucleotide substitution at the same position in the UTR, preserving the NP-extension. Mutations were confirmed by sequencing
both plasmids and viruses, and 3 independent plaque purified clones of each virus, grown on MDCK cells, were used in subsequent
experiments. Stock virus titers were calculated from the average of three independent experiments.
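The averaging of stock titers over three independent experiments can be illustrated with a short calculation. This Python sketch makes several assumptions that the text does not state: titers are taken as plaque-forming units per ml computed from a plaque count, a dilution, and an inoculum volume; the average is an arithmetic mean; and all numbers are invented for illustration.

```python
from statistics import mean


def pfu_per_ml(plaques, dilution, inoculum_ml):
    """Titer from one plaque assay: plaques / (dilution * inoculum volume in ml)."""
    if dilution <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution and inoculum volume must be positive")
    return plaques / (dilution * inoculum_ml)


# Hypothetical readings: (plaque count, dilution, inoculum volume in ml).
EXPERIMENTS = [(42, 1e-6, 0.2), (38, 1e-6, 0.2), (46, 1e-6, 0.2)]

titers = [pfu_per_ml(*reading) for reading in EXPERIMENTS]
stock_titer = mean(titers)

if __name__ == "__main__":
    print(f"individual titers (PFU/ml): {titers}")
    print(f"stock titer (PFU/ml): {stock_titer:.3g}")
```

A geometric mean is sometimes preferred for titers that span orders of magnitude; the arithmetic mean is used here only because the text says "average".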
WSN; PB1-UFOD, WSN; PB1-UFOSYN, Cal09; PB1-UFOD, Cal09; PB1-UFOSYN viruses
WSN; PB1-UFOD and Cal09; PB1-UFOD viruses contain C to U nucleotide substitutions 9 nucleotides after the start of PB1 open
reading frame. This generates a premature stop codon in the PB1-UFO ORF. Their control viruses, WSN; PB1-UFOSYN and
Cal09; PB1-UFOSYN respectively, contain C to G nucleotide substitutions at the same positions. All the viruses retain the amino
acid sequence of the PB1 ORF.
WSN; PB2-UFOD, WSN; PB2-UFOSYN, Cal09; PB2-UFOD, Cal09; PB2-UFOSYN viruses

WSN; PB2-UFOD and Cal09; PB2-UFOD viruses contain A to T nucleotide substitutions 12 nucleotides after the start of PB2 open
reading frame. This generates a premature stop codon in the PB2-UFO ORF. Their control viruses, WSN; PB2-UFOSYN and
Cal09; PB2-UFOSYN respectively, contain A to C nucleotide substitutions at the same position. All the viruses retain the amino
acid sequence of the PB2 ORF.
WSN; PA-UFOD, WSN; PA-UFOSYN, Cal09; PA-UFOD, Cal09; PA-UFOSYN viruses
WSN; PA-UFOD and Cal09; PA-UFOD viruses contain C to T nucleotide substitutions 42 nucleotides after the start of PA open reading
frame. This generates a premature stop codon in the PA-UFO ORF. Their control viruses, WSN; PA-UFOSYN and Cal09; PA-UFOSYN
respectively, contain C to A nucleotide substitutions at the same position. All the viruses retain the amino acid sequence of the
PA ORF.
WSN; HA-UFOD, WSN; HA-UFOSYN viruses
WSN; HA-UFOD viruses contain A to T nucleotide substitutions 45 nucleotides after the start of HA open reading frame. This generates
a premature stop codon in the HA-UFO ORF. Its control virus, WSN; HA-UFOSYN, contains an A to C nucleotide substitution
at the same position. All the viruses retain the amino acid sequence of the HA ORF.
Cal09; HA-UFOD, Cal09; HA-UFOSYN viruses
Cal09; HA-UFOD viruses contain G to T nucleotide substitutions 52 nucleotides after the start of HA open reading frame. This generates
a premature stop codon in the HA-UFO ORF. Its control virus, Cal09; HA-UFOSYN, contains a G to C nucleotide substitution at
the same position. All the viruses retain the amino acid sequence of the HA ORF.
Primary CD14+ human monocytes
Primary CD14+ human monocytes were isolated from whole blood samples under ethical approval from Lothian Research Ethics
Committee (11/AL/0168). Cells were obtained from blood donated by 4 anonymous healthy volunteers. Volunteers were not treated
with any drugs. Some volunteers have donated blood used in multiple experiments outside this study. Health status was not assessed.
Plasmids
Plasmids used for HRTV minireplicon assays were the Renilla-luciferase-encoding pT7HRTMRen(–); the viral-gene-encoding
pTMHRTN and pTMHRTL and the firefly-luciferase-encoding control plasmid pTM1-FFluc (Rezelj et al., 2019).
METHOD DETAILS
Growth kinetics of Viruses in Cell Culture

A549 or MDCK cells were infected with the indicated viruses at a multiplicity of infection (MOI) of 0.001 and incubated for one hour at
37°C. Infected cells were washed twice, and then cultured with Opti-MEM and TPCK-treated trypsin at 37°C for 72 h. Supernatants
were collected at the indicated time points. Viral titers were determined by plaque assays.
Quantification of IAV titers by Plaque Assays

Plaque assays in MDCK cells were performed as described previously (Gaush and Smith, 1968). Briefly, serially diluted culture supernatants
of infected cells were adsorbed on layers of confluent MDCK cells for 1 hour. Infected cells were then overlaid with 2ml of
DMEM, 25mM HEPES, 2mM glutamine, 100ug/ml penicillin-streptomycin, 1ug/ml TPCK-trypsin and 0.8% Oxoid Agar. Plates
were incubated for 48-72h until plaques were observed. Plaques were then fixed in 4% formaldehyde and visualized through staining
with 1% crystal violet solution. Alternatively, MDCK cells were overlaid with DMEM mixed 1:1 with 2% (w/v) low gelling temperature
agarose in PBS and supplemented with 1ug/ml TPCK-trypsin, incubated for 48-72h until plaques were observed, and then either
fixed and stained directly (with 0.2% (w/v) Coomassie Brilliant Blue R in 7.5% (v/v) acetic acid and 50% (v/v) ethanol) or fixed in
80% chilled acetone and visualized by immunocytochemistry (permeabilized in 1% Triton X-100 in PBS, blocked in 10% FBS in
PBS, immunostained with mouse anti-NP (BioRad: Cat# MCA400) and peroxidase-conjugated rabbit anti-mouse IgG (Santa
Cruz; Cat # sc-516102) and visualized with True Blue Peroxidase).
Ribosome profiling and analysis

A549 cells were infected in a 10cm dish with A/Puerto Rico/8/1934 (H1N1, PR8) at a MOI of 3. At 8h post infection, ribosome profiling
libraries were prepared as previously described (McGlincy and Ingolia, 2017) with the following exceptions. Infected cells were
treated with either DMSO or 5mg/mL harringtonine for 15 minutes. Cell lysis was performed by flash freezing in liquid nitrogen prior
to the addition of ice-cold lysis buffer. rRNA removal was performed as previously described (Wei et al., 2019). Sequencing was performed
on two lanes of a HiSeq using a 2x150 bp configuration.
Mass Spectrometry experiments (in infected cell lysates)

A549 or HEK293T cells were infected with PR8 virus stock at multiplicities of infection of 3 and 5 respectively. At 8h or 24h post infec-
tion, cells were scraped, washed twice in PBS with protease inhibitors (Calbiochem), before being snap-frozen in liquid nitrogen.
Where indicated, MG132 was added to the cell culture media 4h prior to sample collection. Mock infected samples were included
as negative controls. To prepare cell lysates for mass spectrometry, cell pellets were lysed in lysis buffer (50mM Tris pH8, 1% NP-40,
100mM NaCl, protease inhibitors) on ice. NaCl concentration was then brought up to 500mM by adding salt drop-wise into the
solution while agitating. Lysates were rotated for 30min at 4°C before an equal volume of water was added to the sample to bring
NaCl concentration back to 250mM. Samples were then centrifuged at full speed for 15 min at 4°C. 4x Laemmli buffer (200mM
Tris-HCl pH6.8, 8% SDS, 40% glycerol, 0.588M β-mercaptoethanol, 50mM EDTA and 0.08% Bromophenol Blue) was then added
to the supernatant to 1x concentration, and 5µl of the lysate was loaded on a 4%–12% Bis-Tris gel (Novex). Gels were run under a
hood at 150V for 1h15min in 1X MOPS running buffer and stained in SimplyBlue™ SafeStain (Invitrogen), following the manufacturer’s
recommended protocol. Once stained, gel bands corresponding to 40-60kDa and < 15kDa were excised. Gel slices were subjected
to in-gel tryptic digests as previously described (Rosenfeld et al., 1992).
Digested samples were analyzed on a Thermo Fisher Orbitrap Fusion mass spectrometry system equipped with an Easy nLC 1200
ultra-high pressure liquid chromatography system interfaced via a Nanospray Flex nanoelectrospray source. Samples were injected
on a C18 reverse phase column (25 cm × 75 µm packed with ReprosilPur C18 AQ 1.9 µm particles). Peptides were separated by an
organic gradient from 5% to 30% ACN in 0.1% formic acid over 70 minutes at a flow rate of 300 nL/min. The MS continuously ac-
quired spectra in a data-dependent manner throughout the gradient, acquiring a full scan in the Orbitrap (at 120,000 resolution with
an AGC target of 200,000 and a maximum injection time of 100 ms) followed by as many MS/MS scans as could be acquired on the
most abundant ions in 3 s in the dual linear ion trap (rapid scan type with an intensity threshold of 5000, HCD collision energy of 29%,
AGC target of 10,000, a maximum injection time of 35 ms, and an isolation width of 1.6 m/z). Singly and unassigned charge states
were rejected. Dynamic exclusion was enabled with a repeat count of 1, an exclusion duration of 20 s, and an exclusion mass width of
± 10 ppm. Raw mass spectrometry data were assigned to human protein sequences and MS1 intensities extracted with the Max-
Quant software package (version 1.6.8) (Cox and Mann, 2008). Data were searched against the SwissProt human protein database
(downloaded on October 10, 2019) and a custom influenza A virus database comprising all six open-reading frames greater than 10
amino acids for the IAV (strain PR-8) genomic sequence. Variable modifications were allowed for N-terminal protein acetylation,
methionine oxidation, and lysine acetylation. A static modification was indicated for carbamidomethyl cysteine. All other settings
were left using MaxQuant default settings.
Mass Spectrometry experiments (in virions)

The purification of influenza virions and collection of mass spectra by LC-MS/MS has been described previously (Hutchinson et al.,
2014), and followed previously-described protocols for purification, mass spectrometry and data analysis (Hutchinson and Stegmann,
2018). Briefly, the IAV WSN was propagated on MDBK cells. Six viral stocks were prepared, of which half were subjected to haemad-
sorption on chicken red blood cells to stringently remove non-viral material. Virus particles were then purified by sucrose gradient ul-
tracentrifugation, lysed in urea, reduced, alkylated and digested with trypsin and LysC. Tryptic peptides were analyzed by liquid chro-
matography and tandem mass spectrometry (LC-MS/MS) using an Ultimate 3000 RSLCnano HPLC system (Dionex, Camberley, UK)
run in direct injection mode and coupled to a Q Exactive mass spectrometer (Thermo Electron, Hemel Hempstead, UK) in ‘Top 10’ data-
dependent acquisition mode. Raw files describing these mass spectra have been deposited at the Mass spectrometry Interactive Vir-
tual Environment (MassIVE; Center for Computational Mass Spectrometry at University of California, San Diego) and can be accessed at
https://massive.ucsd.edu/ProteoSAFe/datasets.jsp using the MassIVE ID MSV000078740. For the purposes of this project, data were
re-analyzed using MaxQuant 1.5.8.3 analysis software (Tyanova et al., 2016) using standard settings and the following parameters: la-
bel-free quantitation and the iBAQ algorithm (Schwanhäusser et al., 2011) enabled; enzyme: trypsin/P; variable modifications: oxidation
(M) and acetyl (Protein N-ter); and fixed modifications: carbamidomethyl (C); digestion mode: semi-specific free N terminus. Peptide
spectra were matched to custom databases containing the IAV WSN proteome (including full-length translations of all six reading
frames), an edited version of the Bos taurus proteome (UP000009136; retrieved from UniProt on 16/05/2017) in which all instances
of the ubiquitin sequence had been deleted, and a single repeat of the ubiquitin protein sequence.
DEFEND sequencing of IBV infected cells

DEFEND-seq was performed as previously described (Rialdi et al., 2017). Briefly, RNA was extracted from A549 cells infected with
influenza B virus (B/Wisconsin/01/2010) for 8 hours using Trizol (Invitrogen) and subjected to DNase treatment (QIAGEN). 5µg of
DNase treated RNA was then incubated with 10U of Tobacco Acid Pyrophosphatase (Epicentre; 37°C, 1.5h) to remove mRNA 5′ caps.
Sodium periodate was then added (to 500mM) into the reaction to block the 3′ OH. The reaction was then allowed to proceed for
1.5h at 4°C, before being blocked by the addition of 1/10 volume 1M L-lysine, and incubating for an additional 10min at room temperature.
RNA was purified with 1.8X AMPure XP beads (Beckman Coulter). Barcoded RNA adapters were then ligated to the
5′ ends of RNAs overnight at 16°C. Adaptor-ligated RNA was purified using 1.8X volume of AMPure XP beads. Ribosomal RNAs were
removed using the Ribo-Zero Gold rRNA Removal Kit (Human/Mouse/Rat) (Illumina), according to the manufacturer’s protocol.
cDNA synthesis was performed using a custom 3′ primer (5′-AGA CGT GTG CTC TTC CGA TCT N*N*N*N*N*N*-3′, Bioo Scientific,
N* = randomized bases) for 2 min at 65°C. Illumina adapters were added by PCR, and products were size-selected (200-400bp) using
BluePippin 2% M1 gels (Sage Scientific). The library was validated on the Agilent Bioanalyzer, and samples were sequenced on the
Illumina HiSeq 2500 platform in a 100bp SE read run format.
Preparation of CAGE libraries from LASV infected cells

Vero cells (ATCC-CCL81) grown on T75 flask were infected with recombinant LASV (Josiah strain) at MOI 0.1. At 2 days post infection,
cells were lysed in Trizol (Invitrogen). The infection work with pathogenic Lassa virus and RNA lysate preparation were performed at the
BSL4 facilities in Galveston National Laboratory in the University of Texas Medical Branch in accordance with institutional health and
safety guidelines and federal regulations. Total RNA from the trizol-treated lysates was isolated and DNase treated using the Purelink
RNA Minikit (Invitrogen). The purified RNA was then submitted for CAGE-sequencing at Kabushiki Kaisha DNAFORM, Japan.
Mouse Infection studies

All mouse infection procedures were performed following protocols approved by the Icahn School of Medicine at Mount Sinai Institutional
Animal Care and Use Committee (IACUC). Animal studies were carried out in strict accordance with the recommendations in
the Guide for the Care and Use of Laboratory Animals of the National Research Council. Six to eight-week-old female BALB/c mice
were obtained from Jackson Laboratories (Bar Harbor, ME). Mice were anesthetized by intraperitoneal injection of a mixture of 85mg/kg
ketamine and 12.5mg/kg xylazine before being inoculated intranasally with 50µl virus re-suspended in PBS. Mice
were monitored daily for clinical signs of illness and weight loss after infection. Upon reaching 75% of initial body weight, animals
were humanely euthanized with carbon dioxide (CO2) as per the IACUC protocol.
Preparation of RNA sequencing Libraries (Infected Mice)

Three mice were intranasally (i.n.) infected with 100 plaque-forming units (PFU) of viruses in a volume of 50 µL and euthanized at 6 days post-inoculation
(d.p.i.). The middle lobe of the lung was collected for total RNA extraction, and the post-caval lobe of the lung was collected
to determine virus titers by plaque assay on MDCK cells. Lung tissue was then homogenized in Trizol (Invitrogen), and RNA was extracted
as per manufacturer’s guidelines. Libraries were constructed using the Illumina TruSeq Stranded Total RNA Library Prep Kit.
SIINFEKL expression analysis

For T cell activation assays with PB1-UFO(SIIN) and PB1-SIIN viruses, OT-I T cells were harvested from the spleen and lymph nodes
of OTI transgenic mice and purified on the AutoMACS with the CD8a+ T Cell Isolation Kit (Miltenyi, Germany). DC2.4 cells were in-
fected with influenza A viruses for 18 hours, and then co-cultured with OTI T cells. T cells were stained with anti-CD25 and anti-CD28
labeled antibodies at 24 hours post co-culture for activation assays. T cell proliferation assays were conducted at 48 hours post infec-
tion by measuring CellTrace Violet staining by flow cytometry.
For T cell activation assays with the NS-UFO(SIIN) and NA-SIIN viruses, IAV antigen was propagated by infecting MDCK cells with
IAV PR8 wild-type, PR8 containing an NS segment with SIINFEKL inserted into frame 3 (PR8-NS.F3.SIIN) or PR8 containing an NA
segment with SIINFEKL inserted into frame 1 (PR8-NA.SIIN) (Bottermann et al., 2018). The IAV antigen preparations were prepared as
described (Stuller et al., 2010; Westerhof et al., 2019). Briefly, MDCK cells were infected for 48 h with each IAV strain and then centrifuged,
resuspended in 0.1 M glycine buffer containing 0.9% NaCl (pH 9.75), and shaken at 4°C for 20 min. Preparations were
sonicated 4 times at 10 s intervals before centrifugation, and the supernatant stored at −80°C.
Bone marrow was then taken from 10-14 week old naive female C57BL/6 mice, purchased from Envigo (UK) and maintained at the
University of Glasgow under standard animal husbandry conditions in accordance with UK home office regulations and approved by the
local ethics committee. Bone marrow derived dendritic cells (BMDCs) were prepared as previously described (Westerhof et al., 2019).
Briefly, the tibias and femurs were flushed to obtain bone marrow cells. Red blood cells were lysed. Cells were then cultured in RPMI
with 10% FCS, 100ug/ml penicillin-streptomycin and 2mM L-glutamine, in the presence of GM-CSF (prepared from X-63 supernatant),
for 7 days, with media supplemented on day 2 and replaced on day 5. DCs were then harvested and incubated overnight with IAV antigen
preparations. Control BMDCs were incubated with SIINFEKL peptide (Ovalbumin (257-264), chicken, Sigma-Aldrich) for 1 h
at 37°C.
Lymph nodes (LN) (inguinal, brachial, axillary and cervical) and spleen were obtained from OTI mice sacrificed at weeks 12-13. CD8
T cells were negatively selected from LN and spleen using EasySep Mouse CD8+ T Cell Isolation Kit (Stemcell technologies).
BMDCs that had been exposed to viral antigen were co-cultured with CD8+ OTI T cells for 24 h. Activated T cells were detected by
immunostaining with antibodies against Vα2-E450 (Thermo Fisher), Vβ5-PE (M59-4 BD Biosciences), CD8-Alexa Fluor 488 (53-6.7
Thermo Fisher), CD25-APC (PC61.3 Thermo Fisher), CD44-PerCP-Cy5.5 (IM7 Thermo Fisher), and CD69-PE-Cy7 (H1.2F3 Thermo
Fisher). Data were acquired with a BD Fortessa cell analyzer and analyzed by FlowJo (BD, version 10).
Minireplicon Assays
Minireplicon assays were performed as previously described (Rezelj et al., 2019; Tilston-Lunel et al., 2017). Briefly, and using the
plasmids indicated above, LT-1 transfection reagent (Mirus) was used to transfect sub-confluent BSR-T7/5 cells. After 24 h cells
were processed using a Dual-Luciferase Reporter Assay System (Promega), with luciferase measured using a GloMax 20/20 luminometer
(Promega).
QUANTIFICATION AND STATISTICAL ANALYSES
Mouse Infection Studies

Survival curves were compared using the log-rank (Mantel-Cox) test in GraphPad Prism 8.0 software.
Two-tailed Student’s t tests under the assumption of equal variances between groups were used to compare weight loss in
mice from different groups for each day post infection. Data are shown as mean +/- SEM.
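As an illustration of the weight-loss comparison described above, the following is a minimal sketch of a pooled-variance (equal variances) two-sample Student's t statistic together with the mean and SEM. The function names and the body-weight values are hypothetical and are not taken from the study; the published analysis was run in GraphPad Prism.

```python
import math
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of numbers."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def pooled_t(group_a, group_b):
    """Two-sample Student's t statistic assuming equal variances.

    Returns (t, degrees_of_freedom). A p value would be read from the
    t distribution with that many degrees of freedom.
    """
    na, nb = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    df = na + nb - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / se, df


# Hypothetical body weights (% of initial weight) on one day post infection.
control_group = [82.0, 85.0, 80.0, 83.0]
mutant_group = [90.0, 92.0, 88.0, 91.0]
t_stat, dof = pooled_t(control_group, mutant_group)
```

Running the comparison separately for each day post infection, as the text describes, would simply repeat `pooled_t` on that day's measurements.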
Quantitative qPCR assays

qPCR assays were done with 4 biological replicates (4 infected mice/condition). Statistical significance in gene expression was
calculated with Graphpad Prism 8.0 software, and determined using one-tailed Student’s t test under the assumption of equal var-
iances between groups. Data are shown as mean +/- SEM.
CAGE sequencing of WSN IAV virus infected cells

The sequencing of cap-snatched leader sequences was described in detail in a recent publication (Clohisey et al., 2020). Briefly,
primary CD14+ human monocytes were isolated from 4 volunteer donors under ethical approval from Lothian Research Ethics Committee
(11/AL/0168) and cultured in the presence of 100 ng/ml (10^4 U/ml) recombinant human colony-stimulating factor 1 (a gift from
Chiron, USA) for 8 days to differentiate them into macrophages. Monocyte-derived macrophages were then infected with influenza
(Udorn) at an MOI of 5, harvested at 0, 2, 7 and 24 hours post-infection (times defined as starting after a 1h adsorption step), and
processed for RNA extraction using a miRNeasy Mini Kit (QIAGEN). Cap analysis of gene expression (CAGE) was performed as
part of the FANTOM5 project, following the procedure of Takahashi et al. (2012). Data were processed as in Forrest et al. (2014)
using custom Python scripts available at https://github.com/baillielab/influenza_cage (’ATG analysis’). The datasets analyzed during
the current study are available in the FANTOM5 repository, https://fantom.gsc.riken.jp/5/data/
Ribosome sequencing analyses

Footprints were obtained by first removing the AGATCGGAAGAGC linker and filtering for low quality sequences with Cutadapt (Mar-
tin, 2011). Contigs were then generated from the paired end reads with PANDASeq (Masella et al., 2012) using default parameters.
Concurrent demultiplexing of the libraries by sample ID and UMI extraction was then performed. Reads were then aligned against
rRNA and tRNA sequences with Bowtie (Langmead et al., 2009) to remove these contaminating sequences. Unmapped reads
were aligned against a custom reference containing the human genome (hg38) and the eight genome segments of PR8 with HISAT2
(Kim et al., 2015). Host primer sequences were extracted from this alignment as well as unmapped reads by searching for a match to
conserved nucleotides at the 5′ end of the influenza mRNA (GC[GA]AAAGCAGG). These reads were kept if the sequences could be
extended to unambiguously assign them to a segment. Finally, 5′ end mapping was performed on these and all reads mapping to PR8.
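The motif search described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes reads are DNA-alphabet strings, and the function names and the example read are hypothetical. Only the pattern GC[GA]AAAGCAGG comes from the text.

```python
import re

# Conserved 5' end of influenza A mRNA, as given in the text.
CONSERVED_5P = re.compile(r"GC[GA]AAAGCAGG")


def split_cap_snatched_read(read):
    """Split a read at the first match of the conserved viral 5' sequence.

    Returns (host_leader, viral_part), or None if the motif is absent.
    """
    match = CONSERVED_5P.search(read)
    if match is None:
        return None
    return read[:match.start()], read[match.start():]


def leader_has_start_codon(host_leader):
    """True if the host-derived leader contains an ATG (AUG in RNA)."""
    return "ATG" in host_leader


# Hypothetical read: a 12-nt host-derived leader followed by viral sequence.
example_read = "ACTATGGCTTCAGCAAAAGCAGGTAGATATTG"
leader, viral = split_cap_snatched_read(example_read)
```

Assigning the viral part to a genome segment would still require extending the match into segment-specific sequence, which this sketch does not attempt.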
RNA sequencing Analyses

After adaptor removal with cutadapt (Martin, 2011) and base-quality trimming to remove 3′ read sequences if more than 20 bases with
Q < 20 were present, paired-end reads were mapped to the mouse (mm10) reference genome with STAR (Dobin et al., 2013), and
gene-count summaries were generated with featureCounts (Liao et al., 2014). DESeq2 (Love et al., 2014) was used to variance-normalize
the data before a 1-factor model (gene ~ ConditionTimeMutant) was applied to identify differentially expressed genes.
Differentially expressed genes were identified as genes that had a 2-fold difference, with an adjusted p value < 0.01. RNA-seq
raw data are deposited in GEO: GSE128519. Gene ontology analysis was performed using Metascape (Zhou et al., 2019).
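The thresholding step can be sketched as below. This is only an illustration of the stated cut-offs, applied to a hypothetical results table; it assumes that "2-fold difference" means at least a 2-fold change in either direction, and the gene names and values are invented. The statistics themselves came from DESeq2 in the study.

```python
import math


def is_differentially_expressed(fold_change, adjusted_p,
                                min_fold=2.0, max_adjusted_p=0.01):
    """Apply the thresholds stated in the text.

    fold_change is a ratio (for example mutant / control); a change of at
    least min_fold in either direction passes (an assumption of this sketch).
    """
    passes_fold = abs(math.log2(fold_change)) >= math.log2(min_fold)
    return passes_fold and adjusted_p < max_adjusted_p


# Hypothetical results: gene -> (fold change, adjusted p value).
results = {
    "GeneA": (4.0, 0.001),   # up, significant
    "GeneB": (0.4, 0.005),   # down by more than 2-fold, significant
    "GeneC": (1.5, 0.0001),  # significant but below the fold threshold
    "GeneD": (3.0, 0.02),    # large change but adjusted p too high
}
hits = sorted(gene for gene, (fc, p) in results.items()
              if is_differentially_expressed(fc, p))
```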
LASV CAGE sequencing Analyses

Unique chimeric host-virus reads were extracted from the resulting FASTQ files by searching for a match to conserved nucleotides at
the 5′ end of the LASV (Josiah Strain) mRNAs (GCAC[M]G[N]GGATCCT), allowing for a maximum of 1 mismatch, and removing all
reads with ambiguously mapped nucleotides. The reference genome of LASV was obtained from UniProt (Accessions: J04324
and U73034). Reads were kept if at least 60 nucleotides could be mapped and assigned unambiguously to the viral reference sequences.
Each read was then split into host derived or virus derived sequences based on the sequences of viral 5′ end (GCAC[M]
G[N]GGATCCT). To calculate potential uvORF length, each read was extended bioinformatically, based off the mapped genome
segment and coding sense, and translated from the first AUG found in the read.
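A minimal sketch of the two operations described above (motif search with at most one mismatch, then translation from the first AUG) is given below. It assumes DNA-alphabet reads, treats M and N as IUPAC codes (M = A or C, N = any base), and translates only the sequence it is given rather than extending the read along the reference genome. All function names and the example read are hypothetical.

```python
# IUPAC codes needed for the motif GCAC[M]G[N]GGATCCT from the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "N": "ACGT"}
MOTIF = "GCACMGNGGATCCT"

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def find_motif(read, motif=MOTIF, max_mismatch=1):
    """Index of the first window matching the motif with at most
    max_mismatch mismatches, or -1 if no window qualifies."""
    for i in range(len(read) - len(motif) + 1):
        window = read[i:i + len(motif)]
        mismatches = sum(base not in IUPAC[code]
                         for base, code in zip(window, motif))
        if mismatches <= max_mismatch:
            return i
    return -1


def translate_from_first_atg(seq):
    """Translate from the first ATG until a stop codon or the end of seq."""
    start = seq.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical chimeric read: 8-nt host leader, motif instance, 3 more bases.
example_read = "ACATGGCA" + "GCACAGTGGATCCT" + "AGG"
split_at = find_motif(example_read)
host_part, virus_part = example_read[:split_at], example_read[split_at:]
```

In the analysis described in the text, the virus-derived part would first be extended using the mapped genome segment before translation, so that the predicted uvORF length is not truncated by the read length.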
Sequence Randomization Model for PB1-UFO length

Influenza A PB1 nucleotide sequences were obtained from the NCBI database (Zhang et al., 2017). Only unique sequences containing
complete 5′ UTR regions were included. Sequences containing ambiguous nucleotides were excluded. Multiple sequence alignment
was then performed by using MUSCLE (Edgar, 2004).
We then constructed a codon usage table for each individual nucleotide sequence. To run the random sequence model, each
nucleotide sequence was translated into two protein sequences in the two translation reading frames of interest: the canonical
PB1 open reading frame (Pr-ORF) and PB1-UFO frame (Pr-UFO). Pr-UFO was considered as the observed protein sequence. Based
on the frequencies of synonymous codons within a codon usage table, each Pr-ORF was reverse translated into multiple random
nucleotide sequences in the open reading frame (Nt-ORFs) 1,000 times. 1,000 Nt-ORFs were then translated into proteins in the
UFO frame (Pr-UFOs) which were considered as the expected protein sequences and their protein lengths were computed. We
used the length of observed Pr-UFO and the lengths of expected Pr-UFOs to calculate the z-score for each nucleotide sequence.
In total, 3140 unique IAV PB1 (H3N2 only: 499) sequences were included in the analysis. From the z-scores, P values were calculated
for the Pr-UFOs occurrence biases. A threshold of p < 0.05 was used for the prediction of the likelihood of IAV PB1 sequences that
were able to be translated. Similar analyses were also performed for other genome segments.
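The randomization procedure described above can be sketched roughly as follows. This is an illustrative approximation under several assumptions that are not from the source: the overlapping frame is treated as a fixed offset within the main ORF (the real PB1-UFO begins upstream, in the 5′ UTR), protein length is counted up to the first stop codon, and the function names, the seed, and the toy sequence are hypothetical.

```python
import random
from collections import Counter, defaultdict
from statistics import mean, pstdev

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(seq, frame=0):
    """Translate seq in the given frame (0, 1 or 2) up to the first stop."""
    protein = []
    for i in range(frame, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def codon_usage(orf):
    """Per-sequence codon usage table: amino acid -> Counter of codons."""
    usage = defaultdict(Counter)
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        usage[CODON_TABLE[codon]][codon] += 1
    return usage


def shuffle_synonymous(orf, usage, rng):
    """Reverse translate the ORF, drawing each codon from the observed
    frequencies of its synonymous codons; the main-frame protein is kept."""
    out = []
    for i in range(0, len(orf) - 2, 3):
        counts = usage[CODON_TABLE[orf[i:i + 3]]]
        codons = list(counts)
        weights = [counts[c] for c in codons]
        out.append(rng.choices(codons, weights=weights)[0])
    return "".join(out)


def ufo_length_zscore(orf, ufo_frame=1, n_random=1000, seed=0):
    """z-score of the observed overlapping-frame protein length against
    lengths from n_random synonymous reshufflings of the main ORF."""
    rng = random.Random(seed)
    usage = codon_usage(orf)
    observed = len(translate(orf, ufo_frame))
    expected = [len(translate(shuffle_synonymous(orf, usage, rng), ufo_frame))
                for _ in range(n_random)]
    spread = pstdev(expected)
    return (observed - mean(expected)) / spread if spread > 0 else 0.0


# Toy ORF (hypothetical), 13 codons including the stop codon.
toy_orf = "ATGGCTGCCGCAGCGCTTCTGTTATTGTCTTCCAGTTAA"
z = ufo_length_zscore(toy_orf, n_random=200)
```

A p value could then be derived from each z-score, as the text describes; that step is omitted here because the source does not state which distribution or tail was used.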
Frequency Propagator Ratio Analysis

Sequence dataset
Our study was based on a dataset of 26,742 human influenza A/H3N2 sequences available from the GISAID database (Shu and
McCauley, 2017), which contains 6,244 unique PB1 strains. For downstream analyses, we included only sequences that had
complete 5′ and 3′ UTRs.
Prediction of RNA secondary structure
We used the most abundant unique, full length PB1 nucleotide sequence as an input to predict RNA secondary structure. RNA
secondary structure was predicted using RNAfold from the ViennaRNA Webserver (version 2.4.13) (Gruber et al., 2008), using the
default settings to calculate the minimum free energy (MFE) structure of the PB1 segment RNA. The output structure was saved
in a dot-bracket format, and used to partition nucleotides into probable loop and stem regions for downstream analyses.
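As a small illustration of the partitioning step, the sketch below splits a dot-bracket string into paired ("stem") and unpaired ("loop") positions. The function name and the example structure are hypothetical; only the dot-bracket convention itself (parentheses for paired bases, dots for unpaired bases) is assumed.

```python
def partition_dot_bracket(structure):
    """Return (stem_positions, loop_positions) as 0-based index lists.

    '(' and ')' mark nucleotides predicted to base pair (stems);
    '.' marks unpaired nucleotides (loops).
    """
    stems = [i for i, symbol in enumerate(structure) if symbol in "()"]
    loops = [i for i, symbol in enumerate(structure) if symbol == "."]
    return stems, loops


# Hypothetical short structure, not the PB1 prediction.
example_structure = "((..))."
stem_positions, loop_positions = partition_dot_bracket(example_structure)
```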
Strain tree reconstruction
Our analysis was based on an ensemble of strain trees obtained from the PB1 sequence dataset described above. Such trees
describe the genealogy of influenza strains resulting from an evolutionary process under selection (Strelkowa and Lässig, 2012).
Trees were constructed with maximum-likelihood phylogenies using FastTree (Price et al., 2010). We used a general time-reversible
model. We further refined the tree topology with RAxML (Stamatakis, 2014). Given the output topology, we reconstructed maximum-
likelihood sequences and timing of internal nodes with the TreeTime package (Sagulenko et al., 2018).
Frequency Propagator Ratio analysis
A detailed discussion of this method has previously been presented in Strelkowa and Lässig (2012) and Luksza and Lässig (2014).
Briefly, for a given polymorphism time-series, the frequency propagator $G(x)$ can be used as a statistical measure of selection.
$G(x)$ is defined as the conditional probability that a mutation class of interest, with an initial frequency of $x_i$, reaches a frequency
of $x > x_i$ at a later point in time. This is estimated in our dataset as
$$G(x) = \frac{n(x)}{n}$$
where $n(x)$ is the number of mutations that reach frequency $x$
and $n$ is the total number of mutations.
Data availability might vary, depending on the year of sequence collection (fewer data points are available in the earlier years). As
such, to attain a more robust measure of selection, we use the ratio of propagators between our mutation class of interest, $G(x)$,
against a neutral reference class of mutations, $G_0(x)$, to calculate
$$g(x) = \frac{G(x)}{G_0(x)}$$
where
$G(x)$ is the likelihood that a mutation in a given class reaches frequency $x$, and
$G_0(x)$ is the likelihood that a mutation in the neutral reference class of mutations reaches the same frequency $x$.
The frequency propagator ratio takes into account both numbers and histories of the mutation class of interest. It is a robust mea-
sure of selection because it is (a) largely independent of data entry frequency, and (b) insensitive to clonal expansion of mutations.
At the limit $x = 1$, the propagator ratio $g(x)$ reduces to $g$, where
$$g = \frac{d/n}{d_0/n_0}$$
and
$d$ is the number of mutations in our class of interest that reach fixation,
$n$ is the total number of mutations in the same class of interest,
$d_0$ is the number of mutations in a neutral reference mutation class that reach fixation, and
$n_0$ is the total number of mutations in the same neutral reference mutation class.
Selection on a mutation class of interest can be inferred from the value of $g$. $g < 1$ suggests evolutionary constraints (negative selection)
on the mutation class of interest relative to the reference class, where a fraction $(1 - g)$ of the mutations are under negative
selection. $g > 1$ suggests that fixation of the mutation class of interest undergoes positive selection, and that at least a fraction
$(g - 1)/g$ of that mutation class is beneficial. $g \approx 1$ suggests weak or heterogeneous selection acting on the mutation class of interest,
relative to that of the neutral reference class.
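The arithmetic behind the propagator ratio at fixation can be sketched as follows. The counts are invented for illustration, the function names are hypothetical, and the tolerance used to call g approximately equal to 1 is an arbitrary choice of this sketch; the source reports sampling uncertainties as error bars instead.

```python
def propagator(n_reaching, n_total):
    """G(x) = n(x) / n: fraction of mutations that reach frequency x."""
    return n_reaching / n_total


def propagator_ratio(d, n, d0, n0):
    """g = (d / n) / (d0 / n0) at the fixation limit x = 1."""
    return propagator(d, n) / propagator(d0, n0)


def interpret(g, tolerance=0.05):
    """Return (label, fraction) following the interpretation in the text.

    For g < 1 the fraction under negative selection is 1 - g; for g > 1
    at least (g - 1) / g of the class is beneficial. The tolerance band
    around 1 is an illustrative assumption.
    """
    if abs(g - 1) <= tolerance:
        return "weak or heterogeneous selection", 0.0
    if g < 1:
        return "negative selection", 1 - g
    return "positive selection", (g - 1) / g


# Hypothetical counts: 5 of 100 mutations in the class of interest fix,
# versus 20 of 200 in the neutral reference class.
g_example = propagator_ratio(d=5, n=100, d0=20, n0=200)
label, fraction = interpret(g_example)
```

With these invented counts the class of interest fixes half as often as the reference class, so the sketch labels it as under negative selection.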
To quantify selection occurring across the PB1-UFO frame, we calculated mutation frequencies in the set of codons derived from
the following three regions (R1-R3)
R1: sequences that encode the N-terminal of PB1-UFO and the viral 5′ UTR
R2: sequences that encode the C-terminal of PB1-UFO and overlap with the N-terminal of PB1
R3: Sequences that encode for the C-terminal region of the main PB1 ORF and do not overlap with PB1-UFO.
We chose to use synonymous mutations in the main PB1 ORF (reading frame) in R3 as our neutral reference class to calculate
$G_0(x)$, as we reasoned that the majority of such mutations evolve near neutrality.
To quantify selection on the N-terminal of PB1-UFO in R1, we calculated the $G(x)$ for two classes of mutations: those that changed
(non-synonymous in PB1-UFO) or did not change (synonymous in PB1-UFO) the amino acid sequence of PB1-UFO. We used
synonymous mutations occurring in the PB1 ORF in R3 as our neutral reference class ($G_0(x)$). We found that $g < 1$ for both cases,
suggesting that mutations occurring in this region of PB1-UFO were not likely to be fixed over time, and mostly undergo negative
selection, relative to our reference class.
To quantify selection on the C-terminal of PB1-UFO in R2, we again calculated the $G(x)$ for mutations that changed (non-synonymous
in PB1-UFO) or did not change (synonymous in PB1-UFO) the amino acid sequence of PB1-UFO. We used synonymous
mutations occurring in the PB1 ORF in R3 as our neutral reference class ($G_0(x)$). We found that $g \approx 1$ for both cases, suggesting
that mutations occurring in this region of PB1-UFO underwent heterogeneous selection, relative to that of the reference class.
Since R2 mutations in PB1-UFO appear to undergo heterogeneous selection, we asked if selection occurring on the main PB1 ORF
was a contributing factor. To do so, we calculated the $G(x)$ for mutations that changed (non-synonymous in PB1) or did not change
(synonymous in PB1) the amino acid sequence of PB1 in R2. We again used synonymous mutations occurring in the PB1 ORF in R3 as our neutral
reference class ($G_0(x)$). Here we found that $g > 1$ for synonymous mutations and $g < 1$ for non-synonymous mutations, suggesting
that mutations that do NOT alter the amino acid sequence of PB1 are preferentially fixed over time. This suggests to us that part of the
reason why PB1-UFO is undergoing heterogeneous selection in R2 is that there is a requirement to maintain the protein sequence of
PB1. This is not surprising, given that PB1 is an integral part of the viral RNA dependent RNA polymerase complex.
Finally, to interrogate the effect of RNA structure, we classified nucleotides as pairing or non-pairing based on the MFE structure
(discussed above) calculated by RNAfold. We masked nucleotides that were predicted to base pair (‘‘stem-forming’’) from downstream
analyses as we reasoned that mutations in these nucleotides are likely to affect both RNA structure AND protein sequence,
thus confounding later interpretations of the data. Regions that were not predicted to base pair (‘‘loop nucleotides’’) were then used
for downstream calculations of frequency propagator ratios. Mutation frequencies were calculated in the same regions (R1, R2 and
R3) and reading frames (PB1-UFO versus PB1) as described above. We found similar effects to before, suggesting
that RNA structure was not a major contributor to the maintenance of the PB1-UFO frame.
Note: The absolute number of polymorphism histories that reach a given frequency is finite (since the tree is constructed over a
defined period of time). This can give rise to sampling fluctuations. These sampling uncertainties are reported as error bars in
our figures.
Epitope predictions for PB1-UFO

Analyses were done using NetMHC3.4 and NetMHC4.0 (Andreatta and Nielsen, 2016). Binders were filtered using a KD threshold of 500 nM. The collection of viral MHC-I epitopes was downloaded from the IEDB database and preformatted for BLAST usage:
makeblastdb -in iedb.fasta -parse_seqids -dbtype prot
Predicted epitopes from PB1-UFO were BLASTed against IEDB and the human proteome. For comparison with viral antigens, we used the following command:
blastp -db iedb.fasta -query antigens.fasta -outfmt "6 qseqid sseqid pident ppos positive mismatch gapopen length qlen slen qstart qend sstart send qseq sseq evalue bitscore" -word_size 3 -gapopen 32767 -gapextend 32767 -evalue 1 -max_hsps_per_subject 1 -matrix BLOSUM62 -max_target_seqs 10000000 -out antigens.iedb.blast.out
For comparison with the human proteome, we used the command:
blastp -db human.proteome.fasta -query antigens.fasta -outfmt "6 qseqid sseqid pident ppos positive mismatch gapopen length qlen slen qstart qend sstart send qseq sseq evalue bitscore" -word_size 3 -gapopen 32767 -gapextend 32767 -evalue 1 -max_hsps_per_subject 1 -matrix BLOSUM62 -max_target_seqs 10000000 -out antigens.human.proteome.blast.out
To find perfect matches between predicted epitopes and the human proteome or viral antigens, we used LAST (the lastal command). First, we preformatted the human proteome (Ensembl archive from December 2016):
lastdb -p human.proteome human.proteome.fasta
Then we used the following command to compare epitopes to this database:
lastal -f MAF -r 2 -q 1 -m 100000000 -a 100000 -d 15 -l 4 -k 1 -j1 -P 10 human.proteome antigens.netMHC.score.fasta > antigens.human.last.out
Finally, the results were processed with bash and Python and analyzed in PRISM 8. Similar processing was performed with viral antigens.
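The post-processing scripts are not given in the text. As a rough illustration only, the sketch below shows one way the tabular BLAST output could be screened for perfect matches in Python. The column order is copied from the -outfmt string above; the function name, the toy records, and the definition of "perfect" (no mismatches, no gap openings, alignment spanning the full query) are assumptions, not the authors' method.

```python
import csv
import io

# Column order copied from the -outfmt string quoted in the text.
COLUMNS = ("qseqid sseqid pident ppos positive mismatch gapopen length "
           "qlen slen qstart qend sstart send qseq sseq evalue bitscore").split()


def perfect_matches(handle):
    """Yield (query, subject) pairs where the whole query aligns identically.

    "Perfect" is an assumption here: no mismatches, no gap openings, and an
    alignment length equal to the query length.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        if (int(hit["mismatch"]) == 0 and int(hit["gapopen"]) == 0
                and int(hit["length"]) == int(hit["qlen"])):
            yield hit["qseqid"], hit["sseqid"]


# Toy input with made-up identifiers, only to show the call.
example = io.StringIO(
    "pep1\tprotA\t100.0\t100.0\t9\t0\t0\t9\t9\t300\t1\t9\t50\t58\tSIINFEKLA\tSIINFEKLA\t0.5\t20.0\n"
    "pep2\tprotB\t88.9\t100.0\t9\t1\t0\t9\t9\t250\t1\t9\t10\t18\tGILGFVFTL\tGILGFVFTM\t0.9\t18.0\n"
)
matches = list(perfect_matches(example))
print(matches)
```

In practice the handle would be an opened output file such as antigens.human.proteome.blast.out; the second toy record is dropped because it carries one mismatch.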
e11 Cell 181, 1502–1517.e1–e11, June 25, 2020

Figure S1. uAUGs Are Present in Viral mRNAs, Related to Figure 1

(A) Incorporation of host transcript sequences increases the diversity of putative alternative start codons. For each viral genome segment, the frequency and
position of alternative start codons is shown relative to native start of the viral genes. For each reading frame, the frequency and location of the first in-frame stop
codon are indicated.
(B) Percentages of cap-snatched sequences that contain AUG codons, as identified by CAGE. Data are shown relative to all the viral reads from the specified
genome segments.
Figure S2. Viral 5′ UTRs Are Conserved, Related to Figure 2

Multiple sequence alignments of unique H1N1 IAV 5′ UTRs per genome segment (n = 10904). The overall distribution of each unique nucleotide sequence is
indicated on the left, and the consensus sequence of each UTR is indicated below each alignment. The top panels show the positional weight matrix of each
nucleotide across the UTRs.
Figure S3. IAV mRNAs Can Be Translated from Host-Derived AUGs, Related to Figure 3
(A) Length distribution of ribosome profiling reads that aligned to human (left panel) and viral (right panel) transcripts in DMSO (Ribo) or harringtonine (Ribo + Harr)
treated samples.
(B) Metagene alignment of average P site density around annotated start codons in human (left panel) or viral (right panel) transcripts in DMSO treated samples.
(C) Metagene alignment of average P site density around annotated start codons in human (left panel) or viral (right panel) transcripts in harringtonine treated
samples.
(D) Frequency of AUG codons by position relative to the viral transcription initiation site. Bars show the mean frequency and are color coded according to frame.
Error bars indicate the standard deviation.
Figure S4. uvORFs Are Expressed during Infection and Can Contribute to Virulence, Related to Figure 4
(A) Plots showing the position of uvORF peptides found in lysates of cells (A549 or 293) infected with A/PR/8/34 virus at 8 or 24 h post infection. The specific cell
lysates they were found in are indicated on the right. 1: MG132 treated, 2: DMSO treated. Peptide locations are drawn relative to uvORFs (gray regions) and
canonical ORFs (blue regions) and are colored by the log10 of their intensities, relative to the sample median.
(B) Same as in (A), but for uvORF peptides found within purified A/WSN/33 virions.
(C) Same as in (A), but for uvORF peptides found in an independent, previously published dataset.
(D) In vitro growth curves of the indicated mutant (UFOΔ) and control (UFOSYN) viruses made in the PR8 background, and performed on MDCK cells. Error bars
indicate the standard deviation of 3 replicates.
(E) In vitro growth curves of the indicated mutant (UFOΔ) and control (UFOSYN) viruses made in the WSN/33 background, and performed on MDCK cells.
(F) In vitro growth curves of the indicated mutant (UFOΔ) and control (UFOSYN) viruses made in the Cal/09 background, and performed on A549 cells. Error bars
indicate the standard deviation of 3 replicates.
(G) Heatmap of differentially expressed genes (Fold Change > 2, p < 0.01) found in the lungs of mice infected with 100 PFU of either the PR8;PB1-UFOΔ or
PR8;PB1-UFOSYN viruses at day 6 post infection.
(H) qPCR validation of four significantly changed genes identified in (G) (highlighted with green text). Each dot represents the lung of one mouse infected with
100 PFU of the indicated viruses, collected at day 6 post infection. P values were calculated with a one-tailed t test. *p < 0.05.
(I) Gene ontology analysis of genes shown in (G).
Figure S5. uvORFs Are Conserved, Related to Figure 5

(A) Bar plot showing the number of unique NP sequences that give rise to the full length, extended NP protein of 514aa, or those that result in truncated (non-
extended) uvORFs.
(B) Percentages of unique NP sequences that preserve the propensity to code for NP-extension.
(C) Top five most common NP extension protein sequences in three types of influenza A strains, H1N1, H3N2 and H5N1.
(D) Schematic showing the model used to calculate the expected versus observed PB1-UFO sequence lengths.
(E) Density plot of predicted length of H3N2 PB1-UFO protein sequences. Sequences predicted to generate a protein of 77aa are shown in medium blue, shorter
than 77aa in light blue, and those longer than 77aa are in dark blue. Sequences predicted not to generate PB1-UFO protein are shown in gray.
(F) P value distribution/volcano plot of H3N2 PB1-UFO protein sequence length. Each dot represents the difference between observed length and expected
length of each individual sequence.
(G) Density plot showing the distribution of expected lengths of H3N2 PB1-UFO proteins, based on random codon-shuffled sequences.
(H) Line plot showing the number of synonymous mutations in frame of WT H3N2 PB1 (x axis) that are required to generate stop codons in frame of H3N2 PB1-
UFO (y axis).
Figure S6. Controls Related to Propagator Analysis, Related to Figures 5C–5F

(A) Schematic of analysis steps taken to quantify selection occurring on synonymous and non-synonymous mutations in the PB1-UFO ORF. Propagator model
analyses were done by either not taking (Figure 5B and 5D) or taking the RNA structure of IAV PB1 segment into account (Figures 5C–5E).
(B) Frequency propagator ratios of the indicated classes of mutations occurring in PB1-UFO relative to the PB1 open reading frame of H3N2 viruses. The region
used to calculate the test class ratio (G(X)) is indicated in yellow, and the region used to calculate the neutral class ratio (G0(X)) is indicated in blue in the top
schematic. Here, the test class is the region of the PB1-UFO ORF that overlaps only with the virally-encoded 5′ UTR; the neutral class consists of synonymous
mutations in the PB1 ORF that do not overlap with PB1-UFO. Only nucleotides at predicted loop (i.e., non-pairing) positions were considered. Error
bars indicate sampling uncertainties. g(x) < 1: negative selection; g(x) ≈ 1: weak/heterogeneous selection; g(x) > 1: positive selection (see also Figure 5C).
(C) Frequency propagator ratios, as in (B), but with the test class comprising the C-terminal region of the PB1-UFO ORF.
(D) Frequency propagator ratios, as in (B), but with the test class comprising the region in the main PB1 ORF overlapping the PB1-UFO reading frame.
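The three selection regimes named in (B) can be made concrete with a short sketch. It assumes, as in standard frequency propagator analyses, that g(x) is the test-class ratio divided by the neutral-class ratio; that definition, the function names, and the rule of calling g(x) approximately 1 when the error bar overlaps 1 are illustrative assumptions rather than details stated in the legend.

```python
def selection_ratio(test_ratio, neutral_ratio):
    """Return g = G / G0 for one frequency threshold (assumed definition)."""
    if neutral_ratio <= 0:
        raise ValueError("neutral-class ratio must be positive")
    return test_ratio / neutral_ratio


def classify(g, error, tolerance=0.0):
    """Label g using the three regimes named in the legend.

    g is called ~1 when 1 lies within g +/- (error + tolerance); both the
    use of the error bar and the tolerance are illustrative assumptions.
    """
    half_width = error + tolerance
    if g + half_width < 1:
        return "negative selection"
    if g - half_width > 1:
        return "positive selection"
    return "weak/heterogeneous selection"


# Made-up propagator ratios for a single frequency threshold.
g = selection_ratio(0.2, 0.5)
print(g, classify(g, error=0.1))
```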
Figure S7. DEFEND-Seq and CAGE Analysis of Other Cap-Snatching Viruses, Related to Figure 6
(A) Distribution of lengths for cap-snatched sequences found in IBV, as determined by DEFEND-seq.
(B) Host derived uAUGs give rise to long uvORFs (> 30aa). (Upper panels) Predicted peptide sequences derived upon translation of all three ribosome reading
frames in the indicated IBV genome segments. (Lower panels) Predicted distribution of the lengths of new ORF and extension peptides generated from each
reading frame of the viral 5′ UTR. Peptide lengths are calculated based on AUG positions obtained through DEFEND-sequencing.
(C) Distribution of lengths for cap-snatched sequences found in LASV infected cells, as determined by CAGE-seq.
(D) Host derived uAUGs enable reverse sense genome segments of Lassa virus L and S to give rise to uvORFs and extensions. (Upper panels) Schematic of
proteins encoded in the indicated reading frames in either the L or S segment. Lassa virus RNA is ambisense. (Middle panels) Predicted peptide sequences
derived upon translation of all three reading frames in the reverse sense L and S segments. (Lower panels) Predicted distribution of the lengths of new ORFs and
extension peptides generated from each reading frame of the viral 5′ UTR. Peptide lengths are calculated based on AUG positions obtained through CAGE.
(E) (Left panels) Schematic showing (in coding sense) the 5′ termini of viral reporter RNAs, in which a viral untranslated region (UTR) flanks a luciferase (Luc)
reporter gene. Reporter RNAs were used to assess upstream translation in the mRNAs of Heartland virus (HRTV). The 5′ terminus of the mRNAs consisted of cap-snatched
sequence from host mRNAs (cap), followed by a viral 5′ UTR (5′ UTR) and the reporter gene (Luc). Cap structures are indicated as circles, the most
N-terminal AUG as a triangle, AUG mutations as crosses and stop codons as lines. (Right panels) Luc expression when these reporters were included in minireplicon
assays, as a percentage of expression with the WT construct, showing the means and s.d. of 3 repeats compared to WT-STOP by Student’s 2-tailed t
test (n.s.: p ≥ 0.05, *p < 0.05, ***p ≤ 0.0005).
Article
A Dual-Mechanism Antibiotic Kills Gram-Negative

Bacteria and Avoids Drug Resistance
James K. Martin II, Joseph P. Sheehan,
Benjamin P. Bratton, ...,
Mikhail M. Savitski, Maxwell Z. Wilson,
Zemer Gitai
Correspondence
zgitai@princeton.edu
In Brief
A compound that kills both Gram-positive
and Gram-negative bacteria through two
independent mechanisms may provide a
platform for the development of future
antibiotics.
Highlights
• SCH-79797 kills Gram-negative and Gram-positive bacteria with undetectable resistance
• It works by simultaneously targeting folate metabolism and membrane integrity
• SCH’s dual-targeting is synergistic, but only when on the same chemical scaffold
• Irresistin-16, an SCH derivative, effectively treats mouse N. gonorrhoeae infection
Martin et al., 2020, Cell 181, 1518–1532

James K. Martin II,1,7 Joseph P. Sheehan,1,7 Benjamin P. Bratton,1,2,7 Gabriel M. Moore,1 André Mateus,3
Sophia Hsin-Jung Li,1 Hahn Kim,4,5 Joshua D. Rabinowitz,2,4 Athanasios Typas,3 Mikhail M. Savitski,3
Maxwell Z. Wilson,1,6 and Zemer Gitai1,8,*
1Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
3European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
4Department of Chemistry, Princeton University, Princeton, NJ 08544, USA
5Princeton University Small Molecule Screening Center, Princeton University, Princeton, NJ 08544, USA
6Department of Molecular, Cellular, and Developmental Biology, Center for BioEngineering, University of California, Santa Barbara, Santa
Barbara, CA 93106, USA

8Lead Contact
*Correspondence: zgitai@princeton.edu
SUMMARY
The rise of antibiotic resistance and declining discovery of new antibiotics have created a global health crisis. Of particular concern, no new antibiotic classes have been approved for treating Gram-negative pathogens in decades. Here, we characterize a compound, SCH-79797, that kills both Gram-negative and Gram-positive bacteria through a unique dual-targeting mechanism of action (MoA) with undetectably low resistance frequencies. To characterize its MoA, we combined quantitative imaging, proteomic, genetic, metabolomic, and cell-based assays. This pipeline demonstrates that SCH-79797 has two independent cellular targets, folate metabolism and bacterial membrane integrity, and outperforms combination treatments in killing methicillin-resistant Staphylococcus aureus (MRSA) persisters. Building on the molecular core of SCH-79797, we developed a derivative, Irresistin-16, with increased potency and showed its efficacy against Neisseria gonorrhoeae in a mouse vaginal infection model. This promising antibiotic lead suggests that combining multiple MoAs onto a single chemical scaffold may be an underappreciated approach to targeting challenging bacterial pathogens.
INTRODUCTION

More than twenty unique classes of antibiotics were characterized in the 30 years following the discovery of penicillin in 1929 (Coates et al., 2011; Davies, 2006). However, a combination of scientific and economic factors have slowed the discovery and development of these life-saving molecules to the extent that only six new classes of antibiotics have been approved in the past 20 years, none of which are active against Gram-negative bacteria (Butler et al., 2017). This decline in the discovery of new antibiotic classes, coupled with the evolution of multi-drug resistant bacteria and horizontal transfer of resistance mechanisms, has created a public health crisis that is predicted to only escalate in the coming years (Culyba et al., 2015; Hofer, 2019; O’Neill, 2014).

Recent efforts have begun to reinvigorate antibiotics research, but most of this work has resulted in compounds that function via similar mechanisms to those of traditional antibiotics. For example, finafloxacin, a fluoroquinolone antibiotic that was recently approved to treat ear infections caused by Pseudomonas aeruginosa, is more effective than other fluoroquinolones because it maintains its potency in acidic environments (McKeage, 2015). However, finafloxacin is still susceptible to the same resistance mechanisms that affect other fluoroquinolones (Randall et al., 2016). The recent discovery of the natural product teixobactin suggests that it is possible to find compounds that selectively kill bacteria without being prone to resistance (Ling et al., 2015). However, teixobactin is only functional against Gram-positive bacteria. Another natural product, darobactin, was recently found to specifically target Gram-negative bacteria, but resistance to darobactin was relatively easy to achieve through mutations in bamA (Imai et al., 2019). Thus, there is still a strong need for characterizing new classes of antibiotics with distinct mechanisms of action (MoA), especially those that target Gram-negatives with low resistance frequency.

An ideal antibiotic would be hard to develop resistance against, able to kill both Gram-positive and Gram-negative bacteria, and easy to access. It is important to note that while antibiotics that are not prone to resistance are attractive clinically,
1518 Cell 181, 1518–1532, June 25, 2020 © 2020 Elsevier Inc.
selecting for resistant mutants is the most common method for characterizing MoA, making the characterization of new antibiotic MoAs without resistance mutants a significant challenge. Phenotypic methods, such as macromolecular synthesis assays, have been previously used in such cases, as was done for teixobactin (Ling et al., 2015). However, these assays only allow the classification of molecules with previously described MoAs (King and Wu, 2009). Thus, there is also a need for resistance-independent approaches for the de novo characterization of antibiotic MoA.

Here, we describe a compound, SCH-79797, which is bactericidal toward both Gram-negative and Gram-positive bacteria, including clinically significant bacterial pathogens such as methicillin resistant Staphylococcus aureus (MRSA), Enterococcus faecalis, Neisseria gonorrhoeae, and Acinetobacter baumannii, with no signs of resistance. In an animal host model, SCH-79797 blocked infection by A. baumannii with low toxicity to the host at the dose required for effective antibiotic activity. To rapidly and efficiently classify the MoA of SCH-79797, we used a variant of a recently described quantitative imaging-based approach known as bacterial cytological profiling (BCP) (Nonejuie et al., 2013). This effort showed that SCH-79797 functions through a mechanism distinct from that of most known classes of antibiotics. In the absence of being able to evolve resistant mutants, we used thermal proteome profiling (Savitski et al., 2014), CRISPRi genetic sensitivity (Peters et al., 2016), and metabolomic profiling (Kwon et al., 2008; Kwon et al., 2010) to characterize the MoA of SCH-79797. Using this multi-dimensional, systems-level approach, we identified the candidate targets of SCH-79797 as dihydrofolate reductase and the bacterial membrane. Classical enzymology and membrane permeability and polarization assays confirmed the targets identified by our high-throughput approaches. By analyzing derivatives of the SCH-79797 structure, we demonstrated that the two pharmacophores of this compound can be distinguished. Finally, we describe a derivative of SCH-79797, Irresistin-16 (IRS-16), with improved potency that demonstrates efficacy in a mouse vaginal Neisseria gonorrhoeae model. Thus, our findings identify and characterize a promising antibiotic candidate and provide a potential roadmap for future antibiotic discovery efforts.

RESULTS

SCH-79797 Is a Broad-Spectrum, Bactericidal Antibiotic
With the aim of finding antibiotics with novel mechanisms of action (MoA), we began with an unbiased, whole-cell screening approach. To include antibiotics that target both Gram-negative and Gram-positive bacteria, we screened for compounds that inhibited the growth of E. coli lptD4213, which has a compromised outer membrane that makes it partially permeable to antibiotics that would otherwise have difficulty penetrating the Gram-negative lipopolysaccharide (Ruiz et al., 2006). We screened a small molecule library of 33,000 unique compounds and one of our most potent hits was SCH-79797, a compound that had been previously reported as a human PAR-1 antagonist (Ahn et al., 2000). This finding was surprising because there are no PAR-1 homologs in bacteria. A recent report suggested that SCH-79797 increases the ability of neutrophils to kill bacteria, perhaps by directly functioning as an antibiotic (Gupta et al., 2018). Given that studies focusing on characterizing its anticoagulant activities suggested that at least 5 mg/kg SCH-79797 can be safely tolerated in animals (Gobbetti et al., 2012; Strande et al., 2007) and its emergence as a potential antimicrobial with no known bacterial target (Gupta et al., 2018), we decided to further characterize SCH-79797 as a candidate antibiotic.

To assess the spectrum of bacterial species susceptible to SCH-79797, we measured the minimal inhibitory concentration (MIC) of SCH-79797 against several clinically relevant pathogens, including the ESKAPE pathogens (Boucher et al., 2009). In this study, we define MIC as the concentration of drug that results in no visible bacterial growth after 14 h of growth at 37°C. We found that SCH-79797 significantly hindered the growth of multiple Gram-negative and Gram-positive pathogens including Neisseria gonorrhoeae, two clinical isolates of Acinetobacter baumannii, Enterococcus faecalis, and Staphylococcus aureus (Figure 1A; Table S1). SCH-79797 also exhibited potent activity against several antibiotic-resistant pathogen strains including multi-drug-resistant WHO-L N. gonorrhoeae and MRSA S. aureus. Using the E. coli lptD4213 strain from our original screen, SCH-79797 exhibited potent and rapid bactericidal activity (Figures 1B and 1C). SCH-79797 also exhibited similar bactericidal activity against a clinical isolate of S. aureus MRSA USA300 (Tenover and Goering, 2009) suggesting that its bactericidal activity is not species-specific (Figure S1A).

SCH-79797 Is Effective In Vivo and Has a Low Frequency of Resistance
Given SCH-79797’s promising ability to kill bacteria, we sought to determine if it can function as an effective antibiotic in vivo. To test its antibiotic activity in the context of an animal host infection, we focused on A. baumannii as it has emerged as an important Gram-negative pathogen that is targeted by relatively few available antibiotics, and has a well-established host animal model in the wax worm, Galleria mellonella (Gebhardt et al., 2015; Peleg et al., 2009). We first established that injecting G. mellonella with SCH-79797 at concentrations four times higher than the MIC of SCH-79797 toward A. baumannii did not result in higher host toxicity than the solvent-only control (Figures S1E and S1F; Table S2). We next tested the ability of SCH-79797 to treat infection of G. mellonella with a lethal dose of A. baumannii AB17978. Treatment with SCH-79797 significantly prolonged the survival of A. baumannii-infected G. mellonella (p < 0.001) (Figures 1D, S1G, and S1H). The survival rate of G. mellonella treated with SCH-79797 was similar to the control antibiotics meropenem, rifampicin, and gentamicin (Figures 1D, S1G, and S1H) (Karlowsky et al., 2003; Viehman et al., 2014).

To further characterize its promise as an antibiotic, we attempted to determine the frequency with which bacteria develop resistance toward SCH-79797. Because spontaneous suppressors can restore E. coli lptD4213’s membrane barrier functionality, we focused our resistance studies on S. aureus MRSA USA300 (Tenover and Goering, 2009). We were unable to isolate stable SCH-79797-resistant mutants upon plating 10⁸ CFU of MRSA USA300 onto agar containing 25 μg/mL SCH-79797 (4X MIC). We were also unable to isolate SCH-79797-resistant mutants upon plating 10⁸ colony-forming units (CFUs) of
Figure 1. SCH-79797 Is a Broad-Spectrum, Bactericidal Antibiotic that Is Effective in an Animal Model and Has a Low Frequency of Resistance
(A) The MIC of SCH-79797 against Gram-negative (red) and Gram-positive (black) bacteria. The MICs of E. cloacae and P. aeruginosa were greater than the maximal drug concentrations tested. See also Table S1.
(B) The relative growth of E. coli lptD4213 after treatment with SCH-79797. Bacterial growth was measured as the optical density at 600 nm (OD600) 14 h following inoculation. Each data point represents 2 biological replicates. Mean ± SD are shown.
(C) Colony forming units (CFUs mL⁻¹) after 3-h treatment of E. coli lptD4213 with 1% DMSO (solvent control), 6.2 μg/mL SCH-79797 (2× MIC), 0.12 μg/mL ampicillin (2× MIC), or 0.48 μg/mL novobiocin (4× MIC). Data points at 1 × 10² CFU mL⁻¹ are below the level of detection. Each data point represents 3 biological replicates. Mean ± SD are shown.
(D) The percent survival of G. mellonella wax worms infected with A. baumannii and concomitantly treated with 2 μL/larva 100% DMSO, 67 μg/larva SCH-79797, 67 μg/larva gentamicin, 67 μg/larva meropenem, or 67 μg/larva rifampicin. Data represent a typical cohort (n = 12) from a biological triplicate. p values are determined from a Mantel-Cox test using Prism (**p < 0.01; ***p < 0.001), and the pooled results are presented in the supplemental material (Figure S1G). For other Mantel-Cox comparisons, see Table S2.
(E) Fold increase in resistance of S. aureus MRSA USA300 to SCH-79797, novobiocin, trimethoprim, or nisin after 25 days of serial passaging in each drug. Data represent one biological replicate, and the data for a second replicate are shown in Figure S1B.
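The MIC definition used in this study (no visible growth after 14 h) and the fold increase in resistance plotted in (E) can be illustrated with a small sketch. The OD600 cutoff standing in for "no visible growth", the function names, and the plate readings are all assumed for illustration and are not taken from the article.

```python
def mic_from_growth(od_by_conc, blank_od=0.05):
    """Return the lowest concentration whose endpoint OD600 shows no growth.

    od_by_conc maps drug concentration to OD600 at the endpoint (14 h in the
    study). blank_od, the cutoff standing in for "no visible growth", is an
    assumed value. Returns None if every tested concentration permits growth,
    i.e. the MIC exceeds the tested range.
    """
    inhibited = [c for c, od in od_by_conc.items() if od <= blank_od]
    return min(inhibited) if inhibited else None


def fold_increase(mic_day_n, mic_day_0):
    """Fold change in MIC relative to the starting culture."""
    return mic_day_n / mic_day_0


# Made-up plate readings (concentration in ug/mL -> OD600).
readings = {0.78: 0.62, 1.56: 0.41, 3.13: 0.03, 6.25: 0.02, 12.5: 0.02}
mic = mic_from_growth(readings)
print(mic, fold_increase(25.0, mic))
```

Returning None when no concentration inhibits growth mirrors the situation noted in (A) for species whose MIC exceeded the highest concentration tested.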
B. subtilis, suggesting that the difficulty in developing resistant mutants is not species-specific. To address resistance rates more quantitatively, we serial-passaged 2 biologically independent cultures of S. aureus MRSA USA300 in sub-lethal concentrations of SCH-79797, as well as three control antibiotics: novobiocin, trimethoprim, and nisin. Over the course of 25 days, we
successfully isolated mutants resistant to all the control antibiotics while no SCH-79797-resistant mutants emerged (Figures 1E and S1B). For novobiocin, trimethoprim, and nisin, resistance gradually increased throughout the experiment, while the resistance level remained constant for SCH-79797 (Figures 1E and S1B), indicating that these bacteria did not even acquire partial resistance to SCH-79797. In addition, the mutants that evolved increased resistance to antibiotics like trimethoprim and nisin did not demonstrate cross-resistance to SCH-79797 (Figure S1D). To extend these findings to a Gram-negative species, we repeated our serial passaging study with 2 biologically independent cultures of A. baumannii AB17978 (Figure S1C). A. baumannii resistance remained constant for SCH-79797 but increased for all other antibiotics, including gentamicin, supporting the conclusion that the lack of resistance to SCH-79797 is not species-specific.

A Variant of Bacterial Cytological Profiling Suggests that SCH-79797 Has a Unique MoA
The inability to isolate SCH-79797-resistant mutants makes SCH-79797 an appealing candidate antibiotic but poses a challenge for determining its MoA. As a result, we used a quantitative imaging-based approach to determine if the MoA of SCH-79797 is similar to that of any previously characterized antibiotics. Specifically, we modified a single-cell, high-content imaging methodology, known as BCP (Nonejuie et al., 2013). The logic of BCP is that antibiotics with similar MoA result in similar death phenotypes such that by quantifying how bacteria appear upon death, we can gain insight into the cause of death (much like a bacterial autopsy). Here, we applied our BCP analysis to a training set of 37 distinct antibiotics with known MoA as well as to SCH-79797. For each compound, we treated E. coli lptD4213 with 5X MIC of an antibiotic for 2 h, stained with three dyes that report on nucleoid morphology (DAPI), membrane morphology (FM4-64), and membrane integrity (SYTOX Green), and imaged the cells at high resolution (Figure 2A). For each condition we imaged 100 cells and quantified 14 parameters reflecting various morphological and fluorescence features (Table S3).

Figure 2. Bacterial Cytological Profiling Indicates that SCH-79797 Functions by a Mechanism Distinct from Known Classes of Antibiotics
(A) Fluorescent images of E. coli lptD4213 cells treated with antibiotics representative of 5 different antibiotic classes. Cells were treated for 2 h with 5× MIC of each drug. Merged image channels are phase contrast (gray), FM4-64 (red), DAPI (blue), and SYTOX (green). Scale bar, 1 μm.
(B) Comparison of cytological profiles of known antibiotics with the cytological profile of SCH-79797. Single-linkage clustered dendrogram from one-way MANOVA comparisons between antibiotic treatment groups compared to all other antibiotic treatment groups. Inset: structure of SCH-79797.
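To make the single-linkage step concrete, the following sketch clusters treatments by their cytological feature vectors. It is a simplification under stated assumptions: the article derives distances from one-way MANOVA comparisons, whereas this example substitutes Euclidean distance on Z-score-normalized features, and the treatment names and feature values are invented.

```python
from math import dist
from statistics import mean, pstdev


def zscore_columns(rows):
    """Z-score each feature column so no single metric dominates distances."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def single_linkage(points, labels, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Cluster-to-cluster distance is the minimum pairwise Euclidean distance,
    which is what defines single linkage.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # a < b, so index a stays valid
    return [sorted(labels[i] for i in c) for c in clusters]


# Made-up per-treatment feature means (e.g., cell length, SYTOX intensity).
labels = ["drugA", "drugB", "drugC", "compoundX"]
features = [[2.0, 0.10], [2.1, 0.12], [3.0, 0.11], [2.0, 0.90]]
groups = single_linkage(zscore_columns(features), labels, n_clusters=3)
print(groups)
```

A treatment that stays in its own cluster as the number of clusters is reduced, as the hypothetical compoundX does here, is the pattern that would suggest a phenotype unlike the reference compounds.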
Figure 3. Thermal Proteome Profiling Suggests that SCH-79797 Binds Dihydrofolate Reductase
(A) Schematic of the thermal shift assay that compares the thermal stability of the entire proteome with and without drug treatment. Protein samples are aliquoted, and each aliquot is heated to a different temperature. The relative fraction of soluble and insoluble proteins is then determined for each aliquot by ultracentrifugation and mass spectrometry.
(B and C) The relative thermal stability of the soluble E. coli lptD4213 proteome after treatment of whole cell and cell lysate samples with SCH-79797 (B) or trimethoprim (C). Changes in thermal stability were determined by measuring changes in the abundance of soluble protein across 10 different temperatures ranging from 42°C–72°C and 4 drug concentrations and a vehicle control. For each point, the color indicates the maximal effect size across all temperatures and the largest change in abundance across all concentrations. Squares represent the proteins with a change in abundance of at least 25% at three or more temperatures. To be considered consistent, the change in abundance of a protein had to show the same sign at least 90% of the time and have an effect size of at least 2-fold in either whole cells or cell lysates. Triangles represent a milder effect where at least one temperature had a change in abundance of at least 25% in both whole cell and cell lysate treatments.
[Plot labels: panel B, SCH-79797; panel C, Trimethoprim. x axis: maximum log2 fold-change in cell lysate; y axis: maximum log2 fold-change in whole cell; color scale: average log2 fold-change. Legend: mild effect; effect at multiple temperatures; moderate and consistent effect. FolA is labeled in both panels.]
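The marker categories described in (B and C) amount to a small decision rule, sketched below for one protein. This is one possible reading of the legend, not the authors' code: the input layout (one log2 fold-change per temperature), the symmetric treatment of increases and decreases on the log2 scale, and all names are assumptions.

```python
from math import log2

CHANGE_25 = log2(1.25)  # a 25% increase expressed as a log2 fold-change
# Assumption: a 25% decrease is treated symmetrically on the log2 scale.


def n_changed(log2fc):
    """Count temperatures whose |log2 fold-change| reaches the 25% cutoff."""
    return sum(abs(v) >= CHANGE_25 for v in log2fc)


def same_sign_fraction(log2fc):
    """Fraction of non-zero values sharing the majority sign."""
    nonzero = [v for v in log2fc if v != 0]
    if not nonzero:
        return 0.0
    pos = sum(v > 0 for v in nonzero)
    return max(pos, len(nonzero) - pos) / len(nonzero)


def classify(whole_cell, lysate):
    """Return 'consistent', 'multiple temperatures', 'mild' or 'none'."""
    for series in (whole_cell, lysate):
        if (n_changed(series) >= 3 and same_sign_fraction(series) >= 0.9
                and max(abs(v) for v in series) >= 1.0):  # 2-fold = 1 log2 unit
            return "consistent"
    if n_changed(whole_cell) >= 3 or n_changed(lysate) >= 3:
        return "multiple temperatures"
    if n_changed(whole_cell) >= 1 and n_changed(lysate) >= 1:
        return "mild"
    return "none"


# Made-up profile for a stabilized protein across 10 temperatures.
stabilized = [0.1, 0.5, 1.2, 1.5, 1.1, 0.8, 0.4, 0.2, 0.1, 0.05]
print(classify(stabilized, stabilized))
```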
Because we had gold standards of the BCP results of SCH-79797 and of several antibiotics representing different classes and sub-groups within classes, we applied a machine learning approach to classify the BCP data. Using a one-way MANOVA, we performed dimensionality reduction to remove the influence of naturally covarying metrics, such as cell length and cell perimeter. We then used single-linkage clustering to cluster treatment groups by their neighborhood representation vectors, such that samples whose neighborhoods were similar would be clustered together. This analysis indicated that SCH-79797 resulted in a phenotypic death-state that was different from the other antibiotics tested (Figure 2B). We also tested an additional method to assess whether this result was robust to multiple statistical methods. Using UMAP for dimensionality reduction on Z score normalized data (McInnes et al., 2018), we found that SCH-79797-treated cells formed a distinct peak separated from other treatments (Figure S2). This result supports our conclusion that SCH-79797 is dissimilar from most antibiotics tested and therefore likely possesses a MoA distinct from that of any of the antibiotics in our training set.

Thermal Profiling and CRISPRi Genetics Demonstrate that SCH-79797 Targets Dihydrofolate Reductase
In the absence of resistant mutants or similarity to antibiotics with known MoA by BCP, we turned to a high-throughput proteomics-based approach for de novo identification of candidate SCH-79797 targets. Specifically, we used thermal proteome profiling, an assay that uses mass spectrometry to compare the thermal stability of the entire proteome with and without drug treatment (schematized in Figure 3A) (Mateus et al., 2018; Savitski et al., 2014). Briefly, intact cells or cell lysate samples treated with a range of compound concentrations are heated to a series of increasing temperatures and the soluble proteins at each temperature are collected (Becher et al., 2016). Proteins that bind to the drug are thermally stabilized, which leads to a shift in the temperature at which those proteins precipitate (Figure 3A). Using E. coli lptD4213, we treated intact cells and cell lysates with SCH-79797 and found that it significantly shifted the thermal stability of dihydrofolate reductase (the DHFR homolog in E. coli is known as FolA) (Figures 3B and S3A). The fact that the same result was observed with both intact cells and cell lysates (Figures 3B and S3A) suggests that SCH-79797 enters E. coli cells and directly binds to FolA. As a positive control, we used a well-characterized antibiotic that targets DHFR, trimethoprim, and found that it also thermally stabilizes its known target, the E. coli DHFR, FolA (Figures 3C and S3B) (Gleckman et al., 1981).

To test both the physiological significance and species-specificity of the suggestion that SCH-79797 binds to DHFR, we took advantage of a collection of B. subtilis essential gene CRISPR-interference (CRISPRi) knockdown mutants (Peters et al., 2016). In each of these mutants, an essential gene is targeted by CRISPRi to reduce its expression 3-fold. A strain with
reduced levels of the SCH-79797 target should be sensitized to sub-lethal doses of SCH-79797. Given the thermal profiling result, we focused on mutants in the folate biosynthesis pathway (schematized in Figure 4A). As a negative control, we confirmed that CRISPRi knockdowns of genes unrelated to folate metabolism are not sensitized to SCH-79797 (Figure S4). As a positive control for our assay, we again utilized trimethoprim. We confirmed that dihydrofolate reductase (dfrA in B. subtilis) and dihydrofolate synthase (folC, an enzyme that acts upstream of DfrA) knockdowns are hypersensitive to trimethoprim, while knockdowns of enzymes that function downstream of DHFR, folD and glyA, are not (Figure 4B). SCH-79797 exhibited the same genetic sensitivity pattern as trimethoprim in that both dfrA and folC, but not folD and glyA knockdowns, were sensitized to SCH-79797 (Figure 4B).

SCH-79797 Inhibits DHFR Activity in Cells and in Purified Enzymatic Assays
To determine how SCH-79797 affects folate metabolism in living cells, we used mass spectrometry to measure the relative abundance of folate metabolite pools in E. coli NCM3722 treated with SCH-79797. E. coli NCM3722 was used because these bacteria lack mutations that disrupt primary metabolism in other lab strains of E. coli (Soupene et al., 2003). E. coli NCM3722 cells were grown in Gutnick Minimal Media and treated with 13.9 μg/mL SCH-79797 (1× MIC) for 15 min (Kwon et al., 2008, 2010). In response to SCH-79797 treatment, the levels of the DHFR substrate, 7,8-dihydrofolate (DHF), rose 10-fold compared to untreated cells, while the levels of folate metabolites downstream of DHF dropped significantly (Figure 4C). This metabolic response is characteristic of dihydrofolate reductase (FolA in E. coli) inhibition as we observed a similar pattern upon treatment with trimethoprim, a known inhibitor of DHFR (Figure 4C) (Gleckman et al., 1981).
To determine whether SCH-79797 inhibits DHFR directly, we obtained purified E. coli FolA protein and measured its enzymatic activity in the presence of increasing concentrations of SCH-79797. We found that SCH-79797 has an IC50 of 8.6 ± 3 μM against FolA (Figure 4D). We also measured the initial velocity of FolA activity at various DHF substrate concentrations to establish if SCH-79797 acts competitively or non-competitively. A Michaelis-Menten fit to the data demonstrated that 8.6 μM SCH-79797 (the IC50) increases the Km from 32 ± 25 μM to 100 ± 80 μM. These results indicate that SCH-79797 functions at least partially as a competitive inhibitor of FolA’s activity on its DHF substrate (Figure 4E). Likewise, the Michaelis-Menten fits for the established FolA inhibitor trimethoprim show a very similar effect (IC50 = 15 ± 4 nM) (Figure 4D), consistent with previous measurements of the tight binding between trimethoprim and E. coli FolA (Cammarata et al., 2017).

SCH-79797 Also Disrupts Bacterial Membrane Potential and Permeability Barrier
The similarities between SCH-79797 and trimethoprim with respect to FolA inhibition helped confirm DHFR as a target of SCH-79797 but were also surprising because these two compounds did not generate similar profiles in our BCP analysis (Figure 2B). One potential explanation is that SCH-79797 has additional targets that are not shared with trimethoprim. If this was the case, we would expect that cells resistant to trimethoprim would still be susceptible to SCH-79797. Previous studies demonstrated that resistance to trimethoprim can be achieved by deleting thyA and supplementing the media with thymine (Amyes and Smith, 1975). We confirmed that deleting thyA from E. coli lptD4213 in the presence of excess thymine led to trimethoprim resistance (Figure 5A). We also found that thymine supplementation decreased the sensitivity of E. coli lptD4213 to SCH-79797 (comparing “WT” of Figure 5A to Figure 1B; Table S1). The findings that reducing cellular dependence on DHFR activity (by thymine supplementation) makes cells less sensitive to SCH-79797 treatment, while increasing reliance on DHFR activity (by dfrA CRISPRi) makes cells more sensitive to SCH-79797, together indicate that DHFR inhibition is a physiologically important target of SCH-79797. Nevertheless, comparing cells with and without thyA in the presence of thymine showed no change in sensitivity to SCH-79797 (Figure 5A), suggesting that SCH-79797 is likely to have a second, folate-independent MoA.
To obtain clues about the potential additional MoA of SCH-79797, we revisited our fluorescent BCP images of E. coli lptD4213 cells treated with SCH-79797. We observed SYTOX Green staining in some of the bacteria (Figure 2A), suggesting that SCH-79797 compromises the integrity of the bacterial membrane. To directly quantify the effect of SCH-79797 on bacterial membrane integrity, we used flow cytometry to measure the membrane potential and permeability of E. coli lptD4213 in the presence of the fluorescent dyes, DiOC2(3) and TO-PRO-3. DiOC2(3) is a cationic dye that accumulates in the cytoplasm of cells with an active membrane potential and shifts its fluorescence from red to green in these cells, providing a measure of membrane potential (Figure 5B) (Novo et al., 1999). TO-PRO-3 is a nucleic acid stain that only accumulates in cells with compromised membranes, providing an independent measure of membrane permeability (Figure 5B) (Novo et al., 2000). As positive controls, we observed the expected shifts in both DiOC2(3) and TO-PRO-3 staining using CCCP, a membrane-decoupler that affects membrane potential but not permeability (Novo et al., 2000), and two compounds that disrupt both membrane potential and permeability: nisin, a pore-forming antibacterial peptide (Prince et al., 2016; Wiedemann et al., 2001), and polymyxin B, a small lipopeptide membrane destabilizer (Warren et al., 1957) (Figure 5C). As negative controls, we confirmed that antibiotics that do not target the membrane, including ampicillin, rifampicin, and novobiocin, do not shift DiOC2(3) or TO-PRO-3 staining (Figures S5A–S5D). After 15 min of treatment with SCH-79797, subsequent DiOC2(3) and TO-PRO-3 staining revealed significant defects in both membrane polarization and permeability (Figure 5C). These effects on the membrane are not secondary consequences of DHFR inhibition, as trimethoprim-treated E. coli showed no significant changes in DiOC2(3) and TO-PRO-3 staining (Figure 5C). The membrane-targeting effect of SCH-79797 is also not species-specific, as similar results were seen with SCH-79797-treated B. subtilis 168 (Figures S5E–S5G). These findings indicate that independent of its ability to inhibit DHFR activity, SCH-79797 disrupts both membrane potential and permeability barrier.

Figure 4. SCH-79797 Targets Folate Metabolism by Competitively Inhibiting Dihydrofolate Reductase
(A) A partial representation of the folate synthesis pathway. Where the E. coli and B. subtilis names differ, the E. coli names are listed in parentheses.
(B) The growth of CRISPRi B. subtilis knockdown mutants (Peters et al., 2016) involved in folate synthesis relative to a DMSO-treated control after SCH-79797 or trimethoprim treatment. OD600 of each condition 14 h after inoculation was plotted against drug concentration. Each data point represents 2 biological replicates. Mean ± SD are shown.
(C) Metabolomic analysis of E. coli NCM3722 cells treated with 13.9 μg/mL SCH-79797 (1× MIC) or 0.15 μg/mL trimethoprim (1× MIC). Samples were taken 0, 5, 10, and 15 min after drug treatment. Folate metabolite abundance at each time point was quantified relative to the DMSO-treated control samples at the initial time point. Each data point represents 3 independent replicates. Mean ± SD are shown.
(D and E) E. coli dihydrofolate reductase (FolA) activity is reduced in the presence of SCH-79797, IRS-10, or trimethoprim. Activity is relative to the standard condition of 60 μM NADPH and 100 μM DHF. (D) IC50 values were derived from fits to the Hill equation for reactions performed at 60 μM NADPH and 100 μM DHF. (E) FolA activity as a function of dihydrofolate concentration in the presence of 8.6 μM SCH-79797 (IC50), 0.065 μM IRS-10 (IC50), or 0.015 μM trimethoprim (IC50). Kinetic parameters are derived from fits to the Michaelis-Menten model.
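
As a rough numerical illustration of the two models named in panels (D) and (E), the sketch below evaluates a Hill-type dose-response curve and the textbook competitive-inhibition form of the Michaelis-Menten equation. The function names, the Hill coefficient of 1, the placeholder Vmax, and the purely competitive model are assumptions made for illustration and are not the authors' fitting code. The Ki it back-calculates from the Km shift quoted in the text (32 to 100 at an inhibitor concentration of 8.6, all in the same concentration unit) should be read loosely, given the large uncertainties reported on both Km values.

```python
def hill_fraction_active(inhibitor, ic50, hill=1.0):
    """Fraction of enzyme activity remaining at a given inhibitor concentration."""
    return 1.0 / (1.0 + (inhibitor / ic50) ** hill)


def mm_velocity(substrate, vmax, km):
    """Michaelis-Menten initial velocity."""
    return vmax * substrate / (km + substrate)


def apparent_km(km, inhibitor, ki):
    """Apparent Km under purely competitive inhibition: Km * (1 + [I] / Ki)."""
    return km * (1.0 + inhibitor / ki)


def ki_from_km_shift(km, km_apparent, inhibitor):
    """Invert the competitive model to get the Ki implied by a Km shift."""
    return inhibitor / (km_apparent / km - 1.0)


# Km values and inhibitor concentration as quoted in the text.
KM, KM_APP, INHIBITOR = 32.0, 100.0, 8.6
VMAX = 1.0  # arbitrary placeholder (assumption), so velocities are relative

ki = ki_from_km_shift(KM, KM_APP, INHIBITOR)
print(f"Ki implied by the Km shift: {ki:.2f}")
print(f"activity remaining at the IC50: {hill_fraction_active(8.6, 8.6):.2f}")
print(f"relative velocity without inhibitor: {mm_velocity(100.0, VMAX, KM):.3f}")
print(f"relative velocity with inhibitor: {mm_velocity(100.0, VMAX, KM_APP):.3f}")
```

Under a purely competitive model Vmax is unchanged and only the apparent Km rises, which is why the sketch reuses the same VMAX for both velocities.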

Figure 5. SCH-79797 Is Distinct from Other Dihydrofolate Reductase Inhibitors and Disrupts Membrane Integrity
(A) The growth of wild-type (WT) and ΔthyA E. coli lptD4213 relative to a DMSO-treated control after SCH-79797 and trimethoprim treatment. Bacterial growth was measured for 14 h and the final OD600 of each condition was plotted against drug concentration. Each data point represents 2 biological replicates. Mean ± SD are shown.
(B) Schematic of flow cytometry data showing the expected results for each class of polarized, depolarized, permeable, and impermeable bacteria.
(C) Flow cytometry analysis of the membrane potential and permeability of E. coli lptD4213 cells after 15 min incubation with 1% DMSO (solvent control), 5 μM CCCP, 25 μg/mL nisin (2× MIC), 0.8 μg/mL polymyxin B (2× MIC), 12.5 μg/mL SCH-79797 (2× MIC), or 2 μg/mL trimethoprim (10× MIC). The limits for the depolarized region were defined by comparing the values in the CCCP and solvent only controls. The limits for the permeabilized region were defined by comparing the nisin and solvent only controls.
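
The gating described in panel (C) can be pictured as sorting each flow cytometry event into one of four classes using two limits. The sketch below is only a schematic of that idea: the function names, the example events, the two limit values, and the convention that a lower membrane-potential readout means "depolarized" are all assumptions for illustration, not the authors' analysis code.

```python
from collections import Counter


def classify_event(potential, permeability, potential_limit, permeability_limit):
    """Assign one event to a quadrant using the two gate limits (assumed convention)."""
    polar = "depolarized" if potential < potential_limit else "polarized"
    perm = "permeable" if permeability > permeability_limit else "impermeable"
    return f"{polar}/{perm}"


def population_fractions(events, potential_limit, permeability_limit):
    """Fraction of events falling in each quadrant."""
    counts = Counter(
        classify_event(p, q, potential_limit, permeability_limit) for p, q in events
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Hypothetical (potential readout, permeability readout) pairs, not real data.
EVENTS = [(2.0, 10.0), (1.8, 12.0), (0.4, 11.0), (0.3, 900.0)]
fractions = population_fractions(EVENTS, potential_limit=1.0, permeability_limit=100.0)
for label, fraction in sorted(fractions.items()):
    print(f"{label}: {fraction:.2f}")
```

In practice the two limits would come from the control samples named in the caption rather than being fixed by hand as they are here.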

SCH-79797 Treatment Can Kill Bacteria in Contexts Where Combination Therapy Fails
Having established that SCH-79797 disrupts both folate metabolism and membrane integrity, we sought to determine if these two targets can together explain how SCH-79797 kills bacteria. To address this question, we used BCP analysis to compare the cell morphology of bacteria treated with SCH-79797 to that of bacteria treated with a combination of two different antibiotics, one of which targets DHFR and one of which targets membrane integrity. Qualitative inspection suggested that SCH-79797-treated E. coli appeared similar to E. coli lptD4213 cells treated with a combination of trimethoprim and nisin (Figure 6A). Quantification of the images confirmed that SCH-79797 clusters with the co-treatment of trimethoprim and nisin (Figure 6A). Co-treatment with polymyxin B and trimethoprim similarly clustered with SCH-79797, suggesting that this effect is due to membrane perturbation and not specific to the complex MoA of nisin (Hasper et al., 2006; Prince et al., 2016; Wiedemann et al., 2001)

Figure 6. SCH-79797 Mimics Co-treatment with Folate Metabolism and Membrane Integrity Disruptors but Can Be More Effective Than Their Combination
(A) BCP analysis of E. coli lptD4213 cells after 30 min of treatment with 1% DMSO, 6.3 μg/mL SCH-79797 (1× MIC), 2 μg/mL trimethoprim (10× MIC), 25 μg/mL nisin (2× MIC), or the combination of 2 μg/mL trimethoprim (10× MIC) and 25 μg/mL nisin (2× MIC). Cytological profiles were clustered by the first three principal components that account for at least 90% of the variance between samples. Cells were stained with DAPI, FM4-64, and SYTOX Green. Shown here are the merged images of DAPI (blue) and FM4-64 (red). Scale bar, 2 μm.
(B) The viability of E. coli lptD4213 cells measured in CFU mL⁻¹ after 2 h of treatment with 1% DMSO (solvent control), 2 μg/mL trimethoprim (10× MIC), 25 μg/mL nisin (2× MIC), the combination of 2 μg/mL trimethoprim (10× MIC) and 25 μg/mL nisin (2× MIC), 0.8 μg/mL polymyxin B (2× MIC), the combination of 2 μg/mL trimethoprim (10× MIC) and 0.8 μg/mL polymyxin B (2× MIC), or 3.1 μg/mL SCH-79797 (1× MIC). Each bar represents 3 biological replicates. Mean ± SD are shown.
(C) Viability of S. aureus MRSA USA300 persister cells measured in CFU mL⁻¹ after 2 h of treatment with 1% DMSO (solvent control), 63 μg/mL trimethoprim (10× MIC), 100 μg/mL nisin (2× MIC), the combination of 63 μg/mL trimethoprim (10× MIC) and 50 μg/mL nisin (2× MIC), 63 μg/mL daptomycin (2× MIC), the combination of 63 μg/mL trimethoprim (10× MIC) and 63 μg/mL daptomycin (2× MIC), or 6.3 μg/mL SCH-79797 (1× MIC). Each bar represents 3 biological replicates. Mean ± SD are shown.

Figure 7. Derivatives of SCH-79797 Show Increased Potency and the Ability to Help Clear Infection in Mouse Vaginal N. gonorrhoeae Model
(A) Structures of SCH-79797, the pyrroloquinazolinediamine core lacking the side chains (IRS-10), and the pyrroloquinazolinediamine core with a biphenyl decoration (IRS-16).
(B) The MICs of SCH-79797, IRS-10, and IRS-16 against a few selected species. For the MICs against additional strains, see Table S1.
(C) The growth of CRISPRi B. subtilis knockdown mutants involved in folate synthesis relative to a DMSO-treated control after treatment with IRS-10 or IRS-16. Bacterial growth was measured for 14 h and the final optical density (OD600) of each condition was plotted against drug concentration. Each data point represents 2 biological replicates. Mean ± SD are shown.
(D) Flow cytometry analysis of the membrane potential and permeability of E. coli lptD4213 cells after 15 min incubation with 0.4 μg/mL IRS-10 (1× MIC) or 0.02 μg/mL IRS-16 (1× MIC).
(E) Therapeutic index of SCH-79797 and IRS-16 was calculated by dividing the MIC of each drug for the indicated mammalian cell line by its MIC against E. coli lptD4213. The MIC of IRS-16 against PBMC was greater than the maximal drug concentrations tested.
(Figure S6). The fact that SCH-79797 clusters more closely to the co-treatments than to the individual treatments with trimethoprim or nisin/polymyxin B reinforces the conclusion that SCH-79797 kills bacteria by targeting both DHFR and the membrane. There are no other antibiotics that have been shown to target both folate metabolism and membrane integrity, indicating that SCH-79797 represents an antibiotic with a unique MoA. This result also explains why SCH-79797 failed to cluster with any of the known antibiotics in our BCP analysis (Figure 2B).
Combination antibiotic therapy has been suggested as a potential means of circumventing the rise of antibiotic resistance (Tamma et al., 2012; Tyers and Wright, 2019) but it has remained unclear whether it is better to combine multiple activities on the same molecule. To probe this issue, we measured the synergy of co-treatment with one antibiotic targeting dihydrofolate reductase and another targeting the membrane and compared their combined effectiveness to that of SCH-79797. Interestingly, when E. coli lptD4213 cells were co-treated with trimethoprim and nisin, or co-treated with trimethoprim and polymyxin B, the two antibiotics antagonized one another’s activity, resulting in a greater number of viable cells remaining after 2 h of co-treatment (Figure 6B). MRSA USA300 persister cells (Kim et al., 2018) are resistant to treatment with the membrane-disrupting daptomycin (Chen et al., 2014; Taylor and Palmer, 2016). Treating MRSA USA300 persister cells with 63 μg/mL daptomycin (1× MIC) for 2 h did not reduce the number of CFUs remaining in the culture (Figure 6C). However, SCH-79797 treatment of MRSA robustly killed these persister cells while the combination of trimethoprim with either nisin or daptomycin could not (Figure 6C). These results suggest that the combination of two different antibacterial activities on the same molecular scaffold can, at least in the case of SCH-79797, produce a more potent antibacterial effect than co-treating with two antibiotics with the two separate targeting activities.

The Chemical Basis of the Two MoAs of SCH-79797
SCH-79797 consists of a pyrroloquinazolinediamine core that is substituted with an isopropylphenyl group on one side and a cyclopropyl moiety on the other. In order to test the function of the pyrroloquinazolinediamine core on the antibiotic activity of SCH-79797, we synthesized a derivative of SCH-79797 (Irresistin-10 or IRS-10) that lacks both side groups (Figure 7A). When compared to the parent molecule SCH-79797, removing the isopropylphenyl and cyclopropyl groups increased the potency against E. coli lptD4213 but decreased the potency against B. subtilis 168, MRSA USA300, and A. baumannii AB17978 (Figure 7B; Table S1). To determine whether the pyrroloquinazolinediamine core of SCH-79797 is specifically involved in targeting folate metabolism or membrane integrity, we assessed the activity of IRS-10 using the dfrA and folC CRISPRi hypersensitivity assay and the quantitative flow cytometry membrane integrity assay. The CRISPRi hypersensitivity assay indicated that IRS-10 maintains the ability to inhibit folate metabolism, suggesting that the pyrroloquinazolinediamine core is sufficient to target DHFR (Figure 7C). However, unlike SCH-79797, DiOC2(3) and TO-PRO-3 staining showed that IRS-10 does not disrupt membrane polarity or permeability (Figure 7D). We also confirmed that IRS-10 directly inhibits the enzymatic activity of purified E. coli FolA (Figures 4D and 4E). IRS-10 proved to be a more potent inhibitor of FolA than SCH-79797 (IRS-10 IC50 = 65 ± 19 nM) (Figure 4D), suggesting that its increased efficacy against E. coli in cell culture can be explained by increased activity toward DHFR. Together, these findings suggest that the pyrroloquinazolinediamine core of SCH-79797 targets DHFR.
We next sought to determine if the isopropylbenzene side group is responsible for the membrane-targeting properties of SCH-79797. We thus obtained isopropylbenzene alone (also known as cumene) (Figure S7A) and determined its effects on membrane integrity and folate biosynthesis. DiOC2(3) and TO-PRO-3 staining showed that isopropylbenzene disrupts both membrane polarity and permeability (Figure S7B). Meanwhile, reduction of dfrA or folC levels by CRISPRi had no effect on the sensitivity of bacteria to isopropylbenzene (Figure S7C). These results support the conclusion that SCH-79797 is a dual-targeting compound where the pyrroloquinazolinediamine core specifically targets folate metabolism while the isopropylbenzene group specifically targets membrane integrity.
As a further test of whether the hydrophobic isopropylbenzene chain functions to target the membrane, we generated another small molecule, Irresistin-16 (IRS-16), in which we decorated the pyrroloquinazolinediamine core with a biphenyl group that is even more hydrophobic than isopropylbenzene (Figure 7A). As predicted from the inclusion of both folate-targeting and membrane-targeting moieties, dfrA and folC CRISPRi mutants proved hypersensitive to IRS-16 and IRS-16 disrupted membrane permeability and polarity by DiOC2(3) and TO-PRO-3 staining (Figures 7C and 7D). IRS-16 also had more potent antibiotic activity than IRS-10 against most bacteria tested (Figure 7B; Table S1), suggesting that targeting both membrane integrity and folate biosynthesis is more powerful than targeting folate metabolism alone.

The SCH-79797 Derivative IRS-16 Is Efficacious in a Mouse Infection Model
An effective antibiotic needs to be able to target pathogenic bacteria without killing mammalian hosts. To determine the concentrations required to inhibit the growth of mammalian cells, we treated several mammalian cell lines with both SCH-79797 and IRS-16, the derivative with the most potent antibiotic activity. SCH-79797 showed promising results with PBMC cells, as they required more than 10-fold higher doses for growth inhibition than those required for killing E. coli lptD4213 (Figure 7E). However, SCH-79797 inhibited the growth of other mammalian cell lines, including HK-2, HEK293, and HLF, at doses comparable to the doses needed to kill bacteria. In contrast, IRS-16 killed bacteria at 100–1,000-fold lower doses than those required to
(F) The stability of IRS-16 was measured following incubation with mouse liver microsomes. Each data point represents 2 biological replicates. Mean ± SD
are shown.
(G) Treatment of mice with IRS-16 (10 mg/kg, i.v., twice a day [b.i.d.]) reduces the vaginal burden of N. gonorrhoeae 24 h after inoculation. p value from one-
factor ANOVA.
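
The therapeutic index defined in panel (E) is a simple ratio, which the following sketch spells out. The function names and the mammalian MIC value are hypothetical and chosen only for illustration; the `censored` flag is an assumed way of marking a mammalian MIC that exceeded the highest concentration tested, in which case the ratio is only a lower bound.

```python
def therapeutic_index(mammalian_mic, bacterial_mic):
    """Ratio of the MIC for a mammalian cell line to the MIC against the bacterium."""
    if bacterial_mic <= 0:
        raise ValueError("bacterial MIC must be positive")
    return mammalian_mic / bacterial_mic


def describe_index(mammalian_mic, bacterial_mic, censored=False):
    """Format the index; censored=True means the true value is at least this large."""
    index = therapeutic_index(mammalian_mic, bacterial_mic)
    prefix = ">" if censored else ""
    return f"{prefix}{index:.0f}"


# Hypothetical numbers in matching concentration units, not values from Table S1.
print(describe_index(mammalian_mic=5.0, bacterial_mic=0.02))
print(describe_index(mammalian_mic=5.0, bacterial_mic=0.02, censored=True))
```

Both MICs must be expressed in the same units for the ratio to be meaningful.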
affect mammalian cells in all cell lines tested (Figure 7E). In light of this larger therapeutic window, we focused our further in vivo analysis efforts on IRS-16.
Because IRS-16 preferentially killed bacteria in culture models, we proceeded to characterize its in vivo effects on mice. We determined the maximal tolerated dose (MTD) of IRS-16 to be 15 mg/kg when administered intravenously. A pharmacokinetic analysis revealed that at this MTD, the plasma concentration of IRS-16 peaked at 1.4 μg/mL with a half-life of 15.8 h (Figures S7D and S7E). Consistent with this robust in vivo stability, a mouse liver microsome study showed that IRS-16 is extremely stable as compared to a control drug, verapamil (Figure 7F).
Finally, we determined whether IRS-16 has antibiotic activity in a mouse bacterial infection model. For this purpose, we focused on N. gonorrhoeae as it is a Gram-negative pathogen for which there is an acute need for new antibiotics due to widespread resistance toward existing drugs (CDC, 2019). There is also a well-validated mouse vaginal infection model for N. gonorrhoeae (Jerse et al., 2002; Song et al., 2008). In bacterial culture, IRS-16 showed robust activity toward N. gonorrhoeae, with an MIC of 0.03 μg/mL (Table S1). Our pharmacokinetic analysis indicated that IRS-16 should persist in mice at concentrations above this MIC for nearly 48 h (Figures S7D and S7E). To test its in vivo efficacy, we inoculated the vaginal tracts of BALB/c mice with 1.85 × 10⁶ CFU/mouse of N. gonorrhoeae ATCC 700825, treated with intravenous (i.v.) doses of either 10 mg/kg IRS-16 or vehicle control at 2 and 14 h post-infection, and assayed vaginal N. gonorrhoeae CFUs at 26 h post-infection. IRS-16 significantly reduced the vaginal load of N. gonorrhoeae (p < 0.05) as compared to the vehicle control (Figure 7G). Consistent with its favorable therapeutic index and pharmacokinetic profile, this result confirms that IRS-16 can function as an effective antibiotic in an in vivo mouse gonorrhea infection model.

DISCUSSION

Due to the rise in resistance to known antibiotics, there is an acute need for new antibiotics with the key features of having unique MoAs, potency toward Gram-negatives, and reduced susceptibility to resistance. Here, we describe a promising compound, SCH-79797, and its derivative, IRS-16, that are effective in animals and address these key criteria with a unique dual-targeting MoA, the ability to kill both Gram-negative and Gram-positive pathogens, and an undetectably low frequency of resistance. We also describe a systems-level pipeline that combines independent orthogonal approaches to characterize the MoA of SCH-79797 in the absence of resistant mutants. Specifically, we used BCP classification to categorize the MoA of SCH-79797 as distinct from those of 37 known antibiotics (Figure 2B). We then used thermal proteome profiling to identify DHFR as a candidate binding partner of SCH-79797 and confirmed that SCH-79797 inhibits folate metabolism through metabolomic analysis and CRISPRi genetic hypersensitivity (Figures 3B, 4B, and 4C). We confirmed that SCH-79797 directly inhibits DHFR activity by acting competitively toward its DHF substrate (Figures 4D and 4E). The BCP images also alerted us to a second potential target for SCH-79797, the bacterial membrane. Quantitative flow cytometry with dyes that report on membrane permeability and polarity confirmed that SCH-79797 has a folate-independent effect on bacterial membrane integrity (Figure 5C). Together, these assays constitute a pipeline that can be used in the future to rapidly characterize antibiotic MoAs de novo. Such a pipeline is especially important for compounds such as SCH-79797 that are not prone to resistance and do not mimic known MoAs. BCP, thermal proteome profiling, metabolomics, CRISPRi sensitivity, and flow cytometry are all assays that can be performed in small volumes, such that they can be readily scaled without the need for synthesizing large amounts of the compound in question. The orthogonal nature of the assays enables the independent identification of multiple MoAs, which may help in the discovery of unique antibiotic classes.
Both of the targets of SCH-79797 are relevant for its function as an antibiotic. The CRISPRi and metabolomic studies demonstrate that SCH-79797 actively disrupts folate metabolism in multiple bacterial species in a manner that is rate-limiting for growth (Figures 4B and 4C). Meanwhile, the flow cytometry assay demonstrates that SCH-79797 simultaneously disrupts membrane integrity even though folate inhibition itself has no effect on the membrane (Figure 5C). The ability of SCH-79797 to disrupt membrane integrity is particularly interesting given that membrane-disruptors are often selective for either Gram-positive or Gram-negative bacteria (Ling et al., 2015; Taylor and Palmer, 2016; Warren et al., 1957), while SCH-79797 proved potent against both Gram-positive pathogens like S. aureus and E. faecalis as well as Gram-negative pathogens like A. baumannii, N. gonorrhoeae, and pathogenic E. coli (Figure 1A). Host toxicity is often a concern for membrane-targeting antibiotics, and while SCH-79797 was well tolerated by some animal cells like G. mellonella wax worms and PBMC cells, it killed other mammalian cell lines at doses similar to those at which it functions as an antibiotic. Meanwhile, IRS-16, a derivative of SCH-79797, increased antibiotic activity without increasing mammalian toxicity, thereby increasing its therapeutic window >100-fold. The ability of IRS-16 to selectively target bacteria is consistent with a recent study of retinoid derivatives that provided proof-of-principle that small molecules can preferentially target bacterial membranes (Kim et al., 2018). Future biophysical characterization and medicinal chemistry will help to further increase potency and reduce toxicity.
The undetectably low frequency of resistance to SCH-79797 could result from its two distinct targets. Specifically, we were successful in isolating resistance mutants for mimics of each of its two individual targets, trimethoprim and nisin, but not for SCH-79797 (Figure 1E). The average mutation rate in E. coli is 2.1 × 10⁻⁷ per gene per generation (Chen and Zhang, 2013). If E. coli required 2 mutations to acquire resistance to SCH-79797, the number of bacteria that would be necessary to find a resistant mutant would be in the range of 10¹⁴. Even if that represents an overestimate, humans are estimated to carry roughly 4 × 10¹³ bacteria in total, so such low frequencies of resistance would be unlikely to result in resistant mutants in a clinical context.
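
The order-of-magnitude argument about resistance can be reproduced with a few lines of arithmetic. The sketch below assumes, purely for illustration, that each required mutation arises independently at the quoted per-gene rate of 2.1 × 10⁻⁷ per generation, so that the chance of a double mutant is the product of the two rates; the function name is hypothetical. Under that simplification the result lands within roughly an order of magnitude of the figure quoted in the text.

```python
def cells_needed(rate_per_gene, n_mutations):
    """Expected number of cells screened to find one carrying n independent mutations."""
    return 1.0 / rate_per_gene ** n_mutations


MUTATION_RATE = 2.1e-7  # per gene per generation, as quoted in the text

single = cells_needed(MUTATION_RATE, 1)
double = cells_needed(MUTATION_RATE, 2)

print(f"one required mutation:  about {single:.1e} cells")
print(f"two required mutations: about {double:.1e} cells")
```

The steep jump between the one-mutation and two-mutation cases is the point of the argument: each additional independent target multiplies the population size needed by millions.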
Our studies suggest that SCH-79797 is more potent than combination treatment with antibiotics that mimic its two activities, the DHFR-inhibitor trimethoprim and the membrane-disruptor nisin. Similarly, co-treatment with trimethoprim and polymyxin B showed antagonistic interactions (Figure 6B), while MRSA persister cells were killed by SCH-79797 but not by combined treatment with trimethoprim and daptomycin (Figure 6C). A potential explanation for the potency of SCH-79797 is that recruiting a DHFR inhibitor to the membrane could increase its effective concentration or potentiate its inactivation of DHFR by sequestering it. Permeabilizing the membrane could also enhance the access of SCH-79797 to its cytoplasmic DHFR target. The difference between SCH-79797 and the combination treatments could also be based on non-primary target effects such as differences in localized synergistic drug concentrations, drug uptake or efflux. Membrane-targeting molecules can act either synergistically or antagonistically with antibiotics with different MoAs (Brochado et al., 2018). Because trimethoprim and various membrane disruptors antagonize each other separately, but DHFR inhibition and membrane disruption synergize in the context of SCH-79797, combining antibiotic activities onto the same molecule could present a solution for bypassing this antagonistic effect. In any event, our results suggest that despite the promise of combination antibiotic therapies (Brochado et al., 2018; Tyers and Wright, 2019), an even more powerful approach could be to combine different targeting moieties onto the same chemical scaffold.
Discovering the MoA of SCH-79797 also enabled us to design derivatives that improve its efficacy. Our most promising derivative currently is IRS-16, in which we replaced the isopropylphenyl group with a biphenyl group (with the idea to increase the membrane-targeting activity) and removed the cyclopropyl side chain (to enhance the DHFR inhibition). IRS-16 showed improved ability to kill bacteria, with significantly lower MICs than SCH-79797. More importantly, IRS-16 did not similarly enhance the growth inhibition of mammalian cells. Thus, IRS-16 exhibited a promising therapeutic index, as reducing the concentration necessary to kill bacteria without affecting the concentration necessary to kill mammalian cells resulted in a compound that is >100-fold more potent toward bacteria than hosts. IRS-16 was also stable in mice and tolerated at doses significantly above the MIC for several hours. Finally, we confirmed that IRS-16 significantly reduced the burden of N. gonorrhoeae in a mouse vaginal infection model. N. gonorrhoeae is a Gram-negative pathogen with some of the highest rates of drug resistance for any pathogen. The acute need for new antibiotics to treat N. gonorrhoeae makes IRS-16 a particularly promising small molecule candidate for future development.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

● KEY RESOURCES TABLE
● RESOURCE AVAILABILITY
  ○ Lead Contact
  ○ Materials Availability
  ○ Data and Code Availability
  ○ Bacterial Cytological Profiling Code
  ○ Thermal Proteome Profiling Data
  ○ Flow Cytometry Data
  ○ Additional Raw Data
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Bacterial strains and growth conditions
  ○ Mammalian cell lines
  ○ Animal models
● METHOD DETAILS
  ○ Minimum inhibitory concentration assays
  ○ Compound library
  ○ Galleria mellonella killing assay
  ○ Colony forming units assay
  ○ S. aureus MRSA persister cell assay
  ○ Serial passaging assay to evolve resistance
  ○ Bacterial cytological profiling
  ○ Thermal proteome profiling
  ○ Metabolomics
  ○ Dihydrofolate reductase activity assay
  ○ Membrane potential and permeability assay
  ○ Mammalian cell cytotoxicity
  ○ Mouse liver microsomal stability
  ○ Pharmacokinetic analysis
  ○ Neisseria gonorrhoeae vaginal infection model
● QUANTIFICATION AND STATISTICAL ANALYSIS
  ○ Center, spread, and statistical significance
  ○ Bacterial cytological profiling
  ○ Thermal proteome profiling
  ○ Flow cytometry analysis
  ○ Pharmacokinetic analysis

SUPPLEMENTAL INFORMATION

cell.2020.05.005.

ACKNOWLEDGMENTS

The B. subtilis CRISPR knockdown library was a kind gift from Jason M. Peters. Flow cytometry was performed in collaboration with Christina DeCoste (Princeton University Flow Cytometry Resource Facility [FCRF]). We appreciate the support and feedback from lab members in the Gitai and Shaevitz labs. Funding was provided in part by NIH (DP1AI124669 to Z.G., J.P.S., B.P.B., and J.K.M. and T32 GM007388 to J.K.M. and G.M.M.), as well as Princeton DFR Innovation Funds for New Ideas in Science (to J.K.M. and J.P.S.). Additional funding was provided by the National Science Foundation (NSF PHY-1734030 to B.P.B.) and for the FCRF by the National Cancer Institute (NCI-CCSG P30CA072720-5921). The opinions, findings, and conclusions or recommendations expressed in this material are solely the responsibility of the authors and do not necessarily represent the official views of the NIH or the National Science Foundation.

AUTHOR CONTRIBUTIONS

Conceptualization, Z.G., J.K.M., M.Z.W., and H.K.; Methodology, Z.G., B.P.B., M.Z.W., A.M., A.T., M.M.S., J.R., S.H.-J.L., and H.K.; Software, M.Z.W., J.K.M., B.P.B., A.T., and M.M.S.; Validation, J.P.S.; Formal Analysis, B.P.B. and G.M.M.; Investigation, M.Z.W., J.K.M., J.P.S., G.M.M., A.M., A.T., M.M.S., S.H.-J.L., and B.P.B.; Resources, J.P.S. and M.Z.W.; Writing – Original Draft, Z.G. and J.K.M.; Writing – Reviewing & Editing, Z.G., B.P.B.,
1530 Cell 181, 1518–1532, June 25, 2020

J.P.S., H.K., and C.D.; Visualization, J.P.S., J.K.M., and B.P.B.; Supervision, Z.G., H.K., J.R., A.T., and M.M.S.; Funding Acquisition, Z.G., J.R., A.T., and M.M.S.

DECLARATION OF INTERESTS
A patent application describing the use of SCH-79797 as an antibiotic, as well as the pharmaceutical composition and use as antibiotic of derivatives is currently pending.

Received: June 6, 2019
Revised: February 24, 2020
Accepted: May 1, 2020
Published: June 3, 2020

REFERENCES
Ahn, H.S., Foster, C., Boykow, G., Stamford, A., Manna, M., and Graziano, M. (2000). Inhibition of cellular action of thrombin by N3-cyclopropyl-7-[[4-(1-methylethyl)phenyl]methyl]-7H-pyrrolo[3,2-f]quinazoline-1,3-diamine (SCH 79797), a nonpeptide thrombin receptor antagonist. Biochem. Pharmacol. 60, 1425–1434.
Amyes, S.G., and Smith, J.T. (1975). Thymineless mutants and their resistance to trimethoprim. J. Antimicrob. Chemother. 1, 85–89.
Becher, I., Werner, T., Doce, C., Zaal, E.A., Tögel, I., Khan, C.A., Rueger, A., Muelbaier, M., Salzer, E., Berkers, C.R., et al. (2016). Thermal profiling reveals phenylalanine hydroxylase as an off-target of panobinostat. Nat. Chem. Biol. 12, 908–910.
Bell-Pedersen, D., Galloway Salvo, J.L., and Belfort, M. (1991). A transcription terminator in the thymidylate synthase (thyA) structural gene of Escherichia coli and construction of a viable thyA:Kmr deletion. J. Bacteriol. 173, 1193–1200.
Boucher, H.W., Talbot, G.H., Bradley, J.S., Edwards, J.E., Gilbert, D., Rice, L.B., Scheld, M., Spellberg, B., and Bartlett, J. (2009). Bad bugs, no drugs: no ESKAPE! An update from the Infectious Diseases Society of America. Clin. Infect. Dis. 48, 1–12.
Brochado, A.R., Telzerow, A., Bobonis, J., Banzhaf, M., Mateus, A., Selkrig, J., Huth, E., Bassler, S., Zamarreño Beas, J., Zietek, M., et al. (2018). Species-specific activity of antibacterial drug combinations. Nature 559, 259–263.
Butler, M.S., Blaskovich, M.A., and Cooper, M.A. (2017). Antibiotics in the clinical pipeline at the end of 2015. J. Antibiot. (Tokyo) 70, 3–24.
Cammarata, M., Thyer, R., Lombardo, M., Anderson, A., Wright, D., Ellington, A., and Brodbelt, J.S. (2017). Characterization of trimethoprim resistant E. coli dihydrofolate reductase mutants by mass spectrometry and inhibition by propargyl-linked antifolates. Chem. Sci. (Camb.) 8, 4062–4072.
CDC (2019). Antibiotic Resistance Threats in the United States (U.S. Department of Health and Human Services).
Chen, X., and Zhang, J. (2013). No gene-specific optimization of mutation rate in Escherichia coli. Mol. Biol. Evol. 30, 1559–1562.
Chen, Y.F., Sun, T.L., Sun, Y., and Huang, H.W. (2014). Interaction of daptomycin with lipid bilayers: a lipid extracting effect. Biochemistry 53, 5384–5392.
Chen, L., Ducker, G.S., Lu, W., Teng, X., and Rabinowitz, J.D. (2017). An LC-MS chemical derivatization method for the measurement of five different one-carbon states of cellular tetrahydrofolate. Anal. Bioanal. Chem. 409, 5955–5964.
Coates, A.R., Halls, G., and Hu, Y. (2011). Novel classes of antibiotics or more of the same? Br. J. Pharmacol. 163, 184–194.
Culyba, M.J., Mo, C.Y., and Kohli, R.M. (2015). Targets for Combating the Evolution of Acquired Antibiotic Resistance. Biochemistry 54, 3573–3582.
Davies, J. (2006). Where have All the Antibiotics Gone? Can. J. Infect. Dis. Med. Microbiol. 17, 287–290.
Gebhardt, M.J., Gallagher, L.A., Jacobson, R.K., Usacheva, E.A., Peterson, L.R., Zurawski, D.V., and Shuman, H.A. (2015). Joint Transcriptional Control of Virulence and Resistance to Antibiotic and Environmental Stress in Acinetobacter baumannii. MBio 6, e01660-15.
Gleckman, R., Blagg, N., and Joubert, D.W. (1981). Trimethoprim: mechanisms of action, antimicrobial activity, bacterial resistance, pharmacokinetics, adverse reactions, and therapeutic indications. Pharmacotherapy 1, 14–20.
Gobbetti, T., Cenac, N., Motta, J.-P., Rolland, C., Martin, L., Andrade-Gordon, P., Steinhoff, M., Barocelli, E., and Vergnolle, N. (2012). Serine protease inhibition reduces post-ischemic granulocyte recruitment in mouse intestine. Am. J. Pathol. 180, 141–152.
Gupta, N., Liu, R., Shin, S., Sinha, R., Pogliano, J., Pogliano, K., Griffin, J.H., Nizet, V., and Corriden, R. (2018). SCH79797 improves outcomes in experimental bacterial pneumonia by boosting neutrophil killing and direct antibiotic activity. J. Antimicrob. Chemother. 73, 1586–1594.
Gutnick, D., Calvo, J.M., Klopotowski, T., and Ames, B.N. (1969). Compounds which serve as the sole source of carbon or nitrogen for Salmonella typhimurium LT-2. J. Bacteriol. 100, 215–219.
Hasper, H.E., Kramer, N.E., Smith, J.L., Hillman, J.D., Zachariah, C., Kuipers, O.P., de Kruijff, B., and Breukink, E. (2006). An alternative bactericidal mechanism of action for lantibiotic peptides that target lipid II. Science 313, 1636–1637.
Hofer, U. (2019). The cost of antimicrobial resistance. Nat. Rev. Microbiol. 17, 3.
Imai, Y., Meyer, K.J., Iinishi, A., Favre-Godal, Q., Green, R., Manuse, S., Caboni, M., Mori, M., Niles, S., Ghiglieri, M., et al. (2019). A new antibiotic selectively kills Gram-negative pathogens. Nature 576, 459–464.
Jerse, A.E., Crow, E.T., Bordner, A.N., Rahman, I., Cornelissen, C.N., Moench, T.R., and Mehrazar, K. (2002). Growth of Neisseria gonorrhoeae in the female mouse genital tract does not require the gonococcal transferrin or hemoglobin receptors and may be enhanced by commensal lactobacilli. Infect. Immun. 70, 2549–2558.
Karlowsky, J.A., Draghi, D.C., Jones, M.E., Thornsberry, C., Friedland, I.R., and Sahm, D.F. (2003). Surveillance for antimicrobial susceptibility among clinical isolates of Pseudomonas aeruginosa and Acinetobacter baumannii from hospitalized patients in the United States, 1998 to 2001. Antimicrob. Agents Chemother. 47, 1681–1688.
Kim, W., Zhu, W., Hendricks, G.L., Van Tyne, D., Steele, A.D., Keohane, C.E., Fricke, N., Conery, A.L., Shen, S., Pan, W., et al. (2018). A new class of synthetic retinoid antibiotics effective against bacterial persisters. Nature 556, 103–107.
King, A.C., and Wu, L. (2009). Macromolecular synthesis and membrane perturbation assays for mechanisms of action studies of antimicrobial agents. Curr. Protoc. Pharmacol. Chapter 13, Unit 13A.17.
Kwon, Y.K., Lu, W., Melamud, E., Khanam, N., Bognar, A., and Rabinowitz, J.D. (2008). A domino effect in antifolate drug action in Escherichia coli. Nat. Chem. Biol. 4, 602–608.
Kwon, Y.K., Higgins, M.B., and Rabinowitz, J.D. (2010). Antifolate-induced depletion of intracellular glycine and purines inhibits thymineless death in E. coli. ACS Chem. Biol. 5, 787–795.
Ling, L.L., Schneider, T., Peoples, A.J., and Spoering, A.L. (2015). A new antibiotic kills pathogens without detectable resistance. Nature 517, 455–459.
Mateus, A., Bobonis, J., Kurzawa, N., Stein, F., Helm, D., Hevler, J., Typas, A., and Savitski, M.M. (2018). Thermal proteome profiling in bacteria: probing protein state in vivo. Mol. Syst. Biol. 14, e8242.
McInnes, L., Healy, J., and Melville, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv, arXiv:1802.03426.
McKeage, K. (2015). Finafloxacin: first global approval. Drugs 75, 687–693.
Nonejuie, P., Burkart, M., Pogliano, K., and Pogliano, J. (2013). Bacterial cytological profiling rapidly identifies the cellular pathways targeted by antibacterial molecules. Proc. Natl. Acad. Sci. USA 110, 16169–16174.
Novo, D., Perlmutter, N.G., Hunt, R.H., and Shapiro, H.M. (1999). Accurate flow cytometric membrane potential measurement in bacteria using diethyloxacarbocyanine and a ratiometric technique. Cytometry 35, 55–63.
Novo, D.J., Perlmutter, N.G., Hunt, R.H., and Shapiro, H.M. (2000). Multiparameter flow cytometric analysis of antibiotic effects on membrane potential,
membrane permeability, and bacterial counts of Staphylococcus aureus and Micrococcus luteus. Antimicrob. Agents Chemother. 44, 827–834.
O’Neill, J. (2014). AMR Review Paper–Tackling a Crisis for the Health and Wealth of Nations (AMR Review Paper).
Peleg, A.Y., Jara, S., Monga, D., Eliopoulos, G.M., Moellering, R.C., Jr., and Mylonakis, E. (2009). Galleria mellonella as a model system to study Acinetobacter baumannii pathogenesis and therapeutics. Antimicrob. Agents Chemother. 53, 2605–2609.
Peters, J.M., Colavin, A., Shi, H., Czarny, T.L., Larson, M.H., Wong, S., Hawkins, J.S., Lu, C.H.S., Koo, B.-M.M., Marta, E., et al. (2016). A Comprehensive, CRISPR-based Functional Analysis of Essential Genes in Bacteria. Cell 165, 1493–1506.
Prince, A., Sandhu, P., Ror, P., Dash, E., Sharma, S., Arakha, M., Jha, S., Akhter, Y., and Saleem, M. (2016). Lipid-II Independent Antimicrobial Mechanism of Nisin Depends On Its Crowding And Degree Of Oligomerization. Sci. Rep. 6, 37908.
Randall, L.B., Georgi, E., Genzel, G.H., and Schweizer, H.P. (2016). Finafloxacin overcomes Burkholderia pseudomallei efflux-mediated fluoroquinolone resistance. J. Antimicrob. Chemother. 72, 1258–1260.
Ruiz, N., Kahne, D., and Silhavy, T.J. (2006). Advances in understanding bacterial outer-membrane biogenesis. Nat. Rev. Microbiol. 4, 57–66.
Savitski, M.M., Reinhard, F.B.M., Franken, H., Werner, T., Savitski, M.F., Eberhard, D., Martinez Molina, D., Jafari, R., Dovega, R.B., Klaeger, S., et al. (2014). Tracking cancer drugs in living cells by thermal profiling of the proteome. Science 346, 1255784.
Song, W., Condron, S., Mocca, B.T., Veit, S.J., Hill, D., Abbas, A., and Jerse, A.E. (2008). Local and humoral immune responses against primary and repeat Neisseria gonorrhoeae genital tract infections of 17beta-estradiol-treated mice. Vaccine 26, 5741–5751.
Soupene, E., van Heeswijk, W.C., Plumbridge, J., Stewart, V., Bertenthal, D., Lee, H., Prasad, G., Paliy, O., Charernnoppakul, P., and Kustu, S. (2003). Physiological studies of Escherichia coli strain MG1655: growth defects and apparent cross-regulation of gene expression. J. Bacteriol. 185, 5611–5626.
Strande, J.L., Hsu, A., Su, J., Fu, X., Gross, G.J., and Baker, J.E. (2007). SCH 79797, a selective PAR1 antagonist, limits myocardial ischemia/reperfusion injury in rat hearts. Basic Res. Cardiol. 102, 350–358.
Tamma, P.D., Cosgrove, S.E., and Maragakis, L.L. (2012). Combination therapy for treatment of infections with gram-negative bacteria. Clin. Microbiol. Rev. 25, 450–470.
Taylor, S.D., and Palmer, M. (2016). The action mechanism of daptomycin. Bioorg. Med. Chem. 24, 6253–6268.
Tenover, F.C., and Goering, R.V. (2009). Methicillin-resistant Staphylococcus aureus strain USA300: origin and epidemiology. J. Antimicrob. Chemother. 64, 441–446.
Tyers, M., and Wright, G.D. (2019). Drug combinations: a strategy to extend the life of antibiotics in the 21st century. Nat. Rev. Microbiol. 17, 141–155.
Ursell, T., Lee, T.K., Shiomi, D., Shi, H., Tropini, C., Monds, R.D., Colavin, A., Billings, G., Bhaya-Grossman, I., Broxton, M., et al. (2017). Rapid, precise quantification of bacterial cellular dimensions across a genomic-scale knockout library. BMC Biol. 15, 17.
Viehman, J.A., Nguyen, M.H., and Doi, Y. (2014). Treatment options for carbapenem-resistant and extensively drug-resistant Acinetobacter baumannii infections. Drugs 74, 1315–1333.
Warren, G.H., Gray, J., and Yurchenco, J.A. (1957). Effect of polymyxin on the lysis of Neisseria catarrhalis by lysozyme. J. Bacteriol. 74, 788–793.
Werner, T., Sweetman, G., Savitski, M.F., Mathieson, T., Bantscheff, M., and Savitski, M.M. (2014). Ion coalescence of neutron encoded TMT 10-plex reporter ions. Anal. Chem. 86, 3594–3601.
Wiedemann, I., Breukink, E., van Kraaij, C., Kuipers, O.P., Bierbaum, G., de Kruijff, B., and Sahl, H.-G. (2001). Specific binding of nisin to the peptidoglycan precursor lipid II combines pore formation and inhibition of cell wall biosynthesis for potent antibiotic activity. J. Biol. Chem. 276, 1772–1779.
STAR★METHODS
KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
E. coli lptD4213 ΔthyA | This paper | ZG1599
B. subtilis 168 CRISPRi Library | Jason Peters and Carol Gross | Peters et al., 2016
Detailed bacterial strain information, including strain identifiers, growth media and temperature, is provided in Table S1 | N/A | N/A
SCH-79797 | Tocris Biosciences | 1592
Ampicillin | MP Biomedicals | 190148
Cumene | Sigma | PHR1210
Daptomycin | TCI | D4229
Gentamicin sulfate | Sigma | G1914
Meropenem trihydrate | TCI | M2279
Nisin | MP Biomedicals | 155839
Novobiocin sodium salt | MP Biomedicals | 0215595705
Polymyxin B sulfate salt | Sigma | P1004
Purified E. coli dihydrofolate reductase (FolA) | Genscript | N/A
DAPI | ThermoFisher | D1306
SYTOX Green Dead Cell Stain | ThermoFisher | S34860
FM4-64 Dye | ThermoFisher | T13320
BacLight Bacterial Membrane Potential kit | ThermoFisher | B34950
Colorimetric Dihydrofolate Reductase Assay Kit | Sigma | CS0340
Deposited Data
Proteomics | ProteomeXchange Consortium | PXD013673
Flow cytometry | FlowRepository | FR-FCM-Z2JD
Raw images and Metabolomics Data | This Paper; DataSpace at Princeton University | N/A
HEK293 | ATCC | CRL-1573
HK2 | ATCC | CRL-2190
HLF | Cell Applications | 506K-05a
PBMC | TPCS | PB010C
Male CD-1 mice | Pharmaron | N/A
Female ovariectomized BALB/c mice | Pharmacology Discovery Services Taiwan | N/A
Galleria mellonella | PetCo | 2336624
MATLAB | MathWorks | https://www.mathworks.com/products/matlab.html
Fiji | Eliceiri/LOCI group at University of Washington | https://imagej.net/Fiji
R Studio | R Studio | https://rstudio.com/products/rstudio/
FlowJo | FlowJo, LLC | https://www.flowjo.com/solutions/flowjo/downloads
Other
Synergy HT microplate reader | BioTek | N/A
InfiniteM200 Pro microplate reader | Tecan | N/A
27G x 0.5 inch needle | BD Biosciences | 305109
QuantaMaster 40 Spectrophotometer | HORIBA Instruments | N/A
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Zemer
Gitai (zgitai@princeton.edu).
The unique strain (E. coli lptD4213 ΔthyA) generated in this study is available from the Lead Contact.

Other than specific datasets and analysis code listed below, the published article includes all datasets generated or analyzed during
this study.
Bacterial Cytological Profiling Code

The BCP analysis code is available under a BSD 3-clause license at https://github.com/PrincetonUniversity/gitai-bacterialAutopsy
and archived at https://doi.org/10.5281/zenodo.3758582.
Thermal Proteome Profiling Data

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository
with the dataset identifier PRIDE:PXD013673.
Flow Cytometry Data

Flow cytometry data have been deposited to FlowRepository with the dataset identifier FR-FCM-Z2JD (https://flowrepository.org/id/FR-FCM-Z2JD).
Additional Raw Data

The raw microscopy images used for bacterial cytological profiling and raw metabolomics data are available at Princeton DataSpace (https://dataspace.princeton.edu/jspui/handle/88435/dsp01cr56n3903).
Bacterial strains and growth conditions

Detailed bacterial strain information, including growth media and temperature, is provided in Table S1. Where listed, growth media were prepared according to manufacturer recommendations: LB Broth and LB Broth supplemented with 0.3 mM thymine (BD Biosciences 244610, Alfa Aesar A15879), Brain Heart Infusion (BD Biosciences 237500), Tryptic Soy Broth (BD Biosciences 211825), Nutrient Broth (BD Biosciences 234000), Terrific Broth (BD Biosciences 243820), and Gutnick Minimal Media (1.0 g/L K2SO4, 13.5 g/L K2HPO4, 4.7 g/L KH2PO4, 0.1 g/L MgSO4·7H2O, 10 mM NH4Cl as a nitrogen source, and 0.4% w/v glucose as a carbon source) (Gutnick et al., 1969).
E. coli lptD4213 ΔthyA was constructed by moving ΔthyA::Kmr (Bell-Pedersen et al., 1991) into E. coli lptD4213 by P1 transduction.
Mammalian cell lines

Detailed cell line information is provided in Table S1. Mammalian cell line toxicity assays were performed by Pharmaron, Inc.
(Beijing, ROC).
Animal models
Pharmacokinetics determination
Care and handling of male CD-1 mice approximately 6-8 weeks old conformed to institutional animal care and use policies as carried
out at Pharmaron, Inc. (Beijing, ROC).
Neisseria gonorrhoeae infection model
Care and handling of 5-week old ovariectomized BALB/c mice conformed to institutional animal care and use policies as carried out
at Pharmacology Discovery Services Taiwan, Ltd. (New Taipei City, TW).
METHOD DETAILS
Minimum inhibitory concentration assays

The minimum inhibitory concentration (MIC) of each antibiotic was defined as the lowest concentration of antibiotic that resulted in no visible growth. MICs were measured by diluting overnight cultures 1:150, adding 2-fold dilutions of each antibiotic in 96-well plates, and growing the plates with shaking at 37°C. Cell growth was monitored by measuring the OD600. Assays performed by Pharmacology Discovery Services (New Taipei City, TW) and Pharmaron Inc. (Beijing, ROC) are indicated in Table S1. For MIC measurements done in-house, OD600 readings were measured by either a BioTek Synergy HT (Winooski, VT) or Tecan InfiniteM200 Pro (Männedorf, CH) microplate reader. Where drug concentrations are listed, they are given both as a fold change relative to the MIC for that bacterial strain and drug combination (1X MIC, 2X MIC, etc.) and as an absolute concentration (1 μg/mL, 2 μg/mL, etc.).
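The MIC definition above can be illustrated with a minimal Python sketch. It assumes that "no visible growth" can be approximated by an OD600 reading below a cutoff; the `growth_threshold` value, the function names, and the example readings are hypothetical and are not taken from the paper. The last helper converts an absolute concentration into the fold-MIC notation used in the text.

```python
def two_fold_series(top_concentration, n_wells):
    """Concentrations of a 2-fold dilution series, highest first."""
    return [top_concentration / 2 ** i for i in range(n_wells)]


def mic_from_od600(concentrations, od600, growth_threshold=0.1):
    """Lowest concentration at which this well and all higher ones show no growth.

    growth_threshold is an assumed stand-in for "no visible growth".
    Returns None if even the highest concentration permits growth.
    """
    mic = None
    # Walk from the highest concentration down and stop at the first well with growth.
    for conc, od in sorted(zip(concentrations, od600), reverse=True):
        if od >= growth_threshold:
            break
        mic = conc
    return mic


def fold_mic(concentration, mic):
    """Express an absolute concentration as a multiple of the MIC (e.g. 2.0 for 2X MIC)."""
    return concentration / mic


series = two_fold_series(64.0, 8)                           # 64, 32, ..., 0.5 (same units as the MIC)
readings = [0.04, 0.05, 0.04, 0.05, 0.45, 0.60, 0.62, 0.61]  # made-up OD600 values
mic = mic_from_od600(series, readings)
```

Walking down from the highest concentration, rather than taking the minimum over all clear wells, keeps a single skipped well from producing an artificially low MIC; that choice is a convention assumed here, not one stated in the methods.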
Compound library
Compounds were sourced from commercial vendors: MicrosourceDiversity, Aldrich, Selleckchem, Chiromics, and Chembridge. Each compound was dissolved in DMSO at 50 mM and then screened for antibiotic activity against E. coli lptD4213 growing in Terrific Broth. After normalizing for plate-to-plate variation, an OD600 of half the median plate OD600 was used as a cutoff: any compound below the cutoff was assumed to have inhibited the growth of E. coli lptD4213, and any compound above it was assumed to be ineffective. Compounds that either had not been previously identified as antibiotics or had unknown or ambiguous mechanisms of action were chosen for further investigation, and their MICs were measured using the microdilution method described above.
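The hit-calling rule in the screen can be sketched as follows. This is a minimal illustration that assumes each plate is compared with its own median, which is one simple way to absorb plate-to-plate variation; the normalization actually used is not described in the text, and the well names and readings below are invented.

```python
from statistics import median


def call_hits(plate_od600, cutoff_fraction=0.5):
    """Return wells whose OD600 falls below a fraction of the plate median.

    plate_od600 maps a well label to its OD600 reading for one plate.
    cutoff_fraction=0.5 mirrors the "half the median plate OD600" rule.
    """
    cutoff = cutoff_fraction * median(plate_od600.values())
    return sorted(well for well, od in plate_od600.items() if od < cutoff)


# Hypothetical readings for six wells of one plate.
plate = {"A1": 0.80, "A2": 0.82, "A3": 0.10, "A4": 0.78, "A5": 0.35, "A6": 0.81}
hits = call_hits(plate)
```

Using the median rather than the mean keeps the cutoff stable when a plate contains several strong inhibitors.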
Galleria mellonella killing assay

All Galleria mellonella larvae were obtained from Vita-Bugs, distributed through PetCo (2336624, San Diego, CA), and kept in a 20°C chamber. All injections were administered using a sterile 1 mL syringe with a 27G x 0.5 in needle (BD 305109) attached to a syringe pump (KD Scientific, 78-8210) and were delivered at a rate of 250 μL/min into the fourth leg of the worm. Prior to injection, the site was sterilized with ethanol. A. baumannii AB17978 (10^5 CFU/larva) and drug dissolved in DMSO (SCH-79797 at 66.6 μg/larva, gentamicin at 66.6 μg/larva, meropenem at 66.6 μg/larva) were pre-mixed prior to injection. The viability of each injected larva was determined by prodding each larva with a dowel and observing whether there was subsequent movement.
Colony forming units assay

Overnight E. coli lptD4213 or S. aureus MRSA USA300 cultures were diluted 1:100 in fresh media and grown to mid-exponential phase (OD600 = 0.4-0.6). Each culture was then diluted 1:10 into fresh media and treated with the indicated concentration of each antibiotic. Each time point was taken by removing 150 μL and performing 10-fold serial dilutions. Six dilutions of each condition were then plated in the absence of antibiotic and grown at 37°C overnight. Colony forming units (CFUs) were measured by counting the resulting number of colonies the next day.
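The arithmetic that turns colony counts from a 10-fold dilution series into a CFU/mL estimate can be sketched as below. The plated volume and the 30 to 300 colony "countable" window are assumptions introduced for illustration; neither is stated in the methods, and the counts are invented.

```python
def cfu_per_ml(colonies, dilution_step, plated_volume_ul, dilution_factor=10):
    """CFU/mL of the undiluted sample from one plate.

    dilution_step is the number of serial dilutions made before plating
    (0 = undiluted); plated_volume_ul is the volume spread or spotted, in microliters.
    """
    return colonies * dilution_factor ** dilution_step * 1000 / plated_volume_ul


def estimate_cfu_per_ml(counts_by_step, plated_volume_ul, countable=(30, 300)):
    """Average the estimates from plates whose counts fall in the countable window.

    Returns None when no plate is countable.
    """
    low, high = countable
    estimates = [
        cfu_per_ml(count, step, plated_volume_ul)
        for step, count in counts_by_step.items()
        if low <= count <= high
    ]
    return sum(estimates) / len(estimates) if estimates else None


# Hypothetical counts for the six plated dilutions of one condition (10 uL plated).
counts = {1: 900, 2: 410, 3: 42, 4: 5, 5: 0, 6: 0}
estimate = estimate_cfu_per_ml(counts, plated_volume_ul=10)
```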
S. aureus MRSA persister cell assay

Stationary phase S. aureus cells have the same antibiotic-tolerant properties as persister cells (Kim et al., 2018). Thus, overnight MRSA USA300 cultures were used to measure the effectiveness of SCH-79797 against persister cells. An overnight S. aureus MRSA USA300 culture was diluted 1:100 in PBS and then treated with the indicated concentration of each antibiotic. After a 2 hour incubation with shaking at 37°C, 150 μL from each treatment condition was taken and 10-fold serial dilutions were performed. Six dilutions of each condition were then plated in the absence of antibiotic and grown at 37°C overnight. CFUs were measured by counting the resulting number of colonies the next day.
Serial passaging assay to evolve resistance

MICs of two biological replicates of S. aureus MRSA USA300 or A. baumannii AB17978 were obtained against the indicated antibiotics. Cells that grew in 0.5X MIC of the indicated antibiotics were subsequently cultured in the absence of antibiotics, and the MIC was then remeasured. This cycle constitutes a single passage and was repeated for 5-30 passages. Cells from each passage were stored as a frozen stock. Resistance was confirmed by comparing the MICs of resistant mutants against those of S. aureus MRSA USA300 or A. baumannii AB17978 that had not been previously exposed to the indicated antibiotics.
Bacterial cytological profiling

Compound library
Following the protocol of Nonejuie et al. (2013), overnight E. coli lptD4213 cultures were diluted 1:100 in LB and grown at 30°C for 90 minutes on a roller drum. Each culture was then treated with 5X the MIC and incubated at 30°C with slight agitation. Following antibiotic treatment, cells were stained with 0.5 μM SYTOX Green, 1 μg/mL FM4-64, and 2 μg/mL DAPI and incubated for 10 minutes. Each stained culture was then centrifuged at 3220 rcf for 40 s and resuspended in 1/10 volume of LB. Cells were spotted onto a 1.2% agarose pad in 20% LB medium for imaging. Images were collected on a Nikon 90i upright microscope equipped with a 100X 1.4 N.A. objective (Nikon Instruments Inc., Melville, NY) and a RoleraXR (Photometrics, Tucson, AZ) camera. Microscope control and image acquisition were performed in NIS Elements.
Dual-treatment
In dual-treatment experiments (Figures 6A and S6), overnight E. coli lptD4213 cultures were diluted 1:100 and grown to early-mid exponential phase (OD600 = 0.4-0.6). Each culture was then diluted 1:10 into fresh LB and treated with the desired concentration of antibiotic for 10 minutes. Following antibiotic treatment, cells were stained with 0.5 μM SYTOX Green, 1 μg/mL FM4-64, and 2 μg/mL DAPI. Each stained culture was then spotted onto a 1.5% agar pad supplemented with casamino acids and glucose in M63 (15 mM (NH4)2SO4, 100 mM KH2PO4, 1.7 μM FeSO4, 0.5% glucose, 0.2% casamino acids, 1 mM MgSO4). Images were collected on either a Nikon 90i upright microscope equipped with a 100X 1.4 N.A. objective (Nikon) or a Nikon Ti-E inverted microscope equipped with a 100X 1.4 N.A. objective. Both microscopes utilize an Orca Flash4 camera (Hamamatsu, Bridgewater, NJ). Microscope control and image acquisition were performed in NIS Elements (Nikon).
Image analysis
Following imaging, the E. coli cells were segmented (Ursell et al., 2017) and single-cell features were extracted using custom MATLAB code (MathWorks, Natick, MA). Clustering was performed using the single-linkage method with MANOVA (Figure 2, MATLAB functions manova1, manovacluster). Before applying the UMAP dimensionality reduction technique (McInnes et al., 2018), each of the 14 cytological features was normalized into a z-score by subtracting the mean and dividing by the standard deviation of that feature (Figure S2). Principal component analysis was performed using the prcomp function in R and clustering was performed using the single-linkage method (Figure 6).
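The per-feature z-score normalization described above can be sketched in Python. The authors' analysis was done in MATLAB and R, so this is only an illustration: the function name and toy data are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def zscore_columns(rows):
    """Normalize each feature (column) to zero mean and unit standard deviation.

    rows is a list of per-cell feature vectors of equal length.
    A feature with zero variance would raise ZeroDivisionError; such
    features would need to be dropped beforehand.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]  # sample SD, assumed
    return [
        [(value - mu) / sd for value, mu, sd in zip(row, means, sds)]
        for row in rows
    ]


# Toy example: three cells, two features on very different scales.
cells = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = zscore_columns(cells)
```

After normalization both features span the same range, so neither dominates a distance-based embedding or clustering step simply because of its units.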
Thermal proteome profiling

Thermal proteome profiling experiments were performed following Mateus et al. (2018). Whole cell samples were treated with 0.6, 1.1, 2.2, or 4.4 μg/mL SCH-79797 or 0.1, 0.5, 2.3, or 11.6 μg/mL trimethoprim. Cell lysate samples were treated with 0.4, 1.8, 8.9, or 44.4 μg/mL SCH-79797 or 0.1, 0.5, 2.3, or 11.6 μg/mL trimethoprim. After heat treatment, the soluble fraction was collected and digested with trypsin, and peptides were labeled with tandem mass tags (Werner et al., 2014). Samples were subjected to two-dimensional liquid chromatography and analyzed on a Q Exactive Plus mass spectrometer (Thermo Fisher Scientific). While the data collected are proteome-wide, three different cutoffs were used to define three classes of potential targets. The color of the point indicates the signal maximal effect size across all temperatures, and the largest change in abundance across all concentrations was selected for continued analysis (Dots, Figure 3). A mild effect indicates at least one temperature had a change in abundance of at least 25% in both whole cell and cell lysate treatments (Triangles, Figure 3). A further class comprises proteins that had a change in abundance of at least 25% at three or more temperatures (Squares, Figure 3). To be considered a consistent effect, the change in abundance of the protein had to show the same sign at least 90% of the time and have an effect size of at least 1 (2-fold) in either whole cells or cell lysates.
Metabolomics
Overnight E. coli NCM3722 cultures were diluted 1:100 in Gutnick Minimal Media and grown to early-mid exponential phase (OD600 = 0.4-0.6). Cultures were treated with either 13.9 μg/mL SCH-79797 (1X MIC) or 0.15 μg/mL trimethoprim (1X MIC) for 15 minutes. Folates were extracted by vacuum filtering 15 mL of treated cells using 0.45 μm HNWP Millipore nylon membranes and immediately placing filters into an ice-cold quenching solution containing 40:40:20 methanol:acetonitrile:25 mM NH4OAc + 0.1% sodium ascorbate in HPLC-grade water. The resulting solution was then centrifuged at 16,000 × g for 1.5 min at 4°C and the supernatant saved for mass spectrometry analysis. Mass spectrometry analysis was performed as described in Chen et al. (2017).
Dihydrofolate reductase activity assay

E. coli dihydrofolate reductase (FolA) was purified by Genscript (Piscataway, NJ). The enzymatic activity of FolA was measured using a colorimetric dihydrofolate reductase assay kit (Sigma CS0340) and a QuantaMaster 40 Spectrophotometer (Photon Technology International Inc., Edison, NJ). The kit measures FolA activity by monitoring the change in sample absorbance at 340 nm due to FolA-dependent NADPH consumption. For each condition, 1 μL purified enzyme in storage buffer (1X PBS, 10% glycerol, pH 7.4) was thawed on ice and added to 1000 μL 1X assay buffer immediately before assaying. After mixing the reaction components together, the 100 μL reaction volume was pipetted into a 1 mm pathlength quartz cuvette and the transmitted light intensity at 340 nm was measured for 100 s at 1 kHz sampling. These high-frequency readings were downsampled by averaging to 1 Hz, and the activity of each sample was calculated from the slope (b) of a linear regression of the log-transformed intensity measurements using MATLAB. Because the enzyme was only moderately stable in assay buffer (half-time 20 minutes), all enzymatic measurements were normalized to a standard condition (60 μM NADPH and 100 μM DHF) that was measured immediately before the sample of interest. The relative activity was calculated from (b_sample − b_noEnzyme)/(b_standard − b_noEnzyme).
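The analysis steps in this assay (block averaging, log transform, linear regression, and normalization) can be sketched in Python. The authors used MATLAB, so this is an illustration only: the function names are hypothetical, the natural logarithm is an assumption (the text says only "log transformed"), and the test signal is synthetic.

```python
import math


def downsample_mean(samples, factor):
    """Average consecutive blocks of `factor` samples (e.g. 1000 for 1 kHz -> 1 Hz)."""
    n_blocks = len(samples) // factor
    return [sum(samples[i * factor:(i + 1) * factor]) / factor for i in range(n_blocks)]


def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def activity_slope(intensity, samples_per_second=1000):
    """Slope b of log(intensity) versus time in seconds, after downsampling to 1 Hz."""
    per_second = downsample_mean(intensity, samples_per_second)
    times = list(range(len(per_second)))
    return regression_slope(times, [math.log(v) for v in per_second])


def relative_activity(b_sample, b_standard, b_no_enzyme):
    """(b_sample - b_noEnzyme) / (b_standard - b_noEnzyme), as given in the text."""
    return (b_sample - b_no_enzyme) / (b_standard - b_no_enzyme)


# Synthetic trace: intensity rising exponentially at 0.02 per second,
# sampled at 4 Hz for 10 s to keep the example small.
trace = [math.exp(0.02 * i / 4) for i in range(40)]
b = activity_slope(trace, samples_per_second=4)
```

Because the relative activity is a ratio of slope differences, the choice of logarithm base cancels out of the final value.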
Membrane potential and permeability assay

Overnight E. coli lptD4213 and B. subtilis 168 cultures were diluted 1:100 and grown to early-mid exponential phase (OD600 = 0.4-0.6) at 37°C. Each culture was then diluted 1:10 into PBS and treated with the desired concentration of antibiotic for 15 minutes. Cells were then stained with the BacLight Bacterial Membrane Potential kit (ThermoFisher B34950). This kit uses DiOC2(3) to measure a cell’s membrane potential as a ratio of green (488 nm ex, 525/50 nm em) to red (488 nm ex, 610/20 nm em) fluorescence (Novo et al., 1999). Membrane integrity was measured by staining cells with TO-PRO-3, a dye that is excluded from cells with an intact membrane (640 nm ex, 670/30 nm em). The LSRII flow cytometer (BD Biosciences) at the Flow Cytometry Resource Facility, Princeton University, was used to measure the fluorescence intensities of both dyes in response to antibiotic treatment. 100,000 events were recorded for each data file. Data were analyzed using FlowJo v10 software (FlowJo LLC, Ashland, OR).
Mammalian cell cytotoxicity

HEK293 (500 cells/well), HK-2 (500 cells/well), HLF (500 cells/well), or PBMC (3000 cells/well) cells were seeded overnight in a white, opaque, 384-well plate. After 24 hours, DMSO or compounds were added in 3-fold dilutions and incubated for 72 hours. CyQUANT Detection Reagent was added in equal volume and incubated for 1 hour. Fluorescence was read from the bottom of the plate with a standard green filter set (508/527 nm ex). Cell toxicity was evaluated by Pharmaron, Inc. (Beijing, ROC).
Mouse liver microsomal stability

The metabolic stability of IRS-16 was tested in mouse liver microsomes with NADPH and UDPGA over a period of 1 hour. IRS-16 (2 μM) and verapamil (2 μM, positive control) were added to 0.5 mg/mL microsomes. Aliquots of 50 μL were taken from the reaction solution at 0.5, 5, 15, 30, 45, and 60 minutes. The reaction was stopped by the addition of 5 volumes of cold acetonitrile containing internal standards (IS; 100 nM alprazolam, 200 nM caffeine, and 100 nM tolbutamide). Samples were centrifuged at 3,220 × g for 40 minutes. Aliquots of 100 μL of the supernatant were mixed with 100 μL of ultra-pure H2O and then used for LC-MS/MS analysis. Mouse liver microsomal stability was evaluated by Pharmaron, Inc. (Beijing, ROC).
Pharmacokinetic analysis
Male CD-1 mice, 6-8 weeks old, were injected intravenously with a single dose of IRS-16 at 15 mg/kg in 5% DMSO + 95% (20% HP-β-CD in water, w/v) and showed no adverse effects. Plasma measurements were averaged from 2 mice at the indicated time points following administration. Concentration was determined by LC/MS. The pharmacokinetic assay was performed by Pharmaron, Inc. (Beijing, ROC).
Neisseria gonorrhoeae vaginal infection model

Groups of 5 ovariectomized BALB/c mice 5 weeks of age were used. Animals were subcutaneously injected with a water-soluble form of estradiol at 0.23 mg/mouse on days −2, 0, 2, 4, and 6. Starting from day −2 and continuing after inoculation to the study end, mice were dosed twice daily with streptomycin (1.2 mg/mouse) and vancomycin (0.6 mg/mouse) by IP injection to minimize the vaginal flora. Trimethoprim sulfate (0.04 g/100 mL) was administered in drinking water. At day 0, mice were anesthetized with ketamine (100 mg/kg) and xylazine (10 mg/kg) by IP injection and then inoculated intravaginally with 1.0-2.0 × 10^6 CFU of N. gonorrhoeae (FA1090, ATCC 700825). Before inoculation, the mouse vagina was first rinsed with 30 μL of 50 mM HEPES (pH 7.4), followed by inoculation with 20 μL gonococci suspension in PBS containing 0.5 mM CaCl2 and 1 mM MgCl2. IRS-16 or vehicle was administered 2 h and 12 h after inoculation. Mice were euthanized by CO2 asphyxiation after 2 h and 24 h. Lavage was performed with 400 μL GC broth containing 0.05% saponin to recover vaginal bacteria. The bacterial suspensions were plated onto chocolate agar to determine the N. gonorrhoeae counts. The N. gonorrhoeae vaginal infection model was performed by Pharmacology Discovery Services (New Taipei City, TW).
Center, spread, and statistical significance

For all assays except the Neisseria gonorrhoeae vaginal infection model, we use the arithmetic mean and standard deviation across
multiple biological replicates as our measures of center and spread. The particular numbers of replicates for each experiment
type are included in the respective figure legends. For the Neisseria gonorrhoeae vaginal infection model, we display the median, Q1,
Q3, and the values for each of the five individual mice in each sample cohort (Figure 7G).
Statistical significance for Galleria mellonella killing assays was determined by the Mantel-Cox test using Prism 8.1.2 (GraphPad,
San Diego, CA). Each of three independent cohorts started with twelve animals. p values can be found in Table S2. For the Neisseria
gonorrhoeae vaginal infection model, significance (p < 0.05) was determined by a one-factor ANOVA.
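The measures of center and spread described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names, the use of the sample (n − 1) standard deviation, and the "inclusive" quartile convention are assumptions made for illustration, and the numbers are hypothetical rather than data from this study.

```python
import statistics


def mean_sd(values):
    """Arithmetic mean and sample (n - 1) standard deviation across replicates."""
    return statistics.mean(values), statistics.stdev(values)


def median_quartiles(values):
    """Median, Q1, and Q3 using the 'inclusive' quartile convention (an assumption)."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return med, q1, q3


# Hypothetical values, for illustration only (not data from this study).
replicate_readout = [0.82, 0.91, 0.87]          # e.g., three biological replicates
cohort_log10_cfu = [3.1, 4.0, 4.4, 5.2, 5.9]    # e.g., five mice in one cohort

center, spread = mean_sd(replicate_readout)
median, q1, q3 = median_quartiles(cohort_log10_cfu)
```

Different software packages use slightly different quartile conventions, so Q1 and Q3 computed this way may differ marginally from values reported by other tools for small cohorts.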
Cell 181, 1518–1532.e1–e6, June 25, 2020 e5

Bacterial cytological profiling

Hierarchical clustering of the cytological features was performed using the single-linkage method with MANOVA (Figure 2, MATLAB
functions manova1, manovacluster). Before applying the UMAP dimensionality reduction technique (McInnes et al., 2018), each of
the 14 cytological features was normalized into a z-score by subtracting the mean and dividing by the standard deviation of that
feature (Figure S2). Principal component analysis was performed using the prcomp function in R, and clustering was performed using
the single-linkage method (Figure 6). Details of the dimensionality reduction technique and hierarchical clustering method are
included in the respective figure legends.
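The per-feature z-score normalization described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline. The function name and the choice of the sample (n − 1) standard deviation are assumptions, and the example matrix is hypothetical.

```python
import statistics


def zscore_features(rows):
    """Z-score each feature (column): subtract its mean, divide by its SD.

    `rows` is a list of samples, each a list of feature values.
    Uses the sample (n - 1) standard deviation, which is an assumption here.
    A feature with zero variance would raise ZeroDivisionError.
    """
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


# Hypothetical matrix: 3 samples x 2 features (the study used 14 features).
example = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = zscore_features(example)
```

After this step every feature has mean 0 and unit standard deviation, so features measured on different scales contribute comparably to a distance-based embedding.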
Thermal proteome profiling

While the data collected are proteome-wide, three different cutoffs were used to define three classes of potential targets. The color of
each point indicates the single maximal effect size across all temperatures (nTemp = 10), and the largest change in abundance across
all concentrations (nConc = 4) was selected for continued analysis (dots, Figure 3). A mild effect indicates that at least one temperature
had a change in abundance of at least 25% in both whole-cell and cell-lysate treatments (triangles, Figure 3). Proteins that had a
change in abundance of at least 25% at three or more temperatures are shown as squares (Figure 3). To be considered a consistent effect,
the change in abundance of the protein had to show the same sign at least 90% of the time and have an effect size of at least 1
(2-fold) in either whole cells or cell lysates. The details for these symbols and cutoffs are included in the respective figure legend.
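One possible reading of these cutoffs is sketched below in Python. It is a sketch under stated assumptions, not the authors' implementation: abundance is assumed to be expressed as a ratio to the vehicle control at each temperature, a "25% change" is taken as a ratio differing from 1 by at least 0.25, an "effect size of at least 1 (2-fold)" is taken as an absolute log2 ratio of at least 1, and all function names are hypothetical.

```python
import math


def n_changed(ratios, cutoff=0.25):
    """Count temperatures whose abundance ratio differs from 1 by >= cutoff."""
    return sum(1 for r in ratios if abs(r - 1.0) >= cutoff)


def is_mild(whole_cell, lysate):
    """At least one temperature changed by >= 25% in BOTH treatments."""
    return n_changed(whole_cell) >= 1 and n_changed(lysate) >= 1


def is_multi_temperature(ratios):
    """Change of >= 25% at three or more temperatures."""
    return n_changed(ratios) >= 3


def is_consistent(ratios):
    """Same sign in >= 90% of measurements and |log2 ratio| >= 1 at least once."""
    logs = [math.log2(r) for r in ratios]
    up = sum(1 for v in logs if v > 0)
    down = sum(1 for v in logs if v < 0)
    same_sign_fraction = max(up, down) / len(logs)
    return same_sign_fraction >= 0.9 and max(abs(v) for v in logs) >= 1.0
```

Because the consistent-effect criterion applies to either whole cells or lysates, a caller would combine it as `is_consistent(whole_cell) or is_consistent(lysate)`.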
Flow cytometry analysis

100,000 events were recorded for each flow cytometry sample and analyzed using FlowJo v10. Gates used to quantify the fraction of
events with depolarized or permeabilized membranes were determined from control samples treated with CCCP or nisin, respectively.
These details are included in the respective figure legends (Figures 5C, 7D, S5, and S7B).
Pharmacokinetic analysis
Plasma measurements were averaged from 2 mice at the indicated time points following administration. After the rapid initial
approach to pseudoequilibrium, filled data symbols were used as the input for terminal half-life determination (Figure S7D).
Pharmacokinetic parameters were estimated from a non-compartmental model of IRS-16 serum levels (Figure S7D). These details are
included in the respective figure legends.
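A terminal half-life is commonly obtained in non-compartmental analysis by fitting a straight line to the natural log of concentration versus time over the terminal phase. The Python sketch below illustrates that standard approach. It is an assumption that this matches the fitting procedure used for IRS-16, the function name is hypothetical, and the example values are synthetic rather than measured data.

```python
import math


def terminal_half_life(times_h, concentrations):
    """Least-squares fit of ln(C) = a - lambda_z * t; returns ln(2) / lambda_z.

    Only terminal-phase points (after the initial distribution phase)
    should be passed in. Concentrations must be positive.
    """
    logs = [math.log(c) for c in concentrations]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    lambda_z = -sxy / sxx  # terminal elimination rate constant (1/h)
    return math.log(2) / lambda_z


# Synthetic terminal-phase points generated with lambda_z = 0.5 per hour.
times = [1.0, 2.0, 4.0, 8.0]
conc = [100.0 * math.exp(-0.5 * t) for t in times]
half_life_h = terminal_half_life(times, conc)
```

With only two animals per time point, the fit operates on the averaged concentrations, so the estimate carries no per-animal uncertainty.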
Figure S1. SCH-79797 Is Bactericidal against Staphylococcus aureus, Exhibits Undetectably Low Rates of Resistance, and Is an Effective
Antibiotic in an Infection Model of Galleria mellonella by Acinetobacter baumannii, Related to Figure 1
A. Colony forming units (CFU mL⁻¹) after 3-hour treatment of S. aureus MRSA USA300 with solvent only (1% DMSO), 6.3 µg/mL SCH-79797 (1X MIC), or
4.0 µg/mL novobiocin (5X MIC). Each data point represents 3 independent samples and 3 technical replicates. Mean ± SD are shown. B. Fold increase in
resistance of S. aureus MRSA USA300 to SCH-79797, novobiocin, trimethoprim, or nisin after 25 days of serial passaging in each drug. C. Fold increase in
resistance of A. baumannii AB17978 to SCH-79797 and gentamicin after 5 days of serial passaging in each drug. D. Fold increase in the susceptibility of S. aureus
MRSA USA300 mutants to the indicated antibiotics. Trimethoprim and nisin resistant mutants were obtained from serial passaging in respective antibiotics. E-F.
The percent survival of non-infected G. mellonella wax worms after treatment with 2 µL/larva of 100% DMSO, 67 µg/larva SCH-79797, 67 µg/larva gentamicin,
67 µg/larva rifampicin, or 67 µg/larva meropenem. Data in (E) represent a typical cohort (n = 12) from a biological triplicate, and the pooled results are presented in
(F). p values were determined from a Mantel-Cox test using Prism (n.s., p ≥ 0.05; *p < 0.05). G. The percent survival of G. mellonella wax worms infected with
A. baumannii (AB 17978) and concomitantly treated with 67 µg/larva SCH-79797, 67 µg/larva gentamicin, 67 µg/larva rifampicin, or 67 µg/larva meropenem. Data
represent the pooled results from a biological triplicate. H. The survival of G. mellonella wax worms infected with A. baumannii (AB 17978)
and treated concomitantly with antibiotics, relative to larvae treated with antibiotics only without infection. Data represent the pooled results from a biological
triplicate.
Figure S2. Bacterial Cytological Profiling of SCH-79797 and Antibiotics with Known Mechanisms of Action, Related to Figure 2
The UMAP dimensionality reduction of all treatments was replotted using two color channels, one for only SCH-79797 treatments and one for the remaining
conditions. Density was smoothed with a Gaussian kernel of width σ = 0.1.
Figure S3. Thermal Stability of Dihydrofolate Reductase Increases after SCH-79797 and Trimethoprim Treatment, Related to Figure 3
A-B. The relative thermal stability of E. coli dihydrofolate reductase (FolA) after treatment of whole cell and cell lysate samples with (A) SCH-79797 and (B)
trimethoprim. Changes in thermal stability were determined by measuring changes in the abundance of FolA across 10 different temperatures ranging from
42 to 72 °C, with 4 drug concentrations and a vehicle control.
Figure S4. CRISPRi Mutants Not Involved in Folate Metabolism Are Not Sensitized to SCH-79797, Related to Figure 4
A. The growth of CRISPRi B. subtilis knockdown mutants relative to a DMSO-treated control after SCH-79797 treatment. Bacterial growth was measured for 14 h
and the final optical density (OD600) of each condition was plotted against drug concentration. Each data point represents 2 independent replicates. Mean ± SD
are shown.
Figure S5. Treatment with Ampicillin, Rifampicin, and Novobiocin Does Not Disrupt Membrane Integrity while SCH-79797 Disrupts Bacillus
subtilis 168 Membrane Integrity, Related to Figure 5
A-D. Flow cytometry analysis of the membrane potential and permeability of E. coli lptD4213 cells after 15-minute incubation with (A) 0.06 µg/mL ampicillin (2X
MIC), (B) 0.002 µg/mL rifampicin (2X MIC), (C) 3.1 µg/mL SCH-79797 (1X MIC), or (D) 0.12 µg/mL novobiocin (2X MIC). The limits for the depolarized region were
defined by comparing the values in the CCCP and solvent only controls. The limits for the permeabilized region were defined by comparing the nisin and solvent
only controls. E-G. Flow cytometry analysis of the membrane potential and permeability of B. subtilis 168 cells after 15-minute incubation with (E) 1% DMSO, (F)
3.1 µg/mL SCH-79797 (1X MIC), or (G) 6.2 µg/mL SCH-79797 (2X MIC). The limits for the regions were defined from the solvent only controls.
Figure S6. SCH-79797 Mimics Co-treatment with Folate Metabolism and Alternate Membrane Integrity Disruptors, Related to Figure 6
BCP analysis of E. coli lptD4213 cells after 30 minutes of treatment with 1% DMSO, 6.3 µg/mL SCH-79797 (1X MIC), 2 µg/mL trimethoprim (10X MIC), 0.8 µg/mL
polymyxin B (2X MIC), or the combination of 2 µg/mL trimethoprim (10X MIC) and 0.8 µg/mL polymyxin B (2X MIC). Cytological profiles were clustered by the first
three principal components that account for at least 90% of the variance between samples. Cells were stained with DAPI, FM4-64, and SYTOX Green. Shown
here are the merged images of DAPI (blue) and FM4-64 (red). Scale bar is 2 µm.
Figure S7. Cumene (Isopropylbenzene) Disrupts Membrane Integrity with No Additional Sensitivity in Folate-Metabolism Mutants and
Pharmacokinetic Analysis of IRS-16 Stability, Related to Figure 7
(A) Structure of cumene. (B) Flow cytometry analysis of the membrane potential and permeability of E. coli lptD4213 cells after 15-minute incubation with
17000 µg/mL cumene (1X MIC). The limits for the depolarized region were defined by comparing the values in the CCCP and solvent only controls. The limits for
the permeabilized region were defined by comparing the nisin and solvent only controls. (C) The growth of CRISPRi B. subtilis knockdown mutants relative to a
DMSO-treated control after cumene treatment. Bacterial growth was measured for 14 h and the final optical density (OD600) of each condition was plotted against
drug concentration. (D) Plasma concentrations of IRS-16 were measured following a single 15000 µg/kg IV dose. After the rapid initial approach to
pseudoequilibrium, filled data symbols were used as the input for terminal half-life determination. (E) Pharmacokinetic parameters estimated from a
non-compartmental model of IRS-16 serum levels.
Article
Interpersonal Gut Microbiome Variation Drives
Susceptibility and Resistance to Cholera Infection
Salma Alavi, Jonathan D. Mitchell,
Jennifer Y. Cho, Rui Liu, John C. Macbeth,
Ansel Hsiao
Correspondence
ansel.hsiao@ucr.edu
In Brief
Differences in the gut microbiome
between individuals determine resistance
to cholera infection through the effects on
the activity of a bile salt enzyme.
Highlights
• Interpersonal human gut microbiome variation confers variable infection resistance
• Microbiome-dependent infection resistance can be restored through co-transplantation
• Colonization resistance is mediated through the bile salt hydrolase enzyme activity
• Bile salt hydrolase abundance in human microbiomes correlates to final infection
Alavi et al., 2020, Cell 181, 1533–1546

Interpersonal Gut Microbiome Variation Drives
Susceptibility and Resistance to Cholera Infection
Salma Alavi,1,5 Jonathan D. Mitchell,1,5 Jennifer Y. Cho,1,2 Rui Liu,1,3 John C. Macbeth,1,4 and Ansel Hsiao1,6,*
1Department of Microbiology and Plant Pathology, University of California, Riverside, Riverside, CA, USA
2Department of Biochemistry, University of California, Riverside, Riverside, CA, USA
3Graduate Program in Genetics, Genomics, and Bioinformatics, University of California, Riverside, Riverside, CA, USA
4Division of Biomedical Sciences, School of Medicine, University of California, Riverside, Riverside, CA, USA
6Lead Contact
*Correspondence: ansel.hsiao@ucr.edu
SUMMARY
The gut microbiome is the resident microbial community of the gastrointestinal tract. This community is highly diverse, but how microbial diversity confers resistance or susceptibility to intestinal pathogens is poorly understood. Using transplantation of human microbiomes into several animal models of infection, we show that key microbiome species shape the chemical environment of the gut through the activity of the enzyme bile salt hydrolase. The activity of this enzyme reduced colonization by the major human diarrheal pathogen Vibrio cholerae by degrading the bile salt taurocholate that activates the expression of virulence genes. The absence of these functions and species permits increased infection loads on a personal microbiome-specific basis. These findings suggest new targets for individualized preventative strategies of V. cholerae infection through modulating the structure and function of the gut microbiome.
INTRODUCTION

Gastrointestinal infections represent a major global health concern. One major human diarrheal pathogen is Vibrio cholerae, the etiologic agent of the severe disease cholera that affects millions of people annually (Clemens et al., 2017). V. cholerae cycles between aquatic reservoirs and the small intestine, requiring the coordinated regulation of environmental fitness genes versus virulence genes including the attachment factor toxin co-regulated pilus (TCP) and cholera toxin (CT) (Herrington et al., 1988; Miller et al., 1987). In this and other pathogens, regulation depends on the chemical state of the gut, shaped by the gut microbiome, the dense resident gut microbial community (Eckburg et al., 2005) that varies dramatically across host species and across individuals as a function of diet, geography, and environmental insults (Yatsunenko et al., 2012). In cholera-endemic areas, the gut microbiome is subject to mutually reinforcing pressures: malnutrition, leading to reduced host infection resistance, repeated diarrhea, and poorly controlled antimicrobial usage in an attempt to mitigate the resulting sequelae. Previous 16S ribosomal gene studies of the gut microbiome in these areas demonstrate that these factors are able to drive the gut microbiome into a characteristic dysbiotic state, dominated by Streptococci such as Streptococcus salivarius, Enterococci, and Enterobacteriaceae. This configuration has been shown to be inducible by malnutrition (Subramanian et al., 2014), and diarrhea irrespective of etiology, including enterotoxigenic Escherichia coli, V. cholerae, and rotavirus infection (David et al., 2015; Hsiao et al., 2014; Kieser et al., 2018).
Generalized model "healthy" microbial communities of the human gut have been shown to be resistant to V. cholerae infection (Hsiao et al., 2014), and associative metagenomic studies have examined how the microbiome differs between cholera patients and household contacts that did not exhibit disease symptoms (Midani et al., 2018). Yet, few studies have mechanistically explored how interpersonal microbiome variation can drive pathogen susceptibility. Here, we show that the post-malnutrition/post-diarrheal dysbiotic community is highly vulnerable to subsequent infection. Moving beyond a dichotomous "normal" versus "dysbiotic" comparison, we show that microbiome differences among healthy humans drive striking differences in susceptibility. We show that fecal studies in animals and potentially humans may have limited utility for studies of community interactions with pathogens of the small intestine, and that microbiome-dependent infection susceptibility at the small intestine can be rescued by microbiome transplantation. In order to identify commensals that strongly interact with enteropathogens across many community contexts, we established an efficient unbiased experimental pipeline that revealed that the commensal species Blautia obeum can suppress virulence. We identify an enzymatic mechanism in B. obeum that degrades the host-produced virulence-inducing molecule taurocholate (TC), which B. obeum uses alongside other mechanisms (Hsiao et al., 2014) to suppress V. cholerae virulence gene activation and colonization.
Figure 1. Model Human Gut Microbiomes Replicate Structure of Communities Affected by Diarrhea-Induced Dysbiosis
(A) Defined human gut communities.
(B) Composition of healthy US human donor fecal microbiomes.
(C) Principal coordinates analysis of defined and complete human gut microbiomes based on weighted UniFrac distance, % variance explained shown in
parentheses. Ellipses show 95% confidence intervals.
(D) Weighted UniFrac distance to indicated defined human model microbiomes of fecal samples from cholera patients at the end of diarrhea (left) and healthy
human donors (right) *p < 0.05, ****p < 0.0001, Mann-Whitney U-test. Boxplots show inter-quartile range, whiskers minimum to maximum.
RESULTS

Dysbiotic Microbiomes Are Susceptible to V. cholerae Colonization, and Pathogen Resistance Can Be Rescued by Microbiome Transplantation
We took a two-pronged approach to study the effects of microbiome variation on pathogen resistance, both involving reconstituting human gut microbiomes in animal models of V. cholerae colonization and virulence. The first involved the construction of defined gut communities using cultured human isolates (Figure 1A), and the second involved studies with complete human fecal microbiomes (Figure 1B).
As the basis for designing defined model communities, we compared fecal microbiomes of a healthy adult volunteer cohort in the United States (Figure 1C; Table S2C) and previously published 16S ribosomal RNA gene sequencing of Bangladeshi adults (Hsiao et al., 2014; Subramanian et al., 2014). Previous studies in Bangladesh revealed that cholera drives the human
1534 Cell 181, 1533–1546, June 25, 2020

gut microbial community to a highly dysbiotic, low-diversity state dominated by Streptococci, which recovers to a configuration similar to non-diarrheal individuals over the course of weeks after the cessation of acute disease (Hsiao et al., 2014). This has also been observed in other diarrheal infections, such as enterotoxigenic E. coli and rotavirus (David et al., 2015; Kieser et al., 2018), and other gut pathologies, such as severe malnutrition (Subramanian et al., 2014). Principal coordinates analysis (PCoA) of a human cohort from Bangladesh (Hsiao et al., 2014) displays the dysbiosis caused by cholera (Figure 1C, Diarrhea (start)) and community structure weeks after the cessation of diarrhea (Diarrhea (end) to Recovery (end)), when the microbiome becomes similar to that of individuals in the same area not suffering from acute malnutrition or diarrhea (Subramanian et al., 2014). Interpersonal microbiome variation in Bangladesh was far higher than among healthy US individuals sampled as part of this study; indeed, some Bangladeshi "healthy" microbiomes closely resemble cholera-diarrheal communities. As infectious diarrhea and malnutrition are frequent in cholera endemic areas, we hypothesized that the distinctive dysbiotic microbiome structure observed during recovery from multiple sources of environmental insult to the gut may be a recurring window of vulnerability to cholera.
We then used human-derived isolates to assemble defined gut communities (Figure 1A) based on these metagenomic analyses. One model microbiome ("CR") was based on metagenomic surveys of healthy individuals, characterized by high taxonomic diversity but commonly including members of the genera Bacteroides, Clostridium, and Blautia (Arumugam et al., 2011; Qin et al., 2010; Yatsunenko et al., 2012). Another ("DS") model microbiome is characteristic of the dysbiotic state found in cholera-endemic areas, comprising Streptococci, Enterococcus faecalis, and E. coli. 16S sequencing analysis confirmed that the CR community is more similar to healthy human microbiomes than dysbiotic diarrheal microbiomes, while the DS model community was more similar to microbiomes at the conclusion of cholera (Figures 1C and 1D).
We grew bacterial species from each defined group in pure culture and used culture optical density (OD600) to introduce equivalent amounts of each member species with V. cholerae to germfree (GF) adult C57BL/6 mice by intra-gastric gavage. Overall, microbial load during infection was equivalent as measured by qPCR of 16S gene levels (Figure 2F). Mice that received the CR microbiome at infection were resistant to V. cholerae colonization, compared to animals receiving DS microbes (Figures 2A and 2B). We observed these colonization phenotypes in both feces and in the medial and distal thirds of the small intestine. In prior studies in adult mice, small intestinal colonization by V. cholerae required antibiotics (Freter, 1955, 1956) and ketamine anesthesia (Olivier et al., 2009). Our results with human, as opposed to murine, gut bacteria suggest that microbiome differences across host species and inter-individual variation within host species both play key roles in determining pathogen susceptibility. Significantly, we could restore colonization resistance by mixing the CR and DS bacteria, suggesting that susceptibility is reversible through microbiome modification (Figures 2A and 2B). In "Mix" groups, where mice were administered a 1:1 mixture of CR and DS, V. cholerae levels were significantly less compared to that in DS mice, in fact dropping below the level of CR mice 2 days post-infection.
We also observed increased colonization susceptibility of DS microbiomes when compared to a simplified model healthy microbiome ("SR") when GF mice were colonized with defined communities for 2 weeks prior to introduction of V. cholerae (Figure 2C). The SR community consisted of three species representing major phylogenetic lineages commonly found in the healthy human gut. To model an attempted microbiome restoration of a fully established and dense gut community, we also introduced DS microbes for 10 days, followed by a gavage of SR microbes 4 days prior to infection with V. cholerae. In this Mix group, pathogen colonization was strongly inhibited compared to DS-colonized mice, suggesting that microbiome modification could be used to restore colonization resistance even to entrenched dysbiotic communities.
We then profiled gut microbiome structure during infection in feces and small intestines of gnotobiotic animals with different communities (Figures 3A–3C and 3G). In these samples, the CR and DS communities were distinct and the CR community more similar to complete fecal microbiomes of healthy US donors, while co-inoculation of CR and DS led to an intermediate final microbiome.

Non-dysbiotic Human Microbiomes Reduce Virulence Gene Expression and Colonization of V. cholerae in a Suckling Mouse Model of Infection
While gnotobiotic adult mice serve as a useful microbial-interaction model, we extended our studies to suckling mice, where the pathology and virulence gene expression observed during V. cholerae infection is closer to that of humans (Klose, 2000). First, we recapitulated the CR and DS resistance phenotypes in suckling GF C57BL/6 animals (Figure 2D). We then constructed a more accessible and scalable model of microbiome-
Figure 2. Gut Microbiome Composition Contributes to V. cholerae Infection Resistance

(A and B) V. cholerae colonization in germfree adult mice harboring defined model communities co-gavaged with V. cholerae in (A) feces and (B) small intestines
4 days post-infection.
(C) Fecal V. cholerae colonization in GF adult mice harboring defined communities for 2 weeks and then gavaged with V. cholerae. Mix: 10 days DS colonization,
followed by SR microbes 4 days prior to V. cholerae infection.
(D) Intestinal V. cholerae colonization of GF suckling mice co-gavaged with model communities and V. cholerae.
(E) Intestinal V. cholerae colonization of antibiotic-treated CD-1 suckling mice co-gavaged with indicated communities.
(F) 16S gene abundance in small intestine.
(G) Expression of tcpA in infected mice with model human microbiomes.
(H) T6SS effects on small intestine colonization in antibiotic-treated CD-1 pups. *p < 0.05, **p < 0.01, ***p < 0.001 (Mann-Whitney U-test); n.s., not significant. Error
bars represent mean ± SEM. n = 6–12 animals for all experiments.
Figure 3. Addition of the CR Model Human Microbiome to Mice Hosting DS Microbes Yields a Community Structure Closer to Complete
Fecal Communities of Healthy Human Volunteers
(A–F) Principal coordinates analysis (PCoA) of microbial community diversity based on weighted UniFrac distance, % variance explained shown in parentheses
for each axis. Ellipses show 95% confidence intervals. (A) PCoA of fecal samples and distal third of small intestine of GF mice with model communities during
V. cholerae infection compared to healthy US donor fecal samples, with (B) PC1 positions and (C) all pairwise weighted UniFrac distances to healthy US donor
fecal samples. (D) PCoA of model communities and healthy human donor communities in suckling mice, with (E) PC1 positions and (F) all pairwise weighted
UniFrac distances to healthy US donor fecal samples.
pathogen interaction by clearing the native murine flora of CD-1 pups with streptomycin before introduction of human-associated species. Using this system, we observed similar microbiome-dependent infection outcomes; competitive CR/DS transplantation yielded a dominant CR-like phenotype (Figure 2E), while CR and DS colonization load did not differ in non-Vibrio-infected animals (Figure 2F). This pattern was reflected in 16S sequencing data (Figures 3D–3F; Table S3). During infection, pups receiving CR microbes had very different community structure (with Vibrio reads filtered) compared to animals with DS microbes, and animals receiving a mixed inoculum (CR+DS) closely resembled CR mice.
Total microbial diversity was not a dominant factor in infection resistance, because we restored colonization resistance in DS mice to almost full CR levels by transplanting only a small subset of CR species (SR). Expression levels of the key colonization factor tcpA were reduced 9.7-fold in SR compared to DS animals (Figure 2G). We did not observe significant microbiome-dependent differences in cholera toxin gene expression, diarrhea, or fluid accumulation in these animals (Figure S1).
Recent studies have shown type VI secretion system (T6SS) killing of murine commensal E. coli acts to drive increased virulence in infection of suckling mice (Zhao et al., 2018). As our DS model community contains E. coli, albeit a different strain, we tested the effects of T6SS on V. cholerae colonization and E. coli levels in our experimental system. A T6SS ΔvasK mutant was deficient for colonization compared to wild-type V. cholerae in antibiotic-cleared suckling mice (Figure 2H). However, T6SS activity did not significantly alter levels of a co-inoculated streptomycin-resistant K12 E. coli, and we observed E. coli at comparable levels in DS and Mix (DS+CR) communities in the small intestine (Figure 3G). These differences may be E. coli strain-specific, or due to the much higher levels of V. cholerae used in previous T6SS studies.
Together, our data suggested that the mechanism for improved colonization resistance of CR/SR microbes lay in T6SS-independent manipulation of V. cholerae virulence gene expression.

Inter-individual Variation in Pathogen Colonization Resistance of Human Gut Microbiomes
Our microbiome transplantation system in suckling mice allowed us to screen numerous intact human fecal microbiomes collected from healthy adult volunteers without malnutrition or recent antibiotic usage or diarrhea for effects on V. cholerae (Figure 1B). These complete fecal communities were taxonomically similar to the CR, but not DS microbiomes in both original microbial content and community structure upon transplantation (Figures 3D–3F). This was unsurprising, because the CR model community represented up to 73% of genus-level diversity by total relative abundance in these samples, while members of the DS community only represented <1% of the total (Table S4).
We normalized fecal slurries and transplanted these samples into antibiotic-treated suckling mice with V. cholerae (Figure 4), with dramatically different effects on V. cholerae colonization, even though these fecal communities colonized suckling animals at similar efficiencies and density (Figure 2F). We observed a 1.5-log10 range of V. cholerae colonization depending on the human donor (Figure 4), suggesting wide variation in infection outcomes based on individual gut microbiome structure. The higher basal microbiome diversity in Bangladesh (Figure 1C) suggests that interpersonal variations in infection resistance in endemic areas could be substantially higher.

A Pipeline for Randomization of Microbiome Members Identifies Commensal Species Consistently Correlated with V. cholerae Infection Outcome
We hypothesized that the CR species best able to colonize intestines with the DS community might be prophylactic for infection, because transplantation of CR microbes into DS communities reduced V. cholerae colonization. Therefore, we examined gut microbiome structure during V. cholerae infection in GF animals with a mixed CR+DS community (Figures 3A and 3G). The CR community in the small intestine was quite distinct from that in feces, but all animals with this community were consistently colonized by Blautia and Bacteroides species. The DS community was consistent in feces and small intestine, and dominated by Streptococci. The lack of generalizability of fecal data to other gut compartments suggested that fecal sampling studies may mask important differences for pathogenesis in the proximal gastrointestinal tract.
In the small intestine, B. obeum maintained its relative abundance in gnotobiotic CR+DS and CR-only animals (Figure 3H), suggesting that it may play a role in CR infection resistance and in transmitting this phenotype to DS animals. However, these findings had potentially limited translational applicability given the basal inter-personal diversity in humans; the ability to displace one dysbiotic microbiome and resist V. cholerae colonization may not be representative across diverse individuals and microbial communities. To identify Vibrio-antagonistic microbes across many different microbiome contexts, we generated random, unique, 5-member combinations drawn from CR and DS strains (Figures 5A and 5B) and established OD600-normalized mixtures of these bacteria in antibiotic-cleared suckling animals with and without Vibrio infection. We reasoned that species whose presence/absence or abundance consistently correlated against V. cholerae colonization in many different communities would be excellent putative targets for anti-Vibrio interventions.
We identified several species consistently associated with pathogen levels across multiple microbiome combinations. Higher levels of B. obeum were significantly associated with reduced V. cholerae colonization (Figure 5C), but did not directly correlate with V. cholerae abundance. This is consistent with the effects of the mixed CR/DS microbiome on infection; B. obeum consistently established in the DS small intestine, but high levels
(G) Microbiome structure during infection with V. cholerae and host reads filtered (left) and in antibiotic-treated suckling mice without V. cholerae (right).
(H) B. obeum abundance in adult GF mice containing indicated microbiomes during V. cholerae infection (4 days post-infection). *p < 0.05, ****p < 0.0001; n.s. not
significant (Mann-Whitney U-test). Error bars represent mean ± SEM.
Figure 4. Gut Microbiome Composition Contributes to Inter-personal Differences in V. cholerae Infection Resistance
Intestinal V. cholerae colonization of antibiotic-treated CD-1 pups colonized with complete fecal microbiomes from healthy US human volunteers. n = 5–7 animals for all experiments. Error bars represent mean ± SEM.
were not required to affect V. cholerae. That this effect was seen across numerous randomized
communities suggests that the inhibitory activity of B. obeum on V. cholerae infection may be broadly
generalizable across many gut microbiomes. Except for B. obeum, we found no statistically significant
effects on V. cholerae colonization based on the presence or absence of an SR or CR species. In
contrast, levels during infection of DS microbes (Streptococcus, E. faecalis, and E. coli) all
positively and significantly correlated with higher Vibrio levels (Figure 5D). In mice with
combinations with both B. obeum and S. thermophilus, V. cholerae colonization was comparable to mice
with combinations including B. obeum but not Streptococcus, again supporting the observation that
B. obeum’s effects on pathogenesis are dominant across diverse microbiomes (Figure 5C).
Because SR microbes largely recapitulated the colonization resistance of the full CR microbiome, we
performed similar analyses looking for whether combinations of different SR microbes with B. obeum
yielded lower V. cholerae colonization than when those species were present in isolation. We observed
no statistically significant additive effects on V. cholerae colonization of adding either
Bacteroides vulgatus or Clostridium scindens to B. obeum in defined communities (Figure S2).

Intestinal Signals That Induce V. cholerae Virulence Gene Expression Are Depleted by B. obeum
Having identified a candidate driver of V. cholerae infection resistance, we began to search for a
molecular mechanism for these effects. Numerous host and some commensal microbial cues regulate
V. cholerae gene expression in vivo (Gupta and Chowdhury, 1997; Kovacikova et al., 2010; Yang et al.,
2013). Prior studies have identified a role for a B. obeum-produced AI-2 autoinducer in
downregulating the expression of TCP biogenesis genes during infection (Hsiao et al., 2014).
Consistent with this finding, we observed reduced tcpA expression during infection of mice with
microbiomes containing B. obeum (Figure 2G).
In vivo conditions for virulence gene regulation can be mimicked ex vivo using
microaerophilic/anaerobic growth of V. cholerae with intestinal tissue from mice (Yang et al., 2013).
We took homogenates of suckling mouse intestine and incubated them anaerobically with a V. cholerae
lacZ:PtcpA-sh ble zeocin resistance reporter. As expected, intestinal homogenates induced tcpA
expression, while pre-treatment of intestinal homogenates with B. obeum ablated tcpA induction
(Figure 6A). As a control, we boiled intestinal homogenates that had been incubated with B. obeum
cultures in order to remove AI-2, which is heat labile (Figure S3). Strikingly, homogenates incubated
with B. obeum remained unable to induce tcpA even after boiling, in contrast to boiled homogenates
alone or homogenates incubated with S. salivarius and then boiled (Figure 6A). These data suggested
that B. obeum can deplete virulence-activating signals within the gut as well as produce
virulence-suppressing signals.

The Bile Salt Taurocholate Acts as a Potent Virulence Gene Activator and Is More Efficiently Degraded
by Commensal Microbes in Healthy but Not Dysbiotic Communities
One abundant heat-resistant molecule present in the small intestine is bile. In humans and mice, bile
acids are synthesized in the liver from cholesterol and stored in the gallbladder. These primary bile
acids are then secreted into the duodenum, where they, typically in their sodium salt form, act to
aid in the emulsification and absorption of dietary fats. More than 95% of secreted bile acids are
actively absorbed by the terminal ileum and sent back to the liver in a process known as
enterohepatic circulation (Di Ciaula et al., 2017). Bile is a highly complex mixture, although bile
acids dominate the dry weight of biliary bile (Muraca et al., 1991). Many prior studies have focused
on crude extracts from varying sources, including ruminants, containing poorly defined mixtures of
bile molecules. Human bile acids secreted into the small intestine are predominantly conjugated to
taurine (taurocholic acid/sodium taurocholate) and glycine (glycocholic acid/sodium glycocholate),
while taurine-conjugated forms predominate in mice (Li and Dawson, 2019; Sayin et al., 2013).
Commensal microbial action is important for bile acid metabolism; bacterial enzymes (bsh, bile salt
hydrolases) are able to remove the conjugated amino acids from secreted bile acids in the small
intestine, for example converting glycocholate (GC) and taurocholate (TC) to cholate/cholic acid (CA)
(Jones et al., 2008; Ridlon et al., 2006; Song et al., 2019). Indeed, in GF mice, the bile pool in
the small intestine is almost exclusively taurine-conjugated (Sayin et al., 2013).
Previous studies have shown that TC, one of the most abundant bile molecules in humans and mice, can
induce tcp
Figure 5. An Unbiased Combinatorial Strategy for Identifying Commensal Correlates of Protection Against and Susceptibility to V. cholerae Colonization
(A) Combinatorial strategy.
(B) 5-member microbiome embodiments randomly generated using CR/DS members (left). Resulting V. cholerae infection of antibiotic-treated suckling mice
containing defined microbiome embodiments are shown at right.
(C) Mean V. cholerae colonization in suckling mice bearing communities containing B. obeum or Streptococcus species alone and in combination. Data
normalized across experiments as fold colony-forming unit (CFU)-gavaged V. cholerae recovered after infection.
(D) Abundance of DS member species in randomized microbiomes correlated to resulting V. cholerae abundance after infection. Points represent mice receiving
different microbiome embodiments. *p < 0.05 (Mann-Whitney U-test); n.s., not significant. Error bars represent mean ± SEM.
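Panel (C) reports colonization as fold of the gavaged CFU recovered after infection, which lets experiments with different inocula be pooled. A minimal sketch of that normalization follows, assuming the fold value is simply recovered CFU divided by gavaged CFU; the function name `fold_recovered`, the data layout, and every number are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean


def fold_recovered(recovered_cfu, gavaged_cfu):
    """Express recovered V. cholerae CFU as a fold of the gavaged inoculum."""
    if gavaged_cfu <= 0:
        raise ValueError("gavaged CFU must be positive")
    return recovered_cfu / gavaged_cfu


# Two hypothetical experiments with different inocula (values invented).
experiments = [
    {"gavaged": 1e5, "recovered": [2e6, 4e6]},
    {"gavaged": 2e5, "recovered": [6e6, 10e6]},
]

# Normalize each animal to its own experiment's inoculum, then pool.
folds = [
    fold_recovered(recovered, exp["gavaged"])
    for exp in experiments
    for recovered in exp["recovered"]
]
mean_fold = mean(folds)
print(folds, mean_fold)
```

Normalizing per experiment before pooling keeps a larger inoculum in one experiment from being read as better colonization.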
expression under anaerobic conditions through modulating the structure and activity of the upstream
virulence activator TcpP (Yang et al., 2013). Similarly, we saw that TC activated PtcpA, while CA was
not an efficient inducer (Figure 6B). Intestinal homogenate effects on tcpA were bile-specific;
pre-treatment of homogenates with the bile-sequestering resin cholestyramine ablated their ability to
activate PtcpA (Figure 6A).
We next screened the DS and CR species for their effects on TC. We incubated TC solutions at a
physiologically relevant concentration with pure cultures of these microbes, heat-treated
Figure 6. B. obeum Exerts Effects on V. cholerae Colonization through Degradation of the In Vivo Virulence Gene Activating Signal Taurocholate (TC)
PtcpA activity normalized to tcpA induction by 125 mM TC unless noted.
(A) Modulation of tcpA-activating signals in suckling CD-1 mouse intestinal homogenates by pure cultures of B. obeum and S. salivarius, with heat treatment.
(B) Bile effects on tcp gene expression in vitro.
(C) Effects of CR and DS pure cultures on TC activation of virulence in vitro.
(D) Effects of B. obeum bsh enzyme expression on TC-mediated tcp activation in vitro.
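The legend states that PtcpA activity is normalized to the tcpA induction produced by TC. A minimal sketch of that kind of normalization is shown here, assuming each condition's reporter signal is divided by the signal of the TC-only reference; the function name `normalize_to_reference`, the condition labels, and the readings are hypothetical, and any background subtraction the authors may have applied is not modeled.

```python
def normalize_to_reference(readings, reference_key):
    """Divide every reading by the reading of the reference condition."""
    reference = readings[reference_key]
    if reference <= 0:
        raise ValueError("reference signal must be positive")
    return {name: value / reference for name, value in readings.items()}


# Hypothetical reporter readings in arbitrary units, invented for illustration.
raw_activity = {
    "TC": 500.0,
    "CA": 50.0,
    "TC + B. obeum supernatant": 75.0,
}

normalized = normalize_to_reference(raw_activity, "TC")
for condition, value in normalized.items():
    print(f"{condition}: {value:.2f}")
```

With this convention the TC-only condition is 1.0 by construction, so values well below 1.0 indicate reduced induction relative to TC.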
and filter-sterilized the resulting supernatants, and measured their ability to induce PtcpA
(Figure 6C). We observed dramatic differences in the ability of these strains to affect TC virulence
induction, with members of the CR/SR microbiomes in general being better able to prevent tcp
activation. The ability to ablate TC-mediated induction of tcp expression varied within genera, with
Blautia torques unable to affect tcp induction by TC in comparison to B. obeum, and Streptococcus
infantarius able to process TC in contrast to other DS Streptococci. Of SR species, B. vulgatus, but
not C. scindens, showed efficient TC processing.
Because the SR community largely recapitulated the CR colonization resistance phenotype, and
B. vulgatus demonstrated high activity against TC in vitro, we wanted to examine the relative
contribution of B. obeum and B. vulgatus on TC levels in the distal small intestine. We colonized
adult GF mice with CR members, or CR species without B. obeum, and measured TC and CA in the distal
third of the small intestine 2 days post-colonization, compared to GF mice (Figure S4A). As expected
given the absence of microbial bsh, GF distal small intestine showed a high TC/CA ratio, while the
presence of CR microbes efficiently processed TC to CA, yielding low TC/CA ratios. Strikingly, the
removal of B. obeum restored the TC/CA ratio in the distal small intestine to a level not
statistically significantly different from GF animals, although there was a trend toward more TC in
GF animals. This suggested that although other CR organisms can contribute to TC processing to CA,
B. obeum is particularly well suited to manipulating the level of this bile acid in the distal small
intestine. This agrees with our findings that B. obeum efficiently colonizes the small intestine
(Figures 3 and 5), and the presence of B. vulgatus and B. obeum together does not significantly
improve the ability of a microbiome to exclude V. cholerae (Figure S2).

A Bile Salt Hydrolase Enzyme Encoded by B. obeum Is Able to Degrade Virulence-Activating Signals in
the Gut
To determine the molecular basis for B. obeum’s effect on TC-dependent virulence induction, we
examined genetic determinants of bile acid processing. The B. obeum genome codes for a hypothetical
choloylglycine hydrolase (EC 3.5.1.24). Such bile salt hydrolase (bsh) enzymes catalyze the removal
of the conjugated amino acids of bile salts, for example the removal of taurine from TC to form CA.
Putative bsh genes are broadly distributed across gut microbial species, because the ability to
survive the inhibitory effects of bile is extremely important for enteric commensals (De Smet et al.,
1995). A recent study classed bacterial bsh genes into several phylotypes based on sequence
similarity and showed that bsh phylotypes have variable and substrate-dependent deconjugation
activity (Song et al., 2019). We binned predicted bsh genes in the CR and DS genomes into these
phylotypes (Table S5). By sequence alignment and predicted structure, B. obeum encodes predicted
type 1 bsh enzymes, which are highly effective at deconjugating TC, GC, glycodeoxycholate, and
taurodeoxycholate (Song et al., 2019), the strongest activators of tcpA expression in V. cholerae
(Figure S5). All DS members except S. infantarius lacked annotated bsh genes, while many CR species
encoded bsh homologs. Although S. infantarius showed high activity against TC in vitro, despite
bearing bsh homologs to phylotypes with poor predicted TC activity, we observed no statistically
significant difference in effects on V. cholerae colonization compared to S. salivarius
(Figure S4B). This suggests that there may be differences in in vivo regulation of these enzymes in
S. infantarius. Conversely, B. torques, despite encoding several putative bsh genes, was not able to
prevent tcpA induction by TC. B. vulgatus was an efficient TC processor in vitro but, despite
encoding 3 putative bsh genes, could not drive significant TC-to-CA processing in the distal small
intestine in the absence of B. obeum (Figure S4A), further suggesting that enzyme expression or
function may diverge in vitro versus in vivo.
V. cholerae also encodes a putative bsh enzyme (VCA0877). However, this is a predicted phylotype 6
bsh, which has poor activity against TC but higher activity against rarer bile acids that do not
participate in regulation of virulence but may be bacteriostatic in vivo (Table S5). V. cholerae
cannot prevent TC-mediated tcp activation in vitro, suggesting that V. cholerae has limited bsh
activity against TC in comparison to B. obeum (Figure S4C).
We constitutively expressed the B. obeum bsh RUMOBE_000028 by cloning this locus downstream of a
constitutive PLtet-O1 promoter in E. coli, generating strain bshC. This bshC strain efficiently
reduced levels of TC and tcp activation compared to the isogenic vector control in both pure TC
solutions (Figure 6D) and intestinal homogenates (Figure 6E). Significantly, given the dominant
effect of B. obeum on Vibrio resistance, pure cultures of either B. obeum or bshC reduce tcp
activation by TC (Figure 6D) and TC levels in intestinal homogenates (Figure 6E) in the presence of
S. salivarius. The activity of this B. obeum enzyme is able to affect V. cholerae in vivo, because
V. cholerae is unable to colonize mice gavaged with bshC as effectively as mice with the vector
control (Figure 6F).

Bile Salt Hydrolase Levels in Human Gut Microbial Communities Are Correlated to V. cholerae Infection
Outcome
To determine the distribution of bsh phylotypes in human gut microbiomes predicted to be dysbiotic or
healthy, we re-examined an existing deeply sequenced shotgun metagenomic dataset of human cholera
patients in Bangladesh (Table S2D) (David et al., 2015). Importantly, data were available from
patients at presentation at clinic for cholera (“Diarrhea (d0)”) without any prior antibiotic usage
and from patients that received oral antibiotics as
(E) Mass spectrometry measurement of TC in suckling CD-1 mouse intestines after incubation with pure cultures of indicated strains.
(F) V. cholerae infection of suckling CD-1 mice after 1 day of colonization with indicated E. coli strains. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001 (unpaired
Student’s t test). Error bars represent mean ± SEM. n = 3–10 for all experiments.
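The TC/CA comparison in the distal small intestine described in this section (Figure S4A) reduces to a ratio of two measured bile acid levels per animal group. A minimal sketch, assuming the ratio is taken directly on the measured amounts; the function name `tc_ca_ratio`, the group labels, and all amounts are hypothetical and chosen only to mirror the qualitative pattern reported in the text (high in GF, low with CR, high again without B. obeum).

```python
import math


def tc_ca_ratio(tc, ca):
    """Return the taurocholate-to-cholate ratio for one sample.

    Returns infinity when CA is not detected but TC is, and NaN when
    neither bile acid is detected.
    """
    if tc < 0 or ca < 0:
        raise ValueError("amounts must be non-negative")
    if ca == 0:
        return math.inf if tc > 0 else math.nan
    return tc / ca


# Hypothetical (TC, CA) amounts in arbitrary units, invented for illustration.
measurements = {
    "GF": (90.0, 10.0),
    "CR": (10.0, 40.0),
    "CR without B. obeum": (60.0, 10.0),
}

ratios = {group: tc_ca_ratio(tc, ca) for group, (tc, ca) in measurements.items()}
for group, ratio in ratios.items():
    print(f"{group}: TC/CA = {ratio:.2f}")
```

Handling the zero-CA case explicitly matters for germ-free samples, where deconjugated product may fall below the detection limit.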
Figure 7. Levels of B. obeum bsh Enzymes in Human Samples Correlate to Infection Outcome and Can Independently Modulate V. cholerae Colonization
(A) Levels of phylotype 1 bsh enzymes in metagenomic sequencing of fecal microbiomes of cholera patients pre- (“Diarrhea (d0)”) and post-antibiotic (“Diarrhea + abx”) treatment, compared to healthy adults in Bangladesh.
(B) Mass spectrometry measurement of bile levels in 125-mM solutions of TC incubated with indicated cultured human fecal communities in vitro. †, TC not detected.
(C) B. obeum bsh levels in intestines of antibiotic-cleared suckling CD-1 mice colonized with complex human fecal samples without V. cholerae, compared to V. cholerae colonization of antibiotic-cleared suckling animals bearing human donor microbiomes. **p < 0.01, ***p < 0.001 (Mann-Whitney U-test). Error bars represent mean ± SEM.
part of the standard of care for cholera (“Diarrhea + abx”). These data showed that bsh levels were
already affected by diarrhea prior to any clinical intervention. We found that several bsh phylotypes
followed diarrhea-dependent patterns in comparison to a healthy Bangladeshi cohort. Phylotypes 1, 3,
4, and 5 were significantly depleted in dysbiotic samples compared to healthy controls (Figures 7A
and S6). Of these, phylotypes 1, 3, and 4 were shown to be highly active against TC (Song et al.,
2019) (summarized in Table S5). These data suggested that a characteristic of healthy human
microbiomes that may modulate V. cholerae susceptibility is their ability to deconjugate TC into
non-virulence-inducing forms.
We then assayed whether complete healthy US fecal communities were differentially able to convert TC
to non-tcp-activating forms. We took the first six healthy US donors and anaerobically cultured
bacteria from their fecal samples in vitro and were able to recover species representing 66%–99%
relative abundance of the original sample (Table S4C). We inoculated these complex fecal specimens in
media and used standardized amounts of the resulting mixed cultures to treat TC solutions, which we
then used for virulence reporter assays. Strikingly, we observed that microbiomes (subjects B and D)
that allowed higher V. cholerae levels when transplanted in suckling animals were also unable to
completely remove TC from solution after 24 h, whereas communities exhibiting stronger colonization
resistance (A, C, and E) reduced TC to undetectable levels (Figure 7B).
Because species in genus Blautia demonstrated differences in bsh activity in vitro, as well as
association with cholera patients and uninfected family members (Midani et al., 2018), we assayed for
the level of the bsh gene of B. obeum specifically in total DNA extracted from human fecal samples by
real-time PCR. This also served as a function-specific measure of B. obeum abundance in these complex
fecal mixtures. We found that communities associated with higher V. cholerae colonization had lower
levels of B. obeum bsh (Figure 7C), suggesting that B. obeum abundance and specifically the presence
of the bsh activity is associated with resistance to V. cholerae infection.

DISCUSSION

A role for gastrointestinal microbes in resistance to enteropathogens such as V. cholerae has been
recognized for some time (Freter, 1955, 1956). However, this colonization resistance has often been
examined in dichotomous terms, either germfree or conventional, or “normal” and damaged by specific
factors such as antibiotics. Our results suggest that beyond extreme fluctuations in structure, such
as those due to diarrhea and severe malnutrition, diversity even within otherwise “healthy”
human populations can serve as predictive markers for infection resistance.
A key difficulty in identifying taxa that can drive infection susceptibility is the complexity of
animal gut microbiomes, compounded by dramatic differences in the taxonomic diversity across host
species (Seedorf et al., 2014). The limited taxonomic and functional resolution of many commonly
employed metagenomic techniques is also a barrier for converting observations in large population
studies to mechanistic insights on microbial effects on pathogenesis and other phenotypes. Findings
in this study and others (Hsiao et al., 2014), in which single genes encoded by specific microbiome
members are able to affect V. cholerae colonization in isolation from other functions, suggest that
correlative studies have distinct limits in their abilities to provide mechanistic insights to
microbiome-pathogen interactions. For instance, a recent sequencing-based study that sampled gut
microbes of cholera patients and healthy household contacts identified microbes from the same genus
as associated with both individuals with cholera and individuals with putative exposure but
non-progression to disease (Midani et al., 2018). Thus, genus-, and possibly even species-level data
may be insufficient to identify clear targets for future mechanistic studies in the absence of
experimental manipulations.
Recent developments in high-throughput sequencing, anaerobic microbiology, and gnotobiotic animal
systems have allowed for mechanistic studies of the interaction of human commensals and human
pathogens in animal models of colonization and disease. Specific target taxa identified by
multi-omics approaches can be established in animals, with species and gene content defined before
introduction of pathogens. These types of approaches allowed us to identify B. obeum as a key member
of the human gut microbiome that drove infection resistance and microbial interactions with bile
acids as a driver of virulence gene regulation in V. cholerae and a mechanism of protection against
infection.
We hypothesize that microbial bile metabolism most affects V. cholerae pathogenesis during early
infection. V. cholerae tightly regulates gene expression in response to host environmental signals
such as bile. Bile acids are thought to stabilize the structure of the key virulence regulator TcpP
(Yang et al., 2013), which drives the activation of toxT transcription. ToxT then causes full
induction of tcp and cholera toxin, with TCP-dependent colonization thought to begin prior to toxin
expression (Lee et al., 1999). Following colonization, the activity of bile becomes less clear. Some
studies have demonstrated that the unsaturated fatty acids in bile are able to modulate the binding
of ToxT to target promoters, reducing CT and TCP expression (Plecha and Withey, 2015). The
deconjugated bile salt sodium deoxycholate promotes interaction between the virulence activators ToxR
and ToxS and subsequent activity (Midgett et al., 2017). However, ToxRS likely does not directly
activate toxT, but rather acts to boost the activity of TcpP at the toxT promoter (Krukonis et al.,
2000). Both conjugated and deconjugated bile acids are also able to induce ToxT-independent
activation of cholera toxin dependent on ToxRS (Hung and Mekalanos, 2005).
Our results suggest a model where, at the initial point of colonization in the small intestine, tcp
gene expression and thus TCP biogenesis are determined by the balance of bile acids, which is
modulated by commensal microbes with differential bsh activity. Differences in early tcp gene
expression, and thus colonization, may have disproportionate impact on the progression of V. cholerae
infections; variation in microbiome bsh activity may help determine whether infection proceeds to
fulminant diarrhea or low or temporary colonization that leads to mild or asymptomatic infections
that are common in cholera endemic areas (King et al., 2008). Once severe diarrhea has begun, the
commensal community and existing lumenal bile are depleted, and any future bile secretion is
predominantly conjugated primary species that stimulate virulence.
Although B. obeum bile modification is an important regulator of V. cholerae colonization, there may
be additive effects on pathogen behavior of multiple community members and through other mechanisms.
Removal of B. obeum was sufficient to raise TC levels in the mouse distal small intestine, but
although constitutive expression of B. obeum bsh yielded a 1 log drop in V. cholerae colonization,
the full CR microbiome yielded almost 2 log-fold differences in colonization compared to DS
microbiomes. In V. cholerae, virulence gene expression is negatively regulated by several different
quorum sensing systems involving the sensing of specific autoinducers (Jung et al., 2015). Prior
studies identified B. obeum-produced autoinducer AI-2 as a suppressor of tcpA, functioning through a
pathway bypassing the canonical AI-2 receptor LuxP and involving the regulator VqmA (Hsiao et al.,
2014). VqmA has been shown to activate the master quorum-sensing regulator HapR, which leads to
repression of tcpP (Liu et al., 2006; Zhu et al., 2002). Thus, AI-2 expression and bsh activity by
B. obeum may synergize to reduce the level and activity of TcpP during infection. Additional studies
will be required to determine how these two inputs exert their effects on regulation of virulence
determinants of V. cholerae in vivo. Non-B. obeum CR members may also impact V. cholerae
colonization, both at the level of virulence regulation and possibly metabolic competition for
colonization niches. Host diet may play a role in colonization; bile is secreted from the gallbladder
in response to food ingestion, and this varies by the fat content of the meal (Marciani et al.,
2013). Diet is also a potent driver of microbiome structure (Faith et al., 2011), but the effect of
different dietary compositions on driving the microbiome to infection-resistant or susceptible states
has not been well studied. Ingestion of food has been shown to dramatically reduce the infectious
dose of V. cholerae due to buffering of stomach pH, but also possibly by raising the levels of
virulence-activating conjugated bile acids secreted into the small intestine.
Taken together, our results suggest that variation in human gut microbiomes is a significant
contributor to V. cholerae infection risk, and this can be modulated through introduction of a human
gut commensal with multiple molecular effects on V. cholerae, able to affect levels of both
virulence-activating and virulence-suppressing signals at the site of infection. This suggests that
targeted microbiome
modification can be a promising prophylactic target against cholera.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

● KEY RESOURCES TABLE
● RESOURCE AVAILABILITY
  ○ Lead Contact
  ○ Materials Availability
  ○ Data and Code Availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Human studies
  ○ Animal studies
  ○ Bacterial strains and growth conditions
● METHOD DETAILS
  ○ 16S library preparation
  ○ Human gut microbiome 16S meta-analysis
  ○ Metagenomic analysis of bsh phylotypes
  ○ Preparation of bacteria for animal studies
  ○ Measurement of fluid accumulation
  ○ Assessment of T6SS killing by V. cholerae in vivo
  ○ Quantitative real-time PCR
  ○ Culturing of complex human fecal communities
  ○ AI-2 heat-stability assay
  ○ In vitro bile-dependent tcp induction
  ○ BSH structure comparisons
  ○ Commensal effects on tcp-activating signals
  ○ Quantification of bile salts
● QUANTIFICATION AND STATISTICAL ANALYSIS

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.036.

ACKNOWLEDGMENTS

We thank Jun Zhu and Gary Dunny for the kind gifts of V. cholerae and E. faecalis strains,
respectively. We thank the Metabolomics Core Facility at University of California, Riverside for mass
spectrometry studies. This work was supported by the National Institute of General Medical Sciences
(R35GM124724 to A.H.).

AUTHOR CONTRIBUTIONS

All authors helped to design and analyze experiments. S.A., J.D.M., J.C.M., R.L., and J.Y.C.
performed experiments. S.A., J.D.M., and A.H. wrote the paper.

DECLARATION OF INTERESTS

The authors declare no competing interests.

Received: December 5, 2019
Revised: March 16, 2020
Accepted: May 18, 2020
Published: June 16, 2020

REFERENCES

Arumugam, M., Raes, J., Pelletier, E., Le Paslier, D., Yamada, T., Mende, D.R., Fernandes, G.R., Tap,
J., Bruls, T., Batto, J.-M., et al.; MetaHIT Consortium (2011). Enterotypes of the human gut
microbiome. Nature 473, 174–180.
Bassler, B.L., Wright, M., and Silverman, M.R. (1994). Multiple signalling systems controlling
expression of luminescence in Vibrio harveyi: sequence and function of genes encoding a second
sensory pathway. Mol. Microbiol. 13, 273–286.
Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer,
N., Peña, A.G., Goodrich, J.K., Gordon, J.I., et al. (2010). QIIME allows analysis of high-throughput
community sequencing data. Nat. Methods 7, 335–336.
Clemens, J.D., Nair, G.B., Ahmed, T., Qadri, F., and Holmgren, J. (2017). Cholera. Lancet 390,
1539–1549.
Dalia, A.B., McDonough, E., and Camilli, A. (2014). Multiplex genome editing by natural
transformation. Proc. Natl. Acad. Sci. USA 111, 8937–8942.
David, L.A., Weil, A., Ryan, E.T., Calderwood, S.B., Harris, J.B., Chowdhury, F., Begum, Y., Qadri,
F., LaRocque, R.C., and Turnbaugh, P.J. (2015). Gut microbial succession follows acute secretory
diarrhea in humans. MBio 6, e00381-15.
De Smet, I., Van Hoorde, L., Vande Woestyne, M., Christiaens, H., and Verstraete, W. (1995).
Significance of bile salt hydrolytic activities of lactobacilli. J. Appl. Bacteriol. 79, 292–301.
Di Ciaula, A., Garruti, G., Lunardi Baccetto, R., Molina-Molina, E., Bonfrate, L., Wang, D.Q., and
Portincasa, P. (2017). Bile Acid Physiology. Ann. Hepatol. 16 (Suppl 1), S4–S14.
Eckburg, P.B., Bik, E.M., Bernstein, C.N., Purdom, E., Dethlefsen, L., Sargent, M., Gill, S.R.,
Nelson, K.E., and Relman, D.A. (2005). Diversity of the human intestinal microbial flora. Science
308, 1635–1638.
Faith, J.J., McNulty, N.P., Rey, F.E., and Gordon, J.I. (2011). Predicting a human gut microbiota’s
response to diet in gnotobiotic mice. Science 333, 101–104.
Freter, R. (1955). The fatal enteric cholera infection in the guinea pig, achieved by inhibition of
normal enteric flora. J. Infect. Dis. 97, 57–65.
Freter, R. (1956). Experimental enteric Shigella and Vibrio infections in mice and guinea pigs.
J. Exp. Med. 104, 411–418.
Gupta, S., and Chowdhury, R. (1997). Bile affects production of virulence factors and motility of
Vibrio cholerae. Infect. Immun. 65, 1131–1134.
Herrington, D.A., Hall, R.H., Losonsky, G., Mekalanos, J.J., Taylor, R.K., and Levine, M.M. (1988).
Toxin, toxin-coregulated pili, and the toxR regulon are essential for Vibrio cholerae pathogenesis in
humans. J. Exp. Med. 168, 1487–1492.
Hsiao, A., Ahmed, A.M.S., Subramanian, S., Griffin, N.W., Drewry, L.L., Petri, W.A., Jr., Haque, R.,
Ahmed, T., and Gordon, J.I. (2014). Members of the human gut microbiota involved in recovery from
Vibrio cholerae infection. Nature 515, 423–426.
Humbert, L., Maubert, M.A., Wolf, C., Duboc, H., Mahé, M., Farabos, D., Seksik, P., Mallet, J.M.,
Trugnan, G., Masliah, J., and Rainteau, D. (2012). Bile acid profiling in human biological samples:
comparison of extraction procedures and application to normal and cholestatic patients.
J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 899, 135–145.
Hung, D.T., and Mekalanos, J.J. (2005). Bile acids induce cholera toxin expression in Vibrio cholerae
in a ToxT-independent manner. Proc. Natl. Acad. Sci. USA 102, 3028–3033.
Jones, B.V., Begley, M., Hill, C., Gahan, C.G., and Marchesi, J.R. (2008). Functional and comparative
metagenomic analysis of bile salt hydrolase activity in the human gut microbiome. Proc. Natl. Acad.
Sci. USA 105, 13580–13585.
Jung, S.A., Chapman, C.A., and Ng, W.L. (2015). Quadruple quorum-sensing inputs control Vibrio
cholerae virulence and maintain system robustness. PLoS Pathog. 11, e1004837.
Kelley, L.A., Mezulis, S., Yates, C.M., Wass, M.N., and Sternberg, M.J. (2015). The Phyre2 web portal
for protein modeling, prediction and analysis. Nat. Protoc. 10, 845–858.
Kieser, S., Sarker, S.A., Sakwinska, O., Foata, F., Sultana, S., Khan, Z., Islam, S., Porta, N.,
Combremont, S., Betrisey, B., et al. (2018). Bangladeshi children with acute diarrhoea show faecal
microbiomes with increased Streptococcus abundance, irrespective of diarrhoea aetiology. Environ.
Microbiol. 20, 2256–2269.
Kim, D., Langmead, B., and Salzberg, S.L. (2015). HISAT: a fast spliced aligner with low memory
requirements. Nat. Methods 12, 357–360.
King, A.A., Ionides, E.L., Pascual, M., and Bouma, M.J. (2008). Inapparent infections and cholera
dynamics. Nature 454, 877–880.
Klose, K.E. (2000). The suckling mouse model of cholera. Trends Microbiol. 8, 189–191.
Kovacikova, G., Lin, W., and Skorupski, K. (2010). The LysR-type virulence activator AphB regulates
the expression of genes in Vibrio cholerae in response to low pH and anaerobiosis. J. Bacteriol. 192,
4181–4191.
Krukonis, E.S., Yu, R.R., and Dirita, V.J. (2000). The Vibrio cholerae ToxR/TcpP/ToxT virulence
cascade: distinct roles for two membrane-localized transcriptional activators on a single promoter.
Mol. Microbiol. 38, 67–84.
Lee, S.H., Hava, D.L., Waldor, M.K., and Camilli, A. (1999). Regulation and temporal expression
patterns of Vibrio cholerae virulence genes during infection. Cell 99, 625–634.
Li, J., and Dawson, P.A. (2019). Animal models to study bile acid metabolism. Biochim. Biophys. Acta
Mol. Basis Dis. 1865, 895–911.
Liu, Z., Hsiao, A., Joelsson, A., and Zhu, J. (2006). The transcriptional regulator VqmA increases
expression of the quorum-sensing activator HapR in Vibrio cholerae. J. Bacteriol. 188, 2446–2453.
Liu, Z., Miyashiro, T., Tsou, A., Hsiao, A., Goulian, M., and Zhu, J. (2008). Mucosal penetration
primes Vibrio cholerae for host colonization by repressing quorum sensing. Proc. Natl. Acad. Sci. USA
105, 9769–9774.
Marciani, L., Cox, E.F., Hoad, C.L., Totman, J.J., Costigan, C., Singh, G., Shepherd, V., Chalkley,
L., Robinson, M., Ison, R., et al. (2013). Effects of various food ingredients on gall bladder
emptying. Eur. J. Clin. Nutr. 67, 1182–1187.
Muraca, M., Vilei, M.T., Miconi, L., Petrin, P., Antoniutti, M., and Pedrazzoli, S. (1991). A simple
method for the determination of lipid composition of human bile. J. Lipid Res. 32, 371–374.
Olivier, V., Queen, J., and Satchell, K.J. (2009). Successful small intestine colonization of adult
mice by Vibrio cholerae requires ketamine anesthesia and accessory toxins. PLoS ONE 4, e7352.
Pettersen, E.F., Goddard, T.D., Huang, C.C., Couch, G.S., Greenblatt, D.M., Meng, E.C., and Ferrin,
T.E. (2004). UCSF Chimera–a visualization system for exploratory research and analysis. J. Comput.
Chem. 25, 1605–1612.
Plecha, S.C., and Withey, J.H. (2015). Mechanism for inhibition of Vibrio cholerae ToxT activity by
the unsaturated fatty acid components of bile. J. Bacteriol. 197, 1716–1725.
Qin, J., Li, R., Raes, J., Arumugam, M., Burgdorf, K.S., Manichanh, C., Nielsen, T., Pons, N.,
Levenez, F., Yamada, T., et al.; MetaHIT Consortium (2010). A human gut microbial gene catalogue
established by metagenomic sequencing. Nature 464, 59–65.
Ridlon, J.M., Kang, D.J., and Hylemon, P.B. (2006). Bile salt biotransformations by human intestinal
bacteria. J. Lipid Res. 47, 241–259.
Sayin, S.I., Wahlström, A., Felin, J., Jäntti, S., Marschall, H.U., Bamberg, K., Angelin, B.,
Hyötyläinen, T., Orešič, M., and Bäckhed, F. (2013). Gut microbiota regulates bile acid metabolism by
reducing the levels of tauro-beta-muricholic acid, a naturally occurring FXR antagonist. Cell Metab.
17, 225–235.
Seedorf, H., Griffin, N.W., Ridaura, V.K., Reyes, A., Cheng, J., Rey, F.E., Smith, M.I., Simon, G.M.,
Scheffrahn, R.H., Woebken, D., et al. (2014). Bacteria from diverse habitats colonize and compete in
the mouse gut. Cell 159, 253–266.
Song, Z., Cai, Y., Lao, X., Wang, X., Lin, X., Cui, Y., Kalavagunta, P.K., Liao, J., Jin, L., Shang,
J., and Li, J. (2019). Taxonomic profiling and populational patterns of bacterial bile salt hydrolase
(BSH) genes based on worldwide human gut microbiome. Microbiome 7, 9.
Subramanian, S., Huq, S., Yatsunenko, T., Haque, R., Mahfuz, M., Alam, M.A., Benezra, A., DeStefano,
J., Meier, M.F., Muegge, B.D., et al. (2014). Persistent gut microbiota immaturity in malnourished
Bangladeshi children. Nature 510, 417–421.
Yang, M., Liu, Z., Hughes, C., Stern, A.M., Wang, H., Zhong, Z., Kan, B., Fenical, W., and Zhu, J.
(2013). Bile salt-induced intermolecular disulfide bond formation activates Vibrio cholerae
virulence. Proc. Natl. Acad. Sci. USA 110, 2348–2353.
Midani, F.S., Weil, A.A., Chowdhury, F., Begum, Y.A., Khan, A.I., Debela, M.D.,
Durand, H.K., Reese, A.T., Nimmagadda, S.N., Silverman, J.D., et al. (2018). Yatsunenko, T., Rey, F.E., Manary, M.J., Trehan, I., Dominguez-Bello, M.G.,
Human Gut Microbiota Predicts Susceptibility to Vibrio cholerae Infection. Contreras, M., Magris, M., Hidalgo, G., Baldassano, R.N., Anokhin, A.P.,
J. Infect. Dis. 218, 645–653. et al. (2012). Human gut microbiome viewed across age and geography. Na-
Midgett, C.R., Almagro-Moreno, S., Pellegrini, M., Taylor, R.K., Skorupski, K., ture 486, 222–227.
and Kull, F.J. (2017). Bile salts and alkaline pH reciprocally modulate the inter- Zhao, W., Caro, F., Robins, W., and Mekalanos, J.J. (2018). Antagonism to-
action between the periplasmic domains of Vibrio cholerae ToxR and ToxS. ward the intestinal microbiota and its effect on Vibrio cholerae virulence. Sci-
Mol. Microbiol. 105, 258–272. ence 359, 210–213.
Miller, V.L., Taylor, R.K., and Mekalanos, J.J. (1987). Cholera toxin transcrip- Zhu, J., Miller, M.B., Vance, R.E., Dziejman, M., Bassler, B.L., and Mekalanos,
tional activator toxR is a transmembrane DNA binding protein. Cell 48, J.J. (2002). Quorum-sensing regulators control virulence gene expression in
271–279. Vibrio cholerae. Proc. Natl. Acad. Sci. USA 99, 3129–3134.
1546 Cell 181, 1533–1546, June 25, 2020

STAR★METHODS
KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Bacterial and Virus Strains
Vibrio cholerae C6706 El Tor | Hsiao Lab stock | C6706
ΔvasK V. cholerae | This paper | N/A
Vibrio harveyi AI-2 bioassay strain | Bassler et al., 1994 | BB170
lacZ::PtcpA-sh ble zeocin resistance reporter V. cholerae C6706 El Tor strain | Liu et al., 2008 | PtcpA-sh ble
pZE21 in V. cholerae C6706 | This paper | C6706-KMR
luxS- E. coli strain BW30045 Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔluxS1368, Δ(rhaD-rhaB)568, hsdR514 | E. coli Genetic Resources at Yale. CGSC, The Coli Genetic Stock Center | CGSC#8227
Escherichia coli | Hsiao Lab stock | DH5α-λpir
E. coli expressing B. obeum luxS | This paper | BW30045_RO_AI-2
Constitutive B. obeum bsh expression strain: E. coli DH5α-λpir pZE21-BSH | This paper | bshC
Vector control for bshC | This paper | DH5α-λpir pZE21
Bacteroides caccae | ATCC | ATCC 43185
Streptococcus salivarius subsp. salivarius | ATCC | ATCC 13419
Collinsella aerofaciens | ATCC | ATCC 25986
Dorea formicigenerans | ATCC | ATCC 27755
Blautia torques | ATCC | ATCC 27756
Blautia obeum | ATCC | ATCC 29174
Eubacterium rectale | ATCC | ATCC 33656
Clostridium scindens | ATCC | ATCC 35704
Bacteroides vulgatus | ATCC | ATCC 8482
Bacteroides uniformis | ATCC | ATCC 8492
Streptococcus infantarius subsp. infantarius | ATCC | ATCC BAA-102
Dorea longicatena | DSMZ | DSM 13814
Faecalibacterium prausnitzii | DSMZ | DSM 17677
Bifidobacterium longum subsp. longum | DSMZ | DSM 20219
Streptococcus salivarius subsp. thermophilus | DSMZ | DSM 20617
Enterococcus faecalis | Dunny Lab stock (University of Minnesota) | OG1RF
Bacteroides thetaiotaomicron | Hsiao Lab stock | VPI-5482
Biological Samples
Human volunteer donor fecal sample | This paper | Subject A
Human volunteer donor fecal sample | This paper | Subject B
Human volunteer donor fecal sample | This paper | Subject C
Human volunteer donor fecal sample | This paper | Subject D
Human volunteer donor fecal sample | This paper | Subject E
Human volunteer donor fecal sample | This paper | Subject F
Human volunteer donor fecal sample | This paper | Subject G
Human volunteer donor fecal sample | This paper | Subject H
Human volunteer donor fecal sample | This paper | Subject I
Human volunteer donor fecal sample | This paper | Subject J
Human volunteer donor fecal sample | This paper | Subject K
Human volunteer donor fecal sample | This paper | Subject L
Human volunteer donor fecal sample | This paper | Subject M
Human volunteer donor fecal sample | This paper | Subject N
Human volunteer donor fecal sample | This paper | Subject O
Human volunteer donor fecal sample | This paper | Subject P
Chemicals, Peptides, and Recombinant Proteins
Zeocin | Research Products International Corporation | Cat# 1006-33-0
Sodium taurocholate hydrate | Sigma Aldrich | Cat# 86339
Sodium glycocholate hydrate | Sigma Aldrich | Cat# G7132
Cholic acid | Alfa Aesar | Cat# A1125714
Sodium taurodeoxycholate hydrate | Sigma Aldrich | Cat# T0557
Sodium glycodeoxycholate | Sigma Aldrich | Cat# G9910
Deoxycholic acid | MP Biomedicals | Cat# 0210149610
Tauro-β-muricholic acid | Steraloids Inc. | Cat# C1899-000
β-muricholic acid | Steraloids Inc. | Cat# C1895-000
Cholestyramine | Sigma Aldrich | Cat# C4650
iQ SYBR Green Supermix | Biorad | Cat# 170882
Platinum Hot Start PCR Master Mix | Thermo Scientific | Cat# 13000013
SuperScript IV First-Strand Synthesis System | Invitrogen | Cat# 18091200
Gibson Master Mix | New England Biolabs | Cat# E2611S
Deposited Data
Short-read sequencing data | This paper | European Nucleotide Archive (ENA) PRJEB31497
Short-read sequencing data for meta-analysis | European Nucleotide Archive | See Table S2 for accession numbers
Experimental Models: Organisms/Strains
Mouse: C57BL/6 | UCR gnotobiotic facility | N/A
Mouse: CD-1 IGS | Charles River Laboratories | N/A
Oligonucleotides
All primers for study, see Table S6. | This paper | N/A
Recombinant DNA
B. obeum codon optimized LuxS placed downstream of the PLtet-O-1 constitutive promoter sequence derived from the plasmid vector pZE21 (pMK_B. obeum_luxS) | This paper | N/A
Plasmid: Constitutive expression construct for B. obeum bsh (bshC) | This paper | N/A
Software and Algorithms
QIIME | Caporaso et al., 2010 | http://qiime.org/
Phyre2 | Kelley et al., 2015 | http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index
Chimera | Pettersen et al., 2004 | https://www.cgl.ucsf.edu/chimera/
HISAT2 | Kim et al., 2015 | http://ccb.jhu.edu/software/hisat2/manual.shtml
Other
Lab diet | Newco Distributors | Cat# 5K52
Lead Contact
Further information and requests for resources should be directed to and will be fulfilled by the Lead Contact, Ansel Hsiao (ansel.hsiao@ucr.edu).
Unique plasmids, strains, and reagents generated in this study are available from the Lead Contact with a completed Materials Transfer Agreement.

The accession number for the Illumina sequencing data reported in this paper is European Nucleotide Archive (ENA): PRJEB31497.
Human studies
All human samples were part of a study approved by the UCR Institutional Review Board and followed NIH guidelines. We collected intact fecal samples from a cohort of healthy adult volunteers at the University of California, Riverside. Inclusion criteria were: 1) age between 18 and 40 years; 2) ability to provide signed and dated informed consent; 3) willingness and ability to provide a stool specimen. Exclusion criteria were: 1) systemic antibiotic usage (oral, intramuscular, or intravenous) in the 2 weeks prior to sampling; 2) acute disease at time of enrollment (presence of moderate or severe illness with or without fever); 3) diarrhea (liquid or very loose stools not associated with a change in diet) in the 2 weeks prior to sampling; 4) active uncontrolled GI disorders or diseases, including inflammatory bowel disease (IBD), ulcerative colitis, Crohn’s disease, or indeterminate colitis; persistent infectious gastroenteritis, colitis, or gastritis; and chronic constipation; 5) major surgery of the GI tract at any time, including major bowel resection but excluding cholecystectomy and appendectomy. Age inclusion criteria were chosen to avoid age-related microbiome differences, which are strongest in early life (Yatsunenko et al., 2012). Fecal samples were collected aseptically from each person at UCR and immediately preserved at −80°C until processing for DNA extraction, culturing, and animal colonization. Stocks of fecal slurries for subsequent experiments were prepared by resuspending samples at 1:3 weight/volume in sterile reduced PBS and adding sterile glycerol to a final concentration of 25% volume/volume.
Animal studies
All animal experiments used protocols approved by the Institutional Animal Care and Use Committee of the University of California, Riverside (UCR) and followed NIH guidelines. All CD-1 suckling animals were purchased from Charles River Laboratories. Suckling and adult germ-free C57BL/6 mice were reared at the UCR gnotobiotic facility. No distinction was made between male and female animals for bacterial studies. Adult animals were used at > 3 weeks of age. Germ-free suckling mice were used at 5-6 days of age. Animals were checked for signs of moribund condition prior to use in experiments and were used for one experimental procedure only. Adult animals were co-housed in cages without mixing sexes. Male mice were separated except in cases of littermates.
For the antibiotic-cleared suckling mouse model, 4-day-old suckling CD-1 animals were fasted for 1.5 hours, then orally dosed with 1 mg/g body weight streptomycin using 30-gauge plastic tubing, after which the animals were placed with a lactating dam for 1 day. After 24 hours, mice received microbial communities with V. cholerae in a maximum gavage volume of 50 µL. At 18 hours post-infection, animals were sacrificed, and relevant sections of intestinal tissue were dissected and homogenized for CFU enumeration and nucleic acid extraction.
Germ-free C57BL/6 mice were bred and maintained in plastic gnotobiotic isolators at the University of California, Riverside. Mice were fed an autoclaved, low-fat, plant polysaccharide-rich mouse chow (Lab Diet 5K52) and were 5–8 weeks old at time of gavage. Bacterial cultures were prepared as described in Preparation of bacteria for animal studies. Mice were fasted for 30 minutes prior to introduction of bacteria, and stomach pH was buffered by intra-gastric gavage of 100 µL of 1 M NaHCO3, followed by gavage with 150 µL of balanced defined microbial libraries. Fecal samples were collected across the course of the experiment. Mice were sacrificed 4 days post gavage, and the small intestine was collected and cut into three equal (proximal, medial, distal) sections by length. Samples were homogenized and used for CFU enumeration of bacteria on LB agar containing 200 µg/mL streptomycin.
For establishing defined microbiomes prior to V. cholerae infection, germ-free C57BL/6 mice were maintained as described above and used at 6-13 weeks of age. Mice were fasted and given NaHCO3 as described above, then given 150 µL of either the simple resistant (SR) or the dysbiotic (DS) community. For the Mix group, mice were initially given the DS community. Ten days after microbiome introduction and 4 days prior to V. cholerae infection, the SR community was introduced into the Mix group animals by gavage. Two weeks after human commensal colonization, each group was infected with 5 × 10⁹ CFU V. cholerae O1 El Tor C6706. Fecal samples were suspended in 500 µL of PBS and homogenized using a bead beater (BioSpec) at 1,400 RPM for 30 s. CFU enumeration of V. cholerae was done on LB agar containing 200 µg/mL streptomycin.
We used the antibiotic-treated infant mouse model described above to determine which members of the healthy human microbiome contribute most to resistance to V. cholerae. We made 18 random combinations of human gut microbiome strains and introduced them to suckling mice along with V. cholerae. Each combination included five unique strains, and each gavage contained the equivalent total microbial mass of 300 µL of OD600 = 0.4 culture, divided evenly across all constituent strains. After introducing the human microbiome strains and V. cholerae to suckling mice, V. cholerae levels in homogenized intestines were determined by plating on selective agar. The absolute abundance of each species was determined with a combination of 16S rRNA gene qPCR and 16S rRNA gene sequencing.
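The combination design described above can be sketched in a few lines. This is an illustration only: the strain labels are placeholders (the actual strain pool is in Table S1), the fixed random seed is an assumption made for reproducibility, and rejecting duplicate combinations is also an assumption, since the text does not say how repeats were handled.

```python
import random

def random_combinations(strains, n_combinations=18, size=5, seed=0):
    """Draw distinct random strain combinations, each with `size` unique strains."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    seen = set()
    combos = []
    while len(combos) < n_combinations:
        combo = tuple(sorted(rng.sample(strains, size)))
        if combo not in seen:  # skip a combination that was already drawn
            seen.add(combo)
            combos.append(combo)
    return combos

# Placeholder strain labels; the actual strain pool is listed in Table S1.
strains = [f"strain_{i}" for i in range(1, 11)]
combos = random_combinations(strains)

# Share of the 300 µL OD600 = 0.4 equivalent contributed by each of five strains.
per_strain_ul = 300 / 5
```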
Bacterial strains and growth conditions

All human gut commensal strains used are listed in Table S1. Unless otherwise noted, human gut strains were propagated in LYH-BHI liquid medium (BHI supplemented to 5 g/L yeast extract, 5 mg/L hemin, 1 mg/mL cellobiose, 1 mg/mL maltose, and 0.5 mg/mL cysteine-HCl). Cultures were propagated in a Coy anaerobic chamber (5% H2, 20% CO2, balance N2) or aerobically at 37°C.
All V. cholerae strains were derived from the C6706 El Tor pandemic isolate, including the lacZ::PtcpA-sh ble zeocin resistance reporter strain (Liu et al., 2008), and propagated in LB media with appropriate antibiotics at 37°C. To construct a kanamycin-resistant strain, the plasmid pZE21 was introduced into V. cholerae C6706 (C6706-KMR), and the strain was propagated in LB with kanamycin sulfate (Fisher Scientific, 50 µg/ml) and streptomycin sulfate salt (100 µg/ml). Vibrio harveyi BB170 was propagated in LM medium (Bassler et al., 1994) aerobically at 37°C.
To construct bshC, a strain constitutively expressing a bile salt hydrolase found in B. obeum, the RUMOBE_00028 locus was amplified from B. obeum genomic DNA (primers: 5′-GTCGACGGTATCGATAATGCTTATGTGTACAGCTGC-3′ and 5′-GCAGGAATTCGATATCACTAATTCTGAAAATGAATCTGC-3′). All cloning and amplification primers are listed in Table S6. This amplicon was then cloned downstream of the constitutive PLtet-O-1 promoter of plasmid pZE21 through digestion of the vector backbone with HindIII followed by Gibson assembly (New England Biolabs). The resulting plasmid was then introduced by electroporation into E. coli DH5α λpir to generate bshC. Strains were propagated aerobically in LB with kanamycin (50 µg/ml) at 37°C.
A strain overexpressing the AI-2 signal of B. obeum (BW30045_RO_AI-2) was constructed by constitutively expressing the B. obeum luxS AI-2 synthase in E. coli BW30045 (ΔluxS). The B. obeum luxS coding region (genome positions 33,305-33,784) was codon-optimized for expression in E. coli and placed downstream of the PLtet-O-1 constitutive promoter sequence derived from the plasmid vector pZE21, and the construct was cloned into vector pMK using the GeneArt Subcloning & Express Cloning Service (ThermoFisher). This expression construct was then amplified and inserted into the endA gene of the E. coli genome using primers (forward: 5′-CCAAAACAGCTTTCGCTACGTTGCTGGCTCGTTTTAACACGGAGTAAGTGTTAGAAAAATTCATCCAGCA-3′, reverse: 5′-GGTTGTACGCGTGGGGTAGGGGTTAACAAAAAGAATCCCGCTAGTGTAGGCGGGCAGTGAAAGGAAGGCC-3′).
We used natural transformation (Dalia et al., 2014) to construct a ΔvasK V. cholerae strain. Fragments of flanking genomic DNA upstream (forward: 5′-GAACTTTCGTCACGTAAGTC-3′, reverse: 5′-GTCGACGGATCCCCGGAATCATGAATTGTGTCCTTGTTTAC-3′) and downstream (forward: 5′-GAACTTTCGTCACGTAAGTC-3′, reverse: 5′-GTCGACGGATCCCCGGAATCATGAATTGTGTCCTTGTTTAC-3′) of vasK and an antibiotic resistance gene cassette (forward: 5′-ATTCCGGGGATCCGTCGAC-3′, reverse: 5′-TGTAGGCTGGAGCTGCTTC-3′) were amplified from V. cholerae genomic DNA. Amplicons were then joined by overlapping PCR and introduced into C6706 via natural transformation. Resistant colonies were then selected on trimethoprim-containing agar (10 µg/ml), and insertion was confirmed via PCR.
METHOD DETAILS
16S library preparation

For DNA from human fecal samples, 200 mg (average wet weight) of fecal sample was suspended in 600 µL PBS. For mouse intestinal samples, tissues were dissected and homogenized in 5 mL PBS, and 500 µL of the homogenate was used for DNA extraction. 500 µL of 0.1 mm glass beads (BioSpec), 210 µL of 20% SDS, and 500 µL of neutral phenol:chloroform:isoamyl-alcohol (24:24:1, Fisher Scientific) were added to each sample, and samples were lysed by bead-beating followed by ethanol precipitation (Hsiao et al., 2014).
The V4 variable region of bacterial 16S ribosomal RNA genes was amplified in 25 µL total volume reactions comprising 1 µL of extracted DNA as template, 10 µL Platinum Hot Start PCR Master Mix (ThermoFisher), 13 µL PCR-grade water, and 0.5 µL each of forward and reverse primers (10 µM). Cycling conditions were 94°C for 3 min, followed by 33 cycles (94°C for 45 s, 50°C for 60 s, 72°C for 90 s), and 72°C for 10 min. An equal amount of each amplicon (240 ng) was pooled into libraries, which were then purified using QIAquick PCR purification columns (QIAGEN) and subjected to sequencing using the Illumina MiSeq platform. Paired-end 150 nt reads were assembled, de-multiplexed, rarefied to > 900 reads per sample, and analyzed using the QIIME 1.9.1 software package (Caporaso et al., 2010). Sequencing run results are summarized in Tables S2A and S3.
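As a bookkeeping aid, the reaction composition above can be checked and scaled to a batch. The sketch below assumes the listed volumes are in microliters and that primers are added at 0.5 µL each, so that the components sum to the stated 25 µL; the 10% pipetting overage is an assumption and is not taken from the protocol.

```python
# Per-reaction volumes in microliters, as listed for the 16S V4 PCR.
PER_REACTION_UL = {
    "master_mix": 10.0,
    "water": 13.0,
    "forward_primer": 0.5,
    "reverse_primer": 0.5,
}
TEMPLATE_UL = 1.0  # template DNA is added to each tube separately

def master_mix(n_reactions, overage=0.1):
    """Scale shared components for n reactions plus a pipetting overage (assumed 10%)."""
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}

# Sanity check: shared components plus template should give the stated total.
total_per_reaction = sum(PER_REACTION_UL.values()) + TEMPLATE_UL
mix = master_mix(24)
```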
Human gut microbiome 16S meta-analysis

To compare bacterial composition between human gut microbial communities and our artificial communities, we used sequencing data of the V4 region of the 16S rRNA gene from published studies and from samples collected for this study. For the different phases of V. cholerae infection, we used the first and last time points of diarrhea and the last time point of recovery (Hsiao et al., 2014). The last time point of fecal samples collected from parents of malnourished Bangladeshi children was selected as the healthy adult Bangladeshi control (Subramanian et al., 2014). See Table S2C for sequence accession numbers. Defined community inputs were designed on the basis that all strains in a given community are evenly distributed (CR: 1000 reads/species; SR: 3000 reads/species; DS: 2000 reads/species). All of the sequencing data were pooled and analyzed using QIIME as described above.
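The evenly distributed community inputs can be pictured as a synthetic count table in which every member has the same read count. The sketch below is illustrative only: the species names are placeholders, and the real community memberships are not reproduced here.

```python
def even_community(species, reads_per_species):
    """Return a synthetic count table with the same read count for every species."""
    return {name: reads_per_species for name in species}

# Reads per species stated in the text for each defined community.
READS_PER_SPECIES = {"CR": 1000, "SR": 3000, "DS": 2000}

# Placeholder membership; the real community compositions are not reproduced here.
example_members = ["species_a", "species_b", "species_c", "species_d"]
sr_input = even_community(example_members, READS_PER_SPECIES["SR"])

# With even inputs, each member has relative abundance 1 / (number of members).
relative = {k: v / sum(sr_input.values()) for k, v in sr_input.items()}
```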
Metagenomic analysis of bsh phylotypes

The protein sequences of the eight representative BSHs were obtained from Song et al. (2019). Deep metagenomic sequencing data were obtained from David et al. (2015) (see Table S2D for sequence accession numbers). The microbial community DNA was aligned to the protein sequences using blastx. For queries that hit multiple protein sequences, the hit with the highest score was selected. The relative abundance of each BSH type was calculated as BSH read count / (total read count − human read count). Host reads were identified by mapping metagenomic DNA to the Homo sapiens reference genome (assembly GRCh38.p13) using HISAT2 (Kim et al., 2015).
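The two computational steps in this paragraph, keeping the highest-scoring hit per read and normalizing by non-host reads, can be sketched as follows. The input format (tuples of read ID, phylotype, and score), the phylotype labels, and the toy numbers are assumptions for illustration; parsing of the actual blastx output is not shown.

```python
from collections import Counter

def best_hits(hits):
    """Keep, for each read, the BSH phylotype with the highest alignment score.

    `hits` is an iterable of (read_id, phylotype, score) tuples, e.g. parsed
    from tabular blastx output (the parsing step is not shown here).
    """
    best = {}
    for read_id, phylotype, score in hits:
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (phylotype, score)
    return Counter(phylotype for phylotype, _ in best.values())

def relative_abundance(bsh_reads, total_reads, human_reads):
    """BSH read count divided by the non-host read count."""
    non_host = total_reads - human_reads
    if non_host <= 0:
        raise ValueError("no non-host reads remain after filtering")
    return bsh_reads / non_host

# Toy input: read r1 hits two phylotypes and is assigned to the higher-scoring one.
hits = [("r1", "BSH-T1", 50.0), ("r1", "BSH-T6", 72.5), ("r2", "BSH-T1", 61.0)]
counts = best_hits(hits)
abundances = {p: relative_abundance(n, total_reads=1000, human_reads=200)
              for p, n in counts.items()}
```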
Preparation of bacteria for animal studies

Each human gut bacterium was cultured from glycerol stocks in LYH-BHI media for 48 hours at 37°C and then diluted (1:50) in fresh LYH-BHI media. After growth for an additional 48 hours, cultures were normalized for density by OD600. For inoculation into suckling mice, the equivalent of a total of 300 µL of OD600 = 0.4 culture, divided evenly across the strains of each community, was pooled, pelleted by centrifugation, and resuspended in fresh LYH-BHI. Each mouse received this mass of bacterial cells in a maximum gavage volume of 50 µL per pup. In mice containing multiple defined communities, normalized mixtures were prepared so that the equivalent of 300 µL of OD600 = 0.4 culture of each community was represented in the final gavage. In mice receiving V. cholerae, the total resuspension volume of commensal strains was 25 µL, with the remaining 25 µL containing 1 × 10⁴ to 1 × 10⁵ CFU V. cholerae in PBS.
Microbial levels in human fecal slurries were estimated via real-time PCR using universal 16S primers as described below, and samples were normalized so that each suckling animal received slurries containing the equivalent of 20 µg of microbial genomic DNA.
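The density normalization for pooled inocula reduces to a small calculation. The sketch below assumes that OD600 scales linearly with cell mass, which holds only within the linear range of the spectrophotometer; the OD600 readings and strain labels are hypothetical.

```python
def volume_per_strain_ul(measured_od, n_strains, target_od=0.4, total_ul=300.0):
    """Volume of one culture that supplies its even share of the pooled inoculum.

    The pool is the equivalent of `total_ul` at `target_od`, split evenly
    across `n_strains`; OD600 is treated as proportional to cell mass (an
    approximation that holds only in the linear range of the instrument).
    """
    if measured_od <= 0:
        raise ValueError("measured OD600 must be positive")
    share_od_ul = target_od * total_ul / n_strains  # OD x µL units per strain
    return share_od_ul / measured_od

# Hypothetical OD600 readings for a five-member community.
readings = {"strain_a": 0.8, "strain_b": 0.4, "strain_c": 1.2,
            "strain_d": 0.6, "strain_e": 0.3}
volumes = {s: volume_per_strain_ul(od, len(readings)) for s, od in readings.items()}
```

A culture read at the target density contributes exactly one-fifth of 300 µL here, and denser cultures contribute proportionally less volume.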
Measurement of fluid accumulation

Suckling mice were treated as described above. The fluid accumulation ratio (%) was determined as [weight of intestines (large and small) / mouse body weight] × 100.
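The ratio above is a single division. A minimal sketch, with hypothetical weights in grams:

```python
def fluid_accumulation_ratio(intestine_weight_g, body_weight_g):
    """Intestinal weight (large plus small) as a percentage of body weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return intestine_weight_g / body_weight_g * 100

# Hypothetical weights for one pup, in grams.
fa = fluid_accumulation_ratio(intestine_weight_g=0.30, body_weight_g=3.0)
```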
Assessment of T6SS killing by V. cholerae in vivo

CD-1 suckling animals were gavaged with antibiotics as described above. 3-5 day old infant mice were orally inoculated with a total of 1 × 10⁹ CFU E. coli TB1 (lacZ-) and 1 × 10⁴ CFU V. cholerae (lacZ+) together in 50 µL LB. Animals were sacrificed after 16 hr of infection. Mouse intestines were dissected and homogenized in 5 mL of PBS, and 10 µL of the homogenate was used for enumeration of bacteria via serial dilution on LB agar containing streptomycin and X-gal.
Quantitative real-time PCR

Total bacterial load in fecal samples and intestinal homogenates was determined by real-time quantitative PCR. Reactions comprised 2 µL of extracted DNA (200 ng/reaction) as template, 12.5 µL SYBR Green Master Mix (BioRad), 10 µL PCR-grade water, and 0.25 µL each of forward and reverse primers at 10 µM (forward: 5′-CTCCTACGGGAGGCAGCAG-3′, reverse: 5′-TTACCGCGGCTGCTGGCAC-3′). Cycle conditions were 95°C for 3 min, followed by 39 cycles (95°C for 10 s, 55°C for 30 s, 95°C for 10 s, 65°C for 5 s, 95°C for 5 s).
Levels of the B. obeum bsh gene (RUMOBE_00028) were determined by real-time PCR as described above, using the primers 5′-GCGATCAGATTACGATCACTC-3′ and 5′-GCCATGCCAACACCTTTTTC-3′. 200 ng of purified DNA from intestinal homogenates of CD-1 mice colonized with complex human fecal samples was used as template for each reaction.
Levels of tcpA and ctxA expression in antibiotic-cleared CD-1 suckling animals containing SR and DS microbiomes were measured using real-time PCR (tcpA: primers 5′-GAAGAAGTTTGTAAAAGAAGAACACG-3′ and 5′-CGCTGAGACCACACCCATA-3′; ctxA: primers 5′-CACTAAGTGGGCACTTCTCA-3′ and 5′-TGATCATGCAAGAGGAACTCA-3′), with recA (primers 5′-ATTGAAGGCGAAATGGGCGATAG-3′ and 5′-TACACATACAGTTGGATTGCTTGAGG-3′) as a control. Templates were generated by Trizol (Ambion) extraction of total RNA from intestines of SR- and DS-colonized mice infected with V. cholerae, followed by cDNA synthesis with the SuperScript IV First-Strand Synthesis System (Invitrogen) following the manufacturer’s instructions. Real-time PCR was performed using the following conditions: 95°C for 3 min, followed by 39 cycles (95°C for 10 s, 55°C for 30 s, 95°C for 10 s, 65°C for 5 s, 95°C for 5 s).
Culturing of complex human fecal communities

Fecal slurries of complex human fecal samples were prepared as described above, spread on LYH-BHI agar, and incubated aerobically and anaerobically at 37°C. All recovered colonies were gathered by scraping, DNA was extracted, and 16S rRNA genes were amplified for sequencing as described above.
AI-2 heat-stability assay

Cultures of BW30045_RO_AI-2, BB170, and C6706 were grown overnight. BW30045_RO_AI-2 was subcultured 1:100 into 12 ml of LB and grown in a shaker at 37°C until OD600 ≈ 0.22, then centrifuged, and the supernatant was filter sterilized. Aliquots of supernatant were then heated at 100°C for 30 minutes and cooled to room temperature. AI-2 activity was assessed using the BB170 bioassay (Bassler et al., 1994). Briefly, overnight cultures of reporter strain BB170 in LM medium were diluted 1:1000 in AB medium, and 10 µL of cell-free supernatant or heat-treated cell-free supernatant was then added to 90 µL of BB170 dilution. Luminescence and OD600 of each sample were measured immediately and after 3.5 hr of growth at 30°C with agitation.
In vitro bile-dependent tcp induction

PtcpA-sh-ble was grown as an overnight culture, diluted 1:1000 in fresh LB, and grown for 2 hours at 37°C. Each reaction was prepared in 40 µl of 0.5X LB medium (pH 8.5), with sodium taurocholate hydrate (TC, Sigma-Aldrich), sodium glycocholate hydrate (Sigma-Aldrich), cholic acid (CA, Alfa Aesar), sodium taurodeoxycholate hydrate (Sigma-Aldrich), sodium glycodeoxycholate (Sigma-Aldrich), deoxycholic acid (DCA, MP Biomedicals), tauro-β-muricholic acid (Steraloids Inc.), or β-muricholic acid (Steraloids Inc.) added to a final concentration of 125 µM. 2 µl of reporter strain subculture was then added, and samples were incubated anaerobically at 37°C for 4 hr. 2 µl of each reaction was then added to 200 µl of 0.5X LB (pH 8.5) ± 10 mg/ml of zeocin (Sigma-Aldrich), incubated for 30 minutes aerobically at 37°C with agitation, and then serially diluted and plated onto LB agar plates with streptomycin to determine survival rates. Induction represents the percentage of PtcpA-sh-ble reporter cells surviving treatment with zeocin after incubation with the indicated samples under anaerobic conditions, defined as (zeocin-treated sample survival / average of no-zeocin controls) × 100. Where noted, survival rates were normalized to that induced by 125 µM taurocholate.
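The induction metric and its normalization to taurocholate can be written out as below. The CFU counts are hypothetical and the function names are illustrative; only the formula itself comes from the definition above.

```python
from statistics import mean

def percent_induction(zeocin_cfu, no_zeocin_cfus):
    """Survival of a zeocin-treated sample relative to the mean no-zeocin control."""
    control = mean(no_zeocin_cfus)
    if control <= 0:
        raise ValueError("no-zeocin control counts must be positive")
    return zeocin_cfu / control * 100

def normalize_to_reference(induction, reference_induction):
    """Express induction relative to the taurocholate reference (reference = 1.0)."""
    return induction / reference_induction

# Hypothetical CFU counts, for illustration only.
tc = percent_induction(4.0e6, [7.5e6, 8.5e6])       # taurocholate-treated reporter
sample = percent_induction(1.0e6, [7.5e6, 8.5e6])   # test condition
relative = normalize_to_reference(sample, tc)
```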
BSH structure comparisons

Potential 3D models of proteins of unknown structure were produced by Phyre2 (Kelley et al., 2015), based on amino acid sequence alignment to known protein structures. The 3D structure comparison and the per-column root-mean-square distance (RMSD) spatial variation among structures were calculated using UCSF Chimera (v1.14) (Pettersen et al., 2004).
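For readers unfamiliar with the RMSD measure, the generic calculation over paired coordinates is shown below. This is not Chimera's implementation: it assumes the structures are already superimposed and that points are paired by index, and the coordinates are toy values.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the structures are already superimposed and that points are paired
    by index (for example, aligned C-alpha atoms).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Toy coordinates: every point is displaced by 1 Å along x.
a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
b = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
value = rmsd(a, b)
```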
Commensal effects on tcp-activating signals

Commensal isolate cultures were grown for 48 hr and then subcultured at 3:100 for 48 hr. Growth was measured by OD600, and cultures were normalized to 1.5 mL of OD600 = 0.4 culture. bshC and the corresponding vector strain were grown overnight in LB with kanamycin, subcultured at 1:100 for 24 hr, and normalized as above. All cultures were clarified, the aqueous layer was removed, and the pellets were resuspended in sterile PBS with TC to a final concentration of 125 µM. Cultures grown with antibiotics were washed one additional time with 1 volume of PBS prior to addition of TC. Samples were then incubated anaerobically for 24 hr at 37°C, heat-treated at 100°C for 30 minutes, cooled to room temperature, and centrifuged, and the supernatant was filter-sterilized with a 0.22 µm filter. These samples were then used to induce PtcpA-sh-ble, and percent survival following zeocin treatment was determined as described above.
Removal of tcp-activating signals in the gut was assayed as above, replacing pure TC solution in PBS with homogenate. Tissues were collected from 5-6-day-old CD-1 suckling animals in 2.5 ml sterile H2O, disrupted with a tissue homogenizer, pooled, and centrifuged to clear tissue debris. The resulting aqueous layer was heat-treated and filter sterilized as described above. The resulting sample was desiccated using a Savant Integrated SpeedVac system (Fisher Scientific) and resuspended in one-fifth volume of sterile water. Four volumes of acetonitrile (Sigma Aldrich) were then added, and the sample was incubated at room temperature for 20 minutes for deproteinization (Humbert et al., 2012). Samples were clarified, and the aqueous layer was filter sterilized, desiccated, and resuspended in one-fifth of the original volume of sterile H2O as described above. To sequester bile salts, 12.5 mg of cholestyramine (Sigma-Aldrich) was added to 0.5 ml of deproteinized sample, and the mixture was incubated for 1 hour at 37°C with agitation, followed by passage through a 3 kDa protein concentrator (Pierce PES Protein Concentrators).
TC processing by complex human fecal samples was assayed in vitro. 100 µL of fecal slurry in glycerol, prepared as described above, was inoculated into 5 ml LYH-BHI and incubated anaerobically for 2 days at 37°C. Cells were then pelleted, normalized to OD600 = 0.4 in 1.5 ml sterile PBS with 125 µM TC, and incubated anaerobically at 37°C for 24 hr. Supernatants were then collected via centrifugation, heat-treated and filter sterilized as described above, and submitted for mass spectrometry (see below).
The effects of B. obeum bsh on V. cholerae colonization were assayed by introducing bshC and the isogenic vector control into suckling animals prior to infection with V. cholerae. 4-day-old CD-1 suckling mice were treated with 50 µL of 75 mg/mL kanamycin, then returned to lactating dams. Overnight cultures of bshC and vector strains were normalized to the equivalent of 300 µL of OD600 = 0.4 culture, and cells were pelleted and resuspended in fresh LYH-BHI. 50 µL of this suspension was then introduced via intra-gastric gavage into antibiotic-treated suckling mice that had been fasted for 1.5 hours. Pups were then returned to a lactating dam. After 1 day of pre-infection colonization with E. coli, animals were gavaged with V. cholerae as described above.
Quantification of bile salts

All standards (TC, CA, and DCA) were submitted as 10mM solutions. LC-MS analysis of bile acids was performed on a Synapt G2-Si quadrupole time-of-flight mass spectrometer (Waters) coupled to an I-class UPLC system (Waters). Separations were carried out on a CSH phenyl-hexyl column (2.1 × 100 mm, 1.7 µm) (Waters). The mobile phases were (A) water with 0.1% formic acid and (B) acetonitrile with 0.1% formic acid. The flow rate was 250 µL/min, and the column was held at 40°C. The gradient was as follows: 0 min, 1% B; 1 min, 1% B; 8 min, 40% B; 13 min, 58.8% B; 13.5 min, 100% B; 15.5 min, 100% B; 16 min, 1% B; 18 min, 1% B. Flow rate was ramped to 600 µL/min at 13.5 min to speed up column flushing and re-equilibration.
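The gradient program can be expressed as a breakpoint table with interpolation between points. The sketch below assumes linear segments between the listed breakpoints and treats the flow-rate change at 13.5 min as a step that persists to the end of the run; neither detail is stated explicitly in the method.

```python
# (time in min, % mobile phase B) breakpoints as listed in the text.
GRADIENT = [(0, 1.0), (1, 1.0), (8, 40.0), (13, 58.8),
            (13.5, 100.0), (15.5, 100.0), (16, 1.0), (18, 1.0)]

def percent_b(t_min):
    """Linearly interpolate %B at time t between consecutive breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

def flow_rate_ul_min(t_min):
    """250 µL/min until 13.5 min, 600 µL/min afterwards (step change assumed)."""
    return 250 if t_min < 13.5 else 600
```

For example, `percent_b(4.5)` falls halfway along the 1 to 8 min segment, between 1% and 40% B.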
The MS was operated in positive ion mode (50 to 1200 m/z) with a 100 ms scan time. Source and desolvation temperatures were 150°C and 600°C, respectively. Desolvation gas was set to 1100 L/hr and cone gas to 150 L/hr. All gases were nitrogen except the collision gas, which was argon. Capillary voltage was 1 kV. Injection volume was 1 µL for all samples. The identity of bile acids in samples was confirmed by mass, retention time, and MS/MS as compared to authentic standards. Samples were analyzed in random order and injected in duplicate. Leucine enkephalin was infused and used for mass correction. Data processing (peak integration) was performed using QuanLynx software (Waters). Accuracy of peak integrations was checked manually.
Statistical tests were performed in the GraphPad Prism software package. Results are representative of two independent experiments. Statistical parameters for studies are reported in relevant figure legends and tables.
Figure S1. V. cholerae Pathology and Gut Distribution in Different Microbiome Contexts, Related to Figure 2
All experiments are in antibiotic-cleared suckling CD-1 mice. (A) Expression of ctxA in intestinal tissues of infected mice containing defined model human microbiomes. (B) Fluid accumulation in intestines of infected mice containing defined model human microbiomes. (C) Distribution of V. cholerae in infected mice containing defined model human microbiomes. n.s., not significant (Mann-Whitney U-test). Error bars represent mean ± SEM.
Figure S2. Mean V. cholerae Colonization in Antibiotic-Cleared Suckling CD-1 Mice Bearing Communities Containing B. obeum in Combinations of SR Species, Related to Figure 2
Colonization is normalized across experiments and reported as V. cholerae CFU recovered after infection as a fold of CFU gavaged. n.s., not significant (Mann-Whitney U-test).
Figure S3. Induction of BB170 AI-2 Reporter by Indicated Cell-Free Supernatants, Related to Figure 6
**p < 0.01 (Mann-Whitney U-test). Error bars represent mean ± SEM.
Figure S4. Contribution of Different Microbial Species to TC Levels, Related to Figure 6

(A) Mass spectrometry measurement of taurocholate (TC) to cholic acid (CA) ratio in distal third of small intestine of adult germfree C57BL/6 mice 2 days after
colonization with pure cultures of indicated strains. (B) Amount of V. cholerae recovered during co-infection of suckling CD-1 mice with either S. infantarius or
S. salivarius, normalized to input CFU V. cholerae. (C) Ability of B. obeum and V. cholerae to interfere with TC activation of virulence in reporter V. cholerae in vitro
after 5 hours incubation, normalized to induction by 125 mM TC. *p < 0.05, **p < 0.01 (Mann-Whitney U-test), n.s. not-significant. Error bars represent
mean ± SEM.
Figure S5. Comparison of Bile Salt Hydrolases, Related to Figure 6

(A) Predicted structure of B. obeum bile salt hydrolase RUMOBE_00028. (B) Predicted structure of consensus phylotype 1 BSH. (C) Amino acid alignment of
RUMOBE_00028, phylotype 1 BSH, and phylotype 6 BSH using Chimera. Header in gray shows the spatial variation per column. Colored boxes show structural
similarity between regions, with coloring of one-letter code amino acids using Clustal X, dependent on both residue type and the pattern of conservation across
aligned sequences.
Figure S6. Levels of Different Phylotypes of Microbial bsh Enzymes in Metagenomic Sequencing of Fecal Microbiomes of Cholera Patients
Pre- (d0) and Post- (+abx) Antibiotic Treatment Compared to Healthy Individuals in Bangladesh, Related to Figure 7
*p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001 (Mann-Whitney U-test), n.s. not-significant. Error bars represent mean ± SEM.
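The legends above rely on the Mann-Whitney U-test for group comparisons. As an illustration of what the statistic measures, here is a minimal sketch of the U computation with midranks for ties. It is not the authors' analysis (the text states that tests were run in GraphPad Prism), it does not compute a p value, and the function name is hypothetical.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples, using midranks for ties.

    U_x counts the (x, y) pairs in which the x value exceeds the y value,
    with each tied pair contributing 0.5.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n_x, n_y = len(x), len(y)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2
    return u_x, n_x * n_y - u_x


if __name__ == "__main__":
    # Made-up values for demonstration only; not data from the study.
    print(mann_whitney_u([1.2, 0.8, 1.5], [2.4, 3.1, 2.9]))
```

The smaller of the two U values is what is conventionally compared against the null distribution.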
Article
Neuronal Inactivity Co-opts LTP Machinery to Drive

Potassium Channel Splicing and Homeostatic Spike
Widening
Boxing Li, Benjamin S. Suutari,
Simon D. Sun, ..., Thomas A. Neubert,
Gordon Fishell, Richard W. Tsien
Correspondence
liboxing@mail.sysu.edu.cn (B.L.),
richard.tsien@nyulangone.org (R.W.T.)
In Brief
Silencing neuronal activity triggers similar
molecular mechanisms as activating
neurons during long-term potentiation,
demonstrating Hebbian mechanisms of
homeostatic spike regulation.
Highlights
• Chronic spike blockade with tetrodotoxin causes homeostatic spike broadening
• Alternative splicing of BK channels by exclusion of a specific exon is responsible
• Synaptic homeostasis starts CaM kinase signaling to drive nuclear exit of Nova-2
• Chronic inactivity and hyperactivity can initiate similar LTP-like events
Li et al., 2020, Cell 181, 1547–1565

Neuronal Inactivity Co-opts LTP Machinery
to Drive Potassium Channel Splicing
and Homeostatic Spike Widening
Boxing Li,1,2,* Benjamin S. Suutari,2,3,7 Simon D. Sun,2,3,7 Zhengyi Luo,4,7 Chuanchuan Wei,1,7 Nicolas Chenouard,2
Nataniel J. Mandelberg,2 Guoan Zhang,5 Brie Wamsley,2,6 Guoling Tian,2 Sandrine Sanchez,2 Sikun You,1
Lianyan Huang,1 Thomas A. Neubert,5 Gordon Fishell,2,6 and Richard W. Tsien2,3,8,*
1Neuroscience Program, Guangdong Provincial Key Laboratory of Brain Function and Disease, Zhongshan School of Medicine and The Fifth
Affiliated Hospital, Sun Yat-sen University, Guangzhou 510810, China
2Department of Neuroscience and Physiology, Neuroscience Institute, NYU Grossman Medical Center, New York, NY 10016, USA
3Center for Neural Science, New York University, New York, NY 10003, USA
4Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, RNA Biomedical Institute, Sun Yat-sen
Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510120, China
5Department of Biochemistry and Molecular Pharmacology and Skirball Institute, NYU Grossman Medical Center, New York, NY 10016, USA
6Stanley Center for Psychiatric Research, The Broad Institute, Cambridge, MA 02142, USA
8Lead Contact
*Correspondence: liboxing@mail.sysu.edu.cn (B.L.), richard.tsien@nyulangone.org (R.W.T.)

SUMMARY
Homeostasis of neural firing properties is important in stabilizing neuronal circuitry, but how such plasticity
might depend on alternative splicing is not known. Here we report that chronic inactivity homeostatically
increases action potential duration by changing alternative splicing of BK channels; this requires nuclear
export of the splicing factor Nova-2. Inactivity and Nova-2 relocation were connected by a novel synapto-nuclear
signaling pathway that surprisingly invoked mechanisms akin to Hebbian plasticity: Ca2+-permeable
AMPA receptor upregulation, L-type Ca2+ channel activation, enhanced spine Ca2+ transients, nuclear
translocation of a CaM shuttle, and nuclear CaMKIV activation. These findings not only uncover commonalities
between homeostatic and Hebbian plasticity but also connect homeostatic regulation of synaptic transmission
and neuronal excitability. The signaling cascade provides a full-loop mechanism for a classic autoregulatory
feedback loop proposed 25 years ago. Each element of the loop has been implicated previously in
neuropsychiatric disease.
INTRODUCTION

Neurons use multiple forms of plasticity to allow circuits to be flexible yet stable: Hebbian plasticity (positive feedback to strengthen already active circuit elements, as in long-term potentiation [LTP]) and homeostatic plasticity (negative feedback to boost neuronal elements deprived of activity or depress those undergoing hyperactivity). Homeostatic plasticity helps keep neuronal circuits away from extremes of silence or runaway excitation (Turrigiano and Nelson, 2004) and occurs at multiple levels of synapses, neurons, and circuits (Desai et al., 1999; Kim and Tsien, 2008; O’Leary et al., 2014; Turrigiano, 2008). Homeostatic plasticity is known to rely on changes in gene transcription (Schaukowitch et al., 2017), mRNA translation (Maghsoodi et al., 2008; Penney et al., 2012; Schanzenbächer et al., 2016), and protein stability (Ehlers, 2003). In contrast, alternative splicing (AS) of precursor mRNA has received little attention as a basis of neuronal homeostasis. This is surprising because AS is a critical intermediary between gene transcription and translation and engenders the enormous complexity of neuronal proteomes. By regulating exon inclusion, AS diversifies synaptic components, ion channels, and other key neuronal proteins (Furlanis and Scheiffele, 2018; Licatalosi and Darnell, 2010; Vuong et al., 2016). Dysregulation of AS contributes to brain disorders (Licatalosi and Darnell, 2006; Parikshak et al., 2016), possibly via altered homeostasis (Mullins et al., 2016; Ramocki and Zoghbi, 2008; Styr and Slutsky, 2018; Wondolowski and Dickman, 2013).
We asked whether AS supports homeostatic responses induced by long-term inactivity. Would this be converse to neuronal responses to hyperactivity (Ding et al., 2017; Iijima et al., 2011; Li et al., 2007; Mauger et al., 2016; Xie and Black, 2001), or would inactivity engage its own distinct mechanisms? In turn, how does AS affect neuronal activity?
Answering these questions would delineate a classic homeostatic feedback loop

Δ Firing / Δ Ca2+ signaling → Δ Gene expression → Δ Cellular proteins → Δ Firing / Δ Ca2+ signaling

proposed more than 25 years ago at Brandeis (LeMasson et al., 1993; Marder et al., 1996; Siegel et al., 1994). This scheme embodies the feedback principles of neuronal homeostasis but has never been worked out in a full loop for any aspect of neuronal function. We focused on action potential duration (APD), fundamental in tuning neurons, synapses, and circuits. Even small changes in APD greatly affect Ca2+ influx (Borst and Sakmann, 1998; Geiger and Jonas, 2000; Llinás et al., 1982) and, thus, neurotransmitter release, gene expression, and overall neuronal function (Byrne and Kandel, 1996; Deng et al., 2013; Jackson et al., 1991; Matthews et al., 2009; Sabatini and Regehr, 1997). APD is generally prolonged by inactivity (Kim and Tsien, 2008; Trasande and Ramirez, 2007), an appropriate response to homeostatically restore Ca2+ entry, but remarkably little is known about underlying mechanisms.
Here we report that inactivity-induced homeostatic regulation of APD in excitatory neurons is controlled by AS of the large-conductance Ca2+-activated potassium channel (BK channel). We deciphered the underlying signaling mechanism: a specific change in BK AS driven by a novel cascade involving Ca2+-permeable glutamate receptors, voltage-gated Ca2+ channels, calcium/calmodulin-dependent (CaM) kinase kinase β (βCaMKK), CaM kinase IV (CaMKIV), and the splicing factor Nova-2. We find that complete silencing of an excitatory neuron triggers signaling similar to that activated during direct depolarization. Strikingly, each signaling player is encoded by a gene implicated previously in neuropsychiatric disease.

RESULTS

Homeostatic AP Broadening Driven by BK Current Attenuation
Although homeostasis of action potential (AP) frequency is well studied (Desai et al., 1999; Lee and Chung, 2014; Maffei and Turrigiano, 2008), how APD is regulated remains mysterious. This distinction is critical because spike width regulates neuronal excitability, Ca2+ influx, and neurotransmission and is governed by its own set of ion channels operating in parallel with channels controlling spike frequency (Kimm et al., 2015).
To examine homeostasis of APD, we blocked spiking of cultured cortical neurons by chronic (24- or 48-h) sodium channel blockade with tetrodotoxin (TTX) and examined APD after TTX removal. Chronically silenced neurons had a significantly longer APD than mock-treated controls (Figures 1A and S1), and AP-induced Ca2+ transients were elevated 1.5- to 2-fold in amplitude and duration (Figure 1B), as expected from APD-dependent prolongation of Ca2+ channel activation (Bischofberger et al., 2002). Thus, chronic inactivity drove compensatory increases in APD and Ca2+ entry as negative feedback autoregulation.
The slowing of repolarization that drove APD prolongation was associated with blunted afterhyperpolarization (AHP) (Figure 1A). Both changes would ensue if potassium channel activity was reduced. We considered the BK channel because it regulates APD and AHP (Hu et al., 2001; Lee and Cui, 2010; Sausbier et al., 2004; Shao et al., 1999). Indeed, inhibiting BK channels with a specific blocker, iberiotoxin (IbTx), lengthened AP half-width (Figures 1C and 1D). To test whether BK channels are involved in inactivity-induced APD widening, we applied IbTx to TTX-silenced cortical cultures (48 h) just before AP recording. Strikingly, IbTx did not further widen APs (Figures 1C and 1D); the effect of inactivity largely occluded IbTx’s effect on APD. Evidently, neuronal silencing and IbTx converge on a common pathway, inhibition of BK activity, in widening the AP.

Chronic Inactivity Alters BK Channel AS
Chronic inactivity might inhibit BK channels via gene expression and membrane trafficking. However, decreases in transcription, mRNA translation, or protein trafficking were ruled out by measurements of BK mRNA, total protein, and surface membrane expression (Figures S2A–S2D). BK channels also undergo AS, with profound effects on activity (Fodor and Aldrich, 2009; Shelley et al., 2013; Shipston and Tian, 2016; Xie and McCobb, 1998; Zarei et al., 2001). We assessed the expression of exons predicted to undergo splicing (X1–X6; Figure 1E) via RT-PCR (Pietrzykowski et al., 2008). In cultured cortical neurons, BK underwent AS in X3, X4, and X6 but not in X1, X2, and X5 (Figures 1F and S3A). Notably, only X6 (called E29 hereafter) splicing decreased after chronic TTX treatment (Figure 1F); AS of other exons remained unchanged (Figure S3B), narrowing our search for the basis of BK modulation.
We also tested the effect of enhanced activity by depolarization with K+-rich culture media (20, 40, and 60 mM K+; Figure S3C). Surprisingly, chronic depolarization decreased E29 exon inclusion (Figures 1G and S3D); the magnitude of shift in AS varied with strength and duration of depolarization (Figure S3D), but always in the same direction as chronic inactivity. We return later to resolve the paradox of how chronic silencing and depolarization produce similar changes in BK AS.
Depolarization- or seizure-induced changes in AS involve activation of CaM-dependent kinases (Xie and Black, 2001). In contrast, little is known about AS induced by inactivity, leading us to focus on its functional effect and underlying mechanisms.

Altered E29 Inclusion Dampens K+ Current and Broadens Spikes
E29 encodes 27 amino acids strategically located within the regulator of K+ conductance (RCK) domain, near the Ca2+ bowl, one of the Ca2+ binding sites promoting BK activation (Figure 1E). E29 inclusion affects BK responses to Ca2+ (Ha et al., 2000) and to alcohol-induced microRNAs (miRNAs) (Pietrzykowski et al., 2008). We expressed BK channels in HEK293 cells to find out whether E29 inclusion influences BK currents. BK currents were smaller with ΔE29 than with E29 (Figures 1H and 1I), whereas the membrane expression level was no different (Figures S2E–S2G). Thus, the reduction of E29 inclusion after
Figure 1. Chronic Spike Blockade-Induced BK Channel AS Is Responsible for Homeostatic Prolongation of AP Duration
(A) Action potentials (APs) recorded from sham control- or chronic TTX-treated (48 h) neurons (left); AP duration (APD) was measured as half-width (right, n = 10).
Scale bars, 20 mV, 1 ms.
(B) Single AP-elicited Ca2+ transients from control or TTX neurons transfected with GCaMP6s (left) and ΔF/F (right, n = 6).
(C) APs recorded from control (left) and TTX (right) neurons with (red) or without (black) IbTx. Scale bar, 20 mV and 1 ms.
TTX treatment likely contributes to the dampening of BK outward current.
We returned to cultured cortical neurons to test directly whether a drop in E29 inclusion leads to APD prolongation. Upon overexpression of the ΔE29 BK construct, APD was longer, in line with lower BK conductance, relative to that observed for its E29 counterpart. Furthermore, action potentials of E29-expressing neurons failed to broaden with chronic silencing. Likewise, the already widened AP of ΔE29-expressing neurons showed no further prolongation (Figures 1J and 1K). Thus, inclusion or exclusion of E29 in overexpressed channels overrides the regulation engaged by chronic inactivity, suggesting that such regulation involves E29 splicing. Clinching this calls for understanding how the splicing is controlled.

Nova-2 Binds to the Intron Downstream of E29
AS of BK channels has been intensely studied, but how E29 inclusion is controlled remains unknown. Generally, splicing regulatory factors favor exon inclusion when binding to downstream introns but inhibit it when binding to upstream introns or the exon itself (Black, 2003; Ule et al., 2006). Our bioinformatics analysis of the intronic sequence past E29 revealed YCAY (Y=C/U) clusters (Figure 2A), consensus sequences for binding of Nova proteins, a well-studied class of neuron-specific RNA-binding proteins that regulate AS of neuronal proteins (Buckanovich and Darnell, 1997; Ule et al., 2005). Mice lacking Nova-2, the predominant isoform in the neocortex (Yang et al., 1998), are deficient in E29 inclusion (Ule et al., 2005).
To see whether E29 splicing is directly regulated by Nova-2, we tested whether Nova-2 binds to synthetic RNA oligonucleotides spanning the putative binding region in the downstream intron (probe A1; Figure 2A). Pull-down assays showed that probe A1 binds efficiently to endogenous Nova-2 from mouse cortical lysates. In contrast, no binding to Nova-2 was detected with a probe with mutations in putative Nova-2-binding (YCAY) sites (probe A2) or a control probe spanning a 35-bp intronic stretch upstream of A1 without YCAY motifs (probe B; Figures 2A and 2B). We asked whether Nova-2 also bound to the intron downstream of E29 in vivo, subjecting cortical lysates to RNA immunoprecipitation (IP) with Nova-2 antibodies and assaying with RT-PCR using primers flanking the YCAY sites. RT-PCR product was detected following IP with Nova-2 antibodies but not with control immunoglobulin G (IgG) (Figure 2C). No product from the Nova-2 immunoprecipitate was detected without reverse transcriptase, excluding an effect of genomic DNA contamination (Figure 2C). Thus, in vivo and in vitro experiments show that Nova-2 binds directly to the YCAY sites in the intron downstream of E29.

Nova-2 Is Required for E29 Splicing
Testing the necessity of Nova-2 for E29 inclusion, we knocked down Nova-2 in cultured cortical neurons using short hairpin RNAs (shRNAs) that targeted its coding sequence (CDS) or its untranslated region (UTR) (Figure S4A); the UTR-directed shRNA spared an exogenous Nova-2 construct lacking the UTR (Nova-2(R); Figure S4B). E29 inclusion was sharply reduced by lentiviral delivery of UTR-targeting shRNA but not of scrambled shRNA and not of exogenous Nova-2(R) (Figure 2D). This match to the pattern of Nova-2 knockdown and rescue (Figure S4B) suggested that Nova-2 is necessary for the AS event.

Changes in Nuclear Localization of Nova-2 Can Regulate E29 Splicing
Because Nova-2 acts outside of the nucleus (Racca et al., 2010), not just within it, we assessed the effect of cellular locale by comparing the effects of Nova-2 lacking its nuclear localization signal (ΔNLS) and with the NLS intact (wild type [WT]). Although WT Nova-2 was concentrated in the nucleus, ΔNLS-Nova-2 was mainly cytoplasmic (Figure 2E). We verified that Nova-2 directly regulates E29 inclusion using a splice reporter (Stoilov et al., 2008) containing E29 and partial flanking intron sequences (Figure 2F) in HEK293 cells (largely lacking endogenous Nova-2). In controls, E29 inclusion was knocked down by Nova-2 shRNA and rescued by Nova-2(R) (Figure S4C). Critically, ΔNLS-Nova-2 largely failed to induce E29 inclusion (Figures 2F and S4C), supporting the importance of nuclear localization.

Nova-2 Is Necessary and Sufficient for TTX-Induced E29 Exclusion
To confirm in neurons that Nova-2 is critical for chronic inactivity-induced E29 reduction, we assayed the effects of chronic TTX after knockdown of Nova-2 (Figure 2G). Although scrambled lentiviral shRNA spared the reduction of E29 inclusion following 48-h TTX treatment, knockdown of Nova-2 mimicked and occluded the effect of chronic inactivity on E29 inclusion (Figure 2G). These experiments showed that Nova-2 binds directly to the intron downstream of E29 and is necessary and sufficient for regulation of E29 inclusion by inactivity.

Chronic Inactivity Reduces Nova-2 Nuclear Localization In Vitro
This sets up the question of how Nova-2 effectiveness is linked to chronic inactivity. Because Nova-2 must be in the nucleus to control splicing, its cellular relocalization is a potential control point. We assessed Nova-2 localization in cultured cortical neurons after 48-h treatment with TTX. In TTX-treated neurons,
(D) Half-widths of APs depicted in (C) (n = 8).

(E) Diagram of BK channel pore-forming α subunit and AS sites (X1–X6, arrowheads) (Pietrzykowski et al., 2008). Amino acid sequences of the sixth exon X6, E29
(purple, underlined), near the ‘‘Ca2+ bowl’’ (orange, underlined).
(F) RT-PCR products from control or TTX cultures with or without E29 inclusion (E29 or ΔE29, left) and percentage of products containing E29 (right; n = 8).
(G) E29 splicing in control or 60 mM K+ (60K)-treated cultures (24 h) and percentage of products containing E29 (right, n = 6).
(H) Whole-cell recordings from HEK293 cells overexpressing BK channel constructs with or without E29.
(I) Corresponding current–voltage (I-V) curves.
(J) APs from control or TTX cortical neurons transfected with plasmids encoding BK channel isoforms with or without E29. Scale bar, 20 mV and 1 ms.
(K) Pooled half-widths of APs depicted in (J) (n = 7).
For (A), (B), (D), (F), (G), (I), and (K), data are represented as mean ± SEM; *p < 0.05; **p < 0.01; N.S. represents p > 0.05.
See also Figures S1–S3.
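The legend above reports AP duration as the half-width of the spike. The following is a minimal sketch of one common way to compute a half-width from a sampled voltage trace: the width at half of the peak amplitude above baseline, with linear interpolation at the two crossings. The baseline definition, sampling interval, and function name are assumptions for illustration; the text does not describe the authors' exact procedure.

```python
def half_width_ms(voltage_mv, dt_ms, baseline_mv):
    """Spike width (ms) at half of the peak amplitude above baseline_mv.

    voltage_mv is a list of evenly sampled membrane potentials containing one
    spike; dt_ms is the sampling interval. Crossing times are linearly
    interpolated between samples.
    """
    peak_idx = max(range(len(voltage_mv)), key=voltage_mv.__getitem__)
    half = baseline_mv + 0.5 * (voltage_mv[peak_idx] - baseline_mv)

    def crossing_time(i_below, i_above):
        # Time at which the line between the two samples equals `half`.
        v0, v1 = voltage_mv[i_below], voltage_mv[i_above]
        frac = (half - v0) / (v1 - v0)
        return (i_below + frac * (i_above - i_below)) * dt_ms

    i = peak_idx
    while i > 0 and voltage_mv[i - 1] >= half:
        i -= 1
    j = peak_idx
    while j < len(voltage_mv) - 1 and voltage_mv[j + 1] >= half:
        j += 1
    if i == 0 or j == len(voltage_mv) - 1:
        raise ValueError("trace does not fall below half amplitude on both sides")
    return crossing_time(j + 1, j) - crossing_time(i - 1, i)


if __name__ == "__main__":
    # Synthetic triangular "spike" for demonstration only; not recorded data.
    trace = [-70, -30, 10, 50, 10, -30, -70]
    print(half_width_ms(trace, dt_ms=0.5, baseline_mv=-70))
```

On real recordings the baseline is often taken at spike threshold or as the pre-spike resting potential; that choice changes the absolute half-width and should be kept constant across conditions.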
Figure 2. Nova-2 Regulates E29 AS
(A) Diagram of E29 (red box) and flanking introns (top) and biotinylated RNA probes (A1, A2, and B) created, spanning downstream intronic YCAY sequences (bottom).
(B) RNA pull-down of proteins from adult mouse brains, using the probes in (A) or empty beads with immunoblot of Nova-2 protein (n = 3).
(C) RT-PCR products from IP of Nova-2 from mouse cortical lysates. IP with IgG and an assay without reverse transcription (No RT) are controls.
(D) E29 splicing in neurons transfected with scrambled shRNA (Scr.), shRNA against the UTR of the Nova-2 gene (shRNA(UTR)), or shRNA(UTR) with exogenous shRNA(UTR)-resistant Nova-2 (Nova-2 (R)).
(E) Diagram of Nova-2 with the NLS present (WT) or absent (ΔNLS), with RNA binding domains (top: KH1, KH2, and KH3) and immunofluorescence of FLAG-tagged Nova-2 with or without the NLS. Nuclei are indicated by dashes. Scale bar, 10 μm.
(F) Diagram of the splicing reporter containing constitutive exons (black rectangles), E29 (red rectangle), and partial flanking intron sequences (horizontal lines) (top) and RT-PCR products from HEK293 cells transfected with the splicing reporter with control vector (sham), WT Nova-2 (WT), or ΔNLS-Nova-2 co-expression.
(G) E29 splicing in cultured cortical neurons infected by lentiviral Scr. or shRNAs against Nova-2 gene exposed to a sham control (Con) or 48 h TTX.
See also Figure S4.
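The YCAY (Y = C/U) consensus referred to in (A) is simple enough to scan for directly. Below is a minimal sketch of such a scan using a regular expression with a lookahead so that overlapping motifs are all reported. It is an illustration of the motif definition only, not the authors' bioinformatics pipeline, and the example sequence is made up rather than taken from the BK intron.

```python
import re

# Lookahead group so overlapping motifs (e.g., in "UCAUCAU") are all found.
YCAY = re.compile(r"(?=([CU]CA[CU]))")


def find_ycay(rna):
    """Return (0-based position, motif) for every YCAY site in the sequence.

    DNA input is accepted and converted to RNA (T -> U) before scanning.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in YCAY.finditer(rna)]


if __name__ == "__main__":
    # Hypothetical sequence for demonstration only.
    for pos, motif in find_ycay("AAUCAUCAUGGCCACUU"):
        print(pos, motif)
```

Counting hits within a sliding window would be one way to flag clusters, although the window size and threshold would be further assumptions.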
Nova-2 nuclear intensity was lower than in control cells, whereas cytosolic Nova-2 intensity increased (Figure 3A), yielding a relative 2-fold drop in the nuclear/cytoplasmic ratio (Figure 3B, left). We verified these immunocytochemical findings by nuclear fractionation and immunoblot analysis (Figure 3C). After TTX exposure, nuclear Nova-2 levels fell, cytoplasmic Nova-2 rose, and total Nova-2 expression remained unchanged (Figures 3C and 3D), indicating that Nova-2 translocates from the nucleus to the cytosol upon silencing of cultured neurons. A similar shift in the nucleus/cytoplasmic ratio was seen upon chronic depolarization (Figure 3B, right); thus, the paradoxically similar regulation of E29 inclusion arises at or before Nova-2 relocation.

Chronic Inactivity Reduces Nova-2 Nuclear Localization In Vivo
We next asked whether silencing neocortical neurons in vivo caused similar changes as in dissociated cultures. Tonic inputs from the thalamus to the neocortex, which excite ongoing activity of cortical neurons, were chronically eliminated in mice by expressing tetanus toxin light chain (TeTox) in Olig3-positive thalamic neurons (Figure 3E). Nova localization was assayed in cortical neurons innervated by thalamic inputs. Nova was largely nuclear in Olig3 Cre (control) mice but excluded from the nucleus in Olig3 Cre; TeToxf/+ mice (Figures 3F and 3G). Chronic removal of thalamic input led to Nova translocation from the nucleus to the cytosol of cortical neurons.
Inactivity-induced Nova-2 translocation was also observed with a monocular deprivation (MD) paradigm (Wiesel and Hubel, 1963). A lid suture was applied in juvenile mice for 5 days during the critical period of visual development (post-natal days 26–31) (Figure 3H), and subcellular localization of Nova-2 in monocular V1 was assayed by immunostaining. The nuclear intensity of Nova-2 in the deprived hemisphere (contralateral to the closed eye) was less than in the non-deprived hemisphere (ipsilateral); opposite differences were seen in cytoplasmic intensity (Figures 3I and 3J). Thus, MD-induced inactivity also induced in vivo Nova-2 translocation from the nucleus to the cytosol. Strikingly, MD also reduced E29 inclusion and increased APD in the vision-deprived hemisphere compared with its non-deprived counterpart (Figures 3K–3N). Thus, chronic inactivity in vivo leads to nucleus-to-cytosol Nova-2 redistribution, reduction of E29 inclusion, and increased APD.

CaMKIV Is Activated by Chronic Inactivity
While cellular control mechanisms of Nova-2 localization have not been reported, we considered scenarios including regulation by phosphorylation. CaMKIV was a likely candidate as a robust protein kinase, highly concentrated within the nucleus (Bito et al., 1996; Nakamura et al., 1995); CaMKIV regulates other splicing factors, such as heterogeneous nuclear ribonucleoprotein L (hnRNP L) (Liu et al., 2012), and supports homeostatic adjustments (Ibata et al., 2008). To test for involvement of CaMKIV, we looked at its status in the nucleus after chronic inactivity, monitoring phosphorylation on Thr196, its dominant activation site (Selbert et al., 1995; Tokumitsu and Soderling, 1996). Thr196 phosphorylation was higher in the nucleus of TTX-treated neurons relative to the control (Figure 4A), resulting in a 1.5-fold higher nuclear/cytoplasmic ratio (Figure 4B). Elevated activation of nuclear CaMKIV was also seen in vivo upon immunoblot analysis of samples from cortical V1, comparing lysates from monocularly deprived and non-deprived hemispheres (Figure 4C).

CaMKIV Signaling Drives Nuclear Nova-2 Export
Is mimicking CaMKIV activation sufficient to induce Nova-2 translocation? To test this, we expressed constitutively activated CaMKIV (CA-CaMKIV) (Cruzalegui and Means, 1993) and assayed Nova-2 localization. In vector control-transfected neurons, Nova-2 was predominantly located in the nucleus (Figure 4D, top row). In contrast, in CA-CaMKIV-transfected neurons, Nova-2 nuclear intensity was lower, and cytoplasmic intensity was higher (Figure 4D, second row), causing a significant drop in nuclear/cytoplasmic (nuc/cyt) ratio (Figure 4E). Evidently, activation of CaMKIV is sufficient to drive Nova-2 translocation. Continuous activation of CaMKIV requires its binding to Ca2+/CaM. Accordingly, we overexpressed a nucleus-localized Ca2+/CaM trap, CaMBP4(nuc) (Cohen et al., 2016; Wang et al., 1995), before examining Nova-2 location. This intervention abolished the chronic inactivity-induced nuclear export of Nova-2 (Figures 4D, fourth row, and 4E), supporting the idea that Nova-2 translocation is regulated by activated CaMKIV. These experiments indicate that CaMKIV activation is sufficient and necessary to induce Nova-2 translocation from the nucleus to the cytosol.

CaMKIV Interaction with Nova-2
Both Nova-2 and CaMKIV are mostly located in the nucleus (Figure 5A), but whether they directly interact is unknown. To explore this, we co-expressed constructs encoding FLAG-tagged Nova-2 and GFP-tagged CaMKIV in HEK293 cells. CoIP and western blotting with anti-FLAG or anti-GFP antibodies were performed on protein lysates to detect protein-protein interaction. FLAG-tagged Nova-2 was found with IP using anti-GFP antibody; GFP-tagged CaMKIV was seen upon IP with anti-FLAG antibody (Figure 5B). No reciprocal interaction was detected upon IP of lysates of mock-transfected cells or when they were expressed individually (Figure 5B).
The interaction was evident even at endogenous levels of protein expression in neurons. Nova-2 was immunoprecipitated with anti-CaMKIV antibody; CaMKIV was brought down with anti-Nova-2 antibody, but no binding interactions were seen with IgG controls (Figure 5C). These results indicate that Nova-2 and CaMKIV engage in protein-protein interaction, studied as exogenous or endogenous proteins.

CaMKIV Phosphorylates Nova-2 and Regulates Its Nuclear Localization
To find out whether CaMKIV phosphorylates Nova-2, we co-expressed CA-CaMKIV (or GFP as a control) with FLAG-tagged
Figure 3. Chronic Spike Blockade Reduces Nova-2 Nuclear Localization and E29 Inclusion
(A) Immunofluorescence of Nova-2 from neurons treated with the sham Con or 48 h TTX. Nuclei are indicated by dashes. Scale bar, 10 μm.
(B) Quantification of the nuc/cyt ratio of Nova-2 immunofluorescence intensity from neurons sham- or 48 h TTX-treated (left) and sham- or 24 h 60 mM K+ solution-treated (right) (n = 30 cells).
(C) Western blots from nuclear (left), cytosolic (center), or whole-cell fractions (right), with Nova-2, Lamin B, and Glyceraldehyde 3-phosphate dehydrogenase
(GAPDH) levels.
(D) Quantification of (C) (for each condition, n = 4).
(E) Diagram of eliminated thalamic inputs to the cortex.
(F) Nova (red) localization in cortical neurons from Olig3 Cre or Olig3 Cre; TeToxf/+ mice. Scale bar, 20 μm.
(G) Quantified nuc/cyt ratio of Nova immunofluorescence intensity from the groups in (F) (n = 20 cells).
(H) Diagram of visual pathways involved in monocular deprivation (MD), with the monocular region of the primary visual cortex (V1mono) indicated (boxes).
(I) Nova-2 (red) localization in the V1mono region, contralateral (Contra) or ipsilateral (Ipsi) to visual deprived eye (MD). Scale bar, 10 μm.
(J) Quantified nuc/cyt ratio of Nova-2 immunofluorescence intensity from the groups in (I) (n = 20 cells).
(K) E29 splicing in V1mono Contra or Ipsi to the visually deprived eye (MD).
(L) Quantification of E29 splicing in (K) (n = 4).
(M) AP waveforms recorded from the V1mono region, Contra or Ipsi to the visually deprived eye (MD).
(N) Half-width of APs in (M) (n = 12–14).
For (B), (D), (G), (J), (L), and (N), data are represented as mean ± SEM; **p < 0.01; N.S. represents p > 0.05.
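Several panels above summarize localization as a nuclear/cytoplasmic (nuc/cyt) intensity ratio reported as mean ± SEM. The sketch below shows the arithmetic under stated assumptions: one mean intensity per compartment per cell, no background subtraction, and SEM computed from the sample standard deviation. How regions of interest were drawn and whether background was subtracted is not described in this excerpt, and the function names and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def nuc_cyt_ratio(nuclear_mean, cytoplasmic_mean):
    """Per-cell ratio of mean nuclear to mean cytoplasmic fluorescence."""
    if cytoplasmic_mean <= 0:
        raise ValueError("cytoplasmic intensity must be positive")
    return nuclear_mean / cytoplasmic_mean


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample SD / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("need at least two values to estimate SEM")
    return mean(values), stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Made-up (nuclear, cytoplasmic) intensities; not data from the study.
    cells = [(200.0, 50.0), (180.0, 60.0), (210.0, 70.0)]
    ratios = [nuc_cyt_ratio(n, c) for n, c in cells]
    m, sem = mean_sem(ratios)
    print(f"nuc/cyt = {m:.2f} ± {sem:.2f} (n = {len(ratios)} cells)")
```

Computing the ratio per cell before averaging, as here, keeps n equal to the number of cells, which matches how n is reported in the legend.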
Figure 4. Inactivity-Induced CaMKIV Activation Leads to Nuclear Nova-2 Export
(A) Immunofluorescence of phosphorylated CaMKIV (pCaMKIV) from neurons treated with sham Con or 48 h TTX. The nucleus is indicated (dashed white). Scale bar, 10 μm.
(B) Quantified nuc/cyt ratio of pCaMKIV immunofluorescence intensity from groups in (A) (n = 20).
(C) Western blots indicating pCaMKIV and CaMKIV levels in V1mono Contra or Ipsi to the visually deprived eye (MD).
(D) Immunofluorescence of neurons transfected with Con vectors, FLAG-tagged CA-CaMKIV (top two rows, sham Con treatment), or FLAG-tagged nuclear CaMBP4 (bottom two rows, 48 h TTX). Scale bar, 10 μm.
(E) Quantified nuc/cyt ratio of Nova-2 immunofluorescent intensity in (D) (n = 20).
For (B) and (E), data are represented as mean ± SEM; **p < 0.01.
Nova-2 in HEK293 cells and probed Nova-2 phosphorylation Spontaneous Spine Depolarizations Detected with a
with mass spectrometry (liquid chromatography-tandem mass Membrane Voltage Probe
spectrometry [LC-MS/MS]) (Figures S5A–S5D and 5D–5G). How does chronic elimination of AP firing lead to activation of
Compared with the control, CA-CaMKIV drove Nova-2 phos- CaMKIV and nuclear exit of Nova-2? Activity blockade with
phorylation at three sites: serine 25 (site 1), threonine 27 (site TTX abolishes evoked vesicle release but spares spontaneous,
2), and serine 194 (site 3). Sites 1 and 2 lie within or next to the AP-independent vesicle release. We asked whether sponta-
NLS in Nova-2, predictive of influence on nuclear localization neous synaptic transmission might be sufficient to activate
(Harreman et al., 2004); site 3 resides in the KH2 domain, one signaling to CaMKIV activation and Nova-2 translocation.
of the RNA binding domains (Yang et al., 1998). To test whether To test this, we monitored the membrane potential in den-
phosphorylation affects Nova-2 location in cultured neurons, we dritic spines of cortical neurons using the genetically encoded
probed the distribution of FLAG-tagged Nova-2 constructs with voltage indicator ASAP1 (St-Pierre et al., 2014). ASAP1 fluores-
the sites mutated to glutamic acid (E) to mimic the negative cence was seen in the plasma membrane of somata, dendritic
charge phosphorylation confers or to alanine (A) to prevent phos- shafts, and spines (Figure 6A). Decreases in ASAP1 fluores-
phorylation. In contrast to the mostly nuclear positioning of WT cence were linearly related to membrane depolarizations (St-
Nova-2, Nova-2 with phosphomimetic mutations at sites 1 and Pierre et al., 2014) imposed by K+-rich external solution (Fig-
2 (1E2E) was mostly cytoplasmic (Figures 5H and 5I). Single mu- ure 6B, gray symbols). In cortical cultures acutely exposed to
tations (1E or 2E) reduced Nova-2 nuclear localization, even TTX, dendritic spines exhibited dips in relative fluorescence in-
when the other site was mutated to alanine (1E2A or 1A2E) (Fig- tensity (DF/F), reflecting spontaneous excitatory postsynaptic
ures 5H and 5I). Conversely, Nova-2 with single or double muta- potentials (EPSPs) (Figure 6C, green exemplar trace, and 6B,
tions to alanine (2A and 1A2A) was predominantly nuclear, like pooled data). After chronic TTX, spontaneous spine depolariza-
the WT (Figures 5H and 5I). In contrast, glutamate mutations at tions grew by 10 mV (Figure 6C, red trace), exceeding depo-
site 3 had no effect on Nova-2 localization (Figures 5H and 5I) larization attained with 20 mM K+ (Figure 6B, pooled data), suf-
or on Nova-2 binding to RNA (Figure S5E). These results showed ficient to recruit L-type Ca2+ channel- and NMDA receptor
that active CaMKIV reduces Nova-2 nuclear localization by (NMDAR)-mediated Ca2+ influx (Helton et al., 2005; Mayer
phosphorylating sites 1 and 2 (S25, T27). et al., 1984; Nowak et al., 1984). For comparison, even larger
To test whether CaMKIV phosphorylation of Nova-2 controlled DF/F transients were observed in dendritic spines of control
its ability to regulate E29 splicing, we co-expressed the 1E2E neurons during backpropagating dendritic APs (bAPs) or so-
double mutant Nova-2 with the E29 splicing reporter in matic action potentials recorded without TTX (Figures 6C,
HEK293 cells. Unlike WT Nova-2, the 1E2E variant was unable blue and violet exemplar traces, and 6B, pooled data).
to induce E29 inclusion, like Nova-2 lacking its NLS (DNLS; Fig-
ure 5J). Evidently, CaMKIV binds to Nova-2 in the nucleus, Chronic Inactivity Enhances Spontaneous Spine Ca2+
primed to phosphorylate serine 25 and threonine 27 sites near Transients
the NLS of Nova-2; the resulting Nova-2 nuclear exit prevents To study spontaneous Ca2+ transients in dendritic spines,
its action in splicing, reducing exon 29 inclusion. we expressed the fluorescent Ca2+ indicator GCaMP6s
1554 Cell 181, 1547–1565, June 25, 2020

Article
Figure 5. CaMKIV Phosphorylates Nova-2 and Regulates Its Nuclear Localization

(A) Immunofluorescence of CaMKIV and Nova-2. Scale bar, 10 μm.
(B) Immunoblots of coIP lysate from HEK293 cells transfected with Nova-2-FLAG, CaMKIV-GFP, or both. Sham-transfected cells were used as Con.
(C) Immunoblots of CaMKIV and Nova-2 coIP lysate from cortical neurons. IgG was used as Con.
(D) Location of S25, T27, and S194 (red); amino acid sequences of the NLS (blue, underlined) and the partial KH2 domain are indicated.
(E–G) Quantification of phosphorylation levels of sites S25 (E), T27 (F), and S194 (G) of Nova-2 induced by CaMKIV, normalized to cells co-expressing Nova-2 and
GFP (red dotted lines) (n = 3).
(H) Micrographs of FLAG-tagged wild-type (WT) and mutant Nova-2. Nuclei are indicated by white dotted circles. Scale bar, 10 μm.
(I) Quantification of the nuc/cyt ratio of the fluorescence intensity of WT or mutant Nova-2 from (H) (n = 20–62). **p < 0.01.
(J) RT-PCR products from HEK293 cells expressing the E29 splicing reporter and WT Nova-2, ΔNLS Nova-2, or S25E/T27E Nova-2.
For (E)–(G) and (I), data are represented as mean ± SEM. See also Figure S5.
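The nuc/cyt ratios in (I) are summarized as mean ± SEM. The following is a minimal sketch of that arithmetic, assuming that the mean fluorescence intensity of the nuclear and cytoplasmic regions has already been measured for each cell. The function names and all numbers are hypothetical and serve only as illustration; this is not the authors' analysis code or data.

```python
import math
from statistics import mean, stdev

def nuc_cyt_ratio(nuclear_intensity, cytoplasmic_intensity):
    """Ratio of mean nuclear to mean cytoplasmic fluorescence for one cell."""
    if cytoplasmic_intensity <= 0:
        raise ValueError("cytoplasmic intensity must be positive")
    return nuclear_intensity / cytoplasmic_intensity

def mean_sem(values):
    """Return (mean, SEM), where SEM = sample SD / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate SEM")
    return mean(values), stdev(values) / math.sqrt(n)

# Hypothetical per-cell (nuclear, cytoplasmic) intensities, arbitrary units.
cells = [(300.0, 100.0), (280.0, 140.0), (250.0, 100.0), (320.0, 128.0)]
ratios = [nuc_cyt_ratio(n, c) for n, c in cells]
avg, sem = mean_sem(ratios)
print(f"nuc/cyt = {avg:.2f} ± {sem:.2f} (n = {len(ratios)})")
```

A ratio above 1 indicates predominantly nuclear signal, and a drop toward or below 1 indicates relocation to the cytoplasm.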
(Figures 6D–6G). Although the soma of cortical neurons was totally silent in the presence of TTX (Figures S6A and S6B), dendritic spines remained active despite spike blockade (Figures 6D–6G and S6C). More spines were active in neurons chronically treated with TTX than in controls (Figure 6H). Likewise, the Ca2+ transients were taller, broader, and larger in area (Figures 6I–6K), whereas fluorescent punctum size and event frequency per punctum were no different (Figures S6D and S6E). Thus, chronic
blockade of APs enhances spontaneous Ca2+ transients in spines, in line with optical voltage recordings.

Contributions of Various Ca2+ Pathways during Spontaneous Transmission
We characterized the elevated synaptic Ca2+ transients, mindful of increases in Ca2+-permeable, GluA1-containing AMPA receptors (AMPARs) following activity blockade (Kim and Ziff, 2014; Thiagarajan et al., 2005). Live-labeled surface GluA1 increased in neurons that had undergone chronic inactivity (Figures S6F and S6G). Elevated surface GluA1 contributed to the enlarged Ca2+ transient, indicated by a sharp drop in synaptic Ca2+ transients upon acute exposure to the GluA1 antagonist philanthotoxin (PhTx) (Figure 6L). Importantly, PhTx completely blocked chronic inactivity-induced AP prolongation (Figures 6M and 6N), indicating that GluA1 activation was critical for homeostatic regulation of APD.
Following chronic inactivity, elevated Ca2+ transients were also inhibited by blockade of NMDAR with ((2R)-amino-5-phosphonovaleric acid; (2R)-amino-5-phosphonopentanoate) APV and depletion of internal Ca2+ stores with thapsigargin (Figure 6L). This aligns with AMPAR activation recruiting Ca2+ delivery via NMDAR and intracellular Ca2+ stores (Emptage et al., 1999). Voltage-dependent L-type Ca2+ channels (CaV1) also contributed to spine Ca2+ transients during chronic inactivity, as judged by partial reduction with nimodipine (Figure 6L). A glutamate-gated cation current would create a voltage drop across the spine neck resistance (Harnett et al., 2012; Palmer and Stuart, 2009), giving rise to directly measured depolarizations and CaV1- and NMDA-mediated Ca2+ influx (Figure 6L).
We verified inactivity-induced engagement of Ca2+ signaling by testing for activation of CaMKII and CaMKI in dendritic spines, which, respectively, undergo autophosphorylation or CaM kinase kinase (CaMKK)-mediated phosphorylation in response to local Ca2+ signals (Wayman et al., 2008). Using site-specific anti-phospho-Thr antibodies, we showed CaMKII Thr286 autophosphorylation and CaMKI Thr177/178 phosphorylation, respectively (Figures S6H–S6K). Chronic inactivity augmented the intensity of both markers relative to the control (Figures S6H–S6K), providing independent biochemical evidence that local Ca2+ signaling in spines is enhanced by chronic inactivity.

The CaV1-CaMKK-CaMKIV Pathway Drives Reduced E29 Inclusion
Intensified minis and spine Ca2+ signaling could link inactivity to nuclear AS. To find out whether this involves a classical CaV1-CaMKK-CaMKIV cascade, like that engaged by depolarization (West et al., 2002), we monitored phosphorylation of CaMKIV as a pivotal step. Inactivity induced elevation of phosphorylated CaMKIV (pCaMKIV) but not in the presence of nimodipine (a CaV1 blocker), KN93 (a CaMK inhibitor), or STO-609 (a CaMKK blocker). This pattern was consistent in western blots of nuclear extracts (Figures 7A and 7B) and in nuc/cyto ratios obtained by immunocytochemistry (Figures S7A and S7B). Nova-2 localization showed a reciprocal pattern (Figures 7C, 7D, S7C, and S7D), which was mirrored by TTX-induced reduction in E29 inclusion (Figures 7E and 7F). These results supported a chain of signaling events emanating from spontaneously active spines whereby a CaV1-CaMKK-CaMKIV pathway drives reductions in nuclear Nova-2 and in E29 inclusion.
Activation of a CaV1-CaMKK-CaMKIV pathway by chronic inactivity appears surprising because it seemingly recapitulates effects of hyperactivity (Deisseroth et al., 2003; Ma et al., 2014). However, we found that directly imposed depolarization caused similar effects as chronic TTX on Nova-2 translocation and BK E29 exclusion (Figures S7E and S7F), with the latter effect completely prevented by nimodipine or KN93 (Figures 7G, 7H, and S7F). Thus, activation of CaV1 channels and CaMKs drives Nova-2 relocation and E29 splicing, irrespective of how the pathway is engaged.

βCaMKK Translocation Is Required for E29 Inclusion
A remaining question is how, without spiking, CaV1 activation at dendritic sites causes activation of CaMKIV in the nucleus. In excitation-transcription coupling (E-T coupling) following acute depolarization, signaling to nuclear CaMKIV involves translocation of Ca2+/calmodulin via different shuttle proteins: γCaMKII in cortical, hippocampal, and sympathetic neurons (Ma et al., 2014) and γCaMKI in parvalbumin-positive inhibitory neurons (Cohen et al., 2016). In seeking a translocator for excitation-AS coupling, we looked for an increase in nuclear level paired with a drop in cytoplasmic level during chronic inactivity. This pattern was not evident for αCaMKI, βCaMKI, γCaMKI, δCaMKI, γCaMKII, αCaMKK, and CaM itself (Figures 7J, S7G, and S7H). In contrast, levels of βCaMKK rose in the nucleus and fell in
Figure 6. Chronic Spike Blockade Leads to Elevated Depolarization and Ca2+ Transients in Dendritic Spines
(A) Micrograph of a neuron expressing ASAP1. Scale bar, 10 μm.
(B) Quantification of membrane potential (ordinate, left) of somata and spines induced by APs (AP in a soma, purple; bAP in a spine, blue) or by AP-independent synaptic transmission (spine depolarization in sham Con cultures, Vspine green; spine depolarization in 48 h TTX cultures, Vspine red), plotted against the corresponding change in ASAP1 fluorescence from the respective events (ΔF/F). See also STAR Methods.
(C) Example traces of ASAP1 fluorescence intensity (ΔF/F) in the spines and somata in (B).
(D and F) GCaMP6s expression in Con (D) and 48 h TTX neurons (F). Active spines with Ca2+ transients (during 5 min of imaging) are indicated (red dots). Scale bar, 10 μm. See also STAR Methods.
(E and G) GCaMP6s signal (spontaneous Ca2+ signals were recorded in the presence of TTX) from active spines (y axis, red dots in D and F) in Con (E) and 48 h TTX neurons (G).
(H–K) Comparison of synaptic Ca2+ transients between Con (black) and TTX (red) neurons. Shown are (H) number of active puncta (putative active spines) during
5 min of recording, (I) mean amplitude, (J) mean duration, and (K) normalized total fluorescence in Con and TTX (48 h) neurons (n = 14–15).
(L) Quantification of Ca2+ transient amplitudes in dendritic spines with wash on PhTx, APV, thapsigargin, or nimodipine (n = 8).
(M and N) AP waveforms (M) and pooled half-widths (N) after sham (black), 48 h TTX (red), or 48 h TTX with co-application of PhTx (green). Scale bars, 20 mV, 1 ms.
For (H)–(L) and (N), data are represented as mean ± SEM; *p < 0.05; **p < 0.01; N.S. represents p > 0.05. See also Figure S6.
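Panel (B) relates ΔF/F to membrane potential, and the text describes ASAP1 fluorescence decreases as linearly related to imposed depolarization. The following is a minimal sketch of how such a calibration could be applied, assuming a baseline fluorescence F0 and a small set of calibration points. The function names, the calibration values, and the example trace are hypothetical and are not taken from the paper or its STAR Methods.

```python
def delta_f_over_f(trace, baseline):
    """Relative fluorescence change (F - F0) / F0 for each sample."""
    if baseline <= 0:
        raise ValueError("baseline fluorescence must be positive")
    return [(f - baseline) / baseline for f in trace]

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical calibration: dF/F (fraction) vs. imposed depolarization (mV).
cal_dff = [0.0, -0.02, -0.04, -0.06]
cal_mv = [0.0, 10.0, 20.0, 30.0]
slope, intercept = fit_line(cal_dff, cal_mv)

# Hypothetical spine trace (arbitrary units) with a brief dip below baseline.
trace = [1000.0, 990.0, 970.0, 995.0]
dff = delta_f_over_f(trace, baseline=1000.0)
peak_depolarization_mv = slope * min(dff) + intercept
print(f"peak dF/F = {min(dff):.3f}, estimated depolarization = {peak_depolarization_mv:.1f} mV")
```

The slope is negative because ASAP1 fluorescence decreases on depolarization, so the largest dip in the trace maps to the largest estimated depolarization.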
the cytosol; its nuc/cyto ratio increased by more than 50% (Figures 7I and 7J). βCaMKK can phosphorylate CaMKIV and is thus a plausible mediator of cytosol-to-nucleus signaling and nuclear CaMKIV activation. We probed βCaMKK’s involvement by shRNA knockdown. This completely blocked inactivity-induced reduction of E29 inclusion, whereas concomitantly expressing shRNA-resistant βCaMKK fully rescued it (Figure 7K and 7L). Thus, βCaMKK translocates to the nucleus and is necessary for chronic inactivity-induced E29 splicing.

DISCUSSION

We found an unexpected mechanism by which excitatory neurons modify their APs in responding homeostatically to chronic inactivity (Figure 7M). The adaptation arises from a well-defined change in splicing and is mediated by a novel signaling cascade that mobilizes Hebbian-type signaling, even in the absence of spikes. Remarkably, each element of this signaling pathway has been genetically implicated in neuropsychiatric disease (see below).

Regulation of BK Channel Splicing Selectively Controls APD
We found that splicing of BK channels is necessary and sufficient for inactivity-induced lengthening of APD. Under typical circumstances, regulatory changes in one repolarizing current (e.g., BK) would be offset by compensatory changes in others (e.g., KV2) (Kimm et al., 2015). Here, however, the key findings are null effects (Figures 1C, 1D, 1J, 1K, 6M, and 6N); without changes in spike waveform, altered recruitment of other voltage-gated channels is not expected. Kimm et al. (2015) have further shown that blockade of BK channels with IbTx spares the current-frequency relationship. Thus, mechanisms other than BK regulation are expected and seen for homeostatic adjustments of firing frequency (Lee and Chung, 2014; Lee et al., 2015). For APD, we cannot exclude ancillary changes enabled by E29 inclusion, such as BK channel phosphorylation or modified auxiliary subunits. However, the brain-dominant BK auxiliary subunit β4 is not significantly altered by TTX treatment (Lee et al., 2015).

CaV1-CaMKIV Signaling Is Engaged by Chronic Inactivity and Acute Depolarization
Elegant work shows how AS can be regulated by depolarization or activity elevation (Ding et al., 2017; Iijima et al., 2011; Mauger et al., 2016; Xie and Black, 2001). Here we show that AS is also controlled by chronic inactivity and dominates homeostatic regulation of AP shape. Strikingly, chronic inactivity-induced AS of BK, although homeostatic in outcome, relies on CaV1-CaMKK-CaMKIV signaling, like acute depolarization-induced splicing. Activation of Ca2+ signaling by silencing neuronal spiking is counterintuitive; the expectation is that Ca2+ entry would be dampened (Bridi et al., 2018). We resolved the paradox by optically tracking transient depolarizations in dendritic spines of “inactive” neurons (Figures 6A–6C), which were strong enough to activate the CaV1 channels present in dendritic spines (Stanika et al., 2016). This demystified the recruitment of CaV1 signaling, already indicated pharmacologically.
A potent set of synaptic events combine to drive homeostatic readjustment of APD. Chronic spike blockade drives enhanced spontaneous presynaptic vesicle release (Jakawich et al., 2010; Lindskog et al., 2010) and incorporation of postsynaptic, high-conductance, PhTx-sensitive GluA1 receptors (Kim and Ziff, 2014; Thiagarajan et al., 2005). This increases glutamate-gated cation influx, drives greater spine depolarization, and recruits L-type Ca2+ channels in spine heads (Obermair et al., 2004; Yasuda et al., 2003), Mg2+-unblocked NMDARs, and Ca2+ release from internal stores (Dittmer et al., 2017). Our scenario (Figure 7M) accounts for L-type channel participation in responses triggered by chronic TTX; L-type channel involvement in responses to chronic depolarization is well known (O’Leary et al., 2010). Ca2+ channel-triggered signaling in response to chronic inactivity and depolarization runs counter to conventional expectations of diametrically opposing effector actions. Our data suggest that L-type-dependent signaling can be
Figure 7. CaV1-CaMK Signaling Is Required for Chronic Spike Blockade-Induced Nova-2 Translocation and BK Channel AS
(A) Expression level of pCaMKIV and CaMKIV in neurons treated with nimodipine, KN93, or STO-609.
(B) Quantification of CaMKIV activation from (A) (n = 3).
(C) Western blot indicating nuclear Nova-2 levels from neurons treated as indicated.
(D) Quantification of the nuclear Nova-2 level from (C) (n = 3).
(E) E29 splicing after 48 h TTX with other treatments as indicated.
(F) Quantification of E29 splicing from (E) (n = 4).
(G) E29 splicing after chronic depolarization (60K for 24 h) with or without nimodipine or KN93.
(H) Quantification of E29 splicing from (G) (n = 4).
(I) βCaMKK immunostaining (green) with MAP2 (red) and DAPI (blue). Nuclei are indicated by white dashed circles. Scale bar, 10 μm.
(J) Fold changes of αCaMKK and βCaMKK expression in the nucleus, cytosol, and nuc/cyt ratio after TTX (48 h) relative to Con neuron levels (n = 30).
(K) E29 splicing from neurons expressing viral shRNA constructs against endogenous βCaMKK with or without co-expression of shRNA-resistant βCaMKK (βCaMKK (R)). Scrambled shRNA was used as Con.
(L) Quantification of E29 splicing from (K) (n = 4).
(M) Diagram of the homeostatic signaling loop regulating APD. (1) Chronic spike blockade leads to upregulation of synaptic GluA1 (synaptic scaling). (2) Activation
of GluA1 mediates excessive cation influx, induces membrane depolarization in dendritic spines, and facilitates opening of the NMDAR and CaV1 channels. (3)
CaV1 opening leads to activation of CaMKs, including CaMKI, CaMKII, and βCaMKK. (4) βCaMKK translocates from the cytosol to nucleus and activates CaMKIV.
(5) Activated CaMKIV phosphorylates nuclear Nova-2. (6) Phosphorylated Nova-2 translocates to the cytosol. (7) This leads to reduced E29 inclusion in BK
channel pre-mRNA. (8) BK channel activity is inhibited when lacking E29, broadening APDs, the observed homeostatic response to chronic spike blockade.
(N) The autoregulatory feedback loop of neuronal excitability (LeMasson et al., 1993; Marder et al., 1996; Siegel et al., 1994) with components described in this
study. Various genes implicated in neuropsychiatric diseases are highlighted.
For (B), (D), (F), (H), (J), and (L), data are represented as mean ± SEM; **p < 0.01; N.S. represents p > 0.05. See also Figure S7.
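Panel (J) reports fold changes after TTX relative to Con. The following is a minimal sketch of that normalization, assuming per-neuron measurements are available for both conditions. The function name and the values are hypothetical and are not data from the paper.

```python
from statistics import mean

def fold_change(treated, control):
    """Mean of treated values divided by mean of control values."""
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    return mean(treated) / control_mean

# Hypothetical nuc/cyt ratios per neuron (arbitrary units).
con_ratio = [0.8, 1.0, 1.2]
ttx_ratio = [1.4, 1.6, 1.8]
fc = fold_change(ttx_ratio, con_ratio)
percent_increase = (fc - 1.0) * 100.0
print(f"fold change = {fc:.2f}, increase = {percent_increase:.0f}%")
```

A fold change of 1 means no change relative to Con, and a value above 1.5 corresponds to an increase of more than 50%.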
mobilized to achieve the appropriate response irrespective of the direction of the initial perturbation.
Strikingly, the CaV1 blocker nimodipine completely abolished TTX-induced CaMK activation, Nova-2 translocation, and E29 splicing but only partially inhibited the Ca2+ transient in spines induced by chronic TTX. We suggest that other Ca2+ influx pathways contribute to Ca2+ flux (Wheeler et al., 2012), whereas only nimodipine-sensitive CaV1 channels provide critical voltage-dependent conformational (VDC) signaling (Li et al., 2016) to help trap activated CaMKII (Wang et al., 2017).

Multiple CaM Translocators Support Different Forms of Surface-to-Nucleus Communication
Translocation of a CaMK is a recurrent theme in cytonuclear signaling in neurons, but the identity of the kinase varies (Cohen et al., 2016; Ma et al., 2014). In the present case of chronic inactivity-induced signaling, sensitivity to the selective CaMKK inhibitor STO-609 indicated that a CaMKK was involved. These findings resembled studies in C. elegans where cytonuclear signaling relies on a monomeric CaMKK (CKK-1) that translocates across the nuclear membrane (Kimura et al., 2002). Precedent exists for regulated cytonuclear distribution of βCaMKK (Cao et al., 2011; Karacosta et al., 2012; Nakamura et al., 2001), but its molecular mechanism needs elucidation. βCaMKK lacks a classic NLS, but it might rely on regulation of its nuclear export signal (NES) (Xu et al., 2012) through interaction with a resident nuclear protein.
CaMKIV was, as expected, the target of βCaMKK activation, as judged by elevation of pCaMKIV and its blockade by STO-609. Finding that CaMKIV was prebound to its substrate Nova-2 opens up the possibility of local signaling events. This would minimize the perturbation of nuclear Ca2+ regulation overall, desirable for signaling extending over hours rather than seconds to minutes. To be activated by CaMKK, CaMKIV must be bound to Ca2+/CaM. In one scenario, βCaMKK could locally transfer Ca2+/CaM to CaMKIV while retaining most of its catalytic activity, a known feature of this enzyme (Anderson et al., 1998; Edelman et al., 1996; Tokumitsu and Soderling, 1996). Consistent with such an intranuclear handoff, buffering of nuclear free Ca2+/CaM by CaMBP4nuc completely inhibited chronic inactivity-induced Nova-2 translocation (Figures 4D and 4E).

Roles of CaMKIV in Diverse Aspects of Homeostatic Plasticity
CaMKIV has been identified previously as a key player in homeostatic plasticity of excitatory neurons by Ibata et al. (2008) and Joseph and Turrigiano (2017), who found that the synaptic response to 4-h TTX treatment, a relatively brief period of inactivity, could be mimicked by CaMKIV inhibition using STO-609 or overexpression of a dominant-negative CaMKIV. Those findings might seem at odds with enhanced pCaMKIV in neurons undergoing chronic inactivity (Figures 4 and 7). However, reconciliation may be possible based on the dynamics of the homeostatic response. During TTX treatment for 24–48 h, Kim and Ziff (2014) found initial inhibition of Ca2+ signaling for at least 6 h, followed by late activation at 48 h, indicating that homeostatic modulation occurs in two phases. Perhaps early inhibition of CaMKIV, together with inhibition of calcineurin, contributes to up-scaling of AMPARs, critical for activation of synaptic NMDARs and CaV1; with further inactivity, enhanced synaptic events lead to recruitment of calcium signaling and enhanced nuclear pCaMKIV, vital for regulation of APD.
Regulation of AS by CaMKIV has been studied previously in the context of acute depolarization, acting at conserved CaMKIV-responsive RNA element (CaRRE) sequences in target precursor mRNAs (pre-mRNAs) (Iijima et al., 2011; Lee et al., 2007; Liu et al., 2012; Xie and Black, 2001). We looked for regulation of inclusion of stress-axis-regulated (STREX) exon (X4 in our study) and for CaRRE sequences flanking E29 but did not find either (Figure S3). This is not surprising in light of evidence that multiple activity-dependent splicing factors (hnRNP L, SAM68, and related STAR [signal transduction and activation of RNA] family members) have specific effects on individual pre-mRNA targets (Iijima et al., 2011). For Nova-2 and other splicing factors, puzzles remain regarding how selective control could be exerted if the splicing events are controlled by the same regulator, CaMKIV. Possible differences in the dynamics and localization of specific splicing events need further study.
Similar to Rbfox1 (Lee et al., 2009), Nova function is regulated by cytonuclear relocation (Racca et al., 2010). Nova-2 positioning is a U-shaped function of activity level, with nuclear exit favored by hyperactivity (seizure induction [Eom et al., 2013] or sustained depolarization; Figure S7E) or chronic inactivity. This echoes the U-shaped relationship of CaMKIV activation to activity. Although we focused on Nova-2 actions in the nucleus, Nova-2 has also been found in dendrites, co-localized with its target mRNAs (Racca et al., 2010). Outside of the nucleus, splicing factors might facilitate translocation of their binding partners (e.g., CaMKIV) and regulate the stability and translation of their target mRNAs (see Lee et al., 2016, for such a role of Rbfox1).

Implications for Autism, Schizophrenia, and Other Neuropsychiatric Diseases
In the specific feedback loop we circumnavigated (Figures 7M and 7N), each element is genetically implicated in autism spectrum disorder (ASD), schizophrenia, or other neuropsychiatric disorders. First, L-type Ca2+ channels have been implicated repeatedly in ASD and schizophrenia (Bhat et al., 2012; Purcell et al., 2014; Splawski et al., 2004, 2005) and affect downstream signaling of many forms (Deisseroth et al., 2003). Second, βCaMKK (gene name CAMKK2) exhibits sporadic mutations in individuals with schizophrenia with severe biochemical effects (Luo et al., 2014; O’Brien et al., 2017) and affect multiple target kinases (Marcelo et al., 2016). Third, Nova-2 mediates splicing of hundreds of pre-mRNAs, encoding proteins prominent at synaptic sites (Saito et al., 2016; Ule et al., 2005) whose patterns are altered in Fragile X, a form of ASD (Kuwano et al., 2011; Lewis et al., 2000). Finally, KCNMA1, encoding the BK channel, is causally involved in certain sporadic forms of ASD (Laumonnier et al., 2006), mental retardation, schizophrenia, and epilepsy (Du et al., 2005; Higgins et al., 2008; Zhang et al., 2006); inclusion of BK exon E29
is significantly lower in ASD samples than in matched controls (Parikshak et al., 2016). Together, these findings suggest that individual steps in the adaptive feedback pathway not only contribute to homeostatic regulation but might also go awry in neuropsychiatric disorders (Mullins et al., 2016; Figure 7N). In pinpointing players in E-AS coupling that seem to support pathogenesis, our results align with generally altered splicing in ASD (Parikshak et al., 2016), possibly arising from this kind of regulatory loop or side branches from it. Thus, correction of faulty E-AS coupling merits consideration as a therapeutic strategy (Hébert et al., 2014).

STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following:
● KEY RESOURCES TABLE
● LEAD CONTACT AND MATERIALS AVAILABILITY
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Cell lines
  ○ Primary cell cultures
  ○ Animals
● METHOD DETAILS
  ○ Constructs
  ○ Transfection and treatment of cortical neurons
  ○ Lentiviral transduction of cortical neurons
  ○ Immunocytochemistry and image acquisition and analysis
  ○ Transfection and electrophysiology of HEK293 cells
  ○ Electrophysiological recording of action potential half-width in cultured cortical neurons
  ○ Recording Current-Voltage relationships in HEK cells transfected with BK channels
  ○ Ca2+ imaging on dendrite and soma of cultured neurons
  ○ Voltage imaging of cultured neurons
  ○ Immunoprecipitation
  ○ Protein sample preparation and western blot
  ○ Oligo-RNA pull-down
  ○ RNA immunoprecipitation
  ○ RT-PCR and Realtime qPCR
  ○ Thalamic input elimination and immunohistochemistry
  ○ Monocular deprivation
  ○ Protein sample preparation and mass spectrometry analysis
  ○ Phosphopeptide identification and quantitation
● QUANTIFICATION AND STATISTICAL ANALYSIS
● DATA AND CODE AVAILABILITY

SUPPLEMENTAL INFORMATION
Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.013.

ACKNOWLEDGMENTS
We thank Dr. Robert B. Darnell for providing human anti-Nova serum and Dr. Peter Stoilov and Dr. Douglas Black for providing AS reporter constructs. We thank Xiaohan Wang and other Tsien lab members for advice and comments on the manuscript. This work was supported by research grants from the NIGMS (GM058234), NINDS (NS24067), NIMH (MH071739), NIDA (DA040484), Druckenmiller Foundation, Simons Foundation, Mathers Foundation, and Burnett Family Foundation (to R.W.T.); the National Key R&D Program of China (2018YFA0108300 to B.L.); the National Natural Science Foundation of China (81622016 and 31571034 to B.L. and 81871048 and 81741063 to L.H.); the Guangdong Natural Science Foundation (Grants for Distinguished Young Scholars 2015A030306019 to B.L. and 2018B030311034 to L.H.); and the Guangdong Provincial Key R&D Programs (Key Technologies for Treatment of Brain Disorders 2018B030332001 and Development of New Tools for Diagnosis and Treatment of Autism 2018B030335001 to L.H. and B.L.).

AUTHOR CONTRIBUTIONS
B.L. and R.W.T. conceived the project. B.S.S., S.D.S., and Z.L. performed electrophysiology experiments. B.L. and C.W. performed subcellular fractionation and immunostaining. N.C. provided analysis tools for calcium and voltage imaging. N.J.M. and S.D.S. performed immunostaining and analysis for GluA1. G.Z. and T.A.N. performed mass spectrometry. B.W. and G.F. performed thalamic input elimination and immunohistochemistry for Nova. G.T., S.S., and S.D.S. performed cell culture. S.Y. and L.H. performed qRT-PCR and immunostaining. B.L. performed all other experiments and data analysis. B.L., R.W.T., and S.D.S. wrote the manuscript with advice from L.H. and G.F.

DECLARATION OF INTERESTS
The authors declare no competing interests.

Received: April 24, 2019
Revised: January 28, 2020
Accepted: May 4, 2020
Published: June 2, 2020

REFERENCES
Anderson, K.A., Means, R.L., Huang, Q.H., Kemp, B.E., Goldstein, E.G., Selbert, M.A., Edelman, A.M., Fremeau, R.T., and Means, A.R. (1998). Components of a calmodulin-dependent protein kinase cascade. Molecular cloning, functional characterization and cellular localization of Ca2+/calmodulin-dependent protein kinase kinase β. J. Biol. Chem. 273, 31880–31889.
Bhat, S., Dao, D.T., Terrillion, C.E., Arad, M., Smith, R.J., Soldatov, N.M., and Gould, T.D. (2012). CACNA1C (Cav1.2) in the pathophysiology of psychiatric disease. Prog. Neurobiol. 99, 1–14.
Bischofberger, J., Geiger, J.R., and Jonas, P. (2002). Timing and efficacy of Ca2+ channel activation in hippocampal mossy fiber boutons. J. Neurosci. 22, 10593–10602.
Bito, H., Deisseroth, K., and Tsien, R.W. (1996). CREB phosphorylation and dephosphorylation: a Ca(2+)- and stimulus duration-dependent switch for hippocampal gene expression. Cell 87, 1203–1214.
Black, D.L. (2003). Mechanisms of alternative pre-messenger RNA splicing. Annu. Rev. Biochem. 72, 291–336.
Borst, J.G., and Sakmann, B. (1998). Calcium current during a single action potential in a large presynaptic terminal of the rat brainstem. J. Physiol. 506, 143–157.
Bridi, M.C.D., de Pasquale, R., Lantz, C.L., Gu, Y., Borrell, A., Choi, S.Y., He, K., Tran, T., Hong, S.Z., Dykman, A., et al. (2018). Two distinct mechanisms for experience-dependent homeostasis. Nat. Neurosci. 21, 843–850.
Buckanovich, R.J., and Darnell, R.B. (1997). The neuronal RNA binding protein Nova-1 recognizes specific RNA targets in vitro and in vivo. Mol. Cell. Biol. 17, 3194–3201.
Byrne, J.H., and Kandel, E.R. (1996). Presynaptic facilitation revisited: state and time dependence. J. Neurosci. 16, 425–435.
Cao, W., Sohail, M., Liu, G., Koumbadinga, G.A., Lobo, V.G., and Xie, J. (2011). Harnett, M.T., Makara, J.K., Spruston, N., Kath, W.L., and Magee, J.C. (2012).
Differential effects of PKA-controlled CaMKK2 variants on neuronal differenti- Synaptic amplification by dendritic spines enhances input cooperativity. Na-
ation. RNA Biol. 8, 1061–1072. ture 491, 599–602.
Cohen, S.M., Ma, H., Kuchibhotla, K.V., Watson, B.O., Buzsáki, G., Froemke, R.C., and Tsien, R.W. (2016). Excitation-transcription coupling in parvalbumin-positive interneurons employs a novel CaM kinase-dependent pathway distinct from excitatory neurons. Neuron 90, 292–307.
Cruzalegui, F.H., and Means, A.R. (1993). Biochemical characterization of the multifunctional Ca2+/calmodulin-dependent protein kinase type IV expressed in insect cells. J. Biol. Chem. 268, 26171–26178.
Deisseroth, K., Mermelstein, P.G., Xia, H., and Tsien, R.W. (2003). Signaling from synapse to nucleus: the logic behind the mechanisms. Curr. Opin. Neurobiol. 13, 354–365.
Deng, P.Y., Rotman, Z., Blundon, J.A., Cho, Y., Cui, J., Cavalli, V., Zakharenko, S.S., and Klyachko, V.A. (2013). FMRP regulates neurotransmitter release and synaptic information transmission by modulating action potential duration via BK channels. Neuron 77, 696–711.
Desai, N.S., Rutherford, L.C., and Turrigiano, G.G. (1999). Plasticity in the intrinsic excitability of cortical pyramidal neurons. Nat. Neurosci. 2, 515–520.
Ding, X., Liu, S., Tian, M., Zhang, W., Zhu, T., Li, D., Wu, J., Deng, H., Jia, Y., Xie, W., et al. (2017). Activity-induced histone modifications govern Neurexin-1 mRNA splicing and memory preservation. Nat. Neurosci. 20, 690–699.
Dittmer, P.J., Wild, A.R., Dell’Acqua, M.L., and Sather, W.A. (2017). STIM1 Ca2+ Sensor Control of L-type Ca2+-Channel-Dependent Dendritic Spine Structural Plasticity and Nuclear Signaling. Cell Rep. 19, 321–334.
Du, W., Bautista, J.F., Yang, H., Diez-Sampedro, A., You, S.A., Wang, L., Kotagal, P., Lüders, H.O., Shi, J., Cui, J., et al. (2005). Calcium-sensitive potassium channelopathy in human epilepsy and paroxysmal movement disorder. Nat. Genet. 37, 733–738.
Edelman, A.M., Mitchelhill, K.I., Selbert, M.A., Anderson, K.A., Hook, S.S., Stapleton, D., Goldstein, E.G., Means, A.R., and Kemp, B.E. (1996). Multiple Ca(2+)-calmodulin-dependent protein kinase kinases from rat brain. Purification, regulation by Ca(2+)-calmodulin, and partial amino acid sequence. J. Biol. Chem. 271, 10806–10810.
Ehlers, M.D. (2003). Activity level controls postsynaptic composition and signaling via the ubiquitin-proteasome system. Nat. Neurosci. 6, 231–242.
Emptage, N., Bliss, T.V., and Fine, A. (1999). Single synaptic events evoke NMDA receptor-mediated release of calcium from internal stores in hippocampal dendritic spines. Neuron 22, 115–124.
Eom, T., Zhang, C., Wang, H., Lay, K., Fak, J., Noebels, J.L., and Darnell, R.B. (2013). NOVA-dependent regulation of cryptic NMD exons controls synaptic protein levels after seizure. eLife 2, e00178.
Fodor, A.A., and Aldrich, R.W. (2009). Convergent evolution of alternative splices at domain boundaries of the BK channel. Annu. Rev. Physiol. 71, 19–36.
Furlanis, E., and Scheiffele, P. (2018). Regulation of neuronal differentiation, function, and plasticity by alternative splicing. Annu. Rev. Cell Dev. Biol. 34, 451–469.
Geiger, J.R., and Jonas, P. (2000). Dynamic control of presynaptic Ca(2+) inflow by fast-inactivating K(+) channels in hippocampal mossy fiber boutons. Neuron 28, 927–939.
Green, M.F., Anderson, K.A., and Means, A.R. (2011a). Characterization of the CaMKKβ-AMPK signaling complex. Cell. Signal. 23, 2005–2012.
Green, M.F., Scott, J.W., Steel, R., Oakhill, J.S., Kemp, B.E., and Means, A.R. (2011b). Ca2+/Calmodulin-dependent protein kinase kinase β is regulated by multisite phosphorylation. J. Biol. Chem. 286, 28066–28079.
Ha, T.S., Jeong, S.Y., Cho, S.W., Jeon, Hk., Roh, G.S., Choi, W.S., and Park, C.S. (2000). Functional characteristics of two BKCa channel variants differentially expressed in rat brain tissues. Eur. J. Biochem. 267, 910–918.
Harreman, M.T., Kline, T.M., Milford, H.G., Harben, M.B., Hodel, A.E., and Corbett, A.H. (2004). Regulation of nuclear import by phosphorylation adjacent to nuclear localization signals. J. Biol. Chem. 279, 20613–20621.
Hébert, B., Pietropaolo, S., Même, S., Laudier, B., Laugeray, A., Doisne, N., Quartier, A., Lefeuvre, S., Got, L., Cahard, D., et al. (2014). Rescue of fragile X syndrome phenotypes in Fmr1 KO mice by a BKCa channel opener molecule. Orphanet J. Rare Dis. 9, 124.
Helton, T.D., Xu, W., and Lipscombe, D. (2005). Neuronal L-type calcium channels open quickly and are inhibited slowly. J. Neurosci. 25, 10247–10251.
Higgins, J.J., Hao, J., Kosofsky, B.E., and Rajadhyaksha, A.M. (2008). Dysregulation of large-conductance Ca2+-activated K+ channel expression in nonsyndromal mental retardation due to a cereblon p.R419X mutation. Neurogenetics 9, 219–223.
Hodgkin, A.L., and Huxley, A.F. (1952). A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500–544.
Hu, H., Shao, L.R., Chavoshy, S., Gu, N., Trieb, M., Behrens, R., Laake, P., Pongs, O., Knaus, H.G., Ottersen, O.P., and Storm, J.F. (2001). Presynaptic Ca2+-activated K+ channels in glutamatergic hippocampal terminals and their role in spike repolarization and regulation of transmitter release. J. Neurosci. 21, 9585–9597.
Ibata, K., Sun, Q., and Turrigiano, G.G. (2008). Rapid synaptic scaling induced by changes in postsynaptic firing. Neuron 57, 819–826.
Iijima, T., Wu, K., Witte, H., Hanno-Iijima, Y., Glatter, T., Richard, S., and Scheiffele, P. (2011). SAM68 regulates neuronal activity-dependent alternative splicing of neurexin-1. Cell 147, 1601–1614.
Jackson, M.B., Konnerth, A., and Augustine, G.J. (1991). Action potential broadening and frequency-dependent facilitation of calcium signals in pituitary nerve terminals. Proc. Natl. Acad. Sci. USA 88, 380–384.
Jakawich, S.K., Nasser, H.B., Strong, M.J., McCartney, A.J., Perez, A.S., Rakesh, N., Carruthers, C.J., and Sutton, M.A. (2010). Local presynaptic activity gates homeostatic changes in presynaptic function driven by dendritic BDNF synthesis. Neuron 68, 1143–1158.
Jiang, M., and Chen, G. (2006). High Ca2+-phosphate transfection efficiency in low-density neuronal cultures. Nat. Protoc. 1, 695–700.
Joseph, A., and Turrigiano, G.G. (2017). All for one but not one for all: excitatory synaptic scaling and intrinsic excitability are coregulated by CaMKIV, whereas inhibitory synaptic scaling is under independent control. J. Neurosci. 37, 6778–6785.
Karacosta, L.G., Foster, B.A., Azabdaftari, G., Feliciano, D.M., and Edelman, A.M. (2012). A regulatory feedback loop between Ca2+/calmodulin-dependent protein kinase kinase 2 (CaMKK2) and the androgen receptor in prostate cancer progression. J. Biol. Chem. 287, 24832–24843.
Kim, J., and Tsien, R.W. (2008). Synapse-specific adaptations to inactivity in hippocampal circuits achieve homeostatic gain control while dampening network reverberation. Neuron 58, 925–937.
Kim, S., and Ziff, E.B. (2014). Calcineurin mediates synaptic scaling via synaptic trafficking of Ca2+-permeable AMPA receptors. PLoS Biol. 12, e1001900.
Kimm, T., Khaliq, Z.M., and Bean, B.P. (2015). Differential regulation of action potential shape and burst-frequency firing by BK and Kv2 channels in substantia nigra dopaminergic neurons. J. Neurosci. 35, 16404–16417.
Kimura, Y., Corcoran, E.E., Eto, K., Gengyo-Ando, K., Muramatsu, M.A., Kobayashi, R., Freedman, J.H., Mitani, S., Hagiwara, M., Means, A.R., and Tokumitsu, H. (2002). A CaMK cascade activates CRE-mediated transcription in neurons of Caenorhabditis elegans. EMBO Rep. 3, 962–966.
Kuwano, Y., Kamio, Y., Kawai, T., Katsuura, S., Inada, N., Takaki, A., and Rokutan, K. (2011). Autism-associated gene expression in peripheral leucocytes commonly observed between subjects with autism and healthy women having autistic children. PLoS ONE 6, e24723.
Laumonnier, F., Roger, S., Guérin, P., Molinari, F., M’rad, R., Cahard, D., Belhadj, A., Halayem, M., Persico, A.M., Elia, M., et al. (2006). Association of a functional deficit of the BKCa channel, a synaptic regulator of neuronal excitability, with autism and mental retardation. Am. J. Psychiatry 163, 1622–1629.
Lee, K.Y., and Chung, H.J. (2014). NMDA receptors and L-type voltage-gated Ca2+ channels mediate the expression of bidirectional homeostatic intrinsic plasticity in cultured hippocampal neurons. Neuroscience 277, 610–623.
Lee, U.S., and Cui, J. (2010). BK channel activation: structural and functional insights. Trends Neurosci. 33, 415–423.
Lee, J.A., Xing, Y., Nguyen, D., Xie, J., Lee, C.J., and Black, D.L. (2007). Depolarization and CaM kinase IV modulate NMDA receptor splicing through two essential RNA elements. PLoS Biol. 5, e40.
Lee, J.A., Tang, Z.Z., and Black, D.L. (2009). An inducible change in Fox-1/A2BP1 splicing modulates the alternative splicing of downstream neuronal target exons. Genes Dev. 23, 2284–2293.
Lee, K.Y., Royston, S.E., Vest, M.O., Ley, D.J., Lee, S., Bolton, E.C., and Chung, H.J. (2015). N-methyl-D-aspartate receptors mediate activity-dependent down-regulation of potassium channel genes during the expression of homeostatic intrinsic plasticity. Mol. Brain 8, 4.
Lee, J.A., Damianov, A., Lin, C.H., Fontes, M., Parikshak, N.N., Anderson, E.S., Geschwind, D.H., Black, D.L., and Martin, K.C. (2016). Cytoplasmic Rbfox1 regulates the expression of synaptic and autism-related genes. Neuron 89, 113–128.
LeMasson, G., Marder, E., and Abbott, L.F. (1993). Activity-dependent regulation of conductances in model neurons. Science 259, 1915–1917.
Lewis, H.A., Musunuru, K., Jensen, K.B., Edo, C., Chen, H., Darnell, R.B., and Burley, S.K. (2000). Sequence-specific RNA binding by a Nova KH domain: implications for paraneoplastic disease and the fragile X syndrome. Cell 100, 323–332.
Li, Q., Lee, J.A., and Black, D.L. (2007). Neuronal regulation of alternative pre-mRNA splicing. Nat. Rev. Neurosci. 8, 819–831.
Li, B., Jie, W., Huang, L., Wei, P., Li, S., Luo, Z., Friedman, A.K., Meredith, A.L., Han, M.H., Zhu, X.H., and Gao, T.M. (2014). Nuclear BK channels regulate gene expression via the control of nuclear calcium signaling. Nat. Neurosci. 17, 1055–1063.
Li, B., Tadross, M.R., and Tsien, R.W. (2016). Sequential ionic and conformational signaling by calcium channels drives neuronal gene expression. Science 351, 863–867.
Licatalosi, D.D., and Darnell, R.B. (2006). Splicing regulation in neurologic disease. Neuron 52, 93–101.
Licatalosi, D.D., and Darnell, R.B. (2010). RNA processing and its regulation: global insights into biological networks. Nat. Rev. Genet. 11, 75–87.
Lindskog, M., Li, L., Groth, R.D., Poburko, D., Thiagarajan, T.C., Han, X., and Tsien, R.W. (2010). Postsynaptic GluA1 enables acute retrograde enhancement of presynaptic function to coordinate adaptation to synaptic inactivity. Proc. Natl. Acad. Sci. USA 107, 21806–21811.
Liu, G., Razanau, A., Hai, Y., Yu, J., Sohail, M., Lobo, V.G., Chu, J., Kung, S.K., and Xie, J. (2012). A conserved serine of heterogeneous nuclear ribonucleoprotein L (hnRNP L) mediates depolarization-regulated alternative splicing of potassium channels. J. Biol. Chem. 287, 22709–22716.
Llinás, R., Sugimori, M., and Simon, S.M. (1982). Transmission by presynaptic spike-like depolarization in the squid giant synapse. Proc. Natl. Acad. Sci. USA 79, 2415–2419.
Luo, X.J., Li, M., Huang, L., Steinberg, S., Mattheisen, M., Liang, G., Donohoe, G., Shi, Y., Chen, C., Yue, W., et al.; MooDS SCZ Consortium (2014). Convergent lines of evidence support CAMKK2 as a schizophrenia susceptibility gene. Mol. Psychiatry 19, 774–783.
Ma, W.P., Li, Y.T., and Tao, H.W. (2013). Downregulation of cortical inhibition mediates ocular dominance plasticity during the critical period. J. Neurosci. 33, 11276–11280.
Ma, H., Groth, R.D., Cohen, S.M., Emery, J.F., Li, B., Hoedt, E., Zhang, G., Neubert, T.A., and Tsien, R.W. (2014). γCaMKII shuttles Ca2+/CaM to the nucleus to trigger CREB phosphorylation and gene expression. Cell 159, 281–294.
Maffei, A., and Turrigiano, G.G. (2008). Multiple modes of network homeostasis in visual cortical layer 2/3. J. Neurosci. 28, 4377–4384.
Maghsoodi, B., Poon, M.M., Nam, C.I., Aoto, J., Ting, P., and Chen, L. (2008). Retinoic acid regulates RARalpha-mediated control of translation in dendritic RNA granules during homeostatic synaptic plasticity. Proc. Natl. Acad. Sci. USA 105, 16015–16020.
Marcelo, K.L., Means, A.R., and York, B. (2016). The Ca(2+)/Calmodulin/CaMKK2 axis: nature’s metabolic CaMshaft. Trends Endocrinol. Metab. 27, 706–718.
Marder, E., Abbott, L.F., Turrigiano, G.G., Liu, Z., and Golowasch, J. (1996). Memory from the dynamics of intrinsic membrane currents. Proc. Natl. Acad. Sci. USA 93, 13481–13486.
Matthews, E.A., Linardakis, J.M., and Disterhoft, J.F. (2009). The fast and slow afterhyperpolarizations are differentially modulated in hippocampal neurons by aging and learning. J. Neurosci. 29, 4750–4755.
Mauger, O., Lemoine, F., and Scheiffele, P. (2016). Targeted intron retention and excision for rapid gene regulation in response to neuronal activity. Neuron 92, 1266–1278.
Mayer, M.L., Westbrook, G.L., and Guthrie, P.B. (1984). Voltage-dependent block by Mg2+ of NMDA responses in spinal cord neurones. Nature 309, 261–263.
Mullins, C., Fishell, G., and Tsien, R.W. (2016). Unifying views of autism spectrum disorders: a consideration of autoregulatory feedback loops. Neuron 89, 1131–1156.
Nakamura, Y., Okuno, S., Sato, F., and Fujisawa, H. (1995). An immunohistochemical study of Ca2+/calmodulin-dependent protein kinase IV in the rat central nervous system: light and electron microscopic observations. Neuroscience 68, 181–194.
Nakamura, Y., Okuno, S., Kitani, T., Otake, K., Sato, F., and Fujisawa, H. (2001). Immunohistochemical localization of Ca(2+)/calmodulin-dependent protein kinase kinase β in the rat central nervous system. Neurosci. Res. 39, 175–188.
Nowak, L., Bregestovski, P., Ascher, P., Herbet, A., and Prochiantz, A. (1984). Magnesium gates glutamate-activated channels in mouse central neurones. Nature 307, 462–465.
O’Brien, M.T., Oakhill, J.S., Ling, N.X., Langendorf, C.G., Hoque, A., Dite, T.A., Means, A.R., Kemp, B.E., and Scott, J.W. (2017). Impact of genetic variation on human CaMKK2 regulation by Ca2+-calmodulin and multisite phosphorylation. Sci. Rep. 7, 43264.
O’Leary, T., van Rossum, M.C., and Wyllie, D.J. (2010). Homeostasis of intrinsic excitability in hippocampal neurones: dynamics and mechanism of the response to chronic depolarization. J. Physiol. 588, 157–170.
O’Leary, T., Williams, A.H., Franci, A., and Marder, E. (2014). Cell types, network homeostasis, and pathological compensation from a biologically plausible ion channel expression model. Neuron 82, 809–821.
Obermair, G.J., Szabo, Z., Bourinet, E., and Flucher, B.E. (2004). Differential targeting of the L-type Ca2+ channel α1C (CaV1.2) to synaptic and extrasynaptic compartments in hippocampal neurons. Eur. J. Neurosci. 19, 2109–2122.
Palmer, L.M., and Stuart, G.J. (2009). Membrane potential changes in dendritic spines during action potentials and synaptic input. J. Neurosci. 29, 6897–6903.
Parikshak, N.N., Swarup, V., Belgard, T.G., Irimia, M., Ramaswami, G., Gandal, M.J., Hartl, C., Leppa, V., Ubieta, L.T., Huang, J., et al. (2016). Genome-wide changes in lncRNA, splicing, and regional gene expression patterns in autism. Nature 540, 423–427.
Penney, J., Tsurudome, K., Liao, E.H., Elazzouzi, F., Livingstone, M., Gonzalez, M., Sonenberg, N., and Haghighi, A.P. (2012). TOR is required for the retrograde regulation of synaptic homeostasis at the Drosophila neuromuscular junction. Neuron 74, 166–178.
Pietrzykowski, A.Z., Friesen, R.M., Martin, G.E., Puig, S.I., Nowak, C.L., Wynne, P.M., Siegelmann, H.T., and Treistman, S.N. (2008). Posttranscriptional regulation of BK channel splice variant stability by miR-9 underlies neuroadaptation to alcohol. Neuron 59, 274–287.
Purcell, S.M., Moran, J.L., Fromer, M., Ruderfer, D., Solovieff, N., Roussos, P., O’Dushlaine, C., Chambert, K., Bergen, S.E., Kähler, A., et al. (2014). A polygenic burden of rare disruptive mutations in schizophrenia. Nature 506, 185–190.
Racca, C., Gardiol, A., Eom, T., Ule, J., Triller, A., and Darnell, R.B. (2010). The Neuronal Splicing Factor Nova Co-Localizes with Target RNAs in the Dendrite. Front. Neural Circuits 4, 5.
Ramocki, M.B., and Zoghbi, H.Y. (2008). Failure of neuronal homeostasis results in common neuropsychiatric phenotypes. Nature 455, 912–918.
Rappsilber, J., Mann, M., and Ishihama, Y. (2007). Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips. Nat. Protoc. 2, 1896–1906.
Sabatini, B.L., and Regehr, W.G. (1997). Control of neurotransmitter release by presynaptic waveform at the granule cell to Purkinje cell synapse. J. Neurosci. 17, 3425–3435.
Saito, Y., Miranda-Rottmann, S., Ruggiu, M., Park, C.Y., Fak, J.J., Zhong, R., Duncan, J.S., Fabella, B.A., Junge, H.J., Chen, Z., et al. (2016). NOVA2-mediated RNA regulation is required for axonal pathfinding during development. eLife 5, e14371.
Sausbier, M., Hu, H., Arntz, C., Feil, S., Kamm, S., Adelsberger, H., Sausbier, U., Sailer, C.A., Feil, R., Hofmann, F., et al. (2004). Cerebellar ataxia and Purkinje cell dysfunction caused by Ca2+-activated K+ channel deficiency. Proc. Natl. Acad. Sci. USA 101, 9474–9478.
Schanzenbächer, C.T., Sambandan, S., Langer, J.D., and Schuman, E.M. (2016). Nascent Proteome Remodeling following Homeostatic Scaling at Hippocampal Synapses. Neuron 92, 358–371.
Schaukowitch, K., Reese, A.L., Kim, S.K., Kilaru, G., Joo, J.Y., Kavalali, E.T., and Kim, T.K. (2017). An Intrinsic Transcriptional Program Underlying Synaptic Scaling during Activity Suppression. Cell Rep. 18, 1512–1526.
Schneider, C.A., Rasband, W.S., and Eliceiri, K.W. (2012). NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 9, 671–675.
Selbert, M.A., Anderson, K.A., Huang, Q.H., Goldstein, E.G., Means, A.R., and Edelman, A.M. (1995). Phosphorylation and activation of Ca(2+)-calmodulin-dependent protein kinase IV by Ca(2+)-calmodulin-dependent protein kinase Ia kinase. Phosphorylation of threonine 196 is essential for activation. J. Biol. Chem. 270, 17616–17621.
Shao, L.R., Halvorsrud, R., Borg-Graham, L., and Storm, J.F. (1999). The role of BK-type Ca2+-dependent K+ channels in spike broadening during repetitive firing in rat hippocampal pyramidal cells. J. Physiol. 521, 135–146.
Shelley, C., Whitt, J.P., Montgomery, J.R., and Meredith, A.L. (2013). Phosphorylation of a constitutive serine inhibits BK channel variants containing the alternate exon “SRKR”. J. Gen. Physiol. 142, 585–598.
Shipston, M.J., and Tian, L. (2016). Posttranscriptional and posttranslational regulation of BK channels. Int. Rev. Neurobiol. 128, 91–126.
Siegel, M., Marder, E., and Abbott, L.F. (1994). Activity-dependent current distributions in model neurons. Proc. Natl. Acad. Sci. USA 91, 11308–11312.
Splawski, I., Timothy, K.W., Sharpe, L.M., Decher, N., Kumar, P., Bloise, R., Napolitano, C., Schwartz, P.J., Joseph, R.M., Condouris, K., et al. (2004). Ca(V)1.2 calcium channel dysfunction causes a multisystem disorder including arrhythmia and autism. Cell 119, 19–31.
Splawski, I., Timothy, K.W., Decher, N., Kumar, P., Sachse, F.B., Beggs, A.H., Sanguinetti, M.C., and Keating, M.T. (2005). Severe arrhythmia disorder caused by cardiac L-type calcium channel mutations. Proc. Natl. Acad. Sci. USA 102, 8089–8096, discussion 8086–8088.
St-Pierre, F., Marshall, J.D., Yang, Y., Gong, Y., Schnitzer, M.J., and Lin, M.Z. (2014). High-fidelity optical reporting of neuronal electrical activity with an ultrafast fluorescent voltage sensor. Nat. Neurosci. 17, 884–889.
Stanika, R., Campiglio, M., Pinggera, A., Lee, A., Striessnig, J., Flucher, B.E., and Obermair, G.J. (2016). Splice variants of the CaV1.3 L-type calcium channel regulate dendritic spine morphology. Sci. Rep. 6, 34528.
Stoilov, P., Lin, C.H., Damoiseaux, R., Nikolic, J., and Black, D.L. (2008). A high-throughput screening strategy identifies cardiotonic steroids as alternative splicing modulators. Proc. Natl. Acad. Sci. USA 105, 11218–11223.
Styr, B., and Slutsky, I. (2018). Imbalance between firing homeostasis and synaptic plasticity drives early-phase Alzheimer’s disease. Nat. Neurosci. 21, 463–473.
Thiagarajan, T.C., Lindskog, M., and Tsien, R.W. (2005). Adaptation to synaptic inactivity in hippocampal neurons. Neuron 47, 725–737.
Tokumitsu, H., and Soderling, T.R. (1996). Requirements for calcium and calmodulin in the calmodulin kinase activation cascade. J. Biol. Chem. 271, 5617–5622.
Trasande, C.A., and Ramirez, J.M. (2007). Activity deprivation leads to seizures in hippocampal slice cultures: is epilepsy the consequence of homeostatic plasticity? J. Clin. Neurophysiol. 24, 154–164.
Turrigiano, G.G. (2008). The self-tuning neuron: synaptic scaling of excitatory synapses. Cell 135, 422–435.
Turrigiano, G.G., and Nelson, S.B. (2004). Homeostatic plasticity in the developing nervous system. Nat. Rev. Neurosci. 5, 97–107.
Ule, J., Ule, A., Spencer, J., Williams, A., Hu, J.S., Cline, M., Wang, H., Clark, T., Fraser, C., Ruggiu, M., et al. (2005). Nova regulates brain-specific splicing to shape the synapse. Nat. Genet. 37, 844–852.
Ule, J., Stefani, G., Mele, A., Ruggiu, M., Wang, X., Taneri, B., Gaasterland, T., Blencowe, B.J., and Darnell, R.B. (2006). An RNA map predicting Nova-dependent splicing regulation. Nature 444, 580–586.
Vuong, C.K., Black, D.L., and Zheng, S. (2016). The neurogenetics of alternative splicing. Nat. Rev. Neurosci. 17, 265–281.
Wang, J., Campos, B., Jamieson, G.A., Jr., Kaetzel, M.A., and Dedman, J.R. (1995). Functional elimination of calmodulin within the nucleus by targeted expression of an inhibitor peptide. J. Biol. Chem. 270, 30245–30248.
Wang, X., Marks, C.R., Perfitt, T.L., Nakagawa, T., Lee, A., Jacobson, D.A., and Colbran, R.J. (2017). A novel mechanism for Ca2+/calmodulin-dependent protein kinase II targeting to L-type Ca2+ channels that initiates long-range signaling to the nucleus. J. Biol. Chem. 292, 17324–17336.
Wayman, G.A., Lee, Y.S., Tokumitsu, H., Silva, A.J., and Soderling, T.R. (2008). Calmodulin-kinases: modulators of neuronal development and plasticity. Neuron 59, 914–931.
West, A.E., Griffith, E.C., and Greenberg, M.E. (2002). Regulation of transcription factors by neuronal activity. Nat. Rev. Neurosci. 3, 921–931.
Wheeler, D.G., Groth, R.D., Ma, H., Barrett, C.F., Owen, S.F., Safa, P., and Tsien, R.W. (2012). Ca(V)1 and Ca(V)2 channels engage distinct modes of Ca(2+) signaling to control CREB-dependent gene expression. Cell 149, 1112–1124.
Wiesel, T.N., and Hubel, D.H. (1963). Effects of visual deprivation on morphology and physiology of cells in the cat’s lateral geniculate body. J. Neurophysiol. 26, 978–993.
Wondolowski, J., and Dickman, D. (2013). Emerging links between homeostatic synaptic plasticity and neurological disease. Front. Cell. Neurosci. 7, 223.
Xie, J., and Black, D.L. (2001). A CaMK IV responsive RNA element mediates depolarization-induced alternative splicing of ion channels. Nature 410, 936–939.
Xie, J., and McCobb, D.P. (1998). Control of alternative splicing of potassium channels by stress hormones. Science 280, 443–446.
Xu, D., Farmer, A., Collett, G., Grishin, N.V., and Chook, Y.M. (2012). Sequence and structural analyses of nuclear export signals in the NESdb database. Mol. Biol. Cell 23, 3677–3693.
Yang, Y.Y., Yin, G.L., and Darnell, R.B. (1998). The neuronal RNA-binding protein Nova-2 is implicated as the autoantigen targeted in POMA patients with dementia. Proc. Natl. Acad. Sci. USA 95, 13254–13259.
Yasuda, R., Sabatini, B.L., and Svoboda, K. (2003). Plasticity of calcium channels in dendritic spines. Nat. Neurosci. 6, 948–955.
Zarei, M.M., Zhu, N., Alioua, A., Eghbali, M., Stefani, E., and Toro, L. (2001). A novel MaxiK splice variant exhibits dominant-negative properties for surface expression. J. Biol. Chem. 276, 16232–16239.
Zhang, L., Li, X., Zhou, R., and Xing, G. (2006). Possible role of potassium channel, big K in etiology of schizophrenia. Med. Hypotheses 67, 41–43.
STAR★METHODS
KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
Rabbit pCaMKII | Cell Signaling Technology | Cat#3361; RRID:AB_10015209
Rabbit pCaMKIV | Abcam | Cat#ab59424; RRID:AB_2068253
Rabbit pCaMKI | Santa Cruz | Cat#sc-28438-R; RRID:AB_667968
Goat anti-βCaMKK | Santa Cruz | Cat#sc-9629; RRID:AB_2243844
Rabbit anti-βCaMKK | Santa Cruz | Cat#sc-50341; RRID:AB_2068532
Mouse anti-CaMKIV | Santa Cruz | Cat#sc-55501; RRID:AB_2243836
Mouse anti-CaMKIV | Santa Cruz | Cat#sc-136249; RRID:AB_2275109
Mouse anti-flag (DYKDDDDK Tag) | Cell Signaling Technology | Cat#8146; RRID:AB_10950495
Rabbit anti-PSD95 | Synaptic Systems | Cat#124002; RRID:AB_887760
Mouse anti-PSD95 | Synaptic Systems | Cat#124011; RRID:AB_10804286
Goat anti-Nova-2 | Santa Cruz | Cat#sc-10546; RRID:AB_2151558
Human anti-pan NOVA | anti-Nova paraneoplastic human serum | Saito et al., 2016
Mouse anti-MAP2 (HM-2) | Sigma | Cat#M9942; RRID:AB_477256
Mouse anti-CaM | Millipore | Cat#05-173; RRID:AB_309644
Goat anti-γCaMKII | Santa Cruz | Cat#sc-1541; RRID:AB_2068234
Mouse anti-CaMKI | Santa Cruz | Cat#sc-377418; RRID:AB_2069999
Goat anti-βCaMKI | Santa Cruz | Cat#sc-131452; RRID:AB_2243992
Rabbit anti-δCaMKI | Santa Cruz | Cat#sc-134638; RRID:AB_2070115
Mouse anti-γCaMKI | Abcam | Cat#ab77046; RRID:AB_1565944
Rabbit anti-γCaMKI | Thermo Scientific | Cat#PA5-19661; RRID:AB_10981875
Mouse anti-αCaMKK | Santa Cruz | Cat#sc-17827; RRID:AB_2275110
Rabbit anti-αCaMKK | Santa Cruz | Cat#sc-11370; RRID:AB_2068406
Rabbit anti-GluR1 | Calbiochem | Cat#04-855; RRID:AB_1977216
Rabbit anti-BK channel (extracellular epitope) | Alomone labs | Cat#APC151; RRID:AB_10915895
Mouse anti-PSD-95 | UC Davis/NIH NeuroMab facility | N/A
Guinea pig anti-MAP2 | Synaptic Systems | Cat#188004; RRID:AB_2138181
Rabbit anti-GFP | Abcam | Cat#ab290; RRID:AB_303395
Rabbit anti-pCaMKII | PhosphoSolutions | Cat#p1005-286; RRID:AB_2492051
Rabbit anti-Lamin-B1 | Cell Signaling Technology | Cat#13435; RRID:AB_2737428
Rabbit anti-GAPDH | Cell Signaling Technology | Cat#5174; RRID:AB_10622025
Tetrodotoxin (TTX) | Ascent Scientific | Cat#4368-28-9
KN93 | Tocris | Cat#5215
STO-609 | Tocris | Cat#1551
PhTx | Tocris | Cat#2770
APV | Tocris | Cat#0106
nimodipine | Abcam | Cat#ab120138
thapsigargin | Tocris | Cat#1138
Phusion Site-Directed Mutagenesis Kit | Thermo Scientific | Cat#F541
Dynabeads Protein G Immunoprecipitation Kit | Life Technologies | Cat#10007D
Magnetic RNA-Protein Pull-Down Kit | Thermo Scientific | Cat#20164
Magna RIP RNA-Binding Protein Immunoprecipitation Kit | Sigma | Cat#17-700
Pierce Cell Surface Protein Isolation Kit | Thermo Scientific | Cat#89881
Pierce Nuclear Protein Extraction Kit | Thermo Scientific | Cat#78833
CelLytic M Cell Lysis Reagent | Sigma | Cat#C3228
Mouse: Olig3-Cre mouse | gift from Y. Nakagawa, University of Minnesota | N/A
Mouse: R26floxstopTeNT | gift from M. Goulding at the Salk Institute for Biological Studies | N/A
Oligonucleotides
Probes and shRNAs, see Table S1 | This paper | N/A
Primers for RT-PCR and qPCR, see Table S1 | This paper | N/A
Primers to insert E29 and flanking introns to the splicing reporter: F: CCGGAATTCCGGCTATGTGGCAACCCTAC | This paper | N/A
Primers to insert E29 and flanking introns to the splicing reporter: R: CGCGGATCCGCGTCTCCTTTGACTTCCTCT | This paper | N/A
Primers to measure E29 splicing in the splicing reporter: F: GGAGAAGTCTGCCGTTACTGCCCTGTG (DY-782 labeled) | This paper | N/A
Primers to measure E29 splicing in the splicing reporter: R: CCGTCGTCCTTGAAGAAGATGGTGC | This paper | N/A
Recombinant DNA
mouse βCaMKK construct: Lentiviral CaMKKbeta | Green et al., 2011b | Addgene Plasmid #33322; RRID:Addgene_33322
rat βCaMKK 1-460 construct: pSG5-FLAG-CaMKKbeta rat 1-460 | Green et al., 2011a | Addgene Plasmid #33324; RRID:Addgene_33324
Human Nova-2 ORF | Origene | Cat#RC216200L1V, RC216200L2V
pcDNA3-BK-GFP | Li et al., 2014 | N/A
pCKII-GFP | Li et al., 2016 | N/A
BK channel without E29 | This paper | N/A
BK channel with E29 | This paper | N/A
splice reporter pFlare5 vector | Stoilov et al., 2008 | N/A
pGFP-C-shLenti βCaMKK shRNA constructs | Origene | Cat#TL711303
pGFP-C-shLenti Nova2 shRNA constructs | Origene | Cat#TL508674
pcDNA3.1/Puro-CAG-ASAP1 | St-Pierre et al., 2014 | Addgene Plasmid #52519; RRID:Addgene_52519
AAV-CaMKIIa-GCaMP6s-P2A-nls-tdTomato | Gift from Jonathan Ting (unpublished) | Addgene Plasmid #51086; RRID:Addgene_51086
CaMKIV-GFP construct | Gift from Haruhiko Bito | N/A
CA-CaMKIV construct | Gift from Tian-Ming Gao | N/A
CaMBP4 construct | Gift from Tian-Ming Gao | N/A
pClamp 9 | Molecular Devices | https://www.moleculardevices.com/
Prism | GraphPad | https://www.graphpad.com/
MATLAB | Mathworks | https://www.mathworks.com/
ImageJ | Schneider et al., 2012 | https://imagej.nih.gov/ij/
LEAD CONTACT AND MATERIALS AVAILABILITY
Further information and requests for resources and reagents should be directed to the Lead Contact, Richard W. Tsien
(richard.tsien@nyulangone.org).
All unique reagents generated in this study are available from the Lead Contact with a completed Materials Transfer Agreement.
Cell lines
HEK293 cells (human, female) were purchased from the American Type Culture Collection (ATCC). Cells were cultured in
standard Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% FBS, 100 U/mL penicillin, 100 µg/mL streptomycin, and
2 mM L-glutamine. All cells tested negative for mycoplasma. The cell lines were authenticated by the source repository.
Primary cell cultures

Cortical neurons were cultured from postnatal day 0 (P0) male and female Sprague-Dawley rat pups as previously described (Li et al.,
2016; Ma et al., 2014). Mixed cohorts of female and male pups were used for all experiments to minimize gender effects. The frontal
cortex was isolated and washed twice in ice-cold modified HBSS (4.2 mM NaHCO3 and 1 mM HEPES, pH 7.35, 300 mOsm)
containing 20% fetal bovine serum (FBS; Hyclone, Logan, UT). Samples were washed and digested for 30 min in a papain solution
(2.5 mL HBSS + 145 U papain + 40 µL DNase) at 37 °C with gentle shaking every 5 min. Digestion was stopped by adding 5 mL of
modified HBSS containing 20% FBS. After additional washing, the tissue was dissociated using Pasteur pipettes of
decreasing diameter. The cell suspension was pelleted twice, filtered through a 70 µm nylon strainer, and plated on 10 mm coverslips
coated with poly-D-lysine. The cultures were maintained in NbActiv4 (BrainBits, Springfield, IL) at 37 °C in a 5% CO2 incubator. Half
of the medium was changed at 7 days and once per week thereafter.
Animals
C57BL/6 male and female mice (postnatal day 26), purchased from the Charles River Laboratory, were used for monocular deprivation
experiments. The Olig3Cre mouse line was a gift from Y. Nakagawa, University of Minnesota. The R26floxstop-TeNT (tetoxf/f) mouse
line was a gift from M. Goulding at the Salk Institute for Biological Studies. These two mouse strains were maintained on a mixed
background (Swiss Webster and C57BL/6). All mice were housed with a 12-hour light-dark cycle. Mixed cohorts of female and
male mice were used for all experiments to minimize gender effects. Animal protocols were performed in accordance with NIH
guidelines and approved by the Institutional Animal Care and Use Committees at New York University and Sun Yat-sen University.
METHOD DETAILS
Constructs
Nova-2 mutants were produced with the Phusion Site-Directed Mutagenesis Kit (Thermo Scientific). The BK channel without E29 was
first cloned from pcDNA3-BK-GFP (Li et al., 2014) via PCR and inserted into the pCKII-GFP construct, which carries an αCaMKII promoter to
restrict expression to pyramidal cells (Li et al., 2016). The E29 sequence was synthesized and inserted with the Phusion Site-Directed
Mutagenesis Kit (Thermo Scientific). The splice reporter containing E29 and partial flanking intron sequences was cloned by PCR
from genomic DNA obtained from rat brain tissue and inserted into the pFlare5 vector (Stoilov et al., 2008). For other constructs, see the Key
Resources Table.
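As an illustrative check on the cloning primers listed in the Key Resources Table, the short sketch below scans the two E29 insertion primers for restriction enzyme recognition sites. The primer sequences are copied from the table; the dictionary labels and the helper name `find_sites` are hypothetical. The recognition sequences for EcoRI (GAATTC) and BamHI (GGATCC) are standard, but the text does not state which enzymes were used for cloning, so the result only shows that the primer tails are compatible with such a strategy.

```python
# Recognition sequences of two common restriction enzymes (both palindromic,
# so scanning one strand is sufficient).
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}

# E29 insertion primers as listed in the Key Resources Table.
# The dictionary keys are hypothetical labels chosen for this example.
PRIMERS = {
    "E29_insert_F": "CCGGAATTCCGGCTATGTGGCAACCCTAC",
    "E29_insert_R": "CGCGGATCCGCGTCTCCTTTGACTTCCTCT",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: 0-based position of the first site} for sites found in seq."""
    seq = seq.upper()
    return {name: seq.find(motif) for name, motif in sites.items() if motif in seq}


if __name__ == "__main__":
    for label, primer in PRIMERS.items():
        print(label, find_sites(primer))
```

Under these assumptions, the forward primer carries an EcoRI site and the reverse primer a BamHI site near the 5′ end, which would allow directional insertion into a vector cut with the same pair of enzymes.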
Transfection and treatment of cortical neurons

Neurons were transfected 7 to 9 days after plating using a high-efficiency Ca2+-phosphate transfection method (Jiang and Chen,
2006). Experiments were performed 12–14 days after plating. For chronic spike blockade, neurons were treated with 1 µM TTX on
DIV12 for 48 h to block action potentials. Where indicated, additional antagonists (5 µM nimodipine, 4 µM KN93, or 3 µM
STO-609) were added with TTX. For chronic depolarization, neurons were treated with 20 mM K+ (20K), 40K, or 60K solution
(Na+ adjusted to maintain osmolarity) for 12 or 24 h. For action potential recording by patch clamp, coverslips with control
or TTX-treated neurons were removed from the culture medium, washed, and recorded in TTX-free artificial cerebrospinal fluid
(ACSF). For imaging the action potential-independent Ca2+ transients and voltage changes in the dendrites, control or TTX-treated
neurons were transferred to a TTX-containing Tyrode’s solution consisting of (in mM): 150 NaCl, 4 KCl, 1 MgCl2, 2 CaCl2, 10 HEPES,
10 glucose, pH 7.4, with 1 µM TTX. Where indicated, antagonists were added by manual pipetting to achieve a final concentration
of 10 µM PhTx, 10 µM APV, 5 µM nimodipine, or 1 µM thapsigargin in TTX-containing Tyrode’s solution.
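Reaching a stated final concentration by manual pipetting is a C1·V1 = C2·V2 calculation. The sketch below is only an illustration of that arithmetic: the function name, the stock concentration, and the bath volume are assumptions that do not come from the text, and the micromolar unit of the example target is likewise an assumption of the example.

```python
def stock_volume_ul(final_um, stock_mm, bath_volume_ul):
    """Volume of stock (in µL) to add so the bath reaches final_um (µM).

    Accounts for the added volume itself:
        stock_um * v = final_um * (bath_volume_ul + v)
    """
    stock_um = stock_mm * 1000.0  # convert mM to µM
    if stock_um <= final_um:
        raise ValueError("stock must be more concentrated than the target")
    return final_um * bath_volume_ul / (stock_um - final_um)


if __name__ == "__main__":
    # Hypothetical example: 5 µM target from an assumed 10 mM stock in 500 µL.
    v = stock_volume_ul(final_um=5.0, stock_mm=10.0, bath_volume_ul=500.0)
    print(f"add {v:.3f} µL of stock")
```

Volumes this small are hard to pipette accurately, which is one reason an intermediate dilution of the stock is often prepared first.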
Lentiviral transduction of cortical neurons

Lentivirus was produced as previously described (Ma et al., 2014). pGFP-C-shLenti constructs encoding
shRNAs against Nova-2 or βCaMKK (OriGene) were transfected into 293T cells along with the packaging plasmid psPAX2 and
the envelope plasmid pMD2.G. After 16 h, the medium was changed; the supernatant was collected 24 h later and cleared of
cell debris by filtering through a 0.45 µm filter. The viral particles were concentrated by centrifuging the filtrate at 70,000 × g for 2 h at
4 °C using a Beckman SW28 rotor. The viral pellet was then resuspended in sterile PBS, aliquoted, and stored at −80 °C. Lentivirus
particles (0.5–1 µL of viral stock diluted in 20 µL of PBS per coverslip) were added to cortical cultures containing 500 µL of medium on
DIV7. Experiments with overexpressed proteins or shRNAs were performed on DIV14.
Immunocytochemistry and image acquisition and analysis

Cells were fixed in ice-cold 4% paraformaldehyde in phosphate buffer with 20 mM EGTA and 4% sucrose; permeabilized with 0.1%
Triton X-100; blocked with 10% normal goat serum (or donkey serum); and incubated overnight at 4 °C in primary antibodies. For
surface staining of GluR1 or BK channel, coverslips were fixed and blocked in 7.5% normal donkey serum for 30 minutes in the
absence of Triton X-100. Surface staining with primary antibodies (Key Resources Table) was then performed for 1 hour at room
temperature. Cells were then permeabilized in 0.1% Triton X-100 and stained with anti-PSD-95 and anti-MAP2 (Figure S6) overnight at
4 °C. The next day, cells were washed with PBS, incubated at RT for 60 min with Alexa secondary antibodies (1:1000, Molecular
Probes), washed again and mounted with ProLong Gold + DAPI (Invitrogen).
Fixed cells were imaged with a 40X or 60X oil objective on a Zeiss LSM 800 confocal microscope. Intensity quantification was per-
formed with custom scripts in MATLAB (Mathworks) or ImageJ (NIH). For all analyses, a region of interest lacking cells was selected in
each field of view as an ‘off-cell’ background, and the mean intensity was subtracted from all cellular regions of interest for each color
channel. The following types of analyses were performed from at least three independent cultures in each case:
(1) Analysis of Nova-2, pCaMKIV, βCaMKK and flag-tagged Nova-2 (e.g., Figures 3 and 4). Nuclear and cytosolic regions of
interest were manually drawn while viewing only DAPI or MAP2 channels, blinded to the Nova-2 color channel.
Background-subtracted mean intensity was quantified and normalized to control conditions.
(2) Analysis of pCaMKII, pCaMKI and surface GluR1 intensity (e.g., Figure S6) was restricted to MAP2-positive regions of interest
containing a proximal dendrite (60 µm in length, 15 µm in width) at least 30 µm from the cell soma. For GluR1, PSD-95 puncta
were identified, and the average GluR1 signal intensity was measured in the region of those puncta. For pCaMKII and
pCaMKI, the regions of interest were manually drawn while viewing the MAP2 channel but blinded to the pCaMKII color channel.
The background was subtracted from pCaMKII intensity. Data represent mean ± SEM over 20 such dendrites, normalized to
control conditions.
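The background subtraction and normalization described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB or ImageJ scripts; the function names and the intensity values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def background_subtract(roi_mean, off_cell_mean):
    """Subtract the mean 'off-cell' background from an ROI mean intensity."""
    return roi_mean - off_cell_mean


def normalize_to_control(values, control_values):
    """Express each value relative to the mean of the control group."""
    control_mean = mean(control_values)
    return [v / control_mean for v in values]


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample SD / sqrt(n)."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical (ROI mean, off-cell background) pairs in arbitrary units.
control = [background_subtract(r, b) for r, b in [(120.0, 20.0), (110.0, 10.0), (130.0, 30.0)]]
treated = [background_subtract(r, b) for r, b in [(70.0, 20.0), (65.0, 15.0), (60.0, 10.0)]]
treated_norm = normalize_to_control(treated, control)
```

Normalizing to the control mean keeps values comparable across cultures imaged with different gain settings, which is presumably why the text reports normalized intensities.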
Transfection and electrophysiology of HEK293 cells

HEK293 cells were transfected via a high efficiency Ca2+-phosphate transfection method (Jiang and Chen, 2006), with constructs
encoding CMV promoter-driven BK channel with (E29) or without (ΔE29) E29, or with splicing reporter constructs co-expressed with
or without WT or mutant Nova-2, or with CaMKIV-GFP and flag-Nova-2 constructs, or with CA-CaMKIV and flag-Nova-2 constructs.
Experiments were done 2-3 days later. Whole-cell recordings were obtained at room temperature (Axopatch 200B, Molecular
Devices). Electrodes were pulled borosilicate glass capillaries (World Precision Instruments, MTW 150-F4), with 5–8 MΩ resistances,
before 80% series resistance compensation.
Electrophysiological recording of action potential half-width in cultured cortical neurons

DIV 10-14 cultured cortical neurons were used to measure homeostatic changes in action potential waveforms. Cultured cortical
neuron coverslips were placed into a submerged recording chamber and perfused with artificial cerebrospinal fluid (ACSF) at
4 mL/min held at room temperature (20-25 °C) and bubbled with 95%/5% O2/CO2. All cells were measured in current clamp
mode using borosilicate patch electrodes with tip resistance of 2-4 MΩ. Recordings were not corrected for the liquid junction
potential. Data were sampled at 10 kHz and low pass filtered at 2 kHz using a Bessel filter in pClamp 9 software. Recordings were not used
if the access resistance was above 25 MΩ or changed significantly throughout the recording (> 20%).
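The exclusion rule for recordings can be written as a simple filter. The sketch below assumes that access resistance (in megaohms) is measured once at the start and once at the end of a recording and that the change is taken relative to the starting value; the text does not specify either detail, and the function name is hypothetical.

```python
def accept_recording(access_start_mohm, access_end_mohm,
                     max_access_mohm=25.0, max_fractional_change=0.20):
    """Return True if a recording passes both access-resistance criteria.

    Assumption: the change is the absolute difference between the two
    measurements, relative to the starting value.
    """
    worst = max(access_start_mohm, access_end_mohm)
    change = abs(access_end_mohm - access_start_mohm) / access_start_mohm
    return worst <= max_access_mohm and change <= max_fractional_change
```

For example, a cell that starts at 15 and ends at 17 megaohms would be kept, whereas one that drifts from 15 to 19 megaohms (about 27%) would be excluded.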
The ACSF contained (in mM): 3 KCl, 10 D-Glucose, 122 NaCl, 1.25 NaH2PO4, 1.3 MgCl2, 2 CaCl2, 26 NaHCO3. The internal pipette
solution contained (in mM): 130 K Gluconate, 1 MgCl2, 10 HEPES, 0.3 EGTA, 10 Tris-Phosphocreatine, 4 Mg-ATP, 0.3 Na-GTP.
To minimize changes in action potential duration due to depolarizing current steps, all action potentials were evoked as a rebound
spike (anodal break response) (Hodgkin and Huxley, 1952). Briefly, a 500 ms long pulse between −500 pA and −800 pA was injected
into the neuron before it was allowed to rebound to resting potential. Upon return to resting potential, a single action potential was
usually evoked, and sweeps in which an action potential was successfully evoked were used for analysis. This method allows
an action potential to be evoked from a membrane potential negative enough to maximize removal of inactivation and minimize
the influence of variations in resting potential.
Action potential half width was measured using custom MATLAB scripts. Briefly, the action potential amplitude (baseline to
action potential peak value) was measured and the time between the half peak amplitude on the rising and falling phases was taken
to be the action potential half width.
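The half-width measurement can be illustrated with a short sketch. It is not the authors' MATLAB code: the baseline is passed in explicitly because the text does not say how it was estimated, and linear interpolation between samples is an added assumption.

```python
def half_width_ms(trace_mv, dt_ms, baseline_mv):
    """Width of a single spike at half amplitude (baseline to peak), in ms."""
    peak_idx = max(range(len(trace_mv)), key=lambda i: trace_mv[i])
    half = baseline_mv + (trace_mv[peak_idx] - baseline_mv) / 2.0

    def crossing(i_lo, i_hi):
        # Fractional sample index where the trace crosses `half`
        # between two adjacent samples (linear interpolation).
        v_lo, v_hi = trace_mv[i_lo], trace_mv[i_hi]
        return i_lo + (half - v_lo) / (v_hi - v_lo) * (i_hi - i_lo)

    # Walk back from the peak to the first sample at or above half amplitude.
    i = peak_idx
    while i > 0 and trace_mv[i - 1] >= half:
        i -= 1
    # Walk forward from the peak to the last sample at or above half amplitude.
    j = peak_idx
    while j < len(trace_mv) - 1 and trace_mv[j + 1] >= half:
        j += 1
    if i == 0 or j == len(trace_mv) - 1:
        raise ValueError("trace does not cross half amplitude on both sides")

    t_rise = crossing(i - 1, i)
    t_fall = crossing(j, j + 1)
    return (t_fall - t_rise) * dt_ms


# Synthetic triangular spike sampled at 10 kHz (0.1 ms per sample).
spike = [-60.0, -60.0, -20.0, 20.0, -20.0, -60.0, -60.0]
width = half_width_ms(spike, dt_ms=0.1, baseline_mv=-60.0)
```

With a 10 kHz sampling rate, interpolation matters: without it the half width could only be resolved in 0.1 ms steps.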
Recording Current-Voltage relationships in HEK cells transfected with BK channels

For recordings in HEK cells, a symmetrical solution (containing equal ionic concentrations in the internal pipette solution and external
solution) was used containing (in mM): 10 K Gluconate, 2 KCl, 2 MgCl2, 10 HEPES, 9.93 CaCl2, 10 EGTA, 10 Glucose. A current-voltage
(I-V) relationship was measured in voltage clamp mode with 250 ms long voltage steps between −100 mV and 100 mV
in 10 mV increments. Custom MATLAB scripts were used offline to take current measurements when the membrane current
stabilized after voltage steps.
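A minimal sketch of the step protocol and steady-state readout follows. It assumes the steps run from −100 mV to +100 mV, and it takes the "stabilized" current as the mean over the final 20% of each step; that window is an illustrative choice, not a value given in the text.

```python
def voltage_steps(start_mv=-100, stop_mv=100, step_mv=10):
    """Command voltages for the I-V protocol, inclusive of both ends."""
    return list(range(start_mv, stop_mv + step_mv, step_mv))


def steady_state_current(sweep_pa, tail_fraction=0.2):
    """Mean current over the last part of a step (assumed analysis window)."""
    n = max(1, int(len(sweep_pa) * tail_fraction))
    tail = sweep_pa[-n:]
    return sum(tail) / len(tail)


def iv_curve(sweeps_pa, voltages_mv):
    """Pair each command voltage with its steady-state current."""
    return [(v, steady_state_current(s)) for v, s in zip(voltages_mv, sweeps_pa)]
```

Averaging over a tail window rather than reading a single sample reduces the influence of noise on each point of the I-V curve.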
Ca2+ imaging on dendrite and soma of cultured neurons

Neurons were transfected with GCaMP6s on DIV 7-9 and Ca2+ transients were imaged on DIV13 to 15. Images were collected with
a 40X oil objective on a Zeiss 710 confocal microscope, 300 ms per frame. Intensity quantification was performed with custom
scripts in MATLAB (Mathworks).
For imaging of Ca2+ transients in soma, the 488 nm fluorescence changes were quantified using regions of interest in cell soma
(Figure 1B). Ca2+ transients were imaged in TTX-free Tyrode’s solution, consisting of (in mM): 150 NaCl, 4 KCl, 1 MgCl2, 2 CaCl2,
10 HEPES, 10 glucose, pH 7.4. The amplitude of a single action potential-induced Ca2+ transient was defined by the minimal amplitude of a single
Ca2+ transient that could propagate to the entire cell (soma and dendritic tree). Ca2+ transients that arose locally in the soma
but did not propagate to the dendrite were regarded as local under-threshold depolarization-induced Ca2+ transients and were
eliminated from the quantification. For imaging the action potential-independent Ca2+ transients in the soma, control or TTX-treated
neurons were imaged in a TTX-containing Tyrode’s solution.
For Ca2+ transients in dendrite, a 200 µm long dendrite located at least 30 µm from cell soma was chosen. Ca2+ transients
were imaged in TTX-containing Tyrode’s solution. Where indicated, antagonists were added by manual pipetting to achieve a
final concentration of 10 µM PhTx, 10 µM APV, 5 µM nimodipine or 1 µM thapsigargin in TTX-containing Tyrode’s solution.
In dendritic trees, sites of spontaneous Ca2+ influx were identified as regions of interest (ROIs) which showed sparse peaks of
GCaMP6s fluorescence through time, many of which spatially overlapped with spine-like protrusions in baseline GCaMP6s images
highlighting dendritic tree structure (Figures 6D and 6F). ROI shape and number were obtained automatically through a series
of operations. First, temporal stacks of fluorescence images were de-noised and the baseline GCaMP6s fluorescence was
subtracted pixel-wise. Noise elimination was performed using a custom method that kept intact the non-random correlation between
neighboring pixels showing coordinated activity in order to preserve localized signals from Ca2+ influx. An overly large set of putative
ROIs was then identified by using the separable non-negative matrix factorization (sNMF) algorithm. sNMF works by identifying
strong but temporally sparse signals and preventing the temporal redundancy between extracted signals, a good model for
spontaneous Ca2+ signals in dendrites. The final ROI set was obtained by unbiasedly selecting only those with cumulated temporal
fluorescence ($\sqrt{\sum_t F(t)^2}$) greater than the average + 3 × the standard deviation of cumulated fluorescence values drawn outside
candidate ROIs.
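The final selection step can be sketched as a threshold on cumulated fluorescence, computed here as the square root of the summed squared signal over time. This covers only the last filtering step, not the de-noising or sNMF stages; the function names are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def cumulated_fluorescence(trace):
    """Square root of the sum over time of the squared fluorescence signal."""
    return sqrt(sum(f * f for f in trace))


def select_rois(candidate_traces, outside_traces, n_sd=3.0):
    """Indices of candidate ROIs whose cumulated fluorescence exceeds
    mean + n_sd * SD of values drawn outside candidate ROIs."""
    outside = [cumulated_fluorescence(t) for t in outside_traces]
    threshold = mean(outside) + n_sd * stdev(outside)
    return [i for i, t in enumerate(candidate_traces)
            if cumulated_fluorescence(t) > threshold]
```

Because the threshold is derived from regions outside the candidate ROIs, it adapts to the noise level of each recording rather than relying on a fixed cutoff.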
Voltage imaging of cultured neurons

Cultured pyramidal neurons were transfected via a high efficiency Ca2+-phosphate method 7-9 days after plating (Jiang and Chen,
2006), with 1 µg pcDNA3.1/Puro-CAG-ASAP1 plasmid (St-Pierre et al., 2014). Experiments were done 5-7 days later. In the voltage
imaging data in Figures 6A–6C, the spontaneous AP in soma and bAP in dendritic spines were monitored by simultaneously measuring
the ΔF/F of the voltage sensor ASAP1 in soma and dendritic spines. We reasoned that 1) only action potentials, rather than subthreshold
synaptic currents, could backpropagate from soma to dendritic trees and elicit simultaneous voltage changes in soma, different
dendritic branches and dendritic spines; 2) due to the fast on and off kinetics of ASAP1 (2 ms), high speed recording (2.87 ms / frame in
our experiments) of the ΔF/F changes could reliably detect single spontaneous action potentials; ΔF/F changes induced by single
action potentials were comparable in both amplitude and duration; ΔF/F changes induced by bursts of action potentials were much larger
in amplitude and duration and were discarded in our analysis. Accordingly, we named the simultaneous ΔF/F changes (voltage
changes) in soma and spines as “AP in soma” and “bAP in spine,” respectively, and named the ΔF/F changes (voltage changes)
that took place only in spines as “Vspine.” When measuring Vspine, TTX was acutely applied to eliminate action potentials. Narrow
regions of interest containing the plasma membrane of dendritic spines (30-80 µm from soma) and cell soma were chosen, and time
lapse imaging with a speed of 2.87 ms per frame was performed on a Zeiss LSM 710 confocal microscope, utilizing GFP filter sets.
Background-subtracted fluorescence intensity was analyzed in ImageJ (NIH). ASAP1 waveforms were plotted as −ΔF/F (to account
for the negative change in ASAP1 fluorescence upon depolarization) and corrected for bleaching. The noise was filtered with
LOWESS smoothing. Identical cell-selection, bleaching correction and noise-filtering parameters were used for all datasets in
Figure 6. Data are plotted as mean ± SEM of 14-15 cells for each condition.
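As a rough illustration of how a sign-inverted fractional fluorescence change can be computed from a background-subtracted trace, consider the sketch below. The baseline definition (mean of the first few frames) is an assumption, and bleaching correction and LOWESS smoothing are deliberately left out because the text does not give their parameters.

```python
FRAME_MS = 2.87  # frame interval stated in the text


def neg_dff(trace, n_baseline):
    """Return the sign-inverted fractional change (F0 - F) / F0.

    Assumption: F0 is the mean of the first `n_baseline` frames.
    """
    f0 = sum(trace[:n_baseline]) / n_baseline
    return [(f0 - f) / f0 for f in trace]


def frame_times_ms(n_frames):
    """Acquisition time of each frame, in ms."""
    return [i * FRAME_MS for i in range(n_frames)]
```

Inverting the sign means that a depolarization, which dims ASAP1, appears as an upward deflection.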
To calculate the standard fluorescence-voltage (ΔF/F-ΔV) curve for ASAP1, neurons were perfused with high K+ Tyrode’s solutions with
Na+ adjusted to maintain osmolarity. The change of fluorescence intensity (−ΔF/F) was plotted against voltage according to 4 mM K+
(4K), −60 mV; 20K, −37 mV; 40K, −19 mV; 60K, −9 mV; 90K, 0 mV (Wheeler et al., 2012). The membrane potentials in different
conditions were then calculated according to the peak values of −ΔF/F.
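One way to turn such a calibration into a lookup is piecewise-linear interpolation, sketched below. It assumes the listed potentials are negative (−60 mV at 4 mM K+ rising to 0 mV at 90 mM K+). The fluorescence responses paired with each K+ level are hypothetical placeholders, since the measured calibration values are not given in the text, and linear interpolation is an illustrative choice rather than the authors' stated method.

```python
# Membrane potential (mV) assumed for each external K+ concentration (mM).
VOLTAGE_MV = {4: -60.0, 20: -37.0, 40: -19.0, 60: -9.0, 90: 0.0}

# HYPOTHETICAL sign-inverted dF/F responses at each K+ level (placeholders).
RESPONSE = {4: 0.00, 20: 0.05, 40: 0.10, 60: 0.14, 90: 0.18}


def calibration_points(response, voltage_mv):
    """Sorted (response, voltage) pairs for K+ levels present in both tables."""
    return sorted((response[k], voltage_mv[k]) for k in response if k in voltage_mv)


def interpolate_voltage(value, points):
    """Piecewise-linear lookup of voltage for a fluorescence response."""
    if not points[0][0] <= value <= points[-1][0]:
        raise ValueError("response outside calibrated range")
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= value <= x1:
            return y0 + (y1 - y0) * (value - x0) / (x1 - x0)


points = calibration_points(RESPONSE, VOLTAGE_MV)
```

Refusing to extrapolate beyond the calibrated range avoids reporting voltages that the calibration cannot support.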
Immunoprecipitation
Immunoprecipitation was performed with Dynabeads Protein G Immunoprecipitation Kit (Life Technologies) following the kit protocol
as described (Li et al., 2016). Briefly, for binding the antibody with the beads, 5 µg of antibody or control IgG was crosslinked and
incubated at room temperature with 100 µL Dynabeads. The supernatant was removed with the help of a magnet to retain the
bead-antibody complex, which was then washed with Ab Binding & Washing Buffer. For lysate preparation, cultured cortical neurons
or HEK293 cells were washed with PBS and lysed on ice with IP Lysis Buffer (Pierce) containing protease and phosphatase inhibitors. The
lysate was centrifuged at 16,000 g for 10 min at 4 °C to pellet the cell debris. The resulting supernatant was incubated with the
bead-antibody complex with rotation for 30 min at room temperature. The bead-antibody-antigen complex was then washed 3 times using
washing buffer. The protein complex was eluted with Elution buffer and the eluent was mixed and heated with 2× SDS sample buffer
containing β-mercaptoethanol at 90 °C for 10 min.
Protein sample preparation and western blot

Nuclear and cytosolic protein fractionation was performed as described (Li et al., 2014). Briefly, nuclear and cytosolic protein was
extracted using the Pierce Nuclear Protein Extraction Kit (Thermo), following the manual. Neurons were collected by centrifugation
at 500 g for 3 min at 4 °C and were resuspended in 200 µL precooled CER I by 15 s of intense vortexing. The solution was then
incubated on ice for 10 min. CER II (11 µL) was precooled and added to the solution. After 5 s of intense vortexing, the solution was
incubated on ice for 1 min. After another 5 s of intense vortexing, the solution was then centrifuged at 16,000 g for 5 min at 4 °C and the
supernatant collected as cytosolic protein. NER (100 µL) was added to the precipitate and the sample was vortexed intensively for 15 s. The
solution was then incubated on ice for 10 min. The vortexing and incubation were repeated four times, and then the sample was
centrifuged at 16,000 g for 10 min at 4 °C. The supernatant represented purified nuclear protein.
Plasma membrane protein extraction was performed with Pierce Cell Surface Protein Isolation Kit as described (Li et al., 2014). For
surface biotinylation, neurons were chilled on ice, washed twice with ice-cold PBS, and then incubated with sulfo-NHS-SS-biotin
dissolved in PBS for 30 min at 4 °C. Unreacted biotin was quenched by washing cells with Quenching Solution. Cultures were harvested
in Lysis Buffer. Homogenates were centrifuged at 10,000 g for 2 min at 4 °C. The resulting supernatant was rotated 1 hr at room
temperature with NeutrAvidin Agarose. The beads were washed with Wash Buffer and analyzed by immunoblotting with each antibody.
Whole cell lysates were prepared as previously described (Li et al., 2016). Cells were lysed with CelLytic M Cell Lysis Reagent
(Sigma). Protein concentration was measured by the BCA Protein Assay Kit (Thermo). The lysates were combined with 2× SDS
loading buffer, then boiled for 10 min.
Protein samples were loaded onto 4%–20% gradient precast or 10% SDS-PAGE gel (Bio-Rad). Proteins were transferred to a
PVDF membrane (Millipore) for 1 h at 350 mA, and membranes were incubated in Odyssey Blocking Buffer (Li-Cor) for 1 h at
room temperature. The membrane was incubated overnight at 4 °C with primary antibodies. The blots were washed three times in
PBS containing 0.1% Tween-20 for 15 min and then incubated with IRDye-conjugated secondary antibody for 1 h in PBS, 0.1%
Tween-20 at room temperature. Immunoreactivity was detected by an Odyssey imaging system (Li-Cor).
Oligo-RNA pull-down
RNA pull-down was performed with Magnetic RNA-Protein Pull-Down Kit (Thermo) following the manual. Briefly, synthesized
5′-biotinylated 2′-OMe-RNA oligonucleotides (Eurofins) (Key Resources Table) were bound to streptavidin magnetic beads (Thermo)
in RNA Capture Buffer at room temperature with agitation. Cleared brain lysates from 1-month-old mouse cortex (Figure 2B) or
transfected HEK cells (Figure S5E), prepared in IP Lysis Buffer (Pierce) containing protease and phosphatase inhibitors, were mixed with
Protein-RNA Binding Buffer and then incubated with the packed beads at 4 °C for 1 hr. Beads were washed three times with
wash buffer and the precipitate was subjected to immunoblot analysis. Signal intensities were quantified by an Odyssey imaging
system (Li-Cor).
RNA immunoprecipitation
RNA immunoprecipitation was performed with Magna RIP RNA-Binding Protein Immunoprecipitation Kit (Sigma) following
the manual. Briefly, 5 µg of anti-Nova-2 antibody or control IgG was incubated with magnetic beads in RIP wash buffer at room
temperature for 30 min. The antibody-prebound beads were then washed 3 times with RIP wash buffer at room temperature. For
lysate preparation, mouse cortical lysates were prepared with complete RIP Lysis Buffer. The lysates were mixed with prebound
beads in RIP Immunoprecipitation Buffer containing RIP wash buffer, EDTA and RNase inhibitor, overnight with rotation at 4 °C.
The magnetic bead-antibody-RNA-binding protein complex was washed with wash buffer and then resuspended in proteinase
K buffer at 55 °C for 30 minutes. Phenol:chloroform:isoamyl alcohol and chloroform were used to separate the phases and the
RNA was precipitated with Salt Solution I, Salt Solution II, Precipitate Enhancer and absolute ethanol at −80 °C overnight.
The RNA precipitate was centrifuged and washed with 80% ethanol solution. The pellets were dried and resuspended in
RNase-free water and used for further RT-PCR.
RT-PCR and Realtime qPCR

Total RNA was isolated from cortical neurons or HEK293 cells with RNeasy Mini kit (QIAGEN) according to manufacturer’s
instructions. Extracted RNA was reverse transcribed into first strand cDNA using oligo dT and Superscript III (Invitrogen). PCR
was performed with AmpliTaq Gold 360 Master Mix with DY-682 labeled primers (for X6). The PCR products were either resolved
on agarose gel (X1-X5) or denatured and resolved on 6% polyacrylamide/8 M urea denaturing gels (X6). The band density was
analyzed by an Odyssey imaging system (Li-Cor). qPCR was performed with SYBR-green PCR master mix (Fermentas) with primers
(Table S1) using the DNA engine Opticon 2 (Bio-Rad).
Thalamic input elimination and immunohistochemistry

The Olig3Cre mouse line was a gift from Y. Nakagawa, University of Minnesota. R26floxstop-TeNT (tetoxf/f) was a gift from M. Goulding at
the Salk Institute for Biological Studies. Neurotransmission is selectively blocked from the thalamic cells as expression of Cre re-
moves a stop codon at the R26 locus (removing the flox sites) permitting the expression of the tetanus toxin light chain subunit
(TeNT) only within thalamic Olig3+ neurons. Olig3Cre mice were bred with animals that were homozygous for tetoxf/f. Pups from
the subsequent litters were genotyped for Cre and presence of the TeNT allele in the R26 locus. Olig3Cre-only animals were used
as controls and Olig3Cre plus tetoxf/+ heterozygous animals as mutants. All mouse strains were maintained on a mixed background
(Swiss Webster and C57/BL6).
Mice were perfused intracardially with 4% PFA after being anesthetized either on ice or by IP administration of Sleepaway. Brains
were post-fixed for 30 minutes and cryopreserved in 30% sucrose following the perfusion and brain harvest. Coronal sections (16 µm)
were obtained using a cryostat (Leica Biosystems) and collected on super-frost coated slides, then allowed to dry overnight and stored
at −20 °C until use. For immunofluorescence, cryosections were thawed and allowed to dry for 5-10 min and rinsed twice in 1x PBS.
They were incubated at room temperature in a blocking solution of PBS-T (PBS with 0.1% Triton X-100) and 10% normal donkey serum (NDS)
for 60 min, followed by incubation with primary antibodies in PBS-T and 1% NDS at 4 °C overnight. Samples were then washed 3 times
with PBS-T and incubated with fluorescence-conjugated secondary Alexa antibodies (Life Technologies) in PBS-T with 1% NDS at
room temperature for 60-90 min. Slides were then incubated for 30 s with DAPI, washed 3 times with PBS-T and once with PBS.
Slides were mounted with Fluoromount G (Southern Biotech) and imaged on a Zeiss LSM 510 Laser scanning microscope.
Monocular deprivation
Monocular deprivation was performed as previously described (Ma et al., 2013). Briefly, the mice (postnatal day 26) were
anesthetized intraperitoneally with a mixture of 100 mg/kg ketamine hydrochloride and 10 mg/kg xylazine hydrochloride. Under a
microscope and illuminator, the eyelashes and the edges of the eyelids were cut using spring scissors. The eyelids of the right eye were
sutured with three mattress sutures. A thin layer of xylocaine 2% jelly and bacitracin zinc ointment was applied to the sutured eyelids.
After 5 days of monocular deprivation, mouse brains were sliced for immunostaining or RNA isolation. Nova-2 localization or E29 splicing
in the monocular region of primary visual cortex V1, both contralateral and ipsilateral to the visually deprived eye, was analyzed.
Protein sample preparation and mass spectrometry analysis

Protein sample preparation and mass spectrometry analysis were performed as described (Ma et al., 2014). Briefly, flag-tagged Nova-2
was co-transfected with CA-CaMKIV or GFP into HEK293 cells. 48 h after transfection, the cells were washed with ice-cold PBS and
lysed with IP Lysis buffer (Thermo Fisher) containing a protease inhibitor cocktail tablet (Roche, Indianapolis, IN). The lysates were
mixed with CaM kinase reaction buffer (Enzo Life Sciences) and incubated with anti-flag antibody (Cell Signaling Technology) and
Protein A/G Sepharose beads. The immunoprecipitated complex was separated by SDS-PAGE. Corresponding areas showing
Nova-2 bands were cut out from the Coomassie blue-stained SDS-PAGE gels. Gel pieces underwent in-gel digestion and the
phosphopeptides were enriched by titanium dioxide (GL Sciences Inc., Japan) based on a previously published protocol (Rappsilber et al.,
2007). Resulting peptides were subjected to Nano LC-MS/MS analysis using a Thermo Scientific EASY-nLC 1000 coupled to a Q
Exactive mass spectrometer (Thermo Fisher Scientific). A self-packed 75 µm × 25 cm reversed phase column (Reprosil C18,
3 µm, Dr. Maisch GmbH, Germany) was used for peptide separation. Peptides were eluted by a gradient of 3%–30% acetonitrile
in 0.1% formic acid over 60 min at a flow rate of 250 nL/min. The Q Exactive was operated in the data-dependent mode with survey
scans acquired at a resolution of 50,000 at m/z 400. Up to the top 10 most abundant precursors from the survey scan were selected
with an isolation window of 1.6 Thomsons and fragmented by higher energy collisional dissociation with normalized collision energies
of 27. The maximum ion injection times for the survey scan and the MS/MS scans were 20 and 60 ms, respectively, and the ion target
value for both scan modes was set to 1,000,000. Each sample was processed and analyzed in triplicate.
Phosphopeptide identification and quantitation

Phosphopeptide identification and quantitation were performed as described (Ma et al., 2014). Briefly, the raw files were processed
using the MaxQuant computational proteomics platform version 1.2.7.0 for peptide identification and quantitation. The fragmentation
spectra were searched against the Uniprot protein database allowing up to two missed tryptic cleavages. Carbamidomethylation of
cysteine was set as a fixed modification, and phosphorylation of serine/threonine/tyrosine, oxidation of methionine and protein N-ter-
minal acetylation were used as variable modifications for database searching. The precursor and fragment mass tolerances were
set to 7 and 20 ppm, respectively. Quantitation of the phosphopeptides was performed using the ion intensity values calculated
by MaxQuant for the quintuply charged peptides. Normalization of phosphopeptide intensities to global peptide intensities had no
effect on calculated ratios.
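For readers unfamiliar with ppm tolerances, the sketch below shows how a mass error in parts per million is computed and compared against the 7 ppm (precursor) and 20 ppm (fragment) limits quoted above. It illustrates the arithmetic only and is not part of MaxQuant; the function names and example m/z values are hypothetical.

```python
PRECURSOR_TOL_PPM = 7.0   # precursor tolerance stated in the text
FRAGMENT_TOL_PPM = 20.0   # fragment tolerance stated in the text


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm):
    """True if the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm
```

At m/z 1000, a 7 ppm window corresponds to ±0.007, so an observed value of 1000.005 would match as a precursor while 1000.010 would match only under the wider fragment tolerance.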
All statistical analyses were performed using Prism (GraphPad Software). Means of two groups were compared using Student’s t
test. One-way ANOVA was used to compare group means between more than two groups, followed by either Dunnett’s multiple
comparisons test when all other groups were compared to the control group, or Tukey’s multiple comparisons test when
means of each pair of groups were compared. All comparisons are two-sided. Statistical significance was determined as follows:
*p < 0.05, **p < 0.01. Data are shown as mean ± SEM. The statistical details of all experiments, including the number of samples
and p values, can be found in figures and figure legends.
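The analyses themselves were run in Prism. Purely as an illustration of the two-group comparison and the significance labels, the sketch below computes the pooled-variance Student's t statistic and maps a p value to the asterisk notation used in the figures. Converting t to a p value requires the t distribution and is omitted; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees of freedom).
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def significance_label(p):
    """Map a p value to the asterisk notation: * p < 0.05, ** p < 0.01."""
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""
```

For comparisons among more than two groups, the text specifies one-way ANOVA followed by Dunnett's or Tukey's test, which this sketch does not cover.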
DATA AND CODE AVAILABILITY
All data supporting the findings of this study and custom MATLAB code for calcium imaging are available upon request from the
Lead Contact.
Figure S1. APD Increased after Chronic TTX Treatment, Related to Figure 1
(A) Representative waveforms of action potential recorded after TTX treatment (24 or 48 h). (B) Quantification of half width of action potentials in the groups in (A)
(n = 10. Data for control and 48 h groups is from Figure 1A).
Figure S2. Effects of Inactivity and E29 Splicing on Trafficking and Expression of BK Channels, Related to Figure 1
(A) Expression level of BK channel mRNA was assayed by RT-PCR (n = 4). (B) BK channel expression in the plasma membrane was assayed by western blotting
after biotinylation-mediated membrane protein fractionation. (C) Immunofluorescence of surface BK channel. Surface BK channel was probed with antibodies
targeting extracellular N terminus. (D) Quantification of intensity of surface BK channel (n = 10). (E) Overexpression of GFP-tagged BK channel isoforms in HEK293
cells. (F) Quantification of surface fluorescence intensity of GFP-tagged BK channel isoforms. (G) Expression level of GFP-tagged BK channel isoforms was
assayed by western blotting after biotinylation-mediated membrane protein fractionation. Na/K ATPase, Lamin B and GAPDH are markers for plasma membrane,
nucleus and cytosol.
Figure S3. Effects of Chronic Inactivity and Depolarization on AS of the BK Channel, Related to Figure 1
(A) Impact of chronic inactivity on alternative splicing of X1 to X5. Cultured cortical neurons were sham- or TTX-treated for 48 h. Separation of RT-PCR products
obtained with specific primers flanking each alternatively spliced site (X1 to X5). Closed arrows indicate the longer RT-PCR products obtained if the alternative
exon was included, open arrows indicate the shorter products if that exon was excluded. (B) Quantification of inclusion or exclusion of X1 to X6 by quantitative RT-PCR.
(C) High K+ solution maintained long-term depolarization in resting membrane potential. (D) Impact of sustained depolarization on E29 inclusion. Cultured
cortical neurons were sham- or high K+-treated (20 mM, 40 mM and 60 mM, respectively) for 12 or 24 h. E29 inclusion was examined by RT-PCR.
Figure S4. Nova-2 Is Responsible for E29 AS, Related to Figure 2

(A) Validation of the knockdown of Nova-2 expression by shRNAs. Cortical neurons were transfected with shRNAs either targeting the untranslated region (UTR)
of the Nova-2 gene or targeting the Nova-2 coding sequence (CDS). Both shRNAs efficiently knocked down endogenous Nova-2 expression. (B) UTR-lacking
flag-Nova-2 plasmids (Nova-2 (R)) were co-expressed with shRNAs targeting CDS (targeting three different regions) or shRNAs targeting UTR in HEK293 cells.
The shRNAs targeting CDS efficiently knocked down flag-Nova-2 expression, whereas shRNAs targeting UTR did not. (C) E29 inclusion could only be induced by
wild-type (WT) Nova-2, but not ΔNLS-Nova-2. WT-Nova-2-induced E29 inclusion was prevented by the shRNAs targeting the Nova-2 coding sequence (CDS),
but not by shRNAs targeting the untranslated region (UTR) of the Nova-2 gene.
Figure S5. Identification of CaMKIV-Mediated Nova-2 Phosphorylation Sites by Mass Spectrometry, Related to Figure 5
(A) Schematic diagram of Nova-2 protein as in Figure 7. (B to D) Mass spec results of phosphorylation level of S25, T27 and S194 sites. (E) WT and S194E Nova-2
do not differ in RNA binding. Flag-tagged WT Nova-2 or S194E Nova-2 was overexpressed in HEK293 cells. RNA pull-down assay was performed from the cell
lysates by probe A1 (Figure 2). The RNA binding ability of WT and S194E Nova-2 was assayed by western blotting with anti-flag antibody.
Figure S6. Ca2+ Transients, Surface GluA1 Expression, and Activation of CaMKs in Con or TTX-Treated Neurons, Related to Figure 6
(A) Ca2+ transients were imaged from the soma of control neurons, with no TTX present. (B) Ca2+ imaging, from the soma of control neurons (upper) or TTX-treated
(48 h) neurons, was performed in the acute presence of TTX. (C) Ca2+ transients induced by action potential-independent synaptic transmission (red, imaged with
TTX present) or by back-propagating action potentials (bAP) (blue, imaged without TTX). (D and E) Puncta area (indicating active spines, D) and the number of
Ca2+ transients (per minute, E) in each puncta (active spine) in control and TTX-treated (48 h) neurons (n = 14 to 15). (F) The influence of chronic inactivity on
surface GluA1 expression. Surface-labeling of GluA1 in the dendrites from control and TTX-treated (48 h) neurons. PSD95 as synaptic marker, MAP2 as dendrite
marker. Scale bar 10 µm. (G) Quantification of surface GluA1 immunofluorescent intensity in the dendrites from control and TTX-treated (48 h) neurons. (H) The
activation of CaMKII in control and TTX-treated neurons. Immunofluorescence of phosphorylated CaMKII on Thr286 site, indicating the level of
autophosphorylated CaMKII. (I) Quantification of dendritic pCaMKII intensity in control and TTX groups. (J) The activation of CaMKI in control and TTX-treated neurons.
Immunofluorescence of phosphorylated CaMKI on Thr177/178 site, indicating the activation of CaMKI by CaMKK. (K) Quantification of dendritic pCaMKI
intensity in control and TTX groups. MAP2 as dendrite marker. Scale bar 10 µm.
Figure S7. Regulation of pCaMKIV Activation, Nova-2 Localization, E29 Splicing, and CaMK Localization by Chronic Inactivity and CaV1-
CaMK Blockers or Inhibitors, Related to Figure 7
(A) The activation of nuclear CaMKIV in different groups as indicated. Immunofluorescence of phosphorylated CaMKIV on Thr196 site indicating the level of
CaMKIV activation by CaMKK. (B) Normalized nuclear/cytosolic ratio of pCaMKIV immunofluorescence intensity in each group as indicated. (C) The localization
of Nova-2 in different groups as indicated. (D) Normalized nuclear/cytosolic ratio of Nova-2 immunofluorescence intensity in each group as indicated. (E) 12 h and
24 h high potassium media treatment (20 mM, 40 mM and 60 mM, respectively) led to Nova-2 translocation from nucleus to cytosol. (F) Long-term depolarization
reduced the inclusion of E29, which could be blocked by CaV1 blocker nimodipine and CaM kinases blocker KN-93. (G and H) Fold changes in abundance of
CaM, αCaMKI, βCaMKI, γCaMKI, δCaMKI and γCaMKII in the nucleus, cytosol and nucleus/cytosol ratio. (G) Representative micrographs of different proteins in
control or TTX-treated neurons. (H) Ratios of levels after TTX treatment (48 h) relative to levels in control neurons. (I) Validation of the knockdown of βCaMKK
expression by shRNAs. flag-tagged βCaMKK was co-expressed with shRNAs targeting different locations of rat βCaMKK CDS. The efficiency of shRNAs was
assayed by western blotting.
Article
An Autonomous Oscillation Times and Executes Centriole Biogenesis
Mustafa G. Aydogan,
Thomas L. Steinacker,
Mohammad Mofatteh, ..., Alain Goriely,
Michael A. Boemo, Jordan W. Raff
Correspondence
mustafa.aydogan@path.ox.ac.uk
(M.G.A.),
mb915@cam.ac.uk (M.A.B.),
jordan.raff@path.ox.ac.uk (J.W.R.)
In Brief
Feedback-driven oscillations in centriolar
Plk4 kinase levels—normally entrained by
the cell-cycle oscillator but capable of
running autonomously—trigger and time
centriole biogenesis to ensure that
daughter centrioles grow at the right time
and to the right size.
Highlights
• Centriolar Plk4 levels oscillate and act as a switch for centriole biogenesis
• Oscillations may be generated via an Asl/Plk4 delayed negative feedback loop
• Plk4 oscillations are entrained and phase-locked by the Cdk/Cyclin oscillator (CCO)
• Plk4 oscillations can drive centriole biogenesis even when the CCO is perturbed
Aydogan et al., 2020, Cell 181, 1566–1581
June 25, 2020 © 2020 The Author(s). Published by Elsevier Inc.
An Autonomous Oscillation
Times and Executes Centriole Biogenesis
Mustafa G. Aydogan,1,5,* Thomas L. Steinacker,1,5 Mohammad Mofatteh,1 Zachary M. Wilmott,1,2 Felix Y. Zhou,3
Lisa Gartenmann,1 Alan Wainman,1 Saroj Saurya,1 Zsofia A. Novak,1 Siu-Shing Wong,1 Alain Goriely,2
Michael A. Boemo,1,4,* and Jordan W. Raff1,6,*
1Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, UK
2Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK
3Ludwig Institute for Cancer Research, University of Oxford, Oxford OX3 7DQ, UK
4Present address: Department of Pathology, University of Cambridge, Cambridge CB2 1QP, UK
6Lead Contact
*Correspondence: mustafa.aydogan@path.ox.ac.uk (M.G.A.), mb915@cam.ac.uk (M.A.B.), jordan.raff@path.ox.ac.uk (J.W.R.)

SUMMARY
The accurate timing and execution of organelle biogenesis is crucial for cell physiology. Centriole biogenesis is regulated by Polo-like kinase 4 (Plk4) and initiates in S-phase when a daughter centriole grows from the side of a pre-existing mother. Here, we show that a Plk4 oscillation at the base of the growing centriole initiates and times centriole biogenesis to ensure that centrioles grow at the right time and to the right size. The Plk4 oscillation is normally entrained to the cell-cycle oscillator but can run autonomously of it, potentially explaining why centrioles can duplicate independently of cell-cycle progression. Mathematical modeling indicates that the Plk4 oscillation can be generated by a time-delayed negative feedback loop in which Plk4 inactivates the interaction with its centriolar receptor through multiple rounds of phosphorylation. We hypothesize that similar organelle-specific oscillations could regulate the timing and execution of organelle biogenesis more generally.
INTRODUCTION

Albert Claude’s landmark paper (Claude, 1943) challenged the idea that cells are a mere bag of enzymes whose contents grow freely in the cytoplasm with no active regulation. We now appreciate the diverse and compact nature of the many organelles in the cytoplasm (Marsh et al., 2001), yet the physical mechanisms that regulate the number and size of these organelles remain largely unknown (Marshall, 2016). For most organelles in the cell, however, this question has been difficult to address, as the variation in their numbers and 3D-shape has made it challenging to monitor their growth, or even to determine which parameter (e.g., their surface area, volume, or perhaps the amount of a limiting component) best defines their size.
Centrioles are highly structured organelles that form centrosomes and cilia (Bettencourt-Dias et al., 2011; Nigg and Holland, 2018; Nigg and Raff, 2009). Their linear structure and tightly controlled pattern of duplication make them an attractive model with which to study organelle biogenesis (Goehring and Hyman, 2012; Marshall, 2016). Most cells are born with a single pair of centrioles that duplicate precisely once during S-phase, when a daughter centriole grows out orthogonally from the base of each mother until it reaches the same size as its mother (Banterle and Gönczy, 2017; Fırat-Karalar and Stearns, 2014; Nigg and Holland, 2018). To monitor the dynamics of centriole growth, we recently examined living syncytial Drosophila embryos where we could follow the assembly of hundreds of centrioles as they duplicate in near-synchrony in a common cytoplasm (Aydogan et al., 2018). These studies revealed that centriole growth in these embryos is homeostatic: when centrioles grow slowly, they grow for a longer period; when centrioles grow quickly, they grow for a shorter period. As a result, centrioles grow to a consistent size.
Polo-like kinase 4 (Plk4) is the master regulator of centriole biogenesis and it is initially recruited to a ring around the mother centriole, but this ring resolves into a single focus on the side of the mother, defining the site of daughter centriole assembly (Banterle and Gönczy, 2017; Fırat-Karalar and Stearns, 2014; Leda et al., 2018; Nigg and Holland, 2018; Takao et al., 2019). Unexpectedly, we found that Plk4 not only determines the position of this site, but also helps to establish the inverse relationship between the rate and period of daughter centriole growth (Aydogan et al., 2018). Plk4 presumably influences the rate of centriole growth, at least in part, by phosphorylating Ana2/STIL to promote its interaction with Sas-6 and, consequently, the assembly of the central cartwheel (Dzhindzhev et al., 2014; Kratz et al., 2015; Ohta et al., 2014), the 9-fold symmetric structure that forms the backbone of the growing daughter centriole (Kitagawa et al., 2011; van Breugel et al., 2011, 2014). It is less clear, however, how Plk4 might influence the period of centriole growth.
Figure 1. Plk4 Levels Oscillate at the Centriole in a Process Entrained by the CCO
(A) Top panel: micrograph shows an image from a time-lapse movie of an embryo expressing Plk4-NG. Middle panels: micrographs illustrate the centriolar Plk4-
NG oscillation during nuclear cycle 12—obtained by superimposing all the Plk4-NG foci (n = 60) at each time point (see STAR Methods). Bottom panel:
quantification of centriolar Plk4-NG levels during nuclear cycles 11–13 in a single embryo (red arrows highlight equivalent time points in the middle panels).
(B) Graphs show the mathematical regression of centriolar Plk4-NG dynamics during S-phase of cycles 11–13 (regression mean ± SEM). R² values indicate
goodness-of-fit. N ≥ 15 embryos; n = 24, 37, and 53 centrioles (mean) per embryo over cycles 11–13, respectively.
(C) The bar charts quantify the oscillation parameters, derived from the data shown in (B). Data are presented as mean ± SD. Statistical significance was assessed
using an ordinary one-way ANOVA test (for Gaussian-distributed data) or a Kruskal-Wallis test (***p < 0.001; ****p < 0.0001; ns, not significant).
(D) Micrographs show, and pie charts quantify, the distribution of Plk4-NG at centrioles assessed by 3D-SIM at the indicated phases of the nuclear cycle (see
STAR Methods). N = 6 embryos per cell-cycle stage; n = 20 centrioles per embryo; all images were scored blindly by 3 assessors and the mean score is shown
(scale bar, 0.5 μm).
Recent studies have shown that Plk4 localizes to centrioles in a cyclical manner in both fly embryos (Aydogan et al., 2018) and human cultured cells (Takao et al., 2019), but the functional significance of this localization pattern is unclear. Here, we show that a Plk4 oscillation at the base of the growing centriole initiates and times centriole biogenesis in fly embryos.

RESULTS AND DISCUSSION

Plk4 Levels Oscillate at the Base of Growing Daughter Centrioles
To investigate the cyclical recruitment of Plk4 to the centrioles, we generated flies transgenically expressing Plk4-mNeonGreen (Plk4-NG) under the control of its own promoter in a Plk4 mutant background. We monitored centriolar Plk4-NG levels in living Drosophila syncytial embryos, where the duration of S-phase gradually elongates over nuclear cycles 11–13 (Figures 1A, S1, and S2A; Video S1). Centriolar Plk4-NG levels oscillated during each cycle: levels started to rise in M-phase, peaked in early-mid S-phase, and were minimal by the next M-phase (Figures 1A and S2A). We fit the S-phase oscillations in individual embryos (Figures S1C and S1D) to derive an average S-phase oscillation for each cycle (Figure 1B).
Not surprisingly, the Plk4 oscillations appeared to be entrained by the core Cdk/Cyclin oscillator as their period increased as nuclear cycles slowed during cycles 11–13 (Figure 1C). Moreover, genetically altering the duration of the nuclear cycles elicited corresponding alterations in the Plk4 oscillation period (Figures 1E and 1F). Interestingly, however, the Plk4 oscillation exhibited adaptive behavior: as the period (T) of the oscillation tended to increase at successive cycles, its amplitude (A) tended to decrease, so that the total amount of Plk4 recruited to centrioles, i.e., the area under the S-phase oscillation curve (area under the curve [U]), remained relatively constant (Figure 1C).
Plk4 is initially recruited to a ring around the mother centriole that resolves into a single hub that defines the site of daughter centriole assembly (Banterle and Gönczy, 2017; Fırat-Karalar and Stearns, 2014; Nigg and Holland, 2018). To examine how this localization related to the Plk4 oscillations, we used 3D-structured illumination super-resolution microscopy (3D-SIM) to assess the centriolar localization of Plk4 during the nuclear cycles in living embryos. Plk4-NG was only very briefly detectable in a ring during late-mitosis; at all other stages it appeared largely as a single hub (Figure 1D). Thus, the recruitment and loss of Plk4 from the centriole wall is not responsible for the S-phase oscillation we observe in these embryos; instead, centriolar Plk4-NG levels oscillate at the base of the growing daughter centriole.

Plk4 Oscillations Time and Execute Centriole Biogenesis
To test whether the Plk4 oscillations were important for centriole biogenesis, we generated flies co-expressing Plk4-NG (in a Plk4 mutant background) and the centriole cartwheel component Sas-6-mCherry, which is irreversibly incorporated into the base of the growing daughter centriole cartwheel and can be used to monitor centriole growth in fly embryos (Aydogan et al., 2018). These flies laid embryos that often failed to hatch (Figure S3C), but we simultaneously measured Plk4 oscillations and centriole growth in those embryos that appeared to be developing normally (Figures 2A, S3A, and S3B; Video S2). The mother centrioles in these embryos were often slightly delayed in initiating daughter centriole growth (Figures 2A, S3D, and S3E), allowing us to measure the amount of Plk4 at the centrioles when daughter centrioles either started or stopped growing (Figure 2A, colored dotted lines).
Strikingly, the centriolar levels of Plk4 at which centriole growth initiated at each cycle (“Start”; Figures 2A and 2B) were not significantly different than the levels at which centriole growth stopped (“Stop”; Figures 2A and 2B). This suggests that at each cycle there is a threshold level of centriolar Plk4 that is required to support centriole growth: above this threshold the centrioles can grow, below this threshold they cannot. If the threshold concept is correct, then mother centrioles that failed to recruit sufficient Plk4 should not grow a daughter. We observed that the centrioles in a fraction of the embryos expressing both Plk4-NG and Sas-6-mCherry (mostly at nuclear cycle 13) separated at the start of S-phase but did not detectably incorporate Sas-6-mCherry, indicating that daughter centrioles did not grow (Figures S3D and S3E), a defect that may explain why many of these embryos failed to hatch (Figure S3C). Intriguingly, centriolar Plk4 levels continued to oscillate in these embryos, but the average amplitude of these oscillations was lower than in the embryos in which centrioles continued to duplicate, and it was almost always below the average threshold at which centriole growth was normally initiated (Figure 2C). Together, these results suggest that the Plk4 oscillations initiate, and determine the duration of, centriole growth.

Mathematical Modeling of the Plk4 Oscillation
Oscillations in biology are often generated by delayed feedback circuits (Tsai et al., 2008). In Drosophila, Plk4 is recruited to centrioles by Asterless (Asl), which also activates Plk4, allowing it to phosphorylate both itself and Asl at multiple sites (Boese et al., 2018; Dzhindzhev et al., 2010; Klebba et al., 2015). Human Asl (Cep152) also binds, and is phosphorylated by, Plk4 in vitro (Cizmecioglu et al., 2010; Hatch et al., 2010). We realized that this
(E) Graph shows the mean regression of Plk4-NG oscillations in nuclear cycle 12 of WT embryos (green), or in embryos where the genetic dose of either cyclin B
(CycB1/2; blue) or grapes (Drosophila Chk1) (grp1/2; red) has been halved to slow or speed-up the nuclear cycles, respectively. Dashed lines mark the center (peak)
of the Plk4-NG oscillations (denoted with C), and dotted lines indicate the time of NEB (denoted with N) for each genotype. N ≥ 14 embryos for each condition; n =
55, 43, and 44 centrioles (mean) per embryo in WT, CycB1/2, and grp1/2 embryos, respectively. To clearly illustrate the phase shift in the oscillations, the highest
mean fluorescence signal for each group was normalized to 1.
(F) Bar charts quantify the time at which the Plk4-NG oscillations peaked, the length of S-phase, and the ratio between them (C/N)—derived from the data shown
in (E). Data are presented as mean ± SD. Statistical significance was assessed using an ordinary one-way ANOVA test (for Gaussian-distributed data) or a
Kruskal-Wallis test (**p < 0.01; ***p < 0.001; ns, not significant).
See also Figures 6, S1, and S2.
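As a rough illustration of the adaptive behavior described in the text (the amplitude A falls as the period T rises, while the area under the curve U stays roughly constant), the sketch below treats a single S-phase oscillation as a Lorentzian peak, the functional form named for the fits in Figure 3C. The function names, the use of the peak half-width as a stand-in for the period, and all numerical values are illustrative assumptions; this is not the authors' code or data.

```python
import math

def lorentzian(t, amplitude, center, half_width):
    """Lorentzian peak: height `amplitude` at `center`, half of that at center +/- half_width."""
    return amplitude / (1.0 + ((t - center) / half_width) ** 2)

def area_under_curve(amplitude, center, half_width, t_start, t_end, steps=10_000):
    """Trapezoidal estimate of the area under the peak between t_start and t_end."""
    dt = (t_end - t_start) / steps
    total = 0.0
    for i in range(steps):
        left = lorentzian(t_start + i * dt, amplitude, center, half_width)
        right = lorentzian(t_start + (i + 1) * dt, amplitude, center, half_width)
        total += 0.5 * (left + right) * dt
    return total

# Hypothetical oscillations in arbitrary units: the "slow" one has half the
# amplitude and twice the width of the "fast" one.
fast = dict(amplitude=2.0, center=0.0, half_width=1.0)
slow = dict(amplitude=1.0, center=0.0, half_width=2.0)

# The integration window scales with the width so both peaks are truncated equally.
u_fast = area_under_curve(**fast, t_start=-5.0, t_end=5.0)
u_slow = area_under_curve(**slow, t_start=-10.0, t_end=10.0)

# Closed form for a symmetric window of half-length a: 2 * A * w * atan(a / w)
u_exact = 2 * 2.0 * 1.0 * math.atan(5.0)
```

Under these assumptions the two areas coincide because halving the amplitude is exactly offset by doubling the width, which is the kind of compensation between A and T that the text describes for U.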
Figure 2. Plk4 Oscillations Initiate and Time Centriole Biogenesis

(A) Graphs show the mean regression of Plk4-NG oscillations (red, green, and blue lines for cycles 11–13, respectively) and centriole growth (monitored by Sas-6-
mCherry incorporation, black lines) measured simultaneously in embryos during S-phase of cycles 11–13. For ease of presentation, the SEM for these data are
not shown, but are presented in Figures S3A and S3B. Dotted lines indicate the centriolar Plk4 levels at which centrioles “start” or “stop” growing. N = 17 embryos
(cycles 11 and 12), and 8 embryos (cycle 13); n = 19, 31, and 45 centrioles (mean) per embryo in cycles 11–13, respectively. See STAR Methods for an explanation
of data normalization and scaling.
(B) Bar charts quantify the centriolar Plk4-NG threshold levels at which centrioles start and stop growing during cycles 11–13—derived from the data shown in (A).
Data are presented as mean ± SD. Statistical significance was assessed using an unpaired t test with Welch’s correction (for Gaussian-distributed data) or an
unpaired Mann-Whitney test (ns, not significant).
(C) Eight embryos in which the centrioles did not grow (Figures S3D and S3E) were excluded from the analysis shown in (A) and (B); the scatter graph shown here
illustrates how the mean amplitude of the Plk4 oscillations in each of these eight embryos (red dots) tended to be lower than the mean amplitude (±SEM) of the
Plk4 oscillations in the embryos where the centrioles did grow.
See also Figure S3.
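The threshold idea summarized in Figure 2 can be made concrete with a small sketch: if daughter centrioles grow only while centriolar Plk4 is above some threshold, then the growth period is the interval between the upward and downward crossings of that threshold, and an oscillation that never reaches the threshold supports no growth. The sampled curve, the threshold value, and the helper name `growth_window` are hypothetical and are not taken from the paper's analysis.

```python
def growth_window(times, levels, threshold):
    """Return (start, stop): the times at which `levels` first rises above and
    then falls back below `threshold`, linearly interpolated between samples.
    Returns None if the curve never rises above the threshold."""
    def crossing(i):
        # Linear interpolation of the crossing time between samples i and i + 1.
        t0, t1 = times[i], times[i + 1]
        y0, y1 = levels[i], levels[i + 1]
        return t0 + (threshold - y0) * (t1 - t0) / (y1 - y0)

    start = stop = None
    for i in range(len(times) - 1):
        rises_through = levels[i] < threshold <= levels[i + 1]
        falls_through = levels[i] >= threshold > levels[i + 1]
        if start is None and rises_through:
            start = crossing(i)
        elif start is not None and falls_through:
            stop = crossing(i)
            break
    if start is None:
        return None
    return start, stop  # stop is None if the curve never drops back below

# Hypothetical triangular oscillation sampled once per minute (arbitrary units).
times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
levels = [0.0, 0.5, 1.0, 1.5, 2.0, 1.5, 1.0, 0.5, 0.0]

window = growth_window(times, levels, threshold=0.75)

# A lower-amplitude oscillation that stays below the same threshold.
weak = growth_window(times, [0.3 * y for y in levels], threshold=0.75)
```

In this toy case the full-amplitude curve yields a growth window, whereas the low-amplitude curve returns `None`, mirroring the embryos in Figure 2C whose Plk4 oscillations stayed below the average threshold and whose centrioles did not grow.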
system could form a time-delayed negative feedback network capable of generating Plk4 oscillations if the activation of Plk4 by Asl eventually led to the inhibition of their interaction.
A simple version of such a scenario is illustrated in Figures 3A and 3B. At the start of each oscillation cycle, we envisage that unphosphorylated Asl receptors on the mother centriole recruit Plk4 to the site of daughter centriole assembly with high affinity (Figure 3A, (i)). Binding activates Plk4, allowing it to phosphorylate itself (Cunha-Ferreira et al., 2013; Holland et al., 2010; Klebba et al., 2013), Ana2/STIL (Dzhindzhev et al., 2014; Kratz et al., 2015; McLamarrah et al., 2018; Ohta et al., 2014) and Asl/Cep152 (Boese et al., 2018; Hatch et al., 2010) at multiple sites (Figure 3A, (ii)). The phosphorylated Ana2 promotes cartwheel assembly, potentially explaining why a threshold level of Plk4 is required to promote centriole growth, but in our model this reaction is not important for the Plk4 oscillation per se, so we do not consider it further. We speculate that the phosphorylation of Asl at multiple sites reduces its affinity for Plk4, so that the bound Plk4 molecules are released, leaving behind the phosphorylated Asl-receptor that can no longer recruit Plk4 (Figure 3A, (iii)) (see the end of this section for how this network can be reset to trigger subsequent rounds of oscillations).
This network (Figure 3B; see mathematical model 1 in STAR Methods) maps onto a set of coupled linear ordinary differential equations, which we solved analytically. Solutions to this first model (model 1 in STAR Methods) fit the discrete Plk4 oscillation data from each S-phase of nuclear cycles 11–13 very well (Figure 3C; R² > 0.99). Although the model may overfit the data,
Figure 3. A Simple Mathematical Model of the Plk4 Oscillation and Experimental Investigations to Test Its Predictions
(A) Diagram of the model. (i) During mitosis, Asl receptors (red) on the surface of the mother centriole start to bind Plk4 (green) with high affinity (k1). (ii) Once bound,
Plk4 is activated, and it starts to phosphorylate itself, Ana2 (black) and Asl (k2) at multiple sites (indicated by dotted black arrows and black dots). (iii) We speculate
that, after several rounds of phosphorylation, Asl is converted to a state with low affinity for Plk4, so phosphorylated Plk4 is released (k3)—and likely degraded.
These Asl receptors are now inactivated and can no longer bind Plk4 to promote centriole growth.
(B) Schematic depicts the topology of the mathematical model (see STAR Methods for full details of the model). Asl-p and Plk4-p indicate phosphorylated
proteins. Bold arrows indicate the dominant direction of the reactions. This model discretely examines centriolar Plk4-NG levels only during S-phase of each
cycle. We speculate that a phosphatase normally removes the phosphate groups from Asl during mitosis to reset the system for the next oscillation (red arrow; k4),
and we extend the model to include this step elsewhere (Figures S4A and S4B).
(C) Graphs show the Lorentzian fit of the Plk4-NG oscillation data during S-phase of cycles 11–13 (solid lines) overlaid with analytical solutions to the model
(dotted lines). R² values indicate goodness-of-fit.
(D) Bar charts quantify the average cytosolic concentration and centriolar fluorescence of Asl-GFP at the start of S-phase of each cycle. Cytosolic Asl-GFP was
measured using FCS; each data point represents the average of 4–6 10-s recordings from a single embryo (Figure S5). Centriolar Asl-GFP was measured using
confocal microscopy, as described in STAR Methods. N ≥ 14 embryos for each cell cycle; n = 48, 70, and 130 centrioles (mean) per embryo in cycles 11–13,
respectively. Data are presented as mean ± SEM. Statistical significance was assessed using an ordinary one-way ANOVA test (ns, not significant).
these solutions were within a reasonable and generally narrow parameter space (Figures 3C and S5; Data S1, first and fourth charts). Nevertheless, we believe this model is likely to be oversimplified. Plk4’s ability to phosphorylate itself, for example, could help to generate the oscillation by promoting Plk4 degradation (Cunha-Ferreira et al., 2009; Guderian et al., 2010; Holland et al., 2010; Rogers et al., 2009) or lowering the affinity of the Asl::Plk4 interaction, as has recently been demonstrated (Park et al., 2019). Moreover, the model considers the behavior of only Asl and Plk4, when other factors, such as Ana2/STIL, are likely to modulate the system’s behavior (Arquint and Nigg, 2016; Gönczy and Hatzopoulos, 2019). Finally, the model does not consider the possibility that Plk4 bound to one receptor could phosphorylate nearby receptors, or the Plk4 bound to nearby receptors, to influence their behavior, a concept that may be important when considering how Plk4 ultimately localizes to only a single site on the side of the mother centriole (Leda et al., 2018; Takao et al., 2019).
In order to demonstrate how this network could be reset for the next oscillation, we extended our model (model 2 in STAR Methods) to allow a protein phosphatase (PPTase) to be activated during M-phase to dephosphorylate Asl (Figures 3B, red arrow, and S4A). This resetting is biologically plausible, because the activities of several PPTases are regulated during the cell cycle (Nilsson, 2019). This model can be solved exactly, and its solutions generate robust centriolar Plk4 oscillations within the context of a system that, like the early Drosophila embryo, alternates between periods of S- and M-phases (Figure S4B). Thus, our minimal model illustrates that a classical “time delayed negative-feedback” network (Novák and Tyson, 2008) can generate Plk4 oscillations, although the precise molecular details of this system remain to be fully elucidated.

Testing Predictions of the Mathematical Models
A key feature of our models is that the phosphorylation of Asl by Plk4 reduces their affinity (although, as discussed above, Plk4’s ability to phosphorylate itself, and other factors, could also help to generate the oscillation). To test the plausibility of this idea, we mutated 13 potential Plk4 phosphorylation sites in Asl to Ala (Asl-13A) (Figure S4C). These sites were selected based on their conservation, their similarity to known Plk-family consensus sites (Leung et al., 2007), their proximity to the N- and C-terminal regions of Asl that are thought to interact with Plk4 (Boese et al., 2018), and a previous analysis of sites in the Asl N-terminal region that are either phosphorylated by Plk4 kinase domain in vitro or have been shown to be phosphorylated in cultured Drosophila cells (Boese et al., 2018). If some of these sites are normally phosphorylated by Plk4 to reduce the affinity of the Asl::Plk4 interaction, we would predict that expressing Asl-13A in the presence of endogenous, unlabeled Asl would lead to an increase in centriolar Plk4-NG levels, because the Plk4 should unbind from the mutant Asl receptors less efficiently. Although Asl-13A-mKate2 localized to centrioles less efficiently than Asl-WT-mKate2 (Figure S4D), expressing untagged Asl-13A increased the amplitude of the Plk4-NG oscillation (Figure S4E), consistent with our idea that phosphorylating Asl can reduce its affinity for Plk4 (Figures 3A and 3B).
An inspection of the parameters generated by our model revealed that the reduction in the amplitude of the Plk4 oscillation at successive nuclear cycles was driven primarily by a reduction in the cytosolic concentration of Plk4 (that determines k1, the rate at which Plk4 binds to Asl), while total levels of the Asl receptor (Atot) remain relatively constant (Data S1, first chart). To test if this was the case, we first used fluorescence correlation spectroscopy (FCS) (Figure S5) to examine the cytosolic concentration of Asl-GFP. Although the number of centrioles assembled doubles at each successive cycle, the average cytosolic concentration of Asl-GFP, and the average centriolar levels of Asl-GFP, remained relatively constant at the start of each successive cycle (Figure 3D), as predicted by our model. Unfortunately, the cytosolic concentration of Plk4-NG was too low to be measured by conventional FCS, so we developed a new method, peak counting spectroscopy (PeCoS), to measure relative protein abundance at lower concentrations (see STAR Methods) (Figure S6). This revealed that, in contrast to Asl-GFP, the cytosolic levels of Plk4-NG tended to decrease at successive nuclear cycles (Figure 3E), as predicted by the model.
Why do cytosolic Plk4 levels decrease at successive nuclear cycles? Our modeling suggests that if total Plk4 levels in the developing embryo remain constant (i.e., the rates of Plk4 degradation and synthesis are balanced), then the doubling of centriole numbers at each cycle can lead to the depletion of cytosolic Plk4, particularly during later nuclear cycles, as an increasing fraction of the protein is sequestered by the increasing number of centrioles (Figure S4F). Alternatively (or additionally), Plk4 molecules that are activated by binding to Asl may be more likely to phosphorylate themselves to stimulate their degradation, ensuring that more Plk4 is degraded at each cycle as the number of centrioles increases. Interestingly, in either of these scenarios, increasing centriole numbers lead to more Plk4 depletion from the cytosol, potentially allowing embryos to effectively “count” their centrioles.

The Plk4 Oscillation Can Adapt to Changes in Plk4 Levels to Maintain a Constant Centriole Size
Our finding that cytosolic levels of Asl remain constant at successive cycles while cytosolic Plk4 levels decrease suggests a rationale for why centriole biogenesis may be regulated by an oscillatory system. In our models, Asl effectively functions as an integrator (Ferrell, 2016; Somvanshi et al., 2015) whose levels are kept constant so that it can measure changes in the input (cytosolic Plk4 levels) and adapt the oscillation to maintain a constant output (centriole size). If this interpretation is correct, then the Plk4 oscillation should adapt to maintain a constant centriole size when Plk4 levels change, but not when Asl levels change. To test this,
(E) Bar chart shows the relative abundance of Plk4-NG at the start of each nuclear cycle measured by PeCoS (see STAR Methods; Figure S6). Each data point
represents a single 180-s recording from a single embryo. Statistical significance was assessed using a Kruskal-Wallis test (****p < 0.0001). Data are presented as
mean ± SD.
See also Figures S4, S5, and S6 and Data S1 (first and fourth charts as well as the Monte Carlo analysis).
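To give a feel for how the scheme outlined in Figures 3A and 3B can produce a single rise and fall of centriolar Plk4 during S-phase, the sketch below integrates a small chain of linear ordinary differential equations: free Asl binds Plk4 at rate k1, the complex passes through several phosphorylation stages at rate k2, and the final stage releases Plk4 at rate k3, leaving an inactive receptor. The rate names k1, k2, k3 and the total receptor level (Atot) come from the text and caption; the number of stages, the parameter values, the forward-Euler integration, and the function name are assumptions made for illustration. This is not the authors' model 1, which is specified in their STAR Methods and was solved analytically.

```python
def simulate_plk4_pulse(k1=0.5, k2=1.0, k3=1.0, n_stages=4,
                        a_tot=1.0, t_end=30.0, dt=0.001):
    """Forward-Euler integration of a linear receptor chain (illustrative only).

    free      -> unphosphorylated Asl that can still bind Plk4
    bound[i]  -> Asl::Plk4 complex after i rounds of phosphorylation
    inactive  -> phosphorylated Asl that has released its Plk4
    """
    free = a_tot
    bound = [0.0] * n_stages
    inactive = 0.0
    times, plk4 = [], []
    for step in range(int(round(t_end / dt)) + 1):
        times.append(step * dt)
        plk4.append(sum(bound))          # centriolar Plk4 = all bound complexes

        binding = k1 * free              # free Asl -> first bound stage
        advance = [k2 * b for b in bound[:-1]]   # stage i -> stage i + 1
        release = k3 * bound[-1]         # last stage -> inactive Asl + released Plk4

        new_bound = list(bound)
        new_bound[0] += binding * dt
        for i, flux in enumerate(advance):
            new_bound[i] -= flux * dt
            new_bound[i + 1] += flux * dt
        new_bound[-1] -= release * dt

        free -= binding * dt
        inactive += release * dt
        bound = new_bound
    return times, plk4, free, bound, inactive

times, plk4, free, bound, inactive = simulate_plk4_pulse()
peak_index = plk4.index(max(plk4))
peak_time = times[peak_index]
```

With these assumed values, bound Plk4 starts at zero, peaks at an intermediate time, and then decays as receptors are inactivated, while the total amount of Asl across all states is conserved. Resetting the receptors for the next cycle (the phosphatase step, k4, in Figure 3B) is deliberately left out of this sketch.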
Figure 4. The Plk4 Oscillator Can Adapt to Changes in Plk4 Concentration but Not to Changes in Asl Concentration
(A and B) Graphs show the regression data (solid lines) and mathematical solutions (dotted lines) for Plk4-NG oscillations in cycle 12 for experiments where either
(A) the genetic dose of Plk4-NG was halved (Plk4-NG1/2), or (B) the genetic dose of asl was halved (asl1/2) (gray lines) compared to controls (green lines). (A) N ≥ 11
embryos for each condition; n = 47 and 42 centrioles (mean) per embryo in control or Plk4-NG1/2 groups, respectively. (B) N = 18 embryos for each condition; n =
44 and 43 centrioles (mean) per embryo in control or asl1/2 groups, respectively. Data are presented as mean ± SEM. Bar charts quantify oscillation parameters,
as indicated; data are presented as mean ± SD.
we monitored Plk4-NG oscillations in embryos laid by mothers where we genetically halved the dose of either Plk4-NG (hereafter Plk4-NG1/2 embryos) or asl (hereafter asl1/2 embryos). Centrioles appeared to duplicate normally in both sets of embryos, but the Plk4 oscillation parameters were altered: in Plk4-NG1/2 embryos, A decreased but there was a compensatory increase in T, so U remained relatively constant (Figures 4A and S7A); in asl1/2 embryos, A decreased, but there was no compensatory change in T, so U decreased (Figures 4B and S7B–S7D).
Our mathematical model (model 1) could fit both sets of data well (Figures 4A and 4B; R² > 0.99), generating a reasonable range of parameters (Data S1, second and third charts), several of which we again validated experimentally (Figure S7; see mathematical modeling section in STAR Methods). Interestingly, if we took the normal parameters derived from our model and simply adjusted the amount of Asl or Plk4 in the model to the levels we experimentally measured in the half-dose embryos, the model fit the data less well (not shown). This suggests that changing the concentration of one component is likely to influence the concentration and/or behavior of other components so that several parameters of the Plk4 oscillation are altered. This seems plausible, as the core centriole duplication proteins are known to interact with and influence each other in multiple ways (Arquint and Nigg, 2016; Gönczy and Hatzopoulos, 2019; Nigg and Holland, 2018).
Consistent with our observation that the Plk4-NG oscillations adapt in Plk4-NG1/2 embryos by reducing A and increasing T to maintain a relatively constant U, we previously showed that halving the genetic dose of Plk4 led to the centrioles growing slowly, but for a longer period of time, to maintain a constant size (Aydogan et al., 2018). In contrast, we would predict that daughter centrioles in asl1/2 embryos should grow more slowly (as A is decreased), but for a normal period (as T is unchanged), and so centrioles would be too short (as U decreases). We measured the parameters of daughter centriole growth in asl1/2 embryos and confirmed that this was the case (Figure 4C). Together, these experiments suggest that the Plk4 oscillatory network functions to maintain a constant centriole size even when Plk4 levels vary.

Plk4 Oscillations Can Execute Centriole Duplication Independently of a Robust Cdk/Cyclin Cell-Cycle Oscillator
Although the Plk4 oscillations in fly embryos are normally entrained by the cell-cycle oscillator (CCO) (Figures 1E and 1F), it has long been known that centrioles can continue to duplicate in many systems even when several other aspects of cell-cycle progression are blocked (Balczon et al., 1995; Gard et al., 1990; Sluder et al., 1990). We wondered whether this might be because Plk4 oscillations can continue to drive centriole biogenesis even in the absence of a robust CCO. To test this possibility, we injected embryos with double-stranded RNAs (dsRNAs) targeting the three embryonic mitotic cyclins: A, B, and B3. These embryos arrest in an interphase-like state with intact nuclei that do not duplicate their DNA, but where centrosomes can continue to duplicate (McCleland and O’Farrell, 2008). We initially injected embryos in nuclear cycles 7–8 and monitored Plk4-NG behavior 30 min later. In all such embryos, we observed an initial synchronous round of centriole duplication without NEB (indicating that the CCO was perturbed), followed by one or more rounds of less synchronous centriole duplication (Figures 5A and S2B; Video S3). Strikingly, a normal Plk4-NG oscillation was associated with the first, synchronous, round of centriole duplication, but subsequent oscillations were more variable (Figures 5A and S2B).
We reasoned that any residual Plk4-NG oscillations in these embryos might be triggered by residual CCO oscillations that could trigger centriole duplication, but not DNA synthesis or NEB. While one can never rule out the possibility of residual CCO activity, we tried to overcome this potential problem by examining centriole behavior in embryos in which the CCO was likely to be more fully suppressed by injecting the embryos earlier (nuclear cycles 2–4) and monitoring them later (after 90 min). The centrioles in these embryos were now completely dissociated from the non-dividing nuclei and they appeared to divide stochastically, with some centrioles duplicating one or more times, and others not duplicating at all (Figure S8; Video S4). The CCO coordinates cell-cycle events in normal early embryos by spreading as a chemical trigger wave (Chang and Ferrell, 2013; Deneke et al., 2016), but duplicating centrioles did not detectably trigger the duplication of nearby centrioles (Figure S8F). Thus, the “decision” to duplicate in these CCO-suppressed embryos appears to be largely intrinsic to each individual centriole.
To test whether these stochastic centriole duplications were triggered by Plk4 oscillations, we measured Plk4-NG fluorescence levels at individual centrioles. The raw intensity data were noisy, but duplicating “fertile” centrioles appeared to exhibit more prominent Plk4-NG oscillations than non-duplicating “sterile” centrioles (Figure 5B). Moreover, the average centriolar Plk4-NG fluorescence level (expressed as signal-to-noise ratio [SNR]) was significantly higher at fertile centrioles (Figure S8B), and Plk4-NG SNR values could distinguish fertile and sterile centrioles, correctly predicting centriole fertility or sterility 74% and 71% of the time, respectively (Figures S8C and S8D).
Upon filtering the raw oscillation data, we found that the peaks of the Plk4-NG oscillations (see STAR Methods for a description of peak-calling methodology) were often associated with centriole duplication events (Figure 5B). An unbiased computational analysis of all the 45 fertile centrioles that we observed in 3 different embryos revealed that the predicted Plk4-NG oscillation peaks predicted centriole duplication events with high precision (40/49 Plk4-NG peaks were associated with a duplication event that occurred within ±5 min of the peak) and recall (40/52 duplication events occurred within ±5 min of a Plk4-NG oscillation peak) (Figures 5C and 5D). Computer simulations revealed
(C) Graph quantifies the parameters of cartwheel growth—as measured by Sas-6-GFP fluorescence incorporation (Aydogan et al., 2018)—in WT and asl1/2
embryos; data are presented as mean ± SEM. Bar charts quantify growth parameters presented as mean ± SD. N = 17 embryos for each condition; n = 77 and 72
centrioles (mean) per embryo in WT or asl1/2 groups, respectively. Statistical significance was assessed using an unpaired t test with Welch’s correction (for
Gaussian-distributed data) or an unpaired Mann-Whitney test (*p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001; ns, not significant). R² values indicate goodness-of-fit
for the mathematical solutions.
See also Figure S7 and Data S1 (second and third charts).
Cell 181, 1566–1581, June 25, 2020 1573

Figure 5. Plk4 Levels Can Continue to Oscillate and Promote Centriole Duplication Even When the CCO is Perturbed
(A) Graph shows the Plk4-NG oscillations in an embryo injected with dsRNA against cyclin A-B-B3; the schema above the graph illustrates the experimental
protocol. The nuclei in this embryo arrest in interphase, but centrioles go through an additional round of division—centriole separation (CS)—accompanied by a
Plk4 oscillation. See Figure S2B for additional examples; n = 30 centrioles (mean) per embryo.
(B) Graphs show the raw (black lines) and filtered (red lines) fluorescence intensity data of 3 individual ‘‘fertile’’ centrioles and 3 individual ‘‘sterile’’ centrioles within
the same cyclin-depleted embryo. The fertile centrioles duplicate (black dotted lines), and these events were often closely associated with computed Plk4
oscillation peaks (red dotted lines) (see STAR Methods for further details of the peak calling methodology).
(C) An unbiased computational analysis of all 45 fertile centrioles in 3 embryos reveals that >80% of the computationally detected Plk4 oscillation peaks occur
within 5 min of an experimentally observed duplication event. A simulation with randomly distributed centriole duplication events and Plk4 oscillation peaks
showed a mean time separation of 10.5 min (data not shown).
(D) Venn diagram shows how, using a 5-min window, the oscillation peaks can be used to predict duplication events with both high precision and high recall (40/49
Plk4 oscillation peaks are associated with a duplication event, and 40/52 duplication events are associated with a Plk4 oscillation peak).
(E) Graph shows the ability of Plk4 oscillation peaks to ‘‘retrieve’’ centriole duplication events across all peak prominences. All detected oscillation peaks were
ranked in order of their peak prominence from high to low (black dots) and assigned uniquely to a duplication event if within a 5-min time window. The graph then
plots the precision and recall values if the threshold for calling a peak were set as the peak prominence value of each peak (in descending order). Below the
detected peak that is associated with a peak prominence threshold of 0.12, the precision dramatically drops, suggesting the existence of a minimum peak
amplitude for centriole duplication. At this threshold, precision and recall are jointly optimized. Note, if there were no overall correlation between Plk4 peaks and a
duplication event, the integrated area under the curve across all peak prominences or average precision (AP) for the 5-min time window (AP5min) would be ~50%
(given by # duplications/(# duplications + # peaks)); so the score of ~75% indicates a meaningful correlation.
(F) Graph shows the correlation between the time of the computationally determined Plk4 peaks and their respective experimentally observed duplication events.
Correlation strength was examined using Pearson’s correlation coefficient (r < 0.40 weak; 0.40 < r < 0.60 moderate; r > 0.60 strong); significance of correlation
was determined by the p value (p < 0.05) (see STAR Methods for a full description of this analysis).
See also Figures S2 and S8.
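The precision and recall values quoted in (C)–(E) can be illustrated with a short sketch. This is not the authors' pipeline (see STAR Methods for that); it assumes a simple greedy one-to-one matching of peaks to duplication events within a 5-min window, and the function names and toy time points are hypothetical. The last lines recompute the ratios from the counts given in the legend (40 matches, 49 peaks, 52 duplication events), together with the chance-level average precision defined in (E).

```python
def match_peaks_to_events(peak_times, event_times, window=5.0):
    """Greedy one-to-one matching (an assumed scheme, not the authors').

    Each peak is paired with the closest still-unmatched duplication
    event, provided the two lie within `window` minutes of each other.
    Returns the number of matched pairs.
    """
    unmatched = sorted(event_times)
    matched = 0
    for peak in sorted(peak_times):
        candidates = [e for e in unmatched if abs(e - peak) <= window]
        if candidates:
            best = min(candidates, key=lambda e: abs(e - peak))
            unmatched.remove(best)
            matched += 1
    return matched


def precision_recall(n_matched, n_peaks, n_events):
    """Precision = matched / peaks; recall = matched / events."""
    return n_matched / n_peaks, n_matched / n_events


# Toy data in minutes (hypothetical values, for illustration only).
peaks = [10.0, 31.0, 55.0, 80.0]
events = [12.0, 29.5, 70.0]
n = match_peaks_to_events(peaks, events)
toy_precision, toy_recall = precision_recall(n, len(peaks), len(events))

# Counts reported in the legend: 40 matches, 49 peaks, 52 duplications.
precision, recall = precision_recall(40, 49, 52)
# Chance-level average precision as defined in (E):
# duplications / (duplications + peaks).
chance_ap = 52 / (52 + 49)
print(f"precision={precision:.3f} recall={recall:.3f} chance AP={chance_ap:.3f}")
```

With the reported counts, precision is 40/49 and recall is 40/52, and the chance level works out to 52/101, in line with the ">80%" and "~50%" figures stated in the legend.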
that a random distribution of the duplication events led to an
average time of >10 min between the peaks and duplication
events, indicating that the observed association was not
random. Moreover, a rank ordering of the Plk4-NG oscillations
based on amplitude revealed that the higher the amplitude of
the oscillation, the more likely it was to be associated with a
centriole duplication event (Figure 5E), while plotting the relative
timing of the Plk4-NG oscillations and the centriole duplication
events revealed a strong positive correlation (Figure 5F; Pearson
r = 0.9580, p < 0.0001). We conclude that individual centrioles
can organize autonomous Plk4 oscillations that can drive
centriole duplication even in the absence of a robust CCO.
This potentially explains how centrioles can continue to duplicate
independently of many other cell-cycle events.

The CCO Can Phase-Lock the Plk4 Oscillation to Coordinate Centriole Duplication with Other Cell-Cycle Events
It is widely believed that the CCO acts primarily as a “ratchet”
whose activity increases over the cell cycle to trigger the sequential
execution of cell-cycle events such as DNA replication,
centriole duplication, nuclear envelope breakdown (NEB), and
spindle assembly (Stern and Nurse, 1996; Swaffer et al., 2016,
2018). An interesting alternative possibility is that the CCO could
act as a “phase-locker” whose function is simply to entrain the
phase of a network of autonomous oscillations, each of which
is responsible for the execution of a specific cell-cycle event
(Lu and Cross, 2010). The Plk4 oscillation appears to time and
execute centriole biogenesis, and it can trigger centriole duplication
independently of a robust CCO, so it is an excellent candidate
for such an autonomous oscillation.
To better understand how the CCO might entrain the Plk4
oscillation, we measured the average period of the stochastic
Plk4 oscillations in cyclin-depleted embryos (20.5 ± 4.6 min)
and compared this to the average period of the Plk4 oscillations
in cycles 11–12 (11.7 ± 0.7 min) and 12–13 (14.9 ± 1.7 min). The
natural period of the autonomous Plk4 oscillation in these early
embryos is therefore similar to, but slightly longer than, the
period of the Plk4 oscillations normally enforced by the CCO,
indicating that the CCO could entrain the Plk4 oscillation by
speeding up a phase of its natural cycle.
To examine which phase this might be, we tested for correlations
between various parameters of the Plk4 oscillation and the
length of S- or M-phase. During cycles 11–13, we observed a significant
correlation between the timing of the Plk4-NG oscillation
trough in M-phase and the duration of M-phase (Figure 6, lower
scatterplots in the light yellow panel), suggesting that the CCO entrains
the Plk4 oscillation by speeding it up during M-phase. This is
consistent with our minimal model, in which the CCO entrains the
Plk4 oscillation by ensuring the rapid and coordinated dephosphorylation
of Asl during M-phase (Figures S4A and S4B).
We also noticed an additional correlation between the peak of
the Plk4-NG oscillation and S-phase length in cycle 13 (Figure 6,
upper rightmost scatterplot in the light yellow panel). This is not
surprising, as a Wee1-dependent checkpoint dramatically slows
the CCO, and many other aspects of S-phase progression,
particularly during nuclear cycle 13 (Deneke et al., 2016; Stumpff
et al., 2004). Moreover, in Wee1/ embryos, the correlation between
the Plk4-NG oscillation trough and M-phase length was
maintained (Figure 6, lower rightmost scatterplot in the light yellow
panel), while the correlation between the Plk4-NG oscillation
peak and S-phase length was lost (Figure 6, upper rightmost
scatterplot in the light yellow panel), demonstrating that Wee1
can influence the Plk4 oscillation in S-phase. Interestingly, the
cytosolic levels of Plk4-NG were essentially the same in wild-type
(WT) and Wee1/ embryos (Figure S6E), indicating that
cell-cycle regulators can influence the Plk4 oscillation without
changing Plk4’s cytosolic concentration. This supports the
model prediction that the drop in cytosolic Plk4 levels at successive
nuclear cycles (Figure 3E) is not, on its own, sufficient to
account for the change in Plk4 oscillation parameters we
observe from cycles 11–13. This presumably explains why the
model requires several parameters to change slightly at each
successive cycle to best fit the data (Data S1, first chart).
Taken together, our observations are consistent with the
phase-locker model of cell-cycle regulation (Lu and Cross,
2010). We propose that the Plk4 oscillation may be an exemplar
of an autonomously oscillating system that can independently
drive a cellular event (centriole duplication), but that is normally
phase-locked by the CCO to ensure its proper coordination
with other biological events and with cell division.

A Model to Generate Autonomous Plk4 Oscillations in the Absence of a CCO
How can a Plk4 oscillation be generated independently of the
CCO? Our mathematical model (model 2 in STAR Methods) cannot
explain this, as it requires a PPTase to reset the system specifically
during M-phase (Figures S4A and S4B). Interestingly, if we extend
the model to allow the PPTase to have a constant low level of activity
(10% of the level normally required to reset the system in M-phase)
(Figure S8G), this new model (model 3 in STAR Methods) recapitulates
several features of centriole duplication in the cyclin-depleted
embryos (Figure S8H). This model predicts that after a
last round of mitosis the centrioles in the cyclin-depleted embryos
will undergo a single synchronous Plk4 oscillation (as all of the Asl
receptors start this first cyclin-depleted cycle in a dephosphorylated
state), but subsequent Plk4 oscillations rapidly dampen as
the individual Asl receptors lose synchrony, and the system tends
toward a steady state in which some of the centriolar Asl receptors
are Plk4-bound and being phosphorylated, while others are not
Plk4-bound and are being dephosphorylated (Figure S8H). Intriguingly,
the inherent noise in the system generated stochastic Plk4
oscillations that could plausibly drive centriole duplication
(Figure S8H), potentially mimicking the stochastic Plk4 oscillations
and centriole duplication events that we observe in the cyclin-depleted
embryos (Figure 5B).
In this model, each Asl receptor effectively behaves as an independent
oscillator, alternating between a Plk4-bound form
that is being phosphorylated and a non-Plk4-bound form that
is being dephosphorylated. In the presence of the CCO, the
Asl receptors generate coordinated Plk4 oscillations because
the CCO synchronizes them every nuclear cycle by providing a
coordinated burst of PPTase activity during mitosis.

Plk4 Oscillations Are Detectable in Non-dividing Mouse Liver Cells and Can Be Entrained by the Circadian Clock
In species as distant as cyanobacteria and mammals, the CCO can
be entrained to the circadian clock (Matsuo et al., 2003; Yang et al.,
2010). We wondered, therefore, whether the autonomous Plk4
oscillation could also be entrained by the circadian clock. We
examined a recently published diurnal proteome from non-regenerating
mouse liver (Wang et al., 2018), where hepatocytes, the major
building blocks of the liver, are largely quiescent (Friedman,
2000). Several key cell-cycle regulators (such as Cdk1, cyclin E,
Figure 6. The CCO Phase-Locks the Plk4 Oscillations in Mitosis of Cycles 11–13 Independently of Wee1 and in Interphase of Cycle 13 in a
Wee1-Dependent Manner
Scatterplots illustrate correlations between various parameters of the Plk4 oscillation and the length of S- or M-phase in nuclear cycles 11–13 in WT and Wee1/
embryos (see Plk4-NG smooth curve fitting and parameter extraction in STAR Methods for details of how these parameters [along with their descriptions] were
obtained in an unbiased way). During cycles 11–13, there is a significant correlation between the timing of the Plk4 oscillation trough in M-phase and the duration
of M-phase (lower scatterplots in the light yellow panel), suggesting that the CCO entrains the Plk4 oscillation during M-phase. This entrainment is not altered in
the Wee1/ embryos. During nuclear cycle 13, there is an additional correlation between the peak of the Plk4 oscillation and S-phase length that is lost in the
Wee1/ embryos (upper rightmost scatterplot in the light yellow panel). The plots for the WT group were generated with the data obtained from Figures 1A and
S2A, as well as 5 additional embryos of the same genotype. N = 10 embryos; n = 23 centrioles (mean; starting from cycle 11) per embryo in Wee1/ group.
Correlation strength was examined using Pearson’s correlation coefficient (r < 0.40 weak; 0.40 < r < 0.60 moderate; r > 0.60 strong); significance of correlation
was determined by the p value (p < 0.05).
See also Figures S4 and S6E.
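The correlation classification used in this legend can be sketched in a few lines of Python. The sketch below is illustrative only: the measurements are hypothetical, the function names are ours, and two details are assumptions not stated in the legend, namely that the bands are applied to the absolute value of r and that the boundary values 0.40 and 0.60 are counted as "moderate". The p value is not computed here.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strength(r):
    """Bands from the legend, applied to |r| (an assumption)."""
    magnitude = abs(r)
    if magnitude < 0.40:
        return "weak"
    if magnitude <= 0.60:  # boundary handling is an assumption
        return "moderate"
    return "strong"


# Hypothetical measurements in minutes, for illustration only.
m_phase_length = [4.0, 4.5, 5.0, 5.5, 6.0]
trough_time = [9.1, 9.8, 10.2, 11.1, 11.5]
r = pearson_r(m_phase_length, trough_time)
print(round(r, 3), strength(r))
```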
cyclin B1, and Plk1) were not detectable at any stage of the diurnal
cycle, confirming that these cells were largely quiescent. In
contrast, Plk4 protein (but not transcript) levels exhibited a striking
oscillation that was entrained to the light/dark cycles (Figures 7A–7C).
We presume that this oscillation is sub-threshold for centriole
biogenesis (because centrioles should not be duplicating in these
non-dividing cells) and simply reflects the ability of the Plk4 system
to oscillate in a way that can be entrained by the circadian clock.
In our model, a mitotic PPTase that dephosphorylates Asl receptors
out of phase with Plk4 is required to generate Plk4 oscillations
(Figures S4A and S4B). We therefore used the mouse
dataset to examine the behavior of the mouse homologs of all
the mitotic PPTase subunits that function in flies (Chen et al.,
2007). Among the 27 PPTase subunits examined, only PPP2CB
exhibited a clear oscillatory behavior similar to that of Plk4, and
these oscillations were precisely out of phase with
the Plk4 oscillation (Figure 7D, highlighted with a red dotted
frame). Intriguingly, PPP2CB is the homolog of Mts, the catalytic
subunit of PP2A in Drosophila that localizes to centrosomes specifically
during mitosis in fly cells, and its knockdown leads to
centrosome duplication defects (Dobbelaere et al., 2008).
Thus, PP2A is an excellent candidate for the PPTase that may
normally dephosphorylate centriolar Asl during mitosis.
Remarkably, 8% of the 6,800 proteins in the mouse dataset
exhibited a 24 h-entrained oscillatory behavior. It is unclear
why so many proteins oscillate in this way, or whether any of
these oscillations are of functional significance. Nevertheless,
these observations indicate that there are many other proteins,
and so perhaps many different biological processes, that have
a largely under-appreciated ability to oscillate.

Concluding Remarks
There is great interest in determining the physical and molecular
principles that cells use to regulate the biogenesis of their organelles
(Liu et al., 2018; Mukherji and O’Shea, 2014). The idea that
an organelle-specific oscillation could time and execute organelle
biogenesis has, to our knowledge, not been proposed previously.
We suggest that the Plk4 centriole oscillation could be a paradigm
for a general mechanism describing the regulation of organelle
biogenesis: oscillations in the levels/activity of key regulatory factors
essential for organelle biogenesis could precisely time the initiation
and duration of the growth process, ensuring that organelles
grow at the right time and to the appropriate size. In such a model,
the CCO and circadian clocks could act simply as “phase-lockers”
(Lu and Cross, 2010; Morgan, 2010), whose function is to entrain
the phase of a network of autonomous oscillators to ensure that
biological processes occur in a coordinated manner.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead Contact
  ◦ Materials Availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ◦ D. melanogaster stocks and husbandry
• METHOD DETAILS
  ◦ Hatching experiments
  ◦ Synthesis of double-stranded RNA
  ◦ Embryo collections and dsRNA injections
  ◦ Immunoblotting
  ◦ Image acquisition, processing, and analysis
  ◦ Analysis of centriole “fertility” in embryos injected with dsRNA against cyclin A-B-B3
  ◦ Spatiotemporal heatmap of centriole duplications
  ◦ Spatial clustering assessment of centriole duplications
  ◦ Plk4-NG smooth curve fitting and parameter extraction
  ◦ 3D-Structured Illumination Microscopy (3D-SIM)
  ◦ Mathematical modeling and its experimental validation
  ◦ Model 2: Generating robust Plk4 oscillations entrained by the CCO
  ◦ Model 3: Stochastic duplications
  ◦ Fluorescence Correlation Spectroscopy (FCS)
  ◦ FCS background corrections
  ◦ Data restriction
  ◦ Peak Counting Spectroscopy (PeCoS)
• QUANTIFICATION AND STATISTICAL ANALYSIS

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.018.

ACKNOWLEDGMENTS

We are grateful to Laura Hankins, Fabio Echegaray Iturra, Marjorie Fournier,
Christoffer Lagerholm, and Bela Novak for advice and discussion and Alissa
M. Kleinnijenhuis and members of the Raff laboratory for critically reading
the manuscript. Microscopy was performed at the Micron Oxford Advanced
Bioimaging Unit, funded by a Strategic Award from the Wellcome Trust
(107457). The research was funded by a Wellcome Trust Senior Investigator
Award (104575 to T.L.S., M.M., Z.M.W., A.W., S.S., and Z.A.N.), Edward Penley
Abraham Scholarships (to M.G.A. and L.G.), a Cancer Research UK Oxford
Centre Prize DPhil Studentship (C5255/A23225 to S.-S.W.), a Balliol Jason Hu
Scholarship (to S.-S.W.), a Clarendon Scholarship (to S.-S.W.), and Ludwig
Institute for Cancer Research funding (to F.Y.Z.). M.A.B. was supported by a
Biotechnology and Biological Sciences Research Council grant (BB/N016858/1)
and the St. Cross Emanoel Lee Junior Research Fellowship.

AUTHOR CONTRIBUTIONS

This study was conceptualized by M.G.A., T.L.S., M.A.B., and J.W.R. Investigation
was done by M.G.A., T.L.S., M.M., Z.M.W., L.G., A.W., S.S., and M.A.B.
Data were analyzed by M.G.A., T.L.S., Z.M.W., L.G., F.Y.Z., and M.B.A. Methodology
was developed by M.G.A., T.L.S., M.M., Z.M.W., S.-S.W., F.Y.Z.,
M.A.B., and J.W.R. Project was administered by M.G.A., M.A.B., and J.W.R.
Resources were shared/made by M.G.A., M.M., L.G., A.W., S.S., S.-S.W.,
Z.A.N., and M.A.B. Software development was carried out by M.G.A., T.L.S.,
Z.M.W., S.-S.W., F.Y.Z., and M.A.B. Overall supervision was done by
M.G.A., A.G., and J.W.R. Validation experiments/analyses were carried out
by M.G.A., A.W., S.S., and J.W.R. M.G.A., T.L.S., A.G., M.A.B., and J.W.R.
wrote and edited the draft with significant input from all authors.

Figure 7. Plk4 Levels May Autonomously Oscillate in Mouse Liver Cells Entrained by the Circadian Clock, Where Levels of the PP2A Catalytic
Subunit Oscillate Precisely out of Phase
(A) Diagram shows the workflow used by Wang et al. (2018) to obtain a diurnal proteome of the whole liver of light/dark-entrained mice.
(B) Graphs reproduced from Wang et al. (2018) show the relative diurnal expression of the circadian clock transcripts Bmal1 and Per1 as internal controls. We re-
analyzed the diurnal proteome produced in this study—comprising a matrix of Z scores for 6,780 proteins identified during 2 circadian cycles (supplemental
dataset 9 from Wang et al. [2018]).
(C) Graphs we derived show the relative protein levels of Plk4 and the cartwheel component STIL (Ana2 in flies). Plk4 levels strongly spike in a periodic manner
every circadian cycle, whereas STIL levels appear to randomly fluctuate and show neither a discernible pattern of oscillation nor any entrainment to the circadian
clock. Because these cells are generally not proliferating, centrioles should not be duplicating, so the Plk4 oscillations are presumably sub-threshold for centriole
biogenesis. Thus, Plk4 oscillations are detectable in non-dividing mammalian cells, where they are entrained by the circadian clock.
(D) Graphs examine in the non-dividing liver cells the behavior of mouse homologs of all the mitotic PPTase subunits that function in flies (Chen et al., 2007).
Among the 27 PPTase subunits examined, only PPP2CB (highlighted with a red dotted frame) exhibited a clear oscillatory behavior similar to that of Plk4, and
these oscillations were precisely out of phase with the Plk4 oscillation.
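One way to put a number on "out of phase" for two diurnal series is to fit each with a cosine of fixed 24 h period and compare the fitted phases. The Python sketch below is a hypothetical illustration rather than the analysis used for this figure: the 4 h sampling interval, the synthetic series, and the function names are all assumptions, and the projection formulas are exact only for evenly spaced samples spanning whole periods.

```python
import math

PERIOD_H = 24.0  # assumed circadian period


def fitted_phase(times_h, values, period_h=PERIOD_H):
    """Phase (radians) of the best-fit cosine at a fixed period.

    Uses the projection formulas for a least-squares cosine fit, which
    are exact only for evenly spaced samples spanning whole periods.
    """
    omega = 2.0 * math.pi / period_h
    mean = sum(values) / len(values)
    a = sum((v - mean) * math.cos(omega * t) for t, v in zip(times_h, values))
    b = sum((v - mean) * math.sin(omega * t) for t, v in zip(times_h, values))
    return math.atan2(b, a)


def phase_lag_hours(phase_a, phase_b, period_h=PERIOD_H):
    """Absolute phase difference, wrapped to the range 0..period/2 hours."""
    delta = (phase_a - phase_b + math.pi) % (2.0 * math.pi) - math.pi
    return abs(delta) * period_h / (2.0 * math.pi)


# Synthetic Z-score-like series sampled every 4 h over two cycles
# (hypothetical values; not the data of Wang et al.).
times = [4.0 * k for k in range(12)]
omega = 2.0 * math.pi / PERIOD_H
series_a = [math.cos(omega * (t - 6.0)) for t in times]
series_b = [-v for v in series_a]  # constructed to be exactly anti-phase

lag = phase_lag_hours(fitted_phase(times, series_a), fitted_phase(times, series_b))
print(f"phase lag: {lag:.1f} h of a {PERIOD_H:.0f} h cycle")
```

A lag close to half the period (12 h for a 24 h cycle) corresponds to two series that are out of phase in the sense used in the legend; a lag near zero would indicate in-phase oscillations.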
DECLARATION OF INTERESTS

The authors declare no competing interests.

Received: April 5, 2019
Revised: December 19, 2019

REFERENCES

Alvarez-Rodrigo, I., Steinacker, T.L., Saurya, S., Conduit, P.T., Baumbach, J., Novak, Z.A., Aydogan, M.G., Wainman, A., and Raff, J.W. (2019). Evidence that a positive feedback loop drives centrosome maturation in fly embryos. eLife 8, D430.
Arquint, C., and Nigg, E.A. (2016). The PLK4-STIL-SAS-6 module at the core of centriole duplication. Biochem. Soc. Trans. 44, 1253–1263.
Aydogan, M.G., Wainman, A., Saurya, S., Steinacker, T.L., Caballe, A., Novak, Z.A., Baumbach, J., Muschalik, N., and Raff, J.W. (2018). A homeostatic clock sets daughter centriole size in flies. J. Cell Biol. 217, 1233–1248.
Balczon, R., Bao, L., Zimmer, W.E., Brown, K., Zinkowski, R.P., and Brinkley, B.R. (1995). Dissociation of centrosome replication events from cycles of DNA synthesis and mitotic division in hydroxyurea-arrested Chinese hamster ovary cells. J. Cell Biol. 130, 105–115.
Ball, G., Demmerle, J., Kaufmann, R., Davis, I., Dobbie, I.M., and Schermelleh, L. (2015). SIMcheck: a Toolbox for Successful Super-resolution Structured Illumination Microscopy. Sci. Rep. 5, 15915.
Banterle, N., and Gönczy, P. (2017). Centriole Biogenesis: From Identifying the Characters to Understanding the Plot. Annu. Rev. Cell Dev. Biol. 33, 23–49.
Basto, R., Brunk, K., Vinadogrova, T., Peel, N., Franz, A., Khodjakov, A., and Raff, J.W. (2008). Centrosome amplification can initiate tumorigenesis in flies. Cell 133, 1032–1042.
Baumbach, J., Novak, Z.A., Raff, J.W., and Wainman, A. (2015). Dissecting the function and assembly of acentriolar microtubule organizing centers in Drosophila cells in vivo. PLoS Genet. 11, e1005261.
Bettencourt-Dias, M., Hildebrandt, F., Pellman, D., Woods, G., and Godinho, S.A. (2011). Centrosomes and cilia in human disease. Trends Genet. 27, 307–315.
Blachon, S., Gopalakrishnan, J., Omori, Y., Polyanovsky, A., Church, A., Nicastro, D., Malicki, J., and Avidor-Reiss, T. (2008). Drosophila asterless and vertebrate Cep152 Are orthologs essential for centriole duplication. Genetics 180, 2081–2094.
Boese, C.J., Nye, J., Buster, D.W., McLamarrah, T.A., Byrnes, A.E., Slep, K.C., Rusan, N.M., and Rogers, G.C. (2018). Asterless is a Polo-like kinase 4 substrate that both activates and inhibits kinase activity depending on its phosphorylation state. Mol. Biol. Cell 29, 2874–2886.
Chang, J.B., and Ferrell, J.E., Jr. (2013). Mitotic trigger waves and the spatial coordination of the Xenopus cell cycle. Nature 500, 603–607.
Chen, F., Archambault, V., Kar, A., Lio’, P., D’Avino, P.P., Sinka, R., Lilley, K., Laue, E.D., Deak, P., Capalbo, L., and Glover, D.M. (2007). Multiple protein phosphatases are required for mitosis in Drosophila. Curr. Biol. 17, 293–303.
Cizmecioglu, O., Arnold, M., Bahtz, R., Settele, F., Ehret, L., Haselmann-Weiss, U., Antony, C., and Hoffmann, I. (2010). Cep152 acts as a scaffold for recruitment of Plk4 and CPAP to the centrosome. J. Cell Biol. 191, 731–739.
Claude, A. (1943). The constitution of protoplasm. Science 97, 451–456.
Conduit, P.T., Wainman, A., Novak, Z.A., Weil, T.T., and Raff, J.W. (2015). Re-examining the role of Drosophila Sas-4 in centrosome assembly using two-colour-3D-SIM FRAP. eLife 4, 1032.
Cunha-Ferreira, I., Rodrigues-Martins, A., Bento, I., Riparbelli, M., Zhang, W., Laue, E., Callaini, G., Glover, D.M., and Bettencourt-Dias, M. (2009). The SCF/Slimb ubiquitin ligase limits centrosome amplification through degradation of SAK/PLK4. Curr. Biol. 19, 43–49.
Cunha-Ferreira, I., Bento, I., Pimenta-Marques, A., Jana, S.C., Lince-Faria, M., Duarte, P., Borrego-Pinto, J., Gilberto, S., Amado, T., Brito, D., et al. (2013). Regulation of autophosphorylation controls PLK4 self-destruction and centriole number. Curr. Biol. 23, 2245–2254.
Deneke, V.E., Melbinger, A., Vergassola, M., and Di Talia, S. (2016). Waves of Cdk1 Activity in S Phase Synchronize the Cell Cycle in Drosophila Embryos. Dev. Cell 38, 399–412.
Dzhindzhev, N.S., Yu, Q.D., Weiskopf, K., Tzolovsky, G., Cunha-Ferreira, I., Riparbelli, M., Rodrigues-Martins, A., Bettencourt-Dias, M., Callaini, G., and Glover, D.M. (2010). Asterless is a scaffold for the onset of centriole assembly. Nature 467, 714–718.
Dobbelaere, J., Josué, F., Suijkerbuijk, S., Baum, B., Tapon, N., and Raff, J. (2008). A genome-wide RNAi screen to dissect centriole duplication and centrosome maturation in Drosophila. PLoS Biol. 6, e224.
Dzhindzhev, N.S., Tzolovsky, G., Lipinszki, Z., Schneider, S., Lattao, R., Fu, J., Debski, J., Dadlez, M., and Glover, D.M. (2014). Plk4 phosphorylates Ana2 to trigger Sas6 recruitment and procentriole formation. Curr. Biol. 24, 2526–2532.
Eilers, P.H., and Boelens, H.F. (2005). Baseline Correction with Asymmetric Least Squares Smoothing. Leiden University Medical Centre report, 2005.
Ferrell, J.E., Jr. (2016). Perfect and Near-Perfect Adaptation in Cell Signaling. Cell Syst. 2, 62–67.
Fırat-Karalar, E.N., and Stearns, T. (2014). The centriole duplication cycle. Philos. Trans. R. Soc. Lond. B Biol. Sci. 369, 20130460.
Friedman, S.L. (2000). Molecular regulation of hepatic fibrosis, an integrated cellular response to tissue injury. J. Biol. Chem. 275, 2247–2250.
Gard, D.L., Hafezi, S., Zhang, T., and Doxsey, S.J. (1990). Centrosome duplication continues in cycloheximide-treated Xenopus blastulae in the absence of a detectable cell cycle. J. Cell Biol. 110, 2033–2042.
Goehring, N.W., and Hyman, A.A. (2012). Organelle growth control through limiting pools of cytoplasmic components. Curr. Biol. 22, R330–R339.
Gönczy, P., and Hatzopoulos, G.N. (2019). Centriole assembly at a glance. J. Cell Sci. 132, jcs228833.
Guderian, G., Westendorf, J., Uldschmid, A., and Nigg, E.A.E. (2010). Plk4 trans-autophosphorylation regulates centriole number by controlling betaTrCP-mediated degradation. J. Cell Sci. 123, 2163–2169.
Hatch, E.M., Kulukian, A., Holland, A.J., Cleveland, D.W., and Stearns, T. (2010). Cep152 interacts with Plk4 and is required for centriole duplication. J. Cell Biol. 191, 721–729.
Holland, A.J., Lan, W., Niessen, S., Hoover, H., and Cleveland, D.W. (2010). Polo-like kinase 4 kinase activity limits centrosome overduplication by autoregulating its own stability. J. Cell Biol. 188, 191–198.
Jacobs, H.W., Knoblich, J.A., and Lehner, C.F. (1998). Drosophila Cyclin B3 is required for female fertility and is dispensable for mitosis like Cyclin B. Genes Dev. 12, 3741–3751.
Jones, E., Oliphant, T., and Peterson, P. (2001). SciPy: Open Source Scientific Tools for Python. http://www.scipy.org/.
Kitagawa, D., Vakonakis, I., Olieric, N., Hilbert, M., Keller, D., Olieric, V., Bortfeld, M., Erat, M.C., Flückiger, I., Gönczy, P., and Steinmetz, M.O. (2011). Structural basis of the 9-fold symmetry of centrioles. Cell 144, 364–375.
Klebba, J.E., Buster, D.W., Nguyen, A.L., Swatkoski, S., Gucek, M., Rusan, N.M., and Rogers, G.C. (2013). Polo-like kinase 4 autodestructs by generating its Slimb-binding phosphodegron. Curr. Biol. 23, 2255–2261.
Klebba, J.E., Galletta, B.J., Nye, J., Plevock, K.M., Buster, D.W., Hollingsworth, N.A., Slep, K.C., Rusan, N.M., and Rogers, G.C. (2015). Two Polo-like kinase 4 binding domains in Asterless perform distinct roles in regulating kinase stability. J. Cell Biol. 208, 401–414.
Koppel, D.E. (1974). Statistical accuracy in fluorescence correlation spectroscopy. Phys. Rev. A 10, 1938–1945.
Kratz, A.-S., Bärenz, F., Richter, K.T., and Hoffmann, I. (2015). Plk4-dependent phosphorylation of STIL is required for centriole duplication. Biol. Open 4, 370–377.
Leda, M., Holland, A.J., and Goryachev, A.B. (2018). Autoamplification and Competition Drive Symmetry Breaking: Initiation of Centriole Duplication by the PLK4-STIL Network. iScience 8, 222–235.
Leung, G.C., Ho, C.S.W., Blasutig, I.M., Murphy, J.M., and Sicheri, F. (2007). Determination of the Plk4/Sak consensus phosphorylation motif using peptide spots arrays. FEBS Lett. 581, 77–83.
Liu, T.-L., Upadhyayula, S., Milkie, D.E., Singh, V., Wang, K., Swinburne, I.A., Mosaliganti, K.R., Collins, Z.M., Hiscock, T.W., Shea, J., et al. (2018). Observing the cell in its native state: Imaging subcellular dynamics in multicellular organisms. Science 360, eaaq1392.
Lu, Y., and Cross, F.R. (2010). Periodic cyclin-Cdk activity entrains an autonomous Cdc14 release oscillator. Cell 141, 268–279.
Markow, T.A., Beall, S., and Matzkin, L.M. (2009). Egg size, embryonic development time and ovoviviparity in Drosophila species. J. Evol. Biol. 22, 430–434.
Marsh, B.J., Mastronarde, D.N., Buttle, K.F., Howell, K.E., and McIntosh, J.R. (2001). Organellar relationships in the Golgi region of the pancreatic beta cell line, HIT-T15, visualized by high resolution electron tomography. Proc. Natl. Acad. Sci. USA 98, 2399–2406.
Marshall, W.F. (2016). Cell Geometry: How Cells Count and Measure Size. Annu. Rev. Biophys. 45, 49–64.
Matsuo, T., Yamaguchi, S., Mitsui, S., Emi, A., Shimoda, F., and Okamura, H. (2003). Control mechanism of the circadian clock for timing of cell division in vivo. Science 302, 255–259.
McCleland, M.L., and O’Farrell, P.H. (2008). RNAi of mitotic cyclins in Drosophila uncouples the nuclear and centrosome cycle. Curr. Biol. 18, 245–254.
McLamarrah, T.A., Buster, D.W., Galletta, B.J., Boese, C.J., Ryniawec, J.M., Hollingsworth, N.A., Byrnes, A.E., Brownlee, C.W., Slep, K.C., Rusan, N.M., and Rogers, G.C. (2018). An ordered pattern of Ana2 phosphorylation by Plk4 is required for centriole assembly. J. Cell Biol. 217, 1217–1231.
Morgan, D.O. (2010). The hidden rhythms of the dividing cell. Cell 141, 224–226.
Mukherji, S., and O’Shea, E.K. (2014). Mechanisms of organelle biogenesis govern stochastic fluctuations in organelle abundance. eLife 3, e02678.
Price, D., Rabinovitch, S., O’Farrell, P.H., and Campbell, S.D. (2000). Drosophila wee1 Has an Essential Role in the Nuclear Divisions of Early Embryogenesis. Genetics 155, 159–166.
Ripley, B.D. (1976). The second-order analysis of stationary point processes. J. Appl. Probab. 13, 255–266.
Rogers, G.C., Rusan, N.M., Peifer, M., and Rogers, S.L. (2008). A multicomponent assembly pathway contributes to the formation of acentrosomal microtubule arrays in interphase Drosophila cells. Mol. Biol. Cell 19, 3163–3178.
Rogers, G.C., Rusan, N.M., Roberts, D.M., Peifer, M., and Rogers, S.L. (2009). The SCF Slimb ubiquitin ligase regulates Plk4/Sak levels to block centriole reduplication. J. Cell Biol. 184, 225–239.
Rüttinger, S., Buschmann, V., Krämer, B., Erdmann, R., Macdonald, R., and Koberling, F. (2008). Comparison and accuracy of methods to determine the confocal volume for quantitative fluorescence correlation spectroscopy. J. Microsc. 232, 343–352.
Schönle, A., Von Middendorff, C., Ringemann, C., Hell, S.W., and Eggeling, C. (2014). Monitoring triplet state dynamics with fluorescence correlation spectroscopy: bias and correction. Microsc. Res. Tech. 77, 528–536.
Schwarz, G. (1978). Estimating the Dimension of a Model. Ann. Stat. 6, 461–464.
Shaner, N.C., Lambert, G.G., Chammas, A., Ni, Y., Cranfill, P.J., Baird, M.A., Sell, B.R., Allen, J.R., Day, R.N., Israelsson, M., et al. (2013). A bright monomeric green fluorescent protein derived from Branchiostoma lanceolatum. Nat. Methods 10, 407–409.
Shcherbo, D., Murphy, C.S., Ermakova, G.V., Solovieva, E.A., Chepurnykh, T.V., Shcheglov, A.S., Verkhusha, V.V., Pletnev, V.Z., Hazelwood, K.L., Roche, P.M., et al. (2009). Far-red fluorescent tags for protein imaging in living tissues. Biochem. J. 418, 567–574.
Sibon, O.C., Stevenson, V.A., and Theurkauf, W.E. (1997). DNA-replication checkpoint control at the Drosophila midblastula transition. Nature 388, 93–97.
Sluder, G., Miller, F.J., Cole, R., and Rieder, C.L. (1990). Protein synthesis and the cell cycle: centrosome reproduction in sea urchin eggs is not under translational control. J. Cell Biol. 110, 2025–2032.
Somvanshi, P.R., Patel, A.K., Bhartiya, S., and Venkatesh, K.V. (2015). Implementation of integral feedback control in biological systems. Wiley Interdiscip. Rev. Syst. Biol. Med. 7, 301–316.
Stern, B., and Nurse, P. (1996). A quantitative model for the cdc2 control of S
phase and mitosis in fission yeast. Trends Genet. 12, 345–350.
Nigg, E.A., and Holland, A.J. (2018). Once and only once: mechanisms of
centriole duplication and their deregulation in disease. Nat. Rev. Mol. Cell Stumpff, J., Duncan, T., Homola, E., Campbell, S.D., and Su, T.T. (2004).
Biol. 19, 297–312. Drosophila Wee1 kinase regulates Cdk1 and mitotic entry during embryogen-
esis. Curr. Biol. 14, 2143–2148.
Nigg, E.A., and Raff, J.W. (2009). Centrioles, centrosomes, and cilia in health
and disease. Cell 139, 663–678. Swaffer, M.P., Jones, A.W., Flynn, H.R., Snijders, A.P., and Nurse, P. (2016).
CDK Substrate Phosphorylation and Ordering the Cell Cycle. Cell 167,
Nilsson, J. (2019). Protein phosphatases in the regulation of mitosis. J. Cell
1750–1761.
Biol. 218, 395–409.
Swaffer, M.P., Jones, A.W., Flynn, H.R., Snijders, A.P., and Nurse, P. (2018).
Novák, B., and Tyson, J.J. (2008). Design principles of biochemical oscillators. Quantitative Phosphoproteomics Reveals the Signaling Dynamics of Cell-Cy-
Nat. Rev. Mol. Cell Biol. 9, 981–991. cle Kinases in the Fission Yeast Schizosaccharomyces pombe. Cell Rep. 24,
Novak, Z.A., Conduit, P.T., Wainman, A., and Raff, J.W. (2014). Asterless li- 503–514.
censes daughter centrioles to duplicate for the first time in Drosophila em- Takao, D., Watanabe, K., Kuroki, K., and Kitagawa, D. (2019). Feedback loops
bryos. Curr. Biol. 24, 1276–1282. in the Plk4-STIL-HsSAS6 network coordinate site selection for procentriole
Ohta, M., Ashikawa, T., Nozaki, Y., Kozuka-Hata, H., Goto, H., Inagaki, M., formation. Biol. Open 8, bio047175.
Oyama, M., and Kitagawa, D. (2014). Direct interaction of Plk4 with STIL en- Tinevez, J.-Y., Perry, N., Schindelin, J., Hoopes, G.M., Reynolds, G.D.,
sures formation of a single procentriole per parental centriole. Nat. Commun. Laplantine, E., Bednarek, S.Y., Shorte, S.L., and Eliceiri, K.W. (2017).
5, 5267. TrackMate: An open and extensible platform for single-particle tracking.
Park, J.-E., Zhang, L., Bang, J.K., Andresson, T., DiMaio, F., and Lee, K.S. Methods 115, 80–90.
(2019). Phase separation of Polo-like kinase 4 by autoactivation and clustering Tsai, T.Y.-C., Choi, Y.S., Ma, W., Pomerening, J.R., Tang, C., and Ferrell, J.E.,
drives centriole biogenesis. Nat. Commun. 10, 4959. Jr. (2008). Robust, tunable biological oscillations from interlinked positive and
Petrásek, Z., and Schwille, P. (2008). Precise measurement of diffusion coef- negative feedback loops. Science 321, 126–129.
ficients using scanning fluorescence correlation spectroscopy. Biophys. J. 94, van Breugel, M., Hirono, M., Andreeva, A., Yanagisawa, H.-A., Yamaguchi, S.,
1437–1448. Nakazawa, Y., Morgner, N., Petrovich, M., Ebong, I.-O., Robinson, C.V., et al.
1580 Cell 181, 1566–1581, June 25, 2020

ll
Article
(2011). Structures of SAS-6 suggest its organization in centrioles. Science 331, Wang, Y., Song, L., Liu, M., Ge, R., Zhou, Q., Liu, W., Li, R., Qie, J., Zhen, B.,
1196–1199. Wang, Y., et al. (2018). A proteomics landscape of circadian clock in mouse
van Breugel, M., Wilcken, R., McLaughlin, S.H., Rutherford, T.J., and Johnson, liver. Nat. Commun. 9, 1553.
C.M. (2014). Structure of the SAS-6 cartwheel hub from Leishmania major. eL- Yamamoto, S., and Kitagawa, D. (2019). Self-organization of Plk4 regulates
ife 3, e01812. symmetry breaking in centriole duplication. Nat. Comm. 10, 1810.
Waithe, D., Clausen, M.P., Sezgin, E., and Eggeling, C. (2016). FoCuS-point: Yang, Q., Pando, B.F., Dong, G., Golden, S.S., and van Oudenaarden, A.
software for STED fluorescence correlation and time-gated single photon (2010). Circadian gating of the cell cycle revealed in single cyanobacterial cells.
counting. Bioinformatics 32, 958–960. Science 327, 1522–1526.
STAR★METHODS
KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
Mouse anti-GFP | Roche | RRID: AB_390913
Mouse anti-Actin | Sigma | RRID: AB_476730
HRPO-linked anti-mouse IgG | Sigma / GE Healthcare | Cat# GENA931
QuikChange II XL mutagenesis kit | Agilent Technologies | Cat# 200521
Q5 Site Directed Mutagenesis kit | New England Biolabs | Cat# E0554S
Voltalef grade H10S oil | Arkema | N/A
Alexa Fluor 488 NHS Ester | Thermo Fisher Scientific | Cat# A20000
D. melanogaster: Plk4-mNeonGreen | This paper | N/A
D. melanogaster: Plk4Aa74 (Plk4 null mutant) | Aydogan et al., 2018 | FlyBase ID: FBab0049012
D. melanogaster: Asl-mKate2 | This paper | N/A
D. melanogaster: Sas-6-mCherry | Rogers et al., 2008 | N/A
D. melanogaster: CycB2 | Jacobs et al., 1998 | FlyBase ID: FBal0094855
D. melanogaster: grpfsA4 | Sibon et al., 1997 | FlyBase ID: FBal0062815
D. melanogaster: Asl-GFP | Blachon et al., 2008 | FlyBase ID: FBtp0040947
D. melanogaster: aslB46 | Baumbach et al., 2015 | FlyBase ID: FBal0343439
D. melanogaster: Plk4-GFP | Aydogan et al., 2018 | FlyBase ID: FBal0343977
D. melanogaster: Asl-mCherry | Conduit et al., 2015 | FlyBase ID: FBal0343645
D. melanogaster: Sas-6-GFP | Aydogan et al., 2018 | FlyBase ID: FBtp0131375
D. melanogaster: Asl-13A-mKate2 | This paper | N/A
D. melanogaster: Asl-13A | This paper | N/A
D. melanogaster: Asl | This paper | N/A
D. melanogaster: wee1* | (Homozygous viable mutant derived from Price et al., 2000; courtesy of Prof. Shelagh Campbell) | N/A
D. melanogaster: Plk4-mNeonGreen, Plk4Aa74 / Plk4-mNeonGreen, Plk4Aa74 | This paper | N/A
D. melanogaster: Asl-mKate2 / Cyo; Plk4-mNeonGreen, Plk4Aa74 / Plk4-mNeonGreen, Plk4Aa74 | This paper | N/A
D. melanogaster: Sas-6-mCherry / +; Plk4- | This paper | N/A
D. melanogaster: CycB2 / +; Plk4- | This paper | N/A
D. melanogaster: grpfsA4 / +; Plk4- | This paper | N/A
D. melanogaster: Asl-GFP / Asl-GFP; aslB46 / aslB46 | This paper | N/A
D. melanogaster: Oregon-R (Wild-type strain) | Kyoto Stock Center | FlyBase ID: FBst0324696
D. melanogaster: Asl-mKate2, aslB46 / + | This paper | N/A
D. melanogaster: Plk4-GFP / Cyo; Plk4Aa74 / Plk4Aa74 | Aydogan et al., 2018 | N/A
D. melanogaster: Plk4-mNeonGreen, Plk4Aa74 / Plk4Aa74 | This paper | N/A
D. melanogaster: Asl-mCherry / +; Plk4Aa74 / + | This paper | N/A
D. melanogaster: Plk4-mNeonGreen / +; Plk4-mNeonGreen, Plk4Aa74 / Plk4Aa74 | This paper | N/A
D. melanogaster: Plk4-mNeonGreen / +; Plk4-mNeonGreen, Plk4Aa74 / aslB46, Plk4Aa74 | This paper | N/A
D. melanogaster: Asl-GFP / +; aslB46 / aslB46 | This paper | N/A
D. melanogaster: Plk4-GFP / Cyo; aslB46, Plk4Aa74 / Plk4Aa74 | This paper | N/A
D. melanogaster: Sas-6-GFP / +; aslB46 / + | This paper | N/A
D. melanogaster: Asl-13A-mKate2 / Asl-13A-mKate2; aslB46 / aslB46 | This paper | N/A
D. melanogaster: Asl-mKate2 / Asl-mKate2; aslB46 / aslB46 | This paper | N/A
D. melanogaster: Asl-13A / +; Plk4- | This paper | N/A
D. melanogaster: Asl / +; Plk4- | This paper | N/A
D. melanogaster: wee1* / wee1*; Plk4- | This paper | N/A
Oligonucleotides
Primers to introduce the NheI restriction enzyme sites into the mCherry C-terminal Gateway vector, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to replace the mCherry tag with mNeonGreen by homologous recombination on the destination vector, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to replace the mCherry tag with mKate2 by homologous recombination on the destination vector, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to remove the NheI restriction enzyme sites from the destination vector via site-directed mutagenesis (mNeonGreen vector), see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to remove the NheI restriction enzyme sites from the destination vector via site-directed mutagenesis (mKate2 vector), see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to amplify Cyclin A, B or B3, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to introduce various site directed mutations for Asl-13A construct, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to delete mKate2 to generate endogenous Asl-13A construct without a fluorescent tag, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Primers to generate endogenous Asl construct without a fluorescent tag, see Table S1. | Invitrogen, Thermo Fisher Scientific | N/A
Recombinant DNA
mCherry C-terminal Gateway vector | Basto et al., 2008 | N/A
pDONR-Zeo vector | Thermo Fisher Scientific | Cat# 12535035
mNeonGreen vector | Shaner et al., 2013 | N/A
mKate2 vector | Shcherbo et al., 2009 | N/A
Asl-mKate2 P-element transformation vector | This study | N/A
Fiji (ImageJ) | National Institutes of Health | https://imagej.nih.gov/ij/
TrackMate | Tinevez et al., 2017 | https://imagej.net/TrackMate
Prism 7 and 8 | GraphPad | https://www.graphpad.com/scientific-software/prism/
Scipy’s find_peaks function | Jones et al., 2001 | https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks.html
Asymmetric baseline smoothing | Eilers and Boelens, 2005 | N/A
Zen Black Software | Zeiss | https://www.zeiss.com/microscopy/us/products/microscope-software/zen.html
FoCuS-Point Software | Waithe et al., 2016 | N/A
The equations used for mathematical modeling and regressions | This paper | https://github.com/RaffLab/centriole_oscillator_model
Python script to automate PeCoS analysis | This paper | https://github.com/RaffLab/centriole_oscillator_model
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Jordan W.
Raff (jordan.raff@path.ox.ac.uk).
All unique/stable reagents generated in this study are available from the Lead Contact without restriction, unless for commercial
application, in which case a completed Materials Transfer Agreement will be requested. The availability of the dsRNA cocktails
produced in this study is restricted, as they last only 6 months without degradation (if preserved under the conditions indicated
in the STAR Methods); we therefore recommend making these cocktails fresh using the protocol described in the
STAR Methods. FlyBase administration will request that the fly alleles and plasmids (with original source species) generated in this
study be deposited in the FlyBase public archives within 6 months of publication of this study. Compound and recombinant
flies are held in the Lead Contact’s laboratory stocks (without direct public access), but are available without restriction upon
request.

The code generated to perform the mathematical modeling and regressions is available at https://github.com/RaffLab/centriole_oscillator_model.
The code generated to automate the PeCoS analysis procedure is available at https://github.com/RaffLab/PeCoS.
The source 3D time-lapse spinning-disk confocal micrographs and SIM reconstruction datasets supporting the current study are
between 10 and 20 GB in size for each experiment (exceeding the current upload limits of public repositories) and have therefore
been deposited in an Open Microscopy Environment (OMERO) repository. These are available
without restriction, via file transfer systems, when requested from the Lead Contact – unless for commercial application, in which
case a completed Materials Transfer Agreement will be requested.
D. melanogaster stocks and husbandry

The specific D. melanogaster stocks used, generated and/or tested in this study are listed in the Key Resources Table. To generate the
Plk4-mNeonGreen and Asl-mKate2 constructs: 1) NheI restriction enzyme sites were introduced into an mCherry C-terminal Gateway
vector (Basto et al., 2008), using the QuikChange II XL mutagenesis kit (Agilent Technologies). 2) The mCherry tag was replaced with
either mNeonGreen (Shaner et al., 2013) (Allele Biotechnology) or mKate2 (Shcherbo et al., 2009) tags by homologous recombination
via In-Fusion cloning (TaKaRa). 3) The NheI restriction enzyme sites were removed via site-directed mutagenesis, using the QuikChange II
XL mutagenesis kit (Agilent Technologies). These vectors were recombined via Gateway technology with pDONR-Zeo vectors (Thermo
Fisher Scientific) into which the genetic regions of either Plk4 (Aydogan et al., 2018) or asl (Novak et al., 2014) had previously been cloned from
2 kb upstream of the start codon up to (but excluding) the stop codon. Similarly, to generate an Asl construct without any fluorescent
tag, the endogenous Asl stop codon was introduced into the Asl pDONR by site-directed mutagenesis, using the QuikChange II XL
mutagenesis kit.
To generate the Asl-13A-mKate2 construct, 13 point mutations (Figure S4C) were introduced into the endogenous Asl-mKate2 P-
element transformation vector (this study) through NEB Q5 Site Directed Mutagenesis (NEB #E0554S) in four sequential mutagenesis
steps. To generate the endogenous Asl-13A construct without any fluorescent tag, the mKate2 coding sequence was removed, and a
stop codon was introduced immediately following the Asl coding sequence in the eAsl-13A-mKate2 vector (described above) using
NEB Q5 Site Directed Mutagenesis.
Primer sequences used to generate these constructs are listed in Table S1. Transgenic lines were generated using standard P-
element mediated transformation by the Fly Facility in the Department of Genetics, University of Cambridge (Cambridge, England,
UK) or BestGene Inc. (USA). Flies were maintained at 18°C or 25°C on Drosophila culture medium (0.77% agar, 6.9% maize, 0.8%
soya, 1.4% yeast, 6.9% malt, 1.9% molasses, 0.5% propionic acid, 0.03% ortho-phosphoric acid, and 0.3% nipagin) in vials or
bottles.
METHOD DETAILS
Hatching experiments
To measure embryo hatching rates, 0-3 h embryos were collected and aged for 24 h, and the % of embryos that hatched out of their
chorion was calculated.
Synthesis of double-stranded RNA

Double-stranded RNAs (dsRNAs) against cyclins A, B and B3 were synthesized essentially as described previously (McCleland and
O’Farrell, 2008). Primer sequences used for gene amplification are listed in Table S1. The resulting RNA was precipitated with 8 μL of
3 M Na-Acetate and 220 μL of 100% ethanol before washing with cold 70% ethanol. The RNA pellets were air-dried and resuspended
in 30 μL of RNase-free diethylpyrocarbonate-treated water (Thermo Fisher Scientific). To generate double-stranded molecules,
RNAs were placed in a 67.5°C water bath for 30 min, and allowed to cool to room temperature over 90 min. Unincorporated
UTPs were removed using CHROMA SPIN-100-DEPC-H2O columns (Clontech) according to the manufacturer’s instructions. To
confirm the synthesis of the correct RNA product, 3 μL of the final reaction was subjected to electrophoresis on a 1.5% agarose
gel using 2x RNA loading buffer (Thermo Fisher Scientific). A 1:1 mix of RNA and loading buffer was heated to 65°C for 5 min and
then placed on ice to denature any secondary structure of RNA.
Embryo collections and dsRNA injections

For embryo collections, 25% cranberry-raspberry juice plates (2% sucrose and 1.8% agar with a drop of yeast suspension) were
used. Embryos for imaging experiments were collected for 1 h at 25°C, and aged at 25°C for 45–60 min. Embryos were dechorionated
by hand, mounted on a strip of glue on a 35-mm glass-bottom Petri dish with a 14 mm micro-well (MatTek), and were left to
desiccate for 1 min at 25°C. After desiccation, the embryos were covered with Voltalef grade H10S oil (Arkema). Embryos for dsRNA
injection experiments were treated in the same way except that the desiccation period was increased to 5-6 min. Embryos were in-
jected with dsRNA at a needle concentration of 0.6–0.8 mg/ml.
Immunoblotting
Immunoblotting was performed as described previously (Aydogan et al., 2018). Primary antibodies used in this study are as follows:
mouse anti-GFP (Roche; RRID: AB_390913) and mouse anti-Actin (Sigma; RRID: AB_476730). Both antibodies were used at
1:500 dilution in blocking solution (Aydogan et al., 2018). For all blots, 10, 20 or 30 staged early embryos were boiled in sample buffer
and loaded in each lane. The incubation period for primary antibodies was 1 h (or overnight at 4°C). Membranes were quickly washed
3x in TBST (TBS and 0.1% Tween 20) and then incubated with HRPO-linked anti-mouse IgG (both GE Healthcare) diluted 1:3,000 in
blocking solution for 45 min. Membranes were washed 3 x 15 min in TBST and then incubated in SuperSignal West Femto Maximum
Sensitivity Substrate (Thermo Fisher Scientific). Membranes were exposed to film using exposure times that ranged from < 1 to 60 s.
Image acquisition, processing, and analysis

Spinning disk confocal microscopy
Living embryos were imaged at room temperature using a system equipped with an EM-CCD Andor iXon+ camera on a Nikon Eclipse
TE200-E microscope using a Plan-Apochromat 60x/1.42-NA oil DIC lens, controlled with Andor IQ2 software. Confocal sections of 17
slices at 0.5 μm intervals were collected every 30 s. A 488 nm laser was used to excite mNeonGreen and GFP, and a 568 nm laser was
used to excite mCherry and mKate2. Emission discrimination filters were applied when mNeonGreen and mCherry were imaged
together.
Post-acquisition image processing was performed using Fiji (National Institutes of Health). Maximum-intensity projections of the
images were first bleach-corrected with Fiji’s exponential fit algorithm, and background was subtracted using the subtract back-
ground tool with a rolling ball radius of 10 pixels. Plk4-NG, Sas-6-mCherry or -GFP, and Asl-mCherry or -GFP were tracked using
the Fiji plug-in TrackMate (Tinevez et al., 2017) with a track spot diameter size of 1.1 μm. When Plk4-NG was continuously monitored
over cycles 11-13, the maximum intensity projections were limited to ± 5 slices from the central plane of the nuclei, as the nuclei and
centrosomes progressively get closer to the embryo cortex at successive cycles. This processing more accurately compares the
dynamics of centriolar Plk4-NG at successive cycles by avoiding fluctuations due to the varying depths of the centrioles in the em-
bryo. The regressions for the centriole growth curves (Sas-6-GFP or -mCherry) were calculated in Prism 7 (GraphPad Software), as
described previously (Aydogan et al., 2018). The regressions for the Plk4 oscillation curves (Plk4-NG) were calculated using the
nonlinear regression (curve fit) function in Prism 7. Discrete Plk4 oscillation curves in S-phase were initially fitted against four different
functions to assess the most suitable regression model: 1) Lorentzian, 2) Gaussian, 3) Increase – Constant – Decrease, and 4)
Increase – Decrease. Among these models, Lorentzian best fit the data (Figure S1D), so all the discrete Plk4 oscillation curves in
S-phase were regressed using this function. The Lorentzian and Gaussian functions are described in Prism 7, while the latter two
functions are in-house algorithms (Alvarez-Rodrigo et al., 2019).
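The model comparison described above can be sketched outside Prism as follows. This is only a minimal illustration and not the Prism workflow used in the study: the Lorentzian and Gaussian parameterisations, the use of `scipy.optimize.curve_fit`, the residual sum of squares as the comparison criterion, and the synthetic trace are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import curve_fit


def lorentzian(t, amplitude, center, width):
    # Assumed parameterisation: peak height `amplitude` at `center`, half-width `width`.
    return amplitude / (1.0 + ((t - center) / width) ** 2)


def gaussian(t, amplitude, mean, sd):
    # Assumed parameterisation: peak height `amplitude` at `mean`, spread `sd`.
    return amplitude * np.exp(-0.5 * ((t - mean) / sd) ** 2)


def fit_sse(model, t, y):
    """Fit `model` to (t, y) and return (parameters, residual sum of squares)."""
    p0 = (y.max(), t[np.argmax(y)], (t[-1] - t[0]) / 4.0)  # crude initial guess
    params, _ = curve_fit(model, t, y, p0=p0)
    residuals = y - model(t, *params)
    return params, float(np.sum(residuals ** 2))


# Synthetic S-phase trace (hypothetical values): a Lorentzian pulse sampled every 30 s.
t = np.arange(0.0, 12.5, 0.5)  # minutes
y = lorentzian(t, 1.0, 6.0, 2.0)

lor_params, lor_sse = fit_sse(lorentzian, t, y)
gau_params, gau_sse = fit_sse(gaussian, t, y)
best_model = "Lorentzian" if lor_sse < gau_sse else "Gaussian"
```

With real traces, the two in-house models (Increase – Constant – Decrease and Increase – Decrease) would be added to the same comparison; they are omitted here because their functional forms are not given in this section.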
In order to plot the dynamics of Plk4-NG and Sas-6-mCherry together (Figures 2, S3A, and S3B), the highest mean fluorescence
signal for each tag was normalized to 1 and was accordingly scaled across cycles 11-13 (the scaling factor for Plk4-NG was calcu-
lated from the data shown in Figures 1B and 1C). Note that the amplitude of the Plk4 oscillation does not appear to decrease signif-
icantly between nuclear cycles 11-13 in the data shown in Figure 2A—in contrast to the Plk4 oscillations shown in Figure 1B. This is
not because of the scaling procedure applied to the data shown in Figure 2A (described above), but rather because embryos that
failed to grow their centrioles were excluded from the analysis shown in Figure 2A. The amplitude of the Plk4 oscillations was lower
in these embryos (Figure 2C), so embryos with low amplitude Plk4 oscillations were effectively excluded from the analysis shown in
Figure 2A. Almost all of these excluded embryos were at nuclear cycle 13, so the ‘‘average’’ oscillation at nuclear cycle 13 is in reality
an average of only those embryos that had a relatively high amplitude Plk4 oscillation.
In all the imaging experiments, the beginning of S-phase was taken as the time at which the old and new mother centrioles were first
detected to separate from each other (termed ‘‘centrosome separation’’ or ‘‘CS’’). Entry into mitosis was taken as the time of nuclear
envelope breakdown (NEB), which could be determined in our movies by adjusting the contrast to visualize when the cytosolic pool of
the fluorescent protein was first observed to enter into the nucleus.
Analysis of centriole ‘‘fertility’’ in embryos injected with dsRNA against cyclin A-B-B3
In experiments where we depleted embryos of mitotic cyclins during early rounds of nuclear division, we observed qualitatively that
‘‘fertile’’ centrioles exhibited distinct Plk4-NG fluorescence peaks that often appeared to correlate with centriole duplication events,
while ‘‘sterile’’ centrioles exhibited no obvious peaks (Figure 5B). To test if we could more quantitatively distinguish between fertile
and sterile centrioles, we computationally analyzed all 81 centrioles that we could track throughout the observation period in 3
different embryos. We first assessed the average signal-to-noise ratio (SNR) of Plk4-NG fluorescence of each centriole over the entire
observation period and found that fertile centrioles exhibited a significantly higher SNR than sterile centrioles—assessed using a t
test assuming equal variance (Figure S8B). Within each class (sterile or fertile), the SNR distribution was unimodal and symmetric
(Figure S8C), so we attempted to classify centrioles in an unbiased way by thresholding the SNR. Based on the
bimodality of the pooled SNR distribution, an automatic threshold was determined from the data using Otsu thresholding (red dashed line, Figure S8C);
the classification performance was summarized in a visual confusion matrix, which shows the proportion of correctly and falsely clas-
sified signals (Figure S8D). This unbiased computational method successfully classified 74% of the fertile centrioles and 71% of
the sterile centrioles.
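The thresholding step above can be illustrated with a short sketch of Otsu's method applied to a one-dimensional set of SNR values. The histogram-based implementation, the number of bins, and the synthetic SNR values (chosen to be cleanly separated, unlike the real data) are assumptions for illustration only; they are not taken from the PeCoS code.

```python
import numpy as np


def otsu_threshold(values, n_bins=64):
    """Threshold maximising the between-class variance (Otsu's method).

    Assumes `values` contains at least two distinct numbers.
    """
    values = np.asarray(values, dtype=float)
    counts, edges = np.histogram(values, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    weights = counts / counts.sum()

    # Candidate splits lie after bins 0 .. n_bins-2, so both classes are non-empty.
    w_low = np.cumsum(weights)[:-1]
    w_high = 1.0 - w_low
    cum_mean = np.cumsum(weights * centers)
    total_mean = cum_mean[-1]
    between = (total_mean * w_low - cum_mean[:-1]) ** 2 / (w_low * w_high)

    best = int(np.argmax(between))
    return float(edges[best + 1])  # upper edge of the last "low" bin


# Synthetic SNR values for 81 centrioles (hypothetical, well separated for clarity).
sterile_snr = np.linspace(0.03, 0.07, 40)
fertile_snr = np.linspace(0.15, 0.25, 41)
threshold = otsu_threshold(np.concatenate([sterile_snr, fertile_snr]))

# Proportion of each class that falls on the expected side of the threshold.
fertile_correct = float(np.mean(fertile_snr > threshold))
sterile_correct = float(np.mean(sterile_snr <= threshold))
```

The two proportions computed at the end correspond to the diagonal of a confusion matrix; with overlapping distributions, as in the experimental data, they would fall below 1.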
Peak Calling
We next tested whether computationally identified peaks in the Plk4-NG signal were correlated with centriole duplication events.
Plk4-NG peaks were only called on signals whose fluctuation (as measured by signal-to-noise ratio, SNR) was greater than a certain
defined threshold (0.1, see below). A peak was defined only as a local maximum in intensity. To call a peak, the Plk4-NG signal
intensity was compared to the signal intensity at neighboring times. Here an unbiased distance = 1 was set, that is, an intensity at time t
is a peak only if the intensity is higher than those at both t − 1 and t + 1. To filter out noise detections, a threshold of 0.1 was placed on the
peak prominence. Peak prominence measures the extent to which a detected peak stands out from its surroundings; it is defined as
the vertical distance between the peak and its lowest contour line (Scipy’s find_peaks function) (Jones et al., 2001). The choice of 0.1
as a threshold was guided by comparing the predictive power of the peaks, given no cut-off, against the (ground truth) duplication times. This
analysis indicated that the optimal peak prominence cut-off (i.e., the point at which the power of the peaks to predict duplication events,
see below, sharply drops off) was 0.12 (green dot, Figure 5E). The observed steep drop-off in predictive power below this threshold
supports the view that there is likely to be a minimal amount of centriolar Plk4-NG that is necessary to trigger duplication under
these conditions. Moreover, this unbiased computational approach identified 4x as many peaks in the fertile centrioles when
compared to the sterile centrioles.
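The peak-calling rule described above maps directly onto SciPy's `find_peaks`, which accepts both a minimum `distance` between peaks and a minimum `prominence`. The sketch below is illustrative: the wrapper name `call_peaks` and the example trace are hypothetical, and the trace is assumed to be normalised so that a prominence of 0.1 is meaningful. The SNR pre-filter mentioned in the text is not reproduced here.

```python
import numpy as np
from scipy.signal import find_peaks


def call_peaks(signal, prominence=0.1, distance=1):
    """Return indices and prominences of local maxima that pass the prominence filter."""
    peaks, properties = find_peaks(signal, distance=distance, prominence=prominence)
    return peaks, properties["prominences"]


# Hypothetical normalised Plk4-NG trace for one centriole (one value per time point).
trace = np.array([0.0, 0.5, 0.1, 0.15, 0.12, 0.9, 0.2, 0.0])
peak_idx, peak_prom = call_peaks(trace)
```

In this example the local maximum at index 3 (value 0.15) rises only 0.03 above its higher neighbouring minimum, so it is rejected as noise, while the maxima at indices 1 and 5 are retained.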
To determine whether the filtered Plk4-NG peaks were predictive of centriole duplication, we determined all the peaks above the
0.1 threshold for the fertile centriole signals and assessed whether these peaks could be used to ‘‘retrieve’’ the real or relevant time
points for centriole duplication. The performance of such retrieval can be evaluated using ‘‘precision’’ (the number of relevant re-
trievals among all retrieved instances—in this case the number of Plk4 peaks associated with a centriole duplication event divided
by the total number of Plk4 peaks) and ‘‘recall’’ (the number of relevant instances retrieved of the total relevant instances—in this case
the number of Plk4 peaks associated with a centriole duplication event divided by the total number of centriole duplication events), as
defined below.
$$\text{Precision} = \frac{\text{Number of (Relevant \& Retrieved)}}{\text{Number of (Retrieved)}}, \qquad \text{Recall} = \frac{\text{Number of (Relevant \& Retrieved)}}{\text{Number of (Relevant)}}$$
The evaluation of such a system naturally depends on the cut-off to call a positive match between the Plk4-NG signal peak and its
corresponding centriole duplication time. Too small a cut-off (e.g., 0 minutes) is unrealistic: no system can predict time perfectly;
while too large a cut-off (e.g., 15 min) is too lenient and non-specific. Figure 6C plots the precision evaluated over all centrioles
for different temporal cut-offs attempting to uniquely match Plk4-NG peaks to the nearest duplication time within a given time win-
dow. The elbow point (red dashed line) at 5 min was selected as an appropriate cut-off with a precision of 80%. (Note that the recall
was not plotted in this graph, but it exhibits a similar behavior to the precision: the Number of (Relevant) = 52, while the Number of
(Retrieved) = 49). This temporal cut-off can also be interpreted as an estimate of the temporal accuracy to which Plk4-NG peak time
associates with centriole duplication time. For comparison, we also derived the mean temporal separation distance of peaks and
duplication events expected if the same number of experimental centriole duplication times were randomly distributed over the same time
interval for each embryo. 1,000 simulations were run per embryo to produce a distribution. Across all embryos, the average temporal
separation distance (for randomly distributed duplication times) was 10.5 min (data not shown), about twice as long as the chosen
5 min cut-off; thus, the association is not coincidental.
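The matching and scoring procedure above can be sketched as follows. The text does not specify how unique matches were resolved, so the greedy closest-pair-first assignment used here is an assumption, as are the function names and the example times.

```python
def match_events(peak_times, dup_times, cutoff=5.0):
    """Uniquely pair peaks with duplication events no more than `cutoff` minutes apart.

    Greedy assignment (closest pairs first); this strategy is an assumption.
    """
    candidates = sorted(
        (abs(p - d), i, j)
        for i, p in enumerate(peak_times)
        for j, d in enumerate(dup_times)
        if abs(p - d) <= cutoff
    )
    used_peaks, used_dups, matches = set(), set(), []
    for _, i, j in candidates:
        if i not in used_peaks and j not in used_dups:
            used_peaks.add(i)
            used_dups.add(j)
            matches.append((i, j))
    return matches


def precision_recall(peak_times, dup_times, cutoff=5.0):
    """Precision = matched / retrieved peaks; recall = matched / relevant duplications."""
    n_matched = len(match_events(peak_times, dup_times, cutoff))
    precision = n_matched / len(peak_times) if peak_times else 0.0
    recall = n_matched / len(dup_times) if dup_times else 0.0
    return precision, recall


# Hypothetical times in minutes for one embryo.
example_peaks = [10.0, 22.0, 41.0, 60.0]
example_dups = [12.0, 25.0, 33.0, 58.5, 80.0]
precision, recall = precision_recall(example_peaks, example_dups, cutoff=5.0)
```

Here three of the four peaks lie within 5 min of a distinct duplication event, giving a precision of 0.75 and a recall of 0.6. Repeating the calculation over a range of cut-offs would produce a curve of the kind used to select the elbow point.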
In addition, we assessed the precision and recall performance over different possible threshold values (based on peak prominence)
used to call a Plk4 ‘peak’. To do this, we computed the precision-recall curve. All detected Plk4 peaks (Black dots; Figure 5E) were
ranked according to their peak prominence from high to low and were assigned uniquely to a duplication event according to a 5 min
time window for determining a positive match. Peaks that could not be uniquely assigned in such a manner were regarded as ‘negatives’.
The graph then plots the precision and recall values obtained if the threshold for calling a peak were set to the peak prominence value of
each peak in descending order. Beyond the detected peak associated with a peak prominence of 0.12 (i.e., points right of this point),
the precision drops sharply. At this threshold, precision and recall are jointly optimized. This suggests that a minimum level of Plk4-
NG peak fluorescence intensity is required to predict duplication. The ability of Plk4 peaks to predict duplication across all peak
prominences (over the selected time window of 5 min) is quantified by the integrated area under the curve or average precision
(AP5min). If there were no overall correlation between a Plk4 peak and a duplication event, AP5min would be 51.5% (given by
# duplications / (# duplications + # peaks) = 52 / (52 + 49)); the score of 75% indicates a strong overall correlation (Figure 5E).
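A precision-recall curve of this kind, and the average precision derived from it, can be sketched as below. The step-wise summation used for the area under the curve is one common convention and is assumed here; the prominences and match labels are hypothetical. The chance level uses the counts quoted in the text (52 duplications, 49 peaks).

```python
def precision_recall_curve(prominences, matched, n_duplications):
    """Precision and recall as the prominence threshold is lowered peak by peak."""
    order = sorted(range(len(prominences)), key=lambda k: prominences[k], reverse=True)
    precisions, recalls, hits = [], [], 0
    for rank, k in enumerate(order, start=1):
        hits += 1 if matched[k] else 0
        precisions.append(hits / rank)
        recalls.append(hits / n_duplications)
    return precisions, recalls


def average_precision(precisions, recalls):
    """Area under the precision-recall curve as a step-wise sum (assumed convention)."""
    area, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        area += p * (r - previous_recall)
        previous_recall = r
    return area


# Hypothetical peaks: prominence and whether each was matched within the 5 min window.
prominences = [0.9, 0.4, 0.3, 0.12, 0.05]
matched = [True, True, False, True, False]
precisions, recalls = precision_recall_curve(prominences, matched, n_duplications=4)
ap = average_precision(precisions, recalls)

# Chance level quoted in the text: # duplications / (# duplications + # peaks).
chance_level = 52 / (52 + 49)
```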
Finally, the correlation between the Plk4-NG peaks and times of centriole duplication was examined (Figure 5F), which provided an
alternative accuracy test. Plk4-NG peaks were uniquely matched to the nearest centriole division times, without using a temporal
cut-off, over individual centrioles from three independent embryos. Pearson correlation r, R² and P values are reported as measures of goodness of fit.
The fitted regression line is y = 0.87x + 3.69. Together, these unbiased computational analyses indicate that the Plk4 oscillations at
individual centrioles are highly correlated with the time at which these centrioles duplicate.
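The correlation statistics reported above (Pearson r, R² and P value, together with the fitted line) can be obtained in one call with `scipy.stats.linregress`. The sketch below uses hypothetical matched times generated around the reported line; which variable was plotted on which axis is not stated in this section, so the assignment of x and y here is an assumption.

```python
import numpy as np
from scipy.stats import linregress

# Hypothetical matched times (minutes): duplication time vs. nearest Plk4-NG peak time.
duplication_times = np.arange(0.0, 60.0, 5.0)
noise = np.where(np.arange(duplication_times.size) % 2 == 0, 1.0, -1.0)
peak_times = 0.87 * duplication_times + 3.69 + noise

fit = linregress(duplication_times, peak_times)
r_squared = fit.rvalue ** 2  # fit.slope, fit.intercept and fit.pvalue are also available
```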
Spatiotemporal heatmap of centriole duplications

To visually assess whether there is bias as to where and when centriole duplications happen, the (x,y) positions of all duplicating
centrioles were overlaid on the (x,y) positions of all centrioles (duplicating and non-duplicating at all times; black dots) (Figure S8E). To
grade the temporal sequence of centriole separations, duplication points were colored from blue to red. To enable comparison across
embryos relative to the same geometric reference, the embryonic width and anterior-posterior axes were set as the x- and y-axes,
respectively, by applying a principal component analysis to the extracted (x,y) positions of all centrioles.
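The axis alignment described above can be sketched with a principal component analysis computed by singular value decomposition. The sketch assumes that the direction of largest variance corresponds to the anterior-posterior axis (because the embryo is elongated) and the second component to the embryo width; the function name and the synthetic positions are hypothetical.

```python
import numpy as np


def align_to_embryo_axes(xy):
    """Rotate (x, y) points so that column 0 is embryo width and column 1 the long axis."""
    xy = np.asarray(xy, dtype=float)
    centered = xy - xy.mean(axis=0)
    # Rows of `vt` are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    long_axis = centered @ vt[0]   # assumed to be the anterior-posterior axis
    short_axis = centered @ vt[1]  # assumed to be the embryo width
    return np.column_stack([short_axis, long_axis])


# Hypothetical centriole positions: an elongated grid rotated by 30 degrees.
along, across = np.meshgrid(np.linspace(-10, 10, 21), np.linspace(-2, 2, 5))
theta = np.deg2rad(30.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
positions = np.column_stack([along.ravel(), across.ravel()]) @ rotation.T + [50.0, 80.0]
aligned = align_to_embryo_axes(positions)
```

Note that the sign of each principal axis is arbitrary, so anterior and posterior cannot be distinguished by this step alone.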
Spatial clustering assessment of centriole duplications

To statistically measure whether centriole duplications are enriched at particular spatial regions over time, the Ripley K statistic was used (Ripley, 1976). In spatial statistics, given a set of (x,y) points, the Ripley K statistic detects deviations from spatial homogeneity at different distances between points, or spatial scales (such analyses are heavily used, for example, in geophysics to map the spatial distribution of natural disasters and in crime statistics to detect high-incidence areas). For a dataset of n points, the Ripley K statistic, $K_{\mathrm{Ripley}}(d)$, is the mean spatial occurrence of two points, i and j, having a separation distance, $d_{ij}$, less than the search distance threshold d:
$$K_{\mathrm{Ripley}}(d) = \frac{1}{\lambda} \sum_{i \neq j} \frac{I(d_{ij} < d)}{n}$$
where $\lambda$ is the average density of points (estimated as n/A, where the total number of points, n, is divided by the area of the region containing all points, A), and I is the indicator function (I = 1 if its operand is true, 0 otherwise). Thus, if points are homogeneously spread in 2D, the Ripley K statistic should vary quadratically as $\pi d^2$. The basic test assumes that (x,y) points can occur at any spatial position continuously in the image. However, centrioles only duplicate at certain discrete positions within fly embryos. Thus, to examine evidence of spatial clustering relative to the natural distribution of centrioles, we assessed the difference between the Ripley K statistics computed from the (x,y) positions of all duplicating centrioles and from the (x,y) positions of all centrioles accumulated over time.
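As an illustration of the statistic defined above, the following sketch computes a naive Ripley K estimate for a set of points. It is not the authors' implementation: the function name is hypothetical and no edge correction is applied, which a production analysis would normally include.

```python
import numpy as np

def ripley_k(points, d, area):
    """Naive Ripley K estimate (no edge correction) at search distance d."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    lam = n / area                               # average density, lambda = n / A
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # all pairwise distances d_ij
    off_diag = ~np.eye(n, dtype=bool)            # exclude the i == j pairs
    count = np.count_nonzero(dist[off_diag] < d)
    return count / (lam * n)
```

Evaluating the function over a range of d for the duplicating centrioles and for all centrioles, and then subtracting the two curves, gives the kind of difference described in the text.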
Plk4-NG smooth curve fitting and parameter extraction

To enable accurate extraction of signal parameters from the Plk4-NG oscillations, y(t), the signals were robustly fit to a smooth 1D function. Here, a mixture of N Gaussians and a linear trend was used, with the following functional form, to enable local modeling of the peaks and troughs of the signals:
$$y(t) = (A + Bt) + \sum_{i=1}^{N} C_i \, e^{-\frac{(t - \mu_i)^2}{\sigma_i^2}}$$
where t is time, A and B are the constant and slope of a linear trend line, and $C_i$, $\mu_i$, and $\sigma_i$ are the amplitude, time, and temporal duration of the ith Gaussian, respectively. This function was fit in two steps. In the first step, A and B were fit by applying least-squares linear regression to the baseline trend line extracted by asymmetric baseline smoothing (Eilers and Boelens, 2005). In the second step, peak and trough positions were first detected on the de-trended signal, y′(t), obtained by subtracting the trend line fitted in the first step from the original signal, y(t), so as to determine the number N of Gaussians to fit. The mixture of Gaussians was then fit by iterative non-linear regression using a robust Cauchy loss function. From the fitted signal, $y(t)_{\mathrm{fitted}}$, peak and trough positions were re-detected, and the following signal parameters were extracted (Figure 6):
• Acceleration rate (ΔFluo. A.U./sec): The maximum rate of increase in fluorescence between successive time points during a trough-to-peak oscillation phase.
• Deceleration rate (ΔFluo. A.U./sec): The maximum rate of decrease in fluorescence between successive time points during a peak-to-trough oscillation phase.
• Oscillation peak time (s): The time point corresponding to the maximum fluorescence.
• Oscillation trough time (s) into mitosis: The end point of a trough after which the fluorescence begins to accelerate upward.
The method described here was also used to determine the period of Plk4 oscillations (measuring peak-to-peak time; see Results
and Discussion) both in normal embryos and in embryos where dsRNA was injected against cyclins A, B and B3 to halt the progres-
sion of cell cycle.
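The functional form and the parameter definitions above can be illustrated with a short sketch. The function names are hypothetical, the fitting step itself (baseline smoothing and robust non-linear regression) is not reproduced, and the extraction routine assumes for simplicity that the window contains a single trough-to-peak-to-trough oscillation.

```python
import numpy as np

def gaussian_mixture_trend(t, A, B, C, mu, sigma):
    """y(t) = (A + B*t) + sum_i C_i * exp(-(t - mu_i)**2 / sigma_i**2)."""
    t = np.asarray(t, dtype=float)
    C, mu, sigma = (np.asarray(v, dtype=float) for v in (C, mu, sigma))
    bumps = C * np.exp(-((t[:, None] - mu) ** 2) / sigma ** 2)
    return (A + B * t) + bumps.sum(axis=1)

def extract_parameters(t, y):
    """Peak time and the largest rise/fall rates between successive time points."""
    t, y = np.asarray(t, dtype=float), np.asarray(y, dtype=float)
    rates = np.diff(y) / np.diff(t)              # rates[k] spans t[k] -> t[k + 1]
    i_peak = int(np.argmax(y))
    return {
        "peak_time": t[i_peak],
        # maximum rate of increase on the way up to the peak
        "acceleration_rate": rates[:i_peak].max() if i_peak > 0 else 0.0,
        # maximum rate of decrease (reported as a positive number) after the peak
        "deceleration_rate": -rates[i_peak:].min() if i_peak < len(y) - 1 else 0.0,
    }
```

For signals containing several oscillations, the same extraction would be applied to each trough-to-trough segment in turn.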
3D-Structured Illumination Microscopy (3D-SIM)

Living embryos were imaged at room temperature using a DeltaVision OMX V3 Blaze microscope (GE Healthcare). The system was equipped with a 60x/1.42-NA oil UPlanSApo objective (Olympus Corp.), 488 nm and 593 nm diode lasers, and Edge 5.5 sCMOS cameras (PCO). Spherical aberration was reduced by matching the refractive index of the immersion oil (1.514) to that of the embryos. 3D-SIM image stacks consisting of six slices at 0.125 µm intervals were acquired in five phases and from three angles per slice. The raw acquisition was reconstructed using softWoRx 6.1 (GE Healthcare) with a Wiener filter setting of 0.006 and channel-specific optical transfer functions (OTFs). Filters used for the green and red channels were a 540/80 center band pass filter and a 605-long pass filter, respectively. For two-color 3D-SIM, images from green and red channels were registered with the alignment coordination information obtained from the calibrations using 0.2 µm-diameter TetraSpeck beads (Thermo Fisher Scientific) in the OMX Editor software. The SIMCheck plug-in in ImageJ (National Institutes of Health) was used to assess the quality of the SIM reconstructions (Ball et al., 2015); only images that passed this test were used.
Mathematical modeling and its experimental validation

Model 1: A simple mathematical model of discrete S-phase Plk4 oscillations
Figures 3A and 3B specify a regulatory network where Plk4 binds to an Asl receptor with high affinity; this activates Plk4, allowing it to
phosphorylate itself and Asl multiple times. After a certain number of phosphorylations, Asl switches to a new state that binds Plk4
with low affinity. As a result, Plk4 unbinds, leaving Asl in a phosphorylated, low affinity state. We suspect that Asl is normally dephosphorylated by a phosphatase in M-phase, which “resets” it to a high-affinity state in preparation for the next oscillation in S-phase
(this additional step is considered in Model 2 below). In this first model, the gradual conversion of centriolar Asl to a low affinity binding
state forms a time-delayed negative feedback loop wherein Asl effectively activates Plk4 to gradually promote its own inhibition. After
making assumptions about the chemical kinetics of the system and imposing suitable initial conditions, the behavior of this regulatory
network can be simulated by mapping it onto a set of coupled ordinary differential equations.
In the model, it is assumed that the diffusion of Plk4 is sufficiently fast such that it remains well-mixed in the cytoplasm and that
centrioles are large macromolecular structures; this implies that Plk4 and Asl receptors on the centriole follow mass-action kinetics.
Let $[P]$ and $[A_0]$ denote the concentrations of unbound cytosolic Plk4 and unbound centriolar Asl (per unit volume and area, respectively). Plk4 binds to Asl with the fixed rate constant k, and the rate constant of the reverse reaction is sufficiently small that any unbinding is ignored. Once Plk4 is bound to Asl, it can only unbind after it has phosphorylated Asl a certain number of times (N). We denote by $[A_0^*]$ the concentration of Asl receptor that has bound Plk4, but has not yet been phosphorylated, and by $[A_i^*]$ the concentration of Asl receptors that have been phosphorylated i times. Throughout this paper, we use an asterisk superscript to denote an Asl receptor that has bound Plk4 and a numerical subscript to denote the phosphorylation state of the receptor. Each phosphorylation of Asl by Asl-bound Plk4 has rate constant $k_2$ and, following N phosphorylations, the Asl is switched to a state that binds Plk4 with very low affinity, $[A_N^*]$. Once Asl has been converted to this low affinity state, Plk4 unbinds at rate $k_3$. The rate constant of the reaction where Plk4 binds to Asl receptor in this low affinity state is assumed to be sufficiently small that this reaction is ignored in the model.
Intuitively, k scales the affinity with which Plk4 binds to an Asl receptor. By mass-action kinetics, the rate of this reaction is given by $k[P][A_0]$. It is assumed in the model that Plk4 is abundant enough in the cytoplasm that its concentration does not decrease over the few minutes of a single S-phase cycle; this assumption means that $[P]$ remains constant over that time (see below for further discussion of this assumption). Therefore, the number of parameters in the model is reduced by introducing a new rate constant $k_1 = k[P]$.
Using the assumptions above, the regulatory network in Figures 3A and 3B is simulated using the following set of ordinary differential equations, which are solved over the time domain $0 \le t \le S$, where S is the length of S-phase:
$$\frac{d[A_0^*]}{dt} = k_1 [A_0] - k_2 [A_0^*] \tag{1}$$
$$\frac{d[A_1^*]}{dt} = k_2 [A_0^*] - k_2 [A_1^*] \tag{2}$$
$$\vdots$$
$$\frac{d[A_N^*]}{dt} = k_2 [A_{N-1}^*] - k_3 [A_N^*] \tag{3}$$
$$\frac{d[A_0]}{dt} = -k_1 [A_0] \tag{4}$$
Appropriate initial conditions at t = 0 are
$$[A_0^*] = \hat{A}_0^*, \quad [A_0] = \hat{A}_0, \quad [A_1^*] = [A_2^*] = \cdots = [A_N^*] = 0. \tag{5}$$
In Equation 5, the positive constant $\hat{A}_0^*$ is the initial amount of Plk4-bound Asl at the centriole at the start of each S-phase, which is determined experimentally for each cell cycle using the techniques described in the Image acquisition, processing and analysis section of STAR Methods. The constant $\hat{A}_0$ is the initial amount of unbound Asl, so the total amount of Asl in the system is given by $A_{tot} = \hat{A}_0^* + \hat{A}_0$.
Since the model specified by Equations 1–5 is a system of linear differential equations with constant coefficients, it has an analytical
solution that can be expressed as a sum of exponentials. Values for the parameters $\hat{A}_0$, $k_1$, $k_2$, and $k_3$ can then be determined by fitting the curve $[A_0^*](t) + [A_1^*](t) + \cdots + [A_N^*](t)$ to the experimentally measured data for the amount of Asl-bound Plk4 (i.e., the Plk4 that is
recruited to the centriole) over time. Fitting was done using a trust-region algorithm to optimize a nonlinear least-squares penalty
function.
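To make the structure of Model 1 concrete, the sketch below integrates Equations 1–4 numerically with a fixed-step fourth-order Runge-Kutta scheme and returns the total Asl-bound Plk4 over time. It is an illustration only: the study fits the analytical solution, the function names are hypothetical, and apart from the value of k3 quoted in this section the parameter values used in the example call are arbitrary.

```python
import numpy as np

def model1_rhs(state, k1, k2, k3):
    """Right-hand side of Equations 1-4; state = [A0, A0*, A1*, ..., AN*] (N >= 1)."""
    a0, bound = state[0], state[1:]
    d = np.empty_like(state)
    d[0] = -k1 * a0                                  # Eq. 4: unbound Asl is consumed
    d[1] = k1 * a0 - k2 * bound[0]                   # Eq. 1
    d[2:-1] = k2 * bound[:-2] - k2 * bound[1:-1]     # Eq. 2 and its analogues
    d[-1] = k2 * bound[-2] - k3 * bound[-1]          # Eq. 3: Plk4 leaves at rate k3
    return d

def simulate_model1(a0_hat, k1, k2, k3, n_sites=9, t_end=600.0, dt=0.1,
                    a0_star_hat=0.0):
    """Return times, total Asl-bound Plk4 over time, and the final state vector."""
    state = np.zeros(n_sites + 2)
    state[0], state[1] = a0_hat, a0_star_hat         # initial conditions (Eq. 5)
    steps = int(round(t_end / dt))
    times = np.linspace(0.0, steps * dt, steps + 1)
    bound_total = np.empty(steps + 1)
    bound_total[0] = state[1:].sum()
    for i in range(steps):
        # classical fourth-order Runge-Kutta step
        s1 = model1_rhs(state, k1, k2, k3)
        s2 = model1_rhs(state + 0.5 * dt * s1, k1, k2, k3)
        s3 = model1_rhs(state + 0.5 * dt * s2, k1, k2, k3)
        s4 = model1_rhs(state + dt * s3, k1, k2, k3)
        state = state + dt / 6.0 * (s1 + 2 * s2 + 2 * s3 + s4)
        bound_total[i + 1] = state[1:].sum()
    return times, bound_total, state

# Illustrative call: k1 and k2 are arbitrary; k3 is the value quoted in the text.
times, bound, final_state = simulate_model1(1.0, k1=0.02, k2=0.05, k3=0.06906)
```

The returned curve rises as Plk4 binds and falls once receptors reach the Nth phosphorylation state, which is the qualitative shape that is fitted to the discrete S-phase oscillation data.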
Parameter fitting
The fitting was constrained to enforce that all parameters were positive, and k1 and k2 were taken to be less than 1. Each cycle was
fitted individually using the discrete Plk4-NG oscillation data from S-phase of cycles 11, 12 and 13 (Figure 3C). Parameter values are
shown in Data S1 (First, second and third charts). As explained below, the solutions to this model are very insensitive to variations in k3
(see Data S1, the Monte Carlo analysis), so in the solutions presented here k3 was kept at a constant value of 0.06906, which was the
best-fit parameter value for cycle 12 (Data S1; first, second and third charts).
Picking the value of N, the number of phosphorylation sites:
In the model, we assumed that Asl had to be phosphorylated by Plk4 N times before it switched to a low-affinity state, as indicated by the variables $[A_0^*], \ldots, [A_N^*]$. We tested the effect of the number of phosphorylation sites on the model solution by using N = 1, 4, 9, 14, or 16. The best-fit curves for $[A_0^*](t) + [A_1^*](t) + \cdots + [A_N^*](t)$ suggested that the model is a good fit for the data for any value of N > 4 (N = 1 (R² = 0.9152), N = 4 (R² = 0.9886), N = 9 (R² = 0.9996), N = 14 (R² = 0.9962) or N = 16 (R² = 0.9931)). We therefore use N = 9, corresponding to 10 phosphorylation sites of Asl (1 unbound, 9 bound with various stages of phosphorylation), in all subsequent modeling, although we note that any value above 4 works essentially equally well.
Data S1, first chart, shows that the trust-region algorithm finds a very good fit (R² > 0.99) for the model to the experimental data
(Figure 3C), but this provides little information about uniqueness of the fit as there may be other subsets of the parameter space that
also provide a good fit to the data. To see if any such regions could be detected, the parameter space was further explored by using a
Metropolis-Hastings Markov chain Monte Carlo algorithm. Four Markov chains were started at the positions in the parameter space
specified in Data S1 (fourth chart). The Monte Carlo analysis in Data S1 shows the six two-dimensional traces of the four-dimensional
parameter space. For clarity, only points that provided a good fit to the cycle 12 data (R² > 0.95) are shown.
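The kind of parameter-space exploration described above can be sketched with a random-walk Metropolis sampler, which is the special case of Metropolis-Hastings with a symmetric proposal. This is a generic illustration, not the sampler used in the study: the function name, the Gaussian proposal, and the `log_target` callable (for example, a negative scaled sum of squared residuals of the model fit) are all assumptions.

```python
import math
import random

def metropolis_hastings(log_target, start, n_steps, step_sd, seed=0):
    """Random-walk Metropolis sampler over a list of parameter values."""
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_target(current)
    chain = [list(current)]
    for _ in range(n_steps):
        # symmetric Gaussian proposal around the current position
        proposal = [p + rng.gauss(0.0, step_sd) for p in current]
        lp = log_target(proposal)
        # accept with probability min(1, target(proposal) / target(current))
        if math.log(1.0 - rng.random()) < lp - current_lp:
            current, current_lp = proposal, lp
        chain.append(list(current))
    return chain
```

Running several chains from different starting positions, as was done with the four chains here, helps to reveal whether they converge on the same region of the parameter space.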
The results in Data S1 (the Monte Carlo analysis) reveal how sensitive the model is to changes in each parameter value. The model only fit the data well for a relatively narrow range of values for $k_1$, $k_2$, and $\hat{A}_0$. In contrast, the fits are mostly insensitive to $k_3$. This is likely because the rate of phosphorylating Asl at multiple sites is relatively slow compared to the rate at which Plk4 is subsequently released from the multiply phosphorylated Asl, so the rate of release is not limiting. The Monte Carlo simulations also reveal correlations between $k_1$ and $k_2$, $k_1$ and $\hat{A}_0$, and $k_2$ and $\hat{A}_0$. For example, these results show that if $\hat{A}_0$ (the initial amount of unbound Asl receptor at the start of S-phase) is reduced, the model can still fit the data well if $k_2$ is decreased and $k_1$ is increased. While these results suggest that there is a single, continuous region of the parameter space that provides a good fit to the data, it is still possible that there are other such regions that the Markov chains in Data S1 (the Monte Carlo analysis) did not explore. However, the results in Data S1 (the Monte Carlo analysis, panel B) show that the points at the center of the parameter region provide the best fit to the data. This suggests that the nonlinear least-squares minimum found by the trust-region fitting is insensitive to the initial seed.
Interestingly, the best-fit parameters for cycles 11–13 showed that the biggest difference between the parameters of the Plk4 os-
cillations at each cycle is in k1—the rate at which Plk4 binds to Asl (which is dependent on the cytosolic concentration of Plk4).
Although our model assumes that the cytosolic concentration of Plk4 remains constant during the S-phase period within each cycle,
if the phosphorylated Plk4 molecules that are released from the Asl receptor are ultimately degraded—and there is good evidence
that Asl activates Plk4 to promote Plk4 degradation (Klebba et al., 2015)—there could be a wave of phosphorylated-Plk4 degradation
in the cytoplasm toward the end of S-phase. If so, the cytosolic levels of Plk4 would get successively lower at the start of each suc-
cessive cycle, as our PeCoS analysis indicates is the case (Figure 3E).
The effects of reducing the genetic dose of Plk4 by half (Plk4-NG1/2 embryos; see Results and Discussion for details) were analyzed. Our PeCoS analysis indicated that there was a 45% drop in the cytosolic levels of the Plk4-NG protein in the Plk4-NG1/2 embryos (Figure S6C). When the model was fit to the Plk4-NG1/2 oscillation, the best-fit (R² = 0.996) parameters had a $k_1$ value that was 39% of the control value (Data S1, second chart), so in reasonable agreement with the 45% drop in cytosolic Plk4 levels we measured experimentally. These parameter values also suggested that the total amount of centriolar Asl ($A_{tot}$) should remain relatively unchanged between the Plk4-NG and Plk4-NG1/2 conditions (Data S1, second chart). Centriolar Asl levels were analyzed in embryos expressing Asl-mCherry in either WT or Plk41/2 conditions, and our findings showed that this was indeed the case (Figure S7A).
Next, the effects of reducing the genetic dose of asl by half (asl1/2 embryos—see Results and Discussion for details) were analyzed.
Interestingly, the best-fit parameter values (R² = 0.999) predicted that the total centriolar Asl levels ($A_{tot}$) would be reduced by only
28% in asl1/2 embryos (Data S1, third chart). This value was therefore directly measured in embryos expressing either one or two
copies of Asl-GFP (under the control of its own promoter in an asl mutant background). Encouragingly, our findings showed that
reducing the genetic dose of Asl-GFP by half led to a reduction of only 30% in centriolar Asl-GFP levels (Figure S7B). Moreover,
the parameter values suggested that the concentration of Plk4 (incorporated in the k1 term) should not vary significantly between
WT and asl1/2 conditions (Data S1, third chart). We confirmed this prediction using western blotting for Plk4-GFP and PeCoS for
Plk4-NG (both transgenically expressed from their own promoters in a Plk4 mutant background) in control and asl1/2 embryos (Fig-
ures S7C and S7D).
Taken together, these analyses indicate that our model can robustly describe the Plk4-NG oscillations under normal conditions
(Figure 3C) and when the levels of either Plk4 or Asl are perturbed experimentally (Figures 4A and 4B). Moreover, the model makes
several plausible predictions about the relative levels of these proteins in the perturbed conditions that are close to the levels that we
measured experimentally.
Finally, the best-fit value of k2 (reflecting the kinase activity of individual Plk4 molecules) decreased slightly between cycles 11 and 12
and decreased more significantly between cycles 12 and 13 (by 9% and 37%, respectively); k2 also decreased when levels of Plk4
were genetically reduced in Plk4-NG1/2 embryos (by 25%)—but not when Asl levels were genetically reduced in asl1/2 embryos
(Data S1; first, second and third charts). The molecular basis for this inferred decrease in kinase activity remains unknown, but we
believe it is biologically plausible. We previously suggested that centriolar Plk4 was likely to integrate several inputs at the start of
each cycle (from, for example, cell cycle regulators, or its activator Ana2/STIL) and adjust its kinase activity in response to the length-
ening of S-phase during successive nuclear cycles (Aydogan et al., 2018). Moreover, our finding that Wee1 kinase, an important cell
cycle regulator, can influence the Plk4 oscillation parameters in S-phase strongly supports this hypothesis (Figure 6).
Model 2: Generating robust Plk4 oscillations entrained by the CCO

The network described above was further extended to test the possibility of generating robust oscillations in centriolar Plk4 levels. To do so, we allow the Asl receptors to be dephosphorylated by a phosphatase at rate $k_4$ (Figure S4A). Subject to the constraint that there is no Plk4 bound to the Asl receptors initially (i.e., at the start of cycle 1), we model multiple cycles by allowing the phosphatase to be active only during mitosis, so that $k_4$ is nonvanishing only in this period. Therefore, the system reads:
$$\frac{d[A_0^*]}{dt} = k_1 [A_0] - k_2 [A_0^*] \tag{6}$$
$$\frac{d[A_1^*]}{dt} = k_2 [A_0^*] - k_2 [A_1^*] - k_4 [A_1^*] \tag{7}$$
$$\vdots$$
$$\frac{d[A_N^*]}{dt} = k_2 [A_{N-1}^*] - k_3 [A_N^*] - k_4 [A_N^*] \tag{8}$$
$$\frac{d[A_0]}{dt} = k_4 [A_1] - k_1 [A_0] \tag{9}$$
$$\frac{d[A_1]}{dt} = k_4 [A_2] - k_4 [A_1] \tag{10}$$
$$\vdots$$
$$\frac{d[A_N]}{dt} = k_3 [A_N^*] - k_4 [A_N] \tag{11}$$
subject to the initial conditions
$$[A_0] = 1, \quad [A_1] = \cdots = [A_N] = 0, \quad [A_0^*] = \cdots = [A_N^*] = 0. \tag{12}$$
It is further assumed that the embryo is in mitosis for 30% of the total time in each nuclear cycle and all cycle times are kept constant. Hence, $k_4 = 0$ for $0 < t \bmod T < 0.7T$ and $k_4$ is a positive constant for $0.7T < t \bmod T < T$, where T is the period of the cell cycle. Values for the rate constants are determined by fitting the exact analytical solution in the S-phase of cycle 12 to the Lorentzian regression of the experimental data (R² = 0.9870). In the first instance, we assumed that the cytosolic concentration of Plk4 remains constant over the nuclear cycles (see below). We plot the exact solution for the percentage of Asl-bound Plk4 molecules for a total of 14 nuclear cycles, as is the case in fly embryos. This minimal model was sufficient to generate sustained oscillations in centriolar Plk4 levels (Figure S4B; $k_4$ = 0.0708).
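The cell-cycle gating of the phosphatase described above amounts to a simple periodic switch, sketched below. The function name is hypothetical, and the treatment of the boundary at exactly 0.7T, which the text leaves unspecified, is an assumption.

```python
def phosphatase_rate(t, period, k4_mitosis, mitosis_fraction=0.3):
    """k4(t): zero for the first 70% of each cycle, k4_mitosis for the final 30%."""
    phase = (t % period) / period                # position within the cycle, in [0, 1)
    # Assumption: the switch point itself is counted as mitosis
    return k4_mitosis if phase >= 1.0 - mitosis_fraction else 0.0
```

Passing this time-dependent rate to a numerical integrator in place of a constant k4 reproduces the burst of dephosphorylation that resets the receptors once per cycle.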
As an alternative to the assumption that the cytosolic concentration of Plk4 is constant over the cycles, we also considered the
case where the total number of Plk4 molecules in the embryo is kept constant (so any Plk4 degradation is balanced by new synthesis).
In this model, as the number of centrioles, $N_C(t)$, increases at successive cycles, the number of available Plk4 molecules in the cytosol initially decreases during S-phase (as Plk4 binds to centriolar Asl receptors), and then increases (as Plk4 unbinds from Asl receptors). We estimate that there are $N_P = 10^5$ molecules of Plk4 in an embryo of 0.01 mm³ volume (Markow et al., 2009); a concentration of 10 nM, in agreement with that measured in human cells (Yamamoto and Kitagawa, 2019), but potentially higher than we infer from our observation that Plk4 levels are too low to be measured by FCS (as we cannot infer absolute protein concentration from
our PeCoS experiments). To simulate the effect of centriole duplications, we double the number of centrioles each cycle and assume
that, at each centriole duplication, the bound Plk4 (attached to Asl receptors) is equally split between the mother and separating
daughter. To consider the Cdk/Cyclin trigger wave that sweeps through embryos (Deneke et al., 2016), it is assumed that the
duplicated centrioles separate nearly synchronously over the last 10% of the time-window in each cycle. Based on our 3D-SIM microscopy data, we assume that each centriole has 30 Asl receptors, as we essentially only need to consider Plk4 binding to Asl at the site of centriole assembly (the model works well for between 20 and 80 receptors), and that there is a single centriole in cycle 1. With these modifications to the model, the system reads
$$\frac{d[A_0^*]}{dt} = k_1 [P][A_0] - k_2 [A_0^*] - [A_0^*] \frac{1}{N_C} \frac{dN_C}{dt} \tag{13}$$
$$\frac{d[A_1^*]}{dt} = k_2 [A_0^*] - k_2 [A_1^*] - k_4 [A_1^*] - [A_1^*] \frac{1}{N_C} \frac{dN_C}{dt} \tag{14}$$
$$\vdots$$
$$\frac{d[A_N^*]}{dt} = k_2 [A_{N-1}^*] - k_3 [A_N^*] - k_4 [A_N^*] - [A_N^*] \frac{1}{N_C} \frac{dN_C}{dt} \tag{15}$$
$$\frac{d[A_0]}{dt} = k_4 [A_1] - k_1 [A_0] + [A_0^*] \frac{1}{N_C} \frac{dN_C}{dt} \tag{16}$$
$$\frac{d[A_1]}{dt} = k_4 [A_2] - k_4 [A_1] + [A_1^*] \frac{1}{N_C} \frac{dN_C}{dt} \tag{17}$$
$$\vdots$$
$$\frac{d[A_N]}{dt} = k_3 [A_N^*] - k_4 [A_N] + [A_N^*] \frac{1}{N_C} \frac{dN_C}{dt} \tag{18}$$
$$[P] = 1 - \frac{N_R N_C(t)}{N_P} \sum_{n=0}^{N} [A_n^*] \tag{19}$$
subject to the initial conditions (12).

We plot the solution of the model for the percentage of Asl-bound Plk4 molecules, as well as the percentage of Plk4 molecules that
remain in the cytoplasm, over 14 nuclear cycles (Figure S4F; R2 = 0.9871 and k4 = 0.0612).
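The depletion of cytosolic Plk4 in Equation 19 can be illustrated with a short calculation. The sketch below is an assumption-laden reading rather than the authors' code: N_R is taken to be the number of Asl receptors per centriole (30, as assumed in the text), N_P is taken to be 10^5 molecules, and the function names are hypothetical.

```python
def centrioles_in_cycle(cycle):
    """Single centriole in cycle 1, doubling at every subsequent cycle."""
    return 2 ** (cycle - 1)

def free_plk4_fraction(bound_fractions, n_centrioles, n_receptors=30, n_plk4=1e5):
    """Equation 19: [P] = 1 - (N_R * N_C / N_P) * sum_n [A_n*]."""
    # n_receptors (N_R) and n_plk4 (N_P) defaults are assumptions taken from the text
    return 1.0 - n_receptors * n_centrioles / n_plk4 * sum(bound_fractions)

# With half of all receptors occupied in cycle 7 (64 centrioles),
# less than 1% of the Plk4 pool is sequestered at centrioles.
fraction_cycle_7 = free_plk4_fraction([0.2, 0.3], centrioles_in_cycle(7))
```

Because the centriole number doubles every cycle, the sequestered fraction grows geometrically, which is why depletion only becomes noticeable in the later nuclear cycles.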
We observe a small spike as the centrioles begin to separate at the end of each mitosis (Figure S4F). The spike is small, due to the
slight asynchrony in centriole separations; if the centrioles were to separate all simultaneously, the concentration would instantly
halve at this point, since the number of receptors would all double. Interestingly, these small spikes are consistent with our exper-
imental observations (Figures 1A and S2A). Moreover, we emphasize that, for the first few nuclear cycles, almost all of the Plk4 re-
mains in the cytoplasm since there are only a few centrioles. In the later cycles, however, the amount of Plk4 sequestered by the Asl
receptors increases exponentially, as the number of centrioles increases by a factor of 2 in each cycle. Therefore, the rate at which the
Asl receptors are able to recruit Plk4 from the cytoplasm decreases, resulting in a reduction in the amplitude of the Plk4 oscillation
(Figure S4F). This feature of the model is also consistent with our experimental observations (Figures 1B and 1C).
Model 3: Stochastic duplications

Finally, we have also developed a discrete mathematical model, which is analogous to Model 2 described previously, in order to
consider the possibility of stochastic centriole duplication in non-cycling embryos (as in those injected with dsRNA against cyclins
A, B and B3; Figure S8G). As before, we assume that unphosphorylated Asl receptors bind Plk4 with high affinity until they become
fully phosphorylated and Plk4 unbinds, and we initially assume that the Asl receptors can be dephosphorylated during mitosis. We
model this system stochastically by defining the state vector for each receptor to be

$$V = \left(A_0, \ldots, A_N, A_0^*, \ldots, A_N^*\right) \tag{20}$$
At any given time, precisely one entry of V is equal to unity, corresponding to the state which the receptor is in at that moment, and all
other entries are equal to zero. We allow the receptor to change state over time according to the transition matrix:
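$$M = \begin{bmatrix}
Q_1 & 0 & 0 & \cdots & 0 & P_1 & 0 & \cdots & \cdots & 0 \\
P_4 & Q_4 & 0 & \ddots & 0 & 0 & 0 & \ddots & \ddots & \vdots \\
0 & P_4 & Q_4 & \ddots & 0 & \vdots & \vdots & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & P_4 & Q_4 & 0 & 0 & \ddots & \ddots & \vdots \\
0 & \cdots & \cdots & 0 & 0 & Q_2 & P_2 & 0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \vdots & \vdots & P_{4,2} & Q_{2,4} & P_{2,4} & 0 & \vdots \\
\vdots & \ddots & \ddots & \vdots & \vdots & 0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \ddots & \ddots & 0 & 0 & \vdots & \ddots & P_{4,2} & Q_{2,4} & P_{2,4} \\
0 & \cdots & \cdots & 0 & P_{3,4} & 0 & \cdots & 0 & P_{4,3} & Q_{3,4}
\end{bmatrix} \tag{21}$$
where
$$P_i = 1 - e^{-k_i}, \qquad P_{i,j} = \frac{k_i}{k_i + k_j} \left(1 - e^{-(k_i + k_j)}\right), \tag{22}$$
$$Q_i = e^{-k_i}, \qquad Q_{i,j} = e^{-(k_i + k_j)}, \tag{23}$$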
describe the probabilities of a receptor changing state and remaining in the same state which arise in our model. We may allow the
cytosolic Plk4 concentration to vary in this model by making the substitution $k_1 \to k_1[P]$ and using (19) to compute $[P]$ (evaluating the sum over all receptors being simulated). We also assume that, if a receptor is in a Plk4-bound state ($A_n^*$) during the last 10% of a cycle, the Plk4 will unbind ($A_n^* \to A_n$) with 50% probability during that time period in order to simulate mother-daughter separation.
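The transition probabilities in Equations 22 and 23 can be checked numerically by assembling the matrix and confirming that every row sums to one. The sketch below is one reading of Equation 21, not the authors' code: the function name is hypothetical, the states are ordered as unbound states followed by Plk4-bound states, the time step is taken to be one unit, and the rate constants in the example call are illustrative.

```python
import numpy as np

def transition_matrix(k1, k2, k3, k4, n_sites):
    """Per-step transition matrix for the states (A_0..A_N, A_0*..A_N*)."""
    def p(ki):                   # P_i = 1 - exp(-k_i)
        return 1.0 - np.exp(-ki)

    def p2(ki, kj):              # P_{i,j} = k_i / (k_i + k_j) * (1 - exp(-(k_i + k_j)))
        return ki / (ki + kj) * (1.0 - np.exp(-(ki + kj)))

    n = n_sites + 1              # number of states in each block
    m = np.zeros((2 * n, 2 * n))
    # Unbound block: A_0 binds Plk4; A_i is dephosphorylated to A_{i-1}
    m[0, 0], m[0, n] = np.exp(-k1), p(k1)
    for i in range(1, n):
        m[i, i], m[i, i - 1] = np.exp(-k4), p(k4)
    # Bound block: phosphorylation (k2) competes with dephosphorylation (k4)
    m[n, n], m[n, n + 1] = np.exp(-k2), p(k2)
    for i in range(1, n - 1):
        r = n + i
        m[r, r] = np.exp(-(k2 + k4))
        m[r, r - 1], m[r, r + 1] = p2(k4, k2), p2(k2, k4)
    # A_N*: Plk4 unbinds (k3) to give A_N, or is dephosphorylated (k4) to A_{N-1}*
    last = 2 * n - 1
    m[last, last] = np.exp(-(k3 + k4))
    m[last, n - 1], m[last, last - 1] = p2(k3, k4), p2(k4, k3)
    return m

M = transition_matrix(k1=0.02, k2=0.05, k3=0.06906, k4=0.0708, n_sites=9)
```

Each receptor can then be advanced by drawing its next state from the row of M that corresponds to its current state.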
In this model, each Asl receptor behaves as an independent oscillator—alternating between a Plk4-bound form that is being phos-
phorylated, and an unbound form that is being dephosphorylated. In the presence of the CCO, the individual Asl receptors generate
coordinated oscillations because the CCO effectively synchronizes them every cycle by ensuring a coordinated burst of PPTase ac-
tivity during mitosis. This activity is lost in the absence of the CCO, but instead we allow the PPTase to be active at a low, but constant,
level (10% of the mitotic activity in cycling embryos). We plot the Asl-bound Plk4 levels for a total of 10 centrioles (each with 30 re-
ceptors as assumed above; Figure S8H). We observe that the centrioles are initially synchronized, since they all start in an unbound
state, and display a single round of Plk4 binding. However, as time progresses, the Asl receptors lose synchrony, and each centriole
exhibits stochastic, low-amplitude oscillations. Such oscillations may be sufficient to trigger duplications at individual centrioles, as
evident from our experimental observations (Figures 5B–5F and S8B–S8F; Video S4).
All the equations used for mathematical modeling and regressions are available at the following web link: <https://github.com/RaffLab/centriole_oscillator_model>.
Fluorescence Correlation Spectroscopy (FCS)

FCS setup and measurements
Point FCS measurements were performed on a confocal Zeiss LSM 880 (Argon laser excitation at 488 nm and GaAsP detector) with the Zen Black Software. A C-Apochromat 40x/1.2 W objective and a pinhole setting of 1 AU were used. A laser power of 10 µW was used, and no photobleaching was observed during the measurements. The microscope was kept at 25 °C using the Zeiss inbuilt heating insert P and the heating unit XL. A schematic overview of the methodology used is shown in Figure S5A, and a comparison of the average autocorrelation curves generated at the start of S-phase of nuclear cycles 11-14 is shown in Figure S5B.
The effective volume of the imaging setup was estimated to be 0.28 fL by averaging the estimates obtained by three independent methods, as described previously (Rüttinger et al., 2008): 1) measuring the concentration of a soluble Alexa Fluor 488 NHS Ester dilution series (100 nM, 10 nM, 1 nM and 0.1 nM); 2) measuring the diffusion time for Alexa Fluor 488 NHS Ester (same concentrations) in water at 25 °C and comparing the measured diffusion time to a previously reported diffusion coefficient for the Alexa Fluor 488 NHS Ester (Petrásek and Schwille, 2008); 3) imaging subresolution beads (FluoSpheres Carboxylate-Modified Microspheres, 0.1 µm) and determining the effective volume via Gaussian fitting with the line tool and Z-axis profile in ImageJ (Bethesda, USA).
Embryo collections (from mother flies expressing Asl-GFP under the control of its own promoter in an asl mutant background) were as described above, with the exception of using high precision 35 mm, high Glass Bottom µ-Dishes (ibidi). Before every measurement, spherical aberrations were adjusted on the correction collar of the objective by maximizing the count-rate per molecule (CPM). At the beginning of S-phase in each cell cycle (when the old and new mother centrioles were separating), consecutive cytosolic measurements were made 6x for 10 s each at the centriolar plane of the embryo. Individual recordings where centrioles moved through the measurement spot, identified by the highly erratic shape of the correlation curve (< 2% of all recordings), were discarded.
Autocorrelation analysis and post-acquisition curve fitting

The autocorrelation function, $G(\tau)$, was calculated during each measurement in the Zen Black software using the following equation:
$$G(\tau) = \frac{\langle \delta I(t) \cdot \delta I(t + \tau) \rangle}{\langle \delta I(t)^2 \rangle}$$
where $\langle \, \rangle$ denotes a time average, $\delta I(t)$ describes the intensity fluctuation at the time point t, and $\tau$ is the lag time of the autocorrelation.
All 10 s recordings were then fitted with 8 different 3D diffusion models using the software FoCuS-point (Waithe et al., 2016) with the following equation:
$$G_{3D}(\tau) = \sum_{k=1}^{Ds} A_k \left(1 + \left(\frac{\tau}{\tau_{xy_k}}\right)^{\alpha_k}\right)^{-1} \left(1 + \frac{\tau}{AR^2 \, \tau_{xy_k}}\right)^{-1/2}$$
where $A_k$ defines the fraction of a diffusing species for which the sum of all diffusing species equals 1, $\tau_{xy}$ describes the average residence time of the diffusing species in $V_{eff}$, $\alpha$ accounts for anomalous subdiffusion within the cytoplasm, and AR is a structural parameter that describes the relationship among the x, y and z-axes of the excitation volume.
Dark states of the fluorophore were fitted with the following formula:
$$G_T(\tau) = 1 + \sum_{j=1}^{Ts} \frac{T_j}{1 - T_j} \, e^{-\tau / \tau_{T_j}}$$
where T depicts the triplet population, and $\tau_T$ is the triplet correlation time during which the fluorophore stays in the dark state (Schönle et al., 2014).
The data were fitted within the boundaries of 4×10⁻⁴ ms and 1.5×10³ ms, and the dark states were restricted to 10–300 µs for the blinking state, and 1–10 µs for the triplet state. The models (Ms) were defined as follows: M1) 1 diffusing species (ds), 0 blinking states (bs), 0 triplet states (ts); M2) 1 ds, 1 bs, 0 ts; M3) 1 ds, 0 bs, 1 ts; M4) 1 ds, 1 bs, 1 ts; M5) 2 ds, 0 bs, 0 ts; M6) 2 ds, 1 bs, 0 ts; M7) 2 ds, 0 bs, 1 ts; M8) 2 ds, 1 bs, 1 ts. In all models, the structural parameter AR and the anomalous subdiffusion parameter α were kept constant at 5 and 0.7, respectively.
In order to avoid over-fitting the data, the most plausible model to describe the autocorrelation functions was selected using the
Bayesian Information Criterion (BIC), which is based on the likelihood function, but introduces a penalty term for the complexity (num-
ber of variables) for the models (Schwarz, 1978). In this study, M4 was the preferred model to describe Asl-GFP diffusion
(Figure S5A(iv)). The concentration was calculated from the FoCuS-point fit data of the preferred model:
$$\langle N \rangle = \frac{1}{G_0}, \qquad \mathrm{conc.} = \frac{\langle N \rangle}{V_{eff}}$$
where N is the average number of particles within the effective volume $V_{eff}$, and $G_0$ represents the height of the autocorrelation function at τ = 0.
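As a worked example of this conversion, the sketch below turns an autocorrelation amplitude into a molar concentration. The function name is hypothetical, and the division by Avogadro's number, which is needed to express the result in nM, is an added step that the formula above leaves implicit.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def concentration_nM(g0, v_eff_fl):
    """<N> = 1 / G0 and conc. = <N> / V_eff, expressed in nM for V_eff in fL."""
    n_particles = 1.0 / g0
    litres = v_eff_fl * 1e-15                    # 1 fL = 1e-15 L
    return n_particles / (AVOGADRO * litres) * 1e9

# One particle on average in the 0.28 fL effective volume is roughly 5.9 nM.
example = concentration_nM(g0=1.0, v_eff_fl=0.28)
```

Halving G0 doubles the inferred particle number and therefore the concentration, which is why small amplitudes at high concentrations are harder to measure reliably.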
FCS background corrections

In order to estimate the contribution of the background noise, 22 wild-type embryos were measured with the same laser intensity (10 µW) and in roughly the same plane and developmental stage as the Asl-GFP embryos. Despite no observable correlated background, the uncorrelated background contributed 30% of the total photon count rate, presumably due to the low concentration of cytosolic Asl-GFP and the high autofluorescence of the embryo itself (Figure S5(v)). Background corrections were performed after the autocorrelation analysis by calculating the correction factor $\chi^2$ using the following formula (Koppel, 1974):
$$\frac{1}{\chi^2} = \frac{1}{\left(1 + \langle b \rangle / \langle f \rangle\right)^2}$$
$$\langle N \rangle = \frac{1}{\chi^2 G_0}$$
where $\langle b \rangle$ denotes the average background and $\langle f \rangle$ is the average count rate of the sample.
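The effect of this correction can be illustrated numerically. The sketch below simply evaluates the two formulas above; the function name and the example count rates are hypothetical.

```python
def background_corrected_n(g0, background_rate, sample_rate):
    """<N> = 1 / (correction * G0), with correction = (1 + <b>/<f>)**2."""
    correction = (1.0 + background_rate / sample_rate) ** 2
    return 1.0 / (correction * g0)

# Illustrative values: G0 = 0.1 with <b>/<f> = 0.3 gives about 5.9 particles,
# compared with 10 particles if the background were ignored.
corrected = background_corrected_n(g0=0.1, background_rate=30.0, sample_rate=100.0)
```

Ignoring an uncorrelated background therefore leads to an overestimate of the particle number, and hence of the concentration.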
Data restriction
In some FCS measurements a sudden drop in CPM was observed, possibly due to movements within the embryo or the embryo
drifting away from the measurement plane. When this happened, a strong, often unreasonable increase in concentration was
observed. These outliers were therefore discarded based on a ROUT outlier test (with the aggression factor Q = 1%), which was per-
formed on all 10 s-long concentration measurements (the red data points in Figure S5(vi)). Only the embryos with at least 4x10 s re-
cordings (after discarding outliers and erratically shaped ACFs) were included in the final analysis.
Peak Counting Spectroscopy (PeCoS)

FCS was not sensitive enough to investigate the cytosolic concentration of Plk4-NG, presumably because its concentration was too low. We therefore developed a new method, which we term Peak Counting Spectroscopy (PeCoS), that allows the relative concentration of low-abundance proteins to be measured accurately (Figure S6). PeCoS uses the same setup as the point FCS protocol described above, but it differs in its data acquisition and analysis. In PeCoS, the intensity peaks generated by fluorophores moving through the effective volume are counted as a proxy for concentration. Due to the low cytosolic concentration of the fluorescently tagged protein of interest (e.g., Plk4-NG in this study), spherical aberrations could not be corrected within the same embryo where the measurements were taken. Therefore, embryos that express a bright fluorescent centriolar marker were positioned next to the experimental embryos on the same imaging dish, and these were used for correction collar adjustment (Figure S6A(i)). Experimental recordings were then captured for 180 s (instead of 6 × 10 s), as the number of particles that pass through the field of view was usually very low. Before every measurement, the observation region was pre-bleached with the same laser intensity (10 mW) for 3 s to bleach away any potential immobile fraction.
Instead of autocorrelation analysis, the resulting intensity traces (Figure S6A(iii)) were quantified for their number of peaks, each of which originates from a fluorophore moving through the excitation volume and causing a detectable burst of photons. To determine the cut-off threshold used to subtract the background noise, 40 control embryos were measured (Figure S6A(iii)). These control embryos were from mothers expressing Asl-mKate2 to allow measurements at the centriolar plane and at the right nuclear cycle stage (beginning of S-phase). “Mean + n*SD” (where n = 1, 2, 3, …) of all control recordings was subtracted from each control recording, and the threshold that resulted in an average of fewer than five peaks (per 180 s control measurement) was subtracted from all intensity traces (Figure S6A(iv-vi)). This threshold was found to be a good compromise for minimizing the background noise without discarding too much information. A Python script that automates this procedure is available at https://github.com/RaffLab/PeCoS. The subtraction of “Mean + 8*SD” resulted in an average peak count of 3.25 for the control recordings, and this was used for the background subtraction in all in vivo measurements. In the peak detection algorithm above, a peak was defined as any consecutive value (photon count) that surpasses the subtracted threshold (Figure S6A(vi)).
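The threshold selection and peak counting described above can be sketched in a few lines of Python. This is an illustrative sketch only and is not the script from the RaffLab/PeCoS repository. It rests on two assumptions that the text does not spell out: the Mean and SD are pooled over all samples of all control recordings, and a run of consecutive samples above the threshold is counted as a single peak. All function names are hypothetical.

```python
from statistics import mean, pstdev


def count_peaks(trace, threshold):
    """Count runs of consecutive samples whose photon count exceeds threshold.

    Assumption: one uninterrupted run above threshold is one peak.
    """
    peaks = 0
    above = False
    for value in trace:
        if value > threshold:
            if not above:  # a new run starts here
                peaks += 1
            above = True
        else:
            above = False
    return peaks


def choose_threshold(control_traces, max_mean_peaks=5.0, max_n=50):
    """Return (n, threshold) for the smallest integer n such that
    Mean + n*SD leaves fewer than max_mean_peaks peaks per control
    recording on average.

    Assumption: Mean and SD are pooled over all control samples.
    """
    pooled = [value for trace in control_traces for value in trace]
    mu, sd = mean(pooled), pstdev(pooled)
    for n in range(1, max_n + 1):
        threshold = mu + n * sd
        average = mean(count_peaks(t, threshold) for t in control_traces)
        if average < max_mean_peaks:
            return n, threshold
    raise ValueError("no threshold satisfied the criterion up to max_n")


def peaks_per_minute(trace, threshold, duration_s=180.0):
    """Peak count normalized to one minute of recording."""
    return count_peaks(trace, threshold) * 60.0 / duration_s
```

In use, `choose_threshold` would be run once on the control recordings, and the resulting threshold would then be passed unchanged to `peaks_per_minute` for every experimental trace, so that all embryos are compared against the same background cut-off.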
To assess the effective concentration range of the PeCoS methodology, a two-fold dilution series of Alexa488 NHS Ester was measured, and for every sample, the ACF and the number of peaks were calculated using FCS and PeCoS (where Background = Mean ± 23*SD (water)), respectively. As expected, PeCoS did not perform well at high concentrations, where, presumably, too many particles move simultaneously through the excitation volume; at lower particle concentrations (where FCS was no longer accurate), however, the number of peaks decreased in a nearly linear fashion (Figure S6B). To test the sensitivity of PeCoS under in vivo conditions, the cytosolic concentration of Plk4-NG was measured at the beginning of nuclear cycle 12 in embryos expressing either one (1x) or two (2x) copies of Plk4-NG in the Plk4 mutant background. PeCoS analysis revealed a 90% increase in the number of peaks (per minute) in the 2x embryos compared to the 1x embryos, indicating the effectiveness of PeCoS in measuring the relative cytosolic concentration of low-abundance proteins (Figure S6C).
The details for quantification, statistical tests, sample numbers, definitions of center, and the measures for dispersion and precision
are described in the main text, relevant figure legends, or relevant sections of STAR Methods. Significance in statistical tests was
defined by p < 0.05. To determine whether the data values were normally distributed, a D’Agostino–Pearson omnibus normality
test was applied. Prism 7 and 8 were used for all the modeling and statistical analyses.
Figure S1. Summary of the Protocol for Image Acquisition, Processing, and Analysis of the Plk4-NG Oscillations, Related to Figure 1
(A) Diagram illustrates the centrioles in a ~2-h-old embryo expressing Plk4-NG being imaged on a spinning-disk confocal system.
(B) Micrograph shows a typical image of the tracks of the Plk4-NG centrioles in S-phase of cycle 12, tracked using the ImageJ plugin TrackMate.
(C) Graphs show the Plk4-NG oscillation during cycle 12 in a single embryo quantified from the tracks of either several individual centriole pairs (i), or the Mean ± SD oscillation calculated from the tracks of > 90 centriole pairs (ii). The data for each embryo were then regressed using a Lorentzian equation (red line, iii); see (D) for an explanation of the rationale for choosing this function. This process was repeated for multiple embryos to calculate a Mean ± SEM regression for nuclear cycle 12 (iv). R² values indicate the goodness of fit (Mean ± SD) of the regression. CS = time of centrosome separation (set to 0); NEB = time of nuclear envelope breakdown.
(D) Table shows the various models that were tested to fit the Plk4-NG oscillation data. R² and SSAbs (absolute sum of squares) values indicate the goodness of fit. The Lorentzian function was the best fit for the majority of embryos, so it was used for all further analyses.
Further details of these models are provided in STAR Methods.
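To make the two goodness-of-fit measures named in (D) concrete, the Python sketch below evaluates a Lorentzian curve and computes R² and the absolute sum of squares for a given parameter set. The parameterization (amplitude, center, width, baseline) is a common textbook form and is an assumption here; the exact equation used for the regressions is the one given in STAR Methods. The function names are hypothetical, and the sketch scores a fit rather than performing one.

```python
def lorentzian(t, amplitude, center, width, baseline=0.0):
    """Assumed Lorentzian form: baseline + A / (1 + ((t - t0) / w)**2)."""
    return baseline + amplitude / (1.0 + ((t - center) / width) ** 2)


def goodness_of_fit(times, observed, params):
    """Return (r_squared, ss_abs) for observed data against the model.

    ss_abs    -- absolute sum of squared residuals
    r_squared -- 1 - ss_abs / total sum of squares about the mean
    """
    predicted = [lorentzian(t, *params) for t in times]
    ss_abs = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    mean_y = sum(observed) / len(observed)
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("observed data are constant; R^2 is undefined")
    return 1.0 - ss_abs / ss_tot, ss_abs


# Illustrative data generated from the model itself (not measured values).
times = [float(t) for t in range(0, 13)]
params = (100.0, 6.0, 2.0, 10.0)  # amplitude, center, width, baseline
observed = [lorentzian(t, *params) for t in times]
r_squared, ss_abs = goodness_of_fit(times, observed, params)
print(r_squared, ss_abs)
```

Comparing candidate models on the same data then amounts to computing these two numbers for each model's best-fit parameters, as summarized in the table in (D).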
Figure S2. Plk4-NG Oscillations in Individual Embryos, Related to Figures 1 and 5

(A) Graphs show the Mean ± SD centriolar fluorescence intensity of Plk4-NG (two copies of a transgene expressed from its own promoter in a Plk4 null mutant
background) during nuclear cycles 11-13 in 5 different embryos imaged on a spinning-disk confocal system. n = 26 centrioles (mean) tracked starting from cycle
11 per embryo. See STAR Methods for full details of image acquisition and data analysis.
(B) Same as in (A), but showing the Plk4-NG oscillation in 5 embryos arrested in interphase by the injection of dsRNAs against Cyclin A, B and B3.
See Figure 5A for further details on sample numbers and experimental protocol.
Figure S3. Simultaneously Measuring Centriole Growth and the Plk4 Oscillation in the Same Embryos, Related to Figure 2
(A and B) Graphs show the same data presented in Figure 2A, but with the SEM included (as these error bars were omitted from Figure 2A for ease of presentation). CS = centrosome separation and NEB = nuclear envelope breakdown. R² values indicate the goodness of fit.
(C) Graph quantifies the embryo hatching frequency in embryos laid by either wild-type (Oregon-R) females or females simultaneously expressing Sas-6-mCherry
and Plk4-NG in a Plk4 mutant background (all mated with WT males). At least 4 technical repeats were carried out over several days, and a total of at least 400
embryos were analyzed.
(D) Cartoon graphs (i.e., imaginary data) illustrate the three different centriole growth phenotypes we observed in the Plk4 mutant embryos that simultaneously
express 2 copies of Plk4-NG and one copy of Sas-6-mCherry. In our previous analysis of centriole growth kinetics (Aydogan et al., 2018) almost all embryos
started to incorporate Sas-6-GFP at the very start of S-phase (‘‘Growth on time,’’ left graph). In the embryos analyzed here (with a more complicated genotype,
and expressing Sas-6-mCherry rather than Sas-6-GFP), some of the embryos exhibited a clear delay in initiating the incorporation of Sas-6-mCherry (‘‘Late
growth,’’ middle graph), while others did not appear to incorporate significant amounts of Sas-6-mCherry at all (‘‘No growth,’’ right graph).
(E) Pie charts quantify the percentage of embryos exhibiting each centriole growth phenotype at each nuclear cycle. Note that embryos exhibiting the ‘‘No
growth’’ phenotype were excluded from the analysis shown in (A) and (B) and in Figure 2A, although the amplitude of the Plk4 oscillations in these embryos was
analyzed separately (Figure 2C): we observed 8 embryos in total that exhibited the ‘‘No growth’’ phenotype (1 in cycle 12, and 7 in cycle 13). Centriolar Plk4-NG
levels continued to oscillate in these embryos, and the scatter graph shown in Figure 2C plots the peak amplitude of the Plk4-NG oscillations in these 8 embryos
overlaid on the average ‘‘threshold’’ level of Plk4-NG at which centrioles started to grow in the population of embryos that did exhibit Sas-6-mCherry incor-
poration. This threshold was very similar at cycle 12 and 13, so the threshold shown in Figure 2C is taken from cycle 13 embryos (as 7 of the 8 embryos shown here
were at cycle 13). The Plk4-NG oscillation in all but one of the 8 embryos failed to reach the average ‘‘threshold’’ level that would normally initiate centriole growth
in these embryos.
Figure S4. Theoretical and Experimental Assessment of Several Assumptions Made in the Mathematical Model, Related to Figure 3
(A) Our mathematical model depicted in Figures 3A and 3B only discretely examines the Plk4-NG oscillation during S-phase of each nuclear cycle. The schematic
here shows our speculation that a phosphatase normally removes the phosphate groups (dotted circles) from Asl (red) during mitosis to reset the system for the
next oscillation at rate k4 (dotted black arrow).
(B) We implemented this step to extend the original model and plotted the mathematical solution for the percentage of Asl-bound Plk4 molecules (black curve) for
a total of 14 nuclear cycles. For simplicity we kept the length of S-phase and mitosis constant through all 14 cycles (see STAR Methods for further details of this
extended model).
(C) Schematic shows the Serine (S) and Threonine (T) residues (in bold) that were mutated to Alanine in the Asl-13A construct. Dark gray boxes show the relative
positions of the previously mapped Plk4-interacting regions within the N-terminal (Dzhindzhev et al., 2010) and C-terminal (Klebba et al., 2015) regions of Asl.
(D) Micrographs show images from time-lapse movies of embryos expressing Asl-WT-mKate2 and Asl-13A-mKate2 (under the control of their own promoters in
an asl mutant background), respectively.
(E) Graphs show the regression data (solid lines) for Plk4-NG oscillations in cycle 12 in embryos expressing either Asl-WT (green) or Asl-13A (dark gray) (both without any fluorescent tag) simultaneously with Plk4-NG. N ≥ 25 embryos for each condition; n = 71 and 68 centrioles (mean) per embryo in Asl-WT or Asl-13A, respectively (collection of two trials performed by two independent researchers, each blinded to the other’s data). Data are presented as Mean ± SEM. R² values indicate goodness of fit for the regressions. CS = Centrosome separation; NEB = Nuclear envelope breakdown.
(F) In (B) it is assumed that the cytosolic concentration of Plk4 is kept constant over all cycles. The graph here plots an alternative model where the total number of Plk4 molecules in the embryo is kept constant at all cycles. The number of centrioles doubles each cycle, and the mathematical solution for the percentage of Asl-bound Plk4 molecules (black curve), and the percentage of Plk4 molecules that remain in the cytoplasm (red curve), is depicted over 14 nuclear cycles (see STAR Methods for further details and implications of this model). For the first few nuclear cycles, almost all of the Plk4 remains in the cytoplasm since there are only a few centrioles. In the later cycles, however, the amount of Plk4 sequestered by the Asl receptors increases exponentially, as the number of centrioles increases by a factor of 2 in each cycle. Therefore, the rate at which the Asl receptors are able to recruit Plk4 from the cytoplasm decreases, resulting in a reduction in the amplitude of the Plk4 oscillation. This aspect of the model is consistent with our experimental observations that the amplitude of the Plk4 oscillation decreases at later cycles (Figure 1), as does the cytosolic concentration of Plk4 (Figure 3E). An alternative, or additional, mechanism that might explain these observations is that the Plk4 molecules activated by binding to Asl may be more likely to autophosphorylate to stimulate their degradation, thereby ensuring that more Plk4 is degraded at each cycle as the number of centrioles increases. Interestingly, in either of these scenarios, increasing centriole numbers leads to increasing Plk4 depletion from the cytosol, potentially allowing embryos to effectively “count” their centrioles.
Figure S5. FCS Analysis of Cytosolic Asl Levels, Related to Figure 3

(A) Schematic workflow describes the acquisition and analysis of point Fluorescence Correlation Spectroscopy (FCS) measurements (see STAR Methods for further details). The 488 nm laser beam is positioned at the centriolar plane in embryos expressing 2 copies of Asl-GFP (under the control of its own promoter in an asl mutant background). (i) At the beginning of every cycle, when the old and new mother centrioles have just separated (white arrows), 6 × 10 s FCS measurements were taken at a point in the cytosol maximally distant from the centrioles (center of red crosshairs). (ii) This generated 6 autocorrelation functions (ACFs) (a typical example is shown here). (iii) In the FoCuS-point software, 8 different models were fitted to each ACF. (iv) The model that best fitted the majority of the data (#4 in this case) was chosen based on the Bayesian information criterion, and all ACFs were then fitted to this model. (v) The fitted ACFs were corrected for background noise, which was determined by measurements in WT embryos. (vi) The ACFs used for further analysis were then restricted by excluding individual outlier measurements based on a ROUT outlier test (Q = 1%) (these outlier measurements usually had a poor signal-to-noise ratio and gave concentrations that were often biologically unrealistic, and were presumably generated when a centriole or non-specific fluorescent structure passed through the analyzed volume).
(B) Graph shows the average ACFs (represented as Mean ± SEM) for nuclear cycles 11-14 before background corrections. All individual ACFs were used to
calculate the cytosolic concentration data shown in Figure 3D.
(C) Western blot shows the protein levels of the Asl-GFP in either the early or late cell cycles from embryos of the same genotype used in (A) and (B). This supports
the results obtained from the FCS measurements, and suggests that total Asl levels do not change significantly during the development of the syncytial embryo.
Early and late embryos were separated based on their distinct morphology (judged by eye using a dissection microscope). Actin is shown as a loading control. A
representative blot is shown from two technical repeats.
Figure S6. Peak Counting Spectroscopy Analysis of Cytosolic Plk4 Levels, Related to Figures 3 and 6
(A) Schematic workflow describes the acquisition and analysis of Peak Counting Spectroscopy (PeCoS) measurements. (i) In addition to embryos expressing Plk4-NG under its own endogenous promoter, embryos of two other genotypes were placed on the same imaging dish: one expressing a green-fluorescent centriole marker to allow correction of the spherical aberration caused by coverslip thickness variation, the other expressing Asl-mKate2 to determine the autofluorescence background threshold for the Plk4-NG expressing embryos. Asl-mKate2 allows one to determine the correct plane (containing the centrioles; white arrows) for background measurement, while the mKate2 fluorophore does not interfere with the PeCoS measurements. (ii) As for FCS (see Figure S5), a 488 nm laser beam is positioned near the cortex of embryos, and the measurements are taken at a single point in the cytosol (red crosshairs) at the beginning of S-phase, but for 1 × 180 s, in both control and Plk4-NG expressing embryos (iii). Afterward, (iv) an appropriate threshold is calculated from the control embryos, so that the background contributes fewer than 5 peaks on average during each recording. Following background subtraction, (v and vi) the number of peaks is quantified.
(B) To compare the effective linear concentration range of FCS and PeCoS, we assessed a two-fold dilution series of the Alexa488 dye. At high dye concentrations, FCS (black symbols) exhibits a near-linear response, while PeCoS (gray symbols) is saturated, presumably because there are too many fluorophores in the effective volume (Veff) for them to be measured as individual peaks. At intermediate dye concentrations, both methods exhibit a near-linear response. At low concentrations (< ~0.2 nM), however, FCS becomes unreliable while PeCoS continues to have a near-linear response.
(C) The bar chart shows the in vivo validation of PeCoS. A significant difference in the number of peaks per minute was observed between embryos expressing either 1x or 2x copies of Plk4-NG (under the control of its endogenous promoter), which were measured at the beginning of S-phase in nuclear cycle 12. Each data point represents a 180 s recording from a single embryo. Statistical significance was assessed using a Mann-Whitney test (***p < 0.001). Data are presented as Mean ± SD.
(D) Western blot analysis of Plk4-GFP (arrow) levels in early and late embryos supports the conclusion from the PeCoS analysis (Figure 3E) that cytosolic Plk4
levels are lower in late embryos than in early embryos. Prominent non-specific bands are indicated (*). A representative blot is shown from two technical repeats.
(E) The bar chart compares the cytosolic levels of Plk4-NG (under the control of its endogenous promoter; at the beginning of S-phase in Cycle 13) between WT
and Wee1/ embryos (the same genotypes as in Figure 6). Statistical significance was assessed using an ordinary unpaired t test (ns, not significant). Data are
presented as Mean ± SD.
Figure S7. Quantification of Centriolar Asl and Cytosolic Plk4 Levels When the Genetic Dose of asl or Plk4 Is Halved, Related to Figure 4
(A) Micrograph shows an image of Asl-mCherry at centrioles in an embryo in early S-phase (just after centrosome separation). Bar charts quantify the average centriolar Asl-mCherry levels in early S-phase in either WT embryos (WT) or in embryos where the genetic dose of Plk4 has been halved (Plk4+/−). N = 17 embryos for each condition; n = 67 and 58 centrioles (mean) per embryo in WT or Plk4+/− groups, respectively. Average centriolar Asl levels do not change significantly when the genetic dosage of Plk4 is halved, in agreement with the prediction of our model (see Data S1; first, second and third charts).
(B) Same schema as (A), but showing the localization of Asl-GFP, and quantifying the centriolar levels of Asl-GFP in asl mutant embryos expressing either 1 (Asl-GFP1x) or 2 (Asl-GFP2x) copies of Asl-GFP. N = 10 embryos for each condition; n = 59 and 54 centrioles (mean) per embryo in Asl-GFP1x or Asl-GFP2x groups, respectively. This analysis reveals that centriolar Asl-GFP levels drop by ~30% when the genetic dosage of Asl-GFP is halved, in good agreement with the prediction of our model (see Data S1; first, second and third charts). Data are represented as Mean ± SEM. Statistical significance was assessed using an unpaired t test with Welch’s correction (for Gaussian-distributed data) or an unpaired Mann-Whitney test (**p < 0.01; ns, not significant).
(C) Western blot compares the protein levels of Plk4-GFP (arrow) (expressed under the control of its own promoter in a Plk4 mutant background) in otherwise WT
embryos or in embryos in which the genetic dosage of asl has been halved. This analysis reveals that Plk4-GFP levels in the embryo do not change dramatically
when the genetic dosage of asl is halved, in agreement with the prediction of our model. WT embryos (Lane 1) are shown as a negative control to demonstrate that
the Plk4-GFP band is only detected in embryos expressing Plk4-GFP. Prominent non-specific bands are indicated (*). Actin is shown as a loading control. A
representative blot is shown from two technical repeats.

(D) The bar chart compares the number of Plk4-NG peaks per minute observed in normal embryos (WT) and in embryos where the genetic dose of Asl was halved (asl+/−). Measurements were performed at the beginning of S-phase in nuclear cycle 12. Each data point represents a 180 s recording from a single embryo. Statistical significance was assessed using an ordinary unpaired t test (for Gaussian-distributed data) or a Mann-Whitney test (ns, p > 0.05). Data are presented as Mean ± SD.
Figure S8. The Average Centriolar Plk4-NG Level on Individual Centrioles Can Be Used to Predict Stochastic Centriole Duplications in
Embryos Arrested in Interphase by Mitotic Cyclin Depletion, Related to Figures 5 and S2
(A) The pie chart quantifies the percentage of centrioles that continued to duplicate in embryos where cyclin A-B-B3 dsRNA was injected at nuclear cycle 2-4 and centriole behavior was assessed ~90 min later. Ambiguous (gray) indicates the fraction of centrioles whose duplication state could not be unambiguously determined because they drifted out of focus during imaging.
(B) Bar chart shows the mean signal-to-noise ratio (SNR) of Plk4-NG fluorescence signals from sterile and fertile centrioles (red and green, respectively) through
the entire period of observation. Data are presented as Mean ± SD. Statistical significance of SNR was tested using a t test assuming equal variance (***p < 0.001).
(C) Heatmap histogram of all SNR values from sterile and fertile centrioles. Red dashed line shows the unbiased threshold for distinguishing sterile and fertile centrioles, determined automatically by Otsu thresholding. Heatmap (Red: Sterile and Green: Fertile) indicates the fraction of fertile/sterile centrioles in each column. Note that the higher the SNR, the larger the fraction of fertile centrioles.
(D) Confusion matrix shows the classification performance of sterile versus fertile centriole Plk4-NG signals using the Otsu threshold in (C) as a proportion of the
total number of signals, n = 81 centrioles from 3 embryos.
(E) Heatmap plots show the spatial (x,y) coordinates of all centriole duplication events in a representative cycling embryo (left; at the beginning of cycle 13 when centrioles are separating over the course of ~3.5 min) and non-cycling embryo (right; captured over ~60 min), with each duplication event colored from light blue to dark red to represent early and late time points, respectively. The black points plot the observed spatial (x,y) positions of all centrioles, duplicating and non-duplicating, at all time points. Note that the duplications in the cycling embryo are spatially and temporally coordinated (tending to divide first at the top of the embryo and later at the bottom of the embryo), while the duplications in the non-cycling embryo occur over a longer time-scale and do not appear to be coordinated in space or time.
(F) To test more rigorously whether the centriole duplication events in non-cycling embryos are largely stochastic, we calculated Ripley K statistics for all the non-cycling embryos used in (A–D). This statistic provides a measure of whether the temporal duplication events have spatial preference by measuring the average number of events that occur as a function of distance from individual centrioles. Curves were computed from the (x,y) coordinates of only the duplicating centrioles (denoted K_divpoint, red line) and of all centrioles at all times (denoted K_allpoints, black line). The sigmoidal increase in the statistic as a function of distance in both cases suggests that duplication events do not cluster spatially at short distances (< 50-100 pixels). The trend and amplitudes of red and black lines (mean ± SD) are very similar and fall in each other’s statistical confidence range, indicating that duplicating centrioles do not exhibit additional spatial clustering above the natural spatial distribution of centrioles.
(G) Schematic depicts the topology of the mathematical model that illustrates how Plk4 oscillations at individual centrioles could be generated to trigger stochastic duplications in non-cycling embryos. Briefly, we no longer assume that a PPTase acts in discrete bursts during mitosis and instead assume a continuous low-level PPTase activity (10% of the activity in cycling embryos). We allow individual Asl receptors to bind Plk4 and be phosphorylated until they release Plk4 (as in our original model), and to be continuously and slowly dephosphorylated by the PPTase. Asl-p and Plk4-p indicate phosphorylated proteins. Bold arrows indicate the dominant direction of the reactions.
(H) Graph shows how the percentage of Asl receptors that are bound to Plk4 changes over time at 10 individual centrioles, two of which have been colored red or
green. The centrioles are initially synchronized, as their Asl receptors all start in a dephosphorylated, unbound, state and so exhibit a coordinated pulse of Plk4
binding. As time progresses, however, the Asl receptors lose synchrony (as their dephosphorylation is no longer entrained by the CCO), and so each centriole
exhibits low amplitude stochastic oscillations. These oscillations may be sufficient to trigger centriole duplication under these conditions of interphase arrest with
low CCO activity, as evident from our experimental observations (Figures 5B–5F and Video S4; see STAR Methods for full details of the model).
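The automatic thresholding in (C) and the confusion matrix in (D) can be illustrated with a short Python sketch. It applies Otsu's criterion (maximizing the between-class variance) directly to the unbinned SNR values rather than to a histogram, which is a simplification, and it assumes that centrioles with an SNR above the threshold are classified as fertile. The function names are hypothetical and the numbers are made up; this is not the analysis code used for the figure.

```python
def otsu_threshold(values):
    """Return the cut that maximizes between-class variance (Otsu's criterion).

    Candidate cuts are the midpoints between neighboring sorted values.
    """
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    best_cut, best_score = None, -1.0
    running = 0.0
    for i in range(1, n):
        running += ordered[i - 1]
        if ordered[i] == ordered[i - 1]:
            continue  # no cut can separate equal values
        w_low, w_high = i / n, (n - i) / n
        mean_low = running / i
        mean_high = (total - running) / (n - i)
        score = w_low * w_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score = score
            best_cut = (ordered[i - 1] + ordered[i]) / 2.0
    if best_cut is None:
        raise ValueError("need at least two distinct values")
    return best_cut


def confusion_proportions(snr_values, is_fertile, threshold):
    """Proportion of centrioles in each (actual, predicted) class.

    Assumption: SNR above threshold is predicted 'fertile'.
    """
    labels = ("fertile", "sterile")
    counts = {(a, p): 0 for a in labels for p in labels}
    for snr, fertile in zip(snr_values, is_fertile):
        actual = "fertile" if fertile else "sterile"
        predicted = "fertile" if snr > threshold else "sterile"
        counts[(actual, predicted)] += 1
    total = len(snr_values)
    return {key: count / total for key, count in counts.items()}


# Made-up SNR values for illustration only.
snr = [1.0, 1.2, 0.9, 5.0, 5.5, 4.8]
fertile = [False, False, False, True, True, True]
cut = otsu_threshold(snr)
print(cut, confusion_proportions(snr, fertile, cut))
```

Because the threshold is derived from the SNR distribution alone, without reference to the fertile/sterile labels, the confusion matrix gives an unbiased picture of how well Plk4-NG signal predicts duplication.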
Article
A Unified Model for the Function of YTHDF Proteins in Regulating m6A-Modified mRNA
Sara Zaccara, Samie R. Jaffrey
Correspondence
srj2003@med.cornell.edu
In Brief
Analysis of the transcriptome-wide
effects of m6A-mRNA effectors, known as
YTHDF proteins, demonstrates that they
act redundantly to induce degradation of
the same subset of mRNAs, with no
evidence of a direct role in promoting
translation.
Highlights
• YTHDF proteins function together to mediate degradation of m6A-mRNAs
• YTHDF proteins show identical binding to all m6A sites in mRNAs
• Each YTHDF paralog can compensate for the function of the other YTHDF paralogs
• Depletion of all three YTHDF proteins promotes differentiation of leukemia cells
Zaccara & Jaffrey, 2020, Cell 181, 1582–1595

A Unified Model for the Function of YTHDF Proteins
in Regulating m6A-Modified mRNA
Sara Zaccara1 and Samie R. Jaffrey1,2,*
1Department of Pharmacology, Weill Cornell Medicine, Cornell University, New York, NY 10065, USA
2Lead Contact
*Correspondence: srj2003@med.cornell.edu
SUMMARY
N6-methyladenosine (m6A) is the most abundant mRNA nucleotide modification and regulates critical aspects of cellular physiology and differentiation. m6A is thought to mediate its effects through a complex network of interactions between different m6A sites and three functionally distinct cytoplasmic YTHDF m6A-binding proteins (DF1, DF2, and DF3). In contrast to the prevailing model, we show that DF proteins bind the same m6A-modified mRNAs rather than different mRNAs. Furthermore, we find that DF proteins do not induce translation in HeLa cells. Instead, the DF paralogs act redundantly to mediate mRNA degradation and cellular differentiation. The ability of DF proteins to regulate stability and differentiation becomes evident only when all three DF paralogs are depleted simultaneously. Our study reveals a unified model of m6A function in which all m6A-modified mRNAs are subjected to the combined action of YTHDF proteins in proportion to the number of m6A sites.
INTRODUCTION
Methylation of adenosine in mRNA to form N6-methyladenosine (m6A) has important roles in cellular physiology. Initial studies in Arabidopsis and yeast showed defective seed development and sporulation, respectively, upon genomic deletion of the m6A-forming methyltransferase (Clancy et al., 2002; Zhong et al., 2008). More recent studies have further documented the connection between m6A and cellular differentiation in embryonic and hematopoietic stem cells, acute myeloid leukemia cells, as well as other cell types (Barbieri et al., 2017; Cui et al., 2017; Geula et al., 2015; Lee et al., 2019; Vu et al., 2017; Wang et al., 2014b). Together, these studies show that m6A affects cellular physiology and that these effects likely reflect modulation of mRNA fate by m6A.
The effects of m6A in cytosolic transcripts are thought to be mediated by a poorly understood and complex network of interactions between specific m6A sites and specific members of the YT521-B homology Domain-containing Family (YTHDF) family of m6A-binding proteins (Shi et al., 2017, 2019). The YTHDF family includes three paralogs (YTHDF1, YTHDF2, and YTHDF3; also called DF1, DF2, and DF3, respectively), each of which has different reported functions; DF1 enhances mRNA translation, DF2 promotes mRNA degradation, and DF3 enhances translation and degradation (Li et al., 2017; Shi et al., 2017; Wang et al., 2014a, 2015). According to the prevailing model (Shi et al., 2017, 2019), the largest fraction of m6A-modified mRNAs (m6A-mRNAs; ~44%) bind a single DF paralog, but some m6A-mRNAs (~32%) bind two DF paralogs. m6A-mRNAs that bind all three DF paralogs are relatively rare (24%). Based on these studies, each DF paralog mediates the effects of m6A by targeting specific cohorts of m6A-mRNAs, which has led to the major concept that different DF paralogs control distinct physiologic processes (Anders et al., 2018; Han et al., 2019; Hesser et al., 2018; Paris et al., 2019; Shi et al., 2018).
Although the ability of the DF paralogs to bind different m6A sites is fundamental to explain the different biology of the DF paralogs, it remains unclear how they achieve selective binding to different m6A sites. Additionally, the mechanistic basis of the different functions of DF proteins remains unknown, especially in light of their high sequence identity (Patil et al., 2018; Wang and He, 2014).
Here we present a fundamentally different model to explain the effects of m6A on mRNA and how DF proteins mediate these effects. In contrast to the prevailing view that different m6A sites bind largely different DF paralogs, we find that all m6A sites bind all three DF paralogs in an essentially similar manner. Thus, all DF paralogs regulate the same mRNAs based on the presence of m6A sites. We also find that DF paralogs are not translation enhancers, as currently thought for DF1 and DF3. Instead, we show that the major effect of the three DF paralogs is to promote mRNA degradation in a largely redundant manner. The redundant effect of DF depletion has not been detected previously because these proteins are typically depleted separately rather than simultaneously. Depletion of only one protein allows for varying degrees of compensation by the other paralogs. Our finding that DF paralogs have similar functions rather than different functions is supported by their essentially identical m6A-binding properties, their similar set of associated binding
Figure 1. DF Proteins Bind the Same m6A Sites throughout the Transcriptome
(A) The amino acids that contact m6A and the m6A-proximal nucleotides are conserved in the DF1, DF2, and DF3 YTH domain. Conserved (gray) and non-conserved (blue) amino acids are shown on the YTH domain, rendered from the DF1 m6A-RNA structure (Xu et al., 2015) (PDB: 4RCJ). Conserved is defined as amino acids identical in all three DF paralogs, whereas non-conserved is defined as amino acids that are different in at least one DF paralog. Shown are a front view (left) and back view (rotated 180°, right).
(B) DF1, DF2, and DF3 have similar binding preferences for different m6A submotifs. Shown is the prevalence of different binding sites recognized by DF1, DF2,
and DF3 based on iCLIP binding data. For comparison, the percentage of each DRACH motif identified by miCLIP is shown. The three DF paralogs have similar
binding site preferences. Their binding preferences are similar to the prevalence of the m6A sequence motifs.
(C) Each m6A site in the transcriptome binds DF1, DF2, and DF3. DF1, DF2, and DF3 binding at 4,182 m6A mRNAs was based on DF1, DF2, and DF3 iCLIP
datasets in HEK293T cells (Patil et al., 2016). m6A sites are plotted as points in which the x and y coordinates represent the number of normalized DF iCLIP reads
Cell 181, 1582–1595, June 25, 2020 1583

proteins, and their shared subcellular localizations. Furthermore, we show that the effects of m6A on cellular differentiation can be explained by the combined action of the three DF proteins rather than individual DF proteins. Overall, these studies reveal a new unified model of m6A function in which m6A predominantly influences mRNA degradation via the combined action of the three largely redundant DF proteins.
RESULTS
Structural Analysis of Different YTH Domains Reveals Their Similar RNA-Binding Properties
An important concept in m6A-mediated gene regulation is that the different DF proteins bind distinct subsets of m6A residues in the transcriptome (Han et al., 2019; Liu et al., 2018; Shi et al., 2017, 2018, 2019; Wang et al., 2015). As a result, m6A is thought to influence mRNAs in different ways, depending on which DF paralog it binds. Based on this, a set of DF1, DF2, or DF3 “unique” m6A sites is commonly used for analysis of the function of each DF paralog (Han et al., 2019; Liu et al., 2018; Park et al., 2019;

the different DF paralogs to bind different m6A sequence motifs. m6A residues are almost exclusively found in a single highly conserved sequence motif: DR-m6A-CH (D = A, G, U; R = A, G; H = A, C, U) (Canaani et al., 1979; Dimock and Stoltzfus, 1977; Linder et al., 2015; Wei et al., 1976). We reasoned that the differential association of DF paralogs with distinct sets of m6A sites could be explained by their preferences for certain DR-m6A-CH submotifs, such as GG-m6A-CU versus AG-m6A-CU.
To determine the binding preference for each DF paralog, we examined our recently generated transcriptome-wide iCLIP (individual-nucleotide-resolution UV crosslinking and immunoprecipitation) maps of the binding sites of endogenously expressed DF1, DF2, and DF3 in HEK293T cells (Patil et al., 2016). We calculated the percentage of each DR-m6A-CH sequence submotif at each DF-binding site identified by iCLIP and ranked the most common sequence submotifs bound by each DF paralog. These analyses showed a nearly identical frequency of each DR-m6A-CH submotif bound by each DF paralog (Figure 1B). The rank order of the prevalence of the submotifs
Shi et al., 2017, 2018, 2019). Because this m6A-specific binding recognized by the DF paralogs was similar to the overall preva-
behavior of the DF paralogs ultimately determines how m6A lence of each m6A submotif in the transcriptome (Linder et al.,
regulates cellular processes, a major goal is to understand the 2015; Figure 1B). Thus, each DF paralog shows the same binding
basis of selective m6A recognition by the DF paralogs. preferences, which largely correlate with the prevalence of m6A
To address this, we first asked whether differences in the motifs, rather than a paralog-specific sequence preference.
YT521-B homology (YTH) domains might account for the Although the DF paralogs seem to bind m6A in proportion
different binding preferences of the DF proteins. The YTH to the prevalence of the m6A motif, each paralog may be targeted
domain comprises ~134 amino acids and mediates selective to a different set of m6A sites, as suggested by previous studies
binding to m6A (Li et al., 2014; Patil et al., 2018; Xu et al., (Shi et al., 2017, 2019; Figure S1C). Thus, we sought to identify
2015). We used the crystal structures of the DF1 and DF2 YTH the m6A sites uniquely bound by each DF paralog.
domains bound to m6A-containing RNA (Li et al., 2014; Xu To do this, we quantified the number of DF1, DF2, and DF3
et al., 2015) to annotate the amino acids that recognize m6A reads from the iCLIP datasets performed in HEK293T cells (Patil
and the adjacent nucleotides. In the case of DF3, an RNA-bound et al., 2016) that map to each HEK293T m6A site (Linder et al.,
structure has not yet been reported; however, each of the amino 2015) or, similarly, the FLAG-DF1 and FLAG-DF2 reads from
acids that bind m6A and the adjacent nucleotides in DF1 and DF2 the PhotoActivatable Ribonucleoside-enhanced Crosslinking
is conserved in DF3 (Figure S1A), suggesting that the structural and ImmunoPrecipitation (PAR-CLIP) datasets performed in
mechanism of m6A binding is the same for all three DF proteins. HeLa cells (Shi et al., 2017; Wang et al., 2014a, 2015) and map-
It remains possible that the few amino acids that differ be- ped at each HeLa m6A site (Ke et al., 2017). We could not use the
tween the YTH domains (Figure S1A) could affect RNA binding. FLAG-DF3 PAR-CLIP dataset (Shi et al., 2017) because it con-
However, these amino acids map on the surface opposite the tained a low number of reads mappable to cytosolic mRNAs
RNA-binding pocket (Figure 1A), suggesting that they might (Figure S1D).
not affect YTH-RNA interactions. Furthermore, each full-length Using this m6A-centric approach, we found a strong linear cor-
DF protein shows a similar binding affinity for an m6A-containing relation in pairwise comparisons between DF1, DF2, and DF3
RNA oligonucleotide (Figure S1B), consistent with previous reads at each m6A site. This correlation was seen in the PAR-
studies (Arguello et al., 2019; Wang et al., 2014a; Xu et al., CLIP datasets in HeLa cells mapped onto HeLa m6A sites (Fig-
2015). Overall, the RNA-binding surfaces for the three YTH ures S1E–S1F), as well as the iCLIP datasets in HEK293T cells
domains appear to be identical. mapped onto HEK293T m6A sites (Figure 1C). Notably, in
contrast to the previous model where many m6A sites show DF
The DF Paralogs Show Equivalent Binding to All m6A paralog-specific binding, we found no preferential enrichment
Sites throughout the Transcriptome of iCLIP or PAR-CLIP reads for any paralog at any m6A site.
The different transcriptome-wide binding properties reported Overall, these results suggest that each m6A site is bound by
for the DF paralogs (Shi et al., 2017) could reflect the ability of DF1, DF2, and DF3 in similar proportions.
overlapping that site (log2 normalized). We did not find m6A sites that preferentially bound either DF1, DF2, or DF3. The high Pearson correlation coefficients (r)
show that DF paralogs have highly similar binding preferences. Similar results were found in HeLa cells using DF PAR-CLIP datasets (Figure S1E).
(D) DF1, DF2, and DF3 iCLIP reads show a similar distribution in mRNAs that resembles the miCLIP read distribution. Shown are representative examples of DF1,
DF2, and DF3 iCLIP read distribution and miCLIP read distribution on HNRNPF and MDM2. The iCLIP and miCLIP data shown here were obtained in HEK293T
cells and were similar to data obtained in HeLa cells (Figures S1F–S1H).
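
The submotif tally described for Figure 1B can be illustrated with a short Python sketch. It enumerates the 18 sequences that match the DR-m6A-CH consensus given in the text and computes the percentage of each submotif among a set of 5-nt windows. The function names and the example sequences are hypothetical and are not taken from the study's analysis code; the input is assumed to be RNA 5-mers centered on the candidate m6A.

```python
from collections import Counter
from itertools import product

# DRACH consensus as defined in the text:
# D = A/G/U, R = A/G, A = the methylated adenosine, C, H = A/C/U.
D, R, H = "AGU", "AG", "ACU"
DRACH_MOTIFS = ["".join(p) for p in product(D, R, "A", "C", H)]  # 3 * 2 * 3 = 18


def is_drach(fivemer):
    """Return True if a 5-nt RNA sequence matches the DRACH consensus."""
    return fivemer in DRACH_MOTIFS


def submotif_percentages(fivemers):
    """Percentage of each DRACH submotif among the 5-mers that match DRACH."""
    counts = Counter(s for s in fivemers if is_drach(s))
    total = sum(counts.values())
    return {motif: 100.0 * n / total for motif, n in counts.most_common()}


# Hypothetical 5-mers centered on binding sites of one paralog (made-up data).
example_sites = ["GGACU", "GGACU", "AGACU", "GGACA", "UUUUU"]
percentages = submotif_percentages(example_sites)
```

Running the same tally on the binding sites of each paralog and comparing the resulting rankings corresponds to the comparison described in the text; the non-DRACH window in the example is simply ignored.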

Because we were surprised by the lack of m6A sites uniquely bound by each of the DF proteins, we reexamined the previously reported DF1 and DF2 unique RNA-binding sites (Shi et al., 2017). We asked whether DF1 binding, as measured by PAR-CLIP signal, was missing at DF2 unique sites and vice versa. To test this, we plotted DF2 binding at each DF1 unique site and vice versa. This analysis shows that there is a comparable level of DF1 and DF2 binding at each DF1 unique site and a comparable level of DF1 and DF2 binding at each DF2 unique site (Figures S1G and S1H). Thus, the previously described unique sites instead appear to be indistinguishable in terms of their DF1 and DF2 binding.
Next, we directly examined the PAR-CLIP reads on individual transcripts proposed to be DF1 unique or DF2 unique. After extensive visual inspection, we found essentially no differences in the distribution of DF1 and DF2 PAR-CLIP reads for any of these transcripts (Figure S1J). Notably, the PAR-CLIP reads correlated with the location of m6A-called sites, supporting the idea that DF binding is due to m6A. A similar effect was seen when using the iCLIP datasets for DF1, DF2, and DF3 (Figure 1D). These data further suggest that the sites that were previously described as being DF paralog unique are not, in fact, unique.
Our reexamination of the iCLIP and PAR-CLIP studies contrasts with previous reports showing that each DF paralog has only partial overlap with m6A sites and with each other (Shi et al., 2017; Wang et al., 2014a, 2015; Figure S1C). In the PAR-CLIP studies, DF paralog-binding sites were determined based on a threshold number of misincorporation-containing reads at any individual site in the transcriptome. However, using a threshold approach in independent experiments can often produce false negative sites. These arise because of the arbitrary nature of a threshold, which can cause some sites to be missed when they fall just beneath that threshold. For this reason, the threshold approach is not generally used for comparative analysis of datasets (Chakrabarti et al., 2018; Guertin et al., 2018; Landt et al., 2012). Instead, comparative analysis is often performed by selecting a set of specific transcriptomic sites, such as sequence motifs, and determining whether one CLIP dataset or another shows preferential binding to any of these sites (Dominguez et al., 2018; Wheeler et al., 2018). Our approach is similar; by comparing DF paralogs binding at mapped m6A sites, we find that the RNA binding of DF paralogs is essentially identical, even when using different cell lines and different CLIP methodologies (iCLIP and PAR-CLIP).
Overall, these diverse lines of evidence, including structural similarity, motif analysis, and analysis of DF binding at each m6A site in the transcriptome using CLIP datasets from two cell lines, suggest that the DF paralogs do not exhibit different patterns of m6A-binding in the transcriptome. Instead, they appear to have similar binding preferences and affinities.

DF Proteins Exhibit Similar Protein Interaction Networks
Although we find that the DF paralogs bind the same m6A sites, an unresolved question is how these nearly identical proteins exert different molecular effects on m6A-mRNAs, especially in the case of DF1 (translation) compared with DF2 (degradation). A compelling possibility for divergent functions of DF paralogs might lie within their effector domain, an ~40-kDa low-complexity domain that comprises the remainder of the protein outside of the YTH domain (Patil et al., 2018). Recruitment of a distinct set of proteins to the effector domain of each DF paralog may allow the DF paralogs to exert different effects on m6A-mRNAs. We thus analyzed the differences in the effector domains between DF paralogs as well as their protein interaction partners.
Examination of the three effector domains shows them to be superficially similar. Each is a proline and glutamine-rich low-complexity domain with ~60% amino acid identity and 70% amino acid similarity (Figure S2A). Furthermore, the hydrophobicity and charge distribution along the length of the low-complexity domains are highly similar. Additionally, the positions of the disordered regions within the low-complexity domains are similar for each paralog (Figure 2A).
The different DF proteins may have different protein interaction partners to mediate their different functions. We therefore examined a recent comprehensive Bio-ID study in which 139 proteins were BirA tagged, and interacting proteins were detected based on their proximity-induced biotinylation (Youn et al., 2018). In these experiments, 937–1,360 proteins were detected as possible DF interactors, of which 63–103 were identified as high-confidence interactors (63 of 937 DF1 interactors, 100 of 1,270 DF2 interactors, and 103 of 1,360 DF3 interactors).
To determine which interactors are preferentially bound by each DF paralog, we performed a scatterplot analysis in which

Figure 2. DF Proteins Have Similar Binding Partners and Intracellular Localization
(A) The three DF paralogs have similar intrinsically disordered regions, hydrophobicity, and amino acid charge distribution along the length of their effector domains. A schematic representation of each DF protein is shown, with the YTH domain indicated. Green indicates disordered regions (STAR Methods). Each of these physicochemical parameters has high similarity among all DF paralogs, suggesting that these proteins may have similar properties.
(B) DF paralogs show similar binding preferences for their high-confidence interacting proteins. Shown are pairwise comparisons of the average spectral counts of peptides derived from each protein identified as a DF paralog interactor in vivo (Youn et al., 2018). Each protein interactor is plotted as a circle in which x and y coordinates represent the average spectral counts calculated based on a Bio-ID study of each DF paralog (log2-normalized values). The diameter is proportional to the average probability of interaction (AvgP). As indicated by the orange circles (high-confidence interactors with AvgP > 0.95 for both DF paralogs), there is a high level of correlation between each pair of DF paralogs in the spectral counts and AvgP for each DF high-confidence interactor. These interactors are enriched in proteins related to mRNA degradation pathways (shown in red). Proteins with a function in mRNA translation, including eIF3A and eIF3B, have a low AvgP and, thus, are considered low-confidence or non-specific interactors.
(C) DFs are predicted to have similar subcellular localization. Shown is the “non-negative matrix factorization” probability assigned to each DF paralog in each subcellular compartment, calculated based on the protein-protein interaction network (STAR Methods). Each DF paralog shows a similar probability of being classified as a P-body protein and high probability of being enriched in ribonucleoprotein (RNP) granules or stress granules.
(D) Structured illumination microscopy (SIM) images indicate that the DF paralogs (red) are localized to small punctate structures (<150 nm) throughout the cytoplasm and P-bodies (150–200 nm), based on colocalization with EDC4 (green). Scale bar is indicated.
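
The contrast drawn in the text between threshold-based site calling and a site-centric pairwise comparison can be sketched in Python as follows. This is only an illustration under stated assumptions: the read counts are invented, the cutoff of 10 reads is arbitrary, and log2(count + 1) stands in for whatever normalization the datasets actually use. None of the names below come from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def called_sites(counts, threshold):
    """Indices of sites whose read count meets an (arbitrary) threshold."""
    return {i for i, c in enumerate(counts) if c >= threshold}


# Hypothetical read counts for two paralogs at the same six mapped sites.
df1_reads = [48, 105, 9, 12, 210, 31]
df2_reads = [55, 96, 11, 8, 190, 40]

# Threshold-based calling, applied to each dataset independently.
THRESHOLD = 10  # assumed cutoff, for illustration only
df1_called = called_sites(df1_reads, THRESHOLD)
df2_called = called_sites(df2_reads, THRESHOLD)
unique_to_df1 = df1_called - df2_called  # site 3: 12 vs. 8 reads
unique_to_df2 = df2_called - df1_called  # site 2: 9 vs. 11 reads

# Site-centric comparison: correlate log2-transformed counts over all sites.
log_df1 = [math.log2(c + 1) for c in df1_reads]
log_df2 = [math.log2(c + 1) for c in df2_reads]
r = pearson_r(log_df1, log_df2)
```

With these made-up counts, thresholding labels one site as unique to each paralog even though the counts at those sites differ by only a few reads, whereas the correlation computed across all sites remains high.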

each biotinylated interactor was plotted as a circle, with the x and y axis position each reflecting the average spectral counts for one of the three DF paralogs. In this pairwise analysis, we found similar proteins bound by all three paralogs. Additionally, we found a marked linear correlation in the spectral counts for each high-confidence protein interactor when examined in pairwise comparisons of DF1, DF2, and DF3 (Figure 2B). Thus, the DF binding partners seem to be shared and bind with the same overall rank order preference for all three DF paralogs.
Notably, the top 25 interactors of all three DF paralogs were highly similar and included high-scoring interactions with components of the Carbon Catabolite Repression-Negative On TATA-less (CCR4-NOT) RNA degradation complex, such as CNOT1, CNOT7, and CNOT10 (Figures 2B and S2B). Other high-confidence interactors of all three DF paralogs included RNA degradation proteins such as PATL1, XRN1, LSM12, and DDX6. In addition, all three DF paralogs interact with protein components of stress granules (Figures 2B and S2B). Stress granule proteins interact even in the absence of stress (Markmiller et al., 2018; Youn et al., 2018). Notably, all of these interactors have been seen in other studies. For example, CNOT1 immunoprecipitation studies have identified all three DF paralogs (Du et al., 2016). Additionally, an engineered ascorbate peroxidase (APEX)-based proteomics analysis of the G3BP1 stress granule protein found all three DF paralogs to be interactors (Markmiller et al., 2018). Thus, all three DF paralogs show similar patterns of binding proteins, and the major interactions are seen across independent proteomic datasets.
Because DF1 is thought to regulate translation through its interactions with the eIF3A and eIF3B translation initiation factors (Shi et al., 2017; Wang et al., 2015), we wanted to determine whether DF1 shows selective and robust interactions with these proteins. The Bio-ID proteomics analyses (Figure 2B) show weak interactions of eIF3A and eIF3B with all three DF paralogs, and in each case, the probability of interaction was low. Previous analysis of DF1-binding partners identified eIF3A and eIF3B based on mass spectrometry analysis of DF1 immunoprecipitates (Wang et al., 2015). However, in that study, eIF3A and eIF3B were among the lowest-scoring DF1 interactors (Figure S2C). The weak binding seen in immunoprecipitates is consistent with the low probability seen in the Bio-ID studies (Figure 2B). Thus, each proteomic study suggests that all three DF paralogs exhibit nonspecific or low-level binding to eIF3A and eIF3B in cells.
Overall, the protein-protein interactions of all three DF paralogs are very similar, rather than different, with high-confidence interactions with RNA degradation machinery and low-confidence interactions with translation machinery.

DF Proteins Exhibit Similar Intracellular Localizations
To further understand the different functions of the DF paralogs, we examined their subcellular localizations because distinct subcellular localizations might be expected for proteins with different functions. In the Bio-ID study (Youn et al., 2018), a non-negative matrix factorization classification analysis was developed to predict the subcellular localization of 139 RNA-binding proteins based on their interaction partners. The prominent localizations predicted for all three DF paralogs were P-bodies and cytoplasmic mRNA ribonucleoprotein (mRNP) complexes (Figure 2C). Thus, DF paralogs are predicted to have similar subcellular localizations.
To further test this, we examined the localization of each DF paralog by immunofluorescence. Here we found that all three DF paralogs showed essentially identical localization throughout the cytoplasm in small punctate structures and, to a lesser extent, in larger punctate structures. Although the smaller ones resemble the granule-like particles detected previously under unstressed conditions (Youn et al., 2018), the larger punctate structures were identified as P-bodies based on their colocalization with EDC4, a P-body marker (Figure 2D). This is consistent with an independent proteomic analysis using P-body markers that also identified all three DF paralogs in P-bodies (Youn et al., 2018). Thus, rather than exhibiting distinct localizations in cells, DF proteins have similar punctate cytoplasmic localizations, some of which include localization to P-bodies. Overall, these data suggest that DF proteins have highly similar sequences, functional domains, protein binding partners, and intracellular localizations.

The Combined Activity of DF Proteins Leads to Degradation of m6A-Modified mRNA
Because all DF paralogs show high-confidence interactions with RNA degradation pathway proteins (Youn et al., 2018; Figure 2B), we considered the possibility that each paralog may function to mediate degradation of m6A-mRNA.
Previous comparative studies of the DF paralogs found that some of them induce mRNA degradation when artificially tethered to a reporter transcript (Kennedy et al., 2016; Shi et al., 2017; Tirumuru et al., 2016; Wang et al., 2014a, 2015). However, because these studies used heterologously overexpressed DF proteins, we were concerned about the potential for these proteins to aggregate into stress granule-like structures when overexpressed. This phenomenon is seen in proteins, like the DF paralogs, that contain low-complexity domains (Alberti et al., 2019). Indeed, we observed variable levels of DF granule-like structures after transfecting cells with DF-expressing plasmids (Figure S2D). Thus, DF overexpression experiments may be misleading because of the variable degrees of DF aggregation, which could sequester proteins and potentially suppress their function. We therefore re-examined the role of DF paralogs using knockdown approaches instead.
We first examined mRNA abundance using RNA sequencing (RNA-seq) after small interfering RNA (siRNA)-mediated depletion of DF1, DF2, and DF3 in HeLa cells. We used siRNA that we validated for selective high-efficiency DF1, DF2, and DF3 knockdown (Patil et al., 2016; Figures S3A–S3C). Because m6A sites on mRNAs bind all three DF paralogs (Figures 1 and S1), we examined how m6A-mRNAs were affected by DF depletion and whether the effects were correlated with the number of m6A sites (Table S1). As in a previous study of DF1 (Wang et al., 2015), we found no effect of DF1 depletion on m6A-mRNA abundance compared with non-methylated mRNAs (Figure 3A). In contrast, as reported previously (Wang et al., 2014a), depletion of DF2, the most highly expressed DF paralog in HeLa cells (Patil et al., 2018; Figure S3D), was associated with a small but

Figure 3. DF Proteins Redundantly Control the Abundance and Stability of m6A-mRNAs
(A–D) m6A-mRNAs show a substantially higher increase in expression upon simultaneous silencing of DF1, DF2, and DF3 in HeLa cells. The abundance of each mRNA (based on RNA-seq counts) was compared between the control and the conditions of DF paralog silencing, as indicated. mRNAs were binned based on the number of m6A sites. The expression distribution of each m6A-mRNA subgroup is quantified in the boxplots. The center of each box represents the median fold change (log2), the boundaries contain genes within a quartile of the median, and whiskers represent 1.5× interquartile ranges. The width of the boxplot is proportional to the number of genes in each category. m6A-mRNAs do not change in expression upon silencing of DF1 (A) or DF3 (C) compared with non-methylated mRNAs. However, m6A-mRNAs show a small increase in expression upon silencing of DF2 (B). m6A-mRNAs show a substantially higher increase in expression upon simultaneous silencing of DF1, DF2, and DF3 (D). Only significant p values are shown in each graph (two-tailed Mann-Whitney test). n = 3 replicates; n = 2,425 mRNAs for 0, n = 894 mRNAs for 1, n = 1,521 mRNAs for 2–4, and n = 2,022 mRNAs for 5+ m6A sites.
(E) Quantification of the results in (A)–(D). The fold increase in mRNA expression was quantified for each indicated knockdown condition. Only mRNAs with 5 or more annotated m6A sites were used in this analysis and compared with mRNAs with 0 annotated m6A sites. Although no effect on m6A mRNA expression is seen with depletion of DF1 or DF3, a small increase is seen when these two are knocked down together. Triple knockdown shows a significantly larger effect compared with knockdown of each DF paralog alone. ****p ≤ 2.2e−16, **p ≤ 0.001, two-tailed Mann-Whitney test.
(F) m6A mRNA stability is increased upon depletion of DF paralogs. The stability of m6A-mRNAs and non-methylated mRNAs was determined by quantifying mRNA levels before and 2 h after actinomycin D treatment. Shown is the fold change in mRNA levels, used as a measure of stability change upon silencing of the DF paralog(s) (STAR Methods). The increase in mRNA stability is most apparent when all three DF paralogs were knocked down and is proportional to the number of m6A sites per mRNA. n = 456 mRNAs for 0, n = 96 mRNAs for 1–5, n = 140 mRNAs for more than 5 annotated m6A sites. ***p = 2.079e−6, **p ≤ 0.01, *p ≤ 0.02, two-tailed Mann-Whitney test, n = 2 replicates.
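
The binning used in (A)–(D) can be sketched in Python as follows. This is a minimal illustration, assuming each mRNA is represented by its number of annotated m6A sites and its abundance in control and knockdown samples; the record layout, function names, and numbers are hypothetical, and the significance test mentioned in the legend is not reproduced here.

```python
import math
from statistics import median


def m6a_bin(n_sites):
    """Assign an mRNA to one of the bins used in the legend: 0, 1, 2-4, 5+."""
    if n_sites == 0:
        return "0"
    if n_sites == 1:
        return "1"
    if n_sites <= 4:
        return "2-4"
    return "5+"


def median_log2_fc_by_bin(records):
    """Median log2(knockdown / control) per m6A-site bin.

    records: iterable of (n_m6a_sites, control_abundance, knockdown_abundance).
    """
    bins = {}
    for n_sites, control, knockdown in records:
        bins.setdefault(m6a_bin(n_sites), []).append(math.log2(knockdown / control))
    return {label: median(values) for label, values in bins.items()}


# Hypothetical abundances (arbitrary units), chosen so the ratios are powers of 2.
records = [
    (0, 100.0, 100.0),
    (0, 50.0, 50.0),
    (1, 64.0, 64.0),
    (3, 40.0, 80.0),
    (6, 10.0, 40.0),
    (7, 20.0, 80.0),
]
medians = median_log2_fc_by_bin(records)
```

In this toy dataset the median log2 fold change rises with the number of m6A sites, which is the pattern the boxplots are meant to reveal for the triple knockdown.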

statistically significant increase in the abundance of m6A-mRNAs (Figure 3B). This effect was more pronounced for mRNAs with high numbers of annotated m6A sites. No increase in m6A-mRNA abundance was observed following DF3 depletion (Figure 3C), which, like DF1, is expressed at lower levels in HeLa cells (Patil et al., 2018; Figure S3D). Overall, our data confirm the idea that DF2, but not DF1 or DF3, destabilizes m6A-mRNAs.
We then considered the possibility that the DF paralogs may be functionally redundant and that our inability to detect a stability-regulatory effect was due to compensation by the other paralogs. Additionally, we noticed that knockdown of any of the DF paralogs was associated with a compensatory increase in the expression of the other paralogs in HeLa cells (Figures S3A–S3B). Thus, compensatory upregulation of the other DF paralogs could further mask an effect of knockdown.
We therefore tested simultaneous knockdown of different combinations of two DF paralogs or all three DFs. Knocking down any two DF paralogs increased the overall abundance of m6A-mRNA (Figures S3E–S3I). Importantly, the selective increase in m6A-mRNA expression was largest upon triple knockdown (Figure 3D), and, in each case, it was directly correlated with the number of m6A sites per mRNA (Figures 3D and S3F–S3H). Double knockdown and triple knockdown exhibited greater increases in m6A-mRNA abundance than DF2 knockdown alone (Figure 3E). These data are consistent with a model in which any of the DF proteins can promote degradation of m6A-mRNAs, but m6A-mRNA degradation is most effective when all three DF paralogs are available, resulting in maximal DF-dependent m6A-mRNA degradation.
To determine whether the expression levels seen upon triple DF knockdown are correlated with changes in mRNA stability levels, we measured the levels of m6A-mRNAs after transcription inhibition with actinomycin D. Because m6A-mRNAs tend to have a short half-life (Ke et al., 2017; Schwartz et al., 2014), we treated cells with actinomycin D for 2 h and measured the amount of each mRNA remaining using RNA-seq. mRNA stability was relatively unaffected upon depletion of each DF paralog individually, except in the case of DF2, where a small stabilizing effect was seen (Figure 3F). However, when all three DF paralogs were knocked down, a substantial increase in mRNA stability was seen for m6A-containing mRNAs compared with non-m6A-mRNAs (Figure 3F). These effects were also seen in qRT-PCR experiments examining the stability of specific mRNAs annotated to contain high numbers of m6A sites or annotated to lack m6A (Figure S3J).
Overall, these data further suggest that the most efficient degradation of m6A-mRNAs occurs when all three DF paralogs are present. When DF1 or DF3 is knocked down alone, the effects are nearly undetectable, but knockdown of DF1 and DF3 together or in combination with DF2 resulted in readily detectable increases in m6A-mRNA expression (Figure 3E). This suggests that the other DF paralogs can partially compensate for the loss of another DF. The ability of any DF protein to compensate is likely to be influenced by its expression level, with the lower expression of DF1 and DF3 (Figure S3D) making them less able to compensate for DF2 depletion. In this model, when all three DF paralogs are depleted, compensation cannot occur, and m6A-mRNA stability is increased most robustly.

DF Paralogs Do Not Affect the Translation of m6A-Modified mRNAs
Although our data strongly indicate that DF paralogs act together to destabilize m6A-mRNA, it remains possible that one or more of these proteins could also control m6A-mRNA translation. Importantly, DF1 and DF3 have been described to be enhancers of m6A-mRNA translation (Shi et al., 2017; Wang et al., 2015). Therefore, we examined the role of the DF paralogs in translational regulation of m6A-mRNAs.
DF1 has been proposed to enhance translation by binding m6A and recruiting eIF3 to 3′ UTRs. This subsequently facilitates loading of eIF3 onto mRNA caps (Wang et al., 2015). This is reminiscent of the poly(A)-binding protein PABPC1, which also binds 3′ UTRs, binds an initiation factor, and promotes formation of initiation complexes at mRNA caps (Le et al., 1997; Ozoe et al., 2013; Wells et al., 1998). The ability of PABPC1 to enhance mRNA translation is evident based on its enrichment in the polysome fractions corresponding to highly translated mRNAs (Arava et al., 2003).
We therefore examined whether DF1 is similarly enriched with highly translated mRNAs. However, all three DF paralogs are mostly excluded from the polysome fraction (Figures 4A and S4A) and highly enriched in the fractions at the beginning of the gradients, which represent cytoplasmic mRNPs (Arava et al., 2003). This result is not consistent with a model in which any DF protein is stably bound to mRNA 3′ UTRs, enhancing their translation.
Despite the absence of DF paralogs from the highly translating mRNA pool, we wanted to examine whether translation of m6A-mRNAs is enhanced by any of the DF paralogs.
We first examined the published ribosome profiling data that revealed the translation-enhancing effect of DF1 (Wang et al., 2015). In this study, processed data were provided, listing the abundance of ribosome-protected fragments that map to each gene. When we examined the processed data obtained after DF1 silencing, it was evident that m6A-mRNAs were less translated than in control siRNA-transfected samples (Figure S4B). Thus, consistent with the previous study (Wang et al., 2015), but in contrast to what was expected from the polysome analysis (Figure S4A), DF1 silencing causes a reduction in translation of m6A-mRNAs. This analysis was performed using the average of the two ribosome profiling replicates, as described in the original study.
However, we found different results when the processed data of the two DF1 siRNA replicates were analyzed separately. Replicate 1 showed a prominent decrease in translation of m6A-mRNAs upon DF1 depletion, consistent with the idea that DF1 enhances translation of m6A-mRNA (Figure S4C). However, replicate 2 showed no change in translation of m6A-mRNAs upon DF1 silencing (Figure S4D). Thus, the overall reduction in m6A-mRNA translation we saw in the averaged data (Figure S4B) was driven by the large drop in m6A-mRNA translation seen in replicate 1.
Because we were surprised by the different roles of DF1 implied by the two different replicates, we re-processed the raw

Figure 4. Depletion of DF Proteins Does Not Affect the Translation Efficiency of m6A mRNAs in HeLa Cells
(A) DF paralogs are not highly associated with actively translated mRNAs. DF1, DF2, and DF3 were not enriched in polysomal fractions. Instead, they were predominantly associated with the mRNA ribonucleoprotein (mRNP) complex fraction. The ribosomal protein RPS6 was used as a marker for ribosome-enriched fractions.
(B–E) The translation efficiency of m6A-mRNAs is not reduced upon silencing of DF1 or any DF paralog. The number of ribosome-protected fragments bound to each mRNA was normalized to the abundance of the respective mRNA to calculate translation efficiency (TE). The TE was then compared between control cells and cells knocked down for the indicated DF paralog(s) (DF1 in B, DF2 in C, DF3 in D). mRNAs were binned based on the number of m6A sites. The distribution of each m6A-mRNA subgroup is quantified in the boxplots. The center of each box represents the median fold change (log2), the boundaries contain genes within a quartile of the median, and whiskers represent 1.5× interquartile ranges. The width of the boxplot is proportional to the number of genes in each category. m6A-mRNAs do not decrease their TE upon single or triple silencing of DF1, DF2, or DF3 (E) compared with non-methylated mRNAs. Only significant p values are shown in each graph, and the exact p values are reported (two-tailed Mann-Whitney test). n = 3 replicates; n = 2,425 mRNAs for 0, n = 894 mRNAs for 1, n = 1,521 mRNAs for 2–4, n = 2,022 mRNAs for 5+ m6A sites.
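
The translation efficiency (TE) calculation described in (B–E) can be written out as a short Python sketch: ribosome-protected fragment (RPF) counts are divided by the abundance of the same mRNA, and the log2 ratio of knockdown TE to control TE is reported. The function names and all numbers are hypothetical and only illustrate the arithmetic, not the study's pipeline.

```python
import math


def translation_efficiency(rpf_count, mrna_abundance):
    """TE = ribosome-protected fragments normalized to mRNA abundance."""
    return rpf_count / mrna_abundance


def log2_te_change(ctrl_rpf, ctrl_mrna, kd_rpf, kd_mrna):
    """log2 fold change in TE between knockdown and control samples."""
    te_ctrl = translation_efficiency(ctrl_rpf, ctrl_mrna)
    te_kd = translation_efficiency(kd_rpf, kd_mrna)
    return math.log2(te_kd / te_ctrl)


# Hypothetical case 1: the mRNA is stabilized (abundance doubles) and RPFs
# double with it, so translation per mRNA is unchanged.
unchanged = log2_te_change(ctrl_rpf=200.0, ctrl_mrna=100.0,
                           kd_rpf=400.0, kd_mrna=200.0)

# Hypothetical case 2: abundance doubles but RPFs stay flat, so TE is halved.
halved = log2_te_change(ctrl_rpf=200.0, ctrl_mrna=100.0,
                        kd_rpf=200.0, kd_mrna=200.0)
```

The normalization matters here because DF depletion changes m6A-mRNA abundance; comparing raw RPF counts alone would confound a change in mRNA level with a change in translation.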
sequencing data generated in the Wang et al. (2015) study. Upon

reanalysis using the standard pipeline for alignment and calcula-
tion of ribosome-protected fragments (Calviello et al., 2016;
McGlincy and Ingolia, 2017), both replicates showed the same
C result: DF1 depletion did not affect m6A-mRNA translation effi-
ciency (Figures S4E–S4G; Tables S2 and S3), similar to the effect
seen with the previously processed replicate 2 but not replicate 1.
To understand the basis of the inconsistencies between the
original replicate 1 and 2, we generated correlation coefficients
for the ribosome-protected fragments between replicates.
Here we saw only modest correlation in the number of ribo-
some-protected fragments per m6A-mRNA between the original
replicates (Figure S4H). In contrast, the re-processed data
showed high correlation (Figure S4I), as expected for replicates.
We therefore considered that a bioinformatic rather than a bio-
logical artifact may have caused variability in the original re-
ported analysis. Overall, based on one of the original replicates,
and based on both replicates after re-processing of the raw
ribosome profiling data from Wang et al. (2015), DF1 does not
regulate translation of m6A-mRNAs.
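The replicate check described above amounts to correlating per-mRNA counts of ribosome-protected fragments between two replicates. The sketch below assumes a Pearson coefficient on log2(count + 1) values, which is a common choice but is not stated in the text; the function names and the dictionary input format are likewise hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def replicate_correlation(rep1, rep2):
    """Correlate RPF counts per mRNA between two replicates.

    rep1 and rep2 map mRNA identifiers to RPF counts; only mRNAs
    present in both replicates are compared, on a log2(count + 1) scale.
    """
    shared = sorted(set(rep1) & set(rep2))
    x = [math.log2(rep1[g] + 1) for g in shared]
    y = [math.log2(rep2[g] + 1) for g in shared]
    return pearson(x, y)
```

Well-behaved replicates would be expected to give a coefficient close to 1, whereas a markedly lower value points to a technical or processing difference between them.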
We next examined DF3, which has also been proposed previ-
ously to promote translation (Shi et al., 2017). In this previous
analysis, DF3-bound mRNAs were identified by PAR-CLIP, and
the translation efficiency of these mRNAs was analyzed after
DF3 knockdown (Shi et al., 2017). However, we found that the
DF3 PAR-CLIP dataset did not contain a sufficient number of
mappable reads, as described above (Figure S1D). We therefore
decided to reanalyze the effect of DF3 depletion on mRNAs
based on the number of m6A sites because m6A sites correlate
with DF3 binding (Figure 1). Here we saw that DF3 depletion
does not affect the translation efficiency of m6A-mRNAs
(Figure S4J).
Because of the various inconsistencies, we wanted to inde-
pendently confirm the effects of depleting each DF paralog on
the translation efficiency of m6A-mRNAs using ribosome
profiling. In each experiment, we generated three replicates for
1590 Cell 181, 1582–1595, June 25, 2020

knockdown and control (Figures S3A, S3B, and S5A–S5C). We
found no reduction in translation efficiency of m6A-mRNAs
compared with non-methylated mRNAs in DF1-deficient HeLa
cells (Figure 4B). Similarly, neither DF2 nor DF3 depletion was
associated with a change in translation in m6A-mRNAs by ribosome
profiling (Figures 4C and 4D). As a second approach,
we quantified the levels of specific m6A-mRNAs at each
position along the polysome gradient. We focused on specific
m6A-mRNAs reported to show the largest drop in translation after
DF1 depletion (Wang et al., 2015). However, we saw no
decrease in translation upon DF1 depletion (Figure S5D).
Because all three DF proteins redundantly mediate mRNA
degradation, we considered the possibility that they may also
redundantly regulate mRNA translation. We therefore performed
ribosome profiling after knockdown of all three paralogs (Figures
S5E and S5F). Here we still saw no reduction but rather a slight
enhancement in the translation efficiency of m6A-containing
transcripts (Figure 4E).
Overall, multiple independent datasets demonstrate that DF1
has no translation-promoting role in HeLa cells, including the
original data used to show the translation-promoting effect of
DF1. Thus, none of the DF paralogs directly enhance mRNA
translation. Instead, their major effect is to mediate m6A-mRNA
degradation.

The Combined Activity of DF Proteins Suppresses
Differentiation of Leukemia Cells
One of the most well-studied examples of m6A-regulated cellular
differentiation is in acute myeloid leukemia cell lines, where m6A
maintains cells in an undifferentiated blast-like state (Barbieri
et al., 2017; Vu et al., 2017). Knockdown of METTL3 induces
expression of markers such as CD14, a transmembrane protein
that reflects differentiation and commitment to the myeloid lineage.
Depletion of DF2 affects expression of m6A-RNAs whose
repression may contribute to leukemogenesis, such as
TNFRSF1B (Paris et al., 2019). Attempts to identify the m6A
reader that mediates the differentiation-suppressive effects of
m6A have not yet been successful. Recent studies have suggested
that DF2 does not mediate the differentiation-suppressive
effects of m6A because DF2 knockout does not induce differentiation
(Paris et al., 2019), which contrasts with the effect of
METTL3 depletion (Barbieri et al., 2017; Vu et al., 2017). Therefore,
it has not been possible to ascribe the differentiation-suppressive
effects of m6A to a specific m6A reader.
We therefore asked whether suppression of TNFRSF1B is
mediated by the combined action of the three DF proteins. For
these experiments, we knocked down each DF transcript alone
or in combination in MOLM-13 leukemia cells, which, as we
and others showed previously, were maintained in an undifferentiated
state by METTL3 (Barbieri et al., 2017; Vu et al., 2017; Figure
S5G). TNFRSF1B mRNA expression levels were slightly
affected by knockdown of each of the DF paralogs individually,
with the largest increase seen with DF2 knockdown (Figure 5A).
However, upon knockdown of all three DF paralogs, TNFRSF1B
expression was markedly enhanced (Figure 5A). Overall, these
data further support the idea that the DF proteins act redundantly
to control the expression levels of m6A-mRNAs in MOLM-13 cells.
We next examined the role of DF proteins in suppressing differentiation.
DF2 depletion on its own does not induce differentiation,
as measured by the CD14 differentiation marker (Paris
et al., 2019). We therefore asked whether the combined action
of the DF paralogs suppresses differentiation. To test this idea,
we measured CD14 5 days after knockdown of DF paralogs using
flow cytometry. Similar to the previous study (Paris et al.,
2019), we did not find induction of CD14 upon knockdown of
DF2 (Figure 5B). However, upon triple knockdown, CD14
expression was markedly increased. Overall, these data are
consistent with the idea that the three DF paralogs function
together to mediate the differentiation-suppressing effects
of m6A.

Figure 5. DF Proteins Redundantly Contribute to Suppressing
Differentiation of Leukemia Cells
(A) TNFRSF1B mRNA expression is increased upon depletion of all DF paralogs.
TNFRSF1B mRNA levels were measured by qRT-PCR in MOLM-13
cells after knockdown of each paralog. DF2 silencing led to increased levels
of TNFRSF1B. However, a larger increase was seen upon silencing all three DF
paralogs. TNFRSF1B levels were normalized to RPS28 expression levels, a
non-methylated mRNA (Vu et al., 2017). Non-parametric ANOVA test, ****p ≤
0.0001, n = 3, mean ± SEM.
(B) Expression of the CD14 differentiation marker in MOLM-13 cells is strongly
induced by silencing all three DF paralogs. Shown is the percentage of CD14+
cells after silencing of the indicated DF paralog(s). For each DF paralog, two
different short-hairpin RNAs (shRNAs) were tested in two biological replicates,
with the two different shRNAs indicated as a closed circle and an open circle.
Upon triple knockdown, the percentage of CD14+ cells is significantly
increased. Non-parametric ANOVA test; ****p < 0.0001, ***p = 0.0010, **p =
0.0012, *p = 0.03.

DISCUSSION

The YTHDF family proteins DF1, DF2, and DF3 are major cytosolic
m6A-binding proteins and are thought to mediate the
effects of m6A in the cytosol (Patil et al., 2018). In contrast
to the prevailing model, where each DF paralog binds to
distinct subsets of mRNAs, we show that the DF paralogs
bind proportionately to each m6A site throughout the
transcriptome. Additionally, rather than exerting different effects
on different m6A-mRNAs, we show that the three DF
proteins function together to mediate degradation of m6A-containing
mRNAs. Although depletion of single DF paralogs
leads to mild or no effects on mRNA abundance and stability,
depleting all three DF proteins leads to robust stabilization of
m6A-mRNAs, suggesting that each paralog can fully or
partially compensate for the function of the other DF paralogs.
Lastly, we find that previous studies that linked DF proteins to
mRNA translation were affected by bioinformatic and technical
issues, which led to the incorrect view that a major function
of DF proteins was to promote translation. Instead, we
show that DF paralogs do not appear to directly enhance
translation of m6A-mRNAs in HeLa cells. Our comprehensive
analysis of DF paralog function reveals a unified model of
DF protein binding and function, with the major effect of
m6A being to mediate mRNA degradation through the combined
effects of all three DF paralogs.
The issue of whether DF paralogs bind different mRNAs or
the same mRNAs is a central question for understanding m6A
function in cells. If each DF paralog binds different mRNAs,
then each DF would affect different cellular pathways and processes.
This prevailing model has created the impetus for
knockout studies focusing on separate analysis of each DF paralog,
with the goal of understanding the specific functions of
each. However, this model lacks a clear mechanism to explain
how the DF paralogs could bind different m6A sites.
Our analysis supports the opposite conclusion. We find that
all m6A sites bind all DF proteins in an essentially indistinguishable
manner, with the main determinant of DF paralog binding
simply being the presence of the m6A site. The level of DF binding
is likely to be correlated with m6A stoichiometry, which would
positively correlate with the degradation effect. It should be
noted that some m6A sites may bind DF paralogs poorly when
they are obscured by local RNA structure or by binding of nearby
RNA-binding proteins that limit access to m6A.
Another central question is that of establishing the function of
the major cytoplasmic m6A readers DF1, DF2, and DF3. This is
arguably the most important step in understanding m6A biology
because it can explain how the effects of m6A can be rationalized
in diverse m6A-dependent processes. The current concept is
that m6A, through the action of different DF proteins, can induce
mRNA translation, mRNA degradation, or both and that this effect
is transcript specific (Shi et al., 2017, 2019).
In contrast to this prevailing model, we find diverse lines of evidence
supporting the idea that DF proteins have similar rather
than different functions. In addition to the high sequence identity
in the RNA-binding and effector domains, DF proteins have
similar cytoplasmic localizations, including P-body localization,
and similar protein interactors, which are hallmarks of proteins
with similar functions. Most notable among the binding partners
for all three DF proteins are CCR4-NOT deadenylation complex
proteins, supporting the idea that all DF paralogs have a common
role in mRNA degradation. A recent study showed that all
three DF proteins, when heterologously expressed, bind
CNOT1 and elicit mRNA deadenylation (Du et al., 2016). The ability
of all three DF proteins to mediate deadenylation supports the
overall finding that the major function of these proteins is to
mediate degradation of m6A-mRNAs. Further studies may also
address whether the position of m6A along the transcript body
has an effect on DF-mediated mRNA degradation.
Our data do not support the idea that DF paralogs have a
direct role in regulating mRNA translation efficiency. Our reanalysis
of previously published data, together with our independent
ribosome profiling datasets and analysis of individual m6A-mRNAs
by polysome fractionation analysis performed in the
same cell line, shows no evidence of decreased translation of
m6A-mRNAs upon depletion of DF proteins. This conclusion is
consistent with the previously published protein polysome fractionation
analysis (Wang et al., 2015) as well as the new one presented
here, which do not show an enrichment of DF proteins in
the fractions containing highly translated mRNAs. Thus, DF proteins,
including DF1, do not behave like other translation-enhancing
RNA-binding proteins that bind the 3′ UTR (Le et al.,
1997; Ozoe et al., 2013; Wells et al., 1998). Additionally, protein
interaction analysis shows a lack of high-confidence interactions
of the DF paralogs with translation initiation factors in cells (Youn
et al., 2018). Overall, these diverse lines of evidence suggest that
DF proteins do not promote translation. Although we cannot
exclude a role of DF proteins in translation enhancement in other
cell lines or conditions, our current findings are not consistent
with a role of any DF paralog in regulating m6A-mRNA translation,
at least in the original cell line and conditions in which
DF1 was originally linked to translational regulation.
Although DF proteins do not promote translation, m6A can
promote translation through DF-independent mechanisms.
m6A can affect 3′ UTR length (Ke et al., 2017), which can indirectly
affect translation. m6A may directly bind eIF3 when m6A
is in the 5′ UTR (Meyer et al., 2015), or m6A may be associated
with bound METTL3, which may facilitate translation (Choe
et al., 2018). However, because eIF3 and METTL3 are thought
to bind just a small subset of m6A sites, the role of eIF3 and
METTL3 binding in the overall cellular effects of m6A is likely to
be limited to specific transcripts (Zaccara et al., 2019).
Our finding that all DF paralogs contribute to mRNA destabilization
reconciles findings made by diverse groups where
depletion of DF1 was not associated with reduced mRNA
translation. For example, DF1 depletion does not affect translation
of a heterologously expressed m6A-mRNA in MCF7 cells
(Slobodin et al., 2017) and does not affect translation of m6A-mRNAs
in neurons when using a DF1 tethering system unless
a stimulus is added (Shi et al., 2018). Thus, the lack of translation
regulation seen upon DF1 depletion can now be explained by
the new model of DF function presented here.
In light of our current findings, phenotypes seen upon DF
depletion likely derive from upregulation of m6A-mRNAs.
Depletion of any single DF protein can affect mRNA levels to
a small degree, which can still result in clear cellular effects.
For example, recent studies show that m6A suppresses interferon
beta (IFNB1) mRNA levels and that knockdown of
different DF paralogs can cause a small increase in the abundance
and translation of this transcript (Winkler et al., 2019).
Because cells are sensitive to small increases in IFNB1, single
DF knockdown can lead to readily detectable cellular effects.
These hypomorphic phenotypes may be different from
METTL3 depletion because METTL3 depletion would cause
a more substantial increase in the levels of highly m6A-modified
mRNAs and affect multiple cellular pathways, potentially
leading to very different phenotypes.
Although DF proteins appear to have similar functions,
depletion of different DF paralogs can have different effects.
This is because DF proteins exhibit markedly different expression
levels (Patil et al., 2018). Therefore, depletion of a low-abundance
DF paralog, such as DF3, is likely to only affect
a small number of highly sensitive mRNAs, whereas depletion
of a higher-abundance DF paralog would affect a larger subset
of mRNAs, causing a different phenotype. Additionally,
because DF paralogs may be expressed at different levels in
different tissues, single DF paralog depletion can result in tissue-specific
phenotypes (Nishizawa et al., 2017). DF paralogs
may also be phosphorylated in a paralog-specific manner
(Patil et al., 2018; Zaccara et al., 2019) or selectively induced
in response to specific stimuli, as seen with p63-dependent
induction of DF3 (Birkaya et al., 2007; Shi et al., 2017). These
pathways and subtle DF-paralog amino acid sequence differences
may confer additional modes of regulation that can
affect the ability of DF paralogs to induce m6A-mRNA
degradation.
Notably, the functions of the DF paralogs may differ depending
on the cellular context. For example, in cell stress,
DF proteins bind m6A-mRNA and relocalize to stress granules,
but the mRNA is not targeted for degradation (Ries et al.,
2019). In neurons, DF proteins have also been localized
to transport granules and may therefore have roles in trafficking
m6A-mRNA to dendrites and, thus, indirectly promote
translation in neurons (Merkurjev et al., 2018; Ries et al.,
2019). However, for cell types that exhibit the classic
mRNA-destabilization effect of m6A, which has been seen in
diverse cell types (Geula et al., 2015; Ke et al., 2017; Vu
et al., 2017) and was originally described in 1978 (Sommer
et al., 1978), the likely mediator of this m6A-dependent
destabilization effect is the combined action of DF1, DF2,
and DF3.

STAR★METHODS

and include the following:
● KEY RESOURCES TABLE
  ○ Lead Contact
  ○ Material Availability
  ○ Data and Code Availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Cell Culture
  ○ siRNA silencing in HeLa cells
  ○ shRNA silencing in MOLM-13 cells
  ○ DF1, DF2 and DF3 overexpression and imaging
  ○ Protein expression and purification
● METHOD DETAILS
  ○ Flow cytometry
  ○ Antibody staining and 3D-SIM imaging.
  ○ Polysome
  ○ Generation of the ribosome profiling and RNA seq libraries upon DFs silencing
  ○ Actinomycin D treatment
  ○ Quantitative PCR analysis
  ○ Microscale Thermophoresis (MST)
● QUANTIFICATION AND STATISTICAL ANALYSIS
  ○ Protein-protein interactome analysis
  ○ Analysis of the physiochemical properties of each DF paralog.
  ○ Reanalysis of publicly available PAR-CLIP data of DF1, DF2 and DF3 in HeLa cells
  ○ Comparison of the coverage of DF proteins at each m6A site on mRNAs throughout the transcriptome.
  ○ Calculation of coverage at each DF1 and DF2 unique sites
  ○ Analysis of the ribosome profiling and RNA seq data upon silencing of DF proteins
  ○ Reanalysis of publicly available ribosome profiling data upon silencing of DF1 in HeLa cells
  ○ Analysis of m6A mRNA stability upon DF paralog depletion.

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.012.

ACKNOWLEDGMENTS

We thank members of the Jaffrey laboratory for comments and suggestions, in
particular V. Despic, and D. Patil for early contributions to this project. We
thank members of the Epigenomics and Flow Cytometry Weill Cornell Cores.
We thank A. North and the Rockefeller Bio-Imaging Resource Center for assistance
with SIM microscopy. We thank the Rockefeller High-Throughput
Screening Resource Center for assistance with the MST experiments. We
thank Y. Cheng and M. Kharas for advice regarding the experiments using leukemia
cells. This work was supported by NIH grants R35NS111631 and
R01CA186702 (to S.R.J.) and an American-Italian Cancer Foundation fellowship
(to S.Z.).

AUTHOR CONTRIBUTIONS

S.Z. and S.R.J. designed the experiments. S.Z. performed the experiments.
S.Z. and S.R.J. wrote the manuscript.

DECLARATION OF INTERESTS

S.R.J. is scientific founder of, advisor to, and owns equity in Gotham
Therapeutics.

Received: July 23, 2019
Revised: January 14, 2020
Accepted: May 4, 2020
Published: June 2, 2020

REFERENCES

Alberti, S., Gladfelter, A., and Mittag, T. (2019). Considerations and Challenges
in Studying Liquid-Liquid Phase Separation and Biomolecular Condensates.
Cell 176, 419–434.
Anders, M., Chelysheva, I., Goebel, I., Trenkner, T., Zhou, J., Mao, Y., Verzini,
S., Qian, S.-B., and Ignatova, Z. (2018). Dynamic m6A methylation
facilitates mRNA triaging to stress granules. Life Sci. Alliance 1,
e201800113.
Arava, Y., Wang, Y., Storey, J.D., Liu, C.L., Brown, P.O., and Herschlag, D.
(2003). Genome-wide analysis of mRNA translation profiles in Saccharomyces
cerevisiae. Proc. Natl. Acad. Sci. USA 100, 3889–3894.
Arguello, A.E., Leach, R.W., and Kleiner, R.E. (2019). In Vitro Selection
with a Site-Specifically Modified RNA Library Reveals the Binding Preferences
of N6-Methyladenosine Reader Proteins. Biochemistry 58,
3386–3395.
Bailey, T.L., Boden, M., Buske, F.A., Frith, M., Grant, C.E., Clementi, L., Ren,
J., Li, W.W., and Noble, W.S. (2009). MEME SUITE: tools for motif discovery
and searching. Nucleic Acids Res. 37, W202-8.
Barbieri, I., Tzelepis, K., Pandolfini, L., Shi, J., Millán-Zambrano, G., Robson,
S.C., Aspris, D., Migliori, V., Bannister, A.J., Han, N., et al. (2017).
Promoter-bound METTL3 maintains myeloid leukaemia by m6A-dependent translation
control. Nature 552, 126–131.
Birkaya, B., Ortt, K., and Sinha, S. (2007). Novel in vivo targets of DeltaNp63 in
keratinocytes identified by a modified chromatin immunoprecipitation
approach. BMC Mol. Biol. 8, 43.
Calviello, L., Mukherjee, N., Wyler, E., Zauber, H., Hirsekorn, A., Selbach, M.,
Landthaler, M., Obermayer, B., and Ohler, U. (2016). Detecting actively translated
open reading frames in ribosome profiling data. Nat. Methods 13,
165–170.
Canaani, D., Kahana, C., Lavi, S., and Groner, Y. (1979). Identification and
mapping of N6-methyladenosine containing sequences in simian virus 40
RNA. Nucleic Acids Res. 6, 2879–2899.
Chakrabarti, A.M., Haberman, N., Praznik, A., Luscombe, N.M., and Ule, J.
(2018). Data Science Issues in Studying Protein–RNA Interactions with CLIP
Technologies. Annu. Rev. Biomed. Data Sci. 1, 235–261.
Choe, J., Lin, S., Zhang, W., Liu, Q., Wang, L., Ramirez-Moya, J., Du, P.,
Kim, W., Tang, S., Sliz, P., et al. (2018). mRNA circularization by
METTL3-eIF3h enhances translation and promotes oncogenesis. Nature 561,
556–560.
Clancy, M.J., Shambaugh, M.E., Timpte, C.S., and Bokar, J.A. (2002). Induction
of sporulation in Saccharomyces cerevisiae leads to the formation of
N6-methyladenosine in mRNA: a potential mechanism for the activity of the IME4
gene. Nucleic Acids Res. 30, 4509–4518.
Cui, Q., Shi, H., Ye, P., Li, L., Qu, Q., Sun, G., Sun, G., Lu, Z., Huang, Y.,
Yang, C.-G., et al. (2017). m6A RNA Methylation Regulates the
Self-Renewal and Tumorigenesis of Glioblastoma Stem Cells. Cell Rep. 18,
2622–2634.
Dimock, K., and Stoltzfus, C.M. (1977). Sequence specificity of internal
methylation in B77 avian sarcoma virus RNA subunits. Biochemistry 16,
471–478.
Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut,
P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq
aligner. Bioinformatics 29, 15–21.
Dodt, M., Roehr, J.T., Ahmed, R., and Dieterich, C. (2012). FLEXBAR-Flexible
Barcode and Adapter Processing for Next-Generation Sequencing Platforms.
Biology (Basel) 1, 895–905.
Dominguez, D., Freese, P., Alexis, M.S., Su, A., Hochman, M., Palden, T., Bazile,
C., Lambert, N.J., Van Nostrand, E.L., Pratt, G.A., et al. (2018). Sequence,
Structure, and Context Preferences of Human RNA Binding Proteins. Mol. Cell
70, 854–867.e9.
Du, H., Zhao, Y., He, J., Zhang, Y., Xi, H., Liu, M., Ma, J., and Wu, L. (2016).
YTHDF2 destabilizes m(6)A-containing RNA through direct recruitment of
the CCR4-NOT deadenylase complex. Nat. Commun. 7, 12626.
Geula, S., Moshitch-Moshkovitz, S., Dominissini, D., Mansour, A.A., Kol, N.,
Salmon-Divon, M., Hershkovitz, V., Peer, E., Mor, N., Manor, Y.S., et al.
(2015). Stem cells. m6A mRNA methylation facilitates resolution of naïve pluripotency
toward differentiation. Science 347, 1002–1006.
Guertin, M.J., Cullen, A.E., Markowetz, F., and Holding, A.N. (2018). Parallel
factor ChIP provides essential internal control for quantitative differential
ChIP-seq. Nucleic Acids Res. 46, e75.
Han, D., Liu, J., Chen, C., Dong, L., Liu, Y., Chang, R., Huang, X., Liu, Y.,
Wang, J., Dougherty, U., et al. (2019). Anti-tumour immunity controlled
through mRNA m6A methylation and YTHDF1 in dendritic cells. Nature
566, 270–274.
Hesser, C.R., Karijolich, J., Dominissini, D., He, C., and Glaunsinger, B.A.
(2018). N6-methyladenosine modification and the YTHDF2 reader
protein play cell type specific roles in lytic viral gene expression during
Kaposi’s sarcoma-associated herpesvirus infection. PLoS Pathog. 14,
e1006995.
Holehouse, A.S., Das, R.K., Ahad, J.N., Richardson, M.O.G., and Pappu, R.V.
(2017). CIDER: Resources to Analyze Sequence-Ensemble Relationships of
Intrinsically Disordered Proteins. Biophys. J. 112, 16–21.
Ke, S., Pandya-Jones, A., Saito, Y., Fak, J.J., Vågbø, C.B., Geula, S.,
Hanna, J.H., Black, D.L., Darnell, J.E., Jr., and Darnell, R.B. (2017). m6A
mRNA modifications are deposited in nascent pre-mRNA and are not
required for splicing but do specify cytoplasmic turnover. Genes Dev. 31,
990–1006.
Kennedy, E.M., Bogerd, H.P., Kornepati, A.V., Kang, D., Ghoshal, D., Marshall,
J.B., Poling, B.C., Tsai, K., Gokhale, N.S., Horner, S.M., and Cullen, B.R.
(2016). Posttranscriptional m(6)A Editing of HIV-1 mRNAs Enhances Viral
Gene Expression. Cell Host Microbe 19, 675–685.
Lancaster, A.K., Nutter-Upham, A., Lindquist, S., and King, O.D. (2014).
PLAAC: a web and command-line application to identify proteins with
prion-like amino acid composition. Bioinformatics 30, 2501–2502.
Landt, S.G., Marinov, G.K., Kundaje, A., Kheradpour, P., Pauli, F., Batzoglou,
S., Bernstein, B.E., Bickel, P., Brown, J.B., Cayting, P., et al. (2012). ChIP-seq
guidelines and practices of the ENCODE and modENCODE consortia.
Genome Res. 22, 1813–1831.
Lauria, F., Tebaldi, T., Bernabò, P., Groen, E.J.N., Gillingwater, T.H., and Viero,
G. (2018). riboWaltz: Optimization of ribosome P-site positioning in ribosome
profiling data. PLoS Comput. Biol. 14, e1006169.
Le, H., Tanguay, R.L., Balasta, M.L., Wei, C.C., Browning, K.S., Metz, A.M.,
Goss, D.J., and Gallie, D.R. (1997). Translation initiation factors eIF-iso4G
and eIF-4B interact with the poly(A)-binding protein and increase its RNA binding
activity. J. Biol. Chem. 272, 16247–16255.
Lee, H., Bao, S., Qian, Y., Geula, S., Leslie, J., Zhang, C., Hanna, J.H., and
Ding, L. (2019). Stage-specific requirement for Mettl3-dependent m6A
mRNA methylation during haematopoietic stem cell differentiation. Nat. Cell
Biol. 21, 700–709.
Li, F., Zhao, D., Wu, J., and Shi, Y. (2014). Structure of the YTH domain of human
YTHDF2 in complex with an m(6)A mononucleotide reveals an aromatic
cage for m(6)A recognition. Cell Res. 24, 1490–1492.
Li, A., Chen, Y.-S., Ping, X.-L., Yang, X., Xiao, W., Yang, Y., Sun, H.-Y., Zhu, Q.,
Baidya, P., Wang, X., et al. (2017). Cytoplasmic m6A reader YTHDF3 promotes
mRNA translation. Cell Res. 27, 444–447.
Linder, B., Grozhik, A.V., Olarerin-George, A.O., Meydan, C., Mason, C.E., and
Jaffrey, S.R. (2015). Single-nucleotide-resolution mapping of m6A and m6Am
throughout the transcriptome. Nat. Methods 12, 767–772.
Liu, J., Eckert, M.A., Harada, B.T., Liu, S.M., Lu, Z., Yu, K., Tienda, S.M., Chryplewicz,
A., Zhu, A.C., Yang, Y., et al. (2018). m6A mRNA methylation regulates
AKT activity to promote the proliferation and tumorigenicity of endometrial
cancer. Nat. Cell Biol. 20, 1074–1083.
Madeira, F., Park, Y.M., Lee, J., Buso, N., Gur, T., Madhusoodanan, N., Basutkar,
P., Tivey, A.R.N., Potter, S.C., Finn, R.D., and Lopez, R. (2019). The
EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res. 47
(W1), W636–W641.
Markmiller, S., Soltanieh, S., Server, K.L., Mak, R., Jin, W., Fang, M.Y., Luo,
E.-C., Krach, F., Yang, D., Sen, A., et al. (2018). Context-Dependent and
Disease-Specific Diversity in Protein Interactions within Stress Granules. Cell 172,
590–604.e13.
McCarthy, D.J., Chen, Y., and Smyth, G.K. (2012). Differential expression analysis
of multifactor RNA-Seq experiments with respect to biological variation.
Nucleic Acids Res. 40, 4288–4297.
McGlincy, N.J., and Ingolia, N.T. (2017). Transcriptome-wide measurement of
translation by ribosome profiling. Methods 126, 112–129.
Merkurjev, D., Hong, W.-T., Iida, K., Oomoto, I., Goldie, B.J., Yamaguti, H.,
Ohara, T., Kawaguchi, S.-Y., Hirano, T., Martin, K.C., et al. (2018). Synaptic
N6-methyladenosine (m6A) epitranscriptome reveals functional partitioning of
localized transcripts. Nat. Neurosci. 21, 1004–1014.
Meyer, K.D., Patil, D.P., Zhou, J., Zinoviev, A., Skabkin, M.A., Elemento, O.,
Pestova, T.V., Qian, S.B., and Jaffrey, S.R. (2015). 5′ UTR m(6)A Promotes
Cap-Independent Translation. Cell 163, 999–1010.
Nishizawa, Y., Konno, M., Asai, A., Koseki, J., Kawamoto, K., Miyoshi, N., Takahashi,
H., Nishida, N., Haraguchi, N., Sakai, D., et al. (2017). Oncogene
c-Myc promotes epitranscriptome m6A reader YTHDF1 expression in colorectal
cancer. Oncotarget 9, 7476–7486.
Oates, M.E., Romero, P., Ishida, T., Ghalwash, M., Mizianty, M.J., Xue, B.,
Dosztányi, Z., Uversky, V.N., Obradovic, Z., Kurgan, L., et al. (2013). D2P2:
database of disordered protein predictions. Nucleic Acids Res. 41,
D508–D516.
Olarerin-George, A.O., and Jaffrey, S.R. (2017). MetaPlotR: a Perl/R pipeline
for plotting metagenes of nucleotide modifications and other transcriptomic
sites. Bioinformatics 33, 1563–1564.
Ozoe, A., Sone, M., Fukushima, T., Kataoka, N., Arai, T., Chida, K., Asano, T.,
Hakuno, F., and Takahashi, S. (2013). Insulin receptor substrate-1 (IRS-1)
forms a ribonucleoprotein complex associated with polysomes. FEBS Lett.
587, 2319–2324.
Paris, J., Morgan, M., Campos, J., Spencer, G.J., Shmakova, A., Ivanova,
I., Mapperley, C., Lawson, H., Wotherspoon, D.A., Sepulveda, C., et al.
(2019). Targeting the RNA m6A Reader YTHDF2 Selectively Compromises
Cancer Stem Cells in Acute Myeloid Leukemia. Cell Stem Cell 25,
137–148.e6.
Park, O.H., Ha, H., Lee, Y., Boo, S.H., Kwon, D.H., Song, H.K., and Kim, Y.K.
(2019). Endoribonucleolytic Cleavage of m6A-Containing RNAs by RNase P/
MRP Complex. Mol. Cell 74, 494–507.e8.
Patil, D.P., Chen, C.-K., Pickering, B.F., Chow, A., Jackson, C., Guttman, M.,
and Jaffrey, S.R. (2016). m(6)A RNA methylation promotes XIST-mediated
transcriptional repression. Nature 537, 369–373.
Patil, D.P., Pickering, B.F., and Jaffrey, S.R. (2018). Reading m6A in the Transcriptome:
m6A-Binding Proteins. Trends Cell Biol. 28, 113–127.
Quinlan, A.R., and Hall, I.M. (2010). BEDTools: a flexible suite of utilities for
comparing genomic features. Bioinformatics 26, 841–842.
Ramírez, F., Ryan, D.P., Grüning, B., Bhardwaj, V., Kilpert, F., Richter, A.S.,
Heyne, S., Dündar, F., and Manke, T. (2016). deepTools2: a next generation
web server for deep-sequencing data analysis. Nucleic Acids Res. 44 (W1),
W160-5.
Ries, R.J., Zaccara, S., Klein, P., Olarerin-George, A., Namkoong, S.,
Pickering, B.F., Patil, D.P., Kwak, H., Lee, J.H., and Jaffrey, S.R. (2019).
m6A enhances the phase separation potential of mRNA. Nature 571,
424–428.
Risso, D., Ngai, J., Speed, T.P., and Dudoit, S. (2014). Normalization of
RNA-seq data using factor analysis of control genes or samples. Nat. Biotechnol.
32, 896–902.
Shi, H., Zhang, X., Weng, Y.-L., Lu, Z., Liu, Y., Lu, Z., Li, J., Hao, P., Zhang, Y.,
Zhang, F., et al. (2018). m6A facilitates hippocampus-dependent learning and
memory through YTHDF1. Nature 563, 249–253.
Shi, H., Wei, J., and He, C. (2019). Where, When, and How: Context-Dependent
Functions of RNA Methylation Writers, Readers, and Erasers. Mol. Cell
74, 640–650.
Slobodin, B., Han, R., Calderone, V., Vrielink, J.A.F.O., Loayza-Puch, F., Elkon,
R., and Agami, R. (2017). Transcription Impacts the Efficiency of mRNA Translation
via Co-transcriptional N6-adenosine Methylation. Cell 169,
326–337.e12.
Sommer, S., Lavi, U., and Darnell, J.E., Jr. (1978). The absolute frequency of
labeled N-6-methyladenosine in HeLa cell messenger RNA decreases with label
time. J. Mol. Biol. 124, 487–499.
Tirumuru, N., Zhao, B.S., Lu, W., Lu, Z., He, C., and Wu, L. (2016).
N(6)-methyladenosine of HIV-1 RNA regulates viral infection and HIV-1 Gag protein
expression. eLife 5, e15528.
Vu, L.P., Pickering, B.F., Cheng, Y., Zaccara, S., Nguyen, D., Minuesa, G.,
Chou, T., Chow, A., Saletore, Y., MacKay, M., et al. (2017). The
N6-methyladenosine (m6A)-forming enzyme METTL3 controls myeloid differentiation
of normal hematopoietic and leukemia cells. Nat. Med. 23,
1369–1376.
Wang, X., and He, C. (2014). Reading RNA methylation codes through
methyl-specific binding proteins. RNA Biol. 11, 669–672.
Wang, X., Lu, Z., Gomez, A., Hon, G.C., Yue, Y., Han, D., Fu, Y., Parisien, M.,
Dai, Q., Jia, G., et al. (2014a). N6-methyladenosine-dependent regulation of
messenger RNA stability. Nature 505, 117–120.
Wang, Y., Li, Y., Toth, J.I., Petroski, M.D., Zhang, Z., and Zhao, J.C. (2014b).
N6-methyladenosine modification destabilizes developmental regulators in
embryonic stem cells. Nat. Cell Biol. 16, 191–198.
Wang, X., Zhao, B.S., Roundtree, I.A., Lu, Z., Han, D., Ma, H., Weng, X., Chen,
K., Shi, H., and He, C. (2015). N(6)-methyladenosine Modulates Messenger
RNA Translation Efficiency. Cell 161, 1388–1399.
Wei, C.M., Gershowitz, A., and Moss, B. (1976). 5′-Terminal and internal
methylated nucleotide sequences in HeLa cell mRNA. Biochemistry 15,
397–401.
Wells, S.E., Hillner, P.E., Vale, R.D., and Sachs, A.B. (1998). Circularization
of mRNA by eukaryotic translation initiation factors. Mol. Cell 2,
135–140.
Wheeler, E.C., Van Nostrand, E.L., and Yeo, G.W. (2018). Advances and challenges
in the detection of transcriptome-wide protein-RNA interactions. Wiley
Interdiscip. Rev. RNA 9, e1436.
Winkler, R., Gillis, E., Lasman, L., Safra, M., Geula, S., Soyris, C., Nachshon,
A., Tai-Schmiedel, J., Friedman, N., Le-Trilling, V.T.K., et al. (2019). m6A modification
controls the innate immune response to infection by targeting type I interferons.
Nat. Immunol. 20, 173–182.
Wu, C.C.-C., Zinshteyn, B., Wehner, K.A., and Green, R. (2019). High-Resolution
Ribosome Profiling Defines Discrete Ribosome Elongation States
and Translational Regulation during Cellular Stress. Mol. Cell 73,
959–970.e5.
Xu, C., Liu, K., Ahmed, H., Loppnau, P., Schapira, M., and Min, J. (2015). Structural
Basis for the Discriminative Recognition of N6-Methyladenosine RNA by
the Human YT521-B Homology Domain Family of Proteins. J. Biol. Chem. 290,
24902–24913.
Youn, J.-Y., Dunham, W.H., Hong, S.J., Knight, J.D.R., Bashkurov, M., Chen,
G.I., Bagci, H., Rathod, B., MacLeod, G., Eng, S.W.M., et al. (2018). High-Density
Proximity Mapping Reveals the Subcellular Organization of mRNA-Associated
Granules and Bodies. Mol. Cell 69, 517–532.e11.
Schwartz, S., Mumbach, M.R., Jovanovic, M., Wang, T., Maciag, K., Bushkin,
G.G., Mertins, P., Ter-Ovanesyan, D., Habib, N., Cacchiarelli, D., et al. (2014).
Perturbation of m6A writers reveals two distinct classes of mRNA methylation Zaccara, S., Ries, R.J., and Jaffrey, S.R. (2019). Reading, writing and erasing
at internal and 50 sites. Cell Rep. 8, 284–296. mRNA methylation. Nat. Rev. Mol. Cell Biol. 20, 608–624.
Shi, H., Wang, X., Lu, Z., Zhao, B.S., Ma, H., Hsu, P.J., Liu, C., and He, C. Zhong, S., Li, H., Bodi, Z., Button, J., Vespa, L., Herzog, M., and Fray, R.G.
(2017). YTHDF3 facilitates translation and decay of N6-methyladenosine- (2008). MTA is an Arabidopsis messenger RNA adenosine methylase and inter-
modified RNA. Cell Res. 27, 315–328. acts with a homolog of a sex-specific splicing factor. Plant Cell 20, 1278–1288.
Cell 181, 1582–1595, June 25, 2020 1595

STAR+METHODS
KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
Mouse anti-Edc4 (H-12) | Santa Cruz Biotechnology | Cat# sc-376382; RRID: AB_10988077
Rabbit anti-YTHDF1 | Proteintech Group | Cat# 17479-1-AP; RRID: AB_2217473
Rabbit anti-YTHDF2 | Aviva System Biology | Cat# ARP67917_P050
Rabbit anti-YTHDF3 | Abcam | Cat# ab103328; RRID: AB_10710895
Rabbit anti-GAPDH | Santa Cruz Biotechnology | Cat# sc-365062; RRID: AB_10847862
Rabbit anti-S6 (5G10) | Cell Signaling Technology | Cat# 2217; RRID: AB_331355
Alexa Fluor 488 goat anti-mouse IgG (H+L) | Thermo Fisher Scientific | Cat# A-11001; RRID: AB_2534069
Alexa Fluor 568 goat anti-rabbit IgG (H+L) | Thermo Fisher Scientific | Cat# A-11011; RRID: AB_143157
Mouse anti-CD14 (M5E2) | BD Bioscience | Cat# 555398; RRID: AB_395799
Rabbit anti-PABP | Abcam | Cat# ab21060; RRID: AB_777008
Rosetta2 DE3 | Novagen | Cat# 70954
Human 6xHis-YTHDF1 | Patil et al., 2016 | https://doi.org/10.1038/nature19342
Talon Metal Affinity Resin | Clontech | Cat# PT1320-1
Monolith Protein Labeling Kit RED-NHS 2nd Generation | Nanotempertech | Cat# MO-L011
NEBNext Ultra II Directional RNA Library Prep Kit for Illumina | New England Biolabs | Cat# E7760L
NEBNext rRNA Depletion Kit (Human/Mouse/Rat) | New England Biolabs | Cat# E6310X
Deposited Data and Code
RNA-seq, siDF1-siDF2-siDF3 in HeLa cells | This paper | GSE134380
Ribo-seq, siDF1-siDF2-siDF3 in HeLa cells | This paper | GSE134380
RNA-seq upon Actinomycin D treatment, siDF1-siDF2-siDF3 in HeLa cells | This paper | GSE134380
Code used to analyze the Ribo-seq datasets | This paper | https://doi.org/10.17632/nsz68ywvvx.1
iCLIP, DF1-DF2-DF3 in HEK293T cells | Patil et al., 2016 | GSE78030
PARCLIP, DF1 in HeLa cells | Wang et al., 2015 | GSE63591
PARCLIP, DF2 in HeLa cells | Wang et al., 2014a | GSE49339
PARCLIP, DF3 in HeLa cells | Shi et al., 2017 | GSE86214
Ribo-seq and RNA-seq, siDF1 in HeLa cells | Wang et al., 2015 | GSE63591
Ribo-seq and RNA-seq, siDF2 in HeLa cells | Wang et al., 2014a | GSE49339
Ribo-seq and RNA-seq, siDF3 in HeLa cells | Shi et al., 2017 | GSE86214
m6A-CLIP-seq in HeLa cells | Ke et al., 2017 | GSE86336
miCLIP-seq in HEK293T | Linder et al., 2015 | GSE63753
Human female origin: HeLa cells | ATCC | Cat# CCL-2
Human male origin: MOLM13 cells | Gift from M. Kharas’s lab | NA
Oligonucleotides
Single m6A-DRACH 10-nt oligo: rUrCrCrGrG/iN6Me-rA/rCrUrGrU | Xu et al., 2015 | https://doi.org/10.1074/jbc.M115.680389
Human YTHDF2 shRNA | Sigma | Cat# TRCN000254410
Human YTHDF3 shRNA | Sigma | Cat# TCN000365173
Human YTHDF1, YTHDF2 and YTHDF3 shRNA | Patil et al., 2016 | https://doi.org/10.1038/nature19342
ERCC RNA Spike-in Mix | Thermo Fisher Scientific | Cat# 4456740
Recombinant DNA
pcDNA3-FLAG-HA | Patil et al., 2016 | https://doi.org/10.1038/nature19342
pcDNA3-FLAG-HA-YTHDF1 | Patil et al., 2016 | https://doi.org/10.1038/nature19342
GraphPad Prism 8 | GraphPad Software | https://www.graphpad.com/scientific-software/prism/
RStudio (1.0.153) | RStudio | https://rstudio.com/
Fiji (ImageJ 2.0.0-rc-68/1.52h) | NIH | https://fiji.sc/
Imaris | Bitplane | https://imaris.oxinst.com
FLEXBAR | Dodt et al., 2012 | https://github.com/seqan/flexbar/wiki
riboWaltz | Lauria et al., 2018 | https://github.com/LabTranslationalArchitectomics/riboWaltz
STAR | Dobin et al., 2013 | https://github.com/alexdobin/STAR
BEDTools | Quinlan and Hall, 2010 | https://github.com/arq5x/bedtools2
MEME Suite | Bailey et al., 2009 | http://meme-suite.org/index.html
DeepTools | Ramírez et al., 2016 | https://deeptools.readthedocs.io/en/develop/
Lead Contact
Information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Samie R. Jaffrey
(srj2003@med.cornell.edu).
Material Availability

Data and Code Availability
All datasets generated during this study have been deposited in the NCBI Gene Expression Omnibus under accession number GEO: GSE134380.
The code used to analyze the ribosome profiling datasets in this study is available at Mendeley Data: https://doi.org/10.17632/nsz68ywvvx.1.
Cell Culture
HeLa (ATCC CCL-2) cells of female origin were maintained in 1x DMEM (11995-065, Life Technologies) with 10% FBS, 100 U ml⁻¹
penicillin and 100 µg ml⁻¹ of streptomycin under standard culture conditions. Cells were split with TrypLE Express (Life Technologies)
according to the manufacturer’s instructions. HeLa cells were authenticated. MOLM-13 cells (male origin) were maintained in 1x RPMI with
10% FBS, 100 U ml⁻¹ penicillin and 100 µg ml⁻¹ of streptomycin under standard culture conditions. MOLM-13 cells were a kind donation
from the laboratory of M. Kharas and were not authenticated during this study.
siRNA silencing in HeLa cells

Cells were seeded in 6-well plates. After 24 h, 30 nM siRNA was transfected using Pepmute transfection reagent (Signagen) according
to the manufacturer’s instructions. 48 h after the first transfection, a second transfection was performed. Cells were maintained at
70%–80% confluency and, when necessary, expanded in a 10-cm dish. Cells were then collected 4 days after the first transfection.
When silencing multiple DF proteins at the same time, the total concentration of siRNA in the final transfection mixture was kept
constant at 30 nM (e.g., silencing of DF1, DF2, and DF3 was performed by adding 10 nM of siRNAs targeting DF1, 10 nM of siRNAs
targeting DF2, and 10 nM of siRNAs targeting DF3 to the same transfection mixture; silencing of DF1 and DF2 was performed by
adding 15 nM of siRNA for DF1 and 15 nM of siRNA for DF2 to the same transfection mixture). siRNA sequences were previously
reported and validated (Patil et al., 2016). Knockdown was additionally validated by western blot prior to all experiments reported
in this study. In each silencing condition, the siRNA sequences were specific to the DF paralog of interest.
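The constant-total dosing described above is simple arithmetic, sketched below. The function name and the equal-split rule are illustrative assumptions drawn from the two examples in the text (10 nM each for three targets, 15 nM each for two); this is not the authors' code.

```python
def split_sirna_dose(total_nm, targets):
    """Divide a fixed total siRNA concentration (nM) equally among targets.

    Assumption: an equal split, as in the two examples given in the text.
    """
    if not targets:
        raise ValueError("at least one target is required")
    per_target = total_nm / len(targets)
    return {target: per_target for target in targets}


# Triple and double knockdowns at a constant 30 nM total.
triple = split_sirna_dose(30, ["DF1", "DF2", "DF3"])
double = split_sirna_dose(30, ["DF1", "DF2"])
print(triple)
print(double)
```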
shRNA silencing in MOLM-13 cells

MOLM-13 cells at a concentration of 500,000 cells ml⁻¹ were transduced with lentiviruses expressing shRNAs against DF1, DF2,
and DF3 or expressing a control scrambled shRNA sequence as performed previously (Vu et al., 2017). To increase the virus integration
efficiency, cells were spinoculated at 1400 RPM for 1 h. 24 h after spinoculation (Day 1), the cell medium was changed.
Cells were then collected 5 days after the transduction. shRNA sequences against DF1, DF2, and DF3 were previously reported
(Ries et al., 2019). In order to obtain at least two shRNAs for each DF, additional MISSION shRNA plasmid DNA sequences were
purchased from Sigma (DF2 shRNA: TRCN000254410, DF3 shRNA: TCN000365173). EGFP/TGFP expression was used to select
efficiently transduced cells by FACS sorting.
DF1, DF2 and DF3 overexpression and imaging

HeLa cells were transfected with FLAG-3xHA-tagged DF1, FLAG-3xHA-tagged DF2 and FLAG-3xHA-tagged DF3-expressing
plasmids using Fugene HD Transfection Reagent (E2311, Promega) according to the manufacturer’s instructions. A
FLAG-3xHA-expressing plasmid was used as control. Briefly, cells were plated on 35-mm glass-bottom dishes coated with 1%
gelatin and allowed to reach 40%–60% confluency before transfection. 24 h after transfection, cells were fixed, and immunostaining
was performed as described. To visualize the overexpressed DF, staining using an antibody against each DF paralog and the HA
tag was performed. Areas where the HA staining overlaps with the DF staining indicate the localization of the FLAG-3xHA-tagged DF.
After antibody staining, images were acquired using a Nikon TE-2000 inverted microscope with a 40x oil objective.
Protein expression and purification

For in vitro binding affinity studies of m6A-modified RNA to DF paralogs, each full-length DF paralog was purified from bacteria as
previously described (Patil et al., 2016; Ries et al., 2019). Briefly, DF proteins were expressed in Escherichia coli Rosetta2 (DE3)
(Novagen) using pProEx HTb (Invitrogen), with the optical density at 600 nm (OD600) checked every 20 min until the culture reached
log phase (0.4–0.6 OD600). DF protein expression was induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) for 16 h at 18°C.
Cells were collected, pelleted and then resuspended in the following buffer: 50 mM NaH2PO4 pH 7.2, 300 mM NaCl, 20 mM imidazole at
pH 7.2, supplemented with EDTA-free protease inhibitor cocktail (05892791001, Roche) according to the manufacturer’s instructions.
The cells were lysed by sonication and then centrifuged at 10,000 × g for 20 min. The soluble protein was purified using Talon Metal
Affinity Resin (Clontech) and eluted in the following buffer: 50 mM NaH2PO4 pH 7.2, 300 mM NaCl, 250 mM imidazole-HCl at pH 7.2.
Further concentration and buffer exchange were performed using Amicon Ultra-4 spin columns (Merck-Millipore) when proteins were not
directly labeled. Recombinant protein was stored in the following buffer: 20 mM HEPES pH 7.4, 300 mM KCl, 6 mM MgCl2, 0.02% NP40, with
50% glycerol at −80°C or 20% glycerol at −20°C. All protein purification steps were performed at 4°C. The purified protein was
quantified using an ND-2000C NanoDrop spectrophotometer (NanoDrop Technologies) at OD 280 and verified by Coomassie staining.
METHOD DETAILS
Flow cytometry
After 5 days of DF silencing, MOLM-13 cells were tested for the expression of the CD14 differentiation marker using the following
protocol. Cells were collected by centrifugation, washed once with 1X PBS + 2% FBS and then counted using a hemocytometer.
One million cells were used to perform cellular staining using PE-CD14 antibody (BD PharMingen). The antibody dilution was
performed according to the manufacturer’s instructions. Staining was performed on ice for 1 h. After the staining, excess unbound
antibody was removed by two washes with 1x PBS. DAPI was added prior to analysis. Cells were analyzed on a BD FACS ARIA
instrument. To calculate the percentage of CD14-positive cells, data were processed using FlowJo.
Antibody staining and 3D-SIM imaging.

HeLa cells were seeded on coated No. 1.5H (170 µm ± 5 µm) coverslips (474030-9000-000, Carl Zeiss) in 6-well plates. Coverslips
were coated with gelatin prior to seeding. 24 h after seeding, cells were washed twice with PBS at 25°C and then fixed with 4%
formaldehyde in 1X PBS for 15 min at 25°C. After two washes with 1X PBS to remove excess formaldehyde, cells were
permeabilized with permeabilization/blocking buffer (1x FBS, 1% Triton-X in 1x PBS) at room temperature for 30 min. Following
permeabilization/blocking, cells were incubated with the primary antibody in a humidified chamber for 2 h. Cells were then washed
three times with 1x PBS and then incubated with the appropriate secondary antibody (anti-rabbit Alexa Fluor 568, anti-mouse
Alexa Fluor 488) for 1 h at room temperature. Following the incubation, cells were washed as before. Hoechst staining was performed
for 5 min. After additional washes, coverslips were mounted in mounting media (Prolong Diamond, P36961, Life Technologies) and
quickly sealed with nail polish. Stained cells were imaged by super-resolution 3D-SIM on an OMX Blaze 3D-SIM super-resolution
microscope (Applied Precision) equipped with a 100x/1.40 UPLSAPO oil objective. To reduce spherical aberrations, an oil with
the optimal refractive index was first identified at the beginning of every acquisition session. Image reconstruction and alignment
were performed using SoftWoRx. Because all acquired images have AF-568 staining (red staining), Optimal-Red transfer functions
(OTFs) were used during the image processing. Further processing was performed in Imaris.
Polysome
HeLa cells were seeded on 10-cm dishes. At 70%–80% confluency, cells were treated with 100 µg/mL cycloheximide for 10 min at
37°C and then collected. Briefly, cells were washed with PBS in mild cycloheximide condition and then lysed directly on the dish
using 400 µl of lysis buffer (20 mM Tris HCl pH 7.4, 100 mM KCl, 5 mM MgCl2, 1% Triton X-100, 100 µg/ml cycloheximide, 2 mM DTT, 1x
cOmplete no EDTA protease inhibitor cocktail). The lysate was left on ice for 10 min and then cleared by centrifugation at 12,000 × g
for 15 min. The cleared lysates were then loaded onto 15%–50% linear sucrose gradients, ultra-centrifuged
and fractionated with an automated fraction collector. Proteins were extracted from each fraction using trichloroacetic acid. Sodium
deoxycholate was added as a carrier to assist in the protein precipitation. After acetone addition and washes, proteins were
resuspended in protein loading buffer, denatured at 95°C for 10 min, and used for detecting DF1, DF2, DF3 and RPS6 by western blot. RNA
was extracted from each fraction using TRIzol-LS reagent (Invitrogen). To account for differences in extraction efficiency, 2 ng of a
spike-in luciferase mRNA (TriLink Biotechnologies) was added to each fraction before RNA extraction. Briefly, RNA extraction was
performed using a ratio of 2:1 between the volume of TRIzol LS reagent and the sample. After TRIzol addition, RNA isolation was
performed according to the manufacturer’s instructions.
Generation of the ribosome profiling and RNA seq libraries upon DFs silencing
Ribosome profiling was performed following the original protocol (McGlincy and Ingolia, 2017) with minor modifications. Briefly, HeLa
cells were plated on a 10-cm dish and each DF or combination of DF transcripts was silenced as described. After the silencing, to
inhibit ribosome run-off during the collection and lysis, cells were rapidly washed once with ice-cold PBS containing 50 µg/ml
cycloheximide and the plates were immersed in liquid nitrogen and placed on dry ice as previously described (Calviello et al., 2016). After
allowing cells to reach 4°C by keeping them on ice, cells were immediately lysed in 400 µL of cell lysis buffer (20 mM Tris pH 7.4,
150 mM NaCl, 5 mM MgCl2, 1 mM DTT, 100 µg/mL cycloheximide, 1% Triton, 25 U DNase I) and the sample was triturated through
a 26-gauge needle ten times to further lyse the cellular material. 5% of the total cellular extract was used for RNA-seq library
preparation. 10% of the total cellular extract was used for western blot validation of the silencing. Lysate was then clarified by
centrifugation at 20,000 × g for 10 min at 4°C. Supernatant was collected, flash-frozen and stored at −80°C.
For the ribosome profiling library preparation, the RNA concentration in the cleared lysate was quantified using RiboGreen. 30 µg of
RNA was digested using RNase I (Epicenter) as previously described (McGlincy and Ingolia, 2017). After RNase digestion, lysates
were loaded on a sucrose gradient and centrifuged at 100,000 rpm (TLA 100.3 rotor) for 1 h to pellet ribosomes. RNA from the
resuspended ribosomal pellet was purified using Trizol and run on a gel to selectively excise foot-printed RNAs (from 17 nt to 34 nt
in length). To reduce ribosomal contamination in the library preparation steps, we then performed Ribo Zero Gold depletion of the
foot-printed RNA. The rRNA-depleted RNA fragments were dephosphorylated, and the adaptor was ligated. To specifically deplete
unligated linker, yeast 5′-deadenylase and RecJ exonuclease digestion was performed. At this point, the library preparation steps
were performed essentially as described previously (McGlincy and Ingolia, 2017). Briefly, reverse transcription was performed using
SuperScript III. To avoid untemplated nucleotide addition, reverse transcription was carried out at 57°C, as described (McGlincy and
Ingolia, 2017). cDNA purification, circularization, and amplification were performed as previously described (Linder et al., 2015).
Using this method, every ribosomal footprint represents a unique ribosome-binding event. Libraries were sequenced with a single-end
50 bp run using an Illumina HiSeq 2500 platform.
For the RNA-seq library preparation, the RNA was extracted from the 5% of the lysate set aside before the RNase digestion using Trizol LS.
The RNA quality was assessed by Bioanalyzer analysis. 1 µg of total RNA was used for library preparation using the NEBNext Ultra
Directional RNA Library Prep Kit. Ribosomal RNA was removed using NEBNext rRNA Depletion Kit. The libraries were sequenced on
the Illumina HiSeq 2500 instrument, in single read mode, 50 bases per read.
Actinomycin D treatment
HeLa cells were plated on a 10-cm dish and each DF or combination of DF proteins was silenced as described. After five days of
silencing, cells were treated with 5 µg/ml of actinomycin D or vehicle (DMSO) for 2 h to inhibit transcription. To ensure that the
treatment did not affect cell viability, cells were counted before collection. Total RNA was extracted using TRIzol and 1 µg of total RNA was
used to perform RNA-seq library preparation using the NEBNext Ultra Directional RNA Library Prep Kit. Ribosomal RNA was removed
using NEBNext rRNA Depletion Kit. Prior to ribosomal depletion, 2 µL of the 1:100 dilution of the ERCC (External RNA Controls
Consortium) RNA Spike-in Control Mix 1 was added to each 1 µg of total RNA sample, as suggested by the manufacturer’s
instructions.
Quantitative PCR analysis

Total RNA was isolated from cells using TRIzol according to the manufacturer’s instructions. For each condition, the same amount of
total RNA was reverse transcribed to cDNA using the SuperScript IV First-Strand kit. When the transcript abundance was tested upon
actinomycin D treatment in HeLa cells, oligo-dT primers were used during the cDNA synthesis step. This allowed us to selectively
convert to cDNA the intact RNA still present in cells upon actinomycin D treatment, while avoiding the conversion of
fragmented RNA to cDNA. When transcript abundance was tested in MOLM13 cells, random hexamers were used. Quantitative PCR (qPCR)
was performed using the iQ SYBR Green Supermix with 200 nM primers in a 10 µL final reaction mix. For the amplification, the following
protocol was used: 98°C for 3 min, 40 cycles of 95°C for 15 s and 60°C for 45 s. To test primer specificity, melting curves were
performed at the end of the 40 cycles of amplification. A delta cycle threshold (ΔCt) was calculated using the average Ct values across
technical triplicates, by subtracting the geometric mean of a control non-m6A gene (RPS28). To test the distribution of the mRNA
levels along the sucrose gradient, for each fraction, an equal volume of extracted RNA was reverse transcribed to cDNA. The list of
primers used for qPCR is presented in Table S4.
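The delta Ct calculation can be sketched as follows. This is a minimal illustration, not the authors' script: the function name and example Ct values are invented, and it assumes that "geometric mean of a control gene" means the geometric mean of the control gene's technical-replicate Ct values.

```python
from statistics import geometric_mean, mean


def delta_ct(target_cts, control_cts):
    """Delta Ct = mean Ct of the target minus geometric mean Ct of the control.

    Assumption: both inputs hold technical-replicate Ct values for one sample.
    """
    return mean(target_cts) - geometric_mean(control_cts)


# Made-up triplicate Ct values for a target transcript and the control (e.g. RPS28).
example_delta = delta_ct([24.0, 24.2, 23.8], [18.0, 18.0, 18.0])
print(round(example_delta, 3))
```

A lower delta Ct indicates higher abundance of the target relative to the control gene.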
Microscale Thermophoresis (MST)

10 µM of each recombinant DF paralog expressed in E. coli was labeled with RED-NHS dye using the Monolith Protein Labeling Kit
RED-NHS 2nd Generation (Amine Reactive) following the protocol instructions. Prior to each MST run, protein aggregation was
minimized by centrifuging the protein solutions at 20,820 × g for 10 min. A constant concentration (50 nM) of each DF paralog
was mixed with increasing concentration of the m6A-modified RNA (rUrCrCrGrG/iN6Me-rA/rCrUrGrU, IDT) in MST buffer (50 mM
HEPES, 100 mM NaCl, 0.05% Tween-20, pH 7.4) and incubated for 30 min at room temperature. After incubation, samples were
loaded onto Premium Coated capillaries (NanoTemper Technologies) for MicroScale Thermophoresis (MST) acquisition. The MST
measurements were performed at room temperature using a fixed IR-laser power of 40% for 20 s per capillary, using an LED power
of 20% in a red laser equipped Monolith NT.115 instrument (NanoTemper Technologies) (HTSRC, Rockefeller University). The Mono-
Temper analysis software (NanoTemper Technologies) was used to determine KD values.
Protein-protein interactome analysis

Data related to Figure 2B were previously reported in a recent Bio-ID interactome study in which the DF proteins were analyzed
among 139 other proteins (Youn et al., 2018). This analysis was performed using two C-terminal labeling experiments for DF1, and
two N-terminal and two C-terminal labeling experiments for DF2 and DF3. ‘‘Table S1’’ of the Bio-ID interactome study was used
to select the number of spectral counts, the number of performed replicates, and the level of confidence of each DF1, DF2 and DF3
interactor. As determined in the study (Youn et al., 2018), only proteins with an average probability of interaction (AvgP) of at least
0.95 were considered high-confidence interactors. In the Youn et al. (2018) study, 0.95 was considered an optimal cut-off for
high-confidence interactors as it was well above the Bayesian 1% FDR estimation (corresponding to an AvgP of 0.91) and as decreasing
the threshold resulted in no worthwhile gains in BioGRID-curated interactions for the drop in sensitivity. The list of top 25 interacting
proteins is also provided in Table S1 of the same Bio-ID interactome study. ‘‘Table S7’’ of the same Bio-ID interactome study (Youn et al., 2018) was
used to define the NMF analysis-based association of each DF protein presented in Figure 2C. The 14 subcellular compartments are:
1. RNP granule (cytoplasmic ribonucleoprotein granule), 2. chromosome, 3. ER (endoplasmic reticulum), 4. RNP complex
(ribonucleoprotein complex), 5. nucleolus, 6. plasma membrane, 7. mitochondrion, 8. P-body (CCR4-NOT complex/P-body), 9. stress
granules, 10. cytoskeleton, 11. nucleus, 12. vesicles, 13. MOC (microtubule organizing center), 14. snRNPs (small nuclear
ribonucleoprotein complex/spliceosomal complex).
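The high-confidence cut-off described above amounts to a simple threshold filter, sketched here. The record layout, field names, prey labels, and AvgP values are all made up for illustration, and treating the cut-off as inclusive (AvgP >= 0.95) is an assumption.

```python
AVGP_CUTOFF = 0.95  # cut-off quoted in the text; inclusive comparison is assumed


def high_confidence(interactions, cutoff=AVGP_CUTOFF):
    """Keep bait-prey records whose average probability meets the cut-off."""
    return [row for row in interactions if row["avg_p"] >= cutoff]


# Hypothetical records; none of these values come from the Bio-ID study.
example_rows = [
    {"bait": "DF1", "prey": "prey_A", "avg_p": 0.99},
    {"bait": "DF1", "prey": "prey_B", "avg_p": 0.92},
    {"bait": "DF2", "prey": "prey_A", "avg_p": 0.95},
]
kept = high_confidence(example_rows)
print([(row["bait"], row["prey"]) for row in kept])
```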
Analysis of the physicochemical properties of each DF paralog.

The disordered region score was calculated using D2P2 (Oates et al., 2013). The prion-like likelihood ratio was calculated
using PLAAC (Lancaster et al., 2014). The net charge per residue, and the probability of charged residues, were calculated in a sliding
window of 10 amino acids using CIDER (Holehouse et al., 2017). Hydrophobicity analysis was performed using ProtScale (ExPASy) in
a sliding window of 10 amino acids.
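A sliding-window profile of the kind described above can be sketched as below. This is not ProtScale or CIDER: the unweighted window mean and the choice of the Kyte-Doolittle hydropathy scale are assumptions, since the text does not state which scale or window weighting was used, and the peptide is an arbitrary example.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_mean(sequence, scale, window=10):
    """Unweighted mean of per-residue values over each window of the sequence."""
    values = [scale[residue] for residue in sequence]
    if len(values) < window:
        return []
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


# Arbitrary 12-residue peptide: yields 12 - 10 + 1 = 3 window scores.
profile = sliding_mean("MKKLLPSTAAGV", KYTE_DOOLITTLE)
print(len(profile))
```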
Reanalysis of publicly available PAR-CLIP data of DF1, DF2 and DF3 in HeLa cells
Data from the DF3 PAR-CLIP (GEO: GSE86214) were downloaded and used for analysis in this study. Read processing was
performed using Flexbar (Dodt et al., 2012). First, the adaptor sequence reported in the GEO submission was removed using the
following set-up: flexbar -r SRR509926X.fastq -f i1.8 -t SRR509926X.noadapter --max-uncalled 1 -a adapters_par-clip.fasta
--pre-trim-phred 20 -n 20 --min-read-length 1. After the adaptor removal, a read quality check was performed using Fastqc
(https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). According to the Fastqc results, 40% of the reads of each DF3
PAR-CLIP library replicate were represented by the following sequence:
CTCAACACCCACTACCTAAAAAATCCCAAACATATAACTGAACTCCTCAC.
By using the BLAT analysis tool, we found that the sequence had 100% sequence identity to MT-RNR2, the
mitochondrially encoded 16S ribosomal RNA. This sequence is likely to represent a contaminant of the library preparation procedure.
The removal of this sequence and any other PCR duplicate was performed using the CIMS/fastq2collapse.pl script. Reads shorter
than 18 nucleotides were then removed by using Flexbar with the default --min-read-length option (Dodt et al., 2012). The remaining
reads were aligned to the human genome (H. sapiens, NCBI GRCh38) using Bowtie version 1 and the reported parameters in the GEO
submission data processing description (-v 2 -m 5).
Data from the DF1 and DF2 PAR-CLIP datasets were analyzed similarly. For DF1, the original PAR-CLIP sequencing data were
downloaded from GEO: GSE63591. For DF2, the original PAR-CLIP sequencing data were downloaded from GEO: GSE49339. In
this case, when we performed the quality check using Fastqc, we did not find any over-represented sequences representing
more than 5% of the library. Thus, after removing low-quality nucleotides, the adaptor sequence, short reads, and duplicates, the
remaining reads were aligned to the human genome (H. sapiens, NCBI GRCh38) using Bowtie version 1 and the reported parameters
in the GEO submission data processing description (-v 2 -m 5).
To visualize the aligned reads and compare them across the different PAR-CLIP datasets, we calculated the number of read
counts per million mapped reads at every coordinate on the human genome. In this case, every single read
does not represent a unique event of RNA-protein interaction since libraries were prepared using a Truseq small RNA sample
preparation kit without the use of unique molecular identifiers. Therefore, PCR duplicates cannot be distinguished from distinct binding
events.
Comparison of the coverage of DF proteins at each m6A site on mRNAs throughout the transcriptome.
To compare iCLIP coverage at each m6A site, we calculated the number of normalized DF iCLIP read counts at each m6A site using a
previously described approach (Patil et al., 2016). We first calculated the number of unique read counts per million mapped reads
obtained from the CITS analysis of each iCLIP dataset (Patil et al., 2016). In this approach, each iCLIP read represents a unique
RNA-bound protein. Only m6A sites mapping to mRNAs were used in this analysis. A 5-nucleotide window from the m6A site genomic
coordinate was selected and the number of iCLIP reads per million uniquely mapped reads was calculated at every m6A coordinate
using BEDTools (Quinlan and Hall, 2010). The following formula was used: (r × 10^6)/R, where r = number of unique iCLIP reads at the
m6A window and R = total number of uniquely mapped unique iCLIP reads in the whole iCLIP library. We refer to the resulting number
as ‘‘coverage at each m6A site’’ throughout the manuscript. For comparing the reanalyzed DF1 and DF2 PAR-CLIP coverage at each
m6A site, we used a similar approach. For each DF1, DF2 and DF3 iCLIP and miCLIP, the enriched motif presented in Figure 1 was
obtained using MEME suite (Bailey et al., 2009) as previously described (Patil et al., 2016).
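The normalization formula above, (r × 10^6)/R, can be illustrated with a short sketch. This is not the BEDTools-based pipeline used in the study: the function name and inputs are hypothetical, and interpreting the "5-nucleotide window" as positions within 5 nt on either side of the site is an assumption.

```python
def coverage_at_site(read_positions, site, total_unique_reads, half_window=5):
    """Reads per million at one m6A site: (r * 10**6) / R.

    r = reads whose position lies within `half_window` nt of the site
        (window interpretation is an assumption);
    R = total uniquely mapped unique reads in the library.
    """
    r = sum(1 for pos in read_positions if abs(pos - site) <= half_window)
    return r * 1_000_000 / total_unique_reads


# Made-up example: three of four reads fall in the window around position 100.
example_cov = coverage_at_site([100, 102, 110, 95], site=100,
                               total_unique_reads=2_000_000)
print(example_cov)
```

Dividing by the library size R makes coverage values comparable between libraries of different sequencing depth.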
For representation of iCLIP, PAR-CLIP and miCLIP tracks in Figure 1 and Figure S1, the normalized number of read counts from
iCLIP, PAR-CLIP and miCLIP datasets at the indicated genomic coordinate is presented. The reported annotated m6A sites in the
respective GEO submission were downloaded (GEO: GSE86336, GSE63753). In order to calculate the number of m6A sites per tran-
script, m6A sites were assigned to the gene transcript using MetaplotR (Olarerin-George and Jaffrey, 2017).
Calculation of coverage at each DF1 and DF2 unique site

To select the previously described DF1-uniquely bound sites and DF2-uniquely bound sites, we first obtained the list of genes
considered unique DF1 and DF2 targets from Shi et al., 2017. Next, the subset of called DF1 or DF2 sites on these targets was selected from
the entire list of called sites reported in the supplementary file ‘‘GSE63591_A-Y1-PARCLIP.xlsx’’ (GEO: GSE63591), and in the
supplementary file ‘‘GSE49339_A-PARCLIP.xlsx.gz’’ (GEO: GSE49339). The DF1 and DF2 normalized coverage at each site was
calculated using deepTools bamCoverage (parameters: --binSize 50 --effectiveGenomeSize --normalizeUsing BPM). To compare the
two calculated coverages, deepTools bamCompare was used and the result was visualized using deepTools plotHeatmap (Ramírez et al., 2016).
Analysis of the ribosome profiling and RNA seq data upon silencing of DF proteins
After sequencing of the ribosome profiling libraries generated in this study, reads were quality-trimmed and reads shorter than
16 nucleotides were excluded. The adaptor was removed using Flexbar (Dodt et al., 2012). Demultiplexing was performed based
on the experimental barcode using the pyBarcodeFilter.py script. The second part of the random barcode was then moved to the
read header. After removal of the PCR duplicates based on the UMI sequence, ribosomal RNA reads were removed using the STAR
aligner (Dobin et al., 2013). As previously reported, a high percentage of reads in the library is represented by ribosomal RNA. To avoid
possible contamination by reads representing the Ago complex recruited to the siRNA target sequence, reads were also mapped
to the siRNA sequences and matching reads were removed. The siRNA sequences were previously described and reported in the GEO
submission related to this previous study. Reads with no acceptable alignment to ribosomal and siRNA sequences were then mapped to the
hg38 transcriptome. The STAR genome index was built using the annotation obtained from GENCODE (version 26). The following
parameters were used to align to the genome:
STAR --runThreadN 20 --genomeDir ./star_hg38 --readFilesIn reads_rRNA_siRNA_Unmapped.fastq --outSAMtype BAM SortedByCoordinate --sjdbGTFfile gencode.v26.protein_coding.annotation.gtf --outFilterMultimapNmax 8 --limitBAMsortRAM 10000000000 --alignEndsType EndToEnd --outWigType bedGraph --outWigStrand Stranded --outWigNorm None --quantMode TranscriptomeSAM GeneCounts --outFilterMismatchNmax 8 --outSAMattributes All --outFilterIntronMotifs RemoveNoncanonicalUnannotated
These parameters were previously used (Calviello et al., 2016). The mapped
reads that represent ribosome-protected fragments (reads longer than 21 nucleotides, but shorter than 32 nucleotides) with a
40% minimum level of periodicity were considered for the subsequent analysis. We specifically considered only ribosome-protected
fragment reads mapped to the coding sequence to avoid any possible contamination from untranslated
regions. To measure the degree to which transcripts are translated independently of the initiation and termination rates, we
excluded reads mapping to the first 15 nucleotides from the start codon and the 9 nucleotides before the stop codon of each coding
sequence, as recommended in previous studies (Lauria et al., 2018). To precisely determine the localization of each ribosome with
respect to the start and stop codon, the P-site offset was estimated as previously described using riboWaltz (Lauria et al., 2018).
Reads matching all these criteria were considered to determine the number of ribosome-protected fragments per gene. Given the
level of variability and the possibility of contamination during ribosome profiling library preparation, all these criteria have been
reported as essential analysis steps to better estimate the number of ribosome-protected fragments mapped to each transcript
(Calviello et al., 2016; McGlincy and Ingolia, 2017). Only the longest splice isoform of each gene was considered. Genes with fewer than 10
ribosome-protected fragments were excluded from the analysis. For each DF paralog silencing experiment, TMM normalization,
empirical Bayes estimation of the negative binomial dispersion, and measurement of the level of change in translation (log2 Fold
Change) compared to the control condition were performed using edgeR (McCarthy et al., 2012). To account for sample-to-sample
variability, all replicates were analyzed at the same time to define a unique log2 Fold Change measurement.
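The read-level filters described in this section can be sketched as follows. This is a minimal illustration in Python, not the pipeline used in the study: the function names, the tuple layout of the reads, and the 0-based transcript coordinates are assumptions made for the example. The sketch keeps reads longer than 21 and shorter than 32 nucleotides whose P-site lies in the coding sequence after excluding the first 15 nucleotides from the start codon and the 9 nucleotides before the stop codon, and then drops genes with fewer than 10 retained fragments. The periodicity filter and the P-site offset estimation (riboWaltz) are not reproduced here; the P-site position is taken as given.

```python
from collections import Counter

MIN_LEN_EXCLUSIVE = 21   # reads must be longer than 21 nt
MAX_LEN_EXCLUSIVE = 32   # and shorter than 32 nt
SKIP_AFTER_START = 15    # nt excluded, counted from the first base of the start codon
SKIP_BEFORE_STOP = 9     # nt excluded immediately upstream of the stop codon
MIN_RPF_PER_GENE = 10    # genes with fewer fragments are dropped


def is_rpf_length(read_length):
    """True if the read length is compatible with a ribosome-protected fragment."""
    return MIN_LEN_EXCLUSIVE < read_length < MAX_LEN_EXCLUSIVE


def in_counted_cds_window(p_site, cds_start, stop_codon_start):
    """True if the P-site falls in the counted part of the coding sequence.

    Coordinates are 0-based transcript positions (an assumption of this sketch);
    stop_codon_start is the first base of the stop codon.
    """
    first_counted = cds_start + SKIP_AFTER_START
    end_exclusive = stop_codon_start - SKIP_BEFORE_STOP
    return first_counted <= p_site < end_exclusive


def count_rpfs(reads, cds_bounds):
    """Count retained fragments per gene.

    reads: iterable of (gene, read_length, p_site) tuples.
    cds_bounds: dict mapping gene -> (cds_start, stop_codon_start).
    """
    counts = Counter()
    for gene, length, p_site in reads:
        cds_start, stop_start = cds_bounds[gene]
        if is_rpf_length(length) and in_counted_cds_window(p_site, cds_start, stop_start):
            counts[gene] += 1
    # Apply the minimum-count filter at the gene level.
    return {gene: n for gene, n in counts.items() if n >= MIN_RPF_PER_GENE}
```

In this layout the per-gene counts returned by `count_rpfs` would be the input to normalization and differential analysis, which are outside the scope of the sketch.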
After sequencing of the RNA-seq libraries, low-quality reads and reads shorter than 18 nt were discarded.
Ribosomal reads were removed using STAR aligner. The remaining reads were mapped to the hg38 genome using STAR, and the
data were used to normalize the Ribo-seq data and derive the change in RNA-seq expression. For each DF paralog silencing, TMM
normalization, empirical Bayes estimation of the negative binomial dispersion, and measurement of the level of change in expression
(log2 Fold Change) compared to the control condition were performed using edgeR. To account for sample-to-sample variability, all
replicates were analyzed at the same time to define a unique log2 Fold Change measurement.
For the translational efficiency measurement, the Ribo-seq change was compared to the RNA-seq change.
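One common way to express this comparison is as the difference between the two log2 fold changes. The sketch below assumes that convention; the exact formula is not stated in this section, and the function names and input layout are illustrative.

```python
def te_log2_fold_change(ribo_log2fc, rna_log2fc):
    """Change in translational efficiency, assumed here to be
    log2FC(Ribo-seq) - log2FC(RNA-seq)."""
    return ribo_log2fc - rna_log2fc


def te_changes(ribo, rna):
    """Compute the TE change for every gene quantified in both assays.

    ribo, rna: dicts mapping gene -> log2 fold change versus control.
    Genes missing from either input are skipped.
    """
    return {
        gene: te_log2_fold_change(ribo[gene], rna[gene])
        for gene in ribo
        if gene in rna
    }
```

Under this convention, a gene whose ribosome footprint and mRNA levels change by the same factor has a translational efficiency change of zero.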
Reanalysis of publicly available ribosome profiling data upon silencing of DF1 in HeLa cells
To analyze the publicly available ribosome profiling data upon DF1 silencing, we used two independent approaches. First, we
downloaded the table ‘‘GSE63591_C-Y1-Ribosome_profiling.xlsx’’ available at GEO: GSE63591 and assigned to each reported
transcript a number of m6A sites. To determine whether m6A mRNAs were affected by DF1 silencing, we used the translational efficiency
values reported in the table without performing any further processing step. Second, the original sequencing data from GEO:
GSE63591 were downloaded and processed as follows. A quality check was performed using Flexbar. After the quality check, sequences
longer than 21 nt were first mapped to the siRNA sequences reported in the published study (Wang et al., 2015), to the ribosomal
RNA, and then to the hg38 transcriptome. All mapping steps were performed using STAR as mentioned above. Reads mapping
to the coding sequence that represent ribosome-protected fragments and have a high level of periodicity were considered for the
analysis steps as described above. We excluded reads mapping to the first 5 codons from the start codon and the 3 codons before the
stop codon of each coding sequence as described above.
We noticed a major difference between the original publicly available table and the new tables generated by our reanalysis. In the
case of Replicate 2, the processed table by Wang et al. (2015) showed a significant reduction in the number of ribosome-protected
fragments assigned to the DF1 mRNA in the DF1-silenced sample compared to the
matching control sample. This is expected for the DF1 knockdown condition. When we reanalyzed the original sequencing reads from
GEO: GSE63591, we reproduced this loss of ribosome-protected fragments assigned to DF1 upon DF1 silencing. In contrast, the
number of ribosome-protected fragments assigned to the DF1 mRNA in Replicate 1, as shown in the processed table
‘‘GSE63591_C-Y1-Ribosome_profiling.xlsx,’’ shows a substantially higher number of ribosome-protected fragments for the DF1-
silenced sample compared to the matching control sample (15-fold increase). This would indicate a condition of overexpression
instead of knockdown. We therefore examined the raw sequencing reads for Replicate 1 to generate a new table of ribosome-pro-
tected fragments. Upon reanalysis of the Replicate 1 ribosome-profiling data (Table S2), the number of ribosome-protected frag-
ments mapped to DF1 mRNA was significantly lower than the number indicated in the published processed table ‘‘GSE63591_C-
Y1-Ribosome_profiling.xlsx.’’ In the reanalysis, we observed 98% fewer ribosome-protected fragments mapped to DF1 than
in the matching control sample. This is consistent with the raw data
representing a DF1 knockdown condition rather than an overexpression condition.
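The consistency check described in this paragraph amounts to comparing the fragment counts for the silenced gene between the knockdown sample and its matching control. A minimal sketch follows; the function names are hypothetical, and any counts passed in are illustrative rather than values from either table.

```python
def percent_change(knockdown_count, control_count):
    """Percent change in fragment count relative to the matching control."""
    if control_count <= 0:
        raise ValueError("control count must be positive")
    return 100.0 * (knockdown_count - control_count) / control_count


def looks_like_knockdown(knockdown_count, control_count):
    """A silenced gene is expected to lose fragments relative to control."""
    return percent_change(knockdown_count, control_count) < 0
```

On this scale, 98% fewer fragments is a change of -98%, whereas a 15-fold increase is a change of +1400% and would fail the check.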
We used a standard protocol for determining ribosome-protected reads, which inherently discards any reads that derive from the
siRNA itself because these fragments would be too short to be a ribosome-protected fragment, as described above (Calviello et al.,
2016; McGlincy and Ingolia, 2017). It is possible that the large number of ribosome-protected fragments that map to DF1 in the pub-
lished processed table for Replicate 1 derive from DF1-specific siRNA molecules that were cloned into the library but not removed, if
there was no appropriate size filter used in the alignment protocol. The length, sequence and alignment position of each
ribosome-protected fragment are not provided in the processed table. Alternatively, the ribosome-protected reads assigned to DF1 could derive
from a sample which contains overexpressed DF1 mRNA. Because of this uncertainty, we generated new DF1 knockdown ribosome
profiling datasets in the same cell line. The pipeline used for the reanalysis of these datasets is reported (see ‘‘Data and Code
Availability’’).
Analysis of m6A mRNA stability upon DF paralog depletion
The stability of m6A-modified mRNAs and non-methylated mRNAs was determined by quantifying mRNA levels immediately before
and 2 h after actinomycin D treatment. After sequencing, libraries were checked for their quality as previously described for all RNA-
seq libraries in this study. Reads were then mapped to the transcriptome using STAR aligner. A read count table was generated using
the STAR aligner as well. ERCC Spike-in reads were also mapped using STAR aligner. Thus, the resulting read counts table contains
reads mapping to the ERCC RNA Spike-in as well. To remove unwanted variation, the RUVg estimation package was used (Risso
et al., 2014). Additionally, edgeR was used to calculate ERCC spike-in RNA size factors for each sample, which were then applied
to normalize for library size changes in each replicate. We thus obtained a final number of normalized reads mapped to each mRNA.
Since m6A mRNAs are known to have a short half-life (2 to 4 h) (Geula et al., 2015; Ke et al., 2017; Schwartz et al., 2014), m6A mRNA
levels are often substantially reduced even 2 h after actinomycin D treatment. We focused on mRNAs in which the reduction
in expression level could be accurately measured, i.e., those mRNAs that showed a reduction in expression level by at least 40% after
2 h of actinomycin D treatment. The percentage of mRNA abundance reduction was calculated in both the control samples (control
siRNA) and in the samples subjected to DF paralog knockdown. In some cases, the stabilization of the mRNA was very substantial
upon depletion of all three DF paralogs, and the mRNA expression reduction was less than 10%. Since these mRNAs no longer show
a drop in mRNA expression, it is difficult to precisely establish the fold change in mRNA stability. Therefore, we simply set the
reduction to 10% for these mRNAs in the calculations used in Figure 3F.
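The thresholds described above can be illustrated with a short sketch. Two points are assumptions of this example rather than statements from the text: the 40% cutoff is applied to the control samples, and the change in stability is expressed as the ratio of the control reduction to the knockdown reduction. The function names are hypothetical.

```python
MIN_CONTROL_REDUCTION = 40.0  # percent; mRNAs below this cutoff are not analyzed
REDUCTION_FLOOR = 10.0        # percent; smaller reductions are set to this value


def percent_reduction(level_0h, level_2h):
    """Percent drop in normalized mRNA level after 2 h of actinomycin D."""
    return 100.0 * (1.0 - level_2h / level_0h)


def stabilization(control, knockdown):
    """Ratio of control to knockdown reduction (assumed definition).

    control, knockdown: (level_0h, level_2h) pairs of normalized read counts.
    Returns None when the control reduction is below the 40% cutoff.
    """
    control_reduction = percent_reduction(*control)
    if control_reduction < MIN_CONTROL_REDUCTION:
        return None
    # Reductions below 10% (including apparent increases) are floored at 10%.
    knockdown_reduction = max(percent_reduction(*knockdown), REDUCTION_FLOOR)
    return control_reduction / knockdown_reduction
```

The floor keeps the ratio finite for mRNAs that no longer decay measurably after knockdown, at the cost of underestimating their stabilization.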

Figure S1. DF Paralogs Bind the Same m6A Sites throughout the Transcriptome, Related to Figure 1
(A) The YTH domains of DF1, DF2 and DF3 exhibit high sequence homology. Shown is a detailed representation of the aligned amino acid sequences for the YTH
domains of DF1, DF2 and DF3. Amino acids previously described to be essential for m6A binding based on the DF1-m6A RNA and DF2-m6A RNA crystal
structures (Li et al., 2014; Xu et al., 2015) (PDB: 4RCJ, 4RDN) are highlighted. Pink indicates the three tryptophan residues that form the aromatic cage
surrounding m6A. Green and Blue indicate amino acids that make additional points of contact between the YTH domain and the nucleotides adjacent to m6A. Of
note, all three YTH domains share each of these amino acids. Shown below in a yellow color scheme is the level of conservation of every amino acid among
the three YTHDF proteins, ranging from 1 (low conservation) to 10 (high conservation), as predicted by the Clustal Omega alignment algorithm (Madeira et al.,
2019). Most residues (88%) are fully conserved across all DF paralogs, as indicated by a conservation score of 10. Importantly, amino acids essential
for recognizing m6A and adjacent residues in RNA are fully conserved. As indicated in Figure 1A, amino acids with a conservation score lower than 10 are located
on the opposite side of the YTH domain away from the RNA-binding pocket, suggesting that they might not affect the YTH-m6A RNA interaction.
(B) DF1, DF2 and DF3 have similar binding affinity to m6A-modified mRNA. Previous studies suggest that DF1, DF2 and DF3 bind with similar affinity to m6A-
modified mRNA (Patil et al., 2018; Wang et al., 2014a; Xu et al., 2015). To compare the affinities side-by-side, full length DF1, DF2 and DF3 were prepared as
recombinant proteins in E. coli and the affinity to an m6A-modified RNA was measured by microscale thermophoresis (MST). Shown is the percentage of protein
bound (fraction bound) at increasing concentration of the m6A-RNA. These data show that the three DFs have similar in vitro binding affinity to m6A. Data are
represented as mean ± s.d. (n = 2 replicates).
(C) The previously reported (Shi et al., 2017) level of overlap of the DF1, DF2, and DF3 targets is summarized. In these previous studies, the targets of each DF
paralog were first identified by PAR-CLIP (Shi et al., 2017; Wang et al., 2014a, 2015). Then, to determine which of these mRNAs represent valid DF-mRNA in-
teractions, RIP (immunoprecipitation of the DF paralog, followed by reverse transcription and sequencing of the bound RNA) was performed. An overlap
approach was used to identify targets of each DF paralog, and subsequently to define DF1, DF2 or DF3 unique targets or common targets. This analysis resulted
in the conclusion that only a minority of mRNAs (23.98%) are bound by all DF paralogs, as shown by the pie chart. Most of the mRNAs were annotated to be either
unique targets of one DF or shared by two DFs. The finding that different DF proteins bind distinct subsets of m6A-containing mRNAs in the transcriptome has
been the foundation of the current model of how DFs differentially regulate different m6A mRNA cohorts (Han et al., 2019; Liu et al., 2018; Park et al., 2019; Shi
et al., 2017, 2018, 2019).
(D) Read quality analysis of the previously performed DF1, DF2 and DF3 PAR-CLIP datasets. For each PAR-CLIP dataset (Shi et al., 2017; Wang et al., 2014a,
2015), reads are classified based on the following quality parameters: read length (yellow, whether a read is shorter than 18 nucleotides, and thus cannot be
aligned to the genome), presence of PCR duplicates (blue, duplicates category), and mappability to the genome using Bowtie and the previously reported
parameters (categorized into ‘‘mappable’’ (green) or ‘‘fail to map’’ (light blue) reads). The DF3 PAR-CLIP dataset was unusual for several reasons. First, 44.11% of
all reads mapped to a single site in a single gene, MT-RNR2 (STAR Methods). These reads were identical, consistent with a PCR duplication event and were not
found in the DF1 or DF2 PAR-CLIP datasets or any miCLIP datasets. Second, of the remaining reads, only a small percentage of these were mappable (indicated
in green in the 10 × 10 dot plot). Overall, the DF3 PAR-CLIP dataset lacks sufficient read depth to detect endogenous DF3-binding sites in the transcriptome as
efficiently as the other PAR-CLIP datasets.
(E) Schematic representation of the method used to identify m6A sites bound by one DF or shared by two or more DFs. Scatterplots are used to represent the
pairwise comparison of the number of normalized DF reads calculated from the iCLIP or PAR-CLIP datasets at each m6A site. Yellow areas indicate where m6A
sites that uniquely bind one DF should be located. For instance, DF1-unique sites with a high level of coverage for DF1 (high number on x axis) and a low level of
coverage for DF2 (low number on the y axis) will be shown in the lower right area of the scatterplot. m6A sites bound by two or more DFs with a similar level of
binding for each DF will be found in the indicated light red area.
(F) Essentially all m6A sites show highly correlated binding of DF1 and DF2 based on previous PAR-CLIP datasets (Shi et al., 2017; Wang et al., 2014a, 2015). As
described in (E), to identify m6A sites that preferentially bind DF1 or DF2, their level of binding was quantified at each m6A site in the transcriptome (Ke et al., 2017).
In this experiment, we used the previously reported PAR-CLIP datasets prepared in HeLa cells (Shi et al., 2017; Wang et al., 2014a, 2015). According to the
previous analysis (Shi et al., 2017; Wang et al., 2014a, 2015), nearly 50% of m6A sites should be uniquely bound by one DF. However, few if any m6A sites show
evidence of disproportionate binding of one DF paralog compared to the other. Additionally, the high Pearson correlation coefficients (r) show that DF1 and DF2
have highly similar binding preferences for each m6A site in the transcriptome of HeLa cells. Similar results are shown in Figure 1C using iCLIP data for DF1, DF2
and DF3 in HEK293T cells (Linder et al., 2015; Patil et al., 2016). Thus, the binding of all m6A sites to each DF paralog is not cell-type specific, and not specific to
the type of CLIP assay used (PAR-CLIP versus iCLIP).
(G, H) DF1-unique sites show extensive levels of bound DF2, and vice versa. Previous studies show that nearly half of all m6A sites are uniquely bound by only one
DF paralog (Shi et al., 2017; Wang et al., 2014a, 2015). To confirm whether DF1-unique mRNAs are indeed uniquely bound by DF1, we asked if there are markedly
fewer DF2 PAR-CLIP reads than DF1 PAR-CLIP reads located at DF1-unique sites. For this analysis, we selected the previously described DF1-uniquely bound
sites and DF2-uniquely bound sites (Shi et al., 2017; Wang et al., 2014a, 2015). Shown is the DF1 and DF2 PAR-CLIP coverage at DF1-unique sites (G) and at DF2-
unique sites (H). For each site, PAR-CLIP DF1 or DF2 coverage is shown, with each row representing the coverage for a different unique site and its surrounding
area. The rows are ordered based on the degree of DF1 PAR-CLIP coverage (G) or DF2 PAR-CLIP coverage (H). In principle, DF1-unique sites should show a higher
level of DF1 PAR-CLIP coverage than DF2. However, the PAR-CLIP coverage for both DF1 and DF2 appears largely identical at both DF1-unique sites and DF2-
unique sites. We were unable to define the DF3 coverage level at each site given the low number of mappable reads of the DF3 PAR-CLIP dataset, as discussed in
Figure S1D.
(J) DF1 and DF2 PAR-CLIP read tracks are highly similar even on transcripts thought to be DF1-unique or DF2-unique (Shi et al., 2017; Wang et al., 2014a, 2015).
DF1-unique and DF2-unique transcripts should show unique peaks or patterns of DF1 and DF2 binding. However, we found that transcripts generally appear to
show identical PAR-CLIP read coverage along the entire length of the transcript body. When inspecting different mRNAs, differences in PAR-CLIP coverage were
only seen if read coverage was low and therefore susceptible to read noise. Shown is the normalized read distribution of DF1 PAR-CLIP (in blue) and DF2 PAR-
CLIP (in green) on OGT, an mRNA previously thought to be a DF1-unique target, and on DUSP1, an mRNA previously thought to be a DF2-unique target. The
‘‘called’’ DF peaks were previously reported by Wang et al. (2015) and Wang et al. (2014a) based on the statistical peak-finding algorithm PARalyzer v1.1 and are
presented below their respective tracks. Each row represents called peaks in a different PAR-CLIP replicate. Notably, the location of peaks and the relative
heights of each peak are similar for DF1 and DF2 on both transcripts. All the peaks overlap substantially with called m6A sites in HeLa cells (Ke et al., 2017). In the
original DF1, DF2, and DF3 mapping studies, the PAR-CLIP called sites were overlapped with a list of target mRNAs immunoprecipitated using the RIP protocol
(Shi et al., 2017; Wang et al., 2014a, 2015). Compared to the total number of m6A-containing mRNAs or the total number of mRNAs containing DF1 or DF2 PAR-
CLIP peaks, only a few mRNAs were successfully immunoprecipitated using this method (1747 for DF1, 1592 for DF2, 2080 for DF3). Therefore, the use of this
approach makes the final list of DF-specific mRNA targets highly dependent on this low efficiency mRNA immunoprecipitation method.
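The pairwise comparison underlying panels (E) and (F) reduces to a correlation between two vectors of normalized read counts, one value per m6A site for each DF paralog. The sketch below computes a Pearson correlation coefficient with the standard library only; it is an illustration under that assumption, not the code used to produce the figure, and the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences,
    e.g. normalized DF1 and DF2 read counts at the same m6A sites."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)
```

Sites bound preferentially by one paralog would fall far from the diagonal and lower the coefficient, so a value close to 1 indicates proportional binding across sites.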

Figure S2. DF Proteins Have Similar Effector Domains and Protein-Protein Interactions, Related to Figure 2
(A) DF1, DF2 and DF3 exhibit generally high sequence identity and similarity in their effector domain. In Figure S1, we examined the high sequence identity of all
DFs in their YTH domain. Shown here is a schematic representation of the aligned amino acid sequence of the remaining part of the DF1, DF2 and DF3 protein.
There was overall similarity along the entire length of the effector domain, with small regions where there were amino acid differences. These small differences
may account for the previously described differences of DF function, which are examined in this study. Residues are classified according to the ClustalW Multiple
Sequence Alignment score (Madeira et al., 2019) based on whether the residue is exactly identical among all DFs, or ‘‘strongly and weakly similar’’ residues (the
amino acids share a higher or lower level of similarity in their physicochemical properties among all DFs). Shown in the table is the overall percentage of sequence
identity and similarity calculated by the EMBOSS Needle pairwise sequence aligner (Madeira et al., 2019). The pairwise comparisons of the DF amino acid
sequences confirm the high sequence similarity and identity of all DFs.
(B) High-confidence interactors of each DF paralog generally show high-confidence interactions with the other DF paralogs. As described in Figure 2B, high-
confidence interactors of each DF paralog in vivo were identified previously using a Bio-ID proteomic approach (Youn et al., 2018). The top 25 interacting proteins
for each DF were previously determined based on their length-normalized spectral counts and the average probability of interaction calculated across different
replicates (Youn et al., 2018). The heatmap shows the AvgP (average probability of interaction) for each of the top 25 interactors of DF1, DF2 and DF3. The majority
of the top 25 interactors have a high average probability of interaction with all three paralogs. These proteins include the main components of the CCR4-NOT
RNA-degradation complex and proteins previously identified as stress granule components. A recent study showed that all three DF proteins bind CNOT1 and all
three can elicit mRNA deadenylation when tethered in cells to a reporter mRNA (Du et al., 2016), supporting the overall finding that the major function of the DF
proteins is to mediate degradation of m6A-mRNAs. Interestingly, interaction with components or repressors of the translation process may suggest a possible
involvement of DF proteins in the translational repression of m6A mRNAs. This repression may in turn be connected to the observed DF-dependent regulation
of m6A-mRNA stability. Shown at the bottom is the AvgP of interaction of each DF with the other DF paralogs.
(C) The Bio-ID DF1 interactome data (Youn et al., 2018) did not identify eIF3A and eIF3B as high-confidence DF1 interactors. However, eIF3A and
eIF3B were previously identified in FLAG-tagged DF1 immunoprecipitates by mass spectrometry (Wang et al., 2015). Shown are the reported score and number of unique
peptides mapped to each putative DF1 interactor from Wang et al. (2015). However, even in this study, eIF3A and eIF3B are not among the top DF1 interactors
when ranking all the interactors by the reported score. Moreover, as shown in red, the number of unique peptides is less than 5. This low level of interaction is
consistent with the Bio-ID interactome studies presented in Figure 2B. Thus, eIF3A and eIF3B are not seen as DF1 interactors in the Bio-ID study (Youn et al.,
2018) and are weak interactors in the Wang et al. (2015) study.
(D) Heterologous expression of DF proteins causes formation of DF stress granule-like structures. Plasmids expressing FLAG-3xHA tagged DF proteins, or a
control FLAG-3xHA construct were transfected into HeLa cells, and staining was performed 24 h after transfection. Areas where the HA staining (green) overlaps
with the DF staining (red) indicate the FLAG-3xHA tagged DF localization (yellow). As indicated by the arrows, the FLAG-3xHA tagged protein formed granule-like
structures of different sizes in some, but not all, cells. This phenomenon occurs with proteins with low-complexity domains (Alberti et al., 2019). As shown by
the absence of these structures in the control-transfected cells (FLAG-3xHA), the granule-like structures are not an artifact of transfection. Thus, expressing DF
proteins may lead to uninterpretable results due to the variable formation of protein aggregates that could act to sequester DF proteins and mask their function.
Scale bar, 20 μm.

Figure S3. Analysis and Validation of RNA-Seq Data Obtained upon Single, Double, and Triple DF Protein Silencing in HeLa Cells, Related to
Figure 3
(A) Validation of DF1, DF2 and DF3 knockdown upon silencing of each of the DF paralogs alone or together. Western blotting was used to confirm knockdown
using the indicated DF paralog-specific antibodies on each sample used to perform RNA-seq and ribosome profiling analysis. HeLa cells were transfected with
the same total concentration (30 nM) of siRNA in each condition. Western blotting was performed 4 days after transfection. In each silencing condition, the siRNA
sequences were specific to the DF paralog of interest, as shown by the selective knockdown observed with each siRNA. GAPDH was used as a loading control.
(B) Relative quantification of the DF1, DF2 and DF3 protein band intensities shown in (A). The western blot chemiluminescent signal of every band was measured using
ImageLab Image tools and normalized to the GAPDH loading control. Knockdown of any of the DF paralogs was associated with a compensatory increase in the
expression of the other paralogs. The number of dots indicates the number of western blot replicates for each condition. Error bars indicate standard deviation.
(C) Reproducibility of the RNA-seq library replicates. For each silencing condition tested (DF1, DF2, DF3 and triple silencing), three independent replicates were
performed. The Pearson correlation coefficient of the normalized number of mapped reads across replicates was calculated and presented in each heatmap.
(D) DF1, DF2 and DF3 RNA and protein levels in HeLa cells measured by RNA-seq and Ribo-seq. The normalized counts of reads mapped to each paralog after
RNA-seq (mRNA) and ribosome profiling (RPFs, ribosome-protected fragments) were used as proxies for the mRNA expression and protein expression levels,
respectively. As shown by the heatmap, DF1 has mRNA levels similar to DF2. However, DF2 is the most highly translated DF paralog. This is expected to result in a higher
amount of DF2 protein in HeLa cells. Thus, at least in HeLa cells, DF2 is likely to be the most abundant DF paralog.
(E) Validation of the simultaneous knockdown of DF1 and DF2, DF1 and DF3, DF2 and DF3 four days after siRNA transfection. Cells were transfected with siRNA
on Day 1 and Day 3, and then western blotted to detect levels of DF1, DF2, or DF3 on Day 4. Knockdown of two DF paralogs was associated with a modest
compensatory increase in the expression of the remaining DF paralog. GAPDH was used as loading control.
(F-H) Cumulative distribution plots related to Figure 3E. In Figures 3A–3D, the cumulative distribution plots were shown only for the single knockdown experiments
and the triple knockdown experiment. Here we show the cumulative distribution plots for the double knockdown experiments. The abundance of each mRNA
(based on RNA-seq counts) was compared between the control and the indicated double DF silenced condition. mRNAs were binned based on the number of
annotated m6A sites. m6A mRNAs show higher expression levels compared to non-methylated mRNAs upon silencing any two DF paralogs. In F-H, each m6A
binned group was compared to the non-methylated mRNA group using a two-tailed Mann–Whitney test. Only significant p values are shown in each graph and
the exact p values are reported. n = 3 biological replicates.
(I) Reproducibility of the RNA-seq library replicates. For each silencing condition tested (DF1-DF2, DF1-DF3, DF2-DF3), three independent replicates were
performed. The Pearson correlation coefficient of the normalized number of reads across replicates was calculated and presented in each heatmap.
(J) RT-qPCR validation of the increase in stability of m6A mRNAs upon the silencing of the three paralogs. Highly methylated mRNAs (ZNF503 – 4 annotated m6A
sites, SGK1 – 4 annotated m6A sites, ID3 – 7 annotated m6A sites) were selected for validation. Transcripts lacking annotated m6A sites (PP1E3C and RPS28)
were chosen as controls. The stability of m6A-modified mRNAs and non-methylated mRNAs was determined by quantifying mRNA levels by RT-qPCR
immediately before (0 h) and 1 h and 2 h after actinomycin D treatment. The amount of mRNA detected at each time point is shown as a percentage of the total mRNA
amount quantified at time 0 and normalized to RPS28 abundance. The increase in mRNA stability is most apparent when all three DF paralogs were knocked
down. (n = 2 biological replicates ± s.d.).
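The group comparisons in panels (F-H) use a two-tailed Mann–Whitney test between each m6A bin and the non-methylated group. The sketch below shows the logic of that test with the standard library only. It uses a normal approximation without tie or continuity correction, so its p values are approximate and will differ from those of an exact or corrected implementation; in practice a library routine such as `scipy.stats.mannwhitneyu` would be preferable. The function name and inputs are illustrative.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, no tie or
    continuity correction). Returns (U statistic for x, approximate p value).

    x, y: sequences of values for the two groups, e.g. log2 fold changes of
    mRNAs in one m6A bin and of non-methylated mRNAs.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    # U counts the (x, y) pairs in which x exceeds y; ties count as one half.
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value
```

The pairwise loop is quadratic in group size, which is acceptable for a sketch but not for transcriptome-scale groups, where a rank-based implementation is used instead.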

Figure S4. Reanalysis of Previously Performed Ribosome Profiling Data Shows that DF1 and DF3 Do Not Control the Translational Efficiency
of m6A-mRNAs, Related to Figure 4
(A) Unlike PABPC1, DF1 is not enriched in the polysomal fraction. DF1 has been proposed to enhance translation by facilitating loading of eIF3 onto mRNA 5′ UTRs.
In this model, DF1 binds to an mRNA 3′ UTR and binds eIF3. This can create an mRNA loop in which the 3′ UTR and 5′ UTR are connected via the DF1-eIF3
complex. This is similar to PABPC1, which also binds to mRNA 3′ UTRs and binds a translation initiation factor, which results in mRNA looping. PABPC1
function in enhancing mRNA translation is evidenced by its enrichment in the sucrose fractions occupied by mRNAs containing high levels of actively translating
ribosomes (polysomes). We therefore tested if DF1 is also enriched in the polysome fractions by treating HeLa cells with cycloheximide and fractionating
polysomes on a sucrose gradient. The RNA fractions indicated here are related to the UV trace as shown in Figure 4A. DF1 is not enriched in these fractions based
on paralog-specific western blotting (also see Figure 4A). In contrast, PABPC1 is readily detected in the actively translating ribosomal fractions. The ribosomal
protein RPS6 was used to confirm the efficiency in recovering the translating ribosomes from the polysomal sucrose fractions. Thus, these data do not support a
model in which DF1 enhances the translation of m6A-mRNAs by binding to highly translated mRNAs.
(B) m6A-mRNAs are downregulated upon depletion of DF1, based on ribosome profiling datasets provided in Wang et al. (2015). Here, we used two ‘‘processed’’
datasets provided in the GEO submission for Wang et al. (2015) (GSE63591_C-Y1-Ribosome_profiling.xlsx). These datasets are referred to as processed since
the number of ribosome-protected fragments, RNA-Seq data, and translational efficiency (TE) are listed for each transcript based on the authors’ calculations and
analysis of their next-generation sequencing data. We used this processed data to calculate the change in translational efficiency of mRNAs upon DF1 silencing
based on the number of annotated m6A sites. As can be seen, m6A-modified mRNAs show reduced translation upon DF1 knockdown. This result is essentially
identical to the cumulative distribution plots presented in Wang et al. (2015). Notably, the data presented here is based on the average of two replicates provided
by Wang et al. (2015), designated as Replicate 1 and Replicate 2.
(C) Replicate 1 shows a more prominent reduction in m6A-mRNA translation upon depletion of DF1. The effect seen in Replicate 1 is more pronounced than the
effect seen in (B), which is the average of Replicate 1 and Replicate 2. Only significant p values are shown in each graph and the exact p values are reported.
(D) Replicate 2 shows no reduction in m6A-mRNA translation efficiency upon depletion of DF1. Surprisingly, and in contrast to Replicate 1 (shown in (C)), Replicate
2 does not show a drop in m6A mRNA translation efficiency upon depletion of DF1. The effect in (B) was observed because it is the average of Replicate 1 and
Replicate 2, which show different effects on m6A mRNA. In B-D, each m6A group was compared to the non-methylated mRNA group using a two-tailed Mann–
Whitney test. Only significant p values are shown in each graph and the exact p values are reported.
(E) The reanalyzed ribosome profiling datasets do not show reduced m6A-mRNA translation efficiency upon depletion of DF1. In this experiment, we accessed the
original next-generation sequencing data generated by Wang et al. (2015) (GEO: GSE63591) and generated a new table of ribosome-protected fragments using a
standard ribosome-profiling analysis pipeline (McGlincy and Ingolia, 2017). We refer to these reanalyzed datasets as ‘‘Replicate R1’’ and ‘‘Replicate R2.’’ mRNAs
were binned based on the number of annotated m6A sites. In contrast to the results presented in (B), analysis of the two replicates (presented as an average)
shows that m6A mRNAs do not show reduced translation efficiency upon DF1 silencing.
(F) The reanalyzed ribosome profiling sample, i.e., Replicate R1, no longer shows a link between m6A mRNA translation and DF1. In contrast to the original
processed data presented in (C), reanalysis of the underlying next-generation sequencing data from the ribosome profiling experiment in Wang et al. (2015) no longer
shows decreased m6A-mRNA translation upon depletion of DF1.
(G) Replicate R2, like the original Replicate 2 shown in (D), shows no change in m6A-mRNA translation efficiency upon depletion of DF1. In E-G, each m6A binned
group was compared to the non-methylated mRNA group using a two-tailed Mann–Whitney test. Only significant p values are shown in each graph and the exact
p values are reported.
(H) The number of ribosome-protected fragments per mRNA reported by Wang et al. (2015) for Replicate 1 only modestly correlates to the number of ribosome-
protected fragments of Replicate 2. Both the control replicates and the DF1-depleted replicates were compared. For this analysis, we used the number of
ribosome-protected fragments reported for the two replicates in the GEO submission for Wang et al. (2015) (GSE63591_C-Y1-Ribosome_profiling.xlsx). These
datasets are referred to as “processed” since the number of ribosome-protected fragments is listed based on the authors’ analysis of their next-generation
sequencing raw data. For each tested condition (transfection with a control siRNA or with siRNAs targeting DF1), a plot showing the comparison between the
number of ribosome-protected fragments calculated for Replicate 1 and Replicate 2 is presented. A Pearson correlation was calculated for each comparison. To
reduce the complexity of the analysis, only the ribosome-protected fragments mapped to the m6A-mRNAs are presented.
(I) Upon our re-examination of the raw data reported by Wang et al. (2015), the number of ribosome-protected fragments per mRNA in Replicate 1 strongly
correlates with the number of ribosome-protected fragments of Replicate 2. This analysis was performed for both the control and DF1-depleted samples. Here,
we accessed the original next-generation sequencing data generated by Wang et al. (2015) (GEO: GSE63591) and generated a new table of ribosome-protected
fragments per gene using a standard ribosome-profiling analysis pipeline (McGlincy and Ingolia, 2017). We refer to these as “Reanalysis of the data from Wang
et al. (2015).” As in (H), a plot showing the comparison between the number of ribosome-protected fragments calculated for Replicate 1 and Replicate 2 is
presented for each tested condition. In contrast to (H), there is a high level of correlation between replicates, independent of the tested condition. This result is
generally seen when analyzing replicates of the same condition, giving a higher level of confidence in interpreting the possible effect of DF1 on the translation of
m6A genes.
(J) m6A-mRNAs do not show a reduction in translation efficiency upon depletion of DF3. We asked if m6A-mRNA translation is affected upon depletion of DF3,
using the ribosome profiling datasets generated by Shi et al. (2017). Unlike in the DF1 knockdown dataset, the processed data provided by Shi et al. (2017)
presented the ribosome-protected fragments from two replicates as a log2-fold-change relative to a single control ribosome profiling experiment. mRNAs were
binned based on the number of annotated m6A sites. Here, we saw no change in translation of m6A-annotated mRNAs using the two knockdown replicates
provided in Shi et al. (2017).
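The replicate comparisons in (H) and (I) rest on a Pearson correlation between the ribosome-protected fragment counts of two replicates. A minimal sketch of that calculation follows; the function name and the example counts are hypothetical and are not taken from GSE63591.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical fragment counts per mRNA in two replicates (illustration only)
rep1 = [120, 45, 300, 10, 88]
rep2 = [110, 50, 280, 14, 95]
r = pearson_r(rep1, rep2)
```

A value of r close to 1 corresponds to the situation described in (I), whereas a modest r corresponds to (H).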
Figure S5. Quality Control Analysis of Ribosome Profiling Datasets and Validation of DF Paralog Silencing in MOLM-13 Cells, Related to
Figures 4 and 5
(A) Reproducibility of the ribosome profiling library replicates. For each silencing condition tested (DF1, DF2, DF3), three independent replicates were performed.
The Pearson correlation coefficient of the normalized number of ribosome-protected fragment reads across replicates was calculated and presented in each
heatmap.
(B) Validation of ribosome profiling datasets. Distribution of read length in the ribosome profiling experiments in each tested condition (DF1, DF2 and DF3
silencing). Most of the reads are 29-30 nt in length, which reflects the expected size of the ribosome footprint when a ribosome is translating on the mRNA and the
A site is occupied (McGlincy and Ingolia, 2017; Wu et al., 2019). The samples were prepared in the absence of cycloheximide as recommended in the most current
ribosome profiling protocols (McGlincy and Ingolia, 2017). The percentage of P-sites assigned to the coding sequence, 3′ UTR, or the 5′ UTR is presented as a bar
plot for each condition. As expected, most of the reads map to the coding region. (lower right) The position of ribosome footprints relative to the reading frame was
determined for each dataset. Overall, these analyses indicate that a high fraction of the reads obtained in these datasets represent ribosome-protected fragments
that reflect the position of the translating ribosomes in cells.
(C) Validation of the DF1, DF2 and DF3 knockdown. To confirm knockdown of the DF paralogs, we quantified the number of ribosome-protected fragments (RPFs)
that mapped to each transcript before and after the silencing of each DF paralog. Upon the silencing, the reduction in the RPFs confirms that DF1, DF2 and DF3
are less translated, consistent with efficient knockdown. These results are consistent with the western blot validation presented in Figures S3A and S3B.
(D) Effect of DF1 depletion on the distribution of m6A-modified mRNAs (OSGIN2, CDC27 and YY1) and non-m6A-modified mRNA (RPS28) along the sucrose
gradient. To independently test whether m6A-mRNAs are less translated, we performed polysome fractionation on a sucrose gradient of DF1-silenced and
control siRNA-treated cytoplasmic lysates (upper graph). The levels of m6A-mRNAs reported to exhibit markedly reduced translation upon DF1 silencing
(YY1 and OSGIN2, 5.6 log2 fold change; CDC27, 3.66 log2 fold change; Wang et al., 2015) were tested in each fraction by qRT-PCR. The quantity of each
mRNA per fraction was normalized to the amount of spike-in luciferase mRNA in each fraction and presented as percentage of the total amount measured in all
fractions. As shown in the lower graphs, DF1 silencing does not affect the distribution of m6A-mRNAs along the gradient. Thus, the effect previously shown by the
ribosome profiling for the top-regulated m6A-mRNAs cannot be independently validated using polysome fractionation and qPCR analysis. Data are means ±
s.e.m. (n = 2).
(E) Reproducibility of the DF1, DF2, and DF3 triple knockdown ribosome profiling dataset. As shown in (A), the Pearson correlation coefficient of the normalized
number of ribosome-protected fragments across replicates was calculated and presented in a heatmap.
(F) Validation of the ribosome profiling dataset obtained upon depletion of DF1, DF2, and DF3. As presented in (B), shown are the distribution of the ribosome
profiling read lengths, the percentage of P-sites assigned to the coding sequence, 3′ UTR, and 5′ UTR, and the position of ribosome footprints relative to the
reading frame. Overall, these analyses indicate that a high fraction of the reads obtained in this dataset represent ribosome-protected fragments that reflect the
position of the translating ribosomes in cells.
(G) Confirmation of DF1, DF2 and DF3 knockdown in MOLM-13 cells. Western blotting was performed 5 days after shRNA transduction. GAPDH was used
as loading control.
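The normalization described in (D), in which each mRNA quantity is divided by the spike-in luciferase quantity of the same fraction and then expressed as a percentage of the total across all fractions, can be sketched as follows. The function name and the numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def fraction_percentages(mrna_qty, spike_qty):
    """Normalize each fraction to its spike-in, then express as % of the total."""
    if len(mrna_qty) != len(spike_qty):
        raise ValueError("one spike-in value is needed per fraction")
    # Per-fraction ratio of target mRNA to spike-in luciferase mRNA
    normalized = [m / s for m, s in zip(mrna_qty, spike_qty)]
    total = sum(normalized)
    # Share of each fraction in the summed, normalized signal
    return [100.0 * value / total for value in normalized]


# Hypothetical qRT-PCR quantities for four gradient fractions
pct = fraction_percentages([2.0, 4.0, 6.0, 8.0], [1.0, 1.0, 2.0, 2.0])
```

By construction the returned percentages sum to 100, so profiles from control and DF1-silenced lysates can be compared on the same scale.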
Article
Metabolic Fingerprinting Links Oncogenic PIK3CA

with Enhanced Arachidonic Acid-Derived
Eicosanoids
Nikos Koundouros, Evdoxia Karali,
Aurelien Tripp, ..., Robert C. Glen,
Zoltan Takats, George Poulogiannis
Correspondence
z.takats@imperial.ac.uk (Z.T.),
george.poulogiannis@icr.ac.uk (G.P.)
In Brief
Metabolic fingerprinting using the iKnife
offers near real-time diagnosis of PIK3CA
mutant breast cancers and connects
oncogenic PIK3CA with enhanced
arachidonic acid metabolism. cPLA2
inhibition shows remarkable synergy with
dietary fat restriction to restore tumoral
immune cell infiltration and inhibit growth
of mutant PIK3CA-bearing breast tumors.
Highlights
• The iKnife offers near real-time diagnosis of PIK3CA mutant breast cancers
• Oncogenic PIK3CA promotes enhanced arachidonic acid via mTORC2-PKCζ-cPLA2 signaling
• Mutant PIK3CA regulates proliferation beyond a cell autonomous manner
• cPLA2 inhibition and dietary fat restriction suppress PIK3CA-induced tumorigenicity
Koundouros et al., 2020, Cell 181, 1596–1611
June 25, 2020 © 2020 The Author(s). Published by Elsevier Inc.
OPEN ACCESS
Article
Metabolic Fingerprinting Links Oncogenic PIK3CA
with Enhanced Arachidonic Acid-Derived Eicosanoids
Nikos Koundouros,1,2 Evdoxia Karali,1 Aurelien Tripp,1 Adamo Valle,1,3,4,5 Paolo Inglese,2 Nicholas J.S. Perry,1
David J. Magee,1,6 Sara Anjomani Virmouni,1,7 George A. Elder,1,8 Adam L. Tyson,9,10 Maria Luisa Dória,2
Antoinette van Weverwijk,11,12 Renata F. Soares,2 Clare M. Isacke,11 Jeremy K. Nicholson,2,13 Robert C. Glen,2,14
Zoltan Takats,2,* and George Poulogiannis1,2,15,*
1Signalling and Cancer Metabolism Team, Division of Cancer Biology, The Institute of Cancer Research, 237 Fulham Road, London SW3
6JB, UK
2Division of Systems Medicine, Department of Metabolism Digestion and Reproduction, Imperial College London, London SW7 2AZ, UK
3Energy Metabolism and Nutrition, Research Institute of Health Sciences (IUNICS), University of Balearic Islands, 07122 Palma de
Mallorca, Spain
4Health Research Institute of the Balearic Islands (IdISBa), University of Balearic Islands, 07120 Palma de Mallorca, Spain
5Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition (CIBERobn CB06/03/0043), Instituto de Salud Carlos
III, 28029 Madrid, Spain

6Pain Medicine Department, The Royal Marsden Hospital, London, UK
7Department of Life Sciences, College of Health and Life Sciences, Brunel University London, Uxbridge UB8 3PH, UK
8School of Biological and Chemical Sciences, Queen Mary University of London, London E1 4NS, UK
9Flow Cytometry and Light Microscopy Core Facility, Division of Cancer Biology, The Institute of Cancer Research, 237 Fulham Road, London
SW3 6JB, UK
10Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, 25 Howland Street, London W1T 4JG, UK
11Breast Cancer Now Research Centre, The Institute of Cancer Research, 237 Fulham Road, London SW3 6JB, UK
12Division of Tumor Biology and Immunology, the Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands
13The Australian National Phenome Centre, Health Futures Institute, Murdoch University, Perth WA6150, WA, Australia
14Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
15Lead Contact
*Correspondence: z.takats@imperial.ac.uk (Z.T.), george.poulogiannis@icr.ac.uk (G.P.)

SUMMARY
Oncogenic transformation is associated with profound changes in cellular metabolism, but whether tracking
these can improve disease stratification or influence therapy decision-making is largely unknown. Using the
iKnife to sample the aerosol of cauterized specimens, we demonstrate a new mode of real-time diagnosis,
coupling metabolic phenotype to mutant PIK3CA genotype. Oncogenic PIK3CA results in an increase in
arachidonic acid and a concomitant overproduction of eicosanoids, acting to promote cell proliferation
beyond a cell-autonomous manner. Mechanistically, mutant PIK3CA drives a multimodal signaling network
involving mTORC2-PKCζ-mediated activation of the calcium-dependent phospholipase A2 (cPLA2).
Notably, inhibiting cPLA2 synergizes with fatty acid-free diet to restore immunogenicity and selectively
reduce mutant PIK3CA-induced tumorigenicity. Besides highlighting the potential for metabolic phenotyping
in stratified medicine, this study reveals an important role for activated PI3K signaling in regulating arachi-
donic acid metabolism, uncovering a targetable metabolic vulnerability that largely depends on dietary fat
restriction.
INTRODUCTION

In addition to the routine use of histological subtyping in cancer diagnosis and therapy decision-making, a renewed interest has emerged in the identification of novel predictive factors to improve patient stratification and therapeutic response. Malignant transformation and disease progression involve multiple changes in biosynthetic and energy production pathways, offering opportunities to exploit the clinical utility of metabolic tracking in diagnosis and disease monitoring (Vander Heiden and DeBerardinis, 2017). Cells can also reprogram pathways of nutrient acquisition and processing as a result of oncogenic activation of commonly perturbed signaling pathways (Tarrado-Castellarnau et al., 2016). A characteristic example is the activation of the phosphatidylinositol 3-kinase (PI3K)/Akt/mammalian target of rapamycin (mTOR) pathway, which regulates a wide range of transcriptional and post-translational programs to support the anabolic and catabolic requirements
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
of proliferating cells (Carracedo and Pandolfi, 2008; Dibble and Manning, 2013; Fruman et al., 2017; Lien et al., 2016).
These observations raise some fundamental questions: whether we can use metabolic tracking for more effective screening of the molecular features underlying tumor pathogenesis and, ultimately, whether this information can be translated into better and more efficacious treatment strategies for each patient. Indeed, the concept of metabotyping has been widely applicable in characterizing functionally distinct traits that have the power to influence clinical decision-making (Gavaghan et al., 2000; Holmes et al., 2008; Nicholson et al., 2002, 2012).
Here, we used rapid evaporative ionization mass spectrometry (REIMS), which coupled to the intelligent surgical device, also known as iKnife, allows for instantaneous chemical analysis of the aerosol generated during electrosurgical tissue ablation and cauterization, in the form of gas-phase ionic species. Unlike other technologies that are commonly used for metabolite profiling, such as liquid chromatography-mass spectrometry (LC-MS), REIMS analysis requires no sample preparation and allows for near real-time (1–2 s) lipidomic analysis and tissue recognition, based on multivariate classification analysis of spectral libraries of reference mass spectra.
The iKnife/REIMS can be used both in the intraoperative and biopsy collection settings to differentiate cancerous from non-cancerous tissues with very high precision, based on their lipidomic composition (Alexander et al., 2017; Balog et al., 2013; St John et al., 2017). However, the potential of using this technology beyond the pattern-level identification of tissues, to reveal the biological mechanisms underlying unique metabolic signatures, or identify which patients will likely benefit from a given treatment, has not yet been explored.

RESULTS

Metabolic Phenotyping Using REIMS Predicts Molecular Markers Including Oncogenic Mutations in PIK3CA
We first examined whether REIMS-detected lipid signatures correlate with any established molecular markers of breast cancer of known prognostic and therapeutic value (Figure 1A). For this, we selected a panel of 43 breast cancer cell lines, 18 patient-derived xenograft (PDX), and 12 primary breast tumors that are well characterized for their estrogen (ER), progesterone (PR), and HER2 receptor status. REIMS profiling of cell lines consistently classified ER, HER2, and triple negative status (TN) with area under the curve (AUC) accuracies between 0.8–0.9, and 0.6–0.7 for PR (Figure 1B).
Consistent with previous studies (Hilvo et al., 2011), the most striking differences in lipid profiles were observed between ER-positive (+ve) and -negative (−ve) breast cancer cell lines (Figures 1B and S1A; Table S1) and tumor specimens (Figure S1B). A surrogate marker for ER positivity, aside from its routine determination by immunohistochemistry (IHC), is expression of the estrogen receptor 1 (ESR1) gene. We built a regression model to predict ESR1 expression based on the spectral profiles obtained by REIMS and tested this in representative ER+ve cell lines treated with or without 4-hydroxy-tamoxifen (4-OHT). Of note, the predicted ESR1 expression was significantly reduced following 4-OHT treatment as compared to untreated controls (Figures 1C and S1C), suggesting that the modulation of ER signaling induces distinct lipidomic alterations, which are detectable by REIMS and are reversible by ER inhibition.
With a robust lipidomic profile obtained using REIMS, we next performed unsupervised hierarchical clustering to partition all breast cancer cell lines on the basis of their spectral similarities measured over 872 lipid species. This analysis revealed two subtypes with distinctive signatures, in which REIMS-detected lipid species were significantly enriched (black) or depleted (gray) (Figure 1D). The observed clusters were also confirmed using a consensus non-negative matrix factorization (NMF) (Figure S1D).
To shed light on the mechanism that is driving this unique metabolic classification, we examined mutational enrichment of the cells between the two subtypes. Out of the top 150 genes that are frequently (>20%) mutated in these cell lines, oncogenic mutation in PIK3CA was the only one to be significantly (Fisher’s test, p value = 0.019) overrepresented in the lipid-enriched cluster (Figure 1E; Table S2). In accordance with this finding, analysis of isogenic MCF10A PIK3CA wild-type (WT) and mutant (MUT) (E545K and H1047R) cell lines also revealed clustering of the latter in the lipid-enriched group, both when cells were cultured in 2D, or 3D as spheroids (Figures 1E and S1E). Consistent with this stratification, performing gene and functional pathway enrichment analyses revealed KEGG-pathway ontologies relating to metabolic pathways that were significantly associated with the lipid-enriched subtype (Figures S2A and S2B). Specific overexpressed genes included FASN and ELOVL6, which are involved in de novo lipogenesis, and LDLRAP1, which facilitates exogenous lipid uptake (Figures S2C–S2E). Indeed, PIK3CA MUT cells displayed elevated induction of the de novo lipogenesis transcriptional regulator SREBP1 (Figure 1F) and higher exogenous FA uptake capacity (Figure 1G), suggesting that both could contribute to the lipid-enriched metabotype.
Most importantly, the observed metabolic stratification was also evident among PIK3CA WT and MUT breast cancer PDXs (Figure 1H) and primary tumors (Figure 1I; Table S3). Among the PDX tumors assessed (n = 18), only one was misclassified (BR5017), and this harbored a rare I391M mutation that has activity reminiscent of WT PIK3CA (Dan et al., 2010) (Figure 1H). Overall, PIK3CA mutation status in both PDX and primary tumors could be classified with an accuracy of 90% using all measurable lipid species (Figure S1F), suggesting that the iKnife/REIMS could be used for near real-time diagnosis of PIK3CA MUT breast cancers by MS analysis of aerosolized tissue material.

mTORC2 Signaling Downstream of Oncogenic PIK3CA Drives the Lipid-Enriched Phenotype
The effects of PI3K/Akt/mTOR signaling on lipid metabolism have been observed on numerous levels (Dibble and Manning, 2013; Lien et al., 2016; Saxton and Sabatini, 2017). To elucidate the specific mechanisms underlying the observed phenotype, we treated PIK3CA MUT cells with inhibitors targeting the activity of key nodes in the PI3K pathway (Figure 2A), albeit at concentrations that do not affect cell viability (Figure S3A). PI3K (BYL719, BKM120) and mTOR (rapamycin, torin 1) inhibition dramatically reduced relative phospholipid levels, but surprisingly, Akt inhibition with either MK2206 or GSK690693 did not (Figure 2B). Similar results
Figure 1. REIMS Analysis Predicts Breast Cancer Molecular Markers Including Oncogenic Mutations in PIK3CA
(A) Schematic overview of sample preparation for REIMS analysis.
(B) Area under the curve (AUC) classification accuracies for ER, PR, HER2 receptor, and triple negative status of 43 breast cancer (BC) cell lines (median intensity
of n = 3 biological replicates) following feature selection for phospholipids in the m/z range 600–900, and leave-one-out cross validation.
(C) Immunoblot analysis of estrogen inducible protein pS2 and predicted ESR1 expression in ER+ve MCF7 cells following treatment with 0.1% DMSO or indicated
concentrations of 4-OHT for 72 h.
(D) Unsupervised hierarchical clustering of 872 lipid species detected by REIMS across 43 BC cell lines.
(E) Dendrogram of BC cell lines and isogenic MCF10A cells harboring either WT or MUT (E545K or H1047R) PIK3CA.
(F) Immunoblot analysis of mature SREBP1 transcription factor expression in nuclear extracts of the MCF10A PIK3CA isogenic panel.
(G) Relative exogenous fatty acid uptake in MCF10A PIK3CA WT and MUT cells following serum starvation for 1 h and supplementation with fluorescently labeled
dodecanoic acid (n = 5 replicates).
(H and I) Unsupervised hierarchical clustering of 9 PIK3CA WT and 9 MUT breast PDX tumors (H) and 5 WT and 7 MUT primary breast tumors (I). Individual rows in
the heatmaps in (D), (H), and (I) correspond to scaled Z score phospholipid intensities (n = 3 biological replicates). Error bars represent ± SEM. n.s., not significant;
*p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001. p values in (C, bottom panel) and (G) were calculated with one-way ANOVA, followed by unpaired, two-tailed Student’s t test
with Bonferroni correction.
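The overrepresentation of PIK3CA mutations in the lipid-enriched cluster shown in (E) is reported in the text with a Fisher’s test. A minimal sketch of a two-sided Fisher’s exact test on a 2×2 table follows. The sidedness, the function name, and the example counts are assumptions made for illustration; the counts are not the cell line counts of this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical table: rows = cluster (enriched, depleted), columns = (MUT, WT)
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

With real data, the four cells would hold the numbers of MUT and WT cell lines falling in the lipid-enriched and lipid-depleted clusters.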
Figure 2. Oncogenic PIK3CA Drives the Lipid-Enriched Phenotype via mTORC2 Signaling
(A) MCF10A PIK3CA MUT cells were treated with BYL719 or BKM120 (100 nM), MK2206, or GSK690693 (150 nM) for 72 h, or rapamycin (20 nM) for 4 h, or
rapamycin or torin 1 (20 nM) for 72 h.
(B) Unsupervised hierarchical clustering of MCF10A E545K and H1047R MUT cells treated with PI3K, AKT, and mTOR inhibitors.
(C and D) Immunoblot analysis (C) and unsupervised hierarchical clustering (D) of MCF10A E545K and H1047R cells transfected with RAPTOR, RICTOR, or mTOR
siRNA. Individual rows in the heatmaps in (B) and (D) correspond to scaled Z score phospholipid intensities (n = 3 biological replicates).
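The heatmap rows in (B) and (D) are described as scaled Z scores of phospholipid intensities. A minimal sketch of row-wise Z score scaling follows. Whether the population or the sample standard deviation was used is not stated, so the population form here is an assumption, and the function name and intensities are hypothetical.

```python
from statistics import mean, pstdev


def zscore_row(values):
    """Scale one heatmap row to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A constant row carries no contrast; return zeros rather than divide by zero
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical intensities of one lipid species across three samples
z = zscore_row([1.0, 2.0, 3.0])
```

Scaling each lipid species separately puts abundant and scarce species on a common scale before clustering.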
were observed in a panel of 5 PIK3CA MUT breast cancer cell lines, with the exception of MCF7 cells that also responded to MK2206 (Figure S3B).
Interestingly, we did not observe an effect on relative phospholipid abundances following acute exposure to rapamycin for 4 h, despite inhibition of mTORC1 (Figures 2A, bottom left panel, and 2B). Because extended rapamycin treatment inhibits mTORC1 and mTORC2 (Figure 2A, bottom right panel) (Sarbassov et al., 2006), and both impinge upon lipogenesis (Düvel et al., 2010; Griffiths et al., 2013; Guri et al., 2017; Lee et al., 2017; Ricoult et al., 2016), we sought to investigate which of these complexes might contribute to the regulation of the observed phenotype. Knockdown of RICTOR or mTOR, but not RAPTOR, led to a significant reduction in relative phospholipid abundances (Figures 2C and 2D), pointing to a PIK3CA- and mTORC2-dependent metabolic phenotype that is largely independent of mTORC1 or Akt inhibition.

Oncogenic PIK3CA Drives Enhanced Arachidonic Acid Metabolism, thereby Promoting Cell Proliferation beyond a Cell-Autonomous Manner
Given that global lipidomic profiles could stratify breast cancer cell lines and tumors based on PIK3CA mutation status, we next aimed to characterize specific lipid alterations that are associated with oncogenic PIK3CA. Fatty acids (FAs), which are the main constituents of phospholipids and have additional effector functions in cancer pathogenesis, were profiled in the PIK3CA isogenic panel using REIMS. Among the most abundant FAs in PIK3CA MUT compared to WT cells were palmitoleate (FA16:1), palmitic acid (FA16:0), and oleic acid (FA18:1), all of which are established products of elevated lipogenesis and in line with the observed lipid enriched phenotype (Figure 1D; Table S4). Interestingly, the second most significantly elevated FA after palmitoleate was arachidonic acid (AA) (FA20:4), an omega-6 FA which is predominantly found in animal fats and is of particular relevance as a major regulator of pro-inflammatory responses in cancer, through the production of bio-active lipids known as eicosanoids (Wang and Dubois, 2010) (Figure 3A; Table S4).
Importantly, in addition to the cell lines, significant elevations in AA were also observed in all the PIK3CA MUT breast PDX and primary tumors (Figures 3B and 3C) and across tumors of other tissue types including ovarian, pancreatic, and sarcomas (Figure 3D). In agreement with our REIMS findings, AA and downstream eicosanoids were also found to be significantly elevated in both PIK3CA MUT cells using LC-MS (Figures 3E, S4A, and S4B).
To measure FAs that are secreted from cells, as opposed to those that might already exist in serum-supplemented media, cells were grown under FA-deprived conditions. Pro-inflammatory derivatives of AA were significantly increased in the media of PIK3CA MUT cells (Figure 3F), signifying a potential role for these bio-active lipids in tumor microenvironment (TME) interactions.
Next, to ascertain the functional consequences of elevated eicosanoid metabolism, the effects of PIK3CA MUT-derived conditioned media (CM) were assessed. PIK3CA WT cells displayed dramatically increased proliferative rate following incubation with CM obtained from MUT cells, and this was effectively rescued by depleting the lipids from the media (Figures 3G, S4C, and S4D). Moreover, the proliferation of both PIK3CA
Figure 3. Oncogenic PIK3CA Drives Enhanced Arachidonic Acid Metabolism

(A and B) Arachidonic acid (AA) levels measured by REIMS in MCF10A PIK3CA WT and MUT cells cultured under full 5% horse serum or fatty acid-free (FAF)
conditions for 72 h (n = 3 biological replicates) (A). AA levels of 18 breast PDX tumors (n = 9 PIK3CA WT and n = 9 MUT) (left) (B). Three sections corresponding to
different tumor regions were analyzed with REIMS. Data are summarized in the boxplot to the right.
(C) 12 primary breast tumors (n = 5 PIK3CA WT and n = 7 MUT) (left). Data are summarized in the boxplot to the right.
(D) Breast, ovarian, pancreatic, sarcoma, and colorectal PDX tumors (n = 5 PIK3CA WT and MUT tumors for breast, pancreatic, sarcoma, and colorectal tissues,
and n = 4 PIK3CA WT and MUT ovarian PDX tumors).
(E) Heatmap and Venn diagram summarizing the intracellular eicosanoids that were significantly different between MCF10A PIK3CA WT and MUT cells. Rows
correspond to the Z score scaled eicosanoid intensities detected by LC-MS (n = 3 biological replicates).
(F) Heatmap and Venn diagram summarizing the eicosanoids of the conditioned media (CM) that were significantly different between MCF10A PIK3CA WT and
MUT cells.
(G) Cell proliferation assays of MCF10A PIK3CA WT cells cultured in CM derived from WT or H1047R MUT cells before or after lipid depletion (LD), with or without
the supplementation of 25 μM AA, palmitate, or palmitoleate.
(H) Cell proliferation assays of MCF10A PIK3CA H1047R MUT cells before or after LD, with or without the supplementation of 25 μM AA, palmitate, or palmitoleate.
Sulforhodamine B (SRB) protein staining was used in (G) and (H) to measure cell proliferation over 5 days (replicates from n = 3 wells). Error bars in (G) and
(H) represent mean ± SEM for each time point. *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001. p values in (A)–(D) were calculated with unpaired, two-tailed Student’s t test.
Two-way ANOVA was used for (G) and (H).
MUT cell lines was significantly reduced following incubation with their respective lipid-deprived CM (Figures 3H and S4E), whereas supplementation of lipid-deprived CM with AA, but not palmitate or palmitoleate, restored proliferation in both WT and MUT cells (Figures 3G, 3H, S4D, and S4E). Together, these data support not only a critical role for oncogenic PIK3CA in influencing autocrine- and paracrine-mediated cell proliferation, but also point to AA as an easily measured metabolic biomarker that could help with the diagnosis and treatment of PIK3CA MUT tumors.
Figure 4. Oncogenic PIK3CA Signaling Triggers cPLA2-Induced Arachidonic Acid Production

(A) Enzymatic activity of cPLA2, iPLA2, and sPLA2 in the MCF10A PIK3CA isogenic panel.
(B–D) cPLA2 activity (B) and AA levels (C and D) measured by REIMS in MCF10A H1047R PIK3CA MUT cells following RAPTOR or RICTOR siRNA-mediated
knockdown (C), or treatment with 100 nM ASB14780, 1 μM each of PKCα, β, ε, or ζ peptide inhibitors, 250 μM GSK650394, or 150 nM MK2206 for 72 h (D). Cells
were grown under exogenous FAF conditions.
(E) cPLA2 activity following PKCζ inhibition with 1 μM peptide inhibitor for 72 h.
Oncogenic PIK3CA Promotes Enhanced Arachidonic mTORC2 signaling to this process is unknown, as is the role of
Acid Production via mTORC2-PKCζ-cPLA2 Signaling PKCζ in the regulation of cPLA2. Consistent with PKCζ being a
To better understand the mechanism by which PIK3CA MUT direct substrate of PDK-1 (Chou et al., 1998) and mTORC2 phos-
cells have elevated AA, we assessed various pathways which phorylation (Li and Gao, 2014), it was found to be hyperphos-
contribute to its cellular pool, including: direct exogenous up- phorylated in PIK3CA MUT cells (Figure 4F) and breast PDX
take, synthesis from linoleic acid, hydrolysis from diacylglycerol tumors (Figure S5L). Moreover, its inhibition led to a marked
(DAG), or endogenous release from membrane phospholipid reduction in MAPK/ERK signaling (Figure 4G), as well as p38
through phospholipase (PLAs) activity. Curiously, we noted a MAPK phosphorylation and active GTP-bound Rac-1 (Fig-
persistent increase in AA in PIK3CA MUT isogenic panel even ure 4H), culminating in reduced cPLA2 phosphorylation at the
when cells were cultured with FA-free media (Figure 3A). Addi- S505 site (Figures 4G and S5L).
tionally, DAG levels—that can be in part generated from the hy- In addition to S505 phosphorylation, an increase in intracel-
drolysis of phosphatidylinositol-4,5-bisphosphate (PIP2)—were lular calcium levels is essential for sustained phospholipase ac-
significantly reduced in PIK3CA MUT cells (Figures S4F tivity and liberation of AA by cPLA2 (Ambs et al., 1995; Clark
and S4G). et al., 1995). Given that elevated PIP3 levels, induced by onco-
These results pointed to a potential role for phospholipases genic PIK3CA, promote the activation of phospholipase C
(PLAs), of which three main classes predominate: cytosolic/cal- gamma 1 (PLCγ1), leading to an increase in cytosolic calcium
cium-dependent (cPLA2), calcium-independent (iPLA2), and via generation of inositol-1,4,5-trisphosphate (IP3) (Rameh
secretory (sPLA2) phospholipase A2 (Burke and Dennis, 2009). et al., 1998), we hypothesized that this signaling node could
Among these, only cPLA2 displayed significantly higher enzy- also play a role in regulating cPLA2 activity and AA release down-
matic activity (Figure 4A), as well as elevated total protein levels stream of active PI3Ka. In line with this premise, we observed
and stability (Figures S5A and S5B) in the presence of oncogenic higher phosphorylation of PLCg1 (Figure S6A) and significantly
PIK3CA, whereas expression of PLA2G4A—the gene encoding elevated intracellular calcium levels in PIK3CA MUT cells (Fig-
cPLA2a—remained unchanged (Figure S5C). ure S6B). Although genetic and pharmacological (U73122) inhibi-
Consistent with the predominant role of mTORC2 in driving the tion of PLCg1 led to a significant reduction in calcium flux in both
lipid enriched phenotype in PIK3CA MUT cells (Figures 2C and PIK3CA WT and MUT cells (Figures S6C–S6E), an inhibitory ef-
2D), RICTOR, but not RAPTOR, silencing rescued cPLA2 activity fect on cPLA2 activity and AA levels was only observed in the
(Figure 4B), and led to a concomitant reduction in AA and pros- MUT cells (Figures S6F–S6H), highlighting the importance of
taglandin E2 (PGE2) levels (Figures 4C and S5D–S5F). Impor- PLCg1-mediated Ca2+ flux in sustaining elevated cPLA2 activity
tantly, mTORC2-specific inhibition was also accompanied by in the context of oncogenic PIK3CA.
decreased cPLA2 stability (Figure S5G). To elucidate the mech- Finally, because cPLA2 has a predicted PKCz phosphorylation
anism underlying this observation, known substrates of site (T376), we tested the possibility that it could serve as a direct
mTORC2 including serum/glucocorticoid-regulated kinase 1 substrate for PKCz. In vitro kinase assays using purified PKCz
(SGK-1) and the protein kinase C (PKC) isoforms were inhibited and cPLA2 suggested a direct interaction and phosphorylation
in PIK3CA MUT cells. ASB14780—an indole-based compound (Figure 4I), and this was confirmed following immunoprecipita-
that inhibits both cPLA2 translocation to membrane compart- tion (Figure 4J) and proximity ligation activity (PLA) assays (Fig-
ments and the interaction between phospholipid substrates ures S6I and S6J). To further evaluate cPLA2 as a candidate sub-
with the enzyme active site (McKew et al., 2008; Tomoo et al., strate for PKCz, we developed a custom antibody recognizing
2014)—was used as a positive control. Importantly, a significant the cPLA2 T376 phosphorylation site. Specificity was validated
reduction in AA and cPLA2’s enzymatic activity, reminiscent of in both serum starved/stimulated samples (Figure S6K, left
that observed upon ASB14780 inhibition, was only observed panel), following PKCz inhibition (Figures S5K and S6K, middle),
following pharmacological and small interfering RNA (siRNA)- and in cPLA2 CRISPR knockout cells overexpressing a phos-
mediated inhibition of PKCz (Figures 4D, 4E, and S5H–S5K). phoresistant MUT (T376A) cPLA2 (Figure S6K, right panel).
Previous studies have shown that phosphorylation of cPLA2 Importantly, increased T376 phosphorylation was observed in
on S505 regulates its activity and stability and that this is medi- the presence of oncogenic PIK3CA and this was reduced to
ated, at least in part, by the p38 MAPK/ERK signaling pathway levels equivalent to PIK3CA WT cells upon PKCz inhibition
(Kramer et al., 1996; Lin et al., 1993). The contribution of PI3K- (Figure 4K).
(F and G) Immunoblot analysis of the MCF10A PIK3CA isogenic panel following growth factor deprivation for 16 h and 30 min stimulation with serum and growth factors (F) or PKCζ inhibition with 1 μM peptide inhibitor for 72 h (G).
(H) Immunoblot analysis of activated Rac-1 and p38 MAPK in the MCF10A PIK3CA isogenic panel following PKCζ inhibition with 1 μM peptide inhibitor for 72 h.
(I) In vitro kinase assay of 100 ng and 0.5 mg/mL purified PKCζ and cPLA2 proteins, respectively.
(J) Immunoblot analysis of anti-HA immunoprecipitates derived from HA-tagged cPLA2 transfected MCF10A PIK3CA WT and MUT cells.
(K) Immunoblot analysis of anti-HA immunoprecipitates derived from HA-tagged cPLA2 transfected MCF10A PIK3CA WT and MUT cells treated where indicated with 1 μM PKCζ peptide inhibitor for 48 h.
(L) AA levels across H1047R MUT cells with CRISPR knockout of PLA2G4A reconstituted with WT or phosphoresistant cPLA2 isoforms.
(M) Diagram summarizing the proposed model for PI3K-mTORC2-PKCζ and calcium-dependent activation of cPLA2, leading to a concomitant increase in AA and downstream eicosanoids. Data are presented as the mean ± SEM of n = 3–6 biological replicates and are representative of at least two independent experiments. n.s., not significant; *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001. p values in (A) were calculated with unpaired, two-tailed Student’s t test, and in (B)–(E), (I), and (L) with one-way ANOVA, followed by unpaired, two-tailed Student’s t test with Bonferroni correction.
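The summary statistics named in this legend (mean ± SEM, Bonferroni correction, star notation) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names and all input numbers are hypothetical, and the raw p values are assumed to come from pairwise t tests computed elsewhere.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(replicates):
    """Return (mean, SEM) for a list of biological replicate values."""
    if len(replicates) < 2:
        raise ValueError("SEM needs at least two replicates")
    return mean(replicates), stdev(replicates) / sqrt(len(replicates))


def bonferroni(raw_p_values):
    """Multiply each raw p value by the number of comparisons, capped at 1."""
    n_comparisons = len(raw_p_values)
    return [min(1.0, p * n_comparisons) for p in raw_p_values]


def significance_label(p):
    """Map an adjusted p value to the star notation used in the legend."""
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "n.s."


# Hypothetical numbers, for illustration only (not data from the study).
example_replicates = [1.0, 1.2, 1.1]
example_raw_p = [0.0002, 0.01, 0.5]

avg, sem = mean_sem(example_replicates)
adjusted = bonferroni(example_raw_p)
labels = [significance_label(p) for p in adjusted]
print(f"mean ± SEM = {avg:.3f} ± {sem:.3f}")
print(list(zip(adjusted, labels)))
```

Bonferroni correction is conservative: with three comparisons, a raw p value of 0.01 becomes 0.03 and drops from two stars to one.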
1602 Cell 181, 1596–1611, June 25, 2020

Article OPEN ACCESS
Figure 5. Genetic and Pharmacological Inhibition of cPLA2 Selectively Reduces Oncogenic PIK3CA-Mediated Tumorigenicity
(A–D) Cell viability of (A) PIK3CA WT and MUT MCF10A cells, and (B) breast cancer cell lines (PIK3CA WT: MDAMB134, Hs578T, AU565; PIK3CA MUT: MCF-7, CAL-51, MDAMB453) following treatment with increasing concentrations (20 nM–10 μM) of ASB14780 under full serum conditions for 72 h. The same treatments were also performed under fatty acid-free conditions in (C) and (D), in the presence or absence of exogenous supplementation of 25 μM AA.
(E) Clonogenic assays of MCF10A PIK3CA WT and MUT cells treated with increasing concentrations of ASB14780 as in (A)–(D). Treatments were performed under fatty acid-free conditions, with or without the supplementation of 25 μM AA.
(F) Immunoblot analysis confirming specific knockdown of cPLA2 using two independent constitutive shRNAs (sh1 and sh5) (left) and reduction in AA levels in
MCF10A E545K/H1047R MUT cells using REIMS.
(G) Proliferation of MCF10A H1047R MUT cells expressing shGFP, cPLA2-sh1, or cPLA2-sh5 under exogenous FAF conditions. Sulforhodamine B (SRB) protein
staining was used to measure cell proliferation over 5 days.

To ascertain the functional significance of these two phosphorylation sites (S505 and T376), endogenous cPLA2 in PIK3CA WT and H1047R MUT cells was reconstituted with WT or phosphoresistant MUT of cPLA2 (S505A or T376A) (Figure 4L). Knockout of cPLA2 in H1047R cells reduced AA to levels equivalent to the PIK3CA WT background, and this could be rescued with ectopic expression of the WT, but not MUT (S505A or T376A) cPLA2 (Figure 4L). Interestingly, the activity of exogenously expressed WT or MUT cPLA2 largely mirrored the trends in AA that were previously detected by REIMS (Figure S6L), suggesting that cPLA2 activity is highly dependent on the phosphorylation of S505 and T376 in the context of oncogenic PIK3CA. Overall, our model provides a unifying framework for several previously unconnected components of PI3K signaling, which converge on cPLA2 activation and enhanced AA metabolism (Figure 4M).

cPLA2 Inhibition and Dietary Fat Restriction Suppress PIK3CA-Induced Tumorigenicity and Restore Anti-cancer Immune Responses
If cPLA2 is required for PIK3CA oncogenicity, then targeting this enzyme could represent an attractive therapeutic strategy. For this, we used the inhibitor ASB14780, which displays excellent oral bioavailability and higher specificity for the cPLA2 isoform than other commonly used compounds such as Efipladib and Ecopladib (Lee et al., 2007; Tomoo et al., 2014). Furthermore, ASB14780 has been shown to ameliorate inflammatory pathologies including non-alcoholic fatty liver disease (NAFLD) and chronic obstructive pulmonary disease (COPD) through suppression of AA and prostaglandin synthesis (Kanai et al., 2016; Tomoo et al., 2014). However, the potential anti-neoplastic properties of this inhibitor remain obscure. Of note, PIK3CA mutation sensitized cells to pharmacological inhibition of cPLA2 with ASB14780 (Figures 5A and 5B), and this effect was more prominent under exogenous FA-free conditions (Figures 5C and 5D), alleviating any compensatory mechanisms to obtain AA. In addition to viability, cPLA2 inhibition significantly reduced the clonogenicity of PIK3CA MUT cells, and, importantly, both could be rescued by exogenous supplementation of AA (Figures 5C, 5D, and 5E). Near-identical results were obtained following genetic knockdown of cPLA2 using two constitutive short-hairpin RNAs (shRNAs, denoted as cPLA2-sh1 and cPLA2-sh5) (Figures 5F, 5G, and S6M), while WT cells were unaffected, suggesting that cPLA2 is dispensable in this setting (Figure S6N). To further confirm the importance of cPLA2-induced AA metabolism in PIK3CA MUT cancers, we suppressed cPLA2 in the isogenic panel and assessed their ability to form colonies under FA-free conditions. Although no significant difference in colony formation was detected in PIK3CA WT cells, there was a marked reduction in the number of colonies following cPLA2 knockdown in the MUT cells that was restored in the presence of AA (Figure 5H). Furthermore, using an inducible shRNA against cPLA2, we demonstrated that PIK3CA MUT, but not WT cells, rely on cPLA2 to form epithelial acini in 3D culture and sustain their proliferation (Figures 6A–6C).
To further evaluate the therapeutic effect of cPLA2 inhibition in primary breast cancers, we treated triple negative breast cancer (TNBC) (Figure 6D) PIK3CA WT (Figure 6E) and MUT (Figure 6F) PDX-bearing mice with the ASB14780 inhibitor in conjunction with a normal or near-isocaloric fat-free diet. A significant reduction in tumor weight was only observed in the PIK3CA MUT PDX model when both the inhibitor and a fat-free diet were administered in combination (Figure 6F). Corroborating these observations, histological analysis did not reveal any changes in tumor area for the PIK3CA WT BR1458 model (Figures 6G, left panels, and 6H), while a striking reduction in viable tumor regions was observed in PIK3CA MUT PDX-bearing mice treated with ASB14780 in fat-free diet (Figures 6G, right panels, and 6I). The concomitant increase in necrotic regions (as indicated by areas of pale eosinophilic cytoplasms, in addition to loss of nuclei and karyolysis), evidently contributed to a substantial proportion of the overall weight of the PIK3CA MUT-bearing tumor that was left after treatment with ASB14780 in fat-free diet (Figure 6G, bottom right panel). In the interest of measuring AA levels at the end of the treatment regime (Figure 6D), resected tumors were analyzed directly with REIMS. Although a fat-free diet alone led to a modest, yet significant reduction in the AA levels of the PDX tumors, the decrease was significantly more pronounced when accompanied by cPLA2 inhibition (Figures 6J and 6K), highlighting the role of dietary AA restriction in therapy response (Figures 6E–6I).
The observation that cPLA2 inhibition and fat-free diet selectively reduce the tumorigenicity of PIK3CA MUT cells raises the interesting prospect of modulating this response by altering dietary fat content. Indeed, it is becoming increasingly appreciated that the diet of Western populations commonly contains an excess of pro-inflammatory omega-6 to omega-3 FAs by up to 50 times (popularly referred to as the ‘‘Western’’ diet), and this may be implicated in the progression of breast and colorectal cancers (Patterson et al., 2012; Simopoulos, 2008). To further explore this premise, we injected triple negative CAL51 (PIK3CA MUT) and Hs578T (PIK3CA WT) cell lines stably expressing either control shGFP, or two independent shRNAs targeting cPLA2 (cPLA2-sh1 or sh5) into the mammary fat pad of BALB/c nude mice that had been preconditioned on either fat-free, balanced (omega6:omega3 = 1:1) or ‘‘Western’’ (omega6:omega3 = 50:1) diets (Figure 7A). Consistent with pharmacological inhibition, knockdown of cPLA2 significantly impaired the growth of PIK3CA MUT tumor xenografts under fat-free diet conditions (Figure 7B), and this therapeutic effect was completely reversed when animals were fed the AA-enriched ‘‘Western’’ diet (Figures 7C and 7D). Although there was a trend toward a reduction in overall tumor weights under a balanced diet, this did not reach statistical significance (Figures S7A and S7B). In
(H) Clonogenic assays of MCF10A PIK3CA WT and MUT cells expressing shGFP, cPLA2-sh1, or cPLA2-sh5 under FAF conditions, supplemented with or without
25 μM AA. Data in (A)–(H) are presented as the mean ± SEM of n = 3–4 biological replicates and are representative of at least two independent experiments. Data
in (D) are presented as the mean viability of three PIK3CA MUT (MCF-7, CAL-51, MDAMB453) and WT (MDAMB134, Hs578T, AU565) measured in triplicate wells.
n.s., not significant; *p < 0.05; **p < 0.01; ***p < 0.001; p values in (A)–(D) and (G) were calculated using two-way ANOVA. For (E, right), (F, right), and (H, right), one-
way ANOVA followed by unpaired, two-tailed Student’s t test with Bonferroni correction was applied.

accordance with our model, PIK3CA WT tumors were unaffected by these treatments (Figures 7E–7G, S7C, and S7D). Further corroborating our findings, genetic inhibition of cPLA2 only reduced viable tumor area in animals with PIK3CA MUT tumors when those were fed a fat-free diet, while no anti-neoplastic benefit was conferred under a ‘‘Western’’ diet (Figures 7H–7J). REIMS profiling of excised tumors revealed that AA levels were significantly altered in concordance with targeting cPLA2 and dietary fat intake (Figures 7K and S7E). Interestingly, a more substantial AA reduction was observed in PIK3CA MUT xenograft tumors following cPLA2-knockdown and administration of fat-free diet (40%–50% decrease), as compared to WT tumors (20% decrease) (Figures 7K, 7L, S7E, and S7F). In line with our previous in vivo study (Figure 6), these findings demonstrate that the modulation of dietary fat content, and supplementation of omega-6 FAs either in a balanced ratio with omega-3 FAs, or to a much larger extent in the ‘‘Western’’ diet, completely abolishes the therapeutic benefit of cPLA2 inhibition in PIK3CA MUT tumors.
In addition to promoting growth and proliferation through both autocrine and paracrine mechanisms, AA and its downstream metabolites have also been implicated in the metabolic remodeling of the tumor microenvironment. One of their major consequences is inhibition of the anti-cancer immune responses, ultimately leading to immune evasion and tumor progression (Böttcher et al., 2018; Zelenay et al., 2015). Although the BALB/c nude mice used in this study lack adaptive immunity in the form of T cells, these animals mount robust innate immune responses predominantly mediated by natural killer (NK) cells (Lee et al., 2015; Okada et al., 2019). We therefore sought to investigate how the various therapeutic and dietary regimes that were used in this study impact NK cell responses. To do this, the levels of type I interferon-induced chemokines (CCL5 and CX3CL1) and expression of a major NK cell-activating receptor (NKp46) were measured in tumors from different treatment groups.
Coinciding with the largest reduction in AA levels (Figure 6K), a marked increase in CCL5 and CX3CL1 was only observed in the BR1282 (PIK3CA MUT) PDX tumor following co-administration of ASB14780 and fat-free diet (Figures S7G and S7I). Similar results were obtained in the CAL51 (PIK3CA MUT)-derived xenograft tumor model, where the increase in chemokine levels was rescued when mice were fed the AA-enriched ‘‘Western’’ diet (Figures S7H and S7J).
In agreement with the chemokine analysis, tumor infiltration of NK cells was significantly increased in PIK3CA MUT tumors by dietary and therapeutic interventions, with dual inhibition (either with ASB14780 or shRNA) of cPLA2 and fat-free diet leading to the largest increase in NKp46 staining (Figures S7K, S7L, S7N, and S7O). It is also noteworthy that PIK3CA WT PDX and cell line-derived xenograft tumors contained relatively higher baseline levels of CCL5, CX3CL1, and NKp46 expression, as compared to MUT tumors, and these were not significantly altered by cPLA2 inhibition and/or changes in the diet (Figures S7G–S7J, S7M, and S7P). Considered together, these data suggest that oncogenic PIK3CA might suppress BC immunogenicity, at least in part, through regulation of AA metabolism, and this can be reversed through co-administration of cPLA2 inhibition and dietary fat restriction.

DISCUSSION

Unraveling the interplay between genotype and metabolic phenotype, as well as their complex interactions with nutrient availability, unequivocally plays a major role in understanding disease pathogenesis and identifying novel therapeutic interventions. Here, we demonstrate that the iKnife/REIMS enables close to real-time prediction of clinically relevant tumor features based on their metabolic fingerprints, offering a novel repertoire for cancer diagnosis and therapy decision-making. Among these is oncogenic PIK3CA, which triggers almost ‘‘the perfect storm’’ of signaling events, culminating in the overproduction of AA and downstream eicosanoids via the activation of cPLA2.
We have evidence of the central role of AA and eicosanoids in a wide range of disorders including cancer, obesity, diabetes, asthma, and autoimmune disorders (Dennis and Norris, 2015; Sonnweber et al., 2018; Wang and Dubois, 2010). However, the signal transduction pathways behind their activation, as well as the molecular cues that link these bio-active lipids with growth factor-independent cell proliferation have remained largely obscure. Our results demonstrate that mTORC2 downstream of oncogenic PIK3CA acts as a pivotal signaling hub for driving enhanced AA metabolism to sustain cell proliferation beyond a cell autonomous manner. This is particularly interesting in light of evidence obtained from in situ single-cell analysis of primary breast tumors, showing often that only a small fraction of cancer cells within a tumor carry PIK3CA mutations, while
Figure 6. Oncogenic PIK3CA Serves as a Defining Biomarker for Sensitivity of Pre-clinical Models to cPLA2 Inhibition
(A) Immunoblot analysis confirming inducible knockdown of cPLA2 following induction with 2 μg/mL doxycycline.
(B) 3D acini formation of MCF10A PIK3CA WT and MUT cells following doxycycline-induced cPLA2-sh1 or shGFP expression. Cells were stained for Ki-67 (pink,
Alexa Fluor 546), F-actin (red, Phalloidin 633), and DAPI (blue).
(C) Quantification of Ki-67 staining from treatments in (B).
(D) Schematic of in vivo experimental design and tumor profiling with REIMS.
(E and F) Tumor weights of (E) PIK3CA WT (BR1458) and (F) C420R MUT (BR1282) breast PDX tumors treated with 100 mg/kg of the cPLA2α pharmacological
inhibitor ASB14780 under FAF diet (n = 8 mice for the BR1458 model, and n = 7 mice for the BR1282 for both the vehicle- and ASB14780-treated groups).
(G) Representative images of H&E staining from resected tumors in (E) and (F). The black masks in (G) represent viable tumor area, while unshaded regions
correspond to necrotic tissue.
(H and I) Quantification of viable tumor area from (H) PIK3CA WT (BR1458) and (I) PIK3CA MUT (BR1282) tumor sections based on the analysis depicted in (G).
(J and K) AA levels measured by REIMS in (J) PIK3CA WT (BR1458) and (K) MUT (BR1282) tumors excised and snap frozen 2 h after the final dosing. Error bars in
(C), (J), and (K) represent mean ± SEM, with data in (J) and (K) corresponding to tumor REIMS measurements from n = 7–8 mice. n.s., not significant; *p < 0.05;
**p < 0.01; ***p < 0.001; p values in (E), (F), and (H)–(K) were calculated using one-way ANOVA followed by unpaired, two-tailed Student’s t test with Bonferroni
correction.

many of the neighboring cancer and stromal cells are WT (Janiszewska et al., 2015). Given that AA has been shown to induce both PI3K (Hughes-Fulford et al., 2006) and MAPK (Alexander et al., 2006) signaling, PIK3CA MUT cells could trigger a snowball effect, through overproduction of AA, affecting not only their own signaling and proliferation, but also that of their adjacent PIK3CA WT cells. Moreover, in light of the role of prostaglandins in lymphangiogenesis (Lala et al., 2018), the paracrine effects of AA could be of further relevance to the activity of PI3Kα in endothelial cells (Okkenhaug et al., 2016; Wang and Dubois, 2010).
Eicosanoids no longer represent the missing link between inflammation and cancer (Greene et al., 2011). Elevated tumor-derived PGE2 contributes to immune evasion by preventing the interferon gamma (IFNγ)-dependent upregulation of ICAM-1 that is pertinent for complete CD8(+) T cell activation (Basingab et al., 2016). In addition, autocrine PGE2 impairs NK cell viability and chemokine production and leads to downregulation of the chemokine receptors of cDC1 that promote their recruitment into tumors (Böttcher et al., 2018). Importantly, there is a strong positive correlation between the gene signature of cDC1 and NK cells and better overall survival in melanoma and breast cancers (Böttcher et al., 2018), suggesting that monitoring the immunomodulatory functions of prostaglandins via PI3K/Akt pathway inhibition could have important clinical implications. Indeed, we have shown that modulation of AA levels in PIK3CA MUT tumors through cPLA2 inhibition in combination with dietary fat restriction increases intra-tumor infiltration of NK cells and their associated chemokines, while this can be reversed by the ‘‘Western’’ diet, which contains an excess of omega6-FAs. NK cell markers were largely unaffected in PIK3CA WT tumors, and this could reflect their lower intra-tumor AA levels that are likely attributable to reduced cPLA2 activity and FA uptake, as compared to PIK3CA MUT cells. In light of recent evidence showing that blocking PI3K signaling with the pan-PI3K inhibitor BKM120 increases tumor-immune infiltrate and renders PIK3CA MUT mouse bladder tumors more susceptible to PD-1 blockade (Borcoman et al., 2019), our model raises important considerations for how immunotherapies may be successfully applied to oncogenic PIK3CA MUT tumors that may be inherently less immunogenic, at least in part, due to enhanced AA production.
Another way to relieve the immunosuppressive effects of tumor cells is by inhibiting COX activity via the use of non-steroidal anti-inflammatory drugs (NSAIDs), such as aspirin. Notably, oncogenic PIK3CA has been shown to sensitize cancer cells to aspirin (Henry et al., 2017; Liao et al., 2012) and, based on our findings, it is tempting to speculate that this connection could be true, in part because of the heightened capacity of PIK3CA MUT cells for high AA production. However, further studies are needed to ascertain this connection, because evidence suggests that the growth inhibitory effect of aspirin in PIK3CA MUT cells is likely to be COX-2-independent (Henry et al., 2017).
Although PI3K pathway inhibitors have shown some efficacy in treating advanced solid tumors, the majority has been associated with only partial tumor remission and they are often accompanied by severe side effects (Fruman et al., 2017; Li et al., 2018). Recent evidence suggests that one way to enhance their efficacy is by suppressing their insulin feedback through adoption of a ketogenic diet (Hopkins et al., 2018). Indeed, diet could play a much more significant role in therapy response than previously anticipated. Our data suggest that a diet rich in FAs limits the efficacy of the cPLA2 inhibitor, as PIK3CA MUT tumors likely depend on their high flux of extracellular FA intake to compensate for the loss of AA. This observation raises the possibility that adopting a diet without meat and dairy products (major sources of AA) could dramatically improve the sensitivity of the cPLA2 inhibitor and help restore tumor immunogenicity, suggesting a novel path for future clinical trials where nutrition will play a major role in disease management and treatment.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

● KEY RESOURCES TABLE
● RESOURCE AVAILABILITY
  ○ Lead Contact
  ○ Materials Availability
  ○ Data and Code Availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Human samples
  ○ Animals
  ○ Cell culture
● METHOD DETAILS
  ○ Experimental design
  ○ Mass spectrometry analysis
Figure 7. Dietary Supplementation of Arachidonic Acid Reverses the Sensitivity of PIK3CA Mutant Tumors to cPLA2 Inhibition
(A) Schematic of in vivo experimental design and profiling of breast cancer cell line xenografts with REIMS.
(B and C) Relative tumor growth of CAL-51 (PIK3CA MUT)-derived xenografts stably expressing control shGFP or two independent shRNAs targeting cPLA2
(cPLA2-sh1 and cPLA2-sh5) under (B) fat-free or (C) ‘‘Western’’ diets.
(D) Weights of tumors excised at the end of the experiments (B) and (C).
(E and F) Relative tumor growth of Hs578T (PIK3CA WT)-derived xenografts stably expressing control shGFP or two independent shRNAs targeting cPLA2 under
(E) fat-free or (F) ‘‘Western’’ diets.
(G) Weights of tumors excised at the end of the experiments (E) and (F).
(H) Representative images of H&E staining from resected tumors in (D) and (G). The black masks in (H) represent viable tumor area, while unshaded regions
correspond to necrotic tissue.
(I and J) Quantification of viable tumor area from (I) PIK3CA MUT (CAL-51) and (J) PIK3CA WT (Hs578T) tumor sections based on the analysis depicted in (H).
(K and L) AA levels measured by REIMS in (K) PIK3CA MUT (CAL-51) and (L) PIK3CA WT (Hs578T) snap frozen excised tumors. AA intensities are reported as
scaled values to the appropriate shGFP-fat-free diet condition. Data in (B), (C), (E), (F), (K), and (L) represent the mean ± SEM of relative tumor growth or tumor
REIMS measurements from n = 3–5 mice. n.s., not significant; *p < 0.05; **p < 0.01; ***p < 0.001; p values in (B), (C), (E), and (F) were calculated using two-way
ANOVA, and one-way ANOVA followed by unpaired, two-tailed Student’s t test with Bonferroni correction was used in (D), (G), and (I)–(L).
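The scaling described in (K) and (L), where AA intensities are reported relative to the shGFP fat-free diet condition, can be sketched as follows. This is an illustrative assumption about the arithmetic only, not the authors' pipeline: the function names, group labels, and intensity values are hypothetical.

```python
def scale_to_control(intensities, control_key):
    """Divide every AA intensity by the control-group intensity."""
    control = intensities[control_key]
    if control <= 0:
        raise ValueError("control intensity must be positive")
    return {group: value / control for group, value in intensities.items()}


def percent_decrease(scaled_value):
    """Percent reduction relative to a control scaled to 1.0."""
    return (1.0 - scaled_value) * 100.0


# Hypothetical mean intensities in arbitrary units (not data from the study).
example_intensities = {"shGFP_fat_free": 200.0, "cPLA2_sh1_fat_free": 110.0}
scaled = scale_to_control(example_intensities, "shGFP_fat_free")
for group, value in scaled.items():
    print(f"{group}: scaled = {value:.2f}, decrease = {percent_decrease(value):.0f}%")
```

Scaling each model to its own control puts the MUT and WT panels on the same relative axis, so percentage decreases can be compared between them even when the absolute intensities differ.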

  ○ Metabolomics data pre-processing and analysis
  ○ Transfections and site directed mutagenesis
  ○ Cell based assays
  ○ Three-dimensional cell culture
  ○ Confocal microscopy
  ○ Enzymatic assays
  ○ Immunoblot analysis
  ○ Immunoprecipitation analysis
  ○ Immunohistochemistry analysis
  ○ Proximity ligation assay
  ○ Fluorescent calcium assay
  ○ Quantitative RT-PCR and PIK3CA mutation analysis
  ○ In vitro kinase assay
  ○ Chemokine assays
  ○ Lipid extraction and eicosanoid profiling
  ○ Bioinformatic analysis
● QUANTIFICATION AND STATISTICAL ANALYSIS

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.053.
A video abstract is available at https://doi.org/10.1016/j.cell.2020.05.053#mmc8.

ACKNOWLEDGMENTS

We thank Naomi Guppy and Farzana Noor (Breast Cancer Now Histopathology, ICR, London, UK) and Elena Miranda and Adriana Resende Alves (Pathology Core Facility, University College London Cancer Institute) for support with immunohistochemistry, hematoxylin, and eosin analysis; and Champions Oncology (London, UK) for kindly providing breast, pancreatic, ovarian, sarcoma, and colorectal cancer PDX tumor specimens. We would also like to thank Edward St. John for enabling access to primary breast tumor samples, Verena M. Horneffer-van der Sluis for assistance with eicosanoid analysis, and the Biological Services Unit staff at the Institute of Cancer Research (Chelsea site) for their assistance with in vivo experiments. N.K. was supported by an ICR PhD studentship. The work described and the laboratory of G.P. was supported by the Institute of Cancer Research and a Cancer Research UK Grand Challenge award (C59824/A25044). Work in the Z.T. lab was supported by the European Research Council (MASSLIP Consolidator grant), Cancer Research UK Grand Challenge award (C59824/A25044), and the National Institute for Health Research (Imperial Biomedical Research Centre). A.V. was funded by the Ministry of Education, Culture and Sport under the Program for Promoting and Hiring of Talent and its Employability (Subprogram for Mobility ‘‘José Castillejo’’) of the Spanish Government and by Comunitat Autònoma de les Illes Balears, Direcció General d’Innovació i Recerca (AAEE003/2017) and Fons Europeu de Desenvolupament Regional de la Unió Europea (FEDER).

AUTHOR CONTRIBUTIONS

N.K., Z.T., and G.P. designed the study with contributions from J.K.N. and R.C.G. Z.T. and G.P. directed the project and supervised data analysis. N.K. performed and analyzed most experiments. E.K. performed 3-D acini, proximity ligation assays, calcium flux analyses, the CRISPR knockdown generation, the site-directed mutagenesis constructs, and assisted with xenograft studies. A.T. performed in vitro kinase assays, ELISAs, and assisted with xenograft studies and REIMS analysis of tumor samples. A.V. assisted with estrogen receptor signaling experiments, the generation of dox-inducible cPLA2-knockdown cell lines, and lipid extractions for LC-MS analysis. P.I. developed essential platforms for pre-processing and interpretation of REIMS data. N.J.S.P. assisted with xenograft studies. D.J.M. assisted with chemokine assays. S.A.V. and G.A.E. cultured and analyzed cell lines grown under different microenvironment conditions. A.L.T. carried out image analysis. M.L.D. assisted with LC-MS profiling of eicosanoids. A.v.W. and C.M.I. helped with animal experiments. R.F.S. provided essential technical support for sample processing with REIMS. N.K. and G.P. wrote the manuscript. All authors edited the manuscript.

N.K. and G.P. are inventors on a patent application covering new methods and compositions useful in the treatment of cancers with PIK3CA mutation (application number GB 2005874.9).

Received: August 5, 2019
Revised: March 7, 2020

REFERENCES

Alexander, L.D., Ding, Y., Alagarsamy, S., Cui, X.L., and Douglas, J.G. (2006). Arachidonic acid induces ERK activation via Src SH2 domain association with the epidermal growth factor receptor. Kidney Int. 69, 1823–1832.
Alexander, J., Gildea, L., Balog, J., Speller, A., McKenzie, J., Muirhead, L., Scott, A., Kontovounisios, C., Rasheed, S., Teare, J., et al. (2017). A novel methodology for in vivo endoscopic phenotyping of colorectal cancer based on real-time analysis of the mucosal lipidome: a prospective observational study of the iKnife. Surg. Endosc. 31, 1361–1370.
Ambs, P., Baccarini, M., Fitzke, E., and Dieter, P. (1995). Role of cytosolic phospholipase A2 in arachidonic acid release of rat-liver macrophages: regulation by Ca2+ and phosphorylation. Biochem. J. 311, 189–195.
Balog, J., Sasi-Szabó, L., Kinross, J., Lewis, M.R., Muirhead, L.J., Veselkov, K., Mirnezami, R., Dezső, B., Damjanovich, L., Darzi, A., et al. (2013). Intraoperative tissue identification using rapid evaporative ionization mass spectrometry. Sci. Transl. Med. 5, 194ra93.
Basingab, F.S., Ahmadi, M., and Morgan, D.J. (2016). IFNγ-Dependent Interactions between ICAM-1 and LFA-1 Counteract Prostaglandin E2-Mediated Inhibition of Antitumor CTL Responses. Cancer Immunol. Res. 4, 400–411.
Bligh, E.G., and Dyer, W.J. (1959). A rapid method of total lipid extraction and purification. Can. J. Biochem. Physiol. 37, 911–917.
Borcoman, E., De La Rochere, P., Richer, W., Vacher, S., Chemlali, W., Krucker, C., Sirab, N., Radvanyi, F., Allory, Y., Pignot, G., et al. (2019). Inhibition of PI3K pathway increases immune infiltrate in muscle-invasive bladder cancer. OncoImmunology 8, e1581556.
Böttcher, J.P., Bonavita, E., Chakravarty, P., Blees, H., Cabeza-Cabrerizo, M., Sammicheli, S., Rogers, N.C., Sahai, E., Zelenay, S., and Reis E Sousa, C. (2018). NK Cells Stimulate Recruitment of cDC1 into the Tumor Microenvironment Promoting Cancer Immune Control. Cell 172, 1022–1037.e14.
Burke, J.E., and Dennis, E.A. (2009). Phospholipase A2 structure/function, mechanism, and signaling. J. Lipid Res. 50 (Suppl), S237–S242.
Carracedo, A., and Pandolfi, P.P. (2008). The PTEN-PI3K pathway: of feedbacks and cross-talks. Oncogene 27, 5527–5541.
Chambers, M.C., Maclean, B., Burke, R., Amodei, D., Ruderman, D.L., Neumann, S., Gatto, L., Fischer, B., Pratt, B., Egertson, J., et al. (2012). A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotechnol. 30, 918–920.
Chou, M.M., Hou, W., Johnson, J., Graham, L.K., Lee, M.H., Chen, C.S., Newton, A.C., Schaffhausen, B.S., and Toker, A. (1998). Regulation of protein kinase C zeta by PI 3-kinase and PDK-1. Curr. Biol. 8, 1069–1077.
Clark, J.D., Schievella, A.R., Nalefski, E.A., and Lin, L.L. (1995). Cytosolic phospholipase A2. J. Lipid Mediat. Cell Signal. 12, 83–117.
Dan, S., Okamura, M., Seki, M., Yamazaki, K., Sugita, H., Okui, M., Mukai, Y., Nishimura, H., Asaka, R., Nomura, K., et al. (2010). Correlating phosphatidylinositol 3-kinase inhibitor efficacy with signaling pathway status: in silico and biological evaluations. Cancer Res. 70, 4982–4994.
Cell 181, 1596–1611, June 25, 2020 1609

Debnath, J., Muthuswamy, S.K., and Brugge, J.S. (2003). Morphogenesis and oncogenesis of MCF-10A mammary epithelial acini grown in three-dimensional basement membrane cultures. Methods 30, 256–268.
Dennis, E.A., and Norris, P.C. (2015). Eicosanoid storm in infection and inflammation. Nat. Rev. Immunol. 15, 511–523.
Dibble, C.C., and Manning, B.D. (2013). Signal integration by mTORC1 coordinates nutrient input with biosynthetic output. Nat. Cell Biol. 15, 555–564.
Düvel, K., Yecies, J.L., Menon, S., Raman, P., Lipovsky, A.I., Souza, A.L., Triantafellow, E., Ma, Q., Gorski, R., Cleaver, S., et al. (2010). Activation of a metabolic gene regulatory network downstream of mTOR complex 1. Mol. Cell 39, 171–183.
Fruman, D.A., Chiu, H., Hopkins, B.D., Bagrodia, S., Cantley, L.C., and Abraham, R.T. (2017). The PI3K Pathway in Human Disease. Cell 170, 605–635.
Gavaghan, C.L., Holmes, E., Lenz, E., Wilson, I.D., and Nicholson, J.K. (2000). An NMR-based metabonomic approach to investigate the biochemical consequences of genetic strain differences: application to the C57BL10J and Alpk:ApfCD mouse. FEBS Lett. 484, 169–174.
Gibb, S., and Strimmer, K. (2012). MALDIquant: a versatile R package for the analysis of mass spectrometry data. Bioinformatics 28, 2270–2271.
Greene, E.R., Huang, S., Serhan, C.N., and Panigrahy, D. (2011). Regulation of inflammation in cancer by eicosanoids. Prostaglandins Other Lipid Mediat. 96, 27–36.
Griffiths, B., Lewis, C.A., Bensaad, K., Ros, S., Zhang, Q., Ferber, E.C., Konisti, S., Peck, B., Miess, H., East, P., et al. (2013). Sterol regulatory element binding protein-dependent regulation of lipid synthesis supports cell survival and tumor growth. Cancer Metab. 1, 3.
Guri, Y., Colombi, M., Dazert, E., Hindupur, S.K., Roszik, J., Moes, S., Jenoe, P., Heim, M.H., Riezman, I., Riezman, H., and Hall, M.N. (2017). mTORC2 Promotes Tumorigenesis via Lipid Synthesis. Cancer Cell 32, 807–823.e12.
Henry, W.S., Laszewski, T., Tsang, T., Beca, F., Beck, A.H., McAllister, S.S., and Toker, A. (2017). Aspirin Suppresses Growth in PI3K-Mutant Breast Cancer by Activating AMPK and Inhibiting mTORC1 Signaling. Cancer Res. 77, 790–801.
Hilvo, M., Denkert, C., Lehtinen, L., Müller, B., Brockmöller, S., Seppänen-Laakso, T., Budczies, J., Bucher, E., Yetukuri, L., Castillo, S., et al. (2011). Novel theranostic opportunities offered by characterization of altered membrane lipid metabolism in breast cancer progression. Cancer Res. 71, 3236–3245.
Holmes, E., Wilson, I.D., and Nicholson, J.K. (2008). Metabolic phenotyping in health and disease. Cell 134, 714–717.
Hopkins, B.D., Pauli, C., Du, X., Wang, D.G., Li, X., Wu, D., Amadiume, S.C., Goncalves, M.D., Hodakoski, C., Lundquist, M.R., et al. (2018). Suppression of insulin feedback enhances the efficacy of PI3K inhibitors. Nature 560, 499–503.
Huang, W., Sherman, B.T., and Lempicki, R.A. (2009). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57.
Hughes-Fulford, M., Li, C.F., Boonyaratanakornkit, J., and Sayyah, S. (2006). Arachidonic acid activates phosphatidylinositol 3-kinase signaling and induces gene expression in prostate cancer. Cancer Res. 66, 1427–1433.
Janiszewska, M., Liu, L., Almendro, V., Kuang, Y., Paweletz, C., Sakr, R.A., Weigelt, B., Hanker, A.B., Chandarlapaty, S., King, T.A., et al. (2015). In situ single-cell analysis identifies heterogeneity for PIK3CA mutation and HER2 amplification in HER2-positive breast cancer. Nat. Genet. 47, 1212–1219.
Kanai, S., Ishihara, K., Kawashita, E., Tomoo, T., Nagahira, K., Hayashi, Y., and Akiba, S. (2016). ASB14780, an Orally Active Inhibitor of Group IVA Phospholipase A2, Is a Pharmacotherapeutic Candidate for Nonalcoholic Fatty Liver Disease. J. Pharmacol. Exp. Ther. 356, 604–614.
Kramer, R.M., Roberts, E.F., Um, S.L., Börsch-Haubold, A.G., Watson, S.P., Fisher, M.J., and Jakubowski, J.A. (1996). p38 Mitogen-activated protein kinase phosphorylates cytosolic phospholipase A2 (cPLA2) in thrombin-stimulated platelets. Evidence that proline-directed phosphorylation is not required for mobilization of arachidonic acid by cPLA2. J. Biol. Chem. 271, 27723–27729.
Lala, P.K., Nandi, P., and Majumder, M. (2018). Roles of prostaglandins in tumor-associated lymphangiogenesis with special reference to breast cancer. Cancer Metastasis Rev. 37, 369–384.
Lee, K.L., Foley, M.A., Chen, L., Behnke, M.L., Lovering, F.E., Kirincich, S.J., Wang, W., Shim, J., Tam, S., Shen, M.W., et al. (2007). Discovery of Ecopladib, an indole inhibitor of cytosolic phospholipase A2alpha. J. Med. Chem. 50, 1380–1400.
Lee, S.J., Kang, W.Y., Yoon, Y., Jin, J.Y., Song, H.J., Her, J.H., Kang, S.M., Hwang, Y.K., Kang, K.J., Joo, K.M., and Nam, D.H. (2015). Natural killer (NK) cells inhibit systemic metastasis of glioblastoma cells and have therapeutic effects against glioblastomas in the brain. BMC Cancer 15, 1011.
Lee, G., Zheng, Y., Cho, S., Jang, C., England, C., Dempsey, J.M., Yu, Y., Liu, X., He, L., Cavaliere, P.M., et al. (2017). Post-transcriptional Regulation of De Novo Lipogenesis by mTORC1-S6K1-SRPK2 Signaling. Cell 171, 1545–1558.
Li, X., and Gao, T. (2014). mTORC2 phosphorylates protein kinase Cζ to regulate its stability and activity. EMBO Rep. 15, 191–198.
Li, X., Dai, D., Chen, B., Tang, H., Xie, X., and Wei, W. (2018). Efficacy of PI3K/AKT/mTOR pathway inhibitors for the treatment of advanced solid cancers: A literature-based meta-analysis of 46 randomised control trials. PLoS ONE 13, e0192464.
Liao, X., Lochhead, P., Nishihara, R., Morikawa, T., Kuchiba, A., Yamauchi, M., Imamura, Y., Qian, Z.R., Baba, Y., Shima, K., et al. (2012). Aspirin use, tumor PIK3CA mutation, and colorectal-cancer survival. N. Engl. J. Med. 367, 1596–1606.
Lien, E.C., Lyssiotis, C.A., and Cantley, L.C. (2016). Metabolic Reprogramming by the PI3K-Akt-mTOR Pathway in Cancer. Recent Results Cancer Res. 207, 39–72.
Lin, L.L., Wartmann, M., Lin, A.Y., Knopf, J.L., Seth, A., and Davis, R.J. (1993). cPLA2 is phosphorylated and activated by MAP kinase. Cell 72, 269–278.
McKew, J.C., Lee, K.L., Shen, M.W., Thakker, P., Foley, M.A., Behnke, M.L., Hu, B., Sum, F.W., Tam, S., Hu, Y., et al. (2008). Indole cytosolic phospholipase A2 alpha inhibitors: discovery and in vitro and in vivo characterization of 4-{3-[5-chloro-2-(2-{[(3,4-dichlorobenzyl)sulfonyl]amino}ethyl)-1-(diphenylmethyl)-1H-indol-3-yl]propyl}benzoic acid, efipladib. J. Med. Chem. 51, 3388–3413.
Nicholson, J.K., Connelly, J., Lindon, J.C., and Holmes, E. (2002). Metabonomics: a platform for studying drug toxicity and gene function. Nat. Rev. Drug Discov. 1, 153–161.
Nicholson, J.K., Holmes, E., Kinross, J.M., Darzi, A.W., Takats, Z., and Lindon, J.C. (2012). Metabolic phenotyping in clinical and surgical environments. Nature 491, 384–392.
Okada, S., Vaeteewoottacharn, K., and Kariya, R. (2019). Application of Highly Immunocompromised Mice for the Establishment of Patient-Derived Xenograft (PDX) Models. Cells 8, 889.
Okkenhaug, K., Graupera, M., and Vanhaesebroeck, B. (2016). Targeting PI3K in Cancer: Impact on Tumor Cells, Their Protective Stroma, Angiogenesis, and Immunotherapy. Cancer Discov. 6, 1090–1105.
Otsu, N. (1979). A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man Cybern. 9, 62–66.
Patterson, E., Wall, R., Fitzgerald, G.F., Ross, R.P., and Stanton, C. (2012). Health implications of high dietary omega-6 polyunsaturated Fatty acids. J. Nutr. Metab. 2012, 539426.
Rameh, L.E., Rhee, S.G., Spokes, K., Kazlauskas, A., Cantley, L.C., and Cantley, L.G. (1998). Phosphoinositide 3-kinase regulates phospholipase Cgamma-mediated calcium signaling. J. Biol. Chem. 273, 23750–23757.
Ricoult, S.J.H., Yecies, J.L., Ben-Sahra, I., and Manning, B.D. (2016). Oncogenic PI3K and K-Ras stimulate de novo lipid synthesis through mTORC1 and SREBP. Oncogene 35, 1250–1260.
Sanjana, N.E., Shalem, O., and Zhang, F. (2014). Improved vectors and genome-wide libraries for CRISPR screening. Nat. Methods 11, 783–784.
Sarbassov, D.D., Ali, S.M., Sengupta, S., Sheen, J.H., Hsu, P.P., Bagley, A.F., Markhard, A.L., and Sabatini, D.M. (2006). Prolonged rapamycin treatment inhibits mTORC2 assembly and Akt/PKB. Mol. Cell 22, 159–168.
Saxton, R.A., and Sabatini, D.M. (2017). mTOR Signaling in Growth, Metabolism, and Disease. Cell 168, 960–976.
Simopoulos, A.P. (2008). The importance of the omega-6/omega-3 fatty acid ratio in cardiovascular disease and other chronic diseases. Exp. Biol. Med. (Maywood) 233, 674–688.
Sonnweber, T., Pizzini, A., Nairz, M., Weiss, G., and Tancevski, I. (2018). Arachidonic Acid Metabolites in Cardiovascular and Metabolic Diseases. Int. J. Mol. Sci. 19, 3285.
St John, E.R., Balog, J., McKenzie, J.S., Rossi, M., Covington, A., Muirhead, L., Bodai, Z., Rosini, F., Speller, A., Shousha, S., et al. (2017). Rapid evaporative ionisation mass spectrometry of electrosurgical vapours for the identification of breast pathology: towards an intelligent knife for breast cancer surgery. Breast Cancer Res. 19.
Tarrado-Castellarnau, M., de Atauri, P., and Cascante, M. (2016). Oncogenic regulation of tumor metabolic reprogramming. Oncotarget 7, 62726–62753.
Tomoo, T., Nakatsuka, T., Katayama, T., Hayashi, Y., Fujieda, Y., Terakawa, M., and Nagahira, K. (2014). Design, synthesis, and biological evaluation of 3-(1-Aryl-1H-indol-5-yl)propanoic acids as new indole-based cytosolic phospholipase A2α inhibitors. J. Med. Chem. 57, 7244–7262.
Vander Heiden, M.G., and DeBerardinis, R.J. (2017). Understanding the Intersections between Metabolism and Cancer Biology. Cell 168, 657–669.
Wang, D., and Dubois, R.N. (2010). Eicosanoids and cancer. Nat. Rev. Cancer 10, 181–193.
Wiederschain, D., Wee, S., Chen, L., Loo, A., Yang, G., Huang, A., Chen, Y., Caponigro, G., Yao, Y.M., Lengauer, C., et al. (2009). Single-vector inducible lentiviral RNAi system for oncology target validation. Cell Cycle 8, 498–504.
Wolfer, A.M., Gaudin, M., Taylor-Robinson, S.D., Holmes, E., and Nicholson, J.K. (2015). Development and Validation of a High-Throughput Ultrahigh-Performance Liquid Chromatography-Mass Spectrometry Approach for Screening of Oxylipins and Their Precursors. Anal. Chem. 87, 11721–11731.
Yu, G., and He, Q.-Y. (2016). ReactomePA: an R/Bioconductor package for reactome pathway analysis and visualization. Mol. Biosyst. 12, 477–479.
Zelenay, S., van der Veen, A.G., Böttcher, J.P., Snelgrove, K.J., Rogers, N., Acton, S.E., Chakravarty, P., Girotti, M.R., Marais, R., Quezada, S.A., et al. (2015). Cyclooxygenase-Dependent Tumor Growth through Evasion of Immunity. Cell 162, 1257–1270.
STAR★METHODS
KEY RESOURCES TABLE
REAGENT or RESOURCE SOURCE IDENTIFIER

Antibodies
Rabbit polyclonal anti-beta-actin Cell Signaling Technology Cat# 4967; RRID: AB_10695744
Rabbit polyclonal anti-Akt antibody Cell Signaling Technology Cat# 9272; RRID: AB_329827
Rabbit monoclonal anti-phospho-Akt (Ser473) Cell Signaling Technology Cat# 4060; RRID: AB_2315049
Rabbit polyclonal anti-PRAS40 Cell Signaling Technology Cat# 2610; RRID: AB_916206
Rabbit monoclonal anti-phospho-PRAS40 (Thr246) Cell Signaling Technology Cat# 2997; RRID: AB_2258110
Mouse monoclonal anti-S6 Ribosomal Protein Cell Signaling Technology Cat# 2317; RRID: AB_2238583
Rabbit polyclonal anti-phospho-S6 Ribosomal Protein (Ser235/236) Cell Signaling Technology Cat# 2211; RRID: AB_331679
Rabbit polyclonal anti-NDRG1 Cell Signaling Technology Cat# 5196; RRID: AB_10626626
Rabbit monoclonal anti-phospho-NDRG1 (Thr376) Cell Signaling Technology Cat# 5482; RRID: AB_10693451
Rabbit polyclonal anti-Raptor Cell Signaling Technology Cat# 2280; RRID: AB_561245
Rabbit polyclonal anti-Rictor Cell Signaling Technology Cat# 2140; RRID: AB_2179961
Rabbit monoclonal anti-mTOR Cell Signaling Technology Cat# 2983; RRID: AB_2105622
Rabbit polyclonal anti-P70 S6 Kinase Cell Signaling Technology Cat# 9202; RRID: AB_331676
Rabbit polyclonal anti-phospho-p70 S6 Kinase (Thr389) Cell Signaling Technology Cat# 9205; RRID: AB_330944
Rabbit polyclonal anti-4E-BP1 (53H11) Cell Signaling Technology Cat# 9644; RRID: AB_2097841
Rabbit polyclonal anti-phospho-4E-BP1 (Ser65) Cell Signaling Technology Cat# 9451; RRID: AB_330947
Rabbit monoclonal anti-Lamin B1 (D4Q4Z) Cell Signaling Technology Cat# 12586; RRID: AB_2650517
Rabbit polyclonal anti-phospho-PKC (pan) (βII Ser660) Cell Signaling Technology Cat# 9371; RRID: AB_2168219
Rabbit polyclonal anti-PKCdelta Cell Signaling Technology Cat# 2058; RRID: AB_10694655
Rabbit polyclonal anti-phospho-PKCdelta/theta (Ser643/676) Cell Signaling Technology Cat# 9376; RRID: AB_2168834
Rabbit polyclonal anti-RKIP (G38) Cell Signaling Technology Cat# 5060; RRID: AB_1904081
Rabbit polyclonal anti-c-Raf Cell Signaling Technology Cat# 9422; RRID: AB_390808
Rabbit monoclonal anti-phospho-c-Raf (Ser338) Cell Signaling Technology Cat# 9427; RRID: AB_2067317
Rabbit polyclonal anti-MEK1/2 Cell Signaling Technology Cat# 9122; RRID: AB_823567
Rabbit polyclonal anti-phospho-MEK1/2 (Ser217/Ser221) Cell Signaling Technology Cat# 9121; RRID: AB_331648
Rabbit polyclonal anti-p44/42 MAPK (Erk1/2) Cell Signaling Technology Cat# 9102; RRID: AB_330744
Rabbit polyclonal anti-phospho-p44/42 MAPK (Thr202/Tyr204) Cell Signaling Technology Cat# 9101; RRID: AB_331646
Rabbit polyclonal anti-p38 MAPK Cell Signaling Technology Cat# 9212; RRID: AB_330713
Rabbit polyclonal anti-phospho-p38 MAPK (Thr180/Tyr182) Cell Signaling Technology Cat# 9211; RRID: AB_331641
Rabbit polyclonal anti-cPLA2 Cell Signaling Technology Cat# 2832; RRID: AB_2164442
Rabbit polyclonal anti-phospho-cPLA2 (Ser505) Cell Signaling Technology Cat# 2831; RRID: AB_2164445
Rabbit monoclonal anti-HA-Tag Cell Signaling Technology Cat# 3724; RRID: AB_1549585
Rabbit monoclonal anti-GAPDH Cell Signaling Technology Cat# 2118; RRID: AB_561053
Rabbit polyclonal anti-PLCgamma1 Cell Signaling Technology Cat# 2822; RRID: AB_2163702
Rabbit polyclonal anti-phospho-PLCgamma1 (Tyr783) Cell Signaling Technology Cat# 2821; RRID: AB_330855
Rabbit polyclonal anti-phospho-threonine Cell Signaling Technology Cat# 9381; RRID: AB_330301
Rabbit polyclonal anti-IkBalpha Cell Signaling Technology Cat# 9242; RRID: AB_331623
Rabbit monoclonal anti-phospho-IkBalpha (Ser32) Cell Signaling Technology Cat# 2859; RRID: AB_561111
Rabbit monoclonal anti-Stat3 Cell Signaling Technology Cat# 4904; RRID: AB_331269
Rabbit polyclonal anti-phospho-Stat3 (Ser727) Cell Signaling Technology Cat# 9134; RRID: AB_331589
Rabbit monoclonal anti-PKD/PKCμ Cell Signaling Technology Cat# 90039; RRID: AB_2800149
Rabbit polyclonal anti-phospho-PKD/PKCμ (Ser744/748) Cell Signaling Technology Cat# 2054; RRID: AB_2172539
Mouse monoclonal anti-Rb (4H1) Cell Signaling Technology Cat# 9309; RRID: AB_823629
Rabbit monoclonal anti-estrogen inducible protein pS2 Abcam Cat# ab92377; RRID: AB_10562122
Rabbit polyclonal anti-PKCzeta Abcam Cat# ab59364; RRID: AB_944858
Rabbit monoclonal anti-phospho-PKCzeta (Thr560) Abcam Cat# ab62372; RRID: AB_946309
Rabbit polyclonal anti-PKCepsilon Abcam Cat# ab63638; RRID: AB_1142276
Rabbit polyclonal anti-phospho-PKCepsilon (Ser729) Abcam Cat# ab63387; RRID: AB_1142277
Rabbit monoclonal anti-secretory phospholipase A2 Abcam Cat# ab139692
Mouse monoclonal anti-PKCbeta II (F-7) Santa Cruz Biotechnology Cat# sc-13149; RRID: AB_628144
Mouse monoclonal anti-PKCzeta (B-7) Santa Cruz Biotechnology Cat# sc-393218
Mouse monoclonal anti-phospho-RKIP (Ser153) Santa Cruz Biotechnology Cat# sc-135779; RRID: AB_2163163
Normal rabbit IgG Santa Cruz Biotechnology Cat# sc-2027; RRID: AB_737197
Mouse monoclonal anti-phospho-Rb (Thr821/826) Santa Cruz Biotechnology Cat# sc-271930; RRID: AB_670923
Rabbit polyclonal anti-phospholipase A2 (iPLA2) Sigma-Aldrich Cat# SAB4200129; RRID: AB_11129638
Rabbit monoclonal anti-phospho-PKCzeta (Thr410) Thermo Fisher Scientific Cat# MA5-15060; RRID: AB_10983263
Mouse monoclonal anti-SREBP1 BD Biosciences Cat# 557036; RRID: AB_396559
Goat anti-rabbit IgG (H+L)-HRP conjugate Bio-Rad Cat# 170-6515; RRID: AB_11125142
Goat anti-mouse IgG (H+L)-HRP conjugate Bio-Rad Cat# 170-6516; RRID: AB_11125547
Mouse polyclonal anti-Nkp46/ncr1 R&D Systems Cat# AF2225; RRID: AB_355192
Rabbit monoclonal anti-Ki-67 (D3B5) Cell Signaling Technology Cat# 9129; RRID: AB_2687446
Goat anti-Mouse IgG (H+L) secondary antibody, Alexa Fluor 546 Thermo Fisher Scientific Cat# A-11003; RRID: AB_2534071
Phalloidin-iFluor 633 Abcam Cat# ab176758
Rabbit polyclonal anti-phospho-cPLA2 (Thr376) (Peptide name: PLA2G4A-369:383-pT376) This paper (produced by Thermo Fisher Scientific) Cat# UE1820P-T-AB1792

MAX Efficiency DH5α Competent Cells Thermo Fisher Scientific Cat# 18258012
Biological Samples
Primary human breast tissue Imperial College Tissue Bank Cat# IKB180
Breast PDX model CrownBioscience Cat# BR6695
Breast PDX model Champions Oncology Cat# CTG1059
Ovarian PDX model Champions Oncology Cat# CTG0253
Pancreatic PDX model Champions Oncology Cat# CTG0292
Sarcoma PDX model Champions Oncology Cat# CTG0886
Colorectal PDX model Champions Oncology Cat# CTG0083
Hydrocortisone Sigma-Aldrich Cat# H-0888
Insulin-transferrin-selenium GIBCO Cat# 41400-045
Epidermal growth factor (EGF) Peprotech Cat# AF-100-15
Cholera toxin Sigma-Aldrich Cat# C-8052
Insulin Sigma-Aldrich Cat# I-1882
Fatty acid free bovine serum albumin Sigma-Aldrich Cat# A8806
PKC zeta pseudosubstrate inhibitory peptide Sigma-Aldrich Cat# P1614
PKC beta II peptide inhibitor Sigma-Aldrich Cat# P-0102
FIPI hydrochloride hydrate Sigma-Aldrich Cat# F5808
4-hydroxytamoxifen Sigma-Aldrich Cat# T176
Bromoenol lactone Sigma-Aldrich Cat# B1552
Cycloheximide Sigma-Aldrich Cat# C7698
Doxycycline hyclate Sigma-Aldrich Cat# D9891
Arachidonic acid Sigma-Aldrich Cat# 10931
Palmitoleate Sigma-Aldrich Cat# P9417
Palmitate Sigma-Aldrich Cat# P0500
PKC alpha (C2-4) inhibitor peptide Santa Cruz Biotechnology Cat# sc-304
PKC epsilon inhibitor peptide Cambridge Bioscience Cat# CAY17476
Rapamycin Selleckchem Cat# S1039
Torin 1 Selleckchem Cat# S2827
BYL719 Selleckchem Cat# S2814
BKM120 Selleckchem Cat# 2247
MK2206 Selleckchem Cat# S1078
GSK690693 Selleckchem Cat# S1113
GSK650394 Tocris Bioscience Cat# 3572
U73122 Tocris Bioscience Cat# 1268
ASB14780 Axon Medchem Cat# 2578
DharmaFECT-1 transfection reagent Dharmacon Cat# T-2001-02
FuGENE HD Transfection reagent Promega Cat# E2311
Lipid Removal Adsorbent Supelco Cat# 13358
Matrigel Corning Cat# 354230
Paraformaldehyde Sigma-Aldrich Cat# 158127
Triton X-100 Sigma-Aldrich Cat# X100
DAPI (4’,6-Diamidino-2-Phenylindole, Dihydrochloride) Thermo Fisher Scientific Cat# D1306
RIPA buffer Thermo Fisher Scientific Cat# 89900
Leupeptin Sigma-Aldrich Cat# L2884
Pepstatin Sigma-Aldrich Cat# P5318
Na3VO4 Sigma-Aldrich Cat# 450243
DL-Dithiothreitol Sigma-Aldrich Cat# 646653
Calyculin A Cell Signaling Technology Cat# 9902
Beta-glycerophosphate Sigma-Aldrich Cat# G9422
PMSF protease inhibitor Cell Signaling Technology Cat# 8553
Bradford reagent Bio-Rad Cat# 5000006
ALLN protease inhibitor Merck-Millipore Cat# 208719
2x Laemmli sample buffer Bio-Rad Cat# 161-0737
4x Laemmli sample buffer Bio-Rad Cat# 161-0747
10X Cell lysis buffer Cell Signaling Technology Cat# 9803
Recombinant human protein kinase C zeta Insight Biotechnology Cat# TP302472
Recombinant human phospholipase A2, group IVA Insight Biotechnology Cat# TP320972
Magnesium chloride Sigma-Aldrich Cat# M8266
Bovine serum albumin Sigma-Aldrich Cat# A2153
Puromycin Invivogen Cat# ant-pr-1
Blasticidin Invivogen Cat# ant-bl-1
QuikChange Lightning Site-Directed Mutagenesis Kit Agilent Cat# 210518
CellTiter96 Aqueous Non-radioactive (MTS) cell proliferation assay Promega Cat# G5430
Fatty Acid Uptake Kit Sigma-Aldrich Cat# MAK156
Cytosolic Phospholipase A2 Assay Kit Abcam Cat# ab133090
Secretory Phospholipase A2 Assay Kit Abcam Cat# ab133089
Diacylglycerol (DAG) Assay Kit Cell Biolabs Inc Cat# MET-5028
Active Rac1 Detection Kit Cell Signaling Technology Cat# 8815
Duolink In Situ Detection Reagents Red Kit Sigma-Aldrich Cat# DUO92008
Minus and Plus PLA probes Sigma-Aldrich Cat# DUO92004 and DUO92002
Fluo-4 Direct Calcium Assay Kit Thermo Fisher Scientific Cat# F10471
RNeasy Plus Mini Kit QIAGEN Cat# 74134
QuantiTect Reverse Transcription Kit QIAGEN Cat# 205311
SYBR Select Master Mix Thermo Fisher Scientific Cat# 4472908
QIAamp DNA mini kit QIAGEN Cat# 51304
PNAClamp PIK3CA Mutation Detection Kit Panagene Cat# PNAC-4001
ADP-Glo Kinase Assay Promega Cat# V6930
Arachidonic Acid ELISA Kit Generon Cat# CEB098Ge
Prostaglandin E2 ELISA Kit Enzo Life Sciences Cat# ADI-900-001
Mouse RANTES (CCL5) ELISA Kit Abcam Cat# ab100739
Mouse Fractalkine (CX3CL1) ELISA Kit Abcam Cat# ab100683
Pierce BCA Protein Assay Kit Thermo Fisher Scientific Cat# 23225
Deposited Data
Custom script for quantification of proximity ligation assay images This manuscript GitHub: https://github.com/adamltyson/foci2D
Custom script for quantification of acini images This manuscript GitHub: https://github.com/adamltyson/cell-coloc-3D
Custom script for quantification of calcium flux This manuscript GitHub: https://github.com/adamltyson/CalciumAnalysis
REIMS data for Figure 1D This manuscript Mendeley Data; https://doi.org/10.17632/xcgc5kpntm.1
REIMS data for Figure 1E This manuscript Mendeley Data; https://doi.org/10.17632/xcgc5kpntm.1
REIMS data for Figure 1H This manuscript Mendeley Data; https://doi.org/10.17632/xcgc5kpntm.1
REIMS data for Figure 3B This manuscript Mendeley Data; https://doi.org/10.17632/xcgc5kpntm.1
REIMS data for Figure 3D This manuscript Mendeley Data; https://doi.org/10.17632/xcgc5kpntm.1
Significantly altered phospholipids across breast cancer cell lines of different receptor, or triple negative status This manuscript Table S1
Significantly different fatty acids between MCF10A PIK3CA wild-type and E545K/H1047R mutant cells This manuscript Table S4
Human PIK3CA (H1047R/+) MCF10A Horizon Discovery Cat# HD 101-011
Human PIK3CA (E545K/+) MCF10A Horizon Discovery Cat# HD 101-002
AU565 (human breast carcinoma) ATCC Cat# CRL-2351; RRID: CVCL_1074
BT20 (human breast carcinoma) ATCC Cat# HTB-19; RRID: CVCL_0178
CAL51 (human breast carcinoma) DSMZ Cat# ACC-302; RRID: CVCL_1110
CAMA1 (human breast carcinoma) ATCC Cat# HTB-21; RRID: CVCL_1115
EFM19 (human breast carcinoma) DSMZ Cat# ACC-231; RRID: CVCL_0253
Hs578T (human breast carcinoma) ATCC Cat# HTB-126; RRID: CVCL_0332
JIMT1 (human breast carcinoma) DSMZ Cat# ACC-589; RRID: CVCL_2077
KPL1 (human breast carcinoma) DSMZ Cat# ACC-317; RRID: CVCL_2094
MCF7 (human breast carcinoma) ATCC Cat# HTB-22; RRID: CVCL_0031
MDAMB134 (human breast carcinoma) ATCC Cat# HTB-23; RRID: CVCL_0617
MFM223 (human breast carcinoma) DSMZ Cat# ACC-422; RRID: CVCL_1408
S68 (human breast carcinoma) Breast Cancer Now (Institute of Cancer Research) RRID: CVCL_5585
SKBR3 (human breast carcinoma) ATCC Cat# HTB-30; RRID: CVCL_0033
T47D (human breast carcinoma) ATCC Cat# HTB-133; RRID: CVCL_0553
UACC812 (human breast carcinoma) ATCC Cat# CRL-1897; RRID: CVCL_1781
VP229 (human breast carcinoma) Breast Cancer Now (Institute of Cancer Research) RRID: CVCL_2754
HCC1143 (human breast carcinoma) ATCC Cat# CRL-2321; RRID: CVCL_1245
SUM52 (human breast carcinoma) Breast Cancer Now (Institute of Cancer Research) RRID: CVCL_3425
ZR751 (human breast carcinoma) ATCC Cat# CRL-1500; RRID: CVCL_0588
ZR7530 (human breast carcinoma) ATCC Cat# CRL-1504; RRID: CVCL_1661
SUM44 (human breast carcinoma) Breast Cancer Now (Institute of Cancer Research) RRID: CVCL_3424
HEK293T (human embryonic kidney) ATCC Cat# CRL-3216; RRID: CVCL_0063
Human MCF10A PIK3CA WT CRISPR control cell line This manuscript N/A
Human MCF10A PIK3CA WT cPLA2 CRISPR cell line This manuscript N/A
Human MCF10A PIK3CA H1047R (+/) CRISPR control cell line This manuscript N/A
Human MCF10A PIK3CA H1047R (+/) cPLA2 CRISPR cell line This manuscript N/A
Mouse: BALB/c nude (female, age: 6–8 weeks) Beijing Anikeeper Biotech (Beijing, China) N/A
Mouse: BALB/c nude (female, age: 7–9 weeks) Envigo N/A
Oligonucleotides
Primers for cPLA2 shRNA amplification (Forward 5′-GTGGAAAGGACGAAACACCGGT-3′, Reverse 5′-TTTGTCTCGAGGTCGAGAATTC-3′) Sigma-Aldrich N/A
Mutagenesis primers to generate cPLA2 S505A (Forward 5′-GCAAAGTCACTCAAAGGAGCCAGTGGATAAGATGTATTG-3′, Reverse 5′-CAATACATCTTATCCACTGGCTCCTTTGAGTGACTTTGC-3′) Sigma-Aldrich N/A
Mutagenesis primers to generate cPLA2 T376A (Forward 5′-TTCTTCATACTTCTTAACGACTGCTCCCATAAAAAATTTGCTTCCAA-3′, Reverse 5′-TTGGAAGCAAATTTTTTATGGGAGCAGTCGTTAAGAAGTATGAAGAA-3′) Sigma-Aldrich N/A
qPCR primers for human cPLA2 (PLA2G4A) (Forward 5′-GATGAAACTCTAGGGACAGCAAC-3′, Reverse 5′-CTGGGCATGAGCAAACTTCAA-3′) Sigma-Aldrich N/A
qPCR primers for human beta-actin (Forward 5′-GACCCAGATCATGTTTGAGACC-3′, Reverse 5′-CTTCATGAGGTAGTCAGTCAGG-3′) Sigma-Aldrich N/A
Recombinant DNA
pLKO-Tet-On Vector Wiederschain et al., 2009 Addgene ID: 21915
pCMV3-HA-PLA2G4A Sino Biological Cat# HG13126-NY
RICTOR ON-TARGETplus SMARTPool human siRNA Dharmacon Cat# L-016984-00-0005
RAPTOR ON-TARGETplus SMARTPool human siRNA Dharmacon Cat# L-004107-00-0005
FRAP1 ON-TARGETplus SMARTPool human siRNA Dharmacon Cat# L-003008-00-0005
PLCγ1 ON-TARGETplus SMARTPool human siRNA Dharmacon Cat# L-003559-00-0005
PRKCZ ON-TARGETplus SMARTPool human siRNA Dharmacon Cat# L-003526-00-0005
Non-targeting siRNA control Dharmacon Cat# D-001810-01-05
TRC Lentiviral eGFP shRNA positive control Dharmacon Cat# RHS4459
TRC Lentiviral Human PLA2G4A shRNA 1 Dharmacon Cat# TRCN0000050263
TRC Lentiviral Human PLA2G4A shRNA 5 Dharmacon Cat# TRCN0000050267
LentiCRISPR v2 Sanjana et al., 2014 Addgene ID: 52961
PLA2G4A sgRNA CRISPR/Cas9 All-in-One Lentivector Target 2 Applied Biological Materials Cat# K1659207
pCMV-HA-PLA2G4A-S505A This manuscript N/A
pCMV-HA-PLA2G4A-T376A This manuscript N/A
Inducible pLKO-Tet-On-TRC Lentiviral Human PLA2G4A shRNA 1 This manuscript N/A
R statistical software (version 3.5.1) The R Project https://www.r-project.org/
ProteoWizard MsConvert software (version 3.0.11781) Chambers et al., 2012 http://proteowizard.sourceforge.net/download.html
MALDIquant package Gibb and Strimmer, 2012 https://cran.r-project.org/web/packages/MALDIquant/index.html
ReactomePA package Yu and He, 2016 https://bioconductor.org/packages/release/bioc/html/ReactomePA.html
MATLAB (2014a, version 8.3.0.532) Mathworks https://www.mathworks.com/products/matlab.html
GraphPad Prism (version 8.0.1) GraphPad https://www.graphpad.com/scientific-software/prism/
Database for Annotation, Visualization and Integrated Discovery (DAVID, version 6.8) Huang et al., 2009 https://david.ncifcrf.gov/
TargetLynx software Waters Corporation https://www.waters.com/waters/home.htm
Image Lab Software (version 5.2.1) Bio-Rad https://www.bio-rad.com/en-uk/product/image-lab-software?ID=KRE6P5E8Z
ImageJ (version 1.51) NIH https://imagej.nih.gov/ij/download.html
Other
Protein G Sepharose beads Sigma-Aldrich Cat# P3296
Dulbecco’s Modified Eagle’s Medium (DMEM) GIBCO Cat# 41965-039
Roswell Park Memorial Institute (RPMI) 1640 GIBCO Cat# 10220-106
Ham’s F12 media Thermo Fisher Scientific Cat# 11765054
DMEM/F-12 GIBCO Cat# 31330-038
Horse Serum Thermo Fisher Scientific Cat# 16050-122
4–15% Criterion TGX Precast Midi Protein Gel Bio-Rad Cat# 5671084
4–15% Mini-PROTEAN TGX Precast Protein Gel Bio-Rad Cat# 4561083
Electrosurgical bipolar forceps Erbe Elektromedizin (Germany) N/A
ForceTriad electrosurgical unit Covidien (Ireland) N/A
Thermo Exactive orbitrap instrument Thermo Scientific N/A
Waters HSS T3 UPLC column Waters Corporation Cat# 186005614
Waters Xevo TQ-S triple quadrupole mass spectrometer Waters Corporation N/A
Normal diet (for PDX study) Keaoxieli Feed (Beijing, China) Cat# 2152
Fat free diet (for PDX study) Xietong Organism (Beijing, China) Cat# RD17112401
Western diet (omega-3/omega-6 = 1:50) (For cell line xenograft study) Research Diets Cat# D19032707
Balanced diet (omega-3/omega-6 = 1:1) (For cell line xenograft study) Research Diets Cat# D19032708
Fat free diet (For xenograft study) Research Diets Cat# D19032705
Precellys Lysing Soft tissue homogenizing kit Precellys Cat# P000912-LYSK0
Precellys24 homogenizer Bertin Instruments Cat# P000669-PR240-A
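As a consistency check on the mutagenesis primers listed under Oligonucleotides in the Key Resources Table, the sketch below tests whether each forward/reverse pair is an exact reverse complement, as is usual for complementary site-directed mutagenesis primer pairs. It is illustrative only and not part of the published methods: the function and variable names are hypothetical, and the sequences are transcribed from the table with line breaks removed.

```python
# Illustrative check only; helper names are hypothetical, not from the paper.
# Sequences are transcribed from the Key Resources Table (line breaks removed).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence containing only A/C/G/T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


MUTAGENESIS_PRIMERS = {
    "cPLA2 S505A": (
        "GCAAAGTCACTCAAAGGAGCCAGTGGATAAGATGTATTG",
        "CAATACATCTTATCCACTGGCTCCTTTGAGTGACTTTGC",
    ),
    "cPLA2 T376A": (
        "TTCTTCATACTTCTTAACGACTGCTCCCATAAAAAATTTGCTTCCAA",
        "TTGGAAGCAAATTTTTTATGGGAGCAGTCGTTAAGAAGTATGAAGAA",
    ),
}


def is_complementary_pair(forward: str, reverse: str) -> bool:
    """True if the reverse primer is the exact reverse complement of the forward primer."""
    return reverse_complement(forward) == reverse.upper()


if __name__ == "__main__":
    for name, (fwd, rev) in MUTAGENESIS_PRIMERS.items():
        print(name, len(fwd), "nt, complementary pair:", is_complementary_pair(fwd, rev))
```

A pair that fails this check would most likely indicate a transcription error in the table rather than a problem with the primers themselves.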
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, George
Poulogiannis (george.poulogiannis@icr.ac.uk).
All unique/stable reagents generated in this study are available from the Lead Contact without restriction.

The code generated during this study is available on GitHub at https://github.com/adamltyson/CalciumAnalysis, https://github.com/adamltyson/cell-coloc-3D, and https://github.com/adamltyson/foci2D. These repositories are
also listed in the Key Resources Table. The published article includes all REIMS m/z values and putative annotations for lipids that
differ significantly between receptor subtypes (Table S1) and between MCF10A PIK3CA isogenic lines (Table S4) in the Supplementary
Information. Original/source data of REIMS profiles for Figures 1D, 1E, 1H, 3B, and 3D, corresponding to breast cancer cell lines and
tumors, are available through Mendeley Data (https://doi.org/10.17632/xcgc5kpntm.1).
Human samples
Twelve primary breast cancer samples from female patients (>18 years of age) who consented to the use of their tissue for research were
provided by the Imperial College Healthcare NHS Trust Tissue Bank. Other investigators may have received samples from these
same tissues. The research was supported by the National Institute for Health Research (NIHR) Biomedical Research Centre based
at Imperial College Healthcare NHS Trust and Imperial College London. The views expressed are those of the author(s) and not
necessarily those of the NHS, the NIHR or the Department of Health. Human samples used in this research project were obtained
with evaluation and approval from the Wales Research Ethics Committee Reference 17/WA/0161 (Imperial College Healthcare Tissue
Bank Human Tissue Authority license: 12275; Project number R18024), the East of England – Cambridge East Research Ethics Com-
mittee Reference 14/EE/0024, and the project was registered under the Imperial College Tissue Bank.
Animals
Mouse PDX experiments were performed by Crown Bioscience in accordance with approved Institutional Animal Care and Use Com-
mittee (IACUC) protocols and ethical guidelines, and in strict accordance with the Crown Bioscience Guidelines and Standard Oper-
ating Procedures. Two primary human triple-negative breast cancer PDX corresponding to PIK3CA WT (BR1458) or PIK3CA C420R
MUT (BR1282) tumor fragments (2-3 mm in diameter) were inoculated subcutaneously into the breast pad of 7-9-week-old female
immunodeficient BALB/c nude mice weighing 17-23 g and which had not received previous treatments or procedures. Once tumors
reached a volume of 100-200 mm³, mice were randomized into four groups corresponding to either a normal or fatty acid free (FAF)
diet and administered with vehicle (0.5% hydroxypropyl cellulose in sterile water) or 100 mg/kg cPLA2 inhibitor ASB14780 (2578,
Axon Medchem) daily through oral gavage for 21 days. Animals were housed in a specific pathogen free facility in individually vented
cages and provided with diets and distilled water ad libitum. Room temperature was monitored and maintained at 20-25 °C with the
light cycle set at 12 hours. All animals were checked daily for signs of ill health, as well as for any effects of tumor growth and
treatments on behavior such as mobility, food and water consumption, and body weight gain/loss. Researchers were not blinded
to treatment groups. Tumors were excised 2 hours after the final dosing and snap frozen in preparation for metabolomics/REIMS
processing and histopathological assessment. The mean tumor area as a percent of the total tissue area was initially assessed by
an independent histopathologist, and subsequently quantified using ImageJ version 1.51. Normal and FAF diets were purchased
from Keaoxieli Feed (2152) and Xietong Organism (RD17112401), respectively, and their compositions are summarized in Table S6.
All animal work undertaken at the Institute of Cancer Research was carried out under UK Home Office Project Licenses
P6AB1448A (Establishment License, X702B0E74 70/2902) and was approved by the Animal Welfare and Ethical Review Body at
the ICR. For cell line-derived xenograft studies, 7-9 week old female immunodeficient BALB/c nude mice weighing 18-25 g and which
had not received previous treatments or procedures were initially pre-conditioned on fat free, balanced (omega3/omega6 1:1) or
Western (omega3/omega6 1:50) diets for 2 weeks to assess tolerability. In order for the animal feed to be controlled, cages were ran-
domized to treatment groups rather than individual mice, and this occurred prior to orthotopic injections. Animals were subsequently
injected with 2.5 × 10⁶ triple negative CAL51 (PIK3CA mutant) or Hs578T (PIK3CA WT) cells expressing either control shGFP or two
independent shRNAs targeting cPLA2 (cPLA2-sh1 or cPLA2-sh5) in 100 µL PBS:Matrigel (50:50) into the right mammary fat pad. Animals
were housed in a specific pathogen free facility in individually vented cages (no more than 4 mice per cage) and provided with
diets and distilled water ad libitum. Room temperature was monitored and maintained at 20-25 °C with the light cycle set at 12 hours.
All animals were checked daily for signs of ill health, as well as for any effects of tumor growth and treatments on behavior such as
mobility, food and water consumption, and body weight gain/loss. Owing to the nature of the diets, blinding was not possible. Tumor
measurements were taken twice weekly in three dimensions (width, length and depth), and presented as relative tumor growth
normalized to first measurement day. Mice were excluded from the analysis if the primary tumor engrafted subcutaneously or into
the peritoneum instead of the mammary fat pad. On the final day of the experiment, tumors were excised and immediately snap
frozen in preparation for metabolomics/REIMS processing and histopathological assessment. The mean tumor area was quantified
as a percent of the total tissue area using ImageJ version 1.51. Fat free, balanced and Western diets were purchased from Research
Diets Inc. (D19032705, D19032708, D19032707, respectively). The composition of the diets used for the cell line xenograft study is
summarized in Table S6.
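The normalization of tumor measurements to the first measurement day can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the text does not state how the three dimensions were combined, so the ellipsoid volume formula, the function names and the example caliper readings are all assumptions made here.

```python
import math

def ellipsoid_volume(width_mm, length_mm, depth_mm):
    """Assumed volume estimate (pi/6 * w * l * d); the source gives no formula."""
    return math.pi / 6.0 * width_mm * length_mm * depth_mm

def relative_growth(measurements):
    """Normalize a series of (width, length, depth) readings to the first day."""
    volumes = [ellipsoid_volume(*m) for m in measurements]
    baseline = volumes[0]
    if baseline <= 0:
        raise ValueError("first measurement must be positive")
    return [v / baseline for v in volumes]

# Hypothetical readings (mm) for one mouse, taken twice weekly.
example = [(4.0, 5.0, 3.0), (5.0, 6.0, 3.5), (6.0, 7.5, 4.0)]
growth = relative_growth(example)
```

Because every time point is divided by the same baseline, any constant factor in the assumed volume formula cancels out of the relative growth values.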
Fresh frozen breast patient-derived xenograft (PDX) tumors were obtained from Crown Bioscience, and additional breast, ovarian,
pancreatic, sarcoma and colorectal PDX tumors were kindly provided by Champions Oncology. These are described in the Key Re-
sources Table. The PIK3CA mutational status for all 66 PDX tumor samples is summarized in Table S5.
Cell culture
Human female breast carcinoma cell lines AU565, BT20, BT474, BT549, CAL51, CAMA1, EFM19, Hs578T, JIMT1, KPL1, MCF7,
MDAMB134, MDAMB157, MDAMB231, MDAMB361, MDAMB436, MDAMB453, MDAMB468, MFM223, S68, SKBR3, T47D,
UACC812 and VP229 cells were cultured in Dulbecco’s Modified Eagle’s Medium (DMEM) (GIBCO, 41965-039) and BT483,
HCC1143, HCC1395, HCC1428, HCC1500, HCC1569, HCC1937, HCC1954, HCC202, HCC38, HCC70, SUM52, ZR751 and
ZR7530 cells were cultured in Roswell Park Memorial Institute (RPMI) 1640 media (Sigma-Aldrich, R8758), both supplemented
with 10% fetal bovine serum (FBS) (GIBCO, 10220-106). SUM44 (human, female), SUM159 (human, female), SUM149 (human, female),
SUM225 (human, female) and SUM229 (human, female) were cultured in Ham’s F12 media (Thermo Fisher Scientific,
11765054) supplemented with 5% FBS, 0.5 µg/ml hydrocortisone (Sigma-Aldrich, H-0888) and 0.4% insulin-transferrin-selenium
(GIBCO, 41400-045). MCF10A (human, female) cells (including the PIK3CA MUT isogenic panel) were cultured in DMEM/F-12
(GIBCO, 31330-038) supplemented with 5% horse serum (Thermo Fisher Scientific, 16050-122), 20 ng/ml epidermal growth factor
(EGF) (Peprotech, AF-100-15), 100 ng/ml cholera toxin (Sigma-Aldrich, C-8052), 0.5 µg/ml hydrocortisone (Sigma-Aldrich, H-0888)
and 10 µg/ml insulin (Sigma-Aldrich, I1882). For cell culture conditions free of exogenous sources of fatty acids, 10% FBS or 5%
horse serum was replaced with 1% fatty acid free bovine serum albumin (BSA) (Sigma-Aldrich, A8806). All cell lines were maintained
at 37 °C, 5% CO2. All cell lines were authenticated by short tandem repeat analysis (Eurofins Scientific) and were tested and
confirmed to be negative for mycoplasma infection.
METHOD DETAILS
Experimental design
Experiments were repeated multiple times across different cell line and tumor models with similar results as indicated in the figure leg-
ends. Key findings from in vivo experiments were reproduced using orthogonal approaches including cell line xenograft models and
genetic inhibitions. Animals were randomized into treatment groups either individually following PDX engraftment and growth to
100-200 mm³, or in cages following diet preconditioning. Throughout the study researchers were not blinded as data analysis required
prior knowledge of the sample annotation. For in vitro and in vivo experiments, sample size was chosen based on preliminary exper-
iments and previous experience with protocols. No completed data were excluded from the analysis performed in this manuscript.
Mass spectrometry analysis

Cell lines were cultured in 10 cm² plates and harvested in biological triplicate for REIMS analysis when 80% confluent. After replacing
the media one hour prior to harvesting, the plates were washed twice with ice-cold PBS, and immediately flash frozen in liquid nitrogen.
1 mL PBS was added to the cells and cell extracts were scraped and collected in Eppendorf tubes and centrifuged for 5 min at
5000 rpm. Pellets were flash frozen and stored at −80 °C. For fresh frozen PDX and primary breast tumors, three separate regions
were sectioned for analysis.
REIMS analysis was performed with commercially available electrosurgical bipolar forceps (Erbe Elektromedizin, Germany) con-
nected to a ForceTriad electrosurgical unit (Covidien, Ireland) programmed in Macro bipolar setting using 4 W or 30 W power for cell
lines and tumors, respectively. Bipolar forceps were connected to the inlet capillary of a Thermo Exactive orbitrap instrument (Thermo
Scientific) using PTFE tubing, allowing for the direct suction of aerosol generated from rapid biomass heating to the mass spectrom-
eter (set up is shown in Figure 1A). The mass spectrometer settings used for phospholipid and fatty acid profiling are summarized in
Table S7.
Metabolomics data pre-processing and analysis

Raw mass spectrometric files were converted to mzXML format using the ProteoWizard MSConvert software (Version 3.0.11781)
(Chambers et al., 2012) and imported into RStudio (Version 3.4.4). Data were pre-processed using the MALDIquant package
(Gibb and Strimmer, 2012). Prior to pre-processing, high quality spectra corresponding to the maximum total ion count (TIC) for
each measurement were selected from each individual sample run. Spectra were normalized using median scaling determined
from the non-zero intensities across the full scan range (m/z 150-2000), and square root transformed. Peaks were aligned using
the locally weighted scatterplot smoothing (LOWESS) function and detected using median absolute deviation (MAD) with a signal
to noise ratio (SNR) equal to 3. Peak matching was subject to a maximum peak shift of 5 ppm. For further analysis, the median intensity
of all biological and technical replicates was calculated, such that each sample is represented by a single spectrum. The mass
ranges of m/z 600-900 and m/z 150-500 were recorded for the detection of phospholipids and fatty acids, respectively. m/z values of
interest which discriminated significantly between experimental conditions were annotated based on previously performed tandem
mass spectrometry (MS/MS) fragmentation analysis and data available from Lipid Maps (https://www.lipidmaps.org). To assess instrument
reproducibility, the MDAMB468 cells were included in every REIMS run as a quality control. The spectra obtained were
compared among different batches using unsupervised or guided principal component analysis (gPCA) and no significant batch effect
was observed. Classification and feature selection were performed using leave-one-out or 3-fold cross validation with random
forest as the classifier, applying the ‘caret’ and ‘randomForest’ packages, while unsupervised analysis was performed using hierarchical
clustering (Euclidean distance, complete linkage), or non-negative matrix factorization (NMF) using the ‘NMF’ package. Prediction
of continuous variables such as ESR1 gene expression was performed using random forest regression analysis with the TreeBagger
function in MATLAB (2014a, version 8.3.0.532).
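The normalization and peak detection steps described above can be illustrated with a short sketch. The study used the MALDIquant package in R; the Python below is not that implementation. It is a simplified stand-in under stated assumptions: the function names, the 1.4826 scaling constant for the MAD, the local-maximum window and the toy spectrum are all introduced here for illustration.

```python
import math
from statistics import median

def normalize_spectrum(intensities):
    """Median-scale using the non-zero intensities, then square-root transform."""
    nonzero = [x for x in intensities if x > 0]
    scale = median(nonzero)
    return [math.sqrt(x / scale) for x in intensities]

def mad_noise(intensities):
    """Noise estimate: median absolute deviation (1.4826 scaling is an assumption)."""
    centre = median(intensities)
    return 1.4826 * median([abs(x - centre) for x in intensities])

def detect_peaks(mz, intensities, snr=3.0, half_window=1):
    """Keep local maxima whose intensity exceeds snr times the MAD noise."""
    noise = mad_noise(intensities)
    peaks = []
    for i, y in enumerate(intensities):
        lo = max(0, i - half_window)
        hi = min(len(intensities), i + half_window + 1)
        if y == max(intensities[lo:hi]) and y > snr * noise:
            peaks.append((mz[i], y))
    return peaks

def within_ppm(mz_a, mz_b, tol_ppm=5.0):
    """True if two m/z values differ by no more than tol_ppm parts per million."""
    return abs(mz_a - mz_b) / mz_b * 1e6 <= tol_ppm

# Toy spectrum (hypothetical values): two clear peaks over a low background.
mz_axis = [700.0 + 0.1 * i for i in range(11)]
raw = [0, 1, 0, 2, 100, 1, 0, 1, 64, 0, 2]
peaks = detect_peaks(mz_axis, normalize_spectrum(raw))
```

In this toy spectrum only the two strong signals exceed three times the estimated noise, and `within_ppm` shows how a 5 ppm tolerance would be applied when matching peaks between spectra.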
Transfections and site directed mutagenesis

Sub-cloning of cPLA2 sh1 into the pLKO-Tet-On inducible vector was done by first amplifying the shRNA sequence of interest with
the following primers: Forward 5′-GTGGAAAGGACGAAACACCGGT-3′ and Reverse 5′-TTTGTCTCGAGGTCGAGAATTC-3′, and
subsequently introducing compatible AgeI and EcoRI restriction sites in the shRNA oligonucleotides and pLKO-Tet-On vector
(TET-pLKO puro was a gift from D. Wiederschain, Addgene plasmid 21915). For lentiviral gene knockdown, PLA2G4A or GFP
shRNAs were transfected in 293T cells using FuGENE HD Transfection reagent (Promega, E2311), according to the manufacturer’s
instructions and infected cells were selected in the presence of 1 µg/ml puromycin. For siRNA knockdown, MCF10A PIK3CA WT and
E545K/H1047R MUT cells were transfected with 25 nM ON-TARGETplus SMARTpool human siRNAs targeting RAPTOR, RICTOR,
FRAP1, PLCγ1 or PRKCZ using DharmaFECT-1 Transfection Reagent for 48 hours. HA-tagged cPLA2 was transiently overexpressed
using 9 µg pCMV3-HA-PLA2G4A vector DNA and 18 µL FuGENE HD Transfection reagent, and experiments were performed
48 hours post transfection.
For generating CRISPR knockouts, MCF10A PIK3CA WT and H1047R cells were transfected with PLA2G4A sgRNA CRISPR/Cas9
All-in-One Lentivector Target 2 (Applied Biological Materials, K1659207) or control LentiCRISPR v2 (lentiCRISPR v2 was a gift from
Feng Zhang, Addgene plasmid #52961) according to the manufacturer’s instructions. Immunoblotting and DNA sequencing were
used to validate the isolation of PLA2G4A deleted clones.
Phosphoresistant isoforms of cPLA2 were generated using the QuikChange Lightning Site-Directed Mutagenesis Kit (Agilent,
210518) according to the manufacturer’s instructions. Briefly, 100 ng of pCMV3-HA-PLA2G4A vector DNA was incubated with
125 ng of the appropriate mutagenesis primers:
S505A Forward 5′-GCAAAGTCACTCAAAGGAGCCAGTGGATAAGATGTATTG-3′, Reverse 5′-CAATACATCTTATCCACTGGCTCCTTTGAGTGACTTTGC-3′;
T376A Forward 5′-TTCTTCATACTTCTTAACGACTGCTCCCATAAAAAATTTGCTTCCAA-3′, Reverse 5′-TTGGAAGCAAATTTTTTATGGGAGCAGTCGTTAAGAAGTATGAAGAA-3′.
PCR cycling parameters were as follows: initial denaturing at 95 °C for 2 min, followed by 18 cycles of 20 s denaturing (95 °C), 10 s
annealing (60 °C) and 4.5 min elongation (68 °C). A final elongation step occurred for 5 min (68 °C). To assess the effects of these phosphomutants
on cPLA2 activity or arachidonic acid (AA) levels, cells were transiently transfected with 9 µg pCMV3-HA-PLA2G4A,
pCMV3-HA-PLA2G4A-S505A or pCMV3-HA-PLA2G4A-T376A vector DNA and 18 µL FuGENE HD Transfection reagent, and experiments
were performed 48 hours post transfection.
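Site-directed mutagenesis primer pairs of this kind are normally exact reverse complements of each other, which gives a simple sanity check on the S505A and T376A sequences listed above. The sketch below is an illustrative check only; the helper names are assumptions and it is not part of the published protocol.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

# Forward and reverse mutagenesis primers as given in the text (5' to 3').
PRIMERS = {
    "S505A": ("GCAAAGTCACTCAAAGGAGCCAGTGGATAAGATGTATTG",
              "CAATACATCTTATCCACTGGCTCCTTTGAGTGACTTTGC"),
    "T376A": ("TTCTTCATACTTCTTAACGACTGCTCCCATAAAAAATTTGCTTCCAA",
              "TTGGAAGCAAATTTTTTATGGGAGCAGTCGTTAAGAAGTATGAAGAA"),
}

def pairs_are_complementary(primers):
    """Map each primer-pair name to True if reverse is the revcomp of forward."""
    return {name: reverse_complement(fwd) == rev
            for name, (fwd, rev) in primers.items()}

check = pairs_are_complementary(PRIMERS)
```

A mismatch reported by `pairs_are_complementary` would point to a transcription error in one of the two primers rather than to a problem with the mutagenesis design itself.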
Cell based assays

For colony formation assays, single cell suspensions containing 5 × 10² MCF10A PIK3CA WT or E545K/H1047R MUT cells were
seeded in 6 well plates and allowed to adhere for 18 hours. Wells were subsequently washed twice with PBS, and DMEM/F-12 media
containing growth factors and 1% FAF BSA (herein called ‘FAF media’), supplemented either with 0.1% DMSO or 25 µM AA, was added. After
14 days, colonies were stained with crystal violet, and quantified using ImageJ. For 3D spheroid culture, 5 × 10³ cells were seeded in
round bottom, low attachment 96 well plates (Corning, 7007) and incubated for 14 days. Media was replaced every 3 days. For
ASB14780 viability assays, 5 × 10³ PIK3CA WT (MCF10A, MDAMB134, AU565, Hs578T) or PIK3CA MUT (MCF10A E545K/H1047R,
CAL51, MCF7, MDAMB453) cells were seeded in 96 well plates in full media and allowed to adhere for 18 hours. Wells were
washed twice with PBS, and FAF media containing either 0.1% DMSO, 20 nM-10 µM ASB14780 or the inhibitor supplemented
with 25 µM AA was added. Viability was assessed after 72 hours using the CellTiter 96 Aqueous Non-Radioactive (MTS) cell proliferation
assay (Promega, G5430).
To measure cell proliferation, 5 × 10³ MCF10A isogenic cells expressing GFP control or cPLA2 shRNAs (sh1 and sh5) were seeded
in 96 well plates, left to adhere for 18 hours, washed twice with PBS, and incubated with FAF media. Cell number was determined on
Days 0, 1, 3 and 5 using the Sulforhodamine B (SRB) assay.
To assess eicosanoid-mediated autocrine and paracrine effects on cell proliferation, conditioned media were derived from
MCF10A PIK3CA WT or MUT isogenics. Cells were grown to 80% confluency in full media, after which residual media were washed
off twice with PBS, and cells were cultured with FAF media for 48 hours. The conditioned media were centrifuged at 2000 g for 20 min
to remove cell debris and incubated with Lipid Removal Adsorbent (LRA, Supelco, 13358) at 0.4 g of LRA per 10 mL media to deplete
secreted lipids. Following overnight incubation with gentle shaking, lipid-deprived media were centrifuged at 2000 g for 20 min and
supernatants were collected. Proliferation was assessed over 5 days using the SRB assay.
To assess fatty acid uptake, 5 × 10⁴ MCF10A PIK3CA WT or MUT cells were seeded in 100 µL full medium containing 5% horse
serum in a 96-well plate and incubated for 24 hours at 37 °C, 5% CO2. Cells were washed twice with PBS and serum deprived for
1 hour before adding 100 µL TF2-C12 Fatty Acid Stock Solution (Sigma-Aldrich, MAK156). After incubating the cells for 1 hour at
37 °C, the fluorescence signal was measured at Ex/Em = 485/515 nm.
Three-dimensional cell culture

For 3D acini formation, MCF10A cells were grown in Matrigel (Corning, 354230) as described previously (Debnath et al., 2003). Briefly,
PIK3CA isogenic MCF10A cells expressing pLKO-Tet-On inducible shRNA against cPLA2 or GFP were induced with 2 µg/ml doxycycline
for 48 hours prior to seeding. A single cell suspension (3 × 10³ cells per well) was plated in assay medium containing 1% FAF
BSA onto eight-well chamber slides coated with a layer of pure growth factor–reduced Matrigel. Cells were then overlaid with
300 µL of assay medium containing 2% Matrigel and cultured for 10 days. Growth media was replaced every 3 days.
Confocal microscopy
For immunofluorescence, cells were fixed with 4% PFA at room temperature and permeabilized with pre-chilled 0.5% Triton X-100 for
10 min prior to blocking with 2% BSA/PBS for 1 hour. After blocking, cells were incubated overnight at 4 °C with primary Ki-67 antibody
(Cell Signaling Technology, 9129) in a humidified chamber. The following day cells were incubated with
Alexa Fluor 546-conjugated secondary antibody (Thermo Fisher Scientific, A-11003) and Phalloidin 633
(Abcam, ab176758) to visualize Ki-67 and F-actin, respectively, in 1% BSA/PBS for 1 hour. Slides were mounted using Prolong
Gold anti-fade reagent with DAPI (Invitrogen, D1306). Images were captured using a Zeiss AxioObserver microscope equipped
with a Yokogawa CSU-W1 spinning disk unit (Intelligent Imaging Innovations) and a 40x oil objective. Serial z stacks of the acini structures
were acquired at 5 µm intervals (usually 10-15 sections per field), and then analyzed with a custom MATLAB script (2017b, The
Mathworks Inc.). Images were resampled to isotropic resolution and each spheroid was manually segmented. The DAPI signal was
thresholded using Otsu’s method (Otsu, 1979) following intensity depth correction and smoothing. Holes were then filled and small,
non-cellular objects were removed. The resulting binary nuclei image was used as a mask to measure cellular proliferation.
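The thresholding step can be illustrated with a compact version of Otsu's method, which picks the gray level that maximizes the between-class variance of the intensity histogram. The authors used a custom MATLAB script; the Python sketch below covers only the threshold selection (not depth correction, smoothing, hole filling or object removal), and the function name and toy pixel values are assumptions made here.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximizing between-class variance.

    Pixels with value <= t are background; pixels > t are foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the background class
    sum0 = 0    # intensity sum of the background class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Toy 8-bit "DAPI" intensities (hypothetical): dim background, bright nuclei.
pixels = [10] * 50 + [12] * 30 + [200] * 15 + [210] * 5
threshold = otsu_threshold(pixels)
nuclei_mask = [p > threshold for p in pixels]
```

For a clearly bimodal histogram such as this one, every level between the two modes gives the same class split; the sketch returns the lowest such level.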
Enzymatic assays
cPLA2, iPLA2 and sPLA2 activities were measured using commercially available assays (Abcam, ab133089 or ab133090) according
to the manufacturer’s instructions. Briefly, total cell lysates were obtained using 1x Cell Lysis Buffer (Cell Signaling Technology, 9803)
under non-denaturing conditions. For cPLA2 activity, 10 µL lysate, 5 µL Assay buffer and 200 µL substrate solution containing arachidonoyl
Thio-PC were incubated at room temperature for one hour. For iPLA2 activity, cell lysates were either untreated (measuring
both cPLA2 and iPLA2 activity) or treated with 5 mM of the iPLA2 specific inhibitor bromoenol lactone (BEL), and activity was determined
as follows: iPLA2 activity = (Activity without BEL) – (Activity with BEL). For sPLA2 activity, conditioned media was obtained
from MCF10A PIK3CA WT and MUT cells and concentrated using a centrifugal vacuum evaporator (“SpeedVac”). Dried samples
were resuspended in 100 µL Assay Buffer, and 10 µL of this was used for the assay. Cellular diacylglycerol (DAG) levels were
measured using a DAG assay kit according to the manufacturer’s instructions (Cell Biolabs Inc., MET-5028). Briefly, 1 × 10⁷ cells
were harvested by scraping with 1 mL cold PBS, and pellets were obtained after centrifugation at 1500 g for 10 min. Lipids were
extracted following sonication and incubation with methanol, sodium chloride and chloroform. The lower chloroform phase was
washed twice with pre-equilibrated upper phase (PEU) and dried under a stream of nitrogen. 50 µL of assay buffer was used to resuspend
the dried sample, 20 µL of which was used for the assay.
Immunoblot analysis
Cells were washed with ice-cold PBS and lysed on ice for 30 min with cell lysis buffer containing RIPA buffer (Thermo Scientific,
89900) supplemented with 4 mg each of leupeptin (Sigma-Aldrich, L2884) and pepstatin (Sigma-Aldrich, P5318), 2 mM Na3VO4
(Sigma-Aldrich, 450243), 1 mM DL-Dithiothreitol (Sigma-Aldrich, 646653), 10 mM Calyculin A (Cell Signaling Technology, 9902),
250 mM β-glycerophosphate (Sigma-Aldrich, G9422) and 400 mM PMSF protease inhibitor (Cell Signaling Technology, 8553). Lysates
were subjected to centrifugation at 12,000 g for 30 min at 4 °C, and protein concentrations were determined using the Bradford assay
(Bio-Rad, 5000006). Nuclear isolation for mature SREBP probing was performed using a Nuclear Extract Kit (Active Motif, 40010) with
10 mg/ml of the protease inhibitor ALLN (Millipore, 208719). Protein lysates were boiled for 10 min and subjected to SDS-PAGE elec-
trophoresis using 4%–15% precast gels (Bio-Rad, 567-1084). Densitometry was calculated using the Image Lab Software 5.2.1 (Bio-
Rad). Affinity purified custom antibodies for phosphorylated cPLA2 on Thr376 were developed by Thermo Fisher Scientific using the
PLA2G4A-369:383 peptide antigen and used at a dilution of 1:500 in 5% BSA/TBST. All other primary antibodies were used at a dilu-
tion of 1:1000 in 5% BSA/TBST solution, and secondary antibodies at 1:5000 in 5% milk/TBST.
Immunoprecipitation analysis
MCF10A PIK3CA WT and MUT isogenics were transiently transfected with HA-tagged cPLA2 (pCMV3-HA-PLA2G4A, Sino Biological,
HG13126-NY) and lysed with 1X Cell Lysis Buffer (Cell Signaling Technology, 9803) 48 hours post transfection. Equal volumes of diluted
cell lysates containing 4 mg of soluble protein were incubated with 2 µg rabbit anti-HA (Cell Signaling Technology, 3724) or 2 µg normal
rabbit IgG (Santa-Cruz Biotechnology, sc-2027) as a negative control, and were pre-coupled with protein G Sepharose beads (Sigma-Aldrich,
P3296) following incubation at 4 °C for 4 hours rotating. 1X Cell Lysis Buffer was used to wash the beads four times, after which
samples were resuspended in 30 µL 2x Laemmli sample buffer (Bio-Rad, 161-0737) and boiled for 10 min to release bound proteins.
To detect GTP-bound Rac1, the Active Rac1 Detection Kit (Cell Signaling Technology, 8815) was used, following the manufacturer’s
instructions. Briefly, cell lysates were harvested under non-denaturing conditions using 1X Cell Lysis Buffer. To affinity precipitate
activated G-protein, spin cups were incubated with 100 µL of glutathione resin and subsequently washed with 1X Cell Lysis
Buffer. 20 µg of GST-PAK1-PBD was added to spin cups containing glutathione resin, after which 700 µL of the cell lysate containing
1 mg total protein was added, and the mix was incubated at 4 °C for 1 hour with gentle shaking. Following three washes with 400 µL 1X
Cell Lysis Buffer, samples were eluted by adding 2X reducing sample buffer containing 200 mM DTT in 2X SDS sample buffer, and
subsequently heated for 5 min at 95 °C. 25 µL of eluted samples were loaded onto an SDS-PAGE gel for subsequent immunoblot
analysis using the provided Rac1 mouse monoclonal antibody (1:1000 dilution).
Immunohistochemistry analysis
10 µm thick sections were obtained from formalin-fixed and paraffin-embedded tumors and stained using an anti-phospho PKCζ
Thr560 antibody (Abcam, ab62372) at a dilution of 1:100. For breast PDX and cell line-derived xenograft tumors, fresh frozen tumor
pieces were mounted on Optimal Cutting Temperature (OCT) compound and sectioned to 10 µm thickness. To assess the presence
of activated natural killer cells, slides were stained using goat anti-mouse NKp46/ncr1 polyclonal antibody (R&D Systems, 170-6516)
at a final concentration of 3 µg/ml. Images were captured on a Hamamatsu NanoZoomer, and staining was quantified using the
IHC Profiler plug-in for ImageJ.
Proximity ligation assay

For the proximity ligation assay (PLA), all incubations were performed in a humidity chamber as per the manufacturer’s instructions
using a Duolink In Situ Detection Reagents Red kit (Sigma-Aldrich, DUO92008). Briefly, MCF10A PIK3CA isogenic cells were seeded
on glass coverslips, serum-starved overnight, and stimulated with full serum and growth factors for 30 min; control cells were untreated.
They were then fixed with 4% paraformaldehyde (PFA), blocked and incubated with primary antibodies (anti-cPLA2 and
anti-phospho PKCζ Thr560) for 1 hour at room temperature. Samples were then incubated with Minus and Plus PLA probes
(Sigma-Aldrich, DUO92004 and DUO92002) for 1 hour at 37 °C, followed by a ligation step for 30 min at 37 °C, and an amplification
step for 100 min at 37 °C. Finally, the coverslips were mounted with the Duolink PLA mounting media and DAPI, and fluorescence was
visualized with a Zeiss 710 Confocal microscope with a 40x oil objective. Images were analyzed using a custom MATLAB (2017b)
script. Briefly, the DAPI signal was segmented by taking a maximum intensity projection, smoothing, and then thresholding (Otsu,
1979). Borders between cells were estimated by finding the midpoints between nuclei using a watershed approach. A summed pro-
jection was calculated for the probe signal, this was smoothed (Gaussian kernel sigma of 1 pixel), and thresholded. The threshold
calculated from a positive control image was applied to all images. The segmented image of the probe was used to calculate the
number of foci on a per cell basis.
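The per-cell foci count can be sketched as connected-component labeling of the thresholded probe image followed by assignment of each focus to a cell. The authors used a custom MATLAB script with watershed-derived cell borders; the Python below is a simplified stand-in that assigns each focus to the nearest nucleus centroid, which approximates borders drawn at the midpoints between nuclei. All names and the toy image are assumptions made for illustration.

```python
from collections import deque

def label_foci(mask):
    """Find 4-connected components in a binary 2D list; return their centroids."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:  # breadth-first flood fill of one focus
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            n = len(pixels)
            centroids.append((sum(p[0] for p in pixels) / n,
                              sum(p[1] for p in pixels) / n))
    return centroids

def foci_per_cell(mask, nuclei):
    """Count foci per cell by assigning each focus to the nearest nucleus centroid."""
    counts = [0] * len(nuclei)
    for fy, fx in label_foci(mask):
        nearest = min(range(len(nuclei)),
                      key=lambda i: (nuclei[i][0] - fy) ** 2 + (nuclei[i][1] - fx) ** 2)
        counts[nearest] += 1
    return counts

# Toy thresholded probe image (hypothetical) and two nucleus centroids (row, col).
probe_mask = [
    [1, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 1, 1],
]
nuclei_centroids = [(2, 1), (2, 6)]
counts = foci_per_cell(probe_mask, nuclei_centroids)
```

Nearest-centroid assignment is a deliberate simplification; a watershed on the nuclei mask, as described in the text, would follow cell shape more closely when nuclei differ in size.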
Fluorescent calcium assay

Calcium flux was measured using the Fluo-4 Direct™ Calcium Assay Kit (Thermo Fisher Scientific, F10471). Full MCF10A growth
medium was removed and replaced with FBS-free medium for 18 hours. Control cells were treated with equal volume of the culture
media of the 2x Fluo-4 Direct™ calcium assay reagent solution for 30 min at 37 °C, 5% CO2, without removing the assay media. The
induced cells were stimulated with full media for 30 min, while incubated with the calcium assay reagent solution. Cells were then
analyzed by monitoring the fluorescence of the Fluo-4 dye using an ImageXpress Micro Confocal High-Content Imaging System
(Molecular Devices) and a 20x objective with 488 nm excitation. The fluorescence per field before and after induction was calculated
using a MetaXpress Software Custom Module.
To assess the effects of PLCγ1 inhibition on calcium flux, cells were treated with either 2 µM U73122 for 24 hours, or transiently
transfected with 25 nM ON-TARGETplus SMARTpool human siRNA targeting PLCγ1 for 48 hours. For the final 18 hours, full MCF10A
growth medium was removed and replaced with FBS-free medium. Immediately prior to the assay, cells were treated with equal volumes
of culture media and 2x Fluo-4 Direct™ calcium assay reagent, and baseline fluorescence was monitored using an ImageXpress
Micro Confocal High-Content Imaging System (Molecular Devices). After 300 ms, cells were induced by adding 1 mL full MCF10A
growth medium, and fluorescence of the Fluo-4 dye was monitored using a 20x objective with 488 nm excitation. Calcium flux
was calculated by first normalizing fluorescence readings to baseline measurements using the formula: dF = (F(t) – F(0))/F(0), where
F = fluorescence at 488 nm, t = time, and F(0) represents the average baseline readings from 1-299 ms. Fluorescence values were
subsequently normalized to a unit interval between 0 and 1 and presented as the time required post stimulation to reach a maximal
calcium intensity.
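The calcium flux calculation can be written out step by step: baseline-normalize each reading, rescale the trace to the unit interval, and report the time after stimulation at which the maximum is reached. The sketch below is illustrative and is not the authors' CalciumAnalysis code; the function name, the treatment of 300 ms as the stimulation time and the example trace are assumptions.

```python
def time_to_peak(times_ms, fluorescence, stimulation_ms=300):
    """Return (scaled trace, time in ms from stimulation to maximal signal)."""
    # F(0): average of the baseline readings taken before stimulation.
    baseline = [f for t, f in zip(times_ms, fluorescence) if t < stimulation_ms]
    f0 = sum(baseline) / len(baseline)
    # dF = (F(t) - F(0)) / F(0)
    d_f = [(f - f0) / f0 for f in fluorescence]
    lo, hi = min(d_f), max(d_f)
    if hi == lo:
        raise ValueError("flat trace: cannot rescale to the unit interval")
    # Min-max rescaling so the trace spans 0 to 1.
    scaled = [(x - lo) / (hi - lo) for x in d_f]
    peak_index = max(range(len(scaled)), key=scaled.__getitem__)
    return scaled, times_ms[peak_index] - stimulation_ms

# Hypothetical trace: flat baseline, stimulation at 300 ms, peak at 500 ms.
times = [0, 100, 200, 300, 400, 500, 600]
signal = [100, 100, 100, 110, 150, 200, 180]
scaled_trace, peak_delay_ms = time_to_peak(times, signal)
```

Rescaling to the unit interval removes differences in absolute signal between wells, so the comparison between conditions rests on the timing of the response rather than its amplitude.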
Quantitative RT-PCR and PIK3CA mutation analysis

RNA was extracted using the RNeasy Plus Mini Kit (QIAGEN, 74134) and 1 µg of total RNA was used to synthesize complementary
DNA (cDNA) using the QuantiTect Reverse Transcription Kit (QIAGEN, 205311). Primer sequences used are as follows: Human
cPLA2, forward: 5′-GATGAAACTCTAGGGACAGCAAC-3′, reverse: 5′-CTGGGCATGAGCAAACTTCAA-3′; Human β-actin, forward:
5′-GACCCAGATCATGTTTGAGACC-3′, reverse: 5′-CTTCATGAGGTAGTCAGTCAGG-3′. Reactions were performed with SYBR Select
Master Mix (ThermoFisher, 4472908) using the TProfessional Thermocycler from Biometra and analyzed with the qPCRsoft version 3.1
(Thistle Scientific/Analytik Jena). The following cycle reactions were used: pre-denaturation for 3 min at 95 °C, followed by 45 cycles
of 5 s at 95 °C (denaturation), 5 s at 55.8 °C (annealing) and 15 s at 72 °C (elongation).
To detect PIK3CA mutations in primary breast tumor samples, DNA was extracted from 10 mg of tissue using the QIAamp DNA
mini kit (QIAGEN, 51304). Mutations in exon 9 (helical domain) and exon 20 (kinase domain) of PIK3CA were assessed using the PNAClamp
PIK3CA Mutation Detection Kit (Panagene, PNAC-4001), according to the manufacturer’s instructions. Briefly, reactions were
performed using 10 ng DNA with a SYBR Green PCR reaction premix and primer premixes detecting E542, E545, Q546 and H1047
mutations using the TProfessional Thermocycler. The following cycle reactions were used: pre-denaturation for 5 min at 94 °C, followed
by 40 cycles of 30 s at 94 °C (denaturation), 20 s at 70 °C (peptide nucleic acid clamping), 30 s at 63 °C (annealing) and 30 s
at 72 °C (extension). PIK3CA mutations were assessed based on the manufacturer’s instructions.
In vitro kinase assay

In vitro kinase assays were performed using the ADP-Glo Kinase Assay (Promega, V6930), according to the manufacturer’s instructions.
Briefly, 100 ng recombinant human protein kinase C zeta (PKCζ, Insight Biotechnology, TP302472) and 0.5 mg/ml phospholipase
A2, group IVA (cPLA2, Insight Biotechnology, TP320972) were incubated in kinase buffer comprising 40 mM TRIS buffer pH 7.5,
20 mM MgCl2 (Sigma-Aldrich, M8266), 0.1 mg/ml BSA (Sigma-Aldrich, A2153) and 50 mM DTT (Sigma-Aldrich, 646563), and supplemented
with 20 mM ATP. Reactions were incubated at 30 °C for 3 hours, after which luminescence was measured following an integration
time of 1 s.
Chemokine assays
Tumor samples were placed in 2 mL lysing tubes prefilled with 1.4 mm ceramic (zirconium oxide) beads and 1 mL of chilled PBS.
Samples were homogenized with a Precellys24 homogenizer programmed with three 30 s cycles at 6,500 Hz and 4-min pause times.
At the end of the homogenization cycle, samples were centrifuged at 5,000 rpm for 10 min at 4 °C, and supernatants were transferred
to fresh Eppendorf tubes.
To measure CCL5 (RANTES) chemokine levels, the CCL5 Mouse ELISA kit (Abcam, ab100739) was used according to the manufacturer’s
instructions. Briefly, 100 µL of standards prepared in Assay Diluent A and sample supernatants were added to appropriate
wells and incubated at room temperature for 2.5 hours. Wells were subsequently washed 4 times with 1X Wash Solution, and 100 µL
of 1X Biotinylated CCL5 (RANTES) Detection Antibody was added and incubated for 1 hour at room temperature with gentle shaking.
Following the incubation with the detection antibody, wells were washed 4 times and incubated with 100 µL HRP-Streptavidin solution
for 45 min at room temperature with gentle shaking. Following a final set of 4 washes, 100 µL of TMB One-Step Substrate Reagent
was added to each well and incubated for 30 min at room temperature in the dark with gentle shaking. 50 µL of stop solution
was subsequently added, and measurements were taken immediately at 450 nm.
To measure CX3CL1 chemokine levels, the CX3CL1 Mouse ELISA kit (Abcam, ab100683) was used according to the manufacturer’s
instructions. Briefly, 100 µL of standards prepared in Assay Diluent C and sample supernatants were added to appropriate
wells and incubated at room temperature for 2.5 hours. Wells were subsequently washed 4 times with 1X Wash Solution, and 100 µL
of 1X Biotinylated CX3CL1 Detection Antibody was added and incubated for 1 hour at room temperature with gentle shaking.
Following the incubation with the detection antibody, wells were washed 4 times and incubated with 100 µL HRP-Streptavidin solution
for 45 min at room temperature with gentle shaking. Following a final set of 4 washes, 100 µL of TMB One-Step Substrate Reagent
was added to each well and incubated for 30 min at room temperature in the dark with gentle shaking. 50 µL of stop solution
was subsequently added, and measurements were taken immediately at 450 nm. All measurements were normalized to total protein
content as determined using the BCA protein assay (Thermo Fisher Scientific, 23225).
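The conversion from a 450 nm reading to a protein-normalized chemokine level can be illustrated with a minimal Python sketch. The piecewise-linear standard curve, the function names, and all numbers below are assumptions made for illustration; the kit instructions or curve-fitting software may call for a different curve model, and nothing here is taken from the authors' analysis.

```python
from bisect import bisect_left


def interpolate_concentration(od, standards):
    """Piecewise-linear lookup of a concentration from an OD450 reading.

    `standards` is a list of (concentration, od) pairs from the kit
    standards (hypothetical values in the usage example below).
    """
    pts = sorted(standards, key=lambda p: p[1])  # sort by optical density
    ods = [p[1] for p in pts]
    if not ods[0] <= od <= ods[-1]:
        raise ValueError("OD outside the standard curve; dilute or re-assay")
    i = bisect_left(ods, od)
    if ods[i] == od:
        return pts[i][0]
    (c0, o0), (c1, o1) = pts[i - 1], pts[i]
    return c0 + (c1 - c0) * (od - o0) / (o1 - o0)


def normalize_to_protein(conc_pg_per_ml, protein_mg_per_ml):
    """Express a chemokine level as pg per mg of total protein (e.g., from BCA)."""
    return conc_pg_per_ml / protein_mg_per_ml


# Hypothetical standards: (pg/mL, OD450)
standards = [(0, 0.05), (100, 0.25), (200, 0.45)]
conc = interpolate_concentration(0.35, standards)
level = normalize_to_protein(conc, protein_mg_per_ml=2.0)
```

Dividing by the BCA-derived protein concentration makes tumor lysates of different sizes comparable, which is the purpose of the normalization step described above.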
Lipid extraction and eicosanoid profiling

Total lipids were extracted based on the Bligh and Dyer method (Bligh and Dyer, 1959) and analyzed as described previously (Wolfer et al., 2015). Briefly, cell lines were harvested when 80% confluent, and media was replaced one hour prior to lipid extraction. Cell pellets were resuspended in 1 mL water, after which 3.75 mL chloroform/methanol mixture (1:2 v/v) was added, and samples were vortexed and incubated for at least 30 min on ice. Next, 1.25 mL chloroform was added, followed by 1.25 mL Milli-Q water. After vortexing and centrifugation at 1,000 rpm for 10 min at 4°C, the organic bottom phase was separated, transferred to a 15 mL amber vial, and dried under a nitrogen flow. For conditioned media profiling, cells were grown to 80% confluency in full media, after which residual media was washed off twice with PBS, and cells were cultured with FAF media for 48 hours. The conditioned media was centrifuged at 2,000 × g for 20 min to remove cell debris. Samples were concentrated using a centrifugal vacuum evaporator. The lipid extraction was then performed as above.
Eicosanoid levels were determined using liquid chromatography with a Waters HSS T3 UPLC column connected to a Waters Xevo
TQ-S triple quadrupole mass spectrometer with an electrospray ionization (ESI) source. TargetLynx (Waters, Manchester, UK) was
used for peak detection and integration. Data were normalized to protein concentration.
To measure PGE2 (Enzo Life Sciences, ADI-900-001) and AA (Generon, CEB098Ge) concentrations, enzyme-linked immunosor-
bent assays were used, following the manufacturer’s instructions.
Bioinformatic analysis
Gene-centric RMA-normalized expression data and mutation status for available cell lines were obtained from the Cancer Cell Line Encyclopedia (CCLE). 150 frequently mutated genes were identified based on a mutation frequency of ≥20% in n = 35 available cell lines, and significant enrichment in observed clusters was assessed using Fisher’s exact test. A functional annotation analysis using the Database for Annotation, Visualization and Integrated Discovery (DAVID, v6.8) (Huang et al., 2009) was used to identify biologically relevant pathways represented by a set of 512 genes that were significantly upregulated in the PIK3CA mutation- and lipid-enriched cluster. Interactions between genes involved in the identified pathways were visualized as functional gene networks using the ‘ReactomePA’ package in R (Yu and He, 2016).
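As an illustration of the enrichment test named above, the following Python sketch computes a two-sided Fisher’s exact p value for a 2 × 2 table of mutation status by cluster membership. The function name and the counts in the usage example are hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is an assumption about the convention; it is not the authors' code.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]].

    Rows could be clusters and columns mutated / wild-type cell lines.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: 8 of 10 lines mutated in one cluster, 1 of 6 in the other
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

An exact test is a natural choice here because, with only a few dozen cell lines, the expected counts in some cells of the table are too small for a chi-square approximation.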
Student’s t test, Fisher’s exact test, and one- or two-way ANOVA were used to evaluate statistical significance as indicated in the
respective figure legends. The Shapiro-Wilk test was used to assess normality. The ‘N’ for each experiment can be found in the figure
legends and represents independently generated samples for in vitro experiments, wells for cell-based assays, or mice for in vivo
experiments. Bar graphs present the mean ± standard error of the mean (SEM). Significance was defined as p < 0.05, and denoted
by asterisks throughout the figures as follows: n.s. (not significant), *(p < 0.05), **(p < 0.01), ***(p < 0.001). No statistical methods were
used to determine sample size, and no completed data were excluded from the analysis performed in this manuscript. Statistical
tests were performed using GraphPad Prism (version 8.0.1) and R statistical software (version 3.5.1).
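The summary statistic and the asterisk convention described above can be written out as a short Python sketch. The function names are illustrative assumptions; the analysis itself was performed in GraphPad Prism and R, not with this code.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, SEM), where SEM = sample standard deviation / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("SEM needs at least two values")
    return mean(values), stdev(values) / sqrt(n)


def significance_label(p):
    """Map a p value to the asterisk notation used in the figures."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "n.s."


# Hypothetical replicate measurements
m, sem = mean_sem([1.0, 2.0, 3.0])
label = significance_label(0.03)
```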
Figure S1. Related to Figure 1

(A) Volcano plots of significantly altered phospholipids between receptor positive and negative cell lines. Black dots: not significantly altered; Red dots:
significantly upregulated; Green dots: significantly downregulated phospholipids. (B) Area under the curve (AUC) classification accuracies for estrogen (ER),
progesterone (PR), HER2 receptor and triple negative status of 30 primary and PDX breast tumors (median intensity of n = 3 separate sections per tumor) following
feature selection for phospholipids in the m/z range 600-900 and leave-one-out cross validation. (C) Immunoblot analysis of estrogen inducible protein pS2 (top)
and prediction of ESR1 expression (bottom) in ER+ve T47D cells following treatment with 0.1% DMSO or indicated concentrations of 4-OHT for 72 hours using
REIMS. (D) NMF consensus maps summarizing the clustering of cell lines used in Figure 1D. The color map represents the correlation between cell lines in the
same cluster when samples are divided into 2-6 groups. The highest cophenetic score was obtained for two clusters. (E) REIMS analysis of MCF10A PIK3CA WT
and MUT cells cultured as 3D spheroids for 10 days. Clustering was performed as in Figure 1D using the median lipid intensities of 3 biological replicates. (F)
Overall, precision and recall classification accuracies for PIK3CA mutation status in primary and PDX breast tumors (n = 30 in total), using all detectable lipid
features (n = 1147) following 3-fold cross validation repeated 100 times with random forest as a classifier. n.s., not significant; *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001.
P values in (C, bottom panel) were calculated with one-way ANOVA, followed by unpaired, two-tailed Student’s t test with Bonferroni correction.
Figure S2. Related to Figures 1 and 2

(A) Pathway enrichment analysis of genes corresponding to the three most significantly enriched KEGG pathways determined from the 512 significantly upregulated genes in the lipid-enriched cluster from Figure 1D. (B) Gene interaction networks corresponding to the 55 genes encompassing the KEGG pathways in (A). (C) FASN, (D) ELOVL6, and (E) LDLRAP1 mRNA expression between cell lines in the lipid-enriched (n = 19) and depleted (n = 15) clusters. *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001. P values in (C), (D), and (E) were calculated with unpaired, two-tailed Student’s t test.

(A) Cell viability of MCF10A PIK3CA MUT cells following treatment with increasing concentrations of rapamycin, torin 1, BYL-719, BKM120, MK2206 or GSK690693 for 72 hours. (B) Unsupervised hierarchical clustering of the median phospholipid intensities of 5 PIK3CA MUT breast cancer cell lines (MCF7, T47D, MDAMB361, MDAMB453 and BT474) treated with 20 nM rapamycin, 100 nM BYL-719 and 150 nM MK2206 for 72 hours. (C) Immunoblot analysis of mTORC1 and mTORC2 signaling in the PIK3CA MUT isogenic panel. Cells were serum and growth-factor starved for 16 hours and subsequently stimulated with 5% horse serum, 20 ng/ml EGF, 0.5 μg/ml hydrocortisone and 10 μg/ml insulin for 30 min. Data in (A) are presented as the mean ± SEM of n = 4 biological replicates and are representative of at least two independent experiments. n.s., not significant; *p ≤ 0.05; **p ≤ 0.01. P values in (A) were calculated with unpaired, two-tailed Student’s t test.

Intracellular levels of (A) arachidonic acid (AA) and (B) PGE2 as measured by LC-MS profiling. (C) AA levels from the conditioned media (CM) of indicated cells before and after lipid depletion (LD). (D) Cell proliferation assays of MCF10A PIK3CA WT cells cultured in CM derived from WT or E545K MUT cells before or after LD, with or without the supplementation of 25 μM AA, palmitate or palmitoleate. (E) Cell proliferation assays of MCF10A PIK3CA E545K MUT cells before or after LD, with or without the supplementation of 25 μM AA, palmitate or palmitoleate. (F) Diacylglycerol (DAG) levels in MCF10A PIK3CA WT and MUT cells. (G) Diagram summarizing DAG contribution to AA production. Sulforhodamine B (SRB) protein staining was used in (D) and (E) to measure cell proliferation over 5 days. Data are presented as the mean ± SEM of n = 3-4 biological replicates and are representative of at least two independent experiments. *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001. P values in (A), (B), (C) and (F) were calculated using one-way ANOVA followed by unpaired, two-tailed Student’s t test with Bonferroni correction, and in (D) and (E) with two-way ANOVA.

(A) Immunoblot of phospholipases in MCF10A PIK3CA WT and MUT cells following serum and growth-factor deprivation for 16 hours and stimulation with serum and growth factors for 30 min. (B) Immunoblot analysis of cPLA2 protein decay following treatment with 50 μM cycloheximide (CHX) for the indicated times. Image and quantification are from one experiment. (C) Real-time quantitative PCR of PLA2G4A expression in the MCF10A PIK3CA isogenic panel. (D) AA levels measured by REIMS in MCF10A E545K PIK3CA MUT cells following RAPTOR and RICTOR siRNA-mediated knockdown 48 hours post transfection under exogenous FAF conditions. ELISA analysis of (E) AA and (F) PGE2 in the MCF10A PIK3CA isogenic panel 48 hours post RICTOR siRNA-mediated knockdown. (G) Immunoblot analysis of cPLA2 protein decay (top) and quantification (bottom) following RICTOR siRNA-mediated knockdown and treatment with 50 μM cycloheximide for the indicated times. Image and quantification are from one experiment. (H) Immunoblot analysis of substrates of conventional PKCα/β (p-IκBα Ser32 and p-RB Thr821/826), novel PKCε (p-STAT3 Ser727 and p-PKD Ser744/748) and atypical PKCζ (p-RKIP Ser153 and p-cPLA2 T376) isoforms following treatment of MCF10A E545K and H1047R MUT cells with 1 μM of each PKCα, β, ε, and ζ peptide inhibitor for 72 hours. (I) Enzymatic activity of cPLA2, iPLA2, and sPLA2 in the MCF10A PIK3CA isogenic panel following treatment with 100 nM ASB14780 for 72 hours. (J) AA levels measured by REIMS in MCF10A E545K MUT cells treated with 100 nM ASB14780, 1 μM each of PKCα, β, ε, and ζ peptide inhibitors, 250 μM GSK650394, or 150 nM MK2206 for 72 hours under exogenous FAF conditions. (K) Immunoblot (right) of total PKCζ and phospho-S505 and T376 cPLA2 of MCF10A PIK3CA WT and MUT cells following PKCζ siRNA-mediated knockdown, and cPLA2 activity (left) 48 hours post-transfection. (L) Representative phospho-PKCζ Thr560 immunoreactivity images (left) of 9 PIK3CA MUT (blue) and 9 WT (red) breast PDX tumors. Scale bar = 250 μm. Quantification of percent positive regions (right) was performed using the IHC profiler plug-in for ImageJ. Data are presented as the mean ± SEM of n = 3-5 biological replicates and are representative of at least two independent experiments. n.s., not significant; *p ≤ 0.05; **p ≤ 0.01. P values in (C), (D), (E), (F), (I), (J) and (K, right) were calculated with one-way ANOVA followed by unpaired, two-tailed Student’s t test with Bonferroni correction, and in (L) with unpaired, two-tailed Student’s t test.

(A) Immunoblot analysis of phospho-Tyr783 PLCγ1 in MCF10A PIK3CA isogenics following serum and growth factor deprivation for 16 hours, and stimulation with serum and growth factors for 30 min. Densitometry values are either scaled to unstimulated or stimulated (bold) WT samples. (B) Measurement of intracellular calcium flux in MCF10A PIK3CA isogenics following serum and growth factor deprivation and stimulation for 30 min. (C) Immunoblot of MCF10A PIK3CA WT and MUT cells following siRNA-mediated knockdown of PLCγ1. (D) Intracellular calcium flux of MCF10A PIK3CA WT (top), E545K (middle) and H1047R (bottom) cells
48 hours post transfection with siPLCγ1, or (E) treatment with 2 μM U73122 for 24 hours. For the final 18 hours of the treatments, cells were serum and growth factor deprived, and stimulated with full media immediately prior to the assay. cPLA2 activity in MCF10A PIK3CA WT and MUT following (F) siRNA-mediated knockdown of PLCγ1 for 48 hours, or (G) treatment with 2 μM U73122 for 24 hours. (H) AA levels measured by REIMS in MCF10A PIK3CA isogenics following treatment with 2 μM U73122 for 24 hours. (I) Representative confocal images and (J) quantification of in situ proximity ligation assay (PLA) between cPLA2 and phospho-Thr560 PKCζ in MCF10A PIK3CA WT and MUT cells. (K) Immunoblot analysis of phospho-cPLA2 (T376) custom antibody in the MCF10A isogenic panel following serum and growth factor deprivation for 18 hours and subsequent stimulation for 30 min (left), treatment with 1 μM PKCζ peptide inhibitor for 72 hours (middle), and in MCF10A H1047R cPLA2 CRISPR knockout cells overexpressing a phosphoresistant mutant (T376A) cPLA2 (right). (L) Activity of cPLA2 in MCF10A PIK3CA WT or H1047R cPLA2 CRISPR knockout cells transfected with 9 μg of either WT-cPLA2, or S505A/T376A phosphoresistant mutant cPLA2 constructs. Activity was measured 48 hours post-transfection. Cell proliferation of MCF10A (M) PIK3CA WT and (N) E545K MUT cells expressing control shGFP, cPLA2-sh1 or sh5 under exogenous FAF conditions. Sulforhodamine B (SRB) protein staining was used to measure cell proliferation over 5 days. Data in (B), (D), (E), (F), (G), (H), (L), (M) and (N) are presented as the mean ± SEM of n = 3-6 biological replicates and are representative of at least two independent experiments. n.s., not significant; *p < 0.05; **p < 0.01; ***p < 0.001. P values in (B), (D), (E), (M) and (N) were calculated using two-way ANOVA. One-way ANOVA followed by Student’s t test with Bonferroni correction was used for (F), (G), (H), (J), and (L).

(A) Relative tumor growth and (B) tumor weights of CAL51 (PIK3CA MUT)-derived xenografts stably expressing control shGFP or two independent shRNAs
targeting cPLA2 (cPLA2-sh1 and cPLA2-sh5) in mice fed a balanced omega3:omega6 diet. (C) Relative tumor growth and (D) tumor weights of Hs578T (PIK3CA
WT)-derived xenografts stably expressing control shGFP or cPLA2-sh1 or cPLA2-sh5 in mice fed a balanced omega3:omega6 diet. AA levels measured by
REIMS in (E) PIK3CA MUT (CAL51) and (F) PIK3CA WT (Hs578T) snap frozen excised tumors. AA intensities are reported as scaled values to the appropriate
shGFP-fat free diet control. Quantification of CCL5 from excised tumors derived from (G) PDX and (H) cell line-derived xenograft studies. Quantification of
CX3CL1 from excised tumors derived from (I) PDX and (J) cell line-derived xenograft studies. Concentrations of chemokines in (G-J) were determined from whole
tumor lysates using ELISA, and normalized to protein content. (K) Representative immunohistochemical staining of the activated NK cell marker NKp46 in BR1282
(PIK3CA MUT) and BR1458 (PIK3CA WT) PDX tumors. (L) and (M) Quantification of positively immunostained areas from (K). (N) Representative immunohis-
tochemical staining of NKp46 in shGFP, cPLA2-sh1 and cPLA2-sh5 expressing CAL51 (PIK3CA MUT) and Hs578T (PIK3CA WT)-derived xenograft tumors under
fat free or ‘Western’ diets. (O) and (P) Quantification of positively immunostained areas from (N). Data in (A), (C), and (E) to (J) are presented as the mean ± SEM of
n = 3–5 mice for cell line xenograft or n = 7–8 mice for PDX studies. n.s., not significant; *p < 0.05; **p < 0.01; ***p < 0.001; P values in (A) and (C) were calculated
using two-way ANOVA, and one-way ANOVA followed by unpaired, two-tailed Student’s t test with Bonferroni correction was used in (B), (D), (E), (F), (G), (H), (I), (J),
(L), (M), (O), (P).
Article
Intratumoral CD4+ T Cells Mediate Anti-tumor Cytotoxicity in Human Bladder Cancer
David Y. Oh, Serena S. Kwek,
Siddharth S. Raju, ...,
Terence W. Friedlander, Chun Jimmie Ye,
Lawrence Fong
Correspondence
jimmie.ye@ucsf.edu (C.J.Y.),
lawrence.fong@ucsf.edu (L.F.)
In Brief
Single-cell RNA and paired T cell receptor
sequencing highlights enrichment of
cytotoxic CD4+ T cells rather than CD8+
T cells in human bladder cancer. These
CD4+ T cells are capable of killing
autologous tumor cells and are subjected
to inhibition by Tregs.
Highlights
• Human bladder tumors contain multiple clonally expanded cytotoxic CD4+ T cell states
• Cytotoxic CD4+ T cells can kill autologous tumors in an MHC class II-dependent fashion
• Autologous regulatory T cells can inhibit the activity of cytotoxic CD4+ T cells
• A cytotoxic CD4+ gene signature predicts response to anti-PD-L1 in bladder cancer
Oh et al., 2020, Cell 181, 1612–1625

Article
Intratumoral CD4+ T Cells Mediate Anti-tumor
Cytotoxicity in Human Bladder Cancer
David Y. Oh,1,11 Serena S. Kwek,1,11 Siddharth S. Raju,3,6,8 Tony Li,1 Elizabeth McCarthy,3 Eric Chow,4 Dvir Aran,5
Arielle Ilano,1 Chien-Chun Steven Pai,1,12 Chiara Rancan,1 Kathryn Allaire,1 Arun Burra,1 Yang Sun,3
Matthew H. Spitzer,2,6,8 Serghei Mangul,9 Sima Porten,7 Maxwell V. Meng,7 Terence W. Friedlander,1
Chun Jimmie Ye,2,3,5,10,* and Lawrence Fong1,2,13,*
1Division of Hematology/Oncology, Department of Medicine, University of California, San Francisco, San Francisco, CA 94143, USA
2Parker Institute for Cancer Immunotherapy, University of California, San Francisco, San Francisco, CA 94143, USA
3Division of Rheumatology, Department of Medicine; Department of Epidemiology and Biostatistics; and Institute for Human Genetics,
University of California, San Francisco, San Francisco, CA 94143, USA

4Department of Biochemistry and Biophysics, Center for Advanced Technologies, University of California, San Francisco, San Francisco, CA
94143, USA
5Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA
6Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA 94143, USA
7Department of Urology, University of California, San Francisco, San Francisco, CA 94143, USA
8Department of Otolaryngology – Head and Neck Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
9Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA
10Chan Zuckerberg Biohub, San Francisco, CA, USA
12Present address: Department of Oncology Biomarker Development, Genentech, 1 DNA Way, South San Francisco, CA 94080, USA
13Lead Contact
*Correspondence: jimmie.ye@ucsf.edu (C.J.Y.), lawrence.fong@ucsf.edu (L.F.)

SUMMARY
Responses to anti-PD-1 immunotherapy occur but are infrequent in bladder cancer. The specific T cells that mediate tumor rejection are unknown. T cells from human bladder tumors and non-malignant tissue were assessed with single-cell RNA and paired T cell receptor (TCR) sequencing of 30,604 T cells from 7 patients. We find that the states and repertoires of CD8+ T cells are not distinct in tumors compared with non-malignant tissues. In contrast, single-cell analysis of CD4+ T cells demonstrates several tumor-specific states, including multiple distinct states of regulatory T cells. Surprisingly, we also find multiple cytotoxic CD4+ T cell states that are clonally expanded. These CD4+ T cells can kill autologous tumors in an MHC class II-dependent fashion and are suppressed by regulatory T cells. Further, a gene signature of cytotoxic CD4+ T cells in tumors predicts a clinical response in 244 metastatic bladder cancer patients treated with anti-PD-L1.
INTRODUCTION

Immunotherapies have changed the landscape of cancer treatment by producing durable and long-lasting responses through triggering anti-tumor cell-mediated immunity. In particular, checkpoint inhibitors (CPIs) targeting the immune inhibitory molecules CTLA-4 and PD-1 in T lymphocytes have been approved based on responses and improved overall survival in multiple malignancies, particularly those with a high mutational burden (Hodi et al., 2010; Herbst et al., 2014; Powles et al., 2014; Robert et al., 2015; Martincorena and Campbell, 2015; Cancer Genome Atlas Research Network, 2008). However, in specific malignancies, such as transitional cell carcinoma (TCC) of the bladder, CPIs as monotherapies are efficacious in only 20% of patients (Powles et al., 2014; Hargadon et al., 2018). This could be partly due to the heterogeneity of tumor-infiltrating T lymphocytes (TILs) and their differential ability to confer a therapeutic benefit upon treatment.
Currently, cytotoxic CD8+ T cells are the main focus of efforts to understand how immunotherapy elicits anti-tumor immunity. In melanoma, expression and chromatin state signatures of cytotoxicity and exhaustion (Tirosh et al., 2016; Philip et al., 2017; Ayers et al., 2017; Herbst et al., 2014) and the presence of CD8+ T cells at the tumor-invasive margin pre-treatment (Tumeh et al., 2014) are significantly correlated with subsequent responses to PD-1-directed therapy. However, in metastatic bladder TCC, where response rates to PD-1 blockade are 15%–20% in platinum chemotherapy-refractory patients and more than 20% in frontline platinum-ineligible patients, predictive biomarkers of response are unclear, including PD-L1 expression (Koshkin and Grivas, 2018). Recently, bulk RNA sequencing (RNA-seq) of the pre-treatment tumor microenvironment in TCC found that a higher score of CD8+ gene signature
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
and tumor mutational burden and, conversely, a lower score of transforming growth factor β (TGF-β) gene signature, particularly in immune-excluded tumors, were associated with response to the anti-PD-L1 agent atezolizumab (Mariathasan et al., 2018). However, the importance of heterogeneous subsets of TILs in TCC beyond canonical CD8+ cytotoxic and exhausted phenotypes in response to PD-1 blockade remains unexplored. In particular, the role of CD4+ T cells in controlling or enhancing TCC tumor growth remains largely unknown. Although regulatory CD4+ T cells (Tregs) in the TCC environment have been associated with adverse outcomes (Baras et al., 2016), and a CD4+ subset expressing the inducible costimulator (ICOS) that produces interferon-gamma (IFNγ) in response to anti-CTLA-4 therapy has been described in human bladder tumors (Liakou et al., 2008), the presence of other CD4+ T cell subsets that directly promote cell-mediated immunity through other effector mechanisms remains unclear. Detailed characterization of the T lymphocytes in the tumor is critically needed for precisely mapping the cells responsible for tumor recognition and control and defining predictive markers of response to CPI in bladder cancer.
To address these points, we interrogated the tumor microenvironment of patients with localized muscle-invasive bladder TCC who received or did not receive neoadjuvant anti-PD-L1 immunotherapy (atezolizumab, Genentech) prior to surgical resection. Droplet single-cell RNA-seq (dscRNA-seq) and paired T cell receptor sequencing (TCR-seq) of more than 30,000 CD4+ and CD8+ T cells from paired tumors and adjacent non-malignant tissues revealed heterogeneity in known CD4+ states, such as regulatory T cells, which were also enriched and clonally expanded in tumors. In addition, several states of cytotoxic CD4+ T cells expressing cytolytic effector proteins were identified, some of which are enriched in tumors. Cytotoxic CD4+ T cells were clonally expanded in tumors and could kill autologous tumors ex vivo. Cytotoxic CD4+ T cells existed in discrete proliferating and non-proliferating states in tumors. A gene signature of cytotoxic CD4+ T cells was predictive of a response to PD-1 blockade in an orthogonal RNA-seq dataset of metastatic bladder cancer patients treated with anti-PD-L1. Overall, these findings highlight the importance of CD4+ T cell heterogeneity and the relative balance between activation of cytotoxic CD4+ effectors and inhibitory regulatory cells for killing autologous tumors.

RESULTS

Canonical CD8+ T Cell States Were Not Enriched in the Bladder Tumor Microenvironment
To assess the T cell composition of the tumor environment, we profiled T cells from dissociated bladder tumors and adjacent uninvolved bladder tissues using single-cell RNA and TCR sequencing (see schematic in Figure S1A). We used the 10X Genomics Chromium platform (Zheng et al., 2017b) to sequence 8,833 tumor-derived and 1,929 non-malignant tissue-derived CD8+ T cells from 7 patients (Table S1). All samples were muscle-invasive bladder cancer (MIBC) from 2 standard-of-care-untreated patients (‘‘untreated’’), 1 chemotherapy-treated patient (gemcitabine + carboplatin, ‘‘chemo’’), and 4 anti-PD-L1-treated patients (‘‘anti-PD-L1’’) with detailed clinical annotations (Table S1). To assess the shared heterogeneity of T cells across samples, we restricted the analysis to highly variable genes and used an empirical Bayes approach (ComBat; Johnson et al., 2007; Büttner et al., 2019) to account for preparation batch among individual samples. We subsequently used Leiden clustering (Traag et al., 2019) to define clusters that were visualized using uniform manifold approximation and projection (UMAP) (McInnes and Healy, 2018). Tumor- and non-malignant-derived CD8+ T cells form 11 clusters, each populated by cells from all samples, suggestive of shared states in TCC regardless of the treatment regimen (Figure 1A; Figure S2A). Differential expression analyses comparing each cluster with all other cells combined identified 1,067 differentially expressed genes in at least one cluster (adjusted P value (Padj) < 0.05, |log2(fold change, FC)| > 0.5) (Table S2). The identified states include known CD8+ subtypes (Figures 1B and 1C): cells expressing HAVCR2 (TIM-3), LAG3, ENTPD1, as well as the chemokine CXCL13 (CD8ENTPD1: log2(FC) = 1.4–3.7), described previously as tumor-reactive CD8+ T cells (Duhen et al., 2018); effector cells expressing FGFBP2 and GNLY, a granule-associated pore-forming protein known to function in pathogen killing (Krensky and Clayberger, 2009) (CD8FGFBP2: log2(FC) = 3.6–5.3); naive cells expressing CCR7 and GZMK (CD8NAIVE: log2(FC) = 0.9–2.8); central memory cells expressing CCR7 and SELL (L-selectin) (CD8CM: log2(FC) = 1.5–1.7); and mucosal-associated invariant T (MAIT) cells expressing KLRB1 (CD8MAIT: log2(FC) = 2.7) that preferentially use the known semi-invariant TCR α chains TRAV1-2 and/or TRAJ33 (Kurioka et al., 2016; Figure 1D). Of note, we also found MKI67+ proliferating cells (CD8PROLIF: log2(FC) = 6.5) as well as cells expressing the chemokines XCL1/2 (CD8xcl: log2(FC) = 5.2–5.6). Similar states were also identified in the tumor environment of hepatocellular carcinoma based on scRNA-seq (Zheng et al., 2017a). Surprisingly, although the frequency of CD8ENTPD1 cells was higher in tumors, none of the CD8+ states displayed statistically significant differences in frequency between the tumor and non-malignant bladder (exact permutation test; Figure 1E; density plots in Figure 1F).

Tregs Included Heterogeneous States that Are Enriched in Bladder Tumors
Given the lack of tumor enrichment of CD8+ states and the higher frequency of CD4+ over CD8+ T cells in bladder tumors (Figure S1B), we investigated CD4+ T cell heterogeneity in a similar fashion to determine their contribution to anti-tumor responses. We sequenced and analyzed 16,995 tumor- and 2,847 non-malignant tissue-infiltrating CD4+ T cells isolated from the same patients. Tumor-derived and non-malignant tissue-derived CD4+ T cells formed 11 clusters, each with representation from all individual patients (Figure 2A; Figure S2B). We identified 1,511 differentially expressed genes in at least one cluster (Padj < 0.05, |log2(FC)| > 0.5; Table S2; Figures 2B and 2C) defining several canonical CD4+ T cell states. These include CCR7+ cells, which demonstrated a central memory phenotype (CD4CM) based on parallel flow cytometry data showing that these were CD45RA (see below) as well as cells expressing high levels of CXCL13 and IFNG (CD4CXCL13: log2(FC) = 5.9 and 1.4), which were also likely to be exhausted based on overexpression of TOX (log2(FC) = 1.9) (Yao et al., 2019) and whose presence has
Cell 181, 1612–1625, June 25, 2020 1613

Figure 1. Bladder Cancer Contains Canonical CD8+ T Cell States

(A) Uniform manifold approximation and projection (UMAP) plots of 10,762 single sorted CD3+ CD8+ T cells obtained from bladder tumors and adjacent non-
malignant tissue (N = 7 patients). Phenotypic clusters are represented in distinct colors.
(B) Relative intensity of expression of select genes superimposed on the UMAP projections in (A).
(C) Violin plots showing the relative expression of select differentially expressed genes (columns) for each cluster shown in (A) (rows) (all Padj < 0.05).
(D) The frequency of cells expressing MAIT-associated TRAV1-2/TRAJ33+ TCRs within each defined CD8+ phenotypic cluster.
(E) The frequency of cells in individual clusters shown as a proportion of total CD8+ cells within tumor or non-malignant compartments across all patients (orange,
tumor; blue, non-malignant). For each cluster, a box and whisker plot is shown with the median, interquartile range (IQR, a box with lower and upper bounds
representing 25th and 75th percentiles, respectively), and 1.5 times the IQR (whiskers). Outlier points are shown if more than 1.5 times the IQR beyond the lower
and upper quartiles. Statistical testing was done using an exact permutation test.
(F) Density plots showing distribution of cells in tumor or non-malignant samples.
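The exact permutation test named in (E) can be illustrated with a minimal Python sketch. It enumerates every relabeling of the pooled per-patient frequencies and asks how often the difference in group means is at least as large as the observed one. The unpaired, difference-in-means form, the function name, and the example values are assumptions made for illustration; the test statistic actually used, and whether patient pairing was taken into account, are not specified here.

```python
from itertools import combinations


def exact_permutation_test(group_a, group_b):
    """Two-sided exact permutation p value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    count = extreme = 0
    # Enumerate every way of assigning n_a of the pooled values to group A
    for idx in combinations(range(len(pooled)), n_a):
        count += 1
        if abs_mean_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            extreme += 1
    return extreme / count


# Hypothetical cluster frequencies (% of CD8+ cells) in two compartments
p_value = exact_permutation_test([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

With 7 values per group there are 3,432 relabelings, so full enumeration is inexpensive and no distributional assumption is needed, which suits small patient cohorts.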
been associated with improved outcomes in breast, gastric, and microsatellite-unstable colorectal carcinoma, which is an immune-responsive tumor (Schmidt et al., 2018; Gu-Trantien et al., 2013, 2017; Wei et al., 2018; Zhang et al., 2018). Other states included Th17 cells expressing IL17A (CD4TH17: log2(FC) = 4.7), which represented important anti-tumor effectors (Kryczek et al., 2011); activated cells expressing CD69 (CD4ACTIVATED: log2(FC) = 2.2) but not FOXP3 (log2(FC) < 0.5) (Figures 2B and 2C); as well as several important additional states described in further detail below. Notably, some of these states were selectively enriched in specific compartments. CD4CXCL13 demonstrated significant enrichment in tumor compared with non-malignant tissue (tumor versus non-malignant: 6.5% versus 3.0%, p = 0.015, exact permutation test), whereas states enriched in non-malignant tissue included CD4CM (tumor versus non-malignant: 30% versus 42%, p = 0.008) and CD4ACTIVATED (tumor versus non-malignant: 7.5% versus 10%, p = 0.02) (density plots in Figure 2D; tumor and non-malignant frequencies in Figure 2E).
Tregs were abundant constituents of the bladder tumor microenvironment with demonstrated heterogeneity. We found two states of Tregs (CD4IL2RAHI and CD4IL2RALO), together constituting 26% ± 1.9% (mean ± SEM) of tumor-infiltrating CD4+ cells, which co-expressed FOXP3 (CD4IL2RAHI: log2(FC) = 2.7; CD4IL2RALO: log2(FC) = 1.2) and known immune checkpoints, including IL2RA, TIGIT, TNFRSF4/9/18, and CD27 (Philip et al., 2017; Zheng et al., 2017a; Plitas et al., 2016; De Simone et al., 2016) (CD4IL2RAHI and CD4IL2RALO: log2(FC) > 0.65; Figures 2B, 2C, and 3A). With the exception of TIGIT, these immune checkpoints are minimally expressed by other CD4+ states, such as CD4CM (Figure 3A). The two Treg states were distinguished by higher expression of IL2RA, TNFRSF4, TNFRSF9, and TNFRSF18 in CD4IL2RAHI cells (CD4IL2RAHI: log2(FC) = 2.5–3.6; CD4IL2RALO: log2(FC) = 0.4–1.6; Figure 3A; Table S2). Of note, both Treg states were significantly enriched in tumor compared with adjacent non-malignant tissue (CD4IL2RAHI: 14.3% versus 4.6%, p = 0.002; CD4IL2RALO: 11.1% versus 6.7%, p = 0.002; exact permutation test; Figure 2E). We confirmed, by flow cytometry from 7 additional bladder tumors, that multiple tumors contained distinct regulatory states that expressed graded protein levels of IL2RA and co-expressed significantly different levels of immune checkpoints, such as TNFRSF18 (p < 0.05 for TNFRSF18 expression in FOXP3+ CD25low versus CD25hi
1614 Cell 181, 1612–1625, June 25, 2020

Figure 2. CD4+ T Cells in Bladder Tumors Are Composed of Multiple Distinct Functional States
(A) UMAP plots of 19,842 single sorted CD3+ CD4+ T cells obtained from bladder tumors and adjacent non-malignant tissue (N = 7 patients). Each distinct
phenotypic cluster identified using Leiden clustering is identified with a distinct color. Annotation of each unbiased cluster was performed by manual inspection of
the highest-ranked differentially expressed genes for each cluster and using reference signature-based correlation methods (SingleR) as described in the text.
(B) Relative intensity of expression of select genes superimposed on the UMAP projections shown in (A).
(C) Violin plot showing relative expression of select differentially expressed genes (columns) for each cluster shown in (A) (rows) (all Padj < 0.05).
(D) Density plots showing distribution of cells in tumor or non-malignant samples.
(E) The frequency of cells in individual CD4+ T cell states defined by scRNA-seq clustering is shown as a proportion of total CD4+ cells within either tumor or non-
malignant compartments across all patients (orange, tumor; blue, non-malignant). A box and whisker plot is shown with formatting as in Figure 1E. *p < 0.05,
**p < 0.01 by exact permutation test.
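The exact permutation test named in (E) can be illustrated with a minimal Python sketch. It assumes an unpaired, two-sided comparison of group means over per-patient frequencies; the function name and the example values are hypothetical and are not taken from the study's analysis code.

```python
from itertools import combinations


def exact_permutation_test(group_a, group_b):
    """Two-sided exact permutation p value for a difference in means.

    Every way of relabeling the pooled values into groups of the original
    sizes is enumerated, so the p value is exact rather than sampled.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def mean_diff(sum_a):
        # Difference in means implied by the sum of the relabeled group A.
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_diff(sum(group_a)))
    extreme = 0
    n_relabelings = 0
    for idx in combinations(range(len(pooled)), n_a):
        n_relabelings += 1
        # Small tolerance so floating-point ties count as "at least as extreme".
        if abs(mean_diff(sum(pooled[i] for i in idx))) >= observed - 1e-12:
            extreme += 1
    return extreme / n_relabelings


if __name__ == "__main__":
    # Hypothetical per-patient frequencies (%) of one CD4+ state.
    tumor = [6.1, 7.0, 5.8, 6.9, 7.4, 6.2, 6.3]
    non_malignant = [3.2, 2.9, 3.5, 2.7, 3.1, 2.8]
    print(exact_permutation_test(tumor, non_malignant))
```

With 7 tumor and 6 non-malignant samples there are only 1,716 relabelings, so full enumeration is cheap; a paired design would instead enumerate sign flips of the within-patient differences.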
populations by Wilcoxon signed-rank test; Figure 3B; gating Ranger), this approach yielded 11,081 CD4+ T cells and 5,779
strategy in Figures S1C–S1D). This heterogeneity may be conse- CD8+ T cells with paired TRA and TRB CDR3 sequences
quential because Tregs expressing higher levels of immune (49% and 47% recovery, respectively; summary in Table S3).
checkpoints have been shown to be correlated with poorer out- These results are consistent with expected frequencies based
comes in non-small cell lung cancer (Guo et al., 2018). Both reg- on the average recovery of individual TRA (CD4+, 54%; CD8+,
ulatory states also demonstrated a common tumor-specific gene 50%) and TRB (CD4+, 68%; CD8+, 67%) sequences across
expression program that included several heat shock proteins whitelisted cells. Overall, the TCR repertoire was more
compared with non-malignant tissue (Figure S2C; Table S2). restricted in the tumor microenvironment than in adjacent
non-malignant tissue based on two analyses. First, in intratu-
Tregs Are Clonally Expanded in Bladder Tumors moral CD4+ T cells, 10.8% ± 1.6% of unique clonotypes are
To query the TCR sequence in the same single cells for which shared by 2 or more cells; this degree of sharing was signifi-
we obtained whole-transcriptome data, we PCR-amplified and cantly greater than in the non-malignant compartment (5.1%
sequenced to saturation the complementarity-determining re- ± 1.6%, unpaired t test, p = 0.033) and is not seen in blood
gion 3 (CDR3) of the TCR alpha (TRA) and beta (TRB) loci from healthy donors (0.12%–0.16%) or from publicly available
from the barcoded full-length cDNA library (primers in Table reference circulating CD4+ T cell data (0%) (Figure S3A). Sec-
S3). After filtering for matching whitelisted cell barcodes (Cell ond, we observed skewing of the intratumoral CD4+ T cell

Figure 3. Regulatory CD4+ T Cells Are Heterogeneous, Enriched, and Clonally Expanded in Bladder Tumors
(A) Heatmap showing the expression of select regulatory T cell marker genes (rows) for individual single cells (columns) within the CD4IL2RAHI and CD4IL2RALO
clusters compared with the CD4CM cluster. Cells were grouped based on their annotations by tissue (tumor or non-malignant), treatment, and patient. Log2-
transformed expression of each gene was row scaled.
(B) Flow cytometry staining of CD4+ FOXP3+ TILs from a bladder tumor, showing the gating strategy for CD25neg, CD25low, and CD25hi (top left), and histograms of
TNFRSF18 staining from each CD25 gate (top right). Mean fluorescence intensity of TNFRSF18 and percent TNFRSF18+ from the parental gate are shown for
CD25 gates across samples (N = 7 tumors, mean ± SEM). *p < 0.05 by Wilcoxon signed-rank test.
(C) Gini coefficients for regulatory populations (CD4IL2RAHI and CD4IL2RALO, red labels at far left) and other CD4+ T cell populations within tumor and non-malignant
compartments across all samples. For each cluster, a box and whisker plot is shown with the median, IQR (box), and 1.5 times the IQR (whiskers), with outliers
exceeding 1.5 times the IQR beyond lower and upper quartiles. *p < 0.05, **p < 0.01 by exact permutation test. N = 7 tumor samples and 6 non-malignant
samples.
(D) Left: single cells expressing the top 3 most expanded clonotypes found in the combined regulatory populations (CD4IL2RAHI and CD4IL2RALO) are shown in red
in the same UMAP space as in Figure 2A. The regions composed of regulatory, cytotoxic, and proliferating T cells are outlined and superimposed on the UMAP
projection. Right: density plots for total CD4+ T cell distribution within tumor and non-malignant compartments are reproduced from Figure 2D for ease of visual
comparison.
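The two repertoire-restriction measures used in this section, the fraction of unique clonotypes shared by 2 or more cells and the Gini coefficient, can be sketched as follows. This is a minimal illustration that assumes each cell carries a single clonotype label and that the Gini coefficient is computed over clonotype sizes; the function names and example labels are hypothetical, and the study's exact implementation may differ.

```python
from collections import Counter


def clonotype_sizes(cell_clonotypes):
    """Number of cells carrying each unique clonotype."""
    return list(Counter(cell_clonotypes).values())


def shared_fraction(sizes):
    """Fraction of unique clonotypes found in 2 or more cells."""
    if not sizes:
        return 0.0
    return sum(1 for s in sizes if s >= 2) / len(sizes)


def gini(sizes):
    """Gini coefficient of clonotype sizes.

    0 means every clonotype has the same number of cells; values closer
    to 1 mean a few clonotypes account for most cells.
    """
    xs = sorted(sizes)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum form of the Gini coefficient for sorted values.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    # Hypothetical clonotype labels, one per cell.
    cells = ["c1", "c1", "c1", "c2", "c3", "c4", "c4", "c5"]
    sizes = clonotype_sizes(cells)
    print(shared_fraction(sizes), gini(sizes))
```

A compartment in which every clonotype is seen once gives a Gini coefficient of 0, which matches the values reported for several states in non-malignant tissue.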
repertoire toward an increased cumulative frequency of clono- When we assigned TCR sequences to cells with cluster
types over fewer cells (Figure S3B) and a corresponding higher identities (9,770 CD4+ and 5,151 CD8+ T cells with a paired
Gini coefficient (0.21 for tumor versus 0.05 for non-malignant TRA/TRB had an assigned phenotypic cluster or 49% and
tissue, Wilcoxon signed-rank test with Benjamini-Hochberg 48% of all T cells with assigned clusters, respectively; merged
correction, p = 0.009; Figure S3C) compared with the non-ma- TCR sequences and phenotypic clusters for CD4+ and CD8+
lignant compartment and healthy controls. T cells in Table S4), we found that clonal expansion of Tregs
contributes to intratumoral CD4+ T cell repertoire restriction. tumor-infiltrating CD4+ T cells. CD4GZMB and CD4GZMK cytotoxic
Compared with paired non-malignant tissue, both regulatory cells expressed a core set of cytolytic effector molecules (log2(FC)
states exhibited increased Gini coefficients in tumors > 0.5, Padj < 0.05): GZMA (granzyme A), GZMB (granzyme B),
(CD4IL2RAHI: Ginitumor 0.17 versus Gininormal 0, p = 0.003; and NKG7 (a granule protein that translocates to the surface of
CD4IL2RALO: Ginitumor 0.06 versus Gininormal 0.003, p = 0.009; natural killer (NK) cells following target cell recognition; Medley
exact permutation test; Figure 3C). The most expanded clono- et al., 1996) (Figures 2B, 2C, and 4A; Table S2). Each cytotoxic
types within the Tregs were specific to regulatory cells but not CD4+ state was distinguished by the expression of specific
other cell states (all single cells expressing the top expanded effector molecules. CD4GZMB cells co-expressed high levels of
regulatory clonotypes are shown in Figure 3D). The CXCL13-expressing GZMB, the pore-forming protein PRF1 (perforin), and the
state CD4CXCL13 (discussed in greater detail below) granule-associated proteins GNLY and NKG7 (CD4GZMB: log2(FC)
was also restricted in tumors (Ginitumor 0.07 versus Gininormal = 5.7, 3.4, 5.1, and 4.4, respectively), whereas CD4GZMK cells
0, p = 0.02, exact permutation test; Figure 3C). Gini coeffi- co-expressed high levels of the distinct granzyme GZMK and
cients for CD4+ states did not differ significantly by anti-PD- lower levels of NKG7 (CD4GZMK: log2(FC) = 6.3 and 3.9) (Fig-
L1 treatment (Figure S3G). In contrast, although repertoire ure 4A; Table S2). These shared cytolytic molecules were not ex-
restriction was also seen in CD8+ T cells from the same pressed by other CD4+ states, including regulatory and central
samples, this was observed in both tumor (percent unique clo- memory T cells (Figure 4A). Cytotoxic CD4+ cells co-expressed
notypes shared between cells: 15.1% ± 1.1%; Ginitumor: additional molecules, which may further contribute to anti-tumor
0.36% ± 0.04%) and non-malignant compartments (percent effector function. Notably, IFNG was expressed by both cyto-
unique clonotypes shared between cells: 14.6% ± 0.2%; toxic states, which may contribute to tumor cell death, including
Gininormal: 0.39% ± 0.06; Figures S3D–S3F). Furthermore, no ferroptosis (Wang et al., 2019) (CD4GZMB and CD4GZMK: log2(FC)
significant increase in Gini coefficient in tumor over non-malignant = 2.1). Of note, the minority of CD4GZMB cells that expressed
tissue was seen for any CD8+ state, including with IFNG appeared to also express TNF as well as specific immune
anti-PD-L1 treatment (Figures S3H and S3I). Hence, an important checkpoints, such as PDCD1, LAG3, and HAVCR2 (TIM3) (Figure 4A).
contributor to increased repertoire restriction of tumor-infiltrating A larger proportion of CD4GZMB cells expressed
CD4+ over non-malignant tissue, which was not CXCR6 (CD4GZMB: log2(FC) = 1.3; Figure 4A). This chemokine receptor
seen in the CD8+ compartment, involved clonal expansion of is expressed in regulatory and non-regulatory CD4+ TILs from
several distinct regulatory T cell states that differed in their colorectal carcinoma, nasopharyngeal carcinoma, and renal
levels of immune checkpoint expression, which may be cell carcinoma and, together with its ligand CXCL16, can
driven by tumor-associated antigens and the tumor-specific mediate TIL chemotaxis (Löfroos et al., 2017; Parsonage et al.,
microenvironment. 2012; Oldham et al., 2012). Finally, CD4GZMB and CD4GZMK cells
did not express high levels of other checkpoints associated with
Bladder Tumors Possessed Multiple Cytotoxic CD4+ T regulatory T cells, such as IL2RA, TIGIT, or TNFRSF4/9/18
Cell States (log2(FC) < 0.5; Figure 4A), nor did they express the exhaustion
In addition to regulatory states, we also found 2 distinct states of marker TOX (Table S2). Similar states were found with unbiased
cytotoxic CD4+ T cells in all samples constituting 15% ± 0.9% of clustering without batch correction for paired tumor- and
Figure 4. Multiple Cytotoxic CD4+ T Cell States Are Enriched and Clonally Expanded in Bladder Tumors and Possess Lytic Capacity against
Tumors
(A) Heatmap showing the expression of select cytotoxic or regulatory T cell marker genes (rows) for individual single cells (columns) within the cytotoxic CD4GZMB
and CD4GZMK clusters compared with regulatory (CD4IL2RAHI and CD4IL2RALO) and CD4CM clusters. Cells were grouped based on their annotations by tissue (tumor
or non-malignant), treatment, and patient. Log2-transformed expression of each gene was row scaled.
(B) Flow cytometry staining of GZMB, perforin, or GZMK in CCR7− CD4+ FOXP3− T cells.
(C) Percentage of cells expressing GZMB, GZMK, or perforin from CCR7− CD4+ FOXP3− T cells by flow cytometry (left) and the percentage of cells co-expressing
perforin within GZMB+ or GZMK+ CCR7− CD4+ FOXP3− T cells (right) (N = 7 tumors, mean + SEM).
(D) Representative flow cytometry staining of IFNγ and TNF-α expression in GZMB+ or GZMK+ CCR7− CD4+ FOXP3− T cells stimulated with PMA and ionomycin.
(E) Percentages of cells expressing IFNγ, TNF-α, or both from GZMB+ or GZMK+ CCR7− CD4+ FOXP3− T cells with and without stimulation (N = 11 tumors,
mean + SEM).
(F) Multiplex immunofluorescent staining of DAPI (blue), CD4 (immunohistochemistry, red), GZMK (RNAscope probe, green), and GZMB (RNAscope probe, white)
and overlay without DAPI from a cystectomy tumor region from a patient with parallel scRNA-seq and TCR-seq data (anti-PD-L1 C, top row) and from a cor-
responding tumor field with negative control staining (bottom row). CD4+ cells that co-express GZMK (arrows) or GZMB (arrowhead) are indicated. Scale
bar, 10 μm.
(G) The ratio of abundances of all regulatory T cell populations (CD4IL2RAHI and CD4IL2RALO) to all cytotoxic CD4+ populations (CD4GZMB and CD4GZMK) across all
tumor and non-malignant samples (mean + SEM shown; *p < 0.05 by unpaired t test, assuming unequal variance).
(H) Gini coefficients for each of the cytotoxic CD4+ populations within tumor and non-malignant compartments across all samples (box and whisker plot is shown
with formatting as in Figure 3C; *p < 0.05, **p < 0.01, exact permutation test, N = 7 tumor samples and 6 non-malignant samples).
(I) Left panel: quantitation of Annexin V+ apoptotic cells over time from a time-lapse cytotoxicity experiment with tumor cells cultured alone or with bulk CD4+ TILs
(CD4total) or CD4+ TILs depleted of regulatory T cells (CD4eff) at a 30:1 effector:target ratio. Right panel: CD4eff TILs and tumor cells (30:1 effector:target ratio) were
co-cultured with a pan-anti-MHC class II antibody or isotype control. All traces were from the same culture and cytotoxicity assay from the same patient. All traces
show relative change in cell death from time point 0. Cytotoxicity with CD4eff is representative of independent experiments from 4 different patients. Mean ± SEM
from multiple technical replicates for each experiment is shown.
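The comparison in (G), an unpaired t test assuming unequal variance applied to per-sample regulatory-to-cytotoxic ratios, can be sketched in Python as below. This is an illustrative sketch only: the function names and input values are hypothetical, and it stops at the Welch t statistic and its degrees of freedom because the standard library has no t-distribution CDF for the p value.

```python
from statistics import mean, variance


def regulatory_to_cytotoxic_ratio(regulatory_fraction, cytotoxic_fraction):
    """Ratio of regulatory to cytotoxic CD4+ abundance for one sample."""
    return regulatory_fraction / cytotoxic_fraction


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(a), len(b)
    # Squared standard errors of the two group means.
    se2_a = variance(a) / n_a
    se2_b = variance(b) / n_b
    t = (mean(a) - mean(b)) / (se2_a + se2_b) ** 0.5
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


if __name__ == "__main__":
    # Hypothetical per-sample ratios, not values from the study.
    tumor = [1.9, 2.1, 1.5, 1.7, 2.0, 1.6, 1.8]
    non_malignant = [1.0, 1.3, 0.9, 1.2, 1.1, 1.1]
    print(welch_t(tumor, non_malignant))
```

The resulting statistic and degrees of freedom would then be looked up against a t distribution, for example with a statistics package, to obtain a two-tailed p value.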

non-malignant-derived CD4+ cells from individual patients (Fig- tively, are assigned to effector memory CD8+ cell annotations),
ures S2D and S2E). reinforcing their cytotoxicity profile (Figure S2F). Finally, an inter-
We validated the presence and functional heterogeneity of nal comparison of the transcriptional profiles from CD4+ and
cytotoxic CD4+ T cells using several orthogonal and complementary CD8+ TIL clusters from our scRNA-seq data indicated that,
methods. The presence of cytotoxic although most CD4+ clusters are most similar to other CD4+ clusters,
CD4+ T cells with an effector memory (CCR7− CD45RA−) cytotoxic CD4+ T cells are an exception. CD4GZMB cytotoxic
or effector (CCR7− CD45RA+) phenotype that express GZMB, cells were most correlated with tumor-specific CD8ENTPD1 cells
GZMK, and perforin protein was confirmed by flow cytometry (Pearson correlation coefficient = 0.92), whereas CD4GZMK cytotoxic
in tumors from multiple independent replicate samples (N = 7 tumors; cells were most correlated with CD8CM and CD8NAIVE cells
Figure 4B; gating strategy in Figures S1C and S1D). (Pearson correlation coefficient = 0.98 for both) (Figure S2G). The
Across this sample set, 9% ± 2.9% (mean ± SEM) of CD4+ tumor-specific gene expression program of these cytotoxic
FOXP3− CCR7− cells expressed GZMB, whereas 16% ± 4.5% CD4+ cells was marked by heat shock protein expression in
expressed GZMK and 5.3% ± 2.6% expressed perforin (Figure 4C, both states as well as tumor overexpression of CXCL13 and
left panel), at lower frequencies than CCR7− CD8+ cytotoxic numerous immune checkpoints (TNFRSF18/LAG3/TIGIT/
cells from the same patients (Figures S1E–S1G). Importantly, HAVCR2) as well as ENTPD1 within CD4GZMB cells (Figure S2C;
25.9% ± 8.7% of GZMB+ CD4+ FOXP3− CCR7− and Table S2).
8.6% ± 3.5% of GZMK+ CD4+ FOXP3− CCR7− cytotoxic
T cells showed co-expression of perforin with granzymes, in Cytotoxic CD4+ T Cells Were Enriched and Clonally
agreement with the scRNA-seq data (Figure 4C, right panel); Expanded in Bladder Tumors
these frequencies of granzyme and perforin co-expression Of the 2 cytotoxic CD4+ states, CD4GZMK cells were significantly
were lower than those of CCR7 CD8+ cytotoxic cells from the enriched in abundance in tumors (CD4GZMK in tumor versus non-
same patients (Figures S1F and S1G). Importantly, CD45 malignant tissues: 7.2% ± 0.5% versus 5.0% ± 0.5%, exact per-
bladder tumor cells express multiple major histocompatibility mutation test, p = 0.01; Figure 2E). Overall, the CD4+ compart-
complex (MHC) class II molecules (data not shown), which would ment exhibited a bias toward regulatory over cytotoxic CD4+
allow antigen recognition by TCRs expressing CD4 as a co-re- T cells in tumors (regulatory CD4+/cytotoxic CD4+ ratio = 1.8 ±
ceptor. Flow cytometry of a separate set of 11 muscle-invasive 0.2) compared with non-malignant tissues, where proportions
bladder tumors confirms the functional capacity of cytotoxic of regulatory and cytotoxic CD4+ T cells were more balanced
CD4+ T cells to produce multiple cytokines. In agreement with (regulatory CD4+/cytotoxic CD4+ ratio = 1.1 ± 0.2, t test, p =
the scRNA-seq data, 56% ± 4.8% (mean ± SEM) of CD4+ 0.04; Figure 4G). Cytotoxic CD4+ T cell states contributed to intratumoral
CCR7− cells were polyfunctional and could produce both IFNγ CD4+ repertoire restriction. Both cytotoxic CD4+
and tumor necrosis factor alpha (TNF-α), whereas a minority of T cell states have significantly increased Gini coefficients in tumor
these cells only secrete IFNγ alone or TNF-α alone after stimulation compared with non-malignant tissues, with CD4GZMB representing
and, therefore, may demonstrate signs of exhaustion (IFNγ+ the more restricted cytotoxic state in tumors (CD4GZMB:
TNF-α−: 2.0% ± 0.76%; IFNγ− TNF-α+: 19% ± 3.3%) (Figures Ginitumor 0.21 versus Gininormal 0.06; CD4GZMK: Ginitumor 0.12
4D and 4E). The frequency of polyfunctional cytotoxic CD4+ versus Gininormal 0; exact permutation test, p = 0.04 for CD4GZMB
T cells was similar to stimulated CD8+ CCR7− T cells from the and p = 0.002 for CD4GZMK; Figure 4H). Hence, unbiased
same patients (IFNγ+ TNF-α+: 55% ± 6.3%), although CD8+ dscRNA-seq revealed that heterogeneous cytotoxic CD4+
CCR7− T cells that were monofunctional demonstrated an T cells, a subset of which are closely related to conventional
increased trend toward preferential IFNγ production alone over cytotoxic CD8+ T cells based on their functional program, are unexpected
TNF-α production compared with cytotoxic CD4+ T cells but frequent constituents of the bladder tumor microenvironment,
(IFNγ+ TNF-α−: 14% ± 4.7%; IFNγ− TNF-α+: 7.2% ± 2.1%) some of which are quantitatively enriched in tumors.
(Figure S1H). The tumor-specific clonal expansion of both cytotoxic
As further validation of the cytotoxic CD4+ T cell phenotype in CD4+ states suggests that their restricted repertoire may result
tissue, multiplex immunofluorescence tissue staining of bladder from recognition of MHC class II cognate antigens that may
tumor tissue from a patient in the scRNA-seq dataset demon- include bladder tumor antigens.
strated CD4+ T cells that also expressed GZMB or GZMK (Fig-
ure 4F, top row; tissue staining from an additional patient in Fig- Cytotoxic CD4+ T Cells Possessed Lytic Capacity
ure S1I) at levels not seen with negative control staining against Autologous Tumor Cells that Was Restricted by
(Figure 4F, bottom row). Autologous Tregs
Overall annotation of clusters from the scRNA-seq data was To validate the functional relevance of cytotoxic CD4+ in bladder
supported by an independent analysis that assigns each single tumors, we isolated CD4+ TILs by fluorescence-activated cell
cell to the best-known published immune subset profiled by sorting (FACS) and then cultured the cells ex vivo with inter-
bulk expression analysis after sorting (SingleR) (Aran et al., leukin-2 (IL-2). We then co-cultured these cells with autologous
2019). This corroborated the identification of Tregs (90% and tumor cells in an imaging-based time-lapse cytotoxicity assay,
78% of CD4IL2RAHI and CD4IL2RALO cells are assigned to Treg an- assessing for cell death with Annexin V. We found that expanded
notations, respectively) and further demonstrated that both cyto- CD4+ TILs were cytotoxic and could trigger increased tumor
toxic CD4+ states are most similar to CD8+ effector memory apoptosis (‘‘CD4total:tumor,’’ Figure 4I, left panel). However,
T cells (37% and 45% of CD4GZMB and CD4GZMK cells, respec- when we performed the same co-cultures but with CD4+ TILs

Figure 5. Proliferating CD4+ T Cells Contain Regulatory and Cytotoxic Cell States
(A) Heatmap showing expression of select cytotoxic, regulatory, and proliferating marker genes (rows) for individual single cells (columns) within the CD4PROLIF
cluster. Samples were hierarchically clustered. Log2-transformed expression of each gene was row scaled.
(B) Representative flow cytometry staining from a bladder tumor showing expression of CD25, GZMB, GZMK, and Ki67.
(C) Single cells expressing the top 3 most expanded clonotypes found in the CD4PROLIF T cell population are shown in red in the same UMAP space as in
Figure 2A. The regions composed of proliferating, regulatory, and cytotoxic T cells are outlined and superimposed on the UMAP projection for visualization.
(D) Left panel: pseudotime trajectories derived from all tumors (N = 7 samples) and non-malignant samples (N = 6 samples). Cells with expanded TCRs from the
proliferating (CD4PROLIF, green), regulatory (CD4IL2RAHI and CD4IL2RALO, shades of red), and cytotoxic (CD4GZMB and CD4GZMK, shades of purple) states were
used for this analysis. Specific branches corresponding to proliferating cytotoxic cells (top right), non-proliferating cytotoxic cells (bottom right), proliferating
regulatory cells (top left), and non-proliferating regulatory cells (bottom left) are labeled. Right panel: branches are color-coded according to the above prolif-
erating or non-proliferating identities. Also labeled are branch points that discriminate proliferating and non-proliferating cytotoxic CD4+ T cells (branch point 1)
and proliferating and non-proliferating regulatory T cells (branch point 2).

from the same patient that were depleted of Tregs, we found that confirmed the presence of discrete regulatory or cytotoxic pop-
killing was increased (‘‘CD4eff:tumor,’’ Figure 4I, left panel), indi- ulations of Ki67+ CD4+ T cells that co-expressed CD25, GZMB,
cating that autologous Tregs can inhibit the activity of cytotoxic or GZMK (Figure 5B). Across multiple independent samples,
CD4+ T cells. Significant tumor death was seen in co-cultures 4.7% ± 1.0% (mean ± SEM) of CD4+ FOXP3+ cells co-ex-
with CD4eff TILs compared with tumors alone (Figure 4I, left pressed Ki67 and CD25, whereas 1.2% ± 0.5% of CD4+
panel; representative of 3 independent experiments from FOXP3 CCR7 cells co-expressed Ki67 and GZMB, and
different patients). Furthermore, the cytotoxic activity of CD4eff 1.0% ± 0.1% of CD4+ FOXP3 CCR7 cells co-expressed
was at least partially dependent on MHC class II recognition Ki67 and GZMK (N = 7 tumors; Figure S1J). Proliferating
because tumor apoptosis was inhibited with pre-incubation Ki67+ GZMB+ cells are also seen, using flow cytometry, within
with a pan-anti-MHC class II antibody that was not seen with the CD8+ compartment of TCC patients (Figure S1K). Examina-
an isotype control antibody (Figure 4I, right panel; representative tion of exact TCR clonotype sharing of the most expanded
of 2 independent patients). Independent experiments with an CD4PROLIF clones identified sharing with regulatory and cyto-
alternative death indicator (Cytotox Red) confirmed increased toxic CD4+ T cells, further underscoring the contribution of
autologous tumor killing with tumor/CD4eff co-cultures (Figure S4A), each state to CD4PROLIF cells (Figure 5C).
MHC class II dependence of CD4eff killing (Figure S4B), Given that regulatory and cytotoxic CD4+ T cells were heterogeneous
as well as similar MHC class I-dependent autologous tumor and composed of cells that were proliferating to a different
killing with expanded CD8+ T cells (Figures S4C and S4D). extent, existing clusters may fail to resolve the separate contri-
Hence, flow cytometry and functional analyses from multiple in- bution of specific expression programs from subsets with
dependent patients confirmed not only that cytotoxic CD4+ different proliferative capacity. Hence, we used pseudotime
T cells expressed cytolytic proteins, such as granzymes and per- analysis to separate regulatory and cytotoxic cells into prolifer-
forin, in tumor tissue but that these cells can recognize bladder ating and non-proliferating components (Qiu et al., 2017). This
tumor antigens in an MHC class II-dependent fashion and analysis divided CD4PROLIF cells into two groups, each lying
were functionally competent to lyse autologous tumor cells in a along a branch specific for proliferating regulatory or cytotoxic
manner that can be suppressed by autologous Tregs. CD4+ T cells, with separate branches for non-proliferating regu-
latory and cytotoxic cells (Figure 5D). This underscored that reg-
Proliferating CD4+ T Cells Contained Regulatory and ulatory and cytotoxic CD4+ T cells consist of distinct proliferating
Cytotoxic Cells and non-proliferating states in TCC, based on transcriptomic
Induction of proliferating T cells can be beneficial for anti-tumor and clonotypic analyses.
immune responses. Proliferating CD4+ T cells are rapidly
induced in the periphery within weeks of initiating checkpoint A Signature of Cytotoxic CD4+ T Cells Predicts Clinical
blockade in prostate cancer patients (Kavanagh et al., 2008) Response to Anti-PD-L1
and in separate cohorts of thymic epithelial tumors and non- To assess the importance of the specific proliferating and non-
small cell lung cancer treated with anti-PD-1; a higher fold proliferating cytotoxic CD4+ T cell states for patient outcomes,
change in Ki67+ cells among PD-1+ CD8+ T cells in the periph- we performed branched expression analysis modeling (BEAM)
ery after a week was predictive of durable clinical benefit, pro- to identify all genes that were differentially expressed between
gression-free survival, and (in the non-small cell lung cancer branches at branchpoint 1 of the pseudotime trajectory. This
cohorts) overall survival (Kim et al., 2019). Within our tumor- branchpoint divided proliferating cytotoxic CD4+ T cells, non-
infiltrating CD4+ T cell compartment in TCC, we also identified proliferating cytotoxic CD4+ T cells, and all other regulatory cells
proliferating cells (CD4PROLIF) expressing MKI67, microtubule- (Figure 5D, right panel). Hierarchical clustering identified genes
associated markers (e.g., STMN1/TUBB), and DNA-binding upregulated preferentially in the proliferating cytotoxic branch
proteins associated with cell cycle progression, such as (cluster 7) or the non-proliferating cytotoxic branch (cluster 4)
PCNA, HMGB1, and HMGB2, which were expressed at lower but not in regulatory branches within this analysis (all genes
levels in regulatory and cytotoxic CD4+ T cells (CD4PROLIF: with q < 0.05; heatmap of clusters and branches in Figure 5E;
log2(FC) > 2.1; Figure 2C; Table S2). A similar signature was branch-specific signatures in Table S5). We developed a gene
also seen in the CD8+ compartment (CD8PROLIF; Figure 1C; Ta- signature from this analysis consisting of genes that were upre-
ble S2). Higher-resolution clustering revealed that this prolifer- gulated specifically in proliferating or non-proliferating cytotoxic
ating state is comprised of discrete groups of cells co-express- CD4+ T cells (from cluster 7: ABCB1; from cluster 4: APBA2,
ing regulatory or cytotoxic genes but not both simultaneously SLAMF7, GPR18, and PEG10; Figure 5E) but were not upregu-
(Figure 5A). Flow cytometry analysis of separate TCC samples lated in any of the CD8+ T cell states from our scRNA-seq
(E) Heatmap showing all differentially expressed genes (columns) between branches for branch point 1 across cells in the pseudotime analysis (rows). Cells are
grouped by their proliferating or non-proliferating branch assignments, color-coded at the right of the heatmap and corresponding to colors in (D). Genes are
grouped by color-coded clusters (1–8) shown at the top of the plot, which result from hierarchical clustering based on co-regulation in specific branches.
(F) Cytotoxic CD4+ T cell gene signature scores were plotted in clinical responders (complete response or partial response) versus non-responders (stable
disease or progressive disease) from baseline metastatic biopsies from bladder cancer patients with inflamed tumors on the IMvigor210 clinical trial (N = 62
tumors). The signature score was obtained from the IMvigor210 bulk RNA-seq dataset for the cytotoxic CD4+ T cell-specific genes derived from non-proliferating
(cluster 4) and proliferating (cluster 7) cytotoxic CD4+ clusters from the pseudotime analysis shown below the heatmap in (E). Median ± SEM is shown; *p = 0.037
by two-tailed t test.
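The text does not state how the signature score in (F) was computed from bulk RNA-seq. One common choice, shown here purely as an assumption, is the mean of per-gene z-scores across the signature genes for each tumor. The gene list is the one given in the text; the function name and the expression values are hypothetical.

```python
from statistics import mean, pstdev

# Signature genes named in the text (cluster 7: ABCB1; cluster 4: the rest).
SIGNATURE = ["ABCB1", "APBA2", "SLAMF7", "GPR18", "PEG10"]


def signature_scores(expr, genes=SIGNATURE):
    """Per-sample signature score as the mean z-score over signature genes.

    expr maps gene name -> list of expression values, one per sample,
    with samples in the same order for every gene. Genes that are missing
    or have zero variance are skipped.
    """
    if not expr:
        return []
    n_samples = len(next(iter(expr.values())))
    totals = [0.0] * n_samples
    used = 0
    for gene in genes:
        values = expr.get(gene)
        if values is None:
            continue
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue
        for i, v in enumerate(values):
            totals[i] += (v - mu) / sd
        used += 1
    return [t / used for t in totals] if used else totals


if __name__ == "__main__":
    # Hypothetical log-scale expression for four tumors.
    expr = {
        "ABCB1": [2.0, 3.5, 1.0, 4.0],
        "SLAMF7": [5.0, 6.5, 4.0, 7.0],
        "GPR18": [1.0, 2.0, 0.5, 2.5],
    }
    print(signature_scores(expr))
```

Scores computed this way could then be compared between responders and non-responders with a two-tailed t test, as in the caption; other scoring schemes, such as a plain mean of log expression, would be equally reasonable under these assumptions.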

analysis (Table S2). We then tested this gene signature’s ability Second, we identified heterogeneous states of cytotoxic CD4+
to predict treatment response using bulk RNA-seq data from T cells that were unexpected and differed in their expression of
pre-treatment tumors from a separate phase 2 trial of atezolizu- canonical cytolytic effector molecules (GZMB, GZMK, and
mab for metastatic bladder cancer (IMvigor210; Mariathasan PRF1 [perforin]) as well as other granule-associated proteins
et al., 2018). In 244 metastatic bladder cancer patients with (GNLY [granulysin] and NKG7) that may have roles in target
pre-treatment RNA-seq data, immunohistochemistry (IHC) infor- cell killing. These were distinct populations based on scRNA-
mation regarding immune phenotype (immune desert, immune- seq and orthogonal validation by flow cytometry and multiplex
excluded, or inflamed), and information regarding clinical immunofluorescence tissue staining. Our annotation using Sin-
response, this gene signature was significantly correlated with gleR indicated that effector states such as cytotoxic CD4+
clinical response to anti-PD-L1 therapy in inflamed samples T cells found in the tumor microenvironment may not yet be an-
(p = 0.037, two-tailed t test, N = 62 inflamed samples; Figure 5F), notated, and, based on ‘‘best fit’’ comparisons with external
which was not seen in samples with an immune-excluded or im- reference data and transcriptional correlation within our own
mune desert phenotype. Hence, we used a composite signature data, these cells were, in fact, most similar to conventional
containing genes that discriminated proliferating and non-prolif- effector memory cytotoxic CD8+ T cells. The functional similarity
erating cytotoxic CD4+ T cells to assess the specific contribu- between cytotoxic CD4+ T cells and conventional CD8+ T cells
tions of these discrete states and found that this signature is was underscored by our finding that CD4GZMB TILs were actually
associated with response to PD-L1 blockade in a large cohort of most similar to tumor-specific CD8ENTPD1 cells (Duhen et al.,
TCC patients. This result highlights the potential clinical impor- 2018), based on transcriptional data, whereas CD4GZMK TILs
tance of possessing intratumoral cytotoxic CD4+ T cell activity were most similar to CD8CM and CD8NAIVE cells. Although these
in response to anti-PD-L1 treatment. were distinct cell types, based on separate CD4 and CD8 co-re-
ceptor expression, this may indicate shared modes of tumor
DISCUSSION recognition and tumor clearance by cytotoxic CD4+ and CD8+
T cells. Although cytotoxic CD4+ T cells are present in non-small
Current efforts to dissect the mechanism of tumor immune sur- cell lung and hepatocellular carcinoma (Zheng et al., 2017a; Guo
veillance and enhance the efficacy of cancer immunotherapies et al., 2018), circulate with ipilimumab treatment in metastatic
have primarily focused on conventional cytotoxic CD8+ T cell- melanoma (Kitano et al., 2013), and also are present in an infec-
mediated responses. However, given the known functional di- tious context, where they represent a clonally expanded dengue
versity of CD4+ T cell effector responses and emerging data virus-specific effector subset (Patil et al., 2018), the extent of
that CD4+ T cell recognition may be important for anti-tumor re- their heterogeneity in other solid tumors (including bladder can-
sponses (for instance, in the context of a neoantigen vaccine; Ott cer) and whether these cells are important for systemic immuno-
et al., 2017; Sahin et al., 2017), the role of specific CD4+ states in therapy has remained unclear prior to this work. We found that
enhancing or suppressing immune responses in the tumor cytotoxic CD4+ subsets in bladder tumors were clonally
microenvironment and how these are modulated by systemic expanded, which may be the result of recognition of cognate
therapies, including immunotherapy, remains unknown. Here bladder tumor antigens. Their functional importance was
we use unbiased massively parallel genotypic and phenotypic confirmed by their ability to kill autologous tumors ex vivo. The
profiling of the T cell compartment in localized bladder tumors mechanism by which these cells kill target tumor cells involves
and the adjacent non-malignant compartment, including those contact-dependent mechanisms based on inhibition of killing
treated with anti-PD-L1 immunotherapy, as a tool to finely by anti-MHC class II antibodies, although other mechanisms
dissect heterogeneity in CD4+ T cell subsets. We identified spe- may also contribute. We documented that these cytotoxic
cific CD4+ T cell states with functional relevance for response to CD4+ T cells are polyfunctional and secrete multiple, such as
immunotherapy and clinical outcomes. We not only confirmed TNF-a and IFNg; the latter may contribute to tumor death as
the presence of CD4+ T cell states with known contributions to well through ferroptosis (Wang et al., 2019) in addition to con-
anti-tumor immune responses, such as CXCL13+ CD4+ T cells tact-dependent cytotoxicity. Of note, apart from the subset of
(Schmidt et al., 2018; Gu-Trantien et al., 2013, 2017; Wei et al., cells that co-express TNF-a, IFNg, PDCD1, LAG3, and HAVCR2,
2018; Zhang et al., 2018) as well as Th17 cells (Kryczek et al., cytotoxic CD4+ T cells are found to generally lack surface
2011), we also uncovered insights into the contribution of expression of many immune checkpoints currently being tested
CD4+ TILs to tumor control by the immune system in bladder with therapeutic antibodies in pre-clinical and clinical testing,
cancer. suggesting that these effector cells may have distinct require-
First we identified distinct states of Tregs that differed based ments for activation.
on the level of expression of IL2RA and immune checkpoints, Importantly, a gene signature derived from single-cell analysis
such as TNFRSF18, which was confirmed at the protein level. of proliferating and non-proliferating cytotoxic CD4+ T cells is
These Tregs possessed a private repertoire with no detected predictive of the response to anti-PD-L1 therapy in a separate
clonotype sharing with other T cell states, which would suggest set of 62 patients with inflamed metastatic bladder cancer.
that these are not induced Tregs. Because a gene signature from Most of these genes have been previously implicated in the
checkpoint-high Tregs is associated with worse outcome in non- biology of cytotoxic effector cells or specific human CD4+
small cell lung cancer (Guo et al., 2018), it is possible that these T cell responses in pathogenesis or autoimmunity, including hu-
regulatory cells are responsible for setting a basal state of more man cytotoxic CD4+ T cells (Arlehamn et al., 2014; Burel et al.,
potent immunosuppression and adverse outcomes in TCC. 2018; Campbell et al., 2018, Imbeault et al., 2012; Mattoo
1622 Cell 181, 1612–1625, June 25, 2020

et al., 2016; Sumida and Cyster, 2018; Wang et al., 2014). Overall, the predictive value of this cytotoxic CD4+ T cell-specific signature in a large cohort of anti-PD-L1-treated metastatic TCC patients highlights how anti-PD-L1 therapy may alter the immune microenvironment to favor activation of cytotoxic CD4+ effectors, particularly in patients with pre-existing cytotoxic CD4+ T cell activity.

The importance of the relative balance between regulatory and effector T cells is well known for conventional effectors; the regulatory CD4+:cytotoxic CD8+ ratio has been associated with improved survival or response to therapy in several cancers, including bladder (Preston et al., 2013; Sato et al., 2005; Baras et al., 2016; Takada et al., 2018). This work identifies the biological importance of another axis involving the relative balance of regulatory T cells and these cytotoxic CD4+ effectors for anti-tumor activity: removal of regulatory T cells enhanced tumor killing by cytotoxic CD4+ T cells. Our findings suggest that manipulating the balance between cytotoxic CD4+ and regulatory T cell states can lead to therapeutic benefit in TCC.

Finally, the origin of cytotoxic CD4+ T cell effectors within tumors remains unclear. We do not find direct evidence of plasticity or interconversion of regulatory T cells into cytotoxic CD4+ T cells, based on clonotype sharing. Cytotoxic CD4+ T cells do share clones with the proliferating CD4+ state, raising the possibility that these cells may arise from activation of other CD4+ subsets, whether within the tumor milieu itself or as a result of tumor homing of precursors, which are first activated outside of the tumor in the peripheral circulation.

There are important limitations to the interpretation of this study. The size of the sample set used for single-cell discovery of T cell heterogeneity was limited to 7 patients; hence, larger-scale single-cell sequencing efforts in bladder cancer will help to validate these findings. The treatments administered before collection were also heterogeneous; as a result, given the limitations in sample size, our ability to directly assess modulation of T cell subsets, such as cytotoxic CD4+ T cells, by immunotherapy in this dataset is limited. Finally, the scope of our findings in this dataset is limited to patients with MIBC; further efforts will be needed to assess the immune context in other bladder cancer disease states. Nonetheless, the robustness of our findings across the individual patients in this dataset highlights conserved CD4+ heterogeneity across patients, which is important for immune recognition of bladder tumors.

This work lays an important conceptual foundation for efforts to enhance bladder tumor immunotherapy. We identify cytotoxic CD4+ effectors whose distinct expression of cytolytic molecules and other marker genes will lead to further efforts to isolate and enhance the activity of specific cytotoxic subsets as well as to discover the bladder tumor antigens they are recognizing. At the same time, this work points to specific regulatory T cell states that may be more suppressive in bladder cancer and therefore represent ideal targets for parallel approaches to inhibit their activity. Collectively, our findings point to the importance of understanding multiple axes that balance suppressive regulatory T cell activity with effector function of anti-tumor immune subsets in TCC to enhance our ability to effectively manipulate these with therapeutic approaches to enhance tumor control.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead Contact
  ◦ Materials Availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
• METHOD DETAILS
  ◦ Tissue processing
  ◦ Flow cytometry/FACS
  ◦ Single cell RNA sequencing
  ◦ TCR sequencing
  ◦ Expression analysis
  ◦ TCR analysis
  ◦ Tumor infiltrating lymphocyte (TIL) isolation and culturing
  ◦ Cytotoxic T lymphocyte (CTL) killing assay
  ◦ Pseudotime analysis
  ◦ Gene signature analysis
  ◦ RNAscope/tissue immunofluorescence staining
• QUANTIFICATION AND STATISTICAL ANALYSIS
• ADDITIONAL RESOURCES

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.017.

ACKNOWLEDGMENTS

We thank the patients who volunteered to participate in these studies; the UCSF Genitourinary Medical Oncology and Urology providers involved in screening, enrollment, and clinical care of these patients; the UCSF Biospecimen Resources Program for assistance with tissue acquisition; and the Institute for Human Genetics Core for assistance with sequencing. This work was supported by the Parker Institute for Cancer Immunotherapy. D.Y.O. is supported by NIH T32CA177555 and K08AI139375, the Harry F. Bisel, MD Endowed Young Investigator Award from the Conquer Cancer Foundation of the American Society of Clinical Oncology, the Bladder Cancer Advocacy Network Palm Beach New Discoveries Young Investigator Award, and the Prostate Cancer Foundation Young Investigator Award. S.S.K. is supported by the Peter Michael Foundation. L.F. is supported by the Parker Institute for Cancer Immunotherapy; the Prostate Cancer Foundation; and NIH R01CA194511, R01CA223484, U01CA233100, and U01CA244452.

AUTHOR CONTRIBUTIONS

Study concept and design, D.Y.O., S.S.K., and L.F.; Acquisition of Patients, Samples, and Data, D.Y.O., S.S.K., A.I., C.-C.S.P., C.R., K.A., A.B., S.P., M.V.M., and T.W.F.; Data Analysis and Interpretation, D.Y.O., S.S.K., S.S.R., T.L., E.M., E.C., D.A., A.I., K.A., A.B., Y.S., M.H.S., S.M., C.J.Y., and L.F.; Writing of the Manuscript, D.Y.O., S.S.K., C.J.Y., and L.F.; Study Oversight, C.J.Y. and L.F.

D.Y.O. has received research support from Roche/Genentech and Merck and has served as a paid consultant for Maze Therapeutics. L.F. has received research support from Roche/Genentech, Abbvie, Bavarian Nordic, Bristol Myers Squibb, Janssen, and Merck. C.J.Y. is a co-founder of Dropprint Genomics.

Received: August 2, 2019
Revised: February 21, 2020
Accepted: May 8, 2020
Published: June 3, 2020

REFERENCES

Aran, D., Looney, A.P., Liu, L., Wu, E., Fong, V., Hsu, A., Chak, S., Naikawadi, R.P., Wolters, P.J., Abate, A.R., et al. (2019). Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 20, 163–172.
Arlehamn, C.L., Seumois, G., Gerasimova, A., Huang, C., Fu, Z., Yue, X., Sette, A., Vijayanand, P., and Peters, B. (2014). Transcriptional profile of tuberculosis antigen-specific T cells reveals novel multifunctional features. J. Immunol. 193, 2931–2940.
Ayers, M., Lunceford, J., Nebozhyn, M., Murphy, E., Loboda, A., Kaufman, D.R., Albright, A., Cheng, J.D., Kang, S.P., Shankaran, V., et al. (2017). IFN-γ-related mRNA profile predicts clinical response to PD-1 blockade. J. Clin. Invest. 127, 2930–2940.
Baras, A.S., Drake, C., Liu, J.J., Gandhi, N., Kates, M., Hoque, M.O., Meeker, A., Hahn, N., Taube, J.M., Schoenberg, M.P., et al. (2016). The ratio of CD8 to Treg tumor-infiltrating lymphocytes is associated with response to cisplatin-based neoadjuvant chemotherapy in patients with muscle invasive urothelial carcinoma of the bladder. OncoImmunology 5, e1134412.
Bolotin, D.A., Poslavsky, S., Mitrophanov, I., Shugay, M., Mamedov, I.Z., Putintseva, E.V., and Chudakov, D.M. (2015). MiXCR: software for comprehensive adaptive immunity profiling. Nat. Methods 12, 380–381.
Burel, J.G., Lindestam Arlehamn, C.S., Khan, N., Seumois, G., Greenbaum, J.A., Taplitz, R., Gilman, R.H., Saito, M., Vijayanand, P., Sette, A., and Peters, B. (2018). Transcriptomic Analysis of CD4+ T Cells Reveals Novel Immune Signatures of Latent Tuberculosis. J. Immunol. 200, 3283–3290.
Büttner, M., Miao, Z., Wolf, F.A., Teichmann, S.A., and Theis, F.J. (2019). A test metric for assessing single-cell RNA-seq batch correction. Nat. Methods 16, 43–49.
Campbell, K.S., Cohen, A.D., and Pazina, T. (2018). Mechanisms of NK Cell Activation and Clinical Activity of the Therapeutic SLAMF7 Antibody, Elotuzumab in Multiple Myeloma. Front. Immunol. 9, 2551.
Cancer Genome Atlas Research Network (2008). Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061–1068.
De Simone, M., Arrigoni, A., Rossetti, G., Gruarin, P., Ranzani, V., Politano, C., Bonnal, R.J.P., Provasi, E., Sarnicola, M.L., Panzeri, I., et al. (2016). Transcriptional Landscape of Human Tissue Lymphocytes Unveils Uniqueness of Tumor-Infiltrating T Regulatory Cells. Immunity 45, 1135–1147.
Duhen, T., Duhen, R., Montler, R., Moses, J., Moudgil, T., de Miranda, N.F., Goodall, C.P., Blair, T.C., Fox, B.A., McDermott, J.E., et al. (2018). Co-expression of CD39 and CD103 identifies tumor-reactive CD8 T cells in human solid tumors. Nat. Commun. 9, 2724.
Gu-Trantien, C., Loi, S., Garaud, S., Equeter, C., Libin, M., de Wind, A., Ravoet, M., Le Buanec, H., Sibille, C., Manfouo-Foutsop, G., et al. (2013). CD4+ follicular helper T cell infiltration predicts breast cancer survival. J. Clin. Invest. 123, 2873–2892.
Gu-Trantien, C., Migliori, E., Buisseret, L., de Wind, A., Brohée, S., Garaud, S., Noël, G., Chi, V.L.D., Lodewyckx, J.N., Naveaux, C., et al. (2017). CXCL13-producing TFH cells link immune suppression and adaptive memory in human breast cancer. JCI Insight 2, 91487.
Guo, X., Zhang, Y., Zheng, L., Zheng, C., Song, J., Zhang, Q., Kang, B., Liu, Z., Jin, L., Xing, R., et al. (2018). Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. Nat. Med. 24, 978–985.
Hargadon, K.M., Johnson, C.E., and Williams, C.J. (2018). Immune checkpoint blockade therapy for cancer: An overview of FDA-approved immune checkpoint inhibitors. Int. Immunopharmacol. 62, 29–39.
Herbst, R.S., Soria, J.C., Kowanetz, M., Fine, G.D., Hamid, O., Gordon, M.S., Sosman, J.A., McDermott, D.F., Powderly, J.D., Gettinger, S.N., et al. (2014). Predictive correlates of response to the anti-PD-L1 antibody MPDL3280A in cancer patients. Nature 515, 563–567.
Hodi, F.S., O’Day, S.J., McDermott, D.F., Weber, R.W., Sosman, J.A., Haanen, J.B., Gonzalez, R., Robert, C., Schadendorf, D., Hassel, J.C., et al. (2010). Improved survival with ipilimumab in patients with metastatic melanoma. N. Engl. J. Med. 363, 711–723.
Imbeault, M., Giguère, K., Ouellet, M., and Tremblay, M.J. (2012). Exon level transcriptomic profiling of HIV-1-infected CD4(+) T cells reveals virus-induced genes and host environment favorable for viral replication. PLoS Pathog. 8, e1002861.
Johnson, W.E., Li, C., and Rabinovic, A. (2007). Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127.
Kavanagh, B., O’Brien, S., Lee, D., Hou, Y., Weinberg, V., Rini, B., Allison, J.P., Small, E.J., and Fong, L. (2008). CTLA4 blockade expands FoxP3+ regulatory and activated effector CD4+ T cells in a dose-dependent fashion. Blood 112, 1175–1183.
Kim, K.H., Cho, J., Ku, B.M., Koh, J., Sun, J.M., Lee, S.H., Ahn, J.S., Cheon, J., Min, Y.J., Park, S.H., et al. (2019). The First-week Proliferative Response of Peripheral Blood PD-1+CD8+ T Cells Predicts the Response to Anti-PD-1 Therapy in Solid Tumors. Clin. Cancer Res. 25, 2144–2154.
Kitano, S., Tsuji, T., Liu, C., Hirschhorn-Cymerman, D., Kyi, C., Mu, Z., Allison, J.P., Gnjatic, S., Yuan, J.D., and Wolchok, J.D. (2013). Enhancement of tumor-reactive cytotoxic CD4+ T cell responses after ipilimumab treatment in four advanced melanoma patients. Cancer Immunol. Res. 1, 235–244.
Koshkin, V.S., and Grivas, P. (2018). Emerging Role of Immunotherapy in Advanced Urothelial Carcinoma. Curr. Oncol. Rep. 20, 48.
Krensky, A.M., and Clayberger, C. (2009). Biology and clinical relevance of granulysin. Tissue Antigens 73, 193–198.
Kryczek, I., Zhao, E., Liu, Y., Wang, Y., Vatan, L., Szeliga, W., Moyer, J., Klimczak, A., Lange, A., and Zou, W. (2011). Human TH17 cells are long-lived effector memory cells. Sci. Transl. Med. 3, 104ra100.
Kurioka, A., Walker, L.J., Klenerman, P., and Willberg, C.B. (2016). MAIT cells: new guardians of the liver. Clin. Transl. Immunology 5, e98.
Liakou, C.I., Kamat, A., Tang, D.N., Chen, H., Sun, J., Troncoso, P., Logothetis, C., and Sharma, P. (2008). CTLA-4 blockade increases IFNgamma-producing CD4+ICOShi cells to shift the ratio of effector to regulatory T cells in cancer patients. Proc. Natl. Acad. Sci. USA 105, 14987–14992.
Löfroos, A.B., Kadivar, M., Resic Lindehammer, S., and Marsal, J. (2017). Colorectal cancer-infiltrating T lymphocytes display a distinct chemokine receptor expression profile. Eur. J. Med. Res. 22, 40.
Mariathasan, S., Turley, S.J., Nickles, D., Castiglioni, A., Yuen, K., Wang, Y., Kadel, E.E., III, Koeppen, H., Astarita, J.L., Cubas, R., et al. (2018). TGFβ attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells. Nature 554, 544–548.
Martincorena, I., and Campbell, P.J. (2015). Somatic mutation in cancer and normal cells. Science 349, 1483–1489.
Mattoo, H., Mahajan, V.S., Maehara, T., Deshpande, V., Della-Torre, E., Wallace, Z.S., Kulikova, M., Drijvers, J.M., Daccache, J., Carruthers, M.N., et al. (2016). Clonal expansion of CD4(+) cytotoxic T lymphocytes in patients with IgG4-related disease. J. Allergy Clin. Immunol. 138, 825–838.
McInnes, L., and Healy, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv, arXiv: 1802.03426. https://arxiv.org/abs/1802.03426.
Medley, Q.G., Kedersha, N., O’Brien, S., Tian, Q., Schlossman, S.F., Streuli, M., and Anderson, P. (1996). Characterization of GMP-17, a granule membrane protein that moves to the plasma membrane of natural killer cells following target cell recognition. Proc. Natl. Acad. Sci. USA 93, 685–689.
Monaco, G., Lee, B., Xu, W., Mustafah, S., Hwang, Y.Y., Carré, C., Burdin, N., Visan, L., Ceccarelli, M., Poidinger, M., et al. (2019). RNA-Seq Signatures Normalized by mRNA Abundance Allow Absolute Deconvolution of Human Immune Cell Types. Cell Rep. 26, 1627–1640.e7.
Oldham, K.A., Parsonage, G., Bhatt, R.I., Wallace, D.M., Deshmukh, N., Chaudhri, S., Adams, D.H., and Lee, S.P. (2012). T lymphocyte recruitment into renal cell carcinoma tissue: a role for chemokine receptors CXCR3, CXCR6, CCR5, and CCR6. Eur. Urol. 61, 385–394.
Ott, P.A., Hu, Z., Keskin, D.B., Shukla, S.A., Sun, J., Bozym, D.J., Zhang, W., Luoma, A., Giobbie-Hurder, A., Peter, L., et al. (2017). An immunogenic personal neoantigen vaccine for patients with melanoma. Nature 547, 217–221.
Parsonage, G., Machado, L.R., Hui, J.W., McLarnon, A., Schmaler, T., Balasothy, M., To, K.F., Vlantis, A.C., van Hasselt, C.A., Lo, K.W., et al. (2012). CXCR6 and CCR5 localize T lymphocyte subsets in nasopharyngeal carcinoma. Am. J. Pathol. 180, 1215–1222.
Patil, V.S., Madrigal, A., Schmiedel, B.J., Clarke, J., O’Rourke, P., de Silva, A.D., Harris, E., Peters, B., Seumois, G., Weiskopf, D., et al. (2018). Precursors of human CD4+ cytotoxic T lymphocytes identified by single-cell transcriptome analysis. Sci. Immunol. 3, eaan8664.
Philip, M., Fairchild, L., Sun, L., Horste, E.L., Camara, S., Shakiba, M., Scott, A.C., Viale, A., Lauer, P., Merghoub, T., et al. (2017). Chromatin states define tumour-specific T cell dysfunction and reprogramming. Nature 545, 452–456.
Plitas, G., Konopacki, C., Wu, K., Bos, P.D., Morrow, M., Putintseva, E.V., Chudakov, D.M., and Rudensky, A.Y. (2016). Regulatory T Cells Exhibit Distinct Features in Human Breast Cancer. Immunity 45, 1122–1134.
Powles, T., Eder, J.P., Fine, G.D., Braiteh, F.S., Loriot, Y., Cruz, C., Bellmunt, J., Burris, H.A., Petrylak, D.P., Teng, S.L., et al. (2014). MPDL3280A (anti-PD-L1) treatment leads to clinical activity in metastatic bladder cancer. Nature 515, 558–562.
Preston, C.C., Maurer, M.J., Oberg, A.L., Visscher, D.W., Kalli, K.R., Hartmann, L.C., Goode, E.L., and Knutson, K.L. (2013). The ratios of CD8+ T cells to CD4+CD25+ FOXP3+ and FOXP3- T cells correlate with poor clinical outcome in human serous ovarian cancer. PLoS ONE 8, e80063.
Qiu, X., Mao, Q., Tang, Y., Wang, L., Chawla, R., Pliner, H.A., and Trapnell, C. (2017). Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982.
Robert, C., Long, G.V., Brady, B., Dutriaux, C., Maio, M., Mortier, L., Hassel, J.C., Rutkowski, P., McNeil, C., Kalinka-Warzocha, E., et al. (2015). Nivolumab in previously untreated melanoma without BRAF mutation. N. Engl. J. Med. 372, 320–330.
Sahin, U., Derhovanessian, E., Miller, M., Kloke, B.P., Simon, P., Löwer, M., Bukur, V., Tadmor, A.D., Luxemburger, U., Schrörs, B., et al. (2017). Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer. Nature 547, 222–226.
Sato, E., Olson, S.H., Ahn, J., Bundy, B., Nishikawa, H., Qian, F., Jungbluth, A.A., Frosina, D., Gnjatic, S., Ambrosone, C., et al. (2005). Intraepithelial CD8+ tumor-infiltrating lymphocytes and a high CD8+/regulatory T cell ratio are associated with favorable prognosis in ovarian cancer. Proc. Natl. Acad. Sci. USA 102, 18538–18543.
Schmidt, M., Weyer-Elberich, V., Hengstler, J.G., Heimes, A.S., Almstedt, K., Gerhold-Ay, A., Lebrecht, A., Battista, M.J., Hasenburg, A., Sahin, U., et al. (2018). Prognostic impact of CD4-positive T cell subsets in early breast cancer: a study based on the FinHer trial patient population. Breast Cancer Res. 20, 15.
Sumida, H., and Cyster, J.G. (2018). G-Protein Coupled Receptor 18 Contributes to Establishment of the CD8 Effector T Cell Compartment. Front. Immunol. 9, 660.
Takada, K., Kashiwagi, S., Goto, W., Asano, Y., Takahashi, K., Takashima, T., Tomita, S., Ohsawa, M., Hirakawa, K., and Ohira, M. (2018). Use of the tumor-infiltrating CD8 to FOXP3 lymphocyte ratio in predicting treatment responses to combination therapy with pertuzumab, trastuzumab, and docetaxel for advanced HER2-positive breast cancer. J. Transl. Med. 16, 86.
Tirosh, I., Izar, B., Prakadan, S.M., Wadsworth, M.H., 2nd, Treacy, D., Trombetta, J.J., Rotem, A., Rodman, C., Lian, C., Murphy, G., et al. (2016). Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189–196.
Traag, V.A., Waltman, L., and van Eck, N.J. (2019). From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9, 5233.
Tumeh, P.C., Harview, C.L., Yearley, J.H., Shintaku, I.P., Taylor, E.J., Robert, L., Chmielowski, B., Spasic, M., Henry, G., Ciobanu, V., et al. (2014). PD-1 blockade induces responses by inhibiting adaptive immune resistance. Nature 515, 568–571.
Wang, X., Sumida, H., and Cyster, J.G. (2014). GPR18 is required for a normal CD8αα intestinal intraepithelial lymphocyte compartment. J. Exp. Med. 211, 2351–2359.
Wang, W., Green, M., Choi, J.E., Gijón, M., Kennedy, P.D., Johnson, J.K., Liao, P., Lang, X., Kryczek, I., Sell, A., et al. (2019). CD8+ T cells regulate tumour ferroptosis during cancer immunotherapy. Nature 569, 270–274.
Wei, Y., Lin, C., Li, H., Xu, Z., Wang, J., Li, R., Liu, H., Zhang, H., He, H., and Xu, J. (2018). CXCL13 expression is prognostic and predictive for postoperative adjuvant chemotherapy benefit in patients with gastric cancer. Cancer Immunol. Immunother. 67, 261–269.
Wolf, F.A., Angerer, P., and Theis, F.J. (2018). SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15.
Yao, C., Sun, H.W., Lacey, N.E., Ji, Y., Moseman, E.A., Shih, H.Y., Heuston, E.F., Kirby, M., Anderson, S., Cheng, J., et al. (2019). Single-cell RNA-seq reveals TOX as a key regulator of CD8+ T cell persistence in chronic infection. Nat. Immunol. 20, 890–901.
Zemmour, D., Zilionis, R., Kiner, E., Klein, A.M., Mathis, D., and Benoist, C. (2018). Single-cell gene expression reveals a landscape of regulatory T cell phenotypes shaped by the TCR. Nat. Immunol. 19, 291–301.
Zhang, L., Yu, X., Zheng, L., Zhang, Y., Li, Y., Fang, Q., Gao, R., Kang, B., Zhang, Q., Huang, J.Y., et al. (2018). Lineage tracking reveals dynamic relationships of T cells in colorectal cancer. Nature 564, 268–272.
Zheng, C., Zheng, L., Yoo, J.K., Guo, H., Zhang, Y., Guo, X., Kang, B., Hu, R., Huang, J.Y., Zhang, Q., et al. (2017a). Landscape of Infiltrating T Cells in Liver Cancer Revealed by Single-Cell Sequencing. Cell 169, 1342–1356.e16.
Zheng, G.X., Terry, J.M., Belgrader, P., Ryvkin, P., Bent, Z.W., Wilson, R., Ziraldo, S.B., Wheeler, T.D., McDermott, G.P., Zhu, J., et al. (2017b). Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049.
STAR★METHODS
KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
Brilliant Violet 605 CD25, clone BC96 | Biolegend | Cat# 302632
Brilliant Violet 786 CD127, clone A019D5 | Biolegend | Cat# 351330
Brilliant Violet 421 CD4, clone OKT4 | Biolegend | Cat# 317434
Brilliant Violet 650 CD3, clone UCHT1 | Biolegend | Cat# 300468
Brilliant Ultraviolet 395 CD45, clone HI30 | Becton Dickinson | Cat# 563792
Alexa Fluor 647 CD8, clone SK1 | Biolegend | Cat# 344726
FITC GZMK, clone GM26E7 | Biolegend | Cat# 370508
PerCP-Cy5.5 HLA-DR, clone L243 | Biolegend | Cat# 307630
APC-R700 CCR7, clone 3D12 | Becton Dickinson | Cat# 565867
Brilliant Violet 480 CD3, clone UCHT1 | Becton Dickinson | Cat# 566105
Brilliant Violet 510 GZMB, clone GB11 | Becton Dickinson | Cat# 563388
Brilliant Violet 605 Ki67, clone Ki-67 | Biolegend | Cat# 350522
Brilliant Violet 650 CD45RA, clone HI100 | Biolegend | Cat# 304136
Brilliant Violet 786 CD25, clone BC96 | Biolegend | Cat# 302638
Brilliant Violet 711 TNFRSF18, clone 108-17 | Biolegend | Cat# 371212
Brilliant Ultraviolet 395 CD4, clone RPA-T4 | Becton Dickinson | Cat# 564724
Brilliant Ultraviolet 496 CD8, clone RPA-T8 | Becton Dickinson | Cat# 564808
Brilliant Ultraviolet 805 CD45, clone HI30 | Becton Dickinson | Cat# 564914
PE-CF594 FoxP3, clone 259D/C7 | Becton Dickinson | Cat# 562421
PE-Cy7 Perforin, clone B-D48 | Biolegend | Cat# 353316
Alexa Fluor 647 IFNγ, clone 4S.B3 | Biolegend | Cat# 502516
PE anti-human TNFα, clone Mab11 | Biolegend | Cat# 502909
CD4, clone SP35 | Cell Marque | Cat# 104R-18
Alexa Fluor 555 goat anti-rabbit IgG(H+L) | Invitrogen | Cat# 4050-32
Liberase TL, research grade | Millipore Sigma | Cat# 5401020001
Draq7 | Biolegend | Cat# 424001
Live/dead fixable Near-IR dead cell stain | Invitrogen | Cat# L34975
FluoroFix buffer | Biolegend | Cat# 422101
Recombinant human IL-2 | Peprotech | Cat# 200-02
IncuCyte Annexin V Red reagent | Essen Bioscience | Cat# 4641
IncuCyte Cytotox Red reagent | Essen Bioscience | Cat# 4632
RNAscope probe, Homo sapiens, GZMB (channel 2) | Advanced Cell Diagnostics | Cat# 445971-C2
RNAscope probe, Homo sapiens, GZMK (channel 1) | Advanced Cell Diagnostics | Cat# 475901-C1
L15 media, 15 mM HEPES, 600 mg% glucose | UCSF Cell Culture Facility | N/A
Fetal bovine serum | Omega Scientific | Cat# FB-01
RPMI-1640 | UCSF Cell Culture Facility | N/A
ImmunoCult XF complete medium (Medium + 10% FCS + 1% penicillin/streptomycin) | STEMCELL Technologies | Cat# 10981
GentleMACS | Miltenyi Biotec | Cat# 130-093-235
FoxP3/transcription factor staining buffer set | eBioscience | Cat# 00-5523-00
Cell stimulation cocktail | eBioscience | Cat# 00-4975
Chromium Single Cell 3′ Library, Gel Bead & Multiplex Kit | 10X Genomics | Cat# 120233 (discontinued)
Chromium Single Cell 3′ Chip Kit | 10X Genomics | Cat# 120232 (discontinued)
Dynabeads Human T-Activator CD3/CD28/CD137 | GIBCO | Cat# 11162D
Opal 7-color manual IHC kit | Perkin Elmer | Cat# NEL811001KT
Deposited Data
Processed data | This study | NCBI GEO: GSE149652
Healthy human donor TCR data for CD4+ peripheral blood mononuclear cells | 10X Genomics | https://support.10xgenomics.com/single-cell-vdj/datasets/2.2.0/vdj_v1_hs_cd4_t
Healthy human donor TCR data for CD8+ peripheral blood mononuclear cells | 10X Genomics | https://support.10xgenomics.com/single-cell-vdj/datasets/2.2.0/vdj_v1_hs_cd8_t
Human reference genome, build hg19 | 10X Genomics | http://software.10xgenomics.com/
Oligonucleotides
TCR sequencing primers | Table S3 | N/A
Software and Algorithms
Cell Ranger v1.1 | 10X Genomics | http://software.10xgenomics.com/
Scanpy v1.4.3 | Wolf et al., 2018 | https://scanpy.readthedocs.io/en/stable/index.html
miXCR v2.1.12 | Bolotin et al., 2015 | https://mixcr.readthedocs.io/en/latest/
Monocle v2.10.1 | Qiu et al., 2017 | Bioconductor
SingleR v1.1.9 | Aran et al., 2019 | Bioconductor
FlowJo | TreeStar | N/A
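Because the Key Resources Table lists specific software versions, a reader re-running the analysis may want to confirm that a local environment matches them. The following is a minimal sketch, not part of the published workflow: it covers only Scanpy (the one Python package in the table), assumes the PyPI distribution name is `scanpy`, and uses a hypothetical helper, `matches_pin`.

```python
from importlib import metadata
from typing import Optional

# Version taken from the Key Resources Table; only the Python package is listed here.
PINNED_VERSIONS = {"scanpy": "1.4.3"}


def matches_pin(package: str, installed: Optional[str] = None) -> bool:
    """Return True if the installed version equals the version listed in the table.

    `installed` can be passed explicitly (useful for testing); otherwise the
    version is looked up in the current environment.
    """
    expected = PINNED_VERSIONS[package]
    if installed is None:
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            # Package is not installed at all, so it cannot match the pin.
            return False
    return installed == expected
```

Calling `matches_pin("scanpy")` checks the active environment; the other tools in the table (Cell Ranger, miXCR, Monocle, SingleR) are not Python packages and would need to be checked by their own means.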
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Lawrence Fong (lawrence.fong@ucsf.edu).
Materials Availability
Primer sequences for TCR sequencing are enumerated in Table S3. All unique reagents generated in this study are available from the Lead Contact upon request.

Processed single-cell RNA sequencing and TCR sequencing data that support this study have been deposited in the NCBI GEO database under accession GSE149652. Raw sequencing data will be deposited in dbGaP. All software algorithms used for analysis are available for download from public repositories, which are listed in the Key Resources Table.
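As a convenience for locating the processed data, the sketch below builds the GEO landing-page URL and the supplementary-file directory URL for the accession given above. It assumes NCBI's usual series layout, in which the last three digits of the accession are replaced by `nnn` to form the parent directory; the function name `geo_series_urls` is hypothetical, and no network request is made.

```python
import re

GEO_QUERY = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc={acc}"
GEO_SUPPL = "https://ftp.ncbi.nlm.nih.gov/geo/series/{stub}/{acc}/suppl/"


def geo_series_urls(accession: str) -> dict:
    """Build the landing-page and supplementary-file URLs for a GEO series."""
    acc = accession.strip().upper()
    if not re.fullmatch(r"GSE\d+", acc):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    digits = acc[3:]
    # Parent directory: accession with its last three digits replaced by 'nnn'
    # (e.g. GSE149652 -> GSE149nnn; short accessions collapse to GSEnnn).
    stub = "GSE" + digits[:-3] + "nnn"
    return {
        "landing_page": GEO_QUERY.format(acc=acc),
        "supplementary_files": GEO_SUPPL.format(stub=stub, acc=acc),
    }


urls = geo_series_urls("GSE149652")
```

Which files are present under the supplementary directory is not stated in the text, so any download step should list the directory first rather than assume file names.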
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Tissues were obtained from patients with localized bladder transitional cell carcinoma (TCC) who received either 1-2 doses of neoadjuvant atezolizumab as part of an ongoing clinical trial (UCSF IRB# 14-15423; patients were accrued sequentially to receive increasing numbers of atezolizumab doses) or standard of care treatments recommended by their treating physician, including chemotherapy (gemcitabine/carboplatin) or no systemic therapy prior to planned cystectomy (these patients were consented for tissue collection under a separate protocol, UCSF IRB# 10-04057). All studies with patients and patient samples were conducted with appropriate institutional IRB approval and oversight. Patient demographics, including age, gender, disease state (all were localized muscle-invasive bladder cancer), neoadjuvant treatment, and presence of tumor and pathologic staging at the time of surgery are provided in Table S1. No formal sample size calculations were conducted for this particular collection.
METHOD DETAILS
Tissue processing
Tissues were obtained from patients with localized bladder transitional cell carcinoma (TCC) who either received neoadjuvant ate-
zolizumab, standard of care chemotherapy (gemcitabine/carboplatin), or no systemic therapy as per standard of care prior to
planned cystectomy. Cystectomy surgical specimens were obtained fresh from the operating field, and dissected in surgical pathol-
ogy where grossly apparent tumor or adjacent bladder not grossly affected by tumor (‘‘non-malignant’’) were isolated, minced, and
transported at room temperature immersed in L15 media with 15 mM HEPES and 600 mg% glucose. Once received, these were
digested using Liberase TL as well as mechanical dissociation with heat (gentleMACS) using standard protocols. Single cell suspensions
were obtained and counted for viability before staining for FACS. Healthy donor blood was separately collected, processed by
gradient centrifugation to peripheral blood mononuclear cells (PBMCs), and cryopreserved to be thawed later for control
experiments.
Flow cytometry/FACS
Freshly dissociated TILs and previously frozen healthy donor PBMCs were used for sorting. Samples were stained with designated
panels for 30 minutes at 4°C and washed twice with FACS buffer (PBS, 2% FBS, 1 mM EDTA). Cells were incubated with Draq7 (Biolegend, Cat# 424001) for 5 minutes at room temperature to stain dead cells. Samples were sorted on a FACSAria Fusion (Becton Dickinson) using FACSDiva software with single channel compensation controls acquired on the same day.
For RNA sequencing flow validation, previously frozen TILs were thawed into complete media (RPMI, 10% heat inactivated FBS, 1% non-essential amino acids solution, 10 uM HEPES, 1 mM sodium pyruvate, 2 mM L-glutamine, 100 U/ml penicillin-streptomycin) and washed once with PBS. Cells were incubated with Live/dead fixable Near-IR dead cell stain (Invitrogen, Cat# L34975) for 30 minutes at room temperature and washed once with FACS buffer. Samples were stained with designated panels for 30 minutes at 4°C and washed twice with FACS buffer. Cells requiring intracellular staining were fixed and permeabilized with the eBioscience FoxP3/Transcription factor staining buffer set (Cat# 00-5523-00) according to the manufacturer’s protocol. Intracellular staining with antibodies was carried out for 30 minutes at room temperature, and cells were washed twice with FACS wash. Cells were fixed with FluoroFix buffer (Biolegend, Cat# 422101) and washed once with FACS buffer. Cells were acquired the next day on a FACSymphony (Becton Dickinson) using FACSDiva with single channel compensation controls acquired on the same day. Data were analyzed offline using FlowJo analysis software (FlowJo, LLC).
For cytokine expression, cells were resuspended in complete media and divided into two T25 flasks. One flask was activated with
cell stimulation cocktail (eBioscience, Cat# 00-4975, containing phorbol 12-myristate 13-acetate, ionomycin, brefeldin A and monensin
at a final concentration of 81 nM, 1.34 nM, 10.6 µM and 2 µM respectively) and both flasks were incubated upright for 3 h in a CO2
incubator at 37°C. Cells were collected and washed once with PBS prior to Live/Dead Fixable Near-IR dead cell staining and surface
and intracellular flow staining as described above.
Antibodies used for sorting were Brilliant Violet 605 CD25 (Biolegend, clone BC96, Cat# 302632), Brilliant Violet 786 CD127
(Biolegend, clone A019D5, Cat# 351330), Brilliant Violet 421 CD4 (Biolegend, clone OKT4, Cat# 317434), Brilliant Violet 650 CD3
(Biolegend, clone UCHT1, Cat# 300468), Brilliant Ultraviolet 395 CD45 (Becton Dickinson, clone HI30, Cat# 563792), and Alexa Fluor
647 CD8 (Biolegend, clone SK1, Cat# 344726). Antibodies used for RNA sequencing flow validation were FITC GZMK (Biolegend,
clone GM26E7, Cat# 370508), PerCP-Cy5.5 HLA-DR (Biolegend, clone L243, Cat# 307630), APC-R700 CCR7 (Becton Dickinson,
clone 3D12, Cat# 565867), Brilliant Violet 480 CD3 (Becton Dickinson, clone UCHT1, Cat# 566105), Brilliant Violet 510 GZMB (Becton
Dickinson, clone GB11, Cat# 563388), Brilliant Violet 605 Ki67 (Biolegend, clone Ki-67, Cat# 350522), Brilliant Violet 650 CD45RA
(Biolegend, clone HI100, Cat# 304136), Brilliant Violet 786 CD25 (Biolegend, clone BC96, Cat# 302638), Brilliant Violet 711 TNFRSF18
(Biolegend, clone 108-17, Cat# 371212), Brilliant Ultraviolet 395 CD4 (Becton Dickinson, clone RPA-T4, Cat# 564724), Brilliant
Ultraviolet 496 CD8 (Becton Dickinson, clone RPA-T8, Cat# 564808), Brilliant Ultraviolet 805 CD45 (Becton Dickinson, clone HI30, Cat#
564914), PE-CF594 FoxP3 (Becton Dickinson, clone 259D/C7), and PE-Cy7 Perforin (Biolegend, clone B-D48, Cat# 353316).
Antibodies used for cytokine staining in addition to those used above were Alexa Fluor 647 IFNγ (Biolegend, clone 4S.B3, Cat#
502516) and PE anti-human TNFα (Biolegend, clone Mab11, Cat# 502909).
Single cell RNA sequencing

Droplet-based single-cell RNA sequencing (dscRNA-seq) was performed using the 10X Genomics Chromium Single Cell 3′ platform,
version 1, according to manufacturer’s instructions. CD3+CD4+ and CD3+CD8+ T cells were sorted from digested tumor and
non-malignant tissues, or Ficoll-purified and previously cryopreserved healthy control PBMCs, into 500 µl of PBS/0.04% BSA for
loading onto 10X. Following library preparation, sequencing was performed on an Illumina HiSeq 2500 (Rapid Run mode). Paired
samples from the same experiment and patient were processed in parallel during library preparation, and sequenced on the
same flowcell to minimize batch effects.
TCR sequencing
In brief, approximately 10% of the barcoded cDNA from the 10X workflow was utilized for TCR analysis. Primers used for TCR
sequencing are listed in Table S3. cDNA was first amplified with 6-12 amplification cycles using a template switching oligonucleotide
(TSO) and P7 primers. A pool of forward Vα and Vβ primers containing the TruSeq Read 1 primer sequence was then used in
conjunction with a reverse P7 primer to amplify CDR3 sequences from the TCR alpha and beta loci. An additional amplification
step using forward primers containing the Illumina P5, i5 and TruSeq Read 1 sequences was used with reverse P7 primer to create
final TCR libraries for sequencing. Deep sequencing was done on an Illumina NovaSeq S1 with separate lanes for the TCR alpha and
TCR beta sequencing. Read 1 contained 280 bp of the TCR alpha or beta CDR3 sequence, and the i7 read contained the 14 bp 10X
barcode.
Expression analysis
After 10X sequencing data were processed through the Cell Ranger pipeline (version 1.1, hg19 genome assembly) with default
settings, filtered gene-barcode matrices for single tumors were analyzed using the scanpy toolkit (Wolf et al., 2018). Genes
detected in fewer than three cells were filtered out, as were cells with greater than ten percent mitochondrial genes or with
fewer than 100 or greater than 1200 detected genes. Cells that were annotated as red blood cells (HBB) or macrophages
(CD14, CD68, CD163) were also excluded from downstream analyses. The gene expression values were log2(x + 1) transformed
and normalized to 10,000 counts per cell. The resulting matrix was batch corrected by regressing out total UMI counts and percent
mitochondrial genes using the built-in scanpy function, followed by the scanpy implementation of ComBat (Johnson et al., 2007)
with each well acting as a batch (13 wells total). The adjusted matrix was scaled to a mean of zero and a variance of 1. Highly variable
genes were selected using the embedded scanpy function, followed by principal component analysis (PCA), Leiden clustering and
UMAP plotting with default settings, with the exception of using a resolution of 1.5 for CD4+ T cells and 1.0 for CD8+ T cells for the
Leiden clustering. This yielded 19 clusters which were collapsed to 11 cell types based on manual gene annotations (for CD4+ cells),
and 11 clusters (for CD8+ cells). We performed differential expression to identify marker genes that were upregulated in each
individual cluster relative to the combination of all other single cells (regardless of tumor or non-malignant tissue origin), or genes that
were upregulated in tumor versus non-malignant compartments. We compared the gene lists to known literature to label the clusters,
and also used SingleR (Aran et al., 2019) to map the expression signature for each cluster to the best correlated candidate immune
reference signature, using the Monaco bulk RNA-seq reference of sorted human immune cell populations described within (Monaco
et al., 2019). Significant differences between the cell type abundances for the normal and tumor tissue samples were assessed using
an exact permutation test on the abundances.
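The gene and cell filtering thresholds described above can be sketched in plain Python. This is an illustration only, not the scanpy code used in the study: the input layout (barcode mapped to per-gene UMI counts), the function name `filter_counts`, the use of the `MT-` prefix to identify mitochondrial genes, the reading of "ten percent" as a fraction of UMI counts, and the order of the two filtering passes are all assumptions.

```python
from typing import Dict

# Hypothetical input layout: barcode -> {gene symbol -> UMI count}.
CountTable = Dict[str, Dict[str, int]]


def filter_counts(counts: CountTable,
                  min_cells_per_gene: int = 3,
                  min_genes: int = 100,
                  max_genes: int = 1200,
                  max_mito_fraction: float = 0.10,
                  mito_prefix: str = "MT-") -> CountTable:
    """Apply the gene-level and cell-level thresholds quoted in the text."""
    # Pass 1: count, for each gene, the number of cells in which it is detected.
    cells_per_gene: Dict[str, int] = {}
    for genes in counts.values():
        for gene, n in genes.items():
            if n > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1
    kept_genes = {g for g, c in cells_per_gene.items() if c >= min_cells_per_gene}

    # Pass 2: drop cells outside the detected-gene range or above the
    # mitochondrial fraction (assumed here to be a fraction of UMI counts).
    kept_cells: CountTable = {}
    for barcode, genes in counts.items():
        detected = {g: n for g, n in genes.items() if n > 0 and g in kept_genes}
        total = sum(detected.values())
        if total == 0:
            continue
        if not (min_genes <= len(detected) <= max_genes):
            continue
        mito = sum(n for g, n in detected.items() if g.startswith(mito_prefix))
        if mito / total > max_mito_fraction:
            continue
        kept_cells[barcode] = detected
    return kept_cells
```

The defaults mirror the thresholds stated in the text (three cells per gene, 100 to 1200 detected genes, ten percent mitochondrial content); exclusion of annotated red blood cells and macrophages, batch correction, and clustering are not covered by this sketch.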
Correlation analysis between gene expression from distinct clusters was performed by restricting to genes expressed across all
clusters being tested, and then computing the Pearson correlation coefficient between the scaled expression vectors of the shared
genes for each pair of clusters.
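As a minimal sketch of this correlation step, the following assumes each cluster is summarized as a hypothetical mapping from gene symbol to scaled mean expression, and treats a gene as "expressed" in a cluster when it is present in that mapping. The function names are illustrative and not taken from the original analysis code.

```python
import math
from typing import Dict, List, Tuple


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    denom = math.sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def cluster_correlations(
    profiles: Dict[str, Dict[str, float]]
) -> Dict[Tuple[str, str], float]:
    """Correlate every pair of clusters over the genes shared by all clusters."""
    names = sorted(profiles)
    if len(names) < 2:
        raise ValueError("need at least two clusters")
    # Restrict to genes present in every cluster being tested.
    shared = sorted(set.intersection(*(set(profiles[n]) for n in names)))
    if len(shared) < 2:
        raise ValueError("need at least two shared genes")
    result: Dict[Tuple[str, str], float] = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            result[(a, b)] = pearson([profiles[a][g] for g in shared],
                                     [profiles[b][g] for g in shared])
    return result
```

The returned dictionary holds one coefficient per unordered pair of clusters, which is the content of the upper triangle of a correlation matrix.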
TCR analysis
TRA and TRB CDR3 nucleotide reads were demultiplexed by matching reads to 10X barcodes from cells with existing expression
data that passed filtering in the Cell Ranger pipeline, excluding cell barcodes that overlapped between multiple samples. Following
demultiplexing of the TRA and TRB CDR3s, reads were aligned against known TRA/TRB CDR3 sequences and then assembled into
clonotype families using MiXCR (Bolotin et al., 2015) with similar methodologies to a previous study (Zemmour et al., 2018). For any given
10X barcode, the most abundant TRA or TRB clonotype was accepted for further analysis; if two TRA or TRB clonotypes were equally
abundant for a given 10X barcode, the clonotype with the highest sequence alignment score was used for further analysis. Detailed
sequencing statistics and saturation analysis are provided in Table S3. Only cells with paired TRA and TRB were used for further
downstream analysis. Analysis utilizing TCR data only (number of unique cells sharing a specific TRA/TRB clonotype sequence,
Gini coefficient) utilized cells both with and without a specific functional population that had been assigned by clustering. Analysis
involving both TCR clonotype and function was restricted to cells with both a mapped TRA/TRB and a functional population from
clustering. Statistical comparisons of Gini coefficients across compartments were performed using the Wilcoxon signed-rank test with
Benjamini-Hochberg correction for multiple testing; statistical testing of differences in Gini coefficients between tumor and
non-malignant compartments across all phenotypic clusters was performed using exact permutation testing.
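The per-barcode clonotype choice and the Gini coefficient can be illustrated as follows. This is a sketch under stated assumptions, not the pipeline used in the study: the `Clonotype` record, the dictionary layout keyed by barcode and chain, and the definition of clone size as the number of cells sharing a paired TRA/TRB CDR3 sequence are all hypothetical, and the Gini coefficient uses the standard rank-based formula.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Optional


class Clonotype(NamedTuple):
    cdr3: str           # CDR3 nucleotide sequence
    reads: int          # abundance for this barcode
    align_score: float  # sequence alignment score, used only to break ties


def pick_clonotype(candidates: List[Clonotype]) -> Optional[Clonotype]:
    """Most abundant clonotype; ties go to the higher alignment score."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: (c.reads, c.align_score))


def clone_sizes(cells: Dict[str, Dict[str, List[Clonotype]]]) -> List[int]:
    """Number of cells per paired TRA/TRB clonotype; unpaired cells are dropped."""
    pairs: Counter = Counter()
    for chains in cells.values():
        tra = pick_clonotype(chains.get("TRA", []))
        trb = pick_clonotype(chains.get("TRB", []))
        if tra is not None and trb is not None:
            pairs[(tra.cdr3, trb.cdr3)] += 1
    return list(pairs.values())


def gini(sizes: List[int]) -> float:
    """Gini coefficient: 0 for equal clone sizes, (n - 1) / n at the maximum."""
    xs = sorted(sizes)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-empty clone")
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n
```

Sorting on the tuple `(reads, align_score)` encodes the rule from the text directly: abundance decides first, and the alignment score matters only when abundances are equal.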
Tumor infiltrating lymphocyte (TIL) isolation and culturing

Single-cell suspensions from processed and digested bladder tumors were viably frozen at −80°C and stored prior to culture setup.
To sort the tumor-infiltrating lymphocytes, frozen cancer cell aliquots were thawed, washed once with PBS, and counted by Vicell.
Cells were subsequently stained and sorted by FACS. CD4 TIL (Draq7-CD45+CD3+CD4+ that were not CD25+CD127lo) and CD8 TIL
(Draq7-CD45+CD3+CD8+) were sorted into ImmunoCult XF complete medium (Medium + 10% FCS + 1% penicillin/streptomycin;
STEMCELL Technologies #10981). T cells were pooled together for culturing. After centrifugation, T cells were suspended in Immu-
noCult XF complete medium, and Dynabeads Human T-Activator CD3/CD28/CD137 (GIBCO #11162D) were added to the culture per
manufacturer’s protocol. T cells were cultured in 96 well U-bottom plates, and briefly centrifuged to ensure cell contact with Dyna-
beads. T cell expansion was managed in two phases. For the first week of T cell expansion, TILs were maintained with ImmunoCult XF
complete medium + 200 IU/ml of human recombinant IL-2 (Peprotech #200-02). From the second week onward, IL-2 concentration
was gradually increased from 200 IU/ml to 2000 IU/ml based on cell growth kinetics (which varied by patient sample). T cells were
harvested between 5-8 weeks for functional killing assays.
Cytotoxic T lymphocyte (CTL) killing assay

After expansion, TILs were again sorted for either CD4 or CD8 as distinct effector populations. Primary cancer cells from frozen
aliquots were freshly thawed and sorted on CD45-Draq7- as target cells. To achieve various effector-to-target (E:T) ratios, 3000
cancer cell targets were suspended in ImmunoCult XF complete medium and seeded into each well. Different ratios of TILs
were serially diluted and added to the corresponding wells. Each well contained 200 µL of medium supplemented with 1 µL/well
of IncuCyte Annexin V Red reagent (Essen Bioscience #4641). For MHCI and MHCII blockade, 10 µg of blocking antibody
(or isotype control matched to the anti-MHCII antibody) was added into wells containing cancer cells and cultured at 37°C for
1 hour prior to co-culture with TILs. Cell culture was monitored by the IncuCyte Zoom system (Essen Bioscience) at 15-30 minute
intervals for up to 36 h when needed. Experiments with Annexin V were carried out with samples from 3 independent patients.
Additional experiments were performed using 0.25 µL of IncuCyte Cytotox Red reagent (Essen Bioscience #4632) instead of
Annexin V; 2 independent experiments were performed with Cytotox Red using distinct aliquots from the same patient. Analysis was
performed using the IncuCyte Zoom software, using images background subtracted with a tophat filter. For Annexin V
experiments, background death was subtracted, with all traces displayed as relative change in cell death from time point 0. For Cytotox
Red experiments, tumor cells were larger than TIL based on inspection of wells with tumor cells alone or free TILs in wells
containing TILs; based on this, the number of dying tumor cells per mm² was determined using a minimum area threshold of 75 µm².
Out of focus frames were discarded, as were any wells where the first time frame was out of focus precluding accurate
normalization.
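The plate setup and the trace normalization lend themselves to a short worked example. The sketch below is illustrative only: the two-fold dilution factor, the 30:1 starting ratio, and the reading of "background death was subtracted" as a point-by-point subtraction of a tumor-only trace are assumptions, and the function names are hypothetical.

```python
from typing import List, Tuple


def effector_counts(targets: int = 3000,
                    top_ratio: float = 30.0,
                    fold: float = 2.0,
                    steps: int = 4) -> List[Tuple[float, int]]:
    """(E:T ratio, effector cells per well) for a serial dilution of TILs."""
    rows = []
    for i in range(steps):
        ratio = top_ratio / fold ** i
        rows.append((ratio, round(targets * ratio)))
    return rows


def relative_death(trace: List[float], background: List[float]) -> List[float]:
    """Subtract background death, then express each point relative to time 0."""
    if len(trace) != len(background) or not trace:
        raise ValueError("trace and background must be non-empty and equal length")
    corrected = [t - b for t, b in zip(trace, background)]
    return [c - corrected[0] for c in corrected]
```

With 3000 targets per well, a 30:1 ratio corresponds to 90,000 effector cells, and each two-fold dilution halves that number.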
Pseudotime analysis
Pseudotime analysis, including branched expression analysis modeling (BEAM) to identify all genes with branch-dependent
differential expression followed by unbiased clustering of genes based on patterns of co-expression in specific branches, was performed
using Monocle v2.10.1 as described (Qiu et al., 2017), for the combination of proliferating (CD4PROLIF), regulatory (CD4IL2RAHI,
CD4IL2RALO) and cytotoxic (CD4GZMB, CD4GZMK) states from scRNA-seq clustering.
Gene signature analysis

Genes were selected based on their specific upregulation in proliferating or non-proliferating cytotoxic CD4+ T cell branches, but not
in regulatory T cell branches, from the pseudotime analysis. Genes that overlapped with differentially expressed genes in any of the
CD8+ T cell states from our scRNA-seq analysis (Table S2) were removed. The resulting gene list was used to construct a composite
signature consisting of genes that distinguish either proliferating or non-proliferating cytotoxic CD4+ T cells; a signature score was
computed as previously published for the IMvigor210 dataset (Mariathasan et al., 2018), by z-score transforming the expression of
each gene in the signature, and then using the first component (PC1) of a principal component analysis as the gene signature score.
Immune subtypes for the IMvigor210 samples for this analysis were previously assigned based on CD8 immunohistochemistry stain-
ing (Mariathasan et al., 2018).
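The scoring step (z-score each signature gene, then take PC1) can be sketched without external libraries. This is not the published implementation: the samples-by-genes list layout and the function names are assumptions, PC1 is obtained by power iteration on the covariance matrix of the z-scored genes, and the sign of the component is fixed so that loadings sum to a positive value, a convention chosen here for readability.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows = samples, columns = signature genes


def zscore_columns(matrix: Matrix) -> Matrix:
    """Z-score each gene (column) across samples using the sample SD."""
    n, p = len(matrix), len(matrix[0])
    out = [[0.0] * p for _ in range(n)]
    for j in range(p):
        col = [row[j] for row in matrix]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        if sd == 0:
            raise ValueError("gene with zero variance cannot be z-scored")
        for i in range(n):
            out[i][j] = (matrix[i][j] - mean) / sd
    return out


def pc1_scores(matrix: Matrix, iterations: int = 500) -> List[float]:
    """Signature score per sample: projection of z-scores onto PC1."""
    z = zscore_columns(matrix)
    n, p = len(z), len(z[0])
    cov = [[sum(z[i][a] * z[i][b] for i in range(n)) / (n - 1)
            for b in range(p)] for a in range(p)]
    # Power iteration from a uniform start vector (assumes this vector is
    # not orthogonal to the leading eigenvector).
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("power iteration collapsed to the zero vector")
        v = [x / norm for x in w]
    if sum(v) < 0:  # sign convention: positive total loading
        v = [-x for x in v]
    return [sum(z[i][j] * v[j] for j in range(p)) for i in range(n)]
```

Because every gene is z-scored first, genes with large absolute expression do not dominate the component; each gene contributes according to how strongly it co-varies with the rest of the signature.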
RNAscope/tissue immunofluorescence staining

RNAscope (Advanced Cell Diagnostics) in situ hybridization and immunofluorescence staining were performed on 4 µm FFPE
sections from cystectomy specimens with existing scRNA-seq and TCR-seq data. Tissues were pre-treated with target retrieval
reagents and protease to improve target recovery based on guidelines provided in the RNAscope Multiplex Fluorescent Reagent
Kit v2 Assay protocol. Probes for human GZMB and GZMK mRNA (ACD) were incubated at 1:700 dilution for 2 hr 40 min. The probe
was then hybridized with Opal 7-Color Manual IHC Kit (PerkinElmer), with detection of GZMB and GZMK using Opal 620 and Opal
540 respectively. Samples were then immunofluorescence stained for human CD4 (Cell Marque) which was detected using Alexa
Fluor 555-conjugated goat anti-rabbit IgG secondary antibody (Invitrogen). Tissues were counterstained with DAPI. Slides were
imaged using a TCS SP8 X white light laser inverted confocal microscope (Leica Microsystems). No staining was seen with negative
control probes (for RNAscope) or with secondary antibody alone (for immunofluorescence) for tumor tissue from cystectomy spec-
imens (shown in Figure 4F) or healthy tonsil tissue (data not shown).
Specific statistical tests and metrics (median, mean, standard error) used for comparisons, along with sample sizes, are described in
the Results and figure legends. The chemotherapy sample was included in unbiased clustering, testing for conserved marker genes
and tumor versus non-malignant testing, but was excluded from analyses of treatment effect (anti-PD-L1 versus untreated).
ADDITIONAL RESOURCES
The clinical trial of neoadjuvant atezolizumab prior to planned cystectomy for localized bladder cancer is registered under
clinicaltrials.gov (NCT02451423).
Figure S1. Flow Cytometry and Immunofluorescence Validation of T Cell Phenotypes in Bladder Tumors, Related to Figures 1, 2, 3, 4, and 5
(A) Schematic of processing for paired tumor and adjacent non-malignant tissue from either anti-PD-L1-treated, or standard-of-care (untreated/chemotherapy-
treated) cystectomy patients. FACS-sorted CD4+ or CD8+ T cells were subjected to droplet-based single-cell RNA sequencing (dscRNA-seq) with paired T cell
receptor (TCR) sequencing as described in the text. (B) Parallel flow cytometry data from the same single-cell digest used for dscRNA-seq from 4 anti-PD-L1-
treated tumors, showing the percentage of CD4+ or CD8+ T cells from total CD3+ cells. (C) Gating strategy for flow cytometric analysis of populations in CD4+ and
CD8+ T cells from RNA-seq. CD4+ and CD8+ populations were gated out of CD3+ CD45+ single live cells. CD4+ cells were further gated as FoxP3- and FoxP3+.
Treg cells are gated as FOXP3+ CD25+ cells. FOXP3- CD4+ and CD8+ cells were gated into central memory (CM, CCR7+ CD45RA-), and CCR7- cells (a com-
bination of effector memory CCR7- CD45RA- and effector CCR7- CD45RA+). Boolean gating of CCR7- cells was used to obtain GZMK+, GZMB+ and Ki67+
populations for further marker analysis. Plots are shown here to demonstrate the presence of these populations. (D) Representative gates shown for each marker
for CD4+ and CD8+ T cells were used for Boolean gating for the populations described above. (E) Flow cytometry staining of GZMB, GZMK, or perforin versus CD3
in CCR7- CD8+ T cells. Gates used for Boolean analysis are shown. (F) Flow cytometry staining of GZMB or GZMK co-expression with perforin in CCR7- CD8+
T cells. (G) Percentage of cells expressing GZMB, GZMK, or perforin from CCR7- CD8+ T cells by flow cytometry (left), and the percentage of cells co-expressing
perforin within GZMB+ or GZMK+ CCR7- CD8+ T cells (right), are shown (N = 7 tumors, mean + SEM). (H) Percentages of cells expressing IFNγ, TNFα, or both from
GZMB+ or GZMK+ CCR7- CD8+ T cells with and without stimulation (N = 11 tumors, mean + SEM). (I) Multiplex immunofluorescent staining of DAPI (blue), CD4
(red), GZMK (green), GZMB (white) and overlay without DAPI are shown from a cystectomy tumor region from an additional patient with parallel scRNA-seq and
TCR-seq data (anti-PD-L1 D). CD4+ cells that co-express GZMK (arrows) or GZMB (arrowhead) are indicated. Scale bar, 10 µm. (J) Percentage of cells
co-expressing Ki67 and either GZMB or GZMK from CCR7- CD4+ FOXP3- T cells (left), or Ki67 and CD25 from CD4+ FOXP3+ T cells (right), by flow cytometry are
shown, with dots for values from individual tumors (N = 7 tumors, mean ± SEM). (K) Flow cytometry staining showing co-expression of GZMB and Ki67, or GZMK
and Ki67, from CCR7- CD8+ T cells.
Figure S2. Clustering, Differential Expression, Annotation, and Correlation Analysis of T Cell Transcriptional Phenotypes, Related to Figures
1, 2, 3, and 4
(A-B) UMAP plots showing cluster representation for CD8+ (A) and CD4+ (B) TIL from individual patients. (C) Volcano plots showing adjusted P values versus
log2(FC) for differential testing of genes between tumor and non-malignant compartments for regulatory T cell populations (top, CD4IL2RAHI, CD4IL2RALO) and
cytotoxic CD4+ populations (bottom, CD4GZMB, CD4GZMK). Genes whose expression is significantly different between compartments with Padj < 0.05 and
|log2(FC)| > 1.4 are shown in red. (D-E) Unbiased clustering of CD4+ T cells from tumor and adjacent non-malignant tissue from a single patient (anti-PD-L1 C). (D)
UMAP plot showing individual cells coded by cluster or by tissue of origin. (E) Violin plot showing top 5 differentially expressed marker genes for each unbiased
cluster. (F) Annotations of single CD4+ T cells from tumor and adjacent non-malignant tissue using SingleR. (G) Correlation matrix of all CD4+ and CD8+ pop-
ulations from tissue (combined tumor and non-malignant tissues) based on expression of shared genes. Pearson correlation coefficient is shown. Populations
were arranged based on hierarchical clustering using Euclidean distance metric.
Figure S3. T Cell Receptor Repertoire Analysis of CD4+ and CD8+ Bladder Tumor- and Non-malignant Tissue-Infiltrating T Cells, Related to
Figures 3 and 4
(A) The percentage of unique paired TRA and TRB CDR3 nucleotide sequences that are expressed by one cell (blue), shared by two cells (green), or shared by three or
more cells (red) is indicated for CD4+ T cells from individual tumor (darker shades) and non-malignant tissues (lighter shades) from anti-PD-L1-treated (‘‘PD-L1’’),
untreated, and chemotherapy-treated (‘‘chemo’’) patients. Triplicate control samples from a single healthy donor’s CD4+ T cells sorted from peripheral blood and
processed for scRNA-seq and TCR in identical fashion in separate sequencing runs is shown (‘‘healthy 1-3’’), as well as reference publicly available data from peripheral
blood CD4+ from a healthy donor. (B) Lorenz curves showing the cumulative frequency distributions for unique CD4+ T cells and unique CD4+ T cell clonotypes for
tumor, non-malignant tissues, and healthy donor blood. Mean ± SD is shown. (C) Gini coefficients for CD4+ T cell clonotypes from tumor, non-malignant tissues, and
healthy donor blood, calculated from the Lorenz curves in (D); p = 0.009 by Wilcoxon with Benjamini-Hochberg correction for tumor versus non-malignant tissues. For
(D) and (E): N = 7 tumor samples; 6 non-malignant samples, 4 healthy donor samples (3 triplicates from one healthy donor, 1 dataset from 10X Genomics). (D-F) Paired
TRA/TRB clonotype sharing between cells, Lorenz curves, and Gini coefficients for CD8+ clonotype data as in (A-C). (G) Gini coefficients for tissue-infiltrating CD4+ in
individual populations, separated by treatment type. (H-I) Gini coefficients for CD8+ T cells in individual populations, separated by tumor versus non-malignant tissue (H)
and treatment type (I). All box and whisker plots are formatted as in Figure 3C.
Figure S4. Autologous MHC-Dependent Killing of Bladder Tumors by CD4+ and CD8+ TIL, Related to Figure 5
Analysis of the increase in the number of dead cells over time from the same killing assay for CD4eff TIL (i.e., cultures with Tregs sorted out during expansion) at 30:1
effector:target ratio (A), CD4eff TIL at 30:1 effector:target ratio with a pan-anti-MHCII antibody (B), CD8+ TIL at 30:1 effector:target ratio (C), or CD8+ TIL at 30:1
effector:target ratio with a pan-anti-MHCI antibody (D), are shown. Control traces from separate wells with tumor only are included. All traces were normalized to
the number of dead cells per mm² at time point 0. Experiments were done using Cytotox Red. The observation of autologous tumor killing by CD4+ and CD8+ TIL
above the background level of spontaneous death is representative of 2 independent experiments involving distinct aliquots from the same patient.
Resource
Single-Cell Mapping of Human Brain Cancer Reveals Tumor-Specific Instruction of Tissue-Invading Leukocytes
Ekaterina Friebel, Konstantina Kapolou,
Susanne Unger, ..., Sonia Tugues,
Marian Christoph Neidert,
Burkhard Becher
Correspondence
becher@immunology.uzh.ch
In Brief
High-parametric single-cell mapping of
the tumor microenvironment of patients
with primary brain tumors or brain
metastases reveals that the immune
response to cancer in the brain is shaped
by cancer type, with metastases favoring
T cell and monocyte-derived
macrophage invasion and gliomas
characterized by activated microglia.
Highlights
• Leukocyte invasion is higher in brain metastasis than in CNS-endogenous cancers
• The tumor type shapes the differentiation of monocyte-derived macrophages
• Brain metastases harbor a high frequency of regulatory T cells
• Both activation and exhaustion are prevalent in lymphocytes of the metastatic TME
Friebel et al., 2020, Cell 181, 1626–1642

Single-Cell Mapping of Human Brain Cancer
Reveals Tumor-Specific Instruction
of Tissue-Invading Leukocytes
Ekaterina Friebel,1,5 Konstantina Kapolou,2,5 Susanne Unger,1 Nicolás Gonzalo Núñez,1 Sebastian Utz,1
Elisabeth Jane Rushing,4 Luca Regli,3 Michael Weller,2 Melanie Greter,1 Sonia Tugues,1 Marian Christoph Neidert,2,3,6
and Burkhard Becher1,6,7,*
1Institute of Experimental Immunology, University of Zurich, Zurich 8057, Switzerland
2Laboratory of Molecular Neuro-Oncology, Department of Neurology, Clinical Neuroscience Center, University Hospital Zurich and University
of Zurich, Zurich 8091, Switzerland
3Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zurich and University of Zurich, Zurich 8091, Switzerland
4Department of Neuropathology, University Hospital Zurich and University of Zurich, Zurich 8091, Switzerland
7Lead Contact
*Correspondence: becher@immunology.uzh.ch
SUMMARY
Brain malignancies can either originate from within the CNS (gliomas) or invade from other locations in the
body (metastases). A highly immunosuppressive tumor microenvironment (TME) influences brain tumor
outgrowth. Whether the TME is predominantly shaped by the CNS micromilieu or by the malignancy itself
is unknown, as is the diversity, origin, and function of CNS tumor-associated macrophages (TAMs). Here,
we have mapped the leukocyte landscape of brain tumors using high-dimensional single-cell profiling (Cy-
TOF). The heterogeneous composition of tissue-resident and invading immune cells within the TME alone
permitted a clear distinction between gliomas and brain metastases (BrM). The glioma TME presented pre-
dominantly with tissue-resident, reactive microglia, whereas tissue-invading leukocytes accumulated in BrM.
Tissue-invading TAMs showed a distinctive signature trajectory, revealing tumor-driven instruction along
with contrasting lymphocyte activation and exhaustion. Defining the specific immunological signature of
brain tumors can facilitate the rational design of targeted immunotherapy strategies.
INTRODUCTION
Malignant brain tumors can be either primary tumors that arise in the brain (e.g., gliomas) or secondary tumors of extracranial
origin (metastases), most frequently invading from non-small-cell lung carcinoma (NSCLC), breast cancer, and melanoma
(Quail and Joyce, 2013). Both primary and metastatic brain malignancies have a very poor prognosis, mainly due to limitations
of standard therapies (e.g., surgery, radio/chemotherapy) (Aldape et al., 2019). In this scenario, immunotherapy poses a
promising treatment option; however, programmed death (PD)-1 blockade has not shown a survival benefit in recurrent
glioblastoma (NCT 02017717). Only a selective group of patients with temozolomide-induced hypermutations in gliomas have been
shown to benefit from immunotherapy (Daniel et al., 2019; Johanns et al., 2016; Bouffet et al., 2016). In contrast, a
considerable number of patients with brain metastases (BrM) respond to checkpoint blockade with concordant responses in
extracerebral disease and brain lesions (Goldberg et al., 2016; Kluger et al., 2019; Tawbi et al., 2018). These divergent clinical
responses to immunotherapy may be explained by tumor cell genomic (intrinsic) properties regulating T cell antigen recognition
and immune responses. Alternatively, tumor-extrinsic features such as the tumor microenvironment (TME) may represent
resistance pathways to immune-mediated interventions (Keenan et al., 2019). Thus, in-depth knowledge of the immune cell
composition within the TME across different types of brain tumors may be critical to predict not only how tumors will progress,
but also their immunotherapeutic outcomes.
Uniquely, the CNS is both a site of immune privilege and a complex leukocyte landscape (Keren-Shaul et al., 2017; Forrester
et al., 2018; Mrdjen et al., 2018; Van Hove et al., 2019). An abundant cellular component of the brain TME are tumor-associated
macrophages (TAMs), which possess both tumor-promoting and immunosuppressive capacities (Quail and Joyce,
2013; Takenaka et al., 2019). This dichotomy is further complicated by the heterogeneity of TAMs. TAM populations can be
subdivided ontogenetically into at least two major populations: (1) tissue-resident microglia and macrophages of embryonic
origin, and (2) tissue-invading monocyte-derived macrophages (hereafter termed MDM) (Croxford et al., 2015; Kiss et al.,
2018). Tissue-resident populations comprise microglia and border-associated macrophages (BAMs), representing the majority
of phagocytes in the healthy steady-state brain. However, tumor progression induces the recruitment of blood-borne
monocytes, which differentiate into MDMs (Bowman et al., 2016; Ginhoux et al., 2010; Goldmann et al., 2016; Mrdjen
et al., 2018). Concomitant with the expansion of MDMs, CNS-resident macrophages undergo marked phenotypical and functional
changes, which further complicates the identification of the origin of TAMs (Hambardzumyan et al., 2016; Bowman
et al., 2016; Quail and Joyce, 2017). Importantly, the majority of studies assessing the contribution of TAMs in brain cancer
have been performed in mouse models of malignant glioma or restricted cohorts of glioblastoma patients, with only a few at
single-cell resolution (Darmanis et al., 2017; Quail et al., 2016; Venteicher et al., 2017).
Brain tumors also harbor variable lymphocyte infiltrates (TILs) (Bieńkowski and Preusser, 2015). Both CD8 and CD4 T cell
subsets (including regulatory T cells [Tregs]) increase with tumor grade (Jacobs et al., 2010; Kuppner et al., 1988). Paradoxically,
prognostically more favorable diffuse gliomas with mutations in the isocitrate dehydrogenases 1 and 2 are associated with
reduced T cell abundance, presumably due to the effect of the oncometabolite (R)-2-hydroxyglutarate on the TME (Bunse
et al., 2018; Kohanbash et al., 2017). The extent of T cell infiltration in brain tumors of extracranial origin has also been examined
in some studies. In melanoma BrM, for instance, high densities of TILs have been observed mainly within the tumor stroma and
surrounding brain (Amit et al., 2013). Nevertheless, none of these reports have specifically investigated T cell and other
lymphocyte infiltration in relation to the overall immune landscape of the TME.
To provide a more detailed insight into the nature of the brain TME with a focus on the immune cell composition, we performed
single-cell mass cytometry (CyTOF) on ex vivo surgical resections of brain tumors and non-tumor controls. Our in-depth
immune cell phenotyping has revealed fundamental differences in cellular frequencies and phenotypes within primary and
secondary brain tumor TMEs. Our findings may explain the divergent results of immunotherapy in patients with brain tumors and allow
for the data-driven design of novel therapeutic interventions.
Figure 1. Mass Cytometry Analysis Reveals Unique Changes in the Leukocyte Composition of Brain Tumors
For a Figure360 author presentation of Figure 1, see https://doi.org/10.1016/j.cell.2020.04.055.
(A) Experimental approach.
(B) MDS plot showing the mean antigen expression across leukocytes. The clinical groups are indicated by varying point shapes and colors.
(C) A representative UMAP map showing the FlowSOM-guided meta-clustering of CD45+ cells.
(D) Heatmap displaying the median antigen intensity of markers used to generate (C).
(E) The relative frequencies of immune populations of brain tumors and non-tumor control. Only statistically significant p values were displayed (p < 0.05, Mann-
Whitney-Wilcoxon test, Benjamini-Hochberg correction). Error bars define an interval of max/min value ± SD.
(F) Circos plots showing the multiple correlation matrix between the leukocyte frequencies. Results are plotted in order to display statistical significance (p < 0.05).
RESULTS
Mass Cytometry Analysis Reveals Unique Changes in the Leukocyte Composition of Brain Tumors
The overall strategy involved harvesting freshly resected tissue from 38 patients undergoing neurosurgery for the treatment of
gliomas, BrM (metastatic melanoma, NSCLC, and other tumors) and epilepsy (serving as a quasi-steady-state control)
(Figure 1A). Gliomas were further characterized according to IDH1 R132H mutation (IDH1mut and IDH1wt) and methylation status
of the O6-methylguanine DNA methyltransferase (MGMT) promoter. Some patients with BrM had a long disease history and
underwent either chemotherapy or treatment with immune checkpoint inhibitors. Clinical information and treatment
regimens prior to surgery are summarized in Table S1.
To map the complexity of the leukocyte compartment of the TME, we designed two CyTOF panels together, measuring 74
parameters at the single-cell level. The first panel was myeloid-focused (Table S2). Here, we combined antibodies to capture
the entire spectrum of phagocytes in the brain TME, together with lineage-identifying markers to map the major leukocyte
populations and their relative cellular frequencies. The second panel was lymphoid-focused (Table S3) and was designed to
deeply interrogate the lymphoid compartment. We started the analysis of leukocytes obtained from the single-cell suspension
of brain tissues with the myeloid-focused panel and estimated similarities between glioma, BrM, and non-tumor control
samples. To do so, we applied multidimensional scaling (MDS) to the total leukocytes (Figure 1B), where the distances, or relative
similarities, between samples were calculated using the mean value of antigen expressions (Buja et al., 2008). The first
dimension featured an unambiguous separation between the glioma and BrM TME, although the single epilepsy sample was mapped
closer to the glioma samples. The second dimension showed relative variety between samples within the tumor group
(Figure 1B). In order to better understand the basis of the group separation in the MDS analysis, we visualized the mean antigen
expressions across leukocytes using a heatmap with unsupervised hierarchical clustering (Figure S1) (Nowicka et al., 2017).
The most informative differentially expressed markers were those commonly expressed by most phagocytes (CD11c,
CD64, HLA-DR, and CX3CR1), as well as common T cell-restricted antigens (CD3, PD-1, and CD38).
To visualize every single immune population isolated from the different brain tumor samples, we created a two-dimensional
graph using the dimensionality reduction algorithm uniform manifold approximation and projection (UMAP) (Figures 1C
and S2A) (McInnes et al., 2018). To compute UMAP, we specified the lineage markers, listed in Figure 1D to be considered for the
estimation of cell similarity. By reducing the high-dimensional data into two dimensions, we could present the quantification
of all measured marker expressions on all cellular subtypes, simultaneously (Figure S2A). Next, we categorized the various
embedded cell clusters using self-organizing maps (FlowSOM) (Van Gassen et al., 2015; Hartmann et al., 2016). This strategy
allowed us to create a map of diverse immune cells including,
CNS-resident and invading TAMs/monocytes (CD64+, CD11c+,
Here, correlation coefficients (R) higher than 0.6 are represented in red, and those lower than 0.6 in blue.
See also Figures S1 and S2 and Table S1.
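
The legend above reports Mann-Whitney-Wilcoxon p values after Benjamini-Hochberg correction. As a rough illustration of the correction step only, the following Python sketch adjusts a list of raw p values. The function name `benjamini_hochberg` and the example p values are hypothetical and are not taken from the study; the authors' actual software is not shown in this section.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four population comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
# Only comparisons that stay below 0.05 after correction would be displayed.
significant = [p < 0.05 for p in adj]
```

Working from the largest p value downward keeps the adjusted values monotone, so a comparison can never end up more significant than one with a smaller raw p value.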
1628 Cell 181, 1626–1642, June 25, 2020

Resource
Figure 2. The Brain TME Harbors a Heterogeneous Mononuclear Phagocyte Population

(A) CD49d expression overlaid onto the Scaffold map.
(B) Heatmap displaying the median antigen intensity of landmark nodes of (A).
(C and D) P2Y12 and CD49d expression of TAMs/monocytes isolated from glioma IDH1wt (C) and epilepsy (D), overlaid onto the reference map (A).

and CD11b+), neutrophils (CD66b+ and CD16+), two subsets of dendritic cells (CD141+ and CADM1+ for cDC1 and CD1c+ for cDC2), T cells (CD3+), natural killer (NK) cells (CD56+CD16+), B cells (CD19+ and HLA-DR+), and plasma cells (CD19+ and CD38high) (Figures 1C and 1D). Notably, TAMs and monocytes comprised up to 80% (±18%) of leukocytes in the IDH1mut or IDH1wt gliomas, similar to the epilepsy non-tumor control, while T cells represented only 13% (±10%) (Figures 1E and S2B). Further stratification of IDH1wt glioblastoma patients according to the MGMT promoter methylation status showed comparable frequencies of major immune populations (Figure S2C). Interestingly, we observed the opposite situation in BrM, such as melanoma and carcinoma (Figure 1E), with significantly higher relative frequencies of T cells (up to 50% ± 16%) and lower frequencies of TAMs (up to 40% ± 18%) in the TME. Additionally, the metastatic TME in both melanoma and carcinoma showed significantly higher frequencies of plasma cells, which were absent in gliomas (Figure 1E).
To determine whether therapy (including immunotherapy) was driving these inter-tumor variations, we tracked treated patients in our analysis. However, we could not find a significant treatment effect across the samples, and hence all samples were analyzed together and not stratified across therapeutic interventions (Figure 1E). The relative distribution across leukocytes and their relationship within brain tumors differed (Figure 1F). In glioma, an increase in the relative frequencies of T cells, neutrophils, and pDCs correlated negatively with TAM/monocyte frequencies, whereas T cell frequencies positively correlated with pDC and cDC frequencies. In BrM, on the other hand, we only observed a negative correlation between T cell and TAM/monocyte frequencies. Taken together, our results show that gliomas and BrM shape TMEs with a distinct immune cell composition: TAMs dominate the TME in gliomas whereas TILs dominate BrM. A close analysis of the immune compartment of the TME alone allowed for a clear separation into cancers of CNS origin or extracranial CNS-invading metastasis.

The Brain TME Harbors a Heterogeneous Mononuclear Phagocyte Population
TAMs within the CNS may originate from the CNS-resident microglia and/or from blood-derived monocytes that invade the TME and transform into MDMs. To study the relative contribution of CNS-resident versus invading TAMs across gliomas and BrM, we re-applied FlowSOM to CD64+, CD11c+, CD11b+, CD1c−, and CD66b− cells (Figures 1C and 1D) identified in the TME and non-tumor control. This time, however, we essentially considered the expression of the macrophage activation markers and initially clustered cells into 100 nodes, where each node represents a group of cells. For visualization, we chose a single-cell analysis by fixed force- and landmark-directed (Scaffold) layout (Figure 2A) (Spitzer et al., 2015). Here, one cluster node (cellular population) represents one FlowSOM node and will be mapped closer to proximal nodes indicative of shared and similar features. The size of the node represents the number of cells within the group. In addition, the Scaffold layout allowed for the introduction of manually gated data, which were used as landmark populations for the reference map. This permitted a direct comparison of single-cell data from different species and/or single-cell maps acquired using different techniques. This way, CyTOF and fluorescence-activated cell sorting (FACS) data obtained from patients and mice could be mapped side-by-side. To build the reference map, we used equally normalized pooled data from glioma and BrM samples to ensure that all TAM subsets were present in our reference sample. Here, we focused on the expression of the integrin alpha 4 (ITGA4/CD49d), which was previously reported to specifically mark CNS-invading macrophages (Bowman et al., 2016) (Figure 2A). Last, we created the reference map for the Scaffold layout, considering three major cellular clusters: (1) CNS-resident microglia (CD49d−, Mertk+, CX3CR1+, CD11c+, and CD64+), (2) monocytes (CD14+, CCR2+, and CD11b+), and (3) MDMs (CD49d+, Mertk+, CD163+, and CD64+) (Figures 2A and 2B).
Of note, our study revealed additional markers (CD45RA, CD141, and ICAM), which were differentially expressed by invading MDMs compared to CNS-resident microglia (Figure 2B). Cellular identification by computational tools was confirmed using manual gating of the CyTOF data (Figure S3A). The CyTOF spectrum of CNS TAMs was complemented by 24-parameter fluorescence cytometry including the microglia-specific protein P2Y12 (Butovsky et al., 2014) in epilepsy, glioma, and BrM samples. We repeated the same strategy using FlowSOM of the CD11b+CD11c+CD66b−lin− cells measured by FACS. Overlaying FACS data onto the reference map from the CyTOF data showed the expected position of nodes around the fixed landmark populations (Figures 2C and 2D). FACS data confirmed both the phenotype and frequencies of the CyTOF populations and categorized the exclusive expression of CD49d on CNS-invading and P2Y12 on CNS-resident macrophages. As expected, we identified only P2Y12+ cells in epilepsy, as an approximation to the steady-state CNS, whereas glioma and BrM contained both populations, CD49d+ invading MDMs and P2Y12+ CNS-resident microglia.
Even though P2Y12 is one of the markers most commonly used to identify mouse and human microglia, it has been reported that P2Y12 expression is reduced after microglial activation (Haynes et al., 2006). Therefore, to solidify the notion that CD49d expression distinguishes MDMs from microglia even within the TME, we used a preclinical model of glioma combined with genetic fate mapping of Sall1 expression, a transcriptional regulator exclusive to microglia within the hematopoietic system (Figure 2E) (Buttgereit et al., 2016; Mrdjen et al., 2018). This model showed that Sall1 expression by microglia remains high even during inflammatory conditions, and this marker is not expressed by any other CNS-invading cell types or BAMs. Mouse
(E) Sall1 and CD49d expression of mouse TAMs/monocytes, overlaid onto the reference map (A).
(F) TAMs/monocytes of glioma and BrM overlaid onto the reference map (A).
(G) Relative frequencies of the three TAM populations among CD64+ cells. Only statistically significant p values were displayed (p < 0.05, Mann-Whitney-Wilcoxon
test, Benjamini-Hochberg correction), whiskers within 1.5x IQR.
See also Figure S3 and Table S1.
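
The boxplot legends state that whiskers lie within 1.5x IQR. A minimal sketch of that convention is given below, assuming the common Tukey rule in which each whisker ends at the most extreme observation inside the fences Q1 − 1.5·IQR and Q3 + 1.5·IQR. The function name `tukey_whiskers`, the quartile convention, and the example values are assumptions made for illustration, not details confirmed by the article.

```python
import statistics


def tukey_whiskers(values):
    """Return (lower whisker, upper whisker, outliers) using the 1.5x IQR rule."""
    data = sorted(values)
    # Quartiles by linear interpolation; other quantile conventions exist.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    # Whiskers end at the most extreme observations still inside the fences.
    return inside[0], inside[-1], outliers


# Hypothetical relative frequencies (%) of one population across samples.
freqs = [1, 2, 3, 4, 5, 6, 7, 8, 100]
lower, upper, outliers = tukey_whiskers(freqs)
```

Plotting libraries differ in how they compute quartiles, so whisker positions for small cohorts can shift slightly depending on the convention used.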

Figure 3. TAM Instruction Is Driven by the Type of Tumor Rather Than the Local Tissue Microenvironment
(A) Two-dimensional dot-plots displaying cellular markers among microglia.
(B) Boxplots quantify the mean antigen intensity of data shown in (A), whiskers within 1.5x IQR.
(C) A force-directed graph displays MDMs/monocytes from glioma and BrM. Individual plots are overlaid with the most differentially expressed markers
among MDMs.

CD64+ phagocytes were analyzed using the same reference Scaffold map as for human CyTOF data, excluding markers that are differentially expressed in human and mouse (CD11c) or were not used in both panels (Ly6C and CD14). The map faithfully separated mouse microglia (Sall1 YFP+CD49d−) from other phagocytes (Sall1 YFP−CD49d+) and invading phagocytes and confirmed that CD49d is preferentially expressed among CNS-invading phagocytes (MDMs) (Figure 2E).
Using a combination of experimental approaches, we could reliably identify the origin of each TAM population stemming either from the embryonically derived CNS-resident microglia or from blood-derived MDMs. The glioma TME predominantly contained TAMs of microglial origin, whereas BrM had a higher invasion of MDMs, with a particular extension in the TME of carcinoma BrM (Figures 2F and 2G). MDM accumulation was also observed in the IDH1wt glioma TME, albeit to a lesser extent, whereas the macrophage composition of the TME of IDH1mut gliomas, associated with a better prognosis, did not differ significantly from the epilepsy case (Figures 2F and 2G). Monocytes were present at similarly low frequencies across all samples (Figures 2F and 2G). Overall, the data show that the TME of IDH1mut tumors of neuronal derivation is dominated by TAMs of microglial origin. The more aggressive gliomas (IDH1wt) showed an increased invasion by MDM TAMs, independently of the methylation status of the MGMT promoter (Figure S3B). Along this trajectory, brain tumors of extracranial origin are predominantly invaded by MDMs, which represent the majority of the TAM population.

TAM Instruction Is Driven by the Type of Tumor Rather Than the Local Tissue Microenvironment
TAM plasticity and polarization in the TME across various cancer entities are subjects of intense investigation (Kiss et al., 2018). Here, we explored the expression of a large number of markers used for macrophage phenotyping and polarization (Table S2) on human CNS-resident microglia with single-cell resolution. Of those, for instance, CD169, CD206, and CD209 were virtually absent from CNS-resident tumor-associated microglia (Figure 2B). We also examined the expression level of markers that were previously used to describe the “reactivity” of CNS-resident microglia in different pathological settings (Figure 3A) (Hopperton et al., 2018; Keren-Shaul et al., 2017; Mrdjen et al., 2018; Walker and Lue, 2015). Microglia in IDH1mut glioma had a comparable expression of HLA-DR to microglia in IDH1wt glioma and BrM (Figure 3B). However, in both IDH1wt gliomas and BrM, but not in IDH1mut gliomas, these cells upregulated CD14 and CD64 (Figure 3B). The “reactive” phenotype of tumor-associated microglia in IDH1wt glioblastomas and BrM was corroborated by co-staining of brain sections with Iba1 and CD163, revealing an amoeboid microglial morphology of these cells. In contrast, microglia in IDH1wt diffuse astrocytoma preserved a more ramified shape (Figure S4A).
Next, we investigated the monocyte-to-macrophage transition, which is initiated by the invasion of the monocytes from the blood into the tissue and is primarily dictated by cues in the local tissue microenvironment (Ginhoux and Guilliams, 2016; Okabe and Medzhitov, 2016). The fundamental question addressed was whether the phenotypic and functional features of MDMs are driven by the nature of the pathogenic insult (i.e., the tumor type growing within the brain) or instead by the CNS tissue itself, which is generally deficient in blood-borne leukocytes during the steady state. In order to chart the monocyte/MDM developmental trajectory in relation to gliomas versus BrM, we built a force-directed graph using Vortex (Figure 3C) (Good et al., 2019; Samusik et al., 2016). The algorithm was applied to a combined CyTOF dataset of CNS-invading TAMs of glioma and BrM TMEs, focusing on the expression level of monocyte/macrophage markers (Figure 3C). The resulting graph depicts a continuous process of monocyte/MDM differentiation, which was characterized by the downregulation of CCR2 and CD33, concomitant with the upregulation of CD163 and Mertk (Figure 3C). Of note, among MDMs, we observed a trifurcation in the developmental trajectory displayed in three branches on the force-directed map, which was driven by the differential expression of CD169, CD206, CD209, CD38, and PD-L1. To infer cell lineages and pseudotimes, we applied the slingshot algorithm (Figure 3D) (Street et al., 2018). The algorithm modeled the developmental trajectory by connecting adjacent clusters, using the monocyte cluster as a starting point to construct branching curves (Figure 3C). To explore different subsets of MDM, we performed FlowSOM analysis using the differentially expressed antigens (Figure 3C). Our analysis revealed the presence of four MDM clusters and one monocyte cluster (Figures 3D and 3E).
Next, we separated the force-directed map of invading MDMs/monocytes for IDH1mut and IDH1wt gliomas versus melanoma and carcinoma BrM to analyze the trajectories and relative frequencies of different MDM subsets (Figure 3F). Importantly, we found that the developmental traces of the monocyte-to-MDM transition are not a random feature across brain tumors, but that each tumor entity clearly dictates the development of its own MDM type with a distinctive and tumor type-specific phenotypic signature. For instance, CD163+ CX3CR1+ CADM1+ MDMs (MDM 4) were found almost exclusively in the IDH1wt glioma TME (Figures 3F and 3G). MDMs expressing CD163, CD206, and CD169 (MDM 2) showed higher relative frequencies in carcinoma BrM, whereas MDMs with higher levels of CD209, CD38, PD-L1, and PD-L2 (MDM 3) were equally infiltrating the melanoma and carcinoma BrM (Figures 3F and 3G). CNS-invading phagocytes in the TME of IDH1mut glioma were predominantly composed of monocytes and low frequencies of MDMs. These
(D–F) FlowSOM-guided metaclustering overlaid on (C). (E) Heatmap displaying the median antigen intensity of markers used to generate (D). (F) Force-directed
graphs display changes in the brain TME (D and F). The overlaid lines represent the results of the slingshot pseudotime analysis.
(G) Bar plots representing the relative frequencies of monocytes and MDMs in the brain TME. The IDH1mut glioma TME was composed of monocytes 74.5% ± 3.3%,
MDM1 13.8% ± 3.3%, MDM2 1.9% ± 0.6%, MDM3 5.7% ± 1.5%, and MDM4 4.1% ± 2.1%; the IDH1wt glioma: monocytes 20.4% ± 21.4%, MDM1 26.9% ±
12.8%, MDM2 17.3% ± 15.5%, MDM3 7.4% ± 10.2%, and MDM4 27.9% ± 20.7%; the melanoma BrM: monocytes 24.8% ± 18.1%, MDM1 22.8% ± 6.9%,
MDM2 17.5% ± 8.2%, MDM3 33.7% ± 27.9%, and MDM4 1.2% ± 1.3%; carcinoma BrM: monocytes 14.2% ± 5.8%, MDM1 26.1% ± 5.9%, MDM2 34.6% ±
7.8%, MDM3 22.7% ± 12.3%, and MDM4 2.4% ± 0.5%.
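
The text describes the pseudotime step as connecting adjacent clusters and using the monocyte cluster as the starting point for branching curves. As far as the published slingshot method is concerned, this begins with a minimum spanning tree over cluster centroids, followed by curve fitting. The Python sketch below covers only that tree-building idea, using Prim's algorithm, and is not the slingshot implementation or the authors' code. The helper names and the 2D centroid coordinates are hypothetical and were chosen so that the toy tree branches three ways after the first MDM cluster.

```python
import math


def rooted_mst(centroids, root):
    """Prim's algorithm: return {child: parent} for a tree grown from root."""
    def dist(a, b):
        return math.dist(centroids[a], centroids[b])

    parent = {}
    # best[c] = (distance to the tree, closest cluster already in the tree)
    best = {c: (dist(root, c), root) for c in centroids if c != root}
    while best:
        child = min(best, key=lambda c: best[c][0])
        _, parent[child] = best.pop(child)
        for c in best:
            d = dist(child, c)
            if d < best[c][0]:
                best[c] = (d, child)
    return parent


def lineages(parent, root):
    """List every root-to-leaf path of the tree as a candidate lineage."""
    leaves = set(parent) - set(parent.values())
    paths = []
    for leaf in sorted(leaves):
        path = [leaf]
        while path[-1] != root:
            path.append(parent[path[-1]])
        paths.append(path[::-1])
    return paths


# Hypothetical 2D cluster centroids, chosen only for illustration.
centroids = {
    "monocyte": (0.0, 0.0),
    "MDM1": (1.0, 0.0),
    "MDM2": (1.8, 2.0),
    "MDM3": (1.9, -2.0),
    "MDM4": (3.0, 0.0),
}
tree = rooted_mst(centroids, "monocyte")
paths = lineages(tree, "monocyte")
```

Each root-to-leaf path is one candidate lineage; in a full analysis a smooth curve would then be fitted along every path to assign pseudotime values to individual cells.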

Figure 4. Localization of CNS-Resident and Invading Macrophages In Situ
(A and B) Representative immunofluorescence images of microglia (A) and MDMs (B) in gliomas.
See also Figures S4A–S4D and Table S1.

findings clearly showed that the monocyte-to-MDM transition is dictated by tumor-specific education of the TAM compartment rather than a general feature of monocyte behavior within CNS tissue (Figures 3F and 3G).
The localization of both CNS-resident microglia and invading MDMs within the TME of brain tumors was further investigated using immunofluorescence and immunohistochemistry. Based on our single-cell map (Figures 2B, 2C, and 2D), we used the pan-myeloid marker Iba1 in combination with either CD163 or P2RY12 to distinguish Iba1+CD163+ MDMs from Iba1+CD163− and Iba1+P2Y12+ microglia (Figures 4A, 4B, and S4A). Iba1+CD163− microglia were diffusely scattered throughout gliomas, but they were often confined to tumor border areas and absent from the tumor core in BrM (Figures 4A and S4A). Moreover, we found Iba1+CD163+ MDMs and CD206+, CD169+, and CD209+ subsets of MDMs in close proximity to VE-cadherin-positive blood vessels in glioma and BrM (Figures 4B and S4A–S4D). Frequencies of MDMs were higher in samples with NSCLC BrM, mirroring the results of the CyTOF analysis.
To interrogate the clinical significance of MDMs and CNS-resident microglia infiltrating the glioma TME, we correlated the frequencies of identified TAM populations with overall survival (OS) or time of follow-up in days (for patients alive at the time of analysis). For this purpose, we used our cohort of IDH1wt glioma as well as two publicly available TCGA databases (TCGA-GBM and TCGA-LGG). For the group of patients analyzed by CyTOF, we selected only patients with newly diagnosed IDH1wt glioma (excluding recurrent cases) due to higher variability of disease stage and treatment among patients with IDH1mut glioma and BrM. In the IDH1wt glioma cohort, we did not observe a correlation between the frequencies of microglia or MDMs with patient outcome. Only the frequencies of monocytes correlated with a trend of longer OS (Figure S4E). Further, we determined the expression levels of the pan-microglial marker CX3CR1 and the pan-MDM marker CD163 in the TCGA-GBM database (Glioblastoma Multiforme) (Figure 5A). Interestingly, the MDM marker CD163, in contrast to the microglia marker CX3CR1, showed a negative correlation with OS, suggesting that this feature has the potential to stratify patient clinical outcomes (Figure 5A). We then determined whether the composite TAM signatures could improve the correlation with clinical disease development and survival. First, we analyzed the relation of the microglial signature expression (ITGAX (CD11c), FCGR1A (CD64), CX3CR1, and P2RY12 (P2Y12)) and the OS in the TCGA-LGG (Brain Lower Grade Glioma; WHO II/III grade) and TCGA-GBM databases, which again did not reveal any significant correlation (Figures 5B and 5C). Conversely, applying the different gene signatures of MDMs demonstrated a significant positive correlation of the MDM2/3 signature (CD163, MRC1 (CD206), CD14, and CD68) with patient survival (Figures 5D and 5E). The data suggest that CD206+ MDM2/3 infiltration contributes to tumor suppression, particularly in WHO II and III grade glioma.
Taken together, our findings reveal a diversity of TAM subsets in the brain TME besides CNS-resident and invading macrophages. We propose that TAMs play a distinct role in the TME, following the strong variation across all brain tumor types in their phenotype, localization, and correlation with clinical outcomes. In addition, we observed tumor-specific changes, with resident phagocytes predominating in glioma and invading phagocytes in the metastatic brain TME. Moreover, the TME site, rather than the CNS site, drives TAM polarization, resulting in unique phenotypes that correlate with clinical outcomes. Importantly, our results suggest that CD206+ MDMs favor tumor suppression in WHO II and III grade glioma patients.

Tregs Accumulation and T Cell Exhaustion Characterize the TME of Brain Metastases
The nature and frequencies of TILs are emerging as a prognostic marker for several cancer types (Binnewies et al., 2018). Previous reports have shown a correlation between the number of TILs and the IDH1 mutational status of gliomas (Bunse et al., 2018). Here, using CyTOF with a focus on the lymphoid compartment (Table S3), we identified several TIL phenotypes and described their activation status in glioma and BrM. Upon t-distributed stochastic neighbor embedding (t-SNE) dimensionality reduction in conjunction with FlowSOM clustering, we identified CD4 (CD3+CD4+FoxP3−) and CD8 (CD3+CD8+) T cells, Tregs (CD3+CD4+FoxP3+CTLA4+), γδ T cells (CD3+TCRγδ+), NK cells (CD56+), B cells (CD19+CD20+), and plasma cells (CD19+ and CD38high) (Figures S5A–S5C). Overall, the relative frequencies across all lymphocytes and samples were comparable between primary glioma and metastatic samples (Figures 6A and S5D). Also, we did not observe changes in the lymphocyte frequencies associated with MGMT promoter methylation status (Figure S5E). With few exceptions, IDH1wt gliomas had higher frequencies of Tregs compared to IDH1mut (Figure 6A). However, the highest relative frequencies of Tregs were found in the BrM, especially within the carcinoma TME (Figure 6A).
Next, we closely interrogated the phenotype of the CD8 T cells using categorical One-SENSE analysis (Cheng et al., 2016;

Figure 5. Overall Glioma Patient Survival Correlates with the Presence of MDM Signatures
(A) Kaplan-Meier curve shows OS in glioblastoma patients (TCGA-GBM database) with high and low CD163 or CX3CR1 gene expression.
(B–E) Kaplan-Meier survival analysis in patient groups of (B) TCGA-LGG and (C) TCGA-GBM databases with high and low microglial signature. (D and E) Kaplan-
Meier survival analysis in patient groups of (D) TCGA-LGG and (E) TCGA-GBM databases correlated with MDM2/3 signature. (B–E) Heatmaps show the selected
groups of patients according to the gene expression.
See also Figure S4E and Table S1.
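
The survival panels are Kaplan-Meier curves, and the text notes that patients still alive at the time of analysis contribute their follow-up time in days, which corresponds to censoring. As an illustration of how such a curve is built, the sketch below computes the product-limit estimate in plain Python. The function name `kaplan_meier` and the four-patient example are hypothetical, and the study's own survival software is not specified in this section.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival probability) at event times.

    times  -- follow-up in days (death or last contact)
    events -- 1 if death was observed, 0 if the patient was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data for one expression group.
days = [5, 8, 8, 12]
observed = [1, 1, 0, 1]
curve = kaplan_meier(days, observed)
```

Running the same function separately on the high- and low-expression groups would give the two curves that are compared in each panel; testing their difference (for example with a log-rank test) is a separate step not shown here.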

Figure 6. Tregs Accumulation and T Cell Exhaustion Characterize the TME of Brain Metastases
(A) Relative frequencies of the main T cell populations among lymphocytes. Only statistically significant p values were displayed (p < 0.05, Mann-Whitney-
Wilcoxon test, Benjamini-Hochberg correction), whiskers within 1.5x IQR.
(B) One-SENSE analysis comparing the lineage and activation profiles of CD8 T cells.
(C) Representative histograms showing differentially expressed markers on CD8 RM and CD8 EM subsets.
See also Figures S5, S6, and S7A and Table S1.

Laurens van der Maaten, 2008). Here, the x-axis represents a naive/memory profile and the y-axis the activation profile, including expression of co-stimulatory and co-inhibitory receptors (Figure 6B). Based on the expression of five markers (x-axis: CD45RA, CD45RO, CCR7, CD127, and CD103), T cells can be differentiated into naive, central memory (CM), effector memory (EM), terminally differentiated effector memory (TEMRA), and non-circulating tissue-resident (RM) T cells (Smolders et al., 2018) (Figures 6B and S6A). The majority of T cells analyzed were memory T cells, without statistically significant variations in terms of relative frequencies among total T cell numbers between the TME of gliomas versus BrM (Figure S6B). However, we observed a positive correlation of CD4 CM (p = 0.021) and CD8 CM (p = 0.077) T cell frequencies with OS/number of follow-up days for patients with IDH1wt glioma (Figure S6C).
Further analysis of additional T cell markers demonstrated a higher expression of co-stimulatory receptors (ICOS, CD27, and CD137), co-inhibitory receptors (2B4, TIGIT, and PD-1), the activation marker CD38, and markers of effector (CD57 and granzyme B) and proliferation (Ki-67) functions among RM and EM CD8 T cells in melanoma BrM (Figures 6B, 6C, and S7A). The same phenotype, albeit with lower frequencies, was identified in carcinoma BrM. Of note, CD8 RM and EM T cells in IDH1mut glioma had lower expression of proliferation and activation markers compared to IDH1wt glioma. The same analysis strategy was applied to CD4 T cells; however, here we did not observe major differences between glioma and BrM.
Altogether, the data revealed the accumulation of Tregs in the TME of IDH1wt gliomas and BrM, with the highest frequencies in the carcinoma BrM. In IDH1wt gliomas, the accumulation of CD4 CM and CD8 CM in the TME may be a positive prognostic indicator of better OS. In addition, the expansion of the analyzed markers revealed unique phenotypic features of EM and RM CD8 T cells. The metastatic TME was composed of activated/exhausted T cells, whereas the glioma samples showed a lower expression of activation markers.

Immature NK Cells Accumulate in Glioblastoma
In view of the potential role of innate lymphoid cells in anti-tumor immunity (Tugues et al., 2019), we next used the lymphoid panel (Table S3) dataset to interrogate NK cells (CD56+CD3−) (Figure 7A). We assigned cells mainly corresponding to the combinatorial expression of CD16 and CD56 and identified two major populations of CD56int/brightCD16− and CD56intCD16+, which correspond to the immature and the high cytotoxic populations of NK cells, respectively (Simoni et al., 2017). Among CD56int/brightCD16− cells, we also found a major population of CD69+CD103+CD56+ cells, which closely resemble intraepithelial ILC1-like cells (ie-ILC1-like cells) (Figure 7A) (Simoni et al., 2017). Next, we compared the frequencies of these three above-described CD56+ subsets in gliomas and BrM of different origins. In the IDH1wt gliomas, we observed the enrichment of immature CD56int/brightCD16− NK cells among lymphocytes, whereas in both the IDH1mut gliomas and BrM, predominantly CD56int/brightCD16+ NK cells (Figures 7B and S7B) accumulated. Further splitting of the IDH1wt glioma cohort according to the methylation status of the MGMT promoter indicated a trend (p = 0.1) toward a higher proportion of CD56int/brightCD16+ NK cells in the unmethylated cases (Figure S7C). We also correlated the frequencies of CD56+ subsets with OS or number of follow-up days in the IDH1wt cohort, which suggested ie-ILC1-like cell accumulation as a potential marker of OS (Figure S7D).
We next characterized the activation status of the three subsets of CD56+ cells, represented in the y-axis of the One-SENSE map (Cheng et al., 2016; Laurens van der Maaten, 2008) (Figure 7C). Detailed analysis of the CD56int/brightCD16+ population across patients revealed a trend toward higher 2B4, CD38, and Ki-67 expression in BrM (Figures 7C and 7D). The CD56int/brightCD16+ population in IDH1wt gliomas was distinguished by CD57 and TIM3 expression, with the latter also observed in the IDH1mut gliomas. The profile of metastatic and glioma CD56int/brightCD16+ population was relatively homogeneous. Taken together, NK cell profiles also display tumor specificity, highlighted by the proportion of infiltrating cytotoxic and immature NK cells, with the prevalence of the latter in the IDH1wt gliomas. Additionally, we observed heterogeneity of infiltrating CD56int/brightCD16− NK cells between glioma and BrM.

DISCUSSION

In this study, we combined extensive single-cell proteome analysis, immunofluorescence imaging, and genetic fate-mapping to interrogate the leukocyte landscape of the TME in brain tumors. Our analyses revealed major changes in the CNS-resident and invading leukocyte populations, which were dictated by the type of brain tumor. We found resident phagocytes to be abundant in the glioma TME, whereas invading leukocytes dominate the immune landscape of BrM.
Previous studies have shown that TAMs are the largest population of leukocytes in the glioma TME. Despite the correlation of TAMs with clinical prognosis and grade of glioma (Quail and Joyce, 2017; Venteicher et al., 2017), it appears that cellular phenotypes are more indicative of clinical outcomes than the mere number of infiltrating macrophages (Müller et al., 2017; Pyonteck et al., 2013). For instance, blocking the colony-stimulating factor 1 receptor (CSF-1R) resulted in a significant reduction of tumor growth in a preclinical glioma model, which was associated with a “re-education” of TAMs toward a pro-inflammatory tumor-suppressive phenotype (Coniglio et al., 2012; Pyonteck et al., 2013). Despite the general interest in targeting TAMs for anti-glioma therapy, only a few studies performed in preclinical models have taken into account the dual origin of the TAM population (Bowman et al., 2016; Chen et al., 2017). Discrimination of microglia and the blood-derived MDM revealed the predominance of “reactive” microglia-derived TAMs in glioma lesions, which is in line with a recent report by Sankowski et al. (2019). This phenotype likely results from interferon (IFN)-γ produced by TILs (both T and NK cells), a notion that can be further investigated in preclinical mouse models.
Of note, we observed high relative frequencies of MDMs (not measured in Sankowski et al. [2019]) in IDH1wt glioma and BrM. In contrast to microglia frequencies, MDMs significantly correlated with clinical outcome of glioma patients and particularly LGG. Interestingly, CX3CR1+CADM1+ MDMs observed in the IDH1wt glioma TME were phenotypically different from those

Figure 7. Immature NK Cells Accumulate in Glioblastoma

(A) Gating strategy to identify ILC (CD56+) cell subsets.
(B) Relative frequencies of ILC populations among CD56+ cells.
(C) One-SENSE analysis comparing the lineage and activation profiles of CD56+ cells.
(D) The antigen median expression of the CD56int/brightCD16neg subset, which showed a statistically significant difference between glioma and BrM. (C and D) Only
statistically significant p values were displayed (p < 0.05, Mann-Whitney-Wilcoxon test, Benjamini-Hochberg correction), whiskers within 1.5x IQR.
See also Figures S7B–S7D and Table S1.
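
The gating logic described for the CD56+ compartment (CD56+CD3− cells split by CD16, with CD69+CD103+ cells inside the CD16− gate treated as ie-ILC1-like) can be written as a small rule set. The sketch below is a simplified illustration that assumes each marker has already been called positive or negative upstream; it ignores the int/bright distinction for CD56, and the function names, labels, and example cells are hypothetical rather than the authors' gating code.

```python
from collections import Counter


def classify_cd56_cell(cell):
    """Assign one cell to a CD56+ subset from boolean marker calls."""
    if not cell["CD56"] or cell["CD3"]:
        return "other"  # outside the CD56+CD3- gate
    if cell["CD16"]:
        return "cytotoxic NK"  # CD56int/bright CD16+
    if cell["CD69"] and cell["CD103"]:
        return "ie-ILC1-like"  # CD69+CD103+ within the CD16- gate
    return "immature NK"  # remaining CD56int/bright CD16- cells


def subset_frequencies(cells):
    """Relative frequency (%) of each subset among gated CD56+CD3- cells."""
    labels = [classify_cd56_cell(c) for c in cells]
    gated = [label for label in labels if label != "other"]
    if not gated:
        return {}
    counts = Counter(gated)
    return {name: 100 * n / len(gated) for name, n in counts.items()}


# Hypothetical cells; True means the marker was called positive upstream.
cells = [
    {"CD56": True, "CD3": False, "CD16": True, "CD69": False, "CD103": False},
    {"CD56": True, "CD3": False, "CD16": False, "CD69": True, "CD103": True},
    {"CD56": True, "CD3": False, "CD16": False, "CD69": False, "CD103": False},
    {"CD56": True, "CD3": False, "CD16": False, "CD69": False, "CD103": False},
    {"CD56": False, "CD3": True, "CD16": False, "CD69": True, "CD103": False},
]
freqs = subset_frequencies(cells)
```

In practice the positive/negative calls would come from thresholds set on the measured intensities, which is the step where manual gating and clustering approaches differ most.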

identified in BrM. Recently developed preclinical glioma models will likely fuel the study of novel roles as well as unique trajectories of MDMs. For instance, the model mimicking the overproduction of (R)-2-hydroxyglutarate driven by IDH mutation (Amankulor et al., 2017) or those recreating various molecular types of gliomas (Miyai et al., 2017) can be used to further investigate differentially matured MDMs upon extrinsic tumor factors. Another attractive approach to study unique combinations of TAMs for personalized therapy is the engraftment of patient-derived glioblastoma organoids, which represent the molecular heterogeneity among gliomas (Jacob et al., 2020; Mansour et al., 2018).

In the BrM TME, we found a unique subpopulation of MDMs expressing CD206, CD209, CD169, and CD163 as well as high levels of CD38, PD-L1, and PD-L2, which is reminiscent of a tumor population of TAMs described by Ries et al. (2014) in NSCLC. MDMs expressing CD38 were also described in vitro on IFN-γ stimulation, whereas CD38 expression was strongly associated with phagocytosis (Schulz et al., 2019). In our study, we did not observe a clear separation of TAMs based on the M1/M2 model of macrophage polarization (Murray et al., 2014), which underestimates the dynamic nature of pro- or anti-inflammatory properties across macrophages (Xue et al., 2014). Instead, the unique signature of cytokines, growth factors, and mutational landscapes that instruct TAMs in each brain tumor type may trigger the phenotypic differences observed.

In most cases, we found MDMs localized near blood vessels in glioma and BrM. Previous preclinical glioma models showed a close connection of MDMs and glioma stem cells, which were mainly found in perivascular niches and reported to secrete periostin to attract TAMs (Zhou et al., 2015). In turn, TAMs are a rich source of soluble mediators (e.g., interleukin [IL]-6, IL-10, transforming growth factor-β1, and pleiotrophin) capable of sustaining malignant stem cells (Shi et al., 2017). Regarding BrM, we hypothesize that MDMs localize near blood vessels to establish the metastatic niche. However, further studies using autochthonous mouse models of BrM (An et al., 2017) are required to reveal potential targets for immunotherapy.

Regarding the lymphocyte compartment, our analyses revealed that Tregs preferentially accumulate in BrM rather than in gliomas. Importantly, the presence of PD-L1+ TAMs has been correlated with the Treg frequencies in several solid tissue tumors (Harter et al., 2015) as well as IDH1wt gliomas (Berghoff et al., 2017). In turn, Tregs secrete IL-10, IL-4, and IL-13, which may trigger the development of TAMs with immunosuppressive properties (Mantovani et al., 2017). An increased number of Tregs might also result in the suppression of cytotoxic CD8 T cell responses. Brain tumors establish an immunosuppressive TME that leads to T cell dysfunction (Quail and Joyce, 2017). T cells infiltrating glioblastoma, for instance, were found to express high amounts of multiple immune checkpoints (e.g., PD-1, Lag-3, TIM-3, or TIGIT), which correlated with a loss of effector function (Woroniecka and Fecci, 2018). PD-1 expression has been found in a high percentage of TILs in melanoma BrM (Berghoff et al., 2015). However, we found the greatest changes in T cell activation in BrM, with CD8 TRM (CD103+; CD69+) and TEM (CCR7−; CD45RA−) displaying an activated phenotype characterized by high amounts of both co-stimulatory and co-inhibitory molecules, as well as proliferation markers. How this activated phenotype is translated into certain functional properties is not yet known and will require further investigation. However, the map provided here represents an important resource for the informed design of novel therapeutic avenues.

A similar trend was observed in the NK cell compartment where the proportion of infiltrating cytotoxic and immature NK cells decreased with disease severity. In gliomas, the impaired lymphocyte populations are in line with the TCGA data that characterize IDH1wt gliomas as lymphocyte-depleted and IDH1mut gliomas as immunologically silent (Kiss et al., 2018; Thorsson et al., 2018). In neuro-oncology, the IDH1 status is leveraged as an important classifier (Louis et al., 2016), with IDH1mut patients showing a favorable survival (Eckel-Passow et al., 2015; Unruh et al., 2019). It is becoming more evident that IDH1 status is not only prognostically relevant but represents a marker for a distinct disease entity within gliomas. The distinct molecular characteristics of IDH1mut glioma and the distinct clinical behavior are now also well characterized by a distinct TME: immune cells show the least amount of activation across all brain tumor samples.

Taken together, we show that the immune cell compartment of brain TMEs is mainly shaped by the specific tumor type rather than by the CNS as an “immune-privileged niche.” We observed a continuous increase in bone marrow-derived infiltrates, such as MDMs and T cells, along the axis of IDH1mut, IDH1wt gliomas, and BrM, and increasing dominance of CNS-resident cells, such as microglia, in the glioma counterparts. The richness and activation of the BrM TMEs regarding cellular subtypes and frequencies as well as functional states parallel their favorable clinical response to checkpoint inhibitors. In-depth knowledge of the specific immunological TME signatures across brain tumors is a major step forward for the rational design of targeted immunotherapy strategies.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead Contact
  ◦ Materials Availability
  ◦ Data and Code Availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ◦ Human Brain Tissue Samples
  ◦ Animal Models
  ◦ Cell Lines
• METHOD DETAILS
  ◦ Processing of Human Samples for Cytometry Analysis
  ◦ Tissue collection for Immunofluorescence
  ◦ Orthotopic Glioma Cell Injection
  ◦ In Vivo Bioluminescent Imaging
  ◦ Harvesting and Processing of Mouse Brain Samples
  ◦ Mass-Tag Cellular Barcoding
  ◦ Metal-Isotope-Tagged Antibodies
  ◦ Cell Surface Staining for Cytometry

  ◦ Intracellular Cytokine Staining for Cytometry
  ◦ Cell Preparation and Mass Cytometry Acquisition
  ◦ Flow Cytometry Acquisition
  ◦ Immunohistochemistry
  ◦ Immunofluorescence
• QUANTIFICATION AND STATISTICAL ANALYSIS
  ◦ Preprocessing of Cytometry Data
  ◦ Automated Population Identification
  ◦ Survival Analysis
  ◦ Statistical Analysis

cell.2020.04.055.

ACKNOWLEDGMENTS

Sall1CreER mice were kindly provided by R. Nishinakamura (Kumamoto University). We thank the Mass- and Flow Cytometry Facility (University of Zurich) for technical assistance, the Department of Neuropathology (University Hospital Zurich and University of Zurich) for performing immunohistochemistry staining, and Helen Pickersgill of Life Science Editors for critical review and editing of the manuscript. This work was supported by grants from the Swiss Cancer League (KFS-4431-02-2018), the Swiss National Science Foundation (733 310030_170320, 310030_188450, and CRSII5_183478) to B.B., the European Union H2020 Project iPC (826121 to B.B.), the Clinical Research Priority Program ImmunoCure (to B.B., M.W., and M.C.N.), and the University Research Priority Program Translational Cancer Research of the University of Zurich (to B.B. and M.W.).

Conceptualization, B.B., E.F., M.W., and M.C.N.; Methodology, E.F. and K.K.; Software, E.F.; Formal Analysis, E.F. and N.G.N.; Investigation, E.F., K.K., S. Unger, S. Utz, and N.G.N.; Resources, B.B., M.C.N., M.W., and L.R.; Data Curation, E.F., S. Unger, N.G.N., K.K., M.C.N., and E.J.R.; Writing – Original Draft, E.F., B.B., and S.T.; Writing – Review & Editing, E.F., B.B., S.T., K.K., M.C.N., E.J.R., M.W., and M.G.; Visualization, E.F.; Funding Acquisition and Supervision, M.C.N., M.W., and B.B.

DECLARATION OF INTERESTS

The authors declare no competing interests.

Received: November 27, 2019
Revised: March 11, 2020
Accepted: April 28, 2020
Published: May 28, 2020

REFERENCES

Aldape, K., Brindle, K.M., Chesler, L., Chopra, R., Gajjar, A., Gilbert, M.R., Gottardo, N., Gutmann, D.H., Hargrave, D., Holland, E.C., et al. (2019). Challenges to curing primary brain tumours. Nat. Rev. Clin. Oncol. 16, 509–520.
Amankulor, N.M., Kim, Y., Arora, S., Kargl, J., Szulzewsky, F., Hanke, M., Margineantu, D.H., Rao, A., Bolouri, H., Delrow, J., et al. (2017). Mutant IDH1 regulates the tumor-associated immune system in gliomas. Genes Dev. 31, 774–786.
Amit, M., Laider-Trejo, L., Shalom, V., Shabtay-Orbach, A., Krelin, Y., and Gil, Z. (2013). Characterization of the melanoma brain metastatic niche in mice and humans. Cancer Med. 2, 155–163.
An, J., Wang, L., Zhao, Y., Hao, Q., Zhang, Y., Zhang, J., Yang, C., Liu, L., Wang, W., Fang, D., et al. (2017). Effects of FSTL1 on cell proliferation in breast cancer cell line MDA-MB-231 and its brain metastatic variant MDA-MB-231-BR. Oncol. Rep. 38, 3001–3010.
Bastian, M., Heymann, S., and Jacomy, M. (2009). Gephi: An open source software for exploring and manipulating networks. ICWSM 8, 361–362.
Bendall, S.C., Simonds, E.F., Qiu, P., Amir, A.D., Krutzik, P.O., Finck, R., Bruggner, R.V., Melamed, R., Trejo, A., Ornatsky, O.I., et al. (2011). Single-cell mass cytometry of differential immune and drug responses across a human hematopoietic continuum. Science 332, 687–696.
Berghoff, A.S., Kiesel, B., Widhalm, G., Rajky, O., Ricken, G., Wöhrer, A., Dieckmann, K., Filipits, M., Brandstetter, A., Weller, M., et al. (2015). Programmed death ligand 1 expression and tumor-infiltrating lymphocytes in glioblastoma. Neuro-oncol. 17, 1064–1075.
Berghoff, A.S., Kiesel, B., Widhalm, G., Wilhelm, D., Rajky, O., Kurscheid, S., Kresl, P., Wöhrer, A., Marosi, C., Hegi, M.E., and Preusser, M. (2017). Correlation of immune phenotype with IDH mutation in diffuse glioma. Neuro-oncol. 19, 1460–1468.
Bienkowski, M., and Preusser, M. (2015). Prognostic role of tumour-infiltrating inflammatory cells in brain tumours: literature review. Curr. Opin. Neurol. 28, 647–658.
Binnewies, M., Roberts, E.W., Kersten, K., Chan, V., Fearon, D.F., Merad, M., Coussens, L.M., Gabrilovich, D.I., Ostrand-Rosenberg, S., Hedrick, C.C., et al. (2018). Understanding the tumor immune microenvironment (TIME) for effective therapy. Nat. Med. 24, 541–550.
Bouffet, E., Larouche, V., Campbell, B.B., Merico, D., de Borja, R., Aronson, M., Durno, C., Krueger, J., Cabric, V., Ramaswamy, V., et al. (2016). Immune Checkpoint Inhibition for Hypermutant Glioblastoma Multiforme Resulting From Germline Biallelic Mismatch Repair Deficiency. J. Clin. Oncol. 34, 2206–2211.
Bowman, R.L., Klemm, F., Akkari, L., Pyonteck, S.M., Sevenich, L., Quail, D.F., Dhara, S., Simpson, K., Gardner, E.E., Iacobuzio-Donahue, C.A., et al. (2016). Macrophage Ontogeny Underlies Differences in Tumor-Specific Education in Brain Malignancies. Cell Rep. 17, 2445–2459.
Buja, A., Swayne, D.F., Littman, M.L., Dean, N., Hofmann, H., and Chen, L. (2008). Data Visualization with Multidimensional Scaling Introduction. J. Comput. Graph. Stat. 17, 444–472.
Bunse, L., Pusch, S., Bunse, T., Sahm, F., Sanghvi, K., Friedrich, M., Alansary, D., Sonner, J.K., Green, E., Deumelandt, K., et al. (2018). Suppression of antitumor T cell immunity by the oncometabolite (R)-2-hydroxyglutarate. Nat. Med. 24, 1192–1203.
Butovsky, O., Jedrychowski, M.P., Moore, C.S., Cialic, R., Lanser, A.J., Gabriely, G., Koeglsperger, T., Dake, B., Wu, P.M., Doykan, C.E., et al. (2014). Identification of a unique TGF-β-dependent molecular and functional signature in microglia. Nat. Neurosci. 17, 131–143.
Buttgereit, A., Lelios, I., Yu, X., Vrohlings, M., Krakoski, N.R., Gautier, E.L., Nishinakamura, R., Becher, B., and Greter, M. (2016). Sall1 is a transcriptional regulator defining microglia identity and function. Nat. Immunol. 17, 1397–1406.
Chen, Z., Feng, X., Herting, C.J., Garcia, V.A., Nie, K., Pong, W.W., Rasmussen, R., Dwivedi, B., Seby, S., Wolf, S.A., et al. (2017). Cellular and molecular identity of tumor-associated macrophages in glioblastoma. Cancer Res. 77, 2266–2278.
Cheng, Y., Wong, M.T., van der Maaten, L., and Newell, E.W. (2016). Categorical Analysis of Human T Cell Heterogeneity with One-Dimensional Soli-Expression by Nonlinear Stochastic Embedding. J. Immunol. 196, 924–932.
Colaprico, A., Silva, T.C., Olsen, C., Garofano, L., Cava, C., Garolini, D., Sabedot, T.S., Malta, T.M., Pagnotta, S.M., Castiglioni, I., et al. (2016). TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Research 44, e71.
Coniglio, S.J., Eugenin, E., Dobrenis, K., Stanley, E.R., West, B.L., Symons, M.H., and Segall, J.E. (2012). Microglial stimulation of glioblastoma invasion involves epidermal growth factor receptor (EGFR) and colony stimulating factor 1 receptor (CSF-1R) signaling. Mol. Med. 18, 519–527.

Croxford, A.L., Lanzinger, M., Hartmann, F.J., Schreiner, B., Mair, F., Pelczar, P., Clausen, B.E., Jung, S., Greter, M., and Becher, B. (2015). The Cytokine GM-CSF Drives the Inflammatory Signature of CCR2+ Monocytes and Licenses Autoimmunity. Immunity 43, 502–514.
Daniel, P., Sabri, S., Chaddad, A., Meehan, B., Jean-Claude, B., Rak, J., and Abdulkarim, B.S. (2019). Temozolomide Induced Hypermutation in Glioma: Evolutionary Mechanisms and Therapeutic Opportunities. Front. Oncol. 9, 41.
Darmanis, S., Sloan, S.A., Croote, D., Mignardi, M., Chernikova, S., Samghababi, P., Zhang, Y., Neff, N., Kowarsky, M., Caneda, C., et al. (2017). Single-Cell RNA-Seq Analysis of Infiltrating Neoplastic Cells at the Migrating Front of Human Glioblastoma. Cell Rep. 21, 1399–1410.
de Boer, J., Williams, A., Skavdis, G., Harker, N., Coles, M., Tolaini, M., Norton, T., Williams, K., Roderick, K., Potocnik, A.J., and Kioussis, D. (2003). Transgenic mice with hematopoietic and lymphoid specific expression of Cre. Eur. J. Immunol. 33, 314–325.
Eckel-Passow, J.E., Lachance, D.H., Molinaro, A.M., Walsh, K.M., Decker, P.A., Sicotte, H., Pekmezci, M., Rice, T., Kosel, M.L., Smirnov, I.V., et al. (2015). Glioma Groups Based on 1p/19q, IDH, and TERT Promoter Mutations in Tumors. N. Engl. J. Med. 372, 2499–2508.
Eisenring, M., vom Berg, J., Kristiansen, G., Saller, E., and Becher, B. (2010). IL-12 initiates tumor rejection via lymphoid tissue-inducer cells bearing the natural cytotoxicity receptor NKp46. Nat. Immunol. 11, 1030–1038.
Ellis, B., Haaland, P., Hahne, F., Le Meur, N., Gopalakrishnan, N., Spidlen, J., Jiang, M., and Finak, G. (2019). flowCore: Basic structures for flow cytometry data. https://rdrr.io/bioc/flowCore/.
Field, A., Miles, J., and Field, Z. (2013). Discovering Statistics Using R (Sage).
Finak, G. (2018). flowWorkspaceData: A data package containing two flowJo, one diva xml workspace and the associated fcs files as well as three GatingSets for testing the flowWorkspace, openCyto and CytoML packages. https://bioconductor.org/packages/release/data/experiment/html/flowWorkspaceData.html.
Finck, R., Simonds, E.F., Jager, A., Krishnaswamy, S., Sachs, K., Fantl, W., Pe’er, D., Nolan, G.P., and Bendall, S.C. (2013). Normalization of mass cytometry data with bead standards. Cytometry A 83, 483–494.
Forrester, J.V., McMenamin, P.G., and Dando, S.J. (2018). CNS infection and immune privilege. Nat. Rev. Neurosci. 19, 655–671.
Ginhoux, F., and Guilliams, M. (2016). Tissue-Resident Macrophage Ontogeny and Homeostasis. Immunity 44, 439–449.
Ginhoux, F., Greter, M., Leboeuf, M., Nandi, S., See, P., Gokhan, S., Mehler, M.F., Conway, S.J., Ng, L.G., Stanley, E.R., et al. (2010). Fate mapping analysis reveals that adult microglia derive from primitive macrophages. Science 330, 841–845.
Goldberg, S.B., Gettinger, S.N., Mahajan, A., Chiang, A.C., Herbst, R.S., Sznol, M., Tsiouris, A.J., Cohen, J., Vortmeyer, A., Jilaveanu, L., et al. (2016). Pembrolizumab for patients with melanoma or non-small-cell lung cancer and untreated brain metastases: early analysis of a non-randomised, open-label, phase 2 trial. Lancet Oncol. 17, 976–983.
Goldmann, T., Wieghofer, P., Jordão, M.J.C., Prutek, F., Hagemeyer, N., Frenzel, K., Amann, L., Staszewski, O., Kierdorf, K., Krueger, M., et al. (2016). Origin, fate and dynamics of macrophages at central nervous system interfaces. Nat. Immunol. 17, 797–805.
Good, Z., Borges, L., Vivanco Gonzalez, N., Sahaf, B., Samusik, N., Tibshirani, R., Nolan, G.P., and Bendall, S.C. (2019). Proliferation tracing with single-cell mass cytometry optimizes generation of stem cell memory-like T cells. Nat. Biotechnol. 37, 259–266.
Gu, Z., Gu, L., Eils, R., Schlesner, M., and Brors, B. (2014). circlize Implements and enhances circular visualization in R. Bioinformatics 30, 2811–2812.
Hahne, F., Gopalakrishnan, N., Khodabakhshi, A.H., Wong, C., and Lee, K. (2020). flowStats: Statistical methods for the analysis of flow cytometry data. R package version 4.0.0. http://www.github.com/RGLab/flowStats.
Hambardzumyan, D., Gutmann, D.H., and Kettenmann, H. (2016). The role of microglia and macrophages in glioma maintenance and progression. Nat. Neurosci. 19, 20–27.
Harrell, F.E. (2020). Hmisc: Harrell Miscellaneous. http://biostat.mc.vanderbilt.edu/Hmisc. https://github.com/harrelfe/Hmisc.
Harter, P.N., Bernatz, S., Scholz, A., Zeiner, P.S., Zinke, J., Kiyose, M., Blasel, S., Beschorner, R., Senft, C., Bender, B., et al. (2015). Distribution and prognostic relevance of tumor-infiltrating lymphocytes (TILs) and PD-1/PD-L1 immune checkpoints in human brain metastases. Oncotarget 6, 40836–40849.
Hartmann, F.J., Bernard-Valnet, R., Quériault, C., Mrdjen, D., Weber, L.M., Galli, E., Krieg, C., Robinson, M.D., Nguyen, X.-H., Dauvilliers, Y., et al. (2016). High-dimensional single-cell analysis reveals the immune signature of narcolepsy. J. Exp. Med. 213, 2621–2633.
Haynes, S.E., Hollopeter, G., Yang, G., Kurpius, D., Dailey, M.E., Gan, W.-B., and Julius, D. (2006). The P2Y12 receptor regulates microglial activation by extracellular nucleotides. Nat. Neurosci. 9, 1512–1519.
Hopperton, K.E., Mohammad, D., Trépanier, M.O., Giuliano, V., and Bazinet, R.P. (2018). Markers of microglia in post-mortem brain samples from patients with Alzheimer’s disease: a systematic review. Mol. Psychiatry 23, 177–198.
Inoue, S., Inoue, M., Fujimura, S., and Nishinakamura, R. (2010). A mouse line expressing Sall1-driven inducible Cre recombinase in the kidney mesenchyme. Genesis 48, 207–212.
Jacob, F., Salinas, R.D., Zhang, D.Y., Nguyen, P.T.T., Schnoll, J.G., Wong, S.Z.H., Thokala, R., Sheikh, S., Saxena, D., Prokop, S., et al. (2020). A Patient-Derived Glioblastoma Organoid Model and Biobank Recapitulates Inter- and Intra-tumoral Heterogeneity. Cell 180, 188–204.
Jacobs, J.F.M., Idema, A.J., Bol, K.F., Grotenhuis, J.A., de Vries, I.J.M., Wesseling, P., and Adema, G.J. (2010). Prognostic significance and mechanism of Treg infiltration in human brain tumors. J. Neuroimmunol. 225, 195–199.
Johanns, T.M., Miller, C.A., Dorward, I.G., Tsien, C., Chang, E., Perry, A., Uppaluri, R., Ferguson, C., Schmidt, R.E., Dahiya, S., et al. (2016). Immunogenomics of Hypermutated Glioblastoma: A Patient with Germline POLE Deficiency Treated with Checkpoint Blockade Immunotherapy. Cancer Discov. 6, 1230–1236.
Keenan, T.E., Burke, K.P., and Van Allen, E.M. (2019). Genomic correlates of response to immune checkpoint blockade. Nat. Med. 25, 389–402.
Keren-Shaul, H., Spinrad, A., Weiner, A., Matcovitch-Natan, O., Dvir-Szternfeld, R., Ulland, T.K., David, E., Baruch, K., Lara-Astaiso, D., Toth, B., et al. (2017). A Unique Microglia Type Associated with Restricting Development of Alzheimer’s Disease. Cell 169, 1276–1290.
Kiss, M., Van Gassen, S., Movahedi, K., Saeys, Y., and Laoui, D. (2018). Myeloid cell heterogeneity in cancer: not a single cell alike. Cell. Immunol. 330, 188–201.
Kluger, H.M., Chiang, V., Mahajan, A., Zito, C.R., Sznol, M., Tran, T., Weiss, S.A., Cohen, J.V., Yu, J., Hegde, U., et al. (2019). Long-Term Survival of Patients With Melanoma With Active Brain Metastases Treated With Pembrolizumab on a Phase II Trial. J. Clin. Oncol. 37, 52–60.
Kohanbash, G., Carrera, D.A., Shrivastav, S., Ahn, B.J., Jahan, N., Mazor, T., Chheda, Z.S., Downey, K.M., Watchmaker, P.B., Beppler, C., et al. (2017). Isocitrate dehydrogenase mutations suppress STAT1 and CD8+ T cell accumulation in gliomas. J. Clin. Invest. 127, 1425–1437.
Kolde, R. (2019). pheatmap: Pretty Heatmaps. https://rdrr.io/cran/pheatmap/.
Kuppner, M.C., Hamou, M.-F., Bodmer, S., Fontana, A., and de Tribolet, N. (1988). The glioblastoma-derived T-cell suppressor factor/transforming growth factor beta 2 inhibits the generation of lymphokine-activated killer (LAK) cells. Int. J. Cancer 42, 562–567.
Laurens van der Maaten, G.H. (2008). Visualizing Data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605.
Louis, D.N., Perry, A., Reifenberger, G., von Deimling, A., Figarella-Branger, D., Cavenee, W.K., Ohgaki, H., Wiestler, O.D., Kleihues, P., and Ellison, D.W. (2016). The 2016 World Health Organization Classification of Tumors of the Central Nervous System: a summary. Acta Neuropathol. 131, 803–820.
Mansour, A.A., Gonçalves, J.T., Bloyd, C.W., Li, H., Fernandes, S., Quang, D., Johnston, S., Parylak, S.L., Jin, X., and Gage, F.H. (2018). An in vivo model of functional and vascularized human brain organoids. Nat. Biotechnol. 36, 432–441.

Mantovani, A., Marchesi, F., Malesci, A., Laghi, L., and Allavena, P. (2017). Tumour-associated macrophages as treatment targets in oncology. Nat. Rev. Clin. Oncol. 14, 399–416.
Mcinnes, L., Healy, J., and Melville, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv, arXiv:1802.03426.
Mei, H.E., Leipold, M.D., and Maecker, H.T. (2016). Platinum-conjugated antibodies for application in mass cytometry. Cytometry A 89, 292–300.
Miyai, M., Tomita, H., Soeda, A., Yano, H., Iwama, T., and Hara, A. (2017). Current trends in mouse models of glioblastoma. J. Neurooncol. 135, 423–432.
Mrdjen, D., Hartmann, F.J., and Becher, B. (2017). High Dimensional Cytometry of Central Nervous System Leukocytes During Neuroinflammation. Methods Mol. Biol. 1559, 321–332.
Mrdjen, D., Pavlovic, A., Hartmann, F.J., Schreiner, B., Utz, S.G., Leung, B.P., Lelios, I., Heppner, F.L., Kipnis, J., Merkler, D., et al. (2018). High-Dimensional Single-Cell Mapping of Central Nervous System Immune Cells Reveals Distinct Myeloid Subsets in Health, Aging, and Disease. Immunity 48, 380–395.
Müller, S., Kohanbash, G., Liu, S.J., Alvarado, B., Carrera, D., Bhaduri, A., Watchmaker, P.B., Yagnik, G., Di Lullo, E., Malatesta, M., et al. (2017). Single-cell profiling of human gliomas reveals macrophage ontogeny as a basis for regional differences in macrophage activation in the tumor microenvironment. Genome Biol. 18, 234.
Murray, P.J., Allen, J.E., Biswas, S.K., Fisher, E.A., Gilroy, D.W., Goerdt, S., Gordon, S., Hamilton, J.A., Ivashkiv, L.B., Lawrence, T., et al. (2014). Macrophage activation and polarization: nomenclature and experimental guidelines. Immunity 41, 14–20.
Noble, W.S. (2009). How does multiple testing correction work? Nat. Biotechnol. 27, 1135–1137.
Nowicka, M., Krieg, C., Crowell, H.L., Weber, L.M., Hartmann, F.J., Guglietta, S., Becher, B., Levesque, M.P., and Robinson, M.D. (2017). CyTOF workflow: differential discovery in high-throughput high-dimensional cytometry datasets. F1000Res. 6, 748.
Okabe, Y., and Medzhitov, R. (2016). Tissue biology perspective on macrophages. Nat. Immunol. 17, 9–17.
Pyonteck, S.M., Akkari, L., Schuhmacher, A.J., Bowman, R.L., Sevenich, L., Quail, D.F., Olson, O.C., Quick, M.L., Huse, J.T., Teijeiro, V., et al. (2013). CSF-1R inhibition alters macrophage polarization and blocks glioma progression. Nat. Med. 19, 1264–1272.
Quail, D.F., and Joyce, J.A. (2013). Microenvironmental regulation of tumor progression and metastasis. Nat. Med. 19, 1423–1437.
Quail, D.F., and Joyce, J.A. (2017). The Microenvironmental Landscape of Brain Tumors. Cancer Cell 31, 326–341.
Quail, D.F., Bowman, R.L., Akkari, L., Quick, M.L., Schuhmacher, A.J., Huse, J.T., Holland, E.C., Sutton, J.C., and Joyce, J.A. (2016). The tumor microenvironment underlies acquired resistance to CSF-1R inhibition in gliomas. Science 352, aad3018.
R Development Core Team (2008). R: A language and environment for statistical computing (Vienna, Austria: R Foundation for Statistical Computing).
RStudio Team (2015). RStudio: Integrated Development for R (Boston, MA: RStudio, Inc.).
Ries, C.H., Cannarile, M.A., Hoves, S., Benz, J., Wartha, K., Runza, V., Rey-Giraud, F., Pradel, L.P., Feuerhake, F., Klaman, I., et al. (2014). Targeting tumor-associated macrophages with anti-CSF-1R antibody reveals a strategy for cancer therapy. Cancer Cell 25, 846–859.
Samusik, N., Good, Z., Spitzer, M.H., Davis, K.L., and Nolan, G.P. (2016). Automated mapping of phenotype space with single-cell data. Nat. Methods 13, 493–496.
Sankowski, R., Böttcher, C., Masuda, T., Geirsdottir, L., Sagar, Sindram, E., Seredenina, T., Muhs, A., Scheiwe, C., Shah, M.J., et al. (2019). Mapping microglia states in the human brain through the integration of high-dimensional techniques. Nat. Neurosci. 22, 2098–2110.
Schindelin, J., Arganda-Carreras, I., Frise, E., Kaynig, V., Longair, M., Pietzsch, T., Preibisch, S., Rueden, C., Saalfeld, S., Schmid, B., et al. (2012). Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682.
Schulz, D., Severin, Y., Zanotelli, V.R.T., and Bodenmiller, B. (2019). In-Depth Characterization of Monocyte-Derived Macrophages using a Mass Cytometry-Based Phagocytosis Assay. Sci. Rep. 9, 1925.
Shi, Y., Ping, Y.-F., Zhou, W., He, Z.-C., Chen, C., Bian, B.-S.-J., Zhang, L., Chen, L., Lan, X., Zhang, X.-C., et al. (2017). Tumour-associated macrophages secrete pleiotrophin to promote PTPRZ1 signalling in glioblastoma stem cells for tumour growth. Nat. Commun. 8, 15080.
Simoni, Y., Fehlings, M., Kløverpris, H.N., McGovern, N., Koo, S.L., Loh, C.Y., Lim, S., Kurioka, A., Fergusson, J.R., Tang, C.L., et al. (2017). Human Innate Lymphoid Cell Subsets Possess Tissue-Type Based Heterogeneity in Phenotype and Frequency. Immunity 46, 148–161.
Smolders, J., Heutinck, K.M., Fransen, N.L., Remmerswaal, E.B.M., Hombrink, P., Ten Berge, I.J.M., van Lier, R.A.W., Huitinga, I., and Hamann, J. (2018). Tissue-resident memory T cells populate the human brain. Nat. Commun. 9, 4593.
Spitzer, M.H., Gherardini, P.F., Fragiadakis, G.K., Bhattacharya, N., Yuan, R.T., Hotson, A.N., Finck, R., Carmi, Y., Zunder, E.R., Fantl, W.J., et al. (2015). IMMUNOLOGY. An interactive reference framework for modeling a dynamic immune system. Science 349, 1259425.
Street, K., Risso, D., Fletcher, R.B., Das, D., Ngai, J., Yosef, N., Purdom, E., and Dudoit, S. (2018). Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19, 477.
Takasato, M., Osafune, K., Matsumoto, Y., Kataoka, Y., Yoshida, N., Meguro, H., Aburatani, H., Asashima, M., and Nishinakamura, R. (2004). Identification of kidney mesenchymal genes by a combination of microarray analysis and Sall1-GFP knockin mice. Mech. Dev. 121, 547–557.
Takenaka, M.C., Gabriely, G., Rothhammer, V., Mascanfroni, I.D., Wheeler, M.A., Chao, C.C., Gutiérrez-Vázquez, C., Kenison, J., Tjon, E.C., Barroso, A., et al. (2019). Control of tumor-associated macrophages and T cells in glioblastoma via AHR and CD39. Nat. Neurosci. 22, 729–740.
Tawbi, H.A., Forsyth, P.A., Algazi, A., Hamid, O., Hodi, F.S., Moschos, S.J., Khushalani, N.I., Lewis, K., Lao, C.D., Postow, M.A., et al. (2018). Combined Nivolumab and Ipilimumab in Melanoma Metastatic to the Brain. N. Engl. J. Med. 379, 722–730.
Thorsson, V., Gibbs, D.L., Brown, S.D., Wolf, D., Bortone, D.S., Ou Yang, T.-H., Porta-Pardo, E., Gao, G.F., Plaisier, C.L., Eddy, J.A., et al.; Cancer Genome Atlas Research Network (2018). The Immune Landscape of Cancer. Immunity 48, 812–830.
Tugues, S., Ducimetiere, L., Friebel, E., and Becher, B. (2019). Innate lymphoid cells as regulators of the tumor microenvironment. Semin. Immunol. 41, 101270.
Uhl, M., Aulwurm, S., Wischhusen, J., Weiler, M., Ma, J.Y., Almirez, R., Mangadu, R., Liu, Y.-W., Platten, M., Herrlinger, U., et al. (2004). SD-208, a Novel Transforming Growth Factor β Receptor I Kinase Inhibitor, Inhibits Growth and Invasiveness and Enhances Immunogenicity of Murine and Human Glioma Cells In vitro and In vivo. Cancer Res. 64, 7954–7961.
Unruh, D., Zewde, M., Buss, A., Drumm, M.R., Tran, A.N., Scholtens, D.M., and Horbinski, C. (2019). Methylation and transcription patterns are distinct in IDH mutant gliomas compared to other IDH mutant cancers. Sci. Rep. 9, 8946.
Van Gassen, S., Callebaut, B., Van Helden, M.J., Lambrecht, B.N., Demeester, P., Dhaene, T., and Saeys, Y. (2015). FlowSOM: Using self-organizing maps for visualization and interpretation of cytometry data. Cytometry A 87, 636–645.
Van Hove, H., Martens, L., Scheyltjens, I., De Vlaminck, K., Pombo Antunes, A.R., De Prijck, S., Vandamme, N., De Schepper, S., Van Isterdael, G., Scott, C.L., et al. (2019). A single-cell atlas of mouse brain macrophages reveals unique transcriptional identities shaped by ontogeny and tissue environment. Nat. Neurosci. 22, 1021–1035.
Venteicher, A.S., Tirosh, I., Hebert, C., Yizhak, K., Neftel, C., Filbin, M.G., Hovestadt, V., Escalante, L.E., Shaw, M.L., Rodman, C., et al. (2017). Decoupling genetics, lineages, and microenvironment in IDH-mutant gliomas by single-cell RNA-seq. Science 355, eaai8478.
Walker, D.G., and Lue, L.-F. (2015). Immune phenotypes of microglia in human neurodegenerative disease: challenges to detecting microglial polarization in human brains. Alzheimers Res. Ther. 7, 56.
Warnes, G.R., Bolker, B., Bonebakker, L., Gentleman, R., Liaw, W.H.A., Lumley, T., Maechler, M., Magnusson, A., Moeller, S., Schwartz, M., et al. (2019). gplots: Various R Programming Tools for Plotting Data. https://rdrr.io/cran/gplots/.
Wickham, H., François, R., Henry, L., and Müller, K. (2019a). dplyr: A Grammar of Data Manipulation. https://dplyr.tidyverse.org/.
Wickham, H., Chang, W., Henry, L., Pedersen, T.L., Takahashi, K., Wilke, C., Woo, K., and Yutani, H. (2019b). ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. https://ggplot2.tidyverse.org/reference/ggplot2-package.html.
Woroniecka, K., and Fecci, P.E. (2018). T-cell exhaustion in glioblastoma. Oncotarget 9, 35287–35288.
Xue, J., Schmidt, S.V., Sander, J., Draffehn, A., Krebs, W., Quester, I., De Nardo, D., Gohel, T.D., Emde, M., Schmidleithner, L., et al. (2014). Transcriptome-based network analysis reveals a spectrum model of human macrophage activation. Immunity 40, 274–288.
Zhou, W., Ke, S.Q., Huang, Z., Flavahan, W., Fang, X., Paul, J., Wu, L., Sloan, A.E., McLendon, R.E., Li, X., et al. (2015). Periostin secreted by glioblastoma stem cells recruits M2 tumour-associated macrophages and promotes malignant growth. Nat. Cell Biol. 17, 170–182.
Zunder, E.R., Finck, R., Behbehani, G.K., Amir, A.D., Krishnaswamy, S., Gonzalez, V.D., Lorang, C.G., Bjornson, Z., Spitzer, M.H., Bodenmiller, B., et al. (2015). Palladium-based mass tag cell barcoding with a doublet-filtering scheme and single-cell deconvolution algorithm. Nat. Protoc. 10, 316–333.

STAR★METHODS
KEY RESOURCES TABLE

Antibodies
Mass cytometry myeloid panel
anti-human Anti-SynCAM (TSLC1/CADM1), purified MBL Life Science Cat# CM004-3; RRID: AB_592783
anti-Biotin (1D4-C5), purified Biolegend Cat# 409002; RRID: AB_10642032
anti-human CCR2 (K036C2), purified Biolegend Cat# 357202; RRID: AB_2561851
anti-human CD100 (A8), purified Biolegend Cat# 328401; RRID: AB_1236386
anti-human CD11b (ICRF44), 209 Bi Fluidigm Cat# 3209003B; RRID: AB_2687654
anti-human CD11c (Bu15), 147-Sm Fluidigm Cat# 3147008B; RRID: AB_2687850
anti-human CD123 (6H6), purified Biolegend Cat# 306002; RRID: AB_314576
anti-human CD14 (M5E2), 160-Gd Fluidigm Cat# 3160001B; RRID: AB_2687634
anti-human CD141 (M80), purified Biolegend Cat# 344102; RRID: AB_2201808
anti-human CD16 (3G8), 165-Ho Fluidigm Cat# 3165001B; RRID: AB_2802109
anti-human CD163 (GHI/61), purified Biolegend Cat# 333602; RRID: AB_1088991
anti-human CD169 (7-239), purified Biolegend Cat# 346002; RRID: AB_2189031
anti-human CD19 (HIB19), 142-Nd Fluidigm Cat# 3142001B; RRID: AB_2651155
anti-human CD1c (L161), purified Biolegend Cat# 331502; RRID: AB_1088995
anti-human CD206 (15-2), purified Biolegend Cat# 321102; RRID: AB_571923
anti-human CD209 (DCN46), purified BD Cat# 551186; RRID: AB_394087
anti-human CD3 (UCHT1), 170-Er Fluidigm Cat# 3170001B; RRID: AB_2661807
anti-human CD33 (WM53), 169-Tm Fluidigm Cat# 3169010B; RRID: AB_2802111
anti-human CD38 (HIT2), 167-Er Fluidigm Cat# 3167001B; RRID: AB_2802110
anti-human CD45 (HI30), 89-Y Fluidigm Cat# 3089003B; RRID: AB_2661851
anti-human CD45RA (HI100), 153-Eu Fluidigm Cat# 3153001B; RRID: AB_2802108
anti-human CD49d (9F10), 141-Pr Fluidigm Cat# 3141004B; RRID: N/A
anti-human CD5 (UCHT2), purified Biolegend Cat# 300602; RRID: AB_314088
anti-human CD56 (NCAM16), purified BD Cat# 559043, RRID: AB_397180
anti-human CD64 (10.1), 146-Nd Fluidigm Cat# 3146006B; RRID: AB_2661790
anti-human CD66b (80H3), 152-Sm Fluidigm Cat# 3152011B; RRID: AB_2661795
anti-human CD68 (Y1/82A), purified Biolegend Cat# 333802; RRID: AB_1089058
anti-human CD86 (IT2.2), 156-Gd Fluidigm Cat# 3156008B; RRID: AB_2661798
anti-human CD88 (S5/2), purified Biolegend Cat# 344302; RRID: AB_2259318
anti-human CX3CR1 (2A9-1), purified Biolegend Cat# 341602; RRID: AB_1595422
anti-human FCER1 (AER-37 (CRA1)), biotin eBioscience Cat# 13-5899-82; RRID: AB_466786
anti-human HLA-DR (TU36), 150-Nd Fluidigm Cat# 3150028B; RRID: N/A
anti-human ICAM-1 (HA58), purified Biolegend Cat# 353102; RRID: AB_11204426
anti-human LAP (TW4-2F8), purified Biolegend Cat# 349602; RRID: AB_10645476
anti-human MERTK (125518), purified R&D Cat# MAB8912; RRID: AB_2143588
anti-human PD-1 (EH12.2H7), 174-Yb Fluidigm Cat# 3174020B; RRID: N/A
anti-human PD-L1 (29E.2A3), 175-Lu Fluidigm Cat# 3175017B; RRID: AB_2687638
anti-human PD-L2 (24F.10C12), purified Biolegend Cat# 329602; RRID: AB_1089010
anti-human SOX2 (14A6A34), purified Biolegend Cat# 656102; RRID: AB_2562246
Mass cytometry lymphoid panel
anti-human 2B4 (C1.7), purified Biolegend Cat# 329502, RRID: AB_1279194
anti-Biotin (1D4-C5), purified Biolegend Cat# 409002; RRID: AB_10642032
anti-human CCR2 (K036C2), purified Biolegend Cat# 357202; RRID: AB_2561851
anti-human CCR4 (205410), Gd-158 Fluidigm Cat# 3158006A, RRID: AB_2687647
anti-human CCR6 (11A9), Pr-141 Fluidigm Cat# 3141014A, RRID: N/A
anti-human CCR7 (G043H7), Er-167 Fluidigm Cat# 3167009A, RRID: N/A
anti-human CD103 (Ber-ACT8), Eu-151 Fluidigm Cat# 3151011B, RRID: AB_2756418
anti-human CD127 (A019D5), Ho-165 Fluidigm Cat# 3165008B, RRID: N/A
anti-human CD137 (4B4-1), purified Biolegend Cat# 309802, RRID: AB_314781
anti-human CD16 (3G8), Bi-209 Fluidigm Cat# 3209002B, RRID: AB_2756431
anti-human CD19 (HIB19), Nd-142 Fluidigm Cat# 3142001B, RRID: AB_2651155
anti-human CD25 (2A3), Sm-149 Fluidigm Cat# 3149010B, RRID: AB_2756416
anti-human CD27 (L127), Gd-155 Fluidigm Cat# 3155001B, RRID: AB_2687645
anti-human CD28 (CD28.2), purified Biolegend Cat# 302902, RRID: AB_314304
anti-human CD3 (UCHT1), Sm-154 Fluidigm Cat# 3154003B, RRID: AB_2687853
anti-human CD38 (HIT2), purified Biolegend Cat# 303502, RRID: AB_314354
anti-human CD4 (RPA-T4), Nd-145 Fluidigm Cat# 3145001B, RRID: AB_2661789
anti-human CD45 (HI30), purified Biolegend Cat# 304002, RRID: AB_314390
anti-human CD45RA (HI100), Eu-153 Fluidigm Cat# 3153001B, RRID: AB_2802108
anti-human CD45RO (UCHL1), Dy-164 Fluidigm Cat# 3164007B, RRID: AB_2811092
anti-human CD45RO (UCHL1), purified Biolegend Cat# 304202, RRID: AB_314418
anti-human CD56 (NCAM16), purified BD Cat# 559043, RRID: AB_397180
anti-human CD57 (hCD57), Yb-172 Fluidigm Cat# 3172009B, RRID: N/A
anti-human CD69 (FN50), Nd-144 Fluidigm Cat# 3144018, RRID: AB_2687849
anti-human CD8 (RPA-T8), Nd-146 Fluidigm Cat# 3146001B, RRID: AB_2687641
anti-human CD90 (5E10), Tb-159 Fluidigm Cat# 3159007B, RRID: N/A
anti-human CD95 (DX2), Dy-164 Fluidigm Cat# 3164008B, RRID: N/A
anti-human CRTH2 (BM16), purified Biolegend Cat# 350102, RRID: AB_10639863
anti-human CTLA-4 (14D3), Dy-161 Fluidigm Cat# 3161004B, RRID: AB_2687649
anti-human CXCR3 (G025H7), Gd-156 Fluidigm Cat# 3156004B, RRID: AB_2687646
anti-human Granzyme B (GB11), Yb-171 Fluidigm Cat# 3171002B, RRID: AB_2687652
anti-human HLA-DR (TU36), Nd-150 Fluidigm Cat# 3150028B, RRID: N/A
anti-human ICOS (C398.4A), purified Biolegend Cat# 313502, RRID: AB_416326
anti-human Ki-67 (B56), Er-168 Fluidigm Cat# 3168007B, RRID: AB_2800467
anti-human KLRG1 (14C2A07), purified Biolegend Cat# 368602, RRID: AB_2566256
anti-human LAG-3 (11C3C65), Er-168 Fluidigm Cat# 3165037B, RRID: AB_2810971
anti-human Nkp44 (P44-8), purified Biolegend Cat# 325102, RRID: AB_756094
anti-human CD134 (OX-40; Ber-ACT35 (ACT35)), purified Biolegend Cat# 350002, RRID: AB_10639951
anti-human PD-1 (EH12.2H7), Lu-175 Fluidigm Cat# 3175008, RRID: AB_2687629
anti-human SOX2 (14A6A34), purified Biolegend Cat# 656102; RRID: AB_2562246
anti-human TCRgd (11F8), Sm-152 Fluidigm Cat# 3152008B, RRID: AB_2687643
anti-human TIGIT (MBSA43), Tm-169 eBioscience Cat# 16-9500-82, RRID: AB_10718831
anti-human Tim3 (F38-2E2), Biotin Miltenyi Cat# 130-098-945, RRID: N/A
anti-human Tim3 (F38-2E2), VioBright FITC Miltenyi Cat# 130-104-646, RRID: N/A
Flow cytometry panel, human brain samples
anti-human CCR2 (K036C2), BV605 Biolegend Cat# 357213; RRID:AB_2562702
anti-human CD11b (ICRF-44), FITC Biolegend Cat# 301330; RRID: AB_2561703
anti-human CD11c (B-ly6), BV570 BD Cat# 624298; RRID: N/A
anti-human CD123 (6H6), BV711 Biolegend Cat# 306029; RRID: AB_2566353
anti-human CD14 (6H6), BUV737 BD Cat# 564444; RRID:AB_2744285
anti-human CD141 (1A4), BB700 BD Cat# 742245; RRID:AB_2740668
anti-human CD16 (3G8), BUV496 BD Cat# 321102; RRID: N/A
anti-human CD163 (GHI/61), BV650 BD Cat# 563888; RRID:AB_2738468
anti-human CD169 (7-239), PE/Dazzle 594 Biolegend Cat# 346015; RRID:AB_2750265
anti-human CD19 (REA675), APC-Vio770 Miltenyi Cat# 130-110-252; RRID: N/A
anti-human CD1c (F10/21A3), BB660 BD Cat# 624295; RRID: N/A
anti-human CD206 (15-2), BV421 Biolegend Cat# 321126; RRID: AB_2563839
anti-human CD235A (REA175), APC-Vio770 Miltenyi Cat# 130-100-264; RRID:AB_2656505
anti-human CD273 (MIH18), AF700 BD Cat# 565189; RRID:AB_2739102
anti-human CD274 (MIH1), PE-Cy7 BD Cat# 558017; RRID:AB_396986
anti-human CD3 (REA613), APC-Vio770 Miltenyi Cat# 130-109-543; RRID:AB_2657072
anti-human CD33 (WM53), BUV395 BD Cat# 740293; RRID:AB_2740032
anti-human CD38 (HIT2), BV785 Biolegend Cat# 303529; RRID: AB_2561368
anti-human CD45 (HI-30), BUV805 BD Cat# 564915; RRID: N/A
anti-human CD45RA (MEM-56), PE-Cy5.5 Life Technologies, Thermo Fisher Scientific Cat# MHCD45RA18; RRID:AB_10372221
anti-human CD49d (9F10), Biotin Biolegend Cat# 304334; RRID: AB_2749896
anti-human CD56 (HCD56), APC-C7 Biolegend Cat# 318332; RRID: AB_10896424
anti-human CD66b (G10F5), BB790-P BD Customized
anti-human CD86 (IT2.2), PE-Cy5 Biolegend Cat# 305407; RRID: AB_314527
anti-human CX3CR1 (2A9-1), BV480 BD Cat# 746723; RRID:AB_2743987
anti-human HLA-DR (G46-6), BUV661 BD Cat# 565073; RRID:AB_2722500
anti-human MERTK (125518), APC R&D Cat# FAB8912A; RRID: AB_357213
anti-human P2Y12 (S16001E), PE Biolegend Cat# 392104; RRID: AB_2716007
Flow cytometry panel, mouse brain samples
anti-mouse MHCII (M5/114.15.2), BB700 BD Cat# 746197; RRID:AB_2743544
anti-mouse CCR2 (SA203G11), BV650 BD Cat# 747968; RRID: N/A
anti-mouse CD11b (M1/70), BUV737 BD Cat# 564443; RRID:AB_2738811
anti-mouse CD11c (N418), BV570 Biolegend Cat# 117331; RRID: AB_10900261
anti-mouse CD163 (TNKUPJ), biotin eBioscience Cat# 14-1631-82; RRID:AB_2716934
anti-mouse CD206 (C068C2), AF700 Biolegend Cat# 141734; RRID: AB_2629637
anti-mouse CD209a (MMD3), PE Biolegend Cat# 833004; RRID: AB_2721637
anti-mouse CD38 (90), PE-Dazzle594 Biolegend Cat# 102729; RRID: AB_2632890
anti-mouse CD45 (30-F11), BUV395 BD Cat# 564279; RRID:AB_2651134
anti-mouse CD49d (R1-2), PE-Cy7 BioLegend Cat# 103618; RRID:AB_2563700
anti-mouse CD64 (X54-5/7.1), BV421 Biolegend Cat# 139309; RRID: AB_2562694
anti-mouse CX3CR1 (SA011F11), BV510 Biolegend Cat# 149025; RRID: AB_2565707
anti-mouse F4/80 (BM8), PE-Cy5 Biolegend Cat# 123112; RRID: AB_893482
anti-mouse Ly6C (HK1.4), BV711 Biolegend Cat# 128037; RRID: AB_2562630
anti-mouse Ly6G (1A8), BUV 563 BD Cat# 565707; RRID:AB_2739334
anti-mouse MerTK (DS5MMER), SuperBright 780 eBioscience Cat# 78-5751-82; RRID: AB_2762814
anti-mouse PD-L1 (10F.9G2), BV605 Biolegend Cat# 124321; RRID: AB_2563635
anti-mouse Siglec1 (3D6.112), APC Biolegend Cat# 142417; RRID: AB_2565640
Streptavidin, BUV661 BD Customized
Immunohistochemistry
CD163 (163C01/10D6) NeoMarkers / Lab Vision Corporation Cat# MS-1103-S, RRID:AB_64138
Iba1 Wako Chemicals Cat# 019-19741, RRID:AB_839504
Immunofluorescence
VE-cadherin (C-19) (polyclonal) Santa Cruz Cat# sc-6458, RRID:AB_2077955
anti-human P2RY12 (S16001E), PE Biolegend Cat# 392104; RRID:AB_2716007
anti-human CD163 (GHI/61), PE Biolegend Cat# 333606; RRID:AB_1134002
anti-human CD169 (7-239), PE-Dazzle594 Biolegend Cat# 346015; RRID:AB_2750265
anti-human CD206 (15-2), PE-Dazzle594 Biolegend Cat# 321106; RRID:AB_571911
anti-human CD209 (DCS-8C1), PE Biolegend Cat# 343004; RRID:AB_2074328
Biological Samples
Brain tumor (glioma and brain metastases) and non-tumor brain tissue (epilepsy) samples University Hospital Zurich N/A
16% Paraformaldehyde aqueous solution Electron Microscopy Sciences/LucernaChem Cat# 15710; RRID: N/A
2-methylbutane Sigma Aldrich Cat# M32631-25.L; RRID: N/A
Antibody Stabilizer PBS Candor Bioscience Cat# 131 050; RRID: N/A
Antifading Mounting Medium with DAPI Dianova Cat# SCR-038448; RRID: N/A
Bambanker LubioScience GmbH Cat# 523303 (BB02); RRID: N/A
Benzonase nuclease Sigma-Aldrich Cat# E1014-25KU; RRID: N/A
Bovine Serum Albumin (BSA) Sigma-Aldrich Cat# B2064; RRID: N/A
Bromoacetamidobenzyl-EDTA (BABE) Dojindo Laboratories Cat# B437-10; RRID: N/A
Cell-ID™ Intercalator-Ir Fluidigm Cat# 201192B; RRID: N/A
Cell-ID Cisplatin Fluidigm Cat# 201064; RRID: N/A
CO2-Independent Medium Thermo Fisher Scientific Cat# 18045-070; RRID: N/A
Collagenase from Clostridium histolyticum, type IV Sigma-Aldrich Cat# C5138; RRID: N/A
Cryo Embedding Medium Medite Cat# 41-3011-00; RRID: N/A
Dead Cell Removal Kit Miltenyi Cat# 30-090-101; RRID: N/A
Deoxyribonuclease I from bovine pancreas Sigma-Aldrich Cat# DN25-1G; RRID: N/A
D-Luciferin Perkin Elmer Cat# 122799; RRID: N/A
DMSO Sigma-Aldrich Cat# D2438; RRID: N/A
DMSO; Dimethyl Sulfoxide, anhydrous, 99.7% Fisher BioReagents Cat# BP231-1; RRID: N/A
EDTA StemCell Technologies, Inc Cat# EDS-100G; RRID: N/A
EQ Four Element Calibration Fluidigm Cat# 201078; RRID: N/A
Formaldehyde 4.0% PanReac Cat# 252931.1211; RRID: N/A
Foxp3 / Transcription Factor Staining Buffer Set eBioscience Cat# 00-5523-00; RRID: N/A
HBSS ThermoFisher Scientific Cat# 14175095; RRID: N/A
Human TruStain FcX Biolegend Cat# 422302; RRID:AB_2818986
Indium (115In) Trace Sciences International N/A
Iridium (191Ir, 193Ir) Fluidigm Cat# 201192A; RRID: N/A
Isoflurane Minrad N/A
Maleimido-mono-amide-DOTA (mDOTA) Macrocyclics Cat# B-272; RRID: N/A
Maxpar X8 Multimetal Labeling Kit Fluidigm Cat# 201300; RRID: N/A
Maxpar Fix and Perm Buffer Fluidigm Cat# 201067; RRID: N/A
Normal goat serum ThermoFisher Scientific Cat# PCN5000; RRID: N/A
Palladium (104Pd,105Pd, 106Pd, 108Pd, 110Pd) Trace Sciences International N/A
Percoll GE Cat# P4937; RRID: N/A
Percoll Sigma-Aldrich Cat# GE17-0891; RRID: N/A-0
Phosphate-buffered saline Homemade N/A
Phosphate-buffered saline, DPBS, NO CALCIUM, NO MAGNESIUM Life Technologies Cat# 14190094; RRID: N/A
RPMI 1640 Bioswisstec; Seraglob Cat# M 3413; RRID: N/A
RPMI 1640 Medium, HEPES, no glutamine Life Technologies Cat# 42401042; RRID: N/A
Saponin Sigma-Aldrich Cat# S7900; RRID: N/A
Sudan black B Sigma-Aldrich Cat# 199664; RRID: N/A
Deposited Data
TCGA-LGG - Harmonized The Cancer Genome Atlas https://portal.gdc.cancer.gov - mRNA gene quantification HTSeq - Counts
TCGA-GBM - Legacy The Cancer Genome Atlas https://portal.gdc.cancer.gov/legacy-archive - mRNA gene expression and quantification HT_HG-U133A
TCGA-GBM - Harmonized The Cancer Genome Atlas https://portal.gdc.cancer.gov - mRNA gene quantification HTSeq - Counts
Mass- and flow cytometry data This study https://data.mendeley.com/datasets/jk8c3c3nmz/draft?a=c0a9d8dc-8ac2-4942-baf9-208de7a8c310
GL-261 cells A. Fontana, Experimental Immunology, University of Zurich, Zurich, Switzerland N/A
Sall1CreER/+ Ryuchi Nishinakamura (Kumamoto University) RRID:MGI:4818961
R26YFP The Jackson Laboratory RRID:IMSR_JAX:006148
MATLAB R2016a N/A https://www.mathworks.com/
Normalizer Finck et al., 2013 https://github.com/nolanlab/bead-normalization/releases
FlowJo V10.6.1.1 Tree Star https://www.flowjo.com/
R version 3.5 R Development Core Team, 2008 https://www.r-project.org/
R Studio RStudio Team, 2015 https://www.rstudio.com/
FlowSOM Van Gassen et al., 2015 https://github.com/SofieVG/FlowSOM
Circlize Gu et al., 2014 https://cran.r-project.org/web/packages/circlize/index.html
MCF-data-analysis Hartmann et al., 2016 https://github.com/hartmannfj/MCF-data-analysis
CyTOF workflow Nowicka et al., 2017 https://f1000research.com/articles/6-748#
UMAP McInnes et al., 2018 https://github.com/lmcinnes/umap
SCAFFoLD Spitzer et al., 2015 https://github.com/nolanlab/scaffold
VorteX Samusik et al., 2016 https://github.com/nolanlab/vortex/wiki/Getting-Started
Slingshot Street et al., 2018 https://bioconductor.org/packages/release/bioc/html/slingshot.html
t-SNE Laurens van der Maaten, 2008 https://github.com/jkrijthe/Rtsne
One-SENSE Cheng et al., 2016 N/A
flowStats Hahne et al., 2020 https://www.bioconductor.org/packages/release/bioc/html/flowStats.html
pheatmap Kolde, 2019 https://cran.r-project.org/web/packages/pheatmap/index.html
dplyr Wickham et al., 2019a https://cran.r-project.org/web/packages/dplyr/index.html
ggplot2 Wickham et al., 2019b https://cran.r-project.org/web/packages/ggplot2/index.html
gplots Warnes et al., 2019 https://cran.r-project.org/web/packages/gplots/index.html
Hmisc Harrell, 2020 https://cran.r-project.org/web/packages/Hmisc/index.html
flowWorkspaceData Finak, 2018 N/A
flowCore Ellis et al., 2019 N/A
TCGAbiolinks Colaprico et al., 2016 https://bioconductor.org/packages/release/bioc/html/TCGAbiolinks.html
Gephi Bastian et al., 2009 https://gephi.org/
Fiji Schindelin et al., 2012 https://imagej.net/Fiji
Living Image 2.5 Caliper Life Sciences N/A
Leica Bond III N/A https://www.leicabiosystems.com
Adobe Illustrator CS6 Adobe https://www.adobe.com/ch_de/products/illustrator.html
Other
gentleMACS Octo Dissociator with Heaters Miltenyi Biotec Cat# 130-096-427
Bone wax (Aesculap) B. Braun N/A
FACSymphony BD N/A
Gentle MACS C-tubes Miltenyi Biotec Cat# 130-096-334
Hamilton 75N syringe Sigma-Aldrich Cat# 28613-U
Hamilton syringe 5 ml Sigma-Aldrich Cat# 26286
Helios CyTOF2 Fluidigm N/A
Hyrax C60 Cryostat Zeiss N/A
MACS columns MS Miltenyi Biotec Cat# 130-042-201
Microinjection pump (UMP-3) World Precision Instruments N/A
Olympus IX81 Olympus N/A
Stereotactic frame David Kopf Instruments N/A
Tissue glue Indermil; Henkel N/A
Xenogen IVIS 100 Caliper Life Sciences N/A
Lead Contact
Further information and requests for resources should be directed to the Lead Contact, Burkhard Becher
(becher@immunology.uzh.ch).

Mass and flow cytometry data generated during this study are available at
https://data.mendeley.com/datasets/jk8c3c3nmz/draft?a=c0a9d8dc-8ac2-4942-baf9-208de7a8c310
Human Brain Tissue Samples

Human brain tissue samples were collected in the Department of Neurosurgery at the University Hospital Zurich after written
informed patient consent following the local ethical requirements and the declaration of Helsinki. All analyses were performed ac-
cording to the guidelines of the local ethics committees (KEK-ZH-Nr. 2015-0163). All the collected samples were anonymized before
processing. Clinical information related to patients was collected during diagnosis and is summarized in Table S1. A total of 45
patients (glioma, brain metastasis, and epilepsy cases), both male and female and aged 3 to 80 years, were included in the
present study.
Animal Models
R26YFP mice (de Boer et al., 2003) were bred in house (see Key Resources Table). Sall1CreER mice were kindly provided by
R. Nishinakamura (Kumamoto University) (Inoue et al., 2010; Takasato et al., 2004). All ‘Cre’ and ‘CreER’ strains were used as heterozygotes.
Six-week-old mice (female and male) were used for all experiments of the glioma preclinical model. All animal experiments performed
in this study were approved by the Swiss Veterinary Office. All mice were on a C57BL/6 background and kept in individually ventilated
cages under specific-pathogen-free conditions. Animals were monitored once a week to assess weight loss and physical/neurolog-
ical abnormalities. In vivo measurements to assess tumor growth were also performed once a week.
Cell Lines
GL-261 cells, which are syngeneic in C57BL/6 mice, were stably transfected with pGl3-ctrl and pGK-Puro (Promega) and selected
with puromycin (Sigma-Aldrich) to generate luciferase-stable GL-261 cells. A single clone was isolated by limiting dilution and
passaged in vivo by intracranial tumor inoculation. Subsequently, cells were transfected with pCEP4-mIgG3, pCEP4-mIl-
12mIgG3, or pCEP4-mIl23mIgG3, and cytokine production was detected by ELISA and RT-PCR, as previously described (Eisenring
et al., 2010). SMA-560 spontaneous murine astrocytoma cells were characterized previously (Uhl et al., 2004).
METHOD DETAILS
Processing of Human Samples for Cytometry Analysis

Freshly resected human brain tissue samples were transported within 2 hours to the Laboratory of Molecular Neuro-Oncology for
tissue dissection and processing. First, tissue samples were thoroughly washed with phosphate-buffered saline (PBS) to remove
visible blood clots and to reduce contamination with blood leukocytes. The tissue was then minced with scalpels into pieces of
approximately 3 to 5 mm in diameter and digested (1 mg/mL Collagenase IV (Sigma-Aldrich), 10 mg/mL DNase (Sigma-Aldrich), 10%
Fetal Bovine Serum (FBS) (Thermo Fisher Scientific), RPMI 1640 (Seraglob)) at 37 °C for 45 minutes with continuous shaking using
the gentleMACS Octo Dissociator with Heaters (Miltenyi Biotec). The enzymatic reaction was stopped by adding 2 mM EDTA (StemCell
Technologies, Inc) in PBS to twice the sample volume. Afterward, the homogenate was filtered through a 100 µm cell strainer and
centrifuged at 400 × g for 8 minutes at 4 °C to pellet the cells and myelin. This was followed by a myelin removal step by
gradient centrifugation with 30% Percoll (Sigma-Aldrich) in PBS (1,592 × g for 30 minutes at 4 °C; without brakes during
deceleration) using a 50 mL tube with a lid in a fixed-angle rotor. After separation of the myelin (the top white layer), the
middle transparent layer, without the bottom layer of red blood cells, was collected and filtered once more through a 100 µm
cell strainer. The single-cell suspension was washed in PBS and centrifuged at 400 × g for 8 minutes at 4 °C to pellet the
cells. At this point, cells were ready for flow cytometry analysis or for freezing. For freezing, the single-cell suspension
was divided into two halves. One half was viably frozen in Bambanker (LubioScience GmbH). The second half was stained for
viability with Cell-ID Cisplatin (Fluidigm) (3 minutes at 4 °C), followed by washing in PBS and centrifugation at 400 × g for
5 minutes at 4 °C to pellet the cells. Next, cells were fixed with 1.6% paraformaldehyde aqueous solution (PFA; Electron
Microscopy Sciences) for 15 minutes at room temperature (RT) (Zunder et al., 2015). After fixation, cells were washed twice in
PBS, centrifuged at 600 × g for 5 minutes at 4 °C to pellet the cells, and cryopreserved in FACS buffer (2 mM EDTA (StemCell
Technologies, Inc.), 0.5% FBS (Thermo Fisher Scientific), PBS) supplemented with 10% DMSO (Sigma-Aldrich). Cell fixation before
cryopreservation allowed us to preserve the myeloid compartment (including microglia and neutrophils) and cell frequencies.
Cryopreserved samples were stored at −80 °C (maximum 30 days) or in liquid nitrogen (−160 °C) until analysis.
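Because the centrifugation steps above are specified as relative centrifugal force (× g), reproducing them on a given centrifuge requires converting to rotor speed. The following is a minimal sketch of that conversion using the standard relation RCF = 1.118 × 10^-5 × r × rpm^2 (r in cm). The rotor radius used here is a hypothetical placeholder, not a value reported in this study, and must be replaced with the radius of the rotor actually used.

```python
import math

# Standard relation between relative centrifugal force (x g), rotor radius
# (cm) and rotor speed (rpm): RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rcf_for_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) produced at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf_g: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf_g / (RCF_CONSTANT * radius_cm))


# Hypothetical rotor radius; NOT taken from the protocol.
HYPOTHETICAL_RADIUS_CM = 10.0

# Forces quoted in the protocol: cell pelleting, post-fixation wash, Percoll gradient.
for rcf in (400, 600, 1592):
    print(f"{rcf} x g -> {rpm_for_rcf(rcf, HYPOTHETICAL_RADIUS_CM):.0f} rpm")
```

The required speed depends on the square root of the radius, so the same × g value corresponds to different rpm settings on different rotors.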
Tissue Collection for Immunofluorescence

In some cases (Table S1), a portion of the tissue was collected for immunofluorescence. The samples were cut into pieces of at
most 1 cm × 0.5 cm × 0.5 cm and embedded in cryomolds filled with Cryo Embedding Medium (Medite). Samples were frozen in a beaker
filled with 2-methylbutane (Sigma-Aldrich) and dry ice, or directly on dry ice, and stored at −80 °C until further processing.
Orthotopic Glioma Cell Injection

8–12-week-old mice were anesthetized with 2%–5% isoflurane (Minrad) in an induction chamber. Anesthesia was maintained on the
stereotactic frame (David Kopf Instruments). A blunt-ended syringe (Hamilton; 75N, 26 s/2’’/2, 5 µL; Sigma-Aldrich) was inserted
1.5 mm lateral and 1 mm frontal from the bregma. A 5 µL syringe (Hamilton; Sigma-Aldrich) was inserted to a depth of 4 mm below
the skull and retracted 1 mm, forming a reservoir. Using a microinjection pump (UMP-3; World Precision Instruments Inc.),
5 × 10^4 GL-261 cells were injected in a volume of 2 µL at 1 µL/minute. After the needle had rested for 2 minutes, it was
retracted at a speed of 1 mm/minute. The injection hole was closed with bone wax (Aesculap; B. Braun), and the scalp wound was
sealed with tissue glue (Indermil; Henkel).
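The injection parameters above imply a cell concentration and a timeline that are not stated explicitly. The following is a minimal worked sketch of that arithmetic. It assumes that the injected volume and rate are in microliters (2 µL at 1 µL/minute) and that the needle tip sits 3 mm below the skull at the end of the injection (4 mm insertion minus the 1 mm retraction); both are readings of the text above, not additional reported values, and the function name is hypothetical.

```python
def injection_plan(cells, volume_ul, rate_ul_per_min, rest_min,
                   depth_mm, retract_mm_per_min):
    """Derive cell concentration and step durations for a stereotactic injection."""
    infusion_min = volume_ul / rate_ul_per_min        # time to deliver the volume
    retraction_min = depth_mm / retract_mm_per_min    # time to withdraw the needle
    return {
        "cells_per_ul": cells / volume_ul,            # required suspension density
        "infusion_min": infusion_min,
        "retraction_min": retraction_min,
        "total_min": infusion_min + rest_min + retraction_min,
    }


plan = injection_plan(
    cells=5e4,               # 5 x 10^4 GL-261 cells per animal
    volume_ul=2.0,           # assumed microliters
    rate_ul_per_min=1.0,     # assumed microliters per minute
    rest_min=2.0,            # needle rest after infusion
    depth_mm=3.0,            # assumed tip depth after forming the reservoir
    retract_mm_per_min=1.0,
)
for key, value in plan.items():
    print(f"{key}: {value:g}")
```

Under these assumptions, the cell suspension would need to be prepared at 2.5 × 10^4 cells/µL, and infusion, rest, and retraction together take about 7 minutes per animal.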
Each animal was assigned an ID code. A “C” in the animal ID marks control animals, which were not injected with tumor cells but
had Sall1/YFP expression. Samples LH60, LV57, L57, RV58, RH57, RH58 and RV57 developed a large tumor, and the rest had an
intermediate to small tumor. Samples RV58, LH60, L57 and RH60 were wild-type control animals (YFP-) injected with tumor cells.
In Vivo Bioluminescent Imaging

Tumor-bearing mice were injected with D-Luciferin (150 mg/kg body weight; Perkin Elmer). Animals were transferred to the dark
chamber of a Xenogen IVIS 100 (Caliper Life Sciences) imaging system and luminescence was detected. Data were subsequently
analyzed using Living Image 2.5 software (Caliper Life Sciences).
Harvesting and Processing of Mouse Brain Samples

Mice were euthanized with CO2 and perfused with PBS through the left ventricle of the heart using a 25-G butterfly needle attached
to a 50 mL syringe (Mrdjen et al., 2017). The complete brain was collected, dissected with scissors into pieces of approximately
1 to 3 mm in diameter, and digested (0.4 mg/mL collagenase IV (Sigma-Aldrich), 10 mg/mL Deoxyribonuclease I from bovine pancreas
(DNase; Sigma-Aldrich), 10% FBS (Thermo Fisher Scientific), RPMI 1640 (Seraglob)) at 37 °C for 30 minutes with continuous shaking.
The enzymatic reaction was stopped by adding EDTA (StemCell Technologies, Inc) in PBS to a final concentration of 5 mM. To
homogenize the sample, it was repeatedly aspirated and ejected using a 5 mL syringe with a 20-G needle until a uniform homogenate
was formed (Mrdjen et al., 2017). Afterward, the homogenate was filtered through a 70 µm cell strainer and centrifuged at
400 × g for 8 minutes at 4 °C to pellet the cells and myelin. This was followed by a myelin removal step by gradient
centrifugation with 30% Percoll (Sigma-Aldrich) in PBS (1,592 × g for 30 minutes at 4 °C; without brakes during deceleration)
using a 50 mL tube with a lid in a fixed-angle rotor. After separation of the myelin (the top white layer), the middle
transparent layer, without the bottom layer of red blood cells, was collected and filtered through a 70 µm cell strainer. The
single-cell suspension was washed in PBS and centrifuged at 400 × g for 8 minutes at 4 °C to pellet the cells. Cells were then
ready for flow cytometry analysis.
Mass-Tag Cellular Barcoding

To minimize inter-sample staining variation, we applied two mass-tag barcoding strategies: one for fixed cells (myeloid-focused
panel) (Zunder et al., 2015) and one for viably frozen cells (lymphoid-focused panel) (Mei et al., 2016).
Intracellular (fixed-cell) barcoding used a ten-sample scheme composed of unique combinations of three out of five palladium
metals (104Pd, 105Pd, 106Pd, 108Pd, 110Pd) (Trace Sciences International). Palladium isotopes were conjugated to
bromoacetamidobenzyl-EDTA (Dojindo Laboratories). The concentrations were adjusted to 100 nM for all metals. After thawing
samples in a cold water bath, cells were washed once with Cell Staining Medium (CSM: PBS, 0.5% Bovine Serum Albumin (BSA)
(Sigma-Aldrich), and 0.02% NaN3), once with PBS, and once with 0.03% saponin (Sigma-Aldrich) in PBS, and were centrifuged at
600 × g for 5 minutes at 4 °C to pellet the cells. After the last washing step, 100 µL residual volume was left in each tube.
Next, barcoding reagent diluted 1:1000 in PBS with 0.03% saponin was thoroughly and quickly mixed with the sample and incubated
at RT for 15 minutes. Cells were then washed three times in CSM, and ten samples were pooled for antibody surface staining.
Live-cell (viably frozen) barcoding used a twenty-sample scheme with unique combinations of three out of five palladium metals
(104Pd, 105Pd, 106Pd, 108Pd, 110Pd) (Trace Sciences International) and one yttrium isotope (preconjugated anti-human CD45 89Y)
(Fluidigm). Palladium metals were conjugated to purified anti-human CD45 antibody (Biolegend) in house using the Maxpar X8
chelating polymer kit (Fluidigm) according to the manufacturer’s instructions. Viably frozen cells were thawed in a water bath
at 37 °C for 2 minutes. Cells were immediately resuspended in CO2-Independent Medium (Thermo Fisher Scientific) supplemented
with 5% FBS (Thermo Fisher Scientific) + Benzonase Nuclease (100 000 x) (Sigma-Aldrich) and centrifuged at 400 × g for 7 minutes
at RT to pellet the cells. Cells were resuspended in RPMI-1640 supplemented with 5% FBS, and dead cells were removed by negative
selection using MACS columns MS (Miltenyi Biotec) and the Dead Cell Removal Kit (Miltenyi Biotec) according to the manufacturer’s
instructions. Next, cells were incubated on ice in RPMI-1640 supplemented with 5% FCS and a unique combination of metal-tagged
CD45 antibodies. Samples were then washed twice in PBS, and twenty samples were pooled for antibody surface staining.
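The sample capacity of the two barcoding schemes follows from simple combinatorics, which the sketch below makes explicit. Choosing three out of five palladium channels yields 10 unique codes, matching the ten-sample scheme. For the twenty-sample scheme, the sketch assumes that 89Y acts as a sixth channel in a three-out-of-six design, which yields 20 codes; this reading is consistent with the stated sample count but is an assumption, not a detail given in the text. Channel labels and the function name are hypothetical.

```python
from itertools import combinations

# Hypothetical channel labels for the five palladium isotopes named in the text.
PD_CHANNELS = ["Pd104", "Pd105", "Pd106", "Pd108", "Pd110"]


def barcode_scheme(channels, k=3):
    """Map sample labels to unique k-channel barcodes drawn from `channels`."""
    return {
        f"sample_{i + 1:02d}": frozenset(combo)
        for i, combo in enumerate(combinations(channels, k))
    }


# Fixed-cell scheme: 3 out of 5 palladium channels.
fixed_scheme = barcode_scheme(PD_CHANNELS)

# Live-cell scheme, ASSUMING 89Y is a sixth channel in a 3-out-of-6 design.
live_scheme = barcode_scheme(PD_CHANNELS + ["Y89"])

print(len(fixed_scheme), "fixed-cell barcodes")
print(len(live_scheme), "live-cell barcodes")
```

A useful property of a fixed "k out of n" design is that every valid cell carries exactly k positive barcode channels, so events with more positive channels (for example, doublets from two different samples) can be recognized and discarded.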
Metal-Isotope-Tagged Antibodies
All anti-human antibodies, with the corresponding clone and metal isotope tag for mass cytometry analysis, are listed in Tables S2
and S3 and in the Key Resources Table. Antibodies were either purchased preconjugated to metal isotopes from Fluidigm or purchased
in purified form from commercial suppliers and conjugated in house using the Maxpar X8 chelating polymer kit (Fluidigm) according
to the manufacturer’s instructions.
Cell Surface Staining for Cytometry

To avoid nonspecific binding of antibodies, the sample was incubated at 4 °C for 15 minutes in Human TruStain FcX (Fc Receptor
Blocking Solution; Biolegend). Without washing, the cells were spun down, resuspended in the antibody mixture (Key Resources
Table) in PBS, and incubated at 4 °C for 30 minutes. To optimize antibody staining for chemokine receptors (Table S3), the
sample was incubated first at 37 °C for 10 minutes and then at 4 °C for 20 minutes. After staining with the lymphoid-focused
panel (Table S3), we added Cell-ID Cisplatin (Fluidigm) to the sample for 3 minutes to discriminate viable from dead cells.
Then, the sample was washed once in PBS and centrifuged to pellet the cells.
Intracellular Cytokine Staining for Cytometry

Cells were permeabilized using the Foxp3/Transcription Factor Staining Buffer Set (eBioscience) according to the manufacturer’s
instructions for 45 minutes at 4 °C. Subsequently, the sample was washed once in Perm/Wash buffer (eBioscience) and incubated
in the antibody mixture (Key Resources Table) in Perm/Wash buffer (including transcription factors, intracellular antigens,
anti-streptavidin, anti-PE) for 30 minutes at 4 °C. The sample was washed once in Perm/Wash buffer (eBioscience) and centrifuged
to pellet the cells. The cells were then ready for flow cytometry acquisition or were processed further for CyTOF acquisition.
Cell Preparation and Mass Cytometry Acquisition

After cell surface and intracellular antibody staining, the cells were incubated in 4% PFA (Electron Microscopy Sciences)
overnight. Prior to acquisition, the cells were pelleted without washing and resuspended in up to 1 mL of 1:3000 diluted
Cell-ID™ Intercalator-Ir (Fluidigm) + Maxpar Fix and Perm Buffer (Fluidigm) for 1.5–3 hours. Afterward, the sample was washed
twice in PBS and twice in ddH2O, diluted to 1.5–106 cells/ml in ddH2O containing 10% EQ Four Element Calibration Beads
(Fluidigm), and filtered through a 40 µm filter-cap FACS tube. Samples were analyzed with a Helios CyTOF2 (Fluidigm). Quality
control and tuning processes on the Helios CyTOF2 (Fluidigm) were performed following the guidelines for daily instrument
operation. Data were collected as FCS files.
Flow Cytometry Acquisition

Samples stained with antibodies coupled to fluorophores (Key Resources Table) were analyzed with a FACSymphony (BD Biosci-
ences). Data were collected as FCS files. Before acquisition, PMT voltages were adjusted using single-stained samples and
controlled using the fully stained sample. FCS files were named according to the patient ID or animal ID.
Immunohistochemistry
Immunohistochemistry was performed on 4-µm-thick tissue sections. Double immunostaining was performed for Iba1 and CD163. Iba1
staining (Wako Chemicals USA, 019-19741; pretreatment with Tris-EDTA-borate antigen retrieval for 32 minutes, followed by
32 minutes of incubation at a dilution of 1:1000; OptiView DAB detection kit (Ventana)) was performed on an automated Ventana
Benchmark Ultra, and CD163 staining (NeoMarkers / Lab Vision Corporation, MS-1103-S; prediluted; incubation for 32 minutes; Bond
Polymer Refine Detection kit) was performed on a Leica Bond III. In total, we analyzed 27 glioma, 7 BrM, and 2 epilepsy samples
from the CyTOF cohort.
Immunofluorescence
Frozen tissues were cryosectioned (10 µm thick) using a Hyrax C60 Cryostat (Zeiss) and stored at −20 °C. Sections were fixed with
4% PFA (PanReac), washed in PBS, and incubated with a blocking solution consisting of PBS supplemented with 1% BSA (Sigma-
Aldrich) and 0.3% Triton X-100 (Sigma-Aldrich). Subsequently, sections were incubated with the following primary antibodies (diluted
in blocking solution) at 4 °C overnight: rabbit anti-Iba1 antibody (Wako; polyclonal; 1:500); goat anti-VE-cadherin antibody (Santa
Cruz; polyclonal; 1:00). Sections were then washed with blocking solution and incubated with AF647-labeled donkey anti-rabbit,
AF488-labeled donkey anti-goat secondary antibodies (Life Technologies; 1:500) and one of the following directly-labeled antibodies
(all diluted in blocking solution) at room temperature for 2h: P2RY12-PE (Biolegend; clone S16001E; 1:100), CD163-PE (Biolegend;
clone GHI/61; 1:200), CD169-PE (Biolegend; clone 7-239; 1:100), CD206-PE-Dazzle594 (Biolegend, clone 15-2; 1:100) or CD209-PE
(Biolegend; clone DCS-8C1; 1:100). Sections were washed with blocking solution and incubated in Sudan black B (Sigma-Aldrich)
dissolved in 70% ethanol to reduce autofluorescence of the tissues at RT for 30 minutes. Finally, sections were washed with HBSS
(ThermoFisher Scientific) and mounted with Immunoselect Antifading Mounting Medium with DAPI (Dianova). Fluorescence
photomicrographs were captured with an Olympus IX81 microscope using the 40× objective, and images were processed with Fiji
software (Schindelin et al., 2012). A Gaussian filter (σ = 1) was applied to the P2RY12 image before merging with the other channels.
Preprocessing of Cytometry Data

Raw mass cytometry data were normalized using the MATLAB version of the Normalizer tool (Finck et al., 2013). Cells were
assigned by manually gating on Event length and DNA (191Ir and 193Ir) channels, followed by dead cell discrimination by analyzing
195Pt expression using FlowJo Software (Tree Star). Doublets were excluded using Gaussian discrimination channels (Center, Offset,
Width, Residual). Next, data were concatenated and de-barcoded using Boolean gating in FlowJo software (Tree Star). The normal-
ized data containing living cells from every individual patient were manually exported from FlowJo Software (Tree Star) and imported
Cell 181, 1626–1642.e1–e11, June 25, 2020 e9

into RStudio using the R packages ‘‘flowCore’’ and ‘‘flowWorkspaceData’’ (R Foundation for Statistical Computing) (Ellis et al.,
2019; Finak, 2018). Before automated high-dimensional data analysis, the mass cytometry data were transformed with a cofactor in
the range of 5 to 60 using an inverse hyperbolic sine (arcsinh) function (Bendall et al., 2011).
For flow cytometry data, the compensation matrix was corrected using FlowJo software (Tree Star). Afterwards, live, single, CD45-positive,
compensated cells were exported and imported into RStudio. Before automated high-dimensional data analysis,
flow cytometry data were transformed using an inverse hyperbolic sine (arcsinh) function with a cofactor in the range of
300 to 600.
Additionally, all cytometry data were normalized between 0 and 1 to the 99.999th percentile of the merged sample in each
batch. To control for the batch effect, we used the same clinical sample in two acquisition rounds. For the mass cytometry data,
the marker expression distributions were verified between two batches of the acquisition applying R package ‘‘flowStats’’ (Hahne
et al., 2020).
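The transformation and scaling described above can be illustrated with a minimal sketch. It is not the authors' R code: the function names, the toy intensities, the choice of the sample minimum as the lower bound, and the reading of the upper bound as the 99.999th percentile are all assumptions made for illustration.

```python
import math

def arcsinh_transform(values, cofactor):
    """Apply the inverse hyperbolic sine transform asinh(x / cofactor)."""
    return [math.asinh(v / cofactor) for v in values]

def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in 0..100) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("empty input")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def scale_to_unit(values, upper_q=99.999):
    """Map the minimum to 0 and the upper percentile to 1, clipping outliers.

    Assumption: the lower bound is the sample minimum; the text only states
    that data were normalized between 0 and 1 to an upper percentile.
    """
    ordered = sorted(values)
    low = ordered[0]
    span = percentile(ordered, upper_q) - low
    if span == 0:
        return [0.0 for _ in values]
    return [min(max((v - low) / span, 0.0), 1.0) for v in values]

# Hypothetical raw intensities for one marker in one batch.
raw_counts = [0.0, 2.0, 10.0, 50.0, 400.0]
transformed = arcsinh_transform(raw_counts, cofactor=5)  # mass cytometry-like cofactor
scaled = scale_to_unit(transformed)
```

A larger cofactor (for example 300 to 600 for flow cytometry) widens the near-linear region around zero, which is why the two data types use different ranges.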
Automated Population Identification

To identify myeloid and lymphoid immune cell populations accurately, we first carried out a step of FlowSOM clustering to generate a
starting point of 100 nodes, on pre-processed and combined mass or flow cytometry datasets (Van Gassen et al., 2015; Hartmann
et al., 2016). This was followed by expert-guided manual metaclustering, using the parameters listed in each figure legend. The
respective k-value was chosen manually (in the range of 20 to 30); identified clusters were annotated and merged based
on similarity of antigen expression in order to uphold the biological relevance of the dataset. Manually annotated clusters were used
to calculate the relative frequencies of immune populations. Heatmaps display median expression levels of all markers per merged
population and were plotted using the R package ‘‘pheatmap’’ (Figures 1D, 2B, 3E, and S5B) (Kolde, 2019). From both mass cytometry
datasets, we pre-selected major populations and performed additional FlowSOM analysis to identify smaller cell subsets. For Figures
3B, 7D, and S7A, we calculated the median antigen expression among selected cell types of the second mass cytometry batch using
the R package ‘‘dplyr’’ (Wickham et al., 2019a).
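As an illustration of the two summary statistics used here (median marker expression per annotated population and relative population frequencies), the following sketch works on a toy table of cells. The cluster labels, marker values, and function names are hypothetical; the published analysis used R packages such as ‘‘dplyr’’ and ‘‘pheatmap’’, not this code.

```python
from collections import defaultdict
from statistics import median

def median_expression_by_cluster(cells, markers):
    """Return {cluster: {marker: median expression}} for annotated cells."""
    grouped = defaultdict(list)
    for cell in cells:
        grouped[cell["cluster"]].append(cell)
    return {
        cluster: {m: median(c[m] for c in members) for m in markers}
        for cluster, members in grouped.items()
    }

def relative_frequencies(cells):
    """Return the fraction of all cells assigned to each cluster."""
    counts = defaultdict(int)
    for cell in cells:
        counts[cell["cluster"]] += 1
    total = len(cells)
    return {cluster: n / total for cluster, n in counts.items()}

# Hypothetical, already normalized (0 to 1) expression values.
cells = [
    {"cluster": "microglia", "P2RY12": 0.9, "CD163": 0.1},
    {"cluster": "microglia", "P2RY12": 0.7, "CD163": 0.2},
    {"cluster": "MDM", "P2RY12": 0.1, "CD163": 0.8},
    {"cluster": "MDM", "P2RY12": 0.2, "CD163": 0.6},
]
medians = median_expression_by_cluster(cells, ["P2RY12", "CD163"])
freqs = relative_frequencies(cells)
```

The nested dictionary in `medians` corresponds to one heatmap row per population and one column per marker.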
For data visualization, we applied various dimensionality reduction techniques. For a complex overview of the immune compartment,
we used Uniform Manifold Approximation and Projection (UMAP) (McInnes et al., 2018). To create a UMAP of isolated leukocytes
(Figures 1C and S2A), we pooled equally proportioned 120,000 CD45+ cells from the glioma and BrM datasets from the second
CyTOF batch. To create a UMAP of lymphocytes (Figures S5A, S5C), we pooled equally proportioned 80,000 CD45highHLA-DR-lin+
cells of glioma and BrM datasets from the second CyTOF batch. To separate populations with a similar phenotypic spectrum, we
generated a force-directed layout integrating the SCAFFoLD algorithm (Figure 2) (Spitzer et al., 2015) and VorteX graphical environ-
ment (Figures 3C and 3D) (Samusik et al., 2016). To define landmark populations for the SCAFFoLD layout, we analyzed a combined
dataset of TAMs/monocytes of the second CyTOF batch and defined three myeloid populations (CNS-resident microglia, CNS-
invading MDMs and monocytes) using FlowSOM as described above. Each SCAFFoLD map was generated using 100 FlowSOM
initial nodes. The subsequent results were saved as GRAPHML files, which were later displayed with the open graph viz platform
Gephi (Bastian et al., 2009). SCAFFoLD maps allowed us to align and compare mass and flow cytometry, human and mouse flow
cytometry data. The force-directed graph (x-shift Vortex, Figure 3C, 3D) includes pooled 80,000 MDMs/monocytes from glioma
and BrM. The cell position matrix generated using VorteX was exported as CSV file and then imported into the R environment to over-
lay calculated FlowSOM results, using the R package ‘‘ggplot2’’ (Wickham et al., 2019b). Next, we carried out a pseudo-time align-
ment analysis using markers listed in Figure 3E and cell positions calculated using Vortex, to reveal the trifurcation of cellular subsets
from a monocytic phenotype toward a macrophage one (Street et al., 2018). Categorical ONE-Sense analysis generated one-dimensional
t-SNE maps (van der Maaten and Hinton, 2008) of equally pooled 64,000 CD8 T cells (Figure 6B) or 36,000 NK cells (Figure 7C)
from glioma and BrM of the second CyTOF batch, where each axis was calculated using lineage or activation markers. The
one-dimensional t-SNEs were aligned with two heatmaps displaying the lineage or activation cell profile using the R package ‘‘gplots’’
(Warnes et al., 2019).
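The pooling of ‘‘equally proportioned’’ cells before dimensionality reduction can be sketched as follows. This assumes that equal proportioning means drawing the same number of events at random from each disease group; the text does not spell out the exact sampling scheme, and the group names, event counts, and function name are hypothetical.

```python
import random

def pool_equal(groups, total, seed=0):
    """Draw total // len(groups) events at random from each group and pool them.

    Assumption: each group contributes the same number of events.
    """
    per_group = total // len(groups)
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    pooled = []
    for name, events in groups.items():
        if len(events) < per_group:
            raise ValueError(f"group {name!r} has fewer than {per_group} events")
        pooled.extend((name, event) for event in rng.sample(events, per_group))
    return pooled

# Hypothetical event identifiers standing in for single-cell measurements.
groups = {"glioma": list(range(1000)), "BrM": list(range(500))}
pooled = pool_equal(groups, total=120)
```

Sampling the same number of events per group keeps the larger dataset from dominating the resulting embedding.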
Survival Analysis
For overall survival (OS) data, we used The Cancer Genome Atlas (TCGA) data from 164 (harmonized data, RNA-seq) and 558 (legacy data,
Affymetrix) glioblastoma samples (TCGA-GBM). To analyze OS in low-grade gliomas (TCGA-LGG), we used 516 samples
(harmonized data, RNA-seq). To obtain the signature of genes associated with outcome, data were extracted using TCGAbiolinks
in RStudio together with the gene list obtained from CyTOF. For harmonized data (TCGA-GBM/LGG), genes were plotted as a heatmap and
selected according to high or low expression (gene groups). For legacy data, median levels were used to segregate cancer patients
according to OS outcome.
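A median split of the kind described for the legacy data can be sketched as below. The patient identifiers and expression values are invented, and assigning values equal to the median to the ‘‘low’’ group is an assumption, since the text does not state how ties were handled.

```python
from statistics import median

def median_split(expression_by_patient):
    """Label each patient 'high' or 'low' relative to the cohort median.

    Assumption: values equal to the median are labeled 'low'.
    """
    cutoff = median(expression_by_patient.values())
    return {
        patient: ("high" if value > cutoff else "low")
        for patient, value in expression_by_patient.items()
    }

# Hypothetical expression levels of one gene across four patients.
expression = {"P1": 1.0, "P2": 5.0, "P3": 3.0, "P4": 8.0}
groups_by_expression = median_split(expression)
```

The two resulting groups are what would then be compared in a survival analysis.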
Statistical Analysis
P values were calculated to compare the relative frequencies of leukocytes or the median antigen expression of immune cell types
between glioma IDH1mut versus IDH1wt versus melanoma BrM versus carcinoma BrM using nonparametric Mann-Whitney-Wilcoxon
tests and were controlled for multiple testing using the Benjamini-Hochberg procedure (Field et al., 2013; Noble, 2009). The relative
frequencies of individual patients’ leukocyte subsets from the first and second mass cytometry batches were combined to perform
the statistical analysis. P values below 0.05 were considered statistically significant and are displayed on the corresponding
graph. In Figures 1E, S2C, S3B, and S5E, error bars define an interval of max/min value ± SD, and the horizontal line indicates the mean value.
In Figures 2G, 6A, 7B, 7D, S5D, S6B, and S7A–S7C, boxplots represent the interquartile range (IQR) 50% and whiskers 25%. The listed
figures showing the relative frequencies of leukocytes or the median antigen expression of immune cell types were generated using the R
package ‘‘ggplot2’’ (Wickham et al., 2019b). The Pearson correlation matrix between the relative frequencies of immune populations
was calculated in the R environment (‘‘Hmisc’’ R package) and includes p values (p) and correlation coefficients (R) (Harrell, 2020).
Correlations were considered statistically significant if the p value was below 0.05 and the R value was below −0.6 or above 0.6, and were
visualized using the R package ‘‘circlize’’ (Gu et al., 2014).
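The multiple-testing control mentioned in this section can be illustrated with a short sketch of the Benjamini-Hochberg step-up adjustment. The raw p values are invented and the function name is hypothetical; the published analysis was carried out in R, so this is only a plain-Python illustration of the same adjustment.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted values
    # never increase as the raw p values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p values from four pairwise group comparisons.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

In this toy example only the first comparison remains below 0.05 after adjustment, although three raw p values were below that threshold.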
Figure S1. Mass Cytometry Analysis Reveals Unique Changes in the Leukocyte Composition of Brain Tumors, Related to Figure 1
Heatmap of the mean expression of all markers in the myeloid panel, calculated across CD45+ leukocytes (value range 0-1 post-transformation/normalization).
Figure S2. Mass Cytometry Analysis Reveals Unique Changes in the Leukocyte Composition of Brain Tumors, Related to Figure 1
(A) Individual UMAP plots are overlaid with all markers from the myeloid panel. (B) Composition of major immune populations found in the brain samples. Selected
bars show the reference sample included in both CyTOF acquisition rounds, which allowed normalization and batch correction. Samples were ordered according
to patients’ clinical diagnosis. (C) Frequencies of the main immune populations among CD45+ cells in the glioblastoma cohort stratified according to methylation
status of the MGMT promoter. Error bars define an interval of max/min value ± SD, horizontal line indicates the mean value. P-values were calculated using a non-parametric
Mann-Whitney-Wilcoxon test. P-values greater than 0.05 were considered statistically non-significant and were not displayed.
Figure S3. The Brain TME Harbors a Heterogeneous Mononuclear Phagocyte Population, Related to Figure 2
(A) Manual gating strategy to validate the identification of the major immune populations by unsupervised machine-learning algorithms. (B) Relative frequencies of
three TAM populations among CD64+ cells in the glioblastoma cohort stratified according to methylation status of the MGMT promoter. Error bars define an
interval of max/min value ± SD, horizontal line indicates the mean value. P-values were calculated using a non-parametric Mann-Whitney-Wilcoxon test. P-values
greater than 0.05 were considered statistically non-significant and were not displayed.
Figure S4. TAM Instruction Is Driven by the Type of Tumor Rather Than the Local Tissue Microenvironment, Related to Figures 4 and 5
(A) Representative immunohistochemistry images of CD163 (brown) and Iba1 (red) co-staining for glioma and BrM samples. Selected areas showing Iba1+
CD163- microglia with amoeboid (1) and ramified (2) morphology. Arrows depict the blood vessel. (B) Representative immunofluorescence image of CD209+
CNS phagocytes in recurrent anaplastic oligodendroglioma (ZH927). (C) Representative immunofluorescence image of CD169+ CNS phagocytes in the

melanoma BrM (ZH879). (D) Representative immunofluorescence images of CD163+, CD206+ and CD169+ CNS phagocytes in NSCLC BrM (ZH968). (E) The
Pearson correlation between TAM frequencies and OS in the IDH1wt glioma group. Relative frequencies of microglia were calculated among CD64+ cells. Relative
frequencies of monocytes and MDM subsets were calculated among parental MDM/monocyte population. Patients that have survived till day of analysis (hence
past 400 days) have been highlighted using an asterisk (*) alongside their Patient ID.
Figure S5. Preferential Treg Accumulation in the TME of Brain Metastases, Related to Figure 6
(A) A representative UMAP map displaying 120,000 singlets, live, TILs (CD3, CD19, CD56, CD90 expressing cells) equally proportioned from glioma and
metastasis samples. Individual UMAP plots are overlaid with all markers in the lymphoid panel. (B) Heatmap displaying the median antigen intensity of markers

used to generate part (C). (C) UMAP map overlaid with FlowSOM-guided manual meta-clusters. (D) Relative frequencies of the main TILs populations identified
among lymphocytes in the brain tumors. (E) Relative frequencies of the main TILs populations identified in patients with glioblastoma stratified according to
methylation status of the MGMT promoter. (D, E) Only statistically significant p-values were displayed (p < 0.05, Mann-Whitney-Wilcoxon test, Benjamini-
Hochberg correction).
Figure S6. Characterization of T Cell Memory Formation and Correlation with Glioblastoma Patient Survival, Related to Figure 6
(A) Gating strategy of naive/memory T cells. (B) Relative frequencies of main T cell subsets among CD3+ cells. (C) The Pearson correlation between T cell subset
frequencies and OS in the IDH1wt glioma group. Relative frequencies of naive/memory populations were calculated among T cells. Patients that have survived till
day of analysis (hence past 400 days) have been highlighted using an asterisk (*) alongside their Patient ID.
Figure S7. T Cell Exhaustion Characterizes the TME of Brain Metastases, whereas Immature NK Cells Accumulate in Glioblastoma, Related to
Figures 6 and 7
(A) Median expression of antigens derived from the most differentially expressed genes among CD8 RM and CD8 EM. Boxplots quantify the mean antigen
intensity of data shown in Figure 6C. Point shapes differentiate patients who received therapy prior to surgery (including immune-, radio- and chemotherapy). Only
statistically significant p-values were displayed (p < 0.05, Mann-Whitney-Wilcoxon test, Benjamini-Hochberg correction). (B) Relative frequencies of three CD56
expressing populations identified in the brain samples among lymphocytes. Point shapes differentiate patients who received therapy prior to surgery (including
immune-, radio- and chemotherapy). (C) Relative frequencies of ILC populations among CD56+ cells identified in the glioblastoma cohort stratified according to
methylation status of the MGMT promoter. (B, C) Only statistically significant p-values were displayed (p < 0.05, Mann-Whitney-Wilcoxon test, Benjamini-
Hochberg correction). (D) The Pearson correlation between ILC cell subset frequencies and OS in the IDH1wt glioma group. Relative frequencies of ILC
populations were calculated among CD56+ cells. Patients that have survived till day of analysis (hence past 400 days) have been highlighted using an asterisk
(*) alongside their Patient ID.
Resource
Interrogation of the Microenvironmental Landscape
in Brain Tumors Reveals Disease-Specific
Alterations of Immune Cells
Florian Klemm, Roeltje R. Maas,
Robert L. Bowman, ..., Roy T. Daniel,
Monika E. Hegi, Johanna A. Joyce
Correspondence
johanna.joyce@unil.ch
In Brief
High-dimensional, multi-omics
characterization of the brain tumor
microenvironment, including
comparisons of gliomas and brain
metastases, suggests that education of
immune cell types in the TME depends on
tumor origin and IDH mutational status.
Highlights
• Flow cytometry, RNA-seq, and protein and image analyses reveal brain TME complexity
• Glioma IDH mutation status and brain metastasis primary tumors shape the brain TME
• Microglia and monocyte-derived macrophages exhibit multifaceted activation
• TME immune cells show disease- and cell-type-specific expression patterns
Klemm et al., 2020, Cell 181, 1643–1660

Florian Klemm,1,2 Roeltje R. Maas,1,2,3,4 Robert L. Bowman,5 Mara Kornete,1,2 Klara Soukup,1,2 Sina Nassiri,1,2,6
Jean-Philippe Brouland,7 Christine A. Iacobuzio-Donahue,8 Cameron Brennan,9 Viviane Tabar,9 Philip H. Gutin,9
Roy T. Daniel,4 Monika E. Hegi,3,4 and Johanna A. Joyce1,2,10,*
1Department of Oncology, University of Lausanne, Lausanne, Switzerland
2Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
3Neuroscience Research Center, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
4Department of Neurosurgery, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
5Memorial Sloan Kettering Cancer Center, New York, NY, USA
6Bioinformatics Core Facility, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
7Department of Pathology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
8Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
9Department of Neurosurgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA
10Lead Contact
*Correspondence: johanna.joyce@unil.ch
SUMMARY
Brain malignancies encompass a range of primary and metastatic cancers, including low-grade and high-
grade gliomas and brain metastases (BrMs) originating from diverse extracranial tumors. Our understanding
of the brain tumor microenvironment (TME) remains limited, and it is unknown whether it is sculpted
differentially by primary versus metastatic disease. We therefore comprehensively analyzed the brain TME
landscape via flow cytometry, RNA sequencing, protein arrays, culture assays, and spatial tissue character-
ization. This revealed disease-specific enrichment of immune cells with pronounced differences in propor-
tional abundance of tissue-resident microglia, infiltrating monocyte-derived macrophages, neutrophils,
and T cells. These integrated analyses also uncovered multifaceted immune cell activation within brain ma-
lignancies entailing converging transcriptional trajectories while maintaining disease- and cell-type-
specific programs. Given the interest in developing TME-targeted therapies for brain malignancies, this
comprehensive resource of the immune landscape offers insights into possible strategies to overcome
tumor-supporting TME properties and instead harness the TME to fight cancer.
INTRODUCTION

Brain malignancies include tumors that arise within the brain, such as low-grade gliomas and glioblastomas, and brain metastases (BrMs), which originate from extracranial primary tumors, including melanoma, breast, and lung cancers (Cagney et al., 2017). Gliomas mutant for the metabolic enzymes isocitrate dehydrogenase 1 and 2 (IDH mut) are generally low grade (II or III) and have a significantly better prognosis than IDH wild-type (WT) tumors, which are typically grade IV glioblastomas. Despite standard of care treatment comprising surgery followed by radiation and temozolomide (Stupp et al., 2005), median survival rates for glioblastoma patients remain stubbornly low (Aldape et al., 2019). Patient survival following BrM diagnosis can be even lower, with rates typically measured in months (Cagney et al., 2017; Ceccarelli et al., 2016), and among all adult brain tumors, the incidence of BrMs significantly exceeds that of gliomas.

Given the current limited treatment options for these patients, a key question to address is whether a deep comprehensive understanding of how primary and metastatic cancers develop within the brain tumor microenvironment (TME) could reveal promising new targets for therapeutic intervention. Although diverse TME cell types can critically regulate cancer progression and response to therapy across a broad range of extracranial tumors (Klemm and Joyce, 2015), we cannot simply extrapolate findings from these cancers to the singular brain TME, given its unique cell types, including astrocytes, neurons, and microglia (MG); the immune-suppressive environment of this organ; and the challenges presented for cells and drugs to cross the blood-brain barrier (BBB) (Quail and Joyce, 2017).
1644 Cell 181, 1643–1660, June 25, 2020

Immune checkpoint blockade (ICB), adoptive cell therapy, and vaccines represent treatments targeted against immune cells within the TME and systemically. The success of immunotherapies in certain extracranial cancers has led to clear motivation for their evaluation in brain malignancies. However, although they show some clinical efficacy in a subset of BrM patients (Hendriks et al., 2019; Long et al., 2018; Tawbi et al., 2018), ICB has only resulted in responses in isolated cases of primary gliomas to date (Lim et al., 2018; Schalper et al., 2019). Beyond tumor cell-intrinsic effects, this may be attributed in part to immune-suppressive components of the brain TME, including tumor-associated macrophages (TAMs), which have emerged as prominent players in brain cancers (Gutmann and Kettenmann, 2019; Quail and Joyce, 2017).

Lineage-tracing experiments in mice revealed that brain TAMs can originate from tissue-resident MG or monocyte-derived macrophages (MDMs) recruited from the peripheral circulation (Bowman et al., 2016; Chen et al., 2017). TAMs are highly plastic cells that integrate input from cytokines, growth factors, and other stimuli, resulting in diverse activation states and cellular phenotypes, including promotion of invasion, angiogenesis, metastasis, and immune suppression (Mantovani et al., 2017; Noy and Pollard, 2014). This plasticity and their position at the nexus between malignant cells and tumor-infiltrating T cells makes TAMs a promising target of TME-directed therapies in different cancers. Indeed, studies in mice showed that phenotypic alteration of TAMs results in anti-tumor efficacy in glioblastoma (Pyonteck et al., 2013; Quail et al., 2016; Yan et al., 2017), whereas TAM depletion prevents BrM outgrowth (Qiao et al., 2019).

Despite these preclinical studies, the precise contribution of the two ontogenetically distinct TAM cell types in human brain malignancies is unclear, which hinders clinical translation. For example, previous studies interrogating the role of TAMs in patient brain tumors did not distinguish between MG and MDMs based on use of lineage tracing-derived markers (Gabrusiewicz et al., 2016; Sankowski et al., 2019; Szulzewsky et al., 2016) or focused solely on gliomas (Müller et al., 2017; Venteicher et al., 2017). We therefore interrogated the TME landscape in gliomas and BrMs, with an emphasis on exploring TAMs, while also investigating their relation to other immune cells and structures in the TME. We leveraged this multimodal resource to address a number of questions. Do tumors arising within the brain shape their TME differently than cancers that metastasize from extracranial sites? Does IDH mutation status affect the TME? How do distinct TME compositions potentially modulate the activation states of immune cells? By integrating the answers to these questions, we provide insights into potential strategies to harness the brain TME in the fight against these deadly diseases.

RESULTS

Tumor Origin and IDH Mutational Status Influence the Immune Composition of Brain Malignancies
We first determined the broad immune cell abundance in the brain TME by analyzing the pan-leukocyte marker CD45 through immunofluorescence (IF) staining of whole-tissue sections and flow cytometry (FCM) analyses of non-tumor brain tissue, IDH mut low-grade and IDH WT high-grade gliomas, and BrMs originating from different primaries, including breast cancer, lung cancer, and melanoma (Figures 1A, 1B, and S1A). This showed a leukocyte abundance from 20%–40% across the cancer samples. Stratification of CD45+ cells into myeloid and lymphoid lineages revealed a significant increase in myeloid cells in IDH mut and IDH WT gliomas and of lymphocytes in IDH WT tumors and BrMs compared with non-tumor tissue (Figure 1B; p < 0.05, one-sided Student’s t test). We used multicolor fluorescence-activated cell sorting (FACS) to analyze 14 major immune cell populations across 100 clinical samples (Figure S1A; Tables S1 and S2) and collected cells for RNA sequencing (RNA-seq) from 48 patients (Table S3; full clinical annotation).

By incorporating cell lineage tracing and mouse models of high-grade gliomas and BrM, we previously identified the cell surface marker integrin alpha 4, ITGA4/CD49D, as a means to discriminate tumor-associated MG (T-MG) from tumor-associated MDMs (T-MDMs) (Bowman et al., 2016), which we integrated here into clinical sample analyses. This enabled sorting of CD45− non-immune cells, CD49Dlow MG, CD49Dhigh MDMs, neutrophils, and CD4+ and CD8+ T cells (Figure S1A; Tables S2 and S3A) for transcriptome analysis by RNA-seq. We assessed sorting fidelity by FCM re-analysis of the sorted CD49Dlow and CD49Dhigh TAM populations (purity, 98.4%–99.8%) and by investigating the frequency of the canonical IDH codon 132 missense mutation in the RNA-seq reads from CD45− cells and CD49Dlow and CD49Dhigh TAM populations. Although we
Figure 1. The Immune Cell Composition of Brain Malignancies

(A) Quantification of immunofluorescence (IF) staining of non-immune (CD45−) and immune cells (CD45+) in sections of non-tumor brain tissue (n = 6), gliomas
(IDH mut, n = 16; IDH WT, n = 16), and brain metastases (BrMs; breast, n = 12; lung, n = 5; melanoma, n = 7). Data are represented as mean ± SEM.
(B) Flow cytometry (FCM) quantification of non-immune cells (CD45−), myeloid cells (CD45+, CD11B+), and lymphocytes (CD45+, CD11B−) in non-tumor tissue
(n = 6), gliomas (IDH mut, n = 17; IDH WT, n = 40), and BrMs (breast, n = 13; lung, n = 16; melanoma, n = 8). Data are represented as mean ± SEM.
(C) Gene set variation analysis (GSVA) normalized enrichment score (NES) of MG and MDM ontogeny-specific core gene signatures in CD49Dlow MG and
CD49Dhigh MDMs from non-tumor and tumor tissues.
(D) Heatmap of immune cell proportions in relation to all CD45+ cells (MG, microglia; MDM, monocyte-derived macrophage; CD14low/CD16+, CD14low/CD16+
monocyte; CD14+/CD16+, CD14+/CD16+ monocyte; CD16− Gran., CD16− granulocyte; iMC, immature myeloid cell; DC, dendritic cell; Treg, regulatory T cell;
DNT, double-negative T cell) across the cohort (non-tumor, n = 6; glioma, n = 57; BrM, n = 37). Cluster assignment, disease type, IDH mutation status, and BrM primary
tumor are annotated per column (for clinical information, see Table S1).
(E) Principal component (PC) biplot of FCM data with sample scores and top 5 loadings of the first two PCs (n = 100 clinical samples, proportion of variance shown
on PC axes).
(F) Mean of immune cell populations in non-tumor tissue (n = 6), gliomas (IDH mut, n = 17; IDH WT, n = 40), and BrMs (breast, n = 13; lung, n = 16; melanoma, n = 8) as percentage
of CD45+ cells.
See also Figure S1 and Tables S1 and S2.
observed a mean mutated allele frequency of 0.43 in CD45− cells from IDH mut gliomas (range, 0.3–0.61), this was very rare in TAMs (mean, 0.01; range, 0.0–0.09), indicating reliable separation of cell populations. In a t-distributed stochastic neighbor embedding (t-SNE) visualization of sorted populations, samples clustered mostly by cell type (Figure S1B), with gliomas and BrMs discernible as separate groups in the CD45− population. In this global expression analysis in the context of the other major brain TME components, CD49Dlow and CD49Dhigh TAM populations clustered closely, suggesting broad transcriptomic similarity. We thus further interrogated the utility of CD49D to differentiate between TAM populations by analyzing association of MG- and MDM-specific ontogeny core gene sets, identified previously from lineage-tracing studies (Bowman et al., 2016), in human CD49Dlow and CD49Dhigh cells sorted from non-malignant and brain cancer tissues. This revealed enrichment of ontogeny core gene sets in the corresponding cell type (Figure 1C), demonstrating our ability to accurately distinguish MG and MDMs in human samples across different disease entities. Interestingly, these core signatures were influenced within certain tumor types, with T-MDMs showing an increased MG core gene set signal in IDH mut gliomas and T-MG acquiring MDM features in BrMs, suggesting tissue-dependent transcriptional programming of these cells, as further interrogated below.

We next assessed the landscape of intratumoral immune cell populations (Figure S1A; Table S2) using clustering analysis to identify patterns of cellular abundance (Figure 1D; chi-square test for independence, p < 0.0001). This revealed three major clusters: (1) non-tumor samples and IDH mut gliomas characterized by dominance of MG with low numbers of other immune cells; (2) IDH WT gliomas and several BrMs with an influx of MDMs and, to some extent, neutrophils into the tumor while mostly excluding lymphocytes; and (3) predominantly BrMs and few IDH WT gliomas exhibiting the most diverse immune cell landscape with substantial infiltration of T cells and neutrophils. Certain tumors contained CD14low/CD16+ non-classical monocytes, CD14+/CD16+ intermediate monocytes, CD16− granulocytes, dendritic cells (DCs), or immature myeloid cells. Across all samples, the lymphocyte compartment was mostly composed of T cells with fewer natural killer (NK) cells and B cells.

Principal-component analysis (PCA) of the relative abundance of all investigated populations confirmed that MG, MDMs, neutrophils, and CD4+ and CD8+ T cells are the major immune cell determinants of the brain TME landscape (Figure 1E). Principal component 1 (PC1) separated non-tumor tissue and IDH mut gliomas from IDH WT gliomas and BrMs, whereas PC2 distinguished IDH WT gliomas and BrMs. Further analysis stratifying for IDH status in gliomas and the primary tumor site in BrMs verified a substantially higher proportion of lymphocytes in BrMs (Figure 1F; mean lymphocytes %CD45+ = 46.23%, SEM = 4.15, t test, p < 0.0001). Melanoma BrMs exhibited the most abundant lymphocyte infiltrate with a sizeable CD8+ T cell fraction (mean CD8+ %CD45+ = 33.01%, SEM = 5.82, one-way ANOVA, p < 0.01). Regulatory T cells (Tregs) were detected in certain BrMs (mean Treg %CD45+ = 1.2%, SEM = 0.36) but were rare in gliomas (mean Treg %CD45+ = 0.25%, SEM = 0.05, t test, p < 0.05).

Because of the prominence of T-MG and T-MDMs in the myeloid compartment of brain malignancies, we used IF staining and deconvolution analyses to independently validate their presence. Commonly employed MG markers, such as P2RY12, TMEM119, and SALL1, and MDM-associated genes, such as AHR and VDR, showed varying RNA expression levels across different brain malignancies while maintaining their cell type specificity (Figure S2A) in a similar manner as observed for the ontogeny core gene sets (Figure 1C). An equivalent pattern was observed at the protein level (Figure S2B), where P2RY12 showed the highest expression in non-tumor tissue, and CD68 was most abundant in BrM-TAM populations. This necessitated use of both markers complemented by CD49D to reliably identify MG and MDMs in IF analyses (Figure S2C). We used this strategy to interrogate a cohort of non-tumor, glioma, and BrM samples by whole-section quantification, confirming MDM accumulation in IDH WT gliomas and BrMs (Figures 2A–2C). Furthermore, comparison of tissue processed independently for IF and FCM from the same individual samples demonstrated significant concordance (Figure S2D).

We queried the sorted cell populations for T-MG- and T-MDM-specific differentially expressed genes (DEGs) that separate these two populations from the most abundant other cell types; i.e., CD45− cells, neutrophils, and T cells (Figure S2E). Several of the genes highly expressed in T-MG are well-established MG markers (P2RY12, TMEM119, and TAL1), whereas genes highly expressed in T-MDMs include markers of alternative macrophage polarization (FCGR2B and CLEC10A) and DC-like phenotypes (CD1C, CD1B, and CD207) with increased phagocytic and antigen cross-presentation ability (CD209). These gene sets also allowed us to utilize a publicly available integrated dataset (Vivian et al., 2017) containing bulk expression data of healthy cortical brain tissue from the Genotype-Tissue Expression project (GTEx; GTEx Consortium, 2013) and low- and high-grade glioma samples from The Cancer Genome Atlas (TCGA; Ceccarelli et al., 2016) in a bulk tissue transcriptome deconvolution approach (Racle et al., 2017). The estimates obtained of MG and MDM proportions in this external dataset (n = 711 samples) verified the prevalence of MG in IDH mut gliomas and MDM enrichment in IDH WT gliomas (Figure 2D).

MG and MDMs Exhibit a Multifaceted Polarization Phenotype in Brain Malignancies
We next employed PCA to specifically focus on TAMs and analyze genes whose expression was influenced by tissue type (i.e., reference MDMs, non-tumor brain, gliomas, and BrMs) and cell type (i.e., MG and MDMs) (Figure 3A). Within the first two PCs, MG and MDMs projected into different spaces, with in vitro differentiated MDMs distinct from tissue-derived samples. We observed a gradient across PC1 with non-tumor brain tissue at one end, traversing IDH mut and IDH WT gliomas, and ending with BrMs. Thus, TAM transcriptomic changes are influenced by the brain TME per se and also by the specific type of malignancy.

We contrasted T-MG and T-MDMs from BrMs or gliomas (regardless of IDH mutation status) with MG from non-tumor brain or in vitro differentiated MDMs from healthy donors, respectively (Figure S3A; Tables S3A and S4). This revealed profound expression changes in both populations, with T-MDMs exhibiting a higher magnitude in their transcriptional response
Figure 2. Analysis of MG and MDM Abundance

(A and B) Representative IF images (A) and corresponding cell type identification (B) of MG (CD45+, P2RY12+/CD68+, CD49D−), MDMs (CD45+, P2RY12+/CD68+, CD49D+), and non-immune (CD45−) and non-TAM immune cells (CD45+, P2RY12−/CD68−, CD49D−/+) in non-tumor brain tissue, IDH mut and IDH WT gliomas, and BrMs. Scale bars, 100 μm. Insets show quantification per field of view (FOV).
(C) IF quantification of MG and MDM abundance in non-tumor brain tissue (n = 6), IDH mut (n = 16) and IDH WT (n = 16) gliomas, and BrMs (n = 24).
(D) Deconvolution of merged GTEx and TCGA glioma datasets, showing relative abundance of MG, MDMs, and non-TAMs (‘‘other cells’’) in healthy frontal cortex
and IDH mut and IDH WT gliomas.
Wilcoxon rank-sum test was used for statistical analysis: *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001. See also Figure S2.
compared with T-MG (Figure S3A). The intersect of DEGs in gliomas and BrMs was highest in T-MDMs (Figure S3B), potentially reflecting the greater changes experienced by these cells upon entering the completely foreign environment of a brain tumor. This was also evident when focusing on genes upregulated in gliomas and BrMs that are exclusive to T-MG or T-MDMs (Figure 3B). In T-MG and T-MDMs, the number of shared genes was higher across different diseases than between these two cell populations within the same tumor type. Consequently, only a small number of genes (n = 137) showed concordant upregulation across a comparison of all diseases and TAM types (Figure 3B).
To explore the underlying biological processes conserved in gliomas and BrMs, we examined the intersect of upregulated genes (Figure S3B) in T-MG or T-MDMs using gene set overrepresentation analysis (ORA). In the Molecular Signatures Database (MSigDB; Liberzon et al., 2015) “hallmark” collection of major biological categories, T-MG and T-MDMs showed pathway enrichment in (1) modeling of the TME (“Angiogenesis,” “Hypoxia”), (2) inflammation (“Inflammatory Response,” “Allograft Rejection”), and (3) immune cell activation states (“TNFα Signaling via NFκB,” “Interferon α Response,” “Interferon γ Response,” “IL2 STAT5 Signaling,” and “IL6 JAK STAT3 Signaling”) (Figure S3C).
We also assessed the M1 and M2 polarization status of T-MG and T-MDMs using a panel of marker genes (Murray et al., 2014). However, no evident pattern emerged of a defined M1 versus M2 phenotype in glioma or BrM T-MG or T-MDMs (Figure S3D). To further explore the activation state of T-MG and T-MDMs, we subjected their respective upregulated genes to ORA of macrophage stimulus-specific programs (Xue et al., 2014). This revealed a multifaceted response (Figure 3C) incorporating canonical M1 (interferon γ [IFNγ]) and M2 polarization (interleukin-4 [IL-4]), including expression changes associated with chronic inflammatory stimuli (tumor necrosis factor alpha [TNF-α] + prostaglandin E2 [PGE2] and TNFα + PGE2 + Pam3CysSerLys4 [TPP]) and exposure to free fatty acids (oleic acid [OA] and palmitic acid [PA]), which have been implicated in modulating myeloid cell function (Thapa and Lee, 2019). This indicates diverse transcriptional programming of T-MG and T-MDMs in gliomas and BrMs extending beyond simple M1 versus M2 polarization.
To understand which processes are linked to and potentially driving these responses, we identified the gene set enrichment
analysis (GSEA; Subramanian et al., 2005) leading-edge genes in T-MG and T-MDMs in gliomas and BrMs and clustered them into leading-edge metagenes (LEMs) with non-negative matrix factorization (Godec et al., 2016). This identified up to 5 distinct LEMs per cell type and comparison that were tested for significant overlap in a pairwise fashion (Figure S3E) and annotated using Gene Ontology (GO) terms (Figure 3D). LEMs associated with mitosis and cell proliferation were present in T-MG and T-MDMs in gliomas and BrMs (Figure 3D, group 1). The biological validity of these LEMs was verified by staining for Ki-67, a marker of cell proliferation, in non-tumor, glioma, and BrM tissue sections (Figure 3E), showing increased proliferation in T-MG and T-MDMs in IDH WT gliomas and BrMs and in T-MG in IDH mut gliomas. Interestingly, LEMs enriched for type I IFN signaling were detected in glioma and BrM T-MDMs and in BrM T-MG but not in glioma T-MG (Figure 3D, group 2). Sustained type I IFN signaling has been implicated in mediating immune suppression and ICB resistance (Benci et al., 2016). The stringency of these group 2 LEMs was validated by building a protein-protein interaction (PPI) network of the shared LEM genes (Figure S3F). Beyond their role in antiviral responses, the genes highlighted at the center of the PPI network (Figure S3F, red nodes) have been implicated in a variety of tumor-promoting and -suppressing roles (Benci et al., 2016). Similarly, the more peripheral network nodes IL15 and TNFSF10 are potentially able to modulate an effective immunological anti-tumor response or induce apoptosis in cancer cells, respectively (Bouralexis et al., 2005; Santana Carrero et al., 2019). We asked whether these genes were directly induced by secreted factors in the brain TME and established cell-based assays to expose MDMs to TME conditioned medium (CM) generated from single-cell suspensions of freshly isolated glioma or BrM samples in culture. All genes analyzed were upregulated by BrM-TME-CM and to a lesser extent by glioma TME-CM (Figure 3F). We also detected induction of inflammation- and nuclear factor κB (NFκB) signaling-associated LEMs in BrM-MG, glioma MDMs, and BrM-MDMs (Figure 3D, group 3). LEMs that point toward a Th17 response (group 4) and recruitment of immune cells and interactions between different immune cell compartments were exclusively detected in MDMs (group 5). Collectively, these analyses reveal acquisition of a multifaceted activation state of MG and MDMs upon their integration into the TME of brain malignancies.

IDH Mutation Status Associated with Changes in Glioma TAM Activation
We next asked whether MG and MDMs occupy distinct regions within the TME of IDH WT gliomas. Spatial analysis of tissue sections showed significant enrichment of both populations in the perivascular niche (Figures 4A and S4A). Analysis of their distribution relative to CD31+ vascular structures showed a closer proximity of T-MDMs compared with T-MG (Figures 4B and S4A). Interrogation of anatomical transcriptome data from the Ivy Glioblastoma Atlas Project (Ivy GAP) study (Puchalski et al., 2018) also demonstrated enrichment of T-MDMs in the microvascular compartment (Figure S4B). This enrichment coincided with CD4+ and CD8+ T cells, indicating further spatial TME organization in IDH WT gliomas.
We assessed whether the distinct T-MG and T-MDM distributions and cell numbers are paralleled by their activation state. In the LEM analysis, we had detected a type I IFN response in glioma MDMs but not MG (Figure 3D); we therefore queried the FCM data to analyze levels of major histocompatibility complex (MHC) class II human leukocyte antigen-DR isotype (HLA-DR) expression. This showed significantly increased HLA-DR in T-MDMs compared with T-MG in IDH mut and IDH WT tumors (Figure 4C). We screened the associated RNA-seq data for antigen processing and presentation pathway gene sets using GSEA and gene set variation analysis (GSVA) (Figure 4D). Interestingly, we found evidence of increased expression of MHC class II antigen presentation gene sets in IDH WT glioma MDMs and also antigen processing-associated pathways (Figure S4C) and MHC class I presentation gene sets (Figure 4D). Although these findings suggest the potential of TAMs, particularly T-MDMs, to initiate an immune response, this potential is generally not realized in the glioma TME, based on the current status of ICB trials in this disease, and we thus asked whether there was also evidence of pro-tumor states in these cell populations.
We compared T-MG and T-MDMs from IDH WT gliomas with T-MG from IDH mut gliomas because they constitute the most abundant TME cell types in these tumors, respectively (Figures 1F and 2C; Table S5). This revealed 489 DEGs in T-MG (Figure 4E; Table S5; 406 up- and 83 downregulated), and 1,478 DEGs in T-MDMs (Figure 4F; Table S5; 903 up- and 575 downregulated). Although these gene lists were generated by comparing T-MDMs from IDH WT gliomas with T-MG from IDH mut gliomas, they similarly separated T-MDMs in IDH mut versus IDH WT disease in a clustering analysis (Figure 4F), indicating that they indeed reflect T-MDM alterations based on the IDH status of the tumor. 421 genes exhibit a similar pattern across both TAM cell types (343 up- and 78 downregulated), suggesting that T-MG and T-MDMs can also acquire a common transcriptional pattern in IDH WT tumors. Among the shared genes were several encoding extracellular matrix (ECM) proteins (Figure 4G, FN1 and VCAN) and ECM-associated matricellular proteins (THBS1, TGFBI, LGALS3, and ANGPTL4) that regulate the availability of ECM-sequestered ligands, angiogenesis, and tumor

Figure 3. MG and MDMs Exhibit a Multifaceted Polarization Phenotype in Brain Malignancies
(A) PC biplot of MG and MDM transcriptome data from non-tumor brain tissue, IDH mut and IDH WT gliomas, and BrMs (for clinical information, see Table S3A; reference = in-vitro-generated MDMs; proportion of variance shown on PC axes).
(B) Visualization of intersects of the conserved sets of significantly upregulated genes in MG and MDMs. Intersects between sets are shown in the combination matrix. The number of genes found uniquely in a gene set or intersect is indicated above individual bars.
(C) Stimulus-specific macrophage gene expression modules overrepresented (within conserved differentially expressed genes [DEGs] versus respective references) in tumor-associated MG (T-MG) and tumor-associated MDMs (T-MDMs). Bar heights and color indicate significance level. GC, glucocorticoid; IFNγ, interferon gamma; LA, lauric acid; LiA, linoleic acid; OA, oleic acid; PA, palmitic acid; PGE2, prostaglandin E2; sLPS, standard lipopolysaccharide; TNF-α, tumor necrosis factor alpha; TPP, TNFα + PGE2 + Pam3CysSerLys4; IL-10, interleukin 10.
(D) Heatmap of GO overrepresentation analysis of leading-edge metagenes (LEMs) in MG and MDMs from gliomas and BrMs. Tile fill indicates significance (hypergeometric test, -log10 (adjusted p value), terms were filtered by significance).
(E) IF quantification of the proportion of proliferating Ki67+ MG and MDMs in non-tumor tissue (n = 5), IDH mut (n = 10) and IDH WT gliomas (n = 9), and BrMs (n = 8). Means were compared with one-tailed t test: *p < 0.05.
(F) qRT-PCR of type I IFN LEM marker genes from group 2 (Figure S3F) in in-vitro-generated MDMs stimulated with the indicated TME culture-conditioned medium (TME CM). Fold changes were calculated relative to colony-stimulating factor-1 (CSF-1)-treated MDM baseline (one-way ANOVA, p < 0.1, n = 4–11 MDM samples).
Data are represented as mean ± SEM.
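
The hypergeometric overrepresentation test named in (D) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the toy gene counts, and the choice of Benjamini-Hochberg adjustment are all hypothetical (the caption states only that p values were adjusted).

```python
from math import comb, log10


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N genes from a universe of M genes,
    n of which belong to the gene set of interest."""
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p values, returned in input order.
    (Assumed adjustment method; the source does not name one.)"""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy numbers (hypothetical): universe of 20 genes, a GO term with 5 genes,
# a metagene of 4 genes, 3 of which fall in the term.
p_term = hypergeom_sf(k=3, M=20, n=5, N=4)
adj = benjamini_hochberg([p_term, 0.2, 0.5])
print(p_term, -log10(adj[0]))
```

The last line prints the raw p value and the -log10 of the adjusted value, which is the quantity a heatmap tile such as the one in (D) would encode.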
immunity (Mushtaq et al., 2018). This suggests that TAMs help shape the composition and effector functions of ECM proteins in IDH WT tumors. We also found the anti-inflammatory molecules ANXA1 and GPNMB (Figure 4G), previously implicated in pro-tumorigenic macrophage polarization and inhibition of T cell activation (Kobayashi et al., 2019; Ripoll et al., 2007), to be upregulated in T-MG and T-MDMs.
We next investigated inflammation mediators within the CD45− population of IDH WT tumors in parallel with their corresponding receptors in TAMs. TGFB2 expression was elevated compared with IDH mut CD45− cells, and the accessory transforming growth factor β (TGF-β) receptor ENG was highly expressed in IDH WT TAMs (Figure 4H). TGFB2 has pleiotropic effects in inflammation and tissue remodeling during wound healing and has been implicated in an autocrine signaling loop in glioblastoma cells (Rodón et al., 2014). The neuroinflammatory cytokine MDK, which modulates TAM polarization to an M2-like phenotype in glioma (Meng et al., 2019), was upregulated in CD45− cells from IDH WT tumors, and its receptors SDC4 and ITGA4/CD49D were differentially expressed in T-MDMs versus T-MG (Figure 4H), suggesting cell-type-specific effects of this inferred signaling loop.
We asked whether a T-MDM-specific gene set generated from IDH WT gliomas was associated with a survival difference in patients. By logistic regression, we derived a representative signature consisting of 36 genes (Figure S4D) from the total number of genes upregulated in TAMs in brain malignancies (Figure 3B). This included the macrophage marker RUNX3; the atypical chemokine receptor ACKR3, which can regulate CXCL12-CXCR4 signaling; the endoplasmic reticulum (ER) stress protein HERPUD1 and the inhibitory Fc receptor FCGR2B, which can modulate macrophage activation (Bournazos et al., 2016; Li et al., 2018); and the cytokine IL19, which affects angiogenesis and macrophage polarization (Richards et al., 2015). The signature was used to classify patients in a merged TCGA dataset of low- and high-grade gliomas (Figures 4I and S4E). In IDH mut patients, a decrease in median overall survival was associated with enrichment of the T-MDM IDH WT signature, whereas IDH WT patients with a low enrichment score showed increased survival. This was confirmed in a multivariate Cox proportional hazards model that included the transcriptomic glioma subtypes (as annotated in the TCGA dataset) and IDH status (Figure 4J). To verify that this effect did not simply reflect changes in T-MDM number, we classified the TCGA cohort based on enrichment of the T-MDM-specific gene set used for deconvolution, which showed a low effect on survival (Figure S4F).
In light of disappointing outcomes from PD1 or PDL1 ICB trials in glioblastoma to date, we queried whether the abundant T-MG and T-MDMs could contribute to the limited therapeutic efficacy. We performed ORA of a panel of 20 gene sets previously associated with innate anti-PD1 resistance (IPRES; Hugo et al., 2016) in the TAM DEGs of IDH WT gliomas and found a sizeable fraction to be upregulated in T-MG and T-MDMs (Figure S4G). We then included the CD45− population and interrogated enrichment of IPRES gene sets on the single-sample level by GSVA (Figure S4H). This yielded a diverse picture with tumor cells and TAMs enriched for IPRES gene sets to varying degrees. Therefore, TAMs and CD45− cells from IDH WT gliomas may contribute to mediating innate ICB resistance.

The Immune Contexture Influences the TME on a Global Level
Through integrated analysis of protein and gene expression data, we next explored the effect of immune cell infiltration. Of 200 inflammation-associated proteins assessed, 55 were differentially detected in our sample cohort (for clinical information, see Table S3B). Unsupervised clustering analysis revealed distinct clusters with abundant inflammatory proteins in tumors (Figure 5A). The profile of IDH WT gliomas and BrMs showed a sizeable overlap (protein cluster 1), encompassing angiogenic factors (VEGFA and ANG), growth factors (PDGFA, TGFB1, SPP1, and GDF15), several proteases and protease inhibitors (SERPINE1, CTSS, and TIMP1), the proteolysis cascade regulator PLAUR, and the cytokines CCL2 and CCL5 (Figures 5A and S5A). However, we also found distinct protein patterns between gliomas and BrMs. The neurotrophic growth factor FGF2 and neuronal cell adhesion molecules, including ALCAM, which regulates immune cell infiltration during
Figure 4. IDH Mutation Status Shapes TAM Activation in Gliomas

(A) Number of MG and MDMs per square millimeter in the perivascular niche (PVN) or distant from the PVN (non-PVN) in IDH WT gliomas by IF staining. Means
were compared with Wilcoxon signed-rank test: ***p < 0.001.
(B) Distance of MG and MDMs to the nearest vessel in IDH WT gliomas (nsamples = 14, nMG = 88,781, and nMDM = 92,969 cells counted).
(C) Boxplot of HLA-DR geometric mean fluorescence intensity measured by FCM in MG and MDMs in IDH mut and IDH WT gliomas. MG and MDMs from the
same samples are connected by lines (nIDH mut = 17, nIDH WT = 39; Wilcoxon signed-rank test: ***p < 0.001, ****p < 0.0001).
(D) GSVA of antigen processing and presentation pathways from the Molecular Signatures Database (MSigDB) ‘‘Canonical Pathways’’ collection with significant
differential enrichment between MG and MDMs in IDH WT tumors and in MG and MDMs across IDH mut and IDH WT samples. Columns are ordered by IDH
mutation status and cell type, and rows (Z score) are hierarchically clustered.
(E and F) Expression heatmap of T-MG (E) and T-MDM (F) DEGs (compared with T-MG in IDH mut gliomas) in IDH mut and IDH WT glioma samples. Columns and
rows (Z score) are hierarchically clustered.
(G) Normalized counts of selected genes in MG and MDMs in gliomas stratified by IDH status. Data are represented as mean ± SEM.
(H) Relative expression in CD45−, MG, and MDM cells of ligands and receptors upregulated in CD45− cells in IDH WT versus IDH mut samples and their matching counterparts. Variance-stabilized expression values were scaled to the expression range.
(I) Kaplan-Meier estimator of survival in the TCGA glioma cohort based on enrichment for the MDM IDH WT signature, assessed by GSVA in IDH mut and IDH WT
gliomas from the combined TCGA cohort. GSVA scores were separated into tertiles across the combined IDH mut and IDH WT sample set. Pairwise p values were
calculated using a log rank test.
(J) Hazard ratios of a multivariate Cox proportional hazards model with transcriptomic subtype (TCGA annotation), IDH status (TCGA annotation), and T-MDM IDH
WT GSVA score as covariates for overall survival within the TCGA glioma cohort (PN, proneural; NE, neural; CL, classical; ME, mesenchymal subtype).
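
The tertile split and Kaplan-Meier estimate described in (I) can be illustrated with a short sketch. It is a simplified illustration only: the function names and the toy scores, follow-up times, and event flags are hypothetical, and the log rank test and Cox model used in the study are not reproduced here.

```python
def tertile_labels(scores):
    """Label each score 'low', 'mid', or 'high' by rank, so the three
    groups are as equal in size as possible."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, idx in enumerate(order):
        labels[idx] = ("low", "mid", "high")[min(2, rank * 3 // n)]
    return labels


def kaplan_meier(times, events):
    """Kaplan-Meier survival curve as [(time, S(time))] at each event time.
    events: 1 = death observed, 0 = censored."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored cases both leave the risk set
    return curve


# Hypothetical signature scores and follow-up data for six patients.
scores = [0.1, 0.9, 0.5, 0.3, 0.7, 0.2]
months = [30, 6, 18, 24, 9, 36]
died = [0, 1, 1, 0, 1, 0]

groups = tertile_labels(scores)
for group in ("low", "mid", "high"):
    idx = [i for i, g in enumerate(groups) if g == group]
    print(group, kaplan_meier([months[i] for i in idx], [died[i] for i in idx]))
```

Splitting on ranks across the combined cohort mirrors the caption's statement that tertiles were defined over the pooled IDH mut and IDH WT sample set rather than within each group.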
neuroinflammation (Lécuyer et al., 2017), were highly expressed in non-tumor brain, IDH mut, and IDH WT samples (protein cluster 3; Figure S5A). Conversely, BrM samples had abundant immune-regulatory molecules affecting myeloid and lymphocytic cells and their heterotypic signaling (protein cluster 2; Figure S5A), such as CD40L, IL6R, INHBA, and AREG (Morianos et al., 2019; Zaiss et al., 2015), possibly reflecting the greater immune cell diversity in BrMs. This orthogonal dataset reinforces the RNA-seq analyses showing that inflammatory signaling pathways are highly enriched in brain tumors.
We integrated the cell-type-specific RNA-seq data and bulk protein data to distinguish proteins with more restricted expression versus those that are expressed across a range of cell types. Transcriptome data from CD45− cells, TAMs, neutrophils, and CD4+ and CD8+ T cells from all tumor samples were clustered using a self-organizing map (SOM). This yielded 6 SOM spots (i.e., metagenes of co-expressed genes; Figure 5B) that recapitulated the respective cell lineages (Figure S5B). The CD45− populations were assigned to three distinct spots that were associated with more aggressive IDH WT gliomas and BrMs (spot VI) or reflected the brain-intrinsic or -extrinsic tumor origin (spots I and V). These cell-type-associated SOM spots overlapped considerably with the protein data (30 of 55 proteins, Fisher’s exact test, p < 0.0001; Figure 5C). Although VEGFA, ANG, and TGFB1 were expressed by diverse cell types in gliomas and BrMs, other genes, such as GDF15 and IGFBP2, showed more CD45− cell-restricted expression (Figure 5D). The significant contribution of TAMs to production of key inflammatory proteins, including SPP1 and INHBA, is reflected by TAM SOM spot III, constituting the largest group of proteins with cell-type-specific expression (Figures 5C and 5D).

Myeloid Cells Show a Distinct Phenotype in BrMs
Our global analysis juxtaposing the expression patterns of TAMs in gliomas (regardless of IDH status) with BrMs showed disease- and cell-type-specific transcriptomic changes. We thus explored BrM-specific alterations by focusing on genes upregulated only in relation to the corresponding reference and to IDH WT gliomas (Figures S6A and S6B; Table S6).
Various cytokines, chemokines, and pro-inflammatory molecules were elevated in BrM-MG and BrM-MDMs (Figure 6A), including the potent mediators of autoimmune neuroinflammation CSF2 and IL23A (Zhao et al., 2017) and the pattern recognition receptor MARCO. Intriguingly, antibody-mediated MARCO targeting in extracranial tumors increases M1-like polarization of TAMs and enhances ICB efficacy (Georgoudaki et al., 2016). These effects relied on interaction of the antibody with FCGR2B, which is also part of the T-MDM IDH WT signature (Figures S2E and S4C). Finally, RETN, which is involved in systemic inflammatory disorders (Filková et al., 2009), was upregulated in BrM-TAMs (Figure 6A).
Analysis of individual BrM-TAM populations uncovered distinct expression patterns. BrM-MG showed restricted upregulation of IL6 (Figure 6A), which exerts immunosuppressive effects on T cell function in cancer and mediates ICB resistance (Tsukamoto et al., 2018), and the receptor TREM1, which modulates pro-inflammatory responses in MG and systemically in myeloid cells during neuroinflammation (Liu et al., 2019; Xu et al., 2019). Among the upregulated chemokines, we found increases in both TAM cell types (CCL23) and BrM-MG-restricted (CXCL5 and CXCL8) or BrM-MDM-restricted increases (CCL8, CCL13, CCL17, and CCL18) (Figure 6A). These results reveal distinct contributions of TAM populations to the inflammatory TME milieu in a disease-specific manner.
GSEA identified additional cell-type-specific enrichment patterns. BrM-MG showed evidence of IL-6 pathway activity (Figure S6C), and in BrM-MDMs, the “Naba core matrisome” gene set was significantly enriched (Figure S6D). This prompted us to assess expression of genes encoding ECM and matricellular proteins in BrM-MDMs versus BrM-MG, which revealed genes encoding matrix proteins, including type III and IV collagens, FN1, the proteoglycans LUM and OGN, and the matricellular proteins ECM1, SPARC, and SPARCL1 as highly expressed in BrM-MDMs (Figure 6B). Although ECM remodeling has been implicated in tumor progression, LUM, OGN, SPARC, and SPARCL1 exhibit pro- and anti-metastatic properties, which underscores the complex context-dependent role of the ECM (Kai et al., 2019). We also found high expression of the cathepsin proteases CTSB and CTSW in BrM-MDMs (Figure 6B), which participate in multiple tumor-promoting processes, including invasion and metastasis (Olson and Joyce, 2015). The hyaluronan receptor HMMR, involved in macrophage chemotaxis and fibrosis in lung injury (Cui et al., 2019), was also higher in BrM-MDMs (Figure 6B). Together, these data suggest that the ECM is not only shaped by macrophages at the primary site (Afik et al., 2016) but that T-MDMs may also play a pivotal role in ECM niche construction in BrM that is distinct from IDH WT gliomas (Figure 4G).
Given the upregulation of CXCL8, a key neutrophil chemoattractant, by BrM-MG (Figure 6A), we explored the TME contribution to recruitment of neutrophils, which were highly abundant in BrM (Figure 1F). Analysis of major neutrophil-recruiting chemokines and their receptors showed broad expression across all interrogated myeloid cells (Figure S6E). To explore the phenotype of BrM-associated neutrophils, we queried the RNA-seq data, which revealed BrM-specific upregulation of ITGA3 (Figure 6C), which is involved in neutrophil tissue infiltration in sepsis, and CXCL17, previously implicated in neutrophil and macrophage recruitment in cancer (Li et al., 2014). We also observed upregulation of the

Figure 5. The Immune Contexture Influences the TME on a Global Level
(A) Inflammation-associated bulk tissue protein concentration heatmap subset on 55 proteins with significantly different concentrations between non-tumor brain, gliomas, and BrMs in an ANOVA (p < 0.1; n = 3 non-tumor, n = 14 glioma, n = 12 BrM; concentrations were log10-transformed and Z scored). Rows and columns are hierarchically clustered. Clinical data are annotated per row; column annotation reflects the major protein clusters (further information can be found in Table S3B).
(B) Self-organizing map (SOM) of RNA expression data of major cell populations in glioma and BrM samples. SOM spots are highlighted, numbered with Roman numerals, and annotated with their cell type association.
(C) Overlap of individual proteins and SOM spot metagenes; tile color fill reflects protein cluster membership from (A).
(D) RNA-seq counts (normalized, scaled to expression range) of proteins from (A) across major cell types in IDH mut and IDH WT gliomas and BrMs. SOM spot membership of individual genes is indicated per row.
See also Figure S5.
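
The log10 transformation and Z scoring mentioned in (A) amount to a simple per-protein rescaling, sketched below. This is an illustrative assumption-laden example rather than the authors' code: the function name and the concentrations are hypothetical, and the use of the sample standard deviation (rather than the population one) is a choice made here, since the caption does not specify it.

```python
from math import log10
from statistics import mean, stdev


def log10_zscore(concentrations):
    """Log10-transform one protein's concentrations across samples, then
    center and scale them to Z scores (sample SD; assumed convention).
    All concentrations must be strictly positive."""
    logged = [log10(c) for c in concentrations]
    mu = mean(logged)
    sd = stdev(logged)
    return [(x - mu) / sd for x in logged]


# Hypothetical concentrations of one protein in three samples.
print(log10_zscore([1.0, 10.0, 100.0]))
```

Scoring each protein separately puts proteins with very different absolute concentrations on a common scale before hierarchical clustering.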
Figure 6. Myeloid Cells Show Distinct Transcriptional Changes in BrMs

(A) Normalized counts of the indicated genes in MG and MDMs in non-tumor or reference, IDH WT gliomas, and BrMs. Data are represented as mean ± SEM.
(B) Expression heatmap of extracellular matrix-associated genes differentially expressed between MG and MDMs in BrMs. Rows are Z-scored and manually sorted, and columns are ordered by cell type.
(C) Expression of the indicated BrM-specific genes in neutrophils from unmatched healthy blood, IDH WT gliomas, and BrMs. Data are represented as mean ± SEM.
adenosine receptor ADORA2A (Figure 6C), which attenuates the phenotype of pro-inflammatory neutrophils (Barletta et al., 2012). Furthermore, we found increased expression of CD177 (Figure 6C), a cell surface receptor that modulates neutrophil migration and activation and serves as a marker for PR3-positive neutrophils, which, in turn, negatively affect T cell proliferation (Yang et al., 2018). Notably, MET, which has been linked to recruitment of immunosuppressive neutrophils in cancer (Glodde et al., 2017), was upregulated in neutrophils in a BrM-specific manner (Figure 6C). In sum, we have uncovered multiple disease-specific alterations of myeloid cells extending beyond BrM-TAMs to neutrophils, which has potential implications for the recruitment and activation of other cell types within the TME, including T cells.

TAMs Are Poised toward an Immunomodulatory Phenotype in BrMs
Although we found a significant accumulation of CD4+ and CD8+ T cells in BrMs versus IDH WT gliomas by FCM, this analysis of dissociated tissue samples lacks structural information. We thus performed neighborhood analysis of IF-phenotyped IDH WT and BrM tissue sections to elucidate whether there is a spatial relationship between TAMs and CD3+ T cells in BrMs. In IDH WT gliomas, T-MG and T-MDMs mostly neighbored homotypic cells while lacking T cells in their close vicinity (Figures 7A, 7B, and S7A), possibly reflecting the general T cell sparseness in these tumors. In contrast, both TAM populations neighbored T cells far more frequently in BrMs, indicating the potential for interaction (Figures 7A, 7B, and S7A).
We thus investigated the T cell activation state in BrMs in relation to unmatched healthy donor blood and also juxtaposed them to the corresponding populations from IDH WT gliomas. Compared with controls, CD4+ T cells from BrM showed evidence of a hyporesponsive, anergic phenotype (Figure 7C), whereas CD8+ T cells exhibited an exhaustion signature (Figure 7D), which usually occurs upon chronic activation, resulting in upregulation of inhibitory receptors. These defective T cell states can be caused by aberrant activation or T cell inhibition by tumor cells and antigen-presenting cells in the TME and are a major obstacle in treating cancers.
To delineate putative mechanisms in the BrM TME that may drive these alterations, we probed the RNA-seq data from CD45− cells, TAMs, and T cells (Figure 7E) for expression of activating and inhibitory immunomodulatory signals (Wei et al., 2018). This revealed upregulation of various canonical T cell
activators and co-activators but also mediators of inhibition in T cells (PDCD1/PD1, CD28, and CTLA4), whereas T cell-inhibiting and activating signals were detected in both TAM populations (CD274/PD-L1 and PDCD1LG2/PD-L2). Notably, we found an upregulation of CD80, which has diverse roles in T cell activation because it heterodimerizes with CD274, provides co-stimulatory signals to T cells via CD28 and exerts inhibitory effects via interaction with CTLA4 (Zhao et al., 2019), in both TAM populations compared with their normal references and IDH WT tumor populations (Figure 7E). The potential contribution of TAMs to metabolic immune evasion is also suggested by high expression of IDO1 and IDO2 (Zhai et al., 2018) in BrM (Figure 7E).
We investigated additional immunomodulatory mediators using weighted gene correlation network analysis (WGCNA; Langfelder and Horvath, 2008) and correlated the resulting expression patterns with paired FCM abundance of CD4+ and CD8+ cells in a disease- and cell-type-specific manner. We identified 15 unique co-expression modules showing significant correlation (p < 0.05) of their eigengenes (i.e., the first PC of the module expression data) with any of the provided sample traits (Figure S7B). Among these, the “brown” WGCNA module correlated with a specific BrM-MDM annotation and CD4+ T cell abundance. ORA of this module revealed signals for pathways such as coagulation and ECM modulation (Figure S7C) that affect the availability and activity of growth factors and cytokines within the TME (Mohan et al., 2020). We ranked genes by module membership strength and correlation with CD4+ T cell abundance, which identified several factors with opposing immunomodulatory functions (Figure 7F). Although the receptors CD300E and BST1 promote monocyte motility and survival (Isobe et al., 2018; Ortolan et al., 2019), we also detected effectors of immunosuppression, such as the actin-associated regulatory protein CNN2, which negatively regulates macrophage motility and phagocytic activity (Huang et al., 2008). The leukocyte immunoglobulin-like receptor subfamily B members LILRB2 and LILRB3, which attenuate myeloid cell activation (van der Touw et al., 2017), are also highly ranked genes within this module. Interestingly, LILRB2 has been identified as a novel myeloid immune checkpoint that limits antitumor immunity (Chen et al., 2018). We also found evidence of effects on T cells; CD52, which, in its soluble form, inhibits T cell function, was among the BrM-MDM module genes. The notion that BrM-MDMs undergo disease-specific alterations distinct from the primary extracranial tumor is supported by upregulation of these genes (Figure 7G) in our analysis of an external cohort of BrM samples compared with their matched primary tumor tissue (Varešlija et al., 2019).

DISCUSSION

Brain tumors, including glioblastoma and BrMs, confer some of the poorest prognoses for patients with cancer, with survival rates often measured in just months. Given the current dearth of effective therapeutic options for these patients and the modest effects of the various immunotherapies evaluated to date, it is of critical urgency to identify novel targets for future clinical evaluation. One potentially rich source of therapeutic targets is the TME. However, even though the TME is now widely accepted as an important regulator of cancer progression and therapeutic response, our knowledge of the brain TME is restricted to individual brain tumor types or cellular compartments and lacks comprehensive and integrative analysis.
In this study, we leveraged a diverse panel of analyses to deeply interrogate the immune landscape of primary and metastatic brain cancers. Through integration of multiparameter FCM analyses, RNA-seq data, TME cell culture assays, protein arrays, and spatial tissue characterization, we uncovered critical insights into the composition and transcriptomes of the most abundant immune cell populations in patient samples from IDH mut and WT gliomas and BrMs originating from distinct extracranial primary tumors.
By exploring the broad immune landscape, we uncovered several pronounced differences between gliomas and BrMs when directly compared side by side. In brain tumors, TAMs are composed of tissue-resident MG and recruited MDMs, and we found a significant shift in the ratio of MG to MDMs between IDH mut and IDH WT gliomas. Additionally, gliomas contain an abundance of TAMs, whereas T cells were much fewer, particularly in IDH mut tumors. This confirms the notion that gliomas are immunologically cold tumors (Jackson et al., 2019). Although T cell sequestration in the bone marrow has been observed in glioma mouse models and following intracranial implantation of brain-extrinsic tumors (Chongsathidkiet et al., 2018), our clinical
Figure 7. TAMs Have a Wide Range of Immunomodulatory Functions in BrMs

(A) Representative IF images and corresponding cell type identification of non-immune cells (CD45−), MG (CD45+, P2RY12/CD68+, CD49D−), MDMs (CD45+, P2RY12/CD68+, CD49D+), CD3+ (CD45+, P2RY12/CD68−, CD49D−/+, CD3+) and CD45+ other cells (CD45+, P2RY12/CD68−, CD49D−/+, CD3−) in IDH WT gliomas and BrMs. Scale bars, 50 μm. Insets show quantifications per FOV.
(B) Neighborhood analyses of IDH WT glioma and BrM IF tissue sections. Rows show the mean proportion of each neighboring cell type per frequency of observed n_neighbors in the vicinity of MG or MDMs (n_IDH WT = 9, n_BrMs = 13).
(C and D) Gene set enrichment analysis (GSEA) of a T cell anergy gene set in CD4+ T cells (C) and a T cell exhaustion gene set in CD8+ T cells (D) from the MSigDB
C2 collection.
(E) Gene expression heatmap of antigen-presenting cell (APC) and T cell activating and inhibitory signaling mediators (left panels, scaled to the expression range
of variance-stabilized counts across all cell types in IDH WT glioma and BrMs) and corresponding fold changes (right panels, BrMs versus non-tumor/reference
and IDH WT glioma versus BrMs, absolute log2(fold change) > 1, adjusted p value < 0.05) in CD45− cells, MG and MDMs and CD4+ and CD8+ T cells in IDH WT gliomas
and BrMs. Gray tiles indicate expression below the threshold (normalized counts < 10); white tiles correspond to a non-significant fold change.
(F) Scatterplot of module membership (correlation of expression to the module eigengene) and gene significance (correlation of expression to CD4+ T cell
abundance) of genes from the BrM-MDM-related gene co-expression network. Highly connected genes with immunomodulatory functions are annotated.
(G) Expression of the indicated genes in matched bulk primary breast cancer and BrM tissues using the Vareslija et al. (2019) dataset (Wilcoxon signed-rank test:
***p < 0.001, ****p < 0.0001).
See also Figure S7.
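The three module statistics used in the text and in Figure 7F (the module eigengene as the first PC of the module expression data, module membership as the correlation of a gene with that eigengene, and gene significance as the correlation of a gene with CD4+ T cell abundance) can be illustrated with a minimal sketch. This is not the WGCNA R package and not the authors' pipeline: the gene names, expression values, and abundance values below are invented toy data, and the network construction steps of WGCNA (soft thresholding, module detection) are not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def standardize(values):
    """Center one gene's expression vector and scale it to unit variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((a - mean) ** 2 for a in values) / (n - 1))
    return [(a - mean) / sd for a in values]


def module_eigengene(expr_rows, n_iter=100):
    """First principal component (one score per sample) of a genes x samples
    matrix, found by power iteration on Z^T Z, where Z holds the
    standardized expression of each gene."""
    z = [standardize(row) for row in expr_rows]
    n_samples = len(z[0])
    # Start from the average standardized profile so that the sign of the
    # eigengene follows the average expression of the module.
    v = [sum(row[j] for row in z) / len(z) for j in range(n_samples)]
    for _ in range(n_iter):
        loadings = [sum(r * s for r, s in zip(row, v)) for row in z]  # Z v
        v = [sum(l * row[j] for l, row in zip(loadings, z))           # Z^T (Z v)
             for j in range(n_samples)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    return v


# Hypothetical toy data: 4 genes measured in 8 samples (not from the study).
expr = {
    "GENE_A": [1.1, 2.4, 1.5, 6.2, 4.3, 1.9, 5.6, 3.0],
    "GENE_B": [0.9, 2.0, 1.6, 5.1, 4.0, 1.4, 4.9, 2.9],
    "GENE_C": [2.2, 3.1, 2.3, 6.8, 5.5, 2.9, 6.0, 3.8],
    "GENE_D": [4.0, 3.9, 4.2, 4.1, 3.8, 4.3, 4.0, 4.1],  # weakly related gene
}
# Hypothetical paired trait: CD4+ T cell abundance per sample.
cd4_abundance = [5.0, 12.0, 7.0, 30.0, 22.0, 9.0, 27.0, 15.0]

genes = list(expr)
eigengene = module_eigengene([expr[g] for g in genes])

# Module membership: correlation of each gene with the module eigengene.
module_membership = {g: pearson(expr[g], eigengene) for g in genes}
# Gene significance: correlation of each gene with the sample trait.
gene_significance = {g: pearson(expr[g], cd4_abundance) for g in genes}
# Module-trait relationship: correlation of the eigengene with the trait.
eigengene_trait_r = pearson(eigengene, cd4_abundance)

for g in genes:
    print(g, round(module_membership[g], 3), round(gene_significance[g], 3))
print("eigengene vs trait:", round(eigengene_trait_r, 3))
```

Plotting module membership against gene significance for each gene gives the kind of scatterplot described for Figure 7F; genes in the upper right are both central to the module and strongly associated with the trait.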
BrM samples showed pronounced accumulation of lymphocytes and neutrophils. This indicates that tumors that arise within the brain indeed shape their TME differently than cancers that metastasize from extracranial sites. Moreover, when exploring BrMs that originate from distinct primary tumors, there were additional differences; for example, in melanoma BrM samples, the combined abundance of CD4+ and CD8+ T cells represented the major immune compartment, whereas breast BrM samples showed the highest neutrophil infiltration. These key differences in the TME landscape, which are evident only when directly juxtaposing different brain malignancies, mirror the efficacy of immunotherapies that show promising efficacy in melanoma patients for controlling BrMs but with very modest effects to date in treating T cell-excluded glioblastoma (Schalper et al., 2019).
We also uncovered complex multifaceted phenotypes for TAMs across different brain tumors that extend beyond their numerical abundance. T-MG and T-MDMs showed distinct transcriptomic profiles and shared expression signatures, which are additionally influenced by the underlying disease type (IDH mut versus IDH WT glioma versus BrMs). A T-MDM signature derived from IDH WT gliomas, consisting of macrophage activation markers, chemokine receptors, and cytokines, proved to also be a predictor of patient survival in IDH mut gliomas. Moreover, analyses of T-MDMs indicated that even though these recruited cells have the potential to process and present antigens, and can be located proximally to T cells in BrMs, this potential is evidently not sufficiently utilized within the brain TME. Orthogonal analyses from the diverse panel of experimental assays used in this study reveal additional insights into potential mechanisms of immune suppression. These included our findings that different TAM populations produced pro-inflammatory molecules, negative regulators of myeloid cell activation, factors associated with IPRES, IDO1 and IDO2 immune checkpoint inhibitors, and specific ECM components and proteases that may collectively help sculpt an immune-suppressive niche. Therefore, therapeutic strategies that alter the multifaceted phenotypes of TAMs (Kowal et al., 2019), rather than aiming to simply deplete all of these cells with potentially opposing functions, should be considerably more effective.
Looking beyond TAMs, it will also be critical to assess the roles of neutrophils, particularly in BrMs, where we found them to be highly abundant, because they can act as potent immune-suppressive cells, as indicated by studies of other organs (Coffelt et al., 2016). Given the highly complex and multifaceted immune landscape of brain cancers revealed in this study, it is clear that rational combinations of TME-targeted agents will be critical to avoid the emergence of adaptive resistance, incorporating preclinical studies to help determine optimal combinations (Quail et al., 2016). In sum, this rich resource is available for further interrogation by the research community so that we can work collectively to uncover novel therapeutic strategies that unleash the potential of diverse cells in the TME to combat different brain malignancies.

STAR+METHODS

Detailed methods are provided in the online version of this paper and include the following:
● KEY RESOURCES TABLE
● RESOURCE AVAILABILITY
  ○ Lead Contact
  ○ Materials Availability
  ○ Data and Code Availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
● METHOD DETAILS
  ○ Clinical sample processing, flow cytometry (FCM) and fluorescence activated cell sorting (FACS)
  ○ Tumor microenvironment-conditioned medium (TME-CM) generation
  ○ In vitro generation of monocyte-derived macrophages (MDM) and TME-CM stimulation
  ○ RNA isolation, cDNA synthesis and quantitative real-time PCR
  ○ Immunofluorescence staining and microscopy image acquisition
  ○ Image analysis and cell type identification
  ○ Protein isolation and enzyme-linked immunosorbent assay (ELISA)
  ○ RNA sequencing (RNA-seq)
  ○ Bioinformatic analysis environment
  ○ Gene set-centered analyses
  ○ Deconvolution of Toil-RNA sequencing data
  ○ Leading edge metagene (LEM) analysis
  ○ Protein-Protein-Interaction network building
  ○ Nearest neighbor distance measurements and neighborhood analysis of IF data
  ○ Cell type abundance estimation in spatial Ivy Glioblastoma Atlas Project (GAP) data
  ○ Survival analysis of the IDH wt MDM-specific gene signature in gliomas
  ○ Self-organizing map (SOM) clustering
  ○ Weighted gene correlation network analysis (WGCNA)
  ○ Expression analysis of external dataset of matched primary breast cancer and BrMs
  ○ Plotting and graph generation
● QUANTIFICATION AND STATISTICAL ANALYSIS

SUPPLEMENTAL INFORMATION

cell.2020.05.007.

ACKNOWLEDGMENTS

We thank Prof. Ron Stoop, Dr. Nathalie Piazzon, and the Neurosurgery/Neuro-oncology clinical and nursing teams at CHUV and MSKCC for excellent infrastructural support; the Joyce lab members for insightful discussions; the Hegi lab members for technical help during sample processing; and Vladimir Wischnewski for critical manuscript review. We thank the UNIL and MSKCC Flow Cytometry Core Facilities for exceptional technical assistance, especially Romain Bedel. Finally, we convey our immense gratitude to all patients who volunteered to participate in this study. Research in the Joyce lab is supported by the Swiss Cancer League, a Swiss Bridge award, the Ludwig Institute for Cancer Research, the University of Lausanne, the Breast Cancer Research Foundation, and Cancer Research UK. F.K. was supported in part by the German Research Foundation (DFG, KL2491/1-1) and Fondation Medic and K.S. by the Austrian Science Fund (FWF, J4343-B28). The results shown here are in part based on data generated by the TCGA Research Network (https://www.cancer.gov/tcga).
AUTHOR CONTRIBUTIONS

F.K., R.L.B., and J.A.J. designed the study. F.K., R.R.M., R.L.B., M.K., and K.S. performed experiments and analyzed data. F.K., R.R.M., R.L.B., and S.N. performed computational analyses. C.A.I.-D., C.B., V.T., P.H.G., R.T.D., and M.E.H. provided clinical material. J.-P.B. provided histopathological reviews. F.K. and R.R.M. prepared the figures. F.K. and J.A.J. wrote the manuscript. All authors edited or commented on the manuscript.

DECLARATION OF INTERESTS

The authors declare no competing interests.

Received: November 29, 2019
Revised: April 1, 2020
Published: May 28, 2020

REFERENCES

Afik, R., Zigmond, E., Vugman, M., Klepfish, M., Shimshoni, E., Pasmanik-Chor, M., Shenoy, A., Bassat, E., Halpern, Z., Geiger, T., et al. (2016). Tumor macrophages are pivotal constructors of tumor collagenous matrix. J. Exp. Med. 213, 2315–2331.
Aldape, K., Brindle, K.M., Chesler, L., Chopra, R., Gajjar, A., Gilbert, M.R., Gottardo, N., Gutmann, D.H., Hargrave, D., Holland, E.C., et al. (2019). Challenges to curing primary brain tumours. Nat. Rev. Clin. Oncol. 16, 509–520.
Baddeley, A.D., Eysenck, M.W., and Anderson, M.C. (2015). Spatial Point Patterns: Methodology and Applications with R (London: Chapman and Hall/CRC Press).
Barletta, K.E., Ley, K., and Mehrad, B. (2012). Regulation of neutrophil function by adenosine. Arterioscler. Thromb. Vasc. Biol. 32, 856–864.
Bates, D., Machler, M., Bolker, B.M., and Walker, S.C. (2015). Fitting Linear Mixed-Effects Models Using lme4. J. Stat. Softw. 67, 1–48.
Benci, J.L., Xu, B., Qiu, Y., Wu, T.J., Dada, H., Twyman-Saint Victor, C., Cucolo, L., Lee, D.S.M., Pauken, K.E., Huang, A.C., et al. (2016). Tumor Interferon Signaling Regulates a Multigenic Resistance Program to Immune Checkpoint Blockade. Cell 167, 1540–1554.e12.
Bouralexis, S., Findlay, D.M., and Evdokiou, A. (2005). Death to the bad guys: targeting cancer via Apo2L/TRAIL. Apoptosis 10, 35–51.
Bournazos, S., Wang, T.T., and Ravetch, J.V. (2016). The Role and Function of Fcγ Receptors on Myeloid Cells. Microbiol. Spectr. 4 https://doi.org/10.1128/microbiolspec.MCHD-0045-2016.
Bowman, R.L., Klemm, F., Akkari, L., Pyonteck, S.M., Sevenich, L., Quail, D.F., Dhara, S., Simpson, K., Gardner, E.E., Iacobuzio-Donahue, C.A., et al. (2016). Macrophage Ontogeny Underlies Differences in Tumor-Specific Education in Brain Malignancies. Cell Rep. 17, 2445–2459.
Cagney, D.N., Martin, A.M., Catalano, P.J., Redig, A.J., Lin, N.U., Lee, E.Q., Wen, P.Y., Dunn, I.F., Bi, W.L., Weiss, S.E., et al. (2017). Incidence and prognosis of patients with brain metastases at diagnosis of systemic malignancy: a population-based study. Neuro-oncol. 19, 1511–1521.
Ceccarelli, M., Barthel, F.P., Malta, T.M., Sabedot, T.S., Salama, S.R., Murray, B.A., Morozova, O., Newton, Y., Radenbaugh, A., Pagnotta, S.M., et al.; TCGA Research Network (2016). Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma. Cell 164, 550–563.
Chen, Z., Feng, X., Herting, C.J., Garcia, V.A., Nie, K., Pong, W.W., Rasmussen, R., Dwivedi, B., Seby, S., Wolf, S.A., et al. (2017). Cellular and Molecular Identity of Tumor-Associated Macrophages in Glioblastoma. Cancer Res. 77, 2266–2278.
Chen, H.M., van der Touw, W., Wang, Y.S., Kang, K., Mai, S., Zhang, J., Alsina-Beauchamp, D., Duty, J.A., Mungamuri, S.K., Zhang, B., et al. (2018). Blocking immunoinhibitory receptor LILRB2 reprograms tumor-associated myeloid cells and promotes antitumor immunity. J. Clin. Invest. 128, 5647–5662.
Chongsathidkiet, P., Jackson, C., Koyama, S., Loebel, F., Cui, X., Farber, S.H., Woroniecka, K., Elsamadicy, A.A., Dechant, C.A., Kemeny, H.R., et al. (2018). Sequestration of T cells in bone marrow in the setting of glioblastoma and other intracranial tumors. Nat. Med. 24, 1459–1468.
Coffelt, S.B., Wellenstein, M.D., and de Visser, K.E. (2016). Neutrophils in cancer: neutral no more. Nat. Rev. Cancer 16, 431–446.
Colaprico, A., Silva, T.C., Olsen, C., Garofano, L., Cava, C., Garolini, D., Sabedot, T.S., Malta, T.M., Pagnotta, S.M., Castiglioni, I., et al. (2016). TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 44, e71.
Cui, Z., Liao, J., Cheong, N., Longoria, C., Cao, G., DeLisser, H.M., and Savani, R.C. (2019). The Receptor for Hyaluronan-Mediated Motility (CD168) promotes inflammation and fibrosis after acute lung injury. Matrix Biol. 78-79, 255–271.
Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21.
Filková, M., Haluzík, M., Gay, S., and Senolt, L. (2009). The role of resistin as a regulator of inflammation: Implications for various human pathologies. Clin. Immunol. 133, 157–170.
Friedman, J., Hastie, T., and Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. J. Stat. Softw. 33, 1–22.
Gabrusiewicz, K., Rodriguez, B., Wei, J., Hashimoto, Y., Healy, L.M., Maiti, S.N., Thomas, G., Zhou, S., Wang, Q., Elakkad, A., et al. (2016). Glioblastoma-infiltrated innate immune cells resemble M0 macrophage phenotype. JCI Insight 1, e85841.
Gaujoux, R., and Seoighe, C. (2010). A flexible R package for nonnegative matrix factorization. BMC Bioinformatics 11, 367.
Georgoudaki, A.M., Prokopec, K.E., Boura, V.F., Hellqvist, E., Sohn, S., Östling, J., Dahan, R., Harris, R.A., Rantalainen, M., Klevebring, D., et al. (2016). Reprogramming Tumor-Associated Macrophages by Antibody Targeting Inhibits Cancer Progression and Metastasis. Cell Rep. 15, 2000–2011.
Glodde, N., Bald, T., van den Boorn-Konijnenberg, D., Nakamura, K., O’Donnell, J.S., Szczepanski, S., Brandes, M., Eickhoff, S., Das, I., Shridhar, N., et al. (2017). Reactive Neutrophil Responses Dependent on the Receptor Tyrosine Kinase c-MET Limit Cancer Immunotherapy. Immunity 47, 789–802.e9.
Godec, J., Tan, Y., Liberzon, A., Tamayo, P., Bhattacharya, S., Butte, A.J., Mesirov, J.P., and Haining, W.N. (2016). Compendium of Immune Signatures Identifies Conserved and Species-Specific Biology in Response to Inflammation. Immunity 44, 194–206.
GTEx Consortium (2013). The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585.
Gutmann, D.H., and Kettenmann, H. (2019). Microglia/Brain Macrophages as Central Drivers of Brain Tumor Pathobiology. Neuron 104, 442–449.
Hänzelmann, S., Castelo, R., and Guinney, J. (2013). GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics 14, 7.
Hendriks, L.E.L., Henon, C., Auclin, E., Mezquita, L., Ferrara, R., Audigier-Valette, C., Mazieres, J., Lefebvre, C., Rabeau, A., Le Moulec, S., et al. (2019). Outcome of Patients with Non-Small Cell Lung Cancer and Brain Metastases Treated with Checkpoint Inhibitors. J. Thorac. Oncol. 14, 1244–1254.
Huang, Q.Q., Hossain, M.M., Wu, K., Parai, K., Pope, R.M., and Jin, J.P. (2008). Role of H2-calponin in regulating macrophage motility and phagocytosis. J. Biol. Chem. 283, 25887–25899.
Hugo, W., Zaretsky, J.M., Sun, L., Song, C., Moreno, B.H., Hu-Lieskovan, S., Berent-Maoz, B., Pang, J., Chmielowski, B., Cherry, G., et al. (2016). Genomic and Transcriptomic Features of Response to Anti-PD-1 Therapy in Metastatic Melanoma. Cell 165, 35–44.
Isobe, M., Izawa, K., Sugiuchi, M., Sakanishi, T., Kaitani, A., Takamori, A., Maehara, A., Matsukawa, T., Takahashi, M., Yamanishi, Y., et al. (2018). The CD300e molecule in mice is an immune-activating receptor. J. Biol. Chem. 293, 3793–3805.
Jackson, C.M., Choi, J., and Lim, M. (2019). Mechanisms of immunotherapy resistance: lessons from glioblastoma. Nat. Immunol. 20, 1100–1109.
Kai, F., Drain, A.P., and Weaver, V.M. (2019). The Extracellular Matrix Modulates the Metastatic Journey. Dev. Cell 49, 332–346.
Klemm, F., and Joyce, J.A. (2015). Microenvironmental regulation of therapeutic response in cancer. Trends Cell Biol. 25, 198–213.
Kobayashi, M., Chung, J.S., Beg, M., Arriaga, Y., Verma, U., Courtney, K., Mansour, J., Haley, B., Khan, S., Horiuchi, Y., et al. (2019). Blocking Monocytic Myeloid-Derived Suppressor Cell Function via Anti-DC-HIL/GPNMB Antibody Restores the In Vitro Integrity of T Cells from Cancer Patients. Clin. Cancer Res. 25, 828–838.
Kowal, J., Kornete, M., and Joyce, J.A. (2019). Re-education of macrophages as a therapeutic strategy in cancer. Immunotherapy 11, 677–689.
Langfelder, P., and Horvath, S. (2008). WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559.
Lécuyer, M.A., Saint-Laurent, O., Bourbonnière, L., Larouche, S., Larochelle, C., Michel, L., Charabati, M., Abadier, M., Zandee, S., Haghayegh Jahromi, N., et al. (2017). Dual role of ALCAM in neuroinflammation and blood-brain barrier homeostasis. Proc. Natl. Acad. Sci. USA 114, E524–E533.
Li, L., Yan, J., Xu, J., Liu, C.Q., Zhen, Z.J., Chen, H.W., Ji, Y., Wu, Z.P., Hu, J.Y., Zheng, L., and Lau, W.Y. (2014). CXCL17 expression predicts poor prognosis and correlates with adverse immune infiltration in hepatocellular carcinoma. PLoS ONE 9, e110064.
Li, Y., Xie, Y., Hao, J., Liu, J., Ning, Y., Tang, Q., Ma, M., Zhou, H., Guan, S., Zhou, Q., and Lv, X. (2018). ER-localized protein-Herpud1 is a new mediator of IL-4-induced macrophage polarization and migration. Exp. Cell Res. 368, 167–173.
Liberzon, A., Birger, C., Thorvaldsdóttir, H., Ghandi, M., Mesirov, J.P., and Tamayo, P. (2015). The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425.
Lim, M., Xia, Y., Bettegowda, C., and Weller, M. (2018). Current state of immunotherapy for glioblastoma. Nat. Rev. Clin. Oncol. 15, 422–442.
Liu, Q., Johnson, E.M., Lam, R.K., Wang, Q., Ye, B.H., Wilson, E.N., Minhas, P.S., Liu, L., Swarovski, M.S., Tran, S., et al. (2019). Peripheral TREM1 responses to brain and intestinal immunogens amplify stroke severity. Nat. Immunol. 20, 1023–1034.
Löffler-Wirth, H., Kalcher, M., and Binder, H. (2015). oposSOM: R-package for high-dimensional portraying of genome-wide expression landscapes on bioconductor. Bioinformatics 31, 3225–3227.
Long, G.V., Atkinson, V., Lo, S., Sandhu, S., Guminski, A.D., Brown, M.P., Wilmott, J.S., Edwards, J., Gonzalez, M., Scolyer, R.A., et al. (2018). Combination nivolumab and ipilimumab or nivolumab alone in melanoma brain metastases: a multicentre randomised phase 2 study. Lancet Oncol. 19, 672–681.
Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550.
Mantovani, A., Marchesi, F., Malesci, A., Laghi, L., and Allavena, P. (2017). Tumour-associated macrophages as treatment targets in oncology. Nat. Rev. Clin. Oncol. 14, 399–416.
Meng, X., Duan, C., Pang, H., Chen, Q., Han, B., Zha, C., Dinislam, M., Wu, P., Li, Z., Zhao, S., et al. (2019). DNA damage repair alterations modulate M2 polarization of microglia to remodel the tumor microenvironment via the p53-mediated MDK expression in glioma. EBioMedicine 41, 185–199.
Mohan, V., Das, A., and Sagi, I. (2020). Emerging roles of ECM remodeling processes in cancer. Semin. Cancer Biol. 62, 192–200.
Morianos, I., Papadopoulou, G., Semitekolou, M., and Xanthou, G. (2019). Activin-A in the regulation of immunity in health and disease. J. Autoimmun. 104, 102314.
Müller, S., Kohanbash, G., Liu, S.J., Alvarado, B., Carrera, D., Bhaduri, A., Watchmaker, P.B., Yagnik, G., Di Lullo, E., Malatesta, M., et al. (2017). Single-cell profiling of human gliomas reveals macrophage ontogeny as a basis for regional differences in macrophage activation in the tumor microenvironment. Genome Biol. 18, 234.
Murray, P.J., Allen, J.E., Biswas, S.K., Fisher, E.A., Gilroy, D.W., Goerdt, S., Gordon, S., Hamilton, J.A., Ivashkiv, L.B., Lawrence, T., et al. (2014). Macrophage activation and polarization: nomenclature and experimental guidelines. Immunity 41, 14–20.
Mushtaq, M.U., Papadas, A., Pagenkopf, A., Flietner, E., Morrow, Z., Chaudhary, S.G., and Asimakopoulos, F. (2018). Tumor matrix remodeling and novel immunotherapies: the promise of matrix-derived immune biomarkers. J. Immunother. Cancer 6, 65.
Noy, R., and Pollard, J.W. (2014). Tumor-associated macrophages: from mechanisms to therapy. Immunity 41, 49–61.
Olson, O.C., and Joyce, J.A. (2015). Cysteine cathepsin proteases: regulators of cancer progression and therapeutic response. Nat. Rev. Cancer 15, 712–729.
Ortolan, E., Augeri, S., Fissolo, G., Musso, I., and Funaro, A. (2019). CD157: From immunoregulatory protein to potential therapeutic target. Immunol. Lett. 205, 59–64.
Puchalski, R.B., Shah, N., Miller, J., Dalley, R., Nomura, S.R., Yoon, J.G., Smith, K.A., Lankerovich, M., Bertagnolli, D., Bickley, K., et al. (2018). An anatomic transcriptional atlas of human glioblastoma. Science 360, 660–663.
Pyonteck, S.M., Akkari, L., Schuhmacher, A.J., Bowman, R.L., Sevenich, L., Quail, D.F., Olson, O.C., Quick, M.L., Huse, J.T., Teijeiro, V., et al. (2013). CSF-1R inhibition alters macrophage polarization and blocks glioma progression. Nat. Med. 19, 1264–1272.
Qiao, S., Qian, Y., Xu, G., Luo, Q., and Zhang, Z. (2019). Long-term characterization of activated microglia/macrophages facilitating the development of experimental brain metastasis through intravital microscopic imaging. J. Neuroinflammation 16, 4.
Quail, D.F., and Joyce, J.A. (2017). The Microenvironmental Landscape of Brain Tumors. Cancer Cell 31, 326–341.
Quail, D.F., Bowman, R.L., Akkari, L., Quick, M.L., Schuhmacher, A.J., Huse, J.T., Holland, E.C., Sutton, J.C., and Joyce, J.A. (2016). The tumor microenvironment underlies acquired resistance to CSF-1R inhibition in gliomas. Science 352, aad3018.
R Core Team (2018). R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing).
Racle, J., de Jonge, K., Baumgaertner, P., Speiser, D.E., and Gfeller, D. (2017). Simultaneous enumeration of cancer and immune cell types from bulk tumor gene expression data. eLife 6, e26476.
Rau, A., Gallopin, M., Celeux, G., and Jaffrézic, F. (2013). Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29, 2146–2152.
Richards, J., Gabunia, K., Kelemen, S.E., Kako, F., Choi, E.T., and Autieri, M.V. (2015). Interleukin-19 increases angiogenesis in ischemic hind limbs by direct effects on both endothelial cells and macrophage polarization. J. Mol. Cell. Cardiol. 79, 21–31.
Ripoll, V.M., Irvine, K.M., Ravasi, T., Sweet, M.J., and Hume, D.A. (2007). Gpnmb is induced in macrophages by IFN-gamma and lipopolysaccharide and acts as a feedback regulator of proinflammatory responses. J. Immunol. 178, 6557–6566.
Robinson, J.T., Thorvaldsdóttir, H., Wenger, A.M., Zehir, A., and Mesirov, J.P. (2017). Variant Review with the Integrative Genomics Viewer. Cancer Res. 77, e31–e34.
Rodón, L., Gonzàlez-Juncà, A., Inda, Mdel.M., Sala-Hojman, A., Martínez-Sáez, E., and Seoane, J. (2014). Active CREB1 promotes a malignant TGFβ2 autocrine loop in glioblastoma. Cancer Discov. 4, 1230–1241.
Sankowski, R., Böttcher, C., Masuda, T., Geirsdottir, L., Sagar, Sindram, E., Seredenina, T., Muhs, A., Scheiwe, C., Shah, M.J., et al. (2019). Mapping microglia states in the human brain through the integration of high-dimensional techniques. Nat. Neurosci. 22, 2098–2110.
Santana Carrero, R.M., Beceren-Braun, F., Rivas, S.C., Hegde, S.M., Gangadharan, A., Plote, D., Pham, G., Anthony, S.M., and Schluns, K.S. (2019). IL-15 is a component of the inflammatory milieu in the tumor microenvironment promoting antitumor responses. Proc. Natl. Acad. Sci. USA 116, 599–608.
Schalper, K.A., Rodriguez-Ruiz, M.E., Diez-Valle, R., López-Janeiro, A., Porciuncula, A., Idoate, M.A., Inogés, S., de Andrea, C., López-Diaz de Cerio, A., Tejada, S., et al. (2019). Neoadjuvant nivolumab modifies the tumor immune microenvironment in resectable glioblastoma. Nat. Med. 25, 470–476.
Stupp, R., Mason, W.P., van den Bent, M.J., Weller, M., Fisher, B., Taphoorn, M.J., Belanger, K., Brandes, A.A., Marosi, C., Bogdahn, U., et al.; European Organisation for Research and Treatment of Cancer Brain Tumor and Radiotherapy Groups; National Cancer Institute of Canada Clinical Trials Group (2005). Radiotherapy plus concomitant and adjuvant temozolomide for glioblastoma. N. Engl. J. Med. 352, 987–996.
Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S., and Mesirov, J.P. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. USA 102, 15545–15550.
Szklarczyk, D., Morris, J.H., Cook, H., Kuhn, M., Wyder, S., Simonovic, M., Santos, A., Doncheva, N.T., Roth, A., Bork, P., et al. (2017). The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res. 45 (D1), D362–D368.
Szulzewsky, F., Arora, S., de Witte, L., Ulas, T., Markovic, D., Schultze, J.L., Holland, E.C., Synowitz, M., Wolf, S.A., and Kettenmann, H. (2016). Human glioblastoma-associated microglia/monocytes express a distinct RNA profile compared to human control and murine samples. Glia 64, 1416–1436.
Tawbi, H.A., Forsyth, P.A., Algazi, A., Hamid, O., Hodi, F.S., Moschos, S.J., Khushalani, N.I., Lewis, K., Lao, C.D., Postow, M.A., et al. (2018). Combined Nivolumab and Ipilimumab in Melanoma Metastatic to the Brain. N. Engl. J. Med. 379, 722–730.
Thapa, B., and Lee, K. (2019). Metabolic influence on macrophage polarization and pathogenesis. BMB Rep. 52, 360–372.
Tsukamoto, H., Fujieda, K., Miyashita, A., Fukushima, S., Ikeda, T., Kubo, Y., Senju, S., Ihn, H., Nishimura, Y., and Oshiumi, H. (2018). Combined Blockade of IL6 and PD-1/PD-L1 Signaling Abrogates Mutual Regulation of Their Immunosuppressive Effects in the Tumor Microenvironment. Cancer Res. 78, 5011–5022.
Van, P., Jiang, W., Gottardo, R., and Finak, G. (2018). ggCyto: next generation open-source visualization software for cytometry. Bioinformatics 34, 3951–3953.
van der Touw, W., Chen, H.M., Pan, P.Y., and Chen, S.H. (2017). LILRB receptor-mediated regulation of myeloid cell maturation and function. Cancer Immunol. Immunother. 66, 1079–1087.
Vareslija, D., Priedigkeit, N., Fagan, A., Purcell, S., Cosgrove, N., O’Halloran, P.J., Ward, E., Cocchiglia, S., Hartmaier, R., Castro, C.A., et al. (2019). Transcriptome Characterization of Matched Primary Breast and Brain Metastatic Tumors to Detect Novel Actionable Targets. J. Natl. Cancer Inst. 111, 388–398.
Venteicher, A.S., Tirosh, I., Hebert, C., Yizhak, K., Neftel, C., Filbin, M.G., Hovestadt, V., Escalante, L.E., Shaw, M.L., Rodman, C., et al. (2017). Decoupling genetics, lineages, and microenvironment in IDH-mutant gliomas by single-cell RNA-seq. Science 355, eaai8478.
Vivian, J., Rao, A.A., Nothaft, F.A., Ketchum, C., Armstrong, J., Novak, A., Pfeil, J., Narkizian, J., Deran, A.D., Musselman-Brown, A., et al. (2017). Toil enables reproducible, open source, big biomedical data analyses. Nat. Biotechnol. 35, 314–316.
Wei, S.C., Duffy, C.R., and Allison, J.P. (2018). Fundamental Mechanisms of Immune Checkpoint Blockade Therapy. Cancer Discov. 8, 1069–1086.
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis (Springer).
Xu, P., Zhang, X., Liu, Q., Xie, Y., Shi, X., Chen, J., Li, Y., Guo, H., Sun, R., Hong, Y., et al. (2019). Microglial TREM-1 receptor mediates neuroinflammatory injury via interaction with SYK in experimental ischemic stroke. Cell Death Dis. 10, 555.
Xue, J., Schmidt, S.V., Sander, J., Draffehn, A., Krebs, W., Quester, I., De Nardo, D., Gohel, T.D., Emde, M., Schmidleithner, L., et al. (2014). Transcriptome-based network analysis reveals a spectrum model of human macrophage activation. Immunity 40, 274–288.
Yan, D., Kowal, J., Akkari, L., Schuhmacher, A.J., Huse, J.T., West, B.L., and Joyce, J.A. (2017). Inhibition of colony stimulating factor-1 receptor abrogates microenvironment-mediated therapeutic resistance in gliomas. Oncogene 36, 6049–6058.
Yang, T.H., St John, L.S., Garber, H.R., Kerros, C., Ruisaard, K.E., Clise-Dwyer, K., Alatrash, G., Ma, Q., and Molldrem, J.J. (2018). Membrane-Associated Proteinase 3 on Granulocytes and Acute Myeloid Leukemia Inhibits T Cell Proliferation. J. Immunol. 201, 1389–1399.
Young, M.D., Wakefield, M.J., Smyth, G.K., and Oshlack, A. (2010). Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biol. 11, R14.
Zaiss, D.M.W., Gause, W.C., Osborne, L.C., and Artis, D. (2015). Emerging functions of amphiregulin in orchestrating immunity, inflammation, and tissue repair. Immunity 42, 216–226.
Zhai, L., Ladomersky, E., Lenzen, A., Nguyen, B., Patel, R., Lauing, K.L., Wu, M., and Wainwright, D.A. (2018). IDO1 in cancer: a Gemini of immune checkpoints. Cell. Mol. Immunol. 15, 447–457.
Zhao, J., Sun, L., and Li, X. (2017). Commanding CNS Invasion: GM-CSF. Immunity 46, 165–167.
Zhao, Y., Lee, C.K., Lin, C.H., Gassen, R.B., Xu, X., Huang, Z., Xiao, C., Bonorino, C., Lu, L.F., Bui, J.D., et al. (2019). PD-L1:CD80 Cis-Heterodimer Triggers the Co-stimulatory Receptor CD28 While Repressing the Inhibitory PD-1 and CTLA-4 Pathways. Immunity 51, 1059–1073.e9.
STAR+METHODS
KEY RESOURCES TABLE

Antibodies
FCM: AF700 mouse monoclonal anti-human CD45 (clone HI30) | BioLegend | Cat#304024; RRID:AB_493761
FCM: BV421 rat monoclonal anti-mouse/human CD11B (clone M1/70) | BioLegend | Cat#101251; RRID:AB_2562904
FCM: PE mouse monoclonal anti-human CD66B (clone G10F5) | BioLegend | Cat#305106; RRID:AB_2077857
FCM: AF488 mouse monoclonal anti-human CD14 (clone HCD14) | BioLegend | Cat#325610; RRID:AB_830683
FCM: BUV737 mouse monoclonal anti-human CD16 (clone 3G8) | BD | Cat#612786; RRID:AB_2833077
FCM: APC mouse monoclonal anti-human CD49D (clone 9F10) | BioLegend | Cat#304308; RRID:AB_2130041
FCM: BV605 mouse monoclonal anti-human CD11C (clone 3.9) | BioLegend | Cat#301636; RRID:AB_2563796
FCM: BV711 mouse monoclonal anti-human anti HLA-DR (clone L243) | BioLegend | Cat#307644; RRID:AB_2562913
FCM: PerCP/Cy5.5 mouse monoclonal anti-human CD3 (clone HIT3a) | BioLegend | Cat#300328; RRID:AB_1575008
FCM: BV 650 mouse monoclonal anti-human anti CD4 (clone OKT4) | BioLegend | Cat#317436; RRID:AB_2563050
FCM: PE mouse monoclonal anti-human CD25 (clone BC96) | BioLegend | Cat#302606; RRID:AB_314276
FCM: BV510 mouse monoclonal anti-human CD127 (clone A019D5) | BioLegend | Cat#351332; RRID:AB_2562304
FCM: PE/Cy7 mouse monoclonal anti-human CD8A (clone HIT8a) | BioLegend | Cat#300914; RRID:AB_314118
FCM: BUV563 mouse monoclonal anti-human CD20 (clone 2H7) | BD | Cat#748456
FCM: BUV563 mouse monoclonal anti-human CD19 (clone SJ25C1) | BD | Cat#612916
FCM: PE/Dazzle mouse monoclonal anti-human CD56 (clone HDC56) | BioLegend | Cat#318348; RRID:AB_2563564
FCM: PE mouse monoclonal anti-human P2RY12 (clone S16001E) | BioLegend | Cat#392103; RRID:AB_2716006
FCM: PE/Cy7 Mouse monoclonal anti-human CD68 (clone Y1/82A) | BioLegend | Cat#333816; RRID:AB_2562936
IF: Mouse monoclonal anti-human CD68 (clone KP1), 1:100 dilution | Abcam | Cat#ab955; RRID:AB_307338
IF: Rat monoclonal anti-human CD49D (clone PS/2), 1:100 dilution | Abcam | Cat#ab25247
IF: Rabbit polyclonal anti-human P2RY12, 1:600 dilution | Sigma-Aldrich | Cat#HPA014518; RRID:AB_2669027
IF: Goat polyclonal anti-human CD45, 1:100 dilution | LSBio | Cat#LS-B14248-300
IF: AF488 mouse monoclonal anti-human CD45 (clone HI30), 1:100 dilution | BioLegend | Cat#304019; RRID:AB_493033
IF: AF488 mouse monoclonal anti-human CD3 (clone UCHT1), 1:100 dilution | BioLegend | Cat#300406; RRID:AB_314060
IF: Sheep polyclonal anti-human CD31, 1:200 dilution | R&D | Cat#AF806; RRID:AB_355617
IF: APC rat monoclonal anti Ki-67 (clone SolA15), 1:100 Thermo Fisher Scientific Cat#17-5698-82
dilution
IF: AF555 donkey anti-rabbit IgG, 1:1000 dilution | Thermo Fisher Scientific | Cat#A31572; RRID:AB_162543
IF: AF555 donkey anti-mouse IgG, 1:500 dilution | Thermo Fisher Scientific | Cat#A32773; RRID:AB_2762848
IF: AF488 donkey anti-rat IgG, 1:500 dilution | Thermo Fisher Scientific | Cat#A21208; RRID:AB_141709
IF: AF647 donkey anti-rat IgG, 1:500 dilution | Abcam | Cat#ab150155; RRID:AB_2813835
IF: DyLight755 donkey anti-goat IgG, 1:500 dilution | Thermo Fisher Scientific | Cat#SA5-10091; RRID:AB_2556671
IF: AF555 donkey anti-sheep IgG, 1:500 dilution | Thermo Fisher Scientific | Cat#A21436; RRID:AB_2535857
Biological Samples
Non-tumor, glioma and brain metastasis tissue | Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland | N/A
Non-tumor, glioma and brain metastasis tissue | Memorial Sloan Kettering Cancer Center, New York, NY, USA | N/A
Healthy donor blood | Transfusion Interrégionale Croix-Rouge Suisse, Epalinges, Switzerland | N/A
Healthy donor blood | New York Blood Bank, New York, NY, USA | N/A
DMEM-F12 (1:1), GlutaMAX | GIBCO | Cat#31331028
DMEM, high glucose, GlutaMAX, pyruvate | GIBCO | Cat#31966021
Penicillin/Streptomycin | GIBCO | Cat#15140122
Human recombinant CSF-1 | R&D Systems | Cat#216-MC-025
Ficoll-Paque Premium | GE | Cat#17-5442-02
Trizol | Thermo Fisher Scientific | Cat#15596018
Trizol LS | Thermo Fisher Scientific | Cat#10296028
Tween 20 | Applied Chemicals | Cat#A4974
Triton X-100 | Applied Chemicals | Cat#A4975
TNB Blocking Reagent | Perkin Elmer | Cat#FP1020
Fluorescence Mounting Medium | Dako | Cat#S302380
Brain Tumor Dissociation Kit (P) | Miltenyi | Cat#130-095-942
Tumor Dissociation Kit, human | Miltenyi | Cat#130-095-929
Myelin Removal Beads | Miltenyi | Cat#130-096-733
CD14 MicroBeads, human | Miltenyi | Cat#130-050-201
Human TruStain FcX | BioLegend | Cat#422302
ZombieNIR Fixable Viability Kit | BioLegend | Cat#423106
High Capacity cDNA Reverse Transcription Kit | Applied Biosystems | Cat#4368814
TaqMan Universal PCR Master Mix | Applied Biosystems | Cat#4304437
Quantibody Array Q4000 ELISA | Raybiotech | Cat#QAH-CAA-4000-1
Deposited Data
RNAseq count data | This paper | https://joycelab.shinyapps.io/braintime/
Human reference genome, hg38 | Genomic Data Commons | https://gdc.cancer.gov/about-data/data-harmonization-and-generation/gdc-reference-files
TCGA LGG and GBM datasets | Genomic Data Commons | https://portal.gdc.cancer.gov/
TOIL TCGA TARGET GTEx datasets | Vivian et al., 2017 | https://xenabrowser.net/datapages/
Ivy Glioblastoma Atlas Project RNA sequencing data | Puchalski et al., 2018 | https://glioblastoma.alleninstitute.org/static/download.html
STRING Protein-Protein-Interaction database, version 10.5 | Szklarczyk et al., 2017 | https://version-10-5.string-db.org/cgi/download.pl
Molecular Signatures Database gene set collection | Liberzon et al., 2015; Subramanian et al., 2005 | https://www.gsea-msigdb.org/gsea/msigdb/
RNA sequencing count matrix from matched breast cancer primaries and brain metastases | Varešlija et al., 2019 | https://github.com/npriedig
Oligonucleotides
See Table S7 N/A
FlowJo, version 10.4 | BD | https://www.flowjo.com/
BBDuk, version 38.12 | Joint Genome Institute | https://jgi.doe.gov/data-and-tools/bbtools/
STAR aligner, version 2.5.2b | Dobin et al., 2013 | https://github.com/alexdobin/STAR
R environment, version 3.5.0 | R Core Team, 2018 | https://www.r-project.org/
VIS Image Analysis, version 2019.7 | Visiopharm | https://www.visiopharm.com/
Other
gentleMACS Octo Dissociator | Miltenyi | Cat#130-095-937
gentleMACS C Tubes | Miltenyi | Cat#130-096-334
LS Columns | Miltenyi | Cat#130-042-401
SepMate-50 | StemCell | Cat#85450
PermaLife Cell Culture Bags | OriGen Biomedical | Cat#PL30-2G
LSR II flow cytometer | BD | N/A
Fortessa flow cytometer | BD | N/A
FACSAria III, flow cytometer & cell sorter | BD | N/A
Axio Scan.Z1 slide scanner | Zeiss | N/A
QuantStudio 6 Flex | Applied Biosystems | N/A
Omni Tissue Homogenizer (TH) | Omni International | Cat#TH220
Lead Contact
Further information and requests for resources should be directed to the Lead Contact, Johanna Joyce (johanna.joyce@unil.ch).

RNA-seq count expression data generated during this study can be visualized and downloaded at https://joycelab.shinyapps.io/braintime/.
Due to patient privacy protection, the raw RNA-seq data will be made available upon request.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional
and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical
standards.
Informed consent was obtained from all individual participants included in this study. The collection of non-tumor and tumor tissue
samples at the Centre Hospitalier Universitaire Vaudois (CHUV, Lausanne, Switzerland) was approved by the Commission cantonale
d’éthique de la recherche sur l’être humain (CER-VD, protocol PB 2017-00240, F25 / 99). Sample collection at Memorial Sloan Ket-
tering Cancer Center (MSKCC, New York, NY, USA) was approved by the institutional review board (IRB, protocols #IRB #06-107,
#14-230). Non-tumor samples of cerebral cortex tissues were collected at CHUV during medically indicated surgical treatment of
refractory epilepsy patients, and at MSKCC in normal brain distant from the tumor in patients with low-grade glioma or from post-
mortem samples collected through the rapid autopsy program with no history of brain malignancy.
Tissue specimens were immediately collected from the operating room and processed as described below. All patient-related data
and unique identifiers were removed so that human samples were anonymized before any further processing.
Pathological review and molecular analysis of tumor samples were performed as part of standard clinical care at the respective
locations (CHUV or MSKCC). In all glioma samples subjected to RNA sequencing, the IDH1 and IDH2 mutation status was verified
by inspection of the reads from the CD45- population aligning to the IDH1 and IDH2 loci with the Integrative Genomics Viewer (IGV;
Robinson et al., 2017). For immunofluorescence sections, the tumor diagnosis was confirmed independently; for all non-tumor samples,
the absence of malignancy was likewise confirmed by a pathologist.
Peripheral blood and buffy coats were obtained from the Transfusion Interrégionale, Croix-Rouge Suisse (Epalinges/Lausanne,
Switzerland), the New York Blood Center (New York, NY, USA), and healthy donors.
METHOD DETAILS
Clinical sample processing, flow cytometry (FCM) and fluorescence activated cell sorting (FACS)
Tissue specimens were washed in HBSS and macro-dissected under sterile conditions. Parts of the tissue were either immediately
frozen by submerging the sample in liquid nitrogen-cooled 2-methyl butane (Sigma-Aldrich) or OCT-embedded (Tissue-Tek) before
freezing for subsequent sectioning and immunofluorescence staining. OCT embedding was performed by placing the sample in a
freezing mold filled with OCT and then submerging the mold in 2-methyl butane cooled with dry ice.
The remaining tissue was further processed with either the Brain Tumor Dissociation Kit (Miltenyi) for non-tumor tissue and gliomas,
or the Tumor Dissociation Kit (Miltenyi) for BrMs, using the gentleMACS Octo Dissociator (Miltenyi). Myelin debris in cell suspensions
from non-tumor and glioma tissues was removed by incubating the cells with Myelin Removal Beads (Miltenyi) followed by
magnetic-activated cell sorting (MACS) using LS columns (Miltenyi) according to the manufacturer’s instructions. All tissue suspensions
were filtered through a 40 µm filter and underwent red blood cell lysis (BioLegend). Single cell suspensions were stained with a fixable
live-dead stain (Zombie NIR, BioLegend), Fc-blocked for 10 min (Human TruStain FcX, BioLegend) and then incubated with directly
fluorophore-conjugated antibodies for 20 min at 4 °C. All FCM antibodies were titrated in a lot-specific manner. Antibody details are
listed in the Key Resources Table. Cells were washed with PBS + 2% fetal bovine serum (FBS) + 0.5 mM EDTA and stored at 4 °C in the
dark until FAC-sorting.
All FCM acquisition was completed on either a BD Fortessa or a BD LSR II device (BD), and cell sorting was performed on a
FACSAria III (BD) using FACSDiva (BD). Cells were sorted directly into Trizol LS (Thermo Fisher Scientific) and immediately snap
frozen with liquid nitrogen. Analysis of FCM data was performed with FlowJo (BD).
Tumor microenvironment-conditioned medium (TME-CM) generation

Single cell suspensions from whole tumor samples were resuspended in DMEM-F12 (1:1) + GlutaMAX (GIBCO) + 10% FBS + 1% penicillin/streptomycin
(P/S, GIBCO) and adjusted to a concentration of 2 × 10^6 cells/ml, with 2 ml plated into each well of a 6-well plate
(TPP). The supernatant of these tissue cultures, containing cancer cells, immune cells etc. from the complex brain TME, was harvested
24 hours after initial seeding, spun down to remove debris (300 × g, 10 min) and stored at -80 °C until further use.
In vitro generation of monocyte-derived macrophages (MDM) and TME-CM stimulation

Peripheral blood mononuclear cells were isolated from buffy coats of healthy donors with a Ficoll (GE) gradient using SepMate tubes
(StemCell) and monocytes selected by MACsorting with CD14 MicroBeads (Miltenyi). Monocytes were differentiated into macro-
phages by culture in Teflon-coated bags (OriGen) for 7 days in DMEM +GlutaMAX (GIBCO) +10% FBS +1% P/S with the addition
of 10 ng/ml recombinant human CSF-1 (R&D Systems).
Differentiated MDMs were plated at a density of 1 × 10^6 cells/well of a 6-well plate in DMEM + 10% FBS + 1% P/S + 10 ng/ml CSF-1.
After cell attachment, MDMs were cultured in serum free medium for 6 hours before stimulation with TME-CM for 24 hours.
RNA isolation, cDNA synthesis and quantitative real-time PCR

TME-CM-stimulated MDMs were lysed with Trizol (Thermo Fisher Scientific), RNA was purified with Direct-zol columns (Zymo
Research), DNase treated and 1.0 µg of RNA was used for cDNA synthesis using the High Capacity cDNA Reverse Transcription
Kit (Applied Biosystems). An amount of cDNA equivalent to 5 ng total RNA was used for real-time PCR. For primer and probe details
see Table S6. Assays were run in triplicate on a QuantStudio 6 Flex instrument (Applied Biosystems) using the TaqMan Universal PCR
Master Mix (Applied Biosystems) and expression was normalized to the average expression of Ubiquitin C (UBC) and Ribosomal Pro-
tein L19 (RPL19) for each sample.
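
The normalization step can be illustrated with a minimal Python sketch. It assumes the common ΔCt approach (relative expression = 2^-ΔCt against the mean Ct of the two reference genes); the text only states that expression was normalized to the average expression of UBC and RPL19, so the formula, the function name and the Ct values below are assumptions for illustration.

```python
from statistics import mean


def relative_expression(ct_target, ct_references):
    """Return 2^-dCt, with dCt = mean target Ct - mean reference Ct.

    ct_target:     list of replicate Ct values for the gene of interest
    ct_references: {reference gene: list of replicate Ct values}
    """
    # Average the technical replicates of the target gene.
    target = mean(ct_target)
    # Average each reference gene's replicates, then average across genes.
    reference = mean(mean(cts) for cts in ct_references.values())
    delta_ct = target - reference
    return 2.0 ** (-delta_ct)


# Hypothetical triplicate Ct values for one sample.
ct_gene = [24.0, 24.1, 23.9]
ct_refs = {"UBC": [20.0, 20.0, 20.0], "RPL19": [22.0, 22.0, 22.0]}
rel = relative_expression(ct_gene, ct_refs)
```

Averaging the reference genes on the Ct scale corresponds to a geometric mean on the expression scale; an arithmetic mean of expression values would be a different, equally plausible reading of the text.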
Immunofluorescence staining and microscopy image acquisition

10 µm cryostat sections were thawed, air-dried and fixed with ice-cold 100% methanol for 5 minutes. After rehydration with PBS,
sections were washed twice in PBS + 0.2% Tween 20 (Applied Chemicals), permeabilized with PBS + 0.2% Triton X-100 (Applied
Chemicals) for 3 hours and washed again with PBS + 0.2% Tween 20. Blocking was performed with PBS + 0.5% Tween 20 + 1%
TNB Blocking Reagent (Perkin Elmer), followed by incubation with primary antibody in the same buffer overnight at 4 °C. Primary antibody
information and dilutions are listed in the Key Resources Table. Sections were washed with PBS + 0.2% Tween 20 before incubation
with fluorophore-conjugated secondary antibodies at a dilution of 1:500 in PBS + 0.5% Tween 20 + 1% TNB Blocking
Reagent + 1 µg/ml DAPI at room temperature. Directly conjugated primary antibodies were employed where indicated after an initial
round of primary and secondary antibody staining, to avoid potential cross-reactivity. Finally, sections were washed with
PBS + 0.2% Tween 20 and mounted with Fluorescence Mounting Medium (Dako).
Stained tissue sections were imaged with an Axio Scan.Z1 slide scanner (Zeiss) equipped with a Colibri 7 LED light source (Zeiss)
using a Plan-Apochromat 20x/0.8 DIC M27 coverslip-corrected objective (Zeiss). All slides from the same staining panel were digi-
talized using identical acquisition settings.
Image analysis and cell type identification

Image quantification was performed using the VIS Image Analysis software (Visiopharm). For each staining panel a specific application
was created using the software’s authoring module. The tissue outline was detected after applying a 21 pixel mean filter. The
edges of the derived regions of interest were smoothed with the built-in function ‘‘close’’ and holes in the mask were filled using
the ‘‘fill holes’’ command. Aberrant signals resulting from, e.g., dust particles, tissue folds or air bubbles were manually excluded from
these regions of interest. Nuclear classification was based on the watershed signal of the DAPI staining and filtered by area to exclude
incomplete nuclei. The obtained nuclear label was expanded by 5 pixels to capture both nuclear and adjacent cytoplasmic fluorescent
signal. Cell types were identified using a hierarchical decision tree with manually set thresholds. Finally, a representation of the
cytoplasm was created using the inbuilt growth algorithm with a maximum distance of 15 pixels from the nucleus. Vessel segmentation
was performed by creating a separate classifier based on pixel intensity of the CD31 signal. Nuclear classifiers were excluded a
priori and incorporated in the vessel label only when exceeding the threshold for CD31. Perivascular niches (PVNs) were established
by generating an ROI around vessels at a distance of 20 µm. All object-based phenotyping result tables were exported as csv files for
further analysis within the R environment.
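
The hierarchical decision tree can be sketched in Python as follows. The marker combinations follow the cell type definitions given in the legend of Figure S2C; the threshold values, the dictionary keys and the treatment of P2RY12/CD68 as a single combined signal are hypothetical choices made for this illustration and do not reproduce the Visiopharm application.

```python
# Hypothetical per-marker intensity cut-offs (arbitrary units).
THRESHOLDS = {"CD45": 100.0, "P2RY12_CD68": 80.0, "CD49D": 60.0}


def classify_cell(intensity, thresholds=THRESHOLDS):
    """Assign a cell type from mean marker intensities using fixed cut-offs."""

    def positive(marker):
        return intensity[marker] >= thresholds[marker]

    # Level 1: CD45 separates immune from non-immune cells.
    if not positive("CD45"):
        return "non-immune"
    # Level 2: the P2RY12/CD68 signal marks tumor-associated macrophages.
    if not positive("P2RY12_CD68"):
        return "non-TAM immune"
    # Level 3: CD49D separates MDMs (positive) from MG (negative).
    return "MDM" if positive("CD49D") else "MG"


example = classify_cell({"CD45": 150.0, "P2RY12_CD68": 90.0, "CD49D": 10.0})
```

Ordering the tests from the broadest marker (CD45) to the most specific one (CD49D) means that each cell receives exactly one label.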
Protein isolation and enzyme-linked immunosorbent assay (ELISA)

Frozen tissues were weighed and homogenized on ice with an Omni Tissue Homogenizer (Omni International) in 10 µl of RIPA lysis
buffer (Thermo Fisher Scientific) + cOmplete Protease Inhibitor (Roche Diagnostics) per mg of tissue. The homogenate was gently
agitated on ice for 10 minutes, centrifuged at 10,000 × g for 5 minutes at 4 °C and the supernatant collected. The protein concentration
was determined using a Bradford assay (Bio-Rad) and adjusted to 1 mg/ml. Samples were shipped to Raybiotech (Peachtree Corners)
for quantitative analysis with the multiplexed Quantibody Array Q4000 ELISA.
RNA sequencing (RNA-seq)

RNA was isolated by chloroform extraction and isopropanol precipitation. RNA sequencing libraries were generated with the
SMART-Seq preparation kit (Clontech) and fragmented with the Nextera XT kit (Illumina). Paired end, 100 or 150 base pair, and
single end, 100 base pair, sequencing was performed by Genewiz (South Plainfield, New Jersey, USA) on an Illumina HiSeq 2500
(Illumina).
Reads were adaptor trimmed and quality clipped using BBDuk (version 38.12; https://sourceforge.net/projects/bbmap/). Trimmed
reads were mapped to the Genomic Data Commons (GDC) GRCh38.d1.vd1 reference sequence using the STAR aligner (version
2.5.2b, Dobin et al., 2013) in two-pass mode with parameters corresponding to the GDC RNA-seq alignment workflow. Transcript
abundance was estimated using the corresponding GDC reference gtf file. A raw count matrix was produced and differential
gene expression was assessed with DESeq2 (Love et al., 2014), using as thresholds an absolute log2 fold change of 1 and a false
discovery rate of 0.01 when contrasting to reference samples, and 0.05 for within-tumor contrasts.
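
As an illustration of how such thresholds are applied to a table of differential expression results, the following Python sketch filters genes by absolute log2 fold change and adjusted p value. The record layout, the gene names and the numbers are hypothetical; the analysis described here was carried out with DESeq2 in R.

```python
def is_differentially_expressed(log2_fc, padj, lfc_cutoff=1.0, fdr_cutoff=0.01):
    """True if |log2 fold change| > lfc_cutoff and adjusted p < fdr_cutoff."""
    # Genes without an adjusted p value (e.g., filtered out upstream) are skipped.
    if padj is None:
        return False
    return abs(log2_fc) > lfc_cutoff and padj < fdr_cutoff


# Hypothetical result rows: (gene, log2 fold change, adjusted p value).
results = [
    ("GENE_A", 2.3, 0.0004),
    ("GENE_B", -1.8, 0.03),
    ("GENE_C", 0.4, 0.00001),
    ("GENE_D", -3.1, None),
]

# Contrast against reference samples (FDR 0.01).
deg_vs_reference = [g for g, lfc, p in results if is_differentially_expressed(lfc, p)]
# Within-tumor contrast (FDR 0.05).
deg_within_tumor = [
    g for g, lfc, p in results if is_differentially_expressed(lfc, p, fdr_cutoff=0.05)
]
```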
Bioinformatic analysis environment

All bioinformatic analyses were performed within the R environment (version 3.5.0, R Core Team 2018).
Gene set-centered analyses

The Molecular Signatures Database (MSigDB, version 6.1, Liberzon et al., 2015; Subramanian et al., 2005) was used as the main
source for gene set-based analyses.
Over-representation was assessed with the goseq R package (Young et al., 2010) for differentially expressed genes to correct for
gene length bias, otherwise the hypergeometric test was employed. For individual samples, gene set enrichment was estimated with
the Gene Set Variation Analysis R package (GSVA, Hänzelmann et al., 2013) using the ‘‘gsva’’ function. Gene set enrichment analysis
(GSEA) was evaluated with the R package fgsea (https://github.com/ctlab/fgsea) using the maximum likelihood log fold changes
determined by DESeq2 as the ranking metric.
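
The hypergeometric over-representation test mentioned above can be written out in a few lines of Python. This is a generic sketch of the upper-tail test, not the goseq procedure (which additionally corrects for gene length bias); the argument names and example numbers are chosen for illustration.

```python
from math import comb


def hypergeom_upper_tail(overlap, set_size, n_selected, n_universe):
    """P(X >= overlap) for X ~ Hypergeometric(n_universe, set_size, n_selected).

    overlap:    selected genes that belong to the gene set
    set_size:   genes in the gene set (within the universe)
    n_selected: selected genes, e.g., differentially expressed genes
    n_universe: all genes considered
    """
    total = comb(n_universe, n_selected)
    upper = min(set_size, n_selected)
    favorable = sum(
        comb(set_size, i) * comb(n_universe - set_size, n_selected - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / total


# Hypothetical example: 5 of 5 selected genes fall into a 5-gene set, 10 genes overall.
p_value = hypergeom_upper_tail(5, 5, 5, 10)
```

When many gene sets are tested, the resulting p values still require correction for multiple testing.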
Deconvolution of Toil-RNA sequencing data

Toil-processed (Vivian et al., 2017), DESeq2-standardized gene expression data and matching phenotype data from the TCGA and
Genotype-Tissue Expression Project (GTEx) databases were downloaded from the UCSC Xena platform and filtered to include only
low-grade glioma ‘‘TCGA-LGG’’ and high-grade ‘‘TCGA-GBM’’ and ‘‘frontal cortex’’ GTEx samples to integrate bulk glioma expression
data with unmatched non-tumor samples. MG- and MDM-specific marker genes were derived by identifying differentially expressed
genes in these two populations versus all other sorted populations in a pairwise fashion, determining the intersect and
ranking the resulting genes by their fold change versus the CD45- population. The 20 highest ranked genes were then used as
cell type-specific marker genes. Deconvolution of MG- and MDM-proportions in tumor and non-tumor sample expression data
was done with the EPIC R package (Racle et al., 2017) using these marker genes and providing the expression data from the sorted
populations as reference profiles. As the exact amount of RNA within the estimated cell types is not known, this parameter was set to
1 when running the deconvolution.
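
The marker gene selection described above (pairwise contrasts, intersection, ranking, top 20) can be sketched in Python. The input structures, the function name and the example gene sets are hypothetical; the deconvolution itself was performed with the EPIC R package and is not reproduced here.

```python
def select_marker_genes(pairwise_deg_sets, fc_vs_cd45_negative, n_markers=20):
    """Intersect per-contrast DEG sets and keep the top genes by fold change.

    pairwise_deg_sets:   one collection of DEGs per pairwise contrast
    fc_vs_cd45_negative: {gene: fold change versus the CD45- population}
    """
    if not pairwise_deg_sets:
        return []
    # A marker must be differentially expressed in every pairwise contrast.
    shared = set.intersection(*(set(s) for s in pairwise_deg_sets))
    # Rank by descending fold change; the gene name breaks ties reproducibly.
    ranked = sorted(shared, key=lambda g: (-fc_vs_cd45_negative[g], g))
    return ranked[:n_markers]


# Hypothetical DEG sets from three pairwise contrasts.
contrasts = [
    {"P2RY12", "TMEM119", "CX3CR1", "ACTB"},
    {"P2RY12", "TMEM119", "CX3CR1"},
    {"P2RY12", "TMEM119", "SALL1"},
]
fold_changes = {"P2RY12": 6.0, "TMEM119": 7.5, "CX3CR1": 3.0, "ACTB": 0.1, "SALL1": 4.0}
markers = select_marker_genes(contrasts, fold_changes)
```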
Leading edge metagene (LEM) analysis

To capture biologically meaningful patterns of gene expression within the differentially expressed genes the LEM approach (Godec
et al., 2016) was employed: (a) GSEA was performed using the MSigDB C7 collection as described above, (b) the leading edge genes
of the significant gene sets were arranged into a genes by gene sets matrix with the shrunken fold changes as the entries, (c) this
matrix was clustered using non-negative matrix factorization with the R package NMF (Gaujoux and Seoighe, 2010), (d) genes
with a small coefficient in each metagene were filtered based on the 95th quantile of a fitted exponential distribution of the coefficients
and (e) each gene with a coefficient above the threshold was assigned to the metagene where it had the highest coefficient.
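
Steps (d) and (e) can be illustrated with a short Python sketch. It assumes that the exponential distribution is fitted per metagene by maximum likelihood (rate = 1/mean), so that the quantile is -mean × ln(1 - q), and that a gene is assigned among the metagenes in which it passes the threshold; both points are interpretations of the description, and the coefficient values are invented.

```python
from math import log


def assign_genes_to_metagenes(coefficients, quantile=0.95):
    """coefficients: {metagene: {gene: non-negative NMF coefficient}}."""
    thresholds = {}
    for metagene, coefs in coefficients.items():
        mean_coef = sum(coefs.values()) / len(coefs)
        # Quantile of the fitted exponential distribution (MLE: rate = 1 / mean).
        thresholds[metagene] = -mean_coef * log(1.0 - quantile)

    genes = set().union(*(coefs.keys() for coefs in coefficients.values()))
    assignment = {}
    for gene in genes:
        # Metagenes in which the gene exceeds the metagene-specific threshold.
        passing = {
            m: coefs[gene]
            for m, coefs in coefficients.items()
            if gene in coefs and coefs[gene] > thresholds[m]
        }
        if passing:
            # Assign the gene to the metagene with its highest coefficient.
            assignment[gene] = max(passing, key=passing.get)
    return assignment


# Hypothetical NMF coefficients for four genes and two metagenes.
nmf_coefficients = {
    "M1": {"a": 10.0, "b": 0.1, "c": 0.1, "d": 0.1},
    "M2": {"a": 0.1, "b": 0.1, "c": 0.1, "d": 12.0},
}
lem_assignment = assign_genes_to_metagenes(nmf_coefficients)
```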
Protein-Protein-Interaction network building

Version 11.0 of the STRING database (Szklarczyk et al., 2017) was downloaded from the consortium’s website and gene identifiers
from RNA-seq were mapped to Ensembl Protein IDs using the provided accessory data. The resulting interaction data was filtered to
contain only interactions with a high confidence STRING combined score (i.e., > 700). For network layout calculation the combined
score was used as an edge weight.
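
A minimal Python sketch of the confidence filter is given below. The tuple layout and the protein identifiers are placeholders; only the cut-off (combined score > 700) and the use of the score as edge weight are taken from the text.

```python
def filter_high_confidence(edges, min_score=700):
    """Keep interactions whose STRING combined score exceeds min_score.

    edges: iterable of (protein_a, protein_b, combined_score) tuples.
    Returns a list of (protein_a, protein_b, weight) with weight = score.
    """
    return [(a, b, score) for a, b, score in edges if score > min_score]


# Placeholder identifiers and scores.
interactions = [
    ("ENSP_A", "ENSP_B", 950),
    ("ENSP_A", "ENSP_C", 700),
    ("ENSP_B", "ENSP_C", 412),
]
network_edges = filter_high_confidence(interactions)
```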
Nearest neighbor distance measurements and neighborhood analysis of IF data

Nearest neighbor distances from MG and MDM to vessels in IDH wt glioma samples were calculated using the spatstat R package
(Baddeley et al., 2015). Statistical significance was assessed by fitting a mixed effects model with the cell type as the fixed effect, and
the clinical sample ID as the random effect using the R package lme4 (Bates et al., 2015).
Neighbors for each individual cell were determined based on their occurrence within a range of 5 µm outside of the radius of the cell
(calculated based on the area). This was used to tabulate the number of neighbors and their cell type for each cell within the tissue
section.
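
The neighborhood tabulation can be sketched in Python as follows. The sketch assumes that each cell is approximated by a circle with the same area as its segmented object and that distances are measured between cell centroids; these details, like the field names and coordinates, are assumptions and are not stated in the text.

```python
from collections import Counter
from math import hypot, pi, sqrt


def neighbor_counts(cells, margin_um=5.0):
    """Count neighbors per cell type within (cell radius + margin) of each cell.

    cells: list of dicts with keys x, y (µm), area (µm²) and cell_type.
    Returns one Counter per cell, in input order.
    """
    result = []
    for i, cell in enumerate(cells):
        # Radius of a circle with the same area as the cell.
        radius = sqrt(cell["area"] / pi)
        counts = Counter()
        for j, other in enumerate(cells):
            if i == j:
                continue
            distance = hypot(other["x"] - cell["x"], other["y"] - cell["y"])
            if distance <= radius + margin_um:
                counts[other["cell_type"]] += 1
        result.append(counts)
    return result


# Hypothetical cells; an area of 25π µm² corresponds to a radius of 5 µm.
example_cells = [
    {"x": 0.0, "y": 0.0, "area": 25 * pi, "cell_type": "MG"},
    {"x": 8.0, "y": 0.0, "area": 25 * pi, "cell_type": "MDM"},
    {"x": 30.0, "y": 0.0, "area": 25 * pi, "cell_type": "MDM"},
]
neighbors = neighbor_counts(example_cells)
```

The double loop is quadratic in the number of cells; for whole-slide data a spatial index would be the usual choice.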
Cell type abundance estimation in spatial Ivy Glioblastoma Atlas Project (GAP) data
The micro-dissected Ivy GAP (Puchalski et al., 2018) RNA-seq RSEM count data and sample annotation containing anatomical loca-
tion were downloaded from the Ivy GAP website (https://glioblastoma.alleninstitute.org/static/download.html) and normalized using
DESeq2. The relative abundance of cell types was estimated by deriving marker genes through a multinomial logistic regression
model on the normalized expression data of the FAC-sorted cell types of interest in IDH WT tumors and then computing the
GSVA enrichment scores in the Ivy GAP samples.
Survival analysis of the IDH wt MDM-specific gene signature in gliomas

The harmonized TCGA low-grade and high-grade HTSeq hg38 count data and clinical data were accessed from the GDC repository
using the TCGAbiolinks R package (Colaprico et al., 2016). Datasets were pre-processed to remove outliers and normalized using the
functions provided by TCGAbiolinks before merging. Subsequent analyses were performed including only samples where an annotation
of the IDH mutation status was available. Cell type-specific gene signatures were derived by training a multinomial logistic
regression model with an elastic-net penalty to separate between MG and MDMs along IDH status with the ‘‘glmnet’’ R package
(Friedman et al., 2010). A mean-centered expression matrix of all MG and MDM expression data in gliomas and BrMs, subset by
genes that were upregulated in tumors versus non-tumor tissue or healthy controls, served as the input matrix. The strength of
the penalty was determined by a 10-fold cross-validation of the λ parameter. For survival analysis, GSVA enrichment scores of these
cell type-specific gene signatures were estimated and used to divide samples into tertiles. Kaplan-Meier survival curves were
computed using the ‘‘survfit’’ function. Survival curves were compared with a log-rank test between the individual levels and multivariate
Cox regression analysis was performed with the ‘‘coxph’’ function.
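
The tertile split of enrichment scores can be illustrated with the Python standard library. The sample names and scores are invented, and the default quantile method of `statistics.quantiles` is not necessarily the one used in the original R analysis, so cut points may differ slightly.

```python
from statistics import quantiles


def split_into_tertiles(scores):
    """Label each sample 'low', 'mid' or 'high' by tertile of its score.

    scores: {sample ID: enrichment score}
    """
    # Two cut points divide the samples into three groups of similar size.
    lower, upper = quantiles(list(scores.values()), n=3)
    labels = {}
    for sample, score in scores.items():
        if score <= lower:
            labels[sample] = "low"
        elif score <= upper:
            labels[sample] = "mid"
        else:
            labels[sample] = "high"
    return labels


# Hypothetical GSVA enrichment scores for nine samples.
gsva_scores = {f"s{i}": float(i) for i in range(1, 10)}
tertile_labels = split_into_tertiles(gsva_scores)
```

The resulting labels would then serve as the grouping variable for the Kaplan-Meier and Cox analyses.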
Self-organizing map (SOM) clustering

Variance stabilized counts from sorted populations of interest from IDH mut, IDH WT glioma and BrM samples were filtered with the R
package HTSFilter (Rau et al., 2013) to ensure removal of genes with a low, constant expression. The resulting matrix of genes and
samples was used as input for the SOM neural network building, which was performed with the oposSOM R package (Löffler-Wirth
et al., 2015) with a map space of 50 × 50. To investigate associations between the sample phenotype and the SOM metagenes, the
tumor type and cell type were provided as group labels.
Weighted gene correlation network analysis (WGCNA)

The WGCNA (Langfelder and Horvath, 2008) R package was used to identify co-regulated genes associated with a MG- or MDM-BrM
phenotype. A variance stabilized, batch-corrected count matrix of MG and MDM samples was filtered with the R package HTSFilter
(Rau et al., 2013) yielding input expression data with 15826 genes and 56 samples. WGCNA standard parameters were changed as
follows: the soft-thresholding power was raised to 7, the minModuleSize was increased to 50, ‘‘bicor’’ was used to calculate the cor-
relation, the network type was set to ‘‘signed hybrid’’ and a dendrogram cut height of 0.25 was used for module merging. This yielded
20 modules whose eigengene, i.e., the first principal component, was tested for correlation to the provided sample information, i.e.,
tumor- and cell-type and abundance as determined by FCM.
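
The effect of the ‘‘signed hybrid’’ network type and the soft-thresholding power can be illustrated with a small Python sketch. In a signed hybrid network the adjacency is the correlation raised to the chosen power for positive correlations and zero otherwise. The sketch uses the Pearson correlation for simplicity, whereas the analysis above used ‘‘bicor’’; the gene names and values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def signed_hybrid_adjacency(expression, power=7):
    """expression: {gene: values across samples} -> {(gene_a, gene_b): adjacency}."""
    genes = sorted(expression)
    adjacency = {}
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            r = pearson(expression[gene_a], expression[gene_b])
            # Signed hybrid: negative correlations contribute no connection.
            adjacency[(gene_a, gene_b)] = r ** power if r > 0 else 0.0
    return adjacency


# Hypothetical expression values across four samples.
toy_expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
toy_adjacency = signed_hybrid_adjacency(toy_expression)
```

Raising the correlation to a power suppresses weak correlations relative to strong ones, which is the purpose of soft thresholding.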
Expression analysis of external dataset of matched primary breast cancer and BrMs
RNA-seq raw count data from patient-matched primary breast tumors and corresponding BrMs (Varešlija et al., 2019) were downloaded
(https://github.com/npriedig/jnci_2018) and transformed using DESeq2. The statistical significance of gene expression
changes between primary tumors and BrMs was assessed with a two-tailed Wilcoxon signed-rank test on the variance-stabilized
counts.
Plotting and graph generation

Plots were created using the ggplot2 R package (Wickham, 2016) and the ggpubr (https://cran.r-project.org/web/packages/ggpubr/),
survminer (https://cran.r-project.org/web/packages/survminer/), ggraph (https://cran.r-project.org/web/packages/ggraph/)
and ggcyto extensions (Van et al., 2018). Annotated heatmaps were drawn with the pheatmap R package
(https://cran.r-project.org/web/packages/pheatmap/).
Summary data are presented as mean ± standard error of the mean (SEM) or Tukey boxplots using ‘‘ggplot2.’’ Numerical data were
analyzed using the statistical tests noted within the corresponding sections of the article. Hierarchical clustering was performed using
Ward’s method with 1-Pearson correlation coefficient as the distance metric unless noted otherwise. P values were annotated as
follows: * < 0.05, ** < 0.01, *** < 0.001, **** < 0.0001, ns > 0.05.
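
The annotation scheme can be expressed as a small Python helper. The function name is hypothetical, and a p value of exactly 0.05 is treated as ‘‘ns’’ here, which the scheme above leaves unspecified.

```python
def significance_label(p_value):
    """Map a p value to the asterisk annotation used in the figures."""
    # Check the most stringent cut-off first so each p value gets one label.
    for cutoff, label in ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")):
        if p_value < cutoff:
            return label
    return "ns"


labels = [significance_label(p) for p in (0.2, 0.03, 0.005, 0.0005, 0.00005)]
```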
Figure S1. FACS of Cell Populations and RNA-Seq, Related to Figure 1

(A) Flow cytometry (FCM) plots illustrating the gating strategy employed during FAC-sorting of immune cell populations in non-tumor and tumor tissue (for cell
type markers, see Table S2). (B) tSNE plot of gene expression data (500 most variable genes) from all sorted cell populations (n = 226) across the complete clinical
cohort (MG = microglia, MDM = monocyte-derived macrophages, reference = unmatched healthy blood and in vitro generated MDMs).
See also Table S2.
Figure S2. MG and MDM Marker Expression, Related to Figure 2

(A) Normalized counts (log10 transformed) of MG and MDM marker genes in sorted CD49Dlow MG and CD49Dhigh MDM populations across both non-tumor and
tumor tissues (reference = healthy donor in vitro generated MDMs). (B) Percentage of CD49Dlow MG and CD49Dhigh MDMs positive for P2RY12 and CD68 as
determined by FCM in relation to the total number of MG/MDMs in non-tumor (n = 8) and tumor tissue (IDH mut: n = 6, IDH WT: n = 6, BrM: n = 9). (C) Single channel and
merged immunofluorescence (IF) images of CD45, CD68, P2RY12 and CD49D stainings which were employed to delineate MG and MDMs. The last column
shows the resulting Visiopharm cell type assignments for quantitative analyses: MG (CD45+, P2RY12+/CD68+, CD49D-), MDM (CD45+, P2RY12+/CD68+,
CD49D+), non-immune cells (CD45-) and non-TAM-immune cells (CD45+, P2RY12-/CD68-, CD49D-/+). Scale bars represent 100 µm. (D) Scatterplots of the
abundance of MG and MDMs as determined by IF versus FCM in non-tumor (n = 4) and tumor tissues (IDH mut: n = 13, IDH WT: n = 14, BrM: n = 18) processed
independently from the same individual samples. Pearson’s correlation coefficient and significance are indicated at the top of each plot. (E) Heatmap of human
MG- and MDM-specific gene set expression used for deconvolution across FAC-sorted population samples from all disease types.
Figure S3. Analysis of DEGs and TAM Activation Patterns, Related to Figure 3
(A) Summary of contrasts applied when performing differential gene expression (DEG) analysis in MG and MDMs in gliomas (regardless of IDH status) and BrMs
(from all primaries) in comparison to normal controls (non-tumor brain MG and in vitro differentiated MDMs respectively) with the corresponding log2(fold-change)
versus -log10(adjusted p value) volcano plots. (B) Euler plot of the number of differentially expressed genes (DEG, log2(fc) > 1, p.adj < 0.01) that overlap in MG and
MDMs as shown in (A). (C) Molecular Signatures Database (MSigDB) ‘‘Hallmark’’ gene set collection overrepresentation analysis (ORA) in genes upregulated in
both gliomas and BrMs versus non-tumor brain tissue or healthy donors in MDMs and MG. Dot sizes reflect the fraction of gene set members
found within the analyzed DEGs, and dot color indicates cell type. (D) Heatmap of fold changes of macrophage M1 and M2 polarization marker genes (absolute
log2(fc) > 1, p.adj < 0.05) in MDMs and MG in gliomas and BrMs. Blank tiles indicate the lack of significant fold change. Genes are annotated with their canonical
stimuli and the associated polarization phenotype. (GC = glucocorticoid, Ic = immune complexes, IFNg = Interferon gamma, IL10 = interleukin 10, IL4 = interleukin
4, LPS = lipopolysaccharide, TGFb = transforming growth factor beta). (E) Overlap between leading edge metagenes (LEMs) in MG and MDMs in gliomas and
BrMs. Tile fill color indicates significance of overlap determined by hypergeometric testing (-log10(p.adj)). (F) String-DB protein-protein-interaction network of the
intersect from IFN Type-1 group 2 modules from LEMs ‘‘BrM-MG 1,’’ ‘‘Glioma MDM 1’’ and ‘‘BrM-MDM 4.’’ Genes selected for validation through qRT-PCR are
highlighted in red (corresponding data shown in Figure 3E). Node size indicates the centrality, while edge width corresponds to the String-DB interaction score
(only scores > 700, i.e., with a high degree of confidence have been included).
Figure S4. IDH WT-Specific Alterations in TAMs, Related to Figure 4

(A) Representative IF image and cell type quantification below of non-immune cells (CD45-), non-TAM immune cells (CD45+, P2RY12/CD68-, CD49D+/-), MG
(CD45+, P2RY12/CD68+, CD49D-), MDM (CD45+, P2RY12/CD68+, CD49D+) and vessels (CD31+) in IDH WT glioma. Dashed line indicates the border of the
perivascular niche (PVN), scale bar represents 100 µm. (B) Heatmap of cell-type gene set variation analysis (GSVA) enrichment scores of micro-dissected Ivy
Glioblastoma Atlas Project samples (dataset from Puchalski et al., 2018). Columns are ordered by anatomical location, rows have been z-scored. (C) Gene set
enrichment analysis (GSEA) results of MSigDB ‘‘C2’’ antigen processing and cross-presentation associated pathways in T-MDMs versus T-MG in IDH WT glioma.
(D) Heatmap of MDM IDH WT gene set expression in sorted MG and MDMs from IDH mut and WT glioma samples. Columns are ordered by IDH status and cell
type, expression values have been z-scored. (E) Plot of z-scored MDM IDH WT signature scores in the TCGA glioma dataset. Subjects are ranked by their
enrichment score (small amount of random variation added for readability) and the IDH status is indicated by color. (F) Kaplan-Meier estimator of survival in the
combined TCGA glioma cohort based on the enrichment for a cell type-specific T-MDM signature (see Figure S2E). (G) ORA of ‘‘innate anti-PD-1 resistance’’
(IPRES) signatures within DEGs from MG and MDMs in IDH WT gliomas (versus MG from IDH mut tumors) with tile fill indicating the -log10 of the adjusted p
value. (H) GSVA of IPRES signatures in CD45- cells, MG, and MDMs from IDH mut and IDH WT gliomas. Columns are ordered by cell type, rows (z-score) have
been hierarchically clustered.
Figure S5. Protein Concentration in Bulk Tumor Tissues and Relation to Cell-Type-Associated SOM Spots, Related to Figure 5
(A) Bulk tissue protein concentrations of indicated proteins in non-tumor brain (n = 3), gliomas (n = 14) and BrMs (n = 12). Color indicates disease type and IDH
status. (B) Heatmap of self-organizing map (SOM) spot metagene expression across the analyzed samples. Rows were z-scored and have been hierarchically
clustered, columns were ordered by cell type, disease type and IDH mutation status.
Figure S6. Gene Expression Analysis in BrM-TAMs, Related to Figure 6

(A) Overlap of the number of differentially expressed genes (DEG, log2(fc) > 1, p.adj < 0.05) in MG and (B) MDMs in the indicated comparisons. BrM-specific gene
sets are highlighted in gray within each cell type. The intersect of highlighted BrM-MG and BrM-MDM sets contains 87 genes. (C) GSEA of the ‘‘Biocarta IL-6
pathway’’ in BrM-MG versus -MDM and the (D) ‘‘Naba core matrisome’’ gene set from the MSigDB ‘‘C2’’ collection in BrM-MDM versus -MG. (E) Expression
(log10-transformed normalized counts) of neutrophil-recruiting chemokines and receptors in sorted MG, MDMs and neutrophil populations from IDH WT and BrM
samples.
Figure S7. Correlation of WGCNA Modules with External Traits and Module Pathway ORA, Related to Figure 7
(A) Representative immunofluorescence images in IDH WT gliomas and BrMs. Scale bars = 100 μm, boxed area is shown in higher magnification in Figure 7A. (B)
Heatmap of the weighted gene correlation network analysis (WGCNA) module eigengene (= first principal component of expression data, columns, module
columns are labeled with a color code) correlation to the traits (rows, cell type and disease, abundance of CD4+ or CD8+ T cells in % of CD45+). Values inside the
cells state Pearson’s r and the associated p value. (C) ‘‘Brown’’ BrM-MDM module MSigDB ‘‘C2CP’’ ORA results (p value < 0.01) enrichment map network
visualization. Node size represents p value, edge thickness reflects overlap of genes between gene sets.
Resource
Personalized Mapping of Drug Metabolism by the Human Gut Microbiome
Bahar Javdan, Jaime G. Lopez,
Pranatchareeya Chankhamjon, ...,
Xiaojuan Wang, Seema Chatterjee,
Mohamed S. Donia
Correspondence
donia@princeton.edu
In Brief
Each human has a diverse gut
microbiome, which can metabolize drugs
differently. In this resource, Javdan et al.
present a way to capture and grow much
of the unique diversity of human
microbiomes in culture and also a way to
detect many of our microbiome-derived
metabolites. Together, they use these
unique gut communities and the
metabolomics pipeline to see how
personalized microbiomes metabolize
drugs in different ways.
Highlights
• Development of subject-personalized ex vivo batch cultures of the gut microbiome
• Discovery of diverse drug-microbiome interactions using MDM-Screen
• MDM-Screen quantifies drug metabolism by personalized gut microbial communities
• Functional genomic and metagenomic screens identify drug-metabolizing enzymes
Javdan et al., 2020, Cell 181, 1661–1679

Personalized Mapping of Drug Metabolism
by the Human Gut Microbiome
Bahar Javdan,1,4 Jaime G. Lopez,2,4 Pranatchareeya Chankhamjon,1 Ying-Chiang J. Lee,1 Raphaella Hull,1 Qihao Wu,1
Xiaojuan Wang,1 Seema Chatterjee,1 and Mohamed S. Donia1,3,5,*
1Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
3Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
5Lead Contact
*Correspondence: donia@princeton.edu
SUMMARY
The human gut microbiome harbors hundreds of bacterial species with diverse biochemical capabilities.
Dozens of drugs have been shown to be metabolized by single isolates from the gut microbiome, but the
extent of this phenomenon is rarely explored in the context of microbial communities. Here, we develop a
quantitative experimental framework for mapping the ability of the human gut microbiome to metabolize
small molecule drugs: Microbiome-Derived Metabolism (MDM)-Screen. Included are a batch culturing sys-
tem for sustained growth of subject-specific gut microbial communities, an ex vivo drug metabolism screen,
and targeted and untargeted functional metagenomic screens to identify microbiome-encoded genes
responsible for specific metabolic events. Our framework identifies novel drug-microbiome interactions
that vary between individuals and demonstrates how the gut microbiome might be used in drug development
and personalized medicine.
INTRODUCTION
The oral route is the most common route for drug administration. Upon exiting the stomach, drugs can be absorbed in the small and/or large intestine to reach systemic circulation and eventually the liver or directly transported there via the portal vein. In the liver, drugs may be metabolized and secreted back to the intestines through bile—via enterohepatic circulation (Kimura et al., 1994; Li and Jia, 2013). Even parenterally administered drugs and their metabolites can reach the intestines through biliary secretion. Thus, whether prior to or after absorption, some administered drugs will spend a considerable amount of time in the small and large intestines, where our human gut microbiome resides. It is therefore important to study gut microbiome composition and function, specifically as it relates to drug interactions, while accounting for the significant variability between individuals (Falony et al., 2016).
Broadly speaking, the microbiome interacts with drugs both directly and indirectly. Indirect interactions include competition between microbiome-derived metabolites and administered drugs for the same host metabolizing enzymes (Clayton et al., 2009), microbiome effects on the immune system in anticancer immunotherapy (Iida et al., 2013; Sivan et al., 2015; Vétizou et al., 2015), microbiome reactivation of secreted inactive metabolites of the drug (Wallace et al., 2010), and overall microbiome effects on the levels of metabolizing enzymes in the liver and intestine (Meinl et al., 2009). Direct interactions include the partial or complete biochemical transformation of a drug into more or less active metabolites by microbiome-derived enzymes (termed herein: Microbiome-Derived Metabolism, or MDM).
The human gut microbiome harbors hundreds of bacterial species, encoding an estimated 100 times more genes than the human genome (Qin et al., 2010). This enormous diversity and richness represent a repertoire of yet-uncharacterized biochemical activities capable of metabolizing ingested chemicals (Bäckhed et al., 2005; Koppel et al., 2017). Although MDM has been observed in dozens of examples for the past 50 years, this process is still mostly overlooked in the drug development pipeline where little to no effort is spent on determining the specific role of MDM in pharmacokinetics (Ilett et al., 1990; Li and Jia, 2013; Scheline, 1973; Spanogiannopoulos et al., 2016). This is because of the vast complexity of the microbiome, and overwhelming technical challenge of testing hundreds of drugs against thousands of cultured isolates under multiple conditions. In contrast to liver-derived metabolism, we lack a systematic and standardized map of MDM, hindering our ability to reliably predict and eventually interfere with undesired microbiome effects on drug pharmacokinetics and pharmacodynamics.
To address this gap in knowledge, we developed a quantitative experimental workflow for mapping MDM of orally administered drugs using personalized gut microbiome-derived microbial communities (MDM-Screen). The methods and findings reported here provide a framework for discovering and characterizing novel cases of MDM and for potentially incorporating an ‘‘MDM module’’ in the drug development pipeline.
RESULTS
Mapping the Capacity of a Single Subject’s Microbiome to Metabolize Hundreds of Drugs
A major challenge in studying the capacity of the human gut microbiome to metabolize orally administered drugs is the diversity of bacterial species and strains involved (Almeida et al., 2019; Lloyd-Price et al., 2017; Nayfach et al., 2019; Pasolli et al., 2019; Qin et al., 2010). Because it is impractical to systematically screen thousands of isolated strains against hundreds of drugs, previous studies have relied mainly on monocultures of a selected set of representative species. However, gene expression and biochemical transformation profiles vary dramatically between a strain grown in monoculture versus in a mixed community. To address these challenges, we sought to develop an optimized ex vivo mixed culturing system that supports the growth of a large proportion of the species from a given microbiome sample and is amenable to high-throughput (HT) biochemical screens.
We began our screening efforts focused on a single microbiome donor (pilot donor, PD). To identify the medium and culturing period that can support the growth of a batch culture whose composition is maximally similar to the original PD microbiome, freshly collected and glycerol-stocked human feces from PD were cultured in 14 different media and sampled daily for four days. We then extracted DNA from all samples, amplified the V4 region of the bacterial 16S rRNA gene, and deeply sequenced the amplicons (Figure 1A). From the sequencing results, amplicon sequence variants (ASVs) were inferred and the taxonomic composition at different levels was determined for each sample (see STAR Methods) (Bokulich et al., 2018; Bolyen et al., 2018; Callahan et al., 2016; McDonald et al., 2012). We then quantified the differences between the various media and PD at the family level (using the Jensen-Shannon divergence, DJS), and the variant recovery from PD at the single ASV level.
As expected, we observed a great level of variation in both the taxonomic composition and diversity between the different media and culturing periods. Some media led to highly diverse communities that captured portions of the original fecal diversity, while others became dominated almost exclusively by a single family. Among the 14 media commonly used in gut microbiome cultivation efforts (Rettedal et al., 2014), we identified one, modified Gifu Anaerobic Medium (mGAM), that supported the growth of a bacterial community most similar in composition and diversity to PD’s (Figures 1B and S1A). At the family level, mGAM cultures largely match the composition of PD, differing primarily in a commonly observed expansion of the facultative anaerobes, Enterobacteriaceae, at the expense of the obligate anaerobes, Ruminococcaceae (McDonald et al., 2018). Among all tested media, mGAM cultures showed the lowest DJS divergence from PD, becoming increasingly similar to the original sample as growth proceeds (Figure S1A).
Even at the single ASV level, mGAM cultures capture much of the diversity in PD (mGAM cultures have the highest Shannon diversity across all media, and the closest one to PD) (Figure 1C). In PD, there are 33 ASVs present above a relative abundance of 1%, 26 (79%) of which are present in mGAM day two culture. Overall, total shared ASVs between PD and mGAM day two account for 70% of the PD composition (by relative abundance), indicating that the mGAM culture recapitulates the bulk of the original community. Taken together, and consistent with previous reports showing that mGAM can support the growth of a wide variety of gut microorganisms in monoculture (Rettedal et al., 2014; Tramontano et al., 2018), our results support the use of mGAM day two cultures as a viable ex vivo batch culturing model for the PD microbiome.
With an optimized ex vivo culturing system for PD in hand, we next developed a combined biochemical and analytical chemistry approach to map the capacity of PD-derived microbial communities to metabolize clinically used, orally administered drugs (MDM-Screen) (Figure 2A). Three samples were prepared per drug of interest: (1) a 24-h mGAM ex vivo culture of PD, incubated with the drug of interest (final concentration 33 μM, in line with estimates of drug concentrations in the gastrointestinal tract) (Maier et al., 2018), (2) a similar culture incubated with a vehicle control (DMSO), and (3) an equal volume of sterile mGAM, incubated with the same drug concentration. Cultures and controls were then incubated for an additional 24 h at 37°C in an anaerobic chamber, chemically extracted, and analyzed using high performance liquid chromatography coupled with mass spectrometry (HPLC-MS). To verify the reproducibility, the entire procedure was repeated three consecutive times. We tested a diverse library of 575 orally administered drugs, and although the majority of the drugs in this library are currently being used in the clinic, less than 10% of them had been previously explored with respect to MDM (Table S1A). A drug was deemed MDM+ when (1) a new metabolite was observed when incubated with PD culture or (2) the drug was no longer detected when incubated with PD culture, indicating that it is either consumed entirely or metabolized into a molecule that fails our detection, and (3) the drug was metabolized in the same manner during at least two of three independent experiments.
MDM-Screen Identifies Known and Novel Drug-Microbiome Interactions
Among the 575 drugs, we successfully analyzed 438 (76%); the remaining 137 failed MDM-Screen because of issues related to drug stability or incompatibilities with the extraction or chromatography methods employed (see Discussion). Of the successfully analyzed drugs, we identified 57 (13%) as MDM+. These spanned 28 pharmacological classes and even more based on their chemical structure (Figure 2B; Table S1B; Data S1). As expected, several previously reported MDM cases were identified. These include the nitroreduction of the muscle relaxant dantrolene (Kuroiwa et al., 1985), the antiepileptic clonazepam (Elmer and Remmel, 1984; Zimmermann et al., 2019b), and the antihypertensive drug nicardipine (Kuroiwa et al., 1986); hydrolysis of the isoxazole moiety in the antipsychotic risperidone (Mannens et al., 1993; Meuldermans et al., 1994); and azoreduction of the anti-inflammatory prodrug sulfasalazine (Azadkhan et al., 1982; Peppercorn and Goldman, 1972).
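The three-part MDM+ criterion stated in this section (a new metabolite, or loss of the parent drug, seen in the same manner in at least two of three independent experiments) can be written as a small decision rule. This is a sketch under stated assumptions: the `RunObservation` record and the function names are hypothetical, and the peak-calling that produces the two boolean flags is not modeled.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunObservation:
    """Outcome of one independent experiment for one drug (hypothetical record type)."""
    new_metabolite: bool  # a new peak seen only in the culture + drug sample
    drug_detected: bool   # parent drug still detected after incubation with the culture


def run_outcome(obs: RunObservation) -> Optional[str]:
    """Map one run to the kind of metabolism observed, or None if no metabolism is seen."""
    if obs.new_metabolite:
        return "new_metabolite"   # criterion (1)
    if not obs.drug_detected:
        return "depleted"         # criterion (2)
    return None


def is_mdm_positive(runs: List[RunObservation], min_consistent: int = 2) -> bool:
    """Criterion (3): the same kind of metabolism must recur in at least `min_consistent` runs."""
    counts = Counter(o for o in map(run_outcome, runs) if o is not None)
    return any(n >= min_consistent for n in counts.values())


# Hypothetical triplicate: metabolite seen twice, nothing in the third run.
example = [
    RunObservation(new_metabolite=True, drug_detected=True),
    RunObservation(new_metabolite=True, drug_detected=True),
    RunObservation(new_metabolite=False, drug_detected=True),
]
call = is_mdm_positive(example)
```

Treating "new metabolite" and "depleted" as distinct outcomes is one reading of "metabolized in the same manner"; a run showing both is counted here as "new_metabolite".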
1662 Cell 181, 1661–1679, June 25, 2020

Figure 1. Development of an Ex Vivo Batch-Culturing System for the PD Microbiome

(A) Schematic representation of the media selection procedure.
(B) Family level bacterial composition of the original fecal sample (far left), as well as that of PD ex vivo cultures, grown anaerobically in 14 different media over two
days (.01 and .02). See STAR Methods for full media names. 16S rRNA gene sequences that could not be classified at the family level and families with less than
1% relative abundance in all samples are grouped into ‘‘Other.’’ Cultures are ordered according to their Jensen-Shannon (DJS) divergence from the original PD
sample (upper axes, computed at the family level in base e).
(C) ASV level bacterial composition of the original PD fecal sample, and that of day two ex vivo cultures of PD, where each square represents one sample. Rainbow
colored dots represent the relative abundance of individual ASVs that are above 1% in PD, while gray dots represent the combined relative abundance of all ASVs
below 1% in PD. Samples are ordered by their Shannon diversity (H) at the ASV level, computed in base 2 and shown above each square.
See also Figure S1; Table S2.
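The two summary statistics named in this caption, the Jensen-Shannon divergence in base e and the Shannon diversity in base 2, can be computed from relative-abundance vectors as sketched below. The function names and example vectors are illustrative assumptions; both inputs are assumed to be aligned over the same taxa and to sum to 1.

```python
from math import log, log2
from typing import Sequence


def shannon_diversity(p: Sequence[float]) -> float:
    """Shannon diversity H in base 2 of a relative-abundance vector."""
    return -sum(x * log2(x) for x in p if x > 0)


def _kl(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence in base e; terms with p_i = 0 contribute nothing."""
    return sum(a * log(a / b) for a, b in zip(p, q) if a > 0)


def jensen_shannon_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence in base e (between 0 and ln 2)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


# Hypothetical family-level compositions of a fecal sample and one culture.
feces = [0.50, 0.30, 0.15, 0.05]
culture = [0.40, 0.20, 0.10, 0.30]
d_js = jensen_shannon_divergence(feces, culture)
h = shannon_diversity(feces)
```

Note that some libraries return the square root of this quantity (the Jensen-Shannon distance), so values should be compared on the same scale.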
More importantly, we identified a suite of novel MDM cases (45 cases, 80% of the MDM+ drugs): ten resulted from full depletion of the parent drug (or full conversion to a metabolite that evades our detection), while 35 resulted from the appearance of a new metabolite (Figures 2C and 2D; Table S1B; Data S1A). In most cases, the new metabolites showed a high-resolution mass spectrometry (HRMS) profile within a small difference from their parent drugs and/or a similar tandem MS fragmentation (HRMS/MS) pattern, indicating that they are derivatives (Wang et al., 2016) (Table S1B; Data S1B). An aggregate statistical analysis of MDM+ and MDM− drugs revealed specific structural features that are significantly enriched in MDM+ drugs (e.g., a steroidal skeleton, nitro groups, ketones, among others) (see STAR Methods; Table S1C).
Although HRMS and HRMS/MS analyses can narrow down the number of possibilities for the molecular structure of a given metabolite, they are not sufficient for full structural determination. Thus, we selected seven MDM+ examples for detailed characterization of their resulting metabolites: spironolactone (anti-hypertensive), tolcapone (anti-Parkinson’s), misoprostol (anti-ulcer), mycophenolate mofetil (immunosuppressant), capecitabine (anticancer), and finally, hydrocortisone and hydrocortisone acetate (two steroidal anti-inflammatory drugs that produced an identical MDM metabolite). In all but one example, no direct drug-microbiome interactions had been previously reported. The exception was hydrocortisone where several metabolites had been previously reported from individual gut isolates (Ridlon et al., 2013; Winter et al., 1982), but the identity of the MDM metabolite we observed could not be accurately matched to any of them based on MS alone.
To unequivocally determine the structure of the resulting metabolites, we isolated them from scaled-up biochemical incubations with PD cultures and elucidated their structures using nuclear magnetic resonance (NMR) and/or comparison to an authentic standard (STAR Methods; Data S2). For hydrocortisone, we determined that MDM results in the reduction of the ketone group at C20, producing 20β-dihydrocortisone. For hydrocortisone acetate, the same modification occurs but is accompanied with deacetylation of the C21 hydroxyl group (Figure S2). While C20 ketone reduction was previously reported for hydrocortisone (to produce either 20β-dihydrocortisone or 20α-dihydrocortisone depending on the gut isolate incubated with it; Ridlon et al., 2013; Winter et al., 1982), neither MDM deacetylation nor C20 ketone reduction were reported for hydrocortisone acetate. For capecitabine, we show that MDM results in complete deglycosylation; for misoprostol and mycophenolate mofetil, we observed an ester hydrolysis transformation; and in the case of spironolactone, a thioester hydrolysis one. None of these MDM transformations were previously reported for these drugs. Finally, for tolcapone, we observed two consecutive transformations, a typical nitroreduction followed by a relatively uncommon N-acetylation—neither of which had been previously linked to the microbiome for this drug (Figure 2D). Taken together, these results establish MDM-Screen as a viable method for identifying both known and novel biochemical modifications of structurally and pharmacologically diverse drugs by the gut microbiome.
Interestingly, based on already known pharmacokinetic studies in humans, some of these new MDM cases may have direct consequences on the activation or toxicity of the drugs involved. For example, in the case of spironolactone we observed the production of 7α-thiospironolactone, a postulated intermediate en route to the drug’s main active metabolite, 7α-thiomethylspironolactone (Gardiner et al., 1989; Sica, 2005). For the prodrugs misoprostol and mycophenolate mofetil, we observed the production of their active metabolites misoprostol acid (Schoenhard et al., 1985; Tsai et al., 1991) and mycophenolic acid (Bullingham et al., 1998), respectively. Interestingly, mycophenolic acid has been linked to the clinically observed gastrointestinal toxicity associated with mycophenolate mofetil use (Taylor et al., 2019), albeit generated via a different route—hydrolysis of a biliary secreted glucuronide conjugate by gut microbiome-derived β-glucuronidases. Finally, in the case of tolcapone, N-acetylamino-tolcapone has been detected systemically in humans post-tolcapone administration and was suggested to be involved in liver toxicity observed clinically following tolcapone use, yet the mechanism of its production remains unknown (Jorga et al., 1999; Smith et al., 2003). Our discovery that the same metabolite can be produced via MDM provides a possible explanation, and a potential link between the gut microbiome and tolcapone toxicity. In all four cases, additional experiments need to be performed to differentiate the contribution of human- and microbiome-derived metabolism to the observed drug pharmacokinetics and/or toxicity in humans.
Expanding MDM-Screen to Multiple Subjects
Next, we sought to expand our framework to accommodate multiple subjects. To accomplish this goal, we needed to first design a generalizable quantitative metric for assessing the best culturing medium for microbiome samples (Figures 3A and 3B). In our analysis of PD’s ex vivo cultures, we applied a variety of metrics and found a medium that was the best trade-off between richness, evenness, and compositional similarity. However, this approach is not scalable to a large number of donors and also ignores the role of community biomass, which may lead to suboptimal media selection. Therefore, we developed a metric called Expected Number of Detectable Strains (ENDS), a corrected richness metric where the contribution of each ASV is weighed by the probability that its metabolite can be detected while considering total biomass (STAR Methods; Figure 3B; Methods S1). The core idea is that we desire a medium that supports the highest number of different bacterial ASVs, while ensuring that the metabolic contributions of these ASVs are detectable by our experimental method. ENDS utilizes two data inputs related to the ex vivo culture composition: relative abundance at a given taxonomic level and total community biomass; and it also utilizes two inputs related to the instrument detection sensitivity: a model of instrument background noise and a model of instrument measurement noise. Using this information, a simple mechanistic model of MDM metabolite production, and estimations of statistical power, we compute the probabilities that metabolic reactions performed by each strain will be detected in the ex vivo culture (see STAR Methods for a detailed mathematical description of ENDS).
With this quantitative framework in hand, we collected additional fresh fecal samples from 20 healthy donors (D1-20) and processed them in the same manner as PD. We then cultured each sample in nine representative media and used 16S rRNA gene sequencing to determine the composition of the cultured communities as previously described (Figures 3A–3C and S1; Table S2A). To measure community biomass, one mL of each culture was pelleted and weighed (STAR Methods; Tables S2C and S2D). We observed a wide variation in culture characteristics, with the richness ranging from 20–135 ASVs and mean
Figure 2. Screening of the PD Microbiome against Orally Administered Drugs Identifies Novel Drug-Microbiome Interactions
(A) Schematic representation of MDM-Screen. A drug was considered MDM+ if a new metabolite is produced (e.g., drug 3) or if the drug is no longer detectable
(e.g., drug 5) after incubation with the microbiome, as compared with abiotic media controls.
(B) A bar graph showing the pharmacological classes of MDM+ drugs discovered by MDM-Screen with the PD microbiome. ‘‘Others’’ include one drug each from
14 additional classes.
(C) Examples of MDM+ drugs where the drug is no longer detectable after incubation with the PD microbiome.
(D) Examples of MDM+ drugs where a new metabolite is discovered by MDM-Screen and fully characterized in this study.
See also Table S1; Data S1 and S2.
biomass density ranging from 2–27.9 g/L (Figure 3D). mGAM and Bryant and Burkey (BB) media consistently performed well with all 20 donors. Interestingly, mGAM had moderate ASV richness and high biomass, while BB yielded a much lower biomass with high richness and did not suffer from the Enterobacteriaceae expansion observed in mGAM. We calculated that a 70/30 BB/mGAM mixture would yield an optimal medium with moderate biomass, high richness, and a reduced Enterobacteriaceae expansion (Methods S1B), thus we included this mixture (named BG) as a 10th medium in our culturing trials.
Next, we wondered whether the ex vivo cultured communities are truly personalized per subject, an important prerequisite if cultured communities are to be used for assessing inter-individual variability in MDM. Personalization between cultures was clearly observed at the ASV level, with clear specific patterns unique to individual donors and their cultures (Figure 3C). We found significantly more ASVs shared between donor feces and their self ex vivo cultures than non-self (47.1 versus 27.5 ASVs, p < 0.001, permutation test), partially recapitulating the inherent personalization between the donor fecal samples (Figures 3E and 3F). Moreover, we identified 167 ASVs that were unique to one of the 20 donors in their fecal samples (8.4 ASVs per donor on average) and were concordantly unique to the same donor in their ex vivo cultures (Table S2F). Finally, we grew and sequenced multiple replicates of mGAM and BG ex vivo cultures from different donors (all 20 donors, three replicates each for BG, and eight donors, six replicates each for mGAM). The analysis of these replicates, whether cultured from the same or separate glycerol stock aliquots, revealed a high correlation between ASV abundances of replicates from the same donor and ensured that the community assembly process was replicable (Pearson correlation coefficient >0.9) (Methods S1A; Figure S1). These analyses confirm that our approach results in personalized and replicable microbial communities.
To select a single medium that would be on average optimal for use in a 20-donor screen, we computed the average ENDS of all media across a range of reaction rates (Figure 3G). We found that BG is on average an optimal medium at reaction rates we estimated to be the most physiologically relevant (Methods S1D). In addition, BG recapitulates a large portion of the microbial community in the original fecal sample. On average, BG cultures recover 76.6% of the ASVs above 1% in the original fecal sample, which translates to 84.7%, 88.3%, and 92.7% recovery rate on the species, genus, and family levels of taxa above 1% in the original sample, respectively (Figure 3H). In terms of recovery of all elements, BG recovers 43.3%, 57.3%, 60.6%, and 62.8% on the ASV, species, genus, and family levels, respectively. BG is also, on average, the closest in composition to the original sample (DJS = 0.16, computed at the family level in base e), has the highest average diversity (H = 4.3, computed at the ASV level in base 2), and shows the highest ASV richness (ranging from 34–133 ASVs with an average of 77). We therefore selected BG as the medium to use in our 20-donor screen.
We next sought to develop a HT, quantitative metabolomic approach to assess MDM inter-individual variability with a subset of drugs. Several improvements were made to the original drug metabolism screen. All experimental steps including incubation, chemical extraction, and HPLC-HRMS analysis were performed in microtiter 96-well plates instead of individual tubes, at a 400 μL volume instead of 3 mL. This lowered the amount of drug used per incubation, allowed us to perform triplicated reactions simultaneously, and streamlined our chemical extraction and analysis procedures. We also spiked a known concentration of an internal standard prior to the chemical extraction, which allowed us to precisely quantify partial, in addition to complete, drug depletion.
We chose a 23-drug subset to test the ability of our quantitative approach to reveal potential inter-individual variability in MDM under the MDM-Screen conditions. 13 drugs had at least one defined metabolite with a known chemical structure, allowing us to unambiguously compare their levels between samples (Figure S3; Table S3). For all 20 donors, ex vivo cultures (in BG medium) were incubated in triplicates in a 96-well microtiter plate with each of the 23 drugs at a final concentration of 33 μM, or with DMSO (Figure 4A). In addition, an abiotic medium-drug plate as well as a heat-killed-microbiome-drug (HKM-drug) plate
Figure 3. Identifying the Optimal Medium for Multi-Donor MDM-Screen

(A) Schematic representation of the media selection procedure for D1-20.
(B) Schematic representation of the ENDS metric. Using 16S rRNA gene sequencing and biomass measurements, absolute abundances of ASVs (orange and
gray strains) in different ex vivo communities are measured and metabolite production from each member of the community is estimated using a simple
mathematical model. Using instrument noise properties, distributions of metabolite measurements from each ASV (orange and gray distributions) are estimated
and compared to instrument noise (white distribution). Statistical power estimation is then used to compute metabolism detection probabilities for each ASV and
the condition maximizing ENDS (the sum of these probabilities) is selected.
(C) ASV abundance heatmap of the original fecal samples and ex vivo microbial communities for each donor. Each box corresponds to samples from a single
donor, with the original fecal sample shown on the far left followed by different ex vivo media in the order specified above the heatmap. Only ASVs above 5% in at
least one sample are shown, with all remaining ASVs aggregated into ‘‘Other.’’ The taxonomic classification of each ASV (on the order level) is indicated by the
color bar on the left.
(D) Histogram of ex vivo community biomass for all donors in different media conditions.
(E) Comparison of shared ASVs within (self, i.e., the ASV richness) and between (non-self) donor fecal samples. ***p < 0.001, permutation test.
(F) Comparison of shared ASVs between donor fecal samples and ex vivo cultures originating from the same donor (self) versus ones originating from other donors
(non-self). ***p < 0.001, permutation test.
(G) Average ENDS of different media conditions at varying metabolite production rates (quantified as AUC normalized to an internal standard). ENDS was
computed for each ex vivo culture assuming a p value significance cutoff of 0.01 and three replicates. For each media condition, ENDS was averaged across all
donors.
(H) Average fractional recovery of different taxa in BG ex vivo communities as a function of relative abundance in the original donor fecal sample. The fractional
recovery was calculated for all donors and then averaged.
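The structure of the ENDS metric described in panel (B), a sum over ASVs of the probability that each ASV's metabolism is detected, can be sketched as follows. Only that summation, the 0.01 significance cutoff and the three replicates come from the caption; the detection model here is a deliberately simple stand-in (a one-sided normal-approximation power calculation with a single noise term), not the authors' mechanistic and instrument-noise models, and all names and parameter values are hypothetical.

```python
from statistics import NormalDist
from typing import Sequence

_STD_NORMAL = NormalDist()


def detection_probability(rel_abundance: float, biomass: float, rate: float,
                          noise_sd: float, n_replicates: int = 3,
                          alpha: float = 0.01) -> float:
    """Toy power estimate for detecting one ASV's metabolite (assumed model).

    Signal is taken as rate * absolute abundance; noise is one Gaussian term.
    """
    signal = rate * rel_abundance * biomass
    std_err = noise_sd * (2.0 / n_replicates) ** 0.5  # difference of two replicate means
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)
    return _STD_NORMAL.cdf(signal / std_err - z_crit)


def ends(rel_abundances: Sequence[float], biomass: float, rate: float,
         noise_sd: float, n_replicates: int = 3, alpha: float = 0.01) -> float:
    """Expected number of detectable strains: the sum of per-ASV detection probabilities."""
    return sum(
        detection_probability(a, biomass, rate, noise_sd, n_replicates, alpha)
        for a in rel_abundances
    )


# Hypothetical culture: four ASVs, compared at low and high community biomass.
community = [0.60, 0.25, 0.10, 0.05]
low = ends(community, biomass=2.0, rate=1.0, noise_sd=0.5)
high = ends(community, biomass=20.0, rate=1.0, noise_sd=0.5)
```

Under this toy model a richer community does not automatically score higher: rare ASVs contribute little unless biomass or reaction rate lifts their signal above the noise, which is the trade-off the metric is meant to capture.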
were prepared in the same manner. After 24 h incubation, culture and control plates were chemically extracted and analyzed using HPLC-HRMS (STAR Methods).
We then devised a targeted metabolomics strategy to quantify MDM. We calculated percent drug remaining and metabolite level to assess drug depletion and metabolite production, respectively, in the presence of microbiome cultures. Both metrics were calculated using area under the curve (AUC) integration with normalization to the internal standard (see STAR Methods; Tables S3J–M). We determined statistical significance for metabolite production using one-sided Welch’s t tests between the donor-drug condition and the donor-DMSO, medium-drug, and HKM-drug conditions and corrected the resulting p values for multiple hypotheses using the Benjamini-Hochberg method, requiring that tests against all three control conditions have a false discovery rate (FDR) corrected p < 0.01 (Benjamini and Hochberg, 1995). For drug depletion, we used the same method with the donor-drug and HKM-drug conditions as controls and included an additional fold-change cutoff of two (Figures 4B and 4C; Tables S3N–S3P). We also performed untargeted metabolomics analyses for new metabolite discovery by identifying unique molecular features from all samples, determining statistical significance using methods similar to those used for the targeted metabolomics, and verifying each metabolite’s relationship to the parent drug based on its HRMS/MS fragmentation pattern (Wang et al., 2016) (STAR Methods; Tables S3C and S3D). All verified metabolites from the untargeted metabolomics approach were then quantified using the same targeted metabolomics workflow described above (Figure 4C; Tables S3N–S3P).
We observed cases of consistently negative MDM across donors (ketoconazole, praziquantel, ropinirole, and torsemide), consistently positive MDM in either drug depletion (misoprostol, nicardipine, and spironolactone), metabolite production (tolcapone and vorinostat), or both (clonazepam, risperidone, and sulfasalazine), and variable MDM (Figures 4B–4D). This variability was in drug depletion (ketoprofen and levonorgestrel), metabolite production (misoprostol, nicardipine, and spironolactone), or both (capecitabine, clofazimine, digoxin, hydrocortisone, lovastatin, mycophenolate mofetil, sulindac, and vorinostat). We quantified this variability by computing the Shannon entropy (in base 2) of the distribution of metabolizers and non-metabolizers, denoted as H_V. This metric is maximal (H_V = 1) when half of donors metabolize the drug and is minimal when the drug is either always or never metabolized (H_V = 0).
The observed variability ranged widely from 1/20 to 19/20 donors deemed MDM+ for a given type of drug depletion or metabolite production. In the case of digoxin, for example, 3/20 donors (H_V = 0.61) produced the known metabolite dihydrodigoxin in statistically significant amounts (Figures 4C and 4E). Inter-individual variability in digoxin MDM has been clinically known for decades, where significant reduction of the drug into dihydrodigoxin and related metabolites occurs in only a subset of patients (Lindenbaum et al., 1981). These results demonstrate that our screen can quantitatively assess the inter-individual variability of MDM between personalized gut microbial communities cultured under identical ex vivo conditions. Follow-up studies will need to be performed to evaluate whether our screening results directly correlate with clinical outcomes.
Next, we sought to determine whether the depletion of drugs in our screen can be explained by the production of associated metabolites. If changes in drug levels are primarily because of conversion to a detected metabolite, there should exist a strong negative correlation between depletion and metabolite production, corresponding to a stoichiometric mass balance. The absence of such a correlation, on the other hand, would suggest additional events that are not accounted for (e.g., the production of additional unknown or undetectable metabolites, the conversion of the initial metabolite into a second one, or bacterial consumption of the parent drug). For drugs with variable MDM (H_V > 0.5 for at least one metabolite or the parent drug), we computed the Pearson correlation coefficient of the drug signal and the sum of known metabolite signals in all donor-drug ex vivo samples. We then determined whether a drug has statistically significant correlation by performing t tests, correcting p values using the Benjamini-Hochberg method, and requiring FDR corrected p < 0.01. For vorinostat and digoxin, for example, we found a significant negative correlation between metabolite production and drug depletion (Pearson correlation coefficient of −0.91 and −0.79, respectively), suggesting that the majority of drug
Figure 4. A HT, Quantitative Metabolomic Approach to Assess Inter-Individual Variability in MDM Using Personalized Microbial Communities
(A) Schematic representation of quantitative MDM-Screen with 20 donors and 23 selected drugs.
(B) Heatmap of drug depletion showing the mean fraction of drug remaining after 24 h for each donor-drug combination. The fraction remaining is computed
relative to the medium-drug control, and fractions above 1 are truncated to 1 for simplicity.
(C) Heatmap of metabolite production showing the mean level of metabolite after 24 h, normalized to the maximum level of that metabolite across all donors.
Metabolites in red were discovered using the untargeted metabolomics approach, while ones in black were discovered previously or by MDM-Screen with the PD
microbiome (Table S3B). In (B) and (C), *statistically significant metabolism in the donor condition as compared with controls. The upper inset axes represent
inter-individual variability in MDM using the Shannon entropy (calculated in base 2) of the distribution of donors with significant and non-significant metabolism.
(D) Cumulative histogram of the number of significant donors for both metabolite production and parent drug depletion. For parent drugs, the y axis is normalized
to the total number of drugs tested (23), and for metabolite production, it is normalized to the total number of metabolites produced (32).
(E) Levels of metabolite production (measured by HPLC-HRMS in AUC normalized to an internal standard) for four drugs, with the variability entropy indicated
above. Filled data points indicate that the replicates are significantly higher than control conditions, while hollow data points indicate that they are not.
(F) The upper three scatterplots show significant negative correlation between drug depletion and metabolite production, with the Pearson correlation coefficient
indicated above. The line shown is a linear regression fit of the data. The lower bar plot indicates the Pearson correlation coefficient between remaining drug levels
and total metabolite production for all computed cases. *FDR corrected two-sided t test p < 0.01. For drugs with multiple metabolites, we sum the normalized
AUC of all metabolites.
(G) Correlation between drug depletion and metabolite production for nicardipine before and after inclusion of metabolites discovered by untargeted
metabolomics.
See also Table S3.
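The two summary statistics that recur in Figure 4, the variability entropy and the Benjamini-Hochberg correction, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline (see STAR Methods for that): the function names `variability_entropy` and `benjamini_hochberg` are hypothetical, and the p values in the last line are made up for demonstration.

```python
import math


def variability_entropy(n_metabolizers, n_donors):
    """Base-2 Shannon entropy of the metabolizer / non-metabolizer split."""
    p = n_metabolizers / n_donors
    # By convention 0 * log2(0) = 0, so empty classes are skipped.
    return sum(-q * math.log2(q) for q in (p, 1.0 - p) if q > 0)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(round(variability_entropy(3, 20), 2))   # digoxin, 3/20 donors -> 0.61
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))  # hypothetical p values
```

With 3 of 20 donors positive, the entropy reproduces the 0.61 reported for digoxin; 10 of 20 gives the maximum of 1, and 0 or 20 of 20 gives 0.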

depletion can be explained by the production of the quantified metabolite (Figure 4F). Nicardipine, on the other hand, exhibited a very poor correlation initially (Pearson correlation coefficient of −0.22), implying that additional unknown factors are at play. Interestingly, our untargeted metabolomics pipeline detected 11 additional metabolites of nicardipine, which upon inclusion in the analysis resulted in a stronger negative correlation (Pearson correlation coefficient of −0.6, FDR corrected p = 0.0102) (Figure 4G; Table S3E). Since our screen is based on microbial communities and not individual strains, it provides a powerful platform to discover interacting factors that influence drug and metabolite levels under realistic conditions, as exemplified by the varying number of nicardipine metabolites observed per personalized community.
Next, we assessed whether we could predict MDM using taxonomic data. We computed Spearman correlations between absolute abundances of taxonomic elements (at different levels) in the BG ex vivo cultures and measured drug and metabolite levels in matching donors but found no significant correlations, even in specific cases of MDM where metabolism has been previously attributed to a single species (e.g., digoxin reduction by Eggerthella lenta) (Haiser et al., 2013). This is likely due to a combination of two factors. First, as has been previously observed (Haiser et al., 2013; Maini Rekdal et al., 2019), taxonomic classifications may not reflect the presence or absence of gene variants that encode strain-specific drug-metabolizing enzymes, even at the ASV level. Second, the observed level of MDM may not be monotonically dependent on a single taxon’s abundance if confounding community effects are at play. Examples of such effects include the contribution of several community members to the production of the metabolite(s), the consumption of the drug or metabolite(s), or the inhibition of the metabolite-producing or drug-depleting bacterium or enzyme (for a mathematical analysis of the impact of these factors on the correlation, see Methods S1C). These results emphasize the importance of considering whole community effects in MDM. While our ex vivo communities may not fully recapitulate all possible community effects that occur in humans, they represent an important step toward identifying and quantifying them.

Linking MDM to Specific Genes in the Human Microbiome
Next, we sought to link the observed biochemical transformations to specific microbiome-derived enzymes. We picked two representative cases of MDM transformations: MDM deglycosylation of capecitabine into deglycocapecitabine and C20 ketone reduction of hydrocortisone into 20β-dihydrocortisone. Three main approaches had been previously employed to identify genes responsible for a specific MDM transformation: comparative transcriptomics, which assumes that the expression of metabolizing enzymes is induced in the presence of their substrates (e.g., digoxin) (Haiser et al., 2013; Koppel et al., 2018), homology-based discovery, which assumes that related classes of enzymes metabolize similar substrates (e.g., levodopa) (Maini Rekdal et al., 2019; van Kessel et al., 2019; Zimmermann et al., 2019b), and HT mutagenesis screens, which identify metabolizing enzymes by isolating loss-of-function mutant strains (Zimmermann et al., 2019b). For capecitabine deglycosylation, we elected to use a homology-based approach.

Characterizing the Genetic Basis of MDM Deglycosylation Using a Homology-Based Approach
To identify a specific microbiome-derived isolate where a homology-based approach can be employed, we explored the ability of a limited panel of bacterial isolates to deglycosylate capecitabine, including strains isolated originally from PD. Interestingly, capecitabine deglycosylation was mainly performed by Proteobacteria (including Escherichia coli), and one of two tested Bacteroidetes: Parabacteroides distasonis, providing genetically tractable organisms for functional studies (Figure S4). In humans, thymidine phosphorylase (TP) and uridine phosphorylase (UP), both part of the pyrimidine salvage pathway, catalyze the deglycosylation of 5′-deoxy-5-fluorouridine (a late metabolite of capecitabine) to yield 5-fluorouracil (5-FU) (Temmink et al., 2007). To test whether bacterial homologs of human TP and/or UP are responsible for the observed MDM deglycosylation of capecitabine, we generated strains of E. coli BW25113 that are knockouts for TP (ΔdeoA), UP (Δudp), or both, and compared their ability to metabolize capecitabine with that of wild-type (WT) E. coli (Figure 5A). While WT E. coli efficiently deglycosylates capecitabine (30% conversion rate), the deglycosylating activity of the Δudp and ΔdeoA/Δudp strains is significantly diminished (less than 4% conversion rate, p value < 0.001, two-tailed t test) (Figure 5B). Surprisingly, the ΔdeoA strain showed a significant increase in its deglycosylating activity in comparison with WT (50% conversion rate, p value < 0.01, two-tailed t test), possibly because of a compensating mechanism (e.g., overexpression of udp) in the absence of deoA. These results indicate that microbiome-derived UP is, at least in part, responsible for the deglycosylation of capecitabine.
Capecitabine is one of several generations of antimetabolite chemotherapeutic agents, many of which are prodrugs for 5-FU, and are known collectively as the oral fluoropyrimidines (FPs) (Lamont and Schilsky, 1999; Longley et al., 2003). Importantly, oral FPs’ bioavailability and toxicity vary widely among patients (Cleary et al., 2017; Zampino et al., 1999), but the human gut microbiome’s contribution to this variability had not been explored. To determine whether deglycosylation occurs with other FPs, and whether the same enzymes are involved, we investigated the MDM of two additional oral FPs (doxifluridine and trifluridine) using WT and mutant E. coli. Unlike with capecitabine, almost complete deglycosylation was observed for both drugs with WT E. coli, and the activity was dependent on both TP and UP (Figures S4 and S5). These results indicate a level of deglycosylation specificity for TP/UP among the FPs (Figure 5C). Remarkably, the consequences of the same modification differ depending on the tested drug. For trifluridine, the resulting metabolite (trifluorothymine) is inactive (Figures 5D and S4): trifluridine is typically incorporated intact into DNA to cause cytotoxicity (Cleary et al., 2017; Lenz et al., 2015). Such a premature intestinal inactivation by the microbiome may thus be an unknown contributor to the established low bioavailability of trifluridine, in addition to the known contribution of human TP (Cleary et al., 2017). For doxifluridine, however, the resulting metabolite is the active 5-FU (Figures 5E and S5). This premature activation

Figure 5. Genetic Basis and Widespread Nature of MDM Deglycosylation among the FPs and in Human Gut Metagenomes
(A) Genetic organization of the udp and deoA loci in the genome of E. coli BW25113.
(B) A bar graph indicating percent conversion of capecitabine to deglycocapecitabine by wild-type (WT) E. coli BW25113 and Δudp, ΔdeoA, and ΔdeoA/Δudp
mutants (each tested in triplicate). ***p value < 0.001, while **p value < 0.01, two-tailed t test. Error bars represent the standard deviation.
(C) Biochemical reaction catalyzed by thymidine and uridine phosphorylases on their natural substrates.
(D) MDM deglycosylation of the oral anticancer drug trifluridine leads to its premature inactivation, since trifluorothymine is no longer active.
(E) MDM deglycosylation of the anticancer prodrug doxifluridine leads to its premature activation, since 5-FU is the intended active metabolite.
(F) Heatmaps indicating the prevalence (in percent of subjects harboring the gene of interest) and abundance (in median RPKM for all positive subjects) of E. coli-
derived deoA and udp across six gut metagenomic cohorts. The number of subjects analyzed in each cohort is indicated on the right. For cohorts with multiple
sub-cohorts, the values reported are the averages of the sub-cohort values.
(G) Jitter plots of E. coli-derived deoA and udp abundances (in RPKM) for all positive subjects in the same cohorts.
See also Figures S4 and S5; Table S4.
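The prevalence and abundance metrics summarized in (F) and (G) can be sketched as follows. This is an illustrative Python example, not the authors' code: the function names are hypothetical, the read counts are made up, and the rule that a subject is "positive" when at least one read maps to the gene is an assumption (the actual threshold is given in STAR Methods).

```python
from statistics import median


def rpkm(mapped_reads, gene_length_bp, total_reads):
    """Reads per kilobase of gene per million sequenced reads."""
    return mapped_reads / ((gene_length_bp / 1_000) * (total_reads / 1_000_000))


def summarize_cohort(samples, gene_length_bp):
    """Return (prevalence in percent, median RPKM among positive subjects).

    `samples` is a list of (mapped_reads, total_reads) pairs, one per subject.
    Assumption: a subject counts as positive if at least one read maps.
    """
    positives = [rpkm(m, gene_length_bp, t) for m, t in samples if m > 0]
    prevalence = 100.0 * len(positives) / len(samples)
    abundance = median(positives) if positives else None
    return prevalence, abundance


# Made-up counts for four subjects and a hypothetical 1,000 bp gene.
cohort = [(0, 10_000_000), (50, 10_000_000), (0, 20_000_000), (400, 20_000_000)]
print(summarize_cohort(cohort, gene_length_bp=1_000))
```

Reporting the median only over positive subjects, as the caption describes, keeps prevalence and abundance independent: a gene can be rare but abundant where present, or common but sparse.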

of the prodrug may therefore lead to gastrointestinal toxicity, again a side effect commonly associated with oral doxifluridine (Kim et al., 2001; Min et al., 2000). Additional studies are necessary to directly correlate the level of MDM deglycosylation of different FPs in humans to their clinically observed pharmacokinetics and/or toxicity.
Because capecitabine was significantly and variably metabolized into deglycocapecitabine in 17/20 donors, we sought to examine the representation of FP deglycosylating enzymes in the gut microbiome of the human population at large. We specifically focused on enzymes that we experimentally verified to have a role in FP deglycosylation: E. coli-derived TP and UP. Overall, we analyzed six large and diverse human cohorts: the Human Microbiome Project (HMP-1-1 and HMP-1-2, 299 subjects from the USA) (Human Microbiome Project Consortium, 2012; Lloyd-Price et al., 2017), the Metagenomics of the Human Intestinal Tract Consortium (MetaHIT, 219 subjects from Spain and 176 subjects from Denmark) (Nielsen et al., 2014), a Chinese cohort (194 subjects) (Qin et al., 2012), and a Fijian cohort (Fijicomp, 192 subjects) (Brito et al., 2016). We mapped fecal metagenomic reads from each of the cohort samples to the DNA sequence of deoA and udp, and calculated two metrics: prevalence, i.e., the percent of subjects from each cohort that are positive for a given gene, and abundance of the gene among positive samples (calculated in reads per Kbp per million of sequenced reads, or RPKM) (STAR Methods; Tables S4B and S4C). Interestingly, we found that both genes were most prevalent in non-Western cohorts (Chinese 74%/76% positive subjects, and Fijicomp 64%/70% positive subjects for deoA/udp, respectively) in comparison with Western ones (24%/26% on average, for deoA/udp, respectively) and that their abundance per positive sample varies widely within and between cohorts (from 101 to 102 RPKM) (Figures 5F and 5G; Tables S4B and S4C). These results indicate that FP deglycosylating enzymes are both widespread and variable in the gut microbiome of diverse human cohorts (even when considering the contribution of a single bacterial species, E. coli) and further highlight the importance of considering MDM deglycosylation of FPs in clinical studies.

An Untargeted Functional Metagenomic Screening Approach for Identifying Metabolizing Enzymes
Although the homology-based approach was relatively straightforward in identifying responsible species and enzymes for the deglycosylation of FPs, it is not widely applicable. Unlike pyrimidine phosphorylases, oxidoreductases (the enzyme class likely responsible for hydrocortisone reduction to 20β-dihydrocortisone) are extremely diverse and typically substrate specific, with numerous homologs found per bacterial genome. Moreover, homology-based discovery, as well as comparative transcriptomics and HT mutagenesis screens, typically require the identification of an isolated strain that performs the modification of interest and its use as the basis for genetic manipulations and/or functional analyses. These two limitations motivated us to employ an orthogonal strategy that relies on neither enzymatic homology nor isolated strains.
While no human gut microbiome-derived enzymes had previously been deemed responsible for converting hydrocortisone into 20β-dihydrocortisone, a cat microbiome-derived enzyme had: a 20β-hydroxysteroid dehydrogenase (20β-HSDH) from the cat fecal isolate Butyricicoccus desmolans ATCC 43058 (Devendran et al., 2017). Neither this enzyme nor close homologs thereof (at 60% protein sequence identity or above) could be identified in a deep metagenomic sequencing dataset that we generated from the PD fecal DNA (STAR Methods). We therefore decided to use this example as a test case for developing an untargeted functional metagenomic screening strategy for metabolizing enzymes. In typical functional metagenomic screens, metagenomic DNA is cloned into a vector that replicates in E. coli and functional screens are performed in either a selective manner (e.g., for antibiotic resistance or an engineered circuit for survival) (Genee et al., 2016; Sommer et al., 2009; Uribe et al., 2019) or a visual readout (e.g., a colorimetric or antibacterial one) (Brady et al., 2002; Cohen et al., 2015; Gillespie et al., 2002; Rondon et al., 2000). Here, we use a functional metagenomic screen where the readout is a specific MDM transformation that is detected by MS. For metagenomic genes that are successfully expressed and produce functional gene products in E. coli, this approach would allow access to enzymes encoded by cultured and not-yet cultured members of the microbiome, and to ones that share no close homology with previously characterized enzymes. Two major technical challenges in this strategy, however, are to produce a large-enough metagenomic library that captures the majority of the genetic content in the complex microbiome, and to develop a HT analytical chemistry approach that permits the screening of such a library.
We isolated metagenomic DNA from PD and used it to construct a 3 × 10^6-member clone library (PD-CL) in an E. coli expression vector (insert size 2–4 Kbp) (Figure 6A). To determine whether PD-CL is truly representative of the genetic content in PD, we deeply sequenced a representative pool that contains 10^5 unique clones (PD-CL-100) and compared it with the deeply sequenced PD fecal metagenome. We mapped metagenomic reads from either PD or PD-CL-100 to assembled scaffolds from the PD metagenome (25,529 scaffolds ≥ 2 Kbp). Satisfyingly, reads from PD-CL-100 (which represents only 3% of the full PD-CL) map to 21% of the PD scaffolds, including ones that originate from all major bacterial phyla and varying coverages in the PD microbiome (Figure 6B; Table S4A). These results indicate that PD-CL represents a large component of the genetic content in PD and that it is adequate for use in functional metagenomics screens.
During the construction of PD-CL, we split it into 80 pools of 2–6 × 10^4 unique clones (UCs) each and preserved them in corresponding glycerol stocks (see STAR Methods). We tested each of these pools for the ability to convert hydrocortisone into 20β-dihydrocortisone and identified six that showed significant metabolism. To reach a single functional clone, we performed 10-fold serial dilutions of a selected positive pool of 2 × 10^4 UCs, by following positive sub-pools at the 2 × 10^3, 2 × 10^2, and 2 × 10^1 UC levels. We then plated the 20-UC positive sub-pool and screened individual clones in a 96-well plate format to reach a single positive clone: Hyd-red-1 (Figures 6C and S6). Sequencing of Hyd-red-1 revealed that it likely originated from a Bifidobacterium sp. Analysis of the genetic context of Hyd-red-1 in a PD scaffold revealed a single putative oxidoreductase in the cloned insert (Figure 6D). We then cloned and heterologously expressed this single gene and showed that it is


indeed a 20β-HSDH (Figures 6C and 6D). A second round of screening of PD-CL performed in a similar manner revealed a different clone, Hyd-red-2, harboring the same gene and confirming our findings (Figure 6D). These results indicate that combining MDM-Screen with a functional metagenomics approach is a valid strategy to link MDM transformations to metabolizing enzymes from diverse bacteria without the need for bacterial isolation.
We then sought to further probe the biological relevance of the discovered 20β-HSDH. Because it was discovered by heterologous expression of PD-derived DNA in E. coli, we wondered whether it is actually expressed under host colonization conditions. To answer this question, we isolated RNA from PD, subjected it to deep metatranscriptomic sequencing, and mapped resulting reads to the PD scaffold harboring the 20β-HSDH gene (see STAR Methods). We observed robust expression of the 20β-HSDH gene in PD-derived metatranscriptomic data but not of neighboring genes, suggesting that it is expressed individually and not as part of a gene cluster (Figure 6E). To determine whether the identified 20β-HSDH is unique to PD or widespread in the human population, we mapped fecal metagenomic reads from the same six human cohorts mentioned above to the DNA sequence of its gene and to that of a previously identified 20α-HSDH gene from the gut microbiome isolate Clostridium scindens ATCC 35704 for comparison (which converts hydrocortisone to 20α-dihydrocortisone) (Ridlon et al., 2013). While the C. scindens-derived 20α-HSDH gene was rare (present in only 0.6% of subjects, on average), the PD-derived 20β-HSDH gene was widespread in all cohorts (present in 36% of subjects, on average), and its abundance varied widely between subjects and cohorts (Figures 6F and 6G; Tables S4B and S4C).
Although Bifidobacterium adolescentis had been known to convert hydrocortisone into 20β-dihydrocortisone for almost 40 years (Winter et al., 1982), no responsible enzymes had been identified from it. Interestingly, while this manuscript was under revision, a different study published the crystal structure of a 20β-HSDH from B. adolescentis L2-32 (which is 98% identical to the 20β-HSDH we identified from the PD microbiome), further corroborating our findings (Doden et al., 2019). As mentioned above (Figure 2D), we also observed the production of 20β-dihydrocortisone from hydrocortisone acetate when incubated with PD. This transformation would require two steps: deacetylation at the C21 hydroxyl by a yet-unidentified enzyme, and reduction of the ketone at C20 by a 20β-HSDH. Interestingly, when we incubated hydrocortisone acetate with either P. distasonis or C. bolteae, it was deacetylated to yield hydrocortisone but not further reduced, implying that the two metabolic steps at play here can be uncoupled and performed by different members of the microbiome in a sequential manner (Figure S2).

MDM Deglycosylation Occurs In Vivo
Although MDM-Screen is able to uncover novel microbiome-drug interactions, it is unclear whether these results (observed ex vivo) can be recapitulated within the gastrointestinal tract of a live mammalian host (in vivo). To address this question, we sought to monitor one MDM transformation, MDM deglycosylation of FPs, in an in vivo pharmacokinetic study that is performed in a microbiome-dependent manner. Capecitabine was among the initial hits that resulted from MDM-Screen and its modification yields a novel metabolite (deglycocapecitabine) that has not been previously reported in humans or animals; we selected its MDM deglycosylation as a test case for in vivo studies and a proxy for other FPs. We treated two groups of C57BL/6 mice with a cocktail of antibiotics for 14 days to eliminate their native microbiome, then colonized one group with PD while the control group remained non-colonized (see STAR Methods). The two groups were then treated with a single human-equivalent oral dose of capecitabine (755 mg/kg), and blood and feces were collected from each mouse at 0, 20, 40, 60, 120, and 240 min post-drug administration (Figures 7A and 7B). We then quantified capecitabine and its metabolites in the serial fecal and blood samples using HPLC-HRMS. In blood samples, capecitabine and its major liver-derived metabolite (5′-deoxy-5-fluorocytidine), but not deglycocapecitabine, were readily detected and showed no significant differences between the two groups (Figure S7). In fecal samples, however, deglycocapecitabine was detected from animals colonized with PD
Figure 6. A Functional Metagenomic Screening Approach to Identify a Metabolizing Enzyme

(A) Schematic representation of the functional metagenomic screening approach.
(B) A scatterplot comparing the coverage of assembled PD scaffolds (≥2 Kbp, in RPKM) in the two metagenomic datasets (PD and PD-CL-100). Dots representing PD metagenomic scaffolds are colored and sized on the basis of their phylum-level taxonomic assignments and lengths, respectively, and as indicated in the key on the right. For ease of visualization, only scaffolds with RPKM values ≤10 are shown in this plot (97% of all scaffolds ≥2 Kbp) (see also Table S4A for the entire dataset).
(C) Functional metagenomic screening of the PD-CL library. Beginning with pools containing 2–6 × 10^4 unique clones, pools were selected and further sub-pooled based on their functional ability to convert hydrocortisone to 20β-dihydrocortisone. Produced 20β-dihydrocortisone levels were quantified using HPLC-HRMS as AUC normalized to an internal standard. For each round, the pool producing the highest normalized signal of 20β-dihydrocortisone (signified by a red dot with a black outline) was selected for further sub-pooling, until a unique clone encoding a 20β-HSDH activity was identified. A single 20β-HSDH gene from the positive metagenomic clone was further verified by heterologous expression in E. coli, when cloned as the native sequence (cloned) or synthesized as codon-optimized for E. coli (synth.), in comparison with an empty-vector control (empty vector).
(D) Genetic organization of the inserts from two unique clones identified using functional metagenomic screening for the 20β-HSDH activity (PD-CL-Hyd-red-1 and PD-CL-Hyd-red-2) in comparison with their corresponding scaffold assembled from the PD metagenome.
(E) A bar graph indicating the count of PD fecal metatranscriptomic reads that mapped to the discovered 20β-HSDH gene (red) and its flanking genes (gray).
(F) Heatmaps indicating the prevalence (in percent of subjects harboring the gene of interest) and abundance (in median RPKM for all positive subjects) of 20α-HSDH (from C. scindens) and 20β-HSDH (from the PD metagenome) across six gut metagenomic cohorts. The number of subjects analyzed in each cohort is indicated on the right. For cohorts with multiple sub-cohorts, the values reported are the averages of the sub-cohort values.
(G) Jitter plots of 20α-HSDH gene (from C. scindens) and 20β-HSDH gene (from the PD metagenome) abundances (in RPKM) for all positive subjects from the same cohorts.
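The sub-pooling logic in (C) can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the authors' protocol: the function name `deconvolve` is hypothetical, the assay is replaced by a boolean callback, each round partitions the pool into ten equal sub-pools (a real 10-fold dilution samples clones at random), and the first positive sub-pool is followed rather than the one with the highest signal. The starting pool is assumed to be positive.

```python
import math


def deconvolve(pool, is_positive, fold=10, plate_size=20):
    """Follow positive sub-pools until few enough clones remain to test singly.

    `pool` is a list of clone identifiers; `is_positive(sub_pool)` stands in
    for the HPLC-HRMS assay and returns True if the sub-pool shows activity.
    Returns (positive clones, number of sub-pooling rounds).
    """
    rounds = 0
    while len(pool) > plate_size:
        size = math.ceil(len(pool) / fold)
        sub_pools = [pool[i:i + size] for i in range(0, len(pool), size)]
        # Assumption: follow the first positive sub-pool; the caption
        # describes following the one with the highest signal.
        pool = next(sp for sp in sub_pools if is_positive(sp))
        rounds += 1
    # Final step: screen the remaining clones one by one.
    hits = [clone for clone in pool if is_positive([clone])]
    return hits, rounds


# Toy library: 2 x 10^4 clones, exactly one of which (12345) is active.
active = {12345}
hits, rounds = deconvolve(list(range(20_000)),
                          lambda sp: any(c in active for c in sp))
print(hits, rounds)   # [12345] 3
```

Three rounds take the toy pool from 2 × 10^4 to 20 clones, mirroring the 2 × 10^3, 2 × 10^2, and 2 × 10^1 levels described in the text, after which individual clones are screened.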

Figure 7. MDM Deglycosylation Occurs In Vivo
(A) Schematic representation of the microbiome-dependent pharmacokinetic experiment performed here.
(B) Design of the capecitabine pharmacokinetic experiment. Mice are treated with antibiotics for 14 days, then colonized with PD (n = 6) or left non-colonized (n = 6). On the pharmacokinetic experiment day, a single human-equivalent dose is administered to mice using oral gavage, and serial sampling of blood (B) and feces (F) is performed at 0, 20, 40, 60, 120, and 240 min post dosing.
(C) HPLC-HRMS based quantification of deglycocapecitabine in fecal samples from mice colonized with PD in comparison to non-colonized ones. Metabolite AUC per gram of feces is normalized by the AUC of the internal standard (see STAR Methods). Error bars represent the standard error of the mean. The difference between the two conditions is significant (p < 0.01, determined by testing the intersection null hypothesis with marginal two-tailed t tests using the Bonferroni correction to control family-wise error rate).
See also Figure S7.
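The intersection-null test mentioned in (C) can be sketched in Python as follows. This is a generic illustration of a Bonferroni-based global test, not the authors' analysis: the function name is hypothetical, the marginal p values are made up, and treating each sampling time point as one marginal t test is an assumption.

```python
def intersection_p_value(marginal_p_values):
    """Bonferroni p value for the intersection (global) null hypothesis."""
    m = len(marginal_p_values)
    # Reject the global null if the smallest marginal p value, scaled by
    # the number of tests, falls below the chosen significance level.
    return min(1.0, m * min(marginal_p_values))


# Made-up marginal p values, e.g. one two-tailed t test per time point.
p_global = intersection_p_value([0.40, 0.0012, 0.03, 0.008, 0.02, 0.15])
print(p_global < 0.01)   # True
```

Scaling the smallest marginal p value by the number of tests controls the family-wise error rate, so a single strongly significant comparison is enough to reject the hypothesis that the two conditions never differ.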
as early as 20 min after dosing and was almost completely absent in non-colonized ones (Figure 7C). These results indicate that, at least in the case of FP deglycosylation, MDM transformations observed ex vivo by MDM-Screen are recapitulated in vivo (i.e., in mice); establishing the same results in humans awaits further studies. They also suggest that MDM deglycosylation of certain FPs (e.g., doxifluridine, which is prematurely activated into 5-FU upon deglycosylation) should be investigated as a potential contributor to their undesired intestinal toxicity observed in the clinic, although future in vivo studies with different dosing regimens and a variety of FPs need to be performed.

DISCUSSION

In the current study, we developed a quantitative experimental workflow for assessing the ability of the human gut microbiome to directly metabolize orally administered drugs, using a combination of microbial community cultivation, small-molecule structural analysis, quantitative metabolomics, functional genomics and metagenomics, and mouse colonization assays. Several key differences set our approach apart from previous studies in this area. First, instead of relying on single isolates in performing the initial screen, we use well-characterized, subject-personalized microbial communities. Despite the technical challenges associated with characterizing and maintaining stable microbial communities in batch cultures, three main advantages make this strategy worth pursuing: (1) the extent of a biochemical transformation performed by single isolates cultured individually may be different than that performed by the same isolates when cultured as part of a complex community; (2) the net result of several members of the microbiome acting on the same drug can only be identified in mixed communities and not in single-isolate experiments, unless all pairwise and higher order permutations are tested; and (3) our strategy is “personalized.” The results obtained here, including the extent and type of certain modifications, are specific to the strain-level composition of each donor’s microbiome. MDM-Screen thus has a good potential for assessing inter-individual variability in MDM.
Second, most previous studies have focused on certain combinations of drugs and species that have historically been deemed important (e.g., have been readily observed in humans) or that are manageable experimentally. By default, our microbial-community setup allows us to screen a wider range of combinations, which enabled us to expand in either the drug or subject spaces and to discover drug-microbiome interactions never reported before. Notably, while this manuscript was under revision, an elegant study reported the screening of 271 orally administered drugs against 76 bacterial isolates of the human gut microbiome (Zimmermann et al., 2019a). Two thirds of the tested drugs were shown to be significantly depleted by at least one of the tested isolates, further emphasizing the great potential of gut microbes to metabolize orally administered small-molecule drugs. We view these two approaches as complementary: while screening drugs against optimized, well-characterized, donor-derived microbial communities in MDM-Screen provides a personalized view of drug metabolism that takes into account strain-level and community-wide contributions, screening drugs against a set of representative gut isolates streamlines the identification and characterization of specific taxon-drug and gene-drug interactions. Combined together, the results from the two approaches serve as a valuable resource for the scientific community to further study the mechanistic details and pharmacological consequences of newly discovered drug-microbiome interactions.
Despite these advances, our approach is still subject to several limitations. First, 24% of the drugs tested failed to be analyzed using the general analytical chemistry workflow described in MDM-Screen. These drugs fell into one or more of three main categories: unstable after overnight incubation in no-microbiome controls,
Cell 181, 1661–1679, June 25, 2020 1675

could not be extracted using ethyl acetate, or could not be analyzed using reverse phase chromatography. An alternative chemical analysis method will need to be developed for these molecules in order to assess their MDM. Second, we focused initially on oral drugs, yet several parenteral drugs and their liver-derived metabolites may be subject to important MDM transformations after biliary secretion. Third, even in our most diverse ex vivo cultures, we fail to support the growth of 100% of the community in the original sample. This limitation can potentially be overcome by utilizing multiple distinct media conditions that each capture unique portions of the community. ENDS provides the theoretical framework for selecting an optimal ensemble of media conditions, and we show in STAR Methods how to compute a version of ENDS that estimates the number of detectable strains gained by testing additional media.
We developed our screen in two stages. We began with a single human sample, PD, and incubated its ex vivo culture with 575 drugs. We then transitioned into an HT format with more rigorous methods for media selection, drug and metabolite quantification, and metabolite discovery and used these methods to screen ex vivo cultures from 20 human donors against 23 drugs. A simultaneous expansion into hundreds of drugs and hundreds of donor samples is necessary to reveal the complete biochemical potential of MDM: it is very likely that the types of MDM transformations observed here are an underestimation of all possible ones. With the HT experimental approach and automatic targeted and untargeted metabolomic analyses developed here, we have laid the groundwork for this expansion. Finally, and most relevant from a clinical standpoint, a direct comparison between drug metabolism outcomes in humans and in MDM-Screen for the same cohort of donors is important to establish which MDM transformations can be observed in humans, and to quantify the magnitude by which inter-individual variability in MDM-Screen recapitulates that which occurs in humans. Our quantitative framework, on both the microbial community and metabolomic angles, provides the necessary tools to perform such a comparison.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:
● KEY RESOURCES TABLE

  ○ Lead Contact
  ○ Materials Availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Human subject samples
  ○ Bacterial strains and conditions
  ○ Mice
● METHOD DETAILS
  ○ Fecal sample processing for PD and D1-20
  ○ ex vivo culture of PD
  ○ ex vivo culture of D1-20
  ○ 16S rRNA gene amplicon sequencing and analysis
  ○ Measurement of biomass for cultured D1-20
  ○ ex vivo screening of the drug library with PD
  ○ Structural elucidation of selected metabolites
  ○ Molecular networking analysis in PD screen
  ○ Enrichment analysis for drugs in PD screen
  ○ Gene abundance analysis in metagenomic cohorts
  ○ ENDS (Expected Number of Detectable Strains)
  ○ High-throughput screen with D1-20
  ○ Targeted quantitative metabolomics analysis
  ○ Untargeted metabolomics analysis
  ○ Isolate screen for capecitabine
  ○ Metagenomic library construction
  ○ Functional screening of the metagenomic library
  ○ Heterologous expression of PD-derived 20β-HSDH
  ○ Metagenomic and metatranscriptomic analyses
  ○ TP and UP gene deletions in E. coli BW25113
  ○ MDM-Screen of capecitabine using E. coli mutants
  ○ MDM-Screen of other FPs using E. coli mutants
  ○ Microbiome-dependent pharmacokinetic experiment
● QUANTIFICATION AND STATISTICAL ANALYSIS

SUPPLEMENTAL INFORMATION

cell.2020.05.001.

ACKNOWLEDGMENTS

We would like to thank Wei Wang and the Lewis Sigler Institute sequencing core facility for assistance with HT sequencing; Matthew Cahn and Abhishek Biswas for assistance with sequencing data analysis; Shuo Wang for assistance with the functional group analysis; Joseph Koos, A. James Link, and Yuki Sugimoto for assistance with Mass Spectrometry; Riley Skeen-Gaar for assistance with statistical analysis; Joseph Sheehan and Zemer Gitai for assistance with obtaining the Keio library mutants; the Laboratory Animal Resources at Princeton University for assistance with mouse studies; Janie Kim for illustrating the graphical abstract; and members of the Donia lab for useful discussions. We are grateful to the 21 anonymous donors who provided the fecal samples that made this project possible. Figure S6 and a part of Figure 7 were created with BioRender.com. Funding for this project has been provided by an Innovation Award from the Department of Molecular Biology, Princeton University and an NIH Director’s New Innovator Award (1DP2AI124441), both to M.S.D. B.J. is funded by a New Jersey Commission on Cancer Research Pre-doctoral award (DFHS18PPC056), Y.-C.J.L. is funded by a training grant from the National Institute of General Medical Sciences (T32GM007388), and J.G.L. is funded by a National Science Foundation Graduate Research Fellowship (2017249408).

AUTHOR CONTRIBUTIONS

M.S.D. conceived and directed the research, designed the study, and obtained funding. P.C. and B.J. performed the PD culturing and screening experiments. B.J. performed the D1-D20 culturing and screening experiments. P.C., B.J., Q.W., X.W., and M.S.D. analyzed the PD screening data and characterized the new metabolites. J.G.L. performed the statistical, computational, metabolomic, and quantitative analyses. R.H. and Y.-C.J.L. performed the functional metagenomic screening experiments. S.C. isolated metagenomic DNA and RNA and prepared them for sequencing. M.S.D., B.J., and J.G.L. wrote the manuscript, with input from all authors.

M.S.D. is a member of the scientific advisory board of DeepBiome Therapeutics. A patent is being filed by Princeton University for the use of quantitative MDM-Screen to measure inter-individual variability in drug metabolism.
Received: March 18, 2019
Revised: January 7, 2020
Accepted: April 29, 2020
Published: June 10, 2020

REFERENCES

Almeida, A., Mitchell, A.L., Boland, M., Forster, S.C., Gloor, G.B., Tarkowska, A., Lawley, T.D., and Finn, R.D. (2019). A new genomic blueprint of the human gut microbiota. Nature 568, 499–504.
Azadkhan, A.K., Truelove, S.C., and Aronson, J.K. (1982). The disposition and metabolism of sulphasalazine (salicylazosulphapyridine) in man. Br. J. Clin. Pharmacol. 13, 523–528.
Baba, T., Ara, T., Hasegawa, M., Takai, Y., Okumura, Y., Baba, M., Datsenko, K.A., Tomita, M., Wanner, B.L., and Mori, H. (2006). Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2. https://doi.org/10.1038/msb4100050.
Bäckhed, F., Ley, R.E., Sonnenburg, J.L., Peterson, D.A., and Gordon, J.I. (2005). Host-bacterial mutualism in the human intestine. Science 307, 1915–1920.
Bankevich, A., Nurk, S., Antipov, D., Gurevich, A.A., Dvorkin, M., Kulikov, A.S., Lesin, V.M., Nikolenko, S.I., Pham, S., Prjibelski, A.D., et al. (2012). SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455–477.
Benjamini, Y., and Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. B 57, 289–300.
Bokulich, N.A., Kaehler, B.D., Rideout, J.R., Dillon, M., Bolyen, E., Knight, R., Huttley, G.A., and Gregory Caporaso, J. (2018). Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin. Microbiome 6, 90.
Bolyen, E., Rideout, J.R., Dillon, M.R., Bokulich, N.A., Abnet, C., Al-Ghalith, G.A., Alexander, H., Alm, E.J., Arumugam, M., Asnicar, F., et al. (2018). QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science. PeerJ Preprints 6, e27295v27292.
Brady, S.F., Chao, C.J., and Clardy, J. (2002). New natural product families from an environmental DNA (eDNA) gene cluster. J. Am. Chem. Soc. 124, 9968–9969.
Brito, I.L., Yilmaz, S., Huang, K., Xu, L., Jupiter, S.D., Jenkins, A.P., Naisilisili, W., Tamminen, M., Smillie, C.S., Wortman, J.R., et al. (2016). Mobile genes in the human microbiome are structured from global to individual scales. Nature 535, 435–439.
Bullingham, R.E., Nicholls, A.J., and Kamm, B.R. (1998). Clinical pharmacokinetics of mycophenolate mofetil. Clin. Pharmacokinet. 34, 429–455.
Callahan, B.J., McMurdie, P.J., Rosen, M.J., Han, A.W., Johnson, A.J., and Holmes, S.P. (2016). DADA2: High-resolution sample inference from Illumina amplicon data. Nat. Methods 13, 581–583.
Caporaso, J.G., Lauber, C.L., Walters, W.A., Berg-Lyons, D., Huntley, J., Fierer, N., Owens, S.M., Betley, J., Fraser, L., Bauer, M., et al. (2012). Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms. ISME J. 6, 1621–1624.
Clayton, T.A., Baker, D., Lindon, J.C., Everett, J.R., and Nicholson, J.K. (2009). Pharmacometabonomic identification of a significant host-microbiome metabolic interaction affecting human drug metabolism. Proc. Natl. Acad. Sci. USA 106, 14728–14733.
Cleary, J.M., Rosen, L.S., Yoshida, K., Rasco, D., Shapiro, G.I., and Sun, W. (2017). A phase 1 study of the pharmacokinetics of nucleoside analog trifluridine and thymidine phosphorylase inhibitor tipiracil (components of TAS-102) vs trifluridine alone. Invest. New Drugs 35, 189–197.
Cohen, L.J., Kang, H.S., Chu, J., Huang, Y.H., Gordon, E.A., Reddy, B.V., Ternei, M.A., Craig, J.W., and Brady, S.F. (2015). Functional metagenomic discovery of bacterial effectors in the human microbiome and isolation of commendamide, a GPCR G2A/132 agonist. Proc. Natl. Acad. Sci. USA 112, E4825–E4834.
Datsenko, K.A., and Wanner, B.L. (2000). One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl. Acad. Sci. USA 97, 6640–6645.
Devendran, S., Méndez-García, C., and Ridlon, J.M. (2017). Identification and characterization of a 20β-HSDH from the anaerobic gut bacterium Butyricicoccus desmolans ATCC 43058. J. Lipid Res. 58, 916–925.
Doden, H.L., Pollet, R.M., Mythen, S.M., Wawrzak, Z., Devendran, S., Cann, I., Koropatkin, N.M., and Ridlon, J.M. (2019). Structural and biochemical characterization of 20β-hydroxysteroid dehydrogenase from Bifidobacterium adolescentis strain L2-32. J. Biol. Chem. 294, 12040–12053.
Elmer, G.W., and Remmel, R.P. (1984). Role of the intestinal microflora in clonazepam metabolism in the rat. Xenobiotica 14, 829–840.
Falony, G., Joossens, M., Vieira-Silva, S., Wang, J., Darzi, Y., Faust, K., Kurilshikov, A., Bonder, M.J., Valles-Colomer, M., Vandeputte, D., et al. (2016). Population-level analysis of gut microbiome variation. Science 352, 560–564.
Gardiner, P., Schrode, K., Quinlan, D., Martin, B.K., Boreham, D.R., Rogers, M.S., Stubbs, K., Smith, M., and Karim, A. (1989). Spironolactone metabolism: steady-state serum levels of the sulfur-containing metabolites. J. Clin. Pharmacol. 29, 342–347.
Genee, H.J., Bali, A.P., Petersen, S.D., Siedler, S., Bonde, M.T., Gronenberg, L.S., Kristensen, M., Harrison, S.J., and Sommer, M.O. (2016). Functional mining of transporters using synthetic selections. Nat. Chem. Biol. 12, 1015–1022.
Gillespie, D.E., Brady, S.F., Bettermann, A.D., Cianciotto, N.P., Liles, M.R., Rondon, M.R., Clardy, J., Goodman, R.M., and Handelsman, J. (2002). Isolation of antibiotics turbomycin A and B from a metagenomic library of soil microbial DNA. Appl. Environ. Microbiol. 68, 4301–4306.
Goodman, A.L., Kallstrom, G., Faith, J.J., Reyes, A., Moore, A., Dantas, G., and Gordon, J.I. (2011). Extensive personal human gut microbiota culture collections characterized and manipulated in gnotobiotic mice. Proc. Natl. Acad. Sci. USA 108, 6252–6257.
Haiser, H.J., Gootenberg, D.B., Chatman, K., Sirasani, G., Balskus, E.P., and Turnbaugh, P.J. (2013). Predicting and manipulating cardiac drug inactivation by the human gut bacterium Eggerthella lenta. Science 341, 295–298.
Harrison, D.A., and Brady, A.R. (2004). Sample Size and Power Calculations using the Noncentral t-distribution. Stata J. 4, 142–153.
Human Microbiome Project Consortium (2012). Structure, function and diversity of the healthy human microbiome. Nature 486, 207–214.
Hunter, J.D. (2007). Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 9, 90–95.
Iida, N., Dzutsev, A., Stewart, C.A., Smith, L., Bouladoux, N., Weingarten, R.A., Molina, D.A., Salcedo, R., Back, T., Cramer, S., et al. (2013). Commensal bacteria control cancer response to therapy by modulating the tumor microenvironment. Science 342, 967–970.
Ilett, K.F., Tee, L.B., Reeves, P.T., and Minchin, R.F. (1990). Metabolism of drugs and other xenobiotics in the gut lumen and wall. Pharmacol. Ther. 46, 67–93.
Jorga, K., Fotteler, B., Heizmann, P., and Gasser, R. (1999). Metabolism and excretion of tolcapone, a novel inhibitor of catechol-O-methyltransferase. Br. J. Clin. Pharmacol. 48, 513–520.
Karmarkar, D., and Rock, K.L. (2013). Microbiota signalling through MyD88 is necessary for a systemic neutrophilic inflammatory response. Immunology 140, 483–492.
Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., Duran, C., et al. (2012). Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649.
Kim, N.K., Min, J.S., Park, J.K., Yun, S.H., Sung, J.S., Jung, H.C., and Roh, J.K. (2001). Intravenous 5-fluorouracil versus oral doxifluridine as preoperative concurrent chemoradiation for locally advanced rectal cancer: prospective randomized trials. Jpn. J. Clin. Oncol. 31, 25–29.
Kimura, T., Sudo, K., Kanzaki, Y., Miki, K., Takeichi, Y., Kurosaki, Y., and Nakayama, T. (1994). Drug absorption from large intestine: physicochemical factors governing drug absorption. Biol. Pharm. Bull. 17, 327–333.
Koppel, N., Maini Rekdal, V., and Balskus, E.P. (2017). Chemical transformation of xenobiotics by the human gut microbiota. Science 356, eaag2770.
Koppel, N., Bisanz, J.E., Pandelia, M.E., Turnbaugh, P.J., and Balskus, E.P. (2018). Discovery and characterization of a prevalent human gut bacterial enzyme sufficient for the inactivation of a family of plant toxins. eLife 7, e33953.
Kuroiwa, M., Inotsume, N., Iwaoku, R., and Nakano, M. (1985). [Reduction of dantrolene by enteric bacteria]. Yakugaku Zasshi 105, 770–774.
Kuroiwa, M., Inotsume, N., and Nakano, M. (1986). [Reduction of nicardipine, calcium antagonist, with enteric bacteria]. Yakugaku Zasshi 106, 698–702.
Lamont, E.B., and Schilsky, R.L. (1999). The oral fluoropyrimidines in cancer chemotherapy. Clin. Cancer Res. 5, 2289–2296.
Langmead, B., and Salzberg, S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359.
Lenz, H.J., Stintzing, S., and Loupakis, F. (2015). TAS-102, a novel antitumor agent: a review of the mechanism of action. Cancer Treat. Rev. 41, 777–783.
Li, H., and Jia, W. (2013). Cometabolism of microbes and host: implications for drug metabolism and drug-induced toxicity. Clin. Pharmacol. Ther. 94, 574–581.
Lindenbaum, J., Tse-Eng, D., Butler, V.P., Jr., and Rund, D.G. (1981). Urinary excretion of reduced metabolites of digoxin. Am. J. Med. 71, 67–74.
Lloyd-Price, J., Mahurkar, A., Rahnavard, G., Crabtree, J., Orvis, J., Hall, A.B., Brady, A., Creasy, H.H., McCracken, C., Giglio, M.G., et al. (2017). Strains, functions and dynamics in the expanded Human Microbiome Project. Nature 550, 61–66.
Longley, D.B., Harkin, D.P., and Johnston, P.G. (2003). 5-fluorouracil: mechanisms of action and clinical strategies. Nat. Rev. Cancer 3, 330–338.
Maier, L., Pruteanu, M., Kuhn, M., Zeller, G., Telzerow, A., Anderson, E.E., Brochado, A.R., Fernandez, K.C., Dose, H., Mori, H., et al. (2018). Extensive impact of non-antibiotic drugs on human gut bacteria. Nature 555, 623–628.
Maini Rekdal, V., Bess, E.N., Bisanz, J.E., Turnbaugh, P.J., and Balskus, E.P. (2019). Discovery and inhibition of an interspecies gut bacterial pathway for Levodopa metabolism. Science 364, eaau6323.
Mannens, G., Huang, M.L., Meuldermans, W., Hendrickx, J., Woestenborghs, R., and Heykants, J. (1993). Absorption, metabolism, and excretion of risperidone in humans. Drug Metab. Dispos. 21, 1134–1141.
McDonald, D., Price, M.N., Goodrich, J., Nawrocki, E.P., DeSantis, T.Z., Probst, A., Andersen, G.L., Knight, R., and Hugenholtz, P. (2012). An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J. 6, 610–618.
McDonald, D., Hyde, E., Debelius, J.W., Morton, J.T., Gonzalez, A., Ackermann, G., Aksenov, A.A., Behsaz, B., Brennan, C., Chen, Y., et al.; American Gut Consortium (2018). American Gut: an Open Platform for Citizen Science Microbiome Research. mSystems 3, e00031-18.
McKinney, W.G. (2010). Data structures for statistical computing in python. Proceedings of the 9th Python in Science Conference. 445, 52–56.
Meinl, W., Sczesny, S., Brigelius-Flohé, R., Blaut, M., and Glatt, H. (2009). Impact of gut microbiota on intestinal and hepatic levels of phase 2 xenobiotic-metabolizing enzymes in the rat. Drug Metab. Dispos. 37, 1179–1186.
Meuldermans, W., Hendrickx, J., Mannens, G., Lavrijsen, K., Janssen, C., Bracke, J., Le Jeune, L., Lauwers, W., and Heykants, J. (1994). The metabolism and excretion of risperidone after oral administration in rats and dogs. Drug Metab. Dispos. 22, 129–138.
Min, J.S., Kim, N.K., Park, J.K., Yun, S.H., and Noh, J.K. (2000). A prospective randomized trial comparing intravenous 5-fluorouracil and oral doxifluridine as postoperative adjuvant treatment for advanced rectal cancer. Ann. Surg. Oncol. 7, 674–679.
Nayfach, S., Shi, Z.J., Seshadri, R., Pollard, K.S., and Kyrpides, N.C. (2019). New insights from uncultivated genomes of the global human gut microbiome. Nature 568, 505–510.
Nielsen, H.B., Almeida, M., Juncker, A.S., Rasmussen, S., Li, J., Sunagawa, S., Plichta, D.R., Gautier, L., Pedersen, A.G., Le Chatelier, E., et al.; MetaHIT Consortium (2014). Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes. Nat. Biotechnol. 32, 822–828.
Northfield, T.C., and McColl, I. (1973). Postprandial concentrations of free and conjugated bile acids down the length of the normal human small intestine. Gut 14, 513–518.
O’Boyle, N.M., Banck, M., James, C.A., Morley, C., Vandermeersch, T., and Hutchison, G.R. (2011). Open Babel: An open chemical toolbox. J. Cheminform. 3, 33.
Oliphant, T.E. (2006). A guide to NumPy (USA: Trelgol Publishing).
Pasolli, E., Asnicar, F., Manara, S., Zolfo, M., Karcher, N., Armanini, F., Beghini, F., Manghi, P., Tett, A., Ghensi, P., et al. (2019). Extensive Unexplored Human Microbiome Diversity Revealed by Over 150,000 Genomes from Metagenomes Spanning Age, Geography, and Lifestyle. Cell 176, 649–662.
Peppercorn, M.A., and Goldman, P. (1972). The role of intestinal bacteria in the metabolism of salicylazosulfapyridine. J. Pharmacol. Exp. Ther. 181, 555–562.
Planer, J.D., Peng, Y., Kau, A.L., Blanton, L.V., Ndao, I.M., Tarr, P.I., Warner, B.B., and Gordon, J.I. (2016). Development of the gut microbiota and mucosal IgA responses in twins and gnotobiotic mice. Nature 534, 263–266.
Qin, J., Li, R., Raes, J., Arumugam, M., Burgdorf, K.S., Manichanh, C., Nielsen, T., Pons, N., Levenez, F., Yamada, T., et al.; MetaHIT Consortium (2010). A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464, 59–65.
Qin, J., Li, Y., Cai, Z., Li, S., Zhu, J., Zhang, F., Liang, S., Zhang, W., Guan, Y., Shen, D., et al. (2012). A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55–60.
Rettedal, E.A., Gumpert, H., and Sommer, M.O. (2014). Cultivation-based multiplex phenotyping of human gut microbiota allows targeted recovery of previously uncultured bacteria. Nat. Commun. 5, 4714.
Ridlon, J.M., Ikegawa, S., Alves, J.M., Zhou, B., Kobayashi, A., Iida, T., Mitamura, K., Tanabe, G., Serrano, M., De Guzman, A., et al. (2013). Clostridium scindens: a human gut microbe with a high potential to convert glucocorticoids into androgens. J. Lipid Res. 54, 2437–2449.
Rondon, M.R., August, P.R., Bettermann, A.D., Brady, S.F., Grossman, T.H., Liles, M.R., Loiacono, K.A., Lynch, B.A., MacNeil, I.A., Minor, C., et al. (2000). Cloning the soil metagenome: a strategy for accessing the genetic and functional diversity of uncultured microorganisms. Appl. Environ. Microbiol. 66, 2541–2547.
Scheline, R.R. (1973). Metabolism of foreign compounds by gastrointestinal microorganisms. Pharmacol. Rev. 25, 451–523.
Schmieder, R., and Edwards, R. (2011). Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863–864.
Schoenhard, G., Oppermann, J., and Kohn, F.E. (1985). Metabolism and pharmacokinetic studies of misoprostol. Dig. Dis. Sci. 30 (11, Suppl), 126S–128S.
Shannon, P., Markiel, A., Ozier, O., Baliga, N.S., Wang, J.T., Ramage, D., Amin, N., Schwikowski, B., and Ideker, T. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504.
Sica, D.A. (2005). Pharmacokinetics and pharmacodynamics of mineralocorticoid blocking agents and their effects on potassium homeostasis. Heart Fail. Rev. 10, 23–29.
Sivan, A., Corrales, L., Hubert, N., Williams, J.B., Aquino-Michaels, K., Earley, Z.M., Benyamin, F.W., Lei, Y.M., Jabri, B., Alegre, M.L., et al. (2015). Commensal Bifidobacterium promotes antitumor immunity and facilitates anti-PD-L1 efficacy. Science 350, 1084–1089.
Smith, K.S., Smith, P.L., Heady, T.N., Trugman, J.M., Harman, W.D., and Macdonald, T.L. (2003). In vitro metabolism of tolcapone to reactive intermediates: relevance to tolcapone liver toxicity. Chem. Res. Toxicol. 16, 123–128.
Sommer, M.O.A., Dantas, G., and Church, G.M. (2009). Functional characterization of the antibiotic resistance reservoir in the human microflora. Science 325, 1128–1131.
Spanogiannopoulos, P., Bess, E.N., Carmody, R.N., and Turnbaugh, P.J. (2016). The microbial pharmacists within us: a metagenomic view of xenobiotic metabolism. Nat. Rev. Microbiol. 14, 273–287.
Storey, J.D. (2002). A Direct Approach to False Discovery Rates. J. R. Stat. Soc. Series B Stat. Methodol. 64, 479–498.
Sugimoto, Y., Camacho, F.R., Wang, S., Chankhamjon, P., Odabas, A., Biswas, A., Jeffrey, P.D., and Donia, M.S. (2019). A metagenomic strategy for harnessing the chemical repertoire of the human microbiome. Science 366, eaax9176.
Taylor, M.R., Flannigan, K.L., Rahim, H., Mohamud, A., Lewis, I.A., Hirota, S.A., and Greenway, S.C. (2019). Vancomycin relieves mycophenolate mofetil-induced gastrointestinal toxicity by eliminating gut bacterial β-glucuronidase activity. Sci Adv 5, eaax2358.
Temmink, O.H., de Bruin, M., Turksma, A.W., Cricca, S., Laan, A.C., and Peters, G.J. (2007). Activity and substrate specificity of pyrimidine phosphorylases and their role in fluoropyrimidine sensitivity in colon cancer cell lines. Int. J. Biochem. Cell Biol. 39, 565–575.
Tramontano, M., Andrejev, S., Pruteanu, M., Klünemann, M., Kuhn, M., Galardini, M., Jouhten, P., Zelezniak, A., Zeller, G., Bork, P., et al. (2018). Nutritional preferences of human gut bacteria reveal their metabolic idiosyncrasies. Nat. Microbiol. 3, 514–522.
Tsai, B.S., Kessler, L.K., Stolzenbach, J., Schoenhard, G., and Bauer, R.F. (1991). Expression of gastric antisecretory and prostaglandin E receptor binding activity of misoprostol by misoprostol free acid. Dig. Dis. Sci. 36, 588–593.
Uribe, R.V., van der Helm, E., Misiakou, M.A., Lee, S.W., Kol, S., and Sommer, M.O.A. (2019). Discovery and Characterization of Cas9 Inhibitors Disseminated across Seven Bacterial Phyla. Cell Host Microbe 25, 233–241.
van Kessel, S.P., Frye, A.K., El-Gendy, A.O., Castejon, M., Keshavarzian, A., van Dijk, G., and El Aidy, S. (2019). Gut bacterial tyrosine decarboxylases restrict levels of levodopa in the treatment of Parkinson’s disease. Nat. Commun. 10, 310.
Vétizou, M., Pitt, J.M., Daillère, R., Lepage, P., Waldschmitt, N., Flament, C., Rusakiewicz, S., Routy, B., Roberti, M.P., Duong, C.P., et al. (2015). Anticancer immunotherapy by CTLA-4 blockade relies on the gut microbiota. Science 350, 1079–1084.
Wallace, B.D., Wang, H., Lane, K.T., Scott, J.E., Orans, J., Koo, J.S., Venkatesh, M., Jobin, C., Yeh, L.A., Mani, S., and Redinbo, M.R. (2010). Alleviating cancer drug toxicity by inhibiting a bacterial enzyme. Science 330, 831–835.
Wang, M., Carver, J.J., Phelan, V.V., Sanchez, L.M., Garg, N., Peng, Y., Nguyen, D.D., Watrous, J., Kapono, C.A., Luzzatto-Knaan, T., et al. (2016). Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat. Biotechnol. 34, 828–837.
Winter, J., Cerone-McLernon, A., O’Rourke, S., Ponticorvo, L., and Bokkenheuser, V.D. (1982). Formation of 20β-dihydrosteroids by anaerobic bacteria. J. Steroid Biochem. 17, 661–667.
Wood, D.E., and Salzberg, S.L. (2014). Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 15, R46.
Zampino, M.G., Colleoni, M., Bajetta, E., Stampino, C.G., Guenzi, A., and de Braud, F. (1999). Pharmacokinetics of oral doxifluridine in patients with colorectal cancer. Tumori 85, 47–50.
Zimmermann, M., Zimmermann-Kogadeeva, M., Wegmann, R., and Goodman, A.L. (2019a). Mapping human microbiome drug metabolism by gut bacteria and their genes. Nature 570, 462–467.
Zimmermann, M., Zimmermann-Kogadeeva, M., Wegmann, R., and Goodman, A.L. (2019b). Separating host and microbiome contributions to drug pharmacokinetics and toxicity. Science 363, eaat9931.
STAR★METHODS
KEY RESOURCES TABLE
REAGENT OR RESOURCE SOURCE IDENTIFIER

Chemicals, Media, and Reagents
7a-Thiospironolactone Toronto Research Chemicals Cat#T375000
10mM dNTP mix Life Technologies Cat#18427-088
2-Propanol Fisher Scientific Cat#A451-4
20-alpha-dihydrocortisol MuseChem Cat#M122174
20-beta-dihydrocortisol MuseChem Cat#R060042
Acetonitrile Fisher Scientific Cat#A998-4
Allopurinol Sigma Cat#A8003
Ammonium Chloride Sigma-Aldrich Cat#A4514-500 g
Ampicillin Sigma-Aldrich Cat#A1593-25G
Antarctic Phosphatase New England Biolabs (NEB) Cat#M0289S
Aspartame Sigma Cat#47135
Azathioprine Sigma Cat#A4638
Bacto Agar Fisher Scientific Cat#214010
BD Bacto Peptone Fisher Scientific Cat#S71604
Beef Extract Fisher Scientific Cat#S25661A
Bisacodyl Sigma Cat#B1390
Brain Heart Infusion VWR Cat#90003-032 (EA)
Brewer Thioglycollate Medium Sigma Cat#B2551-500 g
Bryant and Burkey Medium Sigma Cat#91903-500 g
Calcium Chloride, dihydrate Sigma Cat#C8106-500 g
Capecitabine Sigma Cat#PHR1405
Clofazimine Sigma Cat#C8895
Clonazepam Sigma Cat#C1277
Cooked Meat Broth, Microbiology Sigma Cat#60865-500 g
D-(+)-Glucose Sigma Cat#G8270-1KG
Digoxin Sigma Cat#D6003
Dihydrodigoxin Toronto Research Chemicals Cat#D452680
Dimethyl sulfoxide Sigma Cat#D8418-100ML
Dolasetron Sigma Cat#CDS021594
Doxifluridine Sigma Cat#F8791-100MG
DpnI New England Biolabs (NEB) Cat#R0176S
EDTA, disodium salt Sigma-Aldrich Cat#E5134-1kg
Ethyl Acetate Fisher Scientific Cat#E195-4
Famciclovir Sigma Cat#F7932
Formic acid Fisher Scientific Cat#A117-50
GAM Broth Modified HyServe Cat#5433
Glycerol Fisher Scientific Cat#BP2291
Hemin BioXtra Sigma-Aldrich Cat#51280-5G
Hydrocortisone Sigma Cat#H4001
IPTG, dioxane-free Thermo Scientific Cat#R0393
Kanamycin Sigma Cat#K4000-5G
Ketoconazole Sigma Cat#K1003
Ketoprofen Sigma Cat#K1751
e1 Cell 181, 1661–1679.e1–e15, June 25, 2020

L-arabinose Sigma Cat#A3256-25G
L-Cysteine Fisher Scientific Cat#ICN19464625
LB Broth Sigma Cat#L3522-1Kg
Levonorgestrel Santa Cruz Biotech Cat#SC205731
Liver broth Sigma Cat#61724-500 g
Lovastatin Sigma Cat#PHR1285
M17 BROTH Fisher Scientific Cat#OXCMO817B
Magnesium Sulfate, heptahydrate Sigma Cat#230391-500 g
Meat extract Sigma-Aldrich Cat#70164-500G
MegaX DH10b Electrocompetent Cells Thermo Fisher Scientific Cat#C640003
Methanol Fisher Scientific Cat#A452-4
Metronidazole Sigma Cat#M3761
Milli-q water Millipore Corporation N/A
Misoprostol Sigma Cat#M6807
Misoprostol free acid Sigma Cat#M6932
MRS Broth Sigma/Millipore Cat#69966-500G
Mycophenolic acid Sigma Cat#M5255
Mycophenolate mofetil Sigma Cat#SML0284
Neomycin Sigma-Aldrich Cat#N1876-25G
Nicardipine Sigma Cat#N7510
Nitrofurantoin Sigma Cat#N7878
Pellet Paint NF Co-Precipitant EMD Millipore Cat#70748
pGFPuv Clontech Laboratories Cat#632312
Phusion High-Fidelity DNA Polymerase New England Biolabs (NEB) Cat#M0530S
Plasmid pCP20 encoding the FLP recombinase Gitai lab, Princeton University (Datsenko and Wanner, 2000)
Plasmid pKD46 expressing the Lambda Red recombinase Fischbach lab, Stanford University (Datsenko and Wanner, 2000)
Potassium phosphate monobasic Sigma Cat#P5655-500G
Praziquantel Sigma Cat#P4668
Reinforced Clostridial Medium Fisher Scientific Cat#DF1808-17-3
Resazurin Santa Cruz Biotechnology Cat#62758-13-8
Risperidone Sigma Cat#PHR1631
Ropinirole Sigma Cat#R2530
Screen-Well FDA approved drug library Enzo Life Sciences Cat#BML-2843-0100
Sodium Bicarbonate Sigma Cat#S6014-500 g
Sodium chloride Sigma-Aldrich Cat#S7653-5KG
Sodium hydroxide Sigma-Aldrich Cat#795429-500G
Spironolactone Sigma Cat#S3378
Sulfamethoxazole Sigma Cat#S7507
Sulfasalazine Sigma Cat#S0883
Sulindac Cayman Cat#38194-50-2
T4 DNA ligase NEB Cat#M0202T
Terrific Broth Fisher Scientific Cat#DF0438-17
Thioglycolate Broth Sigma Cat#70157-500 g
Tinidazole Santa Cruz Cat#sc-205862
Tolcapone Sigma Cat#SML0150
Trace Mineral Supplement ATCC ATCC MD-TMS
Trifluridine Sigma Cat#T2255-100MG
Tryptic Soy Broth Fisher Scientific Cat#DF0370-17-3
Tryptic Soy Broth Thermo Fisher Cat#DF0370-17-3
Trypticase Peptone VWR Cat#90000-434 (EA)
Tween 80 Sigma-Aldrich Cat#P1754-500ml
Vancomycin Sigma-Aldrich Cat#V2002-1G
Vitamin K1 Sigma-Aldrich Cat#95271-250MG
Vitamin Mix ATCC ATCC MD-VS
Voriconazole Cayman Cat#15633
Vorinostat Sigma Aldrich Cat#SML0061
Yeast extract VWR Cat#90000-026 (EA)
Zidovudine Sigma Cat#PHR1292
Experimental models: Organisms/Strains
Anaerococcus prevotii Fischbach lab, Stanford University N/A
Anaerostipes caccae DSM 14662 DSMZ DSM 14662
Clostridium bolteae ATCC BAA-613 ATCC ATCC BAA-613
E. coli BW25113 Keio collection (Baba et al., 2006)
E. coli BW25113 mutants harboring replacement of deoA or udp with a kanamycin resistance gene Keio collection (Baba et al., 2006)
E. coli BW25113 clean mutants of deoA or udp or both This study Methods
E. coli BL21 NEB Cat#C2527I
E. coli Stellar Clontech Laboratories Cat#636763
Enterococcus faecalis TYG’11 This paper Methods
Escherichia coli TYG1 This paper Methods
Escherichia coli TYG2 This paper Methods
Lactobacillus gasseri JV-V03 BEI Resources HM-104
Parabacteroides distasonis CL09T03C24 Comstock lab, Harvard Medical School N/A
Prevotella bivia ATCC 29303 ATCC ATCC 29303
Salmonella enterica SL484 Ravel lab, UMB N/A
Serratia marcescens Fischbach lab, Stanford University N/A
Oligonucleotides
Primers used in this study This study Methods
Software and Algorithms
Adobe Illustrator Adobe N/A
Agilent Profinder 8.0 Agilent N/A
Agilent Qualitative Analysis 10.0 Agilent N/A
Agilent Quantitative Analysis 9.0 Agilent N/A
Bowtie2 (Langmead and Salzberg, 2012) http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
Blast 2.7.1+ NCBI https://blast.ncbi.nlm.nih.gov/Blast.cgi
ChemDraw Professional 16.0 PerkinElmer N/A
Cytoscape (Shannon et al., 2003) https://cytoscape.org/
ENDS This study https://github.com/donia-lab/personalized_community_MDM_screen
Geneious R9 Geneious https://www.geneious.com/
Global Natural Products Social Molecular Networking (Wang et al., 2016) https://gnps.ucsd.edu
Illumina HiSeq Control Software Illumina N/A
Kraken-1.1.1 (Wood and Salzberg, 2014) http://ccb.jhu.edu/software/kraken
MATLAB R2017b Mathworks https://www.mathworks.com
Matplotlib 2.1.0 (Hunter, 2007) https://matplotlib.org/
MestreNova 10.0 Mestrelab Research N/A
NumPy 1.14.2 (Oliphant, 2006) https://www.numpy.org/
Open Babel (O’Boyle et al., 2011) http://openbabel.org
Pandas 0.22.0 (McKinney, 2010) https://pandas.pydata.org/
PRINSEQ-lite v0.20.4B (Schmieder and Edwards, 2011) http://prinseq.sourceforge.net/
ProteoWizard N/A http://proteowizard.sourceforge.net/
Python 3.6.3 Python Core Team https://www.python.org/
QIIME2 version 2018.6 (Bolyen et al., 2018) https://qiime2.org
SPAdes v3.11.0 (Bankevich et al., 2012) http://cab.spbu.ru/software/spades/
Targeted and untargeted metabolomics pipeline This study https://github.com/donia-lab/personalized_community_MDM_screen
Deposited Data
Sequencing data This study NCBI BioProject: PRJNA593062
Metabolomics data This study MassIVE: MSV000084641
Critical Commercial Assays
DNeasy Power Soil Kit(100) QIAGEN Cat#12888-100
End-It DNA End-Repair Kit Epicenter Cat#ER0720
In-Fusion HD Cloning System (50RXNS) TAKARA Cat#639646
Qiaprep Spin mini prep(250) kit QIAGEN Cat#27106
Qiaquick Gel Extraction Kit,250 QIAGEN Cat#8706
QIAquick PCR Purification kit QIAGEN Cat#28106
RNAlater Stabilization Solution Fisher Scientific Cat#AM7021
Zymoclean Large Fragment DNA Recovery Kit Zymo Research Cat#D4046
Animal Models
C57BL/6 mice Jackson Laboratories (Jax-West facility) N/A
Others
96-well plate, 1.0 mL, round wells, U shape, polypropylene, 32 mm, 50/pk Agilent Cat#5043-9305
Agilent 1290 Infinity II LC System Agilent N/A
Agilent 2100 Bioanalyzer Agilent N/A
Agilent 6120 quadrupole mass spectrometer Agilent N/A
Agilent 6530 Q-TOF LC/MS equipment Agilent N/A
Agilent 6545 Q-TOF LC/MS equipment Agilent N/A
Agilent Eclipse Plus C18 RRHD column 1.8 μm (2.1 × 50 mm) Agilent Part Number 959757-902
Agilent Poroshell 120 EC-C18 column Agilent Part Number 695775-902
2.7um (2.1x100mm)
2.7um (4.6x100mm)
2.7um (4.6x50mm)
Anaerobic chamber COY N/A
EquaVAP 96-Well Blowdown Evaporator Fisher Analytical Sales & Services Cat#23096
g-TUBE Covaris N/A
Mega Bond Elut-C18 10 g Agilent Part Number 12256031
NanoDrop 2000 Thermo Fisher Scientific N/A
Nunc 96-Well Cap Mats, case of 50 ThermoFisher Cat#276000
Nunc 96-Well Polypropylene DeepWell Storage Plates, case of 60, 2 mL FisherScientific Cat#278743
Reservoir w/Lid 75 mL 25/Pk RV-L25 Mettler-Toledo Rainin LLC Cat#17007886
Sealing mat, 96 wells, round, preslitted, silicone, 50/pk Agilent Part Number 5043-9317
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Mohamed
S. Donia (donia@princeton.edu).
All unique/stable reagents generated in this study are available from the Lead Contact, but we may require a completed Materials
Transfer Agreement if there is potential for commercial application.

The sequencing datasets generated during this study are available in Table S2 and at NCBI BioProject: PRJNA593062. The
metabolomics datasets generated during this study are available in Tables S1 and S3, and at MassIVE: MSV000084641. The code
generated during this study is available at GitHub (https://github.com/donia-lab/personalized_community_MDM_screen).
Human subject samples

Fecal samples were collected under Princeton University IRB#11606 at the Princeton University Department of Molecular Biology.
Healthy volunteers were recruited via e-mails sent to the departmental listserv as well as flyer advertisements. Volunteers gave
informed consent prior to sample collection. Eligibility criteria included age (18 and above) and health status (feeling well at time
of sample collection, no diabetes, gastrointestinal, oral, or skin infections, diseases, malignancies, or antibiotic use three months
prior to or during sample collection).
Bacterial strains and conditions

The following media were pre-reduced by incubation in the anaerobic chamber for 24 h before inoculation with the
corresponding isolate’s glycerol stock: PYG for Anaerostipes caccae and Clostridium bolteae; RCM for Prevotella bivia and
Parabacteroides distasonis; mGAM for Serratia marcescens, Enterococcus faecalis TYG’11, Anaerococcus prevotii, Escherichia coli
TYG’1, and Escherichia coli TYG’2; BHI for Lactobacillus gasseri; and LB for Salmonella enterica and Escherichia coli BL21. See
below for the complete media names. Cultures were grown overnight at 37 °C in the same anaerobic chamber (70% N2, 25% CO2,
5% H2).
MegaX DH10b E. coli (used for the metagenomic library construction) and BL21-DE3 E. coli (used for the heterologous expression
of the discovered 20β-HSDH gene) were cultured aerobically in LB medium at 37 °C.
E. coli BW25113 wild type and mutants that harbor a replacement of deoA or udp with a kanamycin resistance gene were
obtained from the Keio collection (Baba et al., 2006) and cultured in LB medium at 37 °C. Clean TP knockout (ΔdeoA), UP knockout
(Δudp), and TP/UP double knockout (ΔdeoA/Δudp) strains were generated as explained below, and also cultured in LB medium
at 37 °C.
Mice
8-to-10-week-old male and female C57BL/6 mice (25–30 g) were purchased from Jackson Laboratories. All animals
were housed and maintained in a certified animal facility and all experiments were conducted according to USA Public Health
Service Policy of Humane Care and Use of Laboratory Animals. All protocols were approved by the Institutional Animal Care
and Use Committee, protocol 2087-16 (Princeton University). The sex and number of animals are specified for the pharmacokinetic
study.
METHOD DETAILS
Fecal sample processing for PD and D1-20

Freshly collected human fecal material from the healthy donors (30 min from collection for PD, transported on ice; < 15 min from
collection for the rest of the donors, transported without ice) was brought into an anaerobic chamber (70% N2, 25% CO2, 5% H2). One
gram of the sample was suspended in 15 mL of sterile phosphate buffer supplemented with 0.1% L-cysteine (PBSc) in a 50 mL sterile
falcon tube. The suspension was left standing still for 5 min to let insoluble particles settle. The supernatant was mixed with an equal
volume of 40% glycerol in PBSc. Aliquots (1 mL) of this suspension were placed in sterile cryogenic vials and frozen at −80 °C until
use (Goodman et al., 2011). Samples were assigned de-identifying numbers (PD, and D1-D20), and stored in the Donia laboratory.
ex vivo culture of PD
A small aliquot (20 μL) from a PD glycerol stock was used to inoculate 10 mL of 14 different media: Liver Broth (Liver), Brewer
Thioglycolate Medium (BT), Bryant and Burkey Medium (BB), Cooked Meat Broth (Meat), Thioglycolate Broth (TB), Luria-Bertani Broth (LB)
(obtained from Sigma Aldrich, USA), Brain Heart Infusion (BHI), MRS (MRS), Reinforced Clostridium Medium (RCM), M17 (M17) (obtained
from Becton Dickinson, USA), modified Gifu Anaerobic Medium (mGAM) (obtained from HyServe, Germany), Gut Microbiota Medium
(GMM (Goodman et al., 2011)), TYG, and a 1:1 mix of each (BestMix). Cultures were incubated at 37 °C in an anaerobic chamber.
One mL was harvested from each culture each day for 4 consecutive days, and centrifuged to recover the resulting bacterial pellets.
ex vivo culture of D1-20

30 μL from each donor glycerol stock was used to inoculate 3 mL of 10 different pre-reduced media: Liver Broth (Liver), Bryant and
Burkey Medium (BB), Thioglycolate Broth (TB), Luria-Bertani Broth (LB) (obtained from Sigma Aldrich, USA), Brain Heart Infusion
(BHI), MRS (MRS), Reinforced Clostridium Medium (RCM), modified Gifu Anaerobic Medium (mGAM) (obtained from HyServe,
Germany), Gut Microbiota Medium (GMM (Goodman et al., 2011)), and a 70:30 mix of BB:GAM (BG). Cultures were incubated at 37 °C
in an anaerobic chamber. One mL was harvested from each culture after 48 h and centrifuged to recover the resulting bacterial
pellets.
16S rRNA gene amplicon sequencing and analysis

DNA was extracted from all pellets using the Power Soil DNA Isolation kit (Mo Bio Laboratories, USA, now QIAGEN), the 16S rRNA
gene was amplified (250 bp, V4 region), and Illumina sequencing libraries were prepared from the amplicons according to a pre-
viously published protocol and primers (Caporaso et al., 2012). Libraries were further pooled together at equal molar ratios and
sequenced on an Illumina HiSeq 2500 Rapid Flowcell (PD samples) or MiSeq (D1-D20 samples) as paired-end reads. For PD
samples, these reads were 2 × 175 bp with an average depth of 100,000 reads, while for D1-D20 samples the reads were 2 × 150 bp with
an average depth of 30,000 reads. Also included were 8 bp Index reads, following the manufacturer’s protocol (Illumina, USA). Raw
sequencing reads were filtered by Illumina HiSeq Control Software to generate Pass-Filter reads for further analysis. Different sam-
ples were de-multiplexed using the index reads. Amplicon sequencing variants (ASVs) were then inferred from the unmerged paired-
end sequences using the DADA2 plugin within QIIME2 version 2018.6 (Bolyen et al., 2018; Callahan et al., 2016). For PD samples, the
forward reads were trimmed at 165 bp and the reverse reads were trimmed at 140 bp. For D1-20 samples, the forward reads were
trimmed at 150bp and the reverse reads trimmed at 140bp. All other settings within DADA2 were default. Taxonomy was assigned to
the resulting ASVs with a naive Bayes classifier trained on the Greengenes database version 13.8 (Bokulich et al., 2018; McDonald
et al., 2012). Only the target region of the 16S rRNA gene was used to train the classifier. Downstream analyses were performed in
either MATLAB or Python (Hunter, 2007; McKinney, 2010; Oliphant, 2006). See Table S2.
Measurement of biomass for cultured D1-20

30 μL from each donor glycerol stock was cultured in 10 different pre-reduced media as previously described (in duplicate). One mL
was harvested from each culture after 48 h and centrifuged to recover the resulting bacterial pellets. The pellets were weighed in
Eppendorf tubes and the mass was subtracted from that of the empty tube prior to pellet collection. See Table S2.
ex vivo screening of the drug library with PD

In an anaerobic chamber, a small volume (100 μL) of a PD glycerol stock was diluted in 1 mL of mGAM, then 20 μL of this solution
was used to inoculate 3 mL of mGAM in culture tubes. Cultures were grown for 24 h at 37 °C in an anaerobic chamber. After 24 h, 10 μL
of each drug (575 total drugs, a subset of the SCREEN-WELL FDA approved drug library, Enzo Life Sciences, Inc., with each
molecule having a concentration of 10 mM in DMSO) or of a DMSO control was added to the growing microbial community. In addition,
10 μL of each drug was also incubated similarly in a no-microbiome, mGAM control. The no-drug control distinguishes
microbiome-derived small molecules from ones that result from MDM, and the no-microbiome control distinguishes cases of passive drug
degradation or faulty chemical extraction from those of active MDM. PD-DMSO control pellets from several batches of the screen were
analyzed using high-throughput 16S rRNA gene sequencing as described above to ensure the maintenance of a similarly diverse
microbial composition. Experiments and controls were allowed to incubate under the same conditions for a second 24 h period. After
incubation, cultures were extracted with double the volume of ethyl acetate and the organic phase was dried under vacuum using a
rotary evaporator (Speed Vac). This extraction method recovers organic molecules from both cells and broths of the cultures, and
therefore is not affected by cases of bacterial sequestration of the parent drugs. The dried extracts were suspended in 250 μL
MeOH, centrifuged at 15000 rpm for 5 min to remove any particulates, and analyzed using HPLC-MS (Agilent Single Quad, column:
Poroshell 120 EC-C18 2.7 μm 4.6 × 50 mm, flow rate 0.8 mL/min, 0.1% formic acid in water (solvent A), 0.1% formic acid in acetonitrile
(solvent B), gradient: 1 min, 0.5% B; 1-20 min, 0.5%–100% B; 20-25 min, 100% B). We tested each drug twice, along with matching
no-drug and no-microbiome controls. If drugs were deemed positive for MDM in one or both of the two trials, they were tested a third
time and analyzed using both HPLC-MS and HR-HPLC-MS/MS (Agilent QTOF, column: Poroshell 120 EC-C18 2.7 μm 2.1 × 100 mm,
flow rate 0.25 mL/min, 0.1% formic acid in water (solvent A), 0.1% formic acid in acetonitrile (solvent B), gradient: 1 min, 0.5% B;
1-20 min, 0.5%–100% B; 25-30 min, 100% B). For selected molecules, cultures were scaled up, metabolites were isolated,
and their structures were elucidated using NMR and/or comparison to an authentic standard obtained commercially using
HPLC-HRMS/MS (see below). See Data S1 for the chromatograms of all MDM+ metabolites.
Structural elucidation of selected metabolites

One mL of PD glycerol stock was used to inoculate 100 mL of mGAM medium and cultured for 24 h at 37 °C in an anaerobic chamber.
After 24 h, 2 mL of 10 mM of either capecitabine, hydrocortisone, tolcapone, or misoprostol solutions was added to the PD culture
and incubated for another 24 h. After the second 24 h, the cultures were extracted with double the volume of ethyl acetate and the
organic solvent layer was dried under vacuum in a rotary evaporator. The dried extract was then suspended in MeOH and partitioned
by reversed phase flash column chromatography (Mega Bond Elut-C18 10 g, Agilent Technology, USA) using the following mobile
phase conditions: solvent A, water with 0.01% formic acid; solvent B acetonitrile with 0.01% formic acid, gradient, 100% A to 100% B
in 20% increments. Fractions containing the metabolites of interest were identified by HPLC-MS, and reverse phase HPLC was used
to purify each metabolite using a fraction collector. The purified metabolites were subjected to NMR and HR-MS/MS analysis. For
misoprostol, hydrocortisone, spironolactone, and mycophenolate mofetil, detailed HPLC-HRMS/MS comparisons with authentic
standards were also performed. Structural elucidation details of capecitabine, hydrocortisone, tolcapone, spironolactone, misopros-
tol, and mycophenolate mofetil metabolites are detailed in Data S2.
Molecular networking analysis in PD screen

Raw data files were converted to the .mzXML format using ProteoWizard and uploaded to the Global Natural Products Social Mo-
lecular Networking (GNPS) online platform (https://gnps.ucsd.edu) (Wang et al., 2016). The data was first filtered, removing MS/MS
peaks within ± 17 Da of the precursor m/z. MS/MS spectra were window filtered by choosing only the top 6 peaks in the ± 50 Da
window throughout the spectrum. Before networking, the data was dereplicated using MS-cluster with a parent mass and MS/MS
fragment mass tolerance of 0.1 Da and minimum fragment intensity of 1000. Following this, consensus spectra were removed
if they contained fewer than 2 spectra. Molecular ion networking was then performed, requiring that two ions have a cosine similarity
of 0.5 and share at least 3 peaks in order to be linked. Connections were removed if the ions did not appear in each other’s top 10
most similar ions. Molecular networks were visualized and mined using Cytoscape (Shannon et al., 2003). We call the two
compounds (parent drug and metabolite) related if they are in the same connected component of the graph. In the cases where either
the metabolite or the parent drug or both were not picked up in the molecular ion networking analysis, we deem the linkage ‘‘unde-
termined.’’ There are several reasons why the metabolites or drugs are not picked up in the analysis, including the abundance of the
ions and the number and abundance of fragment ions. See Data S1B for figures of all molecular ion networks of linked metabolites
and parent drugs, and Table S1 for the GNPS web links of all molecular ion networking analyses.
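The linking rules described above can be illustrated with a minimal Python sketch. It is not the GNPS implementation: it assumes that pairwise cosine scores and shared-peak counts have already been computed, that the top-10 ranking is taken among pairs passing the thresholds, and the function names (`link_ions`, `components`, `linkage`) and the "unrelated" label are hypothetical.

```python
from collections import defaultdict


def link_ions(pairs, min_cosine=0.5, min_shared=3, top_k=10):
    """Return undirected edges from (ion_a, ion_b, cosine, n_shared_peaks) tuples.

    An edge is kept only if it passes both thresholds and each ion is among
    the other's top_k most similar ions (assumed ranking: thresholded pairs).
    """
    candidates = defaultdict(dict)
    for a, b, cosine, shared in pairs:
        if a != b and cosine >= min_cosine and shared >= min_shared:
            candidates[a][b] = cosine
            candidates[b][a] = cosine
    top = {
        ion: set(sorted(nbrs, key=nbrs.get, reverse=True)[:top_k])
        for ion, nbrs in candidates.items()
    }
    return {frozenset((a, b)) for a in top for b in top[a] if a in top.get(b, ())}


def components(ions, edges):
    """Map every detected ion to a representative of its connected component."""
    parent = {ion: ion for ion in ions}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for edge in edges:
        a, b = tuple(edge)
        parent[find(a)] = find(b)
    return {ion: find(ion) for ion in ions}


def linkage(drug, metabolite, comp):
    """'undetermined' if either ion is missing from the network analysis."""
    if drug not in comp or metabolite not in comp:
        return "undetermined"
    return "related" if comp[drug] == comp[metabolite] else "unrelated"


# Toy example with made-up scores
pairs = [("drug", "m1", 0.8, 5), ("m1", "m2", 0.6, 3),
         ("drug", "x", 0.4, 6), ("y", "z", 0.9, 2)]
ions = {"drug", "m1", "m2", "x", "y", "z"}
comp = components(ions, link_ions(pairs))
print(linkage("drug", "m2", comp))
```

In this sketch `ions` must list every ion that appears in `pairs`; an ion absent from `ions` is treated as not picked up by the analysis.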
Enrichment analysis for drugs in PD screen

The results of our screen against 438 drugs allow for an aggregate analysis of MDM by the PD microbiome. We hypothesized that
members of the microbiome would be more likely to metabolize natural compounds or derivatives thereof due to a higher probability
of prior exposure. To test this hypothesis, we first annotated each of the MDM+ or MDM- drugs to one of three categories: naturally
occurring molecules (i.e., molecules directly derived from humans, plants, or microbes; e.g., hydrocortisone; N = 30), derivatives of
naturally occurring molecules (i.e., a semisynthetic derivative or a close structural mimic of a natural product, e.g., hydrocortisone
acetate; N = 90), and synthetic molecules (e.g., nicardipine; N = 318). By comparing the fraction of MDM+ drugs in the first two cat-
egories (natural + derivative, 26 out of 120, 21.6%) to that of the third category (synthetic, 31 out of 318, 10%), we revealed a signif-
icant difference (p < 0.001, two-tailed proportions z-test, n based on the number of molecules with and without the classification).
Intrigued, we decided to examine differences in MDM at lower levels of drug classification. We observed a significantly higher hit
rate among steroids (steroids: 16 out of 28, 57.1%; non-steroid: 41 out of 410, 10%, p < 0.001, two-tailed proportions z-test),
including hormonal steroids, corticosteroids, bile acids, and derivatives thereof. In fact, the high hit rate of the steroid class is the
major contributor to the observed difference between the hit rates of natural/derivative and synthetic groups, which is abolished
upon exclusion of the steroids (non-steroid natural/derivative: 10 out of 94, 10.6%; non-steroid synthetic: 31 out of 316, 9.8%) (Table
S1). The high hit rate among steroids is in-line with the idea that the microbiome is more likely to metabolize compounds it frequently
encounters, as steroids (e.g., bile acids) are normally present in the gut at high concentrations (Northfield and McColl, 1973). The fact
that 10% of fully synthetic molecules tested in our screen are metabolized by PD indicates the presence of a yet-unexplored range
of biochemical activities encoded by the gut microbiome that are capable of recognizing foreign substrates.
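As an illustration of the proportion comparisons reported above, the following Python sketch applies a two-tailed two-proportion z-test to the hit counts given in the text. The pooled-variance form is an assumption, since the exact variant used in the study is not stated, and the function name `two_proportion_ztest` is hypothetical.

```python
from math import erfc, sqrt


def two_proportion_ztest(hits_a, n_a, hits_b, n_b):
    """Two-tailed z-test for a difference in proportions (pooled variance)."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Counts taken from the text above
z_nat, p_nat = two_proportion_ztest(26, 120, 31, 318)   # natural + derivative vs synthetic
z_ster, p_ster = two_proportion_ztest(16, 28, 41, 410)  # steroid vs non-steroid
print(f"natural/derivative vs synthetic: z = {z_nat:.2f}, p = {p_nat:.2g}")
print(f"steroid vs non-steroid: z = {z_ster:.2f}, p = {p_ster:.2g}")
```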
Other than natural and synthetic classifications, we also looked more closely at functional groups that are enriched or depleted in
MDM+ drugs. We generated a list of 94 common functional groups and structural features and searched for them within all of the
drugs tested in our screen. To determine whether certain functional groups are enriched in MDM positive drugs, we aggregated
the SMARTS of common functional groups and the SMILES of all drugs within our screen. We then searched for these functional
groups within the drugs using the obgrep function within Open Babel (O’Boyle et al., 2011). We then tested for enrichment or deple-
tion of these groups within MDM+ drugs using two-sided proportion z-tests, correcting the resulting p values using the Benjamini-
Hochberg method and requiring that the false discovery rate (FDR) corrected p value is less than 0.01. The n for these tests is based
on the number of molecules with and without the functional group. Not surprisingly, we observed an enrichment of the following func-
tional groups in MDM+ drugs: nitro groups (FDR corrected p = 3e-16), ketones (FDR corrected p = 3e-8), carbonyl groups with one
carbon attachment (FDR corrected p = 8e-4), azo groups (FDR corrected p = 0.001), imines (FDR corrected p = 0.002), and alkenes
(FDR corrected p = 0.001). These results are consistent with common reduction and hydrolysis reactions often performed by gut bac-
teria. On the other hand, we observed a general depletion of arenes and nitrogen atoms in MDM+ drugs (FDR corrected p = 7e-5 and
FDR corrected p = 1e-7, respectively). However, when we excluded steroids and repeated the analysis we found that the depletions
in arenes and nitrogen atoms were no longer statistically significant (corrected p = 0.7 and corrected p = 0.7, respectively). This in-
dicates that the original statistically significant depletions were the result of steroids being a highly modified class that generally does
not contain these functional groups, rather than the functional groups themselves being important predictors for the lack of meta-
bolism (Figure 2D and Table S1). The exclusion of steroids, on the other hand, did not affect the observed enrichments we found
for nitro groups, imines, azo groups and ketones (FDR corrected p < 0.01). It is important to note that the results of our analysis
of MDM enrichment are based on a single subject’s microbiome, and should be repeated in the future with data from a much larger
set of donors.
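The multiple-testing step described above can be sketched in Python as follows. This is a generic Benjamini-Hochberg adjustment written for illustration, not the code used in the study; the input p values are made up, and only the 0.01 cutoff is taken from the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four functional groups
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
enriched = [q < 0.01 for q in fdr]  # cutoff used in the text
print(fdr, enriched)
```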
Gene abundance analysis in metagenomic cohorts

The following datasets were used for the metagenomic analysis of the genes of interest in this study: HMP-1-1 (Human Microbiome
Project Consortium, 2012), HMP-1-2 (Lloyd-Price et al., 2017), MetaHIT (Nielsen et al., 2014), Chinese (Qin et al., 2012), and Fijicomp
(Brito et al., 2016). Raw sequencing data were obtained using the accession numbers of the associated manuscripts, and
pre-processed as previously described (Sugimoto et al., 2019). Quality-filtered reads were mapped to each gene using Bowtie2
(--end-to-end, --fast, --score-min L,-0.6,-0.3) (Langmead and Salzberg, 2012), and gene abundance (in Reads per Kbp per Million reads, RPKM)
as well as gene breadth coverage (in percent of gene length) were calculated.
We only considered the gene as ‘‘present’’ if the reads cover greater than 50% of the gene length, otherwise the quantification
RPKM is considered zero. For datasets with multiple samples per subject, we aggregated the quantifications. If one or more samples
corresponding to a single subject met the coverage threshold, we present the average RPKM of these samples as the RPKM of the
subject. If no samples met the threshold, we consider the overall RPKM of the subject to be zero. See Table S4 for tabulated results of
this analysis.
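The presence rule and per-subject aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions: read counts and covered base pairs are taken as already extracted from the alignments, and the function and field names (`rpkm`, `subject_rpkm`, `mapped_reads`, `covered_bp`, `total_reads`) are hypothetical.

```python
def rpkm(mapped_reads, gene_length_bp, total_reads):
    """Reads per kilobase of gene per million reads in the sample."""
    return mapped_reads / ((gene_length_bp / 1e3) * (total_reads / 1e6))


def subject_rpkm(samples, gene_length_bp, min_breadth=0.5):
    """Average RPKM over the samples whose breadth of coverage exceeds 50%.

    samples: list of dicts with keys mapped_reads, covered_bp, total_reads.
    Returns 0.0 when no sample of the subject meets the coverage threshold.
    """
    passing = [
        rpkm(s["mapped_reads"], gene_length_bp, s["total_reads"])
        for s in samples
        if s["covered_bp"] / gene_length_bp > min_breadth
    ]
    return sum(passing) / len(passing) if passing else 0.0


# Made-up example: three samples from one subject, 1,000 bp gene
samples = [
    {"mapped_reads": 10, "covered_bp": 800, "total_reads": 1_000_000},
    {"mapped_reads": 2, "covered_bp": 300, "total_reads": 2_000_000},   # below threshold
    {"mapped_reads": 40, "covered_bp": 600, "total_reads": 2_000_000},
]
print(subject_rpkm(samples, gene_length_bp=1000))
```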
ENDS (Expected Number of Detectable Strains)

A metric (‘‘ENDS’’) was developed to estimate the number of strains for which MDM reactions can be experimentally detected in
ex vivo cultures. ENDS answers the following question: if all of the ASVs in the ex vivo culture performed an MDM reaction at a given
rate r (in units of normalized metabolite signal per unit biomass per time), how many ASVs’ reactions will be detected in our screen
given the culture composition and measurement instrument? This framework is needed to incorporate the potentially confounding
impact of community biomass in the media selection process. For example, if biomass is not considered, a high-diversity low-density
community where bacterial load is too low to produce detectable metabolite levels would be favored over a lower diversity commu-
nity with high enough bacterial load to produce detectable metabolite levels. Formally, we are computing the expected value of the
number of the reactions detected:
$$E[N_s] = \sum_{i=1}^{n} B(x_i)$$
where $E[N_s]$ is the expected number of detectable microbes, and $B(x_i)$ is the probability of microbe $i$’s reaction being detected with
an absolute population of size $x_i$.
How can we construct $B(x_i)$? In statistical terms, $B(x_i)$ is equivalent to the power (the probability of deciding there is a reaction
when the reaction is actually present) of the hypothesis testing method used to analyze the data. In this case, we are using a
one-sided unequal variances t test with cutoff $\alpha$. Now that we have a framework for calculating $B(x_i)$, we must relate a given $x_i$ to a
null and alternative distribution of measurements.
We assume that the metabolite measurements are composed of two types of signal: background noise, $X$, and compound signal, $Y$.
Both signals are assumed to be normally distributed (we show in Methods S1F that an empirically estimated power function provides
similar results). The background noise $X \sim N(\mu_1, \sigma_1)$ is the signal present when no actual metabolite is present. The compound signal
$Y \sim N(\mu_2, f(\mu_2))$ is the portion of the signal due to measurement of an actual metabolite. The measurements in the control condition
are modeled by $X$, while the measurements in the experimental conditions are modeled as
$$Z = X + Y \sim N\left(\mu_1 + \mu_2,\ \sqrt{\sigma_1^2 + f(\mu_2)^2}\right)$$
We will now relate the population abundance $x_i$ to the mean of $Y$, $\mu_2$. In real terms, $\mu_2$ is the average level of a metabolite produced
by $x_i$. We assume the production of the metabolite is governed by the dynamics $d[M]/dt = r x_i$, where $[M]$ is the concentration of the
metabolite and $r$ is the rate of metabolite production per cell. If we assume the drug is added at stationary phase such that $dx_i/dt = 0$ for all
$t$ after drug addition, the total amount of metabolite produced is $[M] = t r x_i$, where $t$ is the incubation time. The rate $r$ can vary widely, and to
account for this it can be set using the rate of a known MDM reaction (see Methods S1D).
We must now estimate the distribution of $X$. We estimate this by computing the mean and standard deviation of spurious peaks
detected when samples not containing the compound being measured are quantified.
Now we must define the standard deviation of $Y$, $f(\mu_2)$. This is clearly dependent on the instrument being used. By plotting the
standard deviations of triplicate measurements from our machine against their mean, we can estimate $f(\mu_2)$. To ensure we are
capturing only measurement signal, we based our model only on measurements that were largely composed of measurement signal
(more than three standard deviations above the mean of the null distribution). We have found that a power law $f(\mu_2) = a\mu_2^{b}$ fits the
data well. With the distributions of $X$ and $Z$, we then estimate $B(x_i)$ using existing methods (Harrison and Brady, 2004).
What if we want to create an ensemble of multiple culture conditions to detect even more microbial reactions? To do this we can
define the expected number of new detectable microbes gained if we include another medium, $E[\Delta N_s]$. For each ASV in all media, we
take the product of the probability that the reaction is not detected in the existing media and the probability that the reaction is
detected in the new medium:
$$E[\Delta N_s] = \sum_{i=1}^{n} B(x_i) \prod_{j=1}^{m} \left(1 - B(y_{ij})\right)$$
where $m$ is the number of media in the existing ensemble, and $y_{ij}$ is the abundance of ASV $i$ in existing medium $j$.
In our actual computation of ENDS, we exclude samples with fewer than 10,000 reads and take the optimal medium as the one with
the highest ENDS averaged across all twenty donors.
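The ENDS calculation can be sketched in a few lines of Python. This is a simplified illustration, not the released ENDS code: the power of the one-sided unequal variances t test is replaced here by a normal (z-test) approximation, and all parameter values (rate, incubation time, noise SD, power-law coefficients a and b, replicate number, alpha) are hypothetical placeholders.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def detection_power(x, rate, hours, sigma1, a, b, n_rep=3, alpha=0.05):
    """Approximate B(x): probability that a strain of abundance x is detected.

    Normal approximation to a one-sided two-sample test; sigma1 must be > 0.
    """
    mu2 = hours * rate * x                            # mean compound signal, [M] = t * r * x
    sd_signal = a * mu2 ** b if mu2 > 0 else 0.0      # power-law noise model f(mu2)
    sd_treated = sqrt(sigma1 ** 2 + sd_signal ** 2)   # SD of Z = X + Y
    se_diff = sqrt((sigma1 ** 2 + sd_treated ** 2) / n_rep)
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha)
    return _STD_NORMAL.cdf(mu2 / se_diff - z_crit)


def ends(abundances, **params):
    """E[N_s]: sum of detection probabilities over all ASVs in one culture."""
    return sum(detection_power(x, **params) for x in abundances)


def expected_gain(new_abundances, existing_abundances, **params):
    """E[Delta N_s] for adding one medium to an existing ensemble.

    new_abundances[i] is x_i; existing_abundances[i] is [y_i1, ..., y_im].
    """
    gain = 0.0
    for x, ys in zip(new_abundances, existing_abundances):
        missed = 1.0
        for y in ys:
            missed *= 1.0 - detection_power(y, **params)
        gain += detection_power(x, **params) * missed
    return gain


# Hypothetical parameters and absolute abundances for three ASVs
params = dict(rate=1e-6, hours=24, sigma1=100.0, a=0.5, b=0.8)
print(ends([0, 1e6, 1e9], **params))
print(expected_gain([1e9, 1e6], [[1e9], [0]], **params))
```

Note that the background mean cancels in the difference between treated and control means, which is why it does not appear as a parameter in this sketch.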
High-throughput screen with D1-20

A new medium, BG, was formulated (see Methods S1-B). BG was made as follows: BB powder and mGAM powder were
reconstituted in water and autoclaved as per manufacturer instructions (Sigma and HyServe, respectively). Liquid BB medium was mixed with
liquid mGAM in a 70:30 ratio. For each donor (D1-D20), 500 μL of a donor’s glycerol stock was used to inoculate 50 mL
of pre-reduced BG medium and incubated for 24 h in an anaerobic chamber at 37 °C. The culture was transferred to a sterile Nunc
96-well plate (Fisher Scientific) with each well containing 400 μL of culture. 13.2 μL of each drug stock solution (1 mM in DMSO), or 13.2 μL
of DMSO as a vehicle-only control, was pipetted and resuspended in 4 adjacent wells of the 96-well plate (quadruplicates) (see Table
S3). Selected drugs included MDM+ drugs from our original screen (N = 15), MDM- drugs randomly (N = 4) or rationally (N = 2)
selected, and drugs reported in the literature to be MDM+ but deemed negative in our screen (N = 2). The plate was incubated
for 24 h anaerobically at 37 °C. Before chemical extraction, 10 μL of the internal standard (voriconazole, 1 mM) was pipetted into the
wells. For chemical extraction, 800 μL of ethyl acetate was pipetted into the wells and resuspended several times. 400 μL of the
organic layer was pipetted into an Agilent 96-well plate and dried under nitrogen with a 96-well blow-down evaporator (Fisher
Analytical). For HPLC-HRMS analysis, the plate was resuspended in 300 μL methanol and left for 10 min at room temperature prior to
centrifugation at 3900 rpm for 10 min at 4 °C. 60 μL was carefully decanted from the top into a new plate with 60 μL methanol and run on an
Agilent 6545 LC/QTOF machine (0.5 μL injection, only three of the four replicates for each drug were run on the machine). The
remaining 240 μL were dried and stored at −20 °C for future runs. An abiotic BG-drug plate and a heat-killed-microbiome-drug plate were
prepared and analyzed using the same method, serving as controls to estimate non-enzymatic drug degradation or metabolite
production. To prepare the heat-killed-microbiome plate, 500 μL of D20 glycerol stock was used to inoculate 50 mL of pre-reduced BG
medium and incubated for 24 h in an anaerobic chamber at 37 °C. The culture was heat-killed at 100 °C for 30 min while keeping the flask
containing it sealed (and therefore maintaining the anaerobic conditions). Drug incubation, chemical extraction, and HPLC-HRMS
analysis for the control plates were performed as previously described for the donor-drug plates.
Targeted quantitative metabolomics analysis

HPLC-HR-MS analysis was performed on an Agilent 6545 LC/QTOF machine. The autosampler compartment was kept at 7°C, and
the column was kept at 25°C. Reverse phase chromatography was performed using an Agilent Eclipse Plus C18 RRHD
1.8 µm (2.1 × 50 mm) column (Agilent, USA) with the gradient 95%A, 5%B to 5%A, 95%B in 12 min, then 95%B for 2 min, followed
by initial conditions (95%A, 5%B) for 3 min to re-equilibrate the column (A = 0.1% formic acid in water and B = 0.1% formic acid in
acetonitrile). The flow rate was 0.4 mL/min. The samples in this study were run in one of two modes: a high resolution 4GHz mode and
a high dynamic range 2GHz mode. MS acquisition parameters for the 4GHz mode were set as follows: positive ion polarity, 0.5 min
delay before MS measurement, 325°C gas temperature, 10 L/min drying gas flow rate, 20 psi nebulizer pressure, 325°C sheath gas
temperature, 12 L/min sheath gas flow rate, 4000 V capillary voltage, 500 V nozzle voltage, 135 V fragmentor voltage, 45 V skimmer
voltage, MS and MS/MS mass range of 100-1700 m/z, acquisition of 5 MS1 spectra/s, acquisition of 3 MS2 spectra/s, 20 eV collision
energy, a maximum of 2 precursors per cycle, and a precursor selection threshold of 200 counts absolute or 0.01% relative. The
system was run in auto MS/MS mode. For the 2GHz mode, the parameters were the same as the 4GHz mode with the following changes:
acquisition of 8 MS1 spectra/s, acquisition of 6 MS2 spectra/s, maximum of 5 precursors per cycle, precursor selection threshold of
2000 counts.
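The LC gradient described above can be written as a small piecewise-linear function, which is convenient for checking the solvent composition at a given retention time. This is only a sketch: the breakpoints are read from the text, while the instantaneous return to 5% B at 14 min and the names `GRADIENT` and `percent_b` are assumptions made for illustration.

```python
from bisect import bisect_right

# (time in min, %B) breakpoints read from the gradient description.
# Assumption: the switch back to initial conditions at 14 min is instantaneous.
GRADIENT = [(0.0, 5.0), (12.0, 95.0), (14.0, 95.0), (14.0, 5.0), (17.0, 5.0)]


def percent_b(t_min: float) -> float:
    """Return %B at time t_min by linear interpolation between breakpoints."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    if t_min >= GRADIENT[-1][0]:
        return GRADIENT[-1][1]
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min)  # first breakpoint strictly after t_min
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 6, 12, 13, 15):
        print(f"{t:>4} min: {percent_b(t):5.1f} %B")
```

The corresponding %A is simply `100 - percent_b(t)`.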
To verify that the concentration of the internal standard (voriconazole) used in the screen was below the saturation limit of the
machine, we constructed a standard curve of voriconazole. 12 µL of 1 mM voriconazole was added to 228 µL of methanol and serial
dilutions were performed by a factor of three to cover the concentration range of 40 µM to 0.165 µM. These samples were analyzed
using the 2GHz setting described above to match the setting used for drug quantification in the multi-donor screen.
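As a quick check of the dilution series, the stated endpoints (40 and 0.165, in the concentration unit given above) are consistent with five successive threefold dilutions. The number of steps is inferred from those endpoints rather than stated in the text, and `serial_dilution` is a hypothetical helper.

```python
def serial_dilution(start: float, factor: float, steps: int) -> list[float]:
    """Concentrations of a dilution series, starting value included."""
    return [start / factor**i for i in range(steps + 1)]


# Assumption: five threefold steps, inferred from the reported range.
series = serial_dilution(40.0, 3.0, 5)

if __name__ == "__main__":
    print([round(c, 3) for c in series])
```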
Drugs and their detected metabolites were quantified in the MS1 of all samples using MassHunter Quantitative Analysis with the
Agile2 integrator. The metabolites quantified here were either ones that we previously discovered during the PD screen, ones that
were previously reported in the literature, or novel metabolites from the multi-donor screen identified using untargeted metabolomics
and verified by molecular networking (see below, Table S3, and Figure S3). For quantification of dihydrodigoxin, we required a highly
sensitive integration method to distinguish between dihydrodigoxin and the isotopes of digoxin, since parent and metabolite eluted
at similar retention times. To do this, we performed integrations within MassHunter Qualitative, specifying a mass range of 805.43 -
805.44 m/z for dihydrodigoxin. We verified this method could differentiate the two compounds using authentic standards of dihydro-
digoxin and digoxin, showing that it could accurately quantify dihydrodigoxin and that it did not detect dihydrodigoxin when only
digoxin was present.
Following quantification, all further data processing was performed in MATLAB. For each plate, we removed any samples whose
internal standard AUC was more than three interquartile ranges above the third quartile or below the first quartile. To correct
for differences in extraction efficiency, all peak areas in a given sample were divided by the corresponding internal standard area. This
ratio was then used for hypothesis testing and all other downstream analyses. For drug depletion, unadjusted p values were obtained
by one-sided Welch’s t tests of whether drug levels were significantly lower in the donor-drug conditions than in controls; p values
were computed for tests against controls in which the drug was incubated with BG medium (medium-drug) or with the
heat-killed microbiome (HKM-drug). For all metabolite quantification statistical tests, n is the number of replicates passing
quality control (maximum 3). For metabolite production, unadjusted p values were obtained by one-sided Welch’s t tests of
whether metabolite levels were significantly higher in donor-drug conditions than in control conditions; p values were computed for
tests against medium-drug, HKM-drug, and donor-DMSO (where the cultures were incubated with only the vehicle, DMSO)
controls. Correction for multiple hypotheses was performed using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995).
Metabolite production and drug depletion p values were adjusted separately. For depletion to be considered significant, we required
an FDR-corrected p value < 0.01 against both the medium-drug and HKM-drug controls, as well as depletion
greater than 50% relative to both controls. For metabolite production to be considered significant, we required an FDR-corrected
p value < 0.01 against the medium-drug, HKM-drug, and donor-DMSO controls.
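The processing steps in this paragraph can be sketched in a few lines. The original analysis was performed in MATLAB; the Python below is only an illustration of the logic, not the authors' code. The quartile convention (`method="inclusive"`), the function names, and the representation of depletion as "fraction of drug remaining relative to a control" are assumptions. The Welch's t tests themselves are omitted, so the sketch starts from unadjusted p values.

```python
from statistics import quantiles


def within_iqr_fences(istd_areas, k=3.0):
    """True for samples whose internal-standard area is within Q1 - k*IQR .. Q3 + k*IQR."""
    q1, _, q3 = quantiles(istd_areas, n=4, method="inclusive")  # quartile convention assumed
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [low <= area <= high for area in istd_areas]


def normalize(peak_areas, istd_areas):
    """Divide each peak area by the internal-standard area of the same sample."""
    return [peak / istd for peak, istd in zip(peak_areas, istd_areas)]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def depletion_significant(q_medium, q_hkm, remaining_vs_medium, remaining_vs_hkm,
                          alpha=0.01, min_depletion=0.5):
    """Apply the two FDR thresholds and the >50% depletion requirement."""
    depleted = (1 - remaining_vs_medium > min_depletion
                and 1 - remaining_vs_hkm > min_depletion)
    return q_medium < alpha and q_hkm < alpha and depleted
```

For example, `benjamini_hochberg([0.01, 0.04, 0.03, 0.005])` adjusts the two smallest p values to 0.02 and the two largest to 0.04, and a drug with adjusted p values of 0.001 and 0.002 and 20% and 30% of the drug remaining relative to the two controls would be called depleted.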
We then looked for correlation between the results of the targeted metabolomics and the compositions of the BG communities. We
restricted the drugs and metabolites tested by requiring that they must have at least one associated compound (parent drug or
metabolite) with an inter-individual variability entropy > 0.5, and that there exists at least one sample with more than 20% of the
drug remaining relative to medium-drug controls. We tested only taxonomic elements present in at least three samples with a
biomass of at least 1 mg/L in at least one sample, and corrected the resulting p values for multiple hypotheses at each taxonomic
level using the Benjamini-Hochberg method. The n for these tests is based on the number of observations used to compute the cor-
relations. Spearman correlation was computed for this analysis. The mean BG community composition for each donor was used for
the correlation analysis. We tested taxonomic elements at the ASV, species, genus, and family levels.
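For readers unfamiliar with it, Spearman correlation is the Pearson correlation of the ranks of the two variables. A minimal pure-Python sketch follows; the function names are hypothetical, ties receive average ranks, and the p value computation and multiple-testing correction described above are not included.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

Here `x` would hold, for example, the mean abundance of one taxon per donor and `y` the normalized level of one drug or metabolite in the same donors.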
Untargeted metabolomics analysis

In order to extract all features in a given sample, we used the batch recursive feature extraction method for small molecules and pep-
tides within Profinder 8 (Agilent). We changed the following settings in the method from default: compound ion count threshold set to
two ions, alignment retention time tolerance set to 0.2 min, minimum MFE score set to 90, minimum file prevalence set to 2, expected
retention time set to ± 1 min, retention time contribution to matching score set to 90, expected MS1 mass variation set to 10 ppm,
expected retention time tolerance set to 0.2 min, absolute height threshold for EIC integration set to 2500 counts, and final absolute
height threshold set to 5000. We then analyzed the resulting feature abundances in MATLAB. We removed any sample whose internal
standard AUC was less than 10^6. We then performed hypothesis testing using statistical methods similar to those for metabolite production in
the targeted metabolomics, except that we used the multiple hypothesis correction of Storey (Storey, 2002) in place of the
Benjamini-Hochberg method and required a fold-change cut-off of two relative to all controls. We combined features if their retention times
and estimated molecular weights differed by less than 0.2 min and 0.01 Da, respectively. We then removed all ions already quantified in
our screen and all features that were statistically significant for more than one drug.
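The feature-combining rule (retention times within 0.2 min and molecular weights within 0.01 Da) can be sketched as a grouping problem. The text does not say whether merging is transitive; the sketch below assumes it is and uses a small union-find structure. `merge_features` and the tuple layout are hypothetical.

```python
def merge_features(features, rt_tol=0.2, mass_tol=0.01):
    """Group features given as (retention_time_min, mass_da) tuples.

    Two features are linked when both differences are below the tolerances;
    linked features end up in the same group (transitive merging is assumed).
    Returns a sorted list of groups, each a list of input indices.
    """
    parent = list(range(len(features)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, (rt_i, mass_i) in enumerate(features):
        for j in range(i + 1, len(features)):
            rt_j, mass_j = features[j]
            if abs(rt_i - rt_j) < rt_tol and abs(mass_i - mass_j) < mass_tol:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(features)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

The pairwise loop is quadratic in the number of features, which is acceptable for a sketch; sorting by mass first would be the obvious refinement for large feature tables.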
For the remaining statistically significant features, we use molecular ion networking to verify the metabolite’s relationship to the
parent drug based on their HR-MS/MS fragmentation pattern. In order to gather data with a large enough number of MS2 spectra
per parent ion we reran samples of interest on our QTOF HPLC-HRMS/MS instrument (Agilent) using the same column and condi-
tions, and the 4GHz settings listed above (instead of the 2GHz one used in the multi-donor screen). In order to minimize the number of
samples run a second time, we identified the minimal set of 81 samples that would allow us to detect and perform molecular ion
networking on all novel metabolites found in the original 1380 donor-drug samples. For molecular ion networking, we used the Global
Natural Product Social Molecular Networking (GNPS) server using the same settings used for the PD molecular networking. In order
to determine whether a metabolite is linked to its parent by the molecular ion networking, we first identify whether the drug and
metabolite are present in the network. For this, we require that the retention time and mass found in the molecular ion networking
differ by less than 0.2 min and 0.02 Da, respectively, from the properties reported by the initial donor-drug stage of the pipeline.
We call the two compounds related if they are in the same connected component of the graph. In the cases where either the metab-
olite or the parent drug or both were not picked up in the molecular ion networking analysis, we deem the linkage ‘‘undetermined.’’
There are several reasons why the metabolites or drugs are not picked up in the analysis, including the abundance of the ions and the
number and abundance of fragment ions.
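The linkage rule described above amounts to a connected-component test on the molecular network. A minimal sketch, assuming the network is available as a set of node identifiers plus a list of edges (this representation, the function name, and the label "not related" are illustrative choices, not GNPS output formats):

```python
from collections import deque


def linkage(nodes, edges, drug, metabolite):
    """Classify a drug/metabolite pair from an undirected network.

    Returns "undetermined" if either compound is absent from the network,
    "related" if both lie in the same connected component, else "not related".
    """
    if drug not in nodes or metabolite not in nodes:
        return "undetermined"
    graph = {node: set() for node in nodes}
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, queue = {drug}, deque([drug])
    while queue:  # breadth-first search from the parent drug
        current = queue.popleft()
        if current == metabolite:
            return "related"
        for neighbor in graph[current] - seen:
            seen.add(neighbor)
            queue.append(neighbor)
    return "not related"
```

Matching each compound to a network node (by retention time and mass, as described above) would happen before this function is called.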
Isolate screen for capecitabine

Three-mL overnight seed cultures of the following bacteria in their respective media were obtained as described above: Anaerostipes
caccae, Clostridium bolteae, Prevotella bivia, Parabacteroides distasonis, Serratia marcescens, Enterococcus faecalis TYG’11,
Anaerococcus prevotii, Escherichia coli TYG’1, Escherichia coli TYG’2, Lactobacillus gasseri, Salmonella enterica, and Escherichia
coli BL21. This panel was selected from three of the most abundant phyla that normally inhabit the gut microbiome (Firmicutes,
Bacteroidetes, and Proteobacteria), spans 10 bacterial genera (12 strains in total), and includes three strains isolated originally from PD
using standard techniques (Enterococcus faecalis TYG11, Escherichia coli TYG1, and Escherichia coli TYG2). 20 µL of these seed
cultures was inoculated into 3 mL of the same selected pre-reduced medium, and incubated at 37°C under the same anaerobic
conditions for an additional 24 h. After 24 h, 10 µL of the 10 mM drug solution in DMSO, or of a DMSO control, was added to the growing
microbial culture and incubated for another 24 h. In addition, 10 µL of the drug was incubated for 24 h under the same conditions in a
no-bacterium, medium-only control. After incubation, cultures were extracted with ethyl acetate and the organic phase was dried
under vacuum in a rotary evaporator. Extracts were suspended in 250 µL of MeOH and analyzed using HPLC-MS as described above
in the PD ex vivo screen. See Figure S4.
Metagenomic library construction

Metagenomic DNA was extracted from approximately 0.25 g of PD stool (stored at −80°C in RNAlater (Thermo Fisher Scientific,
Massachusetts, USA)) using the PowerSoil DNA Isolation kit (MO BIO Laboratories, California, USA; now QIAGEN) according to the
manufacturer’s instructions. DNA was quantified using a NanoDrop 2000 (Thermo Fisher Scientific, Massachusetts, USA).
The vector pGFPuv (Clontech Laboratories) was used as the parent vector for the metagenomic library. The pGFPuv plasmid was
linearized by PCR with Phusion High-Fidelity DNA Polymerase (New England Biolabs, Massachusetts, USA), and the 2.5 Kbp
product was excised after gel electrophoresis and extracted from the gel using the QIAquick Gel Extraction kit according to the
manufacturer’s instructions. PCR primers for linearization were: forward, pGFP-IF-F = TAATGAATTCCAACTGA GCGCCGG and reverse,
pGFP-IF-R = CATAGCTGTTTCCTGT GTGAAATTG. DNA was quantified using a NanoDrop 2000 and incubated with DpnI (New
England Biolabs, Massachusetts, USA) at 37°C for 1 h followed by incubation at 80°C for 20 min to heat inactivate the DpnI enzyme.
To prevent re-ligation of the vector, Antarctic Phosphatase (New England Biolabs, Massachusetts, USA) was added to the DpnI-treated
mixture and incubated at 37°C for 15 min and 70°C for 5 min to heat inactivate the phosphatase. The product was
then purified using the QIAquick PCR Purification kit (QIAGEN, Maryland, USA) according to the manufacturer’s instructions and
quantified with a NanoDrop 2000.
The extracted DNA sample was brought to 150 µL in Milli-Q water. DNA was sheared via a g-TUBE (Covaris, Massachusetts,
USA) according to manufacturer’s instructions using an accuSpin Micro 17R Microcentrifuge (Thermo Fisher Scientific, Massachu-
setts, USA). DNA fragment size was validated using the Agilent 2100 Bioanalyzer (Agilent Technologies, California, USA).
Sheared DNA was subjected to gel electrophoresis, and the 2-4 Kbp product was excised from the gel. Gel extraction was
performed using Zymoclean Large Fragment DNA Recovery Kit (Zymo Research, California, USA) according to the manufacturer’s
instructions.
Sheared DNA was end-repaired using the End-It DNA End-Repair Kit (Epicentre (Illumina), California, USA) according to the
manufacturer’s instructions. The end-repaired DNA was then purified by isopropanol precipitation. Blunt-end ligation of DNA with linearized
pGFPuv was performed with T4 DNA ligase (New England Biolabs, Massachusetts, USA) according to the manufacturer’s instructions at
16°C overnight using 100 ng vector and 360 ng insert to achieve a 1:3 molar ratio. Ligated plasmids were purified by precipitation with
Pellet Paint NF Co-Precipitant (EMD Millipore, Massachusetts, USA) and brought to 4 µL in Milli-Q water. 2 µL ligated plasmid were
transformed into MegaX DH10b Electrocompetent Cells (Thermo Fisher Scientific, Massachusetts, USA) according to the
manufacturer’s instructions, and recovered in 1 mL of LB medium.
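The stated 1:3 vector-to-insert molar ratio can be checked from the masses, because the number of moles of double-stranded DNA is proportional to mass divided by length. The calculation below assumes a mean insert length of 3 Kbp (the midpoint of the 2-4 Kbp size selection, which is an assumption) and the 2.5 Kbp linearized vector mentioned above; the function name is hypothetical.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_bp, vector_ng, vector_bp):
    """Molar ratio of insert to vector; mass/length is proportional to moles of dsDNA."""
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


# Assumption: mean insert length of 3,000 bp (midpoint of the 2-4 Kbp range).
ratio = insert_to_vector_molar_ratio(insert_ng=360, insert_bp=3000,
                                     vector_ng=100, vector_bp=2500)

if __name__ == "__main__":
    print(f"insert:vector molar ratio = {ratio:.2f}")
```

Under these assumptions the ratio comes out at 3, matching the 1:3 vector-to-insert ratio in the text.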
The 2 × 1 mL transformation products from each ligation were combined and brought up to 20.2 mL in LB/ampicillin and divided
into 20 × 1 mL pools. Each 1 mL pool was cultured in a shaking incubator at 30°C overnight and mixed with 1 mL 50% glycerol the
following day, gently vortexed, and stored at −80°C for future screening. Serial dilutions were made from the remaining 200 µL of the
transformation product and spread on LB/ampicillin plates, which were incubated overnight at 37°C, to calculate the library titer of
unique clones the following day. Random clones were picked, checked for the presence of insert, and used to estimate insert
size by PCR (most clones harbored inserts in the 2-4 Kbp range). This procedure was repeated 4 times, with a yield of 2-6 × 10^4
unique clones per pool (80 total pools) and a total of 3 × 10^6 unique clones.
Functional screening of the metagenomic library

2 µL from each of the initial 80 pools containing 2-6 × 10^4 unique clones were added to 3 mL of LB-carbenicillin (LB-carb, 100 µg/mL)
in glass culture tubes and grown at 37°C for 1 h. After 1 h, 10 µL of 10 mM hydrocortisone (in DMSO) was added to each culture and
incubated at 37°C for 20 h.
After a total of 20 h of growth post hydrocortisone addition, cultures were chemically extracted as follows: 5 µL of 10 mM
voriconazole was added to the cultures, followed by the addition of 7 mL of ethyl acetate solvent using a glass pipette. The resulting solution
was vortexed on a medium-high setting twice, allowing for the mixture to separate between vortexing steps. After the last vortexing
step, the mixture was left undisturbed for 5 min. 5 mL of the top organic layer of solvent was transferred to a 4 mL glass tube using a
chemically resistant 1 mL tip and pipette. The samples were then dried using a Labconco CentriVap Concentrator for 1.5 h. Dried
samples were then resuspended in 250 µL of methanol and sonicated for 5 min. Samples were then transferred to a 1.5 mL Eppendorf
tube and centrifuged. 180 µL of the clear solution was then deposited in a glass LC-MS vial and analyzed using HPLC-HRMS on an
Agilent 6545 LC/QTOF instrument using the same column and gradient as described for the D1-D20 plates, and the 4GHz MS
settings. An identical quantitative workflow was employed to identify the pools with the highest signal for the hydrocortisone metabolite/
internal standard ratio.
Selected pools were then plated for colony forming unit (CFU) counting. CFU counting was done as follows: 5 µL of each glycerol
stock was taken and diluted into 95 µL of LB-carb by pipetting up and down. 5 µL was taken out from the first tube and moved to a
second tube also containing 95 µL of LB-carb, with pipetting to ensure proper mixing. Serial dilutions were done four more times for a
total of five dilutions. 50 µL of each serial dilution was plated onto LB-carb agar plates and spread evenly across the entire surface.
Plates were incubated at 37°C overnight and checked for colonies the next day. The number of colonies on a plate was counted and
used to calculate CFUs for the original positive sample glycerol stock.
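The back-calculation from plate counts to stock concentration follows directly from the dilution scheme: each step (5 parts into 95 parts) is a 20-fold dilution, and a fixed volume is plated. The sketch below assumes the volumes are in microliters (5 µL into 95 µL, 50 µL plated); the function name and the example colony count are hypothetical.

```python
def cfu_per_ml(colonies, dilution_step, fold_per_step=20.0, plated_ul=50.0):
    """Estimate CFU/mL of the undiluted stock from one countable plate.

    dilution_step: which serial dilution was plated (1 = first 20-fold dilution).
    """
    total_dilution = fold_per_step ** dilution_step
    plated_ml = plated_ul / 1000.0
    return colonies * total_dilution / plated_ml


if __name__ == "__main__":
    # Hypothetical example: 40 colonies on the plate from the third dilution.
    print(f"{cfu_per_ml(40, 3):.2e} CFU/mL")
```

With 40 colonies on the third-dilution plate, the total dilution is 20^3 = 8,000 and the estimate is 40 × 8,000 / 0.05 mL = 6.4 × 10^6 CFU/mL.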
The positive glycerol stock sample was then diluted to obtain sub-pools of 2,000 clones (based on the CFU counts) per sub-pool
with a total of 20 sub-pools to be screened for each positive sample. These 20 sub-pools were then tested for hydrocortisone meta-
bolism and glycerol stocked following the same protocol outlined above for the 20,000 level sub-pools, except that hydrocortisone
was added after 4 h of culture instead of 1 h and that a glycerol stock was made from each sub-pool 12 h after the culture was initi-
ated. Glycerol stock from the top 2,000-clone sub-pool that produced the hydrocortisone metabolite was then subjected to CFU
counting, further sub-pooled and tested at the 200-clone level. The top 200-clone sub-pool was treated the same way and tested
at the 20-clone level.
Glycerol stock from the sub-pool positive for hydrocortisone metabolite at the 20-clone level was then serially diluted and plated
onto five LB-carb plates and incubated overnight at 37°C. The next day, the plates were retrieved and single colonies were picked
and grown in 4 × 96 deep-well plates with 500 µL LB-carb per well. The deep-well plates were then grown at 37°C at 220 RPM shaking
overnight. The next day, 100 µL of culture was saved from each well into another 96-well plate for glycerol stock using a multichannel
pipette.
10 µL from each well of each row were pooled before inoculating a new glass culture tube containing 3 mL LB-carb. The resulting
32 culture tubes corresponding to the 32 pooled rows were grown at 37°C for 4 h before the addition of 10 µL of 10 mM
hydrocortisone, after which samples were grown at 37°C for 20 h.
After 20 h, 5 µL of 10 mM voriconazole was added and chemical extraction with ethyl acetate was performed as described
previously and analyzed using HPLC-HRMS. The positive hit indicated which row the hydrocortisone metabolite originated from. 5 µL of
each well in the positive row was used to inoculate a new glass culture tube with 3 mL LB-carb, which was then grown for 4 h at 37°C,
after which hydrocortisone was added and the metabolism screen proceeded as previously described. A single well from the positive
row was then identified as a positive hit. See Figure S6.
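The row-pooling step is a simple two-stage deconvolution: 4 plates of 8 rows give the 32 row pools mentioned above, and the 12 wells of the single positive row are then tested individually. The sketch below counts the assays this requires and maps a pool index back to a plate and row; a standard 8 × 12 plate layout and the pool ordering are assumptions, and all names are hypothetical.

```python
PLATES = 4
ROWS_PER_PLATE = 8    # rows A-H of a standard 96-well plate (assumed layout)
WELLS_PER_ROW = 12


def row_pool_label(pool_index):
    """Map a 0-based row-pool index (0..31) to (plate number, row letter)."""
    plate, row = divmod(pool_index, ROWS_PER_PLATE)
    return plate + 1, "ABCDEFGH"[row]


row_pools = PLATES * ROWS_PER_PLATE                      # first-stage assays
two_stage_assays = row_pools + WELLS_PER_ROW             # plus wells of the positive row
single_stage_assays = PLATES * ROWS_PER_PLATE * WELLS_PER_ROW

if __name__ == "__main__":
    print(row_pools, two_stage_assays, single_stage_assays)
    print(row_pool_label(9))
```

Pooling by row therefore needs 44 extractions instead of 384, provided exactly one row is positive.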
The glycerol stock of the positive well was then used to inoculate a 5 mL LB-carb culture that was grown overnight at 37°C and
shaken at 220 RPM. The next day, the plasmid containing metagenomic DNA was isolated from the culture using a QIAprep
Spin Miniprep Kit (QIAGEN, USA) according to the manufacturer’s instructions and subjected to DNA sequencing (Sanger) to identify
the metagenomic insert on the plasmid. End-sequences were compared to the assembled PD metagenome using BLASTn.
Heterologous expression of PD-derived 20β-HSDH

The 20β-HSDH gene identified from the functional metagenomic screening of PD metagenomic DNA was codon optimized for
expression in E. coli and ordered as a gBlock from IDT (USA). Primers annealing to the gBlock insert were designed to contain
overhangs with restriction enzyme cut sites that would recognize the restriction-digested pET28a vector. The forward primer contains the cut site for
NdeI, and the reverse primer contains the cut site for NotI. Primer sequences are as follows (5′ to 3′):
Forward: CTGGTGCCGCGCGGCAGCCATATGGCAGATGAATCATCGAAGATTCC
Reverse: GGTGGTGCTCGAGTGCGGCCTTAGAAAACTGAATACCCACCGTCC
PCR product was gel extracted and cloned into a double-digested pET28a vector using the InFusion cloning kit (Takara). InFusion
product was transformed into chemically competent BL21-DE3 E. coli cells. Plasmid was recovered using a QIAprep Spin Miniprep
Kit (QIAGEN, USA) and sequenced (Sanger) to confirm the presence of the correct insert.
To clone the native sequence of the 20β-HSDH gene identified from the functional metagenomic screening of PD metagenomic
DNA, primers designed with restriction cut sites (as described above) were used to clone it from the metagenomic clone into a
double-digested pET28a vector. Primer sequences are as follows (5′ to 3′):
Forward: CTGGTGCCGCGCGGCAGCCATATGGCAGACGAATCATCGAAGATTC
Reverse: GGTGGTGCTCGAGTGCGGCCCAAATGGGGTACGGTGATTAGAAGAC
Plasmid was recovered using a QIAprep Spin Miniprep Kit (QIAGEN, USA) and sequenced (Sanger) to confirm the presence of
the correct insert.
Cultures of BL21-DE3 E. coli harboring either the codon optimized or native sequence of the 20β-HSDH gene were grown from
glycerol stock overnight at 37°C, 220 RPM. The next day, cultures were back diluted to an OD600 of 0.05, grown at 37°C,
220 RPM until OD600 was 0.4, and induced with a final concentration of 1 mM IPTG. After 4 h of growth (from the time of back
dilution), 10 µL of 10 mM hydrocortisone was added. All cultures were then grown for 20 h at 37°C, 220 RPM, and chemical extraction and
HPLC-HRMS analysis were performed as described above.
Metagenomic and metatranscriptomic analyses

PD metagenomic sequencing: metagenomic DNA of PD, prepared as described above, was sheared to a mean size of 500 bp
using a Covaris E220 sonicator (Covaris, USA). An Illumina sequencing library was prepared from the sheared DNA using the
automated Apollo 324™ NGS Library Prep System and the PrepX DNA library kit (WaferGen, USA) according to the manufacturer’s
protocol. This step included DNA end repairing, A-tailing, adaptor ligation, and limited PCR amplification. After examination on
Bioanalyzer (Agilent, CA) DNA HS chips for size distribution, and quantification by Qubit fluorometer (Invitrogen, CA), the library was
sequenced on an Illumina HiSeq 2500 Rapid Flowcell as paired-end reads, along with 8 bp index reads, following the manufacturer’s
protocol (Illumina, USA). Raw sequencing reads were filtered by Illumina HiSeq Control Software to generate Pass-Filter reads for
further analysis.
PD metatranscriptomic sequencing: Total nucleic acid (DNA and RNA) was prepared from the RNAlater-preserved PD stool
sample using the AllPrep PowerViral DNA/RNA Kit (QIAGEN, USA). DNA was digested and removed using Turbo DNase (ThermoFisher
Scientific, USA) following the manufacturer’s instructions, and remaining DNA-free RNA concentration and quality were measured
using a NanoDrop 2000 and a Bioanalyzer (Agilent, USA). This RNA was further subjected to ribosomal RNA depletion using the
Ribo-Zero Gold rRNA Removal Kit (Epidemiology) (Illumina, USA), and a strand-specific Illumina RNA-seq library was prepared
from it as described for the PD metagenomic DNA sample.
PD-CL-100 metagenomic library sequencing: plasmids were isolated using the QIAprep Spin Miniprep Kit (QIAGEN, USA) from two
sub-pools containing a total of 100,000 unique clones. An Illumina sequencing library was prepared and sequenced from the resulting
DNA as described above, except that it was sequenced on a NovaSeq 6000 instead of the HiSeq 2500.
PD metagenomic sequencing yielded 35,527,955 reads (2 × 175 bp), PD metatranscriptomic sequencing yielded 30,796,174
reads (2 × 150 bp), and PD-CL-100 yielded 69,173,413 reads (2 × 151 bp).
Raw Illumina reads were filtered using PRINSEQ, according to the following parameters: minimum average quality score of 30,
maximum percentage of undetermined reads of 2%. Trimming on each end was implemented until a minimum average quality score
of 30 was reached. Trimmed reads that were shorter than half of the original read length were discarded (Schmieder and Edwards, 2011).
SPAdes was used to assemble the resulting pairs and singletons of filtered reads, using default parameters (Bankevich et al., 2012).
For PD-CL-100, Bowtie2 was used to identify and remove reads mapping to the pGFPuv plasmid backbone using the following
settings: --end-to-end --sensitive. Reads that did not map to the backbone were then aligned to the SPAdes scaffolds produced from
the PD metagenomic dataset, also using Bowtie2 (--very-sensitive-local) (Langmead and Salzberg, 2012). Reads from the PD
metagenomic dataset itself were also aligned to their corresponding SPAdes-produced scaffolds using the same settings. RPKM values
and coverage breadths for each PD scaffold equal to or longer than 2 Kbp were calculated on the basis of the Bowtie2 alignment results
for each of the two read datasets (PD metagenome and PD-CL-100). A PD scaffold was considered ‘‘present’’ in PD-CL-100 if
metagenomic reads from PD-CL-100 covered at least 50% of the scaffold’s length; otherwise, its RPKM was set to zero.
To infer the taxonomy of the PD scaffolds, we ran kraken-1.1.1 with standard settings and the ‘‘MiniKraken DB_8GB’’ pre-built
database constructed from complete bacterial, archaeal, and viral genomes in RefSeq (as of Oct. 18, 2017) (Wood and Salzberg, 2014).
See Table S4.
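The scaffold-level quantities used above can be sketched as follows. RPKM is computed with its usual definition (reads per kilobase of scaffold per million mapped reads); whether the authors normalized by mapped or total reads is not stated, so that choice, the interval representation of alignments, and all function names are assumptions.

```python
def rpkm(reads_mapped, scaffold_len_bp, total_mapped_reads):
    """Reads per kilobase of scaffold per million mapped reads."""
    return reads_mapped * 1e9 / (scaffold_len_bp * total_mapped_reads)


def coverage_breadth(intervals, scaffold_len_bp):
    """Fraction of scaffold positions covered by at least one alignment.

    intervals: iterable of 0-based, half-open (start, end) alignment coordinates.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start  # close the previous merged block
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)          # extend the current block
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / scaffold_len_bp


def library_rpkm(reads_mapped, intervals, scaffold_len_bp, total_mapped_reads,
                 min_breadth=0.5):
    """RPKM of a scaffold in the clone library, zeroed if breadth is below 50%."""
    if coverage_breadth(intervals, scaffold_len_bp) < min_breadth:
        return 0.0
    return rpkm(reads_mapped, scaffold_len_bp, total_mapped_reads)
```

In practice the intervals would be parsed from the Bowtie2 alignments, for example from a sorted BAM file; that parsing step is outside this sketch.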
Metatranscriptomic analysis: we first used BLASTn to map a database of quality filtered PD metatranscriptomic reads to a query
of the PD scaffold harboring the 20β-HSDH gene, while specifying an e-value cutoff of 1e-30. Next, we matched these
BLASTn-identified reads to either the 20β-HSDH gene or neighboring genes that we annotated in 5 Kbp windows upstream and downstream of it.
For this step, we used the Geneious assembler in Geneious (Kearse et al., 2012) with the following parameters: minimum overlap:
50 bp, minimum percent identity at overlap: 90%, and maximum percentage of mismatch per read: 20%. Matched reads per
gene were counted and used to construct the bar graph in Figure 6E.
TP and UP gene deletions in E. coli BW25113

E. coli BW25113 mutants that harbor a replacement of deoA or udp with a kanamycin resistance gene were obtained from the Keio
collection (Baba et al., 2006). Since the kanamycin resistance gene is flanked by FLP recognition target sites, we decided to excise it
and obtain in-frame deletion mutants. Plasmid pCP20, encoding the FLP recombinase, was transformed to each of the mutants by
electroporation, and transformants were selected on ampicillin at 30°C for 16 h. 10 transformants from each mutant were then picked
in 10 mL LB medium with no selection, and incubated at 42°C for 8 h to cure them of the temperature-sensitive pCP20 plasmid.
Each growing colony was then streaked on three plates (LB-ampicillin, LB-kanamycin, and LB with no selection). Mutants that could
only grow on LB, but not on LB-ampicillin (confirming the loss of the pCP20 plasmid), nor on LB-kanamycin (confirming the excision of
the kanamycin resistance gene) were confirmed to harbor the correct deletion using PCR and DNA sequencing. Primers
deoA-Check-F: 5′-CGCATCCGGCAAAAGCCGCCTCATACTCTTTTCCTCGGGAGGTTACCTTG-3′,
deoA-Check-R: 5′-CAAATTTAAATGATCAGATCAGTATACCGTTATTCGCTGATACGGCGATA-3′,
udp-Check-F: 5′-CGCGTCGGCCTTCAGACAGGAGAAGAGAATTACAGCAGACGACGCGCCGC-3′, and
udp-Check-R: 5′-TGTCTTTTTGCTTCTTCTGACTAAACCGATTCACAGAGGAGTTGTATATG-3′
were used in PCR experiments to confirm the deletion of the deoA or udp genes and the kanamycin resistance gene
replacing them (Baba et al., 2006). To construct the ΔdeoA/Δudp double knockout, the in-frame Δudp knockout obtained above was
used as a starting point. Plasmid pKD46 expressing the Lambda Red recombinase was transformed into it using electroporation
(Datsenko and Wanner, 2000) and transformants were selected on LB-ampicillin at 30°C for 16 h. One ampicillin-resistant transformant
was then cultured at 30°C in 50 mL of LB-ampicillin, with an added 50 µL of 1 M L-arabinose to induce the expression of the
recombinase. At an optical density of 0.4-0.6, electrocompetent cells were prepared from the growing culture by serial washes in ice cold
10% glycerol, and 300 ng of a linear PCR product was transformed into them by electroporation. This PCR product was prepared by
using the deoA-Check-F and deoA-Check-R primers on a template DNA prepared from the deoA mutant of the Keio library, in which a
kanamycin resistance gene replaces deoA. After electroporation, transformants were selected on LB-kanamycin at 37°C to induce
the loss of the temperature sensitive pKD46 plasmid, cultured in LB-kanamycin overnight at 37°C, and checked by PCR to confirm
the correct recombination position. Finally, the kanamycin resistance gene was excised from the deoA locus by the FLP recombinase
using the same strategy explained above, resulting in the final ΔdeoA/Δudp mutant.
MDM-Screen of capecitabine using E. coli mutants

Wild type E. coli BW25113, and corresponding TP knockout (ΔdeoA), UP knockout (Δudp), and TP/UP double knockout (ΔdeoA/Δudp)
strains were cultured overnight in LB medium (aerobically, shaking at 37°C, 50 mL each). Triplicates of 3 mL for each strain
were incubated with 10 µL of 10 mM capecitabine (in DMSO) for an additional 24 h in an anaerobic chamber along with bacteria-only
and media-only controls. Cultures were then extracted and analyzed as previously described in the PD screen, except for
the addition of 20 µL of 0.25 mg/mL of an internal standard (voriconazole) prior to the extraction.
MDM-Screen of other FPs using E. coli mutants

Wild type E. coli BW25113, and corresponding TP knockout (ΔdeoA), UP knockout (Δudp), and TP/UP double knockout (ΔdeoA/Δudp)
strains were cultured overnight in LB medium (aerobically, shaking at 37°C, 50 mL each). Aliquots (100 µL) of each strain
were used to inoculate 3 mL of M9 medium, which were grown again overnight (aerobically, shaking at 37°C). 10 µL of 10 mM
doxifluridine (in DMSO) or trifluridine (in methanol) were incubated with each culture for an additional 24 h in an anaerobic chamber, along
with bacteria-only and medium-only controls. Cultures were spun down and collected supernatants were lyophilized. The dried
residues were then resuspended in 500 µL methanol and analyzed by HPLC-MS (Agilent Single Quad; column: Poroshell 120 EC-C18
2.7 µm, 4.6 × 100 mm; flow rate: 0.6 mL/min; solvent A: 0.1% formic acid in water; solvent B: 0.1% formic acid in acetonitrile) and the
following gradient: 1 min, 0.5% B; 1-20 min, 0.5%–35% B; 25-30 min, 35%–100% B; 30-35 min, 100% B. The structures of all
resulting metabolites were confirmed by comparison to authentic standards. See Figures S4 and S5.
Microbiome-dependent pharmacokinetic experiment

Twelve C57BL/6 mice were treated with a commonly used cocktail of antibiotics (1 g/L of ampicillin, neomycin, metronidazole and
0.5 g/L vancomycin) in drinking water for 14 days (Planer et al., 2016). The antibiotic solution was supplemented with 5 g/L aspartame to
make it more palatable (Karmarkar and Rock, 2013). During these two weeks, the gut microbiome composition was monitored by
collecting feces from each mouse and performing molecular and microbiological analyses to make sure the microbiome was being
cleared by the antibiotic treatment. On day 15, no antibiotics were administered for 24 h (a washout period). On day 16, mice
were separated into two groups, 6 per group (3 males and 3 females). In group 1, mice remained non-colonized. In group 2,
mice were administered 200 µL of freshly thawed PD glycerol stock using oral gavage. On day 17, the oral gavage was repeated
the same way to ensure the colonization of the administered bacteria (fecal samples were collected on days 16 and 17 and cultured
anaerobically to ensure colonization). On day 18, the pharmacokinetic experiment was performed by monitoring the fate of
capecitabine in mouse blood and feces over time. A capecitabine dose equivalent to a single human dose and adjusted to the weight of the
mice was administered by oral gavage (755 mg/kg, as a solution in 50 µL DMSO), then serial sampling of tail vein blood (by tail
snipping), as well as fecal collection, was performed at these time points (zero, 20 min, 40 min, 60 min, 2 h, and 4 h post dosage). Blood for
each time point (30 µL) was collected using a 30 µL capillary tube and bulb dispenser (Drummond Microcaps, Drummond Scientific),
quickly dispensed in 60 µL EDTA to prevent blood coagulation, and stored on ice for up to 4 h and then frozen at −80°C until further
analysis. Feces were also collected at the same time points (even though defecation was left at will, we succeeded in collecting feces
for most time points), stored on ice for up to 4 h and then frozen at −80°C until further analysis. After the 4 h pharmacokinetic time
point, mice were euthanized.
For chemical extraction, 2 μL of an internal standard solution (0.5 mg/mL of voriconazole) were added to the blood/EDTA solution mentioned above, and the sample was mixed using a vortex mixer. Next, 500 μL of ethyl acetate was added and mixed. The sample was then centrifuged briefly at 15,000 rpm, and the organic layer was transferred to a glass tube and evaporated under vacuum using rotary evaporation (Speed Vac). The dried residue was dissolved in 100 μL of MeOH, and the solution was centrifuged at 15,000 rpm
and transferred to an autosampler vial for HPLC-HR-MS analysis. For fecal samples, pellets were weighed (for later normalizations) and suspended in 500 μL sterile Milli-Q water (Millipore Corporation, USA). 2 μL of an internal standard solution (0.5 mg/mL of voriconazole) were added to the sample, and the mixtures were extracted with 500 μL 1:1 ethyl acetate:MeOH. Fecal debris was then spun down and collected supernatants were dried under vacuum using a rotary evaporator (Speed Vac). The dried residues were suspended in 100 μL MeOH. The final solutions were centrifuged at 15,000 rpm and transferred to autosampler vials.
The prepared samples were analyzed by HPLC-HR-MS (Agilent QTOF). Chromatographic separation was carried out on a Poroshell 120 EC-C18 column (2.7 μm, 2.1 × 100 mm; Agilent, USA) with the gradient: 99.5% A, 0.5% B to 100% B in 20 min and a flow rate of 0.25 mL/min, where A = 0.1% formic acid in water and B = 0.1% formic acid in acetonitrile. A 10 μL aliquot of the reconstituted extract was injected into the HPLC-HR-MS system, and the Area Under the Curve (AUC) was integrated for each metabolite and normalized by the internal standard’s AUC. Peak identities were confirmed by accurate mass, and by comparison of chromatographic retention time and MS/MS spectra to those of authentic standards.
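The internal-standard normalization described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names and peak areas are hypothetical, and dividing the fecal ratio by pellet mass is an assumed reading of "weighed (for later normalizations)".

```python
def normalize_auc(metabolite_auc, internal_standard_auc):
    """Return the metabolite AUC relative to the internal standard AUC."""
    if internal_standard_auc <= 0:
        raise ValueError("internal standard AUC must be positive")
    return metabolite_auc / internal_standard_auc


def normalize_fecal_auc(metabolite_auc, internal_standard_auc, pellet_mass_mg):
    """Normalize to the internal standard, then to pellet mass (assumed scheme)."""
    if pellet_mass_mg <= 0:
        raise ValueError("pellet mass must be positive")
    return normalize_auc(metabolite_auc, internal_standard_auc) / pellet_mass_mg


# Hypothetical peak areas in arbitrary units; not values from the study.
blood_ratio = normalize_auc(metabolite_auc=2.4e6, internal_standard_auc=1.2e6)
fecal_ratio = normalize_fecal_auc(3.0e6, 1.5e6, pellet_mass_mg=20.0)
print(blood_ratio, fecal_ratio)
```

Reporting a ratio rather than a raw area compensates for sample-to-sample differences in extraction recovery and injection, since the internal standard (voriconazole here) is added in a fixed amount to every sample.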
All statistical analyses were performed in MATLAB. p values less than 0.01 (after correction for multiple hypotheses, if applicable) were considered significant. For comparisons of the means of two populations, Welch’s t test was generally used. In cases where the independence assumption of this test was not met (as determined by the form of the null hypothesis), permutation tests were used instead. Comparisons of multiple means were done via ANOVA. Comparisons of two proportions were done via a proportions z-test. For all analyses, the meaning and value of n and the measures of center, dispersion, and precision used can be found in the relevant main text or in Method Details.
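As an illustration of the permutation approach and of a multiple-hypothesis correction, the sketch below runs an exact two-sided permutation test on a difference in means and applies a Bonferroni adjustment. It is written in Python rather than MATLAB, the data are hypothetical, and the choice of test statistic and of Bonferroni as the correction are assumptions made for the example, not a description of the authors' scripts.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled values into two groups.
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so the observed labeling counts as "at least as extreme".
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def bonferroni(p_values):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical normalized AUC values for two groups of three samples.
p = exact_permutation_p([1.0, 1.2, 0.9], [2.0, 2.3, 2.1])
adjusted = bonferroni([p, 0.004, 0.5])
print(p, adjusted)
```

With only three samples per group there are 20 possible relabelings, so the smallest attainable two-sided p value is 0.1; exact enumeration makes this resolution limit explicit, which is worth checking before applying a 0.01 threshold.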
Figure S1. Expanded Bacterial Community Composition Analysis for PD, D1-20 and Their Ex Vivo Cultures, Related to Figures 1 and 3
A) Four-day time course of PD cultured ex vivo in various media. Family level bacterial composition of the original PD fecal sample (far left), as well as that of PD
ex vivo cultures grown anaerobically in 14 different media over four days (.01, .02, .03, .04). 16S rRNA gene amplicon sequences that could not be classified at the
family level, and families with less than 1% relative abundance in all samples are grouped into ‘‘Other.’’ Cultures are ordered according to their Jensen-Shannon
divergence (DJS) from the original PD sample (upper axes, computed at the family level), where lower values indicate higher similarity to PD. Note that cultures
grown in mGAM are the most similar to PD. B) Family level bacterial composition of the original D1-20 fecal samples and corresponding BG cultures. ‘‘Other’’
includes 16S rRNA gene amplicon sequences not classified at the family level and taxa that are below 5% abundance in all sequenced samples. Replicate BG
cultures for each donor are labeled as ‘‘BG_1,’’ ‘‘BG_2,’’ and ‘‘BG_3.’’ See also STAR Methods, Table S2.
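For readers unfamiliar with the Jensen-Shannon divergence used to order the cultures in panel A, a minimal sketch follows. The base-2 logarithm (which bounds the value between 0 and 1), the function name, and the family abundances are assumptions for illustration; the caption does not state which logarithm base was used.

```python
from math import log2


def jensen_shannon_divergence(p, q):
    """JSD (base 2) between two compositions given as {family: abundance}."""
    families = set(p) | set(q)
    p_total = sum(p.values())
    q_total = sum(q.values())
    divergence = 0.0
    for family in families:
        pi = p.get(family, 0.0) / p_total
        qi = q.get(family, 0.0) / q_total
        mi = 0.5 * (pi + qi)  # mixture distribution
        if pi > 0:
            divergence += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            divergence += 0.5 * qi * log2(qi / mi)
    return divergence


# Hypothetical family-level relative abundances; not data from the study.
original = {"Bacteroidaceae": 0.5, "Lachnospiraceae": 0.3, "Other": 0.2}
culture = {"Bacteroidaceae": 0.4, "Lachnospiraceae": 0.3, "Other": 0.3}
print(jensen_shannon_divergence(original, culture))
```

Identical compositions give 0 and compositions with no shared families give 1, so lower values indicate a culture that more closely resembles the original sample.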
Figure S2. Sequential Metabolism of Hydrocortisone Acetate by the PD Microbiome, Related to Figure 2
HPLC-MS analysis of hydrocortisone acetate (1) and hydrocortisone (2) incubated with PD mGAM.02 culture, mGAM.02 broth, P. distasonis culture, or C. bolteae
culture. An HPLC chromatogram at an absorbance of 254 nm is shown for all samples, indicating the conversion of hydrocortisone acetate to hydrocortisone by
both P. distasonis and C. bolteae, and the conversion of both hydrocortisone acetate (in two steps) and hydrocortisone (in one step) to 20β-dihydrocortisone (3) in
the presence of the PD microbiome.
Figure S3. Structures of All Drugs Quantified in the D1-20 MDM-Screen and Their Known or Predicted Metabolites, Related to Figure 4
Figure S4. MDM Deglycosylation of the Anticancer Drugs Capecitabine and Trifluridine, Related to Figure 5
A) A heatmap indicating the ability of each of 12 tested bacterial strains to perform capecitabine deglycosylation (d-G) in order to identify candidate species for the
homology-based gene finding approach. E. faecalis TYG11, E. coli TYG1, and E. coli TYG2 were originally isolated from PD. B) HPLC-MS analysis of trifluridine (1)
incubated with wild type E. coli BW25113 (WT), and Δudp, ΔdeoA, and ΔdeoA/Δudp mutants in M9 medium. An HPLC chromatogram at an absorbance of
250 nm is shown for all samples, indicating the conversion of trifluridine (1) to trifluorothymine (2) by wild type E. coli BW25113 (WT), Δudp, and ΔdeoA, but not the
ΔdeoA/Δudp mutant.
Figure S5. MDM Deglycosylation of the Anticancer Prodrug Doxifluridine, Related to Figure 5
HPLC-MS analysis of doxifluridine (1) incubated with wild type E. coli BW25113 (WT), and Δudp, ΔdeoA, and ΔdeoA/Δudp mutants in M9 medium. Extracted Ion
Chromatograms for both doxifluridine (1) and its resulting MDM metabolite 5-fluorouracil (2) are shown for all samples, indicating the complete conversion of
doxifluridine (1) to 5-fluorouracil (2) by wild type E. coli BW25113 (WT), Δudp, and ΔdeoA, but not the ΔdeoA/Δudp mutant.
Figure S6. Functional Metagenomic Screening for Hydrocortisone Metabolizing Enzymes in the PD Microbiome, Related to Figure 6
Schematic representation of the screening scheme followed to identify the 20β-HSDH gene from the PD microbiome metagenomic clone library.
Figure S7. Microbiome-Dependent Pharmacokinetics of Capecitabine, Related to Figure 7

HPLC-HR-MS based quantification of capecitabine and its human metabolite (5′-deoxy-5-fluorocytidine) in fecal and blood samples from mice colonized with PD
in comparison to non-colonized ones. No significant difference in the levels of capecitabine was observed between the two groups (p > 0.05; significance was
determined by testing the intersection null hypothesis with marginal two-tailed t tests using the Bonferroni correction to control family-wise error rate). Error bars
represent the standard error of the mean.
Resource
Metabolic Dynamics and Prediction of Gestational Age and Time to Delivery in Pregnant Women
Liang Liang,
Marie-Louise Hee Rasmussen,
Brian Piening, ..., Hanyah Zackriah,
Michael Snyder, Mads Melbye
Correspondence
mpsnyder@stanford.edu (M.S.),
mmelbye@stanford.edu (M.M.)
In Brief
Identification of blood metabolites in
pregnant women that can accurately
predict gestational age and provide
insights into pregnancy variations
undetected by ultrasound.
Highlights
• Weekly metabolome of maternal blood changes dynamically through healthy pregnancy
• A metabolic clock of five blood metabolites accurately predicts gestational age
• Two to three metabolites identify labor onset within two, four, and eight weeks
• Women with metabolic clocks that outpaced ultrasound evaluation tend to deliver earlier
Liang et al., 2020, Cell 181, 1680–1692

Resource
Metabolic Dynamics and Prediction of Gestational
Age and Time to Delivery in Pregnant Women
Liang Liang,1 Marie-Louise Hee Rasmussen,2 Brian Piening,1,8 Xiaotao Shen,1 Songjie Chen,1 Hannes Röst,1,9
John K. Snyder,3 Robert Tibshirani,4 Line Skotte,2 Norman CY. Lee,3 Kévin Contrepois,1 Bjarke Feenstra,2
Hanyah Zackriah,5 Michael Snyder,1,10,* and Mads Melbye2,6,7,*
1Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
2Department of Epidemiology Research, Statens Serum Institut, Copenhagen, 2300, Denmark
3Department of Chemistry and the Chemical Instrumentation Center, Boston University, Boston, Massachusetts 02215, USA
4Department of Statistics and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
5Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
6Department of Clinical Medicine, University of Copenhagen, Copenhagen, 2200, Denmark
7Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
8Present address: Earle A. Chiles Research Institute, Providence Portland Medical Center, Portland, OR, 97213 USA
9Present address: Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON M5S 3E1, Canada
10Lead Contact
*Correspondence: mpsnyder@stanford.edu (M.S.), mmelbye@stanford.edu (M.M.)

SUMMARY
Metabolism during pregnancy is a dynamic and precisely programmed process, the failure of which can bring
devastating consequences to the mother and fetus. To define a high-resolution temporal profile of metabo-
lites during healthy pregnancy, we analyzed the untargeted metabolome of 784 weekly blood samples from
30 pregnant women. Broad changes and a highly choreographed profile were revealed: 4,995 metabolic fea-
tures (of 9,651 total), 460 annotated compounds (of 687 total), and 34 human metabolic pathways (of 48 total)
were significantly changed during pregnancy. Using linear models, we built a metabolic clock with five me-
tabolites that time gestational age in high accordance with ultrasound (R = 0.92). Furthermore, two to three
metabolites can identify when labor occurs (time to delivery within two, four, and eight weeks, AUROC ≥
0.85). Our study represents a weekly characterization of the human pregnancy metabolome, providing a
high-resolution landscape for understanding pregnancy with potential clinical utilities.
INTRODUCTION

Pregnancy is one of the most critical periods for mother and child (Alkema et al., 2016; Say et al., 2014). It involves a tremendous flow of physiological changes and metabolic adaptations week by week, and even small deviations from the norm might have detrimental consequences at different pregnancy stages. For example, approximately 20% of all pregnancies end in miscarriage (< 20 weeks), and around 10% end in preterm birth (< 37 weeks) (Blencowe et al., 2013; Wang et al., 2003). The latter is the leading cause of global neonatal morbidity and mortality (Blencowe et al., 2013). Of 200 million annual pregnancies, 300,000 pregnancy- and birth-related maternal deaths and 7 million perinatal deaths occur worldwide (GBD 2013 Mortality and Causes of Death Collaborators, 2015; Sedgh et al., 2014). With a better understanding of how pregnancy is regulated, even small improvements in obstetric health care can enhance the well-being of many women and children.

An accurate estimation of the timing of pregnancy and birth is important for many clinical decisions in obstetrics, including determination of preterm birth and related treatment regimens (Committee on Obstetric Practice, the American Institute of Ultrasound in Medicine, and the Society for Maternal-Fetal Medicine, 2017). The current clinical method of determining the gestational age and due date is based on information about the last menstruation date, which can be imprecise, or ultrasound imaging, which depends on accessibility at early pregnancy (Committee on Obstetric Practice, the American Institute of Ultrasound in Medicine, and the Society for Maternal-Fetal Medicine, 2017). Missing the time window is common even in developed countries: in the United States, approximately 900,000 pregnancies annually do not have a prenatal visit before the second or third trimester (Martin et al., 2018).

The maternal circulatory system connects with the fetal circulatory system through the placenta, carrying bioactive molecules and biomarkers such as steroid hormones, micronutrients, and circulating nucleic acids, whose concentrations alter as gestation progresses (King, 2000; Koh et al., 2014; Tulchinsky et al., 1972; Wang et al., 2016). Recent work on cell-free RNA suggests that markers in maternal blood can be used to estimate gestational age, but sequencing can be expensive and time-consuming, and the accuracy, at present, is not ideal (Ngo et al., 2018). Therefore, a more accurate and cost-effective method for estimating gestational age and delivery time, possibly using blood metabolites, is needed. In addition, current clinical tests often only focus on a few markers, whereas research covering more molecules often examines the profiles at one or a few time points during pregnancy (Bahado-Singh et al., 2012; Chan et al., 2003; Dudzik et al., 2014; Gagnon et al., 2008; Kenny et al., 2010; Koh et al., 2014; López-Hernández et al., 2019; Romero et al., 2010; Sachse et al., 2012; Soldin et al., 2005). Thus, a high-resolution landscape of pregnancy-related metabolites during healthy pregnancy and the postpartum period is still poorly understood.

Here, we use untargeted metabolomics (Kaddurah-Daouk et al., 2008) to systematically profile blood metabolites throughout pregnancy with weekly sampling of maternal blood. The study identified a large number of pregnancy-related metabolites and metabolic pathways offering a comprehensive view of the metabolite changes during healthy pregnancy and the postpartum period. Leveraging the high-resolution datasets, we built a metabolic clock that not only predicts gestational age in high accordance with the first-trimester ultrasound, the clinical gold standard, but also recovers personal pregnancy variations undetected by ultrasound but capable of affecting delivery time.

RESULTS

Danish Pregnancy Cohort: A Study of Normal Pregnancy with High-Density Sampling
To capture the highly dynamic pregnancy process at high resolution, we established a multi-year single-center Danish normal pregnancy cohort and a design of high-density blood sampling. Consenting female participants submitted weekly blood draws beginning in week 5 of pregnancy and ending in the postpartum period. A total of 30 women with weekly blood samples were assigned to a discovery (N = 21) and a validation (test set 1, N = 9) cohort (Table 1; Figures 1A and S1A). The samples were analyzed in two separate years. In addition, another separate set of women (N = 8) was included as the secondary validation cohort. These samples were analyzed independently three years apart from the discovery cohort (test set 2) (Table 1).

Table 1. Demographics and Birth Characteristics of the Discovery and Validation Cohorts
                                        Discovery       Test Set 1      Test Set 2
                                        N = 21          N = 9           N = 8
Demographics
  Maternal age at birth, years          29.8 ± 3.1      29.7 ± 3.3      31.4 ± 1.0
  Previous births, No. (%)
    0                                   13 (61.9)       6 (66.7)        4 (50.0)
    1                                   8 (38.1)        2 (22.2)        3 (37.5)
    ≥2                                  0 (0)           1 (11.1)        1 (12.5)
  Pre-pregnancy BMI, kg/m²              22.1 ± 2.9      21.2 ± 3.4      21.1 ± 1.6
  Smoking during pregnancy, No. (%)
    Yes                                 0 (0)           0 (0)           1 (12.5)
    No                                  18 (85.7)       9 (100)         6 (75.0)
    Missing                             3 (14.3)        0 (0)           1 (12.5)
  Alcohol during pregnancy, No. (%)
    Yes                                 5 (23.8)        1 (11.1)        1 (12.5)
      Average number of units per week  0.80            1.0             0.25
    No                                  13 (61.9)       8 (88.9)        6 (75.0)
    Missing                             3 (14.3)        0 (0)           1 (12.5)
Birth characteristics
  Gestational age, days                 281 ± 8.4       280.7 ± 8.3     279.3 ± 9.5
  Mode of delivery, No. (%)
    Spontaneous vaginal birth           10 (47.6)       5 (55.6)        4 (50.0)
    Induced vaginal birth               7 (33.3)        1 (11.1)        3 (37.5)
    C-section before onset of labor     1 (4.8)         3 (33.3)        1 (12.5)
    C-section during labor              3 (14.3)        0 (0)           0 (0)
  Birth weight, grams                   3,638 ± 500     3,803 ± 662     3,362 ± 493
  Birth length, centimeters             52.4 ± 2        53.3 ± 2        51 ± 2.3
  Gender of child, No. (%)
    Male                                9 (42.9)        5 (55.6)        5 (62.5)
    Female                              12 (57.1)       4 (44.4)        3 (37.5)
Values are means (SDs) or numbers (percentages).

Weekly Pregnancy Progression Is Precisely Ordered by Metabolites
We randomized the 784 samples from the first 30 subjects within each cohort (Discovery and test set 1), processed them by using a standardized protocol (Contrepois et al., 2015), and analyzed them by liquid chromatography-mass spectrometry (LC-MS) for untargeted metabolomics across two separate years. After quality control, data filtering, and normalization (see STAR Methods) (N = 30), we identified 9,651 metabolic features across the different samples. Of these, 4,995 features (51.7%) were altered during pregnancy and/or the postpartum period (false discovery rate [FDR] < 0.05), suggesting extensive metabolic changes occur during pregnancy. We examined the data globally with principal component analysis (PCA), in which the samples were distributed on the basis of the first two principal components according to their gestational stages (Figure 1B; scree plot in Figure S1B and the partial least-squares discriminant analysis [PLS-DA] results in Figure S1C), regardless of individual variation and batches (Figures S1D and S1E). Interestingly, we found that metabolites with uni-directional behaviors dominated the features, and over half of them increased across pregnancy until reaching their peaks immediately before labor (Figures S1F and S1G).

To understand the potential function of pregnancy-related metabolites, we first annotated metabolic features by using an in-house library and a combined public spectral database (see details in STAR Methods). A total of 952 metabolic features were mapped to 687 compounds, which include plasma metabolites with important functions in humans. We then applied significance analysis for microarrays (SAM) to examine the correlation between the abundance of each compound and the reported gestational age of a woman at blood sampling. Among
Cell 181, 1680–1692, June 25, 2020 1681

[Figure 1 graphic. (B) PCA score plot, PC1 (5.1%) versus PC2 (3.9%), with samples colored by gestational age in weeks (10–40) and postpartum. (C) Weekly log2(intensity) change versus rank, marking the top increased and top decreased metabolites. (D and E) Heatmaps of log2(intensity) normalized to the first trimester versus adjusted GA (weeks 16–40 and PP), with rows annotated by metabolite group: amino acid metabolism, bile acid biosynthesis, caffeine metabolism, fatty acid metabolism, phospholipid metabolism, steroid hormone biosynthesis, and others.]
Figure 1. Untargeted Metabolomics Cluster the Weekly Plasma Samples Precisely According to Gestational Age
(A) Sampling scheme. Note that validation cohort refers to test set 1 in Table 1.
(B) Principal component analysis (PCA) distributed individual samples according to pregnancy stages (based on 9,651 features). The two PCs explaining the
largest part of the variation are shown.
(C) Plot shows the top 15 increased (red) and decreased (blue) metabolites (with MSI level 1 or 2 identification) in pregnancy.
(D and E) Heatmap displays the metabolite signal intensity averaged across individuals, showing the top 68 altered metabolites (D) increased and (E) decreased
by the end of pregnancy. Abbreviations are as follows: PP, postpartum. The gestational ages (GAs) were calculated by scaling delivery events to 40 weeks. The
the 687 annotated compounds, 460 compounds were significantly associated with pregnancy (67.0%; FDR < 0.05, SAM). In addition, 264 compounds were identified with a metabolomics standards initiative (MSI) level 1 or 2 identification (Viant et al., 2017), among which 176 compounds (66.7%) were significantly associated with pregnancy, as determined by linear regression with gestational age (FDR < 0.05, SAM).

Our dense sampling revealed detailed temporal patterns of molecular changes. Among the top 68 metabolites (of the 176) that changed over 50% over the whole pregnancy, those that increased (N = 30) included the steroid hormones estriol-16-glucuronide, estrone 3-sulfate, and tetrahydrodeoxycorticosterone (THDOC). All three increased more rapidly than the well-known steroids progesterone and 17α-hydroxyprogesterone (FDR < 0.05, SAM) (Figures 1C and 1D). By contrast, the top metabolites that decreased during pregnancy (N = 38) were mostly lipids or lipid-like molecules, such as monoacylglycerides (MGs), lysophosphatidylcholines (LPC or lysoPC), and oleoylcarnitine (Figures 1C and 1E). Hierarchical clustering of the weekly samples on the basis of the top 68 altered metabolites revealed a week order mostly consistent with the actual progression of gestational age (Table S1; Figures 1D and 1E). Intriguingly, most of these metabolite changes rapidly returned to baseline after childbirth (postpartum) (Figures 1B, 1D, and 1E). Together, these results suggest a dramatic and programmed change of human blood metabolites at a system level during pregnancy.

Metabolite Groups Altered during Pregnancy
To detect the functional groups of metabolites that change during pregnancy, we performed correlation analysis on the temporal intensity profiles of the top 68 pregnancy-related compounds mentioned above. In Figure S2, metabolites that were significantly increased or decreased tended to cluster together. Using existing structural and biological information, we first categorized the top changing compounds in pregnancy into seven groups. Interestingly, compounds of the same groups tended to cluster together in the correlation matrix. On the basis of the correlation relationship, we constructed a regularized partial correlation network using all pregnancy-related compounds to explore the potential regulatory relationships (Figures 2A and S2B). The topology of the network indicates that different metabolite groups occupied different positions; dense interactions occurred both between and within metabolite groups, with the densest interactions between central steroid hormones (Figure 2A). These findings highlight that even though the amount of each compound dynamically changes during pregnancy, a highly coordinated metabolite regulatory network underlies the pregnancy process.

We next examined the main clusters that were present in the correlation analysis. Three main clusters emerged from the hierarchical clustering of metabolites (Figure S2A), with a steroid cluster (e.g., androstane-3,17-diol, estriol-16-glucuronide, progesterone, 17α-hydroxyprogesterone, and THDOC) sitting between the large clusters of lipids and non-lipid molecules. Compared with the other steroids in this cluster that slowly but steadily increased throughout pregnancy, estriol-16-glucuronide exhibited a rapid increase before week 24 (Figures 1D and 2B). Nearly all upregulated metabolites positively correlated with this cluster of steroids, whereas all downregulated metabolites negatively correlated with this cluster (Figure S2A). This result suggests that different steroid hormones might regulate global metabolome dynamics during pregnancy.

Within the lipid cluster, intra-correlation was relatively high. The largest cluster was composed of LysoPCs (Figures 2A and S2A), a class of phospholipids. LysoPCs gradually decreased during pregnancy and increased after childbirth in a pattern that highly correlates with the steroid dehydroepiandrosterone sulfate (DHEA-S) (Figure 2C). LysoPCs are bioactive pro-inflammatory lipids that have been linked with organismal oxidative stress and inflammation (Sevastou et al., 2013). The second-largest cluster of lipids included several free fatty acids that were highly correlated within the cluster (Figures 2A and S2A). Long-chain fatty acids showed intricate dynamics in their amounts revealed by the dense sampling. Hexadecadienoylcarnitine and tetracosahexaenoic acid (THA) decreased at the beginning of pregnancy, followed by waves of increased amounts similar to other fatty acids in the late second and third trimesters (Figure 2D). After childbirth, the amounts of most long-chain fatty acids decreased, except for hexadecadienoylcarnitine (Figure 2D).

Within the non-lipid cluster, one sub-cluster included five highly correlated metabolites belonging to the same caffeine metabolism pathway (Figures 2A and S2A). All five metabolites were consistently elevated during pregnancy, and caffeine reached a concentration three times higher at the end of pregnancy than at the beginning (Figure 2E). This elevation might be due to a slower caffeine metabolism in pregnant women rather than an increase in coffee intake (Knutti et al., 1981). Overall, among the 68 top-altered metabolites in pregnancy, functional metabolite groups (e.g., steroids, LysoPCs, fatty acids, and caffeine metabolites) were altered in an orchestrated manner during pregnancy, and individual compounds within each group were correlated with one another (Figure 2A).

Orchestrated Metabolome Reconfigurations Span Multiple Pathways during Pregnancy
Next, we longitudinally examined the global pathway changes of all 687 annotated compounds during normal pregnancy. Among the 48 mapped Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, 34 showed significant changes (70.8%, adjusted FDR < 0.05, global test) (Goeman et al., 2004; Xia and Wishart, 2010a) through MetaboAnalystR (Figure 3A) (Chong et al., 2018), suggesting large-scale pathway changes of metabolism in pregnancy. To quantify the pathway activities through gestational age, we calculated the average intensity of metabolites in the pathways (Figure 3B; see STAR Methods). Among the top altered pathways (Figure 3A), steroid hormone biosynthesis showed elevated activity precisely timed to gestation, peaking before the end of pregnancy and then declining sharply shortly after delivery (Figure 3B). Along with the essential roles of steroid
week order, which mostly coincides with the actual order, was ordered by hierarchical clustering on the basis of Manhattan distances. The intensities averaged
before 14 weeks of all women were used as the baseline.
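The two operations named in the Figure 1 legend, normalization to a pre-14-week baseline and comparison of weekly profiles by Manhattan distance, can be sketched as below. This is a simplified illustration with hypothetical metabolite names and values; it ranks weeks by their distance from the earliest week rather than running a full hierarchical clustering, and it is not the authors' pipeline.

```python
from statistics import mean


def normalize_to_baseline(log2_by_week, baseline_before=14):
    """Subtract the mean log2 intensity of samples taken before the cutoff week."""
    baseline = mean(v for w, v in log2_by_week.items() if w < baseline_before)
    return {w: v - baseline for w, v in log2_by_week.items()}


def manhattan(a, b):
    """Sum of absolute coordinate differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical mean log2 intensities per gestational week; not study data.
profiles = {
    "rising": {10: 1.0, 12: 1.2, 20: 2.1, 30: 3.1},
    "falling": {10: 5.0, 12: 5.0, 20: 4.5, 30: 4.0},
}
normalized = {name: normalize_to_baseline(p) for name, p in profiles.items()}
weeks = sorted(profiles["rising"])
# One vector per week, with one coordinate per metabolite.
week_vectors = {w: [normalized[name][w] for name in profiles] for w in weeks}
distance_20_30 = manhattan(week_vectors[20], week_vectors[30])
# Rank weeks by distance from the earliest week (a stand-in for clustering).
ordered = sorted(weeks, key=lambda w: manhattan(week_vectors[weeks[0]], week_vectors[w]))
print(distance_20_30, ordered)
```

When metabolites change monotonically with gestation, as in this toy example, distance-based ordering recovers the chronological week order, which is the behavior the legend describes for the real data.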
[Figure 2 graphic. (A) Correlation network of the top altered compounds, with nodes colored by metabolite group: amino acid metabolism, bile acid biosynthesis, caffeine metabolism, fatty acid metabolism, phospholipid metabolism, steroid hormone biosynthesis, and others. (B–E) Mean log2(intensity) versus gestational age (weeks 14–40 and PP) for steroid hormone biosynthesis (B), phospholipids and DHEA-S (C), long-chain fatty acids (D), and caffeine metabolism (E).]
Figure 2. Functional Metabolite Groups Altered during Pregnancy

(A) Regularized partial correlation network of top altered compounds in pregnancy. Here, each node represents a compound, and each edge represents the
strength of partial correlation between two compounds after conditioning on all other compounds in the datasets. Edge weights represent the partial correlation
coefficients. Note that the seven nodes with red circles with central positions were also the predictors in the models of Figures 4 and 5.
(B–E) The average levels of the metabolite changes against the gestational progression in the clusters of steroid hormone biosynthesis (B), phospholipids and
DHEA-S (C), long-chain fatty acids (D), and caffeine metabolism (E). The intensities were normalized to the baseline, which was defined by averaging all samples
before 14 weeks. The standard errors, derived from 30 subjects, are shown. The GAs were standardized by scaling delivery events to 40 weeks. Abbreviation is as
follows: PP, postpartum. Note that the y axis scale is much larger for steroids than for other compounds.
See also Figure S2.
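The preprocessing described for panels (B)–(E) can be illustrated with a minimal Python sketch. It subtracts a pre-14-week baseline from log2 intensities and rescales gestational age so that delivery falls at 40 weeks. The function names, the subtraction in log2 space, the baseline taken over whatever series is passed in, and the proportional rescaling are all illustrative assumptions; the STAR Methods define the procedure that was actually used.

```python
from statistics import mean


def normalize_to_baseline(samples, baseline_cutoff_weeks=14.0):
    """Subtract the mean log2 intensity of early samples from every sample.

    samples: list of (gestational_age_weeks, log2_intensity) pairs.
    Assumption: "normalized to the baseline" means subtraction in log2 space.
    """
    baseline_values = [value for ga, value in samples if ga < baseline_cutoff_weeks]
    if not baseline_values:
        raise ValueError("no samples before the baseline cutoff")
    baseline = mean(baseline_values)
    return [(ga, value - baseline) for ga, value in samples]


def scale_ga_to_term(ga_weeks, ga_at_delivery_weeks, term_weeks=40.0):
    """Rescale a gestational age so that the subject's delivery maps to term_weeks.

    Assumption: a simple proportional rescaling.
    """
    return ga_weeks * term_weeks / ga_at_delivery_weeks


# Hypothetical values for illustration only, not data from the study.
series = [(10.0, 12.0), (12.0, 12.5), (20.0, 13.25), (30.0, 14.25)]
normalized = normalize_to_baseline(series)
scaled_ga = scale_ga_to_term(19.0, ga_at_delivery_weeks=38.0)
```

With this convention a normalized value of 1.0 corresponds to a twofold increase over the early-pregnancy baseline.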
1684 Cell 181, 1680–1692, June 25, 2020

Resource
Figure 3. System-Wide Reconfiguration of Metabolic Pathways during Pregnancy

(A) Metabolic pathways undergoing significant changes during pregnancy. Red dots denote pregnancy-related pathways with FDR < 0.05, which were further
analyzed in (B). The topological pathway effects were quantified by using published methods (Xia and Wishart, 2010a).
(B) Heatmap shows the temporal changes of pregnancy-related pathway activities during pregnancy and postpartum (PP). To quantify pathway activity, the
average intensity of metabolites in each pathway at each time window was calculated. Note that although some pathways contained mainly the metabolites
increasing or decreasing during pregnancy, many pregnancy-related pathways contained both metabolites increasing and decreasing. Thus, their average
values would not show large changes in the heatmap. For each pathway, the average values from samples earlier than 14 weeks (marked as week 14) were used
as the baseline.
(C) Human disease states that correlated with pregnancy-related metabolites on the basis of published metabolomics data (Chong et al., 2018).
See also Figure S3.
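The pathway-activity calculation in (B) can be sketched in a few lines of Python. The example also shows the cancellation noted in the legend: a pathway containing one rising and one falling metabolite has an average close to the baseline. The function name, the window boundaries, and the choice to pool all samples of all member metabolites before averaging are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def pathway_activity(intensities, members, windows, baseline_label):
    """Average member-metabolite intensities per time window, relative to a baseline window.

    intensities: {metabolite: [(gestational_age_weeks, log2_intensity), ...]}
    members: metabolite names belonging to the pathway
    windows: list of (label, start_week, end_week), half-open [start, end)
    """
    pooled = defaultdict(list)
    for name in members:
        for ga, value in intensities[name]:
            for label, start, end in windows:
                if start <= ga < end:
                    pooled[label].append(value)
                    break
    window_means = {label: mean(values) for label, values in pooled.items()}
    baseline = window_means[baseline_label]
    return {label: value - baseline for label, value in window_means.items()}


# Hypothetical values for illustration only, not data from the study.
intensities = {
    "rising": [(10.0, 10.0), (20.0, 12.0)],
    "falling": [(10.0, 10.0), (20.0, 8.0)],
}
windows = [("week 14", 0.0, 14.0), ("week 24", 14.0, 28.0)]
mixed = pathway_activity(intensities, ["rising", "falling"], windows, "week 14")
rising_only = pathway_activity(intensities, ["rising"], windows, "week 14")
```

Here `mixed` stays at the baseline in both windows, whereas `rising_only` shows the increase, which is why a flat row in such a heatmap does not imply an unchanged pathway.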
hormones in maintaining pregnancy and later inducing parturition (Mendelson, 2009), we observed an orchestrated elevation of many components centered on progesterone, including some less well-characterized hormones (Figure S3A). Consistent with known sources of pregnancy metabolites (e.g., hormones) (Maltepe and Fisher, 2015), metabolite set enrichment analysis (MSEA) (Xia and Wishart, 2010b) revealed that the adrenal cortex, gonad, and placenta were among the top origins of pregnancy-related metabolites (Figure S3B). The ability to recognize many well-known and less-characterized steroid hormone changes across pregnancy validates our approach.
In addition to the steroid pathway, we observed a dynamic pattern of metabolite changes with pregnancy in other pathways, such as the arachidonic acid metabolism pathway (Figures 3A,
3B, and S3C). We observed 20-HETE amounts increased until week 34; 20-HETE is potentially linked to the regulation of blood pressure and renal function during pregnancy (Wang et al., 2002; Wu et al., 2014) (Figures S3C and S3D). By contrast, 5-HETE amounts generally decreased during pregnancy, potentially associated with its regulation of the uterus (Figures S3C and S3D) (Edwin et al., 1997; Pearson et al., 2010). Thus, beyond energy metabolism and hormones, a system-wide reconfiguration of the metabolome occurs as the mother adapts to pregnancy. In addition, based on MSEA analysis, many pregnancy-related metabolites are implicated in human disease states, including obesity and prepartum depression (Figure 3C) (Xia and Wishart, 2010b).

The Metabolic Clock of Normal Pregnancy Identified by Machine Learning
We next determined whether we can build a metabolic clock based on the high-resolution profile to predict gestational age for individual plasma samples. In the discovery cohort (samples n = 507, subjects N = 21), we applied feature selection (lasso [least absolute shrinkage and selection operator]) with all 9,651 features to build the linear regression model that shows optimal cross-validation performance for predicting a given phenotype in this cohort. We then ran the validation cohort data (test set 1, samples n = 245, subjects N = 9) through the model established in the discovery cohort to measure the independent performance of our model (Figure 4A; see STAR Methods).
We first tested whether the metabolome change can quantitatively determine the gestational age in normal pregnant women. Feature selection in the discovery cohort yielded a linear model that included 42 metabolic features (Figure S4A; Table S2). In the cross-validation test of 507 samples in the discovery cohort, the metabolic model predicted gestational age in weeks (GAmetabolic) that correlated with gestational age estimated by the first-trimester ultrasound (GAultrasound, in compliance with the clinical standard of care) with a Pearson correlation coefficient (R) of 0.96 (R² = 0.93, p < 1 × 10⁻¹⁰⁰, root mean squared error [RMSE] = 2.49) (Figure S4B). In the independent-validation cohort, the model yielded a similar R of 0.95 (R² = 0.91, p < 1 × 10⁻¹⁰⁰, RMSE = 2.76, test set 1) (Figure S4C). This indicates metabolic features can accurately predict the gestational age on the basis of a blood sample from a pregnant woman.
For potential clinical use, we next tested whether we can use the annotated compounds in blood to predict the gestational age in pregnant women. We performed feature selection in discovery cohort by using the 264 level 1 and level 2 compounds identified in the Human Metabolome Database (HMDB) in the discovery cohort (Table S3). This yielded a linear model including five compounds (Figures S4D and 4D) that together are highly predictive. We first evaluated the performance of the model in a 10-fold cross-validation (CV) test in the discovery cohort, in which samples were distributed into folds by subject instead of by sample to prevent person-specific information cross-over between the training folds and the test fold. In the CV test, the metabolic-clock model produced a result (GAmetabolic) that correlated with the gestational age estimated by the first-trimester ultrasound (GAultrasound) with a Pearson correlation coefficient (R) of 0.92 (R² = 0.85, p = 8e−222, RMSE = 3.67) (Figure 4B). To avoid the hyperparametric selection bias, we further evaluated the performance of our model in two independent validation cohorts (test set 1 and test set 2). In test set 1, the model yielded an R of 0.89 (R² = 0.80, p = 8e−93, RMSE = 4.11) (Figure 4C). The model, including four steroids and one lipid (Figure 4D), was further verified in a second independent-validation cohort of eight individuals with R of 0.91 (R² = 0.83, RMSE = 3.05, samples n = 32, test set 2) (Table 1; Figure 4E). The compound identifications were confirmed by chemical standards (Figures 4F–4H, S4E, and S4F; Table S4; see STAR Methods). We noted that four of these five compounds are among the central steroid cluster forming a dense correlation network with one another (Figure 2A).
As pregnancy progresses toward term, clinical classifications and decisions often need to be made based on the timing of pregnancy (e.g., < 37 weeks for preterm birth). Babies born before 37 weeks are considered preterm, those born before 20 weeks are considered a miscarriage, and those born before 24 weeks have low survival. Therefore, for clinical action it is important to accurately classify the gestational age by clinical cutoff points at weeks 20–37. As a proof-of-principle, we tested the potential use of the metabolome data to classify the normal pregnancy samples as before or after 20, 24, 28, 32, and 37 gestational weeks (Figure S5A). First, using only samples from the third-trimester (> 28 weeks of gestation), the time window where women were more susceptible to preterm delivery, we determined whether the identified maternal blood metabolites can distinguish the sample gestational age as before or after 37 weeks. Both the discovery and the validation prediction yielded an area under the receiver operating characteristics (AUROC) over or close to 0.90 (Figure S5B; see STAR Methods). Remarkably, the prediction model contained only three metabolites, and the abundance range of each individual metabolite separated the > 37 week samples from the < 37 week samples for all but one to two validation subjects (Figures S5C–S5F). Similarly, using samples across the whole pregnancy, we found that metabolites can also accurately distinguish pregnancy samples before or after other important gestational age cutoffs, such as 20, 24, 28, and 32 gestational weeks (Figures S5A and S5G–S5J).

Personal Metabolic Clock of Pregnancy Linked with Timing of Delivery and Fetal Growth
Next, we examined the metabolic clock prediction performance in individuals. First, we noted that for most individuals, our model produced predictions consistently aligned with the gestational age estimated by the first-trimester ultrasound (Figures 5A and 5B). In both cross validations for Discovery and test set 1, the prediction deviation (measured by RMSE) in individuals centered around 3 weeks (Figures S6A and S6B). However, in each dataset, there is a small population of individuals with higher prediction deviation. When we examined these individuals (e.g., subjects 1, 2, and 4), we found the predictions were not more randomly scattered than other individuals. Rather, in the majority of them, predictions shifted away from the actual gestational ages in a portion of the pregnancy duration (Figures 5A and 5B), suggesting effects from non-random causes.
We hypothesized that some of these large prediction deviations might arise from biological causes, particularly from the maternal-fetal interaction. It is reported that the fetoplacental unit secretes hormones in conjunction with fetal growth and development
Figure 4. Metabolic Clock Of Pregnancy: Five Metabolites Selected by Machine Learning Can Accurately Predict the Timing of Normal
Pregnancy Progression in Both a Discovery and Two Validation Cohorts
(A) Design of the analytical pipeline.
(B and C) Gestational age (GA) predicted by the linear model consisting of five identified metabolites (GAmetabolic, y axis) highly correlates with clinical values
determined by the standard of care (by first-trimester ultrasound [GAultrasound] x axis) in the Discovery (B) and the validation cohort (test set 1) (C). Note that two
samples presented as outliers in the validation cohort, possibly because of occasional mass-spectrometry signal instability in given samples. The 95% confi-
dence interval for the linear regression is represented by the gray area.
(D) Contribution of the five metabolites to the gestational age prediction model.
(E) Gestational age predicted by the five metabolites (GAmetabolic, y axis, scaled) correlates with clinical values determined by the standard of care (by first-
trimester ultrasound [GAultrasound] x axis) in the test set 2 cohort. The 95% confidence interval for the linear regression is represented by the gray area.
(F–H) Confirmation of the metabolites predicting gestational age in the metabolic clock model by standard compounds, THDOC (F), estriol-16-glucuronide (G),
and progesterone (H) (see two additional compounds PE(P-16:0e/0:0) and DHEA-S in Figures S4E and S4F). Measured MS/MS spectral fragmentation profiles
(top, in black) matching chemical standards (bottom, in red). Note that the discovery results were from the 10-fold CV to avoid over-fitting (see STAR Methods).
See also Figures S4 and S5 and Tables S2–S4.
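The subject-wise cross-validation and the summary statistics reported for the five-metabolite model can be illustrated with a short Python sketch. It assigns folds by subject, so that no individual contributes samples to both the training folds and the test fold, and it defines the Pearson R and RMSE used to compare GAmetabolic with GAultrasound. The function names and the round-robin fold assignment are assumptions for illustration; the lasso fitting itself is not shown, and the STAR Methods describe the implementation that was actually used.

```python
import math


def subject_folds(subject_ids, n_folds):
    """Return one fold index per sample; all samples of a subject share a fold.

    Assumption: subjects are assigned to folds round-robin in sorted order.
    """
    unique_subjects = sorted(set(subject_ids))
    fold_of_subject = {
        subject: index % n_folds for index, subject in enumerate(unique_subjects)
    }
    return [fold_of_subject[subject] for subject in subject_ids]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(predicted, observed):
    """Root mean squared error, in the units of the inputs (weeks here)."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical subject labels for six samples, for illustration only.
sample_subjects = ["s1", "s1", "s2", "s3", "s3", "s2"]
folds = subject_folds(sample_subjects, n_folds=2)
```

Splitting by sample instead would let a model see other time points from the same woman during training, which tends to make the cross-validated error look better than it would be for a new individual.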
(Murphy et al., 2006). Indeed, we noted that the average prediction deviation strongly correlated with adjusted infant birth weight (Figure 5C, adjusted for gestational length; see Figure S6C and STAR Methods). Thus, the overall metabolic clock tends to outpace the gestational age estimation determined by first-trimester ultrasound in mothers with a heavier fetus while being delayed in the mothers with a lighter fetus. The finding suggests that fetal growth appears to be one of the inputs read by the metabolic clock.
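A minimal Python sketch of this per-subject comparison is given below. It averages GAmetabolic minus GAultrasound within each subject and then fits a least-squares slope of that average against the adjusted birth weight deviation. The function names and all numbers are hypothetical and chosen only to make the arithmetic easy to follow; the adjustment of birth weight for gestational length is assumed to have been done beforehand.

```python
from collections import defaultdict
from statistics import mean


def mean_discrepancy_by_subject(records):
    """Average (GAmetabolic - GAultrasound) per subject.

    records: iterable of (subject_id, ga_metabolic_weeks, ga_ultrasound_weeks).
    """
    differences = defaultdict(list)
    for subject, ga_metabolic, ga_ultrasound in records:
        differences[subject].append(ga_metabolic - ga_ultrasound)
    return {subject: mean(values) for subject, values in differences.items()}


def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical values for illustration only, not data from the study.
records = [
    ("a", 21.0, 20.0), ("a", 31.0, 30.0),  # clock runs ahead of ultrasound
    ("b", 19.0, 20.0), ("b", 28.0, 30.0),  # clock runs behind ultrasound
]
birthweight_deviation_g = {"a": 300.0, "b": -450.0}

discrepancy = mean_discrepancy_by_subject(records)
subjects = sorted(discrepancy)
slope = least_squares_slope(
    [birthweight_deviation_g[s] for s in subjects],
    [discrepancy[s] for s in subjects],
)
```

A positive slope in such a fit corresponds to the pattern described above: a clock that runs ahead in pregnancies with a heavier fetus and behind in those with a lighter fetus.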
[Figure 5 image. (A and B) Per-subject scatterplots of GAmetabolic (weeks, y axis) against GAultrasound (weeks, x axis), each annotated with RMSE, R², and p value. (C) Average discrepancies, Δ(GAmetabolic − GAultrasound) (weeks), against adjusted birthweight deviation from the population mean (g); P = 0.01, R² = 0.22, N = 29 with birthweight information. (D) GA at delivery (weeks) against average discrepancies, Δ(GAmetabolic − GAultrasound) (weeks); P = 0.02, R² = 0.28, N = 18 with natural labor onsets. (E) Predictor ranks for the models of 8, 4, and 2 weeks to delivery; AUC in validation: 0.91, 0.87, and 0.86, respectively. (F) ROC curves (true positive rate against false positive rate) for weeks to delivery (WD) < 2w; Discovery CV, N = 128, AUC 0.88 (95% CI: 0.82–0.95); Test Set 1, N = 65, AUC 0.86 (95% CI: 0.80–0.94). (G) Contribution of the weeks-to-delivery predictors. (H) THDOC log2(intensity) by subject ID for WD > 2w and WD < 2w samples.]
Figure 5. Personal Metabolic Clock of Pregnancy Linked with Timing of Delivery and Fetal Growth
(A and B) Highly correlated patterns of the metabolic-clock-predicted gestational age (GAmetabolic) of the five-metabolite model with the gestational age estimated
by the first-trimester ultrasound (GAultrasound) at the individual level in the cross validation (A) and test set 1 (B). Note that the outlier sample with negative prediction
value in Figure 4C belonged to the last subject of the test set 1 and did not show in the current plot with the y axis scale limitation.
(C) The average discrepancies between metabolic-clock-predicted gestational age and ultrasound-estimated gestational age (Δ(GAmetabolic − GAultrasound)) were significantly correlated with the fetal growth deviation from the population by person. All 29 subjects who had baby birth weight information are included here. The 95% confidence interval for the linear regression is represented by the gray area.
(D) Average discrepancies between GAmetabolic and GAultrasound (Δ(GAmetabolic − GAultrasound)) were negatively correlated with the actual delivery weeks (by ultrasound-estimation). All 18 subjects who had natural labor onset are included here. Dashed lines marked the ultrasound estimated GA at 40 weeks (due date, black), GAmetabolic one week earlier than the GAultrasound (blue), and GAmetabolic one week later than the GAultrasound (red). The 95% confidence interval for the linear regression is represented by the gray area.
(E) Summary of prediction models of 2, 4, and 8 weeks approaching delivery, using two to three metabolites. The contribution rank of each predictor in every
model is listed as number 1, 2, and 3. The weeks to delivery were built using samples of the third trimester (> 28 weeks). AUCs in the validation cohort (test set 1)
are listed.
In addition, within the 18 women with natural labor onset (i.e., excluding women with induction before labor onset and scheduled cesarean-section), we found that the women whose overall metabolic clock of pregnancy outpaced ultrasound evaluation tended to deliver earlier, whereas a delay in metabolic clock correlated with a delayed time to child delivery compared to ultrasound estimated due date (Figure 5D). Interestingly, five out of six women (83%) with a metabolic-clock-predicted gestational age more than one week later than the ultrasound-estimated gestational age had natural labor onset after their due date (estimated by ultrasound, marked in red in Figure 5D), and four out of five women (80%) with metabolic-clock-predicted gestational age more than one week earlier than the ultrasound-estimated gestational age had natural labor onset before their due date (marked in blue in Figure 5D). These results suggest the metabolic clock of pregnancy with maternal metabolites contains information on the timing of delivery in normal pregnancy.

Prediction for Timing of Delivery
We then tested whether the maternal blood metabolites can also predict the timing of a normal delivery event within a defined period (2, 4, and 8 weeks from delivery) approaching the labor events (in the third trimester). We first examined whether metabolites can predict a delivery within 2 weeks (weeks to delivery [WD] < 2w). To predict delivery triggered naturally without outside procedures (such as scheduled cesarean-section), we only included delivery events naturally triggered (subjects N = 18, samples n = 193). With just three metabolites, the metabolome accurately predicted an upcoming delivery event within 2 weeks in both discovery and validation cohorts with AUROC close to 0.9 (Figures 5E–5H, S6D, and S6E; see STAR Methods). Similarly, identified metabolites can also be used to predict the timing of a normal delivery event within 4 and 8 weeks (Figures 5E and S6F–S6I). Intriguingly, the panels of metabolites partially overlapped between the models, whereas the individual metabolites contributed differently to the models (Figure 5E). All of the metabolite markers were identified as steroids, except for phospholipid PE(P-16:0e/0:0), and most of them (three out of five in total) also appeared in the aforementioned metabolic clock for gestational age (Figure 5E; Table S4). These results demonstrate that we can precisely categorize critical pregnancy stages in normal subjects by using a small number of maternal blood metabolites, which can be further validated in larger and independent cohorts.

DISCUSSION

In this study, we performed untargeted metabolomics profiling and identified highly dynamic temporal regulation of metabolic changes in human pregnancy: more than half of the measured metabolites and metabolic pathways changed during pregnancy. We were able to detect many of the pregnancy-associated metabolite profiles revealed in previous targeted studies (Tulchinsky et al., 1972; Wang et al., 2016) (such as progesterone, 17-hydroxyprogesterone, and the linoleic acid pathway), validating our approach. At the same time, we also noted that a large portion of pregnancy-related metabolites identified in our study was less well-studied. For example, over 95% of the pregnancy-related metabolites identified in our study were not recovered from a targeted metabolic profiling study on pregnancy (Wang et al., 2016), demonstrating the power of unbiased and hypothesis-independent profiling. Among the changing metabolites, the major class that increased was steroids, including progesterone, which interacts with the hypothalamic-pituitary-adrenal axis (HPA axis) (Chrousos et al., 1998), and estriol-16-glucuronide produced by the placenta (Levitz et al., 1984). Here, the detailed differences in their temporal profiles were revealed by the weekly sampling design of the study (Figures 1D and 2B). In addition, we discovered less well-studied steroids in pregnancy, such as the neurosteroid THDOC, an allosteric modulator of the GABA-A receptor that potentially affects stress and depression in human pregnancy (Hosie et al., 2006; Reddy, 2003). Intriguingly, many pregnancy-related metabolites that changed, including steroids, quickly returned to the maternal non-pregnant state after childbirth (Figures 1B, 1D, and 1E). In addition, we also identified a wide variety of non-steroid hormones whose abundance altered during pregnancy progression.
These metabolite changes presumably accommodate and/or reflect important maternal biological physiology during pregnancy and fetal growth (Bispham et al., 2003; Prentice and Goldberg, 2000). For maternal nutrient metabolism, one of the decreased carnitines, oleoylcarnitine (Figure 1C), accumulates during certain metabolic conditions, including fasting (Hoppel and Genuth, 1980; Minkler et al., 2005). Also, one phosphatidylcholine that functions as a micronutrient, lecithin, increased in pregnancy, suggesting a systematic change in the maternal nutritional status during gestation. Within molecules reflecting pregnancy-related physiological changes, consistent with decreased blood pressure (Hermida et al., 1997), the antihypertensive molecule 20-HETE of the arachidonic acid metabolism pathway is elevated during pregnancy until the early third trimester, and its synthesis is regulated in a renal-specific manner (Wang et al., 2002; Wu et al., 2014). This reveals the highly dynamic temporal regulation of 20-HETE in blood pressure and kidney function during pregnancy. In contrast, compared with early pregnancy and postpartum, the amount of 5-HETE in the same pathway was generally lower in the second and third trimesters with an increasing trend right before the childbirth, consistent with previous findings that 5-HETE elevates in the uterus and amniotic fluid at the onset of human labor (Edwin et al., 1997; Pearson et al., 2010). In the developing fetus, changes in hexadecadienoylcarnitine amounts are associated with congenital heart defects (Bahado-Singh et al., 2014).
Figure 5 (continued). (F) The logistic regression model based on three metabolites can accurately identify the third-trimester plasma samples approaching the delivery (weeks to
delivery [WD] < 2w; only women with natural labor onset included).
(G) Contribution of the three metabolites to the prediction model of 2 weeks approaching delivery.
(H) Metabolite THDOC showed abundance separations before or after 2 weeks approaching the delivery, except in one subject. See Figure S6 for other me-
tabolites in the model. Note that the discovery results were from the 10-fold CV instead of direct fitting to avoid over-fitting.
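The evaluation behind panels (F) and (H) can be sketched in Python as follows. Each third-trimester sample is labeled by whether it was drawn within 2 weeks of delivery, and the AUROC is computed as the fraction of (positive, negative) sample pairs that a score ranks correctly. The function names and all values are hypothetical. The score here is a single intensity value for simplicity, whereas the model in the study is a logistic regression on three metabolites whose output would take its place.

```python
def label_near_delivery(sample_ga_weeks, delivery_ga_weeks, window_weeks=2.0):
    """True if the sample was drawn less than window_weeks before delivery."""
    return (delivery_ga_weeks - sample_ga_weeks) < window_weeks


def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical log2 intensities and labels (WD < 2w), for illustration only.
scores = [11.8, 12.1, 12.4, 13.0, 13.2]
labels = [False, False, True, False, True]
area = auroc(scores, labels)
```

In this toy example five of the six positive-negative pairs are ordered correctly, so the area is 5/6; a value of 0.5 would indicate no separation between samples near and far from delivery.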
Here, we revealed that the amount of hexadecadienoylcarnitine in the blood decreased continuously until week 24, then steadily increased thereafter (Figure 2D). In addition, the amount of long-chain fatty acids in maternal blood samples is associated with childhood metabolic health (Maslova et al., 2018). Here, the omega-3 fatty acid THA decreased during early pregnancy and gradually increased before childbirth (Figure 2D), suggesting gestation-related changes in the formation of docosahexaenoic acid (DHA) (Moore et al., 1995). Our findings are robust even without a requirement for prior fasting. It will be interesting to validate these findings in cohorts that have dietary information and detailed clinical measurements, to define critical “nutritional time-zones” for micronutrient amounts and further understand the metabolite changes that are important for physiologic changes across pregnancy.
Our high-density sampling scheme allowed us to study the temporal alteration of metabolite levels at weekly resolution. For example, even though many steroid metabolites were elevated during pregnancy, our profiling was able to show that there were at least two different behaviors: an early wave (such as progesterone and 17α-hydroxyprogesterone) and a second wave (such as estriol-16-glucuronide). These temporal changes of steroids across pregnancy and after childbirth are at least partially regulated by the fetoplacental unit, including both maternal adrenal gland and placenta and fetal adrenals and liver (Diczfalusy, 1953; Frandsen and Stakemann, 1961; Raeside, 2017). Further investigation into the interaction of fetal-maternal contribution will be necessary for understanding the temporal regulation of these metabolites.
Untargeted metabolome and high-density sampling enabled us to identify a broad set of high-resolution temporal profiles of metabolites during pregnancy. We hypothesized that this information might help us to understand the underlying metabolic clock that times the progression of pregnancy. We found that solely using the abundance of five compounds, without any other inputs from clinical features, we can precisely determine the gestational age of a healthy pregnant woman. The precision surpasses the recent cell-free RNA model by using maternal blood (Ngo et al., 2018). Similarly, with two to three compounds, we can categorically predict many pregnancy cutoff times with high AUC: we can determine whether a woman has reached 20, 24, 28, 32, or 37 weeks (clinical cutoffs for miscarriage, age of viability, extremely preterm, very preterm, and prematurity, respectively) into her pregnancy (Figure S5A), or whether a woman will enter into labor within the next two, four, or eight weeks (Figure 5E). The proof-of-principle study suggested that metabolome bears rich quantitative information about pregnancy progression. However, our study has its limitations. The studied population consisted of healthy Caucasian pregnant women with small variations in clinical characteristics. In the future, we need to test the models in a larger cohort with diverse ethnicities and complications. Meanwhile, targeted chemical assays need to be developed on the small panels of identified metabolite markers that were discovered by untargeted metabolomics to measure the metabolite concentration independent of batches. Intriguingly, we found the metabolic clock of pregnancy to be robust in general, but small personal deviations can be observed, most likely affected by the fetal growth (Figure 5C). Lastly, we also found that the discrepancies between metabolic timing and ultrasound suggested biological significance: the women that had advanced metabolic clock tended to deliver earlier than predicted by ultrasound, whereas a delay in metabolic clock correlated with a delayed time to child delivery (Figure 5D).
In summary, combining untargeted metabolome and high-density sampling revealed the landscape of metabolome changes during pregnancy and the postpartum period with high resolution. The data itself can serve as a resource for future research. As a proof-of-principle, we also demonstrated that the temporal abundance information of metabolome can be used to predict gestational age with high accuracy in a cohort of healthy women. There is a great need for accurate timing of pregnancy: in the US alone, 900,000 women annually missed their first-trimester ultrasound (Martin et al., 2018), currently the only accurate timing method for pregnancy (Committee on Obstetric Practice, the American Institute of Ultrasound in Medicine, and the Society for Maternal-Fetal Medicine, 2017). In low and middle-income countries, accessibility to ultrasound is even more scarce, complicating many pregnancies and fetal care downstream (e.g., identify imminent labor, manage complications, etc.). Our study demonstrated that the development of clinical tools with a few metabolites in maternal blood to time pregnancy is promising. Testing of blood drawn from the pregnant woman would likely be limited to once or a few times to be informative and have the potential to benefit pregnant women in both developed and developing worlds.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

● KEY RESOURCES TABLE
● RESOURCE AVAILABILITY
  ○ Lead Contact
  ○ Materials Availability
  ○ Data and Code Availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Pregnancy cohort
● METHOD DETAILS
  ○ Plasma sample preparation
  ○ Chemical materials for untargeted metabolomics
  ○ MS acquisition
  ○ Chromatographic conditions
● QUANTIFICATION AND STATISTICAL ANALYSIS
  ○ Section 1: Metabolomics Data Processing
  ○ Section 2: Metabolic Features Identification
  ○ Section 3: Identify Significantly Altered Features/Compounds
  ○ Section 4: Regularized Partial Correlation Network
  ○ Section 5: Pathway Analysis
  ○ Section 6: Machine Learning for Pregnancy Timing
  ○ Section 7: Analyze the discrepancies between metabolic clock (GA prediction model) and first-trimester ultrasound estimations

Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.05.002.
ACKNOWLEDGMENTS

We thank the dedicated women who made this study possible by generously participating every week during their pregnancy. We thank R. Jian, L. Tian, and C. Yeh for technical assistance; S.M.S. Rose, G. Chen, and Y. Zhang for reviewing the manuscripts. This work was supported by the Bill & Melinda Gates Foundation, United States (OPP1128928). M.S. was supported by the Center for Personal Dynamic Regulomes (2RM1HG00773506). M.M. and B.F. were supported by the Oak Foundation, Switzerland (OCAY-18-598). M.L.H.R. was supported by the Independent Research Fund Denmark (6120-000998). L.S. was supported by a Carlsberg Foundation Postdoctoral Fellowship (CF15-0899).

AUTHOR CONTRIBUTIONS

M.-L.H.R. and M.M. organized and contributed to the collection of pregnancy samples. L.L., B.P., B.F., M.S., and M.M. conceptualized the study. L.L., M.S., and M.M. created the analysis plan. L.L. processed and analyzed the samples. L.L. and H.R. analyzed the data. L.L., X.S., S.C., J.S., and N.L. contributed to the spectral analysis. L.L., M.S., and M.M. wrote the manuscript, and all authors contributed to reviewing and editing the manuscript.

DECLARATION OF INTERESTS

M.S. is a co-founder and member of the scientific advisory boards of the following: Personalis, SensOmics, Filtricine, Qbio, January, Mirvie, and Oralome. He is a member of the scientific advisory board of Jungla. M.M. is a co-founder of Mirvie. L.L., M.S., and M.M. are inventors on the patent application PCT/US2019/052515 related to this work.

Received: September 4, 2018
Revised: March 11, 2020
Accepted: April 29, 2020
Published: June 25, 2020

REFERENCES

Soon Preterm Birth Action Group (2013). Born too soon: the global epidemiology of 15 million preterm births. Reprod. Health 10, S2.
Chambers, M.C., Maclean, B., Burke, R., Amodei, D., Ruderman, D.L., Neumann, S., Gatto, L., Fischer, B., Pratt, B., Egertson, J., et al. (2012). A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotechnol. 30, 918–920.
Chan, L.Y., Chiu, P.Y., and Lau, T.K. (2003). Cord blood thyroid-stimulating hormone level in high-risk pregnancies. Eur. J. Obstet. Gynecol. Reprod. Biol. 108, 142–145.
Chong, J., Soufan, O., Li, C., Caraus, I., Li, S., Bourque, G., Wishart, D.S., and Xia, J. (2018). MetaboAnalyst 4.0: towards more transparent and integrative metabolomics analysis. Nucleic Acids Res. 46, W486–W494.
Chrousos, G.P., Torpy, D.J., and Gold, P.W. (1998). Interactions between the hypothalamic-pituitary-adrenal axis and the female reproductive system: clinical implications. Ann. Intern. Med. 129, 229–240.
Committee on Obstetric Practice, the American Institute of Ultrasound in Medicine, and the Society for Maternal-Fetal Medicine (2017). Committee Opinion No 700: Methods for Estimating the Due Date. Obstet. Gynecol. 129, e150–e154.
Contrepois, K., Jiang, L., and Snyder, M. (2015). Optimized Analytical Procedures for the Untargeted Metabolomic Profiling of Human Urine and Plasma by Combining Hydrophilic Interaction (HILIC) and Reverse-Phase Liquid Chromatography (RPLC)-Mass Spectrometry. Mol. Cell. Proteomics 14, 1684–1695.
Diczfalusy, E. (1953). Chorionic gonadotrophin and oestrogens in the human placenta. Acta Endocrinol. Suppl. (Copenh.) 11, 1–175.
Donahue, S.M., Kleinman, K.P., Gillman, M.W., and Oken, E. (2010). Trends in birth weight and gestational length among singleton term births in the United States: 1990-2005. Obstet. Gynecol. 115, 357–364.
Dudzik, D., Zorawski, M., Skotnicki, M., Zarzycki, W., Kozlowska, G., Bibik-Malinowska, K., Vallejo, M., García, A., Barbas, C., and Ramos, M.P. (2014). Metabolic fingerprint of Gestational Diabetes Mellitus. J. Proteomics 103, 57–71.
Edwin, S.S., Mitchell, M.D., and Dudley, D.J. (1997). Action of immunoregulatory agents on 5-HETE production by cultured human amnion cells. J. Reprod. Immunol. 36, 111–121.
Alkema, L., Chou, D., Hogan, D., Zhang, S., Moller, A.B., Gemmill, A., Fat, Epskamp, S., and Fried, E.I. (2018). A tutorial on regularized partial correlation
D.M., Boerma, T., Temmerman, M., Mathers, C., and Say, L.; United Nations networks. Psychol. Methods 23, 617–634.
Maternal Mortality Estimation Inter-Agency Group collaborators and technical
Frandsen, V.A., and Stakemann, G. (1961). The site of production of oestro-
advisory group (2016). Global, regional, and national levels and trends in
genic hormones in human pregnancy. Hormone excretion in pregnancy with
maternal mortality between 1990 and 2015, with scenario-based projections
anencephalic foetus. Acta Endocrinol. (Copenh.) 38, 383–391.
to 2030: a systematic analysis by the UN Maternal Mortality Estimation Inter-
Agency Group. Lancet 387, 462–474. Gagnon, A., and Wilson, R.D.; Society of Obstetricians and Gynaecologists of
Canada Genetics Committee (2008). Obstetrical complications associated
Altman, N.S. (1992). An Introduction to Kernel and Nearest-Neighbor Nonpara-
with abnormal maternal serum markers analytes. Journal d’obstetrique et gy-
metric Regression. Am. Stat. 46, 175–185.
necologie du Canada 30, 918–932.
Bahado-Singh, R.O., Akolekar, R., Mandal, R., Dong, E., Xia, J., Kruger, M.,
Goeman, J.J., and Bühlmann, P. (2007). Analyzing gene expression data in
Wishart, D.S., and Nicolaides, K. (2012). Metabolomics and first-trimester pre-
terms of gene sets: methodological issues. Bioinformatics 23, 980–987.
diction of early-onset preeclampsia. J Matern Fetal Neonatal Med 25,
1840–1847. Goeman, J.J., van de Geer, S.A., de Kort, F., and van Houwelingen, H.C.
(2004). A global test for groups of genes: testing association with a clinical
Bahado-Singh, R.O., Ertl, R., Mandal, R., Bjorndahl, T.C., Syngelaki, A., Han,
outcome. Bioinformatics 20, 93–99.
B., Dong, E., Liu, P.B., Alpay-Savasan, Z., Wishart, D.S., et al. (2014). Metab-
olomic prediction of fetal congenital heart defect in the first trimester. A J Ob- Hermida, R.C., Ayala, D.E., Mojón, A., Fernández, J.R., Silva, I., Ucieda, R.,
stet Gynecol 211, e1–e14. and Iglesias, M. (1997). High sensitivity test for the early diagnosis of gesta-
tional hypertension and preeclampsia. IV. Early detection of gestational hyper-
Baumann, D., and Baumann, K. (2014). Reliable estimation of prediction errors
tension and preeclampsia by the computation of a hyperbaric index. J. Perinat.
for QSAR models under model uncertainty using double cross-validation.
Med. 25, 254–273.
J. Cheminform. 6, 47.
Bispham, J., Gopalakrishnan, G.S., Dandrea, J., Wilson, V., Budge, H., Keisler, Hochreiter, S., and Schmidhuber, J. (1997). Long short-term memory. Neural
D.H., Broughton Pipkin, F., Stephenson, T., and Symonds, M.E. (2003). Comput. 9, 1735–1780.
Maternal endocrine adaptation throughout pregnancy to nutritional manipula- Hoppel, C.L., and Genuth, S.M. (1980). Carnitine metabolism in normal-weight
tion: consequences for maternal plasma leptin and cortisol and the program- and obese human subjects during fasting. Am. J. Physiol. 238, E409–E415.
ming of fetal adipose tissue development. Endocrinology 144, 3575–3585. Hosie, A.M., Wilkins, M.E., da Silva, H.M., and Smart, T.G. (2006). Endogenous
Blencowe, H., Cousens, S., Chou, D., Oestergaard, M., Say, L., Moller, A.B., neurosteroids regulate GABAA receptors through two discrete transmem-
Kinney, M., Lawn, J., and Born Too Soon Preterm Birth Action, G.; Born Too brane sites. Nature 444, 486–489.
Kaddurah-Daouk, R., Kristal, B.S., and Weinshilboum, R.M. (2008). Metabolomics: a global biochemical approach to drug response and disease. Annu. Rev. Pharmacol. Toxicol. 48, 653–683.
Kenny, L.C., Broadhurst, D.I., Dunn, W., Brown, M., North, R.A., McCowan, L., Roberts, C., Cooper, G.J., Kell, D.B., and Baker, P.N.; Screening for Pregnancy Endpoints Consortium (2010). Robust early pregnancy prediction of later preeclampsia using metabolomic biomarkers. Hypertension 56, 741–749.
King, J.C. (2000). Physiology of pregnancy and nutrient metabolism. Am. J. Clin. Nutr. 71 (5, Suppl), 1218S–1225S.
Knutti, R., Rothweiler, H., and Schlatter, C. (1981). Effect of pregnancy on the pharmacokinetics of caffeine. Eur. J. Clin. Pharmacol. 21, 121–126.
Koh, W., Pan, W., Gawad, C., Fan, H.C., Kerchner, G.A., Wyss-Coray, T., Blumenfeld, Y.J., El-Sayed, Y.Y., and Quake, S.R. (2014). Noninvasive in vivo monitoring of tissue-specific global gene expression in humans. Proc. Natl. Acad. Sci. USA 111, 7361–7366.
Levitz, M., Kadner, S., and Young, B.K. (1984). Intermediary metabolism of estriol in pregnancy. J. Steroid Biochem. 20 (4B), 971–974.
López-Hernández, Y., Herrera-Van Oostdam, A.S., Toro-Ortiz, J.C., López, J.A., Salgado-Bustamante, M., Murgu, M., and Torres-Torres, L.M. (2019). Urinary Metabolites Altered during the Third Trimester in Pregnancies Complicated by Gestational Diabetes Mellitus: Relationship with Potential Upcoming Metabolic Disorders. Int. J. Mol. Sci. 20, 1186.
Maltepe, E., and Fisher, S.J. (2015). Placenta: the forgotten organ. Annu. Rev. Cell Dev. Biol. 31, 523–552.
Martin, J.A., Hamilton, B.E., Osterman, M.J.K., Driscoll, A.K., and Drake, P. (2018). Births: Final Data for 2017. Natl. Vital Stat. Rep. 67, 1–50.
Maslova, E., Rifas-Shiman, S.L., Olsen, S.F., Gillman, M.W., and Oken, E. (2018). Prenatal n-3 long-chain fatty acid status and offspring metabolic health in early and mid-childhood: results from Project Viva. Nutr. Diabetes 8, 29.
Mayr, A., Klambauer, G., Unterthiner, T., Steijaert, M., Wegner, J.K., Ceulemans, H., Clevert, D.A., and Hochreiter, S. (2018). Large-scale comparison of machine learning methods for drug target prediction on ChEMBL. Chem. Sci. (Camb.) 9, 5441–5451.
Mendelson, C.R. (2009). Minireview: fetal-maternal hormonal signaling in pregnancy and labor. Mol. Endocrinol. 23, 947–954.
Minkler, P.E., Kerner, J., North, K.N., and Hoppel, C.L. (2005). Quantitation of long-chain acylcarnitines by HPLC/fluorescence detection: application to plasma and tissue specimens from patients with carnitine palmitoyltransferase-II deficiency. Clin. Chim. Acta 352, 81–92.
Moore, S.A., Hurt, E., Yoder, E., Sprecher, H., and Spector, A.A. (1995). Docosahexaenoic acid synthesis in human skin fibroblasts involves peroxisomal retroconversion of tetracosahexaenoic acid. J. Lipid Res. 36, 2433–2443.
GBD 2013 Mortality and Causes of Death Collaborators (2015). Global, regional, and national age-sex specific all-cause and cause-specific mortality for 240 causes of death, 1990-2013: a systematic analysis for the Global Burden of Disease Study 2013. Lancet 385, 117–171.
Murphy, V.E., Smith, R., Giles, W.B., and Clifton, V.L. (2006). Endocrine regulation of human fetal growth: the role of the mother, placenta, and fetus. Endocr. Rev. 27, 141–169.
Ngo, T.T.M., Moufarrej, M.N., Rasmussen, M.H., Camunas-Soler, J., Pan, W., Okamoto, J., Neff, N.F., Liu, K., Wong, R.J., Downes, K., et al. (2018). Noninvasive blood tests for fetal development predict gestational age and preterm delivery. Science 360, 1133–1136.
Pearson, T., Zhang, J., Arya, P., Warren, A.Y., Ortori, C., Fakis, A., Khan, R.N., and Barrett, D.A. (2010). Measurement of vasoactive metabolites (hydroxyeicosatetraenoic and epoxyeicosatrienoic acids) in uterine tissues of normal and compromised human pregnancy. J. Hypertens. 28, 2429–2437.
Prentice, A.M., and Goldberg, G.R. (2000). Energy adaptations in human pregnancy: limits and long-term consequences. Am. J. Clin. Nutr. 71 (5, Suppl), 1226S–1232S.
Raeside, J.I. (2017). A Brief Account of the Discovery of the Fetal/Placental Unit for Estrogen Production in Equine and Human Pregnancies: Relation to Human Medicine. Yale J. Biol. Med. 90, 449–461.
Reddy, D.S. (2003). Is there a physiological role for the neurosteroid THDOC in stress-sensitive conditions? Trends Pharmacol. Sci. 24, 103–106.
Romero, R., Mazaki-Tovi, S., Vaisbuch, E., Kusanovic, J.P., Chaiworapongsa, T., Gomez, R., Nien, J.K., Yoon, B.H., Mazor, M., Luo, J., et al. (2010). Metabolomics in premature labor: a novel approach to identify patients at risk for preterm delivery. The journal of maternal-fetal & neonatal medicine 23, 1344–1359.
Sachse, D., Sletner, L., Mørkrid, K., Jenum, A.K., Birkeland, K.I., Rise, F., Piehler, A.P., and Berg, J.P. (2012). Metabolic changes in urine during and after pregnancy in a large, multiethnic population-based cohort study of gestational diabetes. PLoS ONE 7, e52399.
Say, L., Chou, D., Gemmill, A., Tunçalp, Ö., Moller, A.B., Daniels, J., Gülmezoglu, A.M., Temmerman, M., and Alkema, L. (2014). Global causes of maternal death: a WHO systematic analysis. Lancet Glob. Health 2, e323–e333.
Sedgh, G., Singh, S., and Hussain, R. (2014). Intended and unintended pregnancies worldwide in 2012 and recent trends. Stud. Fam. Plann. 45, 301–314.
Sevastou, I., Kaffe, E., Mouratis, M.A., and Aidinis, V. (2013). Lysoglycerophospholipids in chronic inflammatory disorders: the PLA(2)/LPC and ATX/LPA axes. Biochim. Biophys. Acta 1831, 42–60.
Shen, X., Wang, R., Xiong, X., Yin, Y., Cai, Y., Ma, Z., Liu, N., and Zhu, Z.J. (2019). Metabolic reaction network-based recursive metabolite annotation for untargeted metabolomics. Nat. Commun. 10, 1516.
Soldin, O.P., Guo, T., Weiderpass, E., Tractenberg, R.E., Hilakivi-Clarke, L., and Soldin, S.J. (2005). Steroid hormone levels in pregnancy and 1 year postpartum using isotope dilution tandem mass spectrometry. Fertil. Steril. 84, 701–710.
Stein, S.E., and Scott, D.R. (1994). Optimization and testing of mass spectral library search algorithms for compound identification. J. Am. Soc. Mass Spectrom. 5, 859–866.
Tulchinsky, D., Hobel, C.J., Yeager, E., and Marshall, J.R. (1972). Plasma estrone, estradiol, estriol, progesterone, and 17-hydroxyprogesterone in human pregnancy. I. Normal pregnancy. Am. J. Obstet. Gynecol. 112, 1095–1100.
Tusher, V.G., Tibshirani, R., and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. USA 98, 5116–5121.
Viant, M.R., Kurland, I.J., Jones, M.R., and Dunn, W.B. (2017). How close are we to complete annotation of metabolomes? Curr. Opin. Chem. Biol. 36, 64–69.
Wang, M.H., Zand, B.A., Nasjletti, A., and Laniado-Schwartzman, M. (2002). Renal 20-hydroxyeicosatetraenoic acid synthesis during pregnancy. Am. J. Physiol. Regul. Integr. Comp. Physiol. 282, R383–R389.
Wang, X., Chen, C., Wang, L., Chen, D., Guang, W., and French, J. (2003). Conception, early pregnancy loss, and time to clinical pregnancy: a population-based prospective study. Fertil. Steril. 79, 577–584.
Wang, Q., Würtz, P., Auro, K., Mäkinen, V.P., Kangas, A.J., Soininen, P., Tiainen, M., Tynkkynen, T., Jokelainen, J., Santalahti, K., et al. (2016). Metabolic profiling of pregnancy: cross-sectional and longitudinal evidence. BMC Med. 14, 205.
Wu, C.C., Gupta, T., Garcia, V., Ding, Y., and Schwartzman, M.L. (2014). 20-HETE and blood pressure regulation: clinical implications. Cardiol. Rev. 22, 1–12.
Xia, J., and Wishart, D.S. (2010a). MetPA: a web-based metabolomics tool for pathway analysis and visualization. Bioinformatics 26, 2342–2344.
Xia, J., and Wishart, D.S. (2010b). MSEA: a web-based tool to identify biologically meaningful patterns in quantitative metabolomic data. Nucleic Acids Res. 38, W71–W77.
STAR+METHODS
KEY RESOURCES TABLE

| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Biological Samples | | |
| Plasma samples | This paper | N/A |
| Chemicals, Peptides, and Recombinant Proteins | | |
| MS-grade water | Fisher Scientific | Cat#7732-18-5 |
| MS-grade methanol | Fisher Scientific | Cat#A456-500 |
| MS-grade acetonitrile | Fisher Scientific | Cat#A9554 |
| MS-grade acetone | Fisher Scientific | Cat#67-64-1 |
| MS-grade acetic acid | Sigma Aldrich | Cat#64-19-7 |
| Progesterone | Sigma Aldrich | Cat#P-069-1ML |
| THDOC | Sigma Aldrich | Cat#P2016-5MG |
| Estriol-16-glucuronide | Sigma Aldrich | Cat#E1877-10MG |
| DHEA-S | Sigma Aldrich | Cat#D-066-1ML |
| PE(P-16:0e/0:0) | Avanti Polar Lipids | Cat#852470 |
| Androstane-3,17-diol | Sigma Aldrich | Cat#A7755-100MG |
| 17α-Hydroxyprogesterone | Sigma Aldrich | Cat#H-085-1ML |
| Deposited Data | | |
| Raw data | This paper | https://www.metabolomicsworkbench.org; Project ID PR000918; Project DOI: https://doi.org/10.21228/M81H58 |
| Software and Algorithms | | |
| Progenesis QI Software | Nonlinear Dynamics | http://www.nonlinear.com/progenesis/qi/ |
| ProteoWizard version 3.0.19095-938eda31a | Chambers et al., 2012 | http://proteowizard.sourceforge.net |
| k-Nearest Neighbor algorithm | Altman, 1992 | N/A |
| Forward Dot-Product algorithm | Stein and Scott, 1994 | N/A |
| MetDNA | Shen et al., 2019 | http://metdna.zhulab.cn/ |
| MetaboAnalystR | Chong et al., 2018; Xia and Wishart, 2010a; Xia and Wishart, 2010b | https://github.com/xia-lab/MetaboAnalystR |
| R | R Core Team | https://www.R-project.org |
| MS/MS identification pipeline | This paper | https://jaspershen.github.io/metID/index.html |
| Other | | |
| Zorbax SB columns (2.1 × 50 mm, 1.8 Micron, 600 Bar) | Agilent Technologies | 827700-914 |
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Mike
Snyder (mpsnyder@stanford.edu).

Original data have been deposited in the NIH Common Fund’s National Metabolomics Data Repository (NMDR) website (supported by NIH grant U2C-DK119886), the Metabolomics Workbench (https://www.metabolomicsworkbench.org), under Project ID PR000918 (https://doi.org/10.21228/M81H58).
The code for the MS/MS identification pipeline is available at https://jaspershen.github.io/metID/index.html and on GitHub at https://github.com/jaspershen/metID.
Pregnancy cohort
We recruited pregnant women through family doctors and advertisements (Danish IRB number H-3-2014-004). At enrollment, all women (ages 23 to 36 at delivery) were screened to ensure that they were healthy at baseline, had no chronic conditions, and took no medication of any kind. From each woman, non-fasting blood samples were collected weekly during pregnancy, and one sample was collected after pregnancy (2 × 9 mL EDTA tubes and 1 × PaxGene RNA tube).
METHOD DETAILS
Plasma sample preparation

Within Discovery and Test Set 1 cohorts, 784 normal pregnancy samples from 30 women were completely randomized and analyzed
in 12 batches across two years. The 32 normal pregnancy samples in Test Set 2 were randomized and analyzed three years later.
Plasma was prepared from whole blood treated with anti-clot EDTA, aliquoted, and stored at −80 °C. Plasma (200 μL) was treated with four volumes (800 μL) of an acetone:acetonitrile:methanol (1:1:1, v/v) solvent mixture with internal standards (i.e., Acetyl-d3-carnitine, Phenylalanine-3,3-d2, Tiapride, Trazodone, Reserpine, Phytosphingosine, and Chlorpromazine), mixed for 15 min at 4 °C, and incubated at −20 °C for 2 h to allow for protein precipitation. The supernatant was collected after centrifugation at 10,000 rpm for 10 min at 4 °C and evaporated under nitrogen to dryness (Biotage Turbovap). The dry extracts were reconstituted with 200 μL 1:1 methanol:water before analysis. A quality control (QC) sample was generated by pooling all the plasma samples from 10 women and injected between every 10–15 sample injections to monitor the consistency of the retention time and the signal intensity. The QC sample was also diluted two, four, and eight times to determine the linear-dilution effect of metabolic features.
Chemical materials for untargeted metabolomics

MS-grade water (7732-18-5), methanol (A456-500), acetonitrile (A9554), and acetone (67-64-1) were purchased from Fisher Scientific (Morris Plains, NJ, USA). MS-grade acetic acid (64-19-7) was purchased from Sigma-Aldrich (St. Louis, MO, USA). Analytical-grade chemical standards were purchased [Progesterone (Sigma-Aldrich, P-069-1ML), THDOC (Sigma-Aldrich, P2016-5MG), Estriol-16-Glucuronide (Sigma-Aldrich, E1877-10MG), DHEA-S (Sigma-Aldrich, Dehydroepiandrosterone-D5-3-sulfate (DHEAS-D5) (2,2,3,4,4,-D5) sodium salt solution, D-066-1ML), PE(P-16:0e/0:0) (Avanti Polar Lipids, 852470), Androstane-3,17-diol (Sigma-Aldrich, A7755-100MG), 17α-Hydroxyprogesterone (Sigma-Aldrich, H-085-1ML)] and prepared in methanol, except PE(P-16:0e/0:0), which was prepared in chloroform/methanol (8:2).
MS acquisition
Metabolic extracts were analyzed by reversed-phase liquid chromatographic (RPLC)-mass spectrometry (MS) in both positive and
negative ionization modes. Thermo Q Exactive Hybrid Quadrupole-Orbitrap plus and Q Exactive mass spectrometers (Xcalibur,
Thermo Scientific, San Jose, CA, USA) were operated in full MS-scan mode for data acquisition (acquisition from m/z 500 to
2,000) with a scan rate of approximately 4 Hz and a resolution set at 30,000 (at m/z 400). The MS/MS spectra of the QC sample
were acquired for the top 10 parent ions at two fragmentation energies (25 and 50 NCE). The resulting mass spectra
were exported into Progenesis QI Software (Nonlinear Dynamics, Durham, NC, USA) for further processing.
Chromatographic conditions
RPLC separation was performed using Zorbax SB columns (2.1 × 50 mm, 1.8 μm, 600 bar; 827700-914) purchased from Agilent Technologies (Santa Clara, CA, USA). Mobile phases for RPLC consisted of 0.06% acetic acid in water (phase A) and 0.06% acetic acid in MeOH (phase B). Metabolites were eluted from the column at a flow rate of 0.6 mL/min, leading to a backpressure of 220–280 bar at 99% phase A. A linear 1%–80% phase B gradient was applied over 9–10 min. The oven temperature was set to 60 °C, and the sample injection volume was 5 μL.
Section 1: Metabolomics Data Processing

Metabolomic features were extracted with a unique mass/charge ratio and retention time, then aligned and quantified with the
Progenesis QI software (Nonlinear Dynamics, Durham, NC, USA, http://www.nonlinear.com/progenesis/qi/). Peak deconvolution
was performed under default settings in Progenesis QI. Acquired data were processed using an analysis pipeline written in R (https://www.R-project.org). Progenesis QI output was first filtered by removing all metabolic features that were quantified in fewer than 30% of the samples or had a median intensity of less than twofold the noise threshold (S/N < 2). The noise threshold was estimated as the median signal across all the blank runs (if no quantitation was reported in any of the blank runs, the feature was retained in the analysis, as it likely had good S/N characteristics). The data were then log-transformed and normalized. For each run, the median of all features was centered to correct for variation in the sample amount. Then, for each analyte, a linear correction was applied per batch to correct for any linear decrease or increase in abundance during the acquisition of a batch. In short, for each analyte and each batch, a linear model was fitted with the log-abundance of the analyte as the dependent variable and the acquisition number [run order (randomized)] as the independent variable. The model prediction was interpreted as an underlying drift in mass spectrometric sensitivity and subtracted from the analyte level to yield within-batch normalized abundances. Finally, for each analyte, the abundances were median centered by batch to correct for sensitivity differences between batches. The positive- and negative-mode features were then concatenated for downstream analysis. In total, 9,651 features were included in the final analysis. In addition, samples with more than 50% of their values missing were removed (one sample in total). The remaining missing values were imputed from the 10 nearest neighbors using the k-Nearest Neighbor algorithm (Altman, 1992). Note that Discovery and Test Set 1 were normalized together, while samples of Test Set 2 were normalized independently.
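As an illustration of the within-batch drift correction and batch median centering described above, the following minimal Python sketch may be helpful. It is not the authors' R pipeline; the function names `fit_line` and `normalize_batch` and the example values are hypothetical, and it handles a single analyte in a single batch.

```python
from statistics import median


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y ~ x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def normalize_batch(run_order, log_abundance):
    """Subtract the fitted linear drift, then median-center the batch."""
    intercept, slope = fit_line(run_order, log_abundance)
    detrended = [y - (intercept + slope * x)
                 for x, y in zip(run_order, log_abundance)]
    center = median(detrended)
    return [value - center for value in detrended]


# Hypothetical log-abundances of one analyte drifting downward within a batch.
run_order = [1, 2, 3, 4, 5]
log_abundance = [10.0, 9.8, 9.6, 9.4, 9.2]
normalized = normalize_batch(run_order, log_abundance)
```

Because the example drift is perfectly linear, the normalized values are all (numerically) zero; real data would retain the biological variation around the fitted drift line.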
We applied principal component analysis (PCA) to examine the overall distribution of the sample data (with all 9,651 features) and
check the run quality. The gestational ages (based on first-trimester ultrasound measurements) were superimposed to facilitate the
analysis. The vast majority of the samples separated into pre- and postpartum groups in the PCA space defined by the two components that explained the most variance (PC1 and PC2; Figure 1B). Two samples from the same subject (the last two in her collection, one before and one after childbirth) displayed irregular behavior in both PCA and unsupervised clustering analysis; these two samples were treated as outliers and excluded from further analysis. We also performed partial least-squares discriminant analysis (PLS-DA) according to the categories of gestational age (using the mixOmics package).
Section 2: Metabolic Features Identification

Metabolite identification was performed using a two-step approach. First, to identify compounds, we used our in-house metabolite
library, which contains chemical standards and a manually curated compound list based on accurate mass (m/z, ± 5 ppm), retention
time and spectral patterns. Second, further metabolites were identified based on accurate mass, isotope pattern and MS/MS spectra
against public databases, including HMDB, MoNA, MassBank, METLIN, and NIST.
Specifically, tandem mass spectrometry (MS/MS) data of QC samples were acquired using a Thermo Q Exactive plus mass spectrometer. The raw MS data (.raw format) were converted to .mgf format files using ProteoWizard (Chambers et al., 2012) (version 3.0.19095-938eda31a, http://proteowizard.sourceforge.net). Using the metabolic features table (from Waters Progenesis QI) and the QC MS/MS data (.mgf format), the metabolic features and MS/MS spectra were matched according to their accurate masses (±25 ppm) and RT values (±30 s) (Shen et al., 2019). If one metabolic feature matched multiple MS/MS spectra, all matched MS/MS spectra were used for the identification.
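The feature-to-spectrum matching step can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the metID implementation: the function names, the data layout (dictionaries of m/z and retention time in seconds), and the example values are all hypothetical.

```python
def ppm_error(observed_mz, reference_mz):
    """Absolute mass error in parts per million relative to reference_mz."""
    return abs(observed_mz - reference_mz) / reference_mz * 1e6


def match_spectra(features, spectra, ppm_tol=25.0, rt_tol=30.0):
    """Map each feature to every MS/MS spectrum within both tolerances."""
    matches = {}
    for feature_id, (f_mz, f_rt) in features.items():
        hits = [
            spectrum_id
            for spectrum_id, (s_mz, s_rt) in spectra.items()
            if ppm_error(s_mz, f_mz) <= ppm_tol and abs(s_rt - f_rt) <= rt_tol
        ]
        if hits:
            matches[feature_id] = hits  # all matched spectra are kept
    return matches


# Hypothetical feature and precursor values: (m/z, retention time in s).
features = {"F1": (315.2319, 420.0)}
spectra = {
    "S1": (315.2350, 432.0),  # roughly 10 ppm and 12 s away: accepted
    "S2": (315.2319, 500.0),  # same mass but 80 s away: rejected
    "S3": (315.2500, 421.0),  # roughly 57 ppm away: rejected
}
matched = match_spectra(features, spectra)
```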
Next, the generated MS1/MS2 pairs were automatically searched in the public databases: HMDB (http://www.hmdb.ca/), MoNA
(http://mona.fiehnlab.ucdavis.edu/), and MassBank (http://www.massbank.jp/). The MS/MS spectra similarity score was calculated
using the forward dot-product algorithm (Stein and Scott, 1994), which considers both fragments and intensities. The similarity score
cutoff was set as 0.5.
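To make the similarity score concrete, the sketch below computes a plain normalized dot product between two centroided spectra. It is only an approximation of the idea: published dot-product variants may weight peaks by m/z and intensity, which is omitted here, and the fragment tolerance `mz_tol`, the function name, and the example spectra are assumptions rather than values taken from the study.

```python
import math


def dot_product_score(query, library, mz_tol=0.01):
    """Normalized dot product of two spectra given as (m/z, intensity) lists."""
    shared = 0.0
    used = set()
    for q_mz, q_int in query:
        for j, (l_mz, l_int) in enumerate(library):
            # Pair each query fragment with at most one unused library fragment.
            if j not in used and abs(q_mz - l_mz) <= mz_tol:
                shared += q_int * l_int
                used.add(j)
                break
    q_norm = math.sqrt(sum(i * i for _, i in query))
    l_norm = math.sqrt(sum(i * i for _, i in library))
    return shared / (q_norm * l_norm) if q_norm and l_norm else 0.0


# Hypothetical spectra: the library entry has one extra low-intensity fragment.
query = [(85.03, 100.0), (145.05, 50.0)]
library = [(85.03, 100.0), (145.05, 50.0), (201.10, 10.0)]
score = dot_product_score(query, library)
passes_cutoff = score >= 0.5  # cutoff used in the text
```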
Furthermore, the metabolic features with MS/MS spectra that were not matched in the downloaded public databases were searched in the online public databases METLIN (https://metlin.scripps.edu) and NIST (https://www.nist.gov/). The MS/MS spectral matches were then manually checked to confirm the identifications, which were considered level 2 identifications according to the Metabolomics Standards Initiative (MSI) (Viant et al., 2017). In addition, the metabolic peaks with MS/MS spectra that were not matched in public databases were analyzed by MetDNA (Shen et al., 2019) and given an MSI level 4 identification.
Finally, predictors from the machine-learning models were further confirmed with chemical standards by matching the accurate masses (±5 ppm), retention times (±30 s), and MS/MS spectra for an MSI level 1 identification (Viant et al., 2017).
In the rare cases in which a given metabolic feature was matched differently by different matching methods, we chose the match based on the identification level: standards > MS/MS > MetDNA.
Section 3: Identify Significantly Altered Features/Compounds

A statistical method designed for multiple testing, SAM (Significance Analysis of Microarrays) (Tusher et al., 2001), was applied to identify metabolic features/compounds that were significantly altered in the metabolome-wide analysis. Specifically, we used SAM to examine the correlation between the abundance of each compound and the gestational age of each sample in the Discovery and Test Set 1 cohorts. For all SAM analyses, distribution-independent ranking tests (based on the Wilcoxon test) and sample-wise permutation (the default in the samr package) were used to ascertain significance (false discovery rate, FDR < 0.05). Adjusted gestational ages, calculated by scaling all delivery event timing to 40 weeks, were used in a number of plots to present the changes in metabolites among individuals. The population baseline was calculated by taking the mean intensity values of all women with samples before 14 weeks (20 out of 30 women).
To identify the most strongly changed compounds, those whose abundance increased or decreased by more than 50% over the whole pregnancy (40 weeks), we performed a linear regression between log2 abundance and the gestational week of each sample, and only those compounds with an absolute slope larger than log2(1.5)/40 weeks ≈ 0.015 per week were chosen.
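The slope cutoff follows from requiring a 1.5-fold change over 40 weeks on the log2 scale. A minimal Python sketch of this filter is given here; the function names and the example abundances are hypothetical and serve only to illustrate the arithmetic.

```python
import math

# A 50% change over 40 weeks corresponds to log2(1.5) / 40 per week.
SLOPE_CUTOFF = math.log2(1.5) / 40  # about 0.0146, rounded to 0.015 in the text


def regression_slope(weeks, log2_abundance):
    """Least-squares slope of log2 abundance against gestational week."""
    n = len(weeks)
    mean_w, mean_a = sum(weeks) / n, sum(log2_abundance) / n
    sxx = sum((w - mean_w) ** 2 for w in weeks)
    sxy = sum((w - mean_w) * (a - mean_a)
              for w, a in zip(weeks, log2_abundance))
    return sxy / sxx


def is_top_changed(weeks, log2_abundance):
    """True when the absolute slope exceeds the 50%-per-pregnancy cutoff."""
    return abs(regression_slope(weeks, log2_abundance)) > SLOPE_CUTOFF


# Hypothetical compounds sampled at four gestational weeks.
weeks = [10, 20, 30, 40]
rising = [5.0, 5.3, 5.6, 5.9]      # slope 0.03 per week
stable = [5.0, 5.05, 5.10, 5.15]   # slope 0.005 per week
```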
Section 4: Regularized Partial Correlation Network

The regularized partial correlation network captures the remaining association between two nodes after controlling for all other information (indirect correlations) in the network (Epskamp and Fried, 2018). Namely, each node represents a compound, and each edge represents the strength of the partial correlation between two nodes after conditioning on all other variables in the dataset. Edge weights represent the partial correlation coefficients. Lasso (least absolute shrinkage and selection operator) was used to shrink small association coefficients to zero and thus limit spurious correlations in the network. To perform the lasso-based regularized partial correlation, we used the qgraph package in R. The tuning parameter gamma (γ), which controls the complexity of the network, was set to 0.5 as suggested (Epskamp and Fried, 2018). Three measures indicated how important each metabolite is in the network: strength (the sum of the absolute edge weights connected to each metabolite), closeness (the inverse of the sum of distances from one metabolite to all others), and betweenness (how often one metabolite lies on the shortest paths between other metabolites).
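Of the three centrality measures, strength is the simplest to state in code. The sketch below assumes a network already estimated elsewhere (in the study, with qgraph in R) and represented as a dictionary of edge weights; the edge values and the function name are hypothetical, and edges shrunk to zero by the lasso are simply absent.

```python
def strength(edges):
    """Sum of absolute edge weights attached to each node."""
    totals = {}
    for (node_a, node_b), weight in edges.items():
        totals[node_a] = totals.get(node_a, 0.0) + abs(weight)
        totals[node_b] = totals.get(node_b, 0.0) + abs(weight)
    return totals


# Hypothetical partial correlations between four compounds.
edges = {
    ("progesterone", "estriol"): 0.4,
    ("progesterone", "THDOC"): -0.3,
    ("estriol", "DHEA-S"): 0.2,
}
node_strength = strength(edges)
most_central = max(node_strength, key=node_strength.get)
```

Negative partial correlations contribute through their absolute value, so a node with strong negative associations still counts as central.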
Section 5: Pathway Analysis

The compounds identified by the methods mentioned above were pooled together. We utilized MetaboAnalystR (Chong et al., 2018)
(https://github.com/xia-lab/MetaboAnalystR) to perform the metabolite set enrichment analysis (MSEA) (Xia and Wishart, 2010b) as
well as metabolic pathway analysis (MetPA) (Xia and Wishart, 2010a) on all identified metabolites. For the potential location/organ
analysis on metabolites, we excluded male organ/cell types for MSEA. To quantify pathway activity, we averaged the intensities of all identified metabolites for each pathway that included at least three identified metabolites and plotted them on the heatmap (Figure 3B). The pathway activity before 14 weeks was averaged across all available samples and subtracted from all later time points. The statistical significance of the changes in a pathway’s activity across pregnancy was evaluated by global testing (Goeman and Bühlmann, 2007), the default method used by MetaboAnalystR. The topological pathway impacts were quantified using a published method (Xia and Wishart, 2010a) with MetaboAnalystR. Human disease states that correlated with pregnancy-related metabolites were calculated based on published metabolomics data (Chong et al., 2018).
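The pathway activity calculation (averaging member metabolites, requiring at least three members, and subtracting the pre-14-week baseline) can be sketched as follows. The data layout, function names, and example values are assumptions for illustration only; for simplicity the baseline is subtracted from every time point, including the early ones.

```python
from statistics import mean


def pathway_activity(intensities, pathways, min_members=3):
    """Average member intensities per time point for sufficiently covered pathways."""
    activity = {}
    for name, members in pathways.items():
        found = [m for m in members if m in intensities]
        if len(found) < min_members:
            continue  # pathways with fewer than three identified metabolites are skipped
        n_points = len(intensities[found[0]])
        activity[name] = [
            mean(intensities[m][t] for m in found) for t in range(n_points)
        ]
    return activity


def subtract_baseline(series, weeks, baseline_before=14):
    """Subtract the mean activity of samples taken before the baseline week."""
    baseline = mean(v for v, w in zip(series, weeks) if w < baseline_before)
    return [v - baseline for v in series]


# Hypothetical normalized intensities at four gestational weeks.
weeks = [8, 12, 20, 30]
intensities = {
    "A": [1.0, 1.0, 2.0, 3.0],
    "B": [2.0, 2.0, 3.0, 4.0],
    "C": [3.0, 3.0, 4.0, 5.0],
}
pathways = {"steroid pathway": ["A", "B", "C"], "small pathway": ["A", "B"]}
activity = pathway_activity(intensities, pathways)
relative = subtract_baseline(activity["steroid pathway"], weeks)
```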
Section 6: Machine Learning for Pregnancy Timing

Three cohorts of data collected and run at different years but from the same center were used to establish Discovery (subjects N = 21,
samples n = 507), Test Set 1 (subjects N = 9, samples n = 245), and Test Set 2 (subjects N = 8, samples n = 32) datasets, excluding
non-pregnant (postpartum) samples. We applied lasso (R package: glmnet) in the Discovery dataset to select compounds/metabolic
features to build the linear regression model to predict gestational age. A 10-fold cross validation was performed to choose optimal
lambda (penalty for the number of features), which determines the performance of the lasso model (number of features included in the
model and prediction deviations). For the practical utility of a signature in potential clinical settings, in the identified compound-pre-
diction models, if the number of predictors exceeded five under a given optimal lambda, we increased the lambda value so that the
number of predictors is no more than five in the final models. We then used two different methods to evaluate the prediction deviation
of our lasso model produced under a given lambda value: 1) A 10-fold cross validation within the Discovery cohort, in which the
optimal lambda was used (Baumann and Baumann, 2014; Mayr et al., 2018). In the CV, samples were distributed into folds by subject
instead of by samples to prevent person-specific information cross-over between the training folds and the test fold. 2) Validation
tests in the separate Test Set 1 and Test Set 2 cohorts, in which independent subjects were included and samples were analyzed
one year and three years after the Discovery cohort. We built the model using the optimized lambda and the full Discovery dataset. This model was applied to the validation cohort for prediction and verification. For both of the above evaluations, a linear fit between the predicted and actual values was performed, with the Pearson correlation coefficient (R), R2, and RMSE reported.
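Distributing samples into folds by subject, rather than by sample, is what keeps person-specific information out of the test fold. A minimal Python sketch is given below; the function name and subject labels are hypothetical, and the deterministic round-robin assignment stands in for whatever randomization a real pipeline would use.

```python
def subject_folds(sample_subjects, n_folds=10):
    """Assign a fold to each sample so that a subject never spans two folds."""
    subjects = sorted(set(sample_subjects))
    fold_of_subject = {s: i % n_folds for i, s in enumerate(subjects)}
    return [fold_of_subject[s] for s in sample_subjects]


# Hypothetical subject label for each of eight samples.
sample_subjects = ["P01", "P01", "P02", "P02", "P02", "P03", "P04", "P04"]
folds = subject_folds(sample_subjects, n_folds=2)

# For a given test fold, training indices exclude every sample of its subjects.
test_fold = 0
train_idx = [i for i, f in enumerate(folds) if f != test_fold]
test_idx = [i for i, f in enumerate(folds) if f == test_fold]
```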
The contribution of each predictor (metabolite) in each prediction model is defined by:
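$$\mathrm{Contribution}_i = \frac{\left|\mathrm{coefficient}_i\right|}{\sum_{j}\left|\mathrm{coefficient}_j\right|}$$
where i is a metabolite included in the linear model and the sum runs over all metabolites j in that model.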
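As a worked example of this definition, the following Python sketch computes the contribution of each predictor from a set of model coefficients. The coefficient values are hypothetical and do not come from the fitted models.

```python
def contributions(coefficients):
    """Share of each predictor in the summed absolute coefficients."""
    total = sum(abs(c) for c in coefficients.values())
    return {name: abs(c) / total for name, c in coefficients.items()}


# Hypothetical lasso coefficients for a three-metabolite model.
coefficients = {"THDOC": 2.0, "progesterone": -1.0, "estriol-16-glucuronide": 1.0}
shares = contributions(coefficients)
```

With these values the shares are 0.5, 0.25, and 0.25, and they always sum to 1 regardless of coefficient sign.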
Unlike cross-validation in the discovery dataset, validation tests are not prone to hyperparameter selection bias for the lambda value. Since samples from the Test Set 2 cohort were normalized independently from the other samples, a scaling step was applied at the end. Note that since the sequential nature of the data was not used in the machine-learning methods, other tools, such as recurrent neural networks (e.g., LSTM; Hochreiter and Schmidhuber, 1997; Mayr et al., 2018), may be explored to improve the model.
For samples collected after 28 weeks (third-trimester samples), we started with 264 level 1 and 2 compounds and used a discovery
and validation pipeline similar to that described above for predicting gestational age to build logistic regression models predicting
the categorical labels of gestational age > 32 and 37 weeks or delivery within 2, 4, or 8 weeks. The prediction models for > 20, 24, and
28 weeks were built using samples from all three trimesters. For the prediction of delivery within 2, 4, and 8 weeks, only the 18 women
(out of 30) with natural labor onset were included, excluding subjects with induction before labor onset and scheduled cesarean
section (induction by oxytocin/membrane strip after the onset was allowed). To estimate the confidence interval for each AUROC, we
performed bootstrapping by person (instead of by sample) 1,000 times and calculated the 95% confidence interval for the AUROC.
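The person-level bootstrap for the AUROC confidence interval could look roughly like the Python sketch below. It is an assumption-laden illustration: the text states only that persons, not samples, were resampled 1,000 times, so the percentile interval, the pairwise AUROC computation, the function names, and the example data are all hypothetical.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when one class is missing
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(subject_ids, labels, scores, n_boot=1000, seed=0):
    """Percentile 95% interval from resampling whole subjects with replacement."""
    by_subject = {}
    for sid, y, s in zip(subject_ids, labels, scores):
        by_subject.setdefault(sid, []).append((y, s))
    subjects = sorted(by_subject)
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        drawn = [rng.choice(subjects) for _ in subjects]
        sample = [pair for sid in drawn for pair in by_subject[sid]]
        value = auroc([y for y, _ in sample], [s for _, s in sample])
        if value is not None:
            stats.append(value)
    if not stats:
        raise ValueError("no bootstrap replicate contained both classes")
    stats.sort()
    low = stats[int(0.025 * (len(stats) - 1))]
    high = stats[int(0.975 * (len(stats) - 1))]
    return low, high


# Hypothetical data: 6 subjects, 4 samples each (label 1 = positive class).
subject_ids, labels, scores = [], [], []
for i in range(6):
    for label, score in [(0, 0.20 + 0.05 * i), (0, 0.55), (1, 0.50 + 0.05 * i), (1, 0.90)]:
        subject_ids.append(f"S{i}")
        labels.append(label)
        scores.append(score)

point_estimate = auroc(labels, scores)
ci_low, ci_high = bootstrap_auroc_ci(subject_ids, labels, scores)
```

Resampling whole persons keeps the repeated samples from one woman together, so the interval reflects between-person variability rather than treating correlated samples as independent.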
e4 Cell 181, 1680–1692.e1–e5, June 25, 2020

Section 7: Analyze the discrepancies between metabolic clock (GA prediction model) and first-trimester ultrasound
estimations
We first examined, for each person, the correlation between the metabolic clock predictions and the gestational age based on
first-trimester ultrasound. Each correlation was evaluated by Pearson’s correlation. We then performed a meta-analysis across
persons, using Fisher’s method, to generate a summary p value describing the overall correlation in each cohort (cross-validation
in the Discovery cohort; independent validation in Test Set 1).
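Fisher's method combines k independent p values through the statistic X = −2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The Python sketch below is an illustration only; the function name and the example p values are hypothetical, and the closed-form survival function used here is valid because the degrees of freedom are even.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p values with Fisher's method."""
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Chi-square survival function for 2k degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical per-person p values from individual Pearson correlations.
summary_p = fisher_combined_p([0.01, 0.01])
```

With a single p value the method returns that p value unchanged, which is a convenient sanity check.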
Previous literature (Donahue et al., 2010) and our own observations suggest that birth weight and gestational length are positively
correlated; later delivery is associated with a heavier absolute birth weight of an infant. To determine whether an infant’s birth weight
falls above or below the group mean, we performed a linear regression between the two parameters and took the residuals to
represent the birth weight deviation adjusted for delivery timing.
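The residual-based adjustment can be illustrated with an ordinary least-squares fit of birth weight on gestational length. The Python sketch below assumes birth weight is the dependent variable; the function name and the five data points are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def linear_residuals(x, y):
    """Residuals of y after a least-squares linear fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical gestational lengths (weeks) and birth weights (g).
gestational_length = [38, 39, 40, 41, 42]
birth_weight = [3000, 3300, 3400, 3700, 3800]
adjusted = linear_residuals(gestational_length, birth_weight)
```

A positive residual marks an infant heavier than expected for the delivery timing, and a negative residual marks one lighter than expected.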
Average Δ(GAmetabolic − GAultrasound): For each person, at each time point, we examined the difference between the metabolic clock
and first-trimester ultrasound estimates of gestational age. These values were averaged for each person to represent the overall
relative pace of the metabolic clock compared to the first-trimester ultrasound estimation. We then examined the correlation between
delivery-timing-adjusted birth weights and average Δ(GAmetabolic − GAultrasound) (Figure 5C).
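The per-person averaging described above amounts to grouping the time-point differences by subject and taking the mean. A minimal Python sketch follows; the record layout, function name, and values are hypothetical.

```python
from collections import defaultdict


def average_delta_ga(records):
    """Mean of (GAmetabolic - GAultrasound) per subject.

    records: iterable of (subject_id, ga_metabolic_weeks, ga_ultrasound_weeks).
    """
    diffs = defaultdict(list)
    for subject_id, ga_metabolic, ga_ultrasound in records:
        diffs[subject_id].append(ga_metabolic - ga_ultrasound)
    return {sid: sum(values) / len(values) for sid, values in diffs.items()}


# Hypothetical time points for two subjects.
records = [
    ("S1", 12.0, 11.0),
    ("S1", 20.0, 19.0),
    ("S2", 30.0, 31.0),
    ("S2", 35.0, 35.0),
]
mean_delta = average_delta_ga(records)
```

A positive average indicates a metabolic clock running ahead of the ultrasound estimate, and a negative average indicates one running behind.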
To examine whether an accelerated metabolic clock (compared to the first-trimester ultrasound estimation) is associated with
advanced delivery, we correlated average Δ(GAmetabolic − GAultrasound) with delivery timing, only in women
with a natural labor onset (Figure 5D).
[Figure S1 image not reproduced. Panels: (A) pregnancy samples, postpartum samples, and childbirth events per subject against gestational age (weeks); (B) scree plot of variances against component number; (C) PLS-DA, component 1 (5%) vs. component 2 (2%), by GA category (under 10, 10–20, 20–30, over 30, PP); (D and E) PCA, PC1 (5.1%) vs. PC2 (3.9%), by subject and by batch (Discovery, Validation); (F) histogram of linear fitting slopes, ln(intensity) ~ gestational age, with log10(frequency) on the y axis; (G) log2(intensity) against gestational age (weeks).]
Figure S1. Untargeted Metabolomics for Longitudinal Pregnancy Samples, Related to Figure 1
(A) High-density longitudinal sampling of pregnancies.
(B) The Scree plot of the principal component analysis.
(C) The PLS-DA result according to the categories of gestational age. GA: gestational age; PP: postpartum.
(D and E) Principal component analysis based on all 9,651 features shows that the samples do not separate according to the 30 subjects (D;
samples from individual subjects are represented by different colors) or the experimental batches of Discovery and Validation (Test Set 1)
analyzed across two different years (E; samples of the discovery cohort are presented in red and samples of the validation cohort (Test Set 1) in blue).
(F) Histogram shows the distribution of slopes in the linear fitting model of the 9,651 features (intensities against the gestational ages).
(G) For each of the 30 women, the intensities of an example metabolic feature are shown over the course of gestation, which reveals consistent increases in
abundance according to gestational age among 30 subjects, despite individual differences.
[Figure S2 image not reproduced. Panel A: heatmap of pairwise Pearson correlation coefficients (color scale 1 to −1) among pregnancy-related compounds, annotated by pregnancy alteration (increase or decrease) and by group (amino acid metabolism, bile acid biosynthesis, caffeine metabolism, fatty acid metabolism, phospholipid metabolism, steroid hormone biosynthesis, others). Panel B: strength, closeness, and betweenness of each metabolite in the network.]
Figure S2. Functional Metabolite Groups Altered during Pregnancy, Related to Figure 2
(A) Correlation matrix colored by the Pearson correlation coefficient of each pair of pregnancy-related compounds across samples.
(B) The strength, closeness, and betweenness of metabolites in the regularized partial correlation network indicate how important the metabolites are in the
network. Metabolite names are listed on the left side, ranked by closeness; the names of the seven compounds in the prediction models of
Figure 4 and Figure 5 are shown in bold.
Figure S3. Pregnancy-Related Metabolic Pathways and Metabolite Origin Analysis, Related to Figure 3
(A) Steroid hormone biosynthesis pathway, with metabolite increases (in red) or decreases (in blue) over the course of gestation.
(B) Numerous metabolites in plasma that were altered during pregnancy can be traced back to organs by metabolite set enrichment analysis (MSEA).
(C) Arachidonic acid metabolism pathway, with metabolite increases (in red) or decreases (in blue) over the course of gestation.
(D) Average levels of 20-HETE and 5-HETE over the course of gestation. The intensities were normalized to the baseline, which was
defined by averaging all samples before 14 weeks. The standard errors, derived from 30 subjects, are shown. The gestational ages were adjusted by scaling
delivery events to 40 weeks. PP, postpartum.
[Figure S4 image not reproduced. (A and D) Model reduction: number of predictors against cross-validation RMSE. (B and C) GAmetabolic (weeks) against GAultrasound (weeks): Discovery, R2 = 0.93, N = 507; Test Set 1, R2 = 0.91, N = 245. (E and F) MS/MS spectra, relative intensity against mass-to-charge ratio (m/z): PE(P-16:0e/0:0), m/z 438.2974, RT error 0.6 s; DHEA-S (dehydroepiandrosterone sulfate), m/z 367.1583, RT error 9 s.]
Figure S4. Metabolites Predict Gestational Age in Machine-Learning Models, Related to Figure 4
(A) Feature selection for predicting gestational age (GA) using metabolomic features.
(B and C) GA predicted by metabolic features (GAmetabolic, y axis) highly correlates with clinical values determined by standard of care (by first-trimester
ultrasound, GAultrasound, x axis) in the Discovery (B) and the validation cohort (Test Set 1) (C). The 95% confidence interval for the linear regression is
represented by the gray area.
(D) Feature selection for predicting GA using identified metabolites.
(E and F) Measured MS/MS fragmentation profiles (upper) of PE(P-16:0e/0:0) (E) and DHEA-S (F) match the MS/MS of standard compounds (lower). GA,
gestational age.
[Figure S5 image not reproduced. (A) Summary table of models for gestational age > 20, 24, 28, 32, and 37 weeks; AUCs in validation: 0.98, 0.98, 0.97, 0.89, and 0.87, with predictors THDOC, progesterone, androstane-3,17-diol, and estriol-16-glucuronide ranked by contribution. (B) ROC curve for GA > 37 weeks (true positive rate against false positive rate): Discovery, AUC 0.91, N = 222, 95% CI 0.87–0.96; Test Set 1, AUC 0.87, N = 98, 95% CI 0.79–0.93. (C) Contribution of the GA predictors. (D–F) log2(intensity) by subject ID for GA < 37 weeks and GA > 37 weeks. (G–J) ROC curves for GA > 20, 24, 28, and 32 weeks in the Discovery and Validation cohorts.]
Figure S5. Metabolites Selected by Machine Learning Can Accurately Predict Gestational Age before or after 20, 24, 28, 32, and 37 Weeks in
Both the Discovery and Validation Cohort (Test Set 1), Related to Figure 4
(A) Summary of prediction models of gestational age (GA) before or after 20, 24, 28, 32, and 37 weeks, using two to three metabolites. Note that the prediction
models for 20, 24, and 28 gestational weeks were built using samples from all three trimesters and the ones for late pregnancy (32 and 37 weeks) were built using
third-trimester samples. The contribution rank of each predictor in every model is listed as number 1, 2, or 3. Areas under the curve (AUCs) in the validation
cohort (Test Set 1) are listed.
(B) The logistic regression model based on three metabolites can accurately distinguish the third-trimester plasma samples before or after 37 weeks.
(C) Contribution of the three metabolites to the prediction model of gestational age before or after 37 weeks.
(D) Estriol-16-Glucuronide shows intensity range separations before and after 37 weeks.
(E and F) THDOC and androstane-3,17-diol show intensity range separations before/after 37 weeks.
(G–J) The logistic regression models can accurately distinguish pregnancy samples before or after 20 (G), 24 (H), and 28 (I) weeks, and the third-trimester plasma
samples before or after 32 weeks (J). GA, gestational age.
[Figure S6 image not reproduced. (A and B) Histograms of prediction deviation (RMSE) against frequency for Discovery and Test Set 1. (C) Birth weight (g) against birth gestational age (weeks), R2 = 0.11, N = 29. (D and E) log2(intensity) by subject ID for WD > 2w and WD < 2w. (F and G) MS/MS spectra, relative intensity against mass-to-charge ratio (m/z): androstane-3,17-diol, m/z 257.226076, RT error 2.4 s; 17α-hydroxyprogesterone, m/z 331.226241, RT error 29.4 s. (H and I) ROC curves (true positive rate against false positive rate) for weeks to delivery (WD) < 4w and < 8w: Discovery, N = 128, AUC 0.96 (95% CI 0.93–0.99) and AUC 0.90 (95% CI 0.84–0.96); second curve, AUC 0.87 (95% CI 0.78–0.95) and AUC 0.91 (95% CI 0.80–0.97).]
Figure S6. Identified Compounds Predict Gestational Age and 4 and 8 Weeks Approaching Delivery, Related to Figure 5
(A and B) Histograms show the distribution of prediction deviation (RMSE) in the cross-validation of the discovery cohort (A) and in the validation cohort
(Test Set 1) (B).
(C) The baby birth weight shows correlation with the gestational length (gestational age at childbirth). All 29 subjects who had baby birth weight information are
included here. The 95% confidence interval for the linear regression is represented by the gray area.
(D and E) Androstane-3,17-diol (D) and estriol-16-Glucuronide (E) show intensity range separations before or after 2 weeks approaching the delivery.
(F and G) Measured MS/MS fragmentation profiles (upper) of androstane-3,17-diol (F) and 17α-hydroxyprogesterone (G) match the MS/MS of standard
compounds (lower).
(H and I) The logistic regression models can accurately identify the third-trimester plasma samples approaching delivery (weeks to delivery, WD < 4w (H), WD < 8w
(I); only includes women with natural labor onset). Note that the discovery results were from the 10-fold cross-validation (CV) instead of direct fitting to avoid
overfitting.
Correction
An Engineered CRISPR-Cas9 Mouse Line
for Simultaneous Readout of Lineage Histories
and Gene Expression Profiles in Single Cells
Sarah Bowling, Duluxan Sritharan, Fernando G. Osorio, Maximilian Nguyen, Priscilla Cheung, Alejo Rodriguez-Fraticelli,
Sachin Patel, Wei-Chien Yuan, Yuko Fujiwara, Bin E. Li, Stuart H. Orkin, Sahand Hormoz,* and Fernando D. Camargo*
*Correspondence: sahand_hormoz@hms.harvard.edu (S.H.), fernando.camargo@childrens.harvard.edu (F.D.C.)
(Cell 181, 1410–1422.e1–e27; June 11, 2020)

Due to a production error, Figures S5 and S6 were not included with the article when it was initially published. In addition, in the following
equation, equal (=) signs were incorrectly included between the six union (∪) signs and the curly brackets.

[Equations not reproduced. Both versions give recursions for the sets M_{j,k}, D_{j,k}, and I_{j,k}, each written as the union of three sets built from elements of M, D, and I at indices (j−1, k−1), (j, k−1), and (j−1, k), respectively. In the incorrect equation, each of the six union signs is followed by an equal sign; in the correct equation, the union signs directly precede the curly brackets.]
These errors have now been corrected online, and we apologize for the inconvenience.
1694 Cell 181, 1693–1694, June 25, 2020

Retraction
Retraction Notice to:
A Monoclonal Antibody that Targets
a NaV1.7 Channel Voltage Sensor
for Pain and Itch Relief
Jun-Ho Lee, Chul-Kyu Park, Gang Chen, Qingjian Han, Rou-Gang Xie, Tong Liu, Ru-Rong Ji,* and Seok-Yong Lee*
*Correspondence: ru-rong.ji@duke.edu (R.-R.J.), sylee@biochem.duke.edu (S.-Y.L.)
(Cell 157, 1393–1404; June 5, 2014)

In this publication, using in vitro and in vivo approaches, we described and characterized a monoclonal antibody (mAb) that binds to
the voltage sensor of the sodium channel subtype Nav1.7 and inhibits channel function. A follow-up study by Liu et al. reported that a
similar but distinct recombinant mAb targeting Nav1.7 did not show significant in vitro activities (Liu et al., 2016, F1000 Res. 5, 2764,
https://doi.org/10.12688/f1000research.9918.1), which prompted us to re-examine our previous results. In the process, we found
irregularities in some of the raw data used for the published in vitro results, and we notified our institution, Duke University, about
the irregularities. The institution subsequently appointed an ad hoc committee that concluded that the first author, Jun-Ho Lee,
fabricated and/or falsified the results in Figures 3A, 3C, 3D, and 4. While subsequent work has both clarified the distinct in vitro activities of
the two antibodies and confirmed the in vivo activity of our antibody (Bang et al., 2018, Neurosci. Bull. 34, 22–41,
https://doi.org/10.1007/s12264-018-0203-0), we feel that the responsible course of action is to retract our paper because it contains falsified data. We
sincerely apologize to the scientific community for the inconvenience and confusion that we have caused.
The first author, Jun-Ho Lee, did not respond to the request to sign this retraction.
Cell 181, 1695, June 25, 2020 © 2020 Elsevier Inc. 1695
SnapShot: JAK-STAT Signaling II
Alejandro V. Villarino,1 Massimo Gadina,1 John J. O’Shea,1 and Yuka Kanno1
1National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS),
National Institutes of Health (NIH), Bethesda, MD 20892, USA
[SnapShot schematic not reproduced. It shows the JAK domains (FERM domain, JH 6-7; SH2-like domain, JH 3-5; pseudokinase domain, JH 2; kinase domain, JH 1) and the STAT domains (ND, CCD, DBD, SH2, CD); ligand, receptor, and JAKs at the plasma membrane; the cytoplasmic STAT pool (monomer, parallel and antiparallel dimers, tetramer, heterotrimer of STAT1, STAT2, and IRF9); STAT3 in the mitochondrion; and, in the nucleus, gene transcription and eRNA transcription at promoters, proximal, intronic, and distal enhancers, and repressed chromatin, with the IL-12-induced STAT4, IFN-β-induced STAT2, and IL-4-induced STAT6 binding motifs.]
Abbreviations: JH, JAK homology (1-7); FERM, 4.1 protein, ezrin, radixin, moesin; ND, N-terminal domain; CCD, coiled-coil domain; DBD, DNA-binding domain; SH2, Src homology 2 domain; CD, C-terminal trans-activation domain; PTP, protein tyrosine phosphatase; SOCS, suppressor of cytokine signaling; USP18, ubiquitin specific peptidase 18; SK, serine kinase; ATP, adenosine triphosphate; JAKi, JAK inhibitor; ETC, electron transport chain; TSS, transcription start site; TF, other transcription factors; Pol II, RNA polymerase II; HAT, histone acetyltransferase; MT, methyltransferase; eRNA, enhancer RNA; PIAS, protein inhibitor of activated STAT.
Mutations of JAK
JAK | Mouse knockout | Human LOF | Human GOF
JAK1 | Perinatal lethality, T−B+NK− SCID | Primary immunodeficiency, atypical mycobacterial infections; somatic mutations in cancers | Systemic immune dysregulation, hypereosinophilic syndrome
JAK2 | Embryonic lethal, anemia | Not reported | Myeloproliferative neoplasms, leukemia, lymphoma
JAK3 | T−B+NK− SCID | T−B+NK− SCID | Leukemia
TYK2 | Viral susceptibility, diminished responses to type I IFNs, IL-12, IL-23 | Primary immunodeficiency, variants protect against autoimmunity | Not reported

Mutations of STAT
STAT | Mouse knockout | Human LOF | Human GOF
STAT1 | Impaired IFN signaling, viral and bacterial susceptibility | Primary immunodeficiency | Fungal susceptibility, autoimmunity, aneurysms
STAT2 | Viral susceptibility | Viral susceptibility | Interferonopathy, autoinflammatory disease
STAT3 | Lethal; multiple defects revealed by conditional deletion | Primary immunodeficiency, hyperimmunoglobulin E syndrome | Systemic autoimmunity, anemia, leukemia, lymphomas, other cancers
STAT4 | Impaired type I responses | Tuberculosis, atypical mycobacterial and fungal susceptibility, Kaposi sarcoma | Variants associated with rheumatoid arthritis, Sjogren’s syndrome and systemic lupus erythematosus
STAT5A | Impaired mammary gland development | Not reported | Leukemia, lymphomas, other cancers
STAT5B | Impaired NK cell function and sexual dimorphic growth | Dwarfism, autoimmunity | Somatic mutations and eosinophilia, urticaria, leukemia, lymphomas, other cancers
STAT6 | Impaired type II immune responses | Not reported | Lymphoma
See online version for legends and references.

1696 Cell 181, June 25, 2020 © 2020 Published by Elsevier Inc. DOI https://doi.org/10.1016/j.cell.2020.04.052
The discovery of the Janus kinase (JAK) and signal transducer and activator of transcription (STAT) pathway arose from investigation of interferon (IFN) signaling (Darnell et
al., 1994; Leonard and O’Shea, 1998). Canonical JAK-STAT signaling begins with extracellular apposition of members of a family of structurally related cytokines, interleukins,
interferons, colony-stimulating factors, and some hormones with their corresponding structurally related transmembrane receptors. This enables trans-activation of receptor-
bound JAKs that catalyze tyrosine phosphorylation (p-Tyr) of receptors and STATs, resulting in the formation of homodimers and/or heterodimers that accumulate in the
nucleus and instruct gene transcription. Interferons also induce a complex of STAT1, STAT2, and IRF9.
In mammals, there are 4 JAKs sharing 4 major structural domains: (1) the FERM (band-four-point-one ezrin radixin moesin) domain, which mediates interaction with
receptors and promotes kinase function, (2) the SH2-like domain, which mediates interaction with receptors, (3) the pseudokinase domain, which regulates kinase activity,
and (4) the kinase domain. Germline loss of function (LOF) mutations of JAK3 and TYK2 underlie human primary immunodeficiency disorders (Tangye et al., 2017). Somatic
LOF mutations of JAK1 can arise in tumor cells and are associated with resistance to IFN-γ and cancer evasion. Gain of function (GOF) mutations underlie systemic
autoimmunity (JAK1), polycythemia vera (JAK2 kinase-like domain), leukemias, lymphomas, and other malignancies. JAKs were recognized as pivotal drug targets and multiple JAK
inhibitors (jakinibs) have been approved for the treatment of myeloproliferative neoplasms, graft versus host disease, rheumatoid arthritis, psoriatic arthritis, and inflammatory
bowel disease (Gadina et al., 2020). Most jakinibs target the JAK kinase domain, but newer agents act via allosteric mechanisms (e.g., target the kinase-like domain). Despite
the clinical success, questions remain regarding the optimal degree of JAK inhibition for specific cell types in various tissues and disorders. Compared with first-generation
pan-jakinibs, selective jakinibs may provide advantages in terms of reduced toxicity. For example, selective targeting of JAK1, TYK2, and JAK3 would avoid disrupting actions
of JAK2-dependent cytokines involved in hematopoiesis (e.g., erythropoietin). In some circumstances, though, selectivity might result in reduced efficacy.
There are seven mammalian STAT family members bearing five major structural domains. In addition to JAKs, STATs may be phosphorylated by receptor tyrosine kinases,
src family kinases, and bacterial and parasite enzymes. STAT expression and availability are subject to a range of intrinsic (e.g., cellular lineage) and extrinsic (e.g., cytokines)
factors. They are regulated at transcriptional, post-transcriptional, and post-translational levels, including by protein tyrosine phosphatases (PTPs), which “de-activate” STATs
and mediate degradation or recycling. STATs are subject to serine phosphorylation (p-Ser), which influences both p-Tyr-dependent and p-Tyr-independent functions. The latter
are particularly relevant for “unphosphorylated” STATs (uSTATs), which are arranged as anti-parallel dimers (unlike conventional p-Tyr-dependent parallel dimers). uSTATs can
modulate the pool of cytoplasmic STATs and mediate gene transcription (Stark et al., 2018; Stark and Darnell, 2012). STATs are present in mitochondria and positively regulate
complexes I, II, and V of the electron transport chain, elevating mitochondrial membrane potential and favoring mitochondrial respiration (Garama et al., 2016).
STATs are classical transcription factors (TFs) that engage DNA regulatory elements (DREs) bearing a defined sequence motif and instruct transcription of protein-coding
(mRNAs) or non-coding genes (microRNAs, long non-coding RNAs). This is achieved through a combination of proximal DREs located close to the transcriptional start sites
(TSS) or within gene bodies and distal elements, which can physically interact via chromatin looping. STATs bind broadly throughout the genome at promoters and even more so at
enhancers. STATs often congregate at both conventional and super-enhancers; notably, the genes encoding STATs themselves bear super-enhancers, in line with the idea that
they are tightly transcriptionally regulated.
STATs typically engage multiple DREs associated with a given gene, and genes are often bound by multiple STAT family members, along with other TFs, such as IRF, AP-1,
and NF-κB family proteins, creating a platform for multi-molecular networks to control gene expression (Villarino et al., 2017). STATs also influence chromatin remodeling by
recruiting histone-modifying enzymes. Although originally identified as transcriptional activators, STAT binding is associated with both induction and repression of genes.
Germline STAT mutations are associated with primary immunodeficiency and autoimmunity, whereas somatic GOF STAT mutations are associated with cancer. STATs do
not have enzymatic activity and are more challenging than JAKs as therapeutic targets. Three potential strategies have been employed: (1) inhibitory peptides, which
sequester STATs from upstream receptors and kinases, (2) small-molecule inhibitors, which impede STAT activation and/or function, and (3) decoy oligonucleotides, which sequester
STATs away from genomic binding sites.
The JAK-STAT pathway is negatively regulated in multiple ways. Aside from dephosphorylation, suppressor of cytokine signaling (SOCS) proteins are induced by cytokines
and IFNs and inhibit proximal receptor signaling (Morris et al., 2018). Ubiquitin specific peptidase 18 (USP18) is induced by IFNs and binds to STAT2, negatively regulating its
function; mutations that interfere with USP18 action are associated with lethal interferonopathy (Basters et al., 2018). STATs are also targeted and degraded by viruses.
In summary, the JAK-STAT pathway has emerged as a paradigm for membrane-to-nucleus signaling, and building on the legacy of the first quarter century of JAK-STAT
research, the field continues to deliver transformative, clinically relevant insights on the nature of intercellular communication, gene expression, and signal-dependent
transcription factors. Still, there is much to learn. Detailed molecular and cell-biologic understanding of cytokine receptor/JAK structure-function relationships will be further
enhanced by advanced real-time measurement of signaling and gene transcription by imaging. These advances are eagerly awaited and will no doubt offer many new
translational insights and therapeutic opportunities.
ACKNOWLEDGMENTS
This work was supported by the NIAMS Intramural Research Program.
REFERENCES
Basters, A., Knobeloch, K.P., and Fritz, G. (2018). USP18 - a multifunctional component in the interferon response. Biosci. Rep. 38, BSR20180250.
Darnell, J.E., Jr., Kerr, I.M., and Stark, G.R. (1994). Jak-STAT pathways and transcriptional activation in response to IFNs and other extracellular signaling proteins. Science 264,
1415–1421.
Gadina, M., Chisolm, D.A., Philips, R.L., McInnes, I.B., Changelian, P.S., and O’Shea, J.J. (2020). Translating JAKs to Jakinibs. J. Immunol. 204, 2011–2020.
Garama, D.J., White, C.L., Balic, J.J., and Gough, D.J. (2016). Mitochondrial STAT3: Powering up a potent factor. Cytokine 87, 20–25.
Leonard, W.J., and O’Shea, J.J. (1998). Jaks and STATs: biological implications. Annu. Rev. Immunol. 16, 293–322.
Morris, R., Kershaw, N.J., and Babon, J.J. (2018). The molecular details of cytokine signaling via the JAK/STAT pathway. Protein Sci. 27, 1984–2009.
Stark, G.R., and Darnell, J.E., Jr. (2012). The JAK-STAT pathway at twenty. Immunity 36, 503–514.
Stark, G.R., Cheon, H., and Wang, Y. (2018). Responses to Cytokines and Interferons that Depend upon JAKs and STATs. Cold Spring Harb. Perspect. Biol. 10, a028555.
Tangye, S.G., Pelham, S.J., Deenick, E.K., and Ma, C.S. (2017). Cytokine-Mediated Regulation of Human Lymphocyte Development and Function: Insights from Primary
Immunodeficiencies. J. Immunol. 199, 1949–1958.
Villarino, A.V., Kanno, Y., and O’Shea, J.J. (2017). Mechanisms and consequences of Jak-STAT signaling in the immune system. Nat. Immunol. 18, 374–384.

