
Article

The gut microbiota is associated with immune cell dynamics in humans

https://doi.org/10.1038/s41586-020-2971-8

Jonas Schluter1,2 ✉, Jonathan U. Peled3,4, Bradford P. Taylor2, Kate A. Markey3,4, Melody Smith3,4, Ying Taur5, Rene Niehus6, Anna Staffas7, Anqi Dai3, Emily Fontana5, Luigi A. Amoretti5, Roberta J. Wright5, Sejal Morjaria5, Maly Fenelus8, Melissa S. Pessin8, Nelson J. Chao9, Meagan Lew9, Lauren Bohannon9, Amy Bush9, Anthony D. Sung9, Tobias M. Hohl5, Miguel-Angel Perales3,4, Marcel R. M. van den Brink3,4 & Joao B. Xavier2 ✉

Received: 3 May 2019
Accepted: 30 September 2020
Published online: xx xx xxxx


The gut microbiota influences development1–3 and homeostasis4–7 of the mammalian
immune system, and is associated with human inflammatory8 and immune diseases9,10
as well as responses to immunotherapy11–14. Nevertheless, our understanding of how
gut bacteria modulate the immune system remains limited, particularly in humans,
where the difficulty of direct experimentation makes inference challenging. Here we
study hundreds of hospitalized—and closely monitored—patients with cancer
receiving haematopoietic cell transplantation as they recover from chemotherapy
and stem-cell engraftment. This aggressive treatment causes large shifts in both
circulatory immune cell and microbiota populations, enabling the relationships
between the two to be studied simultaneously. Analysis of observed daily changes in
circulating neutrophil, lymphocyte and monocyte counts and more than 10,000
longitudinal microbiota samples revealed consistent associations between gut
bacteria and immune cell dynamics. High-resolution clinical metadata and Bayesian
inference allowed us to compare the effects of bacterial genera in relation to those of
immunomodulatory medications, revealing a considerable influence of the gut
microbiota—together and over time—on systemic immune cell dynamics. Our analysis
establishes and quantifies the link between the gut microbiota and the human
immune system, with implications for microbiota-driven modulation of immunity.

The human gut microbiota is considered a major modulator of the immune system during development3 and in health and disease8,9. For example, preterm infants have distinct microbiome compositions and distinct developmental trajectories of peripheral immune cell populations3. In adults, the success of immunotherapies that rely on peripheral immune cells, such as checkpoint inhibitor treatments, has been associated with the composition of the microbiome11–13,15. There is an increasing interest in using the microbiome to modulate the immune system and augment treatments7,16, including the growing field of chimeric antigen receptor T cell therapy17. However, our understanding of how the microbiota influences the dynamics of immune cells in humans and how this compares to deliberate immunomodulatory interventions remains limited owing to a lack of feasible experiments in human subjects.

To overcome this limitation, we investigated whether the gut microbiota could influence day-by-day changes in peripheral immune cell counts. We collected a vast dataset of immune-reconstitution dynamics after allogeneic haematopoietic cell transplantation (HCT) from individuals treated at Memorial Sloan Kettering Cancer Centre (MSK) between 2003 and 2019 (Fig. 1a, Supplementary Table 1). The conditioning regimen of radiation and chemotherapy administered to patients before HCT is the most severe perturbation to the immune system deliberately performed in humans: this offers a unique opportunity to investigate links between the gut microbiota and immune dynamics directly in humans.

Conditioning depletes white blood cell (WBC) counts, leading to neutropenia (less than 500 neutrophils per μl blood) until transplanted stem cells begin to release granulocytes from the bone marrow, initiating immune reconstitution (Fig. 1a–c). HCT also damages the gut microbiota18 and reduces its biodiversity (Fig. 1d–i), a collateral effect associated with increased mortality in patients undergoing HCT19. Immune and microbiome reconstitution vary considerably between patients and treatment types (Fig. 1, Extended Data Fig. 1a), enabling analyses of associations between microbiome and immune system, and their comparison with immunomodulators such as granulocyte-colony stimulating factor (GCSF).

To detect a directional and causal link between the microbiota and circulatory WBCs, we first used data from a randomized trial of autologous

1Institute for Computational Medicine, NYU Langone Health, New York, NY, USA. 2Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA. 3Adult Bone Marrow Transplantation Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA. 4Weill Cornell Medical College, New York, NY, USA. 5Infectious Disease Service, Department of Medicine, and Immunology Program, Sloan Kettering Institute, New York, NY, USA. 6Harvard University, T. H. Chan School of Public Health, Boston, MA, USA. 7Sahlgrenska Cancer Center, Department of Microbiology and Immunology, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden. 8Department of Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA. 9Division of Hematologic Malignancies and Cellular Therapy, Duke University School of Medicine, Durham, NC, USA. ✉e-mail: jonas.schluter@nyulangone.org; xavierj@mskcc.org

Nature | www.nature.com | 1
[Figure 1 panels. a–c: neutrophil, lymphocyte and monocyte counts (×1,000 μl–1) against time after HCT (d, 0 to 56), with HCT phases I–III shown in a and GCSF administration and neutrophil engraftment marked in b and c; a, mean (n = 2,335); b, patient 1 (PBSC); c, patient 2 (cord). d–f: diversity against time after HCT (d); d, mean (n = 1,294). g–i: relative abundance against time after HCT (d); g, mean (n = 1,294). Legend (bacterial families): Akkermansiaceae, Actinomycetaceae, Bacteroidaceae, Bifidobacteriaceae, Clostridiaceae 1, Enterobacteriaceae, Enterococcaceae, Erysipelotrichaceae, Lachnospiraceae, Lactobacillaceae, Peptostreptococcaceae, Ruminococcaceae, Staphylococcaceae, Streptococcaceae, Veillonellaceae, Other families.]
Fig. 1 | Immune reconstitution and microbiome dynamics after HCT. a–c, Major phases of HCT: immunoablation during conditioning before HCT on day 0 (I) is followed by post-HCT neutropenia (II) and reconstitution (III). Daily mean counts (shaded area indicates s.d.) of neutrophils, lymphocytes and monocytes from individuals receiving transplants between 2003 and 2019 (a), compared with those from individuals (b, c) representative of the recovery trajectories for different stem-cell graft sources. Patient 1 received a PBSC graft and patient 2 received umbilical-cord blood. Line with circles shows data from the patient; solid line and shaded region show mean ± s.d. for all patients receiving transplants from the same source. In a, coloured bars on the left indicate the range of cell counts in healthy individuals. d–f, Loss of microbial diversity during HCT, measured by 16S rRNA gene sequencing of faecal samples, supporting previous smaller studies23,24. In d, the line shows daily mean across patients, shaded area shows s.d. e, f, Data from individual patients. g–i, Relative abundance of commensal bacteria families. g, Mean (± s.d.) relative abundance across all patients. h, i, Relative abundance in individual patients.
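The diversity plotted in Fig. 1d–f is derived from per-sample taxon counts. The caption does not say which index was used, so the sketch below assumes the inverse Simpson index purely for illustration; the function name `inverse_simpson` and the example samples are hypothetical and are not taken from the study.

```python
from typing import Sequence


def inverse_simpson(counts: Sequence[float]) -> float:
    """Inverse Simpson index, 1 / sum(p_i ** 2), for a single sample.

    `counts` holds per-taxon read counts (or relative abundances); they are
    normalised to proportions first. Using this particular index is an
    assumption: the figure caption only says "diversity".
    """
    total = float(sum(counts))
    if total <= 0:
        raise ValueError("sample has no reads")
    # Taxa with zero reads contribute nothing to the sum of squares.
    proportions = [c / total for c in counts if c > 0]
    return 1.0 / sum(p * p for p in proportions)


# Hypothetical samples: an even community and one dominated by a single taxon.
even_sample = [25, 25, 25, 25]
dominated_sample = [97, 1, 1, 1]
print("even:", inverse_simpson(even_sample))
print("dominated:", inverse_simpson(dominated_sample))
```

An even community of four taxa scores 4, whereas a sample dominated by one taxon scores close to 1, which is the kind of drop the figure shows after HCT.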

faecal microbiota transplantation (auto-FMT), a controlled microbiota manipulation experiment performed directly on our patients20 (Extended Data Fig. 2a). To investigate whether auto-FMT affected WBC reconstitution, we compared the neutrophil, lymphocyte and monocyte counts after neutrophil engraftment in 24 individuals (engraftment defined as 3 consecutive days with over 500 neutrophils per μl). FMTs were conducted at variable dates relative to neutrophil engraftment (Fig. 2a, Supplementary Table 2). Overall, we observed higher counts of each WBC type in individuals who received an auto-FMT during the first 100 days after neutrophil engraftment (P < 0.001, Fig. 2b, c; total WBCs, Extended Data Fig. 2b–g).

The higher WBC counts in individuals receiving auto-FMT could result from the successful reconstitution of a complex microbiota20 and associated metabolic capabilities21, or they could be a systemic response to a severe therapy that introduced billions of intestinal organisms at once via an enema (no enema was administered to controls20). Moreover, chance differences in extrinsic factors such as different immunomodulator medications may have affected this result owing to the small cohort size. Nonetheless, the results supported the notion that the microbiota can modulate the peripheral immune system. High lymphocyte counts during immune reconstitution have been associated with improved clinical outcomes22, and 3-year survival was positively associated with higher mean levels of WBCs during the 100 days after neutrophil engraftment in the individuals receiving HCT (hazard ratio = 0.91, P = 0.04). Identifying the taxa that modulate immune dynamics could therefore open new ways to improve immune reconstitution, which is critical for clinical outcomes.

To investigate the links between the gut microbiota and the dynamics of WBC recovery, we turned to our large observational cohort of patients receiving HCT. Homeostasis of circulatory WBC counts is a complex, dynamic process: WBCs are formed and released into the blood de novo by differentiation of haematopoietic progenitor cells from the bone marrow, and can be mobilized from thymus and lymph nodes (lymphocytes), and spleen, liver and lungs (neutrophils); WBCs can also migrate from the blood to other tissues when needed23. To identify modulators of these dynamic processes, we developed a two-stage approach analysing the changes of WBC counts between two days (Fig. 3a). Stage 1 served as a clinical- and metadata-feature-selection stage using blood and medication data of 1,096 patients without available microbiome information (Extended Data Fig. 1b shows data inclusion). Stage 2 was performed on data from an independent cohort of 841 different patients from whom concurrent microbiome samples were available to detect associations between microbiome and peripheral immune cell dynamics.

In stage 1, we analysed the changes in neutrophils, lymphocytes and monocytes during recovery from more than 20,000 pairs of post-engraftment blood samples separated by a single day (Fig. 3b). A cross-validated feature-selection approach detected medications and HCT parameters associated with WBC dynamics (Extended Data Fig. 3a–c, Supplementary Table 3). In stage 2, we sought to identify the additional contribution of the gut microbiome. We performed Bayesian inferences using data from different sets of patients with available microbiome samples (Supplementary Table 4). Stage 1 had identified, as expected, that the sources of stem-cell grafts are associated with immune reconstitution kinetics (for example, umbilical-cord blood is associated with slower kinetics than peripheral blood24 (peripheral blood stem cells, PBSCs)), and we therefore stratified our patients by graft source in stage 2. The dynamic systems model of stage 2 thus included bacterial genera as predictors of daily changes in WBC counts, in addition to the medications selected in stage 1, clinical features (conditioning intensity, age and sex), and the current state of the blood in the form of counts of neutrophils, lymphocytes, monocytes,
eosinophils and platelets (Fig. 3a). The dataset comprised 841 individuals, but approximately 60% of the stool samples paired with a daily change in WBC counts were taken before neutrophil engraftment, that is, when WBC counts were zero. Nevertheless, more than 2,000 post-engraftment observations of WBC changes during immune reconstitution provided a large sample for analysis of dynamics (Fig. 3b). Stage 2 focused on data from the largest (Fig. 3c) cohort: PBSC graft recipients. We withheld the other cohorts (bone marrow (BM); T cell depleted ex vivo by CD34+ selection grafts (TCD); and umbilical cord (cord)) for validation scoring, and included data from patients treated at Duke University for additional validation (Supplementary Table 5).

Notably, as a verification of our approach, we detected associations between immunomodulator administrations and consequent immune cell dynamics that were consistent with known biological mechanisms (Fig. 3c, Extended Data Fig. 4a–f). The strongest association across all predictors is the well-known neutrophil-increasing effect of GCSF25; GCSF administration, used to accelerate recovery from chemotherapy-induced neutropenia25, was associated with a +140% increase in the rate of neutrophil changes from one day to the next (95% highest posterior density interval (HPDI95) (+114%, +170%)). This finding was observed in all MSK (v-score = 3, Fig. 3d) and Duke validation datasets (Extended Data Fig. 4g–j). Furthermore, we found a GCSF-associated increase of +43% (HPDI95 (+30%, +58%), v-score = 3) in monocyte rates and a smaller increase in lymphocyte rates (+16%, HPDI95 (+5%, +27%), v-score = 3). Neutrophil and lymphocyte rates decreased following antihistamine or immunosuppressive medication (cetirizine, −18%, HPDI95 (−35%, +5%); mycophenolate mofetil, −8%, HPDI95 (−15%, +1%)). Finally, less intensive chemotherapeutic conditioning regimens (non-ablative and reduced intensity) were associated with increased lymphocyte and monocyte rates (Extended Data Fig. 4c).

Beyond the mechanistically plausible associations of medications, our analysis detected associations between the current count of WBCs and their rates of change: negative associations of neutrophil and lymphocyte counts with the rates of monocytes, and negative associations of platelet and lymphocyte counts with the rates of lymphocytes and neutrophils (Fig. 3e). Conversely, we found positive associations between monocytes and the rates of each of the investigated WBC subsets. These associations, derived from daily counts of WBCs, could reflect a complex network underlying the regulation of blood immune cell composition23. More importantly, the associations quantified for these potential homeostatic feedbacks and medications provided a benchmark against which we could compare gut microbial taxa.

We identified bacterial genera that consistently associated with WBC dynamics (Fig. 3f). Higher relative abundances of Faecalibacterium (+8%, HPDI95 (+1%, +14%) per unit log10 difference), Ruminococcus 2 (+5%, HPDI95 (0%, +10%) per log10 difference) and Akkermansia (+4%, HPDI95 (+1%, +7%) per log10 difference) were associated with increased neutrophil rates, whereas Rothia (−3%, HPDI95 (−7%, 0%) per log10 difference) and Clostridium sensu stricto 1 (−3%, HPDI95 (−6%, 0%) per log10 difference) associated with reduced neutrophil rates. These results were validated in univariate analyses of the Duke cohort (Extended Data Fig. 4g–i). We also used total bacterial abundances as predictors instead of relative abundances; this confirmed Faecalibacterium as most strongly associated with neutrophil dynamics (Extended Data Fig. 5). Ruminococcus 2 (+5%, HPDI95 (+1%, +9%) per log10 difference) and Staphylococcus (+4%, HPDI95 (+1%, +6%) per log10 difference) were positively associated with lymphocyte rates. Faecalibacterium and Ruminococcus 2 also associated with increases in monocyte rates; this association was validated in other cohorts (v-score 3 and 1, respectively), but there was higher uncertainty of the association estimate (HPDI50 > 0). Clostridium sensu stricto 1 (−3%, HPDI95 (−5%, −1%) per log10 difference) associated with decreased rates of monocytes. The associations we identified, and validated in other cohorts, between gut microbial taxa and daily changes in WBCs support the idea that haematopoiesis and mobilization respond to the composition of the gut microbiome, influencing systemic immunity26.

Intestinal bacteria may affect circulatory WBC counts by influencing either their sources in the bone marrow (or their cytokine profiles27 and proliferation rates in the blood), their sinks in different organs, or both. The immune system in turn can interact with the microbiota and modulate its composition, for example, via immunoglobulin A, as studied in mice28–30. To investigate the effect of the immune system on bacterial populations, we used an analogous approach to the stage 1 analysis. Dynamics of WBCs could be estimated from changes in absolute cell counts, and to obtain the necessary absolute bacterial counts, we measured total bacterial 16S rRNA gene copies per gram of stool for a subset of our samples (3,995 samples from 481 subjects) to jointly infer the bidirectional association network between microbiota and the peripheral immune system dynamics. All of our subjects received antibiotics at some point during their treatment18, and their strong influences on microbiota dynamics were the dominant effects that survived feature selection (Extended Data Fig. 6). However, relaxing the regularization strength (Methods) revealed several bidirectional relationships between WBCs and gut bacterial dynamics (Extended Data Fig. 7). Of note, we detected a negative association of absolute [Ruminococcus] gnavus group abundance with lymphocyte rates, confirming our main result based on relative bacterial abundances (Fig. 3f). In the reverse

[Figure 2 panels. a: neutrophils, lymphocytes and monocytes (×1,000 μl–1) against days relative to neutrophil engraftment (0 to 100), shown separately for control and FMT-treated individuals. b: weekly mean neutrophil, lymphocyte and monocyte counts (×1,000 μl–1) against weeks relative to FMT randomization date (0 to 10), for FMT and control. c: post-FMT and intercept coefficient estimates (×1,000 μl–1) for neutrophils, lymphocytes and monocytes.]

Fig. 2 | Neutrophil, lymphocyte and monocyte counts increased in FMT-treated individuals in the weeks following treatment. a, Absolute counts of neutrophils, lymphocytes and monocytes in 10 control and 14 FMT-treated individuals after neutrophil engraftment. Vertical lines indicate randomization dates. b, Weekly mean cell counts aligned to the randomization date. Line shows weekly mean, shaded region shows 95% CI. c, Coefficient estimates from linear mixed-effects models of neutrophils, lymphocytes and monocytes over time indicate an increase of each WBC type induced by auto-FMT (corresponding coefficient, βFMT: P = 4 × 10⁻¹¹ (neutrophils), P = 2 × 10⁻¹⁰ (lymphocytes) and P = 2 × 10⁻¹⁶ (monocytes); full regression results are presented in the Supplementary Information). In b, c, N = 24 subjects, n = 921 blood samples.
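The intervals reported above as HPDI95 are highest posterior density intervals: the narrowest interval that contains 95% of the posterior probability. As a rough illustration only, the sketch below estimates such an interval from a plain list of posterior draws with a sliding window over the sorted values. The function names `hpdi` and `prob_positive` and the simulated draws are hypothetical; this is a generic textbook procedure and not the authors' inference code.

```python
import math
import random
from typing import Sequence, Tuple


def hpdi(samples: Sequence[float], mass: float = 0.95) -> Tuple[float, float]:
    """Narrowest interval that contains a fraction `mass` of the samples."""
    if not 0 < mass <= 1:
        raise ValueError("mass must be in (0, 1]")
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("no samples")
    k = max(1, math.ceil(mass * n))  # number of draws inside the interval
    best_lo, best_hi = ordered[0], ordered[k - 1]
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    for i in range(1, n - k + 1):
        lo, hi = ordered[i], ordered[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


def prob_positive(samples: Sequence[float]) -> float:
    """Fraction of posterior draws greater than zero."""
    return sum(1 for s in samples if s > 0) / len(samples)


# Hypothetical posterior draws of one coefficient (not data from the study).
random.seed(0)
draws = [random.gauss(0.08, 0.03) for _ in range(4000)]
low, high = hpdi(draws, 0.95)
print(f"HPDI95: ({low:.3f}, {high:.3f}); P(coef > 0) = {prob_positive(draws):.3f}")
```

For a symmetric, unimodal posterior the result is close to the central 95% interval; for skewed posteriors the two can differ, which is one reason to report the highest-density version.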

direction, we saw a positive association of lymphocyte counts with [R.] gnavus group growth rates. Ruminococcus gnavus is associated with inflammatory bowel diseases31 and autoimmune disorders10; our analysis suggests that it may drive high neutrophil-to-lymphocyte ratios that are broadly characteristic of poor disease outcomes in inflammatory bowel diseases32 and other conditions33,34.

Several of the bacterial taxa positively associated with WBC rates were obligate anaerobes, some of which produce cell-wall molecules1,35 and short-chain fatty acids36 that modulate immune responses and granulopoiesis37. Ruminococcus 2, for example, contains keystone species that release nutrients from complex dietary starch38, and such nutritional support from the microbiota improved haematopoietic reconstitution in mice21. To identify a similar association in our patients, we estimated the microbiota reconstitution potency of each sample (Methods). Shotgun metagenomic sequences from 124 of our samples revealed that samples with positive microbiota potency were enriched in cholate degradation, vitamin B1 synthesis and butanoate formation pathways (Extended Data Fig. 8). In line with evolutionary theory39, our results suggest that such broadly available microbial traits may be co-opted by the host as part of the homeostatic interplay between immune system and microbiota. The genera Faecalibacterium, Ruminococcus 2 and Akkermansia that we associated with increased WBC rates were also among those best reconstituted by auto-FMT20, potentially explaining the higher counts of neutrophils, monocytes and lymphocytes in auto-FMT-treated individuals.

Our analyses show that the gut microbiome is associated with immune cell dynamics in humans. The inferred associations should be interpreted as net effects, since they do not, for example, distinguish the effect of the microbiota on de novo haematopoiesis from its effect on other sources and sinks. Unlike the plausible role of obligate anaerobe fermenters in augmenting haematopoiesis via nutritional support21, the positive association detected between Staphylococcus and lymphocyte dynamics could instead result from reduced extravasation of T cells from circulation into the gut epithelium40, especially since high abundances of Staphylococcus are associated with low gut microbiota diversity (P < 0.001, Extended Data Fig. 9a), which indicates a depleted microbiota.

Nevertheless, our approach enables us to leverage the chronology of events and assess ‘mathematical causality’41. Owing to the observational nature of these data, there are risks of confounding, for example, from undetected infections or dietary components, which could explain some of the associations, but the close temporal correspondence41 between microbiota and WBC dynamics reduces the number of plausible confounders. Notwithstanding potential confounders, our results suggest candidate bacterial taxa that might improve immune reconstitution, and focused follow-up studies are required to evaluate their immunomodulatory efficacy. Members of Faecalibacterium, Ruminococcus12 in one study, and Akkermansia11 in another have been associated with better responses to anti–PD-1 immunotherapy, which suggested a disagreement regarding involved taxa42. Our results, however, identified Faecalibacterium, Ruminococcus 2 and Akkermansia as the taxa with the strongest associations with immune cell dynamics, agreeing with the findings of both these previous studies that these taxa are associated with human immune modulation. Furthermore, our work enables us to directly compare their inferred effect sizes with the effects of immunomodulatory drugs. These genera are common in

[Figure 3 panels. a: model cartoon in which medications and treatments, host state and microbiome are associated with the daily change log(Wt+1) – log(Wt), where W is neutrophils, lymphocytes or monocytes. b: WBC dynamic data (Δt = 1 day), PC 1 (EV: 48%) against PC 2 (EV: 23%), with microbiota samples highlighted. c: number of samples and number of recipients by graft type (PBSC, BM, TCD, cord), with validation cohorts marked. d: posterior effects of GCSF, MM and cetirizine on ΔNeutrophils, ΔLymphocytes and ΔMonocytes, with v-scores. e: posterior effects of monocyte, eosinophil, neutrophil, lymphocyte and platelet counts on ΔNeutrophils, ΔLymphocytes and ΔMonocytes, with v-scores. f: posterior effects (mean, 50% and 95% HPDI; colour for more than 95% probability below or above zero; v-score 0 to 3) of Faecalibacterium, Ruminococcus 2, Akkermansia, Unid. Lactobacillales 635, Veillonella, Bacteroides, [Clostridium] innocuum group, Staphylococcus, Parabacteroides, [R.] gnavus group, Clostridium sensu stricto 1, Rothia and Faecalitalea on ΔNeutrophils, ΔLymphocytes and ΔMonocytes. g: relative abundance of Faecalibacterium, Ruminococcus 2 and Akkermansia against sample index for the 100 highest and 100 lowest samples. h: simulated neutrophil count (×1,000 μl–1) against simulated days, with GCSF. i: probability (%) against time to 2,000 neutrophils per μl (d), without GCSF.]

Fig. 3 | Bayesian inference reveals associations between the microbiota and dynamics of circulatory WBC counts. a, Cartoon of the model: observed changes in WBC counts between two consecutive days are associated with the current state of the host in the form of blood cell counts in circulation, immunomodulatory medications, clinical metadata and the state of the microbiome. b, Visualization of the WBC dynamics data. Scatter plot of the principal components (PC) of observed daily changes of neutrophils, lymphocytes and monocytes without (grey; n = 20,751 (after neutrophil engraftment), 81,253 (total)) and with (orange; n = 2,615 (after neutrophil engraftment), 6,297 (total)) available concurrent microbiota samples. EV, explained variance. c, Recipients of PBSCs (N = 312) provided most paired blood dynamics and microbiota samples (n = 995). Datasets from recipients of stem cells from TCD, bone marrow (BM) and umbilical cord (cord) grafts were used for validation. d–f, Bayesian inference results from PBSC graft recipient data. d, Posterior coefficient distributions of associations between treatments (colour shows more than 95% posterior density (PD) of coefficient estimates greater than zero (red) or less than zero (blue)). MM, mycophenolate mofetil. e, WBC counts. f, Relative abundances of microbial genera and daily changes (Δ) in neutrophils, lymphocytes and monocytes. The v-score is the number of validation cohorts confirming associations; it is set to zero if invalidated by validation cohorts (additional coefficients in Extended Data Fig. 4a–c). Unid., unidentified. g, One hundred microbiota samples with highest (left) or lowest (right) relative abundance of Faecalibacterium, Ruminococcus 2 and Akkermansia. h, Simulation of neutrophil dynamics in the presence of GCSF and microbiota compositions sampled from those with high (blue) or low (red) relative abundance of Faecalibacterium, Ruminococcus 2 and Akkermansia as shown in g. Lines show medians of 1,000 simulations and shaded regions show the interquartile range of simulated trajectories. i, Time until neutrophil counts first reach a density of 2,000 cells per μl in equivalent simulations without GCSF.
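The outcome in the model of Fig. 3a is the change in log count between two consecutive days, log(Wt+1) – log(Wt). The sketch below shows one way to build those pairs from a per-patient series and how a constant daily log-rate compounds over several days. It makes several assumptions that the text does not state: natural logarithms, skipping any pair in which either count is zero, and hypothetical helper names (`daily_log_changes`, `project_count`) and example values. It is not the authors' pipeline or their simulation.

```python
import math
from datetime import date, timedelta
from typing import Dict, List, Tuple


def daily_log_changes(counts: Dict[date, float]) -> List[Tuple[date, float]]:
    """Return (day, log(W[day + 1]) - log(W[day])) for consecutive-day pairs.

    Pairs are skipped when the next calendar day is missing or when either
    count is not positive (the log is undefined there); skipping is an
    assumption made for this sketch.
    """
    changes: List[Tuple[date, float]] = []
    for day in sorted(counts):
        next_day = day + timedelta(days=1)
        if next_day not in counts:
            continue
        today, tomorrow = counts[day], counts[next_day]
        if today > 0 and tomorrow > 0:
            changes.append((day, math.log(tomorrow) - math.log(today)))
    return changes


def project_count(start_count: float, daily_log_rate: float, days: int) -> float:
    """Count after `days` days if the daily log-change stayed constant."""
    return start_count * math.exp(daily_log_rate * days)


# Hypothetical neutrophil counts (x1,000 per microlitre) with a missing day.
neutrophils = {
    date(2020, 1, 1): 0.5,
    date(2020, 1, 2): 1.0,
    date(2020, 1, 4): 2.0,
}
for day, change in daily_log_changes(neutrophils):
    print(day.isoformat(), round(change, 3))

# A small difference in daily log-rate accumulates over a week.
print(project_count(1.0, 0.10, 7), project_count(1.0, 0.15, 7))
```

Because the outcome is a difference of logs, a predictor that shifts it by a fixed amount every day changes the count multiplicatively, so modest per-day differences can add up over a recovery period.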

healthy people43, but their abundance can fall below the detection limit in patients after HCT18. Realistic ranges of 3–5 orders of magnitude in bacterial relative abundances (Fig. 3g, Extended Data Fig. 9b, c) could yield effect sizes similar to the homeostatic feedbacks inferred between WBCs and immunomodulatory medications (for example, a change in Ruminococcus 2 from below the detection limit to 1% relative abundance was associated with a +67% change in neutrophils and a +63% change in lymphocytes). The effect sizes of gut bacteria may initially appear small relative to those of immunomodulatory drugs, but their effect could be considerable, as they refer to changes in exponential rates of WBCs and would therefore accumulate while those bacteria remain abundant. To demonstrate this accumulation over time, we simulated WBC dynamics using our posterior coefficient distributions (Methods). We simulated 1,000 time series for microbiota compositions chosen from the 100 samples highest or lowest in Faecalibacterium, Ruminococcus 2 and Akkermansia (Fig. 3g), in the presence (Fig. 3h) or absence (Fig. 3i) of GCSF administration. Simulations predict that microbiota enriched in these genera accelerate immune reconstitution, and reduce the time until neutrophils reach a level of more than 2,000 μl−1 in the absence of GCSF by 2.4 days, from a predicted 6.8 days (95% confidence interval (CI) (6.5, 7)) to 4.4 days (95% CI (4.3, 4.5)). Gut bacteria, together and over time, could therefore influence steady-state immune homeostasis considerably, even in individuals with less severely injured microbiomes.

In sum, our work links the human gut microbiota to the dynamics of the immune system via peripheral WBC dynamics. Our analysis uses WBCs counted directly from human subjects, which is a coarse-grained clinical analysis conducted at large scale, but it lacks details such as the subtypes of lymphocytes and other immune cells. Because our study is in humans, it fills an important gap at a critical time for microbiome research, when the clinical relevance of animal models of microbiome-immune interaction has been questioned44. By studying a large number of patients over time, we could infer and quantify the association between gut bacteria and systemic immune cell dynamics, and our results help to consolidate previous apparently contradictory findings11,12,42. Our demonstration that the microbiota influences systemic immunity in humans opens the door towards an exploration of potential microbiota-targeted interventions to improve immunotherapy and treatments for immune-mediated and inflammatory diseases8,10–12.

Online content

Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code

9. Markey, K. A. et al. The microbe-derived short-chain fatty acids butyrate and propionate are associated with protection from chronic GVHD. Blood 136, 130–136 (2020).
10. Azzouz, D. et al. Lupus nephritis is linked to disease-activity associated expansions and immunity to a gut commensal. Ann. Rheum. Dis. 78, 947–956 (2019).
11. Routy, B. et al. Gut microbiome influences efficacy of PD-1-based immunotherapy against epithelial tumors. Science 359, 91–97 (2018).
12. Gopalakrishnan, V. et al. Gut microbiome modulates response to anti-PD-1 immunotherapy in melanoma patients. Science 359, 97–103 (2018).
13. Vétizou, M. et al. Anticancer immunotherapy by CTLA-4 blockade relies on the gut microbiota. Science 350, 1079–1084 (2015).
14. Matson, V. et al. The commensal microbiome is associated with anti-PD-1 efficacy in metastatic melanoma patients. Science 359, 104–108 (2018).
15. Tanoue, T. et al. A defined commensal consortium elicits CD8 T cells and anti-cancer immunity. Nature 565, 600–605 (2019).
16. Brandi, G. & Frega, G. Microbiota: overview and implication in immunotherapy-based cancer treatments. Int. J. Mol. Sci. 20, 2699 (2019).
17. Xin Yu, J., Hubbard-Lucey, V. M. & Tang, J. The global pipeline of cell therapies for cancer. Nat. Rev. Drug Discov. 18, 821–822 (2019).
18. Morjaria, S. et al. Antibiotic-induced shifts in fecal microbiota density and composition during hematopoietic stem cell transplantation. Infect. Immun. 87, e00206-19 (2019).
19. Peled, J. U. et al. Microbiota as predictor of mortality in allogeneic hematopoietic-cell transplantation. N. Engl. J. Med. 382, 822–834 (2020).
20. Taur, Y. et al. Reconstitution of the gut microbiota of antibiotic-treated patients by autologous fecal microbiota transplant. Sci. Transl. Med. 10, eaap9489 (2018).
21. Staffas, A. et al. Nutritional support from the intestinal microbiota improves hematopoietic reconstitution after bone marrow transplantation in mice. Cell Host Microbe 23, 447–457 (2018).
22. Savani, B. N. et al. Absolute lymphocyte count on day 30 is a surrogate for robust hematopoietic recovery and strongly predicts outcome after T cell-depleted allogeneic stem cell transplantation. Biol. Blood Marrow Transplant. 13, 1216–1223 (2007).
23. Scheiermann, C., Frenette, P. S. & Hidalgo, A. Regulation of leucocyte homeostasis in the circulation. Cardiovasc. Res. 107, 340–351 (2015).
24. Thompson, P. A. et al. Umbilical cord blood graft engineering: challenges and opportunities. Bone Marrow Transplant. 50 (Suppl 2), S55–S62 (2015).
25. Gabrilove, J. L. et al. Effect of granulocyte colony-stimulating factor on neutropenia and associated morbidity due to chemotherapy for transitional-cell carcinoma of the urothelium. N. Engl. J. Med. 318, 1414–1422 (1988).
26. Belkaid, Y. & Hand, T. W. Role of the microbiota in immunity and inflammation. Cell 157, 121–141 (2014).
27. Schirmer, M. et al. Linking the human gut microbiome to inflammatory cytokine production capacity. Cell 167, 1125–1136 (2016).
28. McLoughlin, K., Schluter, J., Rakoff-Nahoum, S., Smith, A. L. & Foster, K. R. Host selection of microbiota via differential adhesion. Cell Host Microbe 19, 550–559 (2016).
29. Hooper, L. V., Littman, D. R. & Macpherson, A. J. Interactions between the microbiota and the immune system. Science 336, 1268–1273 (2012).
30. Palm, N. W. et al. Immunoglobulin A coating identifies colitogenic bacteria in inflammatory bowel disease. Cell 158, 1000–1010 (2014).
31. Henke, M. T. et al. Ruminococcus gnavus, a member of the human gut microbiome associated with Crohn’s disease, produces an inflammatory polysaccharide. Proc. Natl Acad. Sci. USA 116, 12672–12677 (2019).
32. Okba, A. M. et al. Neutrophil/lymphocyte ratio and lymphocyte/monocyte ratio in ulcerative colitis as non-invasive biomarkers of disease activity and severity. Auto Immun. Highlights 10, 4 (2019).
33. Choi, S.-J. et al. High neutrophil-to-lymphocyte ratio predicts short survival duration in amyotrophic lateral sclerosis. Sci. Rep. 10, 428 (2020).
34. Gao, Y. et al. Neutrophil/lymphocyte ratio is a more sensitive systemic inflammatory response biomarker than platelet/lymphocyte ratio in the prognosis evaluation of unresectable pancreatic cancer. Oncotarget 8, 88835–88844 (2017).
35. Hergott, C. B. et al. Peptidoglycan from the gut microbiota governs the lifespan of circulating phagocytes at homeostasis. Blood 127, 2460–2471 (2016).
36. Smith, P. M. et al. The microbial metabolites, short-chain fatty acids, regulate colonic Treg cell homeostasis. Science 341, 569–573 (2013).
37. Balmer, M. L. et al. Microbiota-derived compounds drive steady-state granulopoiesis via
availability are available at https://doi.org/10.1038/s41586-020-2971-8. MyD88/TICAM signaling. J. Immunol. 193, 5273–5283 (2014).
38. Ze, X., Duncan, S. H., Louis, P. & Flint, H. J. Ruminococcus bromii is a keystone species for
the degradation of resistant starch in the human colon. ISME J. 6, 1535–1543 (2012).
1. Mazmanian, S. K., Liu, C. H., Tzianabos, A. O. & Kasper, D. L. An immunomodulatory 39. Foster, K. R., Schluter, J., Coyte, K. Z. & Rakoff-Nahoum, S. The evolution of the host
molecule of symbiotic bacteria directs maturation of the host immune system. Cell 122, microbiome as an ecosystem on a leash. Nature 548, 43–51 (2017).
107–118 (2005). 40. Fu, Y.-Y. et al. T cell recruitment to the intestinal stem cell compartment drives
2. Gomez de Agüero, M. et al. The maternal microbiota drives early postnatal innate immune-mediated intestinal damage after allogeneic transplantation. Immunity 51,
immune development. Science 351, 1296–1302 (2016). 90–103 (2019).
3. Olin, A. et al. Stereotypic immune system development in newborn children. Cell 174, 41. Gerber, G. K. The dynamic microbiome. FEBS Lett. 588, 4131–4139 (2014).
1277–1292 (2018). 42. Jobin, C. Precision medicine using microbiota. Science 359, 32–34 (2018).
4. Tan, T. G. et al. Identifying species of symbiont bacteria from the human gut that, alone, 43. The Integrative HMP (iHMP) Research Network Consortium. The integrative human
can induce intestinal Th17 cells in mice. Proc. Natl Acad. Sci. USA 113, E8141–E8150 (2016). microbiome project. Nature 569, 641–648 (2019).
5. Deshmukh, H. S. et al. The microbiota regulates neutrophil homeostasis and host 44. Walter, J., Armet, A. M., Finlay, B. B. & Shanahan, F. Establishing or exaggerating causality
resistance to Escherichia coli K1 sepsis in neonatal mice. Nat. Med. 20, 524–530 (2014). for the gut microbiome: lessons from human microbiota-associated rodents. Cell 180,
6. Ivanov, I. I. et al. Specific microbiota direct the differentiation of IL-17-producing T-helper 221–232 (2020).
cells in the mucosa of the small intestine. Cell Host Microbe 4, 337–349 (2008).
7. Geva-Zatorsky, N. et al. Mining the human gut microbiota for immunomodulatory Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in
organisms. Cell 168, 928–943 (2017). published maps and institutional affiliations.
8. Lloyd-Price, J. et al. Multi-omics of the gut microbial ecosystem in inflammatory bowel
diseases. Nature 569, 655–662 (2019). © The Author(s), under exclusive licence to Springer Nature Limited 2020

Methods

No statistical methods were used to predetermine sample size. The experiments were not randomized, except for the auto-FMT trial as explained in NCT02269150. The investigators were not blinded to allocation during experiments and outcome assessment.

Ethics approval and informed consent
The participants in the auto-FMT trial (NCT02269150) provided written informed consent to participate in the trial (#14-025). Participants in the observational cohorts at both MSK and at Duke provided written informed consent for the use of their faecal specimens and clinical data. The use and analysis of these specimens for the work herein was approved by Institutional Research Boards at both institutions: MSK (#16-834) and Duke (PRO0006268 and Pro00050975).

(Invitrogen) containing 1 copy of the 16S rRNA gene. Cycling conditions were 95 °C for 10 min followed by 40 cycles of 95 °C for 30 s, 52 °C for 30 s and 72 °C for 1 min. We used the measurements of total 16S rRNA gene counts per gram of stool to multiply the relative abundances of taxa obtained from 16S amplicon sequencing to obtain the estimate of their total abundance per gram of stool (supplementary information). Of note, this does not account for 16S copy-number variation between taxa, but the observed dynamic ranges in total abundances of taxa in our dataset span up to nine orders of magnitude, exceeding the potential inaccuracies due to copy-number variation.

Diversity calculations
Microbiome alpha-diversity was measured by the inverse Simpson (IS) index of a sample. It was calculated by $\mathrm{IS}_i = 1 / \sum_{j=1}^{N} p_{ij}^2$, where $p_{ij}$ is the relative abundance of the jth ASV out of N total ASVs in sample i.
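As a minimal sketch of the inverse Simpson calculation described in the Diversity calculations paragraph, the following Python function computes IS for a single sample from raw ASV counts. The function name and the example counts are illustrative assumptions and are not taken from the study's code.

```python
def inverse_simpson(counts):
    """Inverse Simpson index IS = 1 / sum_j p_j**2 for one sample of ASV counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    # Relative abundance p_j of each ASV; ASVs with zero counts contribute nothing.
    proportions = [c / total for c in counts]
    return 1.0 / sum(p * p for p in proportions)


# Hypothetical samples: one perfectly even, one dominated by a single ASV.
even_sample = [25, 25, 25, 25]
dominated_sample = [97, 1, 1, 1]
print(inverse_simpson(even_sample))
print(inverse_simpson(dominated_sample))
```

An even community of N ASVs gives IS = N, whereas a sample dominated by one ASV gives a value close to 1, which is why the index is used as a measure of alpha-diversity.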
Complete blood count collection and characterization
Absolute WBC count data were obtained from routine complete blood counts ordered by clinicians during normal clinical practice, used to obtain informative diagnostic and monitoring information. Blood samples received in the clinical haematology laboratory were analysed using Sysmex XN automated haematology analysers (Sysmex) and, when needed based on specific flags and parameters as per MSKCC standard operating procedures, were validated manually using the Sysmex DI-60 Slide Processing System or CellaVision DM9600 Automated Digital Morphology System (Sysmex).

16S rRNA gene amplification and multiparallel sequencing
For each sample, duplicate 50-μl PCRs were performed, each containing 50 ng purified DNA, 0.2 mM deoxynucleotide triphosphates, 1.5 mM MgCl2, 2.5 U Platinum Taq DNA polymerase, 2.5 μl of 10× PCR buffer, and 0.5 μM of each primer designed to amplify the V4–V5 region: 563F (5′-nnnnnnnnNNNNNNNNNNNNAYTGGGYDTAAAGNG-3′) and 926R (5′-nnnnnnnnNNNNNNNNNNNNCCGTCAATTYHTTTRAGT-3′). A unique 12-base Golay barcode (Ns) precedes the primers for sample identification45, and one to eight additional nucleotides (n) were placed in front of the barcode to offset the sequencing of the primers. Cycling conditions were 94 °C for 3 min, followed by 27 cycles of 94 °C for 50 s, 51 °C for 30 s, and 72 °C for 1 min. For the final elongation step, 72 °C for 5 min was used. Replicate PCRs were pooled, and amplicons were purified using the QIAquick PCR Purification Kit (Qiagen). PCR products were quantified and pooled at equimolar amounts before Illumina barcodes and adaptors were ligated, using the Illumina TruSeq Sample Preparation protocol. The completed library was sequenced on an Illumina MiSeq platform following the Illumina recommended procedures with a paired-end 250 × 250-bp kit.

Sequence analysis
The 16S (V4-V5) paired-end reads were merged and demultiplexed. Amplicon sequence variants (ASVs) were identified using the divisive amplicon denoising algorithm (DADA2) pipeline including filtering and trimming of the reads46. Reads were trimmed to the first 180 bp or the first point with a quality score Q < 2. Reads were removed if they contained ambiguous nucleotides (N) or if two or more errors were expected based on the quality of the trimmed read. We assigned taxonomy to ASVs using an octamer-based classifier trained by IDTaxa47 using the SILVA database48.

Quantification of total microbiota density per gram of stool and estimation of total genus abundances
qPCR was performed on DNA extracted from 1 g wet weight of a stool sample using DyNAmo SYBR Green qPCR kit (Finnzymes) and 0.2 μM of the universal bacterial primer 8F (5′-AGAGTTTGATCCTGGCTCAG-3′) and the broad-range bacterial primer 338R (5′-TGCTGCCTCCCGTAGGAGT-3′). Standard curves were prepared by serial dilution of the PCR blunt vector

Linear mixed-effects model of WBC counts
To study the effect of auto-FMT on WBCs, we investigated the WBC counts of 24 subjects enrolled in this trial from the day of neutrophil engraftment until 100 days after engraftment. FMT was performed on different days relative to neutrophil engraftment. Thus, we performed an analogous analysis to that conducted in the original publication that demonstrated how FMT re-established a diverse microbiome in the post-FMT period20. To determine whether WBC counts differed after FMT, we used a linear mixed-effects model of WBC counts, y, modelled as a function of the FMT treatment as well as patient- and time-point-specific random effects. We included random intercept terms for each day i and each patient j, and a fixed-effects term for the post-FMT period with associated coefficient ‘armpost’, using the indicator variable FMT, which has a value of 1 when a patient was from the FMT-treated arm of the trial and day was greater than or equal to the day of the FMT procedure. We conducted independent analyses for neutrophil, lymphocyte and monocyte counts. This resulted in the following model of a cell count, y, for patient j on day i:

$$y_{ij} = \beta_0 + \mathrm{armpost} \times \mathrm{FMT}_{ij} + \mathrm{day}_i + \mathrm{patient}_j + \varepsilon_{ij}, \quad i = 0, \ldots, D, \; j = 1, \ldots, P$$

with prior distributions $\mathrm{day}_i \sim N(0, \sigma_{\mathrm{day}}^2)$ and $\mathrm{patient}_j \sim N(0, \sigma_{\mathrm{patient}}^2)$, independent error $\varepsilon_{ij} \sim N(0, \sigma^2)$ and fixed intercept $\beta_0$, for the D days after neutrophil engraftment and P individuals (D = 100, P = 24). For convenience of those interested in reanalysing our data, the part of our data concerning the auto-FMT analysis is available in tidy format (Supplementary Information), and the analysis code conducted in the R programming language is available as an exported notebook (fmt_effect_on_wbc.pdf) on github: https://github.com/jsevo/wbcdynamics_microbiome/ (ref. 49). We conducted an additional analysis with ‘day’ as a continuous predictor, which did not change our conclusions (Supplementary Information).

Dynamic systems analyses
We analysed factors associated with the observed changes of absolute counts of neutrophils, lymphocytes and monocytes between two days. In the following we describe how chronology of events and biological samples were encoded, and the models used to infer a role of medications, clinical parameters and the microbiome on dynamics of WBCs.
To reveal factors that associate with day-to-day changes in WBC counts, we started from a first-order differential equation of WBC (W) dynamics:

$$\frac{dW}{dt} = W \left( gr + \sum_{j=1}^{P} \beta_j X_j \right)$$

where gr represents the intercept, that is, the baseline rate of change during immune reconstitution, and βj are the to-be-estimated
coefficients of the P predictors Xj, j ∈ P of the WBC dynamics. This equation was then linearized to

$$\frac{d(\ln W)}{dt} = gr + \sum_{j=1}^{P} \beta_j X_j$$

and we parameterized the corresponding discrete difference equation:

$$\frac{\Delta \ln(W)}{\Delta t} = gr + \sum_{j=1}^{P} \beta_j X_j$$

where Δln(W) is the log-difference between single days of neutrophil, lymphocyte or monocyte counts, and Δt = 1 for all intervals. Predictors include the counts of neutrophils, lymphocytes, monocytes, eosinophils and platelets during an interval (homeostatic feedbacks), immunomodulatory medication and clinical observations such as a blood stream infection and the onset of graft versus host disease, HCT parameters such as graft types and conditioning regimens, and, additionally, the microbiota composition in stage 2 of our analysis (Supplementary Information for data exclusion and additional details on interval definitions). Importantly, by parameterizing a dynamic equation and analysing rates of change, our coefficient estimates have an immediate causal interpretation within our modelling framework (that is, a βj > 0 implies that higher levels of the corresponding Xj increase the rate of change of WBC type, W). To differentiate such results from other associations, they have been described by the term ‘mathematical causality’41.

Stage 1 analysis
This includes feature selection: identifying medications and clinical observations associated with WBC dynamics from patients without microbiome data. Stage 1 uses data of patients without any available microbiome samples and the following model of WBC changes, y:

$$y = gr + \sum_{j=1}^{p} \beta_j X_j,$$

with intercept, gr. The predictors, X, include dummy variables for the HCT graft type, patients’ age on the date of HCT, sex, 13 most frequently observed positive blood cultures with remaining other blood stream infections grouped into a separate category ‘other infections’, an indicator for the onset of graft versus host disease, administrations of 55 different, most common immunomodulatory medications and platelet transfusion events, and HCT conditioning intensity regimens as well as the log-transformed geometric mean counts of neutrophils, lymphocytes, monocytes, eosinophils and platelets during the respective interval. We used elastic net regression50 for feature selection using the sklearn package for the Python programming language51. For elastic net regression with 50% L1 penalty, predictors were scaled between zero and 1, and we used tenfold cross validation (that is, leaving out 10% of patients at each cross-validation step) to choose the regularization strength, λ, solving for

$$\underset{gr,\,\beta}{\operatorname{argmin}} \left[ \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - gr - \sum_{j=1}^{p} x_{ij} \beta_j \right)^2 + \frac{1}{2} \lambda \sum_{j=1}^{p} |\beta_j| + \frac{1}{2} \lambda \sum_{j=1}^{p} \beta_j^2 \right]$$

Stage 1 identified important differences between transplant types, and we therefore stratified our data into four cohorts according to the source of the stem-cell graft. Using data independently from each cohort, we applied ‘no u-turn’ sampling53 to produce 10,000 posterior samples from 5 independent MCMC chains that parameterized the model:

$$y \sim N(\mu, \sigma^2)$$

$$\mu = gr + \sum_{j=1}^{\hat{P}} x_j \beta_j$$

with uninformative prior distributions

$$gr \sim N(\text{mean} = 0, \text{standard deviation} = 100)$$

$$\beta_j \sim N(\text{mean} = 0, \text{standard deviation} = 100)$$

$$\sigma \sim \text{half Cauchy}(\text{beta} = 2)$$

where y is the observed daily change of a focal WBC type as in stage 1 with normally distributed mean μ, and σ, the model uncertainty with a thick-tailed half Cauchy prior (importantly, our posterior estimates do not depend on this choice as we obtain the same results with an inverse gamma prior, Extended Data Fig. 10b). μ was a function of the baseline growth rate gr, and predictors P̂: medications with non-zero coefficients in stage 1, the WBC counts, patient age and sex, and HCT conditioning intensities; additionally, P̂ now included the log-abundances of microbial genera as measured by 16S sequencing from DNA in the stool collected on the second day of a daily interval (see supplementary information for details). We considered taxa that were among the 100 most abundant, or had reached maximum relative abundances of at least 10%, and selected those that were non-zero in more than 75% of our samples. WBC counts and microbiota data present during a daily interval were log-transformed, and zeros were filled with half of the minimum observed non-zero counts (that is, $0.5 \times 10^{3}$ and $2 \times 10^{-6}$, respectively). We focused on the largest cohort (PBSC) and used the independent inference results from TCD, bone marrow and cord cohorts for validation.

Validation score
Coefficients learned from the PBSC cohort were assigned a validation score based on the results obtained from the other three MSK patient cohorts. Our requirements for validation were conservative; we required evidence from our validation datasets as well as absence of counter evidence. For regression results from each of the validation graft type cohorts, that is, TCD, bone marrow and cord, we checked if a coefficient had more than 75% probability (50% HPDI) to have the same sign as the mean of the PBSC coefficient posterior for a given predictor. If so, this was considered evidence of validation, and we summed the evidence over the three validation sets (that is, maximum score of 3, 1 from each of TCD, bone marrow and cord cohorts). Conversely, if we found more than 75% probability among any of the validation datasets that a given predictor had the opposite sign as the posterior mean calculated from PBSC data, this was considered counter evidence and the validation score was always set to zero.
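The validation-score rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the probability that a coefficient has a given sign is approximated here by the fraction of posterior draws with that sign (the paper's HPDI-based check may differ), and the function names, the `threshold` parameter and all sample values are hypothetical.

```python
def fraction_with_sign(samples, sign):
    """Fraction of posterior draws whose sign matches `sign` (+1 or -1)."""
    return sum(1 for s in samples if s * sign > 0) / len(samples)


def validation_score(pbsc_samples, validation_samples, threshold=0.75):
    """Score from 0 to the number of validation cohorts for one predictor."""
    pbsc_mean = sum(pbsc_samples) / len(pbsc_samples)
    if pbsc_mean == 0:
        return 0  # no direction to validate (assumption for this sketch)
    sign = 1 if pbsc_mean > 0 else -1
    score = 0
    for samples in validation_samples.values():
        # Counter evidence in any cohort forces the score to zero.
        if fraction_with_sign(samples, -sign) > threshold:
            return 0
        # Evidence: more than `threshold` of draws agree with the PBSC sign.
        if fraction_with_sign(samples, sign) > threshold:
            score += 1
    return score


# Hypothetical posterior draws for a single predictor.
pbsc = [0.2, 0.3, 0.1, 0.4]
cohorts = {
    "TCD": [0.1, 0.2, 0.3, 0.05],
    "bone marrow": [0.1, -0.2, 0.3, -0.1],
    "cord": [0.2, 0.1, 0.4, 0.3],
}
print(validation_score(pbsc, cohorts))
```

Checking for counter evidence before adding evidence mirrors the conservative rule in the text: a single contradicting cohort overrides agreement from the others.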
Stage 1 yielded a sparse coefficient matrix of predictors used to design the model in stage 2.

Stage 2
Stage 2 of the analysis comprises expanded analysis on patients with microbiome data. To identify associations between microbiota and WBC dynamics, we conducted an analogous, Bayesian regression using the package PyMC3 for the Python programming language52.

Analysis of WBC dynamics with absolute bacterial abundances as predictors instead of relative abundances
We conducted an ordinary least-squares regression using the statsmodels package in the Python programming language of the same model as in the main Bayesian analysis using total bacterial abundances as predictors. This was only possible on a subset of 389 neutrophil, 331 lymphocyte and 376 monocyte rate observations from PBSC patients.
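To make the ingredients of this regression concrete, the sketch below combines three steps described in the Methods: scaling a relative abundance by total 16S copies per gram to obtain a total abundance, computing the daily log-change of a WBC count, and fitting a least-squares line. It is a deliberately simplified, single-predictor stand-in written with the standard library; the study fitted a multiple regression with statsmodels, and every function name and data value below is a hypothetical example.

```python
import math


def total_abundance(relative_abundance, copies_16s_per_gram):
    """Total genus abundance per gram: relative abundance times total 16S copies."""
    return relative_abundance * copies_16s_per_gram


def daily_log_change(count_first_day, count_second_day):
    """Log-difference of a WBC count over one daily interval (delta t = 1)."""
    return math.log(count_second_day) - math.log(count_first_day)


def simple_ols(x, y):
    """Closed-form least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical intervals: (relative abundance, 16S copies per gram, WBC day 1, WBC day 2)
intervals = [
    (0.01, 1e9, 400.0, 420.0),
    (0.10, 1e9, 500.0, 600.0),
    (0.20, 1e10, 300.0, 450.0),
    (0.05, 1e8, 800.0, 790.0),
]
log_abundance = [math.log10(total_abundance(r, c)) for r, c, _, _ in intervals]
rates = [daily_log_change(w1, w2) for _, _, w1, w2 in intervals]
intercept, slope = simple_ols(log_abundance, rates)
print(f"slope of daily log change per log10 total abundance: {slope:.3f}")
```

Working with log-differences as the response means the slope is read as a change in the per-day growth rate of the cell count, matching the interpretation of βj in the dynamic model.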
where η is the total number of observed daily log changes in genera and WBCs, and ρ the total number of predictors. This yielded a strongly regularizing λs, and thus few predictors. To characterize potential bidirectional relationships between WBC counts and the gut microbiota, we iteratively reduced the regularization strength until the strongest interaction between microbiota and WBC dynamics, that is, Faecalibacterium with neutrophil dynamics, was detected. We then re-ran the regression with this reduced regularization strength λr.

Shotgun sequencing
Sequencing of 124 post-neutrophil engraftment samples was conducted on the Illumina HiSeq platform. For details and the processing of the FASTQ files, see supplementary information. We used the HUMAnN2 pipeline54 with default settings for functional profiling of our samples, with the UniRef90 database and ChocoPhlAn for alignment, and we renormalized our samples by library depth to copies per million. We used MetaCyc to obtain stratified and unstratified pathway abundances.

Forwards simulation of predicted immune system reconstitution kinetics
To assess the impact of the estimated microbiota coefficients on immune system dynamics, we conducted 1,000 simulations of the system of 3 differential equations describing the dynamics of neutrophils, lymphocytes and monocytes. We ran 1,000 simulations four times: in the presence and absence of GCSF, each with microbiota compositions enriched or depleted in Faecalibacterium, Ruminococcus 2 and Akkermansia. To identify these compositions, we ranked the observed microbiota compositions by these taxa, and chose randomly either from the top or bottom 100. The coefficients for WBC interactions, interactions with the microbiota and the effect of GCSF were sampled from our posterior coefficient distributions. Using these coefficients sampled at the start of the simulation, and using 50 cells μl⁻¹ of neutrophils, lymphocytes and monocytes as initial values, we simulated these differential equations forwards in time using the odeint function of the scipy package for the Python programming language.
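A forward simulation of this kind can be sketched as follows. The sketch is not the authors' code: it replaces scipy's odeint with a fixed-step explicit Euler scheme so that it runs with the standard library alone, it assumes that the rate of change of each log count depends linearly on the log counts of the three cell types plus a constant external term (standing in for microbiota or GCSF effects), and all coefficient values are invented for illustration. Only the initial value of 50 cells per microlitre comes from the text.

```python
import math


def simulate_log_dynamics(growth, feedback, external, initial_counts, days, dt=0.01):
    """Integrate d ln(W_k)/dt = growth[k] + sum_m feedback[k][m] * ln(W_m) + external[k].

    Uses explicit Euler steps on the log counts and returns the counts after `days`.
    """
    log_w = [math.log(c) for c in initial_counts]
    n_types = len(log_w)
    for _ in range(int(round(days / dt))):
        rates = [
            growth[k]
            + sum(feedback[k][m] * log_w[m] for m in range(n_types))
            + external[k]
            for k in range(n_types)
        ]
        log_w = [lw + dt * r for lw, r in zip(log_w, rates)]
    return [math.exp(lw) for lw in log_w]


# Hypothetical coefficients, ordered as neutrophils, lymphocytes, monocytes.
GROWTH = [3.5, 3.0, 2.5]
FEEDBACK = [[-0.5, 0.0, 0.0], [0.0, -0.5, 0.0], [0.0, 0.0, -0.5]]
START = [50.0, 50.0, 50.0]  # 50 cells per microlitre, as in the text

baseline = simulate_log_dynamics(GROWTH, FEEDBACK, [0.0, 0.0, 0.0], START, days=60)
enriched = simulate_log_dynamics(GROWTH, FEEDBACK, [0.0, 0.1, 0.0], START, days=60)
print([round(v) for v in baseline])
print([round(v) for v in enriched])
```

In the study the coefficients would be drawn from the posterior distributions and the simulation repeated 1,000 times per scenario; here a single deterministic run per scenario shows how a positive external term raises the level that the affected cell type settles at.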
Validation on data from Duke University
We analysed 9,603 blood samples with 25,581 associated administrations of immunomodulatory medications, and 741 microbiota samples from Duke as an orthogonal dataset to validate our findings. The temporal resolution of this data was much lower, and after filtering for samples from the relevant post-neutrophil engraftment period, and by requiring daily intervals, 83 valid, complete data points were available. Using these data, we correlated daily blood cell changes individually in univariate, or jointly in a partial least squares regression, with those predictors that achieved more than 95% probability density in the positive or negative domain in the PBSC data regression. For each of these predictors, we present the sign of slopes and Bonferroni-corrected P values from individual linear regressions.

Joint analysis of the effect of antibiotics and WBC counts on the microbiota and the microbiota and immunomodulatory medications on WBC counts
Analogous to stage 1, we performed cross-validated, regularized linear regressions (ElasticNet) using the scikit-learn package for the Python programming language to jointly estimate the association network between microbiota and circulatory WBCs. For this, we constructed a block matrix X of predictor matrices Xi that include the absolute bacterial abundances, drug data (antibiotics for bacterial dynamics and immune modulators for WBC dynamics), as well as the counts of WBCs and a separate intercept term per block. Each block $X^{l}_{n_l, p_l}$, with $n_l$ observations and $p_l$ predictors (l = 0,...,k), on the diagonal of X corresponds to the indices of the observed daily log-changes of one of the 41 bacterial genera considered in our main analysis or the log changes in neutrophil, lymphocyte and monocyte counts from PBSC patients contained in Y (in total we calculated 15,833 rates from 256 patients). Our regression problem can thus be written as:

$$\underset{\beta}{\operatorname{argmin}}\,(Y - X\beta) \quad \text{where} \quad X = \begin{pmatrix} X^{0}_{n_0, p_0} & \cdots & 0_{n_0, p_k} \\ \vdots & \ddots & \vdots \\ 0_{n_k, p_0} & \cdots & X^{k}_{n_k, p_k} \end{pmatrix}$$

with k = 44, that is, 41 bacterial genera and 3 WBC types, the to-be estimated coefficient vector β and 0 the zero matrix. This system is underdetermined and we therefore chose the same approach as in stage 1, elastic net regression, for feature selection. Predictors were scaled between zero and 1, and we used threefold cross validation, leaving out one-third of the patients at each iteration to identify a global regularization strength λ, solving for

$$\underset{\beta}{\operatorname{argmin}} \left[ \frac{1}{2\eta} \sum_{i=1}^{\eta} \left( y_i - \sum_{j=1}^{\rho} x_{ij} \beta_j \right)^2 + \frac{1}{2} \lambda \sum_{j=1}^{\rho} |\beta_j| + \frac{1}{2} \lambda \sum_{j=1}^{\rho} \beta_j^2 \right]$$

Statistical analysis of shotgun data
We calculated the predicted microbiota potency score for each sample and separately for neutrophils, lymphocytes and monocytes, by multiplying the abundances of taxa in each of the 124 samples with the corresponding posterior coefficients obtained from the PBSC inference. To distinguish the sets of metabolic functions that separate samples with positive and negative predicted potencies, we converted the pathway abundances into presence and absence profiles. We performed a linear discriminant analysis between positive and negative potency samples with a least squares solver and automatic shrinkage using the Ledoit–Wolf lemma using the sklearn package for the Python programming language51. To assess differences in the presence or absence of pathways between samples with positive and negative potency, we used Fisher’s exact test for each pathway.

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Data availability
All data supporting the findings of this study are available within the paper and its Supplementary Information files. The data used in our study are organized in Excel-compatible comma-separated value files as Supplementary Tables (data-tables.zip). All sequencing data have been made available publicly, and the NCBI SRA accession numbers are listed in the Supplementary Tables. Metadata and processed sequencing data are made available on a public repository via Figshare: metadata, https://doi.org/10.6084/m9.figshare.12016986.v4; samples, https://doi.org/10.6084/m9.figshare.12016983.v4; 16S counts, https://doi.org/10.6084/m9.figshare.12016989.v3; and 16S taxonomy, https://doi.org/10.6084/m9.figshare.12016992.v1.

Code availability
All of the steps of the analyses that were performed in this study are described in detail to allow reproduction of the results. Relevant analysis code is available publicly at https://github.com/jsevo/wbcdynamics_microbiome.

45. Caporaso, J. G. et al. Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms. ISME J. 6, 1621–1624 (2012).
46. Callahan, B. J. et al. DADA2: high-resolution sample inference from Illumina amplicon data. Nat. Methods 13, 581–583 (2016).
47. Murali, A., Bhargava, A. & Wright, E. S. IDTAXA: a novel approach for accurate taxonomic classification of microbiome sequences. Microbiome 6, 140 (2018).
48. Quast, C. et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 41, D590–D596 (2013).
49. Pinheiro, J. C., Bates, D. M., DebRoy, S. S. & Sarkar, D. nlme: Linear and Nonlinear Mixed Effects Models. R package version 3.1-150 (2013).
50. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B 58, 267–288 (1996).
51. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825 (2011).
52. Salvatier, J., Wiecki, T. V. & Fonnesbeck, C. Probabilistic programming in Python using PyMC3. PeerJ Comput. Sci. 2, e55 (2016).
53. Hoffman, M. D. & Gelman, A. The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. 15, 1593–1623 (2014).
54. Franzosa, E. A. et al. Species-level functional profiling of metagenomes and metatranscriptomes. Nat. Methods 15, 962–968 (2018).

Acknowledgements We thank M. Lipsitch, S. B. Andersen, K. R. Foster, J. K. Sia, E. G. Pamer, K. Coyte, S. Mitschka and the members of the Xavier lab for helpful discussion and comments on the manuscript. This work was supported by the National Institutes of Health (NIH) grants U01 AI124275, R01 AI137269 and U54 CA209975 to JBX, by the MSKCC Cancer Center Core Grant P30 CA008748, the Parker Institute for Cancer Immunotherapy at Memorial Sloan Kettering Cancer Center, the Sawiris Foundation, the Society of Memorial Sloan Kettering Cancer Center, MSKCC Cancer Systems Immunology Pilot Grant and Empire Clinical Research Investigator Program. M.S. received funding from the Burroughs Wellcome Fund Postdoctoral Enrichment Program, the Damon Runyon Physician-Scientist Award, and the Robert Wood Johnson Foundation. T.M.H. is investigator in the Pathogenesis of Infectious Diseases from the Burroughs Wellcome Fund, and funded via an award from Geoffrey Beene Foundation, and NIH RO1 AI093808. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author contributions J.S. and J.B.X. wrote the manuscript. J.S. and J.B.X. designed the analyses with expert help from R.N. J.U.P. and Y.T. contributed to the clinical data preparation, B.P.T. provided the 16S data-processing pipelines, K.A.M., M.S., A.S., S.M., M.F., M.S.P., T.M.H., M.-A.P. and M.R.M.v.d.B. provided clinical context and helped with variable selection, N.J.C., M.L., L.B., A.B. and A.D.S. provided clinical and other data from Duke, A.D. provided the shotgun processing pipelines. E.F., L.A.A. and R.J.W. processed patients’ stool samples, including for 16S sequencing, shotgun metagenomics and qPCR quantification of total 16S rRNA gene. All authors contributed to the writing and interpretation of the results.

Competing interests M.R.M.v.d.B. and J.U.P. received financial support from Seres Therapeutics. M.-A.P. has received honoraria from AbbVie, Bellicum, Bristol-Myers Squibb, Incyte, Merck, Novartis, Nektar Therapeutics, and Takeda, research support for clinical trials from Incyte, Kite (Gilead) and Miltenyi Biotec, and serves on data and safety monitoring boards for Servier and Medigene and scientific advisory boards for MolMed and NexImmune.

Additional information
Supplementary information is available for this paper at https://doi.org/10.1038/s41586-020-2971-8.
Correspondence and requests for materials should be addressed to J.S. or J.B.X.
Peer review information Nature thanks Henrik Nielsen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Reprints and permissions information is available at http://www.nature.com/reprints.

Extended Data Fig. 1 | Blood cell counts over time. a, WBC counts and platelet counts per graft source over the first 100 days post HCT per day relative to HCT
from N = 2,235 adult patients (detailed demographics in supplementary Table 1); lines: mean, shaded: ± standard deviations. b, Data exclusion diagram.
Extended Data Fig. 2 | FMT increases WBC counts. a, HCT patient who received an autologous faecal microbiota transplant (auto-FMT, dashed red line) that restored commensal microbial families and ecological diversity in the gut microbiota, with concurrent cell counts of peripheral neutrophils, lymphocytes and monocytes and immunomodulatory drug administrations. b, Total WBC counts in 24 enrolled patients (10 control, 14 treated) post-neutrophil engraftment; vertical lines indicate randomization dates. c, Weekly mean WBC counts aligned to the randomization date (FMT-treated: red, control: black). Line: mean per week, shaded region: 95% CI. d, Coefficient estimates (mean vs. mean + FMT effect) from linear mixed effects models of total WBC counts over time indicate an auto-FMT-induced increase of WBCs (βFMT: $P = 7 \times 10^{-14}$). e–g, Respectively: neutrophil, lymphocyte and monocyte count trajectories of 24 FMT trial patients. Thin lines: raw data (blue: post-FMT); thick black: mean per day, thick blue: mean+post-FMT coefficient. Means and confidence intervals (shaded region) without (black) and after FMT (blue), as well as the coefficient estimate for FMT treatment and its P value from a linear mixed effects model relating cell counts over time to the FMT treatment (Methods).

Extended Data Fig. 3 | Results of the feature selection stage 1 regression. a–c, Stage 1 regression on neutrophil, lymphocyte, and monocyte dynamics, respectively, on patients without microbiome data. Coefficients from tenfold cross-validated elastic net regression daily changes in neutrophils. gr: intercept; TCD: T cell depleted graft (ex-vivo) by CD34+ selection; PBSC: peripheral blood stem cells; BM: bone marrow; cord: umbilical cord blood; NONABL: Nonmyeloablative; REDUCE: reduced-intensity conditioning regimen; F: female; N: patients, n: samples (daily changes in neutrophils).
Extended Data Fig. 4 | Additional coefficients, posterior convergence evaluation and validation. a–c, Additional posterior coefficient estimates of medications, additional genera and HCT metadata from the Bayesian stage 2 regression, see also Fig. 3. REDUCE: reduced-intensity conditioning regimen; NONABL: non-myeloablative conditioning regimen. F: female. d–f, Posterior sampling convergence. Histograms of the ranked posterior draws from the model of neutrophil, lymphocyte and monocyte dynamics, respectively, in PBSC patients (ranked over all chains), plotted separately for each chain, show no substantial differences between chains. g–i, Predictors of WBC dynamics using data from patients treated at Duke. Heatmaps indicate the slope coefficients from individual univariate regressions of microbiome and clinical predictors with changes in neutrophils, lymphocytes and monocytes, and for comparison the corresponding coefficient signs from the Bayesian multiple linear regressions in stage 2 of the analysis of WBC dynamics in MSK patients (Fig. 3). P values were adjusted for multiple hypothesis testing using Bonferroni correction: ***P < 0.001, **P < 0.01, *P < 0.05; P > 0.05: n.s. Sign of coefficients from MSK PBSC patients for comparison. j, Equivalent validation analysis from patients treated at Duke using partial least squares regression of microbiome and clinical predictors identified in stage 2 of our analysis on daily changes in neutrophils, lymphocytes and monocytes.
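The Bonferroni adjustment and star notation named in this legend can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `bonferroni` and `stars` and the example P values are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each P value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def stars(p_adj):
    """Map an adjusted P value to the notation used in the legend."""
    if p_adj < 0.001:
        return "***"
    if p_adj < 0.01:
        return "**"
    if p_adj < 0.05:
        return "*"
    return "n.s."


# Hypothetical raw P values from four univariate regressions.
raw = [0.0001, 0.004, 0.02, 0.3]
adjusted = bonferroni(raw)
labels = [stars(p) for p in adjusted]
```

With four tests, a raw P value of 0.02 becomes 0.08 after adjustment and is therefore reported as n.s.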

Extended Data Fig. 5 | Validation using absolute instead of relative abundance bacterial genus data. a–d, Validation analysis of the main model using absolute bacterial abundances as predictors instead of relative abundances in Fig. 3. Results show inferred coefficients and P values from multiple linear regressions. One regression per analysed WBC type dynamics, that is, neutrophil, lymphocyte and monocyte daily log-changes, was conducted, and coefficients for medications (a), WBC feedbacks (b), metadata (c) and total genus abundances (d) are shown. This was possible for only a subset of the data used in the main analysis for which we obtained absolute bacterial abundance estimates (Methods). n: samples, N: patients.
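The response variable named in this legend, the daily log-change of a WBC count, can be illustrated with a short sketch. It assumes natural logarithms, strictly positive counts and measurements on consecutive days; the helper name `daily_log_changes` and the numbers are hypothetical, and the treatment of zero counts or missing days in the actual analysis is not covered here.

```python
import math


def daily_log_changes(counts):
    """Return log(count[t+1]) - log(count[t]) for consecutive days."""
    if any(c <= 0 for c in counts):
        raise ValueError("counts must be strictly positive")
    return [math.log(b) - math.log(a) for a, b in zip(counts, counts[1:])]


# Hypothetical neutrophil counts on four consecutive days.
neutrophils = [0.5, 1.0, 2.0, 2.0]
changes = daily_log_changes(neutrophils)
```

A doubling from one day to the next gives a log-change of log(2), and an unchanged count gives 0, so the scale treats relative increases and decreases symmetrically.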
Extended Data Fig. 6 | Jointly inferred association network between WBC and bacterial genus dynamics. Strong regularization yields few non-zero
coefficients and antibiotics dominate the dynamics.

Extended Data Fig. 7 | Jointly inferred association network between WBC and bacterial genus dynamics with reduced regularization. Reducing regularization strength (Methods) indicates potential bidirectional feedbacks, for example, between lymphocytes and [Ruminococcus] gnavus group (highlighter green boxes, and cartoon).
Extended Data Fig. 8 | Functional analysis of microbiota samples. To distinguish samples predicted to increase rates of WBCs, a microbiota potency score was calculated from posterior coefficients (Fig. 3, Methods) and the relative abundance of taxa in samples. Bars show linear discriminant analysis (LDA) scores of MetaCyc pathway profiles from 124 shotgun sequenced samples that distinguished positive and negative potency samples the most (LDA-score magnitude in the 95th percentile). Highlighted pathways are discussed in the main text. For each pathway, we tested whether pathway presence was enriched (depleted) in positive (negative) potency samples using one-sided Fisher’s exact test; ***P < 0.001, **P < 0.01, *P < 0.05.
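A one-sided Fisher's exact test for enrichment, as named in this legend, can be written directly from the hypergeometric distribution. The sketch below is illustrative only: the function name `fisher_exact_greater`, the table labels and the counts are hypothetical, and it relies on `math.comb`, which requires Python 3.8 or later (newer than the Python 3.7.0 listed in the reporting summary).

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in a 2x2 table.

    Table layout (hypothetical labels):
                        pathway present   pathway absent
    positive potency           a                 b
    negative potency           c                 d
    Returns P(X >= a) under the hypergeometric null.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    k_max = min(row1, col1)
    total = comb(n, row1)
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, k_max + 1)
    )
    return tail / total


# Hypothetical counts: 3 of 4 positive and 1 of 4 negative samples carry the pathway.
p_enriched = fisher_exact_greater(3, 1, 1, 3)
```

Testing depletion instead amounts to swapping the two rows, or the two columns, before calling the function.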

Extended Data Fig. 9 | Abundance profiles of bacterial genera across analysed samples. a, The relative non-zero abundance of Staphylococcus is inversely related to microbiome alpha diversity, bold line: regression line from a linear model of the mean of the log10 Staphylococcus relative abundance, shaded: 95% confidence intervals (n = 1,381 samples with non-zero Staphylococcus abundances). b, Abundance profiles of the two genera, Faecalibacterium and Ruminococcus 2, most strongly associated with WBC increase; number of times detected (left) and log10 abundance distribution when above detection (right).
Extended Data Fig. 10 | Survival analysis and confirmation of model results with different priors. a, Kaplan–Meier plot of 3-year survival of patients with sufficient available blood data (Supplementary Information, Extended Data Fig. 1). b, Posterior association coefficients do not depend on the choice of prior for σ in the main Bayesian model. Plotted are the posterior means from our main analysis against the equivalent inference with an inverse Gamma prior (alpha = 1, beta = 1).
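For readers unfamiliar with the Kaplan–Meier estimator behind panel a, the product-limit calculation can be sketched in a few lines. This is a generic textbook version under stated assumptions, not the authors' code: the function name `kaplan_meier` and the follow-up data are hypothetical, and confidence bands are omitted.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up time per patient; events: 1 = death, 0 = censored.
    """
    survival, curve = 1.0, []
    for t in sorted(set(t for t, e in zip(times, events) if e)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up times (years) and event indicators for five patients.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored patients contribute to the at-risk count up to their last follow-up time but never trigger a drop in the curve.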
nature research | reporting summary
Corresponding author(s): Jonas Schluter, Joao B. Xavier
Last updated by author(s): Aug 8, 2020

Reporting Summary
Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency
in reporting. For further information on Nature Research policies, see our Editorial Policies and the Editorial Policy Checklist.

Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.
n/a Confirmed
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
The statistical test(s) used AND whether they are one- or two-sided
Only common tests should be described solely by name; describe more complex techniques in the Methods section.

A description of all covariates tested


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient)
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)

For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted
Give P values as exact values whenever suitable.

For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated
Our web collection on statistics for biologists contains articles on many of the points above.

Software and code


Policy information about availability of computer code
Data collection: No software was used for data collection.

Data analysis: Python 3.7.0, R 3.6.1, DADA2, HUMAnN2, ChocoPhlAn, MetaCyc, PyMC3, scikit-learn 0.23.2, https://github.com/jsevo/wbcdynamics_microbiome/
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and
reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability
April 2020

The data used in our study are organized in Excel-compatible comma-separated value files as supplementary tables (data-tables.zip). All sequencing data have been made available publicly, and the NCBI SRA accession numbers are listed in the tables below:

1. cGENUS.csv: relative taxon abundances in fecal microbiota samples from 12,633 stool samples
2. cHCTMETA.csv: HCT characteristics
3. cINFECTIONS.csv: positive blood culture results
4. cMISAMPLES.csv: NCBI SRA accession number, diversity (inverse Simpson index), total 16S (where available), stool consistency for each fecal microbiota sample
5. cMED.csv: medication data
6. cPIDMETA.csv: anonymized patient demographics
7. cWBC.csv: absolute counts of neutrophils, lymphocytes, monocytes, eosinophils, and platelets with indication if included in analyses
8. cDUKE__GENUS.csv: relative taxon abundances in fecal microbiota samples from 12,633 stool samples
9. cDUKE__WBC.csv: absolute counts of neutrophils, lymphocytes, monocytes, eosinophils, and platelets with indication if included in analyses
10. cDUKE__MED.csv: medication data
11. cFMT_analysis.csv: convenience table for Figure 2
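
The diversity measure reported in cMISAMPLES.csv, the inverse Simpson index, has a simple closed form: one divided by the sum of squared relative abundances. The sketch below illustrates that definition; the function name `inverse_simpson` and the example counts are hypothetical and are not taken from the data tables.

```python
def inverse_simpson(abundances):
    """Inverse Simpson index: 1 / sum(p_i^2) over relative abundances p_i."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("sample has no counts")
    return 1.0 / sum((x / total) ** 2 for x in abundances)


# Hypothetical samples: four equally abundant taxa versus a single dominant taxon.
even_sample = inverse_simpson([25, 25, 25, 25])
dominated_sample = inverse_simpson([100, 0, 0])
```

The index equals the number of taxa when all are equally abundant and falls towards 1 as a single taxon dominates the sample.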

Metadata and processed sequencing data are made available on a public repository via Figshare:

meta data: doi.org/10.6084/m9.figshare.12016986.v4
samples: doi.org/10.6084/m9.figshare.12016983.v4
16S counts: doi.org/10.6084/m9.figshare.12016989.v3
16S taxonomy: doi.org/10.6084/m9.figshare.12016992.v1

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.
Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design


All studies must disclose on these points even when the disclosure is negative.
Sample size: Throughout the manuscript, sample sizes are specified. There are many nested analyses on various subsets, and each analysis specifies the sample sizes. Sample sizes were not predetermined for retrospective analyses; instead, all data from electronic health records from allo-HCT patients since 2003 were used, with specific exclusion criteria listed in the data exclusion section.

Data exclusions: Non-adults and non-engrafted patients were excluded. Data from patients without valid two-day apart sample pairs were excluded. The supplementary methods provide a detailed flow chart of data inclusion/exclusion.

Replication: All analyses can be reproduced with the code and data provided. No experiments were conducted.

Randomization: N/A as no trial was conducted for this study. The randomized data were previously published and the randomization procedure is explained in the relevant reference (Taur et al. 2018).

Blinding: N/A. No trial was conducted.

Reporting for specific materials, systems and methods


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material,
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.

Materials & experimental systems Methods


n/a Involved in the study n/a Involved in the study
Antibodies ChIP-seq
Eukaryotic cell lines Flow cytometry
Palaeontology and archaeology MRI-based neuroimaging
Animals and other organisms
Human research participants
Clinical data
Dual use research of concern

Human research participants



Policy information about studies involving human research participants


Population characteristics: Tables S1-S5 list the patient population characteristics.

Recruitment: No patients were specifically recruited for this work. Allo-HCT patients since 2003 were considered and included or excluded as detailed in the data inclusion/exclusion section.

Ethics oversight: The participants in the auto-FMT trial (NCT02269150) provided written informed consent to participate in the trial (#14-025). Participants in the observational cohorts at both Memorial Sloan Kettering Cancer Center and at Duke University School of Medicine provided written informed consent for the use of their fecal specimens and clinical data. The use and analysis of these specimens for the work herein were approved by IRBs at both institutions: MSK (#16-834) and Duke (PRO0006268 and Pro00050975).
Note that full information on the approval of the study protocol must also be provided in the manuscript.

Clinical data
Policy information about clinical studies
All manuscripts should comply with the ICMJE guidelines for publication of clinical research and a completed CONSORT checklist must be included with all submissions.

Clinical trial registration: NIH clinicaltrials.gov: NCT02269150 (https://clinicaltrials.gov/ct2/show/NCT02269150)

Study protocol: https://clinicaltrials.gov/ct2/show/NCT02269150

Data collection: This is a randomized, open-label, controlled study designed to assess the efficacy of autologous fecal microbiota transplantation (auto-FMT) for prevention of Clostridium difficile infection (CDI) in patients who have undergone allogeneic hematopoietic stem cell transplantation (allo-HSCT). Patients will be enrolled prior to allo-HSCT; feces will be collected and stored from all participating subjects prior to the initiation of conditioning regimens, analyzed by deep 16S rRNA gene sequencing, and tested by assay for intestinal pathogens including Clostridium difficile. Later in the course of transplantation, following engraftment (defined as the first day of three consecutive days on which the absolute blood neutrophil count is at or above 500/mm3), subjects will undergo fecal testing for presence of Bacteroidetes by 16S PCR. Subjects will be eligible for study if they have a microbiologically diverse pre-transplant colonic microbiota, and if the post-engraftment specimen contains Bacteroidetes at a prevalence equal to or below (0.1%)

Actual Study Start Date: October 2014
Estimated Primary Completion Date: October 2021

Outcomes: Primary Outcome: Clostridium difficile infection (CDI) [Time Frame: up to 1 year following randomization]

CDI is defined as diarrheal stool (unformed stool conforming to the shape of a specimen container) and a positive test for toxin-producing C. difficile (either by toxin B gene PCR or cytotoxicity assay).
