Global Protein-Protein Interaction Network in The Human Pathogen Mycobacterium Tuberculosis H37Rv
National Key Laboratory of Agricultural Microbiology, Center for Proteomics Research, College of Life Science
and Technology, Huazhong Agricultural University, Wuhan 430070, China
Analysis of the protein-protein interaction network of a pathogen is a powerful approach for dissecting
gene function, potential signal transduction, and virulence pathways. This study looks at the construction
of a global protein-protein interaction (PPI) network for the human pathogen Mycobacterium
tuberculosis H37Rv, based on a high-throughput bacterial two-hybrid method. Almost the entire
ORFeome was cloned, and more than 8000 novel interactions were identified. The overall quality of
the PPI network was validated through two independent methods, and a high success rate of more
than 60% was obtained. The parameters of PPI networks were calculated. The average shortest path
length was 4.31. The topological coefficient of the M. tuberculosis B2H network perfectly followed a
power law distribution (correlation = 0.999; R-squared = 0.999) and represented the best fit in all
currently available PPI networks. A cross-species PPI network comparison revealed 94 conserved
subnetworks between M. tuberculosis and several prokaryotic organism PPI networks. The global
network was linked to the protein secretion pathway. Two WhiB-like regulators were found to be highly
connected proteins in the global network. This is the first systematic noncomputational PPI data for
the human pathogen, and it provides a useful resource for studies of infection mechanisms, new
signaling pathways, and novel antituberculosis drug development.
10.1021/pr100808n. 2010 American Chemical Society. Journal of Proteome Research 2010, 9, 6665–6677
Published on Web 10/25/2010
research articles Wang et al.
Figure 1. Construction and property of the M. tuberculosis PPI network. (A) The process of constructing large-scale M. tuberculosis
protein-protein interaction networks. (B) Schematic representation of bacterial two-hybrid analysis in this study. (C) Global
protein-protein interaction network views of M. tuberculosis. A graph of the pathogenic PPI network involving 2907 proteins linked
via 8042 interactions. Several major protein families are indicated by different colors. (D) The degrees of top 10 highly connected
proteins in the M. tuberculosis B2H PPI network. The numbers of PPIs are indicated on top of the respective columns. The functions
of 10 proteins are shown on the right of the panel.
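The degree ranking in panel D and the component structure of the graph in panel C can be illustrated with a short sketch. This is a minimal example on made-up data, not the authors' pipeline: the protein names (`hub`, `p1`, `q1`, ...) and the helper functions are hypothetical, and self-interactions are ignored as a simplifying assumption.

```python
from collections import Counter, defaultdict


def build_neighbors(edges):
    """Map each protein to the set of its interaction partners (undirected)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # assumption: self-interactions are ignored in this sketch
            neighbors[a].add(b)
            neighbors[b].add(a)
    return dict(neighbors)


def degree_ranking(edges, top_n=10):
    """Return the top_n proteins by degree as (protein, degree) pairs."""
    neighbors = build_neighbors(edges)
    degrees = Counter({p: len(partners) for p, partners in neighbors.items()})
    return degrees.most_common(top_n)


def component_sizes(edges):
    """Return the sizes of all connected components, largest first."""
    neighbors = build_neighbors(edges)
    seen = set()
    sizes = []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            size += 1
            for partner in neighbors[node]:
                if partner not in seen:
                    seen.add(partner)
                    stack.append(partner)
        sizes.append(size)
    return sorted(sizes, reverse=True)


# Hypothetical toy network: one larger component and one isolated pair.
toy_edges = [("hub", "p1"), ("hub", "p2"), ("hub", "p3"), ("p1", "p2"), ("q1", "q2")]
ranking = degree_ranking(toy_edges, top_n=3)
sizes = component_sizes(toy_edges)
giant_fraction = sizes[0] / sum(sizes)  # share of proteins in the largest component
```

Applied to a real interaction list, `giant_fraction` corresponds to the kind of figure quoted for the giant component (2895 of 2907 proteins, 99.59%).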
For the study of PPIs, the yeast two-hybrid system has come to the forefront as a powerful tool for
PPI identification. However, the development of bacterial two-hybrid systems now allows proteins to
be assayed for interactions under conditions that closely match the native environment of M.
tuberculosis (Figure 1B).18-20 In the present paper, we show how we successfully constructed a
global M. tuberculosis PPI network. We cloned almost the entire ORFeome of the pathogen and
identified more than 8000 mostly novel interactions among the 2907 proteins. The overall quality of
the PPI network was validated. Two WhiB-like transcriptional factors were also found to be highly
connected proteins in the global network, indicating that these genes might be core regulators. The
capacity for comprehensive screening of PPI in M. tuberculosis is expected to provide further
information regarding infection mechanisms and to aid in identifying novel antitubercular drugs.
Materials and Methods
Cloning of M. tuberculosis ORFeome into the Bacterial Two-Hybrid Plasmids. Primer pairs for a total
of 3989 ORFs were designed, and each predicted ORF was amplified with a specific primer pair through
PCR. PCR products were digested by corresponding restriction enzyme pairs or were ligated with an
EcoRI-SmaI adaptor (TaKaRa Biotech. Co.) if this type of enzyme site was lacking within the ORF
sequences. The cleaned ORF fragments were then cloned into a pair of BacterioMatch II vectors, pBT
and pTRG (Stratagene). We carried out sequencing of the entire clone library to confirm that the
ORFs were correctly amplified and cloned. All pTRG-ORF plasmids were then pooled, and from there, an
ORF library for M. tuberculosis was constructed.
Figure 2. Interaction distributions in COG functional categories. (A) Interactions between COG functional categories. Numbers and
colors indicate the number of interactions between each pair of COG functional categories in the M. tuberculosis B2H PPI network. (B)
Number of proteins in each COG functional category encoded by the M. tuberculosis genome. (C) Number of interactions in each COG
functional category in the M. tuberculosis B2H PPI network.
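The category-pair counts behind a heat map like the one in panel A can be tallied from an interaction list and a protein-to-category table. The sketch below is illustrative only: the protein identifiers (`P1` to `P4`, `P9`), the category assignments, and the function name are hypothetical, and skipping proteins without a COG annotation is an assumption rather than something the study states.

```python
from collections import Counter


def count_category_pairs(interactions, category_of):
    """Count interactions per unordered pair of COG categories."""
    counts = Counter()
    for a, b in interactions:
        cat_a = category_of.get(a)
        cat_b = category_of.get(b)
        if cat_a is None or cat_b is None:
            continue  # assumption: proteins without a COG category are skipped
        # Sort so that (I, T) and (T, I) land in the same cell of the matrix.
        counts[tuple(sorted((cat_a, cat_b)))] += 1
    return counts


# Hypothetical annotations: I = lipid transport and metabolism,
# T = signal transduction mechanisms, M = cell wall/membrane biogenesis.
category_of = {"P1": "I", "P2": "I", "P3": "T", "P4": "M"}
interactions = [("P1", "P2"), ("P1", "P3"), ("P3", "P2"), ("P4", "P3"), ("P4", "P9")]
pair_counts = count_category_pairs(interactions, category_of)
```

Each key of `pair_counts` corresponds to one cell of the symmetric category-by-category matrix.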
genes encoded by the genome of M. tuberculosis), after eliminating false-positive cotransformants
(Figure 1A). A global network of M. tuberculosis H37Rv, containing 8042 PPIs, was constructed
(Figure 1C), and the network (or graph) based on these PPI data was visualized with the program
Cytoscape29 using the Dual Layout plugin (Supporting Information Table S2).
The bacterial proteins formed a highly linked network. As shown in Figure 1C, the constructed global
PPI network of M. tuberculosis H37Rv comprised a giant network component of 8020 interactions
between 2895 (99.59%) proteins and six small isolated network components of fewer than three
proteins each. The top 10 proteins (right panel) and their number of linkages (left panel) in the
B2H network are listed in Figure 1D. These include dihydrodipicolinate reductase DapB, two WhiB-like
transcriptional factors (WhiB3 and WhiB7), and several proteins of unknown function. On the basis of
the COG categories, the B2H data are presented using a heat map, and interactions between lipid
transport and lipid metabolism proteins are shown to be relatively enriched in the global network
(Figure 2A). The number and map of B2H interactions are consistent with those of the ORFs encoded by
the M. tuberculosis genome (Figure 2B and 2C).
We further carried out a PPI enrichment analysis in the context of these COG categories (Supporting
Information Figure S2). Proteins from category I (lipid transport and metabolism), D (cell cycle
control, mitosis, and meiosis), and O (posttranslational modification, protein turnover, chaperones)
showed higher connectivity to the rest of the proteins in the network (Supporting Information Figure
S2A). Proteins within the category D (cell cycle control, mitosis, and meiosis) were remarkably
highly interconnected (Supporting Information Figure S2B). Significant enrichment of the
interactions between categories M (cell wall/membrane biogenesis) and U (intracellular trafficking
and secretion) was observed (Supporting Information Figure S2B). Signal proteins from category T
(signal transduction mechanisms) tended to interact with proteins from category I (lipid transport
and metabolism), C (energy production and conversion), and J (translation) (Supporting Information
Figure S2B).
Figure 3. Topological properties of the M. tuberculosis B2H PPI network. (A) Degree distribution.
The number of proteins with a given number of links (k) in the M. tuberculosis B2H PPI network
follows a power law $P(k) = a k^{b}$ ($a = 3039.8$, $b = -1.968$). $R^2 = 0.884$ for the power law
fit. (B) Topological coefficients distribution. The topological coefficient was plotted against the
number of links. The topological coefficient with a given number of links (k) in the M. tuberculosis
B2H PPI network follows a power law $TC(k) = a k^{b}$ ($a = 0.976$, $b = -0.972$, $R^2 = 0.999$).
(C) Average clustering coefficient distribution. $R^2 = 0.234$ for power law fit. (D) Closeness
centrality distribution. $R^2 = 0.383$ for power law fit. (E) Shortest path length distribution. On
average, any two proteins in the M. tuberculosis B2H PPI network are connected through 4.31 links.
(F) Stress distribution. The stress of a node n is the number of shortest paths passing through n.
The stress distribution gives the number of nodes with stress s for different values of s. The
values for the stress are grouped into bins whose size grows exponentially by a factor of 10. The
bins used for this distribution are {0}; [1, 10); [10, 100); ...
Topology of the M. tuberculosis PPI Network. The topological parameters of PPI networks were
calculated using Network Analysis29 (Figure 3). The average shortest path length was 4.31, network
diameter 10, average number of neighbors 5.509, and clustering coefficient 0.006. A complete list of
network parameters for the M. tuberculosis B2H PPI network and several prokaryotic reference
networks can be found in Supporting Information Table S1 and Figure S3. The first three topological
features were consistent with our current understanding of PPI networks in prokaryotic
organisms4,12,13,23-25 (Supporting Information Table S1). However, the M. tuberculosis network had a
clustering coefficient consistent only with that of Synechocystis sp.11 and lower than those of the
other reference networks (Supporting Information Table S1). The clustering coefficients of different
data sets generated from the pull-down assay tended to be close to one another (0.064-0.079), while
the two-hybrid-based screens showed apparent variance (0.006-0.233).
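The four summary parameters quoted above (average shortest path length, diameter, average number of neighbors, clustering coefficient) can be computed from an edge list with breadth-first search. The following is a minimal sketch on a toy graph, not the software used in the study. It assumes an undirected simple graph, averages path lengths over connected pairs only, and assigns a clustering coefficient of 0 to nodes with fewer than two neighbors; the node names and function names are hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets; self-interactions are dropped (assumption)."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_distances(adj, source):
    """Shortest path lengths (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def topology_summary(edges):
    adj = build_adjacency(edges)
    total = pairs = diameter = 0
    for node in adj:
        for other, d in bfs_distances(adj, node).items():
            if other != node:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    coeffs = []
    for node, nbs in adj.items():
        k = len(nbs)
        if k < 2:
            coeffs.append(0.0)  # convention assumed for degree-0/1 nodes
            continue
        # Count links among the neighbors of this node.
        links = sum(1 for u in nbs for v in nbs if u < v and v in adj[u])
        coeffs.append(2 * links / (k * (k - 1)))
    return {
        "avg_shortest_path": total / pairs,
        "diameter": diameter,
        "avg_neighbors": sum(len(nbs) for nbs in adj.values()) / len(adj),
        "clustering": sum(coeffs) / len(coeffs),
    }


# Hypothetical toy graph: a triangle A-B-C with a pendant node D attached to C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
summary = topology_summary(toy_edges)
```

The all-pairs search is quadratic in the number of proteins, which is acceptable for a network of a few thousand nodes but would need a dedicated graph library for much larger ones.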
Figure 4. Coexpression/copurification and SPR assays for verifying bacterial two-hybrid interactions. (A) A scheme of coexpression/
copurification method for PPI verification. (B) A scheme of SPR method for PPI verification. (C) Coexpression/copurification assay. The
pair of potential interaction genes was cloned into the pHEX-derived and pGEX-derived vectors. The E. coli BL21 (DE3) cotransformants
with a pair of recombinant vectors were grown at 37 °C, and protein expression was induced. The cells were sonicated, and the
supernatants were loaded onto a 15 mL glutathione (GST) column for copurification. The copurified sample was run on SDS-PAGE for
Western blot analysis. A Western blot analysis was conducted using primary antibody (antisera to His-tag) (1:1000) and secondary
antibody IgG-HRP (goat anti-Rabbit) (1:10000). Representative data are shown. (D) SPR assay. His-tagged M. tuberculosis protein was
immobilized onto the NTA chips. Another GST-tagged protein, to be used as the ligand, was diluted in the HBS buffer [10 mM Hepes
(pH 7.4), 150 mM NaCl, 50 µM EDTA, 5 mM ATP, 0.005% BIAcore surfactant P20] at a concentration of <200 nM and injected at 10
µL/min for 5 min at 25 °C. GST protein was used as a negative control. Each analysis was performed three times. An overlay plot was
produced, and representative curves are shown.
identify conserved subnetworks between C. jejuni and H. pylori; this may be caused by the low
interactome coverage of H. pylori PPI networks.4 Notably, among the 12 conserved subnetworks between
M. tuberculosis and Synechocystis sp. were several groups of regulatory proteins that were obviously
enriched, such as sigma factors (SigB, SigE, SigH, and SigF), two-component signal proteins (MtrA,
MtrB, DevR, PhoP, and KdpD), serine/threonine protein kinases (PknD, PknJ, and
Figure 5. Representative examples of conserved subnetworks between M. tuberculosis and other reference species. Conserved
subnetworks identified by NetworkBlast. Red nodes represent the proteins from M. tuberculosis, and green nodes represent the proteins
from corresponding reference species. Blue dashed lines indicate cross-species sequence similarity between proteins. (A) Conserved
subnetworks between M. tuberculosis and C. jejuni. (B) Conserved subnetworks between M. tuberculosis and E. coli. (C) Conserved
subnetworks between M. tuberculosis and Synechocystis sp. (D) Conserved subnetworks between M. tuberculosis and T. pallidum. A
complete list of conserved subnetworks identified by NetworkBlast in this study is available in the Supporting Information.
PknK), and cyclic-di-GMP small molecular signal proteins (Rv1354c and Rv1357c) (Figure 5 and
Supporting Information Figure S4). All of these represent the dominant regulation mechanisms in
prokaryotic organisms. Our results suggest that these proteins, through their involvement in the
conserved interactions, may function as protein complexes or in the same pathway.
Linking PPI Networks to Protein Secretion Pathways. ESAT-6 and CFP-10 are two major secreted
antigens of M. tuberculosis and are also involved in protein secretion.42 ESAT6 and CFP10-like
proteins and 408 PPIs were included in our global network among 22 ESAT6 and CFP10-like gene
clusters in the genome of M. tuberculosis17 (Supporting Information Table S4). Of these, 22 proteins
formed a subnetwork, most of which interacted with multiple other proteins, indicating that
different pairs of ESAT6 and CFP10-like proteins may cross-regulate during the virulence secretion
of the pathogen (Figure 6A). A linkage between two major M. tuberculosis ESX protein secretion
systems, ESX-1 and ESX-5, was also observed (Figure 6B). Several family proteins in the subnetwork
were found to interact with these two ESX system proteins, including eight ESAT6 and CFP10-like
proteins, three secretion proteins, one PE/PPE protein, two transmembrane proteins, two lipid
transport and metabolism proteins, one regulatory protein, and eight hypothetical proteins (Figure
6B).
In addition, 37 reported secretion proteins were included in a PPI subnetwork, and these directly
interacted with ESX systems or ESAT6 and CFP10-like proteins (Figure 6C). There were also 20
transmembrane proteins, 6 PE/PPE proteins, 3 lipid transport and metabolism proteins, 4 regulatory
proteins, and 9 hypothetical proteins included (Figure 6C). This indicates that these related
proteins might be involved in protein secretion through coupling with ESX system proteins, ESAT6,
and CFP10-like proteins in M. tuberculosis. Multiple M. tuberculosis ESX systems may coordinate
through protein-protein interactions and cooperatively serve in bacterial protein secretion. This
function would be consistent with the previously observed crosstalk between two different ESX
systems in a mycobacterial species, M. marinum.43
PPI Network Analysis Revealed Two WhiB-Like Core Regulators. Two WhiB-like regulators, WhiB3 and
WhiB7, were found among the top 10 most highly connected proteins in the global PPI network (Figure
1D). In total, 58.2% (96/165) of these interactions could be validated by the
coexpression/copurification method (Supporting Information Table S5), which is very close to the
overall success rate of the PPI network (60%).
WhiB3 has been recently characterized to have extensive involvement in the regulation of lipid
metabolism and in response to environmental stresses.44 In the present study, WhiB3 interacted with
113 proteins including 8 lipid transport and metabolism proteins, 3 oxidoreductase proteins, 4
regulatory proteins, and a number of transport-related proteins, such as 4 membrane transport
proteins, 3 PE/PPE proteins, and 15 secretion proteins (Figure 7). In all, 29 hypothetical proteins
and other function proteins were found to link with WhiB3 (Figure 7).
In contrast, WhiB7 is linked to drug resistance and persistent infection.45 WhiB7 was involved in 52
PPIs that included 2 drug resistance proteins (TrpG and Rv1260), 4 oxidoreductase proteins, 3
persistent infection proteins, 4 regulatory proteins, and 4 protein kinases (Figure 7).
Interestingly, both WhiB proteins in the subnetwork interacted with the same hypothetical protein,
Rv2418c. Although each WhiB-like protein may play different roles during infection of M.
tuberculosis, our data
Figure 6. PPI network and protein secretion pathway. The local network was constructed based on the data from our bacterial two-
hybrid experiments. Several major protein families are indicated in different colors. (A) The PPI network of 11 pairs of ESAT6/CFP10-
like proteins. (B) The PPI network involved in ESX1 and ESX5 system proteins. (C) A local network of some reported secretion proteins
in direct linkage with ESX systems or ESAT6 and CFP10-like proteins.
Figure 7. Local network involved in WhiB3 and WhiB7. The local network was constructed based on the data from our bacterial two-
hybrid experiments. Several major protein families are indicated in different colors.
suggest that these two proteins, WhiB3 and WhiB7, could crosstalk and cooperatively regulate the
persistent infection of the pathogen through a master protein, Rv2418c.
Discussion
This study introduces the systematic bacterial two-hybrid analysis of the human pathogen proteins.
On the basis of ORFeome cloning and extensive library screening, we were able to identify 8042
protein-protein interactions connecting the 2907 bacterial proteins, which represented about 74.1%
of the functional proteins encoded by the entire genome. The M. tuberculosis B2H PPI network was
separated into seven connected components, with the largest component accounting for 2895 (99.59%)
proteins from the whole PPI network.
The yeast two-hybrid system has been used as a powerful tool for PPI identification.4-13 However,
yeast has certain limitations, especially with respect to the assay of bacterial protein
interactions. The development of bacterial two-hybrid systems now allows proteins to be assayed for
interactions under conditions that closely match their native environment.46-48 In the current
study, we successfully constructed a global PPI network of the pathogenic bacterium, M. tuberculosis
H37Rv, using the bacterial two-hybrid technique. We identified more than 8000 mostly novel
interactions in the pathogen.
In a PPI network, the term degree represents the number of proteins that interact with another
specific protein. The degree distribution of the M. tuberculosis network proteins slowly decreased,
leading to the generation of a pattern similar to that found in the other model organisms.49 The
calculated average shortest path length (4.31) within the largest network (Figure 3E) was close to
the value presented for other prokaryotic and even human PPI networks.4,10,12,13,23-25 This
indicates that the network property might be a general characteristic. Most interestingly, the
topological coefficient of the M. tuberculosis B2H network perfectly followed a power law
distribution (correlation = 0.999; R-squared = 0.999) and represented the best fit in all currently
available PPI networks (Supporting Information Table S3).
Very few experimentally validated protein-protein interactions have been reported in M.
tuberculosis; nevertheless, some of the few interactions that have been reported were found in the
current global PPI network.15,50-54 However, among 8042 pairs of PPIs, we observed only about 90
conserved interactions that were also found in other model organisms (Figure 5 and Supporting
Information Figure S6). This is relatively low but is similar to what has been reported in other
previous studies on PPIs.10,11,25 It also demonstrates that the newly produced PPIs in this study
were mostly novel interactions.
The ORFeome of an organism is useful for various reverse proteomics studies, including global
analysis of protein-protein interaction and the identification of functional genes. In the present
study, a high coverage rate and a relatively small library have been achieved by cloning the ORFeome
from predicted genes in the pathogenic genome. This offered a tremendous advantage for extensive
characterization of protein-protein interactions within the genome. This is the first-ever report on
the ORFeome cloning in M. tuberculosis in a bacterial two-hybrid vector. A small number of ORFs
(<0.9%) could not be successfully amplified (Supporting Information Table S6). This is probably due
to sequencing errors in the genome project or simply because the best PCR conditions have not yet
been found.
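The power law fits discussed for the degree distribution and the topological coefficient can be reproduced in outline with a least-squares line on log-log axes. The sketch below is one common way to obtain a, b, and R-squared for y = a * k^b. The paper does not state how its fits were computed, so fitting in log space (and reporting R-squared in log space) is an assumption, and the data points are synthetic values generated from the reported topological-coefficient parameters rather than measurements.

```python
import math


def fit_power_law(ks, ys):
    """Fit y = a * k**b by linear regression of log(y) on log(k).

    Returns (a, b, r_squared), with r_squared computed in log space.
    All k and y values must be strictly positive.
    """
    xs = [math.log(k) for k in ks]
    ls = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(ls) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, ls))
    b = sxy / sxx                   # slope on log-log axes = exponent
    log_a = mean_l - b * mean_x     # intercept on log-log axes = log(a)
    ss_res = sum((l - (log_a + b * x)) ** 2 for x, l in zip(xs, ls))
    ss_tot = sum((l - mean_l) ** 2 for l in ls)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot


# Synthetic, noise-free points generated from TC(k) = 0.976 * k**(-0.972).
ks = [1, 2, 4, 8, 16, 32]
ys = [0.976 * k ** -0.972 for k in ks]
a, b, r_squared = fit_power_law(ks, ys)
```

On noise-free input the fit recovers the generating parameters; on real binned data, zero counts must be removed before taking logarithms, and the R-squared value depends on whether it is evaluated in log space or on the original scale.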