Professional Documents
Culture Documents
Gkad 445
Gkad 445
Gkad 445
Received March 18, 2023; Revised May 01, 2023; Editorial Decision May 09, 2023; Accepted May 11, 2023
* To
whom correspondence should be addressed. Tel: +49 3641 944 320; Email: udo.hahn@uni-jena.de
Correspondence may also be addressed to Sascha Schäuble. Tel: +49 3641 532 1152; Fax: +49 3641 532 2152; Email: sascha.schaeuble@leibniz-hki.de
C The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which
permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
W238 Nucleic Acids Research, 2023, Vol. 51, Web Server issue
Table 1. Current numbers of molecular interactions in GEPI. This in- After the retrieval process, GEPI results are presented
cludes events with one and two arguments in a dashboard. This includes aggregated overview pan-
Interactions Unique interactions els and a table of complete primary data where each re-
trieved interaction is listed individually within its document
PubMed 14 441 503 411 500 context. The aggregation panels summarize the publication
PubMed Central 52 144 387 1 032 377
PM+PMC 66 585 890 1 204 986 state with respect to the input and include frequency-sorted
pie and bar charts of interaction partners and Sankey dia-
grams revealing interaction frequencies. We distinguish two
Supplementary Material S1. An evaluation report on the types of Sankey diagrams. The first shows the most fre-
gene interaction extraction components is provided in Sup- quent interactions in the retrieval result. The second sum-
plementary Material S2. Evaluation results are shown in marizes second-degree interactions, offering information
Supplementary Tables S1 and S2. on interaction partners occurring in more complex inter-
actions than provided in the output table. Finally, the table
from PM and the PMC OA subset. STRING’s text min- the 5’ adenosine monophosphate-activated protein kinase
ing facilities collect information for the interaction of pro- (AMPK) complex (31). By using full-text filter capabilities,
tein pairs in two fundamental ways. Firstly, statistical meth- we identified interactions between members of the Akt fam-
ods are leveraged to find over-represented proteins in docu- ily and pyruvate dehydrogenase kinase 2 (PDK2) in the con-
ments with respect to the user input. This is used to calculate text of cellular stress (32). We also used our NLP engine to
an aggregated measure of confidence that a unique pair of gain literature-based insight into the current state of poten-
proteins is associated in general, instead of identifying ex- tial interaction partners on kinases of the PI3K/Akt sig-
plicit association descriptions of a protein pair in individual naling network based on quantitative phosphoproteomics
documents (5). This can be witnessed in STRING’s text min- as input for our NLP service (33). We furthermore com-
ing viewer where at least two input proteins are highlighted bined results from our event extraction system with large-
in each citation but are not necessarily mentioned in any scale gene expression analysis and multiple validation ex-
interaction. Secondly, a modern NLP approach is used to periments to generate a molecular signature of the activa-
find protein pairs explicitly described to physically interact, tion of aryl hydrocarbon receptor (AHR) (34). Finally, our
regular updates to keep up with newest developments in 6. Franceschini,A., Szklarczyk,D., Frankild,S.P., Kuhn,M.,
the literature. We presented powerful query and filtering Simonovic,M., Roth,A., Lin,J., Minguez,P., Bork,P., von Mering,C.
et al. (2013) String v9.1 : protein-protein interaction networks, with
options for contextualized retrieval of interactions between increased coverage and integration. Nucleic Acids Res., 41,
genes, proteins, families and protein complexes. The interac- D808–D815.
tion results can be downloaded as an Excel workbook that 7. Snel,B., Lehmann,G., Bork,P. and Huynen,M.A. (2000) String : a
contains all identified relation pairs and statistics about the Web-server to retrieve and display the repeatedly occurring
frequency of the interaction partners and the interactions neighbourhood of a gene. Nucleic Acids Res., 28, 3442–3444.
8. Huang,C.-C. and Lu,Z. (2016) Community challenges in biomedical
themselves. For each result interaction, its textual source text mining over 10 years: success, failure and the future. Brief.
is provided for cross-checking. The NLP pipelines we pre- Bioinform., 17, 132–144.
sented here and their accessibility via a Web application 9. Gerner,M., Sarafraz,F., Bergman,C. and Nenadic,G. (2012)
opens our NLP service to the broad life science community BioContext: an integrated text mining system for large-scale
extraction and contextualization of biomolecular events.
to effectively foster new scientific discoveries. Bioinformatics, 28, 2154–2161.
27. The Gene Ontology Consortium (2000) Gene Ontology: tool for the 32. Heberle,A.M., Razquin Navas,P., Langelaar-Makkinje,M.,
unification of biology. Nature Genetics, 25, 25–29. Kasack,K., Sadik,A., Faessler,E., Hahn,U., Marx-Stoelting,P.,
28. Zoran,T., Seelbinder,B., White,P., Price,J., Kraus,S., Kurzai,O., Opitz,C.A., Sers,C. et al. (2019) The PI3K and MAPK/p38 pathways
Linde,J., Häder,A., Loeffler,C. et al. (2022) Molecular profiling control stress granule assembly in a hierarchical manner. Life Sci.
reveals characteristic and decisive signatures in patients after Alliance, 2, e201800257.
allogeneic stem cell transplantation suffering from invasive 33. Reimann,L., Schwäble,A.N., Fricke,A.L., Mühlhäuser,W.W.D.,
pulmonary aspergillosis. J. Fungi, 8, 171. Leber,Y., Lohanadan,K., Puchinger,M.G., Schäuble,S., Faessler,E.
29. Weis,S., Carlos,A.R., Moita,M.R., Singh,S., Blankenhaus,B., et al. (2020) Phosphoproteomics identifies dual-site phosphorylation
Cardoso,S., Larsen,R., Rebelo,S., Schäuble,S., Del Barrio,L. et al. in an extended basophilic motif regulating FILIP1-mediated
(2017) Metabolic adaptation establishes disease tolerance to sepsis. degradation of filamin-C. Commun. Biol. [Nature], 3, 253.
Cell, 169, 1263–1275. 34. Sadik,A., Somarribas Patterson,L.F., Öztürk,S., Mohapatra,S.R.,
30. Hahn,U., Matthies,F., Faessler,E. and Hellrich,J. (2016) Panitz,V., Secker,P.F., Pfänder,P., Loth,S., Salem,H., Prentzell,M.T.
UIMA-based JCoRe 2.0 goes GitHub and Maven Central: et al. (2020) IL4|1 is a metabolic immune checkpoint that activates the
State-of-the-art software resource engineering and distribution of AHR and promotes tumor progression. Cell, 182, 1252–1270.
NLP pipelines. In: LREC 2016 –– Proceedings of the 10th 35. Thürmann,L., Klös,M., Mackowiak,S.D., Bieg,M., Bauer,T.,
C The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which
permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.