Wittig 2017
PII: S0168-1656(17)30288-2
DOI: http://dx.doi.org/doi:10.1016/j.jbiotec.2017.06.007
Reference: BIOTEC 7916
Please cite this article as: Wittig, Ulrike, Rey, Maja, Weidemann, Andreas, Müller,
Wolfgang, Data management and data enrichment for systems biology projects. Journal
of Biotechnology http://dx.doi.org/10.1016/j.jbiotec.2017.06.007
This is a PDF file of an unedited manuscript that has been accepted for publication.
As a service to our customers we are providing this early version of the manuscript.
The manuscript will undergo copyediting, typesetting, and review of the resulting proof
before it is published in its final form. Please note that during the production process
errors may be discovered which could affect the content, and all legal disclaimers that
apply to the journal pertain.
Highlights
SABIO-RK: Manually-curated kinetic data for modellers and experimentalists
Excemplify: Excel sheet handling for experimentalists
SEEK: Data and model management for systems biology projects
Abstract
Collecting, curating, interlinking, and sharing high quality data are central to de.NBI-SysBio,
the systems biology data management service center within the de.NBI network (German
Network for Bioinformatics Infrastructure). The work of the center is guided by the FAIR
principles for scientific data management and stewardship. FAIR stands for the four
foundational principles Findability, Accessibility, Interoperability, and Reusability which were
established to enhance the ability of machines to automatically find, access, exchange and
use data.
Within this overview paper we describe three tools (SABIO-RK, Excemplify, SEEK) that
exemplify the contribution of de.NBI-SysBio services to FAIR data, models, and experimental
methods storage and exchange. The interconnectivity of the tools and the data workflow within
systems biology projects will be explained. For many years we have been the German partner in the
FAIRDOM initiative (http://fair-dom.org) to establish a European data and model management
service facility for systems biology.
1. Introduction
The increasing amount of data does not necessarily entail an increasing amount of knowledge.
The real problem is that we have failed to store and organize much of the rapidly accumulating
information (whether in databases or documents) in rigorous, principled ways, so that finding
what we want and understanding what's already known become exhausting, frustrating,
stressful and increasingly costly experiences (Attwood et al., 2009). To use and reuse
data, they need to be stored, organized and communicated in a structured and standardized
format. Today, the FAIR principles sum up what data organization should be: Findable,
Accessible, Interoperable, and Reusable (Wilkinson et al., 2016). All of these principles, except
for Accessibility, rely on data quality. Biocuration is a key to data quality (Bateman, 2010):
Findability is enhanced by using standard identifiers and annotations which point to standard
ontologies and databases. The same applies to the use of controlled vocabularies. This allows
answering questions that arise from ambiguous information, for example: Does the
abbreviation "Glu" in one document have the same meaning as in another document? An identifier
based on standards determines unambiguously whether "Glu" represents glucose or
glutamate.
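The role of identifiers in resolving such ambiguity can be illustrated with a minimal sketch. The document names, the annotation tables and the identifiers of the form EXAMPLE:0001 are placeholders invented for this illustration; they are not accessions from any real ontology or database.

```python
# Hypothetical annotation tables. The identifiers are placeholders,
# not real database accessions.
ANNOTATIONS = {
    "doc_A": {"Glu": "EXAMPLE:0001"},  # the authors of doc_A mean glucose
    "doc_B": {"Glu": "EXAMPLE:0002"},  # the authors of doc_B mean glutamate
}
NAMES = {"EXAMPLE:0001": "glucose", "EXAMPLE:0002": "glutamate"}


def resolve(doc_id: str, abbreviation: str) -> str:
    """Return the standard name behind an abbreviation in a given document."""
    identifier = ANNOTATIONS[doc_id][abbreviation]
    return NAMES[identifier]


def same_meaning(abbreviation: str, doc_1: str, doc_2: str) -> bool:
    """True if both documents annotate the abbreviation with the same identifier."""
    return ANNOTATIONS[doc_1][abbreviation] == ANNOTATIONS[doc_2][abbreviation]
```

Without the annotation, the string "Glu" alone carries no information about which compound is meant; with it, the comparison reduces to an identifier equality check.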
Because publications are largely unstructured, a large amount of manual work
by biological experts is still needed to understand the whole publication. Analyzing just
sentence by sentence or paragraph by paragraph with text mining tools is not sufficient. At the
moment, natural language processing tools for automatic data extraction and text
understanding are not able to fulfill our requirements. SABIO-RK, for example, contains about
250 database fields which have to be filled with information about enzymes, proteins,
compounds, reactions, parameters etc. Currently no text mining tool is able to extract this
comprehensive amount of information from a publication (Karp, 2016b).
One main challenge in information extraction from publications is how exactly
the entities (e.g. compounds, proteins, enzymes) can be identified within an article,
which relates to the FAIR principle Findability. The usage of unique identifiers and
standard naming given by ontologies, controlled vocabularies and databases is essential for an
unambiguous data assignment, but in most articles unique identifiers and controlled
vocabularies are missing (Wittig et al., 2014a,b). As a solution, journal editors should
encourage authors to use complete, standardized and structured data in their publications.
Collaborations between publishers and database developers to agree on common standards
and data formats are preferable for the future. In addition to that, experimental results could
be collected electronically and automatically uploaded to databases or data management
systems including all relevant standardized metadata for documentation, exchange and further
usage of the data (see the sections on Excemplify and SEEK for more details).
The user can therefore choose whether to enter queries into the main search bar (i.e. simple query
specification) or the advanced search field (i.e. precise query specification). All queries are
entered by the system into the main search bar, where they can be cut, pasted, and extended by hand.
When browsing the query result, which is presented in either Entry View or Reaction View, the
Visual Search allows the user to further restrict the query interactively by clicking in the diagram, e.g. to
select a specific organism, tissue, or kinetic parameter type.
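How a precise query might be composed and then narrowed step by step can be sketched as follows. The field names (Organism, Tissue) and the Field:"value" AND ... syntax are assumptions made for this illustration and are not taken from the SABIO-RK documentation.

```python
def build_query(criteria: dict[str, str]) -> str:
    """Join field/value pairs into one 'Field:"value" AND ...' string.

    The syntax is an assumption for illustration only.
    """
    parts = [f'{field}:"{value}"' for field, value in criteria.items()]
    return " AND ".join(parts)


def restrict(query: str, field: str, value: str) -> str:
    """Narrow an existing query by one more criterion,
    as a click in a diagram might do."""
    extra = f'{field}:"{value}"'
    return f"{query} AND {extra}" if query else extra
```

Because the query is kept as plain text, it can be copied, pasted and extended by hand as well as by the interface.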
Users can choose to export data; the selection is visualised as a shopping cart in the upper right
corner. For example, Figure 3 shows that all 234 database entries for the first reaction in the
list are selected for export. By clicking on the shopping cart, the data can be exported as a
spreadsheet, SBML, or BioPAX. This workflow is simple, well known from shopping
applications, and has a high user acceptance.
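The cart-and-export idea can be sketched for the spreadsheet case as follows. This is a minimal illustration and not the SABIO-RK implementation: the entry fields (entry_id, parameter) are hypothetical, and CSV stands in for a spreadsheet format.

```python
import csv
import io


def export_cart_as_csv(cart: list[dict[str, str]]) -> str:
    """Write all selected entries to CSV text with one column per field.

    Fields missing from an entry are left empty.
    """
    fieldnames = sorted({key for entry in cart for key in entry})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(cart)
    return buffer.getvalue()
```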
Within this section, we have described SABIO-RK as a hand-curated data source in which a high
degree of manual curation, together with an elaborate search interface and flexible
export functionality, enables easy Findability and Reusability of data. However, this degree of
curation cannot be performed on all relevant publications. This leads to efforts to distribute
the curation workload to the users of lab information and data management systems. One
such example is Excemplify, described in the following section.
at each stage the actual data entry is done via Excel, using the full freedom of the Excel user
interface. The way of working does not change much, except for using Excemplify instead of
Excel for the sheet transformation operations.
Technically, Excemplify transforms one Excel sheet into another. It has a flexible parsing
framework that breaks up sheets into regions. These can then be transformed using
appropriate transformer objects. Excemplify is a web application and users have their own
accounts. After login, using Excemplify mainly means uploading an Excel sheet to Excemplify
and receiving a transformed Excel sheet back.
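The idea of breaking a sheet into regions and applying a transformer to each can be sketched as follows. This is not the Excemplify code: the sheet is modelled as a plain list of rows rather than an Excel file, the rule that empty rows separate regions is an assumption, and all function names are hypothetical.

```python
from typing import Callable

Row = list[str]
Region = list[Row]
Transformer = Callable[[Region], Region]


def split_into_regions(sheet: list[Row]) -> list[Region]:
    """Break a sheet into regions, treating fully empty rows as separators
    (an assumed rule for this sketch)."""
    regions: list[Region] = []
    current: Region = []
    for row in sheet:
        if any(cell.strip() for cell in row):
            current.append(row)
        elif current:
            regions.append(current)
            current = []
    if current:
        regions.append(current)
    return regions


def identity(region: Region) -> Region:
    """Transformer that leaves a region unchanged."""
    return region


def transpose(region: Region) -> Region:
    """Transformer that swaps rows and columns of a rectangular region."""
    return [list(column) for column in zip(*region)]


def transform_sheet(sheet: list[Row], transformers: list[Transformer]) -> list[Region]:
    """Apply the i-th transformer to the i-th region of the sheet."""
    regions = split_into_regions(sheet)
    return [transformer(region) for transformer, region in zip(transformers, regions)]
```

Keeping the transformers as separate, interchangeable functions mirrors the description above: the parser only decides where regions are, and each region can be handled by whichever transformer is appropriate.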
A public demo version is accessible at http://sabiork.h-its.org/excemplify/.
From a data manager's point of view, this means that Excemplify receives sheets and is able
to store them. Excemplify offers a service to the user in exchange for properly annotated data.
Ironically, in the stand-alone Excemplify setup, the only stage without support from Excemplify
is the last one: collecting the final data of the experiment. Collecting the final data of the
experiment is motivated by automatic deposition: Excemplify enables the user to upload the
data in Excemplify to a connected SEEK instance, for example the FAIRDOMHub or a project
SEEK instance. This allows the data to be stored in a structured format and exchanged,
supporting Reusability in the sense of the FAIR principles.
Extensions to the Excemplify concept:
In the above paragraphs, we have discussed Excemplify as a data collection and storage tool.
In its base version, Excemplify is intended to be lightweight and only takes Excel sheets
and lists of Excel sheets as input and output. The tool explicitly tries to avoid duplicating functionality of other tools,
in particular data exploration functionality. However, it has turned out that users want to
use the software differently. Many users want to explore their data before they share them with
a wider audience. This motivated adding such functionality to Excemplify, including the
interactive graphical display of spreadsheet data (see Figure 5).
Figure 5: Excemplify screenshot containing the graphical representation of example
immunoblot data.
A positive side effect of supporting experimentalists in handling their different Excel sheets,
from the planning of the experimental setup through to the storage of the experimental
results, is that all relevant metadata can be stored, processed and passed to the next experimental
stage. Metadata such as the biological sources, protocols, or experimental background information
are mandatory for the setup planning. They are therefore passed through all phases of the
experiment and finally stored and exchanged together with the experimental results using the
SEEK/FAIRDOMHub data management system. This allows broader Accessibility and Findability,
makes the data interoperable with other data, and thus improves Reusability.
controlled access of digital assets (data, models, SOPs) and secure sharing between project
partners or keeping preliminary data private.
SEEK situates itself between lab notebooks on one side and data publication systems
designed around datasets (like http://zenodo.org, http://figshare.com) on the other. Within SEEK, the focus
is on a project and its outcomes in relation to the people who created these outcomes. SEEK in turn
works well with related systems: it can be linked with lab notebooks and can
publish research objects to Zenodo.
A comprehensive overview of other data management tools and data collections besides
SEEK is given by Wruck et al. (2014).
In the following we describe (i) the yellow pages, in which programmes, projects, institutions,
and scientists can present themselves and can be found by their methods and research
interests, (ii) models, SOPs, and experimental data that can be associated with their creators
and contributors, and (iii) the Investigation, Study, Assay structuring of the data, which makes the data
much more intelligible.
The yellow pages include information about programmes and projects, institutions, and
registered people with contact information, methods, and research interests. It is easy to get
an overview: Who uses the same methods? Who might have run into similar problems and
can discuss them? Who could be a collaboration partner?
Digital assets, i.e. models, SOPs, and experimental data, can either be uploaded to SEEK or
registered. Uploading means that an actual copy of the data is made and stored
within SEEK. Registering means that a link to the data item is established. Both options
make sense: Uploading is best for small to medium-sized data. When uploading data, SEEK also
provides versioning, and the FAIRDOMHub provides a backup service for the data. Registering
means that the holder of the data remains responsible for the data. However, because the metadata is centrally
stored, the data can still be interlinked with data in SEEK. This way of sharing makes sense in
particular if the data is very big or if there are data mobility restrictions due to regulations.
Whether uploaded or registered, users are able to share any type of data file and interlink it, for
example, with publications, SOPs (Standard Operating Procedures), events or collaboration
partners.
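The difference between uploading and registering can be sketched as a small data structure. This is an illustration only and not the SEEK data model: the class and function names are hypothetical, and the rule that a registered asset cannot receive new versions inside the system is an assumption derived from the statement that its holder remains responsible for the data.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Asset:
    title: str
    # Stored copies, one per version (uploaded assets only).
    content_versions: list[bytes] = field(default_factory=list)
    # Link to the external data item (registered assets only).
    external_url: Optional[str] = None

    @property
    def is_registered(self) -> bool:
        return self.external_url is not None

    @property
    def version(self) -> int:
        return len(self.content_versions)


def upload(title: str, content: bytes) -> Asset:
    """Store an actual copy of the data as version 1."""
    return Asset(title=title, content_versions=[content])


def register(title: str, url: str) -> Asset:
    """Store only metadata and a link; the data stays with its holder."""
    return Asset(title=title, external_url=url)


def add_version(asset: Asset, content: bytes) -> None:
    """Add a new version of an uploaded asset.

    Assumption of this sketch: registered assets are versioned by their holder.
    """
    if asset.is_registered:
        raise ValueError("registered assets are managed by their holder")
    asset.content_versions.append(content)
```

In both cases the title and other metadata live in the same central record, which is what allows a registered asset to be interlinked in the same way as an uploaded one.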
SEEK offers versioning of uploaded data files, models, and SOPs for documentation and
reproducibility. A predefined set of general metadata (for more details see
http://docs.seek4science.org/help/metadata-guidelines.html)
is automatically assigned to all data uploaded to or created in SEEK (e.g. title, project, version number, people involved). Beyond these
automatically generated metadata, users are responsible for more specific metadata related to
their specific data. In line with the FAIR principles, the more metadata the user provides
for the assets in SEEK, the easier it is to find them and to compare them with other assets.
SEEK excels in its handling of spreadsheets. Excel files can be browsed online, and such files
can be turned into semantic-web-enabled templates using the RightField tool (Wolstencroft et
al., 2011). The resulting templates contain ontology information. They are easier for the user
to fill in and at the same time more valuable for reuse. The JERM (Just Enough Result Model)
ontology used in many such templates has been developed to cater for the users' needs and
to interlink relevant terms with existing ontologies.
To structure different experiments and relate them to each other, the standard ISA
(Investigation-Study-Assay)-structure is available. An investigation represents the general
project/experiment context, a study stands for a smaller unit of experiment and an assay gives
specific analytical measurements to build an extensible and hierarchical structure of
experiments within projects (Sansone et al., 2012). Data files, models and SOPs can be
interlinked with assays to connect the results, protocols or models with the experimental ISA-
structure. Figure 6 shows a graphical representation of an example ISA-structure (3 columns
in the left part of the graph) in FAIRDOMHub connected with related data files, models and
SOPs (right column in the graph). The color coding makes it possible to distinguish between
investigations, studies, assays, models, SOPs, and different file formats.
Figure 6: FAIRDOMHub screenshot containing an example ISA-structure and
connected data files for de.NBI workshop hands-on material about model management
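The hierarchical ISA-structure described above can be sketched with three nested record types. This is a minimal illustration and not the SEEK or ISA-tools data model: the class names follow the ISA terms, while the attribute names and the representation of linked assets as plain strings are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Assay:
    title: str
    # Titles of linked data files, models and SOPs.
    assets: list[str] = field(default_factory=list)


@dataclass
class Study:
    title: str
    assays: list[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: list[Study] = field(default_factory=list)

    def all_assets(self) -> list[str]:
        """Collect every asset linked anywhere below this investigation."""
        return [
            asset
            for study in self.studies
            for assay in study.assays
            for asset in assay.assets
        ]
```

Because assets are attached at the assay level, each result, protocol or model keeps its experimental context when it is reached from the investigation.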
Data in SEEK can be displayed within the web interface if the file format is supported (e.g.
Excel, Word, PDF), downloaded to local machines or accessed automatically using RESTful
web services. Models stored in SEEK can be developed and validated using the integrated
JWS simulation tool. Peters et al. (2017) also describe the new SED-ML support of SEEK.
SEEK, RightField and associated tools make it possible to self-curate FAIR data. However,
such tools should be complemented with appropriate services. These range from
help-to-help-yourself services (e.g. template building, curation advice) and training to a full curation
service. To an extent, the quality of these additional services determines how FAIR the
resulting data will be.
The possible data workflow for systems biology data in Figure 7 reflects all four FAIR principles:
Findability, Accessibility, Interoperability, and Reusability by ensuring that public data and
models in SABIO-RK and SEEK/FAIRDOMHub are (i) searchable for the community, (ii)
accessible by other researchers, (iii) stored and exchanged in standard formats, and (iv)
reusable by other researchers.
6. Development approach
We would like to stress that all the tools described here benefit greatly from a development
approach that tries to find out how tools will be used and shapes each tool in agreement with its
users. To this end, FAIRDOM has a set of PALs, users who participate in discussions about new
features, make suggestions, test features and give feedback. These user interactions are
7. Summary
Data are most useful if they are findable, accessible, exchangeable and reusable. If
experimental results are published but not stored and organized in a structured format that allows
further use, their scientific impact is reduced. The challenge for scientific data management is to
sensitize experimentalists to view FAIR publishing of data as a natural extension of publishing
results to the scientific community.
The services offered by the de.NBI-SysBio center mainly include data management and data
enrichment. The SEEK data management system, together with tools like Excemplify, supports
experimentalists in the laboratory with easy-to-use tools for data handling. SABIO-RK mainly
uses already published data, offers them in a structured format, and enriches them to enhance,
refine or improve the data.
In the near future we will work on further facilitating the integration of our tools into existing
workflows. In particular, we are interested in workflows in which collecting data early on facilitates
curation, as with Excemplify. At the same time we plan to work on improving the literature
curation processes. The goal is to facilitate concentration on the intellectual challenges of the
curation work at hand. All of this work benefits from integration into the infrastructures
community (e.g. de.NBI, ELIXIR, FAIRDOM) and into standardisation efforts (e.g. COMBINE
(Hucka et al., 2015), STRENDA (Apweiler et al., 2005)).
Acknowledgements
The authors gratefully acknowledge the collaboration partners, especially the group of Carole
Goble at the University of Manchester (UK) and the group of Ursula Klingmueller at the German
Cancer Research Center (Germany). Special thanks go to our users for their feedback during
the development processes and for many discussions about their requirements. We also wish
to thank our collaborators in the projects we are part of, among them our neighbors in the
de.NBI-ModSim and the other de.NBI projects, as well as our FAIRDOM partners. The projects
are financed by the Klaus Tschira Foundation (http://www.klaus-tschira-stiftung.de/), the
German Federal Ministry of Education and Research (http://www.bmbf.de/) within de.NBI
(031A540), ERASysAPP (031A525), SysMO-DB, SysMO-DB 2 (0315781), Virtual Liver
Network (0315749), SBEpo (0316182E); and the DFG LIS (http://www.dfg.de/) as part of the
project Integrierte Immunoblot Umgebung.
References
Apweiler, R., Cornish-Bowden, A., Hofmeyr, J.H., Kettner, C., Leyh, T.S., Schomburg, D., Tipton, K., 2005. The
importance of uniformity in reporting protein-function data. Trends Biochem Sci 30(1) 11-2.
Attwood, T.K., Kell, D.B., McDermott, P., Marsh, J., Pettifer, S.R., Thorne, D., 2009. Calling International Rescue:
knowledge lost in literature and data landslide! Biochem J. 424(3):317-33.
Bateman, A., 2010. Curators of the world unite: the International Society of Biocuration. Bioinformatics 26(8):991.
Bauch, A., Adamczyk, I., Buczek, P., Elmer, F.-J., Enimanev, K., Glyzewski, P., Kohler, M., Pylak, T., Quandt, A.,
Ramakrishnan, C., Beisel, C., Malmström, L., Aebersold, R., Rinn, B., 2011. openBIS: a flexible framework for
managing and analyzing complex data in biology research. BMC Bioinformatics 12:468.
Bourne, P.E., Lorsch, J.R., Green, E.D., 2015. Perspective: Sustaining the big-data ecosystem. Nature, 527: S16-
S17.
Caspi, R., Altman, T., Billington, R., Dreher, K., Foerster, H., Fulcher, C.A., Holland, T.A., Keseler, I.M., Kothari, A.,
Kubo, A., Krummenacker, M., Latendresse, M., Mueller, L.A., Ong, Q., Paley, S., Subhraveti, P., Weaver, D.S.,
Weerasinghe, D., Zhang, P., Karp, P.D., 2014. The MetaCyc database of metabolic pathways and enzymes and
the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res. 42, D459-71.
Courtot, M., Juty, N., Knüpfer, C., Waltemath, D., Zhukova, A., Dräger, A., Dumontier, M., Finney, A., Golebiewski,
M., Hastings, J., Hoops, S., Keating, S., Kell, D.B., Kerrien, S., Lawson, J., Lister, A., Lu, J., Machné, R., Mendes,
P., Pocock, M., Rodriguez, N., Villeger, A., Wilkinson, D.J., Wimalaratne, S., Laibe, C., Hucka, M., Le Novère, N.,
2011. Controlled vocabularies and semantics in systems biology. Mol Syst Biol 7, 543.
Croft, D., Mundo, A.F., Haw, R., Milacic, M., Weiser, J., Wu, G., Caudy, M., Garapati, P., Gillespie, M., Kamdar,
M.R., Jassal, B., Jupe, S., Matthews, L., May, B., Palatnik, S., Rothfels, K., Shamovsky, V., Song, H., Williams, M.,
Birney, E., Hermjakob, H., Stein, L., D'Eustachio, P., 2014. The Reactome pathway knowledgebase. Nucleic Acids
Res. 42, D472-7.
de Matos, P., Alcántara, R., Dekker, A., Ennis, M., Hastings, J., Haug, K., Spiteri, I., Turner, S., Steinbeck, C., 2010.
Chemical Entities of Biological Interest: an update. Nucleic Acids Res. 38, D249-54.
Demir, E., Cary, M.P., Paley, S., Fukuda, K., Lemer, C., Vastrik, I., Wu, G., D'Eustachio, P., Schaefer, C., Luciano,
J., Schacherer, F., Martinez-Flores, I., Hu, Z., Jimenez-Jacinto, V., Joshi-Tope, G., Kandasamy, K., Lopez-Fuentes,
A.C., Mi, H., Pichler, E., Rodchenkov, I., Splendiani, A., Tkachev, S., Zucker, J., Gopinath, G., Rajasimha, H.,
Ramakrishnan, R., Shah, I., Syed, M., Anwar, N., Babur, O., Blinov, M., Brauner, E., Corwin, D., Donaldson, S.,
Gibbons, F., Goldberg, R., Hornbeck, P., Luna, A., Murray-Rust, P., Neumann, E., Ruebenacker, O., Samwald, M.,
van Iersel, M., Wimalaratne, S., Allen, K., Braun, B., Whirl-Carrillo, M., Cheung, K.H., Dahlquist, K., Finney, A.,
Gillespie, M., Glass, E., Gong, L., Haw, R., Honig, M., Hubaut, O., Kane, D., Krupa, S., Kutmon, M., Leonard, J.,
Marks, D., Merberg, D., Petri, V., Pico, A., Ravenscroft, D., Ren, L., Shah, N., Sunshine, M., Tang, R., Whaley, R.,
Letovksy, S., Buetow, K.H., Rzhetsky, A., Schachter, V., Sobral, B.S., Dogrusoz, U., McWeeney, S., Aladjem, M.,
Birney, E., Collado-Vides, J., Goto, S., Hucka, M., Le Novère, N., Maltsev, N., Pandey, A., Thomas, P., Wingender,
E., Karp, P.D., Sander, C., Bader, G.D., 2010. The BioPAX community standard for pathway data sharing. Nat
Biotechnol. 28(9):935-42.
Funahashi, A., Jouraku, A., Matsuoka, Y., Kitano, H., 2007. Integration of CellDesigner and SABIO-RK. In Silico
Biol 7, S81-90.
Gremse, M., Chang, A., Schomburg, I., Grote, A., Scheer, M., Ebeling, C., Schomburg, D., 2011. The BRENDA
Tissue Ontology (BTO): the first all-integrating ontology of all organisms for enzyme sources. Nucleic Acids Res.
39, D507-13.
The Gene Ontology Consortium, 2000. Gene ontology: tool for the unification of biology. Nat Genet 25(1):25-9.
Hucka, M., Finney, A., Sauro, H.M., Bolouri, H., Doyle, J.C., Kitano, H., Arkin, A.P., Bornstein, B.J., Bray, D.,
Cornish-Bowden, A. et al., 2003. The systems biology markup language (SBML): a medium for representation and
exchange of biochemical network models. Bioinformatics. 19, 524-31.
Hucka, M., Nickerson, D.P., Bader, G.D., Bergmann, F.T., Cooper, J., Demir, E., Garny, A., Golebiewski, M., Myers,
C.J., Schreiber, F., Waltemath, D., Le Novère, N., 2015. Promoting Coordinated Development of Community-Based
Information Standards for Modeling in Biology: The COMBINE Initiative. Front Bioeng Biotechnol. 3:19.
Kanehisa, M., Goto, S., Furumichi, M., Tanabe, M., Hirakawa, M., 2010. KEGG for representation and analysis of
molecular networks involving diseases and drugs. Nucleic Acids Res. 38, D355-60.
Kanehisa, M., Goto, S., Sato, Y., Kawashima, M., Furumichi, M., Tanabe, M., 2014. Data, information, knowledge
and principle: back to metabolism in KEGG. Nucleic Acids Res, 42, D199-205.
Karp, P.D., 2016a. How much does curation cost? Database, baw110.
Karp, P.D., 2016b. Can we replace curation with information extraction software? Database, baw150.
Le Novère, N., Bornstein, B., Broicher, A., Courtot, M., Donizelli, M., Dharuri, H., Li, L., Sauro, H., Schilstra, M.,
Shapiro, B., Snoep, J.L., Hucka, M., 2006. BioModels Database: a free, centralized database of curated, published,
quantitative kinetic models of biochemical and cellular systems. Nucleic Acids Res. 34:D689-91.
Moraru, I.I., Schaff, J.C., Slepchenko, B.M., Blinov, M.L., Morgan, F., Lakshminarayana, A., Gao, F., Li, Y., Loew,
L.M., 2008. Virtual Cell modelling and simulation software environment. IET Syst Biol 2(5) 352-62.
Peters, M., Eicher, J.J., van Niekerk, D.D., Waltemath, D., Snoep, J.L., 2017. The JWS online simulation database.
Bioinformatics, 33(10):1589-1590
Placzek, S., Schomburg, I., Chang, A., Jeske, L., Ulbrich, M., Tillack, J., Schomburg, D., 2017. BRENDA in 2017:
new perspectives and new tools in BRENDA. Nucleic Acids Res, 45, D380-8.
Ruebenacker, O., Moraru, I.I., Schaff, J.C., Blinov, M.L., 2009. Integrating BioPAX pathway knowledge with SBML
models. IET Syst Biol 3(5):317-28.
Sansone, S.A., Rocca-Serra, P., Field, D., Maguire, E., Taylor, C., Hofmann, O., Fang, H., Neumann, S., Tong, W.,
Amaral-Zettler, L., Begley, K., Booth, T., Bougueleret, L., Burns, G., Chapman, B., Clark, T., Coleman, L.A.,
Copeland, J., Das, S., de Daruvar, A., de Matos, P., Dix, I., Edmunds, S., Evelo, C.T., Forster, M.J., Gaudet, P.,
Gilbert, J., Goble, C., Griffin, J.L., Jacob, D., Kleinjans, J., Harland, L., Haug, K., Hermjakob, H., Ho Sui, S.J.,
Laederach, A., Liang, S., Marshall, S., McGrath, A., Merrill, E., Reilly, D., Roux, M., Shamu, C.E., Shang, C.A.,
Steinbeck, C., Trefethen, A., Williams-Jones, B., Wolstencroft, K., Xenarios, I., Hide, W., 2012. Toward
interoperable bioscience data. Nat Genet. 44(2):121-6.
Sayers, E.W., Barrett, T., Benson, D.A., Bolton, E., Bryant, S.H., Canese, K., Chetvernin, V., Church, D.M.,
DiCuccio, M., Federhen, S. et al., 2011. Database resources of the National Center for Biotechnology Information.
Nucleic Acids Res. 39, D38-51.
Shi, L., Jong, L., Wittig, U., Lucarelli, P., Stepath, M., Mueller, S., D'Alessandro, L.A., Klingmüller, U., Müller, W.,
2013. Excemplify: A Flexible Template Based Solution, Parsing and Managing Data in Spreadsheets for
Experimentalists. J Integrat Bioinform, 10(2):220.
Taylor, C.F., Field, D., Sansone, S.A., Aerts, J., Apweiler, R., Ashburner, M., Ball, C.A., Binz, P.A., Bogue, M.,
Booth, T., Brazma, A., Brinkman, R.R., Clark, A.M., Deutsch, E.W., Fiehn, O., Fostel, J., Ghazal, P., Gibson, F.,
Gray, T., Grimes, G., Hancock, J.M., Hardy, N.W., Hermjakob, H., Julian, R.K. Jr, Kane, M., Kettner, C., Kinsinger,
C., Kolker, E., Kuiper, M., Le Novère, N., Leebens-Mack, J., Lewis, S.E., Lord, P., Mallon, A.M., Marthandan, N.,
Masuya, H., McNally, R., Mehrle, A., Morrison, N., Orchard, S., Quackenbush, J., Reecy, J.M., Robertson, D.G.,
Rocca-Serra, P., Rodriguez, H., Rosenfelder, H., Santoyo-Lopez, J., Scheuermann, R.H., Schober, D., Smith, B.,
Snape, J., Stoeckert, C.J. Jr, Tipton, K., Sterk, P., Untergasser, A., Vandesompele, J., Wiemann, S., 2008.
Promoting coherent minimum reporting guidelines for biological and biomedical investigations: the MIBBI project.
Nat Biotechnol. 26(8):889-96.
The UniProt Consortium, 2011. Ongoing and future developments at the Universal Protein Resource. Nucleic Acids
Res. 39, D214-9.
Weidemann, A., Richter, S., Stein, M., Sahle, S., Gauges, R., Gabdoulline, R., Surovtsova, I., Semmelrock, N.,
Besson, B., Rojas, I., Wade, R., Kummer, U., 2008. SYCAMORE--a systems biology computational analysis and
modeling research environment. Bioinformatics 24, 1463-4.
Wilkinson, M.D., Dumontier, M., Aalbersberg, I.J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.W., da
Silva Santos, L.B., Bourne, P.E., Bouwman, J., Brookes, A.J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds,
S., Evelo, C.T., Finkers, R., Gonzalez-Beltran, A., Gray, A.J., Groth, P., Goble, C., Grethe, J.S., Heringa, J., 't Hoen,
P.A., Hooft, R., Kuhn, T., Kok, R., Kok, J., Lusher, S.J., Martone, M.E., Mons, A., Packer, A.L., Persson, B., Rocca-
Serra, P., Roos, M., van Schaik, R., Sansone, S.A., Schultes, E., Sengstag, T., Slater, T., Strawn, G., Swertz, M.A.,
Thompson, M., van der Lei, J., van Mulligen, E., Velterop, J., Waagmeester, A., Wittenburg, P., Wolstencroft, K.,
Zhao, J., Mons, B., 2016. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data.
3:160018.
Wittig, U., Kania, R., Golebiewski, M., Rey, M., Shi, L., Jong, L., Algaa, E., Weidemann, A., Sauer-Danzwith, H.,
Mir, S., Krebs, O., Bittkowski, M., Wetsch, E., Rojas, I., Müller, W., 2012. SABIO-RK database for biochemical
reaction kinetics. Nucleic Acids Res, 40(D1):D790-6.
Wittig, U., Kania, R., Bittkowski, M., Wetsch, E., Shi, L., Jong, L., Golebiewski, M., Rey, M., Weidemann, A., Rojas,
I., Müller, W., 2014a. Data extraction for the reaction kinetics database SABIO-RK. Perspectives in Science 1, 33-40.
Wittig, U., Rey, M., Kania, R., Bittkowski, M., Shi, L., Golebiewski, M., Weidemann, A., Müller, W., Rojas, I., 2014b.
Challenges for an enzymatic reaction kinetics database. FEBS Journal, 281(2):572-582.
Wolstencroft, K., Owen, S., Horridge, M., Krebs, O., Mueller, W., Snoep, J.L., du Preez, F., Goble, C., 2011.
RightField: embedding ontology annotation in spreadsheets. Bioinformatics 27(14):2021-2.
Wolstencroft, K., Owen, S., Krebs, O., Nguyen, Q., Stanford, N.J., Golebiewski, M., Weidemann, A., Bittkowski, M.,
An, L., Shockley, D., Snoep, J.L., Mueller, W., Goble, C., 2015. SEEK: a systems biology data and model
management platform. BMC Syst Biol. 9:33.
Wolstencroft, K., Krebs, O., Snoep, J.L., Stanford, N.J., Bacall, F., Golebiewski, M., Kuzyakiv, R., Nguyen, Q.,
Owen, S., Soiland-Reyes, S., Straszewski, J., van Niekerk, D.D., Williams, A.R., Malmström, L., Rinn, B., Müller,
W., Goble, C., 2017. FAIRDOMHub: a repository and collaboration environment for sharing systems biology
research. Nucleic Acids Res. 45(D1):D404-D407.
Wruck, W., Peuker, M., Regenbrecht, C.R., 2014. Data management strategies for multinational large-scale
systems biology projects. Brief Bioinform. 15(1):65-78.