Professional Documents
Culture Documents
Labman A Research Information System To Foster Insight Discovery Through Visualizations
Labman A Research Information System To Foster Insight Discovery Through Visualizations
1 Introduction
Open Data principles are crawling their path through different actors present
in our daily lives: governments publishing data about their members and hi-
erarchical structures, councils providing open access to local data (i.e., energy
consumption rates, air and water quality, cartography of the surrounding areas,
etc.), prosumers uploading data captured by diverse sensors to allow third-parties
benefit for it and so on. The academic world is not an exception.
Many universities are providing machine-readable data about their institu-
tions on dedicated portals, in order to foster service discovery and data reusabil-
ity. Topics covered span from map layouts of the different buildings and rooms of
the faculties, to statistics about the number of enrolments for the new academic
year, going through the courses offered, staff resumes, etc. Linked Universities, as
stated in their official website1 , “is an alliance of european universities engaged
1
http://linkeduniversities.org/
into exposing their public data as linked data”. Whereas focused on educational
data, they promote the use of shared vocabularies to describe resources, encour-
aging universities to publish data in an interoperable manner.
Regarding research centres, many of them expose official websites showing
their scientific contributions (either publications, events or projects), contact
and organizational information about their researchers, etc. Despite the similar-
ity between research units worldwide, these websites lack a common structure,
making it difficult to compare the success of various units between them. Fur-
thermore, the possibility to automatically download data stored in the research
information system is restricted to the implementation of a customized HTML
parser for the portal’s layout.
In this paper, we present our approach to harmonise how research informa-
tion is collected, stored, linked and presented by developing Labman (Labora-
tory Management), a research-themed information management system which
makes use of Semantic Web Standards and best practices to share data over the
Internet.
This paper is structured as follows: section 2 presents similar works on re-
search information visualization. Section 3 describes the system we have de-
veloped, and some of its most relevant highlights. Section 4 addresses some of
the visualizations generated from the data within the system. Finally, section 5
summarises the conclusions drawn from the development and drafts some future
work lines.
2 Related work
There are some information systems and plugins which provide visual represen-
tations over research data.
The Semantic Web Journal (SWJ) published by IOS Press provides a Sci-
entometrics portal2 which gathers data from its SPARQL endpoint, generating
publication and author related visualizations. Its goal is to make all submitted
manuscripts, solicited reviews, status updates and comments available for ev-
erybody. This SWJ Portal [4] presents different visualizations about the data
stored in SWJ, allowing visitors to consult an author’s citation or co-authorship
networks, the geospatial distribution of its articles’ citations, or the usage of
selected keywords. Similar results but based on the LAK (Learning Analytics
and Knowledge) are reached in [5].
AMiner3 [9] focuses its metrics on researchers, providing a set of statistics,
metadata and visualizations for individuals in order to understand their contri-
butions through time.
[2] used the LOD Visualization Suite on top of the RILOD (Research Infor-
mation Linked Open Data) dataset to visualize the Flemish research landscape.
With more than 400 million triples, it shows how visualizations can represent the
2
http://semantic-web-journal.com/SWJPortal/
3
http://aminer.org/
research status of a whole country. In their demo4 , users can navigate through
publications, projects, topics and research centres, discovering the connections
that link them.
The university of Indiana, with works such as [6, 8], visualized the research
status of the US and major international conference through topic, keyword and
trend analysis.
Finally, the EU-funded CODE project5 aims to collect publication data and
metadata from different sources, extract scientific facts from them and publish
everything to the LOD Cloud. Its Visualization Wizard [7] allows to analyse
linked open datasets visually, offering a visualization recommender based on the
data the user is interested in.
3 Labman
The need for updated, high-quality, non-redundant data is a must for any in-
formation architect, data analyst or management enthusiast. A centralised data
storage helps in minimising data redundancy, avoiding common errors produced
by keeping different versions of data resources in a variety of locations (i.e., when
applying for a new fund, is usual to send a complete version of the participants
CV. If some researchers keep their own updated CVs elsewhere, the provenance
of the presented data will be damaged).
9
https://zotero.org/
4 Visualising Research Data
All entity views in labman try to display at once all the available information
about the items of interest and the resources connected with them. Presenting
data and linked items in a clean and ordered way, we encourage website visi-
tors to navigate through the system in an exploratory process, discovering new
knowledge on the way.
Apart from textual information and links to related instances, labman pro-
vides different visualizations that summarise diverse aspects of the research cen-
tre’s performance. Most visual representations can be found under the Charts
menu, or from each researcher’s personal view.
As our research information system is web-based, most of the visualizations
are dynamic and interactive, to foster a serendipitous behaviour. The visual-
izations generated within labman are implemented using JavaScript graphing
libraries, such as d3.js 10 , Google Charts 11 and sigma.js 12 .
Some of the most interesting charts are explained next, using as example our
unit’s deployment of labman 13 :
Fig. 1. Timeline with MORElab’s role distribution since its creation a decade ago
10
http://d3js.org/
11
https://developers.google.com/chart/
12
http://sigmajs.org/
13
http://morelab.deusto.es/
This chart could exhibit interesting patterns in human-resource management:
the growth/downturn of the group in a specific period, promotions (PhD student
→ Post-doctoral researcher), unification of role labels, etc.
Fig. 2. Reduced version of the projects timeline for a selected person. The colour legend
signals the role played during its execution
Metrics like h-index, i10-index, number of publications, etc. are de facto means
to evaluate and compare performance between researchers. In the “publish-or-
perish” scenario research is usually surrounded in, where authors can “cheat” the
metrics focusing on publication volume and auto-citation networks, a broader
set of factors need to be considered when judging the impact of contributions.
As an idea to check the contribution of each author in a publication, labman
presents a combo chart per researcher that aggregates the authoring position
by publication type, and depicts their distribution (see Figure 4). Whereas this
metric could not be relevant for all knowledge areas, in some fields such as
Computer Science, usually place is proportionally related to the contributions
made by each author.
Fig. 4. Place distribution for a researcher of our group, aggregated by publication type
When a publication type is selected, the chart is re-arranged in a time-based
series fashion. Both visuals can help detecting common patterns, i.e., PhD stu-
dents starting from low-level places (collaborators in an existing research when
they entered the lab) and climbing positions as their works produce significant
results; well-positioned researchers with a successful trajectory co-authoring sev-
eral articles in the same conferences/journals yearly, etc.
Labman summarises our efforts towards managing all the information within
a research centre, with the possibility of publishing it as LOD, and presenting
dynamic visualizations in order to get the full picture about the unit’s status
and performance in different scenarios. These charts and graphs also allow unit
leaders plan strategic goals for the next years, empowering strong knowledge
areas, promoting promising new research topics, presenting innovative project
proposals to first-level funding calls, etc.
Actually, our research information system is deployed in three different re-
search groups within the university: two from DeustoTech - Deusto Institute
of Technology (Mobility14 and MORElab’s official websites), and one from the
Transnational Law Research Unit15 . These knowledge-areas diversity has im-
proved labman by making it more generalist, without being closely-coupled with
our unit’s needs.
Labman has proven to be practical by its users, which was one of the main
reasons to avoid standard CMSs and develop a generic system from the ground.
The ease of importing all publication-related data thanks to Zotero is a really
appreciated feature, and users have repeatedly pledged for a similar approach
to import project-related data. However, as for the best of our knowledge there
are no similar tools for project data, the need to input the data manually is still
required. Nevertheless, the benefits still exceed the system’s lacks.
The software is open-sourced under a GPL v3 license16 , and its code can be
browsed, downloaded and contributed to in its GitHub repository17 . The project
is continuously improved by adding new features, solving bugs and including pull
requests from different contributors.
As listed in the issues section of the code base, our focus is set on providing
more insightful visualizations to foster information discovery when visiting a
labman-powered website.
We are also considering the possibility to offer a JSON-LD18 based API in
order to allow both programmers and Semantic Web practitioners access the data
stored within labman in a more traditional way, without the required knowledge
about the SPARQL language and related technologies. Individual web views will
14
http://research.mobility.deustotech.eu/
15
http://research.transnational.deusto.es/
16
http://www.gnu.org/copyleft/gpl.html
17
https://github.com/OscarPDR/labman ud
18
http://json-ld.org/
also be extended with schema.org 19 metadata definitions for promoting resource
discovery and interoperability.
Finally, we are completing the implementation of OAI-PMH20 (Open Archives
Initiative Protocol for Metadata Harvesting) to make all the metadata about
publications discoverable by clients supporting these conventions. Many digi-
tal libraries, institutional repositories and digital archives have already adopted
OAI-PMH in their systems.
6 Acknowledgements
The research activities described in this paper are funded by the Basque Gov-
ernment’s Universities and Research department, under grant PRE 2014 2 298.
References
1. Bizer, C., Cyganiak, R.: D2r server-publishing relational databases on the semantic
web. In: Poster at the 5th International Semantic Web Conference (2006)
2. Dimou, A., De Vocht, L., Van Grootel, G., Van Campe, L., Latour, J., Mannens,
E., Van de Walle, R.: Visualizing the information of a Linked Open Data enabled
Research Information System. Procedia Computer Science 33, 245–252 (2014)
3. Everett, M., Borgatti, S.P.: Ego network betweenness. Social Networks 27(1), 31–38
(Jan 2005)
4. Hu, Y., Janowicz, K., McKenzie, G., Sengupta, K., Hitzler, P.: A Linked-Data-
Driven and Semantically-Enabled Journal Portal for Scientometrics. In: The Se-
mantic Web – ISWC 2013, pp. 114–129. Springer Berlin Heidelberg (Jan 2013)
5. Hu, Y., McKenzie, G., Yang, J.A., Gao, S., Abdalla, A., Janowicz, K.: A Linked-
Data-Driven Web Portal for Learning Analytics: Data Enrichment, Interactive Vi-
sualization, and Knowledge Discovery. In: LAK Workshops (2014)
6. Ke, W., Borner, K., Viswanath, L.: Major Information Visualization Authors, Pa-
pers and Topics in the ACM Library. In: Proceedings of the IEEE Symposium
on Information Visualization. pp. 216.1–. INFOVIS ’04, IEEE Computer Society,
Washington, DC, USA (2004)
7. Mutlu, B., Hoefler, P., Sabol, V., Tschinkel, G., Granitzer, M.: Automated Visual-
ization Support for Linked Research Data. I-SEMANTICS (Posters & Demos) 1026,
40–44 (2013)
8. Skupin, A., Biberstine, J.R., Börner, K.: Visualizing the Topical Structure of the
Medical Sciences: A Self-Organizing Map Approach. PLoS ONE 8(3) (2013)
9. Tang, J., Zhang, J., Yao, L., Li, J., Zhang, L., Su, Z.: ArnetMiner: Extraction and
Mining of Academic Social Networks. In: Proceedings of the 14th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining. pp. 990–998.
KDD ’08, ACM, New York, NY, USA (2008)
19
http://schema.org/
20
https://www.openarchives.org/pmh/