Extraction of Adverse Drug Effects From Medical Case Reports
Gurulingappa et al
and conditions that denote a focussed relation class being studied.

3.2 Assessment of Relation Extraction
Baseline experiments began with training and cross-validation of JSRE over the ADE-EXT-TRAIN corpus. The system's performance is shown in Table 2. The system achieved an overall F-score of 0.87 after cross-validation. On the final test over ADE-EXT-TEST, the system again attained an F-score of 0.87, indicating consistent classification. A subset of the instances misclassified during cross-validation and testing was manually investigated to understand the common sources of error. Limited context appeared to be one reason for misclassification. For example, the title "Niacin maculopathy" (PMID:3174043) implies that maculopathy is an adverse effect of niacin, but it lacks the contextual description needed to support machine classification. Distantly co-occurring, inter-related entities accounted for a couple of errors. For example, in the sentence "CASE SUMMARY: A 65-year-old patient chronically treated with the selective serotonin reuptake inhibitor (SSRI) citalopram developed confusion, agitation, tachycardia, tremors, myoclonic jerks and unsteady gait, consistent with serotonin syndrome, following initiation of fentanyl, and all symptoms and signs resolved following discontinuation of fentanyl" (PMID:17381671), the relation between confusion and the last-appearing drug name, fentanyl, was not correctly classified.

3.3 Impact of Size of the Training Set on the Performance
In order to study the impact of the size of the training data on classification performance, ADE-EXT-TRAIN was decomposed into random subsets containing 10, 20, 50, 100, 200, 500, 1000, and 2000 documents. JSRE was trained on these subsets independently in different rounds and subsequently applied to ADE-EXT-TEST for performance evaluation. Table 3 shows that with as few as 500 documents one can already achieve performance in the 80% range. To reach a classifier with a standard deviation of 1%, however, a substantially larger training set is needed.

ontology coincide with a DRUG-ADMINISTRATION in AEO. The ADE ontology additionally introduces the entity DOSAGE, which was not specified in AEO at the time of its development, since AEO originally focused on vaccines, for which dosing is not an essential medical concept. Both ADE and AEO model a causal relationship between CONDITION OR ADVERSE EVENT and DRUG OR MEDICAL INTERVENTION, with the latter being the causal source. The only entity that the CLEF annotation ontology shares with AEO and ADE is DRUG-OR-DEVICE, which coincides with DRUG OR MEDICAL INTERVENTION.

4 CONCLUSION
This work reports on the adaptation of the machine learning-based JSRE system for the identification and extraction of adverse effects of drugs in case reports. A methodology has been discussed for enriching a sparsely annotated corpus and subsequently using it to build a classification model. Evaluation of the system's performance showed promising results. The performance of the system can be improved in several ways. In the current experiments, only the default features accepted by JSRE were used. Optimizing the feature representation to include additional features, for instance from syntactic sentence parse trees, may further improve the results. Developing additional strategies, such as post-processing to classify relations with missing contextual descriptions, can help to recover more relations.
The reported experimental results represent the current status of our research on adverse drug effect identification from text. Several strategies will be pursued next. The authors plan to benchmark the performance of several named entity taggers against the ADE corpus for the identification of drug and condition mentions in text.
The current experiments were performed on the ADE corpus, since it was the only one available when this work was done. While this report was being written, however, a new corpus was published, namely the EU-ADR corpus (van Mulligen et al., 2012). It will be interesting to see whether the performance of JSRE on the EU-ADR corpus differs from its performance on the ADE corpus.
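The training-set-size experiment of Section 3.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual setup: `train_and_score` is a hypothetical callable standing in for "train JSRE on a subset and return the F-score on ADE-EXT-TEST", the number of rounds per subset size (`rounds`) is not given in the text and is chosen arbitrarily, and the population standard deviation is just one possible spread measure.

```python
import random
import statistics
from typing import Callable, Dict, Sequence, Tuple

# Subset sizes (in documents) listed in Section 3.3.
SUBSET_SIZES = [10, 20, 50, 100, 200, 500, 1000, 2000]


def f_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def learning_curve(
    train_docs: Sequence,
    train_and_score: Callable[[Sequence], float],  # hypothetical JSRE wrapper
    sizes: Sequence[int] = SUBSET_SIZES,
    rounds: int = 5,  # assumption: the text does not state the number of rounds
    seed: int = 0,
) -> Dict[int, Tuple[float, float]]:
    """Map each subset size to (mean, standard deviation) of the test score."""
    rng = random.Random(seed)
    docs = list(train_docs)
    curve: Dict[int, Tuple[float, float]] = {}
    for size in sizes:
        if size > len(docs):
            continue  # skip sizes larger than the available corpus
        # Draw a fresh random subset in every round and score the model on it.
        scores = [train_and_score(rng.sample(docs, size)) for _ in range(rounds)]
        curve[size] = (statistics.mean(scores), statistics.pstdev(scores))
    return curve
```

In such a setup, `train_and_score` would count true positives, false positives, and false negatives on the held-out test set and pass them to `f_score`; plotting the mean against the subset size gives the kind of curve summarized in Table 3, and the standard deviation column indicates how stable the classifier is at each size.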
Similarly, commercial and public relation extraction systems such as SemRep, Luxid MER Skill Cartridge, RelEx, and MedScan will be benchmarked. The usefulness of relation extraction from text for supporting signal detection and identifying potentially novel or under-reported adverse effects will also be studied.
The use of ontologies for driving information extraction has been reported (Wimalasuriya and Dou, 2010; Pandit and Honavar, 2010). We plan to explore the use of various available tools (e.g. ODIE, semantixs) with the AEO ontology and to compare the performance of the ontology-driven methods against the method presented here.
The current work has demonstrated promising results and has the potential to reduce manual reading time, accelerate the signal tracking process, and thereby help ensure the safe use of drugs on the market.

5 ACKNOWLEDGEMENTS
This work has been partly supported by the Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany, and the Bonn-Aachen International Center for Information Technology, Bonn, Germany.

REFERENCES
Aramaki, E., Miura, Y., Tonoike, M., Ohkuma, T., Masuichi, H., Waki, K., and Ohe, K. (2010). Extraction of adverse drug effects from clinical records. In Studies Health Technology Informatics, volume 160, pages 739–743.
Benton, A., Ungar, L., Hill, S., Hennessy, S., Mao, J., Chung, A., Leonard, C., and Holmes, J. (2011). Identifying potential adverse effects using the web: A new approach to medical hypothesis generation. Journal of Biomedical Informatics, 44, 989–996.
Burges, C. (1998). A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2.
Delamarre, D., Lillo-Le Louet, A., Jamte, A., Sadou, E., Ouazine, T., Burgun, A., and Jaulent, M. (2010). Documentation in pharmacovigilance: using an ontology to extend and normalize Pubmed queries. In Studies Health Technology Informatics, volume 160, pages 518–522.
Giuliano, C., Lavelli, A., Pighin, D., and Romano, L. (2007). FBK-IRST: Kernel Methods for Semantic Relation Extraction. In Proceedings of the Fourth International Workshop on Semantic Evaluations.
Gurulingappa, H., Fluck, J., Hofmann-Apitius, M., and Toldo, L. (2011). Identification of adverse drug event assertive sentences in medical case reports. In First International Workshop on Knowledge Discovery and Health Care Management (KD-HCM), European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD).
Gurulingappa, H., Mateen-Rajput, A., Roberts, A., Fluck, J., Hofmann-Apitius, M., and Toldo, L. (2012). Development of a benchmark corpus to support the automatic extraction of drug-related adverse effects from medical case reports. Journal of Biomedical Informatics.
Hanisch, D., Fundel, K., Mevissen, H.-T., Zimmer, R., and Fluck, J. (2005). Prominer: rule-based protein and gene entity recognition. BMC Bioinformatics, 6 Suppl 1, S14.
Hauben, M. and Bate, A. (2009). Decision support methods for the detection of adverse events in post-marketing data. Drug Discov Today, 14(7-8), 343–357.
Henegar, C., Bousquet, C., Lillo-Le Louet, A., Degoulet, P., and Jaulent, M.-C. (2006). Building an ontology of adverse drug reactions for automated signal generation in pharmacovigilance. Computers in Biology and Medicine, 36, 748–767.
Knox, C., Law, V., Jewison, T., Liu, P., Ly, S., Frolkis, A., Pon, A., Banco, K., Mak, C., Neveu, V., Djoumbou, Y., Eisner, R., Guo, A. C., and Wishart, D. S. (2011). Drugbank 3.0: a comprehensive resource for 'omics' research on drugs. Nucleic Acids Res, 39(Database issue), D1035–D1041.
Leaman, R., Wojtulewicz, L., Sullivan, R., Skariah, A., Yang, J., and Gonzalez, G. (2010). Towards internet-age pharmacovigilance: extracting adverse drug reactions from user posts to health-related social networks. In Proceedings of the 2010 Workshop on Biomedical Natural Language Processing, pages 117–125.
Merrill, G. H. (2008). The meddra paradox. AMIA Annu Symp Proc, pages 470–474.
Ogren, P. (2006). Knowtator: a Protégé plug-in for annotated corpus construction. In Proceedings of the 2006 conference of the North American chapter of the association for computational linguistics on human language technology, pages 273–275.
Pandit, S. and Honavar, V. (2010). Ontology-guided extraction of complex nested relationships. In 22nd IEEE International Conference on tools with artificial intelligence (ICTAI), pages 173–178.
Roberts, A., Gaizauskas, R., Hepple, M., Demetriou, G., Guo, Y., Roberts, I., and Setzer, A. (2007). The CLEF corpus: semantic annotation of clinical text. In Proceedings of the AMIA Symposium, pages 625–629.
Roberts, A., Gaizauskas, R., Hepple, M., Demetriou, G., Guo, Y., Roberts, I., and Setzer, A. (2009). Building a semantically annotated corpus of clinical texts. Journal of Biomedical Informatics, 42, 950–966.
Tikk, D., Thomas, P., Palaga, P., Hakenberg, J., and Leser, U. (2010). A comprehensive benchmark of kernel methods to extract protein-protein interactions from literature. PLoS Comput Biol, 6, e1000837.
van Mulligen, E., Fourrier-Reglat, A., Gurwitz, D., Molokhia, M., Nieto, A., Trifiro, G., Kors, J., and Furlong, L. (2012). The eu-adr corpus: Annotated drugs, diseases, targets, and their relationships. Journal of Biomedical Informatics.
Vandenbroucke, J. P. (2001). In defense of case reports and case series. Ann Intern Med, 134(4), 330–334.
Wang, X., Hripcsak, G., Markatou, M., and Friedman, C. (2009). Active computerized pharmacovigilance using natural language processing, statistics, and electronic health records: a feasibility study. J Am Med Inform Assoc, 16(3), 328–337.
Wimalasuriya, D. and Dou, D. (2010). Ontology-based information extraction: an introduction and a survey of current approaches. Journal of Information Science, 36, 306–323.
Yongqun, H., Zuoshuang, X., Sarntivijai, S., Toldo, L., and Ceusters, W. (2011). AEO: A Realism-Based Biomedical Ontology for the Representation of Adverse Events. In ICBO: International Conference on Biomedical Ontology, Buffalo, NY, USA, Representing Adverse Events Workshop.