Allan Tucker · Pedro Henriques Abreu · Jaime Cardoso ·
Pedro Pereira Rodrigues · David Riaño (Eds.)

LNAI 12721

Artificial Intelligence in Medicine

19th International Conference on Artificial Intelligence in Medicine, AIME 2021
Virtual Event, June 15–18, 2021
Proceedings
Lecture Notes in Artificial Intelligence 12721

Subseries of Lecture Notes in Computer Science

Series Editors
Randy Goebel
University of Alberta, Edmonton, Canada
Yuzuru Tanaka
Hokkaido University, Sapporo, Japan
Wolfgang Wahlster
DFKI and Saarland University, Saarbrücken, Germany

Founding Editor
Jörg Siekmann
DFKI and Saarland University, Saarbrücken, Germany
More information about this subseries at http://www.springer.com/series/1244
Editors

Allan Tucker
Brunel University London, Uxbridge, UK

Pedro Henriques Abreu
University of Coimbra, Coimbra, Portugal

Jaime Cardoso
University of Porto, Porto, Portugal

Pedro Pereira Rodrigues
University of Porto, Porto, Portugal

David Riaño
Universitat Rovira i Virgili, Tarragona, Spain

ISSN 0302-9743 (print)   ISSN 1611-3349 (electronic)
Lecture Notes in Artificial Intelligence
ISBN 978-3-030-77210-9 (print)   ISBN 978-3-030-77211-6 (eBook)
https://doi.org/10.1007/978-3-030-77211-6
LNCS Sublibrary: SL7 – Artificial Intelligence

© Springer Nature Switzerland AG 2021


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, expressed or implied, with respect to the material contained herein or for any errors or
omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

The European Society for Artificial Intelligence in Medicine (AIME) was established in
1986 following a very successful workshop held in Pavia, Italy, the year before. The
principal aims of AIME are to foster fundamental and applied research in the appli-
cation of artificial intelligence (AI) techniques to medical care and medical research,
and to provide a forum at biennial conferences for discussing any progress made. The
main activity of the society thus far has been the organization of a series of biennial
conferences, held in Marseilles, France (1987), London, UK (1989), Maastricht,
Netherlands (1991), Munich, Germany (1993), Pavia, Italy (1995), Grenoble, France
(1997), Aalborg, Denmark (1999), Cascais, Portugal (2001), Protaras, Cyprus (2003),
Aberdeen, UK (2005), Amsterdam, Netherlands (2007), Verona, Italy (2009), Bled,
Slovenia (2011), Murcia, Spain (2013), Pavia, Italy (2015), Vienna, Austria (2017),
Poznan, Poland (2019), and Minneapolis, USA (2020) - the latter hosted virtually due
to the COVID-19 pandemic.
AIME 2021 was to be hosted in Portugal but, due to the ongoing pandemic, it was
held virtually. This volume contains the proceedings of AIME 2021, the International
Conference on Artificial Intelligence in Medicine, hosted virtually by the University of
Coimbra, Portugal, during June 15–18, 2021.
The AIME 2021 goals were to present and consolidate the international state of the
art of AI in biomedical research from the perspectives of theory, methodology, systems,
and applications. The conference included two invited keynotes, full and short papers,
tutorials, workshops, and a doctoral consortium. In the conference announcement,
authors were invited to submit original contributions regarding the development
of theory, methods, systems, and applications for solving problems in the biomedical
field, including AI approaches in biomedical informatics, molecular medicine, and
health-care organizational aspects. Authors of papers addressing theory were requested
to describe the properties of novel AI models potentially useful for solving biomedical
problems. Authors of papers addressing theory and methods were asked to describe the
development or the extension of AI methods, to address the assumptions and limita-
tions of the proposed techniques, and to discuss their novelty with respect to the state
of the art. Authors of papers addressing systems and applications were asked to
describe the development, implementation, or evaluation of new AI-inspired tools and
systems in the biomedical field. They were asked to link their work to underlying
theory, and either analyze the potential benefits to solve biomedical problems or pre-
sent empirical evidence of benefits in clinical practice. All authors were asked to
highlight the value their work created for the patient, provider, or institution, through
its clinical relevance.
AIME 2021 received 138 submissions across all categories. Submissions came from
34 countries across Europe, North and South America, Asia,
Australia, and Africa. All papers were carefully peer-reviewed by experts from the
Program Committee, with the support of additional reviewers, and by members of the
Senior Program Committee (a new layer to the review process introduced in AIME
2020). Each submission was reviewed by at least three reviewers. The reviewers judged
the overall quality of the submitted papers together with their relevance to the AIME
conference, originality, impact, technical correctness, methodology, scholarship, and
quality of presentation. In addition, the reviewers provided detailed written comments
on each paper, and stated their confidence in the subject area. A Senior Program
Committee member was assigned to each paper and they wrote a meta-review and
provided a recommendation to the Organizing Committee.
A small committee consisting of the conference co-chairs, Allan Tucker, Pedro
Henriques Abreu, and Jaime Cardoso, made the final decisions regarding the AIME
2021 scientific program.
This process began with virtual meetings starting in March 2021. As a result, 28
long papers (an acceptance rate of 23%) and 30 short papers were accepted. Each long
paper was presented in a 20-minute oral presentation during the conference. Each
regular short paper was presented in a 5-minute oral presentation and as a poster. The
papers were organized according to their topics in the following main themes: (1) Deep
Learning; (2) Natural Language Processing; (3) Predictive Modeling; (4) Image
Analysis; (5) Unsupervised Learning; (6) Temporal Data Analysis; (7) Planning; and
(8) Knowledge Representation.
AIME 2021 had the privilege of hosting two invited keynote speakers: Virginia
Dignum, Wallenberg chair on Responsible Artificial Intelligence and Scientific
Director of WASP-HS (Humanities and Society) based at Umeå University, Sweden,
who gave a keynote exploring “The myth of complete AI-fairness”, and Pearse
Keane, Associate Professor at the UCL Institute of Ophthalmology and Consultant at
Moorfields Eye Hospital, UK, who talked about “Transforming healthcare with arti-
ficial intelligence - lessons from ophthalmology”.
AIME 2021 provided an opportunity for PhD students to present their research
goals, proposed methods, and preliminary results at an associated doctoral consortium.
A scientific panel consisting of experienced researchers in the field provided con-
structive feedback to the students in an informal atmosphere. The doctoral consortium
was chaired by Dr. David Riaño.
Two workshops, (1) the 12th International Workshop on Knowledge Representation
for Health Care, KR4HC 2021 (David Riaño, Mar Marcos, and Annette ten Teije), and
(2) Explainable Artificial Intelligence in Healthcare (Jose M. Juarez, Gregor Stiglic,
Zhengxing Huang, and Katrien Verbert), as well as an interactive half-day tutorial,
Evaluating Prediction Models (Ameen Abu-Hanna), took place prior to the AIME 2021
main conference.
Prizes were awarded for the best student paper and the best bioinformatics paper, and a
new rising star award recognized young researchers within the AIME community who
are running labs with recent breakthrough papers in the area.
We would like to thank everyone who contributed to AIME 2021. First of all, we
would like to thank the authors of the papers submitted and the members of the Pro-
gram Committee together with the additional reviewers. Thank you to the Senior
Program Committee for writing meta-reviews and to members of the Senior Advisory
Committee for providing guidance during conference organization. Thanks are also due
to the invited speakers, as well as to the organizers of the tutorials and doctoral
consortium panel. Many thanks go to the Local Organizing Committee, who helped
plan this conference and transition it to a virtual one. The free EasyChair conference
system (http://www.easychair.org/) was an important tool supporting us in the man-
agement of submissions, reviews, and selection of accepted papers. We would like to
thank Springer and the Artificial Intelligence Journal (AIJ) for sponsoring the con-
ference. Finally, we thank the Springer team for helping us in the final preparation of
this LNAI book.

June 2021

Allan Tucker
Pedro Henriques Abreu
Jaime Cardoso
Pedro Pereira Rodrigues
David Riaño
Organization

General Chairs
Allan Tucker Brunel University London, UK
Pedro Henriques Abreu University of Coimbra, Portugal
Jaime Cardoso INESC TEC and University of Porto, Portugal

Local Organizing Chair


Pedro Pereira Rodrigues University of Porto, Portugal

Doctoral Consortium Chair


David Riaño Universitat Rovira i Virgili, Spain

Senior Program Committee


Ameen Abu-Hanna University of Amsterdam, Netherlands
Riccardo Bellazzi University of Pavia, Italy
Carlo Combi University of Verona, Italy
Arianna Dagliati University of Manchester, UK
Michel Dojat INSERM, France
Adela Grando Arizona State University, USA
Milos Hauskrecht University of Pittsburgh, USA
John Holmes University of Pennsylvania, USA
Jose Juarez University of Murcia, Spain
Elpida Keravnou-Papailiou University of Cyprus
Xiaohui Liu Brunel University London, UK
Martin Michalowski University of Minnesota, USA
Mar Marcos Universitat Jaume I, Spain
Stefania Montani University of Piemonte Orientale, Italy
Robert Moskovitch Ben Gurion University, Israel
Barbara Oliboni University of Verona, Italy
Enea Parimbelli University of Pavia, Italy
Nada Lavrač Jozef Stefan Institute, Slovenia
Peter Lucas Leiden University, Netherlands
Niels Peek University of Manchester, UK
Silvana Quaglini University of Pavia, Italy
Lucia Sacchi University of Pavia, Italy
Yuval Shahar Ben Gurion University, Israel
Stephen Swift Brunel University London, UK
Annette ten Teije Vrije Universiteit Amsterdam, Netherlands
Szymon Wilk Poznan University of Technology, Poland


Blaz Zupan University of Ljubljana, Slovenia

Program Committee
Syed Sibte Raza Abidi Dalhousie University, Canada
Amparo Alonso-Betanzos University of A Coruña, Spain
José Amorim University of Coimbra, Portugal
Mahir Arzoky Brunel University London, UK
Pedro Barahona Universidade NOVA de Lisboa, Portugal
Isabelle Bichindaritz State University of New York at Oswego, USA
Henrik Boström KTH Royal Institute of Technology, Sweden
Alessio Bottrighi Università del Piemonte Orientale, Italy
Ricardo Cardoso University of Coimbra, Portugal
Kerstin Denecke Bern University of Applied Sciences, Switzerland
Barbara Di Camillo University of Padova, Italy
Georg Dorffner Medical University Vienna, Austria
Inês Dutra University of Porto, Portugal
Jan Egger Graz University of Technology, Austria
Henrik Eriksson Linköping University, Sweden
Ben Evans Brunel University London, UK
Jesualdo Tomás Fernández-Breis Universidad de Murcia, Spain
Pedro Furtado University of Coimbra/CISUC, Portugal
Josep Gomez Pere Virgili Institute, Joan XXIII University Hospital,
Spain
Zhe He Florida State University, USA
Jaakko Hollmen University of Stockholm, Sweden
Arjen Hommersom Open University of the Netherlands, Netherlands
Zhengxing Huang Zhejiang University, China
Nevo Itzhak Ben Gurion University, Israel
Charles Kahn University of Pennsylvania, USA
Eleni Kaldoudi Democritus University of Thrace, Greece
Aida Kamisalic University of Maribor, Slovenia
Haridimos Kondylakis FORTH, Greece
Pedro Larranaga University of Madrid, Spain
Giorgio Leonardi Università del Piemonte Orientale, Italy
Michael Liebman IPQ Analytics, USA
Beatriz López University of Girona, Spain
Simone Marini University of Florida, USA
Carolyn McGregor Ontario Tech University, Canada
Alan McMillan University of Wisconsin-Madison, USA
Paola Mello University of Bologna, Italy
Alina Miron Brunel University London, UK
Diego Molla Macquarie University, Australia
Sara Montagna University of Bologna, Italy
Irina Moreira University of Coimbra, Portugal


Laura Moss University of Aberdeen, UK
Fleur Mougin Université de Bordeaux, France
Henning Müller HES-SO, Switzerland
Loris Nanni University of Padua, Italy
Goran Nenadic University of Manchester, UK
Øystein Nytrø Norwegian University of Science and Technology,
Norway
Dympna O’Sullivan Technological University Dublin, Ireland
Panagiotis Papapetrou University of Stockholm, Sweden
Luca Piovesan DISIT, Università del Piemonte Orientale, Italy
Christian Popow Medical University of Vienna, Austria
Cédric Pruski Luxembourg Institute of Science and Technology,
Luxembourg
Andrew Reader King’s College London, UK
Stephen Rees Aalborg University, Denmark
Aleksander Sadikov University of Ljubljana, Slovenia
Seyed Erfan Sajjadi Brunel University London, UK
Clarisa Sanchez Gutierrez Radboud UMC, Netherlands
Miriam Santos University of Coimbra, Portugal
Isabel Sassoon Brunel University London, UK
Brigitte Seroussi Hôpitaux de Paris, France
Erez Shalom Ben Gurion University, Israel
Yuan Shang University of Arizona, USA
Stefan Darmoni University of Rouen, France
Gregor Stiglic University of Maribor, Slovenia
Manuel Striani University of Piemonte Orientale, Italy
João Manuel R. S. Tavares University of Porto, Portugal
Paolo Terenziani Università del Piemonte Orientale, Italy
Francesca Toni Imperial College London, UK
Samson Tu Stanford University, USA
Ryan Urbanowicz University of Pennsylvania, USA
Frank Van Harmelen Vrije Universiteit Amsterdam, Netherlands
Alfredo Vellido Universitat Politècnica de Catalunya, Spain
Francesca Vitali University of Arizona, USA
Dongwen Wang Arizona State University, USA
Leila Yousefi Brunel University London, UK
Pierre Zweigenbaum Université Paris-Saclay, France

Sponsors

We would like to thank The AI Journal and Springer for their sponsorship of the
conference.
Transforming Healthcare with Artificial Intelligence – Lessons from Ophthalmology
(Abstract of Invited Talk)

Pearse A. Keane

UKRI Future Leaders Fellow
Consultant Ophthalmologist, Moorfields Eye Hospital NHS Foundation Trust
Associate Professor, Institute of Ophthalmology, University College London
pearse.keane1@nhs.net

Abstract. Ophthalmology is among the most technology-driven of all the
medical specialties, with treatments utilizing high-spec medical lasers and
advanced microsurgical techniques, and diagnostics involving ultra-high reso-
lution imaging. Ophthalmology is also at the forefront of many trailblazing
research areas in healthcare, such as stem cell therapy, gene therapy, and - most
recently - artificial intelligence. In July 2016, Moorfields announced a formal
collaboration with the artificial intelligence company, DeepMind. This collab-
oration involves the sharing of >1,000,000 anonymised retinal scans with
DeepMind to allow for the automated diagnosis of diseases such as age-related
macular degeneration (AMD) and diabetic retinopathy (DR). In my presentation,
I will describe the motivation - and urgent need - to apply deep learning to
ophthalmology, the processes required to establish a research collaboration
between the National Health Service (NHS) and a company like DeepMind, the
initial results of our research, and finally, why I believe that ophthalmology
could be the first branch of medicine to be fundamentally reinvented through the
application of artificial intelligence.
Contents

Invited Talk

The Myth of Complete AI-Fairness (p. 3)
Virginia Dignum

Image Analysis

A Petri Dish for Histopathology Image Analysis (p. 11)
Jerry Wei, Arief Suriawinata, Bing Ren, Xiaoying Liu, Mikhail Lisovsky,
Louis Vaickus, Charles Brown, Michael Baker, Naofumi Tomita,
Lorenzo Torresani, Jason Wei, and Saeed Hassanpour

fMRI Multiple Missing Values Imputation Regularized by a Recurrent Denoiser (p. 25)
David Calhas and Rui Henriques

Bayesian Deep Active Learning for Medical Image Analysis (p. 36)
Biraja Ghoshal, Stephen Swift, and Allan Tucker

A Topological Data Analysis Mapper of the Ovarian Folliculogenesis Based on MALDI Mass Spectrometry Imaging Proteomics (p. 43)
Giulia Campi, Giovanna Nicora, Giulia Fiorentino, Andrew Smith,
Fulvio Magni, Silvia Garagna, Maurizio Zuccotti, and Riccardo Bellazzi

Predictive Modelling

Predicting Kidney Transplant Survival Using Multiple Feature Representations for HLAs (p. 51)
Mohammadreza Nemati, Haonan Zhang, Michael Sloma,
Dulat Bekbolsynov, Hong Wang, Stanislaw Stepkowski, and Kevin S. Xu

Sum-Product Networks for Early Outbreak Detection of Emerging Diseases (p. 61)
Moritz Kulessa, Bennet Wittelsbach, Eneldo Loza Mencía,
and Johannes Fürnkranz

Catching Patient's Attention at the Right Time to Help Them Undergo Behavioural Change: Stress Classification Experiment from Blood Volume Pulse (p. 72)
Aneta Lisowska, Szymon Wilk, and Mor Peleg

Primary Care Datasets for Early Lung Cancer Detection: An AI Led Approach (p. 83)
Goce Ristanoski, Jon Emery, Javiera Martinez Gutierrez,
Damien McCarthy, and Uwe Aickelin

Addressing Extreme Imbalance for Detecting Medications Mentioned in Twitter User Timelines (p. 93)
Davy Weissenbacher, Siddharth Rawal, Arjun Magge,
and Graciela Gonzalez-Hernandez

ICU Days-to-Discharge Analysis with Machine Learning Technology (p. 103)
David Cuadrado and David Riaño

Transformers for Multi-label Classification of Medical Text: An Empirical Comparison (p. 114)
Vithya Yogarajan, Jacob Montiel, Tony Smith, and Bernhard Pfahringer

Semantic Web Framework to Computerize Staged Reflex Testing Protocols to Mitigate Underutilization of Pathology Tests for Diagnosing Pituitary Disorders (p. 124)
William Van Woensel, Manal Elnenaei, Syed Ali Imran,
and Syed Sibte Raza Abidi

Using Distribution Divergence to Predict Changes in the Performance of Clinical Predictive Models (p. 135)
Mohammadamin Tajgardoon and Shyam Visweswaran

Analysis of Health Screening Records Using Interpretations of Predictive Models (p. 146)
Yuki Oba, Taro Tezuka, Masaru Sanuki, and Yukiko Wagatsuma

Seasonality in Infection Predictions Using Interpretable Models for High Dimensional Imbalanced Datasets (p. 152)
Bernardo Cánovas-Segura, Antonio Morales, Jose M. Juárez,
and Manuel Campos

Monitoring Quality of Life Indicators at Home from Sparse, and Low-Cost Sensor Data (p. 157)
Dympna O’Sullivan, Rilwan Basaru, Simone Stumpf, and Neil Maiden

Detection of Parkinson's Disease Early Progressors Using Routine Clinical Predictors (p. 163)
Marco Cotogni, Lucia Sacchi, Dejan Georgiev, and Aleksander Sadikov

Detecting Mild Cognitive Impairment Using Smooth Pursuit and a Modified Corsi Task (p. 168)
Alessia Gerbasi, Vida Groznik, Dejan Georgiev, Lucia Sacchi,
and Aleksander Sadikov

Temporal Data Analysis

Neural Clinical Event Sequence Prediction Through Personalized Online Adaptive Learning (p. 175)
Jeong Min Lee and Milos Hauskrecht

Using Event-Based Web-Scraping Methods and Bidirectional Transformers to Characterize COVID-19 Outbreaks in Food Production and Retail Settings (p. 187)
Joseph Miano, Charity Hilton, Vasu Gangrade, Mary Pomeroy,
Jacqueline Siven, Michael Flynn, and Frances Tilashalski

Deep Kernel Learning for Mortality Prediction in the Face of Temporal Shift (p. 199)
Miguel Rios and Ameen Abu-Hanna

Model Evaluation Approaches for Human Activity Recognition from Time-Series Data (p. 209)
Lee B. Hinkle and Vangelis Metsis

Unsupervised Learning

Unsupervised Learning to Subphenotype Heart Failure Patients from Electronic Health Records (p. 219)
Melanie Hackl, Suparno Datta, Riccardo Miotto, and Erwin Bottinger

Stratification of Parkinson's Disease Patients via Multi-view Clustering (p. 229)
Anita Valmarska, Nada Lavrač, and Marko Robnik–Šikonja

Disentangled Hyperspherical Clustering for Sepsis Phenotyping (p. 240)
Cheng Cheng, Jason Kennedy, Christopher Seymour,
and Jeremy C. Weiss

Phenotypes for Resistant Bacteria Infections Using an Efficient Subgroup Discovery Algorithm (p. 246)
Antonio Lopez-Martinez-Carrasco, Jose M. Juarez, Manuel Campos,
and Bernardo Canovas-Segura

Predicting Drug-Drug Interactions from Heterogeneous Data: An Embedding Approach (p. 252)
Devendra Singh Dhami, Siwen Yan, Gautam Kunapuli, David Page,
and Sriraam Natarajan

Detection of Junctional Ectopic Tachycardia by Central Venous Pressure (p. 258)
Xin Tan, Yanwan Dai, Ahmed Imtiaz Humayun, Haoze Chen,
Genevera I. Allen, and Parag N. Jain MD

Planning and Decision Support

A Cautionary Tale on Using Covid-19 Data for Machine Learning (p. 265)
Diogo Nogueira-Leite, João Miguel Alves, Manuel Marques-Cruz,
and Ricardo Cruz-Correia

MitPlan 2.0: Enhanced Support for Multi-morbid Patient Management Using Planning (p. 276)
Martin Michalowski, Malvika Rao, Szymon Wilk, Wojtek Michalowski,
and Marc Carrier

Explanations in Digital Health: The Case of Supporting People Lifestyles (p. 287)
Milene Santos Teixeira, Ivan Donadello, and Mauro Dragoni

Predicting Medical Interventions from Vital Parameters: Towards a Decision Support System for Remote Patient Monitoring (p. 293)
Kordian Gontarska, Weronika Wrazen, Jossekin Beilharz,
Robert Schmid, Lauritz Thamsen, and Andreas Polze

CAncer PAtients Better Life Experience (CAPABLE) First Proof-of-Concept Demonstration (p. 298)
Enea Parimbelli, Matteo Gabetta, Giordano Lanzola, Francesca Polce,
Szymon Wilk, David Glasspool, Alexandra Kogan, Roy Leizer,
Vitali Gisko, Nicole Veggiotti, Silvia Panzarasa, Rowdy de Groot,
Manuel Ottaviano, Lucia Sacchi, Ronald Cornet, Mor Peleg,
and Silvana Quaglini

Deep Learning

Sensitivity and Specificity Evaluation of Deep Learning Models for Detection of Pneumoperitoneum on Chest Radiographs (p. 307)
Manu Goyal, Judith Austin-Strohbehn, Sean J. Sun, Karen Rodriguez,
Jessica M. Sin, Yvonne Y. Cheung, and Saeed Hassanpour

An Application of Recurrent Neural Networks for Estimating the Prognosis of COVID-19 Patients in Northern Italy (p. 318)
Mattia Chiari, Alfonso E. Gerevini, Matteo Olivato, Luca Putelli,
Nicholas Rossetti, and Ivan Serina

Recurrent Neural Network to Predict Renal Function Impairment in Diabetic Patients via Longitudinal Routine Check-up Data (p. 329)
Enrico Longato, Gian Paolo Fadini, Giovanni Sparacino,
Angelo Avogaro, and Barbara Di Camillo

Counterfactual Explanations for Survival Prediction of Cardiovascular ICU Patients (p. 338)
Zhendong Wang, Isak Samsten, and Panagiotis Papapetrou

Improving the Performance of Melanoma Detection in Dermoscopy Images Using Deep CNN Features (p. 349)
Himanshu K. Gajera, Mukesh A. Zaveri, and Deepak Ranjan Nayak

Mobile Aided System of Deep-Learning Based Cataract Grading from Fundus Images (p. 355)
Yaroub Elloumi

Uncertainty Estimation in SARS-CoV-2 B-Cell Epitope Prediction for Vaccine Development (p. 361)
Bhargab Ghoshal, Biraja Ghoshal, Stephen Swift, and Allan Tucker

Attention-Based Explanation in a Deep Learning Model for Classifying Radiology Reports (p. 367)
Luca Putelli, Alfonso E. Gerevini, Alberto Lavelli, Roberto Maroldi,
and Ivan Serina

Evaluation of Encoder-Decoder Architectures for Automatic Skin Lesion Segmentation (p. 373)
José G. P. Lima, Geraldo Braz Junior, João D. S. de Almeida,
and Caio E. F. Matos

A Novel Deep Learning Model for COVID-19 Detection from Combined Heterogeneous X-ray and CT Chest Images (p. 378)
Amir Bouden, Ahmed Ghazi Blaiech, Khaled Ben Khalifa,
Asma Ben Abdallah, and Mohamed Hédi Bedoui

An Experiment Environment for Definition, Training and Evaluation of Electrocardiogram-Based AI Models (p. 384)
Nils Gumpfer, Joshua Prim, Dimitri Grün, Jennifer Hannig, Till Keller,
and Michael Guckert

Enhancing the Value of Counterfactual Explanations for Deep Learning (p. 389)
Yan Jia, John McDermid, and Ibrahim Habli

Natural Language Processing

A Multi-instance Multi-label Weakly Supervised Approach for Dealing with Emerging MeSH Descriptors (p. 397)
Nikolaos Mylonas, Stamatis Karlos, and Grigorios Tsoumakas

Demographic Aware Probabilistic Medical Knowledge Graph Embeddings of Electronic Medical Records (p. 408)
Aynur Guluzade, Endri Kacupaj, and Maria Maleshkova

Modeling and Representation by Graphs of the Reasoning of an Emergency Doctor: Symptom Checker MedVir (p. 418)
Loïc Etienne, Francis Faux, and Olivier Roecker

Effect of Depth Order on Iterative Nested Named Entity Recognition Models (p. 428)
Perceval Wajsbürt, Yoann Taillé, and Xavier Tannier

The Effectiveness of Phrase Skip-Gram in Primary Care NLP for the Prediction of Lung Cancer (p. 433)
Torec T. Luik, Miguel Rios, Ameen Abu-Hanna,
Henk C. P. M. van Weert, and Martijn C. Schut

Customized Neural Predictive Medical Text: A Use-Case on Caregivers (p. 438)
John Pavlopoulos and Panagiotis Papapetrou

Outlier Detection for GP Referrals in Otorhinolaryngology (p. 444)
Chee Keong Wee and Nathan Wee

The Champollion Project: Automatic Structuration of Clinical Features from Medical Records (p. 452)
Oliver Hijano Cubelos, Thomas Balezeau, and Julien Guerin

Knowledge Representation and Rule Mining

Modelling and Assessment of One-Drug Dose Titration (p. 459)
David Riaño and Aida Kamišalić

TransICD: Transformer Based Code-Wise Attention Model for Explainable ICD Coding (p. 469)
Biplob Biswas, Thai-Hoang Pham, and Ping Zhang

Improving Prediction of Low-Prior Clinical Events with Simultaneous General Patient-State Representation Learning (p. 479)
Matthew Barren and Milos Hauskrecht

Identifying Symptom Clusters Through Association Rule Mining (p. 491)
Mikayla Biggs, Carla Floricel, Lisanne Van Dijk,
Abdallah S. R. Mohamed, C. David Fuller, G. Elisabeta Marai,
Xinhua Zhang, and Guadalupe Canahuate

A Probabilistic Approach to Extract Qualitative Knowledge for Early Prediction of Gestational Diabetes (p. 497)
Athresh Karanam, Alexander L. Hayes, Harsha Kokel, David M. Haas,
Predrag Radivojac, and Sriraam Natarajan

Author Index (p. 503)


Invited Talk
The Myth of Complete AI-Fairness

Virginia Dignum

Umeå University, Umeå, Sweden


virginia@cs.umu.se

Just recently, IBM invited me to participate in a panel titled “Will AI ever
be completely fair?” My first reaction was that it surely would be a very short
panel, as the only possible answer is ‘no’. In this short paper, I wish to further
motivate my position in that debate: “AI will never be completely fair. Nothing
ever is. The point is not complete fairness, but the need to establish metrics and
thresholds for fairness that ensure trust in AI systems”.
The idea of fairness and justice has long and deep roots in Western civiliza-
tion, and is strongly linked to ethics. It is therefore not strange that it is core
to the current discussion about the ethics of development and use of AI sys-
tems. Given that we often associate fairness with consistency and accuracy, the
idea that our decisions and decisions affecting us can become fairer by replac-
ing human judgement by automated, numerical, systems, is therefore appealing.
However, as Laurie Anderson recently said (as quoted by Kate Crawford on Twitter,
https://twitter.com/katecrawford/status/1377551240146522115, 1 April 2021), “If you
think technology will solve
your problems, you don’t understand technology—and you don’t understand
your problems.” AI is not magic, and its results are fundamentally constrained
by the convictions and expectations of those that build, manage, deploy and
use it. Which makes it crucial that we understand the mechanisms behind the
systems and their decision capabilities.
The pursuit of fair AI is currently a lively one, involving many researchers,
meetings, and conferences (of which FAccT, https://facctconference.org/, is the best
known). Algorithmic fairness refers to the notion that an algorithm is fair if its
results are independent of
given variables, especially those considered sensitive, such as the traits of individ-
uals which should not correlate with the outcome (i.e. gender, ethnicity, sexual
orientation, disability, etc.). However, nothing is ever 100% fair in 100% of the
situations, and due to complex networked connections, ensuring fairness for one
(group) may lead to unfairness for others. Moreover, what we consider fair often
does depend on the traits of individuals. An obvious example is social services.
Most people believe in the need for some form of social services, whether it is for
children, for the elderly, for the sick or the poor. And many of us will benefit from
social services at least once in our lives. Decision making in the attribution of
social benefits is dependent on individual characteristics such as age, income, or
chronic health problems. Algorithmic fairness approaches however overempha-
size concepts such as equality and do not adequately address caring and concern
for others.


Many years ago, I participated in a project at my children’s basic school that was
aimed at helping kids develop fairness standards, roughly modelled along
Kohlberg’s stages of moral development. It became clear quite quickly that chil-
dren aged 6–12 easily understand that fairness comes in many ‘flavours’: if given
cookies to divide between all kids of the class, the leading principle was equality,
i.e. giving each kid the same amount of cookies. But they also understood and
accepted the concept of equity: for instance in deciding that a schoolmate with
dyslexia should be given more time to perform a school exam. Unfortunately, for
the average algorithm, common sense and world knowledge are many light years
away from that of a six year old, and switching between equity and equality
depending on what is the best approach to fairness in a given situation, is rarely
a feature of algorithmic decision making.

So, How Does Fairness Work in Algorithms, and What Is Being Done to Correct for Unfair Results?

Doctors deciding on a patient’s treatment, or judges deciding on sentencing,
must be certain that probability estimates for different conditions are correct
for each specific subject, independent of age, race or gender. Increasingly these
decisions are mediated by algorithms. Algorithmic fairness can be informally
described as the requirement that the probability of being classified in a certain
category should be similar for all who exhibit the relevant characteristics, independently of other traits
or properties. In order to ensure algorithmic fairness, given often very unbalanced
datasets, data scientists use calibration (i.e. the comparison of the actual output
and the expected output). Moreover, if we are concerned about fairness between
two groups (e.g. male and female patients, or African-American defendants and
Caucasian defendants) then this calibration condition should hold simultane-
ously for the set of people within each of these groups as well [4]. Calibration is
a crucial condition for risk prediction tools in many settings. If a risk prediction
tool for evaluating defendants is not calibrated towards race, for example, then a
probability estimate could carry different meaning for African-American than for
Caucasian defendants, and hence the tool would have the unintended and highly
undesirable consequence of incentivizing judges to take race into account when
interpreting the tool’s predictions [7]. At the same time, ideally the incidence
of false positives (being incorrectly classified as ‘X’) and false negatives (incorrectly
failing to be classified as ‘X’) should be the same independently of other
traits or properties. That is, fairness also means that, for instance, male and
female candidates have the same chance of being offered a service or product that is
irrelevant to them, or of failing to receive services or products that are relevant to them.
Unfortunately, research shows that it is not possible to satisfy some of these
expected properties of fairness simultaneously: calibration between groups, bal-
ance for false negatives, and balance for false positives. This means that if we
calibrate data, we need to be prepared to accept higher levels of false positives
and false negatives for some groups, and to deal with their human and societal
impact [6]. Taking the diagnostic example, a false positive means that a patient
is diagnosed with a disease they don’t have. With a false negative that disease
goes undiagnosed. The impact of either, both personal as well as societal, can be
huge. In the same way, being wrongly classified as someone with a high risk to
re-offend (false positive) has profound personal consequences, given that innocent
people are held without bail, while incorrect classification as someone with a low
risk to re-offend (false negative) has deep societal consequences, where people
that pose a real criminal threat are let free (this example is at the core of the well-known
ProPublica investigation of the COMPAS algorithm used by courts in the US to determine
recidivism risk: www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm).
Given these technical difficulties in achieving perfectly fair, data-driven, algo-
rithms, it is high time to start a conversation about the societal and individual
impact of false positives and false negatives, and, more importantly, about what
should be the threshold for acceptance of algorithmic decisions that, by their
nature, will never be completely ‘fair’.
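
To make the trade-off concrete, the short Python sketch below (not part of the original talk; the data, names, and threshold are invented for illustration) computes, for each group, a calibration gap and the false positive and false negative rates of a thresholded risk score. As the impossibility result cited above [6] implies, when base rates differ across groups these three quantities cannot, in general, be balanced simultaneously.

```python
import numpy as np

def group_fairness_report(y_true, y_score, group, threshold=0.5):
    """Per-group calibration and error-rate summary for a binary risk score."""
    y_true, y_score, group = map(np.asarray, (y_true, y_score, group))
    report = {}
    for g in np.unique(group):
        m = group == g
        y, s = y_true[m], y_score[m]
        pred = (s >= threshold).astype(int)
        # Calibration gap: mean predicted risk minus observed rate of positives.
        gap = s.mean() - y.mean()
        # Error-rate balance: false positive and false negative rates.
        fpr = ((pred == 1) & (y == 0)).sum() / max((y == 0).sum(), 1)
        fnr = ((pred == 0) & (y == 1)).sum() / max((y == 1).sum(), 1)
        report[g] = {"calibration_gap": gap, "fpr": fpr, "fnr": fnr}
    return report

# Tiny synthetic example: two groups with different base rates but similar scores.
rng = np.random.default_rng(0)
n = 1000
group = np.where(rng.random(n) < 0.5, "A", "B")
base_rate = np.where(group == "A", 0.3, 0.5)
y_true = (rng.random(n) < base_rate).astype(int)
y_score = np.clip(base_rate + rng.normal(0, 0.15, n), 0, 1)

for g, stats in group_fairness_report(y_true, y_score, group).items():
    print(g, {k: round(float(v), 3) for k, v in stats.items()})
```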

Fairness is Not About Bias but About Prejudice


A commonly voiced explanation for algorithmic bias is the prevalence of human
bias in the data. For example, when a job application filtering tool is trained
on decisions made by humans, the machine learning algorithm may learn to
discriminate against women or individuals with a certain ethnic background.
Often this will happen even if ethnicity or gender are excluded from the data
since the algorithm will be able to exploit the information in the applicant’s
name, address or even the use of certain words. For example, Amazon’s recruiting
AI system filtered out applications by women, because they lacked ‘masculine’
wording, commonly used in applications by men.
There are many reasons for bias in datasets, from choice of subjects, to
the omission of certain characteristics or variables that properly capture the
phenomenon we want to predict, to changes over time, place or situation, to the
way training data is selected.
Much has been done already to categorize and address the many forms of
machine bias [8]. Also, many tools are available to help unbias AI systems, including
IBM’s AI Fairness 360 (https://github.com/Trusted-AI/AIF360) and Google’s What If
Tool (https://pair-code.github.io/what-if-tool/index.html#about). Basically, these
tools support the testing and mitigation of bias through libraries of methods
and test environments. According to Google “[...] with the What If Tool you
can test performance in hypothetical situations, analyse the importance of dif-
ferent data features, and visualize model behaviour across multiple models and
subsets of input data, and for different ML fairness metrics.” Note the focus on
performance, a constant in much of the work on AI.
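
As a flavour of what such toolkits do, the minimal numpy sketch below (the data and function are invented for illustration; this is not code from either tool) implements reweighing, one of the simplest pre-processing mitigations shipped in libraries such as AI Fairness 360: each training instance is weighted so that the protected attribute and the label become statistically independent in the weighted sample.

```python
import numpy as np

def reweighing_weights(group, label):
    """Instance weights that decouple a protected attribute from the label."""
    group, label = np.asarray(group), np.asarray(label)
    w = np.zeros(len(label), dtype=float)
    for g in np.unique(group):
        for y in np.unique(label):
            mask = (group == g) & (label == y)
            expected = (group == g).mean() * (label == y).mean()  # if independent
            observed = mask.mean()
            if observed > 0:
                w[mask] = expected / observed
    return w

# Example: group "A" receives positive labels far more often than group "B".
group = np.array(["A"] * 80 + ["B"] * 20)
label = np.array([1] * 60 + [0] * 20 + [1] * 5 + [0] * 15)
w = reweighing_weights(group, label)
for g in ("A", "B"):
    m = group == g
    # Weighted positive rates are now equal (0.65) for both groups.
    print(g, round(float(np.average(label[m], weights=w[m])), 3))
```
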
However, not all bias is bad; in fact, there are even biases in the way we
approach bias. Bias in human data is not only impossible to fully eliminate, it
is often there for a reason. Bias is part of our lives partly because we do not
have enough cognitive bandwidth to make every decision from ground zero and
therefore need to use generalizations, or biases, as a starting point. Without
bias, we would not have been able to survive as a species; it helps us select from
a myriad of options in our environment. But even if not all biases are bad, that
doesn’t mean we shouldn’t address them.
Bias is not the problem, prejudice and discrimination are. Whereas prejudice
represents a preconceived judgment or attitude, discrimination is a behaviour. In
society, discrimination is often enacted through institutional structures and poli-
cies, and embedded in cultural beliefs and representations, and is thus reflected
in any data collected. The focus needs to be on using AI to support interventions
aimed at reducing prejudice and discrimination, e.g. through education, facili-
tation of intergroup contact, targeting social norms promoting positive relations
between groups, or supporting people in identifying their own biases and prejudices.
Facial analysis tools and recognition software have raised concerns about
racial bias in the technology. Work by Joy Buolamwini and Timnit Gebru has
shown how deep these biases go and how hard they are to eliminate [1]. In fact,
debiasing AI often leads to other biases. Sometimes this is known and under-
stood, such as the dataset they created as an alternative to the datasets commonly
used for training facial recognition algorithms: using what they called ‘parlia-
ments’, Buolamwini and Gebru created a dataset of faces balanced in terms of
race and gender, but notably unbalanced in terms of age, lighting or pose. As
long as this is understood, this dataset is probably perfectly usable for train-
ing an algorithm to recognise faces of a certain age, displayed under the same
lighting conditions and with the same pose. It will however not be usable if
someone tries to train an algorithm to recognise children’s faces. This illustrates
that debiasing data is not without risks, in particular because it focuses on those
characteristics that we are aware of, which are ‘coloured’ by our own experience,
time, place and culture.
AI bias is more than biased data. It starts with who is collecting the data,
who is involved in selecting and/or designing the algorithms and who is training
the algorithms and labelling the data. Moreover, decisions about which and
whose data is collected and which data is being used to train the algorithms
and how the algorithm is designed and trained also influence the fairness of the
results. From labelling farms to ghost workers, the legion of poorly paid, badly
treated and ignored human labourers, working behind the scenes of any working
AI system, is huge and little is being done to acknowledge them and to improve
their working conditions. Books such as ‘Ghost work’ [5] by Mary L. Gray and
Siddharth Suri, or ‘Atlas of AI’ [2] by Kate Crawford, are raising the issue but,
as often is the case, the ‘folk is sleeping’: it is easier to use the systems and
profit from their results, than to question how these results are being achieved
and at what cost. The question is thus: how fair is algorithm fairness for those
that label, train and calibrate the data it needs to produce fair results and, more
importantly, if we expect them to provide us with unbiased data, shouldn’t we
be treating them fairly?

Beyond Fairness
The fact that algorithms and humans cannot ever be completely fair does not
mean that we should just accept it. Improving fairness and overcoming prejudice
is partly a matter of understanding how the technology works. A matter of
education. Moreover, using technology properly, and treating those who use it and
are affected by it fairly, requires participation. Still, many stakeholders are not
invited to the table, not joining the conversation. David Sumpter describes the
quest for algorithmic fairness as a game of ‘whack-a-mole’: when you try to solve
bias in one place, it pops up somewhere else [9]. The elephant in the room
is the huge blind spot we all have about our own blind spots. We correct only
for the biases we are aware of. An inclusive, participatory approach to design and
development of AI systems will facilitate a wider scope.
Lack of fairness in AI systems is often also linked to a lack of explanatory
capabilities. If the results of the system cannot be easily understood or explained,
it is difficult to assess its fairness. Many of the current tools that evaluate bias
and fairness help identify where biases may occur, whether in the data or the
algorithms or even in their testing and evaluation. Even if not all AI systems
can be fully explainable, it is important to make sure that their decisions are
reproducible and the conditions for their use are clear and open to auditing.
Current AI algorithms are built for accuracy and performance, or for effi-
ciency. Improving the speed of the algorithm, minimizing its computational
requirements and maximizing the accuracy of the results are the mantras that
lead current computer science and engineering education. However, these are
not the only optimization criteria. When humans and society are at stake, other
criteria need to be considered. How do you balance safety and privacy? Explainabil-
ity and energy resources? Autonomy and accuracy? What do you do when you
cannot have both? Such moral overload dilemmas are at the core of responsible
development and use of AI [3].
Addressing them requires multidisciplinary development teams and involve-
ment of the humanities and social sciences in software engineering education. It
also requires a redefinition of incentives and metrics for what is a ‘good’ system.
Doing the right thing, and doing it well means that we also need to define what
is good and for whom.
Finally, it is important to keep up continuous efforts to improve algorithms and
data, define regulation and standardisation, and develop evaluation tools and
corrective frameworks. But at the same time, we cannot ignore that no technology
is without risk, no action is without risk. It is high time to start the conversation
on which AI-risks we find acceptable for individuals and for society as a whole,
and how we distribute these risks, as well as the benefits of AI.

References
1. Buolamwini, J., Gebru, T.: Gender shades: intersectional accuracy disparities in
commercial gender classification. In: Conference on Fairness, Accountability and
Transparency, pp. 77–91. PMLR (2018)
2. Crawford, K.: The Atlas of AI. Yale University Press, London (2021)
3. Dignum, V.: Responsible Artificial Intelligence: How to Develop and Use AI in
a Responsible Way. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30371-6
4. Flores, A.W., Bechtel, K., Lowenkamp, C.T.: False positives, false negatives, and
false analyses: a rejoinder to machine bias: there’s software used across the country
to predict future criminals. And it’s biased against blacks. Fed. Probation 80, 38
(2016)
5. Gray, M.L., Suri, S.: Ghost Work: How to Stop Silicon Valley from Building a New
Global Underclass. Eamon Dolan Books (2019)
6. Kleinberg, J., Mullainathan, S., Raghavan, M.: Inherent trade-offs in the fair deter-
mination of risk scores. arXiv preprint arXiv:1609.05807 (2016)
7. Pleiss, G., Raghavan, M., Wu, F., Kleinberg, J., Weinberger, K.Q.: On fairness and
calibration. arXiv preprint arXiv:1709.02012 (2017)
8. Schnabel, T., Swaminathan, A., Singh, A., Chandak, N., Joachims, T.: Recommen-
dations as treatments: debiasing learning and evaluation. In: International Confer-
ence on Machine Learning, pp. 1670–1679. PMLR (2016)
9. Sumpter, D.: Outnumbered: From Facebook and Google to Fake News and Filter
Bubbles – The Algorithms that Control Our Lives. Bloomsbury Publishing, London (2018)
Image Analysis
A Petri Dish for Histopathology
Image Analysis

Jerry Wei1, Arief Suriawinata2, Bing Ren2, Xiaoying Liu2, Mikhail Lisovsky2,
Louis Vaickus2, Charles Brown2, Michael Baker2, Naofumi Tomita1,
Lorenzo Torresani1, Jason Wei1, and Saeed Hassanpour1

1 Dartmouth College, Hanover, NH, USA (saeed.hassanpour@dartmouth.edu)
2 Dartmouth-Hitchcock Medical Center, Lebanon, NH, USA

Abstract. With the rise of deep learning, there has been increased interest in
using neural networks for histopathology image analysis, a field that investigates
the properties of biopsy or resected specimens traditionally manually examined
under a microscope by pathologists. However, challenges such as limited data,
costly annotation, and processing high-resolution and variable-size images make
it difficult to quickly iterate over model designs.
Throughout scientific history, many significant research directions have lever-
aged small-scale experimental setups as petri dishes to efficiently evaluate
exploratory ideas. In this paper, we introduce a minimalist histopathology image
analysis dataset (MHIST), an analogous petri dish for histopathology image anal-
ysis. MHIST is a binary classification dataset of 3,152 fixed-size images of col-
orectal polyps, each with a gold-standard label determined by the majority vote of
seven board-certified gastrointestinal pathologists, together with each image’s annotator agreement level.
MHIST occupies less than 400 MB of disk space, and a ResNet-18 baseline can
be trained to convergence on MHIST in just 6 min using 3.5 GB of memory on
an NVIDIA RTX 3090. As example use cases, we use MHIST to study natural
questions such as how dataset size, network depth, transfer learning, and high-
disagreement examples affect model performance.
By introducing MHIST, we hope to not only help facilitate the work of cur-
rent histopathology imaging researchers, but also make the field more accessible
to the general community. Our dataset is available at https://bmirds.github.io/MHIST.

Keywords: Histopathology images · Deep learning · Medical image analysis

1 Introduction
Scientific research has aimed to study and build our understanding of the world, and
although many problems initially seemed too ambitious, they were ultimately sur-
mounted. In these quests, a winning approach has often been to break down large ideas
into smaller components, learn about these components through experiments that can
be iterated on quickly, and then validate or translate those ideas into large-scale applica-
tions. For example, in the Human Genome Project (which helped us understand much
of what we know now about human genetics), many fundamental discoveries resulted
from petri dish experiments—small setups that saved time, energy, and money—on
simpler organisms. In particular, the Drosophila fruit fly, an organism that is inexpen-
sive to culture, has short life cycles, produces large numbers of embryos, and can be
easily genetically modified, has been used in biomedical research for over a century to
study a broad range of phenomena [1].

[Fig. 1. Key features of our minimalist histopathology image analysis dataset (MHIST).]
In deep learning, we have our own set of petri dishes in the form of benchmark
datasets, of which MNIST [2] is one of the most popular. Comprising the straightfor-
ward problem of classifying handwritten digits in 28 by 28 pixel images, MNIST is
easily accessible, and training a strong classifier on it has become a simple task with
today’s tools. Because it is so easy to evaluate models on MNIST, it has served as the
exploratory environment for many ideas that were then validated on large scale datasets
or implemented in end-to-end applications. For example, many well-known concepts
such as convolutional neural networks, generative adversarial networks [3], and the
Adam optimization algorithm [4] were initially validated on MNIST.
In the field of histopathology image analysis, however, no such classic dataset cur-
rently exists, for several potential reasons. To start, most health institutions do not
have the technology nor the capacity to scan histopathology slides at the scale needed
to create a reasonably-sized dataset. Even for institutions that are able to collect data,
a barrage of complex data processing and annotation decisions falls upon the aspir-
ing researcher, as histopathology images are large and difficult to process, and data
annotation requires the valuable time of trained pathologists. Finally, even after data
is processed and annotated, it can be challenging to obtain institutional review board
(IRB) approval for releasing such datasets, and some institutions may wish to keep
such datasets private. As a result of the inaccessibility of data, histopathology image
analysis has remained on the fringes of computer vision research, with many popular
image datasets dealing with domains where data collection and annotation are more
straightforward.

To address these challenges that have plagued deep learning for histopathology
image analysis since the beginning of the area, in this paper, we introduce MHIST:
a minimalist histopathology image classification dataset. MHIST is minimalist in that
it comprises a straightforward binary classification task of fixed-size colorectal polyp
images, a common and clinically-significant task in gastrointestinal pathology. MHIST
contains 3,152 fixed-size images, each with a gold-standard label determined from the
majority vote of seven board-certified gastrointestinal pathologists, that can be used
to train a baseline model without additional data processing. By releasing this dataset
publicly, we hope not only that current histopathology image researchers can build
models faster, but also that general computer vision researchers looking to apply mod-
els to datasets other than classic benchmarks can easily explore the exciting area of
histopathology image analysis. Our dataset is publicly available at https://bmirds.github.io/MHIST following completion of a simple dataset-use agreement form.

2 Background

Deep learning for medical image analysis has recently seen increased interest in analyz-
ing histopathology images (large, high-resolution scans of histology slides that are typ-
ically examined under a microscope by pathologists) [5]. To date, deep neural networks
have already achieved pathologist-level performance on classifying diseases such as
prostate cancer, breast cancer, lung cancer, and melanoma [6–12], demonstrating their
large potential. Despite these successes, histopathology image analysis has not seen
the same level of popularity as analysis of other medical image types (e.g., radiology
images or CT scans), likely because the nature of histopathology images creates a num-
ber of hurdles that make it challenging to directly apply mainstream computer vision
methods. Below, we list some factors that can potentially impede the research workflow
in histopathology image analysis:

– High-resolution, variable-size images. Because the disease patterns in histology


slides can only occur in certain sections of the tissue and can only be detected at
certain magnifications under the microscope, histopathology images are typically
scanned at high resolution so that all potentially-relevant information is preserved.
This means that while each sample contains lots of data, storing these large, high-
resolution images is nontrivial. For instance, the slides from a single patient in
the CAMELYON17 challenge [13] range from 2 GB to 18 GB in size, which is
up to one-hundred times larger than the entire CIFAR-10 dataset. Moreover, the
size and aspect ratios of the slides can differ based on the shape of the specimen
in question—sometimes, multiple large specimens are included in one slide, and so
some scanned slides may be up to an order of magnitude larger than others. As deep
neural networks typically require fixed-dimension inputs, preprocessing decisions
such as what magnification to analyze the slides at and how to deal with variable-
size inputs can be difficult to make.
– Cost of annotation. Whereas annotating data in deep learning has been simpli-
fied by services such as Mechanical Turk, there is no well-established service for
annotating histopathology images, a process which requires substantial time from

experienced pathologists who are often busy with clinical service. Moreover, access
to one or two pathologists is often inadequate because inter-annotator agreement is
low to moderate for most tasks, and so annotations can be easily biased towards the
personal tendencies of annotators.
– Unclear annotation guidelines. It is also unclear what type of annotation is needed
for high-resolution whole-slide images, as a slide may be given a certain diagnosis
based on a small portion of diseased tissue, but the overall diagnosis would not apply
to the normal portions of the tissue. Researchers often opt to have pathologists draw
bounding boxes and annotate areas with their respective histological characteristics,
but this comes with substantial costs, both in training pathologists to use annotation
software and in increased annotation time and effort.
– Lack of data. Even once these challenges are addressed, it is often the case that,
due to slides being discarded as a result of poor quality or to remove classes that are
too rare to include in the classification task, training data is relatively limited and
the test set is not sufficiently large. This makes it difficult to distinguish accurately
between models, and models are therefore easily prone to overfitting.

To mitigate these challenges of data collection and annotation, in this paper we


introduce a minimalist histopathology dataset that will allow researchers to quickly
train a histopathology image classification model without dealing with an avalanche of
complex data processing and annotation decisions. Our dataset focuses on the binary
classification of colorectal polyps, a straightforward task that is common in a gastroin-
testinal pathologist’s workflow. Instead of using whole-slide images, which are too large
to directly train on for most academic researchers, our dataset consists only of 224 ×
224 pixel image tiles of tissue; these images can be directly fed into standard com-
puter vision models such as ResNet. Finally, for annotations, each patch in our dataset
was directly classified by seven board-certified gastrointestinal pathologists and given
a gold-standard label based on their majority vote.
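To make the intended workflow concrete, a minimal PyTorch data-loading sketch for fixed-size tiles of this kind is shown below; the CSV columns used here (image, label, agreement) are illustrative assumptions rather than the dataset's documented schema.

```python
# Minimal sketch of loading fixed-size polyp tiles for a standard classifier.
# The annotations CSV layout (image, label, agreement) is an illustrative
# assumption, not the official MHIST file format.
import pandas as pd
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class PolypTiles(Dataset):
    def __init__(self, csv_path, image_dir):
        self.df = pd.read_csv(csv_path)        # assumed columns: image, label, agreement
        self.image_dir = image_dir
        self.classes = {"HP": 0, "SSA": 1}     # benign vs. precancerous
        self.tf = transforms.ToTensor()        # 224x224 tiles need no resizing

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        img = Image.open(f"{self.image_dir}/{row['image']}").convert("RGB")
        return self.tf(img), self.classes[row["label"]], int(row["agreement"])
```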
Our dataset aims to serve as a petri dish for histopathology image analysis. That is,
it represents a simple task that can be learned quickly, and it is easy to iterate over. Our
dataset allows researchers to, without dealing with the confounding factors that arise
from the nature of histopathology images, quickly test inductive biases that can later be
implemented in large-scale applications. We hope that our dataset will allow researchers
to more-easily explore histopathology image analysis and that this can facilitate further
research in the field as a whole.

3 MHIST Dataset
In the context of the challenges mentioned in the above section, MHIST has several
notable features that we view favorably in a minimalist dataset:
1. Straightforward binary classification task that is challenging and important.
2. Adequate yet tractable number of examples: 2,175 training and 977 testing images.
3. Fixed-size images of appropriate dimension for standard models.
4. Gold-standard labels from the majority vote of seven pathologists, along with anno-
tator agreement levels that can be used for more-specific model tuning.

Table 1. Number of images in our dataset’s training and testing sets for each class. HP: hyper-
plastic polyp (benign), SSA: sessile serrated adenoma (precancerous).

        Train   Test   Total
HP      1,545   617    2,162
SSA     630     360    990
Total   2,175   977    3,152

The rest of this section details the colorectal polyp classification task (Sect. 3.1),
data collection (Sect. 3.2), and the data annotation process (Sect. 3.3).

3.1 Colorectal Polyp Classification Task


Colorectal cancer is the second leading cause of cancer death in the United States, with
an estimated 53,200 deaths in 2020 [14]. As a result, colonoscopy is one of the most
common cancer screening programs in the United States [15], and classification of col-
orectal polyps (growths inside the colon lining that can lead to colonic cancer if left
untreated) is one of the highest-volume tasks in pathology. Our task focuses on the
clinically-important binary distinction between hyperplastic polyps (HPs) and sessile
serrated adenomas (SSAs), a challenging problem with considerable inter-pathologist
variability [16–20]. HPs are typically benign, while SSAs are precancerous lesions
that can turn into cancer if left untreated and require sooner follow-up examinations
[21]. Pathologically, HPs have a superficial serrated architecture and elongated crypts,
whereas SSAs are characterized by broad-based crypts, often with complex structure
and heavy serration [22].

3.2 Data Collection


For our dataset, we scanned 328 Formalin Fixed Paraffin-Embedded (FFPE) whole-
slide images of colorectal polyps, which were originally diagnosed on the whole-slide
level as hyperplastic polyps (HPs) or sessile serrated adenomas (SSAs), from patients
at the Dartmouth-Hitchcock Medical Center. These slides were scanned by an Aperio
AT2 scanner at 40x resolution; to increase the field of view, we compress the slides to
8x magnification. From these 328 whole-slide images, we then extracted 3,152 image
tiles (portions of size 224 × 224 pixels) representing diagnostically-relevant regions of
interest for HPs or SSAs. These images were shuffled and anonymized by removing all
metadata such that no sensitive patient information was retrievable from any images. All
images contain mostly tissue by area (as opposed to white space) and were confirmed
by our pathologists to be high-quality with few artifacts. The use and release of our
dataset was approved by Dartmouth-Hitchcock Health IRB.

3.3 Data Annotation


For data annotation, we worked with seven board-certified gastrointestinal pathologists
at the Dartmouth-Hitchcock Medical Center. Each pathologist individually and inde-
pendently classified each image in our dataset as either HP or SSA based on the World

Fig. 2. Distribution of annotator agreement levels for images in our dataset.

Table 2. Model performance for five different ResNet depths. Adding more layers to the model
does not improve performance. n indicates the number of images per class used for training.
Means and standard deviations shown are for 10 random seeds.

AUC (%) on test set by training set size
ResNet   n = 100      n = 200      n = 400      Full
18       67.4 ± 3.1   73.6 ± 3.7   79.3 ± 2.3   84.5 ± 1.1
34       64.1 ± 2.2   74.8 ± 3.1   78.0 ± 2.4   85.1 ± 0.7
50       64.7 ± 3.1   72.2 ± 2.8   76.6 ± 1.9   83.0 ± 0.6
101      65.2 ± 5.6   73.2 ± 1.9   77.3 ± 0.9   83.2 ± 1.3
152      62.3 ± 3.3   73.3 ± 1.2   77.5 ± 1.9   83.5 ± 0.8

Health Organization criteria from 2019 [23]. After labels were collected for all images
from all pathologists, the gold standard label for each image was assigned based on the
majority vote of the seven individual labels, a common choice in literature [24–33]. The
distribution of each class in our dataset based on the gold standard labels of each image
is shown in Table 1.
In our dataset, the average percent agreement between each pair of annotators was
72.9%, and each pathologist agreed with the majority vote an average of 83.2% of the
time. There is, notably, nontrivial disagreement between pathologists (approximately
16.7% of images have 4/7 agreement), which corresponds with the difficulty of our
colorectal polyp classification task. The mean of the per-pathologist Cohen’s κ was
0.450, in the moderate range of 0.41−0.60. Although not directly comparable with prior
work, a similar evaluation found a Cohen’s κ of 0.380 among four pathologists [20].
To facilitate research that might consider the annotator agreement of examples during
training, for each image, we also provide the agreement level among our annotators
(4/7, 5/7, 6/7, or 7/7). Figure 2 shows the distribution of agreement levels for our dataset.
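For readers who wish to reproduce this style of agreement analysis on their own annotations, the sketch below computes average pairwise percent agreement and per-annotator Cohen's κ against the majority vote; the 7×N label matrix is a hypothetical stand-in, since access to raw per-annotator labels is assumed.

```python
# Sketch of the agreement statistics described above, assuming a
# (num_annotators x num_images) array of binary per-annotator labels.
import itertools
import numpy as np
from sklearn.metrics import cohen_kappa_score

def agreement_stats(labels):
    """labels: array of shape (A, N) with values in {0, 1}."""
    A, _ = labels.shape
    majority = (labels.sum(axis=0) > A / 2).astype(int)

    # Average pairwise percent agreement between annotators.
    pair_agree = [np.mean(labels[i] == labels[j])
                  for i, j in itertools.combinations(range(A), 2)]

    # Per-annotator Cohen's kappa against the majority vote (one reasonable reading).
    kappas = [cohen_kappa_score(labels[i], majority) for i in range(A)]
    return float(np.mean(pair_agree)), float(np.mean(kappas))

rng = np.random.default_rng(0)
toy = rng.integers(0, 2, size=(7, 100))   # 7 annotators, 100 toy images
print(agreement_stats(toy))
```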

4 Example Use Cases


In this section, we demonstrate example use cases of our dataset by investigating sev-
eral natural questions that arise in histopathology image analysis. Namely, how does
network depth affect model performance (Sect. 4.2)? How much does ImageNet pre-
training help (Sect. 4.3)? Should examples with substantial annotator disagreement be
included in training (Sect. 4.4)? Moreover, we vary the size of the training set to gain
insight on how the amount of available data interacts with each of the above factors.

4.1 Experimental Setup


For our experiments, we follow the DeepSlide code repository [11] and use the ResNet
architecture, a common choice for classifying histopathology images. Specifically, for
our default baseline, we use ResNet-18 and train our model for 100 epochs (well past
convergence) using stochastic data augmentation with the Adam optimizer [4], batch
size of 32, initial learning rate of 1 × 10⁻³, and learning rate decay factor of 0.91.
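A rough PyTorch rendering of this baseline configuration is shown below; it follows the stated hyperparameters, but the augmentation transforms are placeholders rather than the exact DeepSlide pipeline.

```python
# Sketch of the stated baseline: ResNet-18, Adam, batch size 32,
# initial LR 1e-3 decayed by 0.91 per epoch, trained for 100 epochs.
# Augmentations are illustrative placeholders, not the DeepSlide set.
import torch
import torch.nn as nn
from torchvision import models, transforms

model = models.resnet18(num_classes=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.91)
criterion = nn.CrossEntropyLoss()

train_tf = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.ColorJitter(brightness=0.1, contrast=0.1),
    transforms.ToTensor(),
])

def train(loader, epochs=100, device="cuda"):
    model.to(device)
    for _ in range(epochs):
        model.train()
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images.to(device)), labels.to(device))
            loss.backward()
            optimizer.step()
        scheduler.step()   # multiply LR by 0.91 once per epoch
```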

Table 3. Using weights pretrained on ImageNet significantly improves the performance of
ResNet-18 on our dataset. n indicates the number of images per class used for training. Means
and standard deviations shown are for 10 random seeds.

AUC (%) on test set by training set size
Pretraining?   n = 100      n = 200      n = 400      Full
No             67.4 ± 3.1   73.6 ± 3.7   79.3 ± 2.3   84.5 ± 1.1
Yes            83.7 ± 1.7   89.3 ± 1.8   92.4 ± 0.7   92.7 ± 0.4

For more-robust evaluation, test-set AUC is evaluated at every epoch, and for each model we
consider the five highest values. We report the mean and standard deviation of these values
calculated over 10 different random seeds.
Furthermore, we train our models with four different training set sizes: n = 100,
n = 200, n = 400, and Full, where n is the number of training images per class and
Full is the entire training set. To obtain subsets of the training set, we randomly sample
n images for each class from the training set for each seed. We keep our testing
set fixed to ensure that models are evaluated equally.
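Under one reading of this protocol (average the five best per-seed AUCs, then aggregate across seeds), the bookkeeping might look as follows; both helper functions are illustrative, not code from the paper.

```python
# Sketch of the evaluation protocol: record test AUC every epoch, keep the five
# best values per run, then report mean and std across random seeds. Also shows
# how n images per class might be subsampled for the reduced training sets.
import numpy as np

def summarize_runs(per_epoch_aucs_by_seed, k=5):
    """per_epoch_aucs_by_seed: list of 1-D arrays of test AUCs, one array per seed."""
    best_k = [np.sort(np.asarray(aucs))[-k:].mean()
              for aucs in per_epoch_aucs_by_seed]
    return float(np.mean(best_k)), float(np.std(best_k))

def subsample_per_class(labels, n, rng):
    """Pick n random training indices per class (labels: 1-D integer array)."""
    idx = [rng.choice(np.flatnonzero(labels == c), size=n, replace=False)
           for c in np.unique(labels)]
    return np.concatenate(idx)
```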

4.2 Network Depth


We first study whether adding more layers to our model improves performance on our
dataset. Because deeper models take longer to train, identifying the smallest model that
achieves the best performance allows for maximum accuracy with the least necessary
training time.
We evaluate all five ResNet models proposed in [34]—ResNet-18, ResNet-34,
ResNet-50, ResNet-101, and ResNet-152—on our dataset, and all hyperparameters
(e.g., number of epochs, batch size) are kept constant; we only change the model depth.

As shown in Table 2, adding more layers does not significantly improve perfor-
mance at any training dataset size. Furthermore, adding model depth past ResNet-34
actually decreases performance, as models that are too deep will begin to overfit, espe-
cially for small training set sizes. For example, when training with only 100 images per
class, mean AUC decreases by 5.1% when using ResNet-152 compared to ResNet-18.
We posit that increasing network depth does not improve performance on our dataset
because our dataset is relatively small, and so deeper networks are unnecessary for the
amount of information in our dataset. Moreover, increasing network depth may increase
overfitting on training data due to our dataset’s small size. Our results are consistent with
findings presented by Benkendorf and Hawkins [35]—deeper networks only perform
better than shallow networks when trained with large sample sizes.

4.3 Transfer Learning

We also examine the usefulness of transfer learning for our dataset, as transfer learning
can often be easily implemented into existing models, and so it is helpful to know
whether or not it can improve performance.

Table 4. Removing high-disagreement images during training may slightly improve performance.
Means and standard deviations shown are for 10 random seeds.

Training images used               AUC (%) on test set
Very easy images only              79.9 ± 0.8
Easy images only                   83.1 ± 0.6
Very easy + Easy images            84.6 ± 0.8
Very easy + Easy + Hard images     85.1 ± 0.8
All images                         84.5 ± 1.1

Because deeper models do not achieve better performance on our dataset (as shown
in Sect. 4.2), we use ResNet-18 initialized with random weights as the baseline model
for this experiment. We compare our baseline with an identical model (i.e., all hyperpa-
rameters are congruent) that has been initialized with weights pretrained on ImageNet.
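In PyTorch, the two initializations being compared can be constructed roughly as follows; the torchvision weights enum shown assumes a recent torchvision version, and the two-class head replaces the default 1000-class layer.

```python
# Sketch of the two initializations compared here: random weights versus
# ImageNet-pretrained weights, with the final layer resized for two classes.
import torch.nn as nn
from torchvision import models

def make_resnet18(pretrained: bool):
    if pretrained:
        model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    else:
        model = models.resnet18(weights=None)
    model.fc = nn.Linear(model.fc.in_features, 2)   # HP vs. SSA head
    return model
```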
Table 3 shows the results for our ResNet-18 model with and without pretraining. We
find that ResNet-18 initialized with ImageNet pretrained weights significantly outper-
forms ResNet-18 initialized with random weights. For example, our pretrained model’s
performance when trained with only 100 images per class is comparable to our baseline
model’s performance when trained with the full training dataset. When both models are
trained on the full training set, the pretrained model outperforms the baseline by 8.2%,
as measured by mean AUC. These results indicate that, for our dataset, using pretrained
weights can be extremely helpful for improving overall performance.
The large improvement from ImageNet pretraining is unlikely to result from our
dataset having features expressed in ImageNet because ImageNet does not include
histopathology images. Instead, the improvement is, perhaps, explained by our dataset’s

small size, as ImageNet pretraining may help prevent the model from overfitting. This
would be consistent with Kornblith et al. [36], who found that performance improve-
ments from ImageNet pretraining diminish as dataset size increases.

4.4 High-Disagreement Training Examples

As many datasets already contain annotator agreement data [24–30, 33, 37–41], we also
study whether there are certain ways of selecting examples based on their annotator
agreement level that will maximize performance. Examples with high annotator dis-
agreement are, by definition, harder to classify, so they may not always contain features
that are beneficial for training models. For this reason, we focus primarily on whether
training on only examples with higher annotator agreement will improve performance.
For our dataset, which was labeled by seven annotators, we partition our images
into four discrete levels of difficulty: very easy (7/7 agreement among annotators), easy
(6/7 agreement among annotators), hard (5/7 agreement among annotators), and very
hard (4/7 agreement among annotators), following our prior work [42]. We then train
ResNet-18 models using different combinations of images selected based on difficulty:
very easy images only; easy images only; very easy and easy images; and very easy,
easy, and hard images. For this experiment, we do not modify the dataset size like we
did in Sects. 4.2 and 4.3, as selecting training images based on difficulty inherently
changes the training set size.
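Assuming each training example carries its agreement count out of seven, the difficulty-based subsets can be assembled with a simple filter such as the sketch below.

```python
# Sketch: partition training examples by annotator agreement (out of 7) and
# build the subsets used above, e.g. "very easy + easy" keeps 7/7 and 6/7 images.
DIFFICULTY = {7: "very_easy", 6: "easy", 5: "hard", 4: "very_hard"}

def select_by_difficulty(examples, allowed):
    """examples: iterable of (image_path, label, agreement); allowed: set of names."""
    return [ex for ex in examples if DIFFICULTY[ex[2]] in allowed]

# e.g. train_set = select_by_difficulty(all_examples, {"very_easy", "easy", "hard"})
```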
As shown in Table 4, we find that excluding images with high annotator disagree-
ment (i.e., hard and very hard images) during training achieves comparable performance
to training with all images. Using only very easy images or only easy images, however,
does not match or exceed performance when training with all images. We also find that
training with all images except very hard images slightly outperforms training with all
images. One explanation for this is that very hard images, which only have 4/7 annota-
tor agreement, could be too challenging to analyze accurately (even for expert humans),
so their features might not be beneficial for training machine learning models either.

5 Related Work

Due to the trend towards larger, more computationally-expensive models [43], as well
as recent attention on the environmental considerations of training large models [44],
the deep learning community has begun to question whether model development needs
to occur at scale. In the machine learning field, two recent papers have brought attention
to this idea. Rawal et al. [45] proposed a novel surrogate model for rapid architecture
search, an artificial setting that predicts the ground-truth performance of archi-
tectural motifs. Greydanus [46] proposed MNIST-1D, a low-compute alternative
to MNIST that differentiates more clearly between models. Our dataset falls
within this direction and is heavily inspired by this work.
In the histopathology image analysis domain, several datasets are currently avail-
able. Perhaps the two best-known datasets are CAMELYON17 [13] and PCam [47].
CAMELYON17 focuses on breast cancer metastasis detection in whole-slide images
(WSIs) and includes a training set of 1,000 WSIs with labeled locations. CAMELYON17

is well-established, but because it contains WSIs (each taking up >1 GB), there is a
large barrier to training an initial model, and it is unclear how to best pre-process the
data to be compatible with current neural networks’ desired input format.
PCam is another well-known dataset that contains 327,680 images of size 96 ×
96 pixels extracted from CAMELYON17. While PCam is similar to our work in that
it considers fixed-size images for binary classification, we note two key differences.
First, the annotations in PCam are derived from bounding-box annotations which were
drawn by a student and then checked by a pathologist. For challenging tasks with high
annotator disagreement, however, using only a single annotator can cause the model to
learn specific tendencies of a single pathologist. In our dataset, on the other hand, each
image is directly classified by seven expert pathologists, and the gold standard is set as
the majority vote of the seven labels, mitigating the potential biases that can arise from
having only a single annotator. Second, whereas PCam takes up around 7 GB of disk
space, our dataset aims to be minimalist and is therefore an order of magnitude smaller,
making it faster for researchers to obtain results and iterate over models.
In Table 5, we compare our dataset with other previously-proposed histopathol-
ogy image analysis datasets. Our dataset is much smaller than other datasets, yet
it still has enough examples to serve as a petri dish in that it can test models and
return results quickly. Additionally, our dataset has robust annotations in comparison
to other histopathology datasets. Datasets frequently only have one or two annotators,
but MHIST is annotated by seven pathologists, making it the least influenced by biases
that any singular annotator may have.

Table 5. Comparison of well-known histopathology datasets. Our proposed dataset, MHIST, is


advantageous due to its relatively small size (making it faster to obtain results) and its robust
annotations. ROI: Regions of Interest.

Dataset              Images    Image type          Annotation type   Number of annotators   Dataset size
MITOS (2012) [48]    50        High power fields   Pixel-level       2                      ∼1 GB
TUPAC16 [49]         821       Whole-slide images  ROI               1                      ∼850 GB
CAMELYON17 [13]      1,000     Whole-slide images  Contoured         1                      ∼2.3 TB
PCam (2018) [47]     327,680   Fixed-sized images  Image-wise        1                      ∼7 GB
BACH (2018) [50]     500       Microscopy images   Image-wise        2                      >5 GB
LYON19 [51]          83        Whole-slide images  ROI               3                      ∼13 GB
MHIST (Ours)         3,152     Fixed-sized images  Image-wise        7                      ∼333 MB

6 Discussion
The inherent nature of histopathology image classification can create challenges for
researchers looking to apply mainstream computer vision methods. Histopathology
images themselves are difficult to handle because they have high resolutions and are
variable-sized, and accurately and efficiently annotating histopathology images is a
nontrivial task. Furthermore, being able to address these challenges does not guarantee

a high-quality dataset, as histopathology images are difficult to acquire, and so data is


often quite limited. Based on a thorough analysis of these challenges, we have presented
MHIST, a histopathology image classification dataset with a straightforward yet chal-
lenging binary classification task. MHIST comprises a total of 3,152 fixed-size images
that have already been preprocessed. In addition to providing these images, we also
include each image’s gold standard label and degree of annotator agreement.
Of possible limitations, our use of fixed-size images may not be the most-precise
approach for histopathological image analysis, as using whole-slide images directly
would likely improve performance since whole-slide images contain much more infor-
mation than fixed-size images. Current computer vision models cannot train on whole-
slide images, however, as a single whole-slide image can take up more than 10 GB of
space. Thus, our dataset includes images of fixed-size, as this is appropriate for most
standard computer vision models. Another limitation is that our dataset does not include
any demographic information about patients nor any information regarding the size and
location of the polyp (data that is often used in clinical classification). Our dataset con-
tains purely image data and is limited in this fashion.
In this paper, we aim to have provided a dataset that can serve as a petri dish for
histopathology image analysis. We hope that researchers are able to use MHIST to test
models on a smaller scale before being implemented in large-scale applications, and
that our dataset will facilitate further research into deep learning methodologies for
histopathology image analysis.

References
1. Jennings, B.H.: Drosophila - a versatile model in biology & medicine. Mater. Today, 14(5),
190–195 (2011). http://www.sciencedirect.com/science/article/pii/S1369702111701134
2. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document
recognition. Proc. IEEE, 86(11), 2278–2324 (1998). https://ieeexplore.ieee.org/document/
726791
3. Goodfellow, I.J., et al.: Generative adversarial networks (2014). https://arxiv.org/pdf/1406.
2661.pdf
4. Kingma, D., Ba, J.: Adam: a method for stochastic optimization. In: International Conference
on Learning Representations (2014). https://arxiv.org/pdf/1412.6980.pdf
5. Srinidhi, C.L., Ciga, O., Martel, A.L.: Deep neural network models for computational
histopathology: a survey. Med. Image Anal. 101813 (2020). https://www.sciencedirect.com/
science/article/pii/S1361841520301778
6. Arvaniti, E., et al.: Automated Gleason grading of prostate cancer tissue microarrays via
deep learning. Nat. Sci. Rep. 8(1), 1–11 (2018). https://www.nature.com/articles/s41598-
018-30535-1
7. Bulten, W., et al.: Automated deep-learning system for Gleason grading of prostate cancer
using biopsies: a diagnostic study. Lancet Oncol. 21(2), 233–241 (2020). https://arxiv.org/
pdf/1907.07980.pdf
8. Hekler, A., et al.: Pathologist-level classification of histopathological melanoma images with
deep neural networks. Euro. J. Cancer, 115, 79–83 (2019). http://www.sciencedirect.com/
science/article/pii/S0959804919302758

9. Shah, M., Wang, D., Rubadue, C., Suster, D., Beck, A.: Deep learning assessment of tumor
proliferation in breast cancer histological images. In: 2017 IEEE International Conference
on Bioinformatics and Biomedicine (BIBM), pp. 600–603 (2017). https://ieeexplore.ieee.
org/abstract/document/8217719
10. Ström, P., et al.: Pathologist-level grading of prostate biopsies with artificial intelligence.
CoRR (2019). http://arxiv.org/pdf/1907.01368
11. Wei, J.W., Tafe, L.J., Linnik, Y.A., Vaickus, L.J., Tomita, N., Hassanpour, S.: Pathologist-
level classification of histologic patterns on resected lung adenocarcinoma slides with deep
neural networks. Sci. Rep. 9(1), 1–8 (2019). https://www.nature.com/articles/s41598-019-
40041-7
12. Zhang, Z., et al.: Pathologist-level interpretable whole-slide cancer diagnosis with deep
learning. Nat. Mach. Intell. 1, 236–245 (2019). https://www.nature.com/articles/s42256-
019-0052-1
13. Bándi, P., et al.: From detection of individual metastases to classification of lymph node
status at the patient level: the camelyon17 challenge. IEEE Trans. Med. Imaging, 38(2),
550–560 (2019). https://ieeexplore.ieee.org/document/8447230
14. Colorectal cancer statistics. https://www.cancer.org/cancer/colon-rectal-cancer/about/key-
statistics.html. Accessed 06 Jan 2021
15. Rex, D.K., et al.: Colorectal cancer screening: Recommendations for physicians and patients
from the U.S. multi-society task force on colorectal cancer. Gastroenterology, 153, 307–323
(2017). www.gastrojournal.org/article/S0016-5085(17)35599--3/fulltext
16. Abdeljawad, K., Vemulapalli, K.C., Kahi, C.J., Cummings, O.W., Snover, D.C., Rex, D.K.:
Sessile serrated polyp prevalence determined by a colonoscopist with a high lesion detec-
tion rate and an experienced pathologist. Gastrointest. Endosc. 81, 517–524 (2015). https://
pubmed.ncbi.nlm.nih.gov/24998465/
17. Farris, A.B., et al.: Sessile serrated adenoma: challenging discrimination from other ser-
rated colonic polyps. Am. J. Surg. Pathol. 32, 30–35 (2008). https://pubmed.ncbi.nlm.nih.
gov/18162767/
18. Glatz, K., Pritt, B., Glatz, D., Hartmann, A., O’Brien, M.J., Glaszyk, H.: A multina-
tional, internet-based assessment of observer variability in the diagnosis of serrated colorec-
tal polyps. Am. J. Clin. Pathol. 127(6), 938–945 (2007). https://pubmed.ncbi.nlm.nih.gov/
17509991/
19. Khalid, O., Radaideh, S., Cummings, O.W., O’brien, M.J., Goldblum, J.R., Rex, D.K.:
Reinterpretation of histology of proximal colon polyps called hyperplastic in 2001. World
J. Gastroenterol. 15(30), 3767–3770 (2009). https://www.ncbi.nlm.nih.gov/pmc/articles/
PMC2726454/
20. Wong, N.A.C.S., Hunt, L.P., Novelli, M.R., Shepherd, N.A., Warren, B.F.: Observer agree-
ment in the diagnosis of serrated polyps of the large bowel. Histopathology, 55(1), 63–66
(2009). https://pubmed.ncbi.nlm.nih.gov/19614768/
21. Understanding your pathology report: Colon polyps (sessile or traditional serrated adeno-
mas). https://www.cancer.org/treatment/understanding-your-diagnosis/tests/understanding-
your-pathology-report/colon-pathology/colon-polyps-sessile-or-traditional-serrated-
adenomas.html. Accessed 06 Jan 2021
22. Gurudu, S.R., et al.: Sessile serrated adenomas: Demographic, endoscopic and pathological
characteristics. World J. Gastroenterol. 16(27), 3402–3405 (2010). https://www.ncbi.nlm.
nih.gov/pmc/articles/PMC2904886/
23. Nagtegaal, I.D., et al.: The 2019 who classification of tumours of the digestive system.
Histopathology, 76(2), 182–188 (2020). https://onlinelibrary.wiley.com/doi/abs/10.1111/his.
13975

24. Chilamkurthy, S., et al.: Deep learning algorithms for detection of critical findings in head CT
scans: a retrospective study. Lancet, 392(10162), 2388–2396 (2018). www.thelancet.com/
journals/lancet/article/PIIS0140-6736(18)31645--3/fulltext
25. Gulshan, V., et al.: Development and validation of a deep learning algorithm for detection
of diabetic retinopathy in retinal fundus photographs. JAMA, 316(22), 2402–2410 (2016).
https://jamanetwork.com/journals/jama/fullarticle/2588763
26. Irvin, J., et al.: Chexpert: a large chest radiograph dataset with uncertainty labels and expert
comparison. In: Association for the Advancement of Artificial Intelligence (AAAI) (2019).
http://arxiv.org/pdf/1901.07031
27. Kanavati, F., et al.: Weakly-supervised learning for lung carcinoma classification using
deep learning. Nat. Sci. Rep. 10(1), 1–11 (2020). https://www.ncbi.nlm.nih.gov/pmc/articles/
PMC7283481/
28. Korbar, B., et al.: Deep learning for classification of colorectal polyps on whole-slide images.
J. Pathol. Inform. 8 (2017). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5545773/
29. Sertel, O., Kong, J., Catalyurek, U.V., Lozanski, G., Saltz, J.H., Gurcan, M.N.: Histopatho-
logical image analysis using model-based intermediate representations and color texture: fol-
licular lymphoma grading. J. Sig. Process. Syst. 55(1), 169–183 (2009). https://link.springer.
com/article/10.1007/s11265-008-0201-y
30. Wang, S., Xing, Y., Zhang, L., Gao, H., Zhang, H.: Deep convolutional neural network
for ulcer recognition in wireless capsule endoscopy: experimental feasibility and opti-
mization. Computat. Math. Meth. Med. (2019). https://www.ncbi.nlm.nih.gov/pmc/articles/
PMC6766681/
31. Wei, J., Wei, J., Jackson, C., Ren, B., Suriawinata, A., Hassanpour, S.: Automated detection
of celiac disease on duodenal biopsy slides: a deep learning approach. J. Pathol. Inform.
10(1), 7 (2019). http://www.jpathinformatics.org/article.asp?issn=2153-3539;year=2019;
volume=10;issue=1;spage=7;epage=7;aulast=Wei;t=6
32. Wei, J., et al.: Difficulty translation in histopathology images. In: Artificial Intelligence in
Medicine (AIME) (2020). https://arxiv.org/pdf/2004.12535.pdf
33. Zhou, J., et al.: Weakly supervised 3D deep learning for breast cancer classification and local-
ization of the lesions in MR images. J. Magn. Reson. Imaging, 50(4), 1144–1151 (2019).
https://onlinelibrary.wiley.com/doi/abs/10.1002/jmri.26721
34. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015). http://arxiv.
org/pdf/1512.03385
35. Benkendorf, D.J., Hawkins, C.P.: Effects of sample size and network depth on a deep learning
approach to species distribution modeling. Ecol. Inform. 60, 101137 (2020). http://www.
sciencedirect.com/science/article/pii/S157495412030087X
36. Kornblith, S., Shlens, J., Le, Q.V.: Do better imagenet models transfer better? CoRR (2018).
http://arxiv.org/pdf/1805.08974
37. Coudray, N., Moreira, A.L., Sakellaropoulos, T., Fenyö, D., Razavian, N., Tsirigos, A.: Clas-
sification and mutation prediction from non-small cell lung cancer histopathology images
using deep learning. Nat. Med. 24(10), 1559–1567 (2017). https://www.nature.com/articles/
s41591-018-0177-5
38. Ehteshami Bejnordi, B., et al.: Diagnostic assessment of deep learning algorithms for detec-
tion of lymph node metastases in women with breast cancer. JAMA, 318(22), 2199–2210
(2017). https://jamanetwork.com/journals/jama/fullarticle/2665774
39. Esteva, A., et al.: Dermatologist-level classification of skin cancer with deep neural networks.
Nature, 542(7639), 115–118 (2017). https://www.nature.com/articles/nature21056
40. Ghorbani, A., Natarajan, V., Coz, D., Liu, Y.: Dermgan: synthetic generation of clinical skin
images with pathology (2019). https://arxiv.org/pdf/1911.08716.pdf

41. Wei, J.W., et al.: Evaluation of a deep neural network for automated classification of colorec-
tal polyps on histopathologic slides. JAMA Netw. Open, 3(4) (2020). https://jamanetwork.
com/journals/jamanetworkopen/article-abstract/2764906
42. Wei, J., et al.: Learn like a pathologist: curriculum learning by annotator agreement for
histopathology image classification. In: Winter Conference on Applications of Computer
Vision (WACV) (2020)
43. Brown, T.B., et al.: Language models are few-shot learners. arXiv preprint arXiv:2005.14165
(2020). https://arxiv.org/pdf/2005.14165.pdf
44. Strubell, E., Ganesh, A., McCallum, A.: Energy and policy considerations for deep learning
in nlp. arXiv preprint arXiv:1906.02243 (2019). https://arxiv.org/pdf/1906.02243.pdf
45. Rawal, A., Lehman, J., Such, F.P., Clune, J., Stanley, K.O.: Synthetic petri dish: a novel sur-
rogate model for rapid architecture search. arXiv preprint arXiv:2005.13092 (2020). https://
arxiv.org/pdf/2005.13092.pdf
46. Greydanus, S.: Scaling down deep learning. arXiv preprint arXiv:2011.14439 (2020). https://
arxiv.org/pdf/2011.14439.pdf
47. Veeling, B.S., Linmans, J., Winkens, J., Cohen, T., Welling, M.: Rotation equivariant CNNs
for digital pathology. In: Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C.,
Fichtinger, G. (eds.) MICCAI 2018. LNCS, vol. 11071, pp. 210–218. Springer, Cham (2018).
https://doi.org/10.1007/978-3-030-00934-2 24
48. Cireşan, D.C., Giusti, A., Gambardella, L.M., Schmidhuber, J.: Mitosis detection in breast
cancer histology images with deep neural networks. In: Mori, K., Sakuma, I., Sato, Y., Baril-
lot, C., Navab, N. (eds.) MICCAI 2013. LNCS, vol. 8150, pp. 411–418. Springer, Heidelberg
(2013). https://doi.org/10.1007/978-3-642-40763-5 51
49. Veta, M., et al.: Predicting breast tumor proliferation from whole-slide images: the tupac16
challenge. Med. Image Anal. 54, 111–121 (2019). https://doi.org/10.1016/j.media.2019.02.
012, http://www.sciencedirect.com/science/article/pii/S1361841518305231
50. Aresta, G., et al.: Bach: Grand challenge on breast cancer histology images. Med.
Image Anal.56, 122–139 (2019). https://doi.org/10.1016/j.media.2019.05.010, http://www.
sciencedirect.com/science/article/pii/S1361841518307941
51. Swiderska-Chadaj, Z., et al.: Learning to detect lymphocytes in immunohistochemistry with
deep learning. Med. Image Anal. 58, (2019). https://doi.org/10.1016/j.media.2019.101547,
http://www.sciencedirect.com/science/article/pii/S1361841519300829
fMRI Multiple Missing Values Imputation
Regularized by a Recurrent Denoiser

David Calhas(B) and Rui Henriques

Instituto Superior Técnico, Lisbon, Portugal


david.calhas@tecnico.ulisboa.pt

Abstract. Functional Magnetic Resonance Imaging (fMRI) is a neu-


roimaging technique with pivotal importance due to its scientific and
clinical applications. As with any widely used imaging modality, there
is a need to ensure its quality, with missing values being
highly frequent due to the presence of artifacts or sub-optimal imaging
resolutions. Our work focuses on missing value imputation in multivari-
ate signal data. To do so, a new imputation method is proposed consist-
ing of two major steps: spatial-dependent signal imputation and time-
dependent regularization of the imputed signal. A novel layer, to be used
in deep learning architectures, is proposed in this work, bringing back
the concept of chained equations for multiple imputation [26]. Finally, a
recurrent layer is applied to tune the signal, such that it captures its true
patterns. Both operations yield an improved robustness against state-of-
the-art alternatives. The code is made available on Github.

1 Introduction

The ability to learn from functional Magnetic Resonance Imaging (fMRI) data
is generally hampered by the presence of artifacts and limits on the available
instrumentation and acquisition protocol, resulting in pronounced missingness.
MRI is collected in frequency space, and the usual type of missing/corrupted
values occurs in that frequency space. On the other hand, low-quality (voxel
space) recordings prevent whole-brain analyses in clinical settings and are espe-
cially pervasive among stimuli-inducing setups in research settings. Imputation
of incomplete/noisy recordings is critical to classification [19], synthesis and
enhancement tasks. For instance, given the unique spatial and temporal nature
of each neuroimaging modality, synthesis between distinct modalities is a diffi-
cult task (particularly given its multivariate time series nature), with imputation
being an important step of the process [16,22]. Finally, the integration of heterogeneous
sources of fMRI recordings by multiple initiatives worldwide also drives the need
to increase image resolutions and correct differences arising from distinct setups.
This work reclaims the importance of a machine learning based model to
perform imputation of multivariate signal data, as opposed to individual-specific
imputation. In this context, we propose a novel layer that performs feature-based
imputation following the principle of chained equations [26]. Further, a recurrent

layer is proposed to serve as a denoiser to the spatially imputed signal. This


two-step principled approach for imputation is illustrated in Fig. 1. Results on
resting-state and stimuli-based fMRI recordings validate the robustness of the
proposed approach against competitive alternatives.

2 Problem Setting
Multivariate Time Series (MTS) Missing Value Imputation is the focus of this
work, specifically high-dimensional MTS data from fMRI recordings. The prob-
lem is divided into two parts: spatial imputation, in which missing values are
sequentially predicted from the existing features; and time dimension regular-
ization, where the imputed values are time tuned.
Consider an fMRI recording to be a multivariate time series x ∈ R^{v×t}, with
v = (v_0, ..., v_{V−1}) the voxel dimension with V voxels and t = (t_0, ..., t_{T−1}) the
temporal dimension with T timesteps. An fMRI volume, at timestep t, is denoted
as x^t = [x^t_0, ..., x^t_{V−1}] and a voxel time series, at voxel v, is denoted as
x_v = [x^0_v, ..., x^{T−1}_v]; each can be seen as a column and a row of the matrix x, respectively.
In the problem of imputation, consider ψ ∈ R^{v×t} ∪ {nan}, ϕ ∈ R^{v×t} and
μ ∈ {0, 1}^{v×t} as the variables involved in the learning phase. The nan symbol
denotes a missing value in the fMRI instance ψ. μ is the mask, with 0 represent-
ing a complete value and 1 a missing value. ϕ is the complete fMRI instance.
Illustrating¹, given V = 5 and T = 7:

ψ = \begin{bmatrix}
1 & 2 & 3 & 4 & 5 & 6 & 7\\
\mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan}\\
15 & 16 & 17 & 18 & 19 & 20 & 21\\
\mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan}\\
\mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan} & \mathrm{nan}
\end{bmatrix},\quad
μ = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0\\
1 & 1 & 1 & 1 & 1 & 1 & 1\\
0 & 0 & 0 & 0 & 0 & 0 & 0\\
1 & 1 & 1 & 1 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & 1 & 1 & 1
\end{bmatrix},\quad
ϕ = \begin{bmatrix}
1 & 2 & 3 & 4 & 5 & 6 & 7\\
8 & 9 & 10 & 11 & 12 & 13 & 14\\
15 & 16 & 17 & 18 & 19 & 20 & 21\\
22 & 23 & 24 & 25 & 26 & 27 & 28\\
29 & 30 & 31 & 32 & 33 & 34 & 35
\end{bmatrix}
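As a small, purely illustrative NumPy rendering of this toy example (not part of the original text), the mask follows directly from the positions of the nan entries:

```python
# Toy illustration of psi (incomplete), mu (missing-value mask) and phi (complete),
# mirroring the V = 5, T = 7 example above, with NaNs marking missing voxels.
import numpy as np

phi = np.arange(1, 36, dtype=float).reshape(5, 7)   # complete fMRI instance
mu = np.zeros_like(phi)
mu[[1, 3, 4], :] = 1                                 # voxels 1, 3 and 4 are missing
psi = np.where(mu == 1, np.nan, phi)                 # incomplete instance

assert np.array_equal(np.isnan(psi), mu.astype(bool))
```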

Consider that the imputation of missing values is made by a model, I, and that each
imputed value is denoted as ι^t_v ∈ R. Continuing our example, an fMRI instance,
ψ, after processing by I, is ι ∈ R^{v×t}:
I(ψ) = ι = \begin{bmatrix}
1 & 2 & 3 & 4 & 5 & 6 & 7\\
ι^0_1 & ι^1_1 & ι^2_1 & ι^3_1 & ι^4_1 & ι^5_1 & ι^6_1\\
15 & 16 & 17 & 18 & 19 & 20 & 21\\
ι^0_3 & ι^1_3 & ι^2_3 & ι^3_3 & ι^4_3 & ι^5_3 & ι^6_3\\
ι^0_4 & ι^1_4 & ι^2_4 & ι^3_4 & ι^4_4 & ι^5_4 & ι^6_4
\end{bmatrix}


Considering typical fMRI resolution, each voxel has a 3D Euclidean point
correspondence. As such, a spatial distance matrix, d ∈ R^{V×V}, is defined, where
d_{i,j} corresponds to the distance between voxels v_i and v_j, with i, j ∈ {0, ..., V − 1} ⊂ N.
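For illustration, such a distance matrix can be computed from hypothetical (V, 3) voxel coordinates with SciPy; the coordinate array below is a placeholder.

```python
# Pairwise Euclidean distance matrix d between voxels, given (V, 3) coordinates.
import numpy as np
from scipy.spatial.distance import cdist

coords = np.random.rand(100, 3)   # placeholder voxel coordinates, V = 100
d = cdist(coords, coords)         # d[i, j] = distance between voxels i and j
```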
Missing voxels can occur at random or, in contrast, be spatially autocorre-
lated within a variable number of regions, resembling the characteristics of an
artifact. Both modes are targeted in this work.
¹ ψ, μ, ϕ and ι values selected for simplicity's sake, not necessarily resembling fMRI values.

3 Proposed Approach
In this section, two main steps are proposed to perform imputation of missing
values from MTS data:

– Spatial imputation, where imputation is done by estimating missing values,
μ^j_i = 1, from complete features, μ^l_k = 0, with i, k ∈ {0, ..., V − 1} and
j, l ∈ {0, ..., T − 1}.
– Time series regularization, where the previously derived missing values
are processed by a recurrent neural network.

This two-step approach is shown in Sect. 6 to outperform competitive baselines.

3.1 Spatial Imputation

We propose a novel neural network layer, Φ, that performs imputation inspired


by the chained equations principle proposed by [26]. This imputation method
consists of filling a missing value at a time, and using its estimate to guide the
imputation of the remaining missing values.
The priority in which the values are filled is given by the pairwise voxel
correlation matrix, C, computed from the training set data.
This layer, Φ, is characterized by a weight matrix WΦ and bias BΦ ,
W_Φ = \begin{bmatrix}
0 & w^1_0 & \cdots & w^{V-1}_0\\
w^0_1 & 0 & \cdots & w^{V-1}_1\\
\vdots & \vdots & \ddots & \vdots\\
w^0_{V-1} & w^1_{V-1} & \cdots & 0
\end{bmatrix},\quad
B_Φ = \begin{bmatrix}
b_0\\ b_1\\ \vdots\\ b_{V-1}
\end{bmatrix}.

The activation function of Φ is linear, making the imputed values a linear


combination of the already filled and complete values. The weight matrix has a
zero-filled diagonal so that each voxel is described as a linear combination of the
remaining voxels.
Since this neural function estimates a single value at a time, one only needs to
compute the dot product of the frame with the column corresponding to the missing
value, v, namely W_Φ^v, and add the bias, B_Φ^v. The imputation operation of a missing
value, v, is denoted as \xrightarrow{ψ^0 W_Φ^v + B_Φ^v}.
Let us consider the input presented in Sect. 2, with ψ^0 = [1, nan, 15, nan, nan], and
\xrightarrow{ψ^0 W_Φ^c + B_Φ^c} as the operation made by layer Φ at each iteration
to impute a missing value, c, where c = max(C_{μ^0 = 1}) is the missing voxel that has
the highest correlation with the complete and filled voxels. This scheme allows
imputation of missing values under the chained equation principle; φ is the output
of layer Φ, with φ_i corresponding to the imputation of missing voxel v_i. A total of
Σ μ^t iterations are necessary, corresponding to the number of missing values,

Φ(ψ^0) = [1, \mathrm{nan}, 15, \mathrm{nan}, \mathrm{nan}]^T
\xrightarrow{ψ^0 W_Φ^c + B_Φ^c} [1, φ_1, 15, \mathrm{nan}, \mathrm{nan}]^T
\xrightarrow{ψ^0 W_Φ^c + B_Φ^c} [1, φ_1, 15, \mathrm{nan}, φ_4]^T
\xrightarrow{ψ^0 W_Φ^c + B_Φ^c} [1, φ_1, 15, φ_3, φ_4]^T = φ.

Algorithm 1 presents the pseudocode for this imputation cycle. The Φ layer imputes all missing
values per time frame, t, of an fMRI recording, ψ^t. Imputation only accounts for other features
on the same time frame, and is therefore spatial. This layer contrasts with the traditional
dropout layer for imputing missing values. In a dropout layer, each column of the weight
matrix, W^v, shows intra-correlation, converging to a single target independently. However,
there is no inter-column correlation/dependency. The Φ layer forces the columns, W_Φ^v, to be
inter-correlated, converging to the same target as a unit. Here, the estimates of a column (the
imputed values) influence the estimates of the upcoming columns along the imputation process.

Algorithm 1. Φ imputation cycle
    φ ← ψ^t
    while Σ μ^t > 0 do
        c ← max(C_{μ^t = 1})
        φ_c ← ψ^t W_Φ^c + B_Φ^c
        μ^t_c ← 0
    end while
    return φ
3.2 Time Dimension Regularization

Once spatial imputation is done for each voxel, v, of an fMRI recording, φ_v, the
imputed values are fed to a recurrent layer, tweaking the signal in such a way
that it emulates the target time series patterns. The recurrent layer removes the
noise created by the spatial imputation method. We refer to this recurrent layer
as the Denoiser, D, component of the imputation pipeline. An illustration of this
noise removal is shown in Fig. 1.
Fig. 1. Time regularization (denoising) component.

D is a single-layer Gated Recurrent Unit (GRU) [4]. This choice was motivated by results
collected against its rival, the Long Short-Term Memory layer, on the target task, and further
supported by studies showing that GRU performs well on datasets with limited observations
[13], the common case when learning from neuroimaging datasets. [3,15] altered the internal
function of a GRU layer to perform direct imputation on a multivariate time series. In contrast,
we maintain the GRU layer as it was originally proposed [4], since our purpose is to remove
noise and capture the desired temporal patterning properties of the (neurophysiological) signal.
The imputation model we propose is denoted as Φ + D, corresponding to the junction of the
spatial imputation layer, Φ, and the recurrent denoiser, D.
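As a rough PyTorch sketch (not the authors' implementation), the denoiser D can be written as a single GRU run along the time axis of the spatially imputed signal, followed by a linear projection back to voxel space; the hidden size and the projection layer are illustrative assumptions.

```python
# Sketch of the Phi + D pipeline's denoiser D: a single GRU layer applied over
# the time axis of the spatially imputed recording, followed by a linear
# projection back to voxel space. Hidden size and projection are assumptions.
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    def __init__(self, num_voxels: int, hidden: int = 128):
        super().__init__()
        self.gru = nn.GRU(input_size=num_voxels, hidden_size=hidden, batch_first=True)
        self.proj = nn.Linear(hidden, num_voxels)

    def forward(self, phi):            # phi: (batch, T, V) spatially imputed signal
        h, _ = self.gru(phi)           # temporal context at every timestep
        return self.proj(h)            # regularized signal, same shape as phi

# Usage sketch: Denoiser(num_voxels=64)(torch.randn(8, 30, 64))
```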
“It's asking a bit too much, I'm afraid. Remember it was a heavy
over-dose that was given.”
“Everybody's liable to make a mistake, sir.”
“True. And I suppose you'd say that after the murder at the
bungalow Silverdale awoke to the fact that the maid's evidence
about the hyoscine would hang him, probably; so he went back and
murdered her also.”
“It was someone well known to her who did her in, sir. That's
clear enough.”
“In the meantime, you've left aside the possibility that young
Hassendean may have administered the stuff. How does that strike
you?”
“It's possible, sir,” the Inspector admitted cautiously. “But there's
no evidence for it.”
“Oh, I shouldn't like to go so far as that,” Sir Clinton said,
chaffingly. “I'll tell you what evidence there is on the point. There's
Hassendean's own diary, first of all. Then there's what we found in
young Hassendean's laboratory notebook.”
“But that was just some stuff about weighing potash-bulbs,
whatever they may be.”
“Quite correct. That was what it was.”
“Well, I'm no chemist, sir. It's off my beat.”
“There's no chemistry in it. I gave you the key to it at the time.
Then there's other evidence. Young Hassendean was a careless
worker. Everyone agreed on that; and his notebook confirmed it.
Next, there's what Miss Hailsham said about hyoscine, which is
more or less common knowledge, nowadays, of course. And there's
young Hassendean's interference in the serving of coffee at
Heatherfield, that night. Finally, there's what the maid said about
Mrs. Silverdale's appearance when she was going out of the house.
Put all these points together, and I'll engage to satisfy a jury that
young Hassendean administered the hyoscine to Mrs. Silverdale in
her coffee, with a definite purpose—but not murder—in view.”
“I'll need to think over all that, sir. You seem pretty sure about it.”
“I'm practically certain. Now look at the business from another
stand-point. Who had a grudge against the two victims, either
separately or together?”
“Silverdale, obviously.”
“Obviously, as you say. That's if you take them together, of
course. Now for a final problem. Who is Mr. Justice? He seems to be
in the know, somehow. If we could lay hands on him, we might be
near the centre of things. He knew before anyone else that
something had happened at the bungalow. He knew about the
hyoscine at the Institute—although as Silverdale's a fairly well-
recognised authority on alkaloids, that might have been just a shot
aimed on chance. Anyhow, look at it as you choose, Mr. Justice has
information, and he seems to have a motive. Who is he, can you
guess?”
“Somebody who won't come out into the open until he's dragged
there, evidently. It might be an unwilling accomplice, sir.”
“That's possible. Anyone else?”
“It might be Spratton. He's got an interest in establishing that it
was a case of murder and not suicide.”
“Obviously true. Anyone else?”
“I can't think of anyone else who would fit the case, sir. By the
way, I've got the originals of these advertisements—the code ones. I
sent down to the newspaper offices and got hold of them.”
He produced two sheets of paper from his pocket-book and
handed them to the Chief Constable. Sir Clinton glanced over them.
“H'm! The first one—the letters—is built up as usual from
telegram forms. The one with the numbers is fitted together from
numbers printed in a newspaper; it might have been clipped from
one of these lists of the results of drawings of bonds for redemption
—Underground Electric Railways, and that kind of thing. These
advertisements have columns and columns of figures out of which
he'd be able to pick what he wanted easily enough. Now what about
this address that he's put down—the usual guarantee of good faith at
the bottom. It's fictitious, of course?”
“Yes, sir. There's no such place.”
“It's in writing. It looks like a girl's writing. This is a dangerous
game for Mr. Justice; but I suppose if he'd put all the advertisement
in clipped-out letters the newspaper people might have got
suspicious and refused to print it. What about this handwriting,
Inspector?”
Flamborough's expression showed that he felt he had done his
work thoroughly.
“I managed to get hold of specimens of the writing of Miss
Hailsham and Miss Deepcar. It isn't either of them. Then I tried to get
it recognised—and I succeeded, sir. Miss Hailsham recognised it at
once. It's Mrs. Silverdale's own writing!”
“A forgery, then? That's very neat of Mr. Justice. I feel inclined to
take off my hat to that fellow. He thinks of everything.”
“Well, it's a blank end for us, so far as I can see, sir.”
Sir Clinton seemed to be so lost in admiration of Mr. Justice's
ingenuity that he failed to notice Flamborough's dissatisfaction.
When he spoke again, it was on a different topic.
“What about your friend, Mr. Whalley, Inspector? It seems to me
we ought to have him up and put him through it as quick as possible.
Quite obviously he knows something.”
“I've tried to get hold of him, sir. But he's left the town and I can't
get on his track. He's gone off to some race-meeting or other, I
expect. He often goes off like that and leaves no address. I'll lay
hands on him as soon as he comes back to Westerhaven.”
“He's an essential witness, I suspect; so don't let him slip through
your fingers. You'd better ask for assistance from the local police in
likely places.”
“Very good, sir.”
“And now, Inspector, how are you getting along with the game of
eliminations? How low have you brought the possibles out of the
original nine solutions?”
Flamborough produced his often-unfolded scrap of paper and
scanned it once more.
“If one accepts what you said a minute or two ago, sir, then the
drugging of Mrs. Silverdale was meant to be plain drugging and
wasn't wilful murder. So the last case drops out.”
He put his pencil through the line of writing.
“That leaves only two alternatives:

            Hassendean              Mrs. Silverdale
    X—      Suicide ..............  Accident
    Y—      Murder ...............  Accident

And young Hassendean, from all accounts, was hardly the lad to
commit suicide by shooting himself twice in the body—too painful for him. So
it really looks rather like Case Y. Certainly it's coming down to brass
tacks quicker than I thought it would.”
Chapter XIII.
The Murder of the Informer
As he entered Sir Clinton's office on the following morning,
Inspector Flamborough blurted out bad news without any preliminary
beating about the bush.
“There's been another murder, sir,” he announced, with a tinge of
what seemed grievance in his tone.
Sir Clinton looked up from the mass of papers upon his desk.
“Who is it, this time?” he demanded curtly.
“It's that fellow Whalley, sir—the man who seemed to have some
information about the bungalow affair.”
The Chief Constable leaned back in his chair and gazed at
Flamborough with an expressionless face.
“This is really growing into a wholesale trade,” he said, drily.
“Four murders in quick succession, and we've nothing to show for it.
We can't go on waiting until all the population of Westerhaven, bar
one individual, is exterminated; and then justify ourselves by
arresting the sole survivor on suspicion. The public's getting restive,
Inspector. It wants to know what we do for our money, I gather.”
Inspector Flamborough looked resentful.
“The public'll have to lump it, if it doesn't like it,” he said crudely.
“I've done my best. If you think I ought to hand the thing over to
someone else, sir, I'll be only too glad to do so.”
“I'm not criticising you, Inspector,” Sir Clinton reassured him. “Not
being a member of the public—for this purpose, at least—I know
enough to appreciate your difficulties. There's no burking the fact
that whoever's at the back of this affair is a sharper man than the
usual clumsy murderer. He hasn't left you much of a chance to pick
up usable clues.”
“I've followed up every one that he did leave,” Flamborough
argued. “I don't think I've been exactly idle. But I can't arrest
Silverdale merely because I picked up his cigarette-holder in
suspicious surroundings. Confound the public! It doesn't understand
the difference between having a suspicion and being able to prove a
case.”
“Let's hear the details of this latest affair,” Sir Clinton demanded,
putting aside the other subject.
“I've been trying to get hold of this fellow Whalley for the last day
or two, sir, so as to follow up that line as soon as possible,” the
Inspector began. “But, as I told you, he's been away from
Westerhaven—hasn't been seen anywhere in his usual haunts. I've
made repeated inquiries at his lodgings, but could get no word of him
except that he'd gone off. He'd left no word about coming back; but
he obviously did mean to turn up again, for he left all his traps there
and said nothing about giving up his bedroom.”
“You didn't get on his track elsewhere?”
“No, I hardly expected it. He's a very average-looking man and
one couldn't expect people to pick him out of a crowd at a race-
meeting by his appearance.”
Sir Clinton nodded as a permission to the Inspector to continue
his narrative.
“This morning, shortly before seven o'clock,” Flamborough
continued, “the driver of a milk-lorry on the Lizardbridge Road
noticed something in the ditch by the roadside. It was about half an
hour before sunrise, so I expect he still had his lamps alight. It's
pretty dark, these misty mornings. Anyhow, he saw something
sticking up out of the ditch and he stopped his lorry. Then he made
out that it was a hand and arm; so he got down from his seat and
had a closer look. I expect he took it for a casual drunk sleeping
things off quietly. However, when he got up to the side of the road,
he found the body of a man in the ditch, face downward.
“This milkman was a sensible fellow, it seems. He felt the flesh
where he could get at it without moving the body; and the coldness
of it satisfied him that he'd got a deader on his hands. So instead of
muddling about and trampling all over the neighbourhood, he very
sensibly got aboard his lorry again and drove in towards town in
search of a policeman. When he met one, he and the constable went
back on the lorry to the dead man; and the constable stood on guard
whilst the milkman set off with the lorry again to give the alarm.”
“Did you go down yourself, by any chance, Inspector?”
“Yes, sir. The constable happened to recognise Whalley from
what he could see of him—I told you he was pretty well known to our
men—and knowing that I'd been making inquiries about the fellow,
they called me up, and I went down at once.”
“Yes?”
“When I got there, sir,” the Inspector continued, “it didn't take long
to see what was what. It was a case of the tourniquet again. Whalley
had been strangled, just like the maid at Heatherfield. Quite obvious
symptoms: face swollen and congested; tongue swollen, too; eyes
wide open and injected a bit, with dilated pupils; some blood on the
mouth and nostrils. And when I had a chance of looking for it, there
was the mark of the tourniquet on his neck sure enough.”
Flamborough paused, as though to draw attention to his next
point.
“I hunted about in the ditch, of course. And there, lying quite
openly, was the tourniquet itself. Quite a complicated affair this time;
he's evidently improved his technique.”
“Well, what about it?” Sir Clinton demanded rather testily, as
though impatient of the Inspector's comments.
“Here it is, sir.”
Flamborough produced the lethal instrument with something of a
flourish.
“You see, sir, it's made out of a banjo-string threaded through a
bit of rubber tubing. The handles are just bits of wood cut from a
tree-branch, the same as before; but the banjo-string and the rubber
tube are a vast improvement on the bit of twine he used last time, at
Heatherfield. There'd be no chance of the banjo-string breaking
under the strain; and the rubber tube would distribute the pressure
and prevent the wire cutting into the flesh as it would have done if it
had been used bare.”
Sir Clinton picked up the tourniquet and examined it with obvious
interest.
“H'm! I don't say you've much to go on, but there's certainly more
here than there was in the other tourniquet. The banjo-string's not
much help, of course; one can buy 'em in any musical-instrument
shop. But the rubber tubing might suggest something to you.”
Inspector Flamborough scrutinised it afresh.
“It's very thick-walled, sir, with a much smaller bore than one
would expect from the outside diameter.”
Sir Clinton nodded.
“It's what they call ‘pressure-tubing’ in a chemical laboratory. It's
used when you're pumping out vessels or working under reduced
pressures generally. That's why it's made so thick-walled: so that it
won't collapse flat under the outside air-pressure when you've
pumped all the gas out of the channel in the middle.”
“I see,” said the Inspector, fingering the tubing thoughtfully. “So
it's the sort of thing one finds in a scientific place like the Croft-
Thornton Institute?”
“Almost certainly,” Sir Clinton agreed. “But don't get too sure
about your rubber tubing. Suppose someone is trying to throw
suspicion on one of the Croft-Thornton staff, wouldn't this be an
excellent way of doing it? One can buy pressure-tubing in the open
market. It's not found exclusively in scientific institutes, you know.”
Flamborough seemed a shade crestfallen at the loss of what he
had evidently regarded as a promising line.
“Oh, indeed?” he said. “I suppose you're right, sir. Still it's a bit
uncommon, isn't it?”
“Not what you'd expect the ordinary criminal to hit on straight off, I
suppose you mean? But this fellow isn't an ordinary criminal. He's
got plenty of brains. Now doesn't it strike you as strange that he
should go to the trouble of leaving this tourniquet for your
inspection? He could have slipped it into his pocket easily enough
and it wouldn't have bulged much.”
“Well, sir, a glance at the body would show anyone that
something of the sort had been used. He wasn't giving much away
by leaving the thing itself, was he?”
Sir Clinton did not seem altogether satisfied with the Inspector's
view.
“The less a murderer leaves behind, the more difficult it is to
catch him, Inspector. That's a truism. Now this fellow is no fool, as
I've frequently remarked to you. Hence one might have anticipated
that he'd leave as few traces as possible. But here he presents us
with the actual weapon, and a weapon that has fairly salient
peculiarities of its own. Queer, isn't it?”
“Then you think it's a non-scientific murderer using scientific
appliances so as to suggest that the crime was done by someone in
the scientific line—Silverdale, I mean?”
Sir Clinton was silent for a moment or two, then he said
thoughtfully:
“What I'm not sure about is whether it's a pure bluff or a double
bluff. It looks like one or the other.”
The Inspector obviously had difficulty in interpreting this rather
cryptic utterance. At last he saw his way through it.
“I think I see what you mean, sir. Suppose it's not Silverdale that
did the murder. Then somebody—knowing that this kind of tubing's
common in Silverdale's laboratory—may have left it on purpose for
us to find, so that we'd be bluffed into jumping to the conclusion—as
I admit I did—that Silverdale did the trick. That would be a simple
bluff. Or again, supposing it's Silverdale who's the murderer, then he
may have left the tubing on purpose, because he'd say to himself
that we'd never believe that he'd be such a fool as to chuck a thing
like that down beside the body—and hence we'd pass him over in
our suspicions. Is that it, sir?”
“It sounds devilish involved, as you put it, Inspector; but I have a
sort of dim perception that you've grasped my meaning,” Sir Clinton
answered. “My own impression is simply that we mustn't let this
tourniquet lead us too far, for fear we go completely astray. If we get
on the right track, I've no doubt it'll fit neatly enough to the rest of the
evidence; but it's not the sort of thing I'd care about staking a lot on
by itself. Now suppose we come out of these flowery by-paths and
get back to the main thoroughfare of the facts.”
The Inspector refused to be damped by his superior. Indeed, he
had the air of a player holding good cards, and not caring who knew
it.
“It was hard frost last night, sir, as you’ll remember; so there were
no foot-prints on the road, or anything of that sort. But the grass by
the side of the ditch is fairly long; and when I examined it, it was
clear enough that there hadn't been any struggle on it. They may
have struggled on the road, of course; but the grass was quite
undisturbed.”
“Then the body hadn't been dragged off the road into the ditch? It
must have been lifted and pitched in?”
“So I think, sir. The grass border between road and ditch is quite
narrow—just room to stand on it comfortably. One could hoist a body
over it without too much trouble.”
“And from the look of the body you think it had been thrown in?”
“Yes, sir. It was huddled up anyhow in the ditch, just as it might
have fallen if it had been dropped in with a thud.”
“Single-handed business, then, you believe?”
“Well, sir, I think if two people had been handling him—one taking
his shoulders and another taking his feet—he'd have fallen more
tidily. He certainly looked as if he'd been bundled in anyhow. I'd put it
down as a single-handed job from the look of it.”
“I suppose you examined the pockets, and so forth?” Sir Clinton
asked.
“Of course, sir. But there was nothing in them of any use to us.”
The Inspector's voice betrayed that he had something still in
reserve. Now he brought it forward.
“I examined his hands, sir; and in the right one, I found something
important. The hand was clenched, and when I got it open at last,
this fell out.”
He produced a button with a shred of cloth attached to it, which
he laid on the desk before Sir Clinton. The Chief Constable picked it
up, examined it closely, and then, pulling out a pocket magnifying
glass, made a still more minute inspection.
“Very interesting, Inspector. What do you make of it?”
“Obviously it was torn off the murderer's clothes during the
struggle, sir. And I've seen something like it before. You see that
canary-coloured stain on the bit of cloth and also on the threads that
hold the button to the fabric?”
“Dyed with picric acid, by the look of it, I should say. Is that what
you mean?”
“Yes, sir. And the pattern of the cloth's another point.”
“You mean it looks like a button torn off the old jacket that
Silverdale was wearing, that day we saw him at the Croft-Thornton
Institute—his laboratory coat?”
“That's undoubtedly what it is, sir. I remember that stain perfectly.
And as soon as I saw it, I remembered the pattern of the cloth.”
“And your view is?”
“I think that when Silverdale set out to murder Whalley he was
afraid that some blood from the face might get on to his coat. So he
put on his old laboratory jacket. If it got spotted, he could destroy it
and rouse no suspicions. It was only an old coat that he might think
was worn out. Quite a different thing from destroying some of his
ordinary clothes. That would have been suspicious. But an old coat
—no one would wonder if he got rid of it and brought another one
down to the laboratory to replace it.”
“It sounds deuced plausible, Inspector, I must admit. But——”
“But what, sir?”
“Well,” Sir Clinton answered thoughtfully, “it leaves us again with
the choice between the single and the double bluff, you see, even if
one goes no further with one's inquiries.”
The Inspector pondered over the point for a few seconds, but at
the end of his cogitation he seemed unimpressed. Apparently,
however, he thought it wise to change the subject.
“In any case, sir, I think Whalley's part in the bungalow affair is
pretty plain now. I told you he was the sort of fellow who was out for
easy money, no matter how dirty it might be. By the way, he was the
man who inquired about the number of that motor which he said
knocked him spinning—an obvious try-on to get damages, although
he wasn't hurt at all. You can see he'd do anything to make money
and save himself from honest work. If you remember that, it's easy
enough to see the part he played at the bungalow. He was the
person you christened Peeping Tom.”
“Anything further about him that you can think of, Inspector? I
don't say you're wrong, of course.”
“Well, sir, if Silverdale expected to take his wife in flagrante
delicto, he'd need an independent witness, wouldn't he? Possibly
Whalley was the man he picked out for the work.”
“Do you think he was the sort of witness that was wanted? I'm not
so sure of his suitability myself.”
“It wasn't exactly a nice job, sir,” the Inspector pointed out.
“Silverdale would hardly care to take one of his close friends to
inspect an affair of that sort. And of course a woman——”
He broke off suddenly, as though struck by a fresh idea. Sir
Clinton ignored the last phrase of the Inspector.
“Assume that Whalley was the witness, then, what next?”
“Assume that Silverdale posted Whalley at the second window
and went round to the first one—at the front. Then, to make the thing
complete, he breaks in through the window and jumps into the room.
Young Hassendean has his pistol and mistakes the state of affairs—
thinks that Silverdale means to thrash him or worse. He pulls out his
pistol and there's a struggle for the possession of it. The pistol goes
off accidentally, and the bullet hits Mrs. Silverdale in the head by
pure chance. Then the struggle goes on, and in the course of it,
young Hassendean gets shot twice over in the lung.”
The Chief Constable looked at his subordinate with quite
unaffected respect.
“It looks as if you'd come very near the truth there,” he admitted.
“Go on.”
“The rest's fairly obvious, if you grant what's gone before.
Whalley's seen the whole affair from his post at the window. He
sneaks off into the dark and gets out of Silverdale's reach. If he
hadn't, then Silverdale would probably have shot him at sight to
destroy the chance of evidence against him. But when Whalley has
time to think things over, he sees he's got a gold-mine in the
business. If he can blackmail Silverdale, he's got a steady income for
life. But I expect he weakened and tried to play for safety. He
blackmailed Silverdale; then he came to us, so that he could say
he'd been to the police, meaning to give information. Then he went
back to Silverdale, and in some way he let out that he'd given us a
call. That would be enough for Silverdale. Whalley would have to go
the way the maid went. And so he did.”
Sir Clinton had listened intently to the Inspector's reconstruction
of the episode.
“That's very neat indeed, Inspector,” he adjudged at the close.
“It's quite sound, so far as it goes, and so far as one can see. But, of
course, it leaves one or two points untouched. Where does the
murder of the maid come into the business?”
Flamborough reflected for a moment or two before answering.
“I'm not prepared to fill that gap just at this moment, sir. But I'll
suggest something. Renard told us that Mrs. Silverdale was going to
draw up a note of the terms of her new will. It's on the cards that
Silverdale knew about that—she may have mentioned it to him. He'd
want to get that note and destroy it at any cost, before there was any
search of his house or any hunting through Mrs. Silverdale's
possessions.”
“He might have thought it worth while, I admit. But I'd hardly think
it important enough to lead to an unnecessary murder. Besides, it
wasn't necessary for Silverdale to murder the maid at all. It was his
own house. He could search where he chose in it and nobody could
object. The maid wouldn't see anything strange in that.”
“It was pretty clear that the maid knew her murderer, anyhow,”
the Inspector pointed out. “Everything points to that. I admit I'm only
making a guess, sir. I can't bring any evidence against Silverdale on
that count yet. For all one can tell, she may have seen something—
blood on his coat from the shots, or something of that sort. Then he'd
have to silence her.”
Sir Clinton made no comment on the Inspector's suggestion.
Instead, he turned to a fresh aspect of the case.
“And where does Mr. Justice come into your theory of the affair?
He wasn't your friend Whalley. That's evident.”
The Inspector rubbed his nose thoughtfully, as though trying to
gain inspiration from the friction.
“It's a fact, sir, that I can't fit Mr. Justice into my theory at present.
He wasn't Whalley, and that's a fact. But hold on a moment!
Suppose that Whalley wasn't Silverdale's witness at all. Come to
think of it, Whalley was hardly the sort that one would pick out for the
job, if one had been in Silverdale's shoes.”
“I'm quite convinced of that, at any rate, Inspector. You needn't
waste breath in persuading me.”
“Yes, but there's another possibility that's been overlooked, sir,”
Flamborough interrupted eagerly. “I've been assuming all along that
Silverdale was the only person at the opened window. But suppose
he'd brought someone along with him. Both of them might have been
looking through the front window, whilst Whalley was at the side
window, quite unknown to them at the time.”
“Now you're getting positively brilliant, Inspector,” Sir Clinton
commended. “I think you've got at least half the truth there, beyond a
doubt.”
“Who could Silverdale's witness have been?” the Inspector
pursued, as if impatient of the interruption. “What about the Deepcar
girl?”
“Think again,” Sir Clinton advised him drily. “Do you really
suppose that Silverdale—who seems in love with the girl—would
have picked her out for business of that sort? It's incredible,
Inspector.”
The first flush of enthusiasm at his discovery passed from
Flamborough's thoughts at the tone of the Chief Constable's voice.
“I suppose you're right, sir,” he had to admit. “But there's another
girl who'd have enjoyed the job—and that's the Hailsham girl. She'd
have given a good deal just for the pleasure of seeing those two
humiliated. She'd have gloated over the chance of giving that
particular evidence in court and squaring accounts with young
Hassendean and Mrs. Silverdale. It would have been all jam to her,
sir. You can't deny that.”
Sir Clinton conceded the point without ado.
“I won't deny it,” he said curtly. “But you needn't let your mind run
exclusively on the female population of Westerhaven in a matter of
this sort. A man would be a much more convenient witness for
Silverdale to take with him. Why leave Silverdale's male friends out
of account?”
“If you're thinking of Markfield, sir, we'll not get much out of him,
I'm afraid,” Flamborough pronounced. “So far, except when he
couldn't help it, he's done his level best to refuse any information
about Silverdale and his doings—if he hasn't actually served out
