Image
e-Learning, Understanding, Information Retrieval, Medical
SERIES ON SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING
Series Editor-in-Chief
S K CHANG (University of Pittsburgh, USA)
Image
e-Learning, Understanding, Information Retrieval, Medical
World Scientific
New Jersey  London  Singapore  Hong Kong
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: Suite 202, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.
ISBN 981-238-587-8
The role played by images in many human activities, ranging from entertainment to study and covering all phases of the learning process, is ever more relevant and irreplaceable.
The computer age may be interpreted as a transformation of our social life in its working and leisure aspects. In our opinion this change is so significant that it can be compared with the invention of printing, of the steam engine or the discovery of radio waves.
While for a long time images could only be captured by photography, we are now able to capture, manipulate and evaluate images with the computer. Since the original image processing literature is spread over many disciplines, the need to gather all the knowledge in this field into a specific science is understandable.
This new science takes into account image elaboration, transmission, understanding and ordering, and finally the role of the image in knowledge as a general matter.
This book aims to highlight some of the subjects listed above.
First of all we wish to emphasize the importance of images in the
learning process and in the transmission of knowledge (e-Learning section).
How much and what kind of information content do we need for image comprehension? We try to give an answer, even if partial, in the Understanding section of this book.
The large number of images used on Internet sites raises several problems. Their organization and the transmission of their content are the typical field of interest of information retrieval, which studies and provides solutions to these specific problems.
In the last two decades the number and the role played by images in the medical field have become ever more important. At the same time, physicians require methodologies typical of computer science for the analysis and organization of medical images and for CAD (Computer-Aided Diagnosis) purposes.
The Medical section of this volume gives examples of the interaction between computer science and medical diagnosis.
This book tries to offer a new contribution to computer science that will
inspire the reader to discover the power of images and to apply the new
knowledge of this science adequately and successfully to his or her research
area and to everyday life.
Sergio Vitulano
CONTENTS
Preface vii
Medical Session
Chairman: M. Tegolo
An Introduction to Biometrics and Face Recognition 1
F. Perronnin, Jean-Luc Dugelay
e-Learning Session
Chairman: M. Nappi
The e-Learning Myth and the New University 60
V. Cantoni, M. Porta, M. G. Semenza
Understanding Session
Chairman: Jean-Luc Dugelay
Issues in Image Understanding 159
V. Di Gesù
1. Introduction to Biometrics
The ability to verify automatically and with great accuracy the identity of a person has become crucial in our society. Even though we may not notice it, our identity is challenged daily when we use our credit card or try to gain access to a facility or a network, for instance. The two traditional approaches to automatic person identification, namely the knowledge-based approach, which relies on something that you know such as a password, and the token-based approach, which relies on something that you have such as a badge, have obvious shortcomings: passwords might be forgotten or guessed by a malicious person, while badges might be lost or stolen [1].
Biometric person recognition, which deals with the problem of identifying a person based on his/her physical or behavioral characteristics, is an alternative to these traditional approaches, as a biometric attribute is inherent to each person and thus cannot be forgotten or lost and may be difficult to forge. The face, fingerprint, hand geometry, iris, etc. are examples of physical characteristics, while the signature, gait, keystroke, etc. are examples of behavioral characteristics. It should be underlined that a biometric such as the voice is both physical and behavioral. Ideally a biometric should have the following properties: it should be universal, unique, permanent and easily collectible [2].
In the next three sections of this introductory part, we will briefly describe the architecture of a typical biometric system, the measures to evaluate its performance and the possible applications of biometrics.
1.1. Architecture
A biometric system is a particular case of a pattern recognition system [3]. Given a set of observations (captures of a given biometric) and a set of possible classes (for instance the set of persons that can possibly be identified), the goal is to associate to each observation one unique class. Hence, the main task of pattern recognition is to distinguish between the intra-class and inter-class variabilities. Face recognition, which is the main focus of this article, is a very challenging problem, as faces of the same person are subject to variations due to facial expressions, pose, illumination conditions, presence/absence of glasses and facial hair, aging, etc.
A biometric system is composed of at least two mandatory modules, the enrollment and recognition modules, and an optional one, the adaptation module. During enrollment, the biometric is first measured through a sensing device. Generally, before the feature extraction step, a series of pre-processing operations, such as detection, segmentation, etc., should be applied. The extracted features should be a compact but accurate representation of the biometric. Based on these features, a model is built and stored, for instance in a database or on a smart card. During the recognition phase, the biometric characteristic is measured and features are extracted as during the enrollment phase. These features are then compared with one or many models stored in the database, depending on the operational mode (see the next section on performance evaluation). During the enrollment phase, a user-friendly system generally captures only a few instances of the biometric, which may be insufficient to describe with great accuracy the characteristics of this attribute. Moreover, this biometric can vary over time in the case where it is non-permanent (e.g. face, voice). Adaptation maintains or even improves the performance of the system over time by updating the model after each access to the system.
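As a hedged illustration of this three-module architecture, the following Python sketch wires enrollment, verification and adaptation together; the class and its `extract_features` placeholder are invented for this example and do not come from the chapter.

```python
import numpy as np

class BiometricSystem:
    """Minimal sketch of the enrollment, recognition and adaptation modules."""

    def __init__(self, threshold):
        self.models = {}          # user id -> stored feature template
        self.threshold = threshold

    def extract_features(self, sample):
        # Placeholder: a real system would first run pre-processing
        # (detection, segmentation, ...) and then feature extraction.
        return np.asarray(sample, dtype=float).ravel()

    def enroll(self, user_id, samples):
        # Build a model from the few captures taken at enrollment
        # (here simply the mean feature vector).
        feats = np.stack([self.extract_features(s) for s in samples])
        self.models[user_id] = feats.mean(axis=0)

    def verify(self, user_id, sample):
        # Compare the fresh capture against the claimed identity's model.
        f = self.extract_features(sample)
        return np.linalg.norm(f - self.models[user_id]) <= self.threshold

    def adapt(self, user_id, sample, rate=0.1):
        # Adaptation: slowly update the stored model after an access,
        # to track non-permanent biometrics (e.g. face, voice).
        f = self.extract_features(sample)
        self.models[user_id] = (1 - rate) * self.models[user_id] + rate * f
```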
[Figure: pipeline of a biometric system, from capture through feature extraction to identification.]
1.3. Applications
There are mainly four application areas for biometrics: access control, transaction authentication, law enforcement and personalization.
Access control can be subdivided into two categories: physical and virtual access control. The former controls the access to a secured location. An example is the Immigration and Naturalization Service's Passenger Accelerated Service System (INSPASS), deployed in major US airports, which enables frequent travelers to use an automated immigration system that authenticates their identity through their hand geometry. The latter enables the access to a resource or a service such as a computer or a network. An example of such a system is the voice recognition system used in Mac OS 9.
Transaction authentication represents a huge market, as it includes transactions at automatic teller machines (ATMs), electronic fund transfers, credit card and smart card transactions, transactions on the phone or on the Internet, etc. Mastercard estimates that a smart credit card incorporating finger verification could eliminate 80% of fraudulent charges [8]. For transactions on the phone, biometric systems have already been deployed. For instance, the speaker recognition technology of Nuance is used by the clients of the Home Shopping Network or Charles Schwab.
Law enforcement has been one of the first applications of biometrics.
Fingerprint recognition has been accepted for more than a century as a
means of identifying a person. Automatic face recognition can also be very
useful for searching through large mugshot databases.
Finally, personalization through person authentication is very appealing in the consumer product area. For instance, Siemens allows one to personalize vehicle accessories, such as mirrors, radio station selections, seating
2.1.1. Eigenfaces
Eigenfaces are based on the notion of dimensionality reduction. It was first outlined in [16] that the dimensionality of the face space, i.e. the space of variation
the K-dimensional space. The weights w_k are the projections of the face image on the k-th eigenface e_k and thus represent the contribution of each eigenface to the input face image.
To find the best match for an image of a person's face in a set of stored facial images, one may calculate the Euclidean distances between the vector representing the new face and each of the vectors representing the stored faces, and then choose the image yielding the smallest distance [18].
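A minimal sketch of this projection-and-matching scheme in Python, assuming a matrix of vectorized training faces; all names are illustrative, not from the text.

```python
import numpy as np

def train_eigenfaces(faces, K):
    """faces: (N, D) matrix with one vectorized face image per row."""
    mean = faces.mean(axis=0)
    A = faces - mean
    # The eigenfaces e_1..e_K are the top right singular vectors of A,
    # i.e. the leading eigenvectors of the covariance matrix.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    eigenfaces = Vt[:K]
    weights = A @ eigenfaces.T        # w_k: projection on the k-th eigenface
    return mean, eigenfaces, weights

def best_match(new_face, mean, eigenfaces, weights):
    w = (new_face - mean) @ eigenfaces.T          # project the query face
    dists = np.linalg.norm(weights - w, axis=1)   # Euclidean distances
    return int(np.argmin(dists))                  # index of the smallest one
```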
2.1.3. Eigenfeatures
An eigenface-based recognition system can easily be fooled by gross variations of the image such as the presence or absence of facial hair [19]. This shortcoming is inherent to the eigenface approach, which encodes a global representation of the face. To address this issue, [19] proposed a modular eigenfeature approach.
Figure 4. (a) Template image and (b) query image with their associated grids. (c) Grid after deformation using the probabilistic deformable model of face mapping (cf. Section 2.2.3). Images extracted from the FERET face database.
of the form given in equation (8), computing at each iteration both the optimal global translation and the set of optimal local perturbations. In [35], the same authors further drop the C_l term in equation (8). However, to avoid unreasonable deformations, local translations are restricted to a neighborhood.
- While the cost of local matchings C_l only makes use of the magnitude of the complex Gabor coefficients in the EGM approach, here the phase information is used both to disambiguate features which have a similar magnitude and to estimate local distortions.
- The features are no longer extracted on a rectangular graph; they now refer to specific facial landmarks called fiducial points.
- A new data structure called the bunch graph, which serves as a general representation of the face, is introduced. Such a structure is obtained by combining the graphs of a set of reference individuals.
An obvious shortcoming of EGM and EBGM is that C_l, the cost of local matchings, is simply a sum over all local matchings. This contradicts the fact that certain parts of the face contain more discriminant information and that this distribution of information across the face may vary from one person to another. Hence, the cost of local matchings at each node should be weighted according to their discriminatory power [38,39,34,35].
4. Multimodality
Reliable biometric-based person authentication systems, based for instance on iris or retina recognition, already exist, but the user acceptance of such systems is generally low and they should be used only in high-security scenarios. Systems based on voice or face recognition generally have a high user acceptance, but their performance is not yet satisfactory.
Multimodality is a way to improve the performance of a system by combining different biometrics. However, one should be extremely careful about which modalities are combined (in particular, it might not be useful to combine systems which have radically different performances) and how to combine them. In the following, we will briefly describe the possible multimodality scenarios and the different ways to fuse the information.
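The chapter does not commit to a specific fusion rule; as a hedged sketch of one common choice, score-level fusion, the snippet below min-max normalizes each matcher's score and combines them with a weighted sum (all names and numbers are illustrative).

```python
def fuse_scores(scores, ranges, weights):
    """scores: raw matcher outputs, e.g. {'face': 0.8, 'voice': 12.0}.
    ranges: (min, max) observed for each matcher, for min-max normalization.
    weights: relative trust in each modality (should sum to 1)."""
    fused = 0.0
    for modality, s in scores.items():
        lo, hi = ranges[modality]
        fused += weights[modality] * (s - lo) / (hi - lo)  # map to [0, 1]
    return fused

# Example: face and voice matchers with different native score scales.
print(fuse_scores({'face': 0.8, 'voice': 12.0},
                  {'face': (0.0, 1.0), 'voice': (0.0, 20.0)},
                  {'face': 0.6, 'voice': 0.4}))
```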
5. Summary
We introduced in this paper biometrics, which deals with the problem of identifying a person based on his/her physical and behavioral characteristics. Face recognition, one of the most actively researched topics in biometrics, was briefly reviewed. Although great progress has been made in this field over the past twenty years, research has mainly focused on frontal face recognition from still images. We also introduced the notion of multimodality as a way of exploiting the complementary nature of monomodal biometric systems.
References
1. S. Liu and M. Silverman, "A practical guide to biometric security technology", IT Professional, vol. 3, no. 1, pp. 27-32, Jan/Feb 2001.
2. A. Jain, R. Bolle and S. Pankanti, "Biometrics: personal identification in networked society", Boston, MA: Kluwer Academic, 1999.
3. R. O. Duda, P. E. Hart and D. G. Stork, "Pattern classification", 2nd edition, John Wiley & Sons, Inc.
4. P. J. Phillips, H. Moon, S. Rizvi and P. Rauss, "The FERET evaluation methodology for face recognition algorithms", IEEE Trans. on PAMI, vol. 22, no. 10, October 2000.
5. K. Messer, J. Matas, J. Kittler and K. Jonsson, "XM2VTSDB: the extended M2VTS database", AVBPA'99, 1999, pp. 72-77.
6. P. J. Phillips, A. Martin, C. L. Wilson and M. Przybocki, "An introduction to evaluating biometric systems", Computer, 2000, vol. 33, no. 2, pp. 56-63.
7. INSPASS, http://www.immigration.gov/graphics/howdoi/inspass.htm
8. O. O'Sullivan, "Biometrics comes to life", Banking Journal, January 1997.
9. Nuance, http://www.nuance.com
10. Siemens Automotive, http://media.siemensauto.com
11. R. Chellappa, C. L. Wilson and S. Sirohey, "Human and machine recognition of faces: a survey", Proc. of the IEEE, vol. 83, no. 5, May 1995.
12. J. Daugman, "How iris recognition works", ICIP, 2002, vol. 1, pp. 33-36.
13. B. Moreno, A. Sanchez and J. F. Velez, "On the use of outer ear images for personal identification in security applications", IEEE 3rd Conf. on Security Technology, pp. 469-476.
14. R. W. Frischholz and U. Dieckmann, "BioID: a multimodal biometric identification system", Computer, vol. 33, no. 2, pp. 64-68, Feb. 2000.
15. E. Hjelmas and B. K. Low, "Face detection: a survey", Computer Vision and Image Understanding, 2001, vol. 83, pp. 236-274.
16. M. Kirby and L. Sirovich, "Application of the Karhunen-Loève procedure for the characterization of human faces", IEEE Trans. on PAMI, vol. 12, pp. 103-108, 1990.
17. I. T. Joliffe, "Principal Component Analysis", Springer-Verlag, 1986.
18. M. A. Turk and A. P. Pentland, "Face recognition using eigenfaces", in IEEE Conf. on Computer Vision and Pattern Recognition, 1991.
Recently, it has been estimated that oral squamous cell carcinoma (OSCC) represents 3% of all malignant neoplasms. OSCC usually affects more men than women: it is considered the 6th most frequent malignant tumour in males and the 12th in females. In the USA about 21,000 new cases of OSCC are diagnosed every year and 6,000 people die because of this disease.
In the last decade the incidence of OSCC has continued to grow, causing a marked increase in the number of individuals under 30 affected by oral carcinoma.
A serious concern is the prognosis of these patients. If the neoplasm is detected at its 1st or 2nd stage, the probability of surviving for five years is 76%. This value goes down to 41% if the malignant tumour is diagnosed at its 3rd stage. Only 9% of patients survive five years after an OSCC diagnosis made at its 4th stage.
The diagnostic delay has different causes:
- the way the carcinoma develops: during its manifestation, OSCC does not reveal any particular symptom or pain, so the patient tends to ignore the lesion and hardly ever goes to the dentist to ask for a precise diagnosis;
- the polymorphism that oral lesions often show: for example, an ulcer can appear similar to a trauma, aphthae major or carcinoma;
- the doctors in charge, who are not used to examining the oral cavity during routine check-ups: recent research has shown that a person suffering from mucous lesions in the oral cavity goes first to his family doctor, who refers him for a dermatological visit.
Usually, the carcinoma is detected about 80 days after its first symptoms, and this delay is partly responsible for the poor OSCC prognosis.
2. Fluorescence methodologies
[Figure 2: example of mean neural network input curves for tissue classification, grouped according to the clinical diagnosis. From Oral Oncology 36 (2000) 286-293.]
3. Toluidine blue
Fig. 3. Example of neoplastic lesion stained by using toluidine blue. Areas with
more active mitosis stain more with toluidine blue.
The patient then rinses with acetic acid to remove the excess, unfixed stain. At this point the clinician can assess the lesion according to its colour, even though the OSCC diagnosis depends largely on the histology report.
The stained lesion can thus be classified as:
a) TRUE POSITIVE: the lesion has absorbed the colour and is an OSCC from a histological point of view;
b) FALSE POSITIVE: the lesion has absorbed the colour but is not an OSCC from a histological point of view;
c) TRUE NEGATIVE: the lesion does not absorb the colour and is not an OSCC from a histological point of view;
d) FALSE NEGATIVE: the lesion does not absorb the colour but is an OSCC from a histological point of view.
The usual screening statistics follow directly from these four counts, as sketched below.
Fig. 4. Example of traumatic lesion: even though stained by toluidine blue, the lesion is not a carcinoma (false positive).
In reality, this methodology is sensitive but not particularly specific: the number of stained lesions that are nevertheless not cancerous is large.
The scientific literature reports several studies on the reliability of this methodology. The case histories reveal encouraging data about the diagnostic power of toluidine blue, but no study has yet considered a digital reading of the lesion. Digital methodologies could make this test more reliable, for example by exploiting gradations of blue invisible to the naked eye. Digital reading of lesions stained with toluidine blue aims to offer dentists another diagnostic tool. It is inexpensive, easy to use and non-invasive, so it can be used routinely as a screening tool for patients who regularly visit the dentist. Moreover, this methodology makes possible the on-line transmission of the digital images to specialized centres for further consultation.
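As a hedged sketch of what such a digital reading might look like, assuming an RGB photograph of the stained area; the pixel-wise "blueness" criterion and its threshold are assumptions, not a protocol from the text.

```python
import numpy as np

def stain_score(rgb_image, blue_margin=30):
    """rgb_image: (H, W, 3) uint8 array of the lesion region.
    Returns the fraction of pixels whose blue channel clearly dominates,
    a crude proxy for toluidine blue uptake, including gradations that
    may be hard to judge by eye."""
    img = rgb_image.astype(int)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    stained = (b - np.maximum(r, g)) > blue_margin
    return float(stained.mean())
```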
At present no screening methodology achieves 100% sensitivity and specificity. However, the use of data-processing systems improves the reliability of diagnostic methodologies and offers an objective analysis.
4. Conclusions
The scientific literature has not yet reported trials comparing the efficacy of the different methodologies used to analyse images in OSCC diagnosis. We hope that a univocal, reliable and inexpensive methodology for reading the lesion will come into use. Advances in information technology should support clinical diagnosis and could be the ideal way to achieve an early diagnosis. The resulting improvement in prognosis would make the relationship between medicine and computer science extraordinary.
Acknowledgments
References
PAOLA CAMPADELLI
Dipartimento di Scienze dell'Informazione,
Università degli Studi di Milano,
Via Comelico, 39/41
20135, Milano, Italy
E-mail: campadelli@dsi.unimi.it

ELENA CASIRAGHI
Dipartimento di Scienze dell'Informazione,
Università degli Studi di Milano,
Via Comelico, 39/41
20135, Milano, Italy
E-mail: casiraghi@dsi.unimi.it
The use of image processing techniques and Computer-Aided Diagnosis (CAD) systems has proved effective in improving radiologists' diagnoses, especially in the case of lung nodule detection. The first step in the development of such systems is the automatic segmentation of the chest radiograph in order to extract the area of the lungs. In this paper we describe our segmentation method, whose result is a closed contour which tightly encloses the lung area.
1. Introduction
In the field of medical diagnosis a wide variety of imaging techniques is currently available, such as radiography, computed tomography (CT) and magnetic resonance imaging (MRI). Although the last two are more precise and more sensitive techniques, chest radiography is still by far the most common procedure for the initial detection and diagnosis of lung cancer, due to its noninvasive character, low radiation dose and economic considerations. Studies by [20] and [11] explain why the chest radiograph is one of the most challenging radiographs to produce technically and to interpret diagnostically. When radiologists rate the severity of abnormal findings, large interobserver and intraobserver differences occur. Moreover, several studies in the last two decades, for example [8] and [2], calculated an average miss rate of 30% for the radiographic detection of early lung nodules by humans. In a large lung cancer screening program, 90% of peripheral lung cancers were found to be visible in radiographs produced earlier than the date of the cancer's discovery by the radiologist. These results show the potential for improved early diagnosis and suggest the use of computer programs for radiograph analysis. Moreover, the advent of digital thorax units and digital radiology departments with Picture Archiving and Communication Systems (PACS) makes it possible to use computerized methods for the analysis of chest radiographs on a routine basis. The use of image processing techniques and Computer-Aided Diagnosis (CAD) systems has proved effective in improving radiologists' detection accuracy for lung nodules in chest radiographs, as reported in [15].
The first step of an automatic system for lung nodule detection, and in general for any further analysis of chest radiographs, is the segmentation of the lung field, so that all the algorithms for the identification of lung nodules are applied only to the lung area.
The segmentation algorithms proposed in the literature to identify the lung field can be grouped into: rule-based systems ([1], [21], [22], [7], [4], [14], [5], [3]), pixel classification methods including neural networks ([13], [12], [9], [16]) and Markov random fields ([18] and [19]), active shape models ([6]) and their extensions ([17]).
In this paper we describe an automatic segmentation method which identifies the lung area in postero-anterior (PA) digital radiographs. Since the method is intended as the first step of an automatic lung nodule detection algorithm, we chose to include in the area of interest also the bottom of the chest and the region behind the heart, which are usually excluded by the methods presented in the literature. Besides, we tried to avoid assumptions such as a fixed position and orientation of the thorax: we work with images where the chest is not always located in the central part of the image, may be tilted and may have structural abnormalities.
The method consists of two steps. First, the lungs are localized using simple techniques (Section 4); then their borders are more accurately defined and fitted with curves and lines in order to obtain a simple closed contour (Section 5).
2. Materials
Our database currently contains 111 radiographs of patients with no disease and 13 of patients with lung nodules. They have been acquired in the
Figure 1. Lung mask image, initial edge image and edge image.
lung for those oriented at 90° and 135°. We filter the image with a Gaussian filter at scale σ, related to the stripe dimension, take the vertical derivative and maintain the 5% of the pixels with the highest gradient value. These edge pixels, which often belong to the lung borders, are added to the edge image. Since the costophrenic angle can still be missing, we filter the image at a finer scale σ/2, take the derivative at 135° or 45° (depending on the side) and maintain the 10% of the edge pixels. A binary image that may represent the costophrenic angles is obtained by combining this information with the 10% of the pixels with the highest value in the vertical direction. The regions in the binary image just created are added to the lung edge image E if they touch, or are attached to, some edge pixels in it.
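A minimal sketch of this filtering step with NumPy/SciPy, assuming a 2D grayscale radiograph; σ and the 5% fraction follow the text, the rest is illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def strong_vertical_edges(image, sigma, keep_fraction=0.05):
    """Smooth at scale sigma, take the vertical derivative and keep only
    the given fraction of pixels with the highest gradient magnitude."""
    smoothed = gaussian_filter(image.astype(float), sigma)
    dy = np.gradient(smoothed, axis=0)                       # vertical derivative
    threshold = np.quantile(np.abs(dy), 1.0 - keep_fraction)
    return np.abs(dy) >= threshold                           # binary edge mask
```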
At this stage most of the edge pixels belonging to the lung borders should have been determined; the image can hence be reduced by defining a rectangular bounding box slightly larger than the lung area defined by the lung edge image E.
Regions in this image intersecting edge pixels are added to the lung edge image (the result of this addition is shown in Fig. 3 (right)).
Figure 3. Enhanced image with the seed points, edge image after growing, edge image after the last regions are added.
At this point we can define the closed contour of the area containing the lungs, fitting the borders found with curves and lines. We describe the operation on the left lung only, referring to the binary image of its edges as the left edge image El. We noticed that the shape of the top part of the lung can be well fitted by a second-order polynomial function. To find it we use the Hough transform to search for parabolas, applied to the topmost points of each column in El. The fitted parabola is stopped, on the right side of its vertex, at the point where it crosses a line parallel to the axis and passing through the rightmost pixel; on the left side it is stopped where it crosses the left edge image; if more than one point is found, we select the one with the lowest y coordinate.
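A hedged sketch of such a Hough search for parabolas y = a(x - x0)² + y0 over the topmost edge points; the brute-force discretization of the parameter space is an assumption, not the paper's implementation.

```python
import numpy as np

def hough_parabola(points, a_bins, x0_bins, y0_bins, tol=1.0):
    """points: (N, 2) array of (x, y) topmost points of each column.
    Each *_bins argument lists candidate parameter values.
    Returns the (a, x0, y0) triple voted for by the most points."""
    best, best_votes = None, -1
    for a in a_bins:
        for x0 in x0_bins:
            for y0 in y0_bins:
                y_pred = a * (points[:, 0] - x0) ** 2 + y0
                votes = int(np.sum(np.abs(points[:, 1] - y_pred) < tol))
                if votes > best_votes:
                    best, best_votes = (a, x0, y0), votes
    return best
```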
5. Results
We detected small errors in 4 of the 124 images in our database, where we consider as an error the fact that a part of the lung has not been included in the lung contours defined. The part missed by the algorithm is the border of the costophrenic angle. The algorithm nevertheless proves robust to structural abnormalities of the chest (Fig. 4). The algorithm has been implemented in IDL, an interpreted language; when executed on a Pentium IV with 256 MB of RAM, it takes from 12 seconds (for images of patients with small lungs, which can be cropped as described in Section 4.4) to 20 seconds (for images of large lungs).
References
1. S.G. Armato, M. Giger, and H. MacMahon. Automated lung segmentation in digitized posteroanterior chest radiographs. Academic Radiology, 5:245-255, 1998.
2. J.H.M. Austin, B.M. Romeny, and L.S. Goldsmith. Missed bronchogenic carcinoma: radiographic findings in 27 patients with a potentially resectable lesion evident in retrospect. Radiology, 182:115-122, 1992.
3. M.S. Brown, L.S. Wilson, B.D. Doust, R.W. Gill, and C. Sun. Knowledge-based method for segmentation and analysis of lung boundaries in chest x-ray images. Computerized Medical Imaging and Graphics, 22:463-477, 1998.
4. F.M. Carrascal, J.M. Carreira, M. Souto, P.G. Tahoces, L. Gomez, and J.J. Vidal. Automatic calculation of total lung capacity from automatically traced lung boundaries in postero-anterior and lateral digital chest radiographs. Medical Physics, 25:1118-1131, 1998.
5. D. Cheng and M. Goldberg. An algorithm for segmenting chest radiographs. Proc. SPIE, pages 261-268, 1988.
6. T. Cootes, C. Taylor, D. Cooper, and J. Graham. Active shape models: their training and application. Comput. Vis. Image Understanding, 61:38-59, 1995.
7. J. Duryea and J.M. Boone. A fully automatic algorithm for the segmentation of lung fields in digital chest radiographic images. Medical Physics, 22:183-191, 1995.
8. J. Forrest and P. Friedman. Radiologic errors in patients with lung cancer. West Journal on Med., 134:485-490, 1981.
9. A. Hasegawa, S.-C. Lo, M.T. Freedman, and S.K. Mun. Convolution neural network based detection of lung structure. Proc. SPIE 2167, pages 654-662, 1994.
10. R. Klette and P. Zamperoni. Handbook of image processing operators. Wiley, 1994.
11. H. MacMahon and K. Doi. Digital chest radiography. Clin. Chest Med., 12:19-32, 1991.
12. M.F. McNitt-Gray, H.K. Huang, and J.W. Sayre. Feature selection in the pattern classification problem of digital chest radiograph segmentation. IEEE Trans. on Med. Imaging, 14:537-547, 1995.
13. M.F. McNitt-Gray, J.W. Sayre, H.K. Huang, and M. Razavi. A pattern classification approach to segmentation of chest radiographs. Proc. SPIE 1898, pages 160-170, 1993.
14. E. Pietka. Lung segmentation in digital chest radiographs. Journal of Digital Imaging, 7:79-84, 1994.
15. T. Kobayashi, X.-W. Xu, H. MacMahon, C. Metz, and K. Doi. Effect of a computer-aided diagnosis scheme on radiologists' performance in detection of lung nodules on radiographs. Radiology, 199:843-848, 1996.
16. O. Tsuji, M.T. Freedman, and S.K. Mun. Automated segmentation of anatomic regions in chest radiographs using an adaptive-sized hybrid neural network. Med. Phys., 25:998-1007, 1998.
17. B. van Ginneken. Computer-aided diagnosis in chest radiographs. Ph.D. dissertation, Utrecht Univ., Utrecht, The Netherlands, 2001.
18. N.F. Vittitoe, R. Vargas-Voracek, and C.E. Floyd Jr. Identification of lung regions in chest radiographs using Markov random field modeling. Med. Phys., 25:976-985, 1998.
19. N.F. Vittitoe, R. Vargas-Voracek, and C.E. Floyd Jr. Markov random field modeling in posteroanterior chest radiograph segmentation. Med. Phys., 26:1670-1677, 1999.
20. C.J. Vyborny. The AAPM/RSNA physics tutorial for residents: image quality and
C. VALENTI
Dipartimento di Matematica ed Applicazioni
Università degli Studi di Palermo
Via Archirafi 34, 90123 Palermo - Italy
E-mail: cvalenti@math.unipa.it
This paper describes the new research field of discrete tomography, which differs from standard computerized tomography in the reduced number of projections. It needs ad hoc algorithms, usually based on the definition of a model of the object to reconstruct. The main problems will be introduced, and an experimental simulation will prove the robustness of a slightly modified version of a well-known method for the reconstruction of binary planar convex sets, even in the case of projections affected by quantization error. To the best of our knowledge this is the first experimental study of the stability problem with a statistical approach. Prospective applications include crystallography, quality control and reverse engineering, while biomedical tests, due to their important role, still require further research.
1. Introduction
Computerized tomography is an example of inverse problem solving. It consists in recovering a 3D object from its projections [1]. Usually this object is made of materials with different densities, and it is therefore necessary to take a number of projections ranging between 500 and 1000. When the object is made of just one homogeneous material, it is possible to reduce the number of projections to no more than four, defining the so-called discrete tomography [2]. In such a case we define a model of the body, assuming its shape. For example, we may know the types of atoms to analyze, the probability of finding holes inside the object and its topology (e.g. successive slices are similar to each other, or some configurations of pixels are energetically unstable) [3].
Though these assumptions may be useful when considering applications
Figure 1. A subset of Z² and its corresponding linear equation system. The black disks and the small dots represent the points of the object and of the discrete lattice, respectively.
The main issues in discrete tomography arise from this dearth of input data. In 1957 a polynomial-time method to solve the consistency problem (i.e. the ability to state whether there exists any A compatible with a given p) was presented [4].
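The text does not spell the method out; as a hedged illustration consistent with the 1957 date, the classical Gale-Ryser condition tests in polynomial time whether a binary matrix exists with given row and column sums.

```python
def consistent(row_sums, col_sums):
    """Gale-Ryser test: does a binary matrix A exist whose horizontal
    and vertical projections equal row_sums and col_sums?"""
    if sum(row_sums) != sum(col_sums):
        return False
    r = sorted(row_sums, reverse=True)
    for k in range(1, len(r) + 1):
        if sum(r[:k]) > sum(min(c, k) for c in col_sums):
            return False
    return True

# True: e.g. [[1,1,0],[1,0,0],[0,1,0]] has these projections.
print(consistent([2, 1, 1], [2, 2, 0]))
```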
The uniqueness problem derives from the fact that different A's can satisfy the same p. For example, two A's with the same horizontal and vertical projections can be transformed one into the other by a finite sequence of switching operations (Figure 2). Moreover, there is an exponential number of hv-convex polyominoes (i.e. 4-connected sets with 4-connected rows and columns) with the same horizontal and vertical projections [5].
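A switching operation exchanges the 2×2 patterns [[1,0],[0,1]] and [[0,1],[1,0]] between two rows and two columns, which by construction leaves both projections unchanged; a minimal sketch (names are illustrative):

```python
import numpy as np

def switch(A, i1, i2, j1, j2):
    """Apply a switching operation in place, if the selected 2x2 minor
    allows it; row and column sums are preserved."""
    idx = np.ix_([i1, i2], [j1, j2])
    sub = A[idx]
    if (sub == [[1, 0], [0, 1]]).all() or (sub == [[0, 1], [1, 0]]).all():
        A[idx] = 1 - sub
        return True
    return False
```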
Lastly, the stability problem concerns how the shape of an object changes when its projections are perturbed. In computerized tomography the variation in the final image due to the fluctuation of one projection sample is generally disregarded, since each sample contributes independently, as one of many, to the result, and the effect is therefore distributed broadly across the reconstructed image [6]. This is not true in the discrete case, and the first theoretical analysis of the reconstruction of binary objects of arbitrary shape has proved that this task is unstable and that it is very hard to obtain a reasonably good reconstruction from noisy projections [7]. Here we will describe how our experimental results show that it is possible to recover convex binary bodies from their perturbed projections while maintaining a low reconstruction error.
3. Reconstruction algorithm
In order to verify the correctness of the algorithm we generated 1900 convex sets with 10×10, 15×15, ..., 100×100 pixels. A further 100 convex sets with both width and height randomly ranging between 10 and 100 were considered too. Their projections were perturbed 1000 times by incrementing or decrementing by 1 the value of some of their samples, randomly chosen. This is to estimate the effect of errors with absolute value 0 ≤ ε ≤ 1, thus simulating a quantization error. The number of samples was decided in a random way, but to keep the area of the reconstructed body constant we add and subtract the same amount of pixels in all projections.
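A hedged sketch of this perturbation scheme; the exact sampling details of the experiments are not given, so the choices below are illustrative.

```python
import random

def perturb(projection, n_changes, conserve_area=True):
    """Increment or decrement by 1 the value of randomly chosen samples.
    With conserve_area=True, the same number of +1 and -1 changes is
    applied, so the total area implied by the projection is preserved."""
    p = list(projection)
    if conserve_area:
        n_changes -= n_changes % 2                    # as many +1 as -1
        deltas = [+1] * (n_changes // 2) + [-1] * (n_changes // 2)
    else:
        deltas = [random.choice((+1, -1)) for _ in range(n_changes)]
    for d in deltas:
        k = random.randrange(len(p))
        p[k] = max(0, p[k] + d)                       # samples stay non-negative
    return p
```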
Figure 3. The first two filling operations are not based on the projection value. The circles represent pixels not yet assigned to the core.
Figure 4. Convex recovery through {d1,d2,d5}. The spine is shown in the first two steps, the filling operations in the remaining ones. The grey pixels are not yet assigned.
two literals) [10]. This complete search has exponential time complexity, but it has been proved that these formulas are very small and occur rarely, especially for big images [11].
In order to measure the difference between the input image taken from the database and the obtained one, we used the Hamming distance (i.e. we counted the differing homologous pixels), normalized according to the size of the image. Most of the time we obtained non-convex solutions for which the boolean evaluation involves a bigger average error. For this reason, we preferred not to apply the evaluation on the ambiguous zones when they were not due to switching components. We want to emphasize that these pixels take part in the error computation only when compared with those of the object. That is, we treat these uncertain pixels, if any, as belonging to the background of the image.
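A minimal sketch of this error measure, with the uncertain (grey) pixels counted as background as the text prescribes.

```python
import numpy as np

def normalized_hamming(original, reconstructed, uncertain=None):
    """original, reconstructed: binary arrays of the same shape.
    uncertain: optional boolean mask of ambiguous pixels, treated
    as background (0) before the comparison."""
    rec = np.array(reconstructed, copy=True)
    if uncertain is not None:
        rec[uncertain] = 0
    return float(np.mean(original != rec))   # fraction of differing pixels
```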
Figure 5. Non-convex recovery (upper right) from a binarized real bone marrow scintigraphy (left) with 1 pixel added/subtracted along {d3,d4} and without spine. The final reconstructed image (lower right) is obtained by deleting all remaining grey pixels. The input image is utilized and reproduced with permission from the MIR Nuclear Medicine digital teaching file collection at Washington University School of Medicine. MIR and Washington University are not otherwise involved in this research project.
4. Experimental results
This final section summarizes the most important results we obtained, also giving a brief explanation.
The average error rate increases with the number of modified samples. Obviously, the more we change the projections, the harder it is for the algorithm to reconstruct the object (Figure 6a).
Many non-convex sets suffer from a number of wrong pixels lower than the average error. Even when the algorithm cannot exactly reconstruct the convex set, the forced non-convex solutions still keep the shape of the original object. For example, about 66.11% of the non-convex solutions, marked in grey, with fixed 100×100 size and 1 pixel added/subtracted along directions {d3,d4,d5,d6}, have an error smaller than the 0.34% average error (Figure 6b).
In the case of convex solutions, the spine construction reduces the number of undetermined cells for the successive filling phase. In the case of non-convex solutions, the spine usually imposes an initial object shape that produces solutions very different from the input polyomino. An example of a non-convex set obtained without spine preprocessing is shown in Figure 5.
The choice of the horizontal and vertical directions {d1,d2} is not always the best one. For example, {d3,d4} and {d5,d6} recover more non-convex solutions with a smaller error. This is due to the higher density of the scan lines, which corresponds to a better resolution. More than two directions improve the correctness of the solutions, thanks to the reduced degree of freedom of the undetermined cells. The following tables concisely report all these results, obtained for objects with 100×100 pixels, with or without the spine construction, along different directions and by varying the number of perturbed samples.
To the best of our knowledge this is the first experimental study of the stability problem with a statistical approach. Our results give a quantitative estimate both of the probability of finding solutions and of that of introducing errors at a given rate. We believe that a more realistic instrumental noise should be introduced, considering also that the probability of finding an error with magnitude greater than 1 usually grows in correspondence with the samples of maximum value. Moreover, though the convexity constraint is interesting from a mathematical point of view, we are at present also dealing with other models of objects to reconstruct, suitable for real microscopy or crystallography tools.
Acknowledgements
The author wishes to thank Professor Jerold Wallis [12] for his kind contribution in providing the input image of Figure 5.
Figure 6. a: Average, minimum and maximum error versus number of modified samples, for non-convex solutions with fixed 100×100 size, directions {d1,d2} and spine preprocessing. Linear least-squares fits are superimposed. b: Number of non-convex solutions versus error, for fixed 100×100 size and 1 pixel added/subtracted along directions {d3,d4,d5,d6} without spine. The dashed line indicates the average error.
References
1. A.C. Kak and M. Slaney, Principles of Computerized Tomographic Imaging. IEEE Press, New York, 1988.
1. Introduction
In recent years computer-generated imaging (CGI) has often been used for forensic reconstruction [19], as an aid for the identification of cadavers, as well as for medical visualization [3,16], for example in the planning of maxillo-facial surgery [14]. In fact, the 3D modelling, rendering and animation environments available today have greatly increased their power to produce realistic images of humans quickly and effectively [8]. Nevertheless, the typical approach adopted for modelling a face is often still too artistic, relying mainly on the anatomic and physiognomic knowledge of the modeller. In other terms, computer technology is simply replacing the old process of creating an identikit by hand-drawn sketches or by sculpting clay, adding superior editing and simulation capabilities, but often with the same limits in terms of reliability of the results.
The recent finding of five skulls (see Figure 1) and several bones from a group of sixteen individuals in Murecine (near Pompei) offers the opportunity to use CGI and craniographic methods [5] to reconstruct the aspect of the victims of this tremendous event.
This paper starts from the assumption that, unfortunately, what is lost in the findings of ancient human remains is lost forever. This means that there is no way to exactly reproduce a face simply from its skull, because there are many ways in which soft tissues may cover the same skull, leading to different final aspects.
The problem is even more complicated in the (frequent) case of partial findings, because the missing elements (mandible or teeth, for example) cannot be derived from the remaining bones [7].
Figure 1. One of the skulls found in the archaeological site of Murecine, near Pompei.
Nevertheless, it is true that the underlying skeleton directly affects the overall aspect of an individual, and many fundamental physiognomic characteristics are strongly affected by the skull. One of the main purposes of this study is therefore to correlate ancient skulls with skulls of living individuals, trying, in this way, to replace lost information (for example missing bones and soft tissues) with new compatible data. Additionally, the physiognomically relevant elements that are too uncertain to be derived from a single compatible living individual are selected through a search in a facial database (built from classical art reproductions of typical Pompeians) and then integrated in the reconstruction.
2. Related Works
Facial reconstruction from the skull has a long history, beginning around the end of the nineteenth century. The reconstructive methodologies developed over more than a century [20] basically derive from two main approaches:
Many of the methods mentioned above rely on a large survey of facial soft tissue depth, measured at a set of anatomically relevant points. First developed on cadavers, this measurement protocol has been improved [4] with data from other races, various body builds, and even from living individuals, using radiological and ultrasound diagnostic techniques.
Figure 2. Landmarks located on front and side view of skull and craniometrical tracing.
Once the database is built, it is possible to search through it to find the record (the modern Pompeian individual) whose craniometrical features are most similar to the unknown subject given as input. This task is accomplished by evaluating for each record i a Craniometrical Similarity Score (CSS).
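The formula itself is not reproduced in this copy; as a clearly hypothetical stand-in, a CSS could be an inverse normalized distance between craniometrical measurement vectors, equal to 1 for a perfect match (consistent with the remark below that a CSS of 1 is unlikely).

```python
import numpy as np

def css(query, record, scale):
    """query, record: craniometrical measurement vectors (hypothetical).
    scale: typical variability of each measurement, used to normalize.
    Returns a score in (0, 1], where 1 means identical measurements."""
    q, r, s = (np.asarray(v, dtype=float) for v in (query, record, scale))
    return 1.0 / (1.0 + np.linalg.norm((q - r) / s))
```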
3.8. Warping the rough face model to fit the original set of landmarks
If the CSS of the reconstructed head is not equal to 1 (and this will probably always be true), then we would like to modify the shape of this model to better fit the craniometrical features of the found skull.
This kind of three-dimensional deformation of a mesh, based on vertex relocation by a specific transformation of coordinates, is usually referred to as "warping". More precisely, we want to move every bone landmark L_j of the "best match" (on the dry skull) to a new position that corresponds to the coordinates of L'_j. The purpose is to affect the polygonal surface local to L_j, using the landmark as a handle to guide the transformation. Many different algorithms are available to accomplish this task; we chose a free-form deformation, which simply works by assigning to the input mesh a lattice with n control vertices (our landmarks L_j) and, by moving them (to L'_j), deforms the surrounding surface smoothly.
After the warping is applied, the face model fits the dry skull better, and this match can easily be verified by visualizing at the same time the skull mesh (from the CT 3D reconstruction) and the face model with partial transparency.
4. Discussion
These peculiarities lead to a precise applicative range for the proposed method, with advantages and limits with respect to the other methods presented.
The proposed method works best on a complete skull, but even in the case of a missing mandible it can still produce interesting results, using the remaining craniometrical measurements to search for a similar subject in the CD, thus replacing the lost information (even if greater uncertainty arises).
Another critical point about the "warping methods" mentioned in Section 2 is the reference face mesh to warp, because its physiognomic features affect the final result independently of the correctness of the soft tissue depths at the discrete set of landmarks involved in the process. The basic classification by race (Caucasian, Afro, Asian, etc.), sex and build (fat, normal or thin) is often too generic to accurately reproduce the aspect of specific ethnic groups.
The proposed method, based on the custom-built CD containing records of anthropologically compatible individuals, uses as a reference mesh the 3D face model of the most similar subject in the database, thus minimizing the amount of
5. Conclusion
References
[2] J.P. Moss, A.D. Linney, S.R. Grindrod, C.A. Mosse, A laser scanning system for the measurement of facial surface morphology, Optics Lasers Eng. 10 (1989) 179-190.
[3] A.C. Tan, R. Richards, A.D. Linney, 3-D medical graphics - using the T800 transputer, in: Proceedings of the 8th OCCAM User Group Technical Meeting, 1988, pp. 83-89.
[4] J.S. Rhine, C.E. Moore, Facial reproduction tables of facial tissue thickness of American Caucasoids in forensic anthropology, in: Maxwell Museum Technical Series 1, Maxwell Museum, Albuquerque, New Mexico, 1982.
[5] R.M. George, The lateral craniographic method of facial reconstruction, J. Forensic Sci. 32 (1987) 1305-1330.
[6] R.M. George, Anatomical and artistic guidelines for forensic facial reconstruction, in: M.H. Iscan, R.P. Helmer (Eds.), Forensic Analysis of the Skull, Wiley-Liss, New York, 1993, pp. 215-227, Chapter 16.
[7] H. Peck, S. Peck, A concept of facial aesthetics, Angle Orthodont. 40 (1970) 284-318.
[8] K. Waters, D. Terzopoulos, Modelling and animating faces using scanned data, J. Visual. Graphics Image.
[9] H. Hjalgrim, N. Lynnerup, M. Liversage, A. Rosenklint, Stereolithography: potential applications in anthropological studies, Am. J. Phys. Anthropol. 97 (1995) 329-333.
[10] A.W. Sharom, P. Vanezis, R.C. Chapman, A. Gonzales, C. Blenkinsop, M.L. Rossi, Techniques in facial identification: computer-aided facial reconstruction using a laser scanner and video superimposition, Int. J. Legal Med. 108 (1996) 194-200.
[11] N. Lynnerup, R. Neave, M. Vanezis, P. Vanezis, H. Hjalgrim, Skull reconstruction by stereolithography, in: J.G. Clement, D.L. Thomas (Eds.), Let's Face It! Proceedings of the 7th Scientific Meeting of the International Association for Craniofacial Identification, Local Organising Committee of the IACI, Melbourne, 1997, pp. 11-14.
[12] Gonzalez-Figueroa, An Evaluation of the Optical Laser Scanning System for Facial Reconstruction, Ph.D. thesis, University of Glasgow, 1998.
[13] R. Enciso, J. Li, D.A. Fidaleo, T-Y Kim, J-Y Noh and U. Neumann, Synthesis of 3D Faces, Integrated Media Systems Center, University of Southern California, Los Angeles.
[14] M.W. Vannier, J.L. Marsh, J.O. Warren, Three dimensional CT reconstruction images for craniofacial surgical planning and evaluation, Radiology 150 (1984) 179-184.
[15] S. Arridge, J.P. Moss, A.D. Linney, D.R. James, Three-dimensional digitisation of the face and skull, J. Max.-fac. Surg. 13 (1985) 136-143.
[16] S.R. Arridge, Manipulation of volume data for surgical simulation, in: K.H. Hohne, H. Fuchs, S.M. Pizer (Eds.), 3D Imaging in Medicine, NATO ASI Series F 60, Springer-Verlag, Berlin, 1990, pp. 289-300.
[17] J.P. Moss, A.D. Linney, S.R. Grinrod, S.R. Arridge, J.S. Clifton, Three dimensional visualization of the face and skull using computerized tomography and laser scanning techniques, Eur. J. Orthodont. 9 (1987) 247-253.
[18] J.P. Moss, A.D. Linney, S.R. Grinrod, S.R. Arridge, D. James, A computer system for the interactive planning and prediction of maxillo-facial surgery, Am. J. Orthodont. Dentofacial Orthoped. 94 (1988) 469-474.
[19] P. Vanezis, R.W. Blowes, A.D. Linney, A.C. Tan, R. Richards, R. Neave, Application of 3-D computer graphics for facial reconstruction and comparison with sculpting techniques, Forensic Sci. Int. 42 (1989) 69-84.
[20] A.J. Tyrell, M.P. Evison, A.T. Chamberlain, M.A. Green, Forensic three-dimensional facial reconstruction: historical review and contemporary developments, J. Forensic Sci. 42 (1997) 653-661.
[21] G. Quatrehomme, S. Cotin, G. Subsol, H. Delingette, Y. Garidel, G. Grevin, M. Fidrich, P. Bailet, A. Ollier, A fully three-dimensional method for facial reconstruction based on deformable models, J. Forensic Sci. 42 (1997) 649-652.
1. Introduction
vi) to conclude, we can say that while in the in-presence paradigm the
student plays a reactive role, in the distance modality the student assumes
a proactive role.
The traditional university, as an institution offering on-site courses, needs, in order to maintain its prestigious position, to know how to make the most of the opportunities offered by new technologies. The challenge is to rethink the higher education environment in the light of new technologies, in order to meet the challenges of a global context. For this reason, several countries are promoting technological development measures in education policy, driven either by government or by university associations. This implies the establishment of strategic lines for the development of a more open education.
4. E-learning in Europe
The use of e-learning for enhancing quality and improving accessibility to
education and training is generally seen as one of the keystones for building the
European knowledge society.
At the Member State level, most countries have their own Action Plan for
encouraging the use of ICT in education and training: often involving direct
support for local pilots of e-learning in schools and higher education.
Evidence that true e-learning is being used in Europe is not easy to find, as
it’s typical at this stage of an embryonic technology market for organizations to
work through a series of internal pilots.
Compared to the USA, in some ways Europe is following a different path:
greater government involvement, more emphasis on creative and immersive
approaches to learning, more blending of e-learning with other forms, a greater
use of learning communities (mainly by southern European users), and
(particularly in Scandinavia) a strong emphasis on simulation and mobile
communications.
E-learning standards are recognized as being useful, even essential, to
encourage the reuse and interoperability of learning materials.
It is important to sustain the exchange of experience within Europe on the
use of ICT for learning and to develop a common understanding of what is good
or Best Practice.
We think that e-learning standards can only be established as the result of a
profitable collaboration among varied entities, operating in different contexts,
with different objectives. Only by sharing problems, solutions and evaluations of
the various outcomes the real essence of potential drawbacks and advantages can
be assessed.
new potentialities of technology and remain in the "cybercave", where they can see only the shadows of technology. Even those completely tempted by "hi-tech" can make a big mistake by forcing contents onto technology: true effectiveness is only obtained by adapting technology to contents!
contexts. In fact, the very notion of learning objects relies on this fundamental
principle: when developing a new educational unit, the objective should be the
construction of several basic components about the subject considered and these
components should be reusable in other contexts and with different learning
strategies.
Eventually, all the educational units will be accessible through the Internet,
i.e. they will be edited and used by many users simultaneously. It has to be
remarked that the objective of the analysis and development presented is not that
of achieving another 'Content Management System' but rather the definition of
the guidelines for selecting the tools and educational resources that will
eventually become the shared infrastructure.
A final, far from negligible, objective of the Pavia e-learning project is that of enlarging as much as possible the pool of potential teachers by stimulating and promoting the adoption of the new activity, up to achieving a course portfolio for the entire traditional university background, ranging from the humanities to the sciences and all the applied sciences.
References
CARLA MILANI
IBM EMEA, South Region, Learning Solutions
After having analyzed the motivations that lead to a revision of the teaching and learning models, bearing in mind the European Union initiatives, we now analyze different possible views of e-learning, both from a technological perspective and through a more global and integrated approach.
IBM has decided to participate in this transformation challenge and, after an accurate analysis of all aspects of the phenomenon and the realization inside the company of wide e-learning strategies, is ready for the education world, as a partner able to handle complex projects, both in academic and corporate environments.
We present the IBM role in the EU initiatives and in the public-private partnerships started by the Commission. We also outline the IBM education model, useful for building learning projects with a "blended" methodology.
The change of the education systems is one of the most important challenges
for our society - the mobilisation and participation of all players is required.
Education institutions have been conservative by culture and tradition - hence
the challenge of the change process is second to none. This social transformation
can only be successful if managed in a proactive and holistic way - taking into
account all critical success factors, in particular the role of e-learning as the
driver of change must be recognised and understood.
1. What is e-learning?
E-learning in a broad sense embraces all these views and meanings. As such
it can be conceived as a complex, integrated process, where the Internet enables
social inclusion and social cohesion - enabling us to involve and connect people,
pedagogy, processes, content and technology. E-learning is supporting the
development, delivery, evaluation, management and commerce of learning in an
integrated way.
Understanding the complex nature of this new learning paradigm has led
IBM to adopt a broad definition of e-learning, based on a total systems
perspective. It is related to our notion of e-business, which is about transforming
core business processes by leveraging the net. Typical core business processes
are customer relationship management (CRM), Supply Chain Management and
e-commerce. Since e-learning affects the core business processes and the
business model relating to learning provision we define it as follows:
[Diagram: the e-learning system as process, content and technology. Process: tracking/reporting, skills planning and assessment, support/help. Content: instructional, interactional. Technology: IT infrastructure, content repositories, portals, learning management systems, LCMS, authoring tools.]
It is clear from the above that technology is one of the necessary conditions to make e-learning work, though not a sufficient one. However, without a sound technology strategy there is no way even to get started with an e-learning deployment.
The base layer of the enabling technology is the network infrastructure. Our
customers tell us they need a network infrastructure that is robust, reliable,
scalable, secure and flexible - based on open standards. Availability,
interoperability and manageability are also key requirements. The network
infrastructure must allow for access by multiple devices ranging from laptop
computers to mobile phones. The network infrastructure sets the basic
capabilities and limitations as to what type of e-learning programmes can be
provided.
media objects repositories and authoring tools. Software brings flexibility and
innovation for the teachers in their course context and enables teachers and
learners to collaborate synchronously or asynchronously and to establish work
processes. Software also integrates and secures your existing environment with
e-learning.
Learning portals integrate the view and access of the learning environment from the user perspective and eventually enable the user to create a personalised "myUniversity" or "mySchool", based on the profiles of students, educators, administration staff, alumni and external stakeholders.
- IBM has long-standing experience with integrating access devices into an e-learning environment. In a concept known as "ThinkPad University" we work with universities to implement an integrated programme for the deployment and support of mobile computing for students and faculty.
- IBM has a strong commitment to open source and open standards and has been strongly engaged in Linux, Extensible Markup Language (XML), Java 2 Platform, Enterprise Edition (J2EE), Web services and Java. With the emergence of new standards in the field of e-learning, IBM takes an active role in international standardisation work, such as the development of standards for learning object metadata.
To sum it up: with IBM you can develop e-learning for everyone, not just a small group, and get a better return on investment (ROI).
Also, education in the 21st century should address the role of preparing
students to operate in an uncertain and ever-changing environment. What is
needed is a toolkit to last through life, comprising such intra-personal elements
as a values framework, self-knowledge, capacity for critical analysis, ability to
learn, as well as communication and social skills.
The use of new technologies in schools will free educators’ time for concentrating on these new core competencies and for ‘identifying the strengths of individuals, to focus on them and to lead students to achievement’. The use of
technology for e-learning also forces teachers to develop a new relationship with
students - one in which teachers act as facilitator and mentor to the self-directed,
independent and collaborative learning activities of students. The learning
journey is becoming an interactive process, which at times demands self-direction by the learner, at times is dependent on feedback from peers and tutors
and at others is simply a function of instructor defined outcomes. It will be
necessary to place comparable emphasis on technology and face-to-face
interaction, to balance a teacher-directed and a facilitated, collaborative
approach and to place equal importance on teacher delivery and learner
exploration. All of this argues for what has been called a ‘high tech - high touch’
approach. Such paradigm shifts in teaching and learning require radical changes
in the competencies of teachers and the attitudes of learners.
Our value proposition for education customers builds on four main sources:
1. IBM has invested $70m since 1994 in its ‘Reinventing Education’ partnership programme, with the objective of improving the quality of primary and secondary education. From the 28 installations around the world and the ensuing research projects, IBM has gained significant intellectual capital and has developed solutions for schools. More details about IBM’s Reinventing Education programme can be found at ibm.com/ibm/ibmgives.
The Learning Village solution incorporates the experiences and the findings of these projects.
Get together
Since the early days of the European Commission’s white papers ‘Growth, Competitiveness and Employment’ (1994) and ‘Teaching and Learning’ (1995), the European institutions have played an important role in actively tackling the challenges of the 21st century and challenging prevailing mindsets. These white
papers set out the framework for subsequent commission documents, including
the eEurope Initiative, the eEurope Action Plan, the e-learning initiative and the
e-learning Summit and Action Plan (May 2001), the Memorandum on Lifelong
Learning and the Report On The Concrete Future Objectives Of Education
Systems. Attaining all the goals defined at the Lisbon European Council in
March 2000, presupposes the committed involvement of all the players involved
in education and training.
“The fact is that in the future a society’s economic and social performance will
increasingly be determined by the extent to which its citizens and its economic
and social forces can use the potential of these new technologies, how efficiently
they incorporate them into the economy and build up a knowledge-based
society”.
(Communication from the Commission: eLearning - Designing tomorrow’s
Education, 2000)
Career-Space was founded in late 1998 with support and sponsorship from the European Commission. Seven major ICT companies were founding members - BT, IBM, Microsoft, Nokia, Philips, Siemens and Thales (formerly Thomson-CSF). This initiative was triggered by the structural shortage of qualified ICT personnel in Europe, which will impact the future prosperity of the continent if not addressed adequately. As a first compelling issue to be addressed in the context of the ICT skills gap, Career-Space has given a response from the industry perspective as to what generic skills will be needed in the future and should be built by the ‘suppliers’. Following the publication of 13 ‘generic ICT skills profiles’ by the end of 1999, new members joined the group - Cisco, Intel, Nortel Networks and Telefonica. In addition, the European ICT Industry Association (EICTA) and CEN/ISSS (the European standardisation organisation for ICT) joined the Steering Committee along with EUREL (the convention of national societies of electrical engineers in Europe).
It is our vision that new value-nets will emerge, with government bodies,
education institutions, corporations, technology providers, media companies and
publishers joining forces to provide learning on demand - as a ‘utility’. E-
learning utilities will serve schools, universities, small and medium enterprises,
larger corporations and individuals to meet their learning needs, shield them
from the complexities of the underlying infrastructures and systems and provide
ubiquitous access to learning via education portals to learning experiences. The
e-learning utility will also provide a managed environment for content producers
to deliver their content to education and learning environments. The e-learning
utility will be part of an overall learning environment, where the human elements
such as tutors, mentors and social interactions between groups will continue to
play a vital role - yet in an effective blend with technology.
The E-learning Utility will shield the education institutions from the
complexity of the IT solution and from the load of building, running and
maintaining it. It will allow focus on what is essential for the institutions - for
instance, the provision of well orchestrated learning opportunities, strong
curricula and adequate pedagogy and meeting the need of a diverse audience.
The knowledge society enforces a new way of thinking and acting about
learning. We are at the very beginning of this journey, but it has definitely
started. The winners will be those who get on the learning curve early.
Interaction and collaboration are the most meaningful aspects that support the so-called IBM e-Learning model, which the company uses both for its own internal education and for the management of large e-Learning projects with its customers.
In addition, the model lets you develop courses both horizontally and vertically. In other words, some courses can be based on a single tier, using only one e-Learning methodology, while others need more levels and different methodologies. These latter solutions are called blended solutions.
The result is a time reduction for students outside the work schedule and an optimization of teachers’ precious time as well as expensive resources.
Web links:
Information about our Learning Solutions for schools, higher education and
government can be found at this Web site:
ibm.com/learning
ibm.com/software/info/university/scholarsprogram
SHI-KUO CHANG
Department of Computer Science
University of Pittsburgh
Pittsburgh, PA, USA
E-mail: chang@cs.pitt.edu
An evolutionary query is a query that changes in time and/or space. For example, when
an emergency management worker moves around in a disaster area, an evolutionary
query can be executed repeatedly to evaluate the surrounding area in order to locate
objects of threat. Depending upon the position of the query originator, the time of the
day and other factors such as feedback from sensors, the query can be modified.
Incremental query modification leads to a query similar to the original query. Non-
incremental query modification on the other hand may lead to a substantially different
query. Query morphing includes both incremental query modification and non-
incremental query modification. In sensor-based evolutionary query processing, through
query morphing one or more sensors can provide feedback to the other sensors. The sensor dependency graph is used to facilitate query optimization because most sensors can generate large quantities of temporal/spatial information within short periods of time.
Applications to multi-sensor information fusion in emergency management, pervasive
computing and situated computing are discussed.
1. Evolutionary Queries
The person or agent who issues the query is called the query originator.
Depending upon the spatial/temporal coordinates of the query originator and feedback from sensors, an evolutionary query can be modified accordingly.
Under normal circumstances the modified query is quite similar to the original
query, differing mainly in the spatial constraints of the query.
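As a rough illustration, an incremental modification of this kind can be sketched as an update to the spatial constraint of a query object; the class layout, field names and the modify_incrementally helper below are hypothetical, not taken from the paper.

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class EvolutionaryQuery:
        """A query whose spatial constraint follows the query originator."""
        object_type: str            # e.g. 'truck'
        center: tuple               # current position of the query originator
        radius: float               # space-of-interest around the originator

    def modify_incrementally(query, new_position):
        # Incremental modification: same query structure, new spatial constraint.
        return replace(query, center=new_position)

    q0 = EvolutionaryQuery('truck', center=(0.0, 0.0), radius=50.0)
    q1 = modify_incrementally(q0, (12.5, -3.0))   # the originator has moved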
real-time sources and databases [5, 6, 9, 15]. ΣQL allows a user to specify powerful spatial/temporal queries for both multimedia data sources and multimedia databases, thus eliminating the need to write separate queries for each. ΣQL can be seen as a tool for handling spatial/temporal information for sensor-based information fusion, because most sensors generate spatial information in a temporal sequential manner [14]. A powerful visual user interface called the Sentient Map allows the user to formulate spatial/temporal σ-queries using gestures [7, 8].
For an empirical study we collaborated with the Swedish Defense Research Agency, which has collected information from different types of sensors, including laser radar, infrared video (similar to video but generated at 60 frames/sec) and CCD digital camera. When we applied ΣQL to the fusion of the above-described sensor data, we discovered that in the fusion process data from a single sensor yields poor results in object recognition. For instance, the target object may be partially hidden by an occluding object such as a tree, rendering certain types of sensors ineffective.
assumed to provide the complete information needed by the queries. Almost all
previous approaches fall under the category of incremental query modification.
For information fusion we must consider non-incremental query modification, where not only the constraints but also the sources and even the query structure are modified. It is for this reason that we introduce the notion of query morphing.
If the computation results of a node P1 are the required input to another node P2, there is a directed arc from P1 to P2. Usually we are dealing with sensor dependency trees, where the directed arcs originate from the leaf nodes and terminate at the root node. The leaf nodes of the tree are the information sources such as laser radar, infrared camera, CCD camera and so on. They have parameters such as (none, LR, NONE, 0, (1,1), (1,1), sqoi, tqoi, soiall, toiall). Sometimes we represent such leaf nodes by their symbolic names, such as LR, IR, CCD, etc. The intermediate nodes of the tree are the objects to be recognized. For example, suppose the object type is 'truck'. An intermediate node may have parameters (truck, LR, recog315, 10, (0.3, 0.5), (1,1), sqoi, tqoi, soiall, toiall). The root node of the tree is the result of information fusion, for example a node with parameters (truck, ALL, fusion7, 2000, (0,1), (0,1), sqoi, tqoi, soiall, toiall), where the parameter ALL indicates that information is drawn from all the sources. In what follows, some parameters such as the spatial/temporal coordinates sqoi and tqoi of the query originator, the all-inclusive space-of-interest soiall and the all-inclusive time-of-interest toiall will be omitted for the sake of clarity.
(Figure: the initial sensor dependency tree T1, with the three source leaf nodes, such as (none, LR, NONE, 0, (1,1), (1,1)), and the fusion root node (truck, ALL, fusion7, 2000, (0,1), (0,1)).)
This means the information is from the three sources - laser radar, infrared
camera and CCD camera - and the information will be fused for recognizing the
object type 'truck'.
Next, we select some of the nodes to compute. For instance, all the three source
nodes can be selected, meaning information will be gathered from all three
sources. After this computation, the processed nodes are dropped and the
following updated sensor dependency graph T2 is obtained:
(truck, CCD, recog11, 100, (0.6,0.8), (0.1,0.3)), together with the corresponding LR and IR recognition nodes and the fusion root.
We can then select the next node(s) to compute. Since IR has the smallest
estimated computation time, it is selected and recognition algorithm recog144 is
applied. The sensor dependency graph T3 is:

(truck, LR, recog315, 20, (0.3,0.5), (0.4,0.6))
(truck, CCD, recog11, 100, (0.6,0.8), (0.1,0.3))
(truck, ALL, fusion7, 2000, (0,1), (0,1))
In the updated graph, the IR node has been removed. We now select the CCD node because it has a much higher certainty range than LR and, after its processing, select the LR node. The sensor dependency graph T4 is:

(truck, LR, recog315, 20, (0.3,0.5), (0.4,0.6))
(truck, ALL, fusion7, 2000, (0,1), (0,1))
Finally the fusion node is selected. The graph T5 has only a single node:

(truck, ALL, fusion7, 2000, (0,1), (0,1))
After the fusion operation, there are no unprocessed (i.e., unselected) nodes, and
query processing terminates.
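The walkthrough above can be condensed into a small selection loop. The sketch below is illustrative only: the node dictionary, the process helper and the time-first priority are assumptions layered on the paper's node parameters, and a real planner would also weigh the certainty ranges (which is what makes CCD preferable to LR at the next step).

    # Node tuples follow the parameter layout used above:
    # (object, source, algorithm, estimated time, certainty range).
    t2 = {
        'LR':  ('truck', 'LR',  'recog315', 20,  (0.3, 0.5)),
        'IR':  ('truck', 'IR',  'recog144', 10,  (0.4, 0.6)),
        'CCD': ('truck', 'CCD', 'recog11',  100, (0.6, 0.8)),
    }

    def process(graph, priority):
        """Repeatedly select the best unprocessed node, process it, drop it."""
        order = []
        while graph:
            name = max(graph, key=lambda n: priority(graph[n]))
            graph.pop(name)        # processed nodes are removed from the graph
            order.append(name)
        return order

    # One possible priority: cheapest estimated time first (selects IR first,
    # as in the walkthrough above); certainty-based policies plug in here too.
    print(process(dict(t2), priority=lambda node: -node[3]))
    # When the graph is empty, only the fusion root remains to be selected.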
The recognition algorithm recog315 can now be applied to recognize objects of the type 'truck' in this smaller space-of-interest. Finally, the fusion algorithm fusion7 is applied.
The query modification approach is outlined below, where italic words indicate
operations for the second (and subsequent) iteration.
Step 1. Analyze the user query to generate/update the sensor dependency graph
based upon the ontological knowledge base (see Section 6) and the multi-level
view database (see Section 5) that contains up-to-date contextual information in
the object view, local view and global view, respectively.
Step 3. Execute the portion of the o-query that is executable according to the
sensor dependency graph.
As mentioned above, if in the original query we are interested only in finding un-
occluded objects, then the query processor must report failure when only an
occluded object is found. If, however, the query is modified to "find both un-
occluded and occluded objects", then the query processor can still continue.
The multiple views may include the following three views in a resolution
pyramid structure: the global view, the local view and the object view. The
global view describes where the target object is situated in relation to some other
objects, e.g. a road from a map. This will enable the sensor analysis program to
find the location of the target object with greater accuracy and thus make a better
analysis. The local view provides the information such as the target object is
partially hidden. The local view can be described, for example, in terms of
Symbolic Projection [4], or other representations. Finally, there is also a need for
a symbolic object description. The views may include information about the
query originator and can be used later on in other important tasks such as in
situation analysis.
The multi-level views are managed by the view manager, which can be regarded
as an agent, or as middleware, depending upon the system architecture. The
global view is obtained primarily from the geographic information system (GIS).
The local view and object view are more detailed descriptions of local areas and
objects. The results of query processing, and the movements of the query
originator, may both lead to the updating of all three views.
For any single sensor the sensed data usually does not fully describe an object; otherwise there would be no need to utilize other sensors. In the general case the
system should be able to detect that some sensors are not giving the complete
view of the scene and automatically select those sensors that can help the most in
providing more information to describe the whole scene. In order to do so the
system should have a collection of facts and conditions, which constitute the
working knowledge about the real world and the sensors. We propose to store
this knowledge in the ontological knowledge base, whose content includes
object knowledge structure, sensor and sensor data control knowledge.
The ontological knowledge base consists of three parts: the sensor part
describing the sensors, recognition algorithms and so on, the external conditions
part providing a description of external conditions such as weather condition,
light condition and so on, and the sensed objects part describing objects to be
sensed. Given the external condition and the object to be sensed, we can
determine what sensor(s) and recognition algorithm(s) may be applied. For
example, IR and Laser can be used at night (time condition), while CCD cannot
be used. IR probably can be used in foggy weather, but Laser and CCD cannot
be used (weather condition). However, such determination is often uncertain.
Therefore certainty factors should be associated with items in the ontological
knowledge base to deal with the uncertainty.
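A minimal sketch of such a knowledge base follows; the condition names, certainty factors and the usable_sensors helper are hypothetical, with the numbers merely echoing the qualitative statements above (IR and Laser at night, IR probably usable in fog).

    # Hypothetical knowledge base: (condition, sensor) -> certainty factor
    # that the sensor is usable under that external condition.
    KB = {
        ('night', 'IR'): 0.9, ('night', 'Laser'): 0.8, ('night', 'CCD'): 0.0,
        ('fog',   'IR'): 0.6, ('fog',   'Laser'): 0.1, ('fog',   'CCD'): 0.1,
        ('day',   'IR'): 0.7, ('day',   'Laser'): 0.8, ('day',   'CCD'): 0.9,
    }

    def usable_sensors(condition, threshold=0.5):
        """Return the sensors whose certainty factor meets the threshold."""
        return [s for (c, s), cf in KB.items() if c == condition and cf >= threshold]

    print(usable_sensors('night'))   # e.g. ['IR', 'Laser']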
7. Query Optimization
Suppose that we have a sensor dependency graph such as T1 of Section 3. For the recognition algorithm recog315, we have the following certainty range: P(recog315 = yes | X = truck, Y = LR) ∈ (0.3, 0.5), and P(recog315 = no | X ≠ truck, Y = LR) ∈ (0.4, 0.6), where X = truck, Y = LR means that there is a truck in the frame which is obtained by LR. If the input data has certainty range (a, b) and the recognition algorithm has certainty range (c, d), then the output has a certainty range determined by both.
With 0/1 variables δij (δij = 1 if algorithm i runs at the jth order), the scheduling constraints are:

Σ_{i=1}^{N} δij ≤ 1   (for the jth order, at most one algorithm can run)

Σ_{j=1}^{N} δij ≤ 1   (for every algorithm, it can be at most in one order)

max( Σ δij ci ) ≥ θ   (θ is the certainty threshold)

where

C = ( c(ALG1, a priori certainty), …, c(ALGN, a priori certainty) )

…
Given the sensor dependency graph, a dual problem is to recognize the object
'truck' within the processing time limit. Our goal is to maximize the certainty
value for the object truck under the condition that the total processing time is
below the time limit. The problem is as follows:
Maximize max( Σj Σi δij ci )

where δij = 1 if algorithm i runs at the jth order, and δij = 0 if algorithm i does not run at the jth order;

C = ( c(ALG1, a priori certainty), …, c(ALGN, a priori certainty) )

…

subject to

Σ_{j=1}^{N} δij ≤ 1   (for every algorithm, it can be at most in one order)

Σi Σj δij Ti ≤ T   (T is the maximum time that we can bear)
In the above optimization problems we have not considered the space of interest soi when we formalize the problem. If we put it in, the formulation of the problem becomes more complicated:

F = ( T(ALG1, initial soi), …, T(ALGN, initial soi) )

A = ( a(ALG1, initial soi), …, a(ALGN, initial soi) )

…

Minimize Σ_{k=1}^{N} Fk δk   (the total running time)
Note:
a(ALGi, soi) is the output soi after using algorithm i on the
input soi.
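For small N the dual problem can be brute-forced. The sketch below is an assumption-laden simplification: the algorithm times and prior certainties are invented, the order of execution is ignored, and the fused certainty is taken as the maximum over the selected algorithms rather than the paper's combination rule.

    from itertools import combinations

    # Hypothetical algorithms: name -> (estimated time, prior certainty)
    ALGS = {'recog315': (20, 0.4), 'recog144': (10, 0.5), 'recog11': (100, 0.7)}

    def best_plan(time_limit):
        """Maximise certainty subject to total time <= time_limit."""
        best, best_c = None, -1.0
        for r in range(1, len(ALGS) + 1):
            for subset in combinations(ALGS, r):
                t = sum(ALGS[a][0] for a in subset)
                if t > time_limit:
                    continue
                # Assumed fusion rule: max certainty over selected algorithms.
                c = max(ALGS[a][1] for a in subset)
                if c > best_c:
                    best, best_c = subset, c
        return best, best_c

    print(best_plan(50))   # e.g. (('recog144',), 0.5) within a 50-unit budget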
8. An Experimental Prototype
Figure 5. The dependency tree (left) and a selected node (right). We can trace the query processing step by step.
Figure 6. The next step of query processing after the step shown in Figure 5.

Figure 7. The pop-up window showing optimization information.
The main window in Figure 3 illustrates the visual construction of a query. The
user drags and drops objects and enters their attributes, and the constructed
query is shown in the upper right window. The objects in the dependency tree
are shown as an object stream in the middle right window. In Figure 4 the lower
right window shows the query results. When an evolutionary query is being
executed, its dependency tree will change dynamically. Figure 5 displays the
same information as that of the object stream, but in a format more familiar to
end users. It shows the dependency tree on the left side of the screen, and the
selected node with its attributes on the right side of the screen. In the next step,
both the dependency tree and the query may be changed, as illustrated in Figure
6. As shown in Figure 7, the information of optimization can be shown in a pop-
up window.
The ΣQL query shown in the upper right window of Figure 3 is as follows:

SELECT object
CLUSTER * ALIAS OBJ1 OBJ2
FROM
  SELECT t
  CLUSTER *
  FROM video-source
WHERE OBJ1.type = 'car' AND OBJ1.color = 'red' AND
  OBJ2.type = 'truck' AND OBJ1.t < OBJ2.t

Its object stream expands into two sub-queries:

OBJ2: video-source, OBJ2.type = 'truck'
OBJ1: video-source, OBJ1.type = 'car', OBJ1.color = 'red'

After morphing for multi-sensor fusion, the same query draws from the three sources and fuses their results:

SELECT object
CLUSTER * ALIAS OBJ1 OBJ2
FROM
  SELECT t
  CLUSTER *
  FROM LR, CCD, IR
WHERE OBJ1.type = 'car' AND OBJ1.color = 'red' AND
  OBJ2.type = 'truck' AND OBJ1.t < OBJ2.t

with the corresponding fused sub-queries:

FUSION OBJ2: LR, CCD, IR, OBJ2.type = 'truck'
FUSION OBJ1: LR, CCD, IR, OBJ1.type = 'car', OBJ1.color = 'red'
As already explained, Figures 5 and 6 show the dependency tree, whose dynamic changes illustrate the steps of query processing. Figure 9 also shows the dependency tree. However, in this example of query processing for fusion, several nodes are marked as 'cut-off' nodes, meaning their certainty values are already above a threshold and consequently no further processing of these nodes is needed.
Conceptually, query morphing is somewhat like image morphing: the end user
formulates one query called a query point and requests the query processor to
morph one query point into another query point. Within limits, the two query
points are arbitrary, and the query processor is able to figure out automatically
how a query point is morphed into another query point. Sometimes query
morphing is accomplished by modifying the query incrementally. Sometimes
more substantial query modification is necessary. In incremental query
modification, the two query points are more or less similar. In non-incremental
query modification, the two query points are substantially different.
We define a distance measure d(q1, qn) between two query points q1 and qn based upon the number and type of transformation steps needed to transform q1 into qn. Depending upon the type of transformation, different weights are assigned to the transformation steps. An infinite weight is assigned to a forbidden type of transformation. Let f1, f2, …, f(n-1) be the n-1 transformations such that f1(q1) = q2, f2(q2) = q3, …, f(n-1)(q(n-1)) = qn. The distance between q1 and qn is defined as:

d(q1, qn) = Σ_{j=1}^{n-1} wj

where wj is the weight assigned to transformation step fj.
Each transformation step is assigned a certain weight. For example, the add transformation step has the weight 1. An incremental morphing pair of queries (q1, qn) is one whose distance d(q1, qn) is below a threshold τ. If q1 and qn form an incremental morphing pair, morphing from one into the other by incremental query modification is possible. If q1 and qn do not form an incremental morphing pair, morphing from one into the other by incremental query modification is impossible.
A non-incremental morphing pair of queries (q1, qn) is one whose distance d(q1, qn) is finite but above the threshold τ. If q1 and qn form a non-incremental morphing pair, morphing from one into the other by non-incremental query modification is possible. A non-incremental transformation is one that completely rewrites the query. A morphing pair is either an incremental or a non-incremental morphing pair. If q1 and qn do not form a morphing pair, morphing from one into the other by query modification is impossible.
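A direct transcription of this distance measure follows; the transformation-type names, their weights and the threshold value are hypothetical choices for illustration.

    import math

    # Hypothetical weights per transformation type; a forbidden type gets an
    # infinite weight, as stated above.
    WEIGHTS = {'add_source': 1, 'drop_source': 1, 'relax_constraint': 2,
               'forbidden_op': math.inf}

    TAU = 3   # threshold separating incremental from non-incremental pairs

    def morphing_distance(steps):
        """d(q1, qn) = sum of the weights wj over the steps f1 .. f(n-1)."""
        return sum(WEIGHTS[s] for s in steps)

    def classify_pair(steps):
        d = morphing_distance(steps)
        if d <= TAU:
            return 'incremental morphing pair'
        if math.isfinite(d):
            return 'non-incremental morphing pair'
        return 'not a morphing pair'

    print(classify_pair(['add_source', 'add_source']))               # incremental
    print(classify_pair(['relax_constraint', 'relax_constraint',
                         'add_source']))                             # non-incremental
    print(classify_pair(['forbidden_op']))                           # not a morphing pair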
Original query 1:
Select object
From textbook
Where object.topic = “binary tree”
The query processor finds related class notes, reference books and videotaped
materials, and consequently the sources in the query are updated by the query
processor. This is a typical example of incremental query morphing.
Morphed query 2:
Select object
From textbook, classnotes, reference-book, videotaped-materials
Where object.topic = “binary tree”
Morphed query 3:
Select object
From case-studies
Where object.topic = “binary tree”
or
Select object
From life-experiences
Where object.topic = “binary tree”
Morphed query 4:
Select object
From classnotes of student1
Where object.topic = “binary tree”
or
Select object
From life-experiences of student2
Where object.topic = “binary tree”
It is important to note that both the morphed query and the retrieval results
contain information valuable to the user/learner/student. In other words, the
questions are just as important as the answers. To this end, adlets [8] are used to generate morphed queries to gather information. Adlets travel from node to node to acquire more information. The query is morphed as the adlets travel
along a chosen path.
The end user can posit a sequence of events and query points to define a query
path for morphing. In some cases the initial query is too restrictive and the end
user may wish to enhance the significance of a certain type of object. If the end user is able to visualize the type of objects that meet the information needs, the end user can add clauses and/or constraints involving that type of object to the evolutionary query. In other words, the end user can repeatedly adjust the query path for morphing to focus on certain types of objects. In that regard, we note the
importance of visualization in query morphing: we need to visualize both the
query and the retrieval results.
Corresponding to adjusting the query path, the morphing algorithm revises its
strategy to modify the query. For example, in case of cloudy conditions a source
such as the CCD sensor should be replaced by another source such as the IR
sensor. A roaming query path can be defined, which is materialized into a query
path based upon the contents of the ontological knowledge base. By specifying
the appropriate adlet propagation rule, adlet generation rule and adlet
modification rule, the interactive morphing algorithm can be designed.
10. Discussion
Under the guise of routing for emergency rescue in catastrophic events, a test bed could be implemented that follows the stages below. First, a query is
issued that activates the appropriate sensors to collect information about the
environment. A query processor then collects data from the sensors and fuses
the information into a coherent statement about the environment. The relevant
information is passed to a display that helps the viewer visualize the results. An
interaction loop between the viewer and the display allows the viewer to provide
feedback and modify the query.
At the broadest level, the test bed is fairly simple, consisting of three main
interface components: a query mechanism, a visualization display, and a
feedback mechanism. This general model offers the broadest possible solution
and probably describes many visual information systems for fusion. Additional
requirements may include:
3) The evaluation of the system by the viewer needs to permit the viewer to
provide feedback on the accuracy of the sensors (a), as well as the accuracy
of guidance provided by the visualization (b).
Software for the Query module can rely on ΣQL, the query refinement fusion algorithms described in [10] and the query morphing approach discussed above. A
framework for evolutionary query, visualization and evaluation of dynamic
environments is formulated. To close the loop in the system requires that the
viewer be able to provide feedback, evaluating both the query results as well as
the visualization itself. The query mechanism should support two major types of
feedback: sensor accuracy and expressiveness of the query. Results from
evolutionary query optimization using limited query morphing, and interactive
approaches using roaming query paths can then be compared and evaluated.
Initially we will invite graduate and undergraduate students to participate in the
evaluation study. When the algorithms are well developed and the system more
mature, we plan to evaluate the applicability of query morphing techniques to
emergency management.
References
C. GRANA†, G. PELLACANI‡, S. SEIDENARI‡, R. CUCCHIARA†
† Dipartimento di Ingegneria dell'Informazione
‡ Dipartimento di Dermatologia
Università di Modena e Reggio Emilia, Italy
1. Introduction
A fruitful representation of the image content, often exploited in many tasks of understanding, recognition, and information retrieval by similarity, is based on region segmentation; a richer description adds to the region attributes some spatial and topological relationships between regions, which describe the way we perceive the mutual relations between parts of the image. To this aim, graph-based description is a powerful formalism to model the knowledge extracted from the images: the regions of interest and their relationships.
Moreover, the management of large volumes of digital images has gen-
erated additional interest in methods and tools for real time archiving
2. Topological Relations
Given an image space I and an 8-connection neighborhood system, which for each point x_i defines the neighbor set N_{x_i}, segmentation by color clustering aims to partition the image into a set of regions R = {R_1, …, R_k} such that ⋃_i R_i = I and R_i ∩ R_j = ∅ for i ≠ j. To this aim, a clustering process that groups pixels w.r.t. their color should embed or be followed by a pixel-connectivity analysis, according to the given neighborhood system.
Then a graph-based representation describes spatial and/or topological
(1) it carries out a color-based segmentation into two clusters, using the PCA and FCM algorithms;
(2) while segmenting, it builds the corresponding tree;
(3) it recursively applies the segmentation to the regions of interest created by the previous steps of the algorithm.
The recursive procedure RecursiveFCM(Ra, Pa) implements these steps and is invoked again for every region Ra of interest, together with its set Pa of suspended regions.
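The sketch below gives one possible reading of this recursion in Python; it is not the authors' code. KMeans with two clusters stands in for the PCA+FCM step, scikit-learn and SciPy are assumed to be available, and the tree is a plain dictionary with a 'children' list.

    import numpy as np
    from scipy import ndimage
    from sklearn.cluster import KMeans

    EIGHT = np.ones((3, 3))   # 8-connection neighborhood system

    def recursive_segment(pixels, mask, node, min_size=64):
        """Split the masked region into two colour clusters and recurse.
        pixels is an HxWx3 array; node is a dict tree with a 'children' list.
        KMeans stands in for the PCA+FCM clustering of the paper."""
        data = pixels[mask]
        if data.shape[0] < 2 * min_size:
            return
        labels = KMeans(n_clusters=2, n_init=5).fit_predict(data)
        label_img = np.full(mask.shape, -1)
        label_img[mask] = labels
        for k in (0, 1):
            comp, n = ndimage.label(label_img == k, structure=EIGHT)
            for c in range(1, n + 1):          # pixel-connectivity analysis
                region = comp == c
                if region.sum() >= min_size:   # low-interest regions are dropped
                    child = {'mask': region, 'children': []}
                    node['children'].append(child)
                    recursive_segment(pixels, region, child, min_size)

    # root = {'mask': np.ones(img.shape[:2], bool), 'children': []}
    # recursive_segment(img.astype(float), root['mask'], root)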
Insignificant areas are erased and the presence of an external region is searched for. The remaining regions are organized in a structure that allows for a correct recursion step. In particular, in the ideal case (which generates a TT with a single child for each node), the FCM algorithm creates two clusters and should create two regions, one including the other. In real images, often many regions are created. If one of these can be chosen as “external” it becomes a new node, parent of the others; all the other regions are further inspected in the recursion. However, some of these regions could present mutual inclusions and thus not allow a correct tree generation. We call these regions suspended, since they need specific management, indicating with this term the set of all the regions that cannot be immediately analyzed, but must wait for the including one.
We thus consider the set R after the elimination of low interest regions
and the possible external one. From R, we distinguish between regions Rk
not included in others and sets Pk of regions included in Rk . Now, for each
region the algorithm is recursively called along with its set of suspended
regions.
RNI = ∅
P = ∅
∀ Ra ∈ R
{
  if (∃ Rk ∈ RNI : Ra is included in Rk)
    Pk = Pk ∪ {Ra}
  else
    RNI = RNI ∪ {Ra}
}
The process for finding all suspended regions is described in Fig. 3.2. Note that a reduction of the search space is obtained by ignoring all regions of P: the external region should contain not only all regions of R but also all the suspended regions, but by the transitive property of inclusion this is guaranteed by the fact that they are included in regions of R.
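The partition into RNI and the suspended sets Pk can be transcribed as follows. The includes(a, b) predicate is a placeholder for the actual inclusion test, and the single pass assumes regions are examined in decreasing area order so that an including region is seen before the regions it contains.

    def partition_regions(regions, includes):
        """Split regions into roots not included in others (RNI) and, for each
        root k, the set P[k] of suspended regions it includes.
        includes(a, b) is a hypothetical predicate: region a lies inside b."""
        rni, suspended = [], {}
        for ra in regions:                    # assumed sorted by decreasing area
            k = next((i for i, rk in enumerate(rni) if includes(ra, rk)), None)
            if k is not None:
                suspended.setdefault(k, []).append(ra)   # Pk = Pk U {Ra}
            else:
                rni.append(ra)                           # Ra joins RNI
        return rni, suspended

    # Example with intervals standing in for regions:
    inside = lambda a, b: b[0] <= a[0] and a[1] <= b[1]
    print(partition_regions([(0, 10), (1, 4), (5, 9), (20, 30)], inside))
    # -> ([(0, 10), (20, 30)], {0: [(1, 4), (5, 9)]})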
4. Tree matching
The construction of the TT is the basis of a retrieval approach searching for tree similarities. An interesting non-exact tree matching uses the edit distance to compare two trees. It measures the cost of operations, such as adding or eliminating nodes, needed to transform one tree into another. Unfortunately the
(1) we can match only nodes on the same level of the tree;
(2) given two sets of nodes, taken from two trees, we match one against
the other without solving the associated linear assignment prob-
lem, but considering a sorting of the two sets and letting greater
importance nodes have first choice on the other set.
(1) The roots are compared in a Euclidean feature space by the distance d of the feature vectors, which can comprise color, area, symmetry, texture and whichever other information of each region. An equivalence measure is obtained as

E = 1 / (1 + d)
(2) Children equivalence is evaluated:
(a) Let us call T1 the tree with more nodes and T2 the other one;
(b) The nodes of TI are considered in order of importance (eval-
uated on the feature vector);
E_tot = E_root · ( (Σ_i E_i^s · I_i) / (Σ_i I_i) + 1 ) / 2

where E_i^s is the signed equivalence of each node (with a matching node or with the null vector), I_i is the importance of the node and E_root is the equivalence of the roots, as previously defined.
The equivalence measure E is bounded between 0 and 1, and this guarantees that E_i^s is in the range [-1, 1], so the weighted sum can give -1 in case of total mismatch of the tree structure or 1 in case of perfect match. This value is converted by the equation to the interval [0,1] and used as a reduction factor for the matching value of the roots. The interval shift has the implicit property of reducing the influence of a mismatch at lower levels of the tree.
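A small transcription of this matching score, under the formula as reconstructed above, may look as follows; the feature distance, importances and signed equivalences in the example are invented.

    def equivalence(d):
        # Root equivalence from the feature-space distance d.
        return 1.0 / (1.0 + d)

    def total_equivalence(e_root, children):
        """children: list of (signed equivalence E_i^s in [-1,1], importance I_i).
        A minimal transcription of the weighted formula above."""
        num = sum(e * i for e, i in children)
        den = sum(i for _, i in children)
        shifted = (num / den + 1.0) / 2.0 if den else 1.0   # map [-1,1] -> [0,1]
        return e_root * shifted

    e_root = equivalence(d=0.25)            # roots are compared first
    print(total_equivalence(e_root, [(0.8, 3.0), (-0.2, 1.0)]))   # -> 0.62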
In this way, given an image represented by its TT, we are able to find in an image database other images with a similar TT on the basis of the previous algorithm. Obviously the TT representation cannot be the unique approach to support a query-by-example system, and in melanoma diagnosis a number of other features on the whole lesion or its parts should be considered in an integrated way. Nevertheless, this is a powerful representation method that, integrated with proven dermatological criteria, can give interesting results in retrieval.
5. Experimental Results
Experiments have been conducted on synthetic and on real images, first to test the correct response of the adopted algorithm and then to verify its applicability to real-world images.
perfect match. All other images present not so significant variations from the original image. The colors were ignored in this evaluation and the feature vector distance is computed using only the distance of the center of mass from the parent one and the percentage of parent area occupied by the region. Thus S3 and S4 are very similar to S1, while S2 and S5 do not have a node of S1. The results follow a correct evaluation, giving the ability to order the images by their similarity from the first one.
6. Conclusions
References
E.G.M. Petrakis et al., Image Indexing Based on Spatial Similarity, Technical Report MUSIC-TR-01-99, Multimedia Systems Institute of Crete (MUSIC), 1999.
E.G.M. Petrakis et al., Similarity Searching in Medical Image Databases, IEEE Trans. Knowl. Data Eng. 9, 435-447 (1997).
A.W.M. Smeulders et al., Content-Based Image Retrieval at the End of the Early Years, IEEE Trans. Pattern Anal. Mach. Intell. 22, 1349-1380 (2000).
M. Flickner et al., Query By Image and Video Content: The QBIC System,
Abstract. The paper describes an integrated environment for the control and management of a pictorial information system. We consider the diagnostic radiology field as a case study. A system for filing and processing medical images is a particular pictorial information system that must manage information of a heterogeneous nature. In this perspective, the developed environment provides the medical user with tools to manage textual data and images in an integrated way. A Visual Data Definition Language was designed and implemented that allows the administrator of the system to extend the actual database on the basis of new user queries: the insertion of new entities and the creation of new relationships between them take place simply by manipulating the iconic representation related to the information being managed. A Visual Query Language provides a visual environment in which a user can query the database using iconic operators related to the management of alphanumeric and pictorial information, with the ability to formulate composite queries: from an alphanumeric query, pictorial data contained in the database can be drawn, and vice versa.
1 Introduction
Figure 1. A Pictorial Information System (image transmission over a communication net, and storage).
In the field of diagnostic radiology the typical information to consider can be divided fundamentally into two categories: data related to the management of clinical records, which give the medical consumer real support for developing treatment protocols for the examined patients, and data related to the elaboration, filing and retrieval of medical images. The analysis of a medical image, e.g. the report of a radiology examination, involves the execution of complex operations such as the detection of a possible anomaly, the determination of its exact spatial position with respect to other objects (elements of the human body) contained in the image, and the calculation of its geo-morphological characteristics such as area, density, symmetry, etc.
In this work, we apply an icon-based methodology to devise an integrated environ-
ment providing a set of tools for visual pictorial database definition and manipulation.
Section 2 describes the integrated environment in which the Visual Data Definition
Language and the Visual Query Language are defined. Section 3 presents the Medical
Image Management System and similarity retrieval techniques. Section 4 describes
the management of query results. Finally, section 5 presents future extensions of this
research.
A system for the acquisition and elaboration of medical images is a pictorial information system that must manage textual data (related to the management of medical records) and medical images. In the field of medical information systems, the elaboration of a radiological image consists of abnormality detection, determination of geo-morphological characteristics, and evaluation of the spatial relationships between the pathology and the anatomical organs in which it is located. On the basis of this information, the retrieval of similar images by means of appropriate retrieval techniques facilitates the formulation of a diagnosis and treatment plan for the examined patients. In this perspective, a Visual Data Definition Language was introduced and implemented; it allows the administrator of the system to extend the actual database on the basis of new user queries: the insertion of new entities and the creation of new relationships between them take place simply by manipulating the iconic representation related to the managed information (Figure 2).
The realized Visual Query Language provides a visual environment in which the user interrogates the database by means of iconic operators related to the management of alphanumeric and pictorial information, with the ability to formulate composite queries: from an alphanumeric query, pictorial data contained in the database can be drawn, and vice versa (Figure 3).
It is possible to store the results of a query in the database, in terms of the correspond-
ing iconic representation, with an efficient reuse of previously formulated queries.
The developed VDBMS is implemented in Java, using JDBC to connect to the under-
lying RDBMS. In particular, the environment was developed as a web client application that guarantees portability on any platform and access to any (possibly remote) RDBMS by means of a platform-independent, user-friendly interface.
In the environment, a virtual image is linked to each examined image. The virtual image is built using the canonical representative objects of human body components, which describe in a compact manner the content of the real image and the spatial relationships between the contained objects and the identified anomaly (Figure 5).
The concept of similarity between two images is expressed in terms of the Euclidean distance, in the space of the characteristics, between the points that represent them. The geo-morphological characteristics we used for the search strategy are: area, density, asymmetry, orientation with respect to the centroid, spreadness and uniformity (Figure 6).
Given a real image im, the virtual image im_i associated with im is a pair (Ob, Rel) where:
Ob = {ob_1, ob_2, …, ob_n} is a set of objects of im;
Rel = (Rel_x, Rel_y) is a pair of sets of binary spatial relations over Ob, where Rel_x (resp. Rel_y) contains the mutually disjoint subsets of Ob × Ob that express relations holding between pairs of objects of im along the x-projection (resp. y-projection) [15].
Let Q be the virtual image associated with an image used as a query for similarity retrieval and im_i the virtual image associated with one of the images examined for possible retrieval. In the case of similarity retrieval for spatial relationships, the similarity degree, denoted by Sim_deg(Q, im_i), is a value belonging to the interval [0,1] that is defined by a formula that considers how many objects of Q are contained in im_i and how many spatial relationships similar to those of Q are found in im_i [16].
Therefore, if Sim_deg(Q, im_i) is greater than or equal to the minimum similarity degree specified in the query, the image will be retrieved. In the visual management environment realized in this paper, we consider the relational algebra operators to interrogate the database with regard to the alphanumeric information contained, and the operators Similarity Retrieval and Similarity Retrieval By Virtual Image to allow the retrieval of medical images similar to the query image (the one used to perform the search) and relative to clinical cases previously examined (Figure 7).
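As an illustration only (the actual formula is given in [16]), a similarity degree of this kind can be sketched as a weighted combination of object overlap and spatial-relation overlap; the object names, relations and the equal weighting are hypothetical.

    def sim_deg(q_objects, q_relations, im_objects, im_relations):
        """Hypothetical similarity degree in [0,1]: fraction of query objects
        found in the examined virtual image, combined with the fraction of
        query spatial relations that also hold there (equal weights assumed)."""
        if not q_objects:
            return 0.0
        obj_score = len(q_objects & im_objects) / len(q_objects)
        rel_score = (len(q_relations & im_relations) / len(q_relations)
                     if q_relations else 1.0)
        return 0.5 * obj_score + 0.5 * rel_score

    q_ob = {'lung_left', 'lung_right', 'nodule'}
    q_rel = {('nodule', 'inside', 'lung_left')}
    print(sim_deg(q_ob, q_rel,
                  {'lung_left', 'lung_right', 'nodule', 'heart'}, q_rel))  # -> 1.0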
The first allows the user to choose the query image according to the practice number, the year of filing and the form number of the clinical diary to which the image refers; the second allows the retrieval to start from the virtual image associated with the query image and inserted in the list of the Analyzed Abnormalities (Figure 8).
If a query is saved as Query formulation, each time it is launched the query will be performed on the current data of the examined table; otherwise, if it is saved as Query result, the data present at the date of query storage will be saved.
The icons related to queries saved as Query formulation are inserted in the Stored Researches. The icons related to queries saved as Query result are inserted in the Old Researches. By clicking on either icon the results of the associated operation will be visualized.
Future developments will concern the study of techniques for the analysis of medical images of different types (mammography, etc.), the improvement of multi-user management and remote access to different RDBMSs.
6 References
[1] S.K. Chang, “Principles of Pictorial Information System Design”, Prentice Hall, 1989.
[2] S. G. Carlton and R. Mitchell, “Image segmentation using texture and gray level”, Proc. IEEE Conf. Pattern Recognition and Image Processing, Troy, New York, pp. 387-391, 6-8 June 1977.
[3] G. B. Coleman, “Image segmentation by clustering”, Report 750, University of Southern California Image Processing Institute, July 1977.
[4] S. Vitulano, C. Di Ruberto, M. Nappi, “Different methods to segment biomedical images”, Pattern Recognition Letters, vol. 18, 1997.
[5] A. Klinger and C. R. Dyer, “Experiments on picture representation using regular decomposition”, Computer Graphics and Image Processing 4, 360-372, 1976.
[6] Alan Bryant, “Recognizing Shapes in Planar Binary Images”, Pattern Recognition, vol. 22, pp. 155-164, 1989.
[7] F. Gritzali and G. Papakonstantinou, “A Fast Piecewise Linear Approximation Algorithm”, Signal Processing, vol. 5, pp. 221-227, 1983.
[8] James George Dunham, “Optimum Uniform Piecewise Linear Approximation of Planar Curves”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-8, no. 1, 1986.
[9] G. Papakonstantinou, “Optimal Polygonal Approximation of Digital Curves”, Signal Processing, vol. 8, pp. 131-135, 1985.
[10] T. H. Cormen, C. E. Leiserson, R. L. Rivest, “Introduzione agli algoritmi”, ch. 35, pp. 835-864, 1996.
[11] Jia-Guu Leu, “Computing a Shape’s Moments from its Boundary”, Pattern Recognition, vol. 24, no. 10, pp. 949-957, 1991.
[12] Mark H. Singer, “A General Approach to Moment Calculation for Polygons and Line Segments”, Pattern Recognition, vol. 26, no. 7, pp. 1019-1028, 1993.
[13] Bing-Cheng Li and Jun Shen, “Fast Computation of Moment Invariants”, Pattern Recognition, vol. 24, no. 8, pp. 807-813, 1991.
[14] Jin-Jang Leou and Wen-Hsiang Tsai, “Automatic Rotational Symmetry Determination for Shape Analysis”, Pattern Recognition, vol. 20, no. 6, pp. 571-582, 1987.
[15] M. Sebillo, G. Tortora, M. Tucci and G. Petraglia, “Virtual Images for Similarity Retrieval in Image Databases”, IEEE Trans. on Knowledge and Data Engineering, vol. 13, no. 6, Nov.-Dec. 2001, pp. 951-967.
[16] A. F. Abate, M. Nappi, G. Tortora and M. Tucci, “IME: an image management environment with content-based access”, Image and Vision Computing, vol. 17, no. 13, pp. 967-980, 1999.
V. DI GESÙ, D. TEGOLO
Università di Palermo
Dipartimento di Matematica ed Applicazioni
via Archirafi 34, 90123 Palermo, Italy
{digesu,tegolo}@math.unipa.it
F. ISGRÒ, E. TRUCCO
Heriot-Watt University
School of Engineering & Physical Science
Edinburgh EH14 4AS, U.K.
{f.isgro,e.trucco}@hw.ac.uk
This paper introduces a simple and efficient methodology to detect starfish in video sequences from underwater missions. The input images are characterised by a low signal-to-noise ratio and by a noisy background of pebbles; this makes the detection a non-trivial task. The procedure we use is a chain of several steps that starts from the extraction of the areas of interest and ends with the classification of the starfish. Experiments report a success rate of 96% in the detection.
1. Introduction
Underwater images have recently been used for a variety of inspection tasks, in particular for military purposes such as mine detection, for the inspection of underwater pipelines, cables or platforms, or for the detection of man-made objects 7.
A number of underwater missions are for biological studies, such as the inspection of underwater life. Despite the large number of such missions, and although image analysis techniques are starting to be adopted in the fish farming field, the inspection of the video footage recorded during a mission is still mostly done manually, as research applying image analysis techniques to biological missions is relatively new 4,10.
In this paper we present a simple system for the analysis of underwater video
stream for biological studies. In particular our task is the detection of starfish in
each frame of the video sequence. The system presented here is the first stage of
a more complex system for determining the amount of starfish in a particular area
of the sea-bottom.
The problem we tackle in this work is non-trivial for a number of reasons, in particular the low quality of underwater images, giving a very low signal-to-noise ratio, and the different kinds of possible backgrounds, as starfish can be found on various classes of sea-bottoms (e.g., sand, rock).
The system we present here is a chain of several modules (see Figure 1) that starts from the extraction of areas of interest in the image and has as its last module a classifier to discriminate the selected areas between the two classes of starfish and non-starfish. Experiments performed on a sample of 1090 candidates report an average detection success rate of 96%.
The paper is structured as follows. The next section gives an overview of the
system. The method adopted for selecting areas of interest is described in section
3. In section 4 we describe the features that we extract from the areas of interest for
the classification, and section 5 briefly discusses the classification methodology
used for this system. Experimental results are reported and discussed in section 6,
and section 7 is left to final remarks and future developments.
2. System overview
The system, depicted in Figure 1, works as a pipeline of the following four differ-
ent modules:
(1) Data acquisition: each single frame of the underwater video sequence (live video or recorded off-line) is read by the system for processing;
(2) Extraction of areas of interest: candidate starfish are extracted from the current frame (section 3);
(3) Computation of shape indicators (features): for each candidate a set of features is computed. The features chosen are a set of shape descriptors (section 4);
(4) Classification: this module discriminates the candidate starfish between starfish and non-starfish, using the features extracted by the previous module (section 5).
(Figure 1 shows the pipeline: extraction of connected regions; geometrical, morphological and histogram indicators; classification.)

3. Extraction of areas of interest
The method adopted is very simple. We first binarise the image using a simple adaptive threshold 2, which computes local statistics for each pixel (mean value μ and standard deviation σ) in a window of size 7 × 7. From the binary image all the connected components are extracted and the small ones are filtered out using the simple X84 rejection rule 3, an efficient outlier rejection method for robust estimation.
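A possible transcription of this extraction step in Python/SciPy is sketched below. The 7 × 7 window follows the text; the threshold polarity (candidates brighter than the local mean), the k factor and the 5.2·MAD form of the X84 rule are assumptions.

    import numpy as np
    from scipy import ndimage

    def detect_candidates(gray, win=7, k=1.0):
        """Binarise with a local (mean, std) threshold, label connected
        components and drop outlier sizes with the X84 rule."""
        g = gray.astype(float)
        mu = ndimage.uniform_filter(g, win)
        sigma = np.sqrt(np.maximum(ndimage.uniform_filter(g ** 2, win) - mu ** 2, 0))
        binary = g > mu + k * sigma           # polarity is an assumption
        labels, n = ndimage.label(binary)
        sizes = ndimage.sum(binary, labels, range(1, n + 1))
        med = np.median(sizes)
        mad = np.median(np.abs(sizes - med))
        # X84: reject components whose size is farther than 5.2 MADs from the
        # median (this also discards the small components mentioned above).
        keep = [i + 1 for i, s in enumerate(sizes) if abs(s - med) <= 5.2 * mad]
        return labels, keep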
4. Features extraction
The definition of suitable shape indicators is essential for the classification phase.
In our case the shape indicators have been suggested by the morphological struc-
ture of the starfish. We identified three indicators that are combined into a feature vector to discriminate the extracted connected components between starfish and noise.
ρ = a_cc / a_ch

where a_cc is the area of the connected component, and a_ch represents the area of the convex hull. Small values of ρ will mostly represent starfish.
θ = a_oc / a_cc

where a_oc is the area of the result obtained by applying the opening to the connected component. Starfish are likely to return small values for the θ indicator.
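The two indicators can be computed from a binary mask as sketched below; the structuring-element size for the opening is a guess, and SciPy is assumed (for a 2D ConvexHull the .volume attribute is its area).

    import numpy as np
    from scipy import ndimage
    from scipy.spatial import ConvexHull

    def shape_indicators(mask, opening_size=9):
        """mask: 2D boolean array for one connected component.
        Returns (rho, theta) as defined above."""
        a_cc = float(mask.sum())
        ys, xs = np.nonzero(mask)
        pts = np.column_stack([xs, ys]).astype(float)
        a_ch = ConvexHull(pts).volume          # area of the convex hull
        opened = ndimage.binary_opening(
            mask, structure=np.ones((opening_size, opening_size)))
        a_oc = float(opened.sum())
        rho = a_cc / a_ch if a_ch else 0.0     # small for star-shaped blobs
        theta = a_oc / a_cc if a_cc else 0.0   # opening erases the thin arms
        return rho, theta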
5. The classifier
For the classification module we adopted a simple Bayesian classifier. Let C1 and C2 represent the starfish class and the non-starfish class respectively, and let x be a vector in the feature space. What we want is to compute the a posteriori probabilities P(Ci|x) of a vector x belonging to the class Ci, and assign the vector x to the class having the largest P(Ci|x).
Bayes' formula states that

P(Ci|x) = P(x|Ci) P(Ci) / P(x)

Assuming a Gaussian model for the class-conditional probabilities P(x|Ci) of the two classes of vectors in the feature space (ρ, θ, …), a uniform distribution for P(x) (i.e., P(x) = 1), and assuming that P(C1) = P(C2), we get that the posterior is proportional to the Gaussian likelihood.
Figure 3. Examples of the components extracted from the video sequences. The first row shows examples of starfish. The second row shows a selection of elements from the non-starfish class.
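Under these assumptions the classifier reduces to picking the class with the larger Gaussian likelihood. The sketch below fits one Gaussian per class and classifies a feature vector; the synthetic training data and the two-dimensional feature space are illustrative only.

    import numpy as np

    def fit_gaussian(X):
        """Mean and covariance of one class's training vectors (rows of X)."""
        return X.mean(axis=0), np.cov(X, rowvar=False)

    def log_likelihood(x, mu, cov):
        d = x - mu
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (d @ inv @ d + logdet + len(x) * np.log(2 * np.pi))

    def classify(x, params):
        # Equal priors and uniform P(x): pick the class with the largest likelihood.
        return max(params, key=lambda c: log_likelihood(x, *params[c]))

    rng = np.random.default_rng(0)   # synthetic stand-in for the training set
    params = {'starfish': fit_gaussian(rng.normal(0.3, 0.05, (197, 2))),
              'non-starfish': fit_gaussian(rng.normal(0.7, 0.1, (197, 2)))}
    print(classify(np.array([0.32, 0.28]), params))   # -> 'starfish'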
6. Experimental results
We tested our system on different video sequences obtained as different chunks
of a long video from an underwater mission. We classified manually a number of
connected components from three different video sequences.
A set of 394 components (197 starfish and 197 non-starfish) from the first
video sequence, were used as training set in order to estimate the two Gaussian
distributions. The two clusters of points in the feature space relative to the training
set are shown in Figure 4.
A second set of 348 components, divided in 174 starfish and 174 non-starfish,
and a third set of 742 components, divided in 371 starfish and 371 non-starfish,
have been used as test sets. The two sets were extracted from the second and third
video sequence respectively. The results are reported in Table 1. In general we can observe that the success rate in classifying elements from the starfish class is high (in the order of 98%), which is a very good result for such a simple classifier. The error in classifying elements from the non-starfish class is higher (in the order of 7%). This is due to the fact that we included among the non-starfish some components that are small parts of a starfish (such as tentacles), and these have morphological properties similar to the starfish. A way to overcome this problem is to identify a feature discriminating between starfish and this sub-class and adopt a multistep classifier, or add this feature to the feature space if different from the three adopted.
Table 1. Results of the experiments on the two test sets. %E = error percentage, #E = number of errors, MCS = mis-classified starfish, MCNS = mis-classified non-starfish.

Test set    #Components     Total %E  #E   MCS %E  #E   MCNS %E  #E
Test1024b   348 (2 x 174)   3.7       14   1.72    3    6.3      11
Test1550b   742 (2 x 371)   4.8       36   2.1     8    7.5      28
7. Conclusions
This paper presented a system for the detection of starfish in underwater video sequences. The system is composed of a chain of modules which ends with a Bayesian classifier that discriminates whether an area of interest extracted from the input image represents a starfish or not. Experiments performed on a number of images (more than 1000) show that our system has a classification success rate of 96%.
The system can be developed and improved in a number of ways. Most of them regard the classification module. First, the classification module could implement modern and sophisticated learning techniques (e.g., support vector machines). We might also associate with each classification a confidence level (for instance, a candidate is classified as a starfish with 90% confidence). Moreover, we might extend the classification to more classes, discriminating among different species of starfish. We would need more than the three features described in section 4, and it might be useful to use more than one classifier.
So far the system works on single frames. An interesting and useful extension
is to count the amount of starfish in a video sequence. To this purpose we need
to remember the starfish seen and counted in previous frames. Therefore a track-
ing module (which tracks starfish in consecutive frames) must be introduced, and
Figure 4. Plot of the distribution of the training set in the feature space. The dark points represent
elements in the non-starfish class, the grey crosses elements in the starfish class.
several candidate algorithms have been identified. Starfish counting also requires identifying and handling occlusions between starfish.
Acknowledgements
We thank Dr. Ballaro for useful discussions. This work has been partially supported by the following projects: the EIERO project under grant number EU-Contract HPRI-CT-2001-00173; the international project for university scientific cooperation CORI May 2001-EF2001; and COST Action 283. The test data were provided by Dr. Anthony Grehan (Martin Ryan Marine Science Institute, University College, Galway, Ireland).
References
R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley, 2001.
R. C. Gonzales and R. E. Woods. Digital Image Processing. Addison Wesley, 1993.
F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel. Robust Statistics: the Approach Based on Influence Functions. John Wiley & Sons, 1986.
D. M. Kocak, N. da Vitoria Lobo, and E. A. Widder. Computer vision techniques for quantifying, tracking, and identifying bioluminescent plankton. IEEE Journal of Oceanic Engineering, 24(1):81-95, 1999.
C. SCINTU
Dipartimento di Ingegneria del Territorio, Facoltà di Ingegneria
Piazza d'Armi, 09123 Cagliari, Italy
E-mail: cescintu@unica.it
In this paper we propose a comparison among algorithms (HER and HEAT, which appeared in the literature in the last three years) and classical elaboration and transformation methods (such as DFT, wavelet and Euclidean distance), when applied to information retrieval on several multimedia databases chosen under specific experimental criteria.
The first database is a collection of Brodatz textures, on which we applied some linear and non-linear transformations; this choice was due to the wide popularity of the above-mentioned textures in the scientific community and to the easy visual interpretation of the results obtained under the applied transformations. The second database contains several mammograms, characterized by both benignant and malignant lesions, while the last database is an aerophotogrammetric image of Cagliari's district area. The choice of the last two databases was due to the high degree of difficulty of their image content.
1. Introduction
The problem of image classification and retrieval by content, based only on the
actual content of the pictorial scene, is a hard one. As it turns out, human beings
are extremely good at recognizing shapes and textures independently of their
position and orientation, but we are much less successful at programming a machine
to achieve the same task; finding an automated technique to solve the pattern
recognition problem by computer is a daunting task, and no general solution is
yet available, even if scientists have carried through solutions for specific
problems in restricted areas.
In the scientific literature the proposed techniques fall almost invariably into
the category of feature extraction methods, whose key idea is to analyse the
pictorial scene in order to obtain n numerical features. In this way an image is
mapped from the Image (or pixel) Space into a single point in an n-dimensional
Feature Space, where traditional (and exact) spatial access methods may be used
to retrieve points (i.e. images) that are close to a query image. This type of user
s_i = [E_i − σ_i, E_i + σ_i]

where E_i is the energy value for the considered absolute maximum and σ_i is its
standard deviation.
We transform a signal f(t) from the Time Space into the Entropy Space by
adopting the following criterion: we consider the entropy values associated
with the absolute maxima, following their hierarchical extraction order, and then
place these entropy values in the Distance/Entropy Space, with the
first maximum at the position x = 0.
HER is characterized by some interesting properties: it is invariant with
respect to translations of the signal (i.e. to the amplitude, to time shifts and
to the initial phase shift).
HEAT introduces a linear transformation that maps a 2-D image into a 1-D signal.
The first experimental test focused on studying the behaviour of the
methods with respect to linear and non-linear transformations of the considered
signals.
The first database was a set of 256 signals, obtained from 16 different
Brodatz textures [5]. For each texture we selected a 32 × 32 pixel area,
which produces a 1-D signal of 1024 samples when HEAT is applied; these
steps generate a set of 16 one-dimensional signals.
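HEAT itself is not defined in this excerpt, only its effect: a 32 × 32 tile becomes a 1024-sample one-dimensional signal. As a stand-in for experimentation, a simple row-major raster scan reproduces that shape change; this is our placeholder assumption, not the paper's actual transformation:

import numpy as np

def tile_to_signal(tile):
    """Stand-in for HEAT's 2-D -> 1-D step: a 32 x 32 tile becomes a
    1024-sample signal. A row-major raster scan is used as a placeholder;
    the paper's actual linear transformation is not reproduced here."""
    tile = np.asarray(tile, dtype=float)
    assert tile.shape == (32, 32)
    return tile.reshape(-1)                    # 1024-sample one-dimensional signal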
Figure 1. Selection of tiles obtained from the Brodatz textures, augmented with
transformed versions of the 16 original signals (first database).
Figure 2. Behaviour of the inverse DFT for different numbers of reconstruction
harmonics (17, 34, 500, 1000): the plot reports the position of the first false
alarm and the number of matches among the first 15 retrieved.
Figure 3 shows a signal belonging to the first database (a bark element) and
the same signal reconstructed using 17 harmonics in the inverse Fourier
reconstruction.
Figure 3. Original Bark Brodatz signal (top) and the same signal after the
inverse DFT obtained with the first 17 harmonics (bottom).
We can notice that the inverse DFT produces a set of almost overlapping
signals when applied to the set of 16 relevant matches (i.e. to the transformed
versions of the query image, including the query itself) using a restricted
number of components.
These results are also clearly shown in Table 1, which includes the
comparison among the proposed methods.
Table 1. Comparison among the proposed methods.

                           Euclidean Distance   HER     Fourier   Wavelet
#Correct Tiles                   15
1st False Alarm position         12              11      12        12
#False Dismissal                  4               5       4         4
Normalized Recall                 0.969           0.927   0.969     0.969
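The Normalized Recall row is presumably Salton's classical measure, which compares the ranks of the relevant retrieved items against the ideal ranking; a minimal sketch under that assumption (the function name is ours):

def normalized_recall(relevant_ranks, collection_size):
    """Salton's normalized recall: 1.0 when all relevant items head the ranking,
    0.0 when they trail it. relevant_ranks are 1-based ranks of relevant items."""
    n = len(relevant_ranks)
    ideal = n * (n + 1) // 2                   # best possible rank sum 1 + 2 + ... + n
    return 1.0 - (sum(relevant_ranks) - ideal) / (n * (collection_size - n))

# e.g. normalized_recall(list(range(1, 13)) + [200, 210, 220, 230], 256) ≈ 0.79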
The graphs of the results of the different methods are shown in Figure
5a and Figure 5b, where a benign and a malignant query are adopted,
respectively.
Figure 5a. Results of the different methods (theoretical, HER, Euclidean Distance
and Fourier, Wavelet) when a benign query is applied.
Figure 5b. Results of the different methods (theoretical, HER, Euclidean Distance,
Fourier, Wavelet) when a malignant query is applied.
The wide range of variability of the signals belonging to this database and
their non-periodic nature are the main factors affecting the qualitative and
quantitative response of the considered methods [7][8][9][10].
In Figures 5a and 5b we can observe that HER again gives results very close
to the theoretical response, while the DFT and Wavelet behaviours are similar
to each other but perform worse than HER.
The third and last database is a portion of an aerophotogrammetry of
Cagliari's district area, acquired at an altitude of 10,000 meters; the image is
characterized by rural roads, extensive plantations (trees and horticultural
crops) and farms (Figure 6).
3. Conclusions
In this work we have faced the problem of image retrieval efficiency; four
different methods were compared on three diverse databases.
Experimental results show the robustness of both HER and HEAT, as also
stated in some previous works. We were also able to show that the above-mentioned
methods give results that can be compared with the DFT, Wavelets and the
Euclidean Distance, despite the intrinsically non-linear nature of HER and HEAT.
Textures, medical and aerophotogrammetric images have been considered as
databases for our experiments. The results obtained call for some further
considerations: the use of the Brodatz database allowed us to show that the
behaviour of HER, the DFT and Wavelets is quite similar. The comparison among
methods gives more interesting results when applied to the other proposed
databases; in fact, while the behaviour of HER is generally the best, the
behaviour of the DFT, Wavelets and the Euclidean Distance is worse and almost
invariable with respect to the considered database; among these three methods
we are not able to establish which works better in general. In any case, the
Euclidean Distance is the least performing among the considered methods.
Acknowledgements
The authors would like to thank Marco Cabras and Maria Giuseppina Carta of
the Provincia di Cagliari for their help in obtaining permission to use
some portions of the digital images of Cagliari's district area.
References
1. Brandt S., Laaksonen J., Oja E., Statistical Shape Features in Content-based
   Image Retrieval, Proc. of ICPR, Barcelona, Spain, September 2000.
2. Teolis A., Computational Signal Processing with Wavelets, Birkhauser, 1998.
3. Casanova A., Fraschini M., Vitulano S., Hierarchical Entropy Approach for
   Image and Signals Retrieval, Proc. FSKD02, Singapore, L. Wang et al. Editors.
4. Distasi R., Nappi M., Tucci M., Vitulano S., CONTEXT: A Technique for
   Image Retrieval Integrating CoNtour and TEXture Information, Proc. of
   ICIAP 2001, Palermo, Italy, pp. 224-229, IEEE Comp. Soc.
5. Brodatz P., Textures: A Photographic Album for Artists and Designers,
   Dover Publications, New York, 1966. Available in a single .tar file:
   ftp://ftp.cps.msu.edu/pub/prip/textures/
6. Suckling J., Parker J., Dance D.R. et al., The Mammographic Image Analysis
   Society digital mammogram database, in Digital Mammography, Gale, Astley,
   Cairns Eds., pp. 375-378, Elsevier, Amsterdam, 1994.
7. El Naqa I., Yang Y., et al., Content-based image retrieval for digital
   mammography, ICIP 2002.
8. Acharyya M., Kundu M.K., Wavelet-based Texture Segmentation of Remotely
   Sensed Images, Proc. of ICIAP 2001, Palermo, pp. 69-74, IEEE Computer Society.
9. Wang J.Z., Wiederhold G., Firschein O., Wei S.X., Content-based image
   indexing and searching using Daubechies' wavelets, Int. Jour. Digit. Libr.,
   1:311-328, Springer Verlag, 1997.
10. Chang R.F., Kuo W.J., Tsai H.C., Image Retrieval on Uncompressed and
    Compressed Domain, ICIP 2000.
1. Introduction
The Information Retrieval field has generated additional interest in methods and
tools for multimedia database management, analysis and communication.
Multimedia computing systems are widely used for everyday tasks and, in
particular, image databases represent the most common type of application; it is
important to extend the capabilities of this application field by developing
multimedia database systems based on retrieval by content. Searching for an
image in a database is a complex issue, especially if we restrict the queries to
approximate or similarity matches.
A variety of techniques and working prototypes for content-based image
indexing systems exist in the literature.
This paper presents an overview of, and some remarks on, an indexing
technique, the Hierarchical Entropy-based Representation (HER), for image
retrieval based on contour and texture data, and shows the latest results
obtained with Brodatz textures, aerial photographs and medical image datasets.
Our method has proved effective at retrieving images in all cases under
investigation, and it has invariance and robustness properties that make it
attractive for incorporation into larger systems.
We also think that using this method in medical databases as the basis for a
computer-aided detection (CAD) system could be a relatively new and intriguing
idea; our first experimental results show that it is effective according to the
most common objective indexes used to estimate the performance of diagnostic
results (sensitivity, specificity, positive predictive value, and negative
predictive value).
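For reference, the four indexes named above follow directly from confusion-matrix counts; a minimal sketch (names ours):

def diagnostic_indexes(tp, fp, tn, fn):
    """The four objective indexes, computed from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # fraction of lesions correctly flagged
        "specificity": tn / (tn + fp),   # fraction of healthy cases correctly passed
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }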
The paper is organized as follows: Section 2 briefly summarizes how our
method works and some of its properties; Section 3 presents a comparison with a
wavelet-based method; and Section 4 describes the results obtained from
experimentation on several image datasets.
Considering a signal f(t) in the time space, HER represents the signal in
the entropy space following these steps:
- select the first absolute maximum;
- consider the maximum to be the midpoint of a Gaussian distribution;
- compute its relative entropy;
- go back to the first step, until a predefined number M of maxima has been
  used or the fraction of the total energy remaining in the signal falls below
  a given threshold.
In the entropy space the signal is represented by means of the sequence of
extracted maxima, each located by its distance from the first (largest) maximum.
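A minimal sketch of the extraction loop just described, under two stated assumptions not fixed by the text: each maximum's Gaussian width sigma is estimated from the half-height of the peak, and the entropy associated with a maximum is taken to be the differential entropy of its Gaussian, 0.5 ln(2 pi e sigma^2):

import numpy as np

def her(signal, m=5, energy_frac=0.3):
    """Sketch of HER: iteratively extract absolute maxima, model each as the
    midpoint of a Gaussian, and record (distance-from-first-maximum, entropy)
    pairs until M maxima are used or the residual energy drops below a threshold."""
    x = np.arange(len(signal))
    residual = np.asarray(signal, dtype=float).copy()
    total = np.sum(residual ** 2)
    pairs, first = [], None
    while len(pairs) < m and np.sum(residual ** 2) > energy_frac * total:
        pos = int(np.argmax(np.abs(residual)))           # next absolute maximum
        peak = residual[pos]
        below = np.where(np.abs(residual) < abs(peak) / 2)[0]
        sigma = float(np.abs(below - pos).min()) if below.size else 1.0
        entropy = 0.5 * np.log(2 * np.pi * np.e * sigma ** 2)
        first = pos if first is None else first          # first maximum defines x = 0
        pairs.append((abs(pos - first), entropy))
        residual = residual - peak * np.exp(-0.5 * ((x - pos) / sigma) ** 2)
    return pairs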
The distance between two given signals f1(t) and f2(t) is obtained by
comparing the corresponding non-linear HER representations.
HER is a good candidate for content-based retrieval whenever the
information can be accurately represented by a 1-D signal.
As said above, there are several methods available for image retrieval. Those
based on the multiresolution formulation of wavelet transforms are among the
most reliable and robust. A wavelet is a waveform of effectively limited
duration that has an average value of zero. One of the main advantages
afforded by wavelets is the ability to perform local analysis.
The comparison was aimed at assessing the efficiency and effectiveness of
the retrieval. In particular, efficiency is related to the computational
requirements and to the index size, while effectiveness has to do with the quality
of the answer set. As for the quality of the retrieval, wavelet-based approaches
are very robust and tolerate even the addition of Gaussian noise to the query
texture without overly negative consequences. In HER, as few as 4 or 5 maxima are
usually enough to characterize a texture effectively. Indeed, having too
many maxima in the index does not improve performance. As a
consequence, the typical size of HER indices is rather small. On the other hand,
a typical wavelet-based index requires about a hundred coefficients to work with
good accuracy.
Summing up, HER's performance in terms of quality is very close to that
of methods based on the wavelet transform, but it is much less costly in terms of
computing resources and index size. Additionally, as stated above, this
representation can be used effectively for different kinds of data, in particular
contours and textures.
4. Experimental Results
Several experiments have been performed in order to assess the validity of the
proposed method. For these tests we focused on an aerial image dataset, the
Brodatz set of textures and, furthermore, on a medical case study containing
mammograms from the MIAS Database. In all these cases texture is
significant enough that it can tentatively be used alone for indexing.
The testing dataset with aerial images was constructed using aerial
photographs acquired in regions near Cagliari. The dataset includes several
images of different kinds of soil, vegetation, roads, rivers and buildings. Figure 1
shows a portion of the area under investigation, with a subdivision into tiles
(10 × 10 pixels). We tried to investigate the use of texture as a visual primitive
to search and retrieve aerial images.
The results obtained demonstrate (Figure 2) that our method can be used to
select a large number of geographically salient features such as vegetation
patterns, parking lots, and building developments.
[Figures 2 and 3: distances from the query tile plotted against rank for the
closest 100 matches in the aerial-image dataset.]
Figure 3 shows the distances from the query tile (Bark.0000) to the
closest 100 matches in the testing dataset. The first bin represents the first
nine matches, at distance 0, each belonging to the same texture type.
As for the medical cases, the first database used was the MIAS Mammographic
Database, digitised at a 50-micron pixel edge and reduced to a 200-micron pixel
edge, so that every image is 1024 × 1024 pixels at 8 bits per pixel. The MIAS
Database included 330 images, arranged in pairs of films, where each pair
represents the left and right mammograms of a single patient, with the
following details: MIAS database reference number, character of the background
tissue, class of abnormality present, severity of the abnormality, image
coordinates of the centre of the abnormality, and radius of a circle enclosing
the abnormality.
The testing data set includes 67 benign and 54 malignant mammograms. The
lesions were labelled using the reference information included in the Database.
Table 1 illustrates the results obtained from the experimentation with different
query tiles. The table is structured as follows: the "#FA/XX" columns show the
number of false alarms in the XX-element answer set; the "1st FA" column
contains the answer-set rank of the first false alarm; the last column, "Class",
represents the class of abnormality. It is important to note that in no case was
a false alarm found among the first 10 retrieved mammogram tiles.
Query tile   1st FA   #FA/XX   #FA/XX   Class
104          15       1        5        M
25           12       5        9        B
99           16       3        9        M
34           11       3        9        B
[Figure: distance values plotted against the number of retrieved tiles (# tiles)
for the mammographic queries.]
5. Conclusions
The main idea we have proposed with the HER method is to consider the
maxima as the most important feature of a signal. The importance of the maxima
lies not only in their position but rather in their "mutual position" inside the
signal. HER is a hierarchical method that selects maxima according to their
relative values and reciprocal distances.
The signal is represented by means of a vector of pairs, where the first
element is the distance of each maximum from the first one and the second is the
associated entropy. HER is a non-linear transform presenting several useful
invariances: translation, rotation, reflection, luminance shift and scale.
Experimentation using contour signals has shown encouraging results. These
results are strictly connected with the procedure we followed to transform a
shape into a 1-D signal. HER for contours allows us to obtain important
information on the number and shape of the elongations of the object under
investigation. The sampling theorem clarifies the differences between the
proposed method and Fourier descriptors. The comparison with moment-based
techniques has also shown the validity of the HER method.
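The exact shape-to-signal procedure is not reproduced in this excerpt; one common choice with the stated property (elongations of the object show up as maxima of the 1-D signal) is the centroid-distance signature, sketched here as an assumption rather than as the authors' procedure:

import numpy as np

def contour_signature(boundary_xy, n_samples=256):
    """Centroid-distance signature: the distance from the shape centroid to
    each boundary point, sampled uniformly along the boundary. Elongations
    of the object appear as maxima of this 1-D signal."""
    boundary_xy = np.asarray(boundary_xy, dtype=float)   # (n, 2) boundary points
    centroid = boundary_xy.mean(axis=0)
    d = np.linalg.norm(boundary_xy - centroid, axis=1)
    idx = np.linspace(0, len(d) - 1, n_samples).astype(int)
    return d[idx]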
Some considerations can be made about the results obtained on the
Brodatz dataset of textures: the transformations applied to the tiles (rotation,
reflection, luminance shift and contrast shift) do not modify the low
frequencies of the signal, which allows the Fourier Transform to obtain good
results using only a few coefficients. However, these results are no better than
those obtained with HER and the wavelets.
One of the most important properties of the HER method is its low
computational cost compared with all the other techniques taken into account.
Furthermore, all experimentation was conducted using only 30% of the
whole signal information.
In conclusion, we can affirm that the experimentation with HER has shown
results comparable to (and sometimes better than) the Fourier Transform and
Wavelets. These kinds of results are confirmed by the latest experiments on
medical images (the mammography database).
Considering the results obtained on medical images, we think our method
could be used as the basis for a computer-aided detection (CAD) system. Finding
similar images, with the aim of drawing the radiologist's attention to possible
lesion sites, is surely an important way to provide aid during clinical practice.
The importance of a content-based image retrieval system in computer-aided
detection is to help radiologists when they need reference cases to interpret an
image under analysis. Our future objective is the development of an efficient
database methodology for retrieving patterns in medical images that represent
pathological processes.
VITO DI GESÙ
DMA, University of Palermo, Italy
IEF, University of Paris Sud, Orsay, France
E-mail: digesu@math.unipa.it
The aim of this paper is to address some fundamental issues and viewpoints about
machine vision systems. Among them, image understanding is one of the most
challenging. Even in the case of human vision its meaning is ambiguous: it depends
on the context and on the goals to be achieved. Here a pragmatic view will be taken,
by focusing the discussion on the algorithmic aspects of artificial vision and
its applications.
1. Visual Science
Visual science is considered one of the most important fields of investigation
in perception studies. One of the reasons is that the eyes collect most of the
environmental information, and this makes the related computation very complex.
Moreover, the eyes interact with the other perceptive senses (e.g. hearing, touch,
smell), and this interaction is not fully understood. Mental models, stored
somewhere in the brain, are perhaps used to elaborate all the information that
flows from our senses to the brain. One of the results of this process is an
update of our mental models by means of a sort of feedback loop. This
scenario shows that understanding the visual scene surrounding us is
a challenging problem.
The observation of visual forms plays a considerable role in the majority
of human activities. For example, in our daily life we stop the car at the red
traffic light, select ripe tomatoes while discarding the bad ones, and read a
newspaper to update our knowledge.
The previous three examples are related to three different levels of
understanding. In the first example an instinctive action is performed as a
* This work has been partly supported by the European action COST-283 and by
the French Ministry of Education.
Figure 1. Axial slices through five regions of activity of the human brain.
One of the goals of visual science is the design and realization of
artificial visual systems ever closer to the human one. Recent advances in
the inspection of the human brain, together with future technology, will allow us
both to explore our physical brain in depth (see Figure 1) and to design
artificial visual systems whose behavior will be closer and closer to that of
human beings (see Figure 2).
However, advances in technology will not be sufficient to realize such
advanced artificial visual systems; as a matter of fact, their design
would need a perfect knowledge of our visual system (from the eyes to
the brain).
tems. Here, the choice of the visual model is usually based on optimization
criteria.
Even if the pragmatic approach has been developed under the stimulus of
practical requirements, it has also contributed to a better understanding of some
vision mechanisms. For example, graph-theoretical algorithms have been
successfully applied to recognize Gestalt clusters in accordance with human
perception 12,13. The relation between graphs and the natural grouping of
patterns could be grounded in the fact that our neural system can be
seen as a very dense multi-graph with billions of billions of paths. Of
course, the question is still open and probably will never be solved.
Artificial visual systems can be described through several layers of
increasing abstraction, each one corresponding to a set of iterated
transformations. The general purpose is to reach a given goal, starting from an
input scene X, represented, for example, as an array of 2D pixels or 3D
voxels defined on a set of gray levels G. The computation paradigm follows
four phases (see Figure 3): low-level vision, intermediate-level vision,
high-level vision, and interpretation.
Note that these steps do not operate as a simple pipeline; they
may interact through semantic networks and control mechanisms based
on feedback. For example, parameters and operators used in the low-level
phase can be modified if the result is inconsistent with an internal model
used during the interpretation phase. The logical sequence of the vision
phases is only weakly related to natural vision processes; in the following, a
pragmatic approach is considered, where each visual procedure is implemented
by means of mathematical and physical principles, which may or may not have
a neuro-physiological counterpart.
The pragmatic approach has achieved promising and useful results in
many application fields, among them robot vision 14, face expression
analysis 15, document analysis 16, medical imaging 17, and pictorial
databases.
[Figure: Early Vision and Cognitive Vision layers.]
Figure 5. The attentive operator DST: (a) the input image; (b) the application of
the DST; (c) the selection of points of interest; (d) the selection of eyes.
system describing the snake under constraints that are imposed by the
image features. The solution is found by an iterative procedure and
corresponds to the minimization of the system energy. The algorithm is
based on the evolution of the dynamic system:
Figure 6. (a) input image; (b) edge detection; (c) snake computation.
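The dynamic system itself is not reproduced in this excerpt; a standard formulation with the same behaviour, the semi-implicit snake update of Kass, Witkin and Terzopoulos, is sketched here for concreteness:

import numpy as np

def snake_step(pts, external_force, alpha=0.1, beta=0.1, gamma=1.0):
    """One iteration of a classical closed snake (Kass-Witkin-Terzopoulos style).
    pts: (n, 2) contour points; external_force(pts) -> (n, 2) image-imposed force."""
    n = len(pts)
    A = np.zeros((n, n))
    for i in range(n):
        # Pentadiagonal internal-energy matrix: alpha penalizes stretching, beta bending.
        A[i, i] = 2 * alpha + 6 * beta
        A[i, (i - 1) % n] = A[i, (i + 1) % n] = -alpha - 4 * beta
        A[i, (i - 2) % n] = A[i, (i + 2) % n] = beta
    # Semi-implicit update: (A + gamma I) x_new = gamma x_old + F_ext(x_old).
    return np.linalg.solve(A + gamma * np.eye(n), gamma * pts + external_force(pts))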
Figure 7. (a) Cipolla's skewed symmetries detection; (b) examples of global symmetry
detection.
task. Therefore, general solutions do not exist and each proposed technique
is suitable for a class of problems. In this sense, image segmentation is
an ill-posed problem that does not admit a unique solution.
Moreover, the segmentation problem is often hard because the probability
distribution of the features is not well known. Often, the assumption
of a Gaussian distribution of the features is a rough approximation that
invalidates the linear separation between classes.
In the literature the segmentation problem has been formulated from
different perspectives. For example, in 45 a two-step procedure is described
that uses only data included in the boundary; this approach has been extended to
boundary surfaces by combining splines and superquadrics to define global
shape parameters 46,47. Other techniques use elastic surface models, which
are deformed under the action of internal forces to fit object contours using
a minimum-energy criterion 48. A model-driven approach to the segmentation of
range images is proposed in 49.
Recently, Shi and Malik 50 have considered 2-D image segmentation
as a Graph Partitioning Problem (GPP) solved by a normalized-cut
criterion. The method finds an approximate solution by solving a generalized
eigenvalue system. Moreover, the authors consider both spatial and
intensity pixel features in the evaluation of the similarity between pixels.
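A minimal sketch of that construction, assuming a precomputed symmetric affinity matrix W (how W combines spatial and intensity features is application-dependent):

import numpy as np

def normalized_cut_bipartition(W):
    """Approximate Shi-Malik normalized cut of a graph with symmetric affinity
    matrix W (positive node degrees assumed): the second-smallest generalized
    eigenvector of (D - W) x = lambda D x splits the nodes into two segments."""
    d = W.sum(axis=1)
    L = np.diag(d) - W                               # graph Laplacian
    d_isqrt = 1.0 / np.sqrt(d)
    L_sym = d_isqrt[:, None] * L * d_isqrt[None, :]  # D^{-1/2} (D - W) D^{-1/2}
    _, vecs = np.linalg.eigh(L_sym)                  # eigenvalues in ascending order
    fiedler = d_isqrt * vecs[:, 1]                   # recover the generalized eigenvector
    return fiedler > np.median(fiedler)              # threshold into two pixel groups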
Recently, the problem of extracting the largest image regions that satisfy
uniformity conditions in the intensity/spatial domains has been related to
a Global Optimization Problem (GOP) 51 by modelling an image as a
weighted graph, where the edge weight is a function of both intensity and
spatial information. The chosen solution is the one for which a given
objective function attains its smallest value, hopefully the global minimum.
In 52 a genetic algorithm is proposed to solve the segmentation problem as
a GOP using a tree regression strategy 53.
The evaluation of a segmentation method is not an easy task, because
the expected results are subjective and depend on the application.
One evaluation could be the comparison with a robust and well-tested
method, but this choice is not always feasible; whenever possible,
the evaluation should be done by combining the judgement of more than one
human expert. For example, the comparison could be performed using a
vote strategy as follows:
Figure 8. (a) input image; (b) human segmentation; (c) GS, (d) NMC, (e) SL, and
(f) C-means segmentations.
where #agr_k is the number of pixels on which the human and the machine agree,
|HP_k| is the cardinality of the segment defined by the human, and |AP_k| is the
cardinality of the segment found by the algorithm.
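The voting formula itself is garbled in our copy; one reading consistent with the quantities just defined is a per-segment agreement score normalized by the sizes of the human and machine segments, sketched here as an assumption:

import numpy as np

def segment_agreement(human_labels, machine_labels, k):
    """Agreement between human and machine segmentations for label k, built
    only from the quantities defined in the text: #agr_k, |HP_k|, |AP_k|."""
    hp = human_labels == k                     # HP_k: segment defined by the human
    ap = machine_labels == k                   # AP_k: segment found by the algorithm
    agr = np.logical_and(hp, ap).sum()         # #agr_k: pixels where both agree
    denom = max(hp.sum(), ap.sum())
    return agr / denom if denom else 0.0       # normalized agreement in [0, 1]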
Figure 8 shows how different segmentation methods (Genetic Segmentation
(GS), Normalized Minimum Cut (NMC), Single Link (SL), C-means)
perform on the same image. Figure 8b shows the human
segmentation obtained using the vote strategy.
and shape features are used in the classification and recognition of α and
β retinal ganglion cells. In 55 a quantitative approach is presented in which
several features are combined, such as diameter, eccentricity, fractal dimension,
influence histogram, influence area, convex hull area, and convex hull
diameter. The classification is performed by integrating the results of three
different clustering methods (Ward's hierarchical scheme, K-means and a Genetic
Algorithm) using a voting strategy. The experiments indicated the
superiority of some features, also suggesting possible biological implications,
among them the eccentricity derived from the axial moments of the cell (see
Figure 9).
Autonomous robots equipped with visual systems are able to recognize
their environment and to cooperate in finding satisfactory solutions. For
example, in 56 a probabilistic, vision-based state-estimation method for
individual, autonomous robots is developed. A team of mobile robots is
able to estimate their joint positions in a known environment and track the
positions of autonomously moving objects. The state estimators of different
robots cooperate to increase the accuracy and reliability of the estimation
process. The method has been empirically validated in experiments with
a team of physical robots playing soccer 57.
The concept of internal model is central in this phase of the analysis.
3.4. Interpretation
This phase exploits the semantic part of the visual system. The result
belongs to an interpretation space; examples are linguistic descriptions and
definitions of physical models. This phase could be considered the
conscious component of the visual system. However, in a pragmatic approach it
is simply a set of semantic rules provided, for example, by a knowledge
base.
The technical problem is that of automatically deriving a sensible
interpretation from an image. This task depends on the application or the
domain of interest within which the description makes sense. Typically, in
a domain there are named objects and characteristics that can be used in a
report or to make a decision. Obviously, there is a wide gap between the
nature of images (essentially arrays of numbers) and their descriptions, and the
intermediate level of the analysis is the necessary link between image
data and domain descriptions. There are researchers who take cues from
biological systems to develop theories, and there are those who focus on
mathematical theories and the physics of the imaging process. Eventually,
however, theory becomes practice in the specification of an algorithm,
embodied in an executable program with appropriate data representations.
There are alternative views of vision, resulting in other paradigms for image
understanding and research.
In image interpretation, knowledge about the application domain is
manipulated to arrive at an understanding of the recorded part of the world.
Knowledge representation schemes that have been studied include semantic
networks 62, Bayesian and belief networks 63, and fuzzy expert systems 64. Some
of the issues addressed within these schemes are: the incorporation of procedural
and declarative information, the handling of uncertainty, conflict resolution, and
the mapping of existing knowledge onto a specific representation scheme. The
resulting interpretation systems have been successfully applied to interpreting
utility maps, music scores and face images. Future developments will focus on
the central theme of fusing knowledge representations. In particular, attention
will be paid to information fusion, distributed knowledge in multi-agent systems,
and the mixing of knowledge derived from learning techniques with knowledge from
context and experts.
Moreover, recognition systems must be able to handle uncertainty and
to include subjective interpretations of a scene. Fuzzy logic 67 can provide
good theoretical support to model this kind of information 65,66. For
example, to evaluate the degree of truth of the propositions:
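The propositions themselves are lost in our copy; as an illustration of the idea, here is a sketch evaluating the degree of truth of a proposition such as "the region is bright" (our example, with assumed thresholds):

def degree_of_truth_bright(mean_gray):
    """Illustrative fuzzy membership: the degree of truth of 'the region is
    bright' as a piecewise-linear function of mean gray level in [0, 255]."""
    lo, hi = 80.0, 180.0                       # assumed bounds of the fuzzy set
    if mean_gray <= lo:
        return 0.0
    if mean_gray >= hi:
        return 1.0
    return (mean_gray - lo) / (hi - lo)        # linearly increasing truth degree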
4. Final remarks
This review has shown some problems and solutions in visual systems. Today,
more than 10,000 researchers are working on visual science around
the world. Visual science has become one of the most popular fields among
scientists. Physicists, neurophysiologists, psychologists, and philosophers
cooperate to reach a full understanding of visual processes from different
perspectives, the fusion and integration of which will allow us to make
consistent progress in this fascinating subject. Moreover, we note that
anthropomorphic elements should be introduced to design complex artificial
visual systems. For example, the psychology of perception may suggest
new approaches to solving ambiguous 2D and 3D segmentation problems.
For example, Figure 10 shows the well-known Kanizsa illusion 70. Here the
perceived edges have no physical support whatsoever in the original signal.
References
1. D. Marr, Vision, W.H. Freeman, San Francisco (1982).
2. S.E. Palmer, Vision Science: Photons to Phenomenology, MIT Press (1999).
3. M. D'Esposito, J.A. Detre, G.K. Aguirre, M. Stallcup, D.C. Alsop, L.J. Tippet,
   M.J. Farah, Neuropsychologia 35(5), 725 (1997).
4. M. Conrad, Advances in Computers, 31, 235 (1990).
5. N. Wiener, MIT Press, Cambridge, Massachusetts (1965).
6. F. Rosenblatt, Proceedings of a Symposium on the Mechanization of Thought
   Processes, 421, London (1959).
7. F. Rosenblatt, Self-organizing Systems, Pergamon Press, NY, 63 (1960).
8. F. Rosenblatt, Principles of Neurodynamics, Spartan Books, NY (1962).
9. V. Cantoni, V. Di Gesù, M. Ferretti, S. Levialdi, R. Negrini, R. Stefanelli,
   Journal of VLSI Signal Processing, 2, 195 (1991).
G. MADONNA
Sistemi Informativi
To sum up, all of these interactions put together require the information
system supporting them to be able to:
concentrate on the management of fundamental interactions (those with
the patient), so as to ensure the governing of the strategic objectives
pursued;
handle other exchanges of information adequately, in particular by
progressively adopting a logic of close integration of the information
flows with external companies (for example, entrusting GPs with
computerised appointment-taking operations);
automatically produce a full and explanatory set of information, in the
public domain or with controlled access, so as to guarantee the
necessary transparency of process performance, and therefore make the
processes usable by third parties by way of portals.
The strategic variables transversal to the three types of structure, then, can
be identified on the basis of the two essential access and control functions:
management control;
the information and IT system, decisive both for the circulation of data for
access purposes and for their analysis for control purposes;
the quality system, i.e. new attention paid to the service offered to the user;
human resources development.
identification of the different levels at which the data must be treated and
processed and, in a correlated way, identification of the integration
mechanisms of the data themselves.
Since integration is the solution to the functioning problems of the company's
information system, an integration able to exchange flows of
information is to be achieved by means of solutions which enable the sharing of
archives by all subsystems.
So it is clear how the help of a strong, organic and elastic support system is
fundamental to information/management activities.
The system must be organised in such a way that the data originate only
from primary sources, identified as follows:
original data, generated by management processes;
second-level data, produced by processing procedures;
complex data, resulting from automatic acquisition from more than
one archive.
Complex data:
This is the data necessary for activities of control, management, and
statistical and epidemiological assessment, originated by crossing data
present in more than one archive at the moment in which they are correlated,
in order to express significant values. This therefore involves the
definition of one or more data warehouses, starting from which the specific
application systems carry out processing at a more aggregated level, in an
On-Line Analytical Processing (OLAP) logic.
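As an illustration of this logic, "complex data" can be derived by crossing two archives in a single aggregation at the moment of correlation; a minimal sketch with hypothetical admissions and services archives (table and column names are ours):

import sqlite3

# Two hypothetical primary archives (names and columns are illustrative only).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE admissions(patient_id INTEGER, ward TEXT, admitted TEXT);
CREATE TABLE services(patient_id INTEGER, service TEXT, cost REAL);
""")
# "Complex data": a crossing of both archives, aggregated per ward, the kind
# of pre-aggregated view a data warehouse would expose to OLAP tools.
rows = con.execute("""
    SELECT a.ward,
           COUNT(DISTINCT a.patient_id) AS patients,
           SUM(s.cost)                  AS total_cost
    FROM admissions a
    JOIN services s ON s.patient_id = a.patient_id
    GROUP BY a.ward
""").fetchall()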
service of the patient, both directly, in that they are the direct user of
them, and indirectly, as support to the work of health staff;
the top two layers, both characterised by complex data resulting from
statistical or OLAP processing, divide the information about the patient
according to the two axes, effectiveness and efficiency, recalled most
often:
in the first direction, the essential processing is of the
medical and clinical sort, supporting the quality of the health
outcome and the capability of the system to guarantee the
necessary levels of assistance;
in the second direction, the typical processing is that
supporting managerial and strategic decisions and the control
of expenditure and the relative production level.
[Figure: diagram of user, association and operator management.]
The integration of modules, i.e. the very notion of an Integrated Health System,
cannot be separated from a new company vision: a vision based on "Processes".
This statement proposes another determining key for the "information
system" as a strategic variable: the capability of the system to support
company processes and, therefore, to map itself out and be configured flexibly
on these processes. In short, the design logic requested of an ERP
system must cover not only administrative and accounting systems, now an
acquired fact, but health ones too, both of the health-administration and
the health-professional sort.
Each process is in itself complex and integrates administrative, health,
economic and welfare elements. Each process therefore requires information
integration within itself and with other correlated processes.
As Figure 7 shows, there exists an implicit logical information flow which
transports information from health processes ("production" in the strictest sense)
towards directional processes (the "government" of production), transiting
through processes which are to a certain extent auxiliary (but obviously
essential for the functioning of the production machine of the company), of the
health/administrative and the accounting/administrative type.
The architecture of the system must enable the support of the processes
described in the figure and of the information flows which tie these processes
together, and in particular:
the territory-hospital integration favouring the patient and the relative
welfare processes. The integration of welfare processes between the
operative units of hospitals and the operators in the territory (family
doctors and pharmacies) is activated through support for the
functionality of the whole service-delivery process: from
information activities, prescription, appointment making and delivery, to
the payment of tickets and the withdrawal/acquisition of return
information.
the integration of the operational units, whether clinical or not, as an
auxiliary service to the medical and health staff in carrying out
activities relative to the care of the patient. The information system
must handle the activities of the care process in a unitary and facilitated
way, controlling the delivery of the services requested and
their outcome, so as to obtain an improvement in quality and efficiency.
the integration of clinical-welfare information in order to guarantee the
compactness of the welfare process. The visibility of the status of the
total clinical-welfare process towards the patient is made possible
thanks to access to previous clinical records.
the integration of information for directional purposes, as an auxiliary
service to the management personnel of the Company. Following the
latest reforms, Companies are pursuing management improvement,
guaranteeing the delivery of services at the highest levels of quality
and aiming at the final objective of health, the total outcome of welfare.
From this comes the need to collect, reconcile and integrate
information coming from the different administrative and health
information subsystems.
the information integration with the Regional Health Office General
Management, in order to facilitate the administrative operations
aimed at reimbursements (communications with local health authorities
and with the Regional office concerning budgets and services
delivered), the control and regional supervision activities regarding
health expenditure, and lastly the activities of health (epidemiological)
supervision. The interaction of the Company with the GM of the
Regional Office has a double purpose: to transmit promptly the
documentation necessary to receive the reimbursements and regional financial
support the Company is due for services delivered, and to provide the
Region with the information necessary to support the governing of
expenditure and the management of financial support, the planning and
rebalancing of the Regional Health System, and the improvement of
services for the population.
The cards relative to hospitalisation, once filled out, are recorded in the
HDC/DRG module, which sees to the pricing (Grouper 3M) and formal
controls and then to feeding the Hospitalisation Mobility module.
The In-Patient Ward Management represents in itself the evolution of a
clinical-health process in the area of the administrative process (admission and
filling out of the Discharge card).
The degree of integration ensures that the list of in-patients of each single
ward is fed by all the modules of the Health System which can carry out the
administrative hospitalisation. Data relative to therapies/medicine administration
and the pages of the specialist clinical record are visible in the complete
clinical history of the patient. The administration of medicine triggers an
automatic discharge entry from the pharmacy cabinet, which is integrated with the
Pharmacy Store for the management of the sub-stock and the relative supply orders.
From the specialist clinical record it is possible to access all administrative
and clinical data relative to the hospitalisation itself as well as to previous
hospitalisations (including all the data about ambulatory services with referrals
inside the Ambulatory Management).
The figure which follows represents the logical flow relative to the
Integrated management of an In-Patient Ward.
6.3. Welfare process
The Delivery of Welfare Services process starts with the request
(opening of a file) for the service itself from the family doctor or hospital
doctor, and is defined with the management of the Multi-dimensional Assessment
file for Adults and the Elderly and the placement on the waiting list according
to the type of welfare regime.
The health process is completed with the discharge from the welfare
structure and the closing of the Clinical Record.
The administrative/accounting process is fed by the recording of the
activities delivered, to which the price lists relative to both private structures
and those in the National Health Service apply. On the basis of budgeted
activities it is possible to carry out expenditure forecasts for each structure
and each type of welfare regime. For every welfare structure it is possible to
import specific plans (defined by the regions) relative to the activities carried
out on the patients on their lists.
On the basis of the data from the files, it is possible to carry out controls
on the suitability of said data; such controls enable the transparent management
of payments made to private structures under the welfare regime.
Figure 11. Welfare Process.
GIANNI FENU
University of Cagliari
Department of Mathematics and Informatics
e-mail: fenu@unica.it
ANTONIO CRISPONI
University of Cagliari
Department of Mathematics and Informatics
e-mail: antonio@sc.unica.it
SIMONE CUGIA
University of Cagliari
Department of Mathematics and Informatics
e-mail: simone@sc.unica.it
MASSIMILIANO PICCONI
University of Cagliari
Department of Mathematics and Informatics
e-mail: mpicconi@sc.unica.it
The use of computer technology in medicine has lately seen the study of various
applications suited to clinical data management, whether textual or image-based,
across networks in which priority has always been given to policies of flexibility
and security. In the model shown here we summarize the flexibility factors necessary
for the development of systems with a broad application spectrum, thanks to an
investment in wireless technology that allows security and guarantees a certified
client-server exchange over the network. Moreover, the need to offer a broad base
on the client side has suggested the adoption of different PDA models, making the
application largely portable and allowing architectural independence. The same
smart-client wireless network allows the deployment area to grow cell by cell.
1 Introduction
The diffusion of computer tools in different sectors of modern medicine has
marked an evident discontinuity in the development of scientific
activities.
Notable benefits have been brought to the field of medicine by the
improvement of hospital services and by the consequent growth in the
quality of care, which in different ways is related not only to the development of
2 Architectural Model
The solutions for the development of wireless applications can be classified as
browser-based, synchronization-based, and smart-client [3].
The browser-based solution has the disadvantage of requiring a permanent
connection, with the consequent problems of exchanging more data than the
user needs and of caching information, which may turn out not to be up
to date.
The synchronization-based solution has the opposite disadvantage,
offline operation, and it does not allow the system to work in real time in a
wireless network; the application uses a cache of data on the handheld device.
The smart-client solution allows the network exchange of only the requested
information, guaranteeing data rate and a simple inquiry mode; a further
advantage is independence from the network architecture, integrating into the
existing server architecture.
4 Client interface
The application answers three fundamental requirements: provide users'
informative reports, provide data transfer security and reliability, and provide
data communication in different formats compatible with the computational
architecture and the interface of clients.
The mechanisms of communication/synchronization between server and
client are implemented through a pattern-matching system that interprets client
commands on the server side and server commands on the client side [1][4].
The steps of the communication between server and client are:
[Screenshots of the PDA client interface illustrating the exchange: patient
record forms with fields Cognome, Nome, Intervento, Letto, Unità Operativa
and date.]
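A minimal sketch of such a pattern-matching interpreter on the server side, with a hypothetical command grammar (the verbs, field names and formats are our assumptions, not the system's actual protocol):

import re

# Hypothetical command grammar: the server interprets client commands such as
# "GET patient=123 field=therapy" by pattern matching.
COMMAND = re.compile(r"^(?P<verb>GET|PUT)\s+patient=(?P<pid>\d+)\s+field=(?P<field>\w+)$")

def handle_client_command(line, records):
    """Interpret one client command against an in-memory record store."""
    m = COMMAND.match(line.strip())
    if m is None:
        return "ERR unknown command"          # unmatched patterns are rejected
    pid, field = int(m["pid"]), m["field"]
    if m["verb"] == "GET":                    # only the requested data travels on the WLAN
        return f"OK {records.get(pid, {}).get(field, '')}"
    return "ERR PUT not implemented in this sketch"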
The use of the IEEE 802.11b protocol, which defines a standard for the
physical layer and for the MAC sublayer for the implementation of Wireless
LANs, represents a communication system that extends a traditional LAN over
radio technologies, and so facilitates integration into existing departments.
The adopted WLAN may be configured in two separate modes:
(ACKnowledgment) to the sender of the packet testifies to the successful
transmission.
One of the aspects in the use of the 802.11b standard is its security,
which is entrusted to a protocol called WEP (Wired Equivalent Privacy) that
is concerned with node authentication and cryptography.
The logical diagram of the WEP algorithm is represented in this figure:
[Figure: WEP block diagram, with the IV and the Integrity Check Value (ICV).]
The initialization vector (IV) is a 24-bit value, concatenated with the secret
key (a 40-bit key).
In this way we obtain a 64-bit input to a pseudorandom code generator
(the WEP PRNG), creating the key sequence.
The user data (plaintext) are concatenated with a 4-byte (32-bit) value, called
the Integrity Check Value (ICV), generated by the integrity algorithm.
At this point the key sequence and the concatenation of plaintext and ICV are
combined by a XOR operation, producing the ciphertext. Then the IV and
ciphertext are concatenated and transmitted. The IV changes for every
transmission, and it is the only part transmitted in the clear, so that it is
possible to reconstruct the message in the reception phase.
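A minimal sketch of this WEP-style pipeline, using RC4 as the pseudorandom generator and CRC-32 as the integrity algorithm, for illustration only (WEP's weaknesses are well documented, and this must not be used for real security):

import zlib

def rc4_keystream(key, n):
    """Generate n bytes of RC4 keystream (the WEP PRNG) from the given key."""
    S = list(range(256))
    j = 0
    for i in range(256):                      # key-scheduling algorithm (KSA)
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = [], 0, 0
    for _ in range(n):                        # pseudo-random generation (PRGA)
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def wep_encrypt(plaintext, secret_key, iv):
    """WEP-style encryption: 24-bit IV + 40-bit key seed RC4; the payload is
    plaintext concatenated with its 4-byte CRC-32 ICV; the IV travels in clear."""
    assert len(iv) == 3 and len(secret_key) == 5       # 24-bit IV, 40-bit key
    icv = zlib.crc32(plaintext).to_bytes(4, "little")  # Integrity Check Value
    payload = plaintext + icv
    keystream = rc4_keystream(iv + secret_key, len(payload))
    ciphertext = bytes(p ^ k for p, k in zip(payload, keystream))
    return iv + ciphertext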
7 Conclusions
The described architecture is characterized by simplicity of implementation
and by ease of insertion into complex existing architectures.
This integration model is characterized by portability, security and
interactivity, and it allows the user to interact with the server even at a
distance.
Enhancements are under study to develop different models and
criteria for direct data exchange between PDA architectures.
In any case, the model of a user free of constraints in the treatment of and
access to the patient's parametric data currently represents a simple and
reliable smart-client architecture.
References