
Medical Image Analysis (1996) volume 1, number 1, pp 1–18


© Oxford University Press

A representation for mammographic image processing

Ralph Highnam∗, Michael Brady and Basil Shepstone

Departments of Engineering Science and Radiology, Oxford University, South Parks Road, Oxford
OX1 3PJ, UK

Abstract
Mammographic image analysis is typically performed using standard, general-purpose algorithms. We note the dangers of this approach and show that an alternative physics-model-based approach can be developed to calibrate the mammographic imaging process. This enables us to obtain, at each pixel, a quantitative measure of the breast tissue. The measure we use is $h_{int}$ and this represents the thickness of 'interesting' (non-fat) tissue between the pixel and the X-ray source. The thicknesses over the image constitute what we term the $h_{int}$ representation, and it can most usefully be regarded as a surface that conveys information about the anatomy of the breast. The representation allows image enhancement through removing the effects of degrading factors, and also effective image normalization since all changes in the image due to variations in the imaging conditions have been removed. Furthermore, the $h_{int}$ representation gives us a basis upon which to build object models and to reason about breast anatomy. We use this ability to choose features that are robust to breast compression and variations in breast composition. In this paper we describe the $h_{int}$ representation, show how it can be computed, and then illustrate how it can be applied to a variety of mammographic image processing tasks. The breast thickness turns out to be a key parameter in the computation of $h_{int}$, but it is not normally recorded. We show how the breast thickness can be estimated from an image, and examine the sensitivity of $h_{int}$ to this estimate. We then show how we can simulate any projective X-ray examination and can simulate the appearance of anatomical structures within the breast. We follow this with a comparison between the $h_{int}$ representation and conventional representations with respect to invariance to imaging conditions and the surrounding tissue. Initial results indicate that image analysis is far more robust when specific consideration is taken of the imaging process and the $h_{int}$ representation is used.
Keywords: image enhancement and restoration, image simulation, mammography, surface
representation
Received September 14, 1995; revised December 4, 1995; accepted December 11, 1995

∗ Corresponding author (e-mail: rph@robots.ox.ac.uk)

1. INTRODUCTION

X-ray mammography continues to be the best examination for early detection of breast cancer in post-menopausal women, is the basis for national screening programmes and is the subject of the mathematical and computational modelling proposed here. Women under the age of 50 years are currently excluded from screening because of the attenuation of X-rays by dense breast tissue before involution. The facts about the incidence of breast cancer are well known: more women in the West die of breast cancer than any other form of cancer. About one woman in 12 can expect to develop breast cancer during her lifetime. It has been demonstrated that early detection greatly improves mortality rates, perhaps by as much as 25% as claimed in the Forrest Report (Forrest, 1986), which led to the UK breast cancer screening programme based on X-ray mammography, though this figure is currently the topic of some debate (see below). For this reason, mammographic examinations are nowadays performed on about 25 million women annually in the EC (of which, about 3 million are in the UK), at a cost of about US $3 billion per year. This huge cost and the poor accuracy in diagnosis, 8–25% of cancers are
2 R. Highnam et al.

Figure 1. The Xmammo interface. A left–right breast pair is shown. The buttons on the left-hand side enable the user to select the original image, the primaries (scatter removal), to simulate the familiar hotlight, etc. A hotlight is shown as the bright square and allows close examination of the densities within the light; radiologists use it in practice to view the breast edge. The buttons are repeated on the right side. At the bottom, there are simple ways to change the time of exposure, the film and tube types, and to reposition the AEC.

missed and 70–80% of open surgical biopsies turn out to be benign (Shroff et al., 1991), have led to increased interest in applying computer-aided techniques.

A recent study (Woodman et al., 1995) suggests that the rate of presentation of tumours that arise within the screening interval or 'interval cancers' (the interval is three years in the UK) is much larger than was predicted in the Forrest Report. If these early results on interval cancers hold up to further study, this will cause a sharp decrease in the claimed improvement in mortality as a result of the current protocols for national screening. In fact, the findings have already led inter alia to pressure for a reduction in the screening interval and to the routine adoption of two-view screening (cranio-caudal and 45° medio-lateral). The massive increase in the number of mammograms that these changes to the screening programme would entail makes the development of reliable and robust computer techniques vital.

As an application of image processing, mammographic images pose a tough challenge because they have poor signal-to-noise ratio, typically 5–6 dB, corresponding to a noise level of approximately 3–4 grey levels in intensity (Cerneaz, 1994; Cerneaz and Brady, 1995). This is largely because the images exhibit complex textures, because of scattered photon radiation (Highnam et al., 1994) and because there is a compromise between radiation dose and image quality. Worse, abnormalities appear as quite subtle, irregular, often non-local differences in intensity. Moreover, the images are inevitably cluttered due to superimposition, the background varies greatly between different breasts, and there is relatively weak control of the imaging acquisition. In short, mammographic images strain techniques of image processing to their current limits, and beyond.

Unfortunately, while there have been many papers written on the application of image processing to mammography,
Mammographic image processing 3

the vast majority of the work has been of limited scope and incorporates only general non-mammography-specific image processing considerations. The dangers are obvious: image smoothing may make lesions easier to locate, but can remove calcifications and spiculations; edge sharpening may appear to improve an image from the image processor's perspective but can transform a malignant lesion into one that appears to a radiologist to be benign. In addition, currently available image processing systems use terms familiar to engineers but totally unfamiliar to clinicians, and this leads to unreliability, a lack of confidence in users (typically radiographers) and disuse. Full literature reviews can be found in Highnam (1992) and Cerneaz (1994).

Figure 2. The original mammogram. There is a suspect mass in the bottom left-hand corner of the image.

Figure 3. Depiction of the $h_{int}$ surface of a breast, after smoothing with a Gaussian $G(0, \sigma)$ where in the case shown $\sigma = 3$. This is implemented by a convolution of mask size 5 by 5.

Our general approach has been that in order to be reliable and predictable, image processing must be based on a model of how the image is formed (Highnam, 1991, 1994, 1995). For mammography, this means we must model the way that X-rays pass through breast tissue, are absorbed, and scattered before exposing the film. Our modelling uses the work of a number of medical physicists, most notably Haus et al. (1977), Day and Dance (1983), Dance and Day (1984), Johns and Yaffe (1987), Carlsson et al. (1989) and Dance et al. (1992). We have also worked very closely with radiographers and clinical radiologists, at the Churchill Hospital, Oxford and the Institut Gustave Roussy, Paris.

A clear example of the difference between our approach and conventional approaches is in our modelling of the imaging process to enhance mammograms by removing the effect of scatter (Highnam et al., 1994). It is the scattering of X-rays by breast tissue that gives mammograms their appearance of being slightly out of focus. There are many techniques for sharpening up a mammogram image, of which unsharp masking is the best known. For the effects of scatter removal to be predictable and reliable, it is important that sharpening the image to remove scatter does not introduce artefacts that could change the diagnosis. Technically, this corresponds to choosing carefully the convolution kernel used in the unsharp masking. We have shown, for example, that the Gaussian kernel, widely used in unsharp masking, gives poor results. The kernel we developed is based on a model of the scattering of the X-rays and their passage through an anti-scatter grid before absorption by the intensifying screen. We have shown mathematically and clinically that our procedure significantly improves the image, and hence the diagnosis, without introducing clinically suggestive artefacts. Further image restoration and enhancement algorithms have been developed and incorporated into a software system (Xmammo) that is available under license

Figure 4. Depiction of the $h_{int}$ surface of a breast, following processing of a mammogram by the Xmammo package. The height corresponds to the amount of non-fatty tissue. The surface has been smoothed for easier viewing. The smoothing kernel is a Gaussian as in the previous figure, with $\sigma = 5$. A cyst is clearly visible at the bottom left as a significant hill on the landscape. In this way, anatomical features correspond to topographic features of $h_{int}$ surfaces.

from Oxford University and which is currently being evaluated in several hospitals. A typical window display of Xmammo is shown in Figure 1. The interface and screen layout result from substantial and continuing interactions with radiographers at the Churchill Hospital, Oxford.

Modelling the imaging process and understanding the object being imaged allows us to perform effective image normalization. A mammogram contains two sorts of information, namely image variations that are due to: (i) the anatomical structure of the breast, including pathologies and (ii) the choice of imaging parameters that are particular to the examination. The former include fibrous tissue, milk ducts, blood vessels and fatty tissue, as well as calcifications and masses. The latter include: the type of X-ray tube and film used; the position of the breast in the machine; the compression of the breast and the exposure time. It is the former information that is required for early diagnosis by both radiologists and computer. After obtaining calibration data and using appropriate models, we have shown how to remove the imaging effects (ii) to enable a radiologist (or computer) to concentrate on (i). The way we do this is to take sufficient calibration data so that we may transform the image to attain a quantitative measure of the breast tissue at each pixel. The breast measure we use is the amount of non-fat ('interesting') tissue between the pixel and the X-ray source. We use $h_{int}$ to represent the thickness of interesting tissue above each pixel and call these values collectively the $h_{int}$ representation. As a typical example, Figure 2 shows a mammogram with a mass clearly located to the left of the nipple. Figures 3 and 4 are two displays of the $h_{int}$ surface computed from the image shown in Figure 2. The tumour is clearly visible as a prominent hill in the south west portion of the surface map.

In this paper we first describe the $h_{int}$ representation informally (section 2). Then in section 3 we show how it can be computed. The breast thickness turns out to be a key parameter in the estimation of $h_{int}$, but is not normally recorded. We show how the breast thickness can be estimated from the image, and the sensitivity of $h_{int}$ to this estimate. The major contributions of the paper are in sections 4–8 in which we illustrate how the $h_{int}$ surface representation can be applied to a variety of mammographic image processing tasks. In particular, it can be used for a range of image restoration and enhancement operations, and can form the basis for simulating the appearance of anatomical structures

within the breast. We compare the $h_{int}$ representation and conventional representations with respect to invariance to imaging conditions and surrounding tissue. Initial results demonstrate that image analysis is far more robust when specific consideration is taken of the imaging process and the $h_{int}$ representation is used. We conclude by sketching further possible applications of the $h_{int}$ representation, including matching mammograms (and breast MRI volumes) and as the basis for a fresh approach to feature extraction.

2. THE $h_{int}$ SURFACE

This section describes informally the $h_{int}(x, y)$ representation and how it arises from our work on scatter removal (subsequent sections provide more detail). The intensity of a mammogram at a given pixel $(x, y)$ indicates the amount of attenuation (absorption and scattering) of X-rays in the pencil of breast tissue vertically above $(x, y)$ on the film. Unfortunately, this information is confounded by scattered radiation from the surrounding tissue. We have shown (Highnam et al., 1994) how to estimate the scatter component for a mammogram, and thus how to find the primary component of the X-ray beam at each pixel and hence an attenuation measure.

Ideally, one might hope for a quantitative three-dimensional representation of the breast with each voxel labelled with a tissue type, such as: glandular, fibrous, cancerous, fat or calcium. Given the X-ray attenuation within a voxel it is certainly possible to classify fat since it has relatively low linear attenuation coefficients (Johns and Yaffe, 1987) as Table 1 shows. It is also possible to classify the likely occurrence of calcium, which is practically radio-opaque. However, the remaining breast tissues are those that comprise anatomically significant events in breast disease, such as cysts, malignant masses, fibroadenomas, and they are difficult to resolve from X-ray attenuation measurements alone. These observations led us to classify breast tissue into one of three types: 'interesting tissue', fat and calcium.

Table 1. X-ray linear attenuation coefficients for various breast tissue types reported by Johns and Yaffe (1987).

                                        µ (cm^-1) at energy (keV)
Tissue type                             18       20       25
Fat                          Minimum    0.538    0.441    0.314
                             Mean       0.558    0.456    0.322
                             Maximum    0.585    0.476    0.333
Fibrous                      Minimum    1.014    0.791    0.499
(Glandular)                  Mean       1.028    0.802    0.506
(Parenchymal)                Maximum    1.045    0.816    0.516
Infiltrating duct carcinoma  Minimum    1.061    0.826    0.519
                             Mean       1.085    0.844    0.529
                             Maximum    1.137    0.884    0.552

Unfortunately, a further problem arises because of the projective nature of mammographic imaging: the three-dimensional information is lost. In light of this, the only information that is available describes the tissue within a cone of the breast, where the cone has as its base the area of a pixel and as its apex the X-ray source. After appropriate correction we can consider the X-ray beam within this cone as a pencil beam. For our work, we consider calcium to have been detected (Karssemeijer, 1993) so that there are basically only two tissue classes to consider, and we use the thicknesses of interesting tissue ($h_{int}$ cm) and fat ($h_{fat}$ cm) as our quantitative breast measurements.

If the breast thickness $H$ is known then $H = h_{int} + h_{fat}$. A result of our scatter removal algorithm (Highnam et al., 1994) was to demonstrate that it is possible to estimate $h_{int}$ and $h_{fat}$. This means that we can convert an image into a representation which effectively assumes that the fat has risen to float on top of the interesting tissue surface, then peels off the fat leaving a representation $h_{int}(x, y)$. This concept, while simple, suffices for a surprising number of purposes as we see later (section 4). Informally, this representation can be viewed as a surface and clinically significant effects such as masses appear as features on this surface, e.g. small hills (Figures 2 and 4). Note that this is fundamentally different from regarding the intensity image as a surface, since the $h_{int}$ representation is a quantitative measure of anatomical tissue in vertical pencils of the breast. The importance of $h_{int}$ stems from the fact that it factors out the imaging parameters particular to the examination to yield a representation of the intrinsic anatomy, which is ultimately what is relevant for diagnosis. The key point about $h_{int}$ is that it is the basis for quantitative analysis of mammographic signs.

3. COMPUTATION OF THE $h_{int}$ SURFACE

Having defined $h_{int}$ we now look at how it is computed in practice for real mammographic images. The first subsection deals with the calibration data required to compute $h_{int}$. It is split into two parts: the first deals with system calibration, the second with image calibration. The actual computation of $h_{int}$ is next, and the importance of an accurate breast thickness measurement $H$ is stressed. The next subsection looks at improving the estimate of $H$ using the image itself and we end the section by considering the errors likely in $h_{int}$ due to errors in the calibration data.
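
As a rough numerical illustration of the two-tissue model of section 2, the sketch below evaluates the attenuation along a single pencil of breast tissue using the relation H = h_int + h_fat and the mean coefficients of Table 1. It is only an illustration under stated assumptions: the function name is hypothetical, the beam is taken to be monoenergetic (20 keV by default), and the mean fibrous coefficient stands in for 'interesting' tissue. The paper itself works with a polyenergetic beam.

```python
import math

# Mean linear attenuation coefficients (cm^-1) from Table 1 (Johns and Yaffe, 1987).
MU_FAT = {18: 0.558, 20: 0.456, 25: 0.322}
# ASSUMPTION: the fibrous-tissue mean is used as the "interesting" tissue coefficient.
MU_INT = {18: 1.028, 20: 0.802, 25: 0.506}


def primary_transmission(h_int_cm, H_cm, energy_kev=20):
    """Fraction of monoenergetic primary photons transmitted through one pencil.

    Hypothetical helper: the pencil holds h_int_cm of interesting tissue and
    (H_cm - h_int_cm) of fat, following H = h_int + h_fat.
    """
    if not 0.0 <= h_int_cm <= H_cm:
        raise ValueError("expected 0 <= h_int <= H")
    h_fat_cm = H_cm - h_int_cm
    exponent = h_int_cm * MU_INT[energy_kev] + h_fat_cm * MU_FAT[energy_kev]
    return math.exp(-exponent)


if __name__ == "__main__":
    # More interesting tissue in a 5 cm breast means fewer transmitted photons.
    for h_int in (0.0, 1.0, 2.0, 3.0):
        t = primary_transmission(h_int, 5.0)
        print(f"h_int = {h_int:.1f} cm -> transmission = {t:.4f}")
```

Because the interesting-tissue coefficient exceeds the fat coefficient at every tabulated energy, the transmission falls monotonically as h_int grows, which is what makes the inversion from measured energy to h_int possible.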

3.1. Calibrating the mammographic process

3.1.1. System calibration

The mammographic imaging process has several parts which might vary from day to day. In order to effect meaningful image analysis by computer, it is necessary to know these variations in order to make the images conform to a standard. To achieve this requires calibration data. In our work, we calibrate the film–screen response, film processor and film digitizer. We assume that the X-ray tube output spectrum is relatively stable but that the number of incident photons varies across the image (the anode heel effect). In order to calibrate these parts of the system we collect the following data:

• A step wedge film. We need a film with a lucite step wedge placed along the back of the film with a lucite block placed over the automatic exposure control. This film allows us to calibrate the film–screen system and film processing so that energy imparted to the intensifying screen can be related to film density.
• A 'blank' film. We need a film taken with a short exposure time with no object (breast) present. The exposure has to be short so that the film does not saturate. We find that an exposure of 0.04 s, at 100 mA and 28 kV produces a film that has film densities that vary between 1.8 and 2.6 (despite looking black). This film provides us with information about the spatial variations of the incident radiation intensity.
• The digitized image of the step wedge film. The film density on each step of the wedge is measured so that once digitized, the relationship between pixel value in the digital image and the film density in the corresponding area of the film is known.

3.1.2. Image calibration

As well as calibrating the system components we need to know data specific for each mammographic examination. In particular, we require:

• The tube voltage ($V_{tube}$ kV);
• The tube current ($I_{tube}$ mA);
• The time of exposure ($t_s$ s);
• The breast thickness ($H$ cm).

Most of this information is readily available, but measuring the breast thickness $H$ is currently awkward since the radiographer has to measure it using a ruler; though newer machines are incorporating automatic measurement of breast thickness. In both cases, the values of $H$ have been shown to be inaccurate (Burch and Law, 1995). After explaining how the $h_{int}$ values are generated, we detail ways of improving the accuracy of $H$ using the image itself and discuss the effects of errors in $H$ on the values of $h_{int}$. An improved estimate of $H$ allows a more accurate assessment of radiation dose. It should be noted that most mammography compression devices have a slant of about 0.5 cm across the breast image. Our computer software takes this slant into consideration, but in this paper we consider $H$ constant.

3.2. Computing $h_{int}$

Given a mammographic image, we seek to find the thicknesses of interesting and fatty tissue between the X-ray source and each pixel. To do this, we consider the energy imparted to the intensifying screen at each pixel since we can obtain this from the pixel values in the image using the calibration data. Let $E_{ps}(x, y)$ be the energy imparted to the screen in the area corresponding to the pixel $(x, y)$. $E_{ps}(x, y)$ contains both scatter and primary components. The primary component $E_p(x, y)$ is determined by subtracting a scatter estimate from the total energy imparted. The process of estimating the scatter component is explained fully in Highnam et al. (1994). We compare $E_p(x, y)$ to the primary energies that we expect to find in practice in order to determine $h_{int}(x, y)$.

For a pixel with $h_{int}$ cm of interesting tissue and $h_{fat}$ cm of fatty tissue above the corresponding area of the intensifying screen, the total attenuation at any energy $E$ is expected to be

$$h\mu(E, x, y) = h_{int}(x, y)\,\mu_{int}(E) + h_{fat}(x, y)\,\mu_{fat}(E) = h_{int}(x, y)\,(\mu_{int}(E) - \mu_{fat}(E)) + H\,\mu_{fat}(E), \qquad (1)$$

where we have substituted $h_{fat}(x, y) = H - h_{int}(x, y)$. In this case, the energy expected to be imparted to the intensifying screen by the primary photons is

$$E_p(x, y) = \phi(V_{tube}, x, y)\,A_p\,t_s \int_0^{V_{tube}} N_0^{rel}(E)\,E\,S(E)\,G(E)\,e^{-\mu_{luc}(E)\,h_{plate}}\,e^{-h\mu(E, x, y)}\,dE \qquad (2)$$

where $\phi$ is the photon flux for an X-ray tube voltage of $V_{tube}$, this varies across the image due to the anode heel effect; $A_p$ is the pixel area; $t_s$ is the time of exposure; $N_0^{rel}(E)$ is the relative number of photons at energy $E$; $S(E)$ is the absorption ratio of the screen to primary photons of energy $E$; $G(E)$ is the transmission ratio of the grid for primary photons of energy $E$; $\mu_{luc}(E)$ is the linear attenuation coefficient of lucite at energy $E$ and $h_{plate}$ is the thickness of the lucite compression plate.

Note that after substituting Equation (1) into Equation (2) the only unknown is $h_{int}(x, y)$. We equate the primary energy found in the practical case with the theoretical value and solve the resulting nonlinear equation to determine $h_{int}(x, y)$. Our experiments in over 100 clinical tests, carried out on women from a wide range of ethnic and socio-economic backgrounds with no pre-filtering, have shown that $h_{int}$ images can be computed reliably if a good estimate of breast thickness $H$ is available. The next section looks at improving the measurement of $H$, and then we look at errors that we might expect in the $h_{int}$ computation from errors in $H$.
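
The per-pixel solution of Equations (1) and (2) can be sketched as a one-dimensional root-finding problem. The example below is a minimal illustration, not the authors' implementation: the integral is replaced by a sum over five energies whose coefficients are taken from Table 2, the relative spectrum weights are invented for the example, and the flux, pixel area, exposure time, screen, grid and compression-plate terms are all folded into one hypothetical constant. Bisection is used here simply because the predicted primary energy decreases monotonically with h_int; the paper does not state which solver was used.

```python
import math

# Energies (keV) with mu_int and mu_fat (cm^-1) taken from Table 2 of the paper.
ENERGIES = [16, 17, 18, 19, 20]
MU_INT = [1.381, 1.184, 1.028, 0.903, 0.802]
MU_FAT = [0.718, 0.628, 0.558, 0.502, 0.456]
# ASSUMPTION: made-up relative photon numbers N0_rel(E), not a measured spectrum.
N0_REL = [0.10, 0.25, 0.30, 0.25, 0.10]
# ASSUMPTION: phi * A_p * t_s, S(E), G(E) and the lucite term folded into one constant.
SYSTEM_GAIN = 1.0


def predicted_primary(h_int, H):
    """Discretised Equation (2): primary energy expected for a given h_int (cm)."""
    total = 0.0
    for e, n_rel, mu_i, mu_f in zip(ENERGIES, N0_REL, MU_INT, MU_FAT):
        h_mu = h_int * (mu_i - mu_f) + H * mu_f  # Equation (1)
        total += n_rel * e * math.exp(-h_mu)
    return SYSTEM_GAIN * total


def solve_h_int(measured_primary, H, lo=-2.0, hi=None, tol=1e-6):
    """Find h_int by bisection so that predicted and measured primary energy agree.

    The bracket extends below zero because pixels near the breast edge are
    expected to give h_int < 0 (section 3.3).
    """
    hi = H + 2.0 if hi is None else hi
    if not predicted_primary(hi, H) <= measured_primary <= predicted_primary(lo, H):
        raise ValueError("measured primary energy lies outside the search bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predicted_primary(mid, H) > measured_primary:
            lo = mid  # prediction too bright: more interesting tissue is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    H = 5.0
    e_p = predicted_primary(2.0, H)      # synthetic "measurement" for h_int = 2 cm
    print(round(solve_h_int(e_p, H), 3))
```

In practice the same solve would be repeated for every pixel, with the position-dependent flux term supplying the anode heel correction that is ignored in this sketch.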

3.3. Improving the estimate of $H$

There are many checks that can be made on the computed $h_{int}$ values. For example, we can compute the ratios of interesting tissue to fat within a breast and can judge visually whether the results are realistic. Another useful indicator of the accuracy of the calibration data and of the scatter estimate is the set of scatter-to-primary ratios across the mammographic image. Typically we expect a minimum of about 0.05, an average of about 0.12 and a maximum of about 0.22. For both these tests it is possible to correct $H$ to achieve satisfactory results. However, we note that even a small variation in $H$ may result in large changes to certain calculated features. For example, clinical tests carried out at Oxford (Highnam, 1992) showed that if one has good calibration data, one finds good consistency between the percentage of interesting tissue within the breast and the visual appearance of the image. For example, a dense looking breast would have a high percentage of interesting tissue while a fatty looking breast would have a low percentage. This percentage was computed as

$$100.0 \times \frac{V_{\text{interesting tissue}}}{V_{total}} \qquad (3)$$

where $V$ is the volume of the specified tissue. A rise in $H$ leads to a large increase in the total volume (since any change is multiplied by the projected area size). But a rise in $H$ also gives a large decrease in the total volume of interesting tissue so that in Equation (3) the numerator becomes smaller and the denominator increases, resulting in a large change in the percentage. Clearly, $H$ is a crucial variable and we now consider ways to improve the initial estimate.

There are two important bounds on the values of $h_{int}$. The first is that we do not expect to find any pixels where $h_{int} > H$. The second is more subtle: it is that we always expect some pixels for which $h_{int} < 0$; this occurs around the breast edge where the actual amount of breast tissue between the compression plates reduces quickly from $H$ to zero, and for which the measured attenuation is too low to have been from $H$ cm (the reported breast thickness) of pure fat. Moreover, we expect the line on the image at $h_{int} = 0$ to be quite smooth, since the breast is enclosed in a layer of fat, although a possible exception is where ligaments cross the layer of fat that borders the breast and join the skin. It should also be noted that sometimes the digitizer can cut off a large portion of the breast edge, and this may prevent setting $H$ to give a certain proportion of breast tissue versus edge tissue.

The two bounds are useful because the effects they bound react differently to changes in breast thickness $H$. As $H$ increases the predicted primary component in Equation (2) decreases and $h_{int}$ decreases accordingly to match the actual and predicted values. If $H$ increases sufficiently, pixels well inside the breast edge begin to have values of $h_{int} < 0$ and this makes the $h_{int} = 0$ line less smooth, possibly disjointed. If $H$ decreases the predicted primary rises and $h_{int}$ increases accordingly and it is possible that not only will values rise above $H$, but also that there will be no pixels found with $h_{int} < 0$. Figure 5 shows the results when $H$ is too low, and when $H$ is too high. In some clinical settings, we have found that many of the breast thicknesses have to be altered, either because of poor estimates or other poor calibration data. In some cases, the breast thickness $H$ has to be changed by about 0.5 cm. However, there appears to be a narrow range of plausible values of $H$ and this is likely to decrease further when the smoothness of the $h_{int} = 0$ line is also taken into account.

Figure 5. This shows images with luminance proportional to $h_{int}$ except for the bright white 'breast edge' which is where $h_{int} < 0$. The breast thicknesses tried were: 6.4 cm, top left; 6.0 cm, top right; 5.4 cm, bottom left; 3.4 cm, bottom right. The top two estimates are too high (the breast edge is too ragged) whilst the bottom right estimate of $H$ is too low: there is no breast edge.

3.4. Errors in $h_{int}$ due to errors in calibration data

Recall from the previous sections that $h_{int}(x, y)$ is computed by adjusting the value at each pixel until the theoretical primary energy matches the measured primary energy. The polyenergetic nature of the incident X-ray beam, and the resulting integral equation for the theoretical energy, make a closed-form analysis impossible, but simplifying to a monoenergetic equation removes many of the variables. In the analysis that follows, we use a mixture of analysis of the monoenergetic case supported by numerical analysis in the polyenergetic case.
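
The two bounds on h_int described in section 3.3, together with the percentage of Equation (3), lend themselves to a simple automatic check on a trial breast thickness. The following is a minimal sketch under stated assumptions rather than the procedure used in the paper: the function name and return structure are hypothetical, H is taken as constant with equal pixel areas, and pixels with h_int < 0 are treated as breast edge and left out of the volume calculation. The smoothness of the h_int = 0 line, which the text also mentions, is not tested here.

```python
def check_thickness_estimate(h_int_values, H):
    """Summarise plausibility checks for a trial breast thickness H (cm).

    h_int_values: flat list of h_int (cm), one value per pixel in the breast region.
    """
    if H <= 0 or not h_int_values:
        raise ValueError("need H > 0 and at least one pixel")
    # Bound 1: no pixel should hold more interesting tissue than the breast is thick.
    n_above_H = sum(1 for h in h_int_values if h > H)
    # Bound 2: some pixels near the breast edge are expected to give h_int < 0.
    n_negative = sum(1 for h in h_int_values if h < 0.0)
    # ASSUMPTION: negative pixels are breast edge and are excluded from the volumes.
    interior = [h for h in h_int_values if h >= 0.0]
    # With constant H and equal pixel areas, the pixel area cancels in Equation (3).
    if interior:
        percent_interesting = 100.0 * sum(interior) / (H * len(interior))
    else:
        percent_interesting = 0.0
    return {
        "n_above_H": n_above_H,
        "n_negative": n_negative,
        "percent_interesting": percent_interesting,
        "plausible": n_above_H == 0 and n_negative > 0,
    }


if __name__ == "__main__":
    pixels = [-0.2, 1.0, 2.0, 3.0]
    print(check_thickness_estimate(pixels, 5.0))
```

A trial value of H that fails either bound would be adjusted and h_int recomputed; the percentage can then be compared with the visual density of the mammogram as a further sanity check.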

In the monoenergetic simulation, the theoretical primary energy [Equation (2)] becomes

$$E_p(x, y) = \phi(x, y)\,A_p\,t_s\,E\,S\,G\,e^{-\mu_{luc} h_{plate}} \times e^{-h_{int}(x, y)(\mu_{int} - \mu_{fat}) - H\mu_{fat}}. \qquad (4)$$

Rearranging to give an explicit equation for $h_{int}$:

$$h_{int}(x, y) = \frac{H\mu_{fat}}{\mu_{fat} - \mu_{int}} + \frac{\ln E_p(x, y) - \ln\left(\phi(x, y)\,A_p\,t_s\,E\,S\,G\,e^{-\mu_{luc} h_{plate}}\right)}{\mu_{fat} - \mu_{int}}. \qquad (5)$$

Table 2. Differences between linear attenuation coefficients over energies of interest.

E     µ_int    µ_fat    µ_int − µ_fat
14    1.966    0.982    0.985
15    1.635    0.832    0.803
16    1.381    0.718    0.664
17    1.184    0.628    0.556
18    1.028    0.558    0.470
19    0.903    0.502    0.401
20    0.802    0.456    0.346
21    0.719    0.419    0.301
22    0.651    0.388    0.263
23    0.594    0.362    0.232
24    0.546    0.340    0.205
25    0.505    0.322    0.183
26    0.471    0.307    0.164
27    0.441    0.293    0.148
28    0.416    0.282    0.134
29    0.393    0.272    0.122
30    0.374    0.263    0.111

The first column shows photon energy; the second shows the linear attenuation coefficient (cm^-1) for 'interesting' tissue; the third column shows the linear attenuation coefficient for fat; the fourth column shows the difference between the coefficients. The difference is an important variable in many equations. The energy range of most interest is between 16 and 20 keV. It is this energy range that contains the vast majority of the X-ray photons.

Let $H'$ be the estimated breast thickness with $h'_{int}$ the resulting $h_{int}$ values, and let $E'_s$ and $E'_p$ be the scatter and primary estimates obtained with $H'$, so that $E_{ps} = E_p + E_s = E'_p + E'_s$. Then subtracting Equation (5) for $H$ and $H'$:

$$h'_{int}(x, y) = h_{int}(x, y) + \frac{(H' - H)\mu_{fat}}{\mu_{fat} - \mu_{int}} + \frac{\ln E'_p(x, y) - \ln E_p(x, y)}{\mu_{fat} - \mu_{int}}$$

$$= h_{int}(x, y) + \frac{(H' - H)\mu_{fat}}{\mu_{fat} - \mu_{int}} + \frac{\ln(E_{ps}(x, y) - E'_s(x, y))}{\mu_{fat} - \mu_{int}} - \frac{\ln(E_{ps}(x, y) - E_s(x, y))}{\mu_{fat} - \mu_{int}}$$

$$= h_{int}(x, y) + \frac{(H' - H)\mu_{fat}}{\mu_{fat} - \mu_{int}} + \frac{1}{\mu_{fat} - \mu_{int}} \ln\frac{E_{ps}(x, y) - E'_s(x, y)}{E_{ps}(x, y) - E_s(x, y)}. \qquad (6)$$

An error in the estimate of breast thickness results in an error in the estimation of the scatter component $E_s$ (and so primary) and an error in the analysis of the attenuation measure. We shall show that the change in scatter component has little effect compared to the error in the analysis of the attenuation measure, and that this error only causes a uniform translation of $h_{int}$ values, i.e. $h'_{int}(x, y) = h_{int}(x, y) + \text{constant}$.

Concentrating on the last term of Equation (6), we can simplify the ratio inside the logarithm:

$$\frac{E_{ps}(x, y) - E'_s(x, y)}{E_{ps}(x, y) - E_s(x, y)} = \frac{E_p(x, y) + E_s(x, y) - E'_s(x, y)}{E_p(x, y)} = 1 + \frac{E_s(x, y)}{E_p(x, y)} - \frac{E'_s(x, y)}{E_p(x, y)}. \qquad (7)$$

From the work of Carlsson et al. (1989) we know that for any block of breast tissue of a certain composition there is an approximately linear relationship between the scatter-to-primary ratio and breast thickness. From their values we deduce that the rate of change of the scatter-to-primary ratio with breast thickness is about 0.02 cm^-1, so Equation (7) can be written as

$$\frac{E_{ps}(x, y) - E'_s(x, y)}{E_{ps}(x, y) - E_s(x, y)} = 1 + 0.02(H - H').$$

Equation (6) becomes

$$h'_{int}(x, y) = h_{int}(x, y) + \frac{(H' - H)\mu_{fat}}{\mu_{fat} - \mu_{int}} + \frac{\ln(1 + 0.02(H - H'))}{\mu_{fat} - \mu_{int}}.$$

In the energy range of interest $\mu_{fat}$ is about 0.5 cm^-1 or more (see Table 2). Moreover, in practice $|H - H'| < 0.5$ cm, so that $|\ln(1 + 0.02(H - H'))|$ is at most about 0.01. Since $\ln(1 + x) \approx x$ for small $x$, the third term is only about $0.02/\mu_{fat}$, i.e. a few per cent, of the second term. It is therefore negligible and we can remove it from the equation:

$$h'_{int}(x, y) = h_{int}(x, y) + \frac{(H' - H)\mu_{fat}}{\mu_{fat} - \mu_{int}} = h_{int}(x, y) + \frac{(H - H')\mu_{fat}}{\mu_{int} - \mu_{fat}}.$$

Rewriting this equation using $\Delta$ to represent change in the variable, with $\Delta h_{int} = h'_{int} - h_{int}$ and $\Delta H = H - H'$:

$$\Delta h_{int}(x, y) = \Delta H\,\frac{\mu_{fat}}{\mu_{int} - \mu_{fat}}.$$

The amplifying term, $\mu_{fat}/(\mu_{int} - \mu_{fat})$, is always greater than 1.0 for the energy range of interest (see Table 2). This means that $H - H'$ has quite a major effect on the calculation of $h_{int}$. However, the equation suggests that $h_{int}$ is linearly related to
$H - H'$, and this can be shown in the polyenergetic case by considering specific values or by studying histograms of the $h_{\mathrm{int}}$ image. Table 3 shows the change for various $h_{\mathrm{int}}$ values in different circumstances in the polyenergetic case. The change in $h_{\mathrm{int}}$ is nearly constant. The linear relationship reveals a translation of $h_{\mathrm{int}}$ values with different $H$ values, and this turns out to be useful later on when choosing features.

Table 3. $h_{\mathrm{int}}$ values calculated when a poor breast thickness estimate has been entered.

Actual H (cm)   Estimated H (cm)   Actual h_int (cm)   Estimated h_int (cm)
4.0             3.0                1.0                 2.15
4.0             3.0                2.0                 3.17
4.0             3.0                3.0                 4.15
4.0             3.5                1.0                 1.53
4.0             3.5                2.0                 2.54
4.0             3.5                3.0                 3.55
4.0             4.5                1.0                 0.31
4.0             4.5                2.0                 1.29
4.0             4.5                3.0                 2.28
4.0             5.0                1.0                 -
4.0             5.0                2.0                 0.67
4.0             5.0                3.0                 1.65
6.0             5.0                1.0                 2.18
6.0             5.0                2.0                 3.20
6.0             5.0                3.0                 4.22
6.0             5.0                4.0                 5.24
6.0             5.5                1.0                 1.54
6.0             5.5                2.0                 2.55
6.0             5.5                3.0                 3.56
6.0             5.5                4.0                 4.57
6.0             6.5                1.0                 0.28
6.0             6.5                2.0                 1.27
6.0             6.5                3.0                 2.25
6.0             6.5                4.0                 3.23
6.0             7.0                1.0                 -
6.0             7.0                2.0                 0.63
6.0             7.0                3.0                 1.60
6.0             7.0                4.0                 2.57

The first column shows the actual breast thickness in cm and the second shows the estimated breast thickness. The third column shows the actual $h_{\mathrm{int}}$ thickness and the fourth shows the thickness estimated from the estimated breast thickness. The error in $h_{\mathrm{int}}$ is near constant for each circumstance even in the polyenergetic case.

When $h_{\mathrm{int}}$ is computed, $H$ is adjusted until the breast edge is suitably smooth (see section 3.3). This adjustment of $H$ leads to feasible $h_{\mathrm{int}}$ values, effectively compensating for any errors. However, this means that the absolute values cannot be trusted unless careful calibration has taken place, although it does suggest that differences in the $h_{\mathrm{int}}$ values might be used quantitatively. Previous work at Oxford (Highnam, 1992) suggested that careful calibration is possible, and results from qualitatively comparing percentages of interesting tissue with visual appearance were very promising, albeit in a small number of cases. The adjustment of $H$ to compensate for all errors leaves the potential for implausible values of $H$, either too high or far too low. However, as we pointed out earlier, the $h_{\mathrm{int}}$ surface has been computed successfully for over 100 mammograms taken from over 48 women from a wide range of ethnic and socioeconomic backgrounds. The mammograms were obtained from clinics in Oxford, UK and New Jersey, USA and the women were in the age range 50–65. Twenty-five per cent of the mammograms contained carcinomas.

4. APPLICATIONS OF $h_{\mathrm{int}}$

Now that we have described the $h_{\mathrm{int}}$ surface, and shown how it can be computed from a mammogram image, in the following sections we put it to work on four applications. First, we sketch the use of the $h_{\mathrm{int}}$ surface in image restoration and enhancement. In particular, $h_{\mathrm{int}}$ lies at the heart of our technique for scatter removal (Highnam et al., 1994) and for simulating a scatter-free monoenergetic examination. If $D_{\mathrm{given}}(x, y)$ is the original (two-dimensional) X-ray density image and $h_{\mathrm{int}}(x, y)$ is the (2.5-dimensional) interesting tissue surface, then this application can be represented by

$$D_{\mathrm{given}}(x, y) \to h_{\mathrm{int}}(x, y) \to D_{\mathrm{enhanced}}(x, y).$$

Secondly, we present results that show how the $h_{\mathrm{int}}$ surface can be transformed prior to display in order to simulate change in anatomical structure or breast tissue. In particular, we can simulate the appearance of masses of various types in various contexts. There are many reasons for doing this, including developing a teaching tool for radiologists. This application can be represented by

$$D_{\mathrm{given}}(x, y) \to h_{\mathrm{int}}(x, y) \to h'_{\mathrm{int}}(x, y) \to D_{\mathrm{simulated}}(x, y). \tag{8}$$

Thirdly, we consider how $h_{\mathrm{int}}$ might be used to normalize mammographic images. We start by showing how conventional ideas about normalization fail in mammography. This is followed by a discussion of how, using $h_{\mathrm{int}}$ values and choosing appropriate image measurements to overcome the superimposition problems, we can attain reliable measurements. The last section uses the normalization ideas to show how we can differentiate between cysts and fatty tissues, and we again show the superiority of our technique over conventional techniques.

It should be noted that another application of $h_{\mathrm{int}}$ is in estimating the breast thickness. Other techniques have been shown to be inaccurate or cumbersome (Burch and Law, 1995); it would be far better to find the thickness from the breast
image itself. This is important because knowledge of the breast thickness allows an estimate of the radiation dose during the examination, and, in fact, the $h_{\mathrm{int}}$ representation allows a far more accurate estimation than is currently possible since it contains information specific to each breast, and is not a generic breast model.

5. IMAGE RESTORATION AND ENHANCEMENT

Once all the imaging conditions have been removed and the $h_{\mathrm{int}}$ representation computed, any projective X-ray examination can be simulated. This enables us to produce images which show the removal of known degrading factors such as scattered radiation and beam hardening. Because the algorithms are based on the physics of the imaging process they are less likely to introduce artefacts than conventional algorithms. Since users understand how the algorithms work they are better placed to diagnose from the enhanced image. Of course, as we noted earlier, the information contained intrinsically in a mammographic image is limited (projective, high noise to signal, poor discrimination of tissue type), but our work demonstrates that there is information on the films that is not usually seen so that we can be sure of performing what radiologists would term image enhancement. However, note that we do not seek to perform exact simulations: our goal is to enhance the image, so we do not, for example, seek to amplify the noise just in order to be accurate.

In this section we look at four algorithms, which simulate: an even incident radiation intensity; a change in time of exposure; removing the effects of scatter; changing the X-ray source. The first two of these can be simulated without going to the $h_{\mathrm{int}}$ representation but are both required for the last two algorithms, which are the major innovations and are based directly upon the $h_{\mathrm{int}}$ representation. Further algorithms have been developed which allow the user to simulate any film–screen characteristic curve (Highnam, 1992), and the use of all the algorithms can be seen in the accompanying CD-ROM, or on the video available from the authors.

Note that although we discuss film density images, which is where the pixel values are related linearly to film density, all the images that we display are transmitted light images; that is, the displayed images are adjusted so that the luminance from the monitor is directly proportional to the intensity of the light transmitted through the film when it is placed on a light box.

5.1. Simulating an even incident radiation intensity

The ‘anode heel effect’ arises because of the way in which the X-rays are generated and leads to spatially varying incident radiation intensity over the breast. Although not necessarily a degrading effect, it is necessary to be able to compensate for it in the computation of $h_{\mathrm{int}}$ and for any reliable image analysis (see section 7.1.2).

The mammographic image can be corrected by using the data obtained by taking a blank image at low exposure with no ‘object’ present. As described earlier, this presents a dark mammogram that includes the spatial variation of the incident intensity. We correct for the anode heel effect in the energy imparted representation, i.e. after converting film density to energy imparted. The corrected energy imparted is

$$E_{ps}^{\mathrm{corrected}}(x, y) = E_{ps}(x, y)\,\frac{\phi(V_{\mathrm{tube}}, x_{\mathrm{anode}}, y_{\mathrm{anode}})}{\phi(V_{\mathrm{tube}}, x, y)} = E_{ps}(x, y)\,\frac{E_{ps}^{\mathrm{blank}}(x_{\mathrm{anode}}, y_{\mathrm{anode}})}{E_{ps}^{\mathrm{blank}}(x, y)}$$

where $E_{ps}^{\mathrm{blank}}(x, y)$ are the energies imparted to the screen when the blank film is taken and $(x_{\mathrm{anode}}, y_{\mathrm{anode}})$ is the position on the film directly below the anode, which is where the film density on the blank film is at its maximum.

5.2. Simulating a change of time of exposure

The time of exposure is set for each mammographic examination by an automatic exposure control (AEC) which terminates the exposure once a certain amount of radiation has been absorbed. Occasionally, the AEC will malfunction, producing images that are under- or overexposed, or the tissue above the AEC will be different between left and right breasts, leading to different exposures and making it hard to compare the images. In both cases, it is possible to simulate a change in exposure time, and hence a change in image brightness. Note the point made in the Introduction: the radiographer understands the effect of changing the exposure time, hence trusts using the corresponding Xmammo button and the algorithm it calls.

There are two ways that the radiographer might change the time of exposure. One is through direct manipulation of the mA s (milliampere seconds) value given at the time of exposure; the other is through manipulation of the AEC (both its position and density setting). Under- or overexposure of the image can result from poor positioning of the AEC. Modelling the AEC as giving an average energy imparted over a certain area has been shown to be a good model. To simulate a change in the positioning of the AEC, one computes the current average energy imparted at the new area, $E^{\mathrm{got}}(\mathrm{AEC})$, and uses the required energy $E^{\mathrm{desired}}(\mathrm{AEC})$ (computed by specifying a desired average film density) to compute the required increase in time of exposure:

$$\frac{t_s^{\mathrm{desired}}}{t_s^{\mathrm{got}}} = \frac{E^{\mathrm{desired}}(\mathrm{AEC})}{E^{\mathrm{got}}(\mathrm{AEC})}.$$

All the energies in the image are multiplied by this ratio, and
then converted back to film densities to show the simulated image.

5.3. Simulating removing the effects of scatter

The next level of enhancement is to remove the effects of scattered radiation. Scatter causes a mammographic image to appear blurred, and its removal leads to clearer, sharper images. The scatter model that we use is based upon using the image to estimate the tissue composition around each pixel, and then using this composition to estimate the scatter at the pixel. The reader is referred to Highnam et al. (1994) for the details of scatter estimation and scatter removal. Removal of scatter is the key step in determining the $h_{\mathrm{int}}$ values. After scatter removal the images usually have to be made darker since they now appear underexposed; this can be done using the AEC model above.

5.4. Simulating changing the X-ray source

Currently, the most advanced model-based enhancement is achieved by simulating monoenergetic, scatter-free examinations. Switching to a monoenergetic beam removes the effects of beam hardening. Beam hardening causes a loss of contrast in dense or thick breasts since, as the beam passes through the breast tissue, the lower energy photons are more likely to be absorbed, and so the average photon energy rises. In practice, X-ray sources are chosen to produce a beam that gives high contrast between tissue types, but keeps the dosage low. Low energy gives greater attenuation, and hence contrast, but also greater absorption into the breast tissue. The $h_{\mathrm{int}}$ representation enables us to optimize appearance without worrying about dose.

5.5. Simulation example

We can represent the operations required to perform a monoenergetic, scatter-free examination as follows:

$$D \to E \to E_{\mathrm{anode}} \to E_{p} \to h_{\mathrm{int}} \to E_{\mathrm{mono}} \to D_{\mathrm{enhanced}}.$$

Figure 6 shows an original mammographic image, and Figure 7 shows the result from a scatter-free, monoenergetic simulation. Note that since the scatter removal is a high-pass filter we can be sure that signs of calcium will only be enhanced.

Figure 6. An original mammographic image. The imaging process has many degrading factors. Figure 7 shows the result of simulating a scatter-free monoenergetic examination.

Figure 7. This shows the result of simulating a scatter-free monoenergetic examination on the breast shown in the original mammogram in Figure 6.

6. SIMULATING THE APPEARANCE OF ANATOMICAL STRUCTURES

We hope that by learning how to realistically simulate anatomical structures such as cysts and spiculated masses we can gain insight into:

• The 3D shape of anatomical structures found within the breast;
• The effects on anatomical structures of physical compression from different directions;
• The way that abnormal (both malignant and benign) structures interact with the local tissues;
• The tissue composition of anatomical structures.

This information will be useful for image processors in that it will help in devising reliable and robust features for
image analysis, but also for radiologists in that it will help


them connect two-dimensional shapes and intensities to 3D
shape and composition. Furthermore, since constructing large
mammographic databases has proven to be a major problem,
simulation could prove helpful in testing image analysis
algorithms as well as being a teaching tool for radiologists.

6.1. Simulating a cyst

We now show how to use the $h_{\mathrm{int}}$ surface to simulate how a cyst might appear in a breast image. The breast image we use is from a real asymptomatic patient. Our aim is to show what the image would look like if a cyst were to develop in the breast. To make the point as simply as possible, in an initial experiment, we model the cyst simply as a sphere. Let $h_{\mathrm{cyst}}(x, y)$ be the corresponding interesting tissue surface, where the value at each pixel represents the thickness of interesting tissue due to the cyst. This thickness is found by multiplying the thickness of the sphere by a mass density value:

$$h_{\mathrm{cyst}}(x, y) = h_{\mathrm{sphere}}(x, y) \times \text{mass density}.$$
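As an illustration only, the spherical cyst model can be sketched in a few lines of Python. The grid size, pixel spacing, radius and background value below are hypothetical, and the function names are invented for this sketch. The last step simply adds the cyst thickness to a breast surface on the pixels that the projected sphere covers. The mass density of 0.5 is the value quoted in this section for the simulated cysts.

```python
import math

def sphere_thickness(radius_cm, pixel_cm, size):
    """Chord length (cm) through a sphere at each pixel of a size x size grid.

    The sphere is centred on the grid; pixels outside its projected disc get 0.
    """
    centre = (size - 1) / 2.0
    grid = []
    for row in range(size):
        line = []
        for col in range(size):
            # Distance of this pixel from the projected sphere centre, in cm.
            d = math.hypot(row - centre, col - centre) * pixel_cm
            line.append(2.0 * math.sqrt(radius_cm**2 - d**2) if d < radius_cm else 0.0)
        grid.append(line)
    return grid

def implant_cyst(h_breast, h_sphere, mass_density):
    """Add h_cyst = h_sphere * mass_density to the breast surface.

    Pixels not covered by the sphere have h_sphere == 0 and are left unchanged.
    """
    return [
        [hb + hs * mass_density for hb, hs in zip(b_row, s_row)]
        for b_row, s_row in zip(h_breast, h_sphere)
    ]

# Hypothetical 11 x 11 patch of fatty breast: 0.2 cm of interesting tissue everywhere.
h_breast = [[0.2] * 11 for _ in range(11)]
h_sphere = sphere_thickness(radius_cm=0.5, pixel_cm=0.1, size=11)
h_new = implant_cyst(h_breast, h_sphere, mass_density=0.5)

print(round(h_new[5][5], 3))  # centre pixel: 0.2 + 1.0 * 0.5 = 0.7
print(round(h_new[0][0], 3))  # corner pixel lies outside the sphere: 0.2
```

A larger mass density in the same sketch raises the implanted thickness proportionally, which is the single parameter the text varies between cysts and malignant masses.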
The mass density value represents the proportion of interesting tissue to fatty tissue within the cyst and is assumed constant throughout the cyst. This proportion would be higher for a malignant mass than for a cyst.

The simulation starts, as foreshadowed by Equation (8), where the breast image is converted to its interesting tissue surface representation, $h_{\mathrm{breast}}$. Let $S$ be the set of pixels that the sphere covers when projected onto the film; then the new surface is given as follows:

$$h'_{\mathrm{breast}}(x, y) = \begin{cases} h_{\mathrm{breast}}(x, y), & (x, y) \notin S \\ h_{\mathrm{breast}}(x, y) + h_{\mathrm{cyst}}(x, y), & (x, y) \in S. \end{cases}$$

Note that in doing this, we are implicitly assuming not only that the cyst develops separately from the breast tissue, but that it does not displace interesting tissue either, only fat. Although this might be appropriate for benign lesions, it is unlikely to be correct for malignant lesions, where in radiology parlance, ‘masses pull in the surrounding tissues’. The mammographic imaging process is then simulated (including scatter) to show how the new $h'_{\mathrm{breast}}$ surface would appear. For example, Figure 8 shows the result from the simulation. There are three circular shapes in the image; the two away from the nipple are simulated whilst the one near the nipple is a genuine cyst. The two simulated masses have mass densities of 0.5 and appear to a radiologist to be realistic. We believe that the mass density will turn out to be a key parameter in differentiating between malignant and benign masses.

6.2. Simulating malignant lesions

In this simulation, we extract an outline malignant lesion shape from one image, and ‘place’ it into another image. The underlying three-dimensional shape is again assumed to be spherical. This is implemented using binary morphological erosion to define layers within the mass, and then the spherical model is used to find the thickness of the mass for each layer. This thickness is multiplied by a mass density value (the proportion of interesting tissue-to-fat for the mass) to attain the thickness of interesting tissue at each pixel due to the mass.

Figure 8 shows two irregular shapes which were produced by our simulation of a malignant mass. The outline mass shape is taken from an actual mammographic image (Cerneaz, 1994). The mass density used was 0.75 for both examples, but one was placed on fatty tissue whilst the other was placed on denser tissue. Revising this density and devising ways of building the masses into the surrounding tissues will be areas for future work.

Figure 8. This fatty breast has had two spherical volumes implanted into it to simulate cysts, and two irregular shaped volumes to simulate malignant lesions. The cyst volumes are ‘filled’ with tissue which has a lower interesting tissue percentage than for the two irregular malignant volumes. The circular shape in the top right of the image, near the nipple, is a genuine cyst. The irregular shaped masses have the shape of a malignant lesion from a different mammogram.

6.3. Simulating curvilinear structures

Further simulations involve much smaller anatomical structures, namely curvilinear structures (CLS). Removal of the CLS from an image allows better detection of tumours and in itself can provide useful information about the location of objects such as calcification (Cerneaz et al., 1994, 1995). Simulating various size structures and matching with actual images allows models of the CLS to be proposed and their
appearance in images and responses to feature detectors calculated. The model we have used so far is that the CLS has elliptic cross-sections. Figure 9 shows various CLS simulations.

Figure 9. In this example four linear structures have been ‘implanted’ into the breast and a polyenergetic X-ray simulation performed. The linear structure on the left apparently passes through the area of increased density (brightness) much as a vessel might if it were really above or below the tissue volume that is projected onto that area. The four implanted structures all have elliptic cross-sections of varying radii. Varying the cross-sectional model allows us to investigate the effects of breast compression on curvilinear structures by comparing the simulations with real images.

7. NORMALIZATION

Conventionally, normalization takes the form of scaling an image to cover a standard range of pixel values or mapping it to have a standard mean; normalization of the feature measurements might involve finding the average feature measurement over each image and using it as a normalizing factor. Although these techniques might sometimes be appropriate in simple imaging processes, the mammographic imaging process is far too complicated for conventional normalization to deal with all the potential changes in imaging conditions and surrounding tissue. Indeed, ad hoc normalization can add or remove the very changes which the features were selected to detect. Instead, we argue that processing should be performed on $h_{\mathrm{int}}$ images (with no further normalization) but that the feature measurements should take account of objects being projected onto different surrounding tissues (i.e. the background changes).

We start by demonstrating the failure of the conventional approach when there are variations in the imaging process. We find that conventional normalization techniques are ineffective both for intra-image variations (e.g. the anode heel effect) and inter-image variations. The latter is a critical point and raises serious questions about work done to date on the application of neural networks to mammographic breast cancer diagnosis (see Tarassenko et al., 1995 for a discussion). This is followed by the arguments that show that $h_{\mathrm{int}}$ images should not be normalized and how to choose feature measurements robust to different surrounding tissue.

7.1. The conventional approach and its limitations

In this section we look at a standard normalized feature, contrast, and show how it changes between images due to digitization and film–screen processing variations. We then show how intra-image variations occur due to the anode heel effect.

7.1.1. Film processing conditions

Most reported mammographic image processing is performed on film density images, that is, the images have pixel values linearly related to film density. These images tend to be used for several reasons, including their suitability for digital storage without losing too much information due to discretization and the absence of variations due to varying illumination conditions in the digitization process.

To illustrate the susceptibility of conventional features to changes in the imaging conditions, consider ‘contrast’ in a film density ($D$) image. Let $P(x, y) = \alpha D(x, y) + \eta$ be the linearly digitized image. One popular definition of contrast in a window of the image is

$$C = \frac{P_{\max} - P_{\min}}{P_{\max} + P_{\min}} = \frac{\alpha D_{\max} + \eta - \alpha D_{\min} - \eta}{\alpha D_{\max} + \alpha D_{\min} + 2\eta} = \frac{D_{\max} - D_{\min}}{D_{\max} + D_{\min} + 2\eta/\alpha}.$$

Note that already contrast depends on the digitizer transform. For analysis we assume a linear relationship between film density and the logarithm of the energy imparted to the intensifying screen:

$$D = \gamma \log_{10}(\beta E)$$

where $\gamma$ is the film–screen gradient and $\beta$ is related to the speed of the film–screen system. We can then expand the contrast definition to investigate the effects of the film processing conditions on the feature:

$$C = \frac{\gamma \log \beta E_{\max} - \gamma \log \beta E_{\min}}{\gamma \log \beta E_{\max} + \gamma \log \beta E_{\min} + 2\eta/\alpha} = \frac{\log E_{\max} - \log E_{\min}}{\log E_{\max} + \log E_{\min} + 2\log\beta + 2\eta/(\gamma\alpha)}.$$
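The dependence of this contrast feature on the imaging conditions can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes the logarithms to be base 10, borrows the typical parameter values quoted in section 7.1.2, and uses invented function names. It evaluates the contrast both from the digitized pixel values and from the expanded form, then repeats the calculation with a different film gradient and with all energies reduced by a common factor.

```python
import math

def film_density(energy, gamma, beta):
    """Film density from energy imparted: D = gamma * log10(beta * E)."""
    return gamma * math.log10(beta * energy)

def contrast(e_max, e_min, gamma, beta, alpha, eta):
    """Contrast (P_max - P_min) / (P_max + P_min) with P = alpha * D + eta."""
    p_max = alpha * film_density(e_max, gamma, beta) + eta
    p_min = alpha * film_density(e_min, gamma, beta) + eta
    return (p_max - p_min) / (p_max + p_min)

def contrast_expanded(e_max, e_min, gamma, beta, alpha, eta):
    """The same contrast written directly in terms of log energies."""
    num = math.log10(e_max) - math.log10(e_min)
    den = (math.log10(e_max) + math.log10(e_min)
           + 2 * math.log10(beta) + 2 * eta / (gamma * alpha))
    return num / den

# Illustrative values only (taken from those quoted in section 7.1.2).
params = dict(beta=1.648e11, alpha=-93.0, eta=279.0)
e_min, e_max = 1.5e-11, 4.1e-11

c_gamma3 = contrast(e_max, e_min, gamma=3.0, **params)
c_gamma2 = contrast(e_max, e_min, gamma=2.0, **params)
c_scaled = contrast(0.85 * e_max, 0.85 * e_min, gamma=3.0, **params)

print(c_gamma3, contrast_expanded(e_max, e_min, gamma=3.0, **params))
print(c_gamma2)  # same tissue, different film gradient
print(c_scaled)  # same tissue, all energies reduced by 15%
```

The two forms agree to rounding error, while a change of film gradient or of overall exposure changes the feature value for identical tissue. The printed figures depend on the assumed log base and on rounding of the inputs, so they need not coincide with values quoted elsewhere in the text.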
Clearly, this contrast measure is highly susceptible to changes in the imaging conditions. Figure 10 shows the relative distribution of contrast measures for some normal fibro-glandular tissue when the film gradient changes from 2.0 to 3.0. Note that the change in film gradient is simulated, and that the contrast measure is being evaluated over small windows within the image.

Figure 10. This shows the relative distribution of contrast values from small windows over a mammographic image using a simple contrast measure, 100 samples, and a film density image of a volume of fibro-glandular tissue within the breast. The curves are for different film–screen characteristic curves with different gradients as marked.

7.1.2. The anode heel effect

As a second example of the errors induced when the imaging conditions are not considered, we look at changes in contrast due to the anode heel effect. This is crucial since it presents intra-image variations rather than inter-image variations. The example we take is where we have two identical blocks of tissue, with the same scatter and extra-focal radiation contributions, but with one tissue block being near the nipple, the other near the chest wall. The incident radiation intensity typically varies by 15% between these two positions. That gives the following two contrast equations:

$$C_{\mathrm{nipple}} = \frac{\log E_{\max} - \log E_{\min}}{\log E_{\max} + \log E_{\min} + 2\log\beta + 2\eta/(\gamma\alpha)}$$

$$C_{\mathrm{chest\,wall}} = \frac{\log E_{\max} - \log E_{\min}}{\log E_{\max} + \log E_{\min} + 2\log\beta + 2\eta/(\gamma\alpha) - 0.141}.$$

To attain the second equation we have worked through a 15% drop in incident intensity by reducing all the energies by 15%, as the theory predicts; this adds $2\log_{10} 0.85 \approx -0.141$ to the denominator. Taking typical values of the variables, let us see the effect on contrast of the anode heel effect: $\gamma = 3$; $\alpha = -93$; $\eta = 279$; $\beta = 1.648 \times 10^{11}$ J$^{-1}$; $E_{\min} = 1.5 \times 10^{-11}$ J (a film density of 1.2); $E_{\max} = 4.1 \times 10^{-11}$ J (a film density of 2.5). With these values $C_{\mathrm{nipple}} = -0.541$ but $C_{\mathrm{chest\,wall}} = -0.460$, a significant increase reflecting the higher incident radiation. Interestingly, when this situation is simulated it appears that extra-focal radiation makes up for the anode heel effect somewhat and $C_{\mathrm{nipple}}$ is nearer $-0.517$.

7.2. Normalization using $h_{\mathrm{int}}$

Transforming a mammogram to the $h_{\mathrm{int}}$ representation effectively normalizes the mammogram, but further normalization of the features has to take place for two reasons. The first reason is that breasts compress to different thicknesses. The second reason is that breast composition varies widely, and an object such as a cyst might be projected onto fat in one image and onto dense fibro-glandular tissue in another. As an example of the susceptibility of features to changes in the background, consider the effect of the breast being less compressed. Figure 11 shows a contrast measure on a breast at full and slightly less compression. The images were supplied from the research detailed in Highnam et al. (1991, 1992).

Figure 11. This shows the relative distribution of contrast values within a film density image of a volume of fibro-glandular tissue. The continuous curve shows the feature distribution on a breast at full compression; the dotted curve shows the feature distribution at less compression.

Successful normalization provides consistency in feature values for the same tissue type within the same image and between different images but also differentiation of tissue types. We have already seen how susceptible film density images are to variations in the imaging conditions and the breast thickness and how this affects a conventional feature measurement such as contrast. In this section we look at how we should normalize $h_{\mathrm{int}}$ images and feature measurements knowing the effects of errors in $H$. In particular, this allows us to study the effects of different breast thicknesses and different breast compositions.
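The effect of an error in $H$ that this section relies on can be checked directly against Table 3. The following is a minimal sketch using only the tabulated values; the grouping and the function name are choices made for this illustration. For each pair of actual and estimated breast thickness it computes the difference between the estimated and actual $h_{\mathrm{int}}$, then reports the mean offset and how much the offset varies across the tabulated $h_{\mathrm{int}}$ values.

```python
from collections import defaultdict
from statistics import mean

# (actual H, estimated H, actual h_int, estimated h_int), all in cm, from Table 3.
# The two entries with no tabulated estimate are left out.
TABLE_3 = [
    (4.0, 3.0, 1.0, 2.15), (4.0, 3.0, 2.0, 3.17), (4.0, 3.0, 3.0, 4.15),
    (4.0, 3.5, 1.0, 1.53), (4.0, 3.5, 2.0, 2.54), (4.0, 3.5, 3.0, 3.55),
    (4.0, 4.5, 1.0, 0.31), (4.0, 4.5, 2.0, 1.29), (4.0, 4.5, 3.0, 2.28),
    (4.0, 5.0, 2.0, 0.67), (4.0, 5.0, 3.0, 1.65),
    (6.0, 5.0, 1.0, 2.18), (6.0, 5.0, 2.0, 3.20), (6.0, 5.0, 3.0, 4.22), (6.0, 5.0, 4.0, 5.24),
    (6.0, 5.5, 1.0, 1.54), (6.0, 5.5, 2.0, 2.55), (6.0, 5.5, 3.0, 3.56), (6.0, 5.5, 4.0, 4.57),
    (6.0, 6.5, 1.0, 0.28), (6.0, 6.5, 2.0, 1.27), (6.0, 6.5, 3.0, 2.25), (6.0, 6.5, 4.0, 3.23),
    (6.0, 7.0, 2.0, 0.63), (6.0, 7.0, 3.0, 1.60), (6.0, 7.0, 4.0, 2.57),
]

def offset_summary(rows):
    """Mean and spread of (estimated h_int - actual h_int) per (H, estimated H) pair."""
    groups = defaultdict(list)
    for h_actual, h_est, hint_actual, hint_est in rows:
        groups[(h_actual, h_est)].append(hint_est - hint_actual)
    return {key: (mean(errs), max(errs) - min(errs)) for key, errs in groups.items()}

summary = offset_summary(TABLE_3)
for (h_actual, h_est), (offset, spread) in sorted(summary.items()):
    print(f"H={h_actual} est={h_est}: offset {offset:+.2f} cm, spread {spread:.2f} cm")
```

Within each group the offset varies by only a few hundredths of a centimetre, which is the near-constant translation of $h_{\mathrm{int}}$ values that the feature definitions below are designed to tolerate.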

7.2.1. Coping with different breast thicknesses

The first point to note is that there has been little research that considers anatomical structure size relative to breast thickness. In the absence of evidence to the contrary and in the belief that most anatomical structures are from local processes, we assume that feature size is independent of breast thickness $H$. This means that we should not scale the $h_{\mathrm{int}}$ images to a standard breast size and then look for a standard size feature; rather, we should take the $h_{\mathrm{int}}$ images and look for a standard size feature in them.

Another problem with scaling on $H$ is that it can lead to $h_{\mathrm{int}}$ values which have little or no relation to the projected spatial size of a feature. For example, a 2 cm projected diameter cyst
might be mapped to have an apparent depth of 3 cm. Indeed, we consider the relationship between projected spatial size and $h_{\mathrm{int}}$ value to be of primary importance since it gives an indication of mass density (proportion of interesting tissue to fat) and this might be useful for diagnosis.

To deal properly with the issue of breast thickness requires an in-depth knowledge of breast compression and that is beyond the scope of this current work, although we will be working on this and using such constraints as conservation of interesting tissue in the near future. For now, we sketch some simple models:

M1 Fat compresses and moves whilst interesting tissue does not compress and remains stationary, where we use ‘compress’ to mean that the tissue deforms;
M2 As M1 but the interesting tissue moves but does not compress;
M3 Compression removes a constant value from both $h_{\mathrm{int}}$ and $h_{\mathrm{fat}}$ across the image. Effectively this assumes that interesting tissue and fat both compress, but that the interesting tissue does not move;
M4 Both tissue types compress and move.

The simple model M1 represents the conventional approach to breast compression: ‘It is only the fat that compresses, the rest stays essentially the same’. It can be modelled as adding or subtracting a layer of fat from the breast. If this is the case then any feature measure will return the same value from the $h_{\mathrm{int}}$ representation at the different compressions. It is more difficult to predict what happens with model M2 since it allows for interesting tissues to overlap after movement. With model M3 (the model used in this paper), the effect on the $h_{\mathrm{int}}$ representation is equivalent to changes in the surrounding tissue, and that is the issue we now consider.

Figure 12. Here the same mass is in two different surrounds: one representing a fatty breast (top) and one a dense breast (bottom). In this scenario many feature measures fail.

7.2.2. Coping with different surrounding tissues

Breast features might have different tissue surrounds. For example, a mass could be on a fatty surround or a dense surround (Figure 12). In these circumstances features such as the contrast measure defined earlier are exceptionally poor: maximum − minimum remains constant, but maximum + minimum is vastly different.

A problem with the $h_{\mathrm{int}}$ representation is that the band of tissues that are classed as interesting is too broad. The breast has much fibrous tissue which helps to support it and which mostly can be ignored (being irrelevant to diagnosis and essentially invisible) although it does give the breast its dense or fatty appearance. We consider large, relatively flat components of $h_{\mathrm{int}}$ to be surrounding tissue, and we seek to estimate this component to attain what we term $h_{\mathrm{feature}}$:

$$h_{\mathrm{feature}}(x, y) = h_{\mathrm{int}}(x, y) - h_{\mathrm{surround}}(x, y).$$

Earlier we saw how errors in $H$ lead to translations in the $h_{\mathrm{int}}$ values, so that the definition of $h_{\mathrm{feature}}$ actually removes the error since it exists in both components. Conveniently, this feature also fits in nicely with model M3 so that it should be robust to different compressions. Note though that $h_{\mathrm{feature}}$ is likely to be near zero for so-called fatty areas of the breast. As a consequence, feature measurements which are normalized by dividing by a value of $h_{\mathrm{feature}}$ might be unstable.

There are several ways of estimating $h_{\mathrm{surround}}$. One is to estimate it from the entire breast image by, for example, finding the minimum $h_{\mathrm{int}}$ away from the breast edge. Another way is to choose window sizes in which to compute the features and to
Figure 13. Distributions of mean values in small windows from film density and h feature images of three breasts. Around 100 samples were
taken for each example. The values computed on the transformed images show better clustering and differentiation of tissue types.

consider h min to be the surrounding level. The correct way of higher. This suggests using the average value within some
estimating h surround depends upon the definition of surrounding window on the image as a feature: higher averages tend
tissue that is being used and this, in turn, depends upon the type to indicate denser tissues. Figure 13 shows the distribution
of feature required, that is, we have to build in some notion of of average pixel value in small windows (1.5 mm × 1.5
scale. These issues are addressed in the next section. mm) in the film density and h feature images for the cyst and
fatty tissue in the three examples. Clearly, the average pixel
values in the film density images for the fatty tissues are not
8. FEATURE EXTRACTION sufficiently clustered to differentiate purely by thresholding.
For the h feature measures the value of h surround was taken to be
Due to the range of features and feature sizes in mammograms the minimum in the large window over which measurements
it is inappropriate to talk about general feature detection. For are performed. This is appropriate because it defines the
this paper we deal with features pertinent to detecting masses, surrounding tissue. In practice one might use a large window
in particular cysts, from fatty tissue, looking for feature values that are consistent within an image and between images. The three example images we use are meant to be illustrative of our approach; proof of superiority of our approach has to be statistical and that is beyond the scope of this paper. However, even with the relatively simple problem our approach appears to have the advantage over a conventional approach.

The three examples that we consider are of cysts. Two of the examples (FDLC1, FDLC2) are of a cyst at different breast compressions. FDLC1 has the breast compressed to 5.55 cm with an exposure of 70.5 mA s, whilst FDLC2 has the breast compressed to 6.46 cm with an exposure of 92.7 mA s. The cyst in these images has surrounding tissue that appears mostly fatty. The third example (GCRC1) has a smaller cyst in an image digitized using a different scanner to the other two. The breast was 5.80 cm thick with an exposure of 89.0 mA s. Again, the surrounding tissue was relatively fatty.

The first useful observation about cysts is that they often present as being denser than fatty tissue. In this case, the attenuation is higher and consequently the image brightness is

size that is greater than the maximum size of a cyst to ensure that some surrounding tissue is covered by the window. The fat values and the cyst values are far better clustered, in particular, the fatty tissue and the cysts in the breast at different compressions give almost identical responses, which we would expect given the compression model M3.

Fatty tissue, in particular, has low tissue roughness (there are not many hills in the h int representation). There are many possible measures of roughness many of which are based upon the concept of visual contrast so that the feature gives high response for situations where we have a relatively high peak compared with the background. Although such measures might be appropriate for emulating a radiologist, we are more interested in a measure of the tissue roughness that measures the actual tissue property. For this paper we again use a simple measure, this time standard deviation. The simplicity of our features comes in part from the task we are considering but also from not needing to do any further normalization. Figure 14 shows the standard deviation distribution from the three

Figure 14. Distributions of standard deviation values in small windows from film density and h feature images of three breasts. Around 100 samples were taken for each example. Note that the horizontal axes are marked differently; this reflects the fact that the original images are integers whilst the transformed images are floats.
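
The windowed standard-deviation feature summarised in Figure 14 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the figure: the function names `window_std` and `sample_roughness`, the 11 x 11 pixel window and the two synthetic surfaces are all hypothetical stand-ins, since the paper does not give the window size or sampling scheme here.

```python
import numpy as np


def window_std(image, centre, half_width):
    """Standard deviation of the values in a square window around `centre`.

    The window is clipped at the image border. `half_width` is an assumed
    parameter; the paper only says that "small windows" were used.
    """
    r, c = centre
    r0, r1 = max(r - half_width, 0), min(r + half_width + 1, image.shape[0])
    c0, c1 = max(c - half_width, 0), min(c + half_width + 1, image.shape[1])
    return float(np.std(image[r0:r1, c0:c1]))


def sample_roughness(image, centres, half_width=5):
    """Roughness (windowed standard deviation) at each sample centre."""
    return np.array([window_std(image, c, half_width) for c in centres])


# Synthetic stand-ins for two regions of a feature image (assumed data):
# a nearly flat 'fatty' surface and a surface containing one smooth hill.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
flat = 1.0 + 0.01 * rng.standard_normal((64, 64))
hill = flat + 0.5 * np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 6.0 ** 2))

centres = [(32, 32), (30, 34), (34, 30)]
fat_values = sample_roughness(flat, centres)
hill_values = sample_roughness(hill, centres)
print("mean roughness, flat surface:", fat_values.mean())
print("mean roughness, hill surface:", hill_values.mean())
```

Collecting such values at around 100 sample positions per example and plotting their histograms would give distributions of the kind shown in the figure; the flat surface yields the lower standard deviations, in line with the observation that fatty tissue has low roughness.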

film density and h feature images for the cysts and surrounding fatty tissue. The distributions show that the average of the standard deviations is lower for the fatty tissue and that the distributions do overlap for the film density images. However, the distributions in the h feature images are closer than those in the film density images, especially in the case of the breasts at different compressions. It would appear that the h feature values are more robust to breast compression.

9. FUTURE WORK

In this paper we have introduced the h int surface representation, shown how to compute it, and illustrated its use in a number of applications in mammographic image processing. Current work aims to extend that described here in many ways:

• Surface feature extraction. We noted earlier that the h int representation is a surface representation of breast anatomy, freed of the imaging choices that are confounded in regarding the image as a surface. On the h int surface, cancerous masses correspond to hills, for which the steepness of the hill slope, its height relative to the surrounding ‘plain’ and the jaggedness of its outline are important parameters for distinguishing malignant from benign masses. This line of thinking leads to the evaluation of techniques to extract useful information robustly from digital surfaces, such as first, second and even higher order surface variations, including lines of curvature and crest lines, all at multiple scales. We aim to apply such techniques to the h int surface to extract and evaluate potential masses.

• Image matching. The h int representation suggests new approaches to difficult problems such as matching two such structures: images of the same breast taken at different times, two different views of the same breast, comparison of a woman’s left and right breasts, or comparison of many breasts from the same view. The development of image matching techniques based on the h int representation, and the comparison of results to image-based techniques (Bowyer et al., 1992; Sallam and Bowyer, 1994), will be one of our earliest developments of the research reported here.

• Computing mass density. It might be possible to calculate automatically the proportion of interesting tissue-to-fat for a mass by matching predicted appearance to actual appearance. The spatial size of the tumour is known from its projected shape and, using a suitable tumour model, a three-dimensional shape can be proposed. The computer then changes the proportion of interesting tissue-to-fat until the pixel values in the predicted and actual images match.

• Computing radiation dose. The h int representation supports an estimate of the radiation dose during the examination that is more accurate than is currently possible. Current techniques use a standard breast model, whereas we can use details from the specific breast. Furthermore, we can consider images over time and match them to compute the combined radiation dose to certain tissues.

• Integrating MRI and X-ray. Using h int we can integrate MRI and X-ray of the breast to investigate hardness and mobility of tissue, and to investigate recidivist masses.

From an MRI image of the breast it is possible to classify each voxel into one of our tissue classifications, and to simulate an X-ray image. The differences between the simulated and actual X-ray images are due to the compression of the breast in the X-ray image, and these differences can thus give us an idea about the hardness and mobility of the tissue. If tissues can be matched between the two images, it might be possible to more accurately differentiate between benign and malignant tissues (Highnam et al., 1991). Finally, if a cancer is detected, leading to a partial mastectomy or lumpectomy, there is an increased risk of subsequent recidivism, but such subsequent growths tend to be near the surgical scar tissue and they are hard to detect using X-ray mammography alone. We have recently developed a prototype system (Gilles, 1995) to analyse contrast-enhanced MRI images for application to early detection of breast cancer in young, at-risk groups of women.

ACKNOWLEDGEMENTS

The authors thank the staff at the Churchill Hospital, Oxford, for their continuing support and encouragement. Particular thanks to Yvonne Swainston, Anne Dickson-Browne, Donald Peach, and Niall Moore. Nick Cerneaz did the early work on the curvilinear structures. Thanks to Alison Noble and Gabor Szekely for comments on an earlier draft. R.P.H. thanks Philips Laboratories for support during his 9-month stay there and Chuck Carman for many useful discussions; J.M.B. thanks the Epidaure Team at INRIA Sophia Antipolis, and Nicholas Ayache in particular, for tremendous support during his sabbatical there.

REFERENCES

Bowyer, K., Sallam, M., Hubiak, G. and Clarke, L. (1992) Screening Mammogram Images for Abnormalities Developing Over Time. In IEEE Nuclear Science Symp. and Medical Imaging Conf., 1270–1272.
Burch, A. and Law, J. (1995) A method for estimating compressed breast thickness during mammography. Brit. J. Radiol., 68, 394–399.
Carlsson, G. A., Dance, D. R. and Persliden, J. (1989) Grids in Mammography: Optimization of the Information Content Relative to Radiation Risk. Technical Report ULi-RAD-R-059, Linkoping University, Department of Radiation Physics.
Cerneaz, N. J. (1994) Model-Based Analysis of Mammograms. Ph.D. Thesis, Department of Engineering Science, Oxford University.
Cerneaz, N. J. and Brady, J. M. (1995) Finding Curvilinear Structures in Mammograms. In First International Conf. on Computer Vision, Virtual Reality and Robotics in Medicine, CVRMed’95, Lecture Notes in Computer Science, Ayache, N. (ed.), 905. Springer-Verlag, Nice.
Dance, D. R. and Day, G. J. (1984) Computation of scatter in mammography by Monte Carlo methods. Phys. Med. Biol., 29, 237–247.
Dance, D. R., Persliden, J. and Carlsson, G. A. (1992) Calculation of dose and contrast for two mammographic grids. Phys. Med. Biol., 37, 235–248.
Day, G. J. and Dance, D. R. (1983) X-ray transmission formula for antiscatter grids. Phys. Med. Biol., 28, 1429–1433.
Forrest, P. (1986) Breast Cancer Screening. Report to the Health Ministers of England, Wales, Scotland and Northern Ireland. HMSO.
Gilles, S. (1995) Analyse et Traitement d’Images Médicales Tridimensionelles: Application à la Détection Précoce du Cancer du Sein. Master’s Thesis, Université Paris VII-Denis Diderot.
Haus, A. G., Metz, C. E., Doi, K. and Bernstein, J. (1977) Determination of x-ray spectra incident on and transmitted through breast tissue. Radiology, 124, 511–513.
Highnam, R. P. (1992) Model-Based Enhancement of Mammographic Images. Ph.D. Thesis, Department of Engineering Science, Oxford University.
Highnam, R. P., Shepstone, B. J. and Brady, J. M. (1991) Mammograms at Different Compression Plate Widths for Detection of Breast Cancer. In Radiology and Oncology 91, Work in Progress (abstracts), 3. British Institute of Radiology.
Highnam, R. P., Brady, J. M. and Shepstone, B. J. (1994) Computing the scatter component of mammographic images. IEEE Trans. Med. Imaging, 13, 301–313.
Highnam, R. P., Brady, M. and Shepstone, B. (1995) A Representation for Mammographic Image Processing. In First Int. Conf. on Computer Vision, Virtual Reality and Robotics in Medicine, CVRMed’95, Lecture Notes in Computer Science, Ayache, N. (ed.), 905. Springer-Verlag, Nice.
Johns, P. C. and Yaffe, M. J. (1987) X-ray characterisation of normal and neoplastic breast tissue. Phys. Med. Biol., 32, 675–695.
Karssemeijer, N. (1993) Adaptive noise equalization and recognition of microcalcification clusters in mammograms. Int. J. Patt. Recog. Artif. Intell., 7, 1357–1376.
Sallam, M. and Bowyer, K. (1994) Registering Time Sequences of Mammograms Using a Two-Dimensional Image Unwarping Technique. In 2nd International Workshop on Digital Mammography, Excerpta Medica International Congress Series 1069, Gale, A. G., Astley, S. M., Dance, D. R. and Cairns, A. Y. (eds), 121–130. Elsevier Science, York.
Shroff, J. H., Lloyd, L. R. and Schroder, D. M. (1991) Open breast biopsy: a critical analysis. Am. Surgeon, 57, 481–485.
Tarassenko, L., Hayton, P., Cerneaz, N. and Brady, M. (1995) Novelty Detection for the Identification of Masses in Mammograms. In Fourth Int. Conf. Artif. Neural Networks, 442–447. Institute of Electronic Engineers, Cambridge.
Woodman, C. B., Threlfall, A. G., Boggis, C. R. and Prior, P. (1995) Is the three year breast screening interval too long? Occurrence of interval cancers in the NHS breast screening programme’s north western region. Brit. Med. J., 310, 224–226.
