
Advanced statistical techniques applied to comprehensive FTIR spectra on human colonic tissues


A. Zwielly and S. Mordechai(a)
Department of Physics and the Cancer Research Center, Ben-Gurion University (BGU), Beer-Sheva 84105,
Israel

I. Sinielnikov
Department of Pathology, Soroka University Medical Center (SUMC), Beer-Sheva 84105, Israel

A. Salman
Department of Physics, Sami Shamoon College (SCE), Beer-Sheva 84100, Israel

E. Bogomolny
Department of Physics and the Cancer Research Center, Ben-Gurion University (BGU), Beer-Sheva 84105,
Israel

S. Argov
Department of Pathology, Soroka University Medical Center (SUMC), Beer-Sheva 84105, Israel

Received 16 June 2009; revised 27 December 2009; accepted for publication 30 December 2009;
published 9 February 2010
Purpose: Colon cancer is a major public health problem due to its high disease rate and death toll
worldwide. The use of FTIR microscopy in the field of cancer diagnosis has become attractive over
the past 20 years. In the present study, the authors investigated the potential of FTIR microscopy to
define spectral changes among normal, polyp, and cancer human colonic biopsied tissues.
Methods: A large database of FTIR microscopic spectra was compiled from 230 human colonic
biopsies. The database was divided into five subgroups: Normal, cancerous tissues, and three stages
of benign colonic polyps, namely, mild, moderate, and severe polyps, which are precursors of
carcinoma. All biopsied tissue sections were classified concurrently by an expert pathologist. The
authors applied a principal component analysis (PCA) model to reduce the dimension of the
original data to 13 principal components.
Results: While PCA shows only partial success in distinguishing among cancer, polyp, and
normal tissues, multivariate analysis (e.g., LDA) shows a promising distinction even within the
polyp subgroups.
Conclusions: Good classification accuracy among normal, polyp, and cancer groups was achieved
with a success rate of approximately 85%. These results strongly support the potential of developing FTIR microscopy as a simple, reagent-free tool for early detection of colon cancer and, in
particular, for discriminating among the benign premalignant colonic polyps having increasing
degrees of dysplasia severity (mild, moderate, and severe). © 2010 American Association of
Physicists in Medicine. [DOI: 10.1118/1.3298013]
Key words: FTIR microscopy, polyps, colon cancer, PCA, LDA
I. INTRODUCTION
Colon cancer is a major public health problem due to its
widespread occurrence and death toll worldwide. According
to estimates of the National Cancer Institute, approximately 108 070 colon and 40 740 rectal cancer cases were reported
in 2008, of which 49 960 were fatal. Of the estimated
5.2 × 10^6 deaths from cancer per year throughout the
world, 655 000 are caused by colorectal malignancies.
It is the second leading cause of cancer-related death in the
Western world.1 Despite the improvement in diagnostic techniques, more than 90% of colon cancer cases have either
advanced or metastasized by the time they are diagnosed.
Hence, there is an urgent need to develop novel digital diagnostic methods to detect the malignancy in the earliest stage
possible. Many colorectal cancers are thought to arise from
adenomatous polyps in the colon. These mushroomlike growths are usually benign, but some may develop into cancer over time. The ability to classify these polyps in time
could provide warning for their development into cancer.
Even within diagnosed cancer cases, the ability to classify
between early and severe cases is highly important and could
influence the treatment strategy.

Med. Phys. 37 (3), March 2010

The use of FTIR microscopy in the field of cancer diagnosis has shown encouraging trends over the past 20 years.2
The wavelength of infrared radiation which is absorbed depends on the nature of the covalent bond (i.e., atoms involved and the type of bond) and the strength of any intermolecular interactions (van der Waals interactions and H
bonding).3 Various biomolecular components of the cell give
a characteristic IR spectrum, from which structural and functional aspects4 of the cell can be inferred. The differences in
the absorbance spectra in the mid-IR region between normal and abnormal tissues have been shown to be possible criteria for detection and characterization of various types of cancer
such as breast,5 leukemia,6 cervical,7 skin,8 brain,9 prostate,10
and also neck and head tumors.11

0094-2405/2010/37(3)/1047/9/$30.00 © 2010 Am. Assoc. Phys. Med.

Zwielly et al.: Statistical techniques applied to FTIR colonic spectra

After approximately 20 years of using IR and FTIR spectroscopy for diagnostic purposes, this field of research is now
challenged with new frontiers. In the past few years hardware innovations have accelerated. This includes different
mobile facilities such as portable attenuated total reflectance
devices as well as optical fiber sensors adjusted to the same
FTIR spectrometer basic principles.12,13 These improvements
call for new
statistical and mathematical algorithms that match the potential of modern instrumentation. Well established as well as
new analyses are being constantly improved and adapted.
Developing system approaches that incorporate the different
stages of the spectral analysis is essential for quick and reliable automatic classification between the various groups.
One of the promising techniques is artificial neural networks, which have previously been applied successfully to colon
cancer.14,15 Advanced statistical analysis also shows good results in melanoma studies, where malignant neoplasms of
epidermal melanocytes were successfully differentiated from
nevus based on the Gaussian distribution of several unique spectral parameters (biomarkers) utilized to classify the two
cases.16
Colon cancer, when detected in the early stages, is one of
the most curable cancers. Treatment is mainly surgical in
which the cancerous section of the bowel is removed. Surgery is followed by chemotherapy and radiotherapy.17
As in the case of melanoma where full recovery can be
achieved if the tumor is removed before metastases evolve,
colonic polyps are an indicator for early dysplastic stages.
Thus, grading of premalignant colonic polyps, digitally
and systematically, could lead to economic and practical relief for many patients as well as medical staff. Currently,
ascribing a grade to premalignant polyps (i.e., mild, moderate, and severe) is still controversial even for expert pathologists.
II. MATERIALS AND METHODS
II.A. Malignant tissues characteristics

Figure 1 presents typical histological images of the five groups of human formalin-fixed tissues included in the
present study: (a) Normal colon histology showing flat mucosal surface and abundant vertically oriented crypts lined by
columnar epithelium. The mucosal crypts are lined up in
parallel. (b) and (c) Mild and moderate (low grade) polyps,
respectively. Low grade dysplastic crypts are characterized
by partial loss of cell polarity and reduced goblet cells. Dysplastic nuclei are stratified and polarized to the lower half of
the epithelial layer. (d) Severe (high grade) polyp. High grade
dysplasia is characterized by marked nuclear stratification
and high nuclear/cytoplasmic ratio. Dysplastic nuclei are
stratified through the epithelial layer up to the luminal surface. (e) Carcinoma: adenocarcinoma of colon showing
malignant glands, with variability in the size and configuration of the glands. The epithelial cells are large with high
nuclear/cytoplasmic ratio. The nuclei of the malignant cells
stratify through the epithelial layer up to the luminal surface
and show a number of mitotic figures and individual necrotic
cells. Glands are embedded in desmoplastic stroma. In summary, Fig. 1 displays the gradual changes in the tissue morphology encountered during the multistep process in which
normal colonic tissues progress toward malignancy.

FIG. 1. Histological images of formalin-fixed human colonic tissues stained with hematoxylin-eosin. (a) Normal, (b) mild, (c) moderate, and (d) severe
benign polyps, (e) carcinoma. Bars represent a length of 1500 µm.

II.B. Sample preparation

The method described by Argov et al.18 was followed for sample preparation. Formalin-fixed, paraffin-embedded tissues from adenocarcinoma patients were retrieved from the
histopathology files of Soroka University Medical Center,
Beer-Sheva, Israel. The tissue samples used in this study
were selected with the consent of the patients and under
the approval of the institution's Helsinki committee to include
normal sites, three grades of benign polyps (mild, moderate,
and severe), and malignant sites. Two consecutive paraffin
sections were cut from each biopsy; one was placed on a zinc
selenide slide and the other on a glass slide. This procedure
was performed carefully to ensure that the two tissue sections
were identical. Thickness of all tissue samples was 10 µm.
The first slide was deparaffinized using xylol and alcohol and
was used for FTIR measurements. The second slide was
stained with hematoxylin and eosin for histology review by
an expert pathologist.
The measurement sites were chosen carefully by an expert
pathologist to include the proper epithelial cells on the tissue
cross section. For example, when we measured normal tissues, we chose areas containing only normal cells and minimized
extracellular contents such as mucin. The same procedure
was followed also with the abnormal tissues.
Our database is composed of 78 patients. From the biopsies, we extracted the following: 103 normal tissues, 29 mild
polyps, 41 moderate polyps, 31 severe polyps, and 26 carcinoma tissue sections. The total number of individual microscopic spectra analyzed was approximately 800.

II.C. Fourier transform infrared microscopy measurements

Microscopic FTIR measurements in transmission mode were performed using the IRscope II FTIR microscope with
a sensitive liquid nitrogen-cooled mercury cadmium telluride
detector, coupled to the FTIR spectrometer (Bruker Equinox
model 55, OPUS software). The measured sites were circular with a diameter of 100 µm. For each biopsy, at least
three measurements at different locations were acquired and
the average spectra were analyzed. The spectrum at each site
was the average of 128 coadded scans to increase the signal-to-noise ratio.

II.D. Spectral analysis

Figure 2 summarizes the procedure used to process the measured data. All analysis was done by our in-house codes
developed using MATLAB software.19 Preprocessing of the
data includes division of the spectrum into three regions,
baseline correction using the rubber band algorithm, vector normalization, and a second baseline correction handling
constant shifts. Reduction in the data was done by principal
component analysis (PCA).20 IR spectra of eukaryotic cells
are defined by roughly 500 variables (wavenumbers). To reduce this number, PCA was performed. Basically, PCA is a
mathematical algorithm that reduces the dimension of the
problem that is being dealt with. In other words, instead of
using many variables, the variability in the data is described
by only a few PCs. The reduction is achieved by finding the
correlation between the variables. A covariance matrix is computed, and its eigenvectors (the PCs) and eigenvalues (proportional to the variability included in each PC) are extracted.
The first linear combination is called the first principal component (PC1) and contains, in our case, approximately 55%
of the variance. The second principal component (PC2) is a
linear combination of wavenumbers, which accounts for
most of the residual variance and is perpendicular to the first
one. The subsequent principal components obey the same
rules. This method allows the reduction in our spectra to 13
variables (the first 13 principal components) that account for
almost 100% of the variance.21 Following the PCA, a linear
discriminant analysis (LDA) was performed.22–24 LDA is a
classification technique that employs Mahalanobis distance
to determine the class of an unknown sample. In order to
classify between the different groups a classification criterion
is determined,

$$f_i = \mu_i C^{-1} x_k^{T} - \tfrac{1}{2}\,\mu_i C^{-1} \mu_i^{T} + \ln(p_i),$$

where $\mu_i$ is the mean of the data designated as class $i$ and $C$ is the covariance matrix. Each element in $C$ is given by

$$C^{n \times n} = (C_{i,j}), \qquad C_{i,j} = \mathrm{cov}(\mathrm{dim}_i, \mathrm{dim}_j).$$

$p_i$ is the prior probability of a measurement belonging to group $i$. In our case, we assumed that $p_i$ is proportional to the
number of samples in each group. The second term
$\mu_i C^{-1} \mu_i^{T}$ is the Mahalanobis distance, which is a measure
of dissimilarity between several groups. The class to which a
measurement belongs was determined by its largest $f_i$ value.

FIG. 2. Schematic diagram of the spectral analysis strategy based on MATLAB
utilized in the present work. The preprocessing block starts with the raw
spectrum. The spectrum was cut into three regions, rubber band baseline
corrected, normalized, and constant baseline corrected. PCA was
applied on the preprocessed spectra followed by LDA as represented by the
right branch. The left branch represents the DCF analysis constructed of six
selective biomarkers chosen based on the spectra t-test results.

Training and test sets were selected randomly from the
database. 50% of each set was employed for training and the
remainder for the test. In addition, the validation experiment
was repeated 100 times, with the same input features but
with different sets of randomly selected training and test sets
and the results were averaged.
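
The classification criterion and the repeated random-split validation described above can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' MATLAB code: the function names (`train`, `classify`, `repeated_split_accuracy`), the use of a pooled within-class covariance matrix, and the two-feature toy data are all assumptions made for the example.

```python
import math
import random

def mean_vec(rows):
    # Column-wise mean of a list of feature vectors.
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def pooled_cov(groups, means):
    # Pooled within-class covariance matrix C (assumed common to all classes).
    d = len(next(iter(means.values())))
    C = [[0.0] * d for _ in range(d)]
    n_total = 0
    for label, rows in groups.items():
        m = means[label]
        for r in rows:
            for a in range(d):
                for b in range(d):
                    C[a][b] += (r[a] - m[a]) * (r[b] - m[b])
        n_total += len(rows)
    dof = n_total - len(groups)
    return [[v / dof for v in row] for row in C]

def solve(A, b):
    # Solve A z = b by Gaussian elimination with partial pivoting.
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def train(groups):
    # For each class i store w_i = C^-1 mu_i and the constant
    # -1/2 mu_i C^-1 mu_i^T + ln(p_i), with p_i proportional to group size.
    means = {k: mean_vec(v) for k, v in groups.items()}
    C = pooled_cov(groups, means)
    n = sum(len(v) for v in groups.values())
    model = {}
    for k, mu in means.items():
        w = solve(C, mu)
        const = -0.5 * sum(a * b for a, b in zip(mu, w)) + math.log(len(groups[k]) / n)
        model[k] = (w, const)
    return model

def classify(model, x):
    # Assign x to the class with the largest score f_i.
    scores = {k: sum(a * b for a, b in zip(w, x)) + c for k, (w, c) in model.items()}
    return max(scores, key=scores.get)

def repeated_split_accuracy(groups, repeats=100, seed=0):
    # Random 50/50 split within each group, repeated; accuracies are averaged.
    rng = random.Random(seed)
    accs = []
    for _ in range(repeats):
        train_set, test_set = {}, []
        for k, rows in groups.items():
            rows = rows[:]
            rng.shuffle(rows)
            half = len(rows) // 2
            train_set[k] = rows[:half]
            test_set += [(k, r) for r in rows[half:]]
        model = train(train_set)
        hits = sum(classify(model, x) == k for k, x in test_set)
        accs.append(hits / len(test_set))
    return sum(accs) / len(accs)
```

In practice the feature vectors passed to `train` would be the 13 PC scores of each spectrum rather than toy values; solving a linear system is used here instead of forming the inverse covariance matrix explicitly.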

II.E. Discriminant classification function (DCF) and the t-test

DCF is a statistical tool that improves classification between gradually evolving subgroups by simultaneously
using several spectral variants. DCF generates a classification score for each group that is a linear combination of a
previously derived array of biomarkers with weight coefficients, as given by the following equation:

$$S = c + w_1 x_1 + w_2 x_2 + \cdots + w_i x_i + \cdots,$$

where $S$ denotes the resultant classification score, $c$ is a constant, and $w_i$ is the weight coefficient given by

$$w_i = \alpha\,\frac{t\_value_i}{x_i},$$

where $x_i$ is the biomarker value, $\alpha$ is a constant, and $t\_value_i$
was defined as the paired t-value of each biomarker among
normal and abnormal tissues.
The constants $c$ and $\alpha$ were chosen in such a way as to
set the average classification score of the normal group to zero
and give a score of 100 for the cancerous human colonic
tissues.
The t-test values were considered significant at P < 0.05.

FIG. 3. FTIR spectra in the 2800–3000 cm⁻¹ region.
III. RESULTS
III.A. FTIR microscopy spectra of tissues

Figure 3 shows the average spectra in the region 2800–3000 cm⁻¹. This wavenumber region was cut from
the entire spectrum, normalized, and baseline corrected. The
results (Fig. 3) exhibit four prominent absorbance bands:
near 2848 cm⁻¹, due to the symmetric stretching of the methylene chains in membrane lipids; at 2872 cm⁻¹, arising
from the symmetric CH3 (methyl) stretching; at 2918 cm⁻¹,
due to the antisymmetric CH2 stretch; and at 2958 cm⁻¹, due
to antisymmetric stretching of the methyl groups of both
lipids and proteins. The average absorption intensities of the
different tissues are distinctive at the 2848 and 2958 cm⁻¹
bands. The average values of these bands indicate a gradual
intensity change, where the normal group has the lowest intensity and the cancer group the highest intensity in the
2958 cm⁻¹ band, and vice versa for the 2848 cm⁻¹ band.
Thus the best discriminating values were obtained by deriving the intensity ratio of these two vibrational modes (i.e.,
A2848/A2958 or νs CH2/νas CH3). This dimensionless ratio
eliminates a possible artifact, which may arise due to the
baseline contribution underneath each band. Table I summarizes the average values of the prominent bands and their
standard errors, as well as the t-test values between the five
tissue stages. The t-value for this ratio between the two extreme groups of normal and carcinoma is more than 24.
Therefore, this ratio may be considered as a satisfactory
biomarker for the classification between these two extreme
cases. The t-test values in Table I reveal that this ratio is also
significant for the polyp groups.
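
As an illustration of how such a ratio biomarker and its t-value could be computed, the following is a minimal sketch. It assumes each spectrum is available as a mapping from wavenumber to absorbance, and it uses an unpaired, unequal-variance (Welch) t statistic; that choice is an assumption for the example and may differ from the t-test variant used to produce Table I. The names `band_ratio` and `welch_t` are hypothetical.

```python
import math
from statistics import mean, stdev

def band_ratio(spectrum, wn_a=2848, wn_b=2958):
    # Dimensionless intensity ratio A(wn_a) / A(wn_b).
    # `spectrum` maps wavenumber (cm^-1) to absorbance (assumed layout).
    return spectrum[wn_a] / spectrum[wn_b]

def welch_t(a, b):
    # Two-sample t statistic with unequal variances (assumed variant).
    va = stdev(a) ** 2 / len(a)
    vb = stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / math.sqrt(va + vb)
```

For example, `welch_t(normal_ratios, cancer_ratios)` would be called on two lists of `band_ratio` values, one list per tissue group.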
The variation in the phosphate level, measured by integrating the absorbance of the symmetric 1000–1150 cm⁻¹
band for the different cases, is presented in Fig. 4. On average, the phosphate levels for polyp and malignant tissue
samples were lower than for the normal group. However, the
average absorbances of polyp and malignant samples were
almost equal. The asymmetric 1170–1310 cm⁻¹ band shows
the same trend (not shown in the figure). Two distinctive
regions are shown in this figure: the 1200–1800 cm⁻¹ spectral region and the 1000–1200 cm⁻¹ spectral region. Both
regions, separately, were vector normalized. The region
1500–1800 cm⁻¹ is dominated almost entirely by the
conformation-sensitive amide I and amide II bands, which
are the most dominant bands in the spectra of nearly all
complex biological systems.25 The intensity differences between normal, polyp, and cancerous tissues in the amide II
band were not significant for all cases. Amide I is among the
bands which slightly shift between the various groups. In
particular, the amide I band of the normal group was lower and wider with respect to the other groups. Since amide I arises from the
C=O hydrogen bonded stretching vibrations, these may
arise due to biochemical alterations (conformation and composition) in protein and/or nucleic acids, respectively.
Another important biomarker can be obtained from the
shoulder near 1740 cm⁻¹, resulting primarily from C=O
stretching vibrations of the ester functional groups in
phospholipids.26 The lipids in the membrane are composed
mainly of phospholipids that determine membrane structure,
stability, fluidity, and membrane enzymatic activity. Figure 4
shows gradual intensity changes in the 1740 cm⁻¹ band with
an irregularity for the cancer group, whose value is above those of the moderate and severe polyps. Significantly higher intensity is noticed for the normal group, as can be seen in Table I. The
weaker bands at 1456 and 1401 cm⁻¹, arising from amino acid side chains of peptides and proteins,
are associated with the asymmetric and
symmetric CH3 bending vibrations.27 The absorption peak at
1243 cm⁻¹ is due to the PO2⁻ ionized asymmetric
stretching.28 The absorption due to normal tissue was larger
than for polyps and cancerous types in this entire region for
the averaged spectra. In the case of the 1401 cm⁻¹ band, a
significant shift can be noticed for the normal tissue.
The 1000–1140 cm⁻¹ region in Fig. 4 contains many
overlapping vibrational modes associated with absorbance of
macromolecules such as proteins, nucleic acids, carbohydrates, and phospholipids. Substantial differences appeared
between the normal tissue spectra, the polyps, and carcinoma, while only mild differences are apparent in the transition between polyps and carcinoma. Changes in this spectral range between the five groups exist at almost all
wavenumbers. The bands at 1083 and 1056 cm⁻¹ correspond
to absorbance of the νs PO2⁻ of phosphodiesters of nucleic
acids28 and the O–H stretching coupled with C–O bending of C–OH groups of carbohydrates, respectively.29
These two biomarkers show the same absorbance intensity
but in reverse order, hence the A1083/A1056 ratio was considered significant (Table I). In IR spectra, the bands at 1025
and 1045 cm⁻¹ correspond to the vibrational modes of
–CH2OH groups and the C–O stretching coupled with
C–O bending of the C–OH groups of carbohydrates (including glucose, glycogen, etc.). Higher intensity is noticed for
the normal and mild polyp groups compared to the moderate polyp, severe
polyp, and cancer groups (Table I).
Previous works have shown that the band at 1121 cm⁻¹
arises from RNA absorbance, whereas the 1015 cm⁻¹ shoulder is due to DNA.30,31 It was found that the best discriminating values were obtained by deriving the intensity ratio of
these two vibrational modes (i.e., A1121/A1015), as can be seen
in Table I.

TABLE I. Average and STD of selected biomarkers are represented in the top section. The bottom section
contains t-test values for all six biomarkers and tissue combination pairs. Each tissue pair has six values
corresponding to the six biomarkers in the upper part. The nonsignificant values are marked as NS.

Average values and std deviations

Biomarker          Normal      Mild polyp  Moderate polyp  Severe polyp  Cancer
A2848/A2958        2.46±0.38   1.87±0.39   1.60±0.26       1.20±0.21     0.73±0.12
1740 cm⁻¹ (×100)   1.22±0.32   0.64±0.17   0.47±0.14       0.43±0.13     0.50±0.17
A1083/A1056        0.84±0.04   0.99±0.05   1.06±0.07       1.13±0.06     1.07±0.07
1025 cm⁻¹          0.20±0.01   0.16±0.02   0.13±0.01       0.12±0.01     0.12±0.02
1045 cm⁻¹          0.34±0.01   0.28±0.01   0.25±0.01       0.25±0.01     0.25±0.02
A1121/A1015        1.88±0.31   2.48±0.52   3.17±0.65       3.55±1.04     3.48±1.38

The t-values for the above selected six biomarkers

Tissue pair              A2848/A2958  1740 cm⁻¹  A1083/A1056  1025 cm⁻¹  1045 cm⁻¹  A1121/A1015
Normal vs mild polyp     5.9          7.6        12.7         12.8       13.8       6.0
Normal vs moderate polyp 12.7         14.2       19.5         27.1       30.1       12.9
Normal vs severe polyp   12.1         8.9        19.7         21.0       20.8       10.1
Normal vs cancer         24.9         11.5       18.4         22.5       22.8       8.1
Mild vs moderate polyp   3.1          4.0        4.3          7.2        8.6        4.1
Mild vs severe polyp     5.8          3.8        6.9          7.1        7.0        3.9
Mild polyp vs cancer     15.2         2.8        4.5          6.6        6.6        3.1
Moderate vs severe polyp 5.2          0.9 (NS)   3.2          2.1 (NS)   0.9 (NS)   1.6 (NS)
Moderate polyp vs cancer 17.1         0.8 (NS)   0.8 (NS)     1.7 (NS)   1.2 (NS)   1.3 (NS)
Severe polyp vs cancer   9.3          1.4 (NS)   2.3 (NS)     0.4 (NS)   0.2 (NS)   0.2 (NS)

FIG. 4. Important biomarkers are marked in the FTIR spectra at the 1000–1800 cm⁻¹ fingerprint region. While good classification for normal
tissues is apparent, only small changes are noticed in this region among the
other groups. The shaded region represents the asymmetric phosphate
biomarker.

FIG. 5. DCF of normal, polyps (mild, moderate, and severe), and cancer
tissues. Each class is represented by an array of average values of four
biomarkers.

III.B. Grading the samples using DCF

Although the normal, benign polyps, and malignant tissues constitute three separate main groups, an interesting
analysis would be to examine a possible digital grading of
the tissues based on a chosen set of biomarkers. Based on
these biomarkers, a severity scale could be formed and
the groups can be classified. Each case was characterized
using an array of biomarkers, which were arranged as follows:

(A2848/A2958, A1740, A1083/A1056, A1121/A1015)

To further examine the gradual spectral changes encountered in the above tissue samples, we utilized the discriminant
classification function. This statistical tool improves discrimination among normal, polyp stages, and malignancy by providing a quantitative measure
of the transformations as a function of group type. DCF generates a classification score for each group or premalignant stage using a
linear combination of a previously derived array of biomarkers with weight coefficients.23 Figure 5 shows the scores of
each group based on Eq. (3). It can be noted that the mild
and moderate polyps have similar scores, which means that
only small detectable spectral changes occur between mild
and moderate polyps, as would be expected. Generally, the
score values of the tissues starting with the normal group
gradually approach the spectral values of the malignant
group as shown in Fig. 5. It is also noticed from Fig. 5 that
the diversity among polyps was larger than for the malignant
and the controls. This is mainly notable between the mild
and moderate polyps where they appear to overlap.
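
The scoring scheme can be illustrated with a short sketch. It is not the authors' implementation: it assumes that each weight is a t-value divided by a reference mean of that biomarker (taken here over the normal and cancer training samples), scaled by a constant called `alpha`, and that `alpha` and the offset `c` are then fixed so that the normal group averages 0 and the cancer group averages 100. The function name `fit_dcf` and the toy numbers in the usage note are hypothetical.

```python
from statistics import mean

def fit_dcf(normal, cancer, t_values):
    # normal, cancer: lists of biomarker arrays (one array per sample).
    # t_values: one t-value per biomarker.
    d = len(t_values)
    # Reference mean of each biomarker (assumption: pooled normal + cancer).
    ref = [mean(s[i] for s in normal + cancer) for i in range(d)]

    def raw(x):
        # Unscaled linear combination: sum_i t_i * x_i / ref_i.
        return sum(t * xi / r for t, xi, r in zip(t_values, x, ref))

    rn = mean(raw(x) for x in normal)
    rc = mean(raw(x) for x in cancer)
    alpha = 100.0 / (rc - rn)   # scale so that cancer mean - normal mean = 100
    c = -alpha * rn             # offset so that the normal mean is 0
    return lambda x: c + alpha * raw(x)
```

A score function fitted this way, for example `score = fit_dcf(normal_arrays, cancer_arrays, [24.9, 11.5])`, can then be applied to polyp samples to place them on the same 0 to 100 scale.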
III.C. LDA classification

PCA is a mathematical algorithm that reduces the large dimension of the measured spectrum that is being dealt with,
i.e., instead of using many variables, the variability in the
data is described by only a few PCs. The reduction is achieved
by finding the correlation between the variables. Figure 6
shows the scores of PC1 versus PC2 for all the measurements. It can be seen that all normal data points are completely separated from carcinoma, while some overlap appears between the polyps and the carcinoma and between the
polyps and normal tissue. PC1 contributes almost solely to
the separation between normal and carcinoma groups, while
PC2 contributes mainly to the separation between the polyps
and the other two groups. This partial separation obtained by
PCA is not satisfactory and further procedures should be
carried out in order to distinguish between all five groups.
Thus, LDA was applied to discern between the five groups.
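
The dimension reduction step can be sketched as below: the data are mean-centered, the covariance matrix is formed, and its leading eigenvectors are extracted. Power iteration with deflation is used here only to keep the example free of external libraries; it is an assumption of this sketch, and a real analysis of roughly 500 wavenumbers would normally rely on a library eigen-solver. The function names are hypothetical.

```python
import math

def center(X):
    # Subtract the column means from every row.
    n, d = len(X), len(X[0])
    mu = [sum(r[j] for r in X) / n for j in range(d)]
    return [[r[j] - mu[j] for j in range(d)] for r in X]

def covariance(Xc):
    # Sample covariance matrix of already centered data.
    n, d = len(Xc), len(Xc[0])
    return [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(d)]
            for a in range(d)]

def top_component(C, iters=500):
    # Leading eigenpair of a symmetric matrix by power iteration.
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(v[a] * sum(C[a][b] * v[b] for b in range(d)) for a in range(d))
    return lam, v

def pca(X, k):
    # Returns the first k components, their variance fractions, and the scores.
    Xc = center(X)
    C = covariance(Xc)
    d = len(C)
    total = sum(C[i][i] for i in range(d))
    comps, explained = [], []
    for _ in range(k):
        lam, v = top_component(C)
        comps.append(v)
        explained.append(lam / total)
        # Deflation: remove the component just found from the matrix.
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    scores = [[sum(r[j] * v[j] for j in range(d)) for v in comps] for r in Xc]
    return comps, explained, scores
```

Plotting the first two columns of `scores` against each other gives the kind of PC1 versus PC2 view discussed in the text.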
LDA is a statistical multivariate supervised method. It
searches for the variables containing the largest and the
smallest interclass variances, and constructs a linear combination of the variables to discriminate between classes. The
rule is to construct a training set of samples, which is further
tested using the test set. The large number of valid variables
in the infrared spectra is an obstacle for this approach, which
needs more observations than variables. Using PCA (an unsupervised method) prior to the LDA was helpful in
reaching this goal of variable reduction. The results from the
LDA iteration are summarized in Table II, which shows the
percentage of success of each data set within all the possible
groups included in the study. We performed the LDA with two strategies: first, where all five groups were included [Table II(a)], and second, with only three groups,
namely, normal, polyp, and cancer, where the polyp group
consists of all polyp subgroups (mild, moderate, and severe)
[Table II(b)]. The results when all five groups are included show relatively lower success rates, indicating that
many cases were classified within the neighboring groups.
This picture dramatically improved when only three groups
were assumed, where most of the group members were classified correctly. In both cases, none of the normal samples was assigned as cancer and vice versa. The largest misidentification
occurred between normal and mild polyp (24%), which is
frequently a difficult problem even for an expert pathologist,
since mild polyps are intrinsically very close in their cell
morphology to normal tissues. The misidentification rate drops sharply for normal tissue identified as moderate or severe polyps, with only 2% and 1%, respectively.
When all three polyp stages are treated as a single class, the
percentage of normal misidentified as polyp is reduced to
14% [Table II(b)].
It is encouraging that in both strategies the false negative
(cancer identified as normal) and the false positive (normal identified as cancer) rates both remain at 0%, as can be seen
from Tables II(a) and II(b).

FIG. 6. PCA model employed on the database reducing the 512 valid measured variables in the spectrum to 13 PCs, which describe 98.4% of the data
variance. Full separation was achieved between the normal group (plus symbols) and the malignant group (squares), while partial overlaps exist between the
three polyp groups (circles). The solid black circles represent the centroids of the corresponding groups.
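
A row-normalized identification table of the kind reported in Table II can be tabulated from true and predicted labels as in the sketch below. The label strings, the `MERGE` mapping, and the function name are hypothetical. Note that merging the polyp labels after classification, as done here, only regroups five-class output; it is a simpler operation than repeating the LDA with three groups, which is the second strategy described in the text.

```python
from collections import Counter

# Hypothetical mapping that collapses the three polyp grades into one class.
MERGE = {"mild": "polyp", "moderate": "polyp", "severe": "polyp"}

def confusion_percent(true_labels, predicted, merge=None):
    # Returns {(true, identified_as): percentage of that true class}.
    merge = merge or {}
    pairs = Counter((merge.get(t, t), merge.get(p, p))
                    for t, p in zip(true_labels, predicted))
    totals = Counter(merge.get(t, t) for t in true_labels)
    return {(t, p): 100.0 * n / totals[t] for (t, p), n in pairs.items()}
```

Calling `confusion_percent(y_true, y_pred)` gives the five-group table, and `confusion_percent(y_true, y_pred, MERGE)` the regrouped three-class view.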

IV. DISCUSSION
Previous studies have provided evidence that infrared
spectroscopy is a useful and powerful tool to classify tissues
and cells. The aim of this work was to test the potential of
FTIR spectroscopy on colon cancer patients where five different tissue groups (stages) can be identified. We used a
statistical approach to analyze the large database of IR spectra that was measured. The complete database was analyzed
in order to examine the classification potential of this optical
methodology in tandem with advanced statistical techniques.
The gradual changes shown by the DCF score (Fig. 5)
present a digital illustration of how benign polyps evolve toward carcinoma. This trend was further studied using specific biomarkers which clearly verify this gradual transition.
The main biomarker that dictates the DCF fitting and fully
shows the gradual behavior is the CH2/CH3 ratio
(A2848/A2958). This is due to its extraordinarily high t-value
(Table I) and the highly ordered gradual absorbance intensity
of the normal, polyps, and carcinoma groups (Fig. 3). These
results suggest that the lipid/protein ratio gradually increases
with the severity of the disease. Since proteins contain, on
the average, an equal amount of methyl and methylene
groups, a protein change alone should modify the CH3
stretching as well as the CH2 stretching to the same extent.
The precise origin of the increase in the methyl/methylene
ratio remains to be determined; it may arise from an increase
in lipid content, but can also be associated with the modification of the membrane composition during cancer.
In our previous studies, changes in the lipid region were
described differently.32 We believe that the main reason for
this is the normalization procedure in each case study. Our
previous study on this subject used min/max normalization of
the entire measured spectra with respect to the intense amide I band. This approach cannot detect subtle changes in
the lipid region. In order to reduce the contribution of close
bands and to focus on the relevant regions, we now use a
different approach, dividing the entire spectrum into three
regions and vector normalizing each segment separately as
explained in the experimental section (Fig. 2). Another benefit over our previous normalization technique is the removal
of the uninformative region of 1800–2800 cm⁻¹, which is
highly dependent on the CO2 in the surroundings of the measured
tissues. This is especially important when applying mathematically based distinction algorithms such as PCA, where it
could lead to artifacts in the classification. The two different
normalization approaches reveal that different spectral preprocessing techniques may alter the biochemical interpretation. This segmentation technique has been widely used,20,33,34 where
sections of the spectrum were cut, baseline corrected, and
normalized independently from the entire region of the spectrum.

TABLE II. True/false identification percentage of each tissue type based on the averaged LDA iterations: (a) assuming five groups (normal, mild, moderate, severe polyps, and cancer); (b) assuming only three groups, where all
polyp subgroups are treated as a single polyp group.

(a) Identified as: Normal, Mild polyp, Moderate polyp, Severe polyp, Carcinoma
Type
Normal      74  24
Mild        63  14  12
Moderate    22  60  15
Severe      66  18
Carcinoma   17  80

(b) Identified as: Normal, Polyps, Carcinoma
Type
Normal      86  14
Polyps      85
Carcinoma   15  85

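
The segment-wise normalization can be sketched as follows. The region limits are inferred from the ranges mentioned in the text (1000–1200, 1200–1800, and 2800–3000 cm⁻¹) and are an assumption, as is the particular definition of vector normalization used here (mean-centering followed by division by the Euclidean norm). The rubber band and constant baseline corrections are omitted, and the function names are hypothetical.

```python
import math

# Assumed region limits in cm^-1; the 1800-2800 cm^-1 range is left out.
REGIONS = [(1000, 1200), (1200, 1800), (2800, 3000)]

def vector_normalize(values):
    # Mean-center, then scale to unit Euclidean norm (one common convention).
    m = sum(values) / len(values)
    centered = [v - m for v in values]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered]

def split_and_normalize(wavenumbers, absorbance, regions=REGIONS):
    # Cut the spectrum into regions and normalize each one independently.
    out = {}
    for lo, hi in regions:
        seg = [a for w, a in zip(wavenumbers, absorbance) if lo <= w < hi]
        out[(lo, hi)] = vector_normalize(seg)
    return out
```

Because each segment is scaled on its own, a strong band in one region (for example amide I) no longer sets the scale for weaker bands in another region such as the lipid C–H stretching range.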
In complex systems such as tissues, the main absorptions
arise from N–H, C=O, C–H, and P=O bonds from
proteins, lipids, and nucleic acids present in the cells.
Wavenumbers below 1800 cm⁻¹ constitute prominent regions that contain all the above vibrational modes. This region shows remarkable differences among normal tissues,
cancer, and all three polyp groups. This is most evident
in the 1045 cm⁻¹ region, where a decreased absorption is
noted for all groups compared to the normal. The normal
cases clearly stand above all pathological tissues (polyp and
cancer). This can be explained by the substantial glycogen
reduction (consumption) in the polyps and the cancer tissues.
Another remarkably different biomarker is the 1072 cm⁻¹
band that corresponds to carbohydrates. This biomarker decreased in the normal group.
The antisymmetric phosphate levels (1170–1310 cm⁻¹)
reveal the metabolic turnover, as it consists of energy producers such as ATP and GTP, and other biomolecular components which include phospholipids, nucleic acids (DNA
and RNA), and phosphorylated proteins. The difference in
total phosphate level among normal, the three groups of polyps, and cancer was clear in the normal case. This enhancement may arise from the fact that phosphate level is the sum
of a larger number of biomolecules containing phosphate
groups.
The shoulder at 1740 cm⁻¹ can be assigned to the ester
C=O stretching of phospholipids, which is not present in DNA and
proteins. The decreased intensities at 1740 cm⁻¹ are also consistent
with the methyl/methylene ratio, except for the cancer group,
which shows a higher value than the moderate and severe polyps (Table I). The 1740 cm⁻¹ band also contains contributions
from other vibrational modes (Ref. 35) besides phospholipids, which may
cause this inconsistency.
Our LDA results (Table II) indicate that a high discrimination percentage is achieved when dealing only with the
three main groups [Table II(b)]: normal, polyps, and cancer
tissues. A significant but relatively low percentage is obtained when considering all five groups: normal tissue,
mild, moderate, advanced polyps, and cancer [Table II(a)].
False identification can result from several causes: built-in
spatial (Ref. 36) and spectral resolution limitations of FTIR in
distinguishing between proximate tissues; cytological identification failure, especially within the polyp subgroups; averaging over different measured sites inside a specific sample, which can lead to real uncertainty in the sample
type since some regions show borderline behavior; and FTIR measurements degraded by unstable surroundings and
equipment. All of the above except the first confounding factor can be eliminated by consistently acquiring a
larger database.
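The true/false identification percentages discussed above are row-normalized counts of (true type, identified type) pairs. The sketch below shows that bookkeeping and how merging the three polyp subgroups removes confusions that occur only inside the polyp class. The labels are hypothetical and are not the study's data; the helper names are illustrative; and merging labels after classification is an assumption that may differ from rerunning LDA on three groups.

```python
from collections import Counter

# Assumed subgroup names, chosen to mirror the five-group case.
POLYP_SUBGROUPS = {"mild", "moderate", "severe"}


def collapse(label):
    """Map any polyp subgroup to a single 'polyp' class."""
    return "polyp" if label in POLYP_SUBGROUPS else label


def identification_percentages(true_labels, predicted_labels):
    """Return {(true, predicted): percent of samples of that true type}."""
    pair_counts = Counter(zip(true_labels, predicted_labels))
    type_totals = Counter(true_labels)
    return {
        (true, pred): 100.0 * n / type_totals[true]
        for (true, pred), n in pair_counts.items()
    }


# Hypothetical labels for six samples, for illustration only.
true_labels = ["normal", "normal", "mild", "mild", "severe", "cancer"]
predicted = ["normal", "mild", "mild", "moderate", "cancer", "cancer"]

five_group = identification_percentages(true_labels, predicted)
three_group = identification_percentages(
    [collapse(t) for t in true_labels],
    [collapse(p) for p in predicted],
)
```

In this toy input the mild sample identified as moderate counts as an error in the five-group table but as a correct polyp identification after merging, which is the kind of effect that raises the three-group percentages.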
The t-test values (Table I) reveal that, besides the methyl/methylene ratio, the chosen biomarkers show almost no significant differences among the carcinoma, severe polyp, and
moderate polyp groups. Although the LDA algorithm takes
into account all wavenumbers, this result is also true for the
severe polyp case [Table II(a)]. This leads to mixing between
the severe polyp and carcinoma groups. In contrast, the DCF
classification (Fig. 5) shows remarkably different scores between carcinoma and severe polyps. DCF weights are influenced mainly by the t-test values. Thus, the methyl/methylene ratio, which has the largest t-value, dictates the
DCF behavior. When comparing the LDA results just for the
normal and malignant samples, excellent separation is
achieved in the t-test analysis (Table I) as well as in the DCF
scores (Fig. 5).
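To make the role of the t-values concrete, the sketch below computes a two-sample t statistic per biomarker and orders the biomarkers by its magnitude. It is illustrative only: Welch's unequal-variance form is an assumption (this section does not say which t-test variant was used), the biomarker names and numbers are hypothetical, and the DCF weighting itself is not reproduced here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch two-sample t statistic (unequal variances assumed)."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def rank_by_abs_t(values_a, values_b):
    """Order biomarker names from largest to smallest |t|."""
    t_values = {name: welch_t(values_a[name], values_b[name]) for name in values_a}
    return sorted(t_values, key=lambda name: abs(t_values[name]), reverse=True)


# Hypothetical biomarker values for two tissue groups.
group_1 = {"ratio": [1.0, 1.1, 0.9], "band": [1.0, 2.0, 3.0]}
group_2 = {"ratio": [2.0, 2.1, 1.9], "band": [2.0, 3.0, 4.0]}
ordering = rank_by_abs_t(group_1, group_2)
```

A biomarker whose groups differ by much more than their spread gets a large |t| and comes first in the ordering, which is the sense in which a single biomarker can dominate a t-weighted score.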
In summary, we conclude that infrared spectroscopy is a
useful tool for identifying different types of colonic tissues. The
DCF scores formed acuteness ladders, which further improve
the ability to grade the samples in the correct
order, namely, normal, polyps, and cancer, based on the previously selected array of biomarkers. We also demonstrated
that PCA combined with LDA is a powerful tool for investigating the global biochemical modifications responsible for
tissue classification. However, we do not claim to replace the
pathologist. Spectroscopic methods may provide a second
opinion, especially in difficult cases where ambiguous assignments are given by histopathology.
We have shown that different normalization techniques
can change the biochemical interpretation, although not the
total changes among spectra.


Early cancer detection is vital in all cancer types, but this
is especially true in colon cancer, where removing the premalignant tissue can save lives. Due to the still mysterious
nature of how polyps progress spontaneously toward carcinoma, further studies that examine the potential of IR spectroscopy in more detail may shed more light on these
changes.
ACKNOWLEDGMENTS
This work was supported in part by the Israel Science
Foundation (ISF Grant No. 788/01) and the Cancer Research Foundation in Memory of Professor Tabb at the Soroka University Medical Center.
a) Author to whom correspondence should be addressed. Electronic mail:
shaulm@bgu.ac.il; Telephone: 972-8-646 1749; Fax: 972-8-647 2924.
1. "Cancer," World Health Organization (February 2006). Retrieved on 24 May 2007, http://www.cancerinfodirect.com/colon-cancer.
2. R. K. Sahu and S. Mordechai, "Fourier transform infrared spectroscopy in cancer detection," Future Oncol. Oct. 1(5), 635–647 (2005).
3. G. Herzberg, Molecular Spectra and Molecular Structure. II. Infrared and Raman Spectra of Polyatomic Molecules (Van Nostrand Reinhold, New York, 1945).
4. P. Lasch and J. Kneipp, Biomedical Vibrational Spectroscopy (Wiley, Hoboken, 2008).
5. T. Gao, J. Feng, and Y. Ci, "Human breast carcinomal tissues display distinctive FTIR spectra: Implication for the histological characterization of carcinomas," Anal. Cell. Pathol. 18, 87–93 (1999).
6. R. Sahu, U. Zelig, M. Huleihel, N. Brosh, M. Talyshinsky, M. Ben-Harosh, S. Mordechai, and J. Kapelushnik, "Continuous monitoring of WBC biochemistry in an adult leukemia patient using advanced FTIR-spectroscopy," Leuk. Res. 30, 687–693 (2006).
7. A. Podshyvalov, R. K. Sahu, S. Mark, K. Kantarovich, H. Guterman, J. Goldstein, R. Jagannathan, S. Argov, and S. Mordechai, "Distinction of cervical cancer biopsies by use of infrared microspectroscopy and probabilistic neural networks," Appl. Opt. 44(18), 3725–3734 (2005).
8. A. Tfayli, O. Piot, A. Durlach, A. Bernard, and M. Manfait, "Discriminating nevus and melanoma on paraffin-embedded skin biopsies using FTIR microspectroscopy," Biochim. Biophys. Acta 1724(3), 262–269 (2005).
9. C. Krafft, L. Shapoval, S. B. Sobottka, K. D. Geiger, G. Schackert, and R. Salzer, "Identification of primary tumors of brain metastases by SIMCA classification of IR spectroscopic images," Biochim. Biophys. Acta 1758(7), 883–891 (2006).
10. E. Gazi, M. Baker, J. Dwyer, N. P. Lockyer, P. Gardner, J. H. Shanks, R. S. Reeve, C. A. Hart, N. W. Clarke, and M. D. Brown, "A correlation of FTIR spectra derived from prostate cancer biopsies with Gleason grade and tumour stage," Eur. Urol. 50(4), 750–761 (2006).
11. P. Bruni, C. Conti, E. Giorgini, M. Pisani, C. Rubini, and G. Tosi, "Histological and microscopy FT-IR imaging study on the proliferative activity and angiogenesis in head and neck tumours," Faraday Discuss. 126, 19–26 (2004).
12. D. R. Shankaran and N. Miura, "Trends in interfacial design for surface plasmon resonance based immunoassays," J. Phys. D: Appl. Phys. 40, 7187–7200 (2007).
13. S. H. Tseng, A. Grant, and A. J. Durkin, "In vivo determination of skin near-infrared optical properties using diffuse optical spectroscopy," J. Biomed. Opt. 13(1), 014016 (2008).
14. P. Lasch, M. Diem, W. Hansch, and D. Naumann, "Artificial neural networks as supervised techniques for FT-IR microspectroscopic imaging," J. Chemom. 20, 209–220 (2006).
15. P. Lasch, J. Schmitt, and D. Naumann, "Colorectal adenocarcinoma diagnosis by FT-IR microspectrometry," Biomed. Spectroscopy 3918, 45–56 (2000).
16. Z. Hammody, S. Argov, R. K. Sahu, E. Cagnano, R. Moreh, and S. Mordechai, "Distinction of malignant melanoma and epidermis using IR microspectroscopy and statistical methods," Analyst (Cambridge, U.K.) 133(3), 372–378 (2008).
17. M. S. Cappell, "From colonic polyps to colon cancer: Pathophysiology, clinical presentation, screening and colonoscopic therapy," Minerva Gastroenterol. Dietol. 53(4), 351–373 (2007).
18. S. Argov, J. Ramesh, A. Salman, I. Sinelnikov, J. Goldstein, H. Guterman, and S. Mordechai, "Diagnostic potential of Fourier-transform infrared microspectroscopy and advanced computational methods in colon cancer patients," J. Biomed. Opt. 7, 248–254 (2002).
19. MATLAB, Version 7.0 (R14), The MathWorks Inc., Natick, MA (2007).
20. A. Zwielly, J. Gopas, G. Brkic, and S. Mordechai, "Detection of a drug-resistant human melanoma cell line using FTIR spectroscopy," Analyst (Cambridge, U.K.) 134, 294–300 (2009).
21. M. Diem, P. Griffith, and J. Chalmers, Vibrational Spectroscopy for Medical Diagnosis (Wiley, New York, 2008).
22. R. A. Fisher, "The use of multiple measures in taxonomic problems," Ann. Eugen. 7, 179–188 (1936).
23. C. Huberty, Applied Discriminant Analysis (Wiley, New York, 1994).
24. K. Fukunaga, Introduction to Statistical Pattern Recognition (Academic, San Diego, 1990).
25. H. Gremlich and B. Yang, Infrared and Raman Spectroscopy of Biological Materials (Dekker, New York, 2001), pp. 421–475.
26. M. Diem, S. Boydston-White, and L. Chiriboga, "Infrared spectroscopy of cells and tissues: Shining light onto a novel subject," Appl. Spectrosc. 53, 148–161 (1999).
27. H. H. Mantsch and D. Chapman, Infrared Spectroscopy of Biomolecules (Wiley, New York, 1996).
28. J. Liquier and E. Taillandier, in Infrared Spectroscopy of Biomolecules, edited by H. H. Mantsch and D. Chapman (Wiley, New York, 1996), pp. 131–158.
29. S. Wartewig, IR and Raman Spectroscopy (Wiley, New York, 2003), pp. 75–124.
30. D. Naumann, "FT-infrared and FT-Raman spectroscopy in biomedical research," Appl. Spectrosc. Rev. 36, 239–298 (2001).
31. F. S. Parker, Application of Infrared Spectroscopy in Biochemistry, Biology and Medicine (Plenum, New York, 1971).
32. A. Salman, S. Argov, J. Ramesh, J. Goldstein, S. Igor, H. Guterman, and S. Mordechai, "FTIR microscopic characterization of normal and malignant human colonic tissues," Cell. Mol. Biol. (Paris) 47(22), 159–166 (2001).
33. P. G. Andrus, "Cancer monitoring by FTIR spectroscopy," Technol. Cancer Res. Treat. 5(2), 157–167 (2006).
34. V. R. Kondepati, M. Keese, H. M. Heise, and J. Backhaus, "Detection of structural disorders in pancreatic tumour DNA with Fourier-transform infrared spectroscopy," Vib. Spectrosc. 40, 33–39 (2006).
35. Z. Movasaghi, S. Rehman, and I. ur Rehman, "Fourier transform infrared (FTIR) spectroscopy of biological tissues," Appl. Spectrosc. Rev. 43, 134–179 (2008).
36. C. Krafft, D. Codrich, G. Pelizzo, and V. Sergo, "Raman and FTIR microscopic imaging of colon tissue: A comparative study," J. Biophotonics May 1(2), 154–169 (2008).
