Shen and Chen [14] measured the focus by dividing the image into multiple square blocks and calculating the energy ratio of AC and DC coefficients of each block in the discrete cosine transform domain:

    F_DCT = Σ_{u=1}^{H/S} Σ_{v=1}^{W/S} E_AC(u,v) / E_DC(u,v),    (5)

where H and W are the height and width of the image, S is the size of the block, E_AC(u,v) is the sum of squares of the AC coefficients, and E_DC(u,v) is the square of the DC coefficient in the block (u, v). In this study, we used S = 15.

TABLE II
Classification performance comparison using different categories of hand-crafted features. The AUC is given in the format of µ (σ) from 5-fold cross-validation.

Hand-crafted features                          AUC
Reflection                                     0.740 (0.036)
Intensity                                      0.840 (0.067)
Edge                                           0.718 (0.022)
GLCM                                           0.884 (0.022)
Blur                                           0.891 (0.021)
Reflection + Intensity                         0.843 (0.047)
Reflection + Intensity + Edge                  0.873 (0.030)
Reflection + Intensity + Edge + GLCM           0.899 (0.023)
Reflection + Intensity + Edge + GLCM + Blur    0.909 (0.020)
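As an illustration, the block-wise focus measure of Eq. (5) can be sketched in Python. This is a minimal sketch, assuming the input is a grayscale image given as a 2-D NumPy array; the function name and the handling of partial border blocks (they are simply skipped) are our own choices, not taken from the paper.

```python
# Block-wise DCT focus measure, Eq. (5): for each S x S block, accumulate
# (sum of squared AC coefficients) / (squared DC coefficient).
import numpy as np
from scipy.fft import dctn

def dct_focus_measure(image, S=15):
    """Focus measure of a grayscale image (2-D array), per Eq. (5)."""
    H, W = image.shape
    total = 0.0
    for u in range(H // S):
        for v in range(W // S):
            block = image[u * S:(u + 1) * S, v * S:(v + 1) * S].astype(np.float64)
            coeffs = dctn(block, norm='ortho')
            e_dc = coeffs[0, 0] ** 2            # square of the DC coefficient
            e_ac = np.sum(coeffs ** 2) - e_dc   # sum of squares of AC coefficients
            if e_dc > 0:                        # guard against all-zero blocks
                total += e_ac / e_dc
    return total
```

A perfectly flat image has no AC energy and scores (numerically) zero, while blurring an image suppresses its AC energy and lowers the measure, which is the behavior exploited for blur detection.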
E. Classifier Training

A random forest (RF) model was trained using the combination of hand-crafted features and bottleneck features. Because the number of features after feature fusion is large, the RF model was selected for its property of automatic feature selection.

III. RESULTS AND DISCUSSION

A. Experimental Settings

Our CNN was implemented using the TensorFlow library. During fine-tuning, the Adam optimizer was used to minimize the cross-entropy loss with a learning rate of 10^-6. To better compare the performance of different methods, patient-wise 5-fold cross-validation was performed: frames were divided into 5 folds such that each fold contained frames from two patients. The mean (µ) and standard deviation (σ) of the F1-score, sensitivity, specificity, and AUC over the 5 folds were calculated as our final results.

B. Classification Performance of Hand-crafted Features

Table II lists the classification performance of single hand-crafted feature categories and of their combinations. Among single categories, the GLCM and blur measures perform significantly better than the others. The model using only intensity statistics has a high standard deviation, and the model using only edge features has low classification accuracy. This may be because these two feature categories are discriminative for only a portion of the images. For example, in colonoscopy videos, frames before tube insertion and after tube withdrawal contain abundant edge information, yet they are non-informative. Likewise, the intensity value of each channel may be affected by the patient's individual colon environment and the camera settings. However, when features from reflections, intensity statistics, and edges are combined, the classification performance increases significantly, indicating that they are highly complementary.

C. Classification Performance of Feature Fusion

From Table III, combining hand-crafted features in the HSV color space with deep-learning-based features in the RGB color space achieves a statistically significant improvement in non-informative frame classification performance compared with the other methods. While deep learning methods have proven effective at extracting comprehensive features, our result demonstrates that a small set of hand-crafted features based on visual prior knowledge can still provide additional, helpful information. This observation may also apply to other medical applications where the training set is small and domain knowledge is available.

Fig. 6 gives examples of classification results using feature fusion. While our model classifies most frames correctly, misclassification happens when frames exhibit features of both the informative and non-informative groups. For the first two frames in Fig. 6(a), fold contours are visible while water or reflections obscure part of the camera view.
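The classifier training and patient-wise cross-validation protocol described above can be sketched as follows. This is an illustrative sketch only: the feature arrays, labels, and patient ids are synthetic placeholders, and scikit-learn's GroupKFold is one way (not necessarily the authors') to keep all of a patient's frames in a single fold.

```python
# Sketch: feature fusion by concatenation + random forest, evaluated with
# patient-wise 5-fold cross-validation. All data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GroupKFold

rng = np.random.RandomState(0)
n_frames = 500
hand_crafted = rng.rand(n_frames, 20)       # placeholder hand-crafted (HSV) features
bottleneck = rng.rand(n_frames, 128)        # placeholder bottleneck (RGB) features
X = np.hstack([hand_crafted, bottleneck])   # feature fusion by concatenation
y = (hand_crafted[:, 0] + bottleneck[:, 0] > 1.0).astype(int)  # synthetic labels
patients = rng.randint(0, 10, n_frames)     # patient id per frame (10 patients)

aucs = []
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=patients):
    # With 10 patients and 5 splits, each fold holds out the frames of two patients.
    rf = RandomForestClassifier(n_estimators=100, random_state=0)
    rf.fit(X[train_idx], y[train_idx])
    aucs.append(roc_auc_score(y[test_idx], rf.predict_proba(X[test_idx])[:, 1]))

print("AUC: %.3f (%.3f)" % (np.mean(aucs), np.std(aucs)))
```

Grouping the split by patient, rather than shuffling frames freely, prevents near-duplicate frames from the same video appearing in both the training and test folds, which would inflate the reported performance.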
For the last frames in Fig. 6(a), the camera is too close to the colon wall to capture informative content, while vessels in the colon wall are quite clear. For frames in Fig. 6(b), the reason for being misclassified as non-informative may be that they are blurry and obscured by water; however, the overall colon structures are still visible, so it is reasonable to annotate them as informative. These misclassified instances reveal one limitation of our study: the lack of quantitative criteria for image annotation. Adding uncertainty grading during the annotation process and integrating the uncertainty into the training process may improve model performance.

Fig. 6. Examples of classification results using feature fusion. (a) Frames incorrectly classified as informative; (b) frames incorrectly classified as non-informative.

TABLE III
Classification performance comparison. The evaluation measures are given in the format of µ (σ) from 5-fold cross-validation.

Method                      AUC            F1             Sensitivity    Specificity
Hand-crafted features + RF  0.909 (0.020)  0.720 (0.045)  0.846 (0.043)  0.845 (0.020)
Deep learning               0.924 (0.020)  0.752 (0.032)  0.821 (0.022)  0.890 (0.051)
Bottleneck features + RF    0.928 (0.012)  0.756 (0.040)  0.824 (0.043)  0.891 (0.008)
Feature fusion + RF         0.939 (0.009)  0.775 (0.028)  0.828 (0.057)  0.919 (0.029)

Note: The method "Deep learning" uses an end-to-end Inception-v3 architecture, while "Bottleneck features + RF" uses bottleneck features and a random forest (RF) classifier. A paired t-test comparing the cross-validation results of "Feature fusion + RF" and "Bottleneck features + RF" gives p-values smaller than 0.05 for AUC, F1, and specificity.

IV. CONCLUSION

A new algorithm for non-informative frame detection in colonoscopy videos was proposed, using a combination of bottleneck features in the RGB color space and a small set of hand-crafted features in the HSV color space. In our experiments, feature fusion achieved an average AUC of 0.939 under 5-fold cross-validation, better than using bottleneck features or hand-crafted features alone. Our key contribution is threefold. First, we designed a feature extraction algorithm in the HSV color space and demonstrated that features from both the RGB and HSV color spaces can better characterize frames from colonoscopy videos. Second, we demonstrated the effectiveness of integrating visual prior knowledge into data representations extracted using deep learning: although deep learning techniques have achieved wide success in various fields, feature engineering with domain knowledge can still provide valuable information, and feature fusion has the potential to improve model performance, especially when the training dataset is small and domain knowledge is available. Finally, the proposed automatic and accurate non-informative frame detection system is essential for further colonoscopy video analysis; accurate detection and removal of non-informative frames can improve the accuracy of disease severity estimation and reduce computational cost.

REFERENCES

[1] L. Hixson, M. B. Fennerty, R. Sampliner, and H. Garewal, "Prospective blinded trial of the colonoscopic miss-rate of large colorectal polyps," Gastrointestinal Endoscopy, vol. 37, no. 2, pp. 125–127, 1991.
[2] T. Kaltenbach, S. Friedland, and R. Soetikno, "A randomized tandem colonoscopy trial of narrow band imaging versus white light examination to compare neoplasia miss rates," Gut, 2008.
[3] C. Ballesteros, M. Trujillo, C. Mazo, D. Chaves, and J. Hoyos, "Automatic classification of non-informative frames in colonoscopy videos using texture analysis," in Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, C. Beltrán-Castañón, I. Nyström, and F. Famili, Eds. Cham: Springer International Publishing, 2017, pp. 401–408.
[4] A. Islam, A. Alammari, J. Oh, W. Tavanapong, J. Wong, and P. C. de Groen, "Non-informative frame classification in colonoscopy videos using CNNs," in Proceedings of the 2018 3rd International Conference on Biomedical Imaging, Signal Processing. ACM, 2018, pp. 53–60.
[5] M. A. Armin, G. Chetty, F. Jurgen, H. De Visser, C. Dumas, A. Fazlollahi, F. Grimpen, and O. Salvado, "Uninformative frame detection in colonoscopy through motion, edge and color features," in International Workshop on Computer-Assisted and Robotic Endoscopy. Springer, 2015, pp. 153–162.
[6] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818–2826.
[7] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2009, pp. 248–255.
[8] A. M. Reza, "Realization of the contrast limited adaptive histogram equalization (CLAHE) for real-time image enhancement," Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology, vol. 38, no. 1, pp. 35–44, 2004.
[9] C. Harris and M. Stephens, "A combined corner and edge detector," in Alvey Vision Conference, vol. 15, no. 50, 1988, pp. 10–5244.
[10] D. H. Ballard, "Generalizing the Hough transform to detect arbitrary shapes," Pattern Recognition, vol. 13, no. 2, pp. 111–122, 1981.
[11] A. Baraldi and F. Parmiggiani, "An investigation of the textural characteristics associated with gray level cooccurrence matrix statistical parameters," IEEE Transactions on Geoscience and Remote Sensing, vol. 33, no. 2, pp. 293–304, 1995.
[12] K. De and V. Masilamani, "Image sharpness measure for blurred images in frequency domain," Procedia Engineering, vol. 64, pp. 149–158, 2013.
[13] M. Subbarao and J.-K. Tyan, "Selecting the optimal focus measure for autofocusing and depth-from-focus," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 8, pp. 864–870, 1998.
[14] C.-H. Shen and H. H. Chen, "Robust focus measure for low-contrast images," in Digest of Technical Papers, 2006 International Conference on Consumer Electronics (ICCE). IEEE, 2006, pp. 69–70.