Skin Cancer Classification

A Project Report On

AI-Powered Early Warning System for Skin Manifestations


in partial fulfilment for the award of the degree of

BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE AND ENGINEERING

Submitted by
BANKA DHARMIK
20B91A0526

Under the Guidance of


Dr. V CHANDRA SEKHAR
Professor & Head of Department

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


SRKR ENGINEERING COLLEGE (A)
Chinna Amiram, Bhimavaram, West Godavari Dist., A.P.
[2023 – 2024]

BONAFIDE CERTIFICATE

This is to certify that the project work entitled “AI-Powered Early Warning
System for Skin Manifestations” is the bonafide work of Banka Dharmik bearing
20B91A0526, who carried out the project work under my supervision in partial
fulfilment of the requirements for the award of the degree of Bachelor of
Technology in Computer Science and Engineering.

SUPERVISOR HEAD OF THE DEPARTMENT


( Dr V Chandra Sekhar ) ( Dr V Chandra Sekhar )
Professor Professor
SELF DECLARATION

I hereby declare that the project work entitled “AI-Powered Early Warning
System for Skin Manifestations” is a genuine work carried out by me in B.Tech
(Computer Science and Engineering) at SRKR Engineering College (A),
Bhimavaram and has not been submitted either in part or full for the award of any
other degree or diploma in any other institute or University.

Banka Dharmik
20B91A0526
ABSTRACT

Skin diseases rank as the fourth leading cause of nonfatal disease burden worldwide.
Not only do they affect the individual directly, but they can also be strong indicators of underlying diseases.
Early detection of skin lesions is essential for the management of dermatological disorders.
Effective skin care in underserved areas is challenging due to communication problems and a lack of
diagnostic tools and trained dermatologists. Even the preliminary screening of
dermatological manifestations is a demanding task.

An innovative solution is an artificial intelligence-based system that applies advanced
imaging techniques, using Convolutional Neural Networks (CNNs) trained on skin lesion datasets, for
proactive detection of skin conditions. The tool aims to provide cost-effective and accessible
diagnostic support to close the health gap, particularly in under-resourced communities, where
conventional diagnostic methods may be unavailable.

By automating the preliminary screening process, the tool facilitates remote screening, ensures
timely intervention, and reduces the burden on healthcare systems. These technological advances
address the challenges posed by the global burden of skin diseases, contributing to improved health
outcomes, psychological well-being, functioning and social participation of individuals affected.

Keywords: Dermatological disorders, skin lesions, machine learning, deep learning, early
detection, image segmentation, classification, severity assessment, recommendation, skin cancer.
TABLE OF CONTENTS
1 INTRODUCTION
1.1 The Urgency of Early Detection
1.2 AI Steps Up to the Challenge:
1.3 The Power of Deep Learning:
1.4 Beyond MNIST: The Complexity of Skin Lesions:
1.5 A Glimpse into the Future:
2 LITERATURE SURVEY
3 PROBLEM STATEMENT
4 METHODOLOGY
4.1 Preprocessing and Data Augmentation
4.2 CNN Architecture Design
4.3 Model Optimization
4.4 Overfitting Prevention Techniques
4.5 Model Evaluation and Fine-tuning
5 IMPLEMENTATION
5.1 Data Preparation
5.1.1 Importing the modules
5.1.2 Image Data Loading
5.1.3 Lesion Type Categorization
5.2 Image Preprocessing and Augmentation
5.2.1 Image Resizing and Normalization
5.2.2 Data Augmentation
5.2.3 Class Balancing
5.3 Splitting Training and Testing Dataset
5.4 CNN Model Architecture
5.4.1 Convolutional Layers and Activation Functions
5.4.2 Pooling, Dropout, and Batch Normalization
5.4.3 Model Design
5.5 Model Training
5.6 Hyperparameter Tuning and Model Optimization
5.6.1 Early Stopping and Model Checkpoints
5.6.2 Optimization Algorithms
5.7 Model Evaluation and Analysis
5.7.1 Loading the Best Model and Calculating Accuracy
5.7.2 Class-wise Accuracy Assessment
6 RESULT ANALYSIS
6.1 Overall Performance
6.2 Accuracy and Loss Trends
6.3 Class-wise Performance
6.4 Visual Comparisons
7 CONCLUSION AND FUTURE WORK
8 REFERENCES
1 INTRODUCTION

The human skin, our largest and most visible organ, acts as a constant shield against the external
world. Yet, this protective barrier itself is susceptible to a multitude of diseases, impacting millions
worldwide. These conditions, ranging from common rashes to potentially life-threatening cancers,
can significantly affect individuals physically, psychologically, and socially. Early detection and
management are crucial for effective treatment and improved quality of life. However, access to
healthcare professionals and diagnostic tools remains a challenge, particularly in underserved
communities. This is where the power of artificial intelligence (AI) emerges, offering a glimmer of
hope for bridging the healthcare gap and revolutionizing skin disease detection.

1.1 The Urgency of Early Detection

Skin lesions, often the first visible signs of dermatological conditions, play a critical role in early
diagnosis. Take skin cancer, the most common human malignancy. Traditionally, its detection relies
on visual inspection, followed by dermoscopy, biopsy, and histopathological examination. This
process, while effective, can be time-consuming, resource-intensive, and prone to human error.
Delays in diagnosis can have severe consequences, highlighting the need for faster, more accurate
methods.

1.2 AI Steps Up to the Challenge:

This project delves into the exciting realm of AI-powered skin lesion analysis, proposing a novel
solution: a system that utilizes convolutional neural networks (CNNs) for proactive detection.
Imagine a world where a simple image captured by a smartphone can be analyzed by an AI system,
providing valuable insights into potential skin concerns. This system aims to bridge the healthcare
gap by offering:

1. Cost-effectiveness: By automating the preliminary screening process, the system can reduce
the reliance on expensive specialist consultations.
2. Accessibility: The system's digital nature makes it readily available, even in remote areas
with limited access to dermatologists.
3. Efficiency: AI algorithms can analyze large volumes of data quickly and
accurately, facilitating faster diagnosis and treatment initiation.
1.3 The Power of Deep Learning:

The core of this system lies in the power of CNNs, a type of deep learning architecture inspired
by the structure and function of the human visual cortex. By training these networks on vast
datasets of labelled skin lesion images, such as the HAM10000 dataset (Fig 1.1), the system learns
to recognize patterns and subtle differences crucial for accurate classification.

Fig 1.1: HAM10000 Dataset Sample Images

The HAM10000 dataset, a publicly available collection of 10,015 dermoscopic images
representing seven diagnostic categories of pigmented skin lesions, serves as a valuable training
ground for the CNN model. Each image in this dataset captures the unique characteristics of a skin
lesion, allowing the model to learn and refine its classification skills.

Another widely used and influential dataset is the ISIC Skin Cancer Dataset (Fig 1.2). It
consists of 2,357 images of malignant and benign oncological diseases, compiled by the
International Skin Imaging Collaboration (ISIC). It is publicly available and likewise serves as a
valuable training ground for CNN models.

Fig 1.2: ISIC Skin Cancer Dataset Sample Images

1.4 Beyond MNIST: The Complexity of Skin Lesions:

While the success of deep learning in tasks like image recognition is undeniable, skin lesion
analysis presents a unique set of challenges compared to simpler datasets like MNIST (handwritten
digits). The high variability in shape, color, and texture of skin lesions, coupled with the often
subtle differences between benign and malignant cases, demands a more sophisticated approach.
This project delves into the intricate process of building and evaluating a CNN model that can
effectively navigate these complexities, paving the way for a more accurate and reliable AI-
powered diagnostic tool.

1.5 A Glimpse into the Future:

This project is just one step in the ongoing journey towards harnessing the power of AI for
improved healthcare. As technology advances and datasets grow, the accuracy and efficiency of AI-
powered skin lesion analysis are poised to further improve. Imagine a future where AI-powered
systems seamlessly integrate into routine healthcare checkups, offering real-time feedback and
personalized recommendations. This future holds immense potential for early detection, improved
treatment outcomes, and ultimately, a healthier future for all.
2 LITERATURE SURVEY

Taye Girma Debelee [1] presents a systematic review of machine learning (ML) for skin disease
detection, focusing on evaluating datasets, current methods, and challenges. Following a detailed
methodology that includes research question formulation and database search strategies, the review
explores the use of ML and AI, particularly deep learning models like EfficientNets and MobileNet V2,
in dermatology. It highlights the importance of datasets such as HAM10000 and ISIC in
advancing diagnostic accuracy but points out the need for more extensive, varied datasets and
better model generalizability for future progress.

T. Swapna et al. [2] explore the application of deep learning for skin disease classification, utilizing
a CNN architecture alongside AlexNet, ResNet, and InceptionV3 models. The study focuses on
enhancing diagnostic accuracy through the HAM10000 dataset, comprising various skin conditions,
extended with images of cuts and burns. It underscores the effectiveness of deep learning in
simplifying the diagnostic process, promising significant improvements in accuracy and efficiency.

Nancy Girdhar, Aparna Sinha, and Shivang Gupta [3] introduce DenseNet-II, an advancement in
CNNs aimed at improving melanoma detection. Focused on overcoming the challenges of variable
image quality in diagnosis, their study highlights the critical need for accurate detection tools
amidst rising cancer cases. DenseNet-II merges the strengths of existing models like DenseNet,
VGG-16, InceptionV3, and ResNet into a superior classifier tested on the HAM10000 dataset. This
innovation represents a significant leap in medical imaging, offering enhanced accuracy in
identifying melanoma, thereby contributing to better outcomes in cancer care.

Evgin Goceri and Ayse Akman Karakas [4] conducted a comparative study to classify skin
diseases using CNNs, focusing on networks like VGG16, VGG19, GoogleNet, InceptionV3, and
ResNet101. They ensured uniform testing conditions across networks and assessed them based on
accuracy, precision, and other metrics. InceptionV3 was highlighted for its novel architecture,
contributing to effective skin lesion classification. Ultimately, ResNet101 showed the highest
accuracy at 77.72%. This research highlights the potential of specific CNN models, including
InceptionV3, in improving dermatological diagnostics.

Kassem, Hosny, and Fouad [5] propose a deep learning model utilizing transfer learning with
GoogleNet for the accurate classification of skin lesions. Their study, focusing on the ISIC 2019
cancer dataset, demonstrates the model's ability to distinguish between eight different skin lesion
classes. By modifying GoogleNet's architecture and employing transfer learning, they achieve
classification accuracy, sensitivity, specificity, and precision percentages of 94.92%, 79.8%, 97%,
and 80.36%, respectively. This work highlights the effectiveness of their approach in handling the
ISIC 2019 dataset.

Esmaeilzadeh's [6] study delves into consumer attitudes towards adopting AI in healthcare,
pinpointing technological, ethical, and regulatory factors as major influencers on acceptance. It
underscores the critical balance between perceived benefits and risks associated with AI tools for
health purposes, aiming to enhance user trust and guide ethical AI integration into healthcare
practices.

Elder et al. [7] discuss the transformative role of AI in cosmetic dermatology, particularly focusing on
skincare. They highlight AI-driven innovations such as personalized skincare regimens and
augmented reality tools for skin analysis. These advancements empower patients in their skincare
decisions and suggest a future where AI further personalizes and enhances dermatological care.

Berman et al. [8] highlight the increased risk of skin cancer in solid organ transplant recipients
(SOTRs), underscoring the need for diligent prevention, regular screenings, and careful
management. The study advocates for adjusted immunosuppressive treatments and emphasizes
multidisciplinary care to mitigate skin cancer risks in this vulnerable population.

Sander et al. [9] stress the importance of sunscreen in mitigating skin cancer risk, underscoring its
proven effectiveness against melanoma and nonmelanoma types. They recommend broad-spectrum
sunscreens with at least SPF 30, while noting ongoing research into the safety and environmental
impacts of sunscreen ingredients.

Shaheen's [10] review highlights the transformative role of AI in healthcare, particularly in areas
like drug discovery, clinical trials, and patient care. It showcases AI's capacity to accelerate
pharmaceutical research, streamline clinical trials through efficient data handling, and significantly
improve patient treatment outcomes. The review underlines AI's potential in enhancing healthcare
efficiency and making healthcare services more accessible and personalized.
3 PROBLEM STATEMENT

Skin conditions, a major global health issue, often go undiagnosed and untreated, especially in
resource-poor areas. This "diagnostic divide" leaves many facing the detrimental impacts of
untreated skin diseases without access to proper care. The lack of skilled dermatologists and
essential diagnostic tools in these communities further exacerbates the issue, delaying necessary
treatment, with potentially fatal consequences.

An innovative solution to this problem is the application of artificial intelligence (AI). By
leveraging AI, we can bridge the gap in diagnosis and treatment, offering a way to provide timely
and accurate skin disease diagnoses in underserved areas. This approach promises to improve the
management of skin conditions, enhancing health outcomes for those affected by this global
challenge.
4 METHODOLOGY

Skin cancer ranks among the most common cancers worldwide, posing a significant health
challenge, especially in regions with limited access to dermatology experts and diagnostic tools.
The capability for early detection and precise classification of skin lesions is vital for effective
treatment and improved patient outcomes. Yet, in many areas, a lack of medical infrastructure and
specialists heightens the risks associated with delayed diagnosis and treatment errors. Artificial
intelligence (AI), particularly through Convolutional Neural Networks (CNNs), emerges as a key
solution to this issue, offering a method to accurately classify skin cancer types based on
dermatoscopic images.

The HAM10000 dataset, which includes dermatoscopic images across seven skin cancer
categories, provides a valuable resource for training AI models in skin cancer classification. This
enables not only the advancement of machine learning in healthcare but also the possibility to make
dermatological diagnostics more accessible to underserved communities. The methodology
described herein focuses on creating and refining a CNN model for this purpose, leveraging AI to
potentially transform skin cancer diagnostics globally.

4.1 Preprocessing and Data Augmentation

• Normalization: Standardizes the pixel values across all images to have a mean of 0 and a
standard deviation of 1, ensuring that the model trains more efficiently.
• Data Augmentation Algorithms: Techniques like rotation, zoom, flip, and translation are
applied to increase the diversity of the training dataset, helping the model generalize better
to new, unseen images.
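
The normalization step above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the project's actual preprocessing code: the `standardize` helper, the random stand-in batch, and the choice of a single global mean and standard deviation (rather than per-channel statistics) are all assumptions.

```python
import numpy as np

def standardize(images: np.ndarray, eps: float = 1e-7) -> np.ndarray:
    """Shift and scale pixel values to mean 0 and standard deviation 1."""
    images = images.astype(np.float64)
    mean = images.mean()
    std = images.std()
    # eps guards against division by zero for a constant image batch.
    return (images - mean) / (std + eps)

# Hypothetical stand-in for a batch of four 100x100 RGB images.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 100, 100, 3), dtype=np.uint8)
normalized = standardize(batch)
```

In practice the mean and standard deviation would be computed on the training images only and reused for validation and test images, so that no information leaks from the held-out data.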

4.2 CNN Architecture Design

• Convolutional Layers: Utilize kernels or filters to perform convolution operations that
capture spatial hierarchies of features in images. Algorithms for automated kernel
optimization may be employed to find the most effective filters for feature extraction.
• Activation Functions: ReLU (Rectified Linear Unit) is commonly used to add
non-linearity to the network, allowing it to learn complex patterns in the data.
• Pooling Layers: Max pooling is commonly used to reduce the spatial dimensions of the
feature maps passed to the next convolutional layer, decreasing the number of parameters
and the computation in the network.
• Batch Normalization: Applied after convolutional layers to stabilize learning and speed up
training by normalizing each layer's activations over the mini-batch and then rescaling and
shifting them with learned parameters.
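
To make the first three building blocks concrete, here is a minimal NumPy sketch of one convolution, ReLU, and max-pooling pass on a single-channel image. The helper names, the toy 6x6 input, and the hand-picked edge kernel are illustrative assumptions; a real CNN learns its kernels and relies on an optimized library implementation.

```python
import numpy as np

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide the kernel over the image without padding (cross-correlation,
    which is what deep learning frameworks call convolution)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x: np.ndarray) -> np.ndarray:
    """Keep positive responses and zero out negative ones."""
    return np.maximum(x, 0.0)

def max_pool2x2(x: np.ndarray) -> np.ndarray:
    """Take the maximum of each non-overlapping 2x2 window."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy 6x6 image whose intensity increases from left to right and top to bottom.
image = np.arange(36, dtype=float).reshape(6, 6)
# Hand-picked kernel that responds to left-to-right intensity increases.
edge_kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = max_pool2x2(relu(conv2d_valid(image, edge_kernel)))
```

The 6x6 input shrinks to 4x4 after the 3x3 convolution and to 2x2 after pooling, which illustrates how pooling reduces the amount of data passed to later layers.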

4.3 Model Optimization

• Optimization Algorithms: Stochastic Gradient Descent (SGD) or Adam (Adaptive
Moment Estimation) are widely used for optimizing the neural network by minimizing the
loss function. These algorithms adjust the weights of the network iteratively to reduce
prediction errors.
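
As an illustration of how such an optimizer adjusts weights iteratively, the following sketch applies the standard Adam update rule to a simple quadratic loss. The `adam_minimize` helper, the toy loss, and the hyperparameter values are assumptions chosen for demonstration; they are not the settings used in this project.

```python
import numpy as np

def adam_minimize(grad_fn, w0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Minimize a function with the Adam update rule, given its gradient."""
    w = np.asarray(w0, dtype=float).copy()
    m = np.zeros_like(w)  # running mean of gradients (first moment)
    v = np.zeros_like(w)  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Toy loss: squared distance to a target weight vector.
target = np.array([3.0, -2.0])

def grad(w):
    return 2.0 * (w - target)

w_final = adam_minimize(grad, w0=[0.0, 0.0])
```

Plain SGD would replace the whole update with `w -= lr * g`; Adam additionally scales each weight's step by its own gradient history.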

4.4 Overfitting Prevention Techniques

• Dropout: A regularization technique where randomly selected neurons are ignored during
training, preventing them from co-adapting too much. This helps in reducing overfitting by
making the network more robust.
• Early Stopping: Monitors the model's performance on a validation set and stops training
when performance stops improving, for example when the validation loss begins to rise.
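
A minimal sketch of dropout, assuming the common "inverted" formulation in which surviving activations are rescaled during training so that nothing needs to change at inference time. The `dropout` helper and the all-ones activations are illustrative only.

```python
import numpy as np

def dropout(activations: np.ndarray, rate: float,
            rng: np.random.Generator, training: bool = True) -> np.ndarray:
    """Randomly zero a fraction `rate` of units and rescale the survivors."""
    if not training or rate == 0.0:
        return activations  # dropout is disabled at inference time
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    # Dividing by keep_prob keeps the expected activation unchanged.
    return activations * mask / keep_prob

rng = np.random.default_rng(42)
acts = np.ones((1000, 64))
dropped = dropout(acts, rate=0.5, rng=rng)
```

With a rate of 0.5, roughly half of the units become 0 and the rest become 2, so the average activation stays close to 1.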

4.5 Model Evaluation and Fine-tuning

• Transfer Learning: Leveraging models pre-trained on large datasets and fine-tuning them
on the HAM10000 dataset can significantly improve accuracy, especially when the dataset
is relatively small. Architectures such as VGG, ResNet, or Inception can serve as the starting
point.
• Ensemble Methods: Combining predictions from multiple models or variations of a model
can improve accuracy and robustness. Techniques like bagging and boosting may be
employed to aggregate the outputs of multiple models.
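
The simplest way to combine several classifiers is to average their predicted class probabilities (often called soft voting). The sketch below shows that idea only; it is not bagging or boosting, and the three probability tables are made-up values standing in for the outputs of three hypothetical models.

```python
import numpy as np

def soft_vote(prob_list):
    """Average class probabilities across models and pick the top class."""
    stacked = np.stack(prob_list, axis=0)  # (n_models, n_samples, n_classes)
    mean_probs = stacked.mean(axis=0)
    return mean_probs, mean_probs.argmax(axis=1)

# Made-up softmax outputs for two images and three classes.
model_a = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
model_b = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
model_c = np.array([[0.2, 0.7, 0.1], [0.2, 0.2, 0.6]])

mean_probs, labels = soft_vote([model_a, model_b, model_c])
```

For the first image, two of the three models favour class 0, yet the averaged probabilities favour class 1 because the third model is far more confident. This is the main difference between soft voting and a simple majority vote.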

This methodology provides a foundation for building a powerful CNN model capable of
classifying skin cancer with high accuracy. By carefully selecting and fine-tuning these algorithms,
it's possible to create a model that generalizes effectively to new, unseen images, making strides in
addressing the global challenge of skin cancer diagnosis.
5 IMPLEMENTATION

In addressing the pressing challenge of skin cancer detection, this project harnesses the power of
Convolutional Neural Networks (CNNs) to classify dermatoscopic images from the HAM10000
dataset. The implementation aims to develop a robust, accurate, and efficient model that can assist
in early diagnosis and contribute to improved patient outcomes. This section outlines the systematic
steps taken, from data preparation through model evaluation, to achieve these goals.

Fig 5.1: Implementation Flowchart


5.1 Data Preparation

To optimize the CNN's performance for skin cancer classification, we undertook several dataset
preparation steps. Our objective was to transform the raw HAM10000 image data into a format
conducive to effective machine learning. The steps included:
1. Importing libraries and modules
2. Image Data Loading
3. Lesion Type Categorization

5.1.1 Importing the modules

We began by importing essential libraries and modules, including NumPy for linear
algebra, Pandas for data processing, and TensorFlow along with Keras for building and training
our CNN model.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import cv2
import os

# Assumed source of the imread() and resize() calls used in Section 5.2.1.
from skimage.io import imread
from skimage.transform import resize

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential, load_model, Model
from tensorflow.keras.layers import (Conv2D, BatchNormalization, MaxPool2D, Flatten, Dense, Dropout,
                                     Activation, GlobalAveragePooling2D, AveragePooling2D)
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import Adam

from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight
from sklearn.metrics import classification_report, confusion_matrix
5.1.2 Image Data Loading

We initiated the preparation by loading the dermatoscopic images along with their metadata
from the HAM10000 dataset. This step involved reading the image files and the accompanying
CSV file that contains metadata, including diagnoses (dx), image IDs, and patient information.

# Read the dataset metadata into a DataFrame.
df_skin = pd.read_csv('./data/HAM10000_metadata.csv')

# Display the first 5 rows.
df_skin.head(5)

5.1.3 Lesion Type Categorization

To facilitate targeted analysis, we categorized images based on lesion types. We mapped
each lesion diagnosis code to a corresponding descriptive label and a numeric ID for model
processing. This enabled us to treat skin cancer classification as a multi-class problem.

# Lesion/disease names are given in the description of the dataset.
lesion_type_dict = {
    'nv': 'Melanocytic nevi',
    'mel': 'Melanoma',
    'bkl': 'Benign keratosis-like lesions',
    'bcc': 'Basal cell carcinoma',
    'akiec': 'Actinic keratoses',
    'vasc': 'Vascular lesions',
    'df': 'Dermatofibroma'
}

# Numeric class ID for each diagnosis code.
lesion_ID_dict = {
    'nv': 0,
    'mel': 1,
    'bkl': 2,
    'bcc': 3,
    'akiec': 4,
    'vasc': 5,
    'df': 6
}

# Map each diagnosis code (dx) to its lesion type and numeric ID.
df_skin['lesion_type'] = df_skin['dx'].map(lesion_type_dict)
df_skin['lesion_ID'] = df_skin['dx'].map(lesion_ID_dict)
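
For a multi-class network, the numeric IDs are typically converted to one-hot target vectors before training. The sketch below shows that conversion in plain NumPy under that assumption; the `one_hot` helper and the three sample codes are hypothetical, and Keras's `to_categorical` (imported earlier) performs the same conversion.

```python
import numpy as np

lesion_ID_dict = {'nv': 0, 'mel': 1, 'bkl': 2, 'bcc': 3,
                  'akiec': 4, 'vasc': 5, 'df': 6}

def one_hot(ids, num_classes: int) -> np.ndarray:
    """Turn integer class IDs into rows with a single 1 at the class index."""
    ids = np.asarray(ids)
    encoded = np.zeros((ids.size, num_classes), dtype=np.float32)
    encoded[np.arange(ids.size), ids] = 1.0
    return encoded

# Three hypothetical diagnosis codes as they would appear in the dx column.
dx_codes = ['nv', 'mel', 'df']
ids = [lesion_ID_dict[code] for code in dx_codes]
targets = one_hot(ids, num_classes=len(lesion_ID_dict))
```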

5.2 Image Preprocessing and Augmentation

To optimize our CNN for skin cancer classification, we resized the HAM10000 images for
consistency and applied data augmentation techniques like rotation, flip, and zoom. These steps
enriched the dataset and improved model generalization.
1. Image Resizing and Normalization
2. Data Augmentation
3. Class Balancing

5.2.1 Image Resizing and Normalization

Given the varied sizes of dermatoscopic images, we resized each image to a uniform
dimension (e.g., 100x100 pixels) to ensure consistency.

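# Read the image and resize it to 100x100 pixels.
# Assuming imread/resize come from scikit-image, resize() also rescales
# the pixel values to floats in [0, 1].
img = imread(file_to_read)
img2 = resize(img, (100, 100))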

5.2.2 Data Augmentation

To enhance the diversity of the training dataset and mitigate overfitting, we applied data
augmentation techniques such as rotation, zoom, flip, and translation. This step generated
additional synthetic images from the original dataset, increasing its robustness.

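def produce_new_img(img2: np.ndarray) -> tuple:
    # Return five augmented copies of img2: three rotations and two flips.
    imga = cv2.rotate(img2, cv2.ROTATE_90_CLOCKWISE)
    imgb = cv2.rotate(img2, cv2.ROTATE_90_COUNTERCLOCKWISE)
    imgc = cv2.rotate(img2, cv2.ROTATE_180)
    imgd = cv2.flip(img2, 0)  # flip vertically (around the x-axis)
    imge = cv2.flip(img2, 1)  # flip horizontally (around the y-axis)
    new_imges = imga, imgb, imgc, imgd, imge
    return new_imges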
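
For readers without OpenCV installed, the same five transformations can be approximated with NumPy alone. This is an illustrative equivalent, not the code used in the project; the `produce_new_img_np` name and the tiny 2x2 test image are assumptions.

```python
import numpy as np

def produce_new_img_np(img: np.ndarray) -> tuple:
    """NumPy counterparts of the three rotations and two flips above."""
    return (
        np.rot90(img, k=-1),  # 90 degrees clockwise
        np.rot90(img, k=1),   # 90 degrees counterclockwise
        np.rot90(img, k=2),   # 180 degrees
        np.flipud(img),       # vertical flip (like cv2.flip code 0)
        np.fliplr(img),       # horizontal flip (like cv2.flip code 1)
    )

# Tiny 2x2 RGB image so that each pixel is easy to track.
img = np.arange(12).reshape(2, 2, 3)
variants = produce_new_img_np(img)
```

Each original image therefore yields five additional training samples, which is particularly helpful for the rarer lesion classes.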

5.2.3 Class Balancing

Recognizing the imbalanced nature of the dataset, with some lesion types being more
prevalent than others, we computed class weights. This approach allowed us to give higher
importance to underrepresented classes, aiming for balanced model sensitivity across all categories.

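def est_class_weights(dis_id: np.ndarray) -> dict:
    # 'balanced' weights each class by n_samples / (n_classes * class_count).
    classes = np.unique(dis_id)
    class_weights = np.around(compute_class_weight(class_weight = 'balanced', classes = classes, y = dis_id), 2)
    class_weights = dict(zip(classes, class_weights))
    return class_weights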
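
The "balanced" heuristic used by scikit-learn assigns each class the weight n_samples / (n_classes * class_count). The following NumPy sketch reproduces that formula on a toy label vector; the `balanced_class_weights` helper and the toy counts are illustrative assumptions.

```python
import numpy as np

def balanced_class_weights(labels) -> dict:
    """Weight each class by n_samples / (n_classes * class_count)."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    weights = labels.size / (classes.size * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}

# Toy labels: class 0 is common, class 2 is rare.
toy_labels = np.array([0] * 6 + [1] * 3 + [2] * 1)
weights = balanced_class_weights(toy_labels)
```

The rarest class receives the largest weight, so a misclassified rare lesion contributes more to the training loss than a misclassified common one.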

5.3 Splitting Training and Testing Dataset

We split the dataset into training and test sets, allocating 75% of the images for training and
25% for testing. This division allowed us to evaluate the CNN’s performance on unseen data,
ensuring the model's generalizability.

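# Split into 75% training and 25% test data.
# On the right-hand side, y_train is assumed to hold the encoded labels for all images;
# y holds the integer lesion IDs and is used to stratify so both sets keep the class proportions.
X_train, X_test, y_train, y_test = train_test_split(x, y_train, test_size = 0.25, random_state = 42, stratify = y)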
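
What stratification means can be seen in a small NumPy sketch that splits each class separately. This illustrates the idea rather than scikit-learn's actual implementation; the `stratified_split` helper and the toy label counts are assumptions.

```python
import numpy as np

def stratified_split(labels, test_size: float, seed: int = 42):
    """Return train/test indices that preserve the class proportions."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        # Shuffle the indices of this class, then cut off the test share.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = int(round(test_size * idx.size))
        test_idx.extend(idx[:n_test].tolist())
        train_idx.extend(idx[n_test:].tolist())
    return np.array(train_idx), np.array(test_idx)

# Toy imbalanced labels: 80 / 16 / 4 samples per class.
labels = np.array([0] * 80 + [1] * 16 + [2] * 4)
train_idx, test_idx = stratified_split(labels, test_size=0.25)
```

Without stratification, a random 25% sample could easily contain none of the four samples from the rarest class, which would make its test accuracy impossible to estimate.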

5.4 CNN Model Architecture

The architecture incorporates modern neural network practices to extract meaningful patterns
from the data, enhancing the model's predictive accuracy. The core components of our CNN include
convolutional layers with activation functions, pooling to reduce dimensionality, dropout for
regularization, and batch normalization to accelerate training.

5.4.1 Convolutional Layers and Activation Functions

Our CNN employs multiple convolutional layers with filters of various sizes to capture a
wide range of features essential for skin cancer classification. Each convolutional layer is
followed by a ReLU activation function, introducing non-linearity and enabling efficient
learning of complex patterns. Because ReLU does not saturate for positive inputs, it helps
mitigate the vanishing gradient problem, which supports training deeper networks that can
distinguish between different skin cancer types.
5.4.2 Pooling, Dropout, and Batch Normalization

• Pooling: To reduce the spatial dimensions of the feature maps and mitigate overfitting,
we incorporated Max Pooling layers after specific convolutional layers. This approach
helps in reducing the number of parameters and computation in the network, making the
model more efficient.
• Dropout: Recognizing the importance of preventing overfitting, especially when
working with a relatively limited dataset, we integrated Dropout layers within the
network architecture. By randomly dropping units during training, Dropout forces the
model to learn more robust features that are generalizable across unseen data.
• Batch Normalization: To ensure stable and faster training, we applied Batch
Normalization after each convolutional layer. This technique normalizes the inputs to
each layer, reducing internal covariate shift and improving the overall training dynamics.
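
The normalization performed by a batch normalization layer during training can be sketched as follows. This is a simplified forward pass on a 2-D batch of features; the `batch_norm` helper is illustrative, and a real layer also tracks running statistics for use at inference time and, for convolutional feature maps, normalizes per channel.

```python
import numpy as np

def batch_norm(x: np.ndarray, gamma: np.ndarray, beta: np.ndarray,
               eps: float = 1e-5) -> np.ndarray:
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # gamma and beta are learned; 1 and 0 leave the normalized values as is.
    return gamma * x_hat + beta

# Hypothetical batch of 32 samples with 8 features, far from zero mean.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(32, 8))
out = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
```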

5.4.3 Model Design

Fig 5.2: Dense Net Model Architecture

5.5 Model Training


After finalizing the CNN architecture and preprocessing the dataset, we proceed to the training
phase, where the model learns from the data. This phase involves feeding the preprocessed images
and their corresponding labels into the model, allowing it to adjust its weights and biases to
minimize the loss function.

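### Training phase ###

# Augment training images on the fly with random zoom, horizontal flips, and shear
datagen = ImageDataGenerator(zoom_range = 0.2, horizontal_flip = True, shear_range = 0.2)
datagen.fit(X_train)
# Batch size and shuffling are set on the generator; model.fit expects batch_size
# to be left unset when the input is a generator
history = model.fit(datagen.flow(X_train, y_train, batch_size = batch_size, shuffle = True),
                    epochs = epochs,
                    callbacks = [early_stopping_monitor, model_checkpoint_callback],
                    validation_data = (X_test, y_test),
                    class_weight = new_class_weights
                    )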
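
The training call passes `class_weight = new_class_weights`, whose construction is not shown in this section. One common way to derive such weights, given here purely as an assumption about how they might be built, is inverse-frequency ("balanced") weighting, where each class receives the weight n_samples / (n_classes * class_count). The helper name and the toy labels below are hypothetical.

```python
import numpy as np

def balanced_class_weights(int_labels, num_classes):
    """Weight each class by n_samples / (num_classes * class_count)."""
    counts = np.bincount(int_labels, minlength=num_classes)
    n_samples = len(int_labels)
    return {cls: float(n_samples / (num_classes * count))
            for cls, count in enumerate(counts) if count > 0}

# Toy integer labels: class 0 is common, class 2 is rare.
toy_labels = np.array([0] * 6 + [1] * 3 + [2] * 1)
weights = balanced_class_weights(toy_labels, num_classes=3)
print(weights)
```

Rare classes receive larger weights, so their misclassifications contribute more to the loss during training.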

5.6 Hyperparameter Tuning and Model Optimization

To enhance the performance of our CNN model, we focused on fine-tuning hyperparameters
and optimizing the model's learning process. This phase is crucial for achieving the best possible
accuracy and efficiency in skin cancer classification.

5.6.1 Early Stopping and Model Checkpoints

• Early Stopping: To prevent overfitting and save computational resources, we
implemented early stopping. This technique halts training when the validation accuracy
has not improved for a set number of epochs (the patience), reducing the risk that the
model fits noise in the data.
• Model Checkpoints: We used model checkpoints to save the model at intermediate
stages during training. This approach ensures we can retrieve the version of the model
with the highest validation accuracy, safeguarding against potential degradation in
performance during later epochs.

def mod_checkpoint_callback() -> ModelCheckpoint:
    trained_model = ModelCheckpoint(filepath = 'model.h5',       # output file name
                                    save_weights_only = False,   # save the full model, not only the weights
                                    monitor = 'val_accuracy',    # metric used to compare checkpoints
                                    mode = 'auto',               # infer from the metric name whether higher is better
                                    save_best_only = True,       # overwrite only when val_accuracy improves
                                    verbose = 1)
    return trained_model

# Monitor the training process at each epoch.
early_stopping_monitor = EarlyStopping(patience = 100, monitor = 'val_accuracy')
model_checkpoint_callback = mod_checkpoint_callback()
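
The interplay of the two callbacks can be sketched in plain Python. This is a simplified simulation, not the Keras implementation: the function `simulate_training` and the list of validation accuracies are hypothetical, and saving a checkpoint is represented only by remembering the best epoch.

```python
def simulate_training(val_accuracies, patience):
    """Return (best_epoch, best_accuracy, stopped_epoch) for per-epoch validation accuracies."""
    best_acc = float("-inf")
    best_epoch = None
    epochs_without_improvement = 0
    stopped_epoch = len(val_accuracies) - 1   # default: training ran to the end
    for epoch, acc in enumerate(val_accuracies):
        if acc > best_acc:
            # Checkpoint step: this is where the best model would be saved.
            best_acc, best_epoch = acc, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                stopped_epoch = epoch         # early stopping triggers here
                break
    return best_epoch, best_acc, stopped_epoch

# Hypothetical validation accuracies, one value per epoch.
history = [0.60, 0.70, 0.75, 0.74, 0.73, 0.76, 0.75, 0.74, 0.73, 0.72]
print(simulate_training(history, patience=3))
print(simulate_training(history, patience=100))
```

With `patience = 3` the toy run stops at epoch 8 (zero-indexed) and the best checkpoint comes from epoch 5. With `patience = 100`, as configured above, the same loop runs through every epoch, so if training is capped at 100 epochs (as the figure captions in Section 6 suggest), only the checkpoint callback influences which model is kept.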

5.6.2 Optimization Algorithms

We experimented with various optimization algorithms, including Stochastic Gradient
Descent (SGD) and Adam. Each optimizer has unique characteristics that influence the model's
convergence speed and accuracy. By tuning parameters such as the learning rate and
momentum, we optimized the model's ability to minimize the loss function efficiently.

# Configure the Adam optimizer and compile the model.
optimizer = Adam(learning_rate = 0.001, beta_1 = 0.9, beta_2 = 0.999, epsilon = 1e-3)
model.compile(optimizer = optimizer, loss = 'categorical_crossentropy', metrics = ['accuracy'])
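
To make the role of these hyperparameters concrete, the sketch below applies the textbook Adam update rule to a single scalar weight, using the same `learning_rate`, `beta_1`, `beta_2`, and `epsilon` values as above. The function `adam_step` and the quadratic toy loss are hypothetical, and the internal Keras implementation may differ in detail.

```python
import math

def adam_step(w, grad, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-3):
    """One Adam update for a scalar weight; returns the new (w, m, v)."""
    m = beta_1 * m + (1 - beta_1) * grad        # running mean of gradients
    v = beta_2 * v + (1 - beta_2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta_1 ** t)               # bias correction for early steps
    v_hat = v / (1 - beta_2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + epsilon)
    return w, m, v

def loss(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def grad(w):
    """Derivative of the toy loss."""
    return 2.0 * (w - 3.0)

# First step: the update size is close to the learning rate, whatever the gradient scale.
first_w, _, _ = adam_step(0.0, grad(0.0), 0.0, 0.0, t=1)

# A short run of 200 steps moves the weight steadily towards the minimum.
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 201):
    w, m, v = adam_step(w, grad(w), m, v, t)

print(first_w, w, loss(w))
```

Because each step is normalized by the running gradient magnitude, the learning rate directly bounds how far a weight can move per update, which is one reason it is usually the first hyperparameter to tune.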

5.7 Model Evaluation and Analysis

We meticulously assessed the performance of our CNN model, focusing on accuracy, loss
metrics, and class-wise accuracy to determine its efficacy in classifying various skin cancer types.
This evaluation not only highlighted the model's predictive strengths but also identified areas for
potential improvement, setting a foundation for future enhancements.

5.7.1 Loading the Best Model and Calculating Accuracy

The best-performing model was saved during training by the checkpoint callback, which
tracks validation accuracy. This saved model represents our CNN after hyperparameter tuning
and training.

# Load the best model saved by the checkpoint callback
best_model = load_model('./model.h5')
# Compute class probabilities, then convert predictions and one-hot labels to class indices
y_pred_prob = np.around(best_model.predict(X_test), 3)
y_pred = np.argmax(y_pred_prob, axis = 1)
y_test2 = np.argmax(y_test, axis = 1)

# Report the model accuracy as a percentage
scores = best_model.evaluate(X_test, y_test, verbose = 1)
print("Accuracy: %.2f%%" % (scores[1] * 100))
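
The conversion from predicted probabilities and one-hot labels to an accuracy value can be checked on a small example. The arrays below are hypothetical and stand in for `best_model.predict(X_test)` and `y_test`.

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples and 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
# Hypothetical one-hot ground truth for the same 4 samples.
one_hot = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1],
                    [0, 1, 0]])

pred_labels = np.argmax(probs, axis=1)     # most probable class per sample
true_labels = np.argmax(one_hot, axis=1)   # position of the 1 in each row
accuracy = np.mean(pred_labels == true_labels)
print("Accuracy: %.2f%%" % (accuracy * 100))
```

Three of the four toy predictions match their labels, so the snippet reports 75.00%.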

5.7.2 Class-wise Accuracy Assessment

To ensure our model performs well across all lesion types, we calculated class-wise
accuracy. This assessment highlights how effectively the model classifies each specific type of
skin cancer, revealing any biases or weaknesses.

# Accuracy for each lesion type
acc_tot = []

for i in range(7):
    acc_parz = round(np.mean(y_test2[y_test2 == i] == y_pred[y_test2 == i]), 2)
    lab_parz = lesion_names[i]
    print('accuracy for', lab_parz, '=', acc_parz)
    acc_tot.append(acc_parz)
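
The per-class accuracy computed in the loop above is the fraction of samples of each true class that were predicted correctly, which is the same quantity as per-class recall. As a sketch under that reading, the same numbers can be obtained in one step from a confusion matrix. The helper `confusion_matrix` and the toy label arrays are hypothetical.

```python
import numpy as np

def confusion_matrix(true_labels, pred_labels, num_classes):
    """Count predictions; rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (true_labels, pred_labels), 1)
    return cm

# Toy labels for 10 samples and 3 classes.
true_labels = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2])
pred_labels = np.array([0, 0, 1, 0, 1, 1, 2, 0, 2, 1])

cm = confusion_matrix(true_labels, pred_labels, num_classes=3)
# Correct predictions (diagonal) divided by the number of samples in each true class.
per_class_acc = cm.diagonal() / cm.sum(axis=1)
print(cm)
print(per_class_acc)
```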

6 RESULT ANALYSIS

Evaluation of our Convolutional Neural Network (CNN) model, designed to classify skin cancer
types using the HAM10000 dataset, produced promising results that underscore the potential of AI
in dermatological diagnostics. The model achieved an accuracy of 87.28%, showing that it can
distinguish between various skin lesion types. The accuracy and loss curves, plotted over the
training and validation phases, converged well, indicating a balanced learning process with only
minor signs of overfitting.

6.1 Overall Performance

The CNN model demonstrated promising results in classifying skin cancer types from the
HAM10000 dataset. Achieving an accuracy of 87.28%, with precision, recall, and F1-scores
reflecting a balanced performance across various classes, the model validates the effectiveness of
the chosen architecture and training regimen. These metrics underscore the model's capability in
distinguishing between different skin lesion types, an essential aspect of early and accurate skin
cancer diagnosis.

Fig 6.1: Model Accuracy Analysis

Fig 6.2: Class-wise Classification Report
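
For reference, precision, recall, and F1-score for a single class follow directly from its counts of true positives, false positives, and false negatives. The sketch below uses hypothetical counts that are not taken from the results reported here.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 for one class from its counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Hypothetical counts for one lesion class.
p, r, f1 = precision_recall_f1(tp=40, fp=10, fn=20)
print("precision=%.3f recall=%.3f f1=%.3f" % (p, r, f1))
```

Precision penalizes false alarms while recall penalizes missed lesions, so for a screening tool the recall of the malignant classes is typically the more critical figure.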


6.2 Accuracy and Loss Trends

Throughout the training and validation phases, the model's accuracy improved consistently,
while the loss decreased, indicative of effective learning. However, a careful examination of the
trends revealed minor signs of overfitting, as evidenced by a slight divergence between training and
validation accuracy in the later epochs. This observation guided adjustments in model training,
including the incorporation of dropout and early stopping, to mitigate overfitting and enhance
generalization.
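
A simple way to locate such a divergence is to compare the two accuracy curves epoch by epoch. The sketch below is a hypothetical helper, with an arbitrarily chosen gap threshold of 0.05 and made-up curves rather than the values plotted in Fig 6.3; with Keras, the two lists would typically come from `history.history['accuracy']` and `history.history['val_accuracy']`.

```python
def first_divergence_epoch(train_acc, val_acc, max_gap=0.05):
    """Return the first epoch index where train accuracy exceeds
    validation accuracy by more than `max_gap`, or None if it never does."""
    for epoch, (tr, va) in enumerate(zip(train_acc, val_acc)):
        if tr - va > max_gap:
            return epoch
    return None

# Hypothetical accuracy curves, one value per epoch.
train_acc = [0.60, 0.72, 0.80, 0.86, 0.91, 0.95]
val_acc = [0.58, 0.70, 0.78, 0.83, 0.85, 0.86]
epoch = first_divergence_epoch(train_acc, val_acc)
print("gap first exceeds threshold at epoch:", epoch)
```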

Fig 6.3: Accuracy Trend over 100 Epochs

Fig 6.4: Loss Trend over 100 Epochs


6.3 Class-wise Performance

The model's performance varied across different skin lesion classes, with Vascular lesions and
Dermatofibroma showing the highest accuracy rates. In contrast, Melanocytic nevi and Benign
keratosis-like lesions presented more challenges, reflecting lower prediction accuracy. This
variation highlights the model's strengths and areas where further tuning could yield improvements.

Fig 6.5: Confusion Matrix

6.4 Visual Comparisons

Visual inspection of test images alongside their predicted and actual labels revealed the model's
proficiency in identifying distinct lesion characteristics. Successful predictions across a spectrum of
lesion types demonstrated the model's robustness. However, notable misclassifications, particularly
in less accurately predicted classes, underscored the necessity for continuous model refinement.

Fig 6.6: Sample Test Data


7 CONCLUSION AND FUTURE WORK

The CNN model for skin cancer classification represents a significant step forward in applying
artificial intelligence to dermatological diagnosis. By achieving high accuracy and uncovering
specific areas for improvement, this project not only contributes to the ongoing efforts in medical
AI but also outlines a clear path for future advancements.

• Enhancing Model Performance

Future work will focus on addressing identified challenges, such as the variability in class-wise
performance and occasional overfitting. Strategies include exploring more sophisticated data
augmentation techniques, implementing advanced neural network architectures, and fine-tuning
hyperparameters more rigorously.

• Expanding Dataset Diversity

To further improve the model's generalizability and robustness, expanding the training dataset to
include a broader diversity of lesion types and patient demographics is paramount. Collaborations
with medical institutions for data sharing could enrich the dataset, providing a more comprehensive
foundation for training.

• Clinical Integration and Real-world Application

Looking ahead, translating these research findings into clinical practice remains a priority. Pilot
studies to assess the model's real-world applicability and user acceptance in clinical settings will be
critical. Additionally, developing user-friendly interfaces for dermatologists and integrating the
model into existing medical software systems will enhance its utility and impact.

In conclusion, this project lays the groundwork for transformative changes in skin cancer
diagnosis, with AI-powered models offering speed, accuracy, and accessibility. Continuous
innovation and collaboration between technologists and medical professionals will be key to
realizing the full potential of this promising field.