Batch 7
BACHELOR OF TECHNOLOGY
Electronics and Communication Engineering
CERTIFICATE
This is to certify that the Mini Project work entitled “Classification of Lung cancer
Nodules to monitor patients health using Neural network Topology with SVM algorithm &
Jawaharlal Nehru Technological University, Hyderabad during the academic year 2023-2024.
External Examiner
H.no:3rd floor,
Above Sbi Bank, Arunodaya Colony, Hi-
Tech Theatre Line,Opp: Metro Pillar No. C1758,
Madhapur, Hyderabad,
1. T. Sharanya (20RH1A04N4)
2. V. Chandrika (20RH1A04P5)
3. S. Tejashwini (21RH5A0420)
4. T. Sarala Devi (20RH1A04N7)
We feel ourselves honored and privileged to place our warm salutation to our college
Malla Reddy Engineering College for Women and Department of Electronics and
Communication Engineering which gave us the opportunity to have expertise in engineering
and profound technical knowledge.
We would like to deeply thank our Honorable Minister of Telangana State Sri. Ch.
Malla Reddy Garu, founder chairman MRGI, the largest cluster of institutions in the state of
Telangana, for providing us with all the resources in the college to make our project a success.
We wish to convey our gratitude to our Principal Dr. Y. Madhavee Latha, for providing
us with the environment and means to enrich our skills, motivating us in our endeavor and
helping us to realize our full potential.
We express our sincere gratitude to our Head of the Department Dr. K. Sudhakar,
for inspiring us to take up a project on this subject and successfully guiding us towards its
completion.
We would also like to thank, with profound gratitude, our Project coordinator Dr. S. Rajkumar,
for his kind encouragement and overall guidance, and for viewing this program as a good asset.
We would like to thank our internal guide Dr.L.Malliga, and all the Faculty members
for their valuable guidance and encouragement towards the completion of our project work.
We hereby declare that our Mini Project entitled “Classification of Lung cancer
Nodules to monitor patients health using Neural network Topology with SVM algorithm
& compare with K-means accuracy” submitted to Malla Reddy Engineering College for
It is declared that the project report or any part thereof has not been previously
T.SHARANYA (20RH1A04N4)
V.CHANDRIKA (20RH1A04P5)
S.TEJASHWINI (21RH5A0420)
T.SARALA DEVI (20RH1A04N7)
CLASSIFICATION OF LUNG CANCER NODULES TO MONITOR PATIENTS’ HEALTH
ABSTRACT
Lung nodules are vital indicators of the presence of lung cancer. Early detection enhances the survival
rate of the patient by starting treatment at the right time. The detection and classification of malignancy in
Computed Tomography (CT) images is a very time-consuming and difficult task for radiologists, which has led
researchers to develop algorithms for Computer-Aided Diagnosis (CAD) systems to mitigate this burden.
The performance of CAD systems is continuously improving through the use of various deep learning techniques
for screening of lung cancer. In this paper, we propose transferable texture Convolutional Neural Networks
(CNN) to improve the classification performance of pulmonary nodules in CT scans. An Energy Layer (EL)
is incorporated in our scheme, which extracts texture features from the convolutional layer. The inclusion of
the EL reduces the number of learnable parameters of the network, which further reduces the memory
requirements and computational complexity. The proposed model has only three convolutional layers and
one EL, instead of a pooling layer. Overall, the proposed CNN architecture comprises nine layers for automatic
feature extraction and classification of pulmonary nodule candidates as malignant or benign. Furthermore,
the pre-trained model of the proposed CNN is also used to handle the smaller dataset classification problem by
using transfer learning.
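To illustrate why an energy layer can reduce the number of learnable parameters, here is a minimal NumPy sketch. It assumes that the EL summarises each convolutional feature map by its average activation; the abstract does not spell out the operation, so this is an assumption. The function name `energy_layer`, the feature-map shape and the 64-unit dense layer are hypothetical values chosen only for the arithmetic.

```python
import numpy as np

def energy_layer(feature_maps):
    """Collapse each feature map (channel) to one value.

    feature_maps has shape (channels, height, width).
    Assumption: the "energy" of a map is its mean activation.
    """
    return feature_maps.mean(axis=(1, 2))

# Hypothetical output of the last convolutional layer: 32 maps of 16 x 16.
rng = np.random.default_rng(0)
maps = rng.random((32, 16, 16))

energy = energy_layer(maps)  # shape (32,)

# Weights a 64-unit dense layer would need for each representation.
dense_units = 64
params_flatten = maps.size * dense_units    # 32 * 16 * 16 * 64
params_energy = energy.size * dense_units   # 32 * 64
print(energy.shape, params_flatten, params_energy)
```

Under these assumed shapes the dense layer that follows needs 2,048 weights instead of 524,288, which is the kind of saving the abstract refers to.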
INDEX
TITLE
CHAPTER 1- INTRODUCTION
1.1 Introduction
CONCLUSION
REFERENCES
1. INTRODUCTION
Lung cancer has become one of the most common cancers in both men and women. A large number
of people die every year due to lung cancer. The disease has different stages, whereby it starts
from a small tissue and spreads throughout the different areas of the lungs by a process called
metastasis. It is the uncontrolled growth of unwanted cells in the lungs. It is estimated that
around 12,203 people had lung cancer in 2016 (7,130 males and 5,073 females), and deaths from
lung cancer in 2016 were 8,839. Biomedical image processing is the latest emerging tool in
medical research used for the early detection of cancers. Biomedical image processing techniques
can be used in the medical field to diagnose diseases at an early stage. It uses biomedical
images such as X-rays, Computed Tomography (CT) and MRI. The main contribution of image
processing in the medical field is to diagnose cancer at an early stage, increasing survival
rates. The time factor is critical for tumors of the brain, the lungs, and the breast. Image
processing can detect these cancers in the early stages of the disease, facilitating an early
treatment process. The image processing procedure consists of four basic stages: pre-processing,
segmentation, feature extraction and classification. This paper presents image processing
techniques whereby the CT scan image is used as the input image and is processed, and early-stage
lung cancer is detected using an SVM (support vector machine) algorithm as the classifier in the
classification stage to improve accuracy, sensitivity, and specificity. First the image is
pre-processed and segmented. After that, features are extracted from the segmented image, and
finally the image is classified as normal or cancerous.
Digital image processing is the use of computer algorithms to perform image processing on digital
images. As a subfield of digital signal processing, digital image processing has many advantages
over analog image processing.[19][2] It allows a much wider range of algorithms to be applied to
the input data. The aim of digital image processing is to improve the image data (features) by
suppressing unwanted distortions and/or enhancing some important image features, so that AI and
computer vision models can benefit from this improved data. Feature extraction starts from an
initial set of measured data and builds derived values (features) intended to be informative and
non-redundant, facilitating the subsequent learning and generalization steps, and in some cases
leading to better human interpretations.
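A classifier of this kind is commonly judged by accuracy, sensitivity and specificity. The following is a minimal sketch of how these three measures are computed from true and predicted labels. The function name `classification_metrics` and the label values are made up for illustration and are not taken from the project data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    return accuracy, sensitivity, specificity

# Made-up labels: 1 = cancerous, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

accuracy, sensitivity, specificity = classification_metrics(y_true, y_pred)
print(accuracy, sensitivity, round(specificity, 3))
```

Here 3 of the 4 cancerous cases are found (sensitivity 0.75) and 4 of the 6 normal cases are correctly cleared (specificity about 0.667), giving an overall accuracy of 0.7.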
2. LITERATURE SURVEY
2.1 Image retrieval system based on feature extraction and relevance feedback
Abstract: The availability of huge multimedia databases and the development of information
highways have urged many researchers for developing effective methods of retrieval based on
their content. The traditional way of searching the available huge collections of multimedia data
was by keyword indexing or simply by browsing, whereby the user's main interest lies in the
maximum retrieval of similar data. Digital image databases however, opened the way to content-
based searching and retrieval. A lot of research has been done in retrieving the content based on
image features like color, texture, and shape.
Abstract: The authors introduce two detectors which they use to locate simulated tumors of fixed
size in clinical gamma-ray images. The first method was conceived when it was observed that
small tumours possess an identifiable signature in curvature feature space, where "curvature" is
the local curvature of the image data when viewed as a relief map. Computed curvature values are
mapped to a normalized significance space using a windowed statistic. The resulting test statistic
is thresholded at a chosen level of significance to give a positive detection. Nonuniform anatomic
background activity is effectively suppressed. The second detector is an adaptive prewhitening
matched filter, which uses a form of preprocessing known as statistical scaling to adaptively
prewhiten the background.
3. SYSTEM ANALYSIS
Disadvantages:
1. The CT scan image is pre-processed, followed by segmentation of the ROI of the lung.
2. Discrete Wavelet Transform is applied for image compression and features are extracted
using a GLCM.
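As an illustration of GLCM-based feature extraction, the sketch below builds a grey-level co-occurrence matrix for one pixel offset and derives three common texture features from it. It is a plain NumPy sketch under stated assumptions, not the code of the existing system: the helper names `glcm` and `glcm_features`, the 4-level toy patch and the horizontal offset are chosen only for the example.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Count how often grey level i has grey level j at offset (dy, dx).

    The counts are normalised so the matrix sums to 1 (non-symmetric GLCM).
    """
    matrix = np.zeros((levels, levels), dtype=float)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r, c], image[r2, c2]] += 1
    return matrix / matrix.sum()

def glcm_features(p):
    """Contrast, angular second moment and homogeneity of a normalised GLCM."""
    i, j = np.indices(p.shape)
    contrast = float(np.sum(p * (i - j) ** 2))
    asm = float(np.sum(p ** 2))
    homogeneity = float(np.sum(p / (1.0 + np.abs(i - j))))
    return contrast, asm, homogeneity

# Toy 4 x 4 patch with 4 grey levels (0..3).
patch = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]])

p = glcm(patch, levels=4)
contrast, asm, homogeneity = glcm_features(p)
print(round(contrast, 4), round(asm, 4), round(homogeneity, 4))
```

The resulting feature values would then form the feature vector passed to the classifier.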
Advantages:
1. Classification is the major step, in which cancerous and non-cancerous images are identified
with the pre-trained model.
HARDWARE REQUIREMENTS:
SOFTWARE REQUIREMENTS:
• Operating System: Windows
the process, an external entity that interacts with the system and the information flows in
the system
[Figure: data flow diagram with the nodes User, Check, Unauthorized user, Accuracy Graph, End process]
UML DIAGRAMS
UML stands for Unified Modeling Language. UML is a standardized general-purpose
modeling language in the field of object-oriented software engineering. The standard is managed,
and was created by, the Object Management Group.
The goal is for UML to become a common language for creating models of object-oriented
computer software. In its current form UML comprises two major components: a meta-model
and a notation. In the future, some form of method or process may also be added to, or associated
with, UML.
IMPLEMENTATION:
MODULES:
Upload Lung Cancer Dataset
Read & Split Dataset to Train & Test
Execute SVM Algorithm
Execute K-Means Algorithm
Predict Lung Cancer
Accuracy Graph
4. SOFTWARE DESCRIPTION:
FEASIBILITY STUDY
The feasibility of the project is analyzed in this phase and business proposal is
put forth with a very general plan for the project and some cost estimates. During system
analysis the feasibility study of the proposed system is to be carried out.
ECONOMICAL FEASIBILITY
TECHNICAL FEASIBILITY
SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY: This study is carried out to check the economic impact
that the system will have on the organization. The amount of funds that the company can pour into
the research and development of the system is limited.
TECHNICAL FEASIBILITY: This study is carried out to check the technical feasibility,
that is, the technical requirements of the system. Any system developed must not place a high
demand on the available technical resources, since that would strain the resources that are
available.
SOCIAL FEASIBILITY: This aspect of the study checks the level of acceptance of the
system by the user. This includes the process of training the user to use the system efficiently. The
user must not feel threatened by the system, but must instead accept it as a necessity.
Guido van Rossum published the first version of Python code (version 0.9.0) at alt.sources in
February 1991. This release already included exception handling, functions, and the core data
types of list, dict, str and others. It was also object oriented and had a module system.
Python version 1.0 was released in January 1994. The major new features included in this release
were the functional programming tools lambda, map, filter and reduce, which Guido van
Rossum never liked. Six and a half years later, in October 2000, Python 2.0 was introduced.
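For readers unfamiliar with the functional tools mentioned here, a short sketch in current Python follows. The list of numbers is arbitrary. In Python 3, `reduce` is imported from the `functools` module rather than being a built-in.

```python
from functools import reduce  # reduce is in functools in Python 3

values = [1, 2, 3, 4, 5, 6]

# map applies a function to every element.
squares = list(map(lambda v: v * v, values))

# filter keeps the elements for which the function returns True.
evens = list(filter(lambda v: v % 2 == 0, values))

# reduce folds the sequence into a single value, here a running sum.
total = reduce(lambda acc, v: acc + v, values, 0)

print(squares, evens, total)
```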
TensorFlow
TensorFlow is a free and open-source software library for dataflow and differentiable
programming across a range of tasks. It is a symbolic math library, and is also used for machine
learning applications such as neural networks. It is used for both research and production
at Google.
NumPy
It is the fundamental package for scientific computing with Python. It contains various features
including these important ones:
Matplotlib
Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter Notebook,
web application servers, and four graphical user interface toolkits.
Scikit-learn
Python, a versatile programming language, doesn’t come pre-installed on your computer devices.
Python was first released in the year 1991 and remains a very popular high-level
programming language today. Its design philosophy emphasizes code readability with its notable
use of significant whitespace.
HARDWARE REQUIREMENTS:
• System : Pentium IV 2.4 GHz.
• Hard Disk : 40 GB.
• RAM : 512 MB.
SOFTWARE REQUIREMENTS:
• Operating system : Windows.
What is Python :-
Python is currently one of the most widely used multi-purpose, high-level programming languages.
The biggest strength of Python is its huge collection of standard and third-party libraries,
which can be used for the following –
Machine Learning
GUI Applications (like Kivy, Tkinter, PyQt etc. )
Web frameworks like Django (used by YouTube, Instagram, Dropbox)
Advantages of Python :-
1. Extensive Libraries
Python ships with an extensive library that contains code for various purposes like
regular expressions, documentation-generation, unit-testing, web browsers, threading,
databases, CGI, email, image manipulation, and more. So, we don’t have to write the
complete code for that manually.
2. Object-Oriented
This language supports both the procedural and object-oriented programming paradigms.
While functions help us with code reusability, classes and objects let us model the real world.
A class allows the encapsulation of data and functions into one.
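A small sketch of both paradigms side by side is given below. The class name `NoduleImage`, its fields and the pixel values are hypothetical and serve only to show data and functions bundled into one object, next to a plain function that reuses it.

```python
class NoduleImage:
    """Encapsulates image data together with the operations on it."""

    def __init__(self, pixels, label):
        self.pixels = pixels  # 2-D list of grey values
        self.label = label    # e.g. "normal" or "abnormal"

    def mean_intensity(self):
        flat = [value for row in self.pixels for value in row]
        return sum(flat) / len(flat)


def describe(sample):
    """Plain (procedural) function that works on any NoduleImage."""
    return f"{sample.label}: mean intensity {sample.mean_intensity():.1f}"


sample = NoduleImage([[0, 50], [100, 250]], "normal")
summary = describe(sample)
print(summary)
```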
Before we take a look at the details of various machine learning methods, let's start by looking
at what machine learning is, and what it isn't. Machine learning is often categorized as a
subfield of artificial intelligence, but I find that categorization can often be misleading at first
brush. The study of machine learning certainly arose from research in this context, but in the
data science application of machine learning methods, it's more helpful to think of machine
learning as a means of building models of data.
Human beings, at this moment, are the most intelligent and advanced species on earth because
they can think, evaluate and solve complex problems. On the other side, AI is still in its initial
stage and hasn’t surpassed human intelligence in many aspects. Then the question is: what is
the need to make machines learn? The most suitable reason for doing this is “to make
decisions, based on data, with efficiency and scale”.
Challenges in Machine Learning :-
Machine Learning is one of the most rapidly growing technologies and, according to researchers,
we are in the golden year of AI and ML.
Emotion analysis
Sentiment analysis
This is a rough roadmap you can follow on your way to becoming an insanely talented Machine
Learning Engineer. Of course, you can always modify the steps according to your needs to
reach your desired end-goal!
Both Linear Algebra and Multivariate Calculus are important in Machine Learning. However,
the extent to which you need them depends on your role as a data scientist. If you are more
focused on application heavy machine learning, then you will not be that heavily focused on
maths as there are many common libraries available.
Some people prefer to skip Linear Algebra, Multivariate Calculus and Statistics and learn them
as they go along with trial and error. But the one thing that you absolutely cannot skip is Python!
While there are other languages you can use for Machine Learning, like R, Scala, etc., Python is
currently the most popular language for ML.
Now that you are done with the prerequisites, you can move on to actually learning ML (Which
is the fun part!!!) It’s best to start with the basics and then move on to the more complicated
stuff. Some of the basic concepts in ML are:
Model – A model is a specific representation learned from data by applying some machine
learning algorithm. A model is also called a hypothesis.
Feature – A feature is an individual measurable property of the data. A set of numeric
features can be conveniently described by a feature vector. Feature vectors are fed as input to
the model. For example, in order to predict a fruit, there may be features like color, smell, taste,
etc.
Target (Label) – A target variable or label is the value to be predicted by our model. For the
fruit example discussed in the feature section, the label with each set of input would be the name
of the fruit like apple, orange, banana, etc.
Supervised Learning – This involves learning from a training dataset with labeled data using
classification and regression models. This learning process continues until the required level of
performance is achieved.
Unsupervised Learning – This involves using unlabelled data and then finding the underlying
structure in the data in order to learn more and more about the data itself using factor and cluster
analysis models.
Semi-supervised Learning – This involves using unlabelled data like Unsupervised Learning
with a small amount of labeled data. Using labeled data vastly increases the learning accuracy
and is also more cost-effective than Supervised Learning.
Reinforcement Learning – This involves learning optimal actions through trial and error. So
the next action is decided by learning behaviors that are based on the current state and that will
maximize the reward in the future.
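The terms model, feature and label can be made concrete with the fruit example. The sketch below is a minimal illustration of supervised learning using a 1-nearest-neighbour rule in plain Python. The two numeric features (weight in grams, length in centimetres) and all values are invented for the example, and `predict_fruit` is a hypothetical helper, not part of any library.

```python
import math

# Feature vectors: (weight in grams, length in cm). Values are made up.
training_features = [(150, 7.0), (170, 8.0), (120, 18.0), (130, 20.0)]
# Labels (targets): the value the model should predict for each vector.
training_labels = ["apple", "apple", "banana", "banana"]

def predict_fruit(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    distances = [math.dist(features, known) for known in training_features]
    nearest = distances.index(min(distances))
    return training_labels[nearest]

first = predict_fruit((160, 7.5))
second = predict_fruit((125, 19.0))
print(first, second)
```

The labelled examples play the role of the training dataset, and the function is a very simple model learned from it.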
Advantages of Machine learning :-
1. Easily identifies trends and patterns: Machine Learning can review large volumes of
data and discover specific trends and patterns that would not be apparent to humans. For
instance, for an e-commerce website like Amazon, it serves to understand the browsing
behaviours and purchase histories of its users to help cater to the right products, deals, and
reminders relevant to them. It uses the results to reveal relevant advertisements to them.
2. No human intervention needed (automation): With ML, you don’t need to babysit
your project every step of the way. Since it means giving machines the ability to learn, it lets
them make predictions and also improve the algorithms on their own. A common example of
this is antivirus software, which learns to filter new threats as they are recognized. ML is also
good at recognizing spam.
Now, check for the latest and the correct version for your operating system.
Step 3: You can either select the yellow Download Python 3.7.4 button for Windows, or
you can scroll further down and click on the download for the version you need. Here, we are
downloading the most recent Python version for Windows, 3.7.4.
Step 4: Scroll down the page until you find the Files option.
Step 5: Here you see a different version of python along with the operating system.
• To download Windows 32-bit python, you can select any one from the three options:
Windows x86 embeddable zip file, Windows x86 executable installer or Windows x86 web-
based installer.
• To download Windows 64-bit python, you can select any one from the three options: Windows
x86-64 embeddable zip file, Windows x86-64 executable installer or Windows x86-64 web-
based installer.
Here we will install Windows x86-64 web-based installer. Here your first part regarding which
version of python is to be downloaded is completed. Now we move ahead with the second part
in installing python i.e. Installation
Installation of Python
Step 1: Go to Download and Open the downloaded python version to carry out the installation
process.
Step 2: Before you click on Install Now, Make sure to put a tick on Add Python 3.7 to PATH.
Step 3: Click on Install Now. After the installation is successful, click on Close.
With these above three steps on python installation, you have successfully and correctly installed
Python. Now is the time to verify the installation.
Note: The installation process might take a couple of minutes.
Verify the Python Installation
Step 1: Click on Start
Step 2: In the Windows Run Command, type “cmd”.
Step 3: Click on IDLE (Python 3.7 64-bit) and launch the program
Step 4: To go ahead with working in IDLE you must first save the file. Click on File > Click on
Save
Step 5: Name the file and save as type should be Python files. Click on SAVE. Here I have named
the files as Hey World.
Step 6: Now for e.g. enter print
SYSTEM TEST
The purpose of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the functionality of
components, sub assemblies, assemblies and/or a finished product. It is the process of exercising
software with the intent of ensuring that the Software system meets its requirements and user
expectations and does not fail in an unacceptable manner. There are various types of test. Each test
type addresses a specific testing requirement.
TYPES OF TESTS
Unit testing
Unit testing involves the design of test cases that validate that the internal program
logic is functioning properly, and that program inputs produce valid outputs. All decision branches
and internal code flow should be validated. It is the testing of individual software units of the
application. It is done after the completion of an individual unit before integration. This is
structural testing that relies on knowledge of its construction and is invasive.
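A unit test of this kind can be sketched with Python's built-in `unittest` module. The function under test, `normalise_pixels`, is a hypothetical helper written for this example (it mirrors the division by 255 used when preparing images). The two test cases cover a valid input and the error branch.

```python
import unittest

def normalise_pixels(pixels):
    """Scale 8-bit pixel values (0-255) to the range 0.0-1.0."""
    if any(p < 0 or p > 255 for p in pixels):
        raise ValueError("pixel values must lie in 0..255")
    return [p / 255 for p in pixels]

class NormalisePixelsTest(unittest.TestCase):
    def test_valid_input_gives_valid_output(self):
        self.assertEqual(normalise_pixels([0, 255]), [0.0, 1.0])

    def test_out_of_range_branch_raises(self):
        with self.assertRaises(ValueError):
            normalise_pixels([256])

# Build and run the suite explicitly so the result object can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalisePixelsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```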
Integration testing
Integration tests are designed to test integrated software components to
determine if they actually run as one program. Testing is event driven and is more concerned with
the basic outcome of screens or fields.
Functional test
Functional tests provide systematic demonstrations that functions tested are
available as specified by the business and technical requirements, system documentation, and
user manuals.
Functional testing is centered on the following items:
System Test
System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results.
Unit testing is usually conducted as part of a combined code and unit test phase of
the software lifecycle, although it is not uncommon for coding and unit testing to be conducted as
two distinct phases.
Integration Testing
Software integration testing is the incremental integration testing of two or more
integrated software components on a single platform to produce failures caused by interface
defects.
Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant participation by
the end user. It also ensures that the system meets the functional requirements.
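import tkinter
from tkinter import Button, Label, Text, END, filedialog

import cv2
import matplotlib.pyplot as plt
import numpy as np
from sklearn import svm
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

main = tkinter.Tk()

# State shared between the button callbacks.
filename = None
classifier = None
svm_acc = kmeans_acc = 0.0
X = Y = X_train = X_test = y_train = y_test = None
pca = None  # declared in the original listing; its use is not shown

def uploadDataset():
    # Ask for the dataset folder and report its path.
    global filename
    filename = filedialog.askdirectory(initialdir=".")
    text.delete('1.0', END)
    text.insert(END, filename + " loaded\n")

def splitDataset():
    global X, Y, X_train, X_test, y_train, y_test
    text.delete('1.0', END)
    X = np.load('features/X.txt.npy')
    Y = np.load('features/Y.txt.npy')
    # Flatten each image (height x width x channels) into one feature vector.
    X = np.reshape(X, (X.shape[0], X.shape[1] * X.shape[2] * X.shape[3]))
    # The listing is cut off at this point. The 80/20 split below follows the
    # ratio stated in the result analysis; it is not the authors' exact code.
    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)
    text.insert(END, "Training images : " + str(len(X_train)) + "\n")
    text.insert(END, "Testing images : " + str(len(X_test)) + "\n")

def executeSVM():
    global classifier, svm_acc
    text.delete('1.0', END)
    cls = svm.SVC()
    cls.fit(X_train, y_train)
    predict = cls.predict(X_test)
    svm_acc = accuracy_score(y_test, predict) * 100
    classifier = cls  # kept for predictCancer()
    text.insert(END, "SVM Accuracy : " + str(svm_acc) + "\n")

def executeKmeans():
    global kmeans_acc
    kmeans = KMeans(n_clusters=2, random_state=0)
    kmeans.fit(X_train)
    predict = kmeans.predict(X_test)
    # Cluster numbers are arbitrary, so this score depends on whether
    # cluster 0/1 happens to coincide with class 0/1.
    kmeans_acc = accuracy_score(y_test, predict) * 100
    print("KMeans Predicted Labels : " + str(kmeans.labels_))
    centroids = kmeans.cluster_centers_
    text.insert(END, "K-Means Accuracy : " + str(kmeans_acc) + "\n")

def predictCancer():
    filename = filedialog.askopenfilename(initialdir="testSamples")
    img = cv2.imread(filename)
    img = cv2.resize(img, (64, 64))
    im2arr = np.array(img).reshape(64, 64, 3).astype('float32') / 255
    test = np.asarray([im2arr])
    # Not shown in the listing: flatten to match the training features and classify.
    test = test.reshape(test.shape[0], -1)
    predicted = classifier.predict(test)[0]
    # Assumed label order (0 = Normal, 1 = Abnormal); the listing does not show how msg is built.
    msg = 'Abnormal' if predicted == 1 else 'Normal'
    cv2.imshow(msg, img)
    cv2.waitKey(0)

def graph():
    # The body is missing from the listing; a bar chart of the two accuracies
    # matches the graph described in the result analysis.
    plt.bar(['SVM', 'K-Means'], [svm_acc, kmeans_acc])
    plt.xlabel('Algorithm')
    plt.ylabel('Accuracy (%)')
    plt.show()

font1 = ('times', 13, 'bold')  # assumed; font1 is not defined in the listing
title = Label(main, text='Classification of Lung Cancer Nodules to Monitor Patients Health using '
                         'Neural Network topology with SVM algorithm & Compare with K-Means Accuracy')
title.place(x=0, y=5)  # assumed position

# The text area used by the callbacks is not created in the listing; size and position are assumed.
text = Text(main, height=20, width=100)
text.place(x=10, y=100)

readButton = Button(main, text="Read & Split Dataset to Train & Test", command=splitDataset)
readButton.place(x=350, y=550)
readButton.config(font=font1)

# The creation of kmeansButton is missing from the listing; the label follows the result analysis.
kmeansButton = Button(main, text="Execute K-Means Algorithm", command=executeKmeans)
kmeansButton.place(x=350, y=600)
kmeansButton.config(font=font1)

# The buttons for uploadDataset, executeSVM, predictCancer and graph are omitted from the listing.

main.config(bg='LightSteelBlue3')
main.mainloop()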
5. RESULT ANALYSIS
Classification of Lung Cancer Nodules to Monitor Patients Health using Neural Network
topology with SVM algorithm & Compare with K-Means Accuracy
In this project we use a CT scan lung cancer nodule dataset to predict patient health using the
SVM and K-Means algorithms, and then compare their prediction accuracy. The dataset of lung
cancer images is saved inside the ‘Dataset’ folder; the screen below shows the dataset details.
The dataset contains two types of images, normal and abnormal. SVM and K-Means are trained
on this dataset, and when a new image is uploaded, SVM predicts whether it is normal or
abnormal. To implement the 4 titles we created 4 folders, one per title, to keep the algorithms
separate; each can be run one by one.
SCREENSHOTS
To run the project, double-click the run.bat file in the ‘Title1_SVM_KMeans’ folder to get the
screen below.
In the above screen, click the ‘Upload Lung Cancer Dataset’ button and upload the dataset folder.
In the above screen, select the ‘Dataset’ folder and click the ‘Select Folder’ button to load the
dataset and get the screen below.
In the above screen the dataset is loaded. Now click the ‘Read & Split Dataset to Train & Test’
button to split the dataset into train and test parts; the application uses 80% of the dataset for
training and 20% to test the trained model.
In the above screen we can see that the dataset contains 138 images in total, of which the
application uses 110 images for training and 28 images for testing. The data is now ready; click
the ‘Execute SVM Algorithm’ button to run SVM on the loaded dataset and get the accuracy below.
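The 110/28 figures follow from an 80/20 split of 138 samples. The sketch below reproduces the arithmetic with scikit-learn's `train_test_split` on stand-in data. The random feature values, the 16-column width and the alternating labels are placeholders, not the project's images.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 138 samples with 16 random features and alternating labels.
rng = np.random.default_rng(0)
X = rng.random((138, 16))
Y = np.arange(138) % 2

# test_size=0.2 reserves ceil(0.2 * 138) = 28 samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0)

print(len(X_train), len(X_test))
```

With only 28 test images, each misclassified image changes the reported accuracy by about 3.6 percentage points, so the accuracies below should be read with that granularity in mind.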
In the above screen the SVM accuracy is 60%. Now click the ‘Execute K-Means Algorithm’ button
to run K-Means on the loaded dataset and get the screen below.
In the above screen K-Means achieved 50% accuracy. Now click the ‘Predict Lung Cancer’ button
to upload a new test image; the application will then give the prediction result.
In the above screen, select the ‘1.png’ file and click the ‘Open’ button to get the result
below.
In the above screen the uploaded image is predicted as Abnormal. Now test with another image.
The above image is predicted as Normal. Similarly, you can upload any image and perform
prediction. Now click the ‘Accuracy Graph’ button to get the graph below.
In the above graph the x-axis represents the algorithm name and the y-axis represents the
accuracy of each algorithm. From the graph we can conclude that SVM performs better than
K-Means in prediction.
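One caveat when scoring K-Means with a classification metric is that cluster numbers are arbitrary: cluster 0 need not correspond to class 0. The sketch below shows one common remedy, mapping each cluster to the majority true label of its members before computing accuracy. It runs on synthetic, well-separated data rather than the project's images, and `align_clusters` is a hypothetical helper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score

def align_clusters(cluster_ids, y_true):
    """Relabel each cluster with the majority true label of its members."""
    mapping = {}
    for cluster in np.unique(cluster_ids):
        members = y_true[cluster_ids == cluster]
        mapping[cluster] = np.bincount(members).argmax()
    return np.array([mapping[cluster] for cluster in cluster_ids])

# Synthetic data: two well-separated groups of 50 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
               rng.normal(5.0, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

raw_acc = accuracy_score(y, kmeans.labels_)  # depends on cluster numbering
aligned_acc = accuracy_score(y, align_clusters(kmeans.labels_, y))
print(raw_acc, aligned_acc)
```

In practice the cluster-to-label mapping would be learned on the training split and then applied to the test predictions, so that the test labels are not used to choose the mapping.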
CONCLUSION
In the first phase of the project the Region of Interest in an image is identified. The
identified region is located in an object. The features in the image are identified by using
image processing techniques. In the second phase of the project the feature-extracted data is
then used to classify whether the image is cancerous or not, using SVM (support vector
machine) classification. A boosting algorithm is then used to increase the accuracy of
the tool.
In the existing paper, image processing techniques have been used to detect early-stage lung
cancer in CT scan images. The CT scan image is pre-processed, followed by
segmentation of the ROI of the lung. Discrete Wavelet Transform is applied for image compression
and features are extracted using a GLCM. The results are fed into an SVM
classifier to determine whether the lung image is cancerous or not. The SVM classifier is evaluated
on the LIDC dataset. In future, more advanced algorithms will be used to increase the
level of prediction; we are in the process of including the Extreme Gradient Boosting algorithm to
use the dataset more effectively.
REFERENCES
1. M. Debois, “TxNxM1: the anatomy and clinics of metastatic cancer,” Boston: Kluwer Academic
Publishers, 2006.
2. J. M. Fitzpatrick & M. Sonka, “Medical imaging 2003: image processing,” USA: SPIE Society
of Photo-Optical Instrumentation Engineers, 2003.
3. C. M. Haskell & J. S. Berek, “Cancer treatment,” Philadelphia: W.B. Saunders, 2001.
4. K. T. Manivannan, “Development of gray level co-occurrence matrix based support vector
machines for particulate matter characterization,” Retrieved
from http://rave.ohiolink.edu/etdc/view?acc_num=toledo1341577486, 2012.
5. Sathish Kumar R, R. Logeswari, N. Anitha Devi, S. DivyaBharathy, “Efficient Clustering using
ECATCH Algorithm to Extend Network Lifetime in Wireless Sensor Networks,” International
Journal of Engineering Trends and Technology, 45, 476-481, 10.14445/22315381/IJETT-V45P290,
March 2017.
6. C. Charalambous, “Conjugate gradient algorithm for efficient training of artificial neural
networks,” IEEE Proceedings 139 (3) (1992) 301–310.
7. B. H. Boyle, “Support vector machines: data analysis, machine learning, and applications,”
Retrieved from http://public.eblib.com/choice/publicfullrecord.aspx?p=3021500, 2011.
8. A. A. Abdulla, S. M. Shaharum, “Lung cancer cell classification method using artificial neural
network,” Information Engineering Letters 2 (March) (2012) 50–58.
9. Sathish Kumar R, Nivetha M, Madhumita G & Santhoshy P, “Image Enhancement using NHSI
Model Employed in Color Retinal Images,” International Journal of Engineering Trends and
Technology, 58, 14-19, 10.14445/22315381/IJETT-V58P203, April 2018.
10. D. Harini & D. Bhaskari, “Image retrieval system based on feature extraction and relevance
feedback,” ACM, 2 Penn Plaza, Suite 701, New York, NY 10121-0701, 2012.