DEEP LEARNING BASED IMAGE CLASSIFICATION TO OPTIMIZE INVENTORY
A PROJECT REPORT
Submitted by
SHIVARAMAKRISHNAN [RA2011003010641]
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING
Certified that the 18CSP109L project report titled “DEEP LEARNING BASED IMAGE CLASSIFICATION TO OPTIMIZE INVENTORY” is the bonafide work of the candidate who carried out the project work under my supervision. Certified further, that to the best of my knowledge the work reported herein does not form part of any other thesis or dissertation on the basis of which a degree or award was conferred on an earlier occasion for this or any other candidate.
Dr. M. PUSHPALATHA
HEAD OF THE DEPARTMENT
Professor
Department of Computing Technologies
We hereby certify that this assessment complies with the University’s Rules and Regulations relating to academic misconduct and plagiarism, as listed on the University website, in the Regulations, and in the Education Committee guidelines.
We confirm that all the work contained in this assessment is our own except where indicated, and that we have met the following conditions:
▪ Clearly referenced / listed all sources as appropriate
▪ Referenced and put in inverted commas all quoted text (from books, web, etc.)
▪ Given the sources of all pictures, data, etc. that are not our own
▪ Not made any use of the report(s) or essay(s) of any other student(s), either past
or present
▪ Acknowledged in appropriate places any help that we have received from others (e.g.
fellow students, technicians, statisticians, external sources)
▪ Complied with any other plagiarism criteria specified in the course handbook /
University website
We understand that any false claim for this work will be penalized in accordance with the University
policies and regulations.
DECLARATION:
We are aware of and understand the University’s policy on academic misconduct and
plagiarism, and we certify that this assessment is our own work, except where indicated by
referencing, and that we have followed the good academic practices noted above.
Shivaramakrishnan [RA2011003010641]
Mahin Sharon [RA2011003010641]
Date:
If you are working in a group, please write the registration numbers and sign with the date for
every student in the group.
ACKNOWLEDGEMENT
We extend our sincere thanks to Dr. T. V. Gopal, Dean-CET, SRM Institute of Science and
Technology, for his invaluable support.
We are incredibly grateful to our Head of the Department, Dr. M. Pushpalatha, Professor,
Department of Computing Technologies, SRM Institute of Science and Technology, for her
suggestions and encouragement at all the stages of the project work.
We want to convey our thanks to our Project Coordinators, Dr. S. Godfrey Winster, Associate
Professor, Dr. M. Baskar, Associate Professor, Dr. P. Murali, Associate Professor, Dr. J. Selvin
Paul Peter, Associate Professor, Dr. C. Pretty Diana Cyril, Assistant Professor, and Dr. G.
Padmapriya, Assistant Professor; our Panel Head, Dr. M. Kanchana, Associate Professor; and our
panel members, Dr. M. Vijalakshmi, Assistant Professor, and Dr. N. Arunachalam, Assistant
Professor, Department of Computing Technologies, SRM Institute of Science and Technology,
for their inputs during the project reviews and support.
We register our immeasurable thanks to our Faculty Advisors, Dr. G. Abirami and Dr. G. Ramya,
Assistant Professors, Department of Computing Technologies, SRM Institute of Science and
Technology, for leading and helping us to complete our course.
Our inexpressible respect and thanks to our guide, Dr. N. Arunachalam, Assistant Professor,
Department of Computing Technologies, SRM Institute of Science and Technology, for
providing us with an opportunity to pursue our project under his mentorship. He provided us
with the freedom and support to explore the research topics of our interest. His passion for
solving problems and making a difference in the world has always been inspiring.
We sincerely thank all the staff and students of the Department of Computing Technologies, School of
Computing, SRM Institute of Science and Technology, for their help during our project. Finally,
we would like to thank our parents, family members, and friends for their unconditional love,
constant support, and encouragement.
SHIVARAMAKRISHNAN [RA2011003010641]
ABSTRACT v
LIST OF FIGURES vii
LIST OF TABLES ix
LIST OF SYMBOLS AND ABBREVIATIONS x
1. INTRODUCTION 1
1.1 General 1
1.2 Importance Of Revolutionizing Super Market Inventory 2
1.3 Advancements In Deep learning Based Image Processing 3
1.4 Enhanced CNNs for Inventory Image Classification 5
1.5 Objective 6
1.6 Scope 8
2 LITERATURE SURVEY 9
2.1 Motivation 11
2.2 Summary Of The Survey 11
3 ARCHITECTURE AND ANALYSIS 13
3.1 Architecture Diagram 13
3.2 Frontend Design 16
3.3 Backend Design 17
4 DEEP LEARNING BASED IMAGE CLASSIFICATION TO OPTIMIZE INVENTORY 21
4.1 Data Preparation 21
4.2 Model Design And Training 22
4.3 Evaluation And Optimization 23
4.4 Analysis And Deployment 24
4.5 Model Discussion 25
4.5.1 Alexnet Architecture 25
4.5.2 ShuffleNet Architecture 27
4.5.3 Residual Network 28
4.5.4 Manual Network 29
CHAPTER 1
INTRODUCTION
Product identification is one of the major domains in supermarkets where manual labour is still
largely relied upon. Upon arrival at the shop, items must undergo identification, categorization,
and entry into the inventory management system. Historically, this task has been carried out
manually by shop personnel who visually examine the items and input their
information into the system. This procedure is characterized by being time-consuming,
susceptible to mistakes, and lacking scalability. Nevertheless, the progress in deep learning-
based image categorization provides a means to automate and enhance the efficiency and
optimization of this process (Birajdar et al., 2020) [2]. Convolutional Neural Networks (CNNs),
a type of deep learning algorithm, have demonstrated exceptional performance in applications
involving the categorization of images. By training these algorithms using extensive datasets of
product photos, they can acquire the ability to precisely recognize goods based on their visual
attributes. The main benefit of employing deep learning-based image classification for product
identification in supermarkets is its capacity to rapidly and precisely analyse a vast quantity of
goods (Yang et al., 2023) [3]. Once the algorithm is trained, it can rapidly and accurately identify
goods, surpassing human capabilities. This not only enhances efficiency but also minimizes the
probability of mistakes. Scalability is another benefit: deep learning algorithms have the
capability to efficiently handle extensive inventories containing thousands of diverse goods
(Unnikrishnan et al., 2018) [4]. The same deep learning algorithm may be applied to identify and
manage the whole inventory, regardless of whether a supermarket has a few hundred goods or
several thousand. Furthermore, the utilization of deep learning algorithms for picture
categorization might enhance inventory management in supermarkets by offering immediate and
accurate information about stock levels and optimal product positioning (Gomes et al., 2021) [5].
Through the use of cameras and real-time image analysis, the program can constantly check
product levels on the shelves. It is capable of notifying store management when restocking is
required or when shelves need to be rearranged. Implementing this proactive strategy for
inventory management can effectively decrease occurrences of stockouts and enhance the overall
shopping experience for customers in supermarkets. Moreover, deep learning techniques may
be employed to enhance the optimization of product placement inside the shop (Ghosh et al.,
2020) [6]. The algorithm utilizes consumer traffic patterns and purchasing behaviour analysis
to suggest the most effective positioning of items in the shop, with the aim of maximizing sales.
For instance, if the algorithm identifies that specific goods are commonly bought together, it
might suggest arranging them in close proximity on the shelf. One further advantage of
employing deep learning-based image categorization in supermarkets is the capability to monitor
and examine client behaviour (Liu et al., 2024) [8]. The algorithm can discern trends in consumer
activity by examining surveillance footage, including the identification of regularly visited
regions inside the shop, the duration of time clients spend in each aisle, and the identification of
often associated product purchases. Shop managers may utilize this information to enhance shop
layout, optimize product placement, and customize marketing techniques to more effectively
cater to their consumers' requirements.
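The feature extraction these CNNs perform can be illustrated with a single convolution step. Below is a minimal, illustrative numpy sketch (not the project's actual model): a hand-made vertical-edge kernel slides over a toy "product image", and a ReLU keeps only positive responses. A real CNN layer repeats exactly this operation with many learned kernels.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation of a grayscale image with a kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

# Toy 6x6 "product image" with a bright vertical stripe.
img = np.zeros((6, 6))
img[:, 2] = 1.0

# Hand-crafted vertical-edge kernel; a CNN learns such kernels from data.
edge_kernel = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])

feature_map = relu(conv2d(img, edge_kernel))
print(feature_map.shape)  # (4, 4)
```

The feature map responds strongly where the stripe's left edge falls under the kernel, which is how stacked layers of such filters come to encode the visual attributes of products.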
Manual inventory management is slow and prone to human error, which is why advanced
technology is indispensable. By utilizing deep learning techniques like ShuffleNet, supermarkets
can significantly improve the speed and accuracy of inventory processes. For example, Zhichao
Chen and Jie Yang (2022)[9] demonstrated how ShuffleNet v2 could be used to streamline a
garbage classification system, suggesting similar efficiency could be achieved with supermarket
products. These deep learning algorithms can quickly identify products by analyzing their visual
features, drastically reducing the time and effort required for manual data entry.
Additionally, such advanced algorithms not only boost accuracy but also provide real-time
insights into inventory status. This real-time data helps supermarkets maintain optimal stock
levels, reducing both overstocking and understocking. Supermarkets can also use deep learning
to analyse customer behaviour, optimizing product placement to increase impulse purchases and
improve overall sales. As G. Prince Devaraj (2024)[10] pointed out, the multi-branch ShuffleNet
architecture can enhance classification tasks, suggesting that similar techniques could be used to
identify optimal product placements within supermarkets.
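ShuffleNet's defining operation is the channel shuffle, which mixes information between grouped convolutions at almost no computational cost. A minimal numpy sketch of that single operation follows; the array sizes and group count are illustrative choices, not values from the cited papers.

```python
import numpy as np

def channel_shuffle(x, groups):
    """ShuffleNet channel shuffle: reshape -> transpose -> flatten.
    x has shape (channels, height, width)."""
    c, h, w = x.shape
    assert c % groups == 0
    x = x.reshape(groups, c // groups, h, w)  # split channels into groups
    x = x.transpose(1, 0, 2, 3)               # interleave across groups
    return x.reshape(c, h, w)

# Six channels tagged 0..5; shuffling with 2 groups interleaves them.
x = np.arange(6).reshape(6, 1, 1).astype(float) * np.ones((6, 2, 2))
shuffled = channel_shuffle(x, groups=2)
print([int(ch[0, 0]) for ch in shuffled])  # [0, 3, 1, 4, 2, 5]
```

Because the shuffle is just a reshape and transpose, it adds essentially no inference cost, which is why ShuffleNet variants suit the real-time inventory setting described above.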
Cost savings are another benefit of deep learning-based inventory management. Automating the
process reduces the need for manual labor and minimizes errors, thereby lowering operational
costs. Additionally, optimized product placement can lead to increased sales, further improving
a supermarket's bottom line. Perarasi and Ramadas (2023)[11] emphasized the effectiveness of
improved AlexNet for detecting cracks in solar panels, indicating that similar precision could be
applied to detect inventory discrepancies or damaged goods.
In the competitive world of retail, supermarkets that adopt advanced technologies like deep
learning-based image classification are better equipped to meet customer demands and stay
ahead of competitors. By embracing this innovative approach, they can ensure well-stocked
shelves, accurate inventory levels, and optimized product placements, leading to a better
customer experience and improved sales. This strategic advantage could be key to long-term
success in an evolving retail landscape.
Initially, CNNs were limited by their depth and the availability of large-scale labeled datasets for
training. However, in recent years, several breakthroughs have significantly advanced the
capabilities of deep learning-based image processing. One of the most notable advancements is
the development of deeper and more complex CNN architectures, such as ResNet, Inception, and
EfficientNet. These architectures utilize techniques like residual connections, parallel feature
extraction, and efficient model scaling to improve performance while maintaining computational
efficiency. Moreover, the availability of large-scale labeled datasets, such as ImageNet, COCO,
and Open Images, has been instrumental in training deep learning models for image processing
tasks. These datasets contain millions of labeled images across thousands of categories, allowing
researchers to train more accurate and robust models.
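The residual connections used by ResNet can be sketched in a few lines. This toy numpy block is illustrative only (real ResNet blocks use convolutions and batch normalization rather than plain matrix multiplies), but it shows the key idea: the input is added back onto the transformed signal, giving gradients a shortcut through very deep stacks.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Two transforms with an identity shortcut: out = relu(W2·relu(W1·x) + x).
    The skip connection is what makes very deep ResNets trainable."""
    return relu(w2 @ relu(w1 @ x) + x)

rng = np.random.default_rng(0)
x = rng.normal(size=4)
w1 = rng.normal(size=(4, 4)) * 0.1  # small weights: block starts near identity
w2 = rng.normal(size=(4, 4)) * 0.1

out = residual_block(x, w1, w2)
print(out.shape)  # (4,)
```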
In recent years, attention has also shifted towards developing more interpretable and explainable
deep learning models. Techniques such as attention mechanisms, gradient-based attribution
methods, and Class Activation Mapping (CAM) have been developed to provide insights into
the decision-making process of deep learning models. These techniques not only improve model
interpretability but also help identify model biases and vulnerabilities.
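Of these techniques, Class Activation Mapping is simple enough to sketch directly: the heatmap is a weighted sum of the final convolutional feature maps, using the classifier weights of the class of interest. The feature maps and weights below are toy values, not from a trained model.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """CAM: weighted sum of the last conv layer's feature maps.
    feature_maps: (K, H, W); class_weights: (K,) for the target class."""
    cam = np.tensordot(class_weights, feature_maps, axes=1)  # (H, W)
    cam = np.maximum(cam, 0)                                 # keep positive evidence
    return cam / cam.max() if cam.max() > 0 else cam         # normalize to [0, 1]

# Two toy 4x4 feature maps; the class weighs the first map heavily.
fmaps = np.zeros((2, 4, 4))
fmaps[0, 1, 1] = 5.0   # strong activation at (1, 1)
fmaps[1, 3, 3] = 1.0
weights = np.array([1.0, 0.1])

cam = class_activation_map(fmaps, weights)
print(np.unravel_index(cam.argmax(), cam.shape))  # (1, 1)
```

Upsampled to the input resolution, such a map highlights the image region that drove the classification, which is the interpretability benefit described above.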
In the context of retail, these advancements in deep learning-based image processing have
enabled supermarkets to automate and optimize various aspects of their operations, including
product identification, inventory management, and customer behaviour analysis. By leveraging
deep learning models trained on large-scale datasets of product images, supermarkets can
accurately identify products, monitor inventory levels, optimize product placement, and analyze
customer behaviour in real-time.
Deep learning techniques, hardware advancements, and interpretable model techniques have all
contributed to the rapid progress in this field. In the context of retail, these advancements have
enabled supermarkets to automate and optimize various aspects of their operations, leading to
improved efficiency, cost savings, and a better overall shopping experience for customers.
Supermarkets require scalability, and advanced Convolutional Neural Networks (CNNs) have
the capability to manage extensive inventories including thousands of diverse goods. Regardless
of the number of products stocked, these models can effectively identify and oversee the whole
inventory of a supermarket, whether it consists of a few hundred or several thousand items.
Supermarkets may achieve substantial cost reductions by implementing advanced Convolutional
Neural Networks (CNNs) to automate the categorization of inventory images. These models
decrease the requirement for human work and minimize inaccuracies in inventory management,
resulting in reduced operating expenses and enhanced profitability. In addition, via the
optimization of inventory management operations, supermarkets may decrease occurrences of
excessive or insufficient stock, therefore further reducing expenses related to inventory
management.
Precise and efficient categorization of inventory images also enhances the browsing experience
for customers. With shelves constantly replenished with the desired products, customers can
effortlessly locate the items they want, minimizing frustration and enhancing overall
satisfaction. Real-time inventory management guarantees the constant availability of popular
products, resulting in heightened consumer loyalty and greater repeat business.
Furthermore, advanced Convolutional Neural Networks (CNNs) provide supermarkets the
ability to promptly adjust to shifts in market trends and customer preferences. Supermarkets may
adapt their inventory management techniques to suit changing customer demands by consistently
reviewing inventory data and customer behavior. The capacity to adapt is crucial for
supermarkets aiming to maintain competitiveness in the dynamic retail environment of today.
Advanced Convolutional Neural Networks (CNNs) have greatly boosted the categorization of
inventory images in supermarkets. This has resulted in a more precise, effective, and adaptable
solution for inventory management. Supermarkets may achieve cost reduction, enhanced
customer happiness, and maintain competitiveness in the current fast-paced retail landscape by
utilizing sophisticated CNN architectures, transfer learning techniques, and real-time inventory
management systems.
1.5 Objective
Determining the Optimal Model with the Maximum Accuracy:
The main goal of this study is to determine the deep learning model that delivers the maximum
level of accuracy in categorizing grocery products. The research seeks to identify the most
precise model for picture categorization by employing Convolutional Neural Networks (CNNs)
such as AlexNet, ShuffleNet, ResNet, and a manual model. The models are trained and evaluated
using a varied dataset of product photographs. The evaluation of each model is conducted based
on its accuracy, speed, and efficiency. The objective is to choose the model that offers the highest
level of accuracy in classifying retail products.
Deploying the model and constructing an automated system: After identifying the top-performing model, the next goal is to incorporate it
into an automated system for managing inventories. This automated system allows for
instantaneous identification, categorization, monitoring of shelves, and replenishment processes
for products. Through the automation of these operations, the system minimizes the requirement
for manual intervention, therefore enhancing operational efficiency and decreasing the
probability of mistakes. The objective is to seamlessly integrate the deep learning model into the
existing supermarket infrastructure, enabling real-time image analysis and decision support for
inventory management and automation.
Enhancing Prediction and Classification Efficiency: Apart from accuracy, the velocity of
prediction and classification is vital for real-time inventory management. Hence, an additional
aim of this research is to enhance the efficiency of the deep learning model by optimizing it to
improve processing speed while maintaining accuracy. By enhancing the speed of prediction and
categorization, the system can swiftly examine photos, facilitating expedited decision-making
and optimizing inventory management. This entails refining the deep learning model and
modifying its architecture to improve processing speed while preserving high levels of accuracy.
Error analysis and model improvement: Error analysis approaches are crucial for discovering
and correcting prevalent misclassification patterns, hence boosting the efficacy of the selected
deep learning model. A crucial aim of this research is to conduct comprehensive error analysis in order
to understand the causes of misclassifications and to devise techniques to rectify them. The
project seeks to enhance the model's accuracy and dependability by assessing mistakes and
implementing appropriate adjustments. The inventory optimization system is continuously
monitored and improved through the utilization of user feedback, operational experience, and
technology improvements.
The integration of user input and continuous improvement is crucial for enhancing the efficiency
and usability of the inventory optimization system. Hence, a crucial aim of this project is to
include user input in the ongoing process of enhancement and refinement. The project aims to
gather input from supermarket personnel, management, and consumers in order to identify areas
that need improvement.
The goal is to make essential adjustments that will enhance the efficacy of the system and
improve user satisfaction. Continuous monitoring and improvement techniques guarantee the
ongoing optimization and refinement of the inventory optimization system, ensuring its
efficacy and efficiency in fulfilling the changing requirements of the supermarket business.
1.6 Scope
The objective of this project is to explore deep learning approaches to optimize inventory
management in supermarkets. By employing several popular neural network architectures—
namely AlexNet, ShuffleNet, ResNet, and a custom-built manual architecture—the study aims
to develop and evaluate models that can improve inventory efficiency, leading to reduced waste,
better stock management, and enhanced customer satisfaction.
The core of the project involves implementing four distinct deep learning architectures: AlexNet,
ShuffleNet, ResNet, and a custom manual model. These models are designed and trained to
predict inventory-related outcomes, such as product demand and stock levels. A detailed account
of the model configurations, training parameters, and techniques used to prevent overfitting (such
as data augmentation and dropout) is included.
CHAPTER 2
LITERATURE SURVEY
Chen and Yang's (2022)[9] innovative garbage classification system, utilizing an improved
ShuffleNet v2 architecture, represents a significant leap in waste management technology. The
improved ShuffleNet v2's lightweight design and enhanced efficiency allow for rapid and precise
garbage sorting, providing a practical solution for the recycling and waste management industries
(Chen & Yang, 2022). This technology is crucial in streamlining garbage sorting processes,
facilitating recycling, and reducing environmental impact.
In the healthcare sector, Devaraj's (2024)[10] multi-branch ShuffleNet architecture has proven
to be a vital tool in advancing skin cancer diagnosis. By implementing deep learning techniques,
particularly the ShuffleNet architecture, this approach offers increased accuracy in identifying
various types of skin lesions. The multi-branch ShuffleNet architecture allows for improved
classification of skin cancer, aiding dermatologists in early diagnosis and potentially saving lives
through timely treatment (Devaraj, 2024)[10]. This advancement underscores the role of deep
learning in medical diagnostics, providing a framework for more effective healthcare solutions.
Perarasi and Ramadas (2023)[11] introduced a novel approach for detecting cracks in solar panel
images using an improved AlexNet classification method. This method enhances the accuracy
of crack detection, which is crucial for timely maintenance and repair of solar panels. By
employing deep learning, particularly through the AlexNet architecture, they were able to
provide a more reliable solution for identifying structural issues in solar panels, thus supporting
the sustainable energy sector (Perarasi & Ramadas, 2023)[11].
In the field of image processing, Li et al. (2023)[12] introduced the Residual Shuffle Attention
Network (RSAN), a breakthrough for image super-resolution. This network combines residual
connections and attention mechanisms to capture intricate details effectively. The use of the
ShuffleNet architecture in this context provides state-of-the-art performance, making RSAN a
significant contribution to image processing applications where fine detail is essential (Li et al.,
2023)[12]. This work has broader implications for industries relying on high-resolution images,
including satellite imagery and digital media.
Additionally, Xue et al.'s (2024)[13] development of a lightweight improved residual network
for efficient inverse tone mapping has led to significant improvements in image quality. This
approach efficiently enhances image tone mapping, proving beneficial for various multimedia
applications. By integrating deep learning, their solution offers a compelling answer to the
challenges of inverse tone mapping in image processing, impacting industries like photography
and film (Xue et al., 2024)[13].
Niu et al.'s (2024)[14] Ghost Residual Attention Network (GRAN) for single-image super-
resolution demonstrates how the combination of residual connections and attention mechanisms
can elevate image resolution to new heights. This advancement significantly improves single-
image super-resolution, contributing to a broader range of image processing applications (Niu et
al., 2024). GRAN's capabilities are particularly relevant in fields requiring enhanced image
resolution, such as medical imaging and satellite imagery.
In satellite image classification, Yadav et al.'s (2024)[20] deep learning approach significantly
improved the accuracy and efficiency of satellite image classification. Using convolutional
neural networks (CNNs), their method provides a reliable and automated solution for satellite
image classification, offering valuable insights for remote sensing applications and
environmental monitoring (Yadav et al., 2024).
Mora et al. (2020) [18] conducted a comprehensive review of Convolutional Neural Networks
(CNNs) in fruit image processing. Their analysis showed that CNNs substantially improved the
accuracy of fruit classification and quality assessment. This review offers a detailed
understanding of how deep learning can enhance agricultural practices and food quality control,
emphasizing the importance of CNNs in these fields (Mora et al., 2020).
2.1 Motivation
This project is driven by the necessity to achieve effective and precise inventory management in
supermarkets. Conventional approaches to inventory management are frequently slow, prone to
mistakes, and require a lot of manual work, resulting in inefficiencies and higher operating
expenses. The tremendous progress in deep learning-based image processing techniques presents
a substantial opportunity to transform inventory management operations in supermarkets.
Through the utilization of Convolutional Neural Networks (CNNs) and transfer learning, it is
feasible to create exceedingly precise and effective systems for automated product identification
and inventory management.
The objective of this research is to investigate the use of advanced deep learning algorithms,
specifically the AlexNet and ShuffleNet architectures, for accurately classifying inventory
images in supermarkets. This project aims to enhance the efficiency, accuracy, and scalability of
inventory management procedures in supermarkets by creating a strong and effective system for
automated product identification and inventory management. If this initiative is implemented
successfully, it has the potential to result in substantial cost reductions, better control over
inventory, and increased consumer happiness in supermarkets. Ultimately, this will contribute to
higher profitability and competitiveness in the retail sector.
2.2 Summary Of The Survey
Devaraj's (2024)[10] work on the multi-branch ShuffleNet demonstrates how this architecture's inherent flexibility allows for detailed analysis of skin
lesions, potentially leading to earlier and more accurate detection of skin cancers. Perarasi and
Ramadas (2023)[11] contribute to sustainable energy initiatives with their improved AlexNet
classification method, designed for the detection of cracks in solar panels. By automating the
inspection process and enhancing the reliability of crack detection, this research plays a crucial
role in maintaining the efficiency and longevity of solar energy systems. This technology could
lead to significant cost savings and promote wider adoption of renewable energy sources.
Li et al. (2023)[12] introduce the Residual Shuffle Attention Network (RSAN), a deep learning
model that excels in image super-resolution. This model's ability to capture fine details with high
accuracy represents a major advancement in the field of computer vision, with implications for
industries such as surveillance, medical imaging, and digital content creation. The enhanced
resolution provided by RSAN sets a new standard for image processing technologies. In a related
vein, Xue et al. (2024)[13] and Niu et al. (2024)[14] present lightweight networks designed for
inverse tone mapping and single-image super-resolution, respectively. These models are notable
for their efficiency and reduced computational resource requirements, making them highly
applicable in multimedia and image processing industries.
The medical field is further explored by Hüseyin Eldem (2023)[15], whose work on wound
image classification using AlexNet with transfer learning showcases the potential of deep
learning in medical diagnostics. This approach facilitates accurate classification of wound types,
aiding healthcare professionals in treatment planning and patient care. Foundational studies by
Liu and Jia (2015)[16] and the latest research by Yadav et al. (2024)[20] focus on satellite image
classification, underscoring the versatility of convolutional neural networks (CNNs) in diverse
fields such as agriculture, environmental monitoring, and urban planning. These studies illustrate
the potential for CNNs to transform industries by enabling large-scale data analysis with
unprecedented accuracy. Mora et al.'s (2020)[18] comprehensive review highlights the
transformative role of CNNs in fruit image processing, emphasizing their critical contribution to
agricultural practices and food quality control. This line of research has practical implications
for improving crop yield, reducing waste, and enhancing food safety.
CHAPTER 3
ARCHITECTURE AND ANALYSIS
3.1 Architecture Diagram
Data analysis: Analysis is conducted on the collected photos to extract relevant information, including
product names, categories, and labels. This procedure entails the utilization of sophisticated
image analysis algorithms to effectively separate and categorize goods. In addition, data analysis
techniques are used to understand the features and distribution of the dataset, which helps in
developing successful training strategies. The employed components include cutting-edge image
processing methods that leverage Convolutional Neural Networks (CNNs), data visualization
tools for investigating aspects of the dataset, and statistical analysis approaches for quantitative
assessment.
Data preprocessing: Preprocessing is performed on the gathered photos before model training to improve their
quality and suitability for machine learning tasks. This entails a sequence of preprocessing
procedures, including resizing to standardized dimensions, normalization to guarantee
uniform pixel intensity ranges, and augmentation to enhance dataset variability. Preprocessing
tools and libraries are used to automate these activities. These include picture scaling tools that
use interpolation techniques, data augmentation libraries that provide various transformations,
and preprocessing pipelines that combine numerous processing stages.
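As a rough sketch of such a pipeline, the numpy snippet below resizes an image with nearest-neighbour sampling and standardizes its pixels. Production pipelines would typically use a library's interpolation-based resizing instead, and the 224x224 target is an illustrative choice, not necessarily the project's input size.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize; stands in for the interpolation-based
    scaling tools mentioned above."""
    h, w = image.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return image[rows][:, cols]

def normalize(image):
    """Scale 8-bit pixels to [0, 1], then standardize per image."""
    x = image.astype(np.float64) / 255.0
    return (x - x.mean()) / (x.std() + 1e-8)

raw = np.random.randint(0, 256, size=(480, 640), dtype=np.uint8)
prepped = normalize(resize_nearest(raw, 224, 224))
print(prepped.shape)  # (224, 224)
```

Standardizing every image to the same size and intensity range is what lets a single CNN consume photos from heterogeneous cameras and lighting conditions.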
Model Building: We construct and train selected Convolutional Neural Network (CNN)
architectures using the preprocessed dataset. This stage entails the configuration of model
architectures, the initialization of model weights, and the optimization of hyperparameters in
order to achieve maximum performance. Dropout regularization and batch normalization
techniques are utilized to mitigate overfitting and enhance generalization. The training process
employs deep learning frameworks like TensorFlow or PyTorch, leveraging high-performance
computing infrastructure equipped with GPUs or TPUs to accelerate the training process. Model
assessment measures, such as precision, recall, accuracy and F1 score, are calculated to evaluate
the model performance. Model evaluation involves a thorough assessment of trained models,
focusing on performance criteria such as accuracy, precision, and recall. The selection process
for deploying models in real-world settings prioritizes those that exhibit the utmost accuracy and
dependability. Aside from quantitative measurements, qualitative factors such as the
interpretability of the model and its ability to handle fluctuations in input data are taken into
account. Evaluation metrics are computed using proven mathematical methods, and the
thresholds are determined based on the specified needs of the application and the experience in
the field.
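The evaluation metrics listed here can be computed from scratch, as the sketch below shows on toy labels. It uses macro averaging across classes, with F1 derived from the macro precision and recall; these are standard formulations, but the example labels are hypothetical.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 for multi-class labels."""
    labels = np.unique(np.concatenate([y_true, y_pred]))
    precisions, recalls = [], []
    for c in labels:
        tp = np.sum((y_pred == c) & (y_true == c))   # true positives for class c
        fp = np.sum((y_pred == c) & (y_true != c))   # false positives
        fn = np.sum((y_pred != c) & (y_true == c))   # false negatives
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    p, r = np.mean(precisions), np.mean(recalls)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return {"accuracy": np.mean(y_true == y_pred),
            "precision": p, "recall": r, "f1": f1}

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
m = classification_metrics(y_true, y_pred)
print(round(m["accuracy"], 3))  # 0.667
```

In practice these values would come from a held-out test split, and the deployment threshold would be set against the application requirements described above.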
The CNN models, namely AlexNet, ShuffleNet, ResNet, and a bespoke architecture, are
incorporated into a web-based application utilizing the Django framework for deployment. This
enables the development of an interactive and user-friendly interface for managing supermarket
inventory. The system offers instantaneous picture categorization and inventory surveillance
capabilities, facilitating automated inventory control, shelf supervision, and replenishment
procedures.
The deployment: The system consists of a Django web application that offers a dynamic frontend interface,
a scalable backend infrastructure for model inference, and a real-time image analysis module for
processing incoming photos.
This system design utilizes advanced deep learning techniques and state-of-the-art CNN
architectures to tackle the difficulties of inventory management in supermarkets. The system
aims to increase operational efficiency, decrease costs, and improve customer satisfaction by
seamlessly combining the data gathering, analysis, model development, and deployment phases.
The technical details above highlight the intricate and advanced nature of the proposed solution.
3.2 Frontend Design
The frontend design of the proposed supermarket inventory management system is carefully
crafted to fulfill the diverse needs of administrators, personnel, and automated processes, each
with unique functions. The system utilizes a Graphical User Interface (GUI) as the main interface
for user interaction. It incorporates role-based access control to provide precise access levels for
each user, ensuring data security. The GUI features a clean and user-friendly design, with clearly
labeled menus and controls to facilitate smooth navigation between the various sections of the
application. To guarantee compatibility with a wide range of devices, the GUI is designed to be
responsive, adapting effortlessly to different screen sizes and resolutions.
The frontend provides administrators with powerful features specifically designed for managing
user accounts, configuring permissions, and generating detailed reports on inventory status,
sales, and trends. Administrators also have the ability to customize system settings and
parameters to fulfill unique operational needs. On the other hand, staff are provided with tools
that make it easier for them to identify products, manage inventory, and have immediate access
to information on the status and placement of inventory. Automation is incorporated to manage
repetitive processes such as stock replenishment and inventory tracking, utilizing IoT devices for
immediate data capture and analysis.
Real-time visualization technologies are essential for enhancing the user experience by offering
immediate feedback during product identification activities. The live picture classification results
are shown in real-time, together with visual indicators such as color-coded annotations or
overlays to highlight recognized goods or categories. Users can use interactive controls to zoom,
pan, and personalize the display of supermarket photos, allowing them to concentrate on specific
areas of interest. The Convolutional Neural Networks
(CNNs) produce concise classification results, which include product labels and confidence
ratings. These results provide valuable information about the system's degree of confidence in
identifying the products. In addition, the frontend incorporates powerful inventory management
features to optimize the structure, retrieval, and analysis of product information. Product profiles
provide detailed information including the product's name, category, quantity, and its location
within the supermarket. The inclusion of advanced search and filter features allows users to
quickly find specific goods or subsets of data using appropriate criteria. The frontend design
places high importance on user preferences and customization, giving users the ability to adjust
display settings, notification preferences, and system configurations to match their own
workflow. The frontend also incorporates accessibility features to promote inclusivity and cater
to users with a wide range of requirements and abilities. Accessibility is
improved for all users through the use of high contrast settings, text scaling options, and keyboard
shortcuts. The supermarket inventory management system enhances operational efficiency and
customer satisfaction by methodically applying these frontend design concepts to optimize
product identification, inventory monitoring, and replenishment operations. To integrate the
Keras model into the Django web application for inventory optimization, we started by setting
up a Django project and creating a Django app. Once the project is set up, the trained Keras
model is saved as an .h5 file and moved into the Django app directory. Next, a Django view
function loads the Keras model using load_model() from Keras and makes predictions based on
input data provided by users. This view function is mapped to a URL in the urls.py file. For the
frontend, HTML templates are designed where users can input the data required for predictions,
and AJAX is used to establish communication between the frontend and backend. When users
submit input data through the frontend interface, AJAX sends this data to the Django view for
prediction; once the prediction is made, it is returned to the frontend as a JSON response and the
user interface is updated accordingly.
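The flow just described can be sketched as follows. This is a minimal illustration, not the project's actual code: the names build_prediction_response and predict_view, the model path, and the request format are all assumptions.

```python
import json
import numpy as np

def build_prediction_response(model, features):
    """Run a loaded Keras-style model on one sample and return a
    JSON-serializable result (predicted label index plus confidence)."""
    probs = np.asarray(model.predict(np.asarray([features])))[0]
    return {"label": int(np.argmax(probs)), "confidence": float(probs.max())}

# Django wiring as described above (hypothetical names, shown for context):
#
#   # views.py
#   from keras.models import load_model
#   MODEL = load_model("inventory/model.h5")        # the saved .h5 file
#
#   def predict_view(request):                      # mapped in urls.py
#       data = json.loads(request.body)             # sent via an AJAX POST
#       return JsonResponse(build_prediction_response(MODEL, data["features"]))
```

Keeping the model in a module-level variable means the .h5 file is loaded once at startup rather than on every request, which is what makes the AJAX round-trip fast enough to feel real-time.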
Data Processing and Storage: After receiving unprocessed image data from the frontend
interface, the backend begins preprocessing tasks to prepare the data for subsequent analysis.
These tasks involve standardizing image sizes, normalizing pixel values, and improving image
quality. The preprocessed data is then saved in a well-organized database optimized for efficient
retrieval and analysis, and supermarket image data is protected through strong data management
standards.
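The standardization step can be sketched in a few lines of numpy; the target size and the nearest-neighbour resize are illustrative assumptions, since the report only states that sizes are standardized and pixel values normalized.

```python
import numpy as np

def preprocess(img, size=224):
    """Nearest-neighbour resize to (size, size) and scale pixel
    values from [0, 255] into [0, 1] for subsequent analysis."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    return img[rows][:, cols].astype("float32") / 255.0
```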
The backend has a collection of Convolutional Neural Network (CNN) models, such as AlexNet,
ShuffleNet, ResNet, and a bespoke manual architecture, which are used for image classification
tasks. Model training strategies utilize annotated datasets to enhance and optimize model
performance. Methods such as data augmentation, hyperparameter optimization, and cross-
validation are utilized to improve the resilience of the model and mitigate overfitting. Model
performance is reliably quantified by computing evaluation measures such as accuracy, precision,
recall, and F1-score.
Integration with External Systems: Smooth integration with external systems, such as
inventory databases, point-of-sale (POS) systems, and product information databases, is made
possible through the use of APIs and middleware. This facilitates the effective transfer and
coordination of data between the supermarket inventory management system and other
platforms. The utilization of real-time data synchronization procedures guarantees that the
inventory database remains constantly updated in accordance with the categorization findings
produced by the CNN models.
The backend is carefully designed for scalability and performance in order to handle substantial
amounts of image data and simultaneous user interactions. Methods such as parallel processing,
distributed computing, and load balancing are utilized to enhance system performance and
maximize resource utilization.
Class Diagram:
The class diagram proved to be an important communication tool for achieving design clarity. It
facilitated the communication of the system's design and architecture to developers, stakeholders,
and other project participants. The class diagram provides a concise and comprehensive
overview of the system's structure and design, enabling all project stakeholders to easily
comprehend it.
The class diagram facilitated the identification of links and interdependence among various
classes. By visualizing these interconnections, I gained insight into the potential impact of
changes in one component of the system on other components, hence enhancing decision-making
capabilities during the development process. The class diagram functioned as a precise plan for
carrying out the implementation. As a developer, this roadmap gave me clear guidance on
developing code for the various classes and verifying that the final implementation adhered to
the system's architecture.
Image (Color, Pixel): Within the context of the supermarket inventory optimization project,
this class serves as the foundation for handling raw input image data. It encapsulates attributes
related to the color information of the images, such as RGB values, and stores pixel-level details
crucial for subsequent processing steps.
InputInformation (Field, Frame): Playing a pivotal role in preprocessing raw image data and
extracting relevant features, this class is integral to the project's data processing pipeline. It
includes attributes for the features or fields extracted from the supermarket images (Field) and
encapsulates the structural information or frame of the input images (Frame), aiding in
subsequent analysis and classification tasks.
Test (Testing the Machine): Responsible for evaluating the trained TensorFlow model's
performance using test data pertinent to the supermarket inventory optimization context, this
class facilitates rigorous assessment of the model's classification capabilities. Attributes within
this class pertain to the test data used to gauge the model's effectiveness in accurately
classifying supermarket products based on the images provided.
Output, Test Data (Classified): Crucial for the project's outcome generation, this class
encapsulates the output data derived from the TensorFlow model post-classification. It includes
attributes representing the classified test data, reflecting the results of the image classification
process conducted by the TensorFlow model.
CHAPTER 4
DEEP LEARNING BASED IMAGE CLASSIFICATION TO
OPTIMIZE INVENTORY
Afterwards, the data cleaning step consists of using preprocessing algorithms to standardize the
format and look of the inventory photographs. Quality control methods are put in place to identify
and resolve typical problems including blurriness, inconsistent lighting, and background clutter.
This ensures that only high-quality photographs are kept for further processing. Visual inspection
is a crucial stage to verify the success of the cleaning process, guaranteeing that the pictures are
sharp, well-illuminated, and devoid of any disturbances.
Annotating the dataset significantly improves its informativeness and usefulness. Supermarket
personnel or labeling professionals annotate each image in the inventory with labels that indicate
the product type, brand, and other relevant information. To guarantee uniformity and
reproducibility, standardized guidelines and protocols are created for annotation. These
guidelines cover various levels of detail to capture different elements of each
product. During the feature extraction step, image processing methods are utilized to extract
pertinent characteristics from the preprocessed inventory photos. Different feature
representations, such as color histograms, texture features, and form descriptors, are used to
capture unique aspects of the items. The study also investigates advanced feature extraction
strategies, such as utilizing pre-trained convolutional neural network (CNN) models, to harness
the representational power of deep learning.
Data augmentation strategies are utilized to expand the variety and size of the dataset, thereby
improving the resilience and generalization abilities of the model. Geometric transformations,
such as rotation, flipping, and resizing, replicate changes in image orientation and scale, while
random perturbations, such as noise addition, blur, and brightness modifications, imitate
real-world variations in lighting conditions and image quality. Collectively,
these procedures provide an all-encompassing data preparation process specifically designed for
addressing the distinct obstacles and criteria associated with picture categorization in grocery
stores.
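The augmentation steps listed above can be sketched in a few lines of numpy; the parameter ranges (flip probability, brightness factor, noise level) are illustrative choices, not values from the report.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """Apply a random flip/rotation plus brightness jitter and additive
    noise to one image whose pixel values lie in [0, 1]."""
    if rng.random() < 0.5:
        img = np.fliplr(img)                        # horizontal flip
    img = np.rot90(img, k=int(rng.integers(0, 4)))  # 0/90/180/270 degree rotation
    img = img * rng.uniform(0.8, 1.2)               # brightness modification
    img = img + rng.normal(0.0, 0.02, img.shape)    # noise addition
    return np.clip(img, 0.0, 1.0)
```

In a Keras pipeline the same effect is usually obtained with built-in augmentation utilities; the sketch above just makes the individual transformations explicit.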
The initial stage entails the careful construction of the CNN architecture, taking into account
aspects such as the depth and breadth of the network, the size of the kernels, and the connection
patterns. The study investigates different CNN architectures, including classic CNNs and
advanced models such as deep residual networks (ResNets), AlexNet, and ShuffleNet, to address
the special needs of supermarket inventory categorization. The model architecture incorporates
domain-specific information and limitations to maximize performance and tackle the particular
issues found in retail contexts. Hyperparameter tuning involves defining a search space for
hyperparameters, which includes factors like batch size, learning rate, optimizer selection,
weight initialization, and regularization techniques. Hyperparameters are methodically adjusted
using approaches such as random search, grid search, or Bayesian optimization. Performance
assessment across different hyperparameter combinations is carried out utilizing holdout
validation or cross-validation approaches to determine the ideal configurations for model
training. Partitioning the preprocessed dataset into training, validation, and test sets using proper
ratios is essential for efficient model training. Stratification strategies are used to ensure that the
distribution of classes remains consistent across different subsets, especially when dealing with
unbalanced datasets that include different product categories. The method of randomizing the
data splitting helps to reduce biases and assures that the model assessment metrics are resilient.
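A minimal sketch of a stratified train/validation/test split of the kind described above, using scikit-learn on toy stand-in data; the 70/15/15 ratios, five classes, and random seed are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-ins for image features and product-category labels.
X = np.arange(100).reshape(100, 1)
y = np.array([i % 5 for i in range(100)])    # five classes, 20 samples each

# First carve off 30% for validation + test, stratified on the labels,
# then split that half-and-half; every subset keeps the class proportions.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42)
```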
During the training of the model, the parameters are initialized using approaches such as Xavier
or He initialization in order to accelerate convergence. Efficient model training involves the use
of mini-batch Stochastic Gradient Descent (SGD) or adaptive optimization techniques such as
Adam or RMSprop. Metrics such as loss function values, accuracy, and validation performance
are used to monitor the progress of training. This helps to identify convergence and prevent
overfitting.
Validation is essential for evaluating the performance of a trained model based on preset criteria
such as accuracy, precision, recall, and F1 score. Examining and visualizing training and
validation curves can help detect possible problems with overfitting or underfitting, providing
guidance for future optimization. Validation-based early stopping strategies are employed to
prevent model degradation and optimize training efficiency, guaranteeing the resilience and
dependability of the trained CNN models for supermarket inventory categorization.
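Validation-based early stopping amounts to halting once the validation loss has not improved for a set number of epochs (the "patience"); in Keras this is the EarlyStopping callback with restore_best_weights=True. A standalone sketch of the rule itself:

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch): the epoch at which training halts
    after `patience` epochs without validation-loss improvement, and the
    epoch whose weights would be restored."""
    best, best_epoch, wait = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch
```

For a loss curve 1.0, 0.8, 0.7, 0.71, 0.72, 0.73, … the rule stops at the third non-improving epoch and restores the epoch-2 weights, preventing the degradation that continued training would cause.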
Metric calculation entails computing evaluation metrics using tools such as confusion matrices,
precision-recall curves, ROC curves, and other diagnostic performance measurements. The
interpretation of these measures is based on the specific context of the supermarket inventory
classification task, taking into account aspects such as the frequency of various product
categories and the impact of false positive and false negative predictions on inventory
management procedures. Error analysis involves the thorough investigation of model mistakes
and misclassifications in order to identify recurring trends, systematic biases, and areas of
uncertainty. Examining cases where false positives and false negatives occur helps to identify
the root causes and possible sources of confusion in the classification procedure. Engaging with
domain specialists, such as inventory managers and retail analysts, enables the verification of
model predictions and refinement of the classification criteria.
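A confusion matrix and the per-class metrics derived from it can be sketched directly in numpy; this is a simplified illustration, not the project's evaluation code.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] counts samples of true class i predicted as class j;
    diagonal entries are correct predictions."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def per_class_precision(cm):
    """Precision per class: correct predictions over all predictions
    made for that class (one column of the matrix)."""
    return np.diag(cm) / cm.sum(axis=0)
```

Reading down a column shows which classes the model confuses with a given prediction, which is exactly the information error analysis needs.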
Optimizing the model is crucial for improving the CNN model's resilience and flexibility. The
study explores several regularization techniques, including dropout, weight decay, batch
normalization, and early stopping, which are used to reduce overfitting and improve
generalization. It also explores architectural alterations, such as model ensembling, transfer
learning, and architecture search, to enhance the performance and adaptability of the model. The
hyperparameters are adjusted systematically using insights gained from performance evaluation
and error analysis, refining the model's structure and training approach to attain the highest
possible accuracy and reliability in classifying supermarket inventory for management purposes.
provides evidence of the system's concrete advantages and operational efficacy. Compliance and
adherence to standards are of utmost importance, requiring careful attention to the applicable
rules and guidelines that govern the creation of automation systems. This entails assuring
conformity with industry norms and regulatory mandates, along with the creation of thorough
documentation and compliance reports. Working together with professionals in regulatory
compliance helps to navigate through regulatory channels and obtain the required permissions
or certificates, guaranteeing that the supermarket automation system complies with regulatory
standards.
Deployment involves the incorporation of the trained CNN model into both the frontend interface
and backend architecture of the supermarket automation system. This allows for real-time picture
analysis and decision assistance capabilities. Thorough deployment testing and validation
methods are carried out to ensure smooth integration with current supermarket workflows,
compatibility with inventory management systems, and adherence to strict data privacy and
security requirements. In addition, extensive training and educational programs are implemented
to acquaint supermarket employees with the functioning, interpretation of classification
outcomes, and upkeep of the automated classification system, guaranteeing efficient usage at all
levels of operation.
Continuous monitoring and improvement procedures are implemented to constantly observe and
evaluate the performance of the deployed categorization system. These procedures include
real-time performance monitoring, error reporting, and the systematic collection of feedback.
Post-deployment surveillance involves continuously monitoring system performance
indicators, analyzing user input, and identifying areas for optimization. The classification
system's design, functionality, and performance are continuously improved and refined through
iterative processes. This is done by incorporating insights from user feedback, operational
experience, and advancements in technological paradigms.
max-pooling layers, fully connected layers, and softmax activation functions for the purpose of
classification. The significance of the architecture to the project lies in its capacity to accurately
identify complex characteristics and patterns from input photos of diverse grocery items. Now,
we will thoroughly analyze and examine the structure of the AlexNet architecture.
Input Layer: Positioned at the forefront of the architecture, the input layer is responsible for
receiving preprocessed pictures that depict a variety of grocery items and products. Subsequently,
the design consists of several convolutional layers. The first layer, known as Convolutional
Layer 1, has 96 filters with a kernel size of 11x11 and a stride of 4 pixels. When combined with
ReLU activation functions, these filters extract basic visual elements such as edges and textures
from the input pictures.
Max-Pooling Layer 1: decreases the size of the image and highlights important characteristics
by taking the maximum value inside a 3x3 pixel window, moving 2 pixels at a time. The
subsequent convolutional layers apply smaller 5x5 and 3x3 filters to capture progressively more
abstract features, with an intermediate max-pooling layer reducing the spatial dimensions
between them.
Max-Pooling Layer 3: is used to further decrease the size of the spatial dimensions and extract
distinctive characteristics. Next, the architecture progresses to a Flatten Layer, which converts
the output of the previous convolutional layer into a one-dimensional vector.
Fully Connected Layers 1 and 2: each contain 4096 neurons and utilize ReLU activation
functions to introduce non-linearity. The Output Layer, which consists of 15 neurons
representing the various categories of grocery items, employs softmax activation to calculate the
probability distribution across these categories. The deep architecture and careful design of
AlexNet allow it to reliably detect and categorize supermarket goods, making it a crucial
component in the project's image classification pipeline. AlexNet is highly effective at extracting
complex characteristics and patterns through its convolutional and fully connected layers, which
has greatly contributed to the success of the supermarket inventory categorization system.
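The spatial sizes quoted above follow from the standard convolution output-size formula, floor((n + 2p - k) / s) + 1. A small sketch checking the first two layers for a 227x227 input; the input resolution is the standard AlexNet choice and is an assumption here, since the report does not state it.

```python
def conv_out(n, kernel, stride, pad=0):
    """Output spatial size of a convolution or pooling layer."""
    return (n + 2 * pad - kernel) // stride + 1

size = 227                    # assumed AlexNet-style input resolution
size = conv_out(size, 11, 4)  # Convolutional Layer 1: 11x11 filters, stride 4 -> 55
size = conv_out(size, 3, 2)   # Max-Pooling Layer 1: 3x3 window, stride 2 -> 27
```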
4.5.2 ShuffleNet Architecture
ShuffleNet stands out as a convolutional neural network architecture tailored explicitly for
efficient and high-performance image classification endeavors. Noteworthy for its compact
design, ShuffleNet finds particular utility in deployments on resource-constrained platforms like
mobile phones and embedded systems. Let's explore the ShuffleNet architecture and its
pertinence to the supermarket image classification project.
Input Layer: The input layer of ShuffleNet serves as the entry point for preprocessed images
from the dataset, each representing diverse supermarket products and items. Unlike conventional
convolutional layers, the layers that follow are designed for computational efficiency.
Depthwise Convolution Layer: ShuffleNet adopts depthwise separable convolutions in its
convolutional layers to curb computational complexity while preserving representational
capacity. This approach entails a two-step convolution process: depthwise convolution, which
convolves each input channel separately with a distinct filter, followed by pointwise convolution,
which consolidates the outputs of the depthwise convolution using 1x1 convolutions.
Channel Shuffle: A notable feature distinguishing ShuffleNet is its channel shuffle operation,
which facilitates information exchange among feature maps from different groups. Here, group
convolution segregates feature maps into multiple groups for independent convolution, after
which the channel shuffle enables cross-group communication, enhancing the network's
expressive power and further optimizing efficiency.
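Both ideas can be made concrete in a few lines of numpy. The helpers below compare the weight count of a standard convolution (k·k·Cin·Cout) with its depthwise-separable equivalent (k·k·Cin + Cin·Cout), and channel_shuffle interleaves channels across groups; the channel counts used in the example are illustrative, not taken from ShuffleNet's actual configuration.

```python
import numpy as np

def standard_conv_params(k, c_in, c_out):
    return k * k * c_in * c_out         # one k x k filter per (input, output) pair

def separable_conv_params(k, c_in, c_out):
    return k * k * c_in + c_in * c_out  # depthwise pass + 1x1 pointwise pass

def channel_shuffle(x, groups):
    """Interleave the channel axis of an H x W x C tensor across groups
    so that subsequent group convolutions see every group's features."""
    h, w, c = x.shape
    return (x.reshape(h, w, groups, c // groups)
             .transpose(0, 1, 3, 2)
             .reshape(h, w, c))
```

For example, a 3x3 convolution over 144 input and 144 output channels needs 186,624 weights in standard form but only 22,032 in separable form, which is where ShuffleNet's efficiency comes from.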
Subsequently, the feature vector undergoes classification through a fully connected layer.
Softmax activation: applied to the fully connected layer outputs class probabilities, with the
class exhibiting the highest probability deemed the predicted class for the input image. In this
supermarket image classification project, ShuffleNet assumes a pivotal role as one of the deep
learning models adept at efficiently categorizing images of assorted supermarket products. Its
compact and resource-efficient architecture renders it amenable to deployment on platforms with
constrained computational resources, enabling real-time image classification within supermarket
environments.
Input Layer: In a ResNet-based image classification project, the process starts with an input
layer that receives the preprocessed images.
Convolutional Layers: The initial layers of ResNet are a series of convolutional layers. These
layers extract key features from the input images, often using a combination of filters, kernel
sizes, and strides to capture different aspects of the images. After each convolutional layer,
there's typically a batch normalization step to standardize activations, followed by a Rectified
Linear Unit (ReLU) activation function, which introduces non-linearity into the network.
Residual Blocks: The hallmark of ResNet is its use of residual blocks. A residual block contains
two paths: the shortcut path and the main path. The shortcut path, also known as identity
mapping, is designed to allow the input to skip certain layers via skip connections. This setup
helps the gradients flow more easily during training, reducing the risk of vanishing gradients.
The main path, on the other hand, applies a series of convolutional layers to the input. After
processing through the main path, the output is added to the original input from the shortcut path,
and the combined result is passed through a ReLU activation function. This approach allows
ResNet to learn residual functions, focusing on differences rather than absolute values, which
can be easier for very deep networks to manage.
Skip Connections and Stacking Blocks: The skip connections in ResNet are the key to its
resilience against the vanishing gradient problem. By allowing gradients to flow through shorter
pathways, ResNet can effectively train networks with hundreds of layers. ResNet architectures
are typically composed of multiple residual blocks stacked on top of each other. The
configuration, including the number of filters and kernel sizes, can vary depending on the ResNet
variant, such as ResNet-50.
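The residual computation itself is just y = ReLU(F(x) + x), where F is the main path and x rides along the shortcut. A stripped-down numpy sketch of that rule (real residual blocks use convolutions for F; here F is passed in as an arbitrary function):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, main_path):
    """Add the main path's output to the identity shortcut, then apply
    ReLU - the block learns the residual F(x), not the full mapping."""
    return relu(main_path(x) + x)
```

If the main path outputs zeros (learns nothing), the block reduces to ReLU(x), so stacking blocks can never make the representation worse than the identity; this is the intuition behind why gradients flow easily through very deep stacks.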
Bottleneck Blocks in Deeper Variants: Deeper variants of ResNet, like ResNet-50 and beyond,
often incorporate bottleneck blocks to improve computational efficiency. A bottleneck block
typically involves a combination of 1x1, 3x3, and 1x1 convolutional layers. This design reduces
the number of parameters and computation while maintaining high representational power.
Global Average Pooling and Fully Connected Layer: Following the convolutional and residual
blocks, ResNet generally uses global average pooling to reduce the spatial dimensions of the
feature maps. This operation aggregates spatial information, resulting in a fixed-size vector
regardless of the input image size.
The output from global average pooling: This vector is then fed into a fully connected layer,
where classification takes place. This layer maps the extracted features to the specific output
classes, representing the different supermarket product categories.
Softmax Activation and Output: In the final stage, a softmax activation function is applied to
the output of the fully connected layer. This converts the results into probabilities.
The bespoke manual model is instantiated, commencing with a convolutional layer followed by
max-pooling to extract salient features from the input images. Subsequently, the extracted feature
maps are flattened before traversing two fully connected layers, each with a ReLU activation
function. The final layer adopts the softmax activation function, facilitating multiclass
classification. Upon defining the model architecture, compilation ensues, employing the
RMSprop optimizer and the categorical cross-entropy loss function. Throughout training, the
model's performance is assessed using the accuracy metric. Model training unfolds via the fit
method, wherein training data is fed in batches from the training set, while validation data is
sourced from the test set. Notably, model checkpoints are strategically employed to preserve the
best-performing model based on accuracy throughout the training process, ensuring optimal
model retention.
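A minimal Keras sketch matching that description. The layer widths, input resolution, and checkpoint filename are illustrative assumptions; the report specifies only the layer types, the RMSprop optimizer, the categorical cross-entropy loss, and the checkpointing behaviour.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Conv -> max-pool -> flatten -> two ReLU dense layers -> 15-way softmax.
model = keras.Sequential([
    layers.Input(shape=(64, 64, 3)),          # assumed input resolution
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(15, activation="softmax"),   # 15 product categories
])

model.compile(optimizer="rmsprop",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Keep only the best-performing weights seen during training:
checkpoint = keras.callbacks.ModelCheckpoint(
    "best_model.h5", monitor="val_accuracy", save_best_only=True)
# model.fit(train_x, train_y, batch_size=32, epochs=100,
#           validation_data=(test_x, test_y), callbacks=[checkpoint])
```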
The strong design and integrated features of Django make it an ideal option for constructing the
backend of the web application. Using Django, I structured the project into distinct applications,
each responsible for a specific piece of functionality. I developed a Django application
specifically designed for inventory management, incorporating the backend functionality to load
the trained Keras model and generate predictions from user input.
The trained Keras model was saved as a .h5 file and integrated into the Django application. I
employed Django's views to load the Keras model using the load_model() function provided by
Keras. This enabled me to generate forecasts based on the inventory data submitted by users via
the web interface. By associating the view function with a specific URL in the urls.py file, I
created a defined pathway for managing prediction requests.
Regarding the frontend, I created HTML templates utilizing Django's template system. These
templates offered a user-friendly interface that allowed users to input the necessary data for
optimizing inventory. To enhance communication between the frontend and backend, I utilized
AJAX (Asynchronous JavaScript and XML). Using AJAX, I implemented asynchronous
requests that transmit user input data to the Django view for prediction without reloading the
entire page. This facilitated a smooth and uninterrupted user experience, improving the speed
and responsiveness of the web application. An important benefit of incorporating a Keras model
into the Django web application was the capability to deliver real-time forecasts for inventory
optimization to store management. The Keras model, trained on past sales data, can reliably
predict the demand for various items, enabling managers to optimize inventory levels and reduce
instances of stockouts or overstocking. Through the Django-based web interface, inventory
optimization becomes more accessible and user-friendly.
The class diagram was essential in establishing the structure of the Django application that
handles the integration of the Keras model. The class diagram facilitated the organization of the
code structure and comprehension of the interactions among various components, such as views,
models, and templates. This enhanced the efficiency of the web application's development.
CHAPTER 5
RESULT AND DISCUSSION
The loss metric computes the discrepancy between the predicted and actual values. The term
"accuracy" denotes the proportion of samples that were correctly classified relative to the total
number of samples; a higher accuracy signifies better overall performance. "Precision" assesses
the validity of the model's positive predictions by quantifying the proportion of predicted
positives that are true positives. "Recall" computes the percentage of true positive instances that
were accurately identified by the model; it is the ratio of true positives to the combined count of
true positives and false negatives. The "F1-score" is calculated as the harmonic mean of precision
and recall, thereby offering a unified metric that balances the two. Its utility is especially
pronounced in unbalanced datasets where one class dominates the others.
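The F1 computation follows directly from those definitions; a one-line helper for illustration:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; it is high only when both
    components are high, which is what makes it informative on
    unbalanced datasets."""
    return 2 * precision * recall / (precision + recall)
```

A model with precision 1.0 but recall 0.5 gets F1 = 2/3, well below the arithmetic mean of 0.75: the harmonic mean penalizes the weaker of the two components.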
5.1 Manual Model
Fig.5.3 Confusion Matrix
The image depicts the confusion matrix computed using the manual architecture. A confusion
matrix is a table that visualizes the performance of a classification model by comparing predicted
and actual values.
Accuracy Improvement: The initial accuracy of the model was quite low, registering at 0.0729
during the first epoch. However, there was a noticeable improvement as the training progressed.
By the seventh epoch, the accuracy had increased to 0.1146, indicating a steady improvement in
the model's performance over successive epochs. This trend suggests that the model was
effectively learning from the training data, gradually enhancing its ability to classify images
accurately.
Model Loss: The model's loss, which reflects the disparity between predicted and actual values,
exhibited a significant reduction throughout the training process. Initially, the loss was
exceedingly high at 67.5473. However, by the end of the training, the loss had diminished
substantially to 2.6580. This decline in loss indicates that the model was converging, with its
predictions aligning more closely with the ground truth labels as training progressed.
Precision, Recall, and F1-Score: Precision, recall, and the F1-score are essential measures for
assessing the model's performance, especially in situations where there is an imbalance in the
class distribution. Precision quantifies the accuracy of positive predictions, while recall evaluates
the model's capability to correctly identify real positive instances. The F1-score is the harmonic
mean of precision and recall.
While there were incremental improvements in accuracy and loss over the training course, the
model's overall performance remained inadequate. Further optimization of the model
architecture or training process is imperative to enhance its accuracy and reliability in classifying
images accurately. Additional experimentation and refinement may be necessary to achieve the
desired level of performance for practical deployment in real-world applications.
5.2 ShuffleNet Architecture
Table 5.2 ShuffleNet Architecture Results
The ShuffleNet model was trained for 100 epochs, during which its performance was continuously
tracked across several metrics. At the end of the training process, the model demonstrated high
accuracy, with a score of 88.54% on the training set and 84.79% on the validation set,
suggesting that the model learned effectively and generalized well. This accuracy was
accompanied by equally strong precision: 88.54% on the training set and 84.94% on the validation
set, indicating dependable classification with few false positives.
During the training phase, the model's ability to learn was demonstrated by a steady decrease
in loss values. By the final epoch, the training loss had fallen to 0.2492, while the validation loss
dropped to 0.3089. This indicates that the model was learning effectively and did not exhibit
substantial overfitting.
The performance trends during the 100 epochs exhibited an overall rising trajectory in both
training and validation accuracy, with occasional modest fluctuations seen in the latter. These
fluctuations are common in deep learning due to differences in data and training methods. The
continual decline in loss values further emphasizes the model's ability to accurately extract
important characteristics from the dataset.
5.3 AlexNet Architecture
Results Summary
Accuracy: In the final epoch of training, the AlexNet model achieved strong accuracy,
attaining 92.50% on the training data and 89.58% on the validation data. This highlights its
effectiveness in classifying images of grocery products.
Precision: Precision metrics were examined to evaluate the model's capacity to correctly identify
positive cases. In the final epoch, the model demonstrated its robustness by achieving
precision scores of 93.33% on the training data and 90.32% on the validation data, indicating its
ability to make correct predictions.
Loss: Examining the loss values yielded significant insights into the model's learning.
During the training phase, the model consistently decreased the loss on both the training and
validation data. More precisely, the loss on the training data decreased to 0.3420, and on the
validation data it reached 0.3011 by the final epoch. This decrease indicates the
model's capacity to reduce the discrepancy between predicted and actual values, thereby
improving its predictive accuracy.
Model Performance Over Epochs: Despite some slight changes in validation accuracy, the
overall trajectory consistently showed an increasing trend, indicating the model's ability to
progressively learn from the data. Similarly, the loss values consistently decreased during the
course of training, suggesting ongoing improvement and optimization of the model's parameters.
5.4 ResNet Architecture
Results Summary
An in-depth analysis of the ResNet model's performance yielded detailed insights into
its effectiveness in image classification tasks, summarized across the following metrics.
Accuracy: In the final epoch, the ResNet model showcased commendable accuracy rates,
achieving an impressive 86.46% on the training data and 78.75% on the validation data. This
signifies the model's proficiency in accurately categorizing images of supermarket products,
highlighting its efficacy as a classification tool.
Precision: Precision metrics further underscored the model's robustness, with precision scores
of 87.83% on the training data and 79.91% on the validation data in the final epoch. This metric
measures the model's ability to correctly identify positive cases, indicating a high level of
precision in its predictions.
Loss: An analysis of loss values provided valuable insights into the model's learning dynamics.
Throughout the training process, the model consistently reduced the loss on both the training
and validation data, indicative of its adeptness at minimizing errors between
predicted and actual values. In the final epoch, the loss on the training data stood at 0.3470, while
on the validation data it was 0.6886, reflecting the model's ability to learn and extract meaningful
features from the dataset.
Model Performance Over Epochs: A granular examination of the model's performance across
epochs revealed a progressive improvement in both training and validation accuracy. While
training accuracy gradually increased throughout training, reaching 86.46% in the final epoch,
validation accuracy exhibited minor fluctuations but ultimately reached 78.75%. Similarly, the
training loss declined steadily to 0.3470, and the validation loss, though fluctuating, ended at
0.6886 in the final epoch.
The ResNet model demonstrated promising performance across the evaluation criteria,
affirming its efficacy as a robust classification tool for image recognition in the context of
supermarket product identification. While its accuracy, precision, and loss values underscore
its effectiveness, further optimization and fine-tuning may be warranted to enhance its
performance.
Table 5.5 Model Comparison Results

Model          Acc (Train)  Acc (Val)  Prec (Train)  Prec (Val)  Loss (Train)  Loss (Val)
Manual Model   11.46%       10.29%     13.25%        12.05%      3.4200        3.0110
ShuffleNet     88.54%       84.79%     88.54%        84.94%      0.2492        0.3089
AlexNet        92.50%       89.58%     93.33%        90.32%      0.3420        0.3011
ResNet         86.46%       78.75%     87.83%        79.91%      0.3470        0.6886
The table compiles the accuracy, precision, and loss of the models. The manual model performed
very poorly, with very high loss and very low accuracy. ShuffleNet, AlexNet, and ResNet, by
contrast, showed significant gains; of these three, ResNet performed the weakest. Furthermore,
ShuffleNet and AlexNet exhibited comparatively smaller final training and validation losses
than ResNet and the manual model.
AlexNet outperformed the other models in accuracy and precision on both the training and
validation datasets, reaching 92.50% training accuracy and 89.58% validation accuracy. AlexNet
also displayed the highest precision: 93.33% on the training data and 90.32% on the validation
data. Its comparatively lower final training and validation losses further indicated better
convergence and learning than the other models.
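The comparison above amounts to ranking the architectures by their validation metrics. As a small illustration, the values reported in this chapter can be sorted programmatically (the dictionary below simply restates those results):

```python
# Final-epoch validation metrics reported in this chapter.
results = {
    "Manual":     {"val_acc": 0.1029, "val_loss": 3.0110},
    "ShuffleNet": {"val_acc": 0.8479, "val_loss": 0.3089},
    "AlexNet":    {"val_acc": 0.8958, "val_loss": 0.3011},
    "ResNet":     {"val_acc": 0.7875, "val_loss": 0.6886},
}

# Rank models from best to worst by validation accuracy.
ranking = sorted(results, key=lambda m: results[m]["val_acc"], reverse=True)
print(ranking)  # ['AlexNet', 'ShuffleNet', 'ResNet', 'Manual']
```

Sorting by validation loss instead would give the same winner here, since AlexNet also has the lowest validation loss.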
CHAPTER 6
CONCLUSION AND FUTURE SCOPE
6.1 Conclusion
In conclusion, our project on deep learning-based image classification for supermarket
automation represents a significant advancement in modernizing supermarket operations. By
utilizing the power of Convolutional Neural Networks (CNNs) and advanced image processing
techniques, we have developed a robust system capable of automating tasks related to product
classification and inventory management, thereby enhancing efficiency, accuracy, and
productivity within supermarkets.
Key findings from our study include:
• Evaluation of multiple CNN architectures including AlexNet, ShuffleNet, ResNet, and a
custom manual architecture for image classification.
• Comparison of the performance of these architectures based on metrics such as precision,
accuracy, and reliability.
• Selection of AlexNet as the most effective architecture due to its superior accuracy and
performance in classifying a diverse range of supermarket products.
Our project demonstrates the potential of deep learning-based image classification systems to
revolutionize supermarket operations, offering a scalable, adaptable, and high-performance
solution to streamline various processes and improve overall operational efficiency.
6.2 Future Scope
Longitudinal Studies:
Conducting longitudinal studies to evaluate the long-term performance and reliability of the
image classification system in real-world supermarket environments. This includes assessing the
system's robustness to changes in lighting conditions, shelf layouts, and product placements over
time.
REFERENCES
11. Perarasi, M., Ramadas, G. Detection of Cracks in Solar Panel Images Using Improved
AlexNet Classification Method. Russ J Nondestruct Test 59, 251–263 (2023).
https://doi.org/10.1134/S1061830922100230
12. Li, X., Shao, Z., Li, B. et al. Residual shuffle attention network for image super-
resolution. Machine Vision and Applications 34, 84 (2023). https://doi.org/10.1007/s00138-
023-01436-9
13. Xue, L., Xu, T., Song, Y. et al. Lightweight improved residual network for efficient inverse
tone mapping. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-023-17811-7
14. Niu, A., Wang, P., Zhu, Y. et al. GRAN: ghost residual attention network for single image
super resolution. Multimed Tools Appl 83, 28505–28522 (2024).
https://doi.org/10.1007/s11042-023-15088-4
15. Hüseyin Eldem, Alexnet architecture variations with transfer learning for classification of
wound images, Engineering Science and Technology, an International Journal, Volume
45, September 2023, 101490
16. Szegedy, C., Liu, W., Jia, Y. et al. Going deeper with convolutions. 2015 IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), June 2015.
DOI: 10.1109/CVPR.2015.7298594
17. Atchaya, A.J., Anitha, J., Priya, A.G., Poornima, J.J., Hemanth, J. (2023). Multilevel
Classification of Satellite Images Using Pretrained AlexNet Architecture. In: Jabbar, M.A.,
Ortiz-Rodríguez, F., Tiwari, S., Siarry, P. (eds) Applied Machine Learning and Data
Analytics. AMLDA 2022. Communications in Computer and Information Science, vol
1818. Springer, Cham. https://doi.org/10.1007/978-3-031-34222-6_17
18. Mora, M., Hernández-García, R., Barrientos, R.J., Fredes, C., Valenzuela, A.,
Naranjo-Torres, J. A Review of Convolutional Neural Networks Applied to Fruit Image
Processing. Received 16 April 2020; published 16 May 2020.
19. Ullah, A., Elahi, H., Sun, Z. et al. Comparative Analysis of AlexNet, ResNet18 and
SqueezeNet with Diverse Modification and Arduous Implementation. Arab J Sci Eng 47,
2397–2417 (2022). https://doi.org/10.1007/s13369-021-06182-6
20. Yadav, D., Kapoor, K., Yadav, A.K. et al. Satellite image classification using deep learning
approach. Earth Sci Inform (2024). https://doi.org/10.1007/s12145-024-01301-x
21. B, S., Mahesh, S. Hybrid optimized MRF based lung lobe segmentation and lung cancer
classification using Shufflenet. Multimed Tools Appl (2023).
https://doi.org/10.1007/s11042-023-17570-5
22. N. A., D. Deep learning and computer vision approach - a vision transformer based
classification of fruits and vegetable diseases (DLCVA-FVDC). Multimed Tools
Appl (2024). https://doi.org/10.1007/s11042-024-18516-1
23. Shorten, C., Khoshgoftaar, T.M. A survey on Image Data Augmentation for Deep Learning.
Journal of Big Data 6(1), July 2019. DOI: 10.1186/s40537-019-0197-0
24. Image Classification Algorithm Based on Improved AlexNet. Journal of Physics: Conference
Series 1813(1):012051, February 2021. DOI: 10.1088/1742-6596/1813/1/012051
25. Zala, S., Goyal, V., Sharma, S. et al. Transformer based fruits disease
classification. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19172-1
26. Singh, S.R., Yedla, R.R., Dubey, S.R. et al. Frequency disentangled residual
network. Multimedia Systems 30, 9 (2024). https://doi.org/10.1007/s00530-023-01232-5
27. Laghari, A.A., Sun, Y., Alhussein, M. et al. Deep residual-dense network based on
bidirectional recurrent neural network for atrial fibrillation detection. Sci Rep 13, 15109
(2023). https://doi.org/10.1038/s41598-023-40343-x
28. Zeng, Z., Yang, J., Wei, Y. et al. Fault Detection of Flexible DC Distribution Network
Based on GAF and Improved Deep Residual Network. J. Electr. Eng. Technol. (2024).
https://doi.org/10.1007/s42835-024-01848-1
29. Abedi, F. Dense residual network for image edge detection. Multimed Tools Appl (2024).
https://doi.org/10.1007/s11042-024-19264-y
30. Chen, S., Zhang, C., Gu, F. et al. RSGNN: residual structure graph neural network. Int. J.
Mach. Learn. & Cyber. (2024). https://doi.org/10.1007/s13042-024-02136-0
31. Liu, S., Lin, Y., Liu, D. et al. RTNet: a residual t-shaped network for medical image
segmentation. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18544-x
32. Yang, Z., Yuan, P., Zhang, Y. et al. Residual aggregation U-shaped network for image
super-resolution. Multimed Tools Appl (2023). https://doi.org/10.1007/s11042-023-14875-3
APPENDIX A
CODING AND TESTING
In this project, we aimed to develop a deep learning-based image classification system to classify
images into different categories. We employed state-of-the-art convolutional neural network
(CNN) architectures, including AlexNet, ResNet, and ShuffleNet, to achieve high accuracy in
image classification tasks.
Technologies Used:
• Programming Language: Python
• Deep Learning Frameworks: TensorFlow, Keras
• Web Framework: Django
Overview
Our project consists of two main components:
1. Model Development: We utilized Python along with TensorFlow and Keras to develop
and train the deep learning models. We experimented with various CNN architectures
such as AlexNet, ResNet, and ShuffleNet to find the best-performing model for our image
classification task.
2. Frontend Development: For the frontend, we utilized the Django web framework to
create an intuitive user interface. Users can upload images through the web interface, and
the trained model classifies them into different categories.
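The model-development workflow described above can be sketched as a minimal Keras training script. The layer sizes, input shape (227×227×3), and `NUM_CLASSES` below are illustrative assumptions rather than the project's exact architectures, and `train_ds`/`val_ds` stand in for the project's dataset pipeline:

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 10  # assumption: the real number of product categories may differ

# A small CNN standing in for the AlexNet/ResNet/ShuffleNet variants we trained.
model = keras.Sequential([
    layers.Input(shape=(227, 227, 3)),        # assumed input size
    layers.Conv2D(32, 3, activation="relu"),  # feature extraction
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),  # class probabilities
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",  # one-hot labels assumed
              metrics=["accuracy"])

# Training would then run for the configured number of epochs:
# history = model.fit(train_ds, validation_data=val_ds, epochs=100)
```

The same compile/fit pattern applies to each architecture we compared; only the layer stack changes between experiments.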
Model Evaluation:
We evaluated each model based on accuracy, precision, and loss metrics to assess its
performance. Additionally, we analyzed the training history of each model to understand its
learning behavior over epochs.
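The per-epoch learning behavior is available from the history object that Keras' `fit()` returns. The sketch below summarizes such a history; the dictionary here is a hypothetical stand-in shaped like `History.history`, with values echoing the AlexNet results, not a real training run:

```python
# Hypothetical history dict shaped like keras History.history;
# in practice this comes from model.fit(...).history.
history = {
    "accuracy":     [0.55, 0.78, 0.9250],
    "val_accuracy": [0.50, 0.74, 0.8958],
    "loss":         [1.90, 0.80, 0.3420],
    "val_loss":     [1.70, 0.75, 0.3011],
}

def summarize(history):
    """Return final-epoch metrics and the index of the best validation epoch."""
    final = {name: values[-1] for name, values in history.items()}
    best_epoch = max(range(len(history["val_accuracy"])),
                     key=history["val_accuracy"].__getitem__)
    return final, best_epoch

final, best_epoch = summarize(history)
print(f"final val accuracy: {final['val_accuracy']:.2%} "
      f"(best at epoch {best_epoch + 1})")
```

Plotting the `accuracy`/`val_accuracy` and `loss`/`val_loss` curves from the same dict is how the epoch-by-epoch trends discussed in Chapter 5 were inspected.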
Inventry optimization.pdf - Originality Report
Similarity Index: 6% (Internet Sources: 4%, Publications: 3%, Student Papers: 1%)
Primary sources:
1. fastercapital.com (Internet Source, 1%)
2. "Intelligent Computing Theories and Application", Springer Science and Business Media LLC, 2017 (Publication, <1%)
3. www2.mdpi.com (Internet Source, <1%)
4. ojs.trp.org.in (Internet Source, <1%)
5. www.ijisae.org (Internet Source, <1%)
6. medium.com (Internet Source, <1%)
7. Submitted to Sheffield Hallam University (Student Paper, <1%)
8. idl.iscram.org (Internet Source, <1%)
9. www.mdpi.com (Internet Source, <1%)
10. Submitted to Glasgow Caledonian University (Student Paper, <1%)
11. Muhammad Alrashidi, Ali Selamat, Roliana Ibrahim, Hamido Fujita. "Social Recommender System Based on CNN Incorporating Tagging and Contextual Features", Journal of Cases on Information Technology, 2024 (Publication, <1%)
12. Submitted to The University of Law Ltd (Student Paper, <1%)
13. arxiv.org (Internet Source, <1%)
14. technicaljournals.org (Internet Source, <1%)
15. www.journal.esrgroups.org (Internet Source, <1%)
16. Babeș-Bolyai University (Publication, <1%)
17. ijrpr.com (Internet Source, <1%)
18. www.frontiersin.org (Internet Source, <1%)
19. "Robot Intelligence Technology and Applications 4", Springer Science and Business Media LLC, 2017 (Publication, <1%)
20. Abolfazl Zargari, Gerrald A. Lodewijk, Najmeh Mashhadi, Nathan Cook et al. "DeepSea: An efficient deep learning model for single-cell segmentation and tracking of time-lapse microscopy images", Cold Spring Harbor Laboratory, 2022 (Publication, <1%)
21. www.researchgate.net (Internet Source, <1%)
22. Junfang Fan, Juanqin Liu, Qili Chen, Wei Wang, Yanhui Wu. "Accurate Ovarian Cyst Classification with a Lightweight Deep Learning Model for Ultrasound Images", IEEE Access, 2023 (Publication, <1%)
23. core.ac.uk (Internet Source, <1%)
24. Submitted to Manchester Metropolitan University (Student Paper, <1%)
25. www.qs.com (Internet Source, <1%)
Format - I
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY
(Deemed to be University u/s 3 of UGC Act, 1956)
Registration Number: RA2011003010641
Individual or group (strike whichever is not applicable): Individual
Name and address of the Supervisor / Guide: Dr. N. Arunachalam, Assistant Professor,
Department of Computing Technologies, SRM Institute of Science and Technology,
Kattankulathur - 603 203
Mail ID: arunachn@srmist.edu.in
Mobile Number: 9944342292
Plagiarism Details (final report from the software attached):
1. Introduction - 2% / 2% / 2%
2. Literature Survey - 2% / 2% / 2%
5. Result and Discussion - 0% / 0% / 0%
Appendices - 6% / 6% / 6%
I / We declare that the above information has been verified and found true to the best of my / our knowledge.
Dr. N. Arunachalam
Name & Signature of the Staff (who uses the plagiarism check software)
Signature of the Candidate