Short-Term Internship: Designed & Developed by


Designed & Developed by



Name of the student: INTI MANIKANTA JYOTHI

Registration Number: 21HN1A0520
Period of Internship: From: 17/05/2023 To: 16/08/2023
Name & Address of the intern organization: Grafx building, 5th Lane, Dwarakanagar,
Visakhapatnam - 530016

YEAR: 2023-2024

An Internship Report on

Data Science using Python

Submitted in accordance with the requirement for the degree of Bachelor of Technology.

Under the Faculty Guide of


Department of
Computer Science and Engineering
Head of the Department is
Submitted by

Register Number


Department of
Computer Science and Engineering


YEAR: 2023-2024

Student declaration
I, INTI MANIKANTA JYOTHI, a student of Adarsh College of Engineering pursuing
Computer Science Engineering, declare that the work presented in the context of my Data
Science Internship with GIT Solution Pvt. Ltd, conducted from 17-05-2023 to 16-08-2023, is
entirely my own. The data analysis, data preprocessing, machine learning modeling, and
deployment are based on my independent efforts and understanding of the concepts learned during
my academic pursuit. The data used in this internship project, sourced from the company
and from my own knowledge, is accurate, and any external sources are appropriately cited in the
accompanying documentation. I have ensured the accuracy and relevance of external sources. All
programming code and scripts developed during the internship are original, and any external
code used is duly referenced. The analyses and insights presented in the internship report are a
true reflection of my understanding and efforts, and I take full responsibility for the same. I
acknowledge that the work produced during this internship may be used by GIT Solution Pvt.
Ltd for internal purposes, provided that proper credit is given to the author. I am aware of the
importance of maintaining the integrity of the internship process and affirm my commitment
to upholding the highest standards of honesty and professionalism.

Signature and Date

Data Science using Python
An internship Report
Submitted By
Register Number

In partial fulfillment for the award of the degree of


(Approved by AICTE New Delhi, Affiliated to JNTUK Kakinada)



I hereby certify that INTI MANIKANTA JYOTHI, with Registration number 21HN1A0520,
has successfully completed his internship in Data Science using Python under my
supervision. This was done as a partial fulfillment of the requirements for the degree of
Bachelor of Technology in the Department of Computer Science and Engineering at Adarsh
College of Engineering. The certification is accepted for evaluation.

Head of the Department

(Signatory with Date and Seal)


Faculty Guide


Certificate from Intern Organization

I want to extend my sincere gratitude to several individuals who have significantly contributed
to my successful internship experience.

First and foremost, I wish to express my heartfelt appreciation to Ms. P. Swathi, my dedicated
faculty guide, for her unwavering support and invaluable mentorship. Her guidance has been
pivotal in my professional growth. I am immensely thankful to Mr. T. Nirosh Kumar, Head
of the Department, for his expert insights and the vital information he shared during my
internship. His contributions have enriched my understanding of the field of Data Science and
Machine Learning. A special note of gratitude goes to our esteemed Principal, Mr. Y. V. N.
Rajasekhar, for granting me the golden opportunity to immerse myself in the world of data
science and machine learning. This experience has expanded my knowledge and provided
valuable insights that will serve me well in the future.

I would also like to express my appreciation to my parents, friends, and colleagues whose
unwavering support and encouragement were essential in helping me complete the internship
within the designated time frame. Furthermore, I extend my sincere appreciation to all the
faculty members and lab programmers who generously shared their expertise and provided me
with valuable guidance throughout the internship. Their contributions have been instrumental
in my learning journey.

I am grateful for the excellent infrastructure and facilities provided by the Management of
ADARSH College of Engineering, which played a crucial role in facilitating a productive

In conclusion, it is essential to highlight that this internship was not solely about achieving
good grades, but also about acquiring practical knowledge and valuable experience. My
gratitude goes out to everyone who played a significant role in making this internship a
resounding success.


The "Diabetes Prediction Using Data Science" project aims to develop an effective
predictive model for identifying individuals at risk of diabetes. To achieve this goal, advanced
data science techniques are used to analyze a comprehensive dataset comprising
various health parameters, lifestyle factors, and medical history. The project methodology
includes data preprocessing, feature selection, and the implementation of
machine learning algorithms such as logistic regression, KNN, Random Forest,
and deep learning algorithms. The goal is to build a robust predictive model that
emphasizes interpretability and accuracy, thereby enhancing the understanding of key
determinants that contribute to diabetes onset.

In addition to developing a robust predictive model for diabetes prediction, a user-friendly
interface has been designed to facilitate the practical application of the model. The user
interface is accessible through a dedicated website, providing an intuitive platform for users to
interact with the predictive tool.

The analysis reveals significant associations between certain features and the
likelihood of diabetes. The predictive model demonstrates promising results,
achieving high accuracy in identifying individuals at risk. Interpretability tools, such as feature
importance analysis, provide insights into the crucial factors influencing
predictions. The practical implications of this project are substantial, offering a
reliable tool for early diabetes risk assessment. The model's potential applications
include public health interventions, personalized healthcare strategies, and proactive
management of diabetes-related risks. As the prevalence of diabetes continues to rise globally,
the methodologies and insights presented in this project contribute to the ongoing
efforts to combat this public health challenge.

In conclusion, "Diabetes Prediction Using Data Science" demonstrates the efficacy
of machine learning in healthcare and underscores the importance of leveraging
data science for proactive and personalized health management.

1. Program book for short-term internship
2. Internship report
3. Student Declaration
4. Certificate
5. Certificate from Intern Organization
6. Acknowledgement
7. Abstract
8. Executive Summary
9. Internship Part
10. Activity logbook
a. Activity Log Book first week
b. Activity Log Book second week
c. Activity Log Book third week
d. Activity Log Book fourth week
e. Activity Log Book fifth week
f. Activity Log Book sixth week
g. Activity Log Book seventh week
h. Activity Log Book eighth week
11. Outcome description
12. Technical skills Acquired
13. Managerial skills
14. Communication skills
15. Technological development
16. Photos and links
17. Student Evaluation
18. Supervisor Evaluation
19. Internal Assessment
20. External Assessment

Executive Summary
The Data Science Internship Report provides a comprehensive overview of the work conducted
during the internship at GIT Solution Pvt. Ltd. The primary objective of the internship was to
develop an effective predictive model for identifying individuals at risk of diabetes, leveraging
advanced data science techniques. The methodology employed rigorous data preprocessing,
feature selection, and the implementation of machine learning algorithms such as logistic
regression, KNN, Random Forest, and deep learning algorithms. The emphasis was on
interpretability and accuracy, with a focus on enhancing the understanding of key determinants
contributing to diabetes onset.

Key findings from the analysis revealed significant associations between certain features and
the likelihood of diabetes. The predictive model demonstrated promising results, achieving
high accuracy in classifying individuals at risk. Interpretability tools, including feature
importance analysis, provided insights into the crucial factors influencing predictions. The
practical implications of this project are substantial, offering a reliable tool for early diabetes
risk assessment. The model's potential applications span public health interventions,
personalized healthcare strategies, and proactive management of diabetes-related risks. As the
prevalence of diabetes continues to rise globally, the methodologies and insights presented in
this project contribute to the ongoing efforts to combat this public health challenge.

In conclusion, "Diabetes Prediction Using Data Science" not only demonstrates the efficacy of
machine learning in healthcare but also underscores the importance of leveraging data science
for proactive and personalized health management. The internship experience has provided
valuable insights, enhanced technical skills, and contributed to the development of impactful
solutions in the field of data science.

Internship Part

Our internship program offers interns a flexible and collaborative working environment that
transcends geographical boundaries. Interns work from a company-provided workplace, in the comfort
of their own space, using communication and collaboration tools that
support teamwork. Interns follow a well-defined but flexible weekly work schedule,
allowing for a balance between structured project work, mentorship sessions, and independent
learning. The remote nature of the internship enables interns to manage their time effectively,
accommodating around 20-30 hours per week, aligning with both organizational needs and the
intern's schedule.

Interns are equipped with the necessary technological tools and software to enable seamless
engagement in data science and machine learning tasks. This includes access to cloud
computing platforms, collaborative coding environments, video conferencing tools, and
specialized software for data analysis and model development.

Tasks Performed

Interns engage in diverse tasks designed to provide a comprehensive understanding of data
science and machine learning concepts. Key responsibilities include:

• Collaboration & Communication: Interns communicate with their teammates
and team leader, which builds efficient communication during the internship.
• Data Exploration and Preprocessing: Interns work with real-world datasets,
exploring and preprocessing data to ensure its suitability for analysis.
• Data analysis: Interns analyze the dataset using tools like Matplotlib,
for example to see which values are highest in the dataset and to find
NULL values.
• Algorithm Implementation: Building and implementing machine learning
algorithms for various tasks, including regression, classification, and clustering.
• Model Evaluation: Model evaluation covers the model's performance,
prediction accuracy, and integrity.
• Presentation: A presentation is a method of communicating information, ideas, or a
specific message to an audience. It typically involves a speaker or presenter who
delivers content using visual aids such as slides, images, graphs, or other multimedia.
• Problem-solving: Problem-solving is a critical skill that involves the ability to
analyze a situation, identify challenges, and find effective solutions.
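
As a small illustration of the model evaluation task listed above, the sketch below computes accuracy, precision, recall, and a confusion matrix with scikit-learn. The label lists are made-up values chosen for illustration; they are not results from the internship project.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Hypothetical labels (1 = diabetic, 0 = not diabetic); not project data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes, ordered [0, 1].
matrix = confusion_matrix(y_true, y_pred, labels=[0, 1])

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
print(matrix)
```

Looking at precision and recall alongside accuracy matters for a screening task, because a missed case and a false alarm have different costs.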

Skills Acquired

Interns in our internship program acquire a diverse skill set in data science and machine
learning, including:

• Collaboration and Communication: Enhanced ability to collaborate effectively
with teammates, communicate findings, and participate in project discussions.
• Technical Proficiency: Competency in using collaboration tools, cloud platforms,
and programming languages such as Python and R for data science tasks.
• Independent Learning: Developed skills in self-directed learning and managing
time effectively in a work environment.
• Problem-solving: Strengthened problem-solving skills by navigating challenges
unique to work scenarios.
• Adaptability: Improved adaptability to work conditions, demonstrating flexibility
and resilience in the face of real challenges.
• Project Management: Learned to manage projects, balancing priorities and
meeting deadlines in a work environment.

The internship experience not only enhances interns' technical skills in data science and
machine learning but also equips them with the valuable work skills essential for success in the
evolving landscape of distributed work.

Activity Log Book First week
Days | Description | Learning Outcomes | Person In-charge Signature
Day-1, 17-05-2023 | Interacted with the company. | Learned about the company. |
Day-2, 18-05-2023 | Interacted with teammates and the team leader. | Got to know the team members. |
Day-3, 19-05-2023 | Team leader discussion to assign roles and responsibilities to individuals. | Learned about roles and about the project. |
Day-4 | Explored project repositories on GitHub and studied version control. | Exposure to Git practices and collaborative coding. |
Day-5 | The company gave me a laptop. I downloaded and configured Python and Anaconda. | Learned new commands used in the Anaconda prompt. |
Day-6, 23-05-2023 | Checked the required software tools. | The tools work correctly. |

(From Dt:17-05-2023 to Dt:23-05-2023)
Detailed report:
The initial day served as a gateway to the company's inner workings. Through insightful
interactions, I delved into the intricacies of the company's structure, culture, and operational
methodologies. This opportunity allowed me not only to gain a surface-level understanding but
also to acquire in-depth knowledge about the company's background, values, and overarching
mission. This foundational insight provided a robust context for the subsequent days.
Transitioning into team dynamics, Day 2 focused on building connections with teammates and
the team leader.

Engaging in meaningful discussions, I sought to unravel the fabric of the team's dynamics.
These interactions were not mere introductions; they were a collective exploration of each team
member's roles, strengths, and responsibilities. Conversations with the team leader offered
profound insights into leadership styles and the importance of cohesive teamwork. Day 3
marked a pivotal moment in the internship as the team leader led a comprehensive discussion
on project roles and responsibilities. This strategic planning session was instrumental in
shaping the trajectory of the upcoming project. Participating actively, I not only gained clarity
on my specific roles but also understood the interplay of responsibilities within the team. The
result was a cohesive plan that set the stage for effective collaboration and project execution.
Venturing into the practical realm, Day 4 involved a deep dive into project repositories on
GitHub. This hands-on exploration provided more than just an understanding of version control
practices; it was an initiation into the collaborative world of coding. Navigating through
repositories, I grasped the nuances of Git practices, recognizing the importance of streamlined
workflows and efficient version control in a team setting.

On Day 5, equipped with a company laptop, I moved into practical work by downloading Python and Anaconda. The
subsequent configuration of these tools was not merely a technical exercise but a tangible
demonstration of the development environment's importance. Acquiring hands-on experience,
including navigating commands in the Anaconda prompt, solidified my understanding of the
practical aspects of software development.

Activity Log Book Second week
Days | Description | Learning Outcomes | Person In-charge Signature
Day-1, 24-05-2023 | Python programming concepts used for the project. | Regained the concepts. |
Day-2 | Statistics concepts used in the project. | Regained the concepts. |
Day-3, 26-05-2023 | Used datasets to practice data manipulation techniques. | Data manipulation. |
Day-4, 27-05-2023 | Pandas for data manipulation. | Loading, cleaning, and manipulating data with pandas. |
Day-5, 29-05-2023 | Visualization with the matplotlib and Seaborn visualization libraries. | Creating plots and detecting outliers using matplotlib. |
Day-6, 30-05-2023 | Tried some methods in matplotlib. | Remembered the methods. |

(From Dt:24-05-2023 to Dt:30-05-2023)
Detailed report:
In the second week of my internship, the focus was on refreshing and strengthening fundamental
concepts crucial for the upcoming data science project. Each day brought
a specific area of study, building a foundation for effective data manipulation and visualization.
The initial day centered around revisiting fundamental concepts in Python programming,
specifically tailored to the project's requirements. The goal was to regain a solid understanding
of Python, ensuring proficiency in its usage for the impending data science tasks. Building upon
the programming knowledge from the previous day, Day 2 delved into refreshing fundamental
statistical concepts relevant to the project. The aim was to ensure a robust grasp of statistical
methods crucial for effective data analysis and interpretation. Day 3 shifted the focus to
practical application by using datasets for manipulation techniques. This involved hands-on
experience in retrieving and manipulating data, honing skills crucial for handling real-world
datasets effectively.

The fourth day was dedicated to exploring the Pandas library for data manipulation. This
involved learning techniques for loading, cleaning, and manipulating data efficiently using
Pandas, a powerful tool in the data science toolkit. With a solid foundation in data manipulation,
Day 5 introduced visualization concepts using Matplotlib and Seaborn. The day's focus was on
creating meaningful plots and utilizing these visualization libraries for outlier detection,
essential for gaining insights from data. Recognizing the importance of work-life balance, the
sixth day was designated as a holiday. This break provided an opportunity to recharge and
reflect on the acquired knowledge, ensuring a fresh start for the upcoming week.
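
The Day 4 and Day 5 activities described above (cleaning data with Pandas and detecting outliers) can be sketched roughly as follows. The column names and values are hypothetical, since the report does not show the dataset, and the 1.5 × IQR rule is the same rule a box plot uses to draw its whiskers, applied here numerically instead of visually.

```python
import pandas as pd

# Hypothetical toy data; the report's actual dataset is not shown.
df = pd.DataFrame({
    "Glucose": [85.0, 90.0, 95.0, 100.0, 105.0, 110.0, None, 400.0],
    "BMI": [22.1, 24.3, 23.8, 25.0, 26.4, 27.9, 28.5, 24.0],
})

# Count missing values per column.
missing_counts = df.isna().sum()

# Fill the missing glucose value with the column median.
df["Glucose"] = df["Glucose"].fillna(df["Glucose"].median())

# Box-plot style rule: flag points more than 1.5 * IQR outside the quartiles.
q1 = df["Glucose"].quantile(0.25)
q3 = df["Glucose"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["Glucose"] < q1 - 1.5 * iqr) | (df["Glucose"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

print(missing_counts)
print(outliers)
```

The median is used for filling because, unlike the mean, it is not pulled upward by the extreme value in the column.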

The second week of the internship was dedicated to establishing
a strong foundation in Python programming, statistics, data manipulation, and visualization.
The sequential approach, from refreshing fundamental concepts to practical application with
real datasets, aimed at equipping me with the necessary skills for the data science project. As I
move forward, I am confident in my ability to leverage these skills to contribute effectively to
the project. The week's structured learning approach, combined with hands-on experience, lays
the groundwork for a successful and impactful internship journey.

Activity Log Book third week

Days | Description | Learning Outcomes | Person In-charge Signature
Day-1, 31-05-2023 | Explored machine learning algorithms. | Learned the algorithms of supervised and unsupervised learning. |
Day-2 | Applied the logistic regression algorithm to the pre-processed data. | Obtained a classification model whose result was considered for the main project. |
Day-3, 02-06-2023 | Applied the decision tree algorithm to the project. | Got a result completely different from the previous algorithm, and observed some overfitting problems. |
Day-4, 03-06-2023 | Applied the random forest algorithm to the project. | Got a result completely different from the previous algorithm, and observed some space issues. |
Day-5, 05-06-2023 | Applied KNN imputation to help fill the missing values. | The missing values in the dataset were filled. |
Day-6, 06-06-2023 | Implemented different kinds of imputation. | Observed inefficient performance. |
 | A meeting about the work was conducted. | Improvements found. |

(From Dt:31-05-2023 to Dt:07-06-2023)
Detailed report:
On the first day of our machine learning exploration, the focus was on understanding the
fundamental algorithms of supervised and unsupervised learning. We delved into the intricacies
of how these algorithms operate and their applications in real-world scenarios. This initial
exploration laid the foundation for our understanding of machine learning. We gained insights
into the various techniques employed in both supervised and unsupervised learning, setting the
stage for the practical applications in the following days. Building upon our theoretical
understanding from Day 1, we took a hands-on approach by applying the logistic regression
algorithm to pre-processed data. This step was crucial in implementing a classification model that would
serve as a cornerstone for our main project. The practical application of logistic regression
provided valuable experience in translating theoretical knowledge into actionable insights. We
successfully generated a classification model, setting the groundwork for subsequent
algorithmic implementations.
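
A minimal sketch of the step described above, fitting a logistic regression classifier on preprocessed data, is given here. It uses synthetic data from scikit-learn as a stand-in because the project dataset is not included in the report; the feature count, split ratio, and scaling step are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed dataset (assumption, not project data).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Hold out a quarter of the rows to estimate performance on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling first helps the solver converge; the classifier is fitted afterwards.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```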

Roles and responsibilities for this day were distributed among the team members. Seeking to
diversify our approach, we implemented the decision tree algorithm in the project. The results
obtained were surprising, showing a significant deviation from the previous logistic regression
model. However, the presence of overfitting issues prompted a critical examination of our
model. The divergence in results emphasized the importance of selecting the right algorithm
for specific tasks. The identification of overfitting issues became a focal point for further
analysis and optimization in subsequent stages of the project.
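
One common way to see the overfitting mentioned above is to compare training and test accuracy for an unrestricted tree and a depth-limited one. The sketch below does this on synthetic data; the data, the depth limit of 3, and the variable names are illustrative assumptions, not the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption); the project dataset is not shown.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

# An unrestricted tree keeps splitting until every training leaf is pure.
deep_tree = DecisionTreeClassifier(random_state=1).fit(X_train, y_train)

# Limiting the depth is one simple way to restrain the tree.
shallow_tree = DecisionTreeClassifier(max_depth=3, random_state=1).fit(
    X_train, y_train
)

for name, tree in [("unrestricted", deep_tree), ("max_depth=3", shallow_tree)]:
    train_acc = tree.score(X_train, y_train)
    test_acc = tree.score(X_test, y_test)
    print(f"{name}: train={train_acc:.3f} test={test_acc:.3f}")
```

A large gap between training and test accuracy for the unrestricted tree is the usual sign that the model has memorized the training rows.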

To address the observed overfitting issues, we decided to experiment with the random forest
algorithm. While this approach produced distinct results, we encountered challenges
related to space utilization. The introduction of the random forest algorithm highlighted
the nuanced trade-offs between model complexity and performance. Space-related issues
served as a valuable lesson in considering resource constraints during algorithm selection.
Acknowledging the importance of data completeness, we applied KNN imputation
to fill in missing values in the dataset. This step aimed to enhance the robustness and reliability
of our dataset for subsequent analyses. Successfully filling in missing data demonstrated the
practical utility of machine learning in data preprocessing. KNN imputation
proved effective in maintaining data integrity, laying the groundwork for more accurate and
reliable models.
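
Filling missing values from the nearest neighbours, as described above, can be sketched with scikit-learn's `KNNImputer`. The small array and the choice of two neighbours are assumptions made for illustration; the report does not state which implementation or settings were used.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Toy matrix with two missing entries (illustrative values only).
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 3.0],
    [np.nan, 6.0, 5.0],
    [8.0, 8.0, 7.0],
])

# Each missing entry is replaced by the mean of that feature over the
# two nearest rows, with distance measured on the features both rows share.
imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)

print(X_filled)
```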

Recognizing the significance of maintaining a healthy work-life balance, the team observed a
well-deserved holiday. This break allowed for relaxation, rejuvenation, and reflection on the
progress made during the week. Taking time off reinforced the importance of balancing
productivity with personal well-being. It provided an opportunity for team members to
recharge, fostering a positive and sustainable work environment.

Activity Log Book fourth week

Day & Date | Brief description of the daily activity | Learning Outcome | Person In-charge signature
Day-1, 08-06-2023 | Made plans for the main project, discussed with the team leader to complete the project information, and initiated project planning. | Gained insights into project planning and coordination. |
Day-2, 09-06-2023 | Divided the project into different parts for easy completion and started data gathering. | Developed a project plan and initiated data collection. |
Day-3, 10-06-2023 | Gathered the dataset for the project and gained an understanding of the database attributes. | Obtained a complete grip on the dataset. |
Day-4, 12-06-2023 | Started data preprocessing, focusing on techniques for cleaning the dataset. | Successfully obtained a cleaned dataset. |
Day-5, 13-06-2023 | Explored different techniques for further data cleaning to enhance data efficiency. | Acquired knowledge of new techniques for dataset preprocessing. |
Day-6, 14-06-2023 | Advanced data-cleaning techniques. | Learned the advanced techniques. |

(From Dt:08-06-2023 to Dt:14-06-2023)
I participated in important project planning activities. Together with the team leader, we created
plans for the main project, discussing essential details to ensure a comprehensive understanding
of the project. This initial discussion provided valuable insights into effective project planning
and coordination. It set the tone for a well-organized approach to the upcoming tasks. The day
ended with the initiation of project planning, establishing a solid foundation for the weeks
ahead. This initial step allowed for clear communication and alignment of goals within the
team. The second day was a pivotal step in project development, as we divided the overarching
project into manageable parts. This strategic approach aimed to facilitate a smoother workflow and enhance
overall project efficiency. At the same time, I began the data-gathering process, a crucial step
in building the necessary resources for the project. By the end of the day, we had developed a
comprehensive project plan ensuring that each team member had a clear understanding of their
role in achieving the project objectives.

On the fourth day, data preprocessing took center stage, focusing on techniques for cleaning
the dataset. This phase was vital to ensure the quality and reliability of the data. Through careful
application of preprocessing techniques, I successfully obtained a cleaned dataset, setting the
stage for more accurate and meaningful analysis. This hands-on experience with data
manipulation contributed to a practical understanding of data preprocessing in real-world

I dedicated myself to exploring different techniques for further data cleaning, with a specific
focus on enhancing data efficiency. This exploration not only broadened my knowledge but
also equipped me with new skills for dataset preprocessing. The acquired techniques would
prove valuable in optimizing the dataset for more robust and insightful analysis in the
subsequent stages of the project.

A well-deserved holiday provided an opportunity for rest, crucial for maintaining productivity
and focus. Taking a break allowed for rejuvenation, ensuring a fresh start for the upcoming

Activity Log Book fifth week

Day & Date | Brief description of the daily activity | Learning Outcome | Person In-charge signature
Day-1, 05-07-2023 | Checked whether there were any outliers in the dataset and analyzed the data using the Seaborn library. | Found the outliers in the dataset and the data values. |
Day-2, 06-07-2023 | Used the Seaborn boxplot() method to see how large the outlier values were. Rectified the outliers in the dataset. | Saw very large outlier values. Removed the outliers. |
Day-3, 07-07-2023 | Started feature selection to enhance machine learning performance. Explored new techniques for the best results. | I was not satisfied with the result. |
Day-4, 08-07-2023 | Learned new feature selection techniques and implemented them to get the best features for the project. | Learned new techniques. Got the best result on feature selection. |
Day-5, 10-07-2023 | There was a meeting on work progress. | Received appreciation from the team leader for the good work I did. |
Day-6, 11-07-2023 | Wrote down some improvements from the meeting. | Found the improvements. |

(From Dt:05-07-2023 to Dt:11-07-2023)
During the first day of the week, my focus was on data analysis, specifically identifying outliers
in the dataset. I used the Seaborn Library to conduct a thorough examination of the data. This
initial exploration revealed the presence of outliers and gave me valuable insights into the
distribution of data values. This step laid the groundwork for addressing and rectifying
anomalies within the dataset. On the second day, I conducted a deeper dive into outlier analysis.
I used the Seaborn boxplot() function to visualize how outlier values were distributed. The
visualization highlighted significant outliers, indicating potential data irregularities. I rectified
these outliers promptly, ensuring the dataset's integrity and reliability. This process not only
addressed immediate concerns but also enhanced the overall quality of the data for subsequent

Day three marked a shift towards feature selection to optimize machine learning performance.
I explored various feature selection methods, but the results did not meet expectations. This
prompted a deeper dive into alternative techniques for improved outcomes. This day
underscored the iterative nature of data analysis and the importance of refining strategies to
achieve desired results. Determined to overcome the challenges encountered on Day 3, the
fourth day was dedicated to learning and implementing new techniques for feature selection. I
explored advanced methodologies and applied them to the project. The effort paid off, and I
successfully identified the best features for the project, enhancing its overall performance. This
experience not only contributed to project success but also expanded my knowledge base with
valuable insights into diverse feature selection techniques.
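
The report does not name the feature selection techniques that were tried. As one possible illustration, the sketch below uses univariate selection with scikit-learn's `SelectKBest` and the ANOVA F-score on synthetic data; the data, the scoring function, and `k=3` are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in data (assumption). With shuffle=False the three
# informative features are generated as the first three columns.
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    shuffle=False,
    random_state=0,
)

# Score every feature against the target and keep the three best.
selector = SelectKBest(score_func=f_classif, k=3)
X_selected = selector.fit_transform(X, y)

selected_columns = selector.get_support(indices=True)
print("selected column indices:", selected_columns)
print("F-scores:", selector.scores_.round(1))
```

Univariate scores ignore interactions between features, so comparing the result with a model-based ranking such as random forest feature importances is a reasonable cross-check.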

A pivotal event during the week was a team meeting to discuss work progress. This platform
provided an opportunity to share insights and updates with team members. During the meeting,
I received appreciation from the team leader for the effective handling of outlier detection,
rectification, and successful implementation of feature selection techniques. This
acknowledgment reinforced the significance of individual contributions to the team's overall

Activity Log Book sixth week
Day & Date | Description | Learning outcome | Person In-charge signature
Day-1, 12-07-2023 | Found that the features are Insulin and Glucose. | Research is on |
Day-2, 13-07-2023 | Started research on diabetes to ensure when diabetes | Learned some important things. |
Day-3 | Obtained some insulin values and glucose values for diabetes. | Obtained values for diabetes affecting humans. |
Day-4 | Classified diabetes status as normal, prediabetes, or diabetes. | There were three classifications. |
Day-5, 17-07-2023 | Built the machine learning model to predict whether a person has diabetes or not. | Settled on some algorithms. |
Day-6, 18-07-2023 | Researched advanced algorithms to perform accurate prediction. | Found some algorithms. |
Day-7, 19-07-2023 | Conducted a meeting on diabetes research. | Found diabetes values. |

(From Dt:12-07-2023 to Dt:19-07-2023)
During my internship, I delved into the exploration of essential features related to diabetes,
with a specific focus on Insulin and Glucose. This initial research phase laid the foundation for
my understanding of diabetes and its intricacies. As the week unfolded, my research took a more concrete
direction, aimed at ensuring a comprehensive grasp of diabetes. Notable insights were gained,
shedding light on crucial aspects that form the basis of this complex medical condition. This
preliminary research phase proved instrumental in setting the stage for more in-depth

Moving forward to the third day, my focus shifted to obtaining specific insulin values and
glucose levels associated with diabetes. This phase of the research significantly contributed to
my knowledge, providing valuable data that offers insights into the impact of diabetes on
individuals. Understanding these numerical values deepened my appreciation for the
quantitative aspects of diabetes. The fourth day marked a pivotal point in my internship journey, as I
began classifying diabetes into distinct categories—normal, prediabetes, and diabetes. This
classification process involved analyzing the gathered data and discerning patterns that define
each category. The outcome of this classification exercise introduced a structured approach to
understanding the varied stages of diabetes.
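
The three-way classification described above can be sketched as a simple threshold rule. The report does not state which cut-offs were used, so this sketch assumes the commonly cited fasting plasma glucose ranges (below 100 mg/dL normal, 100 to 125 mg/dL prediabetes, 126 mg/dL and above diabetes). The function name `classify_glucose` and the sample readings are hypothetical.

```python
def classify_glucose(fasting_glucose_mg_dl: float) -> str:
    """Label a fasting plasma glucose reading given in mg/dL.

    Assumed cut-offs (not taken from the report):
    below 100 normal, 100 to 125 prediabetes, 126 and above diabetes.
    """
    if fasting_glucose_mg_dl < 0:
        raise ValueError("glucose reading cannot be negative")
    if fasting_glucose_mg_dl < 100:
        return "normal"
    if fasting_glucose_mg_dl < 126:
        return "prediabetes"
    return "diabetes"


# Illustrative readings, not data from the internship.
readings = [85, 110, 140]
labels = [classify_glucose(r) for r in readings]
print(labels)
```

A rule like this only labels individual readings; it could serve as the target column for the machine learning models discussed later, but that link is an assumption rather than something the report confirms.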

The subsequent day, I transitioned into the practical realm of machine learning by undertaking
the construction of a predictive model. The goal was to predict the likelihood of an individual
having diabetes based on the collected data. This step involved selecting and fine-tuning
machine learning algorithms, paving the way for the creation of a robust predictive model.
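
As a rough illustration of what building such a predictive model involves, the sketch below fits a logistic regression by batch gradient descent using only the Python standard library. It is not the code written during the internship: the helper names (`train_logistic`, `predict_proba`), the learning rate, the number of epochs, and the six data points are all assumptions made for the example.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log-loss w.r.t. the logit
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that sample x belongs to the positive class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Made-up, pre-scaled features: [glucose / 100, insulin / 100].
X = [[0.85, 0.60], [0.90, 0.80], [0.95, 0.70],
     [1.40, 1.60], [1.55, 1.80], [1.70, 2.00]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(X, y)
low = predict_proba(w, b, [0.88, 0.65])   # resembles the non-diabetic rows
high = predict_proba(w, b, [1.60, 1.90])  # resembles the diabetic rows
print(round(low, 3), round(high, 3))
```

In practice a library implementation would normally be preferred; the from-scratch version is shown only to make the training loop visible.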

Activity Log Book seventh week
Day & Date | Description | Learning outcome | Person in charge's signature
Day-1, 02-08-2023 | Assembled the code segments we had done before. | We got the order of the project.
Day-2, 03-08-2023 | We got many errors after assembling the code. | Trying to rectify errors.
Day-3, 04-08-2023 | Some errors are solved, but remaining errors not | Trying to rectify errors.
Day-4, 05-08-2023 | Solved all errors. | Error-free code came.
Day-5, 07-08-2023 | We introduced the hyperparameter techniques. | Found the techniques.
Day-6, 08-08-2023 | Knowing more about hyperparameters. | Learned more.

(From Dt:02-08-2023 to Dt:08-08-2023)
Detailed report:

In the span of a week during our internship, our focus revolved around the intricate process of
code assembly and delving into the realm of hyperparameter techniques. The report
encapsulates the daily progress, challenges faced, and the invaluable learning outcomes

The journey commenced with the task of assembling code segments. Drawing from prior
experience, we efficiently pieced together the components, setting the foundation for the
upcoming project. The initial phase was characterized by the clarity brought about by a well-defined
project order. Newly assembled code often produces errors, and Day-2 was no exception. The
team encountered numerous challenges when the assembled code was run, leading to a cascade of
errors. The primary focus on this day was identifying and understanding these errors, laying
the groundwork for subsequent resolutions. Although progress was made in resolving some
errors, a subset proved to be more resilient. Day-3 was marked by the team's relentless pursuit
of solutions, demonstrating a commendable commitment to problem-solving. The challenges
faced in this phase became integral to the learning process.

A significant turning point was reached on Day-4 as the team successfully navigated through
the code, addressing and resolving all identified errors. The end result was a refined, error-free
codebase. The accomplishment not only marked a technical triumph but also instilled
confidence in the team's collaborative problem-solving capabilities. Transitioning seamlessly,
the team introduced hyperparameter techniques on Day-5. This marked a pivotal moment in
the internship as the focus shifted to optimizing the model's performance. The day was spent
delving into the world of hyperparameter optimization, identifying key techniques that would
be instrumental in fine-tuning our project.

Building upon the introduction from the previous day, Day-6 was dedicated to a comprehensive
exploration of hyperparameters. The team acquired a nuanced understanding of these
parameters, recognizing their influence on model behaviour and performance. This deep dive
into hyperparameter knowledge laid the groundwork for informed decision-making in
subsequent project phases.
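
The report does not name the hyperparameter techniques that were studied. One common option is a grid search, sketched below for the number of neighbours `k` in a K-Nearest Neighbours classifier, scored with leave-one-out validation. The helper names, the candidate values, and the eight data points are assumptions made for this example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if knn_predict(rest_X, rest_y, X[i], k) == y[i]:
            hits += 1
    return hits / len(X)


# Made-up [glucose, insulin] rows; label 1 means diabetic.
X = [[85, 60], [90, 80], [95, 70], [88, 75],
     [140, 160], [155, 180], [170, 200], [150, 170]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

candidate_ks = [1, 3, 5, 7]  # the hyperparameter grid
scores = {k: loo_accuracy(X, y, k) for k in candidate_ks}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On this tiny data set, `k = 7` uses every remaining point, so the held-out point is always outvoted by the other class. This shows how a hyperparameter value alone can change model behaviour.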

Activity Log Book eighth week
Day & Date | Description | Learning outcome | Person In-charge signature
Day-1, 09-08-2023 | Three algorithms are used in the project: KNN, Naive Bayes, and logistic regression. The training phase is coming | Completed the machine learning model build. Completed the training of models
Day-2 | Then came the testing of the three algorithms. | Found some errors in the testing part.
Day-3, 11-08-2023 | I rectified the error in the testing part. The result is an analysis using the seaborn | The result is perfectly predicted
Day-4, 12-08-2023 | Developed the user interface for the code using HTML and CSS. | Developed the user interface.
Day-5, 14-08-2023 | Connecting the code segments for deployment used some techniques | Completed the deployment
Day-6 | Completed the diabetes project and presented it to the team leader | Completed project
Day-7, 17-08-2023 | Presentation of the project in the company. | We got appreciation.

(From Dt:09-08-2023 to Dt:17-08-2023)
Over the past week of my internship, I made significant strides in the development and
completion of the diabetes prediction project. The week commenced with the utilization of
three pivotal machine learning algorithms: K-Nearest Neighbours (KNN), Naive Bayes
(NB), and Logistic Regression. This marked the training phase of the project, wherein I
successfully built the machine learning models. The training process proved to be a crucial
step, laying the groundwork for the subsequent testing phase. As I transitioned into the testing
phase on the second day, a challenge emerged, revealing errors in the testing process.
Addressing these issues became a primary focus, requiring a meticulous approach to ensure the
accuracy and reliability of the algorithms. The third day witnessed a successful resolution of
the testing errors. Through a systematic debugging process, I rectified the issues, resulting in
an accurate and precise analysis of the test data using the Seaborn visualization library.
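
The train-then-test workflow described above can be illustrated for one of the three algorithms. The sketch below implements a Gaussian Naive Bayes classifier from scratch and scores it on a small held-out set. The report does not say which Naive Bayes variant or library was used, so the Gaussian form, the helper names (`fit_gaussian_nb`, `predict_nb`), and all data values are assumptions.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the prior, feature means, and feature variances per class."""
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model


def predict_nb(model, x):
    """Return the class with the highest log-posterior for sample x."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


# Made-up [glucose, insulin] rows; label 1 means diabetic.
X_train = [[85, 60], [90, 80], [95, 70], [88, 75],
           [140, 160], [155, 180], [170, 200], [150, 170]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
X_test = [[92, 72], [160, 185], [87, 65], [145, 165]]
y_test = [0, 1, 0, 1]

model = fit_gaussian_nb(X_train, y_train)
predictions = [predict_nb(model, x) for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(predictions, accuracy)
```

The made-up classes are far apart, so the perfect score here says nothing about real data. The point is only that the test rows are kept out of the fitting step.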

Day four brought a shift in focus towards enhancing the user experience by developing a user
interface using HTML and CSS. The creation of an intuitive and visually appealing interface
aimed to improve accessibility and user interaction with the diabetes prediction tool. This step
added a layer of user-friendly functionality to the overall project. Building on the progress made
in the previous days, the fifth day was dedicated to connecting the code segments for
deployment. Employing various deployment techniques, I completed the deployment process,
ensuring the seamless execution of the diabetes prediction model in a real-world environment.
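
The report does not describe how the HTML interface was connected to the model or which deployment technique was used. As one possible arrangement, the sketch below wires a small HTML form to a prediction function through a WSGI application built only from the standard library. The placeholder rule in `predict_label`, the page markup, and the parameter name `glucose` are all assumptions, and CSS styling is omitted for brevity.

```python
from html import escape
from urllib.parse import parse_qs


def predict_label(glucose: float) -> str:
    # Placeholder rule standing in for the trained model (an assumption).
    return "diabetes likely" if glucose >= 126 else "diabetes unlikely"


PAGE = """<!doctype html>
<html><body>
<form method="get" action="/">
  <label>Fasting glucose (mg/dL): <input name="glucose" type="number"></label>
  <button type="submit">Predict</button>
</form>
<p>{result}</p>
</body></html>"""


def app(environ, start_response):
    """Minimal WSGI app: read ?glucose=... and render the prediction."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    result = ""
    if "glucose" in params:
        try:
            result = predict_label(float(params["glucose"][0]))
        except ValueError:
            result = "Please enter a number."
    body = PAGE.format(result=escape(result)).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

For a local trial, `app` could be passed to `wsgiref.simple_server.make_server("127.0.0.1", 8000, app)` and served with `serve_forever()`. A production deployment would normally sit behind a dedicated WSGI server.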

The culmination of the internship week transpired on the sixth day with the completion of the
entire diabetes prediction project. The finalized project was presented to the team leader,
showcasing the successful integration of machine learning algorithms, error resolution, user
interface development, and deployment strategies.

In conclusion, the week proved to be a dynamic and transformative period, marked by the
successful implementation and completion of the diabetes prediction project. The experience
deepened my understanding of machine learning methodologies, troubleshooting techniques,
and user interface development, contributing significantly to my overall skill set. I am eager to
apply these newfound skills in future projects and continue my journey of growth within the
field of data science and machine learning.


Positive Work Environment

The cultivation of a positive work environment is essential to promote employee satisfaction
and productivity. A supportive culture and workplace atmosphere that promotes inclusivity and
equality can significantly enhance the overall work environment.

Well-Designed Workspace

Additionally, a well-designed workspace can have a significant impact on employee well-being.
Ergonomic and comfortable office spaces can contribute to the overall health and
productivity of employees. Collaborative work areas and well-equipped facilities can also
foster creativity and teamwork.

Effective Communication

Effective communication is another key factor that can contribute to a harmonious and
productive work environment. Clear and open communication channels can facilitate the
exchange of ideas and feedback between employees and management. Regular feedback
sessions and transparent communication from leadership can also boost employee morale.

Employee Benefits

Furthermore, employee benefits such as competitive salaries, comprehensive benefits
packages, and opportunities for professional development can contribute to employee
satisfaction and retention. These benefits can help employees feel valued, motivated, and
invested in the success of their organization. In conclusion, creating a positive work
environment through a supportive culture, well-designed workspace, effective communication,
and employee benefits can greatly enhance the productivity, satisfaction, and retention of employees.

Technical skills Acquired

Data Gathering Proficiency

A primary focus of my internship involved the effective acquisition of data from official
websites and leveraging ChatGPT for supplementary information. This process necessitated a
systematic and meticulous approach to ensure the collection of accurate and relevant data sets.
Adhering to ethical guidelines and legal considerations, I successfully navigated official
sources, extracting pertinent information that laid the foundation for subsequent analysis.

Advanced Data Cleaning Methods

Basic data cleaning techniques, including handling missing data and addressing redundancy,
were employed early in the internship. Recognizing the importance of data quality in
subsequent analyses, I meticulously applied these fundamental methods to enhance the
reliability of the datasets. Later, I delved into more advanced data-cleaning tools and
methodologies. This phase of the internship encompassed a comprehensive exploration of
techniques such as outlier detection, imputation strategies, and feature engineering.
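
Two of the techniques named above, imputation and outlier detection, can be sketched with the standard library. The report does not say which imputation strategy or outlier rule was applied, so median imputation and the 1.5 x IQR rule are assumptions here, as are the helper names and the sample insulin values.

```python
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]


def iqr_outliers(values, factor=1.5):
    """Return the values lying outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if v < low or v > high]


# Illustrative insulin readings with one gap and one extreme value.
insulin = [80, 95, None, 110, 105, 90, 846, 100]

filled = impute_median(insulin)
outliers = iqr_outliers(filled)
print(filled)
print(outliers)
```

Whether a flagged value is removed, capped, or kept is a judgement call that depends on the data source; the sketch only identifies candidates.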

Exploratory Data Analysis (EDA)

I conducted comprehensive exploratory data analysis to uncover patterns, trends, and insights
in datasets, using statistical and visualization tools for an in-depth understanding of the data.
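
A minimal sketch of the statistical side of such an analysis is given below: summary statistics for one feature and the Pearson correlation between two features. The values are made up, and the helper `pearson` is written out by hand for illustration; the plots mentioned in the report would come from a visualization library and are not reproduced here.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Illustrative values, not data from the internship.
glucose = [85, 90, 95, 140, 155, 170]
insulin = [60, 80, 70, 160, 180, 200]

summary = {
    "mean": statistics.fmean(glucose),
    "median": statistics.median(glucose),
    "stdev": statistics.stdev(glucose),
}
r = pearson(glucose, insulin)
print(summary, round(r, 3))
```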

Machine Learning Model Development

During my internship, I successfully implemented regression and classification algorithms. Despite
encountering challenges during the training phase, I effectively navigated and resolved the
issues to achieve successful outcomes. This experience not only honed my technical skills but
also underscored my ability to troubleshoot and overcome obstacles in the development
process. Overall, the challenges encountered during training contributed to a valuable learning
experience, enhancing my proficiency in algorithmic implementation and problem-solving
within a professional context.

Model Evaluation and Model Optimization

During the internship, a comprehensive evaluation of the implemented regression and
classification algorithms was conducted. This involved assessing the models' predictive
accuracy, precision, recall, and F1 score, among other relevant metrics. Rigorous validation
techniques, such as cross-validation and holdout validation, were employed to ensure the
robustness and generalizability of the models. To enhance the algorithms' performance, a
systematic approach to model optimization was undertaken. This involved fine-tuning
hyperparameters, feature engineering, and exploring alternative algorithms.
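
The metrics named above follow directly from the counts of true positives, false positives, and false negatives. The sketch below computes them for a binary classifier. The function name `classification_metrics` and the label lists are hypothetical and do not reproduce results from the internship.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 score for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For a medical screening task such as diabetes prediction, recall is often watched closely because a false negative means a missed case. Which metric the team prioritized is not stated in the report.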

Model Testing and Evaluation

The testing phase of the internship project served as a crucial validation step to ensure the
effectiveness and reliability of the implemented models. This section provides an in-depth
exploration of the methodologies employed in the testing process, focusing on key elements
such as unit testing, integration testing, performance metrics, cross-validation, error analysis,
and stakeholder feedback.

Deployment of the Project

The deployment phase of my internship project was a crucial component in bringing the
designed report generation system to fruition. This involved transitioning from a development
environment to a production-ready state, ensuring seamless functionality and accessibility for

Hands-On Experience

By actively participating in these activities, interns not only acquired technical skills but also
developed problem-solving abilities, adaptability, and effective communication in a remote
work setting, enhancing their overall readiness for future roles in data science and machine learning.

Managerial skills


Detail how you effectively communicated with team members, superiors, and other
stakeholders. Mention instances where you presented information clearly and concisely.
Highlight your ability to listen actively and provide constructive feedback.

Time Management

Discuss how you managed your time efficiently to meet deadlines. Provide examples of
prioritization and multitasking. Explain how you handled tight schedules and conflicting


Share situations where you had to make decisions independently. Explain the factors you
considered and the reasoning behind your decisions. Discuss outcomes and any adjustments
made based on feedback.


Problem-solving requires a systematic approach to identify and resolve issues effectively.

When encountering challenges, it is important to describe the specific obstacles that were faced
and the steps taken to overcome them. This includes highlighting instances where creative and
critical thinking were utilized to develop innovative solutions. Finally, it is important to reflect
on the outcomes of the problem-solving process and identify any areas for improvement that
could be utilized in future problem-solving scenarios. Effective problem-solving involves
seeking out practical and efficient solutions and implementing them promptly.


Emphasize your ability to work collaboratively with colleagues. Describe your role in group
projects and how you contributed to team success. Discuss any challenges faced within the
team and how you addressed them. It is important to provide details that showcase your ability
to collaborate effectively with colleagues. You can start by describing your role in group
projects and how you contributed to the success of the team. By providing detailed examples
of your teamwork skills, you can demonstrate to potential employers that you are a valuable
asset to any team.


Explain how you adapted to changes in projects, tasks, or team dynamics. Discuss your ability
to learn quickly and adjust to new environments. Highlight instances where you demonstrated
flexibility in your approach. Provide detailed examples of how you demonstrated flexibility in
your approach, such as taking on new responsibilities or collaborating with team members with
different work styles. Include any challenges you faced and how you overcame them.

Communication skills

Active Listening

Example: Team Collaboration

Provide examples of situations where you actively listened to team members or superiors.
Discuss how your active listening skills contributed to a better understanding of tasks or
projects. Mention any instances where your ability to listen attentively positively impacted the
team's performance.

Verbal Communication

Example: Leading Meetings or Presentations

Describe any meetings or presentations you conducted or participated in. Highlight your ability
to articulate ideas clearly and engage the audience. Include feedback or reactions received
during or after your presentations.

Clarity and Conciseness

Example: Simplifying Complex Information

Discuss occasions where you had to convey complex information simply and understandably.
Provide examples of how you broke down intricate concepts for your team or clients. Highlight
instances where your communication prevented misunderstandings or confusion.

Team Meetings and Presentations

Detail your participation in team meetings and any presentations you delivered. Discuss how
you organized and conveyed information to your team. Highlight instances where your
communication contributed to successful collaboration or project outcomes.

Feedback and Collaboration

Discuss how you provided feedback to colleagues and received feedback from others. Highlight
instances where your feedback contributed to positive changes or improvements. If there were
collaborative projects, discuss how effective communication fostered successful teamwork.

Adaptability in Communication Style

Example 1: Diverse Audiences

Discuss how you adapted your communication style for different audiences (e.g., team
members, supervisors, clients). Share feedback or positive outcomes resulting from this

Example 2: Multicultural Collaboration

If relevant, highlight experiences of effective communication in a multicultural environment.
Discuss how cultural sensitivity positively impacted collaboration.

Technological developments

Python Programming

Skills Developed

Discuss the Python programming skills you developed during the internship. Highlight any
specific libraries or frameworks (e.g., NumPy, Pandas, Matplotlib) you used.


Detail how Python was applied in data analysis, scripting, automation, or any other relevant
areas. Provide examples of Python code snippets to illustrate your work.

Anaconda and Package Management

Environment Setup

Discuss how Anaconda facilitated the setup and management of Python environments.
Highlight the benefits of using Anaconda for package and environment management.

Package Integration

Describe the integration of various Python packages within the Anaconda environment.
Discuss how this streamlined the development process.

Version Control and Reproducibility

Explain how Anaconda contributed to version control and the reproducibility of your work.
Discuss any versioning challenges you addressed.

Data Analysis and Visualization

Share examples of how you employed Python for data analysis and visualization. Discuss the
libraries and frameworks (e.g., Pandas, NumPy, Matplotlib, Seaborn) you used. Highlight any
insights or decisions derived from your data analyses.

Machine Learning (if applicable)

If you worked on machine learning projects, discuss the Python libraries used (e.g.,
scikit-learn, TensorFlow, PyTorch). Detail the machine learning algorithms and models you
implemented. Share any notable outcomes, improvements, or predictions achieved through
machine learning.

Code Optimization

If relevant, discuss any efforts made towards optimizing Python code for better performance.
Highlight techniques employed to enhance code efficiency. Share outcomes and
improvements achieved through code optimization.

Photos and links

Student Self-Evaluation of the Short-Term Internship
Registration No:21HN1A0520
Term of Internship: 8 Weeks From:18-05-2023 To:16-08-2023
Date of Evaluation:
Organization name & Address: Grafx building, 5th line Dwrakanagar
Visakhapatnam -530016

Please rate your performance in the following areas:

Rating Scale: Letter grade of CGPA calculation to be provided

1 Oral communication 1 2 3 4 5

2 Written communication 1 2 3 4 5

3 Proactiveness 1 2 3 4 5

4 Interaction ability with community 1 2 3 4 5

5 Positive Attitude 1 2 3 4 5

6 Self-confidence 1 2 3 4 5

7 Ability to learn 1 2 3 4 5

8 Work Plan and Organization 1 2 3 4 5

9 Professionalism 1 2 3 4 5

10 Creativity 1 2 3 4 5
11 Quality of work 1 2 3 4 5
12 Time Management 1 2 3 4 5
13 Understanding the Community 1 2 3 4 5

14 Achievement of Desired Outcomes 1 2 3 4 5


Date Signature of the student

Name Of the Student:
Program of Study:
Year of Study:
Register No/H.T. No:
Name of the College:

Sl. No Evaluation criterion Maximum marks Marks Awarded

1 Activity Log 25

2 Internship evaluation 50

3 Oral Presentation 25

Date: Signature of the Faculty Guide

Certified by

Date: Signature of the Head of the Department/Principal




Programme of Study: Bachelor of Technology
Year of Study: THIRD YEAR – FIRST SEMESTER (3-1)
Register No/H.T. No: 21HN1A0520

University: JNTUK

Sl.No | Evaluation Criterion | Maximum Marks | Marks Awarded
1. | Internship Evaluation | 80 |
2. | For the grading given by the Supervisor of the Intern Organization | 20 |
3. | Viva-Voce | 50 |
GRAND TOTAL (EXT. 50 M + INT. 100M) | 200 |

Signature of the Faculty Guide

Signature of the Internal Expert

Signature of the External Expert

Signature of the Principal with Seal
