Report of The Summer Internship Project
On
Duration:
BY
Mr. K BASU NAYAK (2451-20-733-082)
Certificate
This is to certify that the Summer Internship work entitled “DATA ANALYTICS
CONSULTING VIRTUAL INTERNSHIP” is a bonafide work carried out by Mr. K BASU
NAYAK (2451-20-733-082) in partial fulfillment of the requirements for the award of degree
of Bachelor of Engineering in Computer Science and Engineering from Maturi Venkata
Subba Rao (MVSR) Engineering College, affiliated to OSMANIA UNIVERSITY, Hyderabad
during the Academic Year 2022-23 under our guidance and supervision.
DECLARATION
This is to certify that the work reported in the present summer internship entitled “DATA
ANALYTICS CONSULTING VIRTUAL INTERNSHIP” is a record of bonafide work done
by us as part of the KPMG virtual internship on Forage. The report is based on the project
work done entirely by us and not copied from any other source.
K Basu Nayak
(2451-20-733-082)
ACKNOWLEDGEMENTS
We would like to express our sincere gratitude and indebtedness to our summer internship
guide Mr V Sathish, Asst. Professor, for his valuable suggestions and interest throughout the
course of this summer internship.
We are also thankful to our principal Dr. G Kanaka Durga and Mr. J Prasanna Kumar,
Professor and Head, Department of Computer Science and Engineering, Maturi Venkata
Subba Rao (MVSR) Engineering College, Hyderabad for providing excellent infrastructure
for completing this summer internship successfully as a part of our B.E. Degree (CSE). We
would like to thank our summer internship coordinator Ms. Sirisha Daggubati, Asst.
Professor for their constant monitoring, guidance, and support.
We convey our heartfelt thanks to the lab staff for allowing us to use the required
equipment whenever needed.
Finally, we would like to take this opportunity to thank our families for their support
throughout the work. We sincerely acknowledge and thank all those who directly or
indirectly gave their support in the completion of this work.
K Basu Nayak
(2451-20-733-082)
MISSION
VISION
The vision is to empower students with visual insights into the journey of their data, fostering
a sense of digital literacy and security consciousness. It envisions a visually engaging
representation that not only educates but also sparks interest in networking concepts.
COURSE OBJECTIVES
To prepare the students
To give an experience to the students in solving real life practical problems with all its
constraints.
To give an opportunity to integrate different aspects of learning with reference to real
life problems.
To enhance the confidence of the students while communicating with industry
engineers and give an opportunity for useful interaction with them and familiarize
with work culture and ethics of the industry.
COURSE OUTCOMES
On successful completion of this course student will be
ABSTRACT
The objective of the internship is to facilitate reflection on experiences obtained in the
internship and to enhance understanding of academic material by application in the internship
setting. Internships will provide students the opportunity to test their interest in a particular
career before permanent commitments are made.
Internship students will develop skills and techniques directly applicable to their careers.
Internship programs will enhance the advancement possibilities of graduates.
Students will also develop skills in analyzing data sets, applying traditional techniques and
processing methods, and using a variety of algorithms to process data quickly and efficiently.
TABLE OF CONTENTS
Chapter 1: Introduction
    1.1: Big Data
    1.2: Data Analytics
    1.3: Data Science
Chapter 2: Problem Statement
Chapter 3: Motivation
Chapter 4: Methodological Details
    4.1: Task 1
    4.2: Task 2
    4.3: Task 3
Chapter 5: Result
    5.1: Result and analysis
    5.2: Output
Chapter 6: Conclusion
    6.1: Conclusion
Acknowledgement
Worksheet
References
LIST OF FIGURES
CHAPTER I
1. INTRODUCTION
1.1 Big Data
What is Big Data?
Big Data is a collection of data that is huge in volume and still growing exponentially with
time. Its size and complexity are so great that no traditional data management tool can store
or process it efficiently. In other words, big data is still data, but at a very large scale.
One example of Big Data is the New York Stock Exchange, which generates about one
terabyte of new trade data per day.
Big Data is commonly classified into three types:
1. Structured
2. Unstructured
3. Semi-structured
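As a rough illustration of the three types listed above, the sketch below holds one small
sample of each kind in Python. The sample values and field names are made up for
illustration and are not taken from the internship datasets.

```python
import csv
import io
import json

# Structured: fixed columns, every row follows the same schema (for example a CSV table).
structured_text = "customer_id,state\n1,NSW\n2,VIC\n"
structured_rows = list(csv.DictReader(io.StringIO(structured_text)))

# Semi-structured: self-describing keys, but records may differ in shape (for example JSON).
semi_structured_text = '[{"id": 1, "tags": ["bike"]}, {"id": 2, "email": "a@example.com"}]'
semi_structured_records = json.loads(semi_structured_text)

# Unstructured: free text with no predefined fields.
unstructured_text = "Customer called to say the delivery arrived late."

print(structured_rows[0]["state"])                # a named column can be read directly
print(sorted(semi_structured_records[1].keys()))  # keys vary from record to record
print(len(unstructured_text.split()))             # only generic operations such as word counts apply
```

Structured data can be queried by column name, semi-structured data has to be inspected
record by record, and unstructured data needs further processing before it can be analysed.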
1.2 Data Analytics
Data analytics is the process of analysing raw data to find trends and answer questions. This
definition captures the broad scope of the field, which includes many techniques with many
different goals. The data analytics process has several components that can support a
variety of initiatives. By combining these components, a successful data analytics
initiative provides a clear picture of where you are, where you have been and where
you should go.
Descriptive analytics helps answer questions about what happened. These techniques
summarize large datasets to describe outcomes to stakeholders. By developing key
performance indicators (KPIs), these strategies can help track successes or failures.
Metrics such as return on investment (ROI) are used in many industries. Specialized
metrics are developed to track performance in specific industries. This process requires
the collection of relevant data, processing of the data, data analysis and data visualization.
This process provides essential insight into past performance.
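To make the idea of a KPI concrete, the following minimal sketch computes return on
investment (ROI) as (gain - cost) / cost for two campaigns. The campaign names and
figures are hypothetical and serve only to illustrate the calculation.

```python
def return_on_investment(gain: float, cost: float) -> float:
    """Return ROI as a fraction: (gain - cost) / cost."""
    if cost == 0:
        raise ValueError("cost must be non-zero")
    return (gain - cost) / cost


# Hypothetical campaign figures, for illustration only.
campaigns = {
    "email": {"cost": 1000.0, "gain": 1500.0},
    "social": {"cost": 2000.0, "gain": 1800.0},
}

# One KPI value per campaign, ready to be summarized for stakeholders.
roi_by_campaign = {
    name: return_on_investment(figures["gain"], figures["cost"])
    for name, figures in campaigns.items()
}

for name, roi in roi_by_campaign.items():
    print(f"{name}: ROI = {roi:.1%}")
```

A positive ROI means the campaign returned more than it cost, and a negative ROI means it
returned less.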
Diagnostic analytics helps answer questions about why things happened. These
techniques supplement more basic descriptive analytics. They take the findings from
descriptive analytics and dig deeper to find the cause. The performance indicators are
further investigated to discover why they got better or worse. This generally occurs in
three steps:
CHAPTER II
Problem Statement:
The client provided KPMG with 3 datasets:
Customer Demographic
Customer Addresses
The objective is to identify and correct data quality issues in these datasets, such as
inaccurate or incomplete records, duplicate values and null values.
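Checks for null and duplicate values of the kind mentioned above could be sketched as
follows. This is a minimal example under stated assumptions: the records and the field names
(customer_id, first_name, state) are hypothetical and do not reflect the client's actual
schema.

```python
from collections import Counter

# Hypothetical records; the field names are assumptions, not the client's schema.
records = [
    {"customer_id": 1, "first_name": "Ann", "state": "NSW"},
    {"customer_id": 2, "first_name": None, "state": "VIC"},
    {"customer_id": 2, "first_name": None, "state": "VIC"},
    {"customer_id": 3, "first_name": "Raj", "state": ""},
]


def count_missing(rows, field):
    """Count rows where the field is None or an empty string (completeness check)."""
    return sum(1 for row in rows if row.get(field) in (None, ""))


def duplicated_keys(rows, key):
    """Return the key values that appear more than once (uniqueness check)."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


missing_first_name = count_missing(records, "first_name")
missing_state = count_missing(records, "state")
duplicate_ids = duplicated_keys(records, "customer_id")

print("missing first_name:", missing_first_name)
print("missing state:", missing_state)
print("duplicated customer_id values:", duplicate_ids)
```

Counts like these give a first picture of how complete and unique each dataset is before any
correction is attempted.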
CHAPTER III
Task 1: Data Quality Assessment
As per voicemail, please find the 3 datasets attached from Sprocket Central Pty Ltd:
Customer Demographic
Customer Addresses
I’ve also attached a data quality framework as a guideline. Let me know if you have any
questions.
Draft an email to the client identifying the data quality issues and strategies to mitigate
these issues. Refer to ‘Data Quality Framework Table’ and resources below for
criteria and dimensions which have been considered.
You can start with programs such as Excel, Google Sheets, Tableau or Power BI. Feel free
to use Python, the R programming language, MATLAB and other data analytics tools that
you know of.
Task 2:
Sprocket Central Pty Ltd has given us a new list of 1000 potential customers with
their demographics and attributes. However, these customers do not have prior
The marketing team at Sprocket Central Pty Ltd is sure that, if correctly analysed, the
data would reveal useful customer insights which could help optimize resource
value customers.
customers should be targeted to drive the most value for the organization.
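One simple way to approach the question of which new customers to target is sketched
below. It is only an illustrative assumption, not the method used in the internship: it
averages past profit per customer segment among existing customers and ranks the new
customers by the average of their segment. All names, segments and figures are
hypothetical.

```python
from collections import defaultdict

# Hypothetical existing customers with a known past profit.
existing_customers = [
    {"segment": "standard", "profit": 400.0},
    {"segment": "standard", "profit": 600.0},
    {"segment": "premium", "profit": 900.0},
    {"segment": "premium", "profit": 1100.0},
]

# Hypothetical new customers with demographics only, no purchase history.
new_customers = [
    {"name": "A", "segment": "standard"},
    {"name": "B", "segment": "premium"},
]


def average_profit_by_segment(customers):
    """Average the past profit of existing customers within each segment."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for customer in customers:
        totals[customer["segment"]] += customer["profit"]
        counts[customer["segment"]] += 1
    return {segment: totals[segment] / counts[segment] for segment in totals}


def rank_new_customers(customers, segment_average):
    """Sort new customers by the average profit of their segment, highest first."""
    return sorted(
        customers,
        key=lambda customer: segment_average.get(customer["segment"], 0.0),
        reverse=True,
    )


segment_average = average_profit_by_segment(existing_customers)
ranked = rank_new_customers(new_customers, segment_average)

for customer in ranked:
    print(customer["name"], segment_average.get(customer["segment"], 0.0))
```

A real analysis would likely combine several attributes rather than a single segment, but the
ranking step would follow the same pattern.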
Task 3:
The client is happy with the analysis plan and would like us to proceed. After
building the model we need to present our results back to the client.
Visualizations such as interactive dashboards often help us highlight key findings and
convey our ideas in a more succinct manner. A list of customers or an algorithm won’t cut
it with the client; we need to support our results with visualizations.
Please develop a dashboard that we can present to the client at our next meeting.
CHAPTER IV
Result and analysis:
We have successfully analysed the datasets given by Sprocket Central Pty Ltd. The final
step was interpreting the results of the data analysis. This part is essential because it is
how a business gains actual value from the preceding steps. Interpreting the results of a
data analysis should validate why it was conducted, even if the results are not 100 percent
conclusive.
OUTPUT:
Before Analysis
After Analysis
CONCLUSION: