
AN INTERNSHIP REPORT ON

DATA ANALYTICS

SUBMITTED TO THE SAVITRIBAI PHULE PUNE UNIVERSITY, PUNE


IN THE PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE AWARD OF THE

DEGREE OF

BACHELOR OF ENGINEERING (INFORMATION TECHNOLOGY)

SUBMITTED BY

STUDENT NAME: RUSHIKESH SUBHASH GAIKHE PRN No :

DEPARTMENT OF INFORMATION TECHNOLOGY

SANDIP INSTITUTE OF TECHNOLOGY AND RESEARCH CENTRE, NASHIK

SAVITRIBAI PHULE PUNE UNIVERSITY

2024-2025
CERTIFICATE

This is to certify that the Internship report entitled

“DATA ANALYTICS”

Submitted by

STUDENT NAME : RUSHIKESH SUBHASH GAIKHE PRN No :

is a bonafide student of this institute and the work has been carried out by him/her under the
supervision of Prof. Megha Singru and it is approved for the partial fulfillment of the requirement
of Savitribai Phule Pune University, for the award of the degree of Bachelor of Engineering
(Information Technology).

(Prof. Megha Singru)                      (Prof. Abhay R. Gaidhani)
Guide                                     Head,
Department of Information Technology      Department of Information Technology

(Dr. M.M. Patil)
Principal,
SITRC, Nasik

Place : Nasik
Date :
ACKNOWLEDGEMENT

With a deep sense of gratitude, we would like to thank all the people who have lit our path
with their kind guidance. We are very grateful to these intellectuals, who did their best to help
during our project work.
It is our proud privilege to express a deep sense of gratitude to Dr. M.M. Patil, Principal
of Sandip Institute of Technology and Research Centre (SITRC), Nashik, for his comments
and kind permission to complete this project. We remain indebted to Prof. Abhay R.
Gaidhani, H.O.D., Information Technology Department, for his timely suggestions and
valuable guidance.
Special gratitude goes to my guide, Prof. Megha Singru, and to the staff members and technical
staff members of the Information Technology Department for their extensive, excellent and
precious guidance in the completion of this work. We thank all our colleagues for their
appreciable help with our project.
With the help of various industry owners and lab technicians, it has been our endeavour
throughout our work to cover the entire project.
We are also thankful to our parents, who provided their wishful support for the successful
completion of our project.
And lastly, we thank all our friends and the people who are directly or indirectly related to
our project work.

Rushikesh Subhash Gaikhe


ABSTRACT

Data analysis plays a pivotal role in extracting meaningful insights from vast volumes of data to
inform decision-making processes across various domains. This abstract encapsulates the essence of
data analysis, outlining its importance, methodologies, and applications. Data analysis encompasses a
spectrum of techniques aimed at uncovering patterns, trends, and correlations within datasets. It
involves the collection, cleaning, processing, and interpretation of data to derive actionable insights.
From descriptive statistics to advanced machine learning algorithms, data analysis methodologies vary
in complexity and applicability, catering to diverse business needs and objectives. In today's data-
driven era, organizations rely on data analysis to gain a competitive edge, optimize operations, and
enhance customer experiences. By harnessing the power of data, businesses can identify opportunities,
mitigate risks, and drive innovation. Moreover, data analysis facilitates evidence-based decision-
making, enabling stakeholders to make informed choices grounded in empirical evidence rather than
intuition or conjecture. This abstract serves as a primer on data analysis, highlighting its significance
in transforming raw data into valuable knowledge. It underscores the interdisciplinary nature of data
analysis, bridging the gap between data science, statistics, and domain expertise. As organizations
continue to amass vast amounts of data, the role of data analysis in extracting actionable insights and
driving strategic initiatives will only grow in prominence. At its core, data analysis involves the
systematic collection, cleaning, processing, and interpretation of data to uncover patterns, trends, and
relationships. Through statistical inference, predictive modeling, and data visualization, analysts
transform raw data into actionable insights that drive informed decision-making.
TITLE

Sr. No.   Title of Chapter

01   Introduction
     1.1 Overview
     1.2 Problem Definition and Objectives
     1.3 Organization of the report
02   Literature Survey
03   Motivation
     3.1 Purpose and scope
     3.2 Objective of seminar
04   System design/technology/Analytical and/or experimental work
05   Conclusions
     5.1 Conclusions
     5.2 Future Work
     5.3 Applications
06   Bibliography/References (in IEEE Format): Plagiarism Report of project report
CHAPTER NO 1:- INTRODUCTION

Data visualization is a graphical representation of data. It presents data as an image or graphic to make
it easier to identify patterns and understand difficult concepts. Technology allows users to interact
with the data by changing the parameters to see more detail and create new insights. Additionally, it
provides an excellent way for employees or business owners to present data to non-technical
audiences without confusion.

1.1 Overview

Data visualization is the graphical representation of information and data. By using visual elements
such as charts, graphs, maps, plots, infographics, and even animations, data visualization tools provide
an accessible way to see and understand trends, outliers, and patterns in data. These visual displays
communicate complex data relationships and data-driven insights in a way that is easy to understand.

Data visualization can be used for a variety of purposes, and it is not reserved for data teams alone.
Management also uses it to convey organizational structure and hierarchy, while data analysts and data
scientists use it to discover and explain patterns and trends. Harvard Business Review categorizes data
visualization into four key purposes.

1.2 Problem Definition and Objectives

Data visualization is commonly used to spur idea generation across teams. Visualizations are frequently
used during brainstorming or Design Thinking sessions at the start of a project, where they support the
collection of different perspectives and highlight the common concerns of the group. While
these visualizations are usually unpolished and unrefined, they help set the foundation within the
project to ensure that the team is aligned on the problem that they are looking to address for key
stakeholders.

1.3 Organization of the report


One tool for applying data visualization in a company is TIBCO Spotfire, an analytics platform that
lets users explore and visualize new discoveries in data through immersive dashboards and advanced
analytics. Spotfire delivers capabilities at scale, including predictive analytics, geolocation
analytics, and advanced analytics. With Spotfire Mods, tailored analytic apps can be built rapidly,
repeatedly, and to scale.

TIBCO Spotfire advanced analytics helps you:

• Gain richer insights with AI-infused visual analytics and custom analytics app creation
• Combine historic and streaming data to predict trends via data science and embedded analytics

CHAPTER NO 2:- LITERATURE SURVEY

Technical and professional communication (TPC) has produced some recent research on visual
communication; however, much of this research is not directly focused on empirical work that addresses
data visualization. Recent work [18] that attempts to address design lore in the field does little to
mitigate this problem, and since TPC has not directly investigated effective practices for producing
data visualizations of complex information, we were forced to go outside of the field to see whether
other disciplines had broached the subject. To do this, we performed a version of the integrative
literature review.

Before beginning our database searches, we placed restrictions on them. First, we restricted our
search of the literature to post-2000, with the majority of the research we examined closely being
from the last 10 years. Improved tools for creating data visualizations and more readily available
data mean that techniques used over 15 years ago have little bearing on current practices. The results
reported here include research through 2016. Second, we restricted our search to empirical research
reported in journal articles. While we appreciate theoretical orientations and foundations, empirical
research provides more reliable data for trying to ascertain effective practices for data
visualization creation for end-user application. For example, some research [19] based
recommendations on visualization types on the authors' own research of “best practices” that were
neither clearly explained nor empirically supported. Thus, this type of research article was not
included in our analysis. We also limited our search to studies that intersected with communicating
patient-focused information in health and medical settings.

As recent research has suggested [20]–[21], technical and professional communicators are well
qualified to take a leading role in health communication, particularly in work that involves
patient-centered information such as patient education materials, decision aids for shared decision
making, and health risk communication. These patient-centered materials require data visualizations
(and information design) to help patients and their families or caregivers make better decisions
about their health. The information in these materials needs to be patient centered, and the
materials need to discuss subjects such as the prevalence of disease in a population, the probability
that a positive test result indicates a "true positive", and the frequency of various outcomes from
different treatment procedures in ways that patients can understand. With only about 12% of US adults
having proficient health literacy according to the US government, researchers in health and medicine
are conducting the most advanced and diverse research on ways to communicate complex data and
information both visually and textually.
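
The probability that a positive test result is a "true positive" can be made concrete with a short calculation. The following is a minimal Python sketch using Bayes' rule. The function name `positive_predictive_value` and the prevalence, sensitivity, and specificity figures are illustrative assumptions, not values taken from the studies cited above.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a person with a positive test actually has the disease."""
    # Share of the whole population that is ill AND tests positive.
    true_positives = prevalence * sensitivity
    # Share of the whole population that is healthy AND still tests positive.
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    if true_positives + false_positives == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positives / (true_positives + false_positives)


# Illustrative assumptions: 1% prevalence, 90% sensitivity, 91% specificity.
ppv = positive_predictive_value(0.01, 0.90, 0.91)
print(f"Chance that a positive result is a true positive: {ppv:.1%}")
```

With these assumed inputs, fewer than one in ten positive results is a true positive, which is the kind of counter-intuitive figure that patient-centered visualizations are meant to make understandable.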

CHAPTER NO 3:- MOTIVATION

Data visualization is essential in helping businesses quickly identify data trends, which would
otherwise be a laborious task. The pictorial representation of data sets allows analysts to visualize
concepts and new patterns. With the volume of data surging every day, making sense of quintillions of
bytes of data is impossible without techniques for handling data proliferation, data visualization
among them. Every professional industry benefits from understanding its data, so data visualization is
branching out to all fields where data exists. For every business, information is its most significant
leverage. Through visualization, one can convey points effectively and take advantage of that
information. Dashboards, graphs, infographics, maps, charts, videos, and slides can all be used for
visualizing and understanding data. Visualizing data enables decision-makers to interrelate the data,
find better insights, and reap the benefits of data visualization.

3.1 Purpose and scope


Some of the methods used in data visualization today include area chart, bar chart, box-and-whisker
plots, bubble cloud, bullet graph, cartogram, circle view, dot distribution map, Gantt chart, heat map,
highlight table, histogram, matrix, network, polar area, radial tree, scatter plot (2d or 3d), streamgraph,
text tables, timeline, treemap, wedge stack graph, and word cloud. You can choose the right type of
visualization based on the purpose of your visualization, the nature of data, and the needs of your
audience.

1. Instant absorption of large and complex data

When presented effectively, we are able to grasp large volumes of data literally in the blink of
an eye. The reason for this is that the neural process required to process ready images is much
easier than creating our own visualization from text or numbers. We are also able to appreciate
the interrelations between different data points more easily when we view their visual
representations. Once we see something, we internalize it faster.

Our addiction to our smartphone or tablet screens has led to shortened attention spans, so
receiving information as a ‘snapshot’ is helpful. On the other hand, if you have systems in your
organization to collect data but no effective way to present it to stakeholders, then you are
unlikely to get the expected business benefits, as users struggle to make sense of unwieldy and
hard-to-understand reports.

So a data visualization creates small ‘packages’ or units of information to convey sets of ideas
that can be stored in your viewer’s short-term memory. Color, line weight, scale, and
placement are used to bring out the conceptual meaning of these “chunks”, all helping to match
content to its hierarchy of importance. The viewer can easily navigate the chunks and
understand their significance.
2. Better decision-making based on data

Business meetings that discuss visual data tend to be shorter and reach consensus more easily
as compared to those that focus on only text or numbers. Data visualization helps to reach
decisions faster and enables viewers to glean far better insights about patterns and trends.

With visualization, the benefits of data analytics are available to various roles throughout your
organization, who may not be experts in the field. If you put the right data visualization
systems in place, your sales staff can gain a better understanding of consumer behavior and
perceptions even though they may not be experts at interpreting data themselves. Data
visualization is a combination of technical analytics and creative storytelling, allowing you to
create experts with the right tools and training.

You will get the best outcomes when visualizations are developed to best suit your business
objectives. Some data visualizations help to analyze, while others present information in an
interesting way. Some are designed to illustrate concepts, processes, or strategies for different
kinds of viewers. Build your own based on your specific objectives, type of data, and needs of
different stakeholders.

3. Audience Engagement

Viewers feel far more engaged when they are able to relate to data thanks to good visual
presentation. Images produce emotional responses, so data visualization can help to drive
opinion and action.

Visualization also enables communication and collaboration as multiple stakeholders can view,
appreciate and discuss insights from data. We now expect data to be presented in easy-to-
understand, visual methods. For example, when we want to know the performance of our own
websites, we consult Google Analytics, which has charts that help us to understand the
information far better than if it were in raw tabular form. As another example, consider sales
spread over a geographical region, or retail outlet and distributor locations. It would certainly
be more helpful for you to view this as a geospatial distribution than as descriptive text.

During sales presentations, data that showcases your strength, when presented visually, goes a
long way in building credibility and persuading. Encourage your sales teams to share visual
data that proves claims rather than just words.

Interactivity goes a long way in creating engagement. Can you create interactive visualizations
that viewers can change, ask queries, and arrive at conclusions of their own? This helps to
build more credibility about the data.

4. Reveals hidden patterns and deeper insights

Data visualization uncovers trends, patterns, and relationships that are not easily discernible
from numerical data or traditional forms of representation. Deeper insights and
interrelationships can be obtained through data visualization.

Sales forecasts made using data visualization tend to be more accurate than others. When it
comes to consumer behavior, visualization helps to see a number of different factors and how
they are related to each other, leading to a better understanding.

You can also use data visualization to understand your own operations, identify bottlenecks
and pinpoint areas that need improvement. For instance, let’s say there are spikes in customer
complaints. When the data is seen in conjunction with certain changes in the support staff, you
may find correlation and possibly causes.

3.2 Objective of Seminar

Data visualization is the practice of translating information into a visual context, such as a map or
graph, to make data easier for the human brain to understand and pull insights from. The main goal of
data visualization is to make it easier to identify patterns, trends and outliers in large data sets. The
term is often used interchangeably with others, including information graphics, information
visualization and statistical graphics.

Data visualization is one of the steps of the data science process, which states that after data has been
collected, processed and modeled, it must be visualized for conclusions to be made. Data visualization
is also an element of the broader data presentation architecture (DPA) discipline, which aims to
identify, locate, manipulate, format and deliver data in the most efficient way possible.

Data visualization is important for almost every career. It can be used by teachers to display student
test results, by computer scientists exploring advancements in artificial intelligence (AI) or by
executives looking to share information with stakeholders. It also plays an important role in big
data projects. As businesses accumulated massive collections of data during the early years of the big
data trend, they needed a way to get an overview of their data quickly and easily. Visualization tools
were a natural fit.

Visualization is central to advanced analytics for similar reasons. When a data scientist is writing
advanced predictive analytics or machine learning (ML) algorithms, it becomes important to visualize
the outputs to monitor results and ensure that models are performing as intended. This is because
visualizations of complex algorithms are generally easier to interpret than numerical outputs.

Data visualization provides a quick and effective way to communicate information in a universal
manner using visual information. The practice can also help businesses identify which factors affect
customer behavior; pinpoint areas that need to be improved or need more attention; make data more
memorable for stakeholders; understand when and where to place specific products; and predict sales
volumes.

Other benefits of data visualization include the following:

• the ability to absorb information quickly, improve insights and make faster decisions;
• an increased understanding of the next steps that must be taken to improve the organization;
• an improved ability to maintain the audience's interest with information they can understand;
• an easy distribution of information that increases the opportunity to share insights with
everyone involved;
• a reduced need for data scientists, since data is more accessible and understandable; and
• an increased ability to act on findings quickly and, therefore, achieve success with greater
speed and fewer mistakes.

CHAPTER NO 4:- SYSTEM DESIGN

Data visualization can be expressed in different forms. Charts are a common way of expressing data,
as they depict different data varieties and allow data comparison.

The type of chart you use depends primarily on two things: the data you want to communicate, and
what you want to convey about that data. These guidelines describe various types of charts and
their use cases.

Types of charts

Change over time

Change over time charts show data over a period of time, such as trends or comparisons across
multiple categories.

Use cases include:

• Stock price performance
• Health statistics

FIGURE 4.1 (TYPE OF CHARTS)


Change over time charts include:

1. Line charts
2. Bar charts
3. Stacked bar charts
4. Candlestick charts
5. Area charts
6. Timelines
7. Horizon charts
8. Waterfall charts

Category comparison

Category comparison charts compare data between multiple distinct categories.

Use cases include:

• Income across different countries
• Popular venue times
• Team allocations

FIGURE 4.2 (TYPE OF BAR CHARTS)

Category comparison charts include:

1. Bar charts
2. Grouped bar charts
3. Bubble charts
4. Multi-line charts
5. Parallel coordinate charts
6. Bullet charts

Ranking

Ranking charts show an item’s position in an ordered list.

Use cases include:

• Election results
• Performance statistics

FIGURE 4.3 (TYPE OF STACKED BAR CHARTS)

Ranking charts include:

1. Ordered bar charts
2. Ordered column charts
3. Parallel coordinate charts
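
As a minimal sketch of the data preparation behind an ordered bar chart, the following Python snippet sorts items by value and assigns each a rank. The party names and vote counts are made-up illustrative values, and `rank_items` is a hypothetical helper, not part of any library.

```python
def rank_items(values):
    """Return (rank, label, value) tuples sorted from largest to smallest value."""
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    return [(rank, label, value) for rank, (label, value) in enumerate(ordered, start=1)]


# Made-up vote counts, for illustration only.
votes = {"Party A": 4200, "Party B": 5100, "Party C": 1300}

for rank, label, value in rank_items(votes):
    # Draw a crude text bar: one '#' per 500 votes.
    print(f"{rank}. {label:<8} {'#' * (value // 500)} {value}")
```

Sorting before plotting is what distinguishes a ranking chart from a plain category comparison: the position of each bar carries the message.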

Part-to-whole

Part-to-whole charts show how partial elements add up to a total.

Use cases include:

• Consolidated revenue of product categories
• Budgets

FIGURE 4.4 (PART TO WHOLE CHART)

Part-to-whole charts include:

1. Stacked bar charts
2. Pie charts
3. Donut charts
4. Stacked area charts
5. Treemap charts
6. Sunburst charts
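
Every part-to-whole chart rests on the same arithmetic: each part is expressed as a share of the total. A minimal Python sketch follows. The budget figures are made-up illustrative values and `shares` is a hypothetical helper name.

```python
def shares(parts):
    """Convert absolute amounts into percentage shares of the total."""
    total = sum(parts.values())
    if total == 0:
        raise ValueError("total must be non-zero")
    return {name: 100.0 * amount / total for name, amount in parts.items()}


# Made-up budget figures, for illustration only.
budget = {"Salaries": 50, "Equipment": 30, "Travel": 20}
budget_shares = shares(budget)

for name, pct in budget_shares.items():
    print(f"{name}: {pct:.1f}%")
```

The resulting percentages are what a pie or donut chart would encode as slice angles, and what a stacked bar would encode as segment lengths.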

Correlation

Correlation charts show correlation between two or more variables.

Use cases include:

• Income and life expectancy

FIGURE 4.5 (CORRELATION)

Correlation charts include:

1. Scatterplot charts
2. Bubble charts
3. Column and line charts
4. Heatmap charts
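
A scatterplot is often read alongside a numerical measure of association. As a rough sketch, the Pearson correlation coefficient can be computed in plain Python as shown here. The income and life-expectancy figures are made-up illustrative values, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariance / math.sqrt(var_x * var_y)


# Made-up values: income (thousand USD) and life expectancy (years).
income = [5, 10, 20, 40, 60]
life_expectancy = [62, 68, 73, 78, 81]
print(f"r = {pearson_r(income, life_expectancy):.2f}")
```

A value near +1 or -1 indicates a strong linear relationship; the chart remains necessary because the coefficient alone cannot reveal non-linear patterns or outliers.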

Distribution

Distribution charts show how often each value occurs in a dataset.

Use cases include:

• Population distribution
• Income distribution

FIGURE 4.6 (DISTRIBUTION)

Distribution charts include:

1. Histogram charts
2. Box plot charts
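
The numbers behind these two chart types can be sketched with the Python standard library: bin counts for a histogram and a five-number summary for a box plot. The ages below are made-up illustrative values, `histogram_counts` and `five_number_summary` are hypothetical helper names, and the "inclusive" quartile method is only one of several conventions that plotting tools use.

```python
import statistics


def histogram_counts(values, bin_width):
    """Count how many values fall into each bin [start, start + bin_width)."""
    counts = {}
    for value in values:
        start = (value // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum), the basis of a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Made-up ages, for illustration only.
ages = [3, 8, 15, 21, 24, 29, 33, 35, 41, 58]

for start, count in histogram_counts(ages, 10).items():
    print(f"{start:>2}-{start + 9:<2} {'#' * count}")
print(five_number_summary(ages))
```

The histogram shows the shape of the distribution, while the five-number summary condenses it into the box, median line, and whiskers of a box plot.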

Flow

Flow charts show movement of data between multiple states.

Use cases include:

• Fund transfers
• Vote counts and election results

FIGURE 4.7 (FLOW CHART)


Flow charts include:

1. Sankey charts
2. Gantt charts
3. Chord charts
4. Network charts

Relationship

Relationship charts show how multiple items relate to one another.

Use cases include:

• Social networks
• Word charts

FIGURE 4.8 (RELATIONSHIP)
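
The categories described in this chapter can be summarised as a simple lookup from the purpose of a visualization to candidate chart types. The following Python sketch transcribes the lists given above. The names `CHART_TYPES` and `suggest_charts` are hypothetical, and the lookup is only an illustration of the guideline, not a complete chart-selection method.

```python
# Purpose -> chart types, transcribed from the lists in this chapter.
CHART_TYPES = {
    "change over time": [
        "Line charts", "Bar charts", "Stacked bar charts", "Candlestick charts",
        "Area charts", "Timelines", "Horizon charts", "Waterfall charts",
    ],
    "category comparison": [
        "Bar charts", "Grouped bar charts", "Bubble charts", "Multi-line charts",
        "Parallel coordinate charts", "Bullet charts",
    ],
    "ranking": [
        "Ordered bar charts", "Ordered column charts", "Parallel coordinate charts",
    ],
    "part-to-whole": [
        "Stacked bar charts", "Pie charts", "Donut charts", "Stacked area charts",
        "Treemap charts", "Sunburst charts",
    ],
    "correlation": [
        "Scatterplot charts", "Bubble charts", "Column and line charts", "Heatmap charts",
    ],
    "distribution": ["Histogram charts", "Box plot charts"],
    "flow": ["Sankey charts", "Gantt charts", "Chord charts", "Network charts"],
    # The chapter names no specific chart types for this category.
    "relationship": [],
}


def suggest_charts(purpose):
    """Return the chart types listed for a purpose (case-insensitive)."""
    key = purpose.strip().lower()
    if key not in CHART_TYPES:
        raise ValueError(f"unknown purpose: {purpose!r}")
    return list(CHART_TYPES[key])


print(suggest_charts("Distribution"))
```

Some chart types, such as bar charts and bubble charts, appear under more than one purpose, which is why the choice depends on what is to be conveyed about the data and not only on the data itself.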

CHAPTER NO 5:- CONCLUSION

5.1 Conclusion

Good data visualization should communicate a data set clearly and effectively by using graphics. The
best visualizations make it easy to comprehend data at a glance. They take complex information and
break it down in a way that makes it simple for the target audience to understand and on which to base
their decisions. As Edward R. Tufte pointed out, “the essential test of design is how well it assists the
understanding of the content, not how stylish it is.” Data visualizations, especially, should adhere to
this idea. The goal is to enhance the data through design, not draw attention to the design itself.

5.2 Future Work

Although the data visualization design delivers the data effectively, there are
changes that would enhance the current system, especially if time constraints aren’t a factor. First,
the map could be designed using shape files that create borders for the countries across the map. In
addition, using the leaflet package would offer the ability to create interactive web maps. The
rendering of the map would need to be a priority. Although there is a spinner to show processing
time, increasing the rendering speed of the map would please users. Second, a visualization in
which the user can pick a country and it shows who they played and the outcome, would be
complementary to the total wins per year bar graph. It would explain why two countries can have
the same number of wins while only one of them was the champion of the tournament.
Third, when selecting countries to visualize their total goals scored across tournament
years, the current plot generates zeros for goals per year. Without the data table to offer more
insight, the plot could be misleading. The user might think that the country participated in that
year’s tournament but scored no goals. However, in reality, the zero is a placeholder so the graph
will create the correct plot when at least one other team is selected. Therefore, re-coding the plot
to allow non-consecutive points to be plotted without altering what the line plot should look like
would be a significant improvement and would reduce the dependency on the data table that is
generated alongside it.
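
One possible way to address the zero-placeholder problem is to mark years in which a country did not participate as missing values instead of zeros. The sketch below is in Python and is an assumption about how this could be done, not the code of the existing system. The years and goal counts are made up, and `align_goals` is a hypothetical helper.

```python
import math


def align_goals(all_years, goals_by_year):
    """Return one value per tournament year, using NaN where the team did not play."""
    return [
        float(goals_by_year[year]) if year in goals_by_year else math.nan
        for year in all_years
    ]


# Made-up numbers, for illustration only.
tournament_years = [2002, 2006, 2010, 2014]
team_goals = {2002: 5, 2010: 0, 2014: 7}  # no entry for 2006: did not participate

series = align_goals(tournament_years, team_goals)
print(series)  # the 2006 entry is NaN (absent); the 2010 entry is a genuine zero
```

Plotting libraries such as matplotlib leave a gap in a line at NaN values, so a series prepared this way would distinguish "did not participate" from "participated but scored no goals" without relying on the data table.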

5.3 Application

1. Business Intelligence

Business intelligence utilizes data visualization to gather, analyze, and interpret data for informed
decision-making. It involves running various analyses such as sales performance, market
segmentation, and financial forecasting. For example, a company can use data visualization to
analyze sales data across different regions and product categories to identify the best performing
regions and products, enabling them to allocate resources effectively and optimize their sales
strategies.
2. Finance Industries

Data visualization in the finance industry helps professionals analyze financial data, detect trends,
and make informed decisions. It enables them to run analyses such as revenue and expense
tracking, cash flow analysis, and portfolio performance evaluation. For example, financial analysts
can use data visualization to track revenue growth over time, identify seasonal patterns, and
compare performance across different product lines, allowing them to make strategic decisions and
optimize financial strategies accordingly.

3. E-commerce

In the e-commerce industry, data visualization aids in understanding customer behavior, optimizing
marketing campaigns, and enhancing personalized recommendations. Analysis can include
customer segmentation, purchase patterns, and conversion rates. For instance, e-commerce
companies can use data visualization to analyze customer browsing and purchasing data to identify
customer segments and target them with tailored marketing campaigns, resulting in improved
conversion rates and customer satisfaction.

4. Education

In the education industry, data visualization facilitates tracking student performance, identifying
learning outcomes, and informing pedagogical decisions. Analysis can include student
achievement, learning progress, and assessment results. For example, educational institutions can
use data visualization to analyze student test scores over time, identify areas where students may be
struggling, and adjust teaching strategies accordingly to improve learning outcomes and academic
success.

5. Data Science

Data visualization is essential in the field of data science, enabling professionals to extract insights
from complex datasets and communicate findings effectively. Analyses can include exploratory
data analysis, pattern recognition, and model evaluation. For example, data scientists can use
visualizations to analyze customer behavior data, identify patterns in purchasing habits, and build
predictive models to recommend personalized products, leading to increased customer satisfaction
and sales revenue.

CHAPTER NO 6:- REFERENCE

[1] B. Wong, "Visual representation of scientific information," Sci. Signaling, vol. 4, p. pt1, 2011.

[2] R. Garcia-Retamero and U. Hoffrage, "Visual representation of statistical information improves
diagnostic inferences in doctors and their patients," Soc. Sci. Med., vol. 83, pp. 27-33, Apr. 2013.

[3] P. Shah and E. G. Freedman, "Bar and line graph comprehension: An interaction of top-down and
bottom-up processes," Topics in Cognitive Sci., vol. 3, pp. 560-578, 2011.

[4] M. Galesic and R. Garcia-Retamero, "Statistical numeracy for health: a cross-cultural comparison
with probabilistic national samples," Arch. Intern. Med., vol. 170, pp. 462-8, Mar. 2010.

[5] S. Neuner-Jehle et al., "How do family physicians communicate about cardiovascular risk?
Frequencies and determinants of different communication formats," BMC Family Practice, vol. 12,
p. 15, 2011.
