Internship Project


Name: Sagar Raj

Workshop Report: Data analysts

UID: 22BCA10525

Year: 2nd year (4th semester)

Submitted To:
Bachelor in Computer Application
(Department)
DECLARATION

I hereby certify that the work presented in the report entitled “Data
Analysts”, in fulfilment of the requirement for completion of one-month
industrial training in the Department of Bachelor in Computer Application of
“Chandigarh University”, is an authentic record of my own work carried out
during industrial training.

Sagar Raj
22BCA10525
BCA-4th sem.
ACKNOWLEDGEMENT

The work in this report is the outcome of continuous work over a period and
drew intellectual support from Binary Tree and other sources. I would like to
express my profound gratitude and indebtedness to Binary Tree, who helped me
complete the training. I am thankful to Binary Tree Associates for teaching
and assisting me in making the training successful.

Sagar Raj
22BCA10525
BCA-4th sem.
APPLICATION
Data analytics and big data are significantly transforming businesses, reshaping
daily operations, financial analysis, and customer interactions. The immense
value derived from data insights is clear, but understanding the specific
applications can sometimes be challenging. Let's explore some examples.

In today's data-rich era, almost everyone generates vast amounts of data daily,
often unknowingly. This digital footprint reveals patterns in our online
activities. For instance, when you search for or purchase a product on Amazon,
you'll notice personalized recommendations based on your search history. This
type of system, known as a recommendation engine, is a common application
of data analytics. Companies like Amazon, Netflix, and Spotify use algorithms to
provide specific recommendations derived from customer preferences and
historical behavior. Personal assistants like Siri on Apple devices utilize data
analytics to generate answers to a myriad of user queries. Google monitors
your online activities, shopping habits, and social media interactions, analyzing
this data to suggest restaurants, bars, shops, and other attractions based on
your location and behavior. Wearable devices like Fitbits, Apple Watches, and
Android watches contribute data about your activity levels, sleep patterns, and
heart rate.

Now, let's examine how data analytics is impacting business. In
2011, McKinsey & Company predicted that data analytics would become a key
competitive advantage, driving new waves of productivity, growth, and
innovation. In 2013, UPS announced it was using data from customers, drivers,
and vehicles to develop a new route guidance system aimed at saving time,
money, and fuel. Initiatives like this support the notion that data analytics will
fundamentally change how businesses compete and operate.

How does a firm gain a competitive edge? Take Netflix, for example. Netflix
collects and analyzes massive amounts of data from millions of users, including
viewing times, when people pause, rewind, or fast-forward, and the directors
and actors they search for. By analyzing users' preferences for certain
directors and actors and identifying popular combinations, Netflix can predict
the success of a show before filming begins. For example, Netflix knew that
many users had streamed content by David Fincher and that films featuring
Robin Wright were popular. They also noted the success of the British version
of House of Cards. This comprehensive analysis assured them of the new show's
potential success.
Understanding Standard Libraries in Python
Pandas
Pandas is one of the most widely used Python libraries for data manipulation
and analysis. It is an open-source, third-party library (not part of Python's
standard library) built specifically for these tasks.
Pandas provides features like:
• Dataset joining and merging
• Column insertion and deletion in data structures
• Data filtering
• Reshaping datasets
• DataFrame objects to manipulate data, and much more!
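
The features listed above can be sketched in a few lines. This is a minimal illustration rather than code from the training itself: the two tables, their column names, and the 18% tax rate are all invented for the example.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
customers = pd.DataFrame({"customer_id": [1, 2, 3],
                          "name": ["Asha", "Ben", "Chen"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3],
                       "amount": [250, 100, 400]})

# Joining and merging: combine the two tables on their shared key.
merged = customers.merge(orders, on="customer_id", how="inner")

# Column insertion and deletion.
merged["amount_with_tax"] = merged["amount"] * 1.18  # assumed 18% rate
merged = merged.drop(columns=["amount_with_tax"])

# Filtering: keep only the rows where the amount exceeds 200.
large_orders = merged[merged["amount"] > 200]

# Reshaping: one row per customer, one column per name, summed amounts.
totals = merged.pivot_table(index="customer_id", columns="name",
                            values="amount", aggfunc="sum")

print(large_orders)
print(totals)
```

The inner join keeps only customers who have at least one order, so the customer with no orders does not appear in the merged table.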

Reading a CSV File in Python

A CSV (Comma-Separated Values) file is a plain text document that uses a
simple format to organize tabular information. It is a delimited text file in
which commas separate the values. Every row in the file is a data record, and
each record consists of one or more fields separated by commas. It is one of
the most common file formats for importing and exporting spreadsheet and
database data.
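
To make the record-and-field structure concrete, here is a small sketch that parses CSV text held in memory. The sample text and its field names are invented for illustration; only the csv and io modules from Python's standard library are used.

```python
import csv
import io

# Hypothetical CSV text, invented for illustration:
# one header row followed by two data records.
sample_text = "name,position,number\nAlex,Pitcher,12\nSam,Catcher,7\n"

# io.StringIO lets the csv module read the string as if it were a file.
reader = csv.reader(io.StringIO(sample_text))
header = next(reader)   # the first row holds the field names
records = list(reader)  # the remaining rows are the data records

print(header)
print(records)
```

Every field comes back as a string, including the numbers, so any type conversion has to be done after reading.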

• Using csv.reader(): First, the CSV file is opened with the open() function
in 'r' mode (read mode), which returns a file object. The file object is then
passed to the reader() function of the csv module, which returns a reader
object that iterates over the lines of the specified CSV file.
Note: The 'with' keyword is used with open() because it simplifies exception
handling and automatically closes the CSV file.
import csv

# opening the CSV file
with open('Giants.csv', mode='r', newline='') as file:
    # reading the CSV file
    csvFile = csv.reader(file)
    # displaying the contents of the CSV file
    for lines in csvFile:
        print(lines)

• Using pandas.read_csv(): Reading a CSV file with the pandas library is
simple. The read_csv() function of pandas reads the data from a CSV file and
returns it as a DataFrame.

import pandas

# reading the CSV file
csvFile = pandas.read_csv('Giants.csv')
# displaying the contents of the CSV file
print(csvFile)

Data Frames and basic operations with Data Frames:
A Pandas DataFrame is a two-dimensional, size-mutable, potentially
heterogeneous tabular data structure with labeled axes (rows and columns).
Data is aligned in a tabular fashion in rows and columns. A Pandas DataFrame
consists of three principal components: the data, the rows, and the columns.
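
A short sketch can show the three components and a few basic operations. The names, scores, row labels, and the pass mark of 80 are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame(
    {"name": ["Asha", "Ben", "Chen"], "score": [82, 91, 77]},
    index=["r1", "r2", "r3"],
)

# The three components: row labels, column labels, and the data itself.
print(df.index.tolist())       # row labels
print(df.columns.tolist())     # column labels
print(df.to_numpy().tolist())  # underlying data

# Size-mutable: a column can be added after creation.
df["passed"] = df["score"] >= 80  # assumed pass mark of 80

# Heterogeneous: each column keeps its own data type.
print(df.dtypes)

# Basic selection: one column by name, one row by label.
scores = df["score"]
row_r2 = df.loc["r2"]
```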

Operations Performed
• Data Collection: Describe how you collected data, the sources of data, and
the format in which the data was obtained.
• Data Cleaning: Outline the steps taken to clean the data using Pandas
(handling missing values, correcting data types, removing duplicates, etc.).
• Data Analysis: Explain the analysis performed using Pandas, such as
filtering, grouping, and summarizing data.
• Visualization: Describe any data visualizations created using Pandas or
other libraries like Matplotlib or Seaborn.
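
The cleaning and analysis steps named above could look roughly like the following sketch. It is not the dataset or code used in the training: the city and sales values are invented, and dropping rows with missing values is only one of several possible ways to handle them.

```python
import pandas as pd

# Hypothetical raw data, invented for illustration only.
raw = pd.DataFrame({
    "city": ["Delhi", "Delhi", "Mumbai", "Mumbai", "Pune"],
    "sales": ["100", "100", "250", None, "80"],
})

# Data cleaning
clean = raw.drop_duplicates()                  # remove the repeated Delhi row
clean = clean.dropna(subset=["sales"])         # drop rows with missing sales
clean = clean.assign(sales=pd.to_numeric(clean["sales"]))  # text to numbers

# Data analysis
high = clean[clean["sales"] >= 100]             # filtering
summary = clean.groupby("city")["sales"].sum()  # grouping and summarizing

print(summary)
```
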
Sample Operations and Challenges:
Operations Performed:
1. Loading Data:
2. Data Cleaning:
3. Data Analysis:
4. Visualization:
