Certificate

Rayat Shikshan Sanstha's
KARMAVEER BHAURAO PATIL COLLEGE, VASHI
[Autonomous College]
Reaccredited by NAAC with Grade 'A+' (CGPA 3.53) | ISO 9001:2008 Certified Institute
'Best College' Award by University of Mumbai
[ DEPARTMENT OF INFORMATION TECHNOLOGY ]
CERTIFICATE
This Is To Certify That

Mr. Tanmay Chandrakant Mane
Student of T.Y.B.Sc.IT. Class From Karmaveer Bhaurao Patil College,
Vashi [Autonomous], Navi Mumbai Has Satisfactorily Completed The
Practical Course In Subject DATA ANALYSIS AND VISUALIZATION As
per The Syllabus Laid By The University Of Mumbai During The Academic
Year 2023-24.
ROLL NO.: 237802
EXAM NO.: 237802
SAHIL VICHARE MADHURI GABHANE

Date: / /2023 Head, Department of IT
External Examiner
INDEX
Sr. No.    Practical Name    Sign
1. a. Print "Hello, Data Analysis!" using Python's print function.
   b. Declare variables for your age, name, and favourite colour.
2. a. Use pandas to create a simple DataFrame with some sample data.
   b. Create a small dataset with missing values using pandas and display it.
3. a. Calculate the sum of numbers from 1 to 10 using a loop.
   b. Create a DataFrame with duplicate rows and remove them using pandas.
4. a. Plot a basic line graph showing the population growth over years.
   b. Create a bar chart that displays the sales of different products.
5. Generate a scatter plot for two variables showing their relationship.
6. Generate a box plot to visualise the distribution of exam scores.
7. Create a scatter plot to explore the relationship between two numeric variables.
8. a. Calculate and visualise the mean and median of a dataset using pandas and Matplotlib.
   b. Choose a small dataset and create a bar chart to show different categories.
9. Plot a line graph showing the temperature variation over days.
Name : Tanmay Mane
Roll No : 237802
Batch : B2
PRACTICAL NO.: 1(A)
Aim: Print "Hello, Data Analysis!" using Python's print function.
Program :
print("Hello, Data Analysis!")
Output :
Explanation :
The print() function writes the given message to the screen (standard output), followed by a newline.
PRACTICAL NO.: 1(B)
Aim : Declare variables for your age, name, and favourite colour.
Program :
age = 20
name = 'abc'
fav_col = 'red'
print('Age:', age)
print('Name:', name)
print('Favourite colour:', fav_col)
Output :
Explanation :
In the above program,
the variable age has the integer (int) data type and stores the age.
The variable name has the string (str) data type and stores the name.
The variable fav_col has the string (str) data type and stores the favourite colour.
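The data types described above can be checked at run time. The following is a minimal sketch using Python's built-in type() function; the helper names age_type, name_type and fav_col_type are introduced here only for illustration.

```python
age = 20
name = 'abc'
fav_col = 'red'

# type() returns the class of the object a variable currently refers to;
# __name__ gives that class's short name as a string.
age_type = type(age).__name__
name_type = type(name).__name__
fav_col_type = type(fav_col).__name__

print('age is of type:', age_type)
print('name is of type:', name_type)
print('fav_col is of type:', fav_col_type)
```

Python decides the type from the value assigned, so no separate type declaration is needed.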
PRACTICAL NO.: 2(A)
Aim : Use pandas to create a simple DataFrame with some sample data.
Program :
import pandas as pd

data = {'Name': ['abc', 'def', 'ghi'],
        'age': [19, 20, 21],
        'city': ['kk', 'ghansoli', 'Panvel']}
df = pd.DataFrame(data)
df
Output :
Explanation :
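In the above program,
we import the pandas library under the alias pd.
We create a dictionary of sample data named 'data' and then convert it into a
DataFrame using pd.DataFrame(). Each dictionary key becomes a column name and
each list becomes the values of that column. Writing df on the last line displays
the DataFrame in a notebook; in a plain script, print(df) does the same.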
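Once the DataFrame exists, a few attributes help confirm that it was built as intended. The sketch below reuses the same sample data; the names n_rows, n_cols, column_names and ages are assumptions made for this illustration.

```python
import pandas as pd

# Same sample data as in the program above.
data = {
    'Name': ['abc', 'def', 'ghi'],
    'age': [19, 20, 21],
    'city': ['kk', 'ghansoli', 'Panvel'],
}
df = pd.DataFrame(data)

# shape is a (rows, columns) tuple.
n_rows, n_cols = df.shape
print('Rows:', n_rows, 'Columns:', n_cols)

# The dictionary keys become the column labels, in insertion order.
column_names = list(df.columns)
print('Columns:', column_names)

# Selecting a single column returns a pandas Series.
ages = df['age'].tolist()
print('Ages:', ages)
```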
PRACTICAL NO.: 2(B)
Aim : Create a small dataset with missing values using pandas and
display it.
Program :
import pandas as pd
import numpy as np

data = {
    'Name': ['Omkar', 'Sid', 'Vaibhav', 'DK'],
    'Age': [22, None, 22, 28],
    'City': ['Mumbai', 'Pune', None, 'Navi Mumbai'],
    'Salary': [50000, 75000, None, 55000]
}
df = pd.DataFrame(data)
df
Output :
Explanation :
We create a small dataset with missing values by placing None in the dictionary.
None is Python's object for "no value", and pandas treats it as missing data; in
numeric columns such as 'Age' and 'Salary' it is stored and displayed as NaN.
In this example, we have intentionally set some values to None to represent
missing data in the 'Age', 'City' and 'Salary' columns.
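Displaying the DataFrame shows where the gaps are, but they can also be counted. The following is a minimal sketch, assuming the same sample data, that uses the pandas isna() method; the names missing_per_column and rows_with_missing are introduced here only for illustration.

```python
import pandas as pd

data = {
    'Name': ['Omkar', 'Sid', 'Vaibhav', 'DK'],
    'Age': [22, None, 22, 28],
    'City': ['Mumbai', 'Pune', None, 'Navi Mumbai'],
    'Salary': [50000, 75000, None, 55000],
}
df = pd.DataFrame(data)

# isna() marks each missing cell as True; sum() then counts them per column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Keep only the rows that contain at least one missing value.
rows_with_missing = df[df.isna().any(axis=1)]
print(rows_with_missing)
```

Counting missing values first is a common step before deciding whether to fill them or drop the affected rows.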
PRACTICAL NO.: 3(A)
Aim : Calculate the sum of numbers from 1 to 10 using a loop.

Program :
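total = 0  # named 'total' so that the built-in sum() is not shadowed
for i in range(1, 11):  # range(1, 11) yields the integers 1 to 10
    total = total + i
print('Sum of first 10 numbers:', total)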
Output :
Explanation :
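In the above program,
a running total starts at 0. The for loop iterates over range(1, 11), which
yields the integers 1 to 10 (the end value 11 is excluded), and adds each one to
the running total. After the loop ends, the total is 55, which is printed.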
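The loop result can be cross-checked in two independent ways. This is a small sketch, not part of the original practical; the names builtin_total and formula_total are assumptions made for illustration.

```python
# Sum of 1..10 with an explicit loop, as in the practical.
total = 0
for i in range(1, 11):  # range(1, 11) yields 1, 2, ..., 10
    total = total + i

# Cross-check 1: Python's built-in sum() over the same range.
builtin_total = sum(range(1, 11))

# Cross-check 2: the closed form n(n+1)/2 for the first n natural numbers.
n = 10
formula_total = n * (n + 1) // 2

print(total, builtin_total, formula_total)
```

All three values agree, which confirms that the loop bounds are correct.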
PRACTICAL NO.: 3(B)
Aim : Create a DataFrame with duplicate rows and remove them using
pandas.
Program :
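import pandas as pd

# The last row repeats the first one so that there is a duplicate to remove.
data = {'Name': ['omkar', 'yash', 'DK', 'vaibhav', 'omkar'],
        'age': [20, 21, 20, 21, 20],
        'city': ['Ghansoli', 'Dombivali', 'Airoli', 'koparkhairane', 'Ghansoli']}
df = pd.DataFrame(data)
print('Original DataFrame:')
print(df)

# drop_duplicates() keeps the first occurrence of each repeated row.
df_unique = df.drop_duplicates()
print('DataFrame after removing duplicates:')
print(df_unique)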
Output :
Explanation :
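In the above example,
we create a DataFrame with duplicate rows across the 'Name', 'age' and 'city'
columns. A row counts as a duplicate only when it matches an earlier row in
every column; pandas removes such rows with the drop_duplicates() method.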
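Before removing duplicates it can help to see which rows pandas considers repeated. The sketch below uses a hypothetical four-row sample in which the last row repeats the first; the names duplicate_flags, unique_rows and unique_ages are introduced only for this illustration.

```python
import pandas as pd

# Hypothetical sample: the last row repeats the first one exactly.
data = {
    'Name': ['omkar', 'yash', 'DK', 'omkar'],
    'age': [20, 21, 20, 20],
    'city': ['Ghansoli', 'Dombivali', 'Airoli', 'Ghansoli'],
}
df = pd.DataFrame(data)

# duplicated() flags every repeat of an earlier row as True.
duplicate_flags = df.duplicated()
print(duplicate_flags.tolist())

# drop_duplicates() returns a new DataFrame; keep='first' is the default.
unique_rows = df.drop_duplicates()
print(unique_rows)

# subset= limits the comparison to the listed columns only.
unique_ages = df.drop_duplicates(subset=['age'])
print(unique_ages)
```

With subset=['age'], rows are compared on the 'age' column alone, so only the first row for each distinct age is kept.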
PRACTICAL NO.: 4(A)
Aim : Plot a basic line graph showing the population growth over years.
Program :
import matplotlib.pyplot as plt
years=[2018,2019,2020,2021,2022]
population=[7.1,6.4,7.8,8.3,7.5]
plt.plot(years,population,marker='o',linestyle='-')
plt.xlabel('Year')
plt.ylabel('population(billions)')
plt.title('population growth over years')
plt.grid(True)
plt.show()
Output :
Explanation :
We define the years and the corresponding population data.
We then use plt.plot() to create a line graph with years on the x-axis and
population on the y-axis. We use marker='o' to add a marker at each data point
and linestyle='-' to connect the markers with a solid line. Finally, we label the
x-axis and y-axis, add a title, enable the grid, and display the plot with plt.show().
PRACTICAL NO.: 4(B)
Aim : Create a bar chart that displays the sales of different products.
Program :
import matplotlib.pyplot as plt
products=['product A','product B','product C', 'product D']
sales=[2000,6700,3590,1200]
plt.bar(products,sales)
plt.xlabel('Products')
plt.ylabel('sales(units)')
plt.title('Sales of different products')
plt.xticks()
plt.tight_layout()
plt.show()
Output :
PRACTICAL NO.: 5
Aim : Generate a scatter plot for two variables showing their relationship.
Program :
import matplotlib.pyplot as plt
variable1=[1,2,3,4,5,6,7,8,9,10]
variable2=[2,3,5,7,11,13,17,19,23,29]
plt.scatter(variable1,variable2,label='Data points',color='blue',marker='o')
plt.xlabel('Variable 1')
plt.ylabel('Variable 2')
plt.title('Scatter plot of Variable 1 vs. Variable 2')
plt.legend()
plt.grid(True)
plt.show()
Output:
PRACTICAL NO.: 6
Aim : Generate a box plot to visualise the distribution of exam scores.
Program :
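import matplotlib.pyplot as plt
import seaborn as sns

exam_score = [78, 85, 90, 88, 92, 75, 82, 95, 88, 76, 89, 93, 80]

# Pass the scores as y so the box is vertical and matches the y-axis label.
sns.boxplot(y=exam_score, color="lightblue")
plt.ylabel('Exam scores')
plt.title('Box plot of exam scores')
plt.grid(axis='y', linestyle='-', alpha=0.7)
plt.show()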
PRACTICAL NO.: 7
Aim : Create a scatter plot to explore the relationship between two numeric
variables.
Program :
import matplotlib.pyplot as plt
variable1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

variable2 = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
plt.scatter(variable1, variable2, color='blue', marker='o')
plt.xlabel('Variable 1')
plt.ylabel('Variable 2')
plt.title('Scatter Plot of Variable 1 vs. Variable 2')
plt.grid(True)
plt.show()
Output :
Explanation :
import matplotlib.pyplot as plt: This line imports Matplotlib's pyplot module, which is used for creating various types of plots and visualisations.
variable1 and variable2: These are two Python lists holding the data points. 'Variable 1' contains the integers 1 through 10, and 'Variable 2' contains the first ten prime numbers (2 through 29).
plt.scatter(variable1, variable2, color='blue', marker='o'): This line creates the scatter plot. Here is what each parameter does:
variable1 and variable2: These parameters specify the data to be plotted on the x and y axes, respectively. In this case, 'Variable 1' is on the x-axis and 'Variable 2' is on the y-axis.
color='blue': This sets the colour of the scatter points to blue.
marker='o': This specifies that circular markers (dots) should be used for the data points.
plt.xlabel('Variable 1') and plt.ylabel('Variable 2'): These lines set the labels for the x and y axes.
plt.title('Scatter Plot of Variable 1 vs. Variable 2'): This line sets the title of the plot.
plt.grid(True): This command adds a grid to the plot to help in reading and interpreting the data points.
plt.show(): This line displays the scatter plot on the screen. In a plain Python script this call is needed for the figure to appear.
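A scatter plot shows the relationship visually; the strength of a linear relationship can also be expressed as a number. The following is a minimal sketch, not part of the original practical, that computes the Pearson correlation coefficient for the same two lists using only the standard library. The names mean_x, mean_y, cov_xy, var_x, var_y and r are assumptions made for illustration.

```python
import math

variable1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
variable2 = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

n = len(variable1)
mean_x = sum(variable1) / n
mean_y = sum(variable2) / n

# Sum of products of deviations from the means (unnormalised covariance).
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(variable1, variable2))

# Sums of squared deviations (unnormalised variances).
var_x = sum((x - mean_x) ** 2 for x in variable1)
var_y = sum((y - mean_y) ** 2 for y in variable2)

# Pearson's r lies between -1 and 1; values near 1 indicate a strong
# positive linear relationship.
r = cov_xy / math.sqrt(var_x * var_y)
print('Pearson correlation:', round(r, 3))
```

A value close to 1 agrees with the upward trend that the points show in the scatter plot.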
PRACTICAL NO.: 8(A)
Aim : Calculate and visualise the mean and median of a dataset using pandas and Matplotlib.
Program :
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = {'Values': [12, 24, 36, 48, 60, 72, 84, 96, 108, 120]}
df = pd.DataFrame(data)

# Central tendency of the 'Values' column.
mean_value = df['Values'].mean()
median_value = df['Values'].median()
print(f"Mean: {mean_value}")
print(f"Median: {median_value}")

# Horizontal box plot with the mean and median marked at y = 1.
plt.figure(figsize=(8, 6))
plt.boxplot(df['Values'], vert=False, widths=0.4)
plt.scatter([mean_value], [1], color='red', marker='o', label='Mean')
plt.scatter([median_value], [1], color='blue', marker='x', label='Median')
plt.xlabel('Values')
plt.title('Box Plot with Mean and Median')
plt.legend()
plt.grid(True)
plt.show()
Output :
Explanation :
Imports the necessary libraries:
import pandas as pd: Imports the pandas library for data manipulation.
import numpy as np: Imports the NumPy library for numerical operations (it is not used directly in this program).
import matplotlib.pyplot as plt: Imports Matplotlib for creating plots.
Creates a dictionary 'data' containing a single column of data called 'Values' with ten values ranging from 12 to 120.
Converts the 'data' dictionary into a pandas DataFrame named 'df'.
Calculates the mean and median of the 'Values' column in the DataFrame:
mean_value = df['Values'].mean(): Calculates the mean of the 'Values' column.
median_value = df['Values'].median(): Calculates the median of the 'Values' column.
Prints the mean and median values.
Sets up a Matplotlib figure with a specific size (8 inches wide and 6 inches high) using plt.figure(figsize=(8, 6)).
Creates a box plot of the 'Values' column using plt.boxplot(df['Values'], vert=False, widths=0.4). The vert=False argument specifies that the box plot should be horizontal, and widths=0.4 sets the width of the box.
Adds two scatter points to the plot to mark the mean and median:
plt.scatter([mean_value], [1], color='red', marker='o', label='Mean'): Adds a red circular marker for the mean value.
plt.scatter([median_value], [1], color='blue', marker='x', label='Median'): Adds a blue 'x' marker for the median value.
Sets the x-axis label to 'Values' using plt.xlabel('Values').
Sets the title of the plot to 'Box Plot with Mean and Median' with plt.title('Box Plot with Mean and Median').
Adds a legend to the plot to distinguish the mean and median markers using plt.legend().
Enables the grid for the plot with plt.grid(True).
Finally, displays the plot with plt.show().
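The two statistics can also be worked out by hand for this dataset. The sketch below, an illustration rather than part of the original practical, computes them in plain Python and cross-checks the result with the standard statistics module; the names values, ordered and middle are assumptions.

```python
import statistics

values = [12, 24, 36, 48, 60, 72, 84, 96, 108, 120]

# Mean: the sum of the values divided by their count (660 / 10).
mean_value = sum(values) / len(values)

# Median: with an even count, the average of the two middle values
# of the sorted data, here (60 + 72) / 2.
ordered = sorted(values)
middle = len(ordered) // 2
median_value = (ordered[middle - 1] + ordered[middle]) / 2

print('Mean:', mean_value)
print('Median:', median_value)

# Cross-check against the standard library.
assert mean_value == statistics.mean(values)
assert median_value == statistics.median(values)
```

Because the values are evenly spaced, the mean and the median are equal, so the red and blue markers fall on the same point of the box plot.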
PRACTICAL NO.: 8(B)
Aim : Choose a small dataset and create a bar chart to show different categories.
Program :
import pandas as pd
import matplotlib.pyplot as plt

data = {
    'Product': ['Product A', 'Product B', 'Product C', 'Product D'],
    'Sales': [1200, 850, 950, 1100]
}
df = pd.DataFrame(data)

plt.figure(figsize=(8, 6))
plt.bar(df['Product'], df['Sales'], color='skyblue')
plt.xlabel('Product')
plt.ylabel('Sales')
plt.title('Product Sales')
plt.xticks(rotation=15)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
Output :
Explanation :
Import the necessary libraries:
import pandas as pd: Imports the pandas library for data manipulation.
import matplotlib.pyplot as plt: Imports Matplotlib for creating plots.
Create a dictionary 'data' with two columns, 'Product' and 'Sales', representing the product names and their corresponding sales values.
Convert the 'data' dictionary into a pandas DataFrame named 'df'.
Set up a Matplotlib figure with a specific size (8 inches wide and 6 inches high) using plt.figure(figsize=(8, 6)).
Create a bar chart using plt.bar(df['Product'], df['Sales'], color='skyblue'):
df['Product'] is used as the x-axis (product names).
df['Sales'] is used as the y-axis (sales values).
color='skyblue' sets the colour of the bars to sky blue.
Set the x-axis label to 'Product' with plt.xlabel('Product').
Set the y-axis label to 'Sales' with plt.ylabel('Sales').
Set the title of the plot to 'Product Sales' using plt.title('Product Sales').
Rotate the x-axis labels by 15 degrees for better readability with plt.xticks(rotation=15).
Enable gridlines on the y-axis with a dashed line style and an opacity of 0.7 using plt.grid(axis='y', linestyle='--', alpha=0.7).
Finally, display the bar chart with plt.show().
PRACTICAL NO.: 9
Aim : Plot a line graph showing the temperature variation over days.
Program :
import pandas as pd
import matplotlib.pyplot as plt

days = ['Day 1', 'Day 2', 'Day 3', 'Day 4', 'Day 5']
temperatures = [75, 78, 82, 79, 83]
data = pd.DataFrame({'Day': days, 'Temperature': temperatures})

plt.figure(figsize=(8, 6))
plt.plot(data['Day'], data['Temperature'], marker='o', linestyle='-', color='b',
         markersize=8)
plt.title('Temperature Variation Over Days')
plt.xlabel('Days')
plt.ylabel('Temperature (°F)')
plt.grid(True)
plt.show()
Output :
Explanation :
Import the necessary libraries:
import pandas as pd: Imports the pandas library for data manipulation.
import matplotlib.pyplot as plt: Imports the Matplotlib library for creating plots.
Create two lists:
days: A list of strings representing the days, from 'Day 1' to 'Day 5'.
temperatures: A list of numerical values representing the temperature in degrees Fahrenheit for each respective day.
Create a pandas DataFrame named 'data' by combining the 'days' and 'temperatures' lists:
'Day': The 'days' list becomes the 'Day' column in the DataFrame.
'Temperature': The 'temperatures' list becomes the 'Temperature' column in the DataFrame.
Set up a Matplotlib figure with a specific size (8 inches wide and 6 inches high) using plt.figure(figsize=(8, 6)).
Create a line plot using plt.plot(data['Day'], data['Temperature'], marker='o', linestyle='-', color='b', markersize=8):
data['Day'] is used as the x-axis, representing the days.
data['Temperature'] is used as the y-axis, representing the temperature values.
marker='o' specifies that circular markers should be placed at data points.
linestyle='-' specifies that the line connecting the data points should be solid.
color='b' sets the colour of the line and markers to blue.
markersize=8 sets the size of the markers to 8 points.
Set the title of the plot to 'Temperature Variation Over Days' using plt.title('Temperature Variation Over Days').
Set the x-axis label to 'Days' with plt.xlabel('Days').
Set the y-axis label to 'Temperature (°F)' with plt.ylabel('Temperature (°F)').
Enable grid lines on the plot with plt.grid(True).
Finally, display the line plot using plt.show().