
12/17/23, 1:01 PM Part B - Program 1 - Jupyter Notebook

1. Python statistics module for given data set (label x, label y) (.csv or .xlsx file formats)
In [1]: import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]: df=pd.read_csv("car data.csv")
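The title mentions both .csv and .xlsx inputs, while the cell above reads only a CSV. A minimal sketch of one way to handle both follows; `load_table` is a hypothetical helper name, not part of the notebook, and the tiny sample file is made up so the snippet runs without the real data.

```python
from pathlib import Path
import tempfile

import pandas as pd


def load_table(path):
    """Hypothetical helper: read a .csv or .xlsx file into a DataFrame."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix == ".xlsx":
        # read_excel needs an Excel engine such as openpyxl to be installed
        return pd.read_excel(path)
    raise ValueError(f"unsupported file type: {suffix!r}")


# Demo on a throwaway CSV (made-up rows) so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_text("Car_Name,Year,Selling_Price\nritz,2014,3.35\nsx4,2013,4.75\n")
    demo = load_table(sample)

print(demo.shape)
```

With the notebook's data, the call would presumably be `df = load_table("car data.csv")`.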

In [3]: df

Out[3]:
     Car_Name  Year  Selling_Price  Present_Price  Kms_Driven  Fuel_Type  Seller_Type  Tra
0    ritz      2014  3.35           5.59           27000       Petrol     Dealer
1    sx4       2013  4.75           9.54           43000       Diesel     Dealer
2    ciaz      2017  7.25           9.85           6900        Petrol     Dealer
3    wagon r   2011  2.85           4.15           5200        Petrol     Dealer
4    swift     2014  4.60           6.87           42450       Diesel     Dealer
...  ...       ...   ...            ...            ...         ...        ...
296  city      2016  9.50           11.60          33988       Diesel     Dealer
297  brio      2015  4.00           5.90           60000       Petrol     Dealer
298  city      2009  3.35           11.00          87934       Petrol     Dealer
299  city      2017  11.50          12.50          9000        Diesel     Dealer
300  brio      2016  5.30           5.90           5464        Petrol     Dealer

301 rows × 9 columns

localhost:8888/notebooks/Part B - Program 1.ipynb 1/3



i. Scatter all point graph by matplotlib

In [4]: # Scatter plots are used to observe and show relationships between two nume
# Scatter plot between selling price and present price

plt.scatter(df["Selling_Price"], df["Present_Price"])
plt.xlabel("Selling Price")
plt.ylabel("Present Price")
plt.show()

ii. Calculate the mean (average) of the given data set

In [5]: #average selling price of cars


df["Selling_Price"].mean()

Out[5]: 4.661295681063123
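The section title mentions the Python statistics module, whereas the cell above uses pandas. As a rough cross-check of what the mean is, the sketch below applies the standard-library `statistics.mean` to just the first five selling prices shown in the table and compares it with sum divided by count. The five-value list is an illustrative subset, not the full column.

```python
import statistics

# First five Selling_Price values from the displayed table (subset only)
prices = [3.35, 4.75, 7.25, 2.85, 4.60]

avg = statistics.mean(prices)          # library result
manual = sum(prices) / len(prices)     # definition: sum divided by count

print(avg, manual)
```

Both values come out at about 4.56 for this subset; the full 301-row column gives the 4.66 reported above.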

iii. Calculate the median (middle value) of the given data.

In [6]: # median of kilometers driven


df["Kms_Driven"].median()

Out[6]: 32000.0
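The median is the middle value after sorting; with an even number of values it is the average of the two middle ones. A small sketch with the standard-library `statistics.median`, using a handful of Kms_Driven values from the displayed rows as an illustrative subset:

```python
import statistics

# Kms_Driven for rows 0-4 of the displayed table (odd count)
kms_odd = [27000, 43000, 6900, 5200, 42450]
# Same values plus row 296 (even count)
kms_even = kms_odd + [33988]

median_odd = statistics.median(kms_odd)    # middle of the sorted values
median_even = statistics.median(kms_even)  # mean of the two middle values

print(median_odd, median_even)  # prints: 27000 30494.0
```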

iv. Calculate the standard deviation.

In [7]: # standard deviation of selling price


df["Selling_Price"].std()

Out[7]: 5.082811556177803
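Note that pandas `Series.std()` computes the sample standard deviation by default (it divides by n - 1, i.e. `ddof=1`), whereas `numpy.std` defaults to the population form (`ddof=0`). The sketch below shows the difference with the standard-library equivalents `statistics.stdev` and `statistics.pstdev` on the first five selling prices from the table, used here only as an illustrative subset.

```python
import statistics

# First five Selling_Price values from the displayed table (subset only)
prices = [3.35, 4.75, 7.25, 2.85, 4.60]

sample_sd = statistics.stdev(prices)       # divides by n - 1, like pandas .std()
population_sd = statistics.pstdev(prices)  # divides by n, like numpy.std default

print(sample_sd, population_sd)
```

The sample value is always the larger of the two, and the gap shrinks as the number of rows grows.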


v. Calculate the variance.

In [8]: # variance of kilometers driven


df["Kms_Driven"].var()

Out[8]: 1512189738.0574305
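The variance is the square of the standard deviation, so its units are squared as well (km² here); taking the square root of the value above gives a spread of roughly 38,900 km. A minimal sketch of that relationship with the standard-library `statistics` module, on five Kms_Driven values from the displayed rows (an illustrative subset, not the full column):

```python
import math
import statistics

# Kms_Driven for rows 0-4 of the displayed table (subset only)
kms = [27000, 43000, 6900, 5200, 42450]

var_kms = statistics.variance(kms)  # sample variance, divides by n - 1
sd_kms = statistics.stdev(kms)      # sample standard deviation

# The square root of the variance matches the standard deviation
print(math.sqrt(var_kms), sd_kms)
```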

vi. Calculate the slope between points

In [9]: m,b = np.polyfit(df["Selling_Price"], df["Present_Price"],1)


# returns slope(m) and intercept(b)
# The last parameter of the function specifies the degree of the function,

In [10]: m

Out[10]: 1.4948471869862356
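For a degree-1 fit, the least-squares slope equals the covariance of x and y divided by the variance of x, and the intercept follows from the two means. The sketch below spells that formula out in plain Python; `fit_line` is a hypothetical helper and the data points are made up to lie exactly on y = 2x + 1. It is meant to illustrate the result, not how `np.polyfit` is implemented internally.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Made-up points on the line y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # prints: 2.0 1.0
```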

vii. Draw regression line

In [11]: r = m*df["Selling_Price"] + b

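In [12]: # Scatter plot with the fitted regression line overlaid in red
plt.scatter(df["Selling_Price"],df["Present_Price"])
plt.plot(df["Selling_Price"], r, color='red')
plt.xlabel("Selling Price")
plt.ylabel("Present Price")
plt.show()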
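Because the fitted model is a straight line, two points are enough to draw it. The sketch below computes the endpoints at the smallest and largest x value; `line_endpoints` is a hypothetical helper, and the slope and intercept passed in are made-up values, not the ones fitted in the notebook.

```python
def line_endpoints(xs, m, b):
    """Return the two points where the line y = m*x + b spans the data."""
    x_lo, x_hi = min(xs), max(xs)
    return (x_lo, m * x_lo + b), (x_hi, m * x_hi + b)


# First five Selling_Price values from the table; m and b are illustrative only
xs = [3.35, 4.75, 7.25, 2.85, 4.60]
start, end = line_endpoints(xs, m=2.0, b=1.0)

print(start, end)
```

The two points could then be passed to `plt.plot([start[0], end[0]], [start[1], end[1]], color='red')` instead of plotting one fitted value per row.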
