Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 2

LINEAR REGRESSION USING MULTIPLE VARIABLES / MULTIPLE LINEAR REGRESSION

Here, we take more than 1 independent variables (x1, x2, x3, … etc).

The dependent variable y = m1x1 + m2x2 + m3x3 + … + b

Problem: We are given Home prices in Monroe Township, NJ (USA). We should predict the prices
for the following homes:
a) 3000 sft area, 3 bed rooms, 40 years old
b) 2500 sft area, 4 bed rooms, 5 years old

dataset
homeprices.csv

Note: Since the one piece of data in no. of bedrooms missing in the dataset, we have to clean the
data. We want to find median of that column and substitute it in the missing cell.

Linear equation

price = m1* area + m2 * bedrooms + m3 * age + b

program

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model

# load the data into dataframe


df = pd.read_csv("E://ds/2-multiple-regression/homeprices.csv")
df

# fill the missing data (NaN) with median of bedrooms


import math
median_bedrooms = math.floor(df.bedrooms.median())
median_bedrooms # 3
# fill the missing data (NaN columns) with this median value
df.bedrooms = df.bedrooms.fillna(median_bedrooms)
df

# create linear regression model


# take the independent vars first and take dependent var next.
reg = linear_model.LinearRegression()
reg.fit(df[['area', 'bedrooms', 'age']], df['price']) # fitting means training

# print coefficients,i.e. m1, m2, m3 values


reg.coef_ # 137.25, -26025. , -6825.

# intercept
reg.intercept_ # 383725.

# predict the price of 3000 sft area, 3 bed rooms, 40 years old house
reg.predict([[3000, 3, 40]]) # 444400.

# predict the price of 2500 sft area, 4 bed rooms, 5 years old house
reg.predict([[2500, 4, 5]]) # 588625.

Task on Multiple Linear Regression

hiring.csv file contains hiring statics for a firm such as experience of candidate, his written test score
and personal interview score. Based on these 3 factors, HR will decide the salary. Given this data,
you need to build a machine learning model for HR department that can help them decide salaries
for future candidates. Using this predict salaries for following candidates,

a) 2 yr experience, 9 test score, 6 interview score

b) 12 yr experience, 10 test score, 10 interview score

dataset
hiring.csv.

You might also like