Simple Linear Regression Lab II

Dataset (After All Preprocessing)

Years Of Experience Salary

1.1 39343
1.3 46205
1.5 37731
2 43525
2.2 39891
2.9 56642
3 60150
3.2 54445
3.2 64445
3.7 57189
3.9 63218
4 55794
4 56957
4.1 57081
4.5 61111
4.9 67938
5.1 66029
5.3 83088
5.9 81363
6 93940
6.8 91738
7.1 98273
7.9 101302
8.2 113812
8.7 109431
9 105582
9.5 116969
9.6 112635
10.3 122391
10.5 121872
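The script below reads this table from a file named Salary_Data.csv. As a minimal sketch of how that file could be produced from the rows above, the following uses only the standard library. The header names YearsExperience and Salary, and the helper names salary_csv_text and write_salary_csv, are assumptions made for illustration; the lab does not show the file's actual header row. Some header row is needed, because pd.read_csv treats the first line as column names by default.

```python
import csv
import io

# Hypothetical column names: the lab does not show the real CSV header.
HEADER = ["YearsExperience", "Salary"]

# The 30 rows of the table above, as (years of experience, salary).
ROWS = [
    (1.1, 39343), (1.3, 46205), (1.5, 37731), (2.0, 43525), (2.2, 39891),
    (2.9, 56642), (3.0, 60150), (3.2, 54445), (3.2, 64445), (3.7, 57189),
    (3.9, 63218), (4.0, 55794), (4.0, 56957), (4.1, 57081), (4.5, 61111),
    (4.9, 67938), (5.1, 66029), (5.3, 83088), (5.9, 81363), (6.0, 93940),
    (6.8, 91738), (7.1, 98273), (7.9, 101302), (8.2, 113812), (8.7, 109431),
    (9.0, 105582), (9.5, 116969), (9.6, 112635), (10.3, 122391), (10.5, 121872),
]


def salary_csv_text():
    """Return the table as CSV text, with the header row first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(ROWS)
    return buffer.getvalue()


def write_salary_csv(path="Salary_Data.csv"):
    """Write the CSV text to disk so that pd.read_csv(path) can load it."""
    with open(path, "w", newline="") as handle:
        handle.write(salary_csv_text())
```

Calling write_salary_csv() once in the working directory should be enough for the pd.read_csv('Salary_Data.csv') line in the script to find the data.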
Complete Code
# Simple Linear Regression

# Importing the libraries

import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

# Importing the dataset

dataset = pd.read_csv('Salary_Data.csv')

X = dataset.iloc[:, :-1].values

# all columns except last column

y = dataset.iloc[:, -1].values

# here -1 means last column

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 1/3, random_state = 0)

# Training the Simple Linear Regression model on the Training set

from sklearn.linear_model import LinearRegression

regressor = LinearRegression()

regressor.fit(X_train, y_train)

# Predicting the Test set results

y_pred = regressor.predict(X_test)

# compare y_test with y_pred here

# y_test contains the real salaries and y_pred contains the predicted salaries

# Visualising the Training set results
plt.scatter(X_train, y_train, color = 'red')

# x-axis is years of experience and y-axis is salary; each observation is drawn as a red point

# the red points are the real (observed) values

plt.plot(X_train, regressor.predict(X_train), color = 'blue')

# the y coordinates are the model's predictions for X_train, drawn in blue

# the predicted values form the blue regression line

plt.title('Salary vs Experience (Training set)')

# Title of graph

plt.xlabel('Years of Experience')

# x-axis label

plt.ylabel('Salary')

# y-axis label

plt.show()

# here real values are red and predicted values are blue

# Visualising the Test set results
plt.scatter(X_test, y_test, color = 'red')

plt.plot(X_train, regressor.predict(X_train), color = 'blue')

plt.title('Salary vs Experience (Test set)')

plt.xlabel('Years of Experience')


plt.ylabel('Salary')

plt.show()

# the blue line is the same as in the previous graph, but the red points come from the test set
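
With a single feature, the ordinary least squares line that LinearRegression fits can also be computed by hand: the slope is the sum of (x - mean x)(y - mean y) divided by the sum of (x - mean x)^2, and the intercept is mean y minus slope times mean x. The following is a minimal sketch in plain Python, not the lab's own code. The helper names fit_line and r_squared are hypothetical, and the sketch uses only five rows picked from the table for illustration, so its numbers will not match those of the regressor trained on X_train.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Return the coefficient of determination of the line on the given points."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Five rows taken from the table above (illustrative subset, not the train split).
experience = [1.1, 3.0, 5.1, 7.9, 10.5]
salary = [39343, 60150, 66029, 101302, 121872]

slope, intercept = fit_line(experience, salary)
print("salary increase per year of experience:", round(slope, 2))
print("predicted salary at 0 years:", round(intercept, 2))
print("R^2 on these points:", round(r_squared(experience, salary, slope, intercept), 4))
```

On the fitted regressor itself, the same two quantities are available as regressor.coef_ and regressor.intercept_, and regressor.score(X_test, y_test) returns R^2 on the test set.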

