Source: py/ML/1_linear_reg/1_linear_regression.ipynb at master, codebasics/py (GitHub), printed 2/23/23, 11:48 AM
https://github.com/codebasics/py/blob/master/ML/1_linear_reg/1_linear_regression.ipynb
Last commit by dhavalsays: Fixed issues in linear regression tutorial

Machine Learning With Python: Linear Regression With One Variable

Sample problem of predicting home prices in Monroe, New Jersey (USA)

The table below lists current home prices in Monroe Township, New Jersey, by area in square feet.

    area    price
    2600    550000
    3000    565000
    3200    610000
    3600    680000
    4000    725000

Problem Statement

Given the data above, build a machine learning model that can predict home prices based on area in square feet.

You can represent the values in the table as a scatter plot (values are shown as red markers). After that, you can draw a straight line that best fits the values on the chart.

[Figure: scatter plot of price (y axis) against area (x axis) with a best-fit straight line]

You can draw many lines like this, but we choose the one for which the total sum of squared errors is minimum.

[Figure: the same scatter plot with the error between each point and the line marked; annotation: minimize the sum of squared errors]

You might remember the linear equation from your high school math class. Home prices can be represented by the following equation:

    home price = m * (area) + b

The generic form of the same equation is:

    y = m*x + b

where m is the slope (or gradient) and b is the intercept.

Reference: http://www.mathsisfun.com/algebra/linear-equations.html

    import pandas as pd
    import numpy as np
    from sklearn import linear_model
    import matplotlib.pyplot as plt

    # Load the training data (columns: area, price)
    df = pd.read_csv('homeprices.csv')
    df

       area   price
    0  2600  550000
    1  3000  565000
    2  3200  610000
    3  3600  680000
    4  4000  725000

    %matplotlib inline
    plt.xlabel('area')
    plt.ylabel('price')
    plt.scatter(df.area, df.price, color='red', marker='+')

[Figure: scatter plot of price against area, red '+' markers]

    # Feature matrix: every column except the target
    new_df = df.drop('price', axis='columns')
    new_df

       area
    0  2600
    1  3000
    2  3200
    3  3600
    4  4000

    # Target values
    price = df.price
    price

    0    550000
    1    565000
    2    610000
    3    680000
    4    725000
    Name: price, dtype: int64

    # Create linear regression object and fit it to the data
    reg = linear_model.LinearRegression()
    reg.fit(new_df, price)

    LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

(1) Predict price of a home with area = 3300 sqr ft

    reg.predict([[3300]])

    array([628715.75342466])

    reg.coef_

    array([135.78767123])

    reg.intercept_

    180616.43835616432

Y = m * X + b (m is the coefficient and b is the intercept)

    3300*135.78767123 + 180616.43835616432

    628715.7534151643

(2) Predict price of a home with area = 5000 sqr ft

    reg.predict([[5000]])

    array([859554.79452055])

Generate CSV file with list of home price predictions

    area_df = pd.read_csv("areas.csv")
    area_df.head(3)

       area
    0  1000
    1  1500
    2  2300

    p = reg.predict(area_df)
    p

    array([ 316404.10958904,  384297.94520548,  492928.08219178,
            661304.79452055,  740061.64383562,  799808.21917808,
            926090.75342466,  650441.78082192,  825607.87671233,
            492928.08219178, 1402705.47945205, 1348390.4109589 ,
           1144708.90410959])

    # Attach the predictions as a new column
    area_df['prices'] = p
    area_df

        area        prices
    0   1000  3.164041e+05
    1   1500  3.842979e+05
    2   2300  4.929281e+05
    3   3540  6.613048e+05
    4   4120  7.400616e+05
    5   4560  7.998082e+05
    6   5490  9.260908e+05
    7   3460  6.504418e+05
    8   4750  8.256079e+05
    9   2300  4.929281e+05
    10  9000  1.402705e+06
    11  8600  1.348390e+06
    12  7100  1.144709e+06

    area_df.to_csv("prediction.csv")

Exercise

Predict Canada's per capita income in the year 2020. There is an exercise folder on GitHub at the same level as this notebook; download it and you will find the canada_per_capita_income.csv file. Using this file, build a regression model and predict the per capita income for Canadian citizens in the year 2020.

Answer

41288.69409442
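
As a cross-check on the tutorial model (this is separate from the exercise), the slope and intercept reported by reg.coef_ and reg.intercept_ can be reproduced by hand. The sketch below assumes ordinary least squares with a single feature, which is what LinearRegression fits, and uses the textbook closed-form formulas for m and b. The helper name fit_line is hypothetical and is not part of the notebook or of scikit-learn; the data is the five-row home price table.

```python
# Areas (sq ft) and prices copied from the home price table in this notebook.
areas = [2600, 3000, 3200, 3600, 4000]
prices = [550000, 565000, 610000, 680000, 725000]


def fit_line(xs, ys):
    """Return (m, b) of the least-squares line y = m*x + b for one feature.

    Hypothetical helper for illustration only; it is not a scikit-learn call.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - m * mean_x
    return m, b


m, b = fit_line(areas, prices)
print("slope:", m)
print("intercept:", b)
print("price for 3300 sq ft:", m * 3300 + b)
print("price for 5000 sq ft:", m * 5000 + b)
```

The printed slope and intercept should agree with reg.coef_ and reg.intercept_ to within floating-point rounding, and the two predictions should agree with the reg.predict results for 3300 and 5000 sqr ft.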
