AIML
MACHINE LEARNING
SUBMITTED BY :
SENTHURAN L K (21ECB32)
D S (21ECB57)
ROHITH G (21ECB18)
Find your own data set. As a suggested first step, spend some time finding a data set that
you are really passionate about. This can be a data set similar to the data you have available
at work or data you have always wanted to analyze. For some people this will be sports data
sets, while some other folks prefer to focus on data from a datathon or data for good.
PROJECT REPORT
• The main objective of this analysis is to predict housing prices based on various attributes
using linear regression models.
• The data set used for this analysis contains information about housing prices, including features
such as square footage, number of bedrooms, number of bathrooms, location, etc.
• During data exploration, we examined the distribution of each feature, checked for missing values,
and handled outliers if necessary. We also performed feature engineering to create additional features
that might be useful for prediction.
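The exploration steps described above (missing values, outliers, feature engineering) could look roughly like the following sketch. The toy DataFrame, its column names, the median fill, the 1.5 × IQR rule, and the `total_rooms` feature are all illustrative assumptions; the report does not state which techniques were actually used.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are assumptions, not the report's actual schema.
df = pd.DataFrame({
    "sqft": [850.0, 1200.0, np.nan, 1500.0, 2000.0, 12000.0],
    "bedrooms": [2, 3, 3, 3, 4, 4],
    "bathrooms": [1, 2, 2, 2, 3, 3],
    "price": [150000, 210000, 205000, 260000, 340000, 355000],
})

# 1. Missing values: count them per column, then fill the numeric gap with the median.
missing_counts = df.isna().sum()
df["sqft"] = df["sqft"].fillna(df["sqft"].median())

# 2. Outliers: keep rows within 1.5 * IQR of the quartiles (a common rule of thumb).
q1, q3 = df["sqft"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = df["sqft"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned = df[in_range].copy()

# 3. Feature engineering: one example of a derived feature.
cleaned["total_rooms"] = cleaned["bedrooms"] + cleaned["bathrooms"]

print(missing_counts)
print(cleaned)
```

On this toy data the 12000 sq ft row is the only one dropped by the IQR filter. Whether to drop, cap, or keep such rows is a judgment call that depends on the real data set.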
1. Simple Linear Regression: We started with a simple linear regression model using only one
feature, such as square footage, as a predictor.
2. Polynomial Regression: We extended the simple linear regression by including polynomial
features to capture nonlinear relationships between predictors and the target variable.
3. Regularized Regression: We applied ridge and lasso regression to handle multicollinearity and
prevent overfitting by adding a penalty term to the loss function.
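The three model families listed above can be compared on synthetic data, as in the minimal sketch below. The data, coefficients, and `alpha=1.0` are made up for illustration and are not taken from the housing data set. The scores are in-sample, so they show only how well each model fits, not how well it generalizes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Synthetic stand-in for square footage (in thousands) with a curved price relationship.
sqft = rng.uniform(0.5, 4.0, size=200).reshape(-1, 1)
price = 50 + 30 * sqft[:, 0] + 12 * sqft[:, 0] ** 2 + rng.normal(0, 5, size=200)

# 1. Simple linear regression: one predictor, straight-line fit.
simple = LinearRegression().fit(sqft, price)

# 2. Polynomial regression: the same linear model fitted on [sqft, sqft^2].
poly = make_pipeline(PolynomialFeatures(degree=2, include_bias=False), LinearRegression())
poly.fit(sqft, price)

r2_simple = simple.score(sqft, price)
r2_poly = poly.score(sqft, price)

# 3. Regularization: two nearly identical predictors (multicollinearity).
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)
X_collinear = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.5, size=200)

ols = LinearRegression().fit(X_collinear, y)
ridge = Ridge(alpha=1.0).fit(X_collinear, y)

# The L2 penalty shrinks the coefficient vector relative to plain least squares.
ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)

print(f"R^2 simple: {r2_simple:.3f}, polynomial: {r2_poly:.3f}")
print(f"coefficient norm OLS: {ols_norm:.2f}, ridge: {ridge_norm:.2f}")
```

The polynomial fit scores at least as well as the straight line here because the straight-line model is a special case of it. The ridge coefficients are smaller in norm than the least-squares ones, which is the effect of the penalty term.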
Code Implementation:
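import pandas as pd
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
housing_data = pd.read_csv('housing_data.csv')
housing_data.info()  # info() prints the summary itself and returns None
# Assumption: the original does not show how X and y were built. 'price' is a
# placeholder target name, and every feature column in X is assumed to be numeric.
X = housing_data.drop(columns=['price'])
y = housing_data['price']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Polynomial regression on square footage
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X[['sqft']])
X_train_poly, X_test_poly, y_train_poly, y_test_poly = train_test_split(
    X_poly, y, test_size=0.2, random_state=42)
poly_lr = LinearRegression()
poly_lr.fit(X_train_poly, y_train_poly)
poly_lr_pred = poly_lr.predict(X_test_poly)
# Ridge regression (alpha is not shown in the original; scikit-learn's default is assumed)
ridge = Ridge()
ridge.fit(X_train, y_train)
ridge_pred = ridge.predict(X_test)
# Lasso regression
lasso = Lasso(alpha=0.5)
lasso.fit(X_train, y_train)
lasso_pred = lasso.predict(X_test)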
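Predictions such as `poly_lr_pred`, `ridge_pred`, and `lasso_pred` are usually scored against the held-out targets. The report does not say which metrics were used, so the sketch below assumes RMSE and R², and `summarize` is a hypothetical helper name. The numbers are toy values, not results from the housing data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

def summarize(name, y_true, y_pred):
    """Print and return RMSE and R^2 for one model's test-set predictions."""
    rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))
    r2 = float(r2_score(y_true, y_pred))
    print(f"{name}: RMSE={rmse:.2f}, R^2={r2:.3f}")
    return rmse, r2

# Toy numbers for illustration only. With the listing's variables the calls would
# pair y_test_poly with poly_lr_pred, and y_test with ridge_pred or lasso_pred.
y_true = np.array([200.0, 250.0, 300.0, 350.0])
y_pred = np.array([210.0, 240.0, 310.0, 340.0])
rmse, r2 = summarize("example", y_true, y_pred)
```

Reporting the same metrics for every model on the same test split keeps the comparison between the polynomial, ridge, and lasso fits consistent.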
Output: