
Instructions and Guidelines:

Deliverables
a. Report
ii. Introduction: This section briefly explains the given dataset and the problem you are solving.
Also, justify why you selected linear regression to model the solution.
iii. Procedure and Results: List all your solution steps and results by commenting on each part.
This section should include all code cells and figures with comments in markdown cells.
iv. Conclusion: Write brief concluding remarks about your proposed solution. This may include
how the model achieves the prediction and comments about its accuracy. You may also add any
limitations or problems you faced in this section.

b. The source file for the notebook solution (i.e., .ipynb file). The code should be well
commented and submitted along with the report.
Description:
You will generate a linear regression model using a gradient descent algorithm for a given
multi-feature dataset with two parts: training and testing. The dataset is attached and named
"admission_predict.xlsx", which was created to predict Graduate Admissions for students with
different backgrounds and academic standing. A description of the dataset is given in the same
file.

The following steps need to be performed for the training data.


• Generate a scatterplot matrix for the given features and use it to select the pair of
features best suited to a linear regression model. Note these two features in your report.
• Isolate the two features you selected above and generate a linear regression model
using gradient descent. Provide a scatter diagram of your isolated features together
with the final fitted model.
• Judge how good your model is by performing the following two steps. Report and
comment on your findings.

- Comparing the values of the model parameters (betas) with those calculated
through the normal equation, given as $B = (X^T X)^{-1} X^T Y$; and
- Calculating the $R^2$ measure of fit on the training data.
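
As a rough illustration of the three steps above, the sketch below fits a one-feature line by batch gradient descent, recomputes the parameters with the normal equation, and reports R^2 on the same data. It is only a sketch under stated assumptions: the data are synthetic stand-ins (not the contents of "admission_predict.xlsx"), and the function names, learning rate, iteration count and feature standardization are illustrative choices, not requirements of the assignment.

```python
import numpy as np


def gradient_descent(x, y, lr=0.1, n_iters=5000):
    """Fit y = b0 + b1*x by batch gradient descent on the mean squared error."""
    X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept column
    beta = np.zeros(2)
    n = len(y)
    for _ in range(n_iters):
        residual = X @ beta - y
        grad = (2.0 / n) * (X.T @ residual)  # gradient of the MSE w.r.t. beta
        beta -= lr * grad
    return beta


def normal_equation(x, y):
    """Closed-form parameters B = (X^T X)^-1 X^T Y."""
    X = np.column_stack([np.ones_like(x), x])
    # Solve (X^T X) B = X^T Y instead of forming the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)


def r_squared(x, y, beta):
    """Coefficient of determination of the fitted line on (x, y)."""
    y_hat = beta[0] + beta[1] * x
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (assumption): one score-like feature and a noisy linear target.
rng = np.random.default_rng(0)
x_raw = rng.uniform(290, 340, size=200)
y = 0.01 * x_raw - 2.5 + rng.normal(0, 0.05, size=200)

# Standardizing the feature (an illustrative choice) keeps gradient descent well conditioned.
x = (x_raw - x_raw.mean()) / x_raw.std()

beta_gd = gradient_descent(x, y)
beta_ne = normal_equation(x, y)
r2 = r_squared(x, y, beta_gd)

print("gradient descent betas:", beta_gd)
print("normal equation betas: ", beta_ne)
print("R^2 on training data:  ", r2)
```

If the two parameter vectors disagree noticeably on your own data, the usual suspects are a learning rate that is too large or too small, too few iterations, or an unscaled feature.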

Using the model generated from the training data, perform the following on the testing
data.

• Evaluate the prediction error of the model based on the training data. Evaluating the
prediction error of a linear regression model based on the training data means assessing
how well the model fits the data it was trained on. This is typically done by comparing
the actual values of the dependent variable in the training dataset with the predicted
values generated by the model. One common method for evaluating the prediction
error of a linear regression model is to calculate the mean squared error (MSE).

• Compare this prediction error (i.e., MSE) with that of a model generated using both
training and testing data (i.e., use the entire dataset to generate the model and then
evaluate its performance by calculating the new MSE).
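
The MSE comparison described above might be organized along the following lines. This is a minimal sketch, not a prescribed solution: the data are synthetic, the 400/100 split and the helper names `fit_line` and `mse` are hypothetical, and for brevity the line is fitted with NumPy's closed-form least squares rather than gradient descent.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1*x (closed form, used here for brevity)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def mse(x, y, beta):
    """Mean squared error between actual y and the line's predictions."""
    y_hat = beta[0] + beta[1] * x
    return float(np.mean((y - y_hat) ** 2))


# Synthetic stand-in data (assumption), split into hypothetical training and testing parts.
rng = np.random.default_rng(1)
x_all = rng.uniform(0, 10, size=500)
y_all = 0.08 * x_all + 0.1 + rng.normal(0, 0.05, size=500)
x_train, y_train = x_all[:400], y_all[:400]
x_test, y_test = x_all[400:], y_all[400:]

beta_train = fit_line(x_train, y_train)  # model from the training part only
beta_full = fit_line(x_all, y_all)       # model from the entire dataset

mse_train = mse(x_train, y_train, beta_train)
mse_test = mse(x_test, y_test, beta_train)
mse_full = mse(x_all, y_all, beta_full)

print("MSE, training-data model on training data:", mse_train)
print("MSE, training-data model on testing data: ", mse_test)
print("MSE, full-data model on the entire dataset:", mse_full)
```

When commenting on the numbers, keep in mind that a model evaluated on the same data it was fitted to tends to look better than it would on unseen data.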

You may use any code to complete these tasks. You can write your own code to create the
model or use the sklearn library for the same purpose.
