College Predictor


CSE3013: ARTIFICIAL INTELLIGENCE

J COMPONENT REPORT

COLLEGE ADMISSION PREDICTION

Submitted by

Nipun Pundhir 19BCB0014


Janhavi Bhosale 19BCE2354
Nisheta Gupta 19BCE2233

Submitted to

Prof. Mohan Kumar P

SCOPE

August 2022
TABLE OF CONTENTS:

1. Abstract
2. Keywords
3. Introduction
4. Literature Survey
5. Proposed Work
6. Tools and Dataset Used
7. Implementation and Results
7.1 Code
7.2 Output and Screenshots
8. Conclusion
9. References

1. ABSTRACT:

This report describes how we built a College Admission Prediction System using widely used
Machine Learning and Artificial Intelligence algorithms. We implemented Linear Regression,
Artificial Neural Network, Decision Tree and Random Forest models and compared the accuracy and
efficiency of their predictions on the same task. In our experiments, the Linear Regression
model produced the most accurate and practical results of the four models.

2. KEYWORDS:

College Admission Prediction System, Machine Learning, Artificial Intelligence, Naive Bayes
algorithm, Bayesian Networks, Linear Regression, Artificial Neural Networks,
Support Vector Regression, Decision Trees, Random Forest, Multilinear Regression, SVM, Linear
Kernel, AdaBoost, prediction

3. INTRODUCTION:

Anyone planning postgraduate studies may find it difficult to judge which colleges they can
realistically join based on their GPA and their Quants, Verbal, TOEFL and AWA scores. Applicants
often apply to many colleges that expect a higher set of scores, instead of applying to colleges
where they have a realistic chance of admission, and this can be detrimental to their future.
It is therefore important that a candidate applies to colleges where he/she has a good chance of
getting in, instead of colleges they may never get into.

The Education Based Prediction System helps a person decide which colleges they can apply to
with their scores. The dataset used for processing consists of the following parameters:
college names, Quants and Verbal scores (GRE), and TOEFL and AWA scores.

4. LITERATURE SURVEY:

This section reviews previous research on assessing students' chances of admission to
universities. Numerous programs and studies have addressed university admission using a range of
machine learning models that help students in the admission process to their desired
universities. Previous research in this area includes the following:

1. The Naive Bayes algorithm has been used to evaluate the probability that a student's
application to a given university succeeds. Its main drawback is that the work did not consider
all the factors that contribute to the admission process, such as TOEFL/IELTS, SOP, LOR and
undergraduate score. Bayesian Network algorithms have been used to create a decision support
network for evaluating applications submitted by foreign students to a university. This model was
developed to forecast the progress of prospective students by comparing their scores with those
of students currently studying at the university, and it predicted on that basis whether an
aspiring student should be admitted. Because the comparisons are made only with students who
were admitted, and not with students whose applications were rejected, this method is not very
accurate.

2. Acharya et al. compared four regression algorithms, namely Linear Regression, Support Vector
Regression, Decision Trees and Random Forest, to predict the chance of admission. The best model
was chosen as the one with the lowest MSE, which was multilinear regression.

3. Gupta et al. developed a model that studies the graduate admission process in American
universities using machine learning techniques. The purpose of this study was to guide
students in finding the best educational institution to apply to. Five machine learning models
were built in this paper, including Naïve Bayes, SVM (Linear Kernel), AdaBoost, and
Logistic classifiers.

4. Chakrabarty et al. compared linear regression and gradient boosting regression for
predicting the chance of admission, and pointed out that gradient boosting regression showed
better results.
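
The selection rule used in the surveyed work (keep the model with the lowest MSE) can be sketched in a few lines. This is only an illustration: the model names, targets and predictions below are made up and do not come from any of the cited papers.

```python
# Hypothetical example: pick the regression model with the lowest test MSE.
# All names and numbers below are invented for illustration only.

def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Observed chance of admission for five held-out applicants (illustrative).
y_true = [0.90, 0.75, 0.60, 0.82, 0.45]

# Predictions from three hypothetical models for the same applicants.
predictions = {
    "model_a": [0.88, 0.70, 0.65, 0.80, 0.50],
    "model_b": [0.80, 0.80, 0.50, 0.90, 0.60],
    "model_c": [0.95, 0.60, 0.70, 0.70, 0.30],
}

# Compute one MSE per model and keep the model with the smallest value.
mse_by_model = {
    name: mean_squared_error(y_true, pred) for name, pred in predictions.items()
}
best_model = min(mse_by_model, key=mse_by_model.get)

for name, mse in sorted(mse_by_model.items(), key=lambda item: item[1]):
    print(f"{name}: MSE = {mse:.5f}")
print("Lowest MSE:", best_model)
```

All models must be scored on the same held-out applicants; otherwise their MSE values are not comparable.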

5. PROPOSED WORK:

Educational institutions are the backbone of any society, as a good level of education equates to
a promising future. At some point a student needs to decide which colleges he/she wishes to
apply to, based on factors such as GRE and TOEFL scores, university ratings, etc., and this is
where the real struggle starts. There are various apps and sites for this purpose, but
using them is tedious, and the lack of precise information leads to students wasting a lot of
valuable time.

The problem we tackle in this project is therefore to find the AI algorithm that
gives the most precise output for college applications. With a more accurate
system, students will spend little to no time applying to colleges where their application has
little chance of being accepted. Instead, they can apply to colleges where they have a good
chance of acceptance and scholarships.
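
One way to organise such a comparison is to train every candidate model on the same training split and score each one on the same held-out split. The sketch below assumes scikit-learn is available and uses synthetic data in place of the admissions dataset; the sample count, feature count and noise level are arbitrary assumptions, so the printed scores say nothing about the real data.

```python
# A sketch only: synthetic data stands in for the admissions dataset.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# 400 synthetic applicants with 7 numeric features (assumed shape).
X = rng.normal(size=(400, 7))
true_weights = rng.normal(size=7)
# The target is a noisy linear function of the features (an assumption).
y = X @ true_weights + rng.normal(scale=0.1, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

candidates = {
    "LinearRegression": LinearRegression(),
    "DecisionTree": DecisionTreeRegressor(random_state=0),
    "RandomForest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# For scikit-learn regressors, score() returns R^2 on the data passed in.
r2_scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    r2_scores[name] = model.score(X_test, y_test)

best_name = max(r2_scores, key=r2_scores.get)
for name, r2 in r2_scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
print("Best on this synthetic data:", best_name)
```

Fixing `random_state` makes the split and the tree-based models repeatable, so reruns give the same ranking.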

6. TOOLS AND DATASET USED:

Tools used: Jupyter notebook


Dataset used: Graduate Admission 2 | Kaggle

7. IMPLEMENTATION AND RESULTS:

Link to the Google Colab notebook:

https://colab.research.google.com/drive/1ge0vYtAgFn5WfoCt1AqKCHF2KB_M9Zfh?usp=sharing

7.1 CODE:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

admission_df = pd.read_csv('Admission_Predict.csv')
admission_df.head()
#Let's drop the serial no.
admission_df.drop('Serial No.', axis = 1, inplace = True)
admission_df
# checking the null values
admission_df.isnull().sum()
# Check the dataframe information
admission_df.info()

admission_df.describe()

df_university = admission_df.groupby(by = 'University Rating').mean()


df_university
admission_df.hist(bins = 30, figsize =(20,20))
sns.pairplot(admission_df)
corr_matrix = admission_df.corr()
plt.figure(figsize = (12,12))
sns.heatmap(corr_matrix , annot = True, cmap="YlGnBu")
plt.show()
admission_df.columns
X = admission_df.drop(columns = ['Chance of Admit '])
y = admission_df['Chance of Admit ']
X.shape
y.shape
X = np.array(X)
y = np.array(y)
y = y.reshape(-1,1)
y.shape
# Scale the features and the target before training the models
from sklearn.preprocessing import StandardScaler
scalar_x = StandardScaler()
X = scalar_x.fit_transform(X)

scalar_y = StandardScaler()
y = scalar_y.fit_transform(y)
# Split the data into train and test sets (half of the rows are held out for testing)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X , y , test_size=0.5)
from sklearn.linear_model import LinearRegression

from sklearn.metrics import mean_squared_error, accuracy_score

# Applying linear regression

LinearRegression_model = LinearRegression()
LinearRegression_model.fit(X_train, y_train)
# For regressors, score() returns R^2 on the test set, not a classification accuracy
accuracy_LinearRegression = LinearRegression_model.score(X_test, y_test)
accuracy_LinearRegression
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Activation, Dropout
from tensorflow.keras.optimizers import Adam

ANN_model = keras.Sequential()
ANN_model.add(Dense(50, input_dim = 7))
ANN_model.add(Activation('relu'))
ANN_model.add(Dense(150))
ANN_model.add(Activation('relu'))
ANN_model.add(Dropout(0.5))
ANN_model.add(Dense(150))
ANN_model.add(Activation('relu'))
ANN_model.add(Dropout(0.5))
ANN_model.add(Dense(50))
ANN_model.add(Activation('linear'))
ANN_model.add(Dense(1))
# Compile once with mean squared error loss and the Adam optimizer
ANN_model.compile(loss = 'mse', optimizer = 'adam')
ANN_model.summary()
epochs_hist = ANN_model.fit(X_train, y_train, epochs = 100, batch_size = 20)

# evaluate() returns the test loss, i.e. the MSE on the standardized target
result = ANN_model.evaluate(X_test, y_test)

# 1 - MSE is used here as a rough score; it is not a true accuracy
accuracy_ANN = 1 - result
print("Accuracy : {}".format(accuracy_ANN))
epochs_hist.history.keys()

plt.plot(epochs_hist.history['loss'])
plt.title('Model Loss Progress During Training')
plt.xlabel('Epoch')
plt.ylabel('Training Loss')
plt.legend(['Training Loss'])

from sklearn.tree import DecisionTreeRegressor
DecisionTree_model = DecisionTreeRegressor()
DecisionTree_model.fit(X_train,y_train)

accuracy_DecisionTree = DecisionTree_model.score(X_test,y_test)
accuracy_DecisionTree

from sklearn.ensemble import RandomForestRegressor


RandomForest_model = RandomForestRegressor(n_estimators=100,max_depth=10)
# ravel() flattens y to 1-D, which avoids scikit-learn's column-vector warning
RandomForest_model.fit(X_train,y_train.ravel())

accuracy_RandomForest = RandomForest_model.score(X_test,y_test)
accuracy_RandomForest

y_predict = LinearRegression_model.predict(X_test)
plt.plot(y_test, y_predict, '^',color = 'b')

y_predict_orig = scalar_y.inverse_transform(y_predict)
y_test_orig = scalar_y.inverse_transform(y_test)

plt.plot(y_test_orig,y_predict_orig,'^',color = 'b')

k = X_test.shape[1]  # number of features
n = len(X_test)      # number of test samples
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
# Metrics are computed on the original (unscaled) target values
RMSE = float(format(np.sqrt(mean_squared_error(y_test_orig, y_predict_orig)),'.3f'))
MSE = mean_squared_error(y_test_orig, y_predict_orig)
MAE = mean_absolute_error(y_test_orig, y_predict_orig)
r2 = r2_score(y_test_orig, y_predict_orig)
# Adjusted R2 penalises R2 for the number of features k
adj_r2 = 1-(1-r2)*(n-1)/(n-k-1)
print('RMSE =',RMSE, '\nMSE =',MSE, '\nMAE =',MAE, '\nR2 =', r2,
      '\nAdjusted R2 =',adj_r2)
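
The metrics printed above can also be worked out by hand, which makes the adjusted R2 formula easier to check. The following is a minimal sketch in plain Python; the function name `regression_report`, the six target values, the predictions and the feature count of 2 are all hypothetical and are not results from this project.

```python
from math import sqrt

def regression_report(y_true, y_pred, n_features):
    """Return RMSE, MSE, MAE, R2 and adjusted R2 for paired values."""
    n = len(y_true)
    if n != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if n - n_features - 1 <= 0:
        raise ValueError("adjusted R2 needs n > n_features + 1")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all targets are equal")

    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    r2 = 1 - ss_res / ss_tot
    # Same formula as in the listing: 1 - (1 - R2) * (n - 1) / (n - k - 1)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    return {"RMSE": sqrt(mse), "MSE": mse, "MAE": mae,
            "R2": r2, "Adjusted R2": adj_r2}

# Made-up chance-of-admit values and predictions for six applicants.
y_true = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
y_pred = [0.45, 0.5, 0.55, 0.7, 0.85, 0.9]

report = regression_report(y_true, y_pred, n_features=2)
for metric, value in report.items():
    print(f"{metric} = {value:.4f}")
```

When R2 is below 1, the adjusted value is always smaller than R2, and the gap grows as the number of features approaches the number of test samples.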

7.2 OUTPUT AND SCREENSHOTS:

8. CONCLUSION:

Every year, lakhs of students apply to colleges to start or continue their education. Many of
them are misguided or not cautious in their approach, and this leads to them applying to the
wrong colleges, which wastes a lot of time, effort and money and can be detrimental to their
future. Through our project we would like to help such prospective students find better
colleges. It is very important that a candidate applies to colleges where he/she has a good
chance of getting in, instead of colleges they may never get into. Our models
work to a satisfactory level of accuracy and may be of great assistance to such people. This
project has good future scope, especially for students of our age group who want to pursue their
higher education in their dream college.

9. REFERENCES:
● https://www.w3schools.com/ai/
● https://medium.com/analytics-vidhya/understanding-the-linear-regression-808c1f6941c0
● https://towardsdatascience.com/understanding-random-forest-58381e0602d2
● https://scikit-learn.org/stable/modules/tree.html
● https://www.analyticsvidhya.com/blog/2021/08/decision-tree-algorithm/
