IKG Punjab Technical University Amritsar: Movie Recommendation Website

IKG Punjab Technical University

Amritsar

MOVIE RECOMMENDATION WEBSITE

Submitted in partial fulfillment of the requirements for the degree of

Bachelor of Computer Applications

Submitted by

SARANJEET SINGH (U-ROLLNO: 1931435)

ARJUN RAJPOOT (U-ROLLNO: 1931402)

GURPARTAP SINGH (U-ROLLNO: 1931405)

Supervised by

Miss Deepakshi

Department of Computer Application

2022


Acknowledgment

I am very grateful to my project guide Miss Deepakshi for giving her valuable time and

constructive guidance in preparing the project. It would not have been possible to complete

this project in the short period of time without her kind encouragement and valuable

guidance.

Date: Signature

SARANJEET SINGH

ARJUN RAJPOOT

GURPARTAP SINGH
Certificate of originality

I hereby declare that the project entitled “Movie Recommendation Website”, submitted to the Department of Computer Science, GLOBAL INSTITUTES, Amritsar, in partial fulfillment for the award of the degree of BACHELOR OF COMPUTER APPLICATIONS in the session 2019-2022, is an authentic record of my own work carried out under the guidance of Miss Deepakshi, and that the project has not previously formed the basis for the award of any other degree.

Place: …………………….. Signature of candidate

Date: ……………………… Saranjeet Singh, Arjun Rajpoot, Gurpartap Singh

1931435, 1931402, 1931405

This is to certify that the above statement made by the candidate is correct to the best of my knowledge.

Signature of Supervisor

Miss Deepakshi

(Assistant Professor)
ABSTRACT

Determining whether the listed price of a used car is fair is a challenging task, due to the many factors that drive a used vehicle’s price on the market. The focus of this project is developing machine learning models that can accurately predict the price of a used car based on its features, so that buyers can make informed purchases. We implement and evaluate various learning methods on a dataset consisting of the sale prices of different makes and models across cities in the United States. Our results show that the Random Forest model and K-Means clustering with linear regression yield the best results, but are compute heavy. Conventional linear regression also yielded satisfactory results, with the advantage of a significantly lower training time in comparison to the aforementioned methods.

INDEX

1. Scope of the project

2. Methodology/ Planning of work

3. System Requirements

4. System Code Implementation

5. Actual Screenshots

6. Bibliography

1. SCOPE OF PROJECT

The price of a car depends on many factors, such as the goodwill of the brand, the features of the car, its horsepower, and the mileage it gives. Car price prediction is one of the major research areas in machine learning. This report describes how to train a car price prediction model with machine learning using Python.

The car market is growing rapidly, and many people are buying and selling cars. To make this process smoother and give both sides a better view of the market, we propose a model that takes particular information about a car and predicts its selling price. This helps buyers and sellers judge a price before a transaction.

EXISTING SYSTEM:
The existing system offers the following ways to gain knowledge:

1) The first is to go to a book depot and buy the relevant book. (MAIN)

2) The second (for the digital person) is to surf the internet for the knowledge.

3) The third is to watch information on television or another medium.

• Total cost is massive: A lot of money and paper is spent on printing and publishing books (with an ultimate effect on trees), and more money is spent on advertisements. The rent of a business property and the salaries of workers can increase operational cost significantly.

• Cash management:

a) Book sellers have to handle cash payments and sell the books. They need to purchase updated books and manage the list of books. Returning unsold stock to publishers is also burdensome.

b) The overall cost of making and airing advertisements is also very high.

• Time consuming: People have to visit book depots to buy books and sometimes wait a long time to get them. Considering today's busy lifestyle, some potential customers may simply drop the plan because they do not have the time or patience to wait.

• Chances of disappointment: As there is no facility to inform customers that a book is sold out, people may be very disappointed to hear from the book depot that no copy of the book they want is left, after they have traveled a long distance and spent both time and money.

PROPOSED SYSTEM:
Our website is introduced with the full satisfaction of the user in view: it offers better information through a much better GUI (Graphical User Interface), and a large amount of information on India is available with just one click. An internet user can browse the website at any time of day or night.

In the proposed system, customers can also view updated information and share it with anyone on the web. All customers have to register on our website to submit feedback, opinions, comments, FAQs, etc., which help make the website better.

Overview
This dataset consists of information about used cars listed on cardekho.com. It has 9 columns, each holding information about a specific feature:

Car_Name: the car company.
Year: the year in which the car was purchased brand new.
selling_price: the price at which the car is being sold; this is the target label for price prediction.
km_driven: the number of kilometres the car has been driven.
fuel: the fuel type of the car (CNG, petrol, diesel, etc.).
seller_type: whether the seller is an individual or a dealer.
transmission: whether the car is automatic or manual.
owner: the number of previous owners of the car.
Present_price: the current showroom price of the car.
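
To make the column descriptions above concrete, the following minimal sketch shows one way such rows could be turned into numeric features for a regression model. The row values are invented for illustration and are not taken from the dataset. Treating fuel, seller_type and transmission as categorical (one-hot encoded) columns, and the helper names build_encoder and encode, are assumptions rather than the project's actual preprocessing.

```python
# Illustrative rows only: every value below is invented, not read from the dataset.
rows = [
    {"Car_Name": "car_a", "Year": 2014, "selling_price": 3.35, "Present_price": 5.59,
     "km_driven": 27000, "fuel": "Petrol", "seller_type": "Dealer",
     "transmission": "Manual", "owner": 0},
    {"Car_Name": "car_b", "Year": 2017, "selling_price": 7.25, "Present_price": 9.85,
     "km_driven": 6900, "fuel": "Diesel", "seller_type": "Individual",
     "transmission": "Automatic", "owner": 1},
    {"Car_Name": "car_c", "Year": 2011, "selling_price": 2.85, "Present_price": 4.15,
     "km_driven": 52000, "fuel": "CNG", "seller_type": "Dealer",
     "transmission": "Manual", "owner": 0},
]

CATEGORICAL = ["fuel", "seller_type", "transmission"]      # assumed categorical
NUMERIC = ["Year", "Present_price", "km_driven", "owner"]  # assumed numeric
TARGET = "selling_price"                                   # target label per the overview


def build_encoder(rows):
    """Collect the sorted set of categories seen in each categorical column."""
    return {col: sorted({row[col] for row in rows}) for col in CATEGORICAL}


def encode(row, categories):
    """Flatten one row into numeric features: numeric columns, then one-hot flags."""
    features = [float(row[col]) for col in NUMERIC]
    for col in CATEGORICAL:
        features.extend(1.0 if row[col] == cat else 0.0 for cat in categories[col])
    return features


categories = build_encoder(rows)
X = [encode(row, categories) for row in rows]  # feature matrix
y = [row[TARGET] for row in rows]              # prices to predict
```

Car_Name is left out of the features here only to keep the sketch short; whether the project used it is not stated.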

INTRODUCTION

Determining whether the listed price of a used car is fair is a challenging task, due to the many factors that drive a used vehicle’s price on the market. The focus of this project is developing machine learning models that can accurately predict the price of a used car based on its features, so that buyers can make informed purchases. We implement and evaluate various learning methods on a dataset consisting of the sale prices of different makes and models. We compare the performance of machine learning algorithms such as Linear Regression, Ridge Regression, Lasso Regression, Elastic Net, and Decision Tree Regressor, and choose the best of them. The price of the car is determined from various parameters. Regression algorithms are used because they output a continuous value rather than a category, which makes it possible to predict the actual price of a car rather than a price range. A user interface has also been developed that acquires input from the user and displays the price of the car according to those inputs.

2. METHODOLOGY/

PLANNING OF WORK

METHODOLOGY
There are two primary phases in the system:

1. Training phase: The system is trained on the data in the dataset and fits a model (line/curve) based on the chosen algorithm.

2. Testing phase: The system is given inputs and tested for correct working, and its accuracy is checked.

The data used to train or test the model therefore has to be appropriate. The system is designed to detect and predict the price of a used car, and hence appropriate algorithms must be used for these tasks. Before an algorithm was selected for further use, different algorithms were compared for accuracy, and the one best suited to the task was chosen.
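
The two phases can be sketched as a split of the data into a training part and a held-out testing part. This is a minimal illustration, not the project's code: the 80/20 ratio, the fixed seed, and the function name train_test_split are assumptions made for the example.

```python
import random


def train_test_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle row indices and hold out a fraction of the rows for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = max(1, int(round(len(X) * test_fraction)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(data, idx):
        return [data[i] for i in idx]

    return pick(X, train_idx), pick(X, test_idx), pick(y, train_idx), pick(y, test_idx)


# Toy data: ten rows with one feature each, where the target is twice the feature.
X = [[float(i)] for i in range(10)]
y = [2.0 * i for i in range(10)]

X_train, X_test, y_train, y_test = train_test_split(X, y)
# The model is fitted on (X_train, y_train) only; (X_test, y_test) is used
# afterwards to check accuracy on rows the model has not seen.
```

Keeping the test rows out of training is what makes the accuracy check in the testing phase meaningful.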

Objective
To develop an efficient and effective model that predicts the price of a used car according to the user’s inputs. To achieve good accuracy. To develop a user-friendly User Interface (UI) that takes input from the user and displays the predicted price.

PROPOSED SYSTEM

As shown in the above figure, the process starts by collecting the dataset. The next step is data preprocessing, which includes data cleaning, data reduction, and data transformation. Then, various machine learning algorithms are used to predict the price: Linear Regression, Ridge Regression, and Lasso Regression. The model that predicts the most accurate price is selected, and its predicted price is displayed to the user according to the user’s inputs. The user gives input to the machine learning model through the website.

Linear Regression

Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. One variable is considered to be an explanatory variable, and the other is considered to be a dependent variable. For example, a modeler might want to relate the weights of individuals to their heights using a linear regression model.

Linear regression is also useful for finding the relationship between multiple continuous variables. With multiple independent variables and a single dependent variable, the model is

$$y = m_1 X_1 + m_2 X_2 + \dots + b$$

where $m_1, m_2, m_3, \dots$ are the slopes, $b$ is the y-intercept, $X_1, X_2, X_3, \dots$ are the independent variables, and $y$ is the dependent variable.
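
As a rough illustration of fitting the equation y = m1X1 + m2X2 + ... + b, the sketch below estimates the slopes and the intercept by ordinary least squares with NumPy. The data are synthetic, and the function names fit_linear_regression and predict are made up for this example; the project's own model code is not reproduced here.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Return (slopes m, intercept b) that minimize the squared prediction error."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Append a column of ones so the last coefficient plays the role of b.
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]


def predict(X, m, b):
    """Evaluate y = m1*X1 + m2*X2 + ... + b for every row of X."""
    return np.asarray(X, dtype=float) @ m + b


# Synthetic data generated exactly from y = 3*X1 - 2*X2 + 5.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
y = [3 * x1 - 2 * x2 + 5 for x1, x2 in X]

m, b = fit_linear_regression(X, y)
print(m, b)                      # slopes close to [3, -2], intercept close to 5
print(predict([[6, 1]], m, b))   # close to 21
```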

Ridge Regression
A Ridge regressor is basically a regularized version of the linear regressor. The regularization term has the parameter ‘alpha’, which controls the amount of regularization applied to the model, i.e. it helps reduce the variance of the estimates.
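
The effect of alpha can be seen in a small sketch. It uses the standard closed-form ridge solution on centered data, leaving the intercept unpenalized; this is one common formulation and is an assumption here, since the report does not state which one the project used. The data are synthetic and fit_ridge is a name invented for the example.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Ridge regression via the closed form w = (Xc^T Xc + alpha*I)^-1 Xc^T yc."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center the data so the intercept is not penalized.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    n_features = X.shape[1]
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b


# Synthetic data: y = 4*x0 - 3*x1 + 2*x2 + 1 plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([4.0, -3.0, 2.0]) + 1.0 + rng.normal(scale=0.1, size=50)

w_small, _ = fit_ridge(X, y, alpha=0.01)   # almost no regularization
w_large, _ = fit_ridge(X, y, alpha=100.0)  # strong regularization
print(np.linalg.norm(w_small), np.linalg.norm(w_large))
```

A larger alpha pulls the coefficients towards zero, trading a little bias for lower variance; with alpha set to 0 the result is ordinary least squares.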

Lasso Regression

“LASSO” stands for Least Absolute Shrinkage and Selection Operator. Lasso regression is a regularization technique that is applied on top of regression methods for more accurate prediction. This model uses shrinkage, where data values are shrunk towards a central point such as the mean. The lasso procedure encourages simple, sparse models (i.e. models with fewer parameters). This type of regression is well suited to models showing high levels of multicollinearity, or to cases where you want to automate certain parts of model selection, such as variable selection or parameter elimination.
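
The sparsity that lasso encourages can be illustrated with a small coordinate-descent sketch. This is not the project's implementation: the solver, the objective scaling (1/(2n) times the squared error plus alpha times the sum of absolute coefficients), the synthetic data, and the names soft_threshold and fit_lasso are all assumptions chosen for the example. It also assumes no feature column is constant.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; anything within it becomes 0."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def fit_lasso(X, y, alpha=0.1, n_iter=200):
    """Minimize (1/(2n))*||y - Xw - b||^2 + alpha*||w||_1 by coordinate descent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center the data so the intercept is not penalized.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    n_samples, n_features = Xc.shape
    col_scale = (Xc ** 2).sum(axis=0) / n_samples  # assumes non-constant columns
    w = np.zeros(n_features)
    for _ in range(n_iter):
        for j in range(n_features):
            # Residual with feature j's current contribution added back in.
            partial_residual = yc - Xc @ w + Xc[:, j] * w[j]
            rho = Xc[:, j] @ partial_residual / n_samples
            w[j] = soft_threshold(rho, alpha) / col_scale[j]
    b = y_mean - x_mean @ w
    return w, b


# Synthetic data: only features 0 and 2 influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 5.0 * X[:, 0] - 3.0 * X[:, 2] + rng.normal(scale=0.1, size=200)

w, b = fit_lasso(X, y, alpha=0.5)
print(w)  # coefficients of the irrelevant features 1 and 3 are driven to zero
```

The coefficients that end up exactly at zero are what make lasso usable for variable selection, while the remaining ones are shrunk relative to an unregularized fit.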

3. SYSTEM

REQUIREMENTS

Requirements

Hardware requirements:

Operating system: Windows 7, 8, 10

Processor: dual core 2.4 GHz (i5 or i7 series Intel processor or equivalent AMD)

RAM: 4 GB

Disk space: 10 GB

Software Requirements:

Python

PyCharm

PIP 2.7

Jupyter Notebook

Chrome

4. SYSTEM CODE

IMPLEMENTATION

Model Coding(backend):

Application.py Code:

INDEX.html Code:

WEB Screenshots

After filling values:

Results:
The results of our tests were quantified in terms of the R² score of our predictions. The R² score is a statistical measure of how close the data are to the fitted regression line.

Learning Algorithm     R² Score on Test Data   R² Score on Training Data   Training Time
Linear Regression      0.87                    0.87                        15 minutes
Gradient Boost         0.64                    0.64                        130 minutes
Random Forest          0.88                    0.98                        75 minutes
Light GBM              0.81                    0.82                        104 seconds
XGBoost                0.78                    0.81                        180 minutes
KMeans + LinReg        0.88                    0.89                        70 minutes
Deep Neural Network    0.85                    0.85                        10 hours
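
For reference, the R² score reported above can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below is a plain-Python illustration with made-up numbers; it does not reproduce the scores in the table, and the function name r2_score is chosen for the example.

```python
def r2_score(y_true, y_pred):
    """R² = 1 - (sum of squared residuals) / (total sum of squares around the mean)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up actual and predicted prices, for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.5]
print(r2_score(y_true, y_pred))  # approximately 0.9625
```

A score of 1 means the predictions match the data exactly, while a score of 0 means the model does no better than always predicting the mean price.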

FUTURE SCOPE

In the future, this machine learning model may be connected to various websites that can provide real-time data for price prediction. We may also add a large volume of historical car price data, which can help improve the accuracy of the model. We can build an Android app as the user interface for interacting with the user.

CONCLUSION

With the increased prices of new cars and the financial inability of many customers to buy them, used car sales are increasing globally. Therefore, there is an urgent need for a used car price prediction system that effectively determines the worth of a car using a variety of features. The proposed system helps determine an accurate price for a used car. This paper compares three machine learning algorithms: Linear Regression, Lasso Regression, and Ridge Regression.

Future Work

For better performance, we plan to judiciously design deep learning network

structures, use adaptive learning rates and train on clusters of data rather than the

whole dataset. To correct for overfitting in Random Forest, different selections of

features and number of trees will be tested to check for change in performance.

References

1. https://www.kaggle.com/jpayne/852k-used-car-listings

2. N. Monburinon, P. Chertchom, T. Kaewkiriya, S. Rungpheung, S. Buya and P. Boonpou, "Prediction of prices for used car by using regression models," 2018 5th International Conference on Business and Industrial Research (ICBIR), Bangkok, 2018, pp. 115-119.

3. Listiani, M. 2009. Support Vector Regression Analysis for Price Prediction in a Car Leasing Application. Master Thesis. Hamburg University of Technology.

4. Chen, Tianqi, and Carlos Guestrin. "XGBoost: A scalable tree boosting system." Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016.

5. Ke, Guolin, et al. "LightGBM: A highly efficient gradient boosting decision tree." Advances in Neural Information Processing Systems. 2017.

6. Fisher, Walter D. "On grouping for maximum homogeneity." Journal of the American Statistical Association 53.284 (1958): 789-798.

7. https://scikit-learn.org/stable/modules/classes.html: Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.
