DA - Project 1
Data Cleaning
Check for missing values: as the image shows, the columns “dish_liked”, “phone”, “rate”,
“cuisines”, “location”, “rest_type” and “approx_cost” have missing values.
Replace NaN or unreadable values with empty strings.
Drop the missing values.
Rating column: convert string to numeric.
Label-encode the online order and book table columns.
Cost for two people: convert string to float.
Fetch latitude and longitude values using Open Gate API.
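The cleaning steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the project's actual code: the column names ("rate", "online_order", "book_table", "approx_cost"), the "4.1/5" rating format, the "NEW" and "-" placeholders, and the sample rows are all assumptions. The sketch also marks unreadable ratings as NaN instead of empty strings so that dropna can remove them in one step, and it leaves out the geocoding step.

```python
import numpy as np
import pandas as pd


def clean_restaurants(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning steps to a copy of df (column names are assumed)."""
    df = df.copy()

    # Treat unreadable rating placeholders as missing values.
    df["rate"] = df["rate"].replace(["NEW", "-", ""], np.nan)

    # Drop rows that are missing any column used later on.
    df = df.dropna(subset=["rate", "online_order", "book_table", "approx_cost"])

    # Rating: "4.1/5" (string) -> 4.1 (float).
    df["rate"] = df["rate"].astype(str).str.split("/").str[0].str.strip().astype(float)

    # Label encoding: "Yes" -> 1, "No" -> 0.
    for col in ["online_order", "book_table"]:
        df[col] = df[col].map({"Yes": 1, "No": 0})

    # Cost for two people: "1,200" (string) -> 1200.0 (float).
    df["approx_cost"] = (
        df["approx_cost"].astype(str).str.replace(",", "", regex=False).astype(float)
    )
    return df.reset_index(drop=True)


# Hypothetical sample rows, for illustration only.
raw = pd.DataFrame(
    {
        "rate": ["4.1/5", "NEW", "3.8 /5", None],
        "online_order": ["Yes", "No", "No", "Yes"],
        "book_table": ["No", "No", "Yes", "Yes"],
        "approx_cost": ["800", "300", "1,200", "500"],
    }
)

# Per-column count of missing values, as in the first cleaning step.
missing_per_column = raw.isna().sum()

clean = clean_restaurants(raw)
```

Only the first and third sample rows survive, because the other two have no usable rating.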
Exploratory Data Analysis
MAP OF BANGALORE WITH RATING
MAP OF BANGALORE WITH AVERAGE RATING
Exploratory Data Analysis
Top 5 cuisines for each location.
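One way to compute such a ranking is sketched below. It assumes that "cuisines" holds a comma-separated string per restaurant and that "location" is the grouping column; the function name top_cuisines and the sample rows are hypothetical.

```python
import pandas as pd


def top_cuisines(df: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Return the n most frequent cuisines per location."""
    # One row per (restaurant, cuisine) pair.
    exploded = (
        df.assign(cuisine=df["cuisines"].str.split(","))
        .explode("cuisine")
        .reset_index(drop=True)
    )
    exploded["cuisine"] = exploded["cuisine"].str.strip()

    # Count restaurants per (location, cuisine), most frequent first.
    counts = (
        exploded.groupby(["location", "cuisine"])
        .size()
        .reset_index(name="count")
        .sort_values(["location", "count", "cuisine"], ascending=[True, False, True])
    )
    # Keep the first n rows of each location group.
    return counts.groupby("location").head(n).reset_index(drop=True)


# Hypothetical sample rows, for illustration only.
sample = pd.DataFrame(
    {
        "location": ["BTM", "BTM", "HSR", "HSR"],
        "cuisines": [
            "North Indian, Chinese",
            "North Indian",
            "South Indian, North Indian",
            "South Indian",
        ],
    }
)
top = top_cuisines(sample, n=1)
```

Ties in count are broken alphabetically by cuisine name, which is an arbitrary choice made here to keep the output deterministic.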
Exploratory Data Analysis
Based on the pie chart, North Indian food is the most popular cuisine, followed by South
Indian.
Exploratory Data Analysis
The chart below shows the 10 most expensive and the 10 most affordable restaurants for each
location and restaurant type.
Exploratory Data Analysis
Word cloud of the types of dishes popular in each restaurant type.
Predictive Analysis
A multiple linear regression model is applied, using the columns (online order, book table, votes, cost for two people) to predict
the rating of the restaurant.
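The model setup can be illustrated as below. The sketch fits ordinary least squares with NumPy, which may differ from the library the project used, and it runs on synthetic data generated in the code itself, so the resulting error values say nothing about the real dataset. The helper names, the feature order and the 160/40 train/test split are assumptions.

```python
import numpy as np


def fit_linear_regression(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares fit; returns [intercept, coefficient_1, ..., coefficient_k]."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict(coef: np.ndarray, X: np.ndarray) -> np.ndarray:
    return coef[0] + X @ coef[1:]


# Synthetic stand-in for the four predictors (NOT the real data).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack(
    [
        rng.integers(0, 2, n),       # online order (0/1)
        rng.integers(0, 2, n),       # book table (0/1)
        rng.integers(0, 2000, n),    # votes
        rng.integers(100, 3000, n),  # cost for two people
    ]
).astype(float)
assumed_coef = np.array([3.0, 0.1, 0.3, 0.0004, 0.0001])  # made-up ground truth
y = assumed_coef[0] + X @ assumed_coef[1:] + rng.normal(0, 0.1, n)

# Fit on the first 160 rows, evaluate on the remaining 40.
split = 160
coef = fit_linear_regression(X[:split], y[:split])
pred = predict(coef, X[split:])

mse = float(np.mean((y[split:] - pred) ** 2))   # mean squared error
mae = float(np.mean(np.abs(y[split:] - pred)))  # mean absolute error
```

MSE penalises large errors more heavily than MAE, so reporting both gives a rough sense of whether a few badly predicted restaurants dominate the error.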
Results And Conclusion
The analysis is intended to help new restaurants, which have many factors to weigh:
location-wise ratings, the locations with the most online orders, the cuisines common in a
given location, the preferred restaurant type, and so on. The outcome also depends on the
reviews list column in the dataset. In the predictive analysis, the MSE and MAE are low, but
the features included are not enough to predict ratings for a real-world business problem.
In conclusion, using a restaurant dataset to derive location insights for new restaurants is a
challenging task that requires careful data preparation, data cleaning and meaningful
visualizations. Despite the challenges, a project that predicts restaurant ratings from such a
dataset has significant future prospects, including improved accuracy, personalized
recommendations, better restaurant management and integration with other technologies. A
recommendation system could be built on the dropped “review_list” feature, which would lead
into NLP (Natural Language Processing).