Welcome to Scribd!

Machine Learning Models For Detecting Recipe Site Traffic: Tasty Bytes

Uploaded by

0% found this document useful (0 votes)

5 views12 pages

This document discusses using machine learning models to predict which recipes on a recipe site will receive high traffic. The goals are to predict high traffic recipes 80% of the time. Analysis shows potato and vegetable recipes receive the most traffic. Most popular number of servings is 4. Nutritional values are right skewed. Correlation between variables is low. Models tested are KNN and logistic regression after preprocessing. Logistic regression outperforms KNN based on evaluation metrics. The document recommends deploying the logistic regression model and collecting more data to improve predictions.

Original Description:

presentation on data modeling case

Original Title

Presentation

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

0% found this document useful (0 votes)

5 views12 pages

Machine Learning Models For Detecting Recipe Site Traffic: Tasty Bytes

Uploaded by

dimagmaatkha

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

Jump to Page

You are on page 1of 12

Search inside document

Machine learning models for

detecting Recipe Site Traffic

Tasty Bytes
Business Goals:
● In a world filled with delicious food recipes, we need to find which ones attract
more users

● Our focus is on predicting which recipes will lead to high traffic

● We need to determine a way to predict high traffic recipes 80% of the time
Most Popular type of Recipes
We can observe that our
most popular categories
of recipes are Potato
and Vegetable recipes,
which account for 29%
of all High traffic. While
Breakfast and Drinks
receive 6.3% and 0.9%
of all High Traffic,
respectively, they are
our least popular dishes.
Most Popular amount of Servings
We can see that 4
servings, which receive
41% of both High Traffic
and Low Traffic, is the
most popular number of
servings overall.
Nonetheless, the
remaining servings
alternatives generate
almost the same
amount in both groups.
Spread of Nutritional Values
We can see that the
distribution of our
Nutritional Values
columns is all right
skewed, which means
that a few recipes are
really high in nutritional
values.
These values must be
standardised when
creating a model.
Correlation between Columns
When we examine the
Correlation between our
numerical values, we
find that there is no
significant pearson
correlation between any
2 variables, with proteins
and calories having the
largest correlation
(0.17).
Model Development:
This is a Classification challenge because we're determining whether or not a recipe
receives a lot of traffic.

The first step is to divide the data into input and output variables. Then, we encode
the categorical data and scale the numerical data. Next, we divided the data into
training and testing groups of 80/20.

Our basic model is a KNN model and our comparison model is a logistic regression
model.

Both use GridSearch to find optimal hyperparameters.

Confusion Matrix comparison

We can see that the Logistic

Regression is either
performing similarly or better
than the KNN model
ROC Curve of Both models
The Logistic Regression
model has a greater true
positive rate and a lower false
positive rate when compared
to the KNN model, according
to the ROC curves of both
models.
We can also see that the ROC
AUC score of the Logistic
Regression model is higher
than the KNN, confirming that
the LogReg model is better for
detecting High traffic recipes
Business Metrics
Measuring Precision, Recall, and F1-score is an effective metric to
evaluate the accomplishment of our two goals.

While The accuracy score and ROC AUC score are useful KPIs for
these models to gauge their predictive power.

Our logistic regression model performs better than the KNN model in
all the mentioned metrics
Recommendation
We can put our logistic regression model into production to aid the product manager in
predicting the high traffic of the recipes. By using this model, approximately 80% of the
predictions will ensure that there will be heavy traffic.

To implement and improve the model, I will consider the following steps:
1. Deploying the model into production, preferably using edge devices for ease and security
2. Collecting more data such as time to make, ingredients, site duration time,etc.
3. Feature Engineering such as increasing the number of values in category column,
creating more meaningful features from the variables
Thank You

Capstone Presentation: Telecom Churn Study
Document19 pages
Capstone Presentation: Telecom Churn Study
A d
100% (3)
Telco Customer Churn
Document11 pages
Telco Customer Churn
Hamza Qazi
100% (2)
Forecasting Part 2
Document66 pages
Forecasting Part 2
Gela Soriano
No ratings yet
Methods To Address Multicollinearity: A Project Report On
Document30 pages
Methods To Address Multicollinearity: A Project Report On
Abhishek Verma
No ratings yet
Recipe Recommendation
Document9 pages
Recipe Recommendation
erika
No ratings yet
Churn Analysis - Group 5 v.30.09.20
Document13 pages
Churn Analysis - Group 5 v.30.09.20
Zero Mustaf
No ratings yet
Multiple Linear Regression: Application
Document22 pages
Multiple Linear Regression: Application
danna ibanez
No ratings yet
Final Project Report DA
Document3 pages
Final Project Report DA
parulbatish2102
No ratings yet
Bias Varience Trade Off
Document35 pages
Bias Varience Trade Off
mobeen
100% (1)
All Analytics Topics
Document5 pages
All Analytics Topics
Operations Club NCR
No ratings yet
Appendix I Raking Article
Document10 pages
Appendix I Raking Article
Mataichi
No ratings yet
Statistics For Data Analytics Project - CA 2: Multiple Linear Regression and Binary Logistic Regression
Document11 pages
Statistics For Data Analytics Project - CA 2: Multiple Linear Regression and Binary Logistic Regression
Rahul Jaju
No ratings yet
Capstone Project Business: Predict Customer Churn in E-Commerce
Document10 pages
Capstone Project Business: Predict Customer Churn in E-Commerce
A d
100% (2)
Machine Learning
Document9 pages
Machine Learning
Sanjay Kumar
No ratings yet
Project Presentation
Document14 pages
Project Presentation
Manisha Shinde
No ratings yet
Lasso and Ridge Regression
Document30 pages
Lasso and Ridge Regression
Aarti
No ratings yet
Logistic Regression: Utkarsh Kulshrestha Data Scientist - Tcs Learnbay
Document21 pages
Logistic Regression: Utkarsh Kulshrestha Data Scientist - Tcs Learnbay
N Mahesh
100% (1)
Marketing Research: Data Analysis V: Regression Analysis (Part 1)
Document35 pages
Marketing Research: Data Analysis V: Regression Analysis (Part 1)
E W
No ratings yet
Multicollinearity
Document2 pages
Multicollinearity
Zubair Ahmed
No ratings yet
B5 (r2)
Document17 pages
B5 (r2)
Vk Att
No ratings yet
Ipsos Loyalty Multicollinearity
Document12 pages
Ipsos Loyalty Multicollinearity
Hashmat حشمت روحیان
No ratings yet
European Journal of Operational Research: Stefan Woerner, Marco Laumanns, Rico Zenklusen, Apostolos Fertis
Document14 pages
European Journal of Operational Research: Stefan Woerner, Marco Laumanns, Rico Zenklusen, Apostolos Fertis
julio perez
No ratings yet
Heteroscedasticity and Multicollinearity
Document4 pages
Heteroscedasticity and Multicollinearity
divyakantmeva
No ratings yet
We Are Intechopen, The World'S Leading Publisher of Open Access Books Built by Scientists, For Scientists
Document21 pages
We Are Intechopen, The World'S Leading Publisher of Open Access Books Built by Scientists, For Scientists
karncmu
No ratings yet
5.1 Naive Bayes Experimental Result
Document24 pages
5.1 Naive Bayes Experimental Result
Shafayet Uddin
No ratings yet
Primer of Applied Regression and Analysis of Variance (Glantz S.a., Slinker B.K., Neilands T.B)
Document1,472 pages
Primer of Applied Regression and Analysis of Variance (Glantz S.a., Slinker B.K., Neilands T.B)
samskulsky_309827443
No ratings yet
Linearity of Calibration Curves For Analytical Methods
Document21 pages
Linearity of Calibration Curves For Analytical Methods
Ana Carolina Santos
No ratings yet
Goodness of Fit Measures - University of Strathclyde
Document3 pages
Goodness of Fit Measures - University of Strathclyde
Julie Finch
No ratings yet
Final Report DCG
Document3 pages
Final Report DCG
David Christopher
No ratings yet
Multicolinearity 1
Document40 pages
Multicolinearity 1
Ali Hassan
No ratings yet
Multinomial Logistic Regression-3
Document19 pages
Multinomial Logistic Regression-3
Ali Hassan
No ratings yet
Acquisition Analytics Assignment
Document15 pages
Acquisition Analytics Assignment
Harshad Ambekar
No ratings yet
Lead Scoring Case Study Presentation
Document11 pages
Lead Scoring Case Study Presentation
Devanshi
100% (2)
Primer of Applied Regression and Analysis of Variance 3Rd Edition Glantz S A All Chapter
Document67 pages
Primer of Applied Regression and Analysis of Variance 3Rd Edition Glantz S A All Chapter
jeanette.rasmussen227
100% (6)
Machine Learning Model
Document9 pages
Machine Learning Model
Sanjay Kumar
No ratings yet
ML03
Document9 pages
ML03
juan lopez
No ratings yet
How To Use ROC Curves and Precision-Recall Curves For Classification in Python
Document47 pages
How To Use ROC Curves and Precision-Recall Curves For Classification in Python
Adrian Stan
No ratings yet
Regression
Document34 pages
Regression
Ravi goel
100% (1)
Primer of Applied Regression Analysis of Variance 3Rd Edition Edition Stanton A Glantz All Chapter
Document67 pages
Primer of Applied Regression Analysis of Variance 3Rd Edition Edition Stanton A Glantz All Chapter
jeanette.rasmussen227
100% (5)
Stat Guide Minitab
Document37 pages
Stat Guide Minitab
Septiana Rizki
No ratings yet
Multicollinearity and Remedies
Document23 pages
Multicollinearity and Remedies
deeksha3363
No ratings yet
12 Pengambilan Keputusan Dengan AHP
Document51 pages
12 Pengambilan Keputusan Dengan AHP
Titin Missa
No ratings yet
Pengambilan Keputusan Dengan: Proses Hirarki Analitik (Analytical Hierarchy Process-AHP)
Document51 pages
Pengambilan Keputusan Dengan: Proses Hirarki Analitik (Analytical Hierarchy Process-AHP)
RohmatDwiHastanto
No ratings yet
Community Project: Simple Linear Regression in SPSS
Document4 pages
Community Project: Simple Linear Regression in SPSS
Neva Asih
No ratings yet
03 Logistic Regression
Document23 pages
03 Logistic Regression
Harold Costales
No ratings yet
04 - Interval Estimation
Document42 pages
04 - Interval Estimation
Jason
No ratings yet
The Hidden Dangers of Historical Simulation
Document5 pages
The Hidden Dangers of Historical Simulation
Krishan Parwani
No ratings yet
Merge +1
Document107 pages
Merge +1
SHIKHA SHARMA
No ratings yet
77 MultipleRegression
Document4 pages
77 MultipleRegression
Sameeia
No ratings yet
Predicting Energy Values of Feeds WP WEISS
Document10 pages
Predicting Energy Values of Feeds WP WEISS
J Jesus Bustamante Gro
No ratings yet
625 Preliminary
Document39 pages
625 Preliminary
Gursimar Singh
No ratings yet
Testing Accuracy Ratio PDF
Document6 pages
Testing Accuracy Ratio PDF
Ngọc Ánh Trần
No ratings yet
Lecture 4 Linear Regression 1 07032024 082032pm
Document32 pages
Lecture 4 Linear Regression 1 07032024 082032pm
Abdul Mueed Paracha
No ratings yet
Mitigating The Hook Effect in Lateral Flow Sandwich
Document12 pages
Mitigating The Hook Effect in Lateral Flow Sandwich
Joana Barbosa
No ratings yet
CVS Landong Ravalo
Document18 pages
CVS Landong Ravalo
Lalaine Beatriz
No ratings yet
Anshul Dyundi Machine Learning July 2022
Document46 pages
Anshul Dyundi Machine Learning July 2022
Anshul Dyundi
50% (2)
Machine Learning Project: Raghul Harish
Document46 pages
Machine Learning Project: Raghul Harish
ARUNKUMAR S
100% (1)
BOP Assignment
Document9 pages
BOP Assignment
Saurav Prakash
No ratings yet
Book 2.0 - Python
Document143 pages
Book 2.0 - Python
Shiva Yadandla
No ratings yet
Linear Regression with Multiple Covariates
From Everand
Linear Regression with Multiple Covariates
Brett Kottmann
No ratings yet
Unit 17
Document12 pages
Unit 17
Muhammed Mikhdad K G 21177
No ratings yet
Decision Tree
Document4 pages
Decision Tree
Gurushantha Doddamani
No ratings yet
Assignment-2: 1) Explain Classification With Logistic Regression and Sigmoid Function
Document6 pages
Assignment-2: 1) Explain Classification With Logistic Regression and Sigmoid Function
pranesh
No ratings yet
Chapter 5
Document16 pages
Chapter 5
Edward
No ratings yet
Common Statistical Abbreviations and Symbols in APA Italics
Document4 pages
Common Statistical Abbreviations and Symbols in APA Italics
Irwanda Juni
No ratings yet
Ijeter 247112019
Document7 pages
Ijeter 247112019
Wiwit Supriyanti
No ratings yet
Probability and Statistics (MMT-008) : Practical Assignment File
Document48 pages
Probability and Statistics (MMT-008) : Practical Assignment File
Anurag Garg
No ratings yet
Review Jurnal Dan Buat Judul
Document10 pages
Review Jurnal Dan Buat Judul
H1ghr Gang
No ratings yet
Net09 28
Document11 pages
Net09 28
shardapatel
No ratings yet
Multiple-Choice Test Linear Regression Regression: y X y X y X X F y
Document2 pages
Multiple-Choice Test Linear Regression Regression: y X y X y X X F y
Madhu Ck
No ratings yet
DataScience With Python Course Content Syllabus Meritude
Document10 pages
DataScience With Python Course Content Syllabus Meritude
Kumar Udit
No ratings yet
ECON 550: Econometrics Exercise 5 - Panel Estimation: V - Shall. Estimate The Following Three Specifications
Document2 pages
ECON 550: Econometrics Exercise 5 - Panel Estimation: V - Shall. Estimate The Following Three Specifications
Tahmeed Jawad
No ratings yet
GSEMinstataintroduction
Document39 pages
GSEMinstataintroduction
AMELIA ULM
No ratings yet
ANOVA
Document23 pages
ANOVA
Akash Raj Behera
No ratings yet
Factor Analysis Spss Output Interpretation PDF
Document2 pages
Factor Analysis Spss Output Interpretation PDF
Nichole
100% (1)
Process/product Optimization Using Design of Experiments and Response Surface Methodology
Document22 pages
Process/product Optimization Using Design of Experiments and Response Surface Methodology
Dario Piñeres
No ratings yet
EV92
Document1,099 pages
EV92
Javi Quispillo Huamán
No ratings yet
BRM Unit V
Document99 pages
BRM Unit V
Anonymous 8G41ro6O
No ratings yet
Jawaban Uas Statistika
Document10 pages
Jawaban Uas Statistika
haidar rafi
No ratings yet
Analisis Respon Surface Design Box Behnken: "Optimasi Adsorpsi Kitosan Bertautsilang Glutaraldehida Terhadap Ion Fe (III) "
Document5 pages
Analisis Respon Surface Design Box Behnken: "Optimasi Adsorpsi Kitosan Bertautsilang Glutaraldehida Terhadap Ion Fe (III) "
Fega Rahmawati
No ratings yet
Project Report Forest Fire Final
Document26 pages
Project Report Forest Fire Final
Sudhanshu Singh
No ratings yet
Pengaruh Kompensasi Terhadap Kinerja Karyawan (Studi Pada Karyawan PT. Asuransi Jiwasraya Persero Regional Office Malang)
Document8 pages
Pengaruh Kompensasi Terhadap Kinerja Karyawan (Studi Pada Karyawan PT. Asuransi Jiwasraya Persero Regional Office Malang)
Andy Pradana
No ratings yet
CH 05
Document166 pages
CH 05
Ray Vega Lugo
No ratings yet
Data Problems: Multicollinearity and Inadequate Variation
Document4 pages
Data Problems: Multicollinearity and Inadequate Variation
rafiddu
No ratings yet
Pengaruh Media Sosial Instagram Dan Whatsapp Terhadap Pembentukan Budaya "Alone Together"
Document12 pages
Pengaruh Media Sosial Instagram Dan Whatsapp Terhadap Pembentukan Budaya "Alone Together"
Muhammad Farhan Syah
No ratings yet
Basic Eco No Metrics - Gujarati
Document48 pages
Basic Eco No Metrics - Gujarati
Faizan Raza
50% (2)
Using Outreg2 To Report Regression Output, Descriptive Statistics, Frequencies and Basic Crosstabulations
Document16 pages
Using Outreg2 To Report Regression Output, Descriptive Statistics, Frequencies and Basic Crosstabulations
Ricardo Medina
No ratings yet
4 - The Multiple Linear Regression - Parameter Estimation
Document10 pages
4 - The Multiple Linear Regression - Parameter Estimation
Miftahul Jannah
No ratings yet
Scikit-Learn Cheat Sheet Python For Data Science: Preprocessing The Data Evaluate Your Model's Performance
Document1 page
Scikit-Learn Cheat Sheet Python For Data Science: Preprocessing The Data Evaluate Your Model's Performance
Lourdes Victoria Urrutia
100% (1)
Statistics Notes 4 (F-Test (ANOVA), Pearson R and Linear Regression)
Document5 pages
Statistics Notes 4 (F-Test (ANOVA), Pearson R and Linear Regression)
Charmaine V. Bañes
0% (1)