SE Paper Chapter 2

RECIPE GENERATOR OF LOCAL DISHES THROUGH
AVAILABILITY OF INGREDIENTS USING COSINE SIMILARITY
John Paulo I. Perminola
Marielle Louise B. Tario
Kayshia Princess M. Malunes
Jhonella Jazmine R. Orquillo
Ella Monique T. Gonzales
A Proposed System Presented to the Faculty of the
Computer Studies Department
College of Science
Technological University of the Philippines
Ayala Boulevard, Manila
In Partial Fulfillment of the
Requirements for the Degree
Bachelor of Science in Computer Science
June 2022
CHAPTER 2
CONCEPTUAL FRAMEWORK
Recommendation Systems
Food recommendation systems (RSs) are software systems that generate personalized
recommendations from a large number of options, and hence offer a possible solution to information
overload and poor food choices. As more customers buy food or look for food-related content online,
RSs have recently been established for the internet food business. Food-related items include meal
preparation, recipes, ingredients, coffee shops, restaurants, restaurant menus, and grocery shopping.
This paper also considers how an RS presents its recommendations, because consumers may
misunderstand or disregard recommendations if the design is unsuitable.
Recommending recipes presents several unique challenges. There is no limit on the number of
ingredients that can be used in a recipe, and recipes are generally not rated by as many people as
movies or music. Such challenges can cause serious problems for traditional filtering models
(Lin et al., 2014). To generate recommendations, a variety of matchmaking approaches are used:
The content-based method anticipates consumers' preferences by looking at their product or
service ratings and purchase history.
The demographic approach draws on the opinions, evaluations, and preferences of others who
share the same demographics as the RS user (Paroudy et al., 2017).
Artificial Neural Network Architecture
Text generation has advanced with the spread of artificial neural network architectures such as
Generative Pre-trained Transformer 2 (Radford et al., 2015), long short-term memory (Hochreiter and
Schmidhuber, 1997), and BERT, a language representation model (Devlin et al., 2019), and with the
study of Merity et al. (2017). The density of a recipe rating matrix is much lower than that of a
movie rating matrix. A variety of projects now apply neural language models to recipe datasets,
including a dataset of 100,000 recipes (Wang et al., 2015). For the goal of named entity
recognition, an LSTM-based discriminative language model was developed and evaluated on a dataset
of cooking recipes. Yang et al. (2017) proposed reference-aware language models that create
instructions from the ingredients provided, using a dataset of 31,000 recipes. Kiddon et al. (2016)
presented a recurrent neural network that models global coherence. Web services relating to food,
such as recipe websites and food journaling applications, are fast gaining popularity. In these
systems, data from service providers and data provided by consumers frequently coexist. Compared to
the former, consumer-provided data is random and unorganized, so it is often difficult to
incorporate into common service features such as recommendation making and associative searching.
This work addresses this problem by investigating the effectiveness of multiple text embedding
methods in generating distributed representations of local dishes.
Majumder et al. (2019) proposed the task of personalized recipe generation and shared a dataset of
180K recipes and 700K user interactions (reviews). The authors used an encoder-decoder framework to
generate recipes and conducted an evaluation using text metrics. They encoded three embedding layers
(title, ingredient, and caloric level) using BERT, then decoded recipe steps using a two-layered GRU.
Lee et al. (2020) recently presented a demo paper of their system for the automatic generation of
cooking recipes utilizing the Recipe1M+ dataset and a language model. The evaluation of the model was
based on translation metrics. They focused on two separate tasks: ingredient generation and
instruction generation. In contrast, we use prepared food entities to generate complete recipes, which
allows pairwise comparison of the original and generated recipes composed of the same set of
ingredients. Marin et al. (2019) combined the Recipe1M+ dataset with 13 million food images to
generate joint embeddings of recipes and images. Their goal was to maximize the coherence of the
generated text with its corresponding image. Bossard et al. (2014) recognized and classified food
images into 101 food categories, utilizing a dataset consisting of approximately 100K images.
Salvador et al. (2019) used the Recipe1M+ dataset to generate simplified recipes lacking ingredient
quantities and units. They evaluated their model using a perplexity score as well as the adequacy
between the generated text and the image.
Machine Learning Methods
It is also essential to understand which items can be combined to create a good culinary
preparation. A beginner cook will find it challenging to choose the proper recipe from a list of items,
and even specialists may face difficulties. Machine learning is often used in our everyday lives; for
instance, image processing can be used to recognize things. Because recognizing food is complex,
owing to its many constituents, traditional procedures are prone to error. Machine learning and
deep learning approaches can help overcome these issues (Chenhall, 2010).
In recent years, food recognition, ingredient detection, and recipe suggestion have all received
considerable attention. These deep learning-based works on object recognition and categorization are
connected. For object detection and classification, there are sufficient deep learning models and
algorithms. Color histograms, bags of features (BoFs), linear kernel SVM classifiers, and K-nearest
neighbors are all common techniques (Rokon et al., 2022).
Food is an important part of our daily lives and, currently, there are many internet sites that
help us plan meals. This paper aimed to develop a tool that would help in pairing various ingredients
from different cuisines and in suggesting an alternate ingredient. The objective is to support the
innovation of new dishes and to help people allergic to certain ingredients by recommending alternate
ingredients. Researchers have proposed that machine learning methods can be used to build models to
complete a recipe. The suggested machine learning model is non-negative matrix factorization, whose
characteristic is that it gives a linear, non-negative, parts-based representation of data. However,
it does not consider information about flavor components, and hence the results are not accurate
(Colyer, 2019). Other researchers proposed a method for recommending alternative ingredients based on
co-occurrence relations in a recipe database, with two algorithms for recommending alternative
ingredients. Looking and tasting experiments confirmed that each of the proposed methods was
effective for its intended purpose.
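The non-negative matrix factorization mentioned above can be illustrated with a short sketch. It factors a recipe-by-ingredient matrix V into non-negative factors W and H so that V is approximately W times H. The multiplicative update rule used here is one common solver and is an assumption, since the cited work does not say which solver it used; the matrix RECIPE_INGREDIENT is hypothetical toy data.

```python
import random

# Hypothetical toy matrix: rows are recipes, columns are ingredients,
# 1 means the recipe uses the ingredient.
RECIPE_INGREDIENT = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]


def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor v into non-negative w (n x rank) and h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


w, h = nmf(RECIPE_INGREDIENT, rank=2)
print(round(frobenius_error(RECIPE_INGREDIENT, w, h), 4))
```

Because every update only multiplies by non-negative ratios, W and H stay non-negative, which is what gives the parts-based reading: each row of H can be read as a group of ingredients that tend to appear together.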
The state of the art in language modeling is rapidly evolving. Recurrent neural networks, such as
the LSTM (long short-term memory) network, were popular a few years ago (Gers and Schmidhuber, 2001)
and served as the state of the art for modeling sequence data. These models had some fundamental
flaws when applied to natural language. Lengthy training that could not be parallelized and limited
possibilities for transfer learning were among the problems that long affected statistical language
models (Vaswani et al., 2017).
Recent research in the area changed this situation. In the series of breakthroughs referred to as
"NLP's ImageNet moment" (Ruder, 2018), new models such as ULMFiT (Universal Language Model
Fine-tuning for Text Classification) (Howard and Ruder, 2018), ELMo (Embeddings from Language
Models) (Peters et al., 2018), and OpenAI GPT (Generative Pre-Trained Transformer) (Radford et al.,
2018) emerged, each of them solving some of the previously stated problems. In this situation, it was
reasonable to merge these solutions into a universal language model that offers deep context
understanding, can be pretrained, and can be fine-tuned relatively easily.
Data Gathering
Data were extracted from all of the sites once they had been collected. To begin, the texts of the literature
reviews were combined to create a summary of the several types of food RSs, issues and solutions, and
future research ideas. To make things easier, a color code was applied. Following that, a table was
created with descriptive data for the total number of studies per: publication year, food type, RS
approach, decision-making stage, RS type, and research type.
In the first part of its data gathering, Kishor Morol's paper employed transfer learning to train
its CNN model. Transfer learning is the process of reusing a previously trained model; ResNet50 was
used as the pre-trained model. Transfer learning is popular in deep learning because it can train
deep neural networks with relatively little data. It is efficient because most real-world problems
do not have millions of labeled data points with which to train such sophisticated networks. Using
transfer learning, it was possible to achieve good outcomes with less data, which is why the authors
chose it. In a deep learning model, the first layers recognize simple forms, the later layers
recognize more complicated visual patterns, and the final layer produces predictions (Marcelino, 2018).
To increase the accessibility of your manuscript, you should set the title and language metadata.
On Word for Windows, open the File tab and click on Info. On Word for Mac, click the File Menu and
select Properties, then click the Summary tab. Fill in the title of your document. For anonymous review,
clear the ‘author’ field. (Capprette,2018)
According to the website Geoviz, web scraping is a technique employed to extract large
amounts of data from websites, whereby the data is extracted and saved to a local file on a computer
or to a database in table format. The ingredient pairing module pairs the ingredients of the selected
recipe with other ingredients to create a new dish. For this, tf-idf and cosine similarity are
used. Tf-idf (term frequency-inverse document frequency) assigns weights to ingredients based on
their flavor components. A matrix is formed for all ingredients present in the dataset and all the
flavor components. Each ingredient is treated as a single document, and flavor components are treated
as the terms of the documents. The tf-idf score is calculated for all ingredients on the basis of this
matrix. Then, for the selected ingredient, a matching pair is found by calculating the cosine of the
angle between the vectors of two ingredients.
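The pairing step described above can be sketched as follows. This is a minimal illustration rather than the system's implementation: it treats each ingredient as a document whose terms are flavor components, weights them with a plain tf-idf formula (term frequency times log(N/df), one of several common variants), and picks the partner with the highest cosine similarity. The FLAVOR_PROFILES data and the function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: ingredient -> list of flavor components.
FLAVOR_PROFILES = {
    "tomato": ["hexanal", "linalool", "geranial"],
    "basil": ["linalool", "eugenol", "geranial"],
    "garlic": ["allicin", "diallyl_disulfide"],
    "onion": ["diallyl_disulfide", "propanethiol"],
}


def tfidf_vectors(profiles):
    """Build one tf-idf vector per ingredient over all flavor components."""
    n_docs = len(profiles)
    doc_freq = Counter()
    for components in profiles.values():
        doc_freq.update(set(components))
    vocab = sorted(doc_freq)
    vectors = {}
    for name, components in profiles.items():
        tf = Counter(components)
        total = len(components) or 1  # avoid division by zero
        vectors[name] = [
            (tf[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocab
        ]
    return vectors


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_pair(ingredient, profiles):
    """Return the most similar other ingredient and all similarity scores."""
    vectors = tfidf_vectors(profiles)
    target = vectors[ingredient]
    scores = {
        other: cosine(target, vec)
        for other, vec in vectors.items()
        if other != ingredient
    }
    return max(scores, key=scores.get), scores


partner, scores = best_pair("tomato", FLAVOR_PROFILES)
print(partner)
```

With this toy data, tomato pairs with basil because they share two flavor components, while ingredients with no shared component score zero.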
Alternative ingredient recommendation is used to recommend an alternative ingredient
if a particular ingredient is not present or cannot be used in the recipe. A word2vec model is
created to perform this function. Vectors are positioned in the vector space such that ingredients
that share common contexts in the corpus are located in close proximity to one another. This method
computes the cosine similarity between a simple mean of the projection weight vectors of the given
words and the vector for each word in the model. The food pairing hypothesis states that two
ingredients sharing a common flavor profile taste good when used together in a recipe. In Indian
cuisine, vegetables, fruits, cereals, pulses, and nuts support the food pairing hypothesis, whereas
spices and dairy products do not. A state diagram is used to represent the condition of the system,
or part of the system, at finite instances of time. It is a behavioral diagram that represents
behavior using finite state transitions. A state diagram is used to model the dynamic behavior of a
class in response to time and changing external stimuli.
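The alternative-ingredient lookup described earlier in this paragraph (cosine similarity between the mean of the query vectors and every vector in the model) can be sketched without a trained model. The three-dimensional EMBEDDINGS below are hypothetical stand-ins for word2vec vectors, which in practice would come from a model trained on a recipe corpus; the function name most_similar is chosen here for illustration.

```python
import math

# Hypothetical embeddings standing in for trained word2vec vectors.
EMBEDDINGS = {
    "butter": [0.9, 0.1, 0.0],
    "margarine": [0.8, 0.2, 0.1],
    "olive_oil": [0.6, 0.5, 0.1],
    "chili": [0.0, 0.1, 0.9],
}


def unit(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def most_similar(words, embeddings, topn=2):
    """Rank the other words by cosine similarity to the mean of `words`."""
    unit_vecs = {w: unit(v) for w, v in embeddings.items()}
    dim = len(next(iter(unit_vecs.values())))
    # Simple mean of the (normalized) query vectors, then renormalize.
    mean = unit([
        sum(unit_vecs[w][i] for w in words) / len(words) for i in range(dim)
    ])
    scores = [
        (w, sum(a * b for a, b in zip(mean, vec)))
        for w, vec in unit_vecs.items()
        if w not in words
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:topn]


for word, score in most_similar(["butter"], EMBEDDINGS):
    print(word, round(score, 3))
```

Passing several words, for example two ingredients already in the recipe, averages their vectors first, so the suggestion reflects the combined context rather than a single ingredient.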
Challenges and Solutions
Several challenges for food RSs, and possible solutions, are discussed below:
Food journals are a method of better predicting a user's food choices. Users can keep track of
their eating habits, such as portion sizes and calories, in a meal journal. However, keeping a meal record
requires a significant amount of effort from the user, and users frequently forget or provide incorrect
information (Benton, 2015).
User ratings can also be used to learn more about a user's food preferences. Unfortunately,
collecting enough user ratings while making the food RSs convenient and reducing user effort is
difficult. Furthermore, persuading people to keep rating dishes, recipes, or culinary items is tough.
Balancing a large database with user satisfaction: a large database is advantageous since the food
RSs can recommend more food items that are more appropriate for the user's health situation or food
preferences. Food RSs should, however, strike a balance between the number of food products in the
database and user satisfaction with the system's response time.
Kishor Morol’s paper proposed a CNN model for recognizing food items as well as a recipe
suggestion algorithm for preparing foods based on the detected ingredients. A custom dataset of 32
food ingredient categories was also introduced. The authors reported a testing accuracy of 94% and
demonstrated that the model distinguishes food items in photos well.
The food recommendation system developed serves as a tool for innovating new dishes, pairing
ingredients from different cuisines, and suggesting alternate ingredients. The system may also help
maximize the use of the ingredients the user has available, and can help people with food allergies
by recommending an alternative ingredient for a dish.
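To illustrate how available ingredients could be matched to dishes with cosine similarity, the sketch below represents the user's pantry and each recipe as a binary ingredient set. For binary vectors, cosine similarity reduces to the number of shared ingredients divided by the square root of the product of the set sizes. The RECIPES data and function names are hypothetical, and this is a sketch under those assumptions rather than the proposed system's code.

```python
import math

# Hypothetical recipe data for illustration only.
RECIPES = {
    "adobo": {"chicken", "soy_sauce", "vinegar", "garlic", "bay_leaf"},
    "sinigang": {"pork", "tamarind", "tomato", "radish", "kangkong"},
    "tinola": {"chicken", "ginger", "garlic", "papaya", "chili_leaves"},
}


def cosine_binary(a, b):
    """Cosine similarity of two ingredient sets viewed as 0/1 vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def rank_recipes(available, recipes):
    """Sort recipes from most to least similar to the available ingredients."""
    scored = [
        (name, cosine_binary(available, ingredients))
        for name, ingredients in recipes.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


pantry = {"chicken", "garlic", "vinegar", "soy_sauce"}
for name, score in rank_recipes(pantry, RECIPES):
    print(name, round(score, 3))
```

A recipe that shares no ingredient with the pantry scores zero, so a threshold on the score could be used to hide dishes the user cannot realistically cook.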
Food pairing and suggestion of alternative ingredients are features of this food recommendation
system. New dishes can be created through food pairing, using term frequency-inverse document
frequency and cosine similarity. The alternative ingredient recommendation module replaces an
ingredient in a recipe by recommending an alternative with the help of a word2vec model.
Overall, users can use the proposed system to create new dishes and to replace any ingredient in
a recipe in order to complete it.
REFERENCES
[1] Prabhakaran, S. (2018). Cosine Similarity – Understanding math and how it works (with python codes)
[2] Cournapeau, D. (2007). Scikit Learn
[3] Leitch, J. (2020). Building a Recipe Recommendation API using Scikit-Learn, NLTK, Docker, Flask, and Heroku
[4] Gupta, T. (2019). Data Preprocessing in Python
[5] Delfstack (2021). Cosine Similarity in Python
[6] Tutorialspoint Scikit Learn Tutorial, Retrieved from: https://www.tutorialspoint.com/scikit_learn/index.htm
[7] Grimley, B. (2015). What is neurolinguistics programming, (nlp)? The development of a grounded theory of neuro-linguistic
programming within an action research journey
[8] Lin, C. et al. (2014). A Content-Based Matrix Factorization Model for Recipe Recommendation
[9] Radford, A. (2015). Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
[10] Devlin, J. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
[11] Hochreiter, S., Schmidhuber, J. (1997). Long Short-Term Memory
[12] Merity, S. (2017). Regularizing and Optimizing LSTM Language Models
[13] Wang, X. et al. (2015). Recipe Recognition with Large Multimodal Food Dataset
[14] Yang, Z. et al. (2017). Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
[15] Kiddon, C. et al. (2016). Globally Coherent Text Generation with Neural Checklist Models
[16] Felix A. Gers and Jürgen Schmidhuber. ‘LSTM recurrent networks learn simple context-free and context-sensitive languages’.
In: IEEE Trans. Neural Networks 12.6 (2001), pp. 1333–1340. DOI: 10.1109/72.963769.
[17] Ashish Vaswani et al. ‘Attention is All you Need’. In: Advances in Neural Information Processing Systems 30: Annual
Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA. Ed. by Isabelle
Guyon et al. 2017, pp. 5998–6008.
[18] Sebastian Ruder. NLP’s ImageNet moment has arrived. 12th July 2018. Retrieved from: https://ruder.io/nlp-imagenet/ (visited
on 31/10/2019).
[19] Jeremy Howard and Sebastian Ruder. ‘Universal Language Model Fine-tuning for Text Classification’. 2018, pp. 328–339.
ISBN: 978-1-948087-32-2. DOI: 10.18653/v1/P18-1031
[20] Matthew E. Peters et al. ‘Deep Contextualized Word Representations’. 2018, pp. 2227–2237. ISBN: 978-1-948087-27-8.
[21] Alec Radford et al. ‘Improving Language Understanding by Generative Pre-Training’. 2018. Retrieved from:
https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
[22] Majumder, B. (2019). Generating Personalized Recipes from Historical User Preferences
[23] Lee, H. et al. (2020). RecipeGPT: Generative pretraining based cooking recipe generation and evaluation system
[24] Marin, J. et al. (2019). Recipe1M+: A dataset for learning cross-modal embeddings for cooking recipes and food images
[25] Bossard, L. et al. (2014). Food-101 – Mining discriminative components with random forests
[26] Salvador, A. et al. (2019). Inverse cooking: Recipe generation from food images
[27] Paroudy, P. et al. (2017). Investigating the effects of smart technology on customer dynamics and customer experience
[28] Chenhall, C. (2010). Improving Cooking and Food Preparation Skills: A Synthesis of the Evidence to Inform Program and Policy
Development and Improving Cooking and Food Preparation Skills: A Profile of Promising Practices in Canada and Abroad
[29] Rokon, S. (2022). Food Recipe Recommendation Based on Ingredients Detection Using Deep Learning
[30] Colyer, A. (2019). The why and how of nonnegative matrix factorization.
https://blog.acolyer.org/2019/02/18/the-why-and-how-of-nonnegative-matrix-factorization/
[31] Marcelino, P. (2018). Transfer learning from pre-trained models.
https://towardsdatascience.com/transfer-learning-from-pre-trained-models-f2393f124751
[32] Capprette, H. (2018). Creating Accessible Word Documents – Setting Language and Title.
https://pressbooks.ulib.csuohio.edu/accessibility/chapter/chapter-2-creating-accessible-word-documents/
[33] gvadmin (2018). Ethics and Uses of Web Scraping. https://geo-viz.com/blog/ethics-and-uses-of-web-scraping/
[34] Benton, D. (2015). Portion Size: What We Know and What We Need to Know.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4337741/
