Monsoon 2020 - Week 03: Computational Gastronomy
RECIPE DATA
We shall now investigate the relationship among cuisines, recipes, ingredients and
ingredient categories.
A more structured way of representing these recipes is a paired list that maps each
Recipe-ID to one Ingredient-ID per line:
Recipe-01 — Ingredient-01
Recipe-01 — Ingredient-02
Recipe-01 — Ingredient-15
Recipe-01 — Ingredient-19
Recipe-01 — Ingredient-06
Recipe-02 — Ingredient-06
Recipe-02 — Ingredient-12
Recipe-02 — Ingredient-08
Recipe-02 — Ingredient-15
.
.
.
Recipe-100 — Ingredient-06
Recipe-100 — Ingredient-11
Recipe-100 — Ingredient-15
Recipe-100 — Ingredient-13
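As a minimal sketch of how such a paired list might be used in practice, the snippet below groups the pairs by recipe into a mapping from each recipe ID to its set of ingredient IDs. The function name `pairs_to_recipes` and the sample pairs are illustrative assumptions, not part of the course material.

```python
from collections import defaultdict

def pairs_to_recipes(pairs):
    """Group (recipe_id, ingredient_id) pairs into {recipe_id: set of ingredient_ids}."""
    recipes = defaultdict(set)
    for recipe_id, ingredient_id in pairs:
        recipes[recipe_id].add(ingredient_id)
    return dict(recipes)

# Hypothetical sample in the same shape as the paired list above.
pairs = [
    ("Recipe-01", "Ingredient-01"),
    ("Recipe-01", "Ingredient-02"),
    ("Recipe-02", "Ingredient-06"),
    ("Recipe-02", "Ingredient-12"),
    ("Recipe-02", "Ingredient-08"),
]
recipes = pairs_to_recipes(pairs)
```

Storing ingredients as sets makes the recipe size simply `len(recipes[recipe_id])` and silently drops duplicate pairs.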
Recipe size distribution for world cuisines from 22 regions. For each cuisine, the number of
recipes of a given size is normalized and shown as ‘Percentage of recipes’. The inset shows the
distribution for all the recipes across the world regions. Adapted from Singh and Bagler,
‘Data-driven investigations of culinary patterns in traditional recipes across the world’, 2018
IEEE 34th International Conference on Data Engineering Workshops, Paris.
Every recipe that is part of a cuisine is a cultural legacy that has been transmitted over
generations. A recipe using a single ingredient (s=1) is not considered meaningful here,
since for the purpose of the present analysis we focus on recipes that have survived due
to the ‘magic’ of ingredient combinations.
Intuitively, recipes with few ingredients are easier to transmit. However, such recipes
may also be considered ‘too simple’ to serve as an iconic cultural legacy. On the other
hand, while complex recipes that combine a large number of ingredients potently represent
cultural nuances, such recipes would be rarer, both because the subtleties of the cooking
protocol are difficult to pass on and because they are harder to reproduce. With such
forces/constraints shaping recipe sizes, it is understandable that cuisines across the
world show an approximately normal distribution with a typical recipe size of s=10.
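The normalization described above can be sketched as follows: count how many recipes have each size, drop single-ingredient recipes, and express each count as a percentage of the remaining recipes. The function name `size_distribution`, the `min_size` parameter and the toy recipes are assumptions made for illustration only.

```python
from collections import Counter

def size_distribution(recipes, min_size=2):
    """Return {recipe size: percentage of recipes}, ignoring recipes smaller than min_size."""
    sizes = [len(ings) for ings in recipes.values() if len(ings) >= min_size]
    total = len(sizes)
    if total == 0:
        return {}
    counts = Counter(sizes)
    return {size: 100.0 * n / total for size, n in sorted(counts.items())}

# Toy data (hypothetical): recipe ID -> set of ingredients.
toy = {
    "R1": {"a", "b"},
    "R2": {"a", "b", "c"},
    "R3": {"b", "c", "d"},
    "R4": {"a"},  # s=1, excluded from the analysis
    "R5": {"a", "c", "d"},
}
dist = size_distribution(toy)
```

Plotting `dist` for each cuisine separately, and once for all recipes pooled together, would give the per-cuisine curves and the inset respectively.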
Frequency-rank statistics (also known as ingredient popularity statistics) for world
cuisines from 22 regions. The normalized frequency follows a power-law distribution, suggesting
skewed use/popularity of a few ingredients. The inset shows the statistics for all
the recipes across the world regions. Adapted from Singh and Bagler, ‘Data-driven
investigations of culinary patterns in traditional recipes across the world’, 2018 IEEE 34th
International Conference on Data Engineering Workshops, Paris.
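A frequency-rank table of this kind could be computed roughly as below: count the number of recipes each ingredient appears in, sort by decreasing count, and assign ranks. Dividing by the count of the most popular ingredient is an assumed normalization chosen for this sketch; the source does not spell out which normalization was used. The function name and toy data are likewise hypothetical.

```python
from collections import Counter

def frequency_rank(recipes):
    """Return [(rank, ingredient, normalized frequency)], most popular ingredient first."""
    counts = Counter(ing for ings in recipes.values() for ing in set(ings))
    if not counts:
        return []
    top = max(counts.values())  # assumed normalization: divide by the top count
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, ing, n / top) for rank, (ing, n) in enumerate(ranked, start=1)]

# Toy data (hypothetical): recipe ID -> set of ingredients.
toy = {
    "R1": {"salt", "onion"},
    "R2": {"salt", "onion", "cumin"},
    "R3": {"salt", "butter"},
    "R4": {"salt", "onion"},
}
table = frequency_rank(toy)
```

On a log-log plot of normalized frequency against rank, a power-law relationship would appear as an approximately straight line.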
Category Composition:
Category composition analysis is a coarse-grained view of recipe structure that probes the
presence/dominance of ingredients from each category.
The heatmap of the Category Composition matrix for world cuisines from 22 regions. The data
is normalized within each cuisine to represent the fraction of all ingredient instances that are
from a given category. Adapted from Singh and Bagler, ‘Data-driven investigations of culinary
patterns in traditional recipes across the world’, 2018 IEEE 34th International Conference on
Data Engineering Workshops, Paris.
The category composition matrix provides insights into the dominant ingredient
categories (spice, dairy, vegetable etc.) that dictate the recipes of a cuisine as well as
similarities/differences across cuisines by way of their ingredient use.
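One way to build such a matrix, sketched under the assumption that every ingredient has already been assigned to exactly one category, is to count ingredient instances per category within each cuisine and divide by the cuisine's total number of ingredient instances. The names `category_composition` and `category_of`, and the toy cuisines, are illustrative only.

```python
from collections import Counter

def category_composition(cuisines, category_of):
    """Return {cuisine: {category: fraction of all ingredient instances in that cuisine}}."""
    matrix = {}
    for cuisine, recipes in cuisines.items():
        counts = Counter(category_of[ing] for ings in recipes for ing in ings)
        total = sum(counts.values())
        matrix[cuisine] = {cat: n / total for cat, n in counts.items()} if total else {}
    return matrix

# Hypothetical ingredient-to-category lookup and two toy cuisines.
category_of = {"cumin": "spice", "chili": "spice", "milk": "dairy", "potato": "vegetable"}
cuisines = {
    "A": [["cumin", "chili", "potato"], ["cumin"]],
    "B": [["milk", "potato"], ["milk", "cumin"]],
}
comp = category_composition(cuisines, category_of)
```

Each row of the resulting matrix sums to 1, so rows can be compared directly across cuisines regardless of how many recipes each cuisine contributes.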
Frequent itemset mining captures this notion by identifying ingredient combinations (pairs,
triads and higher-order sets) that are frequently used across the recipes.
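To make the idea concrete, the sketch below counts every ingredient pair and triad by brute-force enumeration and keeps those whose support (the fraction of recipes containing the combination) meets a threshold. This is a deliberately simple stand-in, not an optimized algorithm such as Apriori or FP-Growth, and the function name, the `min_support` and `max_size` parameters, and the toy recipes are all assumptions.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(recipes, min_support=0.5, max_size=3):
    """Return {itemset (sorted tuple): support} for itemsets of size 2..max_size."""
    n = len(recipes)
    counts = Counter()
    for ings in recipes:
        items = sorted(set(ings))  # canonical order so each combination has one key
        for k in range(2, max_size + 1):
            counts.update(combinations(items, k))
    return {itemset: c / n for itemset, c in counts.items() if c / n >= min_support}

# Toy recipes (hypothetical).
toy = [
    ["onion", "tomato", "garlic"],
    ["onion", "tomato"],
    ["onion", "garlic", "ginger"],
    ["tomato", "garlic", "onion"],
]
freq = frequent_itemsets(toy, min_support=0.5)
```

Brute-force enumeration grows combinatorially with recipe size, so for real recipe collections a pruning-based miner would be the more practical choice.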