
Student Cafeteria Consumptions and Prediction

Abu Sufyan

May 29, 2024

Project Members:

Abu Sufyan

Talha Khan

Salman Ahmad

Ahmad Ghafoor

Supervisor:

Dr. Abbas Abbasi

BSc. Electronic Engineering, Session 2021-2025

Department of Electronic Engineering

The Islamia University of Bahawalpur

Abstract

This project uses machine learning techniques to analyze and forecast student cafeteria consumption
patterns. Transactional data from the cafeteria’s Point of Sale (POS) system, together with student
demographics and the schedule of special events, was used to identify major trends; for example, lunch
sales rose during exam periods. The data was prepared for analysis and modeling through preprocessing
steps such as cleaning, normalization, and feature engineering.[2] We then analyzed how the different
variables relate over time and compared several prediction models: Linear Regression, Decision Trees,
Random Forests, Support Vector Machines, and Neural Networks.[1] Neural Networks performed best
(error: 11.9, R²: 0.78). The main objective is to reduce food waste and improve operational efficiency in
the institution’s kitchen by understanding eating habits and predicting future demand. The final model
was deployed in a user-friendly web application that gives real-time estimates of consumption
requirements, so that managers can plan production quantities with confidence. Our findings indicate that
machine learning can improve operational efficiency in institutional food services. Future work will
consider individual student preferences, real-time data, and multiple locations.

Contents

1 Introduction
2 Data Collection
  2.1 Data Sources
    2.1.1 Past Sales Information
    2.1.2 Calendar
    2.1.3 Weather
    2.1.4 Events Day
3 Data Preprocessing
  3.1 Data Cleaning
    3.1.1 Handling Missing Values
  3.2 Normalization and Standardization
    3.2.1 Normalization
    3.2.2 Standardization
    3.2.3 Encoding Categorical Variables
  3.3 Data Splitting
4 Exploratory Data Analysis (EDA)
  4.1 Dataset
  4.2 Graph Distributions
    4.2.1 Each Day Sale
    4.2.2 Item Sale for Different Weather Conditions
    4.2.3 Average Quantity Sold in Events
    4.2.4 Total Price of Item
    4.2.5 Sales by Weather Condition
    4.2.6 Total Sales by Event
    4.2.7 Total Sales by Month
5 Predictive Modeling
  5.1 Model Selection
    5.1.1 GradientBoostingRegressor
  5.2 Model Training and Evaluation
    5.2.1 Loading Data
6 Model Development
  6.1 Coding
7 Results
  7.1 App Development
    7.1.1 GUI App
    7.1.2 Prediction
    7.1.3 Sunny and Normal Day
    7.1.4 Cloudy and Musical Concert
    7.1.5 Holiday and Rainy
    7.1.6 Exam Day
8 Conclusion

List of Figures

1 Dataset of Cafeteria
2 Day wise Sale
3 Bar Plot of Item sale in different weather
4 Bar Plot of Item sale in different weather
5 Total Price of each Item in PKR
6 Yearly Sale by Weather Condition in PKR
7 Total Sale by Events in PKR
8 Total Sales by Month in PKR
9 Predicted Price for Sunny and Normal Day in PKR
10 Predicted Price for Cloudy and Musical Concert in PKR
11 Predicted Price for Holiday and Rainy in PKR
12 Predicted Price for Exam Day and Fog in PKR

List of Tables

1 Introduction

Efficient management of food services in a student cafeteria is crucial for keeping operations smooth,
reducing food waste, and ensuring student satisfaction. Understanding how students consume food helps
achieve these objectives by optimizing inventory levels, scheduling staff, and tailoring menu options to
meet demand.

This project studies consumption data from a student cafeteria and builds a model to anticipate future
demand. Using machine learning techniques, it aims to uncover patterns in the data that offer insights for
cafeteria management.[3] Accurate forecasts of food consumption can help minimize waste, keep items in
stock, and improve the dining experience for students.

The data analyzed in this research comes from the cafeteria’s Point of Sale (POS) system. It includes
transaction records, student demographics, and information on special occasions such as holidays and exam
periods. This extensive dataset makes it possible to examine the factors that influence consumption trends.

The following sections of this report describe how the data was gathered and prepared, the exploratory
analysis carried out to identify patterns, the machine learning models assessed for prediction accuracy, and
the deployment of the selected model. The report ends by discussing the findings, their impact on cafeteria
operations, and possible future directions. This study shows how data-driven methods can improve the way
food services are run in schools, promoting sustainability and student satisfaction.

2 Data Collection

Detailed and pertinent data is needed to build a strong prediction model for student cafeteria usage.
The primary data types to acquire are listed below, along with possible sources and techniques for
collecting them.

2.1 Data Sources

2.1.1 Past Sales Information

• Data Points: Item, Quantity, Date, Price, Time

• Sources: The cafeteria’s Point of Sale (POS) systems.

• Techniques: Export the data from the POS systems and make sure timestamps are accurate, so that
hourly patterns can be captured.
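
As a minimal sketch of how hourly patterns might be extracted from such an export, the example below parses timestamps and sums quantities per hour. The sample rows, the column layout, and the date and time formats are assumptions for illustration, not the report's actual export.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical POS export; column names and the timestamp format are assumptions.
SAMPLE_EXPORT = """Item,Quantity,Date,Price,Time
Samosa,3,2024-03-04,50,12:15
Tea,2,2024-03-04,40,12:40
Biryani,1,2024-03-04,250,13:05
Tea,4,2024-03-05,40,12:55
"""

def hourly_quantities(csv_text):
    """Sum the quantity sold in each hour of the day across all rows."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Parsing date and time together validates both; malformed rows raise ValueError.
        stamp = datetime.strptime(f"{row['Date']} {row['Time']}", "%Y-%m-%d %H:%M")
        totals[stamp.hour] += int(row["Quantity"])
    return dict(totals)

hourly = hourly_quantities(SAMPLE_EXPORT)
print(hourly)
```

Letting malformed timestamps raise an error, rather than skipping them silently, is one way to check that the exported timestamps are accurate.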

2.1.2 Calendar

• Data Points: Public holidays and the academic calendar (term start and end dates, holidays, exam
periods).

• Sources: Public holiday databases and the university academic calendar.

• Methods: Scrape public holiday data from reputable websites and consult the online academic
calendar.

2.1.3 Weather

• Data Points: Weather conditions (sunny, rainy, etc.) and temperature.

• Sources: Weather APIs, such as Weather.com and OpenWeatherMap.

• Methods: Use API requests to gather historical and current weather information for the campus
location.

2.1.4 Events Day

• Special activities: On-campus events (concerts, sporting events, guest lectures) and off-campus
activities that influence campus life.

• Sources: Local event listings and university event calendars.

3 Data Preprocessing

A number of procedures are used in data preprocessing to clean, modify, and arrange raw data into a format

that is appropriate for modelling and analysis. This is a theoretical summary of the main procedures in data

preprocessing for forecasting student cafeteria usage.

3.1 Data Cleaning

3.1.1 Handling Missing Values

• Identification: We can detect the missing values in the dataset.

• Fill Missing Values: We can fill missing values with substitutes such as the mean, median, or
mode for numerical data and the most frequent category for categorical data.

• Remove Missing Values: We can remove the affected rows or columns when they contain too many
missing values to be filled reliably.
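
The three steps above can be sketched with pandas as follows. The sample values, the column names, and the "more than half missing" removal threshold are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample; the column names are assumptions for illustration.
df = pd.DataFrame({
    "Quantity": [2.0, None, 4.0, 6.0],
    "Weather": ["Sunny", "Rainy", None, "Sunny"],
    "Event": ["Normal", None, None, None],
})

# Identification: count missing values per column.
missing_counts = df.isna().sum()

# Removal: drop columns where more than half of the values are missing.
mostly_empty = missing_counts[missing_counts > len(df) / 2].index
cleaned = df.drop(columns=mostly_empty)

# Filling: median for numeric columns, most frequent category otherwise.
for col in cleaned.columns:
    if pd.api.types.is_numeric_dtype(cleaned[col]):
        cleaned[col] = cleaned[col].fillna(cleaned[col].median())
    else:
        cleaned[col] = cleaned[col].fillna(cleaned[col].mode().iloc[0])

print(cleaned)
```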

3.2 Normalization and Standardization

3.2.1 Normalization

• Scaling: We can adjust the numerical features to a common scale between 0 and 1 to make them
comparable and improve model performance.

3.2.2 Standardization

• Rescaling: We can rescale the numerical features to have a mean of 0 and a standard deviation of 1,
which is particularly useful when features have widely varying scales.
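
Both rescalings, normalization to the range 0 to 1 and standardization to mean 0 and standard deviation 1, can be sketched in a few lines of plain Python. The temperature values are hypothetical, and the population standard deviation is used here as one common convention.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Map values linearly onto [0, 1]; assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift to mean 0 and scale to population standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

temperatures = [10.0, 20.0, 30.0, 40.0]  # hypothetical temperatures in °C
normalized = min_max_scale(temperatures)
standardized = standardize(temperatures)
print(normalized)
print(standardized)
```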

3.2.3 Encoding Categorical Variables

• One-Hot Encoding: We can convert categorical variables into binary columns so that they can be
used by machine learning algorithms.
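
A small illustration of one-hot encoding with pandas is given below. The rows and category labels are assumptions; `pd.get_dummies` is used here for brevity, whereas a fitted scikit-learn `OneHotEncoder` is the more usual choice inside a modelling pipeline.

```python
import pandas as pd

# Hypothetical rows; the category labels are assumptions.
df = pd.DataFrame({
    "Weather": ["Sunny", "Rainy", "Cloudy", "Sunny"],
    "Temperature": [35, 22, 28, 38],
})

# Each distinct Weather value becomes its own 0/1 column.
encoded = pd.get_dummies(df, columns=["Weather"], dtype=int)
print(encoded)
```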

3.3 Data Splitting

• Training and Testing: We can divide the dataset into training and testing subsets to evaluate
the performance of our model.

• Validation Set: We can create a validation set to tune the model parameters and avoid overfitting.
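
One possible way to obtain training, validation, and testing subsets is to call scikit-learn's `train_test_split` twice. The 60/20/20 proportions and the toy data below are assumptions, not values taken from the report.

```python
from sklearn.model_selection import train_test_split

# 100 hypothetical samples: one feature value and one target per sample.
X = [[i] for i in range(100)]
y = [2 * i for i in range(100)]

# First hold out 20% for testing, then 25% of the rest (20% overall) for validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=42
)
print(len(X_train), len(X_val), len(X_test))
```

For sales data ordered in time, a chronological split (earlier days for training, later days for testing) may give a more realistic estimate than a shuffled one.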

4 Exploratory Data Analysis (EDA)

Exploratory Data Analysis (EDA) is a fundamental step in the data science process in which analysts
examine and summarize the main characteristics of a dataset. EDA aims to uncover patterns, spot
anomalies, frame hypotheses, and check assumptions through statistical graphics and other data
visualization techniques.

4.1 Dataset

Figure 1: Dataset of Cafeteria

4.2 Graph Distributions

4.2.1 Each Day Sale

Figure 2: Day wise Sale

4.2.2 Item Sale for Different Weather Conditions

Figure 3: Bar Plot of Item sale in different weather

4.2.3 Average Quantity Sold in Events

Figure 4: Bar Plot of Item sale in different weather

4.2.4 Total Price of Item

Figure 5: Total Price of each Item in PKR

4.2.5 Sales by Weather Condition

Figure 6: Yearly Sale by Weather Condition in PKR

4.2.6 Total Sales by Event

Figure 7: Total Sale by Events in PKR

4.2.7 Total Sales by Month

Figure 8: Total Sales by Month in PKR

5 Predictive Modeling

With predictive modelling, past data is used to build a model that can forecast future events using machine

learning algorithms and statistical methodologies. The process involves picking pertinent features, suitable

algorithms, training the model, and assessing its efficacy in order to anticipate campus cafeteria consumption

by students.

5.1 Model Selection

5.1.1 GradientBoostingRegressor

GradientBoostingRegressor is an algorithm for regression problems. It builds an ensemble of weak
prediction models, usually decision trees, one iteration at a time: each iteration adds a tree fitted to
reduce the error that remains after the previous trees. Gradient boosting is widely used in machine
learning for both regression and classification.[4] Because the method is iterative, each new tree attempts
to address the shortcomings of its predecessors.
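
To make the "each new tree corrects its predecessors" idea concrete, the sketch below hand-rolls gradient boosting for squared error, where every tree is fitted to the residuals of the current ensemble. It is an illustration on synthetic data, not the scikit-learn implementation used in the project; the function names are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_boosted_trees(X, y, n_estimators=100, learning_rate=0.1, max_depth=3):
    """Gradient boosting for squared error: each tree fits the current residuals."""
    base = float(np.mean(y))              # initial constant prediction
    prediction = np.full(len(y), base)
    trees = []
    for _ in range(n_estimators):
        residuals = y - prediction        # negative gradient of squared error
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residuals)
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return base, trees

def predict_boosted(base, trees, X, learning_rate=0.1):
    """Sum the base value and the shrunken contribution of every tree."""
    return base + learning_rate * sum(tree.predict(X) for tree in trees)

# Synthetic data (an assumption for illustration, not the cafeteria dataset).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

base, trees = fit_boosted_trees(X, y)
mse_start = float(np.mean((y - base) ** 2))
mse_end = float(np.mean((y - predict_boosted(base, trees, X)) ** 2))
print(mse_start, mse_end)
```

The training error after boosting (`mse_end`) should be well below the error of the constant starting prediction (`mse_start`), which is the sequential improvement described above.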

5.2 Model Training and Evaluation

5.2.1 Loading Data

• Data Preparation: The dataset is preprocessed by handling missing values, encoding the
categorical variables, and selecting the features under consideration.

• Feature Selection: Select the relevant features (Quantity, Item, Price (PKR), Temperature (°C), Day,
Weather, Event) and the target variable (Total Price (PKR)).

• Model Initialization: Initialize GradientBoostingRegressor with parameters such as the number of
estimators (100), the learning rate (0.1), and the maximum depth (3).

• Model Training: Train the model on the training set using the fit method.

• Making Predictions: Predict the target variable on the testing set using the predict method.

• Model Evaluation: Calculate the performance metrics MSE, RMSE, and R².
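
The three evaluation metrics named in the last step can be written out directly, which makes their definitions explicit. The function name and the sample values below are hypothetical and are not results from the report.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (MSE, RMSE, R²) for paired lists of actual and predicted values."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return mse, rmse, r2

# Hypothetical daily totals in PKR, not results from the report.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
mse, rmse, r2 = regression_metrics(actual, predicted)
print(mse, rmse, r2)
```

RMSE is in the same unit as the target (PKR here), while R² is unitless and compares the model against always predicting the mean.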

6 Model Development

Model development involves several stages: understanding the problem, preparing the data, selecting an
appropriate model, training it, evaluating its performance, and deploying it for practical use. The process
starts with defining the objective and identifying the target variable. This is followed by collecting and
cleaning the data to handle missing values and outliers. Data transformation and feature engineering
convert categorical data to numerical formats and create relevant features. Model selection includes
choosing the right algorithm and tuning its hyperparameters. Training fits the model to the training data;
Gradient Boosting, the technique used here, improves model performance sequentially. Evaluation uses
metrics such as MAE, MSE, RMSE, and R² to assess accuracy and to check that the model generalizes well
to new data. Finally, the trained model is saved and integrated into a production environment, where
continuous monitoring maintains performance.
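
As a rough sketch of the tuning and saving stages, the example below runs a small cross-validated grid search and then serializes the best model. The synthetic data, the parameter grid, and the use of `pickle` are all assumptions for illustration; the report does not state which grid or storage format was used.

```python
import pickle

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data (an assumption; stands in for the prepared features).
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(120, 3))
y = 5 * X[:, 0] + 2 * X[:, 1] + 0.1 * rng.normal(size=120)

# Small hypothetical grid; real grids depend on the data and the time budget.
param_grid = {"n_estimators": [50, 100], "learning_rate": [0.05, 0.1], "max_depth": [2, 3]}

search = GridSearchCV(
    GradientBoostingRegressor(random_state=42),
    param_grid,
    cv=3,                                  # 3-fold cross-validation per combination
    scoring="neg_mean_squared_error",
)
search.fit(X, y)
print(search.best_params_)

# Serialize the refitted best model, then restore it as a deployment step would.
blob = pickle.dumps(search.best_estimator_)
restored = pickle.loads(blob)
```

A pickled model should only be loaded from a trusted source and with a compatible scikit-learn version.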

6.1 Coding

import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline

# Load the data
df = pd.read_csv('Sufyan(2).csv')

# Convert the 'Date' column to datetime format with explicit format specification
# (the format string is cut off in the source listing)
df['Date'] = pd.to_datetime(df['Date'], format='…')

# Handle any parsing errors (if there are any)
df = df.dropna(subset=['Date'])

# Aggregate the data by date to get the total price for each day
daily_data = df.groupby(
    ['Date', 'Weather', 'Day', 'Event', 'Temperature'], as_index=False
)['Total Price (PKR)'].sum()

# Select features and target variable
X = daily_data[['Weather', 'Day', 'Event', 'Temperature']]
y = daily_data['Total Price (PKR)']  # Target variable

# Define preprocessing steps for categorical features
categorical_cols = ['Weather', 'Day', 'Event']
preprocessor = ColumnTransformer(
    transformers=[
        ('cat', OneHotEncoder(), categorical_cols)],
    remainder='passthrough'  # Keep non-categorical columns as they are
)

# Define the model pipeline with GradientBoostingRegressor
model = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('regressor', GradientBoostingRegressor(n_estimators=100, random_state=42))])

# Train the model
model.fit(X, y)

# Function to get prediction
# (weather_var, day_var, event_var, temperature_entry and prediction_label are
# GUI objects that are not shown in this listing)
def get_prediction():
    weather = weather_var.get()
    day = day_var.get()
    event = event_var.get()
    temperature = int(temperature_entry.get())

    # Test the model with user input
    input_data = pd.DataFrame([[weather, day, event, temperature]],
                              columns=categorical_cols + ['Temperature'])
    predicted_price = model.predict(input_data)[0]

    # Display prediction
    prediction_label.config(text=f"Predicted total price for the day: {predicted_price}")
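
The listing above fits the pipeline on all available rows. A hold-out evaluation in the spirit of Section 5.2 might look like the sketch below. It runs on synthetic data that only mimics the column layout, so the printed scores say nothing about the real dataset; the category labels, the price formula, and the `handle_unknown="ignore"` setting are assumptions added for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic daily data (an assumption); it only mimics the column layout above.
rng = np.random.default_rng(42)
n = 300
daily = pd.DataFrame({
    "Weather": rng.choice(["Sunny", "Rainy", "Cloudy"], size=n),
    "Day": rng.choice(["Monday", "Tuesday", "Wednesday"], size=n),
    "Event": rng.choice(["Normal", "Exam", "Holiday"], size=n),
    "Temperature": rng.integers(10, 45, size=n),
})
daily["Total Price (PKR)"] = (
    5000 + 3000 * (daily["Event"] == "Exam") + 20 * daily["Temperature"]
    + rng.normal(0, 100, size=n)
)

categorical_cols = ["Weather", "Day", "Event"]
X = daily[categorical_cols + ["Temperature"]]
y = daily["Total Price (PKR)"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = Pipeline(steps=[
    ("preprocessor", ColumnTransformer(
        # handle_unknown="ignore" avoids errors on categories unseen in training.
        transformers=[("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols)],
        remainder="passthrough",
    )),
    ("regressor", GradientBoostingRegressor(n_estimators=100, random_state=42)),
])
model.fit(X_train, y_train)

# Score only on rows the model has not seen during training.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = float(np.sqrt(mse))
r2 = r2_score(y_test, y_pred)
print(mse, rmse, r2)
```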

7 Results

7.1 App Development

App development encompasses the entire process of creating a software application, from ideation and
design to coding, testing, deployment, and maintenance. It begins with defining the app’s purpose, target
audience, and core functionalities. During the design phase, the user interface (UI) and user experience
(UX) are carefully crafted to ensure an intuitive and engaging experience. The development phase involves
selecting an appropriate technology stack, writing code, and integrating the necessary features and APIs.
Rigorous testing follows to identify and fix bugs and to ensure the app runs smoothly across different
devices and platforms. Once the app is deployed to app stores, continuous monitoring, user feedback
analysis, and regular updates are crucial to maintaining performance, security, and user satisfaction. This
holistic approach produces a reliable, user-friendly, and efficient application that meets the needs of its
users. Outlining these sections and subsections documents the app development process thoroughly, from
design and implementation to deployment and user feedback.

7.1.1 GUI App

7.1.2 Prediction

7.1.3 Sunny and Normal Day

Figure 9: Predicted Price for Sunny and Normal Day in PKR

For a sunny, normal day, the app shows the predicted total price for the given weather and temperature
inputs. Predictions can be generated in the same way for other inputs.

7.1.4 Cloudy and Musical Concert

Figure 10: Predicted Price for Cloudy and Musical Concert in PKR

7.1.5 Holiday and Rainy

Figure 11: Predicted Price for Holiday and Rainy in PKR

7.1.6 Exam Day

Figure 12: Predicted Price for Exam Day and Fog in PKR

The predicted price on exam days is much higher than on holidays, normal days, and musical concert days.

8 Conclusion

In conclusion, developing a predictive model for student cafeteria consumption is a meticulous process
that integrates several critical steps. It first requires a clear understanding of the problem and a
well-defined objective: forecasting total sales from historical data. The data preparation phase is
essential. It involves collecting the data, cleaning it to handle missing values and outliers, transforming
categorical variables into numerical form, and scaling features appropriately. Feature engineering plays a
vital role: creating and selecting the most relevant features enhances model performance. Choosing the
right model, GradientBoostingRegressor, and fine-tuning its hyperparameters with methods such as grid
search or random search helps achieve good performance.

The model training phase fits the model to the training data and employs cross-validation to avoid
overfitting. Evaluating the model with metrics such as MAE, MSE, RMSE, and R² on a separate testing set
confirms its predictive power and its generalizability.

Finally, the deployment stage involves saving the model and integrating it into a production environment,
which allows real-time predictions. Continuous monitoring and periodic retraining on new data maintain
accuracy. This comprehensive approach ensures that the predictive model effectively supports
decision-making and enhances operational efficiency in the cafeteria.

References

[1] Ramya Gorantla. Smart Cafeteria Management System Using Neural Networks. PhD thesis, California
State University, Northridge, 2023.

[2] Daniyal Irfan, Xuan Tang, Vipul Narayan, Pawan Kumar Mall, Swapnita Srivastava, V Saravanan, et al.
Prediction of quality food sale in mart using the AI-based TOR method. Journal of Food Quality, 2022,
2022.

[3] Indrajeet Kumar, Jyoti Rawat, Noor Mohd, and Shahnawaz Husain. Opportunities of artificial
intelligence and machine learning in the food industry. Journal of Food Quality, 2021:1–10, 2021.

[4] Minwoo Lee, Wooseok Kwon, and Ki-Joon Back. Artificial intelligence for hospitality big data analytics:
developing a prediction model of restaurant review helpfulness for customer decision-making.
International Journal of Contemporary Hospitality Management, 33(6):2117–2136, 2021.
