Application Research of Collaborative Filtering Algorithm in Catering Recommendation System

2023 19th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) | 979-8-3503-0439-8/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICNC-FSKD59587.2023.10280828


Bingxian Fan
School of Information Science and Engineering, Linyi University, Linyi, China
552762462@qq.com

Jingyao Hu*
School of Information Science and Engineering, Linyi University, Linyi, China
hujingyao@163.com

Abstract— The collaborative filtering algorithm is commonly used in recommendation systems. Based on the traditional collaborative filtering algorithm, this paper proposes an improvement scheme. When calculating similarity, an improved cosine similarity formula is first used, followed by adding time threshold parameters for weighting. Furthermore, the similarity results are re-weighted based on the number of common scoring items among users, and finally normalized.

Keywords—recommendation system; collaborative filtering; similarity

I. INTRODUCTION
The catering recommendation system not only meets the personalized catering needs of customers, but also helps businesses establish long-term stable customer relationships, reduces customer churn rates, and improves customer loyalty. The focus of the catering recommendation system is the recommendation algorithm it uses, so studying and optimizing recommendation algorithms is very important. This paper mainly studies the application and improvement of the collaborative filtering algorithm in a catering recommendation system.

II. COLLABORATIVE FILTERING RECOMMENDATION ALGORITHM
The key to collaborative filtering recommendation algorithms is similarity calculation. First, users and items are classified according to similarity; then recommendations are made. Collaborative filtering algorithms include:

A. User-based collaborative filtering algorithm
This algorithm is based on users: it calculates the similarity between different users from the historical data they have left and, according to this similarity, recommends dishes to target users.

1) Find a set of users with similar interests to the target users
Given user u and user v, let $N(u)$ and $N(v)$ represent the sets of items for which users u and v have given positive feedback. The interest similarity between users u and v can be calculated simply with the Jaccard formula, as shown in formula (1):

$$ w_{uv} = \frac{|N(u) \cap N(v)|}{|N(u) \cup N(v)|} \tag{1} $$

In reality, many pairs of users have no historical behavior towards the same item, in which case $|N(u) \cap N(v)| = 0$. Therefore, it is possible to establish an inverted list from items to users, saving for each item the list of users who have acted on it, and additionally to reduce the influence of popular items on the similarity between users. The resulting formula for calculating user similarity is shown in formula (2):

X202310452323, Linyi University Students Innovation and Entrepreneurship Training Program

Authorized licensed use limited to: Universidad Simon Bolivar (Colombia). Downloaded on May 13,2024 at 03:54:12 UTC from IEEE Xplore. Restrictions apply.
$$ w_{uv} = \frac{\sum_{i \in N(u) \cap N(v)} \frac{1}{\log(1 + |N(i)|)}}{\sqrt{|N(u)||N(v)|}} \tag{2} $$

The factor $\frac{1}{\log(1 + |N(i)|)}$ penalizes the impact of popular items in the shared interest list of users u and v on the similarity calculation.

2) Find items of interest to users in the collection and recommend them to the target users
With the user similarities in hand, the algorithm recommends to the target user the items liked by the K users whose interests are most similar, as shown in formula (3):

$$ p(u,i) = \sum_{v \in S(u,K) \cap N(i)} w_{uv} r_{vi} \tag{3} $$

where $S(u,K)$ is the set of the K users most similar to user u, $N(i)$ is the set of users who have acted on item i, $w_{uv}$ is the similarity between users u and v, and $r_{vi}$ is the interest of user v in item i.

B. Item-based collaborative filtering recommendation algorithm
Item-based collaborative filtering takes items as the core, combines users' historical behavior data, calculates the similarity between items, and then recommends items to target users based on that similarity. The implementation process is as follows:

1) Calculate similarity between items
Use formula (4) to define the similarity of items:

$$ w_{ij} = \frac{|N(i) \cap N(j)|}{|N(i)|} \tag{4} $$

The denominator $|N(i)|$ is the number of users who like item i, and the numerator $|N(i) \cap N(j)|$ is the number of users who like both item i and item j. If item j is very popular, $w_{ij}$ will be close to 1 for almost any item i, so every item appears very similar to the popular item. This is not a good feature for recommendation systems dedicated to discovering long-tail information. Therefore, formula (5) is used to avoid recommending popular items.

$$ w_{ij} = \frac{|N(i) \cap N(j)|}{\sqrt{|N(i)||N(j)|}} \tag{5} $$

2) Generate a recommendation list for users based on item similarity and historical data
After obtaining the similarity between items, this algorithm can calculate the interest of user u in item j through formula (6):

$$ p_{uj} = \sum_{i \in N(u) \cap S(j,K)} w_{ji} r_{ui} \tag{6} $$

$N(u)$ represents the set of items that user u likes, $S(j,K)$ is the set of K items that are most similar to item j, $w_{ji}$ is the similarity between item j and item i, and $r_{ui}$ is the interest of user u in item i.

III. IMPROVEMENT OF COLLABORATIVE FILTERING RECOMMENDATION ALGORITHM
These algorithms ultimately need to perform similarity calculation and then complete the recommendation on this basis. The main idea of algorithm optimization is to improve the formula for calculating similarity based on the needs of business scenarios. This paper optimizes the cosine similarity commonly used in collaborative filtering recommendation algorithms. The improvement ideas are as follows:

A. Cosine similarity
Cosine similarity is a measure of the difference between two individuals. It is the cosine of the angle between two vectors in a vector space. When the cosine value approaches 1, the angle between the two vectors approaches 0 degrees, indicating that the two vectors are more similar. When the cosine value is negative, the two vectors are negatively correlated. The calculation method for the cosine similarity of two-dimensional vectors $a = (x_1, y_1)$ and $b = (x_2, y_2)$ is shown in formula (7):

$$ sim(i,j) = \cos(i,j) = \frac{a \cdot b}{\|a\| \times \|b\|} = \frac{x_1 x_2 + y_1 y_2}{\sqrt{x_1^2 + y_1^2} \times \sqrt{x_2^2 + y_2^2}} \tag{7} $$

When a user's ratings are regarded as an n-dimensional vector, with $R_{ik}$ the rating given by user i to the k-th item, the calculation method is shown in formula (8):

$$ sim(i,j) = \frac{\sum_{k=1}^{n} R_{ik} \times R_{jk}}{\sqrt{\sum_{k=1}^{n} R_{ik}^2 \times \sum_{k=1}^{n} R_{jk}^2}} \tag{8} $$

Cosine similarity measures the included angle between vectors and pays more attention to the difference in the direction of the vectors. It is not sensitive to differences in the magnitude of different users' scores, which can cause errors. Suppose there are two users u1 and u2 whose scoring vectors are (5,4,3) and (2,1,3), respectively. In this case, the calculation result of cosine

similarity is 0.869, which suggests that the two users are highly similar. Intuitively, however, the preferences of u1 and u2 differ considerably, so the result calculated by cosine similarity is inconsistent with the actual situation. Therefore, an improved cosine similarity formula is used in this paper to solve the problem that cosine similarity is not sensitive enough to numerical values.

B. Improved cosine similarity
The core of the improved cosine similarity algorithm is to subtract the mean value of the user's scores from each dimension of the vector. The formula is as follows:

$$ sim(i,j) = \frac{\sum_{k=1}^{n} (R_{ik} - \bar{R}_i) \cdot (R_{jk} - \bar{R}_j)}{\sqrt{\sum_{k=1}^{n} (R_{ik} - \bar{R}_i)^2} \sqrt{\sum_{k=1}^{n} (R_{jk} - \bar{R}_j)^2}} \tag{9} $$

In the above example, the average ratings of users u1 and u2 are 4 and 2, respectively. After subtracting these means, the rating vectors of u1 and u2 become (1,0,-1) and (0,-1,1), respectively. The cosine value is then -0.5; the negative result indicates that there is a significant difference in preferences between the two users, which is also more realistic.

C. Adding time parameters for weighted averaging
In this paper, formula (10) is used to calculate the average user rating $\bar{R}_i$:

$$ \bar{R}_i = \sum_{k=1}^{n} R_{ik} \times \frac{t_{ik} - t_{th}}{t_s - t_{th}} \tag{10} $$

In the formula, $t_{ik}$ represents the time at which user i rated item k, and $t_{i1}$ represents the time of the most recent rating; $R_{ik}$ represents the rating given by user i, and $R_{i1}$ the most recent rating of user i; $t_{th}$ is a set time threshold parameter, indicating that only scores within the threshold take part in the weighted average. Scores older than the threshold have a weight of 0 and are not counted. $t_s$ represents the current system time, with $(t_s - t_{th})$ as a fixed value. The closer the rating time $t_{ik}$ of user i is to the current system time, the larger the value of $(t_{ik} - t_{th})$, the greater its weight, and the larger its impact factor in the weighted average.

D. Add weight to the number of shared scoring items
Because some users have only a few scoring items in common, directly using the improved cosine similarity results can introduce bias and lower recommendation quality. To correct this deviation, this article further weights the similarity value; the weight factor is calculated from the number of common scoring items between users, as shown in formula (11):

$$ sim(i,j) = \begin{cases} \dfrac{|I_i \cap I_j|}{T} \times sim(i,j), & |I_i \cap I_j| \le T \\ sim(i,j), & |I_i \cap I_j| > T \end{cases} \tag{11} $$

Here, $|I_i \cap I_j|$ is the number of items rated by both users, T is the set threshold for the number of items, and the ratio of the two can be regarded as a penalty factor. When the number of common rating items is less than T, the credibility of the rating similarity is lower, and the similarity value calculated through weighting is more realistic; when the number of common scoring items is greater than T, no weighting is applied, and the similarity value obtained in the previous step is used directly. Because the value range of $sim(i,j)$ is [-1,1], formula (12) is used for normalization for convenience of calculation.

$$ sim'(i,j) = 0.5 + \frac{sim(i,j)}{2} \tag{12} $$

IV. RESULT VALIDATION
A. Experimental dataset
The dataset used in this article is the MovieLens dataset provided by the GroupLens research group. The MovieLens dataset is one of the most classic datasets in the field of recommendation systems. This article uses the Pandas library to read the original ratings.dat data content, as shown in Figure 1:
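A minimal sketch of this reading step is given below. It assumes the `::`-separated `UserID::MovieID::Rating::Timestamp` layout of the MovieLens 1M `ratings.dat` file; the paper does not state which MovieLens release was used, and the column names, the `load_ratings` helper, and the three sample rows are illustrative rather than taken from the paper.

```python
import io

import pandas as pd

# Column names are an assumption based on the MovieLens 1M layout.
COLUMNS = ["UserID", "MovieID", "Rating", "Timestamp"]


def load_ratings(source):
    """Read a '::'-separated ratings file (path or file-like object)."""
    # A multi-character separator is treated as a regular expression,
    # which requires the slower pure-Python parsing engine.
    return pd.read_csv(source, sep="::", engine="python",
                       names=COLUMNS, header=None)


# Tiny in-memory stand-in for ratings.dat so the sketch runs without the dataset.
sample = io.StringIO(
    "1::1193::5::978300760\n"
    "1::661::3::978302109\n"
    "2::1193::4::978298413\n"
)
ratings = load_ratings(sample)
print(ratings.head())
```

With the real dataset, the same helper would be called as `load_ratings("ratings.dat")`.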

Fig 1 data source

In this experiment, the MovieID, Rating, and Timestamp fields in the MovieLens dataset are used in place of the product items, user ratings, and rating times required by the system.

B. Evaluation indicator
In research on collaborative filtering recommendation algorithms, the mean absolute error (MAE) is generally used to evaluate the quality of recommendations in line with practical application requirements. MAE measures the accuracy of system recommendations by the difference between the predicted score and the user's real score in the test data set. The indicator first sums the absolute values of the differences between the N pairs of predicted and real scores, and then takes the average. The calculation method is shown in formula (13):

$$ MAE = \frac{\sum_{i=1}^{N} |p_i - q_i|}{N} \tag{13} $$

where $p_i$ is the predicted score and $q_i$ is the corresponding real score. The lower the MAE value, the more accurate the prediction score of the algorithm, and the higher the quality of recommendations.

C. Result analysis
This experiment compared three different similarity algorithms, namely cosine, improved cosine, and improved weighted cosine. For each similarity algorithm, this article implements user-based similarity calculation, uses a weighted summation algorithm to generate predictions, conducts experiments on training data, and finally calculates MAE. Figure 2 shows the comparison of the final experimental results.

[Figure 2: chart with vertical-axis ticks from 0.5 to 0.82 and categories Improved Weighted Cosine, Improved cosine, Cosine]
Fig 2 Comparison of recommendation results of different algorithms

From the experimental results, it can be seen that the improved weighted cosine calculation method proposed in this article has obvious advantages, and its MAE is significantly lower than that of the other two methods.

V. SUMMARY
This article demonstrates the shortcomings of traditional collaborative filtering algorithms and proposes using an improved cosine similarity formula, adding time threshold parameters for weighting, and re-weighting the similarity results before normalizing them. There are still shortcomings:

A. Seasonal issues with dishes
Seasonal issues refer to the fact that users generally enjoy eating different dishes in different seasons; their dietary interests do not remain unchanged but change with the season.

B. Dish pairing problem
The dish pairing problem refers to the fact that when ordering a meal, users do not always order one type of dish; rather, different types of dishes are paired with each other.

Further research will be conducted to address these issues.
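As a rough illustration of the similarity steps in Section III and the MAE metric of Section IV, the following Python sketch implements formulas (8), (9), (11), (12), and (13) on plain lists. It is not the authors' implementation: the function names are hypothetical, the user mean in `improved_cosine` is a plain average (the time weighting of formula (10) is omitted), and the threshold value 5 is an arbitrary choice because the paper does not report the T it used.

```python
import math


def cosine(a, b):
    """Plain cosine similarity of two equal-length rating vectors (formula (8))."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def improved_cosine(a, b):
    """Cosine after subtracting each user's own mean rating (formula (9))."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine([x - mean_a for x in a], [y - mean_b for y in b])


def weight_by_shared_items(sim, n_shared, threshold):
    """Scale sim by n_shared / threshold when few items are shared (formula (11))."""
    if n_shared <= threshold:
        return sim * n_shared / threshold
    return sim


def normalize(sim):
    """Map a similarity from [-1, 1] to [0, 1] (formula (12))."""
    return 0.5 + sim / 2


def mae(predicted, actual):
    """Mean absolute error between predicted and real scores (formula (13))."""
    return sum(abs(p - q) for p, q in zip(predicted, actual)) / len(predicted)


# Example vectors from Section III.A.
u1, u2 = (5, 4, 3), (2, 1, 3)
plain = cosine(u1, u2)             # the paper reports 0.869
centred = improved_cosine(u1, u2)  # the paper reports about -0.5
# threshold=5 is a hypothetical value chosen only for this example.
weighted = weight_by_shared_items(centred, n_shared=3, threshold=5)
final = normalize(weighted)
print(round(plain, 3), round(centred, 3), round(weighted, 3), round(final, 3))
```

Because the two users share only three rated items, the hypothetical threshold shrinks the negative similarity toward zero before it is mapped into [0, 1], which is the penalty behavior that Section III.D describes.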

