Final Year Project Report 2
Created By :
SID : A11.2013.07380
VALIDATION OF THE BOARD OF EXAMINERS
SID : A11.2013.07380
This final year project has been examined and defended in front of the Examiner team
on August 8th, 2019. We hereby declare that we have read this final year project
report and that, in our opinion, it is sufficient in terms of scope and
quality as partial fulfillment of the requirements for the Bachelor of Computer Science degree.
Examiner Team:
Head of Examiner
DECLARATION
SID : A11.2013.07380
is the result of my own research except as cited in the references. This final year
project has not been accepted for any degree and is not concurrently submitted in
candidature of any other degree.
Made in : Semarang
Date : 8th August 2019
Signature
CONSENT STATEMENT OF SCIENTIFIC PAPER’S
PUBLICATION FOR ACADEMIC INTEREST
SID : A11.2013.07380
as well as the tools needed (if any). With this Non-exclusive Royalty-Free Right, Dian
Nuswantoro University reserves the right to store, reproduce, use, and manage it in the
form of a database, and to distribute and display or publish it on the internet or other media
for academic purposes without asking my permission, as long as
my name is included as the author / creator.
Made in : Semarang
Date : 8th August 2019
Signature
ACKNOWLEDGEMENTS
With gratitude to Allah SWT, the Most Gracious and the Most Merciful, who gave
all the grace and guidance to the author so that the final report entitled
“PROMOTION ANALYZATION ON WEB APPLICATION USING APRIORI
TECHNIQUE” could be finished as planned with the support of various parties.
Therefore, the author would like to express thanks to:
May Almighty God give a greater reward to them. Finally, the author hopes that
this final year project report will be helpful and useful for its purpose.
Author
ABSTRACT
This application is implemented in native PHP. Its main purpose is to analyze
customers' shopping behavior. It allows the user, acting as the admin or owner
of the shop, to create promotions for customers. The author used knowledge of
PHP during the internship. This research aims to analyze sales transactions
based on customers' transaction behavior in order to find the best rules and
create suitable promotions for customers. The data were obtained during an
internship program the author took at Khon Kaen University, Khon Kaen,
Thailand in 2016, and consist of a dataset of 400 transactions provided by the
supervisor at Khon Kaen University. The method used in this research is
association rule mining with the Apriori algorithm. The results show that the
data mining application can be used to determine association rules using the
Apriori algorithm. Market basket analysis using the Apriori algorithm can be
applied to the transaction data to determine promotions in the internship
program at Khon Kaen University, Thailand, with the following association
rules: Keyboard → Monitor, with a confidence value of 66.67%, meaning that
66.67% of the customers who buy a Keyboard also buy a Monitor; and Monitor →
Keyboard, with a confidence value of 53.33%, meaning that 53.33% of the
customers who buy a Monitor also buy a Keyboard. The writer suggests using more
specific and larger data in future research.
TABLE OF CONTENTS
2.2.2 Data Mining
2.2.3 Cross-Industry Standard Process for Data Mining (CRISP-DM)
2.2.4 Types of Data Mining Method
2.2.5 Association Rule
2.2.6 Apriori Algorithm
2.2.7 Minimum Support
2.2.8 Minimum Confidence
2.2.9 PHP
2.2.10 MySQL Database
2.3 Review of The Object of Study
2.3.1 Khon Kaen University
2.3.2 Vision and Mission
2.3.3 Location
2.3.4 Job Description
2.3.5 Project Schedule
2.3 Framework of Study
CHAPTER III
RESEARCH METHOD
3.1 Data Sources
3.2 Data Analysis Technique
3.3 Proposed Method
3.4 Model Testing
CHAPTER IV
RESULT AND DISCUSSION
4.1 Research Result
4.2 Design Function
4.2.1 Use Case Diagram
4.2.2 Sequence Diagram
4.2.3 Activity Diagram
4.2.4 Flowchart
4.3 Discussion
4.3.1 Final Interface Program
4.3.2 Shop Interface Diagram
4.3.3 Apriori Interface Diagram
4.3.4 Promotion Interface Diagram
4.3.5 Choose Dataset
4.3.6 Processing Data
4.3.7 Compute the support and confidence value
CHAPTER V
CONCLUSION
5.1 Conclusion
5.2 Suggestion
REFERENCES
ATTACHMENT
Attachment 1. Raw Data
CHAPTER I
INTRODUCTION
In this project, the author uses the Apriori algorithm of data mining to
analyze customer behavior and derive association rules that can
help the owner of the shop create promotions that are suitable for the
customers.
THEORETICAL BACKGROUND
information from the system can also be used to plan other sales
strategies, such as offering discounts or improving the layout of goods.
d. Penerapan Association Rule Dengan Algoritma Apriori Untuk
Menampilkan Informasi Tingkat Kelulusan Mahasiswa Teknik
Informatika S1 Fakultas Ilmu Komputer Universitas Dian Nuswantoro
(Saputro, 2015)
This study is concerned with the fact that the number of new students at
Dian Nuswantoro University does not match the number of students who
graduate, which can lower the accreditation of the university.
The study applies the association rule method, using the Apriori
algorithm with the SPMF (Sequential Pattern Mining Framework)
application, to determine the support and confidence values from
processed student data of the Informatics Engineering program.
Using the Apriori algorithm with 0.2 as the support value
and 0.5 as the confidence value in the SPMF application produced 8 rules.
In the patterns found in the student master data and student graduation
data, entry attributes with the regular category show a strong
tendency and appear in 6 rules, while attributes with a study period of 4 years
or less and a GPA of 2.76-3.50 appear in 3 rules.
e. Penerapan Algoritma Apriori Untuk Menentukan Strategi Penjualan
Pada Rumah Makan “Dapoer Emak” Pati (Hidayat & Wijanarto, 2017)
This study is concerned with food waste at the Dapoer
Emak restaurant. Food that has been cooked but not sold is
wasted and thrown away.
The study applies market basket analysis using the Apriori
algorithm to determine the selling strategy of the restaurant, learn
what customers want, and reduce food waste.
customer.
mining, you can ask far more sophisticated questions of your
data than you can with conventional querying techniques. The
information that data mining provides can lead to an immense improvement
in the quality and dependability of business decision
making.
As a series of processes, data mining can be divided into several
sections, illustrated in Figure 1 below.
The six phases of CRISP-DM:
1. Business Understanding Phase
a. Determine the project objectives and requirements within the overall scope of the
business or research unit.
b. Translate the goals and constraints into a data mining
problem formulation.
c. Prepare an initial strategy for achieving the goals.
2. Data Understanding Phase
a. Collect the data.
b. Use exploratory data analysis to become familiar with the data and search
for initial insights.
c. Evaluate the data quality.
minimum confidence given by the user. It is by far the most
well-known association rule algorithm. The fundamental differences between this
algorithm and the AIS and SETM algorithms are the way candidate itemsets are
generated and the way candidate itemsets are selected for counting
(Himani Bathla, 2015).
There are two main processes performed in the Apriori algorithm:
1. Join
In this process, each item is combined with other items until no
further combination can be formed.
2. Prune
In this process, the combined itemsets are
trimmed using the minimum support specified by the user.
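The join and prune processes described above can be illustrated with a minimal Python sketch. It is not the PHP code of the application; the function name `apriori_frequent_itemsets` and the sample transactions are assumptions made only for illustration, and pruning is done purely by minimum support, as in the description above.

```python
def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all frequent itemsets (support as a fraction)."""
    baskets = [frozenset(t) for t in transactions]
    total = len(baskets)

    def support(itemset):
        # Share of transactions that contain every item of the itemset.
        return sum(1 for basket in baskets if itemset <= basket) / total

    # C1: every single item is a candidate; prune to get L1.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    current = {c for c in candidates if support(c) >= min_support}

    frequent = {}
    size = 1
    while current:
        frequent.update({c: support(c) for c in current})
        size += 1
        # Join: combine surviving itemsets into candidates with one more item.
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        # Prune: trim candidates that fall below the minimum support.
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical transactions, not taken from the thesis dataset.
transactions = [
    ["Keyboard", "Monitor"],
    ["Keyboard", "Monitor", "CPU"],
    ["CPU", "Monitor"],
    ["Keyboard"],
]
frequent = apriori_frequent_itemsets(transactions, min_support=0.5)
for itemset, value in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), value)
```

The loop stops as soon as a join produces no candidate that survives pruning, which matches the stopping condition of the algorithm.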
2.2.9 PHP
PHP (Hypertext Preprocessor) is currently one of the most
popular programming languages, widely used in both the open source
community and in industry to build large web-focused applications and
application frameworks (Douglas Kunda, 2017).
PHP is also used to add functionality beyond what HTML alone provides and to
communicate with a MySQL database.
PHP is called a server-side programming language because PHP
is processed on the server computer. This differs from client-side
programming languages like JavaScript, which are processed in a web
browser (client).
PHP is free to use and is Open Source. PHP is released
under the PHP License, which is a little different from the GNU General Public
License (GPL) that is commonly used for Open Source projects.
Its ease of use and popularity have made PHP a standard for web
programmers around the world. According to Wikipedia in February 2014,
around 82% of the world's web servers used PHP. PHP also forms the basis
of popular CMS (Content Management System) applications such as
Joomla, Drupal, and WordPress.
2.3.3 Location
[Project schedule chart, columns 1 to 9. Tasks: Project Understanding, Design Website, Create Database, Shop Function 1, Apriori Function, Shop Function 2, Testing, Maintenance, Re-Testing, Final Presentation]
the author got the chance to take an internship program abroad, at Khon
Kaen University, Khon Kaen, Thailand. That was a great opportunity for the author.
The author was under the supervision of Asst. Prof. Dr. Wararat
Songpan, who offered and encouraged the author to implement one of the data
mining techniques, Apriori.
This project was conducted in two stages. The first was to re-learn the
Apriori technique in data mining, which the author had studied before at Dian
Nuswantoro University in the 6th semester. It did not take the author long
to gain a deeper understanding of Apriori. The second was to build an online
shop system that could analyze customer behavior and help the
owner determine promotions based on customers' shopping behavior.
[Figure: framework of study following the CRISP-DM phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment]
RESEARCH METHOD
Below is the raw transaction data already stored in the
database (20 out of 400 records). The rest of the raw data is attached in
Attachment 1.
The 20 out of 400 raw records stored in the database contain
information such as the transaction number, transaction ID, date of
transaction, time of transaction, the items, and the total price of each
transaction.
The raw data from the database has been exported to an Excel file. With this
file containing the raw data from the database, the data
mining process can be carried out.
Phase 1. Join:
Find the candidate 1-itemsets (C1) and count their support. To
calculate the support, count how many transactions in the transaction table
contain each itemset and multiply that count by the weight of each transaction. Because
the table holds 10 transactions, each transaction has a weight of
100% divided by the number of transactions, that is, 10%. The percentage for
each itemset is shown in Table 3.5.
Phase 2. Prune:
Keep the itemsets that fulfill the minimum support requirement of 20%. The
details are shown in Table 3.6.
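The weighting used in Phase 1 and the 20% cut-off in Phase 2 can be sketched in Python as follows. The ten transactions below are hypothetical and are not the contents of the tables referenced above; the function names are likewise assumptions for illustration.

```python
from collections import Counter


def candidate_1_supports(transactions):
    """Support (in percent) of every single item: occurrence count times transaction weight."""
    weight = 100 / len(transactions)  # 10 transactions -> each one weighs 10%
    counts = Counter(item for t in transactions for item in set(t))
    return {item: count * weight for item, count in counts.items()}


def prune(supports, min_support):
    """Keep only the itemsets whose support reaches the minimum support (in percent)."""
    return {item: s for item, s in supports.items() if s >= min_support}


# Hypothetical item IDs in 10 transactions (illustration only).
transactions = [
    [2, 6], [2, 6, 18], [6, 18], [14, 16], [14, 16, 12],
    [16, 12], [2], [18], [5], [12, 7],
]
c1 = candidate_1_supports(transactions)
l1 = prune(c1, min_support=20)
print(sorted(l1.items()))
```

Items 5 and 7 appear in only one transaction each (10%), so they are dropped by the prune step.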
The next phase repeats the first and second phases until no candidate
that fulfills the minimum support remains.
The candidates that fulfill the minimum support (L2) are shown in Table 3.8
as follows.
Itemset      Support
2, 6         20%
14, 16       20%
6, 18        20%
16, 12       20%
Itemset      Support
2, 6, 18     -
14, 16, 12   -
In Table 11 above, the itemsets in C3 do not exist in the transactions; therefore
the process stops here.
The next step is to find the association rules that meet the minimum confidence of 50%.
The resulting rules are shown in Table 12.
This sequence diagram shows how the user gets the association rules, starting from the
initial step of entering the minimum support and minimum confidence.
This sequence diagram shows how the user chooses the
desired rules to turn them into a promotion.
This activity occurs when the admin has already logged in to the system.
The admin can view the report for each month and turn it into a
dataset. The admin then uses it to see the associations among
the transactions in that dataset, as shown in Figure 13 below:
This diagram describes the Apriori process. The admin can enter the minimum
support and minimum confidence values and choose the dataset to be
used. The system then shows the result of the Apriori process.
After the Apriori process, the admin moves to the promotion section. In this activity,
the admin chooses the association that has the higher confidence.
4.2.4 Flowchart
The system starts with choosing a dataset, as shown in the flowchart in Figure 15. If
the admin needs a new dataset, the admin chooses the desired transaction report
and turns that report into a new dataset. After choosing the dataset,
the admin enters the desired minimum support and minimum confidence and the
process begins, finding every possible rule that fulfills the parameters.
Figure X shows the code for getting the report based on the
month chosen by the admin. Line 114 is the code for the query that selects the
chosen month, and line 116 is the code for selecting every transaction
that occurred in the selected month.
c. Line 4 - the variable $sumtrans stores the total number of transactions in the dataset stored
in the variable $file
d. Lines 5 and 6 - get the minimum support and confidence
e. Line 7 - create a new object of class Apriori named raidou
f. Line 13 - separate the items using the comma delimiter ( , )
g. Line 14 - run the Apriori process
h. Line 21 - print or display the association rules
After the admin has obtained the association rules, the admin moves to the
promotion page to make a promotion, as shown in the flowchart in Figure 16. First, the
admin enters the minimum support for the rules shown on the page.
The associations that meet the minimum support desired by the admin are
then shown, and the admin can choose the rule to turn into a
promotion.
Figure X above shows the code for filtering the displayed rules by minimum confidence,
which makes it easier for the admin to choose a rule to turn into
a promotion.
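The filtering step can be pictured with a short Python sketch. It is only an illustration under assumptions: the application itself is written in PHP and its code is not reproduced here, and the function name `filter_rules` and the dictionary keys are hypothetical. The two rules and their confidence values are the ones reported in this report.

```python
def filter_rules(rules, min_confidence):
    """Keep rules whose confidence (in percent) reaches min_confidence, highest first."""
    kept = [rule for rule in rules if rule["confidence"] >= min_confidence]
    return sorted(kept, key=lambda rule: rule["confidence"], reverse=True)


# Rules reported in this report; the field names are hypothetical.
rules = [
    {"antecedent": "Monitor", "consequent": "Keyboard", "confidence": 53.33},
    {"antecedent": "Keyboard", "consequent": "Monitor", "confidence": 66.67},
]
selected = filter_rules(rules, min_confidence=60)
for rule in selected:
    print(rule["antecedent"], "->", rule["consequent"], rule["confidence"])
```

Sorting the surviving rules by confidence puts the strongest candidate for a promotion first.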
After filtering and choosing the desired rule, the user is required to enter the
name, price, and limited stock of the promotion on lines 139 to 142. After that,
the code on line 144 stores it in the database.
4.3 Discussion
4.3.1 Final Interface Program
The resulting program is discussed here, along with the
process. To keep it brief, the author discusses only the Apriori process
and the promotion action, and shows only part of the shop interface.
This is the main process of the project. The admin enters the desired
minimum support and minimum confidence and chooses the dataset. The
admin can also choose whether the association rules are
saved to the database. If so, the previous association rules are
deleted and replaced with the new ones. The admin then clicks the “Process” button
to run the Apriori process and waits for the application to display the
association rules based on the entered minimum support and minimum
confidence.
process that have been stored in the database. At the top left, there is a field to
filter by the desired minimum support in order to choose the best promotion based on
the higher support. On the right side, there is a selection
bar to choose a rule from those that pass the minimum support filter. The admin then
names the promotion, sets its price, and declares how
much stock the promotion has.
4.3.5 Choose Dataset
Figure 24 shows the transaction data that has been
cleaned and transformed so that it is ready to be processed by the data
mining application, which will display the association rules according to
the data.
\[
\text{Support}(\text{Cardreader}) = \frac{\text{Number of transactions that contain Cardreader}}{\text{Total number of transactions}} \times 100\% = \frac{23}{400} \times 100\% = 5.75\%
\]
Table 11. Candidate 1 - Itemset (C1)
Table 14 shows the items that meet the minimum support of
10%. Next, the items in L1 are merged to generate the
next candidates, each containing two items, and the support value
is then recalculated.
\[
= \frac{16}{400} \times 100\% = 4\%
\]
Itemset                          Support   Itemset                          Support
Flashdrive, RAM                  0.75%     Monitor, Proyektor               0.5%
Flashdrive, VGA Card             0.5%      Monitor, USB Flexible Lamp       1%
Flashdrive, Proyektor            0.75%     Monitor, USB Hub                 1.5%
Flashdrive, USB Flexible Lamp    2%        Mouse, RAM                       1.25%
Flashdrive, USB Hub              5.5%      Mouse, VGA Card                  1.25%
Headphone, Joystick              4.5%      Mouse, Proyektor                 1.5%
Headphone, Keyboard              1%        Mouse, USB Flexible Lamp         1%
Headphone, Laptop                1%        Mouse, USB Hub                   1.75%
Headphone, Mini Speaker          0.75%     RAM, VGA Card                    9.75%
Headphone, Monitor               1%        RAM, Proyektor                   0%
Headphone, Mouse                 0.5%      RAM, USB Flexible Lamp           0.25%
Headphone, RAM                   1%        RAM, USB Hub                     0%
Headphone, VGA Card              0.75%     VGA Card, Proyektor              0%
Headphone, Proyektor             0%        VGA Card, USB Flexible Lamp      0.25%
Headphone, USB Flexible Lamp     0.25%     VGA Card, USB Hub                0%
Headphone, USB Hub               0.5%      Proyektor, USB Flexible Lamp     0%
Joystick, Keyboard               0%        Proyektor, USB Hub               0.25%
Joystick, Laptop                 1.5%      USB Flexible Lamp, USB Hub       0.25%
CPU, DVDGame                     1.25%     CPU, Monitor                     11.25%
CPU, Flashdrive                  1.25%     CPU, Mouse                       0.75%
CPU, Headphone                   1%        CPU, RAM                         3%
CPU, Joystick                    0.5%      CPU, VGA Card                    2%
CPU, Keyboard                    7.25%     CPU, Proyektor                   0.25%
CPU, Laptop                      0.75%     CPU, USB Flexible Lamp           0%
CPU, Mini Speaker                0.5%      CPU, USB Hub                     0%
Itemset Support
Keyboard, Monitor 10%
DVDGame,Joystick 13.5%
CPU,Monitor 11.25%
The itemsets from the last process that meet the minimum support are
shown in Table 16. Candidate association rules are formed from this last
large itemset. Each item combination is split into two
parts, an antecedent and a consequent, covering all
possibilities.
\[
= \frac{40}{400} \times 100\% = 10\%
\]
Antecedent → Consequent
The antecedent is the trigger item that leads to another item being purchased,
while the consequent is the item affected by the purchase of the antecedent.
When generating association rules, the minimum
confidence parameter is needed, because the confidence percentage of each
association rule that appears is calculated according to equation (4).
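\[
\text{Confidence}(\text{Keyboard} \rightarrow \text{Monitor}) = \frac{10}{15} \times 100\% = 66.67\%
\]
\[
\text{Confidence}(\text{Monitor} \rightarrow \text{Keyboard}) = \frac{10}{18.75} \times 100\% = 53.33\%
\]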
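The split into antecedent and consequent and the confidence calculation above can be checked with a small Python sketch. The support values (10% for the pair, 15% for Keyboard, 18.75% for Monitor) are the ones used in the calculation above; the function name `rules_from_itemset` and the data layout are assumptions for illustration only.

```python
from itertools import combinations


def rules_from_itemset(itemset, support):
    """List (antecedent, consequent, confidence %) for every split of a frequent itemset."""
    items = frozenset(itemset)
    rules = []
    for size in range(1, len(items)):
        for antecedent in combinations(sorted(items), size):
            a = frozenset(antecedent)
            # Confidence = support of the whole itemset / support of the antecedent.
            confidence = support[items] / support[a] * 100
            rules.append((tuple(sorted(a)), tuple(sorted(items - a)), round(confidence, 2)))
    return rules


# Support values in percent, as used in the calculation above.
support = {
    frozenset(["Keyboard"]): 15.0,
    frozenset(["Monitor"]): 18.75,
    frozenset(["Keyboard", "Monitor"]): 10.0,
}
rules = rules_from_itemset(["Keyboard", "Monitor"], support)
for antecedent, consequent, confidence in rules:
    print(antecedent, "->", consequent, confidence)
```

Both rules pass a minimum confidence of 50%, which is why both directions are reported.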
After the steps above, from Table 13 to Table 17, the association rules
appear as in Figure 26, which shows the association rules from the dataset of
400 transactions with a minimum support of 10%.
CHAPTER V
CONCLUSION
5.1 Conclusion
From the analysis and experimental results, the
researcher concludes that market basket analysis
using the Apriori algorithm can be applied to the transaction data to
determine promotions in the internship program at Khon Kaen
University, Thailand, with the following association rules:
1. Keyboard → Monitor, with a confidence value of 66.67%, meaning that
66.67% of the customers who buy a Keyboard also buy a Monitor.
2. Monitor → Keyboard, with a confidence value of 53.33%, meaning that
53.33% of the customers who buy a Monitor also buy a Keyboard.
With the confidence values above, the author aims to show the
relationship between items, because the confidence value is the
probability that products are purchased together
given that one of the products is certainly purchased by the customer.
5.2 Suggestion
1. The data studied should be larger in number.
2. In future work, the program should be more automatic in
selecting or providing the dataset.
REFERENCES
Lukmanul Hakim, A.F., 2015. Penentuan Pola Hubungan Kecelakaan Lalu Lintas
Menggunakan Metode Association Rules dengan Algoritma Apriori.
Muflikhah, L., Ratnawati, D.E. & Putri, R.R.M., 2018. Data Mining. Malang, Jawa
Timur, Indonesia: UB Press.
Pratibha Mandave, M.M.P.S.P., 2013. Data mining using Association rule based
on APRIORI algorithm and improved approach with illustration. International
Journal of Latest Trends in Engineering and Technology, 3(2), pp.107-13.
Rangkuti, F., 2009. Strategi Promosi yang Kreatif dan Analisis Kasus. 1st ed.
Jakarta, DKI Jakarta, Indonesia: Gramedia Pustaka Utama.
Wulandari, H.N. & Rahayu, N.W., 2014. Pemanfaatan Algoritma Apriori untuk
Perancangan Ulang Tata Letak Barang di Toko Busana. Yogyakarta: Universitas
Islam Indonesia Yogyakarta.
ATTACHMENT