
TOPIC 1a

INTRODUCTION
TO DATA MINING
OBJECTIVES
To introduce Data Mining and its relationship with data and knowledge

To discuss the history, evolution and motivation of Data Mining

To discuss Data Mining techniques, tasks, applications and some major issues
https://dribbble.com/shots/10494603-Isometric-Animation-Data-Mining
PATTERN RECOGNITION AND DATA MINING
PATTERN RECOGNITION
The process of recognizing patterns using a machine (computer). It can be viewed from several aspects:

Pattern Recognition by Human
• perceptual (emotions, feelings)
• specialized – decision making

Pattern Recognition by Computer
• benefit of automated pattern recognition
• advantage in complex calculations

Pattern Recognition from Data
• learn or observe from large amounts of data
• study the dependencies and extract knowledge from data
WHAT IS DATA?
Data – the basic facts such as names, numbers or characters that come in different forms
(like text or image).

Table 1 - A sample of data with five (5) variables, where the last column indicates the outcome of that sample.

#    Names            Studies   Education    Work_performance  Income (D)
1    Amni Ali         Poor      High School  Poor              None
2    Chuah Ah Lan     Moderate  High School  Poor              Low
3    Daria Danial     Poor      High School  Poor              None
4    Marisa Malik     Moderate  Diploma      Poor              Low
5    Nur Aini Mat     Poor      High School  Good              Low
6    Suria Mohd       Moderate  Diploma      Poor              Low
7    Ozaila Othman    Good      Master       Good              Medium
…
99   Muhd Haris Aziz  Poor      High School  Good              Low
100  Zulhairi Yatim   Moderate  Diploma      Poor              Low
WHAT IS KNOWLEDGE?
Knowledge – processed or organized data (information) that has been given meaning, so that the relationships within it can be uncovered for deeper understanding.

https://www.ontotext.com/knowledgehub/fundamentals/dikw-pyramid/

Sample of knowledge in the form of IF-THEN rules:

studies(Poor) AND work(Poor) → income(None)
studies(Poor) AND work(Good) → income(Low)
education(Diploma) → income(Low)
education(Master) → income(Medium) OR income(High)
studies(Moderate) → income(Low)
studies(Good) → income(Medium) OR income(High)
education(SPM) AND work(Good) → income(Low)
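To make the link between Table 1 (data) and the rules (knowledge) concrete, the sketch below encodes the rules as plain Python data and checks them against three rows of the table. It is only an illustration under stated assumptions: the names `RULES`, `matching_rules` and `predict_income` are hypothetical, `work` is taken to mean the Work_performance column, and when several rules fire their outcomes are intersected, which is one possible way to combine them and is not stated on the slide. The `education(SPM)` rule is kept verbatim, so it does not fire on rows recorded as "High School".

```python
# Rules transcribed from the slide: (conditions, set of possible incomes).
RULES = [
    ({"studies": "Poor", "work": "Poor"}, {"None"}),
    ({"studies": "Poor", "work": "Good"}, {"Low"}),
    ({"education": "Diploma"}, {"Low"}),
    ({"education": "Master"}, {"Medium", "High"}),
    ({"studies": "Moderate"}, {"Low"}),
    ({"studies": "Good"}, {"Medium", "High"}),
    ({"education": "SPM", "work": "Good"}, {"Low"}),
]


def matching_rules(record):
    """Return every rule whose conditions all hold for the record."""
    return [
        (conditions, incomes)
        for conditions, incomes in RULES
        if all(record.get(key) == value for key, value in conditions.items())
    ]


def predict_income(record):
    """Intersect the outcomes of all rules that fire (an assumed policy).

    Returns None when no rule applies to the record.
    """
    fired = matching_rules(record)
    if not fired:
        return None
    return set.intersection(*(incomes for _, incomes in fired))


# Rows 1, 5 and 7 of Table 1 (the income column is left out on purpose).
amni = {"studies": "Poor", "education": "High School", "work": "Poor"}
nur_aini = {"studies": "Poor", "education": "High School", "work": "Good"}
ozaila = {"studies": "Good", "education": "Master", "work": "Good"}

for name, record in [("Amni Ali", amni), ("Nur Aini Mat", nur_aini),
                     ("Ozaila Othman", ozaila)]:
    print(name, "->", predict_income(record))
```

For these three rows the predicted sets contain the income recorded in Table 1 (None, Low and Medium respectively).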
WHAT IS DATA MINING?
Data mining – definition
• Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data
• Exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns

Alternative names
Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting

Is everything “data mining”?
No. Simple search and query processing, such as a query for information about “Shopee products”, is not data mining.
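The difference between a simple query and data mining can be sketched with the nine rows of Table 1 that are visible on the slide (rows 8 to 98 are elided and are not invented here). The query returns records that match a condition the user already has in mind; the mining step searches the data for relationships nobody specified in advance. The function name `single_attribute_rules` and the use of rule confidence (the share of matching rows that have the predicted income) are assumptions made for this illustration, not a method taken from the slides.

```python
from collections import Counter, defaultdict

FIELDS = ("name", "studies", "education", "work", "income")
ROWS = [  # the nine rows shown in Table 1
    ("Amni Ali", "Poor", "High School", "Poor", "None"),
    ("Chuah Ah Lan", "Moderate", "High School", "Poor", "Low"),
    ("Daria Danial", "Poor", "High School", "Poor", "None"),
    ("Marisa Malik", "Moderate", "Diploma", "Poor", "Low"),
    ("Nur Aini Mat", "Poor", "High School", "Good", "Low"),
    ("Suria Mohd", "Moderate", "Diploma", "Poor", "Low"),
    ("Ozaila Othman", "Good", "Master", "Good", "Medium"),
    ("Muhd Haris Aziz", "Poor", "High School", "Good", "Low"),
    ("Zulhairi Yatim", "Moderate", "Diploma", "Poor", "Low"),
]
records = [dict(zip(FIELDS, row)) for row in ROWS]

# Simple query: the condition is supplied by the user, nothing is discovered.
diploma_holders = [r["name"] for r in records if r["education"] == "Diploma"]


def single_attribute_rules(records, min_confidence=1.0):
    """Find rules 'attribute(value) -> income' that hold in the data.

    A rule is kept when the most frequent income among the rows with that
    attribute value reaches min_confidence (hits / matching rows).
    """
    counts = defaultdict(Counter)
    for r in records:
        for attr in ("studies", "education", "work"):
            counts[(attr, r[attr])][r["income"]] += 1

    rules = []
    for (attr, value), incomes in counts.items():
        income, hits = incomes.most_common(1)[0]
        confidence = hits / sum(incomes.values())
        if confidence >= min_confidence:
            rules.append((attr, value, income, confidence))
    return rules


print("Query result:", diploma_holders)
for attr, value, income, confidence in single_attribute_rules(records):
    print(f"{attr}({value}) -> income({income})  confidence={confidence:.2f}")
```

On these nine rows the mining step recovers, among others, studies(Moderate) → income(Low) and education(Diploma) → income(Low). With so few rows, a rule supported by a single record (such as the one for education(Master)) should be read with caution.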
WHY DATA MINING?
Today, the amount of available data is growing massively, from terabytes to yottabytes, and data is everywhere.

Sources of data
• Social media: Facebook, Instagram, Telegram
• Society: blogs, news
• E-commerce: Amazon, Shopee, Lazada

“There were 5 exabytes of information created between the dawn of civilization through
2003, but that much information is now created every 2 days” – Eric Schmidt, Executive Chairman of Google

“Information is the oil of the 21st century, and analytics is the combustion engine.” – Peter Sondergaard, Gartner Research
FROM DATA MINING TO BIG DATA MINING

What is Big Data?

A term that refers to a large amount of data, where the concept is related to the characteristics of the data itself.

Figure 1. The 5 V’s of Big Data
https://www.techentice.com/the-data-veracity-big-data/
FROM DATA MINING TO BIG DATA MINING

Big data mining refers to the collection of data mining or extraction techniques performed on large volumes of data (big data).

Goal – to discover insights from social media platforms (Instagram, Twitter, Facebook) with thousands of postings.

Examples
• Classifying youth emotions based on Twitter data
• Sentiment analysis on reviews of Proton cars in Malaysia using Facebook postings
CONCLUSION

DATA MINING simply…

finds relationships
(that exist within the dataset)
and
makes predictions

Photo credit:
https://www.bigstockphoto.com/image-12788702/stock-vector-a-fortune-teller-holding-her-crystal-ball-vector
REFERENCES

1. Pang-Ning Tan, Michael Steinbach & Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2019.
2. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 3rd Edition, Morgan Kaufmann, 2012.
3. Che D., Safran M., Peng Z. (2013) From Big Data to Big Data Mining: Challenges, Issues, and Opportunities. In: Hong B., Meng X., Chen L., Winiwarter W., Song W. (eds) Database Systems for Advanced Applications. DASFAA 2013. Lecture Notes in Computer Science, vol 7827. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40270-8_1
4. Razak, Z. I., & Mutalib, S. (2018). Web Mining in Classifying Youth Emotions. Malaysian Journal of Computing, 3(1), 1-11.
5. Wah, Y. B., Abdullah, N., Abdul-Rahman, S., & Tan, M. L. P. (2018). Text Mining and Sentiment Analysis on Reviews of Proton Cars in Malaysia. Malaysian Journal of Science, 37(2), 137-153.
THANK YOU
Shuzlina Abdul Rahman | Sofianita Mutalib | Siti Nur Kamaliah Kamarudin | Farah Syazwani Mohd Rashid
