Web Mining
by NINI P SURESH
OUTLINE
Introduction
Data mining vs. Web mining
Web mining subtasks
Challenges
Taxonomy
Web content mining
Web structure mining
Web usage mining
Applications
INTRODUCTION
Users now need automated tools to find, extract, filter and evaluate the information and resources they want. Search engines aim only to discover resources on the web.
Other Approaches
Database approach (DB)
Information retrieval
Natural language processing (NLP)
Web document community
WEB MINING
DEFINITION
Web mining refers to the overall process of discovering potentially useful and previously unknown information or knowledge from Web data.
DATA MINING
Extraction of useful patterns from data sources like databases, texts, web, images, etc.
WEB MINING
Extracting relevant information hidden in Web-related data, like hypertext documents on the web
CHALLENGES
Search relevant information on web
Create knowledge
Personalization of Information
Learn patterns
Uniformity & standardisation
Redundant Information
Noisy web
Monitoring changes
Sites providing Services
Privacy
TAXONOMY
Web Mining
Link Mining
URL Mining
Discovering structure information from the web
Web graph: web pages as nodes & hyperlinks as edges
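A minimal sketch of the web graph idea, assuming hyperlinks are already available as (source, target) pairs. The page names and the `build_web_graph` helper are hypothetical and used only for illustration.

```python
from collections import defaultdict

# Hypothetical hyperlinks as (source page, target page) pairs.
links = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]

def build_web_graph(edges):
    """Return out-link and in-link adjacency sets for a directed web graph."""
    out_links = defaultdict(set)  # page -> pages it links to
    in_links = defaultdict(set)   # page -> pages that link to it
    for src, dst in edges:
        out_links[src].add(dst)
        in_links[dst].add(src)
    return out_links, in_links

out_links, in_links = build_web_graph(links)
```

Keeping both directions makes it cheap to ask either "which pages does this page cite?" or "which pages cite this page?", which link-analysis methods typically need.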
PageRank
Metric for ranking hypertext documents
Depends on the rank of the pages pointing to it
Iterative process
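The iterative idea can be sketched as a simple power iteration. This is an illustrative version, not the exact formulation from the slides: the damping factor of 0.85, the fixed iteration count, and the even redistribution of rank from pages without out-links are all assumptions.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Iteratively estimate PageRank for a graph given as page -> list of targets."""
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = out_links.get(page, [])
            if targets:
                # A page passes an equal share of its rank to each page it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no out-links spreads its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank

# Hypothetical three-page graph.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

In this small graph, page C is pointed to by both A and B, so it ends up with the highest rank, while B, which only receives half of A's rank, ends up lowest.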
HITS
Iterative algorithm
Identify topic hubs & authorities
Input: search results returned by traditional text indexing technique
Assigns weight to a hub based on the authoritativeness of the pages it links to
Outputs pages with the largest hub & authority weights
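A rough sketch of the hub and authority update loop, assuming the link structure among the search results is already known. The page names, the fixed iteration count and the Euclidean normalisation are illustrative choices, not details taken from the slides.

```python
import math

def hits(out_links, iterations=50):
    """Return (hub, authority) scores for a graph given as page -> list of targets."""
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the pages linking to a page.
        auth = {p: 0.0 for p in pages}
        for src, targets in out_links.items():
            for t in targets:
                auth[t] += hub[src]
        # Hub score: sum of the authority scores of the pages a page links to.
        hub = {p: sum(auth[t] for t in out_links.get(p, [])) for p in pages}
        # Normalise both score vectors so the values stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth

# Hypothetical result set: two hub-like pages pointing at two candidate authorities.
graph = {"h1": ["a1", "a2"], "h2": ["a1"]}
hub, auth = hits(graph)
best_hub = max(hub, key=hub.get)
best_authority = max(auth, key=auth.get)
```

Here `a1` is linked from both hub pages and `h1` links to both authorities, so they come out as the top authority and top hub respectively.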
[Figure: web usage mining process. Labels: Raw logs; Preprocessing; Pattern Analysis; Interesting Rules, Patterns & Statistics]
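As a hedged illustration of preprocessing raw logs, the sketch below parses server log lines and groups each visitor's requests into sessions. It assumes the logs are in Common Log Format and arrive in chronological order; the 30-minute session timeout, the rule of dropping failed requests, and the `parse_line` and `sessionize` helpers are assumptions for this example.

```python
import re
from datetime import datetime, timedelta

# Assumed input: Common Log Format, e.g.
# 10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def parse_line(line):
    """Return (host, timestamp, url, status), or None if the line does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("host"), ts, m.group("url"), int(m.group("status"))

def sessionize(lines, timeout=timedelta(minutes=30)):
    """Group successful requests per host into sessions split by an inactivity timeout."""
    sessions = []
    last_seen = {}  # host -> (time of last request, that host's current session)
    for line in lines:
        record = parse_line(line)
        if record is None or record[3] >= 400:
            continue  # skip malformed lines and failed requests
        host, ts, url, _ = record
        previous = last_seen.get(host)
        if previous is None or ts - previous[0] > timeout:
            session = []  # first request from this host, or the timeout expired
            sessions.append(session)
        else:
            session = previous[1]
        session.append(url)
        last_seen[host] = (ts, session)
    return sessions
```

Each returned session is a list of requested URLs, a convenient input for later pattern discovery.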
Pattern discovery
Statistical Analysis
Association Rules
Clustering analysis
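To make association rules concrete, here is a small sketch that derives "visitors who viewed X also viewed Y" rules from sessions. It assumes each session is a list of visited page URLs; the thresholds, the sample sessions and the `association_rules` helper are hypothetical, and only rules between single pages are considered.

```python
from collections import Counter
from itertools import permutations

def association_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Return sorted (antecedent, consequent, support, confidence) rules between page pairs."""
    n = len(sessions)
    page_sets = [set(s) for s in sessions]  # ignore repeat visits within a session
    page_count = Counter(p for s in page_sets for p in s)
    pair_count = Counter(
        pair for s in page_sets for pair in permutations(sorted(s), 2)
    )
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n                # share of sessions containing both pages
        confidence = count / page_count[a] # share of sessions with a that also contain b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)

# Hypothetical sessions, each a list of visited pages.
sessions = [
    ["/home", "/products"],
    ["/home", "/products", "/cart"],
    ["/home", "/about"],
    ["/products", "/cart"],
]
rules = association_rules(sessions)
```

With these sample sessions, every session containing `/cart` also contains `/products`, so the rule from `/cart` to `/products` has confidence 1.0 at support 0.5.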
APPLICATIONS
Personalized Services
Improve website design
System Improvement
Predicting trends
Carry out intelligent business
PROS
High trade volumes
Classify threats & fight against Terrorism
Establish better customer relationship
Increase profitability
CONS
CONCLUSION