MLR Institute of Technology
DATA MINING
(PROFESSIONAL ELECTIVE - I)
IB. TECH- 1 SEMESTER
Course Code | Category | Hours/Week    | Credits | Maximum Marks
            |          | L  | T  | P   | C       | CIE | SEE | Total
ASCS19      | PEC      | 3  | -  | -   | 3       | 30  | 70  | 100
COURSE OBJECTIVES
1. Learn data mining concepts and preprocessing techniques.
2. Understand association rules for frequent pattern mining.
3. Discuss various classification algorithms.
4. Understand clustering techniques to group unlabeled data.
5. Learn basic concepts and techniques of information retrieval and web search for knowledge
extraction from the web.
COURSE OUTCOMES
1. Perform the preprocessing of data and apply mining techniques to it.
2. Generate frequent patterns from a given data set.
3. Apply standard classification algorithms and assess the quality of classification models.
4. Demonstrate basic clustering models and perform outlier analysis.
5. Develop skills in using recent data mining software to solve practical web mining problems.
UNIT-I | INTRODUCTION | CLASSES: 12
INTRODUCTION TO DATA MINING: Introduction, What is Data Mining, Definition, Knowledge Discovery
from Data (KDD), What Kinds of Data can be Mined, Data Mining Tasks, Integration of a Data Mining
System with a Database or Data Warehouse System, Types of Data Sets and Attribute Values.
PREPROCESSING: Data Quality, Major Tasks in Data Pre-processing, Data Cleaning and Data
Integration, Data Reduction, Data Transformation and Data Discretization, Measures of Similarity and
Dissimilarity - Basics.
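As an illustration of the similarity and dissimilarity measures listed in this unit, the following is a minimal sketch in Python. The choice of measures (Euclidean distance, cosine similarity, Jaccard similarity) and the function names are assumptions made for this example; the syllabus itself does not prescribe specific measures.

```python
import math

def euclidean_distance(x, y):
    """Dissimilarity: straight-line distance between two numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine_similarity(x, y):
    """Similarity: cosine of the angle between two non-zero numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)  # undefined if either vector is all zeros

def jaccard_similarity(s, t):
    """Similarity for sets: size of the intersection over size of the union."""
    s, t = set(s), set(t)
    if not s and not t:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(s & t) / len(s | t)

print(euclidean_distance([0, 0], [3, 4]))
print(cosine_similarity([1, 2], [2, 4]))
print(jaccard_similarity({"a", "b", "c"}, {"b", "c", "d"}))
```

Euclidean distance suits continuous attributes, cosine similarity is common for sparse vectors such as documents, and Jaccard similarity applies to asymmetric binary or set-valued attributes.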
UNIT-II | ASSOCIATION RULES | CLASSES: 12
ASSOCIATION RULES: Problem Definition, Frequent Item Set Generation, The APRIORI Principle,
Support and Confidence Measures, Association Rule Generation; APRIORI Algorithm, The Partition
Algorithms, FP-Growth Algorithms, Compact Representation of Frequent Item Set - Maximal Frequent
Item Set, Closed Frequent Item Set.
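The support and confidence measures and the level-wise Apriori idea from this unit can be sketched as follows. This is a simplified teaching sketch, not an optimized implementation; the toy transactions, the function names, and the use of an absolute minimum support count are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical market-basket data used only for this example.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)

def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    both = support_count(set(antecedent) | set(consequent), transactions)
    return both / support_count(antecedent, transactions)

def apriori(transactions, min_count):
    """Return {frequent itemset: support count}, found level by level."""
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items
               if support_count([i], transactions) >= min_count]
    frequent, k = {}, 1
    while current:
        for itemset in current:
            frequent[itemset] = support_count(itemset, transactions)
        current_set = set(current)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k + 1}
        # Prune step (Apriori principle): every k-subset must be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in current_set
                          for s in combinations(c, k))
                   and support_count(c, transactions) >= min_count]
        k += 1
    return frequent

frequent = apriori(transactions, min_count=3)
print(len(frequent))
print(confidence({"diapers"}, {"beer"}, transactions))
```

The prune step is where the Apriori principle pays off: a candidate is discarded without counting if any of its subsets is already known to be infrequent.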
UNIT-III | CLASSIFICATION | CLASSES: 14
CLASSIFICATION: Problem Definition, General Approaches to Solving a Classification Problem,
Evaluation of Classifiers, Classification Techniques, Decision Trees - Decision Tree Construction,
Methods for Expressing Attribute Test Conditions, Measures for Selecting the Best Split, Algorithm for
Decision Tree Induction; Naive Bayes Classifier, Bayesian Belief Networks; K-Nearest Neighbor
Classification - Algorithm and Characteristics.
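To make "Measures for Selecting the Best Split" concrete, the sketch below computes entropy, the Gini index, and information gain for a candidate split. The function names and the tiny label lists are assumptions made for illustration; the listed textbooks cover further measures, such as gain ratio, that are not shown here.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy (in bits) of the class distribution in a node."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index of the class distribution in a node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]   # each child is pure
useless_split = [["yes", "no"], ["yes", "no"]]   # children mirror the parent

print(entropy(parent), gini(parent))
print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

A decision tree induction algorithm would evaluate such a measure for every candidate attribute test and pick the split with the highest gain (or lowest weighted impurity).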
UNIT-IV | CLUSTERING | CLASSES: 14
CLUSTERING: Problem Definition, Clustering Overview, Evaluation of Clustering Algorithms, Partitioning
Clustering-K-Means Algorithm, K-Means Additional issues, PAM Algorithm; Hierarchical Clustering-
Agglomerative Methods and divisive methods, Basic Agglomerative Hierarchical Clustering Algorithm,
Specific Techniques, Key Issues in Hierarchical Clustering, Strengths and Weaknesses; Outlier Detection.
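The K-Means algorithm named in this unit alternates between assigning points to the nearest centroid and recomputing centroids. Below is a minimal pure-Python sketch; the function name, the fixed initial centroids, and the toy points are assumptions chosen so that the run is deterministic. Real use would typically rely on a library implementation with a proper initialization strategy.

```python
def kmeans(points, centroids, max_iter=100):
    """Basic K-Means: returns (final centroids, clusters as lists of points)."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: assignments are stable
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (9, 8)])
print(centroids)
print(clusters)
```

The "additional issues" mentioned in the syllabus, such as sensitivity to initial centroids and handling of empty clusters, show up directly in this sketch: a different starting pair of centroids can change the result.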
B. Tech - Computer Science and Engineering
UNIT-V | WEB AND TEXT MINING | CLASSES: 12
WEB AND TEXT MINING: Introduction, web mining, web content mining, web structure
mining, web usage mining, Text mining - unstructured text, Text Mining Techniques,
hierarchy of categories, text clustering.
TEXT BOOKS
1. Data Mining: Concepts and Techniques - Jiawei Han, Micheline Kamber, Jian Pei (2012),
3rd edition, Elsevier, United States of America.
2. Introduction to Data Mining, Pang-Ning Tan, Vipin Kumar, Michael Steinbach, Pearson
Education.
3. Data Mining Techniques and Applications, Hongbo Du, Cengage India Publishing.
REFERENCE BOOKS
1. Data Mining Techniques, Arun K Pujari, 3rd Edition, Universities Press.
2. Data Mining Principles & Applications - T. V. Suresh Kumar, B. Eswara Reddy, Jagadish S.
Kallimani, Elsevier.
3. Data Mining, Vikram Pudi, P. Radha Krishna, Oxford University Press.
WEB LINKS
1. https://onlinecourses.nptel.ac.in/noc_cs06/preview
2. https://nptel.ac.in/noc/courses/noc21/SEM1/noc21-cs06/
DISTRIBUTED DATABASES
(PROFESSIONAL ELECTIVE - I)
ll B. TECH- I SEMESTER
Course Code | Category | Hours/Week    | Credits | Maximum Marks
            |          | L  | T  | P   | C       | CIE | SEE | Total
ASCS20      | PEC      | 3  | -  | -   | 3       | 30  | 70  | 100
COURSE OBJECTIVES
1. To understand the theoretical and practical aspects of database technologies.
2. To understand the need for distributed database technology to tackle deficiencies of
centralized database systems.
3. To introduce the concepts and techniques of distributed databases, including principles,
architectures, design, implementation and major application domains.
4. To familiarize students with emerging database technology.
COURSE OUTCOMES
1. Analyze databases using distributed database concepts and structures.
2. Apply methods and techniques for distributed query processing and optimization.
3. Apply the concepts of distributed transaction processing and concurrency control.
4. Illustrate reliability and security mechanisms in distributed databases.
5. Summarize the concepts of Distributed Object Database Management Systems.
UNIT-I | INTRODUCTION | CLASSES: 12
Features of Distributed versus Centralized Databases, Principles of Distributed Databases, Levels Of
Distribution Transparency, Reference Architecture for Distributed Databases, Types of Data
Fragmentation, Integrity Constraints in Distributed Databases, Distributed Database Design
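The types of data fragmentation listed in this unit can be illustrated with plain Python dictionaries standing in for tuples of a relation. This is only a sketch: the `employees` relation, its attribute names, and the helper functions are hypothetical and not taken from the syllabus or textbooks. It shows horizontal fragmentation (selection by predicate, rebuilt by union) and vertical fragmentation (projection that keeps the key, rebuilt by join).

```python
def horizontal_fragment(rows, predicate):
    """Horizontal fragment: the subset of tuples satisfying a predicate."""
    return [row for row in rows if predicate(row)]

def vertical_fragment(rows, key, attributes):
    """Vertical fragment: a projection that always keeps the key attribute."""
    return [{name: row[name] for name in (key, *attributes)} for row in rows]

def join_on_key(left, right, key):
    """Rebuild tuples from two vertical fragments by joining on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]

# Hypothetical global relation.
employees = [
    {"id": 1, "name": "Asha", "site": "Hyderabad", "salary": 50000},
    {"id": 2, "name": "Ravi", "site": "Chennai", "salary": 62000},
    {"id": 3, "name": "Meena", "site": "Hyderabad", "salary": 58000},
]

# Horizontal fragmentation by site; the union of fragments rebuilds the relation.
hyd = horizontal_fragment(employees, lambda r: r["site"] == "Hyderabad")
other = horizontal_fragment(employees, lambda r: r["site"] != "Hyderabad")
rebuilt_h = sorted(hyd + other, key=lambda r: r["id"])

# Vertical fragmentation; a join on the key rebuilds the relation.
personal = vertical_fragment(employees, "id", ["name", "site"])
payroll = vertical_fragment(employees, "id", ["salary"])
rebuilt_v = join_on_key(personal, payroll, "id")

print(rebuilt_h == employees, rebuilt_v == employees)
```

The two predicates used for the horizontal fragments are complementary, which is what makes the fragmentation complete and disjoint in this example; keeping the key in every vertical fragment is what makes the join lossless.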
UNIT-II | QUERY PROCESSING | CLASSES: 12
Translation of Global Queries to Fragment Queries, Equivalence transformations for Queries,
Transforming Global Queries into Fragment Queries, Distributed Grouping and Aggregate Function
Evaluation, Parametric Queries,
Optimization of Access Strategies, A Framework for Query Optimization, Join Queries, General Queries
UNIT-III | TRANSACTION MANAGEMENT AND CONCURRENCY CONTROL | CLASSES: 14
The Management of Distributed Transactions, A Framework for Transaction Management, Supporting
Atomicity of Distributed Transactions, Concurrency Control for Distributed Transactions, Architectural
Aspects of Distributed Transactions.
Concurrency Control, Foundation of Distributed Concurrency Control, Distributed Deadlocks,
Concurrency Control based on Timestamps, Optimistic Methods for Distributed Concurrency Control.
UNIT-IV | RELIABILITY AND SECURITY IN THE DISTRIBUTED DATABASES | CLASSES: 14
Reliability, Basic Concepts, Non-blocking Commitment Protocols, Reliability and Concurrency Control,
Determining a Consistent View of the Network, Detection and Resolution of Inconsistency, Checkpoints
and Cold Restart, Distributed Database Administration, Catalog Management in Distributed Databases,
Authorization and Protection.
UNIT-V | DISTRIBUTED OBJECT DATABASE MANAGEMENT SYSTEMS | CLASSES: 12
Architectural Issues, Alternative Client/Server Architectures, Cache Consistency, Object Management,
Object Identifier Management, Pointer Swizzling, Object Migration, Distributed Object Storage, Object
Query Processing, Object Query Processor Architectures, Query Processing Issues, Query Execution,
Transaction Management, Transaction Management in Object DBMSs, Transactions as Objects
TEXT BOOKS
1. Distributed Databases - Principles and Systems; Stefano Ceri; Giuseppe Pelagatti; Tata
McGraw Hill; 1985.
2. Fundamentals of Database Systems; Elmasri & Navathe; Pearson Education, Asia.
3. Database System Concepts; Korth & Sudarshan; TMH.
REFERENCE BOOKS
1. Database Management System; Leon & Leon; Vikas Publications.
2. Introduction to Database Systems; Bipin C Desai; Galgotia.
3. Principles of Distributed Database Systems; M. Tamer Özsu and Patrick Valduriez; Prentice
Hall.
WEB LINKS
1. https://www.digimat.in/nptel/courses/video/106106168/L01.html
2. https://nptel.ac.in/courses/106/106/106106168/