Business Intelligence Software and Techniques: BUAN6324/MIS6324
Spring, 2016 Gregory G. MacDonald, PhD
(Lecture #1)
Course Preliminaries/Logistics
Motivation for a Data Mining (DM) project
General modeling considerations
Typical phases of a DM project
Form groups (deferred for a few classes)
Install the required software
Short tool tutorial using the weather dataset
Assignment for next week
Office Hours
Section 501 (Monday)
Time: 6:00 PM to 7:00 PM, Location: JSOM 3.604
Email address: gregory.macdonald@utdallas.edu
Course TA: TBD
[Figure: top 10 programming languages ranking]
Source: http://cacm.acm.org/news/189911-the-2015-top-10-programminglanguages/fulltext
Resources
Special note on the following texts:
Theory: Introduction to Data Mining (Tan, Steinbach, Kumar, 2006)
Syllabus Review
Other Topics?
Terminology/Concepts
Some Terminology
Features are inputs (independent variables)
Target is the outcome (dependent variable)
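The split between features and target can be sketched in R. This is only an illustration: the data frame weather_toy, its column names, and its values are invented here and are not the course's weather dataset.

```r
# Hypothetical toy data (values invented purely for illustration).
weather_toy <- data.frame(
  humidity      = c(85, 60, 90, 55, 70),
  pressure      = c(1008, 1019, 1005, 1021, 1015),
  rain_tomorrow = factor(c("yes", "no", "yes", "no", "no"))
)

# Name the target column; every other column is treated as a feature.
target_name   <- "rain_tomorrow"
feature_names <- setdiff(names(weather_toy), target_name)

features <- weather_toy[, feature_names]  # inputs (independent variables)
target   <- weather_toy[[target_name]]    # outcome (dependent variable)

print(feature_names)
print(table(target))
```

Keeping the target name in one variable makes it easy to switch the outcome column without touching the rest of the script.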
General Concepts
Over-fitting
Generalization
Concept to be Learned
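One way to see over-fitting versus generalization is to compare error on the data used for fitting with error on held-out data. The sketch below uses simulated data and polynomial degrees chosen arbitrarily for illustration; the helper name mse is an assumption, not part of any package.

```r
set.seed(1)

# Simulated data: a smooth underlying concept plus noise.
n <- 30
x <- runif(n, 0, 3)
y <- sin(2 * x) + rnorm(n, sd = 0.3)

# Hold out the last 10 rows to estimate generalization.
train <- data.frame(x = x[1:20],  y = y[1:20])
test  <- data.frame(x = x[21:30], y = y[21:30])

# Mean squared error of a fitted model on a given data set.
mse <- function(model, data) {
  mean((data$y - predict(model, newdata = data))^2)
}

simple_fit  <- lm(y ~ poly(x, 2),  data = train)  # low-complexity model
complex_fit <- lm(y ~ poly(x, 12), data = train)  # high-complexity model

results <- data.frame(
  model     = c("degree 2", "degree 12"),
  train_mse = c(mse(simple_fit, train), mse(complex_fit, train)),
  test_mse  = c(mse(simple_fit, test),  mse(complex_fit, test))
)
print(results)
```

The more flexible model can never have a higher training error here, because the degree-2 model is a special case of it. Whether it generalizes is what the held-out error is meant to reveal.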
Types of Learning
Unsupervised: no target value
Supervised: target value (outcome)
Other forms of learning
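The difference between the two main types can be sketched with R's built-in iris data. The choice of k-means and logistic regression is an assumption made for illustration; the slides do not name specific algorithms.

```r
set.seed(42)

# Unsupervised: cluster on the measurements only; the Species column is never shown to kmeans.
clusters <- kmeans(iris[, 1:4], centers = 3, nstart = 10)

# Supervised: learn a mapping from a feature to a known target value.
two_species <- droplevels(subset(iris, Species != "setosa"))
two_species$is_virginica <- as.integer(two_species$Species == "virginica")
supervised_fit <- glm(is_virginica ~ Petal.Length,
                      data = two_species, family = binomial)

# Turn predicted probabilities into class labels with a 0.5 cutoff.
predicted <- ifelse(predict(supervised_fit, type = "response") > 0.5,
                    "virginica", "versicolor")

# Species is used here only after the fact, to inspect what the clusters found.
print(table(Cluster = clusters$cluster, Species = iris$Species))
print(table(Actual = two_species$Species, Predicted = predicted))
```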
Irrelevant Data
Features that are irrelevant to the concept
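A small simulation can illustrate why irrelevant features need attention. In this sketch the column named irrelevant is generated independently of the target; all names and values are invented for illustration.

```r
set.seed(7)

n <- 100
relevant   <- rnorm(n)
irrelevant <- rnorm(n)  # generated independently of the target
target     <- 2 * relevant + rnorm(n, sd = 0.5)
d <- data.frame(relevant, irrelevant, target)

fit_without <- lm(target ~ relevant, data = d)
fit_with    <- lm(target ~ relevant + irrelevant, data = d)

# R-squared on the training data, with and without the extra feature.
r2_without <- summary(fit_without)$r.squared
r2_with    <- summary(fit_with)$r.squared

print(c(without = r2_without, with = r2_with))
print(coef(summary(fit_with)))  # coefficient estimates and p-values
```

For least-squares fits like these, training R-squared cannot decrease when a feature is added, so training fit alone does not flag an irrelevant feature.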
Assessing Model Goodness
Accuracy?
Classification Error Rate?
Receiver Operating Characteristic (ROC) Curve
Confusion Matrix
Caution: the layout differs across the literature (e.g., some sources use the transpose of the matrix)
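Accuracy, classification error rate, and the confusion matrix can be computed in base R. The labels below are invented for illustration, and putting actual classes in rows is just one of the conventions the caution refers to.

```r
# Hypothetical actual and predicted labels for ten cases.
actual <- factor(
  c("yes", "yes", "no", "no", "yes", "no", "no", "yes", "no", "no"),
  levels = c("yes", "no"))
predicted <- factor(
  c("yes", "no", "no", "no", "yes", "yes", "no", "no", "no", "no"),
  levels = c("yes", "no"))

# Rows = actual class, columns = predicted class (one common layout).
cm <- table(Actual = actual, Predicted = predicted)

accuracy   <- sum(diag(cm)) / sum(cm)  # share of cases on the diagonal
error_rate <- 1 - accuracy             # share of cases off the diagonal

print(cm)
print(t(cm))  # the transposed layout: rows = predicted, columns = actual
print(c(accuracy = accuracy, error_rate = error_rate))
```

The dimension names Actual and Predicted make it explicit which layout is in use, which helps when comparing against a source that uses the transpose.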
Receiver Operating Characteristic
[Figure: ROC curve with the area under the curve (AUC)]
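The points on an ROC curve and the area under it can be sketched in base R. The function names roc_points and auc are hypothetical helpers written for this illustration, and the scores and labels are invented. The AUC here uses the rank-based (Mann-Whitney) formulation, which is one standard way to compute it.

```r
# ROC points: sweep the threshold from the highest score downward.
# This simple version assumes there are no tied scores.
roc_points <- function(scores, labels) {
  ord <- order(scores, decreasing = TRUE)
  data.frame(
    threshold = scores[ord],
    tpr = cumsum(labels[ord])  / sum(labels),   # true positive rate
    fpr = cumsum(!labels[ord]) / sum(!labels)   # false positive rate
  )
}

# AUC as the probability that a random positive outscores a random negative.
auc <- function(scores, labels) {
  n_pos <- sum(labels)
  n_neg <- sum(!labels)
  r <- rank(scores)  # average ranks are used for ties
  (sum(r[labels]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical model scores; labels are TRUE for the positive class.
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
labels <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)

roc <- roc_points(scores, labels)
print(roc)
print(auc(scores, labels))
```

An AUC of 0.5 corresponds to random ranking and 1 to perfect separation of the two classes.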
Confusion Matrix
Features: size, tail length, weight, ear length, eye count, food consumed, transportation method
Target: Cat, Dog, Rabbit
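For a three-class target such as this one, the confusion matrix also yields per-class measures. The labels below are invented for illustration and do not come from the slide's example.

```r
classes <- c("Cat", "Dog", "Rabbit")

# Hypothetical actual and predicted classes for ten animals.
actual <- factor(
  c("Cat", "Cat", "Cat", "Dog", "Dog", "Dog",
    "Rabbit", "Rabbit", "Rabbit", "Rabbit"),
  levels = classes)
predicted <- factor(
  c("Cat", "Cat", "Dog", "Dog", "Dog", "Cat",
    "Rabbit", "Rabbit", "Cat", "Rabbit"),
  levels = classes)

# Rows = actual class, columns = predicted class.
cm <- table(Actual = actual, Predicted = predicted)

recall    <- diag(cm) / rowSums(cm)  # of each actual class, share predicted correctly
precision <- diag(cm) / colSums(cm)  # of each predicted class, share that was correct

print(cm)
print(round(rbind(recall, precision), 2))
```

Fixing the factor levels up front keeps the matrix square even if a class is never predicted.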
Typical DM Process
Form Groups
4-5 people per group, with a mix of strengths
Group formation via eLearning
Group membership must stay fixed for the semester
Tools Installation
https://www.youtube.com/watch?v=cX532N_XLIs
R - http://www.r-project.org
RStudio (IDE for R) - https://www.rstudio.com
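Rattle - run the following in the R console:
install.packages("rattle")  # the package name must be a quoted string
library(rattle)             # load the package
rattle()                    # launch the Rattle GUI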
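To check whether the install succeeded before loading the package, something like the following sketch could be used. The helper name ensure_package and its install_if_missing argument are hypothetical; only requireNamespace and install.packages are standard R functions.

```r
# Returns TRUE if the package is available, FALSE otherwise.
# By default it only checks; it installs only when explicitly asked to.
ensure_package <- function(pkg, install_if_missing = FALSE) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(TRUE)
  }
  if (install_if_missing) {
    install.packages(pkg)
    return(requireNamespace(pkg, quietly = TRUE))
  }
  message("Package '", pkg, "' is not installed; run install.packages(\"",
          pkg, "\")")
  FALSE
}

# Check for Rattle without changing anything on the system.
rattle_ready <- ensure_package("rattle")
print(rattle_ready)
```

If the result is TRUE, library(rattle) followed by rattle() should start the GUI.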
Assignment
Read the first 3 chapters of Data Mining with Rattle and R, and chapter 1 of Introduction to Data Mining
Complete the install of R, RStudio, and Rattle
Next week: The journey begins! (2 weeks for the Monday class)
Topic: Classification