Download as txt, pdf, or txt
Download as txt, pdf, or txt
You are on page 1of 2

classification: Direct Marketing Goal: Reduce cost of mailing by targeting a set of consumers likely to buy a new cell-phone product.

. O Fraud Detection Goal: Predict fraudulent cases in credit card transactions. Clustering: Application 1 O Market Segmentation: O Document Clustering: Goal: To find groups of documents that are similar to each other based on the important terms appearing in them. Association: Produce dependency rules which will predict occurrence of an item based on occurrences of other items. Marketing and Sales Promotion: Nominal Examples: ID numbers, eye color, zip codes Ordinal Examples: rankings (e.g., taste of potato chips on a scale from 1-10), grades, height in {tall, medium, short} Interval Examples: calendar dates, temperatures in Celsius or ****zero point Fahrenheit. Ratio Examples: temperature in Kelvin, length, time, cou For ratio variables, both differences and ratios are meaningful. (*, /) Nominal attribute: distinctness Ordinal attribute: distinctness & order Interval attribute: distinctness, order & addition Ratio attribute: all 4 properties Distinctness: = Order: < > Noise refers to modification of original values Examples: distortion of a person's voice when talk Outliers are data objects with characteristics that are considerably different than most of the other data objects in the data set

Aggregation: Data reduction Reduce the number of attributes or objects Change of scale Cities aggregated into regions, states, countries, etc More stable data Aggregated data tends to have less variabi

Minkowski Distance: Examples O r = 1. City block (Manhattan, taxicab, L1 norm) distance. A common example of this is the Hamming distance, which is just the number of bits that are different between two binary vectors O r = 2. Euclidean distance O r . supremum (Lmaxnorm, L norm) distance. This is the maximum difference between any component of the vectors O Do not confuse r with n, i.e., all these distances are defined for all numbers of dimensions.

You might also like