Learning Paradigm - Classification: L Jeganathan
OUTLINE
Classification - A Learning Paradigm
Technical Description of the Problems that are fit for Learning a ‘class’
Summary with an Exercise
S.no.  Loan Amount (L)  Monthly Income (I)  Monthly Savings (S)  Age (A)   Collaterals (C)  Financial History (H)  risk-score (r)
       in Lakhs         in Lakhs            in Lakhs             in years  in Lakhs
1      10               0.2                 0.02                 35        3                good                   Low
2      23               0.5                 0.1                  39        10               good                   Low
3      70               0.7                 0.01                 45        9                bad                    High
4      50               1.5                 0.5                  40        9                good                   Low
5      35               0.9                 0.3                  29        10               bad                    High
...
TASK:
Given the values of the input attributes of a loan application, compute the
class (High risk or Low risk) to which the loan application belongs.
Procedure:
Propose an equation (called a hypothesis) that involves all the input
variables.
Let Score(x) = w_1 L + w_2 I + w_3 S + w_4 A + w_5 C + w_6 H + w_0, where
x = (L, I, S, A, C, H).
Score(x) is the credit score for the input x.
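The hypothesis can be sketched in a few lines of Python. This is only an illustration: the weight values and the helper names (`WEIGHTS`, `W0`, `encode_history`, `score`) are hypothetical, and the encoding of the Financial History attribute as 1 for "good" and 0 for "bad" is an assumption, since the attribute must be numeric before it can enter the equation.

```python
# Hypothetical weights (w1..w6) and bias (w0), chosen only for illustration.
# In practice these values are learned from the examples in E.
WEIGHTS = (-0.05, 2.0, 3.0, 0.01, 0.1, 1.5)
W0 = -1.0


def encode_history(history):
    """Assumed numeric encoding of Financial History: good -> 1.0, bad -> 0.0."""
    return 1.0 if history == "good" else 0.0


def score(x):
    """Compute Score(x) = w1*L + w2*I + w3*S + w4*A + w5*C + w6*H + w0."""
    L, I, S, A, C, H = x
    values = (L, I, S, A, C, encode_history(H))
    return sum(w * v for w, v in zip(WEIGHTS, values)) + W0


# First row of the loan table: (L, I, S, A, C, H)
x1 = (10, 0.2, 0.02, 35, 3, "good")
print(score(x1))
```

With different weights the same input receives a different score, which is why choosing the weights is the learning part of the problem.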
PROCESS - CONTINUED
$P = \sum_t k^t$
where
$k^t = 1$ if $\hat{r}^t \neq r^t$ (the predicted class differs from the actual class)
$k^t = 0$ if $\hat{r}^t = r^t$
P is the number of misclassifications made by the classifier.
NOTE
‘P = 0’ means that our learning model has classified every instance correctly.
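The count P can be sketched as follows. The `actual` list is the risk-score column of the five loan rows; the `predicted` list is a made-up classifier output used only to illustrate the counting, and the function name `count_misclassifications` is hypothetical.

```python
def count_misclassifications(actual, predicted):
    """Return P: the number of instances where the prediction differs from the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # k is 1 for a misclassified instance and 0 otherwise; P is the sum of all k.
    return sum(1 if r_hat != r else 0 for r, r_hat in zip(actual, predicted))


actual = ["Low", "Low", "High", "Low", "High"]     # risk-score column, rows 1 to 5
predicted = ["Low", "High", "High", "Low", "Low"]  # hypothetical classifier output

P = count_misclassifications(actual, predicted)
print(P)  # 2 (rows 2 and 5 are misclassified)
```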
PLOTTING E
Any input with d factors (attributes) can be plotted as a point in d-dimensional space.
The line that separates the two classes is called the ‘Discriminant’.
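For two input attributes, a discriminant can be sketched as a straight line g(x) = w1*x1 + w2*x2 + w0 = 0, with points on either side assigned to different classes. The weights below and the names `discriminant` and `side` are hypothetical, chosen only so that the line is x1 = x2.

```python
def discriminant(x1, x2, w1=1.0, w2=-1.0, w0=0.0):
    """Evaluate g(x) = w1*x1 + w2*x2 + w0 for a two-dimensional point.

    The default weights are illustrative and describe the line x1 = x2.
    """
    return w1 * x1 + w2 * x2 + w0


def side(x1, x2):
    """Report on which side of the discriminant the point (x1, x2) falls."""
    g = discriminant(x1, x2)
    if g > 0:
        return "positive side"
    if g < 0:
        return "negative side"
    return "on the line"


print(side(2.0, 1.0))  # positive side
print(side(1.0, 2.0))  # negative side
```

Which class is attached to the positive side is a labelling choice; the line itself only splits the plane into two regions.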
Performance Measure:
Classification: the total number of misclassifications made by the learner (classifier).
Regression: the average of the squares of the errors (the difference between the
actual output and the predicted output) made on each instance of E.
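The average of the squared errors can be sketched as below. The numeric values are hypothetical and serve only to show the arithmetic; the function name `mean_squared_error` is likewise an illustrative choice.

```python
def mean_squared_error(actual, predicted):
    """Average of (actual - predicted)^2 over all instances."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(r - r_hat) ** 2 for r, r_hat in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


actual = [1.0, 2.0, 3.0]     # hypothetical actual outputs
predicted = [1.0, 2.5, 2.0]  # hypothetical predicted outputs

# Squared errors are 0.0, 0.25 and 1.0, so the average is 1.25 / 3.
mse = mean_squared_error(actual, predicted)
print(mse)
```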
S.no.  x_1    x_2    x_3    ...  r
1      x_1^1  x_2^1  x_3^1  ...  r^1
2      x_1^2  x_2^2  x_3^2  ...  r^2
3      x_1^3  x_2^3  x_3^3  ...  r^3
...
N      x_1^N  x_2^N  x_3^N  ...  r^N
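One way to hold such a table in a program is a list of (input, output) pairs. The sketch below fills it with the five loan rows shown earlier; the variable names `E`, `X`, `r` and `N` mirror the notation of the table, and the choice of tuples is only one of several reasonable representations.

```python
# Each entry of E is a pair (x, r): x holds the input attributes
# (L, I, S, A, C, H) and r is the known class for that instance.
E = [
    ((10, 0.2, 0.02, 35, 3, "good"), "Low"),
    ((23, 0.5, 0.1, 39, 10, "good"), "Low"),
    ((70, 0.7, 0.01, 45, 9, "bad"), "High"),
    ((50, 1.5, 0.5, 40, 9, "good"), "Low"),
    ((35, 0.9, 0.3, 29, 10, "bad"), "High"),
]

N = len(E)                     # number of instances listed here
X = [x for x, _ in E]          # inputs x^1 ... x^N
r = [label for _, label in E]  # outputs r^1 ... r^N

print(N, r)
```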
SUMMARY
EXERCISE
1. True or False: In a classification problem with two features as the input
variables, the data points can be plotted in a two-dimensional plane.
2. Propose a new problem (not discussed by us), with a clear description of E
and T, where classification-based learning is feasible.
3. What is the main difference between the two learning paradigms,
classification and regression?
4. Performance measure of a classifier is defined as the total number of
misclassifications. Can we define the performance measure of a classifier as
the total number of correct classifications?