ASSIGNMENT – 3

NAME :- HARSHIT PATIL


ROLL NO. :- 11
STD/DIV :- X-F

Q.1) Through a step-by-step process, calculate TFIDF for the given corpus.
Document 1: We are going to Mumbai
Document 2: Mumbai is a famous place.
Document 3: We are going to a famous place.
Document 4: I am famous in Mumbai.
1. Create a document vector table for all the documents.

             we  are  going  to  Mumbai  is
Document 1    1    1      1   1       1   0
Document 2    0    0      0   0       1   1
Document 3    1    1      1   1       0   0
Document 4    0    0      0   0       1   0

             famous  place  I  am  in  a
Document 1        0      0  0   0   0  0
Document 2        1      1  0   0   0  1
Document 3        1      1  0   0   0  1
Document 4        1      0  1   1   1  0
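
The counting in step 1 can be checked with a short Python sketch. It recomputes the counts directly from the four sentences. The names `corpus`, `vocab`, `tokenize` and `document_vector` are illustrative choices, and the tokenizer assumes that words are compared in lowercase with punctuation ignored.

```python
import re

# The four documents from the question.
corpus = [
    "We are going to Mumbai",
    "Mumbai is a famous place.",
    "We are going to a famous place.",
    "I am famous in Mumbai.",
]

# Column order follows the table; words are lowercased for matching (assumption).
vocab = ["we", "are", "going", "to", "mumbai", "is",
         "famous", "place", "i", "am", "in", "a"]


def tokenize(text):
    """Lowercase the text and keep only alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def document_vector(text):
    """Count how often each vocabulary word occurs in one document."""
    tokens = tokenize(text)
    return [tokens.count(word) for word in vocab]


vectors = [document_vector(doc) for doc in corpus]
for number, row in enumerate(vectors, start=1):
    print(f"Document {number}:", row)
```

Each printed row lists the counts for one document in the column order we, are, going, to, Mumbai, is, famous, place, I, am, in, a.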
2. Make an inverse document frequency table.
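Each entry is the total number of documents (4) divided by the number of documents that contain the word.

we   are  going  to   Mumbai  is
4/2  4/2  4/2    4/2  4/3     4/1

famous  place  I    am   in   a
4/3     4/2    4/1  4/1  4/1  4/2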
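
As a rough check of step 2, the sketch below counts how many documents contain each word and prints the ratio of total documents to that count. The variable names are illustrative, and the same lowercase, punctuation-free tokenization is assumed.

```python
import re

# The four documents from the question.
corpus = [
    "We are going to Mumbai",
    "Mumbai is a famous place.",
    "We are going to a famous place.",
    "I am famous in Mumbai.",
]

# Column order follows the table; words are lowercased for matching (assumption).
vocab = ["we", "are", "going", "to", "mumbai", "is",
         "famous", "place", "i", "am", "in", "a"]


def tokenize(text):
    """Lowercase the text and keep only alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


# One set of distinct words per document, so repeats inside a document count once.
token_sets = [set(tokenize(doc)) for doc in corpus]
total_documents = len(corpus)

# Document frequency: number of documents that contain the word at least once.
document_frequency = {
    word: sum(word in tokens for tokens in token_sets) for word in vocab
}

for word in vocab:
    print(f"{word}: {total_documents}/{document_frequency[word]}")
```

The ratios are printed unreduced (for example 4/2 rather than 2) so they can be compared with the table entry by entry.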

3. Write the TFIDF formula for all the documents.
            we          are         going       to          Mumbai      is
Document 1  1*log(4/2)  1*log(4/2)  1*log(4/2)  1*log(4/2)  1*log(4/3)  0*log(4)
Document 2  0*log(4/2)  0*log(4/2)  0*log(4/2)  0*log(4/2)  1*log(4/3)  1*log(4)
Document 3  1*log(4/2)  1*log(4/2)  1*log(4/2)  1*log(4/2)  0*log(4/3)  0*log(4)
Document 4  0*log(4/2)  0*log(4/2)  0*log(4/2)  0*log(4/2)  1*log(4/3)  0*log(4)

            famous      place       I         am        in        a
Document 1  0*log(4/3)  0*log(4/2)  0*log(4)  0*log(4)  0*log(4)  0*log(4/2)
Document 2  1*log(4/3)  1*log(4/2)  0*log(4)  0*log(4)  0*log(4)  1*log(4/2)
Document 3  1*log(4/3)  1*log(4/2)  0*log(4)  0*log(4)  0*log(4)  1*log(4/2)
Document 4  1*log(4/3)  0*log(4/2)  1*log(4)  1*log(4)  1*log(4)  0*log(4/2)
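
The formulas in step 3 can be evaluated numerically with the sketch below. The assignment does not state the base of the logarithm, so base 10 is assumed here. The function and variable names are illustrative.

```python
import math
import re

# The four documents from the question.
corpus = [
    "We are going to Mumbai",
    "Mumbai is a famous place.",
    "We are going to a famous place.",
    "I am famous in Mumbai.",
]

# Column order follows the table; words are lowercased for matching (assumption).
vocab = ["we", "are", "going", "to", "mumbai", "is",
         "famous", "place", "i", "am", "in", "a"]


def tokenize(text):
    """Lowercase the text and keep only alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


tokenized = [tokenize(doc) for doc in corpus]
total_documents = len(corpus)

# Number of documents that contain each word at least once.
document_frequency = {
    word: sum(word in set(tokens) for tokens in tokenized) for word in vocab
}


def tfidf_row(tokens):
    """TFIDF = term count * log10(total documents / document frequency).

    Base 10 is an assumption; the assignment only writes "log".
    """
    return [
        tokens.count(word) * math.log10(total_documents / document_frequency[word])
        for word in vocab
    ]


tfidf_table = [tfidf_row(tokens) for tokens in tokenized]
for number, row in enumerate(tfidf_table, start=1):
    print(f"Document {number}:", [round(value, 3) for value in row])
```

Under the base-10 assumption, log(4/2) is about 0.301, log(4/3) about 0.125 and log(4) about 0.602, so the words that occur in only one document (is, I, am, in) get the highest non-zero scores.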
