Word Embedding
Terminology
• The term “Word Embedding” came from the deep learning community
• Other terms:
– Distributed Representation
– Semantic Vector Space
– Word Space
Representing words by their context
banking = [0.286, 0.792, −0.177, −0.107, 0.109, −0.542, 0.349, 0.271]
Two words are similar in meaning if they have similar context vectors!
Context vector of “vertigineux” (French for “dizzying”) = [0, 1, 6, 6, 1, ...]
Context vector of “vertiges” (French for “dizziness”) = [0, 2, 6, 6, 0, ...]
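To make this concrete, here is a minimal sketch (not from the slides; the toy corpus, the window size of 2, and the function names are illustrative assumptions) that builds count-based context vectors and compares them with cosine similarity:

```python
from collections import Counter
import math

# Toy corpus: each sentence is a list of tokens (illustrative assumption).
corpus = [
    "the bank approved the loan".split(),
    "the bank raised the interest rate".split(),
    "the river bank eroded after the flood".split(),
]

def context_vector(target, sentences, window=2):
    """Count the words appearing within `window` positions of `target`."""
    counts = Counter()
    for sent in sentences:
        for i, word in enumerate(sent):
            if word == target:
                for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                    if j != i:
                        counts[sent[j]] += 1
    return counts

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c * v[w] for w, c in u.items() if w in v)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Words that occur in similar contexts get a cosine score closer to 1.
print(cosine(context_vector("bank", corpus), context_vector("loan", corpus)))
```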
Term weighting
• In practice, weighting terms works better than using raw counts.
Example:
Word Collection frequency Document frequency
Which word is a better search term (and should get a higher weight)?
How does “importance” or “informativeness” relate to document frequency?
Ideas?
tf-idf weighting
The tf-idf weight of a term is the product of its tf weight and its idf weight.
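One standard instantiation (an assumption here; the slides do not pin down the exact variant) uses base-10 logarithms, matching the worked example below:

```latex
% w_{t,d}: weight of term t in document d
% tf_{t,d}: count of t in d;  N: number of documents;  df_t: documents containing t
w_{t,d} = \underbrace{\bigl(1 + \log_{10} \mathrm{tf}_{t,d}\bigr)}_{\text{tf weight}}
          \times
          \underbrace{\log_{10} \frac{N}{\mathrm{df}_t}}_{\text{idf weight}}
```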
3 documents example
term        N1    N2    N3
affection   115   58    20
jealous     10    7     11
gossip      2     0     6
wuthering   0     0     38
3 documents example contd.
Log frequency weighting (1 + log10 tf):
term        N1     N2     N3
affection   3.06   2.76   2.30
jealous     2.00   1.85   2.04
gossip      1.30   0      1.78
wuthering   0      0      2.58

After normalization (each column divided by its length):
term        N1     N2     N3
affection   0.789  0.832  0.524
jealous     0.515  0.555  0.465
gossip      0.335  0      0.405
wuthering   0      0      0.588
cos(N1,N2) ≈ 0.789 × 0.832 + 0.515 × 0.555 + 0.335 × 0 + 0 × 0 ≈ 0.94
cos(N1,N3) ≈ 0.79
cos(N2,N3) ≈ 0.69
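These numbers can be reproduced with a short script. This is a minimal sketch, assuming base-10 logs, no idf factor (matching the tables, which use log frequency weighting only), and cosine length normalization; the wuthering raw count of 38 is back-computed from its log weight of 2.58:

```python
import math

# Raw term counts per novel, from the table above.
counts = {
    "affection": [115, 58, 20],
    "jealous":   [10, 7, 11],
    "gossip":    [2, 0, 6],
    "wuthering": [0, 0, 38],  # assumption: 38 recovered from 1 + log10(38) ≈ 2.58
}

def log_tf(tf):
    """Log frequency weight: 1 + log10(tf) for tf > 0, else 0."""
    return 1 + math.log10(tf) if tf > 0 else 0.0

# Build each novel's weight vector, then length-normalize it.
docs = []
for d in range(3):
    w = [log_tf(counts[t][d]) for t in counts]
    norm = math.sqrt(sum(x * x for x in w))
    docs.append([x / norm for x in w])

def cos(u, v):
    # The vectors are unit length, so the dot product is the cosine.
    return sum(a * b for a, b in zip(u, v))

print(round(cos(docs[0], docs[1]), 2))  # ≈ 0.94
print(round(cos(docs[0], docs[2]), 2))  # ≈ 0.79
print(round(cos(docs[1], docs[2]), 2))  # ≈ 0.69
```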
tf-idf weighting has many variants