Text Mining and Web Mining
Sentiment analysis or opinion mining refers to the application of natural language processing, computational linguistics, and text
analytics to identify and extract subjective information in source materials. Generally speaking, sentiment analysis aims to determine
the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be
his or her judgment or evaluation, affective state (that is to say, the emotional state of the author when writing), or the intended
emotional communication (that is to say, the emotional effect the author wishes to have on the reader) [Source: Wikipedia].
Sentiment analysis answers the question: is what is being said "positive" or "negative"?
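As a rough illustration of the positive/negative question, the sketch below counts words from two tiny hand-made word lists. The lists and the function name `polarity` are assumptions made for this example; real sentiment tools rely on far larger lexicons or trained models.

```python
import re

# Tiny illustrative lexicons (assumed for this sketch, not taken from any real tool).
POSITIVE = {"good", "great", "happy", "excellent", "gain"}
NEGATIVE = {"bad", "poor", "unhappy", "terrible", "loss"}

def polarity(text: str) -> str:
    """Label text by counting positive and negative lexicon words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("The service was great and the staff were good."))
print(polarity("A terrible product with poor support."))
```

Plain word counting of this kind ignores context: it would label "not happy" as positive, which is one reason practical tools go beyond single words.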
A sophisticated text analytics tool can identify the sentiments associated with the named entities, concepts, and themes being
discussed in the text data. Examining our example once again, we note the following sentiments associated with named entities,
concepts, and themes:
(Reuters) - Research In Motion Ltd said on Tuesday its subscriber base has risen to 80 million from the 78 million it reported earlier this
year, surprising many on Wall Street and sending its shares up more than 3 percent.
Most analysts had expected RIM, for the first time in its history, to begin losing subscribers in the recently completed quarter as it has
rapidly lost market share in North America to Apple's snazzier iPhone and Samsung's Galaxy devices.
Document Summarization
(Reuters) - Research In Motion Ltd said on Tuesday its subscriber base has risen to 80 million
from the 78 million it reported earlier this year, surprising many on Wall Street and sending its
shares up more than 3 percent.
Most analysts had expected RIM, for the first time in its history, to begin losing subscribers in
the recently completed quarter as it has rapidly lost market share in North America to Apple's
snazzier iPhone and Samsung's Galaxy devices.
Research In Motion subscriber base has risen to 80 million sending its shares up more than 3
percent. Most analysts had expected RIM, for the first time in its history, to begin losing
subscribers.
As can be seen, the summary captures the gist of the article. While this may not be
impressive in the case of a two-paragraph article, the ability to rapidly summarize large
volumes of text data is a very useful output from sophisticated text mining applications.
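One simple way to approximate summarization is extractive: score each sentence by how frequent its words are in the whole text and keep the top-scoring sentences. The sketch below is a minimal version of that idea, with an assumed stop-word list and hypothetical function names. It selects whole sentences and is not the method that produced the summary above.

```python
import re
from collections import Counter

# Small stop-word list assumed for this sketch.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "it", "its",
             "is", "has", "had", "as", "for", "from", "that", "this", "than", "more"}

def content_words(text):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def summarize(text, n=1):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)[:n]
    return [sentences[i] for i in sorted(ranked)]

article = (
    "Research In Motion Ltd said on Tuesday its subscriber base has risen to 80 million "
    "from the 78 million it reported earlier this year, surprising many on Wall Street "
    "and sending its shares up more than 3 percent. "
    "Most analysts had expected RIM, for the first time in its history, to begin losing "
    "subscribers in the recently completed quarter as it has rapidly lost market share "
    "in North America to Apple's snazzier iPhone and Samsung's Galaxy devices."
)
print(summarize(article, n=1))
```

On a two-sentence article this can only pick one sentence or the other; the approach becomes more useful as the volume of text grows.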
From this we are able to gather that the sentence relates to a bank
account customer but not much else.
We were able to gather that the same sentence now contained the following expressions:
Cstmr
Customer
Yes
Bank
not happy
switch
bank account
As you will appreciate, the expression "not happy" conveys a very different meaning
than the word "happy"!
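A minimal way to keep "not happy" together is to join a negation word with the word that follows it during tokenization. The sketch below assumes a small set of negators and uses a made-up input sentence, since the original sentence is not shown here; the function name is likewise hypothetical.

```python
import re

# Negation words assumed for this sketch.
NEGATORS = {"not", "no", "never"}

def tokens_with_negation(text):
    """Tokenize and join each negator with the word that follows it."""
    words = re.findall(r"[a-z']+", text.lower())
    out = []
    i = 0
    while i < len(words):
        if words[i] in NEGATORS and i + 1 < len(words):
            out.append(words[i] + " " + words[i + 1])  # e.g. "not happy"
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out

# Hypothetical input sentence for illustration.
print(tokens_with_negation("Customer is not happy and may switch bank"))
```

With this step in place, "happy" no longer appears as a stand-alone token for the example sentence, so a later sentiment step can treat "not happy" as its own expression.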
Another breakthrough in text analytics came with the ability to extract
named entities. This helped identify what was being discussed, as can
be seen below:
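As a rough sketch of the idea, named entities can be found by looking up known names from a hand-built list (a gazetteer). The names and labels below are assumptions chosen to fit the Reuters example; production systems typically use trained models rather than a fixed list.

```python
import re

# Hand-built gazetteer assumed for this sketch; the labels are illustrative.
GAZETTEER = {
    "Research In Motion": "ORGANIZATION",
    "RIM": "ORGANIZATION",
    "Apple": "ORGANIZATION",
    "Samsung": "ORGANIZATION",
    "North America": "LOCATION",
}

def find_entities(text):
    """Return (name, label) pairs for each gazetteer match, in text order."""
    # Try longer names first so multi-word names win over shorter overlaps.
    names = sorted(GAZETTEER, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return [(m.group(1), GAZETTEER[m.group(1)]) for m in pattern.finditer(text)]

sentence = (
    "Most analysts had expected RIM, for the first time in its history, to begin losing "
    "subscribers in the recently completed quarter as it has rapidly lost market share "
    "in North America to Apple's snazzier iPhone and Samsung's Galaxy devices."
)
for name, label in find_entities(sentence):
    print(name, label)
```

A fixed list only finds names it already knows, which is its main limitation compared with statistical entity recognizers.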
Information extraction
Topic tracking
Summarization
Categorization
Clustering
Concept linking
Question answering
TEXT MINING APPLICATIONS
Marketing applications
Enables better CRM
Security applications
ECHELON, OASIS
Deception detection (…)
Academic applications
Research stream analysis
TEXT MINING APPLICATION
(RESEARCH TREND IDENTIFICATION IN LITERATURE)
TEXT MINING TOOLS