Text Analytics and Text Mining Overview
•Information extraction
•Topic tracking
•Summarization
•Categorization
•Clustering
•Concept linking
•Question answering
• Information extraction: Identifies key phrases and relationships
within text by looking for predefined objects and sequences by way of
pattern matching.
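As a rough illustration of information extraction by pattern matching, the sketch below uses Python's standard `re` module to pull a few predefined object types and one relationship out of free text. The pattern set, the `extract` function, and the "works at" relationship are hypothetical choices made for this example; production systems typically rely on much richer rules or trained models.

```python
import re

# Hypothetical patterns for a few predefined object types (illustration only).
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "money": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?"),
}

# Hypothetical relationship pattern: "<Person> works at <Organization>".
WORKS_AT = re.compile(
    r"\b([A-Z][a-z]+(?: [A-Z][a-z]+)*) works at "
    r"([A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)*)"
)


def extract(text):
    """Return the predefined objects and relationships found in text."""
    entities = {name: pattern.findall(text) for name, pattern in PATTERNS.items()}
    relations = [
        (match.group(1), "works_at", match.group(2))
        for match in WORKS_AT.finditer(text)
    ]
    return {"entities": entities, "relations": relations}


if __name__ == "__main__":
    sample = (
        "Jane Doe works at Acme Corp. Contact jane.doe@acme.com "
        "before 2024-03-15 about the $1,200.50 invoice."
    )
    print(extract(sample))
```

Keeping the patterns in a dictionary makes it easy to add or remove object types without touching the extraction logic.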
Marketing Applications
•Increase cross-selling and up-selling by analyzing the unstructured
data generated by call centers. Text generated by call center notes as
well as transcriptions of voice conversations with customers can be
analyzed by text mining algorithms to extract novel, actionable
information about customers’ perceptions toward a company’s
products and services.
•One of the largest and most prominent text mining applications in the
security domain is probably the highly classified ECHELON surveillance
system. ECHELON is rumoured to be capable of identifying the content of
telephone calls, faxes, e-mails, and other types of data, and of
intercepting information sent via satellites, public-switched telephone
networks, and microwave links.
•In 2007, EUROPOL developed an integrated system capable of
accessing, storing, and analyzing vast amounts of structured and
unstructured data sources in order to track transnational organized
crime. Called the Overall Analysis System for Intelligence Support
(OASIS), this system aims to integrate the most advanced data and text
mining technologies available in today’s market. The system has enabled
EUROPOL to make significant progress in supporting its law enforcement
objectives at the international level (EUROPOL, 2007).
• Another security-related application of text mining is in the area of
deception detection. Applying text mining to a large set of real-world
criminal (person-of-interest) statements, Fuller et al. (2008)
developed prediction models to differentiate deceptive statements
from truthful ones.
• Using a rich set of cues extracted from the textual statements, the
model predicted the holdout samples with 70 percent accuracy,
which is believed to be a significant success considering that the
cues are extracted only from textual statements (no verbal or
visual cues are present).
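The general workflow described above (extract cues from each statement, then score a model on held-out statements) can be sketched as follows. This is a minimal illustration, not the method of Fuller et al. (2008): the three cues, the `extract_cues` and `holdout_accuracy` functions, and the idea of passing in an arbitrary `predict` function are all assumptions made for the example.

```python
import re

# Hypothetical cue: first-person pronouns (not taken from the cited study).
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}


def extract_cues(statement):
    """Compute a few simple, text-only cues for one statement."""
    words = re.findall(r"[a-z']+", statement.lower())
    n = len(words)
    if n == 0:
        return {"word_count": 0, "avg_word_length": 0.0, "first_person_rate": 0.0}
    return {
        "word_count": n,
        "avg_word_length": sum(len(w) for w in words) / n,
        "first_person_rate": sum(w in FIRST_PERSON for w in words) / n,
    }


def holdout_accuracy(predict, holdout):
    """Fraction of held-out (statement, label) pairs that predict gets right.

    predict takes the cue dictionary of a statement and returns a label.
    """
    correct = sum(predict(extract_cues(text)) == label for text, label in holdout)
    return correct / len(holdout)
```

In practice, `predict` would be a classifier trained on cues from a separate set of labeled statements, and the holdout set would contain only statements the classifier never saw during training.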
Biomedical Applications