
QUESTION-ANSWERING SYSTEM USING NAMED ENTITY RECOGNITION (NER) TECHNIQUE
Presented by:
Abhishek Pathak
Joy Aneja
Pradeep Kumar
Priyam Gupta
What's Our Objective?
Objective
To develop a Question-Answering system using the Named Entity
Recognition (NER) technique, a subtask of Natural Language
Processing (NLP). The system takes a pre-processed paragraph as
input. The user then asks questions about this paragraph, and the
system is intended to answer them by analysing the provided text
using the NER technique.
What is Named Entity Recognition?
Named Entity Recognition
Named Entity Recognition (NER) (also known as entity
identification and entity extraction) is a subtask of
information extraction that seeks to locate and classify
atomic elements in text into predefined categories such as
the names of persons, organizations, locations,
expressions of times, quantities, monetary values,
percentages, etc.
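To make the definition concrete, here is a minimal sketch of locating and classifying entities in a sentence. It is not the NER component used in this project: the gazetteer entries, the regular expressions, and the function name `tag_entities` are all assumptions made for illustration, and a real system would rely on a trained model rather than hand-written lists.

```python
import re

# Hypothetical hand-made name lists; a trained NER model would replace these.
GAZETTEER = {
    "PERSON": ["Marie Curie", "Albert Einstein"],
    "ORGANIZATION": ["University of Paris"],
    "LOCATION": ["Warsaw", "Paris", "France"],
}

# Illustrative patterns for categories with a recognisable surface form.
PATTERNS = {
    "DATE": re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b"),   # four-digit years only
    "MONEY": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?"),
    "PERCENT": re.compile(r"\b\d+(?:\.\d+)?%"),
}


def tag_entities(text):
    """Return (entity_text, category) pairs in order of appearance."""
    candidates = []
    for label, names in GAZETTEER.items():
        for name in names:
            for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
                candidates.append((m.start(), m.end(), m.group(), label))
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            candidates.append((m.start(), m.end(), m.group(), label))

    # Prefer earlier and then longer matches; drop spans that overlap a kept one,
    # so "Paris" inside "University of Paris" is not reported twice.
    candidates.sort(key=lambda c: (c[0], -(c[1] - c[0])))
    kept, last_end = [], 0
    for start, end, surface, label in candidates:
        if start >= last_end:
            kept.append((surface, label))
            last_end = end
    return kept


if __name__ == "__main__":
    sentence = ("Marie Curie was born in Warsaw in 1867 and later "
                "worked at the University of Paris.")
    for surface, label in tag_entities(sentence):
        print(label, "->", surface)
```

The output pairs each located span with one of the predefined categories, which is exactly the "locate and classify" step described above.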
Literature Surveys
Diego Molla et al., 2006. Named Entity Recognition for
Question Answering. In Proceedings of the 2006
Australasian Language Technology Workshop
(ALTW2006).
This paper focuses on the use of named entity recognition for question
answering. For the purpose of this paper, question-answering (QA) is
the task of automatically finding the answer to a question phrased in
English by searching through a collection of text documents.
An important component of a QA system is the named entity recogniser,
and virtually every QA system incorporates one. The rationale for
including a NER as a module in a QA system is that many fact-based
answers to questions are entities that a NER can detect. Incorporating
a NER into the QA system therefore simplifies the task of finding some
of the answers considerably.
Menno Van Zaanen et al., 2005. Named Entity
Recognition. At Text Retrieval Conference (TREC 2005)
by National Institute of Standards and Technology
(NIST).
Named Entity Recognition (NER) plays an important role in several
Natural Language Processing tasks. Question-Answering (QA) is one
such task, since answers are frequently named entities that match
the semantic category expected by a given question. In this context,
named entity recognition is usually applied to free-text data. In this
paper, we approach the identification and classification of named
entities in natural language questions. We hypothesize that NER
results can benefit from the inclusion of previously labelled
questions in the training corpus.
Methodology
Software Requirements
Python
Python supports multiple programming paradigms, including object-oriented,
imperative and functional programming styles. It features a dynamic type
system and automatic memory management and has a large and
comprehensive standard library.

Natural Language Toolkit
NLTK is a leading platform for building Python programs to work with human
language data. It provides easy-to-use interfaces to over 50 corpora and
lexical resources such as WordNet, along with a suite of text processing
libraries for classification, tokenization, stemming, tagging, parsing, and
semantic reasoning.

WordNet Corpus
WordNet is a lexical database for the English language. It groups English
words into sets of synonyms called synsets, provides short, general
definitions, and records the various semantic relations between these
synonym sets.
Project Plan
The Question-Answering system can be divided into four
phases:
Phase 1:
Question Analysis: The first phase is the analysis of the question. During
this phase, question type classification is performed to determine what sort
of answer we are looking for, such as a location, a person, etc.
Phase 2:
Document Selection: The next phase is the selection of the documents
that are likely to contain the answer. This selection is based on the
information in the question. Only the selected documents are considered in
the following phases.
Phase 3:
Sentence Selection: Using the information extracted during the question
analysis phase, sentences that are likely to contain the answer are extracted
from the selected documents. Only these sentences are processed further.
Phase 4:
Answer Selection: The information taken from the question is matched
against the selected sentences and, based on this, the exact answer is
extracted and returned.
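The four phases can be sketched end to end as follows. This is an illustrative sketch under stated assumptions, not the project's implementation: the question-word mapping, the stopword list, keyword overlap as the selection criterion, the toy gazetteer standing in for a NER component, and every function name are hypothetical.

```python
import re

# Phase 1 assumption: the leading question word signals the expected answer type.
QUESTION_TYPES = {"who": "PERSON", "where": "LOCATION", "when": "DATE"}
STOPWORDS = {"who", "where", "when", "what", "was", "is", "did",
             "the", "a", "an", "in", "of", "to"}

# Toy stand-in for a NER component (hypothetical name lists and year pattern).
GAZETTEER = {
    "PERSON": ["Marie Curie", "Pierre Curie"],
    "LOCATION": ["Warsaw", "Paris"],
}
DATE_PATTERN = re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b")


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def analyse_question(question):
    """Phase 1: expected answer type plus content keywords."""
    words = tokenize(question)
    answer_type = QUESTION_TYPES.get(words[0]) if words else None
    keywords = {w for w in words if w not in STOPWORDS}
    return answer_type, keywords


def select_documents(documents, keywords):
    """Phase 2: keep documents sharing at least one keyword with the question."""
    return [d for d in documents if keywords & set(tokenize(d))]


def select_sentences(documents, keywords):
    """Phase 3: keep sentences with keyword overlap, best overlap first."""
    scored = []
    for doc in documents:
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            overlap = len(keywords & set(tokenize(sentence)))
            if overlap:
                scored.append((overlap, sentence))
    scored.sort(key=lambda pair: -pair[0])
    return [sentence for _, sentence in scored]


def find_entities(sentence):
    """Toy NER: gazetteer lookup plus a four-digit-year pattern."""
    entities = [(name, label)
                for label, names in GAZETTEER.items()
                for name in names if name in sentence]
    entities += [(m.group(), "DATE") for m in DATE_PATTERN.finditer(sentence)]
    return entities


def select_answer(sentences, answer_type, keywords):
    """Phase 4: first entity of the expected type not already in the question."""
    for sentence in sentences:
        for surface, label in find_entities(sentence):
            if label == answer_type and not (set(tokenize(surface)) & keywords):
                return surface
    return None


def answer(question, documents):
    answer_type, keywords = analyse_question(question)
    docs = select_documents(documents, keywords)
    sentences = select_sentences(docs, keywords)
    return select_answer(sentences, answer_type, keywords)


if __name__ == "__main__":
    docs = ["Marie Curie was born in Warsaw in 1867. She moved to Paris to study physics.",
            "The Eiffel Tower is a landmark in Paris."]
    print(answer("Where was Marie Curie born?", docs))
```

Each function corresponds to one phase, so any of them could be swapped for a stronger component (for example, a trained NER model in place of `find_entities`) without changing the others.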
Applications
Online Doubt Clarifier : From a set of
textbooks, users can get answers to their
doubts.

A Q-A system can act as a traveller's guide for
tourists.

In medical sciences, doctors can consult a Q-A
system when suggesting medicines to patients.
Future Scope
The system can be automated to accept any generalised
input paragraph.

The system can also be enhanced to accept generalised
questions and give subjective answers.

Incorporating summarization of the input text.

Introduction of the Question-Answering system into web
search engines.

References
1. Diego Molla et al., 2006. Named Entity Recognition for Question
Answering. In Proceedings of the 2006 Australasian Language
Technology Workshop (ALTW2006).

2. Menno Van Zaanen et al., 2005. Named Entity Recognition. At Text
Retrieval Conference (TREC 2005) by National Institute of
Standards and Technology (NIST).

3. The Stanford Natural Language Processing Group
http://nlp.stanford.edu/software/CRF-NER.shtml

4. Norwegian University of Science and Technology www.ntnu.edu
Thank You
