Capstone Project Report (AST)
A PROJECT REPORT
Submitted by
BACHELOR OF TECHNOLOGY
in
APRIL 2022
VIT BHOPAL UNIVERSITY, KOTHRIKALAN, SEHORE
MADHYA PRADESH – 466114
BONAFIDE CERTIFICATE
Certified that this project report titled “TEXT SUMMARIZATION” is the bonafide work of
Jayasurya M (18BCE10126), who carried out the project work under my supervision.
Certified further that, to the best of my knowledge, the work reported herein does
not form part of any other project or research work on the basis of which a degree or award
was conferred earlier.
First and foremost, I would like to thank the Lord Almighty for His presence and immense blessings.
I wish to express my heartfelt gratitude to Dr. Sandip Mal, Head of the Department, School of
Computing Science and Engineering, for his valuable support and encouragement throughout the project.
I would like to thank my internal guide, Dr. AVR Mayuri, for continually guiding and actively
supporting me during the project.
I would like to thank all the technical and teaching staff of the School of Computing Science and
Engineering.
Last, but not least, I am deeply indebted to my parents, who have been my greatest support throughout.
LIST OF FIGURES
1. System Architecture
It is a tough job and a waste of time and effort for human beings to
manually extract the summary of a large document of text. There is plenty of text
available, and it is hard to extract the required information from it. To solve these two
problems, automatic text summarization is performed, in single-document or multi-document
form. The most important benefits of using a summary are its reduced reading time and the
concise overview it provides. The aim is to find the most important text units and present
them as a summary of the original document.
TABLE OF CONTENTS
List of abbreviations
List of figures
Abstract
1 CHAPTER-1: PROJECT DESCRIPTION AND OUTLINE
1.1 Introduction
1.2 Motivation for the work
1.3 Problem Statement
1.4 Aim & Objective
2 CHAPTER-2: RELATED WORK INVESTIGATION
2.1 Introduction
2.2 Existing Approaches/Methods
2.2.1 Approaches/Methods -1
2.3 Pros and cons of the stated Approaches/Methods
2.4 Issues/observations from investigation
3 CHAPTER-3: REQUIREMENT ARTIFACTS
3.1 Introduction
3.2 Hardware and Software requirements
3.3 Specific Project requirements
3.3.1 Data requirement
3.3.2 Functions requirement
3.3.3 Performance and security requirement
3.3.4 Look and Feel Requirements
4 CHAPTER-4: DESIGN METHODOLOGY AND ITS NOVELTY
4.1 Methodology and goal
4.2 Functional modules design and analysis
4.3 System architecture design
4.4 User Interface designs
5 CHAPTER-5: TECHNICAL IMPLEMENTATION & ANALYSIS
5.1 Technical coding and code solutions
5.2 Test and validation
5.3 Performance Analysis
6 CHAPTER-6: PROJECT OUTCOME AND APPLICABILITY
6.1 Key implementations outlines of the System
6.2 Significant project outcomes
6.3 Project applicability on Real-world applications
7 CHAPTER-7: CONCLUSIONS AND RECOMMENDATION
CHAPTER 1
1.1 Introduction
Text summarization refers to the technique of shortening long pieces of text. Before discussing
text summarization, we must first know what a summary is. A summary is a short form of text,
formed from one or more texts, that conveys the important information in the original text. The
purpose of automatic text summarization is to present the source text as a shorter version that
preserves its semantics. A summary reduces reading time. The intention is to create a coherent
and fluent summary containing only the main points outlined in the document.
1.2 Motivation
An enterprise produces a huge amount of data every day, and most of it is either
unstructured or very long. It takes a lot of effort and time to process this data manually. Text
summarization refers to the technique of shortening long pieces of text using machine learning
and natural language processing. The intention is to create a coherent and fluent summary
containing only the main points outlined in the document. Text summarization is increasingly
being used in the commercial sector, for example in the telephone communication industry,
data mining of text databases, web-based information retrieval, and word-processing tools.
The various approaches differ in how they formulate the problem. Automatic text summarization
is an important step for information management tasks: it solves the problem of selecting the
most important portions of the text. High-quality summarization requires sophisticated NLP
techniques.
Text summarization is one of those applications of Natural Language Processing (NLP) that
is bound to have a huge impact on our lives. With growing digital media and ever-growing
publishing, there is rarely time to go through entire articles, documents, or books to decide
whether they are useful. The explosion of electronic documents has made it difficult for
users to extract useful information from them; because of the sheer amount of information,
users fail to read many relevant and interesting documents. This demands automatic text
summarization that can generate a concise and meaningful summary of text from multiple text
resources such as books, news articles, blog posts, research papers, emails, and tweets.
Today, our world is inundated by the gathering and dissemination of huge amounts of data.
The International Data Corporation (IDC) projects that the total amount of digital data
circulating annually around the world will grow from 4.4 zettabytes in 2013 to 180
zettabytes in 2025. With such a large amount of data circulating in the digital space, there is a
need to develop machine learning algorithms that can automatically shorten longer texts and
deliver accurate summaries that fluently convey the intended messages.
Our objective is to apply text summarization that reduces reading time, accelerates the process
of researching for information, and increases the amount of information that can fit in an area.
CHAPTER-2
2.1 Introduction
With the advancement of technology, the internet is accessible through various devices,
like smartphones and smart watches, and is within the reach of common people. That leads
to the accessibility of a lot of information through the World Wide Web (WWW).
With so much information on the internet, it sometimes becomes difficult to select only the
required information from large texts. Because of this volume of information, manual
summarization is a very challenging and time-consuming task. To overcome this challenge,
the idea of building a working automatic text summarization (ATS) system was born.
ATS uses NLP to generate small summaries of big text documents in a few minutes.
2.2.1 Approaches/Methods -1
Numerous approaches for identifying important content for automatic text summarization have been
developed to date. Topic representation approaches first derive an intermediate representation of the
text that captures the topics discussed in the input. Based on these representations of topics,
sentences in the input document are scored for importance. In contrast, in indicator representation
approaches, the text is represented by a diverse set of possible indicators of importance which do not
aim at discovering topicality. These indicators are combined, very often using machine learning
techniques, to score the importance of each sentence. Finally, a summary is produced by selecting
sentences in a greedy approach, choosing the sentences that will go in the summary one by one, or
globally optimizing the selection, choosing the best set of sentences to form a summary. One of the
most common approaches is extractive summarization systems for short, paragraph length
summaries and these summarizers identify the most important sentences in the input, which can be
either a single document or a cluster of related documents, and string them together to form a
summary. The decision about what content is important is driven primarily by the input to the
summarizer.
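As a toy illustration of a topic-representation scorer with greedy selection, the sketch below scores each sentence by the average corpus frequency of its words and keeps the top scorers in document order. The scoring scheme is the simplest possible stand-in, not any particular published system or the one used in this project:

```python
import re
from collections import Counter

def summarize(text, n_sentences=2):
    # split into sentences on terminal punctuation (very rough tokenization)
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    # topic representation: plain word frequencies over the whole input
    freq = Counter(re.findall(r'[a-z]+', text.lower()))

    def score(sentence):
        # a sentence scores the average frequency of its words
        toks = re.findall(r'[a-z]+', sentence.lower())
        return sum(freq[t] for t in toks) / max(len(toks), 1)

    # greedy selection: keep the top-scoring sentences, in document order
    chosen = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    return ' '.join(s for s in sentences if s in chosen)

print(summarize("Cats purr. Cats sleep a lot. Dogs bark loudly sometimes.", 1))
```

Because "Cats" is the most frequent content word, the highest-scoring sentence is the short one about cats; a real system would use far richer scoring, but the score-then-select loop is the same.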
It has been observed that, in the context of multi-document summarization of news articles,
extraction may be inappropriate because it may produce summaries which are overly verbose or
biased towards some sources, whereas abstractive summarization gives a short, correct
summary of the document.
The availability of data in the form of text increases day by day, and reading the
whole textual data in order to find the required information is both a difficult and
time-consuming task for a human being. Here ATS plays an important role by
providing a summary of a whole text document, extracting only the useful information and
sentences. There are different approaches to text summarization. The real-world applications of text
summarization include document summarization, news and article summarization, review systems,
recommendation systems, social media monitoring, and survey response systems. This chapter provides a
literature review of various research works in the field of automatic text summarization. This
research area can be explored further by looking into existing systems and working on different and new
techniques of NLP and machine learning.
Related work
Developing learning algorithms for distributed compositional semantics of words has been a
longstanding open problem at the intersection of language understanding and machine learning. In
recent years, several approaches have been developed for learning composition operators that map
word vectors to sentence vectors including recursive networks, recurrent networks, convolutional
networks and recursive-convolutional methods, among others. There are many methods to
summarize documents by finding topics of the document first and scoring the individual sentences
with respect to the topics. Sentence clustering has been successfully applied in document
summarization to discover the topics conveyed in a document collection. All of these methods
produce sentence representations that are passed to a supervised task and depend on a class label to
backpropagate through the composition weights. Consequently, these methods learn high-quality
sentence representations but are tuned only for their respective task. Our model is an alternative to
the above models in that it can learn unsupervised sentence representations by introducing a
distributed sentence indicator as part of a neural language model.
CHAPTER-3
REQUIREMENT ARTIFACTS
● Visual Studio
Visual Studio Code is a lightweight but powerful source code editor. It comes with built-in
support for JavaScript, TypeScript and Node.js and has a rich ecosystem of extensions for
other languages such as C++, C#, Java, Python, PHP and Go. In our project, this source code
editor is used as our development environment for React js, Django framework and REST
API framework.
● Anaconda
● Google Chrome
Google Chrome is a cross-platform web browser developed by Google. We have used this to
debug our front end and server responses. Google Chrome includes a built-in console in
which the process and the responses can be logged. This eased the testing process of the
project.
● The system shall never display the contents of the documents to anyone on the internet.
● Ability to upload documents any number of times to get various summaries.
● The project shall be based on the web and depends upon the optimization capabilities of the
webserver.
● The project shall start with an initial load time that depends upon the network strength of
the carrier the user is using to access the Internet.
● The performance of the project shall not be affected by the hardware specifications of the
user.
● Secure sockets are to be used in all transactions involving any confidential information of the
user.
● The application shall not leave any cookies on the user’s system (computer/laptop/phone)
containing the user’s credentials.
● The web application provides storage of all documents uploaded on redundant computers
with automatic switchover.
● The backup of the server is constantly maintained and updated to reflect the most recent
changes.
● A commercial deployment site is used to produce the application and the application server
takes care of the site.
CHAPTER-4
Our summarization model is an encoder-decoder model. That is, an encoder maps words to a
sentence vector and a decoder is used to generate the surrounding sentences. Encoder-decoder
models have gained a lot of traction for neural machine translation. In this setting, an encoder is used
to map e.g. an English sentence into a vector. The decoder then conditions on this vector to generate
a translation for the source English sentence.
The system interface is developed using the Django framework with the REST API framework.
Django framework manages the backend working of the website. It is responsible to process the
requests from the client, then process the PDF and send the input to the summarization model. REST
API is responsible for transporting the JSON format input and output between the backend and the
client system. REST API logs all the requests received from the client in a JSON string format. This
API view can be accessed by the admin for maintenance purposes since the view provides the
creation, update, reset and delete record option.
For text summarization, we first train a model that takes in a dataset and converts it
into sentence tuples. Given a tuple (s_{i-1}, s_i, s_{i+1}) of contiguous sentences, with s_i the
i-th sentence of the dataset, the sentence s_i is encoded into a vector representation, and the
model tries to reconstruct the previous sentence s_{i-1} and the next sentence s_{i+1}. We then
freeze this model and save it as an encoder.
In our model, we use a recurrent neural network (RNN) encoder with gated recurrent unit
(GRU) activations and an RNN decoder with a conditional GRU. This model combination is nearly
identical to the RNN encoder-decoder used in neural machine translation. GRU has been shown to
perform as well as LSTM on sequence modeling tasks while being conceptually simpler. GRU units
have only 2 gates and do not require the use of a cell state. While we use RNNs for our model, any
encoder and decoder can be used so long as we can backpropagate through them.
PREPROCESSING: There are three steps in preprocessing. First, stop words are removed from the
text. Stop words are frequently occurring words, such as 'a', 'an' and 'the', that provide little
meaning and contain noise; they are predefined and stored in an array. Second, tokenization
separates the input text into individual tokens; punctuation marks, spaces and word terminators
are the word-breaking characters. Third, word stemming converts every word into its root form by
removing its prefix and suffix for comparison with other words.
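A minimal, self-contained sketch of these three steps, using a toy stop-word set and a toy suffix-stripping stemmer standing in for the predefined stop-word array and full Porter-style stemming:

```python
import re

STOP_WORDS = {"a", "an", "the", "is", "of"}   # stand-in for the predefined array

def toy_stem(word):
    # toy stemmer standing in for Porter stemming: strip a few common suffixes
    for suffix in ("ing", "es", "s", "ed"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess_sentence(sentence):
    tokens = re.findall(r"[A-Za-z']+", sentence)                  # tokenization
    tokens = [t for t in tokens if t.lower() not in STOP_WORDS]   # stop-word removal
    return [toy_stem(t.lower()) for t in tokens]                  # stemming

print(preprocess_sentence("The summaries of the documents"))
# ['summari', 'document']
```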
Encoder: The encoder is typically a GRU-RNN which generates a fixed-length vector representation
h(i) for each sentence S(i) in the input. The encoded representation h(i) is obtained by passing the
final hidden state of the GRU cell (i.e. after it has seen the entire sentence) to multiple dense layers.
The encoder produces vectors in batches of sentences with the same length for optimization
purposes. A vector is a Numpy array with as many rows as the length of the sentence.
Decoder: The decoder is a neural language model conditioned on the encoder output. The
computation is like that of the encoder, except that we introduce matrices used to bias the update
gate, reset gate and hidden-state computation by the sentence vector. One decoder is used for the
next sentence, while a second decoder is used for the previous sentence. Separate parameters are used
for each decoder.
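The GRU computation described above (two gates, no cell state) can be sketched in a few lines of NumPy; the weight shapes and names here are illustrative toy parameters, not the model's actual ones:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, W, U, b, Wx, Ux, bx):
    """One GRU step: x is the input vector, h_prev the previous hidden state.
    W, U, b produce the two gates; Wx, Ux, bx produce the candidate state."""
    dim = h_prev.shape[0]
    preact = W @ x + U @ h_prev + b                       # stacked gate pre-activations
    r = sigmoid(preact[:dim])                             # reset gate
    u = sigmoid(preact[dim:])                             # update gate
    h_tilde = np.tanh(Wx @ x + r * (Ux @ h_prev) + bx)    # candidate hidden state
    return u * h_prev + (1.0 - u) * h_tilde               # interpolate old and new state

# toy dimensions: 4-dim input, 3-dim hidden state
rng = np.random.default_rng(0)
x, h = rng.normal(size=4), np.zeros(3)
W, U, b = rng.normal(size=(6, 4)), rng.normal(size=(6, 3)), np.zeros(6)
Wx, Ux, bx = rng.normal(size=(3, 4)), rng.normal(size=(3, 3)), np.zeros(3)
h = gru_step(x, h, W, U, b, Wx, Ux, bx)
print(h.shape)  # (3,)
```

The update gate u decides how much of the previous state survives, which is exactly the interpolation the decoder biases with the sentence vector.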
The user interface is the front end of the webpage. The webpage consists of a drop zone where
the client can drop or browse a PDF file to summarize. This PDF is sent to the backend framework
via the REST API in JSON format. The JSON file is received by the Django framework, and
the PyPDF2 module then splits the text out of the PDF file. The text is sent to the text
summarization model, whose result is the summary. The frontend converts this string into a JavaScript
object, from which the information is displayed on the client user interface, and the client can then
download the summary. During this complete process, the user interface shows the status of the
processing happening in the backend once the client chooses to process the PDF file. The user has to
wait for some time for the result to be displayed on the webpage.
CHAPTER-5
class Contract:
    # mapping of English contractions to their expanded forms (used in preprocessing)
    contractions = {
"ain't": "am not / are not / is not / has not / have not",
"aren't": "are not / am not",
"can't": "cannot",
"can't've": "cannot have",
"'cause": "because",
"could've": "could have",
"couldn't": "could not",
"couldn't've": "could not have",
"didn't": "did not",
"doesn't": "does not",
"don't": "do not",
"hadn't": "had not",
"hadn't've": "had not have",
"hasn't": "has not",
"haven't": "have not",
"he'd": "he had / he would",
"he'd've": "he would have",
"he'll": "he shall / he will",
"he'll've": "he shall have / he will have",
"he's": "he has / he is",
"how'd": "how did",
"how'd'y": "how do you",
"how'll": "how will",
"how's": "how has / how is / how does",
"I'd": "I had / I would",
"I'd've": "I would have",
"I'll": "I shall / I will",
"I'll've": "I shall have / I will have",
"I'm": "I am",
"I've": "I have",
"isn't": "is not",
"it'd": "it had / it would",
"it'd've": "it would have",
"it'll": "it shall / it will",
"it'll've": "it shall have / it will have",
"it's": "it has / it is",
"let's": "let us",
"ma'am": "madam",
"mayn't": "may not",
"might've": "might have",
"mightn't": "might not",
"mightn't've": "might not have",
"must've": "must have",
"mustn't": "must not",
"mustn't've": "must not have",
"needn't": "need not",
"needn't've": "need not have",
"o'clock": "of the clock",
"oughtn't": "ought not",
"oughtn't've": "ought not have",
"shan't": "shall not",
"sha'n't": "shall not",
"shan't've": "shall not have",
"she'd": "she had / she would",
"she'd've": "she would have",
"she'll": "she shall / she will",
"she'll've": "she shall have / she will have",
"she's": "she has / she is",
"should've": "should have",
"shouldn't": "should not",
"shouldn't've": "should not have",
"so've": "so have",
"so's": "so as / so is",
"that'd": "that would / that had",
"that'd've": "that would have",
"that's": "that has / that is",
"there'd": "there had / there would",
"there'd've": "there would have",
"there's": "there has / there is",
"they'd": "they had / they would",
"they'd've": "they would have",
"they'll": "they shall / they will",
"they'll've": "they shall have / they will have",
"they're": "they are",
"they've": "they have",
"to've": "to have",
"wasn't": "was not",
"we'd": "we had / we would",
"we'd've": "we would have",
"we'll": "we will",
"we'll've": "we will have",
"we're": "we are",
"we've": "we have",
"weren't": "were not",
"what'll": "what shall / what will",
"what'll've": "what shall have / what will have",
"what're": "what are",
"what's": "what has / what is",
"what've": "what have",
"when's": "when has / when is",
"when've": "when have",
"where'd": "where did",
"where's": "where has / where is",
"where've": "where have",
"who'll": "who shall / who will",
"who'll've": "who shall have / who will have",
"who's": "who has / who is",
"who've": "who have",
"why's": "why has / why is",
"why've": "why have",
"will've": "will have",
"won't": "will not",
"won't've": "will not have",
"would've": "would have",
"wouldn't": "would not",
"wouldn't've": "would not have",
"y'all": "you all",
"y'all'd": "you all would",
"y'all'd've": "you all would have",
"y'all're": "you all are",
"y'all've": "you all have",
"you'd": "you had / you would",
"you'd've": "you would have",
"you'll": "you shall / you will",
"you'll've": "you shall have / you will have",
"you're": "you are",
"you've": "you have"
}
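During preprocessing, such a mapping can be applied with a single regular-expression pass. The sketch below is a minimal illustration using a small excerpt of the table; entries with several listed expansions (e.g. "he'd") would need disambiguation, so only single-expansion entries are used here:

```python
import re

# small excerpt of the mapping above (single-expansion entries only)
contractions = {
    "can't": "cannot",
    "don't": "do not",
    "I'm": "I am",
    "won't": "will not",
}

def expand_contractions(text, mapping):
    # match any contraction key literally and substitute its expansion
    pattern = re.compile('|'.join(re.escape(k) for k in mapping))
    return pattern.sub(lambda m: mapping[m.group(0)], text)

print(expand_contractions("I'm sure they can't win", contractions))
# I am sure they cannot win
```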
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min
import skipthoughts

# Load the pretrained skip-thoughts model and encode the sentences
model = skipthoughts.load_model()
encoder = skipthoughts.Encoder(model)
encoded = encoder.encode(sentences)

# Number of clusters grows sublinearly with the number of sentences
n_clusters = int(np.ceil(len(encoded) ** 0.6))
print(n_clusters)
kmeans = KMeans(n_clusters=n_clusters)
kmeans = kmeans.fit(encoded)

# Average position of each cluster's sentences, to preserve document order
avg = []
for j in range(n_clusters):
    idx = np.where(kmeans.labels_ == j)[0]
    avg.append(np.mean(idx))

# Pick the sentence closest to each cluster centre as its representative
closest, _ = pairwise_distances_argmin_min(kmeans.cluster_centers_, encoded)
ordering = sorted(range(n_clusters), key=lambda k: avg[k])
summary = ' '.join([sentences[closest[idx]] for idx in ordering])
print(summary)
'''
Skip-thought vectors
'''
import os
#os.environ["THEANO_FLAGS"] = "mode=FAST_RUN,device=cuda,floatX=float32"
import pickle as pkl
from collections import OrderedDict, defaultdict

import numpy
from numpy.linalg import norm
import theano
import theano.tensor as tensor

profile = False

#-----------------------------------------------------------------------------#
# Specify model and table locations here
#-----------------------------------------------------------------------------#
path_to_models = 'C:\\model files\\'
path_to_tables = 'C:\\model files\\'
path_to_umodel = path_to_models + 'uni_skip.npz'
path_to_bmodel = path_to_models + 'bi_skip.npz'
#-----------------------------------------------------------------------------#

def load_model():
    """
    Load the model with saved tables
    """
    # Load model options
    print('Loading model parameters...')
    with open('%s.pkl' % path_to_umodel, 'rb') as f:
        uoptions = pkl.load(f)
    with open('%s.pkl' % path_to_bmodel, 'rb') as f:
        boptions = pkl.load(f)

    # Load parameters
    uparams = init_params(uoptions)
    uparams = load_params(path_to_umodel, uparams)
    utparams = init_tparams(uparams)
    bparams = init_params_bi(boptions)
    bparams = load_params(path_to_bmodel, bparams)
    btparams = init_tparams(bparams)

    # Extractor functions
    print('Compiling encoders...')
    embedding, x_mask, ctxw2v = build_encoder(utparams, uoptions)
    f_w2v = theano.function([embedding, x_mask], ctxw2v, name='f_w2v')
    embedding, x_mask, ctxw2v = build_encoder_bi(btparams, boptions)
    f_w2v2 = theano.function([embedding, x_mask], ctxw2v, name='f_w2v2')

    # Tables
    print('Loading tables...')
    utable, btable = load_tables()

    # Pack options, tables and encoder functions into one model dictionary
    model = {}
    model['uoptions'] = uoptions
    model['boptions'] = boptions
    model['utable'] = utable
    model['btable'] = btable
    model['f_w2v'] = f_w2v
    model['f_w2v2'] = f_w2v2
    return model

def load_tables():
    """
    Load the tables
    """
    words = []
    utable = numpy.load(path_to_tables + 'utable.npy',
                        allow_pickle=True, encoding='bytes')
    btable = numpy.load(path_to_tables + 'btable.npy',
                        allow_pickle=True, encoding='bytes')
    f = open(path_to_tables + 'dictionary.txt', 'rb')
    for line in f:
        words.append(line.decode('utf-8').strip())
    f.close()
    utable = OrderedDict(zip(words, utable))
    btable = OrderedDict(zip(words, btable))
    return utable, btable
class Encoder(object):
    """
    Sentence encoder.
    """
    def __init__(self, model):
        self._model = model

    def encode(self, X, use_norm=True, use_eos=False):
        return encode(self._model, X, use_norm=use_norm, use_eos=use_eos)

def encode(model, X, use_norm=True, use_eos=False):
    """
    Encode the sentences in the list X into skip-thought vectors.
    """
    # vocabulary lookup: 1 if a word is in the tables, 0 otherwise
    d = defaultdict(lambda: 0)
    for w in model['utable'].keys():
        d[w] = 1

    # output features: uni-skip part and (forward + backward) bi-skip part
    ufeatures = numpy.zeros((len(X), model['uoptions']['dim']), dtype='float32')
    bfeatures = numpy.zeros((len(X), 2 * model['boptions']['dim']), dtype='float32')

    # length dictionary: group sentences of equal length k into one batch
    ds = defaultdict(list)
    captions = [s.split() for s in X]
    for i, s in enumerate(captions):
        ds[len(s)].append(i)

    for k in ds.keys():
        caps = ds[k]
        if use_eos:
            uembedding = numpy.zeros((k + 1, len(caps), model['uoptions']['dim_word']), dtype='float32')
            bembedding = numpy.zeros((k + 1, len(caps), model['boptions']['dim_word']), dtype='float32')
        else:
            uembedding = numpy.zeros((k, len(caps), model['uoptions']['dim_word']), dtype='float32')
            bembedding = numpy.zeros((k, len(caps), model['boptions']['dim_word']), dtype='float32')
        for ind, c in enumerate(caps):
            caption = captions[c]
            for j in range(len(caption)):
                if d[caption[j]] > 0:
                    uembedding[j, ind] = model['utable'][caption[j]]
                    bembedding[j, ind] = model['btable'][caption[j]]
                else:
                    uembedding[j, ind] = model['utable']['UNK']
                    bembedding[j, ind] = model['btable']['UNK']
            if use_eos:
                uembedding[-1, ind] = model['utable']['<eos>']
                bembedding[-1, ind] = model['btable']['<eos>']
        if use_eos:
            uff = model['f_w2v'](uembedding, numpy.ones((k + 1, len(caps)), dtype='float32'))
            bff = model['f_w2v2'](bembedding, numpy.ones((k + 1, len(caps)), dtype='float32'))
        else:
            uff = model['f_w2v'](uembedding, numpy.ones((k, len(caps)), dtype='float32'))
            bff = model['f_w2v2'](bembedding, numpy.ones((k, len(caps)), dtype='float32'))
        if use_norm:
            for j in range(len(uff)):
                uff[j] /= norm(uff[j])
                bff[j] /= norm(bff[j])
        for ind, c in enumerate(caps):
            ufeatures[c] = uff[ind]
            bfeatures[c] = bff[ind]

    # concatenate the uni-skip and bi-skip features
    features = numpy.c_[ufeatures, bfeatures]
    return features
import nltk
from nltk.tokenize import word_tokenize

def preprocess(text):
    """
    Preprocess text for encoder
    """
    X = []
    sent_detector = nltk.data.load('tokenizers/punkt/english.pickle')
    for t in text:
        sents = sent_detector.tokenize(t)
        result = ''
        for s in sents:
            tokens = word_tokenize(s)
            result += ' ' + ' '.join(tokens)
        X.append(result)
    return X
def word_features(table):
    """
    Extract word features into a normalized matrix
    """
    features = numpy.zeros((len(table), 620), dtype='float32')
    keys = list(table.keys())  # list() so the keys are indexable in Python 3
    for i in range(len(table)):
        f = table[keys[i]]
        features[i] = f / norm(f)
    return features
def init_tparams(params):
    """
    Initialize Theano shared variables according to the initial parameters
    """
    tparams = OrderedDict()
    for kk, pp in params.items():
        tparams[kk] = theano.shared(params[kk], name=kk)
    return tparams

def get_layer(name):
    fns = layers[name]
    return (eval(fns[0]), eval(fns[1]))
def init_params(options):
    """
    Initialize all parameters needed for the encoder
    """
    params = OrderedDict()
    # embedding
    params['Wemb'] = norm_weight(options['n_words_src'], options['dim_word'])
    # encoder: GRU
    params = get_layer(options['encoder'])[0](options, params,
                                              prefix='encoder',
                                              nin=options['dim_word'],
                                              dim=options['dim'])
    return params

def init_params_bi(options):
    """
    Initialize all parameters needed for the bidirectional encoder
    """
    params = OrderedDict()
    # embedding
    params['Wemb'] = norm_weight(options['n_words_src'], options['dim_word'])
    # encoder: GRU (forward and reverse directions)
    params = get_layer(options['encoder'])[0](options, params,
                                              prefix='encoder', nin=options['dim_word'], dim=options['dim'])
    params = get_layer(options['encoder'])[0](options, params,
                                              prefix='encoder_r', nin=options['dim_word'], dim=options['dim'])
    return params
# encoder (uni-skip): run a GRU over the embedding and keep the final state
proj = get_layer(options['encoder'])[1](tparams, embedding, options,
                                        prefix='encoder',
                                        mask=x_mask)
ctx = proj[0][-1]

# bidirectional encoder: one GRU over the sentence and one over its reverse
proj = get_layer(options['encoder'])[1](tparams, embedding, options,
                                        prefix='encoder',
                                        mask=x_mask)
projr = get_layer(options['encoder'])[1](tparams, embeddingr, options,
                                        prefix='encoder_r',
                                        mask=xr_mask)

# some utilities
def ortho_weight(ndim):
    W = numpy.random.randn(ndim, ndim)
    u, s, v = numpy.linalg.svd(W)
    return u.astype('float32')

# excerpt from the GRU layer: gate and hidden-state computation
dim = tparams[_p(prefix, 'Ux')].shape[1]
if mask is None:
    mask = tensor.alloc(1., state_below.shape[0], 1)
r = tensor.nnet.sigmoid(_slice(preact, 0, dim))   # reset gate
u = tensor.nnet.sigmoid(_slice(preact, 1, dim))   # update gate
h = tensor.tanh(preactx)                          # candidate hidden state
h = u * h_ + (1. - u) * h                         # interpolate with the previous state
h = m_[:, None] * h + (1. - m_)[:, None] * h_     # apply the mask
return h
The individual segments of the application were tested, and the text summarizer was tested
against its models. The final application was then integrated and checked with sample documents
containing PDF and text.
5.2.2 Features tested
Among all the text summarization algorithms available, skip-thought vectors have the highest
accuracy and resemblance to the human-like summary. Word to vector improves the outcome
significantly compared to other methods because their main benefit arguably is that they don't
require expensive annotation, but can be derived from large unannotated corpora that are readily
available. Pre-trained embeddings can then be used in downstream tasks that use small amounts of
labelled data.
The majority of the testing was done using Google Chrome and Visual Studio Code. Google Chrome
was used to test the front end of the project, whereas Visual Studio Code was used to test the back
end of the project.
Visual Studio Code is a freeware source-code editor made by Microsoft for Windows, Linux and
macOS. Features include support for debugging, syntax highlighting, intelligent code completion,
snippets, code refactoring, and embedded Git. The reason to use this source-code editor was that this
supports all the modules and programming languages involved in our project. This eased the
development and testing phase of the project.
The test cases involve summarization of various PDF files containing different numbers of pages.
The benefit of testing PDF files with different numbers of pages is that it can determine the time
difference between the summarization of these files. It also helps in determining the quality of the
summary produced, since more sentences yield more K-means clusters to form a summary.
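The dependence of summary length on document length follows the n_clusters = ceil(n ** 0.6) rule used in the implementation, so the number of representative sentences grows sublinearly with the number of input sentences:

```python
import math

# cluster count used by the summarizer: ceil(n_sentences ** 0.6),
# so the summary grows sublinearly with the document length
def cluster_count(n_sentences):
    return math.ceil(n_sentences ** 0.6)

for n in (10, 50, 200, 1000):
    print(n, cluster_count(n))
```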
During the working of the models, all the outputs are logged in the internal terminal of the Visual
Studio Code to ensure that the model works as intended. Once the summarization and the image
caption is done, the response is logged in the Visual Studio Code terminal to ensure that the output is
sent to the client in a proper format. Once the response is received by the client, it is then logged in
the Google Chrome console to ensure that the data received was in correct format.
CHAPTER-6:
PROJECT OUTCOME AND APPLICABILITY
We were successfully able to implement ATS using the encoder-decoder algorithm. The key
implementation of our project comprises the master, contraction and skip-thoughts modules. These
three form the backend (server) of our project, which takes the text from the documents, tokenizes
the words, reduces them and generates a fresh summary for us.
Another key implementation was the front-end part of the project, which the user
interacts with. We made our website simple, direct and extremely user-friendly.
By the end of this project we had implemented an encoder-decoder algorithm and created an ATS model
for users.
Various organisations today, be it online shopping, private sector organisations, government, the
tourism and catering industry, or any other institute that offers customer services, are all concerned
with learning their customers' feedback each time their services are utilised.
Considering that these companies receive an enormous amount of feedback and data
every single day, it becomes quite a tedious task for the management to analyse each of these data
points and come up with insights. Therefore, using machine learning, models have become capable
of understanding human language with the help of NLP (Natural Language Processing).
We can also summarize case studies, research papers, theses and essays exceeding 700 words using
NLP. This will help us save a lot of time and energy and result in increased work efficiency.
CHAPTER-7:
CONCLUSIONS AND RECOMMENDATION
7.1 Constraints of the System
The text summarization model and image caption model requires high computational power to
produce a result in as little time as possible and handle multiple requests from the client at the same
time. The server requires fast local storage to process all the model features and load the model
constraints when the server starts. Involving the GPU processing units improves the performance
significantly. Therefore, if the project is deployed in a full-fledged server with GPU arrays then the
results can be produced within a few seconds even with large PDF files and multiple requests from
the client.
There is a vanishing gradient constraint with the recurrent neural network (RNN) model of text
summarization. In an RNN, information travels through the neural network from input neurons to the
output neurons, while the error is calculated and propagated back through the network to update the
weights. The cost function compares your outcomes to your desired output, so you have these values
throughout the time series, for every single output. Essentially, every single neuron that
participated in the calculation of the output associated with this cost function should have its
weight updated to minimize that error. The thing with RNNs is that it is not just the neurons
directly below the output layer that contributed, but all of the neurons far back in time; so you
have to propagate back through time to these neurons. The problem relates to updating the recurrent
weight (w_rec), the weight that is used to connect the hidden layers to themselves in the unrolled
temporal loop.
For instance, to get from x_{t-3} to x_{t-2} we multiply x_{t-3} by w_rec. Then, to get from
x_{t-2} to x_{t-1} we again multiply x_{t-2} by w_rec. So we multiply by the same weight multiple
times, and this raises a problem: when we multiply something by a small number, the value decreases
very quickly. The lower the gradient is, the harder it is for the network to update the weights and
the longer it takes to get to the result. To solve this problem we have used GRU units, so that the
RNN only backpropagates to a specific node. This ensures that the encoder does not run in an
infinite loop and thus produces the result.
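The effect of repeatedly multiplying by the same recurrent weight can be shown with a few lines of arithmetic:

```python
# gradients through an unrolled RNN: repeated multiplication by the same
# recurrent weight w_rec shrinks the signal when w_rec < 1 (vanishing)
# and blows it up when w_rec > 1 (exploding)
def gradient_after_steps(w_rec, steps, g0=1.0):
    g = g0
    for _ in range(steps):
        g *= w_rec
    return g

print(gradient_after_steps(0.5, 10))   # 0.0009765625 -> vanishing
print(gradient_after_steps(1.5, 10))   # ~57.67       -> exploding
```

After only ten unrolled steps the gradient has shrunk by three orders of magnitude, which is why gated units such as the GRU are needed for long sequences.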
7.2 Future Enhancements
The model can be enhanced in a few areas, such as the quality of the hardware on which the
summaries are computed. Our project's speed depends entirely on the performance of the model, the
CPU and the storage unit. The model can also be deployed on a GPU with better storage units to
improve the speed of the outcome; in theory, performance is up to 10x faster on a GPU with fast
solid-state drives. On the other hand, the model could also be built with LSTM units instead of the
current GRU units, but in theory the performance improvement of LSTM is not significant compared to
GRU units.
REFERENCES
[1] O. Vinyals, A. Toshev, S. Bengio and D. Erhan, "Show and Tell: A Neural Image Caption
Generator," CVPR 2015.
[2] H. Fang et al., "From Captions to Visual Concepts and Back," CVPR 2015.
[3] X. Jia, E. Gavves, B. Fernando and T. Tuytelaars, "Guiding the Long-Short Term
Memory Model for Image Caption Generation," ICCV 2015.
[4] J. R. Kiros, Y. Zhu, R. Salakhutdinov, R. S. Zemel, A. Torralba, R. Urtasun and
S. Fidler, "Skip-Thought Vectors," NIPS 2015.
[12] M. Phi, "Illustrated Guide to LSTM and GRU: A Step by Step Explanation,"
towardsdatascience.com, 2018.
[13] S. Yang, X. Yu and Y. Zhou, "LSTM and GRU Neural Network Performance Comparison
Study: Taking Yelp Review Dataset as an Example," 2020.
[17] V. Gagliardi, "Tutorial: Django REST with React (and a Sprinkle of Testing)," 2021.
[18] SaaS Pegasus, "Build a Single-Page React Application in a Hybrid Django Project," 2021.
[19] S. Bai and S. An, "A Survey on Automatic Image Caption Generation," 2018.
[20] R. Alguliyev, R. Aliguliyev and N. Isazade, "A Model for Text Summarization," 2017.