Show and Tell: A Neural Image Caption Generator (CVPR 2015)
Presenters: Tianlu Wang, Yin Zhang
October 5
Mathematically, the goal is to build a single joint model that takes an image I as input and is trained to maximize the
likelihood p(Sentence|Image) of producing a target sequence of words.
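In the paper, this likelihood is factored with the chain rule, so that log p(Sentence|Image) is the sum over words of log p(word_t | Image, word_0, ..., word_{t-1}). The sketch below only illustrates that sum. The per-word probabilities are made-up numbers, and the function name is hypothetical, not taken from the paper's code.

```python
import math


def sentence_log_likelihood(step_probs):
    """Sum log p(word_t | image, previous words) over one caption.

    step_probs holds the probability the model assigned to the correct
    word at each time step (illustrative values, not real model output).
    """
    return sum(math.log(p) for p in step_probs)


# Hypothetical probabilities for a three-word caption.
step_probs = [0.5, 0.25, 0.8]
log_p = sentence_log_likelihood(step_probs)

# Exponentiating the sum recovers the product of the per-word probabilities.
sentence_prob = math.exp(log_p)
```

Training maximizes this quantity over all (image, caption) pairs, which is the same as minimizing the summed negative log-likelihood of the correct word at each step.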
Inspiration from Machine Translation task
Cell state: information flows along it!
Gate: optionally lets information through
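LSTM Cont. (forget gate)
Inputs: the current input x and the previous output h
f: a vector in which every element is between 0 and 1 (0 = forget completely, 1 = keep completely)
The forget gate decides what information to throw away from the cell state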
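A minimal sketch of the forget gate, following the formulation in colah's blog: f = sigmoid(W_f · [h, x] + b_f). Because f comes out of a sigmoid, each element lies in the range 0 to 1 and scales the matching element of the cell state. The weights, sizes, and function names here are illustrative assumptions, not values from the paper.

```python
import math


def sigmoid(z):
    # Squashes any moderate real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def forget_gate(W_f, b_f, h_prev, x):
    """Compute f = sigmoid(W_f . [h_prev, x] + b_f), one value per cell-state element."""
    concat = h_prev + x  # list concatenation: [h_{t-1}, x_t]
    return [
        sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
        for row, b in zip(W_f, b_f)
    ]


def apply_forget(f, c_prev):
    # Element-wise scaling: values of f near 0 erase, values near 1 keep.
    return [fi * ci for fi, ci in zip(f, c_prev)]


# Hypothetical sizes: 2 hidden units, 1 input feature.
h_prev, x = [1.0, -1.0], [0.5]
W_f = [[0.0, 0.0, 0.0],   # zero weights and bias give exactly 0.5
       [2.0, 0.0, 4.0]]   # positive pre-activation gives a value near 1
b_f = [0.0, 1.0]
f = forget_gate(W_f, b_f, h_prev, x)
c_scaled = apply_forget(f, [4.0, 4.0])
```

The first cell-state element is halved, while the second is kept almost unchanged, which is the "throw away" versus "keep" behaviour described on the slide.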
LSTM Cont. (input gate)
The input gate decides which values will be updated
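The update step can be sketched as follows, again using the formulation from colah's blog: an input gate i = sigmoid(W_i · [h, x] + b_i) selects which values to update, candidate values c~ = tanh(W_c · [h, x] + b_c) propose the new content, and the new cell state is f * c_prev + i * c~. All names, shapes, and numbers below are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    # Matrix-vector product plus bias, written out for plain Python lists.
    return [sum(w * vi for w, vi in zip(row, v)) + bi for row, bi in zip(W, b)]


def update_cell_state(W_i, b_i, W_c, b_c, h_prev, x, f, c_prev):
    """Return (i, c_tilde, c_new) for one LSTM step, given the forget gate f."""
    concat = h_prev + x  # [h_{t-1}, x_t]
    i = [sigmoid(z) for z in affine(W_i, b_i, concat)]          # which values to update
    c_tilde = [math.tanh(z) for z in affine(W_c, b_c, concat)]  # candidate values
    c_new = [fi * ci + ii * ct
             for fi, ci, ii, ct in zip(f, c_prev, i, c_tilde)]
    return i, c_tilde, c_new


# Hypothetical example: 1 hidden unit, 1 input feature, zero weights.
i, c_tilde, c_new = update_cell_state(
    W_i=[[0.0, 0.0]], b_i=[0.0],
    W_c=[[0.0, 0.0]], b_c=[0.0],
    h_prev=[1.0], x=[1.0],
    f=[1.0], c_prev=[2.0],
)
```

With zero weights the candidate is 0, so the old cell state passes through unchanged when f is 1. This is the sense in which information "flows along" the cell state.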
BLEU: https://en.wikipedia.org/wiki/BLEU
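As a rough illustration of what BLEU measures, the sketch below computes the modified (clipped) n-gram precision that BLEU is built on. It deliberately omits the brevity penalty and the geometric mean over n-gram orders, so it is not a full BLEU implementation. The function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate caption against reference captions."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    # A candidate n-gram is credited at most as often as it appears in a reference.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


candidate = "the the the the the the the".split()
references = ["the cat is on the mat".split(),
              "there is a cat on the mat".split()]
p1 = modified_precision(candidate, references, 1)
```

Clipping is what stops a degenerate caption that repeats a common word from scoring well: here "the" is credited only twice, the most it appears in any one reference.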
Reference:
• Show and Tell: A Neural Image Caption Generator, Oriol Vinyals,
Alexander Toshev, Samy Bengio, Dumitru Erhan
https://arxiv.org/pdf/1411.4555v2.pdf
http://techtalks.tv/talks/show-and-tell-a-neural-image-caption-generator/61592/
• Understanding LSTM Networks, colah’s blog
http://colah.github.io/posts/2015-08-Understanding-LSTMs/