Synopsis
TITLE
SUBMITTED
BY
STUDENT NAME
REG NO
E-MAIL ADDRESS/PHONE
GUIDE'S NAME
Designation
Department
Signature of Guide:
1. Introduction:
1.1 Topic
Language is humanity's most important means of communication, and speech is its primary
medium. A speech signal is a complex combination of airborne pressure waveforms. This
complex pattern must be detected by the human auditory system and decoded by the brain,
which combines audio and visual cues to perceive speech more effectively. This project aims
to emulate that mechanism in human-machine communication systems by exploiting both the
acoustic and the visual properties of human speech.
1.2 Organization
3. Objective:
Recognise 10 English words (speaker independent) with at least 90% accuracy in a noisy
environment.
4. Methodology:
The project is carried out in the following parts:
5. Project Schedule:
January 2014
o Processing of audio signals
o Feature extraction from the chosen training database
o Pattern recognition and signature extraction from the features
o Training the HMM with the training set
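The audio feature-extraction steps planned for January can be illustrated with a short NumPy sketch that computes MFCC-style features (framing, windowing, a mel filterbank, and a DCT). The function name `mfcc` and all parameter values (16 kHz sampling, 25 ms frames, 26 mel bands, 13 coefficients) are illustrative assumptions, not part of this synopsis.

```python
import numpy as np

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
         n_mels=26, n_ceps=13):
    """MFCC-style features for a mono speech signal (hypothetical helper)."""
    # Split the signal into overlapping, Hamming-windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)

    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # Triangular mel filterbank spanning 0 Hz to sr/2.
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, c, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, c):
            fbank[m - 1, k] = (k - lo) / max(c - lo, 1)
        for k in range(c, hi):
            fbank[m - 1, k] = (hi - k) / max(hi - c, 1)

    # Log filterbank energies, then DCT-II to decorrelate them
    # into cepstral coefficients.
    feat = np.log(power @ fbank.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    return feat @ dct.T  # shape: (n_frames, n_ceps)
```

One MFCC vector per 10 ms hop is a typical observation sequence to feed into HMM training in the next step.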
February 2014
o Processing of video signals
o Feature extraction from the chosen training database
o Pattern recognition and signature extraction from the features
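For the video stream, one widely used visual speech feature is the low-order 2-D DCT of a grayscale mouth region. The sketch below assumes the lip region has already been located and cropped (face/mouth detection is outside its scope); the function name `dct_features` and the block size `k` are illustrative assumptions.

```python
import numpy as np

def dct_features(roi, k=6):
    """Low-order 2-D DCT coefficients of a grayscale mouth region
    (hypothetical helper). The top-left k x k block of DCT-II
    coefficients summarises the coarse shape and intensity of the
    lip region and serves as a per-frame visual feature vector."""
    h, w = roi.shape

    def dct_matrix(n):
        # Orthonormal DCT-II basis matrix of size n x n.
        m = np.cos(np.pi * np.outer(np.arange(n), 2 * np.arange(n) + 1) / (2 * n))
        m[0] *= 1.0 / np.sqrt(2.0)
        return m * np.sqrt(2.0 / n)

    # Separable 2-D DCT: transform rows, then columns.
    coeffs = dct_matrix(h) @ roi @ dct_matrix(w).T
    return coeffs[:k, :k].ravel()  # feature vector of length k*k
```

Computing this for every video frame yields a visual feature sequence analogous to the audio MFCC sequence, ready for the same pattern-recognition machinery.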
March 2014
o Synchronize audio and video features for pattern recognition
o Extension of training data set to 10 words
April 2014
o Upgrading the system for speaker-independent operation
o Performance analysis comparing the results of the audio-only approach with those of
the joint audio-visual approach
Documentation
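The recognition step that ties the schedule together can be sketched as isolated-word scoring: each word has its own trained HMM, and the word whose model assigns the incoming observation sequence the highest likelihood wins. The sketch below assumes discrete (vector-quantised) observations for simplicity; training itself (Baum-Welch, e.g. via a library such as hmmlearn or HTK) is not shown, and the names `log_forward` and `recognise` are assumptions.

```python
import numpy as np

def log_forward(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM.

    obs : sequence of symbol indices (e.g. vector-quantised feature frames)
    pi  : (n_states,) initial state probabilities
    A   : (n_states, n_states) transition probabilities
    B   : (n_states, n_symbols) emission probabilities
    Uses the scaled forward algorithm to avoid numerical underflow.
    """
    alpha = pi * B[:, obs[0]]
    log_lik = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        s = alpha.sum()
        log_lik += np.log(s)
        alpha = alpha / s
    return log_lik

def recognise(obs, word_models):
    """Return the word whose HMM scores the sequence highest
    (hypothetical API; word_models maps word -> (pi, A, B))."""
    scores = {w: log_forward(obs, *m) for w, m in word_models.items()}
    return max(scores, key=scores.get)
```

For the joint audio-visual stage, one common late-fusion strategy is to score the audio and visual observation sequences against separate per-word HMMs and sum (or weight) the two log-likelihoods before taking the argmax, which is one way the synchronisation step above could be realised.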
References:
1. Tsuhan Chen, "Audiovisual Speech Processing, Lip Reading and Lip Synchronization",
IEEE Signal Processing Magazine, January 2001.
2. R. Chellappa, C. L. Wilson and S. Sirohey, "Human and Machine Recognition of Faces:
A Survey", Proceedings of the IEEE, vol. 83, no. 5, May 1995.