
LIPWAVE PRESENTATION
PRESENTED & CREATED BY: ARTUR MISTIUK
PRESENTATION OUTLINE
• Project Description
• Tortoise Model Introduction
• Wav2Lip Model Introduction
• Additional Tools
• Time Consumption
• Challenges
• Questions and Answers
PROJECT DESCRIPTION
The project integrates two content generation models, Wav2Lip and
Tortoise TTS, to create high-quality videos with synchronized audio.

MAIN GOAL: CREATE FUNNY VIDEOS
TORTOISE MODEL INTRODUCTION
Tortoise is a deep neural network that converts written
text into natural-sounding speech.

[Diagram: Text → Tokenization → Model → Audio, with audio files of the desired voice as an additional input]
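The pipeline in the diagram can be sketched with the open-source tortoise-tts package. This is a minimal inference sketch, not the author's actual setup: it assumes the package is installed, that a folder of reference clips exists under the package's voices directory, and the voice name, output path, and preset shown are placeholders chosen for illustration.

```python
from pathlib import Path


def synthesize(text: str, voice: str, out_path: str = "generated.wav",
               preset: str = "fast") -> Path:
    """Speak `text` in the voice of the reference clips and save a WAV file."""
    # Imported lazily so this file loads even without the heavy dependencies.
    import torchaudio
    from tortoise.api import TextToSpeech
    from tortoise.utils.audio import load_voice

    tts = TextToSpeech()  # loads (and on first use downloads) the model weights

    # `voice` names a folder of short reference clips ("audio files with the
    # desired voice"); the folder name itself is an assumption of this sketch.
    voice_samples, conditioning_latents = load_voice(voice)

    # Tokenization and the model stages from the diagram happen inside this call.
    audio = tts.tts_with_preset(
        text,
        voice_samples=voice_samples,
        conditioning_latents=conditioning_latents,
        preset=preset,
    )

    # Tortoise produces 24 kHz audio.
    torchaudio.save(out_path, audio.squeeze(0).cpu(), 24000)
    return Path(out_path)
```

A call such as synthesize("Hello from the presentation", voice="my_voice") would then write generated.wav, where "my_voice" is a hypothetical folder name.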


Sound Alert!!!!
WAV2LIP MODEL INTRODUCTION
Wav2Lip is a lip-sync model that generates realistic lip movements based on
audio input, synchronizing them with the audio.

Trained on a large dataset of audio and corresponding video recordings of
lip movements.
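As an illustration of how the lip-sync step is typically run, the sketch below builds the command line for the inference script in the public Wav2Lip repository. It assumes the repository is cloned locally and a pretrained checkpoint has been downloaded; the file names and the checkpoint path are placeholders, not values taken from this project.

```python
import subprocess
from typing import List


def wav2lip_command(face_video: str, audio: str,
                    checkpoint: str = "checkpoints/wav2lip_gan.pth",
                    outfile: str = "results/result_voice.mp4") -> List[str]:
    """Build the argument list for Wav2Lip's inference.py script."""
    return [
        "python", "inference.py",
        "--checkpoint_path", checkpoint,  # pretrained weights (placeholder path)
        "--face", face_video,             # video containing the face to animate
        "--audio", audio,                 # speech the lips should follow
        "--outfile", outfile,             # where the synced video is written
    ]


def run_wav2lip(repo_dir: str, face_video: str, audio: str) -> None:
    """Run inference from inside the cloned repository directory."""
    subprocess.run(wav2lip_command(face_video, audio), cwd=repo_dir, check=True)
```

Keeping the command builder separate from the call that runs it makes the arguments easy to inspect before starting a run that may take hours.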
ADDITIONAL TOOLS
• Downloading video
• Splitting it into short segments
• Changing the format of audio and video files
• Removing audio from a video
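The slides do not name the tools used for these steps. One common option is ffmpeg, and the sketch below shows how three of the four tasks could be expressed as ffmpeg commands built in Python. It assumes ffmpeg is installed and on the PATH; the function names, file names, and default sample rate are illustrative. Downloading is left out because it depends on the video source.

```python
import subprocess
from typing import List


def split_cmd(src: str, pattern: str, seconds: int) -> List[str]:
    """Split a video into segments of roughly `seconds` each."""
    # With stream copy, cuts land on keyframes, so segment lengths are approximate.
    return ["ffmpeg", "-y", "-i", src, "-c", "copy", "-f", "segment",
            "-segment_time", str(seconds), "-reset_timestamps", "1", pattern]


def to_wav_cmd(src: str, dst: str, sample_rate: int = 22050) -> List[str]:
    """Convert any audio or video file to a mono WAV file."""
    # -vn ignores video, -ac 1 downmixes to mono, -ar sets the sample rate.
    return ["ffmpeg", "-y", "-i", src, "-vn", "-ac", "1",
            "-ar", str(sample_rate), dst]


def remove_audio_cmd(src: str, dst: str) -> List[str]:
    """Drop the audio track while copying the video stream unchanged."""
    return ["ffmpeg", "-y", "-i", src, "-an", "-c:v", "copy", dst]


def run(cmd: List[str]) -> None:
    """Execute a prepared command and raise if ffmpeg reports an error."""
    subprocess.run(cmd, check=True)
```

For example, run(split_cmd("input.mp4", "part_%03d.mp4", 10)) would write part_000.mp4, part_001.mp4, and so on.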
TIME CONSUMPTION

01 TORTOISE VOICE
Training the Tortoise model on 20 minutes of audio took 4 hours to generate 50 seconds of audio.

02 WAV2LIP
Using audio to generate lip movements and create the video took 3 hours for a 50-second video.

03 TOTAL
Total time spent on a 50-second video with training of both models: 7 hours.
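To put these figures in perspective, the short calculation below converts each stage into seconds of processing per second of finished output. It uses only the numbers reported above; the variable and function names are illustrative.

```python
OUTPUT_SECONDS = 50  # length of the finished clip

# Hours reported for each stage.
stage_hours = {"tortoise": 4, "wav2lip": 3}
total_hours = sum(stage_hours.values())


def processing_per_output_second(hours: float,
                                 output_seconds: int = OUTPUT_SECONDS) -> float:
    """Seconds of processing spent for each second of finished output."""
    return hours * 3600 / output_seconds


for name, hours in stage_hours.items():
    print(name, processing_per_output_second(hours))
print("total", processing_per_output_second(total_hours))
# tortoise 288.0
# wav2lip 216.0
# total 504.0
```

In other words, each second of the final video cost roughly eight and a half minutes of processing.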
CHALLENGES
• Time consumption
• Collecting audio and video data
• Searching for working models
QUESTIONS AND ANSWERS
Your insights and questions are highly valuable to me, and I want to create
an engaging and interactive session. Please feel free to send me your
questions and any points that need clarification.

+46762541491

mistiuk.artur@gmail.com
