
Text-to-video model

A text-to-video model is a machine learning model which takes as input a natural language description and
produces a video matching that description.[1]

Video prediction, which aims to render realistic objects against a stable background, can be performed with a
recurrent neural network used as a sequence-to-sequence model, connected to a convolutional neural network
that encodes and decodes each frame pixel by pixel,[2] generating video by means of deep learning.[3]
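
The data flow described above (encode each frame, update a recurrent state, decode a predicted frame) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the functions encode, recurrent_step, decode and predict_next_frame are hypothetical names, the pooling and blending operations are fixed, untrained stand-ins for a learned convolutional network and recurrent cell, and frames are assumed to be grayscale grids with even height and width.

```python
import math

def encode(frame):
    """Stand-in for a CNN encoder: 2x2 average pooling into a flat feature vector."""
    height, width = len(frame), len(frame[0])
    return [
        (frame[r][c] + frame[r][c + 1] + frame[r + 1][c] + frame[r + 1][c + 1]) / 4.0
        for r in range(0, height, 2)
        for c in range(0, width, 2)
    ]

def recurrent_step(state, features, decay=0.5):
    """Stand-in for an RNN cell: blend the previous state with the new features."""
    return [math.tanh(decay * s + (1.0 - decay) * f) for s, f in zip(state, features)]

def decode(state, height, width):
    """Stand-in for a CNN decoder: nearest-neighbour upsampling back to frame size."""
    cols = width // 2
    return [
        [state[(r // 2) * cols + (c // 2)] for c in range(width)]
        for r in range(height)
    ]

def predict_next_frame(frames):
    """Run the encoder over every input frame, then decode the final state."""
    height, width = len(frames[0]), len(frames[0][0])
    state = [0.0] * ((height // 2) * (width // 2))
    for frame in frames:
        state = recurrent_step(state, encode(frame))
    return decode(state, height, width)

if __name__ == "__main__":
    clip = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]  # three 4x4 frames
    predicted = predict_next_frame(clip)
    print(len(predicted), len(predicted[0]))
```

In a real system the three stand-ins would be replaced by trained networks; the sketch only shows how the per-frame encoder, the sequence model and the decoder connect.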

Methodology
Data collection and dataset preparation using clear video clips from the Kinetics human action video dataset.
Training the convolutional neural network to generate video.
Keyword extraction from the text using natural language processing.
Testing the dataset in a conditional generative model that draws static and dynamic
information from the text using a variational autoencoder and a generative adversarial network.
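
The keyword extraction step can be illustrated with a minimal Python sketch. It assumes a simple stop-word filter, which is only one of many possible natural language processing techniques and is not stated in the cited sources; the STOP_WORDS set and the extract_keywords function are hypothetical names chosen for this example.

```python
import re

# A small, hand-picked stop-word list (an assumption for this example only).
STOP_WORDS = {
    "a", "an", "and", "are", "at", "in", "is", "of", "on", "the", "to", "with",
}

def extract_keywords(prompt):
    """Return the distinct non-stop-word tokens of a prompt, in order of appearance."""
    tokens = re.findall(r"[a-z]+", prompt.lower())
    keywords = []
    for token in tokens:
        if token not in STOP_WORDS and token not in keywords:
            keywords.append(token)
    return keywords

if __name__ == "__main__":
    print(extract_keywords("A person is playing golf on the grass"))
```

The resulting keyword list could then serve as the conditioning input to the generative model, although a production system would more likely use learned text embeddings.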

Models
Several text-to-video models exist, including open-source ones. The code for CogVideo is available on GitHub.[4]
Meta Platforms offers text-to-video generation through Make-A-Video (https://Makeavideo.studio).[5][6][7] Google
developed Imagen Video for text-to-video generation.[8][9][10][11][12]

Antonia Antonova presented another model.[13]

References
1. Artificial Intelligence Index Report 2023 (https://aiindex.stanford.edu/wp-content/uploads/2023/04/HAI_AI-Index-Report_2023.pdf) (PDF) (Report). Stanford Institute for Human-Centered Artificial Intelligence. p. 98. "Multiple high quality text-to-video models, AI systems that can generate video clips from prompted text, were released in 2022."
2. "Leading India" (https://www.leadingindia.ai/downloads/projects/VP/vp_16.pdf) (PDF).
3. Narain, Rohit (2021-12-29). "Smart Video Generation from Text Using Deep Neural Networks" (https://www.datatobiz.com/blog/smart-video-generation-from-text/). Retrieved 2022-10-12.
4. CogVideo (https://github.com/THUDM/CogVideo), THUDM, 2022-10-12, retrieved 2022-10-12.
5. Davies, Teli (2022-09-29). "Make-A-Video: Meta AI's New Model For Text-To-Video Generation" (https://wandb.ai/telidavies/ml-news/reports/Make-A-Video-Meta-AI-s-New-Model-For-Text-To-Video-Generation--VmlldzoyNzE4Nzcx). W&B. Retrieved 2022-10-12.
6. Monge, Jim Clyde (2022-08-03). "This AI Can Create Video From Text Prompt" (https://betterprogramming.pub/this-ai-can-create-video-from-text-prompt-6904439d7aba). Medium. Retrieved 2022-10-12.
7. "Meta's Make-A-Video AI creates videos from text" (https://www.fonearena.com/blog/375627/meta-make-a-video-ai-create-videos-from-text.html). www.fonearena.com. Retrieved 2022-10-12.
8. "google: Google takes on Meta, introduces own video-generating AI - The Economic Times" (https://m.economictimes.com/tech/technology/google-takes-on-meta-introduces-own-video-generating-ai/amp_articleshow/94681128.cms?amp_gsa=1&amp_js_v=a9&usqp=mq331AQKKAFQArABIIACAw==#amp_tf=From%20%251$s&aoh=16655942495197&referrer=https://www.google.com&ampshare=https://m.economictimes.com/tech/technology/google-takes-on-meta-introduces-own-video-generating-ai/articleshow/94681128.cms). m.economictimes.com. Retrieved 2022-10-12.
9. Monge, Jim Clyde (2022-08-03). "This AI Can Create Video From Text Prompt" (https://betterprogramming.pub/this-ai-can-create-video-from-text-prompt-6904439d7aba). Medium. Retrieved 2022-10-12.
10. "Nuh-uh, Meta, we can do text-to-video AI, too, says Google" (https://www.theregister.com/AMP/2022/10/06/google_ai_imagen_video/). www.theregister.com. Retrieved 2022-10-12.
11. "Papers with Code - See, Plan, Predict: Language-guided Cognitive Planning with Video Prediction" (https://paperswithcode.com/paper/see-plan-predict-language-guided-cognitive). paperswithcode.com. Retrieved 2022-10-12.
12. "Papers with Code - Text-driven Video Prediction" (https://paperswithcode.com/paper/text-driven-video-prediction). paperswithcode.com. Retrieved 2022-10-12.
13. "Text to Video Generation" (https://antonia.space/text-to-video-generation). Antonia Antonova. Retrieved 2022-10-12.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Text-to-video_model&oldid=1153001697"
