These examples suggest that Google's VideoPoet will become a formidable competitor to OpenAI's Sora. Users may soon see VideoPoet and its successors producing remarkably realistic videos.
Animating the Mona Lisa from two prompts: "A woman turns to look at the camera" and "A woman yawns".
As the wave of text-generating AI matures with increasingly polished products, a new wave of AI video-generation models is beginning to bloom. However, these models still struggle to produce sequences of motion that make sense to the viewer.
Over time, these models will learn more and produce more realistic, higher-quality results. Their appeal lies in how simple the creation process is: the user only needs to issue well-crafted prompts for the AI to generate videos or similar output. Beyond text-to-video, a versatile model can also create videos from photos, stylize existing videos, and more.
Currently, OpenAI's Sora is attracting public attention with a series of surprisingly realistic AI-generated videos, but OpenAI is not alone in this line of research. Google has a similar project of its own called VideoPoet, which has been in development for some time and has also produced very impressive results.
Video from the prompt: "Two pandas playing cards".
Video from the prompt: "Horses galloping against the backdrop of van Gogh's painting The Starry Night".
As Google's researchers confirm, an input image can be animated to create movement. VideoPoet can also automatically fill in missing content (for example, to restore the original video) or generate additional content for videos.
In the stylization task, the model takes a video's depth and optical-flow information, which represents motion, and then paints additional content on top to produce the style described in the user's prompt. Below is the result of stylizing a video that was itself generated by Google's AI model.
Prompts for the videos (from left to right): "A wombat wearing sunglasses holding a volleyball on the beach"; "A teddy bear skating on a frozen lake"; "A metal lion roaring in the light of a forge".
Conditioning on the last one second of the video, the model can create a longer video by predicting what may happen in the next second. By repeating this process, VideoPoet can not only extend a video easily but also preserve the appearance of the objects that appear in the original short clip.
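The extension loop described above can be sketched in a few lines. This is a minimal illustration only, not VideoPoet's actual code: the model here is a stand-in stub (`predict_next_second` is a hypothetical name), and frames are represented as plain integers rather than real video tokens.

```python
# Hedged sketch of autoregressive video extension, as described in the article:
# repeatedly condition on the last 1 second of frames and append a predicted
# next second. All names are hypothetical; a real model would predict video
# tokens, not integer frame indices.

FPS = 24  # assumed frame rate


def predict_next_second(context_frames):
    """Stand-in for the generative model.

    Here it simply continues the frame numbering; a real model would
    condition on the context and sample new video content.
    """
    last = context_frames[-1]
    return [last + i + 1 for i in range(FPS)]


def extend_video(frames, extra_seconds):
    """Extend a clip second by second, conditioning on the last 1 s each time."""
    frames = list(frames)
    for _ in range(extra_seconds):
        context = frames[-FPS:]  # the last one second of video
        frames.extend(predict_next_second(context))
    return frames


clip = list(range(2 * FPS))  # a 2-second starting clip
longer = extend_video(clip, extra_seconds=3)
print(len(longer) // FPS)  # → 5 (seconds of video)
```

Because each step sees only the most recent second, the context window stays fixed no matter how long the output grows, which is what makes arbitrary extension cheap.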
Video from the prompt: "An astronaut starts dancing on Mars. Then bright fireworks explode from behind."
VideoPoet is also capable of generating sound. Given a 2-second clip, the model tries to predict the audio without any text instruction. This allows video and audio to be generated from a single model.
Sound generated for a teddy bear drumming.
Sound generated for cats playing the piano.
Through VideoPoet, Google demonstrates the highly competitive capability of large language models, which can not only produce text but also create eye-catching, realistic videos.
The results show the promising potential of large language models in the field of video generation. In the future, such models could generate content from a variety of inputs, such as creating sound from text, creating videos from spoken words, automatically describing videos, and many other applications.