Google’s Text to Video Artificial Intelligence

Ankit Kumar
4 min read · Nov 1, 2022


Want a video where your doggy fights some superheroes? Then you are in the right place. Say hurray!

On October 5, 2022, Google announced a new AI that it is working on. It's a text-to-video AI that can automatically generate video at 720p quality and 24 fps. For a start, this is good quality, considering it's just their second phase.

Image: Google headquarters
Photo by Dylan Carr on Unsplash

This is not the first text-to-video AI. There were other text-to-video tools before it, but they produced crappy videos that no one would want to use. That is not the developers' fault. It is just new tech, and they are still working on it. It will take some time before it becomes really good. Developers are constantly working on improving these AI systems so that they can produce high-quality video that meets the standards needed for commercial use.

For the time being, they are producing 720p video, as I said, which is good, and the text-to-video output is remarkable, but it is still not enough for commercial use. There is still a lot of room for improvement.

The picture below is a frame from one of the videos this text-to-video AI produced. As you can see, the elephant does not look that good, and when the video plays, its movement is not natural at all. Movement in these AI-generated videos is sloppy, but the results still assure me that it won't be long before these tools produce really good text-to-video output: videos so good that people will find it difficult to tell whether they were made by AI. My guess is that this is still about a decade away, but who knows, maybe they will get there in the next 3 years. It's unpredictable and anything can happen, but here I am, expecting things. Hoofff!

Source: YouTube

For sample videos click on this link: https://imagen.research.google/video/

Now let's go behind the scenes. How does Google's artificial intelligence accomplish these results?

If you are familiar with artificial intelligence, you know that whenever we read about it we run into the terms machine learning and deep learning. These terms are often used interchangeably, so there is some confusion about how Machine Learning (ML), Deep Learning (DL), and Artificial Intelligence (AI) relate to each other.

By Author

Data science is a field of study that uses data for various research and reporting purposes to derive insights and meaning from that data.

Artificial intelligence: Artificial Intelligence (AI) is a subfield of computer science concerned with building machines that can simulate human intelligence. AI research deals with the question of how to create computers that are capable of intelligent behavior.

Machine learning: It's a subfield of artificial intelligence. Its defining feature is the ability to learn from experience (that is, from data) without being explicitly programmed for each task, which means an ML model picks up its behavior from examples rather than from rules written by hand.

In ML, there are two main categories: supervised learning and unsupervised learning. There is also a hybrid of the two called semi-supervised learning, which combines a small amount of labeled data with a larger amount of unlabeled data.

Let’s see what makes these three different.

Supervised models learn from ground truth data that was labeled manually by data scientists. In computer vision, this labeling process is called image annotation. The model uses this data to learn (training) how to make predictions on new data (inference).
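To make the training and prediction steps concrete, here is a minimal sketch of supervised learning in plain Python. It is a toy nearest-centroid classifier, not anything Google uses, and the animal measurements and labels are invented purely for illustration.

```python
# Toy supervised learning: a nearest-centroid classifier in plain Python.
# The feature values and labels below are invented for illustration.

def train(examples):
    """Average the feature vectors of each label (the 'training' step)."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (the prediction step)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids,
               key=lambda label: squared_distance(centroids[label], features))


# Hand-labeled "ground truth": (weight in kg, height in cm) -> animal.
labeled_data = [
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((30.0, 60.0), "dog"),
    ((35.0, 65.0), "dog"),
]

model = train(labeled_data)
print(predict(model, (4.5, 24.0)))    # a new animal the model has not seen
print(predict(model, (32.0, 62.0)))
```

The important part is the split: `train` only ever sees labeled examples, and `predict` is then asked about data that carries no label.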

On the other hand, unsupervised learning is where the algorithm is given raw data that is not annotated. Here, the algorithm is not explicitly told what to do with it and must learn how to make predictions by itself. This type of ML model is suitable to perform specific tasks on distinct data types, for example, fraud detection or financial analysis, that require identifying a hidden structure in unlabeled data.
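Here is a similarly small sketch of unsupervised learning, again in plain Python. It uses a one-dimensional version of k-means clustering, which is just one of many unsupervised techniques, and the transaction amounts are made up. Notice that no value carries a label: the algorithm only groups numbers that sit close together.

```python
# Toy unsupervised learning: 1-D k-means with two clusters, plain Python.
# The transaction amounts are invented; nothing here is labeled.

def kmeans_1d(values, centers, rounds=10):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return centers, groups


amounts = [12.0, 15.0, 11.0, 14.0, 980.0, 1020.0]
centers, groups = kmeans_1d(amounts, centers=[min(amounts), max(amounts)])
print(centers)
print(groups)
```

In a fraud-detection setting, the small second group of unusually large amounts is the kind of hidden structure an analyst might want to look at more closely.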

Deep learning: It's a subfield of ML.
In deep learning, algorithms built from neural networks with many layers are trained to recognize patterns in data.
ML, on the other hand, is more like a generalist: it covers all kinds of algorithms that learn a task from data.
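What makes deep learning "deep" is that it stacks several layers of simple units on top of each other. The sketch below, in plain Python, shows two stacked layers computing XOR, a pattern that a single layer of this kind cannot capture. The weights are picked by hand for illustration. A real deep learning model would learn them from data and would have vastly more of them.

```python
# Toy "deep" model: two stacked layers computing XOR, plain Python.
# The weights are hand-picked for illustration, not learned from data.

def relu(x):
    """Pass positive values through and clamp negative ones to zero."""
    return max(0.0, x)


def layer(inputs, weights, biases, activation):
    """One fully connected layer: weighted sum per unit, then activation."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def tiny_network(x1, x2):
    # Hidden layer with two units, followed by a single linear output unit.
    hidden = layer([x1, x2],
                   weights=[[1.0, 1.0], [1.0, 1.0]],
                   biases=[0.0, -1.0],
                   activation=relu)
    output = layer(hidden,
                   weights=[[1.0, -2.0]],
                   biases=[0.0],
                   activation=lambda v: v)
    return output[0]


for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, tiny_network(a, b))
```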

Now you can piece together how Google's text-to-video AI works.
Suppose you enter a prompt such as "An elephant walking in water". The model does not go off and search the web for clips of elephants and water at that moment. According to Google's project page, a text encoder first turns the prompt into numbers the model can work with, a base video model generates a short, low-resolution clip from them, and further models then raise the resolution and the frame rate until you get the final video. Everything the model knows about elephants and water comes from the large collection of captioned videos and images it was trained on beforehand.
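Google's project page describes the system as a cascade: a base model produces a small, short clip, and follow-up models then add frames and pixels. The plain-Python sketch below only mimics the shape of that pipeline. Every function in it is a hypothetical stand-in I made up: the real system uses trained diffusion models, not random numbers, frame averaging, or pixel repetition, and the tiny sizes are chosen so the output is easy to read.

```python
# Shape-only sketch of a generate-then-upsample cascade, plain Python.
# Every function is a hypothetical stand-in: the real system uses trained
# diffusion models, not random numbers, averaging, or pixel repetition.
import random
import zlib


def base_model(prompt, frames=4, height=2, width=3):
    """Stand-in for the base model: a tiny low-resolution clip whose
    values are derived deterministically from the prompt text."""
    rng = random.Random(zlib.crc32(prompt.encode("utf-8")))
    return [[[rng.random() for _ in range(width)] for _ in range(height)]
            for _ in range(frames)]


def temporal_upsample(video):
    """Stand-in for temporal super-resolution: insert the average of
    each pair of neighbouring frames between them."""
    result = []
    for current, following in zip(video, video[1:]):
        middle = [[(a + b) / 2 for a, b in zip(row_a, row_b)]
                  for row_a, row_b in zip(current, following)]
        result.extend([current, middle])
    result.append(video[-1])
    return result


def spatial_upsample(video):
    """Stand-in for spatial super-resolution: repeat every pixel 2x2."""
    return [[[pixel for pixel in row for _ in range(2)]
             for row in frame for _ in range(2)]
            for frame in video]


def shape(video):
    """Return (frames, height, width) of a nested-list video."""
    return len(video), len(video[0]), len(video[0][0])


clip = base_model("An elephant walking in water")
print(shape(clip))        # (4, 2, 3)
clip = temporal_upsample(clip)
print(shape(clip))        # (7, 2, 3)
clip = spatial_upsample(clip)
print(shape(clip))        # (7, 4, 6)
```

The point is only the order of operations: generate something small first, then make it longer and sharper in separate steps.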

That's the overall story of how you get your video from Google's video AI.

I hope it was a fun read.
Thanks!

