AI for Video Synthesis

August 15, 2021 Author: Matthew Renze

How do we use AI to generate new videos from scratch?

In my last article in this series on The AI Developer’s Toolkit, I introduced you to the three most popular AI tools for video analysis. These tools allowed us to extract useful information from digital video.

However, there are many cases where we want to generate new videos from scratch. This set of tasks is referred to as video synthesis (a.k.a. “deep fakes”).

In this article, I’ll introduce you to the three most popular AI tools for video synthesis.

Video Interpolation

Video interpolation allows us to predict missing video frames given both previous and subsequent frames. It answers the question: “what content should go in this missing video frame?”

For example, we can create slow-motion footage from existing regular-speed footage. We provide the video-interpolation model with a low-frame-rate video as input. Then the model produces a high-frame-rate video as output.
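To make the input and output of this task concrete, here is a minimal sketch in Python using NumPy. It is not how a real video-interpolation model works: modern models estimate motion between frames, while this stand-in simply cross-fades neighboring frames. The function name interpolate_frames, the factor parameter, and the tiny test clip are all assumptions made for illustration.

```python
import numpy as np


def interpolate_frames(frames: np.ndarray, factor: int = 2) -> np.ndarray:
    """Insert (factor - 1) blended frames between each pair of neighbors.

    frames has shape (num_frames, height, width, channels).
    This is a naive linear cross-fade, not a learned interpolation model.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    frames = frames.astype(np.float32)
    output = []
    for previous, following in zip(frames[:-1], frames[1:]):
        output.append(previous)
        for step in range(1, factor):
            t = step / factor  # position between the two known frames
            output.append((1.0 - t) * previous + t * following)
    output.append(frames[-1])
    return np.stack(output)


# Hypothetical 4-frame, 2x2-pixel clip where frame i has brightness i.
brightness = np.arange(4, dtype=np.float32).reshape(4, 1, 1, 1)
clip = brightness * np.ones((4, 2, 2, 1), dtype=np.float32)

slow_motion = interpolate_frames(clip, factor=4)
print(slow_motion.shape)  # (13, 2, 2, 1)
```

Playing the 13 output frames at the original frame rate would stretch the clip to roughly four times its length, which is the slow-motion effect described above. A cross-fade produces ghosting on fast motion, which is exactly the problem learned models are meant to solve.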

Video interpolation is useful for a variety of video-editing tasks. For example:

creating super-slow-motion videos using frame-rate-conversion
restoring old film strips with inconsistent frame rates
smoothing out security footage recorded using a video multiplexer

Video Prediction

Video prediction allows us to synthesize likely future video frames based on a few preceding video frames. It answers the question: “what will likely happen next in this video?”

For example, given 10 frames of a golf video, we can predict the next 30 frames. We provide the video-prediction model with a few frames of video as input. Then the model produces a prediction of the next few frames of the video as output.
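The 10-frames-in, 30-frames-out example can be sketched as a simple rollout loop in Python with NumPy. A real video-prediction model would be a trained neural network. Here the hypothetical predict_next_frame function stands in for it by extrapolating the last observed change in each pixel, and all names and the synthetic clip are assumptions made for illustration.

```python
import numpy as np


def predict_next_frame(context) -> np.ndarray:
    """Stand-in for a learned model: continue the last per-pixel change."""
    last, previous = context[-1], context[-2]
    return np.clip(last + (last - previous), 0.0, 1.0)


def predict_frames(context: np.ndarray, num_future: int) -> np.ndarray:
    """Predict num_future frames, feeding each prediction back in as input."""
    if len(context) < 2:
        raise ValueError("need at least two context frames")
    frames = list(context.astype(np.float64))
    for _ in range(num_future):
        frames.append(predict_next_frame(frames))
    # Return only the newly predicted frames, not the context.
    return np.stack(frames[len(context):])


# Hypothetical 10-frame clip that brightens by 0.01 per frame.
context = np.stack([np.full((2, 2, 3), 0.01 * i) for i in range(10)])

future = predict_frames(context, num_future=30)
print(future.shape)  # (30, 2, 2, 3)
```

Because each predicted frame becomes input for the next step, any error compounds over the rollout. This is one reason long-horizon video prediction remains difficult.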

Video prediction is currently an active area of research, so there aren’t many practical applications yet. However, as you can imagine, this tool will likely be quite useful for a wide variety of video-generation tasks in the near future.

Video Transfer

Video transfer (a.k.a. video-to-video synthesis) allows us to synthesize a realistic new video from a simpler input video, such as one that only outlines or labels the content of each frame.

For example, we can use a semantically segmented video to produce a completely new video. We provide the video-transfer model with a video in which each pixel carries a semantic label (e.g. road, sky, or car) as input. Then the model produces a realistic video that matches those labels as output.
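The sketch below, in Python with NumPy, shows only the shape of this input and output: a video of integer labels goes in, and a color video comes out. The hypothetical render_labels function is a plain color lookup standing in for the trained generator, which would instead produce realistic textures that stay consistent over time. The label IDs, colors, and scene are assumptions made for illustration.

```python
import numpy as np

# Hypothetical label IDs and colors, chosen only for illustration.
PALETTE = np.array(
    [
        [128, 64, 128],  # 0: road
        [70, 130, 180],  # 1: sky
        [0, 0, 142],     # 2: car
    ],
    dtype=np.uint8,
)


def render_labels(label_video: np.ndarray, palette: np.ndarray = PALETTE) -> np.ndarray:
    """Map a (frames, height, width) label video to an RGB video.

    A color lookup stands in for the generator network here.
    """
    if label_video.min() < 0 or label_video.max() >= len(palette):
        raise ValueError("label outside the palette range")
    return palette[label_video]  # integer indexing adds the channel axis


# Five 4x6 frames: sky on top, road below, one car pixel moving right.
label_video = np.zeros((5, 4, 6), dtype=np.int64)
label_video[:, :2, :] = 1
for t in range(5):
    label_video[t, 3, t] = 2

rgb_video = render_labels(label_video)
print(rgb_video.shape)  # (5, 4, 6, 3)
```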

Video transfer is also an active area of research, so there currently aren’t many practical applications. However, once again, you can imagine what we might be doing with this technology in the very near future.

Other Tools

Beyond the three examples that we’ve seen so far, there are also a variety of other AI tools for video synthesis. For example:

Video completion – which is like image completion but for motion video
Video face synthesis – which allows us to create synthetic videos of people speaking
Video pose transfer – which allows us to create synthetic videos of people’s movements
Video lip syncing – which allows us to match the lip movements in a video track to the speech in an audio track

As we can see, video-synthesis tools allow us to transform existing videos and create new videos from scratch.

If you’d like to learn how to use all of the tools listed above, please watch my online course: The AI Developer’s Toolkit.

The future belongs to those who invest in AI today. Don’t get left behind!

[Image source: Video-to-Video Synthesis]