MIT’s deep-learning software produces videos of the future

When you see a photo of a dog bounding across the lawn, it’s pretty easy to imagine how the following moments played out. Well, scientists at MIT have just trained machines to do the same thing, with artificial intelligence software that can take a single image and use it to create a short video of the seconds that followed. The technology is still bare-bones, but it could one day make for smarter self-driving cars that are better prepared for the unexpected, among other applications.

The software uses a deep-learning algorithm that was trained on two million unlabeled videos amounting to a year’s worth of screen time. It actually consists of two separate neural networks that compete with one another. The first has been taught to separate the foreground and the background and to identify the object in the image, which allows the model to then determine what is moving and what isn’t.
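To get a feel for why splitting a scene into foreground and background helps, here is a minimal Python sketch. It is not the MIT team's code: the function name `composite_video`, the per-pixel mask and the blending rule are all assumptions made for illustration. The idea is that a mask marks which pixels are moving, so each frame can be assembled from a changing foreground laid over a background that stays put.

```python
def composite_video(mask, foreground, background):
    """Blend a moving foreground over a static background, frame by frame.

    Hypothetical helper for illustration only.
    mask[t][y][x] is in [0, 1]: 1 marks a moving (foreground) pixel,
    0 marks a static (background) pixel.
    """
    video = []
    for mask_frame, fg_frame in zip(mask, foreground):
        frame = [
            [m * f + (1.0 - m) * b for m, f, b in zip(m_row, f_row, b_row)]
            for m_row, f_row, b_row in zip(mask_frame, fg_frame, background)
        ]
        video.append(frame)
    return video


# Toy data: a 1x3 "image" in which a bright object moves one pixel to the right.
background = [[0.2, 0.2, 0.2]]                        # the same in every frame
foreground = [[[0.9, 0.9, 0.9]], [[0.9, 0.9, 0.9]]]   # two frames of foreground
mask = [[[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0]]]         # where the object is in each frame

video = composite_video(mask, foreground, background)
print(video[0])  # [[0.9, 0.2, 0.2]]
print(video[1])  # [[0.2, 0.9, 0.2]]
```

In this toy version only the masked pixel changes between frames, while everything else is copied straight from the background.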

According to the scientists, this approach improves on other computer vision technologies under development that can also create video of the future. These take the information available in existing videos and stretch it out with computer-generated frames, built one at a time. The new software is claimed to be more accurate, producing up to 32 frames per second and building out entire scenes in one go.
