Video Is a Better Source for Machines To Understand the World

“V-JEPA is a step toward a more grounded understanding of the world so machines can achieve more generalized reasoning and planning,” says Meta’s VP & Chief AI Scientist Yann LeCun, who proposed the original Joint Embedding Predictive Architectures (JEPA) in 2022.

What Meta is presenting now (March 2024) is an early example of a physical world model, one that excels at detecting and understanding highly detailed interactions between objects.

Meta is releasing this model under a Creative Commons Non-Commercial license, for research use only, so that researchers can explore it further. The next step, according to Meta, is to show how this kind of predictor or world model can be used for planning or sequential decision-making.