In a recent interview, Yann LeCun argued that large language models, however valuable, cannot lead to artificial general intelligence: they lack the ability to predict the consequences of actions and to plan in an abstract space, two capabilities he considers essential for human-level reasoning. LeCun emphasized that the success of LLMs relies on the discrete nature of language, whereas the real world is continuous and high-dimensional, so a model must understand physical causality rather than merely predict the next token.
As an alternative, LeCun proposes the Joint Embedding Predictive Architecture (JEPA), which predicts future states in a semantic representation space rather than reconstructing individual pixels. A March 2026 paper on LeWorldModel demonstrated JEPA's potential: a 15-million-parameter model achieved a 96% success rate on control tasks and sped up planning by up to 50 times, without requiring massive pre-training datasets.
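
To make the idea of predicting in representation space concrete, here is a minimal toy sketch in plain Python. It is not the LeWorldModel implementation or any published JEPA code. The linear encoder and predictor, the random untrained weights, the dimensions, and every function name are assumptions chosen for illustration. It shows only the shape of the computation: the prediction error is measured between small latent vectors, never against the raw observation, and an action is chosen by comparing predicted latents to the latent of a goal.

```python
import random

# Hypothetical sizes: a 16-value "observation" compressed to a 2-value latent.
OBS_DIM, LATENT_DIM, ACTION_DIM = 16, 2, 1

def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
# Assumed stand-ins for learned networks: random, untrained linear maps.
encoder_w = random_matrix(LATENT_DIM, OBS_DIM, rng)
predictor_w = random_matrix(LATENT_DIM, LATENT_DIM + ACTION_DIM, rng)

def encode(obs):
    """Map a raw observation to its latent representation."""
    return matvec(encoder_w, obs)

def predict_next_latent(latent, action):
    """Predict the next latent from the current latent and an action."""
    return matvec(predictor_w, latent + action)

def latent_prediction_loss(obs, action, next_obs):
    """Error between predicted and encoded next state, measured in latent space.

    No pixels are reconstructed: both sides of the comparison have
    LATENT_DIM entries, not OBS_DIM. A trainable system would also need
    some measure against representation collapse, which is omitted here.
    """
    predicted = predict_next_latent(encode(obs), action)
    return sq_dist(predicted, encode(next_obs))

def plan_one_step(obs, goal_obs, candidate_actions):
    """Pick the candidate whose predicted next latent is closest to the goal latent."""
    latent, goal_latent = encode(obs), encode(goal_obs)
    return min(
        candidate_actions,
        key=lambda a: sq_dist(predict_next_latent(latent, a), goal_latent),
    )

obs = [rng.random() for _ in range(OBS_DIM)]
next_obs = [rng.random() for _ in range(OBS_DIM)]
goal_obs = [rng.random() for _ in range(OBS_DIM)]
candidates = [[-1.0], [0.0], [1.0]]

loss = latent_prediction_loss(obs, [0.5], next_obs)
best_action = plan_one_step(obs, goal_obs, candidates)
```

Because the weights are untrained, the numeric values of `loss` and `best_action` carry no meaning. The point is structural: planning compares vectors of length `LATENT_DIM` instead of rendering and scoring full observations, which is one plausible reason planning in a compact latent space can be cheaper.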