Is the Robot Turning Point Here? This U.S. Company Claims Its New Model Can "Execute Tasks Robots Have Never Been Trained On"

Wallstreetcn
2026.04.17 03:29

Robot AI may be approaching its "ChatGPT moment": San Francisco-based startup Physical Intelligence has unveiled a new model, π0.7, demonstrating autonomous execution of tasks it was never trained on—surprising even its own researchers. This breakthrough in "compositional generalization" could fundamentally reshape the commercialization path for robotics, and may help propel the company's valuation from $5.6 billion to nearly $11 billion in a new funding round.

The field of robot AI may be approaching a capability leap similar to that seen with large language models.

Physical Intelligence, a robotics startup based in San Francisco, released new research on Thursday claiming its new model π0.7 can command robots to complete tasks they have never been specifically trained for—a capability that even surprised the company's own researchers.

Sergey Levine, co-founder of the company and professor at the University of California, Berkeley, stated that this marks a shift in robot AI from "rote memorization" to "generalization," with capability improvements outpacing the linear growth of training data scale.

If validated externally, this breakthrough could profoundly impact the commercialization trajectory of the robotics industry—robots could potentially be deployed in entirely new environments and optimized in real time without additional data collection or model retraining. Meanwhile, according to reports, Physical Intelligence is currently negotiating a new round of financing, with its valuation potentially doubling from $5.6 billion to approximately $11 billion.

Core Breakthrough: From "Specialized Memory" to "Compositional Generalization"

Physical Intelligence was founded just two years ago. Its newly released π0.7 model demonstrates what researchers call "compositional generalization"—combining skills learned in different scenarios to solve novel problems the model has never encountered before.

This stands in stark contrast to the prevailing paradigm in prior robot training. The standard approach previously involved collecting data for each specific task, training a specialized model, and repeating this process for the next task. π0.7 breaks this pattern.

Levine analogizes this shift to the capability leap once observed in the large language model domain: "Once you cross that threshold—from only completing tasks supported by data to being able to recombine skills in new ways—the rate of capability improvement exceeds the linear proportion of data volume growth. We have already observed this more favorable scaling behavior in language and vision domains."

Key Demonstration: Air Fryer Experiment Reveals "Emergent Knowledge"

The most compelling demonstration in this research comes from an air fryer the model had almost never seen during training. Post-experiment analysis revealed that the entire training dataset contained only two relevant records: one showing another robot pushing an air fryer closed, and another from an open-source dataset recording a robot placing a plastic bottle inside upon instruction.

Nevertheless, π0.7 integrated these fragmented pieces of information with broader pre-trained web data to form a functional understanding of how the device operates. Without any prompts, the model attempted to cook sweet potatoes using the air fryer, achieving results that were generally acceptable; with step-by-step language guidance, the task was successfully executed.

Lucy Shi, a researcher at Physical Intelligence and a Ph.D. candidate in computer science at Stanford University, described a dramatic transformation in an early experiment: the initial success rate was merely 5%, but after spending about half an hour optimizing the description of the task, the success rate jumped to 95%. "Sometimes the failure isn't with the robot or the model, but with us—we simply didn't do prompt engineering well enough," she said.

Research scientist Ashwin Balakrishna noted that he had always been able to predict the model's capabilities based on training data, "but over the past few months, I've genuinely felt surprised for the first time. I casually bought a set of gears and asked the robot if it could turn them, and it did so immediately."

Limitations: Researchers Actively Define Boundaries

The research team remains candid about the model's limitations. π0.7 cannot yet autonomously complete complex multi-step tasks from a single high-level command. "You can't tell it 'go make me a slice of toast'," Levine said, "but if you guide it step by step—'for the toaster, open this part, press that button, do this'—it usually performs very well."

Additionally, the robotics field currently lacks standardized benchmark tests, which makes external validation considerably harder. Physical Intelligence chose to compare π0.7 against its own previous specialized models, with results showing that the general-purpose model achieved performance comparable to the specialized ones on complex tasks such as making coffee, folding clothes, and assembling boxes.

The paper itself maintains cautious wording, describing π0.7 as showing "early signs" and "preliminary demonstrations" of generalization capabilities. When directly asked when systems based on this research could be practically deployed, Levine declined to offer predictions: "I believe there are sufficient grounds for optimism, and progress is moving faster than I anticipated two years ago. But I find it difficult to answer that question."

Capital Bet: Valuation Could Double to $11 Billion

To date, Physical Intelligence has raised over $1 billion in total funding, with its latest valuation standing at $5.6 billion. According to reports, the company is currently negotiating a new round of financing that could see its valuation nearly double to $11 billion.

Investor enthusiasm for the company stems largely from the endorsement of co-founder Lachy Groom. Previously one of Silicon Valley's most respected angel investors, Groom backed notable companies including Figma, Notion, and Ramp. Before deciding to co-found Physical Intelligence, he described it as the company he had been searching for all along. This background helped the startup attract institutional capital, even though the company has consistently declined to give investors a commercialization timeline.

When addressing potential external skepticism, Levine proactively anticipated criticism: "For any robot generalization demonstration, one can always argue—the tasks are too boring, and the robot isn't doing backflips." He countered this by stating that truly generalizable robot systems will never look as spectacular as meticulously choreographed stunt demonstrations, but their practical value is far greater.