Microsoft makes a bid to run physical AI

By Sabrina Ortiz

Jan 21, 2026, 12:00pm UTC

As AI has skyrocketed in popularity in recent years, various evolutions of the tech, such as AI agents, have taken off. The hottest new buzzword is Physical AI, and Microsoft is jumping into the fray.

Broadly speaking, Physical AI can be described as hardware that goes beyond what robots already do by perceiving the environment and then reasoning to agentically perform or orchestrate actions. Microsoft’s first set of robotics models, Rho-alpha, translates spoken commands into actions for robotic systems performing bimanual manipulation tasks, meaning tasks that require using both hands at once.

The models, derived from Microsoft's Phi series, go a step beyond traditional vision-language-action models (VLAs) by adding tactile sensing, or the ability to understand physical cues. For instance, the company shares that efforts are underway to enable the models to sense additional modalities such as force. That capability could be helpful in real-world scenarios, such as stopping a movement if someone is in the way.

Rho-alpha also enables robots to learn from feedback given by people, which allows them to keep learning on the job, much like a person. Ultimately, Microsoft says the goal is to make physical systems more adaptable, adjusting to both their environment and people’s requests, and therefore more trustworthy. To achieve this goal, Rho-alpha was trained on physical demonstrations, simulated tasks, and web-scale visual question-answering data, according to the company.

Lastly, Microsoft is tackling the scarcity of high-quality simulated data that accurately captures reality by having its training pipeline generate synthetic data using the open-source NVIDIA Isaac Sim framework. Those interested in using physical AI foundations and tools can join Microsoft's Research Early Access Program.

Our Deeper View

While tech like this is a step in the right direction (and serves as damage control), the bigger question is whether it actually works. Today's AI is still far from perfect, with a tendency to hallucinate and make mistakes. And we’ve already seen that wreak havoc in age verification systems such as Roblox's, which misidentified kids as adults and vice versa. The question remains whether these kinds of risks can be controlled at all while allowing young users to harness the technology, and what responsibility these firms have in safeguarding the systems.