September 18, 2025


Hierarchical Reasoning Models & their Impact on Robotics

Hierarchical Reasoning Models (HRMs) are small but provocative: instead of scaling params, they build in loops for real sequential reasoning. For robotics, that’s the difference between guessing once and iterating until the bolt threads or the grasp holds. Sight isn’t enough—robots need opinionated world models that perceive, reason, predict, and respond. HRM isn’t the answer yet, but it points the way: give machines the right to hold a hypothesis, test it, and change course. That’s how we move from recognition → understanding → anticipation → action.

How next-gen AI models that reason in steps are unlocking smarter, safer robotic decision-making.

Hierarchical Reasoning Models & the Next Leap for Robots


Transformers gave robots sharp eyes; they still need a point of view. That’s why the recent Hierarchical Reasoning Model (HRM) work is interesting: it puts the “scratchpad” for thinking back inside the network and makes reasoning iterative, not just verbose. HRM uses two coupled loops: a fast, low‑level module that grinds through details and a slower, high‑level module that plans and corrects. The model can therefore revisit a problem until it converges, rather than being trapped by a fixed depth of layers. Reported results are striking for such a small model (≈27M params, trained on roughly 1,000 examples, no pretraining), including decent Sudoku and ARC‑AGI performance.
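To make the two-loop rhythm concrete, here is a minimal toy sketch of the control flow only. It is not the published HRM architecture: the function name `nested_loop_solve`, the scalar states, and the update rules are all assumptions chosen so the example fits in a few lines. A fast inner loop refines a detail estimate under the current plan, and a slow outer loop absorbs that result once per cycle and checks for convergence.

```python
def nested_loop_solve(target, low_steps=4, max_cycles=50, tol=1e-6):
    """Toy two-timescale loop (illustrative only, not the HRM model itself).

    z_high: slow state, updated once per cycle (the "plan").
    z_low:  fast state, updated low_steps times per cycle (the "details").
    """
    z_high = 0.0
    for cycle in range(1, max_cycles + 1):
        z_low = 0.0  # the fast state restarts under each new plan
        # Fast module: several cheap updates conditioned on the slow state.
        for _ in range(low_steps):
            residual = target - z_high
            z_low += 0.5 * (residual - z_low)
        # Slow module: one update per cycle, absorbing the fast result.
        z_high += z_low
        # Halt as soon as the answer is good enough; depth is not fixed.
        if abs(target - z_high) < tol:
            return z_high, cycle
    return z_high, max_cycles


if __name__ == "__main__":
    value, cycles = nested_loop_solve(3.0)
    print(f"converged to {value:.6f} after {cycles} slow cycles")
```

The point of the sketch is the shape: the number of outer cycles is decided at run time by the convergence check, not by a layer count fixed in advance.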


For robotics, the appeal isn’t leaderboard hype; it’s sequential control under constraints. Many real tasks (regrasping, fastening, threading, insertion, recovery from slip) are little trees of “try → check → backtrack → try again.” A fixed‑depth feed‑forward pass struggles here; you want controlled recurrence so the system can reason for as long as needed. That’s the same intuition behind Universal Transformers’ dynamic halting: keep thinking until the job is done (it takes what it takes).
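The “try → check → backtrack → try again” pattern can be sketched as a small loop with an explicit halting test. Everything here is hypothetical: `try_check_backtrack`, the `SimulatedInsertion` stand-in, and the linear 10 N/mm force model are assumptions for illustration, not a real robot API.

```python
from typing import Callable, Iterable, Optional, Tuple


def try_check_backtrack(
    candidates: Iterable[float],
    attempt: Callable[[float], float],
    is_success: Callable[[float], bool],
    undo: Callable[[], None],
    max_attempts: int = 10,
) -> Tuple[Optional[float], int]:
    """Try candidate actions until one passes the check or the budget runs out."""
    tried = 0
    for action in candidates:
        if tried >= max_attempts:
            break
        tried += 1
        reading = attempt(action)   # try
        if is_success(reading):     # check
            return action, tried    # halt: the job is done
        undo()                      # backtrack before the next try
    return None, tried


class SimulatedInsertion:
    """Hypothetical stand-in for a robot cell; the hole sits at hole_mm."""

    def __init__(self, hole_mm: float) -> None:
        self.hole_mm = hole_mm
        self.retracts = 0

    def attempt(self, offset_mm: float) -> float:
        # Assumed model: side force grows linearly with misalignment.
        return 10.0 * abs(offset_mm - self.hole_mm)

    def undo(self) -> None:
        self.retracts += 1  # count each retract so the loop can be inspected


if __name__ == "__main__":
    rig = SimulatedInsertion(hole_mm=0.3)
    offsets = [0.0, 0.1, 0.2, 0.3, 0.4]
    print(try_check_backtrack(offsets, rig.attempt, lambda f: f < 0.5, rig.undo))
```

The amount of work depends on the instance: an easy alignment halts on the first try, a hard one uses more of the budget.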


There’s nuance. An independent analysis by ARC Prize suggests that a lot of HRM’s gains come from its outer loop (iterative refinement / test‑time training) rather than from the hierarchy alone. Basically, the model does more closed‑loop work on the actual instance. In robotics terms, that’s a feature, not a bug: the best policies learn by iterating on the live state, not by guessing once and hoping.


Here are the practical takeaways for robots:


  • Reason in loops, not lines. Make “perceive → reason → predict → respond” a true control loop, not just a slogan. Give the system permission to backtrack.

  • Separate timescales. Let a slow planner set subgoals while a fast controller handles contacts, slippage, and micro‑corrections—HRM’s two‑module rhythm maps cleanly onto high/low‑level control.

  • Be opinionated about the scene. Labels aren’t enough; the agent needs beliefs about function and affordances (“this bore is misaligned,” “that fastener will gall unless I re‑thread”). That’s the core of what it means to 'See Beyond Sight': sight ≠ understanding ≠ anticipation.

  • Scale compute with difficulty. Adaptive “think time” beats fixed budgets when stakes are physical and failures are expensive.
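As a rough illustration of how these points combine, the sketch below runs a perceive → reason → predict → respond loop around a single belief (where the bore is) and spends more “think” iterations when the measured force disagrees more with the prediction. The function name, the noiseless linear force model, and all constants are assumptions for the sake of the example, not a recommended controller.

```python
def belief_revision_loop(
    true_offset_mm,
    stiffness_n_per_mm=10.0,
    tol_n=0.05,
    max_passes=20,
    max_think=8,
):
    """Toy closed loop: act on a hypothesis, compare with the sensor, revise."""
    belief_mm = 0.0   # hypothesis about where the bore is
    total_think = 0   # how much "reasoning" was spent overall
    for passes in range(1, max_passes + 1):
        commanded_mm = belief_mm  # respond: act on the current belief
        # Perceive: simulated lateral force (assumed linear and noiseless).
        measured_n = stiffness_n_per_mm * (true_offset_mm - commanded_mm)
        # Predict: if the belief were right, the lateral force would be zero.
        surprise_n = abs(measured_n)
        if surprise_n < tol_n:
            return belief_mm, passes, total_think
        # Reason: think longer when the surprise is larger (adaptive budget).
        implied_mm = commanded_mm + measured_n / stiffness_n_per_mm
        think_iters = min(max_think, 1 + int(surprise_n))
        for _ in range(think_iters):
            belief_mm += 0.5 * (implied_mm - belief_mm)
        total_think += think_iters
    return belief_mm, max_passes, total_think


if __name__ == "__main__":
    print("hard case:", belief_revision_loop(0.8))
    print("easy case:", belief_revision_loop(0.1))
```

In this toy setup the badly misaligned case earns a larger think budget up front, while the nearly aligned case gets by with small corrections.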


So is HRM “the answer”? It’s too early to crown any single architecture, and some results will keep getting poked and prodded by the community, which is healthy. But the direction is right: robots need opinionated world models that can revise themselves in‑flight. Give them the right to hold a hypothesis about the scene, test it, and change course when the torque curve or force trace disagrees. That’s how we cross the gap I have consistently argued we must cross: from seeing to *understanding* to *anticipation* to *action*.

SOURCES:

[1]: https://arxiv.org/abs/2506.21734 "Hierarchical Reasoning Model"

[2]: https://arxiv.org/abs/1807.03819 "Universal Transformers"

[3]: https://arcprize.org/blog/hrm-analysis "The Hidden Drivers of HRM's Performance on ARC-AGI"

