September 18, 2025

Seeing Beyond Sight: Our Founder Takes the Slush Stage

Artificial intelligence today can “see” but it doesn’t truly understand. Spatial Support is building the bridge from raw perception to real reasoning — perceiving, predicting, and responding with context. Just as a watchmaker sees more than gears, we’re pushing AI toward opinionated, informed perspectives that move beyond recognition into comprehension. This is the chasm we must cross if we’re to unlock super-intelligent AI.

Highlights from Spatial Support’s debut on the Slush stage in Penang, Malaysia, sharing how we turn 3D data into live conversations.

Seeing Beyond Sight

People assume that if AI can see, it must also understand. It doesn’t. Recognition is not comprehension; pixels are not opinions. Sight ≠ understanding ≠ anticipation. That sequence is the whole point of my talk, and it’s the gap we have to close to build real intelligence.

The difference between seeing and knowing

A child looks at an exploded watch and sees shiny pieces. Most adults see “watch gears.” A modern multimodal model sees “watch parts, metal, gears.” A watchmaker sees the Zenith El Primero and immediately infers tolerances, function, and failure modes. Same image; radically different depth of knowledge.

That depth is the iceberg we ignore: a thin sliver of visible recognition above the waterline, a mass of tacit structure below—mechanics, causality, affordances, history. Until AI operates below the line—where concepts connect, predict, and constrain—it will remain a clever labeler rather than a thinker.

The chasm we must cross

Merely seeing cannot carry us across the gap to super-intelligent systems. Crossing the chasm means moving from perception to an opinionated perspective—models that don’t just name objects, but reason about what they are for, what happens next, and what to do now. In practice, this is a loop:

Perceive → Reason → Predict → Respond.

If any link is missing, the loop collapses. Symbolic blocks without data crack under reality’s noise. Pure pattern matchers hallucinate coherence. We need grounded models that can absorb structure from the world and update their beliefs through experience.
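
As a rough illustration, here is a minimal Python sketch of that loop. Every name in it (WorldModel, the shape of an observation) is invented for exposition; this is not Spatial Support's actual API.

    # Minimal sketch of the perceive -> reason -> predict -> respond loop.
    # All names here are illustrative, not a real Spatial Support interface.
    from dataclasses import dataclass, field

    @dataclass
    class WorldModel:
        """Holds beliefs about a scene and revises them with evidence."""
        beliefs: dict = field(default_factory=dict)

        def perceive(self, observation: dict) -> None:
            # Absorb structure from the world: merge new evidence into beliefs.
            self.beliefs.update(observation)

        def reason(self) -> dict:
            # Form an interpretation from the current, non-empty beliefs.
            return {k: v for k, v in self.beliefs.items() if v is not None}

        def predict(self, interpretation: dict) -> str:
            # Simulate the near future (stubbed: flag any reported wear).
            return "service soon" if "worn" in interpretation.values() else "nominal"

        def respond(self, prediction: str) -> str:
            # Ground the reply in the prediction rather than in raw pixels.
            return f"assessment: {prediction}"

    model = WorldModel()
    for obs in ({"mainspring": "intact"}, {"third_wheel": "worn"}):
        model.perceive(obs)                        # Perceive
        view = model.reason()                      # Reason
        print(model.respond(model.predict(view)))  # Predict -> Respond

Run as written, this prints "assessment: nominal" and then "assessment: service soon": the accumulated belief state, not any single frame, drives the answer.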

Opinionated AI

“Why can’t AI see what we see?” is the wrong question. The right one: why doesn’t AI have an informed, opinionated perspective on what it sees? Humans don’t look at frames; we carry schemas—physics, intentions, parts and assemblies—that let us compress the present and simulate the near future. That’s why a watchmaker can diagnose a movement at a glance, and a service tech can hear a bearing fail before it fails. We want machines to form those kinds of opinions.

Children acquire roughly 35,000 concepts by age seven—not as disconnected labels, but as a scaffold for reasoning. That’s the bar. Models must build and revise internal structure fast enough to make useful predictions and safe decisions.

Why 3D is the right proving ground

3D is unforgiving: geometry encodes function. Parts fit or they don’t. Assemblies operate or they seize. That’s exactly why we start here. When a system understands a valve body or a gearbox as an object with constraints, it can answer real questions, guide real actions, and learn from outcomes without drifting into fantasy.
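
To make "geometry encodes function" concrete, here is a toy Python check. The part names and dimensions are invented, and real CAD carries far richer constraints, but the principle is the same: the numbers decide whether the assembly can work at all.

    # Toy example: a pivot runs in a jewel bore only if the clearance is
    # positive but small. Dimensions are invented for illustration.
    from dataclasses import dataclass

    @dataclass
    class Part:
        name: str
        bore_mm: float    # inner diameter of the mating hole (0 if none)
        pivot_mm: float   # outer diameter of the mating shaft (0 if none)

    def fits(shaft: Part, housing: Part) -> bool:
        """Accept only a running clearance between 0.01 mm and 0.05 mm."""
        clearance = housing.bore_mm - shaft.pivot_mm
        return 0.01 <= clearance <= 0.05

    wheel = Part("center wheel", bore_mm=0.0, pivot_mm=0.30)
    jewel = Part("jewel", bore_mm=0.32, pivot_mm=0.0)
    print(fits(wheel, jewel))  # True: 0.02 mm of clearance

A labeler can call both pieces "watch parts"; only a model that carries the constraint can say whether they function together.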

What we’re building at Spatial Support

We turn geometry + documentation into live, conversational scenes—opinionated world models you can interrogate. Our pipeline ingests CAD and supporting docs, links symbols to structure, and serves an interactive experience where the agent can perceive, reason, predict, and respond in context. It is not “a chatbot with pictures.” It is a reasoning loop wrapped around a physical object and its tasks.

  1. Perceive: parse parts, assemblies, constraints.

  2. Reason: map functions, procedures, and causal chains.

  3. Predict: simulate likely outcomes and failure modes.

  4. Respond: explain, troubleshoot, recommend, and execute next steps.

This is how we go from static frames to actionable understanding—how we give AI the right to hold an opinion about the world and defend it with evidence.
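
As a hypothetical fragment of steps 1 and 2 above, assuming simple dict-shaped CAD and manual data (the identifiers V-101 and P-7 are invented), linking documentation symbols to geometric structure might look like this:

    # Hypothetical sketch: link manual text to CAD structure so a question
    # is answered against the object, not just against free text.
    cad = {"V-101": {"type": "valve body", "ports": 3, "upstream_of": "P-7"}}
    docs = {"V-101": "Isolation valve; close before servicing pump P-7."}

    def answer(part_id: str) -> str:
        geometry = cad.get(part_id)
        if geometry is None:
            return f"{part_id}: unknown part"
        note = docs.get(part_id, "no documentation linked")
        return f"{part_id} ({geometry['type']}, {geometry['ports']} ports): {note}"

    print(answer("V-101"))
    # -> V-101 (valve body, 3 ports): Isolation valve; close before servicing pump P-7.

In production the linking is far more involved, but the shape is the point: the agent's answer is grounded in structure it can check.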

Where this goes

Once machines can form and test opinions about structured worlds, we get safer automation, better training, faster diagnosis, and sales experiences that teach rather than tease. That’s not “sharper eyes.” It’s deeper understanding—the missing ingredient between today’s perception demos and tomorrow’s truly intelligent systems.


We’re building that bridge now. Seeing is table stakes. Understanding is the product.

YOUR FIRST STEP

Let's do a quick meeting.

"The way people buy products online hasn't changed in 25 years- We're here to change that."

Joseph Douglas

CEO & Founder
