SketchUp Agent Harness Technical Architecture: Why the Agent CLI Is Only the Entry Point

If you only look at the user-facing entry point, SketchUp Agent Harness can sound like “Codex or Claude controls SketchUp.” That description is not wrong, but it is too shallow for developers.

A more useful description is this: Codex and Claude are entry points. The interesting part is the harness in the middle, the part that connects natural language, project state, tool calls, geometric operations, validation results, and SketchUp execution.

The core architecture looks like this:

agent CLI
-> product runtime skills
-> MCP server
-> project workspace / design_model.json
-> bridge execution trace
-> SketchUp Ruby bridge
-> SketchUp scene
-> execution feedback
-> design_model.json

If you are building a local agent product for a professional desktop application, CAD tool, modeling tool, design tool, or another host application, this chain matters more than whether the model can produce a convincing reply to a prompt.

The agent CLI should not be the product core

SketchUp Agent Harness (SAH) supports Claude and Codex. If it supports another agent CLI later, the product core should not need to be rewritten.

The reason is simple: each CLI has its own plugin mechanism, project rules, skill locations, session model, and permission model. Those differences should not change how a SketchUp model is described, validated, executed, or repaired.

The reusable core of SAH is not a specific CLI. It is:

MCP server
SketchUp Ruby bridge
design_model.json schema
component metadata
runtime skills
project workspace convention
protocol and spatial rules

CLI-specific code should stay as an adapter. Its job is to connect the current agent tool to the same core capability surface, not to turn product behavior into Claude-only or Codex-only branches.

This boundary matters. Without it, expanding from one agent CLI to another quickly creates multiple behavior models, multiple documentation paths, multiple test surfaces, and multiple runtime branches.
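
The adapter boundary can be sketched in a few lines of Python. This is a minimal illustration, not SAH's actual code: HarnessCore, CliAdapter, validate_for, the toy validation rule, and the placeholder skill locations are all hypothetical names introduced here. The sketch only shows the shape of the idea: the adapter carries what differs per CLI, while validation runs through one shared path.

```python
from dataclasses import dataclass
from typing import Protocol


class HarnessCore:
    """Hypothetical stand-in for the shared core (schema, rules, MCP tools)."""

    def validate(self, model: dict) -> list[str]:
        # Toy rule for illustration: a model must declare a list of walls.
        if not isinstance(model.get("walls"), list):
            return ["model.walls must be a list"]
        return []


class CliAdapter(Protocol):
    """Everything that is allowed to differ between agent CLIs."""

    name: str
    skill_location: str


@dataclass
class ClaudeAdapter:
    name: str = "claude"
    skill_location: str = "<claude-skill-location>"  # placeholder, not a real path


@dataclass
class CodexAdapter:
    name: str = "codex"
    skill_location: str = "<codex-skill-location>"  # placeholder, not a real path


def validate_for(adapter: CliAdapter, core: HarnessCore, model: dict) -> dict:
    # The adapter only labels the session; it never changes the validation result.
    return {"cli": adapter.name, "errors": core.validate(model)}


core = HarnessCore()
model = {"walls": []}
reports = [validate_for(a, core, model) for a in (ClaudeAdapter(), CodexAdapter())]
```

Both reports carry the same error list because neither adapter touches the validation logic. Supporting a third CLI in this sketch would mean adding one more small adapter class and nothing else.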

Why the project workspace matters

A common mistake is treating the product source repository as the user’s working surface.

For a designer, normal SAH usage should not mean cloning the product repository, opening the source tree, and editing internal files. The operating surface should be a clean design project directory.

That project directory owns the current design facts:

design_model.json
design_rules.json
component_library.json
assets.lock.json
imports/
snapshots/
project-local runtime skills
model.skp

The harness is installed outside the project. The product repository maintains the MCP server, Ruby bridge, runtime skill authoring source, specs, tests, and release artifacts. The user project owns the active working truth, rules, evidence, assets, and review artifacts.

This is not just directory hygiene. It decides what the agent should treat as fact.

If the agent operates from the product repository, it sees maintainer context. If it operates from the design project, it sees the current design context. A local agent harness has to keep those two surfaces separate.
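
As a rough sketch, a harness could check that it was started from a design project before doing anything else. The entry names below come from the list above. Which of them are mandatory, and the function name missing_workspace_entries, are assumptions made for illustration.

```python
from pathlib import Path

# Names taken from the workspace list above; treating all of them as
# expected (rather than optional) is an assumption of this sketch.
EXPECTED_FILES = (
    "design_model.json",
    "design_rules.json",
    "component_library.json",
    "assets.lock.json",
)
EXPECTED_DIRS = ("imports", "snapshots")


def missing_workspace_entries(project_dir: Path) -> list[str]:
    """Return the expected workspace entries that are absent from project_dir."""
    missing = [name for name in EXPECTED_FILES if not (project_dir / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (project_dir / name).is_dir()]
    return missing
```

Calling missing_workspace_entries(Path.cwd()) at startup would return an empty list for a complete project directory and a list of absent entries otherwise, which gives the agent a concrete reason to stop instead of guessing at project state.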

Why the SketchUp scene is not the source of truth

SketchUp is the live execution view. It matters, but it is not a strong enough source of truth for agent reasoning.

A live scene does not naturally answer questions like:

Did this wall come from manual editing, an imported source, or an agent replay?
Where are this component’s dimensions, anchors, clearances, and license metadata?
Can the current model be regenerated from structured state?
If visual review finds a problem, which structured fact should change?
If execution failed, which operations succeeded and which were skipped?

SAH therefore puts design_model.json before the SketchUp scene. The structured model records spaces, walls, openings, components, rules, assumptions, imports, execution metadata, and version information. The SketchUp scene is an execution result of that truth, not the only fact layer.

That gives the agent more engineering leverage:

inspect project state without opening SketchUp;
derive a testable execution trace from the same truth;
write bridge entity IDs and operation metadata back into the structured truth;
use structured diff, validation, and repair;
perform clean replay instead of accumulating stale geometry.
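
To make the structured diff concrete, here is a minimal sketch that compares two versions of a component list without opening SketchUp. The record shape (an "id" field plus a "position") is an assumed example and not the real design_model.json schema, and diff_by_id is a hypothetical helper.

```python
def diff_by_id(old: list[dict], new: list[dict]) -> dict[str, list[str]]:
    """Compare two lists of records keyed by their "id" field."""
    old_by_id = {item["id"]: item for item in old}
    new_by_id = {item["id"]: item for item in new}
    shared = old_by_id.keys() & new_by_id.keys()
    return {
        "added": sorted(new_by_id.keys() - old_by_id.keys()),
        "removed": sorted(old_by_id.keys() - new_by_id.keys()),
        "changed": sorted(k for k in shared if old_by_id[k] != new_by_id[k]),
    }


# Assumed example data, not taken from a real project.
before = [
    {"id": "sofa-1", "position": [0, 0, 0]},
    {"id": "lamp-1", "position": [2, 1, 0]},
]
after = [
    {"id": "sofa-1", "position": [1, 0, 0]},   # moved
    {"id": "table-1", "position": [3, 3, 0]},  # new
]
changes = diff_by_id(before, after)
```

Here changes reports table-1 as added, lamp-1 as removed, and sofa-1 as changed. A live scene cannot answer that question on its own, because it holds the geometry but not the previous structured state.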

What the MCP server owns

The MCP server is the shared execution layer. It should not merely forward natural language to SketchUp.

It owns several responsibilities:

reading and validating the project workspace;
exposing tools the agent can call;
merging design rules and project state;
generating bridge execution traces;
planning headlessly when live SketchUp is not open;
executing the project model when the live bridge is available;
writing execution feedback back into structured truth.

This is why the architecture is not “the LLM directly controls the SketchUp UI.” Professional software state is too complex to leave everything to one prompt. The middle layer has to turn intent into inspectable project state and operation plans.
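
The plan-or-execute split can be sketched as follows. Everything here is an assumption made for illustration: the wall fields, the operation shape, the function names build_trace and plan_or_execute, and the idea that the bridge is reachable as a plain callable. The real MCP server and schema may look quite different.

```python
import copy
from typing import Callable, Optional


def build_trace(model: dict) -> list[dict]:
    """Derive one create_wall operation per wall, in model order."""
    return [
        {
            "op_id": f"op-{index}",
            "type": "create_wall",
            "wall_id": wall["id"],
            "start": wall["start"],
            "end": wall["end"],
        }
        for index, wall in enumerate(model.get("walls", []), start=1)
    ]


def plan_or_execute(
    model: dict, bridge: Optional[Callable[[dict], dict]] = None
) -> dict:
    trace = build_trace(model)
    if bridge is None:
        # Headless path: return the plan and leave the model untouched.
        return {"mode": "planned", "trace": trace, "model": model}

    updated = copy.deepcopy(model)
    walls_by_id = {wall["id"]: wall for wall in updated.get("walls", [])}
    for op in trace:
        result = bridge(op)
        # Execution feedback is written back next to the fact it belongs to.
        walls_by_id[op["wall_id"]]["execution"] = {
            "op_id": op["op_id"],
            "status": result["status"],
            "entity_id": result.get("entity_id"),
        }
    return {"mode": "executed", "trace": trace, "model": updated}
```

Because the trace is built the same way on both paths, a test can assert on the planned trace without a running SketchUp instance, and a fake bridge callable is enough to exercise the write-back logic.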

What the Ruby bridge owns

The Ruby bridge should have a narrower responsibility: execute structured operations inside SketchUp and return structured results.

It should not own product-level reasoning, and it should not know about the differences between agent CLIs. It should care about:

whether SketchUp is ready to execute;
whether an operation payload is valid;
whether geometry was created successfully;
whether failure can be attributed to a specific operation;
whether rollback is needed;
which entity IDs, spatial deltas, and model revision metadata should be returned.

This creates a stable protocol boundary between the Python MCP server and the Ruby bridge. For developers, that is more maintainable than asking the model to improvise Ruby scripts directly.
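
Seen from the Python side, the protocol boundary might look like the sketch below. The reply format, the three status values, the OpResult fields, and the rollback policy are assumptions chosen to mirror the concerns listed above. They are not the actual SAH bridge protocol, and the sample reply is made up.

```python
import json
from dataclasses import dataclass
from typing import Optional

ALLOWED_STATUSES = {"ok", "failed", "skipped"}


@dataclass(frozen=True)
class OpResult:
    op_id: str
    status: str
    entity_id: Optional[str] = None
    error: Optional[str] = None


def parse_results(raw: str) -> list[OpResult]:
    """Parse the bridge's JSON reply and reject statuses outside the protocol."""
    results = []
    for item in json.loads(raw):
        if item.get("status") not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status in bridge reply: {item!r}")
        results.append(
            OpResult(item["op_id"], item["status"], item.get("entity_id"), item.get("error"))
        )
    return results


def summarize(results: list[OpResult]) -> dict:
    """Group operation IDs by status and decide whether rollback is needed."""
    by_status = {status: [] for status in sorted(ALLOWED_STATUSES)}
    for result in results:
        by_status[result.status].append(result.op_id)
    # Assumed policy: any failed operation triggers a rollback request.
    return {**by_status, "rollback_needed": bool(by_status["failed"])}


# Made-up reply for illustration only.
reply = json.dumps([
    {"op_id": "op-1", "status": "ok", "entity_id": "1042"},
    {"op_id": "op-2", "status": "failed", "error": "face creation returned nil"},
    {"op_id": "op-3", "status": "skipped"},
])
summary = summarize(parse_results(reply))
```

With a typed reply like this, a failure is attributed to op-2 specifically, and the server can tell which operations ran and which were skipped without parsing free-form output from an improvised script.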

Runtime skills are product surface, not maintainer notes

SAH also has an important skill boundary.

Product runtime skills live in the product’s skill authoring source, but they should be installed through the supported Claude, Codex, or future agent CLI mechanisms so the user-facing tool can load them. They are part of the designer workflow: spatial planning, component search, semantic placement, floor plan import, visual feedback, and layout validation.

Those are not the same as maintainer skills used while developing the product.

Maintainer skills support product engineering. Product runtime skills support designer usage. Dynamic runtime skills, scoped to a project or session, support one specific design project by capturing things like an imported source legend, a designer correction, or a project preference.

Mixing the three creates two problems:

maintainer implementation detail leaks into user runtime;
one project’s local memory gets mistaken for product baseline behavior.

Developers should treat skills as part of product architecture, not as a loose prompt folder.
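
One way to keep the three kinds apart is to make the scope explicit in data instead of relying on folder conventions. The sketch below is hypothetical: SkillScope, Skill, skills_for_session, and the project_id field are illustrative names, not SAH's skill format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SkillScope(Enum):
    MAINTAINER = "maintainer"  # used while developing the product
    PRODUCT = "product"        # shipped baseline for designer workflows
    PROJECT = "project"        # local to one design project or session


@dataclass(frozen=True)
class Skill:
    name: str
    scope: SkillScope
    project_id: Optional[str] = None  # only meaningful for PROJECT scope


def skills_for_session(skills: list[Skill], project_id: str) -> list[str]:
    """Names of the skills a designer session in project_id should load."""
    loaded = []
    for skill in skills:
        if skill.scope is SkillScope.PRODUCT:
            loaded.append(skill.name)
        elif skill.scope is SkillScope.PROJECT and skill.project_id == project_id:
            loaded.append(skill.name)
        # MAINTAINER skills and other projects' skills are never loaded here.
    return loaded
```

A filter like this addresses both problems named above: maintainer skills never reach a designer session, and a project-scoped skill is only visible inside the project that created it.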

Verification should not rely only on live demos

When an agent harness controls a desktop application like SketchUp, a live demo matters, but it should not be the only verification strategy.

A stronger strategy is layered:

unit and contract checks against the source checkout;
project initialization, state inspection, validation, and trace planning without opening SketchUp;
bridge installation dry run;
live bridge integration;
release smoke test;
installed-package smoke test.

The architectural point is that users do not normally run the product from the maintainer’s source path. They install the harness, install the bridge, create a project directory, and start the agent CLI from their own project.

Passing checks against the source checkout does not prove the installed product works. Runtime skills, bridge files, plugin manifests, templates, and CLI entry points can all fail at the packaging boundary.
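
The layering can be expressed as data so that a test report always says which layers ran and which did not. The layer names follow the list above. The requirement tags (live_sketchup, release_artifact, installed_package) and both function names are assumptions made for this sketch.

```python
# Layer names follow the list above; the requirement tags are assumptions.
LAYERS: list[tuple[str, frozenset[str]]] = [
    ("source unit and contract checks", frozenset()),
    ("headless project checks", frozenset()),
    ("bridge installation dry run", frozenset()),
    ("live bridge integration", frozenset({"live_sketchup"})),
    ("release smoke", frozenset({"release_artifact"})),
    ("installed package smoke", frozenset({"installed_package"})),
]


def runnable_layers(available: set[str]) -> list[str]:
    """Return, in order, the layers whose requirements are all available."""
    return [name for name, needs in LAYERS if needs <= available]


def skipped_layers(available: set[str]) -> list[str]:
    """Return the layers that cannot run, so a report never hides them."""
    return [name for name, needs in LAYERS if not needs <= available]
```

On a machine without SketchUp or a built package, runnable_layers(set()) returns only the first three layers, and skipped_layers(set()) names the other three. Reporting the skipped layers explicitly keeps a green source checkout run from being read as proof that the installed product works.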

Where this pattern generalizes

SAH’s host application is SketchUp, but the architecture is not SketchUp-specific.

Any local agent product that controls a complex host application may need a similar split:

agent CLI as the entry point;
runtime skills for domain workflow;
MCP server or equivalent tool layer;
project workspace as the active operating surface;
structured model as editable truth;
execution trace connecting truth to the host application;
bridge layer executing host operations;
visual or live output as execution and review artifact;
verification across headless, live, and installed-package paths.

That is why “the agent CLI is only the entry point” is the right first article for the SAH technical architecture series. The developer question is not whether Codex or Claude is better at chatting. The more durable question is how to turn a professional application into a verifiable, executable, and repairable agent harness.

Source trace

sketchup-agent-harness:README.md
sketchup-agent-harness:AGENTS.md
sketchup-agent-harness:docs/adr/0001-agent-harness-boundary.md
sketchup-agent-harness:DEVELOPMENT.md
How Codex Can Drive Verifiable SketchUp Modeling
Project Workspace As Agent Operating Surface
Structured Execution Trace For Agent Harnesses
Natural Language CAD Agent Harness