The substrate that addresses the three deficiencies
A harness that determines which tests to run, verifies that they ran, and records the outcome, regardless of IDE, model, or host.
The platform addresses the three deficiencies Mark Walker named, across any IDE, any AI model, and any deployment lane. Switching tools does not change the audit record. The recorded outcome follows the work.
"We and every other software company in the world are outstripping our ability to test what we're building."
Why now: agentic coding now moves faster than the testing, auditing, and validation that prove AI agents did what they were tasked to do (in this case, testing). An AI agent can produce more code in a day than a team used to write in a sprint. The test, audit, and compliance layers did not get faster at the same rate. The gap is structural and widens with every model release.
Three deficiencies - in every company today - that no software addresses:
- determining which tests need to run for a particular release
- checking whether they ran
- recording the outcome
Mark Walker, nue.io - meeting transcript [00:46:36]
The platform is the substrate that addresses the three deficiencies regardless of which IDE, AI model, or host the work runs on.
Absence 1 - which tests run, regardless of IDE or model
The platform runs underneath the IDE. Whether the operator uses VS Code, JetBrains, Cursor, Vim, or a browser-based IDE, the same selection logic decides which tests apply to a given change. Switching IDE does not change the answer.
Today, Anthropic Claude is the primary execution path, through Claude Code. Per-task routing to Gemini, Codex, Kimi, and local open-source models is specified by the unified-LLM architecture in ADR-122. The test selection logic is independent of which model executes the work.
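As an illustration of selection logic that does not depend on the IDE or model, here is a minimal sketch in which the only input is the set of changed files. The rule table, the paths, and the `select_tests` function are hypothetical and are not the platform's actual implementation.

```python
from fnmatch import fnmatch

# Hypothetical mapping from changed-path patterns to the tests they require.
SELECTION_RULES = {
    "billing/*": ["tests/test_billing.py", "tests/test_invoices.py"],
    "auth/*": ["tests/test_auth.py"],
    "*.sql": ["tests/test_migrations.py"],
}


def select_tests(changed_files, rules=SELECTION_RULES):
    """Return the sorted list of tests that apply to the changed files.

    The function takes no IDE or model argument, so the answer cannot
    vary with either of them.
    """
    selected = set()
    for path in changed_files:
        for pattern, tests in rules.items():
            if fnmatch(path, pattern):
                selected.update(tests)
    return sorted(selected)


change = ["billing/tax.py", "db/001_init.sql"]
print(select_tests(change))
```

Because the change is the only input, the same change yields the same test list whether it was authored in VS Code or Vim, by Claude or by another model.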
Absence 2 - they ran, in any deployment lane
Tests execute on whichever lane the project uses. A local AI workstation (LM Studio, on-prem) is supported today. Managed cloud workstation (Google Cloud Workstations on GCP) and hybrid lanes are part of the deployment architecture per ADR-002. The execution environment is recorded with the result.
Provider-failover routing is part of the unified-LLM component per ADR-122. The test selection and recorded-outcome layers are independent of which provider handles a given call, so a routing change does not change the audit record.
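One way to picture "they ran" is a record that stores the lane and provider alongside the results but derives the outcome only from the required tests and what actually executed. The sketch below assumes a hypothetical `RunRecord` shape; the field names and the lane and provider labels are illustrative and do not come from ADR-002 or ADR-122.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """One recorded test run. All field names are hypothetical."""
    change_id: str
    lane: str        # execution environment, stored with the result
    provider: str    # which LLM provider handled the call
    required: list   # tests the selection step said must run
    results: dict    # test name -> "pass" or "fail", for tests that ran

    def missing(self):
        """Required tests with no recorded result, i.e. tests that did not run."""
        return sorted(set(self.required) - set(self.results))

    def outcome(self):
        """'incomplete' if a required test did not run, else 'fail' or 'pass'."""
        if self.missing():
            return "incomplete"
        if any(self.results[t] == "fail" for t in self.required):
            return "fail"
        return "pass"


required = ["tests/test_auth.py", "tests/test_billing.py"]
results = {"tests/test_auth.py": "pass", "tests/test_billing.py": "pass"}

local = RunRecord("chg-001", "local-workstation", "provider-a", required, results)
failover = RunRecord("chg-001", "local-workstation", "provider-b", required, results)
partial = RunRecord("chg-002", "cloud-workstation", "provider-a", required,
                    {"tests/test_auth.py": "pass"})

print(local.outcome(), failover.outcome(), partial.outcome(), partial.missing())
```

The lane and provider are kept as recorded context, but `outcome()` never reads them, so a failover from one provider to another leaves the recorded outcome unchanged.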
Absence 3 - the outcome follows the work, not the tool
The audit trail is not a feature of any one tool the operator picks. It is the substrate the platform writes to. Switching IDE, model, or host swaps the surface; the recorded outcome stays in the same database, with the same schema, queryable by the same SQL.
This decouples compliance from vendor choice. A regulator asking "show every AI-assisted change in 2026" gets the same answer whether the team used three IDEs and four models or one of each.
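The regulator's question can be sketched as a single query over one table. The example below uses an in-memory SQLite database with a hypothetical `audit_record` schema and made-up rows; the platform's real schema and database engine are not described in this document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: tool choices are ordinary columns, not separate stores.
conn.execute(
    """
    CREATE TABLE audit_record (
        change_id   TEXT,
        recorded_at TEXT,  -- ISO 8601 date, so string comparison orders correctly
        ide         TEXT,
        model       TEXT,
        lane        TEXT,
        outcome     TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO audit_record VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("chg-001", "2026-01-14", "VS Code", "model-a", "local", "pass"),
        ("chg-002", "2026-06-02", "Cursor", "model-b", "cloud", "fail"),
        ("chg-003", "2025-12-30", "Vim", "model-a", "local", "pass"),
    ],
)


def changes_in_year(conn, year):
    """Return (change_id, outcome) for every change recorded in the given year."""
    return conn.execute(
        "SELECT change_id, outcome FROM audit_record "
        "WHERE recorded_at >= ? AND recorded_at < ? "
        "ORDER BY recorded_at",
        (f"{year}-01-01", f"{year + 1}-01-01"),
    ).fetchall()


print(changes_in_year(conn, 2026))
```

The query filters only on the date, so the IDE, model, and lane columns can hold any mix of values without changing how the question is asked.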