The literal recorded outcome Mark described

VTR - Virtual Test Record - is the exact answer to the third deficiency.

Mark named three deficiencies. The third was 'no system for recording the outcome'. The Virtual Test Record IS the outcome record - requirement linked to test, test linked to result, result linked to evidence (screenshot, video, DOM). Auditor-ready by construction.

The problem - in Mark Walker's words
"We and every other software company in the world are outstripping our ability to test what we're building."

Why now: the velocity of agentic coding has decoupled from the velocity of testing, auditing, and validation - that is, from the knowledge and proof that AI agents did what they were tasked to do (here, testing). An AI agent can produce more code in a day than a team used to write in a sprint. The test, audit, and compliance layers have not sped up at the same rate. The gap is structural and widens with every model release.

Three deficiencies - in every company today - that no software addresses:

  • determining which tests need to run for a particular release
  • checking whether they ran
  • recording the outcome

Mark Walker, nue.io - meeting transcript [00:46:36]

VTR with Playwright proves the user-facing behaviour ran and produced the expected result. The Virtual Test Record is literally the recorded outcome the third deficiency describes.

VTR is the literal answer to deficiency 3

Mark's third deficiency: 'no system for recording the outcome'. The Virtual Test Record is exactly that record, formalised: each requirement linked to a specific UI test, the test linked to a result, the result linked to evidence (screenshot, video, DOM snapshot).

Where a regulator asks 'how do you prove this UI behaves as documented?' (for example under FDA SaMD or EU AI Act Article 13), the VTR is the answer. One row per requirement, all evidence linked, signed and timestamped.
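A minimal sketch of what one VTR row could look like, assuming a simple in-memory shape. The source does not define a schema, so every type and field name here (VtrRow, Evidence, makeVtrRow, the REQ-001 identifier, the file path) is hypothetical. Signing is left out.

```typescript
// Hypothetical shape of one Virtual Test Record row; all names are illustrative.
type EvidenceKind = "screenshot" | "video" | "dom-snapshot";

interface Evidence {
  kind: EvidenceKind;
  path: string; // where the captured artifact is stored
}

interface TestResult {
  status: "passed" | "failed";
  finishedAt: string; // ISO 8601 timestamp
  evidence: Evidence[];
}

interface VtrRow {
  requirementId: string; // requirement linked to test
  testId: string;        // test linked to result
  result: TestResult;    // result linked to evidence
}

// Build one row, refusing results that carry no evidence so that
// every recorded outcome stays traceable to an artifact.
function makeVtrRow(requirementId: string, testId: string, result: TestResult): VtrRow {
  if (result.evidence.length === 0) {
    throw new Error(`No evidence attached for ${requirementId} / ${testId}`);
  }
  return { requirementId, testId, result };
}

const vtrRow = makeVtrRow("REQ-001", "ui-login-happy-path", {
  status: "passed",
  finishedAt: "2025-01-01T00:00:00Z",
  evidence: [{ kind: "screenshot", path: "evidence/REQ-001/login.png" }],
});
```

Rejecting evidence-free results at construction time is one way to read "auditor-ready by construction": a row that cannot be traced to an artifact never enters the record.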

Playwright addresses deficiencies 1 and 2 for the UI layer

CODITECT generates Playwright tests from the requirement spec - selectors, assertions, screenshot points, all named for the requirement they verify. Per change, only the UI tests touching the changed surface are selected.
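The per-change selection step could be sketched as below. This assumes each generated test carries metadata naming its requirement and the UI surfaces it touches. The UiTest shape, the selectTests function, and the route names are hypothetical, not CODITECT's actual format.

```typescript
// Hypothetical metadata attached to each generated UI test.
interface UiTest {
  id: string;
  requirementId: string;
  surfaces: string[]; // routes or components the test exercises
}

// Keep only the tests that touch at least one changed surface.
function selectTests(tests: UiTest[], changedSurfaces: string[]): UiTest[] {
  const changed = new Set(changedSurfaces);
  return tests.filter((t) => t.surfaces.some((s) => changed.has(s)));
}

const catalogue: UiTest[] = [
  { id: "ui-login", requirementId: "REQ-001", surfaces: ["/login"] },
  { id: "ui-checkout", requirementId: "REQ-002", surfaces: ["/cart", "/checkout"] },
  { id: "ui-profile", requirementId: "REQ-003", surfaces: ["/profile"] },
];

// A change to /checkout selects only the checkout test.
const selectedIds = selectTests(catalogue, ["/checkout"]).map((t) => t.id);
```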

Playwright opens a real browser (Chromium, Firefox, WebKit), executes the user flow, and captures evidence at every assertion. A failed run includes a video replay and a DOM snapshot at the point of failure.
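One way to get this behaviour from stock Playwright is a configuration along these lines. It is a sketch, not the product's actual setup: the testDir value is an assumption, and the built-in screenshot option captures once at the end of each test, so per-assertion captures would need explicit page.screenshot() calls in the generated tests.

```typescript
// playwright.config.ts - illustrative configuration fragment
export default {
  testDir: "./tests", // assumed location of the generated tests
  use: {
    screenshot: "on",            // screenshot after every test
    video: "retain-on-failure",  // keep the video only when a test fails
    trace: "retain-on-failure",  // trace includes DOM snapshots for failed runs
  },
  projects: [
    { name: "chromium", use: { browserName: "chromium" } },
    { name: "firefox", use: { browserName: "firefox" } },
    { name: "webkit", use: { browserName: "webkit" } },
  ],
};
```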

Self-correcting selectors, recorded

When a UI change breaks a selector, the agent proposes a corrected selector and re-runs the test, with both the old and new selector versions logged. The fix iteration itself is part of the recorded outcome.
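The correction loop can be sketched as follows, assuming the test runner and the agent's suggestion are passed in as plain functions. runWithCorrection, SelectorRevision, and the selectors are hypothetical names. The sketch is synchronous for brevity, whereas real Playwright calls are asynchronous.

```typescript
// Hypothetical log entry for one selector correction.
interface SelectorRevision {
  testId: string;
  oldSelector: string;
  newSelector: string;
  rerunPassed: boolean;
}

// runTest and proposeSelector are stand-ins supplied by the caller: the first
// would drive the browser, the second would return the agent's suggestion.
function runWithCorrection(
  testId: string,
  selector: string,
  runTest: (selector: string) => boolean,
  proposeSelector: (broken: string) => string,
  log: SelectorRevision[],
): boolean {
  if (runTest(selector)) return true; // nothing to correct, nothing to log
  const candidate = proposeSelector(selector);
  const rerunPassed = runTest(candidate);
  // Record both selector versions whether or not the re-run passes.
  log.push({ testId, oldSelector: selector, newSelector: candidate, rerunPassed });
  return rerunPassed;
}

const revisions: SelectorRevision[] = [];
const healed = runWithCorrection(
  "ui-login",
  "#submit",
  (s) => s === "[data-testid=login-submit]", // fake page: only the new selector matches
  () => "[data-testid=login-submit]",
  revisions,
);
```

Logging the revision even when the re-run fails keeps the failed fix attempt in the recorded outcome as well.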

Coverage report: which requirements have tests, which lack them. The Requirements Traceability Matrix shows 100% coverage or names the gap by requirement ID.
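The coverage calculation behind such a report could look like this. The function and field names are hypothetical, and an empty requirement list is reported as 0% here rather than a vacuous 100%.

```typescript
interface CoverageReport {
  covered: string[]; // requirement IDs with at least one test
  gaps: string[];    // requirement IDs with no test
  percent: number;
}

function computeCoverage(requirementIds: string[], testedRequirementIds: string[]): CoverageReport {
  const tested = new Set(testedRequirementIds);
  const covered = requirementIds.filter((id) => tested.has(id));
  const gaps = requirementIds.filter((id) => !tested.has(id));
  const percent =
    requirementIds.length === 0 ? 0 : (covered.length / requirementIds.length) * 100;
  return { covered, gaps, percent };
}

// Four requirements, three with tests: the gap is named by requirement ID.
const coverageReport = computeCoverage(
  ["REQ-001", "REQ-002", "REQ-003", "REQ-004"],
  ["REQ-001", "REQ-003", "REQ-004"],
);
```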