Verification before trust: FTT and Poka Yoke

Quality is not a judgment call made at the end. Why Rocket Routine OS relies on verification, FTT, and Poka Yoke so nothing ships unchecked.

31 May 2026

Ask anyone in a company whether a particular piece of work is good, and the answer usually arrives as a feeling. "Looks good." "Fine." "Ship it." Someone glances at it, nods, and the thing is approved. In that model, quality is a judgment one person makes in passing (a person who sometimes has no real subject-matter competence at all, just the epaulettes on their shoulders).

It works as long as that one person checks every result themselves and applies the same standard every time. It breaks the moment more work flows through than a single head can look at. Then "done" depends on who looked and how much time they had. Sometimes the bar is strict, sometimes loose, and nobody can say how good the work really was, because there is no definition of "good" that exists independently of the reviewer. Quality that rests on a judgment call is not steerable. It is an opinion with a timestamp.

The moment AI operators start producing real work, this model becomes untenable for good. An operator generates more output in an hour than a human can review in detail in a day. If the only quality safeguard is a person glancing over the result at the end, then quality is exactly as good as that person's attention in that exact moment. That is not quality assurance, that is hope with an approval button.

Verification precedes trust

Rocket Routine OS is built on a hard principle: nothing ships without quality confirmation. No AI operator output, no handoff, no work item leaves its stage until it has been checked against defined criteria and confirmed to meet the standard.

That sounds like a detail, but it reverses the usual order. Normally, trust comes first and then (maybe!) a spot check follows. An employee is considered reliable, so their output stops being questioned. An AI operator "runs pretty well by now," so it gets left alone. Verification becomes something you do only after something has already gone wrong.

Rocket Routine OS flips that. Trust is not the precondition for shipping. Trust is the result of verification being passed, again and again.

Trust is not an input to quality. Trust is its output.

Quality confirmation is therefore not a judgment call and not a nice extra at the end. It is a duty anchored in every AI operator's Role Contract, with defined criteria and defined evidence types. The operator doesn't get to claim its work is good. It has to show that the work meets the criteria.

FTT (First Time Through): quality becomes a number

Verification as a principle isn't enough as long as nobody measures how often it actually gets passed. That is what the central quality metric in Rocket Routine OS is for: FTT, First Time Through.

FTT measures the share of work items that pass quality confirmation without rework. Not "fine after two correction loops." Not "broadly usable." Passed on the first run, no rework. A number, not a feeling.
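As a minimal sketch of that definition, assume each work item records how many correction loops it needed before passing quality confirmation. The `WorkItem` type and its field names are hypothetical and only serve the illustration; they are not taken from Rocket Routine OS.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    # Hypothetical record; the field names are assumptions for illustration.
    item_id: str
    passed: bool        # did the item eventually pass quality confirmation?
    rework_loops: int   # correction loops needed before the final verdict


def first_time_through(items: list[WorkItem]) -> float:
    """Share of items that passed quality confirmation with zero rework."""
    if not items:
        raise ValueError("FTT is undefined for an empty set of work items")
    first_pass = sum(1 for item in items if item.passed and item.rework_loops == 0)
    return first_pass / len(items)


items = [
    WorkItem("a", passed=True, rework_loops=0),
    WorkItem("b", passed=True, rework_loops=2),   # passed after two loops: does not count
    WorkItem("c", passed=True, rework_loops=0),
    WorkItem("d", passed=False, rework_loops=1),  # never passed: does not count
]
print(f"FTT: {first_time_through(items):.0%}")
```

The denominator is every item that entered quality confirmation, not only the ones that eventually passed. Otherwise rework and rejections would drop out of the number they are supposed to expose.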

The idea behind it isn't new. Philip Crosby formulated "do it right the first time" back in the 1960s, long before anyone was thinking about AI operators. His point was simple and uncomfortable: quality is not what's left over after you inspect and sort out the defects at the end, it's what gets done right from the start. In that view, rework isn't a normal part of the work, it's the measurable price of something missing the standard on the first attempt. FTT is exactly that idea as a number: the share of work that was right the first time.

That changes the conversation about quality at the root. "Looks good" can't be compared, can't be tracked over weeks, can't be attributed to a domain or an operator. An FTT of 72 percent, by contrast, is a fact. It tells you that nearly three out of ten results needed rework, and it forces the next question onto the table: why those three?

As long as quality is a feeling, you can talk about it. Only as a number can you improve it.

FTT is also the metric that decides whether an AI operator takes on more responsibility. If it holds steadily above a defined threshold in a domain, that justifies the next step. If it drops, a downgrade is the right answer. Data beats opinions, especially on the question of how much execution you hand to whom.
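One way to read that rule, sketched under assumptions: the thresholds (0.90 and 0.75), the three-week stability window, and the function name `responsibility_decision` are all made up for illustration. The article does not state the actual values or mechanism.

```python
def responsibility_decision(
    weekly_ftt: list[float],
    promote_at: float = 0.90,     # assumed threshold, not from the article
    demote_below: float = 0.75,   # assumed threshold, not from the article
    stable_weeks: int = 3,        # assumed meaning of "stable"
) -> str:
    """Return 'promote', 'downgrade', or 'hold' for one operator in one domain."""
    if not weekly_ftt:
        return "hold"
    # A drop in the most recent week is acted on immediately.
    if weekly_ftt[-1] < demote_below:
        return "downgrade"
    # Promotion requires the threshold to be met for the whole window.
    recent = weekly_ftt[-stable_weeks:]
    if len(recent) == stable_weeks and all(v >= promote_at for v in recent):
        return "promote"
    return "hold"


print(responsibility_decision([0.91, 0.92, 0.93]))
print(responsibility_decision([0.91, 0.92, 0.70]))
print(responsibility_decision([0.80, 0.91, 0.92]))
```

The asymmetry is deliberate in this sketch: more responsibility has to be earned over several weeks, while a single bad week is enough to step back.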

Poka Yoke: making the error structurally impossible

A low FTT shows that something goes wrong repeatedly. The decisive question is what you do next. The usual reflex is to manage the person more tightly. More control, more reminders, more "pay closer attention." That is the most expensive and least reliable path, because it works against human nature and has to be mustered fresh on every single run.

Rocket Routine OS takes the other route, and it comes straight from manufacturing. The principle is called Poka Yoke: design the process so the error cannot occur in the first place. Not "remember to plug the cable in the right way," but a connector that only fits one way. The error is no longer caught, it is designed out.

Translated to knowledge work, that means: when a specific error recurs, you don't admonish the operator, you rebuild the routine. A mandatory check gets inserted, a step in the sequence gets enforced, an output missing a required field doesn't get accepted at all. An AI operator's Role Contract is designed on exactly this principle.
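The "missing required field" case is the easiest one to picture in code. The following is a sketch only: the field names, the `RejectedOutput` exception, and `accept_output` are hypothetical and do not describe how a Role Contract is actually implemented.

```python
# Hypothetical required fields for a derived post; assumptions for illustration.
REQUIRED_FIELDS = ("title", "body", "source_article")


class RejectedOutput(Exception):
    """Raised when an output cannot enter quality confirmation at all."""


def accept_output(output: dict) -> dict:
    """Gate in front of quality confirmation: incomplete outputs never get in."""
    missing = [field for field in REQUIRED_FIELDS if not output.get(field)]
    if missing:
        raise RejectedOutput(f"missing required fields: {', '.join(missing)}")
    return output


complete = {"title": "FTT explained", "body": "...", "source_article": "week-22"}
accept_output(complete)  # passes through unchanged

try:
    accept_output({"title": "FTT explained", "body": "..."})
except RejectedOutput as err:
    print(err)
```

The point of the design is that nobody has to remember the check. An incomplete output has no path to a reviewer, in the same way a keyed connector has no way to be plugged in upside down.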

Behind it sits the conviction that carries the whole system: processes are the problem, not people. When an error happens repeatedly, the actor isn't incompetent, the process is badly designed. That is not a kind attitude toward employees, it is the only version that scales. You can admonish a person. You can repair a process.

That is how the three parts lock together. Verification establishes that work gets checked. FTT makes visible how often the check is passed. Poka Yoke makes the most frequent failures structurally impossible the next time around. Verification without measurement stays a gesture. Measurement without Poka Yoke stays a diagnosis with no treatment.

Company 0

At Rocket Routine, the content marketing operator's FTT sat at around 72 percent for a while. That meant nearly three in ten outputs went back for rework. Instead of managing the operator more tightly, I looked at the distribution of the rework: not just how often something came back, but why.
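Looking at the distribution can be as simple as tagging every returned output with a cause and counting. A minimal sketch, assuming each rework event carries exactly one cause label; the labels and the sample list below are invented for illustration and are not the real rework log.

```python
from collections import Counter


def rework_distribution(causes: list[str]) -> list[tuple[str, int, float]]:
    """Return (cause, count, share of all rework), most frequent cause first."""
    counts = Counter(causes)
    total = sum(counts.values())
    return [(cause, n, n / total) for cause, n in counts.most_common()]


# Invented sample data, one label per output that went back for rework.
sample_log = ["scope_drift", "style", "scope_drift", "factual_accuracy"]

for cause, n, share in rework_distribution(sample_log):
    print(f"{cause:<18}{n:>3}  {share:.0%}")
```

The top row of that table is the candidate for a Poka Yoke. Everything below it can wait until the largest cause has been designed out.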

The single largest cause wasn't style and wasn't factual accuracy. It was scope drift. In derived posts, the operator kept pulling in concepts the week's source article didn't cover. A LinkedIn post about Adoption Levels suddenly mentioned the Control Tower. An X post brought in OMPRIKL, even though the article said nothing about it. Every time, the output went back.

The fix wasn't a reminder. It was a Poka Yoke in the Role Contract. Before any derived output reaches quality confirmation, the operator first has to output the list of concepts the source article actually covers, and check every claim against that list. An output that names a concept outside the list doesn't get passed through to approval at all.
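A rough sketch of such a gate, under stated assumptions: the operator hands over the list of concepts the source article covers, and a fixed vocabulary of known concepts is matched against the derived text by case-insensitive substring search. `KNOWN_CONCEPTS`, both function names, and the matching approach are hypothetical simplifications, not the actual Role Contract logic.

```python
# Hypothetical vocabulary of concepts that could be mentioned in a post.
KNOWN_CONCEPTS = {"Adoption Levels", "Control Tower", "OMPRIKL", "Poka Yoke"}


def out_of_scope_concepts(text: str, covered: set[str]) -> list[str]:
    """Known concepts named in the text that the source article does not cover."""
    lowered = text.lower()
    covered_lower = {concept.lower() for concept in covered}
    # Naive substring matching; a real check would need something more robust.
    mentioned = {c for c in KNOWN_CONCEPTS if c.lower() in lowered}
    return sorted(c for c in mentioned if c.lower() not in covered_lower)


def may_enter_quality_confirmation(text: str, covered: set[str]) -> bool:
    """The gate: any out-of-scope concept blocks the output before approval."""
    return not out_of_scope_concepts(text, covered)


covered_by_article = {"Adoption Levels"}
post = "Adoption Levels show where a team stands. The Control Tower tracks it."

print(out_of_scope_concepts(post, covered_by_article))
print(may_enter_quality_confirmation(post, covered_by_article))
```

In this sketch the check runs before quality confirmation rather than as part of it, so a drifting post never shows up as a rework case in the first place.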

The error didn't disappear because the operator got better. It disappeared because the process stopped allowing it.

Over the following three weeks, that operator's FTT climbed from 72 to 91 percent and held. The scope-drift category stopped showing up in rework. Not because I controlled more tightly, but because the error no longer had a path through the routine.

What comes next

Verification, FTT, and Poka Yoke together are the reason you can steer AI operators in a company at all without constantly cleaning up behind them. In this model, quality is not a judgment someone makes at the end. It is a structural precondition that has to be met before anything ships.

That nearly closes the conceptual arc. Next week, I'll look at what actually happens on day one when a company starts on Rocket Routine OS. Not a framework in the abstract, but the concrete entry point: what you hold in your hands on day one and what runs in the first week.

If you run a founder-led B2B company with 15 to 50 employees and you want quality to be a verifiable number instead of an opinion: rocket-routine.com