-
Don't let the agent grade its own homework
An agent reporting 'done' is the actor grading its own work. The cheapest guard against silent failure is an independent observer — a second agent, a separate tool, or the live artifact itself. Interactively you spawn the watcher by hand; in a fleet it's a validation gate. Same principle, two scales.
-
Anchor on a fact the model can't see
An agent's confidence is uncorrelated with whether it's right. The cheapest correction you can make is to hold out one ground-truth fact it never had in context — usually a number — and check its output against that. Works the same whether you're watching one session or auditing a fleet's reports.
-
'No' is not an instruction
The cheapest way to steer an agent is also the most-skipped one. A rejection tells the agent what not to do; the space of 'not that' is enormous. Name the alternative in the same breath and you re-steer in one turn instead of N. True for one pet session and for twenty cattle workers.