Model Output Protocol

A filter that sits between an LLM agent and its human user, enforcing communication rules before output reaches them.

synodic-studio/model-output-protocol · MIT · alpha

The name is a wink at the Model Context Protocol, nothing more. MOP is not a protocol in that sense and has nothing to do with MCP beyond the pun. It is a filtering layer: a gate every message passes through on its way from the agent to me.

LLM agents drift. Voice rules in the system prompt get crowded out by task instructions, and reminders in CLAUDE.md decay across a long session. The result is messages that run too long, sound too cheerleady, ask permission instead of acting, or narrate process instead of stating the outcome. I tried to fix this the obvious way first: rules in CLAUDE.md, rules in the project file, a hook, five separate reminders worded five different ways. None of it stuck. The agent would behave for a while and then slide right back, because stuffing more rules into a context already saturated with the task does not work.

So I got bossy. MOP still hands the agent the full rule set up front, as part of the system prompt, so it has a fair chance of getting it right on its own. But it also enforces those same rules at the moment of delivery: the agent’s only path to me is a single tool call, and every call is checked against the rules before anything ships. Prompting for a chance, enforcement for the guarantee.

How it works

The agent has no direct channel to the user. Its only way to reach me is one tool call, submit_message(text). Calling it runs the message through an evaluation against the active rules, and the verdict decides what happens next.

MOP borrows its shape from Claude Code’s Channels feature, where an agent reaches the outside world only through declared MCP tool calls. Channels points that shape at inbound events; MOP aims it the other way, at enforcing what goes out.

Four verdicts

A single call to a small, fast model evaluates each submit_message against the active rules and returns one of:

Accepted — the message ships as-is. Clean status report, no rule trips.
Rewritten(rewritten) — the evaluator reformed the message; the rewritten version ships. The agent learns the diff via the tool result, so any later self-reference like “the fix from before” resolves to the cleaned-up text, not the original.
Rejected(violations) — nothing ships. The original becomes pending_message. The agent must call submit_justification(reason) to argue for delivery, which re-runs the evaluation with the argument in scope. Bounded by max_justification_attempts (default 4).
AcceptedFailedOpen(system_note) — justification budget exhausted. The original ships with a system-note bubble warning that rules were bypassed. It fails open rather than failing silent.
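The four verdicts and the justification loop can be sketched as a small state machine. This is a minimal illustration, not MOP's actual code: the `Gate` class, the `evaluate` callable signature, the `delivered` list, and the system-note wording are all assumptions, and so is the choice to fail open on the rejection that uses up the last attempt. Only the verdict names, `submit_message`, `submit_justification`, `pending_message`, and `max_justification_attempts` come from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Union


@dataclass
class Accepted:
    pass


@dataclass
class Rewritten:
    rewritten: str


@dataclass
class Rejected:
    violations: List[str]


@dataclass
class AcceptedFailedOpen:
    system_note: str


Verdict = Union[Accepted, Rewritten, Rejected, AcceptedFailedOpen]
# Hypothetical evaluator signature: (message, justification or None) -> verdict.
Evaluator = Callable[[str, Optional[str]], Verdict]


@dataclass
class Gate:
    evaluate: Evaluator
    max_justification_attempts: int = 4
    pending_message: Optional[str] = None
    attempts: int = 0
    delivered: List[str] = field(default_factory=list)  # stand-in for the real transport

    def submit_message(self, text: str) -> Verdict:
        # A fresh submission resets the justification budget.
        self.attempts = 0
        return self._apply(text, self.evaluate(text, None))

    def submit_justification(self, reason: str) -> Verdict:
        if self.pending_message is None:
            raise RuntimeError("no pending message to justify")
        self.attempts += 1
        verdict = self.evaluate(self.pending_message, reason)
        exhausted = self.attempts >= self.max_justification_attempts
        if isinstance(verdict, Rejected) and exhausted:
            # Budget spent: ship the original with a visible warning.
            note = "Rules bypassed: " + ", ".join(verdict.violations)
            verdict = AcceptedFailedOpen(note)
        return self._apply(self.pending_message, verdict)

    def _apply(self, text: str, verdict: Verdict) -> Verdict:
        if isinstance(verdict, Rejected):
            self.pending_message = text  # nothing ships
            return verdict
        if isinstance(verdict, Rewritten):
            self.delivered.append(verdict.rewritten)
        elif isinstance(verdict, AcceptedFailedOpen):
            self.delivered.append(f"{text}\n[system note] {verdict.system_note}")
        else:
            self.delivered.append(text)
        self.pending_message = None
        return verdict
```

Returning the verdict object from both tool calls mirrors the point that the agent learns the outcome, including any rewritten text, through the tool result.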

A Stop hook gates turn-end on whether the agent actually delivered something. If a turn would end with nothing having reached the user, the hook blocks it and the agent retries.
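The Stop-hook check reduces to one condition. The sketch below assumes the host tracks a per-turn delivery counter and accepts a block decision with a reason; the function name, the `sent_this_turn` parameter, and the shape of the returned dict are illustrative, not MOP's real hook contract.

```python
from typing import Optional


def stop_hook(sent_this_turn: int) -> Optional[dict]:
    """Return a block decision when the turn would end with nothing delivered."""
    if sent_this_turn > 0:
        return None  # something reached the user; let the turn end
    return {
        "decision": "block",
        "reason": "Nothing was delivered this turn. Call submit_message before stopping.",
    }
```

The reason string is fed back to the agent, which is what turns a silent turn into a retry.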

Live examples

Real verdicts against sample agent messages, with three rules active: no-permission-asking-for-doable-work, no-commit-hashes (I have no interest in SHAs), and no-environment-vars (I do not configure env vars by hand).

Accepted

Clean status report, no rule trips. Delivered as-is.

Rewritten (minutia)

Rewritten: MOP strips the commit SHA and the raw env-var name, and the cleaned-up message reaches the user

The evaluator stripped the SHA and softened the env-var name. The agent sees the rewritten text in its tool result, so any later self-reference resolves to the polished version.

Rewritten (list)

Rewritten: MOP turns a bulleted list of options into lettered choices a, b, c

A rule prefers lettered options over plain bullets, so the choices come back labeled and easy to answer with a single character from a phone.
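To make the transformation concrete, here is a deterministic stand-in for what that rewrite produces. In MOP the rewrite comes from the evaluator model, not from code like this; the `letter_options` helper and the `a)` label format are assumptions for illustration only.

```python
import re
import string

# Matches a leading "-", "*", or "•" bullet followed by whitespace.
BULLET = re.compile(r"^\s*[-*•]\s+")


def letter_options(message: str) -> str:
    """Relabel bullet lines as a), b), c) and leave every other line alone."""
    letters = iter(string.ascii_lowercase)
    out = []
    for line in message.splitlines():
        label = next(letters, None) if BULLET.match(line) else None
        if label is None:
            # Not a bullet, or more than 26 options: keep the line unchanged.
            out.append(line)
        else:
            out.append(label + ") " + BULLET.sub("", line))
    return "\n".join(out)
```

For example, `letter_options("Pick one:\n- retry\n- skip")` yields a message whose options read `a) retry` and `b) skip`.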

Rejected

The agent has the tools to grep, so asking permission burns a round-trip. The message becomes pending. To deliver, the agent must either justify it or, better, actually run the grep and report what it found.

Integration shape

MOP is transport-agnostic and model-agnostic. A host gives the agent the rule set up front by concatenating mop.protocol_prompt(rules) into the system prompt, then wires up a delivery channel (Telegram, Slack, a terminal) and points an optional auditor at a log. Everything else MOP handles internally.
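The host-side wiring might look roughly like this. The local `protocol_prompt` below is a stand-in for `mop.protocol_prompt(rules)`, whose real signature and output format are not shown here; `build_system_prompt`, `terminal_channel`, and the rule description text are likewise hypothetical.

```python
from typing import Dict


def protocol_prompt(rules: Dict[str, str]) -> str:
    """Stand-in for mop.protocol_prompt(rules): render the rule set as prompt text."""
    lines = [
        "All messages to the user must go through submit_message(text).",
        "Active rules:",
    ]
    lines += [f"- {name}: {description}" for name, description in rules.items()]
    return "\n".join(lines)


def build_system_prompt(task_prompt: str, rules: Dict[str, str]) -> str:
    """Concatenate the host's own prompt with the protocol section."""
    return task_prompt.rstrip() + "\n\n" + protocol_prompt(rules)


def terminal_channel(text: str) -> None:
    """Simplest possible delivery channel: print to the terminal."""
    print(text)


if __name__ == "__main__":
    rules = {"no-commit-hashes": "Do not show commit SHAs to the user."}
    terminal_channel(build_system_prompt("You are a coding agent.", rules))
```

Swapping `terminal_channel` for a Telegram or Slack sender is the only change a different transport would need in this sketch.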

The tools the agent sees:

| Tool | Purpose |
| --- | --- |
| `submit_message(text)` | The only path to the user. Triggers an evaluation against the active rules. |
| `submit_justification(reason)` | Argues for delivering a previously-rejected message. Bounded by `max_justification_attempts`. |
| `get_rules(filter?)` | Read-only. Returns active rule names and descriptions. |
| `get_status()` | Returns pending message, sent-this-turn, and justification-attempt state so the agent can self-recover. |
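As an illustration of self-recovery, the state returned by `get_status()` could be modeled as below. The field names other than `pending_message` and `max_justification_attempts` are guesses, and `next_step` is a hypothetical helper showing how an agent might choose its next tool call from that state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Status:
    pending_message: Optional[str]
    sent_this_turn: int
    justification_attempts: int
    max_justification_attempts: int = 4


def next_step(status: Status) -> str:
    """Pick the tool call that moves a stuck turn forward."""
    if status.pending_message is not None:
        # A rejected message is waiting: justify it, or do the work and resubmit.
        return "submit_justification"
    if status.sent_this_turn == 0:
        return "submit_message"
    return "done"
```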

Rules

Each rules/*.yml file is loaded at startup. A rule is either an LLM rule (a yes/no prompt the evaluator runs against each message) or a deterministic rule (a regex or word-count check that contributes advisory hints, never the final say). The shipped set covers voice, behavior, and a handful of role overlays (designer, devops, solo-founder, and so on); personal style rules like “no em-dashes” or “prefer lettered options over bullets” live in a per-deployment overlay that is not checked in. The long-term intent is for rule ownership to move out of MOP and into whichever host is invoking it, so a Telegram voice and an IDE voice can diverge without forking anything.
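A deterministic rule can be sketched as a plain check that yields hints for the evaluator. The rule dictionaries below stand in for already-parsed YAML; the keys `kind`, `pattern`, and `limit`, the `keep-it-short` rule, and the hint wording are all hypothetical, since the real rule schema is not shown here.

```python
import re
from typing import Dict, List

# Stand-ins for parsed rules/*.yml entries (schema is an assumption).
RULES: List[Dict] = [
    {"name": "no-commit-hashes", "kind": "regex", "pattern": r"\b[0-9a-f]{7,40}\b"},
    {"name": "keep-it-short", "kind": "max_words", "limit": 60},
]


def advisory_hints(message: str, rules: List[Dict]) -> List[str]:
    """Run deterministic checks and return hints; the evaluator keeps the final say."""
    hints = []
    for rule in rules:
        if rule["kind"] == "regex" and re.search(rule["pattern"], message):
            hints.append(f"{rule['name']}: pattern matched")
        elif rule["kind"] == "max_words" and len(message.split()) > rule["limit"]:
            hints.append(f"{rule['name']}: over {rule['limit']} words")
    return hints
```

The hex pattern also matches ordinary words made only of the letters a to f, which is one reason a check like this is better treated as advisory than as a verdict.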

Every verdict is logged: one line per decision with the original text, the rules active at that moment, and what MOP did with it, so a silent rewrite or rejection is always recoverable after the fact.
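One way to get one recoverable line per decision is JSON Lines. This is a sketch under assumptions: the `log_verdict` function and the field names `ts`, `original`, `active_rules`, `verdict`, and `delivered` are illustrative, not MOP's actual log format.

```python
import io
import json
from datetime import datetime, timezone
from typing import IO, List, Optional


def log_verdict(
    log: IO[str],
    original: str,
    active_rules: List[str],
    verdict: str,
    delivered: Optional[str],
) -> None:
    """Append one JSON object per decision, keeping the original text recoverable."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "original": original,
        "active_rules": active_rules,
        "verdict": verdict,
        "delivered": delivered,  # None when nothing shipped
    }
    # json.dumps escapes embedded newlines, so each record stays on one line.
    log.write(json.dumps(record, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    buffer = io.StringIO()
    log_verdict(buffer, "Fixed in 3f2a9bc1.", ["no-commit-hashes"], "Rewritten", "Fixed.")
    print(buffer.getvalue(), end="")
```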

Where it runs

MOP was built for Patchbay Relay, the Telegram-to-Claude-Code bridge, on its cc-sdk-mop harness, where it gated every turn. Patchbay is on ice now, so MOP is not in active daily use, but it is a project I am far from done with. The runtime is intact, host-agnostic, and model-agnostic, and the only assumption it makes is that the host speaks Python.

Companion: HOP

There is a sketch of the decoder side, HOP (Human Output Protocol): the same idea pointed the other way, helping a human compose tighter, higher-signal messages to a high-context agent from a low-bandwidth interface like a phone or voice. It is more concept than code right now and I am not actively building it, but together MOP and HOP frame the two directions of the agent-human channel.

Status

Alpha, pre-v1. The runtime is solid enough to run on every turn it gates, but the default rule set is intentionally minimal and the path by which rule ownership moves from MOP out into its hosts is still being worked out.