# Worked example: `diagnose` → `propose_fix` → PR

A complete fleet run, from triggering signal to merged PR. Numbers in brackets reference the steps in the end-to-end flow section of `docs/integrations/kahn-fleet.md`.

## Setup

```sh
export PETROVA_GITHUB_TOKEN="<fine-grained PAT, contents+PRs scope>"
export PETROVA_WORKSPACE="$HOME/code/workspace"
```

The registry lists `kahn-hq`:

```yaml
- slug: kahn-hq
  url: https://github.com/kahn-hq/kahn
  default_branch: main
  role: production
  profile: standard
  fleets_allowed:
    - kahn-implementer
    - kahn-diagnostics
  added: "2026-04-29"
```

The fleet identifier `kahn-implementer` appears under `fleets_allowed`, so it is permitted to act on this repo.
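
The permission check amounts to a membership test on `fleets_allowed`. The sketch below illustrates that idea in Python; it assumes the registry entry has already been parsed into a dict and that an `--actor` value is the fleet identifier prefixed with `fleet:`. The helper name `fleet_is_permitted` is hypothetical and not part of Petrova.

```python
# Hypothetical helper: Petrova's real registry loader is not shown in this doc.
REGISTRY_ENTRY = {
    "slug": "kahn-hq",
    "url": "https://github.com/kahn-hq/kahn",
    "default_branch": "main",
    "role": "production",
    "profile": "standard",
    "fleets_allowed": ["kahn-implementer", "kahn-diagnostics"],
    "added": "2026-04-29",
}


def fleet_is_permitted(entry: dict, actor: str) -> bool:
    """Return True if an actor such as 'fleet:kahn-implementer' may act on the repo."""
    # Assumption: actors are written as 'fleet:<identifier>'.
    prefix = "fleet:"
    if not actor.startswith(prefix):
        return False
    return actor[len(prefix):] in entry.get("fleets_allowed", [])


if __name__ == "__main__":
    print(fleet_is_permitted(REGISTRY_ENTRY, "fleet:kahn-implementer"))
```

A fleet wrapper could run a check like this before invoking any verb, so that a misconfigured actor fails fast rather than at PR creation time.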

## Step 1: trigger

KAHN’s `audit-rls` checkpoint fails on CI run `ci-run-12345`, returning `Outcome: rls-policy-missing` for one tenant. The fleet catches this as an `audit_fail` event.

## Step 2: diagnose

```sh
petrova diagnose kahn-hq --since 2026-04-22 --json > /tmp/diag.json
DIAG=$(jq -r '.result.diagnosis_id' /tmp/diag.json)
echo "Diagnosis: $DIAG"
# → Diagnosis: diag-a3f29c4b1e8d7602
```

`/tmp/diag.json` shows:

- Current phase: Phase-7, status closed
- Open milestones: M7.8.2, M8.1.1
- Recent decisions: 4 within the window, including `2026-04-28-phase-7-close.md`
- Recent findings: `20260428-1430-rls-tenant-drift.md`, which is relevant to this failure

The fleet now has enough context to compose a fix.

## Step 3: compose

```sh
cat > /tmp/fix.json <<'JSON'
{
  "diagnosis_id": "diag-a3f29c4b1e8d7602",
  "title": "fix(rls): add policy for tenant-scoped reads (audit-rls)",
  "rationale": "audit-rls flagged missing policy for service:kahn-internal tenant on B3 endpoint. Diagnose confirms phase 7 closed without paired RLS finding addressed; this fix lands the missing policy.",
  "proposed_changes": [
    {
      "path": "backend/migrations/0046_kahn_internal_rls.sql",
      "operation": "create",
      "contents": "-- 0046: enable RLS for kahn-internal reads\nALTER TABLE runs ENABLE ROW LEVEL SECURITY;\nCREATE POLICY tenant_read ON runs FOR SELECT USING (org_id = current_setting('app.org_id'));\n",
      "edit_rationale": "Add the missing policy referenced in the audit failure."
    }
  ],
  "test_plan": [
    "make integration (boots throwaway PG, applies all migrations including 0046)",
    "manual: scripts/soak-refute.py against kahn-internal tenant returns gate-met"
  ],
  "mr_grounding": [
    {
      "kind": "meta_rule",
      "ref": "MR-2",
      "claim": "audit failure is friction surfaced post-Phase 7 close; fix lands as new work, not retrofit."
    },
    {
      "kind": "finding",
      "ref": "docs/findings/20260428-1430-rls-tenant-drift.md",
      "claim": "names the missing tenant policy as the root cause."
    },
    {
      "kind": "spec",
      "ref": "docs/spec/rls.md#tenant-scoping",
      "claim": "specifies tenant-scoped reads as the contract every table must satisfy."
    }
  ]
}
JSON
```
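
The heredoc hard-codes the diagnosis ID. A fleet would more likely build the payload in code so that `diagnosis_id` always comes from the diagnose result. The following is a minimal sketch of that idea, not Petrova's own composer: `compose_fix` is a hypothetical helper, and treating `path`, `operation`, and `contents` as the required keys of each change is an assumption based on the example payload.

```python
import json

# Assumption: these keys are required per change; "edit_rationale" is treated as optional.
REQUIRED_CHANGE_KEYS = ("path", "operation", "contents")


def compose_fix(diag, title, rationale, proposed_changes, test_plan, mr_grounding):
    """Build a propose_fix payload whose diagnosis_id is taken from the diagnose result."""
    for change in proposed_changes:
        missing = [key for key in REQUIRED_CHANGE_KEYS if key not in change]
        if missing:
            raise ValueError(f"proposed change is missing keys: {missing}")
    return {
        # Same location that the jq filter '.result.diagnosis_id' reads.
        "diagnosis_id": diag["result"]["diagnosis_id"],
        "title": title,
        "rationale": rationale,
        "proposed_changes": proposed_changes,
        "test_plan": test_plan,
        "mr_grounding": mr_grounding,
    }


if __name__ == "__main__":
    # In a real run this dict would be loaded from /tmp/diag.json.
    diag = {"result": {"diagnosis_id": "diag-a3f29c4b1e8d7602"}}
    payload = compose_fix(
        diag,
        title="fix(rls): add policy for tenant-scoped reads (audit-rls)",
        rationale="audit-rls flagged missing policy for service:kahn-internal tenant.",
        proposed_changes=[{
            "path": "backend/migrations/0046_kahn_internal_rls.sql",
            "operation": "create",
            "contents": "-- 0046: enable RLS for kahn-internal reads\n",
        }],
        test_plan=["make integration"],
        mr_grounding=[{"kind": "meta_rule", "ref": "MR-2", "claim": "fix lands as new work."}],
    )
    print(json.dumps(payload, indent=2))
```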

## Step 4: dry-run

```sh
petrova propose_fix kahn-hq \
  --input /tmp/fix.json \
  --actor fleet:kahn-implementer \
  --triggered-by-kind audit_fail \
  --triggered-by-ref ci-run-12345
```

Output:

```text
propose_fix → dry_run  4f8c3e21d97a
  upholds: MR-7, MR-12
  branch: petrova/propose-fix/4f8c3e21
  create backend/migrations/0046_kahn_internal_rls.sql (192 B)
```

The fleet’s pre-apply policy accepts a dry run only if exactly one file changes and that file is under `backend/migrations/`. This run satisfies both conditions.
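
The pre-apply policy can be expressed as a small check over the dry-run output. This is a sketch only: the function names are hypothetical, and it assumes every change line has the shape `<operation> <path> (<size> B)`, as in the output shown in this step.

```python
import posixpath
import re

# Assumption: change lines are indented and look like
# "  create backend/migrations/0046_kahn_internal_rls.sql (192 B)".
CHANGE_LINE = re.compile(r"^\s+(?P<op>\w+)\s+(?P<path>\S+)\s+\(\d+ B\)$")

ALLOWED_PREFIX = "backend/migrations/"


def changed_paths(dry_run_output: str) -> list[str]:
    """Collect the path of every change line in the dry-run output."""
    paths = []
    for line in dry_run_output.splitlines():
        match = CHANGE_LINE.match(line)
        if match:
            paths.append(match.group("path"))
    return paths


def passes_pre_apply_policy(dry_run_output: str) -> bool:
    """True only if exactly one file changes and it lives under backend/migrations/."""
    paths = changed_paths(dry_run_output)
    if len(paths) != 1:
        return False
    # Normalize first so that "backend/migrations/../x" does not slip through.
    return posixpath.normpath(paths[0]).startswith(ALLOWED_PREFIX)


if __name__ == "__main__":
    output = (
        "propose_fix → dry_run  4f8c3e21d97a\n"
        "  upholds: MR-7, MR-12\n"
        "  branch: petrova/propose-fix/4f8c3e21\n"
        "  create backend/migrations/0046_kahn_internal_rls.sql (192 B)\n"
    )
    print(passes_pre_apply_policy(output))
```

Parsing human-readable output is brittle. If `propose_fix` offers a machine-readable mode, a real fleet would be better served by that.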

## Step 5: apply

```sh
petrova propose_fix kahn-hq \
  --input /tmp/fix.json \
  --actor fleet:kahn-implementer \
  --triggered-by-kind audit_fail \
  --triggered-by-ref ci-run-12345 \
  --apply
```

Output:

```text
propose_fix → applied  4f8c3e21d97a
  upholds: MR-7, MR-12
  PR #1234 petrova/propose-fix/4f8c3e21 https://github.com/kahn-hq/kahn/pull/1234
```

## Step 6: PR body (excerpt)

````markdown
## Petrova action: `propose_fix`

<!-- petrova:metadata -->
```yaml
petrova:
  verb: propose_fix
  target_repo: kahn-hq
  idempotency_key: 4f8c3e21d97a...
  schema_version: 1
  schema_fingerprint: a8b3c4d5e6f7
  actor: fleet:kahn-implementer
  triggered_by:
    kind: audit_fail
    ref: ci-run-12345
  applied_at: 2026-04-29T14:32:18Z
```

### Why

audit-rls flagged missing policy for service:kahn-internal tenant on B3 endpoint. …

### Upholds

- MR-7 — decision docs are dated, append-only
- MR-12 — CLAUDE.md is a projection, not a source

### Changes

- create backend/migrations/0046_kahn_internal_rls.sql (192 B)

### Summary

Diagnosis: diag-a3f29c4b1e8d7602 (scanned 2026-04-29T14:30:00Z).

MR grounding:

- meta_rule: MR-2 — audit failure is friction surfaced post-Phase 7 close …
- finding: docs/findings/20260428-1430-rls-tenant-drift.md — names the missing tenant policy as the root cause.
- spec: docs/spec/rls.md#tenant-scoping — specifies tenant-scoped reads as the contract every table must satisfy.

Test plan:

- make integration (boots throwaway PG, applies all migrations including 0046)
- manual: scripts/soak-refute.py against kahn-internal tenant returns gate-met
````

## Step 7: human review and merge

The PR enters kahn-hq's normal review flow:
1. CI runs `make integration` against the new migration.
2. CODEOWNERS for `backend/migrations/` are auto-requested.
3. A human reviews the migration, the test plan, the grounding cites.
4. Approval → merge.

The fleet's involvement ends at step 5. The audit trail of *how this
fix came to exist* lives entirely in the PR body and the diagnose
cache entry, both grounded in MR-7.

## What if a re-run happens?

If the fleet (or the audit retry loop) invokes the same verb with the
same inputs again:

```sh
petrova propose_fix kahn-hq --input /tmp/fix.json [...] --apply
# → propose_fix → skipped_idempotent  4f8c3e21d97a
#   PR #1234 (existing)
```

The idempotency key matches the one in the open PR’s metadata block, so no new PR is opened. The fleet treats `skipped_idempotent` identically to `applied` for downstream bookkeeping.
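
One way a fleet might implement that bookkeeping rule is to parse the status line and map both statuses to the same outcome. The sketch below assumes the first output line has the shape `propose_fix → <status>  <key>`, as in the outputs shown in this walkthrough. The names `parse_status` and `pr_exists` are hypothetical.

```python
import re

# Assumption: the first line of output looks like
# "propose_fix → applied  4f8c3e21d97a".
STATUS_LINE = re.compile(r"^propose_fix → (?P<status>\w+)\s+(?P<key>[0-9a-f]+)$")

# Statuses the fleet records as "a PR exists for this fix".
PR_EXISTS_STATUSES = {"applied", "skipped_idempotent"}


def parse_status(output: str) -> tuple[str, str]:
    """Return (status, idempotency key prefix) from propose_fix output."""
    lines = output.splitlines()
    match = STATUS_LINE.match(lines[0].strip()) if lines else None
    if match is None:
        raise ValueError("unrecognized propose_fix output")
    return match.group("status"), match.group("key")


def pr_exists(output: str) -> bool:
    """Treat 'skipped_idempotent' the same as 'applied'; a dry run opens no PR."""
    status, _key = parse_status(output)
    return status in PR_EXISTS_STATUSES


if __name__ == "__main__":
    first_run = "propose_fix → applied  4f8c3e21d97a\n  upholds: MR-7, MR-12\n"
    re_run = "propose_fix → skipped_idempotent  4f8c3e21d97a\n  PR #1234 (existing)\n"
    print(pr_exists(first_run), pr_exists(re_run))
```

Keeping the accepted statuses in one set means the retry loop needs no special case for a re-run.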