# Model Evolution Guard System

## Problem Statement

During the bidirectional sync integration (`sync-benchmarks-from-yaml.cjs`), the script copied models from `capability-index.yaml` (which contained suboptimal assignments) into `model-benchmarks.json` as "current". This silently downgraded multiple agents from their ★-optimal heatmap scores:

| Agent | Optimal (v3 heatmap) | Downgraded To | Score Loss |
|-------|----------------------|---------------|------------|
| `lead-developer` | qwen3-coder:480b **(92★)** | nemotron-3-super | -22 |
| `system-analyst` | glm-5.1 **(90★)** | nemotron-3-super | -16 |
| `evaluator` | glm-5.1 | nemotron-3-super | -16 |
| `devops-engineer` | kimi-k2.6 **(88★)** | nemotron-3-super | -10 |

## Root Causes

1. **No single source of truth**: `capability-index.yaml`, `kilo-meta.json`, agent `.md` files, and `model-benchmarks.json` could each claim to be canonical.
2. **No downgrade protection**: `sync-benchmarks-from-yaml.cjs` blindly overwrote scores without checking whether the new model was worse than the old one.
3. **No fitness gate**: changes propagated to all downstream files (dashboard, configs) before any validation.
4. **Bidirectional sync ambiguity**: the sync ran YAML → JSON while being presented as bidirectional, creating confusion about which direction was authoritative.

## Architectural Solution: Model Evolution Guard (MEG)

### Layer 0: Single Source of Truth

```
PRIMARY: agent-evolution/data/model-benchmarks.json
  └── source: heatmap_scores from agent_model_scores[]
  └── validated_by: fitness gate (see below)

SECONDARY (derived, read-only for sync):
  ├── .kilo/capability-index.yaml  ← receives models FROM benchmarks
  ├── .kilo/agents/*.md            ← receive models FROM benchmarks via sync-agents.js
  ├── kilo-meta.json               ← receives models FROM benchmarks
  └── kilo.jsonc                   ← receives models FROM benchmarks
```

**Rule:** `model-benchmarks.json` is the ONLY file that contains heatmap-derived scores. All other configs receive models FROM it, never the reverse.

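
To make the one-way rule concrete, here is a minimal sketch of deriving per-agent assignments from the benchmark scores alone. The entry shape (`agent`, `model`, `score`) and the function name `deriveAssignments` are assumptions for illustration; the document names `agent_model_scores[]` but does not specify its fields. The sample values are the scores quoted in the Problem Statement.

```typescript
// Hypothetical entry shape: the real agent_model_scores[] fields are not specified here.
interface AgentModelScore {
  agent: string;
  model: string;
  score: number;
}

// Pick the highest-scoring model for each agent.
// The resulting map is what derived configs would receive FROM benchmarks.
function deriveAssignments(scores: AgentModelScore[]): Record<string, AgentModelScore> {
  const best: Record<string, AgentModelScore> = {};
  for (const entry of scores) {
    const current = best[entry.agent];
    if (current === undefined || entry.score > current.score) {
      best[entry.agent] = entry;
    }
  }
  return best;
}

// Scores taken from the downgrade table: 92 vs 70 and 90 vs 74.
const sampleScores: AgentModelScore[] = [
  { agent: "lead-developer", model: "qwen3-coder:480b", score: 92 },
  { agent: "lead-developer", model: "nemotron-3-super", score: 70 },
  { agent: "system-analyst", model: "glm-5.1", score: 90 },
  { agent: "system-analyst", model: "nemotron-3-super", score: 74 },
];

const assignments = deriveAssignments(sampleScores);
for (const agent of Object.keys(assignments)) {
  console.log(`${agent} -> ${assignments[agent].model} (${assignments[agent].score})`);
}
```

Because the function reads only benchmark scores and never a config file, nothing in a derived file can flow back into the primary data.
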
### Layer 1: Fitness Gate (Mandatory)

Every model change must pass the fitness gate. A change is "acceptable" only if it passes both checks below:

```typescript
interface ModelFitnessGate {
  // Agent's current score with existing model
  previous_score: number;
  // Agent's score with proposed model
  proposed_score: number;
  // Absolute minimum score for any agent
  min_global_threshold: number; // e.g. 75
  // Maximum regression allowed, as a positive number of points
  max_regression: number; // e.g. 3
  // Is proposed model in agent's top-N from heatmap?
  top_n_required: number; // e.g. top-3 (not checked by isChangeAcceptable)
}

function isChangeAcceptable(gate: ModelFitnessGate): boolean {
  // Reject anything below the global floor
  if (gate.proposed_score < gate.min_global_threshold) return false;
  // Reject regressions larger than the allowed margin
  if (gate.proposed_score < gate.previous_score - gate.max_regression) return false;
  return true;
}
```

**Hard rule:** If `proposed_score < previous_score - 3`, the change MUST be rejected with a clear error. No exceptions.

### Layer 2: Immutable Recommendations

Recommendations in `model-benchmarks.json` are append-only. Once a recommendation is generated, it cannot be silently overwritten by a sync; it can only be superseded by a NEW recommendation with a later timestamp.

```json
{
  "recommendations": [
    {
      "agent": "lead-developer",
      "from_model": "qwen3-coder:480b",
      "to_model": "nemotron-3-super",
      "score_delta": -22,
      "status": "rejected",
      "rejected_at": "2026-04-29T20:00:00Z",
      "rejected_reason": "Downgrade: 92→70 exceeds max regression of 3"
    }
  ]
}
```

### Layer 3: Sync Direction Lock

All sync scripts must declare their direction explicitly:

```typescript
// ✅ CORRECT: benchmarks → configs
// src: model-benchmarks.json
// dst: capability-index.yaml, agents/*.md, kilo-meta.json
// validates: fitness gate

// ❌ INCORRECT: configs → benchmarks
// This should NEVER happen. Benchmarks come from heatmap analytics only.
```

### Layer 4: Diff Report on Every Sync

Before writing any file, the sync script must produce:

```
=== Model Sync Diff Report ===
Agent           Old Model         Old Score  New Model         New Score  Status
lead-developer  qwen3-coder:480b  92★        nemotron-3-super  70         ⚠️ REJECTED (regression 22 > max 3)
system-analyst  glm-5.1           90★        nemotron-3-super  74         ⚠️ REJECTED (regression 16 > max 3)
```

No files are modified until the diff is reviewed (or `--auto-approve` is used, which applies to improvements only).

### Layer 5: Recovery Checkpoint

Before any sync that touches model assignments, create a git checkpoint:

```bash
# In the sync script
TS="$(date +%s)"  # compute once so the stash and branch names match
git stash push -m "pre-model-sync-$TS"
git checkout -b "auto/model-sync-$TS"
```

If the fitness gate rejects changes, auto-rollback:

```bash
git checkout HEAD -- kilo-meta.json .kilo/capability-index.yaml .kilo/agents/
```

## Implementation

### 1. Fitness Gate Module

```typescript
// agent-evolution/scripts/lib/fitness-gate.ts

// Result shape inferred from the return values below.
// ModelBenchmarks and getAgentModelScore (a score lookup on this class) are not shown here.
type GateResult = { acceptable: boolean; reason?: string; delta?: number };

export class ModelFitnessGate {
  constructor(
    private benchmarks: ModelBenchmarks,
    private minThreshold = 75,
    private maxRegression = 3
  ) {}

  validateChange(agent: string, fromModel: string, toModel: string): GateResult {
    const oldScore = this.getAgentModelScore(agent, fromModel);
    const newScore = this.getAgentModelScore(agent, toModel);

    // Global floor: no agent may run a model scoring below the threshold
    if (newScore < this.minThreshold) {
      return { acceptable: false, reason: `Score ${newScore} below threshold ${this.minThreshold}` };
    }

    // Regression limit: the new score may drop by at most maxRegression points
    if (newScore < oldScore - this.maxRegression) {
      return { acceptable: false, reason: `Regression ${oldScore}→${newScore} exceeds max ${this.maxRegression}` };
    }

    return { acceptable: true, delta: newScore - oldScore };
  }
}
```

### 2. Sync Wrapper

```typescript
// agent-evolution/scripts/sync-with-guard.cjs (wraps any sync script)
// Assumes the fitness-gate module also exports validateAllChanges, and that
// detectChanges / applyChanges are provided by the wrapped sync script.
const { validateAllChanges } = require('./lib/fitness-gate');

const changes = detectChanges(); // what the sync WOULD do
const report = validateAllChanges(changes);

if (report.rejections.length > 0) {
  console.error('❌ FITNESS GATE BLOCKED:');
  report.rejections.forEach(r => console.error(`  ${r.agent}: ${r.reason}`));
  process.exit(1); // abort before any file is written
}

console.log(`✅ All ${changes.length} changes passed fitness gate`);
applyChanges(changes);
```

### 3. Git Checkpoint

```bash
#!/bin/bash
# Every sync script must run this first
set -e
STASH_NAME="model-sync-$(date +%s)"
git stash push -m "$STASH_NAME" -- kilo-meta.json .kilo/capability-index.yaml .kilo/agents/
```

## Verification Checklist

After implementing the guard:

- [ ] `sync-benchmarks-from-yaml.cjs` validates every model change against heatmap scores
- [ ] Downgrades of >3 points are rejected with a clear error
- [ ] Diff report is printed before any file is written
- [ ] Git checkpoint is created before sync
- [ ] `model-benchmarks.json` has a locked `source: "heatmap"` field
- [ ] All sync scripts declare direction: `benchmarks → configs` only
- [ ] The fitness gate runs as a pre-commit hook and in the CI pipeline

## Integration with Existing Workflow

The guard plugs into step 0 of the existing `/evolution` command:

```markdown
## Step 0: Model Research & Guard

1. Run heatmap analysis → produce raw scores
2. **Fitness Gate** validates all proposed changes
3. If any downgrade >3 points → HALT, report to human
4. If all pass → generate recommendations append-only
5. Sync to configs with direction lock: benchmarks → configs
```

---

**Bottom line:** Never again should a script silently replace a ★-optimal model with one scoring 20+ points lower.
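
The sync wrapper calls `validateAllChanges`, which this document does not define. As a closing illustration, here is one way it might look: a minimal sketch, assuming each proposed change already carries its old and new heatmap scores. The `ProposedChange`, `Rejection`, and `GateReport` shapes are hypothetical; the defaults of 75 and 3 and the sample scores come from the sections above.

```typescript
// Hypothetical types: field names are assumptions, not taken from the real scripts.
interface ProposedChange {
  agent: string;
  fromModel: string;
  toModel: string;
  oldScore: number;
  newScore: number;
}

interface Rejection {
  agent: string;
  reason: string;
}

interface GateReport {
  approvals: ProposedChange[];
  rejections: Rejection[];
}

// Apply the two fitness-gate rules (global floor, regression limit) to each change.
function validateAllChanges(
  changes: ProposedChange[],
  minThreshold = 75,
  maxRegression = 3
): GateReport {
  const report: GateReport = { approvals: [], rejections: [] };
  for (const change of changes) {
    if (change.newScore < minThreshold) {
      report.rejections.push({
        agent: change.agent,
        reason: `Score ${change.newScore} below threshold ${minThreshold}`,
      });
    } else if (change.newScore < change.oldScore - maxRegression) {
      report.rejections.push({
        agent: change.agent,
        reason: `Regression ${change.oldScore}→${change.newScore} exceeds max ${maxRegression}`,
      });
    } else {
      report.approvals.push(change);
    }
  }
  return report;
}

// Two of the downgrades from the Problem Statement table.
const gateReport = validateAllChanges([
  { agent: "lead-developer", fromModel: "qwen3-coder:480b", toModel: "nemotron-3-super", oldScore: 92, newScore: 70 },
  { agent: "devops-engineer", fromModel: "kimi-k2.6", toModel: "nemotron-3-super", oldScore: 88, newScore: 78 },
]);

for (const r of gateReport.rejections) {
  console.error(`${r.agent}: ${r.reason}`);
}
```

With these inputs both changes are rejected: the first falls below the floor of 75, and the second clears the floor but drops 10 points, more than the allowed 3.
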