fix: restore optimal v3 models + add fitness gate protection

- Restore all 30 agents to v3.html heatmap optimal models:
  * frontend-developer: qwen3-coder -> minimax-m2.5 (92★)
  * devops-engineer: nemotron-3-super -> kimi-k2.6:cloud (88★)
  * browser-automation: qwen3-coder -> kimi-k2.6:cloud (86★)
  * agent-architect: glm-5.1 -> kimi-k2.6:cloud (86★)
- Add Model Evolution Guard system:
  * agent-evolution/scripts/lib/fitness-gate.cjs
  * Rejects downgrades >3 points or below score 75
  * Produces detailed diff report before any file modifications
  * Normalized model ID lookup (v3.html ':' vs JSON '-')
- Update sync-benchmarks-from-yaml.cjs with fitness gate
- Update model-benchmarks.json with v3 optimal assignments
- Rebuild research-dashboard.html (104KB, 30 agents, 11 models)
- Add model-evolution-guard.md architecture documentation
- Add v3-optimal-models.json as source-of-truth reference

Fixes regression introduced by commit 3badb25 where models were silently downgraded from heatmap optimal to inferior assignments.
2026-04-29 23:19:16 +01:00
parent d1516f4856
commit 9e48a4960e
14 changed files with 2850 additions and 2049 deletions
--- a/agent-evolution/docs/model-evolution-guard.md
+++ b/agent-evolution/docs/model-evolution-guard.md
@@ -0,0 +1,214 @@
+# Model Evolution Guard System
+
+## Problem Statement
+
+During the bidirectional sync integration (`sync-benchmarks-from-yaml.cjs`), the script copied models from `capability-index.yaml` (which contained suboptimal assignments) into `model-benchmarks.json` as "current". This silently downgraded multiple agents from their ★-optimal heatmap scores:
+
+| Agent | Optimal (v3 heatmap) | Downgraded To | Score Loss |
+|-------|----------------------|---------------|------------|
+| `lead-developer` | qwen3-coder:480b **(92★)** | nemotron-3-super | -22 |
+| `system-analyst` | glm-5.1 **(90★)** | nemotron-3-super | -16 |
+| `evaluator` | glm-5.1 | nemotron-3-super | -16 |
+| `devops-engineer` | kimi-k2.6 **(88★)** | nemotron-3-super | -10 |
+
+## Root Causes
+
+1. **No single source of truth**: `capability-index.yaml`, `kilo-meta.json`, agent `.md` files, and `model-benchmarks.json` could each claim to be canonical.
+2. **No downgrade protection**: `sync-benchmarks-from-yaml.cjs` overwrote model assignments without checking whether the new model scored lower than the old one.
+3. **No fitness gate**: changes propagated to all downstream files (dashboard, configs) before any validation.
+4. **Bidirectional sync ambiguity**: the integration was described as bidirectional, but it actually ran YAML → JSON, the reverse of the intended benchmarks → configs flow, creating confusion about direction.
+
+## Architectural Solution: Model Evolution Guard (MEG)
+
+### Layer 0: Single Source of Truth
+
+```
+PRIMARY: agent-evolution/data/model-benchmarks.json
+  └── source: heatmap_scores from agent_model_scores[]
+  └── validated_by: fitness gate (see below)
+
+SECONDARY (derived, read-only for sync):
+  ├── .kilo/capability-index.yaml ← receives models FROM benchmarks
+  ├── .kilo/agents/*.md ← receive models FROM benchmarks via sync-agents.js
+  ├── kilo-meta.json ← receives models FROM benchmarks
+  └── kilo.jsonc ← receives models FROM benchmarks
+```
+
+**Rule:** `model-benchmarks.json` is the ONLY file that contains heatmap-derived scores. All other configs receive models FROM it, never the reverse.
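A minimal sketch of the one-way derivation this rule implies. The `HeatmapRow` shape and the `deriveAssignments` name are assumptions for illustration; the real layout of `agent_model_scores[]` is not shown in this document. The scores are the ones quoted in the tables here.

```typescript
// Hypothetical row shape; the real agent_model_scores[] layout may differ.
type HeatmapRow = { agent: string; model: string; score: number };

// For each agent, keep the highest-scoring model (the ★-optimal assignment).
function deriveAssignments(rows: readonly HeatmapRow[]): Record<string, HeatmapRow> {
  const best: Record<string, HeatmapRow> = {};
  for (const row of rows) {
    const current = best[row.agent];
    if (current === undefined || row.score > current.score) {
      best[row.agent] = row;
    }
  }
  return best;
}

const sampleRows: HeatmapRow[] = [
  { agent: "lead-developer", model: "nemotron-3-super", score: 70 },
  { agent: "lead-developer", model: "qwen3-coder:480b", score: 92 },
  { agent: "system-analyst", model: "glm-5.1", score: 90 },
  { agent: "system-analyst", model: "nemotron-3-super", score: 74 },
];

const assignments = deriveAssignments(sampleRows);
for (const agent of Object.keys(assignments)) {
  const row = assignments[agent];
  if (row !== undefined) console.log(`${agent} -> ${row.model} (${row.score})`);
}
```

Derived configs would be written from `assignments` only; nothing in this flow reads a config file back into the benchmark data.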
+
+### Layer 1: Fitness Gate (Mandatory)
+
+Every model change must pass the fitness gate. A change is "acceptable" only if:
+
+```typescript
+interface ModelFitnessGate {
+  // Agent's current score with existing model
+  previous_score: number;
+  
+  // Agent's score with proposed model  
+  proposed_score: number;
+  
+  // Absolute minimum score for any agent
+  min_global_threshold: number;  // e.g. 75
+  
+  // Maximum regression allowed, as a positive magnitude
+  max_regression: number;  // e.g. 3 points
+
+  // Is proposed model in agent's top-N from heatmap? (not checked by isChangeAcceptable below)
+  top_n_required: number;  // e.g. top-3
+}
+
+function isChangeAcceptable(gate: ModelFitnessGate): boolean {
+  if (gate.proposed_score < gate.min_global_threshold) return false;
+  if (gate.proposed_score < gate.previous_score - gate.max_regression) return false;
+  return true;
+}
+```
+
+**Hard rule:** If `proposed_score < previous_score - 3`, the change MUST be rejected with a clear error. No exceptions.
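To make the two checks concrete, here is a small worked sketch using the thresholds quoted in this document (floor of 75, maximum loss of 3 points). `checkGate` and `GateVerdict` are illustrative names, not part of the codebase, and `maxRegression` is treated as a positive magnitude.

```typescript
type GateVerdict = { ok: boolean; reason: string };

// Applies the two documented rules: absolute floor first, then bounded regression.
function checkGate(
  previousScore: number,
  proposedScore: number,
  minThreshold = 75,
  maxRegression = 3, // positive magnitude: how many points may be lost
): GateVerdict {
  if (proposedScore < minThreshold) {
    return { ok: false, reason: `score ${proposedScore} is below threshold ${minThreshold}` };
  }
  const drop = previousScore - proposedScore;
  if (drop > maxRegression) {
    return { ok: false, reason: `regression of ${drop} exceeds max ${maxRegression}` };
  }
  return { ok: true, reason: "within limits" };
}

console.log(checkGate(92, 70)); // rejected: 70 is below the floor of 75
console.log(checkGate(92, 85)); // rejected: a drop of 7 exceeds 3
console.log(checkGate(92, 90)); // accepted: a drop of 2 is tolerated
console.log(checkGate(88, 92)); // accepted: an improvement
```

Because the floor is checked first, the 92 → 70 case is reported as a threshold failure and not as a regression. Swapping the order of the two checks would surface the regression message instead.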
+
+### Layer 2: Immutable Recommendations
+
+Recommendations in `model-benchmarks.json` are append-only. Once a recommendation is generated, a sync cannot silently overwrite it; it can only be superseded by a new recommendation with a later timestamp.
+
+```json
+{
+  "recommendations": [
+    {
+      "agent": "lead-developer",
+      "from_model": "qwen3-coder:480b",
+      "to_model": "nemotron-3-super",
+      "score_delta": -22,
+      "status": "rejected",
+      "rejected_at": "2026-04-29T20:00:00Z",
+      "rejected_reason": "Downgrade: 92→70 exceeds max regression of 3"
+    }
+  ]
+}
+```
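One way the append-only rule could be enforced is sketched below. The field names follow the JSON example above, except `created_at`, which is an assumed generic timestamp field; `appendRecommendation` and `latestFor` are hypothetical helpers.

```typescript
// Fields mirror the JSON example; created_at is an assumed timestamp field.
type Recommendation = {
  agent: string;
  from_model: string;
  to_model: string;
  score_delta: number;
  status: "accepted" | "rejected";
  created_at: string; // ISO 8601, UTC
};

// The effective recommendation for an agent is the one with the latest timestamp.
function latestFor(log: readonly Recommendation[], agent: string): Recommendation | undefined {
  let latest: Recommendation | undefined;
  for (const rec of log) {
    if (rec.agent !== agent) continue;
    if (latest === undefined || Date.parse(rec.created_at) > Date.parse(latest.created_at)) {
      latest = rec;
    }
  }
  return latest;
}

// Returns a new log with the entry appended; existing entries are never edited.
function appendRecommendation(
  log: readonly Recommendation[],
  entry: Recommendation,
): Recommendation[] {
  const latest = latestFor(log, entry.agent);
  if (latest !== undefined && Date.parse(entry.created_at) <= Date.parse(latest.created_at)) {
    throw new Error(`Recommendation for ${entry.agent} must be newer than ${latest.created_at}`);
  }
  return log.concat([entry]);
}

const firstEntry: Recommendation = {
  agent: "lead-developer",
  from_model: "qwen3-coder:480b",
  to_model: "nemotron-3-super",
  score_delta: -22,
  status: "rejected",
  created_at: "2026-04-29T20:00:00Z",
};
const recLog = appendRecommendation([], firstEntry);
console.log(latestFor(recLog, "lead-developer")?.status); // "rejected"
```

Returning a new array instead of mutating the input keeps earlier entries intact, so a later sync has no code path that edits history in place.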
+
+### Layer 3: Sync Direction Lock
+
+All sync scripts must declare their direction explicitly:
+
+```typescript
+// ✅ CORRECT: benchmarks → configs
+// src: model-benchmarks.json
+// dst: capability-index.yaml, agents/*.md, kilo-meta.json
+// validates: fitness gate
+
+// ❌ INCORRECT: configs → benchmarks
+// This should NEVER happen. Benchmarks come from heatmap analytics only.
+```
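The direction lock could also be checked in code instead of only in comments. A minimal sketch, assuming each sync script exposes a `SyncDirection` declaration (a hypothetical type) that a shared `assertDirection` helper validates before any work starts:

```typescript
// Hypothetical declaration each sync script would provide.
type SyncDirection = { src: string; dst: string[] };

const BENCHMARKS_FILE = "agent-evolution/data/model-benchmarks.json";

// Throws unless benchmarks are the source and never a destination.
function assertDirection(declared: SyncDirection): void {
  if (declared.src !== BENCHMARKS_FILE) {
    throw new Error(`Sync source must be ${BENCHMARKS_FILE}, got ${declared.src}`);
  }
  if (declared.dst.indexOf(BENCHMARKS_FILE) !== -1) {
    throw new Error("Benchmarks file must never be a sync destination");
  }
}

const allowedSync: SyncDirection = {
  src: BENCHMARKS_FILE,
  dst: [".kilo/capability-index.yaml", "kilo-meta.json"],
};
assertDirection(allowedSync); // passes silently

const reversedSync: SyncDirection = {
  src: ".kilo/capability-index.yaml",
  dst: [BENCHMARKS_FILE],
};
try {
  assertDirection(reversedSync);
} catch (err) {
  console.error((err as Error).message);
}
```

A script that declares configs → benchmarks then fails before it reads or writes anything.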
+
+### Layer 4: Diff Report on Every Sync
+
+Before writing any file, the sync script must produce:
+
+```
+=== Model Sync Diff Report ===
+Agent              Old Model              Old Score  New Model              New Score  Status
+lead-developer     qwen3-coder:480b       92★        nemotron-3-super       70         ⚠️ REJECTED (regression 22 > max 3)
+system-analyst     glm-5.1                90★        nemotron-3-super       74         ⚠️ REJECTED (regression 16 > max 3)
+```
+
+No files are modified until the diff has been reviewed. The `--auto-approve` flag skips the review for improvements only.
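A sketch of how such a report could be built as a plain string before any write happens. `SyncChange`, `pad`, and `formatDiffReport` are illustrative names, and only the regression rule is shown here; the threshold check is omitted for brevity.

```typescript
type SyncChange = {
  agent: string;
  oldModel: string;
  oldScore: number;
  newModel: string;
  newScore: number;
};

// Right-pads text with spaces so the columns line up.
function pad(text: string, width: number): string {
  let out = text;
  while (out.length < width) out += " ";
  return out;
}

// Builds the whole report in memory; the caller decides whether to proceed.
function formatDiffReport(changes: readonly SyncChange[], maxRegression = 3): string {
  const lines = ["=== Model Sync Diff Report ==="];
  for (const c of changes) {
    const drop = c.oldScore - c.newScore;
    const verdict = drop > maxRegression
      ? `REJECTED (regression ${drop} > max ${maxRegression})`
      : "OK";
    lines.push(
      pad(c.agent, 19) + pad(c.oldModel, 23) + pad(String(c.oldScore), 11) +
      pad(c.newModel, 23) + pad(String(c.newScore), 11) + verdict,
    );
  }
  return lines.join("\n");
}

console.log(formatDiffReport([
  { agent: "lead-developer", oldModel: "qwen3-coder:480b", oldScore: 92, newModel: "nemotron-3-super", newScore: 70 },
  { agent: "system-analyst", oldModel: "glm-5.1", oldScore: 90, newModel: "nemotron-3-super", newScore: 74 },
]));
```

Returning a string instead of printing inside the function makes it easy to show the report, log it, or attach it to a rejected recommendation.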
+
+### Layer 5: Recovery Checkpoint
+
+Before any sync that touches model assignments, create a git checkpoint:
+
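+```bash
+# In the sync script
+ts="$(date +%s)"  # computed once so the stash and branch names match
+git stash push -m "pre-model-sync-$ts"  # shelves uncommitted changes; restore with git stash pop
+git checkout -b "auto/model-sync-$ts"
+```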
+
+If the fitness gate rejects changes, roll back automatically:
+```bash
+git checkout HEAD -- kilo-meta.json .kilo/capability-index.yaml .kilo/agents/
+```
+
+## Implementation
+
+### 1. Fitness Gate Module
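+```typescript
+// agent-evolution/scripts/lib/fitness-gate.ts
+
+// Assumed shape for illustration: scores[agent][model] is the heatmap score.
+export interface ModelBenchmarks {
+  scores: Record<string, Record<string, number>>;
+}
+
+export type GateResult =
+  | { acceptable: true; delta: number }
+  | { acceptable: false; reason: string };
+
+export class ModelFitnessGate {
+  constructor(
+    private benchmarks: ModelBenchmarks,
+    private minThreshold = 75,
+    private maxRegression = 3
+  ) {}
+
+  // Fails loudly on an unknown agent/model pair rather than guessing a score.
+  private getAgentModelScore(agent: string, model: string): number {
+    const score = this.benchmarks.scores[agent]?.[model];
+    if (score === undefined) {
+      throw new Error(`No heatmap score for ${agent} with ${model}`);
+    }
+    return score;
+  }
+
+  validateChange(agent: string, fromModel: string, toModel: string): GateResult {
+    const oldScore = this.getAgentModelScore(agent, fromModel);
+    const newScore = this.getAgentModelScore(agent, toModel);
+
+    // Absolute floor: no agent may run on a model scoring below the threshold.
+    if (newScore < this.minThreshold) {
+      return { acceptable: false, reason: `Score ${newScore} below threshold ${this.minThreshold}` };
+    }
+
+    // Bounded regression: losing more than maxRegression points is a downgrade.
+    if (newScore < oldScore - this.maxRegression) {
+      return { acceptable: false, reason: `Regression ${oldScore}→${newScore} exceeds max ${this.maxRegression}` };
+    }
+
+    return { acceptable: true, delta: newScore - oldScore };
+  }
+}
+```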
+
+### 2. Sync Wrapper
+```javascript
+// agent-evolution/scripts/sync-with-guard.cjs (wraps any sync script)
+// Assumes ./lib/fitness-gate exports validateAllChanges, and that the wrapped
+// sync script supplies detectChanges() and applyChanges(); neither is defined here.
+const { validateAllChanges } = require('./lib/fitness-gate');
+const changes = detectChanges();  // what the sync WOULD do
+const report = validateAllChanges(changes);
+
+if (report.rejections.length > 0) {
+  console.error('❌ FITNESS GATE BLOCKED:');
+  report.rejections.forEach(r => console.error(`  ${r.agent}: ${r.reason}`));
+  process.exit(1);  // nothing has been written at this point
+}
+
+console.log(`✅ All ${changes.length} changes passed fitness gate`);
+applyChanges(changes);
+```
+
+### 3. Git Checkpoint
+```bash
+#!/bin/bash
+# Every sync script must run this first
+set -e
+STASH_NAME="model-sync-$(date +%s)"
+git stash push -m "$STASH_NAME" -- kilo-meta.json .kilo/capability-index.yaml .kilo/agents/
+```
+
+## Verification Checklist
+
+After implementing the guard:
+
+- [ ] `sync-benchmarks-from-yaml.cjs` validates every model change against heatmap scores
+- [ ] Downgrades of more than 3 points are rejected with a clear error
+- [ ] Diff report is printed before any file is written
+- [ ] Git checkpoint is created before sync
+- [ ] `model-benchmarks.json` has `source: "heatmap"` locked field
+- [ ] All sync scripts declare direction: `benchmarks → configs` only
+- [ ] Fitness gate runs as a pre-commit hook and in the CI pipeline
+
+## Integration with Existing Workflow
+
+The guard integrates at the existing `/evolution` command step 0:
+
+```markdown
+## Step 0: Model Research & Guard
+1. Run heatmap analysis → produce raw scores
+2. **Fitness Gate** validates all proposed changes
+3. If any downgrade >3 points → HALT, report to human
+4. If all pass → generate recommendations append-only
+5. Sync to configs with direction lock: benchmarks → configs
+```
+
+---
+
+**Bottom line:** Never again should a script silently replace a ★-optimal model with one scoring 20+ points lower.