fix: restore optimal v3 models + add fitness gate protection

- Restore all 30 agents to v3.html heatmap optimal models:
  * frontend-developer: qwen3-coder -> minimax-m2.5 (92★)
  * devops-engineer: nemotron-3-super -> kimi-k2.6:cloud (88★)
  * browser-automation: qwen3-coder -> kimi-k2.6:cloud (86★)
  * agent-architect: glm-5.1 -> kimi-k2.6:cloud (86★)
- Add Model Evolution Guard system:
  * agent-evolution/scripts/lib/fitness-gate.cjs
  * Rejects downgrades >3 points or below score 75
  * Produces detailed diff report before any file modifications
  * Normalized model ID lookup (v3.html ':' vs JSON '-')
- Update sync-benchmarks-from-yaml.cjs with fitness gate
- Update model-benchmarks.json with v3 optimal assignments
- Rebuild research-dashboard.html (104KB, 30 agents, 11 models)
- Add model-evolution-guard.md architecture documentation
- Add v3-optimal-models.json as source-of-truth reference

Fixes regression introduced by commit 3badb25 where models were silently downgraded from heatmap optimal to inferior assignments.
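
For reviewers, the gate rule described above could look roughly like the sketch below. This is not the contents of fitness-gate.cjs: the function names, the shape of the score map, and the reading of the rule (reject when the score drops by more than 3 points, or when the proposed score is under 75) are assumptions made for illustration. The score of 80 for qwen3-coder is invented sample data; only the 92 for minimax-m2.5 comes from this message.

```javascript
// Illustrative sketch only; names and data shapes are hypothetical.
const MAX_DROP = 3;   // assumed meaning of "downgrades >3 points"
const MIN_SCORE = 75; // assumed meaning of "below score 75"

// Normalize IDs so the v3.html form ("kimi-k2.6:cloud") and the
// JSON form ("kimi-k2.6-cloud") compare equal.
function normalizeModelId(id) {
  return id.trim().toLowerCase().replace(/:/g, "-");
}

// Find a score in a { modelId: score } map using normalized keys.
function lookupScore(scores, modelId) {
  const wanted = normalizeModelId(modelId);
  for (const [key, value] of Object.entries(scores)) {
    if (normalizeModelId(key) === wanted) return value;
  }
  return undefined;
}

// Decide whether one agent may move from `current` to `proposed`.
// Returns a small report entry instead of writing any files.
function checkChange(scores, agent, current, proposed) {
  const before = lookupScore(scores, current);
  const after = lookupScore(scores, proposed);
  if (after === undefined) {
    return { agent, ok: false, reason: "no score for proposed model" };
  }
  if (after < MIN_SCORE) {
    return { agent, ok: false, reason: `score ${after} is below ${MIN_SCORE}` };
  }
  if (before !== undefined && before - after > MAX_DROP) {
    return { agent, ok: false, reason: `downgrade of ${before - after} points` };
  }
  return { agent, ok: true, reason: "accepted" };
}

// Sample data: 92 is from the commit message, 80 is made up.
const sampleScores = { "minimax-m2.5": 92, "qwen3-coder": 80 };
const rejected = checkChange(sampleScores, "frontend-developer", "minimax-m2.5", "qwen3-coder");
const accepted = checkChange(sampleScores, "frontend-developer", "qwen3-coder", "minimax-m2.5");
console.log(rejected, accepted);
```

Returning a report entry per agent, rather than throwing, keeps the check side-effect free, so a full diff report can be collected before any file is modified.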
2026-04-29 23:19:16 +01:00
parent d1516f4856
commit 9e48a4960e
14 changed files with 2850 additions and 2049 deletions
--- a/kilo.jsonc
+++ b/kilo.jsonc
@@ -45,7 +45,7 @@
    "system-analyst": {
      "description": "Designs technical specifications, data schemas, and API contracts before implementation",
      "mode": "subagent",
-      "model": "ollama-cloud/nemotron-3-super"
+      "model": "qwen/qwen3.6-plus:free"
    },
    "sdet-engineer": {
      "description": "Writes tests following TDD methodology. Tests MUST fail initially (Red phase)",
@@ -68,7 +68,7 @@
    "lead-developer": {
      "description": "Primary code writer for backend and core logic. Writes implementation to pass tests",
      "mode": "subagent",
-      "model": "ollama-cloud/nemotron-3-super",
+      "model": "ollama-cloud/qwen3-coder:480b",
      "color": "#DC2626",
      "permission": {
        "read": "allow",