fix: restore optimal v3 models + add fitness gate protection

- Restore all 30 agents to v3.html heatmap optimal models:
  * frontend-developer: qwen3-coder -> minimax-m2.5 (92★)
  * devops-engineer: nemotron-3-super -> kimi-k2.6:cloud (88★)
  * browser-automation: qwen3-coder -> kimi-k2.6:cloud (86★)
  * agent-architect: glm-5.1 -> kimi-k2.6:cloud (86★)
- Add Model Evolution Guard system:
  * agent-evolution/scripts/lib/fitness-gate.cjs
  * Rejects downgrades >3 points or below score 75
  * Produces detailed diff report before any file modifications
  * Normalized model ID lookup (v3.html ':' vs JSON '-')
- Update sync-benchmarks-from-yaml.cjs with fitness gate
- Update model-benchmarks.json with v3 optimal assignments
- Rebuild research-dashboard.html (104KB, 30 agents, 11 models)
- Add model-evolution-guard.md architecture documentation
- Add v3-optimal-models.json as source-of-truth reference

Fixes regression introduced by commit 3badb25 where models were
silently downgraded from heatmap optimal to inferior assignments.
This commit is contained in:
¨NW¨
2026-04-29 23:19:16 +01:00
parent d1516f4856
commit 9e48a4960e
14 changed files with 2850 additions and 2049 deletions

View File

@@ -45,7 +45,7 @@
"system-analyst": {
"description": "Designs technical specifications, data schemas, and API contracts before implementation",
"mode": "subagent",
"model": "ollama-cloud/nemotron-3-super"
"model": "qwen/qwen3.6-plus:free"
},
"sdet-engineer": {
"description": "Writes tests following TDD methodology. Tests MUST fail initially (Red phase)",
@@ -68,7 +68,7 @@
"lead-developer": {
"description": "Primary code writer for backend and core logic. Writes implementation to pass tests",
"mode": "subagent",
"model": "ollama-cloud/nemotron-3-super",
"model": "ollama-cloud/qwen3-coder:480b",
"color": "#DC2626",
"permission": {
"read": "allow",