
9e48a4960e fix: restore optimal v3 models + add fitness gate protection

- Restore all 30 agents to v3.html heatmap optimal models:
  * frontend-developer: qwen3-coder -> minimax-m2.5 (92★)
  * devops-engineer: nemotron-3-super -> kimi-k2.6:cloud (88★)
  * browser-automation: qwen3-coder -> kimi-k2.6:cloud (86★)
  * agent-architect: glm-5.1 -> kimi-k2.6:cloud (86★)
- Add Model Evolution Guard system:
  * agent-evolution/scripts/lib/fitness-gate.cjs
  * Rejects downgrades >3 points or below score 75
  * Produces detailed diff report before any file modifications
  * Normalized model ID lookup (v3.html ':' vs JSON '-')
- Update sync-benchmarks-from-yaml.cjs with fitness gate
- Update model-benchmarks.json with v3 optimal assignments
- Rebuild research-dashboard.html (104KB, 30 agents, 11 models)
- Add model-evolution-guard.md architecture documentation
- Add v3-optimal-models.json as source-of-truth reference

Fixes regression introduced by commit 3badb25 where models were
silently downgraded from heatmap optimal to inferior assignments.
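
The gate rules described in this commit message can be sketched as follows. This is a minimal illustration, not the contents of fitness-gate.cjs, which is not shown here. The type names, function names, and the score of 85 in the usage note are hypothetical. Only the two thresholds and the ':' versus '-' normalization come from the message, and applying the minimum score to every changed assignment is an assumed reading of it.

```typescript
// Hypothetical shapes; the real fitness-gate.cjs may model this differently.
interface Assignment {
  agent: string;
  model: string;
  score: number;
}

interface GateResult {
  agent: string;
  accepted: boolean;
  reason: string;
}

const MAX_DROP = 3; // commit message: downgrades of more than 3 points are rejected
const MIN_SCORE = 75; // commit message: scores below 75 are rejected

// Normalize model IDs so "kimi-k2.6:cloud" and "kimi-k2.6-cloud" compare equal.
function normalizeModelId(id: string): string {
  return id.trim().toLowerCase().replace(/:/g, "-");
}

// Decide whether a proposed reassignment may replace the current one.
function checkChange(current: Assignment, proposed: Assignment): GateResult {
  const agent = current.agent;
  if (normalizeModelId(current.model) === normalizeModelId(proposed.model)) {
    return { agent, accepted: true, reason: "unchanged" };
  }
  if (proposed.score < MIN_SCORE) {
    return { agent, accepted: false, reason: `score ${proposed.score} is below ${MIN_SCORE}` };
  }
  const drop = current.score - proposed.score;
  if (drop > MAX_DROP) {
    return { agent, accepted: false, reason: `downgrade of ${drop} points exceeds ${MAX_DROP}` };
  }
  return { agent, accepted: true, reason: "within limits" };
}
```

For example, with a made-up score of 85 for the replacement, `checkChange({ agent: "frontend-developer", model: "minimax-m2.5", score: 92 }, { agent: "frontend-developer", model: "qwen3-coder", score: 85 })` would be rejected, because the 7-point drop exceeds the 3-point limit.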

2026-04-29 23:19:16 +01:00

1.2 KiB

Executable File


description: Primary code writer for backend and core logic. Writes implementation to pass tests
mode: subagent
model: ollama-cloud/qwen3-coder:480b
variant: thinking
color: #DC2626
permission:
  read, edit, write, bash, glob, grep: allow
  task:
    *: deny
    code-skeptic: allow
    orchestrator: allow
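
One way to read the task permissions listed above is as a lookup with a catch-all. The sketch below is illustrative only: the `resolve` function, the exact-name-first precedence, and the default of deny when no `*` rule exists are assumptions, not the documented behavior of the agent runtime.

```typescript
type Action = "allow" | "deny";

// Rules as listed in the permission table; "*" is the catch-all entry.
const task: Record<string, Action> = {
  "*": "deny",
  "code-skeptic": "allow",
  "orchestrator": "allow",
};

// Assumption: an exact agent name takes precedence over "*",
// and a missing "*" rule falls back to deny.
function resolve(rules: Record<string, Action>, agent: string): Action {
  if (Object.prototype.hasOwnProperty.call(rules, agent)) return rules[agent];
  if (Object.prototype.hasOwnProperty.call(rules, "*")) return rules["*"];
  return "deny";
}
```

Under these assumptions, `resolve(task, "code-skeptic")` yields "allow", while any agent not named in the table, such as `resolve(task, "frontend-developer")`, yields "deny".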

Lead Developer

Role

Primary code writer: make tests pass, write clean idiomatic code.

Behavior

- Follow the tests: make the code pass what the SDET wrote
- Write clean code: early returns, const, single-word names
- No premature optimization: make it work first
- Handle errors properly: no empty catch blocks
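
The style rules above might look like this in practice. This is a hypothetical sketch: `parse`, `load`, and the port-parsing scenario are invented for illustration and do not come from the agent file.

```typescript
// Hypothetical example: parse a port number from a config string.
function parse(raw: string): number {
  const text = raw.trim();
  // Early returns (here, early throws) keep the happy path unindented.
  if (text === "") throw new Error("port is empty");
  const port = Number(text);
  if (!Number.isInteger(port)) throw new Error(`port is not an integer: ${text}`);
  if (port < 1 || port > 65535) throw new Error(`port out of range: ${port}`);
  return port;
}

function load(raw: string, fallback: number): number {
  try {
    return parse(raw);
  } catch (err) {
    // The error is reported, not swallowed; an empty catch would hide bad config.
    const reason = err instanceof Error ? err.message : String(err);
    console.warn(`using fallback ${fallback}: ${reason}`);
    return fallback;
  }
}
```

Every binding is a `const`, names are single words, and there is no caching or other optimization until a test demands it.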

Delegates

Agent	When
code-skeptic	After implementation, for review

Output

bun test test/path/test.test.ts: all tests passing

Handoff

- Run all tests, ensure green
- Document edge cases handled
- Delegate: code-skeptic
