¨NW¨
fb552e0020
feat: v3 optimal model assignments + fitness gate
...
- Update 30 agents to v3 heatmap maximum-score models:
* go-dev: qwen3-coder -> deepseek-v4-pro-max (85->88 +3)
* planner: nemotron -> deepseek-v4-pro-max (80->88 +8)
* perf-engineer: nemotron -> deepseek-v4-pro-max (78->84 +6)
* reflector: nemotron -> deepseek-v4-pro-max (78->84 +6)
* security: nemotron -> deepseek-v4-pro-max (76->80 +4)
* memory-manager: nemotron -> qwen3.6-plus (86->87 +1)
* frontend: kimi-k2.5 -> minimax-m2.5 (92)
* the-fixer: minimax-m2.5 -> kimi-k2.6 (88->90 +2)
* browser-auto: kimi-k2.6 -> qwen3-coder (86->87 +1)
* prompt-opt: glm-5.1 -> qwen3.6-plus (82->83 +1)
* backend: deepseek-v3.2 -> qwen3-coder (91)
* capability-analyst: nemotron -> glm-5.1 (85)
* release-man: devstral-2 -> glm-5.1 (82)
* evaluator: nemotron -> glm-5.1 (86)
* workflow-arch: gpt-oss -> glm-5.1 (84)
- Add Model Evolution Guard:
* fitness-gate.cjs: rejects downgrades of more than 3 points or scores below 75
* Normalized model ID lookup (v3.html ':' vs JSON '-')
* Diff report before any file modifications
- Update sync-benchmarks-from-yaml.cjs with fitness gate
- Sync kilo-meta.json, kilo.jsonc, .md agent files
- Rebuild research-dashboard.html (104KB, 30 agents, 11 models)
Total improvement: +105 points across 11 agents
Source: v3.html heatmap IF-adjusted composite scores
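
The gate rule described above can be sketched roughly as follows. This is a minimal illustration, not the contents of fitness-gate.cjs: only the two thresholds (downgrade >3 points, score <75) come from this commit message; the function names `checkFitness` and `buildDiffReport` and the shape of a change record are assumptions.

```javascript
// Hypothetical sketch of a fitness gate. Only the two thresholds are taken
// from the commit message; all names and data shapes are assumed.
const MAX_DOWNGRADE = 3; // a drop of more than 3 points is rejected
const MIN_SCORE = 75;    // any assignment scoring below 75 is rejected

// Decide whether one proposed model change is acceptable.
function checkFitness(change) {
  const delta = change.newScore - change.oldScore;
  if (change.newScore < MIN_SCORE) {
    return { ok: false, delta, reason: `score ${change.newScore} is below ${MIN_SCORE}` };
  }
  if (-delta > MAX_DOWNGRADE) {
    return { ok: false, delta, reason: `downgrade of ${-delta} points exceeds limit of ${MAX_DOWNGRADE}` };
  }
  return { ok: true, delta, reason: "ok" };
}

// Build a human-readable report before anything is written to disk.
// The whole batch is accepted only if every single change passes.
function buildDiffReport(changes) {
  let accepted = true;
  const lines = changes.map((c) => {
    const r = checkFitness(c);
    if (!r.ok) accepted = false;
    const sign = r.delta >= 0 ? "+" : "";
    const base = `${r.ok ? "PASS" : "REJECT"} ${c.agent}: ${c.from} -> ${c.to} ` +
      `(${c.oldScore}->${c.newScore} ${sign}${r.delta})`;
    return r.ok ? base : `${base} [${r.reason}]`;
  });
  return { accepted, lines };
}

// Usage: the planner upgrade from this commit, plus its reverse as a
// made-up downgrade that the gate should refuse.
const report = buildDiffReport([
  { agent: "planner", from: "nemotron", to: "deepseek-v4-pro-max", oldScore: 80, newScore: 88 },
  { agent: "planner", from: "deepseek-v4-pro-max", to: "nemotron", oldScore: 88, newScore: 80 },
]);
console.log(report.lines.join("\n"));
console.log(report.accepted ? "apply changes" : "abort: no files modified");
```

Computing the full report first and writing files only when `accepted` is true mirrors the "diff report before any file modifications" behaviour listed above.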
2026-04-30 08:42:10 +01:00
¨NW¨
9e48a4960e
fix: restore optimal v3 models + add fitness gate protection
...
- Restore all 30 agents to v3.html heatmap optimal models:
* frontend-developer: qwen3-coder -> minimax-m2.5 (92★)
* devops-engineer: nemotron-3-super -> kimi-k2.6:cloud (88★)
* browser-automation: qwen3-coder -> kimi-k2.6:cloud (86★)
* agent-architect: glm-5.1 -> kimi-k2.6:cloud (86★)
- Add Model Evolution Guard system:
* agent-evolution/scripts/lib/fitness-gate.cjs
* Rejects downgrades of more than 3 points or scores below 75
* Produces detailed diff report before any file modifications
* Normalized model ID lookup (v3.html ':' vs JSON '-')
- Update sync-benchmarks-from-yaml.cjs with fitness gate
- Update model-benchmarks.json with v3 optimal assignments
- Rebuild research-dashboard.html (104KB, 30 agents, 11 models)
- Add model-evolution-guard.md architecture documentation
- Add v3-optimal-models.json as source-of-truth reference
Fixes regression introduced by commit 3badb25 where models were
silently downgraded from heatmap optimal to inferior assignments.
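
The normalized model ID lookup mentioned above could look something like this. It is a sketch under one assumption: that the only difference between the two spellings is ':' versus '-' (for example `kimi-k2.6:cloud` in v3.html against `kimi-k2.6-cloud` in JSON). The helper names and the score table are hypothetical.

```javascript
// Hypothetical sketch: map both ID spellings onto one canonical key so a
// score recorded under "kimi-k2.6:cloud" is found via "kimi-k2.6-cloud".
function normalizeModelId(id) {
  return id.trim().toLowerCase().replace(/:/g, "-");
}

// Index a { modelId: score } table by normalized ID.
function buildScoreIndex(scores) {
  const index = new Map();
  for (const [id, score] of Object.entries(scores)) {
    index.set(normalizeModelId(id), score);
  }
  return index;
}

// Return the score for a model ID in either spelling, or null if unknown.
function lookupScore(index, id) {
  const key = normalizeModelId(id);
  return index.has(key) ? index.get(key) : null;
}

// Usage: scores for a single agent (devops-engineer), keyed the way the
// heatmap spells them. The 88 is the value quoted in this commit; the
// second entry is made up.
const devopsScores = buildScoreIndex({ "kimi-k2.6:cloud": 88, "example-model:free": 70 });
console.log(lookupScore(devopsScores, "kimi-k2.6-cloud"));
console.log(lookupScore(devopsScores, "unknown-model"));
```

Without this step, a lookup that misses on spelling alone would look like a missing score, which is one way a silent downgrade can slip through.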
2026-04-29 23:19:16 +01:00
¨NW¨
3badb259cc
feat: bidirectional research dashboard + agent config fixes
...
- Integrate apaw_agent_model_research_v3.html as standalone dashboard
- Add model-benchmarks.json with 32 agents, 11 scored models, 11 recommendations
- Add build-research-dashboard.ts: inject live data into template → standalone HTML
- Add rebuild-template.cjs: regenerate template from v3.html source
- Add sync-benchmarks-from-yaml.cjs: sync YAML → JSON round-trip
- Add sync-model-research.ts: apply recommendation matrix to config files
- Add model-benchmarks.schema.json and model-research.schema.json for validation
- Add bidirectional-data-flow.md architecture documentation
- Add log-execution.cjs pipeline hook
- Update capability-index.yaml: add fallback_models, failover_strategy
- Update kilo-meta.json, kilo.jsonc, KILO_SPEC.md with synced models
- Update evolution.md / research.md / self-evolution.md / evolutionary-sync.md docs
- Fix security-auditor.md: quote YAML color (#DC2626)
- Fix orchestrator.md: remove duplicate devops-engineer key
- Build research-dashboard.html (106KB standalone) + dated archive
2026-04-29 21:04:22 +01:00
¨NW¨
2ae7789802
fix: sync kilo.jsonc + capability-index.yaml after evolution upgrade
...
- kilo.jsonc: manually fix 7 agent models (sync script does not write back)
- capability-index.yaml: orchestrator model glm-5.1 → kimi-k2.6:cloud
- evolutionary-sync.md: add manual-update rules for kilo.jsonc + capability-index.yaml
- Add cloud suffix verification and per-file verification checklist
- Document finding: sync script reads kilo.jsonc but never writes back
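
A per-file verification pass of the kind described above might be sketched like this. The commit does not say how the checklist is carried out, so everything here is assumed: `findMismatches`, the in-memory agent-to-model maps (real files would need parsing first), and the idea that a "suffix" problem means the base model name matches but the part after ':' differs.

```javascript
// Hypothetical sketch: compare agent -> model maps taken from several config
// files against a reference map and classify each difference.
function findMismatches(reference, files) {
  const problems = [];
  for (const [fileName, models] of Object.entries(files)) {
    for (const [agent, expected] of Object.entries(reference)) {
      const actual = models[agent];
      if (actual === expected) continue;
      // Same base name but a different or missing ":suffix" (e.g. ":cloud").
      const suffixOnly =
        actual !== undefined && actual.split(":")[0] === expected.split(":")[0];
      problems.push({
        file: fileName,
        agent,
        expected,
        actual: actual === undefined ? null : actual,
        kind: suffixOnly ? "suffix" : "model",
      });
    }
  }
  return problems;
}

// Usage: the orchestrator model from this commit as the reference, with two
// made-up stale file states.
const problems = findMismatches(
  { orchestrator: "kimi-k2.6:cloud" },
  {
    "kilo.jsonc": { orchestrator: "kimi-k2.6" },
    "capability-index.yaml": { orchestrator: "glm-5.1" },
  }
);
for (const p of problems) {
  console.log(`${p.file}: ${p.agent} expected ${p.expected}, found ${p.actual} (${p.kind})`);
}
```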
2026-04-27 16:49:25 +01:00
¨NW¨
dbea8c90db
feat: evolutionary agent model upgrades based on recommendation matrix
...
- devops-engineer: deepseek-v3.2 → kimi-k2.6:cloud (★88)
- browser-automation: glm-5 → kimi-k2.6:cloud (★86)
- visual-tester: glm-5 → qwen3-coder:480b (★82)
- agent-architect: nemotron-3-super → kimi-k2.6:cloud (★86)
- orchestrator: glm-5 → kimi-k2.6:cloud (dispatch critical)
- product-owner: glm-5 → glm-5.1 (★84)
- prompt-optimizer: qwen3.6-plus:free → glm-5.1 (stable fallback)
- system-analyst: qwen3.6-plus:free → glm-5.1 (★90)
- Add autonomous-mode.md rule for zero-confirmation workflow
2026-04-27 12:09:36 +01:00
¨NW¨
b517ad5dad
feat: add synchronization system for agent definitions
...
- Add kilo.jsonc (official Kilo Code config)
- Add kilo-meta.json (source of truth for sync)
- Add evolutionary-sync.md rule for documentation
- Add scripts/sync-agents.cjs for validation
- Fix agent mode mismatches (8 agents had wrong mode)
- Update KILO_SPEC.md and AGENTS.md
The sync system ensures:
- kilo-meta.json is the single source of truth
- Agent .md files frontmatter matches meta
- KILO_SPEC.md tables stay synchronized
- AGENTS.md category tables stay synchronized
Run: node scripts/sync-agents.cjs --check
Fix: node scripts/sync-agents.cjs --fix
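
As a rough illustration of the "frontmatter matches meta" check listed above (not the actual sync-agents.cjs logic, which this message does not show): the sketch assumes frontmatter is a simple block of `key: value` lines between `---` markers and that each agent's meta entry carries `mode` and `model` fields. The mode and model values in the usage example are placeholders.

```javascript
// Hypothetical sketch: read flat "key: value" frontmatter from an agent .md
// file and compare selected keys against the meta entry (source of truth).
function parseFrontmatter(markdown) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  const fields = {};
  if (!match) return fields;
  for (const line of match[1].split(/\r?\n/)) {
    const m = /^([A-Za-z_][\w-]*):\s*(.*)$/.exec(line);
    // Strip one pair of surrounding quotes, e.g. "#DC2626" -> #DC2626.
    if (m) fields[m[1]] = m[2].trim().replace(/^["']|["']$/g, "");
  }
  return fields;
}

// Return one record per key whose frontmatter value differs from meta.
function checkAgent(name, meta, markdown, keys = ["mode", "model"]) {
  const frontmatter = parseFrontmatter(markdown);
  return keys
    .filter((key) => meta[key] !== undefined && frontmatter[key] !== meta[key])
    .map((key) => ({
      agent: name,
      key,
      expected: meta[key],
      actual: frontmatter[key] === undefined ? null : frontmatter[key],
    }));
}

// Usage with placeholder values.
const agentFile = "---\nmode: subagent\nmodel: glm-5\n---\n# Product Owner\n";
const mismatches = checkAgent("product-owner", { mode: "subagent", model: "glm-5.1" }, agentFile);
console.log(mismatches);
```

A `--check` style run would only report these records, while a `--fix` style run would rewrite the frontmatter from meta; the sketch covers the reporting half only.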
2026-04-05 13:19:54 +01:00