Deploy Bot ccca685fdc feat(agent-models): assign best-fit models from real-fit evaluation report

Updated all 36 agents to their highest-scoring model per real-fit-report.json:
- kimi-k2.6:  code-skeptic(91.2), system-analyst(92.0), sdet-engineer(97.0),
              lead-developer(72.5), security-auditor(63.8), history-miner,
              browser-automation, evolution-prompt, product-owner,
              orchestrator, release-manager, reflector
- glm-5.1:    devops-engineer(96.2), evaluator, the-fixer, memory-manager,
              performance-engineer, prompt-optimizer, workflow-architect,
              visual-tester, flutter-developer, incident-responder
- qwen3-coder:480b: architect-indexer, frontend-developer, go-developer,
                    markdown-validator, pipeline-judge, workflow-cross-checker,
                    evolution-skeptic, requirement-refiner
- deepseek-v4-pro: backend-developer, capability-analyst, planner,
                    php-developer, python-developer

Files updated:
- kilo-meta.json (source of truth)
- kilo.jsonc (runtime config)
- capability-index.yaml (routing)
- 30 agent .md frontmatters (via sync-agents.cjs)
- KILO_SPEC.md + AGENTS.md (auto-synced)
- real-fit-report.json (regenerated from DB)

2026-05-28 13:46:34 +01:00

README.md

APAW Agent Evolution Dashboard

Overview

This is a standalone HTML dashboard that visualizes agent model assignments, performance scores, and recommendations for the APAW codebase.

Features

Real-time agent model & performance tracking
Agent × Model compatibility heatmap
Performance impact analysis with Chart.js visualizations
Model recommendation engine with priority scoring
Evolution timeline and history tracking

Data Sources

The dashboard pulls data from three primary sources:

.kilo/agents/*.md - Agent definitions with model assignments, modes, colors, and descriptions
kilo-meta.json - Central registry of agent metadata, categories, and capabilities
model-benchmarks-verified.json - IF scores and context window data for all supported models

Build Process

The build-standalone-fixed.cjs script:

Parses all agent YAML frontmatter
Computes composite performance scores using IF scores and context windows
Generates model recommendations based on score improvements
Embeds unified JSON data directly into the HTML file
Updates JavaScript functions to use embedded data
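
The first and fourth steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of build-standalone-fixed.cjs: the helper names `parseFrontmatter` and `embedData`, the `/*__AGENT_DATA__*/` placeholder, and the sample field values are all assumptions made for the example. The parser handles only flat `key: value` fields, not full YAML.

```javascript
// Hypothetical helpers; the real build script may be structured differently.

// Extract flat `key: value` pairs from a leading `---` ... `---` frontmatter block.
// Nested YAML and multi-line values are deliberately ignored in this sketch.
function parseFrontmatter(markdown) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!match) return {};
  const fields = {};
  for (const line of match[1].split(/\r?\n/)) {
    const sep = line.indexOf(':');
    if (sep === -1 || /^\s/.test(line)) continue; // skip indented or malformed lines
    const key = line.slice(0, sep).trim();
    // Strip one pair of surrounding quotes, if present.
    const value = line.slice(sep + 1).trim().replace(/^["']|["']$/g, '');
    if (key) fields[key] = value;
  }
  return fields;
}

// Replace a placeholder token in the HTML template with the serialized data.
// The placeholder name is an assumption for this example.
function embedData(template, data, placeholder = '/*__AGENT_DATA__*/') {
  const json = JSON.stringify(data);
  // A replacer function keeps `$` sequences in the JSON from being interpreted.
  return template.replace(placeholder, () => json);
}

const sample = '---\nmodel: kimi-k2.6\nmode: subagent\ncolor: "#FF0000"\n---\nPrompt body';
const agent = parseFrontmatter(sample);
const html = embedData('<script>const DATA = /*__AGENT_DATA__*/;</script>', { agents: [agent] });
console.log(html);
```

Embedding the data at build time is what lets the dashboard open directly from disk without a server.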

Incremental DB Sync

The watch-db.cjs script provides incremental database synchronization:

Watches for changes in .kilo/agents/*.md and kilo-meta.json
Only processes changed files (incremental update)
Determines change type (model_change vs prompt_change)
Updates database with new versions and metadata
Exports updated data to JSON
Clean shutdown on SIGINT/SIGTERM
Configurable polling interval via WATCH_INTERVAL_MS env var
Logging to .kilo/logs/watch-db.log
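
As a rough sketch of how such a watcher could work, the example below hashes file contents to detect changes, classifies each change, and polls on a timer. It is not the actual watch-db.cjs: `classifyChange`, `splitAgentFile`, and `startWatcher` are hypothetical names, and the rule that a changed `model:` line means `model_change` (with everything else treated as `prompt_change`) is an assumption about how the two types are distinguished.

```javascript
const crypto = require('node:crypto');

// Hypothetical helpers; the real watcher may detect and classify changes differently.
const hash = (text) => crypto.createHash('sha256').update(text).digest('hex');

// Split an agent file into its `model:` value and the remaining text.
function splitAgentFile(markdown) {
  const match = /^model:\s*(.+)$/m.exec(markdown);
  return {
    model: match ? match[1].trim() : null,
    rest: markdown.replace(/^model:.*$/m, ''),
  };
}

// Returns null when nothing changed, otherwise 'model_change' or 'prompt_change'.
// If both the model and the prompt changed, this sketch reports 'model_change'.
function classifyChange(previous, current) {
  if (hash(previous) === hash(current)) return null;
  const before = splitAgentFile(previous);
  const after = splitAgentFile(current);
  return before.model !== after.model ? 'model_change' : 'prompt_change';
}

// Polling loop: `readAll` returns { path: content }, `onChange` receives each change.
function startWatcher(readAll, onChange, intervalMs) {
  let snapshot = readAll();
  const timer = setInterval(() => {
    const next = readAll();
    for (const [file, content] of Object.entries(next)) {
      const type = classifyChange(snapshot[file] ?? '', content);
      if (type) onChange({ file, type }); // only changed files are processed
    }
    snapshot = next;
  }, intervalMs);
  const stop = () => clearInterval(timer);
  process.once('SIGINT', stop); // clearing the timer lets the process exit cleanly
  process.once('SIGTERM', stop);
  return stop;
}

// Interval from the environment, falling back to 60 seconds.
const intervalMs = Number(process.env.WATCH_INTERVAL_MS) || 60000;

const original = '---\nmodel: glm-5.1\n---\nYou are a reviewer.';
console.log(classifyChange(original, original)); // null
console.log(classifyChange(original, original.replace('glm-5.1', 'kimi-k2.6'))); // model_change
console.log(classifyChange(original, original + ' Be strict.')); // prompt_change
```

Calling `startWatcher(readAll, onChange, intervalMs)` with a function that reads the watched files would start the loop. The example leaves it uncalled so that it terminates immediately.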

Validation

The build process ensures:

✅ No Unicode-escaped angle brackets (no \u003c or \u003e sequences) in the output
✅ Valid embedded JSON structure
✅ Clean standalone HTML file with no external dependencies
✅ Proper function updates (init, renderHeatmap, renderRecommendations)
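
The checks above could be approximated by a small post-build validator like the one below. This is a sketch under stated assumptions, not the build script's actual validation: the function name `validateStandalone` is hypothetical, and so is the assumption that the data lives in a `<script id="agent-data" type="application/json">` element. The function check only confirms that each named function is declared, not that it was updated correctly.

```javascript
// Hypothetical validator; the element id and the checks are assumptions for illustration.
function validateStandalone(html) {
  const errors = [];

  // 1. No escaped angle brackets such as \u003c or \u003e.
  if (/\\u003[ce]/i.test(html)) errors.push('escaped angle brackets found');

  // 2. The embedded data block exists and parses as JSON.
  const match = /<script id="agent-data" type="application\/json">([\s\S]*?)<\/script>/.exec(html);
  if (!match) {
    errors.push('embedded data block missing');
  } else {
    try {
      JSON.parse(match[1]);
    } catch (err) {
      errors.push('embedded JSON invalid: ' + err.message);
    }
  }

  // 3. No scripts or stylesheets loaded from remote URLs.
  if (/<(?:script|link)\b[^>]*\b(?:src|href)=["']https?:/i.test(html)) {
    errors.push('external resource referenced');
  }

  // 4. The functions named in the checklist are declared somewhere in the page.
  for (const name of ['init', 'renderHeatmap', 'renderRecommendations']) {
    if (!new RegExp('function\\s+' + name + '\\s*\\(').test(html)) {
      errors.push('function ' + name + ' missing');
    }
  }
  return errors;
}

const goodPage =
  '<script id="agent-data" type="application/json">{"agents":[]}</script>' +
  '<script>function init(){} function renderHeatmap(){} function renderRecommendations(){}</script>';
const badPage = goodPage.replace('{"agents":[]}', '{"a":"\\u003cb\\u003e"}');

console.log(validateStandalone(goodPage)); // []
console.log(validateStandalone(badPage));  // [ 'escaped angle brackets found' ]
```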

Output Files

index.standalone.html - Self-contained dashboard with embedded data
data/index.html - Copy of standalone dashboard for web serving

Usage

Open index.standalone.html in any modern browser. No server or external dependencies are required.

To run the incremental DB watcher:

# Run with default 60 second interval
node agent-evolution/scripts/watch-db.cjs

# Run with custom interval (10 seconds)
WATCH_INTERVAL_MS=10000 node agent-evolution/scripts/watch-db.cjs

# Run in background
nohup node agent-evolution/scripts/watch-db.cjs > watch-db.log 2>&1 &

Agent Count

The dashboard currently tracks 34 agents across multiple categories:

Core Development
Quality Assurance
Security
Analysis
Process Management
Cognitive Enhancement
Testing

Model Support

The dashboard supports 15 verified models, with IF scores taken from artificialanalysis.ai:

DeepSeek V4-Pro Max (IF: 89)
DeepSeek V4-Flash (IF: 86)
Kimi K2.6 (IF: 91)
Qwen3-Coder 480B (IF: 88)
GLM-5.1 (IF: 90)
And 10 more models
