APAW/.kilo/shared/self-evolution.md
3badb259cc feat: bidirectional research dashboard + agent config fixes
- Integrate apaw_agent_model_research_v3.html as standalone dashboard
- Add model-benchmarks.json with 32 agents, 11 scored models, 11 recommendations
- Add build-research-dashboard.ts: inject live data into template → standalone HTML
- Add rebuild-template.cjs: regenerate template from v3.html source
- Add sync-benchmarks-from-yaml.cjs: sync YAML → JSON round-trip
- Add sync-model-research.ts: apply recommendation matrix to config files
- Add model-benchmarks.schema.json and model-research.schema.json for validation
- Add bidirectional-data-flow.md architecture documentation
- Add log-execution.cjs pipeline hook
- Update capability-index.yaml: add fallback_models, failover_strategy
- Update kilo-meta.json, kilo.jsonc, KILO_SPEC.md with synced models
- Update evolution.md / research.md / self-evolution.md / evolutionary-sync.md docs
- Fix security-auditor.md: quote YAML color (#DC2626)
- Fix orchestrator.md: remove duplicate devops-engineer key
- Build research-dashboard.html (106KB standalone) + dated archive
2026-04-29 21:04:22 +01:00

Self-Evolution Protocol

Use this protocol when task requirements exceed existing agent capabilities.

Trigger Conditions

  1. No agent matches task requirements
  2. Required domain knowledge not in any skill
  3. Complex multi-step task needs new workflow pattern
  4. @capability-analyst reports critical gap
  5. /evolution reports fitness < 0.70 and model research finds better model
  6. Model benchmarks stale (>7 days) and research discovers new model
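
Conditions 5 and 6 are mechanical enough to sketch in code. The following is a minimal illustration, not the project's implementation: the `ModelSignals` shape and its field names are hypothetical, and only the 0.70 fitness threshold and the 7-day staleness window come from the list above.

```typescript
// Hypothetical input shape; field names are assumptions, not the real schema.
interface ModelSignals {
  fitness: number;             // latest score reported by /evolution
  benchmarksUpdatedAt: Date;   // last refresh of model-benchmarks.json
  betterModelFound: boolean;   // model research found a better model
  newModelDiscovered: boolean; // model research discovered a new model
}

const FITNESS_THRESHOLD = 0.7; // from trigger condition 5
const STALE_AFTER_DAYS = 7;    // from trigger condition 6
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when the benchmark data is more than 7 days old.
function isStale(updatedAt: Date, now: Date): boolean {
  return now.getTime() - updatedAt.getTime() > STALE_AFTER_DAYS * MS_PER_DAY;
}

// Returns the numbers of the model-related trigger conditions (5, 6) that fire.
function modelTriggers(s: ModelSignals, now: Date): number[] {
  const fired: number[] = [];
  if (s.fitness < FITNESS_THRESHOLD && s.betterModelFound) fired.push(5);
  if (isStale(s.benchmarksUpdatedAt, now) && s.newModelDiscovered) fired.push(6);
  return fired;
}
```

Passing `now` explicitly rather than reading the clock inside the function keeps the staleness check deterministic and easy to test.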

Evolution Flow

[Gap Detected]
      ↓
1. Create Gitea Milestone → "[Evolution] {gap_description}"
      ↓
2. Create Research Issue → Track research phase
      ↓
3. Run History Search → @history-miner checks git history
      ↓
4. Analyze Gap → @capability-analyst classifies gap
      ↓
5. Design Component → @agent-architect creates specification
      ↓
6. Decision: Agent/Skill/Workflow?
      ↓
7. Create File → .kilo/agents/{name}.md (or skill/workflow)
      ↓
8. Self-Modify → Add permission to orchestrator.md whitelist
      ↓
9. Update capability-index.yaml → Register capabilities
      ↓
10. Verify Access → Test call to new agent
      ↓
11. Update Documentation → KILO_SPEC.md, AGENTS.md, EVOLUTION_LOG.md
      ↓
12. Close Milestone → Record results in Gitea
      ↓
[New Capability Available]

Model Evolution Flow

When an agent's current model is suboptimal (score gap > 5 points in heatmap):

[Evolution Fitness < 0.85]
       ↓
1. Read model-benchmarks.json → load heatmap, recommendations
       ↓
2. IF stale (>7 days) → @capability-analyst researches models
   → Output: agent-evolution/data/model-research-latest.json
   → Validates against: agent-evolution/data/model-research.schema.json
       ↓
3. Identify agents where best_model ≠ current_model (gap > 5)
       ↓
4. Generate recommendations (action: update_model)
       ↓
5. Dry-run → /evolution --dry-run → Show what would change
       ↓
6. Apply → bun run agent-evolution/scripts/sync-model-research.ts
   → Updates: capability-index.yaml, agent-versions.json, kilo-meta.json, kilo.jsonc
   → Triggers: sync-agents.js --fix → propagates to .md files
   → Validates: sync-agents.js --check
       ↓
7. Re-test → @pipeline-judge → new fitness score
       ↓
8. IF fitness improved → commit changes
   IF fitness regressed → revert via agent-versions.json history
       ↓
9. Log to Gitea + fitness-history.jsonl
       ↓
[Models Optimized]
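
Steps 3, 4, and 8 of the flow above can be sketched as follows. This is an illustration under stated assumptions: the `Heatmap` shape (agent → model → score) and the `Recommendation` fields other than `action: "update_model"` are hypothetical, and the real layout of model-benchmarks.json may differ.

```typescript
// Hypothetical shapes; the real model-benchmarks.json layout may differ.
type Heatmap = Record<string, Record<string, number>>; // agent -> model -> score

interface Recommendation {
  agent: string;
  action: "update_model";
  from: string;
  to: string;
  gap: number;
}

const MIN_GAP = 5; // "gap > 5" from step 3

// Steps 3-4: recommend a switch when the best-scoring model beats the
// agent's current model by more than MIN_GAP points.
function recommend(heatmap: Heatmap, current: Record<string, string>): Recommendation[] {
  const out: Recommendation[] = [];
  for (const [agent, scores] of Object.entries(heatmap)) {
    const from = current[agent];
    if (from === undefined || !(from in scores)) continue; // no baseline to compare
    let best = from;
    for (const [model, score] of Object.entries(scores)) {
      if (score > scores[best]) best = model;
    }
    const gap = scores[best] - scores[from];
    if (best !== from && gap > MIN_GAP) {
      out.push({ agent, action: "update_model", from, to: best, gap });
    }
  }
  return out;
}

// Step 8: commit on improvement, revert on regression. The flow does not say
// what to do when fitness is unchanged; returning "keep" is an assumption.
function afterRetest(before: number, after: number): "commit" | "revert" | "keep" {
  if (after > before) return "commit";
  if (after < before) return "revert";
  return "keep";
}
```

Agents without a known current model, or whose current model has no score in the heatmap, are skipped rather than guessed at, since there is no baseline from which to compute a gap.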

Model Research Data Flow

[model-benchmarks.json]          ← Static benchmark data (refreshed weekly)
       ↓ read
[/evolution Step 0]              ← Checks staleness, triggers research if needed
[/research models]               ← Explicit research trigger
       ↓ produces
[model-research-latest.json]     ← Dynamic research output
       ↓ consumed by
[sync-model-research.ts]         ← Applies recommendations
       ↓ updates
[capability-index.yaml]          ← Model assignments
[agent-versions.json]            ← History tracking
[kilo-meta.json]                 ← Source of truth
[kilo.jsonc]                     ← Agent config (manual verify)
[.kilo/agents/*.md]              ← Frontmatter (via sync script)
       ↓ verified by
[sync-agents.js --check]         ← Consistency validation

Key Files

| File | Purpose | Updated By |
|------|---------|------------|
| agent-evolution/data/model-benchmarks.json | Static benchmark data | /research models, /evolution research |
| agent-evolution/data/model-research-latest.json | Latest research output | /research models, /evolution Step 0 |
| agent-evolution/data/model-research.schema.json | Validation schema | Manual (schema changes are rare) |
| agent-evolution/data/model-benchmarks.schema.json | Benchmarks data schema | Manual |
| agent-evolution/data/agent-versions.json | Version history | sync-model-research.ts |
| agent-evolution/scripts/sync-model-research.ts | Application script | Manual execution |

Self-Modification Rules

  1. ONLY modify own permission whitelist
  2. NEVER modify other agents' definitions
  3. ALWAYS create milestone before changes
  4. ALWAYS verify access after changes
  5. ALWAYS log results to .kilo/EVOLUTION_LOG.md
  6. NEVER skip verification step
  7. ALWAYS validate research output against schema before applying
  8. NEVER apply model changes without dry-run preview first
  9. ALWAYS run sync-agents.js --check after model changes
  10. ALWAYS revert if fitness regresses after model change
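
Rules 3, 7, and 8 are preconditions that must hold before a model change is applied, so they lend themselves to a simple gate. The sketch below is illustrative only: the `ModelChangeState` record and its field names are hypothetical and do not correspond to any existing file or script in the repository.

```typescript
// Hypothetical record of what has been done so far for a pending model change.
interface ModelChangeState {
  milestoneCreated: boolean; // rule 3: milestone exists before changes
  schemaValidated: boolean;  // rule 7: research output validated against schema
  dryRunReviewed: boolean;   // rule 8: dry-run preview was shown first
}

// Returns the numbers of the rules that currently block applying the change.
// An empty array means all three preconditions are satisfied.
function blockingRules(s: ModelChangeState): number[] {
  const blocked: number[] = [];
  if (!s.milestoneCreated) blocked.push(3);
  if (!s.schemaValidated) blocked.push(7);
  if (!s.dryRunReviewed) blocked.push(8);
  return blocked;
}
```

Returning the rule numbers instead of a single boolean lets the caller report exactly which rule was skipped.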

Evolution Triggers

  • Task type not in capability Routing Map
  • capability-analyst reports critical gap
  • Repeated task failures for same reason
  • User requests new specialized capability

File Modifications (in order)

  1. Create .kilo/agents/{new-agent}.md (or skill/workflow)
  2. Update .kilo/agents/orchestrator.md (add permission)
  3. Update .kilo/capability-index.yaml (register capabilities)
  4. Update .kilo/KILO_SPEC.md (document)
  5. Update AGENTS.md (reference)
  6. Append to .kilo/EVOLUTION_LOG.md (log entry)
  7. Update agent-evolution/data/model-benchmarks.json (if model data changed)
  8. Update agent-evolution/data/agent-versions.json (add history entry)
  9. Update kilo-meta.json (source of truth for sync)
  10. Run node scripts/sync-agents.js --fix (propagate to all files)
  11. Run node scripts/sync-agents.js --check (verify consistency)
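
Because the order matters, one way to drive the list above is a runner that executes steps sequentially and stops at the first failure. This is a minimal sketch, not an existing script: `Step`, `RunResult`, and `runInOrder` are hypothetical names, and the `run` callbacks are stubs standing in for the real file edits and `sync-agents.js` invocations.

```typescript
interface Step {
  name: string;
  run: () => boolean; // returns true on success; supplied by the caller
}

interface RunResult {
  completed: string[];     // names of steps that succeeded, in order
  failedAt: string | null; // name of the first failing step, if any
}

// Runs steps strictly in order and stops at the first failure, so later
// steps never run against a half-applied change.
function runInOrder(steps: Step[]): RunResult {
  const completed: string[] = [];
  for (const step of steps) {
    if (!step.run()) return { completed, failedAt: step.name };
    completed.push(step.name);
  }
  return { completed, failedAt: null };
}

// Usage with stubbed executors for the last three steps of the list.
const result = runInOrder([
  { name: "update kilo-meta.json", run: () => true },
  { name: "sync-agents.js --fix", run: () => true },
  { name: "sync-agents.js --check", run: () => true },
]);
```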

Verification Checklist

After each evolution:

  • Agent file created with valid YAML frontmatter
  • Permission added to orchestrator.md
  • Capability registered in capability-index.yaml
  • Test call succeeds (Task tool returns valid response)
  • KILO_SPEC.md updated with new agent
  • AGENTS.md updated with new agent
  • EVOLUTION_LOG.md updated with entry
  • Gitea milestone closed with results
  • model-research-latest.json validates against schema
  • sync-model-research.ts dry-run shows correct changes
  • capability-index.yaml model field updated for affected agents
  • agent-versions.json history entry added with rationale
  • kilo-meta.json matches new model assignments
  • kilo.jsonc manually verified (sync script does not guarantee this)
  • sync-agents.js --check passes
  • No stale models leaked (grep for previous model IDs)
  • Cloud model suffix correct (kimi-k2.6:cloud, not kimi-k2.6)
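
The last two checklist items can be automated with plain string checks. The sketch below is an assumption-laden illustration: `findStaleModelLeaks` and `missingCloudSuffix` are hypothetical helpers, file contents are passed in as strings rather than read from disk, and the list of models that require the `:cloud` suffix must be supplied by the caller.

```typescript
// Escapes regex metacharacters so a model ID can be matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns the paths of files whose content still mentions a previous model ID.
// Plain substring matching: an ID that merely extends a previous ID
// (for example a suffixed variant) is flagged as well.
function findStaleModelLeaks(files: Record<string, string>, previousIds: string[]): string[] {
  return Object.entries(files)
    .filter(([, text]) => previousIds.some((id) => text.includes(id)))
    .map(([path]) => path);
}

// Returns the cloud model base IDs that appear in the text at least once
// without being immediately followed by ":cloud".
function missingCloudSuffix(text: string, cloudModels: string[]): string[] {
  return cloudModels.filter((base) =>
    new RegExp(escapeRegExp(base) + "(?!:cloud)").test(text),
  );
}
```

For example, `missingCloudSuffix("model: kimi-k2.6", ["kimi-k2.6"])` flags the bare ID, while the same call on `"model: kimi-k2.6:cloud"` returns an empty array.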