UniqueSoft/APAW

3badb259cc feat: bidirectional research dashboard + agent config fixes

- Integrate apaw_agent_model_research_v3.html as standalone dashboard
- Add model-benchmarks.json with 32 agents, 11 scored models, 11 recommendations
- Add build-research-dashboard.ts: inject live data into template → standalone HTML
- Add rebuild-template.cjs: regenerate template from v3.html source
- Add sync-benchmarks-from-yaml.cjs: sync YAML → JSON round-trip
- Add sync-model-research.ts: apply recommendation matrix to config files
- Add model-benchmarks.schema.json and model-research.schema.json for validation
- Add bidirectional-data-flow.md architecture documentation
- Add log-execution.cjs pipeline hook
- Update capability-index.yaml: add fallback_models, failover_strategy
- Update kilo-meta.json, kilo.jsonc, KILO_SPEC.md with synced models
- Update evolution.md / research.md / self-evolution.md / evolutionary-sync.md docs
- Fix security-auditor.md: quote YAML color (#DC2626)
- Fix orchestrator.md: remove duplicate devops-engineer key
- Build research-dashboard.html (106KB standalone) + dated archive

2026-04-29 21:04:22 +01:00

Self-Evolution Protocol

Use this protocol when task requirements exceed existing agent capabilities.

Trigger Conditions

No agent matches task requirements
Required domain knowledge not in any skill
Complex multi-step task needs new workflow pattern
@capability-analyst reports critical gap
/evolution reports fitness < 0.70 and model research finds a better model
Model benchmarks are stale (>7 days) and research discovers a new model
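
The two model-related triggers above can be sketched as a small check. This is a minimal illustration, not the project's implementation: the `EvolutionSignals` shape, its field names, and the reason labels are hypothetical. Only the 0.70 fitness threshold and the 7-day staleness window come from the list above.

```typescript
// Hypothetical input shape: these field names are illustrative, not taken from the repo.
interface EvolutionSignals {
  fitness: number;             // latest score reported by /evolution, in the range 0..1
  benchmarksUpdatedAt: Date;   // last refresh of model-benchmarks.json
  betterModelFound: boolean;   // model research found a better model
  newModelDiscovered: boolean; // research discovered a new model
}

const FITNESS_THRESHOLD = 0.7; // from "fitness < 0.70"
const STALE_AFTER_DAYS = 7;    // from "stale (>7 days)"
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Benchmarks count as stale only when strictly older than the window.
function isStale(updatedAt: Date, now: Date): boolean {
  const ageInDays = (now.getTime() - updatedAt.getTime()) / MS_PER_DAY;
  return ageInDays > STALE_AFTER_DAYS;
}

// Returns the model-related trigger conditions that currently hold.
function modelTriggers(signals: EvolutionSignals, now: Date): string[] {
  const reasons: string[] = [];
  if (signals.fitness < FITNESS_THRESHOLD && signals.betterModelFound) {
    reasons.push("low-fitness");
  }
  if (isStale(signals.benchmarksUpdatedAt, now) && signals.newModelDiscovered) {
    reasons.push("stale-benchmarks");
  }
  return reasons;
}
```

Passing `now` explicitly keeps the check deterministic, which makes the threshold edges (exactly 7 days, exactly 0.70) easy to test.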

Evolution Flow

[Gap Detected]
      ↓
1. Create Gitea Milestone → "[Evolution] {gap_description}"
      ↓
2. Create Research Issue → Track research phase
      ↓
3. Run History Search → @history-miner checks git history
      ↓
4. Analyze Gap → @capability-analyst classifies gap
      ↓
5. Design Component → @agent-architect creates specification
      ↓
6. Decision: Agent/Skill/Workflow?
      ↓
7. Create File → .kilo/agents/{name}.md (or skill/workflow)
      ↓
8. Self-Modify → Add permission to orchestrator.md whitelist
      ↓
9. Update capability-index.yaml → Register capabilities
      ↓
10. Verify Access → Test call to new agent
      ↓
11. Update Documentation → KILO_SPEC.md, AGENTS.md, EVOLUTION_LOG.md
      ↓
12. Close Milestone → Record results in Gitea
      ↓
[New Capability Available]

Model Evolution Flow

When an agent's current model is suboptimal (score gap > 5 points in heatmap):

[Evolution Fitness < 0.85]
       ↓
1. Read model-benchmarks.json → load heatmap, recommendations
       ↓
2. IF stale (>7 days) → @capability-analyst researches models
   → Output: agent-evolution/data/model-research-latest.json
   → Validates against: agent-evolution/data/model-research.schema.json
       ↓
3. Identify agents where best_model ≠ current_model (gap > 5)
       ↓
4. Generate recommendations (action: update_model)
       ↓
5. Dry-run → /evolution --dry-run → Show what would change
       ↓
6. Apply → bun run agent-evolution/scripts/sync-model-research.ts
   → Updates: capability-index.yaml, agent-versions.json, kilo-meta.json, kilo.jsonc
   → Triggers: sync-agents.js --fix → propagates to .md files
   → Validates: sync-agents.js --check
       ↓
7. Re-test → @pipeline-judge → new fitness score
       ↓
8. IF fitness improved → commit changes
   IF fitness regressed → revert via agent-versions.json history
       ↓
9. Log to Gitea + fitness-history.jsonl
       ↓
[Models Optimized]
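
Steps 3 and 4 of the flow above (find agents whose best-scoring model differs from the current one by more than 5 points, then emit `update_model` recommendations) might look roughly like the sketch below. The layout of `model-benchmarks.json` is not shown in this document, so the `Heatmap` type, the `Recommendation` fields, and the function name are all assumptions made for illustration.

```typescript
// Hypothetical shapes: the real model-benchmarks.json layout may differ.
type Heatmap = Record<string, Record<string, number>>; // agent -> model -> score

interface Recommendation {
  agent: string;
  action: "update_model";
  from: string;
  to: string;
  gap: number;
}

const MIN_GAP = 5; // from "gap > 5"

function recommendModelUpdates(
  heatmap: Heatmap,
  currentModels: Record<string, string>
): Recommendation[] {
  const recommendations: Recommendation[] = [];
  for (const [agent, scores] of Object.entries(heatmap)) {
    const currentModel = currentModels[agent];
    if (currentModel === undefined) continue; // agent has no assigned model
    const currentScore = scores[currentModel];
    if (currentScore === undefined) continue; // no baseline score to compare against

    // Find the best-scoring model for this agent.
    let bestModel = currentModel;
    let bestScore = currentScore;
    for (const [model, score] of Object.entries(scores)) {
      if (score > bestScore) {
        bestModel = model;
        bestScore = score;
      }
    }

    const gap = bestScore - currentScore;
    if (bestModel !== currentModel && gap > MIN_GAP) {
      recommendations.push({ agent, action: "update_model", from: currentModel, to: bestModel, gap });
    }
  }
  return recommendations;
}
```

Agents without a score for their current model are skipped rather than guessed at. That is a design choice of this sketch, not something the flow specifies.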

Model Research Data Flow

[model-benchmarks.json]          ← Static benchmark data (refreshed weekly)
       ↓ read
[/evolution Step 0]              ← Checks staleness, triggers research if needed
[/research models]               ← Explicit research trigger
       ↓ produces
[model-research-latest.json]     ← Dynamic research output
       ↓ consumed by
[sync-model-research.ts]         ← Applies recommendations
       ↓ updates
[capability-index.yaml]          ← Model assignments
[agent-versions.json]            ← History tracking
[kilo-meta.json]                 ← Source of truth
[kilo.jsonc]                     ← Agent config (manual verify)
[.kilo/agents/*.md]              ← Frontmatter (via sync script)
       ↓ verified by
[sync-agents.js --check]         ← Consistency validation

Key Files

| File | Purpose | Updated By |
| --- | --- | --- |
| `agent-evolution/data/model-benchmarks.json` | Static benchmark data | `/research models`, `/evolution research` |
| `agent-evolution/data/model-research-latest.json` | Latest research output | `/research models`, `/evolution Step 0` |
| `agent-evolution/data/model-research.schema.json` | Validation schema | Manual (schema changes are rare) |
| `agent-evolution/data/model-benchmarks.schema.json` | Benchmarks data schema | Manual |
| `agent-evolution/data/agent-versions.json` | Version history | `sync-model-research.ts` |
| `agent-evolution/scripts/sync-model-research.ts` | Application script | Manual execution |

Self-Modification Rules

ONLY modify own permission whitelist
NEVER modify other agents' definitions
ALWAYS create milestone before changes
ALWAYS verify access after changes
ALWAYS log results to .kilo/EVOLUTION_LOG.md
NEVER skip verification step
ALWAYS validate research output against schema before applying
NEVER apply model changes without dry-run preview first
ALWAYS run sync-agents.js --check after model changes
ALWAYS revert if fitness regresses after model change
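
The model-change rules above imply an ordering: validate, dry-run, apply, check, then revert on regression. One way to picture that ordering is the sketch below. Every hook in `ModelChangeSteps` is a hypothetical stand-in for a real step (schema validation, `/evolution --dry-run`, `sync-model-research.ts`, `sync-agents.js --check`, `@pipeline-judge`), and none of these signatures come from the repo. Two behaviours are assumptions of the sketch, not stated rules: it reverts when the consistency check fails, and it keeps a change whose fitness is unchanged.

```typescript
// Hypothetical hooks standing in for the real pipeline steps.
interface ModelChangeSteps {
  validateResearch(): boolean; // research output validates against the schema
  dryRun(): string[];          // preview of pending changes
  apply(): void;               // write the new model assignments
  checkConsistency(): boolean; // consistency check across config files
  measureFitness(): number;    // fitness score after the change
  revert(): void;              // restore the previous assignments
}

type Outcome =
  | "rejected-invalid-research"
  | "no-changes"
  | "reverted-inconsistent"
  | "reverted-regression"
  | "committed";

function applyModelChange(steps: ModelChangeSteps, fitnessBefore: number): Outcome {
  // Rule: validate research output against the schema before applying.
  if (!steps.validateResearch()) return "rejected-invalid-research";

  // Rule: never apply without a dry-run preview first.
  if (steps.dryRun().length === 0) return "no-changes";

  steps.apply();

  // Rule: always run the consistency check after model changes.
  // Assumption: a failed check is treated as grounds for revert.
  if (!steps.checkConsistency()) {
    steps.revert();
    return "reverted-inconsistent";
  }

  // Rule: always revert if fitness regresses.
  // Assumption: equal fitness is not counted as a regression.
  if (steps.measureFitness() < fitnessBefore) {
    steps.revert();
    return "reverted-regression";
  }
  return "committed";
}
```

Injecting the steps as callbacks keeps the ordering testable without touching any config file.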

Evolution Triggers

Task type not in capability Routing Map
@capability-analyst reports critical gap
Repeated task failures for same reason
User requests new specialized capability

File Modifications (in order)

1. Create .kilo/agents/{new-agent}.md (or skill/workflow)
2. Update .kilo/agents/orchestrator.md (add permission)
3. Update .kilo/capability-index.yaml (register capabilities)
4. Update .kilo/KILO_SPEC.md (document)
5. Update AGENTS.md (reference)
6. Append to .kilo/EVOLUTION_LOG.md (log entry)
7. Update agent-evolution/data/model-benchmarks.json (if model data changed)
8. Update agent-evolution/data/agent-versions.json (add history entry)
9. Update kilo-meta.json (source of truth for sync)
10. Run node scripts/sync-agents.js --fix (propagate to all files)
11. Run node scripts/sync-agents.js --check (verify consistency)

Verification Checklist

After each evolution:

Agent file created with valid YAML frontmatter
Permission added to orchestrator.md
Capability registered in capability-index.yaml
Test call succeeds (Task tool returns valid response)
KILO_SPEC.md updated with new agent
AGENTS.md updated with new agent
EVOLUTION_LOG.md updated with entry
Gitea milestone closed with results
model-research-latest.json validates against schema
sync-model-research.ts dry-run shows correct changes
capability-index.yaml model field updated for affected agents
agent-versions.json history entry added with rationale
kilo-meta.json matches new model assignments
kilo.jsonc manually verified (sync script does not guarantee this)
sync-agents.js --check passes
No stale models leaked (grep for previous model IDs)
Cloud model suffix correct (kimi-k2.6:cloud, not kimi-k2.6)
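
The last two checklist items (no stale model IDs, correct cloud suffix) can be approximated with a plain text scan. The sketch below works on file contents passed in as a string and assumes model IDs appear literally in the text. Both function names and the line-oriented approach are illustrative assumptions, not part of the repo's tooling.

```typescript
// Returns 1-based line numbers that mention `modelId` at all.
// Useful for spotting leftovers of a previous (stale) model ID.
function linesContaining(text: string, modelId: string): number[] {
  const hits: number[] = [];
  text.split("\n").forEach((line, index) => {
    if (line.includes(modelId)) hits.push(index + 1);
  });
  return hits;
}

// Returns 1-based line numbers where `modelId` appears without `suffix`
// directly after it (e.g. "kimi-k2.6" without ":cloud").
function linesMissingSuffix(text: string, modelId: string, suffix: string): number[] {
  const hits: number[] = [];
  text.split("\n").forEach((line, index) => {
    let position = line.indexOf(modelId);
    while (position !== -1) {
      const afterId = position + modelId.length;
      if (!line.startsWith(suffix, afterId)) {
        hits.push(index + 1);
        break; // one report per line is enough
      }
      position = line.indexOf(modelId, afterId);
    }
  });
  return hits;
}
```

For example, `linesMissingSuffix(configText, "kimi-k2.6", ":cloud")` returns the lines that still use the bare ID, where `configText` is whatever file content the caller has loaded.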