15a7b4b7a4 feat: add Agent Evolution Dashboard

- Create agent-evolution/ directory with standalone dashboard
- Add interactive HTML dashboard with agent/model matrix
- Add heatmap view for agent-model compatibility scores
- Add recommendations tab with optimization suggestions
- Add Gitea integration preparation (history timeline)
- Add Docker configuration for deployment
- Add build scripts for standalone HTML generation
- Add sync scripts for agent data synchronization
- Add milestone and issues documentation
- Add skills and rules for evolution sync
- Update AGENTS.md with dashboard documentation
- Update package.json with evolution scripts

Features:
- 28 agents with model assignments and fit scores
- 8 models with benchmarks (SWE-bench, RULER, Terminal)
- 11 recommendations for model optimization
- History timeline with agent changes
- Interactive modal windows for model details
- Filter and search functionality
- Russian language interface
- Works offline (file://) with embedded data

Docker:
- Dockerfile for standalone deployment
- docker-compose.evolution.yml
- docker-run.sh/docker-run.bat scripts

NPM scripts:
- sync:evolution - sync and build dashboard
- evolution:open - open in browser
- evolution:dashboard - start dev server

Status: PAUSED - foundation complete, Gitea integration pending

2026-04-05 19:58:59 +01:00


Evolutionary Sync Rules

Rules for synchronizing agent evolution data automatically.

When to Sync

Automatic Sync Triggers

- After each completed issue
  - When an agent completes a task and posts a Gitea comment
  - Extract performance metrics from the comment
- On model change
  - When an agent's model is updated in kilo.jsonc
  - When capability-index.yaml is modified
- On agent file change
  - When .kilo/agents/*.md files are modified
  - When agent files are created or deleted
- On prompt update
  - When an agent receives a prompt optimization
  - Track optimization improvements
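The file-based triggers above could be detected by matching changed paths, roughly as in the sketch below. The trigger names `model_change` and `agent_file_change` and both function names are hypothetical; the source lists the triggers but does not define identifiers for them.

```typescript
// Hypothetical trigger identifiers (not defined in the source rules).
type SyncTrigger = "model_change" | "agent_file_change";

// Map one changed repository path to the trigger it should fire, or null.
function triggerForPath(path: string): SyncTrigger | null {
  // Model assignments and capabilities live in these two files.
  if (path === ".kilo/kilo.jsonc" || path === ".kilo/capability-index.yaml") {
    return "model_change";
  }
  // Any Markdown file directly inside .kilo/agents/ is an agent definition.
  if (/^\.kilo\/agents\/[^/]+\.md$/.test(path)) {
    return "agent_file_change";
  }
  return null;
}

// Collect the distinct triggers fired by one set of changed files.
function triggersForChange(paths: string[]): SyncTrigger[] {
  const fired: SyncTrigger[] = [];
  for (const p of paths) {
    const t = triggerForPath(p);
    if (t !== null && fired.indexOf(t) === -1) {
      fired.push(t);
    }
  }
  return fired;
}
```

The issue-comment and prompt-update triggers are not file based, so they are left out of this sketch.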

Manual Sync Triggers

# Sync from all sources
bun run sync:evolution

# Sync specific source
bun run agent-evolution/scripts/sync-agent-history.ts --source git
bun run agent-evolution/scripts/sync-agent-history.ts --source gitea

# Open dashboard
bun run evolution:dashboard
bun run evolution:open

Data Flow

┌─────────────────────────────────────────────────────────────┐
│                     Data Sources                            │
├─────────────────────────────────────────────────────────────┤
│ .kilo/agents/*.md          ──► Parse frontmatter, model     │
│ .kilo/kilo.jsonc           ──► Model assignments            │
│ .kilo/capability-index.yaml ──► Capabilities, routing       │
│ Git History                ──► Change timeline              │
│ Gitea Issue Comments       ──► Performance scores            │
└─────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────┐
│              agent-evolution/data/                          │
│              agent-versions.json                           │
├─────────────────────────────────────────────────────────────┤
│ {                                                          │
│   "agents": {                                              │
│     "lead-developer": {                                    │
│       "current": { model, provider, fit_score, ... },      │
│       "history": [ { model_change, ... } ],                │
│       "performance_log": [ { score, issue, ... } ]        │
│     }                                                      │
│   }                                                        │
│ }                                                          │
└─────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────┐
│              agent-evolution/index.html                     │
│              Interactive Dashboard                          │
├─────────────────────────────────────────────────────────────┤
│ • Overview - Stats, recent changes, recommendations        │
│ • All Agents - Filterable cards with history                │
│ • Timeline - Full evolution history                          │
│ • Recommendations - Export, priority-based view            │
│ • Model Matrix - Agent × Model mapping                      │
└─────────────────────────────────────────────────────────────┘
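The shape of agent-versions.json shown in the diagram can be written down as TypeScript types. This is a sketch under assumptions: the diagram elides fields with "...", so only the fields it names are typed, the history fields are taken from the model-change example later in this document, and `latestChange` is a hypothetical helper.

```typescript
// Fields named in the diagram; the elided ones ("...") are not modelled here.
interface AgentCurrent {
  model: string;
  provider: string;
  fit_score: number;
}

// Field names follow the model-change history example in this document.
interface HistoryEntry {
  date: string; // ISO 8601 timestamp
  commit: string;
  type: string; // e.g. "model_change"
  from?: string;
  to?: string;
  reason?: string;
}

interface PerformanceEntry {
  score: number;
  issue: number;
}

interface AgentRecord {
  current: AgentCurrent;
  history: HistoryEntry[];
  performance_log: PerformanceEntry[];
}

interface AgentVersions {
  agents: { [name: string]: AgentRecord };
}

// Hypothetical helper: the most recent history entry for one agent.
// Assumes all dates are ISO 8601 UTC strings, which sort chronologically.
function latestChange(file: AgentVersions, agent: string): HistoryEntry | undefined {
  const record = file.agents[agent];
  if (!record || record.history.length === 0) {
    return undefined;
  }
  return record.history.reduce((a, b) => (a.date >= b.date ? a : b));
}
```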

Recording Changes

From Gitea Comments

Agent comments should follow this format:

## ✅ agent-name completed

**Score**: X/10
**Duration**: X.Xh
**Files**: file1.ts, file2.ts

### Notes
- Description of work done
- Key decisions made
- Issues encountered

Extraction:

agent-name → agent name
Score → performance score (1-10)
Duration → execution time
Files → files modified
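The extraction rules above could be implemented with a few line-anchored regular expressions. The sketch below is one possible parser; `ParsedAgentComment` and `parseAgentComment` are hypothetical names, and treating Duration and Files as optional is an assumption.

```typescript
interface ParsedAgentComment {
  agent: string;
  score: number; // performance score out of 10
  durationHours: number | null; // null when no Duration line is present
  files: string[]; // empty when no Files line is present
}

// Parse a completion comment in the format shown above.
// Returns null when the header or the score line is missing.
function parseAgentComment(body: string): ParsedAgentComment | null {
  const header = /^## ✅ (\S+) completed\s*$/m.exec(body);
  const score = /^\*\*Score\*\*:\s*(\d+(?:\.\d+)?)\/10\s*$/m.exec(body);
  if (!header || !score) {
    return null;
  }
  const duration = /^\*\*Duration\*\*:\s*(\d+(?:\.\d+)?)h\s*$/m.exec(body);
  const files = /^\*\*Files\*\*:\s*(.+)$/m.exec(body);
  return {
    agent: header[1],
    score: Number(score[1]),
    durationHours: duration ? Number(duration[1]) : null,
    files: files
      ? files[1].split(",").map((f) => f.trim()).filter((f) => f.length > 0)
      : [],
  };
}
```

Returning null for anything that does not match keeps ordinary issue comments out of the performance log.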

From Git Commits

Commit message patterns:

feat: add flutter-developer agent → agent_created
fix: update security-auditor model to nemotron-3-super → model_change
docs: update lead-developer prompt → prompt_change
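A classifier for these patterns might look like the sketch below. The regular expressions are an assumption generalized from the three examples above; the source does not specify the exact matching rules, and `classifyCommit` is a hypothetical name.

```typescript
type ChangeType = "agent_created" | "model_change" | "prompt_change";

interface CommitChange {
  type: ChangeType;
  agent: string;
  to?: string; // new model ID, only set for model_change
}

// Classify a commit subject line, or return null if no pattern matches.
function classifyCommit(subject: string): CommitChange | null {
  let m = /^feat: add (\S+) agent$/.exec(subject);
  if (m) {
    return { type: "agent_created", agent: m[1] };
  }
  m = /^fix: update (\S+) model to (\S+)$/.exec(subject);
  if (m) {
    return { type: "model_change", agent: m[1], to: m[2] };
  }
  m = /^docs: update (\S+) prompt$/.exec(subject);
  if (m) {
    return { type: "prompt_change", agent: m[1] };
  }
  return null;
}
```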

Gitea Webhook Setup

- Create webhook in Gitea
  - Target URL: http://localhost:3000/api/evolution/webhook
  - Events: issue_comment, issues

- Webhook payload handling

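// In agent-evolution/scripts/gitea-webhook.ts
// Assumes an Express app; recordAgentPerformance is defined elsewhere in the project.
import express from 'express';

const app = express();
app.use(express.json()); // parse JSON webhook payloads into req.body

app.post('/api/evolution/webhook', async (req, res) => {
  const { action, issue, comment } = req.body;

  // Record only newly created comments that carry the completion marker.
  // body is optional-chained too, so a comment without a body does not throw.
  if (action === 'created' && comment?.body?.includes('## ✅')) {
    await recordAgentPerformance(issue, comment);
  }

  res.json({ success: true });
});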

Performance Metrics

Tracked Metrics

For each agent execution:

Metric	Source	Format
Score	Gitea comment	X/10
Duration	Agent timing	milliseconds
Success	Exit status	boolean
Files	Gitea comment	count
Issue	Context	number

Aggregated Metrics

Metric	Calculation	Use
Average Score	`sum(scores) / count`	Agent effectiveness
Success Rate	`successes / total * 100`	Reliability
Average Duration	`sum(durations) / count`	Speed
Files per Task	`sum(files) / count`	Scope
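The four calculations in the table translate directly into code. In this sketch the `Execution` field names are assumptions chosen to mirror the tracked metrics, and an empty log returns null to avoid dividing by zero.

```typescript
// One agent execution; field names are assumed, mirroring the tracked metrics.
interface Execution {
  score: number; // out of 10
  durationMs: number;
  success: boolean;
  files: number; // count of files modified
}

interface Aggregates {
  averageScore: number;
  successRate: number; // percent
  averageDurationMs: number;
  filesPerTask: number;
}

// Compute the aggregated metrics; null for an empty log.
function aggregate(runs: Execution[]): Aggregates | null {
  const n = runs.length;
  if (n === 0) {
    return null;
  }
  const sum = (pick: (r: Execution) => number): number =>
    runs.reduce((acc, r) => acc + pick(r), 0);
  return {
    averageScore: sum((r) => r.score) / n,
    successRate: (sum((r) => (r.success ? 1 : 0)) / n) * 100,
    averageDurationMs: sum((r) => r.durationMs) / n,
    filesPerTask: sum((r) => r.files) / n,
  };
}
```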

Recommendations Generation

Priority Levels

Priority	Criteria	Action
Critical	Fit score < 70	Immediate update
High	Model unavailable	Switch to fallback
Medium	Better model available	Consider upgrade
Low	Optimization possible	Optional improvement
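One way to apply the priority table is to check the criteria from most to least severe and return the first match. That ordering is an assumption, as are the `AgentStatus` field names; the table itself does not say how overlapping criteria are resolved.

```typescript
type Priority = "critical" | "high" | "medium" | "low";

// Hypothetical inputs, one per criterion in the table.
interface AgentStatus {
  fitScore: number;
  modelAvailable: boolean;
  betterModelAvailable: boolean;
  optimizationPossible: boolean;
}

// Return the highest applicable priority, or null when nothing applies.
function priorityFor(status: AgentStatus): Priority | null {
  if (status.fitScore < 70) {
    return "critical"; // immediate update
  }
  if (!status.modelAvailable) {
    return "high"; // switch to fallback
  }
  if (status.betterModelAvailable) {
    return "medium"; // consider upgrade
  }
  if (status.optimizationPossible) {
    return "low"; // optional improvement
  }
  return null;
}
```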

Example Recommendation

{
  "agent": "requirement-refiner",
  "recommendations": [{
    "target": "ollama-cloud/nemotron-3-super",
    "reason": "+22% quality, 1M context for specifications",
    "priority": "critical"
  }]
}

Evolution Rules

When Model Change is Recorded

- Detect change
  - Compare current.model with the previous value
  - Extract the reason from the commit message

- Record in history

{
  "date": "2026-04-05T05:21:00Z",
  "commit": "caf77f53c8",
  "type": "model_change",
  "from": "ollama-cloud/gpt-oss:120b",
  "to": "ollama-cloud/nemotron-3-super",
  "reason": "Better reasoning for security analysis"
}

- Update current
  - Set current.model to the new value
  - Update provider if changed
  - Recalculate fit score
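Detecting, recording and updating can be combined into one pure function, sketched below. The source does not say how the fit score is recalculated, so the scoring function is passed in as the hypothetical parameter `fitScoreFor`; `Observed` and `recordModelChange` are likewise illustrative names.

```typescript
interface Current {
  model: string;
  provider: string;
  fit_score: number;
}

interface ModelChangeEntry {
  date: string;
  commit: string;
  type: "model_change";
  from: string;
  to: string;
  reason: string;
}

interface AgentRecord {
  current: Current;
  history: ModelChangeEntry[];
}

// What the sync observed in the latest commit (hypothetical input shape).
interface Observed {
  date: string;
  commit: string;
  model: string;
  provider: string;
  reason: string;
}

// Returns an updated copy of the record, or the same record if nothing changed.
function recordModelChange(
  agent: AgentRecord,
  seen: Observed,
  fitScoreFor: (model: string) => number
): AgentRecord {
  // Detect change: compare against current.model.
  if (seen.model === agent.current.model) {
    return agent;
  }
  // Record in history.
  const entry: ModelChangeEntry = {
    date: seen.date,
    commit: seen.commit,
    type: "model_change",
    from: agent.current.model,
    to: seen.model,
    reason: seen.reason,
  };
  // Update current: new model, provider, and recalculated fit score.
  return {
    current: {
      model: seen.model,
      provider: seen.provider,
      fit_score: fitScoreFor(seen.model),
    },
    history: agent.history.concat([entry]),
  };
}
```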

When Performance Drops

- Detect pattern
  - Average of the last 5 scores < 7
  - Success rate < 80%
- Generate recommendation
  - Suggest model upgrade
  - Trigger prompt-optimizer
- Notify via Gitea comment
  - Post to the related issue
  - Include improvement suggestions
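The detection step can be sketched as a small predicate. Two points here are assumptions, since the source does not settle them: either condition alone counts as a drop, and the success rate is computed over the whole log rather than only the last five runs.

```typescript
// One logged run; field names are assumed.
interface Run {
  score: number; // out of 10
  success: boolean;
}

// True when recent scores or the overall success rate fall below the thresholds.
function performanceDropped(log: Run[], window: number = 5): boolean {
  if (log.length === 0) {
    return false; // nothing to judge yet
  }
  // Average of the last `window` scores (fewer if the log is shorter).
  const recent = log.slice(-window);
  const average = recent.reduce((acc, r) => acc + r.score, 0) / recent.length;
  // Success rate in percent over the whole log.
  const successes = log.filter((r) => r.success).length;
  const successRate = (successes / log.length) * 100;
  return average < 7 || successRate < 80;
}
```

A caller would then generate the recommendation and post the Gitea comment when this returns true.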

Integration in Pipeline

Add to post-pipeline:

# .kilo/commands/pipeline.md
post_steps:
  - name: sync_evolution
    run: bun run sync:evolution
  - name: check_recommendations
    run: bun run agent-evolution/scripts/check-recommendations.ts

Dashboard Access

# Start local server
bun run evolution:dashboard

# Open in browser
bun run evolution:open
# or visit http://localhost:3001

API Endpoints (Future)

// GET /api/evolution/agents
// Returns all agents with current state

// GET /api/evolution/agents/:name/history
// Returns agent history

// GET /api/evolution/recommendations
// Returns pending recommendations

// POST /api/evolution/agents/:name/apply
// Apply recommendation

// POST /api/evolution/sync
// Trigger manual sync

Best Practices

- Sync after every pipeline run
  - Captures model changes
  - Records performance
- Review dashboard weekly
  - Check pending recommendations
  - Apply critical updates
- Track before/after metrics
  - When applying changes
  - Compare performance
- Keep history clean
  - Deduplicate entries
  - Merge related changes
- Use consistent naming
  - Agent names match file names
  - Model IDs match capability-index.yaml
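Deduplicating history entries, as suggested under "Keep history clean", could be done as in the sketch below. Treating two entries as duplicates when they share the same commit and type is an assumption; the source does not define what counts as a duplicate.

```typescript
interface HistoryEntry {
  date: string;
  commit: string;
  type: string;
}

// Keep the first occurrence of each (commit, type) pair, preserving order.
function dedupeHistory(entries: HistoryEntry[]): HistoryEntry[] {
  const seen: { [key: string]: boolean } = {};
  const unique: HistoryEntry[] = [];
  for (const entry of entries) {
    const key = entry.commit + "|" + entry.type;
    if (seen[key]) {
      continue; // already recorded by an earlier sync
    }
    seen[key] = true;
    unique.push(entry);
  }
  return unique;
}
```

This guards against the same commit being recorded twice when a manual sync and an automatic sync both run.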
