---
description: Run evolution cycle - judge last workflow, optimize underperforming agents, re-test
---

# /evolution — Pipeline Evolution Command

Runs the automated evolution cycle on the most recent (or specified) workflow.

## Usage

```
/evolution                     # evolve last completed workflow
/evolution --issue 42          # evolve workflow for issue #42
/evolution --agent planner     # focus evolution on one agent
/evolution --dry-run           # show what would change without applying
/evolution --history           # print fitness trend chart
/evolution --fitness           # run fitness evaluation (alias for /evolve)
```

## Aliases

- `/evolve` — same as `/evolution --fitness`
- `/evolution log` — log agent model change to Gitea

## Execution

### Step 1: Judge (Fitness Evaluation)

```bash
Task(subagent_type: "pipeline-judge")
→ produces fitness report
```

### Step 2: Decide (Threshold Routing)

```
IF fitness >= 0.85:
  echo "✅ Pipeline healthy (fitness: {score}). No action needed."
  append to fitness-history.jsonl
  EXIT

IF fitness >= 0.70:
  echo "⚠ Pipeline marginal (fitness: {score}). Optimizing weak agents..."
  identify agents with lowest per-agent scores
  Task(subagent_type: "prompt-optimizer", target: weak_agents)

IF fitness < 0.70:
  echo "🔴 Pipeline underperforming (fitness: {score}). Major optimization..."
  Task(subagent_type: "prompt-optimizer", target: all_flagged_agents)
  IF fitness < 0.50:
    Task(subagent_type: "agent-architect", action: "redesign", target: worst_agent)
```

### Step 3: Re-test (After Optimization)

```
Re-run the SAME workflow with updated prompts
Task(subagent_type: "pipeline-judge") → fitness_after

IF fitness_after > fitness_before:
  commit prompt changes
  echo "📈 Fitness improved: {before} → {after}"
ELSE:
  revert prompt changes
  echo "📉 No improvement. Reverting."
```

### Step 4: Log

Append to `.kilo/logs/fitness-history.jsonl`:

```json
{
  "ts": "<now>",
  "issue": <N>,
  "workflow": "<type>",
  "fitness_before": <score>,
  "fitness_after": <score>,
  "agents_optimized": ["planner", "requirement-refiner"],
  "tokens_saved": <delta>,
  "time_saved_ms": <delta>
}
```

## Subcommands

### `log` — Log Model Change

Log an agent model improvement to Gitea and evolution data.

```bash
/evolution log capability-analyst "Updated to qwen3.6-plus for better IF score"
```

Steps:
1. Read current model from `.kilo/agents/{agent}.md`
2. Get previous model from `agent-evolution/data/agent-versions.json`
3. Calculate improvement (IF score, context window)
4. Write to evolution data
5. Post Gitea comment

### `report` — Generate Evolution Report

Generate comprehensive report for agent or all agents:

```bash
/evolution report           # all agents
/evolution report planner   # specific agent
```

Output includes:
- Total agents
- Model changes this month
- Average quality improvement
- Recent changes table
- Performance metrics
- Model distribution
- Recommendations

### `history` — Show Fitness Trend

Print fitness trend chart:

```bash
/evolution --history
```

Output:
```
Fitness Trend (Last 30 days):

1.00 ┤
0.90 ┤     ╭─╮     ╭──╮
0.80 ┤   ╭─╯ ╰─╮ ╭─╯  ╰──╮
0.70 ┤ ╭─╯     ╰─╯        ╰──╮
0.60 ┤ │                         ╰─╮
0.50 ┼─┴───────────────────────────┴──
     Apr 1  Apr 8  Apr 15  Apr 22  Apr 29

Avg fitness: 0.82
Trend: ↑ improving
```

### `recommend` — Get Model Recommendations

```bash
/evolution recommend
```

Shows:
- Agents with fitness < 0.70 (need optimization)
- Agents consuming > 30% of token budget (bottlenecks)
- Model upgrade recommendations
- Priority order

## Data Storage

### fitness-history.jsonl

```jsonl
{"ts":"2026-04-06T00:00:00Z","issue":42,"workflow":"feature","fitness":0.82,"breakdown":{"test_pass_rate":0.95,"quality_gates_rate":0.80,"efficiency_score":0.65},"tokens":38400,"time_ms":245000,"tests_passed":45,"tests_total":47,"verdict":"PASS"}
{"ts":"2026-04-06T01:30:00Z","issue":43,"workflow":"bugfix","fitness":0.91,"breakdown":{"test_pass_rate":1.00,"quality_gates_rate":0.80,"efficiency_score":0.88},"tokens":12000,"time_ms":85000,"tests_passed":47,"tests_total":47,"verdict":"PASS"}
```

### agent-versions.json

```json
{
  "version": "1.0",
  "agents": {
    "capability-analyst": {
      "current": {
        "model": "qwen/qwen3.6-plus:free",
        "provider": "openrouter",
        "if_score": 90,
        "quality_score": 79,
        "context_window": "1M"
      },
      "history": [
        {
          "date": "2026-04-05T22:20:00Z",
          "type": "model_change",
          "from": "ollama-cloud/nemotron-3-super",
          "to": "qwen/qwen3.6-plus:free",
          "rationale": "Better IF score, FREE via OpenRouter"
        }
      ]
    }
  }
}
```

## Integration Points

- **After `/pipeline`**: Evaluator scores logged
- **After model update**: Evolution logged
- **Weekly**: Performance report generated
- **On request**: Recommendations provided

## Configuration

```yaml
# In capability-index.yaml
evolution:
  enabled: true
  auto_trigger: true           # trigger after every workflow
  fitness_threshold: 0.70      # below this → auto-optimize
  max_evolution_attempts: 3    # max retries per cycle
  fitness_history: .kilo/logs/fitness-history.jsonl
  token_budget_default: 50000
  time_budget_default: 300
```

## Metrics Tracked

| Metric | Source | Purpose |
|--------|--------|---------|
| Fitness Score | pipeline-judge | Overall pipeline health |
| Test Pass Rate | bun test | Code quality |
| Quality Gates | build/lint/typecheck | Standards compliance |
| Token Cost | pipeline logs | Resource efficiency |
| Wall-Clock Time | pipeline logs | Speed |
| Agent ROI | history analysis | Cost/benefit |

## Example Session

```bash
$ /evolution

## Pipeline Judgment: Issue #42

**Fitness: 0.82/1.00** [PASS]

| Metric | Value | Weight | Contribution |
|--------|-------|--------|-------------|
| Tests  | 95% (45/47) | 50% | 0.475 |
| Gates  | 80% (4/5) | 25% | 0.200 |
| Cost   | 38.4K tok / 245s | 25% | 0.163 |

**Bottleneck:** lead-developer (31% of tokens)
**Verdict:** PASS - within acceptable range

✅ Logged to .kilo/logs/fitness-history.jsonl
```

---

*Evolution workflow v2.0 - Objective fitness scoring with pipeline-judge*