# Kilo Code Agents Reference

This file configures AI agent behavior for the APAW project, a self-improving code pipeline that logs its progress to Gitea.

## Pipeline Workflow

The main workflow is `/pipeline`. Use it to process an issue through all agents automatically.

```
User: /pipeline 42
Agent: Runs full pipeline for issue #42 with Gitea logging
```

## Commands (Slash Commands)

| Command | Description | Usage |
|---------|-------------|-------|
| `/pipeline <issue>` | Run full agent pipeline for issue | `/pipeline 42` |
| `/status <issue>` | Check pipeline status for issue | `/status 42` |
| `/evaluate <issue>` | Generate performance report | `/evaluate 42` |
| `/plan` | Creates detailed task plans | `/plan feature X` |
| `/ask` | Answers codebase questions | `/ask how does auth work` |
| `/debug` | Analyzes and fixes bugs | `/debug error in login` |
| `/code` | Quick code generation | `/code add validation` |
| `/research [topic]` | Run research and self-improvement | `/research multi-agent` |

## Pipeline Agents (Subagents)

These agents are invoked automatically by `/pipeline` or manually via `@mention`:

### Core Development
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@requirement-refiner` | Converts ideas to User Stories | Issue status: new |
| `@history-miner` | Finds duplicate work in git history | Status: planned |
| `@system-analyst` | Designs specifications | Status: researching |
| `@sdet-engineer` | Writes tests (TDD) | Status: designed |
| `@lead-developer` | Implements code | Status: testing (tests fail) |
| `@frontend-developer` | UI implementation | When UI work needed |
| `@backend-developer` | Node.js/Express/APIs | When backend needed |
| `@flutter-developer` | Flutter mobile apps | When mobile development |
| `@go-developer` | Go backend services | When Go backend needed |

### Quality Assurance
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@code-skeptic` | Adversarial review | Status: implementing |
| `@the-fixer` | Fixes issues | When review fails |
| `@performance-engineer` | Performance review | After code-skeptic |
| `@security-auditor` | Security audit | After performance |
| `@visual-tester` | Visual regression | When UI changes |

### Cognitive Enhancement (New)
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@planner` | Task decomposition (CoT/ToT) | Complex tasks |
| `@reflector` | Self-reflection (Reflexion) | After each agent |
| `@memory-manager` | Memory systems | Context management |

### Meta & Process
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@release-manager` | Git operations | Status: releasing |
| `@evaluator` | Scores effectiveness | Status: evaluated |
| `@prompt-optimizer` | Improves prompts | When score < 7 |
| `@capability-analyst` | Analyzes task coverage | When starting new task |
| `@agent-architect` | Creates new agents | When gaps identified |
| `@workflow-architect` | Creates workflows | New workflow needed |
| `@markdown-validator` | Validates Markdown | Before issue creation |

## Workflow State Machine

```
[new]
  ↓ @requirement-refiner
[planned]
  ↓ @capability-analyst → (gaps?) → @agent-architect → create new agents
  ↓ @history-miner
[researching]
  ↓ @system-analyst
[designed]
  ↓ @sdet-engineer (writes failing tests)
[testing]
  ↓ @lead-developer (makes tests pass)
[implementing]
  ↓ @code-skeptic (review)
[reviewing] ──[fail]──→ [fixing] ──→ [reviewing]
  ↓ @review-watcher → (auto-validate) → create fix tasks
  ↓ [pass]
[perf-check]
  ↓ @performance-engineer
[security-check]
  ↓ @security-auditor
[releasing]
  ↓ @release-manager
[evaluated]
  ↓ @evaluator
  ├── [score ≥ 7] → [completed]
  └── [score < 7] → @prompt-optimizer → [completed]
```
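
The main path through the diagram above can be sketched as a lookup table. This is an illustration under assumptions, not the project's implementation: `PipelineState`, `NEXT_ON_PASS`, and `nextState` are hypothetical names, and the `@capability-analyst` and `@review-watcher` side steps are left out.

```typescript
// Hypothetical sketch of the state machine; names are illustrative only.
type PipelineState =
  | "new" | "planned" | "researching" | "designed" | "testing"
  | "implementing" | "reviewing" | "fixing" | "perf-check"
  | "security-check" | "releasing" | "evaluated" | "completed";

// State that follows each state when its agent succeeds.
const NEXT_ON_PASS: Record<PipelineState, PipelineState | null> = {
  new: "planned",
  planned: "researching",
  researching: "designed",
  designed: "testing",
  testing: "implementing",
  implementing: "reviewing",
  reviewing: "perf-check",
  fixing: "reviewing",
  "perf-check": "security-check",
  "security-check": "releasing",
  releasing: "evaluated",
  evaluated: "completed",
  completed: null,
};

// Only the review step branches: a failed review goes to "fixing".
function nextState(current: PipelineState, reviewPassed = true): PipelineState | null {
  if (current === "reviewing" && !reviewPassed) return "fixing";
  return NEXT_ON_PASS[current];
}
```

A failed review loops through `fixing` and back to `reviewing`, which matches the `[fail]` branch in the diagram.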

## Capability Analysis Flow

When starting a complex task:

```
[User Request]
      ↓
[@capability-analyst] ← Analyzes requirements vs existing capabilities
      ↓
[Gap Analysis] ← Identifies missing agents, workflows, skills
      ↓
[Recommendations] → Create new or enhance existing?
      ↓
[Decision]
  ├── [Create New] → [@agent-architect] → Create component → Review
  └── [Enhance] → [@lead-developer] → Modify existing
      ↓
[Integration] ← Verify new component works with system
      ↓
[Complete] ← Task can now be handled
```
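
The gap analysis step can be pictured as a set difference between required and covered capabilities. The sketch below assumes a simple list-of-strings model; `AgentProfile`, `findCapabilityGaps`, and the capability names are hypothetical and do not reflect the actual format of `.kilo/capability-index.yaml`.

```typescript
// Hypothetical types: the real capability index format may differ.
interface AgentProfile {
  agent: string;
  capabilities: string[];
}

// Return the required capabilities that no existing agent covers.
function findCapabilityGaps(required: string[], agents: AgentProfile[]): string[] {
  const covered: string[] = [];
  for (const profile of agents) {
    for (const cap of profile.capabilities) covered.push(cap);
  }
  return required.filter((cap) => covered.indexOf(cap) === -1);
}

// Illustrative capability names, not taken from the project.
const gaps = findCapabilityGaps(
  ["go-backend", "visual-regression", "data-migration"],
  [
    { agent: "go-developer", capabilities: ["go-backend"] },
    { agent: "visual-tester", capabilities: ["visual-regression"] },
  ],
);
// gaps holds only "data-migration", the one capability no agent covers.
```

A non-empty result is what the flow would pass on to the decision step, where the gap is closed by creating a new component or enhancing an existing one.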

## Gitea Integration

### Status Labels

The pipeline uses Gitea labels to track progress:
- `status: new` → `status: planned` → `status: researching` → ...
- Agents add/remove labels automatically
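
Moving an issue to the next status amounts to replacing its single `status:` label while leaving other labels alone. A minimal sketch of that label arithmetic follows; `swapStatusLabel` is a hypothetical helper, and the call that would send the result to Gitea is intentionally omitted.

```typescript
// Hypothetical helper: compute the label set after a status change.
const STATUS_PREFIX = "status: ";

function swapStatusLabel(labels: string[], nextStatus: string): string[] {
  // Drop any existing status label, keep everything else.
  const kept = labels.filter((label) => label.indexOf(STATUS_PREFIX) !== 0);
  return kept.concat(STATUS_PREFIX + nextStatus);
}
```

For example, `swapStatusLabel(["bug", "status: new"], "planned")` returns `["bug", "status: planned"]`.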

### Performance Logging

Each agent logs its result as a comment on the Gitea issue:
```markdown
## ✅ lead-developer completed

**Score**: 8/10
**Duration**: 1.2h
**Files**: src/auth.ts, src/user.ts

### Notes
- Clean implementation
- Follows existing patterns
- Tests passing
```
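
A comment in this format could be produced by a small formatter. The sketch below is an assumption about how such a comment might be built: `AgentReport` and `formatAgentComment` are hypothetical names, and posting the text to Gitea is out of scope here.

```typescript
// Hypothetical report shape mirroring the fields in the comment above.
interface AgentReport {
  agent: string;
  score: number; // 1-10
  durationHours: number;
  files: string[];
  notes: string[];
}

// Render the report as the Markdown body of an issue comment.
function formatAgentComment(report: AgentReport): string {
  const lines = [
    `## ✅ ${report.agent} completed`,
    "",
    `**Score**: ${report.score}/10`,
    `**Duration**: ${report.durationHours}h`,
    `**Files**: ${report.files.join(", ")}`,
    "",
    "### Notes",
  ];
  for (const note of report.notes) lines.push(`- ${note}`);
  return lines.join("\n");
}
```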

### Efficiency Tracking

Scores are saved to `.kilo/logs/efficiency_score.json`:
```json
{
  "version": "1.0",
  "history": [
    {
      "issue": 42,
      "date": "2024-01-02T10:00:00Z",
      "agents": {
        "lead-developer": 8,
        "code-skeptic": 7,
        "the-fixer": 9
      },
      "iterations": 2,
      "duration_hours": 1.5
    }
  ]
}
```
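
The structure above maps directly onto TypeScript types, which makes it easy to summarize. The sketch below mirrors the JSON fields; the type names and `averageScores` are hypothetical and not part of the project's API.

```typescript
// Types mirroring the JSON above; the names themselves are hypothetical.
interface EfficiencyEntry {
  issue: number;
  date: string; // ISO 8601 timestamp
  agents: Record<string, number>;
  iterations: number;
  duration_hours: number;
}

interface EfficiencyLog {
  version: string;
  history: EfficiencyEntry[];
}

// Average score per agent across all recorded issues.
function averageScores(log: EfficiencyLog): Record<string, number> {
  const totals: Record<string, { sum: number; count: number }> = {};
  for (const entry of log.history) {
    for (const agent of Object.keys(entry.agents)) {
      const bucket = totals[agent] || (totals[agent] = { sum: 0, count: 0 });
      bucket.sum += entry.agents[agent];
      bucket.count += 1;
    }
  }
  const averages: Record<string, number> = {};
  for (const agent of Object.keys(totals)) {
    averages[agent] = totals[agent].sum / totals[agent].count;
  }
  return averages;
}
```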

## Manual Agent Invocation

Use the Task tool to invoke a subagent:

```
Task tool with:
  subagent_type: "lead-developer"
  prompt: "Implement authentication for issue #42"
```

Or via `@mention`:
```
@lead-developer implement authentication flow
```

## Environment Variables

Required for Gitea integration:
```bash
GITEA_API_URL=https://git.softuniq.eu/api/v1
GITEA_TOKEN=your-token-here
```
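
Failing fast when either variable is missing avoids confusing errors later in a run. The following is a minimal sketch of such a check; `GiteaConfig` and `loadGiteaConfig` are hypothetical names, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Hypothetical config shape for the two variables above.
interface GiteaConfig {
  apiUrl: string;
  token: string;
}

// Read both variables and report every missing one in a single error.
function loadGiteaConfig(env: Record<string, string | undefined>): GiteaConfig {
  const apiUrl = env["GITEA_API_URL"];
  const token = env["GITEA_TOKEN"];
  const missing: string[] = [];
  if (!apiUrl) missing.push("GITEA_API_URL");
  if (!token) missing.push("GITEA_TOKEN");
  if (!apiUrl || !token) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return { apiUrl, token };
}
```

In a Node.js or Bun script this would typically be called as `loadGiteaConfig(process.env)`.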

## Self-Improvement Cycle

1. **Pipeline runs** for each issue
2. **Evaluator scores** each agent (1-10)
3. **Low scores (<7)** trigger prompt-optimizer
4. **Prompt optimizer** analyzes failures and improves prompts
5. **New prompts** saved to `.kilo/agents/`
6. **Next run** uses improved prompts
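
Step 3 of the cycle reduces to a threshold filter over the evaluator's scores. A minimal sketch, assuming scores arrive as an agent-to-score map; `SCORE_THRESHOLD` and `agentsNeedingOptimization` are hypothetical names.

```typescript
// Scores below this value trigger the prompt optimizer (per the cycle above).
const SCORE_THRESHOLD = 7;

// Return the agents whose prompts should be optimized, sorted by name.
function agentsNeedingOptimization(scores: Record<string, number>): string[] {
  return Object.keys(scores)
    .filter((agent) => scores[agent] < SCORE_THRESHOLD)
    .sort();
}
```

A score of exactly 7 does not trigger optimization, which matches the `score ≥ 7` branch of the state machine.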

## Architecture Files

| File | Purpose |
|------|---------|
| `AGENTS.md` | This file - main config |
| `.kilo/agents/*.md` | Agent definitions with prompts |
| `.kilo/commands/*.md` | Workflow commands |
| `.kilo/rules/*.md` | Custom rules loaded globally |
| `.kilo/skills/` | Skill modules |
| `src/kilocode/` | TypeScript API for programmatic use |

## Using the TypeScript API

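```typescript
import {
  createPipelineRunner,
  PipelineRunner,
  GiteaClient,
  decideRouting
} from './src/kilocode/index.js'

// Build a runner authenticated with the Gitea token from the environment
const runner = await createPipelineRunner({
  giteaToken: process.env.GITEA_TOKEN
})

// Run the full pipeline for issue #42
await runner.run({ issueNumber: 42 })
```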

## Agent Evolution Dashboard

Track agent model changes, performance, and recommendations in real time.

### Access

```bash
# Sync agent data
bun run sync:evolution

# Open dashboard
bun run evolution:dashboard
bun run evolution:open
# or visit http://localhost:3001
```

### Dashboard Tabs

| Tab | Description |
|-----|-------------|
| **Overview** | Stats, recent changes, pending recommendations |
| **All Agents** | Filterable agent cards with history |
| **Timeline** | Full evolution history |
| **Recommendations** | Priority-based model suggestions |
| **Model Matrix** | Agent × Model mapping with fit scores |

### Data Sources

| Source | What it tracks |
|--------|----------------|
| `.kilo/agents/*.md` | Model, description, capabilities |
| `.kilo/kilo.jsonc` | Model assignments |
| `.kilo/capability-index.yaml` | Capability routing |
| Git History | Model and prompt changes |
| Gitea Comments | Performance scores |

### Evolution Data Structure

```json
{
  "agents": {
    "lead-developer": {
      "current": { "model": "qwen3-coder:480b", "fit_score": 92 },
      "history": [{ "type": "model_change", "from": "deepseek", "to": "qwen3" }],
      "performance_log": [{ "issue": 42, "score": 8, "success": true }]
    }
  }
}
```
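
For programmatic access, the structure above can be described with TypeScript interfaces. This is a sketch inferred from the sample JSON only: the interface names and `meanScore` are hypothetical, and the real data may carry additional fields.

```typescript
// Interfaces inferred from the sample above; names are hypothetical.
interface AgentHistoryEvent {
  type: string;
  from: string;
  to: string;
}

interface PerformanceRecord {
  issue: number;
  score: number;
  success: boolean;
}

interface AgentEvolution {
  current: { model: string; fit_score: number };
  history: AgentHistoryEvent[];
  performance_log: PerformanceRecord[];
}

interface EvolutionData {
  agents: Record<string, AgentEvolution>;
}

// Mean score over an agent's performance log, or null if it has no entries.
function meanScore(agent: AgentEvolution): number | null {
  if (agent.performance_log.length === 0) return null;
  const total = agent.performance_log.reduce((sum, rec) => sum + rec.score, 0);
  return total / agent.performance_log.length;
}
```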

### Recommendations Priority

| Priority | When | Recommended action |
|----------|------|---------|
| **Critical** | Fit score < 70 | Immediate model change required |
| **High** | Model unavailable | Switch to fallback |
| **Medium** | Better model available | Consider upgrade |
| **Low** | Optimization possible | Optional improvement |
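
The table can be read as a cascade of checks, highest priority first. The sketch below encodes that reading; `ModelSignals`, `classifyRecommendation`, the evaluation order, and the extra `"none"` result are assumptions and are not stated in the table.

```typescript
// Hypothetical inputs: one flag per row of the table above.
type RecommendationPriority = "critical" | "high" | "medium" | "low" | "none";

interface ModelSignals {
  fitScore: number;
  modelAvailable: boolean;
  betterModelAvailable: boolean;
  optimizationPossible: boolean;
}

// Assumed ordering: the first matching row wins.
function classifyRecommendation(signals: ModelSignals): RecommendationPriority {
  if (signals.fitScore < 70) return "critical";
  if (!signals.modelAvailable) return "high";
  if (signals.betterModelAvailable) return "medium";
  if (signals.optimizationPossible) return "low";
  return "none";
}
```

With this ordering, an agent with a fit score of 65 is classified as `critical` even when a better model is also available.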

## Code Style

- Use TypeScript for new files
- Follow existing patterns
- Write tests before code (TDD)
- Keep functions under 50 lines
- Use early returns
- No comments unless explicitly requested