Files
APAW/AGENTS.md
¨NW¨ 15a7b4b7a4 feat: add Agent Evolution Dashboard
- Create agent-evolution/ directory with standalone dashboard
- Add interactive HTML dashboard with agent/model matrix
- Add heatmap view for agent-model compatibility scores
- Add recommendations tab with optimization suggestions
- Add Gitea integration preparation (history timeline)
- Add Docker configuration for deployment
- Add build scripts for standalone HTML generation
- Add sync scripts for agent data synchronization
- Add milestone and issues documentation
- Add skills and rules for evolution sync
- Update AGENTS.md with dashboard documentation
- Update package.json with evolution scripts

Features:
- 28 agents with model assignments and fit scores
- 8 models with benchmarks (SWE-bench, RULER, Terminal)
- 11 recommendations for model optimization
- History timeline with agent changes
- Interactive modal windows for model details
- Filter and search functionality
- Russian language interface
- Works offline (file://) with embedded data

Docker:
- Dockerfile for standalone deployment
- docker-compose.evolution.yml
- docker-run.sh/docker-run.bat scripts

NPM scripts:
- sync:evolution - sync and build dashboard
- evolution:open - open in browser
- evolution:dashboard - start dev server

Status: PAUSED - foundation complete, Gitea integration pending
2026-04-05 19:58:59 +01:00

294 lines
8.1 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Kilo Code Agents Reference
This file configures AI agent behavior for the APAW project - a self-improving code pipeline with Gitea logging.
## Pipeline Workflow
The main workflow is `/pipeline` - use it to process issues through all agents automatically.
```
User: /pipeline 42
Agent: Runs full pipeline for issue #42 with Gitea logging
```
## Commands (Slash Commands)
| Command | Description | Usage |
|---------|-------------|-------|
| `/pipeline <issue>` | Run full agent pipeline for issue | `/pipeline 42` |
| `/status <issue>` | Check pipeline status for issue | `/status 42` |
| `/evaluate <issue>` | Generate performance report | `/evaluate 42` |
| `/plan` | Creates detailed task plans | `/plan feature X` |
| `/ask` | Answers codebase questions | `/ask how does auth work` |
| `/debug` | Analyzes and fixes bugs | `/debug error in login` |
| `/code` | Quick code generation | `/code add validation` |
| `/research [topic]` | Run research and self-improvement | `/research multi-agent` |
## Pipeline Agents (Subagents)
These agents are invoked automatically by `/pipeline` or manually via `@mention`:
### Core Development
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@requirement-refiner` | Converts ideas to User Stories | Issue status: new |
| `@history-miner` | Finds duplicates in git | Status: planned |
| `@system-analyst` | Designs specifications | Status: researching |
| `@sdet-engineer` | Writes tests (TDD) | Status: designed |
| `@lead-developer` | Implements code | Status: testing (tests fail) |
| `@frontend-developer` | UI implementation | When UI work needed |
| `@backend-developer` | Node.js/Express/APIs | When backend needed |
| `@flutter-developer` | Flutter mobile apps | When mobile development |
| `@go-developer` | Go backend services | When Go backend needed |
### Quality Assurance
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@code-skeptic` | Adversarial review | Status: implementing |
| `@the-fixer` | Fixes issues | When review fails |
| `@performance-engineer` | Performance review | After code-skeptic |
| `@security-auditor` | Security audit | After performance |
| `@visual-tester` | Visual regression | When UI changes |
### Cognitive Enhancement (New)
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@planner` | Task decomposition (CoT/ToT) | Complex tasks |
| `@reflector` | Self-reflection (Reflexion) | After each agent |
| `@memory-manager` | Memory systems | Context management |
### Meta & Process
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@release-manager` | Git operations | Status: releasing |
| `@evaluator` | Scores effectiveness | Status: evaluated |
| `@prompt-optimizer` | Improves prompts | When score < 7 |
| `@capability-analyst` | Analyzes task coverage | When starting new task |
| `@agent-architect` | Creates new agents | When gaps identified |
| `@workflow-architect` | Creates workflows | New workflow needed |
| `@markdown-validator` | Validates Markdown | Before issue creation |
## Workflow State Machine
```
[new]
↓ @requirement-refiner
[planned]
↓ @capability-analyst → (gaps?) → @agent-architect → create new agents
↓ @history-miner
[researching]
↓ @system-analyst
[designed]
↓ @sdet-engineer (writes failing tests)
[testing]
↓ @lead-developer (makes tests pass)
[implementing]
↓ @code-skeptic (review)
[reviewing] ──[fail]──→ [fixing] ──→ [reviewing]
↓ @review-watcher → (auto-validate) → create fix tasks
↓ [pass]
[perf-check]
↓ @performance-engineer
[security-check]
↓ @security-auditor
[releasing]
↓ @release-manager
[evaluated]
↓ @evaluator
├── [score ≥ 7] → [completed]
└── [score < 7] → @prompt-optimizer → [completed]
```
## Capability Analysis Flow
When starting a complex task:
```
[User Request]
[@capability-analyst] ← Analyzes requirements vs existing capabilities
[Gap Analysis] ← Identifies missing agents, workflows, skills
[Recommendations] → Create new or enhance existing?
[Decision]
├── [Create New] → [@agent-architect] → Create component → Review
└── [Enhance] → [@lead-developer] → Modify existing
[Integration] ← Verify new component works with system
[Complete] ← Task can now be handled
```
## Gitea Integration
### Status Labels
Pipeline uses Gitea labels to track progress:
- `status: new``status: planned``status: researching` → ...
- Agents add/remove labels automatically
### Performance Logging
Each agent logs to Gitea issue comments:
```markdown
## ✅ lead-developer completed
**Score**: 8/10
**Duration**: 1.2h
**Files**: src/auth.ts, src/user.ts
### Notes
- Clean implementation
- Follows existing patterns
- Tests passing
```
### Efficiency Tracking
Scores saved to `.kilo/logs/efficiency_score.json`:
```json
{
"version": "1.0",
"history": [
{
"issue": 42,
"date": "2024-01-02T10:00:00Z",
"agents": {
"lead-developer": 8,
"code-skeptic": 7,
"the-fixer": 9
},
"iterations": 2,
"duration_hours": 1.5
}
]
}
```
## Manual Agent Invocation
```typescript
// Use Task tool to invoke subagent
Task tool with:
subagent_type: "lead-developer"
prompt: "Implement authentication for issue #42"
```
Or via `@mention`:
```
@lead-developer implement authentication flow
```
## Environment Variables
Required for Gitea integration:
```bash
GITEA_API_URL=https://git.softuniq.eu/api/v1
GITEA_TOKEN=your-token-here
```
## Self-Improvement Cycle
1. **Pipeline runs** for each issue
2. **Evaluator scores** each agent (1-10)
3. **Low scores (<7)** trigger prompt-optimizer
4. **Prompt optimizer** analyzes failures and improves prompts
5. **New prompts** saved to `.kilo/agents/`
6. **Next run** uses improved prompts
## Architecture Files
| File | Purpose |
|------|---------|
| `AGENTS.md` | This file - main config |
| `.kilo/agents/*.md` | Agent definitions with prompts |
| `.kilo/commands/*.md` | Workflow commands |
| `.kilo/rules/*.md` | Custom rules loaded globally |
| `.kilo/skills/` | Skill modules |
| `src/kilocode/` | TypeScript API for programmatic use |
## Using the TypeScript API
```typescript
import {
PipelineRunner,
GiteaClient,
decideRouting
} from './src/kilocode/index.js'
const runner = await createPipelineRunner({
giteaToken: process.env.GITEA_TOKEN
})
await runner.run({ issueNumber: 42 })
```
## Agent Evolution Dashboard
Track agent model changes, performance, and recommendations in real-time.
### Access
```bash
# Sync agent data
bun run sync:evolution
# Open dashboard
bun run evolution:dashboard
bun run evolution:open
# or visit http://localhost:3001
```
### Dashboard Tabs
| Tab | Description |
|-----|-------------|
| **Overview** | Stats, recent changes, pending recommendations |
| **All Agents** | Filterable agent cards with history |
| **Timeline** | Full evolution history |
| **Recommendations** | Priority-based model suggestions |
| **Model Matrix** | Agent × Model mapping with fit scores |
### Data Sources
| Source | What it tracks |
|--------|----------------|
| `.kilo/agents/*.md` | Model, description, capabilities |
| `.kilo/kilo.jsonc` | Model assignments |
| `.kilo/capability-index.yaml` | Capability routing |
| Git History | Model and prompt changes |
| Gitea Comments | Performance scores |
### Evolution Data Structure
```json
{
"agents": {
"lead-developer": {
"current": { "model": "qwen3-coder:480b", "fit_score": 92 },
"history": [{ "type": "model_change", "from": "deepseek", "to": "qwen3" }],
"performance_log": [{ "issue": 42, "score": 8, "success": true }]
}
}
}
```
### Recommendations Priority
| Priority | When | Example |
|----------|------|---------|
| **Critical** | Fit score < 70 | Immediate model change required |
| **High** | Model unavailable | Switch to fallback |
| **Medium** | Better model available | Consider upgrade |
| **Low** | Optimization possible | Optional improvement |
## Code Style
- Use TypeScript for new files
- Follow existing patterns
- Write tests before code (TDD)
- Keep functions under 50 lines
- Use early returns
- No comments unless explicitly requested