# Orchestrator Evolution Log

Timeline of capability expansions through self-modification.

## Purpose

This file tracks all self-evolution events where the orchestrator detected capability gaps and created new agents/skills/workflows to address them.

## Log Format

Each entry follows this structure:

```markdown
## Entry: {ISO-8601-Timestamp}

### Gap
{Description of what was missing}

### Research
- Milestone: #{number}
- Issue: #{number}
- Analysis: {gap classification}

### Implementation
- Created: {file path}
- Model: {model ID}
- Permissions: {permission list}

### Verification
- Test call: ✅/❌
- Orchestrator access: ✅/❌
- Capability index: ✅/❌

### Files Modified
- {file}: {action}
- ...

### Metrics
- Duration: {time}
- Agents used: {agent list}
- Tokens consumed: {approximate}

### Gitea References
- Milestone: {URL}
- Research Issue: {URL}
- Verification Issue: {URL}

---
```

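The `{ISO-8601-Timestamp}` placeholder in the template matches the offset-qualified form used by the entry headings in this log. As a minimal sketch, assuming headings are generated programmatically (the helper name `entry_heading` is hypothetical and not part of the repository), the heading could be produced like this:

```python
from datetime import datetime, timedelta, timezone


def entry_heading(moment: datetime) -> str:
    """Format the `## Entry: ...` heading for a timezone-aware datetime.

    Hypothetical helper: the log only specifies the heading format,
    not how the heading is generated.
    """
    if moment.tzinfo is None:
        # A naive datetime would drop the UTC offset that the headings carry.
        raise ValueError("entry timestamps need a UTC offset")
    return f"## Entry: {moment.isoformat(timespec='seconds')}"


# Timestamp of the first entry in this log (UTC+01:00).
first_entry = datetime(2026, 4, 6, 22, 38, tzinfo=timezone(timedelta(hours=1)))
print(entry_heading(first_entry))  # ## Entry: 2026-04-06T22:38:00+01:00
```

Rejecting naive datetimes keeps every heading comparable across time zones, which matters if entries are ever sorted or deduplicated by timestamp.
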
## Entries

---

## Entry: 2026-04-06T22:38:00+01:00

### Type
Model Evolution - Critical Fixes

### Gap Analysis
Broken agents detected:
1. `debug` - gpt-oss:20b BROKEN (IF:65)
2. `release-manager` - devstral-2:123b BROKEN (Ollama Cloud issue)

### Research
- Source: APAW Agent Model Research v3
- Analysis: Critical - 2 agents non-functional
- Recommendations: 10 model changes proposed

### Implementation

#### Critical Fixes (Applied)

| Agent | Before | After | Reason |
|-------|--------|-------|--------|
| `debug` | gpt-oss:20b (BROKEN) | qwen3.6-plus:free | IF:65→90, score:85★ |
| `release-manager` | devstral-2:123b (BROKEN) | qwen3.6-plus:free | Fix broken + IF:90 |
| `orchestrator` | glm-5 (IF:80) | qwen3.6-plus:free | IF:80→90, score:82→84★ |
| `pipeline-judge` | nemotron-3-super (IF:85) | qwen3.6-plus:free | IF:85→90, score:78→80★ |

#### Kept Unchanged (Already Optimal)

| Agent | Model | Score | Reason |
|-------|-------|-------|--------|
| `code-skeptic` | minimax-m2.5 | 85★ | Absolute leader in code review |
| `the-fixer` | minimax-m2.5 | 88★ | Absolute leader in bug fixing |
| `lead-developer` | qwen3-coder:480b | 92 | Best coding model |
| `requirement-refiner` | glm-5 | 80★ | Best for system analysis |
| `security-auditor` | nemotron-3-super | 76 | 1M ctx for full scans |

### Files Modified
- `.kilo/kilo.jsonc` - Updated debug, orchestrator models
- `.kilo/capability-index.yaml` - Updated release-manager, pipeline-judge models
- `.kilo/agents/release-manager.md` - Model update (pending)
- `.kilo/agents/pipeline-judge.md` - Model update (pending)
- `.kilo/agents/orchestrator.md` - Model update (pending)

### Verification
- [x] kilo.jsonc updated
- [x] capability-index.yaml updated
- [ ] Agent .md files updated (pending)
- [ ] Orchestrator permissions previously fixed (all 28 agents accessible)
- [ ] Agent-versions.json synchronized (pending: `bun run sync:evolution`)

### Metrics
- Critical fixes: 2 (debug, release-manager)
- Quality improvement: +18% average IF score
- Score improvement: +1.25 average
- Context window: 128K→1M for key agents

### Impact Assessment
- **debug**: +29% quality improvement, 32x context (8K→256K)
- **release-manager**: Fixed broken agent, +1% score
- **orchestrator**: +2% score, +10 IF points
- **pipeline-judge**: +2% score, +5 IF points

### Recommended Next Steps
1. Run `bun run sync:evolution` to update dashboard
2. Test orchestrator with new model
3. Monitor fitness scores for 24h
4. Consider evaluator burst mode (+6x speed)

---

## Statistics

| Metric | Value |
|--------|-------|
| Total Evolution Events | 1 |
| Model Changes | 4 |
| Broken Agents Fixed | 2 |
| IF Score Improvement | +18% |
| Context Window Expansion | 128K→1M |

_Last updated: 2026-04-06T22:38:00+01:00_

## Entry: 2026-04-17T23:20:00+01:00

### Gap
Multi-agent system had excessive token consumption due to redundant prompts: Gitea commenting duplicated in 26 agents, code templates inline in 4 heavy agents, verbose role/personality descriptions, duplicated rules content.

### Research
- External: Anthropic prompt engineering best practices (clarity, XML structure, positive constraints)
- External: OpenAI prompt engineering guide (developer message hierarchy, Markdown+XML)
- External: Lilian Weng agent architecture (planning/memory/tool use patterns, context window optimization)
- Internal: `.kilo/specs/prompt-optimization-strategy.md` (full specification)

### Implementation
- Created: `.kilo/shared/gitea-commenting.md` (centralized Gitea commenting format)
- Created: `.kilo/shared/gitea-api.md` (centralized Gitea API client code)
- Created: `.kilo/shared/self-evolution.md` (extracted from orchestrator)
- Compressed: ALL 29 agent files using optimization rules:
  - Role → single sentence (merged "When to Use")
  - Behavior → 3-5 imperative bullets (merged "Prohibited Actions" as positive constraints)
  - Output → XML skeleton (max 10 lines)
  - Gitea commenting → `<gitea-commenting />` tag
  - Code templates → skill references only
  - Handoff → 3 steps max
  - Delegates → concise table

### Results

| Metric | Before | After | Change |
|--------|--------|-------|--------|
| Total agent lines | 6,235 | 1,409 | **-77.4%** |
| flutter-developer | 759 | 61 | -92.0% |
| go-developer | 503 | 59 | -88.3% |
| devops-engineer | 365 | 59 | -83.8% |
| backend-developer | 320 | 58 | -81.9% |
| workflow-architect | 705 | 45 | -93.6% |
| agent-architect | 460 | 61 | -86.7% |
| orchestrator | 356 | 92 | -74.2% |
| browser-automation | 271 | 54 | -80.1% |
| capability-analyst | 399 | 46 | -88.5% |
| markdown-validator | 246 | 35 | -85.8% |
| pipeline-judge | 234 | 60 | -74.4% |
| visual-tester | 214 | 57 | -73.4% |
| release-manager | 262 | 53 | -79.8% |
| requirement-refiner | 180 | 51 | -71.7% |
| security-auditor | 178 | 50 | -71.9% |
| code-skeptic | 158 | 47 | -70.3% |
| planner | 62 | 31 | -50.0% |
| Other 12 agents | ~800 | ~490 | -38.8% |

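The Change column is consistent with a plain relative difference, `(after - before) / before`, rounded to one decimal place. The sketch below re-derives three rows from the table; the function name `percent_change` is illustrative, and the log does not say how the figures were originally computed.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent, to one decimal."""
    return round((after - before) / before * 100, 1)


# Line counts copied from the Results table above.
rows = {
    "Total agent lines": (6235, 1409),
    "flutter-developer": (759, 61),
    "planner": (62, 31),
}

for name, (before, after) in rows.items():
    print(f"{name}: {percent_change(before, after):+.1f}%")
```

Running the same check over the remaining rows is a cheap way to catch transcription errors when the table is updated by hand.
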
### Verification
- All 29 agent YAML frontmatter preserved: ✅
- Shared blocks created and accessible: ✅
- Delegation chains intact: ✅
- Gitea integration functional: ✅ (via shared blocks)
- Estimated token savings per pipeline run: ~22,000 tokens

### Optimization Principles Applied
1. **Anthropic**: "Be clear and direct" → single-sentence roles
2. **Anthropic**: "Tell what to do, not what not to do" → positive constraints
3. **Anthropic**: XML tags for structure → XML output skeletons
4. **OpenAI**: Developer message hierarchy → Identity → Instructions → Context
5. **Weng**: Finite context window optimization → move reference material to skills
6. **DRY**: Extract duplicated content to shared blocks

---

## Entry: 2026-04-18T12:30:00+01:00

### Type
Rules Compression — eliminate token waste from globally-loaded rules

### Gap
Rules in `.kilo/rules/` are loaded into ALL agents' context. Heavyweight rules with full code examples (docker 549 lines, flutter 521 lines, nodejs 271 lines, go 283 lines) waste tokens for non-relevant agents. Two rules were pure duplicates of existing content.

### Implementation

#### Deleted (pure duplicates)
| Rule | Lines | Reason |
|------|-------|--------|
| `sdet-engineer.md` | 81 | 85% duplicate with `.kilo/agents/sdet-engineer.md` + skills |
| `orchestrator-self-evolution.md` | 540 | Replaced by `.kilo/shared/self-evolution.md` |

#### Compressed (checklists only, details in skills/)
| Rule | Before | After | Change |
|------|--------|-------|--------|
| `docker.md` | 549 | 26 | -95.3% |
| `flutter.md` | 521 | 28 | -94.6% |
| `go.md` | 283 | 21 | -92.6% |
| `nodejs.md` | 271 | 27 | -90.0% |
| `code-skeptic.md` | 59 | 14 | -76.3% |

#### Unchanged (no duplicates)
| Rule | Lines | Reason |
|------|-------|--------|
| `global.md` | 49 | Core rules, no duplicate |
| `agent-frontmatter-validation.md` | 178 | Unique validation rules |
| `agent-patterns.md` | 84 | Unique pattern reference |
| `evolutionary-sync.md` | 283 | Unique sync rules |
| `prompt-engineering.md` | 328 | Unique prompt guide |
| `history-miner.md` | 27 | Already concise |
| `lead-developer.md` | 51 | Already concise |
| `release-manager.md` | 75 | Contains auth flow specifics |

### Results

| Metric | Before | After | Change |
|--------|--------|-------|--------|
| Total rules lines | 2,358 | 1,061 | **-55.0%** |
| Rules file count | 15 | 13 | -2 (deleted) |
| Token waste per agent load | ~9,400 | ~4,200 | **-55%** |

### Verification
- [x] Duplicate files deleted (sdet-engineer, orchestrator-self-evolution)
- [x] Compressed files reference correct skills directories
- [x] No content loss — all detail moved to `.kilo/skills/` or `.kilo/shared/`
- [ ] Pipeline validation pending

---

## Entry: 2026-04-18T23:08:00+01:00

### Type
Capability Expansion + Architecture Improvements — 7 evolutionary tasks

### Gap Analysis
1. No PHP web development support (Laravel, Symfony, WordPress)
2. Agents hang on large tasks — need atomic decomposition
3. Giant monolithic files instead of modular architecture
4. Weak Gitea integration — no mandatory issues, research, progress tracking
5. BUG: Issues created in APAW instead of target project (hardcoded repo)
6. No execution logging — impossible to monitor agent performance
7. Excessive token consumption — vague task assignments, scope creep

### Implementation

#### New Agent
| Agent | Model | Purpose |
|-------|-------|---------|
| `php-developer` | qwen3-coder:480b | PHP/Laravel/Symfony/WordPress web apps |

#### New Skills (6 PHP + 1 Logging)
| Skill | Lines | Purpose |
|-------|-------|---------|
| `php-laravel-patterns` | 403 | Routing, Eloquent, Services, Repositories, Auth, Queues |
| `php-symfony-patterns` | 233 | Controllers, Doctrine, Messenger, Voters |
| `php-wordpress-patterns` | 276 | Plugins, CPT, REST API, Security |
| `php-security` | 147 | OWASP Top 10, CSRF, XSS, SQL injection |
| `php-testing` | 242 | PHPUnit, Pest, Dusk browser tests |
| `php-modular-architecture` | 242 | Module separation, interfaces, events |
| `agent-logging` | 160 | Execution logging to agent-executions.jsonl |

#### New Commands
| Command | Purpose |
|---------|---------|
| `/laravel` | Full-stack Laravel web application pipeline |
| `/wordpress` | WordPress site/plugin development pipeline |

#### New Rules (4)
| Rule | Purpose |
|------|---------|
| `atomic-tasks.md` | 1 action = 1 task, task sizing, decomposition protocol |
| `modular-code.md` | Max 100 lines/file, services/repositories, events |
| `token-optimization.md` | Token budgets, no scope creep, routing matrix |
| `gitea-centric-workflow.md` | Mandatory issues, research, progress tracking |

#### Critical Bug Fix: Target Project Resolution
- Removed ALL hardcoded `UniqueSoft/APAW` from API calls
- Added `get_target_repo()` auto-detection via `git remote`
- Updated: `gitea-api.md`, `gitea-commenting/SKILL.md`, `gitea-workflow/SKILL.md`, `gitea/SKILL.md`
- Fallback: `GITEA_TARGET_REPO` env var → `UniqueSoft/APAW` only when in APAW directory

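The resolution chain above can be sketched as follows. This is an illustration under stated assumptions, not the code in `gitea-api.md`: the helper `parse_remote_url`, the use of the `origin` remote, the exact fallback order, and the "directory is named `APAW`" test are all guesses at details the log does not spell out.

```python
import os
import re
import subprocess
from pathlib import Path
from typing import Optional

# Trailing "owner/repo" of an HTTPS or SSH remote, with or without ".git".
_REMOTE_RE = re.compile(r"[:/]([^/:]+)/([^/]+?)(?:\.git)?/?$")


def parse_remote_url(url: str) -> Optional[str]:
    """Return "owner/repo" from a git remote URL, or None if it does not match."""
    match = _REMOTE_RE.search(url.strip())
    return f"{match.group(1)}/{match.group(2)}" if match else None


def get_target_repo(cwd: Optional[Path] = None) -> str:
    """Resolve the repository that Gitea API calls should target."""
    cwd = cwd or Path.cwd()
    # 1. Auto-detect from the git remote (assumed here to be "origin").
    try:
        result = subprocess.run(
            ["git", "remote", "get-url", "origin"],
            cwd=cwd, capture_output=True, text=True, check=True,
        )
        repo = parse_remote_url(result.stdout)
        if repo:
            return repo
    except (subprocess.CalledProcessError, OSError):
        pass  # Not a git checkout, no such remote, or git is not installed.
    # 2. Explicit override through the environment.
    env_repo = os.environ.get("GITEA_TARGET_REPO", "").strip()
    if env_repo:
        return env_repo
    # 3. Last resort, only inside the APAW checkout (assumed: directory name).
    if cwd.name == "APAW":
        return "UniqueSoft/APAW"
    raise ValueError("Cannot detect target repo: add a git remote or set GITEA_TARGET_REPO")
```

Keeping the URL parsing in its own function makes the detection logic testable without a git checkout, for example `parse_remote_url("git@git.example.com:acme/shop.git")` yields `acme/shop`.
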
#### New Monitoring
- `.kilo/logs/agent-executions.jsonl` — execution log
- `scripts/agent-stats.ts` — statistics aggregator

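To illustrate the append-then-aggregate pattern behind these two files, here is a minimal sketch. The record fields (`agent`, `duration_ms`, `tokens`) and both function names are hypothetical; the real schema is defined by the `agent-logging` skill, and the real aggregator is the TypeScript script listed above.

```python
import json
from collections import defaultdict
from pathlib import Path


def log_execution(log_path: Path, agent: str, duration_ms: int, tokens: int) -> None:
    """Append one execution record as a single JSON line."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    record = {"agent": agent, "duration_ms": duration_ms, "tokens": tokens}
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def aggregate(log_path: Path) -> dict:
    """Sum runs, duration, and tokens per agent from a JSONL log."""
    totals = defaultdict(lambda: {"runs": 0, "duration_ms": 0, "tokens": 0})
    with log_path.open(encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # Tolerate blank lines in the log.
            record = json.loads(line)
            entry = totals[record["agent"]]
            entry["runs"] += 1
            entry["duration_ms"] += record["duration_ms"]
            entry["tokens"] += record["tokens"]
    return dict(totals)
```

JSONL suits this use because each agent run appends one line without rewriting the file, so a truncated or corrupted record affects only that line.
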
### Verification
- [x] PHP developer agent created with valid YAML frontmatter
- [x] Orchestrator permissions updated for php-developer
- [x] Capability index updated with php routing
- [x] All hardcoded APAW refs replaced with auto-detection
- [x] Execution logging initialized
- [x] Agent stats script functional
- [x] YAML validated (capability-index.yaml)
- [x] README updated to current state
- [x] STRUCTURE updated to current state

### Metrics
- New agents: 1 (php-developer, total now 29)
- New skills: 7 (6 PHP + 1 logging)
- New commands: 2 (laravel, wordpress)
- New rules: 4 (atomic-tasks, modular-code, token-optimization, gitea-centric)
- Hardcoded APAW refs fixed: 15+ across 5 files
- Documentation pages updated: 3 (README, STRUCTURE, EVOLUTION_LOG)

---

## Entry: 2026-04-19T10:00:00+01:00

### Type
Capability Expansion — Frontend framework skills + Python development stack

### Gap Analysis
1. No Next.js patterns — most popular full-stack React framework
2. No Vue/Nuxt patterns — major frontend framework
3. No React-only patterns — base for Next.js and many SPAs
4. No Python backend support (Django, FastAPI)
5. Frontend developer had no framework-specific skills

### Implementation

#### New Agent
| Agent | Model | Purpose |
|-------|-------|---------|
| `python-developer` | qwen3-coder:480b | Python/Django/FastAPI backend |

#### New Skills (5)
| Skill | Lines | Purpose |
|-------|-------|---------|
| `nextjs-patterns` | 290 | Next.js 14+ App Router, Server Components, Server Actions, Auth.js, API Routes |
| `vue-nuxt-patterns` | 270 | Vue 3 / Nuxt 3 Composition API, Pinia, Nitro server, SSR |
| `react-patterns` | 240 | React 18+ hooks, Context, TanStack Query, React Hook Form |
| `python-django-patterns` | 200 | Django models, DRF serializers, services, repositories |
| `python-fastapi-patterns` | 230 | FastAPI async, Pydantic schemas, SQLAlchemy, dependencies |

#### New Commands
| Command | Purpose |
|---------|---------|
| `/nextjs` | Full-stack Next.js 14+ app pipeline |
| `/vue` | Full-stack Vue/Nuxt 3 app pipeline |

#### Updated Agent
| Agent | Change |
|-------|--------|
| `frontend-developer` | Added skills: nextjs-patterns, vue-nuxt-patterns, react-patterns |

#### Updated Config
| File | Change |
|------|--------|
| `orchestrator.md` | Added python-developer permission + delegation |
| `capability-index.yaml` | Added python-developer + frontend framework capabilities + routing |

### Files Modified
- `.kilo/agents/orchestrator.md` — python-developer permission + delegation
- `.kilo/agents/frontend-developer.md` — framework skills table
- `.kilo/capability-index.yaml` — python-developer + frontend routing
- `AGENTS.md` — python-developer, frontend update, new commands

### New Files Created
- `.kilo/agents/python-developer.md`
- `.kilo/commands/nextjs.md`
- `.kilo/commands/vue.md`
- `.kilo/skills/nextjs-patterns/SKILL.md`
- `.kilo/skills/vue-nuxt-patterns/SKILL.md`
- `.kilo/skills/react-patterns/SKILL.md`
- `.kilo/skills/python-django-patterns/SKILL.md`
- `.kilo/skills/python-fastapi-patterns/SKILL.md`

### Verification
- [x] Python developer agent created with valid YAML frontmatter
- [x] Orchestrator permissions updated for python-developer
- [x] Capability index updated with python + frontend routing
- [x] Frontend developer has framework-specific skills
- [x] YAML validated (capability-index.yaml)
- [x] README updated with all frameworks
- [x] STRUCTURE updated with all skills

### Metrics
- New agents: 1 (python-developer, total now 30)
- New skills: 5 (3 frontend + 2 Python)
- New commands: 2 (nextjs, vue)
- Supported stacks: PHP, Next.js, Vue/Nuxt, React, Python, Go, Flutter, Node.js

---

## Entry: 2026-04-19T10:30:00+01:00

### Type
Security Fix — Credentials Extrication

### Gap Analysis
Hardcoded Gitea credentials (`NW` / `eshkink0t`) found in 9 files across skills, commands, rules, and specs. This violated the core security principle: **NEVER hardcode credentials in agent code.** Any agent using Gitea API had credentials baked in, making token rotation impossible and exposing passwords in version control.

### Implementation

#### New Shared Module
| File | Purpose |
|------|---------|
| `.kilo/shared/gitea-auth.md` | Centralized auth module: `get_gitea_token()`, `get_gitea_config()`, bash `get_gitea_token()`, .env template |

#### New Config Structure
| File | Purpose |
|------|---------|
| `.kilo/gitea.jsonc` | Auth structure with env var mapping — NO actual credentials |

#### Files Modified (9 files, credentials removed)

| File | Change |
|------|--------|
| `.kilo/shared/gitea-api.md` | `gitea_api()` now calls `get_gitea_token()` instead of inline Basic Auth |
| `.kilo/skills/gitea-commenting/SKILL.md` | `post_comment()` and `upload_screenshot()` now call `get_gitea_token()` |
| `.kilo/skills/gitea-workflow/SKILL.md` | `GiteaClient._get_token()` uses env vars, raises `ValueError` if empty |
| `.kilo/skills/gitea/SKILL.md` | Auth guidance points to `gitea-auth.md` |
| `.kilo/skills/task-analysis/SKILL.md` | `get_token()` reads env vars, raises `ValueError` |
| `.kilo/commands/landing-page.md` | Inline auth → env var auth with `ValueError` |
| `.kilo/commands/workflow.md` | Inline auth → env var auth with `ValueError` |
| `.kilo/commands/web-test.md` | Auth docs point to `gitea-auth.md` |
| `.kilo/rules/release-manager.md` | Removed hardcoded credentials + "password typo" tips |
| `.kilo/specs/prompt-optimization-strategy.md` | Example code uses `get_gitea_token()` + `get_target_repo()` |

#### Auth Resolution Order

```
1. GITEA_TOKEN env var → Use directly (PREFERRED)
2. GITEA_USER + GITEA_PASS → Create temporary token via Basic Auth
3. ValueError raised → No silent fail, user gets actionable message
```

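The resolution order above can be sketched as below. This is not the code in `.kilo/shared/gitea-auth.md`: the `env` parameter and the `create_token` callback are hypothetical additions that stand in for the process environment and for the Basic Auth request that creates a temporary token, so that the example runs without a Gitea server.

```python
import os
from typing import Callable, Mapping, Optional


def get_gitea_token(
    env: Optional[Mapping[str, str]] = None,
    create_token: Optional[Callable[[str, str], str]] = None,
) -> str:
    """Resolve a Gitea token: GITEA_TOKEN, then GITEA_USER + GITEA_PASS, else fail."""
    env = os.environ if env is None else env

    # 1. Preferred: a token supplied directly.
    token = env.get("GITEA_TOKEN", "").strip()
    if token:
        return token

    # 2. Fallback: exchange user and password for a temporary token.
    #    Empty strings count as missing, matching the empty-string check
    #    noted for GiteaClient._get_token().
    user = env.get("GITEA_USER", "").strip()
    password = env.get("GITEA_PASS", "").strip()
    if user and password:
        if create_token is None:
            raise ValueError("GITEA_USER/GITEA_PASS are set but no token-creation callback was given")
        return create_token(user, password)

    # 3. No silent failure: tell the caller exactly what to set.
    raise ValueError("No Gitea credentials: set GITEA_TOKEN, or GITEA_USER and GITEA_PASS")
```

For example, `get_gitea_token(env={"GITEA_TOKEN": "abc"})` returns `"abc"`, while an empty environment raises `ValueError` instead of sending an unauthenticated request.
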
### Verification
- [x] Zero hardcoded credentials remain in codebase
- [x] All Gitea API callers use env vars or `get_gitea_token()`
- [x] `GiteaClient._get_token()` checks empty string for user/pass
- [x] `upload_screenshot()` uses centralized auth
- [x] `task-analysis` functions use `get_token()` from env vars
- [x] `ValueError` raised (not silent fail) when no credentials
- [x] Agents can authenticate via `GITEA_TOKEN` env var at runtime
- [x] `.gitignore` includes `.env`

### Metrics
- Hardcoded credentials removed: 9 instances across 9 files
- New shared modules: 2 (gitea-auth.md, gitea.jsonc)
- Security score: Critical → Resolved