
Kilo Orchestrator 4e9ea678bd feat(orchestrator): evolution — capability-first routing, parallelization, zero-work policy

- orchestrator.md: add Capability-First Routing Protocol (5-step anti-regression)
- orchestrator.md: add Testing Task Routing Matrix (browser-automation, visual-tester)
- orchestrator.md: add Parallelization Protocol (review_phase + testing_phase parallel groups)
- orchestrator.md: add Orchestrator Self-Delegation Prohibition (ZERO WORK POLICY)
- capability-index.yaml: enrich parallel_groups with trigger/criteria/aggregator
- capability-index.yaml: enrich iteration_loops with trigger_on fields
- global.md: add Orchestrator Capability-First Check under Tooling Infrastructure
- docker.md: add Host Installation Prohibition (STOP/READ/DELEGATE/REPORT)
- EVOLUTION_LOG.md: log both evolution entries (2026-05-16T13:00 and 13:06)

Addresses: orchestrator host tool install regression, serial execution waste,
orchestrator self-work bypass of specialized agents.

2026-05-16 13:10:06 +01:00

7.9 KiB


Global Rules

Always write clean code following project conventions
Answer in the same language the question was asked
Use clear, concise markdown formatting
Keep responses short and actionable
Never commit changes unless explicitly requested

Code Style

Follow existing code patterns in the codebase
Never add comments unless explicitly asked
Use existing libraries and utilities
Check package.json, Cargo.toml, or equivalent before adding dependencies
Follow security best practices - never expose secrets or keys

Prohibitions

NEVER update git config
NEVER run destructive git commands (push --force, reset --hard) unless explicitly requested
NEVER skip git hooks or commit signing (--no-verify, --no-gpg-sign)
NEVER use interactive git commands (-i flag)
NEVER write malicious code or explain malicious code behavior

Communication

Be direct and to the point
Minimize output tokens while maintaining accuracy
One-word answers are best when appropriate
Avoid introductions, conclusions, and unnecessary explanations
Output text to communicate with the user; use tools to complete tasks

YAML Frontmatter Rules (All Agents)

When generating or editing any .md file with YAML frontmatter (agents, commands, skills, rules):

color must be double-quoted: always "#RRGGBB", never bare #RRGGBB
mode must be valid: only subagent or all, never primary
model must include provider: ollama-cloud/..., never bare model ID
description must be non-empty
all permission keys required: read, edit, write, bash, glob, grep, task
task permission uses deny-by-default with explicit allow-list

Critical: An unquoted # starts a YAML comment, so the color value is parsed as empty and config validation fails with:

Config file invalid: color: Invalid input

Always verify generated frontmatter with: node scripts/validate-agents.cjs
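The rules above can be sketched as a small check over an already-parsed frontmatter mapping. This is an illustrative sketch only, not the implementation of scripts/validate-agents.cjs; the function name `check_frontmatter`, the error strings, and the assumption that permissions live under a `permission` key are hypothetical.

```python
import re

REQUIRED_PERMISSION_KEYS = {"read", "edit", "write", "bash", "glob", "grep", "task"}
VALID_MODES = {"subagent", "all"}
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def check_frontmatter(fm):
    """Return a list of rule violations for one parsed frontmatter mapping."""
    errors = []

    # An unquoted #RRGGBB is read as a YAML comment, so the parsed value is None.
    color = fm.get("color")
    if not isinstance(color, str) or not HEX_COLOR.match(color):
        errors.append('color must be a quoted "#RRGGBB" string')

    if fm.get("mode") not in VALID_MODES:
        errors.append("mode must be subagent or all")

    # A provider prefix is assumed to look like "provider/model-id".
    model = fm.get("model")
    if not isinstance(model, str) or "/" not in model.strip("/"):
        errors.append("model must include a provider prefix")

    description = fm.get("description")
    if not isinstance(description, str) or not description.strip():
        errors.append("description must be non-empty")

    permission = fm.get("permission")
    keys = set(permission) if isinstance(permission, dict) else set()
    missing = sorted(REQUIRED_PERMISSION_KEYS - keys)
    if missing:
        errors.append("missing permission keys: " + ", ".join(missing))

    return errors
```

An empty list means the mapping passes these checks; the project's own validator remains the authority.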

Security & Permissions (v2026-05-07)

Subagent Cascade Prevention

Any agent with mode: subagent MUST have "*": "deny" in permission.task
Subagents MUST NOT invoke the task tool to spawn further subagents
Orchestrator (mode: all) is the ONLY agent allowed to use task tool
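A minimal sketch of how the cascade rule could be audited across agents, assuming each agent's frontmatter has already been parsed into a mapping. The function name `cascade_violations` and the input shape are hypothetical.

```python
def cascade_violations(agents):
    """Return the names of subagents that lack "*": "deny" in permission.task.

    `agents` maps an agent name to its parsed frontmatter mapping.
    """
    offenders = []
    for name, fm in agents.items():
        if fm.get("mode") != "subagent":
            continue  # only subagents are constrained by this rule
        task = (fm.get("permission") or {}).get("task")
        if not isinstance(task, dict) or task.get("*") != "deny":
            offenders.append(name)
    return sorted(offenders)
```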

Task Critical Assessment

Before executing any user request, apply the criteria from task-critical-assessment.md.

If any criterion triggers → STOP and ask for clarification.

Key checks:

Is this an abstraction over an already-local API?
Does it add layers without a proven failure mode?
Is the environment more complex than the task itself?
Are there clear acceptance criteria with measurable outcomes?
Has this (or something similar) been rolled back before?

Bash Hardening

Default bash permission for agents: ask (not allow)
Agents that REQUIRE shell execution for their core function MAY have bash: "allow" with explicit justification:
- lead-developer: build, test, and tooling commands
- devops-engineer: Docker, CI/CD, infrastructure commands
- code-skeptic: read-only inspection commands (git, grep, cat)
- the-fixer: debugging and verification commands
- frontend-developer, backend-developer, go-developer, php-developer, python-developer: framework-specific build tools
- sdet-engineer: test runner execution
- browser-automation: Playwright CLI commands
- product-owner: administrative scripts
- visual-tester: screenshot tooling
All other agents (including orchestrator) MUST use bash: "ask"
Safe command allowlist: git, cat, ls, grep, find, node, python3, bun, docker (non-privileged)
Forbidden: curl, wget, eval, exec, source, sh, bash, sudo, rm -rf, > redirection to system paths
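The allowlist and forbidden list above could be applied roughly as follows. This is a deliberately conservative sketch: the function `classify_command` is hypothetical, any forbidden word anywhere in the command causes a deny, and treating `--privileged` and every shell operator as a reason to ask is an assumption about what "non-privileged" and "redirection to system paths" mean.

```python
import shlex

ALLOWED = {"git", "cat", "ls", "grep", "find", "node", "python3", "bun", "docker"}
FORBIDDEN = {"curl", "wget", "eval", "exec", "source", "sh", "bash", "sudo"}


def classify_command(command):
    """Return 'allow', 'deny', or 'ask' for a single shell command string."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return "deny"  # unbalanced quotes: refuse rather than guess
    if not tokens:
        return "deny"

    # Conservative: a forbidden word in any position is rejected.
    if any(tok in FORBIDDEN for tok in tokens):
        return "deny"

    # rm combined with a flag containing both r and f (e.g. -rf, -fr).
    if "rm" in tokens and any(
        t.startswith("-") and "r" in t and "f" in t for t in tokens
    ):
        return "deny"

    # Pipes, chaining, redirection, and substitution are escalated (assumption).
    has_operator = any(ch in command for ch in "|;&><`$")
    if tokens[0] in ALLOWED and not has_operator and "--privileged" not in tokens:
        return "allow"
    return "ask"
```

Because matching is by token, a harmless command such as `grep bash notes.txt` is also denied; a stricter parser would be needed to avoid that false positive.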

Tooling Infrastructure — Use What Exists

Before attempting to install ANY browser automation or testing tool, check the project's existing infrastructure.

Orchestrator Capability-First Check

When the orchestrator receives a task:

1. Check .kilo/agents/*.md: does a specialized agent exist?
2. Check .kilo/skills/*/SKILL.md: does a skill cover this domain?
3. Check docker/docker-compose.*.yml: does a Docker service already run the required tool?
4. If yes to any of 1–3: Delegate via Task tool with matching subagent_type. Host installation is PROHIBITED.
5. If no to all: Invoke @capability-analyst for gap analysis. Do NOT attempt manual host setup.
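The five steps above can be sketched as a lookup over the three locations. The function `route_task` and its naive keyword match are hypothetical; how the orchestrator actually decides that an agent, skill, or service "covers" a task is not specified here.

```python
from pathlib import Path

# The three locations named in the check, in order.
CAPABILITY_PATTERNS = [
    ".kilo/agents/*.md",
    ".kilo/skills/*/SKILL.md",
    "docker/docker-compose.*.yml",
]


def route_task(project_root, keyword):
    """Return ('delegate', matching_paths) or ('capability-analyst', [])."""
    root = Path(project_root)
    needle = keyword.lower()
    hits = []
    for pattern in CAPABILITY_PATTERNS:
        for path in sorted(root.glob(pattern)):
            text = path.read_text(encoding="utf-8", errors="ignore")
            if needle in text.lower():  # naive match, an assumption
                hits.append(str(path.relative_to(root)))
    if hits:
        return ("delegate", hits)
    return ("capability-analyst", [])
```

Neither outcome installs anything on the host: a match leads to delegation, and no match leads to gap analysis.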

Playwright / Visual Testing

The project already has a complete visual testing stack:

Image: mcr.microsoft.com/playwright:v1.52.0-noble
Compose: docker/docker-compose.web-testing.yml
Scripts: tests/scripts/capture-screenshots.js, tests/scripts/console-error-monitor-standalone.js
Reports: tests/reports/visual-test-report.json

Command patterns:

# Console error check
cd /home/swp/Projects/APAW && \
  TARGET_URL=http://host.docker.internal:8089 \
  docker compose -f docker/docker-compose.web-testing.yml run --rm console-monitor

# Screenshot capture
cd /home/swp/Projects/APAW && \
  TARGET_URL=http://host.docker.internal:8089 \
  docker compose -f docker/docker-compose.web-testing.yml run --rm screenshot-current

# Full visual comparison
cd /home/swp/Projects/APAW && \
  TARGET_URL=http://host.docker.internal:8089 \
  docker compose -f docker/docker-compose.web-testing.yml run --rm visual-tester
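For scripted use, the three command patterns above could be wrapped as below. The helper names `compose_command` and `run_service` are hypothetical; the compose file path and service names are taken from the commands shown, and the sketch assumes the Docker Compose v2 CLI is available.

```python
import os
import subprocess

COMPOSE_FILE = "docker/docker-compose.web-testing.yml"
SERVICES = {"console-monitor", "screenshot-current", "visual-tester"}


def compose_command(service):
    """Build the argv for one of the existing visual-testing services."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service}")
    return ["docker", "compose", "-f", COMPOSE_FILE, "run", "--rm", service]


def run_service(service, target_url, project_root):
    """Run a service from the project root and return its exit code.

    On failure the caller should report the error, not fall back to the host.
    """
    env = dict(os.environ, TARGET_URL=target_url)
    result = subprocess.run(
        compose_command(service), cwd=project_root, env=env, check=False
    )
    return result.returncode
```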

Prohibited Actions

❌ NEVER install playwright, selenium, puppeteer, chromedriver on the host
❌ NEVER run pip install playwright or npm install playwright on the host
❌ NEVER use firefox --headless --screenshot (SWGL errors, unreliable)
❌ NEVER pull new Playwright images without checking existing ones first
✅ ALWAYS use the existing Docker compose services
✅ If Docker service fails, report failure — do not attempt host-level fallback

Rationale

Host-level browser automation requires an X11/display stack, GPU drivers, and sandbox configuration, all of which break in headless environments. The Docker stack was built explicitly to solve this, so host-level installation is always a waste of tokens and time.

Tool-First Enforcement (Global)

All agents MUST follow these rules to prevent hallucination and passive chat responses.

Rule 1: Read Before You Write

Before any code edit: Read the target file with Read. Do NOT edit without reading.
Before searching: Use Grep or codebase_search to find related code. Do NOT guess where things are.
Before listing files: Use Glob to understand directory structure. Do NOT assume file paths.

Rule 2: Context First, Answer Second

The first turn of every agent task MUST be tool calls (Read, Grep, Glob, codebase_search) — never free-text analysis.
Agent must gather relevant file contents and dependencies before producing conclusions.
If the task references a file path, that file MUST be Read before any other action.

Rule 3: No Output Without Action

Every response MUST be backed by a concrete tool call (Read, Edit, Write, Bash, Task, codebase_search) or by a verifiable completed subtask.
If the agent cannot act (blocked, missing permissions, ambiguous task), it MUST report the blocker explicitly and STOP — not generate filler text.
Anti-pattern: "I will now search for..." followed by no tool call. Agents DO NOT announce actions — they execute them.

Rule 4: Verification via Bash

After code changes: run relevant commands (tests, lint, build) via Bash.
After research: verify findings with a concrete command or file read.
If bash is set to "ask": report the command that would verify the result.

Violation Consequences

Agents generating multi-paragraph analysis without any tool call will be treated as hallucinating and flagged for prompt-optimizer review.
Orchestrator MUST reject agent outputs that contain no <action_taken> evidence.
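A minimal sketch of the rejection check, assuming evidence appears as a non-empty `<action_taken>...</action_taken>` element in the agent output. Only the opening tag is named above; the closing tag, the element's contents, and the function name `has_action_evidence` are assumptions.

```python
import re

# Assumed evidence format: <action_taken> ... </action_taken>
ACTION_TAKEN = re.compile(r"<action_taken>(.*?)</action_taken>", re.DOTALL)


def has_action_evidence(output):
    """Return True if the output contains at least one non-empty evidence block."""
    return any(body.strip() for body in ACTION_TAKEN.findall(output))
```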

Config File Protection

Editing files in .kilo/ (agents, rules, skills) requires an explicit permission prompt
kilo.jsonc is read-only for all agents except the orchestrator in explicit config-sync mode
Any edit to kilo.jsonc must be preceded by a schema validation check
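The three protection rules can be read as a small decision function. This is a sketch under assumptions: the function `edit_decision`, its flags, and the returned labels are hypothetical, and kilo.jsonc is matched by file name because its location is not stated here.

```python
from pathlib import PurePosixPath


def edit_decision(path, agent, config_sync=False, schema_valid=False):
    """Return 'allow', 'deny', 'ask', or 'default' for a proposed file edit."""
    p = PurePosixPath(path)
    if p.name == "kilo.jsonc":
        # Only the orchestrator, in config-sync mode, after schema validation.
        if agent == "orchestrator" and config_sync and schema_valid:
            return "allow"
        return "deny"
    if ".kilo" in p.parts:
        return "ask"  # explicit permission prompt required
    return "default"  # not covered by these rules
```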
