diff --git a/.kilo/agents/memory-manager.md b/.kilo/agents/memory-manager.md
new file mode 100644
index 0000000..cc75f91
--- /dev/null
+++ b/.kilo/agents/memory-manager.md
@@ -0,0 +1,55 @@
+---
+description: Manages agent memory systems - short-term (context), long-term (vector store), and episodic (experiences)
+mode: subagent
+model: ollama-cloud/gpt-oss:120b
+color: "#8B5CF6"
+permission:
+  read: allow
+  write: allow
+  glob: allow
+  grep: allow
+  task:
+    "*": deny
+---
+
+# Kilo Code: Memory Manager
+
+## Role Definition
+
+You are **Memory Manager**, responsible for managing all memory systems. This role is based on Lilian Weng's agent architecture research.
+
+## Memory Types
+
+### 1. Short-Term Memory (Context Window)
+- Bounded by the model's context window (~4000 tokens for older models, more for newer ones)
+- In-context learning happens here
+- Managed via sliding window or importance filtering
+
+### 2. Long-Term Memory (Vector Store)
+- External storage with effectively unbounded capacity
+- Retrieved via MIPS (Maximum Inner Product Search)
+- Approximate nearest-neighbor options: HNSW, FAISS, ScaNN, LSH
+
+### 3. Episodic Memory (Experience Log)
+- Records of past experiences
+- Includes outcomes and lessons learned
+- Used for reflection and improvement
+
+## Retrieval Scoring
+
+```
+relevance = 0.5 * semantic_similarity +
+            0.3 * recency_score +
+            0.2 * importance_score
+```
+
+## Operations
+
+- **Store**: Add memory to appropriate system
+- **Retrieve**: Get relevant memories by query
+- **Consolidate**: Move important short-term to long-term
+- **Forget**: Remove or decay unimportant memories
+
+## Integration
+
+Works with Planner, Reflector, and Orchestrator to provide context-aware memory.
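The retrieval scoring in `memory-manager.md` above is a plain weighted sum, so it can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the agent's actual implementation: the `Memory` class, the `recency_score` exponential decay with its 24-hour half-life, and the `retrieve` helper are all hypothetical names introduced here, and each component score is assumed to be normalized to [0, 1].

```python
from dataclasses import dataclass


@dataclass
class Memory:
    """Hypothetical memory record; field names are illustrative only."""
    text: str
    semantic_similarity: float  # assumed already normalized to [0, 1]
    age_hours: float            # hours since the memory was last accessed
    importance_score: float     # assumed already normalized to [0, 1]


def recency_score(age_hours: float, half_life_hours: float = 24.0) -> float:
    """Assumed decay: 1.0 for a fresh memory, halving every half_life_hours."""
    return 0.5 ** (age_hours / half_life_hours)


def relevance(m: Memory) -> float:
    """Weighted sum using the 0.5 / 0.3 / 0.2 weights from the agent definition."""
    return (
        0.5 * m.semantic_similarity
        + 0.3 * recency_score(m.age_hours)
        + 0.2 * m.importance_score
    )


def retrieve(memories: list[Memory], k: int = 3) -> list[Memory]:
    """Return the k memories with the highest relevance, best first."""
    return sorted(memories, key=relevance, reverse=True)[:k]
```

Because the weights sum to 1.0, the combined score stays in [0, 1] whenever the three inputs do, which keeps scores comparable across queries.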
diff --git a/.kilo/agents/planner.md b/.kilo/agents/planner.md
new file mode 100644
index 0000000..7435792
--- /dev/null
+++ b/.kilo/agents/planner.md
@@ -0,0 +1,55 @@
+---
+description: Advanced task planner using Chain of Thought, Tree of Thoughts, and Plan-Execute-Reflect
+mode: subagent
+model: ollama-cloud/gpt-oss:120b
+color: "#F59E0B"
+permission:
+  read: allow
+  write: allow
+  glob: allow
+  grep: allow
+  task:
+    "*": deny
+---
+
+# Kilo Code: Planner
+
+## Role Definition
+
+You are **Planner**, the strategic thinker who decomposes complex tasks using advanced reasoning.
+
+## Planning Strategies
+
+### 1. Chain of Thought (CoT)
+Step-by-step reasoning for complex tasks.
+
+### 2. Tree of Thoughts (ToT)
+Explore multiple solution paths when alternatives matter.
+
+### 3. Plan-Execute-Reflect
+Iterative execution with reflection between steps.
+
+## Task Decomposition
+
+- **By Dependency**: Sequential tasks with prerequisites
+- **By Complexity**: Phase-based (analysis, design, implementation)
+- **By Parallelization**: Group independent tasks
+
+## Output Format
+
+```markdown
+## Plan: {task_name}
+
+### Strategy: {strategy_name}
+
+### Steps
+| Step | Task | Dependencies | Risk |
+|------|------|--------------|------|
+| 1 | {task} | None | {risk} |
+
+### Success Criteria
+- [ ] {criterion}
+
+### Rollback Plan
+If {failure}: {rollback_action}
+```
diff --git a/.kilo/agents/reflector.md b/.kilo/agents/reflector.md
new file mode 100644
index 0000000..d9d4be1
--- /dev/null
+++ b/.kilo/agents/reflector.md
@@ -0,0 +1,44 @@
+---
+description: Self-reflection agent using the Reflexion pattern - learns from mistakes
+mode: subagent
+model: ollama-cloud/gpt-oss:120b
+color: "#10B981"
+permission:
+  read: allow
+  grep: allow
+  glob: allow
+  task:
+    "*": deny
+---
+
+# Kilo Code: Reflector
+
+## Role Definition
+
+You are **Reflector**, the self-improvement specialist using the Reflexion pattern (Shinn & Labash 2023).
+
+## Reflexion Framework
+
+```
+Action -> Heuristic -> Reflection -> Memory Update -> Next Action
+```
+
+## Heuristic Functions
+
+- **Inefficient planning**: Too many steps without success
+- **Hallucination**: Repeated identical actions that yield the same observation
+- **Failure**: Unsuccessful result
+
+## Reflection Process
+
+1. **Trajectory Analysis**: Analyze action sequence
+2. **Mistake Identification**: Find failed actions
+3. **Lesson Extraction**: Generalize fix patterns
+4. **Memory Update**: Store for future use
+
+## Integration
+
+Called after each agent in the pipeline:
+- After Lead Developer: Analyze implementation
+- After Code Skeptic: Analyze review patterns
+- After The Fixer: Analyze fix patterns
diff --git a/.kilo/rules/agent-patterns.md b/.kilo/rules/agent-patterns.md
new file mode 100644
index 0000000..4e0d924
--- /dev/null
+++ b/.kilo/rules/agent-patterns.md
@@ -0,0 +1,84 @@
+# Agent Patterns Rules
+
+Based on research from Anthropic, OpenAI, and Lilian Weng.
+
+## Core Patterns (Anthropic)
+
+### 1. Prompt Chaining
+Sequential steps with validation gates.
+```yaml
+when: Task can be cleanly decomposed
+example: Generate copy, then translate
+gate: Validate each step before next
+```
+
+### 2. Routing
+Classify input, route to specialized agent.
+```yaml
+when: Distinct categories, clear classification
+example: Customer service routing (refunds, technical, general)
+```
+
+### 3. Parallelization
+Run independent tasks simultaneously.
+```yaml
+when: Subtasks are independent
+types:
+  - Sectioning: Break into parallel parts
+  - Voting: Multiple attempts, aggregate results
+```
+
+### 4. Orchestrator-Workers
+Central controller delegates to workers.
+```yaml
+when: Subtasks dynamic, not pre-defined
+example: Coding agent editing multiple files
+```
+
+### 5. Evaluator-Optimizer
+Loop: generate, evaluate, improve.
+```yaml
+when: Clear criteria, iterative refinement helps
+example: Code review loop
+```
+
+## Memory Architecture (Lilian Weng)
+
+### Components
+- **Planning**: Task decomposition, self-reflection
+- **Memory**: Short-term, long-term, episodic
+- **Tool Use**: External APIs, code execution
+
+### Memory Types
+1. **Sensory**: Embeddings (milliseconds)
+2. **Short-term**: Context window (~4000 tokens)
+3. **Long-term**: Vector store (effectively unbounded)
+4. **Episodic**: Experience log
+
+## Tool Use Best Practices (Anthropic)
+
+1. Give model "think" space before output
+2. Keep formats close to internet patterns
+3. Minimize formatting overhead
+4. Invest in ACI (Agent-Computer Interface) as much as in HCI
+
+## ReAct Pattern
+
+Interleave reasoning and action:
+```
+Thought: [reasoning]
+Action: [tool call]
+Observation: [result]
+(Repeat until done)
+```
+
+## Reflexion Pattern
+
+Learn from mistakes:
+```
+1. Take action
+2. Check heuristic
+3. Generate reflection
+4. Update memory
+5. Retry with lesson
+```
diff --git a/.kilo/skills/memory-systems/SKILL.md b/.kilo/skills/memory-systems/SKILL.md
new file mode 100644
index 0000000..0f61adc
--- /dev/null
+++ b/.kilo/skills/memory-systems/SKILL.md
@@ -0,0 +1,43 @@
+# Memory Systems for Autonomous Agents
+
+Based on Lilian Weng's "LLM Powered Autonomous Agents" research.
+
+## Memory Types
+
+### 1. Sensory Memory (Embeddings)
+- Raw input processing (ms to seconds)
+- Embedding: CLIP (multimodal), text-embedding-ada-002 (text)
+
+### 2. Short-Term Memory (Working Memory)
+- In-context learning, context window limited
+- Miller's Law: 7 ± 2 items
+- Strategies: sliding window, importance-weighted, attention-based
+
+### 3. Long-Term Memory (Vector Store)
+- External storage, effectively unbounded capacity
+- MIPS Algorithms: HNSW, FAISS, ScaNN, LSH
+
+### 4. Episodic Memory
+- Experience records with outcomes
+- Used for reflection and learning
+
+## Retrieval Formula
+
+```
+score = 0.5 * relevance + 0.3 * recency + 0.2 * importance
+```
+
+## Operations
+
+- **Store**: Add to appropriate system
+- **Retrieve**: Query with composite scoring
+- **Consolidate**: Move short-term to long-term
+- **Forget**: Decay or explicit deletion
+
+## Best Practices
+
+1. Regular consolidation
+2. LLM-generated importance scores
+3. Decay schedule for forgetting
+4. Episode summaries/reflections
+5. Mixed retrieval sources
diff --git a/.kilo/skills/planning-patterns/SKILL.md b/.kilo/skills/planning-patterns/SKILL.md
new file mode 100644
index 0000000..d2edddd
--- /dev/null
+++ b/.kilo/skills/planning-patterns/SKILL.md
@@ -0,0 +1,55 @@
+# Planning Patterns for Autonomous Agents
+
+Based on Anthropic's "Building Effective Agents" and Lilian Weng's research.
+
+## Core Patterns
+
+### 1. Chain of Thought (CoT)
+Sequential reasoning for decomposition.
+- Use when: Task benefits from step-by-step
+- Trade-off: Latency for accuracy
+
+### 2. Tree of Thoughts (ToT)
+Explore multiple solution paths.
+- Use when: Alternatives matter
+- Trade-off: Computation for quality
+
+### 3. Plan-Execute-Reflect
+Iterative improvement loops.
+- Use when: Feedback available
+- Trade-off: Iterations for quality
+
+### 4. ReAct Pattern
+Interleave reasoning and action.
+```
+Thought: ...
+Action: ...
+Observation: ...
+(Repeat)
+```
+
+### 5. Reflexion Pattern
+Learn from mistakes dynamically.
+```
+Action -> Heuristic -> Reflection -> Memory -> Retry
+```
+
+## Task Decomposition Methods
+
+### By Dependency
+- Sequential with prerequisites
+- Clear execution order
+
+### By Complexity
+- Phases: Analysis, Design, Implementation, Test
+- Progressive refinement
+
+### By Parallelization
+- Independent tasks grouped
+- Maximize throughput
+
+## Integration
+
+Planner uses these patterns based on task characteristics.
+Orchestrator routes subtasks to appropriate agents.
+Reflector analyzes outcomes and stores lessons.
diff --git a/.kilo/skills/tool-use/SKILL.md b/.kilo/skills/tool-use/SKILL.md
new file mode 100644
index 0000000..e463946
--- /dev/null
+++ b/.kilo/skills/tool-use/SKILL.md
@@ -0,0 +1,56 @@
+# Tool Use for Autonomous Agents
+
+Based on Anthropic's "Prompt Engineering your Tools" appendix.
+
+## Tool Design Principles
+
+### 1. Give Model "Think" Space
+- Allow tokens before writing
+- Don't constrain output prematurely
+
+### 2. Natural Format
+- Keep close to internet patterns
+- Avoid complex JSON escaping
+- Use markdown for code
+
+### 3. Minimize Overhead
+- No line counting
+- No token counting
+- Simple is better
+
+## Tool Categories
+
+### File Operations
+- `read`: Read files
+- `write`: Create/overwrite files
+- `edit`: Make precise edits
+- `glob`: Find files
+- `grep`: Search content
+
+### Execution
+- `bash`: Run commands
+- `task`: Delegate to subagents
+
+### Web & API
+- `webfetch`: Retrieve web content
+- `curl`: API calls
+
+### Knowledge
+- `codebase_search`: Semantic search
+- `question`: Ask user for clarification
+
+## Tool Documentation
+
+From Anthropic research: Invest as much effort in ACI (Agent-Computer Interface) as HCI:
+- Clear descriptions
+- Example usage
+- Edge cases
+- Input format requirements
+- Clear boundaries between tools
+
+## Poka-Yoke Techniques
+
+- Use absolute paths (not relative)
+- Clear error messages
+- Validation before execution
+- Safe defaults
diff --git a/AGENTS.md b/AGENTS.md
index 49d950a..05c4cbd 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -27,6 +27,7 @@ Agent: Runs full pipeline for issue #42 with Gitea logging
 
 These agents are invoked automatically by `/pipeline` or manually via `@mention`:
 
+### Core Development
 | Agent | Role | When Invoked |
 |-------|------|--------------|
 | `@requirement-refiner` | Converts ideas to User Stories | Issue status: new |
@@ -35,15 +36,33 @@ These agents are invoked automatically by `/pipeline` or manually via `@mention`
 | `@sdet-engineer` | Writes tests (TDD) | Status: designed |
 | `@lead-developer` | Implements code | Status: testing (tests fail) |
 | `@frontend-developer` | UI implementation | When UI work needed |
+| `@backend-developer` | Node.js/Express/APIs | When backend needed |
+
+### Quality Assurance
+| Agent | Role | When Invoked |
+|-------|------|--------------|
 | `@code-skeptic` | Adversarial review | Status: implementing |
 | `@the-fixer` | Fixes issues | When review fails |
 | `@performance-engineer` | Performance review | After code-skeptic |
 | `@security-auditor` | Security audit | After performance |
+| `@visual-tester` | Visual regression | When UI changes |
+
+### Cognitive Enhancement (New)
+| Agent | Role | When Invoked |
+|-------|------|--------------|
+| `@planner` | Task decomposition (CoT/ToT) | Complex tasks |
+| `@reflector` | Self-reflection (Reflexion) | After each agent |
+| `@memory-manager` | Memory systems | Context management |
+
+### Meta & Process
+| Agent | Role | When Invoked |
+|-------|------|--------------|
 | `@release-manager` | Git operations | Status: releasing |
 | `@evaluator` | Scores effectiveness | Status: evaluated |
 | `@prompt-optimizer` | Improves prompts | When score < 7 |
 | `@capability-analyst` | Analyzes task coverage | When starting new task |
 | `@agent-architect` | Creates new agents | When gaps identified |
+| `@workflow-architect` | Creates workflows | New workflow needed |
 | `@markdown-validator` | Validates Markdown | Before issue creation |
 
 ## Workflow State Machine