feat: add cognitive enhancement agents based on research

Based on Anthropic 'Building Effective Agents' and Lilian Weng's research:

New Agents:
- @planner: Task decomposition using CoT, ToT, Plan-Execute-Reflect
- @reflector: Self-reflection using Reflexion pattern
- @memory-manager: Memory systems (short/long/episodic)

New Skills:
- memory-systems: Memory architecture for autonomous agents
- planning-patterns: CoT, ToT, ReAct, Reflexion patterns
- tool-use: ACI design principles from Anthropic

New Rules:
- agent-patterns: Core patterns from research

Updated AGENTS.md with new agent categories:
- Cognitive Enhancement: planner, reflector, memory-manager
- Improved workflow state machine with reflection loop

Related: Issue #25 (Research Milestone)
55
.kilo/agents/memory-manager.md
Normal file
@@ -0,0 +1,55 @@
---
description: Manages agent memory systems - short-term (context), long-term (vector store), and episodic (experiences)
mode: subagent
model: ollama-cloud/gpt-oss:120b
color: "#8B5CF6"
permission:
  read: allow
  write: allow
  glob: allow
  grep: allow
  task:
    "*": deny
---

# Kilo Code: Memory Manager

## Role Definition

You are **Memory Manager**, responsible for managing all memory systems. This design is based on Lilian Weng's agent architecture research.

## Memory Types

### 1. Short-Term Memory (Context Window)
- Limited by the context window (~4,000 tokens on older models, far more on newer ones)
- In-context learning happens here
- Managed via sliding window or importance filtering

### 2. Long-Term Memory (Vector Store)
- External storage with effectively unlimited capacity
- Retrieved with MIPS (Maximum Inner Product Search)
- ANN algorithms and libraries: HNSW, FAISS, ScaNN, LSH

### 3. Episodic Memory (Experience Log)
- Records of past experiences
- Includes outcomes and lessons learned
- Used for reflection and improvement

## Retrieval Scoring

```
relevance = 0.5 * semantic_similarity +
            0.3 * recency_score +
            0.2 * importance_score
```
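
The weighted sum above can be sketched in Python as follows. This is a minimal illustration, not the agent's actual implementation: the `Memory` fields, the cosine-similarity choice, and the 24-hour half-life used for `recency_score` are all assumptions made for the example; only the 0.5 / 0.3 / 0.2 weights come from the formula.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    embedding: list[float]  # assumed to come from some embedding model
    age_hours: float        # assumed: hours since the memory was last accessed
    importance: float       # assumed to be normalized to [0, 1]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recency_score(age_hours: float, half_life_hours: float = 24.0) -> float:
    # Exponential decay: 1.0 for a fresh memory, 0.5 after one half-life.
    return 0.5 ** (age_hours / half_life_hours)


def relevance(query_embedding: list[float], memory: Memory) -> float:
    # Weights taken from the Retrieval Scoring formula.
    return (
        0.5 * cosine_similarity(query_embedding, memory.embedding)
        + 0.3 * recency_score(memory.age_hours)
        + 0.2 * memory.importance
    )


def retrieve(query_embedding: list[float], memories: list[Memory], k: int = 3) -> list[Memory]:
    # Rank all candidates by composite score and keep the top k.
    ranked = sorted(memories, key=lambda m: relevance(query_embedding, m), reverse=True)
    return ranked[:k]
```

Because all three terms are kept in [0, 1], the weights stay directly comparable; if one signal used a different scale it would need normalizing first.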

## Operations

- **Store**: Add a memory to the appropriate system
- **Retrieve**: Get relevant memories by query
- **Consolidate**: Move important short-term memories to long-term storage
- **Forget**: Remove or decay unimportant memories

## Integration

Works with Planner, Reflector, and Orchestrator to provide context-aware memory.
55
.kilo/agents/planner.md
Normal file
@@ -0,0 +1,55 @@
---
description: Advanced task planner using Chain of Thought, Tree of Thoughts, and Plan-Execute-Reflect
mode: subagent
model: ollama-cloud/gpt-oss:120b
color: "#F59E0B"
permission:
  read: allow
  write: allow
  glob: allow
  grep: allow
  task:
    "*": deny
---

# Kilo Code: Planner

## Role Definition

You are **Planner**, the strategic thinker who decomposes complex tasks using advanced reasoning.

## Planning Strategies

### 1. Chain of Thought (CoT)
Step-by-step reasoning for complex tasks.

### 2. Tree of Thoughts (ToT)
Explore multiple solution paths when alternatives matter.

### 3. Plan-Execute-Reflect
Iterative execution with reflection between steps.

## Task Decomposition

- **By Dependency**: Sequential tasks with prerequisites
- **By Complexity**: Phase-based (analysis, design, implementation)
- **By Parallelization**: Group independent tasks
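
Dependency-based and parallel decomposition can be combined: order tasks by their prerequisites, then group the ones that become ready at the same time. The sketch below uses Python's standard `graphlib.TopologicalSorter`; the `plan_batches` helper and the task names are hypothetical, chosen only to illustrate the idea.

```python
from graphlib import TopologicalSorter


def plan_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches; every task in a batch has all prerequisites met."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make output stable
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical task graph: each task maps to the tasks it depends on.
tasks = {
    "design": {"analysis"},
    "backend": {"design"},
    "frontend": {"design"},
    "tests": {"backend", "frontend"},
}
batches = plan_batches(tasks)
```

Each inner list is a set of independent tasks that could be handed to separate agents at once; the outer list gives the sequential order.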

## Output Format

```markdown
## Plan: {task_name}

### Strategy: {strategy_name}

### Steps
| Step | Task | Dependencies | Risk |
|------|------|--------------|------|
| 1 | {task} | None | {risk} |

### Success Criteria
- [ ] {criterion}

### Rollback Plan
If {failure}: {rollback_action}
```
44
.kilo/agents/reflector.md
Normal file
@@ -0,0 +1,44 @@
---
description: Self-reflection agent using Reflexion pattern - learns from mistakes
mode: subagent
model: ollama-cloud/gpt-oss:120b
color: "#10B981"
permission:
  read: allow
  grep: allow
  glob: allow
  task:
    "*": deny
---

# Kilo Code: Reflector

## Role Definition

You are **Reflector**, the self-improvement specialist using the Reflexion pattern (Shinn & Labash 2023).

## Reflexion Framework

```
Action -> Heuristic -> Reflection -> Memory Update -> Next Action
```

## Heuristic Functions

- **Inefficient planning**: The trajectory takes too many steps without success
- **Hallucination**: Consecutive identical actions that produce the same observation
- **Failure**: Unsuccessful result
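
A minimal sketch of how these three heuristics could be checked over a recorded trajectory. The `Step` record, the function name, and both thresholds (`max_steps`, `max_repeats`) are assumptions for illustration; the source does not specify concrete limits.

```python
from dataclasses import dataclass


@dataclass
class Step:
    action: str
    observation: str


def check_heuristics(trajectory, succeeded, max_steps=10, max_repeats=3):
    """Return the names of the heuristics that fire for this trajectory."""
    flags = []
    if len(trajectory) > max_steps:
        flags.append("inefficient_planning")
    # Longest run of consecutive identical (action, observation) pairs.
    longest = run = 1 if trajectory else 0
    for prev, cur in zip(trajectory, trajectory[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    if longest >= max_repeats:
        flags.append("hallucination")
    if not succeeded:
        flags.append("failure")
    return flags
```

Any returned flag would then trigger the reflection step, so the lesson can be stored before the next attempt.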

## Reflection Process

1. **Trajectory Analysis**: Analyze action sequence
2. **Mistake Identification**: Find failed actions
3. **Lesson Extraction**: Generalize fix patterns
4. **Memory Update**: Store for future use

## Integration

Called after each agent in the pipeline:
- After Lead Developer: Analyze implementation
- After Code Skeptic: Analyze review patterns
- After The Fixer: Analyze fix patterns
84
.kilo/rules/agent-patterns.md
Normal file
@@ -0,0 +1,84 @@
# Agent Patterns Rules

Based on research from Anthropic, OpenAI, and Lilian Weng.

## Core Patterns (Anthropic)

### 1. Prompt Chaining
Sequential steps with validation gates.
```yaml
when: Task can be cleanly decomposed
example: Generate copy, then translate
gate: Validate each step before next
```
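
A minimal sketch of a gated chain, using the "generate copy, then translate" example. The `run_chain` helper and both stage functions are hypothetical stand-ins; in practice each stage would call a model rather than do string manipulation.

```python
from typing import Callable

Stage = tuple[str, Callable[[str], str], Callable[[str], bool]]


def run_chain(initial: str, stages: list[Stage]) -> str:
    """Run stages in order; stop as soon as a stage's output fails its gate."""
    value = initial
    for name, step, gate in stages:
        value = step(value)
        if not gate(value):
            raise ValueError(f"gate failed after stage {name!r}")
    return value


# Stand-ins for LLM calls (hypothetical).
def generate_copy(brief: str) -> str:
    return f"Buy {brief} today!"


def translate(copy: str) -> str:
    return copy.replace("Buy", "Achetez").replace("today", "aujourd'hui")


result = run_chain(
    "tea",
    [
        ("generate", generate_copy, lambda s: len(s) < 80),
        ("translate", translate, lambda s: "Buy" not in s),
    ],
)
```

Failing fast at the gate keeps a bad intermediate result from being passed into later, more expensive steps.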

### 2. Routing
Classify input, route to specialized agent.
```yaml
when: Distinct categories, clear classification
example: Customer service routing (refunds, technical, general)
```

### 3. Parallelization
Run independent tasks simultaneously.
```yaml
when: Subtasks are independent
types:
  - Sectioning: Break into parallel parts
  - Voting: Multiple attempts, aggregate results
```
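
The voting variant can be sketched as below: the same task goes to several workers in parallel and the majority answer wins. The `vote` helper, the `min_agreement` threshold, and the idea of returning `None` when there is no clear majority are assumptions for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def vote(task, workers, min_agreement=0.5):
    """Run all workers on the same task in parallel and return the majority answer."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        answers = list(pool.map(lambda worker: worker(task), workers))
    answer, count = Counter(answers).most_common(1)[0]
    if count / len(answers) <= min_agreement:
        return None  # no clear majority; the caller decides how to escalate
    return answer
```

Here each worker is any callable taking the task; in an agent system these would be independent model calls, possibly with different prompts.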

### 4. Orchestrator-Workers
Central controller delegates to workers.
```yaml
when: Subtasks dynamic, not pre-defined
example: Coding agent editing multiple files
```

### 5. Evaluator-Optimizer
Loop: generate, evaluate, improve.
```yaml
when: Clear criteria, iteration measurably improves output
example: Code review loop
```
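
A minimal sketch of the generate, evaluate, improve loop. The `optimize` function, its `threshold` and `max_rounds` defaults, and the rule of keeping a candidate only when it scores higher are assumptions; `evaluate` and `improve` stand in for the evaluator and generator model calls.

```python
def optimize(draft, evaluate, improve, threshold=0.9, max_rounds=5):
    """Improve a draft until its score clears the threshold or rounds run out."""
    best, best_score = draft, evaluate(draft)
    for _ in range(max_rounds):
        if best_score >= threshold:
            break
        candidate = improve(best)
        score = evaluate(candidate)
        if score > best_score:  # keep a candidate only if it scores higher
            best, best_score = candidate, score
    return best, best_score
```

The round limit matters as much as the threshold: without it, a task whose score plateaus would loop indefinitely.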

## Memory Architecture (Lilian Weng)

### Components
- **Planning**: Task decomposition, self-reflection
- **Memory**: Short-term, long-term, episodic
- **Tool Use**: External APIs, code execution

### Memory Types
1. **Sensory**: Embeddings (milliseconds)
2. **Short-term**: Context window (~4000 tokens)
3. **Long-term**: Vector store (infinite)
4. **Episodic**: Experience log

## Tool Use Best Practices (Anthropic)

1. Give the model space to "think" before it writes output
2. Keep formats close to what appears naturally on the internet
3. Minimize formatting overhead
4. Invest in the agent-computer interface (ACI) as much as in HCI

## ReAct Pattern

Interleave reasoning and action:
```
Thought: [reasoning]
Action: [tool call]
Observation: [result]
(Repeat until done)
```
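
The loop can be sketched as a small driver. Everything here is illustrative: `react`, the `(thought, action, argument)` tuple returned by `policy`, the `"finish"` action name, and the scripted policy are assumptions. A real policy would prompt a model with the transcript and parse its reply.

```python
def react(question, policy, tools, max_turns=5):
    """Alternate Thought/Action/Observation until the policy returns a final answer."""
    transcript = [f"Question: {question}"]
    for _ in range(max_turns):
        thought, action, argument = policy(transcript)
        transcript.append(f"Thought: {thought}")
        if action == "finish":
            return argument, transcript
        observation = tools[action](argument)
        transcript.append(f"Action: {action}[{argument}]")
        transcript.append(f"Observation: {observation}")
    return None, transcript  # turn budget exhausted without an answer


# Scripted stand-in for the model (hypothetical).
def scripted_policy(transcript):
    if not any(line.startswith("Observation:") for line in transcript):
        return "I need the sum first.", "add", (2, 3)
    result = transcript[-1].removeprefix("Observation: ")
    return "I have the result.", "finish", result


answer, log = react("What is 2 + 3?", scripted_policy, {"add": lambda args: sum(args)})
```

Feeding the whole transcript back on every turn is what lets each new thought build on earlier observations.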

## Reflexion Pattern

Learn from mistakes:
```
1. Take action
2. Check heuristic
3. Generate reflection
4. Update memory
5. Retry with lesson
```
43
.kilo/skills/memory-systems/SKILL.md
Normal file
@@ -0,0 +1,43 @@
# Memory Systems for Autonomous Agents

Based on Lilian Weng's "LLM Powered Autonomous Agents" research.

## Memory Types

### 1. Sensory Memory (Embeddings)
- Raw input processing (ms to seconds)
- Embedding: CLIP (multimodal), text-embedding-ada-002 (text)

### 2. Short-Term Memory (Working Memory)
- In-context learning, context window limited
- Miller's Law: 7 ± 2 items
- Strategies: sliding window, importance-weighted, attention-based

### 3. Long-Term Memory (Vector Store)
- External storage, effectively unlimited capacity
- MIPS Algorithms: HNSW, FAISS, ScaNN, LSH
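
For intuition, the exact search that these algorithms approximate can be written as a brute-force scan. This is only a sketch: the `mips` helper and the toy three-dimensional vectors are made up for illustration, and a real store would use one of the approximate indexes listed above rather than scoring every vector.

```python
import heapq


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def mips(query, store, k=2):
    """Exact maximum inner product search: score every vector, keep the top k."""
    scored = ((inner_product(query, vec), key) for key, vec in store.items())
    return [key for _, key in heapq.nlargest(k, scored)]


# Toy 3-dimensional "embeddings" (made up for illustration).
store = {
    "deploy notes": [0.9, 0.1, 0.0],
    "test failures": [0.1, 0.8, 0.3],
    "style guide": [0.0, 0.2, 0.9],
}
top = mips([1.0, 0.2, 0.0], store)
```

The scan costs one inner product per stored vector, which is why approximate indexes trade a little accuracy for sublinear lookup on large stores.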

### 4. Episodic Memory
- Experience records with outcomes
- Used for reflection and learning

## Retrieval Formula

```
score = 0.5 * relevance + 0.3 * recency + 0.2 * importance
```

## Operations

- **Store**: Add to appropriate system
- **Retrieve**: Query with composite scoring
- **Consolidate**: Move short-term to long-term
- **Forget**: Decay or explicit deletion
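
A minimal in-memory sketch of the store, consolidate, and forget operations (retrieval is omitted here). The `MemoryStore` class, the consolidation `threshold`, and the multiplicative `decay` with a `floor` are assumptions chosen for illustration, not values from the source.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    text: str
    importance: float  # assumed to be in [0, 1]


@dataclass
class MemoryStore:
    short_term: list = field(default_factory=list)
    long_term: list = field(default_factory=list)

    def store(self, text, importance):
        self.short_term.append(Item(text, importance))

    def consolidate(self, threshold=0.7):
        """Move important short-term items to long-term; leave the rest in place."""
        keep = []
        for item in self.short_term:
            (self.long_term if item.importance >= threshold else keep).append(item)
        self.short_term = keep

    def forget(self, decay=0.5, floor=0.2):
        """Decay long-term importance and drop items that fall below the floor."""
        for item in self.long_term:
            item.importance *= decay
        self.long_term = [i for i in self.long_term if i.importance >= floor]
```

Calling `forget` on a schedule gives a simple decay policy: memories that are never reinforced eventually drop out, while explicit deletion remains a separate, immediate operation.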

## Best Practices

1. Regular consolidation
2. LLM-generated importance scores
3. Decay schedule for forgetting
4. Episode summaries/reflections
5. Mixed retrieval sources
55
.kilo/skills/planning-patterns/SKILL.md
Normal file
@@ -0,0 +1,55 @@
# Planning Patterns for Autonomous Agents

Based on Anthropic's "Building Effective Agents" and Lilian Weng's research.

## Core Patterns

### 1. Chain of Thought (CoT)
Sequential reasoning for decomposition.
- Use when: Task benefits from step-by-step
- Trade-off: Latency for accuracy

### 2. Tree of Thoughts (ToT)
Explore multiple solution paths.
- Use when: Alternatives matter
- Trade-off: Computation for quality

### 3. Plan-Execute-Reflect
Iterative improvement loops.
- Use when: Feedback available
- Trade-off: Iterations for quality

### 4. ReAct Pattern
Interleave reasoning and action.
```
Thought: ...
Action: ...
Observation: ...
(Repeat)
```

### 5. Reflexion Pattern
Learn from mistakes dynamically.
```
Action -> Heuristic -> Reflection -> Memory -> Retry
```

## Task Decomposition Methods

### By Dependency
- Sequential with prerequisites
- Clear execution order

### By Complexity
- Phases: Analysis, Design, Implementation, Test
- Progressive refinement

### By Parallelization
- Independent tasks grouped
- Maximize throughput

## Integration

Planner uses these patterns based on task characteristics.
Orchestrator routes subtasks to appropriate agents.
Reflector analyzes outcomes and stores lessons.
56
.kilo/skills/tool-use/SKILL.md
Normal file
@@ -0,0 +1,56 @@
# Tool Use for Autonomous Agents

Based on Anthropic's "Prompt Engineering your Tools" appendix.

## Tool Design Principles

### 1. Give Model "Think" Space
- Allow tokens before writing
- Don't constrain output prematurely

### 2. Natural Format
- Keep close to internet patterns
- Avoid complex JSON escaping
- Use markdown for code

### 3. Minimize Overhead
- No line counting
- No token counting
- Simple is better

## Tool Categories

### File Operations
- `read`: Read files
- `write`: Create/overwrite files
- `edit`: Make precise edits
- `glob`: Find files
- `grep`: Search content

### Execution
- `bash`: Run commands
- `task`: Delegate to subagents

### Web & API
- `webfetch`: Retrieve web content
- `curl`: API calls

### Knowledge
- `codebase_search`: Semantic search
- `question`: Ask user for clarification

## Tool Documentation

From Anthropic's guidance: invest as much effort in the agent-computer interface (ACI) as in the human-computer interface (HCI). Good tool documentation provides:
- Clear descriptions
- Example usage
- Edge cases
- Input format requirements
- Clear boundaries between tools

## Poka-Yoke Techniques

- Use absolute paths (not relative)
- Clear error messages
- Validation before execution
- Safe defaults
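
These techniques can be combined in a single pre-flight check for a write tool. The sketch below is hypothetical: `validate_write`, the `workspace` boundary, and the `overwrite` flag are illustrative names, not part of any existing tool's interface.

```python
from pathlib import Path


def validate_write(path: str, workspace: str, overwrite: bool = False) -> Path:
    """Check a write request before executing it; raise with an actionable message."""
    target = Path(path)
    if not target.is_absolute():
        raise ValueError(f"path must be absolute, got {path!r}")
    root = Path(workspace).resolve()
    resolved = target.resolve()
    if not resolved.is_relative_to(root):  # requires Python 3.9+
        raise ValueError(f"{resolved} is outside the workspace {root}")
    if resolved.exists() and not overwrite:  # safe default: never clobber silently
        raise FileExistsError(f"{resolved} exists; pass overwrite=True to replace it")
    return resolved
```

Rejecting relative paths outright removes a whole class of mistakes after the agent changes directories, and each error message tells the model exactly what to change on the next attempt.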
19
AGENTS.md
@@ -27,6 +27,7 @@ Agent: Runs full pipeline for issue #42 with Gitea logging
 
 These agents are invoked automatically by `/pipeline` or manually via `@mention`:
 
+### Core Development
 | Agent | Role | When Invoked |
 |-------|------|--------------|
 | `@requirement-refiner` | Converts ideas to User Stories | Issue status: new |
@@ -35,15 +36,33 @@ These agents are invoked automatically by `/pipeline` or manually via `@mention`
 | `@sdet-engineer` | Writes tests (TDD) | Status: designed |
 | `@lead-developer` | Implements code | Status: testing (tests fail) |
 | `@frontend-developer` | UI implementation | When UI work needed |
+| `@backend-developer` | Node.js/Express/APIs | When backend needed |
+
+### Quality Assurance
+| Agent | Role | When Invoked |
+|-------|------|--------------|
 | `@code-skeptic` | Adversarial review | Status: implementing |
 | `@the-fixer` | Fixes issues | When review fails |
 | `@performance-engineer` | Performance review | After code-skeptic |
 | `@security-auditor` | Security audit | After performance |
+| `@visual-tester` | Visual regression | When UI changes |
+
+### Cognitive Enhancement (New)
+| Agent | Role | When Invoked |
+|-------|------|--------------|
+| `@planner` | Task decomposition (CoT/ToT) | Complex tasks |
+| `@reflector` | Self-reflection (Reflexion) | After each agent |
+| `@memory-manager` | Memory systems | Context management |
+
+### Meta & Process
+| Agent | Role | When Invoked |
+|-------|------|--------------|
 | `@release-manager` | Git operations | Status: releasing |
 | `@evaluator` | Scores effectiveness | Status: evaluated |
 | `@prompt-optimizer` | Improves prompts | When score < 7 |
 | `@capability-analyst` | Analyzes task coverage | When starting new task |
 | `@agent-architect` | Creates new agents | When gaps identified |
+| `@workflow-architect` | Creates workflows | New workflow needed |
 | `@markdown-validator` | Validates Markdown | Before issue creation |
 
 ## Workflow State Machine