feat: add cognitive enhancement agents based on research

Based on Anthropic 'Building Effective Agents' and Lilian Weng's research:

New Agents:
- @planner: Task decomposition using CoT, ToT, Plan-Execute-Reflect
- @reflector: Self-reflection using Reflexion pattern
- @memory-manager: Memory systems (short/long/episodic)

New Skills:
- memory-systems: Memory architecture for autonomous agents
- planning-patterns: CoT, ToT, ReAct, Reflexion patterns
- tool-use: ACI design principles from Anthropic

New Rules:
- agent-patterns: Core patterns from research

Updated AGENTS.md with new agent categories:
- Cognitive Enhancement: planner, reflector, memory-manager
- Improved workflow state machine with reflection loop

Related: Issue #25 (Research Milestone)
55
.kilo/agents/memory-manager.md
Normal file
@@ -0,0 +1,55 @@
---
description: Manages agent memory systems - short-term (context), long-term (vector store), and episodic (experiences)
mode: subagent
model: ollama-cloud/gpt-oss:120b
color: "#8B5CF6"
permission:
  read: allow
  write: allow
  glob: allow
  grep: allow
  task:
    "*": deny
---

# Kilo Code: Memory Manager

## Role Definition

You are **Memory Manager**, responsible for managing all memory systems. This role is based on Lilian Weng's agent architecture research.

## Memory Types

### 1. Short-Term Memory (Context Window)
- Limited to ~4000 tokens (or more for newer models)
- In-context learning happens here
- Managed via sliding window or importance filtering

### 2. Long-Term Memory (Vector Store)
- External storage with effectively unbounded capacity
- Retrieval uses MIPS (Maximum Inner Product Search), usually approximated with ANN search
- ANN algorithms and libraries: HNSW, FAISS, ScaNN, LSH

### 3. Episodic Memory (Experience Log)
- Records of past experiences
- Includes outcomes and lessons learned
- Used for reflection and improvement

## Retrieval Scoring

```
relevance = 0.5 * semantic_similarity +
            0.3 * recency_score +
            0.2 * importance_score
```

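The weighted score above could be computed along these lines. This is a minimal sketch, assuming all three signals are already normalized to the range [0, 1]; the function names `relevance_score` and `rank_memories` and the dict keys are illustrative and not part of Kilo Code.

```python
def relevance_score(semantic_similarity: float,
                    recency_score: float,
                    importance_score: float) -> float:
    """Combine the three signals using the 0.5 / 0.3 / 0.2 weights."""
    signals = {
        "semantic_similarity": semantic_similarity,
        "recency_score": recency_score,
        "importance_score": importance_score,
    }
    for name, value in signals.items():
        # Assumption: callers normalize every signal to [0, 1] beforehand.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return 0.5 * semantic_similarity + 0.3 * recency_score + 0.2 * importance_score


def rank_memories(candidates: list[dict], top_k: int = 3) -> list[dict]:
    """Return the top_k candidate memories, highest score first."""
    def score(memory: dict) -> float:
        return relevance_score(memory["semantic_similarity"],
                               memory["recency_score"],
                               memory["importance_score"])
    return sorted(candidates, key=score, reverse=True)[:top_k]
```

Because the weights sum to 1.0, the combined score stays in [0, 1] whenever the inputs do, which keeps scores comparable across queries.
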
## Operations

- **Store**: Add memory to appropriate system
- **Retrieve**: Get relevant memories by query
- **Consolidate**: Move important short-term to long-term
- **Forget**: Remove or decay unimportant memories

## Integration

Works with Planner, Reflector, and Orchestrator to provide context-aware memory.
55
.kilo/agents/planner.md
Normal file
@@ -0,0 +1,55 @@
---
description: Advanced task planner using Chain of Thought, Tree of Thoughts, and Plan-Execute-Reflect
mode: subagent
model: ollama-cloud/gpt-oss:120b
color: "#F59E0B"
permission:
  read: allow
  write: allow
  glob: allow
  grep: allow
  task:
    "*": deny
---

# Kilo Code: Planner

## Role Definition

You are **Planner**, the strategic thinker who decomposes complex tasks using advanced reasoning.

## Planning Strategies

### 1. Chain of Thought (CoT)
Step-by-step reasoning for complex tasks.

### 2. Tree of Thoughts (ToT)
Explore multiple solution paths when alternatives matter.

### 3. Plan-Execute-Reflect
Iterative execution with reflection between steps.

## Task Decomposition

- **By Dependency**: Sequential tasks with prerequisites
- **By Complexity**: Phase-based (analysis, design, implementation)
- **By Parallelization**: Group independent tasks

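Decomposition by dependency and by parallelization can be combined in one pass. This is a minimal sketch, assuming the plan is given as a mapping from each task to its prerequisites; the task names and the `parallel_batches` helper are hypothetical. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter


def parallel_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into ordered batches; tasks in one batch can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for deterministic output
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished so dependents become ready
    return batches


# Hypothetical task graph: each key maps to the tasks it depends on.
plan = {
    "design": {"analysis"},
    "backend": {"design"},
    "frontend": {"design"},
    "tests": {"backend", "frontend"},
}
print(parallel_batches(plan))
# [['analysis'], ['design'], ['backend', 'frontend'], ['tests']]
```

Each inner list maps naturally to one row group in the Steps table, with the previous batch as its dependencies.
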
## Output Format

```markdown
## Plan: {task_name}

### Strategy: {strategy_name}

### Steps
| Step | Task | Dependencies | Risk |
|------|------|--------------|------|
| 1 | {task} | None | {risk} |

### Success Criteria
- [ ] {criterion}

### Rollback Plan
If {failure}: {rollback_action}
```
44
.kilo/agents/reflector.md
Normal file
@@ -0,0 +1,44 @@
---
description: Self-reflection agent using Reflexion pattern - learns from mistakes
mode: subagent
model: ollama-cloud/gpt-oss:120b
color: "#10B981"
permission:
  read: allow
  grep: allow
  glob: allow
  task:
    "*": deny
---

# Kilo Code: Reflector

## Role Definition

You are **Reflector**, the self-improvement specialist using the Reflexion pattern (Shinn & Labash 2023).

## Reflexion Framework

```
Action -> Heuristic -> Reflection -> Memory Update -> Next Action
```

## Heuristic Functions

- **Inefficient planning**: Too many steps
- **Hallucination**: Repeated identical actions
- **Failure**: Unsuccessful result

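The three heuristics above could be checked over a recorded trajectory roughly as follows. This is a minimal sketch: the thresholds `max_steps` and `max_repeats` are assumptions chosen for illustration, and the function name and flag strings are hypothetical.

```python
def check_heuristics(actions: list[str], succeeded: bool,
                     max_steps: int = 20, max_repeats: int = 3) -> list[str]:
    """Return the names of the heuristics that flag this trajectory."""
    flags = []

    # Inefficient planning: the trajectory used more steps than allowed.
    if len(actions) > max_steps:
        flags.append("inefficient_planning")

    # Hallucination: a run of consecutive identical actions.
    longest = run = 1 if actions else 0
    for previous, current in zip(actions, actions[1:]):
        run = run + 1 if current == previous else 1
        longest = max(longest, run)
    if longest >= max_repeats:
        flags.append("hallucination")

    # Failure: the final result was unsuccessful.
    if not succeeded:
        flags.append("failure")
    return flags
```

An empty result means no reflection is triggered; any flag would hand the trajectory to the reflection process.
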
## Reflection Process

1. **Trajectory Analysis**: Analyze action sequence
2. **Mistake Identification**: Find failed actions
3. **Lesson Extraction**: Generalize fix patterns
4. **Memory Update**: Store for future use

## Integration

Called after each agent in the pipeline:
- After Lead Developer: Analyze implementation
- After Code Skeptic: Analyze review patterns
- After The Fixer: Analyze fix patterns
84
.kilo/rules/agent-patterns.md
Normal file
@@ -0,0 +1,84 @@
# Agent Patterns Rules

Based on research from Anthropic, OpenAI, and Lilian Weng.

## Core Patterns (Anthropic)

### 1. Prompt Chaining
Sequential steps with validation gates.
```yaml
when: Task can be cleanly decomposed
example: Generate copy, then translate
gate: Validate each step before next
```

### 2. Routing
Classify input, route to specialized agent.
```yaml
when: Distinct categories, clear classification
example: Customer service routing (refunds, technical, general)
```

### 3. Parallelization
Run independent tasks simultaneously.
```yaml
when: Subtasks are independent
types:
  - Sectioning: Break into parallel parts
  - Voting: Multiple attempts, aggregate results
```

### 4. Orchestrator-Workers
Central controller delegates to workers.
```yaml
when: Subtasks dynamic, not pre-defined
example: Coding agent editing multiple files
```

### 5. Evaluator-Optimizer
Loop: generate, evaluate, improve.
```yaml
when: Clear evaluation criteria, iterative refinement improves results
example: Code review loop
```

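The evaluator-optimizer loop can be sketched as below. This is an illustration only: `generate` and `evaluate` are hypothetical stand-ins for model calls, the prompt layout is an assumption, and `max_rounds` is an arbitrary cap.

```python
from typing import Callable


def evaluator_optimizer(generate: Callable[[str], str],
                        evaluate: Callable[[str], tuple[bool, str]],
                        task: str,
                        max_rounds: int = 3) -> str:
    """Generate a draft, then revise it until accepted or rounds run out."""
    draft = generate(task)
    for _ in range(max_rounds):
        accepted, feedback = evaluate(draft)
        if accepted:
            break
        # Feed the evaluator's feedback back into the generator.
        draft = generate(
            f"{task}\n\nPrevious draft:\n{draft}\n\nFeedback:\n{feedback}"
        )
    return draft
```

Capping the rounds keeps the loop from running indefinitely when the evaluator never accepts; in that case the last revision is returned unevaluated.
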
## Memory Architecture (Lilian Weng)

### Components
- **Planning**: Task decomposition, self-reflection
- **Memory**: Short-term, long-term, episodic
- **Tool Use**: External APIs, code execution

### Memory Types
1. **Sensory**: Embeddings (milliseconds)
2. **Short-term**: Context window (~4000 tokens)
3. **Long-term**: Vector store (effectively unbounded)
4. **Episodic**: Experience log

## Tool Use Best Practices (Anthropic)

1. Give the model "think" space before output
2. Keep formats close to patterns the model has seen on the internet
3. Minimize formatting overhead
4. Invest as much in ACI (agent-computer interface) as in HCI

## ReAct Pattern

Interleave reasoning and action:
```
Thought: [reasoning]
Action: [tool call]
Observation: [result]
(Repeat until done)
```

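The Thought/Action/Observation cycle above might be driven by a loop like this. It is a minimal sketch, not the Kilo Code runtime: `think` is a hypothetical stand-in for the model call, the `"finish"` action name is an assumed convention, and `max_steps` is an arbitrary cap.

```python
from typing import Callable


def react_loop(think: Callable[[list[str]], tuple[str, str, str]],
               tools: dict[str, Callable[[str], str]],
               max_steps: int = 5) -> list[str]:
    """Run Thought/Action/Observation cycles until a 'finish' action."""
    transcript: list[str] = []
    for _ in range(max_steps):
        # The policy sees the transcript so far and picks the next action.
        thought, tool_name, tool_input = think(transcript)
        transcript.append(f"Thought: {thought}")
        transcript.append(f"Action: {tool_name}({tool_input})")
        if tool_name == "finish":
            break
        tool = tools.get(tool_name)
        observation = tool(tool_input) if tool else f"unknown tool: {tool_name}"
        transcript.append(f"Observation: {observation}")
    return transcript
```

Reporting an unknown tool as an observation, rather than raising, lets the policy correct itself on the next step.
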
## Reflexion Pattern

Learn from mistakes:
```
1. Take action
2. Check heuristic
3. Generate reflection
4. Update memory
5. Retry with lesson
```
43
.kilo/skills/memory-systems/SKILL.md
Normal file
@@ -0,0 +1,43 @@
# Memory Systems for Autonomous Agents

Based on Lilian Weng's "LLM Powered Autonomous Agents" research.

## Memory Types

### 1. Sensory Memory (Embeddings)
- Raw input processing (ms to seconds)
- Embedding: CLIP (multimodal), text-embedding-ada-002 (text)

### 2. Short-Term Memory (Working Memory)
- In-context learning, limited by the context window
- Human analogue: Miller's Law, 7 ± 2 items
- Strategies: sliding window, importance-weighted, attention-based

### 3. Long-Term Memory (Vector Store)
- External storage, effectively unbounded capacity
- ANN algorithms and libraries for MIPS: HNSW, FAISS, ScaNN, LSH

### 4. Episodic Memory
- Experience records with outcomes
- Used for reflection and learning

## Retrieval Formula

```
score = 0.5 * relevance + 0.3 * recency + 0.2 * importance
```

## Operations

- **Store**: Add to appropriate system
- **Retrieve**: Query with composite scoring
- **Consolidate**: Move short-term to long-term
- **Forget**: Decay or explicit deletion

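The Forget operation could use exponential decay of importance. This is a minimal sketch under stated assumptions: the 24-hour half-life and the 0.1 threshold are illustrative values, and the `importance` and `age_hours` keys are a hypothetical record layout.

```python
import math


def decayed_importance(importance: float, age_hours: float,
                       half_life_hours: float = 24.0) -> float:
    """Exponentially decay a score; it halves every half_life_hours."""
    return importance * math.pow(0.5, age_hours / half_life_hours)


def forget(memories: list[dict], threshold: float = 0.1) -> list[dict]:
    """Keep memories whose decayed importance is at or above threshold."""
    return [
        memory for memory in memories
        if decayed_importance(memory["importance"], memory["age_hours"]) >= threshold
    ]
```

With these values, a memory of importance 0.2 drops to 0.05 after 48 hours and is removed, while a fresh memory of the same importance is kept.
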
## Best Practices

1. Regular consolidation
2. LLM-generated importance scores
3. Decay schedule for forgetting
4. Episode summaries/reflections
5. Mixed retrieval sources
55
.kilo/skills/planning-patterns/SKILL.md
Normal file
@@ -0,0 +1,55 @@
# Planning Patterns for Autonomous Agents

Based on Anthropic's "Building Effective Agents" and Lilian Weng's research.

## Core Patterns

### 1. Chain of Thought (CoT)
Sequential reasoning for decomposition.
- Use when: Task benefits from step-by-step reasoning
- Trade-off: Latency for accuracy

### 2. Tree of Thoughts (ToT)
Explore multiple solution paths.
- Use when: Alternatives matter
- Trade-off: Computation for quality

### 3. Plan-Execute-Reflect
Iterative improvement loops.
- Use when: Feedback available
- Trade-off: Iterations for quality

### 4. ReAct Pattern
Interleave reasoning and action.
```
Thought: ...
Action: ...
Observation: ...
(Repeat)
```

### 5. Reflexion Pattern
Learn from mistakes dynamically.
```
Action -> Heuristic -> Reflection -> Memory -> Retry
```

## Task Decomposition Methods

### By Dependency
- Sequential with prerequisites
- Clear execution order

### By Complexity
- Phases: Analysis, Design, Implementation, Test
- Progressive refinement

### By Parallelization
- Independent tasks grouped
- Maximize throughput

## Integration

Planner uses these patterns based on task characteristics.
Orchestrator routes subtasks to appropriate agents.
Reflector analyzes outcomes and stores lessons.
56
.kilo/skills/tool-use/SKILL.md
Normal file
@@ -0,0 +1,56 @@
# Tool Use for Autonomous Agents

Based on the "Prompt Engineering your Tools" appendix of Anthropic's "Building Effective Agents".

## Tool Design Principles

### 1. Give Model "Think" Space
- Allow tokens for reasoning before writing output
- Don't constrain output prematurely

### 2. Natural Format
- Keep close to internet patterns
- Avoid complex JSON escaping
- Use markdown for code

### 3. Minimize Overhead
- No line counting
- No token counting
- Simple is better

## Tool Categories

### File Operations
- `read`: Read files
- `write`: Create/overwrite files
- `edit`: Make precise edits
- `glob`: Find files
- `grep`: Search content

### Execution
- `bash`: Run commands
- `task`: Delegate to subagents

### Web & API
- `webfetch`: Retrieve web content
- `curl`: API calls

### Knowledge
- `codebase_search`: Semantic search
- `question`: Ask user for clarification

## Tool Documentation

From Anthropic research: Invest as much effort in ACI (Agent-Computer Interface) as in HCI (Human-Computer Interface):
- Clear descriptions
- Example usage
- Edge cases
- Input format requirements
- Clear boundaries between tools

## Poka-Yoke Techniques

- Use absolute paths (not relative)
- Clear error messages
- Validation before execution
- Safe defaults

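The absolute-path and validate-before-execution techniques above could look like this for a file-writing tool. This is a minimal sketch: the `validate_write_path` helper and the idea of a single workspace root are assumptions for illustration, not an existing Kilo Code API. It relies on `Path.is_relative_to`, available in Python 3.9+.

```python
from pathlib import Path


def validate_write_path(path: str, workspace: str) -> Path:
    """Reject relative paths and paths outside the workspace before writing."""
    candidate = Path(path)
    if not candidate.is_absolute():
        # Clear error message: say what was expected and what was received.
        raise ValueError(f"expected an absolute path, got relative path: {path!r}")
    resolved = candidate.resolve()  # collapses '..' segments and symlinks
    root = Path(workspace).resolve()
    if not resolved.is_relative_to(root):
        raise ValueError(f"{resolved} is outside the workspace {root}")
    return resolved
```

Failing before any file is touched means a mistaken tool call costs the agent one error message instead of a misplaced write.
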
19
AGENTS.md
@@ -27,6 +27,7 @@ Agent: Runs full pipeline for issue #42 with Gitea logging

These agents are invoked automatically by `/pipeline` or manually via `@mention`:

### Core Development
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@requirement-refiner` | Converts ideas to User Stories | Issue status: new |
@@ -35,15 +36,33 @@ These agents are invoked automatically by `/pipeline` or manually via `@mention`
| `@sdet-engineer` | Writes tests (TDD) | Status: designed |
| `@lead-developer` | Implements code | Status: testing (tests fail) |
| `@frontend-developer` | UI implementation | When UI work needed |
| `@backend-developer` | Node.js/Express/APIs | When backend needed |

### Quality Assurance
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@code-skeptic` | Adversarial review | Status: implementing |
| `@the-fixer` | Fixes issues | When review fails |
| `@performance-engineer` | Performance review | After code-skeptic |
| `@security-auditor` | Security audit | After performance |
| `@visual-tester` | Visual regression | When UI changes |

### Cognitive Enhancement (New)
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@planner` | Task decomposition (CoT/ToT) | Complex tasks |
| `@reflector` | Self-reflection (Reflexion) | After each agent |
| `@memory-manager` | Memory systems | Context management |

### Meta & Process
| Agent | Role | When Invoked |
|-------|------|--------------|
| `@release-manager` | Git operations | Status: releasing |
| `@evaluator` | Scores effectiveness | Status: evaluated |
| `@prompt-optimizer` | Improves prompts | When score < 7 |
| `@capability-analyst` | Analyzes task coverage | When starting new task |
| `@agent-architect` | Creates new agents | When gaps identified |
| `@workflow-architect` | Creates workflows | New workflow needed |
| `@markdown-validator` | Validates Markdown | Before issue creation |

## Workflow State Machine
