feat: evolutionary agent model upgrades based on recommendation matrix

- devops-engineer: deepseek-v3.2 → kimi-k2.6:cloud (★88)
- browser-automation: glm-5 → kimi-k2.6:cloud (★86)
- visual-tester: glm-5 → qwen3-coder:480b (★82)
- agent-architect: nemotron-3-super → kimi-k2.6:cloud (★86)
- orchestrator: glm-5 → kimi-k2.6:cloud (dispatch critical)
- product-owner: glm-5 → glm-5.1 (★84)
- prompt-optimizer: qwen3.6-plus:free → glm-5.1 (stable fallback)
- system-analyst: qwen3.6-plus:free → glm-5.1 (★90)
- Add autonomous-mode.md rule for zero-confirmation workflow
2026-04-27 12:09:36 +01:00
parent af43eaef80
commit dbea8c90db
19 changed files with 84 additions and 33 deletions
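
Reviewer note: this commit writes the same model assignment in several places (the agent frontmatter under `.kilo/agents/`, `kilo-meta.json`, and the table in `KILO_SPEC.md`), so a small drift check may be useful. The following is a minimal sketch, not part of the commit. It assumes each agent file opens with a `---` delimited frontmatter block containing a `model:` key, and that `kilo-meta.json` has the `agents.<name>.file` and `agents.<name>.model` shape visible in the diff. The function names `frontmatter_model` and `find_model_drift` are hypothetical.

```python
import json
import re


def frontmatter_model(markdown_text):
    """Return the `model:` value from a leading `---` frontmatter block, or None."""
    match = re.match(r"^---\n(.*?)\n---", markdown_text, re.DOTALL)
    if not match:
        return None
    for line in match.group(1).splitlines():
        # Split on the first colon only, so values such as
        # "ollama-cloud/kimi-k2.6:cloud" keep their own colons.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model":
            return value.strip()
    return None


def find_model_drift(meta_json_text, agent_files):
    """Compare kilo-meta.json model entries against agent frontmatter.

    agent_files maps a file path to that file's text (an assumed input
    shape, chosen so the check needs no filesystem access).
    Returns {agent_name: (meta_model, file_model)} for every mismatch.
    """
    meta = json.loads(meta_json_text)
    drift = {}
    for name, entry in meta.get("agents", {}).items():
        text = agent_files.get(entry.get("file"))
        if text is None:
            continue  # file not supplied, nothing to compare
        file_model = frontmatter_model(text)
        if file_model != entry.get("model"):
            drift[name] = (entry.get("model"), file_model)
    return drift
```

An empty result would mean every supplied agent file agrees with its `kilo-meta.json` entry. Feeding it the pre-commit and post-commit versions of the files is one way to confirm that each `model:` change landed in both places.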
--- a/.kilo/KILO_SPEC.md
+++ b/.kilo/KILO_SPEC.md
@@ -435,28 +435,28 @@ Provider availability depends on configuration. Common providers include:
 |-------|------|-------|
 | `@RequirementRefiner` | Converts vague ideas and bug reports into strict User Stories with acceptance criteria checklists. | ollama-cloud/kimi-k2-thinking |
 | `@HistoryMiner` | Analyzes git history to find duplicates and past solutions, preventing regression and duplicate work. | ollama-cloud/nemotron-3-super |
-| `@SystemAnalyst` | Designs technical specifications, data schemas, and API contracts before implementation. | qwen/qwen3.6-plus:free |
+| `@SystemAnalyst` | Designs technical specifications, data schemas, and API contracts before implementation. | ollama-cloud/glm-5.1 |
 | `@SdetEngineer` | Writes tests following TDD methodology. | ollama-cloud/qwen3-coder:480b |
 | `@LeadDeveloper` | Primary code writer for backend and core logic. | ollama-cloud/qwen3-coder:480b |
 | `@FrontendDeveloper` | Handles UI implementation with multimodal capabilities. | ollama-cloud/kimi-k2.5 |
 | `@BackendDeveloper` | Backend specialist for Node. | ollama-cloud/deepseek-v3.2 |
 | `@GoDeveloper` | Go backend specialist for Gin, Echo, APIs, and database integration. | ollama-cloud/qwen3-coder:480b |
-| `@DevopsEngineer` | DevOps specialist for Docker, Kubernetes, CI/CD pipeline automation, and infrastructure management. | ollama-cloud/deepseek-v3.2 |
+| `@DevopsEngineer` | DevOps specialist for Docker, Kubernetes, CI/CD pipeline automation, and infrastructure management. | ollama-cloud/kimi-k2.6:cloud |
 | `@CodeSkeptic` | Adversarial code reviewer. | ollama-cloud/minimax-m2.5 |
 | `@TheFixer` | Iteratively fixes bugs based on specific error reports and test failures. | ollama-cloud/minimax-m2.5 |
 | `@PerformanceEngineer` | Reviews code for performance issues. | ollama-cloud/nemotron-3-super |
 | `@SecurityAuditor` | Scans for security vulnerabilities, OWASP Top 10, dependency CVEs, and hardcoded secrets. | ollama-cloud/nemotron-3-super |
-| `@VisualTester` | Visual regression testing agent that compares screenshots and detects UI differences using pixelmatch and image diff. | ollama-cloud/glm-5 |
-| `@Orchestrator` | Main dispatcher. | ollama-cloud/glm-5 |
+| `@VisualTester` | Visual regression testing agent that compares screenshots and detects UI differences using pixelmatch and image diff. | ollama-cloud/qwen3-coder:480b |
+| `@Orchestrator` | Main dispatcher. | ollama-cloud/kimi-k2.6:cloud |
 | `@ReleaseManager` | Manages git operations, semantic versioning, branching, and deployments. | ollama-cloud/devstral-2:123b |
 | `@Evaluator` | Scores agent effectiveness after task completion for continuous improvement. | ollama-cloud/nemotron-3-super |
-| `@PromptOptimizer` | Improves agent system prompts based on performance failures. | qwen/qwen3.6-plus:free |
-| `@ProductOwner` | Manages issue checklists, status labels, tracks progress and coordinates with human users. | ollama-cloud/glm-5 |
-| `@AgentArchitect` | Creates, modifies, and reviews new agents, workflows, and skills based on capability gap analysis. | ollama-cloud/nemotron-3-super |
+| `@PromptOptimizer` | Improves agent system prompts based on performance failures. | ollama-cloud/glm-5.1 |
+| `@ProductOwner` | Manages issue checklists, status labels, tracks progress and coordinates with human users. | ollama-cloud/glm-5.1 |
+| `@AgentArchitect` | Creates, modifies, and reviews new agents, workflows, and skills based on capability gap analysis. | ollama-cloud/kimi-k2.6:cloud |
 | `@CapabilityAnalyst` | Analyzes task requirements against available agents, workflows, and skills. | ollama-cloud/nemotron-3-super |
 | `@WorkflowArchitect` | Creates and maintains workflow definitions with complete architecture, Gitea integration, and quality gates. | ollama-cloud/gpt-oss:120b |
 | `@MarkdownValidator` | Validates and corrects Markdown descriptions for Gitea issues. | ollama-cloud/nemotron-3-nano:30b |
-| `@BrowserAutomation` | Browser automation agent using Playwright MCP for E2E testing, form filling, navigation, and web interaction. | ollama-cloud/glm-5 |
+| `@BrowserAutomation` | Browser automation agent using Playwright MCP for E2E testing, form filling, navigation, and web interaction. | ollama-cloud/kimi-k2.6:cloud |
 | `@Planner` | Advanced task planner using Chain of Thought, Tree of Thoughts, and Plan-Execute-Reflect. | ollama-cloud/nemotron-3-super |
 | `@Reflector` | Self-reflection agent using Reflexion pattern - learns from mistakes. | ollama-cloud/nemotron-3-super |
 | `@MemoryManager` | Manages agent memory systems - short-term (context), long-term (vector store), and episodic (experiences). | ollama-cloud/nemotron-3-super |
--- a/.kilo/agents/agent-architect.md
+++ b/.kilo/agents/agent-architect.md
@@ -1,7 +1,7 @@
 ---
 name: Agent Architect
 mode: subagent
-model: ollama-cloud/nemotron-3-super
+model: ollama-cloud/kimi-k2.6:cloud
 description: Creates, modifies, and reviews new agents, workflows, and skills based on capability gap analysis
 color: "#8B5CF6"
 permission:
--- a/.kilo/agents/backend-developer.md
+++ b/.kilo/agents/backend-developer.md
@@ -1,7 +1,7 @@
 ---
 description: Backend specialist for Node.js, Express, APIs, and database integration
 mode: subagent
-model: ollama-cloud/qwen3-coder:480b
+model: ollama-cloud/deepseek-v3.2
 color: "#10B981"
 permission:
  read: allow
--- a/.kilo/agents/browser-automation.md
+++ b/.kilo/agents/browser-automation.md
@@ -1,7 +1,7 @@
 ---
 description: Browser automation agent using Playwright MCP for E2E testing, form filling, navigation, and web interaction
 mode: subagent
-model: ollama-cloud/glm-5
+model: ollama-cloud/kimi-k2.6:cloud
 color: "#1E88E5"
 permission:
  read: allow
--- a/.kilo/agents/capability-analyst.md
+++ b/.kilo/agents/capability-analyst.md
@@ -1,7 +1,7 @@
 ---
 description: Analyzes task requirements against available agents, workflows, and skills. Identifies gaps and recommends new components.
 mode: subagent
-model: ollama-cloud/glm-5.1
+model: ollama-cloud/nemotron-3-super
 color: "#6366F1"
 permission:
  read: allow
--- a/.kilo/agents/devops-engineer.md
+++ b/.kilo/agents/devops-engineer.md
@@ -1,7 +1,7 @@
 ---
 description: DevOps specialist for Docker, Kubernetes, CI/CD pipeline automation, and infrastructure management
 mode: subagent
-model: ollama-cloud/deepseek-v3.2
+model: ollama-cloud/kimi-k2.6:cloud
 color: "#FF6B35"
 permission:
  read: allow
--- a/.kilo/agents/evaluator.md
+++ b/.kilo/agents/evaluator.md
@@ -1,7 +1,7 @@
 ---
 description: Scores agent effectiveness after task completion for continuous improvement
 mode: subagent
-model: ollama-cloud/glm-5.1
+model: ollama-cloud/nemotron-3-super
 variant: thinking
 color: "#047857"
 permission:
--- a/.kilo/agents/frontend-developer.md
+++ b/.kilo/agents/frontend-developer.md
@@ -1,7 +1,7 @@
 ---
 description: Handles UI implementation with multimodal capabilities. Accepts visual references like screenshots and mockups
 mode: all
-model: ollama-cloud/qwen3-coder:480b
+model: ollama-cloud/kimi-k2.5
 color: "#0EA5E9"
 permission:
  read: allow
--- a/.kilo/agents/orchestrator.md
+++ b/.kilo/agents/orchestrator.md
@@ -1,7 +1,7 @@
 ---
 description: Main dispatcher. Routes tasks between agents based on Issue status and manages the workflow state machine. IF:90 for optimal routing accuracy.
 mode: all
-model: ollama-cloud/glm-5.1
+model: ollama-cloud/kimi-k2.6:cloud
 variant: thinking
 color: "#7C3AED"
 permission:
--- a/.kilo/agents/product-owner.md
+++ b/.kilo/agents/product-owner.md
@@ -1,7 +1,7 @@
 ---
 description: Manages issue checklists, status labels, tracks progress and coordinates with human users
 mode: subagent
-model: ollama-cloud/glm-5
+model: ollama-cloud/glm-5.1
 color: "#EA580C"
 permission:
  read: allow
--- a/.kilo/agents/prompt-optimizer.md
+++ b/.kilo/agents/prompt-optimizer.md
@@ -1,7 +1,7 @@
 ---
 description: Improves agent system prompts based on performance failures. Meta-learner for prompt optimization
 mode: subagent
-model: qwen/qwen3.6-plus:free
+model: ollama-cloud/glm-5.1
 color: "#BE185D"
 permission:
  read: allow
--- a/.kilo/agents/release-manager.md
+++ b/.kilo/agents/release-manager.md
@@ -1,7 +1,7 @@
 ---
 description: Manages git operations, semantic versioning, branching, and deployments. Ensures clean history
 mode: subagent
-model: ollama-cloud/glm-5.1
+model: ollama-cloud/devstral-2:123b
 color: "#581C87"
 permission:
  read: allow
--- a/.kilo/agents/requirement-refiner.md
+++ b/.kilo/agents/requirement-refiner.md
@@ -1,7 +1,7 @@
 ---
 description: Converts vague ideas and bug reports into strict User Stories with acceptance criteria checklists
 mode: all
-model: ollama-cloud/glm-5.1
+model: ollama-cloud/kimi-k2-thinking
 variant: thinking
 color: "#4F46E5"
 permission:
--- a/.kilo/agents/system-analyst.md
+++ b/.kilo/agents/system-analyst.md
@@ -1,7 +1,7 @@
 ---
 description: Designs technical specifications, data schemas, and API contracts before implementation
 mode: subagent
-model: qwen/qwen3.6-plus:free
+model: ollama-cloud/glm-5.1
 color: "#0891B2"
 permission:
  read: allow
--- a/.kilo/agents/visual-tester.md
+++ b/.kilo/agents/visual-tester.md
@@ -1,7 +1,7 @@
 ---
 description: Visual regression testing agent that compares screenshots and detects UI differences using pixelmatch and image diff
 mode: subagent
-model: ollama-cloud/glm-5
+model: ollama-cloud/qwen3-coder:480b
 color: "#E91E63"
 permission:
  read: allow
--- a/.kilo/agents/workflow-architect.md
+++ b/.kilo/agents/workflow-architect.md
@@ -1,7 +1,7 @@
 ---
 description: Creates and maintains workflow definitions with complete architecture, Gitea integration, and quality gates
 mode: subagent
-model: ollama-cloud/glm-5.1
+model: ollama-cloud/gpt-oss:120b
 variant: thinking
 color: "#EC4899"
 permission:
--- a/.kilo/rules/autonomous-mode.md
+++ b/.kilo/rules/autonomous-mode.md
@@ -0,0 +1,51 @@
+# Autonomous Mode Rules
+
+Agents must execute tasks fully autonomously without asking the user for confirmation on every file change.
+
+## Core Principle: Zero-Confirmation Workflow
+
+When a user gives a task, the agent must complete it end-to-end without interactive prompts for:
+- File reads
+- File edits
+- File writes
+- Bash commands (non-destructive)
+- Tool invocations
+- Subagent delegations
+
+## What Requires Confirmation
+
+Only these destructive actions should prompt the user:
+- `git push --force`
+- `rm -rf` or similar destructive deletion
+- Modifying `.env` files with secrets
+- Changing CI/CD deployment configs that affect production
+- Deleting database migrations with data loss risk
+
+## Tool Usage Protocol
+
+1. **Read** — Read any file needed without asking.
+2. **Edit** — Edit any file needed without asking.
+3. **Write** — Create or overwrite files without asking.
+4. **Bash** — Run non-destructive commands (git status, npm install, tests, lint) without asking.
+5. **Task** — Delegate to subagents without asking.
+6. **Webfetch** — Fetch documentation or references without asking.
+
+## Never Do
+
+- Never ask "Should I proceed?"
+- Never ask "Is it okay to edit this file?"
+- Never ask "Can I run this command?"
+- Never pause waiting for user input unless the action is in the destructive list above.
+
+## Always Do
+
+- Execute the full task to completion.
+- Report what was done after finishing.
+- Use retries on tool errors without asking.
+- Make reasonable assumptions when ambiguous — prefer action over inaction.
+
+## Commit Policy
+
+- Do NOT ask before creating a commit.
+- Do NOT ask before pushing.
+- Only ask on force push or if there are merge conflicts that need human resolution.
--- a/kilo-meta.json
+++ b/kilo-meta.json
@@ -1,7 +1,7 @@
 {
  "$schema": "https://app.kilo.ai/config.json",
  "metaVersion": "1.0.0",
-  "lastSync": "2026-04-05T12:19:32.133Z",
+  "lastSync": "2026-04-27T11:07:02.592Z",
  "agents": {
    "requirement-refiner": {
      "file": ".kilo/agents/requirement-refiner.md",
@@ -21,7 +21,7 @@
    "system-analyst": {
      "file": ".kilo/agents/system-analyst.md",
      "description": "Designs technical specifications, data schemas, and API contracts before implementation",
-      "model": "qwen/qwen3.6-plus:free",
+      "model": "ollama-cloud/glm-5.1",
      "mode": "subagent",
      "category": "core"
    },
@@ -68,7 +68,7 @@
    "devops-engineer": {
      "file": ".kilo/agents/devops-engineer.md",
      "description": "DevOps specialist for Docker, Kubernetes, CI/CD pipeline automation, and infrastructure management",
-      "model": "ollama-cloud/deepseek-v3.2",
+      "model": "ollama-cloud/kimi-k2.6:cloud",
      "mode": "subagent",
      "color": "#FF6B35",
      "category": "core"
@@ -108,14 +108,14 @@
    "visual-tester": {
      "file": ".kilo/agents/visual-tester.md",
      "description": "Visual regression testing agent that compares screenshots and detects UI differences using pixelmatch and image diff",
-      "model": "ollama-cloud/glm-5",
+      "model": "ollama-cloud/qwen3-coder:480b",
      "mode": "subagent",
      "category": "quality"
    },
    "orchestrator": {
      "file": ".kilo/agents/orchestrator.md",
      "description": "Main dispatcher. Routes tasks between agents based on Issue status and manages the workflow state machine",
-      "model": "ollama-cloud/glm-5",
+      "model": "ollama-cloud/kimi-k2.6:cloud",
      "mode": "all",
      "color": "#7C3AED",
      "category": "meta"
@@ -138,21 +138,21 @@
    "prompt-optimizer": {
      "file": ".kilo/agents/prompt-optimizer.md",
      "description": "Improves agent system prompts based on performance failures. Meta-learner for prompt optimization",
-      "model": "qwen/qwen3.6-plus:free",
+      "model": "ollama-cloud/glm-5.1",
      "mode": "subagent",
      "category": "meta"
    },
    "product-owner": {
      "file": ".kilo/agents/product-owner.md",
      "description": "Manages issue checklists, status labels, tracks progress and coordinates with human users",
-      "model": "ollama-cloud/glm-5",
+      "model": "ollama-cloud/glm-5.1",
      "mode": "subagent",
      "category": "meta"
    },
    "agent-architect": {
      "file": ".kilo/agents/agent-architect.md",
      "description": "Creates, modifies, and reviews new agents, workflows, and skills based on capability gap analysis",
-      "model": "ollama-cloud/nemotron-3-super",
+      "model": "ollama-cloud/kimi-k2.6:cloud",
      "mode": "subagent",
      "category": "meta"
    },
@@ -180,7 +180,7 @@
    "browser-automation": {
      "file": ".kilo/agents/browser-automation.md",
      "description": "Browser automation agent using Playwright MCP for E2E testing, form filling, navigation, and web interaction",
-      "model": "ollama-cloud/glm-5",
+      "model": "ollama-cloud/kimi-k2.6:cloud",
      "mode": "subagent",
      "category": "testing"
    },
--- a/kilo.jsonc
+++ b/kilo.jsonc
@@ -313,7 +313,7 @@
    "prompt-optimizer": {
      "description": "Improves agent system prompts based on performance failures. Meta-learner for prompt optimization",
      "mode": "subagent",
-      "model": "qwen/qwen3.6-plus:free",
+      "model": "ollama-cloud/glm-5.1",
      "permission": {
        "read": "allow",
        "edit": "allow",