UniqueSoft/APAW - Gitea

Author	SHA1	Message	Date
Deploy Bot	530c1c1384	fix(commands): replace all non-Ollama providers with ollama-cloud/ Removed: openrouter, openai (unreliable/foreign) Replaced 6 commands: - /status → ollama-cloud/qwen3.5-122b - /ask → ollama-cloud/qwen3.5-122b - /hotfix → ollama-cloud/deepseek-v4-pro-max - /review → ollama-cloud/kimi-k2.6 - /code → ollama-cloud/deepseek-v4-pro-max (prev commit) - /plan → ollama-cloud/deepseek-v4-pro-max (prev commit) All models now served from ollama-cloud/ exclusively.	2026-05-27 14:02:10 +01:00
Deploy Bot	b075189b83	chore(dashboard): rebuild standalone after model migration - All 18 recommendations applied → pending: 0 - File size: 246.1 KB	2026-05-27 13:47:33 +01:00
Deploy Bot	7635cb62cd	fix(dashboard): heatmap cell click + 5th tab + model sync fixes - restore hmModal with 4 legacy tabs + new Performance Graph tab - fix event.target in research-dashboard.template.html switchTab - fix showCellDetail event.stopPropagation for modal persistence - update agent models + sync KILO_SPEC.md and AGENTS.md	2026-05-27 13:46:55 +01:00
Deploy Bot	36455ccf24	feat: apply model recommendations - 18 agents migrated to kimi-k2.6 Sources from agent-evolution/data/evolution.json Agents: architect-indexer, backend-developer, browser-automation, code-skeptic, evaluator, flutter-developer, frontend-developer, history-miner, lead-developer, markdown-validator, php-developer, product-owner, prompt-optimizer, python-developer, requirement-refiner, sdet-engineer, visual-tester, workflow-architect Also synced 4 agents via sync-agents.cjs	2026-05-27 13:38:49 +01:00
Deploy Bot	95e0866b46	fix(dashboard): remove all event.target dependencies - switchTab(tabId, el): uses el or document.querySelector fallback - switchHmTab(tabName, btn): uses btn or querySelector fallback - All 6 tab buttons + 4 heatmap modal tabs pass 'this' as parameter - Rebuilt index.standalone.html (261.6 KB) - Verified: grep event.target returns 0 occurrences	2026-05-26 13:22:40 +01:00
Deploy Bot	c212a0a34e	fix(build): remove broken heatmap string replacement - build-standalone-fixed.cjs: removed renderHeatmap() replacement block - The replacement used string concatenation with '\'' which broke single quotes in generated HTML, causing SyntaxError: unexpected token - Original renderHeatmap() in index.html uses template literals (`...`) which are safe and already contain showCellDetail onclick handler - Rebuilt index.standalone.html from fixed source - Zero console errors, zero JS syntax errors verified on port 3003	2026-05-25 22:31:32 +01:00
Deploy Bot	7f1269a370	fix(dashboard): 3 UI bugs + new DB watch tool 1. filterCategory: fix inline event.target → uses btn parameter - All Agents tab filter buttons now correctly toggle active class 2. exportRecommendations/showApplyModal: read from agentData, not removed INLINE_RECOMMENDATIONS - Apply modal shows real recommendations - Export generates JSON with real data 3. Heatmap cell click: add showCellDetail modal with Chart.js line chart + prompt history - onclick='showCellDetail(model, agent)' on every td - renderCellChart computes score history from agent.history - prompt_change items filtered and displayed 4. watch-db.cjs: incremental DB sync tool - Polls git for changes in .kilo/agents/*.md and kilo-meta.json - Detects model_change vs prompt_change by comparing with previous version - Exports to JSON after sync, logs to .kilo/logs/watch-db.log - SIGINT/SIGTERM graceful shutdown - Trigger: npm run evolution:watch	2026-05-25 21:50:55 +01:00
Deploy Bot	a0604afaf6	chore: archive generated files and clean up runtime outputs - index.standalone.html → agent-evolution/archive/index.standalone-2026-05-25.html (generated build output) - tests/visual/dashboard-tabs/current/*.png → tests/visual/archive/dashboard-tabs-current-2026-05-25/ (runtime capture output) - Cleaned empty tests/visual/dashboard-tabs/current/ directory	2026-05-25 21:23:47 +01:00
Deploy Bot	3cca6814f6	test(dashboard): add visual regression baselines for all 6 tabs - Captured via Playwright in Docker container - Viewport: desktop 1280x720 - Tabs: overview, all_agents, timeline, recommendations, heatmap, impact - Zero console errors, zero network errors during capture	2026-05-25 21:16:49 +01:00
Deploy Bot	a37bbee9e0	test(dashboard): add SPA screenshot and console error monitoring scripts - capture-dashboard-tabs.cjs: Playwright script to capture all 6 dashboard tabs - console-error-dashboard.cjs: Console + network error monitor with tab switching - both scripts run via docker/docker-compose.web-testing.yml Playwright container - zero console errors and zero network errors verified across all tabs	2026-05-25 21:15:49 +01:00
Deploy Bot	bac09bee02	feat(dashboard): add SPA tab screenshot capture for visual testing	2026-05-25 21:12:29 +01:00
Deploy Bot	9b0f160587	feat(dashboard): unified data pipeline, verified benchmarks, and browser testing - build-standalone-fixed.cjs: reads from 4 real sources (agents md, kilo-meta.json, model-benchmarks-verified.json, agent-versions.json); computes recommendations dynamically - build-standalone-direct.cjs: direct data export + HTML embed pipeline - dashboard-smoke-test.ts: Playwright E2E smoke test covering all 6 tabs - model-benchmarks-verified.json: verified IF scores from artificialanalysis.ai for 15 models (SWE-bench unverifiable → null) - agent-versions.json: 347 git history entries extracted for 34 agents - kilo-meta.json: prompt-optimizer → qwen3.5-122b, memory-manager → deepseek-v4-pro-max - index.html: Recommendations tab rendering updated for dynamic data - Dockerfile + docker-compose.yml: mount-driven build, no image rebuild for data changes - README.md: updated dashboard docs and verified benchmark sources	2026-05-25 21:05:14 +01:00
Deploy Bot	f9bed0f262	fix(dashboard): correct computeAgentScore formula and inline benchmark data - SWE=null no longer zeroes score; weight IF at 0.85 for reasoning-only models - Inline MODEL_BENCHMARKS const (sync script doesn't populate benchmarks) - Hash fallback tightened from 50-85 to 55-80 - History-miner now shows +10 improvement (82 vs 72) instead of false regression	2026-05-25 16:31:15 +01:00
Deploy Bot	699456b49e	feat(dashboard): replace raw Canvas with Chart.js for all Impact tab charts - Add Chart.js 4.4.7 via CDN + datalabels plugin - Agent Score: horizontal bar chart, sorted descending, color-coded - Model Distribution: doughnut with right-side legend + percentages - Migration Impact: grouped before/after bars with tooltip showing delta - Dark theme defaults: #8ba3c0 text, #1e2d45 grid - Chart instances destroyed before re-render to prevent memory leaks - Responsive: maintainAspectRatio: false	2026-05-25 15:45:14 +01:00
Deploy Bot	19be5cf229	fix(dashboard): rewrite Impact tab charts to work with actual data structure Replaced broken chart functions that expected non-existent fit_score_after/before with data-agnostic implementations using model names + benchmark lookup. - Agent Score Bar Chart: horizontal bars per agent, sorted descending, color-coded - Model Distribution: donut chart with legend on the right - Migration Impact Bars: before/after comparison from history entries - Added getModelScore() helper with deterministic fallback - Added 'Sync Evolution Data' button if data missing Fixes: canvas dimensions, getBoundingClientRect() == 0 when tab hidden	2026-05-25 15:18:35 +01:00
Deploy Bot	047a87afb4	feat(agent-models): apply MEDIUM+LOW priority model migrations - markdown-validator: deepseek-v4-pro-max → nemotron-3-nano (90% cost cut) - release-manager: glm-5.1 → kimi-k2.6 (+2 matrix, 1M context for diffs) - capability-analyst: glm-5.1 → deepseek-v4-pro-max (+4 matrix, 1M ctx) - browser-automation: qwen3-coder → deepseek-v4-flash (3× faster inference) - history-miner: nemotron-3-super → qwen3.5-122b (+14 IF, 12.4M pulls)	2026-05-25 15:07:17 +01:00
Deploy Bot	4a0c78e5c9	feat(agent-models): apply CRITICAL+HIGH model migrations from research Migrations based on model-research-2026-05-24: - prompt-optimizer: qwen3.6-plus → qwen3.5-122b (CRITICAL, IF=92) - memory-manager: qwen3.6-plus → deepseek-v4-pro-max (CRITICAL, 1M ctx) - system-analyst: glm-5.1 → deepseek-v4-pro-max (HIGH, matrix +6) - evaluator: glm-5.1 → qwen3.5-122b (HIGH, IF=92) - pipeline-judge: glm-5.1 → kimi-k2.6 (HIGH, matrix +8, 1M ctx) - workflow-architect: glm-5.1 → qwen3.5-122b (HIGH, IF=92) 7 files changed, 12 insertions(+), 12 deletions(-) Closes: model-research data gaps for idle models	2026-05-25 14:36:31 +01:00
Deploy Bot	81b130471d	fix(tool-use): add question tool schema with mandatory description field	2026-05-25 14:31:54 +01:00
Deploy Bot	e6e8e9cb2a	feat(workflow-cross-checker): add pre-flight inter-agent validation agent with gate protocol - Create .kilo/agents/workflow-cross-checker.md as a process inspector - Requires bash: ask, task: deny (subagent security compliant) - Defines Role Boundaries clarifying it does NOT replace code-skeptic, planner, or capability-analyst - Adds 7-question Uncomfortable Questions Protocol for architecture and conflict validation - Adds Error Handling table (Gitea API failure, corrupted checkpoint, unreadable logs) - Inserts Cross-Check Verification (Gate #1/#2/#3) into orchestrator state machine - Registers agent in kilo-meta.json, kilo.jsonc, capability-index.yaml, AGENTS.md, KILO_SPEC.md - Model: ollama-cloud/kimi-k2.6 (higher IF 91, better instruction following for structured verdicts)	2026-05-24 00:11:25 +01:00
Deploy Bot	bb043cb23d	feat(landing): add APAW marketing landing page with dark/light theme toggle - Responsive HTML/CSS landing with full project presentation - 30+ agent matrix table, pipeline phases, evolution section - Domain skills showcase with Docker-native approach - Pricing tiers: Developer 35€/mo, Team 200€/mo - Dark/light theme toggle with system preference detection - Theme persisted in localStorage, smooth CSS transitions - Docker container running on port 3002 via nginx:alpine - Cross-browser compatible, no horizontal scroll, mobile nav	2026-05-23 22:48:19 +01:00
Kilo Orchestrator	ded8e3022d	feat(parallel-coordination): evolution — Gitea comment-based task claiming for parallel agent execution New rule: - parallel-coordination.md — claim protocol, overlap check, claim release, deadlock prevention Updated: - orchestrator.md — Overlap Verification MANDATORY before parallel spawn - capability-index.yaml — implementation_phase parallel group with claim_protocol - gns-agent-protocol.md — task_claim and task_claim_release event types - EVOLUTION_LOG.md — evolution entry #6 Fixes: parallel agents writing to same files, migration collisions, worktree merge conflicts. No new agent, no new Docker service (per TCA rule).	2026-05-18 16:13:33 +01:00
Kilo Orchestrator	46d6752890	feat(context-window): evolution — Gitea-centric checkpoint pruning + agent context hygiene New rules: - context-window-budget.md — budget per task size, what to load/offload, recovery protocol - gns-checkpoint-pruning.md — minimal checkpoint v2 schema, agent entry/exit protocols Updated: - orchestrator.md — Context Budget Governance section (prune if consumed > 80%) - gns-agent-protocol.md — checkpoint schema trimmed (history → history_tail), added current_task + agent_chain - EVOLUTION_LOG.md — logged evolution entry #5 Fixes: context window overflow, agents loading 15,000+ tokens of irrelevant comments, state held in RAM instead of offloaded to Gitea.	2026-05-18 15:54:15 +01:00
Kilo Orchestrator	4e9ea678bd	feat(orchestrator): evolution — capability-first routing, parallelization, zero-work policy - orchestrator.md: add Capability-First Routing Protocol (5-step anti-regression) - orchestrator.md: add Testing Task Routing Matrix (browser-automation, visual-tester) - orchestrator.md: add Parallelization Protocol (review_phase + testing_phase parallel groups) - orchestrator.md: add Orchestrator Self-Delegation Prohibition (ZERO WORK POLICY) - capability-index.yaml: enrich parallel_groups with trigger/criteria/aggregator - capability-index.yaml: enrich iteration_loops with trigger_on fields - global.md: add Orchestrator Capability-First Check under Tooling Infrastructure - docker.md: add Host Installation Prohibition (STOP/READ/DELEGATE/REPORT) - EVOLUTION_LOG.md: log both evolution entries (2026-05-16T13:00 and 13:06) Addresses: orchestrator host tool install regression, serial execution waste, orchestrator self-work bypass of specialized agents.	2026-05-16 13:10:06 +01:00
Deploy Bot	60b14d33d0	fix(installer): install Kilo extension for root + all regular users, remove broken --user-data-dir	2026-05-16 12:13:35 +01:00
Deploy Bot	d796da6ab4	fix(installer): add bun to PATH persistently, suppress debconf dialogs, fix root vscode flags	2026-05-16 11:59:25 +01:00
Deploy Bot	e45cac8709	fix(installer): add --no-sandbox for root VS Code extension install + .work/ in .gitignore	2026-05-16 11:52:51 +01:00
Deploy Bot	879e0e5b7e	feat: add one-command Linux installer with VS Code + Kilo extension + APAW setup	2026-05-16 11:48:39 +01:00
NW	a6516f8595	feat: restore universal blog, booking, ecommerce skills with framework-agnostic schema and API patterns	2026-05-13 18:12:14 +01:00
NW	f65bbf9420	feat: add visual quality rules to frontend-developer agent + new screenshot page	2026-05-13 16:54:29 +01:00
NW	2287122f91	fix(agents): add Tool-First Enforcement to agent definitions and global rules	2026-05-13 09:37:40 +01:00
NW	4c9a95661f	evolution: remove obsolete :cloud suffix from kimi-k2.6 model id across all configs	2026-05-13 09:27:48 +01:00
NW	c031c4b9e5	feat(evolution): add incident-responder agent for server incident response and forensics	2026-05-09 13:31:20 +01:00
NW	8788261d4f	rules: add Task Critical Assessment (TCA) to prevent waste Add task-critical-assessment.md with 5 criteria to evaluate tasks BEFORE execution: 1. Abstraction over local API → reject (MCP lesson) 2. Layer without proven need → reject (hybrid fallback lesson) 3. Environment more complex than task → reject (Docker overlay lesson) 4. No acceptance criteria → require clarification 5. Previously rolled back work → require justification Link from global.md so every agent runs TCA before starting work. Prevents repeating the MCP incident: 6 commits, 1700+ lines, 2 days → full revert.	2026-05-09 01:57:50 +01:00
NW	67e8d2e41a	revert: remove MCP Gitea integration, restore direct REST client Remove all MCP-related infrastructure in favor of direct REST API calls. MCP added layers without value: Docker container, stdio bridge, hybrid fallback, healthchecks, SSE transport — all of which added failure modes and token overhead. Deleted: - docker/mcp-gitea/docker-compose.yml (MCP container config) - scripts/mcp-gitea-stdio.cjs (stdio bridge) - scripts/e2e-mcp-stdio-test*.py (MCP E2E tests) - scripts/test-kilo-mcp-integration.py - src/kilocode/agent-manager/mcp-gitea-client.ts (548 lines of MCP wrapper) - MCP-STDIO-SETUP.md (MCP documentation) - .vscode/settings.json (hardcoded MCP config with token) - .kilo/skills/mcp-gitea-connection/ and mcp-gitea.research.md Restored: - pipeline-runner.ts: HybridGiteaClient → GiteaClient (direct REST) Removed MCP dependency, imports, and initialization. No healthcheck waits, no container startup delays. - process-continuity.md: removed MCP-specific failure modes - e2e-gns2-test.py: removed Basic Auth, use token auth; fixed spec reference	2026-05-09 01:55:52 +01:00
NW	0f522e61c3	fix(gns-2): replace Basic Auth password with Bearer PAT for MCP	2026-05-09 01:28:40 +01:00
NW	81e4708b5f	docs(gns-2): MCP stdio transport setup instructions	2026-05-09 00:33:21 +01:00
NW	af08e74f72	feat(gns-2): stdio MCP transport with hybrid fallback	2026-05-09 00:28:57 +01:00
NW	106a0291a4	feat(gns2): E2E integration test script for issue #110 - Scripts: e2e-gns2-test.py simulates full pipeline through Gitea API - Supports scoped label replacement (status, budget, cascade) - Generates GNS_EVENT footers in comments - Validates checkpoint, labels, timeline, budget, depth - Uses actual existing labels (status::done, not status::completed) Refs: Milestone #67, Issue #110	2026-05-08 22:49:02 +01:00
NW	f5966db155	feat(gns2): integrate HybridGiteaClient into PollingSupervisor - PollingSupervisor now uses HybridGiteaClient (MCP primary, REST fallback) - Added mcpUrl to PipelineConfig - Supervisor calls initialize() to detect MCP vs REST mode automatically Refs: Milestone #67, Issue #107	2026-05-08 22:35:21 +01:00
NW	06fb0421ef	fix(process-continuity): operator-free design for MCP Docker integration - Resolve service_healthy deadlock by using service_started instead - Fix 172.28.0.0/16 network collision by removing ipam config - Add HybridGiteaClient (mcp → rest → bash fallback) - Create .kilo/rules/process-continuity.md with 5 operator-free principles: 1. No service_healthy conditions 2. No hardcoded networks 3. Automatic fallback chains 4. Pre-flight validation 5. Self-documenting failures - Update docker-compose.yml with resilient config: - start_period: 60s, retries: 5, restart: on-failure:3 - /tools healthcheck (guaranteed endpoint) - tmpfs for Node.js /tmp - Resource limits: 256M RAM, 0.5 CPU - MCP/REST integration test passed (issue #109) Refs: Milestone #67, Issues #107, #109	2026-05-08 22:31:59 +01:00
NW	3cc6ee2ffe	feat(gns2): Phase 8 MCP Docker containers for Gitea direct integration - docker/mcp-gitea/docker-compose.yml — MCP server container (Sqcoows/forgejo-mcp) - .kilo/skills/mcp-gitea-connection/SKILL.md — agent migration guide (103 tools) - src/kilocode/agent-manager/mcp-gitea-client.ts — MCP native client with fallback - Hybrid mode: MCP primary, REST API fallback if container unavailable - All 29 Tier 0/1 agents mass-updated with GNS-2 protocol (checkpoint read, event footer) - Security: no bash for Gitea ops, MCP handles credentials internally Refs: Milestone #67, Issue #107	2026-05-08 22:16:52 +01:00
NW	bd154f24d0	feat(gns2): mass-update all 30 agents with GNS-2 protocol - 29 agents updated with GNS-2 checkpoint/event protocol - 12 Tier 0 (leaf) agents: read checkpoint, write event footer, no cascade - 17 Tier 1 (task) agents: read checkpoint, recommend next agent, no direct task calls - 2 Tier 2 (meta) agents already updated: capability-analyst, agent-architect, evaluator - All agents now include GNS_EVENT footer template in comments - Frontmatter updated with '(GNS-2 Tier N)' classification Scripts added: - scripts/mass-update-gns-agents.py — idempotent mass updater - scripts/validate-gns-agents.py — protocol checker Refs: Milestone #67, Issues #99-#107	2026-05-08 22:03:08 +01:00
NW	47b027a02f	feat(gns2): Gitea-Nervous-System v2.0 - distributed agent state machine - Add GNS-2 label taxonomy (66 labels) with semantic routing - Tier 2 agents (capability-analyst, agent-architect, evaluator) enabled for self-cascade - GNS agent protocol: checkpoint v2 in issue body, machine-readable event footers - GiteaClient extended: checkpoint CRUD, event parsing, assignee/lock control, triggered issue polling - PipelineRunner rewritten as PollingSupervisor: reactive instead of active dispatch - Security: circuit breakers (is_locked), budget governance, depth limits - Scripts: init-gns-labels.py, validate-gns-agents.py - Milestone #67 + 7 phase issues (#99-#105) tracking evolution Refs: Milestone #67, Issues #99-#105	2026-05-08 21:25:38 +01:00
NW	f01e2064fb	feat(evolution): Kilo Code release sync & APAW system hardening (v2026-05-07) Security & Permissions: - All 30 agents: task[*]=deny, task[subagent]=deny (cascade prevention) - orchestrator & release-manager: bash=ask (hardening) - New .kilo/rules/subagent-security.md with audit rules - Updated .kilo/rules/global.md with Security & Permissions section - Updated .kilo/agents/orchestrator.md with Security Enforcement block Session Management: - New .kilo/rules/session-persistence.md (checkpoint format, worktree isolation) - Updated .kilo/rules/branch-strategy.md (worktree per agent) - pipeline-runner.ts: Checkpoint interface + save/load/resume methods Plan Persistence: - Updated .kilo/rules/lead-developer.md (plan handover section) Per-Agent Reasoning: - capability-index.yaml: reasoning_effort for all 30 agents (xhigh/high/medium/low) MCP Cleanup: - New .kilo/skills/docker-security/SKILL.md (--rm, orphaned process cleanup) Config Validation: - Updated .kilo/rules/docker.md (startup checks, commit scoping, location awareness) Docs: - README.md: v2026-05-07 evolution badges - .kilo/EVOLUTION_LOG.md: Entry #6 with full metrics - .gitignore: ignore dist/ + bun.lock Gitea: Milestone #66, Issues #91-#98 Architect: 9/9 sections fresh (express project type)	2026-05-08 18:54:08 +01:00
NW	74ad7c4b6e	docs(branch-strategy): default branch is dev, not main - Update branch strategy: dev is primary development branch - main is stable release only - Add release process: dev → PR → review → main → tag - Sync .kilo/ to target projects after release	2026-05-07 07:39:00 +01:00
NW	994ca58821	fix(agents): add missing permissions + complete kilo-meta.json - Fix 12 agents missing edit/write/bash permissions - Add 5 missing agents to kilo-meta.json (architect-indexer, flutter-developer, php-developer, pipeline-judge, python-developer) - Remove BOM from kilo.jsonc - All 32 agents now consistent between files and meta	2026-05-07 07:22:32 +01:00
NW	defe57d53a	feat: merge infrastructure skills and workflows from TenerifeProp Add MCP-based infrastructure skills: - mcp-integration: Playwright + GitMCP - e2e-testing: Cypress + AntV + Slack - search-integration: Brave + Tavily + Markitdown - security-scanner: CVE Search + MCP Validator - knowledge-base: Docfork + Wikipedia + ArXiv - prompt-manager: version control + DevTrends - api-catalog: MCP server registry - agent-architect-mcp: patterns + OpenAPI converter Add workflow commands: - feature.md: full feature pipeline - hotfix.md: urgent bug fix workflow Add rules: - orchestrator-self-evolution.md - sdet-engineer.md Add audit: - WORKFLOW_AUDIT.md Source: UniqueSoft/TenerifeProp	2026-05-06 23:04:14 +01:00
¨NW¨	80dca09ae0	fix: unquoted color, duplicate key, GLM downgrade + cross-platform validator - Fix security-auditor.md color bare hex to quoted - Fix orchestrator.md duplicate devops-engineer key - Fix .kilo/kilo.jsonc: orchestrator + root model to kimi-k2.6:cloud - Update agent-frontmatter-validation.md with diagnostic guide - Update global.md with YAML frontmatter rules for all agents - Update agent-architect.md + workflow-architect.md with color checklist - Add scripts/validate-agents.cjs: zero-dependency, cross-platform, --fix flag, scans worktrees	2026-05-04 22:01:45 +01:00
¨NW¨	fb552e0020	feat: v3 optimal model assignments + fitness gate - Update 30 agents to v3 heatmap maximum-score models: * go-dev: qwen3-coder -> deepseek-v4-pro-max (85->88 +3) * planner: nemotron -> deepseek-v4-pro-max (80->88 +8) * perf-engineer: nemotron -> deepseek-v4-pro-max (78->84 +6) * reflector: nemotron -> deepseek-v4-pro-max (78->84 +6) * security: nemotron -> deepseek-v4-pro-max (76->80 +4) * memory-manager: nemotron -> qwen3.6-plus (86->87 +1) * frontend: kimi-k2.5 -> minimax-m2.5 (92) * the-fixer: minimax-m2.5 -> kimi-k2.6 (88->90 +2) * browser-auto: kimi-k2.6 -> qwen3-coder (86->87 +1) * prompt-opt: glm-5.1 -> qwen3.6-plus (82->83 +1) * backend: deepseek-v3.2 -> qwen3-coder (91) * capability-analyst: nemotron -> glm-5.1 (85) * release-man: devstral-2 -> glm-5.1 (82) * evaluator: nemotron -> glm-5.1 (86) * workflow-arch: gpt-oss -> glm-5.1 (84) - Add Model Evolution Guard: * fitness-gate.cjs: rejects downgrades >3 points or <75 score * Normalized model ID lookup (: vs -) * Diff report before any file modifications - Update sync-benchmarks-from-yaml.cjs with fitness gate - Sync kilo-meta.json, kilo.jsonc, .md agent files - Rebuild research-dashboard.html (104KB, 30 agents, 11 models) Total improvement: +105 points across 11 agents Source: v3.html heatmap IF-adjusted composite scores	2026-04-30 08:42:10 +01:00
¨NW¨	9e48a4960e	fix: restore optimal v3 models + add fitness gate protection - Restore all 30 agents to v3.html heatmap optimal models: * frontend-developer: qwen3-coder -> minimax-m2.5 (92★) * devops-engineer: nemotron-3-super -> kimi-k2.6:cloud (88★) * browser-automation: qwen3-coder -> kimi-k2.6:cloud (86★) * agent-architect: glm-5.1 -> kimi-k2.6:cloud (86★) - Add Model Evolution Guard system: * agent-evolution/scripts/lib/fitness-gate.cjs * Rejects downgrades >3 points or below score 75 * Produces detailed diff report before any file modifications * Normalized model ID lookup (v3.html ':' vs JSON '-') - Update sync-benchmarks-from-yaml.cjs with fitness gate - Update model-benchmarks.json with v3 optimal assignments - Rebuild research-dashboard.html (104KB, 30 agents, 11 models) - Add model-evolution-guard.md architecture documentation - Add v3-optimal-models.json as source-of-truth reference Fixes regression introduced by commit `3badb25` where models were silently downgraded from heatmap optimal to inferior assignments.	2026-04-29 23:19:16 +01:00
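
The fitness gate described in the two commits above (reject downgrades of more than 3 points or targets scoring below 75, with model IDs normalized across ':' and '-') might look roughly like the sketch below. This is not the contents of fitness-gate.cjs: the function names `normalizeModelId` and `checkMigration`, the shape of the `scores` object, and the exact reading of the two thresholds are all assumptions made for illustration. The sample scores are the ones quoted in the commit messages.

```javascript
// Hypothetical sketch only: names, data shapes, and rule interpretation are
// assumptions, not the contents of fitness-gate.cjs.
"use strict";

const MAX_DOWNGRADE = 3; // a drop of more than 3 points is rejected
const MIN_SCORE = 75;    // a target scoring below 75 is rejected

// Treat ':' and '-' as the same separator so "kimi-k2.6:cloud" and
// "kimi-k2.6-cloud" resolve to one key.
function normalizeModelId(id) {
  return id.trim().toLowerCase().replace(/:/g, "-");
}

// scores: { [agent]: { [modelId]: number } } (assumed shape)
function checkMigration(scores, agent, fromModel, toModel) {
  const byModel = new Map(
    Object.entries(scores[agent] || {}).map(([id, s]) => [normalizeModelId(id), s])
  );
  const before = byModel.get(normalizeModelId(fromModel));
  const after = byModel.get(normalizeModelId(toModel));

  if (after === undefined) {
    return { ok: false, reason: "no score for target model" };
  }
  if (after < MIN_SCORE) {
    return { ok: false, reason: `target score ${after} is below ${MIN_SCORE}` };
  }
  if (before !== undefined && before - after > MAX_DOWNGRADE) {
    return { ok: false, reason: `downgrade of ${before - after} points exceeds ${MAX_DOWNGRADE}` };
  }
  // delta is null when the current model has no recorded score.
  return { ok: true, delta: before === undefined ? null : after - before };
}

// Scores quoted in the commit messages (the-fixer 88 -> 90, planner 80 -> 88).
const scores = {
  "the-fixer": { "minimax-m2.5": 88, "kimi-k2.6": 90 },
  planner: { nemotron: 80, "deepseek-v4-pro-max": 88 },
};

console.log(checkMigration(scores, "the-fixer", "minimax-m2.5", "kimi-k2.6"));
console.log(checkMigration(scores, "planner", "deepseek-v4-pro-max", "nemotron"));
```

Under these assumptions the first call passes with a delta of +2, while the second is rejected because moving planner back to nemotron would be an 8-point downgrade. Returning a reason string instead of throwing leaves room for a diff report to be assembled before any file is modified, as the commit message describes.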

158 Commits