- mergeCachedResults() deleted — no more hidden cache layer
- startAgentResearch() / startCellResearch() — no longer save to localStorage
- load() — pure API/JSON driven, no localStorage clearing logic
- Page now shows exactly what API returns — easier to verify fixes
- rebuild-report.py: sync current_model from kilo-meta.json (UPDATE existing rows, not only INSERT new ones)
- real-fit-report.json: regenerated from DB after agents table model rename
- real-fit.db: 10 agents updated: current_model pro-max → pro
- real-fit.html: remove stale model alias fallback
- modelShort(): alias deepseek-v4-pro-max → deepseek-v4-pro
- mergeCachedResults(): dedupe allAvailableModels by short name via some()
- openAgentModal(): dedupe render with Set
- MODEL_BENCHMARKS: remove stale deepseek-v4-pro-max entry
- evolution.json: sed-replace remaining deepseek-v4-pro-max → deepseek-v4-pro
- load(): normalize ollama-cloud/* names to short form, deduplicate with Set
- Prevents double entries when cache adds short names alongside API full names
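
The normalization and deduplication described above can be sketched roughly as follows. This is an illustration, not the code from real-fit.html: the body of `modelShort()` is not shown in this log, so the prefix stripping, the `MODEL_ALIASES` table, and the `dedupeModels` helper are assumptions.

```javascript
// Assumed alias table; only the pro-max → pro alias is mentioned in the log.
const MODEL_ALIASES = { "deepseek-v4-pro-max": "deepseek-v4-pro" };

// Hypothetical stand-in for modelShort(): drop a provider prefix such as
// "ollama-cloud/" and then apply any known alias.
function modelShort(name) {
  const short = name.includes("/") ? name.slice(name.lastIndexOf("/") + 1) : name;
  return MODEL_ALIASES[short] ?? short;
}

// Hypothetical helper: a Set keeps first-seen order and drops names that
// collapse to the same short form after normalization.
function dedupeModels(names) {
  return [...new Set(names.map(modelShort))];
}

const merged = dedupeModels([
  "ollama-cloud/kimi-k2.6",           // full name from the API
  "kimi-k2.6",                        // short name from the cache
  "ollama-cloud/deepseek-v4-pro-max", // stale alias
  "deepseek-v4-pro",
]);
console.log(merged); // ["kimi-k2.6", "deepseek-v4-pro"]
```

Normalizing before inserting into the Set is what removes the double entries; a Set over the raw names would keep both the full and the short form.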
- Remove localStorage wipe before mergeCachedResults() — was deleting
cached research results when Evolve button triggered load()
- mergeCachedResults(): only fill gaps (existing === undefined || existing === 0), never
override scores from DB; prevents stale cache from shadowing live data
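
A minimal sketch of the gap-filling rule, assuming scores are stored as `{ agent: { model: score } }`. The data shape, the `fillGaps` name, and the sample numbers are illustrative assumptions, not taken from the actual mergeCachedResults().

```javascript
// Hypothetical gap-fill merge: DB scores win, cache only fills holes.
function fillGaps(dbScores, cachedScores) {
  // Copy one level deep so the DB-derived input is not mutated.
  const result = {};
  for (const [agent, models] of Object.entries(dbScores)) {
    result[agent] = { ...models };
  }
  for (const [agent, models] of Object.entries(cachedScores)) {
    result[agent] ??= {}; // create the agent entry if the DB has none
    for (const [model, score] of Object.entries(models)) {
      const existing = result[agent][model];
      // Only fill when the DB has no score or a zero score.
      if (existing === undefined || existing === 0) {
        result[agent][model] = score;
      }
    }
  }
  return result;
}

// Illustrative values only.
const db = { planner: { "kimi-k2.6": 82, "deepseek-v4-pro": 0 } };
const cache = {
  planner: { "kimi-k2.6": 60, "deepseek-v4-pro": 71 },
  "code-skeptic": { "kimi-k2.6": 77 },
};
const mergedScores = fillGaps(db, cache);
console.log(mergedScores.planner["kimi-k2.6"]);       // 82 (DB value kept)
console.log(mergedScores.planner["deepseek-v4-pro"]); // 71 (gap filled)
```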
- updateCell(): auto-create agent entry in reportData.agents if missing
- updateCell(): add model to allAvailableModels for modal checkboxes
- mergeCachedResults(): auto-create agent entry, normalize model names via modelShort()
- mergeCachedResults(): add new models to allAvailableModels for modal picker
- MODEL_BENCHMARKS: add deepseek-v4-pro (was missing, only had deepseek-v4-pro-max)
- real-fit.html: API-driven research dashboard with agent/model heatmap, detail modal with score breakdown and evaluator commentary
- api.py: FastAPI backend serving /api/real-fit-report (dynamic from SQLite), /api/research, /api/evolve-agent/start
- rebuild-report.py: generates real-fit-report.json from SQLite DB for static fallback
- docker-compose.yml: add evolution-api service (Python 3.12, uvicorn) for research endpoints
- index.standalone.html: sync with dashboard data updates
- archive/index.html: standalone dashboard snapshot (263KB)
- .gitignore: exclude *.db, research-jobs.json from tracking
- build-standalone-fixed.cjs: removed renderHeatmap() replacement block
- The replacement used string concatenation with '\'' which broke
single quotes in generated HTML, causing SyntaxError: unexpected token
- Original renderHeatmap() in index.html uses template literals (`...`)
which are safe and already contain showCellDetail onclick handler
- Rebuilt index.standalone.html from fixed source
- Zero console errors, zero JS syntax errors verified on port 3003
1. filterCategory(): replace inline event.target with an explicit btn parameter
- All Agents tab filter buttons now correctly toggle active class
2. exportRecommendations()/showApplyModal(): read from agentData instead of the removed INLINE_RECOMMENDATIONS
- Apply modal shows real recommendations
- Export generates JSON with real data
3. Heatmap cell click: add showCellDetail modal with Chart.js line chart + prompt history
- onclick='showCellDetail(model, agent)' on every td
- renderCellChart computes score history from agent.history
- prompt_change items filtered and displayed
4. watch-db.cjs: incremental DB sync tool
- Polls git for changes in .kilo/agents/*.md and kilo-meta.json
- Detects model_change vs prompt_change by comparing with previous version
- Exports to JSON after sync, logs to .kilo/logs/watch-db.log
- SIGINT/SIGTERM graceful shutdown
- Trigger: npm run evolution:watch
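
The change detection in watch-db.cjs could look roughly like the sketch below. The snapshot shape `{ model, prompt }` and the `classifyChanges` name are assumptions for illustration; the real tool reads the previous version from git and may compare different fields.

```javascript
// Hypothetical classifier: compare the previous and current snapshot of an
// agent definition and report which kinds of change occurred.
function classifyChanges(previous, current) {
  const changes = [];
  if (previous.model !== current.model) {
    changes.push({ type: "model_change", from: previous.model, to: current.model });
  }
  if (previous.prompt !== current.prompt) {
    changes.push({ type: "prompt_change" });
  }
  return changes;
}

const before = { model: "deepseek-v4-pro-max", prompt: "You are a planner." };
const after = { model: "deepseek-v4-pro", prompt: "You are a planner." };
const detected = classifyChanges(before, after);
console.log(detected.map((c) => c.type)); // ["model_change"]
```

Returning a list rather than a single label lets one sync record both a model change and a prompt change when a commit touches both.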
- capture-dashboard-tabs.cjs: Playwright script to capture all 6 dashboard tabs
- console-error-dashboard.cjs: Console + network error monitor with tab switching
- Both scripts run via docker/docker-compose.web-testing.yml Playwright container
- Zero console errors and zero network errors verified across all tabs
- SWE=null no longer zeroes score; weight IF at 0.85 for reasoning-only models
- Inline MODEL_BENCHMARKS const (sync script doesn't populate benchmarks)
- Hash fallback tightened from 50-85 to 55-80
- History-miner now shows +10 improvement (82 vs 72) instead of false regression
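
One way to get a deterministic fallback in the 55-80 range is to hash the model name, as in this sketch. The hash function, the `hashFallbackScore` and `lookupScore` names, and the flat name-to-score table are assumptions; the dashboard's real hashing and benchmark format are not shown in this log.

```javascript
// Hypothetical deterministic fallback: the same model name always maps to
// the same score in the inclusive range 55..80.
function hashFallbackScore(modelName) {
  let hash = 0;
  for (const ch of modelName) {
    // Simple polynomial hash, truncated to an unsigned 32-bit integer.
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return 55 + (hash % 26); // 26 possible values: 55, 56, ..., 80
}

// Hypothetical lookup: prefer a known benchmark score, else fall back.
function lookupScore(modelName, benchmarks) {
  return benchmarks[modelName] ?? hashFallbackScore(modelName);
}

const sampleBenchmarks = { "kimi-k2.6": 91 }; // illustrative table
console.log(lookupScore("kimi-k2.6", sampleBenchmarks)); // 91
console.log(lookupScore("unknown-model", sampleBenchmarks)); // some value in 55..80
```

Because the fallback depends only on the name, repeated renders show the same number for an unbenchmarked model instead of a value that changes on every load.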
Replaced broken chart functions that expected the non-existent fit_score_before/fit_score_after
fields with data-agnostic implementations using model names + benchmark lookup.
- Agent Score Bar Chart: horizontal bars per agent, sorted descending, color-coded
- Model Distribution: donut chart with legend on the right
- Migration Impact Bars: before/after comparison from history entries
- Added getModelScore() helper with deterministic fallback
- Added 'Sync Evolution Data' button, shown when data is missing
Fixes: canvas dimensions; handles getBoundingClientRect() reporting 0 width/height while the tab is hidden
- Create .kilo/agents/workflow-cross-checker.md as a process inspector
- Requires bash: ask, task: deny (subagent security compliant)
- Defines Role Boundaries clarifying it does NOT replace code-skeptic, planner, or capability-analyst
- Adds 7-question Uncomfortable Questions Protocol for architecture and conflict validation
- Adds Error Handling table (Gitea API failure, corrupted checkpoint, unreadable logs)
- Inserts Cross-Check Verification (Gate #1/#2/#3) into orchestrator state machine
- Registers agent in kilo-meta.json, kilo.jsonc, capability-index.yaml, AGENTS.md, KILO_SPEC.md
- Model: ollama-cloud/kimi-k2.6 (higher IF 91, better instruction following for structured verdicts)
- Responsive HTML/CSS landing with full project presentation
- 30+ agent matrix table, pipeline phases, evolution section
- Domain skills showcase with Docker-native approach
- Pricing tiers: Developer 35€/mo, Team 200€/mo
- Dark/light theme toggle with system preference detection
- Theme persisted in localStorage, smooth CSS transitions
- Docker container running on port 3002 via nginx:alpine
- Cross-browser compatible, no horizontal scroll, mobile nav
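
The theme selection order (stored choice first, then system preference) can be expressed as a small pure function. This is a sketch under assumptions: the `resolveTheme` name and the `"theme"` storage key are hypothetical, and the landing page's actual script is not shown here.

```javascript
// Hypothetical resolver: an explicit stored choice wins, otherwise follow
// the operating system preference.
function resolveTheme(storedTheme, systemPrefersDark) {
  if (storedTheme === "dark" || storedTheme === "light") {
    return storedTheme;
  }
  return systemPrefersDark ? "dark" : "light";
}

// In a browser the inputs would typically come from:
//   localStorage.getItem("theme")  (key name is an assumption)
//   window.matchMedia("(prefers-color-scheme: dark)").matches
console.log(resolveTheme(null, true));     // "dark"
console.log(resolveTheme("light", true));  // "light"
```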
Add task-critical-assessment.md with 5 criteria to evaluate tasks BEFORE execution:
1. Abstraction over local API → reject (MCP lesson)
2. Layer without proven need → reject (hybrid fallback lesson)
3. Environment more complex than task → reject (Docker overlay lesson)
4. No acceptance criteria → require clarification
5. Previously rolled back work → require justification
Link from global.md so every agent runs TCA before starting work.
Prevents repeating the MCP incident: 6 commits, 1700+ lines, 2 days → full revert.
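
The five criteria read as a decision procedure, sketched below. task-critical-assessment.md is a prompt document rather than code, so the flag names, verdict strings, and the `assessTask` function are all hypothetical illustrations of the rule order.

```javascript
// Hypothetical encoding of the five criteria; flag names are assumptions.
const TCA_CRITERIA = [
  { flag: "abstractsLocalApi", verdict: "reject" },
  { flag: "layerWithoutProvenNeed", verdict: "reject" },
  { flag: "environmentMoreComplexThanTask", verdict: "reject" },
  { flag: "noAcceptanceCriteria", verdict: "require clarification" },
  { flag: "previouslyRolledBack", verdict: "require justification" },
];

// Returns the overall verdict plus every criterion that fired.
function assessTask(task) {
  const findings = TCA_CRITERIA.filter((c) => task[c.flag] === true);
  let verdict = "proceed";
  if (findings.some((f) => f.verdict === "reject")) {
    verdict = "reject"; // any reject criterion outweighs the softer ones
  } else if (findings.length > 0) {
    verdict = findings[0].verdict; // first clarification or justification request
  }
  return { verdict, findings: findings.map((f) => f.flag) };
}

const assessment = assessTask({ abstractsLocalApi: true, previouslyRolledBack: true });
console.log(assessment.verdict); // "reject"
```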