Files

Deploy Bot b95fd41587 feat(evolution): add real-fit dashboard, API, report builder, and docker compose

- real-fit.html: API-driven research dashboard with agent/model heatmap, detail modal with score breakdown and evaluator commentary
- api.py: FastAPI backend serving /api/real-fit-report (dynamic from SQLite), /api/research, /api/evolve-agent/start
- rebuild-report.py: generates real-fit-report.json from SQLite DB for static fallback
- docker-compose.yml: add evolution-api service (Python 3.12, uvicorn) for research endpoints
- index.standalone.html: sync with dashboard data updates
- archive/index.html: standalone dashboard snapshot (263KB)
- .gitignore: exclude *.db, research-jobs.json from tracking

2026-05-28 11:55:49 +01:00

data

chore(archive): move untracked files + clean working tree

Archived to agent-evolution/archive/:
- test scripts, specs, data exports
- dashboard-user-journey.md → .kilo/archive/

Clean: all non-ollama models verified (openrouter, openai removed)

2026-05-27 14:04:37 +01:00

tests

feat(evolution): add real-fit dashboard, API, report builder, and docker compose

2026-05-28 11:55:49 +01:00

index.html

feat(evolution): add real-fit dashboard, API, report builder, and docker compose

2026-05-28 11:55:49 +01:00

index.standalone-2026-05-25.html

chore: archive generated files and clean up runtime outputs

2026-05-25 21:23:47 +01:00

README.md

chore: organize temporary research artifacts into archive

2026-04-29 21:14:23 +01:00

README.md

Agent Evolution Archive

This directory contains temporary scripts, reports, and artifacts from the April 2026 model research and data migration work.

Structure

- scripts/: One-off Python/Node scripts used for data migration, diagnostics, and fixes
- reports/: Intermediate analysis outputs (JSON, Markdown, text)
- data/: Test data files and old dashboard builds

Notable Scripts

| Script | Purpose |
| --- | --- |
| `add_current_model_id.py` | Adds `current_model_id` string field to agent scores (replaces fragile index lookup) |
| `add_fallback_models.py` | Injects `fallback_models` list into `model-benchmarks.json` agents |
| `add_groq_fallbacks.py` | Adds Groq provider fallback configuration with rate limits |
| `analyze_if_scores.py` | Analyzes IF (Inverse FLOPs) scores across model matrix |
| `check_fixes.py` | Validates fixes applied to `model-benchmarks.json` |
| `fix_model_benchmarks.py` | Corrects field name mismatches (`before`/`after` vs `score_before`/`score_after`) |
| `update_failover_strategies.py` | Adds `failover_strategy` blocks to `capability-index.yaml` |
| `validate_fixes.py` | Full validation suite for `model-benchmarks.json` structure |
| `verify_fallback.py` | Checks fallback model existence in benchmark data |
| `verify_fixes_simple.py` | Quick structure verification after fixes |
| `_validate_benchmarks.py` | JSON schema validation runner |
| `rebuild-template.js` | ES module version of `rebuild-template.cjs` (superseded) |
| `sync-benchmarks-from-yaml.js` | ES module version of sync script (superseded) |
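
As a rough illustration of the kind of fix `fix_model_benchmarks.py` is described as making, the sketch below renames `score_before`/`score_after` keys to `before`/`after` wherever they appear in nested data. The function name `normalize_score_fields`, the `LEGACY_KEYS` mapping, and the sample record layout are assumptions made for this example; the archived script and the real `model-benchmarks.json` schema may differ.

```python
from typing import Any

# Hypothetical mapping of legacy key names to the standardized ones.
LEGACY_KEYS = {"score_before": "before", "score_after": "after"}


def normalize_score_fields(node: Any) -> Any:
    """Recursively rename legacy score keys in nested dicts and lists."""
    if isinstance(node, dict):
        renamed = {}
        for key, value in node.items():
            new_key = LEGACY_KEYS.get(key, key)
            # If the standardized key was already copied, skip the legacy duplicate.
            if key in LEGACY_KEYS and new_key in renamed:
                continue
            # A standardized key seen later overwrites a renamed legacy value.
            renamed[new_key] = normalize_score_fields(value)
        return renamed
    if isinstance(node, list):
        return [normalize_score_fields(item) for item in node]
    return node


# Illustrative input only; not taken from the real benchmark file.
sample = {"agents": [{"name": "example-agent", "score_before": 1, "score_after": 2}]}
fixed = normalize_score_fields(sample)
```

When a record carries both spellings, this sketch keeps the value stored under the standardized name, on the assumption that it is the more recent one.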

Reports

| Report | Description |
| --- | --- |
| `analysis_results.json` | IF score analysis output |
| `optimization_summary.md` | Model assignment optimization recommendations |
| `recommendations_summary.md` | Condensed recommendation matrix |
| `research_report.md` | Full research findings before dashboard integration |
| `quick_findings.txt` | Bullet-point summary for quick reference |
| `patch.json` | Proposed data patches before application |

Data Files

| File | Description |
| --- | --- |
| `model-research-test.json` | Test fixture for dashboard rendering |
| `research-dashboard-2026_04_27.html` | Previous dashboard build (superseded by 2026_04_29) |

When to Reuse

These scripts may be useful for:

- Future data migrations when the model-benchmarks.json schema changes
- Re-adding fallback configurations after resets
- Benchmarking new model providers
- Diagnostic checks when the dashboard fails to render

Migration Notes

All fixes applied by these scripts have been merged into the production data:

- model-benchmarks.json contains current_model_id on every agent
- capability-index.yaml contains fallback_models and failover_strategy
- Field names standardized to before/after (not score_before/score_after)
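
The JSON-side notes above could be spot-checked with something like the following sketch. It assumes the agents live under a top-level `agents` mapping of agent name to record, which is a guess at the layout rather than the documented schema. The helper names `find_legacy_keys` and `check_migration` are hypothetical, and the `capability-index.yaml` check is left out because it would need a YAML parser.

```python
from typing import Any


def find_legacy_keys(node: Any) -> bool:
    """Return True if a legacy score key appears anywhere in the nested data."""
    if isinstance(node, dict):
        return any(
            key in ("score_before", "score_after") or find_legacy_keys(value)
            for key, value in node.items()
        )
    if isinstance(node, list):
        return any(find_legacy_keys(item) for item in node)
    return False


def check_migration(benchmarks: dict) -> list[str]:
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    # Assumed layout: {"agents": {"<agent name>": {...record...}}}
    for name, record in benchmarks.get("agents", {}).items():
        model_id = record.get("current_model_id")
        if not isinstance(model_id, str) or not model_id:
            problems.append(f"{name}: missing current_model_id")
        if find_legacy_keys(record):
            problems.append(f"{name}: legacy score_before/score_after key found")
    return problems


# Illustrative data only; agent and model names are placeholders.
example = {
    "agents": {
        "planner": {"current_model_id": "example-model", "scores": {"before": 1, "after": 2}},
        "coder": {"scores": [{"score_before": 1}]},
    }
}
issues = check_migration(example)
```

To run it against the real file, load it with `json.load` and pass the resulting dict to `check_migration`, adjusting the `agents` lookup if the actual structure differs.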