- real-fit.html: API-driven research dashboard with agent/model heatmap, detail modal with score breakdown and evaluator commentary
- api.py: FastAPI backend serving /api/real-fit-report (dynamic from SQLite), /api/research, /api/evolve-agent/start
- rebuild-report.py: generates real-fit-report.json from SQLite DB for static fallback
- docker-compose.yml: add evolution-api service (Python 3.12, uvicorn) for research endpoints
- index.standalone.html: sync with dashboard data updates
- archive/index.html: standalone dashboard snapshot (263KB)
- .gitignore: exclude *.db, research-jobs.json from tracking
Agent Evolution Archive
This directory contains temporary scripts, reports, and artifacts from the April 2026 model research and data migration work.
Structure
- `scripts/`: One-off Python/Node scripts used for data migration, diagnostics, and fixes
- `reports/`: Intermediate analysis outputs (JSON, Markdown, text)
- `data/`: Test data files and old dashboard builds
Notable Scripts
| Script | Purpose |
|---|---|
| `add_current_model_id.py` | Adds `current_model_id` string field to agent scores (replaces fragile index lookup) |
| `add_fallback_models.py` | Injects `fallback_models` list into `model-benchmarks.json` agents |
| `add_groq_fallbacks.py` | Adds Groq provider fallback configuration with rate limits |
| `analyze_if_scores.py` | Analyzes IF (Inverse FLOPs) scores across model matrix |
| `check_fixes.py` | Validates fixes applied to `model-benchmarks.json` |
| `fix_model_benchmarks.py` | Corrects field name mismatches (`before`/`after` vs `score_before`/`score_after`) |
| `update_failover_strategies.py` | Adds `failover_strategy` blocks to `capability-index.yaml` |
| `validate_fixes.py` | Full validation suite for `model-benchmarks.json` structure |
| `verify_fallback.py` | Checks fallback model existence in benchmark data |
| `verify_fixes_simple.py` | Quick structure verification after fixes |
| `_validate_benchmarks.py` | JSON schema validation runner |
| `rebuild-template.js` | ES module version of `rebuild-template.cjs` (superseded) |
| `sync-benchmarks-from-yaml.js` | ES module version of sync script (superseded) |
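
For readers who have not opened the scripts, the field-name correction described for `fix_model_benchmarks.py` could look roughly like the sketch below. This is not the archived script itself: the function name `normalize_field_names` and the idea of walking the whole JSON tree are assumptions made for illustration, and only the key names come from the table.

```python
# Legacy-to-current key names, taken from the table above.
RENAMES = {"score_before": "before", "score_after": "after"}


def normalize_field_names(node):
    """Return a copy of `node` with legacy keys renamed at any nesting depth.

    Hypothetical helper: the real layout of model-benchmarks.json is not
    documented here, so the function does not assume any particular shape.
    """
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            # If both spellings are present, keep the current-style value.
            if key in RENAMES and RENAMES[key] in node:
                continue
            out[RENAMES.get(key, key)] = normalize_field_names(value)
        return out
    if isinstance(node, list):
        return [normalize_field_names(item) for item in node]
    return node
```

The function returns a new structure instead of editing in place, so the original data stays available for a before/after comparison.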
Reports
| Report | Description |
|---|---|
| `analysis_results.json` | IF score analysis output |
| `optimization_summary.md` | Model assignment optimization recommendations |
| `recommendations_summary.md` | Condensed recommendation matrix |
| `research_report.md` | Full research findings before dashboard integration |
| `quick_findings.txt` | Bullet-point summary for quick reference |
| `patch.json` | Proposed data patches before application |
Data Files
| File | Description |
|---|---|
| `model-research-test.json` | Test fixture for dashboard rendering |
| `research-dashboard-2026_04_27.html` | Previous dashboard build (superseded by 2026_04_29) |
When to Reuse
These scripts may be useful for:
- Future data migrations when the `model-benchmarks.json` schema changes
- Re-adding fallback configurations after resets
- Benchmarking new model providers
- Diagnostic checks when the dashboard fails to render
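
As an illustration of the "re-adding fallback configurations" case, the sketch below attaches a `fallback_models` list to each agent in a way that is safe to run more than once. It is written under assumptions: the agent `id` key, the shape of the `fallbacks_by_agent` mapping, and the function name are hypothetical, and only `fallback_models` and `current_model_id` are field names mentioned in this README.

```python
def add_fallback_models(agents, fallbacks_by_agent):
    """Append missing fallback model IDs to each agent in place.

    `agents` is assumed to be a list of dicts with a hypothetical "id" key.
    `fallbacks_by_agent` maps that ID to an ordered list of model IDs.
    Returns the number of entries actually added.
    """
    added = 0
    for agent in agents:
        wanted = fallbacks_by_agent.get(agent.get("id"), [])
        existing = agent.setdefault("fallback_models", [])
        for model_id in wanted:
            # Skip duplicates, and never list the active model as its own fallback.
            if model_id == agent.get("current_model_id") or model_id in existing:
                continue
            existing.append(model_id)
            added += 1
    return added
```

Because existing entries are skipped, a second run over already-patched data adds nothing, which matters when a reset has only partially wiped the configuration.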
Migration Notes
All fixes applied by these scripts have been merged into the production data:
- `model-benchmarks.json` contains `current_model_id` on every agent
- `capability-index.yaml` contains `fallback_models` and `failover_strategy`
- Field names standardized to `before`/`after` (not `score_before`/`score_after`)
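
The migrated state listed above can be spot-checked with a few lines of Python. The sketch below is not one of the archived validators. It assumes a top-level `agents` list in the loaded JSON, which this README does not confirm, and the names `find_legacy_keys` and `check_migrated` are hypothetical.

```python
LEGACY_KEYS = {"score_before", "score_after"}


def find_legacy_keys(node, path="$"):
    """Yield the path of every legacy key found anywhere in the structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key in LEGACY_KEYS:
                yield child
            yield from find_legacy_keys(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_legacy_keys(item, f"{path}[{index}]")


def check_migrated(data):
    """Return a list of human-readable problems; an empty list means clean."""
    problems = []
    # Assumption: agents live in a top-level "agents" list.
    for index, agent in enumerate(data.get("agents", [])):
        model_id = agent.get("current_model_id")
        if not isinstance(model_id, str) or not model_id:
            problems.append(f"$.agents[{index}]: missing current_model_id")
    problems.extend(f"{path}: legacy field name" for path in find_legacy_keys(data))
    return problems
```

To run it against real data, load the file with `json.load` and pass the result to `check_migrated`; any returned line points at the offending agent or key.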