
Agent Evolution Archive

This directory contains temporary scripts, reports, and artifacts from the April 2026 model research and data migration work.

Structure

  • scripts/ — One-off Python/Node scripts used for data migration, diagnostics, and fixes
  • reports/ — Intermediate analysis outputs (JSON, Markdown, text)
  • data/ — Test data files and old dashboard builds

Notable Scripts

| Script | Purpose |
| --- | --- |
| add_current_model_id.py | Adds current_model_id string field to agent scores (replaces fragile index lookup) |
| add_fallback_models.py | Injects fallback_models list into model-benchmarks.json agents |
| add_groq_fallbacks.py | Adds Groq provider fallback configuration with rate limits |
| analyze_if_scores.py | Analyzes IF (Inverse FLOPs) scores across model matrix |
| check_fixes.py | Validates fixes applied to model-benchmarks.json |
| fix_model_benchmarks.py | Corrects field name mismatches (before/after vs score_before/score_after) |
| update_failover_strategies.py | Adds failover_strategy blocks to capability-index.yaml |
| validate_fixes.py | Full validation suite for model-benchmarks.json structure |
| verify_fallback.py | Checks fallback model existence in benchmark data |
| verify_fixes_simple.py | Quick structure verification after fixes |
| _validate_benchmarks.py | JSON schema validation runner |
| rebuild-template.js | ES module version of rebuild-template.cjs (superseded) |
| sync-benchmarks-from-yaml.js | ES module version of sync script (superseded) |
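
To give a feel for the kind of fix these scripts applied, a field-name correction such as the one attributed to fix_model_benchmarks.py could look roughly like the sketch below. This is not the archived script. The rename direction follows the Migration Notes in this README (score_before/score_after become before/after); the function name normalize_score_fields and the sample record are assumptions made for illustration.

```python
from typing import Any

# Legacy-to-current field names, per the Migration Notes in this README.
RENAMES = {"score_before": "before", "score_after": "after"}


def normalize_score_fields(node: Any) -> Any:
    """Return a copy of `node` with legacy score field names renamed.

    Walks nested dicts and lists, so it does not depend on where the
    score fields sit in the file (the real layout is not documented here).
    """
    if isinstance(node, dict):
        return {
            RENAMES.get(key, key): normalize_score_fields(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [normalize_score_fields(item) for item in node]
    return node


# Hypothetical record, for illustration only.
sample = {
    "agents": [
        {"name": "example-agent", "score_before": 71, "score_after": 84}
    ]
}
fixed = normalize_score_fields(sample)
print(fixed)
```

The function builds a new structure rather than editing in place, so the original data stays available for a before/after diff. If a record carried both the legacy and the current name for the same field, the later key would win; a real migration would probably want to flag that case instead.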

Reports

| Report | Description |
| --- | --- |
| analysis_results.json | IF score analysis output |
| optimization_summary.md | Model assignment optimization recommendations |
| recommendations_summary.md | Condensed recommendation matrix |
| research_report.md | Full research findings before dashboard integration |
| quick_findings.txt | Bullet-point summary for quick reference |
| patch.json | Proposed data patches before application |

Data Files

| File | Description |
| --- | --- |
| model-research-test.json | Test fixture for dashboard rendering |
| research-dashboard-2026_04_27.html | Previous dashboard build (superseded by 2026_04_29) |

When to Reuse

These scripts may be useful for:

  • Future data migrations when model-benchmarks.json schema changes
  • Re-adding fallback configurations after resets
  • Benchmarking new model providers
  • Diagnostic checks when the dashboard fails to render

Migration Notes

All fixes applied by these scripts have been merged into the production data:

  • model-benchmarks.json contains current_model_id on every agent
  • capability-index.yaml contains fallback_models and failover_strategy
  • Field names standardized to before/after (not score_before/score_after)
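
The two model-benchmarks.json points above can be spot-checked with a few lines of Python. The sketch below is a hedged illustration, not one of the archived validators: it assumes the file has a top-level "agents" list of objects, and check_benchmarks and find_legacy_fields are hypothetical helper names. The capability-index.yaml check is left out to keep the example standard-library only.

```python
import json
from typing import Any

LEGACY_FIELDS = {"score_before", "score_after"}


def find_legacy_fields(node: Any, path: str = "$") -> list[str]:
    """Collect the paths of any legacy field names still present."""
    hits: list[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key in LEGACY_FIELDS:
                hits.append(f"{path}.{key}")
            hits.extend(find_legacy_fields(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_legacy_fields(item, f"{path}[{index}]"))
    return hits


def check_benchmarks(data: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems: list[str] = []
    # ASSUMPTION: agents live in a top-level "agents" list of objects.
    for index, agent in enumerate(data.get("agents", [])):
        if not isinstance(agent.get("current_model_id"), str):
            problems.append(f"agents[{index}]: missing current_model_id")
    problems.extend(f"legacy field at {p}" for p in find_legacy_fields(data))
    return problems


# Inline sample so the sketch runs without the real file.
sample = json.loads("""
{"agents": [
  {"name": "a", "current_model_id": "model-x", "before": 70, "after": 80},
  {"name": "b", "score_before": 60, "after": 75}
]}
""")
for problem in check_benchmarks(sample):
    print(problem)
```

To run it against a checkout, replace the inline sample with the parsed contents of model-benchmarks.json and adjust the "agents" lookup to match the actual layout.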