d1516f4856 chore: organize temporary research artifacts into archive

- Create agent-evolution/archive/ with scripts/, reports/, data/
- Move 11 Python migration/diagnostic scripts
- Move 7 intermediate report files (json, md, txt)
- Move test data and old dashboard builds
- Add archive/README.md with full index of contents
- Update .gitignore to exclude archive/scripts, reports, data
- Keep archive/README.md tracked for documentation

2026-04-29 21:14:23 +01:00

README.md

Agent Evolution Archive

This directory contains temporary scripts, reports, and artifacts from the April 2026 model research and data migration work.

Structure

- scripts/: One-off Python/Node scripts used for data migration, diagnostics, and fixes
- reports/: Intermediate analysis outputs (JSON, Markdown, text)
- data/: Test data files and old dashboard builds

Notable Scripts

Script	Purpose
`add_current_model_id.py`	Adds `current_model_id` string field to agent scores (replaces fragile index lookup)
`add_fallback_models.py`	Injects `fallback_models` list into `model-benchmarks.json` agents
`add_groq_fallbacks.py`	Adds Groq provider fallback configuration with rate limits
`analyze_if_scores.py`	Analyzes IF (Inverse FLOPs) scores across model matrix
`check_fixes.py`	Validates fixes applied to `model-benchmarks.json`
`fix_model_benchmarks.py`	Corrects field name mismatches (`before`/`after` vs `score_before`/`score_after`)
`update_failover_strategies.py`	Adds `failover_strategy` blocks to capability-index.yaml
`validate_fixes.py`	Full validation suite for `model-benchmarks.json` structure
`verify_fallback.py`	Checks fallback model existence in benchmark data
`verify_fixes_simple.py`	Quick structure verification after fixes
`_validate_benchmarks.py`	JSON schema validation runner
`rebuild-template.js`	ES module version of `rebuild-template.cjs` (superseded)
`sync-benchmarks-from-yaml.js`	ES module version of sync script (superseded)
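
The table notes that `add_current_model_id.py` replaces a fragile index lookup with a string field. The archived script itself is not reproduced here; the following is only a minimal sketch of the idea, assuming a hypothetical agent record with a `models` list and a `current_model_index` integer (both field names are assumptions, as are the placeholder model IDs). It shows why an index can go stale when the list is reordered while a string ID does not.

```python
from copy import deepcopy


def add_current_model_id(agent):
    """Return a copy of `agent` with `current_model_id` resolved from its index.

    Assumes hypothetical fields `models` (list of dicts with an `id`) and
    `current_model_index` (int); the real schema may differ.
    """
    updated = deepcopy(agent)
    index = updated["current_model_index"]
    updated["current_model_id"] = updated["models"][index]["id"]
    return updated


def current_model(agent):
    """Find the current model by its string ID, regardless of list order."""
    wanted = agent["current_model_id"]
    for model in agent["models"]:
        if model["id"] == wanted:
            return model
    raise KeyError(f"unknown model id: {wanted}")


# Illustrative data only; these names are placeholders, not real benchmark entries.
agent = {
    "name": "example-agent",
    "models": [{"id": "model-a"}, {"id": "model-b"}],
    "current_model_index": 1,
}
migrated = add_current_model_id(agent)

# Reordering the list breaks the index lookup but not the ID lookup.
migrated["models"].reverse()
stale = migrated["models"][migrated["current_model_index"]]["id"]
print(stale)                          # model-a (wrong after reordering)
print(current_model(migrated)["id"])  # model-b
```

The migration returns a copy rather than mutating its input, so it can be run against loaded data and compared with the original before anything is written back.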

Reports

Report	Description
`analysis_results.json`	IF score analysis output
`optimization_summary.md`	Model assignment optimization recommendations
`recommendations_summary.md`	Condensed recommendation matrix
`research_report.md`	Full research findings before dashboard integration
`quick_findings.txt`	Bullet-point summary for quick reference
`patch.json`	Proposed data patches before application

Data Files

File	Description
`model-research-test.json`	Test fixture for dashboard rendering
`research-dashboard-2026_04_27.html`	Previous dashboard build (superseded by 2026_04_29)

When to Reuse

These scripts may be useful for:

- Future data migrations when model-benchmarks.json schema changes
- Re-adding fallback configurations after resets
- Benchmarking new model providers
- Diagnostic checks when dashboard fails to render
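
As an illustration of the kind of check that is useful after re-adding fallback configurations, here is a small sketch in the spirit of `verify_fallback.py`. It is not the archived script: the agent structure (a `name` plus an optional `fallback_models` list) and the sample IDs are assumptions made for the example.

```python
def missing_fallbacks(agents, known_model_ids):
    """Map each agent name to the fallback model IDs that are not known.

    `agents` is assumed to be a list of dicts with a `name` and an optional
    `fallback_models` list; agents with no missing fallbacks are omitted.
    """
    known = set(known_model_ids)
    report = {}
    for agent in agents:
        missing = [m for m in agent.get("fallback_models", []) if m not in known]
        if missing:
            report[agent["name"]] = missing
    return report


# Placeholder data for demonstration; not taken from the real benchmark file.
agents = [
    {"name": "agent-one", "fallback_models": ["model-a", "model-b"]},
    {"name": "agent-two", "fallback_models": ["model-c"]},
    {"name": "agent-three"},
]
report = missing_fallbacks(agents, known_model_ids=["model-a", "model-b"])
print(report)  # {'agent-two': ['model-c']}
```

An empty report means every configured fallback refers to a model present in the benchmark data.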

Migration Notes

All fixes applied by these scripts have been merged into the production data:

- model-benchmarks.json contains current_model_id on every agent
- capability-index.yaml contains fallback_models and failover_strategy
- Field names standardized to before/after (not score_before/score_after)
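
The notes above can be turned into a quick regression check. The sketch below assumes that the parsed contents of model-benchmarks.json expose a top-level `agents` list (an assumption; the actual layout may differ) and reports any agent without a string `current_model_id` as well as any leftover `score_before`/`score_after` keys at any depth.

```python
LEGACY_KEYS = {"score_before", "score_after"}


def find_legacy_keys(node, path="$"):
    """Return the paths of all legacy field names found anywhere in `node`."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key in LEGACY_KEYS:
                found.append(child)
            found.extend(find_legacy_keys(value, child))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            found.extend(find_legacy_keys(item, f"{path}[{i}]"))
    return found


def check_benchmarks(data):
    """Return a list of human-readable problems; an empty list means the checks pass."""
    problems = []
    # Assumed layout: a top-level "agents" list of dicts.
    for i, agent in enumerate(data.get("agents", [])):
        if not isinstance(agent.get("current_model_id"), str):
            problems.append(f"agents[{i}]: missing current_model_id")
    problems.extend(f"legacy field at {p}" for p in find_legacy_keys(data))
    return problems


# Placeholder records for demonstration only.
good = {"agents": [{"current_model_id": "model-a", "scores": {"before": 1, "after": 2}}]}
bad = {"agents": [{"name": "a", "scores": {"score_before": 1}}]}
print(check_benchmarks(good))  # []
print(check_benchmarks(bad))
```

In practice `data` would come from `json.load` on the benchmarks file; the check is kept separate from file I/O so it can be exercised on small fixtures.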