chore: organize temporary research artifacts into archive

- Create agent-evolution/archive/ with scripts/, reports/, data/
- Move 11 Python migration/diagnostic scripts
- Move 7 intermediate report files (json, md, txt)
- Move test data and old dashboard builds
- Add archive/README.md with full index of contents
- Update .gitignore to exclude archive/scripts, reports, data
- Keep archive/README.md tracked for documentation
2026-04-29 21:14:23 +01:00
parent 3badb259cc
commit d1516f4856
2 changed files with 65 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -19,6 +19,11 @@ tests/reports/
 .kilo/EVOLUTION_LOG.md
 .kilo/WORKFLOW_AUDIT.md

+# Temporary research artifacts (one-off scripts, diagnostic outputs)
+agent-evolution/archive/scripts/
+agent-evolution/archive/reports/
+agent-evolution/archive/data/
+
 # Architect generated maps (can be large, auto-indexed)
 # Note: .architect/ md and json files ARE tracked for team orientation
 # Only maps/ with file graphs can be very large
--- a/agent-evolution/archive/README.md
+++ b/agent-evolution/archive/README.md
@@ -0,0 +1,60 @@
+# Agent Evolution Archive
+
+This directory contains temporary scripts, reports, and artifacts from the April 2026 model research and data migration work.
+
+## Structure
+
+- `scripts/` — One-off Python/Node scripts used for data migration, diagnostics, and fixes
+- `reports/` — Intermediate analysis outputs (JSON, Markdown, text)
+- `data/` — Test data files and old dashboard builds
+
+## Notable Scripts
+
+| Script | Purpose |
+|--------|---------|
+| `add_current_model_id.py` | Adds `current_model_id` string field to agent scores (replaces fragile index lookup) |
+| `add_fallback_models.py` | Injects `fallback_models` list into `model-benchmarks.json` agents |
+| `add_groq_fallbacks.py` | Adds Groq provider fallback configuration with rate limits |
+| `analyze_if_scores.py` | Analyzes IF (Inverse FLOPs) scores across model matrix |
+| `check_fixes.py` | Validates fixes applied to `model-benchmarks.json` |
+| `fix_model_benchmarks.py` | Corrects field name mismatches (`before`/`after` vs `score_before`/`score_after`) |
+| `update_failover_strategies.py` | Adds `failover_strategy` blocks to `capability-index.yaml` |
+| `validate_fixes.py` | Full validation suite for `model-benchmarks.json` structure |
+| `verify_fallback.py` | Checks fallback model existence in benchmark data |
+| `verify_fixes_simple.py` | Quick structure verification after fixes |
+| `_validate_benchmarks.py` | JSON schema validation runner |
+| `rebuild-template.js` | ES module version of `rebuild-template.cjs` (superseded) |
+| `sync-benchmarks-from-yaml.js` | ES module version of sync script (superseded) |
+
+## Reports
+
+| Report | Description |
+|--------|-------------|
+| `analysis_results.json` | IF score analysis output |
+| `optimization_summary.md` | Model assignment optimization recommendations |
+| `recommendations_summary.md` | Condensed recommendation matrix |
+| `research_report.md` | Full research findings before dashboard integration |
+| `quick_findings.txt` | Bullet-point summary for quick reference |
+| `patch.json` | Proposed data patches before application |
+
+## Data Files
+
+| File | Description |
+|------|-------------|
+| `model-research-test.json` | Test fixture for dashboard rendering |
+| `research-dashboard-2026_04_27.html` | Previous dashboard build (superseded by 2026_04_29) |
+
+## When to Reuse
+
+These scripts may be useful for:
+- Future data migrations when `model-benchmarks.json` schema changes
+- Re-adding fallback configurations after resets
+- Benchmarking new model providers
+- Diagnostic checks when the dashboard fails to render
+
+## Migration Notes
+
+All fixes applied by these scripts have been merged into the production data:
+- `model-benchmarks.json` contains `current_model_id` on every agent
+- `capability-index.yaml` contains `fallback_models` and `failover_strategy`
+- Field names are standardized to `before`/`after` (not `score_before`/`score_after`)
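
The end state listed under Migration Notes can be spot-checked with a short script. The following is a minimal sketch, not one of the archived scripts. It assumes a hypothetical layout in which `model-benchmarks.json` has a top-level `agents` mapping of agent name to record; the real schema is not shown in this commit, and the helper names `find_problems` and `_all_keys` are illustrative only.

```python
# ASSUMPTION: benchmarks look like {"agents": {"<name>": {...}}}.
# The actual structure of model-benchmarks.json may differ.
LEGACY_FIELDS = {"score_before", "score_after"}


def _all_keys(node) -> set:
    """Collect every dict key nested anywhere under node."""
    keys = set()
    if isinstance(node, dict):
        keys |= set(node)
        for value in node.values():
            keys |= _all_keys(value)
    elif isinstance(node, list):
        for item in node:
            keys |= _all_keys(item)
    return keys


def find_problems(benchmarks: dict) -> list:
    """Return readable problems; an empty list means both checks passed."""
    problems = []
    for name, agent in benchmarks.get("agents", {}).items():
        # Check 1: every agent carries a non-empty string current_model_id.
        model_id = agent.get("current_model_id")
        if not isinstance(model_id, str) or not model_id:
            problems.append(f"{name}: missing current_model_id")
        # Check 2: no legacy score_before/score_after keys remain.
        for field in sorted(LEGACY_FIELDS & _all_keys(agent)):
            problems.append(f"{name}: legacy field {field}")
    return problems
```

To run it against real data, load the file with `json.load` and pass the resulting dict to `find_problems`; adjust the `agents` lookup if the file is organized differently.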