feat(dashboard): unified data pipeline, verified benchmarks, and browser testing

- build-standalone-fixed.cjs: reads from 4 real sources (agents md, kilo-meta.json, model-benchmarks-verified.json, agent-versions.json); computes recommendations dynamically
- build-standalone-direct.cjs: direct data export + HTML embed pipeline
- dashboard-smoke-test.ts: Playwright E2E smoke test covering all 6 tabs
- model-benchmarks-verified.json: verified IF scores from artificialanalysis.ai for 15 models (SWE-bench unverifiable → null)
- agent-versions.json: 347 git history entries extracted for 34 agents
- kilo-meta.json: prompt-optimizer → qwen3.5-122b, memory-manager → deepseek-v4-pro-max
- index.html: Recommendations tab rendering updated for dynamic data
- Dockerfile + docker-compose.yml: mount-driven build, no image rebuild for data changes
- README.md: updated dashboard docs and verified benchmark sources
2026-05-25 21:05:14 +01:00
parent f9bed0f262
commit 9b0f160587
13 changed files with 4108 additions and 616 deletions
--- a/agent-evolution/Dockerfile
+++ b/agent-evolution/Dockerfile
@@ -16,9 +16,9 @@ WORKDIR /app
 # Placeholder content until host mounts the real index.standalone.html
 RUN echo '<!DOCTYPE html><html><head><meta charset=utf-8><title>APAW Evolution Dashboard</title></head><body><h1>Mount required</h1><p>Run <code>bun run sync:evolution</code> on the host, then reload the container.</p></body></html>' > index.html

-EXPOSE 3001
+EXPOSE 80

 HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
-  CMD wget --no-verbose --tries=1 --spider http://127.0.0.1:3001/ || exit 1
+  CMD wget --no-verbose --tries=1 --spider http://127.0.0.1:80/ || exit 1

-CMD ["python3", "-m", "http.server", "3001"]
+CMD ["python3", "-m", "http.server", "80"]