APAW/scripts/opencompass-setup.sh
Deploy Bot 397d8367e9 feat: milestone 78 — objective model evolution from benchmark research
- Reassign 29/30 agents based on capability-analyst web research
- deepseek-v4-pro: 14 agents (coding SOTA: SWE-bench 80.6%, LiveCodeBench 93.5%)
- minimax-m3:cloud: 8 agents (agentic: BrowseComp 83.5%, 12h autonomous)
- glm-5.1: 4 agents (CyberGym 68.7% SOTA, sustained rounds)
- minimax-m2.5:cloud: 2 agents (frontend productivity, 2.2M pulls)
- kimi-k2.6: 1 agent (ONLY true multimodal)
- Add OpenCompass evaluation container (docker, scripts) for future objective runs
- Evidence saved to agent-evolution/data/research-report.json (598 lines, 6 models)

Data gaps honestly documented: minimax-m3/m2.5, qwen3-coder, kimi-k2.6 benchmark tables are image-only on Ollama.
2026-06-01 20:50:10 +01:00


#!/usr/bin/env bash
set -euo pipefail
# OpenCompass dataset setup script
# Downloads required datasets on first run
DATA_DIR="/data"
ZIP_URL="https://github.com/InternLM/opencompass/releases/download/0.2.2/OpenCompassData-core-20240207.zip"
ZIP_FILE="${DATA_DIR}/OpenCompassData-core-20240207.zip"
MARKER="${DATA_DIR}/.datasets_ready"
if [[ -f "$MARKER" ]]; then
    echo "Datasets already present (${MARKER} exists). Skipping download."
    exit 0
fi
echo "Downloading OpenCompass core datasets ..."
mkdir -p "$DATA_DIR"
if command -v wget >/dev/null 2>&1; then
    wget -q --show-progress -O "$ZIP_FILE" "$ZIP_URL" || {
        echo "Error: Failed to download datasets from ${ZIP_URL}" >&2
        exit 1
    }
else
    echo "Error: wget not found. Cannot download datasets." >&2
    exit 1
fi
echo "Extracting datasets ..."
unzip -q "$ZIP_FILE" -d "$DATA_DIR" || {
    echo "Error: Failed to extract ${ZIP_FILE}" >&2
    exit 1
}
touch "$MARKER"
echo "Datasets ready in ${DATA_DIR}."
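
The script above checks for wget before downloading, but a missing unzip only shows up after the archive has already been fetched. A minimal sketch of a preflight check, assuming it is acceptable to fail before any network traffic, might look like the following. The helper names require_tools and datasets_ready are hypothetical and do not appear in the original script; only the .datasets_ready marker name is taken from it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the original script): report every
# missing command in one pass and return non-zero if any is absent.
require_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "Error: ${tool} not found." >&2
            missing=1
        fi
    done
    return "$missing"
}

# Hypothetical helper: succeeds when the marker file that the setup
# script writes after extraction exists in the given data directory.
datasets_ready() {
    local data_dir="$1"
    [[ -f "${data_dir}/.datasets_ready" ]]
}
```

One possible use is to call `require_tools wget unzip` right after the marker check, so that the script stops with a clear message before the download starts. Because the script runs under `set -e`, a non-zero return from the helper ends the run without further handling.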