# GNS-2: Process Continuity Rules

## Problem

The pipeline repeatedly broke in early GNS-2 phases because of:

1. **service_healthy deadlock** (docker-compose.yml): a container couldn't start because it was waiting for its own healthcheck to pass before it was running
2. **Network overlap**: subnet 172.28.0.0/16 conflicted with existing Docker networks
3. **Hardcoded IPs and ports**: rigid Docker setups caused conflicts with host networks
4. **Operator dependency**: the process stopped whenever it hit a technical barrier and required a human decision

## Root Cause

| Failure | Why it happened | Operator-Free Fix |
|---------|-----------------|-------------------|
| `service_healthy` deadlock | Docker Compose blocked startup waiting for a healthcheck on a container that wasn't yet running | Use `condition: service_started` for `depends_on` |
| Subnet `172.28.0.0/16` conflict | Hardcoded subnet overlapped with host Docker networks | Remove `ipam` config, let Docker auto-assign |
| Rigid container configs | Inflexible Docker Compose setups caused conflicts with host networks | Use dynamic networking and auto-assigned IPs |
| `/health` endpoint mismatch | Container used unstable `/health` endpoint | Probe guaranteed endpoints or use `service_started` |

## Operator-Free Design Principles

### 1. No `service_healthy` Conditions

```yaml
# PROBLEM: deadlock
depends_on:
  service:
    condition: service_healthy  # Container waits for itself

# FIX: allow startup, healthcheck as observer only
depends_on:
  service:
    condition: service_started
```

### 2. No Hardcoded Networks

```yaml
# PROBLEM: overlap
networks:
  gns-network:
    ipam:
      config:
        - subnet: 172.28.0.0/16  # May conflict

# FIX: Docker auto-assigns
networks:
  gns-network:
    driver: bridge
```

### 3. Use Direct REST API

```typescript
// Direct REST client: fast, simple, no extra layers
const client = new GiteaClient({
  apiUrl: config.giteaApiUrl,
  token: config.giteaToken,
})
// All operations go directly to Gitea API via HTTP/REST
```

### 4. Pre-flight Validation

Before starting containers, validate prerequisites:

```bash
# Check if port is free, if not use another
# (any HTTP response on 3000 means the port is already taken)
PORT=3000
curl -s -o /dev/null http://localhost:3000/health && PORT=3001

# Check the network doesn't already exist; remove it if it does
docker network ls | grep -q my-network && docker network rm my-network

# Check env vars are set
[ -z "$GITEA_TOKEN" ] && echo "WARNING: GITEA_TOKEN not set"
```

### 5. Self-Documenting Failures

If the process must stop, write an explicit "why" and "what to do" to both:

- Console output (human readable)
- Gitea issue comment (machine readable, includes `GNS_EVENT`)

```markdown
## 🚫 Agent Blocked

**Reason**: Gitea API not reachable
**Action**: Check `GITEA_API_URL` and `GITEA_TOKEN` environment variables
**Fallback**: Operations will use local file logging until API is available
```

## Implementation Checklist

For every new container/service:

- [ ] No `service_healthy` conditions in `depends_on`
- [ ] No hardcoded subnets or IPs
- [ ] Environment variables have safe fallbacks for startup
- [ ] Error boundaries in all async operations (try/catch)
- [ ] Error messages include both "what happened" and "next step"
- [ ] All operator-required steps are documented as a checklist in the issue body

## GNS-2 Event Format for Failures

```html
```

## Reference

- Docker Compose `depends_on` behavior: https://docs.docker.com/compose/startup-order/
- Gitea API: `.kilo/shared/gitea-api.md`
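
As an illustration of the pre-flight and self-documenting-failure principles, here is a minimal TypeScript sketch that checks for unset environment variables and renders the "Agent Blocked" message. The names `BlockedReport`, `formatBlockedReport`, and `missingEnvVars` are hypothetical helpers, not part of GNS-2 or the Gitea API. The sample `GITEA_API_URL` value is a placeholder.

```typescript
// Hypothetical shape for a blocked-agent report. The fields mirror the
// "Agent Blocked" template; this is not an existing GNS-2 type.
interface BlockedReport {
  reason: string;   // what happened
  action: string;   // next step for the operator
  fallback: string; // what the process does in the meantime
}

// Render the report as a markdown body that can be printed to the console
// or posted as an issue comment.
function formatBlockedReport(report: BlockedReport): string {
  return [
    "## 🚫 Agent Blocked",
    "",
    `**Reason**: ${report.reason}`,
    `**Action**: ${report.action}`,
    `**Fallback**: ${report.fallback}`,
  ].join("\n");
}

// Pre-flight check: return the names of required variables that are
// unset or empty in the given environment map.
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: string[],
): string[] {
  return required.filter((name) => !env[name]);
}

// Example: GITEA_TOKEN is missing, so the report says why and what to do.
// The URL below is a placeholder value for illustration only.
const missingVars = missingEnvVars(
  { GITEA_API_URL: "http://localhost:3000/api/v1" },
  ["GITEA_API_URL", "GITEA_TOKEN"],
);
const blockedReport = formatBlockedReport({
  reason: `Missing environment variables: ${missingVars.join(", ")}`,
  action: "Check `GITEA_API_URL` and `GITEA_TOKEN` environment variables",
  fallback: "Operations will use local file logging until API is available",
});
console.log(blockedReport);
```

Passing the environment in as a plain map keeps the check independent of any particular runtime. In a Node.js process the caller would likely pass `process.env`.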