
Docker Deployment Guide

Run the full Colony pipeline with docker-compose up. The topology consists of singleton services (sprint-master, monitor, webhook-receiver) plus per-repo worker containers that process all task types (analyze, develop, review, merge, plan).

  • Docker Engine 20+ and Docker Compose v2
  • A GitHub token (PAT or GitHub App)
  • An Anthropic API key
  • A colony.config.yaml tailored for your repo(s)
# 1. Copy and fill in environment variables
cp .env.example .env
# Edit .env — set GITHUB_TOKEN and ANTHROPIC_API_KEY
# 2. Prepare your colony.config.yaml
# Workers clone repos at startup — workspace.repo_dir is overridden at runtime.
# Only github.app.private_key_path needs to use the Docker volume path:
# github.app.private_key_path → /colony/keys/github-app.pem
# 3. Build the image
docker-compose build
# 4. Start all agents
docker-compose up -d
# 5. View logs
docker-compose logs -f
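Before step 4 it can help to confirm that .env actually contains the required values. The helper below is a minimal sketch, not part of Colony: the function name missing_env_vars is hypothetical, and it assumes plain NAME=value lines without an export prefix or quoting.

```shell
# Print the name of each requested variable that is absent or empty in an env file.
# Assumes plain NAME=value lines (no "export" prefix, no quoting).
missing_env_vars() {
  local env_file="$1"
  shift
  local name
  for name in "$@"; do
    # A variable counts as set only if at least one character follows the "=".
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "$name"
    fi
  done
}
```

For example, `missing_env_vars .env ANTHROPIC_API_KEY GITHUB_TOKEN` prints nothing when both are set. A missing GITHUB_TOKEN is expected if you use GitHub App auth instead.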
| Variable | Required | Description |
| --- | --- | --- |
| GITHUB_TOKEN | Yes* | GitHub personal access token. Not needed if using App auth. |
| ANTHROPIC_API_KEY | Yes | Anthropic API key for Claude Code CLI. |
| DATABASE_URL | No* | Postgres connection string. Do not set in .env for Docker Compose: docker-compose.yml sets this automatically using the bundled Postgres. Only set manually when using an external database. |
| COLONY_CONFIG | No | Base64-encoded colony.config.yaml. If set, the entrypoint writes it to /colony/colony.config.yaml. Useful for cloud deployments where volume mounts are inconvenient. |
| COLONY_AGENT | No | Which agent to run. Set per service in docker-compose.yml. Values: sprint-master, monitor, webhook-receiver, cli, <name>-worker-<N> (e.g., colony-worker-1). |
| COLONY_REPO | No | Scopes a worker to a single repository (e.g., your-org/your-repo). Set per worker service in multi-repo deployments. |
| WEBHOOK_SECRET | No | GitHub webhook secret for signature validation. Required when running the webhook-receiver service. |
| NODE_OPTIONS | No | Node.js runtime flags (e.g., --max-old-space-size=4096). |
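For COLONY_CONFIG, the value has to be the base64 form of your config file. The sketch below shows one way to produce and sanity-check it. The function names are hypothetical, and it assumes a base64 tool that accepts -d for decoding (GNU coreutils and recent macOS do). Emitting the value on a single line is a conservative choice, since the source does not say whether the entrypoint tolerates wrapped output.

```shell
# Encode a config file as single-line base64 suitable for an env var.
encode_config() {
  base64 < "$1" | tr -d '\n'
}

# Decode a value again; useful for checking it before deploying.
decode_config() {
  printf '%s' "$1" | base64 -d
}
```

A typical use would be `COLONY_CONFIG="$(encode_config colony.config.yaml)"`, followed by `decode_config "$COLONY_CONFIG"` to confirm the round trip.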
/colony/colony.config.yaml — config file (mount read-only)
/colony/keys/github-app.pem — GitHub App A PEM (mount read-only, if using App auth)
/colony/keys/github-ops-app.pem — GitHub App B PEM (mount read-only, if using dual App setup)
/colony/workspaces/ — worktree base_dir (read-write, named volume — sprint-master and webhook-receiver only; monitor does not need this mount)
/colony/repos/{owner}/{repo}/ — worker clone path (created automatically at startup)

Workers clone their target repos automatically at startup to /colony/repos/{owner}/{repo}. No host-mounted repo directory is needed for worker containers. The workspace.repo_dir and workspace.base_dir values in your config are overridden at runtime by the clone-setup process.

# colony.config.yaml: workspace.repo_dir is ignored by workers in container mode
repos:
  - owner: my-org
    repo: my-app
    workers:
      pool_size: 2 # Safe: each worker container gets its own isolated clone
      health_port_start: 9200
    workspace:
      repo_dir: ~/git/my-app # Used by native mode only; workers override this
      base_dir: ~/.colony/workspaces # Used by native mode only; workers override this

Workers clone via HTTPS using the system-wide git credential helper configured in the Dockerfile. After cloning, workers run mise install (if available) and the repo’s workspace.setup_command (defaults to npm install).
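If a worker fails during startup, it can be useful to reproduce the post-clone steps by hand in a local checkout. The following is an illustrative sketch of the sequence described above, not Colony's actual entrypoint code. The function name run_repo_setup is hypothetical, and treating a mise failure as non-fatal is an assumption.

```shell
# Rough equivalent of the post-clone steps: optional mise install, then the setup command.
run_repo_setup() {
  local repo_dir="$1"
  local setup_command="${2:-npm install}"   # documented default for workspace.setup_command
  (
    cd "$repo_dir" || exit 1
    # mise is optional; only run it when the binary is on PATH.
    if command -v mise >/dev/null 2>&1; then
      # Assumption: a mise failure is reported but does not abort the setup.
      mise install || echo "mise install failed; continuing" >&2
    fi
    sh -c "$setup_command"
  )
}
```

Calling `run_repo_setup /path/to/clone` uses npm install. Pass a second argument to try a different setup command.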

Container restart behavior: Restarting a worker container produces a fresh clone with no stale worktree artifacts. This means pool_size > 1 works safely — each container gets its own .git/ state with no shared-git concurrency issues.

Disk usage: Each worker container needs enough ephemeral storage for a full repo clone plus worktrees created during task execution.

If using GitHub App authentication, mount the PEM file and update the config path:

# colony.config.yaml
github:
  app:
    app_id: 123456
    private_key_path: /colony/keys/github-app.pem
    installation_id: 78901234

# docker-compose.yml: uncomment the keys volume
volumes:
  - ~/.colony/keys/github-app.pem:/colony/keys/github-app.pem:ro

Host → container path mapping: Place the PEM file at ~/.colony/keys/github-app.pem on the host (matching the native install convention). The volume mount maps it to /colony/keys/github-app.pem inside the container, which is the path to set in private_key_path for Docker deployments.

If using a second GitHub App (colony-ops) for autonomous merging, mount a second PEM file and add the ops_app block to your config:

# colony.config.yaml
github:
  app:
    app_id: 111111
    private_key_path: /colony/keys/github-app.pem
    installation_id: 11111111
  ops_app:
    app_id: 222222
    private_key_path: /colony/keys/github-ops-app.pem
    installation_id: 22222222
review:
  auto_merge_on_approval: true

# docker-compose.override.yml
services:
  sprint-master:
    volumes:
      - ~/.colony/keys/github-app.pem:/colony/keys/github-app.pem:ro
      - ~/.colony/keys/github-ops-app.pem:/colony/keys/github-ops-app.pem:ro
  worker:
    volumes:
      # No repo mount needed: workers clone repos at startup
      - ~/.colony/keys/github-app.pem:/colony/keys/github-app.pem:ro
      - ~/.colony/keys/github-ops-app.pem:/colony/keys/github-ops-app.pem:ro
  webhook-receiver:
    volumes:
      - ~/.colony/keys/github-app.pem:/colony/keys/github-app.pem:ro
  monitor:
    volumes:
      - ~/.colony/keys/github-app.pem:/colony/keys/github-app.pem:ro

Place App B’s PEM at ~/.colony/keys/github-ops-app.pem on the host:

mv ~/Downloads/<ops-app-name>.*.private-key.pem ~/.colony/keys/github-ops-app.pem
chmod 600 ~/.colony/keys/github-ops-app.pem
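A quick check that the key file exists with the intended permissions can save a confusing auth failure later. This is a small sketch; the function name pem_mode_ok is hypothetical.

```shell
# Succeeds only if the path is a regular file whose mode is exactly 600.
pem_mode_ok() {
  [ -n "$(find "$1" -maxdepth 0 -type f -perm 600 2>/dev/null)" ]
}
```

For example, `pem_mode_ok ~/.colony/keys/github-ops-app.pem || echo "fix permissions"`. To also confirm that the file parses as an RSA private key, `openssl rsa -in ~/.colony/keys/github-ops-app.pem -check -noout` should work, assuming the key is in RSA PEM format.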

See docs/github-app-setup.md for the full dual App walkthrough.

Each service exposes an HTTP health endpoint. Default ports:

| Service | Port | Notes |
| --- | --- | --- |
| sprint-master | 9100 | Singleton |
| monitor | 9106 | Singleton (also serves dashboard) |
| webhook-receiver | 9800 | Singleton |
| worker | 9200+ | Per-repo; sequential from repos[].workers.health_port_start |

The docker-compose.yml health checks use these defaults. If you override health ports in your config YAML, update the health check URLs and port mappings in docker-compose.yml to match.

# Check health manually
curl http://localhost:9100/health
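To probe several services in one go, the curl call can be wrapped in a small function. This is a sketch with a hypothetical function name. It assumes the default ports listed above are published to the host, and that any HTTP success status from /health means the service is up.

```shell
# Probe one service's health endpoint on localhost and report the result.
check_health() {
  local name="$1" port="$2"
  # -f turns HTTP error statuses into a non-zero exit; --max-time bounds a hung connection.
  if curl -fsS --max-time 3 "http://localhost:${port}/health" >/dev/null 2>&1; then
    echo "${name}: ok"
  else
    echo "${name}: unreachable"
    return 1
  fi
}
```

With the defaults, that would be `check_health sprint-master 9100`, `check_health monitor 9106`, `check_health webhook-receiver 9800`, and `check_health worker 9200`.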

The webhook-receiver service provides instant GitHub event dispatch to agents, eliminating up to 30s polling delays.

1. Generate a webhook secret

openssl rand -hex 32

Add the result to your .env file:

WEBHOOK_SECRET=<generated-secret>

2. Configure the webhook in your colony config

# colony.config.yaml
webhook:
  enabled: true
  secret_env: WEBHOOK_SECRET
  port: 9800

3. Add the webhook in GitHub

In your GitHub repository (or organization) settings, add a webhook:

  • Payload URL: http://<your-host>:9800/webhook
  • Content type: application/json
  • Secret: the same value you set in WEBHOOK_SECRET
  • Events: select Let me select individual events and enable:
    • Issues
    • Pull requests
    • Pull request reviews

4. Verify the receiver

The webhook-receiver service starts automatically with docker-compose up -d. Verify it is healthy:

curl http://localhost:9800/health
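Beyond the health check, you can exercise signature validation with a locally signed request. GitHub signs each delivery with HMAC-SHA256 over the raw body and sends the result in the X-Hub-Signature-256 header as sha256=<hex>. The sketch below reproduces that signature with openssl. The function names and the test payload are hypothetical, and how the receiver responds to a ping event is an assumption not confirmed by this guide.

```shell
# Compute the X-Hub-Signature-256 value GitHub would send for a payload.
sign_payload() {
  local secret="$1" payload="$2"
  printf '%s' "$payload" \
    | openssl dgst -sha256 -hmac "$secret" \
    | awk '{print "sha256=" $NF}'
}

# POST a signed "ping" event and print the HTTP status code (hypothetical smoke test).
send_test_ping() {
  local url="$1" secret="$2" payload='{"zen":"smoke test"}'
  curl -sS -o /dev/null -w '%{http_code}\n' \
    -H 'Content-Type: application/json' \
    -H 'X-GitHub-Event: ping' \
    -H "X-Hub-Signature-256: $(sign_payload "$secret" "$payload")" \
    --data "$payload" "$url"
}
```

For example, `send_test_ping http://localhost:9800/webhook "$WEBHOOK_SECRET"`. Repeating the call with a wrong secret should produce a different status code if validation is active.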

In Docker, services communicate using Docker service names, not localhost. If you set explicit dispatch URLs in your config, use service names (e.g., http://sprint-master:9100/wake, http://worker:9200/health). When using the defaults derived from health_port values, services dispatch to the host network port, which works when all services share the default bridge network.

All agents continue to poll GitHub on their configured interval. Webhooks are additive — polling remains the fallback for environments where inbound HTTP is not feasible (local development, firewalled deployments).

Run ad-hoc CLI commands against the running deployment:

# colony status
docker-compose run --rm -e COLONY_AGENT=cli sprint-master colony status
# Or use docker exec on a running container
docker exec colony-sprint-master colony status
# Migrate old config to worker pool format
docker exec colony-sprint-master colony config migrate --config /colony/colony.config.yaml

# Build the image directly with docker build
docker build .

The default build runs npm test during the build stage. To skip tests for faster builds, target the build stage directly:

docker build --target build .

# Rebuild without the layer cache, then restart the services
docker-compose build --no-cache
docker-compose up -d
# All services
docker-compose logs -f
# Single service
docker-compose logs -f worker
# Last 100 lines
docker-compose logs --tail=100 sprint-master
# Stop all agents
docker-compose down
# Stop and remove volumes (deletes worktrees)
docker-compose down -v

Agent won’t start (“No config found”): Mount your colony.config.yaml at /colony/colony.config.yaml or set the COLONY_CONFIG env var.

Developer agent fails to create worktrees: Workers clone repos automatically at startup. If worktree creation fails, check network connectivity and GitHub App auth (the clone may have failed). For native deployments, ensure the target repo is cloned and accessible at the path in workspace.repo_dir.

Health check failing: Verify the health port in your config matches the port in the docker-compose.yml health check. Default ports: sprint-master 9100, worker 9200+, monitor 9106, webhook-receiver 9800.
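When diagnosing a failing health check, it helps to see the status Docker itself has recorded for the container. The sketch below polls docker inspect until the container reports healthy. The function name is hypothetical, and it only works for containers that define a health check. The container name colony-sprint-master is taken from the CLI examples in this guide; names of other containers may differ in your setup.

```shell
# Poll Docker's recorded health status for a container until it reports "healthy".
wait_healthy() {
  local container="$1" attempts="${2:-30}" delay="${3:-2}" status i
  for i in $(seq 1 "$attempts"); do
    status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)"
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    sleep "$delay"
  done
  # Report the last observed status to help with diagnosis.
  echo "${container}: last status '${status:-unknown}'" >&2
  return 1
}
```

For example, `wait_healthy colony-sprint-master 15` waits up to about 30 seconds with the default 2-second delay.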

Claude Code CLI not found: The production image installs @anthropic-ai/claude-code globally. If builds fail at that step, check npm registry access from your build environment.

Permission denied on /colony/workspaces: The named volume is owned by root by default, and the Node.js process runs as root in the container. If you mount a host directory instead, ensure it’s writable by UID 0 (or run the container with a matching user).