# Colony Deployment Guide

Set up Colony as a dedicated agent swarm on a Mac server (or similar hardware). This guide covers everything from a bare machine to a fully operational, monitored deployment.
Audience: Technical founders through DevOps engineers. Each section has “skip if” markers — jump past what you’ve already done.
Three deployment modes:

- **Docker** (recommended) — containerized agents via `docker compose`
- **Apple Container** (macOS) — containerized agents via Apple’s lightweight VM runtime
- **Native** — agents run directly as Node.js processes
Coming soon: Pre-built images on GHCR will enable `docker compose pull && docker compose up` without cloning the Colony source.
First time? Start with the Getting Started Guide for a quick evaluation setup. This guide is for production deployment on dedicated hardware.
## Phase 1: Foundation

Run `scripts/setup-mac.sh` to automate this phase, or follow the manual steps below.

```shell
./scripts/setup-mac.sh
```

The script is idempotent — safe to re-run at any time. It checks each step before acting and prints a color-coded summary at the end.
### 1.1 Xcode Command Line Tools

Skip if: `xcode-select -p` prints a path.

```shell
xcode-select --install
```

A dialog will appear — click “Install” and wait for it to finish. This provides git, make, and other build essentials.
### 1.2 Homebrew

Skip if: `brew --version` works.

```shell
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```

On Apple Silicon, Homebrew installs to `/opt/homebrew`. Add it to your shell profile if it’s not already on PATH:

```shell
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"
```

### 1.3 GitHub CLI
Skip if: `gh --version` works.

```shell
brew install gh
```

Authenticate with GitHub:

```shell
gh auth login
```

Follow the prompts — select SSH as the preferred protocol. This generates an SSH key and registers it with your GitHub account in one step. You will need this to clone Colony and your target repos.
### 1.4 Container Runtime

Choose one of the following container runtimes. Both run the same OCI image built from Colony’s Dockerfile.

#### Option A: Docker Desktop

Skip if: `docker info` succeeds.

```shell
brew install --cask docker
```

After installation:

- Open Docker Desktop from `/Applications`
- Accept the license agreement
- Go to Settings > Resources and allocate:
  - Memory: 8 GB+
  - CPUs: 4+
  - Disk: 50 GB+ disk image size
- Wait for Docker to finish starting

Verify:
```shell
docker info
```

#### Option B: Apple Container (macOS only)

Skip if: `container --version` succeeds.

Requires: macOS 15+, Apple Silicon.

```shell
brew install container
```

No GUI, no daemon — just a CLI. After installation, start the runtime (this also downloads the Linux kernel on first run):

```shell
container system start
```

Disable Rosetta for builds (uses native ARM instead of x86 emulation):

```shell
container system property set build.rosetta false
```

Verify:

```shell
container --version
container system status
```

Note: Apple Container runs one lightweight VM per container, providing stronger isolation than Docker’s shared-kernel model. It uses the same Dockerfiles and OCI images. The `build.rosetta` property defaults to `true`, which requires Rosetta to be installed. Setting it to `false` builds native ARM images, which is preferred on Apple Silicon.
### 1.5 Node.js 20

Skip if: `node --version` shows v20 or higher.

```shell
brew install node@20
```

If node is not on PATH after installation:

```shell
brew link --overwrite node@20
```

### 1.6 Claude Code CLI
Skip if: `claude --version` works.

```shell
npm install -g @anthropic-ai/claude-code
```

The Developer and Analyzer agents shell out to the `claude` CLI for LLM-powered work.
### 1.7 Directory Structure

Skip if: `ls ~/.colony/keys ~/.colony/workspaces ~/.colony/logs` succeeds.

```shell
mkdir -p ~/.colony/keys ~/.colony/workspaces ~/.colony/logs
```

| Directory | Purpose |
|---|---|
| `~/.colony/keys/` | GitHub App PEM file(s) (`github-app.pem`, `github-ops-app.pem`) |
| `~/.colony/workspaces/` | Git worktrees created by the Developer agent |
| `~/.colony/logs/` | Agent logs, health check output, disk monitor output |
### 1.8 Git Config

Skip if: `git config --global user.name` and `git config --global user.email` both return values.

```shell
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
```

### 1.9 SSH Key
Skip if: `gh auth login` (step 1.3) already configured SSH, or `ssh -T git@github.com` succeeds.

If you need to set up SSH manually without `gh`:

```shell
ssh-keygen -t ed25519 -C "you@example.com"
```

Add the public key to GitHub: https://github.com/settings/ssh/new

```shell
cat ~/.ssh/id_ed25519.pub
```

## Phase 2: Credentials

### 2a. GitHub App
Section titled “2a. GitHub App”See docs/github-app-setup.md for the full walkthrough. Key deployment-specific notes:
-
Place the PEM file at
~/.colony/keys/github-app.pem:Terminal window mv ~/Downloads/<app-name>.*.private-key.pem ~/.colony/keys/github-app.pemchmod 600 ~/.colony/keys/github-app.pem -
Record these values from the GitHub App setup — you will need them for
colony.config.yaml:app_id— shown on the app’s settings page after creationinstallation_id— from the URL after installing the app on your reposbot_username— format is<app-slug>[bot](e.g.,colony-bot[bot])
-
Required repository permissions for App A (colony-coder):
Permission Access Purpose Contents Read & write Push branches, read repo files Issues Read & write Manage issue labels, post comments Pull requests Read & write Open PRs, post analyzer/developer comments Metadata Read-only Required (automatically granted)
#### Optional: App B (colony-ops) for Autonomous Merging

To enable fully autonomous merging without a human approving each PR, create a second GitHub App (colony-ops). See the Dual App Setup section of the GitHub App setup guide.

If using dual Apps, also place App B’s PEM file at `~/.colony/keys/github-ops-app.pem`:

```shell
mv ~/Downloads/<ops-app-name>.*.private-key.pem ~/.colony/keys/github-ops-app.pem
chmod 600 ~/.colony/keys/github-ops-app.pem
```

For Docker/container deployments, update the volume mount in step 4a to include the second PEM file (see docs/docker.md).
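Before moving on, it can be worth sanity-checking the key files. The following is an illustrative sketch, not part of Colony; `check_pem` is a hypothetical helper that verifies a key file exists, looks like a PEM, and has owner-only permissions (mode 600), as required above:

```python
import stat
from pathlib import Path

def check_pem(path: Path) -> list[str]:
    """Return a list of problems with a GitHub App private key file."""
    if not path.is_file():
        return [f"missing: {path}"]
    problems = []
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode != 0o600:
        problems.append(f"mode is {oct(mode)}, expected 0o600")
    text = path.read_text()
    first_line = text.splitlines()[0] if text else ""
    if not first_line.startswith("-----BEGIN"):
        problems.append("does not look like a PEM file")
    return problems
```

Run it against `~/.colony/keys/github-app.pem` (and `github-ops-app.pem` if using dual Apps); an empty list means the file passes both checks.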
### 2b. Anthropic API Key

Get your API key from https://console.anthropic.com.
Cost guidance by agent:
| Agent | Model (recommended) | Relative Cost | Notes |
|---|---|---|---|
| Analyzer | Sonnet | Low | Single-pass structured output |
| Developer | Sonnet (small) / Opus (medium/large) | High | Bulk of spend |
| Reviewer | Sonnet | Low | Deterministic checks + short LLM review |
| Merger | — | Zero | No LLM calls |
| Sprint Master | — | Zero | No LLM calls |
Recommendations:
- Set a $100-200/month spending limit in the Anthropic console
- Start all agents on Sonnet to validate the pipeline end-to-end
- Scale the Developer agent to Opus after validation (for medium and large issues)
- Monitor costs closely for the first week
## Phase 3: Colony Setup

### 3a. Clone Colony

```shell
mkdir -p ~/git && cd ~/git
git clone git@github.com:RunColony/colony.git
cd colony
npm install
npm run build
```
### 3b. Clone Target Repos (Native Mode Only)

Docker/container deployments: Skip this step. Worker containers clone their target repos automatically at startup (via `clone-setup.ts`). No host-mounted repo directories are needed for workers.

For native deployments, clone each repository Colony will manage:

```shell
git clone git@github.com:your-org/your-repo.git ~/git/your-repo
```

### 3c. Create colony.config.yaml
Create the config file in the Colony root directory. This is a full starter config with conservative defaults:

```yaml
github:
  owner: your-org
  repo: your-repo
  app:
    app_id: YOUR_APP_ID
    private_key_path: ~/.colony/keys/github-app.pem
    installation_id: YOUR_INSTALLATION_ID
    bot_username: 'your-app-slug[bot]'

repos:
  - owner: your-org
    repo: your-repo
    app:
      app_id: YOUR_APP_ID
      private_key_path: ~/.colony/keys/github-app.pem
      installation_id: YOUR_INSTALLATION_ID
    intake_mode: tagged
    workers:
      pool_size: 1
      health_port_start: 9200
    workspace:
      repo_dir: ~/git/your-repo # Native mode only — workers override this at runtime
      base_dir: ~/.colony/workspaces/{owner}/{repo}
    review:
      checks:
        build: 'npm run build'
        test: 'npm test'
      timeout_per_check: 300

agents:
  sprint_master:
    enabled: true
    poll_interval: 60
    health_port: 9100
  workers:
    enabled: true
    poll_interval: 10
    heartbeat_interval: 30
    health_port: 9200

executors:
  analyzer:
    effort: medium
  planner:
    max_turns: 200

labels:
  prefix: 'colony'

logging:
  level: info
  format: pretty

workspace:
  repo_dir: ~/git/your-repo # Native mode only — workers override this at runtime
  base_dir: ~/.colony/workspaces/{owner}/{repo}
  cleanup_after_merge: true

claude:
  timeout: 1800
  max_retries: 1
  models:
    developer: claude-sonnet-4-6
    reviewer: claude-sonnet-4-6
    analyzer: claude-sonnet-4-6
  scaling:
    small:
      developer_max_turns: 80
      model: claude-sonnet-4-6
      effort: medium
    medium:
      developer_max_turns: 150
      model: claude-sonnet-4-6
      effort: high
    large:
      developer_max_turns: 250
      model: claude-sonnet-4-6
      effort: high

review:
  auto_merge_on_approval: false
  # checks are configured per-repo in the repos[] array above

database:
  url_env: DATABASE_URL # required for workers and sprint-master
```

Migrating from an old config? Run `npx colony config migrate --config old-config.yaml` to automatically convert per-agent config keys to the new worker pool format.
Key settings to understand:

- `intake_mode: tagged` — only issues manually labeled `colony:enqueue` get picked up. Change to `all` once you trust the pipeline.
- `auto_merge_on_approval: false` — you review and merge PRs manually. Set to `true` after validation.
- `review.checks` must match actual scripts in your target repo’s `package.json`. If a configured check does not exist in the target repo, the Developer will add it to pass self-validation, conflicting with Reviewer feedback. Only configure checks that already exist.

For Docker or Apple Container deployments, change this path:

- `github.app.private_key_path` → `/colony/keys/github-app.pem`

Note: Workers clone repos automatically at startup to `/colony/repos/{owner}/{repo}` and override `workspace.repo_dir` and `workspace.base_dir` at runtime. You do not need to change these values in your config for worker containers. For native deployments, set `workspace.repo_dir` to the local clone path and `workspace.base_dir` to your preferred worktree directory.
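The `review.checks` requirement can be verified mechanically before launch. A minimal sketch, not part of Colony (`missing_checks` is a hypothetical helper), that flags configured checks with no matching script in the target repo’s package.json:

```python
import json
from pathlib import Path

def missing_checks(package_json: Path, checks: dict[str, str]) -> list[str]:
    """Return check names whose npm script is absent from package.json."""
    scripts = json.loads(package_json.read_text()).get("scripts", {})
    missing = []
    for name, command in checks.items():
        parts = command.split()
        # Only "npm run <script>" and "npm test" forms are inspected here.
        if parts[:2] == ["npm", "run"] and parts[2] not in scripts:
            missing.append(name)
        elif parts == ["npm", "test"] and "test" not in scripts:
            missing.append(name)
    return missing
```

An empty result means every configured check maps to an existing script; anything returned should be removed from the config (or added to the repo) before the first run.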
### 3d. Create .env

```shell
cp .env.example .env
```

Edit `.env` and set:

```shell
ANTHROPIC_API_KEY=sk-ant-...
DATABASE_URL=postgresql://colony:colony@localhost:5432/colony
```

If using a PAT instead of GitHub App auth, also set `GITHUB_TOKEN`. `DATABASE_URL` is required for workers and sprint-master (the docker-compose.yml sets this automatically when using the bundled Postgres service).
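If you want to confirm the connection string is well-formed before agents consume it, a small stdlib-only sketch (hypothetical helper, not part of Colony) that splits a `DATABASE_URL` of the form shown above into its parts:

```python
from urllib.parse import urlparse

def parse_database_url(url: str) -> dict[str, object]:
    """Split a postgres:// or postgresql:// URL into its components."""
    p = urlparse(url)
    if p.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unsupported scheme: {p.scheme!r}")
    return {
        "user": p.username,
        "password": p.password,
        "host": p.hostname,
        "port": p.port or 5432,  # default Postgres port when omitted
        "database": p.path.lstrip("/"),
    }
```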
### 3e. Validate

```shell
npx colony status --config colony.config.yaml
```

This checks config loading, GitHub connectivity, and agent readiness without starting the pipeline.
### 3f. Initialize Labels

Colony needs its label set on each target repo. Run:

```shell
npx colony init --config colony.config.yaml
```

This creates the `colony:*` labels on your GitHub repo. Without this step, the first pipeline run will fail because the required labels don’t exist.
### 3g. Database Migrations

Colony’s pipeline store uses a versioned migration framework. Migrations run automatically at agent startup — both sprint-master and worker call `PipelineStore.initialize()`, which applies any pending migrations before the agent begins processing. No manual migration step is required for either Docker or native deployments.

Migrations 014 and 015 (which added the `is_blocked` and `is_paused` columns to `pipeline_issues`) must be applied before agents that use Postgres-authoritative state (#1209) begin processing. Because migrations run at startup, simply restarting agents after pulling the new image is sufficient.

To verify the current migration version:

```shell
psql "$DATABASE_URL" -c "SELECT version FROM schema_migrations ORDER BY version DESC LIMIT 1;"
```

The version should be 015 or higher before agents resume processing.
## Phase 4: Launch

### 4a. Docker Path (Recommended)

First, ensure your colony.config.yaml uses Docker volume paths (see 3c above).
Create a docker-compose.override.yml to mount the GitHub App PEM into each service. Workers clone their repos automatically at startup — no repo volume mount is needed:
```yaml
services:
  sprint-master:
    volumes:
      - ~/.colony/keys/github-app.pem:/colony/keys/github-app.pem:ro
  worker:
    volumes:
      # No repo mount needed — workers clone repos at startup to /colony/repos/{owner}/{repo}
      - ~/.colony/keys/github-app.pem:/colony/keys/github-app.pem:ro
      # Uncomment if using dual App setup (autonomous merging):
      # - ~/.colony/keys/github-ops-app.pem:/colony/keys/github-ops-app.pem:ro
  webhook-receiver:
    volumes:
      - ~/.colony/keys/github-app.pem:/colony/keys/github-app.pem:ro
  monitor:
    volumes:
      # No repo mount needed — monitor reads repo list from Postgres (DATABASE_URL + network only)
      - ~/.colony/keys/github-app.pem:/colony/keys/github-app.pem:ro
```

See docker-compose.override.example.yml for more examples (multi-repo mounts, resource limits, Postgres customization).
Build and start:

```shell
docker compose build
docker compose up -d
docker compose ps
docker compose logs -f
```

Verify health endpoints:

```shell
curl http://localhost:9100/health  # sprint-master
curl http://localhost:9200/health  # worker
curl http://localhost:9106/health  # monitor
curl http://localhost:9800/health  # webhook-receiver
```

To stop:

```shell
docker compose down
```

### 4b. Apple Container Path
First, ensure your colony.config.yaml uses container volume paths (see 3c above). If using webhooks, add `dispatch_host`:

```yaml
webhook:
  enabled: true
  secret_env: WEBHOOK_SECRET
  port: 9800
  dispatch_host: host.docker.internal
```

Build and start:

```shell
./scripts/colony-container.sh build
./scripts/colony-container.sh up -d
./scripts/colony-container.sh ps
```

Verify health endpoints:

```shell
./scripts/colony-container.sh health
```

To stop:

```shell
./scripts/colony-container.sh down
```

### 4c. Native Path (Alternative)
Export your API key and start all agents:

```shell
export ANTHROPIC_API_KEY=sk-ant-...
npx colony start --config colony.config.yaml
```

Check status:

```shell
npx colony status
```

View logs:

```shell
tail -f ~/.colony/logs/*.log
```

To stop:

```shell
npx colony stop
```

### 4d. First Pipeline Run
Walk through a complete issue lifecycle to validate the deployment:

1. Create a test issue on your target repo. Use a small, well-defined task (e.g., “Add a `greet(name)` function to `utils.ts` that returns `Hello, {name}!`”).

2. Label the issue `colony:enqueue` (or just create it if `intake_mode: all`).

3. Watch the pipeline progress. The issue will move through states:

   - `colony:new` — Sprint Master picks it up
   - `colony:analyzing` — Analyzer triages and writes a spec
   - `colony:ready-for-dev` — ready for implementation
   - `colony:in-development` — Developer creates a branch, writes code, opens a PR
   - `colony:in-review` — Reviewer runs checks and reviews the PR
   - `colony:merge-pending` or `colony:human-review-ready` — depending on `auto_merge_on_approval`

4. Check the PR. Review the code, the agent comments, and the CI results.

5. Merge the PR (or let the Merger agent handle it if `auto_merge_on_approval: true`).
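The label progression described above is a linear state machine with one branch at the end. A sketch of that shape (not Colony’s implementation; `next_label` is a hypothetical helper):

```python
# Happy-path label order for an issue moving through the pipeline.
PIPELINE = [
    "colony:new",
    "colony:analyzing",
    "colony:ready-for-dev",
    "colony:in-development",
    "colony:in-review",
]

def next_label(current: str, auto_merge: bool) -> str:
    """Return the label an issue moves to next on the happy path."""
    if current == "colony:in-review":
        # The only branch: final label depends on auto_merge_on_approval.
        return "colony:merge-pending" if auto_merge else "colony:human-review-ready"
    i = PIPELINE.index(current)  # raises ValueError for unknown labels
    return PIPELINE[i + 1]
```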
Monitor progress via:

```shell
# Docker
docker compose logs -f

# Native
tail -f ~/.colony/logs/*.log
```

Or check issue labels on GitHub — they update in real time as agents process the issue.
## Phase 5: Operations

### 5a. Auto-Start on Boot

#### Docker: launchd + Docker Desktop

1. Add Docker Desktop to Login Items (System Settings > General > Login Items).

2. Create a launchd plist to start Colony containers after Docker is ready:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.colony.docker-start</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/bash</string>
    <string>-c</string>
    <string>while ! docker info >/dev/null 2>&amp;1; do sleep 5; done; cd $HOME/git/colony &amp;&amp; /usr/local/bin/docker compose up -d</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>StandardOutPath</key>
  <string>/tmp/colony-autostart.log</string>
  <key>StandardErrorPath</key>
  <string>/tmp/colony-autostart.log</string>
</dict>
</plist>
```

Install the plist:
```shell
cp com.colony.docker-start.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.colony.docker-start.plist
```

To unload:

```shell
launchctl unload ~/Library/LaunchAgents/com.colony.docker-start.plist
```

#### Apple Container: launchd

Create a launchd plist that starts Colony containers after the system boots:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.colony.container-start</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/bash</string>
    <string>-c</string>
    <string>cd $HOME/git/colony &amp;&amp; ./scripts/colony-container.sh up -d</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>StandardOutPath</key>
  <string>/tmp/colony-autostart.log</string>
  <key>StandardErrorPath</key>
  <string>/tmp/colony-autostart.log</string>
</dict>
</plist>
```

Install the plist:
```shell
cp com.colony.container-start.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.colony.container-start.plist
```

#### Native: launchd with KeepAlive

Create a plist that starts Colony agents directly and restarts them if they crash:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.colony.agents</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/local/bin/npx</string>
    <string>colony</string>
    <string>start</string>
    <string>--config</string>
    <string>/Users/you/git/colony/colony.config.yaml</string>
  </array>
  <key>WorkingDirectory</key>
  <string>/Users/you/git/colony</string>
  <key>EnvironmentVariables</key>
  <dict>
    <key>ANTHROPIC_API_KEY</key>
    <string>sk-ant-your-key</string>
    <key>PATH</key>
    <string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
  </dict>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
  <key>StandardOutPath</key>
  <string>/Users/you/.colony/logs/launchd.log</string>
  <key>StandardErrorPath</key>
  <string>/Users/you/.colony/logs/launchd.log</string>
</dict>
</plist>
```

Apple Silicon note: The plist uses `/usr/local/bin/npx`. On Apple Silicon Macs with Homebrew, `npx` is at `/opt/homebrew/bin/npx`. Run `which npx` to find your correct path and update the plist accordingly.
Replace `/Users/you` with your actual home directory. Install:

```shell
cp com.colony.agents.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.colony.agents.plist
```

### 5b. Log Management
Section titled “5b. Log Management”Docker: daemon.json log rotation
Section titled “Docker: daemon.json log rotation”Edit (or create) ~/.docker/daemon.json:
{ "log-driver": "json-file", "log-opts": { "max-size": "50m", "max-file": "5" }}Restart Docker Desktop after saving. On Linux, the path is /etc/docker/daemon.json and use sudo systemctl restart docker.
This limits each container’s log to 50 MB with 5 rotated files.
#### Native: newsyslog.conf

Add an entry for Colony logs in /etc/newsyslog.d/colony.conf:

```
# logfilename                    [owner:group]  mode  count  size   when  flags
/Users/you/.colony/logs/*.log                   644   5      10240  *     GJ
```

This rotates logs at 10 MB, keeping 5 compressed backups. The `G` flag tells newsyslog to treat the filename as a glob pattern; `J` compresses rotated logs with bzip2.
### 5c. Disk Space Monitoring

Colony worktrees accumulate over time and consume disk space. The scripts/monitor-disk.sh script checks disk usage, counts worktrees per repo, and identifies stale worktrees.
Docker workers: Each worker container needs enough ephemeral storage for a full repo clone plus worktrees. Container restarts produce clean clones, so stale worktree accumulation is not a concern for containerized workers — only for native deployments.
Set up a daily cron job:

```
0 8 * * * /path/to/colony/scripts/monitor-disk.sh --alert slack >> ~/.colony/logs/disk.log 2>&1
```

Options:

- `--threshold N` — warn when disk usage exceeds N% (default: 80)
- `--stale-days N` — worktrees older than N days are flagged as stale (default: 7)
- `--alert slack|macos` — send an alert when warnings are detected
### 5d. Health Checks

The scripts/health-check.sh script pings all agent health endpoints (singleton + worker) and reports status. It auto-detects Docker vs native mode.

Set up a cron job to run every 5 minutes:

```
*/5 * * * * /path/to/colony/scripts/health-check.sh --alert slack >> ~/.colony/logs/health.log 2>&1
```

Options:

- `--mode docker|native|auto` — detection mode (default: auto)
- `--alert slack|macos|email` — alert on failure
- `--host HOST` — host to check (default: localhost)
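The check itself reduces to: GET each /health endpoint and classify the response. A sketch of that logic (not the actual script) with the HTTP fetcher injected so it can be exercised without live services; the port map matches the defaults listed in this guide:

```python
from typing import Callable

# Default health ports from this guide (hypothetical mapping for the sketch).
PORTS = {"sprint-master": 9100, "monitor": 9106, "worker": 9200, "webhook-receiver": 9800}

def check_health(fetch: Callable[[str], int], host: str = "localhost") -> dict[str, str]:
    """Classify each service as healthy, unhealthy, or unreachable."""
    results = {}
    for name, port in PORTS.items():
        try:
            status = fetch(f"http://{host}:{port}/health")
        except OSError:
            results[name] = "unreachable"  # connection refused / timed out
            continue
        results[name] = "healthy" if status == 200 else f"unhealthy (HTTP {status})"
    return results
```

In real use, `fetch` would wrap something like `urllib.request.urlopen` and return the HTTP status code.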
### 5e. Alerting

Three alerting options, from simplest to most capable:

**macOS Notifications (zero setup):**

Both health-check.sh and monitor-disk.sh support `--alert macos`, which uses osascript to display native notifications. Good for a machine you’re logged into.

**Email (msmtp or Postfix):**

Both scripts support `--alert email`. Set `COLONY_ALERT_EMAIL` in your environment. Requires a local MTA — msmtp is the simplest:

```shell
brew install msmtp
```

**Slack Webhook (recommended for teams):**

1. Create a Slack Incoming Webhook at https://api.slack.com/messaging/webhooks
2. Set `COLONY_SLACK_WEBHOOK_URL` in your environment (or add it to `.env`):

   ```shell
   export COLONY_SLACK_WEBHOOK_URL="https://hooks.slack.com/services/T.../B.../..."
   ```

3. Use `--alert slack` with the monitoring scripts
### 5f. Webhooks (Optional)

Polling-only is the default and works well for most deployments. For sub-second response to GitHub events:

1. Install Cloudflare Tunnel:

   ```shell
   brew install cloudflared
   ```

2. Create a tunnel to expose the webhook receiver:

   ```shell
   cloudflared tunnel --url http://localhost:9800
   ```

   Note the generated URL (e.g., https://xxxx.trycloudflare.com).

3. Register the webhook on GitHub. In your repository settings > Webhooks > Add webhook:

   - Payload URL: `https://xxxx.trycloudflare.com/webhook`
   - Content type: `application/json`
   - Secret: generate with `openssl rand -hex 32`
   - Events: select Issues, Pull requests, Pull request reviews

4. Add webhook config to `colony.config.yaml`:

   ```yaml
   webhook:
     enabled: true
     secret_env: WEBHOOK_SECRET
     port: 9800
   ```

   For Docker or Apple Container deployments, add `dispatch_host` so the webhook receiver can reach agents via the host network:

   ```yaml
   webhook:
     enabled: true
     secret_env: WEBHOOK_SECRET
     port: 9800
     dispatch_host: host.docker.internal
   ```

5. Add `WEBHOOK_SECRET` to `.env`:

   ```shell
   WEBHOOK_SECRET=<the-secret-from-step-3>
   ```

6. Optionally run cloudflared as a launchd service for persistence:

   ```shell
   cloudflared service install
   ```

   Or create a launchd plist similar to the ones in section 5a.
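If you build any tooling around the webhook receiver, note that GitHub signs each delivery with an HMAC-SHA256 of the raw request body, sent in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. A minimal verification sketch (not Colony’s implementation) using the shared secret from `WEBHOOK_SECRET`:

```python
import hashlib
import hmac

def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Check GitHub's X-Hub-Signature-256 header against the shared secret."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(expected, signature_header)
```

Any payload that fails this check should be rejected before parsing.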
## Troubleshooting

### Config not found

```
Error: No config found
```

Colony looks for config in this order: `--config` flag, ./colony.config.yaml, ~/.colony/config.yaml. Ensure one of these exists. For Docker, the config must be mounted at /colony/colony.config.yaml.
### Worktree creation failures

```
Error: Failed to create worktree
```

- Docker workers clone repos automatically at startup — if worktree creation fails, check network connectivity, GitHub App auth, and available disk space in the container
- Native mode: ensure the target repo is cloned and accessible at the path in `workspace.repo_dir`
- Check that the worktrees directory has sufficient disk space
- Run `git worktree prune` in the target repo to clean up stale worktree references (native mode only)
### Health check failures

```
✗ worker (port 9200) — unhealthy (HTTP 000)
```

- Verify the service is running: `docker compose ps` (Docker) or `npx colony status` (native)
- For Apple Container: `./scripts/colony-container.sh ps` or `container ls`
- Check that the health port in config matches the port the service is actually listening on (sprint-master: 9100, worker: 9200+, monitor: 9106, webhook-receiver: 9800)
- Review service logs for startup errors
### Claude Code CLI not found

```
Error: claude command not found
```

- Verify installation: `which claude`
- If using Docker, the image installs @anthropic-ai/claude-code globally during build. Rebuild if the CLI was added after your last build: `docker compose build --no-cache`
- For native, install globally: `npm install -g @anthropic-ai/claude-code`
### Permission errors

```
Error: EACCES: permission denied
```

- PEM file: `chmod 600 ~/.colony/keys/github-app.pem`
- Workspaces directory: ensure the user running Colony owns ~/.colony/workspaces/
- Docker: the Node.js process runs as root in the container. If mounting a host directory for workspaces instead of using the named volume, ensure it is writable by UID 0
### Issues stuck in in-development after restart

Workers use a Postgres task queue, not label polling. If a worker crashes mid-task, the task remains in `claimed` status. On startup, workers reclaim stale tasks automatically (threshold: 10 minutes). The monitor agent also periodically reclaims stale tasks. If an issue remains stuck, check:

- The `work_tasks` table for orphaned `claimed` tasks
- Worker logs for repeated failures on the same issue
- The monitor dashboard for self-healing activity
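The reclaim rule described above is simple enough to state as a predicate. A sketch (hypothetical, not Colony’s code), assuming `claimed_at` reflects the worker’s last activity on the task:

```python
from datetime import datetime, timedelta, timezone

# 10-minute staleness threshold, per the reclaim behavior described above.
STALE_AFTER = timedelta(minutes=10)

def is_reclaimable(status: str, claimed_at: datetime, now: datetime) -> bool:
    """A claimed task becomes reclaimable once it exceeds the staleness threshold."""
    return status == "claimed" and (now - claimed_at) > STALE_AFTER
```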
### Review cycle limit reached

When `countReviewCycles` exceeds `max_review_cycles`, the Developer blocks immediately, before doing any work. Relabeling to `changes-requested` just re-triggers the block. To advance a stuck issue manually, relabel directly to `in-review` and let the Reviewer assess the PR as-is.