Workers (Bring Your Own)

Colony Cloud provides managed worker capacity out of the box. You can also register your own worker containers — useful when you need to run pipeline tasks in a private network, on specific hardware, or at a scale beyond your current managed plan.

| Scenario | Recommendation |
|---|---|
| Standard workloads on public repositories | Use managed workers; no setup required. |
| Repositories in a private network (VPN, VPC) | BYO workers deployed inside the network. |
| GPU or high-memory tasks | BYO workers on purpose-built hardware. |
| Burst capacity beyond plan limits | BYO workers to supplement managed capacity during peaks. |
| Compliance requirements (data residency, air-gapped) | BYO workers in your own infrastructure. |

Managed and BYO workers coexist. The task dispatcher routes pipeline tasks to any available worker, regardless of whether it is managed or BYO.

  1. Open Settings → Workers.

  2. Click Generate token. Give the token a descriptive label (e.g., prod-worker-us-east) and click Create. Copy the token — it is shown only once.

  3. Log in to the Colony registry and start the worker container (docker run pulls the image automatically if it is not already present locally):

    docker login registry.runcolony.com -u colony -p <your-token>
    docker run -d \
      -e COLONY_CLOUD_TOKEN=<your-token> \
      -e COLONY_CLOUD_URL=https://runcolony.com \
      registry.runcolony.com/colony-managed:latest

  4. Return to Settings → Workers. The new worker appears in the Active list within a few seconds after its first heartbeat.
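
The registration steps above can also be scripted. The following is a minimal sketch, not Colony's documented procedure: the helper names, the container name colony-worker, and the env-file approach are assumptions made for illustration. It uses docker login --password-stdin and docker run --env-file so the token does not appear in shell history or in the process list.

```shell
# Hypothetical helpers; registry, image, and variable names are taken from the steps above.
REGISTRY="registry.runcolony.com"
IMAGE="registry.runcolony.com/colony-managed:latest"

write_worker_env() {
  # $1 = env file path, $2 = registration token
  # Create the file with owner-only permissions before writing the token to it.
  ( umask 077
    printf 'COLONY_CLOUD_TOKEN=%s\nCOLONY_CLOUD_URL=%s\n' "$2" "https://runcolony.com" > "$1" )
  chmod 600 "$1"
}

start_worker() {
  # $1 = env file path, $2 = registration token
  # Pass the token to docker login on stdin rather than with -p.
  printf '%s' "$2" | docker login "$REGISTRY" -u colony --password-stdin &&
  docker run -d --name colony-worker --restart unless-stopped \
    --env-file "$1" "$IMAGE"
}
```

With the token held in a shell variable (here the assumed name COLONY_TOKEN), a call would look like: write_worker_env ./worker.env "$COLONY_TOKEN" && start_worker ./worker.env "$COLONY_TOKEN". The --restart unless-stopped policy is optional; it relies on registration being safe to repeat after a restart.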

Worker registration is idempotent. If a container restarts mid-registration — for example, during a rolling deploy — the Cloud API derives a stable worker ID from the token and host metadata. Restarting the container does not create a duplicate worker entry.
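
To illustrate why a derived ID makes registration idempotent, here is a small sketch. The actual derivation used by the Cloud API is not documented here; the SHA-256 hash, the 16-character truncation, and the use of a hostname as the host metadata are all assumptions. The point is only that the same inputs always produce the same ID, so a repeated registration maps to the same worker entry.

```shell
# Hypothetical derivation, NOT the Cloud API's real algorithm.
# Requires sha256sum (GNU coreutils).
stable_worker_id() {
  # $1 = registration token, $2 = host identifier (for example, a hostname)
  printf '%s\n%s' "$1" "$2" | sha256sum | cut -c1-16
}
```

Calling stable_worker_id twice with the same token and host identifier prints the same value, while changing either input prints a different one.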

On every /api/worker/register and /api/worker/heartbeat call, the Cloud API re-mints a short-lived GitHub App installation token and returns it to the worker. Workers never hold long-lived GitHub credentials; token refresh is Cloud-side.

Workers send a heartbeat to /api/worker/heartbeat on a fixed interval (configured in your colony.config.yaml under agents.workers.heartbeat_interval, in seconds). A worker that misses heartbeats is marked Stale and eventually removed from the active pool.
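
The lifecycle described above can be modeled as a function of how many heartbeat intervals have elapsed since the last heartbeat. This is a sketch under stated assumptions: the thresholds STALE_AFTER and REMOVE_AFTER (counted in missed intervals) are hypothetical, since the exact number of missed heartbeats that triggers each transition is not specified here.

```shell
# Hypothetical thresholds, in missed heartbeat intervals; not documented values.
worker_state() {
  # $1 = last heartbeat (epoch seconds)
  # $2 = current time (epoch seconds)
  # $3 = heartbeat_interval (seconds, must be greater than 0)
  local stale_after remove_after missed
  stale_after="${STALE_AFTER:-3}"
  remove_after="${REMOVE_AFTER:-10}"
  # Whole intervals elapsed since the last heartbeat was received.
  missed=$(( ($2 - $1) / $3 ))
  if [ "$missed" -ge "$remove_after" ]; then
    echo removed
  elif [ "$missed" -ge "$stale_after" ]; then
    echo stale
  else
    echo active
  fi
}
```

For example, with a 30-second interval and the assumed thresholds, a worker last seen 100 seconds ago has missed three intervals and would be reported as stale.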

Monitor worker health from the Operator Home dashboard — the Workers metric shows the count of registered workers currently reporting heartbeats. A drop in worker count that is not intentional warrants immediate investigation.

Per-worker configuration inherits from the organization config, which can be overridden at the repo level. Key fields under agents.workers in colony.config.yaml:

| Field | Unit | Description |
|---|---|---|
| heartbeat_interval | seconds | How often the worker sends a heartbeat to the Cloud API. |
| max_task_retries | count | Maximum retry attempts for a failed task before it is sent to the DLQ. |
| max_task_duration | seconds | Maximum time a single task may run before the worker times it out. |
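
As an illustration of how these fields might be laid out, the sketch below writes an example fragment to a file of your choosing. The nesting follows the agents.workers path named above; the values are illustrative assumptions, not documented defaults, and the helper name is hypothetical.

```shell
# Hypothetical helper: writes an example agents.workers fragment to the given path.
write_example_worker_config() {
  # $1 = output path (use a scratch file, not your live colony.config.yaml)
  cat > "$1" <<'EOF'
# All values are illustrative assumptions, not documented defaults.
agents:
  workers:
    heartbeat_interval: 30    # seconds between heartbeats
    max_task_retries: 3       # attempts before the task goes to the DLQ
    max_task_duration: 1800   # seconds before the worker times the task out
EOF
}
```

Review the generated fragment and merge it by hand into the organization-level or repo-level config as appropriate.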

To deregister a worker, stop the container. The worker transitions to Stale after missing its heartbeat window, then is removed from the active pool automatically. No manual deregistration step is required.
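
When several worker containers run on one host, stopping them can be scripted. The sketch below is one possible approach, with a hypothetical helper name; it uses docker ps with the ancestor filter to find running containers started from a given image and stops each one.

```shell
# Hypothetical helper: stop every running container created from the given image.
stop_workers() {
  # $1 = image reference, e.g. registry.runcolony.com/colony-managed:latest
  local id
  for id in $(docker ps -q --filter "ancestor=$1"); do
    docker stop "$id"
  done
}
```

For example: stop_workers registry.runcolony.com/colony-managed:latest. The same helper is useful after a token has been revoked, when the containers that used it need to be shut down.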

To revoke a worker’s token (e.g., if the token is compromised):

  1. Open Settings → Workers.
  2. Find the token in the list and click Revoke.
  3. Confirm the dialog.

Revoking a token immediately blocks any container using it from registering or sending heartbeats. Stop the affected containers after revoking.