Team Patterns

How an organization adopts Colony matters as much as the configuration itself. The patterns here are the ones we’ve seen work across pilots and rollouts.

A tenant is Colony’s security and configuration boundary. The choice between one tenant for the org or one tenant per team is the most consequential org-level decision you’ll make.

Choose one tenant when:

  • Repos share dependencies, conventions, or coordinated releases
  • You want one Operator (or Operator team) responsible for the colony
  • Cross-repo work is common and you don’t want to negotiate access

Choose multiple tenants when:

  • Strict isolation between groups is required (different business units, regulated workloads)
  • Each team needs to own its own conventions and worker pool independently
  • The groups don’t share dependencies or coordinated releases

Most rollouts start with one tenant per company and split only if isolation becomes a hard requirement. Splitting later is straightforward; merging tenants is not.

When multiple teams contribute to the same repo, .colony/conventions.md becomes shared. Conflicts between the teams’ expectations surface as repeating-review-loop symptoms.

The pattern that works: conventions are repo-scoped, exceptions are folder-scoped. The repo-level .colony/conventions.md describes what’s true everywhere; folder-level overrides describe the exceptions. The reviewer worker reads the closest convention to the file being changed.
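The "closest convention wins" lookup can be sketched as a walk from the changed file's folder up to the repo root. This is a minimal illustration, not Colony's implementation: it assumes folder-level overrides live at `<folder>/.colony/conventions.md`, which the text above does not specify, and `closest_conventions` is a hypothetical helper name.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional

# Repo-level path comes from the text; reusing the same relative path for
# folder-level overrides is an ASSUMPTION made for this sketch.
CONVENTIONS_FILE = ".colony/conventions.md"


def closest_conventions(changed_file: str, existing_files: Iterable[str]) -> Optional[str]:
    """Return the conventions file nearest to changed_file, or None if there is none.

    Paths are repo-relative and use forward slashes.
    """
    existing = set(existing_files)
    folder = PurePosixPath(changed_file).parent
    while True:
        candidate = str(folder / CONVENTIONS_FILE)
        if candidate in existing:
            return candidate
        if folder == folder.parent:  # reached the repo root without a match
            return None
        folder = folder.parent


files = [".colony/conventions.md", "services/billing/.colony/conventions.md"]
print(closest_conventions("services/billing/api.py", files))
print(closest_conventions("services/search/index.py", files))
```

With these inputs, the billing file resolves to the folder-level override and the search file falls back to the repo-level conventions.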

When conflicts go beyond what folder-scoping can express, the conflict is usually a sign the teams should be in separate repos.

The rollout patterns that produce the cleanest first 90 days:

  • Single-repo pilot. Pick one repo, ideally one with a single owning team and well-scoped issues. Run for 60 days before adding a second repo.
  • Internal champion. One person on the pilot team who owns the relationship with Colony — the Operator role concentrated in one human. They tune .colony/conventions.md, escalate when needed, and report back to the Sponsor.
  • Visible metrics from day one. Throughput, cost-per-issue, and escape rate should be visible to the team and the Sponsor from the first week. Surprises later are much worse than discomfort early.
  • Conservative risk envelope at start. Tight cost caps, narrow automerge threshold, broad human-review labels. Loosen each lever after a month of data.
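One way to picture the conservative starting envelope is as a small immutable record whose levers can only be loosened once a month of data exists. Everything named here is hypothetical: `RiskEnvelope`, its fields, the units (dollars per issue, changed lines), and `loosen_cost_cap` are illustrative, not Colony configuration keys.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta

MIN_DATA_WINDOW = timedelta(days=30)  # "a month of data", per the pattern above


@dataclass(frozen=True)
class RiskEnvelope:
    # All field names and units are assumptions made for this sketch.
    cost_cap_per_issue_usd: float
    automerge_max_changed_lines: int
    human_review_labels: frozenset


def loosen_cost_cap(envelope: RiskEnvelope, last_changed: date, today: date,
                    new_cap: float) -> RiskEnvelope:
    """Return a copy with a higher cost cap, but only after a month of data."""
    if today - last_changed < MIN_DATA_WINDOW:
        raise ValueError("collect a month of data before loosening this lever")
    if new_cap <= envelope.cost_cap_per_issue_usd:
        raise ValueError("new cap must be looser than the current one")
    return replace(envelope, cost_cap_per_issue_usd=new_cap)


pilot = RiskEnvelope(
    cost_cap_per_issue_usd=5.0,
    automerge_max_changed_lines=50,
    human_review_labels=frozenset({"security", "migration", "infra"}),
)
```

Keeping one function per lever makes it harder to relax several at once, which matches the advice to loosen each lever separately.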

The patterns that fail:

  • Rolling out to every repo at once. The Operator is overwhelmed; conventions never get tuned; the Sponsor can’t tell what’s working.
  • Hiding Colony’s existence from the team. The first time someone notices Colony shipped a PR they don’t recognize, trust collapses.
  • Optimizing for a single dramatic metric (cost, throughput) instead of the basket. Any one metric, read on its own, misleads.

When Colony decomposes an epic into subtasks, someone still has to own the whole-epic outcome. By default this is the Author of the epic issue, but for cross-team epics the ownership question is non-obvious.

The pattern: the Author of the parent epic owns the result, even when subtasks span teams. Cross-team subtasks get an additional reviewer from the affected team. The epic’s audit trail attributes work to whoever did it; ownership of the result stays with the chartering Author.
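The ownership rule can be sketched as a small planning function: the owner is always the epic's Author, and only subtasks outside the Author's team pick up an extra reviewer. The `Subtask` type, the `team_reviewers` mapping, and `plan_epic_review` are hypothetical names used for illustration, not part of Colony.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Subtask:
    title: str
    team: str  # team that owns the code this subtask touches (assumed field)


def plan_epic_review(author: str, author_team: str, subtasks: List[Subtask],
                     team_reviewers: Dict[str, str]) -> Dict[str, object]:
    """Keep result ownership with the Author; add a reviewer per cross-team subtask."""
    extra_reviewers = {}
    for task in subtasks:
        if task.team != author_team:
            # Cross-team subtask: add a reviewer from the affected team.
            extra_reviewers[task.title] = team_reviewers[task.team]
    return {"owner": author, "extra_reviewers": extra_reviewers}


plan = plan_epic_review(
    author="dana",
    author_team="payments",
    subtasks=[Subtask("Update ledger schema", "payments"),
              Subtask("Expose search filter", "search")],
    team_reviewers={"search": "sam"},
)
print(plan)
```

Ownership of the result never moves in this sketch; only the reviewer set grows.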

Branch freezes (release windows, holiday seasons, audit periods) need to be visible to Colony. The merger worker respects branch-protection rules and per-repo freeze configuration.

To freeze:

  • Mark the branch protected with the merge windows your existing CI uses
  • Add a freeze label or config flag (per-repo) that Colony’s merger reads
  • Communicate the window to the team — Colony will queue PRs but not merge them; an Operator should resume manually if a critical fix needs to land mid-freeze
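The queue-but-don't-merge behavior during a freeze can be sketched as a single decision function. This is an illustration under stated assumptions: `FreezeWindow`, `merge_decision`, and the `operator_override` flag are hypothetical names, and how Colony's merger actually reads freeze configuration is not shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable


@dataclass(frozen=True)
class FreezeWindow:
    start: datetime  # inclusive
    end: datetime    # exclusive


def merge_decision(now: datetime, windows: Iterable[FreezeWindow],
                   operator_override: bool = False) -> str:
    """Return "queue" while a freeze is active, otherwise "merge".

    operator_override models an Operator manually landing a critical fix.
    """
    frozen = any(w.start <= now < w.end for w in windows)
    if frozen and not operator_override:
        return "queue"
    return "merge"


holiday = FreezeWindow(
    start=datetime(2024, 12, 20, tzinfo=timezone.utc),
    end=datetime(2025, 1, 2, tzinfo=timezone.utc),
)
print(merge_decision(datetime(2024, 12, 24, tzinfo=timezone.utc), [holiday]))
```

Using timezone-aware datetimes avoids ambiguity when the team and the CI system sit in different time zones.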

Pause the colony — at the issue, repo, tenant, or global scope — when:

  • An incident touches Colony’s surface area (e.g., a bad PR landed, or a repeating review loop is causing broken downstream effects)
  • A major refactor is in flight that Colony can’t reason about from existing conventions
  • Audit, security, or compliance review is happening and you don’t want concurrent activity confusing the trail

Pausing is reversible and cheap; rushing through an incident isn’t. Default to pausing.
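The four pause scopes nest, so a check for "is this issue paused?" can go from the broadest scope to the narrowest. The sketch below assumes pauses are stored as `(scope, identifier)` pairs; that representation and the `paused_scope` helper are hypothetical, not Colony's API.

```python
from typing import Optional, Set, Tuple

# A pause is recorded as (scope, identifier); a global pause has no identifier.
PauseKey = Tuple[str, Optional[str]]


def paused_scope(paused: Set[PauseKey], tenant: str, repo: str, issue: str) -> Optional[str]:
    """Return the broadest scope at which work on this issue is paused, or None."""
    checks = [
        ("global", None),
        ("tenant", tenant),
        ("repo", repo),
        ("issue", issue),
    ]
    for key in checks:
        if key in paused:
            return key[0]
    return None


paused = {("repo", "acme/billing")}
print(paused_scope(paused, "acme", "acme/billing", "acme/billing#42"))
print(paused_scope(paused, "acme", "acme/search", "acme/search#7"))
```

Here the repo-level pause stops the billing issue while work in the search repo continues.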