Night-Orch Configuration Guide
This document explains how to write night-orch configuration files.
Source of truth for the schema is src/config/schema.ts. If this document and the code differ, treat the schema as authoritative and update this document.
Config File Discovery
night-orch resolves config in this order:
1. --config <path> (if provided)
2. .night-orch.yaml (only when --trust-workspace is set)
3. .night-orch.yml (only when --trust-workspace is set)
4. config.yaml
5. config.yml
6. ~/.night-orch/config.yaml
7. ~/.night-orch/config.yml
8. ~/.config/night-orch/config.yaml
9. ~/.config/night-orch/config.yml
Recommended deployment uses a dedicated non-root user (for example orch) with:
- config/state in /home/orch/.night-orch/
- code in /home/orch/apps/night-orch
- target repos in /home/orch/repos/*
Per-Repo Project Config (.night-orch.yml)
After the central config is loaded, night-orch checks each configured repos[].localPath for:
- .night-orch.yml
- .night-orch.yaml
If both files exist in the same repo, config load fails (ambiguous source).
Project config is deep-merged into central config with project values winning:
- Repo-scoped keys merge into that repo entry only.
- workflows and workerProfiles merge into top-level maps.
- Objects merge recursively.
- Arrays are replaced (not concatenated).
Project files are intended for repo-scoped settings and project-owned workflow/profile definitions.
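As a rough illustration of the merge rules above, the sketch below shows a central repo entry, a project file, and the result one would expect from the merge. The repo slug, paths, and linked project names are invented for the example, and the project-file layout (repo-scoped keys at the top level of the file) is an assumption; src/config/schema.ts and the loader remain authoritative.

```yaml
# Central config (for example ~/.night-orch/config.yaml), repo entry only
repos:
  - repo: acme/widgets                  # hypothetical slug
    localPath: /home/orch/repos/widgets
    baseBranch: main
    linkedProjects: [acme/widgets-docs]
    fileLoop:
      enabled: false
      maxCostUsd: 5
---
# /home/orch/repos/widgets/.night-orch.yml
# (assumed layout: repo-scoped keys sit at the top level of the project file)
baseBranch: develop
linkedProjects: [acme/widgets-api]
fileLoop:
  enabled: true
---
# Expected effective repo entry after the merge
repos:
  - repo: acme/widgets
    localPath: /home/orch/repos/widgets
    baseBranch: develop                 # project value wins
    linkedProjects: [acme/widgets-api]  # array replaced, not concatenated
    fileLoop:
      enabled: true                     # objects merge recursively
      maxCostUsd: 5                     # untouched central value survives
```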
Runtime Settings Overrides (DB-backed)
Night-orch supports DB-backed runtime overrides stored in SQLite (settings_overrides table).
Effective config precedence, from lowest to highest, is:
- YAML value from central config, merged with per-repo project config (project wins where present)
- DB override (if present; applies only to runtime-overridable non-repo keys)
Overrides are persisted in DB and survive process restarts. They are not written back to YAML.
Runtime settings registry scope:
- Includes all non-project-specific config keys used at runtime.
- Excludes project-scoped repos[*] settings and schema marker version.
- storage.dbPath is listed for visibility but is read-only at runtime (DB bootstraps before overrides load).
- Sensitive fields are redacted in settings read surfaces (for example workerProfiles.*.env values).
- JSON setting overrides are schema-validated per key before persistence.
Registered keys are visible via night-orch settings list (or Web/TUI Settings/MCP night-orch-list-settings). Current key groups:
- github: tokenEnv, apiBaseUrl, pollIntervalSeconds, appMentions
- storage: dbPath (read-only), worktreeRoot, logsRoot, autoCleanup.enabled, autoCleanup.intervalMinutes, retention.worktreeAgeDays, retention.detailDays, retention.archiveDays
- notifications: channels, events.onRunStarted, events.onBlocked, events.onPrReady, events.onPrUpdated, events.onError, events.onRetryExhausted
- loop: maxReviewIterations, maxTotalAgentPasses, stopOnPlannerFailure, requireVerificationPass, reviewApprovalKeyword, reviewNeedsChangesKeyword, blockOnAmbiguousReview, maxAutoRetries, maxEmptyDiffRetries, maxConsecutiveBlocks, decompose, maxSubtasks, maxConcurrentSubtasks
- security: maxChangedFiles, maxChangedLines, maxDailyCostUsd, maxCostPerRunUsd
- cost: model, subscriptionMetered, pricing.defaultModel, pricing.models
- workerProfiles
- metrics: enabled, port, host
- observability: agentStreaming, eventRetention, sessionLogs, sessionLogRetention
- mcp: enabled, transport, authTokenEnv, httpPort, httpHost
- commentCommands: enabled, requireCollaborator
- workflows
Keys not in the runtime registry — edit YAML and restart the daemon: ai.*, fileLoop.*, cost.allowEstimatedDuration, all repos[] settings, github.tokenEnv environment values (the registry exposes the env var name, not the token itself). Use night-orch daily-cost-override / night-orch cost-override for budget headroom rather than mutating security.maxDailyCostUsd at runtime.
Update surfaces:
- CLI: night-orch settings list|set|unset
- MCP: night-orch-list-settings, night-orch-set-setting, night-orch-clear-setting
- Web: Settings page (/api/settings, /api/operations/settings/*)
- TUI: settings tab (hotkey 5)
YAML Conventions
- version must be exactly 1.
- tokenEnv values are env var names, not literal tokens.
- Path expansion (~, $VAR, ${VAR}) is applied to:
  - the config file path
  - storage.dbPath
  - storage.worktreeRoot
  - storage.logsRoot
  - repos[].localPath
- CommandSpec fields accept either:
  - string: "pnpm test -- --run"
  - array: ["pnpm", "test", "--", "--run"]
- repos[].labels.ready and repos[].labels.blocked accept either string or string array and are normalized to arrays.
- Project config files (repos[].localPath/.night-orch.yml or .yaml) may define repo-scoped keys plus optional top-level workflows and workerProfiles.
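The fragment below is a small sketch of several of these conventions in one place. The env var name GITHUB_TOKEN, the repo slug, the paths, and the label names are placeholders chosen for illustration, not values the project prescribes; other required keys may apply to a real config.

```yaml
version: 1                            # must be exactly 1
github:
  tokenEnv: GITHUB_TOKEN              # env var name, never the token itself
storage:
  dbPath: ~/.night-orch/state.db      # ~ is expanded
  worktreeRoot: $HOME/worktrees       # $VAR is expanded
  logsRoot: ${HOME}/night-orch-logs   # ${VAR} is expanded
repos:
  - repo: acme/widgets                # hypothetical slug
    localPath: ~/repos/widgets        # path expansion applies here too
    labels:
      ready: agent-ready              # string, normalized to ["agent-ready"]
      blocked: [blocked, needs-human] # array form is kept as an array
```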
Timestamp & Timezone Semantics
- Night-orch treats all timestamps as UTC.
- Runtime-generated timestamps use ISO-8601 UTC (YYYY-MM-DDTHH:mm:ss.sssZ).
- Legacy SQLite-style timestamps without an explicit timezone (for example YYYY-MM-DD HH:mm:ss) are interpreted as UTC.
- CLI/TUI time displays include an explicit UTC label.
Top-Level Schema
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
version | 1 | yes | none | Schema version. |
github | object | yes | none | Global forge/auth settings. |
storage | object | no | object with defaults | DB/worktree/log paths. |
notifications | object | no | object with defaults | Channel/event notification config. |
loop | object | no | object with defaults | Loop decision limits and behavior. |
fileLoop | object | no | object with defaults | Repo-idle maintenance loop for low-risk file cleanup and review. |
security | object | no | object with defaults | Diff/cost safety limits. |
cost | object | no | object with defaults | Cost model (pay-per-use enforces USD caps; subscription is advisory-only; subscription-metered tracks advisory USD with optional enforcement). |
workerProfiles | record | no | {} | Named CLI profiles for agents. |
metrics | object | no | object with defaults | Prometheus exporter config. |
observability | object | no | object with defaults | Live agent event streaming/persistence settings. |
mcp | object | no | object with defaults | MCP server config for run/mcp commands. |
commentCommands | object | no | object with defaults | Issue comment command processing config. |
repos | array | yes | none | At least one repo is required. |
workflows | record | no | {} | Named workflow definitions for custom pipelines. |
github
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
tokenEnv | string | yes | none | Env var name holding GitHub token. Literal token prefixes (ghp_, ghs_, github_pat_) are rejected. |
apiBaseUrl | URL string | no | https://api.github.com | Default base URL for GitHub repos. |
pollIntervalSeconds | positive number | no | 300 | Poll interval used by run loop. |
appMentions | record | no | {} | Mention templates keyed by mention alias (claude, codex, etc.). |
github.appMentions.<key>
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
enabled | boolean | no | false | If false, that mention key is filtered out even if requested by labels/defaults. |
commentTemplate | string | yes | none | Template used when posting mention comments; supports {issue}, {pr}, {repo} placeholders. |
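A minimal github block might look like the sketch below. The env var name and the wording of the comment template are made up for illustration; only the key names and the {issue}, {pr}, {repo} placeholders come from the tables above.

```yaml
github:
  tokenEnv: GITHUB_TOKEN          # hypothetical env var name holding the token
  pollIntervalSeconds: 300        # default, shown for clarity
  appMentions:
    claude:
      enabled: true               # mention keys are filtered out unless enabled
      commentTemplate: "@claude please look at {repo} issue {issue} (PR {pr})"
```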
storage
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
dbPath | string path | no | ~/.config/night-orch/state.db | |
worktreeRoot | string path | no | ~/code/.night-orch/worktrees | night-orch exports MISE_TRUSTED_CONFIG_PATHS with this path at startup, so any .mise.toml / mise.toml / .tool-versions checked out inside a worktree is automatically trusted by mise. Required for repos whose toolchain is managed by mise — otherwise bootstrap commands that invoke mise-shimmed tools (bundle, node, rake, ...) fail with "Config files ... are not trusted". |
logsRoot | string path | no | ~/code/.night-orch/logs | |
autoCleanup | object | no | object with defaults | Automatic cleanup settings. |
retention | object | no | object with defaults | Data retention periods. |
storage.autoCleanup
| Key | Type | Default | Notes |
|---|---|---|---|
enabled | boolean | true | Enable automatic cleanup of stale worktrees and logs. |
intervalMinutes | positive number | 60 | How often auto-cleanup runs (in minutes). |
storage.retention
| Key | Type | Default | Notes |
|---|---|---|---|
worktreeAgeDays | positive number | 7 | Remove completed/error worktrees older than this. |
detailDays | positive number | 30 | Retain detailed run data (events, phase data) for this many days. |
archiveDays | positive number | 90 | Archive run records older than this. |
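For orientation, here is a sketch of a storage block that tightens cleanup and retention relative to the defaults. The specific numbers are arbitrary examples, not recommendations.

```yaml
storage:
  worktreeRoot: ~/code/.night-orch/worktrees   # default location
  autoCleanup:
    enabled: true
    intervalMinutes: 120      # clean up every two hours instead of hourly
  retention:
    worktreeAgeDays: 3        # default is 7
    detailDays: 14            # default is 30
    archiveDays: 60           # default is 90
```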
notifications
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
channels | array | no | [ { type: "console" } ] | Multiple channels allowed. |
events | object | no | object with defaults | Per-event toggle switches. |
notifications.channels[]
Discriminated by type:
- console
  - type: "console"
- webhook
  - type: "webhook"
  - urlEnv: string (env var name containing webhook URL)
- discord
  - type: "discord"
  - urlEnv: string (env var name containing Discord webhook URL)
- smtp
  - type: "smtp"
  - host: string
  - port: positive int (default 587)
  - from: string
  - to: string
  - userEnv: string (env var name)
  - passEnv: string (env var name)
- webpush (Phase 2c: Web Push notifications to subscribed browsers)
  - type: "webpush"
  - vapidPublicKeyEnv: string (env var name, public VAPID key)
  - vapidPrivateKeyEnv: string (env var name, private VAPID key)
  - vapidSubjectEnv: string (env var name, e.g. mailto:you@example.com)
  - Generate a keypair once with npx web-push generate-vapid-keys, export the three env vars on the daemon host, and the web UI's Settings page will expose an "Enable notifications" button. Any browser that subscribes receives background push notifications for configured events (blocked, pr_ready, error, retry_exhausted by default). Subscriptions are persisted in push_subscriptions and pruned automatically on 410 Gone.
notifications.events
| Key | Type | Default |
|---|---|---|
onRunStarted | boolean | false |
onBlocked | boolean | true |
onPrReady | boolean | true |
onPrUpdated | boolean | true |
onError | boolean | true |
onRetryExhausted | boolean | true |
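Putting channels and events together, a configuration could look roughly like this. All env var names, hostnames, and addresses are placeholders invented for the example.

```yaml
notifications:
  channels:
    - type: console
    - type: discord
      urlEnv: DISCORD_WEBHOOK_URL    # hypothetical env var name
    - type: smtp
      host: smtp.example.com
      port: 587                      # default
      from: night-orch@example.com
      to: oncall@example.com
      userEnv: SMTP_USER             # hypothetical env var names
      passEnv: SMTP_PASS
  events:
    onRunStarted: true               # default is false
    onPrUpdated: false               # default is true
```

Events that are not listed keep their defaults from the table above.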
loop
| Key | Type | Default | Notes |
|---|---|---|---|
maxReviewIterations | positive number | 4 | Base max loop iterations before stop. |
maxTotalAgentPasses | positive number | 10 | Base max total worker passes. |
stopOnPlannerFailure | boolean | true | If planner output fails, stop early instead of continuing. |
requireVerificationPass | boolean | true | If true, verification failures block completion. |
reviewApprovalKeyword | string | APPROVED | Expected reviewer verdict keyword. |
reviewNeedsChangesKeyword | string | CHANGES_REQUIRED | Expected reviewer verdict keyword. |
blockOnAmbiguousReview | boolean | true | Parse failures in review phase become blocked state. |
maxAutoRetries | int >= 0 | 3 | Auto-retry count for infrastructure errors. |
maxEmptyDiffRetries | int 0-5 | 2 | Auto-retry count when coder produces no file changes. |
maxConsecutiveBlocks | int 1-20 | 4 | Circuit breaker: stop retrying after this many consecutive blocked runs on the same issue. |
decompose | boolean | false | Enable automatic issue decomposition into sub-tasks. |
maxSubtasks | int 1-10 | 5 | Maximum sub-tasks per decomposition. |
maxConcurrentSubtasks | int 1-10 | 3 | Max parallel sub-task worktrees. |
Note: loop limits are later triage-adjusted per issue (trivial/standard/architectural), so these are base values.
Decomposition
When decompose: true, issues classified as standard triage level with a body exceeding 500 characters (or containing 3+ numbered items/headings) are sent to the planner for decomposition. The planner decides whether to split the issue and outputs 2-5 atomic sub-tasks. Each sub-task runs the full Plan→Code→Verify→Review loop in its own git worktree. Sub-tasks execute in parallel waves based on their dependency graph, up to maxConcurrentSubtasks concurrent worktrees.
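A sketch of a loop block that turns decomposition on while keeping parallelism modest. The values are illustrative; every key is taken from the loop table above.

```yaml
loop:
  maxReviewIterations: 4       # base value, later adjusted by triage level
  maxTotalAgentPasses: 10      # base value, later adjusted by triage level
  decompose: true              # allow the planner to split qualifying issues
  maxSubtasks: 5               # upper bound per decomposition
  maxConcurrentSubtasks: 2     # at most two sub-task worktrees at once
```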
fileLoop
fileLoop configures a repo-scoped maintenance loop that runs only while the repo is otherwise idle. A session iterates through candidate files in a dedicated file-loop worktree, asks the configured reviewer profile to classify the next change, applies only trivial edits automatically, records larger follow-up ideas in loop.md, and publishes one PR when the session ends.
Operational constraints:
- File-loop work runs only when a repo has no active issue runs.
- Sessions are started and stopped explicitly through CLI, MCP, or the TUI file-loop tab.
- Top-level fileLoop values provide defaults; repos[].fileLoop merges over them for per-repo overrides.
- Final publish runs finalizeVerify; if verification fails, onFailure controls whether a draft PR is still opened.
| Key | Type | Default | Notes |
|---|---|---|---|
enabled | boolean | false | Master gate. night-orch file-loop start refuses to start when disabled for the repo. |
maxDurationMinutes | positive int | 480 | Hard wall-clock cap per session unless start --max-minutes overrides it. |
maxIterations | positive int | 1000 | Upper bound on file-loop iterations per session. |
minIntervalSecondsBetweenFiles | int >= 0 | 5 | Cooldown before reconsidering the next file. |
perIterationTimeoutSeconds | positive int | 120 | Timeout for each reviewer worker invocation. |
maxCostUsd | non-negative number | 5 | Session cost cap. Hitting it requests finalization. |
maxFileLines | positive int | 1500 | Skip files larger than this line count. |
includeGlobs | string[] | ["**/*.{ts,tsx,js,jsx,py,go,rs,md}"] | Candidate file allowlist. |
excludeGlobs | string[] | built-in list | Candidate file denylist. Defaults exclude generated artifacts, lockfiles, .git, and loop.md. |
reviewerProfileKey | string | claude-cheap | Worker profile name, or a worker type, used for file review iterations. Override this if your config does not define claude-cheap. |
branchNameTemplate | string | orch/file-loop/{repoSlug}/{yyyyMmDd} | Supports {repoSlug} and {yyyyMmDd} placeholders. |
loopMdPath | string | loop.md | Repo-relative backlog file for deferred refactor notes. |
commitPrefix | string | [FILE-LOOP] | Prefix used for per-file and loop.md commits. |
perEditVerify | object | object with defaults | Verification run immediately after each trivial edit. |
finalizeVerify | object | object with defaults | Verification run once before PR publication/finalization. |
fileLoop.perEditVerify
| Key | Type | Default | Notes |
|---|---|---|---|
enabled | boolean | true | If false, trivial edits are committed without per-file verification. |
commands | string[] | ["pnpm typecheck"] | Commands run sequentially in the file-loop worktree. |
timeoutSeconds | positive int | 60 | Per-command timeout budget. |
fileLoop.finalizeVerify
| Key | Type | Default | Notes |
|---|---|---|---|
enabled | boolean | true | If false, publication does not run final verification. |
commands | string[] | ["pnpm typecheck", "pnpm lint"] | Commands run sequentially before publication. |
timeoutSeconds | positive int | 300 | Per-command timeout budget. |
onFailure | draft-pr or no-pr | draft-pr | Whether to still open a draft PR when final verification fails. |
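The following sketch shows top-level fileLoop defaults with one repo opting in. The repo slug, path, and glob are invented, and claude-default is assumed to be a profile defined under workerProfiles in the same config.

```yaml
fileLoop:
  enabled: false                        # top-level default: off everywhere
  reviewerProfileKey: claude-default    # assumes this profile exists
  perEditVerify:
    commands: ["pnpm typecheck"]
  finalizeVerify:
    commands: ["pnpm typecheck", "pnpm lint"]
    onFailure: no-pr                    # skip the PR if final verification fails
repos:
  - repo: acme/widgets                  # hypothetical slug
    localPath: ~/repos/widgets
    fileLoop:
      enabled: true                     # per-repo override merged over the defaults
      maxDurationMinutes: 120
      includeGlobs: ["src/**/*.ts"]
```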
security
| Key | Type | Default | Notes |
|---|---|---|---|
maxChangedFiles | positive number | 50 | Diff guard threshold. |
maxChangedLines | positive number | 5000 | Diff guard threshold. |
maxDailyCostUsd | positive number | 50 | Daily budget cap, enforced in pay-per-use and optionally in subscription-metered (cost.subscriptionMetered.enforceDailyLimit). |
maxCostPerRunUsd | positive number | 10 | Per-run budget cap, enforced in pay-per-use and optionally in subscription-metered (cost.subscriptionMetered.enforcePerRunLimit). |
Unblocking a run hit by a cost cap
These controls are interpreted by cost.model:
- pay-per-use: always enforced
- subscription: never enforced (USD advisory only)
- subscription-metered: enforced only when cost.subscriptionMetered.enforcePerRunLimit and/or cost.subscriptionMetered.enforceDailyLimit are enabled
When a run is blocked by a cost limit, there are three escape hatches — pick whichever matches the scope of the situation:
- Whole day over budget → raise today's cap with night-orch daily-cost-override <amount>. Scoped to the current UTC day; auto-expires at 00:00 UTC. Use this when multiple queued issues would otherwise need individual overrides. Clear early with night-orch daily-cost-override --clear. Also exposed via MCP (night-orch-daily-cost-override) and TUI (% hotkey, which doubles the current cap).
- One expensive run stuck → grant a per-run override with night-orch cost-override <repo> <issue> <amount>. Replaces the per-run cap for that one run and exempts it from the daily cap. Use when a single heavyweight issue needs more headroom than the daily cap would normally permit.
- Permanently raise the cap → night-orch settings set security.maxDailyCostUsd <amount> (or security.maxCostPerRunUsd). This persists until explicitly cleared with night-orch settings unset, so reserve it for deliberate budget increases, not incident response.
cost
| Key | Type | Default | Notes |
|---|---|---|---|
model | pay-per-use, subscription, or subscription-metered | pay-per-use | pay-per-use enforces security.maxDailyCostUsd/security.maxCostPerRunUsd; subscription bypasses cost-limit blocking; subscription-metered logs advisory warnings and can optionally enforce caps via cost.subscriptionMetered. |
allowEstimatedDuration | boolean | false | When false (default), worker runs that finish without parseable token usage block the attempt with tokenCaptureFailed instead of silently estimating cost from wall-clock duration. The duration estimate undercounted by 10–100× in production and was the root cause of inaccurate cost reports. Flip to true only as a temporary unblocker when a specific worker adapter genuinely cannot report token usage. |
subscriptionMetered | object | { advisoryThresholdUsd: null, enforcePerRunLimit: false, enforceDailyLimit: false } | Controls warning/enforcement behavior for subscription-metered mode. Ignored for other models. |
pricing | object | unset | Optional model-aware pricing table. When unset, built-in defaults are used (input $3/M, output $15/M, cache-read $0.3/M, fallback $0.008/min) for advisory/estimated USD. |
cost.subscriptionMetered
| Key | Type | Default | Notes |
|---|---|---|---|
advisoryThresholdUsd | positive number or null | null | Logs warnings when run/day estimated cost meets or exceeds this threshold. |
enforcePerRunLimit | boolean | false | When true, applies security.maxCostPerRunUsd as a hard block in subscription-metered. |
enforceDailyLimit | boolean | false | When true, applies security.maxDailyCostUsd as a hard block in subscription-metered. |
cost.pricing
| Key | Type | Default | Notes |
|---|---|---|---|
defaultModel | string | "default" | Fallback key when a worker's pricingModel/type has no direct pricing entry. |
models | record | {} | Per-model pricing map keyed by model name. |
cost.pricing.models.<model>
| Key | Type | Default | Notes |
|---|---|---|---|
inputUsdPerMillionTokens | non-negative number | 3 | Prompt/input token price in USD per 1,000,000 tokens. |
outputUsdPerMillionTokens | non-negative number | 15 | Completion/output token price in USD per 1,000,000 tokens. |
cacheReadUsdPerMillionTokens | non-negative number | 0.3 | Cached-input read token price in USD per 1,000,000 tokens. |
minuteUsd | non-negative number | 0.008 | Time-based fallback price per minute when token counts are unavailable. |
Validation notes:
- cost.pricing.defaultModel must reference a key present under cost.pricing.models when models are provided.
- workerProfiles.<name>.pricingModel must reference a key present under cost.pricing.models.
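To show how the cost, security, and pricing keys relate, here is a sketch using subscription-metered mode with only the daily cap enforced. The fast-model key and its prices are hypothetical; the default entry simply restates the built-in values listed above.

```yaml
security:
  maxDailyCostUsd: 50
  maxCostPerRunUsd: 10
cost:
  model: subscription-metered
  subscriptionMetered:
    advisoryThresholdUsd: 5
    enforcePerRunLimit: false
    enforceDailyLimit: true        # security.maxDailyCostUsd becomes a hard block
  pricing:
    defaultModel: default          # must be a key under models
    models:
      default:
        inputUsdPerMillionTokens: 3
        outputUsdPerMillionTokens: 15
        cacheReadUsdPerMillionTokens: 0.3
        minuteUsd: 0.008
      fast-model:                  # hypothetical model key and prices
        inputUsdPerMillionTokens: 1
        outputUsdPerMillionTokens: 5
workerProfiles:
  codex-fast:
    type: codex
    command: codex
    pricingModel: fast-model       # must match a key under cost.pricing.models
```

Assuming cost is computed linearly from these per-million rates, a run on the default entry with 200,000 input tokens, 20,000 output tokens, and no cache reads would be estimated at 0.2 × 3 + 0.02 × 15 = 0.90 USD.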
ai
Phase 3: direct-LLM API layer for night-orch's internal AI tasks — triage refinement, PR body summaries, reviewer parse salvage, and a bounded rebase-conflict resolver. This does NOT replace the Claude Code / Codex / opencode CLIs used for actual code-editing (planner, coder, reviewer); those keep running on the CLI path because they rely on the agentic tool-use loop that the direct API doesn't have. The conflict resolver is the narrow exception: it operates on one conflicted file at a time, validates the returned file, and falls back to the normal human block path on any failure.
When no ai.internal.enable.* flag is set the entire layer is a no-op and every consumer falls back to its pre-Phase-3 behavior (rule-based triage, template-only PR body, fail-closed reviewer parser). The conflict resolver is gated separately by autoResolveConflicts.enabled and ai.internal.features.conflictResolver.
ai.internal
| Key | Type | Default | Notes |
|---|---|---|---|
provider | "anthropic", "openrouter", "openai", or null | null | Which direct-LLM backend to use. null disables the layer. |
model | string or null | null | Model id passed to the provider (e.g. "claude-3-5-sonnet-20241022" for Anthropic, "anthropic/claude-3.5-sonnet" for OpenRouter, "gpt-4o-mini" for OpenAI). |
apiKeyEnv | string or null | null | Env var name holding the API key. Refuses literal keys in YAML. |
timeoutMs | positive int | 30000 | Per-request timeout. |
maxTokens | positive int | 1024 | Default max tokens per call. Each consumer may override. |
features.conflictResolver | boolean | true | Enables the internal-AI conflict resolver feature when autoResolveConflicts.enabled is also true. If provider/model/API key are missing at runtime, the resolver quietly falls back to the existing human block path and doctor reports it as unavailable. |
enable.triage | boolean | false | LLM refines rule-based triage classification. |
enable.reviewerParseFallback | boolean | false | When the primary reviewer JSON parser fails, ask the LLM to salvage a structured verdict (CHANGES_REQUIRED or BLOCKED only — APPROVED is never inferred from free text). |
enable.prBody | boolean | false | Prepends a 2-3 sentence plain-English summary to PR bodies. The structured template still renders below. |
Validation:
- When any enable.* flag is true, all three of provider, model, and apiKeyEnv must be set. The schema rejects the config otherwise at load time.
- apiKeyEnv must be an environment variable name, not a literal API key; the schema rejects values that look like inline secrets (sk-…, claude-…).
Security: AI API keys are added to the worker environment blacklist, so ANTHROPIC_API_KEY, OPENAI_API_KEY, OPENROUTER_API_KEY, and any ANTHROPIC_* / OPENAI_* / OPENROUTER_* env var are blocked from reaching CLI worker subprocesses. Pino redaction also scrubs apiKey and x-api-key fields from every log record.
Cost tracking: every AI call records through the same R4 cost ledger as CLI workers, tagged tokenSource='measured_api' and workerType='internal-ai' or workerType='internal-ai-conflict-resolver'. The /api/cost/health endpoint surfaces these as distinct funding sources so operators can see direct-API spend alongside CLI spend.
Example:

    ai:
      internal:
        provider: anthropic
        model: claude-3-5-sonnet-20241022
        apiKeyEnv: ANTHROPIC_API_KEY
        features:
          conflictResolver: true
        enable:
          triage: true
          reviewerParseFallback: true
          prBody: true

autoResolveConflicts
Controls the bounded AI-assisted resolver that runs only after a queued rebase operation hits textual conflicts.
If resolution succeeds, night-orch continues the rebase, force-pushes the branch, and then follows the existing verify contract:
- verify passes: the run returns to review_ready
- verify fails: the coder loop runs as usual
If resolution fails, validation fails, the provider is unavailable, or the feature is disabled, night-orch aborts the rebase and blocks the run with the existing merge_conflict reason.
| Key | Type | Default | Notes |
|---|---|---|---|
enabled | boolean | true | Master switch for the automated resolver pass. |
maxAttempts | int 1-5 | 2 | Maximum number of resolve -> git add -> git rebase --continue cycles before falling back to the human block path. |
maxFiles | int 1-20 | 5 | Maximum number of conflicted files eligible for one automated attempt. Larger conflict sets skip auto-resolution and block immediately. |
Example:

    autoResolveConflicts:
      enabled: true
      maxAttempts: 2
      maxFiles: 5

workerProfiles
workerProfiles is a map of profile name to profile config.
Example:

    workerProfiles:
      claude-default:
        type: claude
        command: claude
        args: ["-p"]

workerProfiles.<name>
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
type | string | yes | none | Adapter type. Built-in: claude, codex, acp. OpenCode uses acp with command: opencode. |
pricingModel | string | no | none | Optional model key used by cost.pricing.models for cost estimation. Falls back to type when omitted. |
minuteUsd | non-negative number | no | none | Optional profile-level duration fallback override used when token usage is unavailable. |
command | string | yes | none | Binary to execute. |
args | string[] | no | [] | Base CLI args for every task invocation. |
workerTimeoutSeconds | positive number | no | 1800 | Base timeout before triage scaling. |
minimalEnv | boolean | no | true | Deprecated/ignored; worker env is always whitelist-based. |
runtimeWrapper | string or null | no | null | Wrapper command prepended before command (for sandbox wrappers, etc.). |
env | record string->string | no | {} | Extra env vars for worker process; blacklist still applies. |
Worker PATH is normalized at runtime: if missing, ~/.local/bin, ~/.local/share/pnpm, ~/.local/share/mise/shims, /usr/local/bin, /usr/bin, and /bin are appended.
repos[].agents references these profile names. Unknown profile references fail config load.
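A slightly fuller sketch of two profiles using the optional keys from the table. The wrapper binary name and the env var value are hypothetical; the acp/opencode pairing follows the note in the type row.

```yaml
workerProfiles:
  claude-sandboxed:
    type: claude
    command: claude
    args: ["-p"]
    workerTimeoutSeconds: 2400        # base timeout before triage scaling
    runtimeWrapper: sandbox-run       # hypothetical wrapper prepended before command
    env:
      NODE_OPTIONS: "--max-old-space-size=4096"   # extra env; the blacklist still applies
  opencode:
    type: acp                         # OpenCode goes through the acp adapter
    command: opencode
```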
Authentication Considerations
Night-orch invokes the claude CLI as a subprocess — it does not handle authentication itself. The installed claude binary uses whatever auth is configured on the host (OAuth subscription login or API key).
For production / high-volume deployments, configure the claude CLI with an API key (ANTHROPIC_API_KEY) rather than a subscription OAuth login. As of April 2026, Anthropic restricts subscription OAuth to "ordinary, individual usage" of Claude Code and reserves the right to enforce this without notice. API key auth uses metered billing and is unaffected by these restrictions.
For personal dev-server usage, subscription OAuth login works fine and is fully supported — night-orch invokes the real claude CLI binary, not the API directly.
See Anthropic's legal and compliance docs for current policy on authentication methods.
ACP Adapter
The acp adapter type uses the Agent Client Protocol for agent-agnostic communication:

    workerProfiles:
      gemini-acp:
        type: acp
        command: gemini # acpx agent name
        args: []
        workerTimeoutSeconds: 1800

The command field specifies the acpx agent name (e.g., codex, claude, gemini, pi). ACPX resolves this to the correct ACP adapter. Supported agents include any ACP-compatible agent registered with acpx.
Requires acpx installed as a dependency (pnpm add acpx).
metrics
| Key | Type | Default |
|---|---|---|
enabled | boolean | true |
port | positive int | 9090 |
host | string | 0.0.0.0 |
Notes:
- For the default Docker-based monitoring stack, keep metrics.host: 0.0.0.0 so Prometheus can scrape the daemon from its container network.
- metrics.enabled is runtime-overridable (night-orch settings set metrics.enabled ...).
- night-orch status reports when runtime state diverges from YAML.
observability
| Key | Type | Default | Notes |
|---|---|---|---|
agentStreaming | boolean | true | Enable live worker event emission and persistence. |
eventRetention | int (100-10000) | 1000 | In-memory max agent events retained per run. |
sessionLogs | boolean | true | Write per-phase JSONL session logs to storage.logsRoot/<runId>/. |
sessionLogRetention | positive int | 7 | Retention target in days for session logs (consumed by cleanup policy). |
mcp
| Key | Type | Default | Notes |
|---|---|---|---|
enabled | boolean | false | When true, run starts the embedded MCP HTTP server (dual transport — see below). |
transport | stdio | stdio | Reserved. The standalone night-orch mcp command speaks stdio; the HTTP server started by run/web exposes streamable HTTP and legacy SSE on the same port regardless of this value. |
authTokenEnv | string or null | null | Name of an environment variable holding a bearer token. When set and the env var is non-empty, every MCP request must present a matching Authorization: Bearer … header. Required when httpHost is non-loopback. |
httpPort | positive int | 3100 | Port the embedded MCP server listens on. |
httpHost | string | 127.0.0.1 | Host to bind. Loopback (127.0.0.1, ::1, localhost) is always allowed; any other host requires authTokenEnv to be set. |
Transports
The embedded MCP server exposes both transports on the same port so old and new clients can coexist:
- Streamable HTTP (modern): POST /mcp, with Mcp-Session-Id response/request header for session routing. Also GET /mcp (server-initiated SSE stream) and DELETE /mcp (client-initiated session teardown). This is the transport Claude Code's type: "http" client speaks.
- Legacy SSE: GET /sse for the session handshake followed by POST /mcp?sessionId=… for follow-up JSON-RPC messages. Kept for backwards compatibility with existing proxies and older MCP clients.
A liveness probe is available at GET /health and does not require auth.
Exposing MCP over a private network
To let a remote Claude Code instance connect directly (e.g. over Tailscale), bind to a non-loopback address and configure a strong bearer token:

    mcp:
      enabled: true
      httpHost: 100.94.242.23 # e.g. Tailscale IP
      httpPort: 8808
      authTokenEnv: NIGHT_ORCH_MCP_TOKEN

Generate a token and export it:

    export NIGHT_ORCH_MCP_TOKEN=$(openssl rand -hex 32)

Client-side .mcp.json:

    {
      "mcpServers": {
        "night-orch": {
          "type": "http",
          "url": "http://100.94.242.23:8808/mcp",
          "headers": { "Authorization": "Bearer ${NIGHT_ORCH_MCP_TOKEN}" }
        }
      }
    }

Non-loopback binding without authTokenEnv is rejected at startup; exposing mutation tools to an unauthenticated listener is never a supported configuration.
commentCommands
| Key | Type | Default | Notes |
|---|---|---|---|
enabled | boolean | true | Enable processing of /orch commands in issue comments. |
requireCollaborator | boolean | true | Only repo collaborators can use comment commands. Set to false only for private repos where all commenters are trusted. |
Supported commands (posted as issue comments):
- /orch retry: start a fresh retry from the latest base branch
- /orch rebase: queue an explicit rebase of the work branch onto the latest base
- /orch cancel: cancel an active run
- /orch continue: resume the existing branch with fresh context for blocked/review-ready/errored runs
When a PR becomes non-mergeable while it is in review_ready, night-orch does not treat that as a generic continue. It queues a dedicated branch refresh attempt that uses the repo's updateStrategy, and if that refresh conflicts the blocked run stores a durable conflict snapshot for the next /orch continue pass.
workflows
Named workflow definitions for custom execution pipelines. When no workflow is configured:
- standard issues use Plan→Code→Verify→Review→Decide
- trivial issues use a lightweight Code→Verify→Decide flow (review gate disabled)
Example:

    workflows:
      minimal:
        steps:
          - { type: worker, id: code, role: coder }
          - { type: verify, id: verify }
          - { type: worker, id: review, role: reviewer }
          - { type: decide, id: decide, onIterate: code }
      fast-trivial:
        roles:
          coder: codex
          reviewer: codex
        agents:
          codex: codex-fast
        steps:
          - { type: worker, id: code, role: coder }
          - { type: verify, id: verify }
          - { type: decide, id: decide, onIterate: code, requireReview: false }
      with-security:
        steps:
          - { type: worker, id: plan, role: planner, skipWhen: trivial }
          - { type: worker, id: code, role: coder, continueFrom: plan }
          - { type: verify, id: verify }
          - { type: worker, id: security, role: reviewer, prompt: security-review.md }
          - { type: worker, id: review, role: reviewer }
          - { type: decide, id: decide, onIterate: code }

Step Types
| Type | Fields | Description |
|---|---|---|
worker | id, role, skipWhen?, continueFrom?, prompt? | Invoke a worker adapter. Built-in roles: planner, coder, reviewer. |
verify | id, skipWhen? | Run configured verify commands. |
decide | id, onIterate, requireReview? | Evaluate review/verify results and route to publish, iterate (jump to onIterate step), or block. |
- skipWhen: skip the step when the triage level matches (e.g., trivial)
- continueFrom: continue the AI session from a prior step (e.g., coder continues planner's session). Session reuse is agent-specific; cross-agent handoffs (for example planner=claude, coder=codex) start a fresh session.
- prompt: path to a custom system prompt template (overrides the default)
- requireReview: default true; set to false for no-review workflows (for example lightweight triage paths)
Workflow-Level Overrides
- roles: optional role defaults (planner/coder/reviewer) for runs using this workflow
- agents: optional per-agent worker profile overrides (same shape as repos[].agents)
Reference a workflow in repos[].workflow by name.
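As a minimal sketch of that reference, the fragment below defines one workflow and selects it from a repo entry. The repo slug, checkout path, and workflow name are placeholders chosen for illustration; only the key names come from this guide.

```yaml
workflows:
  code-and-verify:                 # hypothetical workflow name
    steps:
      - { type: worker, id: code, role: coder }
      - { type: verify, id: verify }
      - { type: decide, id: decide, onIterate: code, requireReview: false }

repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    workflow: code-and-verify            # must match a key under workflows
```

Because this workflow has no review step, requireReview is set to false, following the fast-trivial pattern shown earlier.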
repos[]
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
repo | owner/name string | yes | none | Repository slug. |
forge | github or forgejo | no | github | Forge implementation selector. |
linkedProjects | owner/name string[] | no | [] | Additional issue-source repos to discover from using this repo's selectors/flow. |
apiBaseUrl | URL string | no | none | Required for forgejo; optional override for github. |
tokenEnv | string | no | none | Token env override per repo. |
maxConcurrentRuns | int 1-20 | no | 1 | Max issues processed concurrently for this repo per poll cycle. |
localPath | string path | yes | none | Local repo checkout path. |
baseBranch | string | no | main | PR target branch. |
branchPrefix | string | no | orch | Work branch prefix. |
updateStrategy | merge or rebase | no | merge | How normal queued work incorporates upstream base branch changes by default. merge creates merge commits (reliable for automated systems). rebase replays commits for linear history (use only if your repo requires linear history). This setting is used by automatic branch refreshes, merge-conflict follow-up attempts, and publish-time branch reconciliation. Manual retry, continue, and rebase actions can override it per action from the CLI, TUI, MCP, or web UI; explicit rebase still defaults to rebase unless overridden. |
labels | object | no | object with defaults | Orchestration label names. |
kanban | object | no | none | Optional alternate state-label flow activated by a trigger label. |
labelConfig | record | no | {} | Label metadata overrides for labels-init. |
defaults | object | no | object with defaults | Default roles + mention settings. |
planning | object | no | object with defaults | Planning-only mode settings (PRD path). |
fileLoop | object | no | {} | Per-repo overrides merged onto top-level fileLoop. |
environment | object | no | none | Shared/dedicated env setup. |
verify | CommandSpec[] | no | [] | Verify commands run in worktree. |
prompts | object | no | none | Optional custom system prompt template paths. |
selectors | object | no | object with defaults | Issue label inclusion/exclusion filters. |
agents | record | no | {} | Maps agent names to worker profile names. |
workflow | string | no | none | Name of a workflow from workflows section. Uses default pipeline if omitted. |
workflowByTriage | object | no | none | Per-triage workflow selection (trivial/standard). |
mergeQueue | object | no | object with defaults | Merge queue configuration. |
Poll execution model:
- Repos are polled in parallel.
- Each repo runs up to maxConcurrentRuns issues at once (default 1).
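A small repo entry built from the table above might look like the following sketch. The slug and path are placeholders; the remaining values are either the documented defaults or an illustrative choice, as noted in the comments.

```yaml
repos:
  - repo: myorg/myrepo                   # required; placeholder slug
    localPath: /home/orch/repos/myrepo   # required; placeholder checkout path
    forge: github                        # default
    baseBranch: main                     # default
    branchPrefix: orch                   # default
    updateStrategy: merge                # default; rebase only if linear history is required
    maxConcurrentRuns: 2                 # illustrative; allowed range is 1-20, default 1
```

With maxConcurrentRuns: 2, this repo would process up to two issues at once per poll cycle, while other repos are still polled in parallel.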
Project-local repo overrides
You can move repo-specific settings into a file inside the repository checkout:
# <repo>/.night-orch.yml
workflow: project-fast
defaults:
  coder: codex
environment:
  bootstrap:
    - command: pnpm install
      when: always
workflows:
  project-fast:
    steps:
      - { type: worker, id: code, role: coder }
      - { type: decide, id: decide, onIterate: code }
This file is merged with the matching repos[] entry from central config.
repos[].workflowByTriage
Route triage levels to different named workflows:
repos:
  - repo: myorg/myrepo
    workflow: full
    workflowByTriage:
      trivial: fast-trivial
      standard: full
Resolution order:
- Planning-label workflow override (planning-only mode)
- workflowByTriage[triageLevel]
- workflow
- Built-in defaults (trivial -> lightweight, others -> full)
Note: architectural issues are intentionally handled outside workflow execution and are labeled for human guidance.
repos[].labels
| Key | Type | Default | Notes |
|---|---|---|---|
ready | string or string[] | ['orch:ready'] | Normalized to array. |
running | string | orch:running | |
blocked | string or string[] | orch:blocked | Normalized to array. |
needsHuman | string | orch:needs-human | |
reviewReady | string | orch:review-ready | |
error | string | orch:error | |
retry | string | orch:retry | |
planning | string | orch:planning | When present on an issue, night-orch switches to planning-only mode and publishes only a PRD markdown file. |
mergeQueued | string | orch:merge-queued | Set when PR enters the merge queue. |
merging | string | orch:merging | Set while staging branch CI is running. |
mergeFailed | string | orch:merge-failed | Set when the merge queue identifies this PR as the culprit. |
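For illustration, a repo that renames a few state labels could override only those keys, on the assumption that omitted keys keep the defaults from the table. The label names below (other than the orch:* defaults) are hypothetical.

```yaml
repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    labels:
      ready: ["orch:ready", "bot:queue"]   # string or string[]; normalized to array
      running: "bot:running"               # hypothetical replacement label
      reviewReady: "orch:review-ready"     # same as the default
```

Quoting label names is optional here, but it avoids surprises with colons in stricter YAML parsers.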
repos[].linkedProjects
List of additional repositories to use as issue sources for the repo.
Example:
repos:
  - repo: myorg/app
    linkedProjects:
      - myorg/tracker
      - myorg/platform-triage
Each entry must use owner/name format.
repos[].kanban
Optional alternate state flow. When triggerLabel is present on an issue, night-orch uses kanban.labels for status transitions (queued/running/blocked/review/error/retry) instead of repos[].labels.
repos:
  - repo: myorg/myrepo
    kanban:
      triggerLabel: flow:kanban
      labels:
        ready: [kanban:todo]
        running: kanban:doing
        blocked: kanban:blocked
        needsHuman: kanban:needs-human
        reviewReady: kanban:review
        error: kanban:error
        retry: kanban:retry
        planning: kanban:planning
        mergeQueued: kanban:merge-queued
        merging: kanban:merging
        mergeFailed: kanban:merge-failed
repos[].labelConfig
Map of label name to optional metadata used by night-orch labels-init.
Each entry supports:
| Key | Type | Required | Notes |
|---|---|---|---|
color | 6-char hex string | no | Example: 0E8A16. |
description | string (<= 100 chars) | no | |
Constraint: each entry must include at least one of color or description.
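A sketch of a labelConfig block that satisfies this constraint is shown below. The description texts are invented for illustration, and the second color value is arbitrary.

```yaml
repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    labelConfig:
      "orch:ready":
        color: "0E8A16"                  # 6-char hex, quoted so YAML keeps it a string
        description: Ready for night-orch to pick up
      "orch:blocked":
        description: Run is blocked and needs attention   # color omitted; one field is enough
      "orch:error":
        color: "000000"                  # unquoted, this would be read as the integer 0
```

Quoting the color is a defensive choice rather than a documented requirement: all-digit hex values would otherwise be parsed as numbers by YAML.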
repos[].defaults
| Key | Type | Default | Notes |
|---|---|---|---|
planner | claude, codex, or opencode | claude | Default planner role assignment. |
coder | claude, codex, or opencode | claude | Default coder role assignment. |
reviewer | claude, codex, or opencode | claude | Default reviewer role assignment. |
doneMode | pr-ready or manual-only | pr-ready | Reserved for workflow policy; currently not consumed in runtime logic. |
notifyPriority | normal or high | normal | Reserved for notification priority; currently not consumed in notifier routing. |
prMentions | string[] | [] | Mention aliases posted on PRs by default. |
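As an illustrative sketch, the fragment below assigns a different agent to the coder role and leaves the other roles at their default. The mention key is hypothetical.

```yaml
repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    defaults:
      planner: claude        # default
      coder: codex           # override: codex writes the code
      reviewer: claude       # default
      prMentions: [reviewbot]   # hypothetical mention key
```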
Role labels can override these defaults per issue:
- plan:claude / plan:codex / plan:opencode
- code:claude / code:codex / code:opencode
- review:claude / review:codex / review:opencode
Planning-only mode label:
- orch:planning (or whatever repos[].labels.planning is set to)
When this label is present, night-orch uses a planning-only workflow and must produce a PR containing exactly one PRD markdown file.
repos[].planning
| Key | Type | Default | Notes |
|---|---|---|---|
prdDirectory | string | docs/prd | Repository-relative directory where planning-mode PRD files are created. |
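For example, a repo that keeps its PRDs somewhere other than docs/prd might be configured as in this sketch. The directory name is a placeholder.

```yaml
repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    planning:
      prdDirectory: docs/product/prd     # repository-relative; hypothetical directory
```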
repos[].fileLoop
Repo-level file-loop overrides merge onto the top-level fileLoop block for that repo only.
Example:
fileLoop:
  enabled: false
  reviewerProfileKey: claude-default
repos:
  - repo: myorg/myrepo
    fileLoop:
      enabled: true
      maxDurationMinutes: 180
      reviewerProfileKey: codex-default
      includeGlobs:
        - "src/**/*.{ts,tsx}"
      excludeGlobs:
        - "src/generated/**"
Repo overrides support the same keys as top-level fileLoop, but every field is optional. Nested perEditVerify and finalizeVerify objects merge field-by-field rather than replacing the entire object.
repos[].environment
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
defaultMode | shared or dedicated | no | shared | Base env mode. |
dedicated | object | no | none | Required if dedicated mode is used. |
shared | object | no | none | Shared mode behavior. |
bootstrap | command object[] | no | [] | Runs during setup (always/shared/dedicated). |
cleanup | command object[] | no | [] | Runs during dedicated teardown. |
Issue labels can force mode per run:
- env:shared
- env:dedicated
repos[].environment.dedicated
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
compose.file | string | yes | none | Compose file path used in worktree. |
compose.services | string[] | no | [] | Optional service subset. |
compose.projectName | string | no | orch-{issue} | {issue} placeholder supported. |
env.copyFrom | string | no | .env | Base env file copied from repo root. |
env.overrides | record | no | {} | Values support {issue} and {auto:min-max} port token. |
env.overrideFiles | string[] | no | [] | Additional env files appended in order. |
healthcheck | CommandSpec | no | none | Supports {port} placeholder after auto-port allocation. |
teardownOnComplete | boolean | no | true | If true, compose stack is stopped after run. |
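The sketch below shows one way the dedicated block could be written. It assumes that the dotted keys in the table (compose.file, env.copyFrom, and so on) map to nested YAML objects, and that a CommandSpec can be given as a plain string, as in the bootstrap example elsewhere in this guide. Service names, variable names, the port range, and the health URL are all hypothetical.

```yaml
repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    environment:
      defaultMode: dedicated
      dedicated:
        compose:
          file: docker-compose.yml        # required in dedicated mode
          services: [db, api]             # hypothetical service subset
          projectName: "orch-{issue}"     # default; {issue} is substituted per run
        env:
          copyFrom: .env                  # default base env file
          overrides:
            APP_PORT: "{auto:3000-3999}"  # auto-allocated port from the given range
            DATABASE_NAME: "app_{issue}"  # hypothetical variable using {issue}
        healthcheck: "curl -fsS http://localhost:{port}/health"   # assumes string CommandSpec
        teardownOnComplete: true          # default
```

Values containing braces are quoted because an unquoted leading { starts a YAML flow mapping.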
repos[].environment.shared
| Key | Type | Default | Notes |
|---|---|---|---|
requireRunning | boolean | true | If true, failed healthcheck aborts run. |
healthcheck | CommandSpec | none | Command to verify shared stack is up. |
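A shared-mode counterpart could look like this sketch, again assuming a CommandSpec may be a plain string. The health URL is a placeholder.

```yaml
repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    environment:
      defaultMode: shared                # default
      shared:
        requireRunning: true             # default; a failed healthcheck aborts the run
        healthcheck: "curl -fsS http://localhost:3000/health"   # hypothetical endpoint
```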
repos[].environment.bootstrap[] and cleanup[]
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
command | CommandSpec | yes | none | Executed in worktree directory. |
when | always, shared, or dedicated | no | always | Mode filter. |
failureHints | object[] | no | [] | Optional pattern-based hints appended to bootstrap error output. |
failureHints[]
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
contains | string | yes | none | Substring to match against command output. |
message | string | yes | none | Hint text shown when the pattern matches. |
output | combined, stdout, or stderr | no | combined | Which output stream(s) to inspect. |
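To illustrate how these fields fit together, the sketch below attaches one hint to a bootstrap command. The hint text is invented; the matched substring is an error code that pnpm prints when a frozen lockfile is out of date.

```yaml
repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    environment:
      bootstrap:
        - command: pnpm install --frozen-lockfile
          when: always                   # default; runs in both shared and dedicated modes
          failureHints:
            - contains: "ERR_PNPM_OUTDATED_LOCKFILE"
              message: "Lockfile is out of date. Run pnpm install locally and commit pnpm-lock.yaml."
              output: combined           # default; stdout or stderr are also accepted
```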
repos[].verify
Array of commands executed sequentially in the worktree. Failures are collected per command; the verification result is evaluated only after all commands have run.
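A typical verify list might look like the sketch below. It assumes each CommandSpec can be written as a plain string, and the script names are placeholders for whatever the target repo defines.

```yaml
repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    verify:
      - pnpm lint        # hypothetical script
      - pnpm typecheck   # hypothetical script; still runs if lint fails
      - pnpm test        # hypothetical script
```

Since failures are collected rather than short-circuiting, a single pass reports every failing command.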
repos[].prompts
| Key | Type | Required | Notes |
|---|---|---|---|
plannerSystem | string path | no | If file exists, used instead of default planner system prompt. |
coderSystem | string path | no | If file exists, used instead of default coder system prompt. |
reviewerSystem | string path | no | If file exists, used instead of default reviewer system prompt. |
If a configured template file is missing, a warning is logged and built-in defaults are used.
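As a sketch, custom templates could be wired up as follows. The file locations are hypothetical, and this guide does not state which directory relative paths are resolved against, so treat the paths as placeholders.

```yaml
repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    prompts:
      plannerSystem: prompts/planner-system.md     # hypothetical path
      reviewerSystem: prompts/reviewer-system.md   # hypothetical path
      # coderSystem omitted: the built-in coder prompt is used
```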
repos[].selectors
| Key | Type | Default | Notes |
|---|---|---|---|
includeLabelsAny | string[] | ['orch:ready'] | Issue must include at least one (empty list means include all). |
excludeLabelsAny | string[] | ['orch:blocked', 'orch:error', 'orch:needs-human'] | Issue is skipped if any label matches. |
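The fragment below sketches a selector block that keeps the default exclusions and adds one more. The wontfix label is a hypothetical addition.

```yaml
repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    selectors:
      includeLabelsAny: ["orch:ready"]   # default; an empty list includes all issues
      excludeLabelsAny:
        - "orch:blocked"
        - "orch:error"
        - "orch:needs-human"
        - "wontfix"                      # hypothetical extra exclusion
```

The defaults are repeated in full because arrays are replaced, not concatenated, when project config is merged into central config.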
repos[].agents
Map of agent name to worker profile name.
Typical shape:
agents:
  claude: claude-default
  codex: codex-default
  opencode: opencode-qwen
Resolution behavior:
- If mapping exists and profile exists, that profile is used.
- Otherwise, night-orch falls back to the first profile whose type matches the role agent (claude/codex/opencode).
- If no matching profile exists, the run fails.
OpenCode runs through the acp adapter with command: opencode. The target repo must have an opencode.json defining available models and provider config. To select different models per role, use OPENCODE_CONFIG_CONTENT in the worker profile's env to override the default model:
workerProfiles:
  opencode-qwen:
    type: acp
    command: opencode
    env:
      OPENCODE_CONFIG_CONTENT: '{"model":"openrouter/qwen/qwen3-coder"}'
  opencode-kimi:
    type: acp
    command: opencode
    env:
      OPENCODE_CONFIG_CONTENT: '{"model":"openrouter/moonshotai/kimi-k2.5"}'
OpenCode reads API credentials from its own auth store (~/.local/share/opencode/auth.json, configured via opencode /connect). Since HOME is on the worker env whitelist, no additional env changes are needed.
repos[].mergeQueue
Bors-style merge queue that batches approved PRs, tests them together, and bisects on failure.
| Key | Type | Default | Notes |
|---|---|---|---|
enabled | boolean | false | Enable the merge queue for this repo. |
batchSize | int 1-20 | 5 | Max PRs per batch. |
mergeMethod | merge, squash, or rebase | merge | Git merge strategy for staging branch. |
retryFlakyOnce | boolean | true | Retry a failed batch once before bisecting. |
requireApproval | boolean | true | Require human PR approval before entering queue. |
stagingBranchPrefix | string | orch/staging | Prefix for staging branches. |
When enabled, each poll cycle:
- Checks for an active staging batch; if CI passed, fast-forwards the base branch.
- If CI failed, bisects the batch (halves it, tests each half) until the culprit PR is identified.
- If no batch is active, scans for eligible PRs (review_ready + CI passing + approved) and forms a new batch.
- Conflicting PRs are ejected from the batch, and batch formation continues with the next eligible PR.
Labels used: orch:merge-queued, orch:merging, orch:merge-failed.
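A sketch of an enabled merge queue using the keys from the table above follows. The batchSize and mergeMethod values are illustrative choices; the rest are the documented defaults.

```yaml
repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    mergeQueue:
      enabled: true
      batchSize: 4                       # illustrative; allowed range is 1-20, default 5
      mergeMethod: squash                # illustrative; default is merge
      retryFlakyOnce: true               # default; retry a failed batch once before bisecting
      requireApproval: true              # default
      stagingBranchPrefix: orch/staging  # default
```

With these settings, a batch of four PRs that fails CI twice would be split into two halves of two PRs each, and each half tested in turn.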
Forge-Specific Notes
- forge: github
  - token env: repos[].tokenEnv if present, otherwise github.tokenEnv
  - API base URL: repos[].apiBaseUrl if present, otherwise github.apiBaseUrl
- forge: forgejo
  - token env: repos[].tokenEnv if present, otherwise FORGEJO_TOKEN
  - repos[].apiBaseUrl is required
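For a Forgejo-hosted repo, an entry might look like the sketch below. The host and the environment variable name are placeholders, and this guide does not say whether apiBaseUrl should include an API path suffix, so the bare host is an assumption.

```yaml
repos:
  - repo: myorg/myrepo                        # placeholder slug
    localPath: /home/orch/repos/myrepo        # placeholder checkout path
    forge: forgejo
    apiBaseUrl: https://forgejo.example.com   # required for forgejo; placeholder host
    tokenEnv: FORGEJO_MYORG_TOKEN             # optional; hypothetical variable name
```

If tokenEnv is omitted, the token is read from FORGEJO_TOKEN.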
Mention Behavior
Mentions posted to PR comments are resolved from:
- issue labels: pr-mention:<key>
- repo defaults: repos[].defaults.prMentions
- global gating: github.appMentions.<key>.enabled (disabled entries are removed)
Comment body is github.appMentions.<key>.commentTemplate if configured, otherwise @<key>.
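The pieces above can be sketched together as follows. The mention keys and the template text are hypothetical; only the key paths come from this guide.

```yaml
github:
  appMentions:
    reviewbot:                           # hypothetical mention key
      enabled: true
      commentTemplate: "@reviewbot please review this PR"
    legacybot:                           # hypothetical mention key
      enabled: false                     # removed even if a label or default requests it

repos:
  - repo: myorg/myrepo                   # placeholder slug
    localPath: /home/orch/repos/myrepo   # placeholder checkout path
    defaults:
      prMentions: [reviewbot, legacybot]
```

Under this config, PRs would receive only the reviewbot comment by default. An issue labeled pr-mention:reviewbot would resolve to the same key.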
Examples
- Full example config: examples/config.example.yaml
- Local project sample used by this repo: config.yaml