
Feature-flag rollout

Every new feature in NovaVMS ships behind a flag. Flags have three states: alpha (Novalien-internal orgs only), beta (opt-in per customer org), and ga (default on for all orgs). Promotion always moves one state at a time, with a soak of at least 24 hours between promotions. This standard is what kept the 2026-04-17 reconnect storm from reaching every customer; see docs/reliability-reports/2026-04-17-session-record.md in the repo.

Treat this as the default rollout shape for any feature touched by the ongoing work in docs/superpowers/plans/*. If a feature is urgent enough to skip a soak, that decision goes in the incident channel with a named decision-maker, not in-thread as “let’s just flip it.”

Rollout stages

| Stage | Scope | Typical duration | Exit criteria |
| --- | --- | --- | --- |
| alpha | novalien-internal + novalien-internal-staging orgs only | 2–7 days | No errors in platform logs; owner/team signs off |
| beta | Customer orgs that opt in via /admin/feature-flags | 24h minimum, typically 3–7 days | Opted-in orgs stay stable for 24h at production load |
| ga | All orgs, default on | permanent | |
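
The one-state-at-a-time rule and the 24-hour minimum soak can be expressed as a small guard. This is a minimal sketch, not NovaVMS code: the function name and its parameters are hypothetical, and it encodes only the general rule, not the per-stage exit criteria in the table.

```python
from datetime import datetime, timedelta, timezone

STAGES = ("alpha", "beta", "ga")  # the three flag states, in promotion order
MIN_SOAK = timedelta(hours=24)    # minimum soak between promotions


def can_promote(current: str, target: str,
                entered_current_at: datetime, now: datetime) -> bool:
    """Return True if current -> target is a single step and the soak has elapsed.

    Hypothetical helper: it checks only ordering and soak time, nothing else.
    """
    if current not in STAGES or target not in STAGES:
        raise ValueError(f"unknown stage: {current!r} -> {target!r}")
    one_step = STAGES.index(target) - STAGES.index(current) == 1
    soaked = now - entered_current_at >= MIN_SOAK
    return one_step and soaked
```

Under these assumptions, a flag that entered alpha a week ago can move to beta but not directly to ga, and a flag that entered beta 23 hours ago cannot move at all.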

Procedure — promote alpha to beta

  1. Confirm alpha exit criteria. Platform audit log has no flag-related errors in the last 24 hours. The engineering team owning the feature has signed off in #novavms-releases.

  2. Flip the flag’s default state. POST /api/v1/platform/feature-flags/{flag}/state with body {state: "beta"}. This does not enable it for any customer — it only makes the flag available for customer opt-in in /admin/feature-flags.

  3. Announce in the customer release channel. Post a short note with the feature summary, the /admin/feature-flags opt-in path, and a link to the feature’s manual page (if public).

  4. Monitor for 24h. Watch platform.feature_flag_enabled entries per org. Watch error rates on any endpoint the feature touches. See Release-pipeline monitoring for the commands.

  5. Collect opt-in feedback. Orgs that enable the feature and file no support ticket are the positive signal. A single ticket is not a blocker; two tickets with the same root cause are.
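
The state change in step 2 could be scripted roughly as follows. This is a sketch under stated assumptions: the endpoint path and the body come from the procedure above, but the base URL, the bearer-token auth scheme, and the flag name `example-flag` are placeholders that this page does not document.

```python
import json
import urllib.parse
import urllib.request

# Placeholder host: the real platform base URL is not given on this page.
BASE_URL = "https://novavms.example.invalid"


def build_state_change_request(flag: str, state: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST that changes a flag's state."""
    if state not in ("alpha", "beta", "ga"):
        raise ValueError(f"unknown state: {state!r}")
    path = f"/api/v1/platform/feature-flags/{urllib.parse.quote(flag, safe='')}/state"
    body = json.dumps({"state": state}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


# Hypothetical flag name; sending is left to the caller,
# e.g. urllib.request.urlopen(req).
req = build_state_change_request("example-flag", "beta", token="REDACTED")
```

Building the request separately from sending it makes it easy to log or review the exact call before it is made. The same helper would cover the ga promotion and the revert to alpha, since only the state value changes.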

Procedure — promote beta to GA

  1. Confirm beta exit criteria. At least 10 customer orgs opted in, at least 72 hours of stable running, no open incidents tagged to the feature.

  2. Flip to GA. POST /api/v1/platform/feature-flags/{flag}/state with body {state: "ga"}. This sets the default to on for every org that has not explicitly opted out.

  3. Post GA announcement. Include the revert procedure in the announcement: every GA promotion ships with a named revert path.

  4. Monitor for 24h at full scale. Full traffic is the first time every org’s edge cases hit the code. Watch error rates.

  5. After a successful soak (typically 7–14 days at GA), remove the flag from the codebase in a follow-up PR. Dead flags are technical debt.
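
The beta exit criteria in step 1 lend themselves to a checklist function that reports what is still blocking. The sketch below is illustrative only: the `BetaStatus` type and its field names are hypothetical, and how the three inputs are gathered is left open.

```python
from dataclasses import dataclass
from datetime import timedelta

MIN_OPTED_IN_ORGS = 10             # "at least 10 customer orgs opted in"
MIN_STABLE = timedelta(hours=72)   # "at least 72 hours of stable running"


@dataclass
class BetaStatus:
    """Hypothetical snapshot of a beta flag; field names are assumptions."""
    opted_in_orgs: int
    stable_for: timedelta
    open_incidents: int


def beta_exit_blockers(status: BetaStatus) -> list[str]:
    """Return the unmet exit criteria; an empty list means ready for GA."""
    blockers = []
    if status.opted_in_orgs < MIN_OPTED_IN_ORGS:
        blockers.append(f"only {status.opted_in_orgs} orgs opted in")
    if status.stable_for < MIN_STABLE:
        blockers.append("less than 72 hours of stable running")
    if status.open_incidents > 0:
        blockers.append(f"{status.open_incidents} open incident(s) tagged to the feature")
    return blockers
```

Returning the list of blockers rather than a bare boolean gives the person promoting the flag something concrete to paste into the release channel.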

Revert path

Every flag must be revertible by flipping state back, with no schema or data-format dependency. If a feature needs a migration, the migration must be idempotent both forward and backward.

To revert a GA or beta flag:

  1. POST /api/v1/platform/feature-flags/{flag}/state with body {state: "alpha"}.
  2. Orgs that had the feature enabled see it disappear from their UI within 60 seconds (client re-fetches the flag list).
  3. Post in #novavms-releases with the revert reason and a link to the incident thread.
  4. Open a postmortem. See Incident response runbook.
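
The 60-second delay in step 2 is the behavior you would expect from a client that caches the flag list with a time-to-live. The following is a generic TTL-cache sketch to illustrate that behavior. It is not the NovaVMS frontend code, and every name in it is hypothetical.

```python
import time
from typing import Callable, Dict, Optional


class FlagCache:
    """Cache a fetched flag list and re-fetch once the TTL has expired."""

    def __init__(self, fetch: Callable[[], Dict[str, str]],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock                      # injectable for testing
        self._flags: Dict[str, str] = {}
        self._fetched_at: Optional[float] = None

    def get(self) -> Dict[str, str]:
        now = self._clock()
        expired = self._fetched_at is None or now - self._fetched_at >= self._ttl
        if expired:
            self._flags = self._fetch()          # server state is only seen here
            self._fetched_at = now
        return self._flags
```

With a cache like this, a revert made on the server becomes visible to a client at most one TTL after that client's last fetch, which matches the "within 60 seconds" expectation.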

If the flag cannot be reverted (because data written while it was on is incompatible with it being off), that is a bug in the flag design, not in the rollout. Escalate to the feature’s owning team — do not ship a non-revertible flag to GA.

Per-org opt-in for alpha

For alpha features, Novalien staff enable the flag for a specific org via POST /api/v1/platform/orgs/{org_id}/feature-flags/{flag} with {enabled: true}. Use this sparingly — typically for staging orgs or a handful of design-partner customers who have signed the preview agreement. Each enable writes platform.feature_flag_enabled to both audit logs (platform side + target org side).

Verify

  • /platform/feature-flags shows the new state and the per-org opt-in count.
  • platform_audit_log contains platform.feature_flag_state_changed with the old and new states.
  • Target-org audit_log (for per-org opt-ins) contains feature_flag.enabled or feature_flag.disabled.
  • An org that opts in sees the feature in their UI within one page reload.
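
The audit-log check can be automated once the entries have been exported. In the sketch below, the event name comes from the list above, but the record shape (the `event`, `flag`, `old_state`, and `new_state` keys) is an assumption, since this page does not document the log schema.

```python
from typing import Iterable, Mapping

# Event name as given in the Verify list above.
STATE_CHANGED = "platform.feature_flag_state_changed"


def state_change_logged(entries: Iterable[Mapping[str, str]],
                        flag: str, old: str, new: str) -> bool:
    """Return True if any entry records the flag moving from old to new.

    The dictionary keys used here are hypothetical field names.
    """
    return any(
        e.get("event") == STATE_CHANGED
        and e.get("flag") == flag
        and e.get("old_state") == old
        and e.get("new_state") == new
        for e in entries
    )
```

If the real schema uses different field names, only the four key lookups need to change.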

If this didn’t work

  • Flag shows the new state in /platform/feature-flags but customer orgs don’t see the feature — the frontend caches flags for 60 seconds. Wait a minute or hard-reload.
  • 409 FLAG_DEPENDENCY — this flag depends on another that is still alpha. Promote the dependency first.
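
To avoid the 409, dependencies can be checked before the promotion is attempted. This sketch assumes the rule is "any direct dependency still in alpha blocks promotion". The server's exact check is not documented here, and the dependency map and flag names are hypothetical inputs.

```python
from typing import List, Mapping, Sequence


def blocking_dependencies(flag: str,
                          states: Mapping[str, str],
                          depends_on: Mapping[str, Sequence[str]]) -> List[str]:
    """Return the direct dependencies of flag that are still in alpha.

    Assumed rule only: the real FLAG_DEPENDENCY check may differ.
    """
    return [dep for dep in depends_on.get(flag, ()) if states.get(dep) == "alpha"]


# Hypothetical flags for illustration.
states = {"new-player": "alpha", "new-codec": "alpha", "hls-v2": "beta"}
depends_on = {"new-player": ["new-codec", "hls-v2"]}
blockers = blocking_dependencies("new-player", states, depends_on)
```

Here `blockers` names the flags to promote first. Once it is empty, the promotion should no longer fail on the dependency check, assuming the rule above holds.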

See also