Feature-flag rollout
Every new feature in NovaVMS ships behind a flag. Flags have three states: alpha (Novalien-internal orgs only), beta (opt-in per customer org), ga (default on for all orgs). Rollout is always one state at a time, with at minimum a 24-hour soak between promotions. This is the standard that prevented the 2026-04-17 reconnect-storm from reaching every customer — see docs/reliability-reports/2026-04-17-session-record.md in the repo.
Treat this as the default rollout shape for any feature touched by the ongoing work in docs/superpowers/plans/*. If a feature is urgent enough to skip a soak, that decision goes in the incident channel with a named decision-maker, not in-thread as “let’s just flip it.”
Rollout stages
| Stage | Scope | Typical duration | Exit criteria |
|---|---|---|---|
| alpha | novalien-internal + novalien-internal-staging orgs only | 2–7 days | No errors in platform logs; owner/team signs off |
| beta | Customer orgs that opt in via /admin/feature-flags | 24h minimum, typically 3–7 days | Opted-in orgs stay stable for 24h at production load |
| ga | All orgs, default on | permanent | n/a |
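The one-state-at-a-time rule and the 24-hour soak can be expressed as a small pre-flight check. This is a minimal sketch, not the platform's actual enforcement logic: the function name and the `entered_current_at` timestamp are assumptions, and only the state order and the 24-hour minimum come from this page.

```python
from datetime import datetime, timedelta, timezone

# State order and minimum soak, as described on this page.
STATES = ("alpha", "beta", "ga")
MIN_SOAK = timedelta(hours=24)


def check_promotion(current, target, entered_current_at, now):
    """Return (allowed, reason) for a proposed promotion.

    `entered_current_at` is when the flag entered its current state.
    Where that timestamp comes from is left to the caller (assumption).
    """
    if current not in STATES or target not in STATES:
        return False, "unknown state"
    if STATES.index(target) != STATES.index(current) + 1:
        return False, "promotions move exactly one state at a time"
    if now - entered_current_at < MIN_SOAK:
        return False, "24-hour soak has not elapsed"
    return True, "ok"
```

A check like this only covers the mechanical rules. The exit criteria in the table still need a human sign-off.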
Procedure — promote alpha to beta
- Confirm alpha exit criteria. The platform audit log has no flag-related errors in the last 24 hours, and the engineering team owning the feature has signed off in `#novavms-releases`.
- Flip the flag’s default state. `POST /api/v1/platform/feature-flags/{flag}/state` with body `{state: "beta"}`. This does not enable it for any customer; it only makes the flag available for customer opt-in in `/admin/feature-flags`.
- Announce in the customer release channel. Post a short note with the feature summary, the `/admin/feature-flags` opt-in path, and a link to the feature’s manual page (if public).
- Monitor for 24h. Watch `platform.feature_flag_enabled` entries per org. Watch error rates on any endpoint the feature touches. See Release-pipeline monitoring for the commands.
- Collect opt-in feedback. Orgs that enable it and file no support ticket are the signal. A single ticket is not a blocker; two tickets with the same root cause are.
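The feedback rule in the last step (one ticket is tolerable, two with the same root cause block promotion) is easy to apply mechanically once tickets are triaged. The sketch below assumes each ticket is a dict with a `root_cause` key. That shape is hypothetical and not taken from any NovaVMS ticketing schema.

```python
from collections import Counter


def blocking_root_causes(tickets):
    """Return the root causes shared by two or more tickets, sorted.

    Each ticket is a dict with a hypothetical `root_cause` key.
    Tickets without a triaged root cause are skipped.
    """
    counts = Counter(t["root_cause"] for t in tickets if t.get("root_cause"))
    return sorted(cause for cause, n in counts.items() if n >= 2)
```

An empty result means no repeated root cause has shown up yet. It does not replace reading the tickets themselves.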
Procedure — promote beta to GA
- Confirm beta exit criteria. At least 10 customer orgs opted in, at least 72 hours of stable running, no open incidents tagged to the feature.
- Flip to GA. `POST /api/v1/platform/feature-flags/{flag}/state` with body `{state: "ga"}`. This sets the default to on for every org that has not explicitly opted out.
- Post GA announcement. Include the revert procedure in the announcement: every GA promotion ships with a named revert path.
- Monitor for 24h at full scale. Full traffic is the first time every org’s edge cases hit the code. Watch error rates.
- After a successful soak (typically 7–14 days at GA), remove the flag from the codebase in a follow-up PR. Dead flags are technical debt.
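The beta exit criteria in the first step are three numeric thresholds, so they can be checked before anyone flips the flag. This is a sketch under stated assumptions: the `BetaSnapshot` type and its field names are illustrative, and gathering the numbers is left to whatever tooling the team already uses. Only the thresholds (10 orgs, 72 hours, zero open incidents) come from this page.

```python
from dataclasses import dataclass

# Thresholds from the beta exit criteria on this page.
MIN_OPTED_IN_ORGS = 10
MIN_STABLE_HOURS = 72


@dataclass
class BetaSnapshot:
    """Hypothetical summary of a flag's beta period."""
    opted_in_orgs: int
    stable_hours: float
    open_incidents: int


def beta_exit_blockers(snapshot):
    """Return a list of unmet exit criteria; an empty list means ready for GA."""
    blockers = []
    if snapshot.opted_in_orgs < MIN_OPTED_IN_ORGS:
        blockers.append(f"only {snapshot.opted_in_orgs} orgs opted in, need {MIN_OPTED_IN_ORGS}")
    if snapshot.stable_hours < MIN_STABLE_HOURS:
        blockers.append(f"only {snapshot.stable_hours}h stable, need {MIN_STABLE_HOURS}h")
    if snapshot.open_incidents > 0:
        blockers.append(f"{snapshot.open_incidents} open incident(s) tagged to the feature")
    return blockers
```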
Revert path
Every flag must be revertible by flipping state back, with no schema or data-format dependency. If a feature needs a migration, the migration must be idempotent both forward and backward.
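As an illustration of what idempotent in both directions means, the sketch below migrates a record represented as a plain dict. The `codec_prefs` field is hypothetical and real migrations would run against the actual data store. The property to preserve is that running either direction twice gives the same result as running it once.

```python
def migrate_forward(record):
    """Add the new field if it is missing; a second call changes nothing."""
    migrated = dict(record)  # copy so the caller's record is not mutated
    migrated.setdefault("codec_prefs", [])  # hypothetical field
    return migrated


def migrate_backward(record):
    """Remove the new field if it is present; a second call changes nothing."""
    migrated = dict(record)
    migrated.pop("codec_prefs", None)
    return migrated
```

Note that the backward step discards whatever was written to the new field. Whether that is acceptable for a given feature is a design decision for the owning team.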
To revert a GA or beta flag:
- `POST /api/v1/platform/feature-flags/{flag}/state` with body `{state: "alpha"}`.
- Orgs that had the feature enabled see it disappear from their UI within 60 seconds (client re-fetches the flag list).
- Post in `#novavms-releases` with the revert reason and a link to the incident thread.
- Open a postmortem. See Incident response runbook.
If the flag cannot be reverted (because data written while it was on is incompatible with it being off), that is a bug in the flag design, not in the rollout. Escalate to the feature’s owning team — do not ship a non-revertible flag to GA.
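Because a revert is the same state endpoint called with `alpha`, it can help to have the request prepared ahead of time. The sketch below only builds the request and does not send it. The base URL and the flag name in the usage note are placeholders, the JSON encoding of the body is an assumption, and authentication is omitted because this page does not describe it.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://novavms.example"  # placeholder host, not a real endpoint
VALID_STATES = ("alpha", "beta", "ga")


def build_state_request(flag, state, base_url=BASE_URL):
    """Build (but do not send) the POST that sets a flag's state."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state!r}")
    url = f"{base_url}/api/v1/platform/feature-flags/{quote(flag, safe='')}/state"
    body = json.dumps({"state": state}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def build_revert_request(flag, base_url=BASE_URL):
    """A revert is a state change back to alpha."""
    return build_state_request(flag, "alpha", base_url)
```

For example, `build_revert_request("webrtc-allcodecs")` (a made-up flag name) yields a request that could be passed to `urllib.request.urlopen` once credentials are added.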
Per-org opt-in for alpha
For alpha features, Novalien staff enable the flag for a specific org via POST /api/v1/platform/orgs/{org_id}/feature-flags/{flag} with {enabled: true}. Use this sparingly — typically for staging orgs or a handful of design-partner customers who have signed the preview agreement. Each enable writes platform.feature_flag_enabled to both audit logs (platform side + target org side).
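The per-org enable call can be sketched the same way. As before, this builds the request without sending it. The base URL is a placeholder, the JSON encoding of the body is an assumption, and authentication is left out because it is not covered here.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_alpha_opt_in_request(org_id, flag, base_url="https://novavms.example"):
    """Build (but do not send) the POST that enables an alpha flag for one org.

    `base_url` is a placeholder host; path segments are percent-encoded.
    """
    url = (
        f"{base_url}/api/v1/platform/orgs/{quote(str(org_id), safe='')}"
        f"/feature-flags/{quote(flag, safe='')}"
    )
    body = json.dumps({"enabled": True}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```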
Verify
- `/platform/feature-flags` shows the new state and the per-org opt-in count.
- `platform_audit_log` contains `platform.feature_flag_state_changed` with the old and new states.
- Target-org `audit_log` (for per-org opt-ins) contains `feature_flag.enabled` or `feature_flag.disabled`.
- An org that opts in sees the feature in their UI within one page reload.
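The audit-log check above can be scripted once the entries are exported. The sketch below assumes each entry is a dict with `event`, `flag`, `old_state`, and `new_state` keys. Those field names are guesses for illustration, and only the event name `platform.feature_flag_state_changed` comes from this page.

```python
STATE_CHANGED = "platform.feature_flag_state_changed"


def find_state_change(entries, flag, old_state, new_state):
    """Return the first matching audit entry, or None if there is none.

    The entry shape (event/flag/old_state/new_state keys) is hypothetical.
    """
    for entry in entries:
        if (
            entry.get("event") == STATE_CHANGED
            and entry.get("flag") == flag
            and entry.get("old_state") == old_state
            and entry.get("new_state") == new_state
        ):
            return entry
    return None
```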
If this didn’t work
- Flag shows the new state in `/platform/feature-flags` but customer orgs don’t see the feature: the frontend caches flags for 60 seconds. Wait a minute or hard-reload.
- `409 FLAG_DEPENDENCY`: this flag depends on another that is still `alpha`. Promote the dependency first.
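For the caching case, a scripted check should wait out the 60-second window before declaring failure. The sketch below polls a caller-supplied `is_visible` function until it returns true or a deadline passes. The function name, the 90-second default timeout, and the 5-second interval are assumptions chosen to leave margin over the cache lifetime.

```python
import time


def wait_until_visible(is_visible, timeout_s=90.0, interval_s=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll `is_visible()` until it returns True or `timeout_s` elapses.

    `sleep` and `clock` are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_visible():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

If this returns `False` well past the cache window, the problem is probably not caching and the other entry in this list is worth checking.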
See also
- Release-pipeline monitoring — how to watch the 24h soak.
- Incident response runbook — if the soak fails.
- In-repo: `docs/superpowers/plans/` holds per-feature rollout plans (for example `2026-04-21-webrtc-allcodecs-STATUS.md`).