Customer reports a bug you cannot reproduce

Symptom

A customer reports a specific failure (for example “Live view shows a black tile on camera X after 2026-04-18” or “Alert rule Y never fires”). You try to reproduce it against the staging org or your own test org and everything works. The customer continues to see the failure.

Likely causes

  1. Per-org feature-flag drift. A recently-rolled feature is enabled for engineering orgs but still off for the customer’s org (or the opposite — beta feature enabled for their org, not yours).
  2. Regional deploy skew. The customer hits a cluster that is still on the previous release; engineering hits the updated cluster.
  3. Data-shape difference. The customer has a config state that exercises a code path your test org never enters (example: a camera with zero recordings over the last 30 days, or an alert rule with > 50 cameras attached).
  4. A bug that is genuinely scoped to the customer’s data. Rare but real — pursue this only after ruling out the other three.

Fix

1. Mint a scoped impersonation token for the customer’s org

Per US-PLAT-8. Reason: "reproducing bug ticket=<id>". Target user: the user who filed the ticket if they gave consent. See Scoped impersonation for the full lifecycle.

2. Open the same UI view the customer describes

Navigate to the exact page. If the customer gave you a screenshot, compare side-by-side. Note any UI element that appears in yours but not theirs (or vice versa) — that is a flag-state difference made visible.

3. Compare feature flags between the two orgs

Call GET /api/v1/platform/orgs/{customer_org_id}/feature-flags and GET /api/v1/platform/orgs/{your_test_org_id}/feature-flags. Diff the two JSON responses. A flag that is present in one org and absent in the other, or set to a different value, is the most common explanation.
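The diff can be sketched in a few lines of Python. This is a minimal illustration, not the platform's tooling: it assumes each response body is a flat JSON object mapping flag name to value, and the flag names in the sample data are hypothetical. Adjust the parsing if the real payload is nested.

```python
import json

ABSENT = "<absent>"  # placeholder shown when a flag is missing from one org


def diff_flags(customer_flags: dict, test_flags: dict) -> dict:
    """Return {flag_name: (customer_value, test_value)} for every flag that differs.

    Assumption: both inputs are flat dicts of flag name -> value.
    """
    diff = {}
    for name in sorted(set(customer_flags) | set(test_flags)):
        customer_value = customer_flags.get(name, ABSENT)
        test_value = test_flags.get(name, ABSENT)
        if customer_value != test_value:
            diff[name] = (customer_value, test_value)
    return diff


# Hypothetical response bodies, saved from the two GET calls.
customer = json.loads('{"live_view_v2": true, "alert_batching": false}')
test_org = json.loads('{"live_view_v2": false, "alert_batching": false, "new_timeline": true}')

for name, (customer_value, test_value) in diff_flags(customer, test_org).items():
    print(f"{name}: customer={customer_value!r} test={test_value!r}")
```

Flags with identical values in both orgs are omitted, so an empty result means flag state does not explain the difference.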

4. Check recent deploys to the customer’s region

Run gh run list --workflow=deploy --limit=20 --json conclusion,headBranch,createdAt. Cross-reference the customer’s region (visible on the org detail page) with the last successful deploy there. If the customer’s cluster is still on a release that predates the relevant fix, deploy skew is the likely cause; schedule the upgrade.
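To pick the last successful deploy out of the gh output, something like the following may help. It is a sketch that assumes the JSON from the command above has been saved or piped in as a list of objects with the conclusion, headBranch, and createdAt fields. How a run maps to a region is not specified here, so any region filtering is left to you; the sample runs are made up for illustration.

```python
import json
from datetime import datetime


def parse_created_at(value: str) -> datetime:
    # createdAt is an ISO 8601 timestamp; normalise a trailing "Z" for fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_successful_run(runs: list):
    """Return the most recent run whose conclusion is 'success', or None if there is none."""
    successful = [run for run in runs if run.get("conclusion") == "success"]
    if not successful:
        return None
    return max(successful, key=lambda run: parse_created_at(run["createdAt"]))


# Illustrative data in the shape produced by `gh run list --json ...`.
sample = json.loads("""
[
  {"conclusion": "failure", "headBranch": "main", "createdAt": "2026-04-19T09:00:00Z"},
  {"conclusion": "success", "headBranch": "main", "createdAt": "2026-04-18T14:30:00Z"},
  {"conclusion": "success", "headBranch": "main", "createdAt": "2026-04-16T08:15:00Z"}
]
""")

last_good = latest_successful_run(sample)
print(last_good["createdAt"] if last_good else "no successful deploy in the window")
```

If the newest run failed, as in the sample, the cluster is still serving whatever the last successful run deployed.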

5. If flags and deploys match, inspect the customer’s data shape

Under the impersonation token, load the specific entity the customer named (camera, rule, clip). Look for empty arrays, null fields, or boundary values (> 50 items, > 30 days old, size 0). Reproduce the exact shape in staging and re-try.
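A quick way to spot unusual shapes is to walk the entity's JSON and list every null, empty, or oversized field. The helper below is a hypothetical sketch: the entity layout and field names are invented, and the 50-item threshold simply mirrors the example boundary above. Age checks (> 30 days) and zero-size checks are left out because they depend on which fields carry timestamps or sizes.

```python
def find_edge_cases(value, path="$", max_items=50):
    """Recursively collect (path, reason) pairs for null, empty, or oversized values."""
    findings = []
    if value is None:
        findings.append((path, "null"))
    elif isinstance(value, (list, dict, str)) and len(value) == 0:
        findings.append((path, "empty"))
    elif isinstance(value, list):
        if len(value) > max_items:
            findings.append((path, f"{len(value)} items (> {max_items})"))
        for index, item in enumerate(value):
            findings.extend(find_edge_cases(item, f"{path}[{index}]", max_items))
    elif isinstance(value, dict):
        for key, item in value.items():
            findings.extend(find_edge_cases(item, f"{path}.{key}", max_items))
    return findings


# Hypothetical alert rule as loaded under the impersonation token.
rule = {
    "name": "Y",
    "cameras": [f"cam-{n}" for n in range(51)],
    "schedule": None,
    "tags": [],
}

for path, reason in find_edge_cases(rule):
    print(f"{path}: {reason}")
```

Each reported path is a candidate to reproduce in staging; copy the shape, not the customer's data.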

Verify

  1. The failure reproduces under your impersonation token.
  2. You can identify which of the four causes it is.
  3. The customer-side audit log shows your impersonation session with the correct reason (US-PLAT-11).

If none of this worked

  • The cause is most likely a genuine per-customer data bug (cause 4). Open an engineering ticket with the impersonation_id, the customer org ID, and the specific entity ID. Do NOT attach customer data to the ticket; reference IDs only.
  • If you could not reproduce even under impersonation, ask the customer for their browser’s Network HAR file and the X-Request-Id header on a failing request, then search the cloud logs for request_id=<value>.
  • Related: Audit log investigation when the failure is an action the customer took, not a render problem.
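
When the customer sends a HAR file, the request IDs of failing calls can be pulled out with a short script. This sketch relies only on the standard HAR layout (log.entries, each with request and response objects carrying a headers list). It treats status 0 (request never completed) and any status of 400 or above as failing, which is an assumption you may want to tighten. The sample entries and URLs are invented.

```python
def failing_request_ids(har: dict, header_name: str = "x-request-id") -> list:
    """Return (url, status, request_id) for each failing entry that carries the header."""
    results = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        response = entry.get("response", {})
        status = response.get("status", 0)
        if 0 < status < 400:
            continue  # successful or redirected call; not of interest
        # The header may appear on either side, so check request headers first, then response.
        headers = request.get("headers", []) + response.get("headers", [])
        for header in headers:
            if header.get("name", "").lower() == header_name:
                results.append((request.get("url", ""), status, header.get("value", "")))
                break
    return results


# Invented HAR fragment; load a real file with json.load(open(path)).
har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.invalid/api/live", "headers": []},
                "response": {"status": 200, "headers": [{"name": "X-Request-Id", "value": "ok-1"}]},
            },
            {
                "request": {"url": "https://example.invalid/api/tiles", "headers": []},
                "response": {"status": 502, "headers": [{"name": "X-Request-Id", "value": "req-abc"}]},
            },
        ]
    }
}

for url, status, request_id in failing_request_ids(har):
    print(f"{status} {url} request_id={request_id}")
```

A HAR file can contain cookies and authorization headers, so handle it as customer data and keep it out of the ticket.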