Configure the LLM model
Changing the LLM model takes effect immediately for all new events. Events already in flight finish on the previous model. Cost per event changes — sometimes substantially — because providers charge different rates per input image, per input token, and per output token. This page walks you through switching the org-level default; per-camera overrides are documented at the end.
NovaVMS uses a single VLM call per event by default (perception and judgment in one round-trip, per R1-REV). The model you pick here is the one that gets called. See Why NovaVMS uses a single AI call for the rationale.
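Because providers price input images, input tokens, and output tokens separately, the per-event cost is a simple weighted sum. The sketch below illustrates that arithmetic; the rates shown are placeholder values, not any provider's real list prices, and the function name is illustrative rather than part of NovaVMS.

```python
def estimate_event_cost(images: int, input_tokens: int, output_tokens: int,
                        rate_per_image: float, rate_per_1k_in: float,
                        rate_per_1k_out: float) -> float:
    """Estimated cost (USD) of one single-pass VLM call.

    Rates are hypothetical placeholders; substitute your provider's
    current list prices.
    """
    return (images * rate_per_image
            + input_tokens / 1000 * rate_per_1k_in
            + output_tokens / 1000 * rate_per_1k_out)

# Example: 3 frames, 1,200 prompt tokens, 200 output tokens at placeholder rates.
cost = estimate_event_cost(3, 1200, 200,
                           rate_per_image=0.0002,
                           rate_per_1k_in=0.0001,
                           rate_per_1k_out=0.0004)
```

Running the same event volume through this formula with each candidate provider's rates is the quickest way to see how much a model switch will move your bill.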
Procedure
- Open Settings from the left nav.
- Under the Governance group, click AI Model & Cost.
- Pick a Provider from the dropdown. See the supported-providers table below.
- Pick a Model from the filtered dropdown. Only models valid for that provider appear.
- Paste the API key into the masked input. The key is encrypted at rest and never returned in full via the API.
- Click Test connection. NovaVMS sends a small inference request and reports success or the exact error message.
- (Optional) Set a Per-event cost cap under the Cost Controls panel. Any event whose estimated cost exceeds the cap is skipped with `ai_status=skipped_cost`.
- Click Save. The audit log records `ai_config.updated` with the `changes` diff — the key itself is never written to the log.
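The cost-cap behavior described in the optional step can be sketched as a single comparison. This is an illustration of the semantics, assuming the statuses named above; the function and its `"processed"` fallback value are hypothetical, not the NovaVMS internals.

```python
from typing import Optional

def ai_status_for_event(estimated_cost: float, cost_cap: Optional[float]) -> str:
    """Return the ai_status a cost-capped event would receive.

    A cap of None means no cap is configured, so the event always runs.
    Only 'skipped_cost' is taken from the docs; 'processed' is a
    placeholder for the normal path.
    """
    if cost_cap is not None and estimated_cost > cost_cap:
        return "skipped_cost"
    return "processed"
```

Note that the cap is compared against the *estimated* cost before the call is made, so a skipped event incurs no provider charge at all.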
Supported providers
| Provider | Models typical for NovaVMS | Multi-frame (D40) | Notes |
|---|---|---|---|
| Gemini (Vertex AI) | gemini-2.5-flash-lite, gemini-2.5-flash, gemini-2.5-pro | yes | Default. Cheapest per event at current list prices. Best fit for one-pass. |
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4.1 | yes (up to 10 images) | Image-only input; multi-frame works, full-video does not. |
| Qwen | qwen-vl-max, qwen-vl-plus | yes | Alibaba DashScope. Good cost-to-quality trade in APAC deployments. |
| Ollama (local) | llava, bakllava, moondream | varies | For air-gapped deployments. Zero marginal cost, self-hosted hardware required. [Not yet shipped — D60.] |
Rationale and trade-offs for keeping the call single-pass: see Why NovaVMS uses a single AI call.
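The filtered Model dropdown from the procedure behaves like a lookup against a provider-to-models table. A minimal sketch, using the models from the table above; the dictionary keys and helper name are illustrative, not the actual NovaVMS configuration schema.

```python
# Illustrative provider -> model mapping, mirroring the table above.
# Key names are assumptions, not NovaVMS identifiers.
SUPPORTED_MODELS = {
    "gemini": ["gemini-2.5-flash-lite", "gemini-2.5-flash", "gemini-2.5-pro"],
    "openai": ["gpt-4o", "gpt-4o-mini", "gpt-4.1"],
    "qwen": ["qwen-vl-max", "qwen-vl-plus"],
    "ollama": ["llava", "bakllava", "moondream"],  # D60, not yet shipped
}

def models_for(provider: str) -> list:
    """Models shown in the dropdown once a provider is selected."""
    return SUPPORTED_MODELS.get(provider, [])
```

An unknown provider yields an empty list, which matches the empty-dropdown symptom covered in the troubleshooting section.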
Common variations
- Per-camera override. On a specific camera’s detail page, set `AI provider override` and `AI model override`. The camera uses its override; all others use the org default. This is the right place for “this one camera needs premium analysis.”
- Two-pass per prompt pack. The `pass_mode` field on a prompt pack flips a specific pack to two-pass (VLM perception + text LLM judgment). The model you pick on this page is still the VLM half; the judgment LLM uses the same provider unless its prompt pack overrides it.
- Ollama for air-gapped deployments. When D60 ships, select `Ollama` as provider and supply the base URL (default `http://localhost:11434`). The Model dropdown auto-populates via `GET /api/tags`. Not yet implemented in v1.
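The per-camera override resolves with a simple fallback: use the camera's value if set, otherwise the org default. A minimal sketch of that rule, with illustrative names rather than the NovaVMS schema:

```python
from typing import Optional

def effective_model(org_default: str, camera_override: Optional[str]) -> str:
    """Resolve the model used for one camera's events.

    camera_override is None when no per-camera override is configured,
    in which case the org-level default applies.
    """
    return camera_override if camera_override else org_default
```

The same fallback applies independently to the provider and the model, so a camera can override one without the other.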
If this didn’t work
- Test connection returns “Invalid API key”: re-copy the key — trailing whitespace and partial copies are the most common cause.
- Test connection times out: the provider’s endpoint is unreachable from the cloud backend. Check the outbound firewall on the server hosting NovaVMS.
- Model dropdown is empty: you picked a provider that requires an API key before models are listed (OpenAI, Qwen). Paste the key first, then the model list loads.
- Save returns 403: your role is operator or viewer. Model selection is admin-only per D82. See Roles and permissions.
Related
- Why NovaVMS uses a single AI call — R1-REV rationale and cost/latency trade-offs
- Configure prompt packs — the prompt content the model sees (operator-gated per D82)
- Roles and permissions — who can change model vs who can change prompts
- Alert rule schema — how AI outputs feed alert triggers