Configure the LLM model
Changing the LLM model takes effect immediately for all new events. Events already in flight finish on the previous model. Cost per event changes — sometimes substantially — because providers charge different rates per input image, per input token, and per output token. This page walks you through switching the org-level default; per-camera overrides are documented at the end.
NovaVMS uses a single VLM call per event by default (perception and judgment in one round-trip, per R1-REV). The model you pick here is the one that gets called. See Why NovaVMS uses a single AI call for the rationale.
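Because providers price input images, input tokens, and output tokens separately, the per-event cost is a simple weighted sum. The sketch below illustrates that arithmetic; the rates shown are placeholder values, not any provider's real list prices, and the function name is illustrative rather than part of NovaVMS.

```python
def estimate_event_cost(images: int, input_tokens: int, output_tokens: int,
                        rate_per_image: float, rate_per_1k_in: float,
                        rate_per_1k_out: float) -> float:
    """Estimated cost (USD) of one single-pass VLM call.

    Rates are hypothetical placeholders; substitute your provider's
    current list prices.
    """
    return (images * rate_per_image
            + input_tokens / 1000 * rate_per_1k_in
            + output_tokens / 1000 * rate_per_1k_out)

# Example: 3 frames, 1,200 prompt tokens, 200 output tokens at placeholder rates.
cost = estimate_event_cost(3, 1200, 200,
                           rate_per_image=0.0002,
                           rate_per_1k_in=0.0001,
                           rate_per_1k_out=0.0004)
```

Running the same event volume through this formula with each candidate provider's rates is the quickest way to see how much a model switch will move your bill.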
Procedure
- Open Settings from the left nav.
- Under the Governance group, click AI Model & Cost.
- Pick a Provider from the dropdown. See the supported-providers table below.
- Pick a Model from the filtered dropdown. Only models valid for that provider appear.
- Paste the API key into the masked input. The key is encrypted at rest and never returned in full via the API.
- Click Test connection. NovaVMS sends a small inference request and reports success or the exact error message.
- (Optional) Set a Per-event cost cap under the Cost Controls panel. Any event whose estimated cost exceeds the cap is skipped with `ai_status=skipped_cost`.
- Click Save. The audit log records `ai_config.updated` with the `changes` diff — the key itself is never written to the log.
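The cost-cap behavior described in the optional step can be sketched as a single comparison. This is an illustration of the semantics, assuming the statuses named above; the function and its `"processed"` fallback value are hypothetical, not the NovaVMS internals.

```python
from typing import Optional

def ai_status_for_event(estimated_cost: float, cost_cap: Optional[float]) -> str:
    """Return the ai_status a cost-capped event would receive.

    A cap of None means no cap is configured, so the event always runs.
    Only 'skipped_cost' is taken from the docs; 'processed' is a
    placeholder for the normal path.
    """
    if cost_cap is not None and estimated_cost > cost_cap:
        return "skipped_cost"
    return "processed"
```

Note that the cap is compared against the *estimated* cost before the call is made, so a skipped event incurs no provider charge at all.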
Supported providers
| Provider | Models typical for NovaVMS | Multi-frame (D40) | Notes |
|---|---|---|---|
| Gemini (Vertex AI) | gemini-2.5-flash-lite, gemini-2.5-flash, gemini-2.5-pro | yes | Default. Cheapest per event at current list prices. Best fit for one-pass. |
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4.1 | yes (up to 10 images) | Image-only input; multi-frame works, full-video does not. |
| Qwen | qwen-vl-max, qwen-vl-plus | yes | Alibaba DashScope. Good cost-to-quality trade in APAC deployments. |
| Ollama (local) | llava, bakllava, moondream | varies | For air-gapped deployments. Zero marginal cost, self-hosted hardware required. [Not yet shipped — D60.] |
Rationale and trade-offs for keeping the call single-pass: see Why NovaVMS uses a single AI call.
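The filtered Model dropdown from the procedure behaves like a lookup against a provider-to-models table. A minimal sketch, using the models from the table above; the dictionary keys and helper name are illustrative, not the actual NovaVMS configuration schema.

```python
# Illustrative provider -> model mapping, mirroring the table above.
# Key names are assumptions, not NovaVMS identifiers.
SUPPORTED_MODELS = {
    "gemini": ["gemini-2.5-flash-lite", "gemini-2.5-flash", "gemini-2.5-pro"],
    "openai": ["gpt-4o", "gpt-4o-mini", "gpt-4.1"],
    "qwen": ["qwen-vl-max", "qwen-vl-plus"],
    "ollama": ["llava", "bakllava", "moondream"],  # D60, not yet shipped
}

def models_for(provider: str) -> list:
    """Models shown in the dropdown once a provider is selected."""
    return SUPPORTED_MODELS.get(provider, [])
```

An unknown provider yields an empty list, which matches the empty-dropdown symptom covered in the troubleshooting section.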
Common variations
- Per-camera override. On a specific camera’s detail page, set `AI provider override` and `AI model override`. The camera uses its override; all others use the org default. This is the right place for “this one camera needs premium analysis.”
- Two-pass per prompt pack. The `pass_mode` field on a prompt pack flips a specific pack to two-pass (VLM perception + text LLM judgment). The model you pick on this page is still the VLM half; the judgment LLM uses the same provider unless its prompt pack overrides it.
- Ollama for air-gapped deployments. When D60 ships, select `Ollama` as provider and supply the base URL (default `http://localhost:11434`). The Model dropdown auto-populates via `GET /api/tags`. Not yet implemented in v1.
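The per-camera override resolves with a simple fallback: use the camera's value if set, otherwise the org default. A minimal sketch of that rule, with illustrative names rather than the NovaVMS schema:

```python
from typing import Optional

def effective_model(org_default: str, camera_override: Optional[str]) -> str:
    """Resolve the model used for one camera's events.

    camera_override is None when no per-camera override is configured,
    in which case the org-level default applies.
    """
    return camera_override if camera_override else org_default
```

The same fallback applies independently to the provider and the model, so a camera can override one without the other.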
If this didn’t work
- Test connection returns “Invalid API key”: re-copy the key — trailing whitespace and partial copies are the most common cause.
- Test connection times out: the provider’s endpoint is unreachable from the cloud backend. Check the outbound firewall on the server hosting NovaVMS.
- Model dropdown is empty: you picked a provider that requires an API key before models are listed (OpenAI, Qwen). Paste the key first, then the model list loads.
- Save returns 403: your role is operator or viewer. Model selection is admin-only per D82. See Roles and permissions.
Related
- Why NovaVMS uses a single AI call — R1-REV rationale and cost/latency trade-offs
- Configure prompt packs — the prompt content the model sees (operator-gated per D82)
- Roles and permissions — who can change model vs who can change prompts
- Alert rule schema — how AI outputs feed alert triggers