Platform LLM Quotas

DriftWise rate-limits platform-LLM usage per org via two stacked gates: a weekly quota and an hourly rate limit. Requests that use your own provider key (BYOK, bring your own key) bypass both.

How the caps apply

Plan        Weekly quota   Hourly rate limit
Free        5 / ISO-week   unlimited
Team        unlimited      20 / hour
Enterprise  unlimited      contract-negotiated (default unlimited)

A cap of -1 means unlimited: the gate is skipped and the database is not touched. A cap of 0 means hard-off: every call is rejected immediately. Hard-off is used for enterprise accounts with paused contracts.
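As an illustration of the two sentinel values, here is a minimal Python sketch that classifies a cap. The function name gate_mode and the constant names are hypothetical and are not part of the DriftWise API; only the values -1 and 0 come from the text above.

```python
UNLIMITED = -1  # gate skipped entirely, no database access
HARD_OFF = 0    # every call rejected immediately


def gate_mode(cap: int) -> str:
    """Classify a cap value as 'unlimited', 'hard_off', or 'finite'."""
    if cap == UNLIMITED:
        return "unlimited"
    if cap == HARD_OFF:
        return "hard_off"
    if cap > 0:
        return "finite"
    # Any other negative value is treated as a configuration error
    # in this sketch; the real service may behave differently.
    raise ValueError(f"invalid cap: {cap}")
```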

Bucket semantics

  • Weekly: ISO-week, reset at 00:00 UTC each Monday. Resets are absolute — there is no per-org anchor date.
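  • Hourly: a fixed window that resets at the top of each hour, not a sliding one. A burst of 20 at 12:59 followed by 20 more at 13:00 is therefore allowed. This is intentional: the limit exists for runaway protection, and the boundary burst is not a limiter bug.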
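The reset instants for both buckets can be derived from the current time alone. The following Python sketch shows one way to compute them, assuming timezone-aware datetimes; the function names are hypothetical and only mirror the week_resets_at and hour_resets_at fields in the API responses.

```python
from datetime import datetime, timedelta, timezone


def week_resets_at(now: datetime) -> datetime:
    """Next Monday 00:00 UTC strictly after `now` (must be timezone-aware)."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # weekday() is 0 on Monday, so days_ahead is always in 1..7.
    days_ahead = 7 - midnight.weekday()
    return midnight + timedelta(days=days_ahead)


def hour_resets_at(now: datetime) -> datetime:
    """Top of the next UTC hour strictly after the current hour's start."""
    now = now.astimezone(timezone.utc)
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    return top_of_hour + timedelta(hours=1)
```

Because both functions depend only on the clock, every org shares the same boundaries, which matches the "no per-org anchor date" rule.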

Every reserve call runs through a single transaction:

  1. If weekly cap is finite, atomically increment the weekly bucket.
  2. If hourly cap is finite, atomically increment the hourly bucket.
  3. If the hourly gate denies after the weekly increment, the weekly counter is decremented in the same transaction — blocked calls never consume quota.
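The steps above can be sketched with an in-memory stand-in for the database transaction. This is an illustration only: the Buckets class and the reserve function are hypothetical, and the real service performs these increments atomically in a single transaction rather than on a Python object.

```python
from dataclasses import dataclass

UNLIMITED = -1


@dataclass
class Buckets:
    weekly_used: int = 0
    hourly_used: int = 0


def reserve(b: Buckets, weekly_cap: int, hourly_cap: int) -> str:
    """Return 'ok', or the error code of the gate that denied the call."""
    if weekly_cap == 0 or hourly_cap == 0:
        return "plan_hard_off"

    weekly_taken = False
    if weekly_cap != UNLIMITED:
        if b.weekly_used >= weekly_cap:
            return "plan_weekly_quota_exhausted"
        b.weekly_used += 1
        weekly_taken = True

    if hourly_cap != UNLIMITED:
        if b.hourly_used >= hourly_cap:
            # Undo the weekly increment so a blocked call costs nothing.
            if weekly_taken:
                b.weekly_used -= 1
            return "plan_hourly_rate_limit"
        b.hourly_used += 1

    return "ok"
```

With a weekly cap of 5 and an hourly cap of 2, the third call in the same hour is denied by the hourly gate and weekly_used stays at 2.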

HTTP response shapes

402 plan_weekly_quota_exhausted

{
"error": "plan_weekly_quota_exhausted",
"message": "weekly AI analysis quota reached for your plan",
"used": 5,
"cap": 5,
"week_resets_at": "2026-04-20T00:00:00Z",
"byok_config_url": "/api/v2/orgs/<org_id>/llm-config"
}

429 plan_hourly_rate_limit

{
"error": "plan_hourly_rate_limit",
"message": "hourly AI analysis rate limit reached",
"used": 20,
"cap": 20,
"hour_resets_at": "2026-04-17T13:00:00Z",
"byok_config_url": "/api/v2/orgs/<org_id>/llm-config"
}

A Retry-After header (in seconds, rounded up) accompanies every 429.

402 plan_hard_off

Returned for a contract-paused or explicitly disabled org, that is, one with weekly_platform_llm_quota = 0 or hourly_platform_llm_limit = 0. To unblock, configure BYOK or contact support.
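A client can branch on the status code and the error field to decide what to do next. The sketch below is one possible approach in Python; the function name and the returned action labels are hypothetical, and it assumes that the plan_hard_off body carries an error field like the other two responses, which the examples above do not show.

```python
from typing import Optional, Tuple


def next_action(status: int, body: dict, headers: dict) -> Tuple[str, Optional[object]]:
    """Map a quota-related response to an (action, detail) pair."""
    # HTTP header names are case-insensitive, so normalize the keys.
    lowered = {k.lower(): v for k, v in headers.items()}
    error = body.get("error")

    if status == 429 and error == "plan_hourly_rate_limit":
        # Retry-After is given in whole seconds, rounded up by the server.
        return ("retry_after_seconds", int(lowered.get("retry-after", "0")))
    if status == 402 and error == "plan_weekly_quota_exhausted":
        return ("wait_until", body.get("week_resets_at"))
    if status == 402 and error == "plan_hard_off":  # assumed body shape
        return ("configure_byok_or_contact_support", None)
    return ("unhandled", None)
```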

Reading your usage

curl "https://app.driftwise.ai/api/v2/orgs/$ORG_ID/llm-usage" \
-H "x-api-key: $DRIFTWISE_API_KEY"

Response:

{
"weekly_used": 3,
"weekly_cap": 5,
"week_resets_at": "2026-04-20T00:00:00Z",
"hourly_used": 0,
"hourly_cap": -1,
"hour_resets_at": "2026-04-17T13:00:00Z",
"byok_configured": false
}

When byok_configured is true, the caps are advisory — BYOK requests bypass both gates. See LLM Providers for the BYOK setup.
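To turn the usage payload into remaining-call counts, a client has to account for the -1 sentinel and for BYOK. A minimal Python sketch follows; the helper names remaining and summarize are hypothetical, and the field names are taken from the response shown above.

```python
from typing import Optional


def remaining(used: int, cap: int) -> Optional[int]:
    """Calls left under a cap; None means unlimited (cap of -1)."""
    if cap == -1:
        return None
    return max(cap - used, 0)


def summarize(usage: dict) -> dict:
    """Summarize an llm-usage response for display."""
    return {
        # With BYOK configured the caps are advisory and are not enforced.
        "gated": not usage["byok_configured"],
        "weekly_remaining": remaining(usage["weekly_used"], usage["weekly_cap"]),
        "hourly_remaining": remaining(usage["hourly_used"], usage["hourly_cap"]),
    }
```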

Rollback on transient LLM failure

A platform-LLM call that fails with a 5xx response, a timeout, or a network error does not consume a quota slot: DriftWise releases the weekly and hourly reservations atomically before returning the error. 4xx failures (for example, auth issues on a configured BYOK key) do consume the slot and are also recorded against the BYOK failure circuit breaker.
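The release rule can be expressed as a small decision function. The Python sketch below is an illustration under assumptions: the names is_transient and settle are hypothetical, a status of None stands in for a timeout or network error, and the release callback is a placeholder for whatever undoes the reservation in the real service.

```python
from typing import Callable, Optional


def is_transient(status_code: Optional[int]) -> bool:
    """True for 5xx responses and for failures with no HTTP status at all."""
    return status_code is None or 500 <= status_code <= 599


def settle(status_code: Optional[int], release: Callable[[], None]) -> bool:
    """Release the reservation on transient failure.

    Returns True if the quota slot stays consumed, False if it was released.
    """
    if is_transient(status_code):
        release()
        return False
    return True
```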

Breaking change — per-request BYOK removed (April 2026)

The legacy per-request BYOK shape — llm_config as a field on POST /analyze — has been removed. The strict request decoder rejects the field with 400 unknown_field. Migrate to persisted BYOK via PUT /api/v2/orgs/:id/llm-config.
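Clients that still build the old request shape need to drop the field before calling POST /analyze. A minimal Python sketch of that cleanup is shown below; the function name is hypothetical, and the other payload keys in the test are placeholders rather than documented /analyze fields. The credential itself should be moved to the persisted llm-config endpoint, whose request body is not shown here.

```python
def strip_legacy_llm_config(payload: dict) -> dict:
    """Return a copy of an /analyze payload without the removed llm_config field."""
    # The strict decoder rejects unknown fields with 400 unknown_field,
    # so the key must be absent rather than set to null.
    return {key: value for key, value in payload.items() if key != "llm_config"}
```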

Why the change

  • One place to configure — the drift-narrative worker (async) now uses the same BYOK credential as the /analyze handler. Previously the worker only used the platform key, which meant free/team orgs never got BYOK narratives.
  • Quota is per-unit, not per-LLM-call — one /analyze invocation now consumes exactly one quota slot regardless of how many internal LLM calls it triggers.

Security trade-off

Per-request BYOK kept your key ephemeral — it never landed in DriftWise's datastore. Persisted BYOK stores the ciphertext on DriftWise infrastructure next to the encryption key. Mitigations in place:

  • AES-256-GCM envelope encryption.
  • GKE Secrets wrapped by a KMS keyring (prevent_destroy = true).
  • Audit log on every create/update/delete — provider name only, never key material.
  • Hard-delete on revocation; no soft-delete column.
  • Least-privilege on the server's encryption key.

If your security policy forbids any customer key material at rest on our infrastructure, open a ticket. An enterprise contract can negotiate a hybrid ephemeral path — that is a planned follow-up, not a current feature.