Drift Detection

Drift detection compares your live cloud infrastructure against Terraform state to find resources that are missing, unmanaged, or changed. DriftWise generates AI-powered narratives explaining what drifted and why it matters, plus remediation steps to fix it.

How It Works

Cloud Account          State Source          DriftWise
     │                      │                    │
     │──── scan ───────────>│                    │
     │  (live resources)    │                    │
     │                      │──── fetch state ──>│
     │                      │  (IaC resources)   │
     │                      │                    │
     │                      │     compare ───────│
     │                      │                    │
     │                      │     drift items ───│
     │                      │     narrative ─────│
     │                      │     import blocks ─│

  1. Scan — DriftWise reads live resources from your cloud account (AWS, GCP, Azure)
  2. Fetch state — Terraform state is pulled from your configured state sources
  3. Compare — Live resources are matched against IaC resources by cloud-native ID
  4. Classify — Each difference is categorized as missing, extra, or changed
  5. Narrate — An LLM generates a plain-English explanation with risk scoring and remediation
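
The compare and classify steps can be pictured with a minimal sketch. It assumes both sides have already been reduced to dictionaries keyed by cloud-native ID; the function name `classify_drift` and the flat attribute dictionaries are illustrative assumptions, not DriftWise's actual implementation.

```python
def classify_drift(live: dict[str, dict], iac: dict[str, dict]) -> dict[str, list[str]]:
    """Classify resources, keyed by cloud-native ID, into drift buckets.

    `live` and `iac` map a cloud-native ID to that resource's attributes.
    Both shapes are assumptions made for this sketch.
    """
    result: dict[str, list[str]] = {"missing": [], "extra": [], "changed": [], "matched": []}
    for cloud_id in sorted(live.keys() | iac.keys()):
        if cloud_id not in live:
            result["missing"].append(cloud_id)   # in Terraform state, not in the cloud
        elif cloud_id not in iac:
            result["extra"].append(cloud_id)     # in the cloud, not in Terraform state
        elif live[cloud_id] != iac[cloud_id]:
            result["changed"].append(cloud_id)   # in both, attributes differ
        else:
            result["matched"].append(cloud_id)   # in both and in sync
    return result


live = {"i-1": {"type": "t3.micro"}, "sg-1": {"ingress": [443, 22]}, "bucket-x": {}}
iac = {"i-1": {"type": "t3.micro"}, "sg-1": {"ingress": [443]}, "i-2": {"type": "t3.small"}}
print(classify_drift(live, iac))
```

Here `i-2` is missing, `bucket-x` is extra, `sg-1` is changed, and `i-1` is matched.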

Drift Types

| Type | Meaning | Example |
|---|---|---|
| Missing | In Terraform state but not in the cloud | Someone deleted an EC2 instance outside Terraform |
| Extra | In the cloud but not in Terraform state | Shadow IT: a manually created S3 bucket |
| Changed | In both, but attributes differ | Security group rules modified in the console |

Running a Scan

Via the UI

  1. Go to the cloud account you want to scan
  2. Click Scan Now
  3. Wait for the scan to complete (status: pending → running → done)
  4. Click Compute Drift to compare against Terraform state

Via the API

The drift-detection flow is a 4-step sequence:

  1. Start a scan: POST /orgs/:id/scans with a cloud_account_id. Returns 202 with the scan row at status: "pending"; the scan worker picks it up asynchronously.
  2. Poll scan status: GET /orgs/:id/scans/:scan_id. Wait for status: "done" before computing drift.
  3. Compute drift: POST /orgs/:id/scans/:scan_id/drift/compute. Runs inline against the scan's live + IaC data and returns the drift snapshot.
  4. Retrieve drift results later: GET /orgs/:id/scans/:scan_id/drift. Returns the same snapshot; the call is idempotent.

See the scans and drift tags of the API reference for the full contracts.
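
The first three steps of the sequence above can be sketched as a small client function. Several things here are assumptions made for illustration: the `request` callable stands in for whatever authenticated HTTP client you use, the scan row is assumed to expose its identifier as `id`, and any status other than pending, running, or done is treated as a failure. Check the API reference for the actual contracts.

```python
import time
from typing import Callable, Optional

# Hypothetical transport: (method, path, json_body) -> decoded JSON response.
Transport = Callable[[str, str, Optional[dict]], dict]


def run_drift_flow(request: Transport, org_id: str, cloud_account_id: str,
                   poll_interval: float = 5.0, max_polls: int = 120) -> dict:
    """Start a scan, poll until it is done, then compute and return drift."""
    # Step 1: start the scan; the row comes back at status "pending".
    scan = request("POST", f"/orgs/{org_id}/scans", {"cloud_account_id": cloud_account_id})
    scan_id = scan["id"]  # assumption: the scan row's identifier field is "id"

    # Step 2: poll until the scan worker reports "done".
    for _ in range(max_polls):
        status = request("GET", f"/orgs/{org_id}/scans/{scan_id}", None)["status"]
        if status == "done":
            break
        if status not in ("pending", "running"):
            # Assumption: any other status means the scan did not succeed.
            raise RuntimeError(f"scan {scan_id} ended with status {status!r}")
        time.sleep(poll_interval)
    else:
        raise TimeoutError(f"scan {scan_id} not done after {max_polls} polls")

    # Step 3: compute drift inline and return the snapshot.
    return request("POST", f"/orgs/{org_id}/scans/{scan_id}/drift/compute", None)
```

Passing the transport in as a callable keeps the sketch independent of any particular HTTP library and makes the polling logic easy to exercise with a fake.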

Drift Response

{
  "id": "drift-snapshot-uuid",
  "scan_run_id": "scan-uuid",
  "missing_count": 2,
  "extra_count": 1,
  "changed_count": 3,
  "parse_failure_count": 0,
  "matched_count": 41,
  "total_live": 45,
  "total_iac": 47,
  "coverage_pct": 95.7,
  "risk_level": "high",
  "narrative": "## Drift Summary\n\n3 resources have changed...",
  "narrative_status": "done",
  "created_at": "2026-04-21T12:00:00Z",
  "items": [
    {
      "id": "item-uuid",
      "drift_type": "changed",
      "resource_name": "prod-web-sg",
      "resource_type": "AWS::EC2::SecurityGroup",
      "provider": "aws",
      "region": "us-east-1",
      "attribute_diffs": { "ingress": { "old": "...", "new": "..." } }
    },
    {
      "id": "item-uuid",
      "drift_type": "extra",
      "resource_name": "vpc-048783f9c43b52b6b",
      "resource_type": "AWS::EC2::VPC",
      "provider": "aws",
      "region": "us-east-1"
    }
  ]
}

matched_count is the number of live resources that are in sync with IaC (derived as total_live − extra_count − changed_count − parse_failure_count). parse_failure_count surfaces resources whose attributes couldn't be parsed for comparison — non-zero means the scan is partial and operator attention is needed.
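
The derivation above is simple arithmetic; a minimal sketch follows. The helper names `matched_count` and `is_partial` are illustrative and not part of the API.

```python
def matched_count(total_live: int, extra_count: int,
                  changed_count: int, parse_failure_count: int) -> int:
    """Live resources in sync with IaC, derived as described above."""
    return total_live - extra_count - changed_count - parse_failure_count


def is_partial(snapshot: dict) -> bool:
    """A non-zero parse_failure_count means the comparison is partial."""
    return snapshot.get("parse_failure_count", 0) > 0


# 45 live resources with 1 extra, 3 changed, and 0 parse failures.
print(matched_count(total_live=45, extra_count=1, changed_count=3, parse_failure_count=0))  # 41
```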

resource_type on each drift item is the provider-native type string:

  • AWS: AWS::<Service>::<Resource>
  • GCP: <service>.googleapis.com/<Kind> (e.g. compute.googleapis.com/Instance, storage.googleapis.com/Bucket)
  • Azure: Microsoft.<Namespace>/<Resource>
  • Kubernetes: Terraform-style kubernetes_<resource> strings (e.g. kubernetes_deployment, kubernetes_service)

The broad category (compute, storage, network, etc.) that Cloud Discovery assigns lives on the underlying live_resources row as normalized_type and is exposed separately on the resources API. For Kubernetes, normalized_type uses a k8s/<kind> shape (e.g. k8s/deployment).
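
If you need to group drift items by service on the client side, the four resource_type shapes can be split apart with plain string handling. This is a sketch only: the function name `parse_resource_type` and the `(provider, service, kind)` return shape are assumptions, and it handles just the formats described above.

```python
def parse_resource_type(resource_type: str) -> tuple[str, str, str]:
    """Split a provider-native type string into (provider, service, kind)."""
    if resource_type.startswith("AWS::"):
        parts = resource_type.split("::")
        if len(parts) == 3:
            return ("aws", parts[1], parts[2])
    elif ".googleapis.com/" in resource_type:
        service, kind = resource_type.split(".googleapis.com/", 1)
        return ("gcp", service, kind)
    elif resource_type.startswith("Microsoft.") and "/" in resource_type:
        namespace, resource = resource_type[len("Microsoft."):].split("/", 1)
        return ("azure", namespace, resource)
    elif resource_type.startswith("kubernetes_"):
        # Terraform-style Kubernetes strings carry no separate service part.
        return ("kubernetes", "", resource_type[len("kubernetes_"):])
    raise ValueError(f"unrecognized resource_type: {resource_type!r}")


print(parse_resource_type("AWS::EC2::SecurityGroup"))
print(parse_resource_type("compute.googleapis.com/Instance"))
```

Note that a normalized_type value such as k8s/deployment is deliberately rejected, since it is a different field with a different shape.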

Narratives

When an LLM is configured (either the platform LLM or BYOK), drift computation triggers narrative generation:

  • Narrative status progresses from pending to one of three terminal states: done, error, or skipped. A snapshot initially lands at pending; the drift worker picks it up and writes the terminal status in a single update once the LLM call returns (or is gated).
  • skipped: returned in two cases. (1) The org has no BYOK and the platform LLM is gated, for example because the weekly quota is already exhausted for this scan. (2) The org has BYOK configured but the BYOK path failed, for example because the upstream provider returned an error; DriftWise does not silently fall back to the platform LLM. Drift items, counts, and import blocks are still available in both cases. Internally the reason is recorded as one of weekly_quota_exhausted, hourly_rate_limited, or plan_hard_off (platform-LLM gating), or byok_client_error, byok_rate_limited, decrypt_failed, or key_unavailable (BYOK-side failures). The reason is surfaced in the DriftWise UI but is not currently exposed on the API response body.
  • error: the LLM call failed (timeout, 5xx, parse error). Drift data is still available.
  • When there is no drift, the narrative is a fixed message: "No drift detected. Your live infrastructure matches the Terraform state exactly."
  • When there is drift, the narrative is a structured summary covering what changed, risk assessment, and recommended actions.
  • Narratives are best-effort: if generation fails or is skipped, drift data is still available.
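
A client consuming the snapshot might branch on narrative_status along these lines. This is a sketch under assumptions: the function name and the fallback wording are illustrative, and only fields shown in the drift response are read.

```python
def narrative_for_display(snapshot: dict) -> tuple[bool, str]:
    """Return (is_final, text) for a drift snapshot's narrative."""
    status = snapshot.get("narrative_status", "pending")
    if status == "done":
        return True, snapshot.get("narrative", "")
    if status in ("error", "skipped"):
        # Narratives are best-effort: fall back to the counts, which are still present.
        summary = ", ".join(
            f"{snapshot.get(key, 0)} {label}"
            for key, label in (("missing_count", "missing"),
                               ("extra_count", "extra"),
                               ("changed_count", "changed"))
        )
        return True, f"Narrative {status}; drift data is still available: {summary}."
    # Still pending: the caller can re-fetch the snapshot later.
    return False, "Narrative is still being generated."


print(narrative_for_display({"narrative_status": "skipped", "missing_count": 2,
                             "extra_count": 1, "changed_count": 3}))
```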

Import Blocks

For extra (unmanaged) resources, DriftWise generates Terraform import blocks to bring them under management:

import {
  to = aws_vpc.imported_vpc_048783f9c43b52b6b
  id = "vpc-048783f9c43b52b6b"
}

resource "aws_vpc" "imported_vpc_048783f9c43b52b6b" {
  # Placeholder — run 'terraform plan' after import to see actual attributes
}

These blocks appear in the narrative under a "Bring Under Management" section. Copy them into your Terraform config and run terraform plan to see the actual attributes, then fill in the placeholder resource block to match.
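
To make the shape of these blocks concrete, the sketch below renders one from a Terraform resource type and a cloud ID. The label rule (prefix `imported_` and replace characters that are not letters, digits, or underscores) is inferred from the single example above and is an assumption, as is the function name; DriftWise's own generator may differ.

```python
import re


def import_block(terraform_type: str, cloud_id: str) -> str:
    """Render an import block plus a placeholder resource block."""
    # Assumed label rule, inferred from the example: "vpc-0487..." -> "imported_vpc_0487...".
    label = "imported_" + re.sub(r"[^A-Za-z0-9_]", "_", cloud_id)
    return (
        "import {\n"
        f"  to = {terraform_type}.{label}\n"
        f'  id = "{cloud_id}"\n'
        "}\n"
        "\n"
        f'resource "{terraform_type}" "{label}" {{\n'
        "  # Placeholder: run 'terraform plan' after import to see actual attributes\n"
        "}\n"
    )


print(import_block("aws_vpc", "vpc-048783f9c43b52b6b"))
```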

Bulk Scans

POST /orgs/:id/scans/bulk scans multiple cloud accounts at once. Pass account_ids explicitly, or leave it empty to scan every registered account. Bulk scans respect your plan's concurrent-scan limit; exceeding it returns 402 plan_concurrent_limit. The endpoint is rate-limited to 5 requests per minute per org.
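
If you call the bulk endpoint from automation, a client-side limiter can keep you under the per-org rate limit. The sketch below is a generic sliding-window limiter, not part of DriftWise; the class name and the injectable clock are illustrative choices.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `window_seconds` span."""

    def __init__(self, max_calls: int = 5, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock
        self._stamps: deque = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()
        if len(self._stamps) < self._max_calls:
            return 0.0
        return self._window - (now - self._stamps[0])

    def record(self) -> None:
        """Record that a call was just made."""
        self._stamps.append(self._clock())


limiter = SlidingWindowLimiter()
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()  # then issue POST /orgs/:id/scans/bulk with your HTTP client
```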

Provider Scoping

Drift detection is always scoped to a single cloud provider. When a scan runs against an AWS account, only AWS resources are compared — GCP and Azure resources are untouched. This prevents cross-provider contamination in drift results.