How long should an idempotency key TTL be?

Set the TTL to at least your maximum retry window plus the longest expected reconciliation cycle of dependent downstream systems, plus a safety buffer. Payment systems typically need 24–72 hours; real-time inventory reservations can use 15 minutes.

What happens when a TTL expires while a request is still executing?

A subsequent retry bypasses the deduplication guard. Mitigate this by storing a PENDING state flag alongside the key, and coupling TTL expiry with a compensating transaction that marks the orphaned operation FAILED before the key is deleted.

Should I use Redis or PostgreSQL for idempotency key storage?

Redis is preferred when latency is the primary constraint and audit trails are not required; run the idempotency keys on a dedicated instance with maxmemory-policy noeviction, because the policy applies to the whole instance rather than to a single keyspace. PostgreSQL is required when strict durability, long-term retention, or compliance audit logs are mandated; manage cleanup through range partitioning on expires_at.

Idempotency Key Storage & TTL Management

Idempotency key storage is the authoritative deduplication ledger that determines whether an arriving request is a first-time execution or a duplicate that must be short-circuited. Selecting the wrong backend, misconfiguring TTL semantics, or neglecting eviction policies are the most common causes of phantom duplicate charges and stale-replay vulnerabilities in production payment systems. This page covers the lifecycle mechanics — from atomic key registration through expiration, eviction, and cross-region synchronization — for the two dominant storage tiers used in the Backend Implementation & Storage Patterns repertoire.

Problem Framing

Every distributed system that retries requests faces an asymmetry: the client cannot distinguish a network timeout from a server-side failure after state was already mutated. The only reliable safeguard is a durable, time-bounded record that says “this request was already processed.” When that record is absent — because it was evicted too early, never written atomically, or never replicated to the serving region — a retry triggers a second execution and the deduplication guarantee collapses.

The engineering challenge is that the storage layer itself is subject to the same distributed failures it is meant to protect against: cache evictions under memory pressure, replication lag in multi-region deployments, clock drift between nodes, and transaction rollbacks that leave partial state. Managing idempotency key storage correctly means designing for the failure modes of the store, not just the failure modes of the API it guards.

Guarantee Model

A well-configured idempotency key store provides at-most-once execution within the TTL window: any request that arrives with a key already registered as COMPLETED receives the cached response without triggering business logic. Outside the TTL window the guarantee lapses and the request is treated as new.

The guarantee degrades under the following conditions:

Clock skew > 100 ms between nodes causes one node to consider a key expired while another still honors it, opening a replay window equal to the skew magnitude.
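Lazy key expiration: Redis reclaims expired keys passively on access and through periodic active sampling, so logically expired keys can linger in memory. A read on the primary never returns such a key, so the risk is not a false-positive hit but extra memory pressure, which under an evicting policy can push out keys that are still valid.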
Replication lag in active-passive topologies means a secondary node may accept a request whose key the primary has already marked COMPLETED but has not yet propagated.
Rollback of the business transaction after the key was committed separately (in Redis, or in its own database transaction) leaves the key registered as PENDING until a compensating cleanup mechanism runs or the TTL elapses.

Core Algorithm: Key Lifecycle State Machine

The following diagram shows the full state machine for an idempotency key from registration through natural expiration or explicit failure:

The key transitions are:

First request arrives — atomically register the key as PENDING and begin execution.
Execution succeeds — update state to COMPLETED and cache the response body.
Execution fails — update state to FAILED; the client may retry, which transitions back to PENDING only if retry policy permits.
Duplicate arrives in PENDING — return 202 Accepted or a processing indicator; do not re-execute.
Duplicate arrives in COMPLETED — return the cached response verbatim with the original status code.
TTL elapses — key transitions to EXPIRED (deleted or archived); subsequent requests are treated as new.
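
The transitions listed above can be sketched as a lookup table. This is a minimal illustration, not a prescribed implementation: the event names (`first_request`, `retry_permitted`, and so on) are hypothetical labels chosen for this example, and duplicates arriving in PENDING or COMPLETED are omitted because they leave the state unchanged.

```python
from enum import Enum


class KeyState(Enum):
    PENDING = "PENDING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    EXPIRED = "EXPIRED"


# Allowed transitions keyed by (current state, event); None means "no key yet".
# Event names are illustrative assumptions, not part of any library.
TRANSITIONS = {
    (None, "first_request"): KeyState.PENDING,
    (KeyState.PENDING, "execution_succeeded"): KeyState.COMPLETED,
    (KeyState.PENDING, "execution_failed"): KeyState.FAILED,
    (KeyState.FAILED, "retry_permitted"): KeyState.PENDING,
    (KeyState.PENDING, "ttl_elapsed"): KeyState.EXPIRED,
    (KeyState.COMPLETED, "ttl_elapsed"): KeyState.EXPIRED,
    (KeyState.FAILED, "ttl_elapsed"): KeyState.EXPIRED,
}


def next_state(current, event):
    """Return the state after `event`, or raise if the transition is not allowed."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {current} on {event}") from None
```

Rejecting unknown pairs explicitly makes an accidental COMPLETED to PENDING regression fail loudly instead of silently re-executing business logic.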

Implementation Variants

Variant 1 — Redis Ephemeral Store

Redis is the standard choice for sub-millisecond deduplication where retention requirements are measured in hours rather than months. A single SET ... NX EX command registers the key, stores the initial PENDING record, and assigns the TTL in one atomic round-trip. This closes the gap left by separate SETNX + EXPIRE calls, where a crash between the two leaves a key with no TTL.

-- Register key atomically; returns OK on first call, nil on duplicate
SET idem:{tenant}:{key} "{\"status\":\"PENDING\"}" NX EX 86400
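
From application code, the same registration might look like the sketch below. It assumes a client whose `set` method accepts `nx` and `ex` keyword arguments the way redis-py's `Redis.set` does; `FakeRedis` is a hypothetical in-memory stand-in used only so the example runs without a server, and it ignores the TTL.

```python
import json


def register_key(client, tenant, key, ttl_seconds=86400):
    """Try to claim an idempotency key; True means this caller owns the first execution."""
    name = f"idem:{tenant}:{key}"
    value = json.dumps({"status": "PENDING"})
    # With redis-py, set(..., nx=True, ex=ttl) issues SET name value NX EX ttl
    # and returns a truthy value on success or None when the key already exists.
    return bool(client.set(name, value, nx=True, ex=ttl_seconds))


class FakeRedis:
    """Hypothetical in-memory stand-in for demonstration; it does not expire keys."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, nx=False, ex=None):
        if nx and name in self._data:
            return None
        self._data[name] = value
        return True


client = FakeRedis()
first = register_key(client, "acme", "order-123")   # claims the key
second = register_key(client, "acme", "order-123")  # duplicate, rejected
```

Only the caller that receives `True` should proceed to business logic; every other caller should look up the stored record instead.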

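For absolute expiration anchored to the original request timestamp, which avoids cumulative drift across retries, pass a Unix epoch value. On Redis 6.2 or later the EXAT option keeps registration and expiry in one atomic command; a separate SET ... NX followed by EXPIREAT would reintroduce the window in which a key exists without a TTL:

SET idem:{tenant}:{key} "{\"status\":\"PENDING\"}" NX EXAT 1750000000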

To update state atomically after execution completes, use a Lua script so that the read-modify-write sequence is serialized:

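-- KEYS[1]: idempotency key name; ARGV[1]: new serialized record
local key = KEYS[1]
local new_val = ARGV[1]
local current = redis.call('GET', key)
-- A missing key comes back as false (not nil) inside Redis Lua scripts
if not current then
  return redis.error_reply('KEY_MISSING')
end
-- KEEPTTL (Redis 6.0+) preserves the expiry set at registration
redis.call('SET', key, new_val, 'KEEPTTL')
return 'OK'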

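Eviction risk: under memory pressure, Redis LRU/LFU policies may evict valid idempotency keys before their TTL expires. Because maxmemory-policy is an instance-wide setting and cannot be scoped to a logical database, run the idempotency keys on a dedicated Redis instance (or cluster) configured with maxmemory-policy noeviction. The trade-off is increased memory provisioning, and once maxmemory is reached Redis rejects writes with an error instead of evicting, so key registration must treat that error as a failure rather than proceed unguarded. See Redis & Cache-Based Deduplication for full cluster topology guidance.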

Variant 2 — PostgreSQL Persistent Store

When audit trails, compliance retention periods, or TTLs exceeding 72 hours are required, PostgreSQL becomes the idempotency ledger. The unique constraint on (idempotency_key, tenant_id) enforces deduplication at the storage layer, surviving cache outages entirely.

CREATE TABLE idempotency_keys (
  id            BIGSERIAL PRIMARY KEY,
  idempotency_key TEXT        NOT NULL,
  tenant_id       UUID        NOT NULL,
  status          TEXT        NOT NULL DEFAULT 'PENDING'
                                CHECK (status IN ('PENDING','COMPLETED','FAILED')),
  response_body   JSONB,
  response_status SMALLINT,
  created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
  expires_at      TIMESTAMPTZ NOT NULL,
  CONSTRAINT uq_idem_key_tenant UNIQUE (idempotency_key, tenant_id)
);

CREATE INDEX idx_idem_expires ON idempotency_keys (expires_at)
  WHERE status != 'COMPLETED';

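Insert with conflict detection; no explicit lock is required. RETURNING yields a row only when the insert actually happened, so on a conflict the statement returns zero rows and the existing record must be read with a follow-up SELECT: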

INSERT INTO idempotency_keys (idempotency_key, tenant_id, expires_at)
VALUES ($1, $2, now() + INTERVAL '72 hours')
ON CONFLICT (idempotency_key, tenant_id) DO NOTHING
RETURNING id, status, response_body, response_status;
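
A minimal sketch of the insert-then-read pattern follows. It uses Python's built-in `sqlite3` module purely as a stand-in so the example runs without a database server (SQLite 3.24+ accepts the same ON CONFLICT ... DO NOTHING clause); the reduced table and the `register_or_fetch` helper are assumptions made for illustration. Against PostgreSQL the two statements would be the same apart from the placeholder syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE idempotency_keys (
        idempotency_key TEXT NOT NULL,
        tenant_id       TEXT NOT NULL,
        status          TEXT NOT NULL DEFAULT 'PENDING',
        response_body   TEXT,
        UNIQUE (idempotency_key, tenant_id)
    )
    """
)


def register_or_fetch(conn, key, tenant):
    """Return (inserted, status, response_body), inserting the key if it is absent."""
    cur = conn.execute(
        "INSERT INTO idempotency_keys (idempotency_key, tenant_id) VALUES (?, ?) "
        "ON CONFLICT (idempotency_key, tenant_id) DO NOTHING",
        (key, tenant),
    )
    inserted = cur.rowcount == 1
    # Always re-read: on a conflict the insert affects zero rows and says
    # nothing about the state the first request left behind.
    status, body = conn.execute(
        "SELECT status, response_body FROM idempotency_keys "
        "WHERE idempotency_key = ? AND tenant_id = ?",
        (key, tenant),
    ).fetchone()
    return inserted, status, body


first = register_or_fetch(conn, "order-123", "acme")
conn.execute(
    "UPDATE idempotency_keys SET status = 'COMPLETED', response_body = ? "
    "WHERE idempotency_key = ? AND tenant_id = ?",
    ('{"charge":"ok"}', "order-123", "acme"),
)
second = register_or_fetch(conn, "order-123", "acme")
```

The `inserted` flag answers "did I just register this key?", while the re-read answers "what is its current state?"; keeping the two questions separate is what lets a duplicate receive the cached response.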

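Wrap the insert and the business logic in a single transaction so a rollback leaves the key unregistered and retryable. In that arrangement the PENDING row is not visible to other sessions until commit, and a concurrent duplicate insert waits on the unique index until the first transaction commits or rolls back. See Transaction Scoping & Atomic Operations for isolation-level guidance and deadlock avoidance patterns.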

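For cleanup, range-partition by expires_at and drop old partitions rather than running bulk DELETE. This requires declaring the parent table with PARTITION BY RANGE (expires_at), and PostgreSQL requires every primary key and unique constraint on a partitioned table to include the partition key. The schema above would therefore need expires_at added to both constraints, at which point the unique constraint no longer rejects a duplicate key that carries a different expires_at. Resolve that conflict (for example by keeping the deduplication constraint on an unpartitioned table) before adopting partition-based cleanup: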

-- Partition for keys expiring in the current month
CREATE TABLE idempotency_keys_2026_06
  PARTITION OF idempotency_keys
  FOR VALUES FROM ('2026-06-01') TO ('2026-07-01');

-- Drop the previous month's partition in a maintenance window
DROP TABLE idempotency_keys_2026_05;
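
Partitions have to exist before rows that belong to them arrive, so a scheduled job usually creates them ahead of time. The helper below is a hypothetical sketch that only builds the DDL string for one calendar month, following the naming used above; executing it and choosing how far ahead to create partitions are left to the caller.

```python
from datetime import date


def month_partition_ddl(year, month, parent="idempotency_keys"):
    """Build CREATE TABLE ... PARTITION OF for one calendar month of expires_at values."""
    start = date(year, month, 1)
    # The upper bound is exclusive: the first day of the following month.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{parent}_{start:%Y_%m}"
    return (
        f"CREATE TABLE {name}\n"
        f"  PARTITION OF {parent}\n"
        f"  FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


june = month_partition_ddl(2026, 6)
december = month_partition_ddl(2026, 12)
```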

Variant 3 — Hybrid Tier (Redis + PostgreSQL)

High-throughput payment APIs combine both tiers: Redis handles the hot path for the first 24 hours (sub-millisecond lookup), and PostgreSQL persists the canonical record for audit and 7-day replay protection. The two stores cannot share a transaction, so key registration commits the PostgreSQL row together with an outbox entry, and a relay then applies the Redis write. This ordering prevents the Redis write from succeeding while the PostgreSQL write fails; a lookup that misses in Redis falls back to PostgreSQL.
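
The read side of the hybrid tier can be sketched as a read-through lookup. Plain dictionaries stand in for Redis and PostgreSQL here, and backfilling the hot tier on a durable hit is an assumption of this example rather than something the pattern mandates.

```python
def lookup(hot, durable, name):
    """Try the hot tier first, fall back to the durable tier, then backfill."""
    record = hot.get(name)
    if record is not None:
        return record, "hot"
    record = durable.get(name)
    if record is not None:
        hot[name] = record  # backfill so the next duplicate is served from the hot tier
        return record, "durable"
    return None, "miss"


hot = {}
durable = {"idem:acme:order-123": {"status": "COMPLETED"}}

first = lookup(hot, durable, "idem:acme:order-123")
second = lookup(hot, durable, "idem:acme:order-123")
unknown = lookup(hot, durable, "idem:acme:order-999")
```

A real backfill would also carry over the remaining TTL so the hot copy cannot outlive the canonical record.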

Variant Comparison

Variant	Lookup latency	Durability	Max practical TTL	Compliance audit	Eviction risk
Redis only	< 1 ms	Memory (RDB/AOF)	72 h	None	LRU under pressure
PostgreSQL only	2–10 ms	WAL-durable	Unlimited	Full row history	None (partition drop)
Hybrid Redis + PG	< 1 ms (hot), 2–10 ms (cold)	WAL-durable	Unlimited	Full row history	Mitigated by PG fallback

Edge Cases & Failure Scenarios

Failure Scenario	Remediation Steps	Observability Hooks
TTL expires while request is `PENDING` (slow downstream)	Store `expires_at` separately from the TTL; a watchdog job marks orphaned `PENDING` keys `FAILED` and emits a compensating event before cleanup	Alert on `idempotency_pending_age_seconds > TTL * 0.8`; trace span tag `idem.state=PENDING` with duration
Redis evicts a valid key under memory pressure before TTL	Run the idempotency keys on a dedicated Redis instance with `maxmemory-policy noeviction` (the policy is instance-wide); provision memory for peak QPS × avg key size × TTL seconds	Alert on `redis_evicted_keys > 0` for the idempotency instance; `used_memory_rss` vs `maxmemory` ratio
Clock skew between Redis nodes causes split-expiration view	Enforce NTP/PTP synchronization; set alert threshold at 50 ms drift; use `EXPIREAT` with an absolute epoch agreed upon at key creation	`ntp_offset_ms` metric per node; shift a node's system clock in chaos tests to simulate drift (Redis `DEBUG SLEEP` only stalls the server and does not move its clock)
Business transaction rolls back after the key was committed separately, leaving it registered as `PENDING`	Run a background job every 60 s: `UPDATE idempotency_keys SET status='FAILED' WHERE status='PENDING' AND created_at < now() - INTERVAL '5 minutes'`	`idem_orphaned_pending_count` gauge; Postgres `pg_stat_activity` long-running transaction alert
Cross-region replication lag causes secondary to re-accept a completed request	Route all idempotency lookups to the primary region during lag windows; fall back to degraded-mode rejection with `503 Service Unavailable` rather than risking a duplicate	`replication_lag_seconds` per replica; `idem_duplicate_accepted_secondary_total` counter
`ON CONFLICT DO NOTHING` silently drops a retry that should return cached response	Separate the “did I just insert?” from “what is the current state?” — always re-read the row after the upsert regardless of rows-affected count	Log `idem_conflict_suppressed=true` on every zero-rows-affected insert; trace `idem.cache_hit` boolean

Operational Concerns

TTL Window Sizing

TTL must exceed the longest possible retry interval plus the downstream reconciliation period. A concrete formula:

TTL = max_retry_window + downstream_reconciliation_period + safety_buffer

For a payment system with 30-minute retry budgets and hourly bank reconciliation: TTL = 30 min + 60 min + 30 min = 120 min minimum. Most payment teams set 24–72 hours to accommodate dispute windows; real-time inventory reservations use 15 minutes.
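
The worked example can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function and parameter names mirror the terms of the formula and are otherwise arbitrary.

```python
from datetime import timedelta


def minimum_ttl(max_retry_window, downstream_reconciliation_period, safety_buffer):
    """Lower bound on the TTL: the sum of the three terms in the formula."""
    return max_retry_window + downstream_reconciliation_period + safety_buffer


payment_ttl = minimum_ttl(
    max_retry_window=timedelta(minutes=30),
    downstream_reconciliation_period=timedelta(minutes=60),
    safety_buffer=timedelta(minutes=30),
)
print(payment_ttl)  # 2:00:00
```

Treat the result as a floor: business requirements such as dispute windows push the configured TTL well above it.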

Retry Logic & Backoff Fundamentals defines the retry budget and backoff curve that feed into the max_retry_window term above. Critically, exponential backoff with jitter must never schedule a retry beyond the configured TTL boundary.
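
One way to enforce that boundary on the client is to stop scheduling retries once the cumulative wait would reach the TTL. The sketch below uses full-jitter exponential backoff; it assumes the TTL is measured from the first attempt and, as a simplification, counts only the waits between attempts, not the time each request spends in flight. The function name and parameters are illustrative.

```python
import random


def retry_delays(base_seconds, max_attempts, ttl_seconds, rng=random.random):
    """Full-jitter exponential backoff that stops before the cumulative wait reaches the TTL."""
    delays = []
    elapsed = 0.0
    for attempt in range(max_attempts):
        ceiling = base_seconds * (2 ** attempt)
        delay = rng() * ceiling  # full jitter: uniform in [0, ceiling)
        if elapsed + delay >= ttl_seconds:
            break  # this retry would land at or beyond key expiry; give up instead
        delays.append(delay)
        elapsed += delay
    return delays


# Deterministic demo: an rng that always returns 0.5 halves every ceiling.
schedule = retry_delays(base_seconds=1, max_attempts=10, ttl_seconds=60, rng=lambda: 0.5)
```

In the demo the seventh delay (32 s) would bring the total to 63.5 s, past the 60 s TTL, so the schedule stops after six retries.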

Index Strategy

For PostgreSQL, the composite unique index on (idempotency_key, tenant_id) keeps each deduplication lookup to a single B-tree descent (logarithmic in table size) under multi-tenant load. A partial index on expires_at WHERE status != 'COMPLETED' accelerates the cleanup worker without scanning completed rows. Avoid indexing response_body (JSONB): it adds write amplification with no deduplication benefit.

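For Redis, namespace keys consistently as idem:{tenant_id}:{idempotency_key}, where the braces mark template placeholders and are not literal characters. With the whole key hashed, distribution across hash slots in cluster mode stays uniform, and the fixed prefix makes keyspace notifications filterable. Literal braces would be interpreted as a Redis Cluster hash tag and pin all of a tenant's keys to a single slot.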

Memory and Storage Budgeting

Estimate Redis memory: active_keys = QPS × TTL_seconds; memory_bytes = active_keys × avg_key_size_bytes. At 500 requests per second with an 86,400 s TTL and 512-byte average values: 500 × 86,400 × 512 ≈ 22 GB (about 20.6 GiB), before key names and per-key overhead. Size accordingly and monitor used_memory_rss vs maxmemory continuously.

For PostgreSQL, row size is typically 200–400 bytes before JSONB response caching. A 7-day retention at 500 requests per second = 500 × 604,800 × 350 bytes ≈ 106 GB, excluding indexes. Partition monthly and plan archival to cold storage (e.g., S3/R2 via COPY) before dropping partitions for compliance-regulated systems.
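
Both estimates follow the same multiplication, which a small helper makes easy to rerun with your own traffic numbers. This sketch assumes every request carries a distinct key (an upper bound) and ignores key names, per-key overhead, and indexes.

```python
def storage_bytes(requests_per_second, retention_seconds, avg_record_bytes):
    """Upper-bound estimate assuming every request carries a distinct key."""
    active_keys = requests_per_second * retention_seconds
    return active_keys * avg_record_bytes


redis_bytes = storage_bytes(500, 86_400, 512)      # 24 h TTL, 512-byte values
postgres_bytes = storage_bytes(500, 604_800, 350)  # 7-day retention, 350-byte rows

print(f"Redis:      {redis_bytes / 1e9:.1f} GB")
print(f"PostgreSQL: {postgres_bytes / 1e9:.1f} GB")
```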

SRE Alert Thresholds

idem_cache_miss_rate > 0.1% for keys expected to be present during steady traffic: indicates unexpected eviction or lost writes.
idem_duplicate_rejection_rate drops to zero: deduplication may be broken (keys not being registered).
idem_pending_duration_p99 > TTL * 0.5: slow downstream consumers risk expiry mid-execution.
idem_ttl_cleanup_lag_seconds > 300: the cleanup worker is falling behind, causing storage cost overrun.
Redis evicted_keys > 0 on the idempotency instance: the instance is running an evicting maxmemory-policy instead of noeviction.

Middleware Integration

Idempotency validation belongs in a pre-routing middleware or HTTP interceptor that runs before any business logic. The middleware must:

Extract the key from the Idempotency-Key header (fall back to X-Idempotency-Key for legacy clients).
Perform an atomic lookup-and-register against the storage layer.
If the key is COMPLETED, return the cached response verbatim with the original HTTP status code.
If the key is PENDING, return 202 Accepted with a Retry-After header.
If the key is absent, insert it as PENDING and proceed to business logic.
Include X-Idempotency-Expires in every response so clients can implement intelligent backoff.

Return 409 Conflict when a key is registered but the incoming request body hash differs from the original — this signals a client-side bug, not a network retry. See Database Unique Constraints & Upserts for the storage-layer contract backing this conflict detection.
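
The decision logic described in this section can be sketched as a single function. This is an illustration under several assumptions: a plain dictionary stands in for the storage layer (so the lookup-and-register step is not actually atomic), header extraction, `Retry-After`, and `X-Idempotency-Expires` are omitted, and FAILED handling is left out. The names `handle` and `execute` are hypothetical.

```python
import hashlib


def handle(store, key, body, execute):
    """Decide the response for one request; `store` maps key -> record dict."""
    body_hash = hashlib.sha256(body).hexdigest()
    record = store.get(key)
    if record is None:
        # First sighting: register as PENDING, run business logic, cache the result.
        # If execute() raises, the record stays PENDING in this simplified sketch.
        store[key] = {"status": "PENDING", "body_hash": body_hash}
        http_status, response = execute(body)
        store[key].update(status="COMPLETED", response=(http_status, response))
        return http_status, response
    if record["body_hash"] != body_hash:
        return 409, "idempotency key reused with a different request body"
    if record["status"] == "PENDING":
        return 202, "request is still being processed"
    return record["response"]  # COMPLETED: replay the cached response verbatim


store = {}
calls = []


def charge(body):
    calls.append(body)
    return 201, "charged"


first = handle(store, "k1", b'{"amount": 10}', charge)
replay = handle(store, "k1", b'{"amount": 10}', charge)
mismatch = handle(store, "k1", b'{"amount": 99}', charge)
```

The body-hash check runs before the state check so that a mismatched payload is rejected with 409 even while the original request is still PENDING.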

Backend Implementation & Storage Patterns — parent section covering all storage-layer deduplication approaches
Redis & Cache-Based Deduplication — SET NX, Lua scripts, and cluster topology for cache-tier idempotency
Database Unique Constraints & Upserts — PostgreSQL ON CONFLICT patterns and index design for durable deduplication
Transaction Scoping & Atomic Operations — wrapping key registration and business logic in a single atomic transaction boundary
Retry Logic & Backoff Fundamentals — sizing the retry window that feeds into TTL calculations