BackendForFintech

Webhook Reliability: Handling Asynchronous Payment Events Safely

11 min read · Payment Infrastructure · 2024-03-01

Asynchronous payment events (gateway webhooks) are where many MVPs break. Duplicate delivery, out-of-order events, and slow processing cause incorrect balances and failed reconciliations. This note covers deduplication, idempotent processing, and failure handling.

Event ordering problems

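Webhooks can arrive out of order, and any delivery may be a retry of an earlier one. Do not assume that arrival order matches the order in which the events occurred. Design consumers to be idempotent and, if order matters for a given resource, compare sequence numbers or timestamps against the last event applied to that resource and decide explicitly what happens to late events (e.g. reject them or enqueue them for manual review).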
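A minimal sketch of the sequence-number approach, assuming each event carries a `resource_id` and a monotonically increasing `sequence` per resource. Both field names and the `OrderedApplier` class are hypothetical; real gateways differ in what ordering information they expose.

```python
class OrderedApplier:
    """Applies an event only if it is newer than the last one applied
    to the same resource; older events are parked for manual review."""

    def __init__(self):
        self.last_seq = {}       # resource_id -> highest sequence applied
        self.review_queue = []   # late events awaiting a human decision

    def apply(self, event):
        resource = event["resource_id"]   # hypothetical field name
        seq = event["sequence"]           # hypothetical field name
        if seq <= self.last_seq.get(resource, -1):
            # Late event (exact duplicates are assumed to be filtered
            # earlier by deduplication): do not apply it.
            self.review_queue.append(event)
            return "late"
        self.last_seq[resource] = seq
        return "applied"
```

In this sketch an event with sequence 2 that arrives after sequence 3 is not applied and ends up in `review_queue`. State is kept in memory only to keep the example short; a real consumer would persist the last sequence per resource.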

Deduplication strategy

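Store the ids of processed webhooks (or an idempotency key derived from gateway + event id). Before processing, check whether the event has already been seen; if so, return 200 and skip it. Make the check and the insert a single atomic step (e.g. a unique constraint), because two deliveries of the same event can arrive concurrently. Use a TTL or partition by date to bound storage, and keep ids for longer than the gateway's retry window. Never process the same event twice.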
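One way to sketch this is with a table whose primary key is the deduplication key, so that "check and record" is a single atomic insert. The example uses SQLite from the Python standard library as a stand-in for a real store; the table name, the `gateway:event_id` key format, and the function names are assumptions made for illustration.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) the store of seen event keys."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events ("
        " dedup_key TEXT PRIMARY KEY,"
        " seen_at REAL NOT NULL)"
    )
    return conn


def first_time_seen(conn, gateway, event_id, now=None):
    """Atomically record the event; return True only on first sight."""
    key = f"{gateway}:{event_id}"   # hypothetical key format
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (dedup_key, seen_at)"
            " VALUES (?, ?)",
            (key, now),
        )
    # rowcount is 1 if the row was inserted, 0 if the key already existed
    return cur.rowcount == 1


def purge_older_than(conn, max_age_seconds, now=None):
    """TTL cleanup: delete keys older than max_age_seconds."""
    now = time.time() if now is None else now
    with conn:
        cur = conn.execute(
            "DELETE FROM processed_events WHERE seen_at < ?",
            (now - max_age_seconds,),
        )
    return cur.rowcount
```

A handler would call `first_time_seen(...)` and return 200 without further work when it returns `False`. Note that this sketch records the key before any processing happens; in a real system the key and the side effect would usually be committed in the same transaction so that a crash cannot leave an event marked as seen but not applied.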

Idempotent webhook processing

Processing must be deterministic: the same event id must always yield the same side effects (e.g. exactly one ledger entry). Use the event id or a derived key when writing to the ledger or downstream systems. On retry, re-read state and either no-op or re-apply the same outcome.
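As an illustration, the ledger write itself can be keyed on the event id, so that a retry hits a uniqueness constraint instead of creating a second entry. This is a sketch using SQLite; the `ledger` schema and the `apply_payment` and `balance` helpers are hypothetical.

```python
import sqlite3


def open_ledger():
    """Create an in-memory ledger with one row per event id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ledger ("
        " event_id TEXT PRIMARY KEY,"      # one entry per event, enforced by the DB
        " account TEXT NOT NULL,"
        " amount_cents INTEGER NOT NULL)"
    )
    return conn


def apply_payment(conn, event_id, account, amount_cents):
    """Write the ledger entry for an event. Returns True if it was
    written now, False if this event id had already been applied."""
    try:
        with conn:  # single transaction: commit or roll back
            conn.execute(
                "INSERT INTO ledger (event_id, account, amount_cents)"
                " VALUES (?, ?, ?)",
                (event_id, account, amount_cents),
            )
    except sqlite3.IntegrityError:
        return False  # retry of an already-applied event: no-op
    return True


def balance(conn, account):
    """Balance is derived from ledger entries, so it cannot double-count."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM ledger WHERE account = ?",
        (account,),
    ).fetchone()
    return row[0]
```

Calling `apply_payment` twice with the same `event_id` leaves a single ledger row, and the balance reflects the amount once.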

Dead-letter queues

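When processing still fails after retries, move the event to a DLQ. Alert on it and support manual replay or a fix. Do not drop events. Return 2xx to the gateway only after you have durably accepted the event (e.g. written it to your queue). If you acknowledge earlier and then crash, the event is lost, because the gateway will not retry it. If you fail to acknowledge, the gateway will retry, and without deduplication you may process the event twice.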
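The two halves of this flow can be sketched as follows: an HTTP-facing function that acknowledges only after the event is enqueued, and a worker that retries and then dead-letters. The in-memory `deque` and list stand in for a durable queue and DLQ, and all names (`handle_webhook`, `drain`, `max_attempts`) are assumptions for illustration. Backoff between attempts and alerting are omitted.

```python
from collections import deque


def handle_webhook(event, enqueue):
    """Return the HTTP status for the gateway: 200 only once the
    event has been accepted by `enqueue`, otherwise 500."""
    try:
        enqueue(event)
    except Exception:
        return 500  # non-2xx: the gateway is expected to retry later
    return 200


def drain(queue, process, dead_letters, max_attempts=3):
    """Process queued events; after max_attempts failures, move the
    event to dead_letters instead of dropping it."""
    while queue:
        event = queue.popleft()
        last_error = None
        for _ in range(max_attempts):
            try:
                process(event)
                break  # success: stop retrying this event
            except Exception as exc:
                last_error = exc
        else:
            # Loop finished without break: every attempt failed.
            dead_letters.append(
                {"event": event, "error": repr(last_error),
                 "attempts": max_attempts}
            )
```

For example, with `queue = deque()` and `dead = []`, a handler calls `handle_webhook(event, queue.append)` and a worker later calls `drain(queue, process, dead)`. Events that keep failing end up in `dead` together with the last error, ready for inspection and manual replay.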

Replay handling

Support replaying events from a given id or timestamp for reconciliation or recovery. Replay must still be idempotent: replayed events keep their original event ids, so repeated replays do not double-apply.
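A small sketch of timestamp-based replay, assuming raw events are stored with a `received_at` value and that the handler skips event ids it has already applied. The field names, the `replay` function, and the `IdempotentHandler` class are hypothetical.

```python
def replay(event_log, handler, since_ts):
    """Re-deliver stored events received at or after since_ts, oldest
    first, passing the original event (and id) to the handler."""
    count = 0
    for event in sorted(event_log, key=lambda e: e["received_at"]):
        if event["received_at"] >= since_ts:
            handler(event)
            count += 1
    return count


class IdempotentHandler:
    """Applies each event id at most once, so replays are safe."""

    def __init__(self):
        self.applied = set()
        self.balance_cents = 0

    def __call__(self, event):
        if event["id"] in self.applied:
            return  # already applied: replaying it is a no-op
        self.applied.add(event["id"])
        self.balance_cents += event["amount_cents"]
```

Replaying a window that overlaps already-processed events only applies the ones that were missed, and running the same replay a second time changes nothing.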
