Building Reliable Webhooks: Retry Logic and Idempotency

Futureaiit
Dec 10, 2025
11 min read

Webhooks are the backbone of modern integrations, enabling real-time communication between systems. But they are deceptively hard to get right. A webhook that works in development can fail spectacularly in production through lost events, duplicate processing, and cascading failures. At Futureaiit, we have built webhook systems processing millions of events per day. Here is how to build webhooks that are reliable, scalable, and maintainable.

What Makes Webhooks Fragile

Webhooks seem simple: when an event happens, send an HTTP POST to a URL. But this simplicity hides complexity:

  • Network failures: The receiving server might be down, slow, or unreachable
  • Timeouts: The receiver might process slowly, causing the sender to time out
  • Duplicate delivery: Retries can cause the same event to be delivered multiple times
  • Ordering issues: Events might arrive out of order
  • Security risks: Unauthenticated webhooks can be spoofed or replayed

A naive webhook implementation handles none of these issues. Production webhooks must be designed for failure.

1. Implement Exponential Backoff Retries

Networks are unreliable. The receiving server will occasionally be down, overloaded, or timing out. Your webhook system must retry failed deliveries.

Naive Approach (Do Not Do This)

// Send webhook, give up on failure
try {
    await fetch(webhookUrl, { method: 'POST', body: JSON.stringify(event) });
} catch (error) {
    console.error('Webhook failed:', error);
    // Event is lost forever
}

This loses events. If the receiver is down for 10 seconds, all events during that window are gone.

Correct Approach: Exponential Backoff

Retry failed deliveries with increasing delays: 1s, 2s, 4s, 8s, 16s, etc. This gives the receiver time to recover without overwhelming it.

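// Helper used below: resolve after `ms` milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function sendWebhookWithRetry(url, payload, maxRetries = 5) {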
    for (let attempt = 0; attempt < maxRetries; attempt++) {
        try {
            const response = await fetch(url, {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify(payload),
                signal: AbortSignal.timeout(10000) // Abort after 10 seconds (fetch has no `timeout` option)
            });
            
            if (response.ok) return { success: true };
            
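            // Retry on 5xx, plus 408 (request timeout) and 429 (rate limited); other 4xx errors are permanent
            if (response.status >= 500 || response.status === 408 || response.status === 429) {
                throw new Error(`Retryable error: ${response.status}`);
            }
            
            // Permanent failure: record it rather than dropping the event silently
            const clientError = new Error(`Client error: ${response.status}`);
            await logToDeadLetterQueue(url, payload, clientError);
            return { success: false, error: clientError };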
        } catch (error) {
            if (attempt === maxRetries - 1) {
                // Final attempt failed, log to dead letter queue
                await logToDeadLetterQueue(url, payload, error);
                return { success: false, error };
            }
            
            // Exponential backoff: 2^attempt seconds
            const delay = Math.pow(2, attempt) * 1000;
            await sleep(delay);
        }
    }
}

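This ensures transient failures do not lose events. At Futureaiit, we typically retry up to 5 times over 31 seconds before giving up (delays of 1, 2, 4, 8, and 16 seconds). Note that maxRetries in the function above counts total attempts, including the first, so the default of 5 sleeps four times for 15 seconds in total; pass 6 to get the 31 second schedule.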
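To make the schedule concrete, here is a minimal sketch that computes the delays the retry loop above would sleep for a given number of attempts. The names backoffSchedule, withJitter, and total are illustrative and do not come from any library. The jitter step is an optional addition that the code above does not use: randomizing each delay is a common way to keep many failed deliveries from retrying at the same instant.

```javascript
// Delays (in ms) slept between attempts. No sleep follows the final attempt,
// so N attempts produce N - 1 delays.
function backoffSchedule(maxAttempts, baseMs = 1000) {
    const delays = [];
    for (let attempt = 0; attempt < maxAttempts - 1; attempt++) {
        delays.push(Math.pow(2, attempt) * baseMs);
    }
    return delays;
}

// Optional "full jitter": pick a random delay in [0, delayMs).
// `random` is injectable so the behavior can be tested deterministically.
function withJitter(delayMs, random = Math.random) {
    return Math.floor(random() * delayMs);
}

// Sum of all delays in a schedule
const total = (delays) => delays.reduce((sum, d) => sum + d, 0);

console.log(total(backoffSchedule(5))); // four waits: 1000 + 2000 + 4000 + 8000
console.log(total(backoffSchedule(6))); // five waits, adding 16000
```

To apply jitter in the retry loop, the computed delay would be passed through withJitter before sleeping.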

2. Use a Dead Letter Queue

Even with retries, some webhooks will fail permanently (e.g., the receiver is misconfigured or the URL is invalid). Do not lose these events. Store them in a dead letter queue for manual review.

Implementation

After exhausting retries, write failed events to a database table or message queue:

async function logToDeadLetterQueue(url, payload, error) {
    await db.insert('webhook_failures', {
        url,
        payload: JSON.stringify(payload),
        error: error.message,
        timestamp: new Date(),
        retries: 5
    });
    
    // Alert ops team
    await sendAlert('Webhook permanently failed', { url, error });
}

Periodically review the dead letter queue to identify systemic issues (e.g., a receiver that is always down).
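As a rough illustration of such a review, the sketch below counts failures per destination URL so that a receiver that is always down stands out. It works on an in-memory array shaped like the webhook_failures rows above; the sample rows and the failuresByUrl name are hypothetical, and in practice the same grouping would more likely be a query against the database.

```javascript
// Hypothetical rows, shaped like the `webhook_failures` records written above
const failures = [
    { url: 'https://a.example.com/hook', error: 'Server error: 503' },
    { url: 'https://b.example.com/hook', error: 'Client error' },
    { url: 'https://a.example.com/hook', error: 'Server error: 503' },
    { url: 'https://a.example.com/hook', error: 'Server error: 502' }
];

// Count failures per destination URL, most affected destination first
function failuresByUrl(rows) {
    const counts = new Map();
    for (const row of rows) {
        counts.set(row.url, (counts.get(row.url) || 0) + 1);
    }
    return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

for (const [url, count] of failuresByUrl(failures)) {
    console.log(`${count} failed deliveries to ${url}`);
}
```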

3. Make Webhooks Idempotent

Retries can cause duplicate delivery. The receiver might process the same event twice, creating duplicate orders, double charges, or inconsistent state.

Solution: Idempotency Keys

Include a unique event ID in every webhook. The receiver uses this ID to deduplicate events.

Sender:

const event = {
    id: 'evt_1234567890', // Unique event ID
    type: 'order.created',
    data: { orderId: 'order_abc', amount: 100 }
};

await sendWebhookWithRetry(webhookUrl, event);

Receiver:

app.post('/webhook', async (req, res) => {
    const event = req.body;
    
    // Check if we have already processed this event
    const exists = await db.exists('processed_events', { eventId: event.id });
    if (exists) {
        console.log('Duplicate event, ignoring:', event.id);
        return res.status(200).send('OK'); // Acknowledge to stop retries
    }
    
    // Process event
    await processEvent(event);
    
    // Mark as processed
    await db.insert('processed_events', { eventId: event.id, timestamp: new Date() });
    
    res.status(200).send('OK');
});

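This makes duplicate deliveries harmless when they arrive one after another. Note that the check and the insert above are separate steps, so two concurrent deliveries of the same event can both pass the check; a unique constraint on eventId, with the insert done before processing, closes that gap. At Futureaiit, we store processed event IDs for 30 days to handle late retries.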
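One way to picture deduplication with a retention window is a store that claims an event ID in a single step and forgets it after a fixed period. The sketch below is an in-memory illustration only: the ProcessedEventStore class and its claim and release methods are hypothetical names, and a production receiver would more likely rely on a database unique constraint or a key with an expiry in a shared cache.

```javascript
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

class ProcessedEventStore {
    // `now` is injectable so expiry can be tested without waiting
    constructor(ttlMs = THIRTY_DAYS_MS, now = Date.now) {
        this.ttlMs = ttlMs;
        this.now = now;
        this.seen = new Map(); // eventId -> expiry time (ms since epoch)
    }

    // Returns true if the caller claimed the event, false if it is a duplicate.
    // The check and the insert happen in one synchronous step, so two
    // deliveries of the same ID cannot both win within this process.
    claim(eventId) {
        const expiry = this.seen.get(eventId);
        if (expiry !== undefined && expiry > this.now()) return false;
        this.seen.set(eventId, this.now() + this.ttlMs);
        return true;
    }

    // Release a claim when processing fails, so a later retry is not ignored
    release(eventId) {
        this.seen.delete(eventId);
    }
}
```

A receiver built on this would call claim before processing, acknowledge duplicates with a 200, and call release if processing throws.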

4. Verify Webhook Signatures

Unauthenticated webhooks are a security risk. An attacker can send fake events to your webhook endpoint, triggering unintended actions.

Solution: HMAC Signatures

Sign each webhook with a secret key. The receiver verifies the signature before processing.

Sender:

const crypto = require('crypto');

function signWebhook(payload, secret) {
    const signature = crypto
        .createHmac('sha256', secret)
        .update(JSON.stringify(payload))
        .digest('hex');
    return signature;
}

const payload = { id: 'evt_123', type: 'order.created', data: { /* ... */ } };
const signature = signWebhook(payload, 'your_secret_key');

await fetch(webhookUrl, {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        'X-Webhook-Signature': signature
    },
    body: JSON.stringify(payload)
});

Receiver:

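// Assumes `crypto` and `signWebhook` are defined as in the sender example
app.post('/webhook', async (req, res) => {
    const receivedSignature = req.headers['x-webhook-signature'] || '';
    // Caveat: re-serializing the parsed body only matches if both sides serialize
    // identically; verifying over the raw request bytes is more robust
    const expectedSignature = signWebhook(req.body, 'your_secret_key');
    
    // Compare in constant time so response timing does not leak signature bytes
    const received = Buffer.from(receivedSignature, 'utf8');
    const expected = Buffer.from(expectedSignature, 'utf8');
    if (received.length !== expected.length || !crypto.timingSafeEqual(received, expected)) {
        console.error('Invalid webhook signature');
        return res.status(401).send('Unauthorized');
    }
    
    // Signature valid, process event
    await processEvent(req.body);
    res.status(200).send('OK');
});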

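This prevents spoofed webhooks. On its own it does not stop an attacker from replaying a captured, validly signed request; including a timestamp in the signed content and rejecting old timestamps addresses that. Use a strong, randomly generated secret and rotate it periodically.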
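Since replayed requests are listed among the security risks, here is a hedged sketch of one common way to limit them: sign the timestamp together with the raw request body and reject anything outside a tolerance window. The function names sign and verify, the "timestamp.body" layout, and the 5 minute window are assumptions chosen for illustration, not a scheme the examples above define.

```javascript
const crypto = require('crypto');

const TOLERANCE_MS = 5 * 60 * 1000; // assumed replay window: 5 minutes

// Sign "<timestamp>.<rawBody>" so the timestamp cannot be changed
// without invalidating the signature
function sign(rawBody, timestamp, secret) {
    return crypto
        .createHmac('sha256', secret)
        .update(`${timestamp}.${rawBody}`)
        .digest('hex');
}

// Returns true only for a fresh timestamp and a matching signature
function verify(rawBody, timestamp, signature, secret, now = Date.now()) {
    // Reject requests outside the tolerance window (possible replays)
    const ts = Number(timestamp);
    if (!Number.isFinite(ts) || Math.abs(now - ts) > TOLERANCE_MS) return false;

    const expected = Buffer.from(sign(rawBody, timestamp, secret), 'utf8');
    const received = Buffer.from(String(signature), 'utf8');
    // timingSafeEqual throws on length mismatch, so compare lengths first
    return expected.length === received.length &&
        crypto.timingSafeEqual(expected, received);
}
```

The sender would transmit the timestamp in its own header next to the signature, and the receiver would pass the unparsed request body to verify.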

5. Process Webhooks Asynchronously

Webhook receivers should respond quickly (under 5 seconds). If processing takes longer, the sender will time out and retry, causing duplicate processing.

Solution: Queue Based Processing

Acknowledge the webhook immediately, then process it asynchronously via a queue.

app.post('/webhook', async (req, res) => {
    const event = req.body;
    
    // Verify signature
    if (!verifySignature(event, req.headers['x-webhook-signature'])) {
        return res.status(401).send('Unauthorized');
    }
    
    // Push to queue for async processing
    await queue.push('webhook_events', event);
    
    // Acknowledge immediately
    res.status(200).send('OK');
});

// Separate worker processes events from queue
async function processWebhookQueue() {
    while (true) {
        const event = await queue.pop('webhook_events');
        if (!event) {
            await sleep(1000);
            continue;
        }
        
        try {
            await processEvent(event);
        } catch (error) {
            console.error('Failed to process event:', error);
            // Optionally retry or log to dead letter queue
        }
    }
}

This decouples receipt from processing, ensuring fast responses and preventing timeouts.
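The worker above leaves failure handling as a comment. One possible shape for it, sketched under the assumption that a handful of immediate attempts is acceptable, is a small wrapper that retries the handler and hands the event to a dead letter callback when every attempt fails. The processWithRetry name and its handler and deadLetter parameters are hypothetical.

```javascript
// Try `handler(event)` up to `maxAttempts` times. If every attempt throws,
// pass the event and the last error to `deadLetter` and report failure.
async function processWithRetry(event, handler, deadLetter, maxAttempts = 3) {
    let lastError;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            await handler(event);
            return true; // processed successfully
        } catch (error) {
            lastError = error; // remember the failure and try again
        }
    }
    await deadLetter(event, lastError);
    return false;
}
```

In the worker loop, the call to processEvent would be replaced by processWithRetry(event, processEvent, someDeadLetterWriter), where the dead letter writer is whatever storage the receiver already uses for failed events.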

6. Handle Ordering Issues

Webhooks can arrive out of order. An "order.updated" event might arrive before "order.created".

Solution: Timestamp Based Ordering

Include a timestamp in each event. The receiver uses it to order events correctly.

const event = {
    id: 'evt_123',
    type: 'order.updated',
    timestamp: '2026-02-14T07:30:00Z',
    data: { orderId: 'order_abc', status: 'shipped' }
};

// Receiver checks timestamp before applying update
async function processEvent(event) {
    const existing = await db.findOne('orders', { id: event.data.orderId });
    
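    // Compare as dates so the check works whether lastUpdated is stored as a Date or an ISO string
    if (existing && new Date(existing.lastUpdated) > new Date(event.timestamp)) {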
        console.log('Ignoring out of order event:', event.id);
        return; // Newer event already processed
    }
    
    // Apply update
    await db.update('orders', { id: event.data.orderId }, {
        ...event.data,
        lastUpdated: event.timestamp
    });
}
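To see the timestamp check in action without a database, the sketch below applies two events in the wrong order against an in-memory Map. The orders map and the applyEvent name are stand-ins for the db calls above, and the approach assumes the sender's timestamps come from a consistent clock.

```javascript
const orders = new Map(); // orderId -> { ...data, lastUpdated }

// Apply an event only if nothing newer has already been stored.
// Returns true when the event was applied, false when it was stale.
function applyEvent(event) {
    const existing = orders.get(event.data.orderId);
    if (existing && Date.parse(existing.lastUpdated) > Date.parse(event.timestamp)) {
        return false; // a newer event was already processed
    }
    orders.set(event.data.orderId, { ...event.data, lastUpdated: event.timestamp });
    return true;
}

// The update arrives first, the older creation event second
applyEvent({
    id: 'evt_2',
    type: 'order.updated',
    timestamp: '2026-02-14T07:30:00Z',
    data: { orderId: 'order_abc', status: 'shipped' }
});
applyEvent({
    id: 'evt_1',
    type: 'order.created',
    timestamp: '2026-02-14T07:00:00Z',
    data: { orderId: 'order_abc', status: 'pending' }
});

console.log(orders.get('order_abc').status); // the late 'pending' event did not overwrite 'shipped'
```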

7. Monitor and Alert

Webhook failures are silent. Without monitoring, you will not know events are being lost.

Metrics to Track

  • Delivery success rate: Percentage of webhooks delivered successfully
  • Retry rate: How often are retries needed?
  • Dead letter queue size: How many events are permanently failing?
  • Processing latency: How long does it take to process events?

Alerting

Set up alerts for:

  • Success rate drops below 95%
  • Dead letter queue grows beyond threshold
  • Processing latency exceeds SLA

At Futureaiit, we use Datadog and PagerDuty to monitor webhook health and alert on anomalies.
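As a simple illustration of the alert conditions listed above, the sketch below turns raw counters into a success rate and a list of triggered alerts. The webhookHealth function, the shape of its input, and the dead letter and latency thresholds are assumptions made for this example; only the 95% success rate comes from the list above. It is not tied to any particular monitoring product.

```javascript
// Assumed thresholds: 95% is taken from the alert list; the others are placeholders
const DEFAULT_THRESHOLDS = {
    minSuccessRate: 0.95,
    maxDeadLetterSize: 100,
    maxP95LatencyMs: 5000
};

// Evaluate counters collected over some window and return triggered alerts
function webhookHealth(stats, thresholds = DEFAULT_THRESHOLDS) {
    const attempts = stats.delivered + stats.failed;
    // With no traffic there is nothing to alert on, so treat the rate as 100%
    const successRate = attempts === 0 ? 1 : stats.delivered / attempts;

    const alerts = [];
    if (successRate < thresholds.minSuccessRate) {
        alerts.push('success rate below threshold');
    }
    if (stats.deadLetterSize > thresholds.maxDeadLetterSize) {
        alerts.push('dead letter queue above threshold');
    }
    if (stats.p95LatencyMs > thresholds.maxP95LatencyMs) {
        alerts.push('processing latency above threshold');
    }
    return { successRate, alerts };
}
```

Calling webhookHealth on each reporting interval and forwarding any non-empty alerts list to the paging system would cover the three conditions in one place.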

8. Version Your Webhooks

Webhook schemas evolve. Adding fields is safe, but removing or renaming fields breaks receivers.

Solution: API Versioning

Include a version in each webhook. Allow receivers to specify which version they support.

const event = {
    version: 'v2',
    id: 'evt_123',
    type: 'order.created',
    data: { orderId: 'order_abc', amount: 100 }
};

// Receiver specifies supported version when registering webhook
await registerWebhook({
    url: 'https://example.com/webhook',
    events: ['order.created', 'order.updated'],
    version: 'v2'
});

Support multiple versions simultaneously during transitions, then deprecate old versions gradually.
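Supporting several versions at once usually means shaping each payload for the version a subscriber registered with. The sketch below assumes a hypothetical schema change in which v1 called the order value total and v2 renamed it to amount; the serializers table and the serializeFor function are illustrative names, not part of the registration API shown above.

```javascript
// One serializer per supported version. The internal event uses the v2 shape.
// Hypothetical change: v1 exposed `total`, v2 renamed it to `amount`.
const serializers = {
    v1: (event) => ({
        ...event,
        version: 'v1',
        data: { orderId: event.data.orderId, total: event.data.amount }
    }),
    v2: (event) => ({ ...event, version: 'v2' })
};

// Build the payload for one subscriber, based on its registered version
function serializeFor(subscription, event) {
    const serialize = serializers[subscription.version];
    if (!serialize) {
        throw new Error(`Unsupported webhook version: ${subscription.version}`);
    }
    return serialize(event);
}
```

Deprecating a version then amounts to removing its entry from the table once no subscriptions reference it.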

How Futureaiit Can Help

At Futureaiit, we build production grade webhook systems for companies across industries. We can help you:

  • Design reliable webhook architectures: Implement retries, idempotency, and async processing
  • Build webhook infrastructure: Set up queues, dead letter queues, and monitoring
  • Integrate with third party webhooks: Receive and process webhooks from Stripe, Shopify, Twilio, etc.
  • Secure webhook endpoints: Implement signature verification and rate limiting
  • Debug webhook failures: Identify and fix delivery issues
  • Scale webhook processing: Handle millions of events per day reliably

We have built webhook systems processing billions of events, with 99.99% delivery success rates. Our battle tested patterns ensure your webhooks are reliable from day one.

Conclusion

Reliable webhooks require more than just sending HTTP requests. You need retries with exponential backoff, dead letter queues for permanent failures, idempotency to handle duplicates, signature verification for security, async processing to avoid timeouts, timestamp based ordering, comprehensive monitoring, and versioning for schema evolution.

These patterns are not optional. They are the difference between webhooks that work in development and webhooks that work in production at scale.

At Futureaiit, we have built webhook systems for companies processing millions of dollars in transactions. We know the edge cases and have battle tested solutions for every failure mode.

Need help building reliable webhooks? Contact Futureaiit to discuss how we can help you design and implement production grade webhook infrastructure.
