Exponential Backoff: Why attempts⁴ and attempts⁵ Are Not the Same

Your queue job hits a rate limit. You need exponential backoff. But which formula?

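Here are two real patterns I’ve seen coexist in the same codebase, and the differences matter more than you’d think. (Strictly speaking, both are polynomial curves, with the attempt count raised to a fixed power, rather than true exponentials like 2^attempts. They serve the same purpose.)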

Pattern A: attempts⁴ + Jitter

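try {
    // ... call the rate-limited API ...
} catch (RateLimitedException $e) {
    // attempts^4 seconds plus 0 to 40 seconds of random jitter.
    $this->release(($this->attempts() ** 4) + random_int(0, 40));
}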

Pattern B: Base Delay + attempts⁵

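protected function getDelay(): int
{
    // 30-second floor plus attempts^5 seconds; no jitter.
    // ** binds tighter than +, so this is 30 + (attempts ** 5).
    return 30 + $this->attempts() ** 5;
}

// Inside the job's handler:
try {
    // ... call the rate-limited API ...
} catch (RateLimitedException $e) {
    $this->release($this->getDelay());
}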

The Numbers Tell the Story

Let’s calculate the actual delays, in seconds, for the first five attempts:

Pattern A (attempts⁴ + random(0, 40)):

  • Attempt 1: 1–41 seconds
  • Attempt 2: 16–56 seconds
  • Attempt 3: 81–121 seconds (~1.5–2 min)
  • Attempt 4: 256–296 seconds (~4.5–5 min)
  • Attempt 5: 625–665 seconds (~10.5–11 min)

Pattern B (30 + attempts⁵):

  • Attempt 1: 31 seconds
  • Attempt 2: 62 seconds
  • Attempt 3: 273 seconds (~4.5 min)
  • Attempt 4: 1,054 seconds (~17.5 min)
  • Attempt 5: 3,155 seconds (~52 min)
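
The figures above can be reproduced with a short script. This is a minimal sketch in plain PHP, outside any queue framework. The function names `patternADelayRange` and `patternBDelay` are hypothetical helpers introduced here for illustration, and the attempt number is passed in explicitly instead of being read from `$this->attempts()`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: minimum and maximum delay in seconds for Pattern A,
 * i.e. attempts^4 plus 0 to 40 seconds of jitter.
 *
 * @return array{0: int, 1: int}
 */
function patternADelayRange(int $attempt): array
{
    $base = $attempt ** 4;

    return [$base, $base + 40];
}

/** Hypothetical helper: fixed delay in seconds for Pattern B (30 + attempts^5). */
function patternBDelay(int $attempt): int
{
    return 30 + $attempt ** 5;
}

$totalAMin = 0;
$totalAMax = 0;
$totalB = 0;

foreach (range(1, 5) as $attempt) {
    [$min, $max] = patternADelayRange($attempt);
    $b = patternBDelay($attempt);

    // Accumulate the total time spent waiting across all five retries.
    $totalAMin += $min;
    $totalAMax += $max;
    $totalB += $b;

    printf("Attempt %d: A = %d-%d s, B = %d s\n", $attempt, $min, $max, $b);
}

printf("Total wait: A = %d-%d s, B = %d s\n", $totalAMin, $totalAMax, $totalB);
```

Summed over five attempts, Pattern A waits 979 to 1,179 seconds in total (roughly 16 to 20 minutes), while Pattern B waits 4,575 seconds (about 76 minutes).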

When to Use Which

Pattern A is for user-facing work: order processing, payment confirmations, notification delivery, anything where a customer is waiting. The lower exponent (⁴ vs ⁵) keeps retry times reasonable: the fifth attempt waits about 11 minutes instead of about 52. The random jitter prevents thundering herd problems when dozens of jobs fail simultaneously and would otherwise all retry at the exact same second.

Pattern B is for background synchronization: data imports, inventory syncs, report generation, work that can wait. The higher exponent (⁵) creates a much steeper curve that backs off aggressively, reaching about 52 minutes by the fifth attempt. The 30-second floor means even the first retry waits at least 31 seconds, giving the remote service real breathing room.

Why Jitter Matters

Pattern A’s random_int(0, 40) isn’t cosmetic. Without it, if 50 jobs hit a rate limit at the same time, they all retry exactly attempts⁴ seconds later and probably hit the rate limit again. Jitter spreads them across a 40-second window.

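Pattern B skips jitter on the assumption that background jobs are already spread out and don’t pile up the way user-triggered actions do. If your background jobs are dispatched in large batches, that assumption breaks and they need jitter too.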
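
To make the effect of jitter concrete, here is a small simulation of the 50-job scenario. It is only a sketch: `retryOffsets` and `peakPerSecond` are hypothetical names, and it assumes every job fails at the same instant on its second attempt. It compares the worst-case number of jobs retrying in the same second with and without jitter.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical simulation: retry offsets (seconds from now) for a batch of
 * jobs that all failed at the same instant on the same attempt number.
 *
 * @return int[]
 */
function retryOffsets(int $jobs, int $attempt, int $maxJitter): array
{
    $offsets = [];

    for ($i = 0; $i < $jobs; $i++) {
        // Same formula as Pattern A; $maxJitter = 0 disables the jitter.
        $offsets[] = ($attempt ** 4) + random_int(0, $maxJitter);
    }

    return $offsets;
}

/**
 * Largest number of jobs that land on the same second.
 *
 * @param int[] $offsets
 */
function peakPerSecond(array $offsets): int
{
    return max(array_count_values($offsets));
}

$withoutJitter = retryOffsets(50, 2, 0);
$withJitter = retryOffsets(50, 2, 40);

printf("No jitter:   peak %d jobs in one second\n", peakPerSecond($withoutJitter));
printf("With jitter: peak %d jobs in one second\n", peakPerSecond($withJitter));
```

Without jitter, all 50 jobs land on second 16. With jitter they fall somewhere between seconds 16 and 56, so the peak is typically a handful of jobs; the exact figure varies from run to run.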

The Takeaway

Don’t just copy the first backoff formula you find. Think about:

  1. Who’s waiting? Users → lower exponent, faster retries
  2. How many concurrent failures? Many → add jitter
  3. How sensitive is the API? Very → higher exponent, base delay floor

Pick the curve that matches your use case. One size doesn’t fit all, even within the same application.
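
One way to keep both curves in the same codebase without duplicating formulas is a single parameterized helper. This is a sketch under assumptions, not something from the codebase described here: `backoffDelay` and its parameter names are hypothetical, and the three knobs map to the three questions above (exponent, jitter, base delay floor).

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: delay in seconds before the next retry.
 *
 * @param int $attempt   1-based attempt number
 * @param int $exponent  steepness of the curve (4 in Pattern A, 5 in Pattern B)
 * @param int $baseDelay fixed floor in seconds (0 in Pattern A, 30 in Pattern B)
 * @param int $maxJitter upper bound of the random jitter in seconds
 *                       (40 in Pattern A, 0 in Pattern B)
 */
function backoffDelay(int $attempt, int $exponent, int $baseDelay = 0, int $maxJitter = 0): int
{
    if ($attempt < 1 || $exponent < 1 || $baseDelay < 0 || $maxJitter < 0) {
        throw new InvalidArgumentException('Backoff parameters out of range.');
    }

    // int ** int stays an int for realistic attempt counts;
    // random_int(0, 0) simply returns 0, so no jitter is added.
    return $baseDelay + $attempt ** $exponent + random_int(0, $maxJitter);
}

// Pattern A (user-facing work) on the third attempt: between 81 and 121 seconds.
$userFacing = backoffDelay(3, 4, 0, 40);

// Pattern B (background synchronization) on the third attempt: 30 + 243 seconds.
$background = backoffDelay(3, 5, 30);

printf("User-facing: %d s, background: %d s\n", $userFacing, $background);
```

Inside a job, the call would look like `$this->release(backoffDelay($this->attempts(), 4, 0, 40));`, with the exponent, floor, and jitter chosen per job class.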

Daryle De Silva

VP of Technology

11+ years building and scaling web applications. Writing about what I learn in the trenches.
