Circuit Breaker Clinic

Diagnose dependency failures and choose circuit breaker, timeout, fallback, retry, half-open, and bulkhead strategies that reduce blast radius.

Concept: Circuit breakers, timeouts, retries, fallbacks, and dependency isolation
Difficulty: Intermediate
Play time: 6-9 minutes
Path: Production Reliability

Play, get feedback, save local progress, and optionally submit a leaderboard score.

Concept explanation

Outages spread when every caller patiently waits, retries, and shares the same exhausted resources. This clinic asks you to prescribe the resilience treatment before one slow dependency infects the whole request path.

Your local progress

Progress, review history, and best scores are stored in this browser with localStorage.

Leaderboard

Top 10 submitted scores. No account required.

    Learning objectives

    • Choose when to open, half-open, and close a circuit breaker.
    • Design fallbacks that preserve business correctness.
    • Combine timeouts, bounded retries, jitter, and bulkheads to reduce blast radius.

    How to play

    1. Read the dependency failure symptom and business context.
    2. Choose the resilience treatment that protects users without hiding correctness problems.
    3. Use feedback to connect circuit breakers with timeouts, retries, fallbacks, and bulkheads.

    Scoring

    • Correct resilience choices add points and streak bonuses.
    • Incorrect choices explain how failure spreads or correctness breaks.
    • Your best local resilience score is saved at completion.

    Backend concept notes

    Circuit breakers stop a caller from repeatedly waiting on a dependency that is likely unhealthy. They work best with short timeouts, bounded retries, jitter, and safe fallback behavior.
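
    The closed, open, and half-open states described above can be sketched as a small state machine. This is a minimal, single-threaded illustration rather than a production implementation: the class name `CircuitBreaker`, the `failure_threshold` and `reset_timeout` parameters, and the injectable `clock` are all assumptions made for the example. A real breaker would also need locking and would usually track a failure rate over a window instead of a simple count.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    """Hypothetical minimal breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=3, reset_timeout=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open
        self.clock = clock                          # injectable for testing
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, fn):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast instead of waiting on an unhealthy dependency.
                raise CircuitOpenError("dependency presumed unhealthy")
            # Cool-down elapsed: let one probe request through.
            self.state = "half-open"
        try:
            result = fn()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        # A failed probe, or too many failures while closed, opens the circuit.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()

    def _on_success(self):
        # Any success (including a half-open probe) closes the circuit.
        self.state = "closed"
        self.failures = 0
```

    In this sketch the caller is expected to give `fn` its own short timeout, so that a hung dependency surfaces as an exception the breaker can count.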

    Resilience design is not simply keeping pages green. A fallback must preserve business correctness, and bulkheads should stop one dependency from exhausting shared resources.
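
    One common way to build a bulkhead is to cap the number of in-flight calls per dependency, so a slow dependency can fill only its own slots rather than a shared pool. The sketch below assumes a thread-based service and uses a semaphore; the names `Bulkhead` and `BulkheadFullError` are hypothetical and chosen only for this example.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when every slot for this dependency is already in use."""


class Bulkhead:
    """Hypothetical per-dependency concurrency cap."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn):
        # Reject immediately rather than queueing behind a slow dependency.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("no free slot for this dependency")
        try:
            return fn()
        finally:
            self._slots.release()
```

    Giving each dependency its own `Bulkhead` instance means an exhausted slot pool for one of them leaves the others unaffected.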

    Common mistakes

    • Using very long timeouts that hold threads and amplify outages.
    • Counting expected 4xx validation errors as dependency-health failures.
    • Retrying without jitter, budgets, or idempotency.
    • Serving stale data for correctness-critical decisions such as inventory or payments.
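
    Two of these mistakes can be illustrated in a few lines: deciding which responses count against a dependency's health, and retrying with a bound, jitter, and an idempotency check. This is a sketch under stated assumptions. `TransientError`, `counts_against_dependency`, and `retry_with_jitter` are hypothetical names, the 5xx-only rule is a simplification (statuses such as 429 may deserve separate handling), and `max_attempts` is a simple per-call bound rather than a shared retry budget.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx)."""


def counts_against_dependency(status_code):
    """Treat 5xx as dependency-health failures; 4xx are caller errors."""
    return status_code >= 500


def retry_with_jitter(fn, *, idempotent, max_attempts=3, base_delay=0.1,
                      max_delay=2.0, sleep=time.sleep, rng=random.random):
    """Call fn, retrying transient failures with capped exponential backoff."""
    # Non-idempotent operations get exactly one attempt.
    attempts = max_attempts if idempotent else 1
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise  # bound reached: surface the failure
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: sleep a random fraction of the backoff cap so
            # many callers do not retry in lockstep.
            sleep(rng() * cap)
```

    The `sleep` and `rng` parameters are injectable only to make the sketch easy to test; in normal use the defaults apply.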

    FAQ

    Short answers on how this game fits into backend interview and study practice.

    Is a circuit breaker the same as a rate limiter?

    No. Rate limiters cap how much traffic a caller may send, for capacity or fairness. Circuit breakers stop callers from sending requests to a downstream service that appears unhealthy, which protects both the caller and the dependency.

    Should every dependency have the same fallback?

    No. Recommendation data may fall back to cache, but payment or inventory decisions often need conservative failure behavior to preserve correctness.
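
    The contrast between the two kinds of fallback can be sketched as follows. The function names, the `DependencyUnavailable` exception, and the response shapes are all hypothetical; the point is only that non-critical data may degrade to a cached or empty result, while a correctness-critical call reports a conservative failure instead of guessing.

```python
class DependencyUnavailable(Exception):
    """Hypothetical error raised when a dependency call fails or is rejected."""


def get_recommendations(fetch_live, cached):
    """Non-critical data: a stale cache (or an empty list) is acceptable."""
    try:
        return fetch_live()
    except DependencyUnavailable:
        return cached if cached is not None else []


def authorize_payment(charge):
    """Correctness-critical: never assume success when the provider is down."""
    try:
        return {"status": "authorized", "receipt": charge()}
    except DependencyUnavailable:
        # Conservative failure: tell the caller nothing was charged yet.
        return {"status": "unavailable", "retry_later": True}
```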