Caching Strategies

Chapter 1: The Speed Gap - Why Caching Exists

Computers have a dirty secret: they are incredibly fast at thinking (CPU) but incredibly slow at remembering things (Data Access).

Every time your application needs data, it has to fetch it. The speed of that fetch depends entirely on where the data is. The difference is not just "a little bit": it can be millions of times slower.

The Latency Hierarchy:

  • L1 Cache (CPU): ~0.5 nanoseconds (Instant)
  • RAM (Memory): ~100 nanoseconds (Very fast)
  • SSD (Disk): ~150,000 nanoseconds (Slow)
  • Network Call (Database): ~100,000,000 nanoseconds (Eternity!)

🥑 Real World: The Food Analogy

Imagine you're cooking dinner.

  • L1 Cache: Steps you have memorized. (Instant)
  • RAM: Ingredients on the cutting board. (Fast reach)
  • Disk (SSD): Ingredients in the fridge. (Walk across kitchen, open door)
  • Network (Database): Ingredients at the grocery store. (Drive 30 mins, park, shop, drive back)

Caching is the art of buying extra groceries so you don't have to drive to the store for every single onion.

Chapter 2: Designing a Cache - Where Do We Put It?

You can cache data at many layers of your system. Each layer protects the one below it.

2.1 Browser / Client Caching

Location: The user's device.
What: Images, CSS, HTML, API responses.
Benefit: Instant load, zero network traffic.
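
Most of this cache is controlled by HTTP headers rather than code. As a minimal sketch, assuming an ASP.NET Core controller (the route, duration, and payload are illustrative), the [ResponseCache] attribute emits a Cache-Control header that lets the browser reuse the response without calling the server again:

using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/products")]
public class ProductsController : ControllerBase
{
    // Emits "Cache-Control: private, max-age=3600" so the browser can
    // serve this response from its own cache for an hour.
    [HttpGet("{id}")]
    [ResponseCache(Duration = 3600, Location = ResponseCacheLocation.Client)]
    public IActionResult Get(int id) => Ok(new { Id = id, Name = "Sample" });
}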

2.2 CDN (Content Delivery Network)

Location: Servers all over the world, close to users.
What: Static files (images, videos).
Benefit: A user in London doesn't have to fetch an image from a server in New York.

2.3 Application Memory (In-Process)

Location: Inside your running code (RAM).
What: Configuration, frequent lookups.
Pros: Microsecond access.
Cons: If server restarts, data is lost. If you have 10 servers, you have 10 separate caches (inconsistent).
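
A minimal in-process sketch using .NET's IMemoryCache (the key, TTL, and LoadFlagsFromSourceAsync helper are illustrative); GetOrCreateAsync runs the loader only on a miss:

using System;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

public class ConfigService
{
    private readonly IMemoryCache _cache = new MemoryCache(new MemoryCacheOptions());

    public async Task<string> GetFeatureFlagsAsync()
    {
        var flags = await _cache.GetOrCreateAsync("config:feature-flags", async entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5); // TTL
            return await LoadFlagsFromSourceAsync(); // Only runs on a cache miss
        });
        return flags!;
    }

    private Task<string> LoadFlagsFromSourceAsync() =>
        Task.FromResult("{ \"newCheckout\": true }"); // Simulated slow lookup
}

Remember the cons above: every server process holds its own private copy of this cache.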

2.4 Distributed Cache (Redis / Memcached)

Location: A separate, dedicated server.
What: Shared session data, database query results.
Pros: Shared across all web servers. Survives web server restarts.
Cons: Network call required (slower than in-process).
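
A minimal sketch assuming the StackExchange.Redis client and a Redis server on localhost (connection string, key names, and TTL are illustrative):

using System;
using System.Threading.Tasks;
using StackExchange.Redis;

public class SessionCache
{
    // Simplified: real apps share one ConnectionMultiplexer via dependency injection.
    private readonly IDatabase _redis =
        ConnectionMultiplexer.Connect("localhost:6379").GetDatabase();

    // Every web server reads and writes the same keys, so sessions survive restarts.
    public Task StoreAsync(string sessionId, string json) =>
        _redis.StringSetAsync($"session:{sessionId}", json, TimeSpan.FromMinutes(30));

    public async Task<string?> LoadAsync(string sessionId)
    {
        RedisValue value = await _redis.StringGetAsync($"session:{sessionId}");
        return value.HasValue ? value.ToString() : null; // null = miss or expired
    }
}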

Chapter 3: Caching Patterns - How to Handle Data

You can't just "cache everything". You need a strategy for loading and updating data.

3.1 Cache-Aside (Lazy Loading) - Maximum Control

This is the most common pattern. The application is responsible for talking to the cache and the database.

The Flow:

  1. App asks Cache: "Do you have Object X?"
  2. Hit: Cache says "Yes". App uses it. (Done)
  3. Miss: Cache says "No".
    • App asks Database for Object X.
    • App saves Object X to Cache.
    • App returns Object X.

Pros: Only requests what is needed. Fails safe (if cache dies, app talks to DB).
Cons: First request is always slow (Cache Miss). Data can become stale.
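
The flow maps almost line-for-line onto code. A bare-bones cache-aside sketch (no stampede protection yet; Chapter 7 adds that), where FetchUserFromDatabase stands in for a real query:

using System;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

public record User(int Id, string Name);

public class UserService
{
    private readonly IMemoryCache _cache;
    public UserService(IMemoryCache cache) => _cache = cache;

    public async Task<User> GetUserAsync(int id)
    {
        string key = $"user:{id}";

        // 1-2. Ask the cache first; a hit means we're done.
        if (_cache.TryGetValue(key, out User user))
            return user;

        // 3. Miss: ask the database, save the result, return it.
        user = await FetchUserFromDatabase(id);
        _cache.Set(key, user, TimeSpan.FromMinutes(5));
        return user;
    }

    private Task<User> FetchUserFromDatabase(int id) =>
        Task.FromResult(new User(id, "Bob")); // Simulated DB query
}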

3.2 Write-Through - Always Fresh

The application writes to the Cache and the Database at the same time.

Pros: Cache is never stale.
Cons: Writing is slower (two operations). Unused data sits in cache.
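
Continuing the hypothetical UserService sketch from 3.1 (same usings and User record; SaveToDatabase is an assumed helper), a write-through update touches both stores in one operation:

public class UserWriteThroughService
{
    private readonly IMemoryCache _cache;
    public UserWriteThroughService(IMemoryCache cache) => _cache = cache;

    // Write-through: the database and the cache are updated together,
    // so the next read is guaranteed to see the new value.
    public async Task UpdateUserAsync(User user)
    {
        await SaveToDatabase(user);                                    // 1. Durable store first
        _cache.Set($"user:{user.Id}", user, TimeSpan.FromMinutes(5));  // 2. Then the cache
    }

    private Task SaveToDatabase(User user) => Task.CompletedTask; // Simulated DB write
}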

3.3 Write-Behind (Write-Back) - High Performance

The application writes ONLY to the Cache. The Cache asynchronously writes to the DB later.

Pros: Extremely fast writes.
Cons: Dangerous. If the Cache crashes before saving to the DB, data is lost permanently.
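
Write-behind needs a queue between the cache and the database. A minimal sketch using an in-memory Channel as that queue (reusing the User record from 3.1; production systems usually rely on the cache product's own write-behind support or a durable queue):

using System;
using System.Threading.Channels;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

public class UserWriteBehindService
{
    private readonly IMemoryCache _cache;
    private readonly Channel<User> _pendingWrites = Channel.CreateUnbounded<User>();

    public UserWriteBehindService(IMemoryCache cache)
    {
        _cache = cache;
        _ = Task.Run(FlushLoopAsync); // Background writer drains the queue
    }

    // Write goes to the cache immediately; the DB write happens later.
    public async Task UpdateUserAsync(User user)
    {
        _cache.Set($"user:{user.Id}", user, TimeSpan.FromMinutes(5));
        await _pendingWrites.Writer.WriteAsync(user);
    }

    private async Task FlushLoopAsync()
    {
        await foreach (User user in _pendingWrites.Reader.ReadAllAsync())
        {
            await SaveToDatabase(user); // If the process dies before this runs, the write is lost
        }
    }

    private Task SaveToDatabase(User user) => Task.CompletedTask; // Simulated DB write
}

The cons above are visible in the code: anything still sitting in the channel when the process dies never reaches the database.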

Chapter 4: Eviction - Taking Out the Trash

Cache memory is expensive and limited. You cannot store everything forever. When full, you must remove (evict) items.

👕 Real World: The Closet Rule

Your closet is full. To buy a new shirt, you must throw an old one out.

  • LRU (Least Recently Used): "I haven't worn this shirt in 2 years." (Standard for most caches; see the sketch after this list)
  • LFU (Least Frequently Used): "I wore this shirt only once ever."
  • FIFO (First In, First Out): "I bought this shirt first." (Usually bad logic)
  • Random: Close eyes and pick one. (Surprisingly effective in rare cases)
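
LRU is simple enough to build by hand: a dictionary gives O(1) lookup, and a linked list ordered by recency tells you which "shirt" to throw out. A minimal sketch (capacity and types are illustrative; real caches like Redis use cheaper approximations of LRU):

using System.Collections.Generic;

public class LruCache<TKey, TValue> where TKey : notnull
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, LinkedListNode<(TKey Key, TValue Value)>> _map = new();
    private readonly LinkedList<(TKey Key, TValue Value)> _order = new(); // Front = most recent

    public LruCache(int capacity) => _capacity = capacity;

    public bool TryGet(TKey key, out TValue value)
    {
        if (_map.TryGetValue(key, out var node))
        {
            _order.Remove(node);   // Mark as most recently used
            _order.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
        value = default!;
        return false;
    }

    public void Set(TKey key, TValue value)
    {
        if (_map.TryGetValue(key, out var existing))
        {
            _order.Remove(existing); // Replacing an existing entry
            _map.Remove(key);
        }
        else if (_map.Count >= _capacity)
        {
            var oldest = _order.Last!; // Evict the least recently used entry
            _order.RemoveLast();
            _map.Remove(oldest.Value.Key);
        }

        var node = new LinkedListNode<(TKey Key, TValue Value)>((key, value));
        _order.AddFirst(node);
        _map[key] = node;
    }
}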

Chapter 5: Invalidation - The Hardest Problem

"There are only two hard things in Computer Science: cache invalidation and naming things." β€” Phil Karlton

The Problem: You cache a user's profile: Name: "Bob".
Bob updates his name to "Robert" in the database.
The cache still says "Bob". The data is now stale.

Strategy 1: TTL (Time To Live)

Set a timer on every item. "Keep this for 5 minutes".

  • Good: Easy to implement. Self-cleaning.
  • Bad: Data is stale for up to 5 minutes. Is that acceptable?
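
With IMemoryCache, a TTL is just an expiration option on the entry; a sliding expiration (the timer resets on every read) is a common variant. Values here are illustrative:

using System;
using Microsoft.Extensions.Caching.Memory;

var cache = new MemoryCache(new MemoryCacheOptions());

// Absolute TTL: evicted 5 minutes after being written, no matter what.
cache.Set("user:123", "Bob", new MemoryCacheEntryOptions
{
    AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5)
});

// Sliding TTL: evicted 5 minutes after the *last read*, so hot keys stay warm.
cache.Set("session:abc", "token", new MemoryCacheEntryOptions
{
    SlidingExpiration = TimeSpan.FromMinutes(5)
});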

Strategy 2: Explicit Deletion

When code updates the DB, it forces the cache to delete that specific key, e.g. cache.Remove("user:123").

  • Good: Data is usually fresh.
  • Bad: Easy to forget. Race conditions can still happen: a read that started before the update can re-fill the cache with the old value just after you delete it.
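
Continuing the cache-aside UserService sketch from 3.1 (UpdateUserInDatabase is an assumed helper), the delete happens right after the database write, which keeps that race window small but not zero:

public async Task RenameUserAsync(int id, string newName)
{
    await UpdateUserInDatabase(id, newName); // 1. Write the truth to the database
    _cache.Remove($"user:{id}");             // 2. Then drop the stale cached copy
    // The next read misses, fetches "Robert" from the DB, and re-fills the cache.
}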

Chapter 6: The "Thundering Herd" (Cache Stampede)

This is a legendary way to crash a system.

🐘 Real World: Black Friday Doorbuster

A popular TV is cached. It expires at 8:00:00 AM.

At 8:00:01, 5,000 users request the TV details.

  • User 1 checks cache → Miss. Goes to DB.
  • User 2 checks cache → Miss. Goes to DB.
  • ...
  • User 5000 checks cache → Miss. Goes to DB.

The Database receives 5,000 heavy queries instantly and explodes.

The Fixes

  • Locking: User 1 puts a "lock" on the key. Users 2-5000 wait for User 1 to finish. The cache is built once. (Chapter 7 shows this in code.)
  • Probabilistic Early Expiration (Jitter): add a small random offset to each item's TTL so keys cached at the same time don't all expire at the same time (see the sketch below).
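
A minimal jitter sketch; the base TTL and jitter range are illustrative, and the helper name is made up:

using System;

public static class CacheTtl
{
    // Adds 0 to maxJitter of random extra time, so keys cached in the same
    // burst don't all expire in the same second.
    public static TimeSpan WithJitter(TimeSpan baseTtl, TimeSpan maxJitter) =>
        baseTtl + TimeSpan.FromSeconds(Random.Shared.NextDouble() * maxJitter.TotalSeconds);
}

// Usage:
// _cache.Set(key, value, CacheTtl.WithJitter(TimeSpan.FromMinutes(10), TimeSpan.FromSeconds(60)));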

Chapter 7: C# Implementation Example

Using IMemoryCache with a "Mutex" to prevent stampedes (simplified).


using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

public class ProductService
{
    private readonly IMemoryCache _cache;
    private readonly SemaphoreSlim _lock = new(1, 1); // Allows 1 thread at a time

    public ProductService(IMemoryCache cache) => _cache = cache;

    public async Task<Product> GetProductAsync(int id)
    {
        string key = $"product:{id}";

        // 1. Try Cache
        if (_cache.TryGetValue(key, out Product cachedItem))
        {
            return cachedItem;
        }

        // 2. Cache Miss - Wait for Lock (Prevent Stampede)
        await _lock.WaitAsync();
        try
        {
            // 3. Double-Check Cache (Someone might have filled it while we waited)
            if (_cache.TryGetValue(key, out cachedItem))
            {
                return cachedItem;
            }

            // 4. Actually fetch from DB (Simulated)
            Product dbItem = await FetchFromDatabase(id);

            // 5. Save to Cache (with a 10-minute TTL)
            _cache.Set(key, dbItem, TimeSpan.FromMinutes(10));

            return dbItem;
        }
        finally
        {
            _lock.Release();
        }
    }

    private Task<Product> FetchFromDatabase(int id) =>
        Task.FromResult(new Product(id, $"Product {id}")); // Simulated slow DB query
}

public record Product(int Id, string Name);

Chapter 8: Summary Checklist

Before you optimize:

  • [ ] Don't Prematurely Cache. Only cache what is slow or frequently accessed.
  • [ ] Always set a TTL. Never let cache grow forever.
  • [ ] Use Cache-Aside as your default pattern.
  • [ ] Handle failures. If Redis is down, your app should work (just slower).
  • [ ] Monitor Cache Hit Rate. If Hit Rate is 5%, your cache is useless. Aim for >80%.

Quick Review

Caching trades memory for speed by storing frequently accessed or expensive-to-compute data in a faster layer, allowing us to reduce latency and downstream load when staleness is controlled.

✅ Where caches live (closest to the user → farthest)

  • Client/Browser: avoids network calls entirely.
  • CDN: caches static content near users.
  • In-process: fastest, but per-instance (not shared across servers).
  • Distributed (Redis): shared across servers, network hop added.

✅ Patterns (how data moves)

  • Cache-Aside: app reads cache, falls back to DB, then fills cache (default choice).
  • Write-Through: write goes to cache and storage together (fresh reads, slower writes).
  • Write-Behind: write goes to cache first, storage later (fast writes, riskier consistency).

✅ The two hard problems

  • Eviction: what to remove when memory is full (LRU is common).
  • Invalidation: how to avoid serving stale data (TTL, explicit delete, or both).

✅ Failure modes to remember

  • Stampede: many misses at once → protect with locks, single-flight, and/or TTL jitter.
  • Unbounded growth: always set TTL/size limits or the cache becomes the outage.
  • Low hit rate: means poor keys/TTL/selection; measure hit rate before celebrating.