An async, high-performance cache server written in Rust. It acts as a read-through cache tier sitting between your web server and database — absorbing repeated reads, reducing database load, and returning responses from memory in microseconds instead of milliseconds.
```
Client
  │
  ▼  (rate-limited per IP, Cache-Control aware)
Axum HTTP Server
  │
  ├── L1: moka in-memory cache
  │       TinyLFU eviction · bounded capacity · per-entry TTL
  │
  ├── L2: Redis cache (optional)
  │       shared across processes and survives restarts
  │
  └── Database (MockDatabase or sqlx::PgPool)
```
Every GET /get/:key request follows this path:
```
1. Check L1 (in-memory moka cache)
   └─ HIT → return value, source: "l1"
2. Check L2 (Redis, if configured)
   └─ HIT → backfill L1 → return value, source: "l2"
3. Query database
   ├─ FOUND     → populate L1 + L2 → return value, source: "database"
   └─ NOT FOUND → 404
```
Every POST /set writes to the database first (source of truth), then updates both L1 and L2. The cache is always a mirror, never the authority.
DELETE /del/:key removes the key from both L1 and L2 simultaneously.
- **L1 — `moka::future::Cache`**: async, lock-free, size-bounded in-memory cache using the TinyLFU eviction policy (the same algorithm as Java's Caffeine/Guava). Capacity and TTL are configurable; when L1 is full, the least-frequently-used entries are evicted automatically.
- **L2 — Redis**: optional shared cache layer that survives process restarts and is visible to multiple instances. On an L2 hit the value is backfilled into L1 so subsequent reads are served entirely from memory. Uses `redis::aio::ConnectionManager` for async, auto-reconnecting access.
Incoming Cache-Control request headers are parsed and respected:
| Directive | Behaviour |
|---|---|
| `no-cache` | Bypass L1/L2, always revalidate with database |
| `no-store` | Bypass cache on read; do not populate cache on response |
| `max-age=0` | Treated as `no-cache` |
Every /get response includes:
- `Cache-Control: public, max-age=<ttl>` — tells browsers and upstream CDNs how long to cache
- `X-Cache: l1 | l2 | database` — indicates which layer served the response
All routes are protected by a token bucket rate limiter (tower_governor) applied as a Tower middleware layer. Requests exceeding the configured rate return 429 Too Many Requests before any handler logic runs. Burst capacity is set to 5× the per-second rate to absorb short spikes.
All events are emitted via tracing with structured fields (key, layer, TTL). Log level is controlled at runtime via RUST_LOG — no recompile needed.
CTRL+C triggers a clean drain: the server stops accepting new connections and waits for in-flight requests to complete before exiting.
All tunables are read from environment variables (.env file supported via dotenvy):
| Variable | Default | Description |
|---|---|---|
| `PORT` | `3000` | TCP port to bind |
| `REDIS_URL` | (empty) | Redis connection string; omit to run L1-only |
| `CACHE_MAX_CAPACITY` | `10000` | Maximum number of L1 entries before eviction |
| `CACHE_DEFAULT_TTL_SECS` | `300` | Default TTL (seconds) when `ttl_secs` is omitted |
| `RATE_LIMIT_RPS` | `100` | Max requests per second per IP |
| Method | Path | Description |
|---|---|---|
| GET | `/get/:key` | Read-through cache lookup |
| POST | `/set` | Write-through upsert `{key, value, ttl_secs?}` |
| DELETE | `/del/:key` | Invalidate key from all cache layers |
| GET | `/stats` | Cache performance counters |
| GET | `/health` | Liveness probe — always returns `{"status":"ok"}` |
```
cache-tier/
├── Cargo.toml
├── .env.example
└── src/
    ├── main.rs     — Tokio runtime, Axum router, Redis init, graceful shutdown
    ├── config.rs   — Config loaded from environment variables
    ├── state.rs    — Shared AppState (Arc-wrapped cache, db, config)
    ├── cache.rs    — CacheStore: moka L1 + Redis L2 + atomic stats
    ├── db.rs       — MockDatabase (drop-in replacement with sqlx/PgPool)
    ├── handlers.rs — Route handlers (read-through + write-through logic)
    ├── headers.rs  — Cache-Control request parsing + response header builder
    └── error.rs    — AppError → HTTP response mapping via thiserror
```
```sh
# 1. Copy example config
cp .env.example .env

# 2. (Optional) start a local Redis instance
docker run -d -p 6379:6379 redis:alpine

# 3. Run the server
cargo run
```

```sh
# Write a key with a 60-second TTL
curl -X POST http://localhost:3000/set \
  -H 'Content-Type: application/json' \
  -d '{"key":"user:1","value":"Alice","ttl_secs":60}'

# Read — first call hits the database, second call hits the L1 cache
curl http://localhost:3000/get/user:1

# Force a fresh database read (bypass cache)
curl -H 'Cache-Control: no-cache' http://localhost:3000/get/user:1

# Read without caching the response
curl -H 'Cache-Control: no-store' http://localhost:3000/get/user:1

# Invalidate a key from all cache layers
curl -X DELETE http://localhost:3000/del/user:1

# View hit/miss statistics
curl http://localhost:3000/stats
# → {"hits_l1": 42, "hits_l2": 3, "misses": 5, "l1_entries": 10}

# Liveness probe (for Kubernetes, load balancers, etc.)
curl http://localhost:3000/health
# → {"status": "ok"}
```

Replace MockDatabase in src/db.rs with an sqlx::PgPool — the get / set async interface is identical, so no other files need changing:
```rust
pub struct Database { pool: sqlx::PgPool }

impl Database {
    pub async fn get(&self, key: &str) -> Result<Option<String>, AppError> {
        sqlx::query_scalar("SELECT value FROM kv WHERE key = $1")
            .bind(key)
            .fetch_optional(&self.pool)
            .await
            .map_err(|e| AppError::Database(e.to_string()))
    }
}
```