Scaling Part 2
In Part 1 we covered the basics: horizontal scaling, load balancers, and statelessness. Now let’s go deeper into CDNs, database scaling, and finding performance bottlenecks.
CDNs (Content Delivery Networks)

A CDN is a network of servers spread across the globe. Instead of every request going to your origin server in Virginia, static content gets served from the nearest CDN edge location.
- User in Mumbai: served from Mumbai edge (30ms)
- User in Tokyo: served from Tokyo edge (20ms)
- Without a CDN: both go to Virginia (200ms+)
What to put on a CDN:
- Static files (images, CSS, JS, fonts)
- API responses that don’t change often (with cache headers)
- Video/media content
What NOT to put on a CDN:
- User-specific data
- Real-time data
- Anything that changes per-request
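The split above can be expressed as a rule for the `Cache-Control` response header, which is what tells a CDN whether it may store a response. This is a minimal sketch, assuming hypothetical path prefixes (`/static/`, `/api/public/`) and illustrative lifetimes; real values depend on your app and your CDN.

```python
def cache_control_for(path: str, user_specific: bool = False) -> str:
    """Pick a Cache-Control header value for a response.

    The path prefixes and lifetimes below are illustrative assumptions.
    """
    if user_specific:
        # Per-user or per-request data must never be stored by a shared cache.
        return "private, no-store"
    if path.startswith("/static/"):
        # Fingerprinted static assets can be cached for a long time.
        return "public, max-age=31536000, immutable"
    if path.startswith("/api/public/"):
        # Rarely-changing API responses: short browser lifetime,
        # longer lifetime for shared caches such as a CDN (s-maxage).
        return "public, max-age=60, s-maxage=300"
    # Default: caches must revalidate with the origin before reusing a copy.
    return "no-cache"


print(cache_control_for("/static/app.3f9a1c.js"))
print(cache_control_for("/api/public/categories"))
print(cache_control_for("/api/me", user_specific=True))
```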
Database Scaling Strategies

Read replicas: One primary database handles all writes. Multiple replicas handle reads. Since many web apps are read-heavy (often 90%+ reads), this takes most of the query load off the primary.
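One way to picture the primary/replica split is a small router that decides where each statement goes. This is a sketch only: the `ReplicaRouter` class and the server names are hypothetical, and the "starts with SELECT" check is a deliberately naive way to classify reads.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and spread reads across replicas (round-robin)."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # Fall back to the primary if no replicas are configured.
        self._reads = itertools.cycle(replicas or [primary])

    def target_for(self, sql: str) -> str:
        # Naive classification: treat only plain SELECTs as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._reads) if is_read else self.primary


router = ReplicaRouter("primary-db", ["replica-1", "replica-2"])
print(router.target_for("SELECT * FROM posts"))           # replica-1
print(router.target_for("SELECT * FROM users"))           # replica-2
print(router.target_for("INSERT INTO posts VALUES (1)"))  # primary-db
```

With asynchronous replication, replicas can lag slightly behind the primary, so a read that must see a write that just happened is usually sent to the primary instead.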
Partitioning (sharding): Split your data across multiple databases. For example, users A-M go to DB1 and users N-Z go to DB2. Each database holds less data, so queries run faster. The trade-off is that queries spanning more than one shard become hard.
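The routing decision itself is a small function. The sketch below shows the A-M / N-Z range split from the text, plus a hash-based variant that is a common alternative (not something the text prescribes) when name ranges would be uneven. The function names and the `SHARDS` list are made up for illustration.

```python
import hashlib

SHARDS = ["DB1", "DB2"]


def shard_by_range(username: str) -> str:
    """Range-based routing: names starting with A-M go to DB1, the rest to DB2."""
    first = username[:1].upper()
    # Anything outside A-M (N-Z, digits, empty names) falls through to DB2.
    return "DB1" if "A" <= first <= "M" else "DB2"


def shard_by_hash(user_id: str) -> str:
    """Hash-based routing: tends to spread keys more evenly than name ranges."""
    # Use a stable hash; Python's built-in hash() for strings is randomized
    # per process, so it would route the same key differently after a restart.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


print(shard_by_range("alice"))  # DB1
print(shard_by_range("zoe"))    # DB2
print(shard_by_hash("user-42"))
```

Either way, a query such as "all users who signed up today" has to visit every shard, which is the cross-shard cost mentioned above.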
Connection pooling with PgBouncer: Even with a pool in your app, if you have 50 app servers each with 20 connections, that’s 1000 connections to your database. PgBouncer sits between apps and the DB, multiplexing hundreds of app connections into a few dozen real DB connections.
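To make the multiplexing idea concrete, here is a toy in-process model. It is not PgBouncer and uses no real database: `TinyPooler` is a hypothetical class in which many client threads borrow one of a few "real" connections only for the duration of a short piece of work.

```python
import queue
import threading
import time

APP_SERVERS = 50
CONNS_PER_SERVER = 20
# Without a pooler, every app-side connection is a real database connection.
print(APP_SERVERS * CONNS_PER_SERVER)  # 1000


class TinyPooler:
    """Toy stand-in for a connection pooler."""

    def __init__(self, real_connections: int):
        self._free = queue.Queue()
        for i in range(real_connections):
            self._free.put(f"conn-{i}")
        self._lock = threading.Lock()
        self._in_use = 0
        self.peak_in_use = 0

    def run(self, work):
        conn = self._free.get()  # blocks until a real connection is free
        with self._lock:
            self._in_use += 1
            self.peak_in_use = max(self.peak_in_use, self._in_use)
        try:
            return work(conn)
        finally:
            with self._lock:
                self._in_use -= 1
            self._free.put(conn)  # hand the connection to the next client


def simulate(clients: int, real_connections: int) -> int:
    """Run many short 'transactions' and return the peak real connections used."""
    pooler = TinyPooler(real_connections)

    def short_transaction(conn):
        time.sleep(0.001)  # stand-in for a quick query

    threads = [
        threading.Thread(target=pooler.run, args=(short_transaction,))
        for _ in range(clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pooler.peak_in_use


print(simulate(clients=200, real_connections=10) <= 10)  # True
```

The database only ever sees the small pool, while clients that arrive when every connection is busy simply wait their turn.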
Finding Performance Bottlenecks
Before optimizing, MEASURE. Don’t guess where the slowness is.
APM tools (Application Performance Monitoring):
- Track every request’s journey through your system
- Show you exactly which function/query is slow
- Examples: Datadog, New Relic, Jaeger (distributed tracing)
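Before reaching for a full APM tool, the same idea can be sketched by hand: time labelled sections of a request and see which one dominates. The `timed` helper and the labels below are hypothetical, and the `sleep` calls stand in for real work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)  # label -> list of durations in seconds


@contextmanager
def timed(label: str):
    """Record how long the wrapped block takes under the given label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label].append(time.perf_counter() - start)


def slowest(n: int = 3):
    """Return the n labels with the highest total time, slowest first."""
    totals = {label: sum(values) for label, values in timings.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


with timed("db.load_posts"):
    time.sleep(0.05)   # stand-in for a slow query
with timed("render"):
    time.sleep(0.001)  # stand-in for cheap work

print(slowest())
```

APM tools do this automatically across services and add request tracing on top, which is why they are the better choice once a system grows past a single process.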
Database slow query logs:
- PostgreSQL can log any query taking more than X milliseconds (the `log_min_duration_statement` setting)
- These are your first optimization targets
- Often the cause is a missing index
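Once the log points at a slow query, the query plan shows whether an index is being used. The sketch below uses SQLite from Python's standard library purely so it runs anywhere (PostgreSQL's counterpart is `EXPLAIN` / `EXPLAIN ANALYZE`); the `posts` table and index name are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO posts (author_id, title) VALUES (?, ?)",
    [(i % 100, f"post {i}") for i in range(10_000)],
)


def plan(sql: str) -> str:
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row is the human-readable plan detail.
    return "; ".join(row[-1] for row in rows)


query = "SELECT * FROM posts WHERE author_id = 42"

before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_posts_author_id ON posts (author_id)")
after = plan(query)   # the planner can now look rows up through the index

print(before)
print(after)
```

The exact plan wording varies between SQLite versions, but the first plan reports a scan of `posts` and the second names `idx_posts_author_id`.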
The N+1 Query Problem: One of the most common performance killers in ORM-based code. You load 100 posts, then for each post you run a separate query to load its author. That’s 101 queries instead of 2 (one for the posts, one for all their authors).
Fix: eager loading / JOINs.
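The difference is easy to see by counting queries. This sketch uses raw SQL against an in-memory SQLite database instead of an ORM, with hypothetical `authors` and `posts` tables; `load_eager` mimics what an ORM's eager loading typically does, fetching all needed authors in one `IN (...)` query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO authors VALUES (?, ?)", [(i, f"author {i}") for i in range(10)]
)
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(i, i % 10, f"post {i}") for i in range(100)],
)

query_count = 0


def run(sql: str, params=()):
    """Execute a query and count it, so the two approaches can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def load_n_plus_one():
    posts = run("SELECT id, author_id, title FROM posts ORDER BY id")  # 1 query
    result = []
    for _id, author_id, title in posts:
        # One extra query per post: this is the "+N" part.
        name = run("SELECT name FROM authors WHERE id = ?", (author_id,))[0][0]
        result.append((title, name))
    return result


def load_eager():
    posts = run("SELECT id, author_id, title FROM posts ORDER BY id")  # query 1
    ids = sorted({author_id for _id, author_id, _title in posts})
    placeholders = ",".join("?" * len(ids))
    authors = dict(
        run(f"SELECT id, name FROM authors WHERE id IN ({placeholders})", ids)
    )  # query 2: all needed authors at once
    return [(title, authors[author_id]) for _id, author_id, title in posts]


query_count = 0
slow = load_n_plus_one()
n_plus_one_queries = query_count

query_count = 0
fast = load_eager()
eager_queries = query_count

print(n_plus_one_queries, eager_queries)  # 101 2
```

Both functions return the same data; only the number of round trips to the database changes. A single JOIN would bring it down to one query.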
Caching Layers (Revisited)
Multiple levels of caching work together:
Browser cache (user's machine)
→ CDN cache (edge server)
→ Application cache (Redis)
→ Database cache (buffer or query cache, where the database has one)
→ Database (actual query)
Each layer reduces the load on the layer below it.
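The application-cache layer usually follows a "check the cache, fall through on a miss, then store the result" pattern (often called cache-aside). Below is a minimal sketch in which an in-memory `TTLCache` class stands in for Redis and `query_database` is a hypothetical slow lookup; the 60-second TTL is an arbitrary example value.

```python
import time


class TTLCache:
    """In-memory stand-in for an application cache such as Redis."""

    def __init__(self):
        self._data = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key, value, ttl_seconds: float):
        self._data[key] = (time.monotonic() + ttl_seconds, value)


cache = TTLCache()
db_calls = 0


def query_database(user_id: int) -> dict:
    """Hypothetical slow lookup; counts calls so cache hits are visible."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user {user_id}"}


def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached               # hit: the database is not touched
    user = query_database(user_id)  # miss: fall through to the layer below
    cache.set(key, user, ttl_seconds=60)
    return user


get_user(1)
get_user(1)
get_user(2)
print(db_calls)  # 2: the repeat lookup for user 1 came from the cache
```

The TTL bounds how stale a cached value can get, which is the trade-off every layer in the chain makes in exchange for shielding the layer below it.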
Performance Checklist
- Enable database slow query logs
- Add indexes on columns used in the WHERE/JOIN/ORDER BY clauses of slow queries
- Fix N+1 queries (eager load related data)
- Cache hot data in Redis (with appropriate TTL)
- Use a CDN for static assets
- Enable gzip/brotli compression
- Set proper HTTP cache headers
- Use read replicas if reads dominate
- Profile with APM tools, not guesswork
- Load test before launch (k6, Artillery, Locust)
Wrapping Up
- CDNs serve static content from the nearest edge location
- Read replicas handle read-heavy workloads
- Sharding splits data across databases for horizontal DB scaling
- Always measure before optimizing
- N+1 queries are one of the most common performance killers
- Layer your caches (browser, CDN, Redis, DB)
Day 20 of 95 | Backend Engineering Series