How We Built an Insurance API That Scales to 10,000 TPS
When we set out to build CoverKit, we knew we would need to handle serious scale: e-commerce checkouts complete in milliseconds, and our API has to keep up. Here is how we built the infrastructure that handles 10,000+ transactions per second.
The Challenge
Insurance APIs are not like typical CRUD applications. Every request involves:
- Complex risk calculations with actuarial models
- Real-time fraud detection
- Underwriting rule evaluation
- Multi-party coordination (carriers, reinsurers)
- Strict audit requirements
And it all needs to happen in under 200ms at the p95 level. No pressure.
Architecture Overview
We built CoverKit on Google Cloud Platform using a microservices architecture:
┌────────────────────────────────────────────────────────────┐
│                     Cloud CDN + Armor                      │
└─────────────────────────────┬──────────────────────────────┘
                              │
┌─────────────────────────────▼──────────────────────────────┐
│                  Cloud Endpoints Gateway                   │
│               (Auth, Rate Limiting, Routing)               │
└─────────────────────────────┬──────────────────────────────┘
                              │
          ┌───────────────────┼───────────────────┐
          │                   │                   │
  ┌───────▼────────┐  ┌───────▼────────┐  ┌───────▼────────┐
  │  Quote Engine  │  │  Policy Admin  │  │     Claims     │
  │   (Node.js)    │  │   (Node.js)    │  │   (Node.js)    │
  └───────┬────────┘  └───────┬────────┘  └───────┬────────┘
          │                   │                   │
          └───────────────────┼───────────────────┘
                              │
          ┌───────────────────┼───────────────────┐
          │                   │                   │
  ┌───────▼────────┐  ┌───────▼────────┐  ┌───────▼────────┐
  │    Redis HA    │  │   Cloud SQL    │  │ Cloud Storage  │
  │   (Caching)    │  │  (PostgreSQL)  │  │  (Documents)   │
  └────────────────┘  └────────────────┘  └────────────────┘
Key Decisions
1. Multi-Tier Caching
We implemented a three-tier caching strategy:
- L1 (In-Memory): Node.js LRU cache for hot data (<1ms)
- L2 (Redis): Distributed cache for shared data (<5ms)
- L3 (Computed): Real-time calculation on a cache miss (<200ms)
// Our caching strategy achieves an 85%+ cache hit rate
const cacheConfig = {
  l1: {
    maxSize: 10000,
    ttl: 60, // 1 minute
  },
  l2: {
    cluster: 'redis-ha',
    ttl: 3600, // 1 hour
    compression: true,
  },
};

// memoryCache is the in-process LRU (L1); redis is the shared client (L2).
async function getQuote(params: QuoteParams): Promise<Quote> {
  const key = computeKey(params);

  // L1 check: in-process cache, sub-millisecond
  const l1 = memoryCache.get(key);
  if (l1) return l1;

  // L2 check: Redis stores quotes as JSON, so parse before returning
  const l2 = await redis.get(key);
  if (l2) {
    const quote = JSON.parse(l2) as Quote;
    memoryCache.set(key, quote);
    return quote;
  }

  // Cache miss: compute, then populate both tiers
  const quote = await computeQuote(params);
  memoryCache.set(key, quote);
  await redis.setex(key, cacheConfig.l2.ttl, JSON.stringify(quote));
  return quote;
}
2. Horizontal Scaling with GKE Autopilot
We use GKE Autopilot to scale with demand. During peak periods (Black Friday, Cyber Monday), we go from 10 to 200+ pods without manual intervention.
Key configurations:
- CPU-based HPA with custom metrics for queue depth
- Pod Disruption Budgets for zero-downtime deployments (see the shutdown sketch after this list)
- Regional clusters for high availability
- Workload Identity for secure service-to-service auth
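Pod Disruption Budgets and rolling updates only stay zero-downtime if the service itself drains cleanly, so here is a minimal sketch of the Node.js side, assuming an Express app and the standard Kubernetes SIGTERM/readiness-probe lifecycle; the port, probe path, and timeout are illustrative rather than our production values.
// Sketch: graceful shutdown so rolling updates never drop in-flight requests.
import express from 'express';

const app = express();
let shuttingDown = false;

// Readiness probe: fail as soon as shutdown starts so the pod is removed
// from the Service endpoints before its connections are cut.
app.get('/readyz', (_req, res) => {
  res.status(shuttingDown ? 503 : 200).end();
});

const server = app.listen(8080);

// Kubernetes sends SIGTERM during a rolling update: stop accepting new
// connections and let in-flight requests finish before exiting.
process.on('SIGTERM', () => {
  shuttingDown = true;
  server.close(() => process.exit(0));
  // Hard stop before terminationGracePeriodSeconds expires.
  setTimeout(() => process.exit(1), 25_000).unref();
});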
3. Database Optimization
PostgreSQL is our source of truth, but we optimize heavily:
- Read replicas for reporting and analytics queries
- Connection pooling with PgBouncer (6,000+ connections); see the sketch after this list
- Partitioned tables for time-series data
- Careful index design (we have a dedicated DBA)
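To make the PgBouncer piece concrete, here is a minimal sketch of how an application pod can connect through PgBouncer rather than to Postgres directly, using the standard pg Pool; the host, database, table, and pool size are placeholders, not our production configuration.
import { Pool } from 'pg';

// Each pod keeps a small local pool and connects to PgBouncer, which
// multiplexes thousands of client connections onto far fewer server
// connections to Cloud SQL.
const pool = new Pool({
  host: 'pgbouncer.internal', // placeholder PgBouncer endpoint
  port: 6432,                 // PgBouncer's default port
  database: 'coverkit',       // placeholder database name
  max: 20,                    // per-pod connections to PgBouncer
  idleTimeoutMillis: 30_000,
});

// Placeholder query helper; parameterized queries go through PgBouncer
// exactly as they would against Postgres directly.
export async function getPolicy(policyId: string) {
  const { rows } = await pool.query('SELECT * FROM policies WHERE id = $1', [policyId]);
  return rows[0];
}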
4. Async Processing with Cloud Tasks
Not everything needs to be synchronous. We offload non-critical work (see the enqueue sketch after this list):
- Webhook delivery (with automatic retries)
- Document generation (policies, certificates)
- Analytics events
- Email notifications
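As a rough sketch of the enqueue side, assuming the standard @google-cloud/tasks client; the project, location, queue name, and handler URL below are placeholders.
import { CloudTasksClient } from '@google-cloud/tasks';

const tasks = new CloudTasksClient();

// Enqueue a webhook delivery instead of calling the partner endpoint inline.
// Cloud Tasks retries with backoff whenever the handler returns a non-2xx.
export async function enqueueWebhook(payload: object): Promise<void> {
  const parent = tasks.queuePath('my-project', 'us-central1', 'webhooks'); // placeholders
  await tasks.createTask({
    parent,
    task: {
      httpRequest: {
        httpMethod: 'POST',
        url: 'https://workers.example.com/webhooks/deliver', // placeholder handler
        headers: { 'Content-Type': 'application/json' },
        body: Buffer.from(JSON.stringify(payload)).toString('base64'),
      },
    },
  });
}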
Performance Results
Lessons Learned
- Cache invalidation is hard: We spent months getting this right. Event-driven invalidation with version keys was the solution (sketched after this list).
- Observability from day one: We built comprehensive tracing and metrics before scaling. This saved us countless hours.
- Graceful degradation matters: When Redis is slow, we fall back to computed values. Never fail the customer.
- Load test regularly: We run load tests weekly to catch regressions before they hit production.
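To illustrate the version-key approach from the first lesson: instead of deleting every cached quote when a rating input changes, a version counter is folded into the cache key, so stale entries are simply never read again and expire via their TTL. The sketch below is hypothetical (an ioredis-style client, made-up key names), with the version bump triggered by the relevant change event rather than on a schedule.
import Redis from 'ioredis';

const redis = new Redis();

// Bumping the version "invalidates" everything cached under the old version
// without scanning or deleting individual keys; stale entries age out via TTL.
export async function bumpRatesVersion(): Promise<void> {
  await redis.incr('rates:version');
}

// The current version is folded into every quote cache key.
export async function quoteCacheKey(productId: string, paramsHash: string): Promise<string> {
  const version = (await redis.get('rates:version')) ?? '0';
  return `quote:v${version}:${productId}:${paramsHash}`;
}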
What Is Next
We are working on edge computing to bring quote generation even closer to customers. Stay tuned for our next engineering deep dive.
Interested in joining our engineering team? Check our open positions.