How We Scaled 1 Million WebSocket Connections: Real-World Engineering Insights
Picture this: It’s 3 AM, and I’m staring at our monitoring dashboard as our WebSocket connections graph climbs steadily upward. Our real-time chat application had just gone viral in Southeast Asia, and we were about to hit numbers we’d only dreamed about during our initial architecture planning sessions. That night taught me more about WebSocket scaling than any documentation ever could.
The Journey to Our First Million
When we started building our chat platform in late 2024, we had modest expectations – maybe 10,000 concurrent connections tops. But as they say, life comes at you fast. Within six months, we were looking at 100,000 concurrent users, and the numbers kept climbing. Here’s how we evolved our architecture to handle that growth, and the lessons we learned along the way.
Starting Simple: The Single Server Dream
Like many developers, we started with a single Node.js server handling WebSocket connections. Here’s what our initial setup looked like:
const WebSocket = require('ws');

const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', function connection(ws) {
  ws.on('message', function incoming(message) {
    // Simple broadcast to all clients
    wss.clients.forEach(function each(client) {
      if (client.readyState === WebSocket.OPEN) {
        client.send(message);
      }
    });
  });
});
This worked great… until it didn’t. Around 10,000 connections, we started seeing memory issues and significant latency. Time for our first major architecture shift.
The Redis Pub/Sub Evolution
Our first scaling step was implementing Redis pub/sub to handle message distribution across multiple Node.js instances. This allowed us to scale horizontally while maintaining real-time message delivery.
const Redis = require('ioredis');

// A Redis connection in subscriber mode can't issue other commands,
// so we keep separate connections for publishing and subscribing
const subscriber = new Redis();
const publisher = new Redis();

wss.on('connection', function connection(ws) {
  ws.on('message', function incoming(message) {
    // Publish to Redis instead of broadcasting directly
    publisher.publish('chat_messages', message);
  });
});

subscriber.subscribe('chat_messages', (err, count) => {
  if (err) console.error(err);
});

// Every server instance receives the message and relays it to its own clients
subscriber.on('message', (channel, message) => {
  wss.clients.forEach(function each(client) {
    if (client.readyState === WebSocket.OPEN) {
      client.send(message);
    }
  });
});
Load Balancing and Connection Distribution
As we approached 250,000 connections, we needed to implement proper load balancing. We chose HAProxy with sticky sessions based on IP hash, ensuring clients consistently connected to the same WebSocket server.
frontend ws_frontend
    bind *:80
    mode http
    option forwardfor
    # Pin each client IP to the same backend server
    stick-table type ip size 1m expire 1h
    stick on src
    default_backend ws_backend

backend ws_backend
    mode http
    balance source
    hash-type consistent
    # Note: production configs also need a generous 'timeout tunnel'
    # so long-lived WebSocket connections aren't cut off mid-session
    server ws1 10.0.0.1:8080 check
    server ws2 10.0.0.2:8080 check
    server ws3 10.0.0.3:8080 check
The Architecture That Got Us to 1 Million
graph TD
    A[HAProxy] --> B1[WebSocket Server 1]
    A --> B2[WebSocket Server 2]
    A --> B3[WebSocket Server 3]
    B1 --> C[Redis Cluster]
    B2 --> C
    B3 --> C
    C --> D[Message Queue]
    D --> E[Processing Workers]
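The queue stage deserves a quick word: anything that doesn't have to happen on the hot path — persistence, push notifications — gets handed off to background workers. Here's a minimal sketch of that hand-off, assuming a Redis list as the queue; the chat_history_queue key and the persistMessage helper are illustrative names, not our production setup:

const Redis = require('ioredis');

// Producer side (runs on the WebSocket servers): enqueue work
// instead of doing it inline on the hot path
const queue = new Redis();
function enqueuePersist(message) {
  return queue.lpush('chat_history_queue', message);
}

// Worker side: block until a job is available, then process it
const worker = new Redis();

async function persistMessage(message) {
  // Placeholder: write to your datastore of choice
  console.log('persisting', message.toString());
}

async function runWorker() {
  for (;;) {
    // brpop resolves to [key, element]
    const [, message] = await worker.brpop('chat_history_queue', 0);
    await persistMessage(message);
  }
}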
Critical Optimizations That Made It Possible
- Implemented connection pooling to manage database connections efficiently
- Added message batching for bulk operations
- Introduced message compression using protocol buffers
- Implemented heartbeat mechanisms to manage connection lifecycle (sketched after this list)
- Created automatic scaling based on connection metrics
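The heartbeat in particular paid for itself: clients that drop off a network without sending a TCP FIN would otherwise linger as dead connections and inflate memory. Here's a minimal sketch using the ws library's ping/pong frames; the 30-second interval is an illustrative choice, not our exact production value:

wss.on('connection', function connection(ws) {
  ws.isAlive = true;
  ws.on('pong', () => { ws.isAlive = true; });
});

// Periodically ping every client; terminate any that missed the last pong
const heartbeat = setInterval(() => {
  wss.clients.forEach((ws) => {
    if (ws.isAlive === false) return ws.terminate();
    ws.isAlive = false;
    ws.ping();
  });
}, 30000);

wss.on('close', () => clearInterval(heartbeat));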
Monitoring and Debugging at Scale
One of our biggest challenges was maintaining visibility into the system’s health. We implemented a comprehensive monitoring solution using Prometheus and Grafana, focusing on key metrics:
const prometheus = require('prom-client');

const metrics = {
  activeConnections: new prometheus.Gauge({
    name: 'ws_active_connections',
    help: 'Number of active WebSocket connections'
  }),
  messageLatency: new prometheus.Histogram({
    name: 'ws_message_latency',
    help: 'Message processing latency in milliseconds'
  })
};

wss.on('connection', function connection(ws) {
  metrics.activeConnections.inc();
  ws.on('close', () => metrics.activeConnections.dec());
  // In the message handler we call metrics.messageLatency.observe(...)
  // with the measured processing time
});
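Prometheus also needs an HTTP endpoint to scrape. A minimal sketch using prom-client's default registry — the port is an arbitrary choice for illustration:

const http = require('http');

// Expose all registered metrics for Prometheus to scrape
http.createServer(async (req, res) => {
  if (req.url === '/metrics') {
    res.setHeader('Content-Type', prometheus.register.contentType);
    res.end(await prometheus.register.metrics());
  } else {
    res.statusCode = 404;
    res.end();
  }
}).listen(9091);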
Lessons Learned and Future Considerations
Looking back, here are the most valuable insights from our scaling journey:
- Start with simple architecture but design for scale from day one
- Monitor everything – you can’t optimize what you can’t measure
- Connection management is more critical than raw performance
- Network optimization is just as important as application optimization
As we look toward supporting 2 million connections in 2026, we’re already planning our next evolution. We’re exploring WebAssembly for more efficient message processing and experimenting with custom protocol optimizations.
Have you faced similar scaling challenges with WebSockets? I’d love to hear about your experiences and solutions in the comments below. After all, the best part of our developer community is learning from each other’s real-world battles.