Ahmed Rizawan

How We Scaled 1 Million WebSocket Connections: Real-World Engineering Insights

Picture this: It’s 3 AM, and I’m staring at our monitoring dashboard as our WebSocket connections graph climbs steadily upward. Our real-time chat application had just gone viral in Southeast Asia, and we were about to hit numbers we’d only dreamed about during our initial architecture planning sessions. That night taught me more about WebSocket scaling than any documentation ever could.


The Journey to Our First Million

When we started building our chat platform in late 2024, we had modest expectations – maybe 10,000 concurrent connections tops. But as they say, life comes at you fast. Within six months, we were looking at 100,000 concurrent users, and the numbers kept climbing. Here’s how we evolved our architecture to handle that growth, and the lessons we learned along the way.

Starting Simple: The Single Server Dream

Like many developers, we started with a single Node.js server handling WebSocket connections. Here’s what our initial setup looked like:


const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', function connection(ws) {
  ws.on('message', function incoming(message) {
    // Simple broadcast to all clients
    wss.clients.forEach(function each(client) {
      if (client.readyState === WebSocket.OPEN) {
        client.send(message);
      }
    });
  });
});

This worked great… until it didn’t. Around 10,000 connections we started seeing memory pressure and significant latency: every socket holds buffers inside a single Node.js process, and the broadcast loop performs O(N) sends per message on one event loop. Time for our first major architecture shift.

The Redis Pub/Sub Evolution

Our first scaling step was implementing Redis pub/sub to handle message distribution across multiple Node.js instances. This allowed us to scale horizontally while maintaining real-time message delivery.


const Redis = require('ioredis');
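// A Redis connection in subscriber mode can't issue regular commands,
// so publishing and subscribing each get their own connection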
const subscriber = new Redis();
const publisher = new Redis();

wss.on('connection', function connection(ws) {
  ws.on('message', function incoming(message) {
    // Publish to Redis instead of direct broadcast
    publisher.publish('chat_messages', message);
  });
});

subscriber.subscribe('chat_messages', (err, count) => {
  if (err) console.error(err);
});

subscriber.on('message', (channel, message) => {
  wss.clients.forEach(function each(client) {
    if (client.readyState === WebSocket.OPEN) {
      client.send(message);
    }
  });
});
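
The subtlety that makes this work: every instance subscribed to chat_messages receives every message, including the instance that published it, and each one delivers only to its own locally connected clients. No server needs to know where anyone else’s clients live.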

Load Balancing and Connection Distribution

As we approached 250,000 connections, we needed to implement proper load balancing. We chose HAProxy with sticky sessions based on IP hash, ensuring clients consistently connected to the same WebSocket server.


frontend ws_frontend
    bind *:80
    mode http
    option forwardfor
    default_backend ws_backend

backend ws_backend
    mode http
    balance source
    hash-type consistent
    # stick-table/stick on are backend keywords; pinning by source IP
    # keeps a client on the same WebSocket server across reconnects
    stick-table type ip size 1m expire 1h
    stick on src
    # long-lived WebSocket connections need a generous tunnel timeout
    timeout tunnel 1h
    server ws1 10.0.0.1:8080 check
    server ws2 10.0.0.2:8080 check
    server ws3 10.0.0.3:8080 check
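
Two lines in that config matter more than they look: timeout tunnel stops HAProxy from silently killing idle long-lived connections, and hash-type consistent means adding or removing a server remaps only a fraction of clients instead of reshuffling everyone.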

The Architecture That Got Us to 1 Million
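
By this point, the moving parts fit together like this: HAProxy fans connections across a pool of WebSocket servers, the servers coordinate through a Redis cluster, and heavier work drains into a message queue consumed by processing workers.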


graph TD
    A[HAProxy] --> B1[WebSocket Server 1]
    A --> B2[WebSocket Server 2]
    A --> B3[WebSocket Server 3]
    B1 --> C[Redis Cluster]
    B2 --> C
    B3 --> C
    C --> D[Message Queue]
    D --> E[Processing Workers]
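
The queue and workers exist to keep the WebSocket tier thin: anything expensive, like persistence or analytics, drains off the hot path so a slow job never blocks connection handling.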

Critical Optimizations That Made It Possible

  • Implemented connection pooling to manage database connections efficiently
  • Added message batching for bulk operations
  • Introduced compact binary message encoding with Protocol Buffers, shrinking payloads versus JSON
  • Implemented heartbeat mechanisms to detect and reap dead connections (see the sketch after this list)
  • Created automatic scaling based on connection metrics
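
To make the heartbeat item concrete, here’s a minimal sketch of the standard ws ping/pong pattern (the 30-second interval is illustrative, tune it to your reconnect budget):


// Mark a socket alive whenever it answers our ping
function heartbeat() {
  this.isAlive = true;
}

wss.on('connection', function connection(ws) {
  ws.isAlive = true;
  ws.on('pong', heartbeat);
});

// Periodically terminate sockets that never answered the last ping
const interval = setInterval(() => {
  wss.clients.forEach((ws) => {
    if (ws.isAlive === false) return ws.terminate();
    ws.isAlive = false;
    ws.ping();
  });
}, 30000);

wss.on('close', () => clearInterval(interval));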

Monitoring and Debugging at Scale

One of our biggest challenges was maintaining visibility into the system’s health. We implemented a comprehensive monitoring solution using Prometheus and Grafana, focusing on key metrics:


const prometheus = require('prom-client');

const metrics = {
  activeConnections: new prometheus.Gauge({
    name: 'ws_active_connections',
    help: 'Number of active WebSocket connections'
  }),
  messageLatency: new prometheus.Histogram({
    name: 'ws_message_latency',
    help: 'Message processing latency in milliseconds'
  })
};

wss.on('connection', function connection(ws) {
  metrics.activeConnections.inc();
  ws.on('close', () => metrics.activeConnections.dec());

  // Time each message through the pipeline; handleMessage stands in
  // for whatever your app actually does with the payload
  ws.on('message', (message) => {
    const start = Date.now();
    handleMessage(message);
    metrics.messageLatency.observe(Date.now() - start);
  });
});
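
Prometheus pulls metrics rather than receiving them, so each server also exposes its registry over HTTP. Here’s a minimal sketch reusing the prometheus handle from the snippet above; the port and path are illustrative:


const http = require('http');

// Serve the default prom-client registry so Prometheus can scrape it
http.createServer(async (req, res) => {
  if (req.url === '/metrics') {
    res.setHeader('Content-Type', prometheus.register.contentType);
    res.end(await prometheus.register.metrics());
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(9100);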

Lessons Learned and Future Considerations

Looking back, here are the most valuable insights from our scaling journey:

  • Start with simple architecture but design for scale from day one
  • Monitor everything – you can’t optimize what you can’t measure
  • Connection management is more critical than raw performance
  • Network optimization is just as important as application optimization

As we look toward supporting 2 million connections in 2026, we’re already planning our next evolution. We’re exploring WebAssembly for more efficient message processing and experimenting with custom protocol optimizations.

Have you faced similar scaling challenges with WebSockets? I’d love to hear about your experiences and solutions in the comments below. After all, the best part of our developer community is learning from each other’s real-world battles.