"Session ID unknown" after handshake on high server load [Socket.io 1.0.6]
Problem
I am running a multi-node server (16 workers running Socket.io 1.0.6, accessed via Nginx configured as a reverse proxy with sticky sessions) for ~5k users. While the server load is low (2~3 on a 20-core server, 2k users), everyone can connect instantly. When the load gets higher (5~6, 5k users), new users cannot connect and receive data instantly. In this case, it takes 2~4 handshakes for a user to connect successfully.

This is what happens under high load:

- User opens the website; receives HTML and JS
- User's browser attempts to initialize a socket.io connection to the server (`io.connect(...)`)
- A handshake request is sent to the server; the server responds with a SID and other information (`{"sid":"f-re6ABU3Si4pmyWADCx","upgrades":["websocket"],"pingInterval":25000,"pingTimeout":60000}`)
- The client initiates a polling request, including this SID: `GET .../socket.io/?EIO=2&transport=polling&t=1408648886249-1&sid=f-re6ABU3Si4pmyWADCx`
- Instead of sending data, the server responds with 400 Bad Request: `{"code":1,"message":"Session ID unknown"}`
- The client performs a new handshake (`GET .../socket.io/?EIO=2&transport=polling&t=1408648888050-3`, notice the previously received SID is omitted)
- The server responds with new connection data, including a new SID: (`{"sid":"DdRxn2gv6vrtZOBiAEAS","upgrades":["websocket"],"pingInterval":25000,"pingTimeout":60000}`)
- The client performs a new polling request, including the new SID
1 Fix
Optimize Socket.io Session Management Under High Load
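The 'Session ID unknown' error means that the worker handling a request has no record of the SID the request carries. Each Socket.io worker tracks its sessions in its own memory, so this typically happens when a polling request reaches a different worker than the one that performed the handshake, or when the session has already been closed. Why this shows up only under heavy load is not established here; one plausible factor is the proxy routing a request to another worker when the original one is slow to respond.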
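To make the failure mode concrete, here is a toy model rather than real Socket.io code. It assumes each worker holds its sessions in a private in-memory table. `ToyWorker`, `roundRobinRouter`, `stickyRouter`, and the client address are hypothetical names invented for this sketch.

```javascript
// Toy model: every worker has a private session table. Nothing here talks
// to a real Socket.io server; it only mimics the request routing.
class ToyWorker {
  constructor(id) {
    this.id = id;
    this.sessions = new Set();
  }
  handshake() {
    const sid = `sid-${this.id}-${this.sessions.size}`;
    this.sessions.add(sid);
    return { status: 200, sid };
  }
  poll(sid) {
    return this.sessions.has(sid)
      ? { status: 200 }
      : { status: 400, body: { code: 1, message: 'Session ID unknown' } };
  }
}

function makeCluster(n) {
  return Array.from({ length: n }, (_, i) => new ToyWorker(i));
}

// Round-robin: consecutive requests land on consecutive workers.
function roundRobinRouter(workers) {
  let next = 0;
  return () => workers[next++ % workers.length];
}

// Sticky: the same client key always maps to the same worker.
function stickyRouter(workers) {
  return (clientKey) => {
    let hash = 0;
    for (const ch of clientKey) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
    return workers[hash % workers.length];
  };
}

// One handshake followed by one polling request, as in the question.
function connect(route, clientKey) {
  const { sid } = route(clientKey).handshake();
  return route(clientKey).poll(sid);
}

const rr = connect(roundRobinRouter(makeCluster(16)), '203.0.113.7');
const sticky = connect(stickyRouter(makeCluster(16)), '203.0.113.7');
console.log(rr.status, sticky.status); // 400 200
```

In this model the error appears whenever the handshake and the follow-up poll are served by different workers, so anything that breaks stickiness under load is worth checking first.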
1. Increase Socket.io Connection Timeout
Raise `pingTimeout` so that the server waits longer for a client heartbeat before closing a connection under high load. This setting governs established connections, so it may not help with a SID that is rejected right after the handshake.
```javascript
// `server` is your existing http.Server instance.
const io = require('socket.io')(server, { pingTimeout: 120000 });
```
2. Use the Redis Adapter Alongside Sticky Sessions
Use the Redis adapter (`socket.io-redis`) so that events emitted on one worker reach clients connected to the others. Note that the adapter relays broadcasts between workers; it does not share handshake session state, so it does not remove 'Session ID unknown' errors on its own. Sticky sessions at the proxy are still required.
```javascript
const redis = require('socket.io-redis');
io.adapter(redis({ host: 'localhost', port: 6379 }));
```
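As far as I understand the adapter, it relays events between workers rather than replicating each worker's session table. The sketch below illustrates that distinction with a toy model: an in-process `EventEmitter` stands in for Redis pub/sub, and `BusWorker` is a hypothetical class, not part of Socket.io or `socket.io-redis`.

```javascript
const { EventEmitter } = require('events');

// Stand-in for Redis pub/sub (assumption: a real deployment would use
// socket.io-redis, not an in-process emitter).
const bus = new EventEmitter();

class BusWorker {
  constructor(id) {
    this.id = id;
    this.sessions = new Map(); // sid -> messages delivered to that client
    // Every worker listens on the shared bus and fans out to its own clients.
    bus.on('broadcast', (msg) => {
      for (const inbox of this.sessions.values()) inbox.push(msg);
    });
  }
  handshake(sid) {
    this.sessions.set(sid, []);
  }
  knows(sid) {
    return this.sessions.has(sid);
  }
  broadcast(msg) {
    bus.emit('broadcast', msg);
  }
}

const a = new BusWorker('a');
const b = new BusWorker('b');
a.handshake('sid-on-a');
b.handshake('sid-on-b');

a.broadcast('hello');
console.log(b.sessions.get('sid-on-b')); // [ 'hello' ]
console.log(b.knows('sid-on-a'));        // false
```

The broadcast crosses workers, but worker `b` still has no record of the session created on worker `a`, which is why a poll for that SID would be rejected there.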
3. Optimize Nginx Configuration for WebSocket
Ensure that your Nginx configuration is optimized for WebSocket connections. This includes setting appropriate timeouts and buffer sizes to handle high traffic efficiently.
```nginx
location /socket.io/ {
    proxy_pass http://your_socket_io_backend;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_read_timeout 600s;
}
```
4. Monitor and Scale Resources
Implement monitoring for your server's resource usage (CPU, memory, etc.) and scale your server resources or worker instances based on traffic patterns to ensure optimal performance under high load.
Use tools like New Relic or Prometheus to monitor performance metrics.
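Before reaching for a full monitoring stack, it can help to put the load figures from the question in proportion. The sketch below uses only Node's built-in `os` module; `loadPerCore` is a hypothetical helper, and the "near 1 means saturated" reading is a rule of thumb, not a guarantee.

```javascript
const os = require('os');

// Load average divided by core count. Values near or above 1 suggest the
// machine as a whole is saturated (rule of thumb only).
function loadPerCore(loadAvg, cores) {
  if (cores <= 0) throw new RangeError('cores must be positive');
  return loadAvg / cores;
}

// Figures from the question: load 2~3 and 5~6 on a 20-core server.
console.log(loadPerCore(3, 20)); // 0.15
console.log(loadPerCore(6, 20)); // 0.3

// Live reading for the current machine (1-minute average; 0 on Windows).
const [oneMinute] = os.loadavg();
const cores = Math.max(os.cpus().length, 1);
console.log(loadPerCore(oneMinute, cores));
```

A machine-wide value of 0.3 does not look saturated, but each Node.js worker runs its JavaScript on a single thread, so an average can hide individual workers that are pegged. Per-process CPU figures are therefore likely more informative here than the system load average.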
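5. Review Socket.io Version
Consider upgrading to a more recent version of Socket.io that may have performance improvements and bug fixes related to session management under load. Upgrade the client library at the same time, since client and server versions need to be compatible.
```bash
npm install socket.io@latest
```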
Validation
After implementing the fixes, conduct load testing to simulate high traffic conditions. Monitor for 'Session ID unknown' errors and ensure that new users can connect successfully within the expected time frame. Use logging to verify that session IDs are being recognized correctly.
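One way to quantify the result of a load test is to count how many captured responses carry the error body shown in the question. This is a minimal sketch: `summarize` and the `{ status, body }` record shape are assumptions made for illustration, and a real capture would come from your proxy or application logs.

```javascript
// Counts successful responses and 'Session ID unknown' rejections
// (HTTP 400 with error code 1, as shown in the question).
function summarize(responses) {
  let ok = 0;
  let unknownSid = 0;
  for (const { status, body } of responses) {
    if (status === 200) {
      ok++;
      continue;
    }
    try {
      const parsed = JSON.parse(body);
      if (status === 400 && parsed.code === 1) unknownSid++;
    } catch (err) {
      // Non-JSON error body: counted in total only.
    }
  }
  return { ok, unknownSid, total: responses.length };
}

// The four-request sequence described in the question.
const sample = [
  { status: 200, body: '{"sid":"f-re6ABU3Si4pmyWADCx","upgrades":["websocket"],"pingInterval":25000,"pingTimeout":60000}' },
  { status: 400, body: '{"code":1,"message":"Session ID unknown"}' },
  { status: 200, body: '{"sid":"DdRxn2gv6vrtZOBiAEAS","upgrades":["websocket"],"pingInterval":25000,"pingTimeout":60000}' },
  { status: 200, body: 'ok' },
];

const summary = summarize(sample);
console.log(summary); // { ok: 3, unknownSid: 1, total: 4 }
```

Comparing `unknownSid` before and after each change, at the same simulated load, shows whether a given step made a difference.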
Environment
Submitted by
Alex Chen
2450 rep