ClusterAllFailedError on version 4.24.1
Problem
Hey! We're using ioredis with our AWS ElastiCache cluster running 3 shards on version 5.0.5. We're making the redis calls through a lambda with quite high traffic, meaning multiple concurrent lambdas running. I've done some version bumping and I'm seeing the error `ClusterAllFailedError: Failed to refresh slots cache` intermittently. Through some debugging I've narrowed version 4.24.1 to be the culprit - any version before that works fine. When setting `DEBUG=ioredis:*` in the lambda env the `ClusterAllFailedError: Failed to refresh slots cache` is in most cases followed by these logs: [code block] When looking at the 4.24.1 commit https://github.com/luin/ioredis/commit/8524eeaedaa2542f119f2b65ab8e2f15644b474e I can tell that code related to this error has been touched - could this fix have introduced unintended issues? Any pointers would be appreciated :+1:
Error Output
error `ClusterAllFailedError: Failed to refresh slots cache` intermittently. Through some debugging I've narrowed version 4.24.1 to be the culprit - any version before that works fine. When setting `DEBUG=i
Unverified for your environment
Select your OS to check compatibility.
1 Fix
Solution: ClusterAllFailedError on version 4.24.1
Hi @sundeqvist, thanks for raising this issue! Before 4.24.1, ioredis asked cluster nodes for cluster slot information when connecting and periodically after connected. If all cluster nodes failed to provide the information (ex all nodes were down), ioredis would raise the "Failed to refresh slots cache" error and reconnect to the cluster (and print debug log `Reset with [] `) if it hadn't connec
Trust Score
2 verifications
- 1
Hi @sundeqvist, thanks for raising this issue!
Hi @sundeqvist, thanks for raising this issue!
- 2
Before 4.24.1, ioredis asked cluster nodes for cluster slot information when con
Before 4.24.1, ioredis asked cluster nodes for cluster slot information when connecting and periodically after connected. If all cluster nodes failed to provide the information (ex all nodes were down), ioredis would raise the "Failed to refresh slots cache" error and reconnect to the cluster (and print debug log `Reset with [] `) if it hadn't connected, otherwise (when running periodically) it would just ignore.
- 3
However, after 4.24.1, ioredis will raise and reconnect to the cluster even the
However, after 4.24.1, ioredis will raise and reconnect to the cluster even the cluster has already connected. This change is introduced to make failover detection faster.
- 4
For your case, I'd suggest listen to the "node error" event (`cluster.on('node e
For your case, I'd suggest listen to the "node error" event (`cluster.on('node error', err => console.error(err))`) and see what errors cause the issue. This event will be emitted every time a cluster node fails to provide slot information.
Validation
Resolved in redis/ioredis GitHub issue #1330. Community reactions: 1 upvotes.
Verification Summary
Sign in to verify this fix
Environment
Submitted by
Alex Chen
2450 rep