1. Check the error messages:
1.1 Error message (1)
- 127.0.0.1:7000> get name
- (error) CLUSTERDOWN The cluster is down
- 127.0.0.1:7000> cluster info
- cluster_state:fail
- cluster_slots_assigned:16380
- cluster_slots_ok:16380
- cluster_slots_pfail:0
- cluster_slots_fail:0
- cluster_known_nodes:6
- cluster_size:3
- cluster_current_epoch:8
- cluster_my_epoch:1
- cluster_stats_messages_sent:1007
- cluster_stats_messages_received:1005
1.2 Error message (2)
- 127.0.0.1:7000> cluster slots
- 1) 1) (integer) 0
- 2) (integer) 5460
- 3) 1) "127.0.0.1"
- 2) (integer) 7000
- 3) "09d1ad8d8aa8acd0ca2b95206c58901e47318ec9"
- 4) 1) "127.0.0.1"
- 2) (integer) 7004
- 3) "3b6910a3cf76025564d9f744f64ffa3a3b35fbc8"
- 2) 1) (integer) 10923
- 2) (integer) 11991
- 3) 1) "127.0.0.1"
- 2) (integer) 7002
- 3) "9b7110678b6eef4ae80c330eb0cfb51ffbc216ea"
- 4) 1) "127.0.0.1"
- 2) (integer) 7003
- 3) "7d0da6cebfa834177a189b9f71b048e8aeb29c49"
- 3) 1) (integer) 11993
- 2) (integer) 12381
- 3) 1) "127.0.0.1"
- 2) (integer) 7002
- 3) "9b7110678b6eef4ae80c330eb0cfb51ffbc216ea"
- 4) 1) "127.0.0.1"
- 2) (integer) 7003
- 3) "7d0da6cebfa834177a189b9f71b048e8aeb29c49"
- 4) 1) (integer) 12383
- 2) (integer) 14040
- 3) 1) "127.0.0.1"
- 2) (integer) 7002
- 3) "9b7110678b6eef4ae80c330eb0cfb51ffbc216ea"
- 4) 1) "127.0.0.1"
- 2) (integer) 7003
- 3) "7d0da6cebfa834177a189b9f71b048e8aeb29c49"
- 5) 1) (integer) 14042
- 2) (integer) 14385
- 3) 1) "127.0.0.1"
- 2) (integer) 7002
- 3) "9b7110678b6eef4ae80c330eb0cfb51ffbc216ea"
- 4) 1) "127.0.0.1"
- 2) (integer) 7003
- 3) "7d0da6cebfa834177a189b9f71b048e8aeb29c49"
- 6) 1) (integer) 14387
- 2) (integer) 16383
- 3) 1) "127.0.0.1"
- 2) (integer) 7002
- 3) "9b7110678b6eef4ae80c330eb0cfb51ffbc216ea"
- 4) 1) "127.0.0.1"
- 2) (integer) 7003
- 3) "7d0da6cebfa834177a189b9f71b048e8aeb29c49"
- 7) 1) (integer) 5461
- 2) (integer) 10922
- 3) 1) "127.0.0.1"
- 2) (integer) 7001
- 3) "caa158fcb538991c73438ca9801ab6ab2510e85a"
- 4) 1) "127.0.0.1"
- 2) (integer) 7005
- 3) "0fbf5cbddefe6ad2324a225d25447ff80b033b27"
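2. Analyze the error messages
2.1 In error message (1), cluster_slots_assigned:16380 shows that 4 slots are missing. The cluster only comes up when all 16384 slots (numbered 0 to 16383) are assigned.
2.2 Summarize error message (2) as slot range followed by the ports of the nodes serving it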
- 1) 0-5460: 7000, 7004
- 2) 10923-11991: 7002, 7003
- 3) 11993-12381: 7002, 7003
- 4) 12383-14040: 7002, 7003
- 5) 14042-14385: 7002, 7003
- 6) 14387-16383: 7002, 7003
- 7) 5461-10922: 7001, 7005
The missing slots are 11992, 12382, 14041, and 14386.
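The gap hunt above can also be done mechanically. The following is a minimal sketch, not part of the original procedure: the function name missing_slots is made up for illustration, and it assumes the ranges have already been copied by hand into "start-end" lines as in the summary in 2.2.

```shell
#!/usr/bin/env bash
# Print every slot in 0..16383 that is not covered by the "start-end"
# ranges read from standard input (one range per line, in any order).
# missing_slots is a hypothetical helper name, not a Redis tool.
missing_slots() {
  sort -t- -k1,1n | awk -F- '
    BEGIN { next_slot = 0 }
    {
      # Any slot between the end of the previous range and the start
      # of this one is unassigned.
      for (s = next_slot; s < $1; s++) print s
      if ($2 + 1 > next_slot) next_slot = $2 + 1
    }
    END { for (s = next_slot; s <= 16383; s++) print s }
  '
}

# The seven ranges reported by "cluster slots" in error message (2).
missing_slots <<'EOF'
0-5460
10923-11991
11993-12381
12383-14040
14042-14385
14387-16383
5461-10922
EOF
```

With the ranges from this article, the output should be the same four slot numbers found by hand.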
3. Solution 1:
3.1 Assign the missing slot(s) to the current node
cluster addslots 11992 12382 14041 14386
3.2 Result:
- 127.0.0.1:7000> cluster addslots 11992 12382 14041 14386
- OK
- 127.0.0.1:7000> cluster info
- cluster_state:ok
- cluster_slots_assigned:16384
- cluster_slots_ok:16384
- cluster_slots_pfail:0
- cluster_slots_fail:0
- cluster_known_nodes:6
- cluster_size:3
- cluster_current_epoch:8
- cluster_my_epoch:1
- cluster_stats_messages_sent:42312
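Instead of reading the cluster info output by eye, the two fields that matter here can be checked in a script. This is only a sketch under stated assumptions: the function name cluster_is_healthy is made up for illustration, and the input is assumed to be the raw output of redis-cli piped in on standard input (which may end lines with a carriage return).

```shell
#!/usr/bin/env bash
# Read "cluster info" output on stdin and succeed only when
# cluster_state is ok and all 16384 slots are assigned.
# cluster_is_healthy is a hypothetical helper name.
cluster_is_healthy() {
  tr -d '\r' | awk -F: '
    $1 == "cluster_state"          { state = $2 }
    $1 == "cluster_slots_assigned" { assigned = $2 + 0 }
    END { exit !(state == "ok" && assigned == 16384) }
  '
}

# Example use (assumes a node is listening on 127.0.0.1:7000):
#   redis-cli -h 127.0.0.1 -p 7000 cluster info | cluster_is_healthy \
#     && echo "cluster healthy" || echo "cluster still broken"
```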
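4. Solution 2:
Write a shell script that runs cluster addslots once for every slot from 0 to 16383. For a slot that is already assigned, Redis replies that the slot is already busy, so adding it again does no harm. Afterwards, the cluster state shows as normal.
The script is as follows:
- # Generator script; run it with bash, since for ((...)) is bash syntax
- > /home/sw/2
- for ((i=0;i<=16383;i++));
- do
- echo "redis-cli -h 192.168.5.115 -p 7004 cluster addslots ${i}" >> /home/sw/2
- done
# Each generated line has the form: redis-cli -h 192.168.5.115 -p 7004 cluster addslots N, with N from 0 to 16383
Then, from /home/sw, run the generated script 2 and save its output to /home/sw/3: sh 2 | tee /home/sw/3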
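The log in /home/sw/3 will have one reply per slot, so it is easier to summarize than to read. The sketch below is an illustration only: the function name summarize_addslots_log is made up, and it assumes that successful replies start with OK and that replies for already-assigned slots contain the text "already busy"; anything else is counted as an unexpected reply worth inspecting.

```shell
#!/usr/bin/env bash
# Summarize an addslots log read from stdin: how many slots were newly
# assigned, how many were already busy, and how many replies were
# something else. summarize_addslots_log is a hypothetical helper name.
summarize_addslots_log() {
  awk '
    /^OK/          { ok++; next }
    /already busy/ { busy++; next }
    NF             { other++ }      # any other non-empty line
    END { printf "assigned=%d busy=%d other=%d\n", ok, busy, other }
  '
}

# Example use:
#   summarize_addslots_log < /home/sw/3
```

For the cluster in this article, one would expect the assigned count to equal the number of missing slots and the other count to be zero.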
Reference: https://blog.csdn.net/zsx18273117003/article/details/83414440