Java自学者论坛 (Java Self-Learner Forum)


Handling an interrupted Redis master-slave replication

Posted on 2021-7-14 15:57:26

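A production alert reported that master-slave replication was interrupted. Checking the replication info on the production slave: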

    # Replication
    role:slave
    master_host:master_host
    master_port:6379
    master_link_status:down
    master_last_io_seconds_ago:-1
    master_sync_in_progress:1
    slave_repl_offset:1
    master_sync_left_bytes:713983940
    master_sync_last_io_seconds_ago:0
    master_link_down_since_seconds:248
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_repl_offset:0
    repl_backlog_active:0
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:0
    repl_backlog_histlen:0
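A check like the one above can be automated. The following is a minimal sketch, assuming the `INFO replication` text has already been fetched (for example with `redis-cli`); the helper names `parse_info` and `describe_link` are hypothetical, and the sample only repeats fields from the output shown above.

```python
# Sample copied from the INFO replication output in this post (abridged).
SAMPLE_INFO = """\
# Replication
role:slave
master_host:master_host
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:1
master_sync_left_bytes:713983940
master_link_down_since_seconds:248
"""


def parse_info(text):
    """Turn `key:value` lines into a dict, skipping blanks and `#` section headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def describe_link(fields):
    """Summarise the replication link state of a slave (hypothetical helper)."""
    if fields.get("role") != "slave":
        return "not a slave"
    if fields.get("master_link_status") == "up":
        return "link up"
    parts = ["link down"]
    if "master_link_down_since_seconds" in fields:
        parts.append("for %ss" % fields["master_link_down_since_seconds"])
    if fields.get("master_sync_in_progress") == "1":
        parts.append("full sync in progress")
    if "master_sync_left_bytes" in fields:
        parts.append("%s bytes left" % fields["master_sync_left_bytes"])
    return ", ".join(parts)


info = parse_info(SAMPLE_INFO)
print(describe_link(info))
```

A link that is down while `master_sync_in_progress` is 1 means the slave is in the middle of a full resync, which is the situation the rest of this post investigates.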

The link status is down, so replication has failed. Next, check the relevant log on the master:

    [374] 15 Oct 16:41:28.146 # Connection with slave 10.72.26.55:6379 lost.
    [374] 15 Oct 16:41:28.999 * Slave asks for synchronization
    [374] 15 Oct 16:41:28.999 * Unable to partial resync with the slave for lack of backlog (Slave request was: 152340118946214).
    [374] 15 Oct 16:41:28.999 * Starting BGSAVE for SYNC
    [374] 15 Oct 16:41:29.447 * Background saving started by pid 11357
    [11357] 15 Oct 16:41:57.325 * DB saved on disk
    [11357] 15 Oct 16:41:57.555 * RDB: 231 MB of memory used by copy-on-write
    [374] 15 Oct 16:41:57.980 * Background saving terminated with success
    [374] 15 Oct 16:42:31.739 * Synchronization with slave succeeded
    [374] 15 Oct 16:43:01.021 # Client id=6082455 addr=slave_host:55308 fd=329 name= age=93 idle=1 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=10657 omem=2504780296 events=rw cmd=replconf scheduled to be closed ASAP for overcoming of output buffer limits.
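The last line of the master log carries the number that matters, `omem`, the memory used by that client's output buffer. As a rough sketch (the helper names `client_fields` and `parse_size` are hypothetical, and the 256mb figure is the default slave hard limit from the redis 2.8 configuration file), the value can be extracted and compared against the limit like this:

```python
import re

# The "scheduled to be closed" line from the master log in this post.
LOG_LINE = (
    "[374] 15 Oct 16:43:01.021 # Client id=6082455 addr=slave_host:55308 "
    "fd=329 name= age=93 idle=1 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 "
    "qbuf-free=0 obl=0 oll=10657 omem=2504780296 events=rw cmd=replconf "
    "scheduled to be closed ASAP for overcoming of output buffer limits."
)


def client_fields(line):
    """Collect the key=value pairs of a client description into a dict."""
    return dict(re.findall(r"([\w-]+)=(\S*)", line))


def parse_size(text):
    """Convert a size such as '256mb' to bytes (kb/mb/gb are powers of 1024)."""
    units = {"kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}
    text = text.lower()
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)


fields = client_fields(LOG_LINE)
omem = int(fields["omem"])
hard_limit = parse_size("256mb")  # default slave hard limit, assumed still in effect

print("slave client:", fields["flags"] == "S")
print("omem bytes:", omem)
print("over hard limit:", omem > hard_limit)
print("ratio to hard limit: %.1f" % (omem / hard_limit))
```

In this incident the buffer had grown to several times the hard limit by the time the master closed the connection.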

Then check the log on the slave:

    [372] 15 Oct 16:43:01.141 # Connection with master lost.
    [372] 15 Oct 16:43:01.141 * Caching the disconnected master state.
    [372] 15 Oct 16:43:01.213 * Connecting to MASTER masterhost:6379
    [372] 15 Oct 16:43:01.213 * MASTER <-> SLAVE sync started
    [372] 15 Oct 16:43:01.213 * Non blocking connect for SYNC fired the event.
    [372] 15 Oct 16:43:01.572 * Master replied to PING, replication can continue...
    [372] 15 Oct 16:43:01.599 * Trying a partial resynchronization (request cbc213a279fde141211f65d436595e4ed64198fa:152342150944513).
    [372] 15 Oct 16:43:01.602 * Full resync from master: cbc213a279fde141211f65d436595e4ed64198fa:152344338348685
    [372] 15 Oct 16:43:01.602 * Discarding previously cached master state.
    [372] 15 Oct 16:43:30.326 * MASTER <-> SLAVE sync: receiving 1308737462 bytes from master
    [372] 15 Oct 16:43:59.846 * MASTER <-> SLAVE sync: Flushing old data
    [372] 15 Oct 16:44:01.534 * MASTER <-> SLAVE sync: Loading DB in memory
    [372] 15 Oct 16:44:22.590 * MASTER <-> SLAVE sync: Finished with success
    [372] 15 Oct 16:44:22.600 # Connection with master lost.
    [372] 15 Oct 16:44:22.600 * Caching the disconnected master state.

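The master's log shows that the slave's connection was forcibly closed because its output buffer (omem=2504780296, roughly 2.3 GB) exceeded the configured output buffer limits. Here is how the self-documented redis 2.8 configuration file describes the setting: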

    # client-output-buffer-limit <class> <hard limit> <soft limit> <soft seconds>
    #
    # A client is immediately disconnected once the hard limit is reached, or if
    # the soft limit is reached and remains reached for the specified number of
    # seconds (continuously).
    # So for instance if the hard limit is 32 megabytes and the soft limit is
    # 16 megabytes / 10 seconds, the client will get disconnected immediately
    # if the size of the output buffers reach 32 megabytes, but will also get
    # disconnected if the client reaches 16 megabytes and continuously overcomes
    # the limit for 10 seconds.
    #
    # By default normal clients are not limited because they don't receive data
    # without asking (in a push way), but just after a request, so only
    # asynchronous clients may create a scenario where data is requested faster
    # than it can read.
    #
    # Instead there is a default limit for pubsub and slave clients, since
    # subscribers and slaves receive data in a push fashion.
    #
    # Both the hard or the soft limit can be disabled by setting them to zero.
    client-output-buffer-limit normal 0 0 0
    client-output-buffer-limit slave 256mb 64mb 60
    client-output-buffer-limit pubsub 32mb 8mb 60

The part that matters here is the limit for slave clients:

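256mb is the hard limit: as soon as the output buffer reaches 256mb, the connection is closed.
64mb 60 is the soft limit: if the output buffer stays at or above 64mb continuously for more than 60 seconds, the connection is closed.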
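The two rules can be illustrated with a small simulation. This is a sketch of the rule as described in the configuration comments, not Redis's actual code; the function `first_disconnect` and the sample buffer sizes are made up for illustration.

```python
MB = 1024 ** 2


def first_disconnect(samples, hard, soft, soft_seconds):
    """Return (time, reason) for the first sample that would close the client.

    samples: iterable of (time_in_seconds, buffer_bytes), in time order.
    A limit of 0 disables that check, as in the Redis configuration.
    Returns None if the client is never disconnected.
    """
    soft_since = None  # time at which the buffer first reached the soft limit
    for t, size in samples:
        if hard and size >= hard:
            return t, "hard"
        if soft and size >= soft:
            if soft_since is None:
                soft_since = t
            elif t - soft_since > soft_seconds:
                return t, "soft"
        else:
            # Dropping below the soft limit resets the timer.
            soft_since = None
    return None


# Default slave limits from the configuration file: 256mb 64mb 60
limits = dict(hard=256 * MB, soft=64 * MB, soft_seconds=60)

steady = [(t, 70 * MB) for t in range(0, 120, 10)]   # stuck above the soft limit
burst = [(0, 10 * MB), (5, 300 * MB)]                # jumps past the hard limit
dip = [(0, 70 * MB), (50, 10 * MB), (55, 70 * MB), (100, 70 * MB)]

print(first_disconnect(steady, **limits))
print(first_disconnect(burst, **limits))
print(first_disconnect(dip, **limits))
```

The `dip` series never triggers a disconnect because the buffer falls below 64mb before 60 seconds have elapsed, which restarts the soft-limit timer.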

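When connections surge and the data volume is large, the default values are no longer enough for master-slave synchronization. While the master generates and transfers the RDB file, new writes pile up in the slave's output buffer; once the buffer exceeds the limit, the master drops the connection. The slave then asks for synchronization again, the master runs BGSAVE again and sends the file again, and the cycle never ends. The value of client-output-buffer-limit slave has to be tuned to how the slave is actually used. After we adjusted it, synchronization completed normally.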
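The post does not say which values were chosen, so the numbers below are placeholders only. As a sketch, a small helper (hypothetical name `slave_limit_settings`) can produce both the `redis.conf` directive and the matching `CONFIG SET` command to run in `redis-cli` on the master:

```python
def slave_limit_settings(hard, soft, soft_seconds):
    """Build the redis.conf line and the CONFIG SET command for the slave class.

    hard and soft are size strings such as '1gb'; soft_seconds is an integer.
    The values passed in are the caller's choice, not a recommendation.
    """
    value = "slave %s %s %d" % (hard, soft, soft_seconds)
    conf_line = "client-output-buffer-limit " + value
    runtime_cmd = 'CONFIG SET client-output-buffer-limit "%s"' % value
    return conf_line, runtime_cmd


# Placeholder values for illustration; size them from the omem actually observed.
conf_line, runtime_cmd = slave_limit_settings("4gb", "2gb", 120)
print(conf_line)    # persist in redis.conf so the setting survives a restart
print(runtime_cmd)  # apply on the running master without a restart
```

Whatever values are used, the master needs enough free memory to hold a buffer of that size in addition to the dataset and the copy-on-write overhead of BGSAVE.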
