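This test uses one server running two Redis instances to build a Redis master-slave high-availability setup. Three additional Redis Sentinel instances are started to manage this two-instance Redis cluster.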
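Item                        Value             Notes
Server IP address           10.100.202.248
Redis version               5.0.12
redis_6379 instance port    6379
redis_6380 instance port    6380
Sentinel instance 1 port    26379             Sentinel instance to be removed
Sentinel instance 2 port    26380
Sentinel instance 3 port    26381
Sentinel instance 4 port    26382             Sentinel instance to be added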
High-availability environment setup

Install the redis_6379 and redis_6380 instances

    cd /usr/local/
    ++
    make
    make test
    make install
    cd utils/
    ./install_
    /install_
    et up a running redis server
    15
    Please select the redis port for this instance: [6379]
    Selecting default: 6379
    Please select the redis config file name [/etc/redis/6379.conf]
    Selected default - /etc/redis/6379.conf
    Please select the redis log file name [/var/log/redis_6379.log]
    Selected default - /var/log/redis_6379.log
    Please select the data directory for this instance [/var/lib/redis/6379]
    Selected default - /var/lib/redis/6379
    Please select the redis executable path [/usr/local/bin/redis-server]
    Selected config:
    Port: 6379
    Config file: /etc/redis/6379.conf
    Log file: /var/log/redis_6379.log
    Data dir: /var/lib/redis/6379
    Executable: /usr/local/bin/redis-server
    Cli Executable: /usr/local/bin/redis-cli
    Is this ok?
    /tmp/6379.conf=/etc//redis_6379
    Installing service
    Successfully added to chkconfig!
    Successfully added to runlevels 345!
    Starting Redis server
    Installation successful!
The 6380 instance

    [root@c7-8 utils] grep -Ev "^$|[
    grep -Ev "^$|[

On the slave instance, configure the IP address and port of the master instance.
Restart both Redis instances so that the configuration takes effect:

    [root@c7-8 redis] service redis_6380 restart
    Stopping
    Redis stopped
    Starting Redis server

Verify the master-slave relationship:
    [root@c7-8 redis]
    Replication
    role:master
    connected_slaves:1
    slave0:ip=10.100.202.248,port=6380,state=online,offset=140,lag=1
    master_replid:379bbd733623cf1e7e8c125f6a7027c8344484ad
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:140
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:140

    [root@c7-8 redis]
    Replication
    role:slave
    master_host:10.100.202.248
    master_port:6379
    master_link_status:up
    master_last_io_seconds_ago:9
    master_sync_in_progress:0
    slave_repl_offset:98
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_replid:379bbd733623cf1e7e8c125f6a7027c8344484ad
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:98
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:98

Configure the Redis Sentinel cluster
Create the configuration files needed by the sentinel instances:

    touch {26379..26381}.conf

Configuration file for the 26379 instance:

    /var/run/"/var/log/"
    dir /tmp
    sentinel monitor Redis_Master__Master_cs Abc@123
    sentinel down-after-milliseconds Redis_Master_cs 30000
    sentinel parallel-syncs Redis_Master_cs 1
    sentinel failover-timeout Redis_Master_cs 180000
    sentinel deny-scripts-reconfig yes
For the other two configuration files, copy the same configuration in and change the port:

    [root@c7-8 redis]
    [root@c7-8 redis] sed -i 's/26379/26381/g' 26381.conf
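The copy-and-edit step above can be sketched as a small loop. This is only a sketch under stated assumptions: the base file is named 26379.conf, and every occurrence of the base port in it should become the new port. The `clone_sentinel_conf` function, the scratch directory, and the two-line sample file are illustrative and not part of the original procedure.

```shell
#!/bin/sh
# Hypothetical helper: derive a sentinel config for another port from a base file.
# Assumes every occurrence of the base port in the file should become the new port.
clone_sentinel_conf() {
    base_port=$1
    new_port=$2
    conf_dir=$3
    sed "s/${base_port}/${new_port}/g" "${conf_dir}/${base_port}.conf" \
        > "${conf_dir}/${new_port}.conf"
}

# Demo in a scratch directory so nothing under /etc/redis is touched.
# The sample content below is made up for the demo.
demo_dir=$(mktemp -d)
printf 'port 26379\ndir /tmp\n' > "${demo_dir}/26379.conf"

for port in 26380 26381; do
    clone_sentinel_conf 26379 "${port}" "${demo_dir}"
done

cat "${demo_dir}/26381.conf"
```

Writing to a new file instead of using `sed -i` leaves the base file untouched, so the same loop can be rerun for additional ports.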
The final configuration file contents are as follows:

    [root@c7-8 redis]
    /var/run/"/var/log/"
    dir /tmp
    sentinel monitor Redis_Master__Master_cs Abc@123
    sentinel down-after-milliseconds Redis_Master_cs 30000
    sentinel parallel-syncs Redis_Master_cs 1
    sentinel failover-timeout Redis_Master_cs 180000
    sentinel deny-scripts-reconfig yes
Start the Redis Sentinel instances:

    [root@c7-8 redis] redis-sentinel /etc/redis/26380.conf
    [root@c7-8 redis] ps -ef | grep sentinel
    root  6553     1  0 17:20 ?      00:00:00 :26379 [sentinel]
    root  6560     1  0 17:20 ?      00:00:00 :26380 [sentinel]
    root  6565     1  0 17:20 ?      00:00:00 :26381 [sentinel]
    root  6570  1095  0 17:20 pts/0  00:00:00 grep --color=auto sentinel
Verify the sentinel instance status

    [root@c7-8 redis]
    :26380 sentinel master Redis_Master_cs
     1) "name"                      2) "Redis_Master_cs"
     3) "ip"                        4) "10.100.202.248"
     5) "port"                      6) "6379"
     7) "runid"                     8) "a7704b9feb83df32e4b1a89ee60a62cb06d8750d"
     9) "flags"                    10) "master"
    11) "link-ping-commands"       12) "0"
    13) "link-refcount"            14) "1"
    15) "last-ping-sent"           16) "0"
    17) "last-ok-ping-reply"       18) "216"
    19) "last-ping-reply"          20) "216"
    21) "down-after-milliseconds"  22) "30000"
    23) "info-refresh"             24) "5209"
    25) "role-reported"            26) "master"
    27) "role-reported-time"       28) "196097"
    29) "config-epoch"             30) "0"
    31) "num-slaves"               32) "1"
    33) "num-other-sentinels"      34) "2"
    35) "quorum"                   36) "2"
    37) "failover-timeout"         38) "180000"
    39) "parallel-syncs"           40) "1"
    127.0.0.1:26380

    [root@c7-8 redis] ps -ef | grep redis | grep -v grep
    root  6444  1  0 Apr20 ?  00:03:36 /usr/local/bin/:6379
    root  6493  1  0 Apr20 ?  00:03:37 /usr/local/bin/:6380
    root  6553  1  0 Apr20 ?  00:04:10 :26379 [sentinel]
    root  6560  1  0 Apr20 ?  00:04:10 :26380 [sentinel]
    root  6565  1  0 Apr20 ?  00:04:09 :26381 [sentinel]
The current master-slave relationship is:

    Master instance: redis_6379
    Slave instance: redis_6380
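The role check used to establish this relationship can be scripted. The sketch below parses `INFO replication` text and prints the role and the number of connected slaves. The `replication_summary` function name is hypothetical, and the sample input is copied from the master output shown earlier rather than fetched from a live instance.

```shell
#!/bin/sh
# Hypothetical helper: read `INFO replication` text on stdin and print
# "<role> <connected_slaves>". Carriage returns are stripped because
# redis-cli returns INFO lines with CRLF endings.
replication_summary() {
    tr -d '\r' | awk -F: '
        $1 == "role"             { role = $2 }
        $1 == "connected_slaves" { slaves = $2 }
        END { print role, slaves }
    '
}

# Sample taken from the master output earlier in this document.
# Against a live instance the input would come from something like:
#   redis-cli -p 6379 info replication | replication_summary
# (add -a <password> if the instance requires authentication).
sample='role:master
connected_slaves:1
slave0:ip=10.100.202.248,port=6380,state=online,offset=140,lag=1'

printf '%s\n' "$sample" | replication_summary
```

Running the same helper against both ports before and after a failover gives a compact view of which instance currently holds the master role.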
Forcibly terminate the process of the Redis master instance, redis_6379.

    kill -9 6444
Check the current roles of the Redis instances

    127.0.0.1:6380 info replication

    +sdown master Redis_Master_
    :X 21 Apr 2022 13:26:40.642 quorum 3/2
    6553:X 21 Apr 2022 13:26:40.642 +try-failover master Redis_Master_
    :X 21 Apr 2022 13:26:40.712 51d9e69843cbc2d01073843b51d5832a4969da8b voted for 51d9e69843cbc2d01073843b51d5832a4969da8b 1
    6553:X 21 Apr 2022 13:26:40.872 +elected-leader master Redis_Master_
    :X 21 Apr 2022 13:26:40.887 +:6380 10.100.202.248 6380 @ Redis_Master_
    :X 21 Apr 2022 13:26:40.964 * +:6380 10.100.202.248 6380 @ Redis_Master_
    :X 21 Apr 2022 13:26:41.021 * +:6380 10.100.202.248 6380 @ Redis_Master_
    :X 21 Apr 2022 13:26:42.030 +failover-state-reconf-slaves master Redis_Master_
    :X 21 Apr 2022 13:26:42.047 +switch-master Redis_Master_
    :X 21 Apr 2022 13:26:42.048 * +:6379 10.100.202.248 6379 @ Redis_Master_
    :X 21 Apr 2022 13:27:12.063

    rm /var/run/redis_6379.pid
    rm: remove regular file "/var/run/redis_6379.pid"? y
    [root@c7-8 redis] service redis_6379 status
    Redis is running (8166)

    -:6379 10.100.202.248 6379 @ Redis_Master_

    :6380 auth Abc@123
    :6380 info replication

The 6379 instance is back online and has connected to the master node.

    master_replid:ce51c53bc762ecb0340d2387caeff78ee17a96ce
    master_replid2:379bbd733623cf1e7e8c125f6a7027c8344484ad
    master_repl_offset:16255722
    second_repl_offset:16123437
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:15207147
    repl_backlog_histlen:1048576

Simulate adding and removing sentinel instances
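Test procedure:

1. Add a new sentinel instance, 26382.conf, to the current three-node sentinel cluster
2. Test failover
3. Stop one of the sentinel instances (the 26379 instance is used in this test)
4. Run the reset command on the other sentinel instances
5. Test failover

Simulate adding an instance to the sentinel cluster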
Create the configuration file for the 26382 sentinel instance

    vim /etc/redis/26382.
    /var/run/"/var/log/"
    dir /tmp
    sentinel monitor Redis_Master__Master_cs Abc@123
    sentinel down-after-milliseconds Redis_Master_cs 30000
    sentinel parallel-syncs Redis_Master_cs 1
    sentinel failover-timeout Redis_Master_cs 180000
    sentinel deny-scripts-reconfig yes
Start the new sentinel instance 26382

    [root@c7-8 redis] ps -ef | grep redis-sentinel
    root  6553  1  0 Apr20 ?  00:05:22 :26379 [sentinel]
    root  6560  1  0 Apr20 ?  00:05:21 :26380 [sentinel]
    root  6565  1  0 Apr20 ?  00:05:21 :26381 [sentinel]
    root  8227  1  0 14:54 ?  00:00:00 /usr/local/bin/:26382 [sentinel]
The log output of the Redis Sentinel 26379 instance shows that a sentinel instance has been added to the sentinel cluster:

    6553:X 21 Apr 2022 14:54:26.396 * + @ Redis_Master_
Connect to the console of any Redis Sentinel instance and check the current value of num-other-sentinels for the sentinel cluster

    127.0.0.1:26380 sentinel master Redis_Master_cs
     1) "name"                      2) "Redis_Master_cs"
     3) "ip"                        4) "10.100.202.248"
     5) "port"                      6) "6379"
     7) "runid"                     8) "cf591aea7d11f8f4fa09910b2c92d786ad4adace"
     9) "flags"                    10) "master"
    11) "link-ping-commands"       12) "0"
    13) "link-refcount"            14) "1"
    15) "last-ping-sent"           16) "0"
    17) "last-ok-ping-reply"       18) "562"
    19) "last-ping-reply"          20) "562"
    21) "down-after-milliseconds"  22) "30000"
    23) "info-refresh"             24) "5143"
    25) "role-reported"            26) "master"
    27) "role-reported-time"       28) "1623522"
    29) "config-epoch"             30) "4"
    31) "num-slaves"               32) "1"
    33) "num-other-sentinels"

After the failover test above, the 6380 instance is the master and the 6379 instance is the slave. Kill the 6380 instance to test a master-slave switchover:

    [root@c7-8 redis] kill -9 6493

    +new-epoch 3
    6560:X 21 Apr 2022 15:11:21.535 +vote-for-leader 3109e9691cc3f8fcfb3d16567ad4fb2dca405679 3
    6560:X 21 Apr 2022 15:11:21.725 51d9e69843cbc2d01073843b51d5832a4969da8b voted for 51d9e69843cbc2d01073843b51d5832a4969da8b 3
    6560:X 21 Apr 2022 15:11:21.785 -failover-abort-not-elected master Redis_Master_
    :X 21 Apr 2022 15:11:32.023 +new-epoch 4
    6560:X 21 Apr 2022 15:17:21.907 +vote-for-leader 3109e9691cc3f8fcfb3d16567ad4fb2dca405679 4
    6560:X 21 Apr 2022 15:17:22.098 282035f5c0b5d960c07eb0c94e57a857b31691b6 voted for 3109e9691cc3f8fcfb3d16567ad4fb2dca405679 4
    6560:X 21 Apr 2022 15:17:22.323 +elected-leader master Redis_Master_
    :X 21 Apr 2022 15:17:22.386 +:6379 10.100.202.248 6379 @ Redis_Master_
    :X 21 Apr 2022 15:17:22.463 * +:6379 10.100.202.248 6379 @ Redis_Master_
    :X 21 Apr 2022 15:17:22.529 * +:6379 10.100.202.248 6379 @ Redis_Master_
    :X 21 Apr 2022 15:17:22.739 +failover-state-reconf-slaves master Redis_Master_
    :X 21 Apr 2022 15:17:22.749 +switch-master Redis_Master_
    :X 21 Apr 2022 15:17:22.750 * +:6380 10.100.202.248 6380 @ Redis_Master_
    :X 21 Apr 2022 15:17:52.776

Check the role of the 6379 instance and test whether it can be read from and written to:

    127.0.0.1:6379
    :6379 auth Abc@123
    :6379 info replication

The 6380 instance is not running at this point, so the master instance reports 0 connected slaves.

    master_replid:41cd2272f1c43d1465cf71c7ea678eefa28a5fb4
    master_replid2:ce51c53bc762ecb0340d2387caeff78ee17a96ce
    master_repl_offset:17509840
    second_repl_offset:17482487
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:16461265
    repl_backlog_histlen:1048576

    127.0.0.1:6379 get abc
    "456"
    127.0.0.1:6379
    :6379
    get abc
    "789"

Conclusion
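After the number of sentinel instances grows from 3 to 4, a triggered master-slave switchover can go through several failed election rounds because the number of sentinel nodes is even (the log above shows -failover-abort-not-elected in epoch 3). Eventually, however, a leader is elected, a new master instance is chosen, and the role switch completes.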
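One way to see why a fourth sentinel does not help is to compare the vote counts involved. Sentinel authorizes a failover only when a sentinel wins the votes of a majority of the known sentinels, so the sketch below computes that majority and the number of sentinel failures each cluster size can tolerate. The `majority` function is a hypothetical helper for illustration only; it does not model split votes or election timing.

```shell
#!/bin/sh
# Hypothetical helper: smallest number of votes that is a strict majority
# of n sentinels (integer division, then add one).
majority() {
    echo $(( $1 / 2 + 1 ))
}

for n in 3 4 5; do
    m=$(majority "$n")
    echo "sentinels=$n majority=$m tolerated_failures=$(( n - m ))"
done
```

With these numbers, going from 3 to 4 sentinels raises the required majority from 2 to 3 while the number of tolerated failures stays at 1, which is consistent with the common advice to run an odd number of sentinels.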
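Simulate removing an instance from the sentinel cluster

Current environment state: four sentinel instances, one Redis master instance (6379), and no slave instance.

    [root@c7-8 redis] service redis_6380 status
    Redis is not running
    [root@c7-8 redis] rm /var/run/redis_6380.pid -f
    [root@c7-8 redis] service redis_6380 status
    Redis is running (8331)

    -:6380 10.100.202.248 6380 @ Redis_Master_
    :X 21 Apr 2022 15:30:30.557 * +:6380 10.100.202.248 6380 @ Redis_Master_

View the master-slave information on the instance:

    127.0.0.1:6379 info replication

The number of connected slaves has changed from 0 to 1.

    slave0:ip=10.100.202.248,port=6380,state=online,offset=17802773,lag=0

    kill -9 6553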
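After a while, the logs of the other sentinel instances show that the surviving sentinels have detected that the 26379 instance is gone and subjectively consider it down.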
Run the command

    sentinel master Redis_Master_cs

The value of num-other-sentinels is still 3.
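Checking this value by eye becomes tedious when it has to be repeated on every sentinel. As a sketch, the field can be pulled out of the command output with a small filter. It assumes the output is piped, in which case redis-cli prints each field name and each value on its own line without the `N)` numbering. The `sentinel_field` function name and the shortened sample input are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: read piped `SENTINEL MASTER <name>` output on stdin
# (alternating field-name and value lines) and print the value that follows
# the requested field name.
sentinel_field() {
    field=$1
    tr -d '\r' | awk -v f="$field" '
        found   { print; exit }
        $0 == f { found = 1 }
    '
}

# Shortened sample in the piped output format. Against a live sentinel:
#   redis-cli -p 26380 sentinel master Redis_Master_cs | sentinel_field num-other-sentinels
sample='name
Redis_Master_cs
num-other-sentinels
3
quorum
2'

printf '%s\n' "$sample" | sentinel_field num-other-sentinels
```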
Run the sentinel reset * command to reset the sentinel cluster

At this point, the following appears in the log of the 26380 sentinel instance:

    6560:X 21 Apr 2022 16:00:18.884

After the reset command is run, the value becomes 2:

    34) "2"
    35) "quorum"
    36) "2"

Run the following in the 26381 console:

    127.0.0.1:26381 sentinel reset Redis_Master_cs
    (integer) 1

    +reset-master master Redis_Master_
    :X 21 Apr 2022 16:05:54.457 * + @ Redis_Master_
    :X 21 Apr 2022 16:05:55.445 * + @ Redis_Master_
    :X 21 Apr 2022 16:05:56.951 * +:6380 10.100.202.248 6380 @ Redis_Master_

In the output of the `sentinel master Redis_Master_cs` command, the value of `num-other-sentinels` is now 2.

Kill the current master instance, 6379

    [root@c7-8 redis] kill -9 8166

    +sdown master Redis_Master_
    :X 21 Apr 2022 16:13:51.620 quorum 3/2
    6565:X 21 Apr 2022 16:13:51.620 +try-failover master Redis_Master_
    :X 21 Apr 2022 16:13:51.773 e9726b29c7237775b23d8ae2ba2affa37b5e28c3 voted for e9726b29c7237775b23d8ae2ba2affa37b5e28c3 5
    6565:X 21 Apr 2022 16:13:51.899 +confi @ Redis_Master_
    :X 21 Apr 2022 16:13:52.707 +:6379 10.100.202.248 6379 @ Redis_Master_

Restart the Redis 6379 instance to restore master-slave mode

    [root@c7-8 redis] service redis_6379 status
    Redis is running (8423)

    -:6379 10.100.202.248 6379 @ Redis_Master_
    :X 21 Apr 2022 16:17:09.475 * +:6379 10.100.202.248 6379 @ Redis_Master_
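At this point, Redis master-slave replication is restored again, except that the 6380 instance is now the master and the 6379 instance is the slave.
The operations above show that removing a sentinel instance from the cluster takes two steps. First stop the instance. Then manually issue a reset command in the console of each remaining healthy sentinel, so that each one rebuilds its view of the instances currently in the sentinel cluster. Only after that is the stopped sentinel actually dropped from the cluster.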
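The manual reset on each remaining sentinel can be wrapped in a loop. This is a minimal sketch, not a tested operational script. The `reset_sentinels` function, the `DRY_RUN` switch, and the `PAUSE_SECONDS` variable are hypothetical names. The 30-second pause between instances follows the general Redis guidance to reset sentinels one after the other rather than all at once, and is an assumption about this environment. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: send SENTINEL RESET to each remaining sentinel in turn.
# With DRY_RUN=1 (the default here) the commands are only printed.
DRY_RUN=${DRY_RUN:-1}
PAUSE_SECONDS=${PAUSE_SECONDS:-30}

reset_sentinels() {
    master_name=$1
    shift
    for port in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "redis-cli -p ${port} sentinel reset ${master_name}"
        else
            redis-cli -p "${port}" sentinel reset "${master_name}"
            # Give this sentinel time to rediscover its peers before
            # resetting the next one.
            sleep "${PAUSE_SECONDS}"
        fi
    done
}

# Ports of the sentinels that remain after 26379 is stopped.
reset_sentinels Redis_Master_cs 26380 26381 26382
```

Setting `DRY_RUN=0` would execute the commands for real. It is worth confirming afterwards that `num-other-sentinels` has dropped on every remaining sentinel.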