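Sentinel is a tool for managing Redis instances. It provides monitoring, notification, and automatic failover. Sentinel continuously checks whether the Redis instances are working properly and reports their state to other programs through an API. If the Redis master stops working, Sentinel starts a failover automatically: it promotes one of the slaves to master and reconfigures the remaining slaves to replicate from the new master.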

Introduction to Sentinel

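Redis Sentinel is a distributed architecture made up of several Sentinel nodes and Redis data nodes. Each Sentinel node monitors the data nodes and the other Sentinel nodes, and marks a node as down when it finds that node unreachable.
If the marked node is the master, the Sentinel also "negotiates" with the other Sentinel nodes. Once enough of them (the configured quorum) agree that the master is unreachable, the Sentinels elect, by majority vote, one of their number to carry out the automatic failover and to notify the Redis clients of the change.
The whole process is automatic and needs no manual intervention, which makes Sentinel a good answer to Redis high availability.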
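
The two thresholds in that negotiation can be sketched in a few lines of Python. This is only an illustration of the rules, not Sentinel's own code; the function names are hypothetical. It assumes that a master becomes objectively down once `quorum` Sentinels report it down, and that the elected leader needs votes from both a majority of all Sentinels and at least `quorum` of them.

```python
def is_objectively_down(down_reports: int, quorum: int) -> bool:
    """The master is objectively down once `quorum` Sentinels report it as down."""
    return down_reports >= quorum


def leader_votes_needed(total_sentinels: int, quorum: int) -> int:
    """Votes a Sentinel needs to lead the failover: a majority, and never fewer than the quorum."""
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)


# With the 3 Sentinels and quorum 2 used in this article:
print(is_objectively_down(down_reports=2, quorum=2))
print(leader_votes_needed(total_sentinels=3, quorum=2))
```

With three Sentinels and a quorum of 2, losing one Sentinel still leaves enough votes for a failover; losing two does not.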

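Redis master-slave replication
Redis master-slave replication synchronizes the master's data to the slaves. A slave then serves two purposes:

  • If the master goes down, a slave acts as its backup and can take over at any time
  • Slaves extend the master's read capacity and take read load off it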

Environment:
   CentOS 7.4
   Redis version: 3.2.10

The Redis Sentinel deployment consists of 3 Sentinel nodes, 1 master, and 2 slaves:
Role | IP | Port
---|---|---
master | 192.168.1.100 | 6379
slave1 | 192.168.1.101 | 6379
slave2 | 192.168.1.200 | 6379
Sentinel1 | 192.168.1.100 | 26379
Sentinel2 | 192.168.1.101 | 26379
Sentinel3 | 192.168.1.200 | 26379

Configuring Redis

Disable the firewall and SELinux

[root@localhost ~]# iptables -F
[root@localhost ~]# setenforce 0

The three Redis instances are deployed on different servers

[root@localhost ~]# yum install -y https://mirrors.aliyun.com/epel/epel-release-latest-7.noarch.rpm
[root@localhost ~]# yum clean all
[root@localhost ~]# yum install redis

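Redis master-slave relationship

Master Redis configuration
Lines marked with a trailing # are the main settings; the rest are defaults. The trailing # is only an annotation for this article: redis.conf does not support end-of-line comments, so leave it out of the real file.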

[root@localhost ~]# grep -v '^#' /etc/redis.conf |grep -v '^$'
bind 192.168.1.100 #
protected-mode yes #
port 6379 #
tcp-backlog 511
timeout 0
tcp-keepalive 300
daemonize yes #
supervised no
pidfile /var/run/redis_6379.pid
loglevel notice
logfile /var/log/redis/redis.log
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
requirepass 1234567 #
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
[root@localhost ~]#

slave1 Redis configuration

[root@localhost ~]# grep -v '^#' /etc/redis.conf |grep -v '^$'
bind 192.168.1.101 #
protected-mode yes #
port 6379 #
tcp-backlog 511
timeout 0
tcp-keepalive 300
daemonize yes #
supervised no
pidfile /var/run/redis_6379.pid
loglevel notice
logfile /var/log/redis/redis.log
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
slaveof 192.168.1.100 6379 #
masterauth "1234567" #
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
[root@localhost ~]#

slave2 Redis configuration

[root@localhost ~]# grep -v '^#' /etc/redis.conf |grep -v '^$'
bind 192.168.1.200 #
protected-mode yes #
port 6379 #
tcp-backlog 511
timeout 0
tcp-keepalive 300
daemonize yes #
supervised no
pidfile /var/run/redis_6379.pid
loglevel notice
logfile /var/log/redis/redis.log
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
slaveof 192.168.1.100 6379 #
masterauth "1234567" #
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
[root@localhost ~]#
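
The three dumps above are long, and only a few directives differ between master and slave. The Python sketch below makes that difference explicit by parsing two configurations and printing the directives that do not match. The helper names `parse_conf` and `diff_conf` are hypothetical, and the sample strings contain only the annotated lines from the dumps (without the trailing # markers).

```python
def parse_conf(text):
    """Parse redis.conf-style text into {directive: [argument strings]}."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        name, _, args = line.partition(" ")
        # Directives such as "save" may repeat, so keep a list per directive.
        settings.setdefault(name, []).append(args.strip())
    return settings


def diff_conf(a, b):
    """Return {directive: (values_in_a, values_in_b)} for directives that differ."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


MASTER = """\
bind 192.168.1.100
port 6379
daemonize yes
requirepass 1234567
"""

SLAVE1 = """\
bind 192.168.1.101
port 6379
daemonize yes
slaveof 192.168.1.100 6379
masterauth "1234567"
"""

for name, (master_value, slave_value) in diff_conf(parse_conf(MASTER), parse_conf(SLAVE1)).items():
    print(f"{name}: master={master_value} slave={slave_value}")
```

The comparison also surfaces something worth checking: in these dumps only the master sets requirepass and only the slaves set masterauth. Because any node can end up as master or slave after a failover, it is generally advisable in a Sentinel setup to set both directives, with the same password, on every node.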

Starting Redis

Start the Redis master first, then the Redis slaves

[root@localhost ~]# redis-server /etc/redis.conf

[root@localhost ~]# redis-cli -h 192.168.1.100 -p 6379 -a '1234567'
192.168.1.100:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.1.101,port=6379,state=online,offset=1,lag=0
slave1:ip=192.168.1.200,port=6379,state=online,offset=1,lag=0
master_repl_offset:1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:0
192.168.1.100:6379>
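
When this check has to be repeated on several nodes, the `info replication` text can be parsed instead of read by eye. The following Python sketch works on the text shown above; the helper names `parse_info` and `slave_states` are hypothetical, and the sample is copied from the transcript.

```python
def parse_info(text):
    """Turn 'key:value' lines from INFO output into a dict, skipping '# Section' headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def slave_states(info):
    """Return {"ip:port": state} for the slave0, slave1, ... entries reported by a master."""
    states = {}
    for key, value in info.items():
        if key.startswith("slave") and key[5:].isdigit():
            fields = dict(item.split("=", 1) for item in value.split(","))
            states[f"{fields['ip']}:{fields['port']}"] = fields["state"]
    return states


SAMPLE = """\
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.1.101,port=6379,state=online,offset=1,lag=0
slave1:ip=192.168.1.200,port=6379,state=online,offset=1,lag=0
master_repl_offset:1
"""

info = parse_info(SAMPLE)
print(info["role"], info["connected_slaves"])
print(slave_states(info))
```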

Deploying the Sentinel nodes

The three Sentinels are deployed on different servers

Sentinel1 configuration

[root@localhost ~]# grep -iv '^#' /etc/redis-sentinel.conf | grep -iv '^$'
bind 192.168.1.100 #
port 26379 #
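daemonize yes #
dir "/opt/redis-sentinel/data" #
# This Sentinel monitors the master at 192.168.1.100:6379
# 2 is the quorum: at least 2 Sentinels must agree that the master has failed
# mymaster is the alias of the master
sentinel monitor mymaster 192.168.1.100 6379 2 #
sentinel auth-pass mymaster 1234567 # the Redis node requires a password
# Each Sentinel periodically sends PING to the Redis data nodes and the other Sentinels; a node that does not reply for more than 30000 ms is judged unreachable
sentinel down-after-milliseconds mymaster 30000
# Once the Sentinels agree that the master has failed, the leader Sentinel performs the failover and selects a new master; the former slaves then start replicating from it. This setting limits the number of slaves that resync with the new master at the same time to 1
sentinel parallel-syncs mymaster 1
# Failover timeout: 180000 ms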
sentinel failover-timeout mymaster 180000
logfile /var/log/redis/sentinel.log
[root@localhost ~]#
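
Only the Sentinel1 file is shown. Assuming the other two Sentinels differ only in their bind address (the article does not show them, so this is an assumption), the three files could be generated from one template. The Python sketch below does that; `render_sentinel_conf` is a hypothetical helper, the values are the ones used in this article, and it writes the `daemonize` directive, which is the spelling Redis accepts.

```python
def render_sentinel_conf(bind_ip, master_ip="192.168.1.100", master_port=6379,
                         quorum=2, password="1234567", name="mymaster"):
    """Render a sentinel.conf equivalent to the Sentinel1 file for a given bind address."""
    lines = [
        f"bind {bind_ip}",
        "port 26379",
        "daemonize yes",
        'dir "/opt/redis-sentinel/data"',
        # <alias> <master ip> <master port> <quorum>
        f"sentinel monitor {name} {master_ip} {master_port} {quorum}",
        f"sentinel auth-pass {name} {password}",
        f"sentinel down-after-milliseconds {name} 30000",
        f"sentinel parallel-syncs {name} 1",
        f"sentinel failover-timeout {name} 180000",
        "logfile /var/log/redis/sentinel.log",
    ]
    return "\n".join(lines) + "\n"


for ip in ("192.168.1.100", "192.168.1.101", "192.168.1.200"):
    print(f"# --- sentinel.conf for {ip} ---")
    print(render_sentinel_conf(ip))
```

Every Sentinel points at the same master address; the Sentinels find each other and the slaves on their own once they are running.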

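Starting Sentinel

Sentinel topology diagram
Once Redis Sentinel has been deployed, each Sentinel rewrites its configuration file with the following changes:

  • The slaves and the other Sentinel nodes it discovered automatically are added
  • Settings left at their default values are removed, for example parallel-syncs and failover-timeout
  • Epoch parameters are added

Start Sentinel first, then Redis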

[root@localhost ~]# redis-sentinel /etc/redis-sentinel.conf &
[root@localhost ~]# redis-server /etc/redis.conf

Check the log

11116:X 28 Apr 00:59:09.095 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
11116:X 28 Apr 00:59:09.095 # Sentinel ID is 1f8abc9be76435a72401e236767cced5c8d535ba
11116:X 28 Apr 00:59:09.095 # +monitor master mymaster 192.168.1.100 6379 quorum 2
11116:X 28 Apr 00:59:29.978 * +sentinel sentinel 8f2a2091de02341d21d8f5fb9e00cbf72112e007 192.168.1.200 26379 @ mymaster 192.168.1.100 6379
11116:X 28 Apr 00:59:31.748 * +sentinel sentinel 9b905b81691a53a5b2ae53b7bbb9b286f047f5b6 192.168.1.101 26379 @ mymaster 192.168.1.100 6379
11116:X 28 Apr 00:59:49.953 * +slave slave 192.168.1.101:6379 192.168.1.101 6379 @ mymaster 192.168.1.100 6379
11116:X 28 Apr 00:59:49.967 * +slave slave 192.168.1.200:6379 192.168.1.200 6379 @ mymaster 192.168.1.100 6379
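
The discovery events in this log follow a regular shape, so they can be collected with a small parser. The Python sketch below extracts the `+sentinel` and `+slave` events and lists the addresses that joined. It assumes one event per line and the field layout seen in this log; the function names are hypothetical.

```python
import re

# pid:role, day, month, time, level marker, then an event name starting with + or -
EVENT_RE = re.compile(
    r"^\d+:\w \d+ \w+ (?P<time>[\d:.]+) [#*.\-] (?P<event>[+-][\w-]+) (?P<details>.*)$"
)


def parse_events(log_text):
    """Return (time, event, details) tuples for lines that carry a Sentinel event."""
    events = []
    for line in log_text.splitlines():
        match = EVENT_RE.match(line.strip())
        if match:
            events.append((match.group("time"), match.group("event"), match.group("details")))
    return events


def discovered(events):
    """Group the addresses announced by +sentinel and +slave events."""
    members = {"+sentinel": [], "+slave": []}
    for _, event, details in events:
        if event in members:
            # details: <type> <name> <ip> <port> @ <master-name> <master-ip> <master-port>
            parts = details.split()
            members[event].append(f"{parts[2]}:{parts[3]}")
    return members


LOG = """\
11116:X 28 Apr 00:59:09.095 # Sentinel ID is 1f8abc9be76435a72401e236767cced5c8d535ba
11116:X 28 Apr 00:59:09.095 # +monitor master mymaster 192.168.1.100 6379 quorum 2
11116:X 28 Apr 00:59:29.978 * +sentinel sentinel 8f2a2091de02341d21d8f5fb9e00cbf72112e007 192.168.1.200 26379 @ mymaster 192.168.1.100 6379
11116:X 28 Apr 00:59:31.748 * +sentinel sentinel 9b905b81691a53a5b2ae53b7bbb9b286f047f5b6 192.168.1.101 26379 @ mymaster 192.168.1.100 6379
11116:X 28 Apr 00:59:49.953 * +slave slave 192.168.1.101:6379 192.168.1.101 6379 @ mymaster 192.168.1.100 6379
11116:X 28 Apr 00:59:49.967 * +slave slave 192.168.1.200:6379 192.168.1.200 6379 @ mymaster 192.168.1.100 6379
"""

print(discovered(parse_events(LOG)))
```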

The Redis Sentinel log shows that redis-192.168.1.101 and redis-192.168.1.200 have joined the mymaster group, which now contains three Redis servers

On the master:

192.168.1.100:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.1.101,port=6379,state=online,offset=132341,lag=1
slave1:ip=192.168.1.200,port=6379,state=online,offset=132341,lag=1
master_repl_offset:132341
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:132340
192.168.1.100:6379>

On a slave:

192.168.1.101:6379> info replication
# Replication
role:slave
master_host:192.168.1.100
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
slave_repl_offset:421
master_link_down_since_seconds:43
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
192.168.1.101:6379>

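Testing Sentinel failover

Redis nodes stop replicating

When Redis Sentinel judges the master to be objectively down (Objectively Down, ODOWN for short) and so confirms that it is unreachable, it tells the slaves to stop replicating from that master.
Once the master has been unreachable for longer than the configured down-after-milliseconds (30000 ms, i.e. 30 seconds), Redis Sentinel performs the failover.

Manually take the Redis master (192.168.1.100) offline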

[root@localhost ~]# ps -ef |grep -i redis | egrep -iv 'color|grep'
root 11120 1 0 00:59 ? 00:00:05 redis-server 192.168.1.100:6379
[root@localhost ~]# kill 11120

Now look at the Redis Sentinel log again

11116:X 28 Apr 01:13:24.489 # +sdown master mymaster 192.168.1.100 6379
11116:X 28 Apr 01:13:24.531 # +new-epoch 1
11116:X 28 Apr 01:13:24.533 # +vote-for-leader 9b905b81691a53a5b2ae53b7bbb9b286f047f5b6 1
11116:X 28 Apr 01:13:24.566 # +odown master mymaster 192.168.1.100 6379 #quorum 3/2
11116:X 28 Apr 01:13:24.567 # Next failover delay: I will not start a failover before Sat Apr 28 01:19:24 2018
11116:X 28 Apr 01:13:25.238 # +config-update-from sentinel 9b905b81691a53a5b2ae53b7bbb9b286f047f5b6 192.168.1.101 26379 @ mymaster 192.168.1.100 6379
11116:X 28 Apr 01:13:25.239 # +switch-master mymaster 192.168.1.100 6379 192.168.1.200 6379
11116:X 28 Apr 01:13:25.240 * +slave slave 192.168.1.101:6379 192.168.1.101 6379 @ mymaster 192.168.1.200 6379
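
To read the outcome of a failover from such a log without scanning it by hand, the `+sdown` and `+switch-master` events are enough. The Python sketch below pulls out the old and new master and the time between the two events. It assumes one event per line, the field layout seen in this log, and that both events fall on the same day; `failover_summary` is a hypothetical helper.

```python
from datetime import datetime


def _stamp(fields):
    """Parse the time-of-day field, e.g. '01:13:24.489'."""
    return datetime.strptime(fields[3], "%H:%M:%S.%f")


def failover_summary(log_text):
    """Return old master, new master, and seconds from +sdown to +switch-master, or None."""
    sdown_at = switch_at = old = new = None
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue
        event = fields[5]
        if event == "+sdown" and fields[6] == "master":
            sdown_at = _stamp(fields)
        elif event == "+switch-master" and len(fields) >= 11:
            # +switch-master <name> <old ip> <old port> <new ip> <new port>
            switch_at = _stamp(fields)
            old = f"{fields[7]}:{fields[8]}"
            new = f"{fields[9]}:{fields[10]}"
    if sdown_at is None or switch_at is None:
        return None
    return {"old": old, "new": new, "seconds": (switch_at - sdown_at).total_seconds()}


LOG = """\
11116:X 28 Apr 01:13:24.489 # +sdown master mymaster 192.168.1.100 6379
11116:X 28 Apr 01:13:24.566 # +odown master mymaster 192.168.1.100 6379 #quorum 3/2
11116:X 28 Apr 01:13:25.239 # +switch-master mymaster 192.168.1.100 6379 192.168.1.200 6379
11116:X 28 Apr 01:13:25.240 * +slave slave 192.168.1.101:6379 192.168.1.101 6379 @ mymaster 192.168.1.200 6379
"""

print(failover_summary(LOG))
```

The interval measured here starts at the `+sdown` mark, which this Sentinel sets only after the master has already been silent for down-after-milliseconds, so the outage seen by clients is longer than this figure.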

The log shows that the Redis master has now switched to redis-192.168.1.200

192.168.1.200:6379> info replication
# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
192.168.1.200:6379>

192.168.1.101:6379> info replication
# Replication
role:slave
master_host:192.168.1.200
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
slave_repl_offset:1
master_link_down_since_seconds:1524849879
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
192.168.1.101:6379>

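Redis demotion
The logical architecture and the failover test above show the following Redis Sentinel features:

  • Monitoring: Sentinel nodes periodically check whether the Redis data nodes and the other Sentinel nodes are reachable
  • Notification: Sentinel nodes notify the application of the failover
  • Master failover: a slave is promoted to master and the correct master-slave relationships are maintained afterwards
  • Configuration provider: in a Redis Sentinel setup, clients connect to the set of Sentinel nodes at initialization and obtain the master's address from them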
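
The configuration-provider role is the one application code touches directly. As a sketch of what that can look like, the Python below uses the third-party redis-py package and its `redis.sentinel.Sentinel` class to ask the three Sentinels for the current master instead of hard-coding its address. The article itself does not use redis-py, so treat the package choice, the timeout values, and the helper name `connect_via_sentinel` as assumptions.

```python
SENTINELS = [
    ("192.168.1.100", 26379),
    ("192.168.1.101", 26379),
    ("192.168.1.200", 26379),
]
SERVICE_NAME = "mymaster"  # the alias given in "sentinel monitor"


def connect_via_sentinel(password=None):
    """Ask the Sentinels for the current master; return its (host, port) and a client for it."""
    # Imported here so this module still loads where redis-py is not installed.
    from redis.sentinel import Sentinel

    sentinel = Sentinel(SENTINELS, socket_timeout=0.5)
    address = sentinel.discover_master(SERVICE_NAME)
    # `password` must match the requirepass of whichever node is currently master.
    client = sentinel.master_for(SERVICE_NAME, socket_timeout=0.5, password=password)
    return address, client
```

Calling `connect_via_sentinel()` against the cluster after the failover above would be expected to report 192.168.1.200 as the master. Because only the original master sets requirepass in this article, the right `password` value depends on which node currently holds the master role, which is one more reason to configure the same password on every node.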

Redis connection interrupted
Now start redis-192.168.1.100 and look at the Redis Sentinel log again

11116:X 28 Apr 01:26:08.209 * +convert-to-slave slave 192.168.1.100:6379 192.168.1.100 6379 @ mymaster 192.168.1.200 6379

The Redis Sentinel log shows that once redis-192.168.1.100 came back online, its role was converted to slave

192.168.1.100:6379> info replication
# Replication
role:slave
master_host:192.168.1.200
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:82068
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
192.168.1.100:6379>

Data test

Write test on the Redis master node

192.168.1.100:6379> set myname jack_wang
OK
192.168.1.100:6379>

192.168.1.101:6379> keys *
1) "myname"
192.168.1.101:6379>
192.168.1.200:6379> keys *
1) "myname"
192.168.1.200:6379>

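OK, the test succeeded.

Note: the sentinel command is answered only by a Sentinel process (port 26379 in this setup). A Redis data node on port 6379 does not recognize it, which is why the following fails: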

192.168.1.100:6379> sentinel masters
(error) ERR unknown command 'sentinel'

Reference: https://blog.csdn.net/men_wen/article/details/72724406


This article is from "Jack Wang Blog": http://www.yfshare.vip/2016/11/18/Redis-HA-Sentinel/