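Note: Unless otherwise stated, FW1 or MSR1 refers to the device in the topology whose name ends in 1, FW2 or MSR2 to the device whose name ends in 2, and so on. Within a subnet, the host part of a device's IP address equals its device number. For example, if interface g0/0 of FW1 is in subnet 1.1.1.0/24, its IP address is 1.1.1.1/24, and so on.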
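The addressing convention above can be sketched in a few lines of Python. The helper name interface_address is hypothetical and exists only to illustrate the rule; it is not part of any device tooling.

```python
import ipaddress


def interface_address(subnet: str, device_number: int) -> str:
    """Apply the note's convention: host part of the address == device number.

    Hypothetical helper for illustration only.
    """
    network = ipaddress.ip_network(subnet)
    # Host number 0 is the network address and the last one is the broadcast
    # address, so the device number must fall strictly between them.
    if not 0 < device_number < network.num_addresses - 1:
        raise ValueError("device number does not fit in the subnet")
    host = network.network_address + device_number
    return f"{host}/{network.prefixlen}"


print(interface_address("1.1.1.0/24", 1))    # FW1 in 1.1.1.0/24
print(interface_address("10.14.1.0/24", 4))  # SW4 in 10.14.1.0/24
```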
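Networking description:
1. FW1 and FW2 form an RBM group.
2. VRRP runs upstream and OSPF runs downstream. SW5 and SW6 are Layer 2 switches. They are there so that when GE1/0/2 on SW5 is shut down, the FW1 interface stays up but the OSPF neighbor relationship is lost, which simulates a BFD timeout.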
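Not applicable.
Analysis: can a BFD session going down trigger an RBM switchover on firewalls deployed with RBM, OSPF, and VRRP?
Purpose of the investigation:
On a live network, BFD packets may time out and bring the OSPF neighbor down. This failure is not caused by the link state of the OSPF interface going down.
If the OSPF neighbor goes down, does the RBM state switch over?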
The key configuration of each device is as follows:

FW1:
 #
 remote-backup group
  data-channel interface GigabitEthernet1/0/1
  adjust-cost ospf enable absolute 1000
  track 1
  local-ip 1.1.1.1
  remote-ip 1.1.1.2
  device-role primary
 #
 ospf 1 router-id 10.14.1.1
  area 0.0.0.0
   network 10.14.1.0 0.0.0.255
 #
 interface GigabitEthernet1/0/1
  port link-mode route
  combo enable copper
  ip address 10.1.1.1 255.255.255.0
  vrrp vrid 10 virtual-ip 10.1.1.254 active
 #
 interface GigabitEthernet1/0/2
  port link-mode route
  combo enable copper
  ip address 10.14.1.1 255.255.255.0
  ospf bfd enable
 #

FW2:
 #
 remote-backup group
  data-channel interface GigabitEthernet1/0/1
  adjust-cost ospf enable absolute 1000
  track 1
  local-ip 1.1.1.2
  remote-ip 1.1.1.1
  device-role secondary
 #
 ospf 1 router-id 10.24.1.2
  area 0.0.0.0
   network 10.24.1.0 0.0.0.255
 #
 interface GigabitEthernet1/0/1
  port link-mode route
  combo enable copper
  ip address 10.1.1.2 255.255.255.0
  vrrp vrid 10 virtual-ip 10.1.1.254 standby
 #
 interface GigabitEthernet1/0/2
  port link-mode route
  combo enable copper
  ip address 10.24.1.2 255.255.255.0
  ospf bfd enable
 #

SW4:
 ospf 1 router-id 4.4.4.4
  area 0.0.0.0
   network 10.14.1.0 0.0.0.255
   network 10.24.1.0 0.0.0.255
Steps:
1. Shut down interface GE1/0/2 on SW5. The OSPF interface on FW1 stays up, but BFD packets time out and the OSPF neighbor goes down.
%Nov 12 19:46:12:808 2022 FW1 BFD/5/BFD_CHANGE_FSM: -COntext=1; Sess[10.14.1.1/10.14.1.4, LD/RD:32833/129, Interface:GE1/0/2, SessType:Ctrl, LinkType:INET], Ver:1, Sta: UP->DOWN, Diag: 1 (Control Detection Time Expired)
%Nov 12 19:46:12:821 2022 FW1 OSPF/5/OSPF_NBR_CHG: -COntext=1; OSPF 1 Neighbor 10.14.1.4(GigabitEthernet1/0/2) changed from FULL to DOWN.
Check the RBM and VRRP status on FW1. No switchover has occurred, even though the OSPF neighbor table is now empty:
RBM_P<FW1>disp remo sta
Remote backup group information:
Backup mode: Active/standby
Device management role: Primary
Device running status: Active
Data channel interface: GigabitEthernet1/0/1
Local IP: 1.1.1.1
Remote IP: 1.1.1.2 Destination port: 60064
Control channel status: Connected
Keepalive interval: 1s
Keepalive count: 10
Configuration consistency check interval: 24 hour
Configuration consistency check result: Not Performed
Configuration backup status: Auto sync enabled
Session backup status: Hot backup enabled
Delay-time: 0 min
Uptime since last switchover: 0 days, 0 hours, 54 minutes
Switchover records:
  Time                 Status change       Cause
  2022-11-12 18:52:27  Active to Active    Keepalive link established
  2022-11-12 18:51:56  Initial to Active   Local device rebooted
  2022-11-11 19:37:12  Active to Standby   Switchover request
  2022-11-11 19:36:50  Standby to Active   Switchover request
  2022-11-11 19:36:01  Active to Standby   Switchover request
  2022-11-11 19:17:57  Active to Active    Keepalive link established
  2022-11-11 19:17:54  Active to Active    Keepalive link disconnected
  2022-11-11 18:48:57  Active to Active    Keepalive link established
  2022-11-11 18:48:35  Initial to Active   Local device rebooted
  2022-11-11 18:31:30  Active to Active    Keepalive link established
RBM_P<FW1>disp vrrp
IPv4 Virtual Router Information:
Running mode : Standard
RBM control channel is established
VRRP active group status : Master
VRRP standby group status: Master
Total number of virtual routers : 1
 Interface  VRID  State   Running  Adver  Auth           Virtual
                          Pri      Timer  Type           IP
 ----------------------------------------------------------------------------
 GE1/0/1    10    Master  100      100    Not supported  10.1.1.254
RBM_P<FW1>disp ospf peer
OSPF Process 1 with Router ID 10.14.1.1
Neighbor Brief Information
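Cause analysis:
The following events trigger an active/standby switchover:
· Control channel disconnection: with both devices running normally, a switchover occurs when the control channel is disconnected. Both devices then become primary devices and process services, but they are no longer in HA state, which affects subsequent asymmetric traffic.
· Failure of the entire primary device.
· Failure of an HA-monitored interface on the primary device.
· Failure of any security service module on the primary device.
  A failure of any security service module on the primary device triggers a service active/standby election. If both devices have the same number of security service modules in position, then in active/standby mode the service-active device is the management primary device and the service-standby device is the management secondary device, while in dual-active mode both devices are service-active. If the numbers differ, then in either mode the device with more security service modules in position is elected service-active and the device with fewer is elected service-standby.
· Failure of all MPUs on the primary device.
· Failure of all switching fabric modules on the primary device.
· Any Track entry associated with HA on the primary device is in Negative state.
An OSPF neighbor going down is not one of these events, so the active/standby state did not change.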
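The reasoning above amounts to a membership check against the list of trigger events. The sketch below models only that reasoning. The enum and function names are hypothetical, and it does not reproduce how the firewall evaluates these conditions internally.

```python
from enum import Enum, auto


class Event(Enum):
    # The switchover triggers listed above (names are illustrative).
    CONTROL_CHANNEL_DISCONNECTED = auto()
    PRIMARY_DEVICE_FAILURE = auto()
    MONITORED_INTERFACE_FAILURE = auto()
    SECURITY_SERVICE_MODULE_FAILURE = auto()
    ALL_MPUS_FAILED = auto()
    ALL_SWITCHING_FABRIC_MODULES_FAILED = auto()
    TRACK_ENTRY_NEGATIVE = auto()
    # The event observed in this case; it is not in the trigger list.
    OSPF_NEIGHBOR_DOWN = auto()


SWITCHOVER_TRIGGERS = frozenset(Event) - {Event.OSPF_NEIGHBOR_DOWN}


def triggers_switchover(event: Event) -> bool:
    """Return True if the event is one of the listed switchover triggers."""
    return event in SWITCHOVER_TRIGGERS


print(triggers_switchover(Event.TRACK_ENTRY_NEGATIVE))
print(triggers_switchover(Event.OSPF_NEIGHBOR_DOWN))
```

In this model the BFD timeout affects RBM only if it is mapped onto one of the listed triggers, or if a switchover is requested explicitly.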
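As a workaround, an EAA monitor policy can be configured to request a switchover when the OSPF neighbor log appears: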
rtm cli-policy test
event syslog priority all msg "from FULL to DOWN" occurs 1 period 10
action 10 cli system-view
action 20 cli remote-backup group
action 30 cli switchover request
user-role level-3
user-role network-operator
user-role network-admin
commit
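Note: Disable the information center before configuring this EAA policy. Otherwise the switchover action runs automatically as soon as the policy is committed, as the log below shows. A likely reason is that the SHELL_CMD log echoing the "event syslog" command itself contains the string "from FULL to DOWN" and therefore matches the policy. Re-enable the information center afterwards, since a syslog-triggered policy presumably depends on it to generate the logs it matches.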
%Nov 12 19:50:07:755 2022 FW1 SHELL/6/SHELL_CMD: -Line=con0-IPAddr=**-User=admin; Command is rtm cli-policy test
%Nov 12 19:50:07:936 2022 FW1 SHELL/6/SHELL_CMD: -Line=con0-IPAddr=**-User=admin; Command is event syslog priority all msg "from FULL to DOWN" occurs 1 period 10
%Nov 12 19:50:08:081 2022 FW1 SHELL/6/SHELL_CMD: -Line=con0-IPAddr=**-User=admin; Command is action 10 cli system-view
%Nov 12 19:50:08:130 2022 FW1 SHELL/6/SHELL_CMD: -Line=con0-IPAddr=**-User=admin; Command is action 20 cli remote-backup group
%Nov 12 19:50:08:161 2022 FW1 SHELL/6/SHELL_CMD: -Line=con0-IPAddr=**-User=admin; Command is action 30 cli switchover request
%Nov 12 19:50:08:225 2022 FW1 SHELL/6/SHELL_CMD: -Line=con0-IPAddr=**-User=admin; Command is user-role level-3
%Nov 12 19:50:08:265 2022 FW1 SHELL/6/SHELL_CMD: -Line=con0-IPAddr=**-User=admin; Command is user-role network-operator
%Nov 12 19:50:08:338 2022 FW1 SHELL/6/SHELL_CMD: -Line=con0-IPAddr=**-User=admin; Command is user-role network-admin
%Nov 12 19:50:11:724 2022 FW1 SHELL/6/SHELL_CMD: -Line=con0-IPAddr=**-User=admin; Command is commit
%Nov 12 19:50:12:270 2022 FW1 OSPF/6/RBM_ADJUST_COST: RBM notified cost adjust to 1000.
%Nov 12 19:50:12:283 2022 FW1 VRRP4/6/VRRP_STATUS_CHANGE: The status of IPv4 virtual router 10 (configured on GigabitEthernet1/0/1) changed from Master to Backup: Controled by RBM.
%Nov 12 19:50:12:368 2022 FW1 RTM/6/RTM_POLICY: CLI policy test is running successfully.
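The log above suggests why the policy fired during configuration. The following Python sketch is a simplified model of the match, assuming the policy fires on any log whose text matches the msg pattern. Python's re module only approximates the device's matching, and the "occurs 1 period 10" counting is not modeled. The three sample lines are copied from the logs in this case.

```python
import re

# Pattern taken from the "event syslog ... msg" line of the EAA policy.
PATTERN = re.compile(r"from FULL to DOWN")


def matches(log_line: str) -> bool:
    """Return True if the log text contains the policy's msg pattern."""
    return PATTERN.search(log_line) is not None


# The log the policy is meant to catch.
ospf_log = (
    "%Nov 12 19:46:12:821 2022 FW1 OSPF/5/OSPF_NBR_CHG: -COntext=1; "
    "OSPF 1 Neighbor 10.14.1.4(GigabitEthernet1/0/2) changed from FULL to DOWN."
)
# The command echo produced while the policy was being typed in.
shell_log = (
    "%Nov 12 19:50:07:936 2022 FW1 SHELL/6/SHELL_CMD: "
    "-Line=con0-IPAddr=**-User=admin; Command is event syslog priority all "
    'msg "from FULL to DOWN" occurs 1 period 10'
)
# An unrelated command echo for comparison.
commit_log = (
    "%Nov 12 19:50:11:724 2022 FW1 SHELL/6/SHELL_CMD: "
    "-Line=con0-IPAddr=**-User=admin; Command is commit"
)

print(matches(ospf_log))    # True
print(matches(shell_log))   # True
print(matches(commit_log))  # False
```

Under this assumption the command echo is indistinguishable from a real neighbor-down log, which is consistent with the advice to disable the information center while entering the policy.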