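Software version: Version 7.1.064, Release 6749P2102
The field reported that BFD detection timed out on an MSR router. The timeout caused BGP state changes and disrupted neighbor establishment between the MSR and a third-party vendor's device. The cause of the BFD detection timeout needs to be analyzed.
The interface shows no error packets, and there are no abnormal logs around the time of the fault. The local log shows Diag 1 (Control Detection Time Expired), which means the detection timer expired on the local end, that is, the local device did not receive BFD control packets from the peer within the detection time. The investigation should therefore focus on the running state of the local device.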
BFD/5/BFD_CHANGE_FSM: Sess[XXXX, Interface:XGE0/XXX, SessType:Ctrl, LinkType:INET], Ver:1, Sta: UP->DOWN, Diag: 1 (Control Detection Time Expired)
BGP/5/BGP_STATE_CHANGED:
BGP.: 10.xxx.xxx.xxx state has changed from ESTABLISHED to IDLE for session down event received from BFD.
BFD/5/BFD_CHANGE_SESS: Sess[XXXX, Interface:XGE0/XXX, SessType:Ctrl, LinkType:INET], Ver:1, Sta: Deleted, Diag: 1 (Control Detection Time Expired)
BFD/5/BFD_CHANGE_FSM: Sess[XXXX, Interface:XGE0/XXXX, SessType:Ctrl, LinkType:INET], Ver:1, Sta: UP->DOWN, Diag: 1 (Control Detection Time Expired)
BGP/5/BGP_STATE_CHANGED:
BGP.: 10.xxx.xxx.xxx state has changed from ESTABLISHED to IDLE for session down event received from BFD.
BFD/5/BFD_CHANGE_SESS: Sess[XXXX, Interface:XGE0/XXXX, SessType:Ctrl, LinkType:INET], Ver:1, Sta: Deleted, Diag: 1 (Control Detection Time Expired)
BGP/5/BGP_STATE_CHANGED:
BGP.: 10.xxx.xxx.xxx state has changed from OPENCONFIRM to ESTABLISHED.
BGP/5/BGP_STATE_CHANGED:
BGP.: 10.xxx.xxx.xxx state has changed from OPENCONFIRM to ESTABLISHED.
BFD/5/BFD_CHANGE_FSM: Sess[XXXX, Interface:XGE0/XXX, SessType:Ctrl, LinkType:INET], Ver:1, Sta: DOWN->INIT, Diag: 0 (No Diagnostic)
BFD/5/BFD_CHANGE_FSM: Sess[XXXX, Interface:XGE0/XXX, SessType:Ctrl, LinkType:INET], Ver:1, Sta: INIT->UP, Diag: 0 (No Diagnostic)
BFD/5/BFD_CHANGE_FSM: Sess[XXXX, Interface:XGE0/XXXX, SessType:Ctrl, LinkType:INET], Ver:1, Sta: DOWN->UP, Diag: 0 (No Diagnostic)
BFD/5/BFD_CHANGE_FSM: Sess[XXXX, Interface:XGE0/XXXX, SessType:Ctrl, LinkType:INET], Ver:1, Sta: UP->DOWN, Diag: 1 (Control Detection Time Expired)
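When many flaps have to be reviewed, the BFD_CHANGE_FSM lines above can be summarized with a short script. The following is a minimal Python sketch, not a vendor tool: the regular expression is written only against the log format shown in this case, the function name `parse_fsm` and the `local_timeout` field are made up for illustration, and the diagnostic code table is taken from RFC 5880.

```python
import re

# Diagnostic codes as defined in RFC 5880, section 4.1.
BFD_DIAG = {
    0: "No Diagnostic",
    1: "Control Detection Time Expired",
    2: "Echo Function Failed",
    3: "Neighbor Signaled Session Down",
    4: "Forwarding Plane Reset",
    5: "Path Down",
    6: "Concatenated Path Down",
    7: "Administratively Down",
    8: "Reverse Concatenated Path Down",
}

# Matches only the BFD_CHANGE_FSM format seen in this case (assumption:
# other releases or log types may be formatted differently).
FSM_RE = re.compile(
    r"BFD_CHANGE_FSM: Sess\[[^\]]*Interface:(?P<ifname>[^,\]]+)[^\]]*\].*?"
    r"Sta: (?P<old>\w+)->(?P<new>\w+), Diag: (?P<diag>\d+)"
)


def parse_fsm(line):
    """Return a dict describing one FSM transition, or None if no match."""
    m = FSM_RE.search(line)
    if m is None:
        return None
    diag = int(m.group("diag"))
    return {
        "interface": m.group("ifname"),
        "old": m.group("old"),
        "new": m.group("new"),
        "diag": diag,
        "diag_text": BFD_DIAG.get(diag, "unknown"),
        # Diag 1 on a transition to DOWN: the local detection timer expired.
        "local_timeout": m.group("new") == "DOWN" and diag == 1,
    }
```

Feeding each log line to `parse_fsm` and counting the entries with `local_timeout` set gives the number of locally detected timeouts per interface. BFD_CHANGE_SESS and BGP lines are ignored by design.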
-------------------------------------------------------------------------------------------------------
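The diagnostic information shows that both devices have receive packet loss on the CPU back-end port, and the period of BFD flapping is also a traffic peak. The BFD timeouts are therefore probably caused by packet loss.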
MSR1:
===============inner-port 5 statistics===============
PKI_STAT4_STAT0 = 0x000000d1d4476736
PKI_STAT4_STAT1 = 0x00000223c69bd284
PKI_STAT4_STAT2 = 0x0000000000000000
PKI_STAT4_STAT3 = 0x00000000009a9ffb
PKI_STAT4_STAT4 = 0x000000007f2bbc13
Ten-GigabitEthernet0/XX
Peak input rate: 72567759 bytes/sec, at 2024-05-27 23:31:36
Peak output rate: 397379844 bytes/sec, at 2024-04-25 20:06:11
Top 10 peak input bit rates: 580542072 bits/sec at 2024-05-27 23:31:36
580540152 bits/sec at 2024-05-27 23:31:36
580538224 bits/sec at 2024-05-27 23:31:26
580536304 bits/sec at 2024-05-27 23:31:26
580534376 bits/sec at 2024-05-27 23:31:16
MSR2:
===============inner-port 5 statistics===============
PKI_STAT4_STAT0 = 0x000000c8fd1b90cd
PKI_STAT4_STAT1 = 0x0000cc48e4bdd6b4
PKI_STAT4_STAT2 = 0x0000000000000000
PKI_STAT4_STAT3 = 0x00000000028b12f3
PKI_STAT4_STAT4 = 0x000000021d521de4
Ten-GigabitEthernet0/XX
Peak input rate: 490599 bytes/sec, at 2024-05-27 23:30:26
Peak output rate: 630080262 bytes/sec, at 2024-04-17 10:03:48
Top 10 peak input bit rates: 3924792 bits/sec at 2024-05-27 23:30:26
3924776 bits/sec at 2024-05-27 23:30:16
3924768 bits/sec at 2024-05-27 23:29:56
3924760 bits/sec at 2024-05-27 23:29:46
3924752 bits/sec at 2024-05-27 23:29:31
3806152 bits/sec at 2024-05-27 23:29:31
3806144 bits/sec at 2024-05-27 23:29:16
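The inner-port counters above are printed in hexadecimal, which makes them hard to compare by eye. The sketch below converts them to decimal and computes a rough drop ratio. It rests on an assumption that this case does not state: that STAT0 counts received packets and STAT3 counts dropped packets on the back-end port. Verify the counter meanings for the actual platform before relying on the ratio. The values are cumulative, so they do not show when the drops occurred.

```python
# Raw values copied from the "inner-port 5 statistics" output in this case.
COUNTERS = {
    "MSR1": {"STAT0": "0x000000d1d4476736", "STAT3": "0x00000000009a9ffb"},
    "MSR2": {"STAT0": "0x000000c8fd1b90cd", "STAT3": "0x00000000028b12f3"},
}


def drop_ratio(stats):
    """Return (dropped, received, dropped / received) for one device.

    ASSUMPTION: STAT0 = received packets, STAT3 = dropped packets.
    This mapping is not confirmed by the case text.
    """
    received = int(stats["STAT0"], 16)
    dropped = int(stats["STAT3"], 16)
    return dropped, received, dropped / received


if __name__ == "__main__":
    for name, stats in COUNTERS.items():
        dropped, received, ratio = drop_ratio(stats)
        print(f"{name}: dropped={dropped} received={received} ratio={ratio:.6%}")
```

Taking two snapshots some minutes apart and subtracting them would show whether the drop counter is still increasing during a traffic peak, which is more telling than the cumulative totals.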
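All interfaces on the device share a single CPU back-end port, and packets from every interface are delivered to the CPU through this one channel. The combined load can exceed the receive capacity of a single core, so packets that other interfaces send up to the CPU, including BFD packets, are dropped.
The BFD flapping coincided with the traffic peak, so the flapping is attributed to this packet loss.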
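The peak rates in the interface output above are shown in two units: "Peak input rate" in bytes/sec and "Top 10 peak input bit rates" in bits/sec. The following small Python sketch cross-checks the two and expresses the peak as a share of the nominal port speed. The 10 Gbit/s figure is an assumption derived from the interface type name (Ten-GigabitEthernet), and the function names are illustrative only.

```python
# Nominal speed assumed from the interface type "Ten-GigabitEthernet".
LINK_BPS = 10_000_000_000

# "Peak input rate" values (bytes/sec) copied from the output in this case.
PEAK_INPUT_BYTES_PER_SEC = {"MSR1": 72567759, "MSR2": 490599}


def to_bits_per_sec(bytes_per_sec):
    """Convert a rate in bytes/sec to bits/sec."""
    return bytes_per_sec * 8


def utilisation(bytes_per_sec, link_bps=LINK_BPS):
    """Return the rate as a fraction of the nominal link speed."""
    return to_bits_per_sec(bytes_per_sec) / link_bps


if __name__ == "__main__":
    for name, rate in PEAK_INPUT_BYTES_PER_SEC.items():
        print(f"{name}: {to_bits_per_sec(rate)} bits/sec, "
              f"{utilisation(rate):.2%} of nominal link speed")
```

For MSR1 the conversion gives 580542072 bits/sec, which matches the first "Top 10" entry, so both lines describe the same peak at 2024-05-27 23:31:36. Note that these are per-interface figures, whereas the analysis above concerns the combined load of all interfaces on the shared CPU back-end port.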
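If the flapping occurs frequently on site, or cannot be tolerated, configure forwarding policy per-flow enhance and check whether the situation improves.
Command reference: 1.1.1 forwarding policy
Use forwarding policy to configure the packet load sharing policy.
Use undo forwarding policy to restore the default.
【Syntax】
forwarding policy { per-flow [ enhance ] | per-packet }
undo forwarding policy
【Default】
Flow-based packet load sharing is used.
【View】
System view
【Default user role】
network-admin
【Parameters】
per-flow: Flow-based processing. Packets are processed first in, first out.
enhance: Enhanced flow-based processing. With this parameter configured, the inbound, forwarding, and outbound processing of the same flow is distributed to different CPUs, which improves the processing performance of a single flow.
Support for this parameter depends on the device model.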