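The on-site device is an S6900F running R2609 with patch H25. The plan was to uninstall patch H25 and then load the latest patch, H35.
After H25 was uninstalled and H35 was loaded, the BGP neighbor went abnormal and multiple tunnels went down: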
BGP.: 10.4.253.1 state has changed from ESTABLISHED to IDLE for TCP_Connection_Failed event received.
%Apr 27 01:50:48:734 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/3/PHY_UPDOWN: Physical state on the interface Tunnel0 changed to down.
%Apr 27 01:50:48:734 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/5/LINK_UPDOWN: Line protocol state on the interface Tunnel0 changed to down.
%Apr 27 01:50:48:964 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/3/PHY_UPDOWN: Physical state on the interface Tunnel5 changed to down.
%Apr 27 01:50:48:965 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/5/LINK_UPDOWN: Line protocol state on the interface Tunnel5 changed to down.
%Apr 27 01:50:49:137 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/3/PHY_UPDOWN: Physical state on the interface Tunnel4 changed to down.
%Apr 27 01:50:49:138 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/5/LINK_UPDOWN: Line protocol state on the interface Tunnel4 changed to down.
%Apr 27 01:50:49:299 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/3/PHY_UPDOWN: Physical state on the interface Tunnel2 changed to down.
%Apr 27 01:50:49:299 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/5/LINK_UPDOWN: Line protocol state on the interface Tunnel2 changed to down.
%Apr 27 01:50:49:501 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/3/PHY_UPDOWN: Physical state on the interface Tunnel14 changed to down.
%Apr 27 01:50:49:502 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/5/LINK_UPDOWN: Line protocol state on the interface Tunnel14 changed to down.
%Apr 27 01:50:49:650 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/3/PHY_UPDOWN: Physical state on the interface Tunnel9 changed to down.
%Apr 27 01:50:49:650 2022 XAS23&24.PRI.LF-DSJYYPT IFNET/5/LINK_UPDOWN: Line protocol state on the interface Tunnel9 changed to down
The device logs show that a memory alarm had already been raised before the patch upgrade started:
%@243209%Apr 27 00:35:21:386 2022 XAS23&24.PRI.LF-DSJYYPT DIAG/1/MEM_EXCEED_THRESHOLD: Memory early-warning threshold has been exceeded.
Memory statistics are measured in KB:
Total Free FreeRatio
Mem: 2024320 332496 16%
H25 was then uninstalled, followed by installation of H35:
%@243229%Apr 27 01:14:51:651 2022 XAS23&24.PRI.LF-DSJYYPT SHELL/6/SHELL_CMD: -Line=vty7-IPAddr=192.168.100.3-User=dsj_mmu; Command is install deactivate patch flash:/S6900F-CMW710-SYSTEM-R2609H25.bin all
%@243292%Apr 27 01:23:52:020 2022 XAS23&24.PRI.LF-DSJYYPT SHELL/6/SHELL_CMD: -Line=vty7-IPAddr=192.168.100.3-User=dsj_mmu; Command is install activate patch flash:/S6900F-CMW710-SYSTEM-R2609H35.bin all
Once installation of H35 began, free memory started to drop and triggered the minor (level-1) threshold alarm:
%@243305%Apr 27 01:28:03:680 2022 XAS23&24.PRI.LF-DSJYYPT DIAG/1/MEM_EXCEED_THRESHOLD: Memory minor threshold has been exceeded.
Memory statistics are measured in KB:
Total Free FreeRatio
Mem: 2024320 260000 12%
Free memory then kept falling:
%@243409%Apr 27 01:30:08:952 2022 XAS23&24.PRI.LF-DSJYYPT DIAG/1/MEM_EXCEED_THRESHOLD: Memory minor threshold has been exceeded.
Memory statistics are measured in KB:
Total Free FreeRatio
Mem: 2024320 227360 11%
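The three alarm entries above can be lined up to see how quickly free memory was falling. The following is a minimal sketch that only reuses the timestamps and Free values printed in those log entries; the function name `free_trend` is hypothetical and not part of any device tooling.

```python
# (timestamp, free KB) pairs copied from the DIAG/1/MEM_EXCEED_THRESHOLD entries above.
SAMPLES = [
    ("00:35:21", 332496),  # early-warning alarm, before any patch operation
    ("01:28:03", 260000),  # minor alarm, after H35 activation started
    ("01:30:08", 227360),  # minor alarm, about two minutes later
]
TOTAL_KB = 2024320  # Total column from the same log entries


def free_trend(samples, total_kb):
    """Return (timestamp, free_kb, free_pct, delta_kb) for each sample.

    delta_kb is the change in free memory since the previous sample,
    or None for the first sample.
    """
    rows = []
    previous = None
    for timestamp, free_kb in samples:
        delta = None if previous is None else free_kb - previous
        free_pct = round(free_kb / total_kb * 100, 1)
        rows.append((timestamp, free_kb, free_pct, delta))
        previous = free_kb
    return rows


for row in free_trend(SAMPLES, TOTAL_KB):
    print(row)
```

The deltas show that free memory fell by roughly 72 MB between the early warning and the first minor alarm, and by a further 32 MB in the next two minutes.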
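After the patch installation finished, memory did not recover. Free memory dropped to near the severe threshold, which caused the BGP session to drop and the associated tunnels to go down.
Memory statistics were collected again and showed a large amount of unreleased cache on the device. Free/Total was only about 10%, yet the reported FreeRatio was 19.6%; the difference, roughly 9% of total memory, was held by cache: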
===============display memory===============
Memory statistics are measured in KB:
Slot 1:
Total Used Free Shared Buffers Cached FreeRatio
Mem: 2024320 1822480 201840 0 48 464640 19.6%
-/+ Buffers/Cache: 1357792 666528
Swap: 0 0 0
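The mismatch between the raw free ratio and the reported FreeRatio can be checked by hand from the "Mem:" line above. The sketch below only redoes that arithmetic; how the device derives FreeRatio is not stated in this case, so the "gap" is simply the reported value minus Free/Total. The helper name `pct` is hypothetical.

```python
# Values copied from the "Mem:" line of the display memory output above (KB).
TOTAL_KB = 2024320
FREE_KB = 201840
CACHED_KB = 464640
REPORTED_FREE_RATIO = 19.6  # percent, as printed in the FreeRatio column


def pct(part_kb, total_kb):
    """Express part_kb as a percentage of total_kb."""
    return part_kb / total_kb * 100


raw_free_pct = pct(FREE_KB, TOTAL_KB)          # Free / Total
cached_pct = pct(CACHED_KB, TOTAL_KB)          # Cached / Total
gap_points = REPORTED_FREE_RATIO - raw_free_pct  # FreeRatio minus Free / Total

print(f"Free/Total:      {raw_free_pct:.1f}%")
print(f"Cached/Total:    {cached_pct:.1f}%")
print(f"FreeRatio gap:   {gap_points:.1f} points")
```

The raw free ratio comes out at about 10%, and the gap to the reported FreeRatio is a little under 10 percentage points, which matches the figures quoted in the analysis.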
From this state, H35 was uninstalled and H25 was loaded again:
%@243642%Apr 27 04:18:45:616 2022 XAS23&24.PRI.LF-DSJYYPT SHELL/6/SHELL_CMD: -Line=vty7-IPAddr=192.168.100.3-User=dsj_mmu; Command is install deactivate patch flash:/S6900F-CMW710-SYSTEM-R2609H35.bin all
%@243725%Apr 27 04:23:48:917 2022 XAS23&24.PRI.LF-DSJYYPT SHELL/6/SHELL_CMD: -Line=vty7-IPAddr=192.168.100.3-User=dsj_mmu; Command is install activate patch flash:/S6900F-CMW710-SYSTEM-R2609H25.bin all
Device memory dropped further, reaching the severe threshold:
%@243765%Apr 27 04:26:12:411 2022 XAS23&24.PRI.LF-DSJYYPT DIAG/1/MEM_EXCEED_THRESHOLD: Memory severe threshold has been exceeded.
Memory statistics are measured in KB:
Total Free FreeRatio
Mem: 2024320 139680 6%
This triggered a cache release; memory gradually recovered and the BGP neighbor was re-established:
%@243766%Apr 27 04:26:18:234 2022 XAS23&24.PRI.LF-DSJYYPT DIAG/1/MEM_BELOW_THRESHOLD: Memory usage has dropped below severe threshold.
%@243767%Apr 27 04:26:23:916 2022 XAS23&24.PRI.LF-DSJYYPT DIAG/1/MEM_BELOW_THRESHOLD: Memory usage has dropped below minor threshold.
%@243768%Apr 27 04:26:33:752 2022 XAS23&24.PRI.LF-DSJYYPT BGP/5/BGP_STATE_CHANGED: BGP.: 10.4.253.1 State is changed from OPENCONFIRM to ESTABLISHED.
After recovery, cache usage was about 140 MB lower than during the fault:
===============display memory===============
Memory statistics are measured in KB:
Slot 1:
Total Used Free Shared Buffers Cached FreeRatio
Mem: 2024320 1655424 368896 0 48 320288 20.7%
-/+ Buffers/Cache: 1335088 689232
Swap: 0 0 0
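The two display memory snapshots can be compared directly to confirm the size of the cache release. This is a minimal sketch that parses the two "Mem:" lines exactly as they appear in this case; `parse_mem_line` is a hypothetical helper, and the column order is assumed to be the one shown in the header row above.

```python
def parse_mem_line(line):
    """Parse a 'Mem:' line from display memory into a dict (values in KB).

    Assumes the column order Total, Used, Free, Shared, Buffers, Cached,
    FreeRatio, as shown in the header row of the output in this case.
    """
    parts = line.split()
    if len(parts) != 8 or parts[0] != "Mem:":
        raise ValueError(f"unexpected line: {line!r}")
    names = ["total", "used", "free", "shared", "buffers", "cached"]
    stats = dict(zip(names, (int(p) for p in parts[1:7])))
    stats["free_ratio_pct"] = float(parts[7].rstrip("%"))
    return stats


during_fault = parse_mem_line("Mem: 2024320 1822480 201840 0 48 464640 19.6%")
after_recovery = parse_mem_line("Mem: 2024320 1655424 368896 0 48 320288 20.7%")

cache_released_kb = during_fault["cached"] - after_recovery["cached"]
free_gained_kb = after_recovery["free"] - during_fault["free"]

print(f"cache released: {cache_released_kb} KB ({cache_released_kb / 1024:.0f} MB)")
print(f"free gained:    {free_gained_kb} KB ({free_gained_kb / 1024:.0f} MB)")
```

The cached figure drops by 144352 KB, which is the "about 140 MB" quoted above, while the FreeRatio column barely moves (19.6% to 20.7%), so FreeRatio alone would not have revealed the problem.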
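On some small-memory devices such as the 6800 and 6900F, make sure before loading a patch that the device's free memory meets the requirement stated in the patch installation guide. The R2609H35 patch for the 6900F, for example, has the memory requirement listed below. If the requirement is not met, first adjust the device's memory minor threshold so that the patch installation requirement is satisfied, then proceed. After the patch is installed, adjust the threshold based on the current free memory so that the device triggers the minor threshold once and releases the excess cache memory.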
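Hot patch file name | Mandatory install | Recommended install | Restricted install | Memory required for patch installation | Remarks
S6900F-CMW710-SYSTEM-R2609H35.bin | √ | | | 240M + Minor threshold | Obtain the Minor threshold from the "Minor" entry in the output of the display memory-threshold command.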
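The pre-installation check described above reduces to one comparison: free memory must be at least 240M plus the minor threshold. The sketch below illustrates that rule under two assumptions that are not confirmed by this case: "240M" is read as 240 × 1024 KB, and the minor threshold value used in the example (262144 KB) is purely hypothetical, since the real value must be read from display memory-threshold on the device. The function names are also hypothetical.

```python
# "240M" from the requirement table, assumed here to mean 240 * 1024 KB.
PATCH_REQUIREMENT_KB = 240 * 1024


def required_free_kb(minor_threshold_kb, patch_requirement_kb=PATCH_REQUIREMENT_KB):
    """Free memory needed before installing: patch requirement + minor threshold."""
    return patch_requirement_kb + minor_threshold_kb


def can_install_patch(free_kb, minor_threshold_kb):
    """Return True if free memory satisfies the installation requirement."""
    return free_kb >= required_free_kb(minor_threshold_kb)


# Free memory before the upgrade, from the early-warning log entry in this case.
free_before_upgrade_kb = 332496
# Hypothetical minor threshold; read the real value from display memory-threshold.
example_minor_threshold_kb = 262144

needed = required_free_kb(example_minor_threshold_kb)
print(f"required: {needed} KB, available: {free_before_upgrade_kb} KB")
print("OK to install" if can_install_patch(free_before_upgrade_kb,
                                           example_minor_threshold_kb)
      else f"short by {needed - free_before_upgrade_kb} KB")
```

With the hypothetical threshold, the 332496 KB that was free before the upgrade would fall short of the requirement, which is consistent with the memory pressure seen during the patch operation.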