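AD-Campus network with an S5560X as the leaf device. At around 10:00 on June 29, endpoints under the leaf that hosts the city branch's authentication service (leaf_shigongsi) reported service failures. On-site staff tried unplugging member links of the leaf-to-spine aggregation Bridge-Aggregation1024 (XGE1/0/26, XGE2/0/26) so that only one link remained, but service did not recover. It then recovered by itself after about 15 minutes: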
%@90496%Jun 29 08:35:33:208 2022 leaf_shigongsi DRVPLAT/4/SOFTCAR DROP:
PktType=UKNOWN_SMAC, SrcMAC=dc4a-3e59-0fd6, Dropped from interface=GigabitEthernet1/0/4 at Stage=61, StageCnt=5322, TotalCnt=23216, MaxRateInterface=GigabitEthernet1/0/14.
%@90497%Jun 29 10:06:00:779 2022 leaf_shigongsi LAGG/6/LAGG_INACTIVE_PHYSTATE: Member port XGE2/0/26 of aggregation group BAGG1024 changed to the inactive state, because the physical or line protocol state of the port was down.
%@90498%Jun 29 10:06:00:790 2022 leaf_shigongsi IFNET/3/PHY_UPDOWN: Physical state on the interface Ten-GigabitEthernet2/0/26 changed to down.
%@90499%Jun 29 10:06:00:842 2022 leaf_shigongsi IFNET/5/LINK_UPDOWN: Line protocol state on the interface Ten-GigabitEthernet2/0/26 changed to down.
%@90500%Jun 29 10:06:02:958 2022 leaf_shigongsi OPTMOD/4/MODULE_OUT: -Slot=2; Ten-GigabitEthernet2/0/26: Transceiver absent.
%@90501%Jun 29 10:06:05:657 2022 leaf_shigongsi LAGG/4/LACP_MAD_INTERFACE_CHANGE_STATE: LACP MAD function enabled on Bridge-Aggregation1024 changed to the faulty state.
%@90502%Jun 29 10:06:19:705 2022 leaf_shigongsi OPTMOD/4/MODULE_IN: -Slot=2; Ten-GigabitEthernet2/0/26: The transceiver is STACK_SFP_PLUS.
%@90503%Jun 29 10:06:32:439 2022 leaf_shigongsi LAGG/6/LAGG_INACTIVE_PHYSTATE: Member port XGE1/0/26 of aggregation group BAGG1024 changed to the inactive state, because the physical or line protocol state of the port was down.
%@90504%Jun 29 10:06:32:457 2022 leaf_shigongsi IFNET/3/PHY_UPDOWN: Physical state on the interface Ten-GigabitEthernet1/0/26 changed to down.
%@90505%Jun 29 10:06:32:490 2022 leaf_shigongsi IFNET/5/LINK_UPDOWN: Line protocol state on the interface Ten-GigabitEthernet1/0/26 changed to down.
%@90506%Jun 29 10:06:32:815 2022 leaf_shigongsi IFNET/3/PHY_UPDOWN: Physical state on the interface Bridge-Aggregation1024 changed to down.
%@90507%Jun 29 10:06:32:821 2022 leaf_shigongsi IFNET/5/LINK_UPDOWN: Line protocol state on the interface Bridge-Aggregation1024 changed to down.
%@90508%Jun 29 10:06:41:578 2022 leaf_shigongsi IFNET/3/PHY_UPDOWN: Physical state on the interface Ten-GigabitEthernet2/0/26 changed to up.
%@90509%Jun 29 10:06:41:912 2022 leaf_shigongsi RADIUS/4/RADIUS_AUTH_SERVER_DOWN: RADIUS authentication server was blocked: server IP=192.168.7.220, port=1812, VPN instance=vpn-default.
%@90510%Jun 29 10:06:42:748 2022 leaf_shigongsi IFNET/3/PHY_UPDOWN: Physical state on the interface Ten-GigabitEthernet2/0/26 changed to down.
%@90511%Jun 29 10:06:42:993 2022 leaf_shigongsi RADIUS/4/RADIUS_ACCT_SERVER_DOWN: RADIUS accounting server was blocked: server IP=192.168.7.220, port=1813, VPN instance=vpn-default.
1. The logs show alarms indicating that the device was being hit by a flood of ARP packets:
%@90519%Jun 29 10:06:48:793 2022 leaf_shigongsi DRVPLAT/4/SOFTCAR DROP: -Slot=2;
PktType=UKNOWN_SMAC, SrcMAC=6c0b-846b-be0a, Dropped from interface=GigabitEthernet2/0/3 at Stage=1, StageCnt=1344, TotalCnt=14945, MaxRateInterface=GigabitEthernet2/0/4.
%@90520%Jun 29 10:06:49:536 2022 leaf_shigongsi OFP/5/OFP_DISCONNECT: Openflow instance 1, controller 2 is disconnected.disconnected reason:Echo timeout.
%@90521%Jun 29 10:06:50:539 2022 leaf_shigongsi OFP/5/OFP_DISCONNECT: Openflow instance 1, controller 1 is disconnected.disconnected reason:Echo timeout.
%@90522%Jun 29 10:06:50:848 2022 leaf_shigongsi OFP/5/OFP_FAIL_OPEN: Openflow instance 1 is in fail secure mode.
%@90523%Jun 29 10:06:51:122 2022 leaf_shigongsi DRVPLAT/4/DrvDebug: -Slot=2;
Rx/Tx failure recovered between the CPU and switching chip on slot 2.
%@90524%Jun 29 10:06:55:186 2022 leaf_shigongsi DRVPLAT/4/DrvDebug: -Slot=2;
Rx/Tx failure recovered between the CPU and switching chip on slot 2.
%@90525%Jun 29 10:06:59:237 2022 leaf_shigongsi OPTMOD/4/MODULE_OUT: Ten-GigabitEthernet1/0/26: Transceiver absent.
%@90526%Jun 29 10:07:06:630 2022 leaf_shigongsi ARP/6/ARP_PKTQUE_ALERT: The current size of the ARP_PKT queue has reached 4244. Please check the network environment.
%@90638%Jun 29 10:10:25:082 2022 leaf_shigongsi IFNET/5/LINK_UPDOWN: Line protocol state on the interface Bridge-Aggregation1024 changed to down.
%@90639%Jun 29 10:10:25:543 2022 leaf_shigongsi LLDP/5/LLDP_NEIGHBOR_AGE_OUT: -Slot=2; Nearest bridge agent neighbor aged out on port Ten-GigabitEthernet2/0/26 (IfIndex 89), neighbor's chassis ID is 542b-de70-6a00, port ID is Ten-GigabitEthernet1/4/0/1.
%@90640%Jun 29 10:10:27:646 2022 leaf_shigongsi IFNET/3/PHY_UPDOWN: Physical state on the interface Ten-GigabitEthernet2/0/26 changed to up.
%@90641%Jun 29 10:10:27:387 2022 leaf_shigongsi LLDP/6/LLDP_CREATE_NEIGHBOR: -Slot=2; Nearest bridge agent neighbor created on port Ten-GigabitEthernet2/0/26 (IfIndex 89), neighbor's chassis ID is 542b-de70-6a00, port ID is Ten-GigabitEthernet2/4/0/1.
%@90642%Jun 29 10:10:27:858 2022 leaf_shigongsi LAGG/6/LAGG_ACTIVE: Member port XGE2/0/26 of aggregation group BAGG1024 changed to the active state.
%@90643%Jun 29 10:10:27:968 2022 leaf_shigongsi IFNET/3/PHY_UPDOWN: Physical state on the interface Bridge-Aggregation1024 changed to up.
%@90644%Jun 29 10:10:27:983 2022 leaf_shigongsi IFNET/5/LINK_UPDOWN: Line protocol state on the interface Bridge-Aggregation1024 changed to up.
%@90645%Jun 29 10:10:28:087 2022 leaf_shigongsi IFNET/5/LINK_UPDOWN: Line protocol state on the interface Ten-GigabitEthernet2/0/26 changed to up.
%@90646%Jun 29 10:10:32:613 2022 leaf_shigongsi OPTMOD/4/MODULE_OUT: Ten-GigabitEthernet1/0/26: Transceiver absent.
%@90647%Jun 29 10:10:36:207 2022 leaf_shigongsi RADIUS/4/RADIUS_ACCT_SERVER_DOWN: RADIUS accounting server was blocked: server IP=192.168.7.220, port=1813, VPN instance=vpn-default.
%@90648%Jun 29 10:10:37:590 2022 leaf_shigongsi DRVPLAT/4/DrvDebug: -Slot=2;
Rx/Tx failure recovered between the CPU and switching chip on slot 2.
%@90649%Jun 29 10:10:38:050 2022 leaf_shigongsi OPTMOD/4/MODULE_IN: Ten-GigabitEthernet1/0/26: The transceiver is STACK_SFP_PLUS.
%@90650%Jun 29 10:10:39:790 2022 leaf_shigongsi IFNET/3/PHY_UPDOWN: Physical state on the interface Ten-GigabitEthernet1/0/26 changed to up.
%@90651%Jun 29 10:10:39:892 2022 leaf_shigongsi LLDP/6/LLDP_CREATE_NEIGHBOR: Nearest bridge agent neighbor created on port Ten-GigabitEthernet1/0/26 (IfIndex 26), neighbor's chassis ID is 542b-de70-6a00, port ID is Ten-GigabitEthernet1/4/0/1.
%@90652%Jun 29 10:10:41:440 2022 leaf_shigongsi LAGG/6/LAGG_ACTIVE: Member port XGE1/0/26 of aggregation group BAGG1024 changed to the active state.
%@90653%Jun 29 10:10:42:254 2022 leaf_shigongsi IFNET/5/LINK_UPDOWN: Line protocol state on the interface Ten-GigabitEthernet1/0/26 changed to up.
%@90654%Jun 29 10:10:42:535 2022 leaf_shigongsi OFP/5/OFP_DISCONNECT: Openflow instance 1, controller 2 is disconnected.disconnected reason:Echo timeout.
%@90655%Jun 29 10:10:43:806 2022 leaf_shigongsi DRVPLAT/4/DrvDebug: -Slot=2;
Rx/Tx failure recovered between the CPU and switching chip on slot 2.
%@90656%Jun 29 10:10:44:541 2022 leaf_shigongsi OFP/5/OFP_DISCONNECT: Openflow instance 1, controller 1 is disconnected.disconnected reason:Echo timeout.
%@90657%Jun 29 10:10:44:562 2022 leaf_shigongsi OFP/5/OFP_FAIL_OPEN: Openflow instance 1 is in fail secure mode.
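To make the order of events in a log excerpt like this easier to follow, the headers can be parsed and counted per module and mnemonic. The following is a minimal sketch, not a vendor tool: the regular expression is inferred only from the lines quoted in this case, and the names HEADER, parse_headers, summarize and SAMPLE are hypothetical.

```python
import re
from collections import Counter

# Header layout inferred from the excerpt in this case (an assumption):
# %@<seq>%<Mon> <day> <hh:mm:ss>:<ms> <year> <host> <MODULE>/<severity>/<MNEMONIC>:
HEADER = re.compile(
    r"^%@(?P<seq>\d+)%(?P<month>\w{3}) (?P<day>\d{1,2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}):\d{3} (?P<year>\d{4}) "
    r"(?P<host>\S+) (?P<module>[A-Z0-9]+)/(?P<severity>\d)/(?P<mnemonic>[^:]+):"
)


def parse_headers(lines):
    """Return one dict per log line that starts with a %@ header.

    Continuation lines (for example the PktType=... line that follows
    a SOFTCAR DROP header) do not match and are skipped.
    """
    events = []
    for line in lines:
        match = HEADER.match(line)
        if match:
            events.append(match.groupdict())
    return events


def summarize(events):
    """Count events per MODULE/MNEMONIC pair."""
    return Counter(f"{e['module']}/{e['mnemonic'].strip()}" for e in events)


# A few lines copied from the excerpt above.
SAMPLE = [
    "%@90519%Jun 29 10:06:48:793 2022 leaf_shigongsi DRVPLAT/4/SOFTCAR DROP: -Slot=2;",
    "PktType=UKNOWN_SMAC, SrcMAC=6c0b-846b-be0a, Dropped from interface=GigabitEthernet2/0/3 at Stage=1, StageCnt=1344, TotalCnt=14945, MaxRateInterface=GigabitEthernet2/0/4.",
    "%@90520%Jun 29 10:06:49:536 2022 leaf_shigongsi OFP/5/OFP_DISCONNECT: Openflow instance 1, controller 2 is disconnected.disconnected reason:Echo timeout.",
    "%@90521%Jun 29 10:06:50:539 2022 leaf_shigongsi OFP/5/OFP_DISCONNECT: Openflow instance 1, controller 1 is disconnected.disconnected reason:Echo timeout.",
    "%@90522%Jun 29 10:06:50:848 2022 leaf_shigongsi OFP/5/OFP_FAIL_OPEN: Openflow instance 1 is in fail secure mode.",
    "%@90526%Jun 29 10:07:06:630 2022 leaf_shigongsi ARP/6/ARP_PKTQUE_ALERT: The current size of the ARP_PKT queue has reached 4244. Please check the network environment.",
]

if __name__ == "__main__":
    for key, count in summarize(parse_headers(SAMPLE)).most_common():
        print(f"{count:3d}  {key}")
```

Run against the full log, a summary like this makes it easier to see the soft-CAR drops, OpenFlow echo timeouts and ARP queue alert clustering in the same minute as the link events.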
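Further inspection of the packet statistics for the ARP module shows a receive rate that is consistently above 200 pps, with occasional bursts above 300 pps: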
[leaf_shigongsi-probe]debug rxtx sof sh sl 1 30
ID Type RcvPps Rcv_All DisPkt_All Pps Dyn Swi Hash Am APps
30 ARP 220 615662430 110942 600 S On SMAC 8 –
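The per-port statistics show that the first few ports carry most of the ARP traffic. The access devices connected to those ports should be checked: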
[leaf_shigongsi-probe]debug rxtx soft 30 portdetail sl 1
Softcar Type ARP PortStatusFetchCnt=18418710
Port Lvl Atk Packet/s DisPkt/s Pack_tol DisP_tol Pps Prop ENum/s Eport
0 0 0 104 0 79584199 0 600 0 0 104 699
1 0 0 24 0 30194771 44 600 0 0 24 997
2 0 0 66 0 115171520 227 600 0 0 66 526
3 0 0 10 0 23947785 0 600 0 0 10 798
4 0 0 6 0 18762585 0 600 0 0 6 748
5 0 0 39 0 44471179 0 600 0 0 39 1084
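The per-port comparison can be made explicit by ranking the ports on their Packet/s value and computing each port's share. This is a small sketch using only the six Packet/s figures shown above; the names ARP_PPS_BY_PORT and rank_ports are hypothetical, and the shares are relative to the listed ports only, not to the whole device.

```python
# Port index -> ARP packets per second, copied from the Packet/s column above.
ARP_PPS_BY_PORT = {0: 104, 1: 24, 2: 66, 3: 10, 4: 6, 5: 39}


def rank_ports(pps_by_port):
    """Return (port, pps, share) tuples sorted by pps, highest first."""
    total = sum(pps_by_port.values())
    ranked = sorted(pps_by_port.items(), key=lambda item: item[1], reverse=True)
    return [
        (port, pps, pps / total if total else 0.0)
        for port, pps in ranked
    ]


if __name__ == "__main__":
    for port, pps, share in rank_ports(ARP_PPS_BY_PORT):
        print(f"port {port}: {pps} pps ({share:.0%} of the listed ports)")
```

With these figures, ports 0 and 2 rank first and second and together account for roughly two-thirds of the ARP rate across the six listed ports, so their downstream devices are the natural place to start.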
===============display lldp neighbor-information list===============
Chassis ID : * -- -- Nearest nontpmr bridge neighbor
# -- -- Nearest customer bridge neighbor
Default -- -- Nearest bridge neighbor
Local Interface Chassis ID Port ID System Name
GE1/0/1 f875-88e3-9e30 XGigabitEthernet1/0/1 2F-S5720
GE1/0/2 f875-88e3-9e30 XGigabitEthernet3/0/1 2F-S5720
GE1/0/3 f875-88e3-9e40 XGigabitEthernet3/0/1 9F-S5720
GE1/0/4 446a-2eae-b6a0 XGigabitEthernet1/0/1 13F-5720
GE1/0/5 446a-2eae-b6a0 XGigabitEthernet3/0/1 13F-5720
GE1/0/6 f875-8868-8390 XGigabitEthernet1/0/1 18F-4320
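Several local ports lead to the same access switch, so grouping the LLDP rows by neighbor gives a shorter list of devices to check. The sketch below is illustrative only: it assumes each data row has exactly four whitespace-separated fields, as in the output above, and the names LLDP_ROWS and group_by_neighbor are hypothetical. The source does not state how the port indexes in the soft-CAR table map to interface names, so no such mapping is assumed here.

```python
from collections import defaultdict

# Data rows copied from the LLDP neighbor list above.
LLDP_ROWS = """\
GE1/0/1 f875-88e3-9e30 XGigabitEthernet1/0/1 2F-S5720
GE1/0/2 f875-88e3-9e30 XGigabitEthernet3/0/1 2F-S5720
GE1/0/3 f875-88e3-9e40 XGigabitEthernet3/0/1 9F-S5720
GE1/0/4 446a-2eae-b6a0 XGigabitEthernet1/0/1 13F-5720
GE1/0/5 446a-2eae-b6a0 XGigabitEthernet3/0/1 13F-5720
GE1/0/6 f875-8868-8390 XGigabitEthernet1/0/1 18F-4320
"""


def group_by_neighbor(text):
    """Map (system name, chassis ID) -> list of local interfaces.

    Lines that do not have exactly four fields (legend, header,
    blank lines) are ignored.
    """
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        local_if, chassis_id, _port_id, system_name = fields
        groups[(system_name, chassis_id)].append(local_if)
    return dict(groups)


if __name__ == "__main__":
    for (name, chassis), ports in group_by_neighbor(LLDP_ROWS).items():
        print(f"{name} ({chassis}): {', '.join(ports)}")
```

For the rows above this yields four access switches (2F-S5720, 9F-S5720, 13F-5720, 18F-4320), two of which are dual-homed to the leaf.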
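After investigation, it was largely confirmed that the recently installed Neiwangtong (内网通) LAN messaging software (similar to FeiQ and ARP-based) was generating large volumes of ARP packets that overwhelmed the device CPU and caused the fault described above.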
1. Investigate the Neiwangtong software on the endpoints and reduce the number of ARP packets it sends.