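Starting at 03:03 on June 15, almost all BGP neighbors suddenly went down. At the same time, the line protocol went down on some aggregation member ports of the subslot 4 subcard, and those ports could no longer be selected. The customer immediately configured peer xxx ignore to isolate the device as a workaround, so the service impact during this period was small.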
%Jun 15 03:03:22:311 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.49.130 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:24:477 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.49.131 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:26:345 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.49.132 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:27:073 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.64.161 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:27:314 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.36 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:27:528 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.40 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:28:428 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.44 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:28:465 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.32 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:28:883 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.56 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:28:983 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.54 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:29:280 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.58 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:29:460 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.50 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:29:622 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.62 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:29:642 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.52 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:30:666 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.48 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:30:738 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.42 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:31:168 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.38 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:31:396 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.46 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:31:814 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.34 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:31:865 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.60 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:04:16:637 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/7 of aggregation group RAGG11 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.
%Jun 15 03:04:16:639 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/7 changed to down.
%Jun 15 03:04:18:316 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/5 of aggregation group RAGG101 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.
%Jun 15 03:04:18:318 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/5 changed to down.
%Jun 15 03:04:24:021 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/1 of aggregation group RAGG101 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.
%Jun 15 03:04:24:023 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/1 changed to down.
%Jun 15 03:04:26:886 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/2 of aggregation group RAGG101 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.
%Jun 15 03:04:26:888 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/2 changed to down.
%Jun 15 03:04:34:602 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/8 of aggregation group RAGG11 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.
%Jun 15 03:04:34:603 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/8 changed to down.
%Jun 15 03:04:34:609 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/3/PHY_UPDOWN: Physical state on the interface Route-Aggregation11 changed to down.
%Jun 15 03:04:34:609 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface Route-Aggregation11 changed to down.
%Jun 15 03:04:42:692 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/6 of aggregation group RAGG101 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.
%Jun 15 03:04:42:694 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/6 changed to down.
%Jun 15 03:04:42:696 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/3/PHY_UPDOWN: Physical state on the interface Route-Aggregation101 changed to down.
%Jun 15 03:04:42:705 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface Route-Aggregation101 changed to down.
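To get a quick overview of a log excerpt like the one above, the BGP_STATE_CHANGED entries can be tallied with a short script. The following is a minimal sketch, assuming the two-line layout shown above (a header line followed by a `BGP.:` detail line); the function name `summarize_bgp_drops` is hypothetical and not part of any device tooling.

```python
import re
from datetime import datetime

# Header line, e.g. "%Jun 15 03:03:22:311 2022 <host> BGP/5/BGP_STATE_CHANGED:"
HEADER_RE = re.compile(
    r"^%(\w{3} \d{1,2} \d{2}:\d{2}:\d{2}):(\d{3}) (\d{4}) \S+ BGP/5/BGP_STATE_CHANGED:"
)
# Detail line, e.g. "BGP.: 100.125.49.130 state has changed from ESTABLISHED to IDLE ..."
DETAIL_RE = re.compile(r"^BGP\.: (\S+) state has changed from (\w+) to (\w+)")


def summarize_bgp_drops(lines):
    """Return (peers, first_ts, last_ts) for ESTABLISHED -> IDLE transitions."""
    peers, stamps = [], []
    pending_ts = None
    for raw in lines:
        line = raw.strip()
        m = HEADER_RE.match(line)
        if m:
            base, millis, year = m.groups()
            pending_ts = datetime.strptime(
                f"{base} {year}", "%b %d %H:%M:%S %Y"
            ).replace(microsecond=int(millis) * 1000)
            continue
        m = DETAIL_RE.match(line)
        if m and pending_ts is not None:
            peer, old, new = m.groups()
            if old == "ESTABLISHED" and new == "IDLE":
                peers.append(peer)
                stamps.append(pending_ts)
            pending_ts = None
    first = min(stamps) if stamps else None
    last = max(stamps) if stamps else None
    return peers, first, last


# Two entries copied from the excerpt above (the first and the last drop).
sample = """\
%Jun 15 03:03:22:311 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.49.130 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
%Jun 15 03:03:31:865 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:
BGP.: 100.125.48.60 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.
""".splitlines()

peers, first, last = summarize_bgp_drops(sample)
print(len(set(peers)), "peers dropped within", (last - first).total_seconds(), "seconds")
```

Fed the full excerpt, a script like this should show all of the listed peers dropping within roughly ten seconds, which is the pattern that points at the local device rather than at any single peer or link.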
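1. A large number of BGP neighbors went down. In the log-info output, every flapping BGP session went down after receiving a 4/0 (Hold Timer Expired) notification from the peer, which means either the peer did not receive our packets or we did not send them. Because so many directly connected BGP neighbors behaved this way, we can basically conclude that our device failed to send them.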
<BJS20_F0103_C01_021_T1_1.80>display bgp peer ipv4 100.125.49.130 log-info
Peer: 100.125.49.130
Date Time State Notification
Error/SubError
15-Jun-2022 03:03:22 Down Receive notification with error 4/0
Hold Timer Expired/ErrSubCode Unspecified
Keepalive last triggered time: 03:03:21-2022.6.15
Keepalive last sent time : 03:03:21-2022.6.15
Update last sent time : 21:12:18-2022.6.14
EPOLLOUT last occurred time : 03:03:17-2022.6.15
14-Jun-2022 01:52:19 Up
<BJS20_F0103_C01_021_T1_1.80>dis bgp peer ipv4 100.125.49.130 log-info
Peer: 100.125.49.130
Date Time State Notification
Error/SubError
15-Jun-2022 03:03:22 Down Receive notification with error 4/0
Hold Timer Expired/ErrSubCode Unspecified
Keepalive last triggered time: 03:03:21-2022.6.15
Keepalive last sent time : 03:03:21-2022.6.15
Update last sent time : 21:12:18-2022.6.14
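The reasoning in item 1 rests on a timing comparison: the platform records a keepalive as sent shortly before the peer reports that its hold timer expired. A minimal sketch of that comparison, assuming the two timestamp formats shown in the log-info output above; the helper name `keepalive_gap_seconds` is hypothetical.

```python
from datetime import datetime


def keepalive_gap_seconds(down_time, keepalive_sent):
    """Seconds between the last keepalive the platform sent and the session going down.

    down_time      -- e.g. "15-Jun-2022 03:03:22" (Date/Time column of log-info)
    keepalive_sent -- e.g. "03:03:21-2022.6.15"   ("Keepalive last sent time" field)
    """
    down = datetime.strptime(down_time, "%d-%b-%Y %H:%M:%S")
    sent = datetime.strptime(keepalive_sent, "%H:%M:%S-%Y.%m.%d")
    return (down - sent).total_seconds()


# Values taken from the log-info output for peer 100.125.49.130.
gap = keepalive_gap_seconds("15-Jun-2022 03:03:22", "03:03:21-2022.6.15")
print(f"keepalive recorded as sent {gap:.0f} s before the Hold Timer Expired notification")
```

A gap this small means the control plane believed it was still sending keepalives when the peer timed out, which is consistent with packets being lost after they left the platform layer.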
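2. The line protocol of the slot 4 aggregation ports went down. The debugging output shows that at the platform level we were both sending and receiving LACP packets, but the LACPDUs returned by the peer carry an all-zero sys-mac in the Partner field, which means the peer was not receiving our packets. Either our lower layer did not actually transmit them or the link is faulty. Because multiple ports on this device show the same symptom, we can basically conclude that our device did not send them out.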
<BJS20_F0103_C01_021_T1_1.80>debugging link-aggregation lacp packet all interface HundredGigE 1/4/1 to HundredGigE 1/4/2
<BJS20_F0103_C01_021_T1_1.80>t d
The current terminal is enabled to display debugging logs.
<BJS20_F0103_C01_021_T1_1.80>t m
The current terminal is enabled to display logs.
<BJS20_F0103_C01_021_T1_1.80>*Jun 15 04:07:20:493 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/7/Packet: PACKET.HundredGigE1/4/1.receive.
size=110, subtype=1, version=1
Actor: type=1, len=20, sys-pri=0x8000, sys-mac=c433-064a-4401, key=0xd351, pri=0x8000, port-index=0x15, state=0x45
Partner: type=2, len=20, sys-pri=0x0, sys-mac=0000-0000-0000, key=0x0, pri=0x0, port-index=0x0, state=0xc5
Collector: type=3, len=16, col-max-delay=0x0
Terminator: type=0, len=0
*Jun 15 04:07:20:493 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/7/Packet: PACKET.HundredGigE1/4/1.send.
size=110, subtype=1, version=1
Actor: type=1, len=20, sys-pri=0x8000, sys-mac=9ce8-95e1-cef2, key=0x1, pri=0x8000, port-index=0x1, state=0xd
Partner: type=2, len=20, sys-pri=0x8000, sys-mac=c433-064a-4401, key=0xd351, pri=0x8000, port-index=0x15, state=0x45
Collector: type=3, len=16, col-max-delay=0x0
Terminator: type=0, len=0
*Jun 15 04:07:20:732 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/7/Packet: PACKET.HundredGigE1/4/2.receive.
size=110, subtype=1, version=1
Actor: type=1, len=20, sys-pri=0x8000, sys-mac=c433-064a-4401, key=0xd351, pri=0x8000, port-index=0x39, state=0x45
Partner: type=2, len=20, sys-pri=0x0, sys-mac=0000-0000-0000, key=0x0, pri=0x0, port-index=0x0, state=0xc5
Collector: type=3, len=16, col-max-delay=0x0
Terminator: type=0, len=0
*Jun 15 04:07:20:732 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/7/Packet: PACKET.HundredGigE1/4/2.send.
size=110, subtype=1, version=1
Actor: type=1, len=20, sys-pri=0x8000, sys-mac=9ce8-95e1-cef2, key=0x1, pri=0x8000, port-index=0x3, state=0xd
Partner: type=2, len=20, sys-pri=0x8000, sys-mac=c433-064a-4401, key=0xd351, pri=0x8000, port-index=0x39, state=0x45
Collector: type=3, len=16, col-max-delay=0x0
Terminator: type=0, len=0
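The `state=` values in the debugging output above can be decoded to back up this reading. The sketch below assumes the standard LACP state octet layout from IEEE 802.1AX (bit 0 is the least significant bit); the function name `decode_lacp_state` is hypothetical.

```python
# Flag names of the LACP state octet, indexed by bit position (bit 0 = LSB),
# following the Actor_State/Partner_State layout in IEEE 802.1AX.
LACP_STATE_FLAGS = [
    "Activity",
    "Timeout",
    "Aggregation",
    "Synchronization",
    "Collecting",
    "Distributing",
    "Defaulted",
    "Expired",
]


def decode_lacp_state(state):
    """Return the names of the flags set in an LACP state octet."""
    return [name for bit, name in enumerate(LACP_STATE_FLAGS) if state & (1 << bit)]


# Values copied from the debugging output above.
for label, value in [
    ("peer Actor (received)", 0x45),
    ("peer's view of us (received Partner)", 0xC5),
    ("local Actor (sent)", 0x0D),
]:
    print(f"{label}: {value:#04x} -> {', '.join(decode_lacp_state(value))}")
```

Decoded this way, the peer's Actor state 0x45 has the Defaulted flag set, meaning the peer is running on default partner information rather than information learned from our LACPDUs. That matches the all-zero Partner sys-mac and supports the conclusion that our LACPDUs were not reaching the peer.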
3. The low-level local logbuffer shows multi-bit parity errors on the device, including MMU-related parity errors, and their timestamps match the time of the fault.
Slot01 Jun 15 2022 03:03:12:413:LINE:5315-TASK:bDPC-FUNC:soc_ser_correction:SER_CORRECTION: reg/mem:1291 btype:19 sblk:2 at:-1 stage:1 addr:0x04300000 port: 0 index: 2032
Slot01 Jun 15 2022 03:03:12:413:LINE:5427-TASK:bDPC-FUNC:soc_ser_correction:mem: 1291=EGR_MAP_MH blkoffset:35
Slot01 Jun 15 2022 03:03:12:413:LINE:5797-TASK:bDPC-FUNC:soc_ser_correction:CACHE_RESTORE: EGR_MAP_MH[1291] blk: epipe0 index: 2032 : [2][4300000]
Slot01 Jun 15 2022 03:03:12:427:LINE:1140-TASK:bRX1-FUNC:DRV_QINQ_GetState:ifindex 13 info get return 0x40010008
Slot01 Jun 15 2022 03:03:12:437:LINE:5116-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Multiple:
Slot01 Jun 15 2022 03:03:12:447:LINE:5112-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Unit: 0
Slot01 Jun 15 2022 03:03:12:448:LINE:5125-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Mem:
Slot01 Jun 15 2022 03:03:12:448:LINE:5131-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Parity error..
Slot01 Jun 15 2022 03:03:12:448:LINE:4770-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:Error in: SOP cell.
Slot01 Jun 15 2022 03:03:12:448:LINE:4801-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:Blk: 2, Pipe: 2, Address: 0x043007f0, base: 0xc, stage: 1, index: 2032
Slot01 Jun 15 2022 03:03:12:448:LINE:4829-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:SER caused packet drop.
Slot01 Jun 15 2022 03:03:12:448:LINE:5315-TASK:bDPC-FUNC:soc_ser_correction:SER_CORRECTION: reg/mem:1291 btype:19 sblk:2 at:-1 stage:1 addr:0x04300000 port: 0 index: 2032
Slot01 Jun 15 2022 03:03:12:448:LINE:5427-TASK:bDPC-FUNC:soc_ser_correction:mem: 1291=EGR_MAP_MH blkoffset:35
Slot01 Jun 15 2022 03:03:12:448:LINE:5797-TASK:bDPC-FUNC:soc_ser_correction:CACHE_RESTORE: EGR_MAP_MH[1291] blk: epipe0 index: 2032 : [2][4300000]
Slot01 Jun 15 2022 03:03:12:451:LINE:1597-TASK:FMCK-FUNC:DRV_DEVM_GetSubSlotFromUnitPort:call DRV_DEVM_GetUserPortFromUnitPort error: uiBID 1,uiUnit 0, uiPort 2
Slot01 Jun 15 2022 03:03:12:451:LINE:7477-TASK:FMCK-FUNC:DRV_DEVM_PortIsPhy82391:DRV_DEVM_PortIsPhy82391 error invalid uiRet=1073807361
Slot01 Jun 15 2022 03:03:12:467:LINE:5112-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Unit: 0
Slot01 Jun 15 2022 03:03:12:467:LINE:5125-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Mem:
Slot01 Jun 15 2022 03:03:12:467:LINE:5131-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Parity error..
Slot01 Jun 15 2022 03:03:12:467:LINE:4770-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:Error in: SOP cell.
Slot01 Jun 15 2022 03:03:12:467:LINE:4801-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:Blk: 2, Pipe: 2, Address: 0x043007f0, base: 0xc, stage: 1, index: 2032
Slot01 Jun 15 2022 03:03:12:467:LINE:4829-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:SER caused packet drop.
Slot01 Jun 15 2022 03:03:12:467:LINE:5315-TASK:bDPC-FUNC:soc_ser_correction:SER_CORRECTION: reg/mem:1291 btype:19 sblk:2 at:-1 stage:1 addr:0x04300000 port: 0 index: 2032
Slot01 Jun 15 2022 03:03:12:467:LINE:5427-TASK:bDPC-FUNC:soc_ser_correction:mem: 1291=EGR_MAP_MH blkoffset:35
Slot01 Jun 15 2022 03:03:12:467:LINE:5797-TASK:bDPC-FUNC:soc_ser_correction:CACHE_RESTORE: EGR_MAP_MH[1291] blk: epipe0 index: 2032 : [2][4300000]
Slot01 Jun 15 2022 03:03:12:468:LINE:5143-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Double or Multiple bit ECC error..
Slot01 Jun 15 2022 03:03:12:478:LINE:3395-TASK:bDPC-FUNC:_soc_tomahawk_ser_process_mmu_err:unit 0 MMU XPE MEM PAR mem error interrupt count: 283
Slot01 Jun 15 2022 03:03:12:478:LINE:3407-TASK:bDPC-FUNC:_soc_tomahawk_ser_process_mmu_err:MMU XPE MEM PAR
Slot01 Jun 15 2022 03:03:12:478:LINE:3417-TASK:bDPC-FUNC:_soc_tomahawk_ser_process_mmu_err:unit 0 MMU XPE MEM PAR multiple parity/2bit error at address 0x129981d6
Slot01 Jun 15 2022 03:03:12:479:LINE:3693-TASK:bDPC-FUNC:_soc_tomahawk_ser_process_mmu_err:unit 0 MMU XPE MEM PAR address 0x129981d6: decoding of address to mem FAILED !!
Slot01 Jun 15 2022 03:03:12:487:LINE:12026-TASK:bDPC-FUNC:soc_th_mmu_non_ser_intr_handler:Unit: 0 -- Could not service DEQ0_NOT_IP_ERR_STAT intr from xpe = 3
Slot01 Jun 15 2022 03:03:12:492:LINE:1597-TASK:FMCK-FUNC:DRV_DEVM_GetSubSlotFromUnitPort:call DRV_DEVM_GetUserPortFromUnitPort error: uiBID 1,uiUnit 0, uiPort 20
Slot01 Jun 15 2022 03:03:12:492:LINE:7477-TASK:FMCK-FUNC:DRV_DEVM_PortIsPhy82391:DRV_DEVM_PortIsPhy82391 error invalid uiRet=1073807361
Slot01 Jun 15 2022 03:03:12:498:LINE:5112-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Unit: 0
Slot01 Jun 15 2022 03:03:12:498:LINE:5116-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Multiple:
Slot01 Jun 15 2022 03:03:12:498:LINE:5125-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Mem:
Slot01 Jun 15 2022 03:03:12:498:LINE:5143-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Double or Multiple bit ECC error..
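When the logbuffer is long, it helps to tally the error types rather than read every line. A minimal sketch, assuming the `SlotNN <date> <time>:LINE:n-TASK:x-FUNC:y:<message>` layout seen above; the category names and the function `tally_ser_events` are hypothetical, and the match strings are taken verbatim from the excerpt.

```python
import re
from collections import Counter

# Matches e.g.
# "Slot01 Jun 15 2022 03:03:12:448:LINE:5131-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Parity error.."
LINE_RE = re.compile(
    r"^Slot(\d+) (\w{3} \d{1,2} \d{4} \d{2}:\d{2}:\d{2}):\d{3}"
    r":LINE:\d+-TASK:(\w+)-FUNC:(\w+):(.*)$"
)

# Category -> substring to look for in the message part (case-sensitive).
PATTERNS = {
    "parity_error": "Parity error",
    "multi_bit_ecc": "Double or Multiple bit ECC error",
    "mmu_multi_parity": "multiple parity/2bit error",
    "ser_packet_drop": "SER caused packet drop",
}


def tally_ser_events(lines):
    """Count logbuffer lines per error category."""
    counts = Counter()
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue
        message = m.group(5)
        for category, needle in PATTERNS.items():
            if needle in message:
                counts[category] += 1
    return counts


# Four lines copied from the excerpt above.
sample = [
    "Slot01 Jun 15 2022 03:03:12:448:LINE:5131-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Parity error..",
    "Slot01 Jun 15 2022 03:03:12:448:LINE:4829-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:SER caused packet drop.",
    "Slot01 Jun 15 2022 03:03:12:468:LINE:5143-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Double or Multiple bit ECC error..",
    "Slot01 Jun 15 2022 03:03:12:478:LINE:3417-TASK:bDPC-FUNC:_soc_tomahawk_ser_process_mmu_err:unit 0 MMU XPE MEM PAR multiple parity/2bit error at address 0x129981d6",
]
print(dict(tally_ser_events(sample)))
```

The counts by themselves do not prove causality; what matters for this case is that the multi-bit and MMU entries cluster at 03:03:12, about ten seconds before the first BGP session dropped.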
4. All ports on the subslot 1 subcard flapped physically, and then recovered at essentially the same time.
Line 3294: Last link flapping: 0 hours 28 minutes 43 seconds
Line 3335: Last link flapping: 0 hours 28 minutes 44 seconds
Line 3376: Last link flapping: 0 hours 28 minutes 44 seconds
Line 3417: Last link flapping: 0 hours 28 minutes 45 seconds
Line 3458: Last link flapping: 0 hours 28 minutes 44 seconds
Line 3499: Last link flapping: 0 hours 28 minutes 45 seconds
Line 3540: Last link flapping: 0 hours 28 minutes 44 seconds
Line 3581: Last link flapping: 0 hours 28 minutes 45 seconds
Line 3622: Last link flapping: 0 hours 28 minutes 44 seconds
Line 3663: Last link flapping: 0 hours 28 minutes 44 seconds
Line 3704: Last link flapping: 0 hours 28 minutes 44 seconds
Line 3745: Last link flapping: 0 hours 28 minutes 44 seconds
Line 3786: Last link flapping: 0 hours 28 minutes 44 seconds
Line 3827: Last link flapping: 0 hours 28 minutes 44 seconds
Line 3868: Last link flapping: 0 hours 28 minutes 43 seconds
Line 3909: Last link flapping: 0 hours 28 minutes 44 seconds
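The claim that the ports recovered at essentially the same time can be checked by comparing the "Last link flapping" ages. A minimal sketch, assuming the grep-style lines shown above; the function names are hypothetical. Each age is relative to the moment its interface was displayed, so a spread of a second or two is expected even for simultaneous events.

```python
import re

FLAP_RE = re.compile(r"Last link flapping:\s*(\d+) hours (\d+) minutes (\d+) seconds")


def flap_ages_seconds(lines):
    """Convert each 'Last link flapping' entry to an age in seconds."""
    ages = []
    for line in lines:
        m = FLAP_RE.search(line)
        if m:
            hours, minutes, seconds = map(int, m.groups())
            ages.append(hours * 3600 + minutes * 60 + seconds)
    return ages


def flap_spread_seconds(lines):
    """Difference between the oldest and newest flap age, or None if no entries."""
    ages = flap_ages_seconds(lines)
    return max(ages) - min(ages) if ages else None


# Two entries copied from the list above (the smallest and the largest age).
sample = [
    "Line 3294: Last link flapping: 0 hours 28 minutes 43 seconds",
    "Line 3417: Last link flapping: 0 hours 28 minutes 45 seconds",
]
print("spread:", flap_spread_seconds(sample), "seconds")
```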
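5. The ports have TPCE counts. In theory, TPCE counts on a device indicate a problem with the switching chip. However, show/c is clear-on-read, and when diagnostics were collected again two hours later there were no new TPCE counts. We suspect this is because the device's services had already been isolated, so there was no traffic to drive the counters up. Nevertheless, nearly all ports with TPCE counts also show PERQ_DROP_PKT_UC counts, and those counters were still increasing at the second collection.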
Line 31537: TPCE_64.ce0 : 6 +6
Line 31609: TPCE_64.ce1 : 14 +14
Line 31678: TPCE_64.ce2 : 10 +10
Line 31857: TPCE_64.ce7 : 8 +8
Line 31950: TPCE_64.xe88 : 3 +3
Line 32057: TPCE_64.xe92 : 2 +2
Line 32111: TPCE_64.xe94 : 5 +5
Line 32166: TPCE_64.xe96 : 2 +2
Line 32218: TPCE_64.xe98 : 2 +2
Line 32272: TPCE_64.xe100 : 2 +2
Line 32375: TPCE_64.xe104 : 1 +1
Line 32479: TPCE_64.xe108 : 5 +5
Line 32585: TPCE_64.xe112 : 3 +3
Line 32695: TPCE_64.xe116 : 2 +2
First diagnostic collection:
Line 31528: PERQ_DROP_PKT_UC(0).ge0: 25,257,318,217 +25,257,318,217
Line 31587: PERQ_DROP_PKT_UC(0).ce0: 6,884,239 +6,884,239
Line 31588: PERQ_DROP_PKT_UC(1).ce0: 115 +115
Line 31589: PERQ_DROP_PKT_UC(2).ce0: 35 +35
Line 31590: PERQ_DROP_PKT_UC(4).ce0: 6,821 +6,821
Line 31591: PERQ_DROP_PKT_UC(6).ce0: 17 +17
Line 31592: PERQ_DROP_PKT_UC(7).ce0: 1,866 +1,866
Line 31655: PERQ_DROP_PKT_UC(0).ce1: 6,850,688 +6,850,688
Line 31656: PERQ_DROP_PKT_UC(1).ce1: 160 +160
Line 31657: PERQ_DROP_PKT_UC(2).ce1: 29 +29
Line 31658: PERQ_DROP_PKT_UC(4).ce1: 5,262 +5,262
Line 31659: PERQ_DROP_PKT_UC(5).ce1: 2 +2
Line 31660: PERQ_DROP_PKT_UC(6).ce1: 10 +10
Line 31661: PERQ_DROP_PKT_UC(7).ce1: 1,708 +1,708
Line 31728: PERQ_DROP_PKT_UC(0).ce2: 6,877,870 +6,877,870
Line 31729: PERQ_DROP_PKT_UC(1).ce2: 188 +188
Line 31730: PERQ_DROP_PKT_UC(2).ce2: 29 +29
Line 31731: PERQ_DROP_PKT_UC(4).ce2: 6,294 +6,294
Line 31732: PERQ_DROP_PKT_UC(5).ce2: 1 +1
Line 31733: PERQ_DROP_PKT_UC(6).ce2: 6 +6
Line 31734: PERQ_DROP_PKT_UC(7).ce2: 1,745 +1,745 1/s
Line 31794: PERQ_DROP_PKT_UC(0).ce5: 225 +225
Line 31795: PERQ_DROP_PKT_UC(7).ce5: 1,551 +1,551
Line 31846: PERQ_DROP_PKT_UC(0).ce6: 260 +260
Line 31847: PERQ_DROP_PKT_UC(7).ce6: 1,402 +1,402 1/s
Line 31910: PERQ_DROP_PKT_UC(0).ce7: 6,887,403 +6,887,403
Line 31911: PERQ_DROP_PKT_UC(1).ce7: 188 +188
Line 31912: PERQ_DROP_PKT_UC(2).ce7: 21 +21
Line 31913: PERQ_DROP_PKT_UC(4).ce7: 5,181 +5,181
Line 31914: PERQ_DROP_PKT_UC(6).ce7: 10 +10
Line 31915: PERQ_DROP_PKT_UC(7).ce7: 1,786 +1,786 1/s
Line 31939: PERQ_DROP_PKT_UC(0).ge1: 3,314,739,183 +3,314,739,183
Line 31994: PERQ_DROP_PKT_UC(0).xe88: 3,745 +3,745
Line 31995: PERQ_DROP_PKT_UC(7).xe88: 785 +785
Line 32046: PERQ_DROP_PKT_UC(0).xe90: 3,284 +3,284
Line 32047: PERQ_DROP_PKT_UC(7).xe90: 759 +759
Line 32100: PERQ_DROP_PKT_UC(0).xe92: 4,204 +4,204
Line 32101: PERQ_DROP_PKT_UC(7).xe92: 718 +718
Line 32156: PERQ_DROP_PKT_UC(0).xe94: 2,525 +2,525
Line 32157: PERQ_DROP_PKT_UC(7).xe94: 715 +715
Line 32208: PERQ_DROP_PKT_UC(0).xe96: 3,273 +3,273
Line 32209: PERQ_DROP_PKT_UC(7).xe96: 701 +701
Line 32261: PERQ_DROP_PKT_UC(0).xe98: 4,497 +4,497
Line 32262: PERQ_DROP_PKT_UC(7).xe98: 738 +738
Line 32314: PERQ_DROP_PKT_UC(0).xe100: 4,315 +4,315
Line 32315: PERQ_DROP_PKT_UC(7).xe100: 706 +706
Line 32365: PERQ_DROP_PKT_UC(0).xe102: 3,581 +3,581
Line 32366: PERQ_DROP_PKT_UC(7).xe102: 765 +765
Line 32417: PERQ_DROP_PKT_UC(0).xe104: 3,495 +3,495
Line 32418: PERQ_DROP_PKT_UC(7).xe104: 752 +752
Line 32468: PERQ_DROP_PKT_UC(0).xe106: 3,819 +3,819
Line 32469: PERQ_DROP_PKT_UC(7).xe106: 715 +715
Line 32522: PERQ_DROP_PKT_UC(0).xe108: 3,502 +3,502
Line 32523: PERQ_DROP_PKT_UC(7).xe108: 724 +724
Line 32574: PERQ_DROP_PKT_UC(0).xe110: 3,538 +3,538
Line 32575: PERQ_DROP_PKT_UC(7).xe110: 731 +731
Line 32631: PERQ_DROP_PKT_UC(0).xe112: 3,545 +3,545
Line 32632: PERQ_DROP_PKT_UC(7).xe112: 777 +777
Line 32684: PERQ_DROP_PKT_UC(0).xe114: 4,334 +4,334
Line 32685: PERQ_DROP_PKT_UC(7).xe114: 698 +698
Line 32739: PERQ_DROP_PKT_UC(0).xe116: 3,762 +3,762
Line 32740: PERQ_DROP_PKT_UC(4).xe116: 2 +2
Line 32741: PERQ_DROP_PKT_UC(7).xe116: 727 +727
Line 32793: PERQ_DROP_PKT_UC(0).xe118: 4,073 +4,073
Line 32794: PERQ_DROP_PKT_UC(7).xe118: 716 +716
Second diagnostic collection:
Line 36237: PERQ_DROP_PKT_UC(7).ce0: 9,320 +1,057
Line 36245: PERQ_DROP_PKT_UC(7).ce1: 9,163 +1,057
Line 36253: PERQ_DROP_PKT_UC(7).ce2: 9,197 +1,056
Line 36267: PERQ_DROP_PKT_UC(7).ce5: 9,006 +1,057 1/s
Line 36281: PERQ_DROP_PKT_UC(7).ce6: 8,856 +1,057
Line 36289: PERQ_DROP_PKT_UC(7).ce7: 9,240 +1,057 1/s
Line 36291: PERQ_DROP_PKT(0).ge1: 253 +33
Line 36302: PERQ_DROP_PKT_UC(7).xe88: 1,759 +138
Line 36313: PERQ_DROP_PKT_UC(7).xe90: 1,735 +137 1/s
Line 36324: PERQ_DROP_PKT_UC(7).xe92: 1,686 +138
Line 36335: PERQ_DROP_PKT_UC(7).xe94: 1,686 +138
Line 36346: PERQ_DROP_PKT_UC(7).xe96: 1,675 +138
Line 36357: PERQ_DROP_PKT_UC(7).xe98: 1,712 +141
Line 36368: PERQ_DROP_PKT_UC(7).xe100: 1,680 +138
Line 36379: PERQ_DROP_PKT_UC(7).xe102: 1,736 +138
Line 36390: PERQ_DROP_PKT_UC(7).xe104: 1,726 +135
Line 36401: PERQ_DROP_PKT_UC(7).xe106: 1,686 +138
Line 36412: PERQ_DROP_PKT_UC(7).xe108: 1,698 +138
Line 36423: PERQ_DROP_PKT_UC(7).xe110: 1,699 +135
Line 36434: PERQ_DROP_PKT_UC(7).xe112: 1,751 +141
Line 36445: PERQ_DROP_PKT_UC(7).xe114: 1,672 +138
Line 36456: PERQ_DROP_PKT_UC(7).xe116: 1,701 +141
Line 36467: PERQ_DROP_PKT_UC(7).xe118: 1,690 +139
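Comparing the two collections by hand is error-prone, so the counter lines can be parsed and compared per port. A minimal sketch, assuming the `name : total +delta` layout shown above (with an optional `Line n:` prefix and an optional trailing rate); the function names `parse_counters` and `ports_with` are hypothetical.

```python
import re

# Matches e.g. "Line 36237: PERQ_DROP_PKT_UC(7).ce0: 9,320 +1,057"
#          and "Line 31537: TPCE_64.ce0 : 6 +6"
COUNTER_RE = re.compile(r"^(?:Line \d+:\s*)?(\S+?)\s*:\s*([\d,]+)\s+\+([\d,]+)")


def parse_counters(lines):
    """Map 'COUNTER.port' -> (total, delta since the previous read)."""
    counters = {}
    for raw in lines:
        m = COUNTER_RE.match(raw.strip())
        if m:
            name, total, delta = m.groups()
            counters[name] = (int(total.replace(",", "")), int(delta.replace(",", "")))
    return counters


def ports_with(counters, prefix):
    """Ports that have a non-zero delta on any counter whose name starts with prefix."""
    ports = set()
    for name, (_total, delta) in counters.items():
        counter, _, port = name.rpartition(".")
        if counter.startswith(prefix) and delta > 0:
            ports.add(port)
    return ports


# A few lines copied from the two collections above.
first = parse_counters([
    "Line 31537: TPCE_64.ce0 : 6 +6",
    "Line 31592: PERQ_DROP_PKT_UC(7).ce0: 1,866 +1,866",
])
second = parse_counters([
    "Line 36237: PERQ_DROP_PKT_UC(7).ce0: 9,320 +1,057",
])

print("TPCE in first collection: ", sorted(ports_with(first, "TPCE")))
print("TPCE in second collection:", sorted(ports_with(second, "TPCE")))
print("Drops still increasing:   ", sorted(ports_with(second, "PERQ_DROP_PKT_UC")))
```

Because the counters are clear-on-read, the `+delta` column is the figure that matters for the second collection: it shows what accumulated since the previous read rather than since boot.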
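In summary:
1. The BGP anomaly and the subslot 4 aggregation protocol down were most likely caused by the MMU XPE MEM PAR multiple parity/2bit error, which disrupted packet transmission on the device.
2. All ports on subslot 1 flapped physically. In theory, software does not cause physical flapping. If there is concern that another flap could affect services, we recommend replacing subslot 1 first.
3. There are TPCE counts, but it is unclear when they were generated, and there are DROP packet losses. The cause may be item 1 above, but a faulty forwarding chip cannot be ruled out. To eliminate the hidden risk, we recommend replacing the hardware.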
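1. The device has multi-bit parity errors and must be rebooted to recover.
2. After the reboot, check whether TPCE counts and DROP packet losses still appear, and whether subslot 1 flaps again. If TPCE counts persist, the forwarding chip is abnormal and we recommend replacing the device.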