
Sudden mass BGP neighbor drops on an S6820-4C at a customer site


Network topology and description

/

Alarm information

/

Problem description

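Starting at 03:03 on June 15, nearly all BGP neighbors suddenly went down. At the same time, some aggregation member ports on the subslot 4 subcard went protocol-down and could not be selected. The customer immediately configured peer xxx ignore on the device to isolate it as a workaround, and the service impact during this period was minor.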

%Jun 15 03:03:22:311 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.49.130 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:24:477 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.49.131 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:26:345 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.49.132 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:27:073 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.64.161 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:27:314 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.36 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:27:528 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.40 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:28:428 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.44 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:28:465 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.32 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:28:883 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.56 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:28:983 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.54 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:29:280 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.58 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:29:460 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.50 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:29:622 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.62 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:29:642 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.52 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:30:666 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.48 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:30:738 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.42 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:31:168 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.38 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:31:396 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.46 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:31:814 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.34 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

%Jun 15 03:03:31:865 2022 BJS20_F0103_C01_021_T1_1.80 BGP/5/BGP_STATE_CHANGED:

 BGP.: 100.125.48.60 state has changed from ESTABLISHED to IDLE for a notification received: Hold Timer Expired/ErrSubCode Unspecified.

 

%Jun 15 03:04:16:637 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/7 of aggregation group RAGG11 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.

%Jun 15 03:04:16:639 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/7 changed to down.

%Jun 15 03:04:18:316 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/5 of aggregation group RAGG101 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.

%Jun 15 03:04:18:318 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/5 changed to down.

%Jun 15 03:04:24:021 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/1 of aggregation group RAGG101 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.

%Jun 15 03:04:24:023 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/1 changed to down.

%Jun 15 03:04:26:886 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/2 of aggregation group RAGG101 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.

%Jun 15 03:04:26:888 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/2 changed to down.

%Jun 15 03:04:34:602 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/8 of aggregation group RAGG11 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.

%Jun 15 03:04:34:603 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/8 changed to down.

%Jun 15 03:04:34:609 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/3/PHY_UPDOWN: Physical state on the interface Route-Aggregation11 changed to down.

%Jun 15 03:04:34:609 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface Route-Aggregation11 changed to down.

%Jun 15 03:04:42:692 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/6/LAGG_INACTIVE_PARTNER: Member port HGE1/4/6 of aggregation group RAGG101 changed to the inactive state, because the aggregation configuration of its peer port is incorrect.

%Jun 15 03:04:42:694 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface HundredGigE1/4/6 changed to down.

%Jun 15 03:04:42:696 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/3/PHY_UPDOWN: Physical state on the interface Route-Aggregation101 changed to down.

%Jun 15 03:04:42:705 2022 BJS20_F0103_C01_021_T1_1.80 IFNET/5/LINK_UPDOWN: Line protocol state on the interface Route-Aggregation101 changed to down.

 

Analysis

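1. A large number of BGP neighbors went down. The log-info output shows that every flapped session was torn down after receiving a NOTIFICATION with error 4/0 (Hold Timer Expired) from the peer, which means either the peer did not receive our packets or we did not send them. Because this happened to a large number of directly connected BGP neighbors at once, we can essentially conclude that our device was not sending the packets out.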

 

<BJS20_F0103_C01_021_T1_1.80>display  bgp peer   ipv4 100.125.49.130 log-info

 Peer: 100.125.49.130

     Date      Time    State Notification

                             Error/SubError

  15-Jun-2022 03:03:22 Down  Receive notification with error 4/0

                             Hold Timer Expired/ErrSubCode Unspecified

                             Keepalive last triggered time: 03:03:21-2022.6.15

                             Keepalive last sent time     : 03:03:21-2022.6.15

                             Update last sent time        : 21:12:18-2022.6.14

                             EPOLLOUT last occurred time  : 03:03:17-2022.6.15

  14-Jun-2022 01:52:19 Up

<BJS20_F0103_C01_021_T1_1.80>dis  bgp peer ipv4 100.125.49.130 log-info

 Peer: 100.125.49.130

     Date      Time    State Notification

                             Error/SubError

  15-Jun-2022 03:03:22 Down  Receive notification with error 4/0

                             Hold Timer Expired/ErrSubCode Unspecified

                             Keepalive last triggered time: 03:03:21-2022.6.15

                             Keepalive last sent time     : 03:03:21-2022.6.15

                             Update last sent time        : 21:12:18-2022.6.14
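The reasoning in step 1 rests on every dropped session sharing the same cause within a few seconds. A minimal Python sketch for checking that against a saved log file follows. The function name `summarize_bgp_drops` and the regular expression are illustrative assumptions based only on the BGP_STATE_CHANGED format quoted above; they are not an H3C tool.

```python
import re
from collections import Counter

# The header and the "BGP.:" message may sit on separate lines in a saved log,
# so \s+ is used to bridge the line break between them.
EVENT_RE = re.compile(
    r"%(?P<ts>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}):\d{3} \d{4} \S+ "
    r"BGP/5/BGP_STATE_CHANGED:\s+"
    r"BGP\.: (?P<peer>[\d.]+) state has changed from (?P<old>\w+) to (?P<new>\w+)"
    r" for (?P<reason>.+?)\.?$",
    re.MULTILINE,
)


def summarize_bgp_drops(log_text):
    """Count BGP state-change events, distinct peers, and drop reasons."""
    events = [m.groupdict() for m in EVENT_RE.finditer(log_text)]
    return {
        "events": len(events),
        "peers": sorted({e["peer"] for e in events}),
        "reasons": Counter(e["reason"] for e in events),
        "first": events[0]["ts"] if events else None,
        "last": events[-1]["ts"] if events else None,
    }
```

If `reasons` holds a single entry (a received Hold Timer Expired notification) and `first` and `last` are only seconds apart, the pattern points at the local device rather than at any single peer or link.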

 

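2. The aggregation ports on slot 4 went protocol-down. The debugging output shows that the platform layer is both sending and receiving LACPDUs, but in the LACPDUs returned by the peer the Partner sys-mac is all zeros, which means the peer has not received our LACPDUs. Either our lower layer did not actually transmit them or the link has a problem. Because several ports on this device show the same symptom, we can essentially conclude that our device was not sending the packets out.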

<BJS20_F0103_C01_021_T1_1.80>debugging   link-aggregation  lacp   packet   all  interface   HundredGigE   1/4/1  to HundredGigE     1/4/2

<BJS20_F0103_C01_021_T1_1.80>t d

The current terminal is enabled to display debugging logs.

<BJS20_F0103_C01_021_T1_1.80>t m

The current terminal is enabled to display logs.

<BJS20_F0103_C01_021_T1_1.80>*Jun 15 04:07:20:493 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/7/Packet: PACKET.HundredGigE1/4/1.receive.

size=110, subtype=1, version=1

Actor: type=1, len=20, sys-pri=0x8000, sys-mac=c433-064a-4401, key=0xd351, pri=0x8000, port-index=0x15, state=0x45

Partner: type=2, len=20, sys-pri=0x0, sys-mac=0000-0000-0000, key=0x0, pri=0x0, port-index=0x0, state=0xc5

Collector: type=3, len=16, col-max-delay=0x0

Terminator: type=0, len=0

 

*Jun 15 04:07:20:493 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/7/Packet: PACKET.HundredGigE1/4/1.send.

size=110, subtype=1, version=1

Actor: type=1, len=20, sys-pri=0x8000, sys-mac=9ce8-95e1-cef2, key=0x1, pri=0x8000, port-index=0x1, state=0xd

Partner: type=2, len=20, sys-pri=0x8000, sys-mac=c433-064a-4401, key=0xd351, pri=0x8000, port-index=0x15, state=0x45

Collector: type=3, len=16, col-max-delay=0x0

Terminator: type=0, len=0

 

*Jun 15 04:07:20:732 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/7/Packet: PACKET.HundredGigE1/4/2.receive.

size=110, subtype=1, version=1

Actor: type=1, len=20, sys-pri=0x8000, sys-mac=c433-064a-4401, key=0xd351, pri=0x8000, port-index=0x39, state=0x45

Partner: type=2, len=20, sys-pri=0x0, sys-mac=0000-0000-0000, key=0x0, pri=0x0, port-index=0x0, state=0xc5

Collector: type=3, len=16, col-max-delay=0x0

Terminator: type=0, len=0

 

*Jun 15 04:07:20:732 2022 BJS20_F0103_C01_021_T1_1.80 LAGG/7/Packet: PACKET.HundredGigE1/4/2.send.

size=110, subtype=1, version=1

Actor: type=1, len=20, sys-pri=0x8000, sys-mac=9ce8-95e1-cef2, key=0x1, pri=0x8000, port-index=0x3, state=0xd

Partner: type=2, len=20, sys-pri=0x8000, sys-mac=c433-064a-4401, key=0xd351, pri=0x8000, port-index=0x39, state=0x45

Collector: type=3, len=16, col-max-delay=0x0

Terminator: type=0, len=0
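The state bytes in the LACPDUs above can be decoded to back up step 2. The sketch below assumes the standard IEEE 802.1AX Actor_State bit order (bit 0 Activity through bit 7 Expired). The helper names `decode_lacp_state` and `peer_has_not_learned_us` are hypothetical and used only for illustration.

```python
# Bit 0 is the least significant bit of the LACP state byte (IEEE 802.1AX).
LACP_STATE_BITS = [
    "Activity",
    "Timeout",
    "Aggregation",
    "Synchronization",
    "Collecting",
    "Distributing",
    "Defaulted",
    "Expired",
]


def decode_lacp_state(state):
    """Return the names of the flags set in an LACP state byte."""
    return [name for bit, name in enumerate(LACP_STATE_BITS) if state & (1 << bit)]


def peer_has_not_learned_us(partner_sys_mac, peer_actor_state):
    """True when a received LACPDU shows the peer still using default partner info.

    partner_sys_mac is the Partner sys-mac field of the received LACPDU,
    peer_actor_state is the Actor state field of the same LACPDU.
    """
    defaulted = bool(peer_actor_state & 0x40)  # bit 6: Defaulted
    return partner_sys_mac == "0000-0000-0000" and defaulted
```

Applied to the capture, the peer's Actor state 0x45 decodes to Activity, Aggregation and Defaulted, and its Partner sys-mac is 0000-0000-0000, so the peer is running on default partner information. That is consistent with the peer never receiving the LACPDUs this device logged as sent.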

 

3. The low-level local logbuffer shows multi-bit parity errors on the device, including MMU-related parity errors, and their timestamps match the time of the fault.

Slot01 Jun 15 2022 03:03:12:413:LINE:5315-TASK:bDPC-FUNC:soc_ser_correction:SER_CORRECTION: reg/mem:1291 btype:19 sblk:2 at:-1 stage:1 addr:0x04300000 port: 0 index: 2032

Slot01 Jun 15 2022 03:03:12:413:LINE:5427-TASK:bDPC-FUNC:soc_ser_correction:mem: 1291=EGR_MAP_MH blkoffset:35

Slot01 Jun 15 2022 03:03:12:413:LINE:5797-TASK:bDPC-FUNC:soc_ser_correction:CACHE_RESTORE: EGR_MAP_MH[1291] blk: epipe0 index: 2032 : [2][4300000]

Slot01 Jun 15 2022 03:03:12:427:LINE:1140-TASK:bRX1-FUNC:DRV_QINQ_GetState:ifindex 13 info get return 0x40010008

Slot01 Jun 15 2022 03:03:12:437:LINE:5116-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Multiple: 

Slot01 Jun 15 2022 03:03:12:447:LINE:5112-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Unit: 0

Slot01 Jun 15 2022 03:03:12:448:LINE:5125-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Mem: 

Slot01 Jun 15 2022 03:03:12:448:LINE:5131-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Parity error..

Slot01 Jun 15 2022 03:03:12:448:LINE:4770-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:Error in: SOP cell.

Slot01 Jun 15 2022 03:03:12:448:LINE:4801-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:Blk: 2, Pipe: 2, Address: 0x043007f0, base: 0xc, stage: 1, index: 2032

Slot01 Jun 15 2022 03:03:12:448:LINE:4829-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:SER caused packet drop.

Slot01 Jun 15 2022 03:03:12:448:LINE:5315-TASK:bDPC-FUNC:soc_ser_correction:SER_CORRECTION: reg/mem:1291 btype:19 sblk:2 at:-1 stage:1 addr:0x04300000 port: 0 index: 2032

Slot01 Jun 15 2022 03:03:12:448:LINE:5427-TASK:bDPC-FUNC:soc_ser_correction:mem: 1291=EGR_MAP_MH blkoffset:35

Slot01 Jun 15 2022 03:03:12:448:LINE:5797-TASK:bDPC-FUNC:soc_ser_correction:CACHE_RESTORE: EGR_MAP_MH[1291] blk: epipe0 index: 2032 : [2][4300000]

Slot01 Jun 15 2022 03:03:12:451:LINE:1597-TASK:FMCK-FUNC:DRV_DEVM_GetSubSlotFromUnitPort:call DRV_DEVM_GetUserPortFromUnitPort error: uiBID 1,uiUnit 0, uiPort 2

Slot01 Jun 15 2022 03:03:12:451:LINE:7477-TASK:FMCK-FUNC:DRV_DEVM_PortIsPhy82391:DRV_DEVM_PortIsPhy82391 error invalid uiRet=1073807361 

Slot01 Jun 15 2022 03:03:12:467:LINE:5112-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Unit: 0

Slot01 Jun 15 2022 03:03:12:467:LINE:5125-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Mem: 

Slot01 Jun 15 2022 03:03:12:467:LINE:5131-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Parity error..

Slot01 Jun 15 2022 03:03:12:467:LINE:4770-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:Error in: SOP cell.

Slot01 Jun 15 2022 03:03:12:467:LINE:4801-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:Blk: 2, Pipe: 2, Address: 0x043007f0, base: 0xc, stage: 1, index: 2032

Slot01 Jun 15 2022 03:03:12:467:LINE:4829-TASK:bDPC-FUNC:_soc_tomahawk_print_ser_fifo_details:SER caused packet drop.

Slot01 Jun 15 2022 03:03:12:467:LINE:5315-TASK:bDPC-FUNC:soc_ser_correction:SER_CORRECTION: reg/mem:1291 btype:19 sblk:2 at:-1 stage:1 addr:0x04300000 port: 0 index: 2032

Slot01 Jun 15 2022 03:03:12:467:LINE:5427-TASK:bDPC-FUNC:soc_ser_correction:mem: 1291=EGR_MAP_MH blkoffset:35

Slot01 Jun 15 2022 03:03:12:467:LINE:5797-TASK:bDPC-FUNC:soc_ser_correction:CACHE_RESTORE: EGR_MAP_MH[1291] blk: epipe0 index: 2032 : [2][4300000]

Slot01 Jun 15 2022 03:03:12:468:LINE:5143-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Double or Multiple bit ECC error..

Slot01 Jun 15 2022 03:03:12:478:LINE:3395-TASK:bDPC-FUNC:_soc_tomahawk_ser_process_mmu_err:unit 0 MMU XPE MEM PAR mem error interrupt count: 283

Slot01 Jun 15 2022 03:03:12:478:LINE:3407-TASK:bDPC-FUNC:_soc_tomahawk_ser_process_mmu_err:MMU XPE MEM PAR

Slot01 Jun 15 2022 03:03:12:478:LINE:3417-TASK:bDPC-FUNC:_soc_tomahawk_ser_process_mmu_err:unit 0 MMU XPE MEM PAR multiple parity/2bit error at address 0x129981d6

Slot01 Jun 15 2022 03:03:12:479:LINE:3693-TASK:bDPC-FUNC:_soc_tomahawk_ser_process_mmu_err:unit 0 MMU XPE MEM PAR address 0x129981d6: decoding of address to mem FAILED !!

Slot01 Jun 15 2022 03:03:12:487:LINE:12026-TASK:bDPC-FUNC:soc_th_mmu_non_ser_intr_handler:Unit: 0 -- Could not service DEQ0_NOT_IP_ERR_STAT intr from xpe = 3

Slot01 Jun 15 2022 03:03:12:492:LINE:1597-TASK:FMCK-FUNC:DRV_DEVM_GetSubSlotFromUnitPort:call DRV_DEVM_GetUserPortFromUnitPort error: uiBID 1,uiUnit 0, uiPort 20

Slot01 Jun 15 2022 03:03:12:492:LINE:7477-TASK:FMCK-FUNC:DRV_DEVM_PortIsPhy82391:DRV_DEVM_PortIsPhy82391 error invalid uiRet=1073807361 

Slot01 Jun 15 2022 03:03:12:498:LINE:5112-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Unit: 0

Slot01 Jun 15 2022 03:03:12:498:LINE:5116-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Multiple: 

Slot01 Jun 15 2022 03:03:12:498:LINE:5125-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Mem: 

Slot01 Jun 15 2022 03:03:12:498:LINE:5143-TASK:bDPC-FUNC:soc_tomahawk_process_ser_fifo:Double or Multiple bit ECC error..
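To line the logbuffer entries up with the fault time in step 3, the lines can be scanned for the error signatures they contain. This is a rough sketch that assumes the line layout shown above (slot, date, time, LINE, TASK, FUNC, message). The labels in `SIGNATURES` and the function name `scan_ser_log` are illustrative choices, not driver terminology.

```python
import re
from collections import Counter

LOG_RE = re.compile(
    r"^(?P<slot>Slot\d+) (?P<date>\w{3} \d{1,2} \d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}):(?P<ms>\d{3})"
    r":LINE:(?P<line>\d+)-TASK:(?P<task>\w+)-FUNC:(?P<func>\w+):(?P<msg>.*)$",
    re.MULTILINE,
)

# Substrings copied from the messages in the logbuffer excerpt above.
SIGNATURES = {
    "parity": "Parity error",
    "multi_bit_ecc": "Double or Multiple bit ECC error",
    "mmu_multi_parity": "multiple parity/2bit error",
    "packet_drop": "SER caused packet drop",
}


def scan_ser_log(text):
    """Count each signature and record the first timestamp at which it appears."""
    hits = Counter()
    first_seen = {}
    for m in LOG_RE.finditer(text):
        for label, needle in SIGNATURES.items():
            if needle in m["msg"]:
                hits[label] += 1
                first_seen.setdefault(label, f'{m["date"]} {m["time"]}.{m["ms"]}')
    return hits, first_seen
```

In this case the first parity entries are stamped 03:03:12, about ten seconds before the first BGP drop at 03:03:22, which is the correlation the analysis relies on.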

 

4. All ports on the subslot 1 subcard flapped physically, then recovered at essentially the same time.

Line 3294: Last link flapping: 0 hours 28 minutes 43 seconds

Line 3335: Last link flapping: 0 hours 28 minutes 44 seconds

Line 3376: Last link flapping: 0 hours 28 minutes 44 seconds

Line 3417: Last link flapping: 0 hours 28 minutes 45 seconds

Line 3458: Last link flapping: 0 hours 28 minutes 44 seconds

Line 3499: Last link flapping: 0 hours 28 minutes 45 seconds

Line 3540: Last link flapping: 0 hours 28 minutes 44 seconds

Line 3581: Last link flapping: 0 hours 28 minutes 45 seconds

Line 3622: Last link flapping: 0 hours 28 minutes 44 seconds

Line 3663: Last link flapping: 0 hours 28 minutes 44 seconds

Line 3704: Last link flapping: 0 hours 28 minutes 44 seconds

Line 3745: Last link flapping: 0 hours 28 minutes 44 seconds

Line 3786: Last link flapping: 0 hours 28 minutes 44 seconds

Line 3827: Last link flapping: 0 hours 28 minutes 44 seconds

Line 3868: Last link flapping: 0 hours 28 minutes 43 seconds

Line 3909: Last link flapping: 0 hours 28 minutes 44 seconds
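The claim in step 4 that the ports recovered together can be checked by converting each "Last link flapping" age to seconds and looking at the spread. A small sketch, assuming the field format shown above; the 5-second tolerance and the function names are arbitrary illustrative choices.

```python
import re

FLAP_RE = re.compile(r"Last link flapping: (\d+) hours (\d+) minutes (\d+) seconds")


def flap_ages_seconds(text):
    """Return the age of the last flap on each port, in seconds."""
    return [int(h) * 3600 + int(m) * 60 + int(s) for h, m, s in FLAP_RE.findall(text)]


def flapped_together(text, tolerance_s=5):
    """True when all ports last flapped within tolerance_s seconds of each other."""
    ages = flap_ages_seconds(text)
    return bool(ages) and max(ages) - min(ages) <= tolerance_s
```

The ages listed above range from 28 minutes 43 seconds to 28 minutes 45 seconds, a spread of two seconds, which fits a single event affecting the whole subcard rather than independent per-port link problems.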

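5. The ports show TPCE counts. In theory, a TPCE count means the switching chip has a problem. However, show c is clear-on-read, and when diagnostics were collected again two hours later there were no new TPCE counts. We suspect this is because the device's services had already been isolated, so with no traffic the counters did not grow further. Nearly every port with TPCE counts also shows PERQ_DROP_PKT_UC counts, and these were still increasing at the second collection.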

 

Line 31537: TPCE_64.ce0       :                     6                  +6

Line 31609: TPCE_64.ce1       :                    14                 +14

Line 31678: TPCE_64.ce2       :                    10                 +10

Line 31857: TPCE_64.ce7       :                     8                  +8

Line 31950: TPCE_64.xe88      :                     3                  +3

Line 32057: TPCE_64.xe92      :                     2                  +2

Line 32111: TPCE_64.xe94      :                     5                  +5

Line 32166: TPCE_64.xe96      :                     2                  +2

Line 32218: TPCE_64.xe98      :                     2                  +2

Line 32272: TPCE_64.xe100     :                     2                  +2

Line 32375: TPCE_64.xe104     :                     1                  +1

Line 32479: TPCE_64.xe108     :                     5                  +5

Line 32585: TPCE_64.xe112     :                     3                  +3

Line 32695: TPCE_64.xe116     :                     2                  +2

 

First diagnostic collection:

Line 31528: PERQ_DROP_PKT_UC(0).ge0:        25,257,318,217     +25,257,318,217

Line 31587: PERQ_DROP_PKT_UC(0).ce0:             6,884,239          +6,884,239

Line 31588: PERQ_DROP_PKT_UC(1).ce0:                   115                +115

Line 31589: PERQ_DROP_PKT_UC(2).ce0:                    35                 +35

Line 31590: PERQ_DROP_PKT_UC(4).ce0:                 6,821              +6,821

Line 31591: PERQ_DROP_PKT_UC(6).ce0:                    17                 +17

Line 31592: PERQ_DROP_PKT_UC(7).ce0:                 1,866              +1,866

Line 31655: PERQ_DROP_PKT_UC(0).ce1:             6,850,688          +6,850,688

Line 31656: PERQ_DROP_PKT_UC(1).ce1:                   160                +160

Line 31657: PERQ_DROP_PKT_UC(2).ce1:                    29                 +29

Line 31658: PERQ_DROP_PKT_UC(4).ce1:                 5,262              +5,262

Line 31659: PERQ_DROP_PKT_UC(5).ce1:                     2                  +2

Line 31660: PERQ_DROP_PKT_UC(6).ce1:                    10                 +10

Line 31661: PERQ_DROP_PKT_UC(7).ce1:                 1,708              +1,708

Line 31728: PERQ_DROP_PKT_UC(0).ce2:             6,877,870          +6,877,870

Line 31729: PERQ_DROP_PKT_UC(1).ce2:                   188                +188

Line 31730: PERQ_DROP_PKT_UC(2).ce2:                    29                 +29

Line 31731: PERQ_DROP_PKT_UC(4).ce2:                 6,294              +6,294

Line 31732: PERQ_DROP_PKT_UC(5).ce2:                     1                  +1

Line 31733: PERQ_DROP_PKT_UC(6).ce2:                     6                  +6

Line 31734: PERQ_DROP_PKT_UC(7).ce2:                 1,745              +1,745               1/s

Line 31794: PERQ_DROP_PKT_UC(0).ce5:                   225                +225

Line 31795: PERQ_DROP_PKT_UC(7).ce5:                 1,551              +1,551

Line 31846: PERQ_DROP_PKT_UC(0).ce6:                   260                +260

Line 31847: PERQ_DROP_PKT_UC(7).ce6:                 1,402              +1,402               1/s

Line 31910: PERQ_DROP_PKT_UC(0).ce7:             6,887,403          +6,887,403

Line 31911: PERQ_DROP_PKT_UC(1).ce7:                   188                +188

Line 31912: PERQ_DROP_PKT_UC(2).ce7:                    21                 +21

Line 31913: PERQ_DROP_PKT_UC(4).ce7:                 5,181              +5,181

Line 31914: PERQ_DROP_PKT_UC(6).ce7:                    10                 +10

Line 31915: PERQ_DROP_PKT_UC(7).ce7:                 1,786              +1,786               1/s

Line 31939: PERQ_DROP_PKT_UC(0).ge1:         3,314,739,183      +3,314,739,183

Line 31994: PERQ_DROP_PKT_UC(0).xe88:                 3,745              +3,745

Line 31995: PERQ_DROP_PKT_UC(7).xe88:                   785                +785

Line 32046: PERQ_DROP_PKT_UC(0).xe90:                 3,284              +3,284

Line 32047: PERQ_DROP_PKT_UC(7).xe90:                   759                +759

Line 32100: PERQ_DROP_PKT_UC(0).xe92:                 4,204              +4,204

Line 32101: PERQ_DROP_PKT_UC(7).xe92:                   718                +718

Line 32156: PERQ_DROP_PKT_UC(0).xe94:                 2,525              +2,525

Line 32157: PERQ_DROP_PKT_UC(7).xe94:                   715                +715

Line 32208: PERQ_DROP_PKT_UC(0).xe96:                 3,273              +3,273

Line 32209: PERQ_DROP_PKT_UC(7).xe96:                   701                +701

Line 32261: PERQ_DROP_PKT_UC(0).xe98:                 4,497              +4,497

Line 32262: PERQ_DROP_PKT_UC(7).xe98:                   738                +738

Line 32314: PERQ_DROP_PKT_UC(0).xe100:                 4,315              +4,315

Line 32315: PERQ_DROP_PKT_UC(7).xe100:                   706                +706

Line 32365: PERQ_DROP_PKT_UC(0).xe102:                 3,581              +3,581

Line 32366: PERQ_DROP_PKT_UC(7).xe102:                   765                +765

Line 32417: PERQ_DROP_PKT_UC(0).xe104:                 3,495              +3,495

Line 32418: PERQ_DROP_PKT_UC(7).xe104:                   752                +752

Line 32468: PERQ_DROP_PKT_UC(0).xe106:                 3,819              +3,819

Line 32469: PERQ_DROP_PKT_UC(7).xe106:                   715                +715

Line 32522: PERQ_DROP_PKT_UC(0).xe108:                 3,502              +3,502

Line 32523: PERQ_DROP_PKT_UC(7).xe108:                   724                +724

Line 32574: PERQ_DROP_PKT_UC(0).xe110:                 3,538              +3,538

Line 32575: PERQ_DROP_PKT_UC(7).xe110:                   731                +731

Line 32631: PERQ_DROP_PKT_UC(0).xe112:                 3,545              +3,545

Line 32632: PERQ_DROP_PKT_UC(7).xe112:                   777                +777

Line 32684: PERQ_DROP_PKT_UC(0).xe114:                 4,334              +4,334

Line 32685: PERQ_DROP_PKT_UC(7).xe114:                   698                +698

Line 32739: PERQ_DROP_PKT_UC(0).xe116:                 3,762              +3,762

Line 32740: PERQ_DROP_PKT_UC(4).xe116:                     2                  +2

Line 32741: PERQ_DROP_PKT_UC(7).xe116:                   727                +727

Line 32793: PERQ_DROP_PKT_UC(0).xe118:                 4,073              +4,073

Line 32794: PERQ_DROP_PKT_UC(7).xe118:                   716                +716

Second diagnostic collection:

Line 36237: PERQ_DROP_PKT_UC(7).ce0:                 9,320              +1,057

Line 36245: PERQ_DROP_PKT_UC(7).ce1:                 9,163              +1,057

Line 36253: PERQ_DROP_PKT_UC(7).ce2:                 9,197              +1,056

Line 36267: PERQ_DROP_PKT_UC(7).ce5:                 9,006              +1,057               1/s

Line 36281: PERQ_DROP_PKT_UC(7).ce6:                 8,856              +1,057

Line 36289: PERQ_DROP_PKT_UC(7).ce7:                 9,240              +1,057               1/s

Line 36291: PERQ_DROP_PKT(0).ge1:                   253                 +33

Line 36302: PERQ_DROP_PKT_UC(7).xe88:                 1,759                +138

Line 36313: PERQ_DROP_PKT_UC(7).xe90:                 1,735                +137               1/s

Line 36324: PERQ_DROP_PKT_UC(7).xe92:                 1,686                +138

Line 36335: PERQ_DROP_PKT_UC(7).xe94:                 1,686                +138

Line 36346: PERQ_DROP_PKT_UC(7).xe96:                 1,675                +138

Line 36357: PERQ_DROP_PKT_UC(7).xe98:                 1,712                +141

Line 36368: PERQ_DROP_PKT_UC(7).xe100:                 1,680                +138

Line 36379: PERQ_DROP_PKT_UC(7).xe102:                 1,736                +138

Line 36390: PERQ_DROP_PKT_UC(7).xe104:                 1,726                +135

Line 36401: PERQ_DROP_PKT_UC(7).xe106:                 1,686                +138

Line 36412: PERQ_DROP_PKT_UC(7).xe108:                 1,698                +138

Line 36423: PERQ_DROP_PKT_UC(7).xe110:                 1,699                +135

Line 36434: PERQ_DROP_PKT_UC(7).xe112:                 1,751                +141

Line 36445: PERQ_DROP_PKT_UC(7).xe114:                 1,672                +138

Line 36456: PERQ_DROP_PKT_UC(7).xe116:                 1,701                +141

Line 36467: PERQ_DROP_PKT_UC(7).xe118:                 1,690                +139
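Comparing the two diagnostic collections by hand is error-prone, so the counter lines can be parsed and compared. The sketch below assumes that the first numeric column is the running total and the "+" column is the increase since the previous read, which matches the clear-on-read behaviour described in step 5. The names `parse_counters` and `compare_collections` are hypothetical.

```python
import re

COUNTER_RE = re.compile(
    r"(?P<name>[A-Z][A-Z0-9_]*)(?:\((?P<queue>\d+)\))?\.(?P<port>[a-z]+\d+)\s*:"
    r"\s*(?P<total>[\d,]+)\s+\+(?P<delta>[\d,]+)"
)


def parse_counters(text):
    """Map (counter name, queue or None, port) to (total, delta since last read)."""
    counters = {}
    for m in COUNTER_RE.finditer(text):
        queue = int(m["queue"]) if m["queue"] is not None else None
        key = (m["name"], queue, m["port"])
        total = int(m["total"].replace(",", ""))
        delta = int(m["delta"].replace(",", ""))
        counters[key] = (total, delta)
    return counters


def compare_collections(first, second, name):
    """Split one counter family into keys still growing and keys gone quiet."""
    growing = {k for k, (_, delta) in second.items() if k[0] == name and delta > 0}
    quiet = {k for k in first if k[0] == name and k not in second}
    return growing, quiet
```

Run against the two collections above, `TPCE_64` entries would land in `quiet` (no new counts in the second collection), while `PERQ_DROP_PKT_UC` queue 7 entries would land in `growing`, which is the pattern step 5 describes.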

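In summary:

1. The BGP failures and the aggregation protocol-down on subslot 4 were most likely caused by the MMU XPE MEM PAR multiple parity/2bit error, which disrupted packet transmission on the device.

2. All ports on subslot 1 flapped physically. In theory, software does not cause physical flapping. If further flapping is a concern for services, we recommend replacing subslot 1 first.

3. It is not certain when the TPCE counts were generated. There are DROP packet losses, which may be a result of item 1 above, but a fault in the forwarding chip cannot be ruled out. To eliminate the risk, we recommend replacing the hardware.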

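Solution

1. The multi-bit parity errors on the device require a reboot to recover.

2. After the reboot, check whether TPCE counts and DROP packet losses still occur, and whether subslot 1 still flaps. If TPCE counts persist, the forwarding chip is faulty and we recommend replacing the device.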

The author revised this case on 2022-10-09.