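An on-site S12600-08-G device suddenly began reporting frequent internal HG (HiGig) port down/up alarms on the boards in slot 2 and slot 10. At the same time, the boards in slot 2, slot 3, and slot 6 reported: An error occurred on the data channel between switch chips
The alarms are as follows: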
%Jan 20 09:23:45:021 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=67, PhysicalName=Board 2, RelativeResource=(bustype:DEV LINK,sourceport:(chiptype:5,slot:2,chipid:2,portid:30),destination port:(chiptype:5,slot:10,chipid:0,portid:6)), ErrorCode=481001, Reason=HiGig link went down.)
%Jan 20 09:23:45:122 2026S12608G DEV/2/INTERNALLINK_ALARM_CLEAR: Internal link alarm cleared. (PhysicalIndex=67, PhysicalName=Board 2, RelativeResource=(bustype:DEV LINK,sourceport:(chiptype:5,slot:2,chipid:2,portid:30),destination port:(chiptype:5,slot:10,chipid:0,portid:6)), ErrorCode=481001, Reason=HiGig link came up.)
%Jan 20 09:23:49:822 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=67, PhysicalName=Board 2, RelativeResource=(bustype:DEV LINK,sourceport:(chiptype:5,slot:2,chipid:2,portid:30),destination port:(chiptype:5,slot:10,chipid:0,portid:6)), ErrorCode=481001, Reason=HiGig link went down.)
%Jan 20 09:23:49:924 2026S12608G DEV/2/INTERNALLINK_ALARM_CLEAR: Internal link alarm cleared. (PhysicalIndex=67, PhysicalName=Board 2, RelativeResource=(bustype:DEV LINK,sourceport:(chiptype:5,slot:2,chipid:2,portid:30),destination port:(chiptype:5,slot:10,chipid:0,portid:6)), ErrorCode=481001, Reason=HiGig link came up.)
%Jan 20 09:23:55:774 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=75, PhysicalName=Board 10, RelativeResource=(bustype:DEV LINK,sourceport:(chiptype:5,slot:10,chipid:0,portid:6),destination port:(chiptype:5,slot:2,chipid:2,portid:30)), ErrorCode=481001, Reason=HiGig link went down.)
%Jan 20 09:28:43:195 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=71, PhysicalName=Board 6, RelativeResource=(bustype:data channel,sourceport:(chiptype:switch,slot:2.2,chipid:2,portid:30),destination port:(chiptype:switch,slot:6,chipid:0)), ErrorCode=473002, Reason=An error occurred on the data channel between switch chips.)
%Jan 20 09:28:43:197 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=67, PhysicalName=Board 2, RelativeResource=(bustype:data channel,sourceport:(chiptype:switch,slot:2.2,chipid:2,portid:30),destination port:(chiptype:switch,slot:6,chipid:0)), ErrorCode=473002, Reason=An error occurred on the data channel between switch chips.)
%Jan 20 09:28:43:199 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=71, PhysicalName=Board 6, RelativeResource=(bustype:data channel,sourceport:(chiptype:switch,slot:2.2,chipid:2,portid:30),destination port:(chiptype:switch,slot:6,chipid:1)), ErrorCode=473002, Reason=An error occurred on the data channel between switch chips.)
%Jan 20 09:28:43:201 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=67, PhysicalName=Board 2, RelativeResource=(bustype:data channel,sourceport:(chiptype:switch,slot:2.2,chipid:2,portid:30),destination port:(chiptype:switch,slot:6,chipid:1)), ErrorCode=473002, Reason=An error occurred on the data channel between switch chips.)
%Jan 20 09:28:43:384 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=68, PhysicalName=Board 3, RelativeResource=(bustype:data channel,sourceport:(chiptype:switch,slot:2.2,chipid:2,portid:30),destination port:(chiptype:switch,slot:3,chipid:0)), ErrorCode=473002, Reason=An error occurred on the data channel between switch chips.)
%Jan 20 09:28:43:386 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=67, PhysicalName=Board 2, RelativeResource=(bustype:data channel,sourceport:(chiptype:switch,slot:2.2,chipid:2,portid:30),destination port:(chiptype:switch,slot:3,chipid:0)), ErrorCode=473002, Reason=An error occurred on the data channel between switch chips.
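1. The HG link between the service board in slot 2 and the fabric board in slot 10 flaps up and down frequently. Service detection packets traverse the flapping internal port, so the boards in slot 2, slot 3, and slot 6 also report service-check anomaly alarms.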
%@60495%Jan 20 09:05:17:188 2026S12608G DEV/2/INTERNALLINK_ALARM_CLEAR: Internal link alarm cleared. (PhysicalIndex=75, PhysicalName=Board 10, RelativeResource=(bustype:DEV LINK,sourceport:(chiptype:5,slot:10,chipid:0,portid:6),destination port:(chiptype:5,slot:2,chipid:2,portid:30)), ErrorCode=481001, Reason=HiGig link came up.)
%@60496%Jan 20 09:05:18:129 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=75, PhysicalName=Board 10, RelativeResource=(bustype:DEV LINK,sourceport:(chiptype:5,slot:10,chipid:0,portid:6),destination port:(chiptype:5,slot:2,chipid:2,portid:30)), ErrorCode=481001, Reason=HiGig link went down.)
%@60497%Jan 20 09:05:18:230 2026S12608G DEV/2/INTERNALLINK_ALARM_CLEAR: Internal link alarm cleared. (PhysicalIndex=75, PhysicalName=Board 10, RelativeResource=(bustype:DEV LINK,sourceport:(chiptype:5,slot:10,chipid:0,portid:6),destination port:(chiptype:5,slot:2,chipid:2,portid:30)), ErrorCode=481001, Reason=HiGig link came up.)
%@60498%Jan 20 09:05:18:250 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=75, PhysicalName=Board 10, RelativeResource=(bustype:DEV LINK,sourceport:(chiptype:5,slot:10,chipid:0,portid:6),destination port:(chiptype:5,slot:2,chipid:2,portid:30)), ErrorCode=481002, Reason=HiGig link flapped.)
%@60499%Jan 20 09:05:19:271 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: Internal link alarm occurred. (PhysicalIndex=75, PhysicalName=Board 10, RelativeResource=(bustype:DEV LINK,sourceport:(chiptype:5,slot:10,chipid:0,portid:6),destination port:(chiptype:5,slot:2,chipid:2,portid:30)), ErrorCode=481001, Reason=HiGig link went down.)
%@60500%Jan 20 09:05:19:374 2026S12608G DEV/2/INTERNALLINK_ALARM_CLEAR: Internal link alarm cleared. (PhysicalIndex=75, PhysicalName=Board 10, RelativeResource=(bustype:DEV LINK,sourceport:(chiptype:5,slot:10,chipid:0,portid:6),destination port:(chiptype:5,slot:2,chipid:2,portid:30)), ErrorCode=481001, Reason=HiGig link came up.)
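To confirm flapping from a saved log buffer rather than by eye, the alarm lines can be tallied per internal port. The following is a minimal sketch, not a vendor tool: the function name `count_higig_events` and the regular expression are assumptions based only on the log format shown in this case, and it counts only ErrorCode 481001 down/up events.

```python
import re
from collections import Counter

# Assumed pattern, derived from the INTERNALLINK_ALARM lines in this case.
ALARM_RE = re.compile(
    r"sourceport:\(chiptype:\w+,slot:(?P<slot>[\d.]+),"
    r"chipid:(?P<chip>\d+),portid:(?P<port>\d+)\)"
    r".*?ErrorCode=(?P<code>\d+), Reason=(?P<reason>[^)]*)"
)

def count_higig_events(lines):
    """Count HiGig down/up events per (slot, chipid, portid, state)."""
    counts = Counter()
    for line in lines:
        m = ALARM_RE.search(line)
        if not m or m.group("code") != "481001":
            continue  # skip non-matching lines and other error codes
        reason = m.group("reason")
        if "went down" in reason:
            state = "down"
        elif "came up" in reason:
            state = "up"
        else:
            continue
        counts[(m.group("slot"), m.group("chip"), m.group("port"), state)] += 1
    return counts

# Four lines copied from the log above (sequence numbers 60496 to 60499).
_RES = ("RelativeResource=(bustype:DEV LINK,"
        "sourceport:(chiptype:5,slot:10,chipid:0,portid:6),"
        "destination port:(chiptype:5,slot:2,chipid:2,portid:30))")
sample = [
    "%@60496%Jan 20 09:05:18:129 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: "
    "Internal link alarm occurred. (PhysicalIndex=75, PhysicalName=Board 10, "
    + _RES + ", ErrorCode=481001, Reason=HiGig link went down.)",
    "%@60497%Jan 20 09:05:18:230 2026S12608G DEV/2/INTERNALLINK_ALARM_CLEAR: "
    "Internal link alarm cleared. (PhysicalIndex=75, PhysicalName=Board 10, "
    + _RES + ", ErrorCode=481001, Reason=HiGig link came up.)",
    "%@60498%Jan 20 09:05:18:250 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: "
    "Internal link alarm occurred. (PhysicalIndex=75, PhysicalName=Board 10, "
    + _RES + ", ErrorCode=481002, Reason=HiGig link flapped.)",
    "%@60499%Jan 20 09:05:19:271 2026S12608G DEV/2/INTERNALLINK_ALARM_OCCUR: "
    "Internal link alarm occurred. (PhysicalIndex=75, PhysicalName=Board 10, "
    + _RES + ", ErrorCode=481001, Reason=HiGig link went down.)",
]
counts = count_higig_events(sample)
for key, n in sorted(counts.items()):
    print(key, n)
```

A port whose down and up counts keep growing within a short window is flapping. If the same physical link shows up from both ends (slot 2 port 30 and slot 10 port 6 here), the fault is on that one link rather than spread across the chassis.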
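2. The next step is therefore to find out why the HG link between slot 2 and the fabric board in slot 10 goes down and up. The following information was collected again. The service board in slot 2 shows a large number of uncorrectable FEC errors. The counters were collected three times, and new errors kept appearing after each read-and-clear: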
[S12608G-probe]dis hardware internal port hg-monitor slot 2
Fec Counter Record:
[uiLlogicport] [Correct] [Uncorrect] [Clock] [Number]
==================================================================================
UpLinkPort_274 21669 391 03:58:32:711797 01/20/2026 1
UpLinkPort_274 1177 101 03:59:15:170891 01/20/2026 2
UpLinkPort_274 61202 1337 04:03:17:873256 01/20/2026 3
UpLinkPort_274 134 11 04:06:45:534417 01/20/2026 4
The fabric board side is normal:
[S12608G-probe]dis hardware internal port hg-monitor slot 10
Fec Counter Record:
[uiLlogicport] [Correct] [Uncorrect] [Clock] [Number]
==================================================================================
UpLinkPort_267 0 0 03:59:19:576707 01/20/2026 1
UpLinkPort_267 0 0 04:03:22:848997 01/20/2026 2
UpLinkPort_267 0 0 04:06:49:787219 01/20/2026 3
Based on the FEC counters, the problem is attributed to the service board in slot 2.
Resolution: replace the service board in slot 2.
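The side-by-side FEC comparison used above can be sketched in a few lines. This is an illustrative helper under stated assumptions, not part of the device software: the function names are hypothetical, and the parser assumes each record has the six whitespace-separated fields seen in the `hg-monitor` output of this case.

```python
def parse_fec_records(text):
    """Return (port, correctable, uncorrectable) tuples from hg-monitor output."""
    records = []
    for line in text.splitlines():
        parts = line.split()
        # Expected fields: port, Correct, Uncorrect, Clock, date, Number
        if len(parts) == 6 and parts[0].startswith("UpLinkPort_"):
            records.append((parts[0], int(parts[1]), int(parts[2])))
    return records

def total_uncorrectable(records):
    """Sum the uncorrectable FEC count over all records."""
    return sum(uncorrect for _, _, uncorrect in records)

# Records copied from the outputs collected in this case.
slot2_output = """\
UpLinkPort_274 21669 391 03:58:32:711797 01/20/2026 1
UpLinkPort_274 1177 101 03:59:15:170891 01/20/2026 2
UpLinkPort_274 61202 1337 04:03:17:873256 01/20/2026 3
UpLinkPort_274 134 11 04:06:45:534417 01/20/2026 4
"""
slot10_output = """\
UpLinkPort_267 0 0 03:59:19:576707 01/20/2026 1
UpLinkPort_267 0 0 04:03:22:848997 01/20/2026 2
UpLinkPort_267 0 0 04:06:49:787219 01/20/2026 3
"""

slot2_total = total_uncorrectable(parse_fec_records(slot2_output))
slot10_total = total_uncorrectable(parse_fec_records(slot10_output))
print("slot 2 uncorrectable:", slot2_total)
print("slot 10 uncorrectable:", slot10_total)
```

In this case only the slot 2 end accumulates uncorrectable errors while the slot 10 end stays at zero, which is what points to the service board. Treat this as a rule of thumb taken from this case, not a general diagnostic guarantee.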
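Note: for similar problems on devices built on domestic switch chips, collect the output of the commands below. The HG up/down flapping between slot 2 and slot 10 in this case is used as the example:
1. Run the following commands once every 3 minutes, three times in total: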
sys
prob
dis clock
dis hardware internal port hg-monitor slot 2
dis hardware internal port hg-monitor slot 10
2. Determine the internal interconnect port mapping, then collect information for those internal ports:
====display devm hgport chassis 0 slot 2====
(slot 2, slot 10):
Slot 2 connect Slot 10
Lindex (Lchip, Lport, Gport) | Lindex (Lchip, Lport, Gport)
274 (2 , 30 ,0xe1e ) | 267 (0 , 6 ,0x1e06)
sys
prob
sdk slot 2 sdk
sdk slot 2 enter/lchip/2
sdk slot 2 show/port/0xe1e/all
sdk slot 2 port/0xe1e/self-checking
After collecting the information for that slot, return to system view and enter probe view again to collect the other end:
sys
prob
sdk slot 10 sdk
sdk slot 10 enter/lchip/0
sdk slot 10 show/port/0x1e06/all
sdk slot 10 port/0x1e06/self-checking
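The two command sequences above differ only in slot, Lchip, and Gport, all of which come from the `display devm hgport` mapping row. The sketch below derives them from that row. The helper names `parse_hgport_row` and `build_sdk_commands` are hypothetical, and the row layout `Lindex (Lchip, Lport, Gport)` is assumed from the single output shown in this case.

```python
import re

# Matches one "Lindex (Lchip, Lport, Gport)" group, tolerating extra spaces.
ENTRY_RE = re.compile(
    r"(\d+)\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(0x[0-9a-fA-F]+)\s*\)"
)

def parse_hgport_row(row):
    """Return a list of (lindex, lchip, lport, gport) for both link ends."""
    return [
        (int(lindex), int(lchip), int(lport), int(gport, 16))
        for lindex, lchip, lport, gport in ENTRY_RE.findall(row)
    ]

def build_sdk_commands(slot, lchip, gport):
    """Build the probe-view command sequence used in this case for one end."""
    port = format(gport, "#x")  # e.g. 0xe1e
    return [
        "sys",
        "prob",
        f"sdk slot {slot} sdk",
        f"sdk slot {slot} enter/lchip/{lchip}",
        f"sdk slot {slot} show/port/{port}/all",
        f"sdk slot {slot} port/{port}/self-checking",
    ]

# Mapping row copied from the output above: slot 2 on the left, slot 10 on the right.
row = "274 (2 , 30 ,0xe1e ) | 267 (0 , 6 ,0x1e06)"
left, right = parse_hgport_row(row)
slot2_cmds = build_sdk_commands(2, left[1], left[3])
slot10_cmds = build_sdk_commands(10, right[1], right[3])
print("\n".join(slot2_cmds + slot10_cmds))
```

The slot numbers are passed in by hand because they come from the `Slot 2 connect Slot 10` line, not from the mapping row itself.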