
S6850-G at a customer site: frequent BGP session flaps

Author: 刘贝

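Network topology and description

Access layer: an M-LAG pair of S6850-G switches (ASW). Core layer: S125G-AF (DSW).

Alarm information

Not applicable.

Problem description

Logs were collected on both the 6850-G (ASW) and the S125G-AF (DSW). On the ASW, the BGP sessions went down because the hold timer expired without a keepalive being received. On the DSW, the sessions changed state after a BGP notification arrived from the remote end.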

ASW side:

%Feb 23 15:57:01:273 2024 ASW-132-C03-1.AM11 BGP/5/BGP_STATE_CHANGED: BGP.: FD00:0:AC8:D038::AC8:D039  state has changed from ESTABLISHED to IDLE for hold timer expiration caused by peer device.

%Feb 23 15:57:01:273 2024 ASW-132-C03-1.AM11 BGP/5/BGP_STATE_CHANGED_REASON: BGP.: FD00:0:AC8:D038::AC8:D039  state has changed from ESTABLISHED to IDLE. (Reason: no keepalives or updates had been received from the peer when the hold timer expired, Error code: Send Notificationcode 4/0)

%Feb 23 15:57:59:273 2024 ASW-132-C03-1.AM11 BGP/5/BGP_STATE_CHANGED: BGP.: 10.200.208.49  state has changed from ESTABLISHED to IDLE for hold timer expiration caused by peer device.

%Feb 23 15:57:59:273 2024 ASW-132-C03-1.AM11 BGP/5/BGP_STATE_CHANGED_REASON: BGP.: 10.200.208.49  state has changed from ESTABLISHED to IDLE. (Reason: no keepalives or updates had been received from the peer when the hold timer expired, Error code: Send Notificationcode 4/0)

DSW side:

%Feb 23 16:49:36:378 2024 DSW-VM-G1-P-1.SM132 BGP/5/BGP_STATE_CHANGED_REASON: BGP.: FD00:0:AC8:D054::AC8:D056  state has changed from ESTABLISHED to IDLE. (Reason: a notification was received from the peer, Error code: Receive Notificationcode 4/0)

%Feb 23 16:49:36:555 2024 DSW-VM-G1-P-1.SM132 BGP/5/BGP_STATE_CHANGED_REASON: BGP.: FD00:0:AC8:D098::AC8:D09A  state has changed from ESTABLISHED to IDLE. (Reason: a notification was received from the peer, Error code: Receive Notificationcode 4/0)

%Feb 23 16:49:37:185 2024 DSW-VM-G1-P-1.SM132 BGP/5/BGP_STATE_CHANGED_REASON: BGP.: 10.200.208.186  state has changed from ESTABLISHED to IDLE. (Reason: a notification was received from the peer, Error code: Receive Notificationcode 4/0)
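The two log sets can be compared mechanically. The following is a minimal sketch, assuming the log lines have exactly the shape shown above; the regular expression and the function name `summarize` are illustrative and not part of any H3C tool. It extracts the peer, the notification direction, and the error code/subcode, so it is easy to see that the ASW sends 4/0 and the DSW only receives it.

```python
import re
from collections import Counter

# Pattern written against the log lines quoted in this case (an assumption,
# not an official log grammar).
LOG_RE = re.compile(
    r"BGP_STATE_CHANGED_REASON: BGP\.: (?P<peer>\S+)\s+state has changed from "
    r"(?P<old>\w+) to (?P<new>\w+)\. \(Reason: (?P<reason>.*?), Error code: "
    r"(?P<direction>Send|Receive) Notificationcode (?P<code>\d+)/(?P<subcode>\d+)\)"
)

SAMPLE = [
    "%Feb 23 15:57:01:273 2024 ASW-132-C03-1.AM11 BGP/5/BGP_STATE_CHANGED_REASON: "
    "BGP.: FD00:0:AC8:D038::AC8:D039  state has changed from ESTABLISHED to IDLE. "
    "(Reason: no keepalives or updates had been received from the peer when the "
    "hold timer expired, Error code: Send Notificationcode 4/0)",
    "%Feb 23 16:49:37:185 2024 DSW-VM-G1-P-1.SM132 BGP/5/BGP_STATE_CHANGED_REASON: "
    "BGP.: 10.200.208.186  state has changed from ESTABLISHED to IDLE. "
    "(Reason: a notification was received from the peer, Error code: "
    "Receive Notificationcode 4/0)",
]


def summarize(lines):
    """Count session drops by (direction, error code, subcode)."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match:
            key = (match["direction"], int(match["code"]), int(match["subcode"]))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    for (direction, code, subcode), n in summarize(SAMPLE).items():
        print(f"{direction} notification {code}/{subcode}: {n} drop(s)")
```

Run against the full logs of both devices, a result dominated by "Send 4/0" on one side and "Receive 4/0" on the other suggests that the sending side is the one that stopped seeing keepalives.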

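Analysis

The BGP peer log-info on the ASW shows that almost every drop follows the same pattern: no keepalive is received, the hold timer expires, the session goes down, and the ASW sends a notification to the peer. Diagnostics collected on the DSW showed nothing abnormal, and the DSW's BGP sessions with other access gateways were all stable. The suspicion was therefore that the CPU on the ASW was not processing BGP packets in time.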

<ASW-132-C03-1.AM11>dis bgp peer ipv4 10.200.209.125 log-info

Peer: 10.200.209.125

     Date      Time    State Notification

                             Error/SubError

  23-Feb-2024 18:59:48 Down  Send notification with error 4/0

                             Hold Timer Expired/ErrSubCode Unspecified

                             Keepalive last received time : 18:59:12-2024.2.23

                             Update last received time    : 18:59:17-2024.2.23

                             EPOLLIN last occurred time   : 18:59:17-2024.2.23

  23-Feb-2024 18:57:15 Up

  23-Feb-2024 18:56:58 Down  Send notification with error 4/0

                             Hold Timer Expired/ErrSubCode Unspecified

                             Keepalive last received time : 18:56:27-2024.2.23

                             Update last received time    : 18:56:27-2024.2.23

                             EPOLLIN last occurred time   : 18:56:27-2024.2.23

  23-Feb-2024 18:41:08 Up

  23-Feb-2024 18:40:45 Down  Send notification with error 4/0

                             Hold Timer Expired/ErrSubCode Unspecified

                             Keepalive last received time : 18:40:14-2024.2.23

                             Update last received time    : 18:40:14-2024.2.23

                             EPOLLIN last occurred time   : 18:40:14-2024.2.23

  23-Feb-2024 18:38:07 Up

  23-Feb-2024 18:37:44 Down  Send notification with error 4/0

                             Hold Timer Expired/ErrSubCode Unspecified

                             Keepalive last received time : 18:37:13-2024.2.23

                             Update last received time    : 18:37:13-2024.2.23

                             EPOLLIN last occurred time   : 18:37:13-2024.2.23

  23-Feb-2024 18:22:50 Up

  23-Feb-2024 18:22:26 Down  Send notification with error 4/0

                             Hold Timer Expired/ErrSubCode Unspecified

                             Keepalive last received time : 18:21:55-2024.2.23

                             Update last received time    : 18:21:50-2024.2.23

                             EPOLLIN last occurred time   : 18:21:55-2024.2.23

  23-Feb-2024 18:19:10 Up

  23-Feb-2024 18:18:53 Down  Send notification with error 4/0

                             Hold Timer Expired/ErrSubCode Unspecified

                             Keepalive last received time : 18:18:17-2024.2.23

                             Update last received time    : 18:18:22-2024.2.23

                             EPOLLIN last occurred time   : 18:18:22-2024.2.23
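To make the pattern in this output easier to see, the sketch below computes, for each Down event, how long the session had been silent, that is, the time between the most recent keepalive or update received and the moment the session dropped. It assumes the output format shown above; the parsing expressions and the function name `silent_gaps` are illustrative. Only the first two Down records are embedded as sample data.

```python
import re
from datetime import datetime

MONTHS = {name: i for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

EVENT_RE = re.compile(
    r"^\s*(\d{2})-([A-Za-z]{3})-(\d{4}) (\d{2}):(\d{2}):(\d{2})\s+(Up|Down)\b")
RX_RE = re.compile(
    r"^\s*(?:Keepalive|Update) last received time\s*:\s*"
    r"(\d{2}):(\d{2}):(\d{2})-(\d{4})\.(\d{1,2})\.(\d{1,2})")

SAMPLE = """\
  23-Feb-2024 18:59:48 Down  Send notification with error 4/0
                             Hold Timer Expired/ErrSubCode Unspecified
                             Keepalive last received time : 18:59:12-2024.2.23
                             Update last received time    : 18:59:17-2024.2.23
                             EPOLLIN last occurred time   : 18:59:17-2024.2.23
  23-Feb-2024 18:57:15 Up
  23-Feb-2024 18:56:58 Down  Send notification with error 4/0
                             Hold Timer Expired/ErrSubCode Unspecified
                             Keepalive last received time : 18:56:27-2024.2.23
                             Update last received time    : 18:56:27-2024.2.23
                             EPOLLIN last occurred time   : 18:56:27-2024.2.23
  23-Feb-2024 18:41:08 Up
"""


def silent_gaps(text):
    """Return (down_time, seconds since the last received keepalive/update)."""
    records = []
    current = None
    for line in text.splitlines():
        event = EVENT_RE.match(line)
        if event:
            day, mon, year, hh, mm, ss, state = event.groups()
            current = None
            if state == "Down":
                down = datetime(int(year), MONTHS[mon], int(day),
                                int(hh), int(mm), int(ss))
                current = {"down": down, "rx": []}
                records.append(current)
            continue
        rx = RX_RE.match(line)
        if rx and current is not None:
            hh, mm, ss, year, month, day = map(int, rx.groups())
            current["rx"].append(datetime(year, month, day, hh, mm, ss))
    return [(r["down"], int((r["down"] - max(r["rx"])).total_seconds()))
            for r in records if r["rx"]]


if __name__ == "__main__":
    for down, gap in silent_gaps(SAMPLE):
        print(f"{down:%H:%M:%S} down after {gap}s without keepalive/update")
```

A gap that is nearly identical on every drop is what one would expect from a fixed hold timer expiring, rather than from a link or peer failure with random timing.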

 

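The ASW showed a continuously increasing count of packets delivered to the CPU, which indicated that abnormal traffic was hitting the CPU. Dumping the packets delivered to the CPU revealed a large number of TCP packets with TTL equal to 1. Checking the table entries for the destination IP of those packets showed that both M-LAG devices had learned the ARP entry for that IP on the peer-link port, which forms a routing loop. The site confirmed that the destination IP belongs to a server attached below the ASW.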
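The check described here can be expressed as a small comparison of the two members' ARP tables. This is a sketch only: the input is a plain IP-to-interface mapping that would have to be built from the device output by hand or by a separate parser, and the server addresses and downlink interface name in the sample are hypothetical. Bridge-Aggregation100 is used as the peer-link name because that is what the site configuration uses.

```python
PEER_LINK = "Bridge-Aggregation100"  # peer-link aggregate in this case


def looped_destinations(arp_member_a, arp_member_b, peer_link=PEER_LINK):
    """Return IPs whose ARP entry points at the peer-link on BOTH members.

    Each argument maps an IP address (str) to the interface on which the
    ARP entry was learned. Such an IP has no real exit: each member hands
    the packet to the other, so traffic to it loops until TTL runs out.
    """
    return sorted(
        ip for ip, port in arp_member_a.items()
        if port == peer_link and arp_member_b.get(ip) == peer_link
    )


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the site.
    member_a = {
        "10.200.198.10": "Bridge-Aggregation100",
        "10.200.198.11": "Bridge-Aggregation10",
    }
    member_b = {
        "10.200.198.10": "Bridge-Aggregation100",
        "10.200.198.11": "Bridge-Aggregation100",
    }
    print(looped_destinations(member_a, member_b))
```

An entry learned on the peer-link on only one member is normal for a single-homed host; it is the combination of both members pointing at each other that indicates the loop.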

 

#

interface Vlan-interface9 // the MAC in the ARP entry is the gateway MAC

ip address 10.200.199.247 255.255.254.0

mac-address 0000-5e00-0101

local-proxy-arp enable 

 arp route-direct advertise

arp timer aging second 90

#

interface Bridge-Aggregation100 // Bridge-Aggregation100 is the peer-link port

port link-type trunk

undo port trunk permit vlan 1

port trunk permit vlan 2 to 4094

link-aggregation mode dynamic

port m-lag peer-link 1

undo mac-address static source-check enable

#

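The site configuration was reproduced in the lab with test traffic. The cause is that local proxy ARP (local-proxy-arp enable) is configured on the dual-active gateway. In this scenario, when the upstream DSW sends traffic to the ASW and the ASW has not yet learned the ARP entry of the downstream server, the gateway broadcasts an ARP request onto the peer-link because proxy ARP is configured. The gateway on the M-LAG peer also has local proxy ARP configured, so it sends an ARP request back. As a result, the ARP entry for the destination IP is learned on the peer-link port on both M-LAG members, which creates a routing loop.

Packets caught in the loop have their TTL decremented until it reaches 1 and are then delivered to the CPU. On the current software version, BGP keepalive packets are also delivered through the CPU queue for TTL=1 packets. When the volume of abnormal TTL=1 packets is too high, the keepalives are crowded out, the hold timer expires, and the BGP session drops.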
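The loop itself can be illustrated with a toy model. The sketch below is not a description of the switch forwarding pipeline; it assumes only what the analysis above states: each member routes the packet toward the port where the ARP entry was learned, each routed hop decrements TTL by one, and a packet whose TTL has reached 1 is delivered to the CPU. The member names ASW-1 and ASW-2, the port labels, and the initial TTL of 64 are hypothetical.

```python
PEER = {"ASW-1": "ASW-2", "ASW-2": "ASW-1"}  # hypothetical member names


def trace(initial_ttl, out_port, first="ASW-1"):
    """Follow one packet through the M-LAG pair.

    out_port maps each member to the port where it learned the ARP entry
    for the destination ("peer-link" or "downlink").
    Returns (outcome, device, hops_over_peer_link).
    """
    device, ttl, hops = first, initial_ttl, 0
    while ttl > 1:
        if out_port[device] != "peer-link":
            return ("delivered", device, hops)
        ttl -= 1               # one routed hop across the peer-link
        device = PEER[device]  # the other member receives the packet
        hops += 1
    return ("punted-to-cpu", device, hops)


if __name__ == "__main__":
    healthy = {"ASW-1": "downlink", "ASW-2": "peer-link"}
    looped = {"ASW-1": "peer-link", "ASW-2": "peer-link"}
    print(trace(64, healthy))  # leaves via the downlink on the first member
    print(trace(64, looped))   # bounces until TTL reaches 1
```

In the looped case every packet ends up on a CPU, which is why even a moderate traffic rate toward one unresolved server can fill the queue that the keepalives share.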

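Solution

1. In an M-LAG dual-active gateway network, do not configure local proxy ARP/ND on the downstream VLAN interfaces of the M-LAG devices; otherwise a traffic loop is triggered.

2. Upgrade the software. Later versions optimize the CPU queue used for keepalives so that BGP is processed with high priority. Upgrading to the recommended version R8307P08 is advised.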
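The first recommendation can be checked across saved configurations with a short script. The following is a minimal sketch, assuming a configuration file laid out like the excerpt in the analysis above (sections separated by `#`, one `interface` line per section). It only looks for `local-proxy-arp enable` under VLAN interfaces, does not check the ND equivalent, and cannot tell whether an interface is actually an M-LAG dual-active gateway, so any hit still needs a manual review. The function name is illustrative.

```python
SAMPLE_CONFIG = """\
#
interface Vlan-interface9
ip address 10.200.199.247 255.255.254.0
mac-address 0000-5e00-0101
local-proxy-arp enable
 arp route-direct advertise
arp timer aging second 90
#
interface Bridge-Aggregation100
port link-type trunk
port m-lag peer-link 1
#
"""


def vlan_interfaces_with_local_proxy_arp(config_text):
    """Return the names of VLAN interfaces that enable local proxy ARP."""
    offenders = []
    current = None
    for raw in config_text.splitlines():
        line = raw.split("//")[0].strip()  # drop annotation comments
        if line.startswith("interface "):
            current = line[len("interface "):].strip()
        elif line == "#":
            current = None  # end of the current section
        elif (current is not None
              and current.startswith("Vlan-interface")
              and line == "local-proxy-arp enable"):
            offenders.append(current)
    return offenders


if __name__ == "__main__":
    for name in vlan_interfaces_with_local_proxy_arp(SAMPLE_CONFIG):
        print(f"{name}: local-proxy-arp enable is configured")
```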


1 comment
繁华易逝

Well written. The author clearly has strong technical skills.
