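Device and version: S12516-X Release 1152
The live network topology and interface interconnections are as follows: 3.1 and 3.2 are the core S125 devices, and 4.253 and 4.252 are two ToR switches. EBGP runs between the ToRs and the cores and between the ToRs and the servers, IBGP runs between the cores, and allow-as-loop is enabled on the ToR side for the core peers.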
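The customer's environment already had a 10.180.96.0/19 route used to reach the server address 10.180.99.173. Later, for business reasons, the server began advertising the host route 10.180.99.173/32. The site reported that when the /32 route was withdrawn, 10.180.99.173 became unreachable for about 28 s and then recovered on its own.
1. After the server advertises the 10.180.99.173/32 host route, the address can be pinged from outside and everything looks normal on the devices. As confirmed with the site, the 10.180.99.173/32 host route is advertised only to the ToR 4.253 (the addresses shown as 66 and 67 are VM addresses). It is not advertised directly to the ToR 4.252, which learns the 10.180.99.173/32 route from the two cores.
2. Debugging on each device shows that 4.253 sends a withdraw message to both cores. On receiving it, each core sends a withdraw for the route to the other core and to the ToRs, but 3.1 then immediately sends the ToRs an update for the same route with next hop 3.2. As a result, the ToRs' routes to 10.180.99.173/32 all point to the cores, while the cores have no /32 specific route and only the 10.180.96.0/19 route pointing to the ToRs. This creates a routing loop and packet loss.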
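The forwarding loop described above can be illustrated with a minimal longest-prefix-match sketch. The node names (`core`, `tor`, `server`), the two-node simplification, and the assumption that the ToR's /19 route leads to the server are illustrative only; they are not taken from the device tables.

```python
import ipaddress

def lookup(table, dst):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [(prefix, nh) for prefix, nh in table.items() if addr in prefix]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

def trace(tables, start, dst, max_hops=8):
    """Follow next hops from `start` until delivery, a drop, or the hop limit."""
    path = [start]
    node = start
    for _ in range(max_hops):
        node = lookup(tables[node], dst) or "drop"
        path.append(node)
        if node not in tables:  # reached the server, or no route at all
            break
    return path

net = ipaddress.ip_network
# Hypothetical state during the outage: the ToR still holds the stale /32
# learned from the core, while the core only has the /19 back to the ToR.
tables = {
    "tor": {net("10.180.96.0/19"): "server", net("10.180.99.173/32"): "core"},
    "core": {net("10.180.96.0/19"): "tor"},
}
path = trace(tables, "core", "10.180.99.173", max_hops=4)
print(" -> ".join(path))
```

Once the stale /32 entry is removed from the ToR's table, the same trace follows the /19 route and reaches the server, which matches the automatic recovery the site observed.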
Debug output on 3.1
[2022-06/14-18:10:35]*Jun 14 18:10:35:179 2022 N_Core-06_D_R16C11-H12516-10.172.3.1.zzzc BGP/7/DEBUG: -MDC=1;
[2022-06/14-18:10:35] BGP.: Recv UPDATE(Withdraw) from peer 10.180.88.52 for destinations:
[2022-06/14-18:10:35] 10.180.99.173/32,
[2022-06/14-18:10:35]
[2022-06/14-18:10:35]*Jun 14 18:10:35:181 2022 N_Core-06_D_R16C11-H12516-10.172.3.1.zzzc BGP/7/DEBUG: -MDC=1;
[2022-06/14-18:10:35] BGP.: Send UPDATE(Withdraw) to update-group 0 for destinations:
[2022-06/14-18:10:35] 10.180.99.173/32,
[2022-06/14-18:10:35]
[2022-06/14-18:10:35]*Jun 14 18:10:35:182 2022 N_Core-06_D_R16C11-H12516-10.172.3.1.zzzc BGP/7/DEBUG: -MDC=1;
[2022-06/14-18:10:35] BGP.: Send UPDATE to update-group 8 for following destinations:
[2022-06/14-18:10:35] Origin : Incomplete
[2022-06/14-18:10:35] AS path : 65012 4172004252 4173120001
[2022-06/14-18:10:35] Next hop : 10.172.3.2
[2022-06/14-18:10:35] 10.180.99.173/32,
[2022-06/14-18:11:02]*Jun 14 18:11:02:206 2022 N_Core-06_D_R16C11-H12516-10.172.3.1.zzzc BGP/7/DEBUG: -MDC=1;
[2022-06/14-18:11:02] BGP.: Send UPDATE(Withdraw) to update-group 8 for destinations:
[2022-06/14-18:11:02] 10.180.99.173/32,
[2022-06/14-18:11:02]
[2022-06/14-18:11:02]*Jun 14 18:11:02:711 2022 N_Core-06_D_R16C11-H12516-10.172.3.1.zzzc BGP/7/DEBUG: -MDC=1;
[2022-06/14-18:11:02] BGP.: Recv UPDATE(Withdraw) from peer 10.180.88.52 for destinations:
[2022-06/14-18:11:02] 10.180.99.173/32,
<N_Core-06_D_R16C11-H12516-10.172.3.1.zzzc>display bgp update-group ipv4
Update-group ID: 0
Type: IBGP link
4-byte AS number: Supported
Minimum time between advertisements: 15 seconds
Advertising community: Yes
Export route policy: 30
OutQ: 0
Members: 3
10.172.0.2
10.172.3.2
10.172.0.1
Update-group ID: 8
Type: EBGP link
4-byte AS number: Supported
Minimum time between advertisements: 30 seconds
OutQ: 0
Members: 2
10.180.88.48
10.180.88.52
<N_Core-06_D_R16C11-H12516-10.172.3.1.zzzc>display arp 10.180.88.48
Type: S-Static D-Dynamic O-Openflow R-Rule M-Multiport I-Invalid
IP address MAC address VID Interface/Link ID Aging Type
10.180.88.48 60db-15dd-d4cd N/A FGE14/0/14 20 D
<N_Core-06_D_R16C11-H12516-10.172.3.1.zzzc>display current-configuration interface FortyGigE 14/0/14
#
interface FortyGigE14/0/14
port link-mode route
description To-Tool_W-06_D_R04C18-H6805-10.172.4.252.zzzc-Hun-1/0/49
ip address 10.180.88.49 255.255.255.254
#
Debug output on 4.252
[2022-06/14-18:10:35]*Jun 14 18:10:35:184 2022 Tool_W-06_D_R04C18-H6805-10.172.4.252.zzzc BGP/7/DEBUG:
[2022-06/14-18:10:35] BGP.: Recv UPDATE from peer 10.180.88.49 with following destinations:
[2022-06/14-18:10:35] Update message length : 56
[2022-06/14-18:10:35] Origin : Incomplete
[2022-06/14-18:10:35] AS path : 65012 4172004252 4173120001
[2022-06/14-18:10:35] Next hop : 10.180.88.49
[2022-06/14-18:10:35] 10.180.99.173/32 PathID 0 ,
[2022-06/14-18:10:35]
[2022-06/14-18:10:35]*Jun 14 18:10:35:186 2022 Tool_W-06_D_R04C18-H6805-10.172.4.252.zzzc BGP/7/DEBUG:
[2022-06/14-18:10:35] BGP.: Recv UPDATE from peer 10.180.88.51 with following destinations:
[2022-06/14-18:10:35] Update message length : 56
[2022-06/14-18:10:35] Origin : Incomplete
[2022-06/14-18:10:35] AS path : 65012 4172004252 4173120001
[2022-06/14-18:10:35] Next hop : 10.180.88.51
[2022-06/14-18:10:35] 10.180.99.173/32 PathID 0 ,
[2022-06/14-18:11:02]*Jun 14 18:11:02:208 2022 Tool_W-06_D_R04C18-H6805-10.172.4.252.zzzc BGP/7/DEBUG:
[2022-06/14-18:11:02] BGP.: Recv UPDATE(Withdraw) from peer 10.180.88.49 for destinations:
[2022-06/14-18:11:02] 10.180.99.173/32 PathID 0 ,
[2022-06/14-18:11:02]
[2022-06/14-18:11:02]*Jun 14 18:11:02:209 2022 Tool_W-06_D_R04C18-H6805-10.172.4.252.zzzc BGP/7/DEBUG:
[2022-06/14-18:11:02] BGP.: Send UPDATE to update-group 1 for following destinations:
[2022-06/14-18:11:02] Origin : Incomplete
[2022-06/14-18:11:02] AS path : 4172004252 65012 4172004252 4173120001
[2022-06/14-18:11:02] Next hop : 10.180.88.51
[2022-06/14-18:11:02] 10.180.99.173/32 PathID 0 ,
[2022-06/14-18:11:02]
[2022-06/14-18:11:02]*Jun 14 18:11:02:209 2022 Tool_W-06_D_R04C18-H6805-10.172.4.252.zzzc BGP/7/DEBUG:
[2022-06/14-18:11:02] BGP.: Send UPDATE MSG to peer 10.180.88.49 (IPv4-UNC) NextHop: 10.180.88.48.
[2022-06/14-18:11:02]
[2022-06/14-18:11:02]*Jun 14 18:11:02:209 2022 Tool_W-06_D_R04C18-H6805-10.172.4.252.zzzc BGP/7/DEBUG:
[2022-06/14-18:11:02] BGP.: Send UPDATE(Withdraw) to peer 10.180.88.51 for destinations:
[2022-06/14-18:11:02] 10.180.99.173/32 PathID 0 ,
[2022-06/14-18:11:02]
[2022-06/14-18:11:02]*Jun 14 18:11:02:708 2022 Tool_W-06_D_R04C18-H6805-10.172.4.252.zzzc BGP/7/DEBUG:
[2022-06/14-18:11:02] BGP.: Recv UPDATE(Withdraw) from peer 10.180.88.51 for destinations:
[2022-06/14-18:11:02] 10.180.99.173/32 PathID 0 ,
[2022-06/14-18:11:02]
[2022-06/14-18:11:02]*Jun 14 18:11:02:709 2022 Tool_W-06_D_R04C18-H6805-10.172.4.252.zzzc BGP/7/DEBUG:
[2022-06/14-18:11:02] BGP.: Send UPDATE(Withdraw) to update-group 1 for destinations:
[2022-06/14-18:11:02] 10.180.99.173/32 PathID 0 ,
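In summary, combining the site topology with the debug output: when the server advertises the host route, 4.253 advertises it to both 3.1 and 3.2. 3.1 in turn advertises it to 3.2 and 4.252, and 3.2 likewise advertises it to 3.1 and 4.252. When the server withdraws the host route, 4.253 sends withdraw messages to 3.1 and 3.2. However, because 3.1 and 3.2 can receive the route over multiple paths, add and delete messages may chase each other around the loop, the so-called "snake" phenomenon.
There is a further issue here. In principle, withdraw messages are not suppressed by the update interval, but there is one special case in which they are: a route is about to be withdrawn and the device finds that it still has a next hop. In this case, 3.1 and 3.2 receive the withdraw from 4.253 and must send withdraws onward. At that moment, however, 3.1 and 3.2 find that the route being withdrawn still has a next hop, and both devices have sent an update within the last 30 s, so the withdraw is held back and is only sent once the suppression interval ends.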
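As a rough cross-check of this explanation, the gap between the stale update and the delayed withdraw can be computed from the two timestamps in the 3.1 debug output (update to update-group 8 at 18:10:35:182, withdraw to update-group 8 at 18:11:02:206). The sketch below assumes the field after the last colon in the Comware timestamp is milliseconds.

```python
from datetime import datetime

# Timestamp layout as printed in the debug log, e.g. "Jun 14 18:10:35:182 2022".
FMT = "%b %d %H:%M:%S:%f %Y"

def parse(ts):
    """Parse a debug timestamp; %f reads "182" as 0.182 s."""
    return datetime.strptime(ts, FMT)

update_sent = parse("Jun 14 18:10:35:182 2022")    # Send UPDATE to update-group 8
withdraw_sent = parse("Jun 14 18:11:02:206 2022")  # Send UPDATE(Withdraw) to update-group 8

gap = (withdraw_sent - update_sent).total_seconds()
print(f"stale /32 lifetime on the ToRs: {gap:.3f} s")
```

The result, about 27 s, falls within the 30-second minimum advertisement interval shown for update-group 8 and is in line with the roughly 28 s outage reported by the site.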
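The site can resolve this by:
1. Disabling allow-as-loop so that BGP's own loop prevention takes effect;
2. Configuring route-update-interval 0 so that withdraw messages are no longer suppressed.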
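Why the first option helps can be sketched with a simplified model of the AS_PATH loop check. The function name `accepts` and the counting rule (the local AS may appear at most `allow_as_loop` times) are assumptions made for illustration, not the device's implementation. The AS path is the one in the debug output, and the ToR AS number 4172004252 is inferred from the AS that 4.252 prepends when it re-advertises the route.

```python
def accepts(as_path, local_as, allow_as_loop=0):
    """Return True if a received route passes the AS_PATH loop check.

    allow_as_loop is the number of times the local AS may appear in the
    path; 0 models standard BGP behaviour (any occurrence is a loop).
    """
    return as_path.count(local_as) <= allow_as_loop

TOR_AS = 4172004252                              # inferred from the 4.252 debug output
path_from_core = [65012, 4172004252, 4173120001]  # AS path of the /32 sent by the core

default_result = accepts(path_from_core, TOR_AS)
relaxed_result = accepts(path_from_core, TOR_AS, allow_as_loop=1)
print(default_result, relaxed_result)
```

Under this model, a ToR without allow-as-loop discards the /32 that the core sends back, because the path already contains the ToR's own AS, so the stale route never reaches the ToR's routing table.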