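The on-site network is shown in the figure: two S6800 switches form an M-LAG dual-active VLAN gateway, and the M-LAG interface connects to a third-party switch. The fault was reproduced in the HCL simulator; the IP addresses in the topology below are not the real addresses.
When M-LAG is in a normal state, terminal traffic is normal. However, during a power-off test of the M-LAG primary device, the downstream terminal (10.0.0.1) loses more than ten packets when pinging the dual-active gateway address (10.0.0.254).
S6800-1 configuration: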
#
interface Bridge-Aggregation1
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
link-aggregation mode dynamic
port m-lag peer-link 1
undo mac-address static source-check enable
#
#
interface Ten-GigabitEthernet1/0/48
port link-mode route
ip address 1.1.1.1 255.255.255.252
#
#
interface Bridge-Aggregation15
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
link-aggregation mode dynamic
port lacp system-priority 100
port m-lag group 15
#
#
interface Ten-GigabitEthernet1/0/20
port link-mode bridge
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
port link-aggregation group 15
#
#
m-lag mad exclude interface Ten-GigabitEthernet1/0/48
m-lag mad exclude interface Vlan-interface10
m-lag role priority 100
m-lag system-mac 0002-0002-0002
m-lag system-number 1
m-lag system-priority 234
m-lag standalone enable delay 1
m-lag keepalive ip destination 1.1.1.2 source 1.1.1.1
#
#
interface Vlan-interface10
ip address 10.0.0.254 255.255.255.0
#
S6800-2 configuration:
#
interface Bridge-Aggregation1
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
link-aggregation mode dynamic
port m-lag peer-link 1
undo mac-address static source-check enable
#
#
interface Ten-GigabitEthernet1/0/48
port link-mode route
ip address 1.1.1.2 255.255.255.252
#
#
interface Bridge-Aggregation15
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
link-aggregation mode dynamic
port lacp system-priority 100
port m-lag group 15
#
#
interface Ten-GigabitEthernet1/0/20
port link-mode bridge
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
port link-aggregation group 15
#
#
m-lag mad exclude interface Ten-GigabitEthernet1/0/48
m-lag mad exclude interface Vlan-interface10
m-lag role priority 200
m-lag system-mac 0002-0002-0002
m-lag system-number 2
m-lag system-priority 234
m-lag standalone enable delay 1
m-lag keepalive ip destination 1.1.1.1 source 1.1.1.2
#
#
interface Vlan-interface10
ip address 10.0.0.254 255.255.255.0
#
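1. Before the M-LAG primary device is powered off, display link-aggregation verbose shows that the Bridge-Aggregation15 member ports on both devices are Selected.
After the M-LAG primary device is powered off, display link-aggregation verbose on the other device shows that the Bridge-Aggregation15 member port is not Selected, and display interface Ten-GigabitEthernet1/0/20 shows the port state as LAGG DOWN. S6800-2 outputs the following logs: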
%Jan 1 13:02:45:636 2021 S6800-2 M-LAG/6/MLAG_KEEPALIVELINK_DOWN: Keepalive link went down because the local keepalive timeout timer expired. Please check the keepalive packet transmission and reception status at the two ends.
%Jan 1 13:02:45:654 2021 S6800-2 STP/5/STP_CONSISTENCY_CHECK: M-LAG role assignment finished. Please verify that the local device and the peer device have consistent global and mlag-interface-specific STP settings.
%Jan 1 13:02:45:657 2021 S6800-2 M-LAG/6/MLAG_SYSEVENT_DEVICEROLE_CHANGE: Device role changed from Secondary to Primary for peer link and keepalive link down. /// The device role changed
%Jan 1 13:02:45:664 2021 S6800-2 M-LAG/6/MLAG_IFEVT_PEERIF_NOSELECTED: Peer M-LAG interface in M-LAG group 15 does not have Selected member ports. /// The M-LAG interface has no Selected ports
%Jan 1 13:02:46:823 2021 S6800-2 M-LAG/6/MLAG_SYSEVENT_MODE_CHANGE: The device's working mode changed to standalone. /// The device working mode changed
%Jan 1 13:02:46:947 2021 S6800-2 LAGG/6/LAGG_INACTIVE_OPERSTATE: Member port XGE1/0/20 of aggregation group BAGG15 changed to the inactive state, because the peer port did not have the Synchronization flag. /// The port became inactive because the peer port lacked the Synchronization flag
%Jan 1 13:02:46:953 2021 S6812_KXC_VDI_4E03_DS2 IFNET/5/LINK_UPDOWN: Line protocol state on the interface Ten-GigabitEthernet1/0/20 changed to down.
%Jan 1 13:02:46:969 2021 S6800-2 M-LAG/4/MLAG_DEVICE_MADDOWN: All new service interfaces not excluded from the M-LAG MAD DOWN will change to the M-LAG MAD DOWN state because the peer link and all M-LAG interfaces went down. Please first check the peer link settings on both ends of the peer link.
%Jan 1 13:02:46:971 2021 S6800-2 M-LAG/6/MLAG_IFEVT_MLAGIF_NOSELECTED: Local M-LAG interface Bridge-Aggregation15 in M-LAG group 15 does not have Selected member ports because the aggregate interface went down. Please check the aggregate link status.
%Jan 1 13:02:47:191 2021 S6800-2 IFNET/3/PHY_UPDOWN: Physical state on the interface Bridge-Aggregation15 changed to down.
%Jan 1 13:02:47:191 2021 S6800-2 IFNET/5/LINK_UPDOWN: Line protocol state on the interface Bridge-Aggregation15 changed to down.
%Jan 1 13:02:47:216 2021 S6800-2 M-LAG/6/MLAG_IFEVT_MLAGIF_GLOBALDOWN: The state of M-LAG group 15 changed to down.
%Jan 1 13:02:47:220 2021 S6800-2 M-LAG/6/MLAG_SYSEVENT_DEVICEROLE_CHANGE: Device role changed from Primary to None for peer link and Keepalive link down.All local M-LAG interfaces down.
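The state described by these logs can also be checked from the CLI of S6800-2 after the primary device is powered off. The following is a sketch rather than captured output: the display m-lag commands do not appear in the original case and are assumed to be available on the software version in use. Lines beginning with # are annotations.

```
# Check whether the member port of Bridge-Aggregation15 is Selected
display link-aggregation verbose Bridge-Aggregation 15
# Check the state of the member port (LAGG DOWN in this case)
display interface Ten-GigabitEthernet 1/0/20
# Assumed available: M-LAG role, keepalive link state, and M-LAG interface summary
display m-lag role
display m-lag keepalive
display m-lag summary
```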
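2. The logs suggest that the problem is related to the LACP packet exchange. To find out why negotiation fails, use debugging link-aggregation lacp packet all interface Ten-GigabitEthernet 1/0/20 to check the parameters negotiated by both ends.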
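A minimal sketch of how that debugging session might be run on S6800-2. Only the debugging link-aggregation command comes from the case; the terminal and undo commands around it are the usual Comware housekeeping and are assumed to apply on this software version. Lines beginning with # are annotations.

```
# Send debugging output to the current terminal session
terminal monitor
terminal debugging
# Print sent and received LACPDUs on the M-LAG member port
debugging link-aggregation lacp packet all interface Ten-GigabitEthernet 1/0/20
# Stop debugging once the packets of interest have been captured
undo debugging all
```

The point of the capture is to compare the system MAC address that the local device advertises with the partner system MAC address that the third-party switch echoes back.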
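Before the M-LAG primary device is powered off, the debug output is normal: the M-LAG system MAC address 0002-0002-0002 is used as the local LACP system MAC address, and the peer system MAC address is xxxx-xxxx-e080.
After the M-LAG primary device is powered off, the terminal's pings start to fail. The LACP packets sent by our device already carry its own system MAC address, but the LACP packets returned by the peer still list 0002-0002-0002 as the partner MAC address. Because the parameters negotiated by the two ends remain inconsistent, the aggregate interface cannot come up.
Some time after the power-off, the LACP packets returned by the peer refresh the partner MAC address, and the terminal's pings recover.
3. A configuration check shows that the site uses the m-lag standalone enable command to enable the M-LAG standalone mode feature.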
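[Command usage guidelines]
When the M-LAG system splits, configure this command to prevent both devices in the M-LAG system from forwarding traffic as the primary device. It makes an M-LAG device switch to standalone mode immediately or after a delay.
After an M-LAG device switches to standalone mode, the M-LAG system parameters carried in the LACP packets sent by its aggregate interfaces revert to the aggregate interface's own LACP system MAC address and LACP priority. The two aggregate interfaces in the same M-LAG group then have different LACP system MAC addresses and LACP priorities. As a result, the member ports of only one aggregate interface can be Selected, and the device that owns them runs standalone and forwards service traffic, which avoids forwarding errors.
This command takes effect only when both the peer link and the keepalive link fail. When the peer M-LAG device reboots as a whole, it notifies the local M-LAG device; the local device then knows that the peer link and keepalive link have not failed, so this feature does not take effect. Where a device power-off causes both the peer link and the keepalive link to fail, set the delay before the M-LAG device switches to standalone mode to a value greater than the time a full device reboot takes, to avoid forwarding errors caused by M-LAG interface flapping. Where both links fail for reasons other than a device power-off, set a smaller delay so that the device switches to standalone mode as soon as possible.
If this command is executed multiple times, the most recent execution takes effect.
Configure this feature on all M-LAG devices.
Before configuring this command, make sure the LACP system priority value of the M-LAG devices is greater than that of the device attached to the M-LAG system. The reference port is then on the attached device, which prevents the attached device's ports from flapping frequently.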
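After m-lag standalone is disabled, the same test loses only one packet. The root cause, however, is that the peer refreshes the MAC address in its LACP packets too slowly, so the final fix is to adjust the LACP-related parameters on the peer.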
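The two standalone-mode options discussed in this case could be entered as follows. This is a sketch under assumptions: the delay value 300 is hypothetical, chosen only to illustrate a delay longer than a device reboot, and the unit is assumed to be seconds. Check both against the measured reboot time and the range the software version accepts. Lines beginning with # are annotations.

```
system-view
# Option 1 (what was done on site): disable standalone mode
undo m-lag standalone enable
# Option 2 (alternative suggested by the usage guidelines; value is hypothetical):
# keep standalone mode but delay it past the device reboot time
m-lag standalone enable delay 300
```

Enter only one of the two options, and apply the same choice on both S6800-1 and S6800-2.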