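No network topology is involved.
Remote SSH logins to the S12510X device kept failing with a wrong-password prompt. After logging in through the console to change the password, the local-user scswadmin view could not be entered, and the device returned "Add user failed".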
[LJL-SW-W13/14<1-23>-S12510]local-user scswadmin class manage
Add user failed
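In the diagnostic information provided, display process shows no lauthd process on the main processing unit (MPU). The display process log also shows that lauthd was restarted many times in succession starting at 2022-02-24 15:48, which left the lauthd process in an abnormal state.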
===============display process log===============
lauthd 467 467 Y Y 2019-01-06 06:28:12 2022-02-24 15:48:16
lauthd 467 2494188 Y N 2022-02-24 15:48:16 2022-02-24 15:48:19
lauthd 467 2494195 Y N 2022-02-24 15:48:19 2022-02-24 15:48:21
lauthd 467 2494208 Y N 2022-02-24 15:48:21 2022-02-24 15:48:27
lauthd 467 2494213 Y N 2022-02-24 15:48:26 2022-02-24 15:48:27
lauthd 467 2494217 Y N 2022-02-24 15:48:28 2022-02-24 15:48:30
lauthd 467 2494221 Y N 2022-02-24 15:48:30 2022-02-24 15:48:33
lauthd 467 2494225 Y N 2022-02-24 15:48:32 2022-02-24 15:48:37
lauthd 467 2494229 Y N 2022-02-24 15:48:38 2022-02-24 15:48:40
lauthd 467 2494233 Y N 2022-02-24 15:48:39 2022-02-24 15:48:40
lauthd 467 2494237 Y N 2022-02-24 15:48:40 2022-02-24 15:48:41
lauthd 467 2494241 Y N 2022-02-24 15:48:42 2022-02-24 15:48:49
lauthd 467 2494247 Y N 2022-02-24 15:48:49 2022-02-24 15:48:52
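The restart burst in the log above can also be detected mechanically. The following is a minimal sketch, not an H3C tool: it assumes each record has nine whitespace-separated fields (name, JID, PID, two flag columns, start date and time, end date and time), and the function names, the 60-second window, and the threshold of 5 exits are illustrative choices.

```python
from datetime import datetime, timedelta

# Records copied from the display process log excerpt in this case.
SAMPLE = """\
lauthd 467 467 Y Y 2019-01-06 06:28:12 2022-02-24 15:48:16
lauthd 467 2494188 Y N 2022-02-24 15:48:16 2022-02-24 15:48:19
lauthd 467 2494195 Y N 2022-02-24 15:48:19 2022-02-24 15:48:21
lauthd 467 2494208 Y N 2022-02-24 15:48:21 2022-02-24 15:48:27
lauthd 467 2494213 Y N 2022-02-24 15:48:26 2022-02-24 15:48:27
lauthd 467 2494217 Y N 2022-02-24 15:48:28 2022-02-24 15:48:30
lauthd 467 2494221 Y N 2022-02-24 15:48:30 2022-02-24 15:48:33
lauthd 467 2494225 Y N 2022-02-24 15:48:32 2022-02-24 15:48:37
lauthd 467 2494229 Y N 2022-02-24 15:48:38 2022-02-24 15:48:40
lauthd 467 2494233 Y N 2022-02-24 15:48:39 2022-02-24 15:48:40
lauthd 467 2494237 Y N 2022-02-24 15:48:40 2022-02-24 15:48:41
lauthd 467 2494241 Y N 2022-02-24 15:48:42 2022-02-24 15:48:49
lauthd 467 2494247 Y N 2022-02-24 15:48:49 2022-02-24 15:48:52
"""

TIME_FMT = "%Y-%m-%d %H:%M:%S"


def parse_exits(text):
    """Parse log records into dicts; lines that do not fit the assumed layout are skipped."""
    records = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 9:
            continue
        try:
            start = datetime.strptime(" ".join(parts[5:7]), TIME_FMT)
            end = datetime.strptime(" ".join(parts[7:9]), TIME_FMT)
        except ValueError:
            continue
        records.append({"name": parts[0], "pid": int(parts[2]),
                        "start": start, "end": end})
    return records


def restart_storms(records, window=timedelta(seconds=60), threshold=5):
    """Return {process name: max exits inside any window} for names reaching the threshold."""
    exits_by_name = {}
    for rec in records:
        exits_by_name.setdefault(rec["name"], []).append(rec["end"])

    flagged = {}
    for name, ends in exits_by_name.items():
        ends.sort()
        left = 0
        best = 0
        # Sliding window over the sorted exit times.
        for right, t in enumerate(ends):
            while t - ends[left] > window:
                left += 1
            best = max(best, right - left + 1)
        if best >= threshold:
            flagged[name] = best
    return flagged


if __name__ == "__main__":
    print(restart_storms(parse_exits(SAMPLE)))
```

On the excerpt from this case, all 13 recorded exits fall between 15:48:16 and 15:48:52, so lauthd is flagged with a count of 13.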
The printk log buffer of the MPU also contains records of the lauthd process failure.
<4>[92362697.127739] [do_ade]: Killing process (lauthd) which is using ualigned accesses!
<4>[92362697.127807] Cpu 10
<4>[92362697.127825] $ 0 : 0000000000000000 0000000000000001 000000012002dbd0 0000000000000000
<4>[92362697.127894] $ 4 : 000000012002d880 0000000000000350 62173840559f5888 0000000900000004
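This information shows that the fault was caused by the abnormal lauthd process on the MPU and that the device has hit a known issue. Service can be restored with an active/standby MPU switchover; afterwards, upgrade the software to R1005P21 and load the H12 patch.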
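After the active/standby switchover was performed on site, remote SSH login worked normally again, but the configuration following user-group had disappeared. The collected diagnostics show that on the current active MPU, chassis 2 slot 17, the dbm child process (dbmd in the log) is failing. This is also a known issue and can be cleared by another active/standby switchover.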
<7>[92378176.984894] ub 1-1:1.0: usb_probe_interface - got id
<7>[92378186.336356] uba:<7> uba1
<4>[92445924.753972] process dbmd[18516] bad addr 0x559282d3f: pmd none or bad, pgd=0xffffffff80679000
<4>[92445924.754098] set scheduler, ret=0
<4>[92445931.758258] process dbmd[18521] bad addr 0xe1688d6908: pmd none or bad, pgd=0xffffffff80679000
<4>[92445931.758373] set scheduler, ret=0
<4>[92446040.561855] process dbmd[19117] bad addr 0x559282d3f: pmd none or bad, pgd=0xffffffff80679000
<4>[92446040.561962] set scheduler, ret=0
<4>[92446054.798721] process dbmd[19681] bad addr 0xe1688d6908: pmd none or bad, pgd=0xffffffff80679000
<4>[92446054.798841] set scheduler, ret=0
<4>[92446230.523706] process dbmd[20564] bad addr 0xdce9432230: pmd none or bad, pgd=0xffffffff80679000
<4>[92446230.523824] set scheduler, ret=0
<4>[92446232.851600] process dbmd[20569] bad addr 0xdce9432230: pmd none or bad, pgd=0xffffffff80679000
<4>[92446232.851712] set scheduler, ret=0
<4>[92446304.067987] process dbmd[20692] bad addr 0x559282d3f: pmd none or bad, pgd=0xffffffff80679000
<4>[92446304.068103] set scheduler, ret=0
<4>[92446390.486457] process dbmd[21378] bad addr 0xdce9432230: pmd none or bad, pgd=0xffffffff80679000
<4>[92446390.486574] set scheduler, ret=0
<4>[92446852.718690] process dbmd[30777] bad addr 0xe1688d6908: pmd none or bad, pgd=0xffffffff80679000
<4>[92446852.718805] set scheduler, ret=0
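To gauge how widespread the dbmd faults are, the printk lines above can be summarized by faulting address and PID. This is a minimal sketch under the assumption that fault lines follow the pattern seen in this excerpt ("process NAME[PID] bad addr 0x..."); the regular expression and the function name are illustrative, and the sample holds only a few of the lines from the log.

```python
import re
from collections import Counter

# A few lines copied from the printk excerpt in this case.
SAMPLE = """\
<4>[92445924.753972] process dbmd[18516] bad addr 0x559282d3f: pmd none or bad, pgd=0xffffffff80679000
<4>[92445924.754098] set scheduler, ret=0
<4>[92445931.758258] process dbmd[18521] bad addr 0xe1688d6908: pmd none or bad, pgd=0xffffffff80679000
<4>[92445931.758373] set scheduler, ret=0
<4>[92446040.561855] process dbmd[19117] bad addr 0x559282d3f: pmd none or bad, pgd=0xffffffff80679000
<4>[92446230.523706] process dbmd[20564] bad addr 0xdce9432230: pmd none or bad, pgd=0xffffffff80679000
"""

# Assumed line shape, based only on the excerpt above.
FAULT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+process (?P<name>\w+)\[(?P<pid>\d+)\] "
    r"bad addr (?P<addr>0x[0-9a-fA-F]+)"
)


def summarize_faults(text, process="dbmd"):
    """Count bad-address faults per address and collect the PIDs involved."""
    addr_counts = Counter()
    pids = set()
    for line in text.splitlines():
        match = FAULT_RE.search(line)
        if match and match.group("name") == process:
            addr_counts[match.group("addr")] += 1
            pids.add(int(match.group("pid")))
    return addr_counts, pids


if __name__ == "__main__":
    counts, pids = summarize_faults(SAMPLE)
    for addr, n in counts.most_common():
        print(addr, n)
    print("distinct PIDs:", len(pids))
```

A different PID on every fault line, as in this case, indicates that the process is being respawned and failing again rather than a single long-lived instance logging repeatedly.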
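1. Perform two active/standby MPU switchovers to restore service, using reboot chassis 2 slot 16 and reboot chassis 2 slot 17 respectively;
2. The current software version, R1005P09, is an old release published in 2014. Upgrading to R1005P21 and loading the H12 patch is recommended.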
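When auditing several devices for this issue, the version check in step 2 can be scripted. The sketch below assumes release strings of the form R<release>P<patch>, as in R1005P09 and R1005P21; the function names are hypothetical, and the H12 hot patch is not covered because its naming is not described in this case.

```python
import re

# Assumed release-string shape: "R" + release digits + "P" + patch digits.
VERSION_RE = re.compile(r"^R(\d+)P(\d+)$")


def parse_release(version):
    """Turn a string such as 'R1005P09' into a comparable (release, patch) tuple."""
    match = VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def needs_upgrade(current, recommended="R1005P21"):
    """Return True when the current release is older than the recommended one."""
    return parse_release(current) < parse_release(recommended)


if __name__ == "__main__":
    print(needs_upgrade("R1005P09"))
    print(needs_upgrade("R1005P21"))
```

Comparing integer tuples avoids the pitfalls of plain string comparison when patch numbers differ in length.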