None
Feb 15 10:38:41 hlw-cvk1 kernel: [45717.819047] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 10:56:58 hlw-cvk1 kernel: [46814.306100] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 15:42:32 hlw-cvk1 kernel: [63940.764880] qla2xxx [0000:87:00.0]-801c:2: Abort command issued nexus=2:3:0 -- 1 2002.
Feb 15 16:27:53 hlw-cvk1 kernel: [66660.206649] qla2xxx [0000:87:00.0]-801c:2: Abort command issued nexus=2:3:0 -- 1 2002.
Feb 15 16:29:15 hlw-cvk1 kernel: [66742.172670] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 18:42:31 hlw-cvk1 kernel: [74734.922865] qla2xxx [0000:87:00.0]-801c:2: Abort command issued nexus=2:3:0 -- 1 2002.
Feb 15 18:44:22 hlw-cvk1 kernel: [74845.359598] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 18:45:44 hlw-cvk1 kernel: [74927.329540] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 20:26:21 hlw-cvk1 kernel: [80961.536537] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 20:27:43 hlw-cvk1 kernel: [81043.506485] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 20:48:20 hlw-cvk1 kernel: [82279.924596] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 21:07:13 hlw-cvk1 kernel: [83412.430842] qla2xxx [0000:87:00.0]-801c:2: Abort command issued nexus=2:3:0 -- 1 2002.
Feb 15 21:48:51 hlw-cvk1 kernel: [85909.244169] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 21:50:13 hlw-cvk1 kernel: [85991.209936] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 22:08:03 hlw-cvk1 kernel: [87060.717438] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 22:09:25 hlw-cvk1 kernel: [87142.687400] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 23:09:57 hlw-cvk1 kernel: [90772.990288] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 23:11:19 hlw-cvk1 kernel: [90854.956222] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 23:52:14 hlw-cvk1 kernel: [93308.817160] scsi 2:0:3:0: timing out command, waited 82s
Feb 15 23:53:36 hlw-cvk1 kernel: [93390.786982] scsi 2:0:3:0: timing out command, waited 82s
Feb 16 00:34:39 hlw-cvk1 kernel: [95852.644239] scsi 2:0:3:0: timing out command, waited 82s
Feb 16 01:17:17 hlw-cvk1 kernel: [98409.473204] scsi 2:0:3:0: timing out command, waited 82s
Feb 16 01:18:27 hlw-cvk1 kernel: [98479.376882] qla2xxx [0000:87:00.0]-801c:2: Abort command issued nexus=2:3:0 -- 1 2002.
Feb 16 01:40:26 hlw-cvk1 kernel: [99797.832639] scsi 2:0:3:0: timing out command, waited 82s
Feb 16 02:24:42 hlw-cvk1 kernel: [102452.612260] scsi 2:0:3:0: timing out command, waited 82s
Feb 16 02:26:04 hlw-cvk1 kernel: [102534.578212] scsi 2:0:3:0: timing out command, waited 82s
Feb 16 03:49:15 hlw-cvk1 kernel: [107523.222263] qla2xxx [0000:87:00.0]-801c:2: Abort command issued nexus=2:3:0 -- 1 2002.
Feb 16 03:50:37 hlw-cvk1 kernel: [107605.188205] scsi 2:0:3:0: timing out command, waited 82s
Feb 16 03:50:42 hlw-cvk1 kernel: [107609.977966] rport-2:0-3: blocked FC remote port time out: removing target and saving binding
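The excerpt above follows a pattern: repeated command timeouts and qla2xxx aborts on the same device, ending with the FC remote port being removed. As a rough aid for spotting that pattern in a longer log, the following is a minimal sketch in Python. The regular expressions are derived only from the message texts shown in this excerpt, and the function name `summarize` and the `SAMPLE` data are illustrative assumptions, not part of any CAS tooling.

```python
import re
from collections import Counter

# Patterns taken from the message texts in the log excerpt above.
# They are assumptions about the format and may need adjusting for other kernels.
PATTERNS = {
    "timeout": re.compile(r"scsi (\S+): timing out command, waited (\d+)s"),
    "abort": re.compile(r"qla2xxx \[[^\]]+\]-801c:\d+: Abort command issued nexus=(\S+)"),
    "rport_removed": re.compile(r"(rport-\S+): blocked FC remote port time out"),
}


def summarize(lines):
    """Count each event type and keep the first rport removal line, if any."""
    counts = Counter()
    removal = None
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                if name == "rport_removed" and removal is None:
                    removal = line.strip()
                break
    return counts, removal


# Three lines copied from the end of the excerpt, used as sample input.
SAMPLE = """\
Feb 16 03:49:15 hlw-cvk1 kernel: [107523.222263] qla2xxx [0000:87:00.0]-801c:2: Abort command issued nexus=2:3:0 -- 1 2002.
Feb 16 03:50:37 hlw-cvk1 kernel: [107605.188205] scsi 2:0:3:0: timing out command, waited 82s
Feb 16 03:50:42 hlw-cvk1 kernel: [107609.977966] rport-2:0-3: blocked FC remote port time out: removing target and saving binding
""".splitlines()

counts, removal = summarize(SAMPLE)
print(dict(counts))
print(removal)
```

To run it against a real log, pass the lines of the file to `summarize` instead of `SAMPLE`. A non-empty `removal` value marks the point where the remote port was dropped.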
The command.out file shows that the host generated crash logs:
ls -lt /vms/crash/
total 16
-rw-r--r-- 1 root root 322 Feb 21 20:45 kexec_cmd
drwxr-xr-x 2 root root 4096 Feb 16 12:18 201902161218
drwxr-xr-x 2 root root 4096 Feb 14 18:37 201902141809
drwxr-xr-x 2 root root 4096 Feb 14 13:02 201902141302
The crash logs were collected again and analyzed as follows:
PID: 837 TASK: ffff887fd7e50000 CPU: 42 COMMAND: "kworker/42:1"
#0 [ffff887fd7e5b9c0] machine_kexec at ffffffff8105b311
#1 [ffff887fd7e5ba30] crash_kexec at ffffffff8110c358
#2 [ffff887fd7e5bb00] oops_end at ffffffff8101a7a8
#3 [ffff887fd7e5bb30] no_context at ffffffff817eab09
#4 [ffff887fd7e5bb90] __bad_area_nosemaphore at ffffffff817ead0a
#5 [ffff887fd7e5bbf0] bad_area_nosemaphore at ffffffff817ead3c
#6 [ffff887fd7e5bc00] __do_page_fault at ffffffff8106a2df
#7 [ffff887fd7e5bc70] do_page_fault at ffffffff8106a5bb
#8 [ffff887fd7e5bc90] page_fault at ffffffff817fa158
[exception RIP: scsi_device_put+27]
RIP: ffffffff8156e91b RSP: ffff887fd7e5bd48 RFLAGS: 00010296
RAX: 0000000000000000 RBX: ffff885fed47d800 RCX: 000000018200016c
RDX: 000000018200016d RSI: ffffea017faa4500 RDI: ffff885fed47d800
RBP: ffff887fd7e5bd58 R8: ffff885fea914010 R9: 000000018200016c
R10: ffffffff813c62ac R11: dead000000000200 R12: ffff885fef681800
R13: ffff885ff0783000 R14: ffff885ff2ce8860 R15: ffff885ff0783010
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff887fd7e5bd60] scsi_remove_target at ffffffff8157f13a
#10 [ffff887fd7e5bdb0] fc_release_transport at ffffffffc0171ec6 [scsi_transport_fc]
#11 [ffff887fd7e5bdd0] process_one_work at ffffffff81096fc7
#12 [ffff887fd7e5be30] worker_thread at ffffffff81097a5d
#13 [ffff887fd7e5bec0] kthread at ffffffff8109e069
#14 [ffff887fd7e5bf50] ret_from_fork at ffffffff817f85e2
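Based on the logs above, the crash was caused by a null pointer dereference in the SCSI driver: the page fault was raised at scsi_device_put+27, on the scsi_remove_target path that ran in a kworker thread.
This issue is fixed in version E0520. Upgrading CAS to that version resolves it.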
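When comparing several crash dumps such as those listed under /vms/crash/, it can help to extract the faulting symbol and the listed frames from each backtrace automatically. The following is a minimal sketch in Python, assuming the backtrace text has the same layout as the one shown in this case. The function name `parse_backtrace` and the `SAMPLE` text are illustrative assumptions; this is not a feature of the crash utility.

```python
import re

# Frame lines look like: "#9 [ffff887fd7e5bd60] scsi_remove_target at ffffffff8157f13a"
# with an optional trailing "[module]" for code that lives in a kernel module.
FRAME = re.compile(r"#(\d+) \[[0-9a-f]+\] (\S+) at ([0-9a-f]+)(?: \[(\S+)\])?")
# Exception lines look like: "[exception RIP: scsi_device_put+27]"
EXC = re.compile(r"\[exception RIP: (\S+)\+(\d+)\]")


def parse_backtrace(text):
    """Return ((symbol, offset) of the exception RIP, list of (index, symbol, module))."""
    fault = None
    frames = []
    for line in text.splitlines():
        m = EXC.search(line)
        if m:
            fault = (m.group(1), int(m.group(2)))
            continue
        m = FRAME.search(line)
        if m:
            frames.append((int(m.group(1)), m.group(2), m.group(4)))
    return fault, frames


# Four lines copied from the backtrace in this case, used as sample input.
SAMPLE = """\
#8 [ffff887fd7e5bc90] page_fault at ffffffff817fa158
[exception RIP: scsi_device_put+27]
#9 [ffff887fd7e5bd60] scsi_remove_target at ffffffff8157f13a
#10 [ffff887fd7e5bdb0] fc_release_transport at ffffffffc0171ec6 [scsi_transport_fc]
"""

fault, frames = parse_backtrace(SAMPLE)
print("faulting symbol:", fault)
for index, symbol, module in frames:
    print(index, symbol, module or "(core kernel)")
```

If several dumps report the same faulting symbol and the same surrounding frames, that is a reasonable hint that they share one root cause, although it does not replace a full analysis of each dump.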