E0535
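The cluster originally had 4 CVK hosts and was expanded with two more. Starting storage on one of the newly added CVKs failed with: internal error: OCFS2 configuration error, cluster.conf do not match all remote host. please add the storage again or make sure storage is only managed by this CVM.
cvk4 is the newly added host. Its /var/log/ocfs_shell_202007.log shows the following errors at startup: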
2020-07-31 16:39:43 [23727] util_common_tools.py/209L: ERROR:Command 'debugfs.ocfs2 -R "stats -h" /dev/disk/by-id/dm-name-360002ac000000000000000170001e256 |grep "disk-lock" 2>>/dev/null' returned non-zero exit status 1
2020-07-31 16:39:43 [23727] ocfs2_mount_check.py/545L: ERROR:cluster.conf dismatch host 10.166.77.19, node count is different.
2020-07-31 16:39:43 [23727] ocfs2_mount_check.py/545L: ERROR:cluster.conf dismatch host 10.166.77.20, node count is different.
2020-07-31 16:39:43 [23727] ocfs2_mount_check.py/545L: ERROR:cluster.conf dismatch host 10.166.77.18, node count is different.
2020-07-31 16:39:46 [24018] ocfs2_share_filesystem_check.py/103L: ERROR:Command 'grep -w ocfs2 /proc/mounts' returned non-zero exit status 1
2020-07-31 16:39:46 [24018] ocfs2_share_filesystem_check.py/104L: ERROR:Traceback (most recent call last):
  File "ocfs2_share_filesystem_check.py", line 88, in get_mounted_ocfs2_pools
  File "/usr/lib/python2.7/subprocess.py", line 544, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command 'grep -w ocfs2 /proc/mounts' returned non-zero exit status 1
The log indicates that the cluster configuration file on cvk4 does not match the one on the cluster's original CVKs. For example, on cvk1:
cat /etc/ocfs2/cluster.conf
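The file lists only the 4 original CVKs and does not include the newly added node. Synchronizing the file by hand did not help, and startup still failed. Moreover, even after the cluster file was edited, it was reverted as soon as cvk4 was added to the cluster again. /var/log/ocfs_shell_202007.log on cvk1 shows the following errors: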
2020-07-31 16:40:01 [31371] ocfs2_cluster_config.py/678L: WARNING:Nodes [{'name': 'PZH-JB-GGJH-R4900-CVK4', 'type': 'node', 'number': '5', 'ip_port': '7100', 'cluster': 'PZH-JB-GGJH-Hostpool', 'ip_address': '10.166.77.248'}] are missing, will add them
2020-07-31 16:40:01 [31371] ocfs2_cluster_config.py/255L: WARNING:ocfs2 add new node(PZH-JB-GGJH-R4900-CVK4) failed, with errno 251
2020-07-31 16:40:01 [31371] ocfs2_cluster_config.py/681L: ERROR:add nodes failed
2020-07-31 16:40:01 [31371] ocfs2_cluster_config.py/958L: ERROR:process ocfs2_cluster_conf_diff failed.
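When storage is started on cvk4, the system tries to add cvk4 to the cluster configuration file, but the addition fails (errno 251 in the log above).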
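The "node count is different" comparison can be reproduced by hand when checking which hosts hold a stale configuration. The following is a minimal sketch, not the product's own check: it assumes the usual o2cb stanza layout of /etc/ocfs2/cluster.conf (`cluster:` and `node:` headers followed by indented `key = value` lines), and the function names `parse_cluster_conf`, `node_names` and `diff_nodes` are hypothetical helpers introduced here for illustration.

```python
def parse_cluster_conf(text):
    """Parse cluster.conf text into a list of stanza dicts.

    Assumes the o2cb layout: a header line such as "node:" followed by
    indented "key = value" lines. Blank lines and "#" comments are skipped.
    """
    stanzas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith(":") and "=" not in line:
            # Start of a new stanza, e.g. "cluster:" or "node:".
            current = {"type": line[:-1]}
            stanzas.append(current)
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return stanzas


def node_names(text):
    """Return the set of node names declared in a cluster.conf text."""
    return {
        stanza["name"]
        for stanza in parse_cluster_conf(text)
        if stanza["type"] == "node" and "name" in stanza
    }


def diff_nodes(local_text, remote_text):
    """Return (only_in_local, only_in_remote) as sorted lists of node names."""
    local = node_names(local_text)
    remote = node_names(remote_text)
    return sorted(local - remote), sorted(remote - local)
```

To use it, copy the file from two hosts (for example cvk1 and cvk4) and pass both contents to `diff_nodes`. A non-empty result names the nodes that one side is missing. Such a diff only locates the mismatch. In this case editing the file by hand did not fix it.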
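Running uptime on cvk1 showed that the system had been up for more than 400 days. The E0535 version on site was most likely reached by upgrade, and a system with that much uptime was presumably never rebooted after the upgrade.
Without a reboot, the kernel state is not refreshed, which causes errors when nodes are added to the ocfs2 cluster.
After the upgraded hosts were rebooted, storage started normally.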
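Before expanding a cluster, the same uptime check can be run on every existing host. The sketch below assumes a Linux host where the first field of /proc/uptime is the uptime in seconds. The upgrade time is not recorded anywhere in this case, so it is a value the operator must supply, and `uptime_days` and `rebooted_since` are hypothetical helper names.

```python
from datetime import datetime, timedelta


def uptime_days(proc_uptime_text):
    """Convert the content of /proc/uptime into days of uptime."""
    # The first whitespace-separated field is the uptime in seconds.
    return float(proc_uptime_text.split()[0]) / 86400.0


def rebooted_since(proc_uptime_text, upgrade_time, now):
    """Return True if the host booted at or after upgrade_time.

    upgrade_time is supplied by the operator (an assumption of this
    sketch); now is passed in explicitly so the result is reproducible.
    """
    boot_time = now - timedelta(seconds=float(proc_uptime_text.split()[0]))
    return boot_time >= upgrade_time


if __name__ == "__main__":
    with open("/proc/uptime") as handle:
        content = handle.read()
    print("uptime in days: %.1f" % uptime_days(content))
```

A host for which `rebooted_since` returns False has been running since before the upgrade and is a candidate for the reboot described above.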