The array configuration below is provided for reference; all of the steps that follow are based on this setup.
Input:
# storcli64 /c0 show
Output:
Generating detailed summary of the adapter, it may take a while to complete.
CLI Version = 007.1705.0000.0000 Mar 31, 2021
Operating system = Linux 3.10.0-1160.el7.x86_64
Controller = 0
Status = Success
Description = None
Product Name = AVAGO MegaRAID SAS 9361-8i 2GB
Serial Number = SK71678374
SAS Address = 500605b00d0efc70
PCI Address = 00:17:00:00
System Time = 07/18/2024 10:47:11
Mfg. Date = 04/21/17
Controller Time = 07/18/2024 10:47:10
FW Package Build = 24.21.0-0012
BIOS Version = 6.36.00.0_4.19.08.00_0x06180200
FW Version = 4.680.00-8249
Driver Name = megaraid_sas
Driver Version = 07.714.04.00-rh1
Current Personality = RAID-Mode
Vendor Id = 0x1000
Device Id = 0x5D
SubVendor Id = 0x1000
SubDevice Id = 0x9361
Host Interface = PCI-E
Device Interface = SAS-12G
Bus Number = 23
Device Number = 0
Function Number = 0
Domain ID = 0
Security Protocol = None
Drive Groups = 2
TOPOLOGY :
========
-----------------------------------------------------------------------------
DG Arr Row EID:Slot DID Type State BT Size PDC PI SED DS3 FSpace TR
-----------------------------------------------------------------------------
0 - - - - RAID5 Optl N 1.307 TB dflt N N dflt N N
0 0 - - - RAID5 Optl N 1.307 TB dflt N N dflt N N
0 0 0 252:0 14 DRIVE Onln N 446.625 GB dflt N N dflt - N
0 0 1 252:1 15 DRIVE Onln N 446.625 GB dflt N N dflt - N
0 0 2 252:2 17 DRIVE Onln N 446.625 GB dflt N N dflt - N
0 0 3 252:3 18 DRIVE Onln N 446.625 GB dflt N N dflt - N
1 - - - - RAID5 Optl N 2.618 TB dflt N N dflt N N
1 0 - - - RAID5 Optl N 2.618 TB dflt N N dflt N N
1 0 0 252:4 19 DRIVE Onln N 893.750 GB dflt N N dflt - N
1 0 1 252:5 20 DRIVE Onln N 893.750 GB dflt N N dflt - N
1 0 2 252:6 21 DRIVE Onln N 893.750 GB dflt N N dflt - N
1 0 3 252:7 16 DRIVE Onln N 893.750 GB dflt N N dflt - N
-----------------------------------------------------------------------------
DG=Disk Group Index|Arr=Array Index|Row=Row Index|EID=Enclosure Device ID
DID=Device ID|Type=Drive Type|Onln=Online|Rbld=Rebuild|Optl=Optimal|Dgrd=Degraded
Pdgd=Partially degraded|Offln=Offline|BT=Background Task Active
PDC=PD Cache|PI=Protection Info|SED=Self Encrypting Drive|Frgn=Foreign
DS3=Dimmer Switch 3|dflt=Default|Msng=Missing|FSpace=Free Space Present
TR=Transport Ready
Virtual Drives = 2
VD LIST :
=======
-------------------------------------------------------------
DG/VD TYPE State Access Consist Cache Cac sCC Size Name
-------------------------------------------------------------
0/0 RAID5 Optl RW No RWTD - ON 1.307 TB TS_1
1/1 RAID5 Optl RW No RWTD - ON 2.618 TB TS_3
-------------------------------------------------------------
VD=Virtual Drive| DG=Drive Group|Rec=Recovery
Cac=CacheCade|OfLn=OffLine|Pdgd=Partially Degraded|Dgrd=Degraded
Optl=Optimal|dflt=Default|RO=Read Only|RW=Read Write|HD=Hidden|TRANS=TransportReady
B=Blocked|Consist=Consistent|R=Read Ahead Always|NR=No Read Ahead|WB=WriteBack
AWB=Always WriteBack|WT=WriteThrough|C=Cached IO|D=Direct IO|sCC=Scheduled
Check Consistency
Physical Drives = 8
PD LIST :
=======
----------------------------------------------------------------------------------------
EID:Slt DID State DG Size Intf Med SED PI SeSz Model Sp Type
----------------------------------------------------------------------------------------
252:0 14 Onln 0 446.625 GB SATA SSD N N 512B SAMSUNG MZ7L3480HCHQ-00B7C U -
252:1 15 Onln 0 446.625 GB SATA SSD N N 512B SAMSUNG MZ7L3480HCHQ-00B7C U -
252:2 17 Onln 0 446.625 GB SATA SSD N N 512B SAMSUNG MZ7L3480HCHQ-00B7C U -
252:3 18 Onln 0 446.625 GB SATA SSD N N 512B SAMSUNG MZ7L3480HCHQ-00B7C U -
252:4 19 Onln 1 893.750 GB SATA SSD N N 512B Micron_5300_MTFDDAK960TDT U -
252:5 20 Onln 1 893.750 GB SATA SSD N N 512B Micron_5300_MTFDDAK960TDT U -
252:6 21 Onln 1 893.750 GB SATA SSD N N 512B Micron_5300_MTFDDAK960TDT U -
252:7 16 Onln 1 893.750 GB SATA SSD N N 512B Micron_5300_MTFDDAK960TDT U -
----------------------------------------------------------------------------------------
EID=Enclosure Device ID|Slt=Slot No|DID=Device ID|DG=DriveGroup
DHS=Dedicated Hot Spare|UGood=Unconfigured Good|GHS=Global Hotspare
UBad=Unconfigured Bad|Sntze=Sanitize|Onln=Online|Offln=Offline|Intf=Interface
Med=Media Type|SED=Self Encryptive Drive|PI=Protection Info
SeSz=Sector Size|Sp=Spun|U=Up|D=Down|T=Transition|F=Foreign
UGUnsp=UGood Unsupported|UGShld=UGood shielded|HSPShld=Hotspare shielded
CFShld=Configured shielded|Cpybck=CopyBack|CBShld=Copyback Shielded
UBUnsp=UBad Unsupported|Rbld=Rebuild
Enclosures = 1
Enclosure LIST :
==============
--------------------------------------------------------------------
EID State Slots PD PS Fans TSs Alms SIM Port# ProdID VendorSpecific
--------------------------------------------------------------------
252 OK 8 8 0 0 0 0 1 - SGPIO
--------------------------------------------------------------------
EID=Enclosure Device ID | PD=Physical drive count | PS=Power Supply count
TSs=Temperature sensor count | Alms=Alarm count | SIM=SIM Count | ProdID=Product ID
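The DID column of the PD LIST table is the number that smartctl needs later, so it can help to pull the slot-to-DID mapping out of the storcli output programmatically. The following is a minimal Python sketch under stated assumptions: the function name `parse_pd_rows` and the sample constant are hypothetical, the sample rows are copied from the output above, and the regular expression only assumes that a PD row begins with `EID:Slt DID State DG`.

```python
import re

# Two rows copied from the "PD LIST" table above (illustrative sample only).
PD_LIST_SAMPLE = """\
252:0 14 Onln 0 446.625 GB SATA SSD N N 512B SAMSUNG MZ7L3480HCHQ-00B7C U -
252:7 16 Onln 1 893.750 GB SATA SSD N N 512B Micron_5300_MTFDDAK960TDT U -
"""

# Assumption: a PD row starts with "EID:Slt DID State DG"; the rest is ignored.
PD_ROW = re.compile(r"^(\d+):(\d+)\s+(\d+)\s+(\S+)\s+(\S+)\s")


def parse_pd_rows(text):
    """Return one dict per physical-drive row found in storcli output."""
    drives = []
    for line in text.splitlines():
        match = PD_ROW.match(line.strip())
        if not match:
            continue  # headers, separators, topology rows, legends
        eid, slot, did, state, dg = match.groups()
        drives.append({
            "eid": int(eid),
            "slot": int(slot),
            "did": int(did),
            "state": state,
            "dg": dg,  # kept as text because unconfigured drives show "-"
        })
    return drives


if __name__ == "__main__":
    for drive in parse_pd_rows(PD_LIST_SAMPLE):
        print(drive["eid"], drive["slot"], drive["did"], drive["state"])
```

Note that slot order and DID order do not have to match: in the output above, slot 7 has DID 16, which is why the DID should be read from the table rather than guessed from the slot number.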
2.3.2. Install the required smartmontools package (CentOS Linux 7.9 is used as the example)
yum install -y smartmontools
2.3.3. Use smartctl to look up the DID (Device ID) of each disk behind the LSI RAID controller
Input:
# smartctl --scan
Output:
/dev/sda -d scsi # /dev/sda, SCSI device # RAID logical drive as presented to the OS
/dev/sdb -d scsi # /dev/sdb, SCSI device # RAID logical drive as presented to the OS
/dev/bus/0 -d megaraid,14 # /dev/bus/0 [megaraid_disk_14], SCSI device # compare with the array information above: every entry containing "megaraid" is a member disk of the RAID controller
/dev/bus/0 -d megaraid,15 # /dev/bus/0 [megaraid_disk_15], SCSI device # same as above
/dev/bus/0 -d megaraid,16 # /dev/bus/0 [megaraid_disk_16], SCSI device # same as above
/dev/bus/0 -d megaraid,17 # /dev/bus/0 [megaraid_disk_17], SCSI device # same as above
/dev/bus/0 -d megaraid,18 # /dev/bus/0 [megaraid_disk_18], SCSI device # same as above
/dev/bus/0 -d megaraid,19 # /dev/bus/0 [megaraid_disk_19], SCSI device # same as above
/dev/bus/0 -d megaraid,20 # /dev/bus/0 [megaraid_disk_20], SCSI device # same as above
/dev/bus/0 -d megaraid,21 # /dev/bus/0 [megaraid_disk_21], SCSI device # same as above
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device # NVMe disk attached to the motherboard
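The scan output can be turned into the per-disk smartctl commands used in the next step. The sketch below is one possible way to do that in Python; `megaraid_members` and `smartctl_command` are hypothetical helper names, and the sample text is a shortened copy of the scan output above without the annotations. It only builds the argument lists and does not run smartctl.

```python
import re

# Shortened copy of the `smartctl --scan` output above (illustrative sample).
SCAN_SAMPLE = """\
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/bus/0 -d megaraid,14 # /dev/bus/0 [megaraid_disk_14], SCSI device
/dev/bus/0 -d megaraid,16 # /dev/bus/0 [megaraid_disk_16], SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device
"""

# Matches "<device> -d megaraid,<DID>" at the start of a scan line.
SCAN_LINE = re.compile(r"^(\S+)\s+-d\s+megaraid,(\d+)\b")


def megaraid_members(scan_text):
    """Return (device, did) tuples for every megaraid entry in the scan."""
    members = []
    for line in scan_text.splitlines():
        match = SCAN_LINE.match(line)
        if match:
            members.append((match.group(1), int(match.group(2))))
    return members


def smartctl_command(device, did):
    """Build the argument list for querying one member disk."""
    return ["smartctl", "-a", "-d", f"megaraid,{did}", device]


if __name__ == "__main__":
    for device, did in megaraid_members(SCAN_SAMPLE):
        print(" ".join(smartctl_command(device, did)))
```

The resulting lists can be passed to `subprocess.run` on a host where smartmontools is installed and the script runs with root privileges.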
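2.3.4. View the SMART information of a single disk in the array (the "Vendor Specific SMART Attributes with Thresholds" output may differ between drive brands and even between models of the same brand)
Samsung SSD output
Input: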
# smartctl -a -d megaraid,14 /dev/bus/0
Output:
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1160.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, ***.***
=== START OF INFORMATION SECTION ===
Device Model: SAMSUNG MZ7L3480HCHQ-00B7C
Serial Number: S6KLNE0RC20239
LU WWN Device Id: 5 002538 f01c16920
Firmware Version: JXTC104Q
User Capacity: 480,103,981,056 bytes [480 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Thu Jul 18 10:56:51 2024 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Status not supported: ATA return descriptor not supported by controller firmware
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x53) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 35) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 6753
12 Power_Cycle_Count 0x0032 099 099 000 Old_age Always - 535
177 Wear_Leveling_Count 0x0013 099 099 005 Pre-fail Always - 43
179 Used_Rsvd_Blk_Cnt_Tot 0x0013 100 100 010 Pre-fail Always - 0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0013 100 100 010 Pre-fail Always - 439
181 Program_Fail_Cnt_Total 0x0032 100 100 010 Old_age Always - 0
182 Erase_Fail_Count_Total 0x0032 100 100 010 Old_age Always - 0
183 Runtime_Bad_Block 0x0013 100 100 010 Pre-fail Always - 0
184 End-to-End_Error 0x0033 100 100 097 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0032 063 057 000 Old_age Always - 37
194 Temperature_Celsius 0x0022 063 057 000 Old_age Always - 37 (Min/Max 28/43)
195 Hardware_ECC_Recovered 0x001a 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x003e 100 100 000 Old_age Always - 0
202 Unknown_SSD_Attribute 0x0033 100 100 010 Pre-fail Always - 0
235 Unknown_Attribute 0x0012 099 099 000 Old_age Always - 472
241 Total_LBAs_Written 0x0032 099 099 000 Old_age Always - 10401110563
242 Total_LBAs_Read 0x0032 099 099 000 Old_age Always - 20001863403
243 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
244 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
245 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 65535
246 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 65535
247 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 65535
251 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 19410772672
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 11 -
# 2 Short offline Completed without error 00% 3 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
256 0 65535 Read_scanning was never started
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Micron SSD output
Input:
# smartctl -a -d megaraid,21 /dev/bus/0
Output:
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1160.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, ***.***
=== START OF INFORMATION SECTION ===
Device Model: Micron_5300_MTFDDAK960TDT
Serial Number: 203129A48D1E
LU WWN Device Id: 5 00a075 129a48d1e
Firmware Version: D3CM004
User Capacity: 960,197,124,096 bytes [960 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-4 (minor revision not indicated)
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Thu Jul 18 10:57:39 2024 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Status not supported: ATA return descriptor not supported by controller firmware
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 1953) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 8) minutes.
Conveyance self-test routine
recommended polling time: ( 3) minutes.
SCT capabilities: (0x0035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 050 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0032 100 100 001 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 7242
12 Power_Cycle_Count 0x0032 100 100 001 Old_age Always - 838
170 Unknown_Attribute 0x0033 100 100 010 Pre-fail Always - 0
171 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
172 Unknown_Attribute 0x0032 100 100 001 Old_age Always - 0
173 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 4
174 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 788
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 1737
194 Temperature_Celsius 0x0022 070 049 000 Old_age Always - 30 (Min/Max 14/51)
195 Hardware_ECC_Recovered 0x0032 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 100 100 000 Old_age Always - 1
202 Unknown_SSD_Attribute 0x0030 100 100 001 Old_age Offline - 0
206 Unknown_SSD_Attribute 0x000e 100 100 000 Old_age Always - 0
246 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 13168050367
247 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 411532331
248 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 87594730
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033 100 100 000 Pre-fail Always - 9242
210 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
211 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 784
212 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 7 -
# 2 Short offline Completed without error 00% 2 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
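Because the attribute IDs and names differ between the Samsung and Micron drives, a script that compares them usually needs the attribute table in structured form. The following Python sketch shows one way to parse the rows; the function name `parse_attributes` is hypothetical, the sample rows are copied from the Samsung output above, and the parser assumes the ten-column layout shown in the header line.

```python
# Three rows copied from the Samsung attribute table above (illustrative sample).
ATTR_SAMPLE = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 6753
194 Temperature_Celsius 0x0022 063 057 000 Old_age Always - 37 (Min/Max 28/43)
"""


def parse_attributes(text):
    """Map attribute ID to its fields for every attribute row in the text."""
    attrs = {}
    for line in text.splitlines():
        # Split into at most 10 fields so RAW_VALUE keeps any embedded spaces.
        parts = line.split(None, 9)
        if len(parts) != 10:
            continue
        # Attribute rows start with a numeric ID and carry a hex FLAG.
        if not parts[0].isdigit() or not parts[2].startswith("0x"):
            continue
        attrs[int(parts[0])] = {
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "type": parts[6],
            "when_failed": parts[8],
            "raw": parts[9].strip(),
        }
    return attrs


if __name__ == "__main__":
    for attr_id, fields in sorted(parse_attributes(ATTR_SAMPLE).items()):
        print(attr_id, fields["name"], fields["value"], fields["raw"])
```

The same function can be fed the complete `smartctl -a` output; lines from the self-test and selective self-test logs are skipped because they do not have a numeric first field followed by a hex flag.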
2.3.5. Notes on the "Vendor Specific SMART Attributes with Thresholds" output
For details, see the description of the "-A" option in "man smartctl".
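As a rough illustration of how the VALUE, WORST and THRESH columns relate, the man page describes an attribute as failed when its normalized value is less than or equal to its threshold. The Python sketch below encodes that comparison; the function name and the returned labels are hypothetical (smartctl itself reports this in the WHEN_FAILED column), and treating a threshold of 0 as "cannot fail" is an assumption made for this sketch.

```python
def attribute_state(value, worst, thresh):
    """Classify one attribute from its normalized VALUE, WORST and THRESH."""
    if thresh == 0:
        # Assumption for this sketch: a zero threshold never triggers a failure.
        return "ok"
    if value <= thresh:
        return "failing_now"
    if worst <= thresh:
        return "failed_in_the_past"
    return "ok"


if __name__ == "__main__":
    # Samsung drive above, ID 177 Wear_Leveling_Count: VALUE 099, WORST 099, THRESH 005
    print(attribute_state(99, 99, 5))
```

For Pre-fail attributes a failed state points to an imminent drive failure, while for Old_age attributes it indicates wear; the RAW_VALUE column is vendor defined and should be interpreted with the drive vendor's documentation.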