Checking a hard disk

1.oops ~ # smartctl -H /dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.3.1-gentoo] (local build)

Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: FAILED! <---- the verdict follows "result:". PASSED means the disk is healthy; if it shows FAILED, as it does here, replace the server's disk as soon as possible

Drive failure expected in less than 24 hours. SAVE ALL DATA.

Failed Attributes:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE

  5 Reallocated_Sector_Ct 0x0033 001 001 036 Pre-fail Always FAILING_NOW 4095
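
For scripted monitoring, the verdict can be pulled out of the smartctl -H output. This is a minimal sketch that assumes the output format shown above; the function name smart_health is made up for illustration and is not part of smartmontools.

```shell
# Read `smartctl -H` output on stdin and print the overall-health verdict
# (for example PASSED or FAILED). Prints nothing if the verdict line is absent.
# smart_health is a hypothetical helper name.
smart_health() {
    sed -n 's/^SMART overall-health self-assessment test result: \([A-Za-z]*\).*$/\1/p'
}
```

Typical use would be: smartctl -H /dev/sdb | smart_health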

2.oops ~ # smartctl -C -t short /dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.3.1-gentoo] (local build)

Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===

Sending command: "Execute SMART Short self-test routine immediately in captive mode".

Drive command "Execute SMART Short self-test routine immediately in captive mode" successful.

Testing has begun.

Please wait 1 minutes for test to complete.

Test will complete after Tue Nov 5 12:19:54 2013
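
A script that starts a self-test needs to know how long to wait before reading the log. The sketch below extracts the waiting time from the message shown above; it assumes that wording stays the same, and selftest_wait_minutes is a hypothetical name.

```shell
# Read the output of `smartctl -t ...` on stdin and print the number of
# minutes smartctl asks the user to wait before the test completes.
# selftest_wait_minutes is a hypothetical helper name.
selftest_wait_minutes() {
    sed -n 's/^Please wait \([0-9][0-9]*\) minute.*$/\1/p'
}
```

The printed number can be fed to sleep (multiplied by 60) before running smartctl -l selftest.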

3.oops ~ # smartctl -l selftest /dev/sdb2

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.3.1-gentoo] (local build)

Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===

SMART Self-test log structure revision number 1

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error

# 1 Short captive Completed: unknown failure 90% 36265 0

# 2 Short captive Completed: unknown failure 90% 36265 0

4.oops ~ # smartctl -l error /dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.3.1-gentoo] (local build)

Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===

SMART Error Log Version: 1

ATA Error Count: 94 (device log contains only the most recent five errors)

 CR = Command Register [HEX]

 FR = Features Register [HEX]

 SC = Sector Count Register [HEX]

 SN = Sector Number Register [HEX]

 CL = Cylinder Low Register [HEX]

 CH = Cylinder High Register [HEX]

 DH = Device/Head Register [HEX]

 DC = Device Command Register [HEX]

 ER = Error register [HEX]

 ST = Status register [HEX]

Powered_Up_Time is measured from power on, and printed as

DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,

SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 94 occurred at disk power-on lifetime: 36263 hours (1510 days + 23 hours)

  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  04 71 04 9d 00 32 e0 Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

  -- -- -- -- -- -- -- -- ---------------- --------------------

  a1 00 00 00 00 00 a0 00 00:14:55.632 IDENTIFY PACKET DEVICE

  ec 00 00 00 00 00 a0 00 00:14:55.631 IDENTIFY DEVICE

  ff 00 00 00 00 00 00 00 00:14:55.480 [VENDOR SPECIFIC]

  00 00 00 00 00 00 00 04 00:14:55.480 NOP [Abort queued commands]

  ff 00 00 00 00 00 00 00 00:14:55.173 [VENDOR SPECIFIC]

Error 93 occurred at disk power-on lifetime: 36263 hours (1510 days + 23 hours)

  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  04 71 04 9d 00 32 e0

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

  -- -- -- -- -- -- -- -- ---------------- --------------------

  ec 00 00 00 00 00 a0 00 00:14:55.631 IDENTIFY DEVICE

  ff 00 00 00 00 00 00 00 00:14:55.480 [VENDOR SPECIFIC]

  00 00 00 00 00 00 00 04 00:14:55.480 NOP [Abort queued commands]

  ff 00 00 00 00 00 00 00 00:14:55.173 [VENDOR SPECIFIC]

  00 00 00 00 00 00 00 ff 00:14:55.173 NOP [Abort queued commands]

Error 92 occurred at disk power-on lifetime: 36263 hours (1510 days + 23 hours)

  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  04 71 04 9d 00 32 e0 Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

  -- -- -- -- -- -- -- -- ---------------- --------------------

  a1 00 00 00 00 00 a0 00 00:14:50.171 IDENTIFY PACKET DEVICE

  ec 00 00 00 00 00 a0 00 00:14:50.171 IDENTIFY DEVICE

  ff 00 00 00 00 00 00 00 00:14:50.020 [VENDOR SPECIFIC]

  00 00 00 00 00 00 00 04 00:14:50.020 NOP [Abort queued commands]

  ff 00 00 00 00 00 00 00 00:14:49.713 [VENDOR SPECIFIC]

Error 91 occurred at disk power-on lifetime: 36263 hours (1510 days + 23 hours)

  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  04 71 04 9d 00 32 e0

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

  -- -- -- -- -- -- -- -- ---------------- --------------------

  ec 00 00 00 00 00 a0 00 00:14:50.171 IDENTIFY DEVICE

  ff 00 00 00 00 00 00 00 00:14:50.020 [VENDOR SPECIFIC]

  00 00 00 00 00 00 00 04 00:14:50.020 NOP [Abort queued commands]

  ff 00 00 00 00 00 00 00 00:14:49.713 [VENDOR SPECIFIC]

  00 00 00 00 00 00 00 ff 00:14:49.713 NOP [Abort queued commands]

Error 90 occurred at disk power-on lifetime: 36263 hours (1510 days + 23 hours)

  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  04 71 04 9d 00 32 e0 Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

  -- -- -- -- -- -- -- -- ---------------- --------------------

  a1 00 00 00 00 00 a0 00 00:14:49.686 IDENTIFY PACKET DEVICE

  ec 00 00 00 00 00 a0 00 00:14:49.661 IDENTIFY DEVICE

  2f 00 01 10 00 00 a0 00 00:14:49.632 READ LOG EXT

  60 00 00 ff ff ff 4f 00 00:14:46.851 READ FPDMA QUEUED

  60 00 00 ff ff ff 4f 00 00:14:46.847 READ FPDMA QUEUED

5.smartctl -a /dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.3.1-gentoo] (local build)

Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===

Model Family: Seagate Barracuda 7200.11

Device Model: ST31000333AS

Serial Number: 9TE19F5L

LU WWN Device Id: 5 000c50 010aad5e2

Firmware Version: CC1H

User Capacity: 1,000,204,886,016 bytes [1.00 TB]

Sector Size: 512 bytes logical/physical

Rotation Rate: 7200 rpm

Device is: In smartctl database [for details use: -P show]

ATA Version is: ATA8-ACS T13/1699-D revision 4

SATA Version is: SATA 2.6, 3.0 Gb/s

Local Time is: Tue Nov 5 11:48:20 2013 CST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: FAILED!

Drive failure expected in less than 24 hours. SAVE ALL DATA.

See vendor-specific Attribute list for failed Attributes.

General SMART Values:

Offline data collection status: (0x82) Offline data collection activity

     was completed without error.

     Auto Offline Data Collection: Enabled.

Self-test execution status: ( 73) The previous self-test completed having

     a test element that failed and the test

     element that failed is not known.

Total time to complete Offline

data collection: ( 625) seconds.

Offline data collection

capabilities: (0x7b) SMART execute Offline immediate.

     Auto Offline data collection on/off support.

     Suspend Offline collection upon new

     command.

     Offline surface scan supported.

     Self-test supported.

     Conveyance Self-test supported.

     Selective Self-test supported.

SMART capabilities: (0x0003) Saves SMART data before entering

     power-saving mode.

     Supports SMART auto save timer.

Error logging capability: (0x01) Error logging supported.

     General Purpose Logging supported.

Short self-test routine

recommended polling time: ( 1) minutes.

Extended self-test routine

recommended polling time: ( 208) minutes.

Conveyance self-test routine

recommended polling time: ( 2) minutes.

SCT capabilities: (0x103f) SCT Status supported.

     SCT Error Recovery Control supported.

     SCT Feature Control supported.

     SCT Data Table supported.

SMART Attributes Data Structure revision number: 10

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate 0x000f 116 099 006 Pre-fail Always - 114128654

  3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0

  4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 85

  5 Reallocated_Sector_Ct 0x0033 001 001 036 Pre-fail Always FAILING_NOW 4095 <-------

  7 Seek_Error_Rate 0x000f 084 060 030 Pre-fail Always - 281473872

  9 Power_On_Hours 0x0032 059 059 000 Old_age Always - 36265

 10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0

 12 Power_Cycle_Count 0x0032 100 037 020 Old_age Always - 85

184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0

187 Reported_Uncorrect 0x0032 090 090 000 Old_age Always - 10

188 Command_Timeout 0x0032 100 097 000 Old_age Always - 73015558161

189 High_Fly_Writes 0x003a 001 001 000 Old_age Always - 856

190 Airflow_Temperature_Cel 0x0022 070 054 045 Old_age Always - 30 (Min/Max 23/31)

194 Temperature_Celsius 0x0022 030 046 000 Old_age Always - 30 (0 16 0 0 0)

195 Hardware_ECC_Recovered 0x001a 052 022 000 Old_age Always - 114128654

197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0

198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0

199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0

240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 269680996540835

241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 481731982

242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 1978196986

SMART Error Log Version: 1

ATA Error Count: 96 (device log contains only the most recent five errors)

 CR = Command Register [HEX]

 FR = Features Register [HEX]

 SC = Sector Count Register [HEX]

 SN = Sector Number Register [HEX]

 CL = Cylinder Low Register [HEX]

 CH = Cylinder High Register [HEX]

 DH = Device/Head Register [HEX]

 DC = Device Command Register [HEX]

 ER = Error register [HEX]

 ST = Status register [HEX]

Powered_Up_Time is measured from power on, and printed as

DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,

SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 96 occurred at disk power-on lifetime: 36265 hours (1511 days + 1 hours)

  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  04 51 00 00 00 00 00 Error: ABRT

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

  -- -- -- -- -- -- -- -- ---------------- --------------------

  b0 d4 00 81 4f c2 00 00 01:51:33.197 SMART EXECUTE OFF-LINE IMMEDIATE

  b0 d0 01 00 4f c2 00 00 01:51:33.112 SMART READ DATA

  ec 00 01 00 00 00 00 00 01:51:33.102 IDENTIFY DEVICE

  ec 00 01 00 00 00 00 00 01:51:33.100 IDENTIFY DEVICE

  b0 d5 01 01 4f c2 00 00 01:49:22.563 SMART READ LOG

Error 95 occurred at disk power-on lifetime: 36265 hours (1511 days + 1 hours)

  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  04 51 00 00 00 00 00 Error: ABRT

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

  -- -- -- -- -- -- -- -- ---------------- --------------------

  b0 d4 00 81 4f c2 00 00 01:48:43.940 SMART EXECUTE OFF-LINE IMMEDIATE

  b0 d0 01 00 4f c2 00 00 01:48:43.876 SMART READ DATA

  ec 00 01 00 00 00 00 00 01:48:43.866 IDENTIFY DEVICE

  ec 00 01 00 00 00 00 00 01:48:43.865 IDENTIFY DEVICE

  b0 d5 01 01 4f c2 00 00 01:45:49.010 SMART READ LOG

Error 94 occurred at disk power-on lifetime: 36263 hours (1510 days + 23 hours)

  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  04 71 04 9d 00 32 e0 Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

  -- -- -- -- -- -- -- -- ---------------- --------------------

  a1 00 00 00 00 00 a0 00 00:14:55.632 IDENTIFY PACKET DEVICE

  ec 00 00 00 00 00 a0 00 00:14:55.631 IDENTIFY DEVICE

  ff 00 00 00 00 00 00 00 00:14:55.480 [VENDOR SPECIFIC]

  00 00 00 00 00 00 00 04 00:14:55.480 NOP [Abort queued commands]

  ff 00 00 00 00 00 00 00 00:14:55.173 [VENDOR SPECIFIC]

Error 93 occurred at disk power-on lifetime: 36263 hours (1510 days + 23 hours)

  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  04 71 04 9d 00 32 e0

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

  -- -- -- -- -- -- -- -- ---------------- --------------------

  ec 00 00 00 00 00 a0 00 00:14:55.631 IDENTIFY DEVICE

  ff 00 00 00 00 00 00 00 00:14:55.480 [VENDOR SPECIFIC]

  00 00 00 00 00 00 00 04 00:14:55.480 NOP [Abort queued commands]

  ff 00 00 00 00 00 00 00 00:14:55.173 [VENDOR SPECIFIC]

  00 00 00 00 00 00 00 ff 00:14:55.173 NOP [Abort queued commands]

Error 92 occurred at disk power-on lifetime: 36263 hours (1510 days + 23 hours)

  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  04 71 04 9d 00 32 e0 Device Fault; Error: ABRT

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

  -- -- -- -- -- -- -- -- ---------------- --------------------

  a1 00 00 00 00 00 a0 00 00:14:50.171 IDENTIFY PACKET DEVICE

  ec 00 00 00 00 00 a0 00 00:14:50.171 IDENTIFY DEVICE

  ff 00 00 00 00 00 00 00 00:14:50.020 [VENDOR SPECIFIC]

  00 00 00 00 00 00 00 04 00:14:50.020 NOP [Abort queued commands]

  ff 00 00 00 00 00 00 00 00:14:49.713 [VENDOR SPECIFIC]

SMART Self-test log structure revision number 1

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error

# 1 Short captive Completed: unknown failure 90% 36265 0

# 2 Short captive Completed: unknown failure 90% 36265 0

# 3 Extended offline Completed: unknown failure 90% 36265 0

# 4 Extended offline Completed: unknown failure 90% 36264 0

# 5 Extended offline Completed: unknown failure 90% 36264 0

SMART Selective self-test log data structure revision number 1

 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS

    1 0 0 Not_testing

    2 0 0 Not_testing

    3 0 0 Not_testing

    4 0 0 Not_testing

    5 0 0 Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.
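
In a long report like this one, the attribute rows that matter most are those whose WHEN_FAILED column is not "-". The sketch below filters them out; it assumes the ten-column attribute table layout shown above, and failed_attributes is a hypothetical name.

```shell
# Read `smartctl -A` (or -a) output on stdin and print "ID NAME WHEN_FAILED"
# for every attribute row whose WHEN_FAILED column (field 9) is not "-".
# Rows are recognised by a numeric ID in field 1 and a 0x... flag in field 3.
# failed_attributes is a hypothetical helper name.
failed_attributes() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && NF >= 10 && $9 != "-" { print $1, $2, $9 }'
}
```

Typical use would be: smartctl -A /dev/sdb | failed_attributes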

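#smartctl -A /dev/sda <-- show the disk's SMART attributes in detail

#smartctl -s on /dev/sda <-- enable SMART if it is not already turned on

#smartctl -t short /dev/sda <-- test the disk in the background; takes a short time

#smartctl -t long /dev/sda <-- test the disk in the background; takes a long time

#smartctl -C -t short /dev/sda <-- test the disk in the foreground (captive mode); takes a short time

#smartctl -C -t long /dev/sda <-- test the disk in the foreground (captive mode); takes a long time. All of these simply run the disk's built-in SMART self-test routines.

#smartctl -X /dev/sda <-- abort a background test

#smartctl -l selftest /dev/sda <-- show the self-test log

#smartctl -l error /dev/sda <-- show the disk's error log summary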
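
Whether SMART needs to be switched on can be read from the "SMART support is:" lines that smartctl prints in its information section. A small sketch, assuming that wording; smart_is_enabled is a hypothetical name.

```shell
# Read `smartctl -i` (or -a) output on stdin and succeed (exit status 0)
# only if it contains the line "SMART support is: Enabled".
# smart_is_enabled is a hypothetical helper name.
smart_is_enabled() {
    grep -q '^SMART support is: *Enabled'
}
```

Typical use would be: smartctl -i /dev/sda | smart_is_enabled || smartctl -s on /dev/sda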

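Bad sector repair

Check: smartctl -l selftest /dev/sda

Unmount: umount /dev/sda*

Repair: badblocks /dev/sda (run like this, badblocks only performs a read-only scan that locates the bad blocks)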

badblocks -s -v -c 32 /dev/sdb

Checking blocks 0 to 976762583

Checking for bad blocks (read-only test): 0.20% done, 0:17 elapsed. (0/0/0 errors)

The scan can be stopped midway and later restarted from a given block.

sudo badblocks -s -v -c 32 /dev/sd* 976762583 125637824 (note that the last block comes first and the first block second)

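Repairing the bad sectors

If they are only logical bad sectors, you can

run fsck directly

fsck -a /dev/sdb

or reformat.

If they are physical bad sectors, you need to

a. Back up the data on the disk

b. Delete all partitions on the disk <-- so the bad blocks can be isolated

c. From the position and size of the bad blocks, estimate how much space they occupy, then repartition so that the damaged area is left out. btw: bad sectors tend to spread, so leave out as much extra space around them as you can.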

http://www.kbnix.com/2013/07/17/check_and_repari_linux_disk_bad_block

http://www.linuxidc.com/Linux/2012-07/65723.htm

2) Dealing with a partition that has some bad blocks

      #debugreiserfs /dev/sda1 | grep -i 'blocksize'

      #badblocks -n -b 4096 -o badblocksfile /dev/sda1 <-- record the bad blocks

      #mkfs.reiserfs --badblocks badblocksfile /dev/sda1

      #reiserfsck --fix-fixable --badblocks badblocksfile /dev/sda1

badblocks -w is a destructive check (it overwrites the data on the device)

badblocks -b 4096 -c 16 /dev/sda3 -o 1.txt

1.oops ~ # badblocks -s -b 4096 -c 16 /dev/sdb1 -o bad.sdb1

Checking for bad blocks (read-only test): 13.34% done, 2:24 elapsed. (0/0/0 errors)

2.oops ~ # badblocks -s -b 4096 -c 16 /dev/sdb2 -o bad.sdb2

Checking for bad blocks (read-only test): 0.00% done, 0:22 elapsed. (0/0/0 errors)

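This means: use a block size of 4 KB and test 16 blocks at a time (-c is the number of blocks tested per batch, not a repeat count),

and write the results to 1.txt. If the disk is healthy, 1.txt stays empty. If the disk is large, add the -s option to show progress.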

http://linux.anheng.com.cn/news/html/net_admin_blog/linux_badblocks_online_fix.html

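badblocks can locate the bad blocks, and since badblocks also has a write-test mode, it is the only tool we need.

Because we are not passing a bad-block list up to the filesystem, the block size option (-b) is not strictly needed when scanning; if it is set, as in the commands here, every later badblocks and dd command must use the same block size.

First, scan for bad blocks: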

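badblocks -b 4096 -o /root/sdb.bad /dev/sdb

After a long wait we get the file /root/sdb.bad:

16435904

sdb has 1 bad block.

First try to back up the bad block with dd:

dd if=/dev/sdb bs=4096 skip=16435904 of=/tmp/16435904.dat count=1

If dd reports 0 bytes read, try a few more times. If it still fails, the data in that block is probably lost, but there is no need to worry; it usually does not cause much trouble.

Use the write test of badblocks to rewrite the bad block (warning: the -w write test overwrites data):

badblocks -b 4096 -w -f /dev/sdb 16435904 16435904

If the earlier backup to /tmp/16435904.dat succeeded, write it back:

dd if=/tmp/16435904.dat of=/dev/sdb seek=16435904 bs=4096 count=1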

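In fact, we do not need to wait for the badblocks scan to finish before starting the repair.

badblocks works on the block device itself, so it can be used on a system whose filesystems are still mounted.

Data is worth more than the disk. This is only a stopgap: once bad sectors appear, the disk should be replaced, and disks are not expensive these days.

The two smartctl self-test results, from before and after the repair, are as follows: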

web:~# smartctl -l selftest /dev/sdb

smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen

Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===

SMART Self-test log structure revision number 1

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error

# 1 Short offline Completed without error 00% 8308 -

# 2 Short offline Completed: read failure 10% 8292 234908935

#2 is the test run before the repair; it hit a read error at LBA 234908935.

#1 is the result after the repair; there are no errors.
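
The self-test log reports the error position as an LBA, while badblocks and dd address the disk in units of their own block size, so the number has to be converted before it can be used with them. A minimal sketch, assuming the LBA counts 512-byte sectors from the start of the whole disk (not a partition); lba_to_block is a hypothetical name.

```shell
# Convert an LBA (assumed to count 512-byte sectors from the start of the
# disk) into the number of the block that contains it, for a block size
# given in bytes. Integer division rounds down to the containing block.
# lba_to_block is a hypothetical helper name.
lba_to_block() {
    echo $(( $1 * 512 / $2 ))
}
```

For the log above, lba_to_block 234908935 4096 gives the 4 KB block to pass to badblocks -b 4096 or to dd with bs=4096.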

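The above repairs the disk's bad blocks one block at a time.

Of course, these commands can be chained together in a script. badblocks can also rewrite a contiguous run of bad blocks: it takes two block-number arguments, the first being the last block and the second being the first block to overwrite.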
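
One cautious way to script this is to generate the commands first and review them before running anything, since the write test destroys data. The sketch below only prints the backup, rewrite and restore commands for each block number in a list. It assumes the list was produced by badblocks -o with the same block size that is passed in, and print_repair_cmds is a hypothetical name.

```shell
# Read bad block numbers (one per line, as written by badblocks -o) on stdin
# and PRINT, without executing, the three commands for each block:
# back up with dd, rewrite with badblocks -w, restore with dd.
# Arguments: device, block size in bytes, directory for the saved blocks.
# print_repair_cmds is a hypothetical helper name.
print_repair_cmds() {
    dev=$1; bs=$2; dir=$3
    while read -r blk; do
        case $blk in
            ''|*[!0-9]*) continue ;;   # skip blank or non-numeric lines
        esac
        echo "dd if=$dev of=$dir/$blk.dat bs=$bs skip=$blk count=1"
        echo "badblocks -b $bs -w -f $dev $blk $blk"
        echo "dd if=$dir/$blk.dat of=$dev bs=$bs seek=$blk count=1"
    done
}
```

Typical use would be: print_repair_cmds /dev/sdb 4096 /tmp < /root/sdb.bad, then inspecting the output before running any of it by hand.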

Testing a disk with smartctl:

Short online test, about 2 minutes:

smartctl -t short /dev/sdb

Long online test (about 4 hours for 1 TB):

smartctl -t long /dev/sdb

Several other test types are available: offline, short, long, conveyance, select,M-N, pending,N, afterselect,[on|off], scttempint,N[,p]

Abort a running test:

smartctl -X /dev/sdb

Get the results once the test has finished:

smartctl -l selftest /dev/sdb

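Backing up the whole of a failing disk:

If you use dd to image a failing disk, always remember to add conv=noerror,sync. noerror makes dd continue after a read error, and sync pads the unreadable data with zeros. Without it, the data after the error shifts position and the disk image is ruined.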
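
As an illustration of those two options, here is a minimal wrapper. The 4096-byte block size and the name image_with_dd are assumptions made for this sketch; a smaller block size loses less data around each unreadable area but copies more slowly.

```shell
# Copy src to dst block by block. noerror keeps dd going after a read error;
# sync pads every short or failed input block with NUL bytes up to the block
# size, so all later data stays at its original offset in the image.
# image_with_dd is a hypothetical wrapper name; bs=4096 is an assumed value.
image_with_dd() {
    dd if="$1" of="$2" bs=4096 conv=noerror,sync 2>/dev/null
}
```

Typical use would be: image_with_dd /dev/sdb /home/sdb.img. Because of sync, an input whose size is not a multiple of the block size ends up padded with zeros at the end.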

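Better still, use dd_rescue to dump the disk; it shows the current and the average dump speed. Reportedly it can also be stopped at any time and resumed later, and it can dump in reverse, from the end of the disk toward the beginning.

On Debian, installing the ddrescue package makes dd_rescue available:

dd_rescue /dev/sdb /home/sdb.dump

In newer Debian releases dd_rescue is gone and has been replaced by myrescue, which is used the same way.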

200003504

.....

240258229

(240258229 - 200003504) * 512 = 20610419200 bytes = 20.61 GB
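
The calculation above appears to estimate the size of the damaged region from the first and last bad sector numbers. The same arithmetic can be sketched as two small helpers; the names span_bytes and bytes_to_gb are hypothetical, and GB here means decimal gigabytes (10^9 bytes).

```shell
# Size in bytes of the range between two sector numbers, computed as
# (last - first) * sector_size, exactly as in the calculation above.
# Arguments: first sector, last sector, sector size in bytes.
span_bytes() {
    echo $(( ($2 - $1) * $3 ))
}

# Format a byte count as decimal gigabytes with two decimal places.
bytes_to_gb() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1000000000 }'
}
```

For example, bytes_to_gb "$(span_bytes 200003504 240258229 512)" reproduces the 20.61 figure.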

oops ~ # fdisk -l

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes, 1953525168 sectors

Units = sectors of 1 * 512 = 512 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x119af6ec

Device Boot Start End Blocks Id System

/dev/sdb1 * 2048 31459327 15728640 83 Linux

/dev/sdb2 31459328 1953525167 961032920 83 Linux

# dmesg |grep error

[14800.929308] end_request: I/O error, dev sdb, sector 1953525080

[14800.929364] end_request: I/O error, dev sdb, sector 1953525088

[14800.929423] end_request: I/O error, dev sdb, sector 1953525096

[14800.929479] end_request: I/O error, dev sdb, sector 1953525104

[14800.929535] end_request: I/O error, dev sdb, sector 1953525112

[14800.929592] end_request: I/O error, dev sdb, sector 1953525120

[14800.929648] end_request: I/O error, dev sdb, sector 1953525128

[14800.929707] end_request: I/O error, dev sdb, sector 1953525136

[14800.929762] end_request: I/O error, dev sdb, sector 1953525144

[14800.929819] end_request: I/O error, dev sdb, sector 1953525152

[14800.929876] end_request: I/O error, dev sdb, sector 1953525160
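
The failing sectors in these kernel messages can be matched against the fdisk partition table to see which partition is affected; here they fall at the very end of the disk, inside /dev/sdb2. A sketch of both steps follows. It assumes the "I/O error, dev ..., sector ..." message format shown above and a simplified "name start end" table as input; both function names are hypothetical.

```shell
# Read kernel log lines on stdin and print the sector number from every
# "I/O error, dev <dev>, sector <n>" message for the given device name.
io_error_sectors() {
    sed -n "s/.*I\/O error, dev $1, sector \([0-9][0-9]*\).*/\1/p"
}

# Read "name start end" lines on stdin (sector numbers taken from fdisk -l)
# and print the name of the partition whose range contains the given sector.
partition_of_sector() {
    awk -v s="$1" '$2 <= s && s <= $3 { print $1 }'
}
```

Typical use would be: dmesg | io_error_sectors sdb, then feeding one of the sectors to partition_of_sector together with the start and end columns of the partition table.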

2013-11-06

debugfs: open /dev/sdb2

/dev/sdb2: Can't read an inode bitmap while reading inode bitmap

Read/write speed test

1. Write speed test

oops /test/nat # time dd if=/dev/zero of=/test/nat/test bs=2k count=1000000

1000000+0 records in

1000000+0 records out

2048000000 bytes (2.0 GB) copied, 29.6383 s, 69.1 MB/s

real 0m29.683s

user 0m0.186s

sys 0m4.672s

2. Read speed test

oops /test/nat # time dd if=/test/nat/test of=/dev/null bs=2k

1000000+0 records in

1000000+0 records out

2048000000 bytes (2.0 GB) copied, 1.41538 s, 1.4 GB/s

real 0m1.417s

user 0m0.135s

sys 0m1.282s
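
The rate that dd prints is simply bytes divided by elapsed seconds, which the sketch below recomputes (throughput_mbs is a hypothetical name; MB means 10^6 bytes, as in dd's own output). Note that the 1.4 GB/s read figure is far beyond what a 7200 rpm SATA disk can sustain, so it most likely measures the page cache, because the file had just been written.

```shell
# Throughput in MB/s (decimal megabytes per second) from a byte count and
# an elapsed time in seconds, rounded to one decimal place.
# throughput_mbs is a hypothetical helper name.
throughput_mbs() {
    awk -v b="$1" -v t="$2" 'BEGIN { printf "%.1f\n", b / t / 1000000 }'
}
```

For the write test, throughput_mbs 2048000000 29.6383 reproduces the 69.1 MB/s reported by dd. With GNU dd, adding conv=fdatasync to the write and iflag=direct to the read is one common way to reduce cache effects in such measurements.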

