LinuxQuestions.org

BashTin 06-22-2012 05:27 AM

dd command destroys/corrupts my new Seagate hard drives??
 
OK, I know it sounds unbelievable, but I am now on drive number three!

I am using Seagate 1 terabyte drives (ST31000524AS for the original, ST1000DM003 for numbers 2 and 3). I wish to use encryption, so I want to fill the drive with random data first. I used the command

Code:

dd if=/dev/urandom of=/dev/sdb

On all three drives the process started and then quit, reporting read errors. Now, these are brand new drives. With the first drive I just assumed it was duff and the retailer exchanged it (I found out about smartctl at this point).

So before I tried again with the second drive I ran a smartctl short test, which showed the drive was fine and passed. I ran the command and it quit some time later, complaining about read errors. At the same time my root partition filled up with errors in the messages and syslog logs. Also, when I rebooted, the system could not detect either of my two hard drives (my existing one and this new one). I had to unplug the SATA cable from sdb and reboot; then it would detect my original drive (sda). When I plugged sdb back in, it would detect that one too. I ran some diagnostics:

Code:

=== START OF INFORMATION SECTION ===
Device Model:    ST1000DM003-9YN162
Serial Number:    S1D25X1D

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline      Completed: read failure      90%        32        13048

SMART Error Log Version: 1
ATA Error Count: 17 (device log contains only the most recent five errors)
Error 17 occurred at disk power-on lifetime: 32 hours (1 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 f8 32 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 f8 32 00 40 00      00:22:24.246  READ FPDMA QUEUED
  27 00 00 00 00 00 e0 00      00:22:24.246  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:22:24.246  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:22:24.246  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      00:22:24.246  READ NATIVE MAX ADDRESS EXT

*********************************************************************************************
[root@tuxbox mrt]#dd if=/dev/urandom of=/dev/sdb
dd: writing to `/dev/sdb': Input/output error
13049+0 records in
13048+0 records out
6680576 bytes (6.7 MB) copied, 18.9018 s, 353 kB/s

***********************************************************************************************

[root@tuxbox mrt]#badblocks -c 10240 -e 10000 -wsvt random /dev/sdb
9989
9990
9991
9992
9993
9994
9995
9996
9997
9998
9999
Too many bad blocks, aborting test
done                               
Reading and comparing: Too many bad blocks, aborting test
done                               
Pass completed, 10000 bad blocks found.
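
A side note on the numbers in the output above. With no bs given, dd works in 512-byte blocks, so its record count lines up with 512-byte sector numbers: 13048 records out means sectors 0 to 13047 were written and the write to LBA 13048 failed, which is the same LBA the self-test log and the error registers report. A small sketch of that arithmetic (the function names are made up for illustration, and the 512 default is dd's default block size):

```shell
#!/bin/sh
# Hypothetical helper names, for illustration only.

# Bytes covered by N dd records at a given block size (dd defaults to 512).
records_to_bytes() {
    echo $(( $1 * ${2:-512} ))
}

# Rebuild a 28-bit LBA from the SN, CL, CH and (optional) DH register
# bytes, given in hex without 0x, as printed in the smartctl error log.
# Only the low four bits of DH carry LBA bits.
regs_to_lba() {
    echo $(( ((0x${4:-00} & 0x0f) << 24) | (0x$3 << 16) | (0x$2 << 8) | 0x$1 ))
}

records_to_bytes 13048        # the byte count dd printed
regs_to_lba f8 32 00 00       # SN CL CH DH from the first error entry
```

The first call gives the 6680576 bytes dd reported and the second gives 13048.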

So on to hard drive 3, and a similar story. I ran a smartctl short test and all was fine. I ran the dd command and got the same problems as on drives 1 and 2. This time (having learned a tad more) I downloaded the Seagate tool for DOS, which is a bootable CD image. I had to run the tool's long test around 7 times, correcting 60 to 100 LBA errors at a time. In the end there were 859!!!! errors corrected.

Code:

SMART Error Log Version: 1
ATA Error Count: 859 (device log contains only the most recent five errors)

Error 859 occurred at disk power-on lifetime: 35 hours (1 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  42 ff 00 ff ff ff ef 00      02:42:10.489  READ VERIFY SECTOR(S) EXT
  42 ff 00 ff ff ff ef 00      02:42:07.693  READ VERIFY SECTOR(S) EXT
  42 ff 00 ff ff ff ef 00      02:42:04.930  READ VERIFY SECTOR(S) EXT
  42 ff 00 ff ff ff ef 00      02:42:02.141  READ VERIFY SECTOR(S) EXT
  42 ff 00 ff ff ff ef 00      02:41:59.379  READ VERIFY SECTOR(S) EXT

So what is causing this? Three faulty drives?? That seems very improbable, especially when they all pass self-tests to begin with. So that leaves what I am trying to do to them. Is my dd command somehow overwriting the actual hard drive platter structures and messing up any already reallocated bad sectors? Do I need to specify bs or some other option with my dd command??

And one final point: having had SeaTools correct (reallocate, I think) 859 sectors, why is this not reflected in the smartctl output??

Code:

5 Reallocated_Sector_Ct  0x0033  100  100  036    Pre-fail  Always      -      0

Sorry this is quite long, but I tried not to miss anything. Can anyone please help? I am a little timid about taking back a third drive in as many days!

BashTin.

PS. My original drive was on the SATA port that I am putting these new drives on, so I guess that eliminates any board issues.

pixellany 06-22-2012 05:34 AM

Not directly relevant to your question, but---Why would you need to write random data to the drive before using encryption?

curious to know whether something like GParted would recognize the drives....

BashTin 06-22-2012 05:46 AM

Quote:

Originally Posted by pixellany (Post 4709180)
Not directly relevant to your question, but---Why would you need to write random data to the drive before using encryption?

curious to know whether something like GParted would recognize the drives....

Well, it depends on your state of paranoia. But the idea is that if you fill your entire drive with random gibberish before you put your encrypted data on it, the actual encrypted data is harder to locate for cryptographic analysis. If you did not fill the drive beforehand it would be obvious where your data started and finished. A bit like looking for a drop of water in a lake, or similar.

BashTin

H_TeXMeX_H 06-22-2012 06:25 AM

I think the drives are just bad. But what SATA drivers are you using? I recommend AHCI in the BIOS settings and the 'ahci' Linux driver. I've found this avoids possible driver bugs.

I use 'wipe' to wipe disks with random data. It uses a seeded MT (Mersenne Twister) generator and is many times faster than /dev/urandom (which isn't really meant for this purpose).

catkin 06-22-2012 06:27 AM

Quote:

Originally Posted by BashTin (Post 4709175)
So what is causing this? Three faulty drives?? That seems very improbable, especially when they all pass self-tests to begin with. So that leaves what I am trying to do to them. Is my dd command somehow overwriting the actual hard drive platter structures and messing up any already reallocated bad sectors? Do I need to specify bs or some other option with my dd command??

And one final point: having had SeaTools correct (reallocate, I think) 859 sectors, why is this not reflected in the smartctl output??

Code:

5 Reallocated_Sector_Ct  0x0033  100  100  036    Pre-fail  Always      -      0

Improbable as it may be, IMHO the most likely explanation is three faulty drives.

Whatever dd writes to /dev/sdb, it should not be writing to the drive's internal tables. The drive's control electronics present a series of blocks at /dev/sdb, with any bad blocks known to the drive itself transparently mapped out as far as the OS is concerned.
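
On the bs question: as far as I know, bs only changes how many bytes dd moves per read/write call, not where or what it writes, so a bigger value mainly helps speed. A sketch, assuming GNU dd; fill_random is a made-up name, and the demo writes to a scratch file rather than a real disk:

```shell
#!/bin/sh
# fill_random TARGET COUNT_MIB: write COUNT_MIB MiB of random data to TARGET.
# Hypothetical helper, for illustration only.
fill_random() {
    # bs=1M          move 1 MiB per call instead of the 512-byte default
    # iflag=fullblock retry short reads so every block is full (GNU dd)
    dd if=/dev/urandom of="$1" bs=1M count="$2" iflag=fullblock 2>/dev/null
}

# Demo on a temporary file, so nothing on a real disk is touched.
scratch=$(mktemp)
fill_random "$scratch" 4
wc -c < "$scratch"
rm -f "$scratch"
```

For a whole disk you would drop the count and point of= at the device (sdX here is a placeholder, and this destroys everything on it): dd if=/dev/urandom of=/dev/sdX bs=1M. Without a count, dd stops with "No space left on device" when it reaches the end of the disk, which is expected.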

smartctl's drive self-tests are not exhaustive, especially the short ones. Seatools' tests are more complete.

SMART messages are notoriously unintuitive and not always correctly populated by the drive. Reallocated_Sector_Ct may not (or may) indicate the number of bad sectors re-mapped by the drive's control electronics.
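
For keeping an eye on the counters, here is a sketch that pulls the raw value of one attribute out of 'smartctl -A' output. smart_raw is a made-up helper and it assumes the usual ten-column attribute table, as in the line quoted above. Current_Pending_Sector is the other attribute usually worth checking next to Reallocated_Sector_Ct, since sectors the drive has flagged but not yet remapped are normally counted there.

```shell
#!/bin/sh
# smart_raw NAME: read `smartctl -A` output on stdin and print the
# RAW_VALUE column (10th field) of the attribute called NAME.
# Hypothetical helper, for illustration only.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Demo using the attribute line posted earlier in the thread.
printf '%s\n' '  5 Reallocated_Sector_Ct  0x0033  100  100  036    Pre-fail  Always      -      0' |
    smart_raw Reallocated_Sector_Ct
```

On a live system (as root) the input would come from smartctl itself, for example: smartctl -A /dev/sdb | smart_raw Current_Pending_Sector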

H_TeXMeX_H 06-22-2012 06:36 AM

The smartctl long test is the best one to use, or the manufacturer utils. The short test is useless.

I have never seen a drive where wiping it would affect SMART diagnostics; those are not normally accessible. However, this is a new drive and maybe things have changed, as I only have relatively old drives.

sparkyhall 06-22-2012 07:09 AM

My failing drive always passed the smartctl short self-test even though it would fail the long test. I believe it's only the long test that performs a full surface read scan, so I wouldn't place too much reliance on the short test results.
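
The long test is started with 'smartctl -t long /dev/sdb', and the result shows up later under 'smartctl -l selftest /dev/sdb'. A small sketch for checking that log from a script; selftest_failed is a made-up name, and it simply looks for the word "failure" on the numbered log lines (as in the "Completed: read failure" entry in the first post), so treat the pattern as an assumption:

```shell
#!/bin/sh
# selftest_failed: read `smartctl -l selftest` output on stdin and
# succeed (exit 0) if any numbered log line reports a failure.
# Hypothetical helper, for illustration only.
selftest_failed() {
    grep -q '^#.*failure'
}

# Demo with the log line from the first post.
if printf '%s\n' '# 1  Short offline      Completed: read failure      90%        32        13048' | selftest_failed
then
    echo "self-test log shows a failure"
fi
```

On a live system (as root): smartctl -l selftest /dev/sdb | selftest_failed && echo "check this drive"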

TobiSGD 06-22-2012 09:09 AM

It may be unlikely to have three faulty drives, but it is not impossible. There may also be a second cause for this behavior: either faulty RAM (or RAM run outside its specified settings, e.g. overclocked) or a faulty CPU (or an unstable overclocked CPU).

I would recommend testing the disks in a different system. If they also fail in a different system, you know that you really have bad luck.

H_TeXMeX_H 06-22-2012 09:40 AM

Use memtest86 to rule out faulty RAM.

dfwrider 06-22-2012 01:51 PM

I agree with the last two posts.

bad i/o. which could be caused by numerous things.

bad/hot powersupply/motherboard/memory/cpu/cables/connectors/chipset/controller... etc etc etc.

is everything dust/crap free, especially heatsinks?

cascade9 06-22-2012 02:00 PM

Quote:

Originally Posted by dfwrider (Post 4709527)
bad/hot powersupply/motherboard/memory/cpu/cables/connectors/chipset/controller... etc etc etc.

+1, was thinking the same thing.

Another possible problem: 4K sectors. AFAIK the ST31000524AS is a 512-byte sector drive and the ST1000DM003 is a 4K sector drive.
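
The kernel reports what the drive claims for its sector sizes under /sys/block. A sketch for reading them; sector_sizes is a made-up helper, and the optional second argument exists only so it can be pointed at a fake directory, as the demo does:

```shell
#!/bin/sh
# sector_sizes DEV [SYSROOT]: print "logical physical" sector sizes in bytes.
# Hypothetical helper, for illustration only. SYSROOT defaults to /sys/block.
sector_sizes() {
    q="${2:-/sys/block}/$1/queue"
    echo "$(cat "$q/logical_block_size") $(cat "$q/physical_block_size")"
}

# Demo against a fake sysfs tree (sdX is a placeholder device name).
demo=$(mktemp -d)
mkdir -p "$demo/sdX/queue"
echo 512  > "$demo/sdX/queue/logical_block_size"
echo 4096 > "$demo/sdX/queue/physical_block_size"
sector_sizes sdX "$demo"
rm -rf "$demo"
```

On a real system, 'sector_sizes sdb' reads the live values. A 4K drive that emulates 512-byte sectors would be expected to show 512 for logical and 4096 for physical.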

szboardstretcher 06-22-2012 02:07 PM

I am interested to know whether doing a regular fdisk, mkfs, fsck on the drive actually works. If not, then you are looking at faulty drives or related connective hardware (cables, controllers, power, etc.).

BashTin 06-23-2012 10:21 AM

Just a 'quicky'. Thanks for the replies and suggestions.

In reply to szboardstretcher: I ran fdisk -l, mkreiserfs and fsck, and all ran without fault.

I also ran memtest86+ without problems.

Now that I know for sure the HD has passed a full smartctl test, I think I will do another dd from /dev/urandom and see how it performs this time around.

Will report back.

BashTin.

BashTin 07-13-2012 06:28 PM

Thought I would give this thread 'closure'.

Well, it turned out you were all right. Improbable as it was, all three drives were duds. As mentioned in my original post, I used SeaTools to reallocate over 800 bad sectors on the third drive. After that I used it for only a few days before I started having more problems. A rescan with SeaTools showed up 16 new bad sectors, so there could be no doubt the drive was a dud. I got yet another replacement, and the first thing I did before I even tried to use it was run a long scan with SeaTools. No issues found at all. I am now using this drive with no problems and am finally confident.

Thanks to all of you who gave advice and suggestions.

Keep learning! BashTin.

jefro 07-13-2012 10:24 PM

Thanks for the update. I'd have replaced the motherboard/disk controller before I went to another drive.


All times are GMT -5.