LinuxQuestions.org - [SOLVED] dd command destroys/corrupts my new Seagate hard drives??

OK, I know it sounds unbelievable, but I am now on drive number three!

I am using Seagate 1 terabyte drives (ST31000524AS (original), ST1000DM003 (numbers 2 and 3)). I want to use encryption, so I want to fill the drive with random data first. I used the command

Code:

dd if=/dev/urandom of=/dev/sdb

On all three drives the process started and then quit, reporting read errors. These are brand new drives. With the first drive I just assumed it was duff and the retailer exchanged it (I found out about smartctl at this point).

So before I tried again with the second drive I ran a smartctl short test, which showed the drive was fine and passed. I ran the command and it quit some time later, complaining about read errors. At the same time my root partition filled up with messages and syslog error logs. Also, when I rebooted, the system could not detect either of my two hard drives (my existing one and this new one). I had to unplug the SATA cable from sdb and reboot. Then it would detect my original drive (sda). When I plugged sdb back in, it would detect it too. I ran some diagnostics:

Code:

=== START OF INFORMATION SECTION ===
Device Model:    ST1000DM003-9YN162
Serial Number:    S1D25X1D

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure        90%               32  13048

SMART Error Log Version: 1
ATA Error Count: 17 (device log contains only the most recent five errors)

Error 17 occurred at disk power-on lifetime: 32 hours (1 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 f8 32 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 f8 32 00 40 00      00:22:24.246  READ FPDMA QUEUED
  27 00 00 00 00 00 e0 00      00:22:24.246  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:22:24.246  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:22:24.246  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      00:22:24.246  READ NATIVE MAX ADDRESS EXT

*********************************************************************************************

[root@tuxbox mrt]#dd if=/dev/urandom of=/dev/sdb
dd: writing to `/dev/sdb': Input/output error
13049+0 records in
13048+0 records out
6680576 bytes (6.7 MB) copied, 18.9018 s, 353 kB/s



***********************************************************************************************



[root@tuxbox mrt]#badblocks -c 10240 -e 10000 -wsvt random /dev/sdb
9989
9990
9991
9992
9993
9994
9995
9996
9997
9998
9999
Too many bad blocks, aborting test
done
Reading and comparing: Too many bad blocks, aborting test
done
Pass completed, 10000 bad blocks found.
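The figures in the dd and SMART outputs above can be cross-checked with a little arithmetic. This is only a sketch, and it assumes dd ran with its default 512-byte block size (no bs= was given) and that the drive reports 512-byte logical sectors. The variable names are made up for illustration.

```shell
#!/bin/sh
# Cross-check where dd stopped against the SMART error log.
# All values are copied from the outputs above.
records_out=13048      # "13048+0 records out"
block_size=512         # dd's default block size when bs= is not given
bytes_written=$((records_out * block_size))

# LBA bytes from the SMART error registers: SN=f8 (low), CL=32 (mid), CH=00 (high)
lba_from_registers=$((0x0032f8))

echo "bytes written:      $bytes_written"
echo "LBA from registers: $lba_from_registers"
```

Under those assumptions the byte count comes out at 6680576, matching dd's report, and the register value decodes to 13048, the same number as both the dd record count and LBA_of_first_error. In other words dd stopped at the sector the drive itself flagged as unreadable.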

So on to hard drive 3, and a similar story. I did a smartctl short test and all was fine. I ran the dd command and hit the same problems as on drives 1 and 2. This time (having learned a tad more) I downloaded the Seagate tool for DOS (SeaTools), which is a bootable CD image. I had to run the tool's long test around seven times, correcting 60 to 100 LBA errors at a time. In the end there were 859 errors corrected!

Code:

SMART Error Log Version: 1
ATA Error Count: 859 (device log contains only the most recent five errors)

Error 859 occurred at disk power-on lifetime: 35 hours (1 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  42 ff 00 ff ff ff ef 00      02:42:10.489  READ VERIFY SECTOR(S) EXT
  42 ff 00 ff ff ff ef 00      02:42:07.693  READ VERIFY SECTOR(S) EXT
  42 ff 00 ff ff ff ef 00      02:42:04.930  READ VERIFY SECTOR(S) EXT
  42 ff 00 ff ff ff ef 00      02:42:02.141  READ VERIFY SECTOR(S) EXT
  42 ff 00 ff ff ff ef 00      02:41:59.379  READ VERIFY SECTOR(S) EXT
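A note on the LBA in this log: 0x0fffffff is the largest value a 28-bit LBA can hold, while READ VERIFY SECTOR(S) EXT is a 48-bit command. So the printed address may well be a saturated placeholder rather than the real location of the bad sector. That reading is an assumption, not something the log confirms. The sketch below only checks the arithmetic, using a nominal 1 TB capacity (10^12 bytes) and 512-byte sectors as illustrative figures.

```shell
#!/bin/sh
# Compare the LBA reported in the error log with the 28-bit addressing limit.
reported_lba=$((0x0fffffff))        # from "Error: UNC at LBA = 0x0fffffff"
max_lba28=$(( (1 << 28) - 1 ))      # largest address a 28-bit LBA can express

# Nominal sector count of a 1 TB drive with 512-byte sectors (illustrative figure)
nominal_sectors=$((1000000000000 / 512))

echo "reported LBA:      $reported_lba"
echo "28-bit LBA limit:  $max_lba28"
echo "sectors on 1 TB:   $nominal_sectors"
```

The reported LBA equals the 28-bit limit exactly, and the same value repeats for all five logged commands, which is what a capped field would look like on a drive with close to two billion sectors.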

So what is causing this? Three faulty drives? That seems very improbable, especially when they all pass self-tests to begin with. So that leaves what I am doing to them. Is my dd command somehow overwriting the hard drive's internal structures, messing up any already reallocated bad sectors? Do I need to specify bs or some other option with my dd command?

And one final point: having had SeaTools correct (reallocate, I think) 859 sectors, why is this not reflected in the smartctl output?

Code:

5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0

Sorry this is quite long, but I tried not to miss anything. If anyone can help I would be grateful, as I am a little timid about taking back drive 3 in as many days!

BashTin.

PS. My original drive was on the SATA port that I am now putting these new drives on, so I guess that eliminates any board issues.

Not directly relevant to your question, but why would you need to write random data to the drive before using encryption?

I'm curious to know whether something like GParted would recognize the drives.

Quote:

Originally Posted by pixellany (Post 4709180)

Well, it depends on your state of paranoia. The idea is that if you fill your entire drive with random gibberish before you put your encrypted data on it, the actual encrypted data is harder to locate for cryptographic analysis. If you did not fill the drive beforehand, it would be obvious where your data started and finished. A bit like looking for a drop of water in a lake.

BashTin

I think the drives are just bad. But what SATA drivers are you using? I recommend AHCI in the BIOS settings and the 'ahci' Linux driver. I've found this avoids possible driver bugs.

I use 'wipe' to wipe disks with random data. It uses the Mersenne Twister (MT) PRNG, is seeded, and is many times faster than /dev/urandom (which isn't really meant for this purpose).
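On the bs question from the first post: when bs= is omitted, dd copies in 512-byte blocks, which is correct but slow. A larger block size is the usual way to speed up a fill without changing what gets written. The sketch below deliberately writes to a scratch file instead of a disk so it is safe to try. The 1 MiB block size and 4-block count are arbitrary choices for illustration, and iflag=fullblock assumes GNU dd.

```shell
#!/bin/sh
# Fill a scratch FILE (not a disk) with random data using 1 MiB blocks.
target=$(mktemp)    # stand-in for the real device; for a disk this would be of=/dev/sdX

# iflag=fullblock (GNU dd) makes each read fill the whole block,
# so count=4 really yields 4 MiB.
dd if=/dev/urandom of="$target" bs=1M count=4 iflag=fullblock 2>/dev/null

size=$(wc -c < "$target")
echo "wrote $size bytes"
rm -f "$target"
```

For a whole disk the count= argument would be dropped so dd runs until the device is full. Pointing of= at a real device destroys everything on it, so the device name needs checking first.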

Quote:

Originally Posted by BashTin (Post 4709175)

Code:

5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0

Improbable as it may be, IMHO the most likely explanation is three faulty drives.

Whatever dd writes to /dev/sdb, it should not be writing to the drive's internal tables. The drive's control electronics presents a series of blocks at /dev/sdb with any bad blocks known to the drive itself transparently mapped out as far as the OS is concerned.

smartctl's drive self-tests are not exhaustive, especially the short ones. Seatools' tests are more complete.

SMART messages are notoriously unintuitive and not always correctly populated by the drive. Reallocated_Sector_Ct may not (or may) indicate the number of bad sectors re-mapped by the drive's control electronics.
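For reference, the attribute line quoted above can be split into its columns. This is a small sketch that assumes the usual smartctl -A column order (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw value). The variable names are illustrative.

```shell
#!/bin/sh
# Split the quoted smartctl attribute line into the fields that matter here.
attr_line='5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0'

value=$(echo "$attr_line" | awk '{print $4}')    # normalized value
thresh=$(echo "$attr_line" | awk '{print $6}')   # failure threshold
raw=$(echo "$attr_line" | awk '{print $10}')     # raw counter as reported by the drive

echo "normalized=$value threshold=$thresh raw=$raw"
```

The raw value is the last column, and here it is 0. So whatever SeaTools did to the 859 sectors, the drive is not counting them as reallocated in this attribute.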

The smartctl long test is the best one to use, or the manufacturer utils. The short test is useless.

I have never seen a drive where wiping it would affect SMART diagnostics; those areas are not normally accessible. However, this is a new drive, so maybe things have changed, as I only have relatively old drives.

My failing drive always passed the smartctl short self test even though it would fail the long test. I believe it's only the long test that performs a full surface read test, so I wouldn't place too much reliance on the short test results.
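For anyone following along, a long self-test with smartctl looks roughly like the sketch below. The run wrapper and the smart_long_check function are hypothetical helpers, and /dev/sdX is a placeholder. By default the script only prints the commands; setting DRY_RUN=0 runs them for real, which needs root.

```shell
#!/bin/sh
# Print (or, with DRY_RUN=0, execute) the smartctl commands for a long self-test.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

smart_long_check() {
    dev="$1"                           # e.g. /dev/sdb
    run smartctl -t long "$dev"        # start the extended self-test inside the drive
    run smartctl -l selftest "$dev"    # self-test log; meaningful once the test has finished
    run smartctl -A "$dev"             # attribute table, including Reallocated_Sector_Ct
}

smart_long_check "${1:-/dev/sdX}"
```

The test itself runs inside the drive and can take hours on a 1 TB disk, so the self-test log only shows a result after it has completed.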

It may be unlikely to have three faulty drives, but it is not impossible. There may also be a second cause for this behavior: either faulty RAM (or RAM run outside its specified settings, such as overclocked RAM) or a faulty CPU (or an unstable overclocked CPU).

I would recommend testing the disks in a different system. If they also fail in a different system, you know that you really have bad luck.

Use memtest86 to rule out faulty RAM.

I agree with the last two posts.

Bad I/O, which could be caused by numerous things:

bad/hot power supply/motherboard/memory/CPU/cables/connectors/chipset/controller... etc etc etc.

Is everything dust/crap free, especially the heatsinks?

Quote:

Originally Posted by dfwrider (Post 4709527)

bad/hot powersupply/motherboard/memory/cpu/cables/connectors/chipset/controller... etc etc etc.

+1, was thinking the same thing.

Another possible problem: 4K sectors. AFAIK the ST31000524AS is a 512-byte sector drive, while the ST1000DM003 is a 4K sector drive.
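One way to see which kind of drive is attached is to read the block sizes the kernel exposes under sysfs. The sketch below assumes a Linux system with /sys/block/<dev>/queue/ present. The function name sector_sizes and its optional second argument (an alternative sysfs root, handy for trying it on a fake tree) are made up for illustration.

```shell
#!/bin/sh
# Report logical and physical sector sizes for a block device from sysfs.
sector_sizes() {
    dev="$1"                       # kernel name, e.g. sdb
    sysroot="${2:-/sys/block}"     # overridable so the function can be tried on a fake tree
    q="$sysroot/$dev/queue"
    if [ ! -r "$q/logical_block_size" ] || [ ! -r "$q/physical_block_size" ]; then
        echo "no block-size info for $dev" >&2
        return 1
    fi
    echo "logical=$(cat "$q/logical_block_size") physical=$(cat "$q/physical_block_size")"
}
```

Calling sector_sizes sdb on a drive that reports 512-byte logical and 4096-byte physical sectors would print logical=512 physical=4096. A 4K drive is not damaged by 512-byte writes, though misaligned small writes can be slower.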

I am interested to know whether doing a regular fdisk, mkfs, fsck on the drive actually works. If not, then you are looking at faulty drives, or related connective hardware (cables, controllers, power etc)

Just a 'quicky'. Thanks for the replies and suggestions.

In reply to szboardstretcher: I ran fdisk -l, mkreiserfs, and fsck, and all ran without fault.

I also ran memtest86+ without problems.

Now that I know for sure the drive has passed a full smartctl test, I will do another dd from /dev/urandom and see how it performs this time around.

Will report back.

BashTin.

Thought I would give this thread 'closure'.

Well, it turned out you were all right. Improbable as it was, all three drives were duds. As mentioned in my original post, I used SeaTools to reallocate over 800 bad sectors on the third drive. After that I used it for only a few days before I started having more problems. A rescan with SeaTools showed up 16 new bad sectors, so there could be no doubt the drive was a dud. I got yet another replacement, and the first thing I did before I even tried to use it was a long scan with SeaTools. No issues found at all. I am now using this drive with no problems and am finally confident.

Thanks to all of you who gave advice and suggestions.

Keep learning! BashTin.

Thanks for the update. I'd have replaced the motherboard/disk controller before I went to another drive.