LinuxQuestions.org
11-27-2006, 10:15 PM   #1
exodist
Senior Member
 
Registered: Aug 2003
Location: Portland, Oregon
Distribution: Gentoo
Posts: 1,372

Rep: Reputation: 46
Software RAID corruption, not a problem with the media


First off, some specs:

Pentium 4 3.0, HT disabled
1 GB RAM
VIA onboard IDE with a 320 GB Maxtor HD
VIA onboard SATA with one 250 GB Western Digital HD
First Silicon Image SATA controller: four 250 GB Western Digital drives
Second Silicon Image SATA controller: two 300 GB Seagate HDs
Gentoo, up to date

Below I list what I have done, the alternatives I have tried to fix it with, and the problem itself. Extra debug info follows.
I create a RAID array (I have tried levels 1 and 5). Creating the array and making the filesystem work fine up to this point (I tried both xfs and reiserfs, and the same problem occurs on both). I have tried several combinations of drives and controllers, and all controllers/drives show the same problem. I have tried both the 2.6.17 and 2.6.18 kernels, no love.

Basically I create the raid, wait for it to resync, then create the filesystem (xfs or reiserfs), then copy a lot of files to it. I then try to delete some stuff, read some stuff, or modify the drive in some way. Operations fail or even segfault, requiring a reboot. dmesg shows numerous attempts to access way beyond the end of the device, at arbitrary block numbers that are often 10+ digits long. I run the filesystem's scanning/repairing utility and it finds tons of files with incorrect sizes/lengths. Usually that is all, but a few times on reiserfs it has found corruption requiring the tree to be rebuilt.

If I repair the fs and then scan it, it is clean. I mount it, unmount it and scan again: still clean. Then I try to make changes on the drive and once again get errors, and the scan says repair is needed. Same drill as before: everything gets fixed, and then the scan says clean.

Once again, I have tried different filesystems, different kernels, and different raid builders (both mdadm and raidtools).

Filesystems on non-raid partitions (including all partitions used in the raids) are not corrupted after much use.

As far as I can narrow it down, the problem is in the raid array, not the filesystem or the devices.

Forgot to mention: simply copying data from the drive also seems to cause (?) corruption. Like I said, I can repair with the fs repair tool, and after that scanning says it is good. I mount it, unmount it, and the scan still says good. But then I start copying files from it, and after the first few I start to get read errors and dmesg shows the access-beyond-end-of-device errors. Then the scan finds the corruption... I have not tried mounting the fs read-only; when this rebuild-tree is done I will do that (a 700 GB raid takes a long time to repair).
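The copy-then-read test described above can also be done with known checksums, so that silent data corruption shows up even when no I/O error is reported. This is only a minimal sketch, not what was actually run in this thread: the function names `write_test_files` and `verify_files` and the `verify_NNNN.bin` file names are made up for illustration, and the target directory is whatever mount point you pass in.

```python
import hashlib
import os
import sys
import tempfile

def write_test_files(directory, count=8, size=1 << 20):
    """Write `count` files of random data; return {path: sha256 hex digest}."""
    digests = {}
    for i in range(count):
        data = os.urandom(size)
        path = os.path.join(directory, f"verify_{i:04d}.bin")  # hypothetical naming
        with open(path, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # ask the kernel to push the data to the device
        digests[path] = hashlib.sha256(data).hexdigest()
    return digests

def verify_files(digests):
    """Re-read every file; return the paths that are unreadable or changed."""
    bad = []
    for path, expected in digests.items():
        try:
            with open(path, "rb") as fh:
                actual = hashlib.sha256(fh.read()).hexdigest()
        except OSError:
            # e.g. "Input/output error (5)", as in the rsync output
            bad.append(path)
            continue
        if actual != expected:
            bad.append(path)
    return bad

if __name__ == "__main__":
    # Pass the mount point of the array; falls back to a temp dir for a dry run.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    sums = write_test_files(target)
    print("mismatched or unreadable:", verify_files(sums))
```

As written, both passes run back to back, so the reads may be served from the page cache. For a meaningful test, unmount and remount the filesystem between writing and verifying so the data really comes back from the disks. Running the same script against a non-raid partition on the same controllers gives a baseline to compare with.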

Some extra info:

Kernel messages when the errors occur (this is after repairing, scanning and finding it clean, then setting the raid read-only and mounting read-only, then trying to copy data off of it):
Code:
md8: rw=0, want=4261343920, limit=1465175424
attempt to access beyond end of device
md8: rw=0, want=4261343920, limit=1465175424
Buffer I/O error on device md8, logical block 532667989
attempt to access beyond end of device
md8: rw=0, want=4261343920, limit=1465175424
Buffer I/O error on device md8, logical block 532667989
attempt to access beyond end of device
md8: rw=0, want=17166515696, limit=1465175424
attempt to access beyond end of device
md8: rw=0, want=17166515696, limit=1465175424
Buffer I/O error on device md8, logical block 2145814461
attempt to access beyond end of device
md8: rw=0, want=17166515696, limit=1465175424
Buffer I/O error on device md8, logical block 2145814461
attempt to access beyond end of device
md8: rw=0, want=18446744068902090496, limit=1465175424
attempt to access beyond end of device
md8: rw=0, want=18446744068902090496, limit=1465175424
Buffer I/O error on device md8, logical block 18446744073108618975
attempt to access beyond end of device
md8: rw=0, want=18446744068902090496, limit=1465175424
Buffer I/O error on device md8, logical block 18446744073108618975
attempt to access beyond end of device
md8: rw=0, want=4597359648, limit=1465175424
attempt to access beyond end of device
md8: rw=0, want=4597359648, limit=1465175424
Buffer I/O error on device md8, logical block 574669955
attempt to access beyond end of device
md8: rw=0, want=4597359648, limit=1465175424
Buffer I/O error on device md8, logical block 574669955
attempt to access beyond end of device
md8: rw=0, want=18446744070477053544, limit=1465175424
attempt to access beyond end of device
md8: rw=0, want=18446744070477053544, limit=1465175424
Buffer I/O error on device md8, logical block 18446744073305489356
attempt to access beyond end of device
md8: rw=0, want=18446744070477053544, limit=1465175424
Buffer I/O error on device md8, logical block 18446744073305489356
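The numbers in that log can be checked with a few lines of Python. This is a rough sketch under two assumptions that are not stated in the thread: that `want` and `limit` are counted in 512-byte sectors, and that the filesystem block size is 4 KiB (8 sectors). The helper names `as_signed64` and `analyse` are made up for illustration. Under those assumptions every `want` value works out to `(logical block + 1) * 8`, the limit corresponds to roughly 699 GiB (in line with the 700 GB array), and the 20-digit values are negative numbers when read as signed 64-bit integers. That is consistent with the filesystem following garbage block pointers, although the log alone does not prove where the garbage comes from.

```python
import re

SECTOR_BYTES = 512        # assumption: want/limit are in 512-byte sectors
SECTORS_PER_BLOCK = 8     # assumption: 4 KiB filesystem blocks

def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed one."""
    return value - (1 << 64) if value >= (1 << 63) else value

def analyse(log_text):
    """Pair each 'want' sector with the logical block reported after it.

    Returns a list of unique (want, block, limit) tuples, with want and
    block reinterpreted as signed 64-bit integers.
    """
    results = []
    want = limit = None
    for line in log_text.splitlines():
        m = re.search(r"want=(\d+), limit=(\d+)", line)
        if m:
            want, limit = int(m.group(1)), int(m.group(2))
            continue
        m = re.search(r"logical block (\d+)", line)
        if m and want is not None:
            entry = (as_signed64(want), as_signed64(int(m.group(1))), limit)
            if entry not in results:
                results.append(entry)
    return results

# Two entries copied from the dmesg output above.
SAMPLE = """\
md8: rw=0, want=4261343920, limit=1465175424
Buffer I/O error on device md8, logical block 532667989
md8: rw=0, want=18446744068902090496, limit=1465175424
Buffer I/O error on device md8, logical block 18446744073108618975
"""

if __name__ == "__main__":
    for want, block, limit in analyse(SAMPLE):
        consistent = want == (block + 1) * SECTORS_PER_BLOCK
        size_gib = limit * SECTOR_BYTES / 2**30
        print(f"block={block} want={want} "
              f"want==(block+1)*8: {consistent} device={size_gib:.1f} GiB")
```

To run it on a full log, pass the saved dmesg text to `analyse()` instead of `SAMPLE`.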
Here is the output from rsync when the errors occur (paths and file names blotted out):
Code:
Luxor hd1 # rsync -aqP /Blotted/Out/Path/U* ./
rsync: read errors mapping "/Blotted/Out/file1.xxx": Input/output error (5)
rsync: read errors mapping "/Blotted/Out/file2.xxx": Input/output error (5)
rsync: read errors mapping "/Blotted/Out/file3.xxx": Input/output error (5)
rsync: read errors mapping "/Blotted/Out/file4.xxx": Input/output error (5)

Last edited by exodist; 11-28-2006 at 01:05 AM.
 
12-05-2006, 12:02 PM   #2
exodist
Senior Member
 
Registered: Aug 2003
Location: Portland, Oregon
Distribution: Gentoo
Posts: 1,372

Original Poster
Rep: Reputation: 46
The motherboard was apparently defective, but for some reason the corruption only shows up with the raid array. It seems to have something to do with the amount of bandwidth being used.
 
  


All times are GMT -5.
