Red Hat: This forum is for the discussion of Red Hat Linux.
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
I just purchased a Dell PowerEdge 2900, which is very similar to a PE1900, and I'm experiencing the same issue with CentOS 4.4: smartd refuses to start, both at system startup and when invoked manually. Here is a rough summary of the server specifications:
(2) 1.6 GHz quad-core Xeon, 1066 MHz FSB
(4) 146 GB SAS 3.5" 15K hard drives
(4) 2 GB 667 MHz dual-ranked fully buffered DIMMs
(1) PERC 5/i configured for RAID 10
The error in /var/log/messages is as follows:
[root@bigdog ~]# service smartd start
Starting smartd: [FAILED]
[root@bigdog ~]# tail /var/log/messages
Apr 3 16:37:35 bigdog smartd: smartd startup failed
Apr 3 16:38:08 bigdog smartd[30205]: smartd version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Apr 3 16:38:08 bigdog smartd[30205]: Home page is http://smartmontools.sourceforge.net/
Apr 3 16:38:08 bigdog smartd[30205]: Opened configuration file /etc/smartd.conf
Apr 3 16:38:08 bigdog smartd[30205]: Configuration file /etc/smartd.conf parsed.
Apr 3 16:38:08 bigdog smartd[30205]: Device: /dev/sda, opened
Apr 3 16:38:08 bigdog smartd[30205]: Device: /dev/sda, Bad IEC (SMART) mode page, err=-5, skip device
Apr 3 16:38:08 bigdog smartd[30205]: Unable to register SCSI device /dev/sda at line 30 of file /etc/smartd.conf
Apr 3 16:38:08 bigdog smartd[30205]: Unable to register device /dev/sda (no Directive -d removable). Exiting.
Apr 3 16:38:08 bigdog smartd: smartd startup failed
[root@bigdog ~]#
I'm not very experienced with smartd. From what I can gather, this utility is used to detect potential drive failures before they occur. If smartd is not needed for post-failure recovery of an array, then I'd feel safe just removing it from chkconfig...
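For what it's worth, a common workaround when a RAID controller's virtual disk won't answer the SMART query is to comment out the offending directive in /etc/smartd.conf rather than let startup abort. A sketch, demonstrated on a scratch copy (the sample directive below is hypothetical; on the real box the file is /etc/smartd.conf and, per the log, line 30 holds the /dev/sda entry):

```shell
# Work on a scratch copy so this is safe to run anywhere:
conf=$(mktemp)
printf '/dev/sda -H -m root\n' > "$conf"   # hypothetical sample directive

# Comment out the device smartd cannot register:
sed -i.bak 's|^/dev/sda|#/dev/sda|' "$conf"
grep '/dev/sda' "$conf"    # prints: #/dev/sda -H -m root

# Then either restart the service, or disable it at boot entirely
# (smartd is predictive monitoring only, not needed for array recovery):
#   service smartd start
#   chkconfig smartd off
```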
Even so, any thoughts on the subject would be greatly appreciated.
Cheers!
Hi,
As siya said, check that your scripts/software are not using the smartd command.
We run ServeRAID on this as well, RHEL4u2. This partition (dm-2) is a very large ext3 filesystem that acts as the mount point for a postgres database that is very frequently used and taxed. The partition is under LVM2, so I thought it might be an LVM issue, but:
The live device-mapper tables (as shown by dmsetup table) match what we see in the LVM2 metadata, so we can probably rule out any problem at the volume-manager/device-mapper layer:
Code:
vg00-lvol01: 0 2097152 linear 8:2 384
vg00-lvol09: 0 10289152 linear 8:2 643891584
vg00-lvol08: 0 4128768 linear 8:2 639762816
vg00-lvol10: 0 632881152 linear 8:2 2752896
vg00-lvol07: 0 4128768 linear 8:2 635634048
vg00-lvol06: 0 655360 linear 8:2 2097536
vg00-lvol05: 0 4128768 linear 8:2 679870848
vg00-lvol04: 0 6160384 linear 8:2 673710464
vg00-lvol03: 0 18481152 linear 8:2 655229312
vg00-lvol02: 0 1048576 linear 8:2 654180736
These linear mappings correspond to the device regions recorded in the /etc/lvm/backup/vg00 metadata file, for example:
Code:
lvol10 {
id = "v1j3Ii-GqDO-tVHF-y845-kEQj-vWe5-l4NT7f"
status = ["READ", "WRITE", "VISIBLE"]
segment_count = 1
segment1 {
start_extent = 0
extent_count = 9657 # 301.781 Gigabytes
type = "striped"
stripe_count = 1 # linear
stripes = [
"pv0", 42
]
}
}
pv0 is sda2 (8:2):
pv0 {
id = "0MWHVn-TYKx-0ifq-jCw8-KnrK-LVLD-BH5QGg"
device = "/dev/sda2" # Hint only
status = ["ALLOCATABLE"]
pe_start = 384
pe_count = 10625 # 332.031 Gigabytes
}
Extent 42 above ("pv0", 42) puts us right at the beginning of the on-disk region that is throwing back all the SCSI errors. The region that is throwing back the errors is only about 27k in size, is high up on the device, and corresponds to the journal itself (for the ext3 filesystem).
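The offset arithmetic bears this out: pe_count = 10625 extents for 332.031 GB works out to 32 MiB (65536 sectors) per extent, so the dm start sector of lvol10 should be pe_start plus 42 extents:

```shell
# sectors_per_extent = 32 MiB / 512 bytes = 65536
# lvol10 offset = pe_start + start_extent * sectors_per_extent
echo $((384 + 42 * 65536))   # -> 2752896, the lvol10 offset in the dmsetup table above
```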
So I thought there was a problem with how the ips driver or firmware deals with the I/Os being sent down from the jbd (journaling block device) driver. Immediately remaking the partition as ext2 resolves the problem ... hmmm ...
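One way to confirm that the failing LBA range really is the journal: ext3 keeps its internal journal in inode 8, so dumpe2fs and debugfs can report its size and block map. A sketch against a scratch image so it is runnable anywhere (on the real box, point the tools at the lvol10 device instead):

```shell
# Build a tiny ext3 image to demonstrate on:
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1M count=16 2>/dev/null
mke2fs -q -F -j "$img"

# The superblock header reports the journal inode and size:
dumpe2fs -h "$img" 2>/dev/null | grep -i journal

# The block map of inode 8 gives the journal's on-disk extent:
#   debugfs -R 'stat <8>' "$img"
```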
So anyway, I wanted to keep it ext3, so I tried increasing the journal commit interval to 30 seconds by editing /etc/fstab.
Code:
For example:
/dev/vg0/varvol /var ext3 commit=30 1 2
Then remount (e.g. mount -o remount /var) or reboot for the setting to take effect on the filesystem.
No go with that, so I also tried decreasing the block-flushing frequency via vm.dirty_writeback_centisecs / vm.dirty_expire_centisecs....
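For reference, those knobs live under /proc/sys/vm. The values below are examples only, not tuned recommendations (the kernel defaults are 500 and 3000 centiseconds):

```shell
# Current settings (readable by any user):
cat /proc/sys/vm/dirty_writeback_centisecs
cat /proc/sys/vm/dirty_expire_centisecs

# To flush dirty pages less often (run as root; example values,
# not persistent across reboot unless added to /etc/sysctl.conf):
#   echo 1500 > /proc/sys/vm/dirty_writeback_centisecs
#   echo 6000 > /proc/sys/vm/dirty_expire_centisecs
```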
Again, no go. smartd is disabled, etc., along with all the original items posted in this thread. We have this problem at multiple locations running the same configuration. The funny thing is that now the ext2 filesystem is spitting out the SCSI I/O errors on dm-2 (lvol10), the postgres mount partition, but only when doing a dd to the partition. I tried the irqpoll boot option; again, no go. It isn't a hardware issue, as 18 different sites are having the same problem.
I too am seeing this problem, on an IBM x226 server that has been running fine for nearly five years (without being touched) and is now scrolling these same errors up the console screen.
It has six SCSI disks on a ServeRAID-6i. IBM has swapped out the RAID card and now also the motherboard. All hardware checks out fine and is not reporting errors. The server is still in use, with 40 users on it, and is running fine.
The OS (Red Hat Enterprise Linux 4 AS, kernel 2.6.9-22.0.1.EL) boots fine with no errors. Then these SCSI errors start about 20 seconds after the OS has fully booted.
Are there any known fixes for this?
I'm just about to look into smartd, but from many Google searches it doesn't seem to be the cause.