LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Hardware (https://www.linuxquestions.org/questions/linux-hardware-18/)
-   -   /dev/sdb: read failed after 0 of 4096 at 0: Input/output error (https://www.linuxquestions.org/questions/linux-hardware-18/dev-sdb-read-failed-after-0-of-4096-at-0-input-output-error-4175548804/)

deep27ak 07-24-2015 02:45 AM

/dev/sdb: read failed after 0 of 4096 at 0: Input/output error
 
I get this error every time I run any LVM command on my machine, which is running SLES on IBM hardware and is connected to EMC VNX5300 storage.

Initially I thought one of my block devices had failed, but everything on the OS side looked normal, as seen below.

Output of multipath:
Code:

# multipath -ll
36006016037a02e00ca86fb1d4847e111 dm-0 DGC,RAID 5
size=200G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='service-time 0' prio=4 status=active
| `- 0:0:0:0 sda 8:0  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 1:0:0:0 sdb 8:16 active ready running

# lsscsi
[1:0:0:0]    disk    DGC      RAID 5          0532  /dev/sda
[2:0:0:0]    disk    DGC      RAID 5          0532  /dev/sdb

multipathd> show paths
hcil    dev dev_t pri dm_st  chk_st dev_st  next_check
1:0:0:0 sda 8:0  4  active ready  running XXXXXX.... 13/20
2:0:0:0 sdb 8:16  1  active ready  running XXXXXX.... 13/20

But I could not read /dev/sdb:
Code:

# fdisk /dev/sdb
fdisk: unable to read /dev/sdb: Invalid argument
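One way to confirm whether the kernel can read the path at all, independent of fdisk, is a single direct read of the same 4 KiB that LVM is failing on. This is only a sketch; probe_dev is a hypothetical helper, not something from the thread:

```shell
#!/bin/sh
# probe_dev: attempt one direct 4 KiB read from a block device (hypothetical
# helper). An error here corresponds to the LVM message
# "read failed after 0 of 4096 at 0".
# The kernel's own view of the path can also be checked with:
#   cat /sys/block/sdb/device/state
probe_dev() {
    dev="$1"
    if dd if="$dev" of=/dev/null bs=4096 count=1 iflag=direct 2>/dev/null; then
        echo "$dev: readable"
    else
        echo "$dev: read failed"
    fi
}

probe_dev /dev/sdb
```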

So I manually removed sda, just to make sure that redundancy was working properly
Code:

# echo 1 > /sys/block/sda/device/delete

# multipath -ll
36006016037a02e00ca86fb1d4847e111 dm-0 DGC,RAID 5
size=200G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  `- 1:0:0:0 sdb 8:16 active ready running

Then I rescanned my HBA
Code:

# echo "- - -" > /sys/class/scsi_host/host0/scan
and both block devices reappeared
Code:

# multipath -ll
36006016037a02e00ca86fb1d4847e111 dm-0 DGC,RAID 5
size=200G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='service-time 0' prio=4 status=active
| `- 0:0:0:0 sda 8:0  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 1:0:0:0 sdb 8:16 active ready running
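The single-host rescan above can be extended to every HBA in one loop. A minimal sketch, where rescan_hosts is a hypothetical helper and the sysfs base is a parameter only so the loop is easy to dry-run; on a live system it would be /sys/class/scsi_host:

```shell
#!/bin/sh
# rescan_hosts: write "- - -" (all channels, all targets, all LUNs) to the
# scan file of every SCSI host found under the given sysfs base.
rescan_hosts() {
    base="${1:-/sys/class/scsi_host}"
    for h in "$base"/host*; do
        [ -w "$h/scan" ] && echo "- - -" > "$h/scan"
    done
}
```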

To my surprise, all the errors that were shown earlier while running LVM commands were gone
Code:

# lvs
  LV                     VG     Attr      LSize   Pool Origin Data%  Move Log Copy%  Convert
  ISS                    system -wi-ao--- 120.47g
  is-main                system -wi-ao---  15.00g
  system-opt-mgtservices system -wi-ao---   5.00g
  system-usr             system -wi-ao---   2.00g
  system-var             system -wi-ao---   2.00g
  system-var-log         system -wi-ao---   2.00g
  system-var-opt         system -wi-ao---  25.00g
  tmp                    system -wi-ao---  20.00g

And this device was also readable again
Code:

# fdisk /dev/sdb

Command (m for help): p

Disk /dev/sdb: 214.7 GB, 214748364800 bytes
255 heads, 63 sectors/track, 26108 cylinders, total 419430400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0006db56

  Device Boot      Start        End      Blocks  Id  System
/dev/sdb1  *        2048    1060863      529408  83  Linux
/dev/sdb2        1060864    9461759    4200448  83  Linux
/dev/sdb3        9461760  411023359  200780800  8e  Linux LVM
/dev/sdb4      411023360  419430399    4203520  82  Linux swap / Solaris

I am really confused about what is leading to this behavior. Is there some misconfiguration in multipath?

Below is my multipath config
Code:

defaults {
        verbosity 2
        polling_interval 5
        multipath_dir "/lib64/multipath"
        path_selector "service-time 0"
        path_grouping_policy "failover"
        uid_attribute "ID_SERIAL"
        prio "const"
        prio_args ""
        features "0"
        path_checker "directio"
        alias_prefix "mpath"
        failback "manual"
        rr_min_io 1000
        rr_min_io_rq 1
        max_fds "max"
        rr_weight "uniform"
        queue_without_daemon "yes"
        flush_on_last_del "no"
        user_friendly_names "no"
        fast_io_fail_tmo 5
        bindings_file "/etc/multipath/bindings"
        wwids_file /etc/multipath/wwids
        log_checker_err always
        retain_attached_hw_handler no
        detect_prio no
}
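To rule out a mismatch between this file and what the daemon actually loaded, the merged running configuration can be dumped with multipathd -k"show config" (or multipath -t). For scripting a quick check of individual keys from the file itself, a sketch, where mp_get is a hypothetical helper:

```shell
#!/bin/sh
# mp_get: read one setting from a multipath.conf-style file (hypothetical
# helper; on a live system `multipathd -k"show config"` prints the merged
# running configuration, which is the authoritative view).
mp_get() {
    key="$1"
    conf="${2:-/etc/multipath.conf}"
    awk -v k="$key" '$1 == k { gsub(/"/, "", $2); print $2 }' "$conf"
}
```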


Keruskerfuerst 07-24-2015 03:54 AM

Then try to replace the mentioned HDD.

deep27ak 07-24-2015 04:33 AM

This is SAN storage; sda and sdb are just block devices presented from the array, not physical HDDs.

I didn't quite get what you mean.

Keruskerfuerst 07-24-2015 06:02 AM

Then check the complete device.

