I get this error every time I run any LVM command on my machine, which runs SLES on IBM hardware and is connected to an EMC VNX5300 storage array.
Initially I thought one of my block devices had failed, but everything on the OS side looked normal, as seen below.
Output of multipath, lsscsi and multipathd show paths:
Code:
# multipath -ll
36006016037a02e00ca86fb1d4847e111 dm-0 DGC,RAID 5
size=200G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='service-time 0' prio=4 status=active
| `- 0:0:0:0 sda 8:0 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
`- 1:0:0:0 sdb 8:16 active ready running
# lsscsi
[1:0:0:0] disk DGC RAID 5 0532 /dev/sda
[2:0:0:0] disk DGC RAID 5 0532 /dev/sdb
multipathd> show paths
hcil    dev dev_t pri dm_st  chk_st dev_st  next_check
1:0:0:0 sda 8:0   4   active ready  running XXXXXX.... 13/20
2:0:0:0 sdb 8:16  1   active ready  running XXXXXX.... 13/20
But I could not read /dev/sdb:
Code:
# fdisk /dev/sdb
fdisk: unable to read /dev/sdb: Invalid argument
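For reference, these are the checks I would normally run next, to see whether the kernel still reports a size and a usable state for the device, and whether anything was logged about it (I did not capture this output at the time):
Code:
# blockdev --getsize64 /dev/sdb
# cat /sys/block/sdb/device/state
# dmesg | tail -n 20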
So I manually removed sda, just to check whether path redundancy was working properly.
Code:
# echo 1 > /sys/block/sda/device/delete
# multipath -ll
36006016037a02e00ca86fb1d4847e111 dm-0 DGC,RAID 5
size=200G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
`-+- policy='service-time 0' prio=2 status=active
`- 1:0:0:0 sdb 8:16 active ready running
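To convince myself that I/O still goes through the remaining path, a quick direct read against the multipath map (named after its WWID, since user_friendly_names is off) is what I would use; the count is arbitrary:
Code:
# dd if=/dev/mapper/36006016037a02e00ca86fb1d4847e111 of=/dev/null bs=1M count=10 iflag=direct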
Then I rescanned the HBA:
Code:
# echo "- - -" > /sys/class/scsi_host/host0/scan
After the rescan, both block devices re-appeared:
Code:
# multipath -ll
36006016037a02e00ca86fb1d4847e111 dm-0 DGC,RAID 5
size=200G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='service-time 0' prio=4 status=active
| `- 0:0:0:0 sda 8:0 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
`- 1:0:0:0 sdb 8:16 active ready running
To my surprise, all the errors that had been shown earlier when running LVM commands were gone:
Code:
# lvs
LV                     VG     Attr      LSize   Pool Origin Data% Move Log Copy% Convert
ISS                    system -wi-ao--- 120.47g
is-main                system -wi-ao---  15.00g
system-opt-mgtservices system -wi-ao---   5.00g
system-usr             system -wi-ao---   2.00g
system-var             system -wi-ao---   2.00g
system-var-log         system -wi-ao---   2.00g
system-var-opt         system -wi-ao---  25.00g
tmp                    system -wi-ao---  20.00g
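To double-check that the physical volume and the volume group are consistent now that both paths are back, I would also run the following ('system' is the VG name from the lvs output above):
Code:
# pvs
# vgs system
# vgck system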
And this device was readable again as well:
Code:
# fdisk /dev/sdb
Command (m for help): p
Disk /dev/sdb: 214.7 GB, 214748364800 bytes
255 heads, 63 sectors/track, 26108 cylinders, total 419430400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0006db56
   Device Boot      Start        End     Blocks Id System
/dev/sdb1 *           2048    1060863     529408 83 Linux
/dev/sdb2          1060864    9461759    4200448 83 Linux
/dev/sdb3          9461760  411023359  200780800 8e Linux LVM
/dev/sdb4        411023360  419430399    4203520 82 Linux swap / Solaris
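For completeness, this is how I plan to check which underlying device the PV on partition 3 actually resolves to, and how the device-mapper devices are stacked (dmsetup ls --tree prints the dm dependency tree):
Code:
# pvs -o pv_name,vg_name
# dmsetup ls --tree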
I am really confused about what is leading to this behavior. Is there any misconfiguration in multipath?
Below is my multipath config:
Code:
defaults {
    verbosity 2
    polling_interval 5
    multipath_dir "/lib64/multipath"
    path_selector "service-time 0"
    path_grouping_policy "failover"
    uid_attribute "ID_SERIAL"
    prio "const"
    prio_args ""
    features "0"
    path_checker "directio"
    alias_prefix "mpath"
    failback "manual"
    rr_min_io 1000
    rr_min_io_rq 1
    max_fds "max"
    rr_weight "uniform"
    queue_without_daemon "yes"
    flush_on_last_del "no"
    user_friendly_names "no"
    fast_io_fail_tmo 5
    bindings_file "/etc/multipath/bindings"
    wwids_file /etc/multipath/wwids
    log_checker_err always
    retain_attached_hw_handler no
    detect_prio no
}
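As far as I can tell there is only this defaults section and no devices entry for the DGC array. Below is a sketch of what I am considering adding, based on VNX examples I have seen elsewhere; the exact values are my own assumptions and depend on the array failover mode (the map above reports hwhandler='1 emc', so the ALUA settings may not apply to my setup):
Code:
devices {
    device {
        vendor "DGC"                      # EMC VNX/CLARiiON arrays report vendor DGC
        product ".*"
        product_blacklist "LUNZ"          # ignore the dummy LUNZ device
        path_grouping_policy group_by_prio
        path_checker emc_clariion
        hardware_handler "1 alua"         # assumption: array in ALUA failover mode
        prio alua
        failback immediate
        no_path_retry 60
    }
}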