LinuxQuestions.org


jonas_berlin 11-12-2011 04:59 PM

ext2online does not work on extended lvm logical volume
 
hi all,

I have a problem extending a file system on an LVM-managed RAID. It does not expand to the available size; instead it seems to try to SHRINK the file system.

The Setup:
We have a server running SUSE Linux Enterprise Server 10. It has a RAID that contained 6x 2TB disks in a RAID 5 configuration under LVM. We added 6 more 2TB disks and now want to expand the ext3 filesystem residing on the first 6 disks to span the whole RAID (12x 2TB disks).

The RAID controller is an HP Smart Array P800. The old logical disk is /dev/cciss/c0d3.

These were the steps involved:
- configured the raid controller to incorporate the 6 new disks into the current array.
- created a new logical disk on the added space, which came up as /dev/cciss/c0d2
- labeled the partition table with gpt and created a primary partition with type lvm spanning the whole device:
Code:

(parted) print
Disk geometry for /dev/cciss/c0d2: 0kB - 12TB
Disk-Label-Typ: gpt
Number  Start   End     Size    File system  Name                  Flags
1       17kB    12TB    12TB                                       lvm

- created a new LVM physical volume from this device (Attention: the data shown is from after adding it to the volume group)
Code:

--- Physical volume ---
  PV Name              /dev/cciss/c0d2p1
  VG Name              raid2tb
  PV Size              10,92 TB / not usable 2,49 MB
  Allocatable          yes (but full)
  PE Size (KByte)      4096
  Total PE              2861545
  Free PE              0
  Allocated PE          2861545
  PV UUID              XXXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXXXX

- added the physical volume to the LVM volume group
Code:

server:~ # pvs
  PV                VG      Fmt  Attr PSize  PFree
  /dev/cciss/c0d2p1 raid2tb lvm2 a-   10,92T    0
  /dev/cciss/c0d3p1 raid2tb lvm2 a-    7,28T    0

- extended the LVM logical volume to span the whole volume group
Code:

  --- Logical volume ---
  LV Name                /dev/raid2tb/raid2lv
  VG Name                raid2tb
  LV UUID                XXXXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXXXX
  LV Write Access        read/write
  LV Status              available
  # open                1
  LV Size                18,19 TB
  Current LE            4769241
  Segments              2
  Allocation            inherit
  Read ahead sectors    0
  Block device          253:0
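As a cross-check of the figures above, the extent counts can be converted to sizes by hand. This is only a rough sketch: it assumes that "TB" in the LVM output means TiB (2**40 bytes), that both physical volumes are fully allocated to this one logical volume (PFree is 0 for both), and the helper name is made up for illustration.

```python
# Sanity check of the LVM figures quoted above.
# Assumption: LVM's "TB" here means TiB (2**40 bytes).

PE_SIZE_KIB = 4096          # "PE Size (KByte)" from the pvdisplay output
NEW_PV_EXTENTS = 2861545    # "Total PE" of /dev/cciss/c0d2p1
LV_EXTENTS = 4769241        # "Current LE" of /dev/raid2tb/raid2lv


def extents_to_tib(extents, pe_size_kib=PE_SIZE_KIB):
    """Convert a number of extents to TiB (hypothetical helper)."""
    return extents * pe_size_kib * 1024 / 2**40


new_pv_tib = extents_to_tib(NEW_PV_EXTENTS)
lv_tib = extents_to_tib(LV_EXTENTS)
# Assumption: everything in the LV that is not on the new PV is on the old PV.
old_pv_tib = extents_to_tib(LV_EXTENTS - NEW_PV_EXTENTS)

print(f"new PV: {new_pv_tib:.2f} TiB")
print(f"old PV: {old_pv_tib:.2f} TiB")
print(f"LV:     {lv_tib:.2f} TiB")
```

Under those assumptions the three values come out as 10.92, 7.28 and 18.19, which agrees with the pvs and lvdisplay output, so the LVM side appears to be consistent.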

the old raid has been mounted on /raid2 so df -h gave me this
Code:

/dev/mapper/raid2tb-raid2lv
                      7,2T  6,3T  935G  88% /raid2

now i tried
ext2online /raid2
but the output was
Code:

ext2online v1.1.18 - 2001/03/18 for EXT2FS 0.5b
ext2online: warning - device size 588735488, filesystem 1953480704
ext2online: /dev/mapper/raid2tb-raid2lv has 1953480704 blocks cannot shrink to 588735488

and didn't change anything on the filesystem
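One possible reading of the "device size 588735488" figure is a 32-bit overflow. The sketch below assumes that ext2online v1.1.18 keeps the device block count in an unsigned 32-bit integer and that the filesystem uses 4096-byte blocks; the first assumption is inferred from the numbers only, not from the tool's source.

```python
# Possible explanation for the "device size 588735488" warning.
# Assumption: ext2online v1.1.18 stores the block count in an unsigned
# 32-bit integer; this is inferred from the numbers, not from its source.

LV_EXTENTS = 4769241            # "Current LE" from lvdisplay
EXTENT_BYTES = 4096 * 1024      # PE size of 4096 KiB
FS_BLOCK_BYTES = 4096           # assumed ext3 block size

# Real size of the logical volume, counted in filesystem blocks.
true_blocks = LV_EXTENTS * EXTENT_BYTES // FS_BLOCK_BYTES

# What a 32-bit counter would hold after wrapping around.
wrapped_blocks = true_blocks % 2**32

print(true_blocks)      # 4883702784
print(wrapped_blocks)   # 588735488
```

The wrapped value matches the number in the warning exactly, which would explain why the tool believes the device is smaller than the existing 1953480704-block filesystem and refuses to "shrink" it.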

out of curiosity i did
ext2online -d -v /raid2
the output was basically this:
Code:

ext2online: warning - device size 588735488, filesystem 1953480704
group 2 inode table has offset 2, not 1027
group 4 inode table has offset 2, not 1027
[...snipp...]
group 59614 inode table has offset 2, not 1027
group 59615 inode table has offset 2, not 1027
ext2online: /dev/mapper/raid2tb-raid2lv has 1953480704 blocks cannot shrink to 588735488
ext2online v1.1.18 - 2001/03/18 for EXT2FS 0.5b
ext2_open
ext2_bcache_init
new filesystem size 588735488
ext2_determine_itoffset
setting itoffset to +1027
ext2_get_reserved
Found 558 blocks in s_reserved_gdt_blocks
using 558 reserved group descriptor blocks

that's it, it terminates with code 2.

can anyone identify the problem and how to fix this? help is very much appreciated.

I have to add that unmounting and doing the resize offline is not an option at the moment. But any hint on how long it takes to resize a filesystem from 7TB to 18TB would be nice anyway.

thanks in advance
jonas

tommylovell 11-13-2011 04:00 PM

Clearly there is a bug.

And at 23 hours old and 120+ views, no one has the answer.

I never heard of 'ext2online' before your post; but I've used 'resize2fs' (and the technique that you described) successfully dozens of times. Have you tried 'resize2fs'?

Also, I assume you have good backups because there is always an element of risk when manipulating filesystems.

syg00 11-13-2011 05:16 PM

I believe SLES10 will require (the very old) ext2online for online resize.
Where available resize2fs is preferable, and part of e2fsprogs - you should be able to use that with the f/s unmounted, even on SLES10. Best suggestion might be to get onto a more current SLES release.

jonas_berlin 11-18-2011 02:15 AM

thanks for the suggestions.

it's almost a no-go to take the server offline, therefore installing a new OS or unmounting the fs is a last resort.

we would rather create a new partition in the remaining space to use it.

i just compiled the latest e2fsprogs with resize2fs 1.41.14, which is supposed to support online resize as tommylovell suggested. i will give it a try on the weekend and report back.

jonas_berlin 12-03-2011 04:36 PM

OK, just a conclusion:

All went well with resize2fs 1.41.14. However, I couldn't expand the filesystem to the whole 18 TB, only up to 16 TB, because that's apparently the maximum for ext3 (I wasn't aware of that). Either way, colleagues are happy. Thanks for your help! :)
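The 16 TB figure can be reproduced with a little arithmetic. This is a sketch that assumes the limit comes from 32-bit block numbers combined with a 4096-byte block size, and that "TB" means TiB; the variable names are illustrative only.

```python
# Rough arithmetic behind the 16 TB ceiling.
# Assumptions: ext3 addresses blocks with 32-bit numbers, the filesystem
# uses 4096-byte blocks, and "TB" means TiB (2**40 bytes).

BLOCK_BYTES = 4096
MAX_BLOCKS = 2**32

max_fs_tib = MAX_BLOCKS * BLOCK_BYTES / 2**40

# Size of the logical volume: 4769241 extents of 4096 KiB each.
lv_tib = 4769241 * 4096 * 1024 / 2**40

# Space in the logical volume that such a filesystem cannot cover.
unused_tib = lv_tib - max_fs_tib

print(f"maximum filesystem size: {max_fs_tib:.0f} TiB")
print(f"logical volume size:     {lv_tib:.2f} TiB")
print(f"left outside filesystem: {unused_tib:.2f} TiB")
```

With these assumptions roughly 2.19 TiB of the logical volume stays outside the filesystem.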

jonas_berlin 12-04-2011 11:09 AM

apparently it wasn't that OK:

After a few hours no user other than root was able to write to that device. On write, every tool stated that there was no space left on the device, while df showed a usage of 44% and a remaining space of 8.2 TB. However, root was able to write.

I guess somehow the new space was not available to the users, while root could only write in its own reserved space.

I tried partprobe (which, from what I know, is used to re-read the partition table) and some re-mounts. But this didn't help; the error persisted.

Any ideas on this?

(P.S.: I'm currently not able to log on to the system, since it is a remote server and I also tried a reboot: I am not able to reach it at the moment, I guess because it does not shut down properly.)

tommylovell 12-04-2011 10:49 PM

This sounds as if the reserved field in the ext2 superblock was improperly modified somewhere along the way.

If you can manage to get onto the system, 'dumpe2fs -h <device>' will show you the "Reserved block count:" value. That value is normally 5% of the filesystem. The "Reserved blocks uid:" user and "Reserved blocks gid:" group (normally uid 0 - root; and gid 0 - root) are exempt from this and can use whatever space is left. For all other uid's and gid's, that reserved space is off limits.

'tune2fs -r ...' can reset that reserved value, again if you can log in as root, if that is the cause of the problem.

jonas_berlin 12-05-2011 01:48 AM

hey,

thanks for the answer.

the machine came up again (it booted into runlevel 1, had to send a guy to flick some switches, now it's in runlevel 3 again)

hmm...
Code:

dumpe2fs 1.41.14 (22-Dec-2010)
Filesystem volume name:  <none>
Last mounted on:          <not available>
Filesystem UUID:          XXXXXXXXXXXXXXXXXXXXXXXXXX
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal resize_inode filetype needs_recovery sparse_super large_file
Default mount options:    (none)
Filesystem state:        clean
Errors behavior:          Continue
Filesystem OS type:      Linux
Inode count:              2097152000
Block count:              4194304000
Reserved block count:    25246337
Free blocks:              2337720331
Free inodes:              2096800474
First block:              0
Block size:              4096
Fragment size:            4096
Reserved GDT blocks:      24
Blocks per group:        32768
Fragments per group:      32768
Inodes per group:        16384
Inode blocks per group:  512
Filesystem created:      Wed Jan 12 12:40:41 2011
Last mount time:          Mon Dec  5 07:12:56 2011
Last write time:          Mon Dec  5 07:12:56 2011
Mount count:              1
Maximum mount count:      24
Last checked:            Mon Dec  5 01:24:41 2011
Check interval:          15552000 (6 months)
Next check after:        Sat Jun  2 02:24:41 2012
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:              128
Journal inode:            8
Default directory hash:  tea
Directory Hash Seed:      XXXXXXXXXXXXXXXXXXXXXXXXX
Journal backup:          inode blocks
Jounal properties:        journal_incompat_revoke
Journal size:            128M
Journal length:            32768
Journal sequence:          0x0042b6f5
Journal start:            23175

it looks ok, doesn't it? As I read the output, the reserved block count is actually far less than 5%, about 0.6% (25246337/4194304000).
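The same check can be done mechanically on the dumpe2fs header. This is a minimal sketch: the sample text is copied from the output above, the parser is a made-up helper that only handles plain "Name: number" lines, and "TiB" again means 2**40 bytes.

```python
# Check the reserved-block ratio from a few dumpe2fs header lines.
# The sample values are copied from the dumpe2fs output in this post.

SAMPLE = """\
Block count:              4194304000
Reserved block count:    25246337
Free blocks:              2337720331
Block size:              4096
"""


def parse_header(text):
    """Collect 'Name: integer' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            fields[key.strip()] = int(value)
    return fields


hdr = parse_header(SAMPLE)

# Share of the filesystem that is reserved for the reserved uid/gid.
reserved_pct = 100 * hdr["Reserved block count"] / hdr["Block count"]

# Free space that should remain usable by ordinary users.
nonroot_free_blocks = hdr["Free blocks"] - hdr["Reserved block count"]
nonroot_free_tib = nonroot_free_blocks * hdr["Block size"] / 2**40

print(f"reserved: {reserved_pct:.2f} %")
print(f"free for non-root users: {nonroot_free_tib:.2f} TiB")
```

By this arithmetic about 0.60 % of the blocks are reserved and several TiB should still be available to ordinary users, so the reserved-block setting alone does not seem to explain the "no space left" errors.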

as you can see, i ran e2fsck last night, it found no errors and reported the right block size.

i am slowly running out of ideas... :( everything looks ok, but i cannot write on that thing.

jonas_berlin 12-05-2011 02:10 AM

I am opening a new thread, since the title does not reflect the problem any more.

The new thread is here.


All times are GMT -5.