LinuxQuestions.org - [SOLVED] ext2online does not work on extended lvm logical volume


hi all,

i have a problem extending a file system on an lvm managed raid. It does not expand to the available size; instead, it seems to try to SHRINK the file system.

The Setup:
we have a server running Suse Enterprise Server 10. we have a raid which contained 6x 2TB discs in a raid 5 configuration under lvm. we added 6 more 2TB discs and now want to expand the ext3 filesystem residing on the first 6 discs to span the whole raid (12x 2TB discs).

the raid controller is a hp smart array p800. the old logical disc is /dev/cciss/c0d3 .

These were the steps involved:
- configured the raid controller to incorporate the 6 new disks into the current array.
- created a new logical disk on the added space, which came up as device /dev/cciss/c0d2
- labeled the partition table with gpt and created a primary partition with type lvm spanning the whole device:

Code:

(parted) print

Disk geometry for /dev/cciss/c0d2: 0kB - 12TB

Disk-Label-Typ: gpt

Number  Start  End    Size    File system  Name                  Flags

1      17kB    12TB    12TB                                      lvm

- created a new lvm physical volume on this partition (note: the output below was captured after adding it to the volume group)

Code:

 --- Physical volume ---

  PV Name              /dev/cciss/c0d2p1

  VG Name              raid2tb

  PV Size              10,92 TB / not usable 2,49 MB

  Allocatable          yes (but full)

  PE Size (KByte)      4096

  Total PE              2861545

  Free PE              0

  Allocated PE          2861545

  PV UUID              XXXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXXXX

- added the physical volume to the lvm volume group

Code:

server:~ # pvs

  PV                VG      Fmt  Attr PSize  PFree

  /dev/cciss/c0d2p1 raid2tb lvm2 a-  10,92T    0

  /dev/cciss/c0d3p1 raid2tb lvm2 a-    7,28T    0

- extended the lvm logical volume to span the whole volume group

Code:

  --- Logical volume ---

  LV Name                /dev/raid2tb/raid2lv

  VG Name                raid2tb

  LV UUID                XXXXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXXXX

  LV Write Access        read/write

  LV Status              available

  # open                1

  LV Size                18,19 TB

  Current LE            4769241

  Segments              2

  Allocation            inherit

  Read ahead sectors    0

  Block device          253:0

the old raid has been mounted on /raid2 so df -h gave me this

Code:

/dev/mapper/raid2tb-raid2lv

                      7,2T  6,3T  935G  88% /raid2

now i tried
ext2online /raid2
but the output was

Code:

ext2online v1.1.18 - 2001/03/18 for EXT2FS 0.5b

ext2online: warning - device size 588735488, filesystem 1953480704

ext2online: /dev/mapper/raid2tb-raid2lv has 1953480704 blocks cannot shrink to 588735488

and didn't change anything on the filesystem
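
A note on the numbers in that warning: they look consistent with a 32-bit wrap-around of the device's block count. The sketch below is only an illustration. It assumes 4096-byte filesystem blocks (the "Block size" that dumpe2fs reports for this filesystem) and assumes that ext2online keeps the block count in an unsigned 32-bit integer, which nothing in this thread confirms.

```python
# Values copied from the lvdisplay / pvdisplay output in this thread.
LOGICAL_EXTENTS = 4769241   # "Current LE" of /dev/raid2tb/raid2lv
EXTENT_SIZE_KIB = 4096      # "PE Size (KByte)"
FS_BLOCK_SIZE = 4096        # bytes; assumed from the dumpe2fs "Block size" field

# Real size of the extended logical volume, in filesystem blocks.
device_bytes = LOGICAL_EXTENTS * EXTENT_SIZE_KIB * 1024
device_blocks = device_bytes // FS_BLOCK_SIZE

# Assumption: the count is truncated to an unsigned 32-bit integer.
wrapped_blocks = device_blocks % 2**32

print(device_blocks)    # 4883702784
print(wrapped_blocks)   # 588735488, the "device size" in the warning
```

If that assumption holds, the tool is not really trying to shrink anything: the true block count simply no longer fits, and the truncated value happens to be smaller than the current 1953480704 blocks.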

out of curiosity i did
ext2online -d -v /raid2
the output was basically this:

Code:

ext2online: warning - device size 588735488, filesystem 1953480704

group 2 inode table has offset 2, not 1027

group 4 inode table has offset 2, not 1027

[...snipp...]

group 59614 inode table has offset 2, not 1027

group 59615 inode table has offset 2, not 1027

ext2online: /dev/mapper/raid2tb-raid2lv has 1953480704 blocks cannot shrink to 588735488

ext2online v1.1.18 - 2001/03/18 for EXT2FS 0.5b

ext2_open

ext2_bcache_init

new filesystem size 588735488

ext2_determine_itoffset

setting itoffset to +1027

ext2_get_reserved

Found 558 blocks in s_reserved_gdt_blocks

using 558 reserved group descriptor blocks

that's it, it terminates with code 2.

can anyone identify the problem and how to fix this? help is very much appreciated.

i have to add that unmounting and doing the resize offline is not an option at the moment. but any hint on how long it takes to resize a filesystem from 7TB to 18TB would be nice anyway.

thanks in advance
jonas

Clearly there is a bug.

And at 23 hours old and 120+ views, no one has the answer.

I never heard of 'ext2online' before your post; but I've used 'resize2fs' (and the technique that you described) successfully dozens of times. Have you tried 'resize2fs'?

Also, I assume you have good backups because there is always an element of risk when manipulating filesystems.

I believe SLES10 will require (the very old) ext2online for online resize.
Where available resize2fs is preferable, and part of e2fsprogs - you should be able to use that with the f/s unmounted, even on SLES10. Best suggestion might be to get onto a more current SLES release.

thanks for the suggestions.

it's almost a no-go to take the server offline, therefore installing a new OS or unmounting the fs is a last resort.

we would rather create a new partition in the remaining space to use it.

i just compiled the latest e2fsprogs with resize2fs 1.41.14, which is supposed to support online resize as tommylovell suggested. i will give it a try on the weekend and report back.

OK, just a conclusion:

All went well with resize2fs 1.41.14. However, i couldn't expand the filesystem to the whole 18 TB, only up to 16 TB, because that's apparently the maximum for ext3 (i wasn't aware of that). Either way, colleagues are happy. Thanks for your help! :)
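
For what it's worth, the 16 TB figure matches a simple calculation. This is a sketch under the assumption that the limit comes from 32-bit block numbers combined with the 4096-byte block size of this filesystem; the extent count is the "Current LE" value from the lvdisplay output earlier in the thread.

```python
TIB = 2**40                 # bytes per tebibyte
BLOCK_SIZE = 4096           # bytes per filesystem block on this volume
MAX_BLOCKS = 2**32          # assumption: block numbers are 32 bits wide

# Largest filesystem that 32-bit block numbers can address.
max_fs_tib = MAX_BLOCKS * BLOCK_SIZE / TIB

# Size of the logical volume: 4769241 extents of 4 MiB each.
lv_tib = 4769241 * 4 * 2**20 / TIB

print(max_fs_tib)           # 16.0
print(round(lv_tib, 2))     # 18.19
print(lv_tib > max_fs_tib)  # True: the volume is larger than the limit
```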

apparently it wasn't that OK:

After a few hours no user other than root was able to write on that device. upon write every tool stated that there was no space left on the device while df showed a usage of 44% and a remaining space of 8.2 TB. however, root was able to write.

i guess somehow the new space was not available to the users while root could only write in its own reserved space.

i tried partprobe (which, from what i know, is used to re-read the partition table) and some re-mounts. But this didn't help, the error persisted.

any ideas on this?

(P.S.: I'm currently not able to log on to the system, since it is a remote server. i also tried a reboot, and i am not able to reach it at the moment, i guess because it does not shut down properly.)

This sounds as if the reserved field in the ext2 superblock was improperly modified somewhere along the way.

If you can manage to get onto the system, 'dumpe2fs -h <device>' will show you the "Reserved block count:" value. That value is normally 5% of the filesystem. The "Reserved blocks uid:" user and "Reserved blocks gid:" group (normally uid 0 - root; and gid 0 - root) are exempt from this and can use whatever space is left. For all other uid's and gid's, that reserved space is off limits.

'tune2fs -r ...' can reset that reserved value, again if you can log in as root, if that is the cause of the problem.

hey,

thanks for the answer.

the machine came up again (it booted into runlevel 1, had to send a guy to flick some switches, now it's in runlevel 3 again)

hmm...

Code:

dumpe2fs 1.41.14 (22-Dec-2010)

Filesystem volume name:  <none>

Last mounted on:          <not available>

Filesystem UUID:          XXXXXXXXXXXXXXXXXXXXXXXXXX

Filesystem magic number:  0xEF53

Filesystem revision #:    1 (dynamic)

Filesystem features:      has_journal resize_inode filetype needs_recovery sparse_super large_file

Default mount options:    (none)

Filesystem state:        clean

Errors behavior:          Continue

Filesystem OS type:      Linux

Inode count:              2097152000

Block count:              4194304000

Reserved block count:    25246337

Free blocks:              2337720331

Free inodes:              2096800474

First block:              0

Block size:              4096

Fragment size:            4096

Reserved GDT blocks:      24

Blocks per group:        32768

Fragments per group:      32768

Inodes per group:        16384

Inode blocks per group:  512

Filesystem created:      Wed Jan 12 12:40:41 2011

Last mount time:          Mon Dec  5 07:12:56 2011

Last write time:          Mon Dec  5 07:12:56 2011

Mount count:              1

Maximum mount count:      24

Last checked:            Mon Dec  5 01:24:41 2011

Check interval:          15552000 (6 months)

Next check after:        Sat Jun  2 02:24:41 2012

Reserved blocks uid:      0 (user root)

Reserved blocks gid:      0 (group root)

First inode:              11

Inode size:              128

Journal inode:            8

Default directory hash:  tea

Directory Hash Seed:      XXXXXXXXXXXXXXXXXXXXXXXXX

Journal backup:          inode blocks

Jounal properties:        journal_incompat_revoke

Journal size:            128M

Journal length:            32768

Journal sequence:          0x0042b6f5

Journal start:            23175

it looks ok, doesn't it? as i read the output, the reserved block count is actually far less than 5%, more like 0.6% (25246337/4194304000).
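
The same check can be done mechanically. The following is a minimal sketch that reads three fields from a pasted excerpt of the dumpe2fs output above and works out the reserved share; the helper name parse_fields is made up for this illustration and is not part of e2fsprogs.

```python
# Excerpt of the dumpe2fs header shown above.
DUMPE2FS_EXCERPT = """\
Block count:              4194304000
Reserved block count:    25246337
Free blocks:              2337720331
Block size:              4096
"""

def parse_fields(text):
    """Hypothetical helper: turn 'Name:   value' lines into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

fields = parse_fields(DUMPE2FS_EXCERPT)
block_count = int(fields["Block count"])
reserved = int(fields["Reserved block count"])
free = int(fields["Free blocks"])
block_size = int(fields["Block size"])

# Share of the filesystem that is held back for the reserved uid/gid.
reserved_pct = 100 * reserved / block_count

# Free space that users other than the reserved uid/gid should be able to use.
user_free_tib = (free - reserved) * block_size / 2**40

print(round(reserved_pct, 2))
print(round(user_free_tib, 2))
```

By these numbers the reserved area is well under 1% and several TiB should remain writable for ordinary users, so the reserved block count alone does not seem to explain the "no space left" errors.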

as you can see, i ran e2fsck last night, it found no errors and reported the right block size.

i am slowly running out of ideas... :( everything looks ok, but i cannot write on that thing.

I am opening a new thread, since the title does not reflect the problem any more.

The new thread is here.