Hints? Managing large amounts of data, specifically photo archives
I am looking for ideas on managing large amounts of data, specifically ever-growing photo collections, with more and more photographers storing larger images. I am looking for LVM/filesystem ideas, not management software (e.g., digiKam), unless certain filesystem choices directly affect specific management software in negative ways.
LVM is certainly the way to go in my mind, since it's easily expandable. But I'm asking about strategies for picking VG and LV sizes. For instance, one could keep adding PVs to a single VG that holds a single LV. But since an LV equates to a mountable filesystem, eventually that single monolithic LV/filesystem will grow huge. How big can an ext3 filesystem grow and still be reasonably manageable? E.g., when growing an LV you should do an fsck, and that might run forever on a terabyte (or larger) filesystem.
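For reference, the grow procedure mentioned above could be sketched roughly like this. It is only a sketch under assumptions: the names `vg_photos`, `photos`, and `/srv/photos` are made up for illustration, the filesystem is assumed to be ext3 resized while unmounted, and the helper only builds the command list without running anything.

```python
def grow_commands(vg, lv, extra_gib, mountpoint):
    """Build (but do not run) the commands for an offline grow of an ext3 LV.

    All names passed in are hypothetical; adjust them to your own layout.
    """
    device = f"/dev/{vg}/{lv}"
    return [
        ["umount", mountpoint],                        # take the filesystem offline
        ["lvextend", "-L", f"+{extra_gib}G", device],  # grow the logical volume
        ["e2fsck", "-f", device],                      # forced check before resizing
        ["resize2fs", device],                         # grow ext3 to fill the LV
        ["mount", device, mountpoint],                 # bring it back online
    ]


if __name__ == "__main__":
    # Print the plan so it can be reviewed before anything is typed as root.
    for cmd in grow_commands("vg_photos", "photos", 200, "/srv/photos"):
        print(" ".join(cmd))
```

The `e2fsck -f` step is the one whose run time grows with the filesystem, which is why the size of each LV matters here.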
So what do you large-scale data collectors do? Make a new LV/filesystem for every new year? Just go with one gigantic LV/filesystem? Within this filesystem/filesystems I plan to use year and month subdirectories to organize things (makes for easy backups). My question is more along the lines of where do I put boundaries and choose a new filesystem over a new subdirectory? I'm experienced with Linux and LVM, but not necessarily with strategies for managing large chunks of data (to encapsulate possible future corruption, to make backups/restores manageable, etc.)
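To make the year/month layout concrete, here is a minimal sketch of how a photo's date could map to a subdirectory and, under the one-LV-per-year option, to an LV name. The root `/srv/photos`, the `photos_` prefix, and both function names are assumptions made up for illustration, not an established convention.

```python
from datetime import date
from pathlib import PurePosixPath


def archive_path(root, taken, filename):
    """Return root/YYYY/MM/filename for a photo taken on the given date."""
    return PurePosixPath(root) / f"{taken.year:04d}" / f"{taken.month:02d}" / filename


def lv_name_for(taken, prefix="photos"):
    """Return the (hypothetical) per-year LV name a photo would live on."""
    return f"{prefix}_{taken.year}"


if __name__ == "__main__":
    shot = date(2008, 3, 14)
    print(archive_path("/srv/photos", shot, "img_0001.jpg"))
    print(lv_name_for(shot))
```

With a single big filesystem only `archive_path` matters; with one LV per year, each `photos_YYYY` volume would be mounted at the matching year directory, so the paths stay the same either way.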
Also: I've been using LVM for years and never had any problems. But for those of you who HAVE had problems (corruption?), were those problems at the LV level or the VG level? That kind of info might help me in deciding how to divide things up. I may want multiple VGs in addition to multiple LVs. Of course I do backups, but it's still nicer to deal with smaller filesystems when you are trying to repair corruption or restore from backup. On the other hand, too many small filesystems can get cumbersome to manage.
Thanks for any experience/ideas on how to manage this.
I'd personally go with a SAN solution. Once you connect the HBA and fiber, the disk on the SAN side is presented to the Linux box. You can present it as one huge piece or as a zillion little volumes, but all of that is done on the SAN side. Linux sees what is presented to it. Need more space? You can 'grow' a SAN volume, or add more disks by simply presenting them. Put them in whatever configuration you want (RAID-1, RAID-5, etc.), and again, Linux doesn't know or care; it just sees disk.
I've had bad luck with LVM in the past, though, where losing one volume has corrupted the others or rendered the entire volume unreadable, so I may be biased. Your mileage may vary. :)