LinuxQuestions.org
01-05-2009, 11:02 PM   #1
haertig
Senior Member
 
Registered: Nov 2004
Distribution: Debian, Ubuntu, LinuxMint, Slackware, SysrescueCD
Posts: 2,001

Rep: Reputation: 302
Hints? Managing large amounts of data, specifically photo archives


I am looking for ideas on managing large amounts of data. Specifically, ever-growing photo collections, with more and more photographers storing larger images. I am looking for LVM/filesystem ideas, not management software (e.g., digiKam), unless certain filesystem choices directly affect specific management software in negative ways.

LVM is certainly the way to go in my mind, since it's easily expandable. But I'm asking about strategies for picking VG and LV sizes. For instance, one could keep adding PVs to a single VG that holds a single LV. But since an LV equates to a mountable filesystem, eventually that single monolithic LV/filesystem will grow huge. How big can an ext3 filesystem grow and still be reasonably manageable? For example, when growing an LV you should do an fsck, and that might run forever on a terabyte (or larger) filesystem.
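
To make the grow workflow above concrete, here is a minimal dry-run sketch. It only prints the commands, so nothing is modified. The device `/dev/sdc1`, the volume group `photos_vg`, the logical volume `archive_lv`, and the mount point `/srv/photos` are all hypothetical names chosen for illustration. It takes the conservative offline route (unmount, forced fsck, resize); where the kernel and filesystem support online resizing, ext3 can also be grown while mounted.

```shell
#!/bin/sh
# Dry-run sketch: print (do not run) the commands to add a disk to a VG
# and grow one ext3 LV offline. All names passed in are hypothetical.
grow_plan() {
    pv="$1"      # new physical volume, e.g. /dev/sdc1
    vg="$2"      # existing volume group
    lv="$3"      # logical volume to grow
    size="$4"    # amount of space to add, e.g. 200G
    mnt="$5"     # mount point of the filesystem
    dev="/dev/$vg/$lv"

    echo "pvcreate $pv"              # label the new disk as a PV
    echo "vgextend $vg $pv"          # add the PV to the volume group
    echo "umount $mnt"               # offline resize: unmount first
    echo "lvextend -L +$size $dev"   # grow the LV by the given amount
    echo "e2fsck -f $dev"            # forced check before an offline resize
    echo "resize2fs $dev"            # grow the filesystem to fill the LV
    echo "mount $dev $mnt"           # put the filesystem back in service
}

grow_plan /dev/sdc1 photos_vg archive_lv 200G /srv/photos
```

The forced `e2fsck -f` step is the part whose run time grows with the filesystem, which is why the size of a single LV matters for this question.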

So what do you large-scale data collectors do? Make a new LV/filesystem for every new year? Just go with one gigantic LV/filesystem? Within this filesystem/filesystems I plan to use year and month subdirectories to organize things (makes for easy backups). My question is more along the lines of where do I put boundaries and choose a new filesystem over a new subdirectory? I'm experienced with Linux and LVM, but not necessarily with strategies for managing large chunks of data (to encapsulate possible future corruption, to make backups/restores manageable, etc.)
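
One of the options raised above, a new LV/filesystem per year with month subdirectories inside it, could look roughly like the dry run below. This is a sketch under assumptions, not a recommendation: the volume group `photos_vg`, the LV naming scheme `photos_YEAR`, the size, and the base directory `/srv/photos` are hypothetical, and the function only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch: print (do not run) the commands to create one ext3 LV
# per year, mounted under a base directory, with month subdirectories.
# The VG name, LV naming scheme, and paths are hypothetical examples.
year_plan() {
    vg="$1"      # existing volume group
    year="$2"    # e.g. 2009
    size="$3"    # initial LV size, e.g. 300G
    base="$4"    # directory that holds the per-year mount points
    dev="/dev/$vg/photos_$year"

    echo "lvcreate -L $size -n photos_$year $vg"   # one LV per year
    echo "mkfs.ext3 $dev"                          # new ext3 filesystem
    echo "mkdir -p $base/$year"                    # mount point for the year
    echo "mount $dev $base/$year"
    for month in 01 02 03 04 05 06 07 08 09 10 11 12; do
        echo "mkdir -p $base/$year/$month"         # month subdirectories
    done
}

year_plan photos_vg 2009 300G /srv/photos
```

With this layout, each year is a separate filesystem that can be checked, backed up, or restored on its own, at the cost of one more mount to manage per year.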

Also: I've been using LVM for years and never had any problems. But for those of you who HAVE had problems (corruption?), were those problems at the LV level or the VG level? That kind of info might help me in deciding how to divide things up. I may want multiple VGs in addition to multiple LVs. Of course I do backups, but it's still nicer to deal with smaller filesystems when you are trying to repair corruption or restore from backup. On the other hand, too many smaller filesystems can get cumbersome to manage.

Thanks for any experience/ideas on how to manage this.

Last edited by haertig; 01-05-2009 at 11:04 PM.
 
01-06-2009, 08:33 AM   #2
TB0ne
Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 14,488

Rep: Reputation: 2543
I can just give you my $0.02 worth. To me, LVM (or any kind of software disk abstraction) is too slow and difficult to administer. This is especially true if you want LOTS of space, with the ability to add space later, and need to keep it running 24/7.

I'd personally go with a SAN solution. Once you connect the HBA and fiber, the disk on the SAN side is presented to the Linux box. You can present it as 1 huge piece, or a zillion little volumes, but all that's done on the SAN side. Linux sees what is presented to it. Need more space? You can 'grow' a SAN volume, or add more disks by simply presenting them. Put them in whatever configuration you want (RAID-1, RAID-5, etc.), and again, Linux doesn't know or care....just sees disk.

I've had bad luck with LVM in the past, though, where losing one volume has corrupted the others, or rendered the entire volume unreadable, so I may be biased. Your mileage may vary.
 
  


All times are GMT -5.
