Linux - Server
This forum is for the discussion of Linux software used in a server-related context.
I have a situation where I need a file server to handle a few million scanned black-and-white TIFF documents. These documents have variable numbers of pages, and they can be anywhere from about 10 KB to 50 MB.
I was wondering if anyone can suggest which of the available file systems would work best for serving up files of this size and volume. Adding new documents quickly, accessing existing documents, and saving updated documents are all important.
Also, is there a best way to set up the directory structure, or is that irrelevant for a file system with a B-tree directory index?
If the file sizes have something vaguely like a log-normal distribution (i.e., there is a big peak in the frequency of file sizes, and it sits at the low end), I might consider ReiserFS, but otherwise not.
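If you want to check that before deciding, here is a quick sketch that buckets file sizes into powers of two (assuming GNU find; /srv/scans is a placeholder for wherever your documents live):

    # print each file's size in bytes, bucket by power of two, count per bucket
    find /srv/scans -type f -printf '%s\n' \
      | awk '$1 > 0 { bucket = int(log($1) / log(2)); count[bucket]++ }
             END { for (b in count) printf "%12d  %d\n", 2^b, count[b] }' \
      | sort -n

A strong single peak near the small end of that histogram is roughly the log-normal-ish shape I mean.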
If there is no data backup and regeneration issue (i.e., the files come from a place from which they can easily be re-loaded), I might even be adventurous enough to consider ext4. If this is the primary copy of your data, I'd be more conservative than that and consider ext4 still too bleeding-edge.
Then I'd consider whether I wanted 'atime' for any reason; I'm guessing not, but you might have your reasons. If you're happy with 'noatime', you'd only want these files on the same partition as other noatime data, and that is probably a real performance gain in this case. That then gets into what else is on the machine and whether you're happy with noatime on the other partitions...
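For reference, noatime is just a mount option; a minimal sketch of an /etc/fstab entry (the device, mount point, and file system type are placeholders):

    # <device>   <mount point>  <type>  <options>          <dump>  <pass>
    /dev/sdb1    /srv/scans     ext3    defaults,noatime   0       2

You can also try it on a live system without editing anything: mount -o remount,noatime /srv/scans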
I've never used XFS or JFS, but they could come into consideration, again depending a bit on the distribution of file sizes.
Otherwise, it's the old favourites ext2 and ext3: does journalling do anything for you (see the data backup and recreation issues above)? If not, ext2. If journalling does give you extra security, ext3.
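The nice part is you don't have to commit up front, since ext2 can be converted to ext3 in place by adding a journal. A rough sketch (the device name is a placeholder):

    # create an ext2 file system
    mkfs.ext2 /dev/sdb1

    # later, if you decide you want journalling after all, add a
    # journal and it becomes ext3 (tune2fs is part of e2fsprogs)
    tune2fs -j /dev/sdb1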
What is the fastest method to transfer these files from one system to another without actually moving the hard drives?
I tried setting up a dedicated network link between the two machines over their spare 100BaseT connections. The transfer started out at 6 GB in the first hour, but as time went on it slowed further and further until it was down to about 2 GB an hour; at that rate it would take more than two weeks to transfer the entire 750 GB of existing data.
I then connected an external USB 2.0 drive to the existing server and started copying. The first hour transferred 10 GB, and then the rate started falling off. It is on its third hour of copying now, and it looks like I am down to about 5 GB an hour.
Is there some faster way of copying this data? It seems like USB 2.0 should be able to transfer the data much faster, and the hard drive should be able to write at least 33 MB a second. Even at half of that, I should be able to get about 1 GB a minute, I would think.
There are 750 GB to transfer and a total of 3.5 million individual files.
In my opinion you have two good options presently: XFS and ext4. That being said, XFS is far more mature than ext4 at present, although in some cases ext4 performs better.
Edit: When you say "dedicated network link", you mean a crossover cable, right? You would probably also get some benefit from raising your rmem/wmem from the defaults if you're going to do it over the wire. The fastest way, of course, is just to drop the drive into the other system. As far as an application to do the transfer, I'd suggest rsync; make sure you DON'T turn on compression, though (see the sketch below). It also has the handy benefit that you can start and stop it whenever you like, and it will effectively pick up where it left off.
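A minimal sketch of both suggestions (the buffer sizes, host, and paths are illustrative, not tuned values):

    # raise the socket buffer ceilings...
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    # ...and the TCP autotuning ranges (min / default / max, in bytes)
    sysctl -w net.ipv4.tcp_rmem='4096 87380 16777216'
    sysctl -w net.ipv4.tcp_wmem='4096 65536 16777216'

    # -a preserves permissions and times; --partial keeps interrupted
    # files so a restart resumes cheaply; note there is no -z, so
    # compression stays off
    rsync -a --partial --progress /srv/scans/ user@192.168.1.2:/srv/scans/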
Yes, I was referring to a crossover cable between the two machines.
I tried using rsync, but it gave me an error indicating that there were too many files. Is there a limit to how many files it will process from a directory?
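(A sketch of one possibility, in case the limit is the shell's rather than rsync's: invoking rsync with a wildcard makes the shell expand every matching name into the argument list, which can overflow with millions of files, whereas passing the directory itself hands the traversal to rsync. Host and paths are placeholders.)

    # this can fail with millions of files because the shell expands the glob:
    #   rsync -a /srv/scans/* user@192.168.1.2:/srv/scans/
    # passing the directory instead lets rsync do the traversal:
    rsync -a /srv/scans/ user@192.168.1.2:/srv/scans/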