Linux - Software: This forum is for Software issues. Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.
I am running Mandy 10 and have been looking for a hard drive defragmenter for Linux; however, I can't seem to find one anywhere (not preinstalled, nor in the RPMDrake install/update section).
Is there a default one that I haven't found, or do I need to download a program? If so, what's the program's name?
I have several systems running linux rh9 and they are badly fragmented. This is seriously impacting system performance and disk space usage. I need to de-frag the drives to better compact the files for disk space usage, reduce memory block usage, and to reduce seek times by having the files organized into zones by type and frequency of use. Does anyone know of any tools that will do this on a linux system?
Good luck. If you find something, post it.
The only way I defragment is by copying everything off, deleting it, then copying it back. It only helps a little. There used to be an ext2 defragmenter, but I bet you aren't using ext2.
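The copy-off-and-back approach can be sketched like this (all paths are hypothetical examples; adjust to your mount points, and triple-check before the delete step):

```shell
# 1. Copy everything off, preserving ownership and permissions:
tar -C /data -cf /backup/data.tar .

# 2. Delete the originals so the filesystem can reallocate blocks:
rm -rf /data/*        # verify this path very carefully first!

# 3. Copy it all back; files are rewritten into (mostly) contiguous free space:
tar -C /data -xf /backup/data.tar
```

You need enough scratch space for the full copy, and the filesystem should be otherwise quiet while this runs.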
Sorry, I didn't mean to impose my problems on a community running development systems. My application does real-time knowledge management of wire services; a system comprises several terabytes of data spanning many drives, with some files at 2 GB each and several hundred thousand small files at 2-5 KB each. The speed of accessing and modifying these files, as well as scanning the 2 GB files, is critical to the real-time support of the 100 or so users per system.
Gordon - what you are describing is not a fragmentation problem, which is the condition of a _single_ file being physically stored in numerous separate locations on a disk. Instead, based on your description, you've just got huge numbers of tiny (but unfragmented) files, or huge monolithic files that take forever to scan.
In order to increase performance of such a system, it appears that you need to re-evaluate your data storage scheme. Specifically, can those 2 GB files be broken into more moderately sized chunks? Obviously, if you only need, say, 20 MB of data out of an entire 2 GB file, your system is doing a lot of unnecessary work scanning through the remaining 1.98 GB. Similarly, can the 3 KB files be aggregated in some way, so that you reduce the number of I/O operations? Can you store the files differently, in order to keep similar files physically near one another, again to reduce I/O? (Example: if your files are organized by date, with 2001 data on one drive, 2002 data on another, etc., but the queries most often run retrieve data by zip code (or whatever), then you could gain a lot of improvement by reorganizing your data by zip code.) The point is, to the greatest extent possible, you want a storage scheme matched to the sorts of queries that are run most frequently. That's where I'd suggest you spend your analysis time -- defragging really doesn't seem to be the issue. Good luck with the project. -- J.W.
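The "break the 2 GB files into chunks" idea needs nothing exotic; standard tools will do it (the file name below is hypothetical):

```shell
# Split one large file into 100 MB pieces named wire_feed.part.aa, .ab, ...:
split -b 100M wire_feed.dat wire_feed.part.

# Scan only the pieces you need instead of the whole file. If a monolithic
# copy is ever needed again, the shell glob preserves the original order:
cat wire_feed.part.* > wire_feed.rebuilt.dat
```

Whether this helps depends on whether your scans can actually be limited to a subset of chunks; if every scan touches the whole 2 GB anyway, splitting only adds bookkeeping.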
Well, the filesystems commonly used with Linux rarely become significantly fragmented, so the problem likely lies somewhere else. I've never even seen an fsck report more than 2% non-contiguous.
That's odd. On my Mandy system I have well over 500 GB, and I have a server hosting a terabyte in RAID, and I never even hit half a percent of fragmentation -- on several systems, too. Maybe your hard drive is failing? Or your IDE setup is off? But I don't think it's the ext3 filesystem.
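If you want to check the numbers yourself, e2fsprogs can report both per-filesystem and per-file fragmentation (the device and file names below are examples; run e2fsck only against an unmounted filesystem):

```shell
# Read-only check; the summary line includes the "% non-contiguous" figure:
e2fsck -fn /dev/hda1

# Number of extents (fragments) for one specific file:
filefrag /data/wire_feed.dat
```

The -n flag makes e2fsck answer "no" to every repair prompt, so it is safe as a reporting tool.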
Linux HDs can become fragmented if you run continually at close to 100% full; a file will only fragment when there is no contiguous free space left to store it. Gordon, have you thought of a ramdisk that loads the most often accessed files into memory? Also, have you looked at how files are combined on drives? For example, on our production systems, one HD is for executables, one is for data, and one is for the database. This means that loading the database and/or data is not fighting with executable loading on the same HD, which greatly decreases load time. Is there a logic to your files that would let you put files that are read in parallel onto separate HDs?
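A quick way to try the ramdisk idea: most Linux systems already mount a tmpfs at /dev/shm, so you can stage hot files there without any setup (the file names are hypothetical; tmpfs contents vanish on reboot, so stage only cache copies, never masters):

```shell
# Stage frequently read files in RAM-backed storage:
mkdir -p /dev/shm/hotcache
cp /data/hot/lookup.idx /dev/shm/hotcache/

# A dedicated ramdisk works the same way but needs root:
#   mount -t tmpfs -o size=512m tmpfs /mnt/hotcache
```

Point the readers of those files at the tmpfs copies; for read-mostly data this removes both the seeks and the fragmentation question entirely.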
Hi. Sorry, I mis-stated the configuration. We have lots of 2 GB files, several hundred million smaller files, and perhaps several million intermediate files. I have discovered that fsck lies quite a bit about fragmentation -- dump the directory structure and take a look.
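"Dump the directory structure and take a look" can be done mechanically. This sketch buckets files by size so you can see how many tiny, intermediate, and large files a tree actually holds (the path and bucket boundaries are arbitrary examples; -printf is GNU find):

```shell
# Count files under /data in three size buckets: <8 KB, <1 MB, and larger:
find /data -type f -printf '%s\n' |
  awk '{ if ($1 < 8192) s++; else if ($1 < 1048576) m++; else b++ }
       END { printf "small=%d medium=%d large=%d\n", s, m, b }'
```

Knowing the real size distribution is the first step toward deciding whether aggregating the small files or splitting the large ones will pay off.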
Our systems are suffering from several fragmentation problems. First, the large files are often written as new and then scanned. The speed of the scan depends on the contiguity of the data, because of the overhead of allocating system buffers for the I/O, the number of filesystem structure reads that have to be performed to follow the data stream, and the length of the seeks required to follow it.
With the system continuously streaming in new files and supporting lots of update traffic, the filesystem becomes fragmented. Then, even though roughly half of the files become read-only, the fact that they are not contiguous causes needless overhead.
What I need is a tool that reorders the data on the disk so that it is contiguous, and that moves files of like kind (in terms of access frequency, read/write activity, and size) into different zones on the disk so that seek times can be optimized.
The ext3 filesystem seems to be very similar technologically to the NTFS filesystem on our XP systems. We get almost the identical behavior there, but using defrag tools on Windows seriously mitigates the problems.
Interestingly, Microsoft originally claimed that the NTFS filesystem never needed defragmenting. It only took a few months of major users complaining to fix that broken idea.
I can understand that the ext3 and NTFS technology provides fast access to the drive and balances the data spread. Unfortunately, what we want is focused data, not a balanced data spread. Balanced spread works well for workgroups, but not necessarily for highly focused applications with predictable data access patterns.
There have been a number of gotchas in Linux that have hurt our large deployments and set us back over the past few years. Over time, however, they have become fewer and fewer. Hopefully this one too will soon disappear.