Linux - Software: This forum is for Software issues. Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.
I am running Mandy 10 and have been looking for a hard drive defragmenter for Linux; however, I can't seem to find one anywhere (not preinstalled, nor in the RPMDrake install/update section).
Is there a default one that I haven't found, or do I need to download a program? If so, what's the program's name?
I have several systems running linux rh9 and they are badly fragmented. This is seriously impacting system performance and disk space usage. I need to de-frag the drives to better compact the files for disk space usage, reduce memory block usage, and to reduce seek times by having the files organized into zones by type and frequency of use. Does anyone know of any tools that will do this on a linux system?
Good luck. If you find something, post it.
The only way I defragment is by copying everything off, deleting it, then copying it back. It only helps a little. There used to be an ext2 defragmenter, but I bet you aren't using ext2.
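The copy-off-and-back approach can be sketched like this (all paths are hypothetical examples; adjust to your mount points, and triple-check before the delete step):

```shell
# 1. Copy everything off, preserving ownership and permissions:
tar -C /data -cf /backup/data.tar .

# 2. Delete the originals so the filesystem can reallocate blocks:
rm -rf /data/*        # verify this path very carefully first!

# 3. Copy it all back; files are rewritten into (mostly) contiguous free space:
tar -C /data -xf /backup/data.tar
```

You need enough scratch space for the full copy, and the filesystem should be otherwise quiet while this runs.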
Sorry, I didn't mean to impose my problems on a community running development systems. My application does real-time knowledge management of wire services; a system comprises several terabytes of data spanning many drives, with some files at 2 GB each and several hundred thousand small files at 2-5 KB each. The speed of accessing and modifying these files, as well as scanning the 2 GB files, is critical to the real-time support of the 100 or so users per system.
Gordon - what you are describing is not a fragmentation problem, which is the condition of a _single_ file being physically stored in numerous separate locations on a disk. Instead, based on your description, you've just got huge numbers of tiny (but unfragmented) files, or huge monolithic files that take forever to scan.
In order to increase performance of such a system, it appears that you need to re-evaluate your data storage scheme. Specifically, can those 2 GB files be broken into more moderately sized chunks? Obviously, if you only need, say, 20 MB of data out of an entire 2 GB file, your system is doing a lot of unnecessary work scanning through the remaining 1.98 GB. Similarly, can the 3 KB files be aggregated in some way, so that you reduce the number of I/O operations? Can you store the files differently, in order to keep similar files physically near one another, again to reduce I/O? (Example: if your files are organized by date, with 2001 data on one drive, 2002 data on another, etc., but the queries most often run retrieve data by zip code (or whatever), then you could gain a lot of improvement by reorganizing your data by zip code.) The point is, to the greatest extent possible, you want a storage scheme matched to the sorts of queries that are run most frequently. That's where I'd suggest you spend your analysis time -- defragging really doesn't seem to be the issue. Good luck with the project. -- J.W.
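The "break the 2 GB files into chunks" idea needs nothing exotic; standard tools will do it (the file name below is hypothetical):

```shell
# Split one large file into 100 MB pieces named wire_feed.part.aa, .ab, ...:
split -b 100M wire_feed.dat wire_feed.part.

# Scan only the pieces you need instead of the whole file. If a monolithic
# copy is ever needed again, the shell glob preserves the original order:
cat wire_feed.part.* > wire_feed.rebuilt.dat
```

Whether this helps depends on whether your scans can actually be limited to a subset of chunks; if every scan touches the whole 2 GB anyway, splitting only adds bookkeeping.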
Well, the filesystems commonly used with Linux rarely become significantly fragmented, so the problem likely lies somewhere else. I've never even seen an fsck report more than 2% non-contiguous.
That's odd. On my Mandy system I have well over 500 GB, and I have a server hosting a terabyte in RAID, and I never even hit half a percent of fragmentation -- on several systems, too. Maybe your hard drive is failing? Or your IDE setup is off? But I don't think it's the ext3 filesystem.
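If you want to check the numbers yourself, e2fsprogs can report both per-filesystem and per-file fragmentation (the device and file names below are examples; run e2fsck only against an unmounted filesystem):

```shell
# Read-only check; the summary line includes the "% non-contiguous" figure:
e2fsck -fn /dev/hda1

# Number of extents (fragments) for one specific file:
filefrag /data/wire_feed.dat
```

The -n flag makes e2fsck answer "no" to every repair prompt, so it is safe as a reporting tool.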
Linux HDs can become fragmented if you run continually at close to 100% full; a file will only fragment when there is no contiguous free space left to store it. Gordon, have you thought of a ramdisk that loads the most often accessed files into memory? Also, have you looked at how files are combined on drives? For example, on our production systems, one HD is for executables, one is for data, and one is for the database. This means that loading the database and/or data is not fighting with executable loading on the same HD, which greatly decreases load time. Is there a logic to your files that would let you put files that are read in parallel onto separate HDs?
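A quick way to try the ramdisk idea: most Linux systems already mount a tmpfs at /dev/shm, so you can stage hot files there without any setup (the file names are hypothetical; tmpfs contents vanish on reboot, so stage only cache copies, never masters):

```shell
# Stage frequently read files in RAM-backed storage:
mkdir -p /dev/shm/hotcache
cp /data/hot/lookup.idx /dev/shm/hotcache/

# A dedicated ramdisk works the same way but needs root:
#   mount -t tmpfs -o size=512m tmpfs /mnt/hotcache
```

Point the readers of those files at the tmpfs copies; for read-mostly data this removes both the seeks and the fragmentation question entirely.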
Hi. Sorry, I mis-stated the configuration. We have lots of 2 GB files, several hundred million smaller files, and perhaps several million intermediate files. I have discovered that fsck lies quite a bit about fragmentation -- dump the directory structure and take a look.
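"Dump the directory structure and take a look" can be done mechanically. This sketch buckets files by size so you can see how many tiny, intermediate, and large files a tree actually holds (the path and bucket boundaries are arbitrary examples; -printf is GNU find):

```shell
# Count files under /data in three size buckets: <8 KB, <1 MB, and larger:
find /data -type f -printf '%s\n' |
  awk '{ if ($1 < 8192) s++; else if ($1 < 1048576) m++; else b++ }
       END { printf "small=%d medium=%d large=%d\n", s, m, b }'
```

Knowing the real size distribution is the first step toward deciding whether aggregating the small files or splitting the large ones will pay off.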
Our systems are suffering from several fragmentation problems. First, the large files are often written as new and then scanned. The speed of the scan depends on the contiguity of the data, because of the overhead of allocating system buffers for the I/O, the number of filesystem structure reads that have to be performed to follow the data stream, and the length of the seeks required to follow it.
With the system continuously streaming in new files and supporting lots of update traffic, the filesystem becomes fragmented. Then, even though roughly half of the files become read-only, the fact that they are not contiguous causes needless overhead.
What I need is a tool that reorders the data on the disk so that it is contiguous, and that moves files of like kind (in terms of access frequency, read/write activity, and size) into different zones on the disk so that seek times can be optimized.
The ext3 filesystem seems to be very similar technologically to the NTFS filesystem on our XP systems. We get almost the identical behavior there, but using defrag tools on Windows seriously mitigates the problems.
Interestingly, Microsoft originally claimed that the NTFS filesystem never needed defragmenting. It only took a few months of major users complaining to fix that broken idea.
I can understand that the ext3 and NTFS technology provides fast access to the drive and balances the data spread. Unfortunately, what we want is focused data, not a balanced data spread. Balanced spread works well for workgroups, but not necessarily for highly focused applications with predictable data access patterns.
There have been a number of gotchas in Linux that have hurt our large deployments and set us back over the past few years. Over time, however, they have become fewer and fewer. Hopefully this one too will soon disappear.