LinuxQuestions.org

Linux - General: defrag on Linux (https://www.linuxquestions.org/questions/linux-general-1/defrag-on-linux-331862/)

JOKirk 01-28-2009 01:17 PM

Quote:

Originally Posted by knersus (Post 2908546)
Thanks, I will look into the Windows fs driver. There is of course the problem of portability - I would have to install that driver on all of the PCs where that USB disk may be plugged in. I suppose that the best way to do this would be to split the disk into 2 partitions, one being a FAT32 and the other a suitable Linux native partition. The FAT32 partition can then be mounted anywhere and can be used to hold the fs driver installation file for the other partition. The only other drawbacks will then be Vista, and having the necessary permissions on the PC to install the driver.

On searching for Linux defrag practices, I stumbled upon this (very old) thread, and thought I'd jump in with some thoughts.

Sounds like knersus is setting himself up for a huge mess. No way would I mess with a format that requires me to install drivers that I already know will likely be a huge pain on any of my newer workstations (or, in a couple of years, almost all of my workstations). NTFS would be by far the easiest and best solution, unless you want the reduced overhead of FAT. For an external drive, I'd just use FAT (many do by default) because it's widely supported and efficient on space usage -- despite its potential performance implications. It depends what kind of usage it's going to see. A few thoughts on fragmentation in general, not really pertaining to external drives:

1) Hard drives have multiple read/write heads, therefore it's not simply one read/write head running through a sequential set of data. While so far unreferenced, this is obviously pertinent, in relation to "multi-user, multi-tasking, multi-threaded OS" performance. I do prefer to keep my hard drive thoroughly defragmented -- after every install, if I can, which I'll go into later -- but it's not likely to make a big difference in a lot of situations. You have numerous simultaneous read/write operations happening all the time, for all different files. The idea that Linux's ext3fs or the like does a better job than NTFS because it groups data better is, from all current information above, dead wrong. The reality is that neither file system (or any pertinent file system for us) is really going to see a significant variance in performance in modern environments from a defrag, other than FAT systems.

2) These posts focus solely on single-file fragmentation, but not order of files. If you're talking about one head scanning sequential data for performance reasons (which really is all we can address, since we don't write the algorithms for the hard drives, or add/remove read/write heads to/from hard drives), it's worth noting that most applications install countless small files for a variety of purposes, which may or may not be accessed at any time, especially when loading an application or saving a large amount of data. Ignoring file organization on a defragmented drive is no better than leaving it excessively fragmented -- any given head still has to seek over and over again. This is why I try to run defrag on my Windows machines (twice if needed) to get all the files themselves defragmented appropriately, -and- get them clustered together on one portion of the drive. Partitioning can help with this, also, if you feel like planning way ahead (though that's not really good practice).

3) Most of the relatively pertinent comparisons I've found here on LinuxQuestions.org have been in relation to FAT and FAT32, which are very outdated file systems no longer used on out-of-the-box installations of Windows (the Windows 9x line, ending with Windows Me, was the last to use a FAT-style file system by default). If we're looking for ways to attack Windows, I think we'll need to dig a tad further, as that drivel serves no one, and is misleading in the question of what to use for a drive that'll be used on both Linux and Windows systems.

My conclusion: Defragmenting can potentially hurt performance for some operations if it ends up grabbing one file from many that you'll need, and locating it somewhere else on the drive entirely, and when you need it your read/write heads aren't already nearby. However, a best-practice would be to keep things thoroughly defragmented so that any given read/write head is most efficient, which will overall decrease the seek time invested by any given head for any given file. If your defrag software also bunches files together (Windows XP defrag does somewhat, although not very well, especially on a single pass) you will further decrease the seek time for most file operations. While following this practice will undoubtedly give worse performance every once in a while, on the whole your data access performance should be better than otherwise, by a small margin, whether you're using NTFS, EXT3FS, or what have you.

Thoughts? References to other postings that may pertain? This thread was my first response from Google, and the only one that really looked useful, so I'm certainly glad to entertain other references.

-John

PTrenholme 01-28-2009 03:30 PM

JOKirk, have you looked at the ext4 file system specifications? Apparently the file system developers had opinions similar to yours, and included fragmentation reduction and seek optimization algorithms in the new stuff. (And, in case you need it, support for multiple exabyte file and disk sizes.)

Quakeboy02 01-28-2009 04:14 PM

In a single-user system, where the user is doing a single task, perhaps something can be said for periodic defragmentation. However, the goal of defragmentation is to separate data by files and to crowd them together as if they won't ever grow; this is not beneficial to a multi-user/multi-tasking system. In Linux, the OS uses what's called an "elevator algorithm" so that the heads are moving in a single direction as long as possible. In the case of scattered data, this will generally improve throughput in a multi-user/multi-task system. If the data is clumped by files, there may be some "surging" noticed by users as they have to wait for their turn in the elevator.

If you have large files and are a single user, then perhaps you would benefit from this defragmentation. However, Linux uses memory and disk caching much better than MS does, so the benefit may only be noticeable on a benchmark.
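
To make the elevator idea concrete, here is a minimal Python sketch. It is a simplified one-sweep version (often called LOOK), not the kernel's actual I/O scheduler, and the function names and block numbers are made up purely for illustration. It compares how far the head travels when requests are served in arrival order versus in elevator order.

```python
def elevator_order(pending, head, moving_up=True):
    """Order pending block requests the way a simple elevator sweep would.

    Hypothetical helper for illustration: serve everything in the current
    direction of travel first, then reverse and serve the rest.
    """
    lower = sorted(b for b in pending if b < head)
    upper = sorted(b for b in pending if b >= head)
    if moving_up:
        return upper + lower[::-1]
    return lower[::-1] + upper


def total_travel(order, head):
    """Sum of head movement (in blocks) when serving requests in this order."""
    travel = 0
    position = head
    for block in order:
        travel += abs(block - position)
        position = block
    return travel


# Requests from several "users", in the order they arrived (made-up numbers).
pending = [98, 183, 37, 122, 14, 124, 65, 67]
head = 53

print("arrival order :", total_travel(pending, head))
print("elevator order:", total_travel(elevator_order(pending, head), head))
```

With these numbers the elevator order needs less than half the head travel of the arrival order, which is the throughput gain described above. The cost is that a request just behind the head waits for the whole sweep.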

mrclisdue 01-28-2009 04:27 PM

This is my all time favourite LQ defrag thread:

http://www.linuxquestions.org/questions/linux-software-2/a-file-system-defragmentation-tool-on-linux-545928/?highlight=fragmentation

cheers,

JOKirk 01-29-2009 08:54 AM

While ext4 looks great, for a variety of reasons, it rather clearly doesn't eliminate fragmentation -- an impossible concept, from what I can tell, even though the first few sites I found regarding ext4 and fragmentation mentioned it. Quite a few pages erroneously reference extents as eliminating file fragmentation, but that's not the case at all. What they do (as you probably already know, but I'll explain for the thread anyway) is help reduce it somewhat by looking for a big enough spot for the file to be placed at its starting size, and they help reduce the amount of space needed in the file system table for the file as the file is created. Both are good things that help performance quite a bit in file system operations. Adding in data later will still have an impact on file fragmentation if there's something else coming in after it, or if the file grows beyond the original size defined (though I'm having trouble finding in-depth explanations for how it handles such a situation). Really, from what I'm seeing, investing in larger file system blocks is the best way to proactively reduce fragmentation on any file system.

It's still crazy to me that they can store an exabyte on a file system and have a 16 terabyte maximum file size, with bigger on the way. That's incredible. I still remember my old 286 that had a ... what, 30 MB hard drive? Insanity.

Good references:

http://www.ibm.com/developerworks/li...g-filesystems/
http://kernelnewbies.org/Ext4#head-7...756346be4268f4
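
As a rough illustration of the extent idea discussed in this post, here is a small Python sketch. It is not ext4's actual on-disk format; the function name and block numbers are assumptions made for the example. The point is only that an extent describes a run of contiguous blocks as (start, length), so a contiguous file needs very little bookkeeping, while a file that grew into a gap somewhere else still ends up with more than one extent.

```python
def to_extents(blocks):
    """Collapse a file's ordered block list into (start, length) extents.

    Hypothetical helper for illustration only.
    """
    extents = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # Block directly follows the previous extent: extend that run.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Gap on disk: a new extent (one more fragment) begins here.
            extents.append((block, 1))
    return extents


contiguous = list(range(100, 108))             # 8 blocks in one run
grown_later = [100, 101, 102, 103, 500, 501]   # later data landed elsewhere

print(to_extents(contiguous))
print(to_extents(grown_later))
```

The first file is described by a single entry instead of eight block pointers, which is the table-space saving. The second shows that extents by themselves do not prevent fragmentation.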

PTrenholme 01-29-2009 09:59 AM

Yes, it's probably impossible to completely eliminate fragmentation. (Even my wife's file cabinets contain several fragmented folders. In fact, she complains that I increase fragmentation every time I access the files. :))

You must be younger than I: My first system with a hard drive had a 10Mb one, and I was ecstatic! (Just to date myself, the first system I played with was an IBM system, with 512 bytes of RAM, that used "punched paper tape" for mass storage. That was when I was in college half a century ago.)

Borax_Man 01-31-2009 05:07 AM

There are a couple of ways to 'defrag' your system.

First is to MOVE all the files off the partition, then move them back on.

Second is to use e2defrag. It's an old program, and can only handle file systems with a block size of 1K.
There is a patched version of it that can handle larger block sizes. It's called e2defrag 0.73pjm1.

http://rpmseek.com/rpm/defrag_0.73pj...:3341643:0:0:0

There are two caveats. The filesystem must NOT be mounted when you defrag (or at least, mounted read only), and it MUST be ext2, not ext3. So for ext3 filesystems, you MUST convert them to ext2 before using this tool or it can screw up your filesystem.

BACK UP BEFOREHAND! I tested it on an ext3 filesystem and it trashed it, so make sure you convert it to ext2, then when done, switch back to ext3.

But my Linux installation is about 7 years old, and in all honesty, the claims that you don't need to defrag are correct. The fragmentation level fluctuates, but remains low. It's the same now as it was 5 years ago. The only filesystem where fragmentation gets high is one I use for downloads, which is usually near capacity most of the time. But usually, you won't get to the state where defragging is necessary.

nigelc 01-31-2009 08:33 PM

Defragmenting is a concept invented by MS-DOS users. FAT12, FAT16 and later FAT32 are really bad file systems for file fragmentation. Every file always tries to go towards the outside clusters. Even NTFS does it.
If you have file_A taking the first few blocks on the disk, then file_B will go on the next few blocks etc.
file_C will go next. Then delete file_A, create file_D, but this time make sure it will be bigger than the original file_A. Now we have fragmentation! :)
If the defrag prog does not leave gaps between them, then it will all go bad again.
The system is usually going slow because of other reasons.
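
The file_A to file_D sequence above can be played out in a few lines of Python. This is only a toy model of a "fill the first free blocks" allocator as described in this post, not real FAT or NTFS code; the 12-block disk, the file sizes, and the function names are all assumptions for the sake of the example.

```python
def allocate(disk, name, size):
    """Give `name` the first `size` free blocks, wherever they are."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < size:
        raise ValueError("not enough free blocks")
    for i in free[:size]:
        disk[i] = name


def delete(disk, name):
    """Free every block owned by `name`."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def count_fragments(disk, name):
    """Count the separate contiguous runs that belong to `name`."""
    runs = 0
    previous = None
    for owner in disk:
        if owner == name and previous != name:
            runs += 1
        previous = owner
    return runs


def run_example():
    disk = [None] * 12            # toy disk with 12 blocks
    allocate(disk, "A", 3)        # file_A takes the first few blocks
    allocate(disk, "B", 3)        # file_B goes right after it
    allocate(disk, "C", 3)        # file_C goes next
    delete(disk, "A")             # leaves a 3-block hole at the front
    allocate(disk, "D", 5)        # file_D is bigger than the hole
    return disk


disk = run_example()
print(disk)
print("file_D fragments:", count_fragments(disk, "D"))
```

file_D fills the hole left by file_A and then spills past file_C, so it ends up in two pieces even though nothing else on the disk is fragmented.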

JOKirk 02-04-2009 02:33 PM

I'm having a difficult time finding any good explanation of how ext4 (or 3, or whatever other file systems....) does things differently. Now it seems we're discussing more the size of file system blocks than anything. The only way to avoid running out of space for a file placed before more recent ones would be to leave gaps, which may help fragmentation on a file level, but if you install an application, it'll still spread out the files all over -- leaving you with the same performance problem you'd have with a few of the files being fragmented. I have stumbled on references to four kinds of elevator algorithms. Does anyone have any good references for the referenced method of allocating space for a new file? That would enlighten me and would be pertinent to the thread, methinks.

JOKirk 02-04-2009 03:07 PM

Quote:

Originally Posted by PTrenholme (Post 3425315)
Yes, it's probably impossible to completely eliminate fragmentation. (Even my wife's file cabinets contain several fragmented folders. In fact, she complains that I increase fragmentation every time I access the files. :))

You must be younger than I: My first system with a hard drive had a 10Mb one, and I was ecstatic! (Just to date myself, the first system I played with was an IBM system, with 512 bytes of RAM, that used "punched paper tape" for mass storage. That was when I was in college half a century ago.)

I was writing batch files and QBASIC code when I was 10 or so; I started messing with some stuff when I was 7-8. I'm 26 now, so that's... 18 years?

Man, time flies when you're having fun. :)

I haven't even -seen- a paper punch system, yet. I've seen pictures, but never in person.

chrism01 02-04-2009 07:23 PM

My 1st on-line/remote system: ICL something (19xx or 29xx I think) accessed via teletype ie paper roll and optional paper tape.
My 1st desktop/local system: Commodore PET
:)

(I didn't own either ;) , just used them )

JOKirk 02-05-2009 10:44 AM

Quote:

Originally Posted by chrism01 (Post 3432492)
My 1st on-line/remote system: ICL something (19xx or 29xx I think) accessed via teletype ie paper roll and optional paper tape.
My 1st desktop/local system: Commodore PET
:)

(I didn't own either ;) , just used them )

Vintage! That's awesome. :)

rweaver 02-05-2009 04:21 PM

Quote:

Originally Posted by ashley75 (Post 1685549)
Hi all,

how would you defrag on Linux????

by what command????


thanks,

It depends more on the file system you're using than on whether it's "linux" or windows.

Ext2 - You can use e2defrag (unmounted only, I think... been a while.)
Ext3 - No tool. No need really.
Ext4 - Tool not available yet, but should be eventually. No need really.
XFS - Has a tool for using on the mounted filesystem while in use.
FAT32/NTFS - Both have defragmentation tools

I'm sure there are others too, but those are the ones I'm most familiar with.

Generally a modern file system shouldn't be experiencing significant fragmentation. If you want to delve into the differences in how files are allocated and accessed you would need significantly more in-depth information than I'm going to type up here :P There are tons of documents on the hows and whys available with a simple google search :)


All times are GMT -5.