Sunday, June 12, 2005

Fragmentation in Unix file systems

Back in the days when I started to use Unix-like systems (Linux), I learned that their file systems barely suffer from fragmentation. Nobody ever told me the reason behind that statement, and I never bothered to look for it, since there was no choice of file system (only ext2 at the time under Linux) and there were no defragmentation utilities. Therefore, I assumed it was true... but it's not! (At least not as I understood it.)

However, I recently started to worry about this issue because I felt that some typical tasks, such as CVS updates, were becoming slower and slower every day. As I learned in an operating systems course at university, the file system does not try to (or at least is not required to) prevent fragmentation. Even so, some of its basic data structures (fragments and blocks) mitigate the problem of internal fragmentation for small files.

So I did some empirical tests. I unpacked a copy of pkgsrc on my fastest disk (/home, FFSv2) and the operation took around 300 seconds to complete; that is five minutes. Note that this disk hasn't been formatted for a long time and holds more than 100 GB of data (including lots of small files), so one can expect the files to be widely spread around the disk.
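A measurement like this one could be scripted roughly as follows. This is only a sketch: the function name `time_unpack` and the paths in the usage note are hypothetical, and it times a single run, so the buffer cache and whatever else the disk is doing can skew the result.

```python
import tarfile
import time
from pathlib import Path


def time_unpack(archive, dest):
    """Extract `archive` into `dest` and return the elapsed seconds."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    start = time.perf_counter()
    # tarfile.open() detects the compression (none, gzip, bzip2...) itself.
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    return time.perf_counter() - start
```

Calling it as `time_unpack("pkgsrc.tar.gz", "/home/test")` and again with a directory on the clean partition (both paths are made up here) gives two numbers to compare.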

Then I did the same operation on a clean partition of the slower disk (the speed difference is barely noticeable between the two disks, I may add). It took less than 90 seconds... that is, the old, full file system needed about 233% more time (more than three times as long)! Keep in mind that this was just a very specific test and the results cannot be extrapolated to other uses (don't do that!), but at least it confirmed my suspicions.
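The arithmetic behind the 233% figure can be checked with a few lines. The function name `compare_runs` is made up for this sketch, and 90 seconds is used as a round number even though the clean run was slightly faster than that.

```python
def compare_runs(slow_s, fast_s):
    """Return (speedup factor, % extra time of the slow run, % time saved)."""
    speedup = slow_s / fast_s
    extra_pct = (slow_s - fast_s) / fast_s * 100
    saved_pct = (slow_s - fast_s) / slow_s * 100
    return speedup, extra_pct, saved_pct


# The two timings from the test: about 300 s on the old disk, about 90 s
# on the clean partition.
speedup, extra_pct, saved_pct = compare_runs(300, 90)
print(f"{speedup:.2f}x faster, {extra_pct:.0f}% extra time, {saved_pct:.0f}% saved")
# prints "3.33x faster, 233% extra time, 70% saved"
```

So the same result can be read as "233% more time on the old disk" or as "70% less time on the clean one", depending on which run is taken as the baseline.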

So what did I do? Repartitioned my NetBSD installation. Instead of having a single root file system for everything except /home, I created three partitions: one for the system itself, one for the sources (pkgsrc, src and xsrc) and one for temporary object files. This way I expect, at least, that the system binaries will be kept together, giving better program startup times, and that expensive operations over the source trees (the CVS updates I mentioned before) will remain quick.

4 comments:

  1. [Originally posted at 2005-06-13 09:39 am UTC]

    They always pointed me to this when asking about filesystem fragmentation:

    /usr/share/doc/smm/05.fastfs

    GH

  2. [Originally posted at 2005-06-13 07:16 pm UTC]

    A very interesting read, indeed. Thanks!

  3. Hi

    So you mean you partitioned using the following partitions?:

    /
    /home
    /sources
    /tmp

    If so, how does this actually speed up programs?

    TIA and thanks for a very nice blog (although, maybe obviously, most of it is well over my head).

  4. I ended up with the following layout:

    / - 20 GB
    /s - Sources, 10 GB
    /o - Objects, 20 GB
    /tmp - tmpfs, automatically sized
    /home - Second hard disk.

    I think it sped things up a bit, but it eventually got bad again. I guess it's because of CVS's poor way of updating source trees, as it creates lots of directory entries that it later has to delete, spreading the files all around.

    As regards applications, I can't really compare the "improvement", but it didn't feel much faster. GNOME is kind of slow, no matter how you look at it ;-)
