Sunday, January 07, 2007

CVS and fragmentation

First of all, happy new year to everybody!

I recently got a MacBook Pro and, while this little machine is great overall, the 5400 RPM hard disk is a noticeable performance bottleneck. Many people I've talked to say that the difference between 5400 and 7200 RPM should not be noticeable because:
  • These 2.5-inch drives use perpendicular recording, which stores data at a higher bit density. This means that, theoretically, they can read and write data more quickly, achieving speeds similar to those of 7200 RPM drives.
  • Modern file systems prevent fragmentation, as described here for HFS+.
To me, these two reasons are valid as long as you manage large files: the file system will try to keep them physically close and the disk will be able to transfer sequential data fairly quickly.

Unfortunately, these ideas break down when you have to deal with thousands of tiny files (or when you flood the drive with requests from different applications, but that is not what I want to talk about today). The easiest way to demonstrate this is to use CVS to manage a copy of pkgsrc on such a drive.

Let's start by checking out a fresh copy of pkgsrc from the CVS repository. As long as the file system has a lot of free space (and has not been "polluted" by erased files), this will run quite fast because all the new files will be stored physically close together (theoretically in consecutive cylinders). Hence, we take advantage of both the higher bit density and the file system's allocation policy. Just after the checkout (or after unarchiving a tarball of the tree), run an update (cvs -z3 -q update -dP) and write down how long it takes. In my tests, the update took around 5 minutes, which is a good figure; in fact, it is almost the same as what I got on my desktop machine with a 7200 RPM disk.
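If you want to repeat the measurement a few times, something like the following may help. It is only a minimal sketch in Python: the time_command helper is a name I made up for illustration, and it assumes that cvs is in your PATH and that you point it at the top of your pkgsrc checkout.

```python
import subprocess
import time


def time_command(argv, cwd=None):
    """Run argv to completion and return (elapsed_seconds, exit_code)."""
    start = time.monotonic()
    result = subprocess.run(
        argv,
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.monotonic() - start, result.returncode


# Hypothetical usage; replace the path with your own checkout:
#   elapsed, code = time_command(
#       ["cvs", "-z3", "-q", "update", "-dP"], cwd="/usr/pkgsrc")
#   print(f"update exited with {code} after {elapsed / 60:.1f} minutes")
```

Note that this measures wall-clock time, so network latency to the CVS server is included along with the disk activity; comparing runs made against the same server under similar conditions keeps the numbers meaningful.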

Now start using pkgsrc by building a "big" package; I've been doing tests with mencoder, which has a bunch of dependencies, and with boost, which installs a ton of files. The object files generated during the builds, as well as the resulting installed files, will be physically stored "after" pkgsrc. It is likely that there will be "holes" in the disk because you'll be removing the work directories but not the installed files, so later writes will end up scattered among those holes instead of being stored contiguously. To make things worse, keep using your machine for a couple of days.

Then, do another update of the whole tree. In my specific tests, the process now takes around 10 minutes. Yes, it has doubled the original measure. This problem was also present with faster disks, but not as noticeable. But do we have to blame the drive for such a slowdown or maybe, just maybe, it is CVS's fault?

The pkgsrc repository contains lots of empty directories that were once populated, and CVS does not handle such entries very well. During an update, CVS recreates these empty directories locally and, at the end of the process, erases them again provided that you passed the -P (prune) option. Furthermore, every such directory ends up consuming at least 5 inodes on the local disk: one for the directory itself, one for the CVS control directory inside it, and one for each of the 3 tiny files that the control directory typically stores. This continuous creation and deletion of directories and files fragments the original tree by spreading the updated files all around.
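To get a rough idea of how many such directories a tree holds, one could count them after an update run without -P (so that they are still around). The sketch below is my own illustration, not anything CVS provides: the function names are hypothetical, it only counts leaf directories whose single entry is a CVS control directory, and the 5-inodes figure is the estimate given above.

```python
import os


def count_prunable_dirs(root):
    """Count directories under root whose only entry is a CVS control dir."""
    prunable = 0
    for dirpath, dirnames, filenames in os.walk(root):
        if "CVS" in dirnames:
            # Do not descend into the control directory itself.
            dirnames.remove("CVS")
            if not dirnames and not filenames:
                prunable += 1
    return prunable


def estimated_inodes(prunable_dirs, control_files=3):
    """Directory + CVS control directory + its control files, per directory."""
    return prunable_dirs * (2 + control_files)
```

Calling estimated_inodes(count_prunable_dirs("/usr/pkgsrc")) would then give a lower bound on the inodes that are created and thrown away again on every pruning update; the path is just an example.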

Honestly, I don't know why CVS works like this (anyone?), but I bet that switching to a superior VCS could mitigate this problem. A temporary solution is to use disk images, holding each source tree in its own image and keeping the image's total size as tight as possible. This way one can expect the image to stay permanently stored in a contiguous disk area.

Oh, and by the way: Boot Camp really suffers from the slow drive because it creates the Windows partition at the end of the disk; that is, its inner part, which typically has lower transfer rates. (Well, I'm not sure if it'd make any difference if the partition were created at the beginning.) Launching a game such as Half-Life 2 takes forever; fortunately, once it is up it is fast enough.

Update (January 9th): As "r." kindly points out, the slower part of the disk is the inner one, not the outer one as I had previously written (I had a lapse because CDs are written the other way around). And the reason is this: current disks use Zone Bit Recording (ZBR), a technique that fits a different number of sectors in each track depending on the track's length. Hence, outer (longer) tracks have more sectors allocated to them and can transfer more data in a single disk rotation.
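The arithmetic behind the ZBR effect is simple enough to sketch. The sector counts used below are made-up round numbers for illustration only, not the geometry of any real drive, and the calculation ignores seeks and head switches.

```python
SECTOR_BYTES = 512  # traditional sector size


def track_throughput(sectors_per_track, rpm):
    """Bytes per second when reading whole tracks back to back."""
    rotations_per_second = rpm / 60
    return sectors_per_track * SECTOR_BYTES * rotations_per_second


# Hypothetical zones: an outer track with 1000 sectors, an inner one with 500.
outer = track_throughput(1000, 5400)
inner = track_throughput(500, 5400)
print(f"outer: {outer / 1e6:.1f} MB/s, inner: {inner / 1e6:.1f} MB/s")
```

With these assumed numbers the outer track moves twice as much data per rotation as the inner one, which is why the placement of a partition on the disk can matter even though the spindle speed is constant.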

4 comments:

  1. I may be wrong, but I think a common problem in UNIX fs is the idea of namei cache, and the overheads of the stat() -like function calls.

    if you size the namei cache for normal file usage, and then perform operations which sweep over non-normal volumes of files and directories, the consequences are dire.

    stat() is expensive. it requires pokes into the dirblock name structure beyond the name part, and stat() over directories which are into secondary, and tertiary directory block chaining is even worse. cvs does heaps of stat calls.

    so consider eg an NFS server, where the ~user/Mail/folder/ hierarchy is MH, one mail per file, and you have a "folder" with 1000 or 5000 files, and you do an operation which requires an ls -l or something morally equivalent, one stat() per file. this single operation, which a mail client might do once every minute for all folders (I have 150 folders) is going to absolutely *cream* the NFS namei cache.

    Or a CVS process, walking pkgsrc, the same thing happens.

    then look at interactive performance for something like firefox, doing X11+gtk nested procedure calls, and repaints!

    basically, I have yet to see a UNIX system which can combine the CVS cost of processing, or the namei cache implications of large dir scanning, *and* be interactive.


    ggm

  2. On my NetBSD systems, I always put src, pkgsrc and distfile trees on a separate /cvs filesystem (usually LFS). Similarly, I put the build.sh obj trees, pkgsrc work directories and other big temporary stuff on a /scratch filesystem (also LFS). This avoids them clobbering the FFS /usr partition (which is much more static) and gives better performance as well. From time to time, I just newfs the /scratch filesystem.

  3. "suffers from the slow drive because it creates the Windows partition at the end of the disk; that is, its outer part" - I believe that the end of a disk is the disk's inner part (center of the disk).
