Wednesday, August 31, 2005

A prompt that does not get in the way

Yesterday, I saw a screenshot of a developer's box. His prompt was composed of only two characters: a colon and a semicolon (:;). This looks weird, doesn't it? Yes, it does, but there is a rationale behind it: using this prompt, you can copy and paste complete lines, prompt included, from one terminal into another and they will just work. Why? Because those characters are effectively ignored by the shell: the colon is a built-in command that does nothing and returns success, and the semicolon ends it so that the rest of the line runs as a separate command.

Oh yes, this prompt does not provide useful information, but that's a matter of preference. E.g., I used a simple $ prompt for a long while, so :; is not any worse than it.

Friday, August 26, 2005

SoC: Project announced

Although I don't like making premature announcements of my projects, I've been more or less forced to do it for tmpfs. The reason is that SoC's deadline is really close now and people should have a chance to test it. Not to mention that the code won't undergo any major improvements in the coming days, so delaying the announcement is not worth it either.

You can read the announcement in my mail to the tech-kern@ mailing list, which also includes a step-by-step guide to test tmpfs.

Thursday, August 25, 2005

pkgsrc's strengths

Jeremy C. Reed has collected an excellent list of pkgsrc's strengths. I encourage you to read his post to tech-pkg@ in case you still had doubts about why to use pkgsrc ;-)

His mail also contains some bad things about this packaging system, although the list is very short. Follow-ups to this email list some more items on the negative side.

Wednesday, August 24, 2005

Local sockets' permissions

A few days ago, I was trying gamin under NetBSD, which unfortunately didn't work at all. The first problem I encountered was that it complained about the excessive permissions given to newly created local sockets (those stored in the file-system, historically also known as "Unix sockets"). After analyzing the issue, I saw that those files were given 777 permissions, regardless of the user's umask. Strangely, the code was explicitly checking for this mode after creation, so I was probably missing something.

I wrote a little test program that creates a local socket and ran it under Linux, FreeBSD and OpenBSD. All of them correctly respected the user's umask (setting the permissions to 755 with a umask of 022). So, what was going on with NetBSD?
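A minimal sketch of what such a test program could do is shown next. It is not the original program: the function name local_socket_mode and its interface are made up for illustration. It binds a local socket to a path, asks stat(2) for the permission bits the system gave to the new file, and removes the socket again.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Binds a local socket at 'path' and stores the permission bits of the
 * newly created file in '*mode'.  Returns 0 on success, -1 on error.
 * Hypothetical helper, written only to illustrate the experiment.
 */
int
local_socket_mode(const char *path, mode_t *mode)
{
    struct sockaddr_un addr;
    struct stat st;
    int fd, rv = -1;

    assert(path != NULL && mode != NULL);

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    /* bind(2) is the call that creates the file in the file-system. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        goto out;

    if (stat(path, &st) == 0) {
        *mode = st.st_mode & 0777;
        rv = 0;
    }

    unlink(path);
out:
    close(fd);
    return rv;
}
```

Printing the result with printf("%03o\n", (unsigned int)mode) makes the difference visible: a system that honors the umask shows the mask applied to 777, while one following the traditional behavior shows 777 regardless.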

After asking on the tech-kern@ mailing list, I was told the following: the traditional Unix behavior when creating local sockets was to give them 777 permissions to mimic real sockets (i.e., everybody can connect to them). Therefore, the portable way to create them securely is to first make a directory with safe permissions (say, 700) and then create the socket inside it.

However, I'm thinking about changing NetBSD to honor the user's umask in this case too (it's a trivial fix). It does not hurt in any way and it may improve portability of some "non-portable" programs. (Though this would hide portability bugs in some programs, which makes me dubious about the change...)

By the way, gamin was doing the right thing (creating sockets inside secure directories) so I don't know why it wanted to ensure that the sockets were "safe".

Tuesday, August 23, 2005

SoC: Status report 6

This past week has not been excessively productive because I spent some time dealing with long overdue pkgsrc tasks (mainly updating GNOME to 2.10.2, the latest stable version) and was away from the computer more than usual. Anyway, I have done a bunch of things, although they are not as visible as the work from other weeks (this is, in part, why I felt less productive).

I started by adding support for local sockets, which was easy enough to do but caused panics (especially when switching /tmp to tmpfs and starting an X session).

Then I tried to switch memory pools to use anonymous memory, so that all file meta-data was pageable. I did several clumsy tests and, while it mostly worked, the system crashed after intensive usage of the file-system (such as decompressing pkgsrc over it). I had to give this up because I had some other stuff to do (see below) and this was consuming too much time.

The next thing I did was to rework the way pools were handled by making them file-system specific and by implementing a tmpfs-specific extension of them. There are several benefits from this change, but the most important one is that tmpfs is now able to control memory usage in a fine-grained manner.

Later on I added NFS support by defining the VFS operations I was missing. This was harder than I thought, mostly because of the lack of documentation, but after some time I got something that basically works. There are still some serious issues, so I wouldn't call tmpfs NFS-ready yet. I'm afraid that these problems are not specific to NFS; i.e., NFS is only exposing some real bugs in the code. We'll see.

On a related note, I was disappointed to see that I had to modify mountd's code to be able to export tmpfs file-systems. I have some ideas to improve this, thus making mountd file-system independent, but I'll delay them until SoC's deadline has passed.

Furthermore, I cleaned up some parts of the code: split the different vnode operations vectors into different source files (tmpfs_vnops.c, tmpfs_fifoops.c and tmpfs_specops.c) for better readability and to get shorter files, split the main tmpfs.h file into several headers, cleaned up inclusions and centralized some code.

Finally, I wrote the document describing tmpfs internals. It is in the form of a manual page in the 9th section, and can be found in CVS. If you want to view it, you can run the following command after fetching the file: groff -Tascii -mdoc tmpfs.9 | less. It may lack some details, but I feel it is quite complete, though it assumes prior knowledge of how file-systems work within NetBSD.

What I'm going to target this week (after prior agreement with my mentors) is the other long-promised document I've been talking about: a complete document that describes how to write file-systems for NetBSD. I started it yesterday, and it is proving very difficult to explain things in a clean and well-structured way (mostly because I want to avoid big code examples without real explanations).

Monday, August 22, 2005

Manual ChangeLogs; a thing of the past?

If you have ever examined the source distribution of an open source project, you'll probably have noticed a ChangeLog file. This file lists, in good detail, all changes done to the source code in reverse order, giving their description, the names of the affected files and the name of the author who made the change. So far, so good. But I really think that these files, or rather the way they are written and managed, are flawed. Let's see why:

ChangeLogs are written and controlled by hand. That is: you do a set of changes to the project and then proceed to write an entry that lists the files you modified and why you did so. Some tools, like GNU Emacs, provide hooks to simplify their writing, but, in the end, it is the software developer who ends up taking care of them manually. This is not really a problem when the project does not use a version control system (VCS, for short) to store its source code, because in this case, there is no other way to keep track of the changes.

But when you add a decent VCS to the mix (one that has, at the very least, changesets), such as Subversion or Monotone, things get much worse. Worse because the information you have in the ChangeLog is duplicated: once in the file and once in the revision history stored in the VCS. Not to mention that the latter is much more accurate than the former and is automatically managed by the system; e.g., you only need to write the purpose of your changes and it will associate it with the files you changed.

To be fair, ChangeLogs are useful, but in the context of distribution files. E.g., when a user downloads a tar.gz file, he will want the list of changes to be included in it, because he may not be able to (or may not want to) use a specific VCS to read it.

What I'm trying to emphasize here is that maintaining a ChangeLog by hand when you use a VCS is not a good idea, and something you shouldn't need to do. Ideally, you'd generate it from the revision history in the VCS (AFAIK, Subversion can do it). E.g., before publishing a new release, you'd ask the VCS to build a ChangeLog based on the changes between versions 0.1 and 0.2, stick it into your distribution file and release it. Your users would be happy and you wouldn't have had to maintain it manually. Things get more complex when you have branches in between, but the VCS should be clever enough to generate a ChangeLog that details what happened to the source code; doing it by hand in this case can be painful.

Now, to the rants. I guess ChangeLogs originated when there weren't any (free) VCSs available, because authors needed to document the changes between versions of their software. Then, when CVS appeared, things didn't get better because of its lack of changesets. That is, when a change is committed to the repository, its history is spread over all affected files, so you have no way to know which files were modified in a single commit, unless you take note of it in the ChangeLog.

I believe it is time to start throwing away ChangeLogs (in the traditional sense) at the same time you abandon CVS, and to generate them in an automated fashion when publishing new releases. Development will be easier because you'll have one less thing to take care of. But, of course, this is just my personal opinion.

Tuesday, August 16, 2005

Booting NetBSD with Yaboot

I have an iBook G3 with Debian GNU/Linux as the primary OS and NetBSD to play with. Yaboot is the boot loader I'm using because it was installed by Debian automatically.

For quite some time, I was booting NetBSD manually; that is, entering OpenFirmware (by pressing Alt, Command, O and F all together) and typing the "cryptic" command "boot hd:2,ofwboot hd:5/netbsd" into it. The reason is that I couldn't get Yaboot to boot NetBSD successfully even though I followed the instructions given in the manual page. (From what I saw, they seemed to be OpenBSD specific because the kernel name was "bsd".)

As you can imagine, this procedure was really tedious, so I spent some time investigating how I could fix that. I decided to mount the HFS partition holding the boot loader to see if there was something interesting to modify, and I found an OpenFirmware script... calling me to modify it. Note that, to do the following, you need to have configured Yaboot manually to accept a BSD entry (otherwise, the changes are not so trivial), even if it does not work "out of the box".

Once you have the partition mounted, edit the ofboot.b file and look for the line that starts with ": bootybsd", which holds the commands needed to boot BSD. What you need to do is change it to match your exact boot command, instead of the original one. In my case, I ended up with the following (it must be a single line, regardless of how it is wrapped here):

: bootybsd " Booting BSD..." .printf 100 ms load-base release-load-area " hd:2,ofwboot hd:5/netbsd" $boot ;

I now need to be careful not to run ybin again, which would overwrite my changes. But otherwise it works fine. I hope that the people who asked me how to do this find the explanation useful ;-)

Monday, August 15, 2005

SoC: Status report 5

I started this week's work by reading the first chapters of Design and Implementation of the UVM virtual memory system to see if I'd learn how to manage anonymous memory. It had been suggested that I use anonymous memory objects (aobjs, for short) to store file contents, so I was faced with the task of learning what they are and how to use them. I have to confess that I was afraid of not knowing how to complete the read/write operations for the file-system, because things were very confusing to me even after reading the document. In fact, I spent two or three days reading documentation and code, as well as doing tests, but not doing any real work.

Fortunately, after lots of tests and some words of advice from my mentor, I started to understand how things worked and how aobjs could benefit tmpfs. After two days or so, I had a working implementation of the read and write hooks, which turned out to be simpler than I thought because most of the work is done by the memory manager. Well... the write operation is less than optimal: it uses the slowest possible algorithm to resize aobjs, but it is enough for now. Note that it's a must to fix this item (plus several others; see below) before the file-system can be considered "decent".

Aside from these two operations, which were my only goal for this week, I managed to complete all other functionality of the file-system. Some of the missing operations were trivial to implement, because the implementations provided by genfs are enough for them. Others, such as the ability to create and use symbolic links, named pipes and special devices, were a bit trickier, but not really difficult. These required refactoring some of the existing code to avoid duplication, which is good because I ended up with cleaner code.

Of special interest was the creation of the getpages function. The one in genfs is extremely complex because it's designed to read and write data block by block from a file-system stored on-disk (calling other operations such as bmap and strategy). In the tmpfs case, this only needs to loan pages from the internal file to the vnode object representing it, a fast operation done internally by UVM. Neat.

Summarizing: tmpfs is (almost) feature complete. It's now time to optimize it (this is very important), fix some stuff, debug it and write the two documents I promised (as described in the project's page). I've created a little non-exhaustive to-do list so as not to forget anything. But if you want, you can start playing with it: just don't expect great performance or stability. Enjoy!

Sunday, August 14, 2005

Raw disk devices vs. regular ones

People have sometimes asked me what the difference is between regular disk devices and raw ones in BSD systems; as an example, take /dev/fd0a and its corresponding /dev/rfd0a. The thing is that I wasn't able to answer them correctly because I didn't know how they really differed. However, while reading The Design and Implementation of the 4.4BSD Operating System during the past month, I found the explanation. So here goes a clarification:

Raw devices are provided for direct access to the device they represent. On the one hand, this means that the data transferred to/from them does not pass through the system caches. On the other hand, the data is copied directly from the device driver into the user buffer, without previously passing through an intermediate kernel area.

As the kernel does neither buffering nor caching, the user must allocate buffers whose size matches the expectations of the underlying device. For example, if you were to read data from a floppy disk, your buffer's size would need to be a multiple of 512 bytes (the sector size), and all reads and writes would need to use such a length for their transfers. Given this, these devices are used by programs which need direct access to devices and can't afford the consistency problems that intermediate caches could bring in (e.g., fsck(8) or newfs(8)). Note that such programs need deep knowledge of the underlying hardware.
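The size constraint can be sketched in a few lines of C. This is only an illustration: the 512-byte SECTOR_SIZE is an assumption taken from the floppy example (a real program should ask the device), and the sector_round_up and read_sectors names are invented.

```c
#include <sys/types.h>

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Assumed sector size; a real program should query the device. */
#define SECTOR_SIZE 512

/* Rounds 'len' up to the next multiple of SECTOR_SIZE. */
size_t
sector_round_up(size_t len)
{
    return (len + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
}

/*
 * Reads 'nsectors' whole sectors from 'fd' into 'buf', which must hold
 * at least nsectors * SECTOR_SIZE bytes.  Returns 0 on success or an
 * errno value; EIO is used to report a short transfer.
 */
int
read_sectors(int fd, void *buf, size_t nsectors)
{
    size_t len = nsectors * SECTOR_SIZE;
    ssize_t n;

    assert(buf != NULL);

    /* A single request whose length is a multiple of the sector size. */
    n = read(fd, buf, len);
    if (n == -1)
        return errno;
    if ((size_t)n != len)
        return EIO;
    return 0;
}
```

A caller that wants, say, 700 bytes from a raw device would allocate sector_round_up(700) bytes, read two whole sectors and then pick the part it needs from the buffer.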

As you can imagine, regular devices are the opposite of raw ones. The system has buffers to transfer data to/from them and caches this data as appropriate to avoid unnecessary requests to the device. When the user works with them, he can request data transfers of any size (which do not need to match the block size), because the system will do all necessary steps to provide him with the data he really asked for.

Monday, August 08, 2005

SoC: Status report 4

This past week has been quite productive as regards my SoC project, tmpfs, although at the beginning I was a bit stalled (and afraid of not knowing how to solve the problems I had).

I started by trying to fix the rmdir operation, which had been broken since its addition. Thanks to the dedicated test machine, I was able to discover the point of failure quite easily because it panic'ed long before the iBook did. Solving the issue was not easy, though, as I didn't have some concepts clear (which the nice guys at tech-kern@ quickly clarified after my post). The thing is that I had serious issues with vnode allocation (duplicate vnodes for a single real file) and node removal (which can only happen after a reclaim operation).

While doing the above, I also implemented vnode locking, which I thought could be causing some of my problems too. This was really complex to get right (and I still have my doubts about its correctness), because the locking protocol is not consistent across different calls. I also found that some parts of the vnodeops(9) manual page are inconsistent with existing code in this area.

When I got this working correctly, I continued adding vnode operations. I saw it was time to start dealing with files, so I implemented file creation, removal, renaming and hard-linking. I was excited to see that, since I had already learned the basics, writing these was not too difficult.

Furthermore, I've also done multiple bug fixes all around. For example, I solved the assignment of ownerships and modes of new files (and directories), simplified the unmount operation (to not use a recursive algorithm that could crash the kernel), solved problems with file-names including trailing slashes and fixed hard link counts.

Last but not least, I've also added several new regression tests to ensure that all the new functionality works and that the problems that were solved do not reappear, and have reorganized their code so that code duplication is minimal. Some of the improvements include code to execute some parts of the tests as regular users, basically to verify that ownerships get correctly assigned... which makes me think I should update the project page to describe how to use this functionality.

So, summarizing: at the moment the file-system basically supports: creation, removal, renames and moves of directories and files, hard-linking of files and gets and sets of their attributes (ownerships, flags and mode). It's probably time to start to deal with the read and write operations... that is, to learn something about UVM...

Sunday, August 07, 2005

Using 'goto's in C

It is common knowledge that uses of the goto statement are potentially dangerous in any structured programming language, as abusing it can quickly make your code unreadable. This is why this construct is seldom explained to people learning how to program and why its use is strongly discouraged.

However, there are some situations in which it is very useful and, despite what some people might say, makes your code more readable. I had never used gotos before, but have experienced this recently while writing tmpfs.

One of the situations where a goto is useful is when you need to break out of a set of nested loops. This is the most cited example of "correct" usage of this statement, but personally, I don't like it. FWIW some languages, such as Perl, provide a nice statement to do exactly this (in Perl, last with a loop label).
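For reference, this is roughly how the nested-loops case looks in C. The grid_find function and the ROWS and COLS constants are invented for this sketch; the point is only that a plain break would leave the inner loop but not the outer one.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 3
#define COLS 4

/*
 * Looks for 'value' in 'grid'.  Returns 1 and stores its position in
 * '*row' and '*col' if found; returns 0 and leaves them untouched
 * otherwise.
 */
int
grid_find(int grid[ROWS][COLS], int value, size_t *row, size_t *col)
{
    size_t r, c;
    int found = 0;

    assert(row != NULL && col != NULL);

    for (r = 0; r < ROWS; r++) {
        for (c = 0; c < COLS; c++) {
            if (grid[r][c] == value) {
                found = 1;
                /* A plain break would only leave the inner loop. */
                goto done;
            }
        }
    }

done:
    if (found) {
        *row = r;
        *col = c;
    }
    return found;
}
```

The alternative without goto is to test the found flag in both loop conditions, which works but spreads the exit logic over several places.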

Another scenario where goto is useful is what I'm using it for now: to control all exit conditions from a function. Instead of placing return statements all around inside it (after each error check, for example), I put a goto that jumps to the end of the function; that block is responsible for checking postconditions, doing any necessary cleanup and returning the correct error code.

Here is an outline of such a function:

int
foo(...)
{
    int error;

    /* Check function preconditions here using assert(3). */

    if (error condition one) {
        error = EFOO;
        goto out;
    }

    if (error condition two) {
        error = EBAR;
        goto out;
    }

    /* Code executed when there are no errors. */

    error = 0;

out:
    /* Release unneeded resources here. */

    /* Check function postconditions here using assert(3). */

    return error;
}

As you can see, if I wanted to avoid using gotos, I'd have to duplicate a lot of code inside each error condition check. Using this structure, all exit-related stuff is kept in just one place; especially postconditions (remember that they must be fulfilled whichever action the function did). And the function is far more readable (at least to my eyes).
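To make the outline concrete, here is a small, self-contained instance of the same structure. It is not taken from tmpfs: the first_line_length function, the LINE_MAX_LEN constant and the choice of EIO for an empty file are all invented for the example. It handles two resources (a FILE pointer and a heap buffer) and releases both in the single exit block.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_MAX_LEN 256

/*
 * Stores in '*len' the length of the first line of the file at 'path',
 * not counting the newline.  Returns 0 on success or an errno value.
 */
int
first_line_length(const char *path, size_t *len)
{
    FILE *fp = NULL;
    char *buf = NULL;
    int error;

    /* Preconditions. */
    assert(path != NULL && len != NULL);

    fp = fopen(path, "r");
    if (fp == NULL) {
        error = errno;
        goto out;
    }

    buf = malloc(LINE_MAX_LEN);
    if (buf == NULL) {
        error = ENOMEM;
        goto out;
    }

    if (fgets(buf, LINE_MAX_LEN, fp) == NULL) {
        error = EIO;    /* Empty file or read error. */
        goto out;
    }

    *len = strcspn(buf, "\n");
    error = 0;

out:
    /* Release resources; free(NULL) is a harmless no-op. */
    free(buf);
    if (fp != NULL)
        fclose(fp);

    /* Postcondition: on success the length fits in the buffer. */
    assert(error != 0 || *len < LINE_MAX_LEN);

    return error;
}
```

Without the goto, the second and third error checks would each need their own copy of the cleanup code, and the postcondition would have to be repeated before every return.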

If you have any other example of nice goto usages, or if you see another way to do what I described above, please share! ;-)

Friday, August 05, 2005

Dedicated machine for kernel testing

During the past month, I had to do all tmpfs development on my laptop. This includes coding and testing. If you have ever done any kernel hacking you know what this means: reboot every now and then to test your changes, which can drive you crazy after a few reboots (especially if things keep breaking).

So when I got back home, the first thing I did was to set up a machine I had lying around for kernel testing exclusively. The machine is a 133 MHz Pentium with 32MB of RAM and a 3GB hard disk, more than enough for my purposes. (I could have also used qemu... but since I had the hardware...)

Here are some details about its configuration, designed to minimize manual intervention:

  • The machine is in headless mode. I.e., it has no monitor or keyboard and is connected to my development machine using a null-modem serial cable. I did all the installation of NetBSD over the serial line. In order to connect to it, I use the tip(1) command.
  • The machine uses GRUB (also configured to use the serial line) to load the development kernel from my server and boot it, all of this in an unattended manner. Note that GRUB has some problems loading modern NetBSD kernels: some of them can be avoided by tweaking the configuration file, but others can't.
  • All services (e.g., sendmail, cron, inetd...) are disabled during startup to reduce boot time.
  • getty(8) is configured to automatically log in the root user. This can be done through the al parameter in /etc/gettytab.
  • root's ~/.profile fetches the new kernel again and installs it as /netbsd. Then, it asks me to press a key to run the tmpfs regression tests. (I made it require a key press to let me cancel the execution of the tests if I want to.)
  • The development machine, after successful builds, uploads new kernels to the server from which GRUB fetches them.

With this setup, whenever I have built a new kernel on my development machine, I can instantly reboot the testing machine, press RETURN when asked for it and see the results. No need to copy the kernel to the second box in any way nor to enter any command!

Wednesday, August 03, 2005

Hollywood OS

Have you ever been disappointed by how software looks and behaves in almost all movies? If so, just go and read the description of Hollywood OS; it's worth it ;-)

Tuesday, August 02, 2005

SoC: Status report 3

It has been a long time since the previous status report; I'm sorry for that, but I haven't been able to publish one earlier. The good thing is I'm finally back from my vacation, so I'll be able to work on tmpfs more seriously and continuously from now on (and I have to!).

Anyway, to the point of this post. I've just pushed all the changes I had in my work tree to the mainstream CVS server. Most of these changes focused on adding new vnode operations (none of them were implemented when I posted the previous entry), although there have been multiple improvements all around the code too.

As regards vnode operations, I started by adding simple ones, such as access, getattr and setattr (the latter was rather long, but still quite straightforward). These were untestable on their own, so I continued by adding lookup and mkdir. After these two, I was able to create directories in the mounted file-system, but unable to see them. So I wrote the readdir operation and... voilà! ls(1) started to show items and I discovered problems in setattr (of course!).

I have to mention that existing code (especially from FFS) was very useful when writing these operations (detailed documentation could be better, of course). Also, I noticed that several file-systems have duplicated code, such as permission/flags checking. I wonder if this could be abstracted somewhere for consistency (i.e., some macros or inlined functions), but this is something I shouldn't touch for now.

Finally, I implemented the rmdir hook. It works, but unfortunately it seems to cause some corruption that makes the system crash later on when reusing items from a pool (with a "free list modified"-style panic). This can be reproduced using the regression tests. I haven't discovered the point of failure yet, although I have been debugging it for a long time... I'll have to ask the gurus from tech-kern@ which problems could cause this error.

By the way, while writing this code, I've found multiple mistakes in manual pages that I plan to fix in the following days (I have collected a long to-do list). Some of these are just typos, but others are more serious (inconsistencies between descriptions and existing code).

Take a look at the existing code and stay tuned!