Log in

No account? Create an account
19 May 2009 @ 10:11 pm
Commit message formatting  
(or: GNU coding standards considered harmful)

I've recently been watching GCC and GDB development, and every time I have reason to go looking through their commits, the total lack of useful information in the commit messages really strikes me.

Take this commit message for example, reproduced below for your convenience:

Log message:
	* fork-child.c: Don't include frame.h.  Include terminal.h.
	(fork_inferior): Call new_tty_postfork after forking adn adding
	the child to the inferior list.
	* inferior.h (new_tty_prefork, gdb_has_a_terminal): Don't declare
	* inflow.c (struct terminal_info): Remove const qualifier from
	`run_terminal' field.
	(inferior_thisrun_terminal): Tweak comment.
	(inflow_inferior_exit): Release the `run_terminal' field.
	(copy_terminal_info): New function.
	(new_tty_postfork): New function.
	* terminal.h (new_tty_prefork, new_tty, new_tty_postfork,
	(copy_terminal_info, gdb_has_a_terminal, gdb_setpgid): Declare.
	* inf-ptrace.c: Include terminal.h.
	(inf_ptrace_follow_fork): Copy the parent's terminal info to the
	* linux-nat.c: Include terminal.h.
	(linux_child_follow_fork): Copy the parent's terminal info to the
	* inf-ttrace.c: Include terminal.h.
	(inf_ttrace_follow_fork): Copy the parent's terminal info to the

Okay, so quick quiz: what does this change do? Does it fix a bug? Add a feature? Can you tell from this message? If you get lost in the useless meaningless text, well, me too.

What if instead of showing you that, I showed you this email and this email about an earlier part of the same change. Isn't that much better? Wouldn't it be nice if I'd shown you that to begin with?

Before you think I'm just picking on the maintainers of gdb, let me now point you to the GNU coding standards changelog guidelines, which they're faithfully following.

There's no need to describe the full purpose of the changes or how they work together. If you think that a change calls for explanation, you're probably right. Please do explain it—but please put the explanation in comments in the code, where people will see it whenever they see the code. For example, “New function” is enough for the change log when you add a function, because there should be a comment before the function definition to explain what it does.

Let's think about this...gdb (and all the rest of the GNU projects) are using a version control system. Even one as poor as CVS can tell me that a function is new, when it was added, and who did it. So exactly what value is putting "(new_tty_postfork): New function." in the changelog? Absolutely none. Did I really need the author to tell me that a new .h file was being included now? No, of course not, even if that might be interesting for some reason, it's utterly obvious from the diff.

The GNU coding standard is being actively harmful here, by recommending that people put useless junk in the change message, instead of an explanation of the change. It's really rather unfortunate.

What do I want to see in a commit message? Well, it seems to me that basically everyone except the GNU project has this well figured out already, but:

At a high level: what does the change do (implementation), and what is it for (reason). Include both a one-line summary of the change, and then some prose describing it in more detail. Try to keep it high level: in most cases, the details about each individual thing you touched in your change isn't necessary. I should be able to get that from the diff directly.
fuhm on May 21st, 2009 12:33 am (UTC)
Re: What they said.
Yeah, I don't see that as a valuable use case. Certainly not enough value for developers to spend their precious time manually writing down what is just about as useful as an automated summary of the diff output would be. The time people spend writing the ChangeLog entry would be much more profitably used writing a useful commit message.

Furthermore, if I'm getting a source dump from a commercial vendor, I certainly will also want to see why changes were made. The "what" is almost entirely useless without the "why". How would I know whether I should merge that change back into upstream (or into my branch), without the reason it was done? Do their GNU-style ChangeLog entries really get me that much more value than a diff -r upstream-source commercial-vendor-source would?

On the other hand, they're probably happier just providing me the GNU changelog format, because it's less useful to me. (I have to suppose that releasing source in a useless form is their goal if they're only providing a full dump rather than patch sets. And since not providing reasoning behind changes is the right thing to do according to GNU, even better, they don't even have to look evil!)

That said, I certainly wouldn't mind if developers would provide a GNU-style ChangeLog in addition to useful commit messages. The problem right now is that people write GNU-style ChangeLogs instead of commit messages. If you can get both, fine with me; I can skip right over it and just read the prose!