You are viewing fuhm

 
 
19 May 2009 @ 10:11 pm
Commit message formatting  
(or: GNU coding standards considered harmful)

I've recently been watching GCC and GDB development, and every time I have reason to go looking through their commits, the total lack of useful information in the commit messages really strikes me.

Take this commit message for example, reproduced below for your convenience:

Log message:
	* fork-child.c: Don't include frame.h.  Include terminal.h.
	(fork_inferior): Call new_tty_postfork after forking adn adding
	the child to the inferior list.
	* inferior.h (new_tty_prefork, gdb_has_a_terminal): Don't declare
	here.
	* inflow.c (struct terminal_info): Remove const qualifier from
	`run_terminal' field.
	(inferior_thisrun_terminal): Tweak comment.
	(inflow_inferior_exit): Release the `run_terminal' field.
	(copy_terminal_info): New function.
	(new_tty_postfork): New function.
	* terminal.h (new_tty_prefork, new_tty, new_tty_postfork,
	(copy_terminal_info, gdb_has_a_terminal, gdb_setpgid): Declare.
	* inf-ptrace.c: Include terminal.h.
	(inf_ptrace_follow_fork): Copy the parent's terminal info to the
	child.
	* linux-nat.c: Include terminal.h.
	(linux_child_follow_fork): Copy the parent's terminal info to the
	child.
	* inf-ttrace.c: Include terminal.h.
	(inf_ttrace_follow_fork): Copy the parent's terminal info to the
	child.



Okay, so quick quiz: what does this change do? Does it fix a bug? Add a feature? Can you tell from this message? If you get lost in the useless meaningless text, well, me too.

What if instead of showing you that, I showed you this email and this email about an earlier part of the same change. Isn't that much better? Wouldn't it be nice if I'd shown you that to begin with?

Before you think I'm just picking on the maintainers of gdb, let me now point you to the GNU coding standards changelog guidelines, which they're faithfully following.


There's no need to describe the full purpose of the changes or how they work together. If you think that a change calls for explanation, you're probably right. Please do explain it—but please put the explanation in comments in the code, where people will see it whenever they see the code. For example, “New function” is enough for the change log when you add a function, because there should be a comment before the function definition to explain what it does.


Let's think about this...gdb (and all the rest of the GNU projects) are using a version control system. Even one as poor as CVS can tell me that a function is new, when it was added, and who did it. So exactly what value is putting "(new_tty_postfork): New function." in the changelog? Absolutely none. Did I really need the author to tell me that a new .h file was being included now? No, of course not, even if that might be interesting for some reason, it's utterly obvious from the diff.

The GNU coding standard is being actively harmful here, by recommending that people put useless junk in the change message, instead of an explanation of the change. It's really rather unfortunate.

What do I want to see in a commit message? Well, it seems to me that basically everyone except the GNU project has this well figured out already, but:

At a high level: what does the change do (implementation), and what is it for (reason). Include both a one-line summary of the change, and then some prose describing it in more detail. Try to keep it high level: in most cases, the details about each individual thing you touched in your change isn't necessary. I should be able to get that from the diff directly.
 
 
 
( 14 comments — Leave a comment )
Samuel Tardieu [rfc1149.net] on May 20th, 2009 06:02 pm (UTC)
Things never change
You might be interested with my attempt to get it fixed 18 months ago, with no luck: http://gcc.gnu.org/ml/gcc/2007-12/msg00016.html
fuhm on May 20th, 2009 11:10 pm (UTC)
Re: Things never change
Nice that you proposed changing it on the gcc mailing list. Sucks that it went nowhere. :( I still can't believe some people actually defended the current lack of information, but it looked to me like most people were pretty much in favor of adding more information to the commit messages. (although not in favor of removing the useless info currently required). Perhaps you should try again.

My favorite message is this one, where it's pointed out that we don't know why the guidelines say not to include the rationale in the changelog, because nobody wrote down the rationale for the guidelines. :)
(Anonymous) on May 21st, 2009 05:43 am (UTC)
Re: Things never change
Duh. Next time you try, don't mention Linux ;-)

Brooksbrooksmoses on May 20th, 2009 06:15 pm (UTC)
Not to disagree with your main point, but it's worth noting that the GNU coding standard for ChangeLogs predates having things in CVS, much less a better version-control system. (And CVS cannot easily tell you that change A to file X and change B to file Y are part of the same overall thing.)

However, you're conflating two different sets of requirements here -- a GNU changelog entry is not just a repository commit message. It's also what goes into the ChangeLog file in the source tree.

Traditionally, these two things have the same information, and I think that's a reasonable idea -- it saves trouble of creating two things when one will do, and also it guarantees that the ChangeLog actually has all of the information. The latter is important because, in many cases, people who are getting the source just get a tarball of the tree, not the full repository -- and they still may need to be able to track the history of the source code.

Now, postulating that, it's clear why this stuff gets into the commit log. It's because, in order for the ChangeLog to act as a surrogate for a version-control log, the information about which files were changed by the patch needs to be in the ChangeLog file.

(With that said, including an explanation in the log entry would almost certainly be a good thing as well. As I say, I don't disagree with your main point there! It's just that this stuff is emphatically not "useless junk" in the ChangeLog file, and given the reasonable requirement that the ChangeLog file entry and the commit message are the same, it's logical for it to end up in the commit messages.)
(Anonymous) on May 20th, 2009 06:59 pm (UTC)
> they still may need to be able to track the history of the source code.

CVSWeb, or DVCS log.

I question the value of a changelog of this kind anywhere at all. Why have the same thing that "git log" and "git blame" has, but in a completely useless format?
Who, me?metageek on May 20th, 2009 07:52 pm (UTC)
So it's obsolete
it's worth noting that the GNU coding standard for ChangeLogs predates having things in CVS, much less a better version-control system.

Then, clearly, ChangeLogs are obsolete, and should just be discarded in favor of useful commit messages.

Brooksbrooksmoses on May 20th, 2009 10:55 pm (UTC)
Re: So it's obsolete
You seem to have confused "this is one historical reason why it's this way" with "this is the only reason it's this way." That particular reason is obsolete; it is not the only reason.
fuhm on May 20th, 2009 10:41 pm (UTC)
What they said.
I agree with the other two replies to your comment.

I will note that I find a properly-written, informative, high-level change log file quite useful -- it's often nice to see what changed between the last version of software you had installed and the current one, without getting into the nitty-gritty of the code. It should be more concise than VC commit messages, and targeted towards someone who uses your software, rather than someone who knows the code inside-out. For example, the Twisted NEWS file

But the GNU ChangeLog file as it stands provides zero value, and its existence causes an overall negative by getting in the way of something more useful.
Brooksbrooksmoses on May 20th, 2009 11:14 pm (UTC)
Re: What they said.
A "high-level change log file" is a different thing from the GNU ChangeLog file, and meeting a different need, though. So, yes, useful -- but, IMO, irrelevant to this question.

I think we're arguing at the wrong level, though; we're working from different suppositions.

The GNU rules are based on postulating the use-case that (a) many people will be getting source distributions in the form of sets of files, rather than in the form of repository dumps, and (b) many of those people will want to have some ability to track the historical changes in the files. Oh, and (c) these files may have taken a circuitous path to arrive at the end user and there may be no single canonical repository that represents them.

This use case happens in actual practice. For instance, a commercial variant of GCC will typically start out with a set of files from the FSF, and then make various changes and so forth to those files to apply local patches (such as backporting things to older versions that the FSF considers "frozen"). Those local patches will very likely be made within the company's version-control system, and their internal commit logs may reference proprietary information about their customers or for other reasons may be something they don't wish to allow public access to. And so "use CVSweb" or "use git-blame" or whatever is not a workable answer there for anyone outside the company. However, if the company has maintained a GNU-like ChangeLog file, that will be usable by anyone who receives a copy of the source files.

Now, I think if you agree that that use case should be supported, then you can pretty much derive the existence of either a ChangeLog file such as it currently exists in the GNU coding guidelines, or something pretty much equivalent.

If you don't agree with that use case, then I think the place to have a profitable debate is about whether or not it's a reasonable use case to support, not about the ChangeLog files that do or do not come out of the decision to or not to support it.

Edited at 2009-05-20 11:16 pm (UTC)
fuhm on May 21st, 2009 12:33 am (UTC)
Re: What they said.
Yeah, I don't see that as a valuable use case. Certainly not enough value for developers to spend their precious time manually writing down what is just about as useful as an automated summary of the diff output would be. The time people spend writing the ChangeLog entry would be much more profitably used writing a useful commit message.

Furthermore, if I'm getting a source dump from a commercial vendor, I certainly will also want to see why changes were made. The "what" is almost entirely useless without the "why". How would I know whether I should merge that change back into upstream (or into my branch), without the reason it was done? Do their GNU-style ChangeLog entries really get me that much more value than a diff -r upstream-source commercial-vendor-source would?

On the other hand, they're probably happier just providing me the GNU changelog format, because it's less useful to me. (I have to suppose that releasing source in a useless form is their goal if they're only providing a full dump rather than patch sets. And since not providing reasoning behind changes is the right thing to do according to GNU, even better, they don't even have to look evil!)

That said, I certainly wouldn't mind if developers would provide a GNU-style ChangeLog in addition to useful commit messages. The problem right now is that people write GNU-style ChangeLogs instead of commit messages. If you can get both, fine with me; I can skip right over it and just read the prose!
(Anonymous) on May 21st, 2009 05:37 am (UTC)
The purpose of ChangeLog
It seems people have missed the point of what a ChaneLog (CL) entry is
supposed to document.

As the GNU Coding Standards state, it is not supposed to document the
`why', but what happened to the source tree. This is so you can get a
overall view of what changes the patch does or did; if you get a large
patch you will want to have a brief summary of what it does (not the
why, but code changes). Just doing a diff, and getting a monstrous
patch back will be quite useless, but a few CL entries will make life
easier in tracking down a newly introduced bug.

Having the `why' documented, or more precisly, the intent of the code,
is very important as well, but this belongs in the actual code, or in
the manual.

Writing a CL entry is not that hard either, and gives you a overview
over your own changes. I have often seen better ways to fix a bug
after having written a CL entry.

Whether one agrees with this or not is really beside the point, but a
CL entry does not document the `why', it never had this intent. It is
a general overview of a change, something that diff will never be able
to give you.

Alfred M. Szmidt <ams@gnu.org> -- A GNU Hacker who cannot live without
ChangeLog.
fuhm on May 21st, 2009 04:43 pm (UTC)
Re: The purpose of ChangeLog
Having the `why' documented, or more precisly, the intent of the code, is very important as well, but this belongs in the actual code, or in the manual.


You're missing the distinction between the why of how the code currently works, and why the change was made. Absolutely the intent of the code should be documented in the code. However, the intent of the change should be documented in the commit message.
(Anonymous) on February 12th, 2011 10:14 am (UTC)
You gathered the useful information people would find interesting here.
Hello.Reauthoring a new series -
I have been doing a series of tv shows. After I did the first batch, when I started the second batch, every time I am the new series, I keep getting the previous series. I have tried repeatedly. Am I missing something, I am new to the program DVD Shrink 3.2. I had no problem doing the first dvd, I just can't seem to add the new files when I go through re-authorHelp would be greatly appreciated.Cojunie

[url=http://www.topvideoconverter.com/m4p-converter/]m4p converter[/url]
(Anonymous) on February 18th, 2011 05:30 pm (UTC)
Filled to the brim Picture Booth MD?
Right, this has been genuinely a wonderful childrens party!
Janine with Johnathan and approximately 260 guests and a exciting social gathering and the cuisine was first rank! As well a group was in concert. This cool whole area appears to be looking like our sun, dazzling in blonde and red yellow. And this great flowers complement the colors!
A really hot occurrence in addition to present was just this man from [url=http://www.shifoto.com/]Picture Booth MD[/url]. The man made Pictures always. An extraordinarily cool memory from this cool nice occurrence.
I, Simon Cooley from San Marino did enjoy the thing very a lot!
( 14 comments — Leave a comment )