Much of what constitutes best practice in the open-source
community is a natural adaptation to distributed development; you'll
read a lot in the rest of this chapter about behaviors that maintain
good communication with other developers. Where Unix conventions are
arbitrary (such as the standard names of files that convey
metainformation about a source distribution) they often trace back
either to Usenet in
the early 1980s, or to the conventions and standards of the GNU
project.
Most people become involved in open-source software by writing patches
for other people's software before releasing projects of their own.
Suppose you've written a set of source-code changes for someone else's
baseline code. Now put yourself in that person's shoes. How is he
to judge whether to include the patch?
It is very difficult to judge the quality of code, so
developers tend to evaluate patches by the quality of the submission.
They look for clues in the submitter's style and communications
behavior instead — indications that the person has been in their
shoes and understands what it's like to have to evaluate and merge an
incoming patch.
This is actually a rather reliable proxy for code quality. In
many years of dealing with patches from many hundreds of strangers,
I have only seldom seen a patch that was thoughtfully
presented and respectful of my time but technically bogus. On the
other hand, experience teaches that patches which look careless or
are packaged in a lazy and inconsiderate way are very likely to
actually
be
bogus.
Here are some tips on how to get your patch accepted:
Do send patches, don't send whole archives or files.
If your change includes a new file that doesn't exist in the code,
then of course you have to send the whole file. But if you're modifying
an already-existing file, don't send the whole file. Send a diff instead;
specifically, send the output of the
diff(1)
command run to compare the baseline distributed version against your
modified version.
The
diff(1)
command and its dual,
patch(1),
are the most basic tools of open-source development. Diffs are better
than whole files because the developer you're sending a patch to may
have changed the baseline version since you got your copy. By sending
him a diff you save him the effort of separating your changes from
his; you show respect for his time.
Send patches against the current version of the code.
It is both counterproductive and rude to send a maintainer
patches against the code as it existed several releases ago, and expect
him to do all the work of determining which changes duplicate things he
has since done, versus which things are actually novel in your patch.
As a patch submitter, it is
your
responsibility
to track the state of the source and send the maintainer a minimal patch
that expresses what you want done to the main-line codebase. That means
sending a patch against the current version.
Don't include patches for generated files.
Before you send your patch, walk through it and delete any patch
bands for files in it that are going to be automatically regenerated
once the maintainer applies the patch and remakes. The classic
examples of this error are C files
generated by Bison or
Flex.
These days the most common mistake of this kind is sending a
diff with a huge band that is nothing but changebars between your
configure script and the maintainer's. This file is
generated by autoconf.
This is inconsiderate. It means your recipient is put to the trouble
of separating the real content of the patch from a lot of bulky noise.
It's a minor error, not as important as some of the things we'll get
to further on — but it will count against you.
Don't send patch bands that just tweak RCS or SCCS $-symbols.
Some people put special tokens in their source files that are
expanded by the version-control system when the file is checked in:
the $Id$ construct used by RCS and CVS,
for example.
If you're using a local version-control system yourself, your
changes may alter these tokens. This isn't really harmful, because
when your recipient checks his code back in after applying your patch
the tokens will be re-expanded in accordance with the
maintainer's
version-control status. But those
extra patch bands are noise. They're distracting. It's more
considerate not to send them.
This is another minor error. You'll be forgiven for it if you
get the big things right. But you want to avoid it anyway.
Do use -c or -u format, don't use the default (-e) format.
The default (-e) format of
diff(1)
is very brittle. It doesn't include any context, so the patch tool
can't cope if any lines have been inserted or deleted in the baseline
code since you took the copy you modified.
Getting an -e diff is annoying, and suggests
that the sender is either an extreme newbie, careless, or clueless.
Most such patches get tossed out without a second thought.
Do include documentation with your patch.
This is very important. If your patch makes a user-visible addition
or change to the software's features,
include changes to the
appropriate man pages and other documentation files in your patch
.
Do
not
assume that the recipient will be happy to
document your code for you, or to have undocumented features lurking
in the code.
Documenting your changes well demonstrates some good things. First,
it's considerate to the person you are trying to persuade. Second, it
shows that you understand the ramifications of your change well enough
to explain it to somebody who can't see the code. Third, it demonstrates
that you care about the people who will ultimately use the software.
Good documentation is usually the most visible sign of what separates
a solid contribution from a quick and dirty hack. If you take the time
and care necessary to produce it, you'll find you're already 85% of the
way to having your patch accepted by most developers.
Do include an explanation with your patch.
Your patch should include cover notes explaining why you think the
patch is necessary or useful. This is explanation directed not to the
users of the software but to the maintainer to whom you are sending the
patch.
The note can be short — in fact, some of the most
effective cover notes I've ever seen just said “See the
documentation updates in this patch”. But it should show the
right attitude.
The right attitude is helpful, respectful of the maintainer's time,
quietly confident but unassuming. It's good to display understanding of
the code you're patching. It's good to show that you can identify with the
maintainer's problems. It's also good to be up front about any risks you
perceive in applying the patch. Here are some examples of the sorts of
explanatory comments that experienced developers send:
“I've seen two problems with this code, X and Y. I fixed
problem X, but I didn't try addressing problem Y because I don't think
I understand the part of the code that I believe is
involved”.
“Fixed a core dump that can happen when one of the foo
inputs is too long. While I was at it, I went looking for similar
overflows elsewhere. I found a possible one in blarg.c, near line
666. Are you sure the sender can't generate more than 80 characters
per transmission?”
“Have you considered using the Foonly algorithm for this
problem? There is a good implementation at
<https://www.example.com/~jsmith/foonly.html>”.
“This patch solves the immediate problem, but I realize it
complicates the memory allocation in an unpleasant way. Works for me,
but you should probably test it under heavy load before
shipping”.
“This may be featuritis, but I'm sending it anyway.
Maybe you'll know a cleaner way to implement the
feature”.
Do include useful comments in your code.
A maintainer will want to have strong confidence that he
understands your changes before merging them in. This isn't an
invariable rule; if you have a track record of good work with the
maintainer, he may just run a casual eye over the changes before
checking them in semiautomatically. But everything you can do to
help him understand your code and decrease his uncertainty increases
your chances that your patch will be accepted.
Good comments in your code help the maintainer understand it.
Bad comments don't.
Here's an example of a bad comment:
/* norman newbie fixed this 13 Aug 2001 */
This conveys no information. It's nothing but a muddy
territorial bootprint you're planting in the middle of the
maintainer's code. If he takes your patch (which you've made less
likely) he will almost certainly strip out this comment. If you want
a credit, include a patch band for the project
NEWS or HISTORY file. He's
more likely to take that.
Here's an example of a good comment:
/*
* This conditional needs to be guarded so that crunch_data() never
* gets passed a NULL pointer. <norman_newbie@foosite.com>
*/
This comment shows that you understand not only the maintainer's
code but the kind of information that he needs to have confidence in
your changes. This kind of comment
gives
him
confidence in your changes.
Don't take it personally if your patch is rejected
There are lots of reasons a patch can be rejected that don't
reflect on you. Remember that most maintainers are under heavy time
pressure, and have to be conservative in what they accept lest the
project code get broken. Sometime resubmitting with improvements
will help. Sometimes it won't. Life is hard.