Good Development Practice
Here are some of the behaviors that can make the difference
between a successful project that attracts lots of contributors and one
that stalls out for lack of interest:
Don't rely on proprietary code.
Don't rely on proprietary languages, libraries, or other code.
Doing so is risky business at the best of times; in the open-source
community, it is considered downright rude. Open-source developers
don't trust code for which they can't review the source.
Configuration choices should be made at compile time. A
significant advantage of open-source distributions is that they allow
the package to adapt at compile-time to the environment it finds. This
is critical because it allows the package to run on platforms its
developers have never seen, and it allows the software's community of
users to do their own ports. Only the largest of development teams can
afford to buy all the hardware and hire enough employees to support
even a limited number of platforms.
Therefore: Use the GNU autotools to handle portability issues,
do system-configuration probes, and tailor your makefiles. People
building from sources today expect to be able to type
configure; make; make install and get a clean build
— and rightly so. There is a good tutorial on these
tools.
autoconf and
autoheader are mature.
automake, as we've previously noted, is
still buggy and brittle as of mid-2003; you may have to maintain
your own Makefile.in. Fortunately
it's the least important of the autotools.
Regardless of your approach to configuration, do not ask the user
for system information at compile-time. The user installing the package
does not know the answers to your questions, and this approach is doomed
from the start. The software must be able to determine for itself any
information that it may need at compile- or install-time.
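As a minimal sketch of how such self-configuration typically reaches C code, suppose configure.ac contains AC_CHECK_FUNCS([strlcpy]), so that the generated config.h defines HAVE_STRLCPY on systems that provide the function (the probe and the fallback below are illustrative, not a prescription):

    /* Hypothetical use of an autoconf probe result. */
    #include "config.h"
    #include <string.h>
    #include <stddef.h>

    #ifndef HAVE_STRLCPY
    /* Fallback for systems whose libc lacks strlcpy(3): copy at most
       size-1 bytes, always NUL-terminate, return strlen(src). */
    static size_t strlcpy(char *dst, const char *src, size_t size)
    {
        size_t len = strlen(src);
        if (size > 0) {
            size_t n = (len < size - 1) ? len : size - 1;
            memcpy(dst, src, n);
            dst[n] = '\0';
        }
        return len;
    }
    #endif

The person running configure never answers a question; the probe supplies the answer.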
But autoconf should not be regarded
as a license for knob-ridden designs. If at all possible, program to
standards like POSIX and refrain also from asking the system for
configuration information. Keep ifdefs to a minimum — or,
better yet, have none at all.
Test your code before release.
A good test suite allows the team to easily run regression tests
before releases. Create a strong, usable test framework so that you
can incrementally add tests to your software without having to train
developers in the specialized intricacies of the test suite.
Distributing the test suite allows the community of users to test
their ports before contributing them back to the group.
Encourage your developers to use a wide variety of platforms as
their desktop and test machines, so that code is continuously being
tested for portability flaws as part of normal development.
It is good practice, and it encourages confidence in your code, to
ship the same test suite you use yourself and to make it runnable
with make test.
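A table-driven harness keeps the cost of adding a test close to zero. The skeleton below is only an illustrative sketch (the test functions and their names are invented), written so that a make test target need do nothing more than compile and run it:

    /* runtests.c -- minimal regression-test driver (illustrative sketch).
       Each test returns 0 on success; adding a test means writing one
       function and one table entry. */
    #include <stdio.h>

    static int test_addition(void)      { return (2 + 2 == 4) ? 0 : 1; }
    static int test_string_length(void) { return (sizeof "abc" == 4) ? 0 : 1; }

    static const struct { const char *name; int (*fn)(void); } tests[] = {
        { "addition",      test_addition },
        { "string_length", test_string_length },
    };

    int main(void)
    {
        int failures = 0;
        size_t i;
        for (i = 0; i < sizeof tests / sizeof tests[0]; i++) {
            int rc = tests[i].fn();
            printf("%-20s %s\n", tests[i].name, rc ? "FAIL" : "ok");
            failures += (rc != 0);
        }
        return failures ? 1 : 0;
    }

The make test target then only has to build and run this program, failing when it exits nonzero.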
Sanity-check your code before release.
By “sanity check” we mean: use every tool available
that has a reasonable chance of catching errors a human would be prone
to overlook. The more of these you catch with tools, the fewer you
and your users will have to contend with.
If you're writing C/C++ using GCC, test-compile with -Wall and
clean up all warning messages before each release. Compile your code
with every compiler you can find — different compilers often
find different problems. Specifically, compile your software on a true
64-bit machine. Underlying datatypes can change on 64-bit machines,
and you will often find new problems there. Find a Unix vendor's
system and run the lint utility over your software.
Run tools that look for memory leaks and other runtime errors;
Electric Fence and Valgrind are two good ones available in open source.
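As a contrived illustration of what these tools buy you (the function is invented for the example), the fragment below contains one defect that gcc -Wall flags at compile time and one that Valgrind reports at run time:

    /* Contrived defects of the kind the tools above catch. */
    #include <stdio.h>
    #include <stdlib.h>

    void report(int count)
    {
        char *buf = malloc(64);
        /* gcc -Wall (-Wformat) warns: %s expects a char *, not an int. */
        sprintf(buf, "count = %s\n", count);
        fputs(buf, stdout);
        /* Valgrind's memcheck reports the missing free(buf) as a leak. */
    }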
For Python projects, the PyChecker
program can be a useful check. It often catches nontrivial errors.
If you're writing Perl, check your code with perl
-c (and maybe -T, if
applicable). Use perl -w and 'use strict'
religiously. (See the Perl documentation for further
discussion.)
Spell-check your documentation and READMEs before release.
Spell-check your documentation, README files, and error messages
in your software. Sloppy code, code that produces warning messages when
compiled, and spelling errors in README files or error messages all lead
users to believe that the engineering behind the software is equally
haphazard and sloppy.
Recommended C/C++ Portability Practices
If you are writing C, feel free to use the full ANSI
features. Specifically, do use function prototypes, which will help
you spot cross-module inconsistencies. The old-style K&R
compilers are ancient history.
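For example, a prototype shared through a header (the names here are hypothetical) turns an argument mismatch in some other module into a compile-time diagnostic instead of silent run-time misbehavior:

    /* stats.h -- hypothetical shared header */
    #include <stddef.h>
    double mean(const double *values, size_t count);

    /* In any module that includes stats.h, a call such as
           mean(count, values);        (arguments swapped)
       is diagnosed by the compiler; under old K&R rules it would have
       compiled silently and misbehaved at run time. */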
Do not assume compiler-specific features such as the GCC
-pipe option or nested functions are available. These
will come around and bite you the second somebody ports to a
non-Linux, non-GCC system.
Code required for portability should be isolated to a single
area and a single set of source files (for example, an
os subdirectory). Compiler, library and
operating system interfaces with portability issues should be
abstracted to files in this directory.
A portability layer is a library (or perhaps just a set of
macros in header files) that abstracts away just the parts of an
operating system's API your program is interested in. Portability
layers make it easier to do new software ports. Often, no member of
the development team knows the porting platform (for example, there
are literally hundreds of different embedded operating systems, and
nobody knows any significant fraction of them). By creating a separate
portability layer, it becomes possible for a specialist who knows a
platform to port your software without having to understand anything
outside the portability layer.
Portability layers also simplify applications. Software rarely
needs the full functionality of more complex system calls such as
mmap(2)
or
stat(2),
and programmers commonly configure such complex interfaces
incorrectly. A portability layer with abstracted interfaces
(say, something named __file_exists instead of a call to
stat(2))
allows you to import only the limited, necessary functionality from
the system, simplifying the code in your application.
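A sketch of that idea, using the __file_exists example just mentioned (the implementation shown is illustrative):

    /* portability/file.c -- wrapper that hides stat(2) behind a yes/no
       interface; callers never see struct stat at all. */
    #include <sys/stat.h>

    int __file_exists(const char *path)
    {
        struct stat sb;
        return stat(path, &sb) == 0;
    }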
Always write your portability layer to select based on a
feature, never based on a platform. Trying to create a separate
portability layer for each supported platform results in a multiple
update problem maintenance nightmare. A “platform” is
always selected on at least two axes: the compiler and the
library/operating system release. In some cases there are three axes,
as when Linux vendors select a C library independently of the operating
system release. With M vendors, N compilers, and O operating system
releases, the number of platforms quickly scales out of reach of any
but the largest development teams. On the other hand, by using
language and systems standards such as ANSI and POSIX 1003.1, the set
of features is relatively
constrained.
Portability choices can be made along either lines of code or
compiled files. It doesn't make a difference if you select alternate
lines of code on a platform, or one of a few different files. A rule
of thumb is to move portability code for different platforms into
separate files when the implementations diverge significantly (shared
memory mapping on Unix vs. Windows), and leave portability code in a
single file when the differences are minimal (for example, whether
you're using gettimeofday, clock_gettime, ftime or time to
find out the current time-of-day).
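Here is a sketch of the single-file case, selecting on feature macros rather than platform names (the HAVE_* macros are assumed to come from configure probes such as AC_CHECK_FUNCS, and the wrapper name is invented):

    /* now_seconds() -- illustrative single-file shim that picks a
       time-of-day source by feature, not by platform. */
    #include "config.h"
    #include <time.h>
    #ifdef HAVE_GETTIMEOFDAY
    #include <sys/time.h>
    #endif

    double now_seconds(void)
    {
    #if defined(HAVE_CLOCK_GETTIME)
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    #elif defined(HAVE_GETTIMEOFDAY)
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec / 1e6;
    #else
        return (double) time(NULL);    /* last resort: one-second resolution */
    #endif
    }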
For anywhere outside a portability layer, heed this advice:
#ifdef and #if are last resorts, usually a sign of failure
of imagination, excessive product differentiation, gratuitous
“optimization” or accumulated trash. In the middle of
code they are anathema. /usr/include/stdio.h
from GNU is an archetypical horror.
-- Doug McIlroy
Use of #ifdef and #if is permissible (if well controlled) within a
portability layer. Outside it, try hard to confine these to
conditionalizing #includes based on feature
symbols.
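With autoconf the conventional form of that restriction looks like this (the HAVE_* symbols follow the names AC_CHECK_HEADERS generates):

    #include "config.h"

    /* Condition each #include on a feature symbol, never on a platform name. */
    #ifdef HAVE_UNISTD_H
    #include <unistd.h>
    #endif
    #ifdef HAVE_SYS_SELECT_H
    #include <sys/select.h>
    #endif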
Never intrude on the namespace of any other part of the system,
including filenames, error return values and function names. Where
the namespace is shared, document the portion of the namespace that you
use.
Choose a coding standard. The debate over the choice of standard
can go on forever — regardless, it is too difficult and
expensive to maintain software built using multiple coding standards,
and so some common style must be chosen. Enforce your coding standard
ruthlessly, as consistency and cleanliness of the code are of the
highest priority; the details of the coding standard itself are a
distant second.