Good Development Practice
Here are some of the behaviors that can make the difference
between a successful project that attracts lots of contributors and one
that stalls out for lack of interest:
Don't rely on proprietary code.
Don't rely on proprietary languages, libraries, or other code.
Doing so is risky business at the best of times; in the open-source
community, it is considered downright rude. Open-source developers
don't trust code for which they can't review the source.
Configuration choices should be made at compile time. A
significant advantage of open-source distributions is that they allow
the package to adapt at compile-time to the environment it finds. This
is critical because it allows the package to run on platforms its
developers have never seen, and it allows the software's community of
users to do their own ports. Only the largest of development teams can
afford to buy all the hardware and hire enough employees to support
even a limited number of platforms.
Therefore: Use the GNU autotools to handle portability issues,
do system-configuration probes, and tailor your makefiles. People
building from sources today expect to be able to type
configure; make; make install and get a clean build
— and rightly so. There is a good tutorial on these
tools.
autoconf and
autoheader are mature.
automake, as we've previously noted, is
still buggy and brittle as of mid-2003; you may have to maintain
your own Makefile.in. Fortunately
it's the least important of the autotools.
Regardless of your approach to configuration, do not ask the user
for system information at compile-time. The user installing the package
does not know the answers to your questions, and this approach is doomed
from the start. The software must be able to determine for itself any
information that it may need at compile- or install-time.
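As a minimal sketch of how such self-configuration typically reaches C code, suppose configure.ac contains AC_CHECK_FUNCS([strlcpy]), so that the generated config.h defines HAVE_STRLCPY on systems that provide the function (the probe and the fallback below are illustrative, not a prescription):

    /* Hypothetical use of an autoconf probe result. */
    #include "config.h"
    #include <string.h>
    #include <stddef.h>

    #ifndef HAVE_STRLCPY
    /* Fallback for systems whose libc lacks strlcpy(3): copy at most
       size-1 bytes, always NUL-terminate, return strlen(src). */
    static size_t strlcpy(char *dst, const char *src, size_t size)
    {
        size_t len = strlen(src);
        if (size > 0) {
            size_t n = (len < size - 1) ? len : size - 1;
            memcpy(dst, src, n);
            dst[n] = '\0';
        }
        return len;
    }
    #endif

The person running configure never answers a question; the probe supplies the answer.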
But autoconf should not be regarded
as a license for knob-ridden designs. If at all possible, program to
standards like POSIX and refrain also from asking the system for
configuration information. Keep ifdefs to a minimum — or,
better yet, have none at all.
Test your code before release.
A good test suite allows the team to easily run regression tests
before releases. Create a strong, usable test framework so that you
can incrementally add tests to your software without having to train
developers in the specialized intricacies of the test suite.
Distributing the test suite allows the community of users to test
their ports before contributing them back to the group.
Encourage your developers to use a wide variety of platforms as
their desktop and test machines, so that code is continuously being
tested for portability flaws as part of normal development.
It is good practice, and it encourages confidence in your code, to
ship the same test suite you use yourself and to make it runnable
with make test.
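A table-driven harness keeps the cost of adding a test close to zero. The skeleton below is only an illustrative sketch (the test functions and their names are invented), written so that a make test target need do nothing more than compile and run it:

    /* runtests.c -- minimal regression-test driver (illustrative sketch).
       Each test returns 0 on success; adding a test means writing one
       function and one table entry. */
    #include <stdio.h>

    static int test_addition(void)      { return (2 + 2 == 4) ? 0 : 1; }
    static int test_string_length(void) { return (sizeof "abc" == 4) ? 0 : 1; }

    static const struct { const char *name; int (*fn)(void); } tests[] = {
        { "addition",      test_addition },
        { "string_length", test_string_length },
    };

    int main(void)
    {
        int failures = 0;
        size_t i;
        for (i = 0; i < sizeof tests / sizeof tests[0]; i++) {
            int rc = tests[i].fn();
            printf("%-20s %s\n", tests[i].name, rc ? "FAIL" : "ok");
            failures += (rc != 0);
        }
        return failures ? 1 : 0;
    }

The make test target then only has to build and run this program, failing when it exits nonzero.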
Sanity-check your code before release.
By “sanity check” we mean: use every tool available
that has a reasonable chance of catching errors a human would be prone
to overlook. The more of these you catch with tools, the fewer you
and your users will have to contend with.
If you're writing C/C++ using GCC, test-compile with -Wall and
clean up all warning messages before each release. Compile your code
with every compiler you can find — different compilers often
find different problems. Specifically, compile your software on a true
64-bit machine. Underlying datatypes can change on 64-bit machines,
and you will often find new problems there. Find a Unix vendor's
system and run the lint utility over your software.
Run tools that look for memory leaks and other runtime errors;
Electric Fence and Valgrind are two good ones available in open source.
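As a contrived illustration of what these tools buy you (the function is invented for the example), the fragment below contains one defect that gcc -Wall flags at compile time and one that Valgrind reports at run time:

    /* Contrived defects of the kind the tools above catch. */
    #include <stdio.h>
    #include <stdlib.h>

    void report(int count)
    {
        char *buf = malloc(64);
        /* gcc -Wall (-Wformat) warns: %s expects a char *, not an int. */
        sprintf(buf, "count = %s\n", count);
        fputs(buf, stdout);
        /* Valgrind's memcheck reports the missing free(buf) as a leak. */
    }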
For Python projects, the PyChecker
program can be a useful check. It often catches nontrivial errors.
If you're writing Perl, check your code with perl
-c (and maybe -T, if
applicable). Use perl -w and 'use strict'
religiously. (See the Perl documentation for further
discussion.)
Spell-check your documentation and READMEs before release.
Spell-check your documentation, README files, and error messages
in your software. Sloppy code, code that produces warning messages when
compiled, and spelling errors in README files or error messages all lead
users to believe that the engineering behind the software is equally
haphazard and sloppy.
Recommended C/C++ Portability Practices
If you are writing C, feel free to use the full ANSI
features. Specifically, do use function prototypes, which will help
you spot cross-module inconsistencies. The old-style K&R
compilers are ancient history.
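For example, a prototype shared through a header (the names here are hypothetical) turns an argument mismatch in some other module into a compile-time diagnostic instead of silent run-time misbehavior:

    /* stats.h -- hypothetical shared header */
    #include <stddef.h>
    double mean(const double *values, size_t count);

    /* In any module that includes stats.h, a call such as
           mean(count, values);        (arguments swapped)
       is diagnosed by the compiler; under old K&R rules it would have
       compiled silently and misbehaved at run time. */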
Do not assume compiler-specific features such as the GCC
-pipe option or nested functions are available. These
will come around and bite you the second somebody ports to a
non-Linux, non-GCC system.
Code required for portability should be isolated to a single
area and a single set of source files (for example, an
os subdirectory). Compiler, library and
operating system interfaces with portability issues should be
abstracted to files in this directory.
A portability layer is a library (or perhaps just a set of
macros in header files) that abstracts away just the parts of an
operating system's API your program is interested in. Portability
layers make it easier to do new software ports. Often, no member of
the development team knows the porting platform (for example, there
are literally hundreds of different embedded operating systems, and
nobody knows any significant fraction of them). By creating a separate
portability layer, it becomes possible for a specialist who knows a
platform to port your software without having to understand anything
outside the portability layer.
Portability layers also simplify applications. Software rarely
needs the full functionality of more complex system calls such as
mmap(2)
or
stat(2),
and programmers commonly configure such complex interfaces
incorrectly. A portability layer with abstracted interfaces
(say, something named __file_exists instead of a call to
stat(2))
allows you to import only the limited, necessary functionality from
the system, simplifying the code in your application.
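A sketch of that idea, using the __file_exists example just mentioned (the implementation shown is illustrative):

    /* portability/file.c -- wrapper that hides stat(2) behind a yes/no
       interface; callers never see struct stat at all. */
    #include <sys/stat.h>

    int __file_exists(const char *path)
    {
        struct stat sb;
        return stat(path, &sb) == 0;
    }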
Always write your portability layer to select based on a
feature, never based on a platform. Trying to create a separate
portability layer for each supported platform results in a multiple
update problem maintenance nightmare. A “platform” is
always selected on at least two axes: the compiler and the
library/operating system release. In some cases there are three axes,
as when Linux vendors select a C library independently of the operating
system release. With M vendors, N compilers, and O operating system
releases, the number of platforms quickly scales out of reach of any
but the largest development teams. On the other hand, by using
language and systems standards such as ANSI and POSIX 1003.1, the set
of features is relatively
constrained.
Portability choices can be made along either lines of code or
compiled files. It doesn't make a difference if you select alternate
lines of code on a platform, or one of a few different files. A rule
of thumb is to move portability code for different platforms into
separate files when the implementations diverge significantly (shared
memory mapping on Unix vs. Windows), and leave portability code in a
single file when the differences are minimal (for example, whether
you're using gettimeofday, clock_gettime, ftime or time to
find out the current time-of-day).
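Here is a sketch of the single-file case, selecting on feature macros rather than platform names (the HAVE_* macros are assumed to come from configure probes such as AC_CHECK_FUNCS, and the wrapper name is invented):

    /* now_seconds() -- illustrative single-file shim that picks a
       time-of-day source by feature, not by platform. */
    #include "config.h"
    #include <time.h>
    #ifdef HAVE_GETTIMEOFDAY
    #include <sys/time.h>
    #endif

    double now_seconds(void)
    {
    #if defined(HAVE_CLOCK_GETTIME)
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    #elif defined(HAVE_GETTIMEOFDAY)
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec / 1e6;
    #else
        return (double) time(NULL);    /* last resort: one-second resolution */
    #endif
    }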
For anywhere outside a portability layer, heed this advice:
#ifdef and #if are last resorts, usually a sign of failure
of imagination, excessive product differentiation, gratuitous
“optimization” or accumulated trash. In the middle of
code they are anathema. /usr/include/stdio.h
from GNU is an archetypical horror.
-- Doug McIlroy
Use of #ifdef and #if is permissible (if well controlled) within a
portability layer. Outside it, try hard to confine these to
conditionalizing #includes based on feature
symbols.
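With autoconf the conventional form of that restriction looks like this (the HAVE_* symbols follow the names AC_CHECK_HEADERS generates):

    #include "config.h"

    /* Condition each #include on a feature symbol, never on a platform name. */
    #ifdef HAVE_UNISTD_H
    #include <unistd.h>
    #endif
    #ifdef HAVE_SYS_SELECT_H
    #include <sys/select.h>
    #endif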
Never intrude on the namespace of any other part of the system,
including filenames, error return values and function names. Where
the namespace is shared, document the portion of the namespace that you
use.
Choose a coding standard. The debate over the choice of standard
can go on forever — regardless, it is too difficult and
expensive to maintain software built using multiple coding standards,
and so some common style must be chosen. Enforce your coding standard
ruthlessly, as consistency and cleanliness of the code are of the
highest priority; the details of the coding standard itself are a
distant second.