Mixing languages is a knowledge-intensive (rather than coding-intensive)
style of programming. To make it work, you need both a
working knowledge of a suitable variety of languages and expertise
about what they're best at and how to fit them together. In this
section, we will point you at references to help you with the
first and give an overview to convey the second. For each language
surveyed we will include case studies of successful programs that
exemplify its strengths.
Despite the memory-management problem, there are some
application niches for which C is still king. Programs that require maximum speed, have
real-time requirements, or are tightly coupled to the OS kernel are
good candidates for C.
Programs that must be portable across multiple operating systems
may also be good candidates for C. Some of the alternatives to C that
we shall discuss below are, however, increasingly penetrating major
non-Unix operating systems; in the near future, portability may be less of a
distinguishing advantage of C.
Sometimes the leverage to be gained from existing programs like
parser generators or GUI builders that generate C code is so great
that it justifies C coding of the rest of a small application.
And, of course, C proved indispensable to the developers of all
its alternatives. Dig down through enough implementation layers
under any of the other languages surveyed here and you will find a
core implemented in pure, portable C. These languages inherit
many of the advantages of C.
Under modern conditions, it's perhaps best to think of C as a
high-level assembler for the Unix virtual machine (recall the
discussion of the success of C as a case study in Chapter 4). C standards have exported many of the
facilities of this virtual machine, such as the standard I/O library,
to other operating systems. C is where you go when you want to get as
close as possible to the bare metal but stay portable.
One good reason to learn C, even if your programming needs are
satisfied by a higher-level language, is that it can help you learn to
think at the hardware-architecture level. The best reference and tutorial
on C for people who are already programmers is still The C
Programming Language [Kernighan-Ritchie].
Porting C code between Unix variants is almost always possible
and usually easy, but specific areas of variation (like
signals and process control) can be tricky to get right. We
highlight some of these issues in Chapter 17. Differing C bindings on other
operating systems can of course cause C portability problems, although
Windows NT at least theoretically supports an ANSI/POSIX-compliant C
API.
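As one concrete instance of the signal variation mentioned above: historical Unix variants disagreed about whether a handler installed with signal() stays installed after it fires, and about whether interrupted system calls are restarted. POSIX sigaction() lets a program state both choices explicitly. What follows is a minimal sketch rather than code from any program discussed here; the names on_signal, got_signal, and install_handler are invented for the illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX signal interfaces */
#include <signal.h>
#include <string.h>

/* Written by the handler, read by the main program. volatile sig_atomic_t
   is the object type the C standard guarantees can be safely assigned
   from inside a signal handler. */
static volatile sig_atomic_t got_signal = 0;

/* Hypothetical handler: just record which signal arrived. */
static void on_signal(int signo)
{
    got_signal = signo;
}

/* Install on_signal for signo with explicitly stated semantics.
   Returns 0 on success, -1 on failure (as sigaction() does). */
int install_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);   /* block no additional signals in the handler */
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(signo, &sa, NULL);
}
```

A caller would invoke install_handler(SIGUSR1) once at startup and poll got_signal from its main loop. Because SA_RESETHAND is not set, the handler stays installed after it fires, which is the behavior that signal() did not guarantee across older variants.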
High-quality C compilers are available as open-source software
over the Internet; the best-known and most widely used is the Free
Software Foundation's GNU C compiler (part of GCC, the GNU Compiler
Collection), which has become the native C compiler of
all open-source Unix systems and even of many in the closed-source
world. GCC ports are even available for
Microsoft's family of operating systems. GCC sources
are available at the FSF's FTP
site.
Summing up: C's best side is resource efficiency and
closeness to the machine. Its worst side is that programming
in it is a resource-management hell.
The best case study for C is the Unix kernel itself, for which a
language that naturally supports hardware-level operations is
actually a strong advantage. But
fetchmail is a good example of the kind of
user-land utility that is still best coded in C.
fetchmail does only the simplest kind
of dynamic-memory management; its only complex data structure is a
singly-linked list of per-mailserver control blocks built just once,
at startup time, and changed only in fairly trivial ways
afterwards. This substantially erodes the case against using C by
sidestepping C's greatest weakness.
On the other hand, these control blocks are fairly complex
(combining string, flag, and numeric data) and would be
difficult to handle as coherent fast-access objects in an
implementation language without an equivalent of the C struct
feature. Most of the alternatives to C are weaker than C in this
respect (Python and Java being notable exceptions).
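The shape described in the last two paragraphs can be sketched in a few lines of C. This is an illustration only, not fetchmail's actual code: the struct server type, its field names, and the server_add and server_free_all functions are all hypothetical. It shows a control block mixing string, flag, and numeric data, linked into a singly-linked list that is built once and released once.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-mailserver control block. */
struct server {
    char *host;            /* string data: server name (owned copy) */
    int port;              /* numeric data */
    bool use_ssl;          /* flag data */
    struct server *next;   /* next block in the list, or NULL */
};

/* Append a new control block to the list rooted at *head.
   Returns the new block, or NULL if allocation fails. */
struct server *server_add(struct server **head, const char *host,
                          int port, bool use_ssl)
{
    struct server *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;

    node->host = malloc(strlen(host) + 1);
    if (node->host == NULL) {
        free(node);
        return NULL;
    }
    strcpy(node->host, host);
    node->port = port;
    node->use_ssl = use_ssl;
    node->next = NULL;

    /* Walk to the tail so blocks keep their configuration order. */
    struct server **link = head;
    while (*link != NULL)
        link = &(*link)->next;
    *link = node;
    return node;
}

/* Release every block; intended to be called once, at exit. */
void server_free_all(struct server *head)
{
    while (head != NULL) {
        struct server *next = head->next;
        free(head->host);
        free(head);
        head = next;
    }
}
```

Because every allocation happens while the list is being built and every release happens in one place, the bookkeeping that makes C memory management painful in larger programs never arises. Each field is reached by a fixed offset from the block's address, which is the "fast-access object" property referred to above.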
Finally, fetchmail requires the
ability to parse a fairly complex specification syntax for
per-mail-server control information. In the Unix world this sort of
thing is classically handled by using C code generators that grind out
source code for a tokenizer and grammar parser from declarative
specifications. The existence of yacc and
lex was a point in favor of C.
fetchmail might reasonably have been coded in Python, albeit with
possibly significant loss of performance. Its size and data-structure
complexity would have excluded shell and Tcl right off and strongly
counterindicated Perl, and the application domain is outside the
natural scope of Emacs Lisp. A Java implementation wouldn't have been
an unreasonable path, but Java's object-oriented style and garbage
collection would have offered little purchase on fetchmail's specific
problems over what C already yields. Nor could C++ have done much to
simplify the relatively simple internal logic of fetchmail.
However, the real reason fetchmail is
a C program is that it evolved by gradual mutation from an ancestor
already written in C. The existing implementation has been
extensively tested on many different platforms and against many odd
and quirky servers. Carrying all that implicit knowledge through to a
re-implementation in a different language would be messy and
difficult. Furthermore, fetchmail depends
on imported code for functions (like NTLM authentication) that don't
seem to be available above C level.
fetchmail's interactive configurator,
which did not have a C legacy problem, is written in Python; we'll
discuss that case along with that language.