Code is not the only sort of thing with an optimal chunk size.
Languages and APIs (such as sets of library or system calls) run up
against the same sorts of human cognitive constraints that produce
Hatton's U-curve.
Accordingly, Unix programmers have learned to think very hard
about two other properties when designing APIs, command sets,
protocols, and other ways to make computers do tricks:
compactness and
orthogonality.
Compactness
is the property that a design can fit inside a human being's head. A
good practical test for compactness is this: Does an experienced user
normally need a manual? If not, then the design (or at least the
subset of it that covers normal use) is compact.
Compact software tools have all the virtues of physical tools
that fit well in the hand. They feel pleasant to use, they don't obtrude
themselves between your mind and your work, they make you more
productive — and they are much less likely than unwieldy tools
to turn in your hand and injure you.
Compact is not equivalent to ‘weak’. A design can
have a great deal of power and flexibility and still be compact if it
is built on abstractions that are easy to think about and fit together
well. Nor is compact equivalent to ‘easily learned’; some
compact designs are quite difficult to understand until you have
mastered an underlying conceptual model that is tricky, at which point
your view of the world changes and compact becomes simple. For a lot
of people, the Lisp language is a classic example of this.
    Nor does compact mean ‘small’. If a well-designed
    system is predictable and ‘obvious’ to the experienced
    user, it might have quite a few pieces.
    -- Ken Arnold
Very few software designs are compact in an absolute sense, but
many are compact in a slightly looser sense of the term. They have a
compact working set, a subset of capabilities that suffices for 80% or
more of what expert users normally do with them. Practically
speaking, such designs normally need a reference card or cheat sheet
but not a manual. We'll call such designs
semi-compact, as opposed to strictly
compact.
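The idea of a compact working set can be made concrete with a small sketch. Assuming we had tallies of how often an expert reaches for each capability, the working set is the shortest most-used-first list that accounts for 80% of all uses. The function name, the `coverage_percent` parameter, and the usage counts below are all hypothetical, chosen only for illustration.

```python
from collections import Counter

def working_set(usage_counts, coverage_percent=80):
    """Return the most-used features, in order, until their combined
    use reaches coverage_percent of all recorded uses."""
    total = sum(usage_counts.values())
    chosen, covered = [], 0
    for name, count in Counter(usage_counts).most_common():
        # Integer comparison avoids floating-point rounding surprises.
        if covered * 100 >= coverage_percent * total:
            break
        chosen.append(name)
        covered += count
    return chosen

# Hypothetical tallies, not measurements of any real system.
calls = {"open": 30, "read": 25, "write": 20, "close": 10,
         "fork": 5, "ioctl": 4, "mmap": 3, "ptrace": 2, "mount": 1}
print(working_set(calls))
```

With these made-up numbers, four of the nine calls cover 85% of use. A working set that short fits on a reference card, which is what the term semi-compact is meant to capture.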
The concept is perhaps best illustrated by examples. The Unix
system call API is semi-compact, but the standard
C library is not
compact in any sense. While Unix programmers easily keep a subset of
the system calls sufficient for most applications programming (file
system operations, signals, and process control) in their heads, the C
library on modern Unixes includes many hundreds of entry points,
e.g., mathematical functions, that won't all fit inside a single
programmer's cranium.
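As a rough illustration of how far a small working set of system calls goes, here is a sketch of a file copy that uses only open, read, write, and close, reached through thin wrappers in Python's `os` module. The function name `copy_file` and the buffer size are assumptions made for the example.

```python
import os

def copy_file(src, dst, bufsize=65536):
    """Copy src to dst using only open, read, write, and close."""
    in_fd = os.open(src, os.O_RDONLY)
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(in_fd, bufsize)
                if not chunk:        # an empty read means end of file
                    break
                while chunk:         # write may accept fewer bytes than offered
                    written = os.write(out_fd, chunk)
                    chunk = chunk[written:]
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)
```

Calling `copy_file("notes.txt", "notes.bak")` would copy one file to another. Nothing in the function needs looking up by a programmer who already carries those four calls around, which is the practical test for compactness given earlier.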
The Magical Number Seven, Plus or Minus Two: Some Limits on Our
Capacity for Processing Information [Miller] is one of the foundational
papers in cognitive psychology (and, incidentally, is often cited as the
reason that U.S. local telephone numbers have seven digits). It showed
that the number of discrete items of information human beings can hold in
short-term memory is seven, plus or minus two. This gives us a good
rule of thumb for evaluating the compactness of APIs: Does a
programmer have to remember more than seven entry points? Anything
larger than this is unlikely to be strictly compact.
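The rule of thumb is simple enough to check mechanically. The sketch below counts the public callables a Python module exposes and compares the count against seven. The helper names and the toy module are hypothetical, and counting names is only a crude proxy for what a programmer actually has to remember.

```python
import math
import types

def public_entry_points(module):
    """List the callable names a module exposes without a leading underscore."""
    return sorted(name for name, obj in vars(module).items()
                  if callable(obj) and not name.startswith("_"))

def looks_strictly_compact(module, limit=7):
    """Apply the rule of thumb: no more than `limit` entry points to remember."""
    return len(public_entry_points(module)) <= limit

# A hypothetical three-call API, built in memory for illustration.
toy = types.ModuleType("toy")
toy.get = lambda key: None
toy.put = lambda key, value: None
toy.delete = lambda key: None

print(public_entry_points(toy), looks_strictly_compact(toy))
print(len(public_entry_points(math)), looks_strictly_compact(math))
```

The toy API passes the test, while the `math` module, with its dozens of functions, does not. That matches the earlier observation that mathematical functions alone crowd the C library well past what one head can hold.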
Among Unix tools, make(1) is compact; autoconf(1) and automake(1)
are not. Among markup languages, HTML is semi-compact, but DocBook (a
documentation markup language we shall discuss in Chapter 18) is not. The
man(7) macros are compact, but troff(1) markup is not.
Among general-purpose programming languages, C and Python are
semi-compact; Perl, Java, Emacs Lisp, and shell are not (especially
since serious shell programming requires you to know half a dozen other
tools like sed(1) and awk(1)). C++ is anti-compact: the language's
designer has admitted that he doesn't expect any one programmer to ever
understand it all.
Some designs that are not compact have enough internal
redundancy of features that individual programmers end up carving out
compact dialects sufficient for that 80% of common tasks by choosing a
working subset of the language. Perl has this kind of
pseudo-compactness, for example. Such designs have a built-in trap;
when two programmers try to communicate about a project, they may find
that differences in their working subsets are a significant barrier to
understanding and modifying the code.
Noncompact designs are not automatically doomed or bad,
however. Some problem domains are simply too complex for a compact
design to span them. Sometimes it's necessary to trade away
compactness for some other virtue, like raw power and range. Troff
markup is a good example of this. So is the BSD sockets API. The
purpose of emphasizing compactness as a virtue is not to
condition you to treat compactness as an absolute requirement, but to
teach you to do what Unix programmers do: value compactness
properly, design for it whenever possible, and not throw it away
casually.