Unix Haters Handbook: 14 years later...

KokoroMix · 10-07-2006, 12:33 AM

I've read the Unix Haters Handbook, released 14 years ago, and I thought it would be interesting to mention some facts that remain true nowadays.
Feel free to comment.

Code:

I think Unix and snowflakes are the only two classes of objects
in the universe in which no two instances ever match exactly.
                                  -Noel Chiappa

Code:

Some years ago, when I was being a consultant for a living, I had a
job at a software outfit that was building a large graphical user-interface
sort of application. They were using some kind of Unix on a
PDP-11 for development and planning to sell it with a board to
OEMs. I had the job of evaluating various Unix variants, running on
various multibus-like hardware, to see what would best meet their
needs.
The evaluation process consisted largely of trying to get their test
program, which was an early prototype of the product, to compile
and run on the various *nixes. Piece of cake, sez I. But oops, one
vendor changed all the argument order around on this class of system
functions. And gee, look at that: A bug in the Xenix compiler prevents
you from using byte-sized frobs here; you have to fake it out
with structs and unions and things. Well, what do you know, Venix’s
pseudo real-time facilities don’t work at all; you have to roll your
own. Ad nauseam.
I don’t remember the details of which variants had which problems,
but the result was that no two of the five that I tried were compatible
for anything more than trivial programs! I was shocked. I was
appalled. I was impressed that a family of operating systems that
claimed to be compatible would exhibit this class of lossage. But the
thing that really got me was that none of this was surprising to the
other *nix hackers there! Their attitude was something to the effect
of “Well, life’s like that, a few #ifdefs here, a few fake library interface
functions there, what’s the big deal?”

Code:

Ken Thompson has an automobile which he helped design. Unlike
most automobiles, it has neither speedometer, nor gas gauge, nor
any of the other numerous idiot lights which plague the modern
driver. Rather, if the driver makes a mistake, a giant “?” lights up in
the center of the dashboard. “The experienced driver,” says Thompson,
“will usually know what’s wrong.”
—Anonymous

Code:

Users care deeply about their files and data. They use computers to generate,
analyze, and store important information. They trust the computer to
safeguard their valuable belongings. Without this trust, the relationship
becomes strained. Unix abuses our trust by steadfastly refusing to protect
its clients from dangerous commands. In particular, there is rm, that most
dangerous of commands, whose raison d’etre is deleting files.
All Unix novices have “accidentally” and irretrievably deleted important
files. Even experts and sysadmins “accidentally” delete files. The bill for
lost time, lost effort, and file restoration probably runs in the millions of
dollars annually. This should be a problem worth solving; we don’t understand
why the Unixcenti are in denial on this point. Does misery love company
that much?

Code:

Most operating systems use the two-step, delete-and-purge idea to
return the disk blocks used by files to the operating system. This
isn’t rocket science; even the Macintosh, back in 1984, separated
“throwing things into the trash” from “emptying the trash.” Tenex
had it back in 1974.
DOS and Windows give you something more like a sewage line
with a trap than a wastebasket. It simply deletes the file, but if you
want to stick your hand in to get it back, at least there are utilities
you can buy to do the job. They work—some of the time.
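
For what it's worth, the two-step delete-and-purge idea in the quote above can be approximated in a few lines of shell. This is only a sketch under assumptions: the names `trash`, `empty_trash` and `TRASH_DIR` are made up for illustration, it assumes a POSIX-style sh whose `mv` and `rm` accept `--`, and a file already in the trash under the same name would be overwritten.

```shell
# Hypothetical two-step delete: "trash" moves files aside, "empty_trash" purges them.
# TRASH_DIR is an assumed location; override it before calling these functions.
TRASH_DIR=${TRASH_DIR:-$HOME/.trash}

trash() {
    mkdir -p "$TRASH_DIR" || return 1
    for item in "$@"; do
        # "--" ends option parsing, so names starting with "-" are treated as files.
        mv -- "$item" "$TRASH_DIR/" || return 1
    done
}

empty_trash() {
    # ${TRASH_DIR:?} refuses to continue if the variable is empty or unset.
    rm -rf -- "${TRASH_DIR:?}" && mkdir -p "$TRASH_DIR"
}
```

Usage would be something like `trash report.txt` now and `empty_trash` later, once you are sure.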

Code:

We mentioned that the shell performs wildcard expansion, that is, it
replaces the star (*) with a listing of all the files in a directory. This is flaw
#1; the program should be calling a library to perform wildcard expansion.
By convention, programs accept their options as their first argument, usually
preceded by a dash (-). This is flaw #2. Options (switches) and other
arguments should be separate entities, as they are on VMS, DOS, Genera,
and many other operating systems. Finally, Unix filenames can contain
most characters, including nonprinting ones. This is flaw #3. These architectural
choices interact badly. The shell lists files alphabetically when
expanding “*”, and the dash (-) comes first in the lexicographic caste system.
Therefore, filenames that begin with a dash (-) appear first when “*”
is used. These filenames become options to the invoked program, yielding
unpredictable, surprising, and dangerous behavior.
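
The interaction described above is easy to see for yourself. The sketch below assumes a POSIX-style sh and forces byte-order sorting with `LC_ALL=C` (in other locales the dash may not sort first); the directory and file names are hypothetical. It only inspects what the wildcard expands to and never calls rm.

```shell
# Sketch only: works in a throwaway directory; all names are hypothetical.
LC_ALL=C; export LC_ALL           # assumption: byte-order sorting, so "-" sorts before letters
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
touch ./-r notes.txt
mkdir keepme

set -- *                          # let the shell expand the wildcard into the argument list
first_unsafe=$1                   # "-r": a program would read this as an option

set -- ./*                        # same files, but the pattern is prefixed with ./
first_safe=$1                     # "./-r": can no longer be mistaken for an option

printf '%s\n' "$first_unsafe" "$first_safe"
```

The `./` prefix works because the program then receives an ordinary pathname; nothing about the program itself has to change.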

Code:

Then there’s the story of the poor student who happened to have a
file called “-r” in his home directory. As he wanted to remove all his
non directory files (I presume) he typed:
% rm *
… And yes, it does remove everything except the beloved “-r” file…
Luckily our backup system was fairly good.

			-Kees Goossens

Code:

We’ve known several people who have made a typo while renaming a file
that resulted in a filename that began with a dash:
% mv file1 -file2
Now just try to name it back:
% mv -file2 file1
usage: mv [-if] f1 f2 or mv [-if] f1 ... fn d1
(‘fn’ is a file or directory)
%
The filename does not cause a problem with other Unix commands because
there’s little consistency among Unix commands. For example, the filename
“-file2” is kosher to Unix’s “standard text editor,” ed. This example
works just fine:
% ed -file2
4347
But even if you save the file under a different name, or decide to give up on
the file entirely and want nothing more than to delete it, your quandary
remains:
% rm -file
usage: rm [-rif] file ...
% rm ?file
usage: rm [-rif] file ...
% rm ?????
usage: rm [-rif] file ...
% rm *file2
usage: rm [-rif] file ...
%
rm interprets the file’s first character (the dash) as a command-line option;
then it complains that the characters “l” and “e” are not valid options.
Doesn’t it seem a little crazy that a filename beginning with a hyphen, especially
when that dash is the result of a wildcard match, is treated as an
option list?
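
Two common ways out of the quandary above, shown as a hedged sketch. The `./` form is just a pathname, so it should work with any command; the `--` form assumes utilities that follow the POSIX convention of treating `--` as the end of options, which the systems in the quote may not have done. File names here are recreated for the demo.

```shell
# Sketch only: recreates the typo'd filename in a throwaway directory.
work_dir=$(mktemp -d) || exit 1
cd "$work_dir" || exit 1
echo "precious" > ./-file2        # the file that mv and rm refused to touch

mv ./-file2 file1                 # a leading ./ keeps the name from looking like an option

echo "scratch" > ./-junk          # hypothetical second file to delete
rm -- -junk                       # "--" marks the end of options (POSIX-style utilities)

ls                                # only file1 should remain
```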

Code:

Here’s a way to amuse and delight your friends (courtesy of Leigh Klotz).
First, in great secret, do the following:
% mkdir foo
% touch foo/foo~
Then show your victim the results of these incantations:
% ls foo*
foo~
% rm foo~
rm: foo~ nonexistent
% rm foo*
rm: foo directory
% ls foo*
foo~
%

Code:

To Delete Your File, Try the Compiler
Some versions of cc frequently bite undergraduates by deleting previous
output files before checking for obvious input problems.
Date: Thu, 26 Nov 1992 16:01:55 GMT
From: tk@dcs.ed.ac.uk (Tommy Kelly)
Subject: HELP!
Newsgroups: cs.questions
Organization: Lab for the Foundations of Computer Science,
Edinburgh UK
I just did:

% cc -o doit.c doit
instead of:
% cc -o doit doit.c
Needless to say I have lost doit.c
Is there anyway I can get it back? (It has been extensively modified
since this morning).
:-(

Code:

Imagine if there was an exterior paint that emitted chlorine gas as it dried.
No problem using it outside, according to the directions, but use it to paint
your bedroom and you might wind up dead. How long do you think such a
paint would last on the market? Certainly not 20 years.

KokoroMix · 10-07-2006, 12:38 AM

Code:

People have published some of Unix’s more ludicrous error messages as
jokes. The following Unix puns were distributed on the Usenet, without an
attributed author. They work with the C shell.
% rm meese-ethics
rm: meese-ethics nonexistent
% ar m God
ar: God does not exist
% "How would you rate Dan Quayle's incompetence?
Unmatched ".
% ^How did the sex change^ operation go?
Modifier failed.
% If I had a ( for every $ the Congress spent,
what would I have?
Too many ('s.
% make love
Make: Don't know how to make love. Stop.
% sleep with me
bad character
% got a light?
No match.
% man: why did you get a divorce?
man:: Too many arguments.
% ^What is saccharine?
Bad substitute.
% %blow
%blow: No such job.

These attempts at humor work with the Bourne shell:
$ PATH=pretending! /usr/ucb/which sense
no sense in pretending!
$ drink <bottle; opener
bottle: cannot open
opener: not found
$ mkdir matter; cat >matter
matter: cannot create

Code:

The Unix shells have always presented a problem for Unix documentation
writers: The shells, after all, have built-in commands. Should built-ins be
documented on their own man pages or on the man page for the shell? Traditionally,
these programs have been documented on the shell page. This
approach is logically consistent, since there is no while or if or set command.
That these commands look like real commands is an illusion. Unfortunately,
this attitude causes problems for new users—the very people for
whom documentation should be written.
For example, a user might hear that Unix has a “history” feature which
saves them the trouble of having to retype a command that they have previously
typed. To find out more about the “history” command, an aspiring
novice might try:
% man history
No manual entry for history.

I recently had to help a frustrated Unix newbie with these gems:
Under the Bourne shell (the ‘standard’ Unix shell), the set command
sets option switches. Under the c-shell (the other ‘standard’ Unix
shell), ‘set’ sets shell variables. If you do a ‘man set,’ you will get
either one or the other definition of the command (depending on the
whim of the vendor of that particular Unix system) but usually not
both, and sometimes neither, but definitely no clue that another, conflicting,
definition exists.
Mistakenly using the ‘set’ syntax for one shell under the other
silently fails, without any error or warning whatsoever. To top it off,
typing ‘set’ under the Bourne shell lists the shell variables!
Craig

Code:

X took off in a vacuum. At the time, there was no established Unix graphics
standard. X provided one—a standard that came with its own free
implementation. X leveled the playing field: for most applications, everyone’s
hardware suddenly became only as good as the free MIT X Server
could deliver.
Even today, the X server still turns fast computers into dumb terminals.
You need a fairly hefty computer to make X run fast—something that hardware
vendors love.

Code:

I have a natural revulsion to any operating system that shows so little
planning as to have named all of its commands after digestive
noises (awk, grep, fsck, nroff).
—Unknown

Code:

Unix power tools don’t fit this mold. Unlike the modest goals of its
designers to have tools that were simple and single-purposed, today’s Unix
tools are over-featured, over-designed, and over-engineered. For example,
ls, a program that once only listed files, now has more than 18 different
options that control everything from sort order to the number of columns in
which the printout appears—all functions that are better handled with other
tools (and once were). The find command writes cpio-formatted output
files in addition to finding files (something easily done by connecting the
two commands with an infamous Unix pipe). Today, the Unix equivalent
of a power drill would have 20 dials and switches, come with a
nonstandard plug, require the user to hand-wind the motor coil, and not
accept 3/8" or 7/8" drill bits (though this would be documented in the
BUGS section of its instruction manual).

Code:

The inventors of Unix had a great idea: make the command processor be
just another user-level program. If users didn’t like the default command
processor, they could write their own. More importantly, shells could
evolve, presumably so that they could become more powerful, flexible, and
easy to use.
It was a great idea, but it backfired. The slow accretion of features caused a
jumble. Because they weren’t designed, but evolved, the curse of all programming
languages, an installed base of programs, hit them extra hard. As
soon as a feature was added to a shell, someone wrote a shell script that
depended on that feature, thereby ensuring its survival. Bad ideas and features
don’t die out.

Code:

Hardware stores contain screwdrivers or saws made by three or four different
companies that all operate similarly. A typical Unix /bin or /usr/bin
directory contains a hundred different kinds of programs, written by dozens
of egotistical programmers, each with its own syntax, operating paradigm,
rules of use (this one works as a filter, this one works on temporary files,
etc.), different strategies for specifying options, and different sets of constraints.
Consider the program grep, with its cousins fgrep and egrep.
Which one is fastest? Why do these three programs take different options
and implement slightly different semantics for the phrase “regular expressions”?
Why isn’t there just one program that combines the functionality of
all three?

Code:

Bugs and apparent quirky behavior are the result of Unix’s long evolution
by numerous authors, all trying to take the operating system in a different
direction, none of them stopping to consider their effects upon one another.
Date: Mon, 7 May 90 22:58:58 EDT
From: Alan Bawden <alan@ai.mit.edu>
Subject: cd .. : I am not making this up
To: UNIX-HATERS
What could be more straightforward than the “cd” command? Let's
consider a simple case: “cd ftp.” If my current directory,
/home/ar/alan, has a subdirectory named “ftp,” then that becomes my
new current directory. So now I’m in
/home/ar/alan/ftp. Easy.
Now, you all know about “.” and “..”? Every directory always has
two entries in it: one named “.” that refers to the directory itself, and
one named “..” that refers to the parent of the directory. So in our
example, I can return to /home/ar/alan by typing “cd ..”.
Now suppose that “ftp” was a symbolic link (bear with me just a
while longer). Suppose that it points to the directory /com/ftp/pub/
alan. Then after “cd ftp” I’m sitting in /com/ftp/pub/alan.
Like all directories /com/ftp/pub/alan contains an entry named “..”
that refers to its superior: /com/ftp/pub. Suppose I want to go there
next. I type:
% cd ..
Guess what? I’m back in /home/ar/alan! Somewhere in the shell
(apparently we all use something called “tcsh” here at the AI Lab)
somebody remembers that a link was chased to get me into /com/ftp/
pub/alan, and the cd command guesses that I would rather go back to
the directory that contained the link. If I really wanted to visit /com/
ftp/pub, I should have typed “cd ./..”.
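
The two behaviours in this message are still around and can be compared side by side. The sketch below assumes a POSIX-style sh, where `cd` tracks the logical path by default and `cd -P` uses the physical one; the directory layout is rebuilt under a temporary directory, so the paths are stand-ins for the ones in the message.

```shell
# Sketch only: recreates the symlinked layout under a throwaway directory.
sandbox=$(mktemp -d) || exit 1
sandbox=$(cd "$sandbox" && pwd -P) || exit 1    # canonical path, in case the temp dir is itself behind a symlink
mkdir -p "$sandbox/home/alan" "$sandbox/com/ftp/pub/alan"
ln -s "$sandbox/com/ftp/pub/alan" "$sandbox/home/alan/ftp"

cd "$sandbox/home/alan/ftp" || exit 1

# Logical: the shell strips the last component of the path it remembers.
logical_parent=$(cd .. && pwd)

# Physical: -P asks the filesystem where ".." really points.
physical_parent=$(cd -P .. && pwd)

printf 'logical:  %s\nphysical: %s\n' "$logical_parent" "$physical_parent"
```

So the surprise is now a documented choice between two modes rather than a guess, though which default you prefer is a matter of taste.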

KokoroMix · 10-07-2006, 12:42 AM

Code:

Shell programmers and the dinosaur cloners of Jurassic Park have much in
common. They don’t have all the pieces they need, so they fill in the missing
pieces with random genomic material. Despite tremendous self-confidence
and ability, they can’t always control their creations.
Shell programs, goes the theory, have a big advantage over programs written
in languages like C: shell programs are portable. That is, a program
written in the shell “programming language” can run on many different flavors
of Unix running on top of many different computer architectures,
because the shell interprets its programs, rather than compiling them into
machine code. What’s more, sh, the standard Unix shell, has been a central
part of Unix since 1977 and, thus, we are likely to find it on any machine.
Let’s put the theory to the test by writing a shell script to print the name
and type of every file in the current directory using the file program:
Date: Fri, 24 Apr 92 14:45:48 EDT
From: Stephen Gildea <gildea@expo.lcs.mit.edu>
Subject: Simple Shell Programming
To: UNIX-HATERS
Hello, class. Today we are going to learn to program in “sh.” The
“sh” shell is a simple, versatile program, but we'll start with a basic
example:
Print the types of all the files in a directory.
(I heard that remark in the back! Those of you who are a little familiar
with the shell and bored with this can write “start an X11 client on
a remote machine” for extra credit. In the mean time, shh!)
While we're learning to sh, of course we also want the program we
are writing to be robust, portable, and elegant. I assume you've all
read the appropriate manual pages, so the following should be trivially
obvious:
file *
Very nice, isn’t it? A simple solution for a simple problem; the *
matches all the files in the directory. Well, not quite. Files beginning
with a dot are assumed to be uninteresting, and * won’t match them.
There probably aren’t any, but since we do want to be robust, we’ll
use “ls” and pass a special flag:
for file in `ls -A`
do
file $file
done
There: elegant, robust... Oh dear, the “ls” on some systems doesn’t
take a “-A” flag. No problem, we'll pass -a instead and then weed out
the . and .. files:
for file in `ls -a`
do
if [ $file != . -a $file != .. ]
then
file $file
fi
done
Not quite as elegant, but at least it’s robust and portable. What’s that?
“ls -a” doesn’t work everywhere either? No problem, we'll use “ls -f”
instead. It’s faster, anyway. I hope all this is obvious from reading
the manual pages.
Hmm, perhaps not so robust after all. Unix file names can have any
character in them (except slash). A space in a filename will break this
script, since the shell will parse it as two file names. Well, that’s not
too hard to deal with. We'll just change the IFS to not include Space
(or Tab while we're at it), and carefully quote (not too little, not too
much!) our variables, like this:
IFS='
'
for file in `ls -f`
do
if [ "$file" != . -a "$file" != .. ]
then
file "$file"
fi
done
Some of you alert people will have already noticed that we have
made the problem smaller, but we haven't eliminated it, because
Linefeed is also a legal character in a filename, and it is still in IFS.
Our script has lost some of its simplicity, so it is time to reevaluate
our approach. If we removed the “ls” then we wouldn’t have to worry
about parsing its output. What about
for file in .* *
do
if [ "$file" != . -a "$file" != .. ]
then
file "$file"
fi
done
Looks good. Handles dot files and files with nonprinting characters.
We keep adding more strangely named files to our test directory, and
this script continues to work. But then someone tries it on an empty
directory, and the * pattern produces “No such file.” But we can add
a check for that…
…at this point my message is probably getting too long for some of
your uucp mailers, so I'm afraid I'll have to close here and leave fixing
the remaining bugs as an exercise for the reader.
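
For the curious, here is one hedged way to finish the exercise the message leaves open. It assumes a POSIX-style sh (or bash) in which an unmatched pattern is left as literal text; the function name `for_each_entry` is made up, and the command to run on each entry is passed in as arguments so the loop itself stays generic.

```shell
# Hypothetical helper: run a command once per entry in the current directory,
# including dot files, without parsing the output of ls.
for_each_entry() {
    for entry in .* *; do
        case $entry in
            .|..) continue ;;                  # skip the directory itself and its parent
        esac
        # An unmatched pattern stays as the literal ".*" or "*"; skip names that
        # do not exist (the -L test keeps dangling symlinks in the list).
        [ -e "./$entry" ] || [ -L "./$entry" ] || continue
        "$@" "./$entry"                        # ./ prefix protects names that start with "-"
    done
}
```

The original task then becomes `for_each_entry file`. Because the names never pass through `ls` or word splitting, spaces and linefeeds in filenames should not matter, and an empty directory simply produces no output.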

Code:

Error Codes and Error Checking
Our programming example glossed over how the file command reports an
error back to the shell script. Well, it doesn’t. Errors are ignored. This
behavior is no oversight: most Unix shell scripts (and other programs as
well) ignore error codes that might be generated by a program that they
call. This behavior is acceptable because no standard convention exists to
specify which codes should be returned by programs to indicate errors.
Perhaps error codes are universally ignored because they aren’t displayed
when a user is typing commands at a shell prompt. Error codes and error
checking are so absent from the Unix Canon that many programs don’t
even bother to report them in the first place.
Date: Tue, 6 Oct 92 08:44:17 PDT
From: Bjorn Freeman-Benson <bnfb@ursamajor.uvic.ca>
Subject: It’s always good news in Unix land
To: UNIX-HATERS
Consider this tar program. Like all Unix “tools” (and I use the word
loosely) it works in strange and unique ways. For example, tar is a
program with lots of positive energy and thus is convinced that nothing
bad will ever happen and thus it never returns an error status. In
fact, even if it prints an error message to the screen, it still reports
“good news,” i.e., status 0. Try this in a shell script:
tar cf temp.tar no.such.file
if( $status == 0 ) echo "Good news! No error."
and you get this:
tar: no.such.file: No such file or directory
Good news! No error.
I know—I shouldn’t have expected anything consistent, useful, documented,
speedy, or even functional…
Bjorn
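
The example in the message uses the C shell's `$status`; in the Bourne shell family the same value is `$?`. Below is a minimal sketch of checking it, assuming a POSIX-style sh. The wrapper name `run_checked` is hypothetical, and whether a particular tar actually returns non-zero for a missing file is up to that implementation.

```shell
# Hypothetical wrapper: run a command, complain on stderr if it fails,
# and hand the command's own exit status back to the caller.
run_checked() {
    "$@"
    status=$?                                  # capture immediately; the next command overwrites $?
    if [ "$status" -ne 0 ]; then
        printf 'FAILED (status %s): %s\n' "$status" "$*" >&2
    fi
    return "$status"
}
```

A script could then write `run_checked tar cf temp.tar no.such.file && echo "Good news! No error."` and only print the good news when the status really was 0.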

Code:

My judgment of Unix is my own. About six years ago (when I first got
my workstation), I spent lots of time learning Unix. I got to be fairly
good. Fortunately, most of that garbage has now faded from memory.
However, since joining this discussion, a lot of Unix supporters
have sent me examples of stuff to “prove” how powerful Unix is.
These examples have certainly been enough to refresh my memory:
they all do something trivial or useless, and they all do so in a very
arcane manner.
One person who posted to the net said he had an “epiphany” from a
shell script (which used four commands and a script that looked like
line noise) which renamed all his '.pas' files so that they ended with
“.p” instead. I reserve my religious ecstasy for something more than
renaming files. And, indeed, that is my memory of Unix tools—you
spend all your time learning to do complex and peculiar things that
are, in the end, not really all that impressive. I decided I’d rather
learn to get some real work done.
—Jim Giles
Los Alamos National Laboratory

Code:

Date: Thu, 28 Jun 1990 18:14 EDT
From: pgs@crl.dec.com
Subject: more things to hate about Unix
To: UNIX-HATERS
This is one of my favorites. I’m in some directory, and I want to
search another directory for files, using find. I do:
po> pwd
/ath/u1/pgs
po> find ~halstead -name "*.trace" -print
po>
The files aren’t there. But now:
po> cd ~halstead
po> find . -name "*.trace" -print
./learnX/fib-3.trace
./learnX/p20xp20.trace
./learnX/fib-3i.trace
./learnX/fib-5.trace
./learnX/p10xp10.trace
po>
Hey, now the files are there! Just have to remember to cd to random
directories in order to get find to find things in them. What a crock of
Unix.
Poor Halstead must have the entry for his home directory in /etc/passwd
pointing off to some symlink that points to his real directory, so some commands
work for him and some don’t.
Why not modify find to make it follow symlinks? Because then any symlink
that pointed to a directory higher up the tree would throw find into an
endless loop. It would take careful forethought and real programming to
design a system that didn’t scan endlessly over the same directory time
after time. The simple, Unix, copout solution is just not to follow symlinks,
and force the users to deal with the result.
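
On the symlink question, POSIX find has since grown two options that cover this case: `-H` follows symlinks named on the command line and `-L` follows all of them (and asks find to detect loops rather than spin forever). The sketch below assumes such a find; the directory names are hypothetical stand-ins for the symlinked home directory in the message.

```shell
# Sketch only: a symlinked "home directory" in a throwaway sandbox.
sandbox=$(mktemp -d) || exit 1
cd "$sandbox" || exit 1
mkdir real_home
touch real_home/fib-3.trace
ln -s real_home halstead                      # hypothetical stand-in for ~halstead

# Default: the symlink itself is examined but not entered, so nothing is found.
not_followed=$(find halstead -name '*.trace' -print)

# -H: follow symlinks given on the command line only.
cmdline_only=$(find -H halstead -name '*.trace' -print)

# -L: follow every symlink encountered during the walk.
followed=$(find -L halstead -name '*.trace' -print)

printf '[%s] [%s] [%s]\n' "$not_followed" "$cmdline_only" "$followed"
```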

Code:

Say you are a novice user with two files in a directory, A.m and B.m.
You’re used to MS-DOS and you want to rename the files to A.c and B.c.
Hmm. There’s no rename command, but there’s this mv command that
looks like it does the same thing. So you type mv *.m *.c. The shell
expands this to mv A.m B.m and mv overwrites B.m with A.m. This is a
bit of a shame since you had been working on B.m for the last couple of
hours and that was your only copy.
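
Since the shell expands both patterns before mv ever runs, a bulk rename has to be spelled out as a loop. Here is a minimal sketch, assuming a POSIX-style sh; the function name `rename_ext` is made up, and it deliberately refuses to overwrite an existing target.

```shell
# Hypothetical helper: rename every *.OLD in the current directory to *.NEW.
# Usage: rename_ext m c
rename_ext() {
    old=$1 new=$2
    for path in ./*."$old"; do
        [ -e "$path" ] || continue             # pattern matched nothing: skip the literal text
        target=${path%."$old"}.$new            # strip the old suffix, append the new one
        if [ -e "$target" ]; then
            printf 'skipping %s: %s already exists\n' "$path" "$target" >&2
            continue
        fi
        mv -- "$path" "$target"
    done
}
```

With A.m and B.m in the directory, `rename_ext m c` should leave A.c and B.c with their contents intact.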

Code:

Sysadmins manage a large assortment of configuration files. Those allergic
to Microsoft Windows with its four system configuration files shouldn’t
get near Unix, lest they risk anaphylactic shock. Unix boasts dozens of
files, each requiring an exact combination of letters and hieroglyphics for
proper system configuration and operation.
Each Unix configuration file controls a different process or resource, and
each has its own unique syntax. Field separators are sometimes colons,
sometimes spaces, sometimes (undocumented) tabs, and, if you are very
lucky, whitespace.

Code:

“Two of the most famous products of Berkeley are LSD and Unix. I
don’t think that this is a coincidence.”
—Anonymous

SlackDaemon · 10-07-2006, 01:50 AM

And how exactly is this a success story?

XavierP · 10-07-2006, 06:40 AM

It's not. Obviously someone meant to hit the General forum and missed. Which means that it's even less of a success story.

Ah well, I have moved it to its rightful place.

Hint to the OP - posting your own opinion is a good idea, especially after all that typing.