la la land
John Chambers
jc at trillian.mit.edu
Fri Jan 19 10:56:15 EST 2001
Mike writes:
| Unix filesystems do all sorts of strange things by design. For example,
| you can delete an open file, and then when it is closed and its reference
| count drops to zero, it will be purged. This is strange.
Well, I wouldn't call this strange; I'd call it a simple and elegant
solution to a common set of problems on other computer systems.
I've worked on systems for which, if you delete a file, any program
that had it open either starts getting garbage as it reads the same
blocks that are now part of another file, or errors out in strange
ways. The "solution" on some systems has been to have the delete
return an error if the file is open. These all lead to very tricky
programming problems, since programs now need to handle the error.
This means that even the simplest programs need to be aware of
multiprogramming if they are to recover gracefully from file
deletions by another process. Or, if they don't handle the errors,
the disk gets filled up with junk files that didn't get deleted
properly because someone had them open.
The unix solution was a huge simplification of the logic of it all. A
file is a file, even if it no longer has a name in a directory. If
process X opens a file, and process Y deletes the file from a
directory, neither process has any sort of error condition. No coding
is required to recover from the collision, because it's not an error
and there are no anomalies. Process X merely continues to read the
data from the file, which stays around as a nameless file until
process X closes it.
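To make that concrete, here's a minimal sketch in C (my own
illustration, not code from any particular system): the parent plays
process X, the child plays process Y, and the file name "data.txt" is
just an assumption.

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.txt", O_RDONLY);        /* process X opens the file */
    if (fd < 0) { perror("open"); exit(1); }

    if (fork() == 0) {                          /* process Y deletes the name */
        if (unlink("data.txt") < 0) perror("unlink");
        _exit(0);
    }
    wait(NULL);                                 /* the directory entry is gone now */

    char buf[512];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) /* X reads on, no error at all */
        write(STDOUT_FILENO, buf, n);

    close(fd);                                  /* link count 0: blocks reclaimed */
    return 0;
}

The whole contents still come out, even though the name vanished
before the first read.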
This also provides a simple and elegant way to create "scratch" files
that don't leave behind relic data if the program bombs. You just
create a new file, unlink it, and as long as you keep it open, it's
your own file that nobody else can get at. If the program bombs or is
killed, the kernel's process cleanup closes it, notes that the link
count is now zero, and recycles the disk blocks. All this works
without any need for a special "scratch file" flag and complex code
to scavenge just that sort of file.
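A bare-bones version of the idiom looks something like this; the /tmp
path and the use of mkstemp() are my choices for the sketch, not
anything any particular program is obliged to do:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char path[] = "/tmp/scratchXXXXXX";

    int fd = mkstemp(path);         /* create a unique scratch file */
    if (fd < 0) { perror("mkstemp"); exit(1); }
    unlink(path);                   /* no name: nobody else can open it, and
                                       nothing is left behind if we bomb */

    /* from here on it's an ordinary file, just a nameless one */
    write(fd, "temporary data\n", 15);
    lseek(fd, 0, SEEK_SET);

    char buf[64];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n > 0)
        write(STDOUT_FILENO, buf, n);

    close(fd);                      /* blocks reclaimed here at the latest */
    return 0;
}

If this program is killed anywhere between the mkstemp and the close,
there's simply nothing in /tmp to clean up.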
Of course, lots of programs don't take advantage of this, and leave
behind named junk files. But that's the fault of the programmer, not
of the OS, which has tried to make it easy for the programmer.
An interesting special case of this is a pipeline of processes, such
as the one produced by the cc command, which triggers a multi-phase
chain of subprocesses. Now, cc usually produces scratch files in /tmp or
/usr/tmp, and if the compile bombs, garbage files can be left behind.
I've written some similar multi-process packages that don't do this.
How? I just have the parent process open a set of scratch files on,
say, file descriptors 4 thru 7, and pass them to the subprocesses.
The programs can either "just know" that they are to use certain
pre-opened descriptors, or you can give them command line options
like "-i5 -o7", meaning read input from descriptor 5 and write output
to descriptor 7. The parent has unlinked all
these files, so if the entire flock is killed somehow, the files all
get reclaimed automatically, and there's no junk left behind.
(It's also handy to arrange that if the debug flag is turned up above a
minimal level, the files aren't unlinked. This way, during debugging
you can see all the scratch files, but when you run with the debug
flag off, they become invisible.)
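Here's roughly what the parent side of such a package can look like.
The program name "phase1", the /tmp paths, the -i/-o option letters,
and the debug variable are all just stand-ins of mine for whatever
the real package uses:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/wait.h>

static int debug = 0;   /* nonzero: leave the scratch files visible */

/* open a scratch file and pin it to a particular descriptor number */
static void scratch_on_fd(const char *path, int fd)
{
    int tmp = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (tmp < 0) { perror(path); exit(1); }
    if (tmp != fd) { dup2(tmp, fd); close(tmp); }
    if (!debug)
        unlink(path);   /* if the whole flock is killed, nothing is left behind */
}

int main(void)
{
    /* descriptors 4 thru 7 become the pipeline's scratch space */
    scratch_on_fd("/tmp/work4", 4);
    scratch_on_fd("/tmp/work5", 5);
    scratch_on_fd("/tmp/work6", 6);
    scratch_on_fd("/tmp/work7", 7);

    if (fork() == 0) {
        /* the child inherits descriptors 4 thru 7 across exec, and its
           command line tells it which ones to read and write */
        execlp("phase1", "phase1", "-i4", "-o5", (char *)0);
        perror("phase1");
        _exit(1);
    }
    wait(NULL);

    /* ...later phases run the same way, e.g. "-i5 -o6", "-i6 -o7"... */
    return 0;
}

With debug turned on, the same four files sit in /tmp where you can
inspect each phase's output; with it off, the names disappear the
moment the files are created.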
It's not at all strange, once you understand why it was done this
way, and how to take advantage of it. It's all part of why unix
software tends to be smaller and more reliable than software on
systems whose file systems don't work this way.