Why Sponsor Oils? | blog | oilshell.org
This post takes a short break from ASDL to expand on one of my Hacker News comments.
I realize I haven't done much of what I mentioned in the second sentence of this blog: explain why the Unix shell is interesting. This is partly because almost nobody has questioned this project — it appears programmers do want a better shell.
But there are a few aspects of the language design that are rare, and they're worth explaining. This is the first of at least three topics.
Systemd has the stated goal of replacing shell scripts in the boot process of a Linux system.
In response to yet another thread about systemd's design tradeoffs, agumonkey claimed that shell doesn't have enough abstraction power. He suggested instead a Lisp-like configuration system:
(->
  (path "/usr/bin/some-service" "arg0" ...)
  (wrap :pre (lambda () ...)
        :post (lambda () ..))
  (retry 5)
  ...
  (timeout 5))
I pointed out that shell already supports this kind of higher-order programming. For example, here's a function that takes a command and tries it five times:
retry() {
  local n=$1
  shift
  for i in $(seq $n); do
    "$@"
  done
}
It can be used like this:
$ retry 5 hello-sleep 0.1
hello
hello
hello
hello
hello
Here we are passing an integer 5 and a code snippet hello-sleep 0.1 to the retry function. Because retry treats code as data, you can call it a higher-order function.
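Note that retry as written runs the command n times whether or not it succeeds. A variant that stops at the first success might look like the sketch below. The name retry_until_ok is hypothetical and is not part of the original demo.

```shell
# Hypothetical variant of retry: stop as soon as the command succeeds.
retry_until_ok() {
  local n=$1
  shift
  local i
  for i in $(seq "$n"); do
    # "$@" is the command passed in as data; run it and check its status.
    "$@" && return 0
  done
  # Every attempt failed.
  return 1
}
```

It is called the same way as retry, for example retry_until_ok 5 some-command arg, and its exit status reports whether any attempt succeeded.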
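Taking it further, we can compose our retry function with the timeout binary in coreutils by prepending three more words (timeout, its duration, and $0, which re-invokes the current script so that the external timeout program can reach the retry function):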
$ timeout 0.3 $0 retry 5 hello-sleep 0.1
hello
hello
hello # killed after 0.3 seconds
(Runnable code is available in the forth-like directory of the oilshell/blog-code repository.)
Because functions can be composed by simple juxtaposition, I said that shell has a Forth-like quality.
In the Forth language, functions can be composed like this because they work on an implicit stack of arguments and return values. If that doesn't make sense, this blog post may help.
Shell doesn't have an implicit stack, but the uniform representation of words in the argv array, and "splicing" with "$@", results in code that feels similar.
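As a small illustration of that splicing, here is a sketch of a wrapper that reports how many words it received and then runs them as a command. The name show_argv is made up for this example.

```shell
# Hypothetical wrapper: report the size of argv, then run argv as a command.
show_argv() {
  printf 'argc=%d\n' "$#"
  # "$@" splices every word back in unchanged, even words containing spaces.
  "$@"
}

# Wrappers compose by juxtaposition: each one peels off its own name
# and hands the remaining words to the next.
show_argv show_argv echo 'a b'
# prints:
#   argc=3
#   argc=2
#   a b
```

The quoted word 'a b' stays a single argument all the way down the chain, which is what makes this kind of composition reliable.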
In contrast, this mechanism isn't idiomatic in Python or JavaScript. I tried porting demo.sh to Python with demo.py, and it sort of works if you write all functions like f(*args). But this goes against the grain of the ecosystem. In these languages, functions and arguments are treated differently from a syntactic and semantic point of view.
In the book The Art of Unix Programming, which is a great exposition of the Unix philosophy, Eric Raymond calls the technique Bernstein chaining.
Daniel J. Bernstein uses this shell technique in software like qmail and daemontools to follow the principle of least privilege.
In contrast to systemd, daemontools is a Unix init toolkit which relies on the idiom of small C programs composed with shell scripts.
Celebrating daemontools makes a good case for it and shows examples.
Here's an excerpt that uses Bernstein chaining of setuidgid and softlimit, as well as the builtin exec:
# change to the user 'sample', and then limit the stack segment
# to 2048 bytes, the number of open file descriptors to 3, and
# the number of processes to 1:
exec setuidgid sample \
     softlimit -s 2048 -o 3 -p 1 \
     some-small-daemon -n
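The same chaining shape can be tried without daemontools installed. This sketch substitutes the standard env and nice utilities for setuidgid and softlimit. The function name same_pid_demo and the STAGE variable are made up for illustration, and the sketch assumes env and nice exec their command directly, as the common implementations do.

```shell
# Hypothetical demo: each program changes one piece of process state,
# then replaces itself with the rest of its argv.
same_pid_demo() {
  # The outer sh prints its PID, then execs the chain env -> nice -> sh.
  # No new process is created along the way, so the inner sh prints
  # the same PID, followed by the variable that env set.
  sh -c 'echo "$$"; exec env STAGE=two nice -n 5 sh -c "echo \$\$ \$STAGE"'
}

same_pid_demo
```

The two lines of output should start with the same process ID, which is the point of the technique: privileges and limits are adjusted step by step inside a single process.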
Daemontools is minimally documented and doesn't see much use today, but runit has the same architecture, as well as a collection of tiny shell scripts that illustrate its use.
They are admittedly a bit cryptic, but the architecture is what I care about. systemd does split some of this functionality into a separate systemd-nspawn binary, but it doesn't appear to be used much without the rest of systemd.
daemontools and systemd are interesting because they represent extremes with respect to the modularity of their design.
Since I'm writing a shell, it shouldn't be a surprise that I'm biased toward the style of daemontools. But systemd has valid criticisms of shell scripts. The language has many problems, array syntax being one example.
On the other hand, I wouldn't be surprised if systemd configuration accidentally becomes Turing-complete, as shell and make did.
I don't know what the best answer is, but I think that an improved shell will help the situation. At the very least, Lisp isn't necessary. With oil, I aim to preserve the timeless architectural characteristics of shell, while abandoning ugly, inconsistent syntax, and smoothing over its sharp corners.
Here is a list of tools that can be composed in this Forth-like manner:
sudo: Run a command as another user.
chroot: Run a command with a different root directory.
env: Run a command in a different environment.
/usr/bin/time: Run a command and summarize system resource usage.
su: Change user ID. This has the questionable interface of taking a shell string with -c instead of passing its remaining args, which leads to quoting problems.
ssh: Run a command on a remote machine. Also has quoting problems.
strace: Trace system calls and signals.
gdb: Debug native programs.

These are shell builtins that compose:
exec: Replace the process image; wrapper for the exec() system call.
time: In bash, this is a builtin which also takes a block, e.g. time { echo 1; echo 2; }.
command and builtin: Change the lookup order of the first word, i.e. whether it is an external command in $PATH or internal to the shell.

Thanks to Eric Wieser for fixing the style of the Python version.
The pattern works better if every function looks like myfunc(myarg, f, *args), but I would still say it goes against the grain of the ecosystem, as mentioned above.
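The quoting problem noted for su and ssh in the tool list can be seen in a small sketch. Here sh -c stands in for su -c, an assumption made so the example runs without switching users. The function names argv_style and string_style are made up.

```shell
# Argv-style interface, as with env or sudo: the word is passed through intact.
argv_style() {
  env printf '[%s]\n' "$1"
}

# String-style interface: sh -c stands in for su -c here (an assumption).
# The argument is pasted into a string and parsed by a shell a second time.
string_style() {
  sh -c "printf '[%s]\n' $1"
}

arg='two words; echo injected'

argv_style "$arg"
# prints:
#   [two words; echo injected]

string_style "$arg"
# prints:
#   [two]
#   [words]
#   injected
```

In the string version, the single argument is split into two words and the text after the semicolon runs as a separate command. Tools that take the remaining argv avoid this second round of parsing.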
The next post in this series also mentions Forth, because functions in Forth compose in a point-free style. Bernstein chaining is not quite point-free because we mention "$@", but pipelines do compose in a point-free style.
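To illustrate that last point, here is a minimal sketch of pipeline stages written as shell functions. The names upper, first_two, and shout are hypothetical. None of the functions mentions its input or output; data flows through stdin and stdout implicitly.

```shell
# Each stage reads stdin and writes stdout without naming either.
upper() {
  tr a-z A-Z
}

first_two() {
  head -n 2
}

# Composition is just a pipeline of stages; no "$@" appears.
shout() {
  upper | first_two
}

printf 'a\nb\nc\n' | shout
# prints:
#   A
#   B
```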