I recently implemented the test builtin, also known as [. Since I had already implemented its statically-parsed cousin [[, I thought that this task would be straightforward.
But, as always, shell is full of surprises. In this post, I describe fundamental problems with the design of the [ builtin. You can consider this another episode of Shell: The Bad Parts.
To be concrete, what does this expression mean?
$ [ -a -a -a -a ]
Recall the difference between [ and [[ from October:
A shell builtin has the same interface as an external command: it receives an argv array and returns an exit code. So [ must parse the expression after variables are substituted and quotes are processed. In other words, it does dynamic parsing.
In contrast, [[ is part of the language, so it can "see" quoting on tokens. This means that it can solve the ambiguity problems with [ that I show below.
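To make the "argv array" point concrete, here is a minimal sketch. The helper show_argv is hypothetical (it is not part of any shell); it prints each argument it receives in angle brackets, which is all that a builtin like [ ever gets to see. It assumes a POSIX-style shell such as bash or dash, where unquoted substitutions are word-split.

```shell
# Hypothetical helper: print every argument in angle brackets,
# i.e. the argv array exactly as a builtin would receive it.
show_argv() {
  printf '<%s>' "$@"
  echo
}

path='my file'
show_argv -n $path      # unquoted: the value is split into two entries
show_argv -n "$path"    # quoted: the value stays a single entry
```

The first call prints `<-n><my><file>` and the second prints `<-n><my file>`. By the time [ runs, the quotes are gone, so it cannot tell which entries came from a substitution.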
Here is a statically-parsed [[ expression:
$ path=/etc/passwd
$ [[ -n $path && (! -L $path || $path -nt /etc/other) ]]
$ echo $?
0
It may be more readable in this C-like syntax:
nonempty(path) && (!isSymlink(path) || newerThan(path, "/etc/other"))
The [ version is almost the same:
$ path=/etc/passwd
$ [ -n "$path" -a '(' ! -L "$path" -o "$path" -nt /etc/other ')' ]
$ echo $?
0
except:
(1) Each token in the expression must be a separate element of the argv array. This means:
- ( must be quoted
- ( and ! must be separated by a space, so that they become separate argv entries.
- "$path" must be quoted so it's not split into multiple tokens. Otherwise, a filename with spaces would cause a syntax error in [.
(2) -o and -a are used for logical or and and. In contrast, [[ reuses the shell operators || and &&.
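The quoting requirement in (1) can be checked with a short sketch. It assumes a POSIX-style shell with default word splitting. The variable names quoted and unquoted are only for illustration.

```shell
path='my file'

# Quoted: [ receives two arguments, -n and "my file".
quoted=0
[ -n "$path" ] || quoted=$?

# Unquoted: [ receives three arguments, -n, my, and file.
# "my" is not a binary operator, so this is an error, not a false result.
unquoted=0
[ -n $path ] 2>/dev/null || unquoted=$?

echo "quoted: $quoted, unquoted: $unquoted"
```

The quoted form returns 0. The unquoted form returns an error status greater than 1 (bash reports 2), which is different from the status 1 that a test evaluating to false would give.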
Alert readers may already see why the [ language has ambiguous expressions.
In bash and ksh, the -a operator is an alias for -e, which returns 0 (true) if and only if its path argument exists:
$ [ -a / ]; echo $?
0
$ [ -a /oops ]; echo $?
1
Now I'll show some pathological examples. Although such examples are contrived to be the worst case, I've found them in the wild.
Also, these ambiguities lead to a bad class of bug: data-dependent bugs that occur only 0.01% of the time. Bugs like this tend to escape testing.
So, what does -a mean in these 3 expressions?
[ -a ]
[ -a -a ]
[ -a -a -a ]
To decipher them, let's make these definitions:
mystr='-a'
otherstr='-a'
mypath='-a'
Because [ is a builtin, these 3 expressions are identical to the 3 above:
- [ "$mystr" ]: test if the string -a is non-empty
- [ -a "$mypath" ]: test if the file -a exists
- [ "$mystr" -a "$otherstr" ]: test if both -a and -a are non-empty

So -a means 3 different things, depending on the context:
- a literal string
- a unary operator, an alias for -e, to test if a file exists
- a binary operator, logical and
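The three readings can be exercised directly. This is a sketch that assumes a shell where -a is accepted as a unary operator, as in bash and ksh. The function name three_meanings is hypothetical. It runs in a fresh temporary directory so that no file named -a exists.

```shell
three_meanings() {
  mystr='-a' otherstr='-a' mypath='-a'
  tmp=$(mktemp -d) || return
  (
    cd "$tmp" || exit
    # 1 argument: -a is a literal string, tested for non-emptiness
    [ "$mystr" ] && echo 'string -a is non-empty'
    # 2 arguments: the first -a is a unary operator, the second a path
    [ -a "$mypath" ] 2>/dev/null || echo 'no file named -a here'
    # 3 arguments: the middle -a is the binary "and" operator
    [ "$mystr" -a "$otherstr" ] && echo 'both strings are non-empty'
  )
  rmdir "$tmp"
}

three_meanings
```

All three messages are printed, even though the argument lists consist of nothing but the token -a.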
Not only does this make code hard to read, it also makes it difficult to write a correct parser for [.
Note that [ isn't the only command with this type of problem. The find and expr tools are also expression languages with no lexer, and thus have related ambiguity issues. I may write about them in the future.
Another way to think about it: If Python had no distinction between strings and keywords, you wouldn't be able to tell these two expressions apart:
>>> 'and' and 'and' # A valid expression in Python
'and'
>>> and and and # SyntaxError
In bash, [ -a -a -a -a ] is a syntax error:
$ [ -a -a -a -a ]
... no output ...
But you can reasonably parse it in multiple ways:
- [ -a "$mypath" -a "$mystr" ]: (EXISTS mypath) AND mystr
- [ "$mystr" -a -a "$mypath" ]: mystr AND (EXISTS mypath)

In fact, dash, mksh, and zsh all agree that the result of [ -a -a -a -a ] is 1 when the file -a doesn't exist, not a syntax error! Bash is the odd man out.
I did more testing with the spec test framework:
The shells disagree for rows 3 to 6, which correspond to 4 to 7 occurrences of -a. Moreover, they disagree in different ways for each expression!
(NOTE: OSH doesn't currently implement -a as a unary operator, so it only has ambiguity between -a as a literal and -a as a binary operator.)
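A comparison like this can be reproduced by hand. The following is a rough sketch, not the spec test framework itself. The function compare_shells is hypothetical. It runs [ with a given number of -a arguments in whichever of the four shells are installed, inside an empty temporary directory so that no file named -a exists. No expected output is shown because the results depend on the shells and versions present.

```shell
# Hypothetical helper: print the exit status of [ -a ... -a ] with N
# copies of -a, for each shell that is installed on this machine.
compare_shells() {
  n=$1
  args=''
  i=0
  while [ "$i" -lt "$n" ]; do
    args="$args -a"
    i=$((i + 1))
  done

  tmp=$(mktemp -d) || return
  for sh in bash dash mksh zsh; do
    command -v "$sh" >/dev/null 2>&1 || continue
    # Run inside the empty directory and capture the status of [
    rc=$(cd "$tmp" && "$sh" -c "[ $args ] 2>/dev/null; echo \$?")
    echo "$sh: $rc"
  done
  rmdir "$tmp"
}

compare_shells 4
```

Calling compare_shells with values from 1 to 7 gives one column of statuses per expression length.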
I discovered that if I wanted OSH to behave like any of the four shells, I couldn't use the same parser for [ and [[.
In fact, resolving the ambiguity means that [ is no longer an expression language. Instead, it's a brute-force enumeration of cases.
The (unmaintained) official Bash FAQ describes it as follows. (You can also look at test.c in the Bash source.)
Bash's builtin test implements the Posix.2 spec, which can be summarized as follows (the wording is due to David Korn):
Here is the set of rules for processing test arguments.
- 0 Args: False
- 1 Arg: True iff argument is not null.
- 2 Args:
  - If first arg is !, True iff second argument is null.
  - If first argument is unary, then true if unary test is true
  - Otherwise error.
- 3 Args:
  - If second argument is a binary operator, do binary test of $1 $3
  - If first argument is !, negate two argument test of $2 $3
  - If first argument is '(' and third argument is ')', do the one-argument test of the second argument.
  - Otherwise error.
- 4 Args:
  - If first argument is !, negate three argument test of $2 $3 $4.
  - Otherwise unspecified
- 5 or more Args: unspecified. (Historical shells would use their current algorithm).
The operators -a and -o are considered binary operators for the purpose of the 3 Arg case.
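The rules above can be sketched as a dispatch on the argument count. This is only an illustration of the "enumeration of cases" idea, not how bash or OSH implements it. The function names are hypothetical, and the operator lists in is_unary and is_binary are a small illustrative subset (including the bash/ksh unary -a) rather than the full set any shell recognizes. The sketch only reports which rule applies and does not evaluate anything.

```shell
# Illustrative subset of unary operators (assumption: includes bash's -a).
is_unary() {
  case $1 in
    -a|-e|-f|-d|-L|-n|-z|-r|-w|-x|-s) return 0 ;;
    *) return 1 ;;
  esac
}

# Illustrative subset of binary operators; -a and -o count as binary
# for the purpose of the 3-argument case.
is_binary() {
  case $1 in
    '='|'!='|-eq|-ne|-lt|-le|-gt|-ge|-nt|-ot|-a|-o) return 0 ;;
    *) return 1 ;;
  esac
}

# Report which rule applies to the given argument list.
describe_test_args() {
  case $# in
    0) echo 'false' ;;
    1) echo 'string is non-empty' ;;
    2)
      if [ "$1" = '!' ]; then echo 'negated string test'
      elif is_unary "$1"; then echo "unary $1"
      else echo 'error'
      fi ;;
    3)
      if is_binary "$2"; then echo "binary $2"
      elif [ "$1" = '!' ]; then echo 'negated 2-arg test'
      elif [ "$1" = '(' ] && [ "$3" = ')' ]; then echo 'parenthesized 1-arg test'
      else echo 'error'
      fi ;;
    4)
      if [ "$1" = '!' ]; then echo 'negated 3-arg test'
      else echo 'unspecified'
      fi ;;
    *) echo 'unspecified' ;;
  esac
}

describe_test_args -a             # string is non-empty
describe_test_args -a -a          # unary -a
describe_test_args -a -a -a       # binary -a
describe_test_args -a -a -a -a    # unspecified
```

Note that the order of checks in the 3-argument case matters: the binary-operator check comes first, which is why [ ! -a x ] would be read as a binary test rather than a negation.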
In theoretical terms, a language is described by a grammar, and a grammar accepts or rejects strings of arbitrary length. But POSIX apparently specifies no such thing. Only the "unspecified" cases are allowed to use a grammar!
So the three cases [ -a ], [ -a -a ], and [ -a -a -a ] are specified by POSIX, which is why four different shells (mostly) agree on their meaning. After that, they wildly diverge, as shown by the spec tests above.
At first, I was put off by these hacks. But I noticed that -a and -o are marked obsolete in POSIX, and they're the only constructs that will produce a [ expression longer than four tokens.
And shell already has !, && and || operators, so you can rewrite complex [ expressions like this:
$ path=/etc/passwd
$ test -n "$path" && { ! test -L "$path" ||
>   test "$path" -nt /etc/other; }
$ echo $?
0
This leads to a simple style rule:
Do not use anything but the two- or three-argument forms of [.
Good:
- [ -z STR ]: 2 args
- [ PATH1 -nt PATH2 ]: 3 args
- ! [ -d PATH ]: 2 args with negation on the outside
- [ -d PATH1 ] && [ -d PATH2 ]
- test -d PATH1 && test -d PATH2: same thing, but I think it looks nicer
Bad:
- [ ! -d PATH ]: use shell's negation instead of negation within [
- [ -d PATH1 -a -d $PATH2 ]: use shell's && instead of -a
- [ STR ]: this is technically OK, but redundant with [ -n STR ].
- [ -a PATH ]: use [ -e PATH ] instead
And remember to quote every substitution.
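As a small sketch of the style rule in practice, the two hypothetical helpers below (both_dirs and not_a_dir are illustrative names) use only two-argument [ forms, quote every substitution, and leave negation and conjunction to the shell.

```shell
# True if both arguments are directories.
# Each [ call has exactly 2 arguments, so POSIX fully specifies its meaning.
both_dirs() {
  [ -d "$1" ] && [ -d "$2" ]
}

# True if the argument is not a directory; negation stays outside [.
not_a_dir() {
  ! [ -d "$1" ]
}

both_dirs / / && echo 'both are directories'
not_a_dir /no/such/dir && echo 'not a directory'
```

Because each [ call has a fixed, small number of arguments, an argument such as -a or ! arriving from user data cannot change how the expression is parsed.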
I described ambiguity in the test / [ builtin, as well as the POSIX rules that shells use to resolve it. These rules work for common cases, but there are problematic corner cases.
Last year, I critiqued the other parts of the shell language in a similar way:
In the next post, I'll describe how to translate [ and [[ to Oil, in the style of Translating Shell to Oil.
Please leave a comment if anything doesn't make sense.
I mentioned these three differences:
- [[ uses a grammar; [ uses a POSIX parsing rule with six cases for fixed lengths.
- There is no word splitting inside [[, so no need to quote $varsubs. [ can't tell the difference between quoted and unquoted strings, but [[ can. [[ $foo == *.py ]] is different than [[ $foo == '*.py' ]].
- && and || vs -a and -o
More differences:
- $foo == *.py oddly does glob matching in [[, but not in [
- The [[ language has =~ for regular expressions, but there is no [ equivalent. The external expr tool has similar regex functionality for POSIX-compliant scripts.
Why implement [?
In the spirit of minimalism, I originally thought people could use the coreutils version of [ with OSH.
But users reported that Gentoo and Nix both invoke [ without $PATH set, which means that /usr/bin/[ won't be found.
So I decided to implement [, which led me down this rat hole!