Why Sponsor Oils? | blog | oilshell.org
In the last post, I was careful to make a claim about the osh language, rather than all shells. Both POSIX shell and osh can be parsed with two tokens of lookahead, but bash can't.
This is because bash accepts two meanings for the characters $((:

- Arithmetic substitution, as in $((1 + 2))
- Command substitution that starts with a subshell, as in $((echo hi) ). (The space between the closing parens is necessary.)

There are a few issues with this:

- Scripts that use $((echo hi) ) are not POSIX compliant.
- $((echo hi) ) starts both a child process and a grandchild process, which is almost never necessary. Below we will see that five out of five real-world occurrences of $(( misuse it.

(NOTE: zsh and mksh also accept two meanings of $((, but dash doesn't.)
$ echo=9; foo=3; echo $((echo /foo))
3
$ echo=9; foo=3; echo $((echo /foo) )
/foo
Note that the only difference is a space between the closing parens. On the first line, echo is the name of a variable, the first operand for the infix division operator. On the second, it's a builtin command.
The POSIX shell spec says:
A conforming application shall ensure that it separates the "$(" and '(' into two tokens (that is, separate them with white space) in a command substitution that starts with a subshell. For example, a command substitution containing a single subshell could be written as:
$( (command) )
For example, this is a POSIX-compliant shell script:
$ echo $( (echo /foo) )
/foo
I've parsed around a million lines of shell with the osh parser, and the $(( construct, used as a command sub and subshell, appears just five times. However, it does appear in important projects like Chrome, Mozilla, Android, and the Linux kernel.
It's possible to change osh to accept scripts that aren't POSIX-compliant, as bash and other shells do. Although it will break the LL(2) property, the code as written can easily handle it.
But for now, I'm punting on this issue, because the error is surfaced without running code (due to static parsing), and it's easily fixed by inserting a space.
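Because the problem is visible in the source text, a rough textual search can turn up candidates before any parser is involved. The sketch below is only a heuristic and is not how osh detects the problem: it flags lines that contain $(( but no )), which misses some cases and can flag others wrongly. The sample input is made up for illustration.

```shell
# Hypothetical sample input; these lines do not come from a real project.
sample='x=$((1 + 2))
y=$((echo hi) )
z=$( (echo hi) )'

# Heuristic: keep lines that contain "$((" but never contain "))".
# Arithmetic like $((1 + 2)) closes with "))", so it is filtered out.
suspects=$(printf '%s\n' "$sample" | grep -F '$((' | grep -v -F '))')

printf '%s\n' "$suspects"
```

Only the second line is reported. The third line already has the space that POSIX asks for, so it never matches the first grep.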
Here are the five examples in the wild of $(( used as a command sub and subshell:
(1) Linux kernel, in tools/perf/perf-with-kcore.sh:
KCORE=$(($SUDO "$PERF" buildid-cache -v -f -k /proc/kcore >/dev/null) 2>&1)
(2) The golang/exp repo, in shootout/timing.sh:
echo $((time -p $* >/dev/null) 2>&1) | awk '{print $4 "u " $6 "s " $2 "r"}'
(3) Android platform, in applypatch/imgdiff_test.sh:
for i in $((zipinfo -1 $START_OTA_PACKAGE; zipinfo -1 $END_OTA_PACKAGE) | \
(4) Chromium, in tools/BACKPORTS/build_packports.sh:
if [[ "$((sha1sum "$scriptname" "$1" || shasum "$scriptname" "$1") 2>/dev/null)" = "$(cat "$1.lastver")" ]]; then
(5) Mozilla, in src/doc/format.sh:
hg_relative_sourcedir=$((cd $sourcedir; pwd) | sed -e "s|$(hg root)/||")
In every case, rewriting $(( to $( ( isn't even the best solution.
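The first two scripts misunderstand redirects. Redirects are processed left to right, so 2>&1 can point stderr at the captured output before >/dev/null discards stdout. This rewrite can be applied, saving a subshell: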
$( (my-command >/dev/null) 2>&1) # rewrite to ...
$(my-command 2>&1 >/dev/null)
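To see that the two forms capture the same thing, here is a minimal sketch. The function noisy is a hypothetical stand-in for my-command: it writes one line to stdout and one to stderr. The last assignment shows why the order of the redirects matters.

```shell
# noisy is a hypothetical stand-in for my-command.
noisy() {
  echo "to stdout"
  echo "to stderr" >&2
}

# Original form: the inner subshell discards stdout,
# then the outer redirect sends stderr into the command sub.
with_subshell=$( (noisy >/dev/null) 2>&1)

# Rewritten form: 2>&1 first points stderr at the command sub's pipe,
# then >/dev/null discards stdout only.
without_subshell=$(noisy 2>&1 >/dev/null)

# Reversed order: stdout goes to /dev/null first and stderr follows it,
# so nothing is captured.
reversed=$(noisy >/dev/null 2>&1)

echo "with subshell:    [$with_subshell]"
echo "without subshell: [$without_subshell]"
echo "reversed order:   [$reversed]"
```

The first two variables both hold the stderr line, and the third is empty.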
The last three scripts are misunderstanding grouping operators. This rewrite can be applied, again saving a subshell:
$( (command1; command2) | command3) # rewrite to ...
$({ command1; command2; } | command3)
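As a quick check that the brace form feeds the pipeline the same way, here is a small sketch. The echo and sort commands are stand-ins chosen for illustration; they play the roles of command1, command2 and command3.

```shell
# Subshell form (with the POSIX-required space after "$(").
with_parens=$( (echo b; echo a) | sort)

# Grouping form: the braces need surrounding spaces and a final semicolon.
with_braces=$({ echo b; echo a; } | sort)

# Both variables should hold "a" and "b" on separate lines.
expected=$(printf 'a\nb')

printf '%s\n' "$with_parens"
printf '%s\n' "$with_braces"
```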
I suspect people are using parentheses as grouping because the syntax of grouping braces is difficult. All of these are invalid:
{echo a; echo b}
{echo a; echo b;}
{ echo a; echo b }
But all of these are valid:
(echo a; echo b)
(echo a; echo b;)
(echo a; echo b )
This is the correct way to use grouping:
{ echo a; echo b; }
Unlike parentheses, braces are not operators in shell and don't delimit words. Spaces are necessary to separate them from surrounding words.
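The snippets above can be checked mechanically. This is a sketch that assumes bash or a POSIX sh such as dash; other shells may parse some of these snippets differently. The helper prints_a_b is hypothetical: it evaluates a snippet in a subshell, so that a syntax error cannot abort the script, and reports whether the output is exactly "a" and "b" on separate lines.

```shell
expected=$(printf 'a\nb')

# Hypothetical helper: succeeds only if the snippet prints "a" then "b".
prints_a_b() {
  # The inner subshell contains any syntax error raised by eval.
  actual=$( (eval "$1") 2>/dev/null )
  [ "$actual" = "$expected" ]
}

results=""
for snippet in \
  '{echo a; echo b}' \
  '{echo a; echo b;}' \
  '{ echo a; echo b }' \
  '(echo a; echo b)' \
  '(echo a; echo b;)' \
  '(echo a; echo b )' \
  '{ echo a; echo b; }'
do
  if prints_a_b "$snippet"; then verdict=ok; else verdict=bad; fi
  results="$results$verdict "
  printf '%-22s %s\n' "$snippet" "$verdict"
done
```

Under those assumptions, the three malformed brace snippets are reported as bad and the remaining four as ok. Note that the first snippet is not a syntax error at all: {echo is looked up as a command name, and echo b} then prints the literal text b}.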
Oil will have a consistent syntax for these two constructs:
do {
  echo a
  echo b
}  # grouping

shell {
  echo a
  echo b
}  # subshell
When used on a single line, they will look like this:
do {echo a; echo b} | cmd # grouping
shell {echo a; echo b} | cmd # subshell
The oil repo will be made public tomorrow, and I hope to start working on the new language shortly thereafter.
Credit: I first became aware of the $(( issue through this post by Daniel Marti, author of a shell formatter.