Why Sponsor Oils? | blog | oilshell.org

Problems With $((

2016-11-18

In the last post, I was careful make a claim about the osh language , rather than all shells. Both POSIX shell and osh can be parsed with two tokens of lookahead, but bash can't. This is because bash accepts two meanings for the characters $((:

There are few issues with this:

(NOTE: zsh and mksh also accept two meanings of $((, but dash doesn't.)

Demonstration

$ echo=9; foo=3; echo $((echo /foo))
3
$ echo=9; foo=3; echo $((echo /foo) )
/foo

Note that the only difference is a space between the closing parens. On the first line, echo is the name of a variable, the first operand for the infix division operator. On the second, it's a builtin command.

POSIX spec

The POSIX shell spec says:

A conforming application shall ensure that it separates the "$(" and '(' into two tokens (that is, separate them with white space) in a command substitution that starts with a subshell. For example, a command substitution containing a single subshell could be written as:

$( (command) )

For example, this is a POSIX-compliant shell script:

$ echo $( (echo /foo) )
/foo

Non-Arithmetic $(( Found in the Wild

I've parsed around a million lines of shell with the osh parser, and the $(( construct appears just five times. However, it does appear in important projects like Chrome, Mozilla, Android, and the Linux kernel.

It's possible to change osh to accept scripts that aren't POSIX-compliant, as bash and other shells do. Although it will break the LL(2) property, the code as written can easily handle it.

But for now, I'm punting on this issue, because the error is surfaced without running code (due to static parsing), and it's easily fixed by inserting a space.

Here are the five examples of $(( in the wild used as a command sub and subshell:

(1) Linux kernel, in tools/perf/perf-with-kcore.sh:

KCORE=$(($SUDO "$PERF" buildid-cache -v -f -k /proc/kcore >/dev/null) 2>&1)

(2) The golang/exp repo, in shootout/timing.sh

echo $((time -p $* >/dev/null) 2>&1) | awk '{print $4 "u " $6 "s " $2 "r"}'

(3) Android platform, in applypatch/imgdiff_test.sh

for i in $((zipinfo -1 $START_OTA_PACKAGE; zipinfo -1 $END_OTA_PACKAGE) | \

(4) Chromium, in tools/BACKPORTS/build_packports.sh

if [[ "$((sha1sum "$scriptname" "$1" || shasum "$scriptname" "$1") 2>/dev/null)" = "$(cat "$1.lastver")" ]]; then

(5) Mozilla, in src/doc/format.sh:

hg_relative_sourcedir=$((cd $sourcedir; pwd) | sed -e "s|$(hg root)/||")

Rewrites to Correct Usage

In every case, rewriting $(( to $( ( isn't even the best solution.

The first two scripts are misunderstanding redirects. This rewrite can be applied, saving a subshell:

$( (my-command >/dev/null) 2>&1)  # rewrite to ...
$(my-command 2>&1 >/dev/null)

The last three scripts are misunderstanding grouping operators. This rewrite can be applied, again saving a subshell:

$( (command1; command2) | command3)    # rewrite to ...
$({ command1; command2; } | command3) 

Oil Language

I suspect people are using parentheses as grouping because the syntax of grouping braces is difficult. All of these are invalid:

{echo a; echo b}
{echo a; echo b;}
{ echo a; echo b }

But all of these are valid:

(echo a; echo b)
(echo a; echo b;)
(echo a; echo b )

This is the correct way to use grouping:

{ echo a; echo b; }

Unlike parentheses, braces are not operators in shell and don't delimit words. Spaces are necessary to separate them from surrounding words.

Oil will have a consistent syntax for these two constructs:

do {
  echo a
  echo b
}         # grouping

shell {
  echo a
  echo b
}         # subshell

When used on a single line, they will look like this:

do {echo a; echo b} | cmd     # grouping

shell {echo a; echo b} | cmd  # subshell

The oil repo will be made public tomorrow, and I hope to start working on the new language shortly thereafter.


Credit: I first became aware of the $(( issue through this post by Daniel Marti, author of a shell formatter.