These documents introduce the Oil language:
In contrast, the concepts introduced below may help advanced users remember Oil and its syntax. Read on to learn about:
Parse options like shopt -s parse_paren, which selectively break compatibility and gradually upgrade shell to Oil.

The Oil parser starts out in command mode:
echo "hello $name"
for i in 1 2 3 {
  echo $i
}
But it switches to expression mode in a few places:
var x = 42 + a[i] # the RHS of an assignment is an expression
echo $len('foo') # interpolated function call
echo $[mydict['key']] # interpolated Oil expressions with $[]
See Command vs. Expression Mode for details.
Lexer modes are a technique that Oil uses to manage the complex syntax of shell, which evolved over many decades.
For example, the : character means something different in each of these lines:
PATH=/bin:/usr/bin # Literal string
echo ${x:-default} # Part of an operator
echo $(( x > y ? 42 : 0 )) # Arithmetic Operator
var myslice = a[3:5] # Oil expression
To solve this problem, Oil has a lexer that can run in many modes. Multiple parsers read from this single lexer, but they demand different tokens, depending on the parsing context.
A sigil is a symbol like the $ in $mystr.
A sigil pair is a sigil with opening and closing delimiters, like ${var} and @(seq 3).
An appendix of A Feel For Oil's Syntax lists the sigil pairs in the Oil language.
Each sigil pair may be available in command mode, expression mode, or both.
For example, command substitution is available in both:
echo $(hostname) # command mode
var x = $(hostname) # expression mode
So are raw and C-style string literals:
echo $'foo\n' # the bash-compatible way to do it
var s = $'foo\n'
echo r'c:\Program Files\'
var raw = r'c:\Program Files\'
But array literals only make sense in expression mode:
var myarray = %(one two three)
echo one two three # no array literal needed
A sigil pair often changes the lexer mode to parse what's inside.
Parse options for (), @, and =
Most users don't have to worry about parse options. Instead, they run either bin/osh or bin/oil, which are actually aliases for the same binary. The difference is that bin/oil has the option group oil:all on by default.
Nonetheless, here are two examples.
The parse_at option (in group oil:upgrade) turns @ into the splice operator when it's at the front of a word:
$ var myarray = %(one two three)
$ echo @myarray # @ isn't an operator in shell
@myarray
$ shopt -s parse_at # parse the @ symbol
$ echo @myarray
one two three
$ echo '@myarray' # quote it to get the old behavior
@myarray
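For comparison, here is a minimal sketch of the same situation in plain bash, where parse_at is not available. It assumes bash rather than OSH or Oil; the function names literal_at and splice_bash are made up for illustration, and "${myarray[@]}" is the traditional bash spelling of an array splice.

```shell
# Assumes plain bash; function names are hypothetical.
myarray=(one two three)

literal_at() {
  # Without parse_at, @ is an ordinary character, so this is a literal word.
  echo @myarray
}

splice_bash() {
  # The traditional bash way to splice an array into a command's arguments.
  echo "${myarray[@]}"
}

literal_at     # prints: @myarray
splice_bash    # prints: one two three
```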
The parse_equals option (in group oil:all) lets you omit const:
const x = 42 + a[i] # accepted in OSH and Oil
shopt -s parse_equals # Change the meaning of =
x = 42 + a[i] # Means the same as above
# This is NOT a mutation. It's a declaration.
POSIX specifies that Unix shell has multiple stages of parsing and evaluation. For example:
$ x=2
$ code='3 * x'
$ echo $(( code )) # Silent eval of a string. Dangerous!
6
Oil expressions are parsed in a single stage and then evaluated, which makes Oil more like Python or JavaScript:
$ setvar code = '3 * x'
$ echo $[ code ]
3 * x
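To see why the shell behavior is described as dangerous, here is a small sketch assuming bash. The function name arith_reparse is hypothetical. It shows that the string held in a variable is re-parsed as arithmetic, so it can even carry an assignment that changes another variable.

```shell
# Assumes bash; the function name is illustrative only.
arith_reparse() {
  local x=2
  local code='3 * x'
  echo $(( code ))    # the string is re-parsed as arithmetic: prints 6

  code='x = 10'       # a string that happens to contain an assignment
  : $(( code ))       # evaluating it silently runs the assignment
  echo "$x"           # prints 10
}

arith_reparse
```

A statically-parsed expression language avoids this, because a string value stays a string when it is evaluated.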
Another example: shell assignment builtins like readonly and local are dynamically parsed, while Oil assignments like const and var are statically parsed.
It's confusing that bash has both statically- and dynamically-parsed variants of the same functionality.
Boolean expressions:
  [ -d /tmp ] is dynamically parsed
  [[ -d /tmp ]] is statically parsed
C-style string literals:
  echo -e '\n' is dynamically parsed
  echo $'\n' is statically parsed

The OSH language is parsed "by hand", while the Oil language is parsed with tables generated from a grammar (a modified version of Python's pgen).
This is mostly an implementation detail, but users may notice that OSH gives more specific error messages!
Hand-written parsers give you more control over errors. Eventually the Oil language may have a hand-written parser as well. Either way, feel free to file bugs about error messages that confuse you.
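Returning to the statically- and dynamically-parsed bash variants listed earlier, the following sketch (assuming bash; all function names are hypothetical) shows how the two forms can behave differently on the same input. The [ builtin sees its arguments only after expansion, while [[ and $'...' are understood by the parser up front.

```shell
# Assumes bash; function names are illustrative only.
dynamic_test() {
  local empty=''
  # $empty expands to nothing, so [ receives only "-n" and treats it
  # as a single non-empty string, which is true.
  if [ -n $empty ]; then echo true; else echo false; fi
}

static_test() {
  local empty=''
  # [[ is parsed before expansion, so -n still has an operand: the empty string.
  if [[ -n $empty ]]; then echo true; else echo false; fi
}

dynamic_len() {
  local s='\n'     # two characters: backslash and n (echo -e would interpret them at run time)
  echo "${#s}"
}

static_len() {
  local s=$'\n'    # the parser turns this into a single newline character
  echo "${#s}"
}

dynamic_test   # prints: true
static_test    # prints: false
dynamic_len    # prints: 2
static_len     # prints: 1
```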