These documents introduce the YSH language:
In contrast, the concepts introduced below may help advanced users remember YSH and its syntax. Read on to learn about:
Parse options like shopt -s parse_paren, to selectively break compatibility and gradually upgrade shell to YSH.
The YSH parser starts out in command mode:
echo "hello $name"
for i in 1 2 3 {
echo $i
}
But it switches to expression mode in a few places:
var x = 42 + a[i] # the RHS of = is a YSH expression
echo $[mydict['key']] # interpolated expressions with $[]
json write ({key: "val"}) # typed args inside ()
See Command vs. Expression Mode for details.
Lexer modes are a technique that YSH uses to manage the complex syntax of shell, which evolved over many decades.
For example, the : character means something different in each of these lines:
PATH=/bin:/usr/bin # Literal string
echo ${x:-default} # Part of an operator
echo $(( x > y ? 42 : 0 )) # Arithmetic Operator
var myslice = a[3:5] # YSH expression
To solve this problem, YSH has a lexer that can run in many modes. Multiple parsers read from this single lexer, but they demand different tokens, depending on the parsing context.
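As a rough illustration of the three shell contexts listed above, the following bash sketch runs each one and prints the result. The variable names (search_path, maybe_unset, with_default, ternary) are made up for this example. It only shows that the same : character is treated differently depending on what surrounds it. It says nothing about how the YSH lexer is implemented.

```shell
#!/usr/bin/env bash
# Sketch: three shell contexts in which ':' is lexed differently.

# 1. In an assignment, ':' is just part of a literal string.
search_path=/bin:/usr/bin      # hypothetical variable, stands in for PATH

# 2. Inside ${}, ':-' is an operator: use a default if unset or empty.
unset maybe_unset
with_default=${maybe_unset:-default}

# 3. Inside $(( )), ':' is the second half of the ternary operator.
x=5
y=3
ternary=$(( x > y ? 42 : 0 ))

echo "$search_path"    # /bin:/usr/bin
echo "$with_default"   # default
echo "$ternary"        # 42
```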
A sigil is a symbol like the $ in $mystr.
A sigil pair is a sigil with opening and closing delimiters, like ${var} and @(seq 3).
An appendix of A Feel For YSH Syntax lists the sigil pairs in the YSH language.
Each sigil pair may be available in command mode, expression mode, or both.
For example, command substitution is available in both:
echo $(hostname) # command mode
var x = $(hostname) # expression mode
So are raw and C-style string literals:
echo $'foo\n' # the bash-compatible way to do it
var s = $'foo\n'
echo r'c:\Program Files\'
var raw = r'c:\Program Files\'
But array literals only make sense in expression mode:
var myarray = :| one two three |
echo one two three # no array literal needed
A sigil pair often changes the lexer mode to parse what's inside.
(), [], @, and =
Most users don't have to worry about parse options. Instead, they run either bin/osh or bin/ysh, which are actually aliases for the same binary. The difference is that bin/ysh has the option group ysh:all on by default.
Nonetheless, here are two examples.
The parse_at option (in group ysh:upgrade) turns @ into the splice operator when it's at the front of a word:
$ var myarray = :| one two three |
$ echo @myarray # @ isn't an operator in shell
@myarray
$ shopt -s parse_at # parse the @ symbol
$ echo @myarray
one two three
$ echo '@myarray' # quote it to get the old behavior
@myarray
The parse_bracket option (also in group ysh:upgrade) lets you pass unevaluated expressions to a command with []:
assert (^[42 === x]) # assert is passed an expression, not value
assert [42 === x] # syntax sugar with parse_bracket
POSIX specifies that Unix shell has multiple stages of parsing and evaluation. For example:
$ x=2
$ code='3 * x'
$ echo $(( code )) # Silent eval of a string. Dangerous!
6
YSH expressions are parsed in a single stage and then evaluated, which makes YSH more like Python or JavaScript:
$ setvar code = '3 * x'
$ echo $[ code ]
3 * x
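To make the risk of the silent eval in $(( )) more concrete, here is a minimal bash sketch. The names product, counter, user_input, and result are invented for illustration. It assumes bash's behavior of re-parsing a variable's string value as an arithmetic expression. Because that expression may contain an assignment, a string that looks like data can change program state.

```shell
#!/usr/bin/env bash
# Sketch: bash re-parses a variable's string value as arithmetic inside $(( )).

x=2
code='3 * x'
product=$(( code ))        # the string '3 * x' is parsed and evaluated
echo "$product"            # 6

# The re-parsed string may contain an assignment, so data can change state.
counter=1
user_input='counter = 99'  # hypothetical untrusted input
result=$(( user_input ))
echo "$result"             # 99
echo "$counter"            # 99: counter was silently overwritten
```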
Another example: shell assignment builtins like readonly and local are dynamically parsed, while YSH assignments like const and var are statically parsed.
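The dynamic parsing of assignment builtins can be observed from the shell side with a small sketch. The names decl, greeting, and status are hypothetical. The readonly builtin accepts a name=value string that is only assembled at runtime, whereas a plain assignment has to be visible to the parser. The same string used as a bare word is therefore looked up as a command.

```shell
#!/usr/bin/env bash
# Sketch: 'readonly' is a builtin, so its argument is parsed at runtime.

decl='greeting=hello'      # hypothetical name=value string built at runtime
readonly "$decl"           # the builtin splits the string at '=' when it runs
echo "$greeting"           # hello

# A plain assignment is recognized by the parser, before anything runs,
# so a string that merely looks like an assignment is treated as a command.
if "$decl" 2>/dev/null; then
  status=ran
else
  status=not-an-assignment # 'greeting=hello' was looked up as a command name
fi
echo "$status"             # not-an-assignment
```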
It's confusing that bash has both statically- and dynamically-parsed variants of the same functionality.
Boolean expressions:
[ -d /tmp ] is dynamically parsed
[[ -d /tmp ]] is statically parsed
C-style string literals:
echo -e '\n' is dynamically parsed
echo $'\n' is statically parsed
The OSH language is parsed "by hand", while the YSH expression language is parsed with tables generated from a grammar (a modified version of Python's pgen).
This is mostly an implementation detail, but users may notice that OSH gives more specific error messages!
Hand-written parsers give you more control over errors. Eventually the YSH language may have a hand-written parser as well. Either way, feel free to file bugs about error messages that confuse you.