Warning: Work in progress! Leave feedback on Zulip or GitHub if you'd like this doc to be updated.

Word Language

Recall that Oil is composed of three interleaved languages: words, commands, and expressions.

This doc describes words, but only the things that are not in:

Table of Contents
What's a Word?
Contexts Where Words Are Used
Words Are Part of Expressions and Commands
Word Sequences: in for loops and array literals
Oil vs. Bash Array Literals
Oil Discourages Context-Sensitive Evaluation
Sigils
$ Means "Returns One String"
@ Means "Returns An Array of Strings"
Inline Function Calls
That Return Strings (Function Sub)
That Return Arrays (Function Splice)
OSH Features
Word Splitting and Empty String Elision
Implicit Joining
Extended Globs
Notes
On The Design of Substitution

What's a Word?

A word is an expression like $x, "hello $name", or {build,test}/*.py. It evaluates to a string or an array of strings.

Generally speaking, Oil behaves like a simpler version of POSIX shell / bash. Sophisticated users can read Simple Word Evaluation for a comparison.

Contexts Where Words Are Used

Words Are Part of Expressions and Commands

Part of an expression:

var x = ${y:-'default'}

Part of a command:

echo ${y:-'default'}

Word Sequences: in for loops and array literals

The three contexts where splitting and globbing apply are the ones where a sequence of words is evaluated (EvalWordSequence):

  1. Command: echo $x foo
  2. For loop: for i in $x foo; do ...
  3. Array Literals: a=($x foo) and var a = %($x foo) (oil-array)
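As a rough illustration of the list above, the following bash sketch counts the words produced by the same sequence in each of the three contexts. The helper count_args and the scratch directory from mktemp are assumptions made for the demo, not part of Oil or OSH, and the counts assume the default IFS.

```shell
# Work in a scratch directory so the glob has exactly two known matches.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch a.py b.py

x='one two'

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# 1. Command: $x splits into 2 words and *.py globs to 2 files.
n_command=$(count_args $x *.py)

# 2. For loop: the same word sequence drives the iteration.
n_loop=0
for i in $x *.py; do
  n_loop=$((n_loop + 1))
done

# 3. Array literal (bash syntax).
arr=($x *.py)
n_array=${#arr[@]}

echo "$n_command $n_loop $n_array"   # prints 4 4 4

# Clean up the scratch directory.
cd - > /dev/null && rm -rf -- "$tmp"
```

In all three contexts the two-word string and the glob each contribute two elements, which is what "a sequence of words is evaluated" means in practice.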

Oil vs. Bash Array Literals

Oil has a new array syntax, but it also supports the bash-compatible syntax:

local myarray=(one two *.py)  # bash

var myarray = %(one two *.py)  # Oil style

Oil Discourages Context-Sensitive Evaluation

Shell also has contexts where it evaluates words to a single string, rather than a sequence, like:

# RHS of Assignment
x="${not_array[@]}"
x=*.py  # not a glob

# Redirect Arg
echo foo > "${not_array[@]}"
echo foo > *.py  # not a glob

# Case variables and patterns
case "${not_array1[@]}" in 
  "${not_array2[@]}")
    echo oops
    ;;
esac

case *.sh in   # not a glob
  *.py)        # a string pattern, not a file system glob
    echo oops
    ;;
esac

The behavior of these snippets diverges a lot in existing shells. That is, shells are buggy and poorly specified.

Oil disallows most of them. Arrays are considered separate from strings and don't randomly "decay".
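For reference, here is a small sketch of how bash treats the non-array cases from the snippets above: the word is evaluated to a single string, so neither globbing nor splitting happens. The scratch directory and the variable names are assumptions for the demo, and other shells may differ on the array cases, which are left out here.

```shell
# Scratch directory with files that WOULD match if globbing applied.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch a.py b.py

# RHS of assignment: the pattern is stored literally, not expanded.
x=*.py
printf '%s\n' "$x"    # prints *.py

# RHS of assignment: no word splitting either.
y='one two'
z=$y
printf '%s\n' "$z"    # prints one two

# Case: the subject is the literal string *.sh and the pattern is a
# string pattern, so the file system is never consulted.
case *.sh in
  *.py) matched=py ;;
  *)    matched=other ;;
esac
printf '%s\n' "$matched"    # prints other

# Clean up the scratch directory.
cd - > /dev/null && rm -rf -- "$tmp"
```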

Related: the RHS of an Oil assignment is an expression, which can be of any type, including an array:

var parts = split(x)       # returns an array
var python = glob('*.py')  # ditto

var s = join(parts)        # returns a string

Sigils

This is a recap of A Feel for Oil's Syntax.

$ Means "Returns One String"

Examples:

(C-style strings like $'\n' use $, but that's more of a bash anachronism. In Oil, c'\n' is preferred.)

@ Means "Returns An Array of Strings"

Enabled with shopt -s parse_at.

Examples:

These are both Oil extensions.

The array literal syntax uses a related sigil, %:

var myarray = %(1 2 3)

Inline Function Calls

This feature is purely syntactic sugar. Instead of:

write $strfunc(x) @arrayfunc(y)

You can always refactor to:

var mystr = strfunc(x)
var myarray = arrayfunc(y)

write $mystr @myarray

That Return Strings (Function Sub)

Examples:

echo $join(myarray, '/')
echo $len(mystr)  # len returns an int, but it's automatically converted to a string
echo foo=$len(mystr)  # also works

Note that inline function calls can't be placed in double quoted strings: "__$len(s)__"

You can either extract a variable:

var x = len(s)
echo "__$x__"

or use an expression substitution (expr-sub):

echo $[len(x)]

$[] is for Oil expressions, while ${} is shell.

This is documented in warts.

That Return Arrays (Function Splice)

cc -o foo -- @arrayfunc(x, y)

echo @split(mystr, '/')  # split on a delimiter

OSH Features

Word Splitting and Empty String Elision

OSH uses POSIX behavior for unquoted substitutions like $x: the result is split into words on IFS, and an empty result is elided.
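A minimal sketch of the POSIX behavior, assuming the default IFS. The helper count_args is hypothetical and only reports how many arguments reached it.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

x='a b'
empty=''

# Word splitting: an unquoted $x becomes two words, a quoted one stays whole.
n_split=$(count_args $x)             # 2
n_quoted=$(count_args "$x")          # 1

# Empty string elision: an unquoted empty value disappears entirely,
# while a quoted one is passed as an empty argument.
n_elided=$(count_args $empty foo)    # 1
n_kept=$(count_args "$empty" foo)    # 2

echo "$n_split $n_quoted $n_elided $n_kept"   # prints 2 1 1 2
```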

Implicit Joining

Shell has odd "joining" semantics, which are supported in Oil but generally discouraged:

set -- 'a b' 'c d'
argv.py X"$@"X  # => ['Xa b', 'c dX']
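The joining can be reproduced without argv.py. In this sketch, show_args is a hypothetical stand-in that prints each argument in brackets. It contrasts the quoted form, which keeps each positional parameter whole, with the unquoted form, which also splits them (default IFS assumed).

```shell
set -- 'a b' 'c d'

# Hypothetical stand-in for argv.py: prints each argument in brackets.
show_args() { printf '[%s]' "$@"; echo; }

# Quoted: X joins the first element and the last element, nothing is split.
quoted=$(show_args X"$@"X)
echo "$quoted"      # prints [Xa b][c dX]

# Unquoted: the elements are split first, then X joins the outer words.
unquoted=$(show_args X$@X)
echo "$unquoted"    # prints [Xa][b][c][dX]
```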

In Oil, the RHS of an assignment is an expression, and joining only occurs within double quotes:

# Oil
var joined = $x$y    # parse error
var joined = "$x$y"  # OK

# Shell
joined=$x$y          # OK
joined="$x$y"        # OK

Extended Globs

Extended globs in OSH are a "legacy syntax" modelled after the behavior of bash and mksh. This feature adds alternation, repetition, and negation to globs, giving them the power of regexes.

You can use them to match strings:

$ [[ foo.cc == *.@(cc|h) ]] && echo 'matches'  # => matches

Or produce lists of filename arguments:

$ touch foo.cc foo.h
$ echo *.@(cc|h)  # => foo.cc foo.h
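The same two uses can be tried in bash, along with the negation operator. This is a sketch of bash behavior only; it assumes bash with shopt -s extglob and a scratch directory from mktemp, and OSH may differ in the ways this section goes on to note.

```shell
# bash needs this option before extended globs are parsed for filenames.
shopt -s extglob

tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch foo.cc foo.h foo.py

# String matching: @(cc|h) matches exactly one of the alternatives.
if [[ foo.cc == *.@(cc|h) ]]; then
  match=yes
else
  match=no
fi

# Filename generation: only the .cc and .h files are produced.
files=( *.@(cc|h) )
echo "${files[*]}"     # prints foo.cc foo.h

# Negation: every name that does not end in .cc or .h
others=( !(*.cc|*.h) )
echo "${others[*]}"    # prints foo.py

# Clean up the scratch directory.
cd - > /dev/null && rm -rf -- "$tmp"
```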

There are some limitations and differences:

Notes

On The Design of Substitution

This is the same discussion as $f(x) vs $(f(x)) on the inline function calls thread.

We only want to interpolate vars and functions. Arbitrary expressions aren't necessary.

In summary:


And then for completeness we also have:


Generated on Tue Mar 7 21:35:45 EST 2023