Hay lets you use the syntax of the Oil shell to declare data and interleaved code. It allows the shell to better serve its role as essential glue. For example, these systems all combine Unix processes in various ways:
Slogans:
This doc describes how to use Hay, with motivating examples.
As of 2022, this is a new feature of Oil, and it needs user feedback. Nothing is set in stone, so you can influence the language and its features!
Hay could be used to configure a hypothetical Linux package manager:
# cpython.hay -- A package definition
hay define Package/TASK # define a tree of Hay node types
Package cpython { # a node with attributes, and children
  version = '3.9'
  url = 'https://python.org'
  TASK build { # a child node, with Oil code
    ./configure
    make
  }
}
This program evaluates to a JSON tree, which you can consume from programs in any language, including Oil:
{ "type": "Package",
  "args": [ "cpython" ],
  "attrs": { "version": "3.9", "url": "https://python.org" },
  "children": [
    { "type": "TASK",
      "args": [ "build" ],
      "code_str": " ./configure\n make\n"
    }
  ]
}
That is, a package build system can use the metadata to create a build environment, then execute shell code within it.
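A consumer in another language only needs a JSON parser. The Python sketch below is a minimal illustration, not part of Hay: it loads the tree shown above from an embedded string (a real tool would read the evaluator's output) and collects the metadata and the build steps. The function name `summarize_package` is hypothetical.

```python
import json

# The JSON tree from the example above, embedded as a string for illustration.
# A raw string keeps the \n escapes intact for the JSON parser.
HAY_JSON = r'''
{ "type": "Package",
  "args": [ "cpython" ],
  "attrs": { "version": "3.9", "url": "https://python.org" },
  "children": [
    { "type": "TASK",
      "args": [ "build" ],
      "code_str": " ./configure\n make\n"
    }
  ]
}
'''

def summarize_package(node):
    """Return (name, attrs, tasks); tasks maps a task name to its shell code."""
    name = node["args"][0]
    attrs = node.get("attrs", {})
    tasks = {}
    for child in node.get("children", []):
        if "code_str" in child:  # SHELL nodes carry unevaluated code
            tasks[child["args"][0]] = child["code_str"]
    return name, attrs, tasks

name, attrs, tasks = summarize_package(json.loads(HAY_JSON))
print(name, attrs["version"], sorted(tasks))
```

The metadata arrives as ordinary typed values, and the build steps arrive as strings that the consumer can run wherever it chooses.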
A goal of Hay is to restore the simplicity of Unix to distributed systems. It's all just code and data!
Here are some DSLs in the same area:
And some general purpose languages:
The biggest difference is that Hay is embedded in a shell, and uses the same syntax. This means:
The sections below elaborate on these points.
Hay nodes have a regular structure:
There are two kinds of node with this structure.
(1) SHELL nodes contain unevaluated code, and their type is ALL CAPS. The code is turned into a string that can be executed elsewhere.
TASK build {
  ./configure
  make
}
# =>
# ... {"code_str": " ./configure\n make\n"}
(2) Attr nodes contain data, and their type starts with a capital letter. They eagerly evaluate a block in a new stack frame and turn it into an attributes dict.
Package cpython {
  version = '3.9'
}
# =>
# ... {"attrs": {"version": "3.9"}} ...
These blocks have a special rule to allow bare assignments like version = '3.9'. In contrast, Oil code requires keywords like const and var.
(3) In contrast to these two types of Hay nodes, Oil builtins that take a block often evaluate it eagerly:
cd /tmp { # run in a new directory
  echo $PWD
}
fork { # run in an async process
  sleep 3
}
In contrast to Hay SHELL and Attr nodes, builtins are spelled with lower case letters.
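The naming convention above can be summarized in a few lines of Python. This is only an illustration of the rule as described here, not how Oil implements it; the function name `classify_name` is made up for the example.

```python
def classify_name(name):
    """Classify a command name by the casing convention described above."""
    if name.isupper():
        return "SHELL"    # ALL CAPS: block is kept as unevaluated code
    if name[:1].isupper():
        return "Attr"     # leading capital: block is evaluated to attributes
    return "command"      # lower case: builtin or other ordinary command

for name in ["TASK", "Package", "cd", "fork"]:
    print(name, "->", classify_name(name))
```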
So Hay is designed to be used with a "staged execution" model:
These two stages are conceptually different, but use the same syntax and evaluator! The evaluator runs in a mode where it builds up data rather than executing commands.
Here's a description of the result of Hay evaluation (the first stage).
# The source may be "cpython.hay"
FileResult = (source Str, children List[NodeResult])
NodeResult =
  # Package cpython { version = '3.9' }
  Attr (type Str,
        args List[Str],
        attrs Map[Str, Any],
        children List[NodeResult])
  # TASK build { ./configure; make }
| Shell(type Str,
        args List[Str],
        location_str Str,
        location_start_line Int,
        code_str Str)
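A consumer may want typed objects rather than raw dicts. The following Python sketch mirrors the schema above with dataclasses. It makes two assumptions that the schema does not state: a node is treated as a Shell node when it has a `code_str` key, and the location fields get defaults because the JSON examples in this doc omit them. The names `AttrNode`, `ShellNode`, and `decode_node` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union

@dataclass
class ShellNode:
    type: str
    args: List[str]
    code_str: str
    location_str: str = ""          # assumed default when absent
    location_start_line: int = 0    # assumed default when absent

@dataclass
class AttrNode:
    type: str
    args: List[str]
    attrs: Dict[str, Any] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)

Node = Union[AttrNode, ShellNode]

def decode_node(d: Dict[str, Any]) -> Node:
    """Build a typed node from one JSON object, recursing into children."""
    if "code_str" in d:  # assumption: only Shell nodes carry code_str
        return ShellNode(
            type=d["type"],
            args=list(d.get("args", [])),
            code_str=d["code_str"],
            location_str=d.get("location_str", ""),
            location_start_line=d.get("location_start_line", 0),
        )
    return AttrNode(
        type=d["type"],
        args=list(d.get("args", [])),
        attrs=dict(d.get("attrs", {})),
        children=[decode_node(c) for c in d.get("children", [])],
    )
```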
Notes:
You can put Hay blocks and normal shell code in the same file. Retrieve the result of Hay evaluation with the _hay() function.
# myscript.oil
hay define Rule
Rule mylib.o {
  inputs = ['mylib.c']
  # not recommended, but allowed
  echo 'hi'
  ls /tmp/$(whoami)
}
echo 'bye' # other shell code
const result = _hay()
In this case, there are no restrictions on the commands you can run.
You can put Hay definitions in their own file:
# my-config.hay
Rule mylib.o {
  inputs = ['mylib.c']
  echo 'hi' # allowed for debugging
  # ls /tmp/$(whoami) would fail due to restrictions on hay evaluation
}
In this case, you can use echo and write, but the interpreter is restricted (see below).
Parse it with parse_hay(), and evaluate it with eval_hay():
# my-evaluator.oil
hay define Rule # node types for the file
const h = parse_hay('my-config.hay')
const result = eval_hay(h)
json write (result)
# =>
# {
#   "children": [
#     { "type": "Rule",
#       "args": ["mylib.o"],
#       "attrs": {"inputs": ["mylib.c"]}
#     }
#   ]
# }
Instead of creating separate files, you can also use the hay eval builtin:
hay define Rule
hay eval :result { # assign to the variable 'result'
  Rule mylib.o {
    inputs = ['mylib.c']
  }
}
json write (result) # same as above
This is mainly for testing and demos.
The "restrictions" are not a security boundary! (They could be, but we're not making promises now.)
Even with eval_hay() and hay eval, the config file is evaluated in the same interpreter. But the following restrictions apply:
- Builtins other than echo and write aren't allowed. For example, a .hay file can't invoke shopt to change global shell options.
- A .hay file can't mutate your locals. It can still mutate globals with setglobal!
In summary, Hay evaluation is restricted to prevent basic mistakes, but your code isn't completely separate from the evaluated Hay file.
If you want to evaluate untrusted code, use a separate process, and run it in a container or VM.
Here is a list of all the mechanisms mentioned.
- hay
  - hay define to define node types.
  - hay pp to pretty print the node types.
  - hay reset to delete both the node types and the current evaluation result.
  - hay eval :result { ... } to evaluate in restricted mode, and put the result in a variable.
- The haynode builtin is run when types like Package and TASK are invoked. That is, all node types are aliases for this same builtin.
- parse_hay() parses a file, just as bin/oil does.
- eval_hay() evaluates the parsed file in restricted mode, like hay eval.
- _hay() retrieves the current result. Its name starts with _ because it's a "register" mutated by the interpreter.
Hay is parsed and evaluated with option group oil:all, which includes parse_proc and parse_equals.
Why would you want to interleave data and code? There are several reasons, but one is to naturally express variants of a configuration.
Here are some examples.
Build variants. There are many variants of the Oil binary:
- dbg and opt. The compiler optimization level, and whether debug symbols are included.
- asan and ubsan. Dynamic analysis with Clang sanitizers.
- -D GC_EVERY_ALLOC. Make a build that helps debug the garbage collector.
So the Ninja build graph to produce these binaries is shaped similarly, but it varies with compiler and linker flags.
Service variants. A common problem in distributed systems is how to develop and debug services locally.
Do your service dependencies live in the cloud, or are they run locally? What about state? Common variants:
- local. Part or all of the service runs locally, so you may pass flags like --auth-service localhost:8001 to binaries.
- staging. A complete copy of the service, in a different cloud, with a different database.
- prod. The live instance running with user data.
Again, these collections of services are all shaped similarly, but the flags vary based on where binaries are physically running.
This model can be referred to as "graph metaprogramming" or "staged programming".
In Oil, it's done with dynamically typed data like integers and dictionaries. In contrast, these systems are more stringly typed:
The following examples are meant to be "evocative"; they're not based on real code. Again, user feedback can improve them!
Conditionals can go on the inside of a block:
Service auth.example.com { # node taking a block
  if (variant == 'local') { # condition
    port = 8001
  } else {
    port = 80
  }
}
Or on the outside:
Service web { # node
  root = '/home/www'
}
if (variant == 'local') { # condition
  Service auth-local { # node
    port = 8001
  }
}
Iteration can also go on the inside of a block:
Rule foo.o { # node
  inputs = [] # populate with all .cc files except one
  # variables ending with _ are "hidden" from block evaluation
  for name_ in *.cc {
    if (name_ != 'skipped.cc') { # expression conditions need parentheses
      _ append(inputs, name_)
    }
  }
}
Or on the outside:
for name_ in *.cc { # loop
  Rule $(basename $name_ .cc).o { # node
    inputs = [name_]
  }
}
Procs can wrap blocks:
proc myrule(name) {
  # needed for blocks to use variables higher on the stack
  shopt --set dynamic_scope {
    Rule dbg/$name.o { # node
      inputs = ["$name.c"]
      flags = ['-O0']
    }
    Rule opt/$name.o { # node
      inputs = ["$name.c"]
      flags = ['-O2']
    }
  }
}
myrule mylib # invoke proc
Or they can be invoked from within blocks:
proc set-port(port_num, :out) {
  setref out = "localhost:$port_num"
}
Service foo { # node
  set-port 80 :p1 # invoke proc
  set-port 81 :p2 # invoke proc
}
TODO: Show example of consuming Hay JSON in Oil.
TODO: Show example of consuming Hay JSON in Python.
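Until that example exists, here is a minimal sketch of a Python consumer, under stated assumptions. The JSON is hand-written in the shape shown earlier, the helper name `run_task` is hypothetical, and `sh` is used as a stand-in interpreter because the sample code is plain POSIX shell. Code captured from a SHELL node is parsed as Oil, so a real consumer would more likely hand it to an Oil binary.

```python
import json
import subprocess
import tempfile

def run_task(node, interpreter=("sh", "-c")):
    """Run the code_str of a SHELL node in a fresh temp dir; return its stdout."""
    if "code_str" not in node:
        raise ValueError("not a SHELL node: %r" % node.get("type"))
    with tempfile.TemporaryDirectory() as build_dir:
        proc = subprocess.run(
            list(interpreter) + [node["code_str"]],
            cwd=build_dir,        # stands in for a real build environment
            capture_output=True,
            text=True,
            check=True,           # raise if the task exits non-zero
        )
    return proc.stdout

# Hand-written sample in the shape of Hay's JSON output (an assumption).
tree = json.loads('''
{ "type": "Package", "args": ["demo"],
  "attrs": {"version": "1.0"},
  "children": [
    {"type": "TASK", "args": ["build"], "code_str": "echo building\\n"}
  ]
}
''')

for child in tree["children"]:
    print(child["args"][0], "->", run_task(child).strip())
```

The point of the split is that the metadata in `attrs` can decide how the environment is prepared, while the code itself stays an opaque string until it is run.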
.d Dirs
Debian has a pattern of splitting configuration into a directory of concatenated files. It's easier for shell scripts to add to a directory than add to a file.
This can be done with an evaluator that simply enumerates all files:
var results = []
for path in myconfig.d/*.hay {
  const code = parse_hay(path)
  const result = eval_hay(code) # evaluate the parsed file
  _ append(results, result)
}
# Now iterate through results
TODO: Example of using xargs -P to spawn processes with parse_hay() and eval_hay(). Then merge the JSON results.
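The merge step could look like the following Python sketch. It assumes each evaluated file was written out as JSON with the source and children fields from the FileResult schema. The function name `merge_results`, the sample file names, and the `source` tag added to each child are illustrative choices, not part of Hay.

```python
import json

def merge_results(file_results):
    """Concatenate the children of several FileResult dicts, in the order given.

    Each child is copied and tagged with the file it came from.
    """
    merged = {"children": []}
    for result in file_results:
        source = result.get("source", "<unknown>")
        for child in result.get("children", []):
            tagged = dict(child)  # shallow copy; the input is left untouched
            tagged["source"] = source
            merged["children"].append(tagged)
    return merged

# Hypothetical per-file results, as they might be loaded from JSON files.
a = {"source": "myconfig.d/10-lib.hay",
     "children": [{"type": "Rule", "args": ["mylib.o"],
                   "attrs": {"inputs": ["mylib.c"]}}]}
b = {"source": "myconfig.d/20-app.hay",
     "children": [{"type": "Rule", "args": ["app.o"],
                   "attrs": {"inputs": ["app.c"]}}]}

print(json.dumps(merge_results([a, b]), indent=2))
```

Keeping the input order means that a sorted glob of the directory gives the same "concatenated files" semantics as the .d pattern.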
Assigning attributes and invoking procs can look similar:
Package grep {
  version = '1.0' # An attribute?
  version 1.0 # or call proc 'version'?
}
The first style is better for typed data like integers and dictionaries. The latter style isn't useful here, but it could be if version 1.0 created complex Hay nodes.
Hay nodes shouldn't take flags or --. Flags are for key-value pairs, and blocks are better for expressing such data.
No:
Package --version 1.0 grep {
  license = 'GPL'
}
Yes:
Package grep {
  version = '1.0'
  license = 'GPL'
}
Superficially, dicts and blocks are similar:
Package grep {
  mydict = {name: 'value'} # a dict
  mynode foo { # a node taking a block
    name = 'value'
  }
}
Use dicts in cases where you don't know the names or types up front, like
files = {'README.md': true, '__init__.py': false}
Use blocks when there's a schema. Blocks are also different because:
- You can use if statements and for loops in them.
- You can write TASK build; TASK test within a block, creating multiple objects of the same type.
Hay files are parsed as Oil, not OSH. That includes SHELL nodes:
TASK build {
  cp @deps /tmp # Oil splicing syntax
}
If you want to use POSIX shell or bash, use two arguments, the second of which is a multi-line string:
TASK build '''
  cp "${deps[@]}" /tmp
'''
The Oil style gives you static parsing, which catches some errors earlier.
- hay proc for arbitrary schema validation, including JSON schema
Please send feedback about Hay. It will inform and prioritize this work!