Oils 0.24.0 - Closures, Objects, and Namespaces

Closures
Objects - similar to JavaScript and Lua
Namespaces
- Modules like util.ysh are objects used as namespaces
- ENV is an object, which separates it from global variables (unlike POSIX shell)
- __builtins__ and __defaults__

We'll also see:

OSH compatibility improvements
Interactive Shell

The Big Picture

With the addition of closures, objects, and namespaces, YSH is now closer to Python and JavaScript than it is to Awk.

Python and JavaScript have all those features, but shell and Awk have none of them!

YSH is also gaining reflective control over the interpreter. We're making a programmable programming language, which supports DSLs like Awk and Hay.

Ruby seems to do better at this than Python or JavaScript, and so far our APIs compare well with Ruby. I welcome Ruby users to challenge us!

With these features, I also feel like YSH is more complete. The remaining big feature is Hay: declarative and programmable configuration. So:

There will be small features added, based on feedback
We'll build on this new object model
We'll add builtin functions and methods, including reflective APIs
We'll add to the OSH and YSH standard libraries

But I don't anticipate big new features in the YSH language. The main change will be an overhaul of Hay!

Recap of Motivation

As mentioned, the objects post is essential background, and an appendix described the motivation for closures. If you want to read more, here are some rough Zulip threads (login required):

#blog-ideas > What happened with objects, modules, closures, reflection?
#oil-discuss > Minimal YSH is flags, testing, modules, Hay, pure functions
- An informal roadmap for YSH.

Contributors

Thank you to all contributors!

Aidan Olsen
- Str.split() now supports an eggex separator
- Str.replace() improvements and docs
- Fix string -> int conversions in OSH arithmetic
  - Add failing test cases for dynamic arithmetic
- New range operators 1 ..= 5 and 1 ..< 5 (discussed below)
- Give better errors for typos like && and || by over-lexing
- Define procs in the current scope, not global scope
- Allow nested procs/funcs
Melvin Walls
- Update the benchmarks to run a variant of mycpp that generates faster code. It does program-wide dataflow analysis with Souffle datalog. This isn't turned on yet, but I think it will be essential for the "next level" of Oils performance!
- Add pretty printing for the Obj type
Will Clardy
- args.ysh supports Float and Str types
- It supports flags that can be passed multiple times, resulting in a List[Str]
- Make inner "words" in the args.ysh API private
- Language feedback that led to test --true $[myexpr], and likewise test --false
Matthew Davidson
- Added tests for the bind builtin (work in progress)
Jason Miller
- trap with an integer arg removes the trap (POSIX compatibility)
Thank you for reviewing and improving Oils:
- nisbet-hubbard - Update the Getting Started doc with some tips
- Steven Oliver - Fix typo in INSTALL.txt
- meator - Fix comments in _build/oils.sh
- Kaonashie - Improve dev setup with Arch Linux automation

Julian Brown found several bugs in YSH by using it!
- Crash due to uninitialized vars in Python context managers - issue 2074 in mycpp
- Crash in setvar L[i] - issue 2104
- Crash in parsing large YSH expressions
  - Julian also suggested the solution I went with, which is to represent the parse node arena with std::deque. (Temporary parse tree nodes are not GC objects.)
- Thank you for the language feedback, based on writing real code! I'm also glad that Julian has been able to debug the generated C++, which supports one of our core claims :-)
Samuel Hierholzer - testing on real autoconf scripts, e.g. in Alpine Linux
- Found a bug that manifested as an infinite loop with grep -e. This is the bug Aidan fixed, mentioned above.
- Found a word splitting bug, which we're still working on.
ale5000
- Found command -v bug, issue 2093, now fixed
- Found printf overflow, issue 2107, now fixed
yurivict - FreeBSD bug report
Albin Otterhäll - feedback on Zulip

This list is incomplete — feel free to ping me if I left something out!

Build and Packaging Improvements

I did a bunch of work on issue #2080, with great feedback from Void Linux (meator) and Fedora (tkb-github).

./install now accepts an arg for the build variant to install, rather than always assuming _bin/cxx-opt-sh/oils-for-unix
- Example use case: packagers can install the stripped or unstripped binary
Updated --help for the 3 scripts packagers run:
1. ./configure
2. _build/oils.sh
3. ./install

Docs Updated

This release has tons of changes, and we're keeping the docs updated.

The Oils Reference
- Reorganized the Chapter on types and methods
- New page: YSH and OSH Topics By Feature
  - I wanted a way to see all help topics for cross-cutting features, like closures, objects, modules, and ENV.
  - This page is rough, but can improve based on feedback.
Skeleton of a new doc: Types in the Oils Runtime - OSH and YSH
- OSH uses Str, BashArray, BashAssoc
- YSH uses Str, Int, ... List, Dict, ... Func, Proc
A Tour of YSH
- Updated with closures, objects, modules, ENV
Known Differences Between OSH and Other Shells
- Added a note about ((( not being parsed like bash (Samuel hit this, discussed on the #nix channel)
YSH Language Influences
- Added C++ influence: -> and &
- Swift/Rust: new range operator syntax

I realize that we need more blog posts to explain these features in a friendly way. For example, we have a nice design for string literals that I haven't gotten around to highlighting.

But we're testing YSH ourselves first. Feel free to join Zulip if you want to be a part of that process! Feedback and questions are welcome.

`help` builtin

We're also updating the Oils Error Catalog, With Hints.

The shell sometimes displays codes like OILS-ERR-12, which lead you to more detail. Googling OILS-ERR-12 now finds these details, or you can use help to print a direct link:

ysh-0.25.0$ help OILS-ERR-12

     https://oils.pub/release/0.25.0/doc/error-catalog.html#oils-err-12

Breaking Changes in YSH

As usual, the breaking changes are in YSH only. OSH is very stable because most changes are bug fixes in bash features.

Top Feedback: Env vars moved to `ENV`

This is the most noticeable breaking change. Albin Otterhäll and others ran into it, as discussed on Zulip.

I mentioned the ENV object in the objects post. I think of it as a new namespace, and it also uses an Obj as a stack of dictionaries.

Feature	OSH / POSIX Shell	Breaking YSH Change
Read Env Var	`echo $PYTHONPATH` / `x=$PYTHONPATH`	`echo $[ENV.PYTHONPATH]` / `var x = ENV.PYTHONPATH`
Permanently Set Env Var	`export PYTHONPATH=.`	`setglobal ENV.PYTHONPATH = '.'`
Temporarily Set Env Var	`PYTHONPATH=foo ./foo.py`	`PYTHONPATH=foo ./foo.py` (unchanged)

You may notice that setglobal is a bit verbose, and I agree.

But it's more explicit, and doesn't introduce new rules into the language. Feedback on this is still welcome.

You can also write setenv or sh-env in pure YSH, and Julian has done something like this.
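To make the "stack of dictionaries" idea concrete, here is a rough Python analogy using collections.ChainMap. It is only a model of the three rows in the table above, not how Oils implements ENV. The helper names (run_with_bindings, show_pythonpath) and the values are made up for this sketch.

```python
from collections import ChainMap

# Bottom of the stack: the environment the shell inherited (values are made up).
inherited = {"PATH": "/usr/bin", "PYTHONPATH": "."}
env = ChainMap(inherited)


def run_with_bindings(env, bindings, command):
    """Model of `NAME=value command`: push a dict, run the command, drop the dict."""
    temp = env.new_child(dict(bindings))  # new dict on top of the stack
    return command(temp)                  # `temp` is discarded after the call


def show_pythonpath(e):
    """Stand-in for a child process that reads PYTHONPATH."""
    return e["PYTHONPATH"]


# Temporary binding: visible to the command only.
during = run_with_bindings(env, {"PYTHONPATH": "foo"}, show_pythonpath)
after = env["PYTHONPATH"]

# Permanent change: writes to the existing top dict, like a global assignment.
env["PYTHONPATH"] = "lib"

print(during, after, env["PYTHONPATH"])
```

The temporary binding never touches the underlying dict, which is why the POSIX-style `PYTHONPATH=foo ./foo.py` form could stay unchanged.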

YSH "here word"

This code now behaves as you expect:

read <<< '''
1
2
3
'''

It has three lines, not four! Adding an extra \n was inherited from bash and mksh, and doesn't make sense in YSH. This is technically a breaking change.

Range operator `1 .. 3` replaced

There are now two operators that are more explicit:

Syntax	Values	Name
`1 ..< 3`	1 2	half-open range
`1 ..= 3`	1 2 3	closed range

This was implemented by Aidan, and motivated by feedback from bar-g.

Using the old .. operator will suggest ..< or ..=.
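If the two names are new to you, this small Python model mirrors the table above. It's an analogy only, not YSH code, and the function names half_open and closed are invented for the sketch.

```python
def half_open(start, end):
    """Model of `start ..< end`: the end value is excluded."""
    return list(range(start, end))


def closed(start, end):
    """Model of `start ..= end`: the end value is included."""
    return list(range(start, end + 1))


print(half_open(1, 3))  # [1, 2]
print(closed(1, 3))     # [1, 2, 3]
```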

`args.ysh` takes Type objects, not Strings

I mentioned this in the post on objects. The API now looks like:

parser (&spec) {
  flag -c --count (Int)         # no quotes around 'Int'
  flag -s --source (List[Int])  # parameterized type object
}

Related help topics: Chapter Standard Library > args

procs can be locals

Procs are now defined in the local scope. So this is valid:

proc p {
  proc inner {
    echo hi
  }
  pp (inner)  # inspect the value
}

I also removed shopt -s redefine_proc_func. It inhibits metaprogramming, and is no longer needed now that we have modules with namespaces!

Related: #language-design > Ways to generate procs dynamically: eval, runtime reflection

`eval()` and `evalExpr()` moved

They're now methods on the io Object:

eval (b) was removed, in favor of call io->eval(b)
- You probably haven't used this yet. We're using it to write the standard library.
call evalExpr(ex) was removed, in favor of call io->evalExpr(ex)

Why are they both methods on io? I updated the Oils reference with examples of expressions that have effects:

var e1 = ^[ myplace->setValue(42) ]  # memory operation
var e2 = ^[ $(echo 42 > hi) ]        # I/O operation

Blocks and Control Flow

You can now use break, continue, and return to leave a loop when you're inside a block.

This is technically a breaking change: we used to break from the block, not the surrounding loop. This was issue 2039.

Discussion: #language-design > Return within context (cd { }, ...). Thanks to Julian for the feedback.

Deprecations

`fopen` builtin -> `redir`

I renamed the fopen builtin to redir, based on this use case:

redir 2>&1 {
  call io->eval(b)
} | wc -l

fopen is retained for backward compatibility, but will be removed eventually. (I think Samuel mentioned once that fopen is not the best name.)

Note: Right now, you can't write

call io->eval (b) 2>&1

Instead, you have to use the redir { } block. This is a known parsing issue: #language-design > Parsing issue with commands that end with expressions

More YSH Changes

Operators, Integers

The identity operators a is b and a is not b can now compare values of different types.
- Will hit this problem in args.ysh
Fixed the ~== operator to accept strings that look like negative numbers.

Int Conversion

In arithmetic ops like x + y, YSH has always converted strings to integers:

var x = '42'      # string, not integer
var sum = x + 1   # integer 43

To be consistent, we now do the same thing for List indices:

var s = mylist[x]         # get value at index 42
setvar mylist[x] = '9'    # set value at index 42
setvar mylist[x] += 5     # increment value at index 42

And for the operands to Slice and Range. That is, a[x:y] and x ..< y now have the same rule that mylist[x] and x + y do.

This came up a few times when writing YSH: #language-design > Lessons learned writing YSH code

Int Overflow

Fixed overflow in printf - issue 2107
Check integer overflow elsewhere: shell arithmetic, trap ulimit, YSH, ...

This is something that other shells don't do! They silently overflow, which means that their behavior depends on the underlying C compiler and platform. We still have more work to do here, but the plan is for all integer ops in YSH to be well-defined.

Added `test --true` and `test --false`

This is a nicer way to combine commands and expressions in conditionals.

if test --file $name && test --true $[myfunc(name)] {
  echo yes
}

Added entry to FAQ: How Do I Combine Commands and Expressions?

This feature was based on feedback from Will Clardy and Julian Brown.

Aidan also added hints that detect when you use || instead of or, or && instead of and. (They use a simple "over-lexing" strategy.)

Note: some of our most common feedback shows that the distinction between YSH commands vs. expressions is not always natural. That is, mixing shell and Python/JS is natural for some people, but not for others. We'll continue to work on this issue.

Misc Bugs and Fixes

Disallow test (42) - thanks to Will Clardy for finding this
Fixed bug where return [x] was allowed; the right syntax is return (x)
- Reported by Samuel

Removed special case for pp [x] -- it's now pp (x)
- The special case only made sense for assert [42 === x]. Compared with assert (42 === x), the square brackets mean that the unevaluated expression can be "destructured" and inspected.

Disabled the check for dangling value.Place
- It's more complex to implement with modules, and isn't strictly necessary.
- Will hit this in args.ysh
OSH relies less on the user-facing $PWD variable
- Fix divergence from other shells: cd no longer depends on $PWD
- The \w prompt variable no longer depends on $PWD

Builtin Functions and Methods

The str() function now accepts the types Null, Bool, and Eggex.
- This is now consistent with the operators $[stringify] and @[stringify_each_elem]
Added setVar() for "dynamic binding"

Improvements by Aidan:

The Str.split() method now accepts an eggex.
Str.replace() fix: avoid infinite loop on match of zero length, like we do in Str.split()
- This semantic is NOT agreed upon by Python and JS -- they have complex behavior

New vm object:

vm.id(obj) function for value identity, similar to Python
- Right now, it works on mutable values like those of type List Dict Obj. This is because values of type Bool Int Float Str may not be managed by the GC.
vm.getFrame(0) to retrieve a value of type Frame

Now we can pretty print the globals in both OSH and YSH:

osh$ = dict(vm.getFrame(0))

Try it! This is part of

#language-design > Core Language Design / Reflection

Now let's talk about the themes in the title: Closures, Objects, and Namespaces.

Closures

Why does YSH need closures?

I mentioned the Hay example from Aidan in the appendix to Why Should a Unix Shell have Objects?

We ran into more use cases:

Unevaluated string templates should be closures.
- For example, the template argument to Str.replace() looks like ^"match = $first". It's a value of type Expr.
- The first variable should be captured, and now is.
Expression arguments to procs should be closures.
- The argument to where in my-ls | where [size > max] is also a value of type Expr.
- The max variable should be captured, and now is.
- I did a little demo of this for Will, it works!

Background / Definitions

The next sections might be clearer if I point out that there are many ways of talking about the same thing:

Blocks/expressions are closures
Blocks/expressions obey lexical scope
You can also call it static scope, as opposed to dynamic scope
- It's "static" because if you look at a reference to a variable, you can tell what it refers to by looking at the source code (statically). You don't have to run it (dynamically) to figure this out.
Blocks/expressions can reference variables from "outside" their definition, regardless of where they are evaluated
Blocks/expressions have a reference to the stack frame they're created in
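Here is the same idea in Python, which has had lexical scope for nested functions for a long time. This is an analogy rather than YSH code, and the function names are invented for the sketch.

```python
def make_block():
    x = 42

    def block():
        # `x` refers to the variable in make_block(), where the block was defined.
        return x

    return block


def evaluate_elsewhere(block):
    x = "caller's x"  # a different variable that happens to share the name
    return block()    # under dynamic scope, this would see "caller's x"


result = evaluate_elsewhere(make_block())
print(result)  # 42
```

You can tell which `x` the block uses by reading the source, without knowing who ends up calling it.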

Now let's see what's changed.

`Command` values Are Closures

Procs can take block arguments, which are denoted by { }, and they are of type Command:

var x = 42
myproc {
  echo $x   # x refers to the variable above
}

Reminder: this is how you write a block expression that's not an argument:

var x = 42
var myblock = ^(echo $x)  # whenever this block is evaluated,
                          # x refers to the variable above

(The ^(echo $x) syntax is similar to $(echo $x).)

But NOT Block Args to Builtins

This is perhaps a bit confusing:

Block args to builtins like cd are not of type Command.
Instead, they're of type CommandFrag ("unbound").

So they are not closures. This is because we want to be able to reference variables created in the block later:

cd /tmp {
  var listing = $(ls -x -y -z)
}
echo $listing  # should refer to the variable in the block

I also think of cd like an "inline proc", in that invoking it doesn't push and pop a new stack frame. There may be a way to resolve this inconsistency, or we can just live with it. Again, feedback is welcome.

#language-design > how many things does { } mean? scopes

`Expr` values are Closures

Similarly, expressions are closures:

var x = 42

var e1 = ^[x + 1]   # value of type Expr

var e2 = ^"x = $x"  # another value of type Expr

p [x + 1]           # another one, equivalent to:
p (^[x + 1])

Another related design note:

#language-design > value.{Command,Expr} now have lexical scope, but no args

TODO on Closures

What still needs to be done?

Like blocks/commands and expressions, procs and funcs should be closures.
- I was also wondering if we need to unify types Command + Proc, and Expr + Func. Whether we do that depends on use cases like Awk and Hay.
Procs and blocks may be unified. That is, blocks could be procs without args.
- I've been looking at Ruby's reflective APIs, and I think we compare favorably! YSH is simpler in some ways, and I think we can simplify even more.
Each iteration of a YSH loop should introduce a new "enclosing frame"
- #language-design > for loop that introduces new binding - closures in a loop
- #language-design > shopt --set for_loop_frames
- We looked at languages which changed their minds about this: C#, Go, and I think Lua. They made breaking changes, which is strong evidence that we should adopt the newer behavior.

For example:

for x in a b c {
  myproc { echo $x }            # x should be captured!
  when [size > x] { echo big }  # ditto
}
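Python still has the older behavior, so it's a convenient way to see the problem. The sketch below is Python, not YSH: the first loop shares one binding of x across iterations, and the helper make_closure (a name made up here) simulates a new enclosing frame per iteration.

```python
# One shared binding: every closure sees the final value of x.
shared = []
for x in ["a", "b", "c"]:
    shared.append(lambda: x)
shared_results = [f() for f in shared]


# A fresh enclosing frame per iteration: each closure keeps its own x.
def make_closure(x):
    return lambda: x


per_iteration = [make_closure(x) for x in ["a", "b", "c"]]
per_iteration_results = [f() for f in per_iteration]

print(shared_results)         # ['c', 'c', 'c']
print(per_iteration_results)  # ['a', 'b', 'c']
```

The second result is what most people expect, which is the argument for giving each loop iteration its own frame.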

It's worth mentioning that the material on closures in Crafting Interpreters was very helpful. This book helped us with garbage collection, hash tables (e.g. deletion/tombstones), and closures!

Objects

Now let's talk about objects. Objects and closures are both ways of bundling code and data.

Languages like Python, JavaScript, Lua, and Ruby all have both objects and closures.

Obj API

I showed the new API in the objects post:

var obj = Obj.new({x: 42}, null)
var mydict = first(obj)
var parent = rest(obj)

I would like this shorter API:

var obj = Obj({x: 42}, null)   # no .new

But that requires the special __call__ method, which we don't have yet.

Related chapters in the reference:
- ref/chap-type-method.html
- ref/chap-builtin-func.html
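As a mental model for first() and rest(), here is a toy version in Python. It assumes attribute lookup walks from an object to its parent, in the style of JavaScript and Lua prototypes. The class and the lookup helper are hypothetical and are not the Oils implementation.

```python
class Obj:
    """Toy model: a dict of properties plus a link to a parent Obj (or None)."""

    def __init__(self, props, parent=None):
        self.props = props
        self.parent = parent


def first(obj):
    """The object's own properties."""
    return obj.props


def rest(obj):
    """The parent object, or None at the end of the chain."""
    return obj.parent


def lookup(obj, name):
    """Walk the chain from obj upward; the first dict that has `name` wins."""
    while obj is not None:
        if name in first(obj):
            return first(obj)[name]
        obj = rest(obj)
    raise KeyError(name)


base = Obj({"greeting": "hi", "x": 1}, None)
obj = Obj({"x": 42}, base)

print(lookup(obj, "x"))         # 42, found on obj itself
print(lookup(obj, "greeting"))  # hi, found on the parent
```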

`__invoke__` - Objects can be invoked like procs

You can now invoke objects with the same syntax as procs:

my-object arg1 arg2

You do this by giving them an __invoke__ meta-method. Docs:

Guide to Procs and Funcs > __invoke__ method

At first, this was motivated by the use case of generating procs dynamically, which Julian asked about. We had solutions based on:

eval $mystr
parseCommand() and then io->eval()

And then I decided to experiment with invokable objects. It then played a crucial role in the implementation of modules:

my-module my-proc

So it's here to stay. I anticipate many more uses of it:

It may replace the ctx builtin, used in args.ysh
Aidan also just used it for a prototype of Markaby-style HTML generation.

Type expressions like `List[Str]`

I created type objects like List and Dict, and defined the [ operator on them.

So now List[Str] and Dict[Str, Int] evaluate to singleton objects. This was for the args.ysh use case, mentioned above, and discussed in the objects post.
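One way to picture "singleton" type objects is a [ operator that caches its results, so the same parameters always give back the same object. The Python sketch below shows that idea under my own assumptions. TypeObj and its cache are invented for illustration and don't describe how YSH does it internally.

```python
class TypeObj:
    """Toy type object whose [] operator returns a cached, parameterized type."""

    def __init__(self, name):
        self.name = name
        self._cache = {}

    def __getitem__(self, params):
        # T[A, B] arrives as a tuple; T[A] arrives as a single object.
        if not isinstance(params, tuple):
            params = (params,)
        if params not in self._cache:
            inner = ", ".join(p.name for p in params)
            self._cache[params] = TypeObj(f"{self.name}[{inner}]")
        return self._cache[params]


Str, Int = TypeObj("Str"), TypeObj("Int")
List, Dict = TypeObj("List"), TypeObj("Dict")

print(List[Str].name)            # List[Str]
print(Dict[Str, Int].name)       # Dict[Str, Int]
print(List[Str] is List[Str])    # True
```

With singletons, a library like args.ysh can compare type objects by identity instead of comparing strings.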

Namespaces - "I learned Python with the `dir()` function"

I use this slogan to explain the motivation.

I want users to be able to discover shell by typing — by interacting with the interpreter. Not by reading the manual!

Let's see what changed.

Ongoing Reorganization

Breaking change: I added shopt --set no_init_globals, which means that YSH doesn't initialize certain globals, like SHELLOPTS. This is part of organizing globals into namespaces, which is still ongoing. Feedback is welcome.

`__builtins__` object

We moved functions like len() and types like Float to a __builtins__ object. It serves the same purpose as __builtins__ in Python.

Example:

ysh$ = len
<BuiltinFunc 0x7fa1e1842f50>

ysh$ = __builtins__
(Obj)   <Obj 0x7fa1e1970d20>

ysh$ = __builtins__.len
<BuiltinFunc 0x7fa1e1842f50>

In YSH, a typical variable lookup now has three steps:

Look in locals
Look in globals
Look in __builtins__

So builtins no longer pollute the global namespace.
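The three steps can be sketched in a few lines of Python. This is a simplified model with plain dicts standing in for each namespace. The lookup function and the sample values are hypothetical, not the interpreter's actual code.

```python
def lookup(name, local_vars, global_vars, builtins):
    """Return the value bound to `name`: locals first, then globals, then builtins."""
    for scope in (local_vars, global_vars, builtins):
        if name in scope:
            return scope[name]
    raise NameError(name)


builtins_ns = {"len": len}
globals_ns = {"x": 1}
locals_ns = {"x": 2}

print(lookup("x", locals_ns, globals_ns, builtins_ns))         # 2, the local wins
print(lookup("len", locals_ns, globals_ns, builtins_ns)("ab"))  # 2, found in builtins
print("len" in globals_ns)                                      # False
```

Because len lives only in the last dict, a user can define a global named len without losing access to the original through the builtins namespace.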

`__defaults__` object, consulted after `ENV`

For example, we have __defaults__.PATH and __defaults__.PS1.

`keys() values() get()` are Free Functions

We used to have d => keys(), but now it's just keys(d).

Why? Method calls are now obj.method(), not obj => method(). And this causes a conflict for Dict, which supports mydict.attr.

The => syntax is for function chaining, though it's still allowed for method calls.

Modules

I demonstrated modules in the objects post. We added them because we use them in the YSH standard library!

Python-like modules are nice and convenient! (Both JavaScript and Lua lacked modules for a long time, and later added them.)

OSH Compatibility

read -u properly fails as unimplemented
- This was the cause of the confusion in #help-wanted > psaux.bash doesn't run under OSH
- Thanks to meithecatte for figuring this out!
Added shopt -s ignore_shopt_not_impl
- By default, we no longer ignore unimplemented options. This was done a long time ago to "get past" errors. Now you can opt into it.
- Compatibility fix while I was in there: shopt -p can exit non-zero, like bash

Interactive Shell

Fixed issue 2108 - Ctrl-C causes interrupted system call, when GNU readline is not present
Fixed redundant YSH prompt label when $PS1 isn't set

When $PS1 is not set, this is the default prompt:

ysh-0.23.0$

When it is, we want OSH versus YSH to look like this:

currentdir$
ysh currentdir$

What's Next?

A couple days ago, I announced that we're (finally) moving to the oils.pub domain. This is actually the last post on oilshell.org! I put it here because the 0.24.0 tarball is also published on this domain.

In that post, I gave a sense for what's in Oils 0.25.0, which is already released: bash compatibility, and "under the hood" improvements to our metalanguages.

I published a skeleton for a Vim syntax plugin:

https://github.com/oils-for-unix/oils.vim

It needs to be fleshed out, and I want to make it easy to write syntax highlighters for SublimeText, TreeSitter, Helix, and more.

So I expect that the experience of finishing the Vim plugin will feed back into the YSH language design! We can make the syntax simpler, mainly by disallowing legacy shell syntax:

#language-design > Let's make shell easy to lex

Here's some brainstorming for the rest of 2025:

#oil-dev > Four "scrubbing passes" before YSH 1.0?

Let me know what you think in the comments. Happy new year!

Appendix: Selected Closed Issues

Some of these issues are mentioned above, and some are not.

#2118	strict_errexit message missing code location
#2114	printf errors can cause status 1, rather than being fatal
#2110	bug in old version of dash shell causes _build/oils.sh to start too many compilers in parallel
#2108	Ctrl-C causes Interrupted system call
#2107	printf crashes with ValueError when integers are large
#2104	Crash with setvar on out-of-bounds list index
#2096	ysh breaking: Replace 1 .. 5 range syntax with 1 ..< 5 half open and 1 ..= 5 closed range
#2094	allow && || in YSH conditions and add test --true --false
#2080	install script may not match what distros want - Void Linux, stripped binary, binary location when cross-compiling, etc.
#2078	Crash with dict literal
#2074	members of context managers are uninitialized and rooted
#2055	Trap does not check for the first argument being an unsigned integer
#2039	executing blocks that contain return/break/continue/error is inconsistent with eval on strings

Appendix: Metrics for the 0.24.0 Release

These metrics help me keep track of the project. Let's compare this release with the previous one, version 0.23.0.

Docs

Doc Metrics for 0.23.0 - 358 topics with first pass, 402 marked implemented, 428 unique
Doc Metrics for 0.24.0 - 399 topics with first pass, 437 marked implemented, 469 unique
- Reaching 399 topics met one of our NLnet grant objectives.

Spec Tests

OSH continues to make progress, with 20 more tests passing:

OSH spec tests for 0.23.0: 2296 tests, 2050 passing, 96 failing
OSH spec tests for 0.24.0: 2322 tests, 2070 passing, 102 failing

Everything works in fast C++, even though we write typed Python:

C++ spec tests for 0.23.0 - 2055 of 2055 passing - delta zero
C++ spec tests for 0.24.0 - 2075 of 2075 passing - delta zero
- Note: there is the same vars-special error as in the last release, which seems to be an artifact of the test harness. I just fixed it.

YSH made more progress, with 87 new tests and 83 more passing:

YSH spec tests for 0.23.0: 913 tests, 865 passing, 48 failing
YSH spec tests for 0.24.0: 1000 tests, 948 passing, 52 failing

Likewise, everything still works in C++:

YSH C++ spec tests for 0.23.0: 867 of 865 passing, delta negative 2
YSH C++ spec tests for 0.24.0: 948 of 948 passing, delta zero
- Python now emulates the integer overflow behavior of C++, so we no longer have an awkward negative delta.

Benchmarks

I don't recall why the parser got faster:

Parser Performance for 0.23.0: 12.5 thousand irefs per line
Parser Performance for 0.24.0: 11.9 thousand irefs per line

Oils is generally getting faster, which is good!

Not much change in parser memory usage:

benchmarks/gc for 0.23.0: parse.configure-coreutils 1.65 M objects comprising 41.1 MB, max RSS 46.6 MB
benchmarks/gc for 0.24.0: parse.configure-coreutils 1.65 M objects comprising 41.8 MB, max RSS 47.5 MB

We got faster on a compute-bound workload:

benchmarks/gc-cachegrind for 0.23.0 - fib takes 27.6 million irefs, mut+alloc+free+gc
benchmarks/gc-cachegrind for 0.24.0 - fib takes 25.8 million irefs, mut+alloc+free+gc

I think this improvement was due to removing a duplicate hash lookup in Python, e.g. if x in dict: foo = d[x]. (These release notes aren't always complete, and sometimes the benchmarks remind me of improvements we made!)
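For readers who haven't seen this pattern, here is a generic Python illustration of the double lookup and one way to avoid it. The function names are made up, and this is not the actual change in the Oils source.

```python
def lookup_twice(d, x):
    # Two hash lookups on a hit: one for `in`, one for the subscript.
    if x in d:
        return d[x]
    return None


def lookup_once(d, x):
    # One hash lookup; assumes None is never stored as a value in d.
    return d.get(x)


table = {"fib": 1, "fact": 2}
print(lookup_twice(table, "fib"), lookup_once(table, "fib"))          # 1 1
print(lookup_twice(table, "missing"), lookup_once(table, "missing"))  # None None
```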

No change on an I/O-bound workload:

Runtime Performance for 0.23.0: 12.0 and 21.6 seconds running CPython's configure
Runtime Performance for 0.24.0: 12.4 and 21.7 seconds running CPython's configure
- bash: 13.5 and 20.2 seconds running CPython's configure

Again, our measurements have noise when comparing OSH to bash:

0.92x - 1.07x on configure.cpython
1.04x - 1.06x on configure.util-linux

But it's a good sign that, compared with a couple releases ago, our worst numbers are getting closer to bash.

Code Size

YSH has the biggest delta in lines of code, but it's still small:

cloc for 0.23.0: 23,224 significant lines in OSH, 5,355 in YSH, 1,144 in data languages, 5,955 lines of hand-written C++
cloc for 0.24.0: 23,501 significant lines in OSH, 6,208 in YSH, 1,138 in data languages, 5,990 lines of hand-written C++

And generated C++:

oils-cpp for 0.23.0 - 122,093 physical lines
oils-cpp for 0.24.0 - 180,912 - 55,082 = 125,830 physical lines (accounting for new souffle source file)

And compiled binary size:

ovm-build for 0.23.0: 2.26 MB of native code (hoover, under GCC, on Debian 12)
- but 2.39 MB on mercer, an older machine
ovm-build for 0.24.0: 2.33 MB of native code (hoover, under GCC, on Debian 12)