Oils 0.16.0 - Breaking Renames and YSH

2023-06-24

This is the latest version of Oils, a Unix shell that's our upgrade path from bash:

Oils version 0.16.0 - Source tarballs and documentation.

We're moving toward the fast C++ implementation, so there are two tarballs:

If you're new to the project, see Why Create a New Shell? and posts tagged #FAQ.

Table of Contents
Summary
Some Writing on YSH for Everyone
Docs Updated
Contributions
Closed Issues
Reminders
YSH Discourages eval misusage - acme.sh vulnerability
Headless Shell Screenshots
What's Next?
"Month of Docs"
Appendix: Metrics for the 0.16.0 Release
Spec Tests
Benchmarks
Code Size

Summary

As promised in March, this release tries to "break everything at once", rather than spreading out the pain. There are two kinds of breakages:

  1. The big renaming to Oils, OSH, and YSH
  2. Changing YSH based on what we've learned.

What else changed?

Some Writing on YSH for Everyone

The bugs below list YSH breakages in detail, but these higher-level posts will probably be more interesting. I started writing one post before this announcement, but it ended up as five!

  1. Reviewing YSH - History, and overview of 7 parts of the language.

  2. Sketches of YSH Features - Concrete descriptions and proposals.

    The third post was abstract and hard to write, so I "forked" these two posts:

  3. How to Create a UTF-16 Surrogate Pair by Hand, with Python - Relates to our design for strings.

  4. Narrow Waists Can Be Interior or Exterior: PyObject vs. Unix Files - Useful terminology.

    And finally:

  5. Oils Is Exterior-First (Code, Text, and Structured Data) - Fundamental ideas behind the language.

This series, tagged #ysh, is roughly our "design roadmap". I'm looking for feedback from contributors, and also casual readers.

Docs Updated

Contributions

Thanks to everyone who contributed code, tested Oils, and sent feedback! The project has grown larger, and wouldn't be possible without help.

Overall, the "big parser refactoring" has gone very well. We want a stable lossless syntax tree, which will help packaging tools like resholve, shell GUIs, and more. GUIs can use Oils via the headless mode interface.

Great testing and feedback:

I almost certainly missed someone here, so please leave a comment and I'll update this post. Thanks again!

Closed Issues

Here are the details:

#1649 Unquoted array literal syntax from %( foo *.py ) to :| foo *.py |
#1648 change proc syntax to match upcoming func syntax
#1640 bug: abort on special characters following '!'
#1639 Eggex should disallow $x ${x} and allow @x
#1636 Change mydict->key to mydict.key
#1634 YSH echo $x should always be correct, disallow -e -n
#1629 `osh -c 'read -d :'` fails in the C++ osh (not in the Python osh)
#1628 Parsing options like 'shopt -u expand_aliases' shouldn't be restricted upon 'source'
#1627 All types should have bool(x) , 'foo' or 'bar' should work
#1625 Tilde expansion: word vs. expression mode
#1624 Respect YSH_HISTFILE for bin/ysh
#1623 move default history from ~/.config/oils to ~/.local/share/oils
#1622 breaking change: rename rc files ~/.config/oils/oshrc and yshrc
#1621 breaking change: rename env vars OSH_* and OIL_* -> OILS_*
#1618 [[ -c /dev/null ]] fails in osh-cpp
#1615 oils native build: ld returned 1 exit status
#1608 try builtin shouldn't disallow command subs, i.e. with strict_errexit
#1607 `trap - SIGINT` behaves differently than `bash`
#1605 YSH echo should allow multiple args
#1578 crash when pyos.GetHomeDir() returns None
#1274 Maybe remove inline function calls @split(x) and $join(y), use expression sub
#983 Idea for enhanced case statement
#812 Fix leak of lines/spans in Arena (new Token/Line/Source representation)

Reminders

YSH Discourages eval misusage - acme.sh vulnerability

The last post mentioned this vulnerability in a big shell script:

Specifically, the acme.sh client for updating SSL certificates was exploitable by servers, executing arbitrary shell code specified by the server.

What I didn't mention is that YSH discourages the bug! If you look at the commit that removed the remote code execution:

They used

eval "$@"  # wrong, extra layer of evaluation of arguments

instead of

"$@"  # correct

I understand why this is confusing — the "$@" feels like it's "dangling". Doesn't it need a "verb"? It also looks like a string substitution "$x", but it's really an "array splice" operation.
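
To make the difference concrete, here's a minimal sketch in plain POSIX shell. The two wrapper functions and the `untrusted` string are hypothetical stand-ins, not the actual acme.sh code; they only illustrate how eval re-parses its arguments while a bare "$@" does not.

```sh
# Hypothetical wrappers for illustration; NOT the actual acme.sh code.

# Vulnerable: eval joins its arguments with spaces and parses the result
# as shell code, so data inside an argument can become a command.
log_and_run_eval() {
  eval "$@"
}

# Safe: "$@" splices the arguments in as a command name plus arguments,
# with no second round of parsing.
log_and_run_splice() {
  "$@"
}

# Stand-in for a string controlled by a remote server.
untrusted='hello; echo INJECTED'

out_eval=$(log_and_run_eval echo "$untrusted")
out_splice=$(log_and_run_splice echo "$untrusted")

printf 'eval:   %s\n' "$out_eval"      # two lines: "hello", then "INJECTED"
printf 'splice: %s\n' "$out_splice"    # one line: "hello; echo INJECTED"
```

With eval, the semicolon inside the data is treated as a command separator, so the second echo runs. With the splice, the whole string stays a single argument to echo.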

When you use YSH, shopt --set simple_eval_builtin restricts eval to one argument:

ysh$ set -- 1 2
ysh$ eval "$@"
  eval "$@"
  ^~~~
[ interactive ]:2: 'eval' requires exactly 1 argument

This isn't a perfect mitigation, but it's a strong signal that you're using eval incorrectly. In this case, the vulnerable logging wrappers would not have worked.
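
If you're still on bash or another POSIX shell, a rough approximation of the same check is a small wrapper function. This is only a sketch: `strict_eval` is a hypothetical name, not something bash or Oils provides, and it only mimics the argument-count rule shown above.

```sh
# strict_eval is a hypothetical helper, not part of bash or Oils.
# It refuses to run unless it gets exactly one argument.
strict_eval() {
  if [ "$#" -ne 1 ]; then
    echo "strict_eval: requires exactly 1 argument, got $#" >&2
    return 2
  fi
  eval "$1"
}

strict_eval 'greeting=hi'    # a single string of code: allowed
echo "greeting=$greeting"

set -- echo hi
strict_eval "$@" || echo "rejected with status $?"   # two arguments: refused
```

Like the YSH option, this doesn't make eval safe. A single untrusted string is still executed as code, so the wrapper only catches the "extra layer of evaluation" mistake.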

See other posts tagged #real-problems for things YSH protects you from.

Language Design Note

To avoid the "dangling" array, I also think YSH should have a run wrapper to "pass through" an array:

"$@"          # correct
@ARGV         # YSH style

run -- "$@"   # same thing
run -- @ARGV  # ditto   

Run can also take flags to limit the lookup to certain forms, behaving a bit like command and builtin:

run --extern ls
run --builtin echo
run --proc myproc

Headless Shell Screenshots

If you're interested in creating a GUI for a shell, please join #shell-gui on Zulip.

I mentioned this in the last release, and there have been a few updates. I tested it with the oils-for-unix C++ tarball, and added tests to the CI so it doesn't regress.

I tested out Subhav's web_shell demo, which has a Go client for the headless shell:

Screenshot:

Web shell demo

What's happening here?

  1. We have a Go server that receives input from an HTML form.
  2. The server passes input to its child process osh --headless over a Unix domain socket.
  3. osh executes the shell string with file descriptors provided by the Go server.
  4. The Go server reads from those file descriptors, and constructs an HTTP response.

This is the FANOS protocol, somewhat documented at Oils Headless Mode: For Alternative UIs.


Question: Which GUI toolkit should we write a client-side demo with? I think some sample code with PyQt would be nice.

What's Next?

I want to get heads down into implementing YSH, with the help of contributors. I think we've made all major design decisions.

But there's more writing to do. I call it the "month of docs", and it looks like it will take more than a month!

"Month of Docs"

After that, I hope we'll be "home free" to work on YSH! Though there are always things that pop up, and more good ideas on the Oils 2023 Roadmap.

Appendix: Metrics for the 0.16.0 Release

These metrics help me keep track of the project. Let's compare this release with the previous one, version 0.15.0 from May.

Spec Tests

We implemented more features in Python:

There was a slight regression in the C++ numbers:

Progress on the YSH design:

Some C++ tests passed automatically, and some didn't:

When we get further on the YSH translation, more work should come "for free".

Benchmarks

The "big parser refactoring" worked! Thanks again to Aidan for doing a big chunk of this.

This was due to removing the big Arena.tokens list, which was a memory leak, as well as removing a dead StrFormat() call in an inner loop.

The leak was an old bug: issue #812 mentioned above. (Interestingly, I noticed that Crafting Interpreters also keeps a big array of tokens. Once you reach problems of our size, it isn't good for performance!)

The parser refactoring also reduced memory usage (max RSS):

The stable benchmarks reflect the same improvement:

We still have to close this gap, running a hard workload:

Code Size

We reformatted the whole codebase with yapf, making it use 4-space indents! The import statements at the top of each file became longer.

Source code we ship in the tarball:

The compiled code didn't get larger: