This is the latest version of Oils, a Unix shell. It's our upgrade path from bash to a better language and runtime:
Oils version 0.20.0 - Source tarballs and documentation
We're moving toward the fast C++ implementation, so there are two tarballs:
- See INSTALL.txt in oil-*.tar.gz.
- See README-native.txt in oils-for-unix-*.tar.gz.
If you're new to the project, see the Oils 2023 FAQ and posts tagged #FAQ.
This is a big release!
Before describing those features in detail, let's review contributions.
Thanks for responding to last month's call for contributions in Oils 0.19.0! We can still use more help. I will also mention improvements to the dev process later in this post.
Adam Bannister:
- $SECONDS (a bash feature)
- echo builtin (as OSH already did)
Matthew Davidson:
- command -p
Samuel Hierholzer:
- {get,set}pwent() libc functions at ./configure time, which are missing on Android.
- List => indexOf() method
- Dict => values() method
- [[ -v expr ]] in Known Differences
- List
Aidan Olsen:
- ^"hi $x" as syntactic sugar for ^["hi $x"] (an unevaluated value.Expr)
- Str => replace(), a nice API that I'll say more about below
- reg_icase aka i
Doc fixes:
Testing the shell. This work is as important as, or even more important than, code contributions:
You can also view the full changelog.
Samuel tried doing Advent of Code in YSH, which revealed that the Eggex API wasn't done. So I took a couple weeks to improve it, doing "doc-driven development" with this new doc:
Please try it, and let us know what you think! Our goal is for YSH to be as convenient as Perl, and as powerful as Python.
This possible tweak could make it more convenient, at the cost of being more implicit:
var date_eggex = / <capture d+ as year> '/' <capture d+ as month>/
if (date_eggex ~ '2024/02') {
  echo $[_group('month')]  # => 02
                           # this current way is a bit long
  echo $month              # so it could become a var?
}
Aidan followed up by implementing Str => replace(), which turned out very nicely with YSH reflection. We reuse shell's existing string interpolation, rather than creating a new mini-language, as Python and JavaScript do:
var s = '2024/02'
# looks like string literal
var t = s => replace(date_eggex, ^"$month-$year")
echo $t # => 02-2024
- Python uses '\g<month>-\g<year>', which is different than Python's f'{month}-{year}' strings.
- JavaScript uses '$<month>-$<year>', which is different than template literals like `${month}-${year}`.
YSH is simpler because it avoids the needless syntax and parsing. An expression like ^"hi $x" is the unevaluated form of "hi $x", which we call a value.Expr.
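For comparison, here is a minimal Python sketch of the same substitution. It is only an analogue of the YSH example: the regex is a hand translation of date_eggex, and the helper name swap_date is made up for illustration. It shows the separate replacement mini-language that Python needs for this job.

```python
import re

# Hand-translated Python analogue of date_eggex (an assumption for
# illustration, not generated by Oils).
date_re = re.compile(r'(?P<year>\d+)/(?P<month>\d+)')


def swap_date(s: str) -> str:
    # '\g<name>' is the replacement-template syntax of re.sub(). It is a
    # different notation from f-strings such as f'{month}-{year}'.
    return date_re.sub(r'\g<month>-\g<year>', s)


print(swap_date('2024/02'))  # 02-2024
```

The template r'\g<month>-\g<year>' is parsed by the re module, not by Python's string interpolation, which is the duplication that ^"$month-$year" avoids.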
In addition to the API doc above, there are new help topics in the Oils reference, like the one for Str => replace.
Other Eggex changes:
- Flags like /[a-z]; ignorecase/ are supported in s ~ pat and case
- Renamed _match() to _group(), to be consistent with m => group()
- _group() raises an error when the group number is out of range
- null for uncaptured groups
- Removed _group() as a synonym for _group(0), which was inspired by Python. We prefer to be explicit.
I rewrote and replaced the JSON library, which has 2 big benefits:
I want to explain the design and motivation for J8 in many different ways. But right now, the important message is that it's 100% backward compatible with JSON, and looks familiar:
# J8-style string, which can co-exist with JSON strings
u'hi 🙂 \u{1F642}'
You can use JSON and J8 notation with the existing builtin commands:
json read < myfile # sets _reply var
json write (obj) # if a string has binary, this is lossy
json8 read < myfile
json8 write (obj) # able to losslessly encode binary
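To see why plain JSON is lossy for binary data, here is a rough Python sketch. It is an analogy rather than the Oils implementation, and it assumes that undecodable bytes are replaced with U+FFFD before encoding; the details of what json write does may differ.

```python
import json

raw = b'foo\n \xff'  # 0xff can never appear in valid UTF-8

# Assumption for illustration: undecodable bytes become U+FFFD.
text = raw.decode('utf-8', errors='replace')
encoded = json.dumps(text)

# Decoding gives back the replacement character, not the original byte.
round_tripped = json.loads(encoded).encode('utf-8')

print(encoded)
print(round_tripped == raw)  # False
```

JSON strings can only hold Unicode text, so the original byte has nowhere to go. That is the gap the \yff escape in J8 strings is meant to fill.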
Or you can use these new functions:
= toJson({x: 42})
= fromJson('{}')
= toJ8([5, 6])
= fromJ8('[5, 6]')
(It now occurs to me that these functions should be called toJson8() and fromJson8(). Sorry, there are still breakages to come.)
You no longer need bash's C-escaped strings, which look like $'foo\n', in YSH code. The $ sigil is confusing because it's unrelated to string substitution, and the syntax has other legacy.
Instead, we encourage J8-style strings in source code, which are identical to the format that json8 read accepts:
var x = u'foo\n' # valid unicode
var y = b'foo\n \yff' # can also contain binary \yff escapes
So this part of J8 Notation can be used in both code and data! (The Shape of Data is a good post on this topic.)
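As a rough illustration of how a b'' string can carry arbitrary bytes, here is a simplified Python sketch of an encoder. The function name to_b_string and the exact set of escapes are assumptions made for this example. It is not the Oils printer and does not cover the full J8 Notation.

```python
def to_b_string(data: bytes) -> str:
    """Encode bytes as a b'' style string (simplified, hypothetical)."""
    # surrogateescape maps each undecodable byte 0xXX to U+DCXX, so the
    # valid UTF-8 parts stay readable and the rest can be recovered.
    text = data.decode('utf-8', errors='surrogateescape')
    simple = {'\\': '\\\\', "'": "\\'", '\n': '\\n', '\t': '\\t', '\r': '\\r'}
    out = []
    for ch in text:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            # A byte that was not valid UTF-8: emit a \yHH escape.
            out.append('\\y%02x' % (code - 0xDC00))
        elif ch in simple:
            out.append(simple[ch])
        elif code < 0x20:
            # Other control bytes are also written as \yHH here.
            out.append('\\y%02x' % code)
        else:
            out.append(ch)
    return "b'" + ''.join(out) + "'"


print(to_b_string(b'foo\n \xff'))  # b'foo\n \yff'
```

Valid UTF-8 passes through unchanged, and only the stray byte needs an escape, so the encoded text stays readable and nothing is dropped.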
Misc changes related to string notation:
- set -x uses a new shell string printer, implemented in a similar style to our J8 printer. Not a breaking change.
- osh -n now uses J8 strings. This is a debugging feature, not a stable API, but it may become stable later.
- pp asdl (myobj), which prints the ASDL "guts" of an arbitrary value
- pp line (myobj), a stable format for spec tests
The pp formats are in contrast to = myobj, which will be an even prettier format, similar to how the browser or NodeJS prints values.
These changes are breaking:
- pp proc uses J8 strings, not QSN strings.
- Removed write --qsn and read --qsn. The QSN format was an earlier iteration of J8 strings. It was almost identical, but wasn't "harmonized" with JSON.
Future work:
- null value in TSV8 is an issue.
JSON serialization involves error handling, so I enhanced YSH error handling.
- _error register, in addition to _status
- bar-g for testing and feedback.
- error builtin can be passed arbitrary properties (error.Structured in the source)
- error builtin status is 10, not 1.
Let's take a moment to reflect on how we're working. In September's release of Oils 0.18.0, I posted a job ad, seeking help with JSON serialization.
I ended up working on it mostly myself. I feel bad about that, since one of my goals is to spread knowledge of the codebase. I wrote a thread on Zulip that reflects on why:
To summarize, a big issue is that the design changed while I was implementing it. There's a big puzzle of constraints to solve, often having to do with compatibility and our Language Design Principles.
For example, the strings used to look like j"foo", but that couldn't be "harmonized" with JSON well enough. I switched from double quotes to single quotes, and added the b'' and u'' prefixes. (By the way, these prefixes were inspired by feedback from Zack Weinberg last year.)
Issues like this take tinkering and testing to figure out. Sometimes it's easier to play with Python code than to write a doc up front.
This interview with Grant Sanderson explains a similar point: sometimes it's easier to play with code than to put a design into words, especially in the early stages.
In other words, we use Python precisely because it's high-level enough to be a spec. And we have a separate C++ translation, which keeps us honest about the spec.
Other reasons I worked on it myself:
- set -x and in error messages.
To conclude, we now have a great foundation for data notation in Oils, but I still need to work on getting more people involved in the project.
We made some progress on this front. To work on Oils, you often need to install a bunch of tools like MyPy and its dependencies. This is now automated in our Soil CI:
I'll elaborate on this in another post. I still want to get rid of the requirement to install packages as root, and maybe create an online demo with services like GitPod.
I also had some package build problems on Fedora (with a sourcehut image). So if you use Fedora, and are interested in working on Oils, please reach out.
A subset of what's in this release:
#1795 | `command` built-in does not support `-p` option
#1782 | source --builtin 'stdlib/math.ysh' failed: No such builtin file
#1776 | second operator after and/or should be lazy
#1775 | str slice out of range error in native version
#1773 | Can't serialize type List_ to JSON
#1767 | echo builtin should disallow typed args
#1426 | Implement J8 Strings and shopt, for `b''` and `u''`
#1146 | Round trip of Oil data structures to text and back
#838 | JSON in oil-native
I already started making plans for the next release, Oils 0.21.0. I think we can finish the C++ translation, which has been a slightly embarrassing pain point. The result is good, but I feel like it's taken too long.
I want to batch up more breaking changes to YSH in this release. We have a plan on GitHub:
I should turn that into a blog post!
I got invited to speak on Oils to Houston Functional Programmers, online this May. I think it could be a good group to attract some contributors.
Most people wouldn't call our code functional, but we do use exhaustive reasoning with sets, via re2c and Zephyr ASDL. And there are functional idioms in both Bourne shell and YSH that I'd like to bring up.
Do you know of similar groups, with members who may have time to work on open source languages and systems? Let me know in the comments.
I've also been talking about #blog-ideas > Oils vs. Crafting Interpreters for several months. An interesting parallel is that Lox is implemented twice in the book: in Java and then in C.
Oils is also implemented twice: in typed Python and in C++!
I don't really know what these talks could look like, but there's a ton of material. The challenge would really be to cut it down to a reasonable amount of time. I could speak for hours about this project!
I continually want to remind readers what Oils is. Here are two recent slogans:
This sounds like it must be big and complex, but the Oils source code is paradoxically small. There's around 56K lines of hand-written code, which expands to 112K lines of mostly-generated C++.
I want to turn these slogans into blog posts with demos, and elaborate on how the "middle-out" style leads to short, spec-driven code. For now, see A Tour of YSH!
These metrics help me keep track of the project. Let's compare this release with the previous one, version 0.19.0.
We made reasonable progress on OSH, though we have a backlog of failing tests to fix:
The fix to disallow typed args to echo exposed a couple C++ translation errors (already fixed for the next release):
There are 74 new tests passing in YSH, due to the overhaul of both Eggex and JSON:
JSON / J8 Notation is the last major part of the C++ translation, making 79 more tests pass. This is the highlight of this release!
Not much changed in terms of performance during this release. The parser is the same speed:
And uses the same amount of memory:
- parse.configure-coreutils: 1.81 M objects comprising 63.9 MB, max RSS 68.8 MB
- parse.configure-coreutils: 1.83 M objects comprising 64.8 MB, max RSS 69.5 MB
The synthetic Fibonacci benchmark is stable:
- fib takes 33.1 million irefs, mut+alloc+free+gc
- fib takes 33.5 million irefs, mut+alloc+free+gc
I/O bound workloads remain the same speed:
configure
configure
configure
Oils is still a small program in terms of source code:
And generated C++:
The compiled binary got a bit bigger: