
Release of Oil 0.8.8

2021-03-19

This is the latest version of Oil, a Unix shell that's our upgrade path from bash:

Oil version 0.8.8 - Source tarballs and documentation.

To build and run it, follow the instructions in INSTALL.txt. The wiki has tips on How To Test OSH.

If you're new to the project, see Why Create a New Shell? and posts tagged #FAQ.

Table of Contents
Closed Issues
Commit Log
The Garbage Collector Works On A Variety of Examples
What's Next?
Appendix A: Metrics for oil-native
Spec Tests: Python vs. C++
Lines of Code (Mostly Generated)
Native Binary Size
Compilation Speed
Parsing Speed
Compute Speed
Appendix B: Other Release Metrics
Spec Tests: OSH and Oil
Lines of Source Code
Runtime Benchmarks (for the slow build)

This is the first release in almost two months! I'm writing a blog post that gives color on this (#project-updates), but let's do the usual release announcement first.

Most of the work was under the hood, with major progress on the garbage collector (previously) and the build system. I also do a thorough review of project #metrics in the appendices.

Closed Issues

There were at least two user-visible changes:

#907 "precision" for integers to zero-pad not implemented
#901 Error building on macOS

Thanks to Jason Miller for testing OSH, reporting the printf bug, and sending a patch that I built on top of.

I was forced to learn some trivia by implementing it: %06d and %6.6d both zero-pad 42 to 000042, but they're not the same! With -42, %06d counts the minus sign toward the width and prints -00042, while %6.6d always prints at least 6 digits, giving -000042. That is, the "precision" field, which means digits after the decimal point for %f, is overloaded to mean minimum digits for %d. This annoys me because I think syntax and semantics should correspond.
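
Here's a minimal sketch of the difference in C++, using std::printf, which follows the same conversion rules as the shell builtin for %d:

    // Minimal sketch: the '0' flag vs. the "precision" field for %d.
    #include <cstdio>

    int main() {
      std::printf("[%06d]\n", 42);    // [000042]  zero-padded to width 6
      std::printf("[%6.6d]\n", 42);   // [000042]  precision 6 = at least 6 digits
      std::printf("[%06d]\n", -42);   // [-00042]  the sign counts toward the width
      std::printf("[%6.6d]\n", -42);  // [-000042] 6 digits, plus the sign
      return 0;
    }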

Commit Log

Reviewing the full changelog confirms that most work happened under the hood.

Using Ninja turned out to be a big success! In contrast, writing GNU makefiles from scratch for Oil revealed many limitations and pitfalls of the classic tool. I'd like to write a #comments post about Ninja, based on links collected in this Zulip thread.

The Garbage Collector Works On A Variety of Examples

In my view, this is the most important part of this announcement!

As mentioned in January, writing a garbage collector has been harder than I expected. One reason for this is the unusual set of requirements: it's moving, precise, and written almost entirely in portable C++. We use it for both a little hand-written code and a lot of generated code.

To see evidence of progress, take a look at the second table on each of these pages, Max Resident Set Size. It compares the memory usage of a Python program to the same program translated to C++ with mycpp:

In other words, the garbage collector is working. We also have unit tests, stress tests, and various kinds of instrumentation for it. I fixed many crashes to get to this point.

But it's not done. Importantly, it's not yet hooked up to osh_eval.cc aka oil-native. Off the top of my head, we still need to generate field bitmasks for classes, including subclasses.
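
To make that concrete, here's a rough sketch of what "field bitmasks" could look like for a precise, moving collector. This is not Oil's actual object layout; the names and header format are made up for illustration:

    // Rough sketch of per-class field masks for a precise, moving GC.
    // This is NOT Oil's actual layout; all names here are made up.
    #include <cstdint>

    struct Obj {
      uint32_t field_mask;  // bit i set => slot i holds a heap pointer
      uint32_t num_slots;   // pointer-sized slots that follow the header
    };

    // A "generated" class.  The code generator would emit a mask covering
    // its own pointer fields AND any inherited from base classes.
    struct Token : Obj {
      Obj* name;      // slot 0: heap pointer  -> bit 0 set
      Obj* location;  // slot 1: heap pointer  -> bit 1 set
      int64_t id;     // slot 2: plain integer -> bit 2 clear
    };

    // Tracing visits only the slots the mask marks as pointers, so the
    // collector can rewrite them when it moves objects.
    inline void TraceFields(Obj* obj, void (*visit)(Obj**)) {
      Obj** slots = reinterpret_cast<Obj**>(obj + 1);
      for (uint32_t i = 0; i < obj->num_slots; ++i) {
        if (obj->field_mask & (1u << i)) {
          visit(&slots[i]);
        }
      }
    }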

I hope to write about the garbage collector in detail after it's working on oil-native, rather than just small examples. This Zulip thread has some color and comments on the experience (starting in November 2020).

What's Next?

I want to review the project's progress and write about future plans in the next post. Oil is approaching the five year mark, which is crazy, so it's again time to take stock of things.

I'm still worried about the scope of the project. I also moved for the first time in 10 years (within San Francisco), and that pushed things off track for several weeks.

For the impatient, there are immediate plans in the Zulip thread for the 0.8.8 release, but that's not all I want to write about.

Appendix A: Metrics for oil-native

Doing these reviews helps me keep track of the project. Potential contributors may also be interested (help wanted).

I reviewed Metrics for Oil 0.8.4 in November, so let's use it as the baseline for these comparisons.

Spec Tests: Python vs. C++

Here are the OSH spec test stats for oil-native:

Out of the 14 newly passing tests in Python, 11 pass in C++. That's not bad: it means that we can fix bugs in Python and, for the most part, things just work. The ones that don't could be the result of stubbed-out C++ "bindings", e.g. where I used assert(0) as the body of a function.
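
For example, a stubbed-out binding might look roughly like this; the name and signature are made up, but the assert(0) pattern is the one described above:

    // Hypothetical sketch of a stubbed-out C++ "binding".  Any spec test
    // that reaches this code path aborts under the C++ build, even though
    // the corresponding Python code works.
    #include <cassert>
    #include <string>

    // Name and signature are made up for illustration.
    std::string ReadLineFromTerminal() {
      assert(0);  // TODO: not translated to C++ yet
      return "";  // unreachable, but keeps the compiler happy
    }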

The bad news is that the C++ test count has stalled around 920 for a few months. I've been working on the garbage collector and other things.

So the translation process is working, but it's taking longer than expected.

Lines of Code (Mostly Generated)

There's a small increase in generated source code due to normal feature development:

Native Binary Size

This increase in native binary size seems larger than the increase in lines of source would suggest. Off the top of my head, it's probably because every function (generated or hand-written) now has to register stack roots with the garbage collector.
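
As a sketch of why that costs code size: with a precise, moving collector, each function has to tell the GC where its local heap pointers live for the duration of the call, roughly like this (simplified, and not necessarily Oil's exact API):

    // Simplified sketch of per-function stack root registration.  Not
    // necessarily Oil's exact API; it just shows why every function
    // grows by a few instructions.
    #include <cstddef>
    #include <initializer_list>
    #include <vector>

    struct Obj;  // some GC-managed heap object

    // Addresses of local pointers, so a moving collector can find and
    // rewrite them after relocating objects.
    static std::vector<Obj**> g_roots;

    class StackRoots {
     public:
      explicit StackRoots(std::initializer_list<Obj**> roots) : n_(roots.size()) {
        for (Obj** r : roots) g_roots.push_back(r);
      }
      ~StackRoots() {  // unregister on scope exit
        for (std::size_t i = 0; i < n_; ++i) g_roots.pop_back();
      }
     private:
      std::size_t n_;
    };

    // Every translated function has to do something like this:
    void SomeGeneratedFunction(Obj* a, Obj* b) {
      Obj* result = nullptr;
      StackRoots _roots({&a, &b, &result});  // registered for this scope
      // ... code that may allocate, and therefore may trigger a collection
      // that moves a, b, or result ...
    }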

Compilation Speed

Despite more lines of code, the compilation time didn't change much.

And I suspect that it's mainly affected by the #include structure, which we can optimize.
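
One common way to do that in C++ is to forward-declare types in headers instead of including their full definitions, so touching one header doesn't recompile the world. A sketch, with made-up names:

    // Sketch of a header that avoids heavy #includes (names are made up).
    //
    // Instead of:
    //   #include "word_eval.h"
    //   #include "state.h"
    // forward-declare the types that only appear by pointer or reference:
    namespace runtime {
      class WordEvaluator;
      class Mem;
    }

    class CommandEvaluator {
     public:
      CommandEvaluator(runtime::WordEvaluator* word_ev, runtime::Mem* mem)
          : word_ev_(word_ev), mem_(mem) {}

     private:
      runtime::WordEvaluator* word_ev_;  // full definition only needed in the .cc
      runtime::Mem* mem_;
    };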

Parsing Speed

I think the following change is within the parser benchmark noise, but it reminds me that we need to add a stable metric to the benchmarks (issue 871).

Compute Speed

These benchmarks are hard to summarize, but it looks like Oil did get slower, probably due to the extra code generated to track stack roots for the garbage collector.

Appendix B: Other Release Metrics

Spec Tests: OSH and Oil

These tests measure the correctness of the Python "reference build" / "executable spec". We have steady progress on both OSH:

and the Oil language:

Lines of Source Code

We have a few hundred more significant lines of code:

And physical lines of code:

Runtime Benchmarks (for the slow build)

The slowness shown in the following runtime benchmarks is why oil-native exists. However, I noticed a surprising speedup, which seems to hold for releases 0.8.5 through 0.8.7 as well:

Honestly, I have no idea what happened here. All I can say is that this isn't the first time I've seen something unusual while keeping track of performance over the years. I almost always learn something new when I look carefully, but I want to spend my attention on other things now.

It doesn't matter since both results are slow, and we care about the speed of oil-native. Off the top of my head, one thing that changed in January was adding enhanced xtrace, but I'd expect that to make things slower, not faster.


OK, that's it for the metrics. Feel free to leave a comment if you have any questions about this post!