Why Sponsor Oils? | blog | oilshell.org

After 8 Years, Oils Is Still Small and Flexible

2024-09-10

Let's do something hard, and go all the way back to the first post on the project:

Let's see how much code we've added, and let's see if the ideas made sense. There's no better test than reading and evaluating what you wrote years ago :-)

Table of Contents
Our Code in 2016, versus 2024
Comparison Table
Project is Big, Code is Small
Review
What's Changed in 8 Years?
The First Post Was An Apology for Python
Was the Middle-Out Style Worth It?
Conclusion
Appendix: The Densest Source Files

Our Code in 2016, versus 2024

In 2016, I showed this summary of the code:

PYTHON SKETCH
...
  1044 sketch/word_parse.py
  1299 sketch/cmd_parse.py
 10315 total

SHELL TESTS
...

This was 6 months into the project: we had 10 K lines of Python code and many tests.

 

That report evolved into the ones I publish on the release quality page:

 

Comparison Table

Let's arrange these numbers in columns:

Component                  Physical Lines, 2016  Physical Lines, 2024  Significant Lines, 2024  Notes
OSH                        10 K                  44 K                  23 K                     Compare with ~142 K lines of bash
YSH                        -                     9 K                   5 K
Data Notation              -                     2 K                   1 K

Garbage Collected Runtime  -                     5 K                   4 K                      Hand-written C++
OS Bindings                -                     3 K                   2 K                      Hand-written C++

Total Hand-Written Source  10 K                  64 K                  35 K
Total Generated Code       -                     122 K

mycpp Translator           -                     7 K                                            Not shipped at runtime
Spec Tests                 3 K                   54 K

I like this! We have 64 K physical lines / 35 K significant lines in the major components of the project: OSH, YSH, J8 Notation, and the C++ runtime.
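To make the two columns concrete, here is a minimal sketch of one way to count physical versus significant lines. It is not the script behind the reports. I'm assuming that "significant" means a line that is neither blank nor a comment-only line. The names `count_lines` and `SAMPLE` are made up for illustration.

```python
def count_lines(text, comment_prefix="#"):
    """Return (physical, significant) line counts for a piece of source text.

    Assumption for this sketch: a line is "significant" if it is neither
    blank nor a comment-only line. Docstrings and block comments are not
    treated specially.
    """
    physical = 0
    significant = 0
    for line in text.splitlines():
        physical += 1
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            significant += 1
    return physical, significant


# Hypothetical input: 5 physical lines, 2 of them significant.
SAMPLE = '''\
# Parse a word.

def parse_word(s):
    # TODO: handle quotes
    return s.split()
'''

if __name__ == "__main__":
    print(count_lines(SAMPLE))  # (5, 2)
```

The same function works for C++ sources if you pass `comment_prefix="//"`, although a real report would also need to handle `/* ... */` comments.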

All of Oils — including YSH and J8 Notation — has less source code than bash (~142 K lines).

This is despite the fact that YSH has "real" data structures, garbage collection, and more. (The next post will emphasize this.)

It's not just our source code that's smaller than bash: our generated code (122 K lines) is smaller too. This matters because we read, debug, and profile the generated code.

Project is Big, Code is Small

So the last post showed that the Oils project is big, but now we see that its source code is small.

The appendix links to selected source files, which may give you a feeling for why this is.

 

(Caveat: I'm counting only Python and C++ code, which is ~7 out of the 13 parts. I'd like to join and fully automate the 3 line count reports, to account for all 13.)

Review

What's Changed in 8 Years?

The table of line counts suggests how the project has changed.

 

The First Post Was An Apology for Python

It's funny to me that the first post can be read as an apology for showing Python code, not C++:

I actually started writing it in C++. But after getting to 3K lines of code in the spring, it began to feel onerous.

I also hinted at what was to come:

Or even better than porting is to use Python as a metaprogramming language for C++.

After some diversions and missteps, this largely came true. We now have a nice situation:

  1. We write little C++ by hand, preferring to write DSLs and typed Python instead.
  2. We generate both Python and C++ from the DSLs, giving two complete implementations.
  3. The C++ tarball has the speed of native code, and no dependencies (e.g. on the Python interpreter).

This is what I call the middle-out style. But it certainly took a long time to get here.
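Here is a toy sketch of the idea behind points 1 and 2: one small schema, two generated outputs. It is not the DSLs or the mycpp translator that Oils actually uses. The schema format, the `Token` record, and the names `gen_python` and `gen_cpp` are all hypothetical.

```python
# Hypothetical schema: record name -> list of (field name, type) pairs.
SCHEMA = {
    "Token": [("kind", "int"), ("text", "str")],
}

# Assumed mapping from schema types to C++ types, for this sketch only.
CPP_TYPES = {"int": "int", "str": "std::string"}


def gen_python(schema):
    """Emit a Python class with a typed constructor for each record."""
    out = []
    for name, fields in schema.items():
        params = ", ".join(f"{f}: {t}" for f, t in fields)
        out.append(f"class {name}:")
        out.append(f"    def __init__(self, {params}) -> None:")
        for f, _ in fields:
            out.append(f"        self.{f} = {f}")
        out.append("")
    return "\n".join(out)


def gen_cpp(schema):
    """Emit a C++ struct for each record, using the type mapping above."""
    out = ["#include <string>", ""]
    for name, fields in schema.items():
        out.append(f"struct {name} {{")
        for f, t in fields:
            out.append(f"  {CPP_TYPES[t]} {f};")
        out.append("};")
        out.append("")
    return "\n".join(out)


if __name__ == "__main__":
    print(gen_python(SCHEMA))
    print(gen_cpp(SCHEMA))
```

The point of the sketch is that the hand-written part is the three-line schema. Both the Python class and the C++ struct are derived from it, so they can't drift apart.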

Was the Middle-Out Style Worth It?

I think so, but it's hard to argue that in a short space. For now, I'll abbreviate the argument with some slogans:

Benefits of the Middle-Out Style:

A slight surprise:

Conclusion

Oils is a big project, with 8 years of functionality, but it's a small codebase. And that was always the goal!

What's next? I extracted two posts from this one:

This was the original plan for the series:

  1. What Oils Looks Like in 2024
  2. After 8 Years, Oils Is Still Small and Flexible
  3. Missing Retrospectives on Oils
  4. Oils - Grand Ideas and Fiddly Details
  5. Oils 0.23.0 - User Feedback, Bug Bounty, and Writing YSH Code

Appendix: The Densest Source Files

Why is our code short? I publish selected source files with every release, and they may give you a feel for this:

Let me know if you need help reading these files! Together, they form a concise description of the many interleaved languages in Oils. They're a big part of what I think of as the executable spec.