Why Sponsor Oils? | blog | oilshell.org
This post is part of "flushing the blog queue", described in yesterday's blog roadmap. I link to comments and stories, and provide a summary of themes, without making a full argument.
Let me know if this style is or isn't comprehensible / useful!
My comments here connect #parsing theory to practice:
Modern Parser Generator (matklad.github.io via Reddit)
60 points, 38 comments - 03 Dec 2020
Themes:
My Comment on Which Parsing Approach? (tratt.net via lobste.rs)
55 points, 36 comments on 2020-09-15
Summary: I studied many different ways of parsing, and used several in Oil. Like Tratt, I "returned" to the textbook LR style to some degree:
Survey: What parser generator are you using, if any? (/r/ProgrammingLanguages)
42 points, 68 comments - 19 May 2019
Oil doesn't currently use LR parsing, but it would probably be appropriate for the expression language. I see why it's a good compromise in some situations.
I also encourage implementers to make this distinction:
What I wish compiler books would cover (/r/ProgrammingLanguages)
136 points, 36 comments - 30 Apr 2020
I describe how Oil uses spans, span IDs, and Python/C++ exceptions to provide detailed errors, while keeping the code clean. And I link to related blog posts.
This design has worked well, but I don't claim it's the best one. I'd like to hear about other approaches.
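To make the idea concrete, here's a minimal sketch of the span ID pattern in Python. It is not Oil's actual code: `Arena`, `Span`, `ParseError`, `lex`, `parse_sum`, `format_error`, and `run` are all hypothetical names, and the toy "language" just sums numbers. The point it tries to show is that the lexer records locations once, tokens and exceptions carry only an integer ID, and a single top-level handler turns that ID back into a readable message.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """A contiguous piece of one source line (hypothetical layout)."""
    line_num: int  # 1-based line number
    col: int       # 0-based start column
    length: int    # number of characters covered


class Arena:
    """Owns source lines and spans; everything else holds integer span IDs."""

    def __init__(self, source):
        self.lines = source.splitlines()
        self.spans = []

    def add_span(self, line_num, col, length):
        self.spans.append(Span(line_num, col, length))
        return len(self.spans) - 1  # the span ID is just an index


class ParseError(Exception):
    """Carries a message and a span ID, not line/column details."""

    def __init__(self, msg, span_id):
        super().__init__(msg)
        self.msg = msg
        self.span_id = span_id


def lex(arena):
    """Split each line on whitespace into (text, span_id) tokens."""
    tokens = []
    for line_num, line in enumerate(arena.lines, start=1):
        col = 0
        for word in line.split():
            col = line.index(word, col)
            tokens.append((word, arena.add_span(line_num, col, len(word))))
            col += len(word)
    return tokens


def parse_sum(tokens):
    """Sum tokens that alternate NUMBER, '+', NUMBER, ...

    End-of-input checks are omitted to keep the sketch short.
    """
    total = 0
    expect_number = True
    for text, span_id in tokens:
        if expect_number:
            if not text.isdigit():
                raise ParseError("expected a number", span_id)
            total += int(text)
        elif text != "+":
            raise ParseError("expected '+'", span_id)
        expect_number = not expect_number
    return total


def format_error(arena, err):
    """Turn a span ID back into the source line plus a caret underline."""
    span = arena.spans[err.span_id]
    line = arena.lines[span.line_num - 1]
    underline = " " * span.col + "^" * span.length
    return "line %d: %s\n%s\n%s" % (span.line_num, err.msg, line, underline)


def run(source):
    arena = Arena(source)
    try:
        return str(parse_sum(lex(arena)))
    except ParseError as e:  # the only place that knows how to print errors
        return format_error(arena, e)
```

With these definitions, `run("1 + 2 + 3")` returns `"6"`, while `run("1 + foo + 3")` returns a three-line message that quotes the input and underlines `foo`. The parsing code stays free of formatting logic because it only ever passes a span ID along.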
I've posted this link on the blog before, but #lexing is another place where theory and practice meet.
(github.com via Reddit) I updated this wiki page based on this lobste.rs discussion. It's not strictly about parsing, but may be interesting to language designers.
Here are a few observations about the metalanguage for compilers from my comments:
This post is mainly for experienced language implementers. If you've never written a parser, a good intro is Chapter 6: Parsing Expressions in Crafting Interpreters.
It will give you just enough theory to write your first parser. After that, the theory above will be more useful (CFGs and PEGs, for example).
Why use theory? One reason is that writing a parser isn't the same thing as designing the syntax for a language. For example, many language specifications contain grammars.
And most languages have 2 or 3 widely used parsers, so it helps to be "abstract" about the syntax. Bootstrapping is one reason that you will need to write another parser, but here are more important use cases:
-g.
Code formatters (go fmt, clang-format), and automatic translators (#osh-to-oil) need different representations of your code than compilers do.
Related article:
Who Is Debugging the Debuggers? Exposing Debug Bugs in Optimized Binaries (arxiv.org via Hacker News)
98 points, 22 comments - 35 days ago
An anecdote to show why this matters: While debugging the garbage collector, I ran into an issue where GDB used incorrect location info when debugging binaries compiled with Address Sanitizer. This led to confusing and frustrating sessions where I was literally debugging the wrong code.
While I don't know the exact cause of this issue, the general point is that good tools rely on good front ends. Front ends have non-obvious design decisions that percolate throughout the interpreter or compiler. The above link about error handling architecture elaborates on this.
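As a small illustration of why formatters and translators need a different representation than compilers do, here's a sketch using Python's own standard library (the helper names `roundtrip_via_ast` and `comments_via_tokens` are made up for this example). An AST is enough to execute code, but it discards comments, so a tool that rewrites source has to keep more of the original text around, for instance by working from the token stream.

```python
import ast
import io
import tokenize

SOURCE = "x = 1  # the answer\n"


def roundtrip_via_ast(source):
    """Parse to an AST and print it back; requires Python 3.9+ for ast.unparse."""
    return ast.unparse(ast.parse(source))


def comments_via_tokens(source):
    """Collect comment text from the token stream, which keeps comments."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.COMMENT
    ]


print(roundtrip_via_ast(SOURCE))    # x = 1
print(comments_via_tokens(SOURCE))  # ['# the answer']
```

The AST round trip loses the comment, while the token stream still has it. That's one reason a single "abstract" description of the syntax is useful: several front ends with different representations have to agree on it.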