Why Sponsor Oils? | blog | oilshell.org
I'm wary of using a fancy term to describe the method Veedrac and I used to find solutions to the enum-labeling problem, but I think it's accurate and potentially useful.
What does it mean? Bespoke means "custom-made" or "made-to-order" with respect to say clothing, but it's also descriptive of software. For example, in this interview and the book Coders at Work, Donald Knuth argues against reusable code and in favor of bespoke data structures and libraries:
I also must confess to a strong bias against the fashion for reusable code. To me, "re-editable code" is much, much better than an untouchable black box or toolkit.
He's referring to the practice of copying code and adapting it to fit the application.
In the same vein, bespoke code generators are useful and common, but "reusable" code generators like ANTLR or yacc haven't lived up to their promise. I'll return to this topic in the future, but for now here are examples:
Superoptimization is a technique for generating short code sequences using brute-force search. It has the property that the optimized code may use a different algorithm than the original. In our case, we turned a lookup table into an expression of bitwise operators.
So a bespoke superoptimizer is one written for a specific application, rather than one that's part of a compiler. Exploiting application-specific knowledge can make a problem more tractable.
As mentioned in the problem description, osh
now has 221 token
types, which broke my scheme of using 4 bits for the token Kind
and 4 bits
for the Id
.
I switched to an explicit lookup table, but in the back of my mind I suspected
there may be a fancy trick to "compress" it into a mathematical expression.
We just want to look up an enum Kind
value using an enum Id
value, and we
don't care about the concrete integer values.
This thread on bitwise operator tricks popped up on Reddit, and I thought it would be a good audience for my question. Veedrac quickly surprised me with two elegant solutions.
I was honestly puzzled by them, so I replicated his results. Here is the func_search.py program I used to do it. I've left it messy and slow so you don't have any illusions about how principled this process was.
I asked Veedrac about his thought process, and he mentioned an intuition about
the formula x ^ -x
. This was important for getting started, but I don't
believe it was necessary in the end. Instead, what he taught me is that the
problem is less constrained than I thought.
Paradoxically, the key insight is that there's no insight necessary to solve this problem. I thought that a mathematical expression could be derived using some trick or theory like binary decision diagrams, but we didn't need anything of the sort.
Not only can the problem be solved by brute force, it can be found in a few seconds using unoptimized, single-threaded Python. Here is the search strategy I replicated:
id
using a few bitwise operators and a few
single byte constants,
f4_5()
.Form4()
.Hist()
.ScoreFunction()
.ScoreFunction()
does three things:
def ScoreFunction(hist):
func_range_size = len(hist)
if func_range_size < 21:
return
dist = [freq for _, freq in hist]
func_range = [result for result, _ in hist]
if any(v > 255 for v in func_range):
return
if any(v < 0 for v in func_range):
return
num_deficits = NumDeficits(dist, DESIRED_DIST)
if num_deficits <= 1:
print('Kind that are too small: %d' % num_deficits)
d = Deficit(DESIRED_DIST, dist)
Show('func range', func_range)
Show('want dist', DESIRED_DIST)
Show('have dist', dist)
Show('deficit', d)
return num_deficits
Using an arbitrary guess of the function form f4_5()
, this program finds an
exact solution in less than 3 seconds. The search space I chose has only
215 = 32,768 functions.
$ time ./func_search.py search ... Kinds that don't fit: 0 func range 0 1 32 3 8 9 11 33 35 4 64 19 20 36 67 65 128 68 129 131 132 want dist 43 25 20 16 15 13 12 12 11 10 10 8 8 4 4 3 3 1 1 1 1 have dist 45 27 20 19 16 16 16 12 12 10 10 8 8 8 6 6 5 4 3 3 2 deficit 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 num deficits 0, i = 12, j = 6, k = 14 ABORT: Found exact solution real 0m2.805s user 0m2.799s sys 0m0.004s
The solution found is:
kind = (-id + 12) & (id ^ 6) & (id + 14)
The func range
row shows every unique value that the function yields over the
integers 0 to 255. The second row is the distribution we want, and the third
row is the distribution we get.
Because this solution is exact, the last row is all zeros. We have a group of
output values big enough for every Kind
we desire.
I formulated a problem in the osh
parser called function-directed enum
labeling. Small instances can be trivially solved with what I call
enum stuffing: using the high bits of integer labels to store a lookup
table over the enum, thus storing it for free.
Veedrac from Reddit solved a harder instance of this problem with semi-exhaustive search. I reproduced his results with func_search.py, and published a runnable solution in ik7.cc. The shortest solution so far is:
kind = 175 & id & ((id ^ 173) + 11)
So if you're very stingy about bits, you can use this bespoke superoptimization technique to replace lookup tables with short functions over relabeled enums. To introduce the term, I mentioned the idea of bespoke data structures and bespoke code generators.
Tomorrow I will ask some questions that came out of this experience. I can't quite tell now whether it was merely a fun diversion, akin to recreational mathematics, or if it can be used to appreciably optimize real software.
Stochastic Superoptimization by John Regehr has good background material, discusses speeding up the search with randomization, and potential future directions for superoptimization.
1992 Synthesis OS -- An efficient Unix-like OS written in assembly, which uses some exotic techniques including runtime code generation. By the author of the first paper on superoptimization.
The ryg blog -- A cool blog I found while Googling for superoptimization. In retrospect, it shouldn't be surprising that demo scene programmers use techniques like this. Not many programmers are interested in removing 221-byte lookup tables. :)