Bespoke Superoptimization

2016-12-27

I'm wary of using a fancy term to describe the method Veedrac and I used to find solutions to the enum-labeling problem, but I think it's accurate and potentially useful.

What does it mean? Bespoke means "custom-made" or "made-to-order" with respect to say clothing, but it's also descriptive of software. For example, in this interview and the book Coders at Work, Donald Knuth argues against reusable code and in favor of bespoke data structures and libraries:

I also must confess to a strong bias against the fashion for reusable code. To me, "re-editable code" is much, much better than an untouchable black box or toolkit.

He's referring to the practice of copying code and adapting it to fit the application.

In the same vein, bespoke code generators are useful and common, but "reusable" code generators like ANTLR or yacc haven't lived up to their promise. I'll return to this topic in the future, but for now here are examples:

Python has a bespoke ASDL implementation. (What is ASDL?)
Oil now has one too. The generated code is different, and the schema language will also be different once I enhance it for oil's requirements.
The pgen.c parser generator in the Python source tree .
The Lemon parser generator in the sqlite source tree.

Superoptimization is a technique for generating short code sequences using brute-force search. It has the property that the optimized code may use a different algorithm than the original. In our case, we turned a lookup table into an expression of bitwise operators.

So a bespoke superoptimizer is one written for a specific application, rather than one that's part of a compiler. Exploiting application-specific knowledge can make a problem more tractable.

Recap

As mentioned in the problem description, osh now has 221 token types, which broke my scheme of using 4 bits for the token Kind and 4 bits for the Id.

I switched to an explicit lookup table, but in the back of my mind I suspected there may be a fancy trick to "compress" it into a mathematical expression. We just want to look up an enum Kind value using an enum Id value, and we don't care about the concrete integer values.

Help from the Internet

This thread on bitwise operator tricks popped up on Reddit, and I thought it would be a good audience for my question. Veedrac quickly surprised me with two elegant solutions.

I was honestly puzzled by them, so I replicated his results. Here is the func_search.py program I used to do it. I've left it messy and slow so you don't have any illusions about how principled this process was.

I asked Veedrac about his thought process, and he mentioned an intuition about the formula x ^ -x. This was important for getting started, but I don't believe it was necessary in the end. Instead, what he taught me is that the problem is less constrained than I thought.

Paradoxically, the key insight is that there's no insight necessary to solve this problem. I thought that a mathematical expression could be derived using some trick or theory like binary decision diagrams, but we didn't need anything of the sort.

Search Strategy

Not only can the problem be solved by brute force, it can be found in a few seconds using unoptimized, single-threaded Python. Here is the search strategy I replicated:

Generate random functions of id using a few bitwise operators and a few single byte constants,
- The bitwise operators are selected at random and hard-coded, e.g. f4_5().
- The constants are exhaustively or semi-exhaustively searched, e.g. Form4().
Evaluate every candidate function on all inputs, 0 to 255, and make a histogram of output values using Hist().
Test if the output values match the desired distribution using ScoreFunction().

ScoreFunction() does three things:

Throw out functions which don't have at least 21 distinct output values (the kinds)
Throw out functions that give values outside the range [0, 255]. (This is too conservative, but it doesn't matter in the end.)
Calculate the number of groups that are too small, i.e. a "deficit" to be fixed up as in the first two solutions here. If there are zero deficits, then the solution is exact, like the last two solutions.

def ScoreFunction(hist):
  func_range_size = len(hist)
  if func_range_size < 21:
    return

  dist = [freq for _, freq in hist]
  func_range = [result for result, _ in hist]
  if any(v > 255 for v in func_range):
    return
  if any(v < 0 for v in func_range):
    return

  num_deficits = NumDeficits(dist, DESIRED_DIST)
  if num_deficits <= 1:
    print('Kind that are too small: %d' % num_deficits)
    d = Deficit(DESIRED_DIST, dist)

    Show('func range', func_range)
    Show('want dist', DESIRED_DIST)
    Show('have dist', dist)
    Show('deficit', d)

  return num_deficits

Using an arbitrary guess of the function form f4_5(), this program finds an exact solution in less than 3 seconds. The search space I chose has only 2¹⁵ = 32,768 functions.

$ time ./func_search.py search
...
Kinds that don't fit: 0
  func range   0   1  32   3   8   9  11  33  35   4  64  19  20  36  67  65 128  68 129 131 132
   want dist  43  25  20  16  15  13  12  12  11  10  10   8   8   4   4   3   3   1   1   1   1
   have dist  45  27  20  19  16  16  16  12  12  10  10   8   8   8   6   6   5   4   3   3   2
     deficit   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
num deficits 0, i = 12, j = 6, k = 14

ABORT: Found exact solution

real    0m2.805s
user    0m2.799s
sys     0m0.004s

The solution found is:

kind = (-id + 12) & (id ^ 6) & (id + 14)

The func range row shows every unique value that the function yields over the integers 0 to 255. The second row is the distribution we want, and the third row is the distribution we get.

Because this solution is exact, the last row is all zeros. We have a group of output values big enough for every Kind we desire.

Conclusion

I formulated a problem in the osh parser called function-directed enum labeling. Small instances can be trivially solved with what I call enum stuffing: using the high bits of integer labels to store a lookup table over the enum, thus storing it for free.

Veedrac from Reddit solved a harder instance of this problem with semi-exhaustive search. I reproduced his results with func_search.py, and published a runnable solution in ik7.cc. The shortest solution so far is:

kind = 175 & id & ((id ^ 173) + 11)

So if you're very stingy about bits, you can use this bespoke superoptimization technique to replace lookup tables with short functions over relabeled enums. To introduce the term, I mentioned the idea of bespoke data structures and bespoke code generators.

Tomorrow I will ask some questions that came out of this experience. I can't quite tell now whether it was merely a fun diversion, akin to recreational mathematics, or if it can be used to appreciably optimize real software.