The Transterpreter Project

Concurrency, everywhere.

Compiling occam to C

In presenting the Transterpreter at Indiana University Bloomington, we met Geoff Brown, a (new to me) faculty member in the Computer Science department. He gave us a number of ideas for improving the performance of the virtual machine. One which I thought was quite cool was the idea of compiling the bytecode directly to C. Since I don't think it is immediately obvious what I mean by this, I will explain.

SPoC was an occam to C compiler. It would transform a piece of occam code into an equivalent C program (links here, here, and here). This was a very hard task, and the compiler is (I believe) no longer actively supported. So, a program like

  x := 3
  y := 5 + y), 0, scr)

might be turned into

x = 3;
y = 5;
printf("%d", (x + y));

in C. Now, SPoC had to provide compile in a whole scheduler and other stuff to make this work, so it didn't produce code that was quite that simple.

We wondered what kind of performance increase we could get by using our existing infrastructure to execute compiled C code instead of interpreting bytecode. You see, right now the Transterpreter looks like this:


The Transterpreter reads in a big string of numbers (produced by KRoC, the occam compiler), and it "interprets" what those numbers really mean, and executes them. We might also look at it like this:


The Transterpreter looks at the numbers, and figures out that a certain set of zeros and ones means "add two numbers". Or, "load a constant into memory", and so on. When working on the Transterpreter, we tend to think of things at this level; it makes things a lot simpler. So, it's a stream of assembly language instructions, and we just do what they say to do.

Well, our entire machine is written in C. That means, we have a little piece of C code for every assembly language instruction. For example, the ADC (ADd Constant) instruction looks like

void ins_adc()
	areg = areg + oreg;
	/* breg and creg stay the same */

So, whenever we see the assembly language instruction ADC, we simply call the function ins_adc in our interpreter.

What we did last night was simply get rid of the instructions. You see, we were going through a process that looked like this:

  1. Read an instruction
  2. Look up the C function that represents that instruction
  3. Execute the function

Over and over. So, instead of looking up every function, we instead converted our list of instructions into a list of C functions that look like

static int main() {

This way, our execution loop looks like

  1. Run C code.

which is a good deal faster.

We then take the C program that we generated (which behaves just like us looking up every instruction in sequence), and ran it. The performance improvement is impressive, and we haven't even begun to optimize.

(There are a few small details missing here, but this is, generally speaking, what we did... I'm not going to go into the details of exactly what our C code looks like when we dump it, because it isn't important for this discussion right here, right now.)

So, how do we fare? Well, when we interpret everything, we can run through a tight loop around 10,000,000 times (a loop that just does some simple arithmetic... not a very great benchmark) in, say, 300 time units. When we compile our code in C, we run in 60 time units; that is a 5x speedup. We're still a little way off KRoC on the Intel/Linux platform, as it runs the same loop in 1 time unit. Kinda a bummer.

So, we get a 5x speedup by a naive compilation to C. (Back in the day, people would publish papers about their compilers, and how they got a 5% speedup!) This is good, though, as we can run that C code anywhere the Transterpreter can run, which is pretty much everywhere. And, there's a lot of room for improving the performance of both the compilation (from bytecode to C) and the Transterpreter itself.


  • Posted: March 5, 2005
  • Author: Matthew Jadud
  • Comments: None
  • Tags: None

Latest Weblog Posts