25 January 2008

This week I learned...

  • NestedVM can take any program that GCC can compile and run it in a Java VM. It does this by compiling the program to a MIPS executable and then translating the MIPS machine code to Java bytecode. Now, there isn't any high-level type information in a MIPS binary, so there isn't any in the bytecode. Instead each instruction is translated to something that bangs on some large int arrays that represent virtual memory. (The sbrk system call is implemented using new int[].)

    The paper has sentences like, “The NestedVM runtime fills the role typically assumed by an OS kernel.” :)

    I think the point of this, aside from being cool, is to make C++ code run anywhere Java does. I don't know how many platforms have JVMs but no GCC back end, though. (GCC actually has a back end for Java, but it can't handle C++.)

  • So if you know C, you know that && has short-circuit behavior: if the left-hand side is false, the right-hand side doesn't get evaluated. This week I learned that if the right-hand side of && is simple enough and has no side effects, as in x > 0 && x < N, a good compiler emits code that evaluates it anyway, essentially treating the && as &. A conditional branch is slower than a few redundant instructions.

  • Objective-C exceptions on Mac are implemented using setjmp/longjmp. They don't cooperate with C++; if you throw an Objective-C exception across a frame containing C++ objects, the destructors don't get called. This triggered some bugs in Mozilla, which apparently has Cocoa GUI code or something. (Sorry, I don't pay much attention to that stuff. :))

  • If you compile with gcc -g3, then gdb can print expressions that use macros! I knew Jim Blandy had implemented this but I never actually went and dug up the magic to make it work. This will make my life a lot easier, at least for a year or two.

  • The gcc compiler itself uses a garbage collector. I'm told the GC is autogenerated from the source; so the gcc source distribution actually includes a bunch of autogenerated code.

  • gcc -S prints x86 assembly code in AT&T syntax, which has the operands reversed compared to Intel syntax (source first, destination last). I don't remember if I ever knew this or not. What a pain.

  • And some more about GCC internals, from here.
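
To make the && item above concrete, here is a minimal C sketch of the two forms. The function names are made up for illustration, and whether a given compiler really emits the same branch-free code for both depends on the compiler version, target, and optimization level.

```c
#include <assert.h>
#include <stdbool.h>

/* Short-circuit form: as written, x < n is only evaluated if x > 0 holds. */
bool in_range_logical(int x, int n) {
    return x > 0 && x < n;
}

/* Branch-free form: both comparisons are always evaluated, then combined
   with bitwise AND. This gives the same answer because each comparison
   yields 0 or 1 and neither one has side effects. */
bool in_range_bitwise(int x, int n) {
    return (x > 0) & (x < n);
}

/* Hypothetical helper: asserts that the two forms agree for every x
   in [lo, hi] with the given upper bound n. */
void check_equivalent(int lo, int hi, int n) {
    for (int x = lo; x <= hi; x++) {
        assert(in_range_logical(x, n) == in_range_bitwise(x, n));
    }
}
```

One way to see what your own compiler does is to build this with something like gcc -O2 -S and compare the assembly for the two functions. The rewrite is only legal when the right-hand side is safe to evaluate unconditionally; something like p != NULL && p->x > 0 has to keep its short-circuit behavior.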


Hadi said...

Your blog is one of the most interesting blogs I've ever read, and this post is among the most interesting posts (of course, just for me).

Thanks for sharing your experiences!

Andrew said...

You can get proper Intel assembly code with -S -masm=intel.