16 November 2007

This week I learned... (Toronto edition)

It was a good week, because I was stuck in a room with Benjamin Smedberg and Taras Glek for a couple days.

  • Virtual method calls are branch-predicted these days. What is the world coming to?

  • When you fly from Canada to the U.S., you go through customs in Canada.

  • I learned a little about why Mozilla's security code is the way it is. I hate the wrapper model, but the model I prefer, which depends on examining the stack to find unauthorized callers, has problems too.

    Benjamin claims stack-checking might be even more fragile, because it depends on the stack information being correct. For example, sometimes C++ code calls methods in order to block something evil that a content script is trying to do. If that C++ code neglects to say "Simon says", the blocking silently doesn't happen. Apparently a bug like this actually happened at one point (story embedded in document). I still think the wrapper model is horrible and not to be taken seriously unless it's enforced with static analysis. (There's a toy sketch of the "Simon says" failure mode after this list.)

    I will say this: stack checking would have to be optimized for Mozilla, because performance is a problem. Compare Mozilla's sandbox to the Java applet sandbox. Applets are the easy case: the security checks don't normally trigger at all, and if one does hit, it's either a bug or the applet is malicious. Mozilla, by contrast, has to do an access check every time a script does anything to the DOM. It's easy to optimize those checks away using wrappers; not so easy with stack checking (but I don't know much about it).

  • I'm told the line numbers in Unix debugging symbols are limited to 16 bits. Source files with over 64K lines of code can confuse your debugger. Haven't tested this.

  • Vmgen is a tool for auto-generating fast interpreters. (Paper, PDF, 30 pages.)

  • I learned why an interpreter that uses while (1) switch (*bytecode++) {...} might be slower than one that uses an explicit computed goto at the end of each case to jump directly to the case for the next bytecode. It has to do with how CPUs do branch prediction. The CPU keeps a table of branch instructions and where they usually end up branching to. This allows it to predict (guess) where each branch is going most of the time, keeping the pipeline full. If your interpreter uses a computed goto at the end of each case, the CPU makes a separate entry in this table for each of those branches. So the CPU can do a good job of predicting what the next opcode is, and it'll automatically tune this intelligence to the specific bytecode the interpreter is executing! By contrast, the single indirect branch in switch (*bytecode++) { ... } is impossible to predict using this simple historical table. The pipeline is often derailed, which kills performance. (Both dispatch styles are sketched in code after this list.)

  • Tracepoints are breakpoints that don't stop the program for long. When a tracepoint hits, a little code runs that logs the contents of certain pieces of memory, and then the program continues. By design, the tracepoint can't modify program state; it treats the process as read-only. This feature was invented to help debug critical software while it's live. (!)

    But there are two even crazier applications. If you know in advance what you're interested in, tracepoints can be used to implement a debugger that can step back and forward in time, with much less overhead than the usual approach of logging every memory write. And GDB only supports tracepoints in remote debugging, but there's a proof of concept that uses them to debug the Linux kernel from a user-mode debugger (arrrgh, too cool!). (A toy sketch of the collect-and-continue idea is below, too.)
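
Since I keep picking at the stack-checking thing, here's a toy sketch of the "Simon says" failure mode in plain C. None of these names are Mozilla's (or Java's) real API; everything here is invented for illustration. The point is only that trusted code acting while a content script is on the stack has to remember to assert its own authority, or the check quietly gives the wrong answer:

    /* Toy sketch of stack-walking access checks. Every name here is
     * invented; this is not Mozilla's (or Java's) actual security API. */
    #include <stdio.h>

    enum principal { SYSTEM, CONTENT };

    struct frame {
        enum principal who;
        int asserted;               /* did this frame say "Simon says"? */
    };

    static struct frame stack[64];
    static int depth;

    static void enter(enum principal who, int asserted)
    {
        stack[depth].who = who;
        stack[depth].asserted = asserted;
        depth++;
    }

    static void leave(void) { depth--; }

    /* Walk the stack from the innermost frame outward. A frame that
     * asserted privileges vouches for everything it calls; a content
     * frame reached first means the request is denied. */
    static int caller_is_trusted(void)
    {
        for (int i = depth - 1; i >= 0; i--) {
            if (stack[i].asserted)
                return 1;
            if (stack[i].who == CONTENT)
                return 0;
        }
        return 1;                   /* nothing but trusted frames */
    }

    /* A privileged operation the browser uses to shut down something
     * evil that a content script started. */
    static void cancel_evil_download(void)
    {
        if (!caller_is_trusted()) {
            puts("check failed: the blocking quietly doesn't happen");
            return;
        }
        puts("evil download cancelled");
    }

    /* Trusted native code, called while a content script is on the
     * stack. The correct version says "Simon says" first. */
    static void on_evil_script_detected(int remembered_simon_says)
    {
        enter(SYSTEM, remembered_simon_says);
        cancel_evil_download();
        leave();
    }

    int main(void)
    {
        enter(CONTENT, 0);            /* a content script is running */
        on_evil_script_detected(1);   /* "evil download cancelled" */
        on_evil_script_detected(0);   /* forgot Simon says: silently denied */
        leave();
        return 0;
    }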

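Here's the dispatch comparison in code. The opcodes and the little stack machine are made up, and the second version relies on GCC/Clang's labels-as-values extension, but it shows where the extra indirect branches come from:

    /* A toy bytecode interpreter, written two ways. The VM and opcodes
     * are invented for illustration. */
    #include <stdio.h>

    enum { OP_PUSH1, OP_ADD, OP_PRINT, OP_HALT };

    /* Style 1: one big switch. Every opcode funnels through the same
     * indirect branch at the top of the loop, so the branch predictor
     * has a single, hopelessly noisy entry to work with. */
    static void run_switch(const unsigned char *bytecode)
    {
        int stack[32], sp = 0;
        for (;;) {
            switch (*bytecode++) {
            case OP_PUSH1: stack[sp++] = 1;                  break;
            case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
            case OP_PRINT: printf("%d\n", stack[sp - 1]);    break;
            case OP_HALT:  return;
            }
        }
    }

    /* Style 2: a computed goto at the end of each handler. Each opcode
     * gets its own indirect branch, so the predictor can learn patterns
     * like "this ADD is usually followed by PRINT" from the bytecode
     * actually being run. */
    static void run_goto(const unsigned char *bytecode)
    {
        static void *dispatch[] = { &&do_push1, &&do_add, &&do_print, &&do_halt };
        int stack[32], sp = 0;

        goto *dispatch[*bytecode++];
    do_push1:
        stack[sp++] = 1;
        goto *dispatch[*bytecode++];
    do_add:
        sp--; stack[sp - 1] += stack[sp];
        goto *dispatch[*bytecode++];
    do_print:
        printf("%d\n", stack[sp - 1]);
        goto *dispatch[*bytecode++];
    do_halt:
        return;
    }

    int main(void)
    {
        const unsigned char prog[] = { OP_PUSH1, OP_PUSH1, OP_ADD, OP_PRINT, OP_HALT };
        run_switch(prog);   /* prints 2 */
        run_goto(prog);     /* prints 2 */
        return 0;
    }

If I understand it right, the threaded-code dispatch Vmgen generates for you is essentially the second style.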
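
And one last sketch, of what a tracepoint does when it fires. A real implementation lives in the debugger and its remote agent, not in the program; all this shows is the collect-and-continue discipline: copy out the memory you were told to collect, never write anything back, and dig through the log afterwards:

    /* Toy illustration of the tracepoint discipline: collect some
     * memory, touch nothing, keep going. This is not how gdb
     * implements it; it's just the idea in ordinary C. */
    #include <stdio.h>
    #include <string.h>

    struct trace_record {
        const char    *label;
        unsigned char  bytes[16];
    };

    static struct trace_record trace_log[128];
    static int trace_len;

    /* "Fire" a tracepoint: log up to 16 bytes at addr and return
     * immediately. The traced memory is treated as read-only. */
    static void tracepoint_hit(const char *label, const void *addr, size_t n)
    {
        if (trace_len < 128 && n <= sizeof trace_log[0].bytes) {
            trace_log[trace_len].label = label;
            memcpy(trace_log[trace_len].bytes, addr, n);
            trace_len++;
        }
    }

    int main(void)
    {
        int refcount = 1;
        for (int i = 0; i < 3; i++) {
            refcount += i;
            tracepoint_hit("refcount after loop body", &refcount, sizeof refcount);
        }

        /* Later, offline, step back and forward through the collected
         * frames -- the way gdb's tfind walks a trace buffer. */
        for (int i = 0; i < trace_len; i++) {
            int value;
            memcpy(&value, trace_log[i].bytes, sizeof value);
            printf("frame %d: %s = %d\n", i, trace_log[i].label, value);
        }
        return 0;
    }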