26 October 2007

This week I learned...

  • How to find the median of n items in O(n) comparisons. The easiest algorithm is merely O(n) expected time. Can you figure out how it works? Hints in the comments.

  • You can sort items with string keys in O(kn) time, where k is the average length of the strings. But it requires, say, 5n words of extra table space. This is amusing because in JavaScript, Array.prototype.sort() sorts objects by their toString(), so it is always sorting strings. (A comparison sort in this case can be as bad as O(kn log n), since comparisons are worst-case O(k).)

  • The memory model that processors present to multithreaded programs running on multiprocessor machines is frighteningly subtle: A Taxonomy of Multiprocessor Memory-Ordering Models (44 slides, PDF). The presentation cheerfully explains about 13 different ways CPUs can (by design) shuffle the order of memory accesses, causing more or less confusion on other CPUs. The reason they have to do this is per-CPU caches, which are a major speed win; the techniques multiprocessors use to maintain a modicum of sanity are called cache coherency protocols.

  • That's why atomic operations are so costly. Not only do they flush the cache line of the memory being atomically accessed. In order to maintain cache consistency and coherency, the CPU has to flush, potentially, its entire cache.

  • You can implement a copying collector in 100 lines of easy-to-read C, if you're not too worried about blowing the stack.

    T.S. brought me a book, Garbage Collection by Jones and Lins (read a review). It was written in 1996, which in the GC field is a long time ago; but I don't think the basics have changed. Still, it's odd to see Fortran and Modula-2 used as examples, rather than various JVMs, the CLR, Python, JavaScript, and Ruby.

    Working for Mozilla is like going back to school, except I'm motivated.

  • Suppose I want to create a set of C/C++ macros that application code can use that allows me to swap out different kinds of memory management systems, switching between reference counting, mark-and-sweep, copying GC, and so on, as a compile-time option. In C/C++ this is actually practically impossible, because the range of operations you want to hook, and the amount of detail the GC might need to know about them, is so great. To do a good job, each memory management scheme will want to optimize away a lot of work, which you can't do without still more cooperation from the compiler or the programmer.

    Java doesn't have this problem. The JVM can hook whatever it wants because the bytecode it loads and executes is really only partly-compiled. With Jikes RVM, you can in fact choose from a huge array of garbage collectors at the time you compile your VM.

19 October 2007

This week I learned...

  • bug.gd is a web site dedicated to finding explanations and workarounds for every cryptic error message your computer can generate (with a little help from you).

  • The Canadian Standards Association charges Canadian citizens hundreds of dollars for (copy-protected) CD-ROMs containing the detailed regulations they're required to follow to do their work.

  • Mississippi medical examiner Stephen Hayne claims he performs 1500-1800 autopsies each year. According to the article, “People who have visited Hayne’s practice during an autopsy session have described seeing as many as 15 bodies opened at once, with Hayne and his assistants smoking cigars, sometimes even eating sandwiches, as they go from one body to the next.” It's not just gross. Hayne performs about 80% of the state's forensic autopsies. He tells prosecutors what they want to hear. He testifies in court. People go to jail. I couldn't put this article down. Must read.

  • The Jikes Research Virtual Machine is a JVM written in Java. (The garbage collector is written in Java, too. In fact the garbage collector is pluggable, which makes Jikes RVM a nice testbed for experimental new GC strategies.)

  • An “on-the-fly” garbage collector is one that is neither “stop-the-world” nor “fully concurrent”— that is, it neither stops all application threads at any point while it collects, nor works entirely in a separate background thread. Instead, it asks each application thread to pause periodically, at their convenience, to do some GC work.

  • Xiao-Feng Li, a developer on the Apache Harmony JVM project, has a blog on programming language runtime implementation techniques. The posts often focus on GC.

  • In C++, static_cast<B2 *>(d), where d is of type D * and B2 is the second base class of D, does not compile to a simple addition operation, even though the offset of the B2 within D is a constant known to the compiler. The catch is that if d is null, the result must be null. So the compiled result includes a conditional branch, essentially (d == NULL ? NULL : (intptr) d + some bytes).