21 December 2007

This week I learned...

  • A five-step checklist saved about 1,500 lives in Michigan intensive care units in 18 months. It lists the five steps a physician should take, when putting a tube into a patient, to prevent infection.

  • Simple array lookups like arr[i], where i is an int, are troublesome for x64 compilers, because int is a 32-bit type while pointers are 64 bits. To generate the fastest possible code, the compiler must prove that the high 32 bits of the register containing i are all zeroes.

  • OpenGL is horrible for 2D graphics because 2D graphics are usually plainer and sit statically on the screen for a lot longer than 3D graphics. So a little imprecision is a very bad thing, especially for text. 3D cards don't have a whole lot of DWIM in them; if your use of an API is the slightest bit off, you can get bizarrely ugly results, and it's very hard to debug.

    This is an instance of a principle that I think works against efforts like bug.gd: the most perplexing things computers do don't come with error messages. The computer just behaves funny, giving no indication that it even knows anything unusual is happening.

  • An interference graph is a way of thinking about scheduling constraints. For example, many compilers use an interference graph to assign variables to CPU registers. The vertices of the graph are the variables. Each edge connecting two vertices indicates that those two variables cannot live in the same register—their lifetimes overlap; they would interfere. The problem of register allocation is just a graph coloring problem on the interference graph.

    Graph coloring is NP-hard, though. Some JIT compilers do a linear scan instead, plopping each variable into the next available register. This may not get every variable into a register, even in cases where graph coloring would. But it's fast. (A toy version of the graph-coloring approach is sketched below.)
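
Here's that toy graph-coloring sketch. Everything in it is invented for illustration: five variables with made-up live ranges, a three-register machine, and a greedy first-free-color policy standing in for real coloring heuristics. Actual allocators also handle spilling decisions, coalescing, and control flow, not just simple intervals.

#include <cstdio>
#include <vector>

// Each variable is live over a made-up interval [start, end). Two variables
// interfere when their intervals overlap, and interfering variables must
// get different registers (different colors).
struct Var { const char* name; int start, end; };

int main() {
    std::vector<Var> vars = {
        {"a", 0, 4}, {"b", 1, 3}, {"c", 2, 6}, {"d", 4, 7}, {"e", 5, 8},
    };
    const int NREGS = 3;
    std::vector<int> reg(vars.size(), -1);      // assigned register, or -1

    for (int i = 0; i < (int)vars.size(); i++) {
        bool taken[NREGS] = {};                 // colors used by neighbors
        for (int j = 0; j < i; j++) {
            bool interfere = vars[i].start < vars[j].end &&
                             vars[j].start < vars[i].end;
            if (interfere && reg[j] >= 0)
                taken[reg[j]] = true;
        }
        for (int r = 0; r < NREGS; r++)         // greedy: first free color
            if (!taken[r]) { reg[i] = r; break; }
        if (reg[i] >= 0)
            std::printf("%s -> r%d\n", vars[i].name, reg[i]);
        else
            std::printf("%s -> spill\n", vars[i].name);
    }
}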

Phone interviews are educational. :)

Merry Christmas!

20 December 2007

Today's recommendations for me

Amazon thinks I might like the following items:

  • György Ligeti Edition 2: A capella choral works
  • Galois theory for beginners
  • General topology
  • Infinity and the mind
  • Banana

Banana?

It turns out Amazon recommended Banana for me because I purchased Fried Egg. So that's all cleared up.

14 December 2007

This week I learned...

  • This T-shirt claims that “Under extreme stress, an octopus will eat its own arms.” According to Wikipedia, this is a “common belief”, though I'd never heard of it. I'm suspicious.

  • Thanks to johnath, I now know that self-actualization tops Maslow's hierarchy of needs.

    The Internet, incidentally, is all about self-actualization. Depending on my mood, this strikes me as either dumb or moderately profound.

  • Good Lisp compilers, I'm told, do fancy escape analysis on closures, so they can detect when the upward funarg problem doesn't occur and in those cases stack-allocate the captured variables. Nice one. In Python, by contrast, all captured variables become heap-allocated “cells”. I hear SpiderMonkey still keeps the whole activation record alive. I had thought that practice was long dead; I know I've seen it characterized on Lambda the Ultimate as “stupid”.

  • I knew this was there somewhere. In gdb, the commands command lets you specify debugger commands that run when a breakpoint hits. For example, you can have gdb print some variables and continue. Useful.

  • Also in gdb: set scheduler-locking on prevents other threads from running while you're stepping in a thread. Not available on all platforms, apparently; but this is the only sane way to debug a multithreaded program.

  • Also in gdb: thread apply all runs a debugger command on every thread in the process you're debugging. I have yet to use it without crashing gdb, though.

  • Even in a very simple 3D model, physics is very hard to get right. I think faking it using an easing function would get better results (one sample easing function is at the end of this post). Kind of sad.

...and more about static single assignment form, but I have much more to read there.
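
The easing function I have in mind is something like smoothstep, the classic choice; the animation setup below is just an invented example, not anything from the model above.

#include <cstdio>

// smoothstep eases in and out, with zero velocity at both endpoints.
// Give it animation progress t in [0, 1]; it returns eased progress.
double smoothstep(double t) {
    return t * t * (3.0 - 2.0 * t);
}

int main() {
    // Fake "physics": slide an object from x = 0 to x = 100 over one
    // second by easing, rather than by integrating forces.
    for (double t = 0.0; t <= 1.0; t += 0.25)
        std::printf("t=%.2f  x=%6.2f\n", t, 100.0 * smoothstep(t));
}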

07 December 2007

This week I learned...

  • Remember when I linked ColorSchemer Online? Fun toy, huh? Well, forget about it. What you want is the ColorSchemer Gallery. Three thousand color schemes created by actual graphic designers or sampled from art and nature.

  • How does a really good garbage collector compare to explicit memory management (malloc/free)? According to a 2005 paper (PDF) by Matthew Hertz and Emery D. Berger, the two strategies are pretty competitive, and GC can even be faster than malloc, as long as memory is plentiful. To remain on par with malloc, a GC program needs about 4 times as much heap memory!

    This may matter to me pretty soon, as we're working on replacing Mozilla's reference counting with a conservative tracing GC.

  • I had never even heard of static single assignment form before around June, but now that I know a little about it, it's everywhere. GCC uses it. Java JITs use it. LLVM code is in SSA form by construction. The key insight behind the HotpathVM tracing JIT, where I first ran across SSA, is that a lot of the well-known SSA optimization algorithms become trivially easy when you run them on a single basic block at a time.

    Last week, apropos of nothing, someone described VHDL to me as “SSA-like”. And you know... that's a pretty good way of explaining it.

    Having just the most basic understanding of SSA has greatly changed my expectations of what optimizations a compiler will perform. Not that they're necessarily any more accurate now; just changed. (A toy illustration of the renaming is at the end of this post.)

  • I learned a couple things from the examples in this paper: Threads cannot be implemented as a library by Hans Boehm. I finally got around to actually reading (most of) those Intel slides on CPU memory models I posted over a month ago, and learned a lot there. To say that this stuff is not widely understood by people who write real-world multithreaded code is an understatement.

  • I learned how an OpenGL program knows what you just clicked on. It's clever. First it applies a transform that maps that pixel of your view to the unit cube. Then it does a new pass over all the polygons in the whole scene, just as it does when rendering the scene—only this time it's not drawing anything. It's just interested in knowing which polygons intersect the unit cube, a fairly simple calculation.
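
And here's the toy SSA illustration: renaming a single straight-line basic block, which is the easy case mentioned above. The three-instruction block and its representation are invented for the example; real SSA construction also has to insert φ-functions where control flow merges.

#include <cstdio>
#include <map>
#include <string>
#include <vector>

// One instruction: dest = args[0] op args[1].
struct Assign {
    std::string dest;
    std::vector<std::string> args;
    std::string op;
};

int main() {
    // A straight-line basic block: x = a + b; x = x + 1; y = x * 2
    std::vector<Assign> block = {
        {"x", {"a", "b"}, "+"},
        {"x", {"x", "1"}, "+"},
        {"y", {"x", "2"}, "*"},
    };

    std::map<std::string, int> version;   // latest SSA version of each name
    auto use = [&](const std::string& v) {
        if (!version.count(v))
            return v;       // constants and incoming values keep their names
        return v + std::to_string(version[v]);
    };

    for (auto& insn : block) {
        for (auto& a : insn.args)
            a = use(a);                                     // rename uses first,
        insn.dest += std::to_string(++version[insn.dest]);  // then a fresh def
    }

    for (const auto& insn : block)
        std::printf("%s = %s %s %s\n", insn.dest.c_str(), insn.args[0].c_str(),
                    insn.op.c_str(), insn.args[1].c_str());
    // Prints: x1 = a + b, x2 = x1 + 1, y1 = x2 * 2. Each name is now
    // assigned exactly once, which is the whole point of SSA.
}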

02 December 2007

Black-Eyed Pea Dip

Heat and drain:

  • 1 can Bush's black-eyed peas

You can just zap them in the microwave for 2 minutes.

Mash them, then mix in:

  • 1 tsp. seasoned salt
  • 2 tbsp. chopped onions
  • 1 tbsp. jalapeños (or salsa or diced tomatoes)
  • 1½ cup grated cheddar
  • ¼ cup unsalted butter, softened
  • 1 3-oz. can of deviled ham

Pour the mixture into a 9-inch baking dish. Sprinkle on top:

  • 3 oz. grated mozzarella cheese

Bake at 350°F for 25 minutes. Serve with corn chips.

21 November 2007

This week I learned...

  • Net-casting spiders spin small webs which they hold in their long, spindly legs and slap down on unsuspecting insects. Amazing.

  • In OpenGL, textures are applied after lighting has already been applied to the colors of whatever material you're drawing. So once you get a material to look the way you want, if you then try to put a red spot on it, using a texture with some red parts and some transparent parts, the red parts aren't affected by lighting. It looks really dumb.

    I think the fix is to draw the polygon twice, applying the “decal” in a second pass. But to do that, you have to actually do things right—turn off depth-buffering and sort all your polygons. Blah.

  • OpenGL contains a lot of stuff that now seems quaint, like stippling. And support for 8-bit graphics modes.

  • Many CPUs (including, for example, PowerPC) don't have an instruction for the integer “modulo” operation, the one spelled % in C and Java. Compilers expand it into a divide, a multiply, and a subtract; there's a sketch at the end of this post. (This is mentioned in the HotpathVM paper; I must have skimmed past it the first time I read it.)

  • To decode g++ mangled names, pipe them to c++filt! Actually, Waldo told me about this last week. It helped me figure out some puzzling linker errors (you know the sort— “undefined symbol ??XZ13Frz__F21Z6K9__”).

  • A good merge sort is faster than quicksort for real-world data. You can easily take advantage of certain kinds of pre-existing order in the array. Timsort is a merge sort implementation that takes that idea and, er, runs with it.

  • Python's list.sort() is strangely wasteful when you use the key= named parameter. It allocates a separate, reference-counted temporary object for each item in the list. (It should allocate a single temporary array.)

  • Maintaining a Mercurial queue containing large patches to a project is about as hard as maintaining a fork.
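
The promised sketch of the % expansion. C (since C99) and C++ define integer division to truncate toward zero and guarantee (a/b)*b + a%b == a, so the remainder can be recovered from the quotient:

#include <cassert>

// On a chip with a divide instruction but no modulo, a % b becomes:
int mod(int a, int b) {
    int q = a / b;       // one hardware divide
    return a - q * b;    // remainder via multiply and subtract
}

int main() {
    assert(mod(7, 3) == 7 % 3);      // 1
    assert(mod(-7, 3) == -7 % 3);    // -1: remainder takes the dividend's sign
    assert(mod(7, -3) == 7 % -3);    // 1
}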

And a bunch of stuff about compilers that isn't really interesting.

According to a coloring book we bought for the kids, the black spots on a strawberry are not seeds but tiny fruits that contain the seeds. Wikipedia does say that “from a technical standpoint, the seeds are the actual fruits of the plant, and the flesh of the strawberry is modified receptacle tissue.” Appetizing! But I still say if your strawberry has black spots on it, don't eat it.

16 November 2007

This week I learned... (Toronto edition)

It was a good week, because I was stuck in a room with Benjamin Smedberg and Taras Glek for a couple days.

  • Virtual method calls are branch-predicted these days. What is the world coming to?

  • When you fly from Canada to the U.S., you go through customs in Canada.

  • I learned a little about why Mozilla's security code is the way it is. I hate the wrapper model, but the model I prefer, which depends on examining the stack to find unauthorized callers, has problems too.

    BS claims stack-checking might be even more fragile, because it depends on the stack information being correct. For example, sometimes C++ code calls methods in order to block something evil that a content script is trying to do. If that C++ code neglects to say "Simon says", the blocking doesn't happen. Apparently a bug like this actually happened at one point (story embedded in document). I still think the wrapper model is horrible and not to be taken seriously, unless it's enforced with static analysis.

    I will say this: stack checking would have to be optimized for Mozilla. Performance is a problem. Compare Mozilla's sandbox to the Java applet sandbox. Applets are an easy case. The security checks don't normally trigger at all—if they do hit, it's either a bug or the applet is malicious. Mozilla, by contrast, has to do an access check each time a script does anything to the DOM. It's easy to optimize those checks away using wrappers; not so easy with stack checking (but I don't know much about it).

  • I'm told the line numbers in Unix debugging symbols are limited to 16 bits. Source files with over 64K lines of code can confuse your debugger. Haven't tested this.

  • Vmgen is a tool for auto-generating fast interpreters. (Paper, PDF, 30 pages.)

  • I learned why an interpreter that uses while (1) switch (*bytecode++) {...} might be slower than one that uses an explicit computed goto at the end of each case to jump directly to the case for the next bytecode. It has to do with how CPUs do branch prediction. The CPU keeps a table of branch instructions and where they usually end up branching to. This allows it to predict (guess) where each branch is going most of the time, keeping the pipeline full. If your interpreter uses a computed goto at the end of each case, the CPU makes a separate entry in this table for each of those gotos. So the CPU can do a good job of predicting what the next opcode is, and it'll automatically tune this intelligence to the specific bytecode the interpreter is executing! By contrast, the single indirect branch in while (1) switch (*bytecode++) {...} is impossible to predict using this simple historical table. The pipeline is often derailed, which kills performance. (A sketch of the computed-goto style is at the end of this post.)

  • Tracepoints are breakpoints that don't stop the program for long. When a tracepoint hits, the debugger runs a little code that logs the contents of certain pieces of memory, then lets the program continue. By design, the tracepoint can't modify the program state. It treats the process as read-only. This feature was invented to help debug critical software while it's live. (!)

    But there are two even crazier applications. If you know what you're interested in, tracepoints can be used to implement a debugger that can step back and forward in time, with much less overhead than the usual approach of logging every memory write. GDB only supports tracepoints in remote debugging, but it's been proof-of-concepted for debugging the Linux kernel from a user-mode debugger (arrrgh, too cool!).
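
Here's a sketch of that computed-goto dispatch style. It uses GCC's “labels as values” extension (&&label and goto *), so it's not standard C++; the two toy opcodes are invented for the example.

#include <cstdio>

// Two toy opcodes: 0 = increment the accumulator, 1 = halt.
int run_threaded(const unsigned char* bytecode) {
    static void* const dispatch[] = { &&op_inc, &&op_halt };
    int acc = 0;
    const unsigned char* pc = bytecode;
    goto *dispatch[*pc++];
op_inc:
    acc++;
    goto *dispatch[*pc++];   // a separate indirect branch per opcode, so each
op_halt:                     // one gets its own branch-predictor history
    return acc;
}

int main() {
    const unsigned char program[] = { 0, 0, 0, 1 };
    std::printf("%d\n", run_threaded(program));   // 3
}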

09 November 2007

This week I learned...

  • Chicken cacciatore isn't at all what I thought it was. Sure is tasty though. The recipe in The Joy of Cooking is recommended; here's a similar recipe (video).

  • Searching YouTube for recipes is an awfully delicious time sink. My mouth is watering.

  • Color Schemer exists—and does a pretty darn good job. Very helpful for the color-impaired like me.

  • Dot (PDF) is a very simple little language for describing graphs. graphviz turns dot files into professional-quality 2D images in various formats. You just describe the nodes and arrows, with as much or as little style info as you want; graphviz does the layout and draws the lines. The Mac version won an Apple Design Award.

  • LLVM is a VM for low-level languages like C and C++. Too much supercool stuff going on here for me to even begin to describe.

    (Benjamin Smedberg is interested in maybe using LLVM in Mozilla 2, as the mechanism for JavaScript-C++ interop. And—heh—maybe a lot more.)

  • Dijkstra wrote about Why numbering should start at zero. The usability note about MESA is especially interesting. Ruby has built-in operators M .. N and M ... N that create the ranges M ≤ i ≤ N and M ≤ i < N, respectively—or is it the other way round? (A note on how C++ adopts Dijkstra's convention is at the end of this post.)

  • Finalizers are scary! It's unpredictable when they'll be called, and this lays more traps than I had realized. Say a finalizer removes an entry from some data structure. The system could fire off that finalizer just about any time. It might happen to call it while you're in the middle of modifying that data structure. Or some other thread is. Uh oh.

    You might think, “Oh, you can work around that using locking.” But that has scary consequences of its own. If the system uses a stop-the-world garbage collector, and one of the threads it stops is holding the necessary lock, you immediately get deadlock. If your thread is holding the lock at the time the finalizer decides to execute on the same thread (something Java doesn't do, but Python could), you have another kind of problem. If you used a non-reentrant lock, you'll deadlock (yes, with finalizers, you can deadlock even if your program is single-threaded). Otherwise, your critical section will be reentered anyway, as if you didn't have a locking scheme in place at all!

    I've heard some JVMs avoid potential deadlocks by running finalizers concurrently with ordinary code, in a separate thread. But that seems even scarier. It means non-threadsafe code can be called at unpredictable times from a magical system thread that most programmers don't even know about. Yikes!
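
Back to Dijkstra for a moment: his preferred convention, M ≤ i < N, is exactly the half-open [begin, end) style that C++ iterator ranges use, and the payoff is just what he describes. A tiny illustration:

#include <cstdio>

int main() {
    int a[] = { 3, 1, 4, 1, 5, 9 };
    int* begin = a;
    int* end = a + 6;    // "end" points one past the last element

    // Length is end - begin, with no +1 or -1 fencepost corrections,
    // and an empty range is simply begin == end.
    std::printf("length: %d\n", (int)(end - begin));
    for (int* p = begin; p != end; ++p)
        std::printf("%d ", *p);
    std::printf("\n");
}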

02 November 2007

This week I learned...

  • To view the wiki-source of any MediaWiki page, add ?action=raw to the URL. (Good for scripting.)

  • Unix has system calls like mlock, mincore, mprotect, and madvise that interact with the virtual memory manager. (Some are more standard than others.) I only knew about the venerable mmap.

  • There's a fast, easy way to avoid the agonizing worst-case fragmentation you can get when you use a simple size-class/freelist implementation of malloc: per-page freelists. (A toy sketch is at the end of this post.)

  • You can implement a copying collector in 100 lines of easy-to-read C that won't blow the stack!

  • The Encyclopædia Britannica is about 33,000 pages long, and I'm not the only one to raise an eyebrow at the volume titled “Menage - Ottowa”.

I also keep learning about good decisions the JVM designers made, but they're even less exciting than the above.
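
Here's the promised toy sketch of per-page freelists. It only shows the idea, with invented sizes: a real allocator would carve pages out of mapped memory and would find a block's page from the block's address, instead of being handed the Page pointer.

#include <cstddef>
#include <map>
#include <vector>

const std::size_t PAGE_SIZE = 4096;

// Every block on a page has the same size, and each page keeps its own
// free list, so fragmentation is confined page by page.
struct Page {
    std::size_t block_size;
    std::vector<char*> free_list;
    char* memory;

    explicit Page(std::size_t size)
        : block_size(size), memory(new char[PAGE_SIZE]) {
        for (std::size_t off = 0; off + size <= PAGE_SIZE; off += size)
            free_list.push_back(memory + off);
    }
};

struct Allocator {
    // size class -> pages of that class that still have free blocks
    std::map<std::size_t, std::vector<Page*>> partial_pages;

    void* alloc(std::size_t n) {
        std::size_t size = 8;               // round up to a size class
        while (size < n) size *= 2;
        std::vector<Page*>& pages = partial_pages[size];
        if (pages.empty())
            pages.push_back(new Page(size));
        Page* p = pages.back();
        char* block = p->free_list.back();
        p->free_list.pop_back();
        if (p->free_list.empty())
            pages.pop_back();               // page is full; stop offering it
        return block;
    }

    void free_block(Page* p, char* block) {
        if (p->free_list.empty())           // page has room again
            partial_pages[p->block_size].push_back(p);
        p->free_list.push_back(block);
    }
};

int main() {
    Allocator a;
    void* p = a.alloc(40);    // served from a page of 64-byte blocks
    void* q = a.alloc(40);    // same page, next block
    (void)p; (void)q;
}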

26 October 2007

This week I learned...

  • How to find the median of n items in O(n) comparisons. The easiest algorithm is merely O(n) expected time. Can you figure out how it works? Hints in the comments.

  • You can sort items with string keys in O(kn) time, where k is the average length of the strings. But it requires, say, 5n words of extra table space. This is amusing because in JavaScript, Array.prototype.sort() sorts objects by their toString(), so it is always sorting strings. (A comparison sort in this case can be as bad as O(kn log n), since comparisons are worst-case O(k).)

  • The memory model that processors present to multithreaded programs running on multiprocessor machines is frighteningly subtle: A Taxonomy of Multiprocessor Memory-Ordering Models (44 slides, PDF). The presentation cheerfully explains about 13 different ways CPUs can (by design) shuffle the order of memory accesses, causing more or less confusion on other CPUs. The reason they have to do this is per-CPU caches, which are a major speed win; the techniques multiprocessors use to maintain a modicum of sanity are called cache coherency protocols.

  • That's why atomic operations are so costly. Not only do they flush the cache line of the memory being atomically accessed; to maintain cache coherency, the CPU potentially has to flush its entire cache.

  • You can implement a copying collector in 100 lines of easy-to-read C, if you're not too worried about blowing the stack. (A sketch of the non-recursive variant, Cheney's algorithm, is at the end of this post.)

    T.S. brought me a book, Garbage Collection by Jones and Lins (read a review). It was written in 1996, which in the GC field is a long time ago; but I don't think the basics have changed. Still, it's odd to see Fortran and Modula-2 used as examples, rather than various JVMs, the CLR, Python, JavaScript, and Ruby.

    Working for Mozilla is like going back to school, except I'm motivated.

  • Suppose I want to create a set of C/C++ macros that application code can use, so that I can swap in different kinds of memory management systems (reference counting, mark-and-sweep, copying GC, and so on) as a compile-time option. In C/C++ this is actually practically impossible, because the range of operations you want to hook, and the amount of detail the GC might need to know about them, is so great. To do a good job, each memory management scheme will want to optimize away a lot of work, which you can't do without still more cooperation from the compiler or the programmer.

    Java doesn't have this problem. The JVM can hook whatever it wants because the bytecode it loads and executes is really only partly-compiled. With Jikes RVM, you can in fact choose from a huge array of garbage collectors at the time you compile your VM.
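
And here's the promised sketch of the non-recursive variant, Cheney's algorithm, run over a simulated heap. Real collectors copy raw memory and stash forwarding pointers in the evacuated objects themselves; the two-field cells and the std::map here are simplifications.

#include <cstdio>
#include <map>
#include <vector>

struct Cell { int field[2]; };   // each field is NIL or the index of a cell
const int NIL = -1;

std::vector<Cell> collect(const std::vector<Cell>& from, std::vector<int>& roots) {
    std::vector<Cell> to;
    std::map<int, int> forwarded;   // from-space index -> to-space index

    auto copy = [&](int i) {
        if (i == NIL) return NIL;
        if (!forwarded.count(i)) {
            forwarded[i] = (int)to.size();
            to.push_back(from[i]);  // evacuate the cell into to-space
        }
        return forwarded[i];
    };

    for (int& r : roots)
        r = copy(r);

    // Cheney's trick: to-space itself is the work queue. Scanning it left
    // to right visits every copied cell once, with no recursion and no
    // auxiliary stack, so deep structures can't blow anything.
    for (int scan = 0; scan < (int)to.size(); scan++)
        for (int k = 0; k < 2; k++) {
            int f = copy(to[scan].field[k]);
            to[scan].field[k] = f;
        }
    return to;
}

int main() {
    // From-space: cell 0 points at cells 1 and 2; cell 3 is garbage.
    std::vector<Cell> heap = { {{1, 2}}, {{NIL, NIL}}, {{1, NIL}}, {{0, 0}} };
    std::vector<int> roots = { 0 };
    std::vector<Cell> new_heap = collect(heap, roots);
    std::printf("%d of %d cells survive\n", (int)new_heap.size(), (int)heap.size());
}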

19 October 2007

This week I learned...

  • bug.gd is a web site dedicated to finding explanations and workarounds for every cryptic error message your computer can generate (with a little help from you).

  • The Canadian Standards Association charges Canadian citizens hundreds of dollars for (copy-protected) CD-ROMs containing the detailed regulations they're required to follow to do their work.

  • Mississippi medical examiner Stephen Hayne claims he performs 1500-1800 autopsies each year. According to the article, “People who have visited Hayne’s practice during an autopsy session have described seeing as many as 15 bodies opened at once, with Hayne and his assistants smoking cigars, sometimes even eating sandwiches, as they go from one body to the next.” It's not just gross. Hayne performs about 80% of the state's forensic autopsies. He tells prosecutors what they want to hear. He testifies in court. People go to jail. I couldn't put this article down. Must read.

  • The Jikes Research Virtual Machine is a JVM written in Java. (The garbage collector is written in Java, too. In fact the garbage collector is pluggable, which makes Jikes RVM a nice testbed for experimental new GC strategies.)

  • An “on-the-fly” garbage collector is one that is neither “stop-the-world” nor “fully concurrent”—that is, it neither stops all application threads at any point while it collects, nor works entirely in a separate background thread. Instead, it asks each application thread to pause periodically, at its convenience, to do some GC work.

  • Xiao-Feng Li, a developer on the Apache Harmony JVM project, has a blog on programming language runtime implementation techniques. The posts often focus on GC.

  • In C++, static_cast<B2 *>(d), where d is of type D * and B2 is the second base class of D, does not compile to a simple addition operation, even though the offset of the B2 within D is a constant known to the compiler. The catch is that if d is null, the result must be null. So the compiled result includes a conditional branch, essentially (d == NULL ? NULL : (intptr) d + some bytes).
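
A quick demonstration; the class names are invented, and the exact offset is up to the compiler:

#include <cstdio>

struct B1 { int x; };
struct B2 { int y; };
struct D : B1, B2 {};

int main() {
    D d;
    D* p = &d;
    B2* b = static_cast<B2*>(p);           // non-null: p plus a fixed offset
    std::printf("%p -> %p\n", (void*)p, (void*)b);

    D* pnull = 0;
    B2* bnull = static_cast<B2*>(pnull);   // must remain null, hence the branch
    std::printf("null -> %p\n", (void*)bnull);
}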

16 September 2007

The principle of explosion

Pop quiz:

1 - 1 + 1 - 1 + 1 - 1 + 1 - 1 + ... = ?

It's obviously 0, right? Or maybe it's 1. In the 17th and 18th centuries, everyone apparently thought the correct answer was ½ (and it wasn't because they were stupid back then: this includes people like Leibniz and Euler).

Eh, so math is inconsistent. So what?

It is important to point out that it is not enough to consider at the same time two conflicting statements in order to develop in pupils' minds the awareness of an inconsistency and the necessity of second thoughts (Schoenfeld, 1985): the perception of some mutually conflicting elements does not always imply the perception of the situation as a problematic one (Tirosh, 1990).

Infinite series: from history to mathematics education (PDF), Giorgio T. Bagni.

Huh.

Now, maybe this is because math doesn't make a whole lot of sense to most kids to begin with. But I think the main cause is that kids, like the rest of us, are used to things being inconsistent sometimes. And they live with it. I mean, what are you going to do?

Well, let me tell you something. In math, you can't live with a contradiction.

The principle of explosion is built into the fundamental rules of logic, rules that both mathematicians and ordinary people use to reason with. Ex falso sequitur quodlibet: from a contradiction, anything follows. Or as an old friend of mine used to say, after you swallow the first pill, the rest go down real easy.

In math, if you accept a single contradiction, the entire system comes crashing down around you.

(Now there's such a thing as paraconsistent logic, in which inconsistencies are not so destructive. But it's quite different from ordinary logic, and not many people are familiar with it.)

In the above case, mathematicians eventually discovered a formal notion of “convergence” and found that the sum 1 - 1 + 1 - 1 + ... does not converge. That is, there's no answer, just as there's no answer for the sum 1 + 2 + 3 + 4 + ..., and for the same reason: you can go on for as long as you like, and your numbers are never going to converge on some specific value.

Is this a cop-out? It's a hard fact of life that some mathematical problems just don't have answers. The ancients considered 2 - 7 to be undefined, because the answer would be less than nothing, which was clearly nonsense. Today we have negative numbers, but other things, like division by zero, are still undefined. Given all that, maybe it's not so surprising if expressions that end in the innocent-looking “+ ...”, as if to say “oh don't mind me, I'm just a little infinite series, tra la”, sometimes fall into this category.
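
Incidentally, you can watch all of this numerically. The partial sums of 1 - 1 + 1 - 1 + ... bounce between 1 and 0 forever, but their running average settles down to ½, which (in modern terms, this is Cesàro summation) is one way to make sense of the 17th-century answer:

#include <cstdio>

int main() {
    int partial = 0;
    int total_of_partials = 0;
    for (int n = 1; n <= 1000; n++) {
        partial += (n % 2 == 1) ? 1 : -1;   // +1, -1, +1, -1, ...
        total_of_partials += partial;       // partial sums: 1, 0, 1, 0, ...
    }
    std::printf("partial sum after 1000 terms: %d\n", partial);   // 0
    std::printf("average of the partial sums: %g\n",
                total_of_partials / 1000.0);                      // 0.5
}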

Floating

It's not until you study math a little bit that you realize just how awful floating-point arithmetic is. I mean, so it's a little inexact. So what? But the following are examples of statements that are absolutely true for integers, rationals, and reals, but not for floating-point numbers (even setting aside floating-point infinities and “not-a-number”s):

a - (a - b) = b

If b > 0, then a + b > a.

The associative law: (a + b) + c = a + (b + c)

The distributive law: a × (b + c) = (a × b) + (a × c)

Cancellation: If a × b = a × c and a ≠ 0, then b = c.

In short, if you've ever done any reasoning about floating-point numbers, you were probably wrong.

It's easy to come up with formulas that floating-point arithmetic gets seriously wrong, especially for numbers very close to zero.

The inexactness is impossible to track in a useful way, so you never know just how bad the result is. Typically you just assume it's almost right until you find out it's wrong. The only surprising thing is that it works so often.
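
Here are a few of the broken laws above, observed in ordinary IEEE doubles; the particular constants are arbitrary:

#include <cstdio>

int main() {
    double a = 1e16, b = 1.0;
    std::printf("%g\n", a - (a - b));   // prints 0, not b = 1
    std::printf("%d\n", a + b > a);     // prints 0: b > 0, yet a + b == a

    double s1 = (0.1 + 0.2) + 0.3;      // 0.6000000000000001
    double s2 = 0.1 + (0.2 + 0.3);      // 0.6
    std::printf("%d\n", s1 == s2);      // prints 0: associativity fails
}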

26 August 2007

Where to eat in Nashua, NH

Every town has something just a little different to offer. Here are the places I would recommend if you had a day to spend in Nashua, New Hampshire, U.S.A.

Breakfast: The City Room Cafe. Try the Cajun sirloin crepes. (They're not what I expected; they should be called "crepes curry" and they're a treat.) Check out the specials; they're usually inventive and scrumptious. Or have a sausage-apple-cheddar omelette.

Lunch: Nashua Garden, a pub on Main Street, for the best sandwiches in town. Order a "Big Poppy", loaded, plus hummus and roasted red peppers.

Dinner: Brazilian BBQ on Main Street. If you've never had Brazilian barbecue before (I hadn't), this is a must. Alternatively, Lilac Blossom serves great Chinese. The Crispy Whole Fish is not to be missed. Lamb in Two Styles is almost as good, but is made of fluffy, innocent baby sheep, alas.

Now, if you spurn variety entirely, instead have a nice, conventional breakfast at Hollis Country Kitchen (way out west), lunch at Nashua House of Pizza (just a pizza place, but unusually good), and dinner at Michael Timothy's on Main Street.

P.S. Adventurous or not, if it happens to be Sunday morning when the urge strikes you, have a go at Michael Timothy's weekly jazz brunch. It features live music, an excellent buffet, roast beast and omelettes made to order, and a better grade of coffee. Desserts are included in the price of admission. Singularly pleasant.

17 July 2007

Too cool for screenshots?

Mac OS X comes with a command-line utility, pbcopy, that copies its stdin to the clipboard. pbpaste dumps the clipboard to stdout.

13 July 2007

Usability gripes

I wonder if I'm the only one who finds it ironic that LDAP, of all the Internet protocols the one most directly aimed at fixing the problem of discoverability, is, itself, undiscoverable:

I happen to know the hostname of my company's LDAP server, so I only need to figure out 7 more settings before I can, you know, find out my colleagues' email addresses.

Uploading this screenshot was an ordeal, by the way. Mac OS has screen-capture hotkeys (Command-Alt-Shift-3 or something absurd like that), but it can't capture just one window. It also comes with a separate screen-capture program (Grab)... which unhelpfully saves your screenshots in TIFF format. (There is no way to configure this.) And Mac OS does not come with any basic image-hacking utility like Paint. Great style, awful usability.

While I'm on a roll:

$ nslookup
> help
The 'help' command is not yet implemented.

Oh, thank you. Thank you so much.

Update: Some great Mac OS screenshot tips are in the comments. (Thank you!)

11 July 2007

Overheard in a car

Buzz (not his real name) is 3; Woody is 22 months.
...
Buzz:  Yes you are!
Woody:  I'm not a are!
Buzz:  Yes you are!
Woody:  I'm not a are!
...

06 July 2007

C puzzle

I was reading source code this morning and came across this curious line in a header file:

JS_STATIC_ASSERT(sizeof(JSGCThing) >= sizeof(JSString));

The effect of this macro is to check at compile time that sizeof(JSGCThing) >= sizeof(JSString). If the condition is not met, compilation fails with an error.

The puzzle is: how does this work? A hint is in the comments.

04 July 2007

Things die

Around seven thousand languages are spoken today. Many are spoken only by a few elderly people. Over 500 such languages are headed for extinction.

Some linguists go much further, claiming that more than half of the languages spoken in 2000 will be extinct by 2100.

24 June 2007

Math links

A mathematician's apology (PDF). The early chapters are mostly grumbling about getting old. But then he starts talking about mathematics, with real poetry and love for the subject. Very nice.

Eudoxus. This ancient Greek anticipated Dedekind cuts and calculus. He also mapped the heavens and the earth. Yet no work of his survives—only frequent citations in the works of Euclid, Aristotle, Hipparchus, and more.

Triangle centers. Every triangle has a centroid. But oddly enough, that isn't the only center of a triangle. MathWorld has a picture showing over a thousand of them.

To do

Frivolous projects I would undertake, if I only had the time.
  • Submit a patch to Monotone that makes "mtn log" display log entries in chronological order.

  • Finish debugging the new, faster syntax-case implementation I have for Try Scheme. (sigh) It's so close!

  • Write about continuations.

    A good tutorial would be helpful. I haven't read anything I really liked. (Squawk of the Parrot has some very nice stuff on this.) I think a good approach would be to say up front that it's extremely generalized and hard to explain, but that we're going to work toward the definition. Then lean heavily on examples and analogies. Compare continuations to all kinds of control-flow concepts: functions; goto; return and break as functions; savepoints in video games; exceptions; generators; longjmp. Explain that all of these can be implemented fairly easily using call/cc. Then talk about continuation-passing style.

    I also want to investigate just what abstraction boundaries are violated by first-class continuations in a language that has state. I am kind of down on continuations as a language feature at the moment.

  • Read about continuations. In particular, Andrzej Filinski's thesis introducing the symmetric lambda calculus.

21 June 2007

Rooftop lab

I just started a new blog for my Mozilla work. It's called Rooftop Lab, and it'll be pretty technical.

19 June 2007

On the Cape

If you happen to be on Cape Cod sometime, visit Scargo Pottery, just off route 6A. That whole stretch of 6A is like a sprawling crafts fair, but Scargo is something special. The little pictures on their web site don't do it justice.

We were on Cape Cod for my birthday. It's a few weeks yet before the tourist season, but the weather was cooperative. Breezy, cool, sunny. Buzz was on his best behavior all day.

First we went to the Cape Cod Potato Chip factory. There's not much there, but you get to see a real factory floor. And eat potato chips. So.

Then we went to a little clock shop on Route 6A. Buzz was entranced. We talked about the clocks for, I don't know, it must have been half an hour. His favorite part was the clocks that had their faces removed so you could see the works.

After that, we went back to the beach, flew kites, climbed around on some rocks, and cooked out. A lovely day. If kites hate you, what you want is a Pocket Parafoil (here's one on eBay). Fun and ridiculously easy.

17 June 2007

Strawberry alert

If you happen to live in Nashua, NH, there's a small farmer's market on the Main Street bridge over the Nashua River. It's there every Sunday, all summer. A booth there is selling strawberries that were picked this morning.

15 June 2007

Woody goes to the park

A. is 22 months old now and insists on being called “Woody”, except when she's “Steve”.

Mommy: What did you do at the park? Did you slide down the slide?
Woody: Yeah.
Mommy: Did you run around?
Woody: Yeah.
Mommy: Did you meet some other kids?
Woody: Yeah.
Mommy: Did you eat some blueberries?
Woody: Yeah.
Mommy: Did you see a zebra?
Woody: Yeah.

08 June 2007

Productivity through eccentricity

From The Pmarca Guide to Personal Productivity (click the link, it's interesting):

[R]efuse to commit to meetings, appointments, or activities at any set time in any future day.

As a result, you can always work on whatever is most important or most interesting, at any time.

Want to spend all day writing a research report? Do it!

Want to spend all day coding? Do it!

Want to spend all day at the cafe down the street reading a book on personal productivity? Do it!

When someone emails or calls to say, “Let's meet on Tuesday at 3”, the appropriate response is: “I'm not keeping a schedule for 2007, so I can't commit to that, but give me a call on Tuesday at 2:45 and if I'm available, I'll meet with you.”

Or, if it's important, say, “You know what, let's meet right now.”

Clearly this only works if you can get away with it. If you have a structured job, a structured job environment, or you're a CEO, it will be hard to pull off.

But if you can do it, it's really liberating, and will lead to far higher productivity than almost any other tactic you can try.

There's more; if this bit appeals to you at all, the whole thing is worth reading.

By odd coincidence, pmarca is Marc Andreessen, cofounder of Netscape.

Credit: Marginal Revolution.

05 June 2007

Danger!

In Scheme and Ruby, procedures that modify variables or data structures are marked with a ! by convention:

  // Java
  static <T>
  void arraySwap(T[] arr, int i0, int i1) {
      T tmp = arr[i0];
      arr[i0] = arr[i1];
      arr[i1] = tmp;
  }

  ;; equivalent Scheme
  (define (vector-swap! v i0 i1)
    (let ((tmp (vector-ref v i0)))
      (vector-set! v i0 (vector-ref v i1))
      (vector-set! v i1 tmp)))

What's the point of this convention? Well, Scheme thinks this kind of behavior is slightly dangerous and needs to be highlighted. A little explanation:

If your variables or data structures can be changed, expressions have different values in different places within a program. The order of operations matters. The order of lines within a procedure matters. Procedures can subtly change global data as a side effect, and this can lead to rather subtle (mis)behavior.

Theorists call this language feature state. The word is intended in the sense of status, as in “the state of affairs” or “state of the Union”. C, C++, Java, Perl, Python, Ruby, and so on are stateful languages.

Maybe it sounds kind of, er, sweeping, or maybe absurd and stupid, to say that state is bad. But you already know this is true, at least sometimes: Global variables = bad. Unobvious side effects = bad. Leaving an operation half-finished when you throw an exception = bad (usually). Global state + threads = migraines.*

But what's the alternative? Well, there are alternatives, but it's beyond the scope of a quick blog post. Just like the boundary between static-typed and dynamic-typed code, the boundary between stateful and pure-functional code is getting a lot of attention these days. Interesting times.

What Scheme gets wrong here is that state-related bugs don't necessarily happen at the places where data is modified. In fact, the whole idea of marking the danger spots is a total whiff. The problem with state is that it permeates your program. The danger is everywhere a stateful variable or data structure is used, whether for read or write, directly or indirectly.

*It's not just that there are race conditions and potential crashes here. You can eliminate that with a single global lock. The problem is that a single global lock often isn't fine-grained enough; it creates a bottleneck. So you go to a fine-grained locking scheme, and that's where the migraines come in.

03 June 2007

Types

I've been working through Types and Programming Languages by Benjamin C. Pierce. It's filling in a lot of blanks for me.

According to the drafts, ECMAScript 4 will have an ambitious new type system with optional type annotations. So for example you could write:

// code without types
function sortLines(text) {
    let lines = text.split(/\r\n?|\n/g);
    lines.sort();
    return lines.join("\n") + "\n";
}

// or with
function sortLines(text : String) : String {
    let lines : [String] = text.split(/\r\n?|\n/g);
    lines.sort();
    return lines.join("\n") + "\n";
}

Exactly what effect these annotations have will take some explaining. Suffice to say, life is about to get really interesting. I'm sure I'll be writing more about this in a few weeks.

31 May 2007

A new gig

Last week I accepted a position at Mozilla Corporation, the tiny company behind the hugely popular Firefox web browser, now used by 80 million people around the world.

I'll be working full-time on Mozilla's JavaScript engine, one of the core technologies that make Firefox tick. It's my dream job. I can't believe my good fortune.

You can read the latest Mozilla news on Planet Mozilla. (Extremely cool site. Lots of Mozilla people have blogs. Planet Mozilla shows you all those blogs on one page.)

13 May 2007

Concerning gifts (and other puzzles)

What can you conclude from the following three premisses?

  1. If something is not gift-wrapped, it's not a gift.
  2. Nothing that's gift-wrapped is entirely unlike a box of chocolates.
  3. Life is a gift.

Lewis Carroll published a book of about a hundred puzzles like this one. Read it online: introduction; puzzles. My nephew IM and I stumbled upon them in Memphis last week. He pretty much knocked them out of the park one at a time.

They're fairly easy to make, if you know some logic and some algebra. Here are a few more (but Lewis Carroll's are the most sublime nonsense—you should probably try those instead.)

Concerning fashion

  1. Anyone lacking impeccable fashion sense might wear a rhinestone sombrero.
  2. Anyone who might wear a rhinestone sombrero can't dance.
  3. All penguins can dance.

Concerning animals

  1. Animals that are active during the day are either featherless or tasty—or both.
  2. No creature is both nocturnal and naturally funny.
  3. Chickens have feathers.
  4. Chickens are naturally funny.

Concerning the inhabitants of this town

  1. All the monsters in this town are carnivores.
  2. A carnivore would eat anything made of meat.
  3. No creature in this town would eat any other creature in this town.
  4. Humans are made of meat.

Concerning monsters

  1. Only monsters can make the ground tremble.
  2. Two-year-olds and raccoons get into everything (two-year-old raccoons doubly so).
  3. If something gets into everything, but it doesn't emit terrifying shrieks, it must be a raccoon.
  4. All firebreathing creatures are monsters.
  5. Opera singers can make the ground tremble.
  6. Raccoons are not human.
  7. A creature that isn't a monster doesn't have slavering fangs.
  8. If a creature emits terrifying shrieks, then either it breathes fire, it has slavering fangs, or it's an opera singer.

10 April 2007

Things of the moment

  • My current side project is fixing bugs in Python's DOM implementation.

  • Brainteaser: What if chess had a single level of undo? Specifically, after your opponent moves, you may, instead of making your next move as usual, roll the game back to the point just before your previous move and do something else instead. (Your game clock, though, would not be turned back.) How would this affect the game?

  • I'm trying to understand Haskell's existentially quantified types, and (a shallower question) why there's no Data.Hashtable.ST, darn it.

  • GoboLinux is a Linux distribution with a sane filesystem layout. From reading that one page, I like it. You can't fight with MSI for a week without realizing that the people who design installer systems have been working waaaay too hard, and that the fate of our civilization hinges on stopping them.

  • Yesno is a toy programming language in which every method call returns something immediately ...and returns the correct result eventually. It's either hilarious or awesome, probably. The obvious next step is to make it self-host.

  • In the spirit of yesno, here are two quick reviews of books I'm not done reading yet. (I'll refine these opinions later, maybe.) The Time Traveler's Wife is too sentimental and too linear, and does not contain enough adventure or suffering. Oryx and Crake has a lot of the same strengths and weaknesses as 1984, but much more angst.

31 March 2007

Mozilla 401

If you know any computer science professors at 4-year colleges, please forward this to them!

Yesterday I learned that there's a professor at Seneca College (David Humphrey) teaching a course called Topics in Open Source Development. It's a fourth-year undergrad course. Students do projects for Mozilla, the open source web browser.

Mozilla contributed guest lectures from seven core developers and mentoring for the students. (!)

I met David yesterday. He has taught the course a few times now, to a total of about 100 students, and he has interesting things to say about it, which I won't try to reproduce here. But man. This is the kind of killer course I wish they had at CBU when I was there. Every CS department needs an internship-like course where students do real projects. They must be a pain to develop and run. This is one of the coolest I've ever heard of and more "real-world" than most.

If you're interested in possibly teaching this course at your school, please contact David or me (jason.orendorff@gmail.com). You can also watch the guest lectures online, view the student project list, or read David's Mozilla development crash course.

08 February 2007

Infinity, part 1: Thomson's lamp

Back in the 1950's, the famous electrical engineer James Thomson invented the ultimate strobe light. Instead of flashing at regular intervals, Thomson's lamp (as it was called) would start out flashing slowly and quickly speed up.

The lamp had a button on it. Pushing the button caused the lamp to flash for a total of two minutes, as follows. First the lamp turned itself on. It stayed on for half of the total two minutes (one minute). Then it turned itself off and stayed off for half of the remaining time (half a minute). Then it turned itself on for a quarter of a minute, then off for an eighth of a minute, and so on. Toward the end of the two minutes, it would have been blinking pretty quickly.

The troubling question was, after the two minutes passed, would the lamp be on or off? Thomson was so anxious about the philosophical consequences of his invention that he famously refused to press the button for decades. (The legend is that a physicist friend eventually talked him into it, but there are conflicting stories about epileptic seizures, electrical fires, and divine intervention—I can't make any sense of them.)

So: on or off? What do you think?

27 January 2007

Neurology on the edge

Benjamin Libet conducted this experiment in the 1970s. Apart from one or two electrodes on the scalp, there's really nothing creepy about the experiment. Until you read about the results.

Libet asked his experimental subjects to move one hand at an arbitrary moment decided by them, and to report when they made the decision (they timed the decision by noticing the position of a dot circling a clock face). At the same time the electrical activity of their brain was monitored. Now it had already been established by much earlier research that consciously-chosen actions are preceded by a pattern of activity known as a Readiness Potential (or RP). The surprising result was that the reported time of each decision was consistently a short period (some tenths of a second) after the RP appeared.

from “Pre-empted decisions”, a page on Conscious Entities

The RP starts to ramp up as much as 0.3 seconds before the reported decision time. It continues to increase after that, leading up to actual hand movement about 0.2 seconds later.

What does this mean? All the test subjects, of course, felt they had consciously chosen to move. But if unconscious brain activity precedes the conscious experience of decision-making, then surely we must conclude that the decision is not consciously made. Effects don't precede causes.

Now there are countless philosophical objections to this conclusion. Some philosophers claim that to interpret this result at all competently, you have to be well-versed in the philosophy of the mind. Which seems reasonable enough, but it's a deep field with centuries of literature in many languages. So this prerequisite rules out anyone who has spent his life studying anything as patently irrelevant as mere neurology. To say nothing of random bloggers.

I'll wade in anyway, of course. Just don't get the impression I know anything about this subject. I don't.

There's a really nice, compelling interpretation that permits free will. It goes like this. The way the mind interprets time is anything but objective. In the Libet experiment, what's happening is that the mind shifts the experience of deciding to move forward in time and shifts the experience of motion backward in time, effectively bringing them closer together. So the conscious decision to act actually does cause the RP ramp-up. But the subject incorrectly reports the decision as having happened later, because his brain has deceived him about the timing.

Why would the brain do this? It seems likely to me that there's an evolutionary benefit to perceiving decision, action, and effect as a single event. I don't think we're equipped to deal with that kind of time lag consciously. Just think—there's a half-second lag between when you decide to move a muscle and when it moves. Have you ever played the piano? If you were aware of this lag all the time, could you do that? Could you run? Could you fight?

An article, “Free Will and Free Won't” (in American Scientist; $12 to download the PDF from their site), puts the Libet experiment alongside four or five other rather clever experiments into will and consciousness. Then it starts talking about alien hand syndrome. The brain is strange.

26 January 2007

Wonderfully odd thing of the moment

The island of Yap in the Pacific Ocean used varying sized stones as money, of which the largest weighing several tons were the most valuable. The stones had been brought by sea from the Island of Palau 210km away. [...] The Yapese valued them because large stones were quite difficult to steal and were in relatively short supply. However, in 1874, an enterprising Irishman named David O'Keefe hit upon the idea of employing the Yapese to import more “money” in the form of shiploads of large stones, also from Palau. O'Keefe then traded these stones with the Yapese for other commodities such as sea cucumbers and copra. Over time, the Yapese brought thousands of new stones to the island, debasing the value of the old ones. Today they are almost worthless, except as a tourist curiosity.

From Wikipedia. It's given as an instance of hyperinflation.

16 January 2007

Brouwer's shopping mall diorama theorem

This is a math post, but it also involves some audience participation. There's a crafts project. It may also require some driving. Ready?

Pick any closed, contiguous region of the universe—like, say, the nearest mall. Draw a map of it. Or you can make a diorama, if you're just that fond of the mall, or of making dioramas.

Go ahead. It doesn't have to be to scale.

While you're working, I'll say something profoundly obvious. The whole idea of a map, of course, is that every place in that part of the real world corresponds to exactly one spot on the map.

Done? Good. Now take the map (or model) and put it inside the closed region of space that it represents. That is, go to the mall. Brouwer's fixed-point theorem says that the map now has a fixed point: there's a point on the map that is actually at the very location that it represents.

This works no matter how large or small your map is. If your map is the size of the entire food court, and you take it there and spread it out on the floor, there will be a spot somewhere in the food court that exactly lines up with the corresponding spot on the map. Shift the map a little bit, and that spot won't line up anymore—but some other spot will. Always. You can turn the map around to face the wrong way. You can hold your 3-D model upside down. It doesn't matter. In fact, this works even if your map is not drawn to scale or if things are totally the wrong shape. There are only two requirements regarding accuracy. First, your map can't leave anything out. So if you forgot to draw the Banana Republic, you have to mentally squeeze it in between Orange Julius and The Icing where it belongs. Second, your map must be continuous. That is, any path someone might take from one point to another in the mall has to make a continuous path (without any “jumps”) on your map as well.

In the language of topology, any continuous function that maps a closed ball in Rⁿ into itself has a fixed point. I have no idea why this works. Amazing.

It may have occurred to you that there already are nice, large maps conveniently located throughout the mall. Brouwer's theorem applies to those maps, too. In fact, in honor of Brouwer, the fixed points of these maps are always clearly marked, usually with a red dot or an arrow. Next time you're in a mall, take a look.

11 January 2007

Fiction, meet science

At some point, Dartmouth University offered a semester course on Renaissance Math in Fiction and Drama. From the site:

This course explores scientific developments in Renaissance astronomy and their portrayal in literature past and present. By reading some of the writings by Copernicus, Galileo and the prolific Kepler, we will attempt to draw a portrait of scientific upheaval during that period. The science fiction of the Renaissance offers a window into the popular response to these developments, as do various commentaries of the time. Dramatic pieces both recent and of that period show the artistic reconstruction of scientific events, sometimes through a very modern lens.

“Science fiction of the Renaissance”? There's not a huge amount of this, as it turns out, but one amazing, atypical example is Johannes Kepler's Somnium, which was at once a fanciful journey to the moon and a serious thought experiment in support of Copernican heliocentrism. Wow.

05 January 2007

Perfect numbers

Perfect numbers are numbers that are equal to the sum of their proper divisors, that is, all their factors except the number itself: 6 is perfect because its proper divisors are 1, 2, and 3, and 1 + 2 + 3 = 6. Likewise 28 = 1 + 2 + 4 + 7 + 14; and so on. So far, 44 perfect numbers are known.

Puzzle: Can you prove that if 2ⁿ - 1 is prime, then 2ⁿ⁻¹(2ⁿ - 1) is perfect?

planx_constant mentioned that little theorem to me over vacation. It was first proved by Euclid. Millennia later, Euler proved that all even perfect numbers are produced by this formula. But it is not known whether there are any odd perfect numbers. Most mathematicians seem to think there are none. Here's James Joseph Sylvester, writing in 1888:

...a prolonged meditation on the subject has satisfied me that the existence of any one such—its escape, so to say, from the complex web of conditions which hem it in on all sides—would be little short of a miracle.

Yet there is hope, and indeed the search is on.
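
Not a proof, of course, but the formula is easy to check numerically for small cases; the bound of 13 below is arbitrary:

#include <cstdio>

bool is_prime(long long m) {
    if (m < 2) return false;
    for (long long d = 2; d * d <= m; d++)
        if (m % d == 0) return false;
    return true;
}

long long divisor_sum(long long k) {       // sum of the proper divisors of k
    long long sum = 0;
    for (long long d = 1; d * d <= k; d++)
        if (k % d == 0) {
            sum += d;
            if (d != 1 && d * d != k)
                sum += k / d;              // the matching cofactor
        }
    return sum;
}

int main() {
    for (int n = 2; n <= 13; n++) {
        long long m = (1LL << n) - 1;      // 2^n - 1
        if (is_prime(m)) {
            long long p = (1LL << (n - 1)) * m;
            std::printf("n=%2d: %lld %s\n", n, p,
                        divisor_sum(p) == p ? "perfect" : "NOT perfect");
        }
    }
}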

A complex story, part 4

(See parts 1, 2, and 3.)

Everybody literally sees the world from a different point of view. Each person is standing in a different location and looking out in a different direction from everyone else. But all viewpoints share certain similarities. If you and I are near one another, we'll see the same events happen in the same order, and although we may differ in our use of the words “right” and “left”, if we're watching something from opposite sides, we'll at least agree on the distances between things. If I see two people holding hands, you'll never see them on separate sides of the street at the same time, no matter where you're standing. All the different viewpoints preserve certain observed properties: distances, angles, durations, causality, and so on.

Mathematically, we can write this in two equations. For each of us, every event has a measurable position in space (x, y, z) and time (t). If we put my observations on the left-hand side and yours on the right, they will match.

We agree on distances: x² + y² + z² = x'² + y'² + z'²

We agree on durations: t = t'

Even if I'm in a car doing eighty and you're sitting on the sidewalk enjoying an ice cream cone, we'll agree on the distances between and durations of any events we both happen to witness as I zoom by.

...Or so everyone thought. Don't get me wrong, this is a lovely picture. Mathematically, it's your basic three-dimensional Euclidean geometry, plus a separate dimension for time. All our viewpoints are identical except for a bit of spatial displacement and rotation. There's only one problem. This isn't how the universe really behaves.

1887 was the year of the famous Michelson-Morley experiment, which blew this nice, simple Newtonian view all to hell. For twenty years, confusion reigned. By 1905, a mere eyeblink in academic terms, physics had righted itself, now with a totally new model of space and time.

The new theory was called special relativity. It was built on brilliant new insights from Hendrik Lorentz, Henri Poincaré, and Albert Einstein. And it went something like this: Two observers traveling at incredible velocities (relative to one another) actually do not agree on distances, angles, durations, or even the relative time-order of events. But they will agree on something even more fundamental: the basic laws of nature, including laws of motion, causality, and—in particular—the speed of light.

This had the advantage of being, you know, consistent with experiment. But geometrically, it was awfully weird. It wrecked the two equations above. Individual viewpoints were not simple spacial rotations and translations of one another. They were, uh, Lorentz transformations. Yeah. It was two more years before geometry caught up with physics.

In 1907, Hermann Minkowski discovered a kind of geometry (a four-dimensional manifold) that exactly describes the spacetime of special relativity. That is, Minkowski space is the actual geometry of the universe around us, according to relativity. Minkowski's geometry succeeded by treating space and time as interrelated. For example, in Minkowski space:

We may not agree on the spatial distance between two events: x² + y² + z² ≠ x'² + y'² + z'²

We may not agree on the passage of time: t ≠ t'

But we will agree on a particular mathematical combination of the two: x² + y² + z² - c²t² = x'² + y'² + z'² - c²t'²

(Here c is the speed of light.)

Now comes the controversial, beautiful part. Define a variable w as ict. We're going to use w as our time coordinate, instead of t. Then the last equation above becomes:

x² + y² + z² + w² = x'² + y'² + z'² + w'²

This looks a lot like our original equation for distance. And in fact this equation describes basic Euclidean geometry in four dimensions. Time becomes just another spatial dimension. All viewpoints are again simple rotations and translations of one another—not in three-dimensional space, but in four-dimensional spacetime.

Here the role of the complex numbers is to provide a new way of looking at the geometry of the universe.

But... what does it all mean? Is time really an imaginary dimension? What does it mean for three dimensions to be real numbers and one to be an imaginary number? These questions are, in a way, the same questions RT asked me months ago, the questions that got me interested in telling this story. What are the imaginary numbers? Do they exist? Do they appear in nature? I don't think anyone really knows. Einstein found the ict trick interesting at least (he mentions it twice in his short book Relativity, which by the way I enthusiastically recommend), but some physicists think it's a red herring. Maybe we're just dressing the universe up to look more comfortable and familiar.
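
For the numerically inclined: you can watch the invariant survive a change of viewpoint. The check below applies a standard Lorentz boost along x (x' = γ(x - vt), t' = γ(t - vx/c²)); the event coordinates and the velocity are arbitrary.

#include <cmath>
#include <cstdio>

int main() {
    double c = 299792458.0;      // speed of light, m/s
    double v = 0.6 * c;          // how fast the second observer is moving
    double gamma = 1.0 / std::sqrt(1.0 - (v * v) / (c * c));

    // An arbitrary event, as I measure it:
    double x = 4.0e8, y = 1.0e8, z = 2.0e8, t = 2.0;

    // The same event, as the moving observer measures it:
    double xp = gamma * (x - v * t);
    double tp = gamma * (t - v * x / (c * c));
    double yp = y, zp = z;       // a boost along x leaves y and z alone

    double s  = x * x   + y * y   + z * z   - c * c * t * t;
    double sp = xp * xp + yp * yp + zp * zp - c * c * tp * tp;
    std::printf("%.15g\n%.15g\n", s, sp);   // the same, up to rounding
}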

A complex story, part 3

(See parts 1 and 2.)

In the early 1800s, Joseph Fourier found that every periodic function is made up of (a possibly infinite series of) sine and cosine functions of various frequencies and magnitudes. Just add the right sine waves together and you'll get the desired function. Any function. This collection of waves is called the Fourier series, and it would soon propel the complex numbers from the ivory tower of pure math onto the mad merry-go-round of technology.

Mathematicians used the Fourier series to shift difficult problems to an easier battleground, by transforming a complicated function into an infinite sum of very simple ones. This was the beginning of frequency-domain analysis. It was soon discovered that—thanks to Cotes's discovery—the Fourier series was much simpler if you used complex numbers. Other such transformations were discovered too, notably the Fourier transform and the Laplace transform. Both are based on the complex numbers.

Frequency-domain analysis was the killer app for complex numbers. And then came electricity. As it happens, most of electrical engineering would be practically impossible without frequency-domain analysis. Beginning problems in circuits—problems that in the time domain would require two or three semesters of college-level calculus to tackle—can be solved in the frequency domain with basic high-school algebra and a few complex numbers.

Fourier-related transforms are also essential to the compression of digital images, music, and video. So it's safe to say the complex numbers will be with us for a while yet.
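
The classic first example of all this is building a square wave out of sines. Each added odd harmonic sin(kx)/k squares off the corners a little more; at any fixed x between 0 and π, the partial sums creep toward π/4 (the choice x = 1 below is arbitrary):

#include <cmath>
#include <cstdio>

int main() {
    double x = 1.0;     // any point where the square wave equals pi/4
    double sum = 0.0;
    for (int k = 1; k <= 99; k += 2) {
        sum += std::sin(k * x) / k;     // add the next odd harmonic
        if (k % 20 == 19)
            std::printf("through k = %2d: %f\n", k, sum);
    }
    // The partial sums approach pi/4 = 0.785398...
}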

There is just one more application of the complex numbers I want to talk about, by far the weirdest, probably the most controversial, and just maybe the most beautiful of them all.

(Concluded in part 4.)