24 December 2006

A complex story, part 2

(See part 1.)

Roger Cotes died of a sudden, violent fever in the summer of 1716. He was 33 years old. Isaac Newton is said to have remarked on Cotes's passing, “If he had lived we would have known something.”

Cotes left to the world a handful of unpublished math papers. Among them, this formula:

ln (cos x + i sin x) = ix

No such connection between trigonometry and logarithms was known at the time. Cotes's formula reveals a simple, fundamental connection here, and the connection runs through complex numbers. Even at the time this was a very nice result, but the realization of just how profound it was dawned slowly. It took a couple hundred years.

Around this time a totally unrelated question was starting to generate interest. There was a feeling that an equation like this...

x^3 + x^2 - 3x = 4

...ought to have three solutions. And indeed it does, but the feeling was that more generally every polynomial equation of degree n ought to have n solutions, though not all of them would necessarily be distinct from one another. And this turned out to be true... if you counted complex solutions. This landmark result was finally proved in 1806. It's now called the fundamental theorem of algebra.
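
You can check this numerically. A quick sketch using NumPy (an anachronism here, obviously), just to see the “missing” solutions turn up as a complex-conjugate pair:

    # The three solutions of x^3 + x^2 - 3x = 4, i.e. x^3 + x^2 - 3x - 4 = 0.
    # One root is real; the other two are complex conjugates.
    import numpy as np

    print(np.roots([1, 1, -3, -4]))
    # roughly 1.83, and -1.42 +/- 0.43i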

But complex numbers were still controversial until 1799, when something ironic happened. A land surveyor, Caspar Wessel, discovered that the complex numbers could be thought of as points (or vectors) on a plane. Each complex number a + bi corresponded to the point (a, b). Adding, subtracting, and multiplying complex numbers could be done with any flat surface, a compass, and a ruler.

This was astonishing, because it made the past two centuries of work involving complex numbers suddenly much easier to visualize, in a totally unexpected way. (Graph the solutions to the equation x^13 = 1. You'll see thirteen points arranged in a perfect circle—or regular triskaidecagon, if you prefer—around 0. I don't know how to convey how unexpected and beautiful that is.) Ironic, too, because the underlying idea of plane coordinates was first explored by René Descartes, the same man who coined the derogatory term “imaginary number” back in 1637.
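
If you'd rather not draw the triskaidecagon by hand, a few lines will do it. A sketch (cmath is Python's complex-math library):

    # The 13 solutions of x^13 = 1: evenly spaced points on the unit circle.
    import cmath

    roots = [cmath.exp(2j * cmath.pi * k / 13) for k in range(13)]
    for z in roots:
        print(format(z, ".3f"), abs(z))               # each at distance 1 from 0
    print(all(abs(z**13 - 1) < 1e-9 for z in roots))  # True: they're solutions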

Cotes's formula had been independently discovered by Leonhard Euler, and with the discovery of the complex plane, its fame grew. The formula now bears Euler's name; and one particularly nice case (where x = π) is called Euler's identity:

e^(iπ) + 1 = 0

This is widely considered the most beautiful equation in mathematics. There's certainly something about it. It's as though all the biggest concepts in math came to lunch, and while they were all there together they posed for a photograph.

...So this is why complex numbers are important. It's a little matter of the most beautiful mathematical discovery of all time.

But incredibly enough, the story doesn't stop there either.

(Continued in part 3.)

A complex story, part 1

For RT, who seemed a little skeptical.

Once upon a time, there was an Italian mathematician named Niccolò Tartaglia.

Niccolò's claim to fame—well, his secondary claim to fame—is that he discovered a formula that you could use to solve any cubic equation. Well, maybe not any cubic equation. It worked for many cubic equations. Unfortunately, for some equations the formula gave results that involved the square root of a negative number. Which was clearly nonsense. But strangely, Tartaglia found that if he just pretended that negative numbers had square roots (numbers that weren't exactly real numbers but followed all the same rules of arithmetic), he could just plow ahead with the math and eventually all the oddities would cancel out, leaving ordinary real numbers: the correct solutions.
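
To see the trick in action, take the classic example x^3 = 15x + 4. Tartaglia's formula computes the square root of (q/2)^2 + (p/3)^3, which here is -121: clearly nonsense. Plow ahead anyway. A sketch in Python (a deliberate anachronism, of course):

    # Tartaglia's formula on x^3 - 15x - 4 = 0: a square root of a
    # negative number appears mid-calculation, then cancels away.
    import cmath

    p, q = -15.0, -4.0                    # the cubic is x^3 + p*x + q = 0
    disc = (q / 2) ** 2 + (p / 3) ** 3    # -121: negative!
    root = cmath.sqrt(disc)               # 11i, the "nonsense" number
    u = (-q / 2 + root) ** (1.0 / 3.0)    # cube root of 2 + 11i, which is 2 + i
    v = (-q / 2 - root) ** (1.0 / 3.0)    # cube root of 2 - 11i, which is 2 - i
    print(u + v)                          # ~(4+0j): an ordinary real solution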

(Tartaglia's primary claim to fame is that he spent a decade of his life ruthlessly destroying the career and reputation of his friend, Gerolamo Cardano, after Cardano revealed Tartaglia's secret formula to the world. This was a time when mathematics was the exclusive domain of paranoid madmen. Some were so secretive they managed to leave no surviving written work at all.)

From what I've read, Tartaglia apparently had no idea what he was doing, and nobody else could figure it out, either. It was as though there were a sort of mysterious shadow realm lurking behind the real numbers, and occasionally some errand would force you to travel through it, only to emerge (with a shudder of relief) back into the real numbers in the end. Nobody liked this. When René Descartes called these oddities the imaginary numbers, he meant it to sting.

The stigma persists. Most people hear a little about complex numbers in school, not enough to be comfortable with them or understand why people would think they exist (whatever that means) or why they might be useful.

Descartes probably figured a better method for solving cubic equations would eventually come along, and then the “imaginary” numbers could be quietly forgotten. What actually did happen turned out to be a lot more interesting.

(Continued in part 2.)

23 December 2006

Alice in Puzzle-Land

“How do I know for sure that I'm awake?” asked Alice. “Why can't it be that I'm now asleep and dreaming all this?”

“Ah, that's an interesting question and one quite difficult to answer!” replied the King. “I once had a long philosophical discussion with Humpty Dumpty about this. Do you know him?”

“Oh, yes!” replied Alice.

“Well, Humpty Dumpty is one of the keenest arguers I know—he can convince just about anyone of just about anything when he puts his mind to it! Anyway, he almost had me convinced that I had no valid reason to be sure that I was awake, but I outsmarted him! It took me about three hours, but I finally convinced him that I must be awake, and so he conceded that I had won the argument. And then—”

The King did not finish his sentence but stood lost in thought.

“And then what?” asked Alice.

“And then I woke up!” said the King, a bit sheepishly.

—Raymond Smullyan, Alice in Puzzle-Land. This is a fun book of logic puzzles ranging from cute to outrageously intricate. A fine gift for the mathematician on your list (though I hear The Annotated Alice is even better).

09 December 2006

π in Metamath

Metamath has a definition of π, but it's unused. For all the importance of π in math, the Metamath database contains exactly zero theorems that make use of it.

I started to work on a little something and quickly discovered the reason for this. Metamath defines π as the smallest positive zero of the sine function, but as yet it has no proof that any such number exists!
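
For flavor, the definition is easy to chase numerically: sin is positive at 3 and negative at 3.5, so bisection closes in on a zero between them. (A numeric search is not, of course, the existence proof Metamath needs. That takes real analysis.)

    # Metamath defines pi as the smallest positive zero of the sine
    # function. Bisection on [3, 3.5], where sin changes sign:
    import math

    lo, hi = 3.0, 3.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if math.sin(mid) > 0:
            lo = mid
        else:
            hi = mid
    print(lo)   # 3.141592653589793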

In a way this reflects well on Metamath. I wanted to prove something, but my reasoning depended on hidden assumptions. Working in Metamath very quickly revealed those assumptions and guided me to an understanding of exactly what they were. You learn by doing.

21 November 2006

That's rather odd

According to Reason magazine, some guy was convicted of smuggling French silicone gel breast implants into the U.S. He was sentenced to two years and a $10,000 fine.

The article says, “According to the government, Martin illegally imported 47 implants in 1995 and has distributed close to 600 implants to doctors throughout the country.” Does anything about that sentence strike anyone else as ...odd?

20 November 2006

Constitutional law fact of the day

I didn't know this: apparently a nursing mother has a Constitutionally protected right to breastfeed her child. The following block quote describes Dike v. Orange County School Board, 650 F.2d 783 (5th Cir., 1981).

[A] teacher wanted to nurse her baby on her duty free lunch break. The school claimed that insurance provisions prohibited teachers from bringing their children onto school property, and also prohibited teachers from leaving the school grounds during the day. The trial court ruled that the mother had no right to breastfeed. In Dike, the appeals court reversed the case and remanded it for a new trial, stating that breastfeeding is a protected constitutional right. “Breastfeeding is the most elemental form of parental care. It is a communion between mother and child that, like marriage, is ‘intimate to the degree of being sacred,’ Griswold v. Connecticut, 381 U.S. at 486, 85 S. Ct. at 1682, 14 L. Ed. 2d at 516. Nourishment is necessary to maintain the child's life, and the parent may choose to believe that breastfeeding will enhance the child's psychological as well as physical health. In light of the spectrum of interests that the Supreme Court has held specially protected we conclude that the Constitution protects from excessive state interference a woman's decision respecting breastfeeding her child.” 650 F.2d at 787

Constitutional rights are not absolute, and often collide with legitimate, recognized interests. Sometimes the courts must balance individual rights with state interests. In the Dike case, the trial court determined that the state had a legitimate interest in restricting the teacher's comings and goings because of certain school policies. Although the appellate court ruled that mothers have a constitutional right to breastfeed, Mrs. Dike did not have the right to leave school to go home and nurse her baby, or to bring her baby on to school grounds.

—Elizabeth N. Baldwin, “A Look at Enacting Breastfeeding Legislation”

It's really cool what happened here. The lower court said, in an offhand way, “There's nowhere in the Constitution it says you have a right to X.” And the higher court, showing truly inspiring wisdom, says, “Look again.” Over the years, I think the U.S. courts have been better than anyone could've reasonably expected about protecting “personal” rights, beyond the political rights that the Constitution addresses most directly. It's a sometimes thing, but on a good day, the government can't just go in and randomly muck about with your private life. It's nice to be an American.

19 November 2006

Rematches

Ohio State is #1. Michigan is #2. Ohio State beat Michigan yesterday in a pretty close game.

The college football championship game is January 8. Should Michigan be invited?

On the one hand, the whole point of a championship game is that #1 gets to play #2 for all the marbles. On the other hand, there are a dozen awesome teams that have not had a chance to unseat Ohio State this year. If USC, for example, doesn't get a chance, why should Michigan get two chances?

Listening to the commentators yesterday, I could tell they were carefully restraining themselves from re-re-re-stating the obvious: we wouldn't even have this problem, if... yeah, I can't bring myself to say it again. I'm officially blue in the face.

Did you know that it's possible in the NFL for the Super Bowl to be a rematch between two teams that already played during the regular season? It has happened eleven times. And it's the darnedest thing—no one ever complains about it. In the NFL, if you get to the Super Bowl, no one ever tries to say you didn't earn it. I have a theory. I think there is a reason for this. If you're in the Super Bowl, you've beaten two of the best teams in the league in consecutive weeks. All the other contenders had their shot, and they lost. It's that simple.

In college football, there's no playoff, so you get situations like this one with Michigan.

The BCS was a huge leap forward. We now have a controversy-free, undisputed national champion about 20% of the time. Let's take the logical next step. It ain't rocket science.

Postscript: For the record, I favor treating yesterday's game as a playoff game. Michigan lost. They should be out.

17 November 2006

Euclid redux

In 1882, Moritz Pasch published a book revealing unstated assumptions in Euclid's Elements and calling for a new examination of the foundations of geometry.

Here are three axiomatizations of plane geometry. I'm posting this because they're three surprisingly creative, different approaches.

  • Hilbert's axioms (1899) are simply a more rigorous version of Euclid's axioms. Whereas Euclid states five postulates and five “common notions”, Hilbert requires 20 axioms—plus set theory.

  • Birkhoff's axioms (1932) take a totally different approach. Birkhoff starts with the real numbers. The distance between points on a line is defined by analogy to the reals, and something similar defines angle measures. The astonishing result is a complete axiomatization of plane geometry in just four postulates.

  • Tarski's axioms (1983) are built directly on top of first-order logic, so they don't require set theory. As a result, Tarski was able to prove (posthumously, no less) that the system was consistent, complete, and decidable.

10 November 2006

Poorboy Legacy

My friend Reuben Torrey just released a CD! Check it out: Poorboy Legacy. The title track has a sort of folk-blues-meets-indie-rock thing going on—fantastic.

Oh my angel mother
help me up the hill
I'm goin' to meet the good Lord
and I didn't write no will
'cause all I got
is a poor boy's legacy
I'm gonna leave to my son
everything my daddy left for me

$8 for the CD, free shipping, free MP3 download once you buy.

03 November 2006

Election day

On Tuesday, I can vote for incumbent Republican Charlie Bass, Democrat Paul Hodes, or Libertarian Ken Blevens. ::sigh::

01 November 2006

Elements, part 2

RT asked me last weekend about programming. She just learned HTML. Now she's considering learning PHP.

It was this conversation that got me interested in Euclid's Elements. RT has a classical liberal education; she actually studied the Elements. (Sidetrack: Wikipedia says the Elements were “the basic text on geometry throughout the Western world for about 2,000 years. For centuries, when the quadrivium was included in the curriculum of all university students, knowledge of at least part of Euclid's Elements was required of all students. Not until the 20th century did it cease to be considered something all educated people had read.” Now get this: the mathematical discoveries of the past hundred years are so mind-blowing, so revolutionary, that the relegation of the Elements to obscurity is arguably justified. The mathematician's view of geometry is forever changed; the Elements can never again be what it was. I mentioned this to SC at work. He says we're living in the golden age of mathematics.)

RT liked Euclid's approach of starting from a few axioms and deriving everything else from them. She wanted to know if it is possible to learn programming in the same way, from the ground up.

Here's my answer: Structure and Interpretation of Computer Programs. (You can read the full text for free online.) Elsewhere I've called this “the best programming book ever written”. As it turns out, Section 1.1 is actually titled “The Elements of Programming”. These first few sections still bear re-reading. Powerful stuff.

Is SICP a practical book? It walks a fine line. The programming language you'll learn, if you read it, is not Java or PHP or Ruby or C#. It's Scheme. Scheme has a simple design well suited to the Elements approach. You might find one of those others better suited to the program you're interested in writing, and if so, you might prefer some other book. On the other hand, page 1 of SICP starts by quoting an essay by John Locke; and then it tells you what a program really is, what the word means; and then by section 1.2.1 it's telling you things that I went through a whole CS degree program without learning. That kind of book.

A discovery

This happened last year. I never heard about it.

A reviewer on amazon.com writes:

With so many things going wrong in the world, it's nice to see one important thing going right—a certain Mr. Applebaum stumbles onto the recordings at the Library of Congress in January, a crack team spends the better part of the year restoring and remastering, and Blue Note and Thelonious Records put out the CD in September.

It's a true and lovely story. National Public Radio tells it here (ignore the text and click the “Listen” button near the top). The disc is Thelonious Monk Quartet with John Coltrane at Carnegie Hall.

29 October 2006

A speaks

A's first words are: book, apple (“ap”), bump (“bup!”), up, ball, drink (“guk”), hat (clear as a bell), and something that sounds like bite or eat.

Elements

“Euclid's text Elements was the first systematic discussion of geometry. It has been one of the most influential books in history[.]”

—Wikipedia, “Euclidean geometry”

I'm not sure this is an exaggeration.

Observation

J, who is three, enjoys A Light in the Attic. O says Shel Silverstein's best poems are the ones that are really about himself. This led me to a new appreciation of his stuff.

I've never dealt draw poker
In a rowdy lumber camp.
Or got up at the count of nine
To beat the world's champ.
I've never had my picture on
A six-cent postage stamp.
I've never scored a touchdown
On a ninety-nine yard run.
I've never winged six Daltons
With my dying brother's gun...
Or kissed Miz Jane, and rode my hoss
Into the setting sun.
Sometimes I get so depressed
'Bout what I haven't done.
—Shel Silverstein, Never

26 October 2006

"We all have a common enemy..."

Marquette University censors the office doors of its Ph.D. students. A quote from a noted terrorist has been removed.

Sorry, did I say terrorist? I meant syndicated humor columnist. The censored quote is from Dave Barry. Click the link. The story defies belief (and the quote is actually kind of funny).

Why does no one seem to care about this? A little part of me dies every time I read a story like this. It takes me a while to get over it. "Free speech zones" on university campuses. God help us.

28 September 2006

It's turtles all the way down

I haven't yet grasped the relationship between proof and truth. And the ground is sufficiently tricky that I have a hard time trusting anything I read. I got something from the local public library called Fundamentals of Mathematics, which has an introductory chapter on model theory—but they go around proving things about models and logic using ordinary mathematical reasoning. This is like writing a Python compiler in Python: one feels that some bootstrapping must be involved somewhere. It's probably in a later chapter, but in the meantime my brain involuntarily rejects everything the book claims. Unproductive.

Metamath responds

Norm Megill, the man behind Metamath, responded to my recent post about Metamath and hygienic macros:

You are correct that substitution in Metamath is the simple replacement of a variable with a (well-formed) expression and thus isn't hygienic, in Scheme's terminology. In logic, the term used for “hygienic substitution” is “proper substitution,” which Metamath does not implement directly.

Instead, Metamath natively implements a simpler concept of “disjoint variables” as an optional proviso on an axiom (and inherited by theorems), that will forbid certain substitutions. In more standard terminology, this means, for example, that (if the axiom says so) a certain bound variable may not occur in a wff. In this case, the wff may have to be transformed, in the proof, to an equivalent wff with the bound variable renamed before we are allowed to substitute it for the wff variable of the axiom. In effect, the work of renaming bound variables, in order to make a sound (hygienic) substitution, is done by the proof itself rather than by the proof verifier. This simplifies the underlying primitive concepts needed by the Metamath language itself, with a consequence that the proof verifier can be very simple.

How simple is Metamath's verifier? About 300 lines of Python does the trick.

Simplicity is not the motivation in Lisp; in fact Lisp is generally more complicated than Scheme. Rather, Lisp programs actually take advantage of symbols getting crossed. (If you know Lisp macros, you know what I mean, and if you don't, I can't easily explain it.) I know Metamath does something similar in at least one place: its definition of proper substitution.

([y / x]φ ↔ ((x = y → φ) ∧ ∃x(x = y ∧ φ)))

I wonder if similar tricks are used elsewhere.

27 September 2006

Amazingly bad

It reads like a deliberate insult:

To distribute a .NET Framework application both to 32- and 64-bit platforms, build two MSI packages, one targeted at a 32-bit and the other a 64-bit computer.

I have to say, Microsoft's support for 64-bit architectures is a total nightmare. MSI can't compete with that, but it's pretty darned awful.

.NET supposedly supports XCOPY deployment. Even MSI can make a directory and put files in it, and as far as I know the MSI database contains nothing platform-specific. What's the problem here?

I know the answer. The problem is that XCOPY deployment doesn't really work. Not for everybody, certainly not for apps with legacy dependencies. What's a tech writer supposed to do? Document how things would work if the world were ideal, because they might work that way for your app? Or give everyone the ugly workaround that most real-world apps need to apply?

Blaaargh.

26 September 2006

A bouquet of Penny Arcade

It's a comic strip about video games. Be forewarned: most of the joke with Penny Arcade is that the characters are phenomenally foul-mouthed.

I picked these out for oliviao, even though it seems likely there's not a single Penny Arcade strip she would actually like...

“This is that Star Ocean?”

“I'm watching my sim watch TV.”

“I thought you said we were going to level up together!”

“Ann, as your uncle, I respect your youthful vigor.”

“Dude. I just got back from the Serenity preview showing!”

“You told me this book a) wasn't fantasy, and b) contained no dragons.”

25 September 2006

Metamath isn't hygienic

Every axiom, definition, and theorem in Metamath is actually a template: no matter what you plug in for the free variables, you get a statement that's considered true in the system. So, for example:

Q → (P → Q)

This template allows you to write sentences like

(A = 0) → ((∀x ∈ R (0 < x → A < x)) → (A = 0))

This is kind of like a Scheme macro, but a really surprising thing about Metamath is that the sort of substitution it employs is not hygienic. If you happen to be using a local variable x, and you plug in a formula that has x free, the x in the formula gets “captured” by its new context.

In Scheme, if you plug a bit of code into another bit of code this way, and both bits of code are using a variable named x, Scheme behaves as though it's silently renaming one of the variables so the two don't conflict... unless the two x's actually refer to the same global variable or something—you can see it gets rather complicated. Lisp doesn't bother. It just plugs in the code. If variables get crossed, they get crossed.

Apparently doing it the Lisp way makes the Metamath proof-checker a lot simpler. Interesting.

15 September 2006

A strange postgrad class

This is the kind of course I'd like to take: PHYS771: Quantum Computing Since Democritus. There's a tolerably cool math puzzle at the end of the first lecture.

08 September 2006

No, not SOAP. We're talking actual soap here.

Scott Aaronson has a cool article (PDF) that looks at half a dozen proposed ways of solving really difficult computational problems—such as using soap bubbles or black holes, or time-traveling, or playing Russian roulette.

The paper is playful but serious—a serious survey of the far-out. A lot of it is mathematical gibberish to me, but it's amusing all the way through, and there's a fun “think big—no, bigger than that” section at the end.

I have one super-geeky nitpick, though:

Even many computer scientists do not seem to appreciate how different the world would be if we could solve NP-complete problems efficiently. I have heard it said, with a straight face, that a proof of P = NP would be important because it would let airlines schedule their flights better, or shipping companies pack more boxes in their trucks! One person who did understand was Gödel. In his celebrated 1956 letter to von Neumann (see [69]), in which he first raised the P versus NP question, Gödel says that a linear or quadratic-time procedure for what we now call NP-complete problems would have “consequences of the greatest magnitude.” For such a procedure “would clearly indicate that, despite the unsolvability of the Entscheidungsproblem, the mental effort of the mathematician in the case of yes-or-no questions could be completely replaced by machines.”

But it would indicate even more.... [Ed.: color added]

Here Scott shifts carelessly between two completely different hypotheticals (blue and orange). The statement he lambasts, that P=NP would be important for logistics, is actually an overstatement. A proof that P=NP (blue) almost certainly would have no practical consequences at all, unless I'm wildly mistaken, because a great many problems in P are not practically solvable. Who cares if every problem that can be solved in O(2^n) time can also be solved in O(n^1000000) time? In practice the latter is probably worse.
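
A little arithmetic backs this up. Compare logarithms: n^1000000 stays bigger than 2^n until 1000000 * log2(n) < n, which first happens somewhere around n = 25 million. For any input size you'd actually meet, the polynomial bound is astronomically worse. A sketch:

    # Where does n^1000000 finally drop below 2^n?  Compare logs:
    # n^1000000 < 2^n  iff  1000000 * log2(n) < n.
    import math

    for n in [100, 10**6, 2 * 10**7, 3 * 10**7]:
        print(n, 10**6 * math.log2(n) < n)
    # False, False, False, True: the crossover is near n = 25 million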

By contrast, a discovery that NP-complete problems can be solved in quadratic time (orange) would be a mind-blowing, probably world-changing event. But this is unlikely in the extreme.

06 September 2006

Gradual typing

Seen on Lambda the Ultimate: Gradual Typing for Functional Languages, a paper that introduces a new flavor of lambda calculus which they call (this is an approximation) λ?.

Hard to explain why language geeks should be excited about this. I am.

We present a formal type system that supports gradual typing for functional languages, providing the flexibility of dynamically typed languages when type annotations are omitted by the programmer and providing the benefits of static checking when function parameters are annotated.
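
For a rough flavor of the idea, in Python notation (a sketch, not the paper's calculus): annotated parameters get static checking, unannotated ones stay dynamic.

    # Gradual typing, roughly: a checker verifies what's annotated and
    # leaves the rest to runtime. (Illustrative only.)

    def greet(name: str) -> str:    # annotated: calls can be checked statically
        return "Hello, " + name

    def shout(thing):               # unannotated: fully dynamic
        return str(thing).upper()

    greet("world")                  # statically fine
    shout(42)                       # no static claim; fine at runtime
    # greet(42)                     # a gradual checker would reject this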

The proofs in the paper are machine-verified!

Relationships

I have a big file full of thoughts about programming languages. Re-reading it today I found this (edited):

Component-container relationships. I suspect these are all over the place. This is always a hassle. I wish someone would make it easy.

One idea is to put the component class lexically inside the container class:

        class Container {
            class Component {
                // each Component is part of a Container
            }
        }

Hmmm... I need to think about this some more.

And I realized: I now know the solution to this problem. In fact I've implemented it. Relationships.

I mean relationships in the sense of entity-relationship diagrams. Relationships describe how entities relate to each other, like the bolded words below:

His new paper cites 37 sources. (This could perhaps be written in code like this: paper1.cites(paper2))

Tom Hanks starred in the movie Big. (actor.starredIn(movie))

The patient Joe Bloggs is treated in the cardiology ward. (patient.isTreatedIn(ward))

Of course parent-child relationships are one case of this, but there are many others.

Okay. Now consider some hypothetical Python code:

    class Component(object):
        ...

    class Container(Component):
        ...

    # Define the relationship between these two classes...
    containment = OneToManyRel(
        Container, 'container', 'contains', Component, 'contents')

Here's what the OneToManyRel does:

  • Adds a container property to class Component. Each Component's container is always either a Container object or None. It's initially None, but if you like, you can just initialize it in Component.__init__ like any other attribute.

  • Adds the reverse property, contents, to class Container. This property is an object that looks like a set. It's initially empty. The two properties are automatically kept in sync, so if you do thing.container = None, the thing is automatically removed from its container's contents set.

  • Adds a method Container.contains(Component) that can be used to query the relationship.

  • Keeps a data structure of all the relationships, so you can do containment.items(), which returns a list of (Container, Component) pairs.

Here's a taste of how powerful this can be:

    class BuildStep(object):
        ...

    class File(object):
        ...

    # Each BuildStep produces some Files called its targets.
    # Each File is produced by one BuildStep called its buildStep.
    produces = OneToManyRel(BuildStep, 'buildStep', 'produces', File, 'targets')

    # Each BuildStep depends on some Files called its sources.
    # Each File is depended upon by some BuildSteps called its dependants.
    dependsOn = ManyToManyRel(BuildStep, 'dependants', 'dependsOn', File, 'sources')

That's very brief, compared to what you'd have to type to get this the traditional way.

So I have an implementation. In addition to OneToManyRel and ManyToManyRel, it has things called SymmetricRel and EquivalenceRel. (Warts: My current implementation doesn't use weakrefs, so it keeps objects alive until explicitly removed, which is kind of silly. Also, the properties currently look like iterators, not sets.) The whole thing is 355 lines of Python code.
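
For anyone curious how such a thing might be wired up, here's a stripped-down sketch of the one-to-many case. This is not my actual implementation: no weakrefs, no error checking, and the reverse view is a plain frozenset rather than a live set.

    class OneToManyRel(object):
        """Keep the two sides of a one-to-many relationship in sync."""
        def __init__(self, one_cls, attr_on_many, query_name,
                     many_cls, attr_on_one):
            rel = self
            rel._owner = {}     # many-side object -> its one-side object
            rel._members = {}   # one-side object -> set of many-side objects

            def get_owner(obj):
                return rel._owner.get(obj)

            def set_owner(obj, owner):
                old = rel._owner.pop(obj, None)
                if old is not None:
                    rel._members[old].discard(obj)
                if owner is not None:
                    rel._owner[obj] = owner
                    rel._members.setdefault(owner, set()).add(obj)

            # e.g. Component.container, assignable, kept in sync
            setattr(many_cls, attr_on_many, property(get_owner, set_owner))
            # e.g. Container.contents, a read-only set-like view
            setattr(one_cls, attr_on_one,
                    property(lambda o: frozenset(rel._members.get(o, ()))))
            # e.g. Container.contains(component)
            setattr(one_cls, query_name,
                    lambda o, obj: rel._owner.get(obj) is o)

        def items(self):
            # all (one-side, many-side) pairs currently in the relationship
            return [(owner, obj) for obj, owner in self._owner.items()]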

All this is inspired by Inform 7 relations. I wonder about Ruby. I can only assume Ruby on Rails automatically does stuff like this for databases; I don't know about regular old in-memory objects though. I imagine this would be a breeze to implement in Ruby.

Anyway. I don't know how generally useful it will be, but it's certainly encouraging to come across an old gripe about the exact problem this solves.

15 August 2006

There's a word for that

(Ed. note: More stuff written long ago but never posted...)

A brief piece in the Economist last year decried the fate of a word of which that periodical is particularly fond. In Europe, it has become a slur for the right wing; in America it's an even more pejorative label for the left.

The Economist's own political viewpoint is probably best described as center-right. It couples the leftist idea that government exists for the greater good of all people (not strictly to preserve individual rights) with the right wing's abiding suspicion of centralized government power and its conviction that the way to pursue social justice is to reduce regulation and let free markets work. “There's a word for that,” the essay asserts (echoing a Bush campaign speech), “and we want it back.” The much-abused word is liberalism.

Alas, this impassioned little essay is available only to Economist online or print subscribers. If that's you, read on.

Wiretaps, redux

This is hardly relevant anymore, but I figured I'd post it for the record.

A few times over the last year, JJ and I discussed the warrantless wiretap program. He's all for it. I have never come away from a conversation with him less convinced of his point of view.

I'm clearly missing something fundamental. Here's Power Line Blog arguing back in December 2005 that the program is clearly legal, citing court decisions affirming the President's authority to gather foreign intelligence without having to bother with warrants. Here's the Wall Street Journal doing the same thing.

(Now to begin with, I'm more suspicious of executive power than those courts—I'm even suspicious of the FISC, if it comes to that—but let's set that aside.)

My question is, is this executive power completely without limits? Do American citizens have no rights during wartime? Of course not. There's a line somewhere; the question is where. The pro-snooping blogosphere seems to miss this entirely. Certainly they ignore the Times's allegation that some of the snoopees are American citizens.

The entire Journal op-ed focuses on foreign surveillance. It even goes so far as to attempt this reductio ad absurdum:

The leakers of this sensitive national security activity and their Capitol Hill supporters seem determined to guarantee al Qaeda a secure communications channel into this country so long as they remember to include one sympathetic permanent resident alien not previously identified by NSA or the FBI as a foreign agent on their distribution list.

This simply isn't relevant to the surveillance activity actually described in the original Times article, in which names of American citizens (among others) are put on a list of people whose phone conversations and e-mails are secretly monitored. In other words, American citizens were the target of warrantless snooping.

I've noticed that it's really hard to even talk about this with someone. I think that might be because the question has so many logically incompatible lines of scrimmage.

  • The program is either Constitutional or not;
  • It's either foreign intelligence gathering or it's domestic;
  • It's either legal under FISA or not, or FISA doesn't apply;
  • FISA generally either makes sense, or it ties the President's hands in a potentially disastrous way;
  • Regardless of the particular legal question, the President generally either should or shouldn't exercise absolute power to conduct a war;

...and so on. Now, if you think the program flunks the first test, as I do, then most of the other questions just don't matter—in your nice, little, internally consistent political universe. It's really easy to end up arguing at cross purposes, though. And if you should manage to carry one point, the other guy always thinks he has three or four fallback positions.

I suspect that all of these questions but one are decidable, by any reasonable reading of the law; the sticking point is whether Congress has the authority to constrain the President's ability to conduct war as he sees fit.

But generally I didn't think much of the arguments I was hearing on any level. The President has the authority to take extreme measures in time of war, I'm told: look at the Japanese internment camps in World War II. (But isn't that a strong argument against such unlimited executive power? Is "there have been worse abuses in the past" really the President's best argument?) The three-day retroactive warrant provision in FISA is insufficient because it takes longer than three days to file a case. (I have no idea if this is true or not, but it's awfully hard to see the NSA program as a good-faith response to this difficulty.) The program was not actually secret, because some members of Congress were briefed. (And yet there should be an investigation into who leaked the program, and charges brought? Either it was secret or it wasn't.) As a practical matter, a war can only be fought by a dictator. (Really?)

In any case, it's water under the bridge now.

I blog parts of speech

There's a part of speech in English that I didn't know about until about a week ago. It's called the determiner. Some examples (quoting Wikipedia):

  • Articles: a, an, the
  • Demonstratives: this, that, these, those...
  • Possessive determiners: her, his, its, my, our, their, your
  • Quantifiers: all, few, many, several, some, every...
  • Numbers: one, two, fifty...
  • Ordinals: first, second, last, next...

I always figured they were just adjectives. But in English, determiners appear at the very start of a noun phrase, before any adjectives. Here's a quote; some noun phrases are in bold. The words in red are the determiners.

What are man and woman if not members of two very different and warring tribes? Yet decade after decade, century after century, they attempt in marriage to reconcile and forge a union. Why? I don't know. Biological imperative? Divine law? Or just a desire to connect to that mysterious other? In any case, it's always struck me as a hopeful thing.

—Diane Frolov and Andrew Schneider, writing for the TV show Northern Exposure

This has implications for programming languages, but I'll skip it. Olivia and I are rewatching Northern Exposure show by show, courtesy of Netflix. Great stuff.

09 August 2006

Tables need space to roam (horizontally)

One thing in particular that doesn't fit on half my screen is tables. Most tables need quite a lot of horizontal space. Otherwise you get horizontal scrolling, a nuisance. Examples:

  • A database query screen, like the one in SQL Server Management Studio, wouldn't work in two columns with code on the left and query results on the right. The code would fit nicely, but the results are tabular data. The people who made Management Studio apparently realized this; the app has two wide panels (code in the top half, results in the bottom half) and you can't change it.

  • The Inform 7 language has this ill-conceived feature where you can put tabular data in your program by literally typing in a tab-delimited table. As in:

    Table of Conversation
    topic reply quip
    "dream/dreams/nightmare/nightmares/sleep" "'Sleep well?' you ask solicitously.
    
    'Not really,' she replies, edging away from you. So much for that angle." "'Ghastly nightmares,' she remarks. You nod politely."
    "marriage/love/wedding/boyfriend/beau/lover" "'So,' you say. 'This is a little weird since we just met, but, um. Would you like to get married?'
    
    She looks at you nervously. 'Do I have to? I mean, I'd rather not.'
    
    Well, this could get prickly fast." "'I, er,' she says. 'I hope I'm not supposed to marry you or something.' Uh oh."
    ...
    

    That makes perfect sense, right? Well, I guess having line breaks within table cells makes it a little harder to read. And tabs look too much like spaces. But if you solve those problems, you're still left with this:

    Table of Conversation
    topic reply quip

    "dream/dreams/ nightmare/nightmares/ sleep"

    "'Sleep well?' you ask solicitously.

    'Not really,' she replies, edging away from you. So much for that angle."

    "'Ghastly nightmares,' she remarks. You nod politely."

    "marriage/love/ wedding/boyfriend/ beau/lover"

    "'So,' you say. 'This is a little weird since we just met, but, um. Would you like to get married?'

    She looks at you nervously. 'Do I have to? I mean, I'd rather not.'

    Well, this could get prickly fast."

    "'I, er,' she says. 'I hope I'm not supposed to marry you or something.' Uh oh."

    A table with three columns, two of which can hold large chunks of text, is going to have trouble fitting on half the screen at any reasonable font size.

  • You can force tabular data into non-tabular form. For example, the default view for Windows Explorer isn't tabular. (Or rather, it wasn't in XP. The Mac Finder is the same way.) But geeks tend to set the view to "Details", which is tabular, and leave it that way. We often need to be able to look at the largest files in a directory, or see which ones have changed recently, or select only the .cpp files; and in Details mode we can do those things by clicking on column headers, if not just by looking at the screen. It's a trade-off: determining the size or date-last-modified for a particular file requires a bit of care, because it's not immediately obvious how the rows line up. (Clicking on a file only highlights the filename, not the whole row, so that doesn't help.)

    A compromise UI would show things in a narrow format with the details initially shown in small print. It would be like Explorer's "Tiles" view, but arranging the tiles vertically, not in a grid. Ctrl+Scroll would zoom in and out, showing more or less detail. You could right-click on the small print and say "Break out Date Modified as a separate column". As a pleasant side effect, this UI would fit on half my screen. There's probably a reason I've never seen this, but speaking from innocence, it sounds nice.

07 August 2006

My computer shines

Another endearing feature of the HP Pavilion is the flashing blue power button.

I have no idea how this design ever got manufactured. The power button is about one inch, square, blue, abnormally bright, and blinks when the computer is in power-save mode. Here's how bright it is: before I covered it with duct tape, we had to close our bedroom door before going to bed, because the button blinking in the next room was bright enough to keep us awake. Even with the duct tape, if it's blinking, I can stand with my back to it, facing the opposite wall, and see the blue glow going on, off, on, off.

The blue button by itself is a jaw-dropping example of stupid design. No control with such a drastic effect needs to be made especially inviting to toddlers. But the blinking... I am just floored. It is a masterstroke, a work of sheer, staggering, insane genius. Did no one at HP realize that these things go in people's bedrooms? Did no one understand that blinking per se is considered so annoying that it's no longer used even to convey important information, much less to indicate the default mode in which any home computer will spend most of its life?

(speechless)

Hardware knowhow bleg

I recently added an Ethernet card to my PC at home.* Now whenever my computer hibernates, it doesn't wake up properly.** If I take out the Ethernet card, the power management starts working again. On the other hand, without a network connection the computer has no reason to exist. Oh, the irony.

The card doesn't appear to have any jumpers on it. My next guess is, buy another Ethernet card (they're cheap) and see if it fixes it. Any other suggestions? Ask your friends. :)

Footnotes:

* Long story—Yes, it came with an Ethernet port, but stopped working at some point, and as it's an on-board Ethernet port, I figured there was no hope of fixing it without obliterating my computer. Besides, I don't have a multimeter anymore. Anyway, since then I've been using my USB port instead, but my new router will have none of that nonsense. And honestly, who can blame it.

** I hit a key, and I know it knows it's supposed to wake up, because the fan turns on; but it never gets to the point of actually displaying anything on the screen.

P.S. The computer is an HP Pavilion, if anyone wants to avoid buying one at all costs.

02 August 2006

On the importance of being the right width

This shambolic post is about the width of things, specifically C++ code, Python code, and GUI windows.

I ran a script on some C++ code today, looking at line width. This probably raises more questions than it answers, but here are the results:

    2-5: #############
   6-10: ###########################
  11-15: ########################################
  16-20: ##################################################
  21-25: #########################################
  26-30: #####################################
  31-35: ###############################
  36-40: ###########################
  41-45: ########################
  46-50: #####################
  51-55: ################
  56-60: ###############
  61-65: ############
  66-70: ##########
  71-75: #########
  76-80: ########
  81-85: #########
  86-90: ###
  91-95: ##
 96-100: ##
101-105: #
106-110: #
111-115: #
116-120: #
121-125: #
126-130: #
131-135: 
136-140: #

This graph shows how many lines are 2-5 characters wide, how many are 6-10 characters wide, and so on. The script ignores indentation and comment lines, and lines that are blank or just have one character (usually }).
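
The script itself is nothing special. A reconstruction, near enough (treating "//" and "#" lines as comments is crude, but close):

    # Histogram of code line widths, ignoring indentation, comment-only
    # lines, and lines of fewer than two characters.
    import sys

    counts = {}
    for filename in sys.argv[1:]:
        for line in open(filename):
            s = line.strip()
            if len(s) < 2 or s.startswith(('//', '#')):
                continue
            bucket = (len(s) - 1) // 5        # 2-5, 6-10, 11-15, ...
            counts[bucket] = counts.get(bucket, 0) + 1

    scale = max(counts.values()) / 50.0       # longest bar gets 50 marks
    for b in range(max(counts) + 1):
        label = "%d-%d" % (5 * b + 1, 5 * b + 5)
        print("%7s: %s" % (label, "#" * int(counts.get(b, 0) / scale)))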

So you can see that, for example, about 75% of nontrivial code lines are 50 characters wide or less.

The trailing end of the curve actually goes all the way out to 500 characters, but almost all of these extreme cases involve very long strings.

For some C++ code I personally wrote, the curve is similar, but for me about 75% are 40 or fewer characters wide, and the curve dies out entirely at 115 characters. I habitually break up long lines. Of the longish lines (40+ characters), most are not all that complex. They're long because the things involved have really long names:

  bool ImageGenerator::isImageObjTypeReloadable(uint64_t uObjectType)

  initCustomBasket(pOldBasket->getLocalResourceNameObject());

  Boa::WebBoaSubnetStartAttributes networkAttributes = wbNetwork.getDefaultStartAttributes();

(Disclaimer: Identifiers have been changed to protect the innocent. I'm just talking about line length and complexity, and in these respects the examples are genuine.)

Okay. Now, same script, totally different code sample:

    1-5: #################
   6-10: ######################
  11-15: ###########################################
  16-20: ##################################################
  21-25: #############################################
  26-30: ########################################
  31-35: ##################################
  36-40: ############################
  41-45: ######################
  46-50: ###################
  51-55: #################
  56-60: ################
  61-65: ###############
  66-70: ########
  71-75: ###
  76-80: #
  81-85: 
  86-90: 

This graph describes the idlelib directory of Python 2.4. This is Python code, not C++. It's a similar curve, though.

Python and C++ are very different languages, but they (and all modern languages) have this in common: they'll let you write arbitrarily complex code lines. The compiler will accept a 3000-character line without batting an eye.

The teams that wrote these two code samples are undoubtedly very different, but they have this in common: they tend to break complex thoughts into two or more lines.

Part of this is just that it's easier to understand several simple, independent lines than one complicated one. I think part of it is visual, too. But the why doesn't matter; IDEs should take advantage of programmers' tendency to work in lines of a certain size.

Microsoft Visual Studio is an excellent example of an IDE that completely fails to do this. The current MSVS is all about the myriad gadgets and capabilities it offers—in ten trillion fiddly docking panels, none of which shows any tendency to make itself the right size or shape for the information displayed. Least of all the central code window, which initially gapes some 200 characters wide, but spinelessly cedes screen area to each new gadget you open until practically nothing is left.

By contrast, the Inform 7 GUI shows source on the left half of the screen and uses the right half for documentation and play-testing.

I can't rave enough about this arrangement. It works extremely well, at least partly because each half is a comfortable width.

It's possible to mimic this layout in Visual Studio: put Solution Explorer in a narrow strip on the left, code in the center, and a wide tabbed panel for everything else on the right. I'll try it out.

One last note. I have a program called HalfMax in my Startup folder. It runs in the background, and when I hit Windows+1 or Windows+2, it moves the current window to fill the left or right half of the screen, respectively. It so happens that a lot of the programs I use are almost ideal when filling half of my screen (given my font sizes, screen resolution, and so on). Right now I have the following applications open: Microsoft Outlook, an Outlook e-mail message, Perforce, some Command Prompts, SysInternals DebugView, some Notepads, some file Explorers, Microsoft SQL Server Query Analyzer, the Windows Services control panel, TestTrack, a Python shell, Emacs, Araxis Merge, Adobe Reader, Word, Visual Studio, and Internet Explorer. (I think this is just about all the applications I ever use. :) Most of these are either simple browser-like apps, Notepad-like editors, or shells: the kind of app that is significantly better half-maxed than maximized. The rest—Visual Studio, Windows Explorer, TestTrack—use that horizontal space for stuff like docking panels or table-view UIs.
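
(HalfMax is a trivial program, by the way. If you wanted to grow your own on Windows, the ctypes sketch below is roughly all it takes. Untested here, error handling omitted.)

    # A minimal HalfMax-alike: Win+1 / Win+2 move the foreground window
    # to the left / right half of the screen.
    import ctypes
    from ctypes import wintypes

    user32 = ctypes.windll.user32
    MOD_WIN, WM_HOTKEY = 0x0008, 0x0312

    user32.RegisterHotKey(None, 1, MOD_WIN, ord('1'))   # Win+1: left half
    user32.RegisterHotKey(None, 2, MOD_WIN, ord('2'))   # Win+2: right half

    w = user32.GetSystemMetrics(0) // 2     # half the screen width
    h = user32.GetSystemMetrics(1)          # full screen height
    msg = wintypes.MSG()
    while user32.GetMessageW(ctypes.byref(msg), None, 0, 0):
        if msg.message == WM_HOTKEY:
            x = 0 if msg.wParam == 1 else w
            user32.MoveWindow(user32.GetForegroundWindow(), x, 0, w, h, True)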

28 July 2006

Human visual navigation

I think the region of the screen I can concentrate on at once is quite small, maybe twelve words wide and six lines tall. But then why do I prefer two large high-resolution screens to one low-res screen? I can't really use all that screen space, can I?

Let's talk about the navigation features provided by human vision. Although I have this 12x6 mental "window", the window doesn't stay still. I can read a whole screenful of text without touching the mouse. Human vision has built-in scrolling. I can take in more of the screen at once by sacrificing attention to detail; or I can concentrate on one line of code, at the cost of ignoring everything else for a moment: visual zoom. If I look away, then when I look back, my eyes easily snap to where I was. This is kind of like Alt+Tab in that it lets me hop back and forth between different things; call it visual switching. If I need to look for something, I'll zoom out, skim down without reading, find it, and zoom back in, all without thinking: visual search.

With a huge screen, I can scroll, zoom, switch, and search without touching the keyboard or mouse. With a smaller screen, I have to use the mouse or keyboard to navigate. Each motion takes time and a little bit of brainpower; and each time the screen changes, it takes me a moment to find my place. Eyeball scrolling is superior to software scrolling. Eyeball switching is superior to Alt+Tab.

Having dispensed with that paradox, I want to talk about the 12x6 window and what it means for PL and IDE design. More to come.

25 July 2006

To do

More possible side projects:

  • Blog about irrationality and game theory. I promised D.E. something about this many months ago.
  • Implement long integers for Try Scheme. (Java's implementation is about 4400 lines; Python's is about 3200 lines, including comments and boilerplate junk. Neither one uses two's complement to represent negative numbers, which surprises me. Maybe division is a lot harder to implement in two's complement?)
  • Get Try Scheme working in Safari (the most obvious problem is that Safari doesn't have Object.isPrototypeOf).
  • Implement a built-in manual and tutorial for Try Scheme.
  • Write a story.

18 July 2006

Hiding algorithms and data structures

In Java, you might write:
class Thing {
    ...
    public int getCurrentSize() {
        int size = getBaseSize();
        for (SizeEffect se : getWorld().getAllEffectsEnchanting(this)) {
            size += se.getMagnitude();
        }
        return size;
    }
}

In Python 2.4 or later, you might write this instead:

class Thing:
    ...
    def getCurrentSize(self):
        return self.baseSize + sum(
          se.magnitude for se in self.world.getAllEffectsEnchanting(self))

If that's not entirely clear yet, just read on. I'll come back to it.

In Inform 7, you would write this:

To decide which number is the current size of (subject - a thing):

Decide on the base size of the subject plus the total magnitude of the size effects enchanting the subject.

I hope that clarifies what the Python version is trying to say. The Python expression sum(se.magnitude for se in ...) is just a way of saying “the total magnitude of ...”.

Ah, but in Inform, you don't use anything that resembles loop syntax to iterate through the size effects. And you don't explicitly query anybody to get the set of relevant objects. You just state your requirements. You don't build data structures explicitly, either—I wouldn't want to post an example World object here, but I can show you the complete Inform code:

A thing has a number called the base size. The base size of a thing is usually 1. The base size of a person is usually 5.

A size effect is a kind of thing. A size effect has a number called the magnitude.

Enchantment relates various size effects to various things. The verb to enchant (he enchants, they enchant, he enchanted, it is enchanted, it is enchanting) implies the enchantment relation.

That's all. The last sentence is just a convenience so that later in your program, you can write “The giant growth spell is a size effect with magnitude 3. It enchants the wombat.” or “Now the wicked ginormosity spell enchants everything.” or “Every turn, if anything is enchanting the player, say ‘You feel prickles all over your skin.’”

Of course, Inform is probably cheating a little bit: in its niche problem domain, there are only ever a few hundred objects in an entire program. So it can implement the implied searches naively. Still, with a little whole-program analysis, the technique could scale.

Why does this feel so new? We've been there with Prolog and SQL, right? I guess Prolog had unrelated problems, and SQL is still awfully cumbersome to get to. I'm impatient for these features to show up in a language I can use every day.

07 July 2006

Duck-typing and Scheme

Duck-typing advocates occasionally say that querying the type of an object is unnecessary. Certain Pythonistas are fond of this idea. I just don't see it.

Maybe querying the concrete type is unnecessary. (Aside: Python has a builtin, isinstance(), that does just that—but I found out a week or two ago that an object can lie to isinstance() about its type. It's just the most absurd design ever.) But once in a while, I need to know whether a given object can swim, at runtime, preferably without having to put it in the water. Surely in a well-designed language, these questions can be answered without unduly revealing how an object is implemented.
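
(The lie, for the curious: isinstance() consults an object's __class__ attribute, and nothing stops a class from forging that. A sketch:)

    # An object can lie to isinstance() by forging __class__.
    class Liar(object):
        @property
        def __class__(self):
            return int

    x = Liar()
    print(isinstance(x, int))   # True, but x is not an int
    print(type(x))              # <class '__main__.Liar'>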

For example, in Scheme, you can't ask if a number is a floating point number, as opposed to a native int (fixnum in Scheme jargon), big-integer, rational, or big-decimal. But you can ask if the number is exact, meaning basically that it hasn't been subjected to rounding. You can ask if a number is an integer or not: (integer? x); but you are actually asking about the number, not its type. So (integer? 1.79e123) is true, as is (real? 1).
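
(Python later grew a rough analogue of asking about the number rather than its type. A sketch:)

    # A float can be integer-valued without being an int.
    x = 1.79e123
    print(type(x))              # <class 'float'>
    print(x.is_integer())       # True: the value is an integer
    print(isinstance(x, int))   # False: the type is not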

For numbers, the difference between querying an object's characteristics and querying its concrete type might seem unimportant. Also, numbers are a special case where the concrete type matters a whole lot, for performance. (The next Scheme standard will add a bunch of procedures like (fx+) that only work on fixnums.) But when you start talking about objects, the difference is whether or not you support duck typing, which I think is a big deal. Languages should make it easy to query an object's interface and inconvenient to query its implementation.

30 June 2006

Pandora

I've been listening to Pandora internet radio. The product is a preferences engine and a web-based radio. You make your own radio stations. Pandora plays songs; you train it to play stuff you like. (I have no idea what the company is trying to be. I doubt they're making a good living off the ads.)

Today it just “clicked” and started playing song after song after song that I liked. Mostly stuff I had never heard of before.

It has been a little frustrating, in part because you can't just tell Pandora what musical qualities you're interested in. Instead you have to list musicians or songs that you like, and fine-tune it by giving a thumbs-up or -down on individual songs. For me, this is much harder. Trying to teach it what I want on my "Jazz" station (mostly piano instrumentals, some artsy orchestra pieces, and a few songs by Billie Holiday) has so far been futile. Perhaps Pandora can't model my preferences. Or maybe it just needs more time.

21 June 2006

Interactive fiction (continued)

For certain authors, the term interactive fiction is no mere pretense.

Emily Short's Galatea is astonishingly beautiful, though a bit difficult to get into if you don't understand the conventions. The important commands are “a topic” to ask Galatea about something, “t topic” to tell her about something, “g” (for again) to repeat the most recent action, “x something” to examine something, and “z” to pause.

If I were to write interactive fiction, I'd end up with something more like Aisle.

16 June 2006

Rules in interactive fiction (and the implications)

Read Natural Language, Semantic Analysis and Interactive Fiction, a paper by Graham Nelson.

Graham argues in favor of rules-based languages as opposed to object-oriented ones.

I concede that bundling properties together into object and class definitions, with inheritance from classes to instances, works well. My objection is rather to the doctrine that when components of a program interact, there is a clear server-client paradigm; that one component exists to serve the needs of another. The contents of a work of interactive fiction are typically not in such relationships. If facts concerning a tortoise must all be in one place, facts concerning an arrow all in another, how are the two to meet? It seems unnatural to have a tortoise-arrow protocol, establishing mutual obligations. Neither exists to serve the other. The tortoise also eats lettuce, meanders about garden locations and hibernates. The arrow also knocks a flower-pot off a wall. By the same token, the world of a large work of interactive fiction is a world of unintended consequences. New and hitherto unplanned relationships between components of the “program” are added in beta-testing, something which the programmers of, say, a spreadsheet would not expect.

But are new relationships among things really so rare? I don't develop spreadsheets; I develop distributed systems that have to interact with several other, very complicated systems. I suspect Mr Nelson is just being coy.

Update: This document might as well have been constructed to be maximally interesting to me. It touches on interactive fiction, programming languages, logic and math (model theory), and linguistics.

20 May 2006

Conversation

Me:  Math.floor(6530219459687219.0)
Mozilla:  6530219459687220
Me:  (sigh)
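
For what it's worth, the problem isn't that doubles run out of integers at this size; that doesn't happen until 2^53, and this number is below it. A quick check (mine), in ordinary JavaScript:

    var n = 6530219459687219;
    n < Math.pow(2, 53);  // true: an IEEE double holds this integer exactly
    (n + 1) - n;          // 1: integers this size still have unit spacing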

19 May 2006

Small, complete environments

If I were to design a programming language today, I wouldn't.

Instead, I would design a programming environment. The basic requirements would be something like this:

  • It's generally pleasant and productive. To me, this means the environment itself is small, simple, and fast.

  • It produces the kind of programs people like (featureful, unburdensome, fast enough, small enough, easy to use, and on time :)).

  • It interoperates. You can drop in external stuff (source code, binary libraries, editors, tools) and it'll work smoothly. And vice versa: you can drop your project into an existing environment that has lots of other stuff going on, and it'll work.

What would this environment look like? I'll know it when I see it. In the meantime, here are some thoughts. It would have:

  1. A language with these properties:

    1. I can compile it to a raw executable that doesn't require the end user to install a VM. The simplest deployment model is, end user drops a single executable file into a directory of his or her choosing. End of story. (Almost nobody attempts this these days, since it's hard to reconcile this with rich libraries, portability across hardware, and good high-level language features. But it can be done.)

    2. I can add a third-party library to my project just by dropping it into the appropriate place in the source tree. (The Java WAR deployment model gets this right.)

    3. It's easy to integrate with pieces written in other languages. (This is really hard to get right. Microsoft .NET interop comes closest, as far as I know.)

    4. Build dependencies are present in the source code. (I haven't thought much about this, but the build system needs it.)

    5. The ability to express basic metadata about a project in source code. If it's a program, as opposed to a shared library, the source code reflects that.

    6. ...And of course the various properties any good language should have, things like well-designed libraries, readable syntax, decent support for refactoring, stupid mistake checking, portability, similar behavior across operating systems and hardware architectures, debugability, and as few irritating features as possible. (Nobody has it all, but there are plenty of nice languages out there.)

  2. A build system with these properties:

    1. It ordinarily works without any makefiles. Dependencies are determined by looking at the source code. Things like linker options are determined by looking at the metadata, which is also in the source code. (A toy sketch of the dependencies-from-source idea appears after this list.)

    2. You can type something ridiculously simple, like "build foo", and it builds foo, including dependencies, relatively quickly.

    3. It can handle generated code.

  3. An installer system with these properties:

    1. Extremely basic, intentionally anemic compared to, say, MSI. Can't install e.g. registry keys. It only installs files; it's parameterizable. But it groks versions.

    2. You can type something ridiculously simple, like "build foo.msi", and it builds foo.msi. Creates an MSI file on Windows, an RPM on Linux, whatever.

    3. It's optional. You can always use the "simple deployment model" (see item 1.1).

  4. A decent IDE and debugger that work well without project files. (Microsoft Visual C++ 2003 and later score high marks here.)
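
As a toy illustration of the dependencies-from-source idea in item 2.1 (my sketch, not a real tool): in JavaScript, a build tool could discover the dependency graph just by chasing require() calls through the source text. Node-style fs is used here for concreteness.

    // Toy sketch: find every file "build main.js" would need, assuming
    // (unrealistically) that dependencies are literal paths to .js files.
    // Real module resolution is much hairier.
    var fs = require("fs");

    function dependenciesOf(filename, seen) {
        seen = seen || {};
        if (seen[filename]) return seen;
        seen[filename] = true;
        var source = fs.readFileSync(filename, "utf8");
        var re = /require\(\s*["']([^"']+)["']\s*\)/g;
        var match;
        while ((match = re.exec(source)) !== null)
            dependenciesOf(match[1], seen);  // recurse into each dependency
        return seen;
    }

    // Object.keys(dependenciesOf("main.js")) lists everything the build
    // needs; no makefile anywhere.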

The language requirements alone are nowhere near being met, but of course you pick the language of your choice and work toward implementing the features that are missing.

Small, complete programming languages hold appeal for lots of people. It seems weird that I haven't seen a small, complete programming environment since vi+cc+make.

14 May 2006

Nixie tubes

Today I learned:

From the early 1950s to the 1970s, Nixie tubes were the dominant display technology.

They only display digits, and they look a little like neon signs and a little like the filament of a light bulb.

04 May 2006

Inform 7

Inform 7 looks fantastic. I wish I had time to play with it.

03 May 2006

Worksheets revisited

I'm convinced that worksheets have not just a place but a central place in IDEs.

Here's the idea. A worksheet is a document that contains code. It's not like Excel: the code is visible all the time. But like Excel, it's "live": it runs the code as needed and displays the results right there on the page. If you edit some code, the worksheet automatically recompiles it and recalculates any affected results throughout the sheet.

The worksheet runs in a sandbox. It has access to useful testing resources—read access to other files in the same project, write access to a temp directory, and so on—but nothing much else.

Unlike a source file, the code in a worksheet doesn't have to be complete classes. Just a statement or an expression is enough. If you type an expression, the worksheet automatically shows you the result. If it's an object, it's displayed as some kind of nicely expandable, inspectable GUI widget. It's integrated with the debugger, of course.
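
A deliberately tiny JavaScript sketch of the core loop (my approximation; a real worksheet would track dependencies, recompute only the affected cells, and sandbox everything):

    // Each cell is an expression string. On any edit, re-evaluate the
    // whole sheet in order and redisplay each source next to its result.
    function evaluateSheet(cells) {
        return cells.map(function (source) {
            try {
                return source + "  =>  " + eval(source);
            } catch (exc) {
                return source + "  =>  error: " + exc.message;
            }
        });
    }

    // evaluateSheet(["1 + 1", "Math.sqrt(2)", "[3, 1, 2].sort()"])
    //   returns ["1 + 1  =>  2", "Math.sqrt(2)  =>  1.4142135623730951",
    //            "[3, 1, 2].sort()  =>  1,2,3"]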

It's also integrated with a word processor, so you can mix text and pictures with your code. None of this is meant to replace traditional source code; you'll have worksheets in your project alongside source files.

I don't know of any existing IDEs that have anything like this. But it sounds ideal for proof-of-concept demonstrations, API documentation, teaching, and testing.

My next side project is going into this territory. I'm kind of excited about it.

tewald

I met tewald via La Leche League.

Tim has the following interesting properties:

  • once worked on MSDN, at the software architect level, I think
  • would be named time if he followed the old-school Microsoft convention of full first name followed by last initial
  • has drunk more gallon jugs of the XML Kool-Aid than anyone else I know
  • has tons of Lego trains in a box somewhere, waiting for his toddler's third birthday so it'll be safe to take them out.

Great guy to have a beer with and talk programming languages, something I hope to do in the not-too-distant future.

02 May 2006

To do

  • Write an article, "What do objects do all day?", talking about the various roles classes play in a program. Some classes are, in no particular order:

    (a) collections of data, all public except for details of memory management and such: ArrayList, BigDecimal, Point, AffineTransform, org.w3c.dom.Document, and I guess String;
    (b) handles to software entities that live in the kernel or out of process: Socket, Process, java.sql.Connection;
    (c) data sources and sinks: ResultSet, OutputStream, Iterator;
    (d) event handlers—these don't model anything at all, you just use them to hook into other objects;
    (e) representations of some random programming language concept that the language doesn't have special syntactic support for, like a logic variable, a closure, or a future;
    (f) models of real-world "things" in the problem domain.

    I'm interested in this because I'm interested in what code means. I think having a clear idea of what you mean gets your job done faster, leads to more readable code and a more flexible design, and reduces bugs.

    The few object-oriented textbooks I remember at all focused their abstract discussion of “what objects mean” entirely on (f). It was as though the authors didn't really think about the question. But I suspect (without substantiation) that different techniques—and different programming language features—matter in different cases.

  • Find out if anyone does studies of questions like these: How many arguments does a method take, on average? (You get different answers if you count method declarations, method calls in source code, method calls at runtime.) What proportion of method calls take any complex arguments (I mean, arguments that can't be represented losslessly as smallish strings)? What proportion of methods simply delegate their job to some other object? What kind of changes do people make to existing code? Are there trends in these numbers over time?

20 April 2006

JavaScript speed test

After fiddling around with several JavaScript performance thingies online, I finally caved in and wrote my own.
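
The heart of any such thingy is a harness along these lines (a guess at the shape; the test cases below are invented for flavor):

    // Time a function over many iterations; Date only has millisecond
    // resolution, so the iteration count has to be large enough to swamp it.
    function time(label, f, iterations) {
        var start = new Date().getTime();
        for (var i = 0; i < iterations; i++)
            f();
        var elapsed = new Date().getTime() - start;
        document.writeln(label + ": " + elapsed + " ms / " + iterations + " calls<br>");
    }

    time("empty function", function () {}, 100000);
    time("string concat", function () {
        var s = "";
        for (var i = 0; i < 100; i++)
            s += "x";
    }, 1000);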

14 April 2006

Miscellany

I'm searching for a way to ask the Visual Studio debugger to attach and break as soon as a given process starts. I don't care which version of Visual Studio, or if I have to attach to the parent process first, or how much scripting I have to do, as long as it's possible to get to a point where I can do this conveniently. For the world's most complicated debugger, this seems like a really obvious feature, but I can't find it.

Update: No luck yet, but I'm slightly warmer. VS.NET has macros; the EnvDTE namespace provides a tiny amount of programmatic access to the debugger; VS2005 allows you to set breakpoints in DLLs that are not loaded.

Firefox has a menu option, View → Page Style → No Style, that is perfect for pages like this one.

My side project is going well. Some 300 tests are passing. I need to pause a moment and figure out specifically what to do next. There are a lot of little things I'm eager to work on.

For one thing, it's slow. I've been using the Venkman JavaScript debugger to find the worst bits. It actually has a usable profiler. Fantastic.

I have never (in maybe 6 tries) succeeded in getting any version of the Visual Studio debugger to work with a symbol server. The error messages are totally useless.

12 April 2006

Generator syntax for state machines

I have a real-world interface that looks like this:

interface ITableScanner {
    void open(Path filename);
    void onRecord(TableRecord record);
    void onEndOfFile(int status, Exception error);
    void close();
}

It's actually a state machine. The methods are only to be called in a very specific order: open(), then zero or more calls to onRecord(), then onEndOfFile(), then close().

Here's the simplest nontrivial implementation of this interface I can come up with:

class DebugTableScanner implements ITableScanner {
    public DebugTableScanner(Writer out) {
        m_out = out;
    }

    public void open(Path filename) {
        m_out.writeln("contents of file " + filename);
    }

    public void onRecord(TableRecord record) {
        if (!isNoise(record))
            m_out.writeln("  - " + record);
    }

    public void onEndOfFile(int status, Exception error) {
        if (status != OK)
            m_out.writeln("  * End of file status: " + status);
        if (error != null)
            error.printStackTrace(m_out);
    }

    public void close() {
    }

    private Writer m_out;
}

I'm going to propose a different syntax for writing that class. Just as you can use generator syntax instead of explicitly writing an Iterator class, you could use this special syntax instead of explicitly implementing any state machine interface.

The special syntax is receive methodName(parameters). It means roughly “yield, then resume here when the appropriate method call is received”. But if this syntax occurs in an if or while test-expression, it also means “skip to the next receive statement if a different method is called instead”. Informal, I know, but I think formal semantics would be easy to hash out.

The above class, implemented using this syntax, looks like this:

ITableScanner debugTableScanner(Writer out) {
    receive open(Path filename) {
        out.writeln("contents of file " + filename);
    }

    while (receive onRecord(TableRecord record)) {
        if (!isNoise(record))
            out.writeln("  - " + record);
    }

    receive onEndOfFile(int status, Exception error) {
        if (status != OK)
            out.writeln("  * End of file status: " + status);
        if (error != null)
            error.printStackTrace(out);
    }

    receive close();
}

You would call debugTableScanner(myOut) instead of new DebugTableScanner(myOut) to create an instance.

Nice things about this:

  • The code appears in the same order that it'll execute at run time.
  • The code and its caller look the same. You could put the code side by side and see the method calls and receive statements line up.
  • No need to explicitly check the object state and throw IllegalStateException (or whatever) if stuff hasn't been called in the right order; the compiler can generate the checks.
  • You could create variables anywhere in the body, and they'll be accessible only to methods called later.
  • Similarly, there's no need to stick the Writer into a member variable. It's lexically scoped. (Yeah, lexical scoping is frequently the highlight of my day...)
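
Incidentally, you can get surprisingly close to this without new syntax in a language that has generators. Here's a rough JavaScript sketch using the function* generator form; it's only an approximation of the proposal. The order checks the compiler would generate become explicit expect() calls, and out, isNoise, and OK are carried over from the example above.

    // Each yield plays the role of a receive statement; the wrapper at the
    // bottom feeds method calls into the generator one at a time.
    function* debugTableScanner(out) {
        var msg = yield;                   // receive open(filename)
        expect(msg, "open");
        out.writeln("contents of file " + msg.args[0]);

        msg = yield;
        while (msg.name === "onRecord") {  // while (receive onRecord(record))
            if (!isNoise(msg.args[0]))
                out.writeln("  - " + msg.args[0]);
            msg = yield;
        }

        expect(msg, "onEndOfFile");        // receive onEndOfFile(status, error)
        var status = msg.args[0], error = msg.args[1];
        if (status != OK)
            out.writeln("  * End of file status: " + status);
        if (error != null)
            error.printStackTrace(out);

        expect(yield, "close");            // receive close()
    }

    function expect(msg, name) {
        if (msg.name != name)
            throw new Error("expected " + name + "(), got " + msg.name + "()");
    }

    // Adapt the generator to the ITableScanner shape:
    function makeScanner(gen) {
        gen.next();  // run to the first yield
        function send(name) {
            return function () { gen.next({name: name, args: arguments}); };
        }
        return {open: send("open"), onRecord: send("onRecord"),
                onEndOfFile: send("onEndOfFile"), close: send("close")};
    }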

Here's a funny real-world example of why my proposed syntax is nice. I had to modify the real-world version of this example to cope with errors. Here's what I ended up with:

class DebugTableScanner implements ITableScanner {
    public DebugTableScanner(Writer out) {
        m_out = out;
    }

    public void open(Path filename) {
        m_out.writeln("contents of file " + filename);
    }

    public void onRecord(TableRecord record) {
        try {
            if (!isNoise(record))
                m_out.writeln("  - " + record);
        } catch (Exception exc) {
            onError(exc);
            throw exc;
        }
    }

    public void onEndOfFile(int status, Exception error) {
        try {
            if (status != OK)
                m_out.writeln("  * End of file status: " + status);
            if (error != null)
                error.printStackTrace(m_out);
        } catch (Exception exc) {
            onError(exc);
            throw exc;
        }
    }

    public void close() {
    }

    private void onError(Exception exc) {
        reportError(
            new Exception(INTERNAL_ERROR_IN_SCANNER, exc));
    }

    private Writer m_out;
}

...and actually considerably worse, since I had code in close() which also had to be wrapped. In my proposed syntax, it would look like this, instead:

ITableScanner debugTableScanner(Writer out) {
    receive open(Path filename) {
        out.writeln("contents of file " + filename);
    }

    try {
        while (receive onRecord(TableRecord record)) {
            if (!isNoise(record))
                out.writeln("  - " + record);
        }

        receive onEndOfFile(int status, Exception error) {
            if (status != OK)
                out.writeln("  * End of file status: " + status);
            if (error != null)
                error.printStackTrace(out);
        }

        receive close();
    } catch (Exception exc) {
        reportError(
            new Exception(INTERNAL_ERROR_IN_SCANNER, exc));
        throw exc;
    }
}

11 April 2006

Worksheet IDEs

I've always liked using worksheet-style programs. Excel is the best-known one, but Mathcad is what I'm really talking about. They're document-oriented, yet interactive. The emphasis is clearly on making something a human can read; but it's also executable.

Hmmm.

I think a worksheet-style programming environment would be interesting. For one thing, it would make it really, really easy to accumulate test cases. I think most developers do informal testing of the variety "I just finished coding X; let's run it and see if it works." If the interactive environment in which you did this kind of testing were a worksheet, you could very easily save the thing and run it again later. It would almost be hard not to build up test cases.

04 April 2006

JavaScript, threading, and continuations

I've got a few bits and pieces of my side project working now, and I ran into something unexpected.

The annoying problem

In a simple GUI app, events (like mouse clicks and keystrokes) are all handled in a single, dedicated thread. If event-handling code takes too long to run, the application becomes unresponsive. It works like this everywhere: Java, Windows, XWin—and client-side JavaScript. But as always, JavaScript is... different:

  1. The whole browser doesn't become unresponsive, just your scripts. This is because scripts don't run in the browser's main GUI thread. Rather, the browser gives you a separate thread dedicated to scripting events. But don't think this lets you off the hook, because...

  2. The browser detects unresponsive scripts, and believe me, you don't want this to happen. Firefox, for instance, pops up a warning message and offers the user the option to kill your script. If the user says no, the same warning message pops up again a few seconds later.

  3. The usual solution to this problem won't work in a browser. In any other system, you would just take the long-running code and move it into its own thread. But scripts can't spawn threads. Now what?

The annoying solution

I found a workaround using window.setTimeout(). You have to change your code to do its work in chunks and use setTimeout() as a substitute for preemptive thread-switching. It sounds like green threads. Unfortunately, it's much worse than that. Consider this:

    function hog() {
        // code that does three time-consuming things
        spellCheckWikipedia();
        findPrimesInDigitsOfPi();
        buildConsensusOnUsenet();
    }

In a language with green threads, you'd just call sleep(1) (or whatever) between steps.

In JavaScript, you have to write this:

    function hog_start() {
        setTimeout(hog_step1, 10);
    }

    function hog_step1() {
        spellCheckWikipedia();
        setTimeout(hog_step2, 10);
    }

    function hog_step2() {
        findPrimesInDigitsOfPi();
        setTimeout(hog_step3, 10);
    }

    function hog_step3() {
        buildConsensusOnUsenet();
        // phew-- done.
    }

Ugly. The really annoying case is when individual steps are too time-consuming. Suppose buildConsensusOnUsenet() takes an unbounded amount of time to run. Then the function needs to be modified to be "setTimeout()-aware"—a huge pain.
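
For example (a sketch with invented helpers), suppose the consensus-building loops over newsgroups. Making it setTimeout()-aware means breaking the loop itself apart and threading the loop state through each callback:

    function buildConsensusOnUsenet_start(groups) {
        var i = 0;
        function step() {
            var deadline = new Date().getTime() + 50;  // work in ~50 ms slices
            while (i < groups.length && new Date().getTime() < deadline)
                postCalmRebuttal(groups[i++]);         // invented unit of work
            if (i < groups.length)
                setTimeout(step, 10);                  // let the browser breathe, then resume
        }
        step();
    }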

Funny thing is: this is just continuation-passing style. You use setTimeout to "jump to" a continuation. I thought this was cute, like a pun is cute. I've seen continuations simulated via exceptions, but never via event queues.

I guess theoretically I could source-transform JavaScript code written in the green-threads style into JavaScript written in continuation-passing style. Not sure I want to go that far.

To do

I have to get back to my current side project, about which more in a bit. But a definite possible future side project is:

  • Create a "foundations of math" wiki combining MediaWiki and the Metamath proof-checking engine.

31 March 2006

Metamath and the art of mathematics

My brilliant coworker G.D. thinks Metamath, and the successor I envision (in the previous entry), are sort of pointless.

I think his main beef with the idea is this: people prove things by inventing crazy new concepts that won't get formalized for a long time. This inventiveness is the really interesting part of math. I think G.D.'s impression is that just about any proof you read in a mathematical journal today probably can't be put into a formal, machine-checkable form for decades to come.

He is probably right. So my hypothetical checked theorem database can never be up-to-the-minute. Then what value could it have?

  • Exploration and education. Even if the database only covers formally grounded math, I think that could get us up to maybe 1950. Number theory is pretty well grounded, I think. Real analysis. Calculus. Euclidean geometry. Group theory. Think how huge it would be. Thousands of human-readable proofs, every step checked, and you can drill down into any step if you don't understand the reasoning. Basically every theorem an undergrad ever bumps into; every theorem I personally could ever hope to understand. It sounds exciting to me.

  • Finding assumptions. Does the fundamental theorem of calculus depend on the axiom of choice? Metamath doesn't have that theorem yet, but when it does, then you'll know. Exposing previously hidden assumptions sounds good for math.

  • Finding gaps. Check out the axiom of choice. Here's a discovery that rocked mathematics. Are there any more of these still lurking in standard math? It seems wildly unlikely... but then, it also seems unlikely that what we've got is that perfect.

  • Mathematician productivity. The ultimate (200-year) goal is to evolve the system into a work environment for mathematicians. If you started every expedition by plunking your conjecture into a worksheet, every once in a while the program could immediately respond with the (known or easily found) proof. Also, Andrew Wiles's first proof of Fermat's last theorem was found to contain a serious error that it took him a year to fix. Of course he would have preferred to have discovered that prior to publication. The current proof is 200 pages; I doubt it's been scrutinized. These thoughts make this idea seem inevitable. It just sounds too useful as a dumb tool.

And: I think the time gap between a new invention in math and its formal grounding has been shrinking. Maybe formalism will catch up to discovery!

I think the essential question is this: do you think the Zermelo-Fraenkel axioms really are one of humankind's most profound achievements? Or were these guys just mincing grammar snobs, coming in after the Newtons and Galoises of the world, acting all important, dotting the i's and crossing the t's—only to get blown out of the water by Gödel before they even finished the job?

Metamath

Q: What is Metamath?

A: Metamath is a tiny language that can express theorems in abstract mathematics, accompanied by proofs that can be verified by a computer program. This site has a collection of web pages generated from those proofs and lets you see mathematics developed formally from first principles with absolute rigor. Hopefully it will amuse you, amaze you, and possibly enlighten you in its own special way.

It has certainly amused and amazed me. (Note: If you don't consider the Zermelo-Fraenkel axioms “one of the most profound achievements of mankind” [here], Metamath might not be for you.)

Metamath is a database of machine-verifiable mathematical proofs. This is actually something of a long-term dream of mine, so I'm surprised and impressed.

Here are the frustrating bits:

  • The database is largeish, but its reach is very limited. Metamath currently comprises 6,403 theorems, of which “2+2=4” is number 5,214.

  • The proofs themselves are unenlightening. For example, Metamath has proved that there are infinitely many primes. But click the link. It's totally indecipherable.

What I'd do differently, to try and address those problems:

  • Make the database more searchable, by giving pages readable English names, not numbers or junk like caoprmo.

  • Allow anyone to edit.

  • Encourage more English text mixed in with the formal statements.

  • Allow pages with non-verifiable proofs, missing steps, syntax errors, etc. But if a proof is correct and verifiable, you'd get a nice string of green lights down one side of the page.

  • Work hard on notation. It has to be human-editable and human-readable. (This strikes me as extremely difficult.)

  • I wouldn't require contributors to prove every step formally. This requires a deep understanding of formal logic that most mathematicians frankly don't have. Instead, you'd just type all the steps in the proof, with the usual gaping logical chasms in between; and the system would use automatic theorem provers to check that each step follows from the preceding ones. This would help with both readability and reach.

  • Now theorem provers suck, so often the system won't be able to prove every step. No problem. Contributors could close a gap by (a) giving the system a hint (e.g. adding the words “...by [[Schphlebotsky's Theorem]]”), (b) inserting a clarifying intermediate step, or (c) making a separate page proving the step as a lemma. And they could work incrementally.

Metamath is already more and better than I would've thought possible. And there are extremely good reasons that Metamath exists and my idea is just a dream. But I think the dream has potential. Who knows—Metamath might evolve in this direction.

30 March 2006

C# lambda expressions and constraint programming

Neat thing about C# lambda expressions: the recipient of a lambda gets to choose whether it wants a compiled function or just a parse tree.

    var f = orders.Where(o => o.ShipCity == "London");

Here the lambda o => o.ShipCity == "London" compiles into either a function or an Expression object, depending on the parameter type of orders.Where(). As the caller, you don't care which.

So I got to thinking. Languages that implement constraint programming as a library feature, like Alice ML, seem to end up looking rather clunky:

    fun money sp =
    let
        val v as #[S,E,N,D,M,O,R,Y] = fdtermVec (sp, 8, [0`#9])
    in
        distinct (sp, v, FD.BND); 
        post (sp, S `<> `0, FD.BND); 
        post (sp, M `<> `0, FD.BND); 
        post (sp, `1000`*S `+ `100`*E `+ `10`*N `+ D `+
                  `1000`*M `+ `100`*O `+ `10`*R `+ E `=
                 `10000`*M `+ `1000`*O `+ `100`*N `+ `10`*E `+ Y, FD.BND); 
        branch (sp, v, FD.B_SIZE_MIN, FD.B_MIN); 
        {S,E,N,D,M,O,R,Y}
    end

It looks a lot nicer in a language like Oz, which has language support for logic variables and so forth.

    proc {Money Root}
       S E N D M O R Y
    in 
       Root = sol(s:S e:E n:N d:D m:M o:O r:R y:Y)
       Root ::: 0#9
       {FD.distinct Root}
       S \=: 0
       M \=: 0
                    1000*S + 100*E + 10*N + D
       +            1000*M + 100*O + 10*R + E
       =: 10000*M + 1000*O + 100*N + 10*E + Y
       {FD.distribute ff Root}
    end

(Example from Mozart documentation.)

Now C# doesn't have that, but it seems like a constraint library similar to Alice's, designed in C#, could take advantage of expression trees to make the rules somewhat more readable:

    sp.Post(S => S != 0);
    sp.Post(M => M != 0);
    sp.Post((S, E, N, D, M, O, R, Y) =>
                           1000*S + 100*E + 10*N + D
                         + 1000*M + 100*O + 10*R + E
              == 10000*M + 1000*O + 100*N + 10*E + Y);

Of course, lambdas only help with rules, and there's much more to the problem statement than rules. Still, I wonder if anyone is doing this.