31 December 2008

Junk and your unconscious mind

It's wonderfully easy to contribute on the Web, and as an unfortunate side effect of this essential fact, hoaxes, cranks, and general nonsense abound. I'll euphemistically call this stuff “junk”. Because so much of what you see online is junk, smart people such as yourself develop a finely tuned junk detector. This is fine; in any case, it's important to have one if you plan to use the Web for anything serious.

But your junk detector is probabilistic, factoring in grammar, habits of speech and writing, vocabulary, the writer's opinions and personality, whether the page has pictures of kittens: anything but the actual argument itself, because the whole point of the junk detector is to spare you the time it would take to read that argument. In other words, it's like the worst possible use of ad hominem, a logical fallacy. You guess as much as you can about the author, then judge the value of the page based on that. I see no good way around this. Consequences:

Your main way of evaluating the quality of Web pages is subconscious.

The junk detector is not as accurate as actual critical thought.

False positives mean the reader misses out and the writer fails to connect (making good writing skills more important now than ever before).

False negatives mean you may be duped: the junk detector doesn't protect you from lies, logical fallacies, or really sophisticated forms of “junk”. By the time you decide to read the whole page, the junk detector is done working. Another, smarter junk detector had better kick in!

All of this applies in the non-Web world, too, but the Web is so full of junk, and it's so hard to avoid altogether, that the cheapest possible junk detector is highly rewarding and can instill a false sense of confidence.

Structure

On the Web, alternative reading to whatever you're looking at is never far away. There are even links in most Web pages, forever calling you to random-walk. The result, for the reader, can be a haphazard adventure of reading, interesting at every point but without overall purpose.

The result for writers is that time spent organizing thoughts is usually wasted—nobody wants to read all that. Instead, you write one thought per day in a blog, or contribute to sites like Wikipedia, which generally rejoice in the Web's random-walk nature.

Sometimes it is an unexpected pleasure to open a book and follow the development of a big idea over many chapters. I got this kind of feeling from the mathematics textbooks I mentioned a few days ago.

It's weird for me to even be saying this, because I like that the Web is deeply interconnected and wild. But the Web doesn't seem to generate good content with large-scale structure—the kind of stuff that I find most rewarding to read.

17 December 2008

The School Mathematics Project

JJ had me look at a set of old mathematics textbooks, and I found this.

4.1 Division and repeated subtraction

We can write 7 + 7 + 7 + 7 + 7 + 7 + 7 + 7 + 7 = 7 × 9 = 63.

(a) What is 63 - 7 - 7 - 7 - 7 - 7 - 7 - 7 - 7 - 7?

(b) What is 63 ÷ 7?

(c) Explain the connection between the last two questions.

(d) If you were to work out 65 - 7 - 7 - 7 - 7 - 7 - 7 - 7 - 7 - 7, what would you find? How would you give your answer?

4.2 Division of a whole number by a whole number

Example 11 (Method I)

If you were asked to work out 5489 ÷ 12 by finding out how many times you could subtract 12 from 5489, you wouldn't be very pleased!

5489
-12
5477
-12
5465
-12
5453
-12
5441
-12
5429
-12
5417

This is just the start. It would certainly take a long time. However, as you will have realized, there are quicker ways of doing this division.

(Method II)

12 ) 5489    Consider 5400. There are more than 400 (but less than 500) twelves in 5400. Let us subtract 400 of them all at once.
     4800    (400 twelves)
      689    Now consider 680. There are more than 50 (but less than 60) twelves in 680. Subtract 50 of these all at once.
      600    (50 twelves)
       89    Finally, we know that there are 7 twelves in 89 which if we subtract them leave us with a remainder of 5.
       84    (7 twelves)
        5

So we have subtracted (400 + 50 + 7) twelves and have 5 left over.

5489 ÷ 12 = 457,   remainder 5.

If we were dividing in order to find the answer to a ‘fair shares’ question, we would write

5489 ÷ 12 = 457 5/12

You will probably have recognized this method. Why?

I'll stop there. What struck me as cool about this is that it takes long division, a complex procedure which most students learn by rote, and at once (a) explains why it works and (b) makes it seem simple and obvious.
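For the programmers in the audience, here is a minimal sketch of Method II in Python. This is my own rendering, not anything from the book: the function name `divide_by_chunks` and the list of recorded steps are made up for illustration, and it assumes a non-negative whole-number dividend and a positive whole-number divisor.

```python
def divide_by_chunks(dividend, divisor):
    """Divide by subtracting big multiples of the divisor, one place value at a time.

    Returns (quotient, remainder, steps), where each step is a pair
    (how many divisors were subtracted at once, the amount subtracted).
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("expects a non-negative dividend and a positive divisor")

    remainder = dividend
    steps = []

    # Find the largest place value (1, 10, 100, ...) at which at least
    # one chunk of `divisor * place` might still fit.
    place = 1
    while divisor * place * 10 <= remainder:
        place *= 10

    while place >= 1:
        # How many chunks of (divisor * place) can we subtract? At most 9.
        digit = 0
        while divisor * place * (digit + 1) <= remainder:
            digit += 1
        if digit:
            count = digit * place
            remainder -= divisor * count
            steps.append((count, divisor * count))
        place //= 10

    quotient = sum(count for count, _ in steps)
    return quotient, remainder, steps


print(divide_by_chunks(5489, 12))
# (457, 5, [(400, 4800), (50, 600), (7, 84)])
```

The recorded steps line up with the book's working: 400 twelves, then 50 twelves, then 7 twelves, with 5 left over.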

The example is from SMP Book C, published in 1969 by Cambridge University Press. JJ has the whole series. They seem quite good, relative to what I recall from grade school. The approach is conversational with a lot of questions. Very few paragraphs are more than a few lines long. There are exercises but no “word problems”. The books are printed in black and red ink. There are no photographs or sidebars. The subject matter is richly mathematical. There is very little arithmetic, which must have been a separate curriculum. But the first few books (it's hard to tell, but they appear to be directed at students 12 to 15 years old) have chapters about things like relations, directed graphs, symmetry, counting possibilities, and why a slide rule works.

The SMP stands for School Mathematics Project, a British nonprofit. They're still making mathematics textbooks.

09 December 2008

The very best of jorendorff?

I like Language Log, but I would like it even better if there were less of it.

Wouldn't it be keen if there were a site where you could enter the URL of any blog, and it would give you back a feed containing only half the entries: the best ones, according to whatever metric of popularity the service could find (links, diggs, whatever)?
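To make the idea concrete, here is a rough sketch of the filtering step in Python. Everything in it is hypothetical: the `Entry` class, the `inbound_links` score standing in for "whatever metric of popularity", and the choice to round up when a feed has an odd number of entries. Fetching the feed and finding the scores is the hard part, and it is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    title: str
    inbound_links: int  # hypothetical stand-in for a popularity score


def best_half(entries):
    """Keep the more popular half of the entries, in their original feed order."""
    keep = (len(entries) + 1) // 2  # assumption: round up for odd counts
    # Rank entry positions by score, highest first. The sort is stable,
    # so ties go to whichever entry appears earlier in the feed.
    ranked = sorted(range(len(entries)),
                    key=lambda i: entries[i].inbound_links,
                    reverse=True)
    chosen = set(ranked[:keep])
    return [entry for i, entry in enumerate(entries) if i in chosen]


# Made-up feed, purely for illustration.
feed = [Entry("A", 3), Entry("B", 10), Entry("C", 1), Entry("D", 7), Entry("E", 7)]
print([entry.title for entry in best_half(feed)])
# ['B', 'D', 'E']
```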

I proposed this on IRC, where mhoye and humph reacted with a definite meh. (Note: All these chat excerpts are edited to give the illusion that there's a single coherent conversation going on.)

<humph> what's popular and what's interesting to me are not often the same for me
<mhoye> humph++
<mhoye> That sounds like a good way to be drowning in mediocrity, for sure.
<mhoye> jorendorff: Apply your theory to popular music.
...
<ted> so your theory is that if you like a blog enough to subscribe, you would like it even more if you only got the absolute best posts?
<jorendorff> ted: my theory is that "absolute best posts" means something
<mhoye> God, no.
<mhoye> See also, "absolute best music", "absolute best paint color."

I failed several times at explaining why I think this. Let me try again here.

Simple rating systems are common on the Web. Some, like the Slashdot comment ratings (“Score: 5, Insightful” and such), perform very well. Others, like online restaurant guides, are useless. Ratings work when users agree on what's good and what's bad. On Slashdot, the worst posts are pretty content-free. Subjective tastes don't even really enter into it. Restaurants are a different story. In the case of music, mhoye's example, I'm sure any two people can find plenty to disagree on. But:

<jorendorff> mhoye: do you have a favorite band?
<mhoye> Not just one!
<jorendorff> mhoye: I'm struggling to get you guys to engage on any specific example :(
<mhoye> Jorendorff: Ok, here. "Entertainment", by "Gang Of Four".
<jorendorff> mhoye: excellent - what are your favorite songs off that album?
* mhoye picks "I Found That Essence Rare" and "Anthrax"

Both of mhoye's picks are among what the Apple Store calls the “TOP SONGS” from that album. Both are mentioned in Apple's review. Maybe mhoye picked them because they're the best tracks on the album.

Counterexamples abound too. We could settle this scientifically by sampling a blog's audience, having those people rate posts for a while, and seeing how closely their ratings correlate.
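For what it's worth, the "how closely do they correlate" part is easy. Here is a small sketch that computes the Pearson correlation between two readers' ratings of the same posts. Pearson is just one reasonable choice of measure, the function name is my own, and the ratings are invented for illustration; no such survey has been run.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of ratings.

    Values near 1 mean the two readers agree on which posts are better;
    values near 0 mean their ratings are unrelated.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one reader rates everything the same")
    return cov / sqrt(var_x * var_y)


# Invented ratings (1 to 5) from two hypothetical readers for the same six posts.
reader_a = [5, 4, 2, 1, 3, 4]
reader_b = [4, 5, 1, 2, 3, 4]
print(round(pearson(reader_a, reader_b), 2))
```

With a real sample you would compute this for many pairs of readers and look at the spread, not just one pair.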

Instead, let's play a silly game. See if you can stand to read these two entries from my old writing journal: Zen in space and the swoon. I believe one of those is about as good as I can write and the other is flat-out bad. I furthermore immodestly claim that those are two different things! And I think you might agree with me on which is which. We'll see (if you're willing) in the comments.