07 January 2008

This week I learned...

  • Before desktop computers were widely used in China, telegraph operators there had to memorize every Chinese character's GB 2312 character code.

  • Moleskin is made from cotton, not moles. (In other news, guacamole is made from avocados.)

  • Garbage collection in Erlang is per-process. This seems weird—are messages copied from process to process?—but as that article explains, there are advantages, too.

  • The Haskell standard library contains over 100 operators—that is, functions whose names consist of ASCII symbols, like .| and |. and @?= and @=?. Someone must stop these madmen.

  • I learned a few very basic odds and ends of category theory.

    The book I'm reading (by Benjamin Pierce) offers “injective functions are monic in Set; surjective functions are epic” as a mnemonic, you know, to help you remember monic and epic. This has to be the worst mnemonic of all time. I just don't see anything helpful about it. Sur- means “over”. Epi- means “on top of”. I can never remember the difference between injective and surjective to begin with.
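For reference, here are the standard definitions (mine, not Pierce's wording): monic means left-cancellable, epic means right-cancellable, and in Set those properties line up with injective and surjective respectively, which is all the mnemonic is saying.

```latex
% Standard definitions, for f : A \to B and arbitrary arrows g, h:
f \text{ injective}  &\iff \forall x, y.\; f(x) = f(y) \Rightarrow x = y \\
f \text{ monic}      &\iff \forall g, h.\; f \circ g = f \circ h \Rightarrow g = h \\
f \text{ surjective} &\iff \forall b \in B.\; \exists a \in A.\; f(a) = b \\
f \text{ epic}       &\iff \forall g, h.\; g \circ f = h \circ f \Rightarrow g = h
```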
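For flavor, here is how cheaply those hundred-plus operators get minted: in Haskell, any run of symbol characters can name a function. (The |> below is a made-up example of mine, not one of the standard library's.)

```haskell
-- Any sequence of symbol characters can name a Haskell function.
-- This |> is a hypothetical example, not a standard library operator;
-- it just applies its left argument to the function on the right.
(|>) :: a -> (a -> b) -> b
x |> f = f x

main :: IO ()
main = print (3 |> (+ 1) |> (* 2))  -- prints 8
```

Operators default to infixl 9, so the pipeline above groups as ((3 |> (+ 1)) |> (* 2)).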


slawekk said...

"The Haskell standard library contains over 100 operators"

In other news, mathematicians use more than 4900 symbols. The emotional attachment of programming language designers to ASCII is rather mysterious to me. It clearly wastes the bandwidth of the human visual cognition channel.

jto said...

I'm not sure why you think the attachment is emotional. Some advantages of ASCII are:

* You can type it on an ordinary PC keyboard.

* And everybody knows how.

* You can even read it to somebody over the phone. (Well, you stand a chance anyway.)

* In the fonts programmers use, all the ASCII symbols are distinguishable, even at small sizes.

* It's unlikely to be destroyed by buggy systems (like editors, web clients and servers, email systems, source control, filesystems, Unix terminal emulators, compilers, linkers, etc.)

* It won't be rendered unreadable just because you're missing a font.

* It works fine in a monospace font, which a lot of programmers prefer (not me, but it doesn't matter what I think).

* Most programmers would rather see "in" than "∈" anyway.

There's also the question of plain text (whether ASCII or Unicode) vs. something visually richer, like diagrams or math-like notation. But you need source control and diffing; and you want good editors and power tools (like perl) for large changes. Plus someone would need to invent a user interface for diagrams that doesn't suck.

slawekk said...

Of course there are always good reasons to do things as before. Increasing the set of symbols that programming languages use would require new tools for authoring and presentation. The question is whether the effort would be justified. I believe it would.

The best way to see the potential benefit is to go the other way: see what happens to mathematical notation when you try to express it using ASCII symbols. This is what most theorem provers do, turning beautiful math into an unreadable mess.

As soon as a language allows you to define syntax (as Haskell does, to some extent), people start using ASCII art like .|. or ->. There is a natural need for a richer set of symbols.
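GHC actually meets this halfway: its UnicodeSyntax extension accepts symbols such as → and ∷ in place of the ASCII digraphs. A minimal sketch (the function name is my own, made up for illustration):

```haskell
{-# LANGUAGE UnicodeSyntax #-}

-- With UnicodeSyntax enabled, ∷ stands for :: and → for ->.
-- "double" is a hypothetical example function.
double ∷ Int → Int
double x = x * 2

main ∷ IO ()
main = print (double 21)  -- prints 42
```

The Unicode forms are purely alternative spellings; the compiled program is identical to the ASCII version.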