Mine was on sorting and searching at the library.
The talk opens with what happens if you take a book off the library shelves and put it back in the wrong place. While writing it, I got curious, so I sent an email to the library, asking:
Hi, I have a few questions about the library system.
- How many books are in the library system?
- Do you ever take a full inventory of the library, scanning every book on the shelves?
- If not, is there any other way to know if a book is missing? (that is, the catalog shows it as Available but it's not actually on the shelves in the right place)
- If you do track missing books, how many are missing right now? How long does it usually take for them to turn up?
I'm not planning a book heist. :) I'm preparing a talk about information technology and libraries for a local event for software engineers.
Engineers are always interested in “failure modes” -- that is, what happens when something goes wrong.
I didn’t get the response in time to change the talk, but the library sent me email this morning with these answers:
- There are 1,629,308 items in the collection.
- No, we do not do a complete inventory of our entire collection.
- We do monthly weeding (de-selection) reports for items that haven't circulated in 1-2 years and that usually catches most missing items. We cover almost the entire collection within one year. However, we also will do a system-wide ILS report and change items automatically to missing status in the computer that haven't circulated in branches in a very long time. We also do this for items stuck in transit mode between locations for a long time.
- We do not track missing items at a level that will provide us with statistics like return rate. Anecdotally, however, it is rare that missing items are located again. They are usually missing because of theft.
(I’ll just note that the way they actually track missing items means they wouldn’t detect items that are only misshelved for a month or two. There might be a lot of them. I find two or three every week.)
Anyway, the talk was picked up on Reddit's programming section and got some wonderful comments. My favorites:
- “My high school did a volunteer day where we took our entire class year and spent an hour in a class learning how they sort the library books, then sent us each to a section to go through, find misplaced books and put them back in order. It took 2-300 of us ~5 hours to sort all of the library.” –Kimano
- “This posting reminds [me] of when I visited a warehouse that had automated storage and retrieval of items from the warehouse.
“One of the cool things that had to happen periodically was essentially the real-world equivalent of defragmenting a hard disk. If you think a hard disk is slow, imagine how slow it is to physically move pallets and cartons!” —grandzooby
- “The reason we can insert books in a shelf is that there are some gaps distributed between books, and insertion shifts a couple books around to make space. ‘Insertion sort is O(n log n)’ is a fun research paper that describes a similar way to organize data in arrays, with enough bogus elements (gaps) for insertion to be logarithmic time, but not so many that binary search is super-logarithmic.” –phkuong
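That last comment is easy to try out. Here is a rough Python sketch of a shelf with gaps: empty slots are `None`, and an insert only shifts the books between the insertion point and the nearest gap. The names (`make_shelf`, `insert`, `items`) are mine, not from the paper, and this is not the paper's algorithm: I find the position with a plain linear scan instead of a binary search, and I never rebalance the gaps, so it only illustrates the "shift a couple of books into the nearest gap" part.

```python
from typing import List, Optional

Shelf = List[Optional[int]]


def make_shelf(sorted_books: List[int], gap_every: int = 1) -> Shelf:
    """Lay out already-sorted books with `gap_every` empty slots after each one."""
    shelf: Shelf = []
    for book in sorted_books:
        shelf.append(book)
        shelf.extend([None] * gap_every)
    return shelf


def items(shelf: Shelf) -> List[int]:
    """Return the books in shelf order, ignoring empty slots."""
    return [book for book in shelf if book is not None]


def insert(shelf: Shelf, item: int) -> int:
    """Insert `item` in sorted position; return how many books had to move."""
    # Position of the first book larger than `item` (linear scan, for brevity).
    pos = len(shelf)
    for i, book in enumerate(shelf):
        if book is not None and book > item:
            pos = i
            break

    # Best case: there is a gap right before that book, so nothing moves.
    if pos > 0 and shelf[pos - 1] is None:
        shelf[pos - 1] = item
        return 0

    # Otherwise push books to the right until we reach the nearest gap.
    right = next((g for g in range(pos, len(shelf)) if shelf[g] is None), None)
    if right is not None:
        shelf[pos + 1:right + 1] = shelf[pos:right]
        shelf[pos] = item
        return right - pos

    # No gap on the right: pull books to the left into the nearest gap.
    left = next((g for g in range(pos - 1, -1, -1) if shelf[g] is None), None)
    if left is not None:
        shelf[left:pos - 1] = shelf[left + 1:pos]
        shelf[pos - 1] = item
        return pos - 1 - left

    # The shelf is completely full: make it longer (everything after pos moves).
    moved = len(shelf) - pos
    shelf.insert(pos, item)
    return moved


shelf = make_shelf([10, 20, 30])   # [10, None, 20, None, 30, None]
print(insert(shelf, 15))           # 0: drops straight into a gap
print(insert(shelf, 12))           # 2: books 15 and 20 slide right by one
print(items(shelf))                # [10, 12, 15, 20, 30]
```

As long as the gaps stay spread out, each insert touches only a few neighbors. Once the gaps near a busy spot fill up, the shifts get longer, which is why the real thing has to redistribute the gaps now and then.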
Hack day was so great that I can’t wait to do another one.