January 29th, 2009

Mystery Hunt lessons

It's almost February, but I still haven't posted my wrapup of the 2009 MIT Mystery Hunt. Before I get to the puzzles, let me just say that the mystery hunt has made job interviews very awkward for me! Every time I've been on the job market, I find that I want to cite "lessons learned" from mystery hunts past -- which then forces me to explain what the mystery hunt is ("well, imagine the world's hardest scavenger hunt, with puzzles"), and I end up sounding like a huge geek. Which may not be a bad thing, I guess...

Why does the Mystery Hunt keep cropping up? Well, in my recent interview it started with a discussion of the difference between "enforced" and "permissive" management styles. There is a great and good tendency in any large endeavor, corporate or puzzle-hunt, to draw up "best practices" to codify lessons learned about the most effective ways to do things. But what happens next? There is a crossroads between two fundamental strategies. The "enforcement" strategy tries to figure out how we can forcibly prevent people from doing things other than the "best thing". Bureaucracy and long lists of rules follow, along with authority: someone who will not let X do Y unless they have been convinced that Z has happened.

A better strategy is "permissive": assuming good-faith on the part of all involved, just passively monitor the situation, so that "bad things" can be remedied without intruding on all the people trying to do good things. This is a much more efficient strategy, especially if your resources are constrained — it's less costly to monitor than to prevent — but it involves a lot of trust. Some cultures can sustain the necessary trust, and some cannot.

Mystery hunt example: how do we prevent people from calling in too many wrong guesses at an answer? Team ACME solved the problem by just putting the communal answer phone at the front of the room, so that everyone would notice if someone got crazy with the guessing. Team Codex uses an enforcer: all answers must be called in by the designated operator-on-duty. Lest you get the idea that permissive==good and enforcement==bad, let me state clearly that Codex has solid reasons for its choice: it has so many remote team members (not physically present in team headquarters) that monitoring is difficult -- the simple "phone in the front of the room" doesn't work when your rooms are scattered across Seattle, Zurich, and countless other places.

At OLPC we faced the same issue with system administration: how to we keep our network and computers secure? OLPC came from the permissive MIT/Media Lab heritage: since it is a research and learning organization, MIT uses the permissive strategy, letting users do what they please on the academic network and focusing effort on detecting "bad behavior" (worms, cracked machines, attacks, misconfigurations) and efficiently and effectively black-holing small pieces of the network to contain these problems. If a naive user doesn't properly patch their personal computer and it gets hacked, that machine is noticed almost immediately and dropped off the network. "Traditional" system adminstration focuses on enforcement: by forcing all users to use only patched versions of approved software, we can proactively prevent problems -- but at the cost of also preventing Real Work and Unexpected Solutions.

Back to the mystery hunt: since the hunt is a competitive event, it's easy for people to get carried away with secrecy to protect the team's competitive advantages. But the mystery hunt starts at noon on a Friday, with people arriving that morning from around the country, and it is crucial to get all these new people integrated into the team as quickly as possible. Every minute that some team member can't print/can't access the network/can't access the wiki/can't access the chat rooms is a minute that could be spend advancing the goal: solving puzzles. Permissive strategies are a big win here, especially because the hunt code of conduct, observed by all seriously competitive teams, ensures the necessary trust. Other teams aren't actively seeking to hack into our systems; preventing Google from sucking everything into its archives and index is usually enough.

Another OLPC example: build systems. During the run-up to a stable build, it's very important to keep build changes under control. How to prevent developers from making uncontrolled changes to the stable build? The enforcement strategy appoints a build master Who Alone Is Powerful and forces all changes to be made via the build master. The permissive strategy says that developers are generally trustworthy (and are accountable to management if not!) and retains the ability of anyone to affect the build at any time. Instead of making it arbitrarily harder to fix things as the build approaches stability, we (a) trust the developers to commit only appropriate fixes to stable, (b) watch all the changes (which has the benefit of catching even unintentional changes) and (c) back out any well-intentioned fixes that might adversely affect stability. The buck still stops with the build master to determine what's "safe" for the stable build, but their direct intervention is limited to backing out the infrequent "bad thing" instead of being interposed in every frequent good thing.

This post is very long, and I've only discussed one of the "mystery hunt lessons". Others include effective ways to collaborate with remote resources/solvers, and the importance of making "the right thing easy". The best way to get people to do "the right thing" is to make it the easiest thing to do. For example: a perennial problem is keeping remote hunters "in the loop" about what puzzles have been solved and what new puzzles have been made available. Further, there's a lot of internal bookkeeping that needs to be done on these events. My best solution to the problem was a chatbot who lived in the chatroom ACME used to keep in touch with remote solvers. When a solution was phoned in, we'd type "bot, the solution to FROBNOTIC UPGRADES is BAZNIC". When a new puzzle was made available, "bot, new puzzle: GROOVY FROOD". This was the (only) means to update our collaboration software: the puzzles/answers would be added to the wiki's automatically generated "status blackboard", and new puzzle pages would be created in the wiki to track work done. But, by performing the operation in the shared chat room, it notified the remotes at the same time. Suddenly no one had to be reminded to update the remote hunters! (Codex has a similar system: all answers are phoned in by a dedicated operator, and you are encouraged to communicate with the operator via the shared ringhunters chat room. Codex doesn't have an effective a "new puzzle notification" system, although you could monitor the changelog of the wiki to see new pages being created.)

In our latest hunt, we used Google Spreadsheets extensively to collaboratively solve puzzles. Team members would typically create new sheets in the spreadsheet to test different theories for how the puzzle might work. This had the unexpected side effect of eliminating almost all updates to the puzzle's wiki pages -- it was easy to work in the spreadsheet, and hard to keep switching back to the wiki. The problem was that it is also "hard" to write long free-text blocks inside a spreadsheet, and so most sheets in the spreadsheet were "undocumented", and it was hard reconstructing what had been done on a puzzle when a newcomer picked it up. The discussion on our mailing list started along familiar paths: can we somehow force people to update the wiki? The "make the right thing easy" lesson suggests it would be more profitable to make it "easy" to add free-text blocks to sheets of the spreadsheet -- then the "right" thing will just naturally happen.

Pinot/Journal news

The magic of RSS just informed me that Pinot has had it's 0.90 release, fixing some of the problems with i18n and OpenSearch that I'd found in my Journal2 work, and actually implementing my tagged cd idea! Coolness. I had been thinking about Recoll recently, simply because I like its simple plug-in interface for indexing new formats, but it looks like Pinot has pulled ahead again.

Fabrice, how about importing the HTML-indexing code from Omega like Recoll does, and using the same plugin interface and filters? You'd immediately expand greatly the types of documents you can index, and make it easy for third-parties to create one ones. A plugin basically just takes the file-to-be-indexed and spits out an HTML document with the indexable text. Additional xapian fields can be specified by including appropriate <meta> tags in the output HTML. Full details are in the Recoll manual; it's a pretty straightforward scheme.


Just on the off chance that people read my blog but not Chris Ball's, I'll point out his recent Multipointer Remote Desktop demo. This is a key technology for improving collaboration, and I think it's relevant both to OLPC and to other little companies which might be thinking about collaboration features. Multi-pointer stuff is just a much different experience than "watching" someone else drive the display, as numerous studies (such as this one) have substantiated.

Even more shutting of boxes.

In previous posts I've talked about mathematics and strategies for the "shut the box" game. I've been describing a two-player game, but shut the box can be played with any number of players. How do strategies change?

It's easy to think about the limit case and work backwards. If there are an infinite number of players, eventually someone will "shut the box" — it happens in about 1 in 10 games, after all. Since the round will eventually end this way, then your final score doesn't matter: all that's important is whether you manage to shut the box. Every player's strategy is identical. (With an infinite number of players, there is no "last" player — in fact, even if there were one, there would be an infinitesimal chance of their touching the dice: they game would almost certainly be ended by a previous player's shutting the box before the "last turn" occurred.)

A strategy which maximizes your chances of shutting the box, ignoring final score, is easy to implement and evaluate, and differences from the "largest result" strategy described previously can again be tabulated on a single side of a sheet of paper. This "most zeros" strategy manages to shut the box in 9.8% of games played, but still loses in a two player game — even though the optimal 1st player strategy only manages the shut the box in 9.5% of its games. As first player against an optimal 2nd player, "most zeros" loses 3.3% of the stake in each game, compared to only 2.5% lost by an optimal 1st player (game results: 9.5% 1st player shut box, 8.2% 2nd player shut box, 37.3% 1st player by points, 43.8% 2nd player by points, 0.9% ties). As second player against an optimal 1st player, "most zeros" loses 2.8% of the stake; an optimal second player would win 2.5% of the stake (game results: 9.5% 1st player shut box, 8.8% 2nd player shut box, 41.1% 1st player by points, 39.7% 2nd player by points, 0.9% ties).

Since the box is shut with a fairly high probability, we converge to the "most zeros" strategy fairly quickly as the number of players increases. Intermediate strategies are parameterized by the "score to beat" (the minimum score achieved by previous players) and the probability distribution of the minimum score to be achieved by future players. The complexity of computing exact solutions increases quickly.

In my next post on the topic, I'll discuss an "alternating turns" variant of shut-the-box which might be a more enjoyable real-time-collaborative computer game.

Making shut-the-box fair

The standard game of "shut the box", analyzed in previous posts, has a fundamental flaw: it's not fair! On average in a two-player game, the first player will lose 2.5% of their stake in each game. Can we make a more fair two-player game?

The simplest solution is to change how ties are treated: with optimal strategies, 1% of the games end in ties. We can resolve all ties in favor of the first player, but that's not good enough. We actually need to treat ties like box-shutting, and award double payoffs. And it's still not enough, especially since the 2nd player can alter their strategy to avoid ties if they become too painful.

So let's leave ties alone, and alter the box-shutting payoff. Although the 2nd player has an advantage in the points race, because they know what 1st player score they have to beat, the 1st player has an advantage in shutting the box: if there are able to do so, the game ends immediately and the 2nd player doesn't have a turn. The 2nd player doesn't have the opportunity to, for example, achieve a shut box themselves, tying the game.

In the standard game, box-shutting pays out double: if everyone had to put up a $1 stake to play, then they have to pay another $1 to the player who is able to shut the box. We'll call this a "2x" payout. Even odds work out to a 3.8x payout (accounting for the fact that optimal strategies change as the payout rises). It would be a little awkward to ante up $2.80 a person in a $1 ante game, but it turns out that a 4x payout ($3 more from each person on a shut box) is probably okay: this only gives the 1st player a 0.1% advantage. That might be enough for a casino to live on, but it's probably acceptable among friends.

Another way to even the odds is to force the players to alternate turns. This is hard in the physical game, where there's only one physical box with tiles, but it is natural in a computer version of shut-the-box — and it probably improves gameplay by elimininating the long waits between a player's turns. The first player still has a slight advantage, but computation indicates that this advantage is limited to 0.4% of the stake ("acceptable among friends"). It's hard to compute optimal strategies in a 3 player game, due to combinatorial explosion, but it appears the first player's advantage grows as the number of players does.