Highlights from "Algorithms to Live By"
By Brian Christian and Griffiths
The explore/exploit tradeoff tells us how to find the balance between trying new things and enjoying our favorites. Sorting theory tells us how (and whether) to arrange our offices. Caching theory tells us how to fill our closets. Scheduling theory tells us how to fill our time.
optimal solution takes the form of what we’ll call the Look-Then-Leap Rule: You set a predetermined amount of time for “looking”—that is, exploring your options, gathering data—in which you categorically don’t choose anyone, no matter how impressive. After that point, you enter the “leap” phase, prepared to instantly commit to anyone who outshines the best applicant you saw in the look phase.
Full information means that we don’t need to look before we leap. We can instead use the Threshold Rule, where we immediately accept an applicant if she is above a certain percentile. We don’t need to look at an initial group of candidates to set this threshold—but we do, however, need to be keenly aware of how much looking remains available.
Any yardstick that provides full information on where an applicant stands relative to the population at large will change the solution from the Look-Then-Leap Rule to the Threshold Rule and will dramatically boost your chances of finding the single best applicant in the group.
Spend the afternoon. You can’t take it with you. —ANNIE DILLARD
for people there’s always a time cost. It doesn’t come from the design of the experiment. It comes from people’s lives.
Intuitively, we think that rational decision-making means exhaustively enumerating our options, weighing each one carefully, and then selecting the best. But in practice, when the clock—or the ticker—is ticking, few aspects of decision-making (or of thinking more generally) are as important as one: when to stop.
When balancing favorite experiences and new ones, nothing matters as much as the interval over which we plan to enjoy them.
So explore when you will have time to use the resulting knowledge, exploit when you’re ready to cash in. The interval makes the strategy.
Win-Stay, Lose-Shift algorithm: choose an arm at random, and keep pulling it as long as it keeps paying off. If the arm doesn’t pay off after a particular pull, then switch to the other one.
The Gittins index, then, provides a formal, rigorous justification for preferring the unknown, provided we have some opportunity to exploit the results of what we learn from exploring.
Exploration in itself has value, since trying new things increases our chances of finding the best. So taking the future into account, rather than focusing just on the present, drives us toward novelty.
“To try and fail is at least to learn; to fail to try is to suffer the inestimable loss of what might have been.”
The success of Upper Confidence Bound algorithms offers a formal justification for the benefit of the doubt. Following the advice of these algorithms, you should be excited to meet new people and try new things—to assume the best about them, in the absence of evidence to the contrary. In the long run, optimism is the best prevention for regret.
To live in a restless world requires a certain restlessness in oneself. So long as things continue to change, you must never fully cease exploring.
Being sensitive to how much time you have left is exactly what the computer science of the explore/exploit dilemma suggests. We think of the young as stereotypically fickle; the old, stereotypically set in their ways. In fact, both are behaving completely appropriately with respect to their intervals. The deliberate honing of a social network down to the most meaningful relationships is the rational response to having less time to enjoy them.
What makes Google so dominant as a means of accessing the world’s information is less that it finds our text within hundreds of millions of webpages—its 1990s competitors could generally do that part well enough—but that it sorts those webpages so well, and only shows us the most relevant ten. The truncated top of an immense, sorted list is in many ways the universal user interface.
Sorting something that you will never search is a complete waste; searching something you never sorted is merely inefficient. The question, of course, becomes how to estimate ahead of time what your future usage will be.
When we think about the factors that make large-scale human societies possible, it’s easy to focus on technologies: agriculture, metals, machinery. But the cultural practice of measuring status with quantifiable metrics might be just as important. Money, of course, need not be the criterion; a rule like “respect your elders,” for instance, likewise settles questions of people’s status by reference to a common quantity.
First, when you are deciding what to keep and what to throw away, LRU is potentially a good principle to use—much better than FIFO.
Second, exploit geography. Make sure things are in whatever cache is closest to the place where they’re typically used.
So as you age, and begin to experience these sporadic latencies, take heart: the length of a delay is partly an indicator of the extent of your experience. The effort of retrieval is a testament to how much you know. And the rarity of those lags is a testament to how well you’ve arranged it: keeping the most important things closest to hand.
the first lesson in single-machine scheduling literally before we even begin: make your goals explicit. We can’t declare some schedule a winner until we know how we’re keeping score.
Moore’s Algorithm says that we start out just like with Earliest Due Date—by scheduling out our produce in order of spoilage date, earliest first, one item at a time. However, as soon as it looks like we won’t get to eating the next item in time, we pause, look back over the meals we’ve already planned, and throw out the biggest item (that is, the one that would take the most days to consume).
Minimizing the sum of completion times leads to a very simple optimal algorithm called Shortest Processing Time: always do the quickest task you can.
We typically associate procrastination with laziness or avoidance behavior, but it can just as easily spring up in people (or computers, or vampires) who are trying earnestly and enthusiastically to get things done as quickly as possible.
weighted version of Shortest Processing Time is a pretty good candidate for best general-purpose scheduling strategy in the face of uncertainty. It offers a simple prescription for time management: each time a new piece of work comes in, divide its importance by the amount of time it will take to complete. If that figure is higher than for the task you’re currently doing, switch to the new one; otherwise stick with the current task.
None of this switching back and forth is “real work”—that is, none of it actually advances the state of any of the various programs the computer is switching between. It’s metawork. Every context switch is wasted time.
As a consequence, considering more and more factors and expending more effort to model them can lead us into the error of optimizing for the wrong thing—offering
what’s known as a dominant strategy. A dominant strategy avoids recursion altogether, by being the best response to all of your opponent’s possible strategies—so you don’t even need to trouble yourself getting inside their head at all. A dominant strategy is a powerful thing.