My friend Ellen sent me a link to this WSJ article about quantifying political prognostication.
As 2006 approached, pundits performed the annual rite of making predictions for the year ahead. Scripps Howard’s prognosticator expects the withdrawal of U.S. troops from Iraq to begin. The Christian Science Monitor published a forecast that North Korea or Iran will acquire nuclear weaponry. On Fox News, Bill Kristol predicted another Supreme Court vacancy, while Brit Hume’s crystal ball saw an acquittal for Lewis “Scooter” Libby. (The Wall Street Journal’s New Year’s Eve look-ahead to 2006 couched most of its political forecasts with the word “likely.”)
Such predictions are good fun. But in general, the prognostications of political pundits are about as accurate as a chimp throwing darts. At least that’s the finding of “Expert Political Judgment,” a new book by University of California, Berkeley, political psychologist Philip Tetlock. From 1987 to 2003, Prof. Tetlock coaxed 284 political experts of all stripes — academics, journalists and think-tankers from across the political spectrum — to make specific, verifiable forecasts. He looked at more than 27,000 predictions in all.
Prof. Tetlock’s innovation was to elicit numerical predictions. As he noted in an interview with me, political punditry tends toward the oracular: statements vague enough to encompass all eventualities. But by promising confidentiality and relying on the “curiosity value of the project,” he was able to get pundits to provide probability estimates for such questions as whether certain countries’ legislatures would see shifts in their ruling parties, whether inflation or unemployment would rise and whether nations would go to war.
Without numerical predictions, “it’s much easier to fudge,” Prof. Tetlock told me. “When you move from words to numbers, it’s a really critical transition.” What he found is that people with expertise in explaining events that have happened aren’t very successful at predicting what will happen.
He demonstrated this by checking their predictions against reality, and then comparing the humans’ performance with that of several automated prediction schemes. The simplest type was chimp-like. No chimps were harmed in the experiment; Prof. Tetlock essentially used random numbers. More complex sets of predictions were based on the frequency of similar events in the past. The virtual chimps did about as well as humans, while several of the more-complex schemes exceeded the best human forecasters.
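For the quantitatively inclined, here's roughly what that comparison involves. Tetlock's actual scoring is more elaborate than this, but the Brier score (the average squared gap between a probability forecast and the 0-or-1 outcome) is a standard way to grade probability predictions, and a few lines of Python are enough to pit a random "chimp" and a simple base-rate extrapolator against an invented expert. All the numbers below are made up for illustration.

```python
import random

def brier_score(forecasts, outcomes):
    """Average squared gap between predicted probabilities and what happened
    (1 if the event occurred, 0 if not). Lower is better; a permanent 50/50
    hedge scores 0.25."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(outcomes)

# Invented outcomes for ten yes/no questions (1 = the event occurred).
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

# "Chimp" baseline: random probabilities, no knowledge at all.
chimp = [random.random() for _ in outcomes]

# Base-rate baseline: always predict how often similar events have happened
# before. (In a real test this frequency would come from past data, not from
# the very outcomes being graded.)
extrapolator = [0.3] * len(outcomes)

# An invented expert: confident, but not especially well calibrated.
expert = [0.9, 0.2, 0.7, 0.6, 0.1, 0.8, 0.3, 0.7, 0.6, 0.2]

for name, preds in [("chimp", chimp), ("base rate", extrapolator), ("expert", expert)]:
    print(f"{name:10s} Brier score: {brier_score(preds, outcomes):.3f}")
```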
I don’t suppose that’s a great shock. As a famous philosopher once said, “Often uncertain the future is.” Why should so-called experts be any less subject to that uncertainty?
Prof. Tetlock wants to see elevated debate and improved punditry, and he has several ideas for how to make it happen. One is for pundits to hone their skill by playing prediction markets — betting pools that assign values to future events such as a Republican victory in a gubernatorial election. These markets, like Prof. Tetlock’s study, force prognosticators to make quantifiable bets and provide feedback in the form of monetary gains or losses — if you back a losing outcome, you lose money (see more on this at Wikipedia). He found that the best forecasters operate in fields like meteorology “in which they get quick, unequivocal feedback on predictions.”
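To make that feedback concrete: most of these markets use a simple binary contract that pays $1 if the event happens and nothing if it doesn't, so the price you pay doubles as the market's implied probability. A toy sketch of the arithmetic, with made-up numbers:

```python
def contract_pnl(price, contracts, event_occurred):
    """Profit or loss from buying binary prediction-market contracts that pay
    $1 each if the event occurs and $0 otherwise. `price` is the cost per
    contract, which can also be read as the market's implied probability."""
    payout = 1.0 if event_occurred else 0.0
    return contracts * (payout - price)

# Back a 100-contract position at 60 cents each:
print(f"{contract_pnl(0.60, 100, True):+.2f}")   # +40.00 if the event happens
print(f"{contract_pnl(0.60, 100, False):+.2f}")  # -60.00 if it doesn't
```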
The recommendation with which he ends the book is the most far-reaching: Prof. Tetlock urges independent monitoring of experts’ predictions. In the interview, he suggested that either two media organizations — he named The Wall Street Journal and the New York Times — or two respected think tanks join forces to get experts on the record with numerical predictions, and then regularly report the results.
“It would be a good thing” to do political punditry better, Prof. Tetlock told me. “It would be good for society, and it would be good for science.”
I totally agree, and I’d love to see that happen, though I’d rather see it done as an open-source Web project than by media outlets like the NYT and the WSJ. Keep track of every quantifiable prediction every syndicated pundit and widely-read blogger makes, record how those predictions turned out, and turn it into a scorecard of some kind so they can be ranked.
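The core of such a scorecard wouldn't be complicated. Here's a back-of-the-envelope sketch in Python, with made-up pundits and claims; I'm using the Brier score (lower is better) as the grading rule, though a real site would want to debate that choice:

```python
from collections import defaultdict

class PunditScorecard:
    """Minimal tracker for quantified pundit predictions.

    Each prediction is a probability (0.0-1.0) that a stated claim will come
    true; once the outcome is known, it is scored, and pundits are ranked by
    their average Brier score (lower is better)."""

    def __init__(self):
        self.scores = defaultdict(list)   # pundit -> per-prediction scores
        self.pending = []                 # (pundit, claim, probability) awaiting outcomes

    def record(self, pundit, claim, probability):
        self.pending.append((pundit, claim, probability))

    def resolve(self, claim, occurred):
        """Mark a claim as true or false and score everyone who predicted it."""
        still_open = []
        for pundit, c, p in self.pending:
            if c == claim:
                outcome = 1.0 if occurred else 0.0
                self.scores[pundit].append((p - outcome) ** 2)
            else:
                still_open.append((pundit, c, p))
        self.pending = still_open

    def rankings(self):
        """Pundits sorted from best (lowest average score) to worst."""
        return sorted((sum(s) / len(s), pundit) for pundit, s in self.scores.items())

# Hypothetical usage with invented pundits:
board = PunditScorecard()
board.record("Pundit A", "Libby acquitted in 2006", 0.8)
board.record("Pundit B", "Libby acquitted in 2006", 0.3)
board.resolve("Libby acquitted in 2006", occurred=False)
for score, pundit in board.rankings():
    print(f"{pundit}: {score:.2f}")
```

Ranking by a proper scoring rule like this rewards honest probabilities rather than bold guesses, which is exactly the kind of accountability I have in mind.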
I can’t seem to find it now, but I distinctly recall Slate magazine publishing a list of Election 2000 predictions by a wide array of media types. It asked how many electoral votes Bush and Gore would get, whether Hillary Clinton or Rick Lazio would win the NY Senate race, and one extra prediction of the pundit’s choosing. Someone – I think it may have been Peggy Noonan, but I can’t swear to it – picked Bush to win over 400 EVs while carrying California. Whoever it was, I think that prediction should be part of their byline on every op-ed piece they write.
During the NFL season, the Chronicle sports section features a full-page ad by an auto dealership, which lists each of its salespersons’ picks in that week’s games, ordered by their record so far. I presume the winner gets some kind of award at the end of the year, while the poor sap who finishes last has to wash everyone else’s car or something like that. If that kind of accountability is good enough for football fans, isn’t it good enough for pundits?
Steven Brill’s old magazine, Brill’s Content (reference here and here), used to track the major television pundits in just such a fashion. It was a good idea then, and it’s still a good idea now.