I know the media loves a horse race and all, but this is pushing it.
With the dust having finally settled after the prolonged Democratic presidential primary, a new CNN/Opinion Research Corporation poll shows Sens. John McCain and Barack Obama locked in a statistical dead heat in the race for the White House.
With just over four months remaining until voters weigh in at the polls, the new survey out Tuesday indicates Obama holds a narrow 5-point advantage among registered voters nationwide over the Arizona senator, 50 percent to 45 percent. That represents little change from a similar poll one month ago, when the presumptive Democratic presidential nominee held a 46-43 percent edge over McCain.
CNN Polling Director Keating Holland notes Tuesday’s survey confirms what a string of national polls released this month have shown: Obama holds a slight advantage over McCain, though not a big enough one to constitute a statistical lead.
“Every standard telephone poll taken in June has shown Obama ahead of McCain, with nearly all of them showing Obama’s margin somewhere between three and six points,” Holland said. “In most of them, that margin is not enough to give him a lead in a statistical sense, but it appears that June has been a good month for Obama.”
[…]
The poll, conducted June 26-29, surveyed 906 registered voters and carries a margin of error of plus or minus 3.5 percentage points.
Nate Silver points out one obvious problem. From the National Council on Public Polling’s “20 Questions A Journalist Should Ask About Poll Results”:
12. Who’s on first?
Sampling error raises one of the thorniest problems in the presentation of poll results: For a horse-race poll, when is one candidate really ahead of the other?
Certainly, if the gap between the two candidates is less than the sampling error margin, you should not say that one candidate is ahead of the other. You can say the race is “close,” the race is “roughly even,” or there is “little difference between the candidates.” But it should not be called a “dead heat” unless the candidates are tied with the same percentages. And it certainly is not a “statistical tie” unless both candidates have the same exact percentages.
And just as certainly, when the gap between the two candidates is equal to or more than twice the error margin – 6 percentage points in our example – and if there are only two candidates and no undecided voters, you can say with confidence that the poll says Candidate A is clearly leading Candidate B.
When the gap between the two candidates is more than the error margin but less than twice the error margin, you should say that Candidate A “is ahead,” “has an advantage” or “holds an edge.” The story should mention that there is a small possibility that Candidate B is ahead of Candidate A.
Emphasis added. Certainly makes for a different story that way, does it not? We see this all the time, and in all kinds of races, but given that Obama has held a consistent lead over McCain for several months now – as in, not a single recent national poll has shown McCain with a lead – it’s particularly egregious to call this race a “dead heat.”
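To make the NCPP guidance concrete, here is a minimal sketch of its rule of thumb applied to the CNN numbers. The function name and labels are my own, and the guidance doesn't say what to do when the gap exactly equals the error margin, so treating that case as "ahead" is an assumption on my part.

```python
def describe_lead(pct_a, pct_b, moe):
    """Rough paraphrase of the NCPP rule of thumb (labels are illustrative)."""
    gap = abs(pct_a - pct_b)
    if gap == 0:
        return "tied"             # the only case NCPP allows as a "dead heat"
    if gap < moe:
        return "close"            # don't say either candidate is ahead
    if gap < 2 * moe:
        return "ahead"            # assumption: gap == moe also lands here
    return "clearly leading"      # gap is at least twice the error margin


# CNN poll: Obama 50, McCain 45, margin of error 3.5 points
print(describe_lead(50, 45, 3.5))
```

By that yardstick a 5-point gap with a 3.5-point margin falls in the "ahead" bucket, not the "dead heat" one.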
One more point: Way back in 2004, Kevin Drum asked a couple of statisticians a question that really should be asked more frequently in all matters related to polling:
In fact, what we’re really interested in is the probability that the difference is greater than zero — in other words, that one candidate is genuinely ahead of the other. But this probability isn’t a cutoff, it’s a continuum: the bigger the lead, the more likely that someone is ahead and that the result isn’t just a polling fluke. So instead of lazily reporting any result within the MOE as a “tie,” which is statistically wrong anyway, it would be more informative to just go ahead and tell us how probable it is that a candidate is really ahead.
He goes on to provide an Excel spreadsheet that allows you to make that exact calculation. And guess what? Based on the CNN poll, the probability that Obama is actually ahead is almost 94%. Like I said, it sure looks different when reported that way, doesn’t it?
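If you'd rather not open Excel, the same kind of calculation can be sketched in a few lines of Python. I can't vouch that this is exactly the formula in Drum's spreadsheet; it's a standard normal approximation that assumes a simple random sample and ignores any weighting or design effect, and the function name is made up for illustration.

```python
import math


def prob_ahead(share_a, share_b, n):
    """Approximate probability that candidate A truly leads candidate B.

    Assumes a simple random sample of size n and a normal approximation
    for the difference of two shares from the same poll.
    """
    diff = share_a - share_b
    # Variance of the difference of two multinomial proportions
    variance = (share_a + share_b - diff ** 2) / n
    z = diff / math.sqrt(variance)
    # Standard normal CDF evaluated at z
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# CNN poll: Obama 50%, McCain 45%, 906 registered voters
print(round(prob_ahead(0.50, 0.45, 906), 3))
```

Under those assumptions the result comes out at roughly 0.94, in line with the spreadsheet figure.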
(I also discussed this at Kuff’s World.)
In what I hope to offer as only a *rare* partial defense of spotty journalism …
For once, I’ll disagree with Nate on statistical grounds. The poll is not the outcome, so referring to it as a statistical dead heat is, in that light, a defensible nod to the reality that the poll is an artifice. It is a statistical measure of probability taken at a time other than the actual period when people cast ballots. In this instance, I think Nate gives too much credence to the certainty of the measured outcome (which has led to some egregious errors in the predictions over at 538 during the primaries). While the shorthand understanding of the polling may warrant his conclusion, I’m not sure that a purely statistical understanding of it does.
What’s more irritating to me in reading the CNN take is their selection of “three to six” as the range of polling margins for Obama. While they may be outliers, there are at least a few with double-digit leads outside of the margin of error.
It strikes me that there’s a tendency for an outlet to prop up its own poll once one is taken. Otherwise, why go through the trouble of conducting yet another poll? Still, it’d be better to see CNN refer to the range of Obama leads a bit more accurately.