Which pollster was most accurate in 2010

Sampling error reflects the fact that a poll surveys only some portion of the electorate rather than everybody. This matters less than you might expect; a poll of 1,000 voters will miss the final margin in the race by an average of only about 2 percentage points.
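As a rough back-of-envelope illustration of that order of magnitude (my own sketch in Python, not FiveThirtyEight's method), assuming a simple random sample and a roughly 50/50 race:

    import math

    n = 1000   # respondents
    p = 0.5    # assume a roughly even race

    # Standard error of one candidate's share, then of the margin (share_A - share_B),
    # which is about twice as large because the two shares move in opposite directions.
    se_share = math.sqrt(p * (1 - p) / n)    # ~0.016, i.e., 1.6 points
    se_margin = 2 * se_share                 # ~0.032, i.e., 3.2 points

    # For normally distributed errors, the average absolute miss is sqrt(2/pi), about 0.8, of the SE.
    avg_abs_miss = se_margin * math.sqrt(2 / math.pi)
    print(f"average miss on the margin: {avg_abs_miss * 100:.1f} points")

This crude version lands in the same ballpark as the figure in the text, a couple of points of error from sampling alone.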

Another concern is that polls are almost never conducted on Election Day itself. I refer to this property as temporal or time-dependent error.

There have been elections when important news events occurred in the 48 to 72 hours that separated the final polls from the election, such as the debate just before the New Hampshire Democratic primary, or the revelation of George W. Bush's drunken-driving arrest shortly before the 2000 election. If late-breaking news can sometimes affect the outcome of elections, why go back three weeks in evaluating pollster accuracy?

Well, there are a number of considerations we need to balance against the possibility of last-minute shifts in the polls. Nonetheless, the pollster ratings account for the fact that polling on the eve of the election is slightly easier than doing so a couple of weeks out. Our research suggests that even if all polls were conducted on Election Day itself (no temporal error) and took an infinite sample size (no sampling error), the average one would still miss the final margin in the race by about 2 percentage points.
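One way to build intuition for that irreducible component is to treat the error sources as independent and combine them in quadrature. This is purely an illustrative model of my own; the 2-point floor is the article's figure, while the other component sizes are placeholders:

    import math

    # Hypothetical, independent error components, in percentage points.
    sampling_error = 3.2    # e.g., from a 1,000-person sample (placeholder)
    temporal_error = 1.5    # drift between the field date and Election Day (placeholder)
    intrinsic_error = 2.0   # the roughly 2-point floor described above

    # Independent errors combine roughly in quadrature, so driving the first two
    # to zero still leaves the intrinsic, pollster-induced component.
    total = math.sqrt(sampling_error**2 + temporal_error**2 + intrinsic_error**2)
    floor = math.sqrt(intrinsic_error**2)
    print(f"typical total: {total:.1f} pts; with perfect sampling and timing: {floor:.1f} pts")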

However, some polling firms are associated with more of this type of error than others. Consider House Majority Leader Eric Cantor's loss to Dave Brat in Virginia's 2014 Republican primary. It was a stunning upset, at least according to the polls. For instance, a Vox Populi poll had put Cantor ahead by 12 points.

Instead, Brat won by 12 points. The Vox Populi poll missed by 24 points. According to Simple Plus-Minus, that poll would score very poorly.

That wasn't even the least accurate poll of the race: an internal survey for the Cantor campaign, conducted by McLaughlin & Associates, had Cantor up by 34 points — a 46-point error! If we calculated something called Relative Plus-Minus (how the poll compares against others of the same race), the Vox Populi poll would get a score of -22, since it was 22 points more accurate than the McLaughlin survey. Advanced Plus-Minus, the next step in the calculation, seeks to balance these considerations. It weights Relative Plus-Minus based on the number of distinct polling firms that surveyed the same race, treating Simple Plus-Minus as equivalent to three polls.

For example, if six other polling firms surveyed a certain race, Relative Plus-Minus would get two-thirds of the weight and Simple Plus-Minus would get one-third. The short version: When there are a lot of polls in the field, Advanced Plus-Minus is mostly based on how well a poll did in comparison to others of the same election.
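To make that weighting concrete, here is a minimal Python sketch. Only the three-poll rule for Simple Plus-Minus comes from the text; the function name and the input values are hypothetical:

    def advanced_plus_minus(simple_pm, relative_pm, n_other_firms):
        # Relative Plus-Minus is weighted by the number of distinct other firms that
        # polled the race; Simple Plus-Minus counts as the equivalent of three polls.
        w_relative = n_other_firms / (n_other_firms + 3)
        return w_relative * relative_pm + (1 - w_relative) * simple_pm

    # With six other firms in the race, Relative gets two-thirds of the weight and
    # Simple one-third, matching the example above. The scores here are hypothetical.
    print(advanced_plus_minus(simple_pm=1.5, relative_pm=-0.5, n_other_firms=6))  # ~0.17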

Meticulous readers might wonder about another problem: a poll's Relative Plus-Minus depends on the quality of the other polls it is being compared against. Advanced Plus-Minus addresses this by means of iteration (see a good explanation here), a technique commonly applied in sports power ratings. Advanced Plus-Minus also addresses another problem: polls tend to be more accurate in races that attract a lot of polling. This may reflect herding, selection bias (pollsters may be more inclined to survey easier races; consider how many of them are avoiding the challenging Senate races in Kansas and Alaska this year), or some combination thereof.
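Here is a toy version of that kind of iteration, alternating between estimating how hard each race was to poll and how skilled each firm is (much as opponent adjustments work in sports power ratings). The data are invented; this is a sketch of the general technique, not FiveThirtyEight's actual procedure:

    from collections import defaultdict

    # Each record: (firm, race, absolute error of the poll in points). Made-up data.
    polls = [
        ("Firm A", "Race 1", 2.0), ("Firm B", "Race 1", 6.0),
        ("Firm A", "Race 2", 3.0), ("Firm C", "Race 2", 4.0),
        ("Firm B", "Race 3", 8.0), ("Firm C", "Race 3", 5.0),
    ]

    firm_skill = defaultdict(float)       # positive = worse than average
    race_difficulty = defaultdict(float)  # positive = harder race to poll

    for _ in range(50):  # alternate until the two sets of estimates settle down
        # Re-estimate race difficulty, controlling for which firms polled each race ...
        by_race = defaultdict(list)
        for firm, race, err in polls:
            by_race[race].append(err - firm_skill[firm])
        race_difficulty = {r: sum(v) / len(v) for r, v in by_race.items()}

        # ... then re-estimate firm skill, controlling for how hard its races were.
        by_firm = defaultdict(list)
        for firm, race, err in polls:
            by_firm[firm].append(err - race_difficulty[race])
        firm_skill = {f: sum(v) / len(v) for f, v in by_firm.items()}

        # Center the skills so the overall error level lives in race difficulty.
        mean_skill = sum(firm_skill.values()) / len(firm_skill)
        firm_skill = {f: s - mean_skill for f, s in firm_skill.items()}

    print(firm_skill)  # e.g., Firm B still looks worse after adjusting for its races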

So Advanced Plus-Minus also adjusts scores based on how many other polling firms surveyed the same election. This has the effect of rewarding polling firms that survey races few other pollsters do and penalizing those that swoop in only after there are already a dozen polls in the field. Two final wrinkles. First, Advanced Plus-Minus puts slightly more weight on more recent polls. Second, it accounts for the type of election being surveyed, since some contests, such as primaries, are harder to poll than others. Accounting for the fact that American Research Group polls a lot of primaries makes the firm look somewhat less bad, for instance.

But pollster performance still looks to be predictable to some extent. When we last updated the pollster ratings, I failed to be explicit enough about our goal: to predict which polling firms would be most accurate going forward.

But that may not be your purpose. One other thing I was probably not clear enough about was that participation in these organizations (the NCPP, the AAPOR Transparency Initiative and the Roper Center archive) was intended as a proxy variable for methodological quality. Have participating firms also been more accurate since? Yes they have — and by a wide margin. What is impressive is that the difference has continued to be just as substantial since the ratings were last updated in June. For clarity: the results since then are true out-of-sample tests.

In the chart above, the polling firms are classified based on the way FiveThirtyEight had them in June — before these elections occurred.

Each firm gets a methodological score between 0 and 2 based on the answers to these questions. Tracking which firms call cellphones is tricky; we do not list a polling firm as calling cellphones until we have some evidence that it does. Now suppose you're comparing one firm with strong methodology but a mediocre track record against another with weaker methodology but better past results. Which one should you expect to be more accurate going forward? Our finding is that past performance reflects more noise than signal until you have about 30 polls to evaluate, so you should probably go with the firm with the higher methodological standards up to that point.

If you have that many polls from each pollster, however, you should tend to value past performance over methodology. One further complication is herding. The methodologically inferior pollster may be posting superficially good results by manipulating its polls to match those of the stronger polling firms. If left to its own devices — without stronger polls to guide it — it might not do so well.
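A crude decision rule capturing that advice (the roughly-30-poll threshold is the article's; the function and the sample firms are my own illustration):

    def which_to_trust(n_polls_each, methodology_score, avg_past_error):
        # With thin track records, lean on methodology; with ~30+ polls per firm,
        # lean on the record itself (lower average error is better).
        if n_polls_each < 30:
            return max(methodology_score, key=methodology_score.get)
        return min(avg_past_error, key=avg_past_error.get)

    # Hypothetical firms: "Acme" has the stronger methodology (0-2 scale),
    # "Zenith" has the better (lower) historical average error.
    scores = {"Acme": 2, "Zenith": 1}
    errors = {"Acme": 5.1, "Zenith": 4.2}
    print(which_to_trust(12, scores, errors))  # -> Acme
    print(which_to_trust(45, scores, errors))  # -> Zenith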

My colleague Harry Enten looked at Senate polls and found that methodologically poor pollsters improve their accuracy by roughly 2 percentage points when there are also strong polls in the field. My own research on the broader polling database did not find quite so large an effect; the one I measured was smaller.

Still, the effect was highly statistically significant. The formula for calculating Predictive Plus-Minus is included in the footnotes. We also translate each firm's rating into a letter grade. One purpose of this is to make clear that the vast majority of polling firms cluster somewhere in the middle of the spectrum; about 84 percent of polling firms receive grades in the B or C range.
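The actual formula lives in those footnotes, which are not reproduced here. As a sketch of the general idea described in the text (shrinking a firm's track record toward a methodology-based prior until enough polls accumulate), something like the following could work; the prior mapping and the 30-poll scale are illustrative choices, not FiveThirtyEight's published parameters:

    def predictive_plus_minus(advanced_pm, n_polls, methodology_score, shrink_polls=30):
        # Map the 0-2 methodology score to an illustrative prior (2 -> -0.5, 0 -> +0.5);
        # lower (more negative) plus-minus scores are better.
        prior = 0.5 - 0.5 * methodology_score
        weight_record = n_polls / (n_polls + shrink_polls)
        return weight_record * advanced_pm + (1 - weight_record) * prior

    # A firm with only 5 polls stays close to its methodology prior;
    # a firm with 200 polls is judged mostly on its actual results.
    print(predictive_plus_minus(advanced_pm=-1.0, n_polls=5, methodology_score=2))    # ~ -0.57
    print(predictive_plus_minus(advanced_pm=-1.0, n_polls=200, methodology_score=2))  # ~ -0.93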

There are a whole bunch of other goodies in the pollster ratings spreadsheet, including various measures of bias and house effects. We think the pollster ratings are a valuable tool, so we wanted to make sure you had a few more options for how to use them.

Polls from firms whose cellphone practices we cannot verify are treated the same as other polls that do not place calls to cellphones. Be careful about coming to too many conclusions based on the way we have these polls labeled.

Furthermore, many common statistical measurements like the normal distribution are predicated on the notion of independent or random trials. These assumptions may be violated if pollsters are not behaving independently from one another. To be clear about the difference between a poll being biased and simply being inaccurate: imagine that in a race where the Republican won by 10 points, one poll had the Republican ahead by 5 points and another had her ahead by 15. Both missed by 5 points, but in opposite directions, so taken together they showed no bias. So the correct way to evaluate the poll estimates is to look at these values instead. With this logic in mind, I have chosen to calculate the RMSE for each pollster using these values.

I feel these values determine the narratives of the campaign, and the best pollster will be the one with the lowest RMSE. Using these, can we now say who is the best pollster? Some pollsters carried out more than one poll in the last week of the campaign, so it is possible for them to appear more than once. Indeed, YouGov had two of the worst polls and the best poll, whilst Comres had two of the best polls and one of the worst polls.
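For readers who want to reproduce this kind of comparison, here is a minimal sketch of a per-pollster RMSE over final-week polls. The margins below are invented placeholders, not the real polls:

    import math
    from collections import defaultdict

    # (pollster, predicted margin, actual margin), in points -- invented example data.
    final_week_polls = [
        ("YouGov", 4.0, 6.6), ("YouGov", 8.0, 6.6),
        ("Comres", 6.0, 6.6), ("Comres", 7.0, 6.6),
        ("Survation", 1.0, 6.6),
    ]

    squared_errors = defaultdict(list)
    for pollster, predicted, actual in final_week_polls:
        squared_errors[pollster].append((predicted - actual) ** 2)

    rmse = {p: math.sqrt(sum(sq) / len(sq)) for p, sq in squared_errors.items()}
    for pollster, value in sorted(rmse.items(), key=lambda kv: kv[1]):
        print(f"{pollster}: RMSE = {value:.2f} points")  # lowest RMSE = most accurate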

Note that I have only included in this table pollsters who have undertaken polls in at least two elections. What does the table show us? The most striking fact for me is that Survation had the most accurate poll in one election and the most inaccurate poll in another. I think this point is not widely understood.

Just because a pollster happens to be the most accurate in one election does not mean they will be the best next time around. Indeed, Comres, who came closest in one election, had one of the worst polls in the next.

Monmouth University Polling Institute. This originally appeared as a guest column on Pollster.

As most poll followers know, Nate shot to fame during the 2008 election, taking the statistical skills he developed to predict baseball outcomes and applying them to election forecasting.

Nate recently released a new set of pollster ratings that has raised some concerns among the polling community. First, there are some questions about the accuracy of the underlying data he uses. Nate claims to have culled his results from 10 different sources, but he seems not to have cross-checked those sources or searched original sources for verification. I found evidence suggesting that such errors may be fairly widespread. In the case of prolific pollsters, these errors may not have a major impact on the ratings.

But just one or two database errors could significantly affect the vast majority of pollsters with relatively limited track records — such as the many pollsters on his list who have fewer than 5 polls to their credit.

Some observers have called on Nate to demonstrate transparency in his own methods by releasing that database. Nate has refused to do this, offering the dubious rationale that the information may be proprietary — but he does now have a process in place for pollsters to verify their own data.

One of the obvious problems with his use of the transparency bonus is that the June 1 cut-off is arbitrary. Those pollsters who signed onto the initiative by June 1 were either involved in the planning or happened to attend the AAPOR national conference in May.

Thus, the theoretical claim regarding a transparency bonus is at least partially dependent on there also being a relationship between pollster accuracy and AAPOR conference attendance. His methodology statement includes a regression analysis of pollster ratings that is presented as evidence for using the bonus.

He reports an adjusted R-square for this model. Interestingly, of the three variables — transparency, partisan, and internet — only partisan polling shows a significant relationship. He decided to calculate different benchmarks that reward transparent polls and penalize internet polls, even though the latter result was based on only 4 cases and was not statistically significant.
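For readers who want to see what a regression of that general shape looks like, here is a rough statsmodels sketch on simulated data. The three dummy variables mirror the ones named above, but the data, effect sizes and results are invented; this does not reproduce Nate's model:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 200  # invented pollster-level observations

    df = pd.DataFrame({
        "transparent": rng.integers(0, 2, n),  # joined the transparency initiative?
        "partisan": rng.integers(0, 2, n),     # partisan-affiliated pollster?
        "internet": rng.integers(0, 2, n),     # internet-based methodology?
    })
    # Simulated accuracy score; only the partisan flag is given a real effect here.
    df["plus_minus"] = 0.8 * df["partisan"] + rng.normal(0, 1.5, n)

    model = smf.ols("plus_minus ~ transparent + partisan + internet", data=df).fit()
    print(model.summary())  # inspect adjusted R-squared and which coefficients are significant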


