How Did the Pollsters Get It So Wrong?

Re-posted from the Revolutions blog.

A lot of what we do in the big data, machine learning & data science community is to build tools and solutions that help us predict the seemingly unpredictable. With the US elections, though, pretty much all the pollsters and the models behind them got it dead wrong. All the major polls consistently had Clinton as a solid favorite, with her win probability exceeding 70% the day before the election.

So what went wrong with these forecasts? There are three possibilities:

  • The models were plain wrong.
  • The models were right, but the result was a fluke (a quick simulation after this list shows how often a 70% favorite still loses).
  • The data used by the models was bad.
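As a back-of-the-envelope check on the "fluke" possibility, here is a minimal sketch in Python. The 70% figure is the pre-election win probability quoted above; the number of trials and the random seed are purely illustrative choices:

```python
# Minimal sketch: how often does a 70% favorite end up losing?
# The 0.70 value is the forecast win probability quoted above;
# the trial count and seed are arbitrary illustrative choices.
import random

random.seed(42)            # reproducible runs
trials = 100_000
p_win = 0.70               # forecast win probability the day before the election

losses = sum(1 for _ in range(trials) if random.random() >= p_win)
print(f"Favorite loses in {losses / trials:.1%} of simulated elections")
# Prints roughly 30% -- unlucky for the forecasters, but far from impossible.
```

A single miss on a 70% call is therefore not, by itself, proof that the models were broken; the harder question is whether the data and assumptions behind that 70% were sound.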

Which one was it? Join the debate at the original post here or by clicking the image below, which captures a major shift in mood towards Trump that nobody quite predicted:


County shift from 2012 (Image Source: New York Times)

Regardless of the factors at play, last night wasn’t the best advertisement for political forecasters and pundits. There will, no doubt, be much analysis on this topic in the weeks to come. Perhaps, in a constructive spirit, we can harvest relevant datasets from the post-election process and put them to good use, like in a future Cortana Intelligence competition! Who knows – we may even be able to help the pollsters get things right during the next major election cycle!

CIML Blog Team