There’s actually less volatility in the polls than you might think.
Virtually everything about election polls is anathema to scenario planners. Pollsters are concerned with the short term and, however much their reports are hedged, they lend a false precision to forecasts of uncertain outcomes. They have also been famously wrong, and are likely to become less and less reliable as response rates continue to fall. And yet, so extraordinary is this year’s presidential election, and its consequences so potentially disruptive, that it would be wrong to take too purist a position.
FSG has in fact been doing its own analysis of the polls, as it also did in 2008. The source of the data is Pollster.com, which reports every national poll. We have made no attempt to weight the polls by the accuracy or known biases of the polling companies, and the data shown here are all moving averages of the latest ten polls. In order to make comparisons with the 2008 election—the last time both candidates had emerged from primary campaigns—the graphs use ‘days before election’ rather than calendar dates on the x (horizontal) axis.
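For readers who want to reproduce the smoothing, here is a minimal sketch of the two steps described above: converting poll dates to ‘days before election’ and averaging the latest ten polls. The function names and the sample readings are hypothetical, and the handling of the first few polls (averaging whatever is available until ten exist) is our assumption for illustration, not a statement of how the charts were actually built.

```python
from datetime import date


def days_before_election(poll_date, election_day):
    """Days from a poll's date to Election Day (the x axis used in the charts)."""
    return (election_day - poll_date).days


def moving_average(values, window=10):
    """Unweighted average of the latest `window` values, oldest poll first.

    Assumption: until `window` polls exist, average however many are available.
    """
    averages = []
    for i in range(len(values)):
        recent = values[max(0, i - window + 1): i + 1]
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical readings for one candidate, NOT real poll data.
election_day = date(2016, 11, 8)
polls = [
    (date(2016, 9, 1), 42.0),
    (date(2016, 9, 3), 44.0),
    (date(2016, 9, 6), 43.0),
]

x_axis = [days_before_election(d, election_day) for d, _ in polls]
smoothed = moving_average([share for _, share in polls])
```

Running the same conversion against the 2008 Election Day puts both races on a common horizontal axis, which is what makes the year-to-year comparison possible.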
The way polls are reported makes sensible analysis more difficult, as only the latest data for each candidate are shown—and no account is taken of the undecideds. That is particularly important in this election because the number of undecideds (simply defined as the proportion not expressing a preference for either candidate—whom we therefore also refer to as ‘Neither’) has been very high, as is clear in Chart 1. This is because poll respondents in 2016 have been uncertain about both candidates. As we shall see later, in 2008, respondents were undecided mostly about only one of the candidates.
It is changes in the size of this (large) group of undecideds that are responsible for changes in the size of the support for each candidate; in other words if a candidate’s numbers go up, undecided numbers go down and vice versa, and this is true for both candidates. This may seem obvious, but the way that polls are reported would lead you to think that a change in support for one candidate is at the expense of the other.
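The bookkeeping behind this point can be sketched in a few lines. Because ‘Neither’ is defined as whatever is left after both candidates’ shares, any change in it is exactly offset by the combined change in the two candidates. The function names and the figures below are hypothetical, chosen only to illustrate the arithmetic.

```python
def neither_share(candidate_a, candidate_b):
    """Percentage of respondents expressing a preference for neither candidate."""
    return 100.0 - candidate_a - candidate_b


def decompose_change(before, after):
    """Change in each candidate's share and in 'Neither' between two readings.

    `before` and `after` are (candidate_a, candidate_b) percentages.
    """
    d_a = after[0] - before[0]
    d_b = after[1] - before[1]
    d_neither = neither_share(*after) - neither_share(*before)
    return d_a, d_b, d_neither


# Hypothetical readings, NOT real poll data: both candidates gain,
# and the gains come out of the undecided pool rather than from each other.
d_a, d_b, d_neither = decompose_change((42.0, 40.0), (45.0, 41.0))
```

In this made-up case one candidate gains three points and the other one point, while ‘Neither’ falls by four. A report showing only the two candidate figures would describe this as a widening gap without showing that neither candidate lost any support.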
Our second chart clearly shows that each candidate’s numbers have almost the same relationship with undecideds (Chart 2).
[How to read this chart: Each data point shows, for a point in time, the percentage of poll respondents expressing a preference for a candidate (blue for Clinton, red for Trump) on the vertical (y) axis, and the percentage expressing a preference for neither candidate on the horizontal (x) axis. Preference for each candidate is strongly inversely related to ‘neither’. The solid lines through the data points are ‘trend lines’ showing the best fit. The equation next to each line gives its slope and its intercept (the value of y when x is zero). Based on the data so far, the equations say that each percentage point increase in undecideds (x) reduces Trump’s preference (y) by just over half a percentage point (-0.522) and Clinton’s (y) by just under half (-0.477), and that if undecideds were zero, preference for Clinton would be 52% (52.034) and for Trump 48% (47.966).]
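A trend line of this kind can be sketched as follows, assuming ordinary least squares (the usual meaning of ‘best fit’, though the fitting method is not stated here). The sample series is hypothetical; only the two pairs of coefficients at the end are the figures quoted above. The sketch also shows why the two slopes sum to almost exactly -1 and the two intercepts to 100: the three shares always add up to 100, so the two lines are tied together by arithmetic.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(line, x):
    """Value of the trend line (slope, intercept) at x."""
    slope, intercept = line
    return intercept + slope * x


# Hypothetical series, NOT real poll data. Trump's share is whatever
# remains after 'Neither' and Clinton, so the three always sum to 100.
neither = [20.0, 16.0, 12.0, 10.0]
clinton = [41.0, 43.5, 46.0, 47.0]
trump = [100.0 - n - c for n, c in zip(neither, clinton)]

clinton_fit = fit_line(neither, clinton)
trump_fit = fit_line(neither, trump)

# Coefficients quoted in the chart note, as (slope, intercept).
CLINTON_LINE = (-0.477, 52.034)
TRUMP_LINE = (-0.522, 47.966)
clinton_at_zero = predict(CLINTON_LINE, 0.0)  # the intercept: about 52%
trump_at_zero = predict(TRUMP_LINE, 0.0)      # the intercept: about 48%
```

The informative part of the chart is therefore how the unit of lost undecideds is split between the two candidates, since the total is fixed by construction.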
Looking at the data for the entire year shows undecideds reached their highest in July (Chart 3), but have reduced in number quite substantially since then, and are now more or less the same proportion as they were at this stage in the 2008 election (which you can see clearly in Chart 1). So it is also interesting to see what has happened to the two candidates’ poll numbers since the high point—and Chart 4 shows that the reduction in undecideds since then has favored Clinton more than Trump, despite the ballyhoo in the media about the ‘narrowing’ of the gap. Effectively, each candidate’s support grew immediately after their respective conventions, before Trump’s slid a little; but overall Clinton’s numbers have grown more.
Given this trend, and an ever-shrinking pool of undecideds, it looks ceteris paribus like a substantial Clinton victory. However, in 2008, at a similar stage in the race, the same analysis would have predicted a McCain victory. Unlike the 2016 race, in which both candidates have drawn support roughly equally from previously undecided respondents, in 2008 preference for Obama was largely stable, while McCain’s support was variable and drew heavily (and disproportionately) from those who had hitherto not expressed a preference. This is shown in Chart 5.
But thereafter McCain’s numbers fell away (see Chart 6), as respondents switched their preference directly to Obama. Two major factors cited for this switch are McCain’s maverick decision to appoint Sarah Palin as his running mate, and the unfolding financial crisis, including Lehman Brothers filing for bankruptcy on September 15th.
It is important to put polls in context, and to understand what the data are saying using a wider perspective than the ups and downs of each candidate’s numbers. Based on these principles, our analysis would point to a fairly comfortable victory for Clinton. But it also goes without saying, especially in this volatile presidential campaign, that anything can happen between now and Election Day. Surprises or shocks unimaginable today could still upset the balance that presently favors Hillary Clinton. Who would have predicted a political party’s email account being hacked, apparently by a foreign government? It is therefore reasonable to suggest that the US faces a prospect it has never faced before: nationwide hacking of election data, with massive uncertainty on Election Day and in the weeks that follow, leaving the Electoral College unable to vote in December. We hope you didn’t read it here first.