It's about six weeks until the 2016 US presidential election, and everyone wants to know: Who will win? Hillary Clinton? Or Donald Trump? To attempt to answer this question, political scientists like Jacob Montgomery build complex forecasting models. Montgomery shares his own unique approach to forecasting and describes both the limitations and the value of these efforts to predict the future.
Claire Navarro (host): Thanks for listening to Hold That Thought! I’m Claire Navarro. If you look at the news this week, or really any time over the last several months, you’re pretty much guaranteed to hear two names over and over: Hillary Clinton and Donald Trump. And the question on all of our minds is: Who will win the election in November? Who will be the next president of the United States? A lot of the predictions you hear in the news or see online come from polling data – snapshots of which candidate voters prefer on any given day. But what happens if you stop thinking about Donald Trump and Hillary Clinton? Their policies, their personalities, all of it. Just take it out of the picture. If you ignore the actual candidates, is it possible to predict the outcome of an election? It may seem strange on the face of it, but some political scientists try to do just that. To understand how, think about an election like a contest, a game.
Jacob Montgomery (guest): So if you think about an election as being sort of two people playing pool, the players - the candidates - their skill set matters. But the situation on the table also matters, how difficult of a shot is in front of the person.
CN: Jacob Montgomery is a political scientist here at Washington University in St. Louis. Montgomery studies American politics and has done work with election forecasting. These kinds of forecasts have been around since about 1992, though some people were doing it even earlier. To make a forecasting model, political scientists step away from the candidates and look at the big picture – the situation on the pool table.
JM: So things like how people think about how the economy is doing, GDP growth, who's in control of the White House and how popular they are. These factors historically predict how well the candidates are going to do. Not perfectly, but they sort of give you a sense, a rough sense.
CN: So, this is the kind of information that goes into the forecasting models. And as the years go on, more and more models get created. Sometimes the models get it right, sometimes they don’t. It turns out that predicting the future is really hard to do. Just think about the kind of forecast you look at every day – weather forecasts. Often they don’t get it right, and weather forecasters have a lot of data to work with to make their predictions.
JM: So if you think about weather forecasting, they have weather forecasting stations all over the country. Hundreds, thousands of them. And they take measurements very regularly. So if you have 20 sites that you're making predictions about every day, in the end you have thousands and thousands of data points. Election forecasting, by contrast, is built on a lot less: most of these forecasting models use elections since 1948. So we're talking about a very small sample size. So these models tend to be sparse, which means they don't include many variables. Most of these models include maybe three variables, and they tend to have pretty wide confidence intervals - meaning that they're pretty uncertain about exactly what the election outcome is going to be.
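CN: To get a feel for what a "sparse" fundamentals model looks like, here is a purely illustrative sketch in Python. The three variables match the kind Montgomery describes, but the coefficients are invented for illustration - they are not any published model's estimates.

```python
# Hypothetical sketch of a sparse, three-variable fundamentals model.
# All coefficients below are made up for illustration only.

def forecast_vote_share(gdp_growth, net_approval, terms_in_office):
    """Predict the incumbent party's two-party vote share (%)."""
    return (
        48.0                           # baseline share (hypothetical)
        + 0.6 * gdp_growth             # reward for a growing economy
        + 0.1 * net_approval           # boost from a popular president
        - 2.0 * (terms_in_office - 1)  # penalty for seeking a third-plus term
    )

# A hypothetical year: modest growth, mildly popular president,
# and a party that has already held the White House for two terms.
print(round(forecast_vote_share(gdp_growth=1.2, net_approval=4,
                                terms_in_office=2), 2))  # prints 47.12
```

With so few data points to fit against, even three coefficients is pushing it - which is exactly why these models carry the wide confidence intervals Montgomery mentions.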
CN: With these kinds of limitations, how is anyone supposed to know which model to pay attention to in 2016? Rather than agonizing over which model is best and why, Montgomery and some colleagues decided to take a different approach – one that doesn’t make you choose.
JM: What we're doing in our project is to take these forecasting teams and try to collect the wisdom inherent in all of them into a combined model that might give us a more accurate forecast over time.
CN: Think of it like a really specialized form of crowdsourcing. With more information and more diverse approaches to solving a problem, you have a better chance at getting the right answer.
JM: The technique, which actually originated in weather forecasting, is to take different forecasting algorithms - or in this case modeling teams - and to weight them, to take their forecasts and combine them as a weighted average.
CN: So gather all the information that you can, but also give due credit to the models that have been the most reliable in past elections.
JM: Teams that have been more accurate, in terms of predicting what percentage of the vote is going to go to each party, get a little bit higher weight in our combined model. The other factor that goes into it is whether they're unique. So if a forecasting model isn't always right but is right when everyone else is wrong, you might want to give it a little bit higher weight. So those are the two basic things that go into how we weight these models.
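CN: The weighted-average idea can be sketched in a few lines of Python. This is a simplified, hypothetical version: the weights here come only from past accuracy (inverse mean absolute error), and the uniqueness adjustment Montgomery describes is omitted for brevity.

```python
# Simplified sketch of a weighted-average ensemble forecast.
# Weights reflect past accuracy only; the "unique when others are wrong"
# bonus from the interview is not modeled here.

def ensemble_forecast(past_errors, current_forecasts):
    """Combine model forecasts, weighting by historical accuracy.

    past_errors: dict of model name -> list of absolute errors (percentage
                 points) in past elections
    current_forecasts: dict of model name -> predicted two-party vote share (%)
    """
    # Score each model by the inverse of its mean absolute error,
    # so more accurate models get higher scores.
    scores = {m: 1.0 / (sum(errs) / len(errs)) for m, errs in past_errors.items()}
    total = sum(scores.values())
    weights = {m: s / total for m, s in scores.items()}  # normalize to sum to 1
    # Weighted average of this year's forecasts.
    return sum(weights[m] * current_forecasts[m] for m in current_forecasts)

# Two hypothetical forecasting teams with made-up track records.
past_errors = {
    "model_a": [1.0, 2.0, 1.5],   # more accurate -> gets weight 0.7
    "model_b": [3.0, 4.0, 3.5],   # less accurate -> gets weight 0.3
}
forecasts = {"model_a": 51.0, "model_b": 48.0}
print(round(ensemble_forecast(past_errors, forecasts), 2))  # prints 50.1
```

The combined forecast lands between the two inputs, pulled toward the historically better team - which is the basic intuition behind the ensemble approach.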
CN: To find out how well this combined approach really works, Montgomery and his team tried out their specially weighted forecast on elections from past years – seeing if the ensemble model would point to the candidate that actually won. And the results?
JM: They called every election correctly going back to 1992. They even called the Gore-Bush election right, in the sense that they predicted that Gore would win the majority of the votes.
CN: Impressive, right? So now you’re probably wondering – what about this election? What about right now? Tell me the future, Dr. Montgomery! But not so fast. Remember the analogy about the game of pool? All these models take for granted some basic expectations about the people playing the game – the candidates.
JM: One of the underlying assumptions of this is that both parties are going to field candidates who are going to be seen as legitimate contenders for the White House, and that they're going to be competing in a way that will reveal the sort of underlying tendencies of the economy, and the popularity of the president, and how long one party has been in office.
CN: This year, with such strong feelings on both sides about these particular candidates, and with Trump’s unusual campaign that focuses so much on TV appearances and social media - the models – and even the combined model – may just not work, as far as predicting who will win. But that doesn’t mean that forecasting isn’t valuable. For 2016, the ensemble model calls for a close race, with a narrow Republican victory. For the reasons we just talked about, Montgomery does not believe this is a perfect view of the future. But it does provide a picture of what very likely would have happened, if this were a more typical election year. And that kind of context can help us make sense of what’s going on right now.
JM: What I think it's useful for doing is to help clarify your thinking. One thing I think I've seen some commentators say is to wonder why Clinton isn't doing better. And they often seem to attribute that to her and her campaign's performance. But what these models are saying is that the fact that she's doing so well is already above expectations. The economy's doing so-so. President Obama's approval numbers are good but not as amazing as, say, Reagan's were at the end of his term. And most importantly, the Democrats have held the White House now for two straight terms. And if you look back at the record, no party has managed to hold the White House for more than three terms since World War II. And the only time a party held the White House for three terms in a row was with George H.W. Bush. And other than that it's just never happened, well not in the modern era. So all these factors are sort of pointing to the fact that it would be very unusual for Clinton to be having a landslide election. So what I think these sort of numbers do is help you clarify your thinking when you're looking at how the polling numbers are coming in and how to interpret them. It sort of gives you a baseline feeling about what is the realm of possible election outcomes that might have been expected, given the fundamental conditions, and helps you interpret things a little more cleanly.
CN: This sort of information is helpful and important, but it’s no crystal ball, at least not in 2016. But what about for future elections? To close out our conversation, I asked Montgomery about his vision for the future of forecasting.
JM: So I think there are two limits to forecasting. One is data size. And in that sense, presidential forecasting is not going to be much improved in the next 20-30 years, relative to the kinds of forecasting efforts you get in things like finance or weather. The other limit is that social processes are inherently uncertain. Even if you look at the stock market. There's a lot of people who spend a lot of time trying to predict the stock market, and if you could do it you'd make a lot of money. But social systems tend to be so complex that you can never really build a complete representation. That's why we get things like the financial crisis that no model really had in it. So I'm less excited about improvements in prediction for presidential elections than about increasing our ability to predict lots of things in the political world as our data grows and as our ability to develop complex models improves. So there's a lot of interest, for instance, in the intelligence community in developing systematic forecasting of international crisis events or maybe civil wars or interstate disputes. There are some projects I've seen trying to forecast where we're most likely to see mass killings or beginnings of genocide. There's work I've seen predicting, based on who has given a candidate money, how they're going to vote once they get into Congress. And as we get more and more data points and as we develop better and better models, we're going to be able to provide better predictions for all these kinds of things, which are not only interesting scientifically, but interesting, I think, to policymakers and to voters.
CN: Many thanks to Jacob Montgomery for joining Hold That Thought. For many more ideas to explore, please visit us at holdthatthought.wustl.edu.