Two downloads from my poll-averaging model
Thursday November 8, 2012
I’ve had numerous requests for various outputs from my model, so I’ll put them up here.
1. Forecast win probabilities for Obama, by unit, by day. These were the quantities also requested by a group of researchers working at Penn, who will use Brier scoring to assess various probabilistic forecasts of the state-by-state outcomes.
A summary of how well the final (Election morning) state-by-state probabilities performed is shown in the attached graph, which plots Obama 2-party vote share (using data from this afternoon from AP) against the probabilities from my model.
The probabilities are the data I’ve presented in this graph:
The data themselves are here (CSV). The final set of probabilities also appears in the file I link to below.
2. State level predictions, Obama 2-party vote share, plus uncertainty assessments. The comparison with actual state-level outcomes appears in the following scatterplot (the diagonal line is a 45 degree line).
Data here (CSV). One of the columns is the final probability of Obama win, if you are looking for that.






@SimonJackman thanks much!
I’ve got to say that while I agree that the probability method is the best way to estimate the overall position, I still find that the general lack of understanding of what a probability function is causes people to completely misunderstand the probabilities of individual state predictions.
I saw Silver’s Florida prediction of a 55% probability of an Obama win authoritatively reported by someone who clearly didn’t understand that 55% is little better than a coin toss.
Your first graph shows what the probability curve looks like across levels of the vote: the probability function is steep and changes rapidly when results are close. Clearly people don’t understand that a 55% probability probably means a poll average of about 50.1%, which means the race is close. All the hallelujahs that probability models picked the winner in every state overlook that the probabilities are based on the polls, and the polls got it right in every state. Essentially, taking a binomial approach (who’s leading in each state poll) produced the same prediction, though I am in no way suggesting such an approach is preferable to the probability model.
The real beauty of the probability model you, Nate Silver and others use was the model’s assurance that based on state polls Obama would win. That was a far more impressive achievement than what everyone is saying about the state predictions.
(Ack, please delete my first comment!)
Hm, in the first set, what is the final USA number?
Second, the spreadsheet only seems to go up to Nov 4 – no Nov 6 predictions?
The dates in my series are keyed to the last day of polling. I defined a poll date as floor(median(start,end)). Nov 4 was the last day, given this definition, even though polls were still being released on the morning of Nov 6. That is, the mapping from poll dates to calendar days is rather arbitrary.
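A minimal sketch of that date rule, for readers who want to reproduce the keying. With only two values the median is just the midpoint, so the rule amounts to the midpoint of the field period rounded down to a whole day. The function name `poll_date` and the field dates in the example are hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta


def poll_date(start: date, end: date) -> date:
    """Midpoint of a poll's field period, rounded down to a whole day.

    Implements floor(median(start, end)); with two values the median
    is the midpoint. The name and signature are illustrative only.
    """
    span_days = (end - start).days
    if span_days < 0:
        raise ValueError("end date precedes start date")
    # Integer division floors the half-day that arises for odd spans.
    return start + timedelta(days=span_days // 2)


if __name__ == "__main__":
    # Hypothetical field periods, not taken from the actual poll data.
    print(poll_date(date(2012, 11, 3), date(2012, 11, 5)))
    print(poll_date(date(2012, 11, 1), date(2012, 11, 4)))
```

Under this rule a poll fielded Nov 3 to Nov 5 is keyed to Nov 4, which is how a series can stop at Nov 4 even though polls were still arriving on Nov 6.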
It is interesting, in that if I had included an explicit trend component in the model, I might have ticked up a little higher on the national number and on some states too (those with trends highly correlated with the national trend), say, forecasting out from Nov 4 to Nov 6. Harumph.
The final USA number is not in the state file, because it is a state file. There should be a USA number (probability of Obama win, national popular vote) in the 2nd file.
If what you want is my P(Obama EV Count > 269) by day, I can generate that too.
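For anyone who wants a rough version of that quantity from the state file alone, here is a sketch that turns per-state win probabilities into P(EV count > 269). It assumes state outcomes are independent, which is a simplification and not how the model itself works (state errors are correlated), so treat the result as a back-of-envelope figure. The function names and the toy inputs are hypothetical.

```python
from typing import List, Sequence, Tuple


def ev_distribution(states: Sequence[Tuple[float, int]]) -> List[float]:
    """Exact distribution of the electoral-vote total.

    `states` holds (win_probability, electoral_votes) pairs.
    ASSUMPTION: outcomes are independent across states.
    """
    dist = [1.0]  # dist[k] = probability of exactly k electoral votes
    for p_win, ev in states:
        new = [0.0] * (len(dist) + ev)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p_win)  # state lost: total unchanged
            new[k + ev] += mass * p_win     # state won: add its votes
        dist = new
    return dist


def prob_ev_exceeds(states: Sequence[Tuple[float, int]],
                    threshold: int = 269) -> float:
    """P(total electoral votes > threshold) under independence."""
    dist = ev_distribution(states)
    return sum(dist[threshold + 1:])


if __name__ == "__main__":
    # Toy inputs for illustration only, not real forecasts.
    toy = [(0.5, 3), (1.0, 2)]
    print(prob_ev_exceeds(toy, threshold=2))
```

Because independence ignores shared polling error, this approach will typically overstate certainty relative to a model with correlated states.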
It is true that the “polls got it right”, but model-based poll averages (or “non-naive poll averaging”) outperformed simple averages of the polls, and certainly outperformed reliance on any one poll. Plenty of blogging on that point out there.
The polling industry, collectively, did great. But there were many shockers out there. Some intelligence (e.g., Bayesian inference) was/is required to assemble the poll average such that the averages performed so well.
Nate and I and Drew got very lucky that the Florida coin flip came up the right way, if 51/51 is the relevant performance measure. I don’t pretend otherwise. I think Nate’s final FL prob was 50-50. Mine was 52.16%, and my best guess was 50.07% for Obama two-party vote share; the actual figure at this stage of the count is 50.36%.
Sometimes you get lucky.
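To see why a point estimate barely above 50% maps to a win probability barely above a coin flip, here is a rough sketch using a normal approximation. This is an illustration, not the model's actual posterior calculation: the standard deviation of 1.3 points is an assumed value picked only to make the example concrete, and `win_probability` is a hypothetical helper.

```python
import math


def win_probability(mean_share: float, sd: float) -> float:
    """P(two-party vote share > 50) under a normal approximation.

    ASSUMPTION: uncertainty about the share is normal with the given
    mean and standard deviation (both in percentage points).
    """
    z = (mean_share - 50.0) / sd
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    assumed_sd = 1.3  # illustrative value, not taken from the model
    for share in (50.07, 51.0, 52.0, 53.0):
        print(share, round(win_probability(share, assumed_sd), 3))
```

With an uncertainty of about a point, moving the estimate from 50 to 52 takes the probability from a toss-up to a heavy favorite, which is the steepness the first graph shows.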
[...] run by academics or private individuals skilled at forecasting? We see Drew Linzer, Nate Silver, Simon Jackman, and DeSart & Holbrook within about 0.01 of each other, with Linzer in the lead. This is [...]
Professor Jackman,
Is there a better way to score probabilities than the Brier score?
Here is the thinking behind my question. Suppose (1) a forecaster shows a 50.4% to 49.6% two-party vote share advantage for Obama in a state, and (2) the forecaster gives this a 100% win probability for Obama. Suppose (3) Obama wins: the forecaster will have an extremely good (indeed, perfect) Brier score. However, it seems clear that, in this scenario, giving Obama a win probability of far less than 100% is in some sense a better probability forecast, even though it would result in a weaker Brier score.
Is there a way to score win probabilities that takes this into account?
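One way to make the scenario concrete: the Brier score is a proper scoring rule, so while the overconfident forecast scores perfectly on the single realized outcome, it scores worse in expectation whenever the true win probability is below 100%. The sketch below works through that arithmetic. The "true" probability of 0.55 is a hypothetical value used only for illustration, and the function names are my own.

```python
def brier(forecast: float, outcome: int) -> float:
    """Brier score for one binary event (lower is better)."""
    return (forecast - outcome) ** 2


def expected_brier(forecast: float, true_prob: float) -> float:
    """Expected Brier score if the event occurs with probability true_prob."""
    return (true_prob * brier(forecast, 1)
            + (1.0 - true_prob) * brier(forecast, 0))


if __name__ == "__main__":
    true_prob = 0.55  # hypothetical true win probability
    for forecast in (1.0, 0.55):
        print(
            "forecast", forecast,
            "| score if win:", round(brier(forecast, 1), 4),
            "| score if loss:", round(brier(forecast, 0), 4),
            "| expected:", round(expected_brier(forecast, true_prob), 4),
        )
```

Under these assumptions the 100% forecast has an expected score of 0.45 against 0.2475 for the honest 55% forecast. A single state cannot reveal the difference, but averaging over many states and elections penalizes the overconfident forecaster.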