2008 Election Model Methodology: Assumptions, Calculations and a Challenge

This post describes the essential methodology used in the 2008 Election Model for calculating the expected electoral vote and win probability. It shows why the 99% Obama win probability calculation is mathematically correct, assuming that a) the election is held today, b) it is fraud-free and c) the latest state polls represent true voter preference. Important issues such as cell-phone users, RV vs. LV polls and media bias will not be considered. A few critics of the model apparently still believe that Bush won fairly and that Obama only has a 75% win probability. Hopefully, they will better understand the Election Model methodology and look at the track record. It would be foobar to do otherwise.

Fivethirtyeight.com is a great site with a tremendous amount of information. But its win probabilities are much too low; it’s mathematically incorrect to derive a 75% win probability from a 311-227 EV split. The win probability is incompatible with the split, as can be seen by running a 5000-trial Monte Carlo simulation. The assumption here is that the 538 model provides a snapshot based on the latest polls. If it is instead attempting to project the EV on Election Day, that should be made clear.

Those who compare Monte Carlo simulation probabilities to the Intrade odds are mixing apples and oranges. The Election Model produces the theoretical probability of Obama winning the Electoral Vote based on the latest state polls. Let’s give the Intrade bettor credit; he does not run a Monte Carlo simulation or view the polls in isolation. He most probably relies on the daily media commentary while consciously or unconsciously factoring in the potential for fraud. After all, Bush stole it in 2004, didn’t he? He won the recorded vote and that is what they pay off on. So what if Kerry won the True Vote?

The betting payoffs are based on the recorded vote – even if it was stolen. On Election Day 2004, Kerry was leading the Iowa election trading markets up until 9pm, when the numbers magically shifted to Bush as the fraud kicked in. So you had late panic buying of Bush shares. So let’s put an end to the apples and oranges (Monte Carlo vs. Intrade) argument. In that regard, the Election Model provides a fraud scenario analysis. When will the other sites?

Some critics skip over my standard caveats, let alone the analysis: The Election Model does not attempt to predict the election results months in advance of the election, as do most academic models. It is designed to calculate the expected electoral vote winner assuming a) the election is held today and b) it is fraud-free. The model calculates the effects of uncounted and switched vote scenarios on Obama’s popular and electoral vote, while other models don’t even mention election fraud. But election fraud is not the issue here; the objective is to explain why the Election Model produces such high win probabilities.

Projecting state vote shares and win probabilities

The latest state poll is used to project the 2-party vote share. Based on the projection, the state win probability is calculated. To project the state vote, the Election Model applies an undecided voter allocation (UVA) – a simple calculation. But we don’t want to rely on a single UVA estimate; the model calculates the projected vote over a range of scenarios, from 40% to 80% of undecided voters going to Obama, with 60% as the base case. I leave it to the reader to choose whatever UVA he/she feels comfortable with.
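The allocation step can be sketched in a few lines of Python. The post does not spell out the formula, so this is a minimal sketch that assumes the undecided share is whatever is left after the two candidates' poll numbers, and that a fraction `uva` of it goes to Obama. The function name and the 48/44 poll are hypothetical, chosen only for illustration.

```python
def project_two_party_share(obama_poll, mccain_poll, uva=0.60):
    """Project Obama's 2-party share (0-1) after allocating undecided voters.

    Assumption (not stated in the post): undecided = 100 - obama - mccain,
    a fraction `uva` goes to Obama and the remainder to McCain.
    """
    undecided = 100.0 - obama_poll - mccain_poll
    obama = obama_poll + uva * undecided
    mccain = mccain_poll + (1.0 - uva) * undecided
    return obama / (obama + mccain)


# Hypothetical poll: Obama 48, McCain 44, 8 undecided.
for uva in (0.40, 0.50, 0.60, 0.70, 0.80):
    share = project_two_party_share(48, 44, uva)
    print(f"UVA {uva:.0%}: projected 2-party share {share:.1%}")
```

Running the loop over the 40-80% range shows how sensitive the projected share is to the UVA choice before any win probability is computed.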

Since Obama has 53% of the 2-party national poll, it is not unreasonable to assume that he will get 60% or more of the undecided vote. Obama can be considered the challenger since McCain is running for the third Bush term. Pollsters typically assign 75-90% of the undecided vote to the challenger. In 2004, Gallup allocated 90% of the undecided vote to Kerry; Zogby and Harris, 75-80%. That is the rationale. In fact, 60% for Obama is probably conservative, especially with the unfolding events over the past few days.

The national polls are more current than the states. Since Obama has a 53% 2-party share right now, it is to be expected that the state polls will climb. Since June, there has been a 0.57 correlation between the national and state polls.

If Obama’s projected 2-party vote share is V, then his state win probability P is calculated by the Excel formula:

P = NORMDIST(V, 0.50, MoE/1.96, TRUE)

The margin of error (MoE) is set to 4% for a typical state poll of 600 respondents.

That’s it. There should be no argument as to the mathematics.
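For readers without Excel, the same calculation can be sketched in Python. This is an illustrative equivalent of the NORMDIST formula, not code from the Election Model itself; the function name is hypothetical, and `statistics.NormalDist` requires Python 3.8 or later.

```python
from statistics import NormalDist


def state_win_probability(v, moe=0.04):
    """Cumulative normal probability, mirroring NORMDIST(V, 0.50, MoE/1.96, TRUE).

    v   : projected 2-party vote share as a fraction (0.53 means 53%)
    moe : poll margin of error as a fraction (4% in the post)
    """
    sigma = moe / 1.96  # standard deviation implied by a 95% margin of error
    return NormalDist(mu=0.50, sigma=sigma).cdf(v)


for share in (0.48, 0.50, 0.52, 0.54):
    print(f"V = {share:.0%}: win probability {state_win_probability(share):.3f}")
```

A tied projection (V = 50%) gives a win probability of exactly one half, and the probability rises steeply as V moves a point or two above 50%.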

Calculating the expected electoral vote and win probability

The expected state EV is the state win probability times the electoral vote. If the probability is 50% and the state has 20 EV, then Obama gets 10 and McCain gets 10. Now just add up the 51 expected EVs to get the Total Expected EV. So the math is exceedingly simple. The tricky part is projecting the state vote shares starting with the latest polls. The Election Model applies the UVA to derive the projected 2-party vote. The 51 state win probabilities are input to a 5000 election trial Monte Carlo simulation. The probability that Obama would win the EV is just the number of winning election trials/5000.
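Both calculations can be sketched as follows. The five "blocs" are invented for illustration and are not real state data; they only sum to 538 EV so that the 270 threshold is meaningful. The sketch also assumes each state is decided independently of the others, which the post does not state explicitly.

```python
import random

# Hypothetical (win probability, electoral votes) pairs, invented for
# illustration. They sum to 538 EV but are NOT real state data.
BLOCS = [(0.99, 200), (0.70, 60), (0.50, 40), (0.30, 38), (0.01, 200)]


def expected_ev(states):
    """Sum over states of win probability times electoral votes."""
    return sum(p * ev for p, ev in states)


def win_probability(states, trials=5000, needed=270, seed=None):
    """Fraction of simulated elections in which the candidate reaches `needed` EV."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # Assumption: each state is won independently with its own probability.
        total = sum(ev for p, ev in states if rng.random() < p)
        if total >= needed:
            wins += 1
    return wins / trials


print(f"Expected EV: {expected_ev(BLOCS):.1f}")
print(f"Win probability: {win_probability(BLOCS, seed=1):.3f}")
```

With real inputs, `states` would hold the 51 state win probabilities and electoral votes; the 20 EV state at 50% from the text contributes exactly 10 to the expected total.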

How does fivethirtyeight.com do it? We know that electoral-vote.com and RCP just add up the electoral vote based on who is ahead in the latest poll (or poll average). But that can be misleading. What if McCain (or Obama) leads by 51-49% in 5 states with a combined 100 EV? Electoral-vote.com and RCP would just give all 100 electoral votes to the leader. This is incorrect since there is no accounting for the polling spread. With a 4% margin of error, the probability is 31% that the trailing candidate will win each of those states.
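The 51-49% example can be checked directly. This sketch assumes the 4% margin of error used elsewhere in the post and treats the five states as one 100 EV block, which is fine for an expected value.

```python
from statistics import NormalDist

MOE = 0.04  # margin of error the post uses for a typical state poll
EV = 100    # combined electoral votes of the five states in the example

# Leader has 51% of the 2-party vote: NORMDIST(0.51, 0.50, MoE/1.96, TRUE)
p_leader = NormalDist(mu=0.50, sigma=MOE / 1.96).cdf(0.51)
p_trailer = 1.0 - p_leader

print(f"Trailing candidate's win probability: {p_trailer:.0%}")
print(f"Winner-take-all tally: {EV} to 0")
print(f"Expected EV: {p_leader * EV:.1f} to {p_trailer * EV:.1f}")
```

The winner-take-all tally hides the fact that roughly a third of the expected electoral votes in such states belong to the trailing candidate.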

The total EV is just the sum of the expected state EVs. If you agree with the state win probabilities, then the expected Electoral vote is a no-brainer.

The electoral vote win probability is simply the number of Monte Carlo simulation trial wins/5000. End of story.

Here’s a challenge to those who have criticized the 2008 Election Model’s win probability calculations. Compare the methodology to fivethirtyeight. But be specific. What was the 2004 fivethirtyeight projection? The Election Model correctly projected that Kerry would win the True Vote with 99% win probability - just like Obama. Of course, those who still believe that Bush won legitimately will never be convinced.

The 2004 Election Model

It’s hard to believe. But the same naysayers have been using the same old, discredited arguments for four years in their fruitless attempts to cast doubt on the 2004 exit polls which indicated that Kerry won the election.

The final 2004 Election Model projected that Kerry would win 51.8% of the 2-party vote with an expected 337 EV. The unadjusted aggregate exit poll (WPE) indicated that Kerry won 52.5%. The Kerry winning states totaled 337 EV.

Naysayers were challenged in the Democratic Underground Game Thread to provide a mathematically feasible and plausible Bush win scenario. In order to comply with the rules of the “game”, they had to use feasible weights based on the recorded 2000 and 2004 vote, annual 0.87% mortality rate and estimated 95% turnout of 2000 voters. They presented a spreadsheet to show a scenario for Bush to achieve his 3 million vote “mandate”.

In order to match the recorded vote, they adjusted the Bush Final NEP vote shares to implausible levels. Their scenario was based on the following assumptions:

1) One in 7 (14.63%) Gore 2000 voters defected to Bush in 2004.

The 12:22am NEP reported 8% (10% in the 2pm Final).

2) Kerry won just 52.90% of DNV (new voters and others who did not vote in 2000).

The NEP reported 57% (54% in the Final).

3) Just 7.20% of Bush 2000 voters defected to Kerry.

The NEP reported 10% (9% in the Final).

On the other hand, the True Vote model, which used feasible weights and plausible vote shares, determined that Kerry won by 52.6-46.4%.

The assumptions were:

1) 0.87% annual mortality

2) 95% turnout of Gore, Bush and Other 2000 voters in 2004

3) 125.74m total votes were cast (Census) in 2004

4) 12:22am NEP vote shares

            True Vote Model                           Bush Win Scenario

         Pct      Kerry    Bush     Other            Pct      Kerry    Bush     Other

DNV      21.49%   57%      41%      2%               21.72%   52.90%   46.50%   0.60%

Gore     38.23%   91%      8%       1%               37.84%   84.83%   14.63%   0.54%

Bush     37.83%   10%      90%      0%               37.44%   7.20%    92.31%   0.49%

Other    2.45%    71%      21%      8%               3.00%    65.90%   18.10%   16.00%

Share    100.0%   52.56%   46.43%   1.01%            100.0%   48.26%   50.74%   1.00%

         Votes    Kerry    Bush     Other            Votes    Kerry    Bush     Other

DNV      27.02    15.40    11.08    0.54             26.56    14.05    12.35    0.16

Gore     48.07    43.74    3.85     0.48             46.28    39.26    6.77     0.25

Bush     47.57    4.76     42.81    0.00             45.79    3.30     42.27    0.22

Other    3.08     2.19     0.65     0.25             3.67     2.42     0.66     0.59

Total    125.74   66.09    58.38    1.27             122.30   59.02    62.05    1.22
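The national shares in both scenarios are just the weights multiplied by the corresponding vote shares. The sketch below recomputes the Share rows from the percentages in the table above; the variable and function names are illustrative only.

```python
# Rows: (group, weight %, Kerry %, Bush %, Other %), copied from the table above.
TRUE_VOTE = [
    ("DNV",   21.49, 57.0, 41.0, 2.0),
    ("Gore",  38.23, 91.0,  8.0, 1.0),
    ("Bush",  37.83, 10.0, 90.0, 0.0),
    ("Other",  2.45, 71.0, 21.0, 8.0),
]
BUSH_WIN = [
    ("DNV",   21.72, 52.90, 46.50,  0.60),
    ("Gore",  37.84, 84.83, 14.63,  0.54),
    ("Bush",  37.44,  7.20, 92.31,  0.49),
    ("Other",  3.00, 65.90, 18.10, 16.00),
]


def national_shares(rows):
    """Return (Kerry %, Bush %, Other %) as weight-averaged vote shares."""
    kerry = bush = other = 0.0
    for _group, weight, k, b, o in rows:
        kerry += weight * k / 100.0
        bush += weight * b / 100.0
        other += weight * o / 100.0
    return kerry, bush, other


for label, rows in (("True Vote Model", TRUE_VOTE), ("Bush Win Scenario", BUSH_WIN)):
    k, b, o = national_shares(rows)
    print(f"{label}: Kerry {k:.2f}%  Bush {b:.2f}%  Other {o:.2f}%")
```

Changing a single cell, such as the Gore defection rate, shows how far the vote shares must move to turn a Kerry win into the recorded Bush margin.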

Which scenario are we to believe: the implausible 14.63% Gore defection rate or the mathematically impossible 43 Bush/ 37 Gore weights? Was the exit poll match to the recorded vote based on a) plausible 37.84 Gore/ 37.44% Bush weights and an implausible 14.63% Gore defection rate, or b) the Final NEP impossible 43 Bush/ 37% Gore weights and plausible (8-10%) Gore defection rate?

Because the 43 Bush/ 37 Gore weights contradicted the debunked reluctant Bush responder (rBr) hypothesis, the naysayers needed to come up with another explanation. They cited a post-election NES 600-sample survey to account for the impossible Final Bush/Gore weights. But they wanted to have it both ways: On the one hand, they claimed that the 43/37 weights were legitimate exit poll samples in which Gore voters misstated their vote; but they contradicted that when they used feasible weights applied to an implausible 14.6% Gore defection rate. But it was a very weak argument because it implied that 6.6% of Gore voters (8.6% over the 12:22am NEP defection rate) misrepresented their vote when they told the exit pollsters they voted for Bush in 2000.

They attributed the mass defection of Gore voters to a long-term bandwagon effect: former Gore voters wanted to associate with the “winner”, Bush. But “false recall” is not a plausible explanation since a) Gore won by 540,000 votes, b) according to the pristine 12:22am NEP, Kerry captured 91% of Gore voters and 10% of Bush voters, c) Bush had a 48.5% approval rating on Election Day, d) false recall is not applicable to pre-election polls and e) the pre-election polls matched the exit polls.

Why would Gore voters want to be associated with Bush? Even if returning Gore voters lied about their vote in 2000, it’s irrelevant. What is relevant is a) their factual 2000 recorded Gore vote and b) that 91% said they just voted for Kerry. We use this factual data to compute feasible and plausible weights by adjusting the 2000 recorded vote for mortality and estimated 2004 turnout.

False recall cannot explain the other demographic weightings. In the 12:22am NEP, 13047 respondents were asked who they just voted for – and Kerry won. But only 3200 respondents were asked how they voted in 2000. Kerry must have also won the 10,000 who were not asked how they voted in 2000. This fact alone totally contradicts the “false recall” argument. Why would respondents lie to the exit pollsters and claim to have voted for Kerry if they voted for Bush? Did they also lie about their gender? Kerry won the Gender demographic by 50.78-48.22%.

GENDER     Weight   Kerry    Bush     Other

Male       46%      47%      52%      1%

Female     54%      54%      45%      1%

Share      100%     50.78%   48.22%   1.00%

Votes      122.3    62.10    58.97    1.22

What is relevant is who the exit poll respondents said they just voted for in 2004 - and 91% of returning Gore voters said Kerry. The 2000 and 2004 recorded vote and annual mortality rate are historical demographic facts. They are necessary and sufficient to determine the maximum number of Bush and Gore voters who could have voted in 2004. The realistic, plausible weight for each group is just the ratio of its 2000 voter turnout to the total 2004 recorded vote. The weights multiplied by the corresponding exit poll vote shares determine the national share. Therefore, the only exit poll response which matters is the answer to the question: Who did you vote for in 2004? It follows that even if "false recall" were a factor, it is irrelevant. Voters do not falsely recall who they just voted for five minutes earlier. What would be their motivation to lie? Survey responses are confidential.
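The feasibility check described above can be sketched with figures that appear in this post: 50.46m recorded Bush votes in 2000, 122.3m recorded votes in 2004 and the 0.87% annual mortality rate from the Game Thread rules. The sketch assumes four-year mortality is four times the annual rate and, as an upper bound, that every surviving 2000 voter turned out in 2004. The function name is hypothetical.

```python
def max_feasible_weight(recorded_2000, total_2004, annual_mortality,
                        years=4, turnout=1.0):
    """Largest share of the 2004 vote that returning 2000 voters could make up.

    Assumptions: mortality over the period is annual_mortality * years,
    and turnout=1.0 means every surviving 2000 voter voted again.
    """
    survivors = recorded_2000 * (1.0 - annual_mortality * years)
    return survivors * turnout / total_2004


# Figures in millions of votes, taken from the post.
bush_max = max_feasible_weight(50.46, 122.3, 0.0087)
print(f"Maximum feasible Bush 2000 weight: {bush_max:.1%}")
print(f"Final NEP weight of 43% feasible? {0.43 <= bush_max}")
```

Even with 100% turnout of surviving Bush 2000 voters, the maximum weight falls short of the 43% used in the Final NEP.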

The Election Calculator Model

The model calculates the True Vote for all elections since 1988.

For the 2004 election, the data consists of:

1) Census: 125.7m votes cast in 2004 vs. 122.3m recorded; 3.4m (2.74%) uncounted

2) Census: 110.8m votes cast in 2000 vs. 105.4m recorded; 5.4m (4.86%) uncounted

3) Annual voter mortality: 1.22% (4.88% over 4 years)

Assumptions:

1) 12:22am NEP vote shares

2) 2000 voter turnout in 2004: 95%

3) 75% of uncounted votes to Gore and Kerry

         2000 Recorded

Voted    Recd     Unctd    Cast     Died    Alive

Gore     51.00    4.04     55.04    2.72     52.32

Bush     50.46    1.08     51.53    2.48     49.06

Other    3.96     0.27     4.23     0.21     4.02

Total    105.42   5.38     110.8    5.41     105.39

         2004 Calculated (votes in millions)

         Turnout  Voted    Weight   Kerry    Bush     Other

DNV      -        25.61    20.4%    57%      41%      2%

Gore     95%      49.70    39.5%    91%      8%       1%

Bush     95%      46.60    37.1%    10%      90%      0%

Other    95%      3.82     3.0%     64%      17%      19%

Total    100.1    125.7    100%     53.23%   45.39%   1.38%

Votes                               66.94    57.07    1.74
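The 2004 Calculated table can be rebuilt from the Alive column, the 95% turnout assumption and the 12:22am NEP vote shares. This is a sketch of the arithmetic only, not the Election Calculator itself; it uses the 125.74m total votes cast cited earlier in this post, and all names are illustrative.

```python
TOTAL_CAST_2004 = 125.74  # millions of votes cast (Census)
TURNOUT = 0.95            # assumed turnout of surviving 2000 voters

# 2000 voters still alive in 2004 (millions), from the "Alive" column.
ALIVE = {"Gore": 52.32, "Bush": 49.06, "Other": 4.02}

# 12:22am NEP vote shares in percent: (Kerry, Bush, Other).
SHARES = {
    "DNV":   (57, 41, 2),
    "Gore":  (91, 8, 1),
    "Bush":  (10, 90, 0),
    "Other": (64, 17, 19),
}


def true_vote(alive, shares, total_cast, turnout):
    """Return (voters per group, [Kerry, Bush, Other] votes) in millions."""
    voted = {group: n * turnout for group, n in alive.items()}
    # Everyone who voted in 2004 but not in 2000 (DNV) fills the remainder.
    voted["DNV"] = total_cast - sum(voted.values())
    votes = [
        sum(voted[group] * shares[group][i] / 100.0 for group in voted)
        for i in range(3)
    ]
    return voted, votes


voted, votes = true_vote(ALIVE, SHARES, TOTAL_CAST_2004, TURNOUT)
for name, v in zip(("Kerry", "Bush", "Other"), votes):
    print(f"{name}: {v:.2f}m ({100 * v / TOTAL_CAST_2004:.2f}%)")
```

Anyone constructing a Bush win scenario would have to change `TURNOUT`, `ALIVE` or `SHARES`, which makes the required assumptions explicit.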

Here is a challenge for those who still believe that Bush won by 3 million votes:

Download the Election Calculator Model and construct a 2004 Bush win scenario.