Disclaimer: While I am a paid undergraduate researcher for Texas Tech University, the opinions expressed on this blog are my own. The statements on this blog do not represent the views of Texas Tech University, the Honors College of Texas Tech University, the math department of Texas Tech University, or any other department or campus organization of Texas Tech University.

Saturday, November 5, 2016

Final Call Summary

This post represents my official final prediction for the 2016 presidential election using my experimental model based upon Bayesian analysis. Overall, my model predicts 340 electors for Clinton and 198 electors for Trump.
My final set of data was pulled from Pollster (run by Huffington Post) on either Friday (11/04) or Saturday (11/05).

First I am going to lay out again how my model works and how decisions are made.


I designed my experiment in September of 2016. My model operates on an adjusted average of poll data from both the state itself and a similar state with more information. There are five categories of states: red southern states (Texas polls), red Midwest states (Nebraska polls), blue northern states (New York polls), blue western states (California polls), and swing states (national polls). These reference states were chosen in advance. If I were making this decision today, I would have picked Indiana for the Midwest states instead of Nebraska, because it has more polls with better information on other candidates. For a poll to be considered, it had to have been conducted on or after July 1st, 2016. The latest polls do not play a large role in my analysis, since I am averaging all polls since July 1st. I attempted to approximate percentages for third-party candidates, but because state-level polls are inconsistent about including third-party candidates, my model will underestimate those candidates. Changes over time are not a factor in my analysis, because the theory is that, in a usual election, opinions about the candidates do not change much over time.
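The post does not give the exact adjustment formula, but the idea of blending a state's own polls with those of its reference category could be sketched as follows. The 0.7/0.3 weighting and the poll percentages are illustrative assumptions, not the model's actual parameters:

```python
def adjusted_average(state_polls, reference_polls, state_weight=0.7):
    """Blend a state's own poll averages (since July 1st) with the
    polls of its reference category (e.g. Texas for red southern
    states).  The 0.7/0.3 split is an illustrative assumption, not
    the model's actual weighting."""
    state_avg = sum(state_polls) / len(state_polls)
    reference_avg = sum(reference_polls) / len(reference_polls)
    return state_weight * state_avg + (1 - state_weight) * reference_avg

# Hypothetical percentages for one candidate in one state
blended = adjusted_average([46.0, 48.0, 47.0], [50.0, 52.0])
print(blended)
```

The point of the blend is that a state with few polls borrows information from a better-polled state in the same category, at the cost of some bias toward the reference state.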


I am defining success as correctly predicting at least 46 states. However, this experiment is not about being correct; it is about testing the effectiveness of a new type of model. My model is probably not the best way to predict a presidential election, but it is still a valid model and should be relatively accurate. This model is also untested, and there are no examples of this exact approach. I make my calls and analysis based on my model, not my opinions. The only state where I think my model is wrong is Ohio, because I believe the trends of the last few weeks there are probably more accurate than the overall average of the last few months. During the nomination process, when intervals overlapped so closely that no true winner could be determined, I made calls based on my personal opinion. If I had a situation like that in the general election, I would make a call based on outside information, but this is not the case. This prediction is what I am studying academically, and it is what I will write my paper on and present at conferences. However, I might release another prediction tomorrow or Monday based on new polls that change the results of my model in the six states I think might flip before election day. That updated prediction will be based on my model, but it won't determine the success of my experiment.
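The "overlapping intervals" situation mentioned above can be illustrated with a simple check; the interval endpoints here are hypothetical, not outputs of the model:

```python
def intervals_overlap(a, b):
    """True when two (low, high) credible intervals share any
    common ground, i.e. neither candidate can be called a clear
    winner from the intervals alone."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical credible intervals for two candidates in one state
clinton = (44.0, 49.0)
trump = (47.0, 52.0)
print(intervals_overlap(clinton, trump))  # overlapping: no clear call
```

When the intervals are disjoint, the candidate with the higher interval is the model's call; when they overlap, some outside tiebreaker is needed.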


Nebraska and Maine award some of their electors by congressional district. My prediction is that Trump will win all congressional districts in Nebraska and the 2nd district in Maine, and Clinton will win the 1st district in Maine.

This is a summary map of how my model is calling every state:
This is the official map for my experiment.





This map isn't based entirely on my model. I think this is what will actually happen, based upon a combination of my model and other factors like momentum, trends, early voting data, news stories, and other non-poll factors. However, aside from Ohio, the two maps are in agreement.



Personal Opinion Disclosure: I voted a straight Republican ticket in the election, except for president, where I wrote in Ana Navarro.


Technical Description of my model: I am doing a Bayesian analysis assuming a normal prior, a normal likelihood, and a normal posterior. This is done in Anaconda using SciPy. The method for finding the standard deviation is the standard formula based on the sum of squared deviations from the mean. I am aware that this is probably not the best method for this kind of situation, but I am limited by time and by my mathematical abilities as an undergraduate student. I plan to go more in depth in the future.
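The post does not show the computation itself, but a conjugate normal-normal update of the kind described (normal prior, normal data model, normal posterior, known-variance case) can be sketched with SciPy as follows. The prior parameters and poll percentages are made-up placeholders, and the standard deviation uses the sum-of-squared-deviations formula mentioned above:

```python
import numpy as np
from scipy import stats

def normal_posterior(prior_mean, prior_sd, data):
    """Conjugate normal-normal update: with a normal prior and a
    normal data model (variance estimated from the polls), the
    posterior over the candidate's true support is also normal."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    # Sample variance via the standard sum-of-squared-deviations formula
    sigma2 = np.sum((data - data.mean()) ** 2) / (n - 1)
    prior_var = prior_sd ** 2
    # Precision-weighted combination of prior and data
    post_var = 1.0 / (1.0 / prior_var + n / sigma2)
    post_mean = post_var * (prior_mean / prior_var + data.sum() / sigma2)
    return post_mean, np.sqrt(post_var)

# Hypothetical poll percentages for one candidate in one state;
# the prior would come from the state's reference category.
mean, sd = normal_posterior(prior_mean=46.5, prior_sd=2.0,
                            data=[46.0, 48.0, 47.5, 45.0, 47.0])
# 95% credible interval from the normal posterior
lo, hi = stats.norm.interval(0.95, loc=mean, scale=sd)
print(f"posterior: {mean:.1f} +/- {sd:.2f}, 95% CI ({lo:.1f}, {hi:.1f})")
```

The posterior mean lands between the prior mean and the poll average, weighted by their precisions, which is the usual behavior of this conjugate update.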





