He’s quite a well-known pollster. Up until recently he ran FiveThirtyEight, but it got sold and he left.
He got the 2016 election wrong (71 Hillary, 28 Trump). He got the 2020 election right (89 Biden, 10 Trump).
Right and wrong are the incorrect terms here, but you get what I mean.
He didn’t get it wrong. He said the Clinton-Trump election was a tight horse race, and that Trump had one side of a four-sided die.
The state-by-state data wasn’t far off.
Problem is, people don’t understand statistics.
If someone said Trump had over a 50% probability of winning in 2016, would that be wrong?
In statistical modeling you don’t really have right or wrong. You have a level of confidence in a model, a level of confidence in your data, and a statistical probability that an event will occur.
So if my model says RFK has a 98% probability of winning, then it is no more right or wrong than Silver’s model?
If so, then probability would be useless. But it isn’t useless. Probability is useful because it can make predictions that can be tested against reality.
In 2016, Silver’s model predicted that Clinton would win. Which was wrong. He knew his model was wrong, because he adjusted his model after 2016. Why change something that is working properly?
Yes. But you’d have to run the test repeatedly and see if the outcome, i.e. Clinton winning, happens as often as the model predicts.
But we only get to run an election once. And there is no guarantee that the most likely outcome will happen on the first try.
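A minimal simulation sketch of that point, assuming the quoted 28% is taken at face value: over many hypothetical reruns a 28% event comes up about 28% of the time, but any single run is simply a win or a loss. The trial count and seed below are arbitrary.

```python
import random

random.seed(0)
p_underdog = 0.28        # the quoted win probability, taken at face value
trials = 100_000         # hypothetical reruns we never get in reality

wins = sum(random.random() < p_underdog for _ in range(trials))
print(f"Underdog wins {wins / trials:.1%} of {trials:,} simulated elections")

# A real election is a single draw: the 28% outcome either happens or it doesn't.
print("Single run:", "win" if random.random() < p_underdog else "loss")
```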
If you can only run an election once, then how do you determine which of these two results is better (given that Trump won in 2016): a model that gave Trump a 28% chance of winning, or one that gave him better than a 50% chance?
You do it by comparing the state voting results to pre-election polling. If the pre-election polling said D+2 and your final result was R+1, then you have to look at your polls and individual polling firms and determine whether some bias is showing up in the results.
Is there selection bias or response bias? You might find that a set of polls is randomly wrong, or you might find that they’re consistently wrong, adding 2 or 3 points in the direction of one party but generally tracking with results across time or geography. In that case you identify a “house effect”: either the people that firm is calling, or the people willing to talk to them, lean 2 to 3 points more Democratic than the electorate as a whole.
All of this is explained on the website and it’s kind of a pain to type out on a cellphone while on the toilet.
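A minimal sketch of the comparison described above, with made-up poll and result margins (in percentage points, positive = Democratic lead) for one hypothetical firm; the average signed error is one simple way to read off a “house effect”. This illustrates the idea only and is not FiveThirtyEight’s actual method.

```python
# Hypothetical final poll margins vs. actual result margins for one firm,
# in percentage points, positive = Democratic lead. All numbers are made up.
polls_vs_results = [
    (+2.0, -1.0),   # poll said D+2, result was R+1
    (+4.5, +2.0),
    (-1.0, -3.5),
    (+3.0, +0.5),
]

# Signed error per race: how far the poll leaned toward the Democrat.
errors = [poll - result for poll, result in polls_vs_results]
house_effect = sum(errors) / len(errors)

print(f"Estimated house effect: {house_effect:+.1f} points "
      "(positive = this firm's polls lean Democratic)")
```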
You are describing how to evaluate polling methods. And I agree: you do this by comparing an actual election outcome (e.g. statewide vote totals) to the results of your polling method.
But I am not talking about polling methods, I am talking about Silver’s win probability. This is some proprietary method that takes other people’s polls as input (Silver is not a pollster) and outputs a number, like 28%. There are many possible ways to combine the poll results, giving different win probabilities. How do we evaluate Silver’s method, separately from the polls?
I think the answer is basically the same: we compare it to an actual election outcome. Silver said Trump had a 28% win probability in 2016, which means he should win 28% of the time. The actual election outcome is that Trump won 100% of his 2016 elections. So, as best we can tell, Silver’s win probability was quite inaccurate.
Now, if we could rerun the 2016 election, maybe his estimate would look better over multiple trials. But we can’t do that; all we can ever do is compare 28% to 100%.
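For what it’s worth, one standard way to score probabilistic forecasts is not to compare a single number to a single outcome, but to average an error measure over many forecasts, e.g. every state race across several cycles. A minimal sketch using the Brier score (the mean squared difference between the forecast probability and the 0/1 outcome; lower is better). The probabilities below are invented for illustration and are not Silver’s actual numbers.

```python
def brier_score(forecasts):
    """forecasts: list of (predicted_probability, outcome), outcome 1 = won, 0 = lost."""
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)

# Two hypothetical forecasters scored on the same three races (numbers invented):
hedged        = [(0.28, 1), (0.90, 1), (0.15, 0)]
overconfident = [(0.98, 1), (0.98, 1), (0.98, 0)]

print("Hedged forecaster:       ", round(brier_score(hedged), 3))         # 0.184
print("Overconfident forecaster:", round(brier_score(overconfident), 3))  # 0.32
```

On a single yes/no event, a 28% forecast that “misses” still scores better than a 2% forecast that misses; only over many events can you tell whether the probabilities were well calibrated.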
Just for other people reading this thread, the following comments are an excellent case study in how an individual (the above poster) can be so confidently mistaken, even when other posters try to patiently correct them.
May we all be more respectful of our own ignorance.
But what if there’s a 28% chance said poster is right?