
Monday, October 20, 2008

A Histogram You Want to Read

posted by Jonathan Golob on October 20 at 5:37 PM

Nate Silver, the wonky head of the mathematically rigorous election projection site FiveThirtyEight.com, has a computer model that uses all of the available polling, weighted for accuracy, demographics and the rest, to run through ten thousand possible elections every day. Each one of these simulated elections pops out an electoral vote total for Obama.
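For the curious, here is a rough sketch in Python of what "running through ten thousand possible elections" could look like. It is only an illustration: the win probabilities are invented, the grouping into "safe" blocs is an assumption, and each state is treated as an independent coin flip, which is a simplification and not a description of Nate's actual model.

```python
import random

# (name, electoral votes, ASSUMED chance Obama carries it).
# The probabilities are made up for illustration; they are not FiveThirtyEight's.
CONTESTS = [
    ("Safe Obama states", 231, 1.00),   # hypothetical bloc
    ("Safe McCain states", 163, 0.00),  # hypothetical bloc
    ("PA", 21, 0.95), ("IA", 7, 0.95), ("NM", 5, 0.93),
    ("CO", 9, 0.85), ("VA", 13, 0.85), ("NV", 5, 0.75),
    ("OH", 20, 0.70), ("FL", 27, 0.70), ("MO", 11, 0.60),
    ("NC", 15, 0.60), ("IN", 11, 0.40),
]

def simulate_election(rng):
    """Return Obama's electoral vote total for one simulated election."""
    # Each contest is an independent weighted coin flip (a simplification).
    return sum(ev for _, ev, p in CONTESTS if rng.random() < p)

def run_simulations(n=10_000, seed=538):
    """Run n simulated elections and return the list of Obama EV totals."""
    rng = random.Random(seed)
    return [simulate_election(rng) for _ in range(n)]

totals = run_simulations()
print(min(totals), max(totals))
```

Each entry in `totals` is one simulated election night. Everything that follows in this post is just a way of looking at that list.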

What’s the best way to display all this data? A histogram.
Here’s Nate’s:

1019_evdist.png

Along the bottom, on the horizontal, are the possible electoral vote counts for Obama.

For each one, from zero to 538, the vertical shows the number of times that Obama electoral vote count came up during his ten thousand simulations. The tallest peaks are the most likely outcomes. The low tails are outcomes that are possible, but not very likely.
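If you want to see how such a histogram gets built, here is a minimal sketch in Python. The ten totals in `sim_totals` are invented stand-ins for the ten thousand real simulation results, and the function name `ev_histogram` is my own.

```python
from collections import Counter

# Hypothetical Obama EV totals from a handful of simulated elections.
sim_totals = [364, 375, 364, 338, 291, 364, 264, 353, 375, 311]

def ev_histogram(totals):
    """Count how often each possible EV total (0 through 538) occurred."""
    counts = Counter(totals)
    return {ev: counts.get(ev, 0) for ev in range(0, 539)}

hist = ev_histogram(sim_totals)

# The tallest bar is the single most common outcome in the simulations.
peak = max(hist, key=hist.get)
print(peak, hist[peak])

# A crude text version of the chart: one '#' per simulation.
for ev, n in hist.items():
    if n:
        print(f"{ev:3d} {'#' * n}")
```

The horizontal axis is the dictionary's keys, the vertical axis is its values. Most of the 539 bars are empty, which is why the real chart looks like a few spikes on a flat floor.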

Many of the closer followers of FiveThirtyEight.com, like the Stranger’s own Anthony Hecht, tend to focus more on the big Obama victory pie chart. Over the past few days, Obama’s number has drifted down a bit, from a peak around 96% to the low nineties today.
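As far as I can tell, that pie chart number is just the share of simulated elections that Obama wins, meaning he reaches at least 270 of the 538 electoral votes. A small sketch of that arithmetic in Python, using ten made-up totals in place of the real ten thousand:

```python
# Hypothetical Obama EV totals from simulated elections (invented numbers).
sim_totals = [364, 375, 364, 338, 291, 364, 264, 353, 375, 311]

EV_TO_WIN = 270  # a majority of the 538 electoral votes

def win_percentage(totals, threshold=EV_TO_WIN):
    """Percentage of simulations in which the total reaches the threshold."""
    wins = sum(1 for t in totals if t >= threshold)
    return 100 * wins / len(totals)

win_pct = win_percentage(sim_totals)
print(f"Obama wins {win_pct:.1f}% of simulations")
```

Note what this throws away: every simulation collapses to win or lose, so a 271 squeaker and a 375 blowout count the same.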

Look at Nate’s histogram for today:

1020_evdist.png

McCain is all tail, no peak. The peaks are still strongly skewed to an Obama blowout.

The histogram shows you the chances of the different outcomes in crisp (and in this case comforting) detail, far more than a single number or a pie chart can.

I love histograms.

Comments

1

The reason this looks like this is that in order to win, McCain needs a clean sweep of ALL the swing states and all the marginal states. Even if McCain wins FL, NV, MO, OH, and NC, he loses. Even if he takes NM and CO, an Obama win in VA puts McCain out. He needs to run the table, and that's statistically very, very unlikely.

Posted by Fnarf | October 20, 2008 5:51 PM
2

So. If I may paraphrase both Jonathan and Fnarf. Obama 92.5% / McCain 7.5%. Is that basically what you're both saying?

Posted by elenchos | October 20, 2008 5:54 PM
3

Yeah, elenchos is right... The histogram is interesting in that the most likely outcome is 364. But, 9,250 times out of 10,000, Obama wins the election. That is really what I care about...

Posted by Julie in Chicago | October 20, 2008 6:04 PM
4

yeah, well all this is assuming that the voting machines aren't rigged. that's a big assumption. i'm really not looking forward to the next election being fought (and lost) in court. we really can't afford it.

Posted by ellarosa | October 20, 2008 6:05 PM
5

Can't you make meth out of antihistograms?

Posted by Jubilation T. Cornball | October 20, 2008 6:07 PM
6

"364" means Obama wins NV, CO, NM, MO, IA, OH, PA, VA, NC, and FL among the theoretically contested states. "375" adds IN, which is a bit of a stretch, I think. To get higher than that, he's going to need McCain to really collapse, giving him MT, ND, AR, LA, and GA. Sweetest of all would be taking AZ, but that's going to take a 12.3% swing from current polling. Not going to happen, but it makes my pants feel tight to think about.

Posted by Fnarf | October 20, 2008 6:15 PM
7

my only tiny quibble is that it would be nice if the scale was density or proportion since the number of simulations varies from graph to graph.

but yes. death to the pie chart. may the histogram and other information dense graphs prosper along with nate silver's genius.

Posted by josh | October 20, 2008 6:17 PM
8

It would be poetic if the "Big O" got 360 on the nose, but I'll be happy with 270.

Posted by DOUG. | October 20, 2008 6:20 PM
9

After you take into consideration the reverse Bradley effect and landline polling, Obama wins 538 electoral votes 100% of the time.

Posted by i don't have a fancy graph though | October 20, 2008 6:24 PM
10

To Elenchos and Julie,

McCain's 7% or so is all tail. That makes it, having read many histograms on many different kinds of data, seem like noise to me, not a real shift in probability.

This is a subtle point. Compare the first histogram I posted (for 10/19's polls) to the second (for 10/20's polls).

The first histogram has a small, but distinct peak below 268 EV's, a peak in the McCain wins range. That little peak makes me think McCain really does have a small shot at victory.

The second histogram? No peak. All of the possibilities of a McCain win are simply a tail off of the closest Obama peak at around 290 EV.

Tails, particularly short tails like this one, are the domain of statistical noise. Remember, this is all an estimate of reality. The shape of the histogram curve helps us sort out noise (a pure tail) from a real, but unlikely, event (a small distinct peak).

Posted by Jonathan Golob | October 20, 2008 6:25 PM
11

Personally, I like tail.

Oh, wait, isn't this Savage Love?

Never mind ... I'll settle for a standard disease curve, where most people are healthy and only the diseased individuals are in the tail.

Posted by Will in Seattle | October 20, 2008 6:36 PM
12

Mr. Silver deserves credit for using pies to compare two numbers, where pies can work well, but for more complex ideas he switches to other graphs -- including a plain old table of numbers where appropriate.

@10

Jonathan, what you're doing now is taking the conclusions from Nate Silver's method, and changing them by saying the McCain scenarios are less than 7.5% likely. Since Silver's method makes no such strong claim, you are in fact constructing your own entirely new and different method. Fair enough, but then your conclusion can't rest on Silver's work, but rather on whatever theory you've developed.

Fortunately, Silver provides all the raw data on his site, so you are in a good position to support your own theory with the same numbers, if you wish to do so. But it would involve much more than looking at the histogram.

Posted by elenchos | October 20, 2008 6:39 PM
13

Not that I doubt the veracity of Silver's program or its result, but I would like to see the results of his program back tested from earlier this year and then back to previous elections, especially 2000.

Posted by LMSW | October 20, 2008 7:49 PM
14

As if Obama doesn't have enough on his plate his last living relative has taken a turn for the worse. He's cancelled his appearances for this week in order to fly to Hawaii to be at his Grandmother's bedside.

Posted by DavidC | October 20, 2008 7:57 PM
15

The Monte Carlo methods used are interesting, but they will usually involve the bizarre scenarios that Fnarf details. In the 10,000 scenarios drawn, there will always be those outliers where Obama picks up Georgia, or McCain swipes New Jersey.

Statistically, Silver's right... the polling data could theoretically result from an underlying situation where Obama is actually losing, or ready to pick up 400 EVs. Practically, though, such a result would be quite difficult to imagine right now.

Perfect example of the ludic fallacy. You're right to be looking at the peaks here, but the moral of the entire story is that unless there's some reason to completely distrust the poll information, a 92.5% chance is not substantively different from a 96.5% chance.

Posted by demo kid | October 20, 2008 8:01 PM
16

@7

He uses the same number of simulations every time - 10,000.

Posted by STJA | October 20, 2008 8:13 PM
17

If I import this histogram into Photoshop, will it make my photos sharper and brighter?

Posted by Andy Niable | October 20, 2008 8:26 PM
18

Hey man, I LOVE the 538 histograms. I post the pies and the supertracker (SUPERTRACKER!) because they have a little more oomph, and require less explanation (read: I'm lazy).


Posted by anthony Hecht | October 20, 2008 9:31 PM
19

Check out election.princeton.edu for another shapely histogram :)

Posted by Lindsey | October 20, 2008 9:35 PM
20

Funny, I don't see that Real-Virginia has any electoral votes at all. I was worried there for a minute. So I guess we can let McCain have Real-Virginia and make do with the one that actually has electoral votes.
I don't really like these histograms. They still just lump polls weighted by sample size. That's comparing apples and oranges by not looking at methodologies.
I'd like to see the polls divided into three groups - traditional likely voter - adaptive likely voter - and catchall. Then look at the histograms generated from each.
I'm really optimistic today though. Obama's rebounded on Zogby and Gallup-trad. Zogby bumped from about 4.0% to about 5.4%, Gallup-trad from about 3% to about 5%. The state for state numbers won't be out entirely until end of the week though. Even so, it means that even in a worst case scenario of voter turnout, Obama's popular margin is a couple points beyond the margin of error.

Posted by kinaidos | October 20, 2008 10:09 PM
21

@14 Doesn't he have living relatives in Kenya? Or do they not count?

Posted by drewl | October 20, 2008 10:54 PM
22

Polls are nice.

Now we just have to keep a look out for voter "de-registration"

Ballot tampering

and

General GOP Voter fraud

Has everyone forgotten the 00 and 04 votejacks? Votegates?

Stay vigilant and very wary y'all

Posted by Fred34 | October 21, 2008 1:53 AM
23

Yay! Now I have a use for that idiot stats class they're making me take, and I can officially say that Slog has helped me with my homework.

Posted by SeaExile | October 21, 2008 5:47 AM
