Tuesday, July 17, 2007

On Data

posted on July 17 at 2:44 PM

Tell me what this means:

Thanks to new technologies (like microarrays), collecting vast amounts of data is easier than ever. So, what do all those dots mean? Without some annotation – information on what those dots represent, which ones are more important or interesting for a given problem – it’s hopeless to answer a useful question.

The technology is great, but without careful context, the data is worse than useless. Poorly applied, it’s an endless source of false leads, false connections and false certainty. This is the difference between data and evidence.

Which brings me to:

“We have no credible information pointing to a specific imminent attack,” said [White House homeland security adviser] Ms Townsend. “But the warning is clear, and we are taking it seriously.”

Michael Chertoff, Homeland Security secretary, told a newspaper last week that he had a “gut feeling” that al-Qaeda was preparing an attack.

Courtesy of MSNBC

Aside from the constitutional and moral cesspool at the center of a policy that refuses to control the collection of personal information on citizens, it’s the pathetic uselessness of the data generated (“no credible information,” “gut feeling”) that drives me nuts. Warrants are filters, guaranteeing that there is some good reason to be listening, and that other cheaper and potentially more informative methods have failed. Warrant-less wiretapping is like running a microarray having made no effort to identify the spots. There is no better way to generate a lot of time-wasting information and false leads.
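To put rough numbers on the false-lead problem, here is a minimal back-of-the-envelope sketch in Python. Every figure in it is invented for illustration (the population size, the number of real targets, the hit rate and the false-alarm rate are all assumptions, not data from any actual program), and the function name is hypothetical.

```python
def false_lead_share(population, true_targets, hit_rate, false_alarm_rate):
    """Return (number flagged, share of flagged people who are false leads)."""
    innocents = population - true_targets
    true_hits = true_targets * hit_rate          # real targets correctly flagged
    false_hits = innocents * false_alarm_rate    # innocents flagged by mistake
    flagged = true_hits + false_hits
    return flagged, false_hits / flagged


# All four numbers below are made up for the sake of the arithmetic.
flagged, share = false_lead_share(
    population=1_000_000,    # people swept up with no filter applied
    true_targets=10,         # actual threats hiding among them
    hit_rate=0.99,           # chance a real threat gets flagged
    false_alarm_rate=0.01,   # chance an innocent person gets flagged
)
print(f"{flagged:,.0f} people flagged; {share:.1%} of them are false leads")
```

Under these assumed numbers, even a screen that is right 99% of the time buries a handful of real hits under roughly ten thousand false ones. Shrinking `population` to the few people for whom there is already some good reason to listen, which is what a warrant does, is the only input that changes the picture much.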

Comments


It means the Centipede is going to drop very fast unless you clear out the mushrooms at the bottom.

Posted by SteveR | July 17, 2007 2:55 PM

I'm pro "gut-feeling" as long as the gut feeling doesn't result in or come from systemic, institutionalized racism... Often intuitions from human data watchers come from some small detail in the data that their consciousness can't quite process to completion but their subconscious recognizes. Gut feelings should result in extra careful analysis of the data... I used to write tickets for employee cars parked in several mall garages and paying attention to my 'gut feelings' (and then looking up the plates or noting them for further careful observation) led to a quite thorough record on my part. The human brain sees quite a lot and we can't process it all right away in the forefront of our minds... I like to let things simmer on the back burner for a while and pay attention to my gut.

Posted by Katelyn | July 17, 2007 2:56 PM

It's so true: data without context is meaningless, as is so much data that you can't glean anything meaningful from it. There was a report a while ago that said they had not even gone through half of the data collected in the wiretapping program, let alone processed it in a meaningful way.

I love your column, btw. It's always insightful and well written.

Posted by Original Monique | July 17, 2007 2:59 PM

It's the WOPR.

Posted by ecce homo | July 17, 2007 3:00 PM

"Warrant-less wiretapping is like running a microarray having made no effort to identify the spots."

I'm really confused by this analogy between microarrays and wiretapping. In a wiretapping "experiment", the spots are labeled even if the wiretappers shouldn't have permission to look at them.

What it seems that you're getting at is the lack of rigor in interpreting the data to rule out false positives. This is important, but it's also worth considering the "cost" of following up on a false positive vs. the implications of missing a true finding. After all, sometimes microarray experiments turn up really interesting, but entirely unexpected findings that, after other confirmation studies, can teach us something new.

Posted by josh | July 17, 2007 3:06 PM

It's worth noting that

A) Their "gut" sucks and they've been wrong 100% of the time,


B) Warrantless domestic spying isn't anti-terrorism - IT IS TERRORISM.

Posted by Original Andrew | July 17, 2007 3:29 PM

@ecce homo: An interesting game. The only winning move is not to play.

Posted by supergp | July 17, 2007 3:55 PM

It means I'm color blind, you insensitive clod!

Why do you hate disabled people so?



Posted by Will in Seattle | July 17, 2007 4:02 PM

Good Lord. This is just such a stupid method. If I were a terrorist, I would defeat this in a heartbeat. I'd simply write an e-mail virus that would flood the system with suspicious e-mails. I'd put together probably two hundred template paragraphs, and arrange for the virus to combine them at random. Also, have it generate names using the most common Arabic, Urdu, and Persian given names and surnames, and also nisba adjectives for Arab nations (al-Masri, "the Egyptian"; as-Sudani, "the Sudanese"; at-Turki, "the Turk"; al-Irani, "the Iranian"). If it were a well-written virus, it would cause millions of hits, with thousands of fake, but realistic names. Arrange for it to be distributed from Pakistan, where apparently we won't (not can't) investigate. It's not that they wouldn't be able to figure this out, but in the meantime, for those three or four days (maybe more with Chertoff in charge), the needle in this mountain-sized haystack goes unnoticed. Considering that the communique that would have revealed the 9/11 plot got translated just a few days late, it would work.

Much better to do as the Saudis do. Find the individual jihadis, reintegrate them into society through their families, and then get information on the hardcore terrorists from them.

Posted by Gitai | July 17, 2007 10:16 PM


How about a nice game of chess?

Posted by ecce homo | July 18, 2007 12:21 AM
