You can’t shut down government text-mining

Photo via Thinkstock.

Northeastern University professors David Smith and Ryan Cordell are interested in hidden social networks. In particular, they want to understand how ties between editors, writers, politicians, business magnates, etc., made the 19th-century news go ’round.

To do so, the duo of digital humanities experts decided to sift through thousands of bits of text from historical newspapers to find snippets that repeat themselves in various places. Since this would be too much work for human eyeballs, Smith, a computer scientist, developed a program to do it for them. The program picks up on things like two editors, one in Missouri and another in Vermont, printing the same content, verbatim, a few days apart. This sort of data, Smith said, can tell you with pretty high certainty that the two editors had at least some political views in common, if not a personal connection.
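
To make the idea concrete, here is a minimal sketch of one common way to spot reprinted passages: break each text into overlapping five-word sequences and count how many of them two texts share. This is not Smith's program, whose details the article does not describe. The function names, the five-word window, and the sample texts are all assumptions made up for illustration.

```python
import re
from itertools import combinations


def shingles(text, n=5):
    """Return the set of overlapping word n-grams ("shingles") in a text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def reuse_pairs(docs, n=5, min_shared=1):
    """Yield (id_a, id_b, shared_count) for each pair of documents that
    share at least min_shared word n-grams."""
    grams = {doc_id: shingles(text, n) for doc_id, text in docs.items()}
    for a, b in combinations(sorted(grams), 2):
        shared = len(grams[a] & grams[b])
        if shared >= min_shared:
            yield a, b, shared


# Hypothetical sample texts, invented for illustration only.
docs = {
    "missouri": "The editor writes that the river rose ten feet in a "
                "single night and carried off the mill.",
    "vermont": "We hear that the river rose ten feet in a single night "
               "and carried off the mill, says a western paper.",
    "ohio": "Corn prices held steady at market this week.",
}

matches = list(reuse_pairs(docs))
for a, b, shared in matches:
    print(f"{a} and {b} share {shared} five-word sequences")
```

Comparing every pair of texts, as this sketch does, only works for a small collection. A program handling thousands of newspaper pages would presumably need an index of shared sequences and some tolerance for typesetting and scanning errors.
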
In a collaboration with researchers at the University of Washington, Smith has now applied that same code to policy bills from the 111th Congress (the one that spanned the first two years of Obama’s presidency). By doing so, he found that 11 percent of the Democratic Congress’s work had Republican origins. That doesn’t sound like a lot, until you think about how the two sides of the aisle recently gave each other the silent treatment long enough to shut down the government for 17 days.
As we all know by now, one of the big things that set off these shenanigans was a disagreement about the Patient Protection and Affordable Care Act. “This law is a train wreck,” said House Speaker John Boehner about a week before the shutdown. “It’s time to protect American families from this unworkable law.” One of his least favorite parts of the bill? The individual mandate. But as many have said before, that particular (and important) part of “Obamacare” happens to have its roots in Republican policy ideas.
Smith’s text-mining experiment gives more striking results when “markup bills,” which are like second, third, and fourth editions in the book world, are excluded. In that case, the amount of Republican influence jumps to 28 percent. The work, Smith said, “is decomposing the monolithic idea of a legislator, providing a more finely articulated view of how policy and politics work.” Just like with the newspaper data, this work uncovers a hidden social network of politicians working together (gasp) to get their ideas into practice. If you see a bill spearheaded by a Republican get scrapped, only to have its content repurposed almost exactly in a new bill by a Democrat, you might suspect the two know each other. Perhaps they’re roommates when they visit the Hill from their home states, or maybe they were in the same entering class of congressmen, doing trust falls together their first days in office.
When it comes to social science, you like to see causality, said Smith. You like to be able to pull a string here and see a lever move over there. When you look at your friend network on Facebook, that may very well be the case. But with politicians, it can be harder to see.
What I’m having a hard time understanding is why exactly politicians don’t want to look like they’re cooperating. I know I’d be much more satisfied with and confident in my government if I knew they’d managed to pass kindergarten, that important time in our education when we’re taught to cooperate.