…and what about American Idol?

Last week I told you about work from Bruno Gonçalves, at the Laboratory for the Modeling of Biological and Socio-technical Systems, on the Twitter behaviors of politically minded social media users. Today I got an email from Alessandro Vespignani, who directs the laboratory, about another electoral study that his group is working on. This time, however, the voting population is deciding the fate not of the president of the United States but of the next American Idol.

This may seem like a trivial pursuit; indeed, Vespignani himself said, “we are always working on very serious (sometime gloomy) things like pandemic, deadly viruses, social turmoil. etc. For once we decided to do something more frivolous.” But if the big data analyses that we at Northeastern and elsewhere are so excited about don’t work in the simple cases, they will not work for more complex phenomena like politics, he said.

Also, the Idol voting story is much simpler than that of the US presidential race. The team deliberately opted for the simplest approach to the problem. “Refinements could be applied to Idol as well as politics,” said Fabio Ciulla, first author of the study, citing examples such as sentiment analysis, demographic corrections, and individual user signals. “In addition, for the political arena there are much more data (single political issues, historical data, incumbent candidates) that can be used to calibrate statistical models.”

The team’s fundamental assumption, according to the paper, “is that the number of votes each contestant receives is proportional to the number of tweets that mention her.” They tested this assumption by looking at the Twitter activity during each voting period for the nine episodes leading up to this week’s finale (which happens Tuesday and Wednesday). Indeed, they found that the volume of Twitter activity (ignoring the content of the tweets, i.e., positive or negative sentiment) is directly correlated with the voting outcomes.
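To make that assumption concrete, here is a minimal sketch of how tweet volume could be turned into estimated vote shares. It is not the team's code: the function name `estimate_vote_shares`, the simple substring matching, and the sample tweets are all assumptions made for illustration.

```python
from collections import Counter

# Contestant names taken from the article; the matching rule is an assumption.
CONTESTANTS = ["Jessica Sanchez", "Joshua Ledet"]


def estimate_vote_shares(tweets, contestants=CONTESTANTS):
    """Estimate each contestant's vote share as her share of mentioning tweets."""
    counts = Counter({name: 0 for name in contestants})
    for text in tweets:
        lowered = text.lower()
        for name in contestants:
            # A tweet counts once per contestant it mentions; sentiment is ignored.
            if name.lower() in lowered:
                counts[name] += 1
    total = sum(counts.values())
    if total == 0:
        # No mentions at all: no basis for an estimate.
        return {name: 0.0 for name in contestants}
    return {name: counts[name] / total for name in contestants}


# Made-up sample tweets, for illustration only.
sample_tweets = [
    "Voting for Jessica Sanchez tonight!",
    "Joshua Ledet was amazing",
    "jessica sanchez all the way",
    "Not a fan of Jessica Sanchez, honestly",
]
shares = estimate_vote_shares(sample_tweets)
```

Note that the fourth sample tweet is negative yet still counts toward Sanchez, which mirrors the article's point that only volume is used, not sentiment.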

Twitter also allows users to identify their location on the globe. Vespignani’s team can use that information to look at the preferences of various areas around the US. California, for example, seems to love Jessica Sanchez, who is from Chula Vista, CA, and Louisiana was partial to Joshua Ledet, who hails from the Lake Charles area of that state.
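A regional breakdown like this could be sketched as follows. This is an illustration under stated assumptions, not the team's method: it assumes each tweet has already been tagged with a region label (how that label is derived from Twitter's location data is out of scope), and the function name `leader_by_region` and the sample data are made up.

```python
from collections import Counter, defaultdict


def leader_by_region(tagged_tweets, contestants):
    """Return the most-mentioned contestant in each region.

    `tagged_tweets` is an iterable of (region, text) pairs; the region
    labels are assumed to have been derived from location data elsewhere.
    """
    counts = defaultdict(Counter)
    for region, text in tagged_tweets:
        lowered = text.lower()
        for name in contestants:
            if name.lower() in lowered:
                counts[region][name] += 1
    # Regions whose tweets mention no contestant are simply absent.
    return {region: c.most_common(1)[0][0] for region, c in counts.items()}


# Made-up sample data, for illustration only.
regional_tweets = [
    ("CA", "Go Jessica Sanchez"),
    ("CA", "Jessica Sanchez!!"),
    ("CA", "Joshua Ledet sang well"),
    ("LA", "Joshua Ledet forever"),
    ("LA", "Joshua Ledet again"),
]
leaders = leader_by_region(regional_tweets, ["Jessica Sanchez", "Joshua Ledet"])
```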

The geolocalized data also revealed an interesting inconsistency: people in the Philippines tweet heavily about Sanchez during the voting period, even though only US residents are allowed to vote. “Numerous websites explicitly address the issue of ‘voting tunnels’,” says Delia Mocanu, a PhD student at Northeastern studying the geolocalization of Twitter signals. For example, you’d have no problem finding the article “Filipinos in PHL can vote for Jessica Sanchez online using Skype Magic Jack and Vonage” if you Googled “Jessica Sanchez vote from Philippines” (which I just did).

“The anomaly concerning the Philippines (that in principle could not vote) is jumping to the eye,” said coauthor Nicola Perra, who works in social systems characterization. “In politics anomalies would be more subtle to detect but one can hope to see anomalous patterns (such as manipulation of information, fake accounts etc.), using Twitter data.”
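One simple way to surface an anomaly like the Philippines signal is to measure how much of a contestant's tweet volume comes from places that cannot vote. The sketch below is an assumption-laden illustration, not the paper's procedure: the function name `ineligible_share`, the country codes, and the sample tweets are hypothetical.

```python
def ineligible_share(tagged_tweets, contestant, eligible_countries=frozenset({"US"})):
    """Fraction of tweets mentioning `contestant` sent from outside eligible countries.

    `tagged_tweets` is an iterable of (country_code, text) pairs; the
    country codes are assumed to come from location data attached to tweets.
    """
    mentions = [
        country
        for country, text in tagged_tweets
        if contestant.lower() in text.lower()
    ]
    if not mentions:
        return 0.0
    outside = sum(1 for country in mentions if country not in eligible_countries)
    return outside / len(mentions)


# Made-up sample data, for illustration only.
country_tweets = [
    ("US", "Jessica Sanchez was great"),
    ("PH", "Jessica Sanchez"),
    ("PH", "Vote Jessica Sanchez"),
    ("US", "Joshua Ledet"),
    ("US", "Jessica Sanchez wow"),
]
sanchez_outside = ineligible_share(country_tweets, "Jessica Sanchez")
ledet_outside = ineligible_share(country_tweets, "Joshua Ledet")
```

A high value for one contestant and a low value for the others would be the kind of pattern that "jumps to the eye"; what threshold counts as anomalous is a judgment call this sketch does not settle.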

The work also calls attention to the fact that publicly available Twitter data can have undesirable consequences in the realms of gambling and social influencing. “For example, the audience could be induced to alter their behavior in function of the situation they observe,” says Andrea Baronchelli, another coauthor of the study. “And the job of betting agencies could be dramatically simplified.”

In an interview a few months ago, one of my colleagues asked Vespignani whether he found it a bit unsettling to put results as powerful as these into the public eye. “The data is out there,” he responded. “Others are doing this at the same time without telling the public but we are in academia which means I have to tell the world.”

The paper does not go so far as to make an overt prediction about the American Idol competition. The team will save that for Wednesday morning, after Tuesday night’s Twitter conversation. Last week it looked like Sanchez was leading, but what a difference a week…and a tweet…can make!

Photo via The Express Tri­bune