Predicting elections with Twitter and social media. Is it possible?

Social media and politics

Today, a business that wants to be known and loved can hardly do without a social media campaign. And ever since President Obama brought social media into the spotlight in 2008, political parties have embraced social networking as well. Nowadays almost every Dutch political party has its own Facebook fan page and Twitter account and uses them to connect with potential voters and fans. They have realized that this is where people hang out and can therefore be reached. Some political parties use this new medium better than others: while some are still stuck in sending mode, others are already having dialogues online. But there is more to social media and politics than attracting new voters.

The crowd versus traditional polls

People leave a rich trail of information on the internet, which can be analyzed. Several researchers have used this trail to predict real-world outcomes, from movie successes to movements in the stock market. It’s an appealing thought that we could predict the future using online social data and distill some sense from the chatter of the crowd. That appeal has earned these studies a lot of buzz, but a lot of criticism as well.

Why it could be possible to predict the elections using Twitter and online media

  • The more publications about a political party we find on news sites, blogs and social media, the more people will know its name. Total buzz equals popularity, which in turn may be correlated with the number of votes.
  • People express themselves spontaneously on the internet, which is more likely to reflect their true opinion than the answers they give in traditional polls.
  • The crowd counts many more people than traditional polls, in which often only 1,000 voters are asked to express their views.
  • One can measure sentiment in online publications and tweets and derive important clues from it about what people intend to do, such as buying or selling shares (correlation with mood state in the paper by Bollen et al.), paying to see a movie (see the paper ‘Predicting the future with social media’ by Asur & Huberman) or voting in elections.

Why it would not work to predict the elections using Twitter and online media

  • Not every person who tweets is eligible to vote, and demographics are largely unknown. Also, not all voters blog and tweet: there is a silent majority that is just lurking. Furthermore, those who do tweet may not be representative of the whole population. Political tweeting may be confined to politicians, journalists and idealists.
  • Since everybody can read what one publishes and tweets, people could be more authentic and straight from the heart or… they could be hiding behind a nickname and playing a different game. It’s true that not all tweets may be trustworthy, though if the percentage of untrustworthy tweets is the same for all parties and you inspect a large enough number of tweets, prediction may still be possible.
  • People may talk a lot about one party but still vote for another. A political party expressing controversial standpoints may gather a lot of attention, a lot of buzz, but not a lot of votes. Sentiment analysis may adjust for this.
  • Humor and sarcasm in texts are difficult to impossible for a computer to interpret correctly. When using sentiment analysis this is a major hurdle.

Conclusion

We conclude that forecasting based on Twitter (or other online media) is still in its infancy. It’s a fascinating area of research, but it’s not yet mature. There is still no reliable way to detect sarcasm automatically. However, the law of large numbers suggests that random errors tend to average out as the sample grows, so a large enough set of tweets may still carry a usable signal, provided the errors are not systematically biased toward one party. Finding the correct algorithm is a work in progress.

An interesting paper in this respect is ‘I wanted to predict elections with Twitter and all I got was this lousy paper’ by Daniel Gayo-Avello. He conducted a balanced survey of election prediction using Twitter data and summarized his conclusion in one sentence: No, you cannot predict elections with Twitter. Much of the research is done in retrospect, so it’s not prediction at all, and papers reporting negative results are hard to find. He lists the flaws he found in current research on electoral predictions and reviews papers in which Twitter has been used to predict the stock market, movie box-office performance and pandemics.

We agree that there’s a lot to be learned before anyone is able to consistently predict election results. Twitter’s predictive power is far from proven. However, while Gayo-Avello has shown that no one has so far convincingly demonstrated that you can predict elections using social media, he has not proved that it’s impossible!

We hypothesize that some sort of predictive power lies within Twitter and online media. As people express themselves more and more in the digital world, and computers are now capable of analyzing huge amounts of data, it’s time to look into this possibility with greater effort. Research using social media may provide new insights into how we can put big data to use in company profiling, market monitoring and data mining. Wouldn’t it be great if we were able not only to see where a company and its competitors are today, but also to extrapolate what we know into the future? And so we are undertaking our own attempt at predicting elections with Twitter, to learn from it.

Dutch electoral predictions 2012

We have set up a new experiment to monitor the buzz around the Dutch elections that will take place on September 12th, 2012. For this popularity assessment we use BuzzTalk. BuzzTalk is able to find and monitor online sources (news sites, blogs and tweets) and automatically tag and analyze these publications based on years of linguistics research.

There are many unknown variables. Over which time period should you measure the tweets? Should you weigh current tweets against older tweets, and by what factor? Which parties do you need to include and what corrections, if any, should be applied? How should positive and negative sentiment be counted against the total number of tweets? Should you look equally at other sources like blogs and forums? Should we count people or sources rather than news items and tweets? At the moment it’s unclear which factors will be important, so we consider this a first attempt.

We use the total buzz (how much attention a party gets) and the sentiment (as a ratio of positively versus negatively labeled publications) to determine the popularity of a political party. Our hypothesis is that you first need to be known to the people in order to attract votes. You need to be talked about, and therefore we use total buzz as a major factor in our algorithm.

The adage “there is no such thing as bad publicity” may well be a big PR myth as in fact bad publicity can be pretty damaging. Because of this we choose to correct for negative publications as this may result in voters knowing about the party but choosing not to vote for it.

A number of people decide which party to vote for just days before the election. Because of this behavior it would seem logical to give more emphasis to recent buzz as compared to historical buzz. And so we choose to weigh current and recent publicity more heavily.
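
To make the three ingredients above concrete (total buzz, a correction for negative publications, and extra weight for recent publicity), here is a minimal Python sketch of how they could be combined. It is not BuzzTalk’s actual algorithm: the Publication record, the function names, the 7-day half-life and the choice to multiply weighted buzz by the share of positive publications are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    """Hypothetical record for one news item, blog post or tweet."""
    party: str       # party the publication is about
    days_ago: float  # age of the publication in days
    sentiment: str   # "positive", "negative" or "neutral"


def recency_weight(days_ago: float, half_life_days: float = 7.0) -> float:
    """Exponential decay: a publication loses half its weight every half-life.

    The 7-day default is an assumption, not a value from the experiment.
    """
    return 0.5 ** (days_ago / half_life_days)


def popularity_scores(publications, half_life_days: float = 7.0) -> dict:
    """Return each party's share of recency-weighted, sentiment-corrected buzz."""
    total = {}     # weighted buzz per party, all publications
    positive = {}  # weighted buzz per party, positive publications only
    negative = {}  # weighted buzz per party, negative publications only

    for pub in publications:
        weight = recency_weight(pub.days_ago, half_life_days)
        total[pub.party] = total.get(pub.party, 0.0) + weight
        if pub.sentiment == "positive":
            positive[pub.party] = positive.get(pub.party, 0.0) + weight
        elif pub.sentiment == "negative":
            negative[pub.party] = negative.get(pub.party, 0.0) + weight

    scores = {}
    for party, buzz in total.items():
        pos = positive.get(party, 0.0)
        neg = negative.get(party, 0.0)
        # Assumption: with no labeled sentiment at all, treat the party as neutral.
        sentiment_factor = pos / (pos + neg) if (pos + neg) > 0 else 0.5
        scores[party] = buzz * sentiment_factor

    grand_total = sum(scores.values())
    if grand_total == 0:
        return {party: 0.0 for party in scores}
    # Normalize so the scores can be read as predicted shares that sum to 1.
    return {party: score / grand_total for party, score in scores.items()}
```

Calling popularity_scores with a list of Publication records returns a dictionary mapping each party to a share between 0 and 1. A shorter half-life puts more emphasis on the last days before the election, and a different sentiment correction can be swapped in without touching the rest.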

Well, we are just as curious as you are to see whether our popularity monitor will have a positive correlation with the Dutch election results on September 12th, 2012.
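
Once the results are in, one simple way to check for such a correlation is Pearson’s correlation coefficient between the predicted shares and the actual vote shares per party. The sketch below is only an illustration of that check; the function name is our own, and we do not claim this is the measure the experiment will ultimately use.

```python
import math


def pearson_correlation(predicted, actual):
    """Pearson correlation between two equally long lists of numbers.

    Both lists must list the parties in the same order.
    """
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n

    # Sums of products of deviations from the mean.
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)

    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_p * var_a)
```

A value close to 1 would mean that parties with more corrected buzz also received more votes, a value near 0 would mean no linear relationship, and a negative value would mean the monitor pointed the wrong way.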

You can follow our experiment online at www.predict.nu

 
