Crowdsourcing Political Incidents Online

By Jared Ford | September 17, 2013

[Photo: Political posters pasted on a public wall, Kenya]

Kenya's iHub recently released its research on crowdsourced information during the highly contested 2013 Kenyan presidential election. The study sought to clarify the value of information about political incidents collected from citizens via online media, and to answer three questions: 1) whether "passive crowdsourcing" (defined here as monitoring social media such as Twitter) is viable in the Kenyan context; 2) what unique information Twitter posts provided about the election; and 3) under what conditions crowdsourced information is a viable news source. As part of the report, iHub provided a useful set of recommendations and a decision-making framework for practitioners considering similar methodologies.

The report provides great detail about the research methodology and data sources (Twitter, online traditional media, targeted crowdsourcing platforms like Uchaguzi, and fieldwork). Particularly impressive are the mechanisms described for capturing, storing, and classifying tweets, and the detailed approaches to filtering for newsworthy tweets. The glossary is helpful in clarifying terminology such as "passive", "active", and "targeted" crowdsourcing of information from citizens. (NDI prefers the term "citizen reporting" over crowdsourcing for citizen-generated incident data.)
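To make the filtering step concrete: a minimal sketch of keyword-based newsworthiness filtering might look like the following. The keyword list and tweet structure here are purely illustrative assumptions, not iHub's actual filters.

```python
# Illustrative sketch of keyword-based filtering for "newsworthy" tweets.
# The keyword set and tweet format are hypothetical, not iHub's real pipeline.
NEWSWORTHY_KEYWORDS = {"election", "ballot", "polling", "violence", "results"}

def is_newsworthy(tweet_text: str) -> bool:
    """Return True if the tweet contains any election-related keyword."""
    # Normalize: strip common punctuation and lowercase each word.
    words = {w.strip(".,!?#@").lower() for w in tweet_text.split()}
    return bool(words & NEWSWORTHY_KEYWORDS)

tweets = [
    {"text": "Long queues at the polling station in Nairobi"},
    {"text": "Great weather today!"},
]

# Keep only tweets that pass the keyword filter.
newsworthy = [t for t in tweets if is_newsworthy(t["text"])]
```

In practice, such a crude filter would only be a first pass before the human and machine classification steps the report describes.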

The report makes a strong case for monitoring Twitter as a generator of topical information in Kenya over more targeted crowdsourcing projects such as Uchaguzi, and it concludes that this "passive crowdsourcing" was "viable" there. In this case, "viable" does not mean that citizens provided accurate information; the researchers acknowledge that incidents reported on Twitter need to be followed up with a separate verification process. Rather, "viable" means that Twitter yielded more topical information than other sources.

Not surprisingly, "active" crowdsourcing, and information collected via structured local networks or even journalists, will usually be better methods for gathering information from citizens who do not use social media platforms and in areas without internet access.

The report notes that the main tweeters of news are - surprise - traditional news outlets. This weakens the distinction the report attempts to make between social media and traditional media: when journalists are the most prolific news tweeters, Twitter and online traditional media effectively merge into one dynamically integrated system. The distinction is fluid, and "news-making" becomes a process of highlighting an incident on Twitter, verifying it, and then amplifying it via traditional media outlets.

The report does find that Twitter posters "break news" during elections: when incidents were reported both in the traditional media and on Twitter, people on Twitter either led in reporting the story or reported it at the same time as the media. However, a reported incident doesn't become "news" until it has been verified by a credible source. Political incidents will therefore often surface first on social media, yet many of these reports will not turn out to be accurate or credible, and so never become "news".

The study does a great job explaining the infeasibility of mining social media data manually, and concludes strongly that systematic data-mining must be aided by automated machine-learning processes. NDI is using this method in a partnership with Crimson Hexagon, applying supervised machine-learning algorithms to monitor social media discussion of political topics. These machine-learning methods are necessary to make sense of the flood of reports, and if verified information is required, other reporting methodologies are needed as well. The researchers also usefully clarify that Twitter enhances localized information collection useful to particular interest groups. Finally, the research team provides a helpful framework for choosing an information-gathering methodology. We've discussed similar considerations for crowdsourcing approaches previously.
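As an illustration of what supervised classification of tweets involves (a sketch only; this is neither iHub's pipeline nor Crimson Hexagon's platform, and the training examples and labels are hypothetical), a minimal Naive Bayes text classifier can be written with Python's standard library:

```python
# Minimal supervised Naive Bayes text classifier, sketched with the stdlib.
# Training data and labels are hypothetical examples, not real election tweets.
from collections import Counter, defaultdict
import math

def tokenize(text):
    """Lowercase and strip common punctuation from each word."""
    return [w.strip(".,!?#@").lower() for w in text.split() if w.strip(".,!?#@")]

class NaiveBayes:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> document count

    def train(self, samples):
        for text, label in samples:
            self.label_counts[label] += 1
            self.word_counts[label].update(tokenize(text))

    def predict(self, text):
        tokens = tokenize(text)
        vocab = {w for counts in self.word_counts.values() for w in counts}
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.label_counts:
            # Log prior plus Laplace-smoothed log likelihood of each token.
            score = math.log(self.label_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

training = [
    ("ballot counting delayed at polling station", "political"),
    ("election results announced tonight", "political"),
    ("having coffee with friends", "other"),
    ("beautiful sunset at the beach", "other"),
]
clf = NaiveBayes()
clf.train(training)
```

A production system would need far more training data, human-labeled examples, and the separate verification step the report insists on; the sketch only shows the basic shape of supervised tweet classification.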