Garbage In, Garbage Out: Tech Tools in Fighting Disinformation of Voters

By Chris Doten | June 26, 2018

Photo: a variety of tools (Credit: Dave Taylor)

In technology, you often hear geeks referencing the classic “garbage in, garbage out” problem: when the inputs to a system are bad, the outputs will necessarily be bad as well, however beautifully crafted the program itself may be. Our democratic systems depend on the input of citizens, but when disinformation is also an input, the outputs of our processes can be deeply flawed. Disinformation, and the systemic distrust it fuels, has been a dangerous ingredient in the global surge of nativism, intolerance, and polarization undermining democracy and human rights around the world.

Understanding and stopping disinformation is a tremendous challenge; any single solution will be incomplete, so many will be required. In 2018, the fastest, most virulent and dangerous disinformation is spreading on digital platforms, so technical understanding is critical to wrapping our heads around the problem.

Here are a few of the different problems NDI and our partners are trying to wrestle with, and some of our initial thoughts on basic categories of tools that will be needed to detect and counter disinformation online.

Network Analysis - mapping communities, clones and cults

With social media, who is connecting to whom can be as important as what they’re saying. What groups of people are sharing the same or similar information? Who is liking or retweeting each other’s content? Are there groups of accounts that behave in a synchronized manner to harass individuals, swamp conversations, or manipulate algorithms? Understanding how accounts (which may or may not be people) coordinate is very important for understanding the spread of disinformation.
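
To make this a bit more concrete, here is a rough sketch of what mapping amplification networks could look like, using the Python networkx library. The account names and retweet pairs are invented for illustration; a real analysis would start from data collected via a platform’s API.

```python
# A minimal sketch of mapping who-amplifies-whom, assuming a list of
# (retweeter, original_author) pairs has already been collected.
import networkx as nx
from networkx.algorithms import community

# Invented sample of retweet relationships.
retweets = [
    ("acct_a", "acct_x"), ("acct_b", "acct_x"), ("acct_c", "acct_x"),
    ("acct_a", "acct_y"), ("acct_b", "acct_y"),
    ("acct_d", "acct_z"), ("acct_e", "acct_z"),
]

G = nx.Graph()
G.add_edges_from(retweets)

# Clusters of accounts that amplify the same sources.
clusters = community.greedy_modularity_communities(G)
for i, cluster in enumerate(clusters):
    print(f"community {i}: {sorted(cluster)}")

# Accounts whose neighborhoods overlap heavily may be acting in concert.
a, b = "acct_a", "acct_b"
shared = set(G[a]) & set(G[b])
print(f"{a} and {b} amplify {len(shared)} of the same accounts")
```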

Bad Bot Antidotes - bot detection, manufactured consensus, and the mob mentality

Speaking of “people or not,” how do we know if an account is “real”? While some online accounts are obviously not human – most of us can’t tweet 100 times in a minute, for example – it can be hard to tell an automated account from a real person. Platforms are now rapidly deactivating the easy-to-spot fakes, which leads to an arms race to make bots ever more lifelike and therefore harder to detect.
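
As one simple illustration of the “100 times in a minute” idea, here is a sketch of a crude frequency heuristic. The timestamps and the cutoff are arbitrary assumptions for the example, not a real bot detector; serious detection combines many signals.

```python
# A minimal sketch of one crude bot heuristic: flagging accounts that post
# faster than a human plausibly could. Data and threshold are illustrative.
from datetime import datetime, timedelta

def max_posts_per_minute(timestamps):
    """Return the largest number of posts in any rolling 60-second window."""
    stamps = sorted(timestamps)
    best = 0
    start = 0
    for end in range(len(stamps)):
        while stamps[end] - stamps[start] > timedelta(seconds=60):
            start += 1
        best = max(best, end - start + 1)
    return best

# Invented account activity: one post per second for two minutes.
posts = [datetime(2018, 6, 1, 12, 0, 0) + timedelta(seconds=i) for i in range(120)]
if max_posts_per_minute(posts) > 30:   # arbitrary cutoff for illustration
    print("this account behaves more like a script than a person")
```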

Now, not all bots are evil. They can be used for positive democratic impacts as well, particularly when people know they’re bots, but in the disinformation context, we’re usually thinking of their malign influence.

One particularly dangerous aspect of social media is the challenge of “manufactured consensus”: if we are surrounded by other people saying and thinking the same thing, we, as herd animals, assume it must be true. This is even more pernicious when a whole crowd of bots is designed to shout you into silence or to make you believe that “everyone thinks this way.” Recognizing that “the crowd” is actually nothing more than a phantom can help citizens understand the reality of the conversations in which they swim. For disinformation researchers, the ability to determine whether bots are pushing particular topics and supporting or attacking particular leaders is critical to informing their analysis.
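
A sketch of that last step might look like the following: once accounts have been flagged as suspected bots (by whatever upstream method), compare which hashtags they push against what everyone else is talking about. The accounts, hashtags, and labels here are invented for illustration.

```python
# A minimal sketch of checking whether suspected bot accounts are pushing
# particular topics, assuming tweets have already been labeled upstream.
from collections import Counter

# Invented data: (account, hashtag, is_suspected_bot)
tweets = [
    ("a1", "#candidateX", True), ("a2", "#candidateX", True),
    ("a3", "#candidateX", True), ("a4", "#localnews", False),
    ("a5", "#candidateX", False), ("a6", "#weather", False),
]

bot_topics = Counter(tag for _, tag, is_bot in tweets if is_bot)
human_topics = Counter(tag for _, tag, is_bot in tweets if not is_bot)

for tag, bot_count in bot_topics.most_common():
    print(f"{tag}: {bot_count} bot mentions vs {human_topics[tag]} human mentions")
```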

Content Analysis - the sites that cried wolf

Disinformation is spread via content – actual tweets, Facebook posts, or instant messages. However, it can be difficult, verging on impossible, to read a tweet on its own and know whether it’s true – as much for a human as for a machine. Over time, though, sources of content develop more or less credibility. Does crazy Uncle Joe forward crackpot conspiracy theory emails all the time? After a while, you’re not going to pay much attention to Uncle Joe. Websites, too, can be considered trustworthy or suspicious based on past behavior. In a given country’s media market, impartial, well-informed citizens will have a good sense of which outlets are more prone to disinformation than the truth, or more aligned with the Kremlin than with Europe. By creating a relative score for the *source* of content, one can see which accounts are more likely to be disseminating disinformation.
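
As a rough sketch of source-based scoring: rate accounts by how often they link to outlets already judged unreliable. The domain list and shared links below are invented placeholders; in practice the ratings would come from a locally curated, country-specific media assessment.

```python
# A minimal sketch of scoring accounts by the credibility of what they share.
from urllib.parse import urlparse

# Placeholder list standing in for a real, locally curated media assessment.
LOW_CREDIBILITY_DOMAINS = {"totally-real-news.example", "kremlin-leaning.example"}

shares = {
    "acct_a": ["http://totally-real-news.example/story1",
               "http://kremlin-leaning.example/story2",
               "http://reputable-paper.example/report"],
    "acct_b": ["http://reputable-paper.example/analysis"],
}

for account, urls in shares.items():
    flagged = sum(urlparse(u).hostname in LOW_CREDIBILITY_DOMAINS for u in urls)
    score = flagged / len(urls)
    print(f"{account}: {score:.0%} of shared links come from low-credibility sources")
```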

Better Algorithms - algorithmic manipulation, or how to make viral sharing great again

If a fake news tweet falls in a forest and no one is there to read it, does it make an impact? The answer is no; it’s only when information goes viral that it shapes mass opinion. Of the infinite river of content that flows by us, we see only a small percentage. Whether it’s “top news” or “trending topics,” algorithms surface the content that these automated systems believe we most want to see. However, those algorithms can be manipulated, and with that, disinformation can break out of the internet’s dark fringes and pop up under the eyes of a wider audience, with real impacts. Most of these systems effectively crowdsource the view of what is important, but the problem with crowdsourcing is that it actually measures the *intensity* of interest in a group, not how *widespread* that interest is. So if you set a bunch of robots to vote something up, tweet at a hashtag, share something, like something, or anything else that may give it that viral boost, you can make an idea or opinion appear super popular and therefore important. That’s true not just of bots: a small, motivated group of individuals can juice an algorithm or swamp a poll. Send Justin Bieber to... Pyongyang, anyone?
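
The intensity-versus-breadth distinction is easy to show with a toy example. A trending score based on raw volume can be juiced by a couple of hyperactive accounts, while one based on unique accounts is harder to game. The hashtags and counts below are invented for illustration; real ranking algorithms are far more elaborate.

```python
# A minimal sketch contrasting raw-volume "trending" with unique-account counts.
from collections import Counter

# Invented data: each entry is (account, hashtag). Two bots post 50 times each;
# forty ordinary users post once each about something else.
posts = [("bot1", "#fringeidea")] * 50 + [("bot2", "#fringeidea")] * 50 \
      + [(f"user{i}", "#localissue") for i in range(40)]

raw_volume = Counter(tag for _, tag in posts)
unique_accounts = Counter()
for tag in {tag for _, tag in posts}:
    unique_accounts[tag] = len({acct for acct, t in posts if t == tag})

print("by raw volume:     ", raw_volume.most_common())       # #fringeidea "wins"
print("by unique accounts:", unique_accounts.most_common())  # #localissue wins
```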

More Insight - into the dark, encrypted places

When we talk about digital disinformation – particularly the kind we can measure – we’re usually talking about public platforms like Facebook and Twitter. The easiest to analyze, by far, is Twitter, but its openness can make it look particularly bad precisely because researchers can unmask all the horribleness that takes place there. It’s much harder to get a view into Facebook, except for conversations on public Pages (not Profiles). Doing more in-depth analysis often requires a research agreement with the platforms themselves (and trusting that they’re showing you the whole picture). But that, too, can be a problem: Cambridge Analytica worked with groups that had privileged research access to Facebook and then mined their data for partisan, for-profit purposes.

On Facebook, at least, the company itself can know what’s taking place on its platform. When it comes to end-to-end encrypted messaging platforms like WhatsApp or Signal, not even the systems administrators can see what’s taking place in messages on the system. Since messages on these platforms are shared in closed groups, unless you’re a part of the conversation you don’t even know it exists. Some NDI partners get around this by having political analysts such as election observers join as many relevant WhatsApp groups as possible. Sometimes, as with WhatsApp, the contents of conversations can be exported for after-the-fact analysis, but it’s not easy. Detecting the spread of disinformation in such under-the-radar spaces is very difficult, and as a result, it can be hard to see the signs of rampant disinformation or citizen manipulation.
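
For a sense of what after-the-fact analysis of an exported chat can involve, here is a sketch that tallies which domains get linked to, and by whom, in a WhatsApp export. The export’s line format varies by locale and app version, so the pattern below assumes one common “date, time - sender: message” layout and would need adjusting for real exports; the file name is hypothetical.

```python
# A minimal sketch of after-the-fact analysis on an exported WhatsApp chat.
import re
from collections import Counter

# Assumes lines like "26/6/2018, 14:32 - Some Name: message text".
LINE = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2} - (?P<sender>[^:]+): (?P<text>.*)$")
URL = re.compile(r"https?://(?P<domain>[^/\s]+)")

def domains_shared(path):
    """Count which domains are linked to, and by whom, in an exported chat."""
    counts = Counter()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            m = LINE.match(line.strip())
            if not m:
                continue  # skip continuation lines, "media omitted", etc.
            for u in URL.finditer(m.group("text")):
                counts[(m.group("sender"), u.group("domain"))] += 1
    return counts

# Hypothetical usage once a chat has been exported from the app:
# for (sender, domain), n in domains_shared("chat_export.txt").most_common(20):
#     print(sender, domain, n)
```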

Disinformation is all about people – the manipulators and agitators who create it, and the citizens who consume it. Disinformation also isn’t new. It is spread using the technologies of the era, from Gutenberg in 1455 to Google today. A long time ago, Mark Twain observed that a rumor was halfway around the world while the truth was still putting on its shoes – imagine what he would have made of our digital age.
