Thursday 12 April 2018

Rumours, mis- and disinformation in divided societies: Twitter and the Ardoyne parade dispute


Dr. Paul Reilly is Senior Lecturer in Social Media & Digital Society at the University of Sheffield. His research focuses on the use of social media by citizens to share acts of sousveillance, as well as how digital media can be utilised to promote better community relations in divided societies such as Northern Ireland. His work has been published in a number of journals including First Monday, Information, Communication & Society, New Media & Society, Policy and Internet and Urban Studies. He is currently completing his second monograph on Social media and contentious politics in Northern Ireland (contracted with Manchester University Press). @PaulJReilly
 
The rerouting of a contentious Orange Order parade away from the Ardoyne shops in North Belfast led to four consecutive nights of “animalistic” violence by loyalist protesters in July 2013. Despite the seeming inevitability of further violence, both the Orangemen and the Ardoyne residents were widely praised for the parade passing off without incident in July 2014. However, the 2015 parade would again be marred by violence, as loyalists faced off against PSNI officers who prevented the march from returning home via its traditional route. Having praised the behaviour of the Orange Order and residents a year earlier, Secretary of State Theresa Villiers described the attacks on police as disgraceful and accused the rioters of ‘damaging’ Northern Ireland.
 
This raised questions about what role, if any, social media played in helping escalate or de-escalate the tensions surrounding the contentious parade in 2014 and 2015. Previous research had suggested that the sharing of misinformation and disinformation (terms used to describe the inadvertent and deliberate sharing of false information, respectively) had contributed to intercommunal violence in contested urban interface areas. Therefore, this study examined how users responded to rumours and disinformation that were circulated on Twitter. These issues were explored through a critical thematic analysis of tweets posted during two periods: 11th - 14th July 2014, and 12th - 15th July 2015. A total of 7,388 tweets tagged #Ardoyne were identified, 1,842 in 2014 and 5,546 in 2015. These tweets peaked at 7pm on the day of the Twelfth demonstrations in both years (see Figure 1). One explanation for this spike in Twitter activity was that viewers of flagship television news programmes were turning to social media to follow what was happening during the return leg of the parade.



Figure 1: Tweets mentioning Ardoyne, 12th July 2014 and 13th July 2015
 
Tweeters move quickly to debunk rumours and disinformation
 
Rumours and disinformation appeared to have a very short life span in both corpora. This was particularly evident in the 2014 corpus, where there was some evidence to suggest that both loyalists and republicans were checking the veracity of claims made by each other on the microblogging site. Loyalists were accused of digitally altering pictures in order to portray the nationalist residents in a negative light. A picture of one of the protesters that had gathered outside the Ardoyne shops began to circulate on Twitter shortly after 7.30pm on 12th July. The placard they held aloft contained a mock-up road sign indicating that the Orange Order was not welcome in the area. One tweeter suggested that this was evidence of the intolerance of republicans and highlighted the ‘unlawful’ nature of the protests against the march. Twitter users immediately appeared sceptical of the authenticity of this image. Visual evidence suggesting that this was a photoshopped image was shared on Twitter within a few minutes. The original image showed the protester at a peaceful Christian protest at the shop fronts with his placard proclaiming “Love Thy Neighbour”, rather than the anti-Orange Order slogan that had featured in the doctored image. This was corroborated by an image of the same scene, taken and shared on Twitter by BBC NI journalist Kevin Sharkey a few hours earlier. Many tweeters accused the loyalist responsible for sharing the photoshopped image of spreading lies about what was happening on the ground at Ardoyne.
 
 
Tweet from BBC NI reporter Kevin Sharkey used to discredit photoshopped image, July 2014.

A similar theme emerged in relation to the Ardoyne parade and counter-demonstrations in July 2015. Images taken from many different vantage points were used to support claims and counter-claims about the incident in which an Orangeman drove his vehicle into a crowd of nationalist residents, seriously injuring a 16-year-old girl. West Belfast UPRG was one of several accounts to share reports that the republicans had attacked a car, with no reference to the attempted murder of the residents in attendance (see Figure 2). It should be noted that one of these tweeters had been responsible for the photoshopped image of the ‘Love Thy Neighbour’ placard the previous year. Elsewhere, there was much speculation about whether the victim had been in or was trapped under the vehicle. The coverage of the incident by the Belfast Telegraph was questioned for its headline referring to a ‘crash’ rather than an incident of attempted murder of the injured teenager. Within a few minutes of the incident, first reported by journalists such as the BBC’s Chris Buckler at 7.43pm, the PSNI confirmed that the girl had been injured and that the driver had been arrested at the scene. It would be a further two hours before local MLAs such as Gerry Kelly and the PSNI Chief Constable George Hamilton would confirm that her injuries were not life threatening.

Figure 2: Loyalist counter-narrative about teenage girl hit by car in Ardoyne, 13 July 2015.
 
There was one incident in particular that illustrated how citizens were using Twitter to prevent the spread of rumours and misinformation that might incite violence during the contentious North Belfast parade. One user tweeted a rumour that he had heard from an acquaintance, suggesting that a deal had been put in place that would see the return leg of the parade proceed via its traditional route past the Ardoyne shops. This tweet, posted a few days before the Twelfth, was shared only three times but clearly had the potential to raise tensions in Ardoyne. The same user would acknowledge that this information was untrue and apologise for spreading false information in a tweet posted on the morning of the 13th July 2015. Clearly it is difficult to tell how many tweeters may have seen the original tweet, or how they responded to this false information.
 
Twitter as a tool to empower citizens to counter mis- and disinformation?
 
Twitter did not appear to provide the shared space that is required for reconciling the differences between loyalists and the nationalist residents’ groups in Ardoyne. The same group of highly engaged users dominated both of these Twitter streams, with tweets from the news media the most retweeted and the most likely to reach users across the sectarian divide. There were no signs of the rational debate and deliberation that would appear to be needed in order to resolve the Ardoyne parade dispute. However, Twitter might not be the most appropriate platform to facilitate intergroup contact and discussions about polarising issues such as parades and related protests. The complexities of such issues are unlikely to be explored through the exchange of messages that are restricted to just 140 characters. A related concern would be the representativeness of the Twitter users who engage in such online debates. Recent research has suggested that social media perpetuates the “spiral of silence,” whereby people only speak in public about certain policy issues if they believe that their views are shared by others. Nevertheless, Twitter’s most significant contribution to peacebuilding in Northern Ireland might lie in its empowerment of citizens to correct rumours and disinformation. The relatively short lifespan of these rumours, not to mention the lack of mainstream media coverage they received, illustrated how effectively tweeters corrected disinformation during this period. It is reasonable to presume that this activity helped calm some tensions between these antagonistic groups, particularly in light of the negative impact upon community relations that rumours spread on social media had during the 2013 union flag protests. However, the violence seen in 2015 clearly demonstrates how the use of Twitter to correct rumours, mis- and disinformation may be insufficient to prevent the outbreak of violence in interface areas.

Tuesday 16 January 2018

Tapping Into Advertising Data for Studying International Migration

Ingmar Weber is the Research Director of the Social Computing Group at the Qatar Computing Research Institute (QCRI). As an undergraduate Ingmar studied mathematics at Cambridge University, before pursuing a PhD at the Max-Planck Institute for Computer Science. He subsequently held positions at the Ecole Polytechnique Fédérale de Lausanne and Yahoo Research Barcelona. In his interdisciplinary research, he applies computational methods to large amounts of online data from social media and other sources to study human behaviour at scale. Particular topics of interest include quantifying international migration using digital methods and other data for development projects. He has published over 100 peer-reviewed articles and his work is frequently featured in the popular press. Since 2016 he has been selected as an ACM Distinguished Speaker.

International migration is one of the key drivers of demographic change. However, official statistics on “stocks of migrants”, i.e. how many people with origin country X are residing in country Y, are often unreliable. Reasons for this include the free movement of EU nationals within the EU, as well as generally inadequate census and civil registration systems for many developing countries.

Work done by Emilio Zagheni, Krishna Gummadi and me tries to address some of the shortcomings of traditional methods of creating migration statistics by tapping into a new kind of data: audience estimates provided by Facebook.

Facebook and other internet giants collect rich data sets on their users in order to serve them more targeted and more relevant advertising. The data collected includes self-declared attributes such as age or gender; metadata such as the device or internet connection type used to access the service; third-party information such as credit card or voter registration data; and attributes such as topical interests inferred from behavior such as "liking" posts on Facebook or visiting websites with social plugins. See https://www.cision.com/us/2017/07/how-to-improve-social-media-targeting/ for a good list of available targeting options on Facebook, Twitter, LinkedIn and Snapchat.
 
The detailed user profiles are generally not available to researchers outside the companies. However, aggregate and anonymized data is shared with potential advertisers in the form of audience estimates. Basically, Facebook and other social networks provide advertisers with information on "how many users match criteria X". For example, to help with planning an advertising campaign, an advertiser could ask "how many monthly active Facebook users are married, male German expats aged 30-50 living in Qatar?" Answer: 120 (as of Dec 20, 2017).

This type of real-time digital census over Facebook's user base could potentially be of value to augment existing population estimates, in particular for countries where official statistics are unreliable or outdated. However, due to selection biases and an estimated 13% of accounts being duplicates or fakes, it is clear that using this data set as a simplistic enumeration tool for the whole population will not give accurate results. See https://www.theguardian.com/technology/2017/sep/07/facebook-claims-it-can-reach-more-people-than-actually-exist-in-uk-us-and-other-countries for more indications of shortcomings of the data.

In our own research, we do not use the raw advertising audience estimates as the final answer. Rather we treat it as one of potentially many input signals for an estimation task of the kind "how many Germans are living in Qatar today"? As long as the biases in the underlying data are either (i) uniform, e.g. 13% of duplicate or fake Facebook accounts for all countries, or (ii) systematic, e.g. Western Europeans are always less likely to be on Facebook compared to Arab nationals, an appropriately fitted model can account for and correct such biases.
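To make the idea of correcting a uniform bias concrete, here is a minimal sketch in Python. It is not the model from our paper: it assumes a simple log-log linear relationship between Facebook audience estimates and reference counts, and all numbers are synthetic and purely illustrative. The function names `fit_log_linear` and `predict` are hypothetical helpers introduced only for this example.

```python
import math


def fit_log_linear(fb_estimates, reference_counts):
    """Fit log(reference) = a + b * log(fb) by ordinary least squares."""
    xs = [math.log(v) for v in fb_estimates]
    ys = [math.log(v) for v in reference_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, fb_estimate):
    """Map a raw Facebook estimate to a bias-corrected population estimate."""
    return math.exp(a + b * math.log(fb_estimate))


# Synthetic, illustrative numbers only (not real data): every Facebook
# estimate is the reference count inflated by a uniform 13%.
reference = [1000, 5000, 20000, 80000]
facebook = [r * 1.13 for r in reference]

a, b = fit_log_linear(facebook, reference)

# A new raw estimate that carries the same 13% inflation.
corrected = predict(a, b, 1.13 * 40000)
print(f"slope={b:.3f}, scale={math.exp(a):.3f}, corrected={corrected:.0f}")
```

With a perfectly uniform bias the fitted slope is 1 and the intercept absorbs the inflation factor, so the raw estimate is simply rescaled. A systematic bias that differs between groups would need extra terms, for example a separate offset per region of origin.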

In our paper “Leveraging Facebook's Advertising Platform to Monitor Stocks of Migrants”, Emilio Zagheni, Krishna Gummadi and I show the feasibility of this approach to derive stocks of migrants across different US states and around the world. Concretely, we show that it is indeed possible to build models to make out-of-sample predictions on how many people from a certain origin country are residing in a particular US state. Similarly, it is possible to predict the percentage of expats out of the whole population for countries around the globe.
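One common way to check out-of-sample performance is leave-one-out evaluation: hold out one origin group, fit on the rest, and compare the prediction with the held-out reference count. The sketch below illustrates that procedure only; it is not the evaluation code from the paper, the helper names are hypothetical, and the (Facebook estimate, reference count) pairs are synthetic.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def leave_one_out_errors(fb_estimates, reference_counts):
    """Absolute percentage error of each held-out prediction."""
    log_fb = [math.log(v) for v in fb_estimates]
    log_ref = [math.log(v) for v in reference_counts]
    errors = []
    for i in range(len(log_fb)):
        # Fit on every group except the i-th one.
        train_x = log_fb[:i] + log_fb[i + 1:]
        train_y = log_ref[:i] + log_ref[i + 1:]
        a, b = fit_line(train_x, train_y)
        # Predict the held-out group from its Facebook estimate alone.
        predicted = math.exp(a + b * log_fb[i])
        errors.append(abs(predicted - reference_counts[i]) / reference_counts[i])
    return errors


# Hypothetical origin groups in one state: (Facebook estimate, reference count).
# Synthetic numbers for illustration only.
pairs = [(1200, 1000), (5400, 5000), (24000, 20000), (88000, 80000), (330000, 300000)]
fb, ref = zip(*pairs)

errors = leave_one_out_errors(list(fb), list(ref))
mape = sum(errors) / len(errors)
print(f"leave-one-out MAPE: {mape:.1%}")
```

Because each prediction is made for a group the model never saw during fitting, the resulting error is a fairer indication of how the model would do for an origin country without reliable official statistics.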

Potentially, the Facebook audience estimates could also give estimates for stocks of migrants at the sub-national and even the sub-city level. To illustrate this, Matheus Araujo, Michael Aupetit, Yelena Mejova and I created a data visualization for the Facebook data for Doha: http://fb-doha.qcri.org.

As an example, this shows a density map of Nepali expats across Doha, with the highest density in the Industrial Area. The tool also shows that Nepali expats in Doha are predominantly male (93%) and are Android users (94%). Contrast this with the same map for Western expats, with the highest densities in West Bay and the Pearl. Western expats are more gender balanced (44% female) and more likely to own iPhones (56%). A similar visualization for New York City can be explored at http://fb-nyc.qcri.org. [Usage info for the two data visualizations: Select several filters on the left to drill down to smaller populations by nationality, gender or other criteria. Click a selection again to de-select and revert to the whole category such as all nationalities or all genders.]

Given Facebook’s global reach of 2.1B monthly active users we believe there is a lot of potential in using this data source to support global development efforts, in particular given its easy accessibility through official APIs. At the same time, no single data source is a cure-all and many have complementary strengths. Satellite data has truly global reach and can give estimates of population densities but satellite data will never reveal the nationality or gender of earthlings. Call detail records (CDR, https://en.wikipedia.org/wiki/Call_detail_record) are great for studying dynamic changes in population density, but there are limitations for monitoring international migration as people often change their SIM cards once they move.

I’m truly optimistic that as Digital Demography advances and matures as a field and as researchers start to work collaboratively, combining different data sources, we will see more and more scientific work with real impact on the creation of migration statistics. If you’re interested in how to use new data sources and methodologies to help fill data gaps around the globe, please get in touch by email at: iweber -atsignal - hbku.edu.qa.

 
Relevant slide decks:

Using internet advertising data for studying international migration (https://www.slideshare.net/IngmarWeber/using-internet-advertising-data-for-studying-international-migration)

Digital Demography - WWW'17 Tutorial - Part II (https://www.slideshare.net/IngmarWeber/digital-demography-www17-tutorial-part-ii)

Wrapper libraries to obtain Facebook advertising audience estimates:


Wrapper library in Python (https://github.com/maraujo/pySocialWatcher) by Matheus Araujo (https://sites.google.com/view/matheusaraujo/)

All of my publications are available at https://ingmarweber.de/publications/. Feel free to follow me at https://twitter.com/ingmarweber.