Wednesday, 30 November 2016

Democratising Access to Social Media Data – the Collaborative Online Social Media ObServatory (COSMOS)

Luke Sloan is a Senior Lecturer in Quantitative Methods and Deputy Director of the Social Data Science Lab at the School of Social Sciences, Cardiff University, UK. Luke has worked on a range of projects investigating the use of Twitter data for understanding social phenomena covering topics such as election prediction, tracking (mis)information propagation during food scares and ‘crime-sensing’. His research focuses on the development of demographic proxies for Twitter data to further understand who uses the platform and increase the utility of such data for the social sciences. He sits as an expert member on the Social Media Analytics Review and Information Group (SMARIG) which brings together academics and government agencies. @DrLukeSloan

The vast amount of data generated on social media platforms such as Twitter provides a rich seam of information for social scientists on opinions, attitudes, reactions, interactions, networks and behaviour that was hitherto unreachable through traditional methods of data collection. The naturally occurring, user-generated nature of the data offers different insights into the social world from data collected explicitly for the purposes of research; social media data thus augments our existing methodological toolkit and allows us to tackle new and exciting research problems.

However, to make the most of a new opportunity we need to learn how the tool works. What does Twitter data look like? How is it generated? How do we access it? How can it be visualised? The bottom line is that, because social media data is so different to anything we have encountered before, it’s hard to understand how it can be collated and used.

That’s where COSMOS comes in. The Collaborative Online Social Media ObServatory (COSMOS) is a free piece of software that has been designed and built by an interdisciplinary team of social and computer scientists. It provides a simple and visual interface through which users can set up their own Twitter data collections based on random samples or key words and plot this data on maps, as networks or through other visual representations such as word clouds and frequency graphs. COSMOS allows you to play with the data, selecting subsets (such as male and female users) and seeing how they differ in their use of language, sentiment or network interactions. It directly interrogates the ONS API and draws in key area statistics from the 2011 Census, allowing you to investigate the relationship between, for example, population characteristics (Census) and anti-immigrant sentiment by locale (Twitter). Any social media data collected through COSMOS can then be exported in a variety of formats for further analysis in other packages such as SPSS, Stata, R and Gephi.

COSMOS is free to anybody working in academia, government or the third sector – simply go to and click on the ‘Software’ tab on the top menu bar to request access and view our tutorial videos.

Give it a go and see what you can discover!

Monday, 28 November 2016

Introduction to NodeXL

Wasim Ahmed, from the University of Sheffield, is a PhD researcher in the Information School, and Research Associate at the Management School. Wasim is also a social media consultant, a part of Connected Action Consulting, and has advised security research teams, crisis communication institutions, and companies ranked within the top 100 on the Fortune Global 500 list. Wasim often speaks at social media events, and is a regular contributor to the London School of Economics and Political Science (LSE) Impact blog. @was3210

This blog post is based on a talk with the same name which was delivered at the Introduction to Tools for Social Media Research conference. The slides for the talk can be found here. This blog post introduces and outlines some of the features of NodeXL.
Network Overview, Discovery, and Exploration for Excel (NodeXL) is a graph visualization tool which allows the extraction of data from a number of popular social media platforms including Twitter, YouTube, and Facebook with Instagram capabilities in beta. Using NodeXL it is possible to capture data and process it to generate a network graph based on a number of graph layout algorithms.
NodeXL is intended for users with little or no programming experience to perform Social Network Analysis. Social Network Analysis (SNA) is:
 “the process of investigating social structures through the use of network and graph theories” (Otte and Rousseau, 2002)
Figure 1 below displays the connections between workers in an office:

Figure 1 – An example network graph

We can also think of the World Wide Web as a big network where pages are nodes and the links are edges. The Internet is also a network where nodes are computers and edges are physical connections between devices. Figure 2, below, from Smith, Rainie, Shneiderman, & Himelboim (2014), provides a guide to the contrasting patterns found within network graphs.
The figure below shows that different topics on social media can have contrasting network patterns. For instance, in a polarized crowd discussion one set of users may talk about Donald Trump and another about Hillary Clinton; in a unified crowd users may talk about different aspects of the election; and in brand clusters people may offer an opinion related to the election without being connected to one another and without mentioning each other. In a community cluster a group of users may talk about the different news articles surrounding Hillary Clinton. Broadcast networks are typically found when analysing news accounts, as these disseminate news which is retweeted by a large number of users. Support networks are formed by accounts which reply to a large number of other accounts, such as the customer support account of a bank replying to a large number of Twitter users.

Figure 2 - Six types of network structure diagram
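To make the idea of nodes and edges a little more concrete, here is a minimal Python sketch. It is not NodeXL (which runs inside Excel), and the account names and edges are made up purely for illustration. It stores a small retweet network as a list of directed edges and counts how often each account is retweeted; an account that is retweeted by many others who rarely interact with each other is the hub-and-spoke shape described above as a broadcast network.

```python
from collections import Counter

# Hypothetical directed edges: (retweeter, original_author).
edges = [
    ("alice", "news_desk"),
    ("bob", "news_desk"),
    ("carol", "news_desk"),
    ("dave", "news_desk"),
    ("alice", "bob"),
]


def in_degree(edge_list):
    """Count incoming edges (times retweeted) for each node."""
    return Counter(target for _source, target in edge_list)


def nodes(edge_list):
    """Return the set of all nodes that appear in any edge."""
    return {node for edge in edge_list for node in edge}


degrees = in_degree(edges)
hub, hub_count = degrees.most_common(1)[0]
print(sorted(nodes(edges)))
print(hub, hub_count)
```

In this toy example the account with the highest in-degree is the broadcaster; a real analysis would also look at how the surrounding accounts connect to each other before assigning a network type.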

NodeXL can also generate a number of metrics associated with the graphs, such as the most frequently shared URLs, Domains, Hashtags, Words, Word Pairs, Replied-To, Mentioned Users, and most frequent tweeters. These metrics are produced overall and also by group of Twitter users. By looking at the different metrics associated with different groups (G1, G2, G3 etc.) you can see the different topics that users may be talking about.
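As a rough illustration of what a "top hashtags by group" metric involves, the Python sketch below counts hashtags separately for each group of users. It is not NodeXL's own implementation; the tweets, the group labels and the simple regular expression used to find hashtags are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical tweets, each already assigned to a group (e.g. by clustering).
tweets = [
    ("G1", "Watching the debate #Election2016 #Trump"),
    ("G1", "Rally tonight #Trump"),
    ("G2", "New policy speech #Election2016 #Clinton"),
    ("G2", "Volunteering today #Clinton #Vote"),
]

# Simplified pattern: a '#' followed by word characters.
HASHTAG = re.compile(r"#(\w+)")


def top_hashtags_by_group(rows, n=2):
    """Return the n most frequent hashtags (lower-cased) for each group."""
    counts = defaultdict(Counter)
    for group, text in rows:
        counts[group].update(tag.lower() for tag in HASHTAG.findall(text))
    return {group: counter.most_common(n) for group, counter in counts.items()}


result = top_hashtags_by_group(tweets)
for group in sorted(result):
    print(group, result[group])
```

Comparing the per-group lists side by side is what lets you see that two clusters in the same graph are talking about different things.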
NodeXL also hosts a graph gallery where users can upload workbooks and network graphs. In an academic context, however, uploading to the graph gallery may not be ethically permissible, as participants will be personally identifiable. It is still possible to use NodeXL to create graphs offline and to report results in aggregate.

Thursday, 24 November 2016

Critically Engaging with Social Media Research Tools: Select Your Targets Carefully

Dr Steven McDermott lectures on contemporary developments in media and communications with an emphasis on the social understanding and analysis of digital media; social media platforms and the public sphere; the politics and philosophy of digital media; and media and communications research methodologies at the London College of Communication, University of the Arts London. @soci

My presentation given at the SRA and #NSMNSS allowed me to finally meet face-to-face with 7 expert speakers presenting tools for social media research. It was a day of learning for me. An all-day event that left me elated and keen to get on with my research in the knowledge that I would be able to call on the expertise of the others if needed - and of course my door is always open to the other speakers. I highly recommend taking part in similar future events to all.

The talk was titled “Critically Engaging with Social Media Research Tools”; it was about using the tools but with ethical concerns at the forefront of the social researcher’s mind rather than relegating them to a mere paragraph in the methods section. In order to illustrate the fluid nature of the visualisations that the software can co-create I had decided to collect, analyse and visualise Twitter data on the hashtag #BigData. By selecting this hashtag, I was also keen to get behind who or what organisations were promoting the buzz surrounding big social data.

The tools that I introduced (TAGS, YourTwapperKeeper, DMI-TCAT, Gephi and Leximancer) to collect data from Twitter and YouTube enable the social researcher to take part (in a limited capacity) in surveillance capitalism. Researchers are able to collect big social data from people’s lives without their knowledge or consent. I was keen to highlight the notion that, because researchers are in this position of observing others’ interactions, they have a duty of care to those they are researching, just as we do when applying any other research tool.

The question of who or what institution is the key influencer controlling the flow of communication on #BigData was answered by analysing 1,040,000 #BigData Tweets with Leximancer. On Twitter the key influencer around the term #bigdata is a contractor who supplies staff to the National Security Agency in the United States – Booz Allen Hamilton. Booz Allen Hamilton are the contractors who employed Edward Snowden.

This visualisation was presented with the caveat that the graphs and images being shown are the result of numerous steps and decisions by the researcher guided by certain principles from social network analysis (SNA) and graph theory. What was presented are a few of the techniques and tools of data mining and analytics, with machine learning and automation in Leximancer. Such insights that ‘come’ from the data and the application of algorithms need to be validated in the light of informed understanding of the ‘never raw data’ position. The existence of this ‘data’ is the result of a long chain of requirements, goals and a shift in the wider political economy; surveillance capitalism. The ‘insights’ are at the macro level – devoid of this context.

Big/Social data does not represent what we think it does. It represents something, and this is worth investigating. We are looking at the various ways in which populations are defined, managed, and governed. The modelling algorithms used to visualise the social data know nothing about letters, nothing about narrative form, nothing about people.

The algorithm’s lack of knowledge of semantic meaning, and particularly its lack of knowledge of social media as a form or genre, lets it point us to a very different model of the social. Such ‘Reading Machines’ are engaged in datafication of the social. The concern with the notion of datafication is that as it attempts to describe a certain state of affairs, it flattens human experience. It is this flattening by computer-aided approaches to research of social media platforms that requires caution and can be ameliorated by the application of ethnographic approaches to collecting social media data from Twitter and other platforms.

A major worry is that designers, developers, and policy makers will continue to take big/social data at face value, as an object or representation of a truth that can be extracted from and that reflects the social.

We are glimpsing the various ways in which we are to be defined, managed, and governed. As social researchers we too engage in defining, managing and governing. The first ethical step when using the tools listed below is to have a carefully formulated research question and to select your targets carefully.

What follows is the list of tools referred to during the talk and links to each tool with installation support where provided. Also found here:
  •      TAGS – available here – by Martin Hawksey. Contains useful instructions and videos to help setting it up.
I have also created a step by step set-up guide for TAGS V6 –!ApdJKDPeE0fSmgo6z6yDln43Kb7X
The only concern is that Twitter now requires you to not only have a Twitter account but also have installed their app on your phone and provide them with your phone number and verify it. So it’s “Free”! Just provide us with your entire identity and all the data that goes with it!

YourTwapperKeeper has been seriously undermined by changes to Twitter’s rules and regulations, and its creator John O’Brien III seems to have sold it to Hootsuite and left it at that. It may now be in contravention of Twitter’s Terms of Service.

The Digital Methods Initiative Twitter Capture and Analysis Toolset (DMI-TCAT) allows tweets to be retrieved and collected from Twitter and analysed in various ways. Please check for further information and installation instructions.
This software is highly recommended – it also has a version that can access YouTube –

Gephi can now be used to collect Twitter data – and operates on Windows and Apple operating systems – just be very careful with Java updates and incompatible versions of macOS.

Designed for Information Science, Market Research, Sociological Analysis and Scientific studies, Tropes is a Natural Language Processing and Semantic Classification software that “guarantees” pertinence and quality in Text Analysis.

Leximancer is computer software that conducts quantitative content analysis using a machine learning technique. It learns what the main concepts are in a text and how they relate to each other. It conducts a thematic analysis and a relational (or semantic) analysis of the textual data.

Tuesday, 22 November 2016

Westminster Student Blog Series

We have been posting a series of blogs written by University of Westminster Postgraduate students. They are all based on their research of social media, and come with a YouTube video as well. This is the last blog of the series - thank you to all of the students who contributed their work.

Social Media users: The Digital Housewife?
Valerie Kulchikhina (@v_kulchikhina) is a student on the Social Media Master's degree programme at the University of Westminster. She earned her Bachelor's Degree in journalism and advertising from the Lomonosov Moscow State University.

New social media platforms are created every few years. For instance, after the success of MySpace in 2003 and Facebook in 2004, came the launch of Twitter in 2006.  In addition, there has been an emergence of image and video-based applications, such as Instagram and Snapchat (released in 2010 and 2011 respectively).
The main source of income for these companies is data: basic information about a website’s members; their likes, comments, photographs, videos and sometimes even user-generated content (e.g. YouTube, Pinterest). Consequently, some scholars have regarded this process as the exploitation of users’ labour. This subject has been explored in Digital Labor by Trebor Scholz, in Digital Labor and Karl Marx by Christian Fuchs, as well as in other books and publications.

However, Dr. Kylie Jarrett provides a new critical model for the issue of digital labour exploitation by applying Marxist feminist theorisation. According to Jarrett, there are notable similarities between the exploitation of domestic workers’ labour and online users’ labour. For example, in both instances their work remains unpaid, even though it is integral to the capitalist market. These ideas are presented and explored in her new book, entitled Feminism, Labour and Digital Media: The Digital Housewife. There, Jarrett also addresses a variety of topics, from Marxist works to identity politics.

In order to find out more about Jarrett’s perspectives, we reached out to the author herself and conducted a short but very informative interview. First, we examined the intriguing concept of the ‘digital housewife’ that allowed the author to explore the feminised experience of labour. Second, we discussed how Jarrett came across this concept of ‘feminisation of labour’ for the first time while reading neoliberal economics and politics. That idea later evolved into the term ‘housewifisation’, which the author discovered in the influential works of Maria Mies. Third, we analysed several similarities between domestic labour and online labour that initially captured Jarrett’s attention. For instance, the author notes, ‘they are both providing inalienable commodities that are part of the alienated commodity exchanges’. Moreover, both types of labour participate in developing ‘meaningful subjectivity’. However, Jarrett emphasizes that even while sharing so much in common, they are not the same.

In addition to that, the author explained her opinion on the importance of feminism. She notes how feminist theorisation showed the economic influence of domestic work that previously was simply considered a ‘natural’ labour. Thus, feminist critique helped to demonstrate the valuable role which consumer labour plays in the capitalistic world. 
She also mentions several reasons why the framework for housewives’ unpaid work has not garnered more attention over the years. For example, she reminds us that for a long time domestic work was perceived to be organic labour and, therefore, ‘not productive’.

Furthermore, Jarrett describes orthodox Marxism as ‘incoherent’ towards women, whose work was often discussed in the same context as nature.  Within this framework, it is not surprising that feminist theorisation was not able to gain more visibility for a long time.  
Jarrett also contemplated the possibility of building an online world where user labour is no longer exploited. During this discussion, Jarrett mentions that feminist theorisation shared some models of creating a harmonious medium. However, she highlights that ‘we do need to challenge a lot more than exploitation’.

In her book, Jarrett references numerous scholars, including feminist thinkers and other theorists. For instance, the author addresses the opinions of Mark Andrejevic and Tiziana Terranova, who believes that ‘free labour … is not necessarily exploited labor’. It was interesting to discover Jarrett’s responses to these notions, namely: ‘Yes, you are right but also…’ She uses a simple example of liking someone’s Instagram post to show that it is a social interaction, but also it is an action that is exploited structurally.

In summary, Jarrett manages to successfully utilise the framework of ‘unpaid reproductive work’ and apply it to the current discourse of online labour exploitation. Using different examples and her own personal experience, she makes a seemingly complex topic more accessible to students and scholars alike. Hopefully, readers will find the accompanying video to be an interesting introduction to Jarrett’s recent work. Perhaps it will help to further endorse the significance of feminists’ works in the field of digital media studies. 


Friday, 11 November 2016

What can social media tell us about society? - Videos & slides available

Thanks to everybody who attended our event at Twitter HQ on Tuesday looking at how social media data can be used to help us understand society. It was a great evening with interesting talks from Joseph Rice, Josh Smith, Callum Staff, Rob Procter, and Dr Aude Bicquelet.

For those of you not able to attend, or follow along on Periscope, check out the links below to look at slides and video from the event.

Joe Rice, Twitter - What's possible when you know what the whole world is thinking?
Watch video

Callum Staff, Department for Education - #vom: predicting norovirus using Twitter
Watch video
Download slides

Rob Procter, University of Warwick - Thinking practically about social data science
Watch video
Download slides

Aude Bicquelet, NatCen - Using text mining to analyse YouTube comments on chronic pain
Watch video
Download slides

Josh Smith, Demos - Listening to the digital commons
Watch video
Download slides

Q & A with all speakers
Watch video

Wednesday, 9 November 2016

Westminster Student Blog Series

We will be posting a series of blogs written by University of Westminster Postgraduate students. They are all based on their research of social media, and come with a YouTube video as well. We will be posting one a week for the next month, so keep your eyes peeled!

Pandora’s box: The Conflict Between Privacy and Security

Trenton Lee (@trentjlee) is a PhD Researcher at the Communications and Media Research Institute and the Westminster Institute for Advanced Studies at the University of Westminster. His research focuses on the intersection of critical political economy of the internet and identity theory.
The Guardian published an address to discuss the “uncomfortable truths” of the Apple vs. FBI court case in the United States where the FBI wanted Apple to aid in a terrorist investigation by developing a “back door” to “circumvent user-set security feature in any given iPhone” (Powles and Chaparro 2016). They argue that companies like Apple, Google and Facebook, who collect and store an exorbitant amount of the population’s information, must earn our trust, which is “predicated on transparency and it demands accountability, not marketing and press releases” (ibid). Christian Fuchs, in his recently published book, Reading Marx in the Information Age: A Media and Communication Studies Perspective on Capital Vol 1, demands this same transparency and accountability. Fuchs states that communication companies only tell one side of the story by, what Marx would say, “fetishizing” use-value (i.e. connectivity, communication) “in order to distract from exchange-value, from the fact that communications companies are out to make lots of money” (2016, p1). Throughout the book, Fuchs engages with the concepts and theories Karl Marx develops in Capital Vol 1, developing Marx’s critique of the political economy into a critique of the political economy of communication, which is useful in the study of the “role and contradictions of information in capitalism” (ibid).
Understanding the role and contradictions of information lies at the centre of the debate surrounding the Apple vs. FBI court case. How is this information collected? Why is it collected? What happens to it? Who decides this? 

This court case is at the centre of two clashing issues - the need for security and the right to privacy, which ignite a crisis of morality. In these times of crisis, people turn to each other to exchange information, experiences, and stories to make sense of the crisis. In the case of Apple vs. FBI, this exchange has developed into a familiar cultural narrative, one that ends in chaos - Pandora’s box. The UN human rights chief, Zeid al Hussein, described the FBI’s actions as an attempt to open Pandora’s Box, the mythological container containing all the world’s evils (Nebehay 2016). It is an interesting allegory for Hussein to apply to this dilemma over the management of the information collected by information companies like Google, information which is produced on a mass scale as a commodity, a ‘peculiar good’ (Fuchs 2016). This information is then stored in and left under the management of information companies like Apple, Google and Facebook, putting these information companies in the role of Pandora, the one who guards the box. However, their close ties to the capitalist mode of production and the concentration of power these companies possess challenge the trust we can place in them. We must use Marx and his political economic framework as a means to achieve the desired transparency and accountability that predicates the public’s trust in these information companies. Should we allow these companies to take on the role of Pandora? Will they guard the box that contains all of the world’s evils? Or will they, too, fail at the job?

Fuchs, C. (2016). Reading Marx in the Information Age: A Media and Communication Studies Perspective on Capital Vol 1. New York: Routledge.

Nebehay, S. (2016). UN Human Rights Official Warns Against 'Pandora's Box' Precedent In Apple vs. FBI case. Huffington Post, 4 March. Available from [Accessed 20 April 2016].

Powles, J. and Chaparro, E. (2016). In the wake of Apple v FBI, we need to address some uncomfortable truths. The Guardian, 29 March. Available from [Accessed 20 April 2016].

Monday, 7 November 2016

What can social media tell us about society? - Event on Periscope

On Tuesday 8th November, NatCen Social Research will be hosting an event as part of the ESRC Festival of Social Science looking at how, in an increasingly digital age, social media research offers new ways of understanding society's attitudes and behaviours.

The event will run from 17.00 to 19.00, featuring presentations from researchers who have used social media for research in a range of settings including government and academia, followed by a panel discussion.

Unfortunately, the event is fully booked, but if you'd like to follow along, the event will be streamed live on Periscope through the @NatCen Twitter account. We're also looking to take questions from the Twitter audience for the panel discussion. If you can't make it, we'll make links to the broadcast available.

Confirmed speakers are:

Joseph Rice, Twitter: What's possible when you know what the whole world is thinking about any topic at any time?
Josh Smith, DEMOS: Listening to the Digital Commons.
Callum Staff, Department for Education: #vom: Predicting Norovirus using Twitter
Rob Procter, University of Warwick: Thinking practically about social data science.
Dr Aude Bicquelet, NatCen Social Research: Using Text-Mining to analyse YouTube Video Comments on Chronic Pain