Luke Sloan is a Senior Lecturer in Quantitative Methods and Deputy Director of the Social Data Science Lab at the School of Social Sciences, Cardiff University, UK. Luke has worked on a range of projects investigating the use of Twitter data for understanding social phenomena covering topics such as election prediction, tracking (mis)information propagation during food scares and ‘crime-sensing’. His research focuses on the development of demographic proxies for Twitter data to further understand who uses the platform and increase the utility of such data for the social sciences. He sits as an expert member on the Social Media Analytics Review and Information Group (SMARIG) which brings together academics and government agencies. @drlukesloan
Who uses Twitter?
It’s a simple question, but one that is tricky to answer. We all think we know the types of people who use Twitter – the urban elite, celebrities, professionals, young people… but providing an empirical account is challenging and without knowing who tweets we can’t even start a conversation about representativeness and bias. To understand how the social world manifests in the virtual we need to know who is present or underrepresented.
Much work has been done on using Twitter metadata to estimate proxy demographics for UK users such as gender (Sloan et al. 2013) and age, occupation and social class (Sloan et al. 2015), but these methods rely on people self-reporting a first name, an age or date of birth and an occupation to classify. The question has always been whether certain groups, such as older people and those from certain occupations, are less likely to construct their virtual identity with reference to these characteristics.
Clearly it’s quite a leap forward to be able to use British Social Attitudes 2015, a random probability sample survey of over 4,000 respondents with weights calculated to account for non-response bias, to help us understand the Twitter population. The data allow us to compare Twitter usage by demographic groups benchmarked against the 2011 Census whilst evaluating previous attempts at demographic proxies.
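To make the idea of a weighted comparison concrete, here is a minimal sketch in Python. It is not the analysis from the paper: the function name `weighted_share`, the field names and the four toy respondents are all made up for illustration, and the numbers are not BSA or Census figures. It simply shows how survey weights change each group's share, and how the share among Twitter users can be set against the share in the whole sample.

```python
from collections import defaultdict


def weighted_share(records, group_key, weight_key="weight"):
    """Return each group's weighted share of the given records."""
    totals = defaultdict(float)
    for record in records:
        totals[record[group_key]] += record[weight_key]
    grand_total = sum(totals.values())
    return {group: total / grand_total for group, total in totals.items()}


# Hypothetical toy respondents (invented values, NOT real survey data).
respondents = [
    {"sex": "male", "uses_twitter": True, "weight": 1.2},
    {"sex": "female", "uses_twitter": True, "weight": 0.8},
    {"sex": "female", "uses_twitter": False, "weight": 1.0},
    {"sex": "male", "uses_twitter": False, "weight": 1.0},
]

# Weighted share of each group among Twitter users only.
twitter_users = [r for r in respondents if r["uses_twitter"]]
twitter_shares = weighted_share(twitter_users, "sex")

# Weighted share of each group in the whole sample, used here as the benchmark.
sample_shares = weighted_share(respondents, "sex")

# A positive gap means the group is over-represented among Twitter users.
gaps = {group: twitter_shares[group] - sample_shares[group] for group in sample_shares}
```

In a real analysis the benchmark would come from an external source such as the Census rather than from the sample itself, and the weights would be those supplied with the survey.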
So, how accurate is the picture of the demographic characteristics developed through proxies?
As it turns out, we find some interesting discrepancies. According to the BSA data, there are more men on Twitter than expected and, although most users are younger, there are more older users on the platform than we previously thought. We also find strong class effects in Twitter use, which are in line with previous proxy estimates for most groups but substantially out of line for certain others. The full paper is open access; the reference is given at the end of this post.
How does this aid our understanding of how the social world manifests online? To take an example, a recent study by Draper et al. found that, during the horsemeat food scare of 2013, Twitter was dominated by jokes and humour. The overall discourse suggested that this wasn’t perceived as a serious incident and that the issue wasn’t really a public concern, but we now know that Twitter is dominated by the higher NS-SEC groups – people with high incomes who are the least likely to come into contact with the adulterated budget products. Twitter thought it was funny because Twitter is dominated by people who were largely unaffected by the scare. This is an important lesson in how representation shapes what the data are telling us.
Of course, it’s no surprise that Twitter is dominated by the professional and managerial groups, but at least now we have some strong evidence to underwrite our expectations.
Read the full paper: Sloan, L. (2017) Who Tweets in the United Kingdom? Profiling the Twitter Population Using the British Social Attitudes Survey 2015, Social Media + Society 3:1, DOI: https://doi.org/10.1177/2056305117698981