Luke Sloan is a Senior Lecturer in Quantitative Methods and Deputy Director of the Social Data Science Lab at the School of Social Sciences, Cardiff University, UK. Luke has worked on a range of projects investigating the use of Twitter data for understanding social phenomena, covering topics such as election prediction, tracking (mis)information propagation during food scares and ‘crime-sensing’. His research focuses on the development of demographic proxies for Twitter data to further understand who uses the platform and increase the utility of such data for the social sciences. He sits as an expert member on the Social Media Analytics Review and Information Group (SMARIG), which brings together academics and government agencies. @drlukesloan
Who uses Twitter?
It’s a simple question, but one that is tricky to answer. We all think we
know the types of people who use Twitter – the urban elite, celebrities,
professionals, young people… but providing an empirical account is
challenging, and without knowing who tweets we can’t even start a
conversation about representativeness and bias. To understand how the social
world manifests in the virtual, we need to know who is present and who is
underrepresented.
Much work has been done on using Twitter metadata to estimate proxy
demographics for UK users, such as gender (Sloan et al. 2013) and age,
occupation and social class (Sloan et al. 2015), but these methods rely on
people self-reporting a first name, an age or date of birth and an occupation
to classify. The question has always been whether certain groups, such as
older people and those from certain occupations, are less likely to construct
their virtual identity with reference to these characteristics.
Clearly it’s quite a leap forward to be able to use British Social
Attitudes 2015, a random probability sample survey of over 4,000 respondents
with weights calculated to account for non-response bias, to help us understand
the Twitter population. The data allow us to compare Twitter usage by
demographic groups benchmarked against the 2011 Census whilst evaluating
previous attempts at demographic proxies.
So, how accurate is the picture of the demographic
characteristics developed through proxies?
As it turns out, there are some interesting discrepancies. According to the
BSA data there are more men on Twitter than expected, and although most users
are younger, there are more older users on the platform than we previously
thought. We also find strong class effects regarding Twitter use, which are
largely in line with previous proxy estimates but substantially out of line
for certain groups. The full paper is open access; the reference is given at
the end of this post.
How does this aid our understanding of how the social world manifests online?
To take an example, a recent study by Draper et al. found that, during the
horsemeat food scare of 2013, Twitter was dominated by jokes and humour. The
overall discourse suggested that this wasn’t perceived as a serious incident
and that the issue wasn’t really a public concern, but we now know that
Twitter is dominated by the higher NS-SEC groups – people with high incomes
who are the least likely to come into contact with the adulterated budget
products. Twitter thought it was funny because Twitter is dominated by people
who were largely unaffected by the scare. This is an important lesson in how
representation affects what the data are telling us.
Of course, it’s no surprise that Twitter is dominated by the
professional and managerial groups, but at least now we have some strong
evidence to underwrite our expectations.
Read the full paper: Sloan, L. (2017) Who Tweets in the United Kingdom?
Profiling the Twitter Population Using the British Social Attitudes Survey
2015, Social Media + Society 3:1, DOI: https://doi.org/10.1177/2056305117698981