Friday, 22 April 2016

Who Tweets? Making Twitter Data ‘Useful’ for Social Science

Dr Luke Sloan is a Senior Lecturer in Quantitative Methods, Deputy Director of Cardiff Q-Step and a member of the Collaborative Online Social Media Observatory (COSMOS). He is based in the School of Social Sciences at Cardiff University and his research focuses on the development of demographic proxies for Twitter data and understanding how social media data can augment traditional modes of social scientific analysis. @drlukesloan

A perennial criticism of Twitter data is that it’s missing many of the variables that we find interesting as social scientists and, because of this, it will never be a viable source of data for social scientific analysis. We are anchored to the practices of survey methodology in which a question is asked and answered, thus we ensure that the researcher collects the relevant demographic information allowing us to compare gender/ethnic/socio-economic groups. This is the bread and butter of social science.
In contrast, social media data is naturally occurring: it is not elicited! Because of this it is unfocused, messy and does not neatly address a pre-conceived research question. But it is a rich source of information on attitudes and provides insights into immediate reactions following key events. It’s been used to predict elections, box office revenue and even to calculate the epicentre of an earthquake. So clearly we shouldn’t be so quick to dismiss this data as useless, particularly if we are creative and innovative in how we conceptualise the manner in which demographic data may manifest and thus open this data up to social scientific analysis.
Imagine that you are walking down the street and have decided that today you are going to guess the demographic characteristics of the people that you see. The only rule is that you cannot ask them outright: you must observe their behaviour without being obtrusive. How might you work out someone’s gender? Well, perhaps you overhear someone shouting his or her name. What about their occupation? Maybe they have an ID badge or are carrying tools. What about their age? Well, we all make guesses about age based on appearance, often at the risk of offending someone. The point is that through the passive uptake of incidental information which is there to be analysed (and which you have not elicited!) you can tell quite a bit about a person.

Now let’s consider this in the context of Twitter. People put their name on Twitter, thus allowing us to derive a proxy for their gender. For those who have geo-tagging switched on we can tell where they were when they tweeted, or we can use profile information to work out their home town. If we have enough time we can even look at the places they refer to in their tweets. We know about their hobbies as they report on their leisure activities and we know a bit about their work if they report on it via social media. Are they employed? Well, we can have a look at whether they’re complaining about work, about colleagues or about the printer breaking down (‘again!’). When we look closely enough we are flooded with ‘signatures’ that offer us an indication of characteristics that would typically be found in the demographics section of a survey.

The sticking point is that we can’t derive this information for all tweeters, and not all proxies are equally reliable. First names are actually quite an accurate proxy for gender, as identity play is a minority pursuit. As long as you have stringent classification rules and accept that around 52% of UK users can’t be classified, you still have information for the remaining 48%* (around 600,000 successfully identified users). You could think of this 48% as a sample of Twitter users analogous to a survey sample, although not a randomly drawn one… but even then, do we have any reason to think that the users we have been able to identify are substantively different from those we can’t?
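
To make the idea of ‘stringent classification rules’ a little more concrete, here is a minimal sketch in Python. It is not the COSMOS method: the lookup table `NAME_GENDER`, the function names and the example display names are all invented for illustration, and a real study would rely on a large curated name database. The sketch only shows the general shape of the approach: take the first token of a display name, classify it only when it is an unambiguous match, and otherwise leave the user unclassified.

```python
from collections import Counter

# Hypothetical, tiny lookup table for illustration only; a real study would
# use a large curated database of first names.
NAME_GENDER = {
    "luke": "male",
    "james": "male",
    "sarah": "female",
    "emma": "female",
    "alex": "unisex",  # ambiguous names are deliberately left unclassified
}


def classify_gender(display_name: str) -> str:
    """Return 'male', 'female' or 'unclassified' for a profile display name."""
    tokens = display_name.strip().split()
    if not tokens:
        return "unclassified"
    first = tokens[0].lower()
    # Stringent rule: reject anything that is not purely alphabetic
    # (handles, emoji, digits, underscores and so on).
    if not first.isalpha():
        return "unclassified"
    gender = NAME_GENDER.get(first)
    # Only accept unambiguous matches; unknown or unisex names stay unclassified.
    if gender in ("male", "female"):
        return gender
    return "unclassified"


def classification_summary(display_names):
    """Count each outcome and return (counts, share of users classified)."""
    counts = Counter(classify_gender(name) for name in display_names)
    total = sum(counts.values())
    classified = counts["male"] + counts["female"]
    rate = classified / total if total else 0.0
    return counts, rate


# Invented example display names, not real data.
names = ["Luke Sloan", "Sarah J", "Alex Smith", "xX_dragon_Xx", "Emma", ""]
counts, rate = classification_summary(names)
print(dict(counts), f"classified: {rate:.0%}")
```

The design choice worth noticing is that the rules err on the side of not classifying: a smaller share of users with a reliable proxy is usually more useful than a larger share with a noisy one.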

The bottom line is that it is possible to derive important demographic information from Twitter data if we’re prepared to think creatively. The methods will get better and programmes of work will emerge which allow the confirmation of proxy demographic reliability. We’re only a few metres off the ground on our climb up this new methodological edifice, but seeking out a viable trail enables others to follow and establish safer, more secure routes.


