Thursday, 4 August 2016

Consultation – Asking consent to link Twitter data and survey data

Curtis Jessop is a Senior Researcher at NatCen Social Research, where he works on longitudinal surveys, and is the Network Lead for the NSMNSS network.

Next year, I, along with Luke Sloan and Tarek Al Baghal, will be running an experiment on the Understanding Society Innovation Panel to look at the feasibility and practicalities of linking Twitter and survey data in a longitudinal context, and how they can be combined to improve the quality of both.

We will be focussing specifically on how survey data can be enhanced using social media data (for example by creating new measures, validating survey estimates or improving non-response adjustments) and how social media data can be validated using survey data. However, we are aware that such a dataset has greater potential than this, so we are also thinking about the ethics and practicalities that may be involved in making this dataset available more widely.

This will no doubt be tricky – as far as we are aware it is unprecedented to attempt to link data in this way and make it available to a wider set of researchers, and it is therefore difficult to predict what issues may arise. We therefore aim to be as open as possible about these issues; this will involve documenting the choices we make so others may learn from any mistakes, but we would also like to consult with the wider research community at key points in the process:

  • Consent to data linkage
  • Social media data collection/linking to survey data
  • Data archiving

We are currently at the first stage – asking consent to link participants’ survey data and Twitter data. To an extent, by asking consent we are going beyond what many social media researchers may do, but linking to survey data and aiming to archive the result changes the dynamic somewhat.

There are constraints to what we can do: the survey will be administered in web, telephone or face-to-face modes, so the process must work in all contexts. There is also limited questionnaire space, so we cannot add any more questions, and we also need to consider burden on the participant – a large amount of information may overwhelm and leave them less informed.

Below, I have outlined the template for the three questions we would like to ask. We are proposing to use ‘help links’ in the questionnaire so that participants completing it online can find out more if they want to, and so that interviewers can answer questions in the interviewer-administered modes:

Q1 [Ask All]
Do you have a personal Twitter account?
1. Yes
2. No

Q2 [IF Q1 = Yes]
We are interested in being able to link people’s answers to this survey to the ways in which they use Twitter. We would also like to know who uses Twitter.

We will not use your tweets to identify you in any way and your Twitter information will be treated as confidential and given the same protections as your interview data. Your Twitter name, and any information that would allow you to be identified would not be published.

HELP SCREEN: What data will you collect from my Twitter account?
HELP SCREEN: What will the data be used for?
HELP SCREEN: Who will be able to access the linked data?
HELP SCREEN: What will you do to protect my data?

Are you willing to tell me the name of your personal Twitter account and for your Twitter information to be linked with your answers to this survey?
1. Yes
2. No

Q3 [IF Q2 = Yes]

INTERVIEWER: Please enter the respondent’s Twitter name here: [OPEN]
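
The routing in the three questions above can be sketched in a few lines of Python. This is only an illustration of the skip logic, not the actual survey instrument: the function name, the dictionary layout and the handle clean-up step (stripping spaces and a leading "@") are all assumptions made for the example.

```python
from typing import Optional

YES, NO = 1, 2  # answer codes as listed in the question template


def route_twitter_consent(has_account: int,
                          consents: Optional[int] = None,
                          twitter_name: Optional[str] = None) -> dict:
    """Apply the Q1 -> Q2 -> Q3 routing and return what would be recorded."""
    record = {"Q1": has_account, "Q2": None, "Q3": None}

    # Q2 is only asked if the participant has a personal Twitter account.
    if has_account != YES:
        return record
    record["Q2"] = consents

    # Q3 is only asked if the participant agrees to the linkage.
    if consents != YES:
        return record

    # Assumption: tidy the handle so "@name" and "name" are stored the same way.
    record["Q3"] = (twitter_name or "").strip().lstrip("@") or None
    return record
```

For example, `route_twitter_consent(YES, NO)` records the refusal at Q2 and leaves Q3 empty, so no Twitter name is ever stored for a participant who has not consented.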

We would really appreciate any feedback you may have on what information we might include in these help links, or how we might change the question wording or administration. If you have ideas that would not be possible within these constraints, they would still be useful to hear so we can document them for others who may want to do this in the future.

If you have any suggestions, or would like to discuss this further, please do contact me. As we need to submit our final version of the question text to ISER by mid-September, please do try to get any comments to me by the end of August.


  1. This comment has been removed by a blog administrator.

  2. I think your only hurdle here will be how you want to share your Twitter data with other researchers. If you share the data in some aggregate format (descriptives, linguistic analysis, etc.) then I don't see any problems here. But if you share their raw tweets (which could be googled to find the account name and so link a person to their survey data) then I think you'd have some ethical problems. Also, will participants have a mechanism to opt out later, and will they be told each time you analyse their tweets?
    Just my thoughts. And sounds like a really interesting study!

  3. Ah, it's unfortunate that private businesses do not publish their research and methodologies as often as academic and social companies do. Combining Twitter data with survey/panel data has been tested by numerous survey companies over the last 6 years.

    The end conclusion is often that people don't talk enough about the desired topic on social media to warrant all the time and money required to collect the social media data. For instance, if the survey is about politics, the Twitter data may reveal one or two political posts, neither of which contains enough data to be useful.

    But it would be great to have some published data on the methodology so that other researchers can see hit rates and success rates.

    Good luck!
