Next year, Luke Sloan, Tarek Al Baghal and I will be running an experiment on the Understanding Society Innovation Panel to look at the feasibility and practicalities of linking Twitter and survey data in a longitudinal context, and at how the two can be combined to improve the quality of both.
We will focus specifically on how survey data can be enhanced using social media data (for example by creating new measures, validating survey estimates or improving non-response adjustments) and how social media data can be validated using survey data. However, we are aware that such a dataset has greater potential than this, so we are also thinking about the ethics and practicalities that may be involved in making this dataset available more widely.
This will no doubt be tricky – as far as we are aware it is unprecedented to attempt to link data in this way and make it available to a wider set of researchers, and it is therefore difficult to predict what issues may arise. We therefore aim to be as open as possible about these issues; this will involve documenting the choices we make so that others can learn from any mistakes, but we would also like to consult with the wider research community at key points in the process:
- Consent to data linkage
- Social media data collection/linking to survey data
- Data archiving
We are currently at the first stage – asking for consent to link participants’ survey data and Twitter data. To an extent, by asking for consent we are going beyond what many social media researchers may do, but linking to survey data and aiming to archive the result changes the dynamic somewhat.
There are constraints on what we can do: the survey will be administered in web, telephone or face-to-face modes, so the process must work in all contexts. There is also limited questionnaire space, so we cannot add any more questions, and we also need to consider the burden on the participant – a large amount of information may overwhelm them and leave them less informed.
Below, I have outlined the template for the three questions we would like to ask. We are proposing to use ‘help links’ during the questionnaire so that participants can find out more online if they want to, or so that an interviewer can answer their questions in interviewer-administered modes:
Q1 [Ask All]
Do you have a personal Twitter account?
Q2 [IF Q1 = Yes]
We are interested in being able to link people’s answers to this survey to the ways in which they use Twitter. We would also like to know who uses Twitter.
We will not use your tweets to identify you in any way and your Twitter information will be treated as confidential and given the same protections as your interview data. Your Twitter name and any information that would allow you to be identified will not be published.
HELP SCREEN: What data will you collect from my Twitter account?
HELP SCREEN: What will the data be used for?
HELP SCREEN: Who will be able to access the linked data?
HELP SCREEN: What will you do to protect my data?
Are you willing to tell me the name of your personal Twitter account and for your Twitter information to be linked with your answers to this survey?
Q3 [IF Q2 = Yes]
INTERVIEWER: Please enter the respondent’s Twitter name here: [OPEN]
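For anyone who finds it easier to read the routing as logic, the skip pattern for the three questions above can be sketched as follows. This is only an illustration of the [Ask All] / [IF ... = Yes] conditions, not how the questionnaire will actually be scripted; the function name `next_question` and the shape of the `answers` dictionary are hypothetical.

```python
from typing import Optional


def next_question(answers: dict[str, str]) -> Optional[str]:
    """Return the next question to ask, or None when routing is complete.

    `answers` maps question ids ("Q1", "Q2", "Q3") to the responses
    recorded so far. This structure is an assumption for illustration.
    """
    if "Q1" not in answers:
        return "Q1"  # Q1 is asked of everyone
    if answers["Q1"] != "Yes":
        return None  # no personal Twitter account: stop here
    if "Q2" not in answers:
        return "Q2"  # consent question, only if Q1 = Yes
    if answers["Q2"] != "Yes":
        return None  # consent not given: never ask for the Twitter name
    if "Q3" not in answers:
        return "Q3"  # Twitter name, only if Q2 = Yes
    return None
```

The point of writing it this way is that the Twitter name (Q3) can only ever be requested after an explicit "Yes" at the consent question (Q2), whichever mode the survey is administered in.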
We would really appreciate any feedback you may have on what information we might include in these help links, or how we might change the question wording/administration. If you also have ideas that may not be feasible in this context, they would still be useful to hear so we can document them for others who may want to do this in the future.
If you have any suggestions, or would like to discuss this further, please do contact me at firstname.lastname@example.org. As we need to submit our final version of the question text to ISER by mid-September, please do try to get any comments to me by the end of August.