Friday, 23 December 2016

Identifying popular tourism attractions in London by using geo-tagged photos from Flickr

Dr. Yeran Sun is a postdoctoral researcher at the Urban Big Data Centre, University of Glasgow, UK. His research interests include big data and urban studies, social media research and sentiment analysis, transport and social inequality, and transport and public health.

Social media offers crowd-sourced data for social science research. In particular, GPS-enabled devices, such as smartphones, allow social media users to share their real-time locations on social media platforms.

In my presentation, Flickr geo-tagged photos are used to identify popular tourist attractions in London.

‘Geo-tagged’ photos and tweets from Flickr, Instagram and Twitter users reveal the footprints and mobility of those users. Compared to Instagram and Twitter, Flickr has a large proportion of tourists among its users. Geo-tagged photos from Flickr users have been used as crowd-sourced data in recent tourism research. However, the distribution of geo-tagged photos is not proportional to the distribution of real tourists’ footprints. Visits to popular tourism attractions such as landmarks are likely to be over-represented by Flickr photos, while visits to unpopular tourism attractions are likely to be under-represented.

Although geo-tagged photos are biased, they can be used to measure the popularity of tourism attractions that have no ticketing records, such as central squares, public statues, public parks, rivers, mountains, bridges and so forth. As clusters of photos tend to form around popular attractions where tourists like to take photos, we can identify popular attractions by detecting significant spatial clusters of geo-tagged photos.

In my presentation, significant clusters are detected by using a density-based clustering method called DBSCAN. Most of those clusters spatially overlap popular tourist attractions in London. The free-to-use tools QGIS and R are used to map geo-tagged photos and to carry out cluster detection respectively. To run the DBSCAN algorithm we need to install the ‘dbscan’ package in R.

Via the Flickr APIs (https://www.flickr.com/services/api/), we can download public Flickr data including photos, tags and coordinates by defining geographic boundaries or searching for keywords. There are API kits written in a variety of languages, including C, Delphi, Java, Python, PHP, .NET, Ruby and so on. You might also use shared Flickr data for your research. Yahoo Research shares Flickr data with researchers (https://research.yahoo.com). Shared datasets can be found here (https://webscope.sandbox.yahoo.com/catalog.php?datatype=i).
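To make the clustering step more concrete, here is a minimal sketch of DBSCAN in plain Python. It is not the code from the presentation, which used the ‘dbscan’ package in R. The photo coordinates are made up for illustration (two small groups in central London and one stray point), and the 50 metre radius and minimum of 3 photos are assumed values rather than the settings used in the study.

```python
import math

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(a))

def dbscan(points, eps_m, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    def neighbours(i):
        # All points within eps_m of point i (including i itself).
        return [j for j, q in enumerate(points)
                if haversine_m(points[i], q) <= eps_m]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # noise for now; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # core point: keep growing the cluster
                queue.extend(nbrs)
    return labels

# Hypothetical geo-tagged photo locations as (latitude, longitude).
points = [
    (51.5007, -0.1246), (51.5008, -0.1247), (51.5006, -0.1245), (51.5009, -0.1244),
    (51.5055, -0.0754), (51.5056, -0.0755), (51.5054, -0.0753),
    (51.5300, -0.2000),
]

labels = dbscan(points, eps_m=50, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, -1]
```

The two dense groups come out as clusters 0 and 1, and the isolated photo is labelled -1 (noise). This sketch compares every pair of points, so for a real Flickr dataset with many thousands of photos a library implementation with spatial indexing would be a better choice.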

Wednesday, 14 December 2016

Facebook as a Tool for Research

Gill Mooney is a doctoral researcher, studying at the University of Leeds. Her research interests are currently focused on social class and social media. She completed her undergraduate degree as a mature student in sociology at the University of Hull, and prior to this was employed as a project co-ordinator for a young people’s sexual health charity in Hull. @gillmooney

My research is concerned with the ways in which we know, understand and produce social class in the digital environment of the social networking site (SNS), Facebook. The research will provide valuable insights into how social networking is changing the ways we may relate to one another both online and offline, as well as the effect it might be having on broader understandings of social class.

Facebook is the topic of the research, the site in which parts of it take place, and a tool for facilitating its logistics and practicalities. I am using it for recruitment, communication with research participants, and using content collected from Facebook as stimulus for discussion in focus groups and interviews. This combination of online and offline methods and approaches requires reflexivity to run smoothly, but maintaining a link between online and offline is essential for providing data that represents the relationship between those two spheres in terms of how individuals perceive and produce social class, and the broader effects that may have.

Recruitment
I specified Facebook as the means through which I would recruit participants, partly because I would know that they were likely to be regular Facebook users with a reasonable understanding of how the platform functions, but also because I want to keep as much of the research as possible within the psychic environment of Facebook, to help participants stay focused on discussing things that happen there and to keep the research framed within Facebook. I began by asking members of a general interest Facebook group of which I’m a member to share the call for participants on their own accounts. There are considerable ethical implications in using Facebook in this manner, especially when using my personal account for recruitment. It could result in a pool of participants who are connected to me personally in some direct or indirect way, which has the potential to compromise the integrity of the research or cause tension in my personal relationships. Precautions were put in place to avoid these kinds of conflict, mainly through checking possible connections to potential participants.

Communication
I set up a Facebook account in the name of the research, specifically for the purpose of handling communication and logistics with participants, again as part of wishing to keep all elements of the research within Facebook as far as possible. Participants add the account as a friend, and then I can use the messages tool to stay in touch with participants, arrange focus group and interview sessions, and send them links to consent forms and other information.
This has proven to be an effective means of staying in touch, and it means I can provide information quickly and easily in a medium that is both convenient for the participants and within the environment that I’m researching.

Stimulus for discussion
There was some concern that during the focus group sessions it would be easy for the discussion to deviate away from Facebook, and that it might be difficult to even begin talking about it in a face-to-face encounter with others. In response I devised a ‘dummy’ Facebook newsfeed page as a way to stimulate discussion and maintain focus on Facebook. By using this page, I can guide discussion by referring to it and asking the groups to comment on different elements within it, framing my questions around it to stay on track. Class is a difficult topic to discuss: everyone understands it differently and has had different experiences of it. So rather than directly addressing it, I am able to talk about self-representation more generally in terms of Facebook and explore how class shows itself there. The content for this dummy page comes from the pages of people in my own friendship network who volunteered, and is subject to a very rigorous consent and anonymisation process.

For the interviews, participants’ own shared content is used. They provide consent for me to select some items they have shared, and then it’s used as a means to stimulate discussion, serving a similar purpose as the dummy page.

Conclusion
Using Facebook as a tool for research requires significant planning and reflexivity throughout the whole research process, but can offer benefits in terms of having access to large networks of individuals for recruitment purposes, as well as an easy and convenient way to stay in touch with participants. The difficulties in planning are related to the considerable ethical implications of using content shared by participants, and ensuring informed consent is in place at all times.


Facebook is a crucial site for research that seeks to understand contemporary society, as its use grows and it becomes further embedded in the lives of its users. Developing well thought out approaches to this kind of work is essential for maximising the research potential of the platform, and for making sure that research is carried out with integrity.

Friday, 2 December 2016

On Social Media Analytics

Phillip Brooker is a research associate at the University of Bath (UK) working in social media analytics, with a particular interest in the exploration of research methodologies to support the emerging field. Phillip is a member of the team behind Chorus, a Twitter data collection and visualisation suite (www.chorusanalytics.co.uk). He currently works on CuRAtOR (Challenging online feaR And OtheRing), an interdisciplinary project focusing on how “cultures of fear” are propagated through online “othering”. @Chorus_Team

NSMNSS events have always been good value for me. I haven't quite been a part of the network since it kicked off, but I certainly have tried to be an 'active member' for the years that I have been involved with it. So when Curtis Jessop emailed me to ask if I'd give a talk on the practicalities of using Chorus to do social media analytics research, I jumped on it. More so than telling people about our software and what we've used it to do, these events are always the perfect chance to hear about innovative current research in the field. I won't go through my talk in too much detail here since I generally try not to be too reductive about how Chorus might be used in social research. Best to download it, watch the tutorial video, read the manual and then play about with it yourself (all of which you can do at :::PLUG ALERT!::: www.chorusanalytics.co.uk). Suffice to say that my talk aimed to run through the basic features and functions of Chorus as a free tool for collecting and (visually) analysing Twitter data. This included a demonstration of the two different data collection modes – the more familiar query keyword search which you can use to look for hashtags and so on, as well as our native user following data collection function which lets you capture sets of users’ Twitter timelines. And from there, I ran through the different ways of visualising data within Chorus – in brief, the timeline explorer which provides a variety of metrics (e.g. tweet volume, percentage of tweets with URLs, positive and negative sentiment, novelty and homogeneity of topic) as they change across time, and the cluster explorer which produces a topical map of the entire dataset based on the frequency with which terms co-occur with one another.
The aim here was to show how Chorus might be used by researchers to answer lots of different types of research question, both as a full all-in-one package, but also in a more exploratory way if users want to quickly dig into some data for a pilot study or similar – readers especially interested in what Chorus might offer might find one of our recent methods papers useful (available at: http://bds.sagepub.com/content/3/2/2053951716658060).

However, what I want to comment more pointedly on in this blog is the NSMNSS event itself, because to me it marks something of a turning point in social media analytics, where it's finally becoming very clear just how distinctive we've made (and are continuing to make) the field. There seems to have always been this worry that working with digital data runs the risk of turning the social sciences into unthinking automata for blindly spotting patterns – the supposed ‘coming crisis of empirical sociology’ referred to by Savage and Burrows in 2007. And that characterisation has not really disappeared, despite social media analysts’ natural objections to it as a way of representing our work. Thus far, social media analytics has (arguably) necessarily had to progress in a way that directly references those concerns – researchers have made it their explicit business to show, through both conceptual and empirical studies, that there is more to social media data than correlations. However, at this most recent NSMNSS event I got the sense, very subtly, that something different was happening. As a community, we seem to be moving past that initial (and I reiterate, very necessary!) reaction into a second phase where we’re beginning to be more comfortable in our own skin. We’re now no longer encumbered by the idea of social media analytics as “not data science”, and we’re seeing it recognised more widely as a thing in and of itself. As I say, it might seem a subtle distinction, but to me it suggests that finally we’re finding our feet!

Of course, this doesn’t mean we have neatly concluded any of the long-standing or current arguments about the fundamental precepts of the field – my background in ethnomethodology and ordinary language philosophy gives me a lot to say about the recent incorporation of ideas from Science and Technology Studies into social media analytics, for instance. But nonetheless, for me, this event has demonstrated the positive and progressive moves the field seems to be making as a whole. We already knew it of course, but it’s clearer than ever that there are very interesting times ahead for social media analytics!