Wednesday, 16 April 2014

How to map location of tweets using a specific hashtag

The Peer Review Watch team and I recently put on a debate on the subject of science publishing. The debate was entitled "Peer review is broken. How do we fix it?" and was a great success.

Not only was there a full panel of knowledgeable people from science and science publishing, there was also a full audience and participation online through our livestream and our hashtag for the event, #prwdebate.


Yesterday I posted some analysis of the life of our hashtag on the Peer Review Watch blog, and in this post, I'm going to go over how I did it.

First of all, you need to get hold of your tweets. Twitter's search function and Storify are no good, as they only return tweets from the last seven days, and you'll run into the same problem with ScraperWiki. But I found a free tool called PeopleBrowsr, which will return tweets from up to a thousand days back.

http://search.peoplebrowsr.com/

You get an interface for searching Twitter that looks like this:




(Tangent - I challenge you to find this page by searching for it rather than clicking on the link!)

Once you've found all the tweets you want (in this case all tweets that have used the #prwdebate hashtag), click on the little arrow at the top of the column and export them as a CSV file.

Open the file, then use super secret ninja data skills to convert it into an Excel file (File > Save As > .xls).

Now you should have a spreadsheet of all the tweets, including a column that contains location data. Save it in Google Drive, open it as a Google Spreadsheet, then open it in a Fusion Table. All you have to do is change the type of the column holding the location info to 'location', and Google does the rest. It will look like this:



The precision of the location data varies; sometimes it just gives a country, sometimes a city, sometimes a specific address like Northampton Square, which is where the debate was held. But you'll get an idea of where people are tweeting from on a given subject.

You can do other useful things with your spreadsheet full of tweets. If you use a pivot table report, you can count the tweets from each user, so you know who the main people talking about your subject are.

To do this, select the column of the Google spreadsheet with the users in it, click on data>pivot table report. On the sidebar, click on rows>users, then value>countA and voila, you should have the total tweets for each user.
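If you'd rather not click through the pivot table, the same per-user count can be sketched in a few lines of Python. This is only a rough equivalent, not what I actually did: the column name `user` and the sample rows are made up for illustration, so check the header row of your own export and swap in the real name.

```python
import csv
import io
from collections import Counter


def count_tweets_per_user(csv_file, user_column="user"):
    """Return a Counter mapping each user name to their number of tweets.

    `user_column` is an assumed header name; use whatever your export calls it.
    """
    reader = csv.DictReader(csv_file)
    # Skip rows where the user cell is empty, much as COUNTA ignores blanks.
    return Counter(row[user_column] for row in reader if row.get(user_column))


# Tiny made-up sample standing in for the exported CSV file.
sample = io.StringIO(
    "user,text,location\n"
    "alice,Great panel #prwdebate,London\n"
    "bob,Is peer review broken? #prwdebate,Leeds\n"
    "alice,Livestream is working #prwdebate,London\n"
)

counts = count_tweets_per_user(sample)

# most_common() lists users from most tweets to fewest.
for user, total in counts.most_common():
    print(user, total)
```

With a real export you would pass `open("tweets.csv", newline="", encoding="utf-8")` instead of the sample (the file name is hypothetical). Passing a different column name, such as the location column, gives the same kind of tally for places.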

With this info, you could make an infographic like I did:



Or you could make a pie chart to show who had the lion's share of the Twitter action. If you wanted to know where most people were tweeting from, you could do a pivot table on the location column too, then sort it from largest to smallest.

There's a whole bunch of other data you can get using PeopleBrowsr, including how many tweets the hashtag got on each date, which makes a nice line graph:




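If you only have the exported CSV to hand, the tweets-per-date numbers behind a graph like this can be rebuilt with a short Python sketch. I'm assuming here a column called `created_at` holding timestamps that start with the date as `YYYY-MM-DD`; both the column name and the format are guesses, and the sample rows are invented, so adjust them to match your file.

```python
import csv
import io
from collections import Counter


def tweets_per_day(csv_file, date_column="created_at"):
    """Return a Counter mapping each date (YYYY-MM-DD) to its tweet count.

    Assumes each timestamp begins with the date in YYYY-MM-DD form.
    """
    reader = csv.DictReader(csv_file)
    # The first ten characters of the timestamp are the date part.
    return Counter(
        row[date_column][:10] for row in reader if row.get(date_column)
    )


# Made-up rows standing in for the exported CSV file.
sample = io.StringIO(
    "user,created_at\n"
    "alice,2014-04-08 18:05:00\n"
    "bob,2014-04-08 18:40:00\n"
    "alice,2014-04-09 09:15:00\n"
)

daily = tweets_per_day(sample)

# Print the days in calendar order, ready to paste into a chart.
for day in sorted(daily):
    print(day, daily[day])
```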
It also gives you a word frequency table, which you can plot in a chart too. To get this data, click on the arrow at the top of the column again, and choose 'grid'. This can be exported just like the other info.
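A word frequency table is also easy to approximate yourself from the tweet text. The sketch below is my own rough version, not how PeopleBrowsr does it: it lowercases everything and treats hashtags and @mentions as single words, so the numbers may not match their table exactly. The two sample tweets are made up.

```python
import re
from collections import Counter


def word_frequencies(tweets):
    """Count how often each word, hashtag or @mention appears in the tweets."""
    counts = Counter()
    for text in tweets:
        # Lowercase, then pull out words, keeping a leading # or @ attached.
        counts.update(re.findall(r"[#@]?\w+", text.lower()))
    return counts


# Made-up tweets standing in for the text column of the export.
sample_tweets = [
    "Peer review is broken #prwdebate",
    "How do we fix peer review? #prwdebate",
]

freq = word_frequencies(sample_tweets)

# Show the most frequent words first.
for word, total in freq.most_common(5):
    print(word, total)
```

In practice you would probably want to drop very common words like "is" and "the" before charting the results.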

PeopleBrowsr looks like a powerful analytics tool, and if you explore the various functions you should be able to find a whole lot more: for example, it can track blog mentions.
