Tuesday, February 1, 2011

Littler twitter...

As I looked at a conversation between Diego and Florencia, two acquaintances of mine, I noticed that both were quite frequent updaters. This became even more evident when I compared their total number of twits against my own.

After this, I should have put the matter to rest, but it was too late. My mind was starting to brew a few ideas, and was thirsty for information. :D

Let's begin with some APIs: I downloaded these packages using easy_install %NAME --user (no pesky sudo here):
    http://cheeseshop.python.org/pypi/simplejson
    http://code.google.com/p/httplib2/
    http://github.com/simplegeo/python-oauth2

After that, I installed python-twitter and I had all the source code and libraries needed to do a little exploration. But there's a catch (as usual): you must register your application and get your tokens in order to access some twits. Ah, well... I named my app "yate": "Yet Another Twitter Environment", and Spanish for "yacht". It needs a home page even if it's a client program... OK, GitHub to the rescue!

Once registered, you'll have your four vanilla-secret tokens. I just hardcoded them and downloaded my twits. You have a limit of 200 twits per request, so if you are a frequent twitterer you may have to work out some magic. Fortunately, I'm quite lazy so I only had 180 or so. Here's the code:

#!/usr/bin/python

import twitter

# Fill in the four tokens you got when registering your application.
CONSUMER_KEY = ""
CONSUMER_SECRET = ""
OAUTH_TOKEN = ""
OAUTH_TOKEN_SECRET = ""

if __name__ == "__main__":
    api = twitter.Api(consumer_key = CONSUMER_KEY,
        consumer_secret = CONSUMER_SECRET,
        access_token_key = OAUTH_TOKEN,
        access_token_secret = OAUTH_TOKEN_SECRET)
    my_user = api.GetUser("one_twit_wonder")
    statuses = api.GetUserTimeline(screen_name="one_twit_wonder", count=200)
    with open("out.txt", "w+") as f:
        # writelines() adds no line separators, so everything lands on one line.
        f.writelines([s.created_at for s in statuses])

This thing downloaded the creation date of all my twits. I had to format everything because I forgot to add line separators (yay for me). And then, it hit me: Twitter's timestamp format was quite ugly. If you want to convert it to something more palatable, do this:
import datetime

twit_dt_f = '%a %b %d %H:%M:%S +0000 %Y\n'
out_f = '%m/%d/%y\n'

with open("out.txt", "r") as i:
    with open("clean_out.txt", "w+") as j:
        for d_str in i.readlines():
            j.write(datetime.datetime.strptime(d_str, twit_dt_f).strftime(out_f))

Yeah, it's ugly, but it does the trick. You can replace out_f with a format of your choice. I chose that one because our next step is... OpenOffice.org Calc. There are better ways to do this for sure, but I was tired. I pasted all the timestamps and worked with Calc the best I could. I would have liked that feature from Business Intelligence suites that converts dates into numbers, but I didn't have it, so I approximated it. I was 10 days short in the end, but it worked neatly.
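
One of those better ways, as a minimal sketch: Python's date.toordinal() turns a date into a plain day number, which is roughly the date-to-number conversion I was missing in Calc. The function name updates_per_day is made up for this example, and it assumes one timestamp per line in Twitter's format (and an English locale for the day and month names).

```python
import datetime
from collections import Counter

# Same pattern as twit_dt_f, minus the trailing newline (lines are stripped first).
TWIT_DT_F = '%a %b %d %H:%M:%S +0000 %Y'


def updates_per_day(lines):
    """Map each day's ordinal number to how many updates were posted that day."""
    days = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        day = datetime.datetime.strptime(line, TWIT_DT_F).date()
        days[day.toordinal()] += 1
    return days
```

Passing an open file with one timestamp per line (for example, a cleaned-up out.txt) should give the per-day counts, ready to be plotted against the day numbers.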

Check out this graph:


You'll notice a few twits in the first days: I was testing the Twitter API before it dropped basic authentication (ah, simple days...). I got bored and put the account to rest (after all, I didn't name it one_twit_wonder for nothing). A full year after that, my sister created her account and started nagging me about how I didn't twit. I had gwibber on my Ubuntu box and publishing an update was one click away, so I started using it, mostly for syndication and sharing, just like Google Reader. Since that fateful day, I have averaged a little less than one update per day. Klout tells me that eventually I'll post like crazy and turn into a conversationalist. Given my current apathy regarding Twitter and Facebook, I'll say Challenge Accepted. :)
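
P.S. About that limit of 200 twits per request: I didn't need it, but a minimal sketch of the "magic" could be to page backwards through the timeline. The fetch_page callable below is hypothetical (it is not part of python-twitter); it is assumed to return the newest statuses whose id is no greater than max_id, each with an integer id attribute.

```python
def fetch_all(fetch_page, page_size=200):
    """Collect statuses page by page, walking backwards through a timeline.

    fetch_page(max_id, count) is a hypothetical callable that returns a list
    of objects with an integer .id attribute, newest first, limited to ids
    <= max_id (or the newest ones when max_id is None).
    """
    collected = []
    max_id = None
    while True:
        page = fetch_page(max_id, page_size)
        if not page:
            break  # nothing older is left
        collected.extend(page)
        # Next time, ask only for statuses strictly older than the oldest seen.
        max_id = min(s.id for s in page) - 1
    return collected
```

With python-twitter, fetch_page could probably wrap api.GetUserTimeline, assuming the installed version accepts a max_id argument alongside screen_name and count.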
