
Google Data API With OAuth Using the GData Python Client

Wednesday, April 22nd, 2009

On one of my projects (the one that’s kept me too busy to blog), I had to work quite a bit with the Google Data API. The project was built on Django, so I was using the gdata Python client.

The client is, for the most part, excellent. It abstracts away the parts of OAuth that you never ever want to have to know anything about or debug. All the signing, encoding, and decoding takes place inside the library – where it should happen.

The spec is somewhat young, and the library is even more so; there isn’t a whole lot out there covering the subject. The client comes with some sample scripts, but they don’t make it very clear how to set things up in a web app workflow. I’m going to try to fill the void somewhat with a sample Django app and a somewhat extensive writeup. You don’t have to know/understand Django to follow along, but it may help.

Disclaimer: The sample app is written in the simplest possible way, so as to focus on the code that actually has to do with OAuth and the GData API. When I say “sample,” I really fucking mean it. I pay no attention to things like security and good practices, because I don’t give a shit about those things, and neither should you. The purpose is to understand how Google’s python client works. If you copy and paste this code into your production app and push your key, secret, and RSA creds to a public github repository, that is your own stupid fault.

OAuth Crash Course

There are many better resources for learning about OAuth proper, so I’ll only cover the basics. OAuth is a protocol that lets a user grant an app (consumer) access to their data on an API (provider) without handing the app their password. It’s fairly complicated, but since Google has a pretty nice client wrapped around their API, most of that is abstracted away. What we need to know is the general OAuth workflow:

  1. User invokes a part of the consumer app that requires access to the provider’s API
  2. If the consumer already has an OAuth Access Token for that user, skip down to step 6
  3. If not, the consumer fetches an OAuth token key and token secret from the provider. The token secret is stored. The consumer then directs the user to an Authorization URL provided by the provider, with the token key as a query parameter, along with a “callback” URL pointing back to the consumer
  4. The user authorizes the consumer to use the provider’s data; the provider redirects the user back to the callback URL with the ‘oauth_token’ query variable set to the token key of the now-authorized request token. The consumer can then request an Access Token from the provider by sending this token key back with the token secret that was saved in the previous step
  5. The provider returns the Access Token. The consumer can now use this Access Token to access the user’s data through the provider’s API until the user elects to revoke the Access Token
  6. The consumer does magic tricks with user’s data, including making it disappear

Simple, right? I’ve left out all the signing details (those are still murky to me as well), but luckily we don’t have to worry about that too much. I’ll reference this list of steps as I go along – mostly to keep myself sane. Let’s call a spade a fucking spade – shit is confusing! More info on OAuth and Google’s implementation of it can be found here. If you want to be really hardcore, you can read the spec here.
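To make the six steps a bit more concrete, here is a toy in-memory simulation. It is only a sketch: ToyProvider, get_access_token, and the dictionary used as token storage are all made up for illustration, nothing here talks to Google, and no signing takes place.

```python
import secrets


class ToyProvider:
    """In-memory stand-in for the provider (hypothetical, not Google's API)."""

    def __init__(self):
        self.request_tokens = {}  # token key -> {"secret": ..., "authorized": bool}
        self.access_tokens = {}   # token key -> token secret

    def issue_request_token(self):
        # Step 3: hand the consumer an unauthorized request token.
        key, secret = secrets.token_hex(8), secrets.token_hex(8)
        self.request_tokens[key] = {"secret": secret, "authorized": False}
        return key, secret

    def authorize(self, key):
        # Step 4: the user clicks "Allow" on the provider's page.
        self.request_tokens[key]["authorized"] = True
        return key  # arrives back at the consumer as the oauth_token parameter

    def exchange(self, key, secret):
        # Steps 4-5: swap an authorized request token for an access token.
        record = self.request_tokens.get(key)
        if record is None or not record["authorized"] or record["secret"] != secret:
            raise PermissionError("request token unknown, unauthorized, or wrong secret")
        del self.request_tokens[key]
        access_key, access_secret = secrets.token_hex(8), secrets.token_hex(8)
        self.access_tokens[access_key] = access_secret
        return access_key, access_secret


def get_access_token(provider, stored_tokens, user):
    # Step 2: reuse an existing access token if we already have one.
    if user in stored_tokens:
        return stored_tokens[user]
    key, secret = provider.issue_request_token()                    # step 3
    returned_key = provider.authorize(key)                          # step 4
    stored_tokens[user] = provider.exchange(returned_key, secret)   # step 5
    return stored_tokens[user]
```

In a real web app, step 4 is a browser redirect and a second HTTP request, not a function call, which is why the token secret has to be stored somewhere in between.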

Some key terms to remember along the way:

  • Provider – service with the user’s data (in this case, Google)
  • Consumer – the app that wants to access the user’s data (your app)
  • Request Token – refers to a token used by the consumer to request access from the provider/user. It can be authorized or unauthorized. The token is initially given by the provider to the consumer in an unauthorized state. The consumer then has to direct the user to the provider’s authorization page with the unauthorized token. If the user accepts, the provider will send the user back to the consumer with an authorized request token.
  • Access Token – refers to a token used by the consumer to actually authenticate with the provider. The consumer has to exchange an authorized request token for an access token before it can access data. The access token is what gives the consumer long-term access to the user’s data.

Getting Started

First and foremost, go here and sign up for an API key. Pretty straightforward. Store your app key and secret somewhere – I’m using the settings.py file in my sample app, though that may not be optimal security-wise. You’ll also see an rsa_key in my GDATA_CREDS dictionary. That is there because my example signs requests with RSA_SHA1 (strictly speaking, this is signing, not encryption). You don’t have to use that, but I recommend it, and the example will be easier to follow. I’ve seen in some documentation that may or may not be outdated that the contacts API only accepts RSA signatures, though I haven’t tried HMAC myself. To set up your RSA key, follow the steps here. You should be able to accomplish this on most Unix/Linux boxes.

Now that you have your app key and secret, and you’ve set up your RSA key, you should download and install the gdata lib itself. Get it here.

Storing the Token

Before you start using tokens, you need some way to store them. A token consists of a token key and a token secret. You could create a database table to store those things individually. Me, I’m lazy – I just pickle my entire gdata token object. It lets me rebuild a functioning token object on the spot when I fetch it from the database. So for my case, I just have a text field for storing the token itself. See models.py in the sample app.

You could probably just as easily define a model with token and token secret fields and a method to return an initialized token object. Whatever your preference, define an appropriate model. You also need a way to tie the tokens to a user. My sample app does this using a foreign key to the django.contrib.auth.models.User model.
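Here is a rough sketch of the pickle-into-a-text-field approach, using only the standard library. A SimpleNamespace stands in for the gdata token object, and token_to_text / token_from_text are names I made up for this example; the base64 step is an assumption on my part to keep the pickled bytes safe in a plain text column.

```python
import base64
import pickle
from types import SimpleNamespace


def token_to_text(token):
    # pickle produces bytes; base64 turns them into ASCII that fits a text column.
    return base64.b64encode(pickle.dumps(token)).decode("ascii")


def token_from_text(text):
    # Only unpickle data your own app wrote: pickle can run code on load.
    return pickle.loads(base64.b64decode(text))


# Hypothetical stand-in for a token object with a key, a secret, and scopes.
example_token = SimpleNamespace(
    key="token-key",
    secret="token-secret",
    scopes=["https://example.invalid/feeds/"],
)
stored_text = token_to_text(example_token)
restored_token = token_from_text(stored_text)
```

The trade-off is the one mentioned above: pickling is less typing, while separate key and secret columns are easier to inspect and don't tie your stored rows to one library version's class layout.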

Creating a Token

Now we’re ready to actually start down the path towards getting the OAuth Access Token – the jackpot of API access. Seriously, you will feel a sense of palpable relief when you finally get one working. Like when passing a pineapple.

If you’re following along in the app, we’re going to skip over shit like user signup and auth (it’s only there so that people could set this up on their server and really see it work) and go straight for the money: getting the OAuth token. Look at /oauth/views.py->add_token. Approximate line numbers will be given in square brackets.

The first line of the view [~97] fetches the right scope for dealing with the contacts API. The scopes determine what part of the user’s data your token will have access to and are not part of the OAuth spec. They are stored under some pretty cryptic keys, so you may just have to look in the gdata source under gdata/service.py to find the right ones. I just want to use the contacts feed for the example, so I put ‘cp’ (what else would it be, right?)

Next, we initialize the gd_client variable, which is the object that will take care of all of our OAuth needs and communicate with TEH GOOGALZ [~99]. You will see that I tell it to use RSA signing and pass it my app’s API credentials from settings.py. There are a few different subclasses of the base client that provide more specific methods for dealing with individual feeds (contacts, blogger, docs, etc). We’ll just use the base client for initializing our token.

Now we arrive at a conditional to check what part of the process we’re on [~107]. If we don’t have an ‘oauth_token’ GET parameter, that means that we’re at step 3 in our process – we have to send the user to Google to hit the big Authorize button.
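The conditional boils down to something like the following sketch. The function name and the two return labels are invented for illustration; the sample app does this check inline in the view and does not use this helper.

```python
from urllib.parse import parse_qs


def oauth_stage(query_string):
    """Decide which half of the view should run for this request."""
    params = parse_qs(query_string)
    if "oauth_token" not in params:
        # Step 3: no token yet, so the user still has to visit the provider.
        return "send_user_to_provider"
    # Step 4: the provider sent the user back with an authorized request token.
    return "finish_after_callback"
```

Both halves live in the same view because the callback URL points right back at it.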

The first thing we do is initialize a request token (rt). We then store the ‘secret’ part of that token. I store it in a session for simplicity/readability, but it can be stored anywhere.

We then tell our client to use this request token and to generate an authorization URL for that token [~116]. That’s the URL we will send our user to in order for them to authorize us. Note that in the callback we just pass the current URL. When the user clicks ‘Allow’ on Google’s authorization page, they will be sent right back to this same view, but this time with the ‘oauth_token’ parameter, meaning they will hit the ‘else’ part of our conditional.
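If you're curious what that step amounts to, here is a hand-rolled approximation. Treat it as a sketch under assumptions: start_authorization is a made-up helper, the session is a plain dict, and the endpoint is what I believe Google's OAuth authorization URL to be. The gdata client builds the real URL for you, and its exact parameters may differ.

```python
from urllib.parse import urlencode

# Assumption: Google's OAuth authorization endpoint; the library knows the real one.
AUTHORIZE_URL = "https://www.google.com/accounts/OAuthAuthorizeToken"


def start_authorization(session, token_key, token_secret, current_url):
    # Keep the secret on our side; only the token key travels in the URL.
    session["oauth_token_secret"] = token_secret
    query = urlencode({"oauth_token": token_key, "oauth_callback": current_url})
    return AUTHORIZE_URL + "?" + query
```

The view would then redirect the user to the returned URL, and the user comes back to current_url once they click ‘Allow’.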

When we do get back to the page, we’ll need to reconstruct our token [~127]. Google has modified our token and marked it authorized, but we still need to put back our piece of the puzzle – the token secret. Since we stored it in a session variable, we retrieve it with ease. Since our token now has no context, we have to tell it what scopes we’re interested in as well.
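Reconstructing the token is just a matter of gluing three pieces back together. In this sketch a SimpleNamespace stands in for the gdata token class, and rebuild_request_token, the session key name, and the plain dicts are all hypothetical.

```python
from types import SimpleNamespace


def rebuild_request_token(get_params, session, scopes):
    # The token key comes back from the provider in the callback URL,
    # the secret is the one we stashed before the redirect,
    # and the scopes are supplied again because the token has lost its context.
    return SimpleNamespace(
        key=get_params["oauth_token"],
        secret=session["oauth_token_secret"],
        scopes=list(scopes),
    )
```

If the session has expired in the meantime, the secret lookup fails with a KeyError, which in a real view you would probably turn into a restart of the whole flow.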

So what we have now is an authorized request token. We’re still not out of the woods. Now we need to upgrade our token to an actual access token. This is done via the UpgradeToOAuthAccessToken method of the gdata client [~135].

As the comments in my code indicate, this part is a tad shifty [~141], at least as of version 1.2.4 of the python client. Since upgrading the token modifies it once more, we need to store the end result of the upgrade. However, the upgrade method does not return that modified token, it just stores it in the client’s token store, hence the find_token call. See discussion here for more info. UPDATE 05/05/2009: the patch to fix this awkwardness just got committed to SVN, so hopefully the next version of the library will no longer require the extra step.
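The awkward pattern looks roughly like this. ToyClient and ToyTokenStore are invented stand-ins that mimic the behaviour described above (the upgrade returns nothing and the result has to be looked up by scope); they are not the gdata classes, and the matching rule in find_token is my own simplification.

```python
class ToyTokenStore:
    """Hypothetical store that finds tokens by the URL they are scoped to."""

    def __init__(self):
        self._tokens = []

    def add_token(self, token):
        self._tokens.append(token)

    def find_token(self, url):
        # Return the most recently stored token whose scope covers the URL.
        for token in reversed(self._tokens):
            if any(url.startswith(scope) for scope in token["scopes"]):
                return token
        return None


class ToyClient:
    def __init__(self):
        self.token_store = ToyTokenStore()

    def upgrade_to_access_token(self, request_token):
        access_token = dict(request_token, kind="access")
        self.token_store.add_token(access_token)
        # Deliberately returns nothing, like the behaviour described above.


def upgrade_and_fetch(client, request_token):
    client.upgrade_to_access_token(request_token)
    # The extra step: dig the upgraded token back out of the client's store.
    return client.token_store.find_token(request_token["scopes"][0])
```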

Regardless, once we’ve retrieved the access token (at), we can store it [~145]. As I mentioned, I just pickle it into a TextField.

That’s it. Once you’ve saved that token, you can reconstruct it at any time to fetch any data that token is authorized for (that is, anything covered by the scopes you specified when sending the token to Google).

To see it in action, just take a look at views.py->home. First, we find the latest OAuth token in our database [~66]. If we don’t have one, we redirect the user to the add_token view to go through the OAuth process [~86]. If we do have one, we fetch some data.

We create and initialize a ContactsService client [~71-78] and then pass it our token [~79-80]. Then BAM, we fetch some contacts from our user’s contacts feed.
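Stripped of Django and gdata, the control flow of the home view is roughly the sketch below. The rows are plain dicts standing in for database records, and the function names, the field names, and the redirect path are all placeholders, not the sample app's actual ones.

```python
def latest_token(rows, user):
    # Pick the newest stored token that belongs to this user, if any.
    user_rows = [row for row in rows if row["user"] == user]
    if not user_rows:
        return None
    return max(user_rows, key=lambda row: row["created"])


def home(rows, user):
    row = latest_token(rows, user)
    if row is None:
        # No token yet: send the user through the OAuth process first.
        return ("redirect", "/oauth/add_token/")
    # We have a token: hand it to the client and go fetch contacts.
    return ("fetch_contacts", row["token"])
```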

That’s all for now. Hopefully this overview is helpful in getting everybody and their mother developing apps using the Google APIs. I welcome corrections and additions, and will update this post if I get any important ones.