daemontools 0.76 on Ubuntu 9.04

July 5th, 2009

I find daemontools to be incredibly useful for quickly turning scripts into daemons. I also like to run the latest version of Ubuntu, which means there is no (well-maintained) package available. There’s a package in ‘universe’ but it puts things in strange places and generally inspires distrust (explained below). As installing daemontools from source on Ubuntu hasn’t been straightforward (I had trouble both on 8.10 and 9.04), I’m gonna post a quick howto which pulls together a few pieces of information scattered throughout the internets.

Crash Course

daemontools comes with a bunch of tools. The full docs are here, but I’ll cover the basics.

  • supervise is a daemon that takes a directory as an argument and tries to execute the file named ‘run’ in that directory, while keeping state in the ‘supervise’ subdirectory. Naturally, you want to give it a folder with an executable file named ‘run’ and a writable subdirectory called ‘supervise’ in it.
  • svstat, svok, svc, etc. are used to manipulate individual supervise jobs and check their status. These aren’t covered here.
  • svscan is a daemon that also takes a directory as an argument and proceeds to look for supervise-able directories within. Any time it finds one that doesn’t already have a corresponding supervise process, it fires one up. Thus if the supervise instance you count on to do some background processing crashes for whatever reason, svscan will just start it back up.
  • svscanboot is the script that should be run at boot-time to start svscan. It will direct svscan to look for jobs in the /service directory.

In an ideal world, you throw a bunch of shit into /service, svscan picks it up and fires up a supervise process for all the subdirectories. It runs as a daemon and restarts supervise whenever needed (svscan itself is basically bulletproof in my experience). When your machine is rebooted for whatever reason, svscanboot restarts svscan and everything resumes as before. The reason I choose not to use the Ubuntu package is because it does not create this ideal world for me.
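To make the idea concrete, here is a toy Python sketch of what a supervisor loop does: run a command, and when it exits, run it again. This is only an illustration of the concept, not how daemontools is implemented. The function name and the max_restarts and pause parameters are my own inventions so that the sketch terminates; the real supervise keeps going indefinitely.

```python
import subprocess
import sys
import time


def supervise(command, max_restarts=3, pause=0.0):
    """Run command, starting it again each time it exits.

    max_restarts is an assumption of this sketch so that it terminates.
    Returns the list of exit codes that were observed.
    """
    exit_codes = []
    for _ in range(max_restarts):
        # Block until the child exits, then record how it ended
        exit_codes.append(subprocess.call(command))
        # Short pause so a job that dies instantly cannot spin the CPU
        time.sleep(pause)
    return exit_codes


if __name__ == "__main__":
    # Hypothetical job: a child Python process that exits right away
    print(supervise([sys.executable, "-c", "pass"], max_restarts=2))
```

svscan plays the same game one level up: it makes sure every job directory has a supervisor like this running.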

Compiling

First of all, daemontools won’t compile as is. Follow along the instructions here until you get to the exciting part where you type ‘package/install’ and then get an error message that looks like this:

/usr/bin/ld: errno: TLS definition in /lib/libc.so.6 section .tbss
mismatches non-TLS reference in envdir.o
/lib/libc.so.6: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [envdir] Error 1

Never fret. I have found a couple of possible fixes, but the patch below appears to be the cleanest one. [1]

diff -ur daemontools-0.76.old/src/error.h daemontools-0.76/src/error.h
--- daemontools-0.76.old/src/error.h    2001-07-12 11:49:49.000000000 -0500
+++ daemontools-0.76/src/error.h    2003-01-09 21:52:01.000000000 -0600
@@ -3,7 +3,7 @@
 #ifndef ERROR_H
 #define ERROR_H

-extern int errno;
+#include <errno.h>

 extern int error_intr;
 extern int error_nomem;

Assuming you are still at the point in the installation instructions right before the ‘package/install’ command, copy the patch into /tmp/daemontools-errno.patch, then

$ cd src
$ patch < /tmp/daemontools-errno.patch
$ cd ..
Then resume the original daemontools instructions as prescribed.
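If patch gives you grief (copy-pasted diffs tend to lose whitespace), the change is small enough to apply by hand or with a few lines of Python. The sketch below does the same one-line swap as the patch. The function name is made up, and it assumes you run it from the daemontools-0.76 directory so that src/error.h is the right path.

```python
from pathlib import Path

OLD_LINE = "extern int errno;"
NEW_LINE = "#include <errno.h>"


def patch_error_h(path):
    """Swap the errno declaration in error.h; return True if the file changed."""
    header = Path(path)
    text = header.read_text()
    if OLD_LINE not in text:
        # Already patched, or not the file we expected
        return False
    header.write_text(text.replace(OLD_LINE, NEW_LINE, 1))
    return True


if __name__ == "__main__":
    # Assumed location, relative to the daemontools-0.76 directory
    target = Path("src/error.h")
    if target.exists():
        print(patch_error_h(target))
```

Running it a second time is harmless: it simply reports that nothing changed.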

Testing

The installation creates two folders: /command and /service. The former contains all the different daemontools commands, while the latter is the default folder which the svscan daemon scans for jobs. To test that our installation worked, let’s create a proper test job (you’ll probably want to be superuser for this, since /service is owned by root by default)

$ cd /service
$ mkdir tester
$ mkdir tester/supervise
$ touch tester/run
$ chmod +x tester/run
In tester/run, put any script. It can be empty, but if you really want to verify things are working, make it output something to a tmp file. Sample:

#!/bin/bash

echo "Foo!" >> /tmp/foo.log
sleep 5

Once everything is in place, as superuser run the following commands. If your output looks different, something’s broke.

$ /command/svscan /service &
[1] 9641
$ ps -ef | grep svscan
root      9641 17108  0 18:12 pts/1    00:00:00 /command/svscan /service
mikhailp  9646 17108  0 18:12 pts/1    00:00:00 grep svscan
$ ps -ef | grep supervise
root      9642  9641  0 18:12 pts/1    00:00:00 supervise tester
mikhailp  9660 17108  0 18:12 pts/1    00:00:00 grep supervise
If you made your tester script log something, check that the output is showing up where expected. If all is well, shut ‘er down and move on.
$ killall svscan
$ killall supervise

Starting at Bootup

Assuming you want to use supervise for some critical offline part of your application, you want it to start automatically whenever your server boots up. You don’t want to have to manually start up svscan every time.

Ubuntu uses upstart for task management instead of sysvinit. In order to let upstart start and stop the svscan daemon correctly, the following script needs to be copied into /etc/event.d/svscan [2]

# svscan - daemontools
#
# This service starts daemontools from the point the system is
# started until it is shut down again.

start on startup

start on runlevel 1
start on runlevel 2
start on runlevel 3
start on runlevel 4
start on runlevel 5
start on runlevel 6

stop on shutdown

exec /command/svscanboot

Once that’s in place, you can use start svscan and stop svscan. Try starting it and checking ps output as described in the Testing section. Finally, reboot the box (if possible) and repeat the same checks. At this point, svscan should start up with your system and spawn supervise processes on any job it finds in /service.

That’s all there is to it. Enjoy.

1: Patch originally found here, linked from here.
2: Corrected from this entry posted in 2006, which has some obsolete info.

Google Data API With OAuth Using the GData Python Client

April 22nd, 2009

On one of my projects (the one that’s kept me too busy to blog), I had to work quite a bit with the google data API. I was using Django, so I was using the gdata python client.

The client is, for the most part, excellent. It abstracts away the parts of OAuth that you never ever want to have to know anything about or debug. All the signing, encoding, and decoding takes place inside the library – where it should happen.

The spec is somewhat young, and the library is even more so; there isn’t a whole lot out there covering the subject. The client comes with some sample scripts, but they don’t make it very clear how to set things up in a web app workflow. I’m going to try to fill the void somewhat with a sample Django app and a somewhat extensive writeup. You don’t have to know/understand Django to follow along, but it may help.

Disclaimer: The sample app is written in the simplest possible way, so as to focus on the code that actually has to do with OAuth and the GData API. When I say “sample,” I really fucking mean it. I pay no attention to things like security and good practices, because I don’t give a shit about those things, and neither should you. The purpose is to understand how Google’s python client works. If you copy and paste this code into your production app and push your key, secret, and RSA creds to a public github repository, that is your own stupid fault.

OAuth Crash Course

There are many better resources for learning about OAuth proper, so I’ll cover the basics. OAuth is a protocol for securing communication between an API (provider) and an app that wants to use said API (consumer). It’s fairly complicated, but due to Google having a pretty nice client wrapped around their API, most of that is abstracted away. What we need to know is the general OAuth workflow:

  1. User invokes a part of the consumer app that requires access to the provider’s API
  2. If the consumer already has an OAuth Access Token for that user, skip down to step 6
  3. If not, the consumer fetches an OAuth token key and token secret from the provider. The token secret is stored. The consumer then directs the user to an Authorization URL provided by the provider, with the token key as a query parameter, along with a “callback” URL pointing back to the consumer
  4. The user authorizes the consumer to use the provider’s data; the provider redirects the user back to the callback URL with the “Authorized Request Token” ‘oauth_token’ query variable set to its token key. The consumer can then request an Access Token from the provider by sending this token key back with the token secret that was saved in the previous step
  5. The provider returns the Access Token. The consumer can now use this Access Token to access the user’s data through the provider’s API until the user elects to revoke the Access Token
  6. The consumer does magic tricks with user’s data, including making it disappear

Simple, right? I’ve left out all the encryption details (those are still murky to me as well), but luckily we don’t have to worry about that too much. I’ll reference this list of steps as I go along – mostly to keep myself sane. Let’s call a spade a fucking spade – shit is confusing! More info on OAuth and Google’s implementation of it can be found here. If you want to be really hardcore, you can read the spec here.

Some key terms to remember along the way:

  • Provider – service with the user’s data (in this case, Google)
  • Consumer – the app that wants to access the user’s data (your app)
  • Request Token – refers to a token used by the consumer to request access from the provider/user. It can be authorized or unauthorized. The token is initially given by the provider to the consumer in an unauthorized state. The consumer has to then direct the user to the provider’s authorization page with the unauthorized token. If the user accepts, the provider will send the user back to the consumer with an authorized request token.
  • Access Token – refers to a token used by the consumer to actually authenticate with the provider. The consumer has to exchange an authorized request token for an access token before it can access data. The access token is what gives the consumer long-term access to the user’s data.
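To keep the steps and terms straight, here is a toy, in-memory walk through steps 3 to 5. There is no signing and no HTTP, and every name in it (ToyProvider, run_flow, the example callback URL) is made up for illustration. It has nothing to do with the gdata client’s actual API; it only shows who hands what to whom.

```python
import secrets


class ToyProvider:
    """Hypothetical in-memory provider: no signing, no HTTP."""

    def __init__(self):
        # token key -> {"secret": str, "authorized": bool}
        self.request_tokens = {}
        self.access_tokens = set()

    def get_request_token(self):
        """Step 3: hand the consumer an unauthorized request token."""
        key, secret = secrets.token_hex(4), secrets.token_hex(4)
        self.request_tokens[key] = {"secret": secret, "authorized": False}
        return key, secret

    def authorize(self, key, callback_url):
        """Step 4: the user clicks Allow; redirect back with the token key."""
        self.request_tokens[key]["authorized"] = True
        return "%s?oauth_token=%s" % (callback_url, key)

    def exchange(self, key, secret):
        """Step 5: swap an authorized request token for an access token."""
        token = self.request_tokens.get(key)
        if not token or token["secret"] != secret or not token["authorized"]:
            raise ValueError("unknown, unauthorized, or mismatched request token")
        del self.request_tokens[key]
        access = secrets.token_hex(8)
        self.access_tokens.add(access)
        return access


def run_flow(provider, callback_url="http://consumer.example/add_token"):
    """Play the consumer's part of steps 3 to 5 and return the access token."""
    key, secret = provider.get_request_token()
    saved_secret = secret  # the consumer stores this between requests
    redirect = provider.authorize(key, callback_url)
    returned_key = redirect.split("oauth_token=")[1]
    return provider.exchange(returned_key, saved_secret)
```

Note that the secret never travels through the user’s browser. Only the token key appears in the redirect, which is why the consumer has to hang on to the secret itself.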

Getting Started

First and foremost, go here and sign up for an API key. Pretty straightforward. Store your App key and secret somewhere – I’m using the settings.py file in my sample app, though that may not be optimal security-wise. You’ll also see an rsa_key in my GDATA_CREDS dictionary. That is there because my example uses RSA_SHA1 signing. You don’t have to use that, but I recommend it, and the example will be easier to follow. I’ve seen in some documentation that may or may not be outdated that the contacts API only accepts RSA signatures, though I haven’t tried HMAC myself. To set up your RSA key, follow the steps here. You should be able to accomplish this on most unix/linux boxes.

Now that you have your app key and secret, and you’ve set up your RSA key, you should download and install the gdata lib itself. Get it here.

Storing the Token

Before you start using tokens, you need some way to store them. A token consists of a token key and a token secret. You could create a database table to store those things individually. Me, I’m lazy – I just pickle my entire gdata token object. It lets me rebuild a functioning token object on the spot when I fetch it from the database. So for my case, I just have a text field for storing the token itself. See models.py in the sample app.

You could probably just as easily define a model with token and token secret fields and a method to return an initialized token object. Whatever your preference, define an appropriate model. You also need a way to tie the tokens to a user. My sample app does this using a foreign key to the django.contrib.auth.models.User model.
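For the curious, the pickle-into-a-text-field approach boils down to something like the sketch below. A plain dict stands in for the gdata token object, the function names are hypothetical, and the base64 step is my own addition so that the result is safe to put in any text column.

```python
import base64
import pickle


def token_to_text(token):
    """Pickle a token and encode it so it fits in a plain text column."""
    return base64.b64encode(pickle.dumps(token)).decode("ascii")


def token_from_text(text):
    """Rebuild the token from the stored text."""
    return pickle.loads(base64.b64decode(text.encode("ascii")))


# Stand-in for a real token object: key, secret, and the scopes it covers
fake_token = {"key": "abc", "secret": "s3kr1t", "scopes": ["contacts"]}
stored = token_to_text(fake_token)
restored = token_from_text(stored)
```

One caveat: only unpickle data your own app wrote. Unpickling text that someone else could have tampered with is a well-known way to get code executed on your server.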

Creating a Token

Now we’re ready to actually start down the path towards getting the OAuth Access Token – the jackpot of API access. Seriously, you will feel a sense of palpable relief when you finally get one working. Like when passing a pineapple.

If you’re following along in the app, we’re going to skip over shit like user signup and auth (it’s only there so that people could set this up on their server and really see it work) and go straight for the money: getting the OAuth token. Look at /oauth/views.py->add_token. Approximate line numbers will be given in square brackets.

The first line of the view [~97] fetches the right scope for dealing with the contacts API. The scopes determine what part of the user’s data your token will have access to and are not part of the OAuth spec. They are stored under some pretty cryptic keys, so you may just have to look in the gdata source under gdata/service.py to find the right ones. I just want to use the contacts feed for the example, so I put ‘cp’ (what else would it be, right?)

Next, we initialize the gd_client variable, which is the object that will take care of all of our OAuth needs and communicate with TEH GOOGALZ [~99]. You will see that I tell it to use RSA signing and pass it my app’s API credentials from settings.py. There are a few different subclasses to the base client that provide more specific methods for dealing with individual feeds (contacts, blogger, docs, etc). We’ll just use the base client for initializing our token.

Now we arrive at a conditional to check what part of the process we’re on [~107]. If we don’t have an ‘oauth_token’ GET parameter, that means that we’re at step 3 in our process – we have to send the user to Google to hit the big Authorize button.

The first thing we do is initialize a request token (rt). We then store the ‘secret’ part of that token. I store it in a session for simplicity/readability, but it can be stored anywhere.

We then tell our client to use this request token and to generate an authorization URL for that token [~116]. That’s the URL we will send our user to in order for them to authorize us. Note that in the callback we just pass the current URL. When the user clicks ‘Allow’ on Google’s authorization page, they will be sent right back to this same view, but this time with the ‘oauth_token’ parameter, meaning they will hit the ‘else’ part of our conditional.

When we do get back to the page, we’ll need to reconstruct our token [~127]. Google has modified our token and marked it authorized, but we still need to put back our piece of the puzzle – the token secret. Since we stored it in a session variable, we retrieve it with ease. Since our token now has no context, we have to tell it what scopes we’re interested in as well.
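Stripped of everything gdata-specific, the branching in the view comes down to one question: did the request come back with an ‘oauth_token’ parameter or not? Here is a standard-library-only sketch of that check. The function name and the return values are made up for illustration, and the real view obviously does a lot more in each branch.

```python
from urllib.parse import parse_qs, urlsplit


def oauth_step(request_url):
    """Return which half of the flow a request belongs to.

    ("authorize", None)  -> no token yet, send the user to the provider
    ("upgrade", key)     -> the user is back, upgrade the request token
    """
    params = parse_qs(urlsplit(request_url).query)
    if "oauth_token" not in params:
        return ("authorize", None)
    return ("upgrade", params["oauth_token"][0])
```

Because the callback URL is just the current URL, the same view handles both visits and this check is all that tells them apart.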

So what we have now is an authorized request token. We’re still not out of the woods. Now we need to upgrade our token to an actual access token. This is done via the UpgradeToOAuthAccessToken method of the gdata client [~135].

As the comments in my code indicate, this part is a tad shifty [~141], at least as of version 1.2.4 of the python client. Since upgrading the token modifies it once more, we need to store the end result of the upgrade. However, the upgrade method does not return that modified token, it just stores it in the client’s token store, hence the find_token call. See discussion here for more info. UPDATE 05/05/2009: the patch to fix this awkwardness just got committed to SVN, so hopefully the next version of the library will no longer require the extra step.

Regardless, once we’ve retrieved the access token (at), we can store it [~145]. As I had mentioned, I just pickle it into a TextField.

That’s it. Once you’ve saved that token, you can reconstruct it at any time to fetch any data that token is authorized for (as in, which scopes you specified when sending the token to Google).

To see it in action, just take a look at the views.py->home. First, we find the latest OAuth token in our database [~66]. If we don’t have one, we redirect the user to the add_token view to go through the OAuth process [~86]. If we do have one, we fetch some data.

We create and initialize a ContactsService client [~71-78] and then pass it our token [~79-80]. Then BAM, we fetch some contacts from our user’s contacts feed.

That’s all for now. Hopefully this overview is helpful in getting everybody and their mother developing apps using the Google APIs. I welcome corrections and additions, and will update this post if I get any important ones.

Bye Bye, Internets!

March 14th, 2009

I’m leaving for France tomorrow. I’m not taking my laptop, I won’t be using data on my phone (AT&T international data rates are something else), and I won’t be checking my email.

If you need to reach me, shoot me an email and I will get back to you promptly – on or after March 22nd :)

Drizzle: Time To Get Excited

March 3rd, 2009

Brian Aker was in San Francisco on Monday night to talk to the MySQL and PHP Meetup groups about Drizzle, the fork of MySQL aimed at meeting the needs of large web applications.

I had heard various tidbits here and there about Drizzle, including at MySQL2008, and had a pretty vague impression of how it was actually going to be different. The general theme seemed to be simpler, smaller, and faster.

If everything Brian described is true, it’s those things plus FUCKING AWESOME. I will join my friends (here, for example) in foaming at the mouth about it. Below is a rough list of things Brian mentioned that get me all hot’n’bothered about Drizzle, in no particular order:

  • No Query Cache. The best part here is the reasoning for its absence: “If you’re relying on the query cache, you probably should have just used memcached to begin with.”
  • In-Query Sharding Info. I’m not exactly sure of the details of the implementation, but it sounds like Drizzle will make it possible to spray queries through to the right shard automatically.
  • Pluggable Authentication. Finally. Authentication can also be completely turned off.
  • Serializable Query Plans. This feature will allow the parser to be bypassed entirely on most queries. You simply send the query to MySQL, get the execution plan back, cache that, and send the execution plan back the next time you need the same query. That is Fucking Bad Ass (TM).
  • Fewer Locks. Getting rid of a lot of the more advanced and rarely used features introduced in MySQL 5/5.1 (views, stored procedures, etc.), as well as some of the more basic stuff like authentication, has allowed Drizzle to lose 2/3 of the locks present in 5.0 (at least that’s what I think Brian said… sounds surreal…) which obviously opens the door for vast improvements on multicore architectures. It also sounded like Brian made the decision during his talk to discontinue MyISAM support in Drizzle to be able to get rid of another huge lock. I’m OK with it…
  • Discontinued support for antique hardware. Lots of code ripped out because it’s no longer needed.
  • Everything is pluggable, most things are optional. The Drizzle kernel is about 115kloc. Amazing.

Brian was very coy about benchmarks, leaving it to independent sources to run them, but it sounds like Drizzle will leave MySQL in the dust for most of the common applications seen in webapps. I can’t wait to try it and to use it on a few things.

APIMuni by Danny Roa – Bringing NextMuni To The Masses

February 28th, 2009

Danny Roa, whom I met at the last Django Meetup, has put out a quick API for accessing Nextbus data.

It’s hosted on the App Engine and can be found here.

His writeup is here.

He recycled the scraping code from yourmuni, props to him for giving props :) Of course, that just means that when Nextbus gets angry, they’re going to come after me first!

Developers don’t create APIs for nothing, so I am eagerly anticipating what Danny is going to use this API for.

Reusable Logging in Django Apps

February 28th, 2009

I have 3 drafts sitting in my queue – 1 really long post and 2 short ones. I’ve been picking away at the long post on the shuttle rides, but in the meantime I’m gonna try to push out the two quick ones. This is one of the quick ones.

I was trying to figure out how to set up reusable logging in my apps and have it fairly decoupled from the overall project. Here’s what I came up with:

  1. Set up a logger object using these instructions in settings.py and store it in the LOGGER variable.
  2. Grab it inside apps using django.conf.settings like so:

from django.conf import settings
try:
    logging = settings.LOGGER
except AttributeError:
    import logging
Then just use logging.debug, logging.info etc. Thus, if a LOGGER is configured inside the project’s settings.py, we use that (django.conf.settings points to the settings.py for whatever project you’re working inside of, so you can move your app from project to project no problem). Otherwise, we just use vanilla logging functions with the global logging configuration. Nice and sweet.
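For completeness, here is roughly what the settings.py side could look like, along with the same lookup written using getattr. This is a sketch using only the standard logging module. The logger name, the format string, and the helper names are all placeholders of mine, and SimpleNamespace stands in for django.conf.settings so the snippet runs on its own.

```python
import logging
from types import SimpleNamespace


def build_logger(name="myproject", level=logging.DEBUG):
    """What settings.py might do: configure one logger and expose it."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:
        # Guard against duplicate handlers if settings gets imported twice
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def get_log(settings):
    """Use the project's LOGGER if there is one, else the logging module."""
    return getattr(settings, "LOGGER", logging)


# Stand-ins for django.conf.settings in two different projects
with_logger = SimpleNamespace(LOGGER=build_logger())
without_logger = SimpleNamespace()

get_log(with_logger).info("using the project logger")
get_log(without_logger).warning("falling back to plain logging")
```

Either way the app code calls .debug, .info and friends on whatever it got back, since a Logger object and the logging module expose the same functions for that.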

Suggestions on other ways to do this are, as always, welcome.

Example on django snippets: here.

Django Tip: Using Dictionaries For Model Method Parameters

February 3rd, 2009

I’ve been working a whole lot outside of my job, mostly writing Python and working with Django. I don’t have much energy for a real blog post about something awesome, but I do have a tip to share. Advanced “pythonistas” won’t be impressed, but I haven’t seen this documented prominently anywhere, so I’ll toss it up anyway.

As we all know, Python supports keyword arguments, and the Django ORM takes full advantage of this. When doing lookups, the ORM parses keyword parameters in order to determine what SQL query to execute. A typical ORM call will look like this:

all_oatmeal = Cookie.objects.filter(cookie_type='oatmeal')
That’s very cool and expressive. However, what if our search criteria depend somehow on user input? For example, what if we have a search form with multiple fields, but only want to search by the fields that a user entered something into?

We could have a series of convoluted if/else statements to determine which variables were set and have a corresponding .filter() call for each possibility, but that would be dumb, convoluted, and hard to read later. Also, dumb.

Instead, we can use an alternative way of passing keyword arguments provided by Python (details here): putting ** in front of a dictionary being passed to a function makes Python unpack the dictionary and pass the pairs as keyword arguments to the function. Using that technique, we can arbitrarily construct a dictionary of the search parameters, then pass it to a single .filter() call at the bottom.

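An over-simplified example, in which I assume that our form fields match up exactly with model properties:

# Keep only the fields the user filled in. Build a new dict rather than
# deleting from form_data while looping over it, which raises RuntimeError.
search_params = dict((key, value) for key, value in form_data.items() if value != '')
wanted_cookies = Cookie.objects.filter(**search_params)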
I’m sure there’s a more elegant way to do the empty value stripping too, but that’s not our focus (comments on the subject are welcome, though, for to make me smarter). The point is this: this technique allows for very clean, easy to read, efficient code.
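If you want to poke at the unpacking behavior without a Django project handy, this plain-Python sketch shows the same thing. The bake function is a made-up stand-in for anything that takes keyword arguments, such as .filter().

```python
def bake(cookie_type="plain", cookie_size="2in", cookie_touched=False):
    """Stand-in for a function that takes keyword arguments."""
    return (cookie_type, cookie_size, cookie_touched)


form_data = {"cookie_type": "oatmeal", "cookie_size": "", "cookie_touched": True}

# Drop the fields the user left blank, then unpack what remains
search_params = {key: value for key, value in form_data.items() if value != ""}

unpacked = bake(**search_params)
spelled_out = bake(cookie_type="oatmeal", cookie_touched=True)
```

Both calls end up identical: any key missing from the dictionary simply falls back to the function’s default.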

Creating model instances

cookie_data = {
    'cookie_type': 'oatmeal',
    'cookie_size': '3in',
    'cookie_touched': True
}
c = Cookie(**cookie_data)
c.save()
Obviously, this applies to more than just Django – there are many many use cases where this trick can come in handy. Enjoy!

yourmuni now has instant stop lookup

January 28th, 2009

After another successful hack session on the shuttle ride home, yourmuni now lets you just look up any given stop on the spot without creating a bookmark. Now you can has the best of both worlds all on one site. Up next is adding other regions/agencies covered by nextbus (easy, probably done by the end of this week while I ride the shuttle) and an API.

Site is at http://yourmuni.appspot.com

Latest code is at Github

Ternary in Python: The Cover Beats the Original

January 24th, 2009

Like most newcomers to Python, I lamented the absence of a ternary operator. Some say the operator is hard to read, but I say those people need better reading glasses. If I want to assign a value based on a condition, I don’t think there’s anything clearer than an operator triad that exists solely for that purpose.

In any case, Python doesn’t have the standard = ? : ternary. It has a similar shorthand if-else construct, which I used begrudgingly. It looks like this:

result = value1 if condition else value2
THAT is hard to read. Code highlighting helps, but it’s still far from optimal.

This Kung Fu Is Weak!

Then I stumbled upon this random page while looking up exactly what the syntax was. The trick is to build a tuple on the fly and immediately select one of its elements using the condition as the index. The above code ends up looking like this:

# False picks index 0 (value2), True picks index 1 (value1)
result = (value2, value1)[bool(condition)]
How fucking awesomely elegant is that? As the post title suggests, I actually like this better than the original ternary.
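Two things are worth checking before leaning on this trick: which index a true condition picks, and the fact that both tuple elements get evaluated no matter what. The sketch below demonstrates both. The pick and noisy helpers are made up for the demonstration.

```python
def pick(condition, if_true, if_false):
    """Tuple-index selection: False picks index 0, True picks index 1."""
    return (if_false, if_true)[bool(condition)]


calls = []


def noisy(label):
    """Record that we were called, then return the label."""
    calls.append(label)
    return label


# The tuple is built first, so BOTH branches run before one is selected
via_tuple = (noisy("else-branch"), noisy("if-branch"))[True]
calls_after_tuple = list(calls)

del calls[:]

# The built-in if-else only evaluates the branch it needs
via_if_else = noisy("if-branch") if True else noisy("else-branch")
calls_after_if_else = list(calls)
```

So the tuple version is fine for plain values, but if either side is an expensive call or has side effects, the built-in if-else is the safer choice.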

As if to complement this trick, the bool type in Python actually evaluates to a numeric 0 or 1 (False picks index 0, True picks index 1). This makes the classic “Set the value to itself if it’s already set; set it to X otherwise” case incredibly easy:

value = (X, value)[bool(value)]
A frequent use case is one where you have a function argument that you want to default to a certain value calculated using another function. It can’t just be set in the function signature; you end up setting it to None in the signature (which, IMO, you should have done to begin with), then assigning it whatever the default value is if the caller doesn’t pass anything.

Here’s a real example: an implementation of Binary Search in Python. (I’ve been going through Introduction to Algorithms as part of my New Years resolution to become a better programmer; I hate writing pseudo code, so I’ve been writing Python for all the exercises)

UPDATE

Found an even better alternative for this case:

value = value or X
Example updated:
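def binSearch(needle, haystack, start=None, end=None):
    """Return the index of needle in the sorted list haystack, or None."""
    start = start or 0
    # "end or ..." would mistake a legitimate end of 0 for "not passed"
    if end is None:
        end = len(haystack) - 1

    if start > end:
        return None  # the search range is empty, so needle is not there

    midpt = start + (end - start) // 2
    median = haystack[midpt]
    if needle > median:
        # skip past midpt so the range always shrinks
        return binSearch(needle, haystack, midpt + 1, end)
    elif needle < median:
        return binSearch(needle, haystack, start, midpt - 1)
    else:
        return midpt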
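One thing to watch with value or X: it treats every falsy value as “not set”, which includes 0, the empty string, and the empty list. For an index argument, 0 is a perfectly good value. The two hypothetical helpers below show the difference between the or shortcut and an explicit None check.

```python
def default_with_or(end=None, fallback=9):
    """The shortcut: any falsy end is replaced by the fallback."""
    return end or fallback


def default_with_none_check(end=None, fallback=9):
    """The explicit version: only a missing (None) end is replaced."""
    if end is None:
        return fallback
    return end
```

Passing end=0 to the first returns the fallback, while the second returns 0. When falsy values can never be legitimate, the shortcut is fine.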

Enjoy.

yourmuni makes commuting easier

January 17th, 2009
what PH's "to work" bookmark would look like

I am proud to present my latest app yourmuni. It is a cross between momuni.com and Paul Hammond’s minimuni. Its purpose is to make it easier for people to get to and from places they frequent, such as jobs, gyms, favorite spots, and bootycalls. yourmuni lets you define bookmarks which represent collections of transit stops, and then view the bus/train arrival information for each bookmark on a single page. For example, if Paul didn’t already have his highly personalized minimuni app, he could log onto yourmuni, define a “To Work” bookmark, and assign to it the same stops that he currently scrapes. See the screenshot on the right for an example.

While it’s obviously not a “disruptive” innovation, I think it’s a nice incremental improvement on what most people do, which is look up multiple routes using momuni or nextbus.com while walking out the door. I know I’ve been using it, and it has saved me a tremendous amount of time/clicking around on my iPhone, looking like an idiot.

Though yourmuni was developed with my iPhone in mind, it appears to work just fine on most phones.

Still on the burner:

  • Instant stop lookup (ala momuni)
  • using other agencies that nextbus covers (including ones outside of NorCal)
  • deleting stops from bookmarks
  • better instructions while setting up bookmarks
  • cleaning up some code

yourmuni was demoed at the January Django Meetup, and everyone seemed to like it. I was very flattered by the positive feedback, since it’s a rather simple app.

Technical Details

yourmuni is written using the latest Django at the time of the start of the project, which was r9768. My previous post about getting the latest Django to work on the Google App Engine was the result of me setting yourmuni up on said App Engine, which is where it now lives. The source is on github. It’s far from perfect, as it was my first real Django/Python project, and I am aware of several precise places in the code that could use a minor rewrite. However, here are the parts that I put lots of thought into, and that I think might be useful to others.

App Engine userRequired Decorator

Since the login_required decorator from django.contrib is useless when using the App Engine, I wrote my own, which checks to see if the user is logged in and, if not, redirects them to the Google Accounts login page, while saving the URL they were trying to access as the callback URL. Here’s the source for all to enjoy (gist here):

from django.http import HttpResponseRedirect
from google.appengine.api import users

def userRequired(fn):
    """decorator for forcing a login"""
    def new(*args, **kws):
        user = users.get_current_user()
        if not (user):
            # the request is the first positional argument of a view
            r = args[0]
            return HttpResponseRedirect(users.create_login_url(
                                            r.build_absolute_uri()))
        else:
            return fn(*args, **kws)
    return new
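The same pattern can be tried out without Django or the App Engine SDK. In the sketch below, the decorator factory, the get_user callable, and the string “redirect” are all made-up stand-ins. The point is only to show the wrapper forwarding its positional and keyword arguments to the wrapped view.

```python
import functools


def login_required_sketch(get_user, login_url):
    """Hypothetical decorator factory mirroring the login check."""
    def decorator(view):
        @functools.wraps(view)  # keep the view's name and docstring
        def wrapper(*args, **kwargs):
            if get_user() is None:
                # Stand-in for returning an HTTP redirect response
                return "redirect:" + login_url
            # Forward everything untouched to the real view
            return view(*args, **kwargs)
        return wrapper
    return decorator


# Usage with a fake "session"
session = {"user": None}


@login_required_sketch(lambda: session["user"], "/login")
def show_bookmark(request, name="home"):
    return "bookmark:%s:%s" % (request, name)
```

functools.wraps is optional, but without it every decorated view reports its name as wrapper, which makes tracebacks harder to read.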

Encapsulating Slug Generation in Form Code

Since I only ask for one field (“Description”) when creating a bookmark and place no restrictions on that field (I want it to look like whatever the user wants to see in the interface), I need some way to generate an identifier for the bookmark. I could just give it a numeric or hash identifier, but then it would be useless to the user in terms of seeing it in their browser history (I want to allow the user to jump straight to the bookmark they want if their browser shows it as an option after they type ‘y’). I needed to create a slug. I could accept the “Description” field and then process it in my addBmark view, but instead I defined the form to have two fields, one optional, and used the contents of the description field to automatically populate the “name” field using Django’s built-in slugify method available in the template API (thanks to the folks at the Django Meetup who pointed this out). This allows me to encapsulate the validation within the form, so my view code looks very clean – I just have to call the is_valid() method on the form, and the form then has two properties that give me everything I need to create the bookmark. Here’s the code (full source here).

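from django import forms
from django.template.defaultfilters import slugify
from django.utils.translation import ugettext as _
from google.appengine.api import users
from google.appengine.ext import db

# Bmark is the app's own bookmark model (see the full source)

class AddBmarkForm(forms.Form):
    name = forms.CharField(max_length=50, required=False)
    description = forms.CharField(max_length=255, required=True)

    def clean_description(self):
        desc = self.cleaned_data['description']
        # derive the URL-friendly name from the free-form description
        name = slugify(desc).decode()
        q = db.Query(Bmark)
        q.filter('name =', name)
        q.filter('user =', users.get_current_user())
        if (q.get()):
            raise forms.ValidationError(
                _("A bookmark with that name exists already"))
        else:
            self.cleaned_data['name'] = name
            return desc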
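If you’re wondering what the slug step actually does to a description, here is a rough standard-library approximation. It is not Django’s slugify, just a sketch of the same idea (strip accents and punctuation, lowercase, hyphenate). The unique_slug helper is a made-up stand-in for the datastore lookup, using a plain set of names the user already has.

```python
import re
import unicodedata


def slugify_sketch(value):
    """Approximate a slug: ASCII only, lowercase, words joined by hyphens."""
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    return re.sub(r"[-\s]+", "-", value)


def unique_slug(description, taken):
    """Return a slug for description, or raise if it is already taken."""
    slug = slugify_sketch(description)
    if slug in taken:
        raise ValueError("A bookmark with that name exists already")
    return slug
```

For example, a description of “To Work!” becomes to-work, which is exactly the kind of thing a browser can autocomplete from history.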

Scraping Nextbus

For an unclear reason, nextbus does not have a clean, public API. My assumption is that they want to sell their data, but that's sort of pointless since they provide a free, publicly accessible website everywhere they provide service. It just sucks. So in order to make something better, I basically had to scrape that same publicly accessible website. It wasn't easy, as apparently nextbus hired a live bear to write their markup. Though all of their pages look almost identical, each has its own quirky combination of li, a, nobr, and font tags. I still managed to write a single scrape function to handle all of them, but it ended up being a bit more complex than it needed to be. Thank the powers that be for the BeautifulSoup library. The scrape code is here.
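The general approach to markup like that is to ignore the tags entirely and just collect the text that shows up between them. The sketch below does this with Python's built-in html.parser rather than BeautifulSoup, purely so it runs without extra installs. It is not the actual yourmuni scrape code, and the two sample snippets are invented to show the idea; they are not real nextbus markup.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect visible text chunks, ignoring whatever tags wrap them."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_text(markup):
    """Return the non-empty text fragments found in markup, in order."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()  # flush any text still buffered at the end
    return parser.chunks


# Invented samples: same information, different tag soup
SAMPLE_A = ("<li><a href='#'><nobr>Route 1</nobr></a> "
            "<font size='2'>5 minutes</font></li>")
SAMPLE_B = "<font><nobr><a href='#'>Route 1</a></nobr></font><li>5 minutes"
```

Both samples produce the same list of text fragments, which is what makes a single scrape function feasible even when every page wraps things differently.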

Misc Stuff

As I had mentioned before, I used the latest Django available to me at the start of development. Though I don't get to play with the cool ORM stuff that's been added recently, I did get to use some of the new template tags, such as the {% empty %} tag to specify the behavior in the event of an empty {% for %} loop (docs here, used here).

All in all, I hope this helps people get to and from wherever it is they're going easier. This is the first project I've actually launched in a very long time, and certainly the most useful one.