3 July 2008

New URLs

I spent much of last night adding pretty URLs to this site. What used to be a hideous construction:

http://mattryall.net/article.cgi?id=322

is now a beautiful, meaningful URL:

http://mattryall.net/blog/2008/07/coding-idioms

The most obvious effect for readers is that every item in my feed has probably shown up as new again. This happens because the URL of each item in the feed has changed. It should not happen again.

All the old URLs should continue to work indefinitely. If you stumble across an old link, you will be sent to the new URL with a permanent redirect.

What hasn’t been done yet is to make the URLs hackable, so you can’t yet visit /blog/2008/07/ and get all the articles for this month. That will be done sometime soon.

On the backend, it took a significant amount of work to implement this from scratch:

  • added a new column, slug, to the articles database table
  • added a field to the admin screen to edit the slug
  • set up a rewrite rule to push /blog/* to /article.cgi?url=* on the server
  • added a fetch_by_slug_and_date method to retrieve articles based on the URL
  • added a url() method to the article object so it ‘knows’ its own canonical URL
  • updated all the templates to use the new article.url() method instead of constructing a URL from the article ID
  • changed the article CGI script to send a permanent redirect to the canonical URL if the request URL isn’t the article’s canonical URL
  • added slugs to 244 articles
  • found the 20 references to the old URLs in my articles and updated them.

Phew!
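
To make the URL handling in the list above more concrete, here is a minimal JavaScript sketch of three of the steps: building an article's canonical URL, splitting a pretty URL into the parts needed for a lookup, and deciding when to send a permanent redirect. It is an illustration only, not the site's actual code. The function names articleUrl, parseArticlePath and respond, and the year, month and slug fields on the article object, are all assumptions made for the example.

```javascript
// Hypothetical sketch: none of these names come from the site's real code.

// Build the canonical path for an article, e.g. /blog/2008/07/coding-idioms.
// Assumes the article object has numeric year and month fields and a slug.
function articleUrl(article) {
  var year = String(article.year);
  var month = String(article.month).padStart(2, '0');
  return '/blog/' + year + '/' + month + '/' + article.slug;
}

// Split a pretty path into the parts needed to look the article up.
// Returns null when the path is not in /blog/YYYY/MM/slug form.
function parseArticlePath(path) {
  var match = /^\/blog\/(\d{4})\/(\d{2})\/([a-z0-9-]+)\/?$/.exec(path);
  if (!match) {
    return null;
  }
  return { year: Number(match[1]), month: Number(match[2]), slug: match[3] };
}

// Decide how to respond once the article has been found: a 301 to the
// canonical URL if the request used any other URL, otherwise a normal 200.
function respond(requestPath, article) {
  var canonical = articleUrl(article);
  if (requestPath !== canonical) {
    return { status: 301, location: canonical };
  }
  return { status: 200, location: null };
}
```

With an article such as { year: 2008, month: 7, slug: 'coding-idioms' }, a request for the old article.cgi address would get a 301 pointing at the new path, while a request for the new path would be served directly. Comparing against a single canonical URL means any alternative spelling of the address ends up in the same place.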

Perhaps most useful to other people is my short snippet of JavaScript, which automatically generates a slug from a page title. I sometimes edit the result by hand, but the script gives me a good starting point.

// title holds the page title; the i flag is not needed after toLowerCase()
var slug = title.toLowerCase().replace(/[^a-z0-9 -]/g, '').replace(/ /g, '-');
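
The one-liner above turns every space into its own hyphen, so a title with double spaces, or with a dash surrounded by spaces, produces a run of hyphens. Below is a slightly more defensive variant, offered as a sketch rather than as the script this site uses. The function name slugify is an assumption, and the extra trimming and collapsing steps are additions that the original snippet does not perform.

```javascript
// Hypothetical variant of the slug snippet; slugify is an illustrative name.
function slugify(title) {
  return title
    .toLowerCase()
    // Drop everything except lowercase letters, digits, spaces and hyphens.
    .replace(/[^a-z0-9 -]/g, '')
    // Remove leading and trailing whitespace.
    .trim()
    // Collapse each run of spaces and hyphens into a single hyphen.
    .replace(/[ -]+/g, '-')
    // Strip any hyphens left at either end.
    .replace(/^-+|-+$/g, '');
}

console.log(slugify('Coding idioms'));
console.log(slugify("  What's new -- in 2008?  "));
```

Like the original, this only keeps ASCII letters and digits, so accented or non-Latin characters are simply removed. The result is still meant as a starting point to edit by hand.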