This post describes how to play with the UK postcode database that recently turned up on WikiLeaks using GeoDjango
You will need the following pieces of free software:
-
PostgreSQL (MySQL does not provide sufficient geospacial features) — I will be using Postgres 8.4.1; please adjust your paths for your version and/or distributor.
-
PostGIS spacial database extensions — I will be using version 1.4.0.
-
Django — I will be using version 1.1. This version includes the GeoDjango framework.
-
GEOS geometry engine — I will be using version 3.0.0 from the
libgeos-c1
package in Debian unstable.
You will also need a copy of the leaked postcode data. Usage of this data is almost certainly a copyright violation.
Getting started
First, create a Django project and application for this example:
$ django-admin startproject ukpostcodes
$ cd ukpostcodes
$ ./manage.py startapp postcodes
Next, edit settings.py
:
- Configure
DATABASE_*
. You must use thepostgresql_psycopg2
PostgreSQL database driver. - Add
django.contrib.gis
&ukpostcodes.postcodes
toINSTALLED_APPS
.
If you have not already done so, you must install PostGIS in your database:
$ sudo -u postgres createlang plpgsql -d <databasename>
$ sudo -u postgres psql <databasename> -f /usr/share/postgresql/8.4/contrib/postgis.sql
$ sudo -u postgres psql <databasename> -f /usr/share/postgresql/8.4/contrib/spatial_ref_sys.sql
Modelling a postcode
Now define a simple model to store imported postcodes:
from django.contrib.gis.db import models
class Postcode(models.Model):
name = models.CharField(max_length=8, db_index=True)
location = models.PointField()
objects = models.GeoManager()
- We import models from
django.contrib.gis.db
instead ofdjango.db
. - We override the default
objects
manager with GeoDjango'sGeoManager()
so we can perform spacial queries.
Save this model in postcodes/models.py
. We can now create this model in our database using:
$ ./manage syncdb
Importing postcode data
We will use a management command to import the data. First, create the necessary modules with:
$ mkdir -p postcodes/management/commands
$ touch postcodes/management/commands/__init__.py
$ touch postcodes/management/__init__.py
Next, save the following code in postcodes/management/commands/import_postcodes.py
:
import sys
import csv
from django.contrib.gis.geos import Point
from django.core.management.base import BaseCommand
from postcodes.models import Postcode
class Command(BaseCommand):
def handle(self, *args, **options):
Postcode.objects.all().delete()
count = 0
for row in csv.reader(sys.stdin):
try:
name = row[0].upper().strip().replace(' ', ' ')
location = Point(map(float, row[13:15]))
except ValueError:
print "I: skipping %r" % row
continue
Postcode.objects.create(name=name, location=location)
count += 1
if count % 10000 == 0:
print "Imported %d" % count
Finally, we can import the data using:
$ bunzip2 /path/to/uk-post-codes-2009.bz2 | ./manage.py import_postcodes
This takes approximately one hour on my computer. If you plan to import the data multiple times, consult the Django documentation for faster methods of importing initial data.
Quering the postcode database
We can now query our postcode database. First, let's find the postcode for Buckingham Palace:
>>> from postcodes.models import Postcode
>>> palace = Postcode.objects.get(name='SW1A 1AA')
>>> palace
<Postcode: SW1A 1AA>
Next, let's find all postcodes within 1 mile. For this, we will need to construct a tuple representing the area we want to search in.
>>> from django.contrib.gis.measure import Distance
>>> area = (palace.location, Distance(mi=1))
>>> Postcode.objects.filter(location__distance_lte=area)
[<Postcode: SE1 7BF>, <Postcode: SE1 7SG>, <Postcode: SE1 7JA>, <Postcode: SE1 7PB>,
<Postcode: SE1 7PD>, <Postcode: SW1A 1DW>, <Postcode: SW1A 1EA>, <Postcode: SW1A 1EE>,
<Postcode: SW1A 1EF>, <Postcode: SW1A 1EG>, '...(remaining elements truncated)...']
>>> Postcode.objects.filter(location__distance_lte=area)
7024
To find "neighbour" postcodes, we can pass a Point
object to the .distance(..)
method on our queryset – this adds a field we can then sort on:
>>> Postcode.objects.distance(palace.location).order_by('distance')[:4]
[<Postcode: SW1A 1AA>, <Postcode: SW1E 6JP>, <Postcode: SW1E 6LA>, <Postcode: SW1E 6WG>]
Putting it all together
Let's pretend we are making an app to find pizza from an arbitrary user-submitted postcode. (We will assume we have parsed the user's postcode and have found its Postcode
object, but we will continue to use Buckingham Palace here.)
First, define the following model in postcodes/models.py
and re-run ./manage.py syncdb
:
class PizzaJoint(models.Model):
name = models.CharField(max_length=200)
postcode = models.ForeignKey(Postcode)
objects = models.GeoManager()
def __unicode__(self):
return self.name
Now, add some local pizza places:
>>> def add_pizza_joint(name, loc):
... code = Postcode.objects.get(name=loc)
... PizzaJoint.objects.create(name=name, postcode=code)
...
>>> add_pizza_joint('Pizza Hut', 'SW1E 6SQ')
>>> add_pizza_joint('Pizza Express', 'SW1E 5NA')
>>> add_pizza_joint('Pizza On The Park', 'SW1X 7LY')
If we were not using GeoDjango, we would have to be content with simply matching the postcode:
>>> PizzaJoint.objects.filter(postcode=palace)
[]
However, we can now find the closest pizza places:
>>> PizzaJoint.objects.distance(palace.location,
field_name='postcode__location').order_by('distance')
[<PizzaJoint: Pizza Hut>, <PizzaJoint: Pizza Express>, <PizzaJoint: Pizza On The Park>]
The .distance(..)
method also sets a distance
attribute on all the objects returned; this makes it easy to show the actual distance to the user:
>>> for x in PizzaJoint.objects.distance(palace.location,
field_name='postcode__location').order_by('distance'):
... print "%s (%d metres away)" % (x, x.distance.m)
...
Pizza Hut (499 metres away)
Pizza Express (509 metres away)
Pizza On The Park (902 metres away)
For more details, see the GeoDjango documentation.
Belorussian translation of this post