GeoDjango and PostGIS in Django

In this article, I’ll introduce you to spatial data in PostgreSQL and Django. You’ll learn how to use PostGIS and GeoDjango to create, store, and manipulate geographic data (both raster and vector) in a Python web application.

Spatial data is any geographic data that contains information related to the earth, such as rivers, boundaries, cities, or natural landmarks. It describes the contours, topology, size, and shape of these features. Maps are a common method of visualizing spatial data, which is typically represented in vector or raster form. Along the way, you’ll see several use cases for spatial data that you’re likely to encounter as a software developer.

If you are interested in reading about PostGIS in Rails, I can recommend our PostGIS vs. Geocoder in Rails article on the pganalyze blog, where we compare PostGIS in Rails with Geocoder and highlight a couple of the areas where you'll want to (or need to) reach for one over the other.

Vector data vs. raster data

Vector data is a representation of the earth using points, lines, and polygons. A point is used to represent small, discrete areas using an “x” and “y” coordinate. Connected points create lines, which may be used to describe roads, streams, and networks. Polygons are formed from an enclosed connection of lines and represent features with an enclosed area like buildings, islands, and borders. Vector data types are more common in relational databases than raster data.
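As a minimal, library-free sketch of the three vector types, the snippet below models a point, a line, and a polygon as plain coordinate tuples. The coordinates and helper names are hypothetical and live on an arbitrary flat grid rather than at real-world locations.

```python
import math

# Illustrative stand-ins for the three vector types (hypothetical coordinates).
point = (2.0, 3.0)                               # a single (x, y) location
line = [(0.0, 0.0), (2.0, 1.0), (4.0, 1.0)]      # connected points
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0),   # closed ring: the last
           (0.0, 3.0), (0.0, 0.0)]               # coordinate repeats the first


def line_length(coords):
    """Sum the straight-line lengths of each segment in the line."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(ring):
    """Area of a closed ring on a flat plane, using the shoelace formula."""
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2


print(line_length(line))      # length of the two segments
print(polygon_area(polygon))  # area enclosed by the ring
```

A spatial database stores the same kinds of shapes, but attaches a coordinate system to them so that lengths and areas can be related to real places.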

Raster data, on the other hand, is a representation of geographic data in pixels. It typically refers to imagery of the earth taken from aircraft or satellites. It is usually stored as a grid of rows and columns with relevant metadata, such as measurements and resolution. Raster data is generally faster and less expensive to create than vector data.
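To make the "grid plus metadata" idea concrete, here is a small sketch in plain Python. The TinyRaster class, its field names, and the numbers are all assumptions made for illustration; real raster formats carry far more metadata.

```python
from dataclasses import dataclass


@dataclass
class TinyRaster:
    """A toy raster: a pixel grid plus the metadata needed to place it on a map."""
    width: int      # number of columns
    height: int     # number of rows
    origin: tuple   # (x, y) of the upper-left corner in map units
    scale: tuple    # (pixel width, pixel height); height is negative for north-up
    data: list      # pixel values, stored row by row

    def value_at(self, col, row):
        """Return the pixel value at the given column and row."""
        return self.data[row * self.width + col]

    def pixel_to_coords(self, col, row):
        """Return the map coordinates of a pixel's upper-left corner."""
        x = self.origin[0] + col * self.scale[0]
        y = self.origin[1] + row * self.scale[1]
        return (x, y)


raster = TinyRaster(
    width=3, height=2,
    origin=(100.0, 50.0), scale=(10.0, -10.0),
    data=[1, 2, 3, 4, 5, 6],
)
print(raster.value_at(2, 1))
print(raster.pixel_to_coords(2, 1))
```

The origin and scale are what turn a bare grid of numbers into geographic data: without them, a pixel has no position on the earth.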

Spatial data in Postgres with PostGIS

Whenever you need to answer questions about your geographic environment, such as “How far is the hospital?”, “Where is the closest store?”, “How high is that skyscraper?”, or “What is the fastest route?”, spatial data is likely to come into play.

Spatial data is also used in statistics for analyzing patterns and relationships between elements. For example, when analyzing the spread of a disease in a geographical area, hot zones can be identified and quarantined using spatial data. These data can be used to identify the source of an outbreak, the zoning of cities, and much more. Because more and more software applications depend on location, the way you manage and store spatial data is more critical than ever.

PostgreSQL, on its own, ships with only a few basic geometric types and has no notion of geographic coordinate systems. This is where PostGIS comes in. PostGIS is a free, open-source extension that adds spatial data capabilities to PostgreSQL databases. PostGIS allows you to store spatial data and use its library of functions to manipulate it. A database with PostGIS can store geographic coordinates, lines, and shapes and query them using spatial functions.

If you use a Database-as-a-Service provider such as Amazon RDS or Google Cloud SQL, PostGIS is likely to be available already. If you run your own server, check the PostGIS website for installation details. Once installed, enabling PostGIS is as simple as:

CREATE EXTENSION postgis;

Now, let's see how we can work with geospatial data in Django.

GeoDjango for spatial data in Django

GeoDjango is a Django module used for creating geographic applications. It can be used to manage a spatial database in Python. It comes integrated with Django, but can be used as a standalone framework as well. It aims to make it as easy as possible to create location-based web applications.
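Before any of the examples that follow will work, a Django project has to be pointed at the PostGIS backend and have GeoDjango enabled. The settings excerpt below is a sketch under assumptions: the database name, user, host, and the DB_PASSWORD environment variable are placeholders, and "app" is simply the app name used in the imports later in this article.

```python
# settings.py (excerpt): a sketch with placeholder connection details.
import os

INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "django.contrib.auth",
    "django.contrib.gis",  # GeoDjango itself
    "app",                 # the app that will hold the spatial models
]

DATABASES = {
    "default": {
        # The PostGIS-aware backend replaces the plain PostgreSQL one.
        "ENGINE": "django.contrib.gis.db.backends.postgis",
        "NAME": "geodjango_demo",                        # placeholder
        "USER": "geo_user",                              # placeholder
        "PASSWORD": os.environ.get("DB_PASSWORD", ""),   # placeholder variable
        "HOST": "localhost",
        "PORT": "5432",
    }
}
```

GeoDjango also relies on the GEOS, GDAL, and PROJ libraries being installed on the machine that runs Django.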

In the following sections, you’ll see four different use cases for GeoDjango. These will illustrate how you can create, store, and retrieve spatial data in a Django application backed by a Postgres database that uses PostGIS. You’ll also see how to use spatial data for common operations like finding the distance between two locations in space.

Saving Polygons Using GEOSGeometry

A polygon is a type of vector data: a connection of Points that form an enclosed shape. You can add a polygon to a spatial database in Django using GEOSGeometry.

The GEOSGeometry class comes from the GEOS API. It takes two arguments, the first argument being a string input which represents the geometry being saved, and a second optional argument, an SRID (spatial reference identifier) number. The SRID is a unique identifier that defines what coordinate system you would like to use and describes how to convert data to real-world locations. When performing geospatial functions such as finding distance and area data, it is important to use data with the same SRID as the one used in the database to ensure the correct result.
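To see why the SRID matters, it helps to look at the same location expressed in two coordinate systems. The sketch below applies the standard spherical Mercator formula by hand to convert longitude/latitude degrees (SRID 4326) into Web Mercator meters (EPSG:3857). The function name is hypothetical and this is not how GeoDjango performs the conversion internally; in a real application you would let the library transform geometries for you.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used by Web Mercator


def lonlat_to_web_mercator(lon, lat):
    """Convert SRID 4326 degrees to EPSG:3857 meters (spherical Mercator)."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# One corner of a building in San Antonio, Texas, in both systems.
lon, lat = -98.503358, 29.335668
x, y = lonlat_to_web_mercator(lon, lat)
print((lon, lat))  # degrees
print((x, y))      # meters: millions, not tens
```

The two pairs of numbers describe the same place, which is why mixing geometries with different SRIDs without transforming them produces meaningless distances and areas.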

To save a Polygon to a spatial database using GEOSGeometry, make sure a Polygon field is defined on your model. Suppose you have a Bank model that represents all the banks in a state with a PolygonField (poly) that outlines the physical real-life boundary and shape of a particular bank branch:

from django.contrib.gis.db import models

class Bank(models.Model):
    name = models.CharField(max_length=20)
    address = models.CharField(max_length=128)
    zip_code = models.CharField(max_length=5)
    poly = models.PolygonField()

    def __str__(self):
        return self.name

To store data on such a field with GEOSGeometry, you can run the following:

>>> from app.models import Bank
>>> from django.contrib.gis.geos import GEOSGeometry
>>> polygon = GEOSGeometry('POLYGON ((-98.503358 29.335668, -98.503086 29.335668, -98.503086 29.335423, -98.503358 29.335423, -98.503358 29.335668))', srid=4326)
>>> bank = Bank(name='Suntrust Bank', address='144 Monsourd Blvd, San Antonio Texas, USA', zip_code='78221', poly=polygon)
>>> bank.save()

Using the GEOSGeometry class, you have created a Polygon object that represents an outline of a certain Suntrust bank in San Antonio, Texas. Each coordinate given to the POLYGON parameter defines a “corner” of the building’s outline.
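Note that the first and last coordinates in the POLYGON string are identical: a polygon ring has to end where it starts. If your corner coordinates come from elsewhere, a small helper can assemble the string and close the ring for you. The helper name below is hypothetical; it only builds the text that would then be passed to GEOSGeometry.

```python
def corners_to_wkt_polygon(corners):
    """Build a WKT POLYGON string from (lon, lat) corners, closing the ring if needed."""
    if len(corners) < 3:
        raise ValueError("a polygon needs at least three corners")
    ring = list(corners)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # repeat the first corner to close the ring
    coords = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON (({coords}))"


# The four corners of the bank outline used above.
corners = [
    (-98.503358, 29.335668),
    (-98.503086, 29.335668),
    (-98.503086, 29.335423),
    (-98.503358, 29.335423),
]
wkt = corners_to_wkt_polygon(corners)
print(wkt)
```

The resulting string could be passed as the first argument to GEOSGeometry together with srid=4326, exactly as in the shell session above.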

Saving Models with Raster Fields Using GDALRaster

To work with raster data, your model needs a RasterField, the field type used for storing rasters. Raster support has long been part of PostGIS, but as of PostGIS 3.0 it has been split out into a separate extension. After installation, make sure the extension is enabled in your database by running:

CREATE EXTENSION postgis_raster;

Now, suppose you have a model called Elevation with a raster field on it. The Elevation model would represent the vertical and horizontal dimension of different surfaces, and the RasterField on it (rast, as seen below) would be a field that takes in an abstracted raster object describing the elevation. For example, it could be a satellite mapping of the terrain of a hill:

from django.contrib.gis.db import models

class Elevation(models.Model):
    name = models.CharField(max_length=100)
    rast = models.RasterField()

The RasterField stores a GDALRaster object. GDALRaster wraps a raster data source, such as a GeoTIFF file, and lets you read and modify it. It can be instantiated with two inputs. The first parameter can be a string representing a file path, a dictionary describing a new raster, or a bytes object containing the raster data. The second parameter specifies whether the raster should be opened in “write mode.” If you don’t use write mode, you cannot modify the raster data.

Below, GDALRaster takes in the raster.tif file, reads it as a file object and abstracts it into a GDALRaster object that can be stored in the model’s RasterField:

>>> from django.contrib.gis.gdal import GDALRaster
>>> rast = GDALRaster('/path/to/raster/raster.tif', write=True)
>>> rast.name
'/path/to/raster/raster.tif'

>>> rast.width, rast.height # this file has 163 by 174 pixels
(163, 174)

>>> topography = Elevation(name='Mount Fuji', rast=rast)
>>> topography.save()

In this way, you can store a raster’s .tif image file representing the terrain of Mount Fuji.

A new raster can also be created from raw data using a Python dictionary. The dictionary must specify at least width, height, and srid, and may include further parameters such as name, origin, scale, and bands. Below, you can see how to define a new raster that describes a canyon with a width and height of 10 pixels and a single band, that is, one layer of data in the raster:

>>> rst = GDALRaster({'width': 10, 'height': 10, 'name': 'canyon', 'srid': 4326, 'bands': [{"data": range(100)}]})
>>> rst.name
'canyon'
>>> topography = Elevation(name='Canyon', rast=rst)
>>> topography.save()
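The band's data is passed as one flat sequence, so its length has to match width times height. The plain-Python sketch below shows how such a flat sequence lines up with the pixel grid, assuming row-by-row ordering; the helper name is hypothetical and is not part of GDALRaster.

```python
def band_rows(width, height, data):
    """Split a flat sequence of pixel values into rows (row-by-row order assumed)."""
    values = list(data)
    if len(values) != width * height:
        raise ValueError(
            f"expected {width * height} values, got {len(values)}"
        )
    return [values[r * width:(r + 1) * width] for r in range(height)]


# Same shape and data as the canyon raster above: 10 x 10 pixels, values 0..99.
rows = band_rows(10, 10, range(100))
print(rows[0])   # first row of pixels
print(rows[-1])  # last row of pixels
```

Checking the length up front catches the most common mistake when building rasters by hand: a data sequence that does not fill the declared grid.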

Searching for Points in Space Using Geometry Lookups

Geometry Lookups help you find points, lines, and polygons within another geometry. For example, you can use geometry lookups to determine if a point lies within a polygon's surface.

First, create a Country model defined as follows:

from django.contrib.gis.db import models

class Country(models.Model):
    name = models.CharField(max_length=50)
    area = models.IntegerField()
    pop2005 = models.IntegerField('Population 2005')
    fips = models.CharField('FIPS Code', max_length=2, null=True)
    iso2 = models.CharField('2 Digit ISO', max_length=2)
    iso3 = models.CharField('3 Digit ISO', max_length=3)
    un = models.IntegerField('United Nations Code')
    region = models.IntegerField('Region Code')
    subregion = models.IntegerField('Sub-Region Code')
    lon = models.FloatField()
    lat = models.FloatField()

    # GeoDjango-specific: a geometry field (MultiPolygonField)
    mpoly = models.MultiPolygonField()

    # Returns the string representation of the model.
    def __str__(self):
        return self.name 

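Country represents a table that stores the boundaries of the world's countries. Next, you can use GeoDjango to check whether a particular Point falls within the mpoly field of any country in the database. The Point below uses a different SRID than the mpoly field (which defaults to 4326); GeoDjango transforms it automatically when building the query: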

>>> from app.models import Country
>>> from django.contrib.gis.geos import Point
>>> point = Point(954158.1, 4215137.1, srid=32140)
>>> Country.objects.filter(mpoly__contains=point)
<QuerySet [<Country: United States>]>

You can also do a spatial lookup to determine if a point is inside a particular country. Run the code below to define a Point object that represents a location in Valdagrone, San Marino. Then, you can search for this Point using the contains method:

>>> san_marino = Country.objects.get(name='San Marino')
>>> pnt = Point(12.4604, 43.9420) # Valdagrone, San Marino
>>> san_marino.mpoly.contains(pnt)
True
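Conceptually, a containment check asks whether a point lies inside a ring of coordinates. The sketch below illustrates the idea with the classic ray casting (even-odd) test in plain Python. It is a teaching aid rather than the algorithm GEOS or PostGIS actually uses, it does not treat points exactly on the boundary specially, and the bounding box is a hypothetical rough rectangle around San Marino, not its real border.

```python
def point_in_polygon(x, y, ring):
    """Even-odd ray casting: count how many edges a ray going right from (x, y) crosses."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical rough box around San Marino as (lon, lat) pairs, closed ring.
rough_box = [
    (12.40, 43.89), (12.52, 43.89), (12.52, 44.00),
    (12.40, 44.00), (12.40, 43.89),
]

print(point_in_polygon(12.4604, 43.9420, rough_box))  # Valdagrone
print(point_in_polygon(12.4964, 41.9028, rough_box))  # a point in Rome
```

In practice you would leave this to the database or to GEOS, which handle multipolygons, holes, and boundary cases that this sketch ignores.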

Calculating the Distance Between Points

Finally, GeoDjango can be used to calculate the distance between two points. Assuming you know two point coordinates and want to find the distance between them, you could run the following in your Python shell:

>>> from django.contrib.gis.geos import GEOSGeometry
>>> point1 = GEOSGeometry('SRID=4326;POINT(-167.8522796630859 65.55173492431641)').transform(900913, clone=True) # Tin City, Alaska
>>> point2 = GEOSGeometry('SRID=4326;POINT(-165.4089813232422 64.50033569335938)').transform(900913, clone=True) # Nome, Alaska
>>> distance = point1.distance(point2) # in projected (Web Mercator) meters
>>> distance / 1000 # in Kilometers
388.3890308954561

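This example uses the transform method to convert the Point coordinates from latitude/longitude decimal degrees (SRID 4326) into a projected coordinate system whose units are meters. SRID 900913 is the legacy, unofficial code for the Web Mercator projection now registered as EPSG:3857. Keep in mind that Web Mercator stretches distances as you move away from the equator, so at Alaska's latitude the result above is considerably larger than the true ground distance. For accurate measurements, use a projection suited to the region or a geodesic calculation.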
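Because Web Mercator inflates distances by roughly 1 / cos(latitude), a projected result this far north is more than twice the real distance. As a sanity check, the sketch below computes the great-circle distance between the same two coordinates with the haversine formula in plain Python. The function name is hypothetical, and treating the earth as a sphere with a mean radius is an approximation.

```python
import math


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0088):
    """Great-circle distance between two lon/lat points on a spherical earth."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


tin_city = (-167.8522796630859, 65.55173492431641)  # (lon, lat)
nome = (-165.4089813232422, 64.50033569335938)      # (lon, lat)

great_circle_km = haversine_km(*tin_city, *nome)
print(great_circle_km)

# Ratio between the Web Mercator figure from the shell session and this value.
print(388.3890308954561 / great_circle_km)
```

The ratio printed at the end should be close to 1 / cos(65°), which shows that the gap comes from the projection rather than from the coordinates.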

To illustrate a more Django-specific example, you could create a model for cities in the United States that looks like this:

from django.contrib.gis.db import models

class Cities(models.Model):
    feature = models.CharField(max_length=20)
    name = models.CharField(max_length=30)
    county = models.CharField(max_length=20)
    state = models.CharField(max_length=20)
    the_geom = models.PointField()

    # Returns the string representation of the model.
    def __str__(self):
        return self.name 

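To calculate the distance between the cities Point Hope and Point Lay, you can use the model like this. The result is in projected Web Mercator meters, which overstate the true ground distance this far north: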

>>> from app.models import Cities
>>> pt_hope = Cities.objects.get(name='Point Hope')
>>> pt_lay = Cities.objects.get(name='Point Lay')
>>> pt_hope_meters = pt_hope.the_geom.transform(900913, clone=True)
>>> pt_lay_meters = pt_lay.the_geom.transform(900913, clone=True)
>>> pt_hope_meters.distance(pt_lay_meters)
594946.4349305361

GeoDjango also provides distance lookups for querysets: distance_lt, distance_lte, distance_gt, distance_gte, and dwithin. For example:

>>> from django.contrib.gis.geos import Point
>>> from django.contrib.gis.measure import D
>>> pnt = Point(-163.0928955078125, 69.72028350830078) # Point Lay
>>> # find all cities within 7 kilometers of Point Lay
>>> nearby = Cities.objects.filter(the_geom__distance_lte=(pnt, D(km=7)))
>>> # find all cities 20 miles or more away from Point Lay
>>> far_away = Cities.objects.filter(the_geom__distance_gte=(pnt, D(mi=20)))

In this way, you can use GeoDjango to find the distance between two models having location points or two raw point objects. Combining this method with vector or raster data about roads, you could build complex distance calculations for driving, walking, or biking into your application.

More about GeoDjango and PostGIS

Spatial data has many important real-world use cases. In this post, you’ve seen how PostGIS and GeoDjango can help you use spatial data to build location-aware web applications, but there’s still much more to learn about the topic. Be sure to check out the PostGIS Introduction Documentation and GeoDjango API for more information and examples.
