Check out our latest eBook: "Efficient Search in Rails with Postgres". We've updated our Terms of Service and Privacy Policy, effective Jan 30, 2021.

Creating Custom Postgres Data Types in Django

Josh AllettoBy Josh Alletto
December 15, 2020

Postgres Custom Data Type Example

Postgres allows you to define custom data types when the default types provided don't fit your needs. There are many situations where these custom data types come in handy.

For example, if you have multiple columns in several tables that should be an int between 0 and 255, you could use a custom data type so that you only have to define the constraints once. Or, if you have complex data - like metadata about a file - and you want to save it to a single column instead of spreading it across several, custom data types can help.

No matter how you decide to define your datatype, Django has the functionality to allow you to map custom column data to model attributes. You can achieve this by extending the Django field class. In this walkthrough, we'll see how to create custom types in Postgres and then use them in Django to ensure consistent data types across your application. We will do this by walking you through an example project.

Custom Domains in Postgres

There are several different kinds of custom data types in Postgres, including enums and range types. The two we’ll use in our project today are called domain types and composite types.

First, let’s take a look at domain types. Domains are a way of adding restrictions to an existing type so that it can be reused in columns across tables. They are particularly useful for columns like email addresses, phone numbers, or street addresses, where you might find yourself repeating the same checks over and over. A custom domain allows you to define those checks once and then reuse them making them easier to manage and maintain.

For our example project, we'll start by creating a custom data type that performs a check to ensure a string doesn't contain any spaces:

CREATE DOMAIN string_no_spaces as VARCHAR NOT NULL CHECK (value !~ '\s');

Now we can use this type on as many tables or in as many columns as we like. For example say we don’t want to allow spaces in user_names for a chat app:

CREATE TABLE users (
  id serial primary key, 
 user_name string_no_spaces
);

Now if you try to add a value with a space, Postgres will throw an error:

INSERT INTO users(user_name) VALUES ('I am a      bad user name');
-- ERROR:  value for domain string_no_spaces violates check constraint "string_no_spaces_check"

We can also reuse this domain in the definition of another domain. For example:

CREATE DOMAIN email_with_check AS string_no_spaces NOT NULL CHECK (value ~ '@');

CREATE TABLE email_addresses (
  user_id integer,
  email email_with_check
);

INSERT INTO email_addresses(email) VALUES ('josh @gmail.com');
-- ERROR:  value for domain email_with_check violates check constraint "string_no_spaces_check"

INSERT INTO email_addresses(email) VALUES ('joshgmail.com');
-- ERROR:  value for domain email_with_check violates check constraint "email_with_check_check"

Here, we've created a new check to ensure an email contains @ and we've used string_no_spaces as our base type. This allows us to inherit the no spaces check. Now data of datatype email_with_check must contain @ and cannot contain spaces.

Composite Types in Postgres

The second kind of custom data type we’ll look at today is called a composite type. A composite type is essentially a group of data that can be held in a single column. Composite types can be helpful if you have lists of data that you don't want to be spread over multiple columns. Perhaps this data only makes sense when grouped together like the dimensions of a package.

RGB color data is another good example because it doesn't make much sense on its own - 255 is just an int - but coupled with some labels and two other numbers (red: 255, green: 0, blue: 0), it becomes the color red. Every time we access a color, we'll want to have all three of these values returned, so it saves us from having to query multiple columns for a group of data that is only meaningful when combined.

Let's start by creating a new RGB color value type:

CREATE TYPE rgb_color_value as (
  red integer,
  green integer,
  blue integer
);

Next, we can create a new table and use both our domain and custom data type for the columns:

CREATE TABLE colors (
  name string_no_spaces,
  rgb rgb_color_value
);

INSERT INTO colors(name, rgb) VALUES('pink', (252,15,192));

SELECT * FROM colors;

 name | rgb   
------+---------
 pink | (252,15,192)

We can even access the individual values. For example, if all we want is the green value:

SELECT (rgb).green FROM colors;

  green 
 ------
   15

Custom Types in Django

Let's use our string_no_spaces domain and our rgb_color_value composite type to create a Django model to define a color. rgb_color_value is going to take the most work, so we'll start there and then come back to string_no_spaces.

Registering a type with psycopg2

We'll use the pyscopg2 database adapter in this example. I won't go into how to set it up here, but I recommend this tutorial. It does a good job covering the setup if you aren't familiar with it yet.

We'll need to start by registering and creating an adapter for our new type so that psycopg2 knows how to handle it. After we register it, psycopg will return values from the database as a named tuple.

from django.db import connection
from psycopg2.extras import register_composite

Rgb = register_composite(
  'rgb_color_value',
  connection.cursor().cursor,
  globally=True
).type

The above code will handle data coming to our app from the database, but we'll also need to tell psycopg what to do with data sent to the database. That's where the adapter comes in:

from django.db import connection
from psycopg2.extras import register_composite
from psycopg2.extensions import register_adapter, adapt, AsIs

Rgb = register_composite(
  'rgb_color_value',
  connection.cursor().cursor,
  globally=True
).type

def rgb_adapter(value):
  return AsIs("(%s, %s, %s)::rgb_color_value" % (
    adapt(value.red).getquoted(),
    adapt(value.green).getquoted(),
    adapt(value.blue).getquoted()
  ))

register_adapter(Rgb, rgb_adapter)

Now that psycopg knows about our new data type and how to handle it, we can create the same functionality for Django.

Representing composite types as a Python class

We want to be able to do things with our objects like this:

rgb = Rgb(255, 0, 0)

my_color_object.rgb = rgb 

my_color_object.save()

To do that, we'll need to start with a Python class that represents an RGB value.

class Rgb:
    def __init__(self, red, green, blue):
        self.red = red
        self.green = green
        self.blue = blue

We'll come back to this class in a bit, but first, we need to talk about fields.

Using Django fields

You are probably familiar with many of Django's built-in model fields like models.CharField or models.IntegerField. You've also probably noticed that many of these fields correspond to data types we often use in Postgres (varchar, int etc.).

For custom data types, Django allows us to create our fields and then use them in our models:

from django.db import models

class RgbField(models.Field):
    
    def db_type(self, connection):
        return 'rgb_color_value'

All custom fields inherit from models.Field. You can also inherit from existing fields like models.CharField (which itself inherits from models.Field) This is helpful if your custom type behaves similarly to an existing type. Since ours doesn't, we'll inherit directly from models.Field.

Next, we need to override three methods so that they will return instances of our Rgb class. The first, from_db_value() is called when data is loaded from the database. This is the method that will receive our named tuple we set up with Psycopg earlier. The second, to_python() gets called during deserialization. These two need to return an instance of the Rgb class.

The last method we need to override is get_prep_value, where we'll convert our Rgb object back into a tuple before handing it off to Psycopg to save to the database. When we're done, our field class should look like this:

class RgbField(models.Field):
  
  def from_db_value(self, value, expression, connection):
      if value is None:
          return value
      return Rgb(value.red, value.green, value.blue)

  def to_python():
      if isinstance(value, Rgb):
          return value

      if value is None:
          return value

      return Rgb(value.red, value.green, value.blue)

  def get_prep_value(self, value):
      return (value.red, value.green, value.blue)
  
  def db_type(self, connection):
      return 'rgb_color_value'

The checks I put in place above are suggestions from the Django docs.

Finally, we can create our model using our brand new Rgb field:

class Color(models.Model):
    rgb = RgbField()
    name = models.CharField()

But wait! Didn't we create a special string_no_spaces domain that we want to use for the name attribute?

Since this type is just a string with checks at the database level, all we need to do is create the field with the appropriate db_type method:

class StringNoSpacesField(models.Field):
    
    def db_type(self, connection):
        return 'string_no_spaces'

Now we can update our model and run our migrations:

class Color(models.Model):
    rgb = RgbField()
    name = StringNoSpacesField()

Let's confirm that everything is working as expected. In the python shell (I'm using shell plus), we'll create a new color:

>>> from colors.models import Rgb
>>> rgb = Rgb(255, 0, 0)
>>> c = Color.objects.create(name='red', rgb=rgb)
>>> c.rgb
<colors.models.Rgb object at 0x104e3d6d8>
>>> c.rgb.red
255

If you try to create a color with a name that has a space in it, you will get an error like this:

django.db.utils.IntegrityError: value for domain string_no_spaces violates check constraint "string_no_spaces_check"

Now, let's check the database and make sure everything got saved as the correct type:

customdt=# SELECT pg_typeof(rgb), pg_typeof(name) FROM colors_color;
    pg_typeof | pg_typeof     
-----------------+------------------
 rgb_color_value | string_no_spaces
(1 row)

From here, we could ensure that only numbers from 0 - 255 are entered by overriding the __ init __ method and adding checks at the Postgres level. We could also create a new type for storing the hex code for each color in addition to the RGB value.

Conclusion

In this article, we saw how to create new data types in Postgres and bring them into a Django application. We created a string_no_spaces type with CREATE DOMAIN to help us set up some database level checks on columns. We used CREATE TYPE to create a brand new composite data type called rgb_color_value that allowed us to group the data for a color value and save it to a single column.

We then registered our new data types with psycopg2 so that the database adapter knew how to handle them. Finally, we took a look at the Django Field class. We learned how to control values coming to and from our database adapter to ensure our custom data type matches its corresponding Python class for use inside of our Django application.

As mentioned above, you can find all resources talked about here on our resources repository on GitHub.

Share this article: If you liked this article you might want to tweet it to your peers.

About the Author

Josh is a former educator turned developer with a proven ability to learn quickly and adapt to different roles. In 2018 he changed careers from education to tech and has been excited to find that his communication and presentation skills have transferred over to his new technical career. He's always looking for a new challenge and a dedicated team to collaborate with.


Sign up for the pganalyze newsletter

Receive infrequent emails about interesting Postgres content around the web, new pganalyze feature releases, and new pganalyze ebooks. No spam, we promise.