Postgres JSONB Fields in Django
I remember the first time I built user preferences into an app. At first, users just needed to be able to opt in or out of our weekly emails. "No big deal," I thought, "I'll just add a new field on the Users table." For a while, that was fine. A few weeks later, my boss asked me if we could let users opt into push notifications. Fine, that's just one more column on the database. Can't hurt, right?
You probably see where this is going.
Within months, my user table had 40 columns, and while Postgres can handle it, it gets pretty tricky for new devs to keep up with all of them. You can imagine it looked pretty similar to this settings screen of Quora.
Fortunately, there is rich support in Postgres for JSON fields, which can be very handy in situations like mine. Both JSON data types (json
and jsonb
) allow you to store entire objects or lists directly in your database. This means that you can store any number of user preferences in one column.
Why two types of Postgres JSON fields?
JSON support in Postgres gives you the flexibility of a document store database like Mongo with the speed and structure of a relational database. JSON support is powerful, but because it comes in two types (json
and jsonb
), it's helpful to understand which is the right choice for your application.
The json
data type was added in Postgres 9.2 and enhanced in 9.3. This new data type allowed you to store JSON directly in your database and even query it. The problem was that json
data was stored as a special kind of text
field, so it was slow to query.
Postgres introduced jsonb
in 9.4 to combat this issue. Unlike json
fields, jsonb
fields are stored in a binary structure rather than text strings. While this means that writes are slightly slower, querying from jsonb
fields is significantly faster. It also allows you to index jsonb
fields. This makes jsonb
the preferred format for most JSON data stored in Postgres, and the typical choice for Django applications.
Querying JSONB data in Postgres
The query syntax for accessing JSON in Postgres is not typical SQL. You have to use the specific JSON operators and functions, so queries on JSON data look different from other Postgres queries.
For example, if you stored the following data in a Postgres table called profiles
:
id | name | preferences |
---|---|---|
1 | Mike | {"sms": false, "daily_email": true, "weekly_email": true} |
2 | Lucy | {"sms": true, "daily_email": false, "weekly_email": false} |
3 | Harriet | {"sms": true, "daily_email": true, "weekly_email": true} |
And you wanted to query all users who have opted into your daily_email
, you'd write a query like this:
select * from profiles
where (preferences->>'daily_email')::boolean = true;
Which would give you back the rows for Mike
and Harriet
.
I'm pretty good with SQL, but using JSON operators always slows me down. Fortunately, Django offers support for JSONB fields, so you don't have to become an expert at querying JSON in Postgres.
Django support for JSONB fields
Since Django 1.9, the popular Python framework has supported jsonb
and several other Postgres-specific fields. Native Django support means that creating jsonb
fields, using them in your models, inserting data into them, and querying from them are all possible with Django's ORM. Let's take a look at how you can get started using jsonb
with Django.
Creating JSONB fields using migrations
Django's Postgres module comes with several field classes that you can import and add to your models. If you want to use a JSON field, import the JSONField
class and use it for your model's property. In this example, we'll call the field preferences
:
from django.db import models
from django.contrib.postgres.fields import JSONField
class Profile(models.Model):
name = models.CharField(max_length=200)
preferences = JSONField()
def __str__(self):
return self.name
Django only supports the jsonb
column type, so when you run your migrations, Django will create a table definition like this:
create table app_profile
(
id serial not null constraint app_profile_pkey primary key,
name varchar(200) not null,
preferences jsonb not null
);
Adding data to JSONB fields
Because JSON fields don't enforce a particular schema, Django will convert any valid Python data type (dictionary, list, string, number, boolean) into the appropriate JSON. For example, if you want to add a new row to the app_profile
table created above, you can run the following in your Django application:
from app.models import Profile
# Create a Profile with preferences
p = Profile(name="Tanner", preferences={'sms': False, 'daily_email': True, 'weekly_email': True})
p.save()
This will create a new user named Tanner
who will receive our daily_email
and weekly_email
, but no sms
messages.
Querying JSONB fields
Django uses the double underscore pattern from field lookups to query JSON object keys. For example, if you want to get all the Profiles for users who have opted into our daily_email
, you'd use the following code:
results = Profile.objects.filter(preferences__daily_email=True)
If you want to check the SQL query that Django runs, you can print it from the query
property on the results object:
print(results.query)
# Output:
SELECT "app_profile"."id", "app_profile"."name", "app_profile"."preferences"
FROM "app_profile" WHERE ("app_profile"."preferences" -> daily_email) = 'true'
As you can see, the query is slightly different from the one I manually wrote above (I cast daily_email
to a boolean
), but it accomplishes the same thing.
You can also filter records based on the keys they contain. For example, if some user accounts were created before you added the sms
option, you might want to find them and let the users know about the new option. You can use the isnull
field lookup on the sms
key in your JSON data:
results = Profile.objects.filter(preferences__sms__isnull=True)
There are many other ways to filter queries using JSON fields, so be sure to check out the official Django docs for more.
Limitations of JSONB fields with Postgres and Django
It's worth noting that jsonb
fields come with some drawbacks. I've already mentioned that it takes slightly longer to write data to jsonb
fields than json
because the JSON string must be converted to binary, but there are other reasons to avoid jsonb
fields.
First, if your data needs to enforce a strict schema, JSON may not be an ideal choice. While you can use check constraints to enforce the use of specific fields, this isn't natively supported in Django, so you'll need to write your own migrations to accomplish this.
A better way to address this shortcoming is by writing Django validation rules to enforce the structure you want. If you don't want to write the validation rules yourself, there's a popular package called jsonschema
that I'd recommend.
Another drawback to using JSON fields is handling changes to the shape of your data. If you want to add a new column to a database table in Postgres using Django, you simply update your model and run a migration. If you want to add a new field to a JSON column, it isn't quite as straightforward.
A pattern I've used before is to create a custom migration that loops through the affected records and updates each one individually. This naive method works for relatively small datasets, but it might not be a good idea if you need to update 1 million profiles in a production database. In that case, it might be better to write your code to handle the existence or absence of the key or run a batch update on the JSON object.
Conclusion
While JSON data types come with some drawbacks, they are useful in situations where you need more flexibility in your data structure. Thanks to Django's native support for jsonb
, you can get started using JSON data in your web applications without learning all the native Postgres query operators.
Next time you need more flexibility in your data model and want to benefit from the strengths of Postgres give jsonb
fields a try.
Share this article: If you liked this article you might want to tweet it to your peers.
About the author
Karl Hughes is a technology team leader and software engineer. He is currently the founder of Draft, where he helps create technical content for engineering blogs.