5mins of Postgres E22: Reducing AWS Aurora I/O costs with table partitioning & understanding partition pruning
Today, we're going to talk about using partitioning to reduce AWS Aurora I/O costs, pg_partman, and how partition pruning speeds up query performance.
Let's jump in.
In this blog post by two engineers from the AWS team (Sami Imseih and Yahav Biran), they describe how they've used partition pruning to reduce read I/O costs on Aurora.
You may not be familiar, but Aurora charges you extra for I/O operations. That means it can actually be pretty expensive if your use of the database is ineffective. If you're doing all sequential scans or you're doing a lot of expensive index scans that can actually cost you real money pretty quickly on Aurora.
The amount of data they're looking at in their blog post is about 33 terabytes of data. Pretty large database, and they ingest about 46 gigabytes per day.
They use pg_partman to handle partitioning for them - you don't have to use pg_partman - you can just use regular SQL statements to manage partitions in Postgres. But
pg_partman is one of the ways that you can make it easier to create new partitions or expire old partitions. You can read this article on the AWS docs on Managing PostgreSQL partitions with the pg_partman extension.
They also described a load simulator that they created. This load simulator simulates about a thousand writes per second, inserts and updates, and it also simulates about 10,000 reads per second. It has a cyclical pattern, so each day during business hours it's going to be more busy, at night it's going to be less busy.
They then walk through how you could partition such a workload. Clearly such a large table without partitioning is going to cause a couple of problems in Postgres anyway, with vacuum and such, but here they're just focused on the query performance and the I/O costs around that.
When you're partitioning tables in Postgres you're always going to make a decision around what do you want to partition by and which partition method you want to use. Here, they use range partitioning and they partition on the
When you're partitioning tables in Postgres you're always going to make a decision around what do you want to partition by and which partition method you want to use.
In order to upgrade their existing table to use partitioning, they create a new partitioned table and they're attaching the existing table as a partition child into the partitioned table.
Now, after they've done this upgrade here, they can actually see pretty noticeable performance improvements. Now, this is I think the most impressive number here:
Before they applied partitioning to this workload: they had $17,000 of I/O charges. That's a good amount of money.
After they've applied these optimizations: they've saved about $12,000, bringing the I/O charges to $5,600.
You're going to have a hard time finding such a good improvement on your own database, but certainly if you have a really large table that doesn't use partitioning yet, then you could get such benefits.
The other thing I want to mention here is partition pruning. Partition pruning is really what this optimization, this cost saving, built on. Partition pruning is a part of the Postgres planner. When you are querying your partition table, and you're specifying the partition key in the query, the Postgres planner is able to avoid a lot of extra work by determining that certain partitions don't need to be looked at.
The example that we have here in the Postgres documentation is, there is a measurement table, in the measurement table there is a
logdate column and the measurement table is partitioned by the logdate column. When you're querying with partition pruning enabled, Postgres is going to be able to avoid looking at partitions that don't match that logdate.
When you turn off partition pruning, which you can do with a connection local
SET command. When you run the
EXPLAIN command, in this case we're querying everything that's newer than January 1st, 2008, but we're getting all these old partitions from 2006, because Postgres doesn't actually do partition pruning, so it has to load all that extra data.
If you enable partition pruning, what Postgres is able to do is, it's going to be able to say, I only need to look at the partitions marked 2008 and later. That can save you a lot of time. This has nothing to with indexing, this is really just focused on the partition constraints themselves.
If you enable partition pruning, what Postgres is able to do is, it's going to be able to say, I only need to look at the partitions marked 2008 and later.
This is something that I've used many times myself with partitioned tables. If you're using functions around this, if you're using something
now(), or you are doing some kind of date-based math, it's always good to check the
EXPLAIN plan, because sometimes you think that partition pruning applies, but partition pruning doesn't actually apply because of functions not being stable or being volatile. It's always good to check with an EXPLAIN plan if you are actually seeing the partitions being removed from the plan as expected.
Last thing I want to mention, today Postgres 14.4 is released. In this new patch release, which is only for Postgres 14, an important bug is fixed around index corruption. We've previously talked about this, but if you have used
CREATE INDEX CONCURRENTLY or
REINDEX CONCURRENTLY on Postgres 14, then there are rare, but certainly possible cases where your B-Tree indexes and other index types might be missing index entries. Use the amcheck extension to verify whether you have this problem on your database.