5mins of Postgres E18: Partition pruning, prepared statements and generic vs custom query plans
In today’s episode 18, we'll talk about partition pruning and prepared statements in Postgres and run a performance test using the pgbench tool.
Let's dive in.
In this blog posts by Amit from the EDB team, he talks about how prepared statements and partitioned tables don't behave well with another. For context, a prepared statement is when you say
PREPARE and then you give it a name. You say
PREPARE this particular query here and then later on you execute the same statement multiple times using the
Some frameworks, for example Ruby On Rails and others, do that automatically for you. You may be using prepared statements without knowing about it. What Postgres does when it uses prepared statements is that during the
PREPARE step, it will run parse-analysis on a query and remember the query parse tree in a special cache.
EXECUTE step the connection that you have open to your Postgres database will look up that parse tree from that cache and will either
- create a new plan or
- re-use an existing plan and then execute that plan to get the query output.
Today, this behavior only exists on a single connection, there is no cross connection caching here. Within the same connection, prepared statements help you avoid overhead, that happens when you execute the same query multiple times.
Prepared statements oftentimes have a number of parameters, because most of the time, in your application, you're not going to execute the exact same value, the exact same query all the time, instead you're gonna execute slightly different constant values.
Typically you would see a $1 or $2 replacement value in your prepared statements. Those then get passed in during the
The benefit of this two-stage processing, the separation between
EXECUTE, is that a lot of CPU cycles can be saved by not doing the parse-analysis step, and not doing planning when it can be avoided.
Amit ran a performance test using the pgbench tool, and he used the
protocol parameter to specify how pgbench is sending its statements. In the simple protocol, that's sending individual statements a hundred thousand times. And then the same thing with prepared statements, also a hundred thousand times. A pretty simple pgbench
SELECT, selecting a value from a table, and filtering by one value.
The difference here is actually quite noticeable! If you have the query run using the two-stage PREPARE/EXECUTE protocol, then it takes 0.031 milliseconds per query, versus if you used the simple protocol, where it's just a single query, it takes 0.058 milliseconds.
On more complex queries this is actually going to be less pronounced. Because in more complex queries, most of the time, it will be more expensive to actually execute the query versus to plan it or to parse it. Still this can oftentimes be quite noticeable for applications that have rather simple queries.
In more complex queries, most of the time, it will be more expensive to actually execute the query versus to plan it or to parse it.
Inside Postgres, the plan caching logic tries to determine for this given parse tree: do I have a plan that matches this query? Can I avoid replanning?
The plan that's kept in the cache is called a generic plan. You can also think about this as being parameter value independent. What that means is, if your $1 is "a", and your $1 is "b", in both cases, the generic plan can be used to fulfill that query. Versus what's called a custom plan, is a plan that's dependent on the individual input values. Postgres has logic to decide between those two cases.
But there is a specific problem here. The main thing to know is that there is what's called partition pruning. There is plan time partition pruning, which occurs at query planning.
Let's say you have partitions based on a time range and you're querying for the last 24 hours worth of data, but you have 30 days worth of partitions. If you passed in a fixed value and you're using the simple query protocol, Postgres will be able to say, " Last 24 hours? Maybe I need yesterday's partition, maybe I need today's partition, but I don't need these other 28 days." During plan time, it can already throw away these extra partitions.
If that doesn't work, then there's also a separate mechanism where Postgres can do this during runtime. It doesn't necessarily mean that it has to actually execute all that work, but that mechanism that occurs during runtime is more expensive because there's a lot more work that's being done where it hasn't yet discarded all these extra values.
Now, what Amit is benchmarking here is a partition table and he's using prepared statements. When you increase the number of partitions it becomes more expensive the more partitions you have. The reason is because Postgres will use the generic plan that's cached, and so it will actually get slower the more partitions you have.
You can force Postgres to replan, to do plan time partition pruning, by passing in the special
plan_cache_mode = force_custom_plan setting. What that will do, each time when you run an
EXECUTE, it will replan, and will do the plan time partition pruning.
That's actually a pretty significant performance difference. You're losing the plan caching benefits of prepared statements, but you're getting that plan time partition pruning. After you cross a handful of partitioned tables, definitely if you are in the hundreds of partitioned tables, then it makes a big difference.
How do we get that to work better? Amit has a patch that's currently in the Postgres Commitfest, that will not make it into Postgres 15, because Postgres 15 is already in feature freeze, but I think it has a good chance of making it into Postgres 16.
Until then, what I would recommend is if you run into this problem with prepared statements, you can use that special setting to force custom plans for prepared statements that use a lot of partitioned tables.
Last but not least, I do want to mention that Postgres 15 beta 1 is out today and we'll be sure to talk more about that in the next episodes.