Waiting for Postgres 17: Configurable SLRU cache sizes for increased performance

In today’s E104 of “5mins of Postgres” we discuss a recently committed change to the Postgres 17 development branch that allows configuring the size of the Postgres SLRU caches, as well as improvements to LWLock contention on SLRU related locks. We review the background on what SLRU caches do in Postgres, and historic performance challenges experienced at scale.

This commit by Alvaro Herrera in the Postgres 17 development branch combines two different pieces of work, one by Andrey Borodin and the other by Dilip Kumar, that will help Postgres scale much better to large workloads.

What are SLRU caches in Postgres?

Before we jump in, we should clarify what the SLRU caches are in Postgres. The header comment in slru.c, as updated by Alvaro, describes SLRUs as simple least-recently-used buffering for wrap-around-able permanent metadata.

So, the idea is that certain kinds of metadata in Postgres need to be maintained in addition to the regular data that you're storing in your tables. Examples of this would be if you're using MultiXacts, if you're using NOTIFY, or if you have information like commit timestamps that needs to be kept around for some time.

Postgres stores this information on disk, but also caches it in shared memory. This information often gets accessed quite frequently, so you want to make sure that those lookups don't incur too much of a performance penalty.

Surprising performance cliffs and GitLab's problem with subtransaction SLRUs

Over the years many people have been surprised by some of this behavior. So for example, a couple of years ago, Nelson Elhage wrote a good post on Postgres performance cliffs, and he specifically ran into a MultiXact problem where Postgres was suddenly behaving differently.

Larger Postgres installations have hit this as well. For example, GitLab's Postgres team ran into it with Postgres subtransactions specifically, and had a big initiative a couple of years ago to eliminate them. In their case, what they were seeing was surprising performance spikes where the system suddenly had much higher error rates than usual. When they looked at the wait events, they ultimately saw that this was related to the "SubtransControlLock", recently renamed to the "SubtransSLRU" lock. This is essentially an internal lock that controls access to the shared memory areas used for that particular kind of SLRU.

Back then they put a large amount of effort into figuring out how to avoid this problem. There were a couple of ways to resolve it: they could avoid using SAVEPOINTs, which is where these subtransactions come from, they could eliminate all long-running transactions, or they could apply Andrey Borodin's proposed patches to improve this particular part of Postgres. GitLab did not patch their own Postgres install, because they felt it was too risky. Instead, they had to do a lot of engineering work to avoid these `SAVEPOINT` calls.

Amazon Aurora's modifications for SLRU cache configuration

Now, similarly, the team at AWS also ran into this. And so a couple of years ago they actually added the ability to configure the MultiXact SLRU cache specifically and increase the cache sizes. This is essentially a variant of what Andrey Borodin had proposed on the Postgres mailing list.

But the Aurora team felt it was critical enough for some of the big production servers they are working with to have this configurable in Aurora.

How SLRU structure changes in Postgres 17

Now let's go back to the version that has now been committed to Postgres 17. This commit essentially covers two different aspects. First of all, the cache for each of these SLRUs is now divided into multiple banks.

The idea is that Postgres no longer treats the cache as one single unit, but works with one of these banks at a time. This makes it much more efficient to search the buffers and determine which buffers might be affected.

As part of the move to banks, we can now also control the amount of cache for each of the SLRUs. Additionally, each SLRU bank now has its own lock that controls access to it. This will increase scalability for many Postgres workloads.
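To make the bank idea more concrete, here is a minimal Python sketch of a banked page cache. This is not the Postgres implementation: the class name, the `read_page` callback, the bank size of 16 and the modulo mapping from page number to bank are all assumptions made for illustration. It only shows why a lookup touches a single bank and takes a single bank's lock.

```python
import threading
from collections import OrderedDict

BANK_SIZE = 16  # slots per bank; an illustrative assumption


class BankedSlru:
    """Toy page cache split into banks, each with its own lock and LRU order."""

    def __init__(self, n_buffers, read_page):
        if n_buffers <= 0 or n_buffers % BANK_SIZE != 0:
            raise ValueError("n_buffers must be a positive multiple of BANK_SIZE")
        self.n_banks = n_buffers // BANK_SIZE
        # Hypothetical callback standing in for reading a page from disk.
        self.read_page = read_page
        self.banks = [OrderedDict() for _ in range(self.n_banks)]
        self.locks = [threading.Lock() for _ in range(self.n_banks)]
        # Per-bank counters, only updated while holding that bank's lock.
        self.hits = [0] * self.n_banks
        self.reads = [0] * self.n_banks

    def get(self, page_no):
        # A page can only ever live in one bank, so only that bank is
        # locked and searched; other banks stay available to other threads.
        bank_no = page_no % self.n_banks
        with self.locks[bank_no]:
            bank = self.banks[bank_no]
            if page_no in bank:
                bank.move_to_end(page_no)  # mark as most recently used
                self.hits[bank_no] += 1
                return bank[page_no]
            if len(bank) >= BANK_SIZE:
                bank.popitem(last=False)  # evict least recently used page
            self.reads[bank_no] += 1
            data = self.read_page(page_no)
            bank[page_no] = data
            return data


cache = BankedSlru(32, lambda page_no: f"page-{page_no}")
cache.get(0)  # miss: goes through read_page
cache.get(0)  # hit: served from bank 0
```

With 32 buffers this sketch has two banks, so two threads asking for pages 0 and 1 would contend on different locks instead of a single cache-wide lock.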

Monitoring SLRU cache hit ratio with pg_stat_slru

Now, you might want to know whether this is an issue for you today. As of Postgres 13, there is a pg_stat_slru statistics view in Postgres that gives you information about these internal caches. And to be clear, these caches are actually pretty small. Before they were made configurable, you're talking about a couple of megabytes of cache, as illustrated by Thomas Munro in a presentation for a related but different patch proposal.

If you want to know today, whether this is a problem on your system, in addition to looking at the wait events, you can also look at the "pg_stat_slru" view where you have the blocks hit and blocks read counters.
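As a rough sketch of how you might turn those counters into a hit ratio, the Python below holds a query against pg_stat_slru as a string and computes the ratio from `blks_hit` and `blks_read`. The function names and the sample rows are made up for illustration, and running the query against a real database with your driver of choice is left out.

```python
# Query you could run on Postgres 13 or newer; the columns are from pg_stat_slru.
SLRU_STATS_QUERY = "SELECT name, blks_hit, blks_read FROM pg_stat_slru ORDER BY name"


def slru_hit_ratio(blks_hit, blks_read):
    """Return the fraction of lookups served from cache, or None if there were none."""
    total = blks_hit + blks_read
    if total == 0:
        return None
    return blks_hit / total


def report(rows):
    """Format (name, blks_hit, blks_read) rows as one line per SLRU."""
    lines = []
    for name, blks_hit, blks_read in rows:
        ratio = slru_hit_ratio(blks_hit, blks_read)
        shown = "n/a" if ratio is None else f"{ratio:.1%}"
        lines.append(f"{name}: {shown}")
    return lines


# Invented sample rows, not measurements from a real system.
sample_rows = [("subtransaction", 75, 25), ("notify", 0, 0)]
for line in report(sample_rows):
    print(line)
```

A low ratio on its own is not proof of a problem, so it is worth reading it together with the wait events mentioned above.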

Configurable SLRU cache sizes in Postgres 17

The pg_stat_slru counters tell you how often Postgres actually did end up going to disk. Now again, this is something that today you can't actually do much about, and this is why I'm really excited about this change: in Postgres 17 we'll be able to modify these cache sizes.

This is a pretty significant change, and it does require a restart. It's not the kind of thing that you would tune all the time, but if you have a heavily contended workload, for example if you find yourself in a situation like GitLab, you might want to increase subtransaction_buffers to be larger than the default. If you use a lot of MultiXacts, you might want to increase multixact_member_buffers. This is now going to be configurable in Postgres 17.

It's a really significant change for these large scale workloads. I hope that this change stays in Postgres 17 (as always, things can change before the final release), and a big thank you to Andrey Borodin, Dilip Kumar and Alvaro Herrera for getting this over the finish line and getting it committed.
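To get a feel for what such a setting costs in memory, here is a small Python sketch that renders postgresql.conf lines for a set of SLRU buffer settings, annotated with their approximate size. It assumes the default 8 kB block size, the helper names are hypothetical, and the buffer counts are arbitrary illustrative values, not tuning recommendations.

```python
BLOCK_SIZE = 8192  # bytes per buffer; assumes a build with the default block size


def slru_memory_bytes(n_buffers, block_size=BLOCK_SIZE):
    """Approximate page memory used by an SLRU cache with n_buffers buffers."""
    if n_buffers < 0:
        raise ValueError("n_buffers must not be negative")
    return n_buffers * block_size


def render_settings(settings):
    """Render postgresql.conf lines, annotated with the approximate cache size."""
    lines = []
    for name in sorted(settings):
        n_buffers = settings[name]
        kib = slru_memory_bytes(n_buffers) // 1024
        lines.append(f"{name} = {n_buffers}  # about {kib} kB of page data")
    return "\n".join(lines)


# Illustrative values only; pick sizes based on your own workload.
example = {"subtransaction_buffers": 256, "multixact_member_buffers": 128}
print(render_settings(example))
```

Even a large increase over a small cache is only a few megabytes, which is tiny next to a typical shared_buffers setting.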

I hope you learned something new from E104 of 5mins of Postgres. Feel free to subscribe to our YouTube channel, sign up for our newsletter or follow us on LinkedIn and X/Twitter to get updates about new episodes!
