Tuning huge pages in Postgres
In E69 of “5mins of Postgres” we're going to talk about tuning huge pages in Postgres. We’ll explain what the Translation Lookaside Buffer is and how, starting with Postgres 15, Postgres can help you calculate how many huge pages to allocate.
Transcript
Let's have a look!
What are huge pages in Postgres?
This blog post by Bernd on the Cybertec blog describes how to set up the ideal settings for huge_pages on your own virtual machine.
For those of you who don't know, huge pages are a way for a process to make more efficient use of memory on your system.
What do huge pages in Postgres do?
First of all, let's take a quick high-level look at why we care about huge pages. When a process on your server needs to use memory, it won't get access to the physical memory directly. Instead, it will go through virtual memory. This lets the operating system make decisions such as which portion of the physical RAM to assign to a process, and also, for example, whether to make use of swap. By default, an operating system will give you 4 kB pages of memory. The downside here is that each of these 4 kB pages has to be managed and mapped as you are working with your virtual memory. There is an optimization in modern CPUs called the Translation Lookaside Buffer (TLB).
The TLB's purpose is to make lookups of virtual memory addresses more efficient. The downside of this buffer is that it has a fixed size. This means that for a very memory-hungry application, like most databases, the TLB can become a contention point: you miss the cache and can no longer resolve virtual memory addresses efficiently.
This results in CPU overhead from virtual memory lookups. Huge pages are meant to help with this. They are important for Postgres because the shared buffer pool is essentially a region of memory that is used for a lot of different things. For example, when you're loading data into Postgres, it goes through the shared buffer pool in most cases. It's also used when processes communicate with each other: parallel workers, for example, utilize shared buffers. It's simply an area with a lot of activity, and that's why we care about huge pages: they help improve performance for the shared buffer pool in particular.
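If you want to see whether TLB misses are actually a problem on your system, Linux's `perf` tool can count them. This is a minimal sketch, not from Bernd's post; the exact event names vary by CPU, and the backend PID is a placeholder you'd look up yourself:

```bash
# Count data-TLB loads and misses for one Postgres backend over 10 seconds.
# Replace <backend_pid> with an actual PID, e.g. from pg_stat_activity.
sudo perf stat -e dTLB-loads,dTLB-load-misses -p <backend_pid> -- sleep 10
```

A high ratio of misses to loads suggests the fixed-size TLB is struggling to cover the working set, which is exactly what huge pages help with.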
How to configure the right huge_pages setting for Postgres
The way that huge pages work is that they're essentially larger memory allocations. Instead of the 4 kB pages we talked about previously, they're either 2 MB or 1 GB in size. By default on Linux on x86, huge pages are 2 MB in size. This is appropriate for a lot of systems; usually, when I've used huge pages, I've used the default 2 MB page size.
You can see what page size is currently configured for huge pages by checking `/proc/meminfo`. Doing this in Bernd's blog post, we can see that a huge page is 2 MB. This also tells us how many huge pages are allocated on the system.
This is something that you have to do explicitly. The operating system, by default, will not allocate huge pages for you. Right now on this system, we have no huge pages: "HugePages_Total" is zero. But how can we find out what the right number of huge pages is?
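As a quick illustration, this is what that check looks like on a system with no huge pages allocated yet (the exact values will differ on your machine):

```bash
grep Huge /proc/meminfo
# HugePages_Total:       0
# HugePages_Free:        0
# Hugepagesize:       2048 kB   <- the 2 MB default page size
```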
Postgres historically has not really helped you with this. Before Postgres 15, you would turn on the `huge_pages` setting in Postgres, so Postgres would attempt to use huge pages. And then it would tell you: "Well, I can't actually start the server, because I need to be able to allocate this much shared memory in huge pages, but that doesn't work. You should do something about that."
It didn't really tell you how many huge pages to allocate, you just had to do some math and figure it out. A lot of people just wrote a script to figure out the right size of huge pages, based on this error message that Postgres would give you.
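For reference, the pre-15 Postgres documentation described roughly this manual calculation; a sketch, assuming a Debian-style data directory path:

```bash
# Start Postgres with huge_pages = off, then read the postmaster's
# peak virtual memory size and divide by the huge page size.
pid=$(head -1 /var/lib/postgresql/data/postmaster.pid)  # first line is the PID
grep ^VmPeak /proc/$pid/status     # e.g. VmPeak:  21508012 kB
grep ^Hugepagesize /proc/meminfo   # e.g. Hugepagesize:  2048 kB
# 21508012 kB / 2048 kB ≈ 10501.96, so round up and allocate 10502 pages
```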
Let Postgres 15 calculate the ideal huge_pages setting
The good news is, starting in Postgres 15, this got a lot easier. In Postgres 15, you can use the `postgres` binary, the same one you use to start the server, and tell it to give you the shared memory size in huge pages.
This is a new way of having Postgres do the math for you, and then it just returns you the setting that you need to use. In Bernd’s case, for 20 GB of shared buffers, he needs to make 10,475 huge pages available in the operating system for Postgres to use.
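Concretely, this uses the new `shared_memory_size_in_huge_pages` runtime parameter together with the `-C` flag; the data directory path below is an example:

```bash
postgres -D /var/lib/postgresql/data -C shared_memory_size_in_huge_pages
# 10475   <- huge pages needed for ~20 GB of shared_buffers
```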
To test this, he can simply write this number into a special `/proc` path, and the operating system will go ahead and allocate that many huge pages. He can then check in `/proc/meminfo` that the huge pages are now available. At this point, they're not yet used by Postgres.
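On Linux, that special path is `/proc/sys/vm/nr_hugepages` (you can also set `vm.nr_hugepages` via sysctl to persist it across reboots); a sketch using the number from above:

```bash
echo 10475 | sudo tee /proc/sys/vm/nr_hugepages   # allocate the huge pages
grep HugePages_Total /proc/meminfo                # verify the allocation
```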
When we start Postgres, we have the option of setting the `huge_pages` configuration setting. There are three possible values:
- "Try": It tries to allocate the shared buffer pool in huge pages and if that doesn't work, it’ll just do small memory allocations.
- "Off": It doesn’t even try to allocate it.
- "On": It requires to allocate it.
I would use `on` on my server, because that way you will know if your configuration is wrong. With `try`, you might have hidden performance bottlenecks that are not really surfaced explicitly.
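Putting it together, the relevant postgresql.conf lines might look like this (the shared_buffers value is the example from Bernd's post):

```
# postgresql.conf
shared_buffers = '20GB'
huge_pages = on        # fail at startup instead of silently falling back
```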
Once Postgres starts, the huge pages actually get reserved. This way, the Postgres processes actually have the right to use them.
min_dynamic_shared_memory
There's one more thing to note here. We talked about the shared buffer pool, which is a fixed size. There's also a special shared memory segment that is used for communication between parallel workers. Starting in Postgres version 14, Postgres actually has a way to do a fixed allocation for that as well.
Again, the benefit here is that when we do fixed allocations, we know that huge pages can be used. We can allocate dedicated space to the parallel workers in this case by setting `min_dynamic_shared_memory`.
For example, in Bernd's case, he sets that to 2 GB and runs the same calculation again. Now, the required huge pages increase to 11,500. Again, he would need to update the operating system's allocation. With that setting, he'll get the best performance.
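A sketch of that second round, again with an example data directory path:

```bash
# After adding min_dynamic_shared_memory = '2GB' to postgresql.conf:
postgres -D /var/lib/postgresql/data -C shared_memory_size_in_huge_pages
# 11500
echo 11500 | sudo tee /proc/sys/vm/nr_hugepages   # update the OS allocation
```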
Generally speaking, if you run Postgres on your own virtual machine, there really isn't a reason not to use huge pages. They just require a little bit more configuration work, but they give you a very clear performance benefit when running Postgres on Linux yourself.
I hope you enjoyed episode 69 of 5mins of Postgres. Subscribe to our YouTube channel, sign up for our newsletter and follow us on LinkedIn and Twitter to get updates about new episodes!