HOT Updates and BRIN indexes in Postgres 16
Today, I want to talk about a new change in Postgres 16 that helps you ignore BRIN indexes when Postgres checks for HOT updates. Join me for 5mins of Postgres E86.
So before we dive into this change in Postgres 16, let's do a quick recap of what HOT updates are. HOT stands for Heap-Only Tuples.
That means they're versions of your rows that are not in the indexes, they're only in the main table, aka the "heap". This is a very important optimization because this reduces the vacuum overhead that you otherwise have for updates that create dead rows.
In case of a regular update, what happens is that Postgres will write both to the main table - the heap - as well as the indexes, and the indexes will have a reference back to the main table.
In the case of a HOT update, Postgres does not need to make any updates to the indexes. Postgres can just do that update directly on the main table heap. The most important attribute of a HOT update is that, generally speaking, you cannot reference any indexed columns.
Updates are only HOT capable if none of the columns you're updating have an index on them. If you want to know whether your updates are HOT updates or regular updates, you can look at pg_stat_all_tables. This will track statistics for all your tables, and most importantly, it will track how many updated rows you have compared to how many HOT updated rows you have.
Generally, what you would be looking for on your system is that the number of HOT updates is very close to the number of total updates, as that means that Postgres was able to do a HOT update successfully.
The change in Postgres 16 that I reference in the headline is that certain indexes are no longer considered to be a problem for HOT updates. Specifically, the BRIN index that previously was considered to be blocking HOT updates does no longer do that in Postgres 16 or newer.
The reason for that is that summarizing indexes, like a BRIN index, don't contain direct references to individual rows. They only contain references to the pages where those rows are located. Because of that, if you have a HOT update where the update is only written to the table directly (but not to the indexes) it's still safe to do that with BRIN because, in the BRIN case, you will still find the right page. One important attribute of a HOT update is that it will always stay local to that page where the main data resides. That means as long as you know the right page you essentially have a way of finding that data using the BRIN index, even though the update did not update the index.
If we want to learn more about this, we can also look at the README.HOT in the Postgres source code. There's a lot of details to this, but if you're ever curious, "how do HOT updates exactly work?" this was recently updated to reflect the mentioned Postgres 16 change. It also walks you through how exactly Postgres makes these calculations, including the important fact that if any columns are included in your UPDATE that are updating non-summarizing indexes, then HOT optimization is not applied. That means it always has to update all the indexes of the table.
This is a very important optimization to understand as you're improving your system. I would recommend that you try to get as many of your updates to be HOT capable as possible. Note that they don't always actually end up being HOT updates, for example when there is not enough space on a particular page.
In summary, it's important to not over index your tables. But if you do index them: if you can use a BRIN index, then you now no longer need to worry as much about them impacting the HOT capability of your updates.