How Postgres DBAs can use pg_stat_io
In today’s episode 57 of 5mins of Postgres I want to talk about the pg_stat_io view and the everyday DBA perspective on why it matters. We’ll go through tracking activity in Postgres with pg_stat_io and talk about getting insights into extends, evictions, reuses, and more.
The pg_stat_io view is a new feature in the upcoming Postgres 16 release that was committed about three weeks ago. It was committed by Andres Freund, and was authored by Melanie Plageman.
Let's have a look!
In this blog post by depesz, he talks about how to make use of this new view. In fact, he references a post I wrote a couple of weeks ago, in which I described my experience reviewing earlier iterations of this patch.
I do want to take a look at his post, because I think what depesz describes is how he would use this as a DBA when this is released in Postgres 16.
pg_stat_io tracks all the I/O activity in the Postgres system. What's new here is that this represents an all-up view that's very complete. Previously, you had ways to get information about individual queries or individual tables, but there wasn't really a way to say: here's everything that happened, and how much of that was spent on autovacuum versus individual queries or background workers and such.
Looking at the output of pg_stat_io, it looks like a lot of data, but what's nice is that it is actually fairly compact, in the sense that you will always get 30 rows, whether your system is really busy or not busy at all. What really matters is which counter values are going up, and how quickly.
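As a sketch of what querying the view looks like (the column names here follow the committed patch; the exact set may still change before the Postgres 16 release):

```sql
-- Inspect the view; each row is one combination of backend type,
-- object (relation vs. temp relation), and I/O context.
SELECT backend_type, object, context,
       reads, writes, extends, evictions, reuses
FROM pg_stat_io;
```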
As depesz describes in his article, these stats are global, which means you can see the stats for the whole database server. He runs pgbench to have a test case for the pg_stat_io data, using a scale factor of 1000, which gives a fairly reasonably sized database. Afterwards, pg_stat_io looks very different.
You can sort the output of the view by the number of reads. For example, we can ask: “give me the subsystem of Postgres that caused the most reads in this timeframe, since the last reset.” We can reason along the lines of: “here, the autovacuum worker spent the most time reading data.” Next, we can look at the background workers, which include parallel workers, and then the regular client backends, which are the individual connections to the database. Now, we can do the same for writes of course, to see that the writes were actually mostly issued by the client backends.
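A query along these lines (a sketch, again assuming the column names from the committed patch) sorts the view to find the heaviest readers:

```sql
-- Which parts of Postgres issued the most reads since the last reset?
SELECT backend_type, object, context, reads
FROM pg_stat_io
WHERE reads IS NOT NULL
ORDER BY reads DESC
LIMIT 5;
```

Swapping `reads` for `writes` gives the equivalent ranking for write activity.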
As depesz notes, this doesn't actually tell us which tables or which individual backends are to blame, but it does help you pinpoint which subsystem of Postgres to look at. From there, you can use other sources of information, like pg_statio_user_tables, to really dig into the specific statistics.
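Such a drill-down might look like the following, using the long-existing pg_statio_user_tables view to find the tables with the most block reads:

```sql
-- Per-table I/O statistics (counted in 8 kB blocks, not bytes)
SELECT relname, heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 10;
```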
There are a couple of other statistics in pg_stat_io. Extends are something that's a bit special in Postgres, and they were previously not really visible. An extend happens when you're inserting data into a table and the table file needs to grow to accommodate the amount of data you're inserting. Now, in depesz’ case, there were actually a lot of extends, because the tables were essentially empty before. This can be helpful to know and was hard to track in previous versions of Postgres.
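To see where extends are happening, a query like this sketch could filter the view for them:

```sql
-- Which backend types had to grow relations, and in which I/O context?
SELECT backend_type, context, extends
FROM pg_stat_io
WHERE coalesce(extends, 0) > 0
ORDER BY extends DESC;
```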
You can combine reads, writes and extends to understand how many bytes of I/O you're doing. Each of those is essentially a counter of eight-kilobyte pages, which means you can multiply it out to get how much data was read or written. For example, 7,000 reads works out to roughly 55 megabytes of data read.
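Translated into SQL, the arithmetic might look like this sketch, which uses the server's block_size setting (8 kB by default) rather than hardcoding it:

```sql
-- Approximate total I/O volume per row, in human-readable form
SELECT backend_type, object, context,
       pg_size_pretty(
         (coalesce(reads, 0) + coalesce(writes, 0) + coalesce(extends, 0))
         * current_setting('block_size')::bigint
       ) AS total_io
FROM pg_stat_io
WHERE coalesce(reads, 0) + coalesce(writes, 0) + coalesce(extends, 0) > 0;
```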
It's important to note that this is data as seen from the database's perspective. Your Linux operating system may actually cache things in its page cache, and Postgres doesn't know about that. There is a patch in the current Postgres commit fest, which may or may not make it into 16, that will also track timing. I think that to really measure whether these operations were served from the operating system page cache, we'll need timing information at this level as well. So you gotta be a bit careful before making a final assessment based on this data alone.
Now, there are two other things that are interesting. Evictions are something you didn't really have good visibility into before. An eviction happens when an existing buffer has to be written out because shared buffers, which is a fixed-size cache, doesn't have enough space for the new data that should be read in. When this happens, Postgres evicts another buffer to then be able to load the new data.
Reuses are a bit of a special case: they are mostly related to the special ring buffers, which you can now see with bulk reads and bulk writes.
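Both counters can be pulled out with a query like this sketch:

```sql
-- Shared buffer evictions vs. ring buffer reuses, per backend type
SELECT backend_type, context, evictions, reuses
FROM pg_stat_io
WHERE coalesce(evictions, 0) > 0
   OR coalesce(reuses, 0) > 0;
```

Rows in the `bulkread` and `bulkwrite` contexts are where you would expect reuses to show up.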
You can also reset this view: the pg_stat_reset_shared function can be used to reset just the I/O statistics. For example, you can reset right before a test to see just the statistics for a single benchmark run.
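A reset-then-measure workflow could look like this sketch (assuming the 'io' target that the patch adds to pg_stat_reset_shared):

```sql
-- Reset only the I/O statistics, run the workload, then check the counters
SELECT pg_stat_reset_shared('io');
-- ... run pgbench or your workload here ...
SELECT backend_type, context, reads, writes, extends
FROM pg_stat_io;
```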
As depesz notes: "I wish I had it on my production server". I do, too.
If you're interested in reading more about pg_stat_io, I wrote about it in a blog post a couple of weeks ago, where you can also learn in more detail about the bulk read and bulk write strategies, and some other edge cases that are now more clearly visible with pg_stat_io.
I'm really excited for Postgres 16, as we’ll be able to have better I/O statistics in Postgres then!